The pace of AI developments over just the last two weeks has been staggering. Foundation Models like GPT-4 are the catalyst, capturing the public’s imagination. In public discourse – blogs, podcasts, LinkedIn, and Twitter – optimists and pessimists are exchanging sometimes useful, sometimes heated arguments.
The optimists seem to outnumber, or at least out-shout, the pessimists. Observers, pundits, analysts, and entrepreneurs are quick to hype these large Foundation Models as a step toward machines having true intelligence. They point out how these models augment, outpace, and often replace humans. The weaknesses, the argument goes, are easily overcome with more and better data, and more parameters.
The Optimists’ Point of View
The assumption that a lot of laypeople, and frankly, a lot of AI researchers make is that advancements in Foundation Models equate to advancements in a computer’s ability to think like a human. They do not.
Human intelligence is a collection of many cognitive abilities. In the 1960s we created computers that were more precise and faster than humans in doing math. Math is one cognitive ability. Loud voices at the time proclaimed that computers would soon replace humans. They didn’t.
Foundation Models have the ability to predict vastly better and faster than humans. People also have the ability to predict. Take this thought experiment. What if we write a prompt and give it to a chain of humans? The first person reads the prompt and predicts the first word of the answer. They pass the prompt and that word to the next person, who reads both and adds the next word, and so on down the chain. In the end, you would have a pretty good answer built from the collective predictions.
One could argue that GPT-4 is better at this than humans. The thought experiment would yield worse results than giving the prompt to GPT-4. Let’s say that the prompt was, “Make a song about Bing vs Google in the style of Eminem.” GPT-4 will give a pretty good answer to that prompt.
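The chain-of-humans experiment is essentially autoregressive decoding: at each step, a predictor sees the prompt plus the words produced so far and appends one more word. Here is a minimal toy sketch of that loop; the tiny lookup table standing in for a predictor is invented purely for illustration, whereas a real Foundation Model learns such next-word probabilities from enormous amounts of data.

```python
# Toy sketch of the "chain of humans" thought experiment.
# Each "person" (loop iteration) sees the answer so far and
# predicts exactly one more word. The lookup table below is a
# hypothetical stand-in for a learned next-word predictor.
NEXT_WORD = {
    (): "Bing",
    ("Bing",): "and",
    ("Bing", "and"): "Google",
    ("Bing", "and", "Google"): "battle",
    ("Bing", "and", "Google", "battle"): "<end>",
}

def generate(prompt: str, max_words: int = 10) -> str:
    """Build an answer one predicted word at a time."""
    words = []
    for _ in range(max_words):
        # The next "person" in the chain predicts one word,
        # given everything produced so far.
        nxt = NEXT_WORD.get(tuple(words), "<end>")
        if nxt == "<end>":
            break
        words.append(nxt)
    return " ".join(words)

print(generate("Make a song about Bing vs Google"))
# -> Bing and Google battle
```

The point of the sketch is that nothing in the loop reasons about the prompt; each step only asks "what word comes next?", yet the concatenated result can read like a coherent answer.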
But I would also argue that Eminem would make a better song about Bing vs Google than GPT-4 prompted to pretend to be Eminem. Why? Eminem is not doing next-word prediction; he applies his whole register of human cognitive abilities to the task. And one of those abilities is his uncanny knack for hitting the emotions of millions of people with a simple rhyme.
The optimists will claim, however, that as we increase the training data, increase the parameters and increase the quality of the alignment dataset – it will eventually beat Eminem at his own game.
The Pessimists’ Point of View
Pessimists are quick to point out how these models are flawed – that they are just predicting the next word and don’t have any true ability to reason at all. The argument is that the fundamental flaws – the inability to cite sources, the inability to plan, and the inability to reason – are not solvable by data and parameters. Said differently, a Foundation Model will not spontaneously change its topology to suddenly achieve new cognitive abilities.
Prediction, especially beyond human capability, can create an illusion of reasoning. Theoretically, a Foundation Model trained on all the data in the world would be able to pretend to reason about any topic where anybody has articulated their reasoning. But, this is exactly where the application of Foundation Models can get you into trouble. Prediction is simply not the right cognitive ability for every circumstance or interaction.
That we have taken a huge step in the ability to predict using Foundation Models does not mean that we have taken comparable steps in AI that attempts to replicate other cognitive abilities. GPT-4 doesn’t remember your name or your preferences across sessions. It doesn’t collaborate with you by asking any questions. In fact, it still follows your instructions as if you clicked a button – it just follows instructions with a much higher degree of agency.
There are many things that Foundation Models are not good at. Optimists believe we can evolve the Foundation Models to overcome those weaknesses. Pessimists believe we would need similarly large breakthroughs, using completely different techniques, in other fields of AI.
Realists Delivering Balance
At Openstream we follow a middle ground. I call it realism. Inside Eva, our enterprise conversational AI platform, we are using Foundation Models for a lot of tasks already. And we will likely expand that. What makes me excited, however, is that if you list the weaknesses of Foundation Models – they map to our strengths.
Being able to hold a completely unscripted conversation, in which the user and Eva collaboratively solve the user’s problem, feels almost magical – a capability that becomes even better now that we use Foundation Models where we need their instruction-following strength.
Optimists are correct insofar as they recognize that Foundation Models are a quantum leap in the usefulness of AI. Foundation Models, however, can’t do reasoning, planning, or other cognitive strategies. They give us one cognitive ability that we can use in several places, but they will never spontaneously create those new abilities.
We are still several quantum leaps away from replicating all the cognitive abilities that make up human intelligence. So what’s the right answer? Optimists should listen to pessimists, because the topology of the models creates fundamental problems. And pessimists should realize that the likely uses of Foundation Models go far beyond what they think, because superhuman predictive capability is likely to be useful in places we simply don’t know yet.
In the end, it’s the pragmatists… or realists who will extract the most value from the current Foundation Model revolution. Millions of end-users and startups will be so enamored with the technology that they will apply it to the wrong problems. As the hype gives way to disillusionment… the realists will continue applying it to the right kind of problems and come out the winners.
That’s why I call us realists, but find that both the optimists and pessimists have valuable viewpoints. In the future, we will show you the true power of collaborative plan-based systems – and how widely they differ from the instructional approach of Foundation Models. That Sam Altman, the CEO of OpenAI, admitted to Lex Fridman that they had no idea how to do that yet made me smile.