Superintelligence Is Beyond Reach
What’s the latest breakthrough in artificial intelligence (AI)? The answer may surprise you.
We’re all aware of the boom in AI-related stocks, AI technology and the unrelenting news coverage about AI. You can’t open your browser without reading about AI, and you can’t glance at your stock ticker without noticing the extent to which prices have been driven by AI company valuations. That boom may be a bubble, but it need not end soon. Stock bubbles have a life of their own and don’t crash just because investors know they’re bubbles.
That said, there’s no doubt about the power of AI. It’s ubiquitous. It’s on the dashboard of your car, in your home appliances and in the palm of your hand in the form of AI apps. Every major tech platform offers one, from Microsoft and Google to Facebook, Apple and OpenAI. When you open your refrigerator and an indicator tells you to change your water filter, that’s AI at work.
Of course, AI has been around since the 1950s. AI imitates the human brain by setting up neural networks. These networks have nodes that are connected to each other by what are called edges. The nodes contain mathematical formulas that process incoming data. The processed data becomes that node’s output, which is passed along to other nodes.
The edges can be assigned weights, with some inputs counting more heavily than others. Nodes can be arranged in tiers, so that output from lower tiers feeds forward into higher tiers where further input/output processing takes place. Today these neural networks can be unimaginably complex, with billions of nodes processing hundreds of billions of inputs.
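For readers who like to see the machinery, here is a minimal sketch in Python (with NumPy and made-up random weights, not any real system) of the node, edge, weight and tier arrangement described above:

```python
import numpy as np

# A tiny two-tier network: 3 inputs -> 4 lower-tier nodes -> 1 higher-tier node.
# Each edge carries a weight; each node sums its weighted inputs and applies
# a simple mathematical formula (an "activation").

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))   # edge weights from the inputs to the lower tier
W2 = rng.normal(size=(4, 1))   # edge weights from the lower tier to the higher tier

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(inputs):
    hidden = relu(inputs @ W1)      # lower-tier nodes process the incoming data
    output = sigmoid(hidden @ W2)   # their output feeds the higher tier
    return output

print(forward(np.array([0.5, -1.0, 2.0])))  # a single value between 0 and 1
```

A production network follows the same pattern, just with billions of nodes, many more tiers and weights learned from data rather than drawn at random.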
AI Advances
AI science hit a dead end in the early 1980s due to limits on processing power and the relatively primitive network architectures through which the processing was done. The 1980s became known as the “AI winter.” This lack of progress in AI science prevailed through the 1990s and early 2000s.
Beginning around 2005, three major advances occurred that enabled the AI revolution we see today. The first was a dramatic increase in processing power. Fast graphics chips from NVIDIA and AMD, originally designed for gaming, were adapted to AI processing with great success.
The second was the invention of large language models (LLMs). These are models trained by scanning billions of pages of content (including much of the internet), tokenizing words, phrases and images, and looking for clusters of words and images that typically go together. These word and image combinations are assigned numerical values and organized so that the system can fetch them as needed for grammatical writing and composite image creation.
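This is not how any real LLM is actually built, but a toy Python sketch of the underlying idea, tokenizing text and counting which tokens typically go together, might look like this (the two-line corpus is invented purely for illustration):

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# An invented two-document "corpus" standing in for billions of pages.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

# Tokenize: split each document into word tokens.
tokenized = [doc.split() for doc in corpus]

# Count which tokens typically go together (adjacent pairs here, as a crude
# stand-in for the statistical associations a real model learns).
pair_counts = Counter(pair for doc in tokenized for pair in pairwise(doc))

print(pair_counts.most_common(3))  # pairs like ("sat", "on") appear most often
```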
The third breakthrough was the invention of the generative pre-trained transformer (GPT), an architecture that allows processors to work on an entire sequence in parallel rather than one step at a time. The parallel threads converge at the end of the process, but the convergence contains much more refined data due to the transformer method. GPT also acts as a kind of turbocharger on the high-speed chips, so that the combined leap in processing speed is exponential.
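As a rough illustration of the parallel idea (a toy sketch with random weights and invented dimensions, not the full transformer architecture), an attention step updates every token position in the same matrix operations rather than stepping through tokens one at a time:

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, d = 5, 8                  # 5 tokens, 8-dimensional embeddings (toy sizes)
X = rng.normal(size=(seq_len, d))  # embeddings for every token in the sequence

# Random projection weights for queries, keys and values (illustrative only).
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# All token positions are compared and updated in the same matrix operations,
# rather than sequentially, one token after another.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax
output = weights @ V

print(output.shape)  # (5, 8): every position updated in one parallel pass
```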
It was in November 2022, with OpenAI’s release of ChatGPT (which gained 100 million users within about two months), that the advances described above came together and launched the stock frenzy and the tech advances we’re still seeing.
Superintelligence Is Beyond Reach
The difficulty today is that all these advances, and the AI boom in general, have been extrapolated beyond the ability of the technology to perform. Talk of superintelligence or artificial general intelligence, under which humans would be to computers what apes are to humans in terms of cognitive skill, is nonsense. Computers may get faster and robots more common, but we may never see true superintelligence.
The reason has to do with the difference between inductive and deductive reasoning, which computers can perform within limits, and abductive logic and semiotics, important human skills that computers cannot perform at all. These skills are non-programmable and mark one of the key distinctions between human brain function and computer processing.
Other constraints stem from the law of diminishing marginal returns, under which massive increases in energy inputs and processing power yield only minor increases in output. Major tech companies (Microsoft, Meta, Google, OpenAI, Apple, Oracle and a few others) have spent over $400 billion on data centers and other AI infrastructure in the past year, with higher expenditures planned. This can be considered money spent on hardware.
Software development costs and the cost of information inputs are additional expenditures. Increased processing capacity has not been matched by increased output. Profits remain elusive. In fact, new applications such as GPT-5 from OpenAI have been major disappointments. This phenomenon of diminishing returns is well known to engineers in other fields but may come as a shock to AI investors driven by FOMO (Fear of Missing Out).
Another constraint that is little understood is the Law of Conservation of Information in Search Processes. This law has been rigorously demonstrated mathematically by my collaborator William A. Dembski in a recently published paper. It holds that no search process (including the most sophisticated AI running the fastest processors and largest LLMs) can create new information. It can only find existing information.
AI may produce faster and more extensive searches and may find correlations that human efforts could not identify in a lifetime, but that’s all still existing information. In short, AI has no creative capacity. It cannot “think” of anything new, unlike humans who create new formulas and works of art routinely. AI is not “intelligent” or creative. It’s just fast.
In a recent experiment, a supercomputer and a group of first-grade children were given a ruler, a teapot and a stove and asked to draw a circle. The computer “knew” that the ruler was a draftsman’s tool not unlike a compass and promptly tried to draw a circle with the ruler. It failed. The children glanced at the teapot, saw that its bottom was round and used it to trace perfect circles.
This is an example of abductive logic (also called intuition or common sense) at work, which children have, and computers do not. The idea of children outperforming a supercomputer might cause investors to ask just what they are getting for their $400 billion (and counting).
In sum, AI will never be superintelligent, expenditures have hit the wall of diminishing returns, AI offers no creativity at all (just fast searches), and children can outperform the fastest machines when the task calls for intuition. Is the AI frenzy about to hit the wall?
Scaling Down the Data
There are some encouraging solutions that may allow AI to add value beyond robotics and fast processing. One of these is the use of small language models (SLMs) instead of LLMs.
Unlike LLMs, which trawl the entire internet or large subsets of it, SLMs contain far less data and are curated by subject-matter experts to be tailored to specific tasks. What is the point of including billions of pages of text in a training set if most of those pages have nothing to do with the problem the AI application is trying to solve? As IBM’s head of AI research, David Cox, recently said, “Your HR chatbot doesn’t need to know advanced physics.”
One difference between SLMs and LLMs is the number of parameters the model is trained on. LLMs use hundreds of billions of parameters, while SLMs might need 40 billion or fewer; some use as few as 1 billion. This means SLMs can run faster on far less energy. They can also be scaled more easily for smartphones and other applications like self-driving cars and household appliances.
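Some rough back-of-the-envelope arithmetic illustrates the gap, assuming 2 bytes (16 bits) per parameter and using 175 billion parameters as a stand-in for a large LLM (the figures are illustrative, not vendor specifications):

```python
# Back-of-the-envelope memory footprint, assuming 2 bytes (16 bits) per parameter.
BYTES_PER_PARAM = 2

def weight_memory_gb(num_params):
    return num_params * BYTES_PER_PARAM / 1e9

for label, params in [("Large LLM (~175B parameters)", 175e9),
                      ("SLM (~40B parameters)", 40e9),
                      ("SLM (~1B parameters)", 1e9)]:
    print(f"{label}: ~{weight_memory_gb(params):,.0f} GB just to hold the weights")
```

On those assumptions, a 1-billion-parameter SLM needs roughly 2 GB just for its weights, small enough for a phone, while the large model needs hundreds of gigabytes before any processing even begins.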
SLMs also produce fewer “hallucinations” than LLMs (the tendency of AI to invent data from whole cloth in order to answer a prompt or complete an otherwise unfinished narrative). SLMs are also less likely to train on LLM output that pollutes the training set. Research shows that as LLMs increasingly train on data sets that include prior LLM output, the training set becomes diluted with bad data and model output quickly degenerates into absurd results. SLMs are more resistant to this because their training sets are more technically rigorous.
SLMs also run on less expensive chips, which may have negative implications for giant chipmakers like NVIDIA. SLMs running on smaller cloud systems may make the massive server farms now being constructed redundant or even obsolete.
SLMs are good news for developers and users but may be very bad news for investors who have bet on the firms building massive data centers using super-fast chips and LLMs. All stock bubbles burst eventually. The countdown for the AI tech bubble collapse may have begun.