Artificial Intelligence today is dominated by Large Language Models (LLMs), which are statistical predictors: given an input, they predict the most probable next token.
If LLMs always predicted the single most probable token, their output would be the same for a given query, which would be very boring. So, to “mimic” intelligence, LLMs are made to keep only a shortlist of the most probable tokens (using top-k and top-p sampling), with a randomness setting called temperature controlling how sharply the probabilities are skewed, and then pick one token at random from that shortlist.
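Here is a minimal sketch of that sampling step in Python, with a made-up five-word vocabulary and made-up logits; the `sample_next_token` helper and every number in it are purely illustrative, not any particular model's implementation:

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=3, top_p=0.95, rng=None):
    """Pick one token index from raw logits using temperature, top-k and top-p."""
    rng = rng or np.random.default_rng()
    # Temperature rescales the logits: low values sharpen the distribution,
    # high values flatten it (more randomness).
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Top-k: keep only the k most probable tokens.
    order = np.argsort(probs)[::-1][:top_k]
    # Top-p (nucleus): of those, keep the smallest prefix whose cumulative mass reaches p.
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, top_p) + 1]
    kept_probs = probs[keep] / probs[keep].sum()
    # Finally, pick one of the surviving candidates at random.
    return rng.choice(keep, p=kept_probs)

vocab = ["cat", "dog", "sat", "ran", "the"]      # made-up vocabulary
logits = np.array([2.0, 1.5, 0.3, 0.1, -1.0])    # made-up raw model scores
print(vocab[sample_next_token(logits)])          # usually "cat" or "dog", but not always
```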
The token picked this way becomes the current output, is appended back to the current input, and the whole sequence is fed in again to predict the next token! This goes on and on till a special token denoting THE END is predicted, or till we reach the maximum token limit.
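And a sketch of that loop itself, assuming a hypothetical `model(tokens)` function that returns next-token logits and reusing the `sample_next_token` helper above; `eos_token` stands for the special THE END token:

```python
def generate(model, prompt_tokens, eos_token, max_tokens=256):
    """Autoregressive generation: keep feeding the output back in as input."""
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        logits = model(tokens)                    # hypothetical: scores for the next token
        next_token = int(sample_next_token(logits))
        tokens.append(next_token)                 # the output becomes part of the input
        if next_token == eos_token:               # stop at the special THE END token...
            break
    return tokens                                 # ...or when the token limit is reached
```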
So where is the intelligence part here? But then, what is intelligence in the first place?

Training an LLM! Really?
Before defining intelligence, let us look at how LLMs are trained in the first place. Suppose, for example, we train an LLM on the Wikipedia corpus. The LLM might first come across an article on quantum mechanics even before it has read the article on what an atom is in the first place!
What we need to understand here is that, unlike humans, who are taught the fundamentals first and then the advanced topics, there is no such step-by-step training for LLMs! They are merely coming across stuff at internet scale!
Humans take years of schooling, where they first learn the basics and then move towards slightly more advanced topics every year. Imagine trying to teach quantum mechanics to a kid without even teaching what an atom is!
Also, when we learn and understand something, we generally get the point in a couple of tries at most. If we read the same sentences over and over again, thousands of times, we do so only while parroting, not understanding! But LLMs are trained over a huge corpus of text where the same sentences might be repeated thousands and thousands of times!
Try asking an LLM about a fact which appears only rarely on the internet, and you will see it immediately starts hallucinating. For instance, I asked an LLM about the music director of a 1960s Kannada movie; it treated it as a 2025 movie and made up an answer (which, obviously, was wrong!)
If an LLM were intelligent, and LLM-like intelligence were so feasible, then even human brains would have evolved to learn anything in any order! But the fact is, humans cannot learn complex stuff without understanding the basics. Complex stuff without knowledge of the basics can only be parroted, not understood.
There is a difference between making a class I kid understand Schrödinger’s equation and making the kid simply memorize it like a nursery rhyme.
Why Do LLMs Hallucinate?
The reason LLMs hallucinate is that they have not come across a suitable continuation for the given input tokens during their massive training, or at least not often enough. And since they have been configured to mimic intelligence by picking one of the top tokens at random, they end up making up facts which do not exist!
Or they hallucinate because they came across the fact only a couple of times in the training data, and their statistical nature misled them towards a more statistically probable token, one which appeared thousands of times in the training data for similarly grouped tokens!
Hallucination is the absence of relevant training data! LLMs appear to be intelligent because they are trained on such a huge corpus of data, practically a collection of all the human conversation ever produced to date!
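A toy illustration of this, with made-up numbers and placeholder tokens rather than a real model: when a fact is well represented in training, one continuation dominates; when it is rare, the distribution is nearly flat, yet the sampler still happily returns something, because nothing in it can say “I don’t know”:

```python
import numpy as np

rng = np.random.default_rng(0)
candidates = ["composer_A", "composer_B", "composer_C", "composer_D"]  # placeholder tokens

well_known = np.array([0.90, 0.05, 0.03, 0.02])  # one continuation dominates in training
rare       = np.array([0.28, 0.26, 0.24, 0.22])  # almost no signal: near-uniform guess

for label, probs in [("well-known fact", well_known), ("rare fact", rare)]:
    answer = rng.choice(candidates, p=probs)
    print(f"{label}: sampled '{answer}' -- the sampler always answers something")
```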
If LLMs were really intelligent, then we should be able to train an LLM to be as good as an engineering graduate by feeding it only the books the graduate has ever read and the conversations he/she has ever had! Not quite possible!
For instance, take the text from a given set of class I to class V textbooks. Train an LLM on them. Now ask it random questions from the examination papers of those classes and check the accuracy of its answers. I am sure it will flunk in mathematics! Parroting is not intelligence!
If LLMs were really intelligent, then what is the need for prompt engineering? Do we ever instruct an expert human in a given domain by saying “You are an expert mathematician. Think step by step before answering. Do not make up facts. Say ‘I don’t know’ if you do not know the answer”, and so on? The prompt goes on and on. In fact, most system prompts today run into pages, as if you are re-training your LLM before every query.
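For illustration, here is roughly what such a request looks like in the generic chat-message format many LLM APIs use; the model name and the prompt text below are placeholders:

```python
request = {
    "model": "some-llm",                          # placeholder model name
    "messages": [
        {
            "role": "system",
            "content": (
                "You are an expert mathematician. "
                "Think step by step before answering. "
                "Do not make up facts. "
                "Say 'I don't know' if you do not know the answer. "
                # ...real system prompts often continue like this for pages,
                # resent with every single query...
            ),
        },
        {"role": "user", "content": "What is the derivative of x^2?"},
    ],
}
```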
What then is intelligence? Don’t human brains use statistics in the synaptic weights between neurons? We remember stuff we do regularly (a statistical gain) and forget stuff we do not do regularly (a statistical loss).
But that is practice, not intelligence. We have practiced walking, running, adding, subtracting and so on a great deal, so we do not forget them easily. Stuff we have practiced only a little, we start forgetting after some time when there is no regular practice. We might not remember a poem we learnt in childhood. We might not recall the face or name of a childhood friend whom we have not met for a long time.
So, practice is similar to how an LLM is trained: we need to repeat something again and again and again to become perfect. Also, parroting is difficult for humans. Try reading a story and understanding it. It is very easy, and that is intelligence. Now try parroting the entire story instead. Very difficult. Why?
Because for human brains, understanding something is easy, whereas memorizing lengthy stuff is quite difficult.
What exactly is understanding something?
Understanding is the grouping together of similar entities. So when we understand something, we are simply mapping it to a group which we already know.

If you have seen Chinese characters but do not know how to read them, it simply means that you have grouped them together in a bunch of neurons in your brain which recognize them as Chinese characters. You understand that something is a Chinese character without having to learn each character individually.
Since you are already aware of writing systems, and that different languages have different writing systems, and that Chinese (Mandarin) is a language with its own writing system, it becomes easy for you to understand that this is the Chinese writing system.
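A crude computational analogy for this kind of grouping, purely illustrative and built on made-up 2-D feature vectors, is nearest-centroid classification: an unseen item is “understood” by mapping it to the closest group learnt earlier:

```python
import numpy as np

# Made-up 2-D "feature vectors" for items already grouped in the past.
groups = {
    "Latin script":   np.array([[1.0, 0.2], [0.9, 0.1], [1.1, 0.3]]),
    "Chinese script": np.array([[0.1, 1.0], [0.2, 1.1], [0.0, 0.9]]),
}
centroids = {name: pts.mean(axis=0) for name, pts in groups.items()}

# A character we have never individually learnt to read...
unseen_character = np.array([0.15, 0.95])
# ...still maps to the nearest group we already know.
closest = min(centroids, key=lambda name: np.linalg.norm(unseen_character - centroids[name]))
print(f"Never learnt this exact character, but it belongs to the '{closest}' group")
```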
What is intelligence?
Knowing that I do not know something is the first step of intelligence. In other words, understanding that you do not understand something is the gateway to intelligence.
The day we can teach an LLM to understand that it does not know something, that is when AI starts becoming really intelligent. Once it knows it does not know something, then it should have an urge to learn what it does not know. That is intelligence.
So, the second step in intelligence, and this is really crucial, is to know what to do when you do not know what to do.
Intelligence is what you use when you don’t know what to do — Jean Piaget
That is the true sign of intelligence. When you come to the end of a path in a forest and are lost, you start weighing your options on what to do next. Intelligence is what helps you survive when you come across a situation you have never been in before.
So an AI can start becoming truly intelligent (not just mimicking intelligence) the day it understands on its own that it does not know something, and then does something to learn it.
Until then,
LLMs are nothing but internet-scale compressed databases that can be queried using natural language.