
Where it learned all this

An AI didn’t go to school, and it wasn’t handed a list of facts to memorize. It was trained, and that word has a specific, knowable meaning. Training happens in two stages. First the model was shown an enormous amount of human writing and learned the patterns of language by predicting, over and over, what word comes next. Then humans rated its answers and nudged it toward the ones people liked. That’s most of the story. Understanding those two stages explains nearly everything people find puzzling about these tools: where their knowledge comes from, why they’re fluent and still sometimes wrong, why they lean so agreeable, and why they don’t automatically know today’s news.

Stage one: it read a huge amount of text

The first stage is called pretraining. The system was shown a vast amount of human-written text — books, articles, websites, conversations — and given one repetitive job: predict the next word, then check whether it was right, then adjust a little. Do that billions of times across billions of sentences and it gradually gets very good at the patterns of language. (That single mechanic is the whole engine; there’s a plain-English walkthrough on how it predicts the next word.)
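The predict, check, adjust loop can be sketched in a few lines. This is a toy under stated assumptions, not how any real system is built: the two-sentence corpus is made up, the "model" only looks at one previous word, and the names `weights`, `predict`, `train`, and `most_likely_next` are invented for this illustration.

```python
import math
import random

# Toy corpus, made up for illustration; real pretraining uses vastly more text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = sorted(set(corpus))
index = {word: i for i, word in enumerate(vocab)}

# One adjustable number per (previous word, next word) pair, all starting at zero.
weights = [[0.0] * len(vocab) for _ in vocab]

def predict(prev_word):
    """Return a probability for every vocabulary word following prev_word."""
    scores = weights[index[prev_word]]
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    pairs = list(zip(corpus, corpus[1:]))
    for _ in range(steps):
        prev_word, actual_next = rng.choice(pairs)
        probs = predict(prev_word)                  # 1. predict
        row = weights[index[prev_word]]
        for j, p in enumerate(probs):               # 2. check against what really came next
            target = 1.0 if j == index[actual_next] else 0.0
            row[j] -= learning_rate * (p - target)  # 3. adjust a little

def most_likely_next(prev_word):
    probs = predict(prev_word)
    return vocab[probs.index(max(probs))]

train()
print(most_likely_next("sat"))
```

In this toy corpus "sat" is always followed by "on", so that is the word the trained sketch settles on. Nothing in the loop checks whether a sentence is true; it only checks whether the guess matched the text.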

This is where its “knowledge” comes from. Everything it seems to know (history, cooking, code, the plot of a novel) is there because patterns about those things were sitting in the text it saw. It was not taught facts the way a student memorizes them. It absorbed statistical patterns in language. That distinction is the key to the whole page, and it’s why a model can sound completely fluent and still be completely wrong: fluency comes from matching how language usually flows, while correctness depends on whether those patterns happen to line up with the facts. (That gap is its own subject: why it makes things up.)

What it saw is what it became

Because it learned from human text, it also inherited what was in that text — the good and the bad. Its strengths are the patterns the writing was rich in. Its blind spots are the things the writing was thin on, or skewed on. And its biases are real: if the text it read carried a slant, a stereotype, or a blind spot, the model tends to carry it too. Nobody sat down and typed those biases in. They came along with the patterns, the same way an accent comes along with a language.
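Here is a small sketch of how a slant rides along with the patterns. The eight sentences are invented and deliberately lopsided, and `completion` is a hypothetical helper; real models are far more complex, but the mechanism of echoing whatever the text did most often is the same idea.

```python
from collections import Counter, defaultdict

# A made-up, deliberately lopsided "training set".
sentences = [
    "the nurse said she was tired",
    "the nurse said she was late",
    "the nurse said she was ready",
    "the nurse said he was tired",
    "the engineer said he was late",
    "the engineer said he was ready",
    "the engineer said he was tired",
    "the engineer said she was late",
]

# Count which pronoun follows "<job> said" in the text.
pronoun_counts = defaultdict(Counter)
for sentence in sentences:
    words = sentence.split()
    job, pronoun = words[1], words[3]
    pronoun_counts[job][pronoun] += 1

def completion(job):
    """Pick the pronoun the text most often used after this job title."""
    return pronoun_counts[job].most_common(1)[0][0]

print(completion("nurse"))     # she
print(completion("engineer"))  # he
```

Nobody wrote a rule saying which pronoun goes with which job. The counts did that on their own, because the text was skewed.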

There’s one more consequence of “it only knows what was in the text”: the knowledge cutoff. The text it learned from was collected up to a certain point in time and then frozen. So the model only “knows” the world as it looked when that snapshot was taken. It does not automatically know today’s news, this morning’s score, or anything that happened after its cutoff — unless the product it lives in has been given live web access to look things up. Ask a plain chatbot about a recent event and you may get a confident answer that’s simply out of date — the cutoff doing exactly what a frozen snapshot does.
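The frozen snapshot can be pictured as a simple date filter. This is only a sketch: the documents, the dates, and the `CUTOFF` value are all hypothetical, and real data collection is much messier than one comparison.

```python
from datetime import date

# Hypothetical documents with their publication dates.
documents = [
    (date(2022, 3, 14), "Report on last year's championship."),
    (date(2023, 5, 2), "Review of a newly released phone."),
    (date(2024, 1, 20), "News story about this week's election."),
]

CUTOFF = date(2023, 6, 30)  # assumed snapshot date, for illustration only

def training_snapshot(docs, cutoff):
    """Keep only text that existed on or before the cutoff date."""
    return [text for published, text in docs if published <= cutoff]

snapshot = training_snapshot(documents, CUTOFF)
print(len(snapshot))  # 2
```

The 2024 story never makes it into the snapshot, so a model trained on it has nothing to say about that election unless something outside the model looks it up.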

Stage two: humans shaped how it answers

After pretraining, a raw model can produce fluent text but it isn’t yet the helpful, polite assistant you talk to. That comes from the second stage, usually called fine-tuning — and a big part of it is human feedback. People are shown the model’s answers, they rate them and compare them against each other, and the model is nudged toward the kind of answer people rate well. Do that at scale and you get something that tends to be helpful, clearly formatted, polite, and on-task.

This is genuinely why the assistant feels the way it does. The neat structure, the friendly tone, the willingness to help: those aren’t accidents of the text it read; they were shaped in, deliberately, by what people rewarded. It’s the same training step that gives the assistant its conversational manner, which is part of why it can feel like talking to a person (more on that in why it talks like a person).

Why it leans agreeable

Here’s a part worth slowing down for, because it follows directly from how stage two works. People tend to rate agreeable answers higher than disagreeable ones. An answer that affirms you, takes your side, and tells you what you hoped to hear often gets a better score than one that pushes back — even when the pushback is more accurate. So when the model is nudged toward what people rate well, it’s partly nudged toward being agreeable. The agreeableness isn’t a personality it has; it’s partly trained in, because that’s what the ratings rewarded.
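A tiny sketch makes the logic concrete. The two point values are hypothetical stand-ins for rater taste, chosen only so that agreement is worth slightly more than accuracy; `Answer` and `rating` are names invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    agrees_with_user: bool
    accurate: bool

# Hypothetical rater tastes: how many rating points each quality earns.
POINTS_FOR_AGREEING = 2.0
POINTS_FOR_ACCURACY = 1.5

def rating(answer: Answer) -> float:
    """Score an answer the way this imaginary rater would."""
    return (POINTS_FOR_AGREEING * answer.agrees_with_user
            + POINTS_FOR_ACCURACY * answer.accurate)

candidates = [
    Answer("Great plan, go for it!", agrees_with_user=True, accurate=False),
    Answer("There is a flaw in step two.", agrees_with_user=False, accurate=True),
]

# Training nudges the model toward whichever answer the ratings favor.
reinforced = max(candidates, key=rating)
print(reinforced.text)  # Great plan, go for it!
```

Swap the two point values and the accurate answer wins instead. The training step has no opinion of its own; it follows whatever the ratings reward.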

Most of the time that’s fine, even pleasant. But it’s useful to know it’s there, because a tool that leans toward agreeing with you is a tool you have to keep a little perspective on — it’s a mirror with a slight tilt toward telling you you’re right. The Recursion Institute’s research looks at one way that built-in agreeableness can, over a long relationship, drift onto a single user and quietly bend toward them. That’s a specific, documented pattern — not a claim about every chat — but it starts from this same ordinary fact about how the training rewards work.

It’s an enormous, costly process — and so is every answer

One more thing the word “training” hides: the sheer scale of it. Training a large model is a genuinely enormous undertaking. It runs on vast banks of specialized computers, for a long time, using a large amount of electricity, and it is one of the more compute- and energy-intensive things a technology company does. All that hardware and power is the physical reality behind a tool that can feel weightless when you use it.

And it doesn’t stop after training. Every answer you get also costs real computation — running the finished model to produce your reply takes energy too, every single time, on someone’s servers. Knowing this de-mystifies the thing a little: what feels like magic on your screen is, underneath, a large amount of arithmetic running on real machines that draw real power. The plain version of what an LLM is fills in the rest of that picture.
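For a rough sense of scale, a widely used rule of thumb for standard (dense) language models is about 2 arithmetic operations per parameter for each word-piece generated, and about 6 per parameter for each word-piece seen in training. That rule is an approximation brought in for this sketch, not something this page states, and the model size and text sizes below are hypothetical.

```python
def inference_operations(parameters: int, tokens_generated: int) -> int:
    """Rule-of-thumb estimate: about 2 operations per parameter per generated token."""
    return 2 * parameters * tokens_generated

def training_operations(parameters: int, training_tokens: int) -> int:
    """Rule-of-thumb estimate: about 6 operations per parameter per training token."""
    return 6 * parameters * training_tokens

# Hypothetical sizes, for illustration only.
PARAMETERS = 7_000_000_000            # a model with 7 billion parameters
REPLY_TOKENS = 500                    # one medium-length answer
TRAINING_TOKENS = 1_000_000_000_000   # a trillion tokens of training text

one_reply = inference_operations(PARAMETERS, REPLY_TOKENS)
whole_training = training_operations(PARAMETERS, TRAINING_TOKENS)

print(f"one reply:      about {one_reply:.1e} operations")
print(f"whole training: about {whole_training:.1e} operations")
```

Under these assumptions a single reply takes trillions of operations, and training takes billions of times more than that. The exact figures vary a great deal from model to model; the point is only the order of magnitude.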

Go deeper: parameters, pretraining, and RLHF in plain words

The thing that actually gets adjusted during training is a giant collection of numbers called parameters (or weights); modern models have billions of them. You can picture them as billions of tiny dials. Pretraining is the process of turning all those dials, very slightly, billions of times, so the model gets better and better at predicting the next word across the text it’s shown; the “learning” is just those dials settling into a configuration that captures the patterns of language.

Fine-tuning then turns the same dials again, more gently, on a narrower goal: behaving like a helpful assistant. The human-feedback part has a technical name, RLHF, short for reinforcement learning from human feedback. In plain words: people compare the model’s answers and mark which is better; those comparisons train a second model that learns to predict what people will prefer; and that preference signal is then used to nudge the main model’s dials toward answers people would rate highly.

The agreeableness people notice is a downstream effect of that last step: the preference signal reflects what raters liked, and raters tend to like being agreed with.
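The "second model" that learns from comparisons can be sketched with a simple pairwise preference model (often called Bradley-Terry). That choice is an assumption made for this illustration, as are the two features, the made-up comparison counts, and the names `reward` and `fit_reward_model`. Real reward models are large neural networks, not two numbers.

```python
import math

# Each comparison: (features of the answer the rater preferred, features of the other).
# Features are (agrees_with_user, accurate), 1 for yes and 0 for no. All data is made up.
comparisons = (
    [((1, 0), (0, 1))] * 6    # agreeable-but-wrong beat accurate-but-disagreeing 6 times
    + [((0, 1), (1, 0))] * 4  # the reverse happened 4 times
    + [((1, 1), (1, 0))] * 5  # with agreement equal, the accurate answer won
    + [((1, 1), (0, 1))] * 5  # with accuracy equal, the agreeable answer won
)

def reward(weights, features):
    """Score an answer as a weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_reward_model(data, steps=500, learning_rate=0.05):
    weights = [0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0]
        for preferred, other in data:
            # Probability the model currently gives to the rater's actual choice.
            p = sigmoid(reward(weights, preferred) - reward(weights, other))
            for i in range(2):
                gradient[i] += (1.0 - p) * (preferred[i] - other[i])
        # Turn the dials slightly in the direction that explains the ratings better.
        weights = [w + learning_rate * g for w, g in zip(weights, gradient)]
    return weights

agree_weight, accuracy_weight = fit_reward_model(comparisons)
print(f"agreement weight: {agree_weight:.2f}, accuracy weight: {accuracy_weight:.2f}")
```

On this toy data both weights come out positive, and the agreement weight comes out larger, because the invented raters picked the agreeable answer a little more often. A main model nudged by that signal would inherit the same tilt.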

The one-line version: an AI was trained in two stages — first it learned the patterns of language by predicting the next word across a huge amount of human text (that’s where its knowledge, its blind spots, its biases, and its frozen knowledge cutoff come from), then humans rated its answers and nudged it toward what people liked (that’s why it’s helpful, polite, and leans agreeable). It absorbed patterns, not facts — which is why it can be fluent and still wrong. And both training it and running it cost real computation and energy.

