Jeremiah
When a monk decides to swear an oath of celibacy and never reproduce, he does so using a brain that was optimized to promote reproduction, just using it very far out of distribution, in an area where it no longer functions as intended.
One level lower down, your brain was shaped by next-sense-datum prediction.
Partly you learned how to do addition because only the mechanism of addition correctly predicted the next word out of your teacher's mouth when she said "3 plus 3 is..."
It's more complicated than this, sorry, but this oversimplification is basically true.
You don't feel like you're predicting anything when you're doing a math problem.
You're just doing good, normal, mathematical steps, like reciting PEMDAS to yourself and carrying the one.
In the same way, even though an AI was shaped by next token prediction, the inside of its thoughts doesn't look like next token prediction.
In the abstract, it probably looks like a world model, the same as yours.
In the concrete...
The science of figuring out what an AI's innards are concretely doing is called mechanistic interpretability.
It's very hard to do.
AI innards are notoriously confusing, and one team at Anthropic produces most of the headline results.
Recently, they explored how Claude predicts where a line break will be in a page of text.
Since a line break is a token, this is literally a next-token prediction task.
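The point that a line break is just another token can be made concrete with a toy sketch. This is not Claude's actual tokenizer; the tiny vocabulary below is hypothetical, and real tokenizers have tens of thousands of entries, with "\n" among them.

```python
# Toy illustration: a newline is just another token ID, so deciding
# where a line wraps is literally next-token prediction.
# (Hypothetical four-word vocabulary; real models use far larger ones.)

vocab = {"the": 0, "cat": 1, "sat": 2, "\n": 3}

def tokenize(words):
    """Map each word (or the newline) to its integer token ID."""
    return [vocab[w] for w in words]

ids = tokenize(["the", "cat", "sat", "\n", "the", "cat"])
print(ids)  # the line break appears as ID 3, indistinguishable in kind from the words
```

Predicting where the line break falls just means assigning high probability to that one ID at the right position in the sequence.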
Here's a diagram.
It's captioned, "Key steps in the line-breaking behavior can be described in terms of the construction and manipulation of manifolds."
So there's a series of sub-diagrams in here.
The first is captioned, "LLMs perceive visual properties of text despite only seeing a list of numbers."
So it shows line-wrapped text with various words and a line break, and then it says what the model sees.
It's just a long list of numbers.
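The sub-diagram's point, a visual property hidden inside a list of numbers, can be sketched as follows: to know when to break, something inside the model must implicitly track how far along the current line it is. The token texts, lengths, and spacing rule below are all hypothetical simplifications, not Anthropic's actual mechanism.

```python
# Sketch of the visual layout latent in a number list: recovering the
# running column position on the current line from token IDs alone.
# (Hypothetical token-to-text table; the "+1 for a space" rule is a
# simplification of how real tokenizers handle whitespace.)

token_text = {0: "the", 1: "cat", 2: "sat", 3: "\n"}

def column_positions(ids):
    """Running character count on the current line, reset at each newline."""
    col, positions = 0, []
    for i in ids:
        text = token_text[i]
        col = 0 if text == "\n" else col + len(text) + 1  # +1 for a space
        positions.append(col)
    return positions

print(column_positions([0, 1, 2, 3, 0]))  # the newline at position 3 resets the count
```

Nothing in the input marks where a line visually ends; the layout only exists if the model computes something like this running count internally, which is what the mechanistic interpretability work probes for.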