Jeremiah
You don't feel like you're predicting anything when you're doing a math problem.
You're just doing good, normal, mathematical steps, like reciting PEMDAS to yourself and carrying the one.
In the same way, even though an AI was shaped by next-token prediction, the inside of its thoughts doesn't look like next-token prediction.
In the abstract, it probably looks like a world model, the same as yours.
In the concrete...
The science of figuring out what an AI's innards are concretely doing is called mechanistic interpretability.
It's very hard to do.
AI innards are notoriously confusing, and one team at Anthropic produces most of the headline results.
Recently, they explored how Claude predicts where a line break will be in a page of text.
Since a line break is a token, this is literally a next-token prediction task.
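To make the task concrete, here is a minimal sketch of the line-breaking decision written as ordinary code. It is not how Claude does it internally. It only shows what has to be tracked to get the answer right: the characters used so far on the current line, the line width, and the length of the next word. The function names `should_break` and `wrap` and the parameter `line_width` are made up for this illustration.

```python
def should_break(current_line: str, next_word: str, line_width: int) -> bool:
    """Return True if next_word would not fit on the current line."""
    # Characters already used, plus one separating space if the line is non-empty.
    needed = len(current_line) + (1 if current_line else 0) + len(next_word)
    return needed > line_width


def wrap(words, line_width):
    """Greedy line wrapping: emit a line break whenever the next word will not fit."""
    lines, current = [], ""
    for word in words:
        # Only break if there is something on the line; an over-long word
        # simply gets a line to itself.
        if current and should_break(current, word, line_width):
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in wrap("the quick brown fox jumps".split(), line_width=10):
        print(line)
```

A model predicting the line-break token has to arrive at the same yes-or-no answer as `should_break`, but it is never handed the character count or the line width directly. It has to infer both from the tokens it has already seen.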
Here's a diagram.
It's captioned, "Key steps in the line-breaking behavior can be described in terms of the construction and manipulation of manifolds."
So there's a series of sub-diagrams in here.
The first is captioned, "LLMs perceive visual properties of text despite only seeing a list of numbers."
So it shows line-wrapped text with various words and a line break, and then it says what the model sees.
It's just a long list of numbers.
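As a rough illustration of that "list of numbers," here is a toy encoder. It is a hypothetical stand-in and not a real tokenizer: real models use subword vocabularies, while this sketch just gives each whitespace-separated word, and the newline, its own integer ID in order of first appearance. The name `encode` is invented for this example.

```python
def encode(text, vocab=None):
    """Map each word (and each newline) to an integer ID; returns (ids, vocab)."""
    vocab = {} if vocab is None else vocab
    ids = []
    # Pad newlines with spaces so they split out as their own token.
    for piece in text.replace("\n", " \n ").split(" "):
        if piece == "":
            continue
        if piece not in vocab:
            vocab[piece] = len(vocab)  # next unused ID
        ids.append(vocab[piece])
    return ids, vocab


if __name__ == "__main__":
    ids, vocab = encode("the cat\nthe dog")
    print(ids)          # the integer sequence the "model" would see
    print(vocab["\n"])  # the line break is just another ID
```

Nothing in the resulting integers says how wide a word is or where it sits on the page. Any sense of layout has to be reconstructed from the IDs themselves.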
How are different character counts represented?
Character count and line width manifolds are aligned and discretized by features.
How is the boundary detected?
Boundary heads use QK to shift offsets of line count and width manifolds.