Jeremiah

Astral Codex Ten Podcast

Next-Token Predictor Is An AI's Job, Not Its Species

You don't feel like you're predicting anything when you're doing a math problem.

You're just doing good, normal, mathematical steps, like reciting PEMDAS to yourself and carrying the one.

In the same way, even though an AI was shaped by next-token prediction, the inside of its thoughts doesn't look like next-token prediction.

In the abstract, it probably looks like a world model, the same as yours.

In the concrete...

The science of figuring out what an AI's innards are concretely doing is called mechanistic interpretability.

It's very hard to do.

AI innards are notoriously confusing, and one team at Anthropic produces most of the headline results.

Recently, they explored how Claude predicts where a line break will be in a page of text.

Since line break is a token, this is literally a next-token prediction task.

Here's a diagram.

It's captioned, "Key steps in the line-breaking behavior can be described in terms of the construction and manipulation of manifolds."

So there's a series of sub-diagrams in here.

The first is captioned, "LLMs perceive visual properties of text despite only seeing a list of numbers."

So it shows line-wrapped text with various words and a line break, and then it says what the model sees.

It's just a long list of numbers.
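
To make "a long list of numbers" concrete, here is a minimal sketch of turning line-wrapped text into integer IDs. It assumes a toy word-level vocabulary built on the spot; `build_vocab` and `encode` are hypothetical helpers for illustration, not the tokenizer Claude or any real model uses.

```python
# Toy illustration only: a hypothetical word-level vocabulary, not a real tokenizer.

def split_pieces(text):
    """Split text into words, keeping each line break as its own piece."""
    return [p for p in text.replace("\n", " \n ").split(" ") if p]

def build_vocab(text):
    """Assign an integer ID to each distinct piece; the line break gets ID 0."""
    vocab = {"\n": 0}
    for piece in split_pieces(text):
        if piece not in vocab:
            vocab[piece] = len(vocab)
    return vocab

def encode(text, vocab):
    """Map text to the list of integer IDs the model would receive."""
    return [vocab[piece] for piece in split_pieces(text)]

text = "the quick brown fox\njumps over the lazy dog"
vocab = build_vocab(text)
ids = encode(text, vocab)
print(ids)  # [1, 2, 3, 4, 0, 5, 6, 1, 7, 8]
```

In this toy encoding the line break is just the ID 0 sitting in the list; nothing in the numbers says how wide the page is or where a line visually ends.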

How are different character counts represented?

Character count and line width manifolds are aligned and discretized by features.

How is the boundary detected?

Boundary heads use QK to shift offsets of line count and width manifolds.
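
For a sense of the task itself, the sketch below writes the line-break decision as an explicit rule: track the character count of the current line and compare it with the line width. This is a hand-written stand-in under simple assumptions (words separated by single spaces, a fixed width); it is not the learned mechanism the diagram describes, and `next_is_line_break` and `wrap` are hypothetical names.

```python
def next_is_line_break(current_line, next_word, line_width):
    """Return True if next_word would not fit on the current line."""
    if not current_line:
        # An empty line always accepts the next word, however long it is.
        return False
    # Characters already on the line, plus one space, plus the new word.
    return len(current_line) + 1 + len(next_word) > line_width

def wrap(words, line_width):
    """Greedily fill lines, breaking whenever the next word would overflow."""
    lines, current = [], ""
    for word in words:
        if next_is_line_break(current, word, line_width):
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    return lines

words = "the quick brown fox jumps over the lazy dog".split()
print(wrap(words, 15))
# ['the quick brown', 'fox jumps over', 'the lazy dog']
```

The rule has direct access to `len(current_line)` and `line_width`. A model that only receives token IDs has to reconstruct both quantities internally, which is the part the character-count and line-width manifolds in the diagram are about.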
