Jeremiah

👤 Speaker
1265 total appearances

Podcast Appearances

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Will the next word fit?

Final prediction is made by arranging representations to be linearly separable.

And how are representations constructed?

Multiple attention heads specialized in particular token ranges.

Their sum creates a line count manifold.

Scott writes: The answer was, the AI represents various features of the line-breaking process as one-dimensional helical manifolds in a six-dimensional space, then rotates the manifolds in some way that corresponds to multiplying or comparing the numbers that they're representing.

You don't need to understand what this means, so I've relegated my half-hearted attempt to explain it to a footnote.

Here's the footnote.

My extremely half-hearted attempt at understanding this claim: the AI needs to track things like whether you're on character 1, 2, 3, etc. of the current line.

The simplest way to do this would be to have one feature for the state of being on character number 1, another for the state of being on character number 2, etc.

Since AI features can be modeled as dimensions, this would correspond to locating the current character count in a 100-dimensional space, which would work.

But this is expensive in feature count.

A document with 100 characters per line would take 100 features for this simple task.
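
As a rough illustration of the one-feature-per-position idea, here is a minimal Python sketch. It is not taken from the episode or from any actual model: the names `LINE_WIDTH`, `one_hot_position`, and `decode_position` are hypothetical, and the width of 100 simply follows the example above.

```python
# Hypothetical illustration: one feature (dimension) per character position.
LINE_WIDTH = 100  # assumed maximum characters per line, as in the example above


def one_hot_position(count: int, width: int = LINE_WIDTH) -> list[float]:
    """Return a vector with a single 1.0 in the slot for the current count (1-based)."""
    if not 1 <= count <= width:
        raise ValueError(f"count must be in 1..{width}")
    vec = [0.0] * width
    vec[count - 1] = 1.0
    return vec


def decode_position(vec: list[float]) -> int:
    """Recover the count by finding the most active feature."""
    return max(range(len(vec)), key=lambda i: vec[i]) + 1


v = one_hot_position(42)
print(len(v))              # 100 features spent on a single counter
print(decode_position(v))  # 42
```

Decoding is trivial and unambiguous, which is why this "would work", but the vector length grows one-for-one with the line width.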

Another simple way to do this would be to have one feature whose value gets higher as the character count goes up.

This would correspond to locating the character count in a one-dimensional space, aka a straight line.

This fails for two technical reasons.

First, AIs can't manipulate feature values that finely, and second, the AI needs to compare this feature to some other feature representing expected number of characters before the line break, and it can't directly compare feature values in this sense.
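
The first of those two reasons can be made concrete with a small Python sketch. This is an assumption-laden toy, not a description of how any real model behaves: the function names are hypothetical, limited precision is modeled as a fixed perturbation, and the size `NOISE = 0.006` is chosen only for illustration. The second reason (the difficulty of comparing two feature values directly) is not modeled here.

```python
# Hypothetical illustration: a single feature whose value grows with the count.
LINE_WIDTH = 100  # assumed maximum characters per line


def scalar_position(count: int, width: int = LINE_WIDTH) -> float:
    """Encode the count as one number between 0 and 1."""
    return count / width


def decode_scalar(value: float, width: int = LINE_WIDTH) -> int:
    """Recover the count by rounding to the nearest position."""
    return round(value * width)


NOISE = 0.006  # assumed perturbation size, for illustration only

clean = scalar_position(42)
print(decode_scalar(clean))          # 42
print(decode_scalar(clean + NOISE))  # 43: a small error already changes the count
```

With 100 positions squeezed onto one axis, neighboring counts sit only 0.01 apart, so any error larger than half that gap is enough to misread the position.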

Its solution is a compromise: since one dimension is too few and 100 dimensions is too many, it uses some medium number of dimensions, which turns out to be 6.

Trying to map things in 6-dimensional space naturally produces these helical manifold structures, and comparing them to each other naturally looks like rotating the manifolds.
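
To give a feel for what a one-dimensional curve in six dimensions and "comparing by rotating" might look like, here is a toy Python sketch. It is a simplified stand-in, not the representation the model actually learns: the sine/cosine construction, the three `PERIODS`, and the function names `helix_embed`, `rotate`, and `similarity` are all assumptions made for illustration.

```python
import math

# Assumed periods for three circular planes (3 planes x 2 coordinates = 6 dimensions).
PERIODS = (100.0, 30.0, 10.0)


def helix_embed(count: float) -> list[float]:
    """Place a count on a curve that winds around three circles at once."""
    vec = []
    for period in PERIODS:
        angle = 2 * math.pi * count / period
        vec.extend((math.cos(angle), math.sin(angle)))
    return vec


def rotate(vec: list[float], shift: float) -> list[float]:
    """Rotate each 2-D plane by the angle that corresponds to `shift` characters."""
    out = []
    for k, period in enumerate(PERIODS):
        a = 2 * math.pi * shift / period
        x, y = vec[2 * k], vec[2 * k + 1]
        out.extend((x * math.cos(a) - y * math.sin(a),
                    x * math.sin(a) + y * math.cos(a)))
    return out


def similarity(u: list[float], v: list[float]) -> float:
    """Normalized dot product: 1.0 when two counts coincide, lower otherwise."""
    return sum(a * b for a, b in zip(u, v)) / len(PERIODS)


# Rotating the embedding of 40 by two characters lands on the embedding of 42.
shifted = rotate(helix_embed(40), 2)
best = max(range(1, 101), key=lambda n: similarity(shifted, helix_embed(n)))
print(best)  # 42
```

In this toy version, adding to a count is a rotation and checking whether two counts match is a dot product, which is one way to picture how six dimensions can do the work of a hundred without relying on very fine distinctions along a single axis.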