Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

Jeremiah

๐Ÿ‘ค Speaker
1129 total appearances

Appearances Over Time

Podcast Appearances

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

How are different character counts represented?

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Character count and line width manifolds are aligned and discretized by features.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

How is the boundary detected?

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Boundary heads use QK to shift offsets of line count and width manifolds.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Will the next word fit?

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Final prediction is made by arranging representations to be linearly separable.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

And how are representations constructed?

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Multiple attention heads specialized in particular token ranges.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Their sum creates a line count manifold.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Scott writes, The answer was, the AI represents various features of the line breaking process as one-dimensional helical manifolds in a six-dimensional space, then rotates the manifolds in some way that corresponds to multiplying or comparing the numbers that they're representing.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

You don't need to understand what this means, so I've relegated my half-hearted attempt to explain it to a footnote.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Here's the footnote.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

My extremely half-hearted attempt at understanding this claim, the AI needs to track things like whether you're on character 1, 2, 3, etc.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

of the current line.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

The simplest way to do this would be to have one feature for the state of being on character number 1, another for the state of being on character number 2, etc.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Since AI features can be modeled as dimensions, this would correspond to locating the current character count in a 100-dimensional space, which would work.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

But this is expensive in feature count.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

A document with 100 characters per line would take 100 features for this simple task.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

Another simple way to do this would be to have one feature whose value gets higher as the character count goes up.

Astral Codex Ten Podcast
Next-Token Predictor Is An AI's Job, Not Its Species

This would correspond to locating the character count in a one-dimensional space, aka a straight line.