Neuralese hasn't won yet.
Chain of thought has a structural advantage, and it's worth understanding why.
When a model reasons in text, what gets stored?
Tokens.
Tokens are small.
You can store long sequences of them, and searching back through them to find what's relevant stays feasible even as the sequence grows.
Need to remember something from step 12?
It's right there.
The model looks back over its own output and finds it instantly.
If the problem is complex and requires long chains of reasoning, just keep generating tokens until you're done.
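For a sense of scale, here's a rough back-of-envelope sketch. The trace length and the 2-byte token ID are illustrative assumptions, not measurements from any particular model:

```python
# Back-of-envelope: a long chain-of-thought trace stored as discrete tokens.
steps = 100_000                 # reasoning tokens in a very long trace (illustrative)
bytes_per_token_id = 2          # one int16 id per token, assuming a vocab under 65k

scratchpad_kib = steps * bytes_per_token_id / 1024
print(f"token scratchpad: ~{scratchpad_kib:.0f} KiB")   # ~195 KiB

# And "looking back" is just re-reading your own output:
trace = [f"step {i}: ..." for i in range(steps)]
print(trace[12])                # step 12 is right there, verbatim
```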
Neuralese doesn't have this luxury.
To match chain of thought's capability, Neuralese needs a scratchpad too.
But internal states are big. That's what makes them rich, and it's also what makes them hard to scale.
At runtime, storing and searching those states gets expensive, and continuous values accumulate errors in ways discrete tokens don't.
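Two toy sketches of those costs. The hidden-state width, precision, and noise level below are assumptions chosen for illustration, not properties of any real neuralese system:

```python
import numpy as np

# Cost 1: size. The same 100,000-step trace, stored as hidden states instead of ids.
steps, d = 100_000, 4096                    # trace length and hidden width (assumed)
state_mib = steps * d * 2 / 2**20           # float16: 2 bytes per value
print(f"hidden-state scratchpad: ~{state_mib:.0f} MiB")   # ~781 MiB vs ~195 KiB of tokens

# Cost 2: drift. Discrete tokens snap back to a fixed vocabulary every step,
# so tiny numerical errors get absorbed; continuous values carry them forward.
rng = np.random.default_rng(0)
vocab = np.linspace(-1, 1, 51)              # toy "vocabulary" of allowed values
x_cont = x_disc = 0.0
for _ in range(1_000):
    x_cont += rng.normal(0, 1e-3)                       # error accumulates
    noisy = x_disc + rng.normal(0, 1e-3)
    x_disc = vocab[np.abs(vocab - noisy).argmin()]      # snapping absorbs it
print(f"continuous drift: {abs(x_cont):.3f}, discrete drift: {abs(x_disc):.3f}")
```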
In training, parallelizing across machines is trickier when each state depends on the last.
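A minimal sketch of that dependency, with a stand-in matmul for "the model" and made-up sizes. With a token trace, the whole target sequence exists up front, so every step can be computed at once; with chained hidden states, step t has no input until step t-1 finishes, which is what makes splitting the work awkward:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 8, 16                                  # reasoning steps and hidden width (made-up)
W = rng.standard_normal((d, d)) / np.sqrt(d)  # stand-in for "the model"

# Token-style training: the ground-truth trace already exists, so every step's
# input is known up front and all T steps go through in one parallel pass.
token_inputs = rng.standard_normal((T, d))    # embeddings of the recorded trace
parallel_out = token_inputs @ W               # one matmul covers every step

# Neuralese-style training: step t's input *is* step t-1's output, so nothing
# can start until the previous state exists; the chain is inherently serial.
state = rng.standard_normal(d)
serial_out = []
for _ in range(T):
    state = np.tanh(state @ W)                # must wait for the previous state
    serial_out.append(state)
```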
This is the fundamental trade-off.
Richer representations versus scalability.
Neuralese researchers are exploring different points along this trade-off, trying to find one that outperforms chain of thought.
So far, none have.
Until someone does, the answer to "where should the next unit of compute go?" keeps coming back to chain of thought.