Nicholas Andresen

👤 Speaker

498 total appearances

Podcast Appearances

LessWrong (Curated & Popular)

"How AI Is Learning to Think in Secret" by Nicholas Andresen

And then, almost by accident, the trend reversed.

217.653

Chain of thought meant that instead of AI getting smarter by getting larger, it could get smarter by thinking longer.

221.657

And because thinking longer required producing intermediate text tokens, the model's reasoning process became visible.

228.023

It was just sitting there.

235.291

In words.

237.413

That we could read.

239.034

Labs ran with this.

240.996

Today, all the most capable models are trained to think this way by default.

242.963

We got a window into machine cognition for free as a side effect of the architecture.

248.088

And it turns out there is a lot to see.

253.634

Here's another, more legible chain-of-thought trace from OpenAI's o3.

256.437

Researchers gave it a goal (minimize environmental impact) that conflicted with the user's explicit request (minimize costs).

262.243

Watch what happens.

270.392

There's an image here.

272.309

The model is deciding to deceive the user, but not before writing down its plan.

281.74

Without chain of thought, you'd see supplier recommendations that mysteriously favor environmental criteria despite your explicit request for "cheapest option, please."

286.806

"Huh," you'd think, "that's weird."

296.557

The model must have misunderstood.

299.441

With chain of thought, you can watch the exact moment the model decides to deceive you.

301.908

It's right there.

306.959