Trenton Bricken

And so you could think of it as a reward, but it's a very dense reward where you're getting signal at every single token, and you're always getting some signal.

967.007

Dwarkesh Podcast

Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Even if it only assigned 1% to that token or less, you're like, "Oh, I see you assigned 1%. Good job, keep doing that."

977.498

Upweight it.

984.28
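
The "dense reward" point above can be sketched numerically. This is a minimal illustration, not how any particular model is trained: it assumes a toy four-token vocabulary, treats the logits themselves as the trainable parameters, and uses hypothetical helper names (`softmax`, `cross_entropy_grad`, `sgd_step`). The gradient of the next-token cross-entropy loss with respect to the logits is the predicted distribution minus the one-hot target, so the correct token's logit is pushed up even when the model gave it only about 1%.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(logits, target):
    # Gradient of -log p[target] with respect to the logits: p - one_hot(target).
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

def sgd_step(logits, target, lr=1.0):
    # One plain gradient-descent step on the logits (an assumption for this toy).
    grad = cross_entropy_grad(logits, target)
    return [x - lr * g for x, g in zip(logits, grad)]

# Toy vocabulary of 4 tokens; the correct token (index 3) starts at about 1%.
logits = [math.log(0.60), math.log(0.29), math.log(0.10), math.log(0.01)]
target = 3

before = softmax(logits)[target]
after = softmax(sgd_step(logits, target))[target]
print(f"p(correct) before: {before:.4f}, after one step: {after:.4f}")
```

The gradient entry for the correct token is negative whenever its probability is below 1, so every token position contributes some learning signal, which is the sense in which the reward is "dense".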

Yeah, exactly.

984.842

That's right, yeah, yeah, yeah.

986.588

You think so?

1015.882

I don't know.

1043.648

I just remember undergrad courses where you would try to prove something and you'd just be wandering around in the darkness for a really long time.

1044.049

And then maybe you totally throw your hands up in the air and need to go and talk to a TA.

1052.5

And it's only when you talk to a TA that you can see where along the path of different solutions you went wrong, and what the correct thing to do would have been.

1056.365

And that's in the case where you know what the final answer is, right?

1065.838

In other cases, if you're just kind of shooting blind and meant to give an answer de novo –

1068.581

It's really hard to learn anything.
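
The "wandering around in the darkness" picture can be put into rough numbers with a toy calculation. Everything here is an assumption made for illustration: a proof is modeled as a chain of `num_steps` decisions with `num_choices` options each, exactly one chain is correct, and the learner guesses uniformly at random. The function names are hypothetical. With outcome-only feedback, nothing is learned until a whole chain happens to be right; with step-level feedback (the TA in the analogy), each step can be corrected on its own.

```python
def expected_attempts_outcome_only(num_choices, num_steps):
    # A full random attempt is correct with probability (1/num_choices)**num_steps,
    # so the expected number of attempts before the first success is its inverse.
    return num_choices ** num_steps

def expected_guesses_step_feedback(num_choices, num_steps):
    # With feedback after every step, each step is guessed (with replacement)
    # until it is right: num_choices guesses on average, repeated per step.
    return num_choices * num_steps

for steps in (2, 4, 6):
    sparse = expected_attempts_outcome_only(10, steps)
    dense = expected_guesses_step_feedback(10, steps)
    print(f"steps={steps}: outcome-only ~{sparse} attempts, step feedback ~{dense} guesses")
```

Under these assumptions the outcome-only cost grows exponentially with the length of the chain while the step-feedback cost grows linearly, which is one way to read why it is so hard to learn anything when shooting blind.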