Sholto Douglas
It's a domain which just naturally lends itself to this.
Does it compile?
Does it pass the test?
You can go on LeetCode and you can run tests and you know whether or not you got the right answer.
But there isn't the same kind of thing for writing a great essay.
The question of taste in that regard is quite hard.
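To make that contrast concrete, here is a minimal sketch of what a "verifiable" reward for code can look like, assuming a pytest-style test suite is available; the function name and directory layout are illustrative, not something mentioned in the conversation. The grader just runs the tests and returns a binary signal, which is exactly the kind of signal an essay lacks.

```python
import subprocess

def verifiable_reward(solution_dir: str) -> float:
    """Illustrative binary reward: 1.0 if the candidate solution's tests pass, else 0.0.

    Assumes `solution_dir` contains a pytest-style test suite and that pytest
    is installed; both are assumptions for this sketch.
    """
    result = subprocess.run(
        ["python", "-m", "pytest", "--quiet", solution_dir],
        capture_output=True,
    )
    # Exit code 0 means every test passed; anything else counts as failure.
    return 1.0 if result.returncode == 0 else 0.0
```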
We discussed at dinner the other night which would come first: a Pulitzer Prize-winning novel or a Nobel Prize, or something like that.
And I actually think a Nobel Prize is more likely than a Pulitzer Prize-winning novel in some respects.
Because a lot of the tasks required in winning a Nobel Prize, or at least strongly assisting in winning a Nobel Prize, have more layers of verifiability built up.
So I expect them to accelerate the process of doing Nobel Prize-winning work more, initially, than that of writing Pulitzer Prize-worthy novels.
Copy paste, copy paste, copy paste.
Right, like carving away the marble on this.
I think it's worth noting that that paper was, I'm pretty sure, on the Llama and Qwen models.
And I'm not sure how much RL compute they used, but I don't think it was anywhere near comparable to the amount of compute that was used in the base models.
And so I think the amount of compute that you use in training is a decent proxy for the amount of actual raw new knowledge or capabilities you're adding to a model.
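As a rough illustration of that proxy, here is a back-of-the-envelope sketch using the common ~6 × parameters × tokens rule of thumb for dense-transformer training FLOPs; the parameter and token counts below are made-up illustrative numbers, not figures from the paper or models being discussed.

```python
def train_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb training compute for a dense transformer: ~6 * N * D FLOPs."""
    return 6.0 * params * tokens

# Illustrative (made-up) numbers: a 70B-parameter base model pretrained on
# 15T tokens versus an RL stage that sees on the order of 1B tokens.
pretrain = train_flops(70e9, 15e12)   # ~6.3e24 FLOPs
rl_stage = train_flops(70e9, 1e9)     # ~4.2e20 FLOPs

print(f"RL compute is ~{rl_stage / pretrain:.2e} of pretraining compute")
```

Under those assumptions the RL stage is a tiny fraction of the total compute, which is the sense in which compute serves as a proxy for how much new knowledge the RL stage could plausibly be adding.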
So my prior, at least if you look at all of DeepMind's RL research from before, is that
RL was able to teach these Go- and chess-playing agents new knowledge that was in excess of human-level performance just from RL signal, provided the RL signal was sufficiently clean.
So there's nothing structurally limiting about the algorithm here that prevents it from imbuing the neural net with new knowledge.