Ilya Sutskever

Speaker
766 total appearances

Podcast Appearances

Dwarkesh Podcast
Ilya Sutskever – We're moving from the age of scaling to the age of research

Or maybe plus minus, let's add error bars to those years.

Because people say, this is amazing, you gotta scale more, keep scaling.

The one word, scaling.

But now the scale is so big, like, is the belief really that, oh, it's so big, but if you had 100x more, everything would be so different?

Like it would be different for sure.

But like, is the belief that if you just 100x the scale, everything would be transformed?

I don't think that's true.

So it's back to the age of research again, just with big computers.

So we've already witnessed a transition from one type of scaling to a different type of scaling, from pre-training to RL.

Now people are scaling RL.

Now, based on what people say on Twitter, they spend more compute on RL than on pre-training at this point because RL can actually consume quite a bit of compute.

You know, you do very, very long rollouts.

Yes.

So it takes a lot of compute to produce those rollouts.

And then you get a relatively small amount of learning per rollout.

So you really can spend a lot of compute.

And I could imagine, like, I wouldn't, at this, it's more like, I wouldn't even call it scaling.

I would say, hey, like, what are you doing?

And is the thing you are doing the most productive thing you could be doing?

Can you find a more productive way of using your compute?