Dwarkesh

this was in the DeepSeek R1 paper, is that the space of trajectories is so wide that maybe it's hard to learn a mapping from an intermediate trajectory and value.

952.22

Dwarkesh Podcast

Ilya Sutskever – We're moving from the age of scaling to the age of research

And also given that, you know, in coding, for example, you will have the wrong idea, then you'll go back, then you'll change something.

966.401

That's the thing I was actually planning on asking you.

1020.509

There's something really interesting about emotions as the value function, which is that...

1022.511

It's impressive that they have this much utility while still being rather simple to understand.

1026.2

So I have two responses.

1036.071

Yeah.

1129.886

People have been talking about scaling data, scaling parameters, scaling compute.

1130.006

Is there a more general way to think about scaling?

1136.354

What are the other scaling axes?

1137.996

That's a very interesting way to put it.

1326.946

But let me ask you the question you just posed then.

1330.405

What are we scaling and what would it mean to have a recipe?

1332.548

Because I guess I'm not aware of a very clean relationship that almost looks like a law of physics, which existed in pre-training.

1336.473

There was a power law between data or compute or parameters and loss.

1345.265