Dwarkesh Patel

Reiner Pope – The math behind how LLMs are trained and served

I never thought about it in terms of how much...

That if every single user who uses... Basically, for GPT-5 to be trained optimally, the total amount of tokens streamed by every single user of GPT-5 should equal the total amount that has gone into pre-training.

5332.505

Dwarkesh Podcast

Yeah.

5345.062

And the total amount of tokens that have gone into pre-training is the sum of all human knowledge.

5345.943

So each model should generate...

5350.069

the sum of human knowledge on the output that it gets on the input.

5353.22

Right.

5379.865

And then can we back out how much more compute than Chinchilla-optimal for a given sized

5380.668

Somebody told me 150 trillion.

5470.732

Active params?

5472.878

Sorry, I meant tokens.

5473.78

Oh, I see.

5526.268

So how much is it over-trained?

5526.77

That's whatever.
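The over-training question can be made concrete with a rough sketch. It assumes the common rule of thumb of about 20 training tokens per parameter as the Chinchilla-optimal ratio, and takes the 150 trillion token figure quoted in this exchange. The active parameter count is a hypothetical placeholder, not a number from the episode, and `overtraining_factor` is an illustrative helper name.

```python
# Rule-of-thumb Chinchilla-optimal ratio (an approximation, assumed here).
CHINCHILLA_TOKENS_PER_PARAM = 20.0


def overtraining_factor(training_tokens: float, active_params: float,
                        tokens_per_param: float = CHINCHILLA_TOKENS_PER_PARAM) -> float:
    """Ratio of actual training tokens to the Chinchilla-optimal token count."""
    chinchilla_optimal_tokens = tokens_per_param * active_params
    return training_tokens / chinchilla_optimal_tokens


# 150 trillion tokens is the figure mentioned in the conversation.
# 100 billion active parameters is a hypothetical value for illustration only.
factor = overtraining_factor(training_tokens=150e12, active_params=100e9)
print(f"Over-trained by roughly {factor:.0f}x relative to Chinchilla-optimal")
```

With a different assumed parameter count the factor scales inversely, so the result is only as good as that guess.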

5542.352

Okay, so if you consider this right here, to the extent this is in the right ballpark,

5543.393

just by thinking about, okay, you kind of want everything to be equal in terms of compute.

5547.4

If OpenAI also realizes that, and they're serving a certain amount of tokens per second, that tells you how much data went into the pre-training of GPT-5.

5552.848

Even if it's like 50% off or something, that is sort of wild that you can sort of first principles

5564.284

These kinds of numbers.
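As a minimal sketch of that back-of-the-envelope estimate: if the total tokens served over a model's deployment should roughly equal its pre-training tokens, as stated above, then a serving rate and a deployment length imply a pre-training data size. The serving rate and deployment length below are hypothetical placeholders, not figures from the episode, and `implied_pretraining_tokens` is an illustrative name.

```python
SECONDS_PER_DAY = 86_400


def implied_pretraining_tokens(tokens_per_second: float, deployment_days: float) -> float:
    """Total tokens served over the deployment, which under the
    equal-tokens assumption is also the implied pre-training token count."""
    return tokens_per_second * deployment_days * SECONDS_PER_DAY


# Hypothetical inputs: 10 million tokens per second served for one year.
estimate = implied_pretraining_tokens(tokens_per_second=1e7, deployment_days=365)
print(f"Implied pre-training tokens: {estimate:.3e}")
```

Because the estimate is linear in both inputs, being 50% off on the serving rate moves the answer by the same 50%.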

5569.591

OK, so in the spirit of trying to deduce things, we can publicly look up the prices of the APIs of these models.

5581.081