Dwarkesh Patel

Speaker
15267 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

I never thought about it in terms of how much...

That if every single user who uses... Basically, for GPT to be trained optimally, the total number of tokens streamed to every single user of GPT-5 should equal the total number that went into pre-training.

Yeah.

And the total number of tokens that have gone into pre-training is the sum of all human knowledge.

So each model should generate the sum of human knowledge on the output that it gets on the input.

Right.

And then can we back out how much more compute than Chinchilla-optimal for a given sized

Somebody told me 150 trillion.

Active params?

Sorry, I meant tokens.

Oh, I see.

So how much is it over-trained?
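
One way to put a number on the overtraining question is to compare a training token count against the commonly cited Chinchilla rule of thumb of roughly 20 training tokens per parameter. The following is a minimal sketch under that assumption. The 150 trillion token figure is the secondhand number quoted in the conversation, and the parameter count is purely hypothetical, chosen only to make the arithmetic concrete.

```python
# Rough overtraining estimate relative to a Chinchilla-style rule of thumb.
# Assumption: compute-optimal training uses about 20 tokens per parameter.
CHINCHILLA_TOKENS_PER_PARAM = 20.0


def overtraining_factor(train_tokens: float, params: float) -> float:
    """Ratio of actual training tokens to the rule-of-thumb optimal count."""
    optimal_tokens = CHINCHILLA_TOKENS_PER_PARAM * params
    return train_tokens / optimal_tokens


if __name__ == "__main__":
    train_tokens = 150e12        # secondhand figure quoted in the conversation
    hypothetical_params = 100e9  # hypothetical, not a known model size
    factor = overtraining_factor(train_tokens, hypothetical_params)
    print(f"Overtrained by roughly {factor:.0f}x")
```

A different assumed parameter count scales the factor inversely, so the result is only as good as that guess.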

That's whatever.

Okay, so if you consider this right here, to the extent this is in the right ballpark,

just by thinking about, okay, you kind of want everything to be equal in terms of compute.

If OpenAI also realizes that, and they're serving a certain number of tokens per second, that tells you how much data went into the pre-training of GPT-5.

Even if it's like 50% off or something, it is sort of wild that you can sort of derive these kinds of numbers from first principles.
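
The back-of-the-envelope argument can be sketched in a few lines. It assumes the heuristic stated earlier in the conversation, that the total tokens served over a model's deployment roughly equal its pre-training tokens. The serving rate and deployment length below are hypothetical placeholders, not reported figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_pretraining_tokens(tokens_per_second: float, deployment_days: float) -> float:
    """Total tokens served over the deployment, used as the pre-training estimate.

    Assumption: served tokens and pre-training tokens are roughly equal.
    """
    return tokens_per_second * deployment_days * SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical inputs: 5 million tokens per second for one year.
    estimate = implied_pretraining_tokens(tokens_per_second=5e6, deployment_days=365)
    print(f"Implied pre-training tokens: {estimate:.2e}")
```

Since the estimate is linear in both inputs, a 50% error in the serving rate carries straight through as a 50% error in the implied token count.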

Okay, so in the spirit of trying to deduce things, we can look up the publicly listed API prices of these models.