Dwarkesh Patel

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

And maybe you can learn something from that.

5590.27

So first, with longer context,

5592.092

Gemini 3.1 is 50% more expensive if you go over 200k tokens than if you're below 200k tokens.

5597.148

I mean, at a high level, I understand why that might be, but why specifically 50%?

5608.719
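The long-context surcharge mentioned above can be made concrete with a small sketch. The base price is a hypothetical placeholder, and the assumption that the higher rate applies to the whole prompt once it crosses 200k tokens is mine; the conversation only states the 50% figure and the 200k threshold.

```python
def input_cost(prompt_tokens, base_price_per_mtok,
               threshold=200_000, surcharge=0.5):
    """Dollar cost of a prompt under a two-tier price.

    Assumption: once the prompt exceeds `threshold` tokens, the
    surcharged rate applies to every token in the prompt.
    """
    rate = base_price_per_mtok
    if prompt_tokens > threshold:
        rate = base_price_per_mtok * (1 + surcharge)
    return prompt_tokens / 1_000_000 * rate


BASE = 2.0  # hypothetical dollars per million input tokens

short_cost = input_cost(100_000, BASE)  # below the threshold
long_cost = input_cost(400_000, BASE)   # above the threshold

# Per-token price ratio between the two tiers (about 1.5).
tier_ratio = (long_cost / 400_000) / (short_cost / 100_000)
```

Under these assumptions a prompt four times longer costs about six times as much, because both the token count and the per-token rate go up.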

one, six, six, seven.

5907.361

About one kilobyte, almost two kilobytes.

5910.807

Ah, yeah.

6049.396

It's funny that they would leak so much information through their API pricing.

6084.91

Maybe we can learn something about the difference in input versus output prices.

6093.863

and what that tells us about decode versus pre-fill in these models.

6098.521

And I think, last I checked, it's like 50% more expensive or something like that?

6102.55

Let's say it's five times more expensive.

6112.293
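To see what a five-times output price does to a bill, here is a minimal sketch. The 5x ratio is the working figure from the conversation; the input price and the token counts are hypothetical.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_mtok, output_ratio=5.0):
    """Dollar cost of one request when output tokens cost
    `output_ratio` times as much as input tokens."""
    output_price_per_mtok = input_price_per_mtok * output_ratio
    total = (input_tokens * input_price_per_mtok
             + output_tokens * output_price_per_mtok)
    return total / 1_000_000


# Hypothetical request: 10k tokens of prompt, 2k tokens of response,
# at a hypothetical 1 dollar per million input tokens.
cost = request_cost(10_000, 2_000, input_price_per_mtok=1.0)

# Fraction of the bill that comes from the (much shorter) output.
output_share = (2_000 * 5.0) / (10_000 + 2_000 * 5.0)
```

With these numbers the 2k output tokens account for half of the cost, even though they are a sixth of the tokens processed.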

Okay.

6113.836

This is the compute to process the next

6114.959

token in decode. Suppose you're doing pre-fill, where you're not just processing the most recent token but all the tokens in parallel.

6118.248

So I want to say that it would be this times len, the pre-fill length?

6127.218

Length of the pass in general, yeah.

6136.988
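The scaling just described, per-token decode compute multiplied by the length of the pass, can be sketched as follows. The per-token figure of 2 FLOPs per active parameter is a common rule of thumb that I am assuming here; it ignores attention over the context, and the snippets above do not state which expression "this" refers to. The function names and the parameter count are hypothetical.

```python
def decode_flops_per_token(n_active_params):
    """Approximate FLOPs to process one token.

    Assumption: 2 FLOPs (one multiply, one add) per active
    parameter, ignoring the attention term.
    """
    return 2 * n_active_params


def prefill_flops(n_active_params, len_prefill):
    """FLOPs for a pre-fill pass that processes `len_prefill`
    tokens in parallel: the per-token cost times the length."""
    return decode_flops_per_token(n_active_params) * len_prefill


N_PARAMS = 10**9  # hypothetical active parameter count

one_token = decode_flops_per_token(N_PARAMS)
whole_prompt = prefill_flops(N_PARAMS, len_prefill=1_000)
```

The total work grows linearly with the pass length under this approximation; what differs between pre-fill and decode is that pre-fill does that work in one parallel pass rather than one token at a time.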

Okay, yeah, yeah.

6144.036

So maybe like prefix?

6145.938

Sure.

6147.099