Reiner Pope
So... the number of inference tokens you have is just a function of: I've got hundreds of millions of tokens per second, times how long my model is deployed for, say two months before I shift to the next version. That should determine the number of tokens in RL and pre-training. And I guess we didn't do the equivalence between pre-training and RL, so we'll do that here.
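As a rough order-of-magnitude check (reading "hundreds of millions" as $3 \times 10^8$ tokens per second and two months as about $5 \times 10^6$ seconds; the figures are the speaker's ballpark, the rounding is mine):

$$D_{\text{inf}} \approx 3 \times 10^{8}\ \tfrac{\text{tokens}}{\text{s}} \times 5 \times 10^{6}\ \text{s} \approx 1.5 \times 10^{15}\ \text{tokens}.$$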
Data in pre-training should be equal to, like, 2 over 10 times data in RL, for them to be cost-equivalent. Sorry, I got this one-over backwards: we pay more cost when it's inefficient, so this needs to be one over that. Tracing this back forward, this thing ends up actually being as written here.
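A hedged reconstruction of the step being corrected (the symbols, and reading the whiteboard ratio as 2/10, are inferences, not something visible here): if a pre-training token costs $c_{\text{pt}}$ and an RL token costs $c_{\text{RL}}$, then spending the same amount on both stages means

$$c_{\text{pt}} D_{\text{pt}} = c_{\text{RL}} D_{\text{RL}} \quad\Rightarrow\quad D_{\text{pt}} = \frac{c_{\text{RL}}}{c_{\text{pt}}}\, D_{\text{RL}},$$

so if RL is the less efficient stage, with $c_{\text{RL}}/c_{\text{pt}} \approx 10/2$, the multiplier on $D_{\text{RL}}$ is $1/(2/10) = 5$, not $2/10$. That is the "one over" being fixed.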
I think if you do it with a spreadsheet and actually model it out, you might notice where the money is going down the drain.
All of these end up being close as modeled here.
This 30% may have been a little bit too generous.
Let's say something like 1.5 here and leave this as a 1 here.
I think at this point you can almost read it off.
The number of inference tokens should be about the same as the number of pre-training tokens, which should be about the same as the number of RL tokens, to within factors that we're not able to reason about.
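A minimal spreadsheet-style sketch of that read-off, in Python. The serving rate and deployment window are the speaker's ballpark figures; the per-token cost multipliers (1.5 for pre-training, 5 for RL) are illustrative assumptions standing in for the constants on the whiteboard:

```python
# Equalize spend across inference, pre-training, and RL, then
# back out the token budgets. All constants are rough assumptions.

TOKENS_PER_SEC = 3e8          # serving rate ("hundreds of millions" per second)
DEPLOY_SECONDS = 60 * 86_400  # ~two months before the next model version

COST_PER_INF_TOKEN = 1.0      # normalize inference cost to 1 per token
COST_PER_PT_TOKEN = 1.5       # assumed relative cost of a pre-training token
COST_PER_RL_TOKEN = 5.0       # assumed: RL ~5x less efficient per token

inference_tokens = TOKENS_PER_SEC * DEPLOY_SECONDS
budget = inference_tokens * COST_PER_INF_TOKEN  # total inference spend

# Spend the same budget on each stage and back out the token counts.
pretrain_tokens = budget / COST_PER_PT_TOKEN
rl_tokens = budget / COST_PER_RL_TOKEN

for name, n in [("inference", inference_tokens),
                ("pre-training", pretrain_tokens),
                ("RL", rl_tokens)]:
    print(f"{name:>12}: {n:.2e} tokens")
# All three land within an order of magnitude of ~1e15 tokens.
```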
Yeah, that's in general right. Because RL is less efficient in terms of machine time, if you're trying to equalize RL and pre-training, then you should have fewer RL tokens, not the same wall time.
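To spell that out (these symbols are my own shorthand, not from the discussion): with throughput $r$ in tokens per machine-second, the machine time for $D$ tokens is $T = D/r$, so holding machine time fixed across the two stages gives

$$T_{\text{RL}} = T_{\text{pt}} \;\Rightarrow\; D_{\text{RL}} = D_{\text{pt}} \cdot \frac{r_{\text{RL}}}{r_{\text{pt}}},$$

which is fewer RL tokens whenever RL throughput $r_{\text{RL}}$ is below pre-training throughput $r_{\text{pt}}$.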
Equalizing in terms of data?