
Reiner Pope

👤 Speaker
1157 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

So... the number of inference tokens you have, and this is just a function of, like: I've got hundreds of millions of tokens per second, times my model is deployed for, I don't know, two months before I shift to the next version.
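A quick back-of-the-envelope sketch of that product. The "hundreds of millions of tokens per second" rate and the roughly two-month deployment window come from the remark above; the exact numbers plugged in below are assumptions.

```python
# Sketch: lifetime inference tokens for one deployed model generation.
# Assumed values: ~3e8 tokens/s served for roughly two months before the
# next model version ships.
serving_rate_tokens_per_s = 3e8
deployment_seconds = 2 * 30 * 24 * 3600   # ~2 months in seconds

inference_tokens = serving_rate_tokens_per_s * deployment_seconds
print(f"~{inference_tokens:.1e} inference tokens")   # -> ~1.6e+15 inference tokens
```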

That should determine the number of tokens in RL and pre-training, and then I guess we didn't do the equivalence between pre-training and RL, so we'll do that here.

Pre-training data should be equal to, like, 2 over 10 times RL data, for them to be cost-equivalent.
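Structurally, the condition being set up is just that the two token budgets cost the same amount of compute. Here is a minimal sketch, with the direction of the ratio left as a placeholder, since whether the 2/10 enters as written or as its reciprocal is exactly the "one over" that gets corrected in the next lines:

```python
# Cost equivalence between the two token budgets:
#   pretrain_tokens * cost_per_pretrain_token == rl_tokens * cost_per_rl_token
# which rearranges to:
#   pretrain_tokens == (cost_per_rl_token / cost_per_pretrain_token) * rl_tokens
def cost_equivalent_pretrain_tokens(rl_tokens, cost_ratio=2 / 10):
    # cost_ratio = RL cost per token / pre-training cost per token (placeholder value)
    return cost_ratio * rl_tokens
```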

Sorry, I got this one-over backwards; we pay more cost when it's inefficient.

So this needs to be one over... tracing this back and forward, this thing actually ends up being as written here.

It's like... yeah.

Yeah.

Right.

I think if you do it with a spreadsheet and actually model it out, you might notice when the money is going down the drain.
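A minimal version of that spreadsheet exercise, as a sketch. Every input is an assumed placeholder: the model size, the token counts, the standard 6N/2N FLOPs-per-token estimates, and an RL-to-pre-training cost ratio left at 1 because that ratio is exactly what the discussion above is trying to pin down. The point is only to put the three line items side by side.

```python
# Toy "spreadsheet": effective compute (FLOP-equivalents of machine time) spent
# on pre-training, RL, and inference for one model generation.
# Every input is an assumed placeholder, not a figure from the episode.
PARAMS = 1e12                                # model parameters (assumption)
PRETRAIN_TOKENS = 1.5e15                     # pre-training tokens (assumption)
RL_TOKENS = 1.5e15                           # RL tokens (assumption)
INFERENCE_TOKENS = 3e8 * 2 * 30 * 24 * 3600  # ~3e8 tok/s served for ~2 months

TRAIN_FLOPS_PER_TOKEN = 6 * PARAMS           # standard forward + backward estimate
INFER_FLOPS_PER_TOKEN = 2 * PARAMS           # standard forward-only estimate
RL_COST_RATIO = 1.0                          # RL vs pre-training cost per token (the debated knob)

effective_compute = {
    "pre-training": PRETRAIN_TOKENS * TRAIN_FLOPS_PER_TOKEN,
    "RL":           RL_TOKENS * TRAIN_FLOPS_PER_TOKEN * RL_COST_RATIO,
    "inference":    INFERENCE_TOKENS * INFER_FLOPS_PER_TOKEN,
}
total = sum(effective_compute.values())
for name, flops in effective_compute.items():
    print(f"{name:>12}: {flops:.2e} FLOP-equivalents ({flops / total:.0%} of total)")
```

With these placeholders the three line items land within a small factor of one another, which is the sense in which "all of these end up being close" below.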

All of these end up being close, as modeled here.

This 30% may have been a little bit too generous.

Let's say something like 1.5 here and leave this as a 1 here.

I think at this point you can almost read it off.

The number of inference tokens should be about the same as the number of pre-training tokens, which should be about the same as the number of RL tokens, to within factors that we're not able to reason about.

Yeah, that's in general right, because RL is less efficient in terms of machine time. And so if you're trying to equalize the RL and pre-training time, then you should have fewer tokens and not have the same wall time.

Equalizing in terms of data?