Reiner Pope

Speaker
1157 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

So I guess we want the cost per token, in fact.

Or the time per token.

Well, actually, for processing the entire batch.

So, at this cost, we have processed this many tokens, like, let it pre-fill.

Yeah.

Well, I guess, pre-fill, yeah, like, of the pass.

Yeah.

Not this prefix, but it's this cost.

Okay, let's proceed to the pass.

So the result we want to work towards is that pre-fill is compute limited and decode is memory bandwidth limited.

T... We want the cost per token, so it'll be T over some stuff.

T over length of the pass.

But then why is it cheaper?

Why does it cost higher?

Yeah, yeah.

So, I mean, we're going to... It's this division by length pass that actually makes it all...

So... Okay, yeah, this is going to divide out, but then we're going to get... All of this is going to divide by length of pass, and it's going to make the memory cost cheaper.

Length of the pass, when it's one, that is decode.

When it is bigger, that is pre-fill.
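
[Editor's sketch, not from the episode.] The "T over length of the pass" argument can be illustrated with a simple roofline-style estimate. This is a minimal sketch under stated assumptions: the time T for one pass is taken as the larger of a compute term and a weight-loading term, attention and KV-cache traffic are ignored, and every number below (parameter count, bytes per parameter, FLOP/s, bandwidth) is a hypothetical placeholder rather than a figure from the conversation.

```python
# Hypothetical model and hardware numbers, chosen only for round arithmetic.
N_PARAMS = 1e9          # model parameters (assumed)
BYTES_PER_PARAM = 2     # e.g. 16-bit weights (assumed)
FLOPS_PER_S = 1e15      # peak compute throughput (assumed)
BYTES_PER_S = 1e12      # memory bandwidth (assumed)


def pass_times(length_pass):
    """Return (compute_seconds, memory_seconds) for one forward pass.

    Assumes ~2 FLOPs per parameter per token, and that the weights are
    read from memory once per pass regardless of how many tokens it holds.
    """
    compute_s = 2 * N_PARAMS * length_pass / FLOPS_PER_S
    memory_s = N_PARAMS * BYTES_PER_PARAM / BYTES_PER_S
    return compute_s, memory_s


def bound(length_pass):
    """Which term limits the pass: 'compute' or 'memory'."""
    compute_s, memory_s = pass_times(length_pass)
    return "compute" if compute_s >= memory_s else "memory"


def time_per_token(length_pass):
    """T divided by the length of the pass."""
    t = max(pass_times(length_pass))
    return t / length_pass


for length in (1, 4096):
    print(length, bound(length), time_per_token(length))
```

With these placeholder numbers, a pass of length one (decode) is dominated by loading the weights, while a long pass (pre-fill) spreads that same weight load over many tokens, so the compute term takes over and the time per token falls.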

Okay, I see, I see, I see.