
Dwarkesh Patel

👤 Speaker
15267 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

It gets me confused about this.

Length pass is the... It seems like this should be higher when you're doing pre-fill.

Pre-fill has a bigger length pass, yeah.

Right.

Okay, yeah, let me think about this then.

Okay, so let's do one line for... Basically, we'll have four different lines.

Let's do the... Let's do pre-fill first, and so... Actually, let's do decode first.

That makes sense.

Okay, getting back to it.

So t-compute, if you have basically just this divided by length pass, so just this amount.

So this actually does not vary based on t, so it'll just be some flat value like this.

And this is t-compute.

And then this is like... This is... That's decode.

Decode, right.

Now, t-mem, if you have this whole thing divided by length pass, well, it doesn't really matter what's up there.

It'll just be something that looks like this.

Right.

Yeah.

Let's say this is t-mem.

This is decode again.
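
[Editor's sketch, not part of the conversation] The two decode lines described above can be reproduced numerically under some loudly stated assumptions. The function name, the parameter names, and every constant below are hypothetical and chosen only for illustration. The sketch assumes compute time per pass grows in proportion to the length of the pass, so dividing by length pass gives a flat line, and that the memory traffic per pass is a fixed amount (the transcript says only that "it doesn't really matter what's up there"), so dividing by length pass gives a curve that falls as 1/length.

```python
def per_token_times(len_pass,
                    flops_per_token=2e9,   # hypothetical FLOPs needed per token
                    peak_flops=1e15,       # hypothetical accelerator FLOP/s
                    bytes_per_pass=1e9,    # hypothetical bytes read per pass (assumed fixed)
                    mem_bandwidth=1e12):   # hypothetical memory bandwidth in bytes/s
    """Return (t_compute, t_mem) per token for a pass over len_pass tokens."""
    # Compute time for the whole pass scales with the number of tokens in it...
    t_compute_pass = flops_per_token * len_pass / peak_flops
    # ...while memory time for the whole pass is assumed not to depend on len_pass.
    t_mem_pass = bytes_per_pass / mem_bandwidth
    # Divide both by length pass to get per-token cost.
    return t_compute_pass / len_pass, t_mem_pass / len_pass


if __name__ == "__main__":
    for n in (1, 8, 64, 512):
        t_compute, t_mem = per_token_times(n)
        print(f"len_pass={n:4d}  t_compute={t_compute:.3e} s  t_mem={t_mem:.3e} s")
```

Printed over a range of pass lengths, t_compute stays the same while t_mem shrinks, which matches the flat line and the falling curve drawn for decode. Real numbers depend on the model and hardware and are not given in this excerpt.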