Reiner Pope

Speaker
1157 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

If you're going to hold on to it for a very short amount of time, then all of this is multiplied by the hold time. This one is, and so is this one. And interestingly, they have different prices to write for, as you specify in the API: five minutes versus an hour.

Yeah, which suggests that the five minutes is HBM and the hour is DDR.

I think that's a pretty good assumption.

If you look at the numbers, it might also turn out that it's one tier down and it's DDR versus Flash.

Yeah, okay, interesting.

So we might actually be able to determine which memory tier it is from the durations.

Yeah, exactly.

I think this will probably end up being...

it's going to be the drain time of the memory tier that you're in.

And so what that means is like, given that I know I'm going to be holding something for five minutes, I would like to pick a memory that I can read every five minutes.

Like I can read the whole memory once per five minutes, ballpark.

So that is the drain time of the memory.

So if I take the storage capacity over the storage bandwidth, I would like this to be equal to five minutes or something like that.
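
The ratio described here, storage capacity over storage bandwidth, can be sketched in a few lines of Python. This is only an illustration: the function name `drain_time_seconds` and the device figures (80 GB read at 4 TB/s) are assumptions chosen for the example, not numbers from the conversation.

```python
def drain_time_seconds(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time needed to read the whole memory once at full bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


# Hypothetical device (illustrative figures only): 80 GB of capacity at 4 TB/s.
example_drain = drain_time_seconds(80e9, 4e12)
print(f"drain time: {example_drain * 1e3:.0f} ms")

# The target from the discussion: a tier whose drain time is roughly the hold time.
target_hold_s = 5 * 60
print(f"hold time / drain time: {target_hold_s / example_drain:.0f}x")
```

Under these assumed figures, the memory can be read end to end many thousands of times within a five-minute hold, which is the sense in which a tier's drain time can be far shorter than the hold time.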

And so actually we did this calculation for HBM.

For HBM, we know that this number is 20 milliseconds.

So HBM is much too short, like much too small.

DDR could be about an order of magnitude or two off from this.

And so this is probably in the order of like, actually, I think it might even be in the seconds, like one to 10 seconds.
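
As a rough check on the figures quoted above, the gap between a five-minute hold time and each tier's drain time can be expressed in orders of magnitude. The sketch below uses only the numbers mentioned in the conversation (20 ms for HBM, 1 to 10 seconds for DDR); the helper name `orders_of_magnitude_off` is an assumption introduced for illustration.

```python
import math


def orders_of_magnitude_off(hold_time_s: float, drain_time_s: float) -> float:
    """log10 of how many times longer the hold time is than the drain time."""
    return math.log10(hold_time_s / drain_time_s)


HOLD_TIME_S = 5 * 60  # the five-minute hold time from the discussion

# Drain times as quoted in the conversation; DDR is given as a 1 to 10 s range.
gaps = {
    "HBM (20 ms)": orders_of_magnitude_off(HOLD_TIME_S, 0.020),
    "DDR (1 s)": orders_of_magnitude_off(HOLD_TIME_S, 1.0),
    "DDR (10 s)": orders_of_magnitude_off(HOLD_TIME_S, 10.0),
}

for tier, gap in gaps.items():
    print(f"{tier}: drain time is about {gap:.1f} orders of magnitude below the hold time")
```

With these inputs, HBM comes out a bit over four orders of magnitude short of five minutes, while DDR lands roughly one and a half to two and a half orders short, in line with the "order of magnitude or two" estimate.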

And then