Reiner Pope

So a way to think of this is that if I have too many of these things sitting in my HBM, if I fill up my HBM with just KV caches that I'm not using, I can't use that GPU.

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

And so how do I price that?

Maybe I say that the cost of it is proportional to the fraction of the HBM I'm using.

So there's also a "times GPU dollars" factor.
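
[Editor's sketch, not part of the recording: the pricing rule described above, holding cost proportional to the fraction of HBM occupied times the GPU's dollar rate, can be written out as below. The function name, parameter names, and the example figures (80 GiB of HBM, 3.60 dollars per GPU-hour) are illustrative assumptions, not numbers from the episode.]

```python
def hbm_hold_cost_per_second(kv_cache_bytes: float,
                             hbm_bytes: float,
                             gpu_dollars_per_hour: float) -> float:
    """Dollars per second charged for keeping a KV cache resident in HBM.

    Assumes the simple rule from the discussion: the cache is billed for
    the share of HBM it occupies, times what the whole GPU costs per second.
    """
    fraction_of_hbm = kv_cache_bytes / hbm_bytes
    gpu_dollars_per_second = gpu_dollars_per_hour / 3600.0
    return fraction_of_hbm * gpu_dollars_per_second


GIB = 2**30

# Hypothetical example: an 8 GiB KV cache on a GPU with 80 GiB of HBM
# that rents for 3.60 dollars per hour (all three values are made up).
example_cost = hbm_hold_cost_per_second(8 * GIB, 80 * GIB, 3.60)
print(f"{example_cost:.6f} dollars per second")
```

[Under these assumed figures the cache occupies a tenth of HBM, so it is charged a tenth of the GPU's per-second price for as long as it sits there.]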

And then let's just do one more memory tier and say something like DDR, store in DDR instead.

The same kind of thing goes up for Flash and for DDR.

I put these in the wrong columns, actually.

I meant to make two columns.

The distinction I want to make is that there is the cost to retrieve.

And then there's a cost to store, a cost to hold on to it.

And so this [the cost to store] is a cost per second, whereas this [the cost to retrieve] is like an instantaneous cost.

So rematerialization has a cost to retrieve and has zero cost to store it because we've deleted it.
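
[Editor's sketch, not part of the recording: the two-column picture described here, a one-time cost to retrieve and a per-second cost to hold, can be modeled as below. Rematerialization has a hold cost of zero, as stated. Everything else is an assumption made for illustration: the `Option` class, the `cheapest` helper, the zero retrieve cost for HBM, and every dollar figure are placeholders, not values from the episode.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    retrieve_dollars: float         # instantaneous cost, paid once when the cache is needed again
    hold_dollars_per_second: float  # ongoing cost, paid for every second the cache is kept

    def total_cost(self, idle_seconds: float) -> float:
        """Cost of keeping the cache for idle_seconds and then using it once."""
        return self.retrieve_dollars + self.hold_dollars_per_second * idle_seconds


# Placeholder numbers chosen only to show the shape of the trade-off.
OPTIONS = [
    Option("hbm", retrieve_dollars=0.0, hold_dollars_per_second=1e-4),
    Option("ddr", retrieve_dollars=1e-4, hold_dollars_per_second=1e-5),
    Option("flash", retrieve_dollars=1e-3, hold_dollars_per_second=1e-6),
    # Rematerialization: the cache was deleted, so holding it costs nothing,
    # but recomputing it is the most expensive way to get it back.
    Option("rematerialize", retrieve_dollars=1e-2, hold_dollars_per_second=0.0),
]


def cheapest(idle_seconds: float) -> Option:
    """Return the option with the lowest total cost for a given idle time."""
    return min(OPTIONS, key=lambda option: option.total_cost(idle_seconds))


for idle in (0, 10, 1_000, 100_000):
    print(idle, cheapest(idle).name)
```

[With these made-up numbers, the cheapest choice moves from HBM to DDR to Flash to rematerialization as the idle time grows, because a lower per-second hold cost eventually outweighs a higher one-time retrieve cost.]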

This is the one that I put in the wrong location.

This is actually the cost just to hold on.

So I will rewrite it.
