Reiner Pope

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

And if those numbers are big enough, it eventually becomes profitable to cut along there.

3255.281

And we have selected two of them.

3259.885

The other two, in the way models are typically sized, are not profitable.

3262.048

Yeah.

3304.283

And I guess we didn't really fully articulate even what is the benefit that we're getting from pipelining.

3306.245

Yeah.

3310.789

And so these complexities are real.

3313.692

Pipelining is a massive hassle, but it does give you some benefits.

3316.335

And then you can decide whether those benefits are worth the costs.

3322.381

As for the biggest benefit that shows up: it has some benefits in inference, and maybe bigger benefits in training.

3329.913

In inference, what are we saving on?

3335.141

Are we saving on memory time or compute time?

3337.544

Not really.

3341.41

We're just moving the memory time from one chip to another chip or one rack to a different rack.

3342.091

There's no actual benefit in runtime.

3347.199

However, what we are saving on is memory capacity, that is, the amount of memory used per rack.

3351.085

If we think that the memory in a rack is a bottleneck, then there's a constraint on how fast we can go.

3358.072

Pipelining allows us to massively reduce that bottleneck.
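
A rough back-of-the-envelope sketch of the point above, in Python. The weight size and bandwidth are made-up illustrative numbers, not figures from the episode, and `per_rack_weight_bytes` and `total_weight_read_time` are hypothetical helper names. It assumes layers are split evenly across pipeline stages, one stage per rack, and it ignores KV cache, activations, and inter-rack communication.

```python
def per_rack_weight_bytes(total_weight_bytes: float, num_stages: int) -> float:
    """Weights each rack must hold when layers are split evenly across stages."""
    return total_weight_bytes / num_stages


def total_weight_read_time(total_weight_bytes: float, num_stages: int,
                           rack_bandwidth_bytes_per_s: float) -> float:
    """Time for one batch to pass through every stage, reading every weight once.

    Each stage reads only its own shard, but the batch visits the stages one
    after another, so the per-stage times add back up to the unpipelined total.
    """
    per_stage = (per_rack_weight_bytes(total_weight_bytes, num_stages)
                 / rack_bandwidth_bytes_per_s)
    return per_stage * num_stages


if __name__ == "__main__":
    weights = 2e12    # assumed: 2 TB of model weights (illustrative only)
    bandwidth = 5e14  # assumed: aggregate memory bandwidth per rack, bytes/s
    for stages in (1, 2, 4, 8):
        print(stages,
              per_rack_weight_bytes(weights, stages),
              total_weight_read_time(weights, stages, bandwidth))
```

Under these assumptions the memory held per rack falls in proportion to the number of stages, while the end-to-end memory time stays the same, which is the trade described here: the memory time is moved between racks, not removed.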

3364.579

So we draw the pipeline bubble.
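
For readers without the drawing, here is a minimal Python sketch of what a pipeline bubble looks like. It is not the diagram from the episode: it assumes a simple forward-only, fill-and-drain schedule in which stage `s` works on microbatch `t - s` at time step `t`, and the function names `pipeline_schedule` and `bubble_fraction` are hypothetical.

```python
def pipeline_schedule(num_stages: int, num_microbatches: int) -> list[list[str]]:
    """One row per stage, one column per time step; '.' marks an idle slot."""
    steps = num_stages + num_microbatches - 1
    grid = []
    for stage in range(num_stages):
        row = []
        for t in range(steps):
            microbatch = t - stage  # stage s starts s steps after stage 0
            busy = 0 <= microbatch < num_microbatches
            row.append(str(microbatch) if busy else ".")
        grid.append(row)
    return grid


def bubble_fraction(num_stages: int, num_microbatches: int) -> float:
    """Share of slots that sit idle under this fill-and-drain schedule."""
    return (num_stages - 1) / (num_stages + num_microbatches - 1)


if __name__ == "__main__":
    for row in pipeline_schedule(3, 4):
        print(" ".join(row))
    print(bubble_fraction(3, 4))
```

The idle slots at the start and end of each row are the bubble. In this simplified schedule, adding more microbatches per pipeline fill shrinks the idle share.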

3410.188

Yeah.

3412.07