Dwarkesh Patel

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

When I'm prepping for interviews, I often talk to experts in the field.

So for Reiner, I chatted with two of Jane Street's engineers, Clark and Axel.

Clark, who works on low-latency trading systems, walked me through why Jane Street uses FPGAs to get predictable nanosecond latencies.

FPGAs allow you to react to the earliest part of the packet as it arrives, rather than having to wait for the full thing.

We also talked about liquid cooling, network design, and many other things.

If you're interested in this stuff, Jane Street is hiring.

You can check out their open roles at janestreet.com/dwarkesh.

And if you want to watch the full prep conversation, we posted it there too.

If you've got a frontier model and you are actually doing inference,

surely they must have more than 2,000 concurrent users.

Is there any added latency from the fact that you need to have the whole batch fill up?

Or is it that, if you have a reasonable number of users, it would not take you 100 milliseconds to fill up

the next 2,000 slots?
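The question about batch fill latency comes down to simple arithmetic: how long it takes to collect a batch of a given size at a given arrival rate. The sketch below assumes requests arrive at a steady average rate and that each request occupies one slot; the function names are hypothetical, and only the 2,000 slots and 100 milliseconds come from the conversation.

```python
def batch_fill_time_ms(batch_slots: int, arrivals_per_second: float) -> float:
    """Time in milliseconds to collect `batch_slots` requests.

    Assumes a steady average arrival rate (an illustrative simplification).
    """
    if arrivals_per_second <= 0:
        raise ValueError("arrival rate must be positive")
    return batch_slots * 1000.0 / arrivals_per_second


def min_arrival_rate(batch_slots: int, window_ms: float) -> float:
    """Arrival rate (requests per second) needed to fill the batch within `window_ms`."""
    if window_ms <= 0:
        raise ValueError("window must be positive")
    return batch_slots * 1000.0 / window_ms


# Figures from the conversation: 2,000 slots, 100 ms window.
required_rate = min_arrival_rate(2000, 100.0)
print(f"needed: {required_rate:.0f} requests/s")
print(f"fill time at that rate: {batch_fill_time_ms(2000, required_rate):.0f} ms")
```

Under these assumptions, any arrival rate above 20,000 requests per second fills the next 2,000 slots in under 100 milliseconds, so waiting for the batch adds little latency at that scale.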

The units make sense.

You would have a...

A byte divided by bytes per second.

Let me just make sure I understand what it's saying.

I mean, I understand why the units cancel, the sort of unit analysis.

But what it's saying is, we can evacuate and replace the HBM in this amount of time.
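The unit analysis can be written out as a small calculation: bytes of HBM divided by bytes per second of HBM bandwidth gives seconds, the time to read out or rewrite the whole memory once. This is a minimal sketch; the function name and the 80 GB and 3 TB/s figures are illustrative assumptions, not numbers from the conversation.

```python
def hbm_sweep_time_s(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Seconds needed to stream the full HBM contents once.

    bytes / (bytes / second) = seconds
    """
    if capacity_bytes < 0:
        raise ValueError("capacity must be non-negative")
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


# Assumed, illustrative hardware figures (not from the source).
ASSUMED_CAPACITY_BYTES = 80e9          # 80 GB of HBM
ASSUMED_BANDWIDTH_BYTES_PER_S = 3e12   # 3 TB/s of HBM bandwidth

sweep_s = hbm_sweep_time_s(ASSUMED_CAPACITY_BYTES, ASSUMED_BANDWIDTH_BYTES_PER_S)
print(f"full HBM sweep: {sweep_s * 1e3:.1f} ms")
```

With these assumed figures the sweep takes a few tens of milliseconds, which is the amount of time the ratio describes.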

And so we don't want to be in a situation where the HBM is not big enough that we're not actually able to
