Dwarkesh Patel

Speaker
15267 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

When I'm prepping for interviews, I often talk to experts in the field.

So for Reiner, I chatted with two of Jane Street's engineers, Clark and Axel.

Clark, who works on low latency trading systems, walked me through why Jane Street uses FPGAs to make sure that they have predictable nanosecond latencies.

FPGAs allow you to react to the earliest part of the packet as it arrives, rather than having to wait for the full thing.

We also talked about liquid cooling, network design, and many other things.

If you're interested in this stuff, Jane Street is hiring.

You can check out their open roles at janestreet.com/dwarkesh.

And if you want to watch the full prep conversation, we posted it there too.

If you've got a frontier model and you are actually doing inference, surely they must have more than 2,000 concurrent users.

Is there any added latency from the fact that you need to have the whole batch fill up?

Or is it that, if you have a reasonable number of users, it would not take you 100 milliseconds to fill up the next 2,000 slots?
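
As a rough sketch of the arithmetic behind this question: if requests arrive at a steady rate, the time to fill a batch is the batch size divided by the arrival rate. The function names and the arrival rates below are hypothetical and chosen only for illustration; the 2,000-slot batch and the 100 millisecond figure are the only numbers taken from the conversation.

```python
def batch_fill_time_ms(batch_size: int, arrivals_per_second: float) -> float:
    """Time to collect `batch_size` requests at a steady arrival rate, in ms."""
    return batch_size / arrivals_per_second * 1000.0


def min_arrival_rate(batch_size: int, max_wait_ms: float) -> float:
    """Arrival rate (requests/s) needed to fill the batch within `max_wait_ms`."""
    return batch_size / (max_wait_ms / 1000.0)


if __name__ == "__main__":
    batch_size = 2000  # slots per batch, from the conversation
    # Hypothetical arrival rates, not figures from the episode.
    for rate in (5_000, 20_000, 100_000):
        wait = batch_fill_time_ms(batch_size, rate)
        print(f"{rate} req/s -> batch fills in {wait:.1f} ms")
    needed = min_arrival_rate(batch_size, max_wait_ms=100)
    print(f"rate needed to fill within 100 ms: {needed:.0f} req/s")
```

Under this steady-arrival assumption, the wait only exceeds 100 milliseconds when fewer than about 20,000 requests arrive per second; real traffic is burstier, so this is a lower bound on the reasoning rather than a latency model.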

The units make sense.

You would have a...

A byte divided by bytes per second.

Let me just make sure I understand what it's saying.

I mean, I understand why the units cancel, the sort of unit analysis.

But what it's saying is, we can evacuate and replace the HBM in this amount of time.
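
A minimal sketch of that unit analysis, assuming the quantity under discussion is HBM capacity divided by HBM bandwidth: bytes divided by bytes per second leaves seconds, the time to stream the full contents of memory once. The function name and the hardware figures below are illustrative assumptions, not numbers from the conversation.

```python
def hbm_sweep_time_s(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Seconds to read (or replace) the entire HBM once: bytes / (bytes/s) = s."""
    return capacity_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Assumed, illustrative accelerator figures (not from the episode).
    capacity = 80e9      # 80 GB of HBM
    bandwidth = 3.35e12  # 3.35 TB/s of HBM bandwidth
    t = hbm_sweep_time_s(capacity, bandwidth)
    print(f"full HBM sweep: {t * 1000:.1f} ms")
```

With these assumed figures the sweep takes on the order of tens of milliseconds, which is the scale the per-step latency is being compared against.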

And so we don't want to be in a situation where the HBM is not big enough that we're not actually able to