Dwarkesh Patel
When I'm prepping for interviews, I often talk to experts in the field.
So for Reiner, I chatted with two of Jane Street's engineers, Clark and Axel.
Clark, who works on low latency trading systems, walked me through why Jane Street uses FPGAs to make sure that they have predictable nanosecond latencies.
FPGAs allow you to react to the earliest part of the packet as it arrives, rather than having to wait for the full thing.
We also talked about liquid cooling, network design, and many other things.
If you're interested in this stuff, Jane Street is hiring.
You can check out their open roles at janestreet.com/dwarkesh.
And if you want to watch the full prep conversation, we posted it there too.
If you've got a frontier model and you are actually doing inference,
surely you must have more than 2,000 concurrent users.
Is there any added latency from the fact that you need to have the whole batch fill up?
Or, if you have a reasonable number of users, is it just so unlikely that it would take you 100 milliseconds to fill up
the next 2,000 slots?
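As a rough way to put numbers on that question, here is a minimal sketch that assumes requests arrive at a steady average rate and ignores burstiness. The function names and the arrival-rate figure are hypothetical; only the 2,000 slots and the 100 millisecond budget come from the conversation.

```python
def batch_fill_time_s(batch_slots: int, arrivals_per_s: float) -> float:
    """Average time to fill a batch, assuming a steady arrival rate."""
    if arrivals_per_s <= 0:
        raise ValueError("arrivals_per_s must be positive")
    return batch_slots / arrivals_per_s


def min_arrival_rate(batch_slots: int, budget_s: float) -> float:
    """Arrival rate needed to fill the batch within the time budget."""
    if budget_s <= 0:
        raise ValueError("budget_s must be positive")
    return batch_slots / budget_s


# Figures from the conversation: 2,000 slots, 100 ms budget.
needed_rate = min_arrival_rate(2000, 0.100)

# Hypothetical traffic level, chosen only for illustration.
fill_time = batch_fill_time_s(2000, 50_000.0)

print(f"rate needed to fill in 100 ms: {needed_rate:.0f} requests/s")
print(f"fill time at 50,000 requests/s: {fill_time * 1000:.0f} ms")
```

Under this simple model, waiting for the batch only adds noticeable latency when the arrival rate falls below the slot count divided by the time budget.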
The units make sense.
You would have a...
A byte divided by bytes per second.
Let me just make sure I understand what it's saying.
I mean, I understand why the units cancel, the sort of unit analysis.
But what it's saying is, we can evacuate and replace the HBM in this amount of time.
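The unit analysis here (bytes divided by bytes per second gives seconds) can be sketched in a few lines. The capacity and bandwidth values below are illustrative placeholders, not figures from the conversation, and the function name is hypothetical.

```python
def hbm_sweep_time_s(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to read (or replace) the full HBM contents once.

    bytes / (bytes / second) = seconds
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth_bytes_per_s must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


# Placeholder hardware numbers, assumed purely for illustration.
capacity = 80e9      # 80 GB of HBM (assumption)
bandwidth = 2e12     # 2 TB/s of HBM bandwidth (assumption)

sweep = hbm_sweep_time_s(capacity, bandwidth)
print(f"full HBM sweep: {sweep * 1000:.0f} ms")
```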
And so we don't want to be in a situation where the HBM is not big enough that we're not actually able to