Ilya Sutskever
The Transformer was built on eight to 64 GPUs; no single Transformer-paper experiment used more than 64 GPUs of 2017, which would be like, what, two GPUs of today. Same with the ResNet, right? You could even argue that o1-style reasoning was not the most compute-heavy thing in the world.
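The "64 GPUs of 2017 is like two GPUs of today" comparison can be sanity-checked with rough arithmetic. The throughput figures below are assumptions, not from the transcript: approximate peak FP16 TFLOPS for a 2017-era NVIDIA P100 (the GPU used in the original Transformer paper) and for a current H100-class accelerator; the helper name `modern_gpu_equivalent` is made up for illustration.

```python
import math

# Assumed rough peak dense FP16 throughput, in TFLOPS.
# These are ballpark datasheet-style numbers, not measured training throughput.
P100_FP16_TFLOPS = 20.0    # ~2017-era GPU
H100_FP16_TFLOPS = 1000.0  # ~current GPU

def modern_gpu_equivalent(old_gpus: int) -> int:
    """How many current GPUs match the peak FP16 throughput of `old_gpus` 2017 GPUs."""
    total_tflops = old_gpus * P100_FP16_TFLOPS
    return math.ceil(total_tflops / H100_FP16_TFLOPS)

print(modern_gpu_equivalent(64))  # the largest Transformer-paper run
```

Under these assumed peak numbers, 64 old GPUs come out to roughly two current ones, consistent with the claim; real-world ratios depend on memory, interconnect, and utilization, not just peak FLOPS.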
So definitely, for research, you need some amount of compute, but it's far from obvious that you need the absolute largest amount of compute ever for research.
You might argue, and I think it is true, that if you want to build the absolutely best system, then it helps to have much more compute.
And especially if everyone is within the same paradigm, then compute becomes one of the big differentiators.
So I can comment on that. The short comment is that, you know, you mentioned SSI.
Specifically for us,
the amount of compute that SSI has for research is really not that small.
And I want to explain why: some simple math can explain why the amount of compute that we have is actually a lot more comparable for research than one might think.
I'll explain.
So SSI has raised $3 billion, which is not small; it's a lot in any absolute sense.
But you could say: look at the other companies raising much more.
But a lot of their compute goes to inference. These big numbers, these big loans, they're earmarked for inference.
That's number one.
Number two, if you want to have a product on which you do inference, you need a big staff of engineers and salespeople, and a lot of the research needs to be dedicated to producing all kinds of product-related features.
So then when you look at what's actually left for research, the difference becomes a lot smaller.