Noam Shazeer
How's the load test looking?
There's just lots of feedback that happens.
And then we have our Gemini chat rooms for people who are not in that micro kitchen.
We have a team all over the world
And, you know, there's probably 120 chat rooms I'm in related to Gemini things.
And, you know, this particular very focused topic, we have like seven people working on this.
And there's like exciting results being shared by the London colleagues.
And when you wake up, you see like what's happening in there.
Or it's a big group of like people focused on data.
And there's all kinds of issues, you know, happening in there.
It's just fun.
Yeah, I mean, I think the rough, highest-level view is that you're going to want a lot of inference compute for these capable models.
Because if one of the techniques for improving their quality is scaling up the amount of inference compute you use, then all of a sudden what's currently like one request to generate some tokens now becomes 50 or a hundred or a thousand times as computationally intensive, even though it's producing the same amount of output.
And you're also going to see tremendous scaling up of the use of these services, because not everyone in the world has discovered these chat-based conversational interfaces where you can get them to do all kinds of amazing things. Probably 10 or 20% of the computer users in the world have discovered that today. As that pushes towards 100% and people make heavier use of it, that's going to be another order of magnitude or two of scaling. So you're going to have two orders of magnitude from the inference-time compute and another order of magnitude or two from that broader usage.
The models are probably going to be bigger.
You'll get another order of magnitude or two from that.
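The estimate above multiplies three independent growth factors. A minimal back-of-envelope sketch, using illustrative numbers picked from the ranges mentioned (the specific factors are assumptions, not figures from the speakers):

```python
import math

# Back-of-envelope sketch of the inference-demand estimate above.
# Each factor is an assumed midpoint of the range discussed; the point
# is that the effects multiply, not the exact values.

per_request_compute = 100  # 50x-1000x more compute per request (~2 orders of magnitude)
adoption_growth = 50       # 10-20% of users today pushing towards 100%, used more heavily
model_size_growth = 10     # bigger future models (~1-2 orders of magnitude)

total = per_request_compute * adoption_growth * model_size_growth
print(f"Total inference demand multiplier: ~{total:,}x")
print(f"Roughly {round(math.log10(total))} orders of magnitude")
```

With these assumed factors the product comes out around 50,000x, i.e. roughly five orders of magnitude of additional inference demand, which is why the per-request factor alone understates the total.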