Noam Shazeer
Podcast Appearances
Could it attend to all your personal information for you?
I would love a model that has access to all my emails and all my documents and all my photos.
And when I ask it to do something, it can, with my permission, make use of that to help solve what it is I'm wanting it to do.
But that's going to be a big computational challenge because the naive attention algorithm is quadratic.
And you can kind of barely make it work on a fair bit of hardware for millions of tokens, but there's no hope of making that just naively go to trillions of tokens.
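To put rough numbers on the quadratic point: naive self-attention scores every token against every other token, so the work grows with the square of the context length. The sketch below is back-of-the-envelope arithmetic only. The function names and the `head_dim=128` default are illustrative assumptions, not figures from the conversation, and the count covers a single attention head in a single layer.

```python
def attention_score_count(num_tokens: int) -> int:
    """Query-key scores that naive full self-attention computes: n * n."""
    return num_tokens * num_tokens


def naive_attention_flops(num_tokens: int, head_dim: int = 128) -> int:
    """Rough FLOPs for one head in one layer (head_dim is an assumed value).

    Q @ K^T costs about 2 * n * n * d, and weights @ V costs about the same.
    The softmax itself is ignored here.
    """
    return 4 * num_tokens * num_tokens * head_dim


million = 10**6
trillion = 10**12

# Context length grows by a factor of 10**6, but the score matrix
# grows by that factor squared.
ratio = attention_score_count(trillion) // attention_score_count(million)
print(f"scores at 1M tokens: {attention_score_count(million):.1e}")
print(f"scores at 1T tokens: {attention_score_count(trillion):.1e}")
print(f"growth factor:       {ratio:.1e}")
```

Going from millions to trillions of tokens multiplies the context by a million but the naive attention work by a trillion, which is why scaling the same algorithm up is not an option.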
So we need a whole bunch of interesting algorithmic approximations to what you would really want: a way for the model to attend, kind of conceptually, to lots and lots more tokens, trillions of tokens, and attend to your tokens.
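The conversation does not name a specific approximation. As one illustration of the general idea, the sketch below uses a simple retrieve-then-attend scheme: score each chunk of the context cheaply against the query, keep only the best chunks, and run exact attention over just those tokens. This technique is an assumption chosen for illustration, not a description of what any production model does, and every name here (`topk_chunk_attention`, `chunk_size`, `top_k`) is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def topk_chunk_attention(query, keys, values, chunk_size=2, top_k=1):
    """Attend exactly, but only within the top_k best-matching chunks."""
    # Split token positions into contiguous chunks.
    chunks = [(start, min(start + chunk_size, len(keys)))
              for start in range(0, len(keys), chunk_size)]
    # Cheap stage: one score per chunk, using the chunk's mean key.
    ranked = sorted(
        chunks,
        key=lambda c: dot(query, mean_vector(keys[c[0]:c[1]])),
        reverse=True,
    )
    selected = sorted(ranked[:top_k])
    positions = [i for start, end in selected for i in range(start, end)]
    # Exact stage: scaled dot-product attention over the selected tokens only.
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in positions])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, positions))
              for d in range(dim)]
    return output, positions


# Toy data: the first two tokens point along x, the last two along y.
keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
query = [0.0, 5.0]

output, positions = topk_chunk_attention(query, keys, values)
print(positions)  # token positions that received exact attention
print(output)
```

With a fixed number of selected chunks, the exact stage no longer depends on the total context length, and the cheap stage scales with the number of chunks rather than the number of token pairs. The trade-off is that anything relevant in an unselected chunk is missed entirely.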
You know, maybe we can put all of the Google code base in context for every Google developer, all the world's source code in context for any open source developer.
That would be amazing.
It would be incredible.
Yeah, you take a word and you blow it up to 10 kilobytes or something.
Oh, to be clear, we have actually already done further training on a Gemini model on our internal code base for our internal developers.
But that's different than attending to all of it, because it sort of stirs together the code base into a bunch of parameters.
And I think having it in context makes things clearer.
But even the sort of further trained model internally is incredibly useful.
Like, Sundar, I think, has said that 25% of the characters that we're checking into our code base these days are generated by our AI-based coding models with kind of human kind of...
Yeah.
I mean, I think one of the... in addition to kind of researchy context, like anytime you're seeing these models used, I think they're able to make software developers more productive because they can kind of take sort of a high-level spec or sentence description of what you want done and give a pretty approximate, you know, pretty reasonable first cut at that.
And so from a research perspective, maybe you can say, I'd really like you to explore this kind of idea similar to the one in this paper, but maybe let's try making it convolutional or something.