Trenton Bricken
His attention heads are querying his context, but then the way he's projecting his queries and keys into that space, and the way his MLPs are retrieving longer-term facts or modifying that information, is what allows him, in later layers, to do even more sophisticated queries and slowly reason through to a meaningful conclusion.
Something I think the people who aren't doing this research can overlook is that after the first layer of the model, every query, key, and value you're using for attention
comes from a combination of all the previous tokens.
So like my first layer, I'll query my previous tokens and just extract information from them.
But all of a sudden, let's say that I attended to tokens one, two, and four in equal amounts. Then the vector in my residual stream (assuming they each wrote out the same thing to their value vectors, but ignore that for a second) is a third of each of those.
And so when I'm querying in the future, my query is actually built from a third of each of those things.
But they might be written to different subspaces.
That's right.
Hypothetically, but they wouldn't have to.
And so you can recombine them and immediately, even by layer two and certainly by the deeper layers, just have these very rich vectors that are packing in a ton of information.
And the causal graph is literally over every single layer that happened in the past.
And that's what you're operating on.
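To make that concrete, here's a minimal numpy sketch of the idea; the tiny dimensions and the matrix names W_Q and W_V are purely illustrative assumptions, not any particular model's weights or code.

```python
# Toy sketch of the point above: after layer one, a position's residual
# stream is a weighted mix of the value vectors of the tokens it attended
# to, so the *next* layer's query is computed from that mix, not from any
# single token.
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # hypothetical tiny model width

# Residual-stream vectors for tokens 1, 2, and 4 (whatever earlier layers left there).
x1, x2, x4 = (rng.normal(size=d_model) for _ in range(3))

# Hypothetical projection matrices for one attention head.
W_V = rng.normal(size=(d_model, d_model))
W_Q = rng.normal(size=(d_model, d_model))

# Layer 1: attend to tokens 1, 2, and 4 in equal amounts and write their
# value vectors into this position's residual stream (output projection
# ignored, as in the "ignore that for a second" caveat above).
attn_weights = np.array([1 / 3, 1 / 3, 1 / 3])
values = np.stack([W_V @ x1, W_V @ x2, W_V @ x4])
residual = attn_weights @ values  # one third of each value vector

# Layer 2: the query at this position is a projection of that mixed vector,
# so it already blends information from tokens 1, 2, and 4.
query_layer2 = W_Q @ residual
print(query_layer2.shape)  # (8,)
```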
You can get an LLM to write it.
Or we could purposely exclude it, right?
Well, you'd need to scrub any discussion of it from Reddit or any other source, right?
So for me, this is a very legitimate response, right?
It's like, well, if you say humans are generally intelligent, then artificial general intelligences are no more capable or competent than we are.
I'm just worried about having that level of general intelligence in silicon, where you can then immediately clone hundreds of thousands of agents that don't need to sleep and can have super long context windows, and then they can start recursively improving, and then things get really scary.
So I think to answer your original question, yes, you're right.