Trenton Bricken
Like it's a good reason.
So, I guess context for listeners.
The induction head basically works like this: you see a line like "Mr. and Mrs. Dursley did something," and then later, "Mr. blank."
And you're trying to predict what blank is.
And the head has learned to look for previous occurrences of the word "Mr.", look at the word that comes after it, and then copy and paste that as the prediction for what should come next, which is a super reasonable thing to do.
And there is computation being done there to accurately predict the next token.
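To make that copy-and-paste rule concrete, here's a minimal sketch in plain Python (the function, the word-level tokenization, and the example sequence are illustrative simplifications, not anything from an actual model):

```python
def induction_prediction(tokens):
    """Toy version of the induction-head rule: to predict what follows
    tokens[-1], find the most recent earlier occurrence of tokens[-1]
    and copy the token that came after it."""
    last = tokens[-1]
    for i in range(len(tokens) - 2, -1, -1):  # scan backwards through the context
        if tokens[i] == last:
            return tokens[i + 1]  # copy and paste the following token
    return None  # no earlier occurrence, so the rule has nothing to copy

# "Mr. Dursley was the director ... Mr. ___" -> predicts "Dursley"
print(induction_prediction(["Mr.", "Dursley", "was", "the", "director", ".", "Mr."]))
```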
But yeah, that is context dependent. But it's not reasoning, you know what I mean? Though, I guess, going back to the idea that it's associations all the way down: what if you chain together a bunch of these reasoning circuits, or heads that have different rules for how to relate information? But in the sort of zero-shot case...
Well, I think there would be another circuit for extracting pixels and turning them into latent representations of the different objects in the game, right?
And a circuit that is learning physics.
Or, I mean, that would just be an empirical question, right?
Like, how big does the model need to be to perform this task?
But maybe it's useful if I just talk about some other circuits that we've seen.
So we've seen the IOI circuit, which is indirect object identification.
This is like, if you see "Mary and Jim went to the store, Jim gave the object to blank," right?
And it would predict Mary, because Mary appeared before as the indirect object.
Or it'll infer pronouns, right?
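As a toy illustration of what the IOI prediction amounts to (the function and the one-mention heuristic are my own simplification, not the actual circuit mechanics), the rule is roughly "pick the name that hasn't been repeated":

```python
from collections import Counter

def ioi_prediction(tokens, names):
    """Toy IOI heuristic: the indirect object is the name that appears
    only once, i.e. not the repeated subject ("Jim" below)."""
    counts = Counter(t for t in tokens if t in names)
    singles = [n for n, c in counts.items() if c == 1]
    return singles[0] if singles else None

sentence = "Mary and Jim went to the store , Jim gave the object to".split()
print(ioi_prediction(sentence, {"Mary", "Jim"}))  # -> Mary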
And this circuit even has backup behavior, where if you ablate it, then other heads in the model will pick up that behavior.
We'll even find heads that want to do copying behavior, and then other heads will suppress them.
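A rough sketch of that kind of ablation experiment, using the open-source TransformerLens library. I'm assuming GPT-2 small here, and head 9 in layer 9, which the IOI work reported as one of the name mover heads; the exact head index and the size of the effect are assumptions, and this only shows the ablation step, not the fuller path-patching analysis that reveals the backup heads compensating:

```python
# pip install transformer_lens
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")  # GPT-2 small

prompt = "When Mary and Jim went to the store, Jim gave the object to"
tokens = model.to_tokens(prompt)
mary, jim = model.to_single_token(" Mary"), model.to_single_token(" Jim")

def logit_diff(logits):
    # How strongly the model prefers " Mary" over " Jim" at the last position.
    return (logits[0, -1, mary] - logits[0, -1, jim]).item()

def ablate_head(z, hook, head=9):
    z[:, :, head, :] = 0.0  # zero out one attention head's output vectors
    return z

clean = logit_diff(model(tokens))
ablated = logit_diff(model.run_with_hooks(
    tokens,
    fwd_hooks=[(utils.get_act_name("z", 9), ablate_head)],
))
print(f"clean: {clean:.2f}  with head L9H9 ablated: {ablated:.2f}")
```

If the backup behavior the circuit exhibits is real, the logit difference should drop by less than that one head's direct contribution would suggest, because other heads partially take over.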