Trenton Bricken
They would still need to learn associations...
Well, I think then you can get into really interesting cases of meta learning.
Like when you play a new video game or study a new textbook, you're bringing a whole bunch of skills to the table to form those associations much more quickly.
And because everything in some way ties back to the physical world, I think there are general features that you can pick up and then apply in novel circumstances.
I mentioned multiple agents and I'm like, oh, here we go.
But any thoughts?
I think we need a few more nines of reliability in order for it to really be useful and trustworthy.
Right now, it's like...
And just having context lengths that are super long and very cheap to have.
If I'm working in our code base, it's really only small modules that I can get Claude to write for me right now.
But it's very plausible that within the next few years or even sooner, it can automate most of my tasks.
The only other thing here that I will note is...
The research that at least our sub-team in interpretability is working on is so early-stage that you really have to make sure everything is done correctly, in a bug-free way, and contextualize the results against everything else in the model.
And if something isn't going right, you have to be able to enumerate all of the possible causes and then slowly work through them.
Like an example that we've publicly talked about in previous papers is dealing with layer norm, right?
And it's like, if I'm trying to get an early result or look at the logit effects of the model, right?
So it's like, if I activate this feature that we've identified to a really large degree, how does that change the output of the model?
Am I using layer norm or not?
How is that changing the feature that's being learned?
And that will take even more context or reasoning abilities for the model.
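To make the layer norm point concrete, here is a minimal sketch of the kind of check being described: steering a learned feature direction and measuring its effect on the output logits, with and without the final layer norm. This is not Anthropic's actual tooling; the names (`feature_dir`, `resid`, `W_U`, `ln_f`) and shapes are illustrative assumptions.

```python
import torch

# Hypothetical sketch: measure the "logit effect" of activating a feature
# direction in the residual stream, comparing the result with and without
# the final layer norm. All names and dimensions here are illustrative.

d_model, vocab = 512, 1000
torch.manual_seed(0)

resid = torch.randn(d_model)           # residual-stream activation at some position
feature_dir = torch.randn(d_model)     # feature direction identified by some method
feature_dir = feature_dir / feature_dir.norm()
W_U = torch.randn(d_model, vocab)      # unembedding matrix
ln_f = torch.nn.LayerNorm(d_model)     # final layer norm before the unembedding

scale = 10.0                           # "activate the feature to a really large degree"
steered = resid + scale * feature_dir

# Skipping layer norm: the logit change is exactly linear in the steering scale.
delta_no_ln = (steered - resid) @ W_U

# Through layer norm: normalization rescales the whole vector, so the measured
# effect is no longer a simple linear function of the feature direction.
delta_with_ln = (ln_f(steered) - ln_f(resid)) @ W_U

print("top tokens w/o layer norm: ", delta_no_ln.topk(5).indices.tolist())
print("top tokens with layer norm:", delta_with_ln.topk(5).indices.tolist())
```

The two deltas generally rank tokens differently, which is the kind of discrepancy that forces you to decide up front whether layer norm belongs in the measurement.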