Trenton Bricken
Podcast Appearances
And then you mentioned offhand that Max was looking for something new.
Yeah.
Yeah, I mean, the job might just be so out of distribution from anything else that people would do.
That's right.
Yeah.
So Aditya Ray asks, how do you make it on Substack as a newbie writer?
Yeah, super excited for the book launch.
Thank you.
The website's awesome, by the way.
I appreciate it.
Oh, yeah.
Yeah.
And you can ignore what I'm about to say, because given the introduction, alignment is solved and AI safety isn't a problem.
But I think the context stuff does get problematic, but also interesting here.
I think there'll be more work coming out in the not-too-distant future around what happens if you give a 100-shot prompt for jailbreaks and adversarial attacks.
It's also interesting in the sense of if your model is doing gradient descent and learning on the fly, even if it's been trained to be harmless, you're dealing with a totally new model in a way.
You're, like, fine-tuning, but in a way where you can't control what's going on.
Can you explain what you mean by gradient descent happening in the forward pass and attention?
Yeah, no, no, no.
There was something in the paper about trying to teach the model to do linear regression.
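One way to make the "gradient descent in the forward pass" idea concrete is the sketch below. It is an illustration under stated assumptions, not necessarily the construction from the paper mentioned here: for in-context linear regression, a single gradient step on the least-squares loss starting from zero weights gives the same prediction as unnormalized (softmax-free) attention, with the context inputs as keys, the context targets as values, and the test input as the query. The function names, the learning rate, and the toy data are all hypothetical.

```python
import random


def gd_one_step_prediction(xs, ys, x_query, lr):
    """Predict y for x_query after one gradient step on least squares from w = 0."""
    n = len(xs)
    dim = len(x_query)
    # Gradient of (1/2n) * sum((w.x_i - y_i)^2) at w = 0 is -(1/n) * sum(y_i * x_i).
    grad = [-sum(y * x[d] for x, y in zip(xs, ys)) / n for d in range(dim)]
    w = [-lr * g for g in grad]  # one gradient descent step from w = 0
    return sum(wd * qd for wd, qd in zip(w, x_query))


def linear_attention_prediction(xs, ys, x_query, lr):
    """Same prediction written as attention without softmax (an assumption)."""
    n = len(xs)
    # Keys are the context inputs, values are the context targets.
    scores = [sum(kd * qd for kd, qd in zip(x, x_query)) for x in xs]
    return lr / n * sum(s * y for s, y in zip(scores, ys))


def make_examples(n, true_w, seed=0):
    """Hypothetical toy data: noiseless linear targets y = true_w . x."""
    rng = random.Random(seed)
    xs = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    ys = [sum(w * xd for w, xd in zip(true_w, x)) for x in xs]
    return xs, ys


xs, ys = make_examples(n=32, true_w=[2.0, -1.0, 0.5])
x_query = [1.0, 0.0, -1.0]
gd_pred = gd_one_step_prediction(xs, ys, x_query, lr=0.1)
attn_pred = linear_attention_prediction(xs, ys, x_query, lr=0.1)
print(gd_pred, attn_pred)  # the two values agree up to floating-point error
```

The two functions compute the same quantity, (lr / n) * sum_i y_i (x_i . x_query), just grouped differently. That is the sense in which a forward pass over the context can behave like a learning step, with no weights being updated.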