Chris Olah
๐ค SpeakerAppearances Over Time
Podcast Appearances
I do think a lot of it is eliciting powerful pre-trained models. So people are probably divided on this because obviously in principle, you can definitely teach new things. I think for the most part, for a lot of the capabilities that we... most use and care about.
A lot of that feels like it's there in the pre-trained models and reinforcement learning is eliciting it and getting the models to bring it out.
A lot of that feels like it's there in the pre-trained models and reinforcement learning is eliciting it and getting the models to bring it out.
A lot of that feels like it's there in the pre-trained models and reinforcement learning is eliciting it and getting the models to bring it out.
It's weird because I think that a lot of people prefer he for Claude. I actually kind of like that I think Claude is usually, it's slightly male-leaning, but it's like it can be male or female, which is quite nice.
It's weird because I think that a lot of people prefer he for Claude. I actually kind of like that I think Claude is usually, it's slightly male-leaning, but it's like it can be male or female, which is quite nice.
It's weird because I think that a lot of people prefer he for Claude. I actually kind of like that I think Claude is usually, it's slightly male-leaning, but it's like it can be male or female, which is quite nice.
I still use it, and I have mixed feelings about this because I'm like maybe, like I now just think of it as like, or I think of like the it pronoun for Claude as, I don't know, it's just like the one I associate with Claude. Yeah. I can imagine people moving to like he or she.
I still use it, and I have mixed feelings about this because I'm like maybe, like I now just think of it as like, or I think of like the it pronoun for Claude as, I don't know, it's just like the one I associate with Claude. Yeah. I can imagine people moving to like he or she.
I still use it, and I have mixed feelings about this because I'm like maybe, like I now just think of it as like, or I think of like the it pronoun for Claude as, I don't know, it's just like the one I associate with Claude. Yeah. I can imagine people moving to like he or she.
I've wondered if I anthropomorphize things too much. Because, you know, I have this like with my car, especially like my car and bikes, you know, like I don't give them names because then I once had, I used to name my bikes and then I had a bike that got stolen and I cried for like a week. And I was like, if I'd never given it a name, I wouldn't have been so upset. Yeah.
I've wondered if I anthropomorphize things too much. Because, you know, I have this like with my car, especially like my car and bikes, you know, like I don't give them names because then I once had, I used to name my bikes and then I had a bike that got stolen and I cried for like a week. And I was like, if I'd never given it a name, I wouldn't have been so upset. Yeah.
I've wondered if I anthropomorphize things too much. Because, you know, I have this like with my car, especially like my car and bikes, you know, like I don't give them names because then I once had, I used to name my bikes and then I had a bike that got stolen and I cried for like a week. And I was like, if I'd never given it a name, I wouldn't have been so upset. Yeah.
I felt like I'd let it down. Maybe it's that I've wondered as well, like it might depend on how much it feels like a kind of like objectifying pronoun. Like if you just think of it as like a, this is a pronoun that like objects often have, and maybe AIs can have that pronoun.
I felt like I'd let it down. Maybe it's that I've wondered as well, like it might depend on how much it feels like a kind of like objectifying pronoun. Like if you just think of it as like a, this is a pronoun that like objects often have, and maybe AIs can have that pronoun.
I felt like I'd let it down. Maybe it's that I've wondered as well, like it might depend on how much it feels like a kind of like objectifying pronoun. Like if you just think of it as like a, this is a pronoun that like objects often have, and maybe AIs can have that pronoun.
And that doesn't mean that I think of, if I call Claude it, that I think of it as less intelligent or like I'm being disrespectful. I'm just like, you are a different kind of entity. And so- That's, I'm going to give you the kind of the respectful it.
And that doesn't mean that I think of, if I call Claude it, that I think of it as less intelligent or like I'm being disrespectful. I'm just like, you are a different kind of entity. And so- That's, I'm going to give you the kind of the respectful it.
And that doesn't mean that I think of, if I call Claude it, that I think of it as less intelligent or like I'm being disrespectful. I'm just like, you are a different kind of entity. And so- That's, I'm going to give you the kind of the respectful it.
So there's a couple of components of it. The main component I think people find interesting is the kind of reinforcement learning from AI feedback. So you take a model that's already trained and you show it two responses to a query and you have a principle. So suppose the principle, like we've tried this with harmlessness a lot. So suppose that the query is about