Sam Altman
But if you want a system to be pretty offensive and you ask it to be, I think part of alignment is doing what its user asks for within these broad bounds that society agrees on. The second thing that really matters is what the defaults are. So if you don't do any of that, which most users don't, and you ask whatever controversial question you want, how should the system respond?
And we put a ton of work into both of those things. We also try to write up how the model should behave. We call this the model spec, such that you can tell if it's a bug or you disagree with us on some stance. But that's how we think about it.
I think no matter how neutral you try to write the thing, it will either be useless because it will just say, I can't answer that because there's politics in everything, or it will have some sort of point of view, which is why what we think we can do is write down what we intend for our default. People can debate that. If there's bugs in there, we can look at the bugs.
If there's problems with how we defined it, we can change what the definition is and retrain the system. But, yeah, I don't think any system can be – no two people are ever going to agree that one system is perfectly unbiased. But that's another reason why personalization matters so much.
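To make this layering concrete: the actual model spec is a prose document, not code, but a minimal sketch can show the structure being described here, with hard bounds that no preference can unlock, published defaults underneath, and user preferences honored in between. All names and settings below are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical sketch only -- the real model spec is prose, not code.
@dataclass
class ModelSpec:
    hard_bounds: frozenset[str]   # behaviors no user preference can unlock
    defaults: dict[str, str]      # the published, intended default behavior

    def resolve(self, setting: str, user_pref: str | None) -> str:
        """A user preference overrides the default, unless it crosses a hard bound."""
        if user_pref is not None and user_pref not in self.hard_bounds:
            return user_pref
        return self.defaults[setting]

spec = ModelSpec(
    hard_bounds=frozenset({"assist_with_violence"}),
    defaults={"tone": "neutral", "controversial_topics": "present_multiple_views"},
)

# Most users never express a preference, so the published default applies:
print(spec.resolve("controversial_topics", None))  # -> present_multiple_views
# An explicit preference within bounds is honored:
print(spec.resolve("tone", "blunt"))               # -> blunt
```

Writing the defaults down is what makes the bug-versus-disagreement distinction testable: if the system's behavior differs from its published defaults, that's a bug; if you dislike what the defaults say, that's a stance to debate.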
Yeah.
Actually, here's one thing I've been thinking about recently as a principle. Like OpenAI has not adopted this at all, but this has just been an idea that I think gets at what you're saying. Let's say we discover some new thing where it's like if you do this, people learn way better. If ChatGPT always responds with the Socratic method or whatever, students using it learn way better.
But let's say user preferences are not to get the Socratic method. Users just say, like, I just want you to answer my question. Tell me. Right. Then, like, how should we decide what to do there as the default behavior? And one idea that I have increasingly been thinking about is... What if we're always just really clear when we make a change to the spec?
And so you'll never have our thumb on the scale hiding behind an algorithm, which I think Twitter does all the time, for example, and all sorts of weird things there. We'll always tell you what the intended behavior is. And if we make a change to it, we'll explain why.
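The transparency idea lends itself to the same kind of sketch: a public changelog where every change to the intended default ships with its rationale, so nothing moves silently behind an algorithm. The entry below reuses the Socratic example from the conversation; the date, field names, and wording are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch: record every change to the intended default
# publicly, with the reasoning attached, instead of shipping it silently.
@dataclass(frozen=True)
class SpecChange:
    changed_on: date
    setting: str
    old_default: str
    new_default: str
    rationale: str  # the "we'll explain why" part

changelog = [
    SpecChange(
        changed_on=date(2025, 1, 1),  # placeholder date
        setting="tutoring_style",
        old_default="direct_answers",
        new_default="socratic",
        rationale="Evidence suggested students learn better this way; "
                  "users who just want the answer can still opt out.",
    ),
]

for c in changelog:
    print(f"{c.changed_on}: {c.setting}: {c.old_default} -> {c.new_default}")
    print(f"  why: {c.rationale}")
```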