Sam Altman
And I totally get why people were like, give us GPT-4 right away.
But I'm happy we did it this way.
So I want to be very clear.
I do not think we have yet discovered a way to align a super powerful system.
We have something that works for our current scale called RLHF.
And we can talk a lot about the benefits of that and the utility it provides.
It's not just an alignment capability. Maybe it's not even mostly an alignment capability.
It helps make a better system, a more usable system.
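(Editor's note: the RLHF mentioned above has a two-stage structure: fit a reward model to human preference comparisons, then steer the policy toward responses that model scores highly. A minimal toy sketch follows; it is not OpenAI's implementation, and the data, features, and linear reward model are hypothetical illustrations.)

```python
# Toy sketch of RLHF's two stages; everything here is hypothetical.
import math

# Stage 1: fit a reward model to pairwise human preferences.
# Each example pairs the features of a preferred response with the
# features of a rejected one. A linear reward r(x) = w . x is trained
# with the Bradley-Terry loss: -log sigmoid(r(chosen) - r(rejected)).
prefs = [([1.0, 0.2], [0.1, 0.9]),
         ([0.8, 0.1], [0.3, 0.7])]
w, lr = [0.0, 0.0], 0.1

def reward(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

for _ in range(200):
    for chosen, rejected in prefs:
        margin = reward(w, chosen) - reward(w, rejected)
        scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # gradient of -log sigmoid(margin)
        for i in range(len(w)):
            w[i] += lr * scale * (chosen[i] - rejected[i])

# Stage 2: policy improvement. A real system would fine-tune the model
# with PPO plus a KL penalty against the pretrained policy; here we just
# rank candidate responses by learned reward to show the direction of
# the update, which is what makes the system more usable, not only safer.
candidates = {"helpful": [0.9, 0.1], "evasive": [0.2, 0.8]}
print(max(candidates, key=lambda k: reward(w, candidates[k])))  # "helpful"
```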
Yeah.
And this is actually something that I don't think people outside the field understand enough.
It's easy to talk about alignment and capability as orthogonal vectors.
They're very close.
Better alignment techniques lead to better capabilities and vice versa.
There are cases where they differ, and those are important cases.
But on the whole, I think things like RLHF or interpretability that sound like alignment issues also help you make much more capable models.
And the division is just much fuzzier than people think.
And so in some sense, the work we do to make GPT-4 safer and more aligned looks very similar to all the other work we do: solving the research and engineering problems associated with creating useful and powerful models.
And there's no one set of human values, and there's no one set of right answers for human civilization.
So I think what's going to have to happen is we will need to agree, as a society, on very broad bounds.