Sam Altman
But I wouldn't have guessed it at the time when I was like an eight-year-old.
So we finished last summer.
We immediately started giving it to people to red team.
We started doing a bunch of our own internal safety evals on it.
We started trying to work on different ways to align it.
And that combination of an internal and external effort, plus building a whole bunch of new ways to align the model.
And we didn't get it perfect by far, but one thing that I care about is that our degree of alignment increases faster than our rate of capability progress.
And that I think will become more and more important over time.
And I think we made reasonable progress there to a more aligned system than we've ever had before.
I think this is the most capable and most aligned model that we've put out.
We were able to do a lot of testing on it.
And that takes a while.
And I totally get why people were like, give us GPT-4 right away.
But I'm happy we did it this way.
So I want to be very clear.
I do not think we have yet discovered a way to align a super powerful system.
We have something that works for our current scale called RLHF.
And we can talk a lot about the benefits of that and the utility it provides.
It's not just an alignment.