John Schulman
So I think it was just much easier for people to get the idea of what the model was supposed to do.
So as a result, I think the model had a much more coherent personality and it was much easier to get pretty sensible behavior robustly.
Interesting.
Not exactly.
I mean, they could have... I don't remember the status of which models were available for fine-tuning.
Assuming we had 3.5 available for fine-tuning at the time, you could have made something pretty decently close, but I'm not sure you would have...
I don't think you would have been able to do just one iteration of fine-tuning, where you have purely human-written data and you fine-tune on that.
I think you would want to do several iterations.
Like, if you're not going to do RL, which we did, you'd want to do some kind of iterative supervised fine-tuning where you have humans edit the model-generated outputs. Because if you train on human-generated data, even if it's really high quality, it's just hard for a model to fit that data perfectly, since it might not be something the model is capable of outputting. So you need to do something iterative that looks a little bit more like RL.
So I think if you had done that, you could have gotten something pretty close, but that would have been kind of non-trivial.
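A minimal sketch of the loop described above, where each round the model generates drafts, humans edit them, and the model is fine-tuned on the edited versions. All names here (generate, human_edit, finetune_on, iterative_sft) are hypothetical stand-ins with toy stub implementations, not any real OpenAI API or training code; it only illustrates why training on edited model outputs stays closer to the model's own distribution than purely human-written data.

```python
def generate(model, prompt):
    """Sample a completion from the current model (toy stub: dict lookup)."""
    return model.get(prompt, "")


def human_edit(prompt, draft):
    """Stand-in for an annotator lightly editing the model's draft.

    Editing a draft keeps the target close to what the model can
    actually produce, which is the point being made in the interview.
    """
    return draft.strip() + " [edited for correctness and tone]"


def finetune_on(model, examples):
    """Stub for a supervised fine-tuning step on (prompt, target) pairs."""
    updated = dict(model)
    updated.update({prompt: target for prompt, target in examples})
    return updated


def iterative_sft(model, prompts, num_rounds=3):
    """Alternate generation, human editing, and fine-tuning.

    Because each round trains on edited versions of the model's own
    outputs, the procedure 'looks a little bit more like RL' than a
    single pass over purely human-written data.
    """
    for _ in range(num_rounds):
        edited = [(p, human_edit(p, generate(model, p))) for p in prompts]
        model = finetune_on(model, edited)
    return model


if __name__ == "__main__":
    toy_model = {"Explain RLHF briefly.": "RLHF is fine-tuning with human feedback."}
    print(iterative_sft(toy_model, list(toy_model)))
```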
But we also had another instruction-following model trained with RL that was released a little before ChatGPT.
So I think if you put a chat wrapper on that, you would get something decently close.
But that model, if you just prompted it with chat, had some differences in strengths.
Like, that model was pretty good at writing and poetry and so forth, but it wasn't as good at knowing its limitations, at factuality, and so forth.
I would say faster than I would have expected since GPT-2.
I was pretty bought into scaling and pre-training and so forth being a good idea.
But when GPT-2 was done, I would say I wasn't completely sold on it revolutionizing everything.