John Schulman
There might be some biases in the labeling that lead to verbosity, like the fact that we tend to train for one message at a time rather than the full interaction.
So if you only see one message, then something that just has a clarifying question, or maybe a short response with an invitation to follow up, is going to look less complete than something that covers all possibilities.
There's also a question of whether people's preferences would change depending on how fast the model is streaming its output.
Like clearly if you're sitting there waiting for the tokens to come out, you're gonna prefer that it gets to the point.
But if it just gives you a dump of text instantly, maybe you don't actually care if there's a bunch of boilerplate, or if there's a bunch of stuff you're gonna skim. You'd rather just have it all there.
Yeah, that's a good question.
So I think these preference models do learn a lot of subtleties about what people prefer that would be hard to articulate in an instruction manual.
Yeah.
Maybe if you...
Obviously, you can write an instruction manual that has lots of examples of comparisons.
And that's what the model spec has.
It has a lot of examples with some explanation.
So it's not clear what the optimal format is for describing preferences.
I would guess that whatever you can get out of a big dataset that captures fuzzy preferences, you can distill down to a shorter document that mostly captures the ideas.
And I would think that the bigger models do learn a lot of these concepts automatically.
They'll just learn from all the pre-training data what people would find useful and helpful, and they'll have some complex moral theories.
But of course there's still a lot of room to latch on to a different style or a different morality.
So I think when we have...