Sholto Douglas
And an ongoing challenge will be imbuing the models with taste and setting up the right feedback loops so that you can actually do that.
Maybe the best public example is a paper that OpenAI put out recently, where they judge the answers to medical questions using grading-criteria feedback.
So doctors have posed various questions, and for each one there's a marking scheme, like for a short-answer question in an exam: did the model mention X, Y, Z?
Did it recommend to do this kind of thing?
And they grade the model according to this.
And in this paper, they found that, one, the models are incredible at this.
And two, that the models are good enough to grade the answers themselves.
Because maybe one good mental model is roughly: if you can construct grading criteria that an everyday person off the street could apply, then the models are probably capable of interpreting those criteria.
If it requires expertise and taste, that's a tougher question.
Is this a wonderful piece of art?
That's difficult.
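To make the rubric idea above concrete, here is a minimal Python sketch of grading an answer against a list of "did it mention X?" criteria. Everything in it is an assumption for illustration: the `Criterion` class, the `grade` function, the example rubric, and the point values are invented here and are not taken from the paper. In particular, `keyword_judge` is a crude keyword-matching stand-in for the judge; in the setup described above, a model would decide whether each criterion is met.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Criterion:
    """One rubric line: what the answer should do, and how many points it is worth."""
    description: str
    keywords: tuple[str, ...]  # used only by the keyword stand-in judge below
    points: int


# A judge decides whether an answer satisfies one criterion.
Judge = Callable[[str, Criterion], bool]


def keyword_judge(answer: str, criterion: Criterion) -> bool:
    """Hypothetical stand-in for a model grader: case-insensitive keyword match."""
    text = answer.lower()
    return any(keyword.lower() in text for keyword in criterion.keywords)


def grade(answer: str, rubric: Sequence[Criterion], judge: Judge = keyword_judge) -> float:
    """Return the fraction of available rubric points that the answer earns."""
    total = sum(criterion.points for criterion in rubric)
    if total <= 0:
        raise ValueError("rubric must contain a positive number of points")
    earned = sum(criterion.points for criterion in rubric if judge(answer, criterion))
    return earned / total


# Invented example rubric, for illustration only.
RUBRIC = [
    Criterion("Advises seeking emergency care", ("emergency",), 5),
    Criterion("Asks how long symptoms have lasted", ("how long",), 3),
    Criterion("Mentions staying hydrated", ("fluids", "hydrat"), 2),
]

if __name__ == "__main__":
    answer = "Please go to the emergency room and keep drinking fluids."
    print(f"score: {grade(answer, RUBRIC):.2f}")
```

Because the judge is just a function argument, the keyword matcher can be swapped for a call to a model without touching the scoring logic. The checklist covers "did it mention X" style criteria; it does not capture the taste judgments discussed here.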
I think one of our friends at one of the companies, I don't know if I can say his name or not, tried to teach the models to write.
Yeah.
And I think he had a lot of trouble hiring human writers who he thought had taste and weren't encouraging the models to write slop.
Yeah.
It worked to some degree.
Big model smell.
Yeah.
But that was in part because of his efforts at doing this and paring down the number of humans.