Nick Heiner
that sort of makes some of those judgments in a more specialized way and then tells the LLM or sort of gives it some prompting that's going to nudge it into a certain distribution.
Exactly.
And we have things like that today.
I just think that they're going to get so much better.
Yeah.
Today, I can give an LLM a lot of my own writing and then ask it to write like me, and I feel like I can tell the difference.
But I think there will come a time when I won't be able to.
Yeah, I mean, I think it really does come down to the reward signal.
Part of it, though, is also just a scaling problem. In 2023, people would put a prompt into a model, get an answer in 90 seconds at the most, and be thrilled with that response.
And now we want models to do things that would take a human a week or a month.
And like, that's just that's a very hard problem to figure out how to scale and sort of construct those tasks, because, you know, think about making an assignment for a student.
If it takes you, the expert, a week to do something, then just doing the task is hard in the first place.
But now you also need to come up with the grading rubric, and you need to check: okay, I knew what I wanted the task to be, but did I actually write all those instructions down properly? Or would a reasonable student take a different interpretation that then wouldn't line up with the grading rubric I wrote?
So if it takes you a week to do the task, that's just the baseline.
And now you've got to do way more work to actually make this, you know, something you can teach someone with.
So, yeah, just figuring out how to scale all that, you know, is a pretty substantial bottleneck right now.
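The task-plus-rubric idea above can be sketched in code. This is a toy illustration, not any lab's actual pipeline: the `RubricItem` structure, the string-matching predicates, and the weights are all made up for the example. The point is just that a rubric turns a free-form submission into a scalar reward, which is the kind of signal being discussed.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RubricItem:
    description: str                 # what the grader is checking for
    weight: float                    # how much this criterion counts
    passed: Callable[[str], bool]    # predicate over the submission

def grade(submission: str, rubric: list[RubricItem]) -> float:
    """Return a reward in [0, 1]: the weighted fraction of rubric items met."""
    total = sum(item.weight for item in rubric)
    earned = sum(item.weight for item in rubric if item.passed(submission))
    return earned / total if total else 0.0

# Hypothetical rubric for a made-up "write a CSV parser" assignment.
# Real graders would be far more sophisticated than substring checks.
rubric = [
    RubricItem("handles quoted fields", 1.0,
               lambda s: "quote" in s.lower()),
    RubricItem("documents edge cases", 0.5,
               lambda s: "edge case" in s.lower()),
]

reward = grade("Handles quoted fields; edge cases documented.", rubric)
```

Note how the ambiguity problem from the conversation shows up directly: if the student's submission phrases things differently than the predicate expects, a correct answer scores zero, which is exactly the "did my rubric match my intent?" failure mode.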
Yeah, yeah.
And can you fix that?