John Schulman
So, for example, right now, you could imagine having models carry out a whole coding project instead of giving you one suggestion on how to write a function.
You could imagine giving the model high-level instructions on what to code up, and it'll go and write many files, test them, look at the output, and iterate on that a bit.
So, just much more complex tasks.
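As a rough illustration, here is a minimal sketch of that kind of loop in Python, assuming a pytest-style test suite. `propose_edits` is a hypothetical stand-in for the model call, not a real API:

```python
# Minimal sketch of the agentic coding loop described above: the model
# gets a high-level instruction, writes files, runs the tests, reads the
# output, and iterates until the tests pass.
import subprocess
from typing import Callable, Dict, List

def run_tests() -> tuple[bool, str]:
    """Run the test suite and capture its output for the model to read."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def agent_loop(
    instruction: str,
    # Hypothetical model call: maps the conversation so far to file edits.
    propose_edits: Callable[[List[str]], Dict[str, str]],
    max_iters: int = 5,
) -> bool:
    """Give the model a high-level instruction and let it iterate."""
    history = [instruction]
    for _ in range(max_iters):
        # The model proposes edits as {path: new file contents}.
        for path, contents in propose_edits(history).items():
            with open(path, "w") as f:
                f.write(contents)
        passed, output = run_tests()
        if passed:
            return True
        # Feed the failing test output back so the model can iterate on it.
        history.append(output)
    return False
```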
Yeah, I would say this will come from some combination of training the models to do harder tasks like this.
Right now, most of the training data involves doing single steps at a time, and I would expect us to do more training of models to carry out these longer projects.
So I'd say any kind of training at carrying out these long projects is going to make the models a lot better, whether you're doing RL and supervising only the final output or supervising each step along the way.
And since the whole area is pretty new, I'd say there's just a lot of low-hanging fruit in doing this kind of training.
Interesting.
So I'd say that's one thing.
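To make the supervision distinction concrete, here's a toy sketch in Python: outcome supervision rewards only the final result, while per-step (process) supervision scores every intermediate step. `final_check` and `grade_step` are hypothetical graders, not real APIs:

```python
# Toy illustration of the two supervision schemes for RL on long tasks.
# The returned lists are per-step rewards that could feed an RL update
# such as a policy-gradient step. Assumes `steps` is non-empty.
from typing import Callable, List

def outcome_rewards(
    steps: List[str], final_check: Callable[[str], bool]
) -> List[float]:
    # Outcome supervision: reward arrives only at the end of the episode,
    # based on whether the final output passes the check.
    return [0.0] * (len(steps) - 1) + [1.0 if final_check(steps[-1]) else 0.0]

def process_rewards(
    steps: List[str], grade_step: Callable[[str], float]
) -> List[float]:
    # Process supervision: each intermediate step gets its own score.
    return [grade_step(step) for step in steps]
```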
Also, I would expect that as the models get better, they'll be better at recovering from errors and dealing with edge cases; when things go wrong, they'll know how to recover.
The models will also be more sample efficient, so you won't have to collect a ton of data to teach them how to get back on track.
Just a little bit of data, or generalization from their other abilities, will allow them to get back on track, whereas current models might just get stuck and get lost.
Right.
They're not directly connected.
So I would say you usually have a little bit of data that does everything.