John Schulman
But I guess I wouldn't expect the web to get totally redesigned to have APIs everywhere, because I'd expect that we can get models to use the same kind of UIs that humans use. I mean, I guess that's been the big lesson of language models, right? That they can act with the same affordances that humans have.
There have definitely been some interesting instances of generalization in post-training. One well-known phenomenon is that if you do all your fine-tuning with English data, the model will automatically behave well in other languages too.
So if you train the assistant on English data, it'll also do something reasonable in Spanish, say.
And sometimes you might get the wrong behavior in terms of whether it replies in English or replies in Spanish, but usually you get the right behavior there as well.
Like you get it to respond in Spanish to Spanish queries.
So that's one interesting instance of generalization: the model just latches onto the right helpful persona and then automatically does the right thing in different languages. We've seen some version of this with multimodal data, where if you do text-only fine-tuning you also get reasonable behavior with images. Early on in ChatGPT, we were trying to fix some issues in terms of the model understanding its own limitations.
Like, early versions of the model would think they could send you an email or call an Uber or something.
Like the model would try to play the assistant and it would say, oh yeah, of course I sent that email.
And obviously it didn't.
So we started collecting some data to fix those problems.
And we found that a tiny amount of data did the trick, even when you mix it together with everything else.
I don't remember exactly how many examples, but something like 30. It was a pretty small number of examples showing this general behavior of explaining that the model doesn't have a given capability, and that generalized pretty well to all sorts of capabilities we didn't train for.
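A minimal sketch of the data-mixing idea described here, folding a tiny set of "I can't actually do that" examples into a much larger fine-tuning mix. The example prompts, dataset sizes, helper names, and sampling scheme are all illustrative assumptions, not the actual pipeline.

```python
import random

# Hypothetical sketch: a handful of capability-limitation examples mixed into a
# much larger general fine-tuning set. Contents and sizes are assumptions.

limitation_examples = [
    {"prompt": "Send that email to my boss for me.",
     "response": "I can't send emails, but I can draft one for you to send yourself."},
    {"prompt": "Call me an Uber to the airport.",
     "response": "I'm not able to book rides, but I can help you plan the trip."},
    # ...on the order of a few dozen examples in total
]

# Stand-in for the much larger general-purpose fine-tuning set.
general_examples = [
    {"prompt": f"General instruction-following example #{i}",
     "response": f"Helpful answer #{i}"}
    for i in range(100_000)
]

def build_training_mix(general, limitations, seed=0):
    """Concatenate the small limitations set into the large general mix and shuffle.

    Per the anecdote above, even a handful of targeted examples, mixed in without
    any special upweighting, can be enough for the behavior to generalize.
    """
    mixed = list(general) + list(limitations)
    random.Random(seed).shuffle(mixed)
    return mixed

training_data = build_training_mix(general_examples, limitation_examples)
print(f"{len(limitation_examples)} targeted examples out of {len(training_data)} total")
```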
It's hard to say exactly what the deficit will be. I mean, when you talk to the models today, they have various weaknesses besides long-term coherence, like really thinking hard about things or paying attention to what you ask them.
I would say I wouldn't expect just improving the coherence a little bit to be all it takes to get to AGI, but I guess I wouldn't be able to articulate exactly what the main weakness is that'll stop them from being a fully functional colleague.
It seems like then you should be planning for the possibility you would have AGI very soon.
Well, I would say that if AGI came way sooner than expected, we would definitely want to be careful about it.
And we might want to slow down a little bit on training and deployment until we're pretty sure we know we can deal with it safely.