John Schulman

Dwarkesh Podcast

John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Whereas the people doing more English writing and composition tend to be US-based.

So,

Yeah, and I'd say there have been times when we needed to hire different experts for some of our campaigns.

Some of them are very talented, and we even find that they're at least as good as us, the researchers, at doing these tasks, and they're much more careful than us.

So I would say the people we have now are quite skilled and conscientious.

Right, you don't exactly need that because, yeah, you can get quite a bit out of generalization.

So the base model has already been trained on tons of documentation, tons of code with shell scripts, and so forth.

So it's already seen all the FFmpeg man pages and lots of bash scripts and everything.

And it's...

So even just giving the base model a good few-shot prompt, you can get it to answer queries like this.
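As a rough illustration of the few-shot prompting idea, here is a minimal sketch of how such a prompt might be assembled for a base model. The example questions, the `Q:`/`A:` layout, and the function name `build_few_shot_prompt` are all assumptions made for illustration; none of them come from the conversation, and the call that would send the prompt to a model is left out.

```python
# Hypothetical worked examples; a real prompt would likely use more of them.
FEW_SHOT_EXAMPLES = [
    ("Convert input.mov to MP4.", "ffmpeg -i input.mov output.mp4"),
    ("Extract the audio from input.mp4 as MP3.", "ffmpeg -i input.mp4 -vn audio.mp3"),
]


def build_few_shot_prompt(query, examples=FEW_SHOT_EXAMPLES):
    """Return a prompt that shows solved examples, then the new query.

    The prompt ends with an open "A:" so that a base model, which only
    continues text, is nudged to complete it with an answer.
    """
    parts = [f"Q: {question}\nA: {answer}\n" for question, answer in examples]
    parts.append(f"Q: {query}\nA:")
    return "\n".join(parts)


prompt = build_few_shot_prompt("Cut a 5 second clip starting at 10 seconds from input.mp4.")
print(prompt)
```

The design choice here is only that the model sees the pattern several times before the unanswered query; the exact formatting is a guess, not a recommendation from the speaker.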

Just training a preference model for helpfulness will, probably even if you don't train it on any STEM, somewhat generalize to STEM.
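For readers unfamiliar with preference models, the sketch below shows one common way a pairwise preference objective is written: the model assigns a scalar score to each of two responses, and the loss is small when the human-preferred response scores higher. This is an assumption about the general technique (a Bradley-Terry style loss), not a description of the setup discussed in the interview, and the function name `pairwise_preference_loss` is hypothetical.

```python
import math


def pairwise_preference_loss(score_chosen, score_rejected):
    """Return -log(sigmoid(score_chosen - score_rejected)).

    Both scores are plain floats standing in for the outputs of a
    preference (reward) model on the preferred and the rejected response.
    """
    margin = score_chosen - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is lower when the preferred response gets the higher score.
print(pairwise_preference_loss(2.0, 0.0) < pairwise_preference_loss(0.0, 2.0))
```

Nothing in this objective refers to the topic of the responses, which is one way to read the point about generalization: what gets learned is a notion of "more helpful", and the domains it covers depend on the comparison data and on the underlying model.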

And so not only do you not need examples of how to use FFmpeg, you might not even need anything with programming to get some reasonable behavior in the programming domain.

Yeah, I would definitely expect things to move in that direction.

It's unclear what's going to be the best form factor, whether it's something like a Clippy that's on your computer and helping you with something, or if it's more like a helpful colleague in the cloud.

So we'll see which kinds of form factors work the best.

And I would expect people to try all of them out.

Yeah, I would expect more like,

Yeah, I would expect the mental model of a helpful assistant or helpful colleague to become more real, where you can share more of your everyday work. Instead of just giving it one-off queries, you would have a whole project that you're doing, and it knows about everything you've done on that project so far.

It can even proactively make suggestions. Like,

maybe you can tell it, oh yeah, remember to ask me about this and whether I've made any progress on it.
