
John Schulman

👤 Speaker
528 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

If we were to write a doc, or if we're going to align these models, what we're doing is latching onto a specific style and a specific morality, and you still need a decently long document to capture exactly what you want.

Yeah.

I think there's something of a moat, because it's just a very complex operation and you have to have a lot of skilled people doing it. So there's a lot of tacit knowledge, and there's a lot of organizational knowledge that's required. So I think post-training, to create a model that actually has all the functionality people care about, requires a pretty complicated effort. This is basically an accumulation of a lot of R&D, so I would say that makes it somewhat of a moat: it's not trivial to spin this up immediately.

It does seem like the same companies that are putting together the most serious pre-training efforts are also putting together the serious post-training efforts. So it seems like it is somewhat possible to copy these efforts or to spin up more of them.

There's also one force that makes it less of a moat: you can distill the models, or take someone else's model and clone its outputs, or use someone else's model as a judge to do comparisons.
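To make the "model as a judge" idea concrete, here is a minimal Python sketch of pairwise comparisons: a stronger model is asked which of two candidate answers is better, and the wins are tallied into a win rate. The judge_model helper is a hypothetical placeholder for whatever model API you would actually call, and the prompt wording and verdict parsing are illustrative assumptions, not any particular lab's setup.

def judge_model(prompt: str) -> str:
    # Hypothetical placeholder: send the prompt to a judge model and return its reply.
    raise NotImplementedError("wire this up to the model API you actually use")

def compare(question: str, answer_a: str, answer_b: str) -> str:
    # Ask the judge which answer is better; returns "A", "B", or "tie".
    prompt = (
        "Question:\n" + question + "\n\n"
        "Answer A:\n" + answer_a + "\n\n"
        "Answer B:\n" + answer_b + "\n\n"
        "Which answer is better? Reply with exactly one of: A, B, tie."
    )
    verdict = judge_model(prompt).strip().upper()
    if verdict in {"A", "B"}:
        return verdict
    return "tie"

def win_rate(questions, answers_a, answers_b) -> float:
    # Fraction of comparisons where model A's answer is preferred over model B's.
    wins = sum(
        compare(q, a, b) == "A"
        for q, a, b in zip(questions, answers_a, answers_b)
    )
    return wins / max(len(questions), 1)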

So I think the more big-league players probably aren't doing that, because it goes against terms-of-service policies and it would also be sort of a hit to their pride. But I would expect some of the smaller players are doing that to get off the ground, and that catches you up to a large extent.

I guess it helps you clear them out. What is the median rater like? Where are they based? What are their politics? What is their sort of knowledge level?

I would say it varies a lot.

So we've definitely hired raters with different skills or for different kinds of tasks or projects.

So I would say a decent mental model is to just look at the people who are on Upwork and other platforms like that, people doing odd jobs with remote work. So it's a pretty international group. There's a decent number of people in the US. We hire different groups of people for different types of labeling, depending on whether we're more focused on writing or on STEM tasks.

So people doing STEM tasks are more likely to be in India or other middle- or lower-middle-income countries.