
John Schulman

👤 Speaker
528 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

If we were to write a doc, or if we're going to align these models, what we're doing is latching onto a specific style and a specific morality, and you still need a decently long document to capture exactly what you want.

Yeah.

I think there's something of a moat, because it's just a very complex operation and you have to have a lot of skilled people doing it. So there's a lot of tacit knowledge, and there's a lot of organizational knowledge that's required. So I think post-training, to create a model that actually has all the functionality people care about, requires a pretty complicated effort. This is basically an accumulation of a lot of R&D, so I would say that makes it somewhat of a moat: it's not trivial to spin this up immediately.

It does seem like the same companies that are putting together the most serious pre-training efforts are also putting together the serious post-training efforts. So it seems like it is somewhat possible to copy these efforts or to spin up more of them.

There's also one force that makes it less of a moat: you can distill the models, or take someone else's model and clone its outputs, or use someone else's model as a judge to do comparisons.
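To make the "model as a judge" idea concrete, here is a minimal Python sketch of pairwise comparisons: a stronger model is asked which of two candidate answers is better, and the wins are tallied into a win rate. The judge_model helper is a hypothetical placeholder for whatever model API you would actually call, and the prompt wording and verdict parsing are illustrative assumptions, not any particular lab's setup.

def judge_model(prompt: str) -> str:
    # Hypothetical placeholder: send the prompt to a judge model and return its reply.
    raise NotImplementedError("wire this up to the model API you actually use")

def compare(question: str, answer_a: str, answer_b: str) -> str:
    # Ask the judge which answer is better; returns "A", "B", or "tie".
    prompt = (
        "Question:\n" + question + "\n\n"
        "Answer A:\n" + answer_a + "\n\n"
        "Answer B:\n" + answer_b + "\n\n"
        "Which answer is better? Reply with exactly one of: A, B, tie."
    )
    verdict = judge_model(prompt).strip().upper()
    if verdict in {"A", "B"}:
        return verdict
    return "tie"

def win_rate(questions, answers_a, answers_b) -> float:
    # Fraction of comparisons where model A's answer is preferred over model B's.
    wins = sum(
        compare(q, a, b) == "A"
        for q, a, b in zip(questions, answers_a, answers_b)
    )
    return wins / max(len(questions), 1)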

So I think the more big-league players probably aren't doing that, because it goes against terms-of-service policies and it would also be sort of a hit to their pride. But I would expect some of the smaller players are doing that to get off the ground, and that catches you up to a large extent.

I guess it helps you clear them out. What is the median rater like? Where are they based? What are their politics? What is their sort of knowledge level?

I would say it varies a lot.

So we've definitely hired raters with different skills or for different kinds of tasks or projects.

So I would say a decent mental model is to just look at the people who are on Upwork and other platforms like that, people doing odd jobs with remote work. So it's a pretty international group. There's a decent number of people in the US. We hire different groups of people for different types of labeling, depending on whether we're more focused on writing or on STEM tasks.

So people doing STEM tasks are more likely to be in India or other middle- or lower-middle-income countries.