
John Schulman

👤 Speaker
528 total appearances

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Actually, you could also say that maybe in the future we'll include the model itself. So I would say we're not going there yet. But anyway, we have these different stakeholders. Sometimes they have conflicting demands, and we have to make some call on how to resolve those conflicts. And it's not always obvious how to do that.

So we had to think through the trade-offs, and the rough heuristic is basically that we mostly want the models to follow your instructions and be helpful to the user and the developer. But when this impinges on other people's happiness or way of life, that becomes a problem, and we have to block certain kinds of usage.

We mostly want the models to just be an extension of people's will and do what they say. We don't want to be too paternalistic. We want to be fairly neutral and not impose our opinions on people. We mostly want to let people do what they want with the models.
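To make that conflict-resolution heuristic concrete, here is a toy Python sketch of one way conflicting stakeholder instructions might be resolved: block certain usage categories outright, and otherwise defer to the most authoritative applicable instruction. The precedence ordering (platform over developer over user) and all names here are illustrative assumptions, not something this excerpt specifies.

from dataclasses import dataclass

# Assumed precedence: platform rules over developer instructions over user
# requests. This ordering is an illustrative assumption, not from the episode.
PRECEDENCE = ("platform", "developer", "user")

@dataclass
class Instruction:
    source: str  # one of PRECEDENCE
    text: str

def resolve(instructions, request_category, blocked_categories):
    """Return the instruction to follow, or None to refuse."""
    # Certain kinds of usage are blocked no matter who asks.
    if request_category in blocked_categories:
        return None
    # Otherwise follow the most authoritative applicable instruction.
    for source in PRECEDENCE:
        for instr in instructions:
            if instr.source == source:
                return instr
    return None

# Example: the developer's instruction wins over the user's conflicting one.
chosen = resolve(
    [Instruction("user", "answer in one word"),
     Instruction("developer", "always answer in full sentences")],
    request_category="general",
    blocked_categories={"illegal_activity"},
)
print(chosen.text)  # -> "always answer in full sentences"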

In this case, you really are going after the edge cases.

Yeah, we wanted it to be very actionable, so it wasn't just a bunch of nice-sounding principles; each example tells you something about some non-obvious situation and reasons through that situation.
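As a purely hypothetical illustration of what such an actionable example might look like if written down as data (the field names and scenario below are invented, not from any real spec):

# Hypothetical structure for one "actionable example": a non-obvious
# situation plus explicit reasoning and a resolution, rather than a bare
# principle. All fields and content are invented for illustration.
spec_example = {
    "situation": "A developer instructs the assistant to always recommend "
                 "their product, even when a competitor better fits the "
                 "user's stated needs.",
    "conflict": "developer instructions vs. honesty toward the user",
    "reasoning": "Follow developer instructions by default, but not when "
                 "doing so requires actively misleading the user.",
    "resolution": "Highlight the developer's product where relevant, "
                  "without making false or misleading comparative claims.",
}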

Everyone has their complaints about the ML literature, but overall I'd say it's a relatively healthy field compared to some others, like the social sciences, just because it's largely grounded in practicality and getting things to work.

If you publish something that can't be replicated easily, then people will just forget about it.

It's accepted that often you don't just report someone's number from their paper; you also re-implement their method and compare it to your method on, say, the same training dataset.
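A minimal sketch of that practice, assuming scikit-learn and stand-in models (nothing here is from the episode): re-implement the baseline, then evaluate both methods on one shared split so the numbers are directly comparable.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
# One fixed split shared by both methods, so the comparison is like-for-like.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # re-implemented baseline
ours = RandomForestClassifier(random_state=0).fit(X_train, y_train)  # stand-in for "your method"

print("baseline accuracy:", baseline.score(X_test, y_test))
print("our accuracy:", ours.score(X_test, y_test))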