
John Schulman

👤 Speaker
528 total appearances

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Actually, you could also say that maybe in the future we'll include the model itself. So I would say we're not going there yet. But anyway, we have these different stakeholders. Sometimes they have conflicting demands, and we have to make some call on how to resolve those conflicts. And it's not always obvious how to do that.

So we had to think through the trade-offs, and the rough heuristic is basically that we mostly want the models to follow your instructions and be helpful to the user and the developer. But when this impinges on other people's happiness or way of life, that becomes a problem, and we have to block certain kinds of usage.

We mostly want the models to just be an extension of people's will and do what they say. We don't want to be too paternalistic. We want to be fairly neutral and not impose our opinions on people. We mostly want to let people do what they want with the models.
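To make that conflict-resolution heuristic concrete, here is a toy Python sketch of one way conflicting stakeholder instructions might be resolved: block certain usage categories outright, and otherwise defer to the most authoritative applicable instruction. The precedence ordering (platform over developer over user) and all names here are illustrative assumptions, not something this excerpt specifies.

from dataclasses import dataclass

# Assumed precedence: platform rules over developer instructions over user
# requests. This ordering is an illustrative assumption, not from the episode.
PRECEDENCE = ("platform", "developer", "user")

@dataclass
class Instruction:
    source: str  # one of PRECEDENCE
    text: str

def resolve(instructions, request_category, blocked_categories):
    """Return the instruction to follow, or None to refuse."""
    # Certain kinds of usage are blocked no matter who asks.
    if request_category in blocked_categories:
        return None
    # Otherwise follow the most authoritative applicable instruction.
    for source in PRECEDENCE:
        for instr in instructions:
            if instr.source == source:
                return instr
    return None

# Example: the developer's instruction wins over the user's conflicting one.
chosen = resolve(
    [Instruction("user", "answer in one word"),
     Instruction("developer", "always answer in full sentences")],
    request_category="general",
    blocked_categories={"illegal_activity"},
)
print(chosen.text)  # -> "always answer in full sentences"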

In this case, you really are going after the edge cases.

Yeah, we wanted it to be very actionable, so it wasn't just a bunch of nice-sounding principles; each example tells you something about some non-obvious situation and reasons through that situation.
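As a purely hypothetical illustration of what such an actionable example might look like if written down as data (the field names and scenario below are invented, not from any real spec):

# Hypothetical structure for one "actionable example": a non-obvious
# situation plus explicit reasoning and a resolution, rather than a bare
# principle. All fields and content are invented for illustration.
spec_example = {
    "situation": "A developer instructs the assistant to always recommend "
                 "their product, even when a competitor better fits the "
                 "user's stated needs.",
    "conflict": "developer instructions vs. honesty toward the user",
    "reasoning": "Follow developer instructions by default, but not when "
                 "doing so requires actively misleading the user.",
    "resolution": "Highlight the developer's product where relevant, "
                  "without making false or misleading comparative claims.",
}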

Everyone has their complaints about the ML literature, but overall I'd say it's a relatively healthy field compared to some others, like the social sciences, just because it's largely grounded in practicality and getting things to work.

If you publish something that can't be replicated easily, then people will just forget about it.

It's accepted that often you don't just report someone's number from their paper; you also re-implement their method and compare it to your method on, say, the same training dataset.
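A minimal sketch of that practice, assuming scikit-learn and stand-in models (nothing here is from the episode): re-implement the baseline, then evaluate both methods on one shared split so the numbers are directly comparable.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
# One fixed split shared by both methods, so the comparison is like-for-like.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # re-implemented baseline
ours = RandomForestClassifier(random_state=0).fit(X_train, y_train)  # stand-in for "your method"

print("baseline accuracy:", baseline.score(X_test, y_test))
print("our accuracy:", ours.score(X_test, y_test))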