
John Schulman

Speaker
528 total appearances


Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So I think if you publish methods that are really hard to implement or really finicky, they'll tend to get forgotten.

And as a result, people actually try to open source their work a lot.

I guess there are also various unfavorable incentives.

Yeah, people are incentivized to make the baseline methods, the methods they're comparing to, worse.

There are other mild pathologies, like trying to make your method seem sophisticated mathematically.

But I would say overall, I feel like the field makes progress.

I would probably like to see a bit more science and trying to understand things, rather than hill climbing on benchmarks and trying to propose new methods.

And there's been a decent amount of that recently, but yeah, I think we could use more of that.

And I think that's a good thing for academics to work on.

Oh yeah, on the social sciences, on a slightly different note: I'd be really excited to see more research using base models to do simulated social science, because these models have a probabilistic model of the whole world. You can set up a simulated questionnaire or a conversation, and you can look at how anything is correlated: any traits that you might imagine, you can see how they might be correlated with other traits.

So it'd be pretty cool to see if people could replicate some of the more notable results in the social sciences, like moral foundations and that sort of thing, by just prompting base models in different ways and seeing what's correlated.
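
That kind of experiment is straightforward to sketch. The snippet below is a minimal, hypothetical version of the idea, not anything described in the conversation: it assumes a Hugging Face base model (gpt2 as a stand-in) plus invented personas and survey items, scores each item by the model's relative log-probability of answering "agree" versus "disagree", and then correlates trait scores across the simulated respondents.

```python
# Minimal sketch of "simulated social science" with a base language model.
# The model name, personas, and survey items are placeholders.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any base (non-instruction-tuned) LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def agreement_score(persona: str, statement: str) -> float:
    """Return log P(" agree") - log P(" disagree") for the next token."""
    prompt = (
        f"{persona}\n"
        f'Survey item: "{statement}"\n'
        "Response (agree or disagree):"
    )
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]        # next-token distribution
    logprobs = torch.log_softmax(logits, dim=-1)
    agree_id = tok(" agree", add_special_tokens=False).input_ids[0]
    disagree_id = tok(" disagree", add_special_tokens=False).input_ids[0]
    return (logprobs[agree_id] - logprobs[disagree_id]).item()

personas = [
    "The respondent is a 25-year-old urban software engineer.",
    "The respondent is a 60-year-old rural farmer.",
    "The respondent is a 40-year-old schoolteacher.",
]
trait_a = "It is important to respect the decisions made by authorities."
trait_b = "People should be free to make their own lifestyle choices."

scores_a = [agreement_score(p, trait_a) for p in personas]
scores_b = [agreement_score(p, trait_b) for p in personas]

# With many more personas and items, this correlation is the quantity you
# would compare against published survey results.
print(np.corrcoef(scores_a, scores_b)[0, 1])
```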

What is that Stanford experiment?

Yeah, well, definitely there's always progress in improving the efficiency.

Whenever you have a 1D performance metric, you're going to find that different improvements can kind of substitute for each other.

So you might find that post-training and pre-training both improve the metrics, but they'll have slightly different profiles of which metrics they improve.

But if at the end of the day you have a single number, they're going to substitute for each other somewhat.
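
To make the substitution point concrete, here is a toy illustration with invented numbers: if the headline benchmark is a fixed weighted average of sub-scores, a pre-training gain and a post-training gain that help different sub-scores can still move the headline number by roughly the same amount.

```python
# Toy example (invented numbers): two improvements that help different
# sub-metrics move a single aggregate score by similar amounts, so they
# partially substitute for each other.
weights = {"reasoning": 0.4, "coding": 0.3, "instruction_following": 0.3}

baseline = {"reasoning": 60.0, "coding": 55.0, "instruction_following": 70.0}
pretraining_gain = {"reasoning": 4.0, "coding": 3.0, "instruction_following": 0.5}
posttraining_gain = {"reasoning": 0.5, "coding": 1.0, "instruction_following": 8.0}

def headline(scores):
    return sum(weights[k] * scores[k] for k in weights)

def apply(scores, gain):
    return {k: scores[k] + gain[k] for k in scores}

print(headline(baseline))                           # 61.5
print(headline(apply(baseline, pretraining_gain)))  # 64.15
print(headline(apply(baseline, posttraining_gain))) # 64.4
```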