John Schulman

Actually, I think there might be some funny effects going on where there's unintentional distillation happening between the language model providers, where if you hire someone to go do a labeling task, they might just be feeding it into a model.

Dwarkesh Podcast

John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

They might just be pulling up their favorite chatbot and feeding it in and having the model do the task and then copying and pasting it back.

So there might be... that might account for some of the convergence, but also I think some of the things we're seeing are just what people like.

I mean, I think people do like bullet points.

They like the structured responses.

People do often like the big info dumps that they get from the models.

It's not completely clear how much is just a quirk of the particular choices and design of the post-training processes, and how much is actually intrinsic to what people actually want.
