John Schulman

👤 Speaker
528 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Like, playing with language models. I mean, I'd say I just dabbled with these things, and I'd say the people who do well at this kind of research have some view of the whole stack and have a lot of curiosity about the different parts of it, and also sort of think about...

Well, you want to be both empirical and let experiments update your views, but you also want to think from first principles somewhat.

Assuming that learning works, what would be the ideal type of data to collect and that sort of thing?

The training corpus, yeah. Okay, I'll try to respond to all of that. So first, are we about to hit the data wall? I wouldn't draw too much from the time since GPT-4 was released, because it takes a while to train these models and to do all the prep to train a new model, a new generation of model.

So yeah, I wouldn't draw too much from that fact.

I would say there are definitely some challenges from the limited amount of data, but I wouldn't expect us to immediately hit the data wall. I would expect the nature of pre-training to somewhat change over time as we get closer to it.

In terms of generalization from different types of pre-training data, I would say it's pretty hard to do science on this type of question, because you can't create that many pre-trained models.

So maybe you can't train a GPT-4 sized model; you can't do ablation studies at GPT-4 scale.

Maybe you can train a ton of GPT-2 sized models, or maybe even a GPT-3 sized model, with different data blends and see what you get.
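For illustration only, here is a minimal Python sketch of the kind of small-scale, data-blend ablation described above: train several small models on different mixtures of data and compare a downstream reasoning score across blends and scales. The blend names, model sizes, and the train_and_eval helper are hypothetical placeholders, not anything from the episode or any particular lab's pipeline.

```python
# Minimal sketch of a data-blend ablation sweep at small scale.
# All names below (blends, sizes, train_and_eval) are hypothetical.

from itertools import product

# Hypothetical blend fractions over data sources; each blend sums to 1.0.
BLENDS = {
    "text_only":      {"web": 1.0, "code": 0.0, "math": 0.0},
    "text_plus_code": {"web": 0.7, "code": 0.3, "math": 0.0},
    "text_code_math": {"web": 0.6, "code": 0.3, "math": 0.1},
}

# Small scales where many runs are affordable (roughly GPT-2-sized).
MODEL_SIZES = ["125M", "350M", "1.3B"]


def train_and_eval(size: str, blend: dict) -> float:
    """Hypothetical stand-in: pretrain a model of `size` on `blend` and
    return a reasoning-benchmark score. A real pipeline would go here."""
    return 0.0  # placeholder score


if __name__ == "__main__":
    results = {}
    for size, (name, blend) in product(MODEL_SIZES, BLENDS.items()):
        assert abs(sum(blend.values()) - 1.0) < 1e-9, "blend must sum to 1"
        results[(size, name)] = train_and_eval(size, blend)

    # Compare blends at each scale to see whether, e.g., adding code data
    # helps reasoning more as model size grows (the transfer question here).
    for (size, name), score in sorted(results.items()):
        print(f"{size:>5}  {name:<16} reasoning score = {score:.3f}")
```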

So I'm not aware of any public results on ablations involving code data and reasoning performance and so forth. I'd be very interested to know about those results.

Right, you might not be able to conclude that if transfer fails at GPT-2 size, then it's also going to fail at a larger scale.

So it might be that transfer fails for the smaller models, whereas for the larger models, you learn these better shared representations.

Or the smaller models have to lean too much on memorization, whereas the larger models can learn how to do the right computation.

So I would expect this to be true to some extent.