
John Schulman

👤 Speaker
528 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

I don't think anyone has a good explanation for the scaling law with parameter count.

I mean, there are some. I don't even know what the best mental model is for this. Clearly you have more capacity if you have a bigger model, so you should eventually be able to get lower loss. But why are bigger models more sample efficient? I can give you some very sketchy explanations.

You could say that the model is sort of an ensemble of a bunch of different circuits that do the computation.

You could imagine that it has a bunch of computations that it's doing in parallel, and the output is a weighted combination of them.
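
To make this picture concrete, here is a minimal NumPy sketch (an editor's illustration, not from the transcript; the shapes and names are invented) of a toy model whose output is a fixed weighted combination of many parallel linear "circuits":

    import numpy as np

    rng = np.random.default_rng(0)
    n_circuits, dim = 16, 8
    # Each "circuit" is a random linear map; a real model would learn these.
    circuits = rng.normal(size=(n_circuits, dim, dim))
    mixing = np.ones(n_circuits) / n_circuits  # fixed here; learned in practice

    def forward(x):
        outs = np.einsum("cij,j->ci", circuits, x)  # run all circuits in parallel
        return mixing @ outs                        # weighted combination of outputs

    print(forward(rng.normal(size=dim)).shape)  # (8,)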

Say you just have more width. Actually, width is somewhat similar to depth, because with residual networks the depth can do something similar to width in terms of updating what's in the residual stream.
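
A minimal sketch of the residual-stream point (again illustrative, not the speaker's code): each layer reads the stream and writes an additive update back into it, so stacking depth accumulates contributions much the way extra width would.

    import numpy as np

    rng = np.random.default_rng(1)
    dim, depth = 8, 6
    layers = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(depth)]

    def residual_net(x):
        h = x
        for W in layers:
            h = h + np.tanh(W @ h)  # additive update to the residual stream
        return h

    print(residual_net(rng.normal(size=dim)).shape)  # (8,)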

You could argue that you're learning all these different computations in parallel, and you just have more of them with the bigger model.

So you have more chances that one of them gets lucky, guesses correctly a lot, and ends up getting up-weighted.

There are some algorithms that work this way, like some kind of mixture model or a multiplicative weights update algorithm.
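
The multiplicative weights update is a standard algorithm (often called Hedge). Here is a minimal sketch with synthetic binary predictions, showing how an expert that happens to guess correctly a lot gets up-weighted; the constants are arbitrary:

    import numpy as np

    rng = np.random.default_rng(2)
    n_experts, n_rounds, eta = 32, 200, 0.5
    weights = np.ones(n_experts) / n_experts

    for _ in range(n_rounds):
        truth = rng.integers(2)                    # correct answer this round
        guesses = rng.integers(2, size=n_experts)  # each expert guesses 0 or 1
        losses = (guesses != truth).astype(float)  # 1 if wrong, 0 if right
        weights *= np.exp(-eta * losses)           # multiplicative update
        weights /= weights.sum()                   # renormalize

    print(weights.max(), 1 / n_experts)  # luckiest expert sits far above the uniform prior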

So you have some kind of mixture. I don't want to say a mixture of experts, because that means something different, but it's basically a weighted combination of experts with some learned gating.

Anyway, I may have said that slightly wrong, but you could imagine something like that: just having a bigger model gives you more chances to get the right function.
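
A minimal sketch of that weighted-combination-with-gating idea, assuming a softmax gate over a few linear experts (shapes and names are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    n_experts, dim = 4, 8
    experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
    gate = rng.normal(size=(n_experts, dim))  # gating parameters, learned in practice

    def gated_forward(x):
        logits = gate @ x
        w = np.exp(logits - logits.max())
        w /= w.sum()                               # softmax gating weights
        outs = np.stack([E @ x for E in experts])  # (n_experts, dim)
        return w @ outs                            # gated weighted combination

    print(gated_forward(rng.normal(size=dim)).shape)  # (8,)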

And then of course, it's not just that you have a bunch of totally disjoint functions that you're taking a linear combination of.

It's more like a library where you might chain the functions together in some way.

So there's some composability.

So I would just say the bigger model has a bigger library of different computations, including lots of stuff that's kind of dormant and only being used some of the time.
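
A toy sketch of the library picture, with invented function names: entries can be chained rather than just mixed, and a bigger model corresponds to a bigger library, much of it dormant for any given input.

    # A tiny "library" of computations (names are illustrative).
    library = {
        "inc":    lambda x: x + 1,
        "double": lambda x: 2 * x,
        "square": lambda x: x * x,
    }

    def compose(names):
        def f(x):
            for name in names:  # chain library functions together
                x = library[name](x)
            return x
        return f

    print(compose(["inc", "double", "square"])(3))  # ((3 + 1) * 2) ** 2 == 64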