Mark Zuckerberg

Mark Zuckerberg - Llama 3, Open Sourcing $10b Models, & Caesar Augustus

compute required to training is probably much higher than most other companies that are doing this stuff just because of the sheer volume of the community that we're serving.

Dwarkesh Podcast

Yeah, yeah.

Although, one of the interesting things about it that we saw, even with the 70 billion, is we thought it would get more saturated. You know, we trained it on around 15 trillion tokens.

I guess our prediction going in was that it was going to asymptote more.

But even by the end, it was...

still learning, right? It's like we probably could have fed it more tokens and it would have gotten somewhat better. But, I mean, at some point, you know, you're running a company, you need to do these meta-reasoning questions of, like, all right, how do I want to spend our GPUs? On training this 70 billion model further? Do we want to kind of get on with it so we can start testing hypotheses for Llama 4? So we kind of needed to make that call, and I think we got it, I think we...

There will be others in the future where, you know, the 70 billion multimodal one that'll come over the next period.

But yeah, I mean, that was fascinating that the architectures at this point can just take so much data.

Okay.

But does that mean, like, the Llama 4?

In the same order of magnitude.

This is one of the great questions that I think no one knows.

One of the trickiest things in the world to plan around is, when you have an exponential curve, how long does it keep going for?

Yeah.

I think it's likely enough that it will keep going, that it is worth investing the tens or 100 billion plus in building the infrastructure to assume that if that kind of keeps going, you're going to get some really amazing things that are just going to make amazing products.

But...

I don't think anyone in the industry can really tell you that it will continue scaling at that rate for sure.

In general, in history, you hit bottlenecks at certain points.