Mark Zuckerberg

I think like over time, I'd love to get, you know, a billion parameter model or a 2 billion parameter model, or even like a, I don't know, maybe like a 500 million parameter model and see what you can do with that.

Dwarkesh Podcast

Mark Zuckerberg - Llama 3, Open Sourcing $10b Models, & Caesar Augustus

Because I mean, as they start getting...

If with 8 billion parameters, we're basically nearly as powerful as the largest Llama 2 model, then with a billion parameters, we should be able to do something that's interesting, right?

And faster, good for classification or a lot of kind of like basic things that people do before kind of understanding the intent of a user query and feeding it to the most powerful model to kind of hone what the prompt should be.

So I don't know.

I think that's one thing that maybe the community can help fill in.

But I mean, we're also thinking about getting around to distilling some of these ourselves.

But right now the GPUs are training the 405.

That's the whole fleet.

I mean, we built two...

I think it's like 22,000 or 24,000 GPU clusters that are kind of the single clusters that we have for training the big models.

I mean, obviously across a lot of the stuff that we do, a lot of our stuff goes towards training, like, Reels models and Facebook News Feed and Instagram feed.

And then inference is a huge thing for us because we serve a ton of people, right?

So our ratio of inference models
