
Dwarkesh Podcast

Mark Zuckerberg — AI Will Write Most Meta Code in 18 Months

29 Apr 2025

1h 15m duration
14192 words
2 speakers
Description

Zuck on:
* Llama 4, benchmark gaming
* Intelligence explosion, business models for AGI
* DeepSeek/China, export controls, & Trump
* Orion glasses, AI relationships, and preventing reward-hacking from our tech.

Watch on Youtube; listen on Apple Podcasts and Spotify.

SPONSORS
* Scale is building the infrastructure for safer, smarter AI. Scale’s Data Foundry gives major AI labs access to high-quality data to fuel post-training, while their public leaderboards help assess model capabilities. They also just released Scale Evaluation, a new tool that diagnoses model limitations. If you’re an AI researcher or engineer, learn how Scale can help you push the frontier at scale.com/dwarkesh.
* WorkOS Radar protects your product against bots, fraud, and abuse. Radar uses 80+ signals to identify and block common threats and harmful behavior. Join companies like Cursor, Perplexity, and OpenAI that have eliminated costly free-tier abuse by visiting workos.com/radar.
* Lambda is THE cloud for AI developers, with over 50,000 NVIDIA GPUs ready to go for startups, enterprises, and hyperscalers. By focusing exclusively on AI, Lambda provides cost-effective compute supported by true experts, including a serverless API serving top open-source models like Llama 4 or DeepSeek V3-0324 without rate limits, and available for a free trial at lambda.ai/dwarkesh.

To sponsor a future episode, visit dwarkesh.com/p/advertise.

TIMESTAMPS
(00:00:00) – How Llama 4 compares to other models
(00:11:34) – Intelligence explosion
(00:26:36) – AI friends, therapists & girlfriends
(00:35:10) – DeepSeek & China
(00:39:49) – Open source AI
(00:54:15) – Monetizing AGI
(00:58:32) – The role of a CEO
(01:02:04) – Is big tech aligning with Trump?
(01:07:10) – 100x productivity

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Transcription

Chapter 1: What is the main topic discussed in this episode?

0.031 - 7.823 Dwarkesh Patel

All right, Mark, thanks for coming on the podcast again. Yeah, happy to do it. Good to see you. You too. Last time you were here, you had launched Llama 3.


Chapter 2: What new features does Llama 4 offer compared to previous models?

8.064 - 12.07 Dwarkesh Patel

Yeah. Now you've launched Llama 4. Well, the first version. That's right. What's new? What's exciting?


Chapter 3: How is Meta preparing for an intelligence explosion with AI?

12.15 - 16.176 Dwarkesh Patel

What's changed? Oh, well, I mean, the whole field's so dynamic.


16.216 - 25.389 Mark Zuckerberg

So, I mean, I feel like a ton has changed since the last time that we talked. Meta AI has almost a billion people using it now, monthly. So that's pretty wild.


26.21 - 48.415 Mark Zuckerberg

And I think that this is going to be a really big year on all of this, because especially once you start getting the personalization loop going, which we're just starting to build in now, really, from both the context that all the algorithms have about what you're interested in from feed and all your profile information, all the social graph information, but also just what you're interacting with the AI about.


48.395 - 73.035 Mark Zuckerberg

I think that's just going to be kind of the next thing that's going to be super exciting. So really big on that. The modeling stuff continues to make really impressive advances too, as you know. The Llama 4 stuff, I'm pretty happy with the first set of releases. We announced four models and we released the first two, the Scout and Maverick ones, which are kind of like...


73.015 - 90.795 Mark Zuckerberg

the mid-sized models, mid-sized to small. It's not like... Actually, the most popular Llama 3 model was the 8 billion parameter model. So we've got one of those coming in the Llama 4 series, too. Our internal codename for it is Little Llama.

Chapter 4: What role do AI relationships play in our future?

92.056 - 120.268 Mark Zuckerberg

But that's coming probably over the coming months. But the Scout and Maverick ones... And I mean, they're good. They're some of the highest intelligence per cost that you can get of any model that's out there, natively multimodal, very efficient, run on one host, designed to just be very efficient and low latency for a lot of the use cases that we're building for internally.


120.368 - 139.816 Mark Zuckerberg

And that's our whole thing. We basically build what we want, and then we open source it so other people can use it too. So I'm excited about that. I'm also excited about the Behemoth model, which is coming up. That's going to be our first model that is sort of at the frontier.


Chapter 5: How does DeepSeek's development impact AI competition?

139.876 - 156.822 Mark Zuckerberg

I mean, it's like more than 2 trillion parameters. So it is, I mean, as the name says, quite big. So we're kind of trying to figure out how we make that useful for people. It's so big that we've had to build a bunch of infrastructure just to be able to post-train it ourselves.


156.842 - 175.415 Mark Zuckerberg

And we're kind of trying to wrap our head around how does the average developer out there, how are they going to be able to use something like this? And how do we make it so it can be useful for distilling into models that are of a reasonable size to run? Because you're obviously not going to want to run, you know, something like that in a consumer model. But yeah, I mean, there's a lot to go.
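For context on the distillation he mentions: the standard approach trains a smaller student model to match the output distribution of the large teacher. Below is a minimal, generic sketch of a knowledge-distillation loss in PyTorch; it is purely illustrative, does not describe Meta's actual post-training pipeline, and every shape and hyperparameter is a placeholder.

# Illustrative sketch only: a generic knowledge-distillation loss in PyTorch.
# Not Meta's pipeline; names, shapes, and hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term (scaled by T^2) with ordinary cross-entropy."""
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1 - alpha) * ce

# Toy usage: a batch of 4 examples over a 32-token vocabulary.
student_logits = torch.randn(4, 32, requires_grad=True)
teacher_logits = torch.randn(4, 32)   # would come from the frozen teacher's forward pass
labels = torch.randint(0, 32, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()

In a real setup the teacher logits would come from the frozen large model over the same token positions, and the student is the smaller model being trained.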


175.435 - 194.219 Mark Zuckerberg

I mean, as you saw with the Llama 3 stuff last year, the initial Llama 3 launch was exciting. And then we just kind of built on that over the year. 3.1 was when we released the 405 billion model. 3.2 is when we got all the multimodal stuff in. So we basically have a roadmap like that for this year too. So a lot going on.


194.559 - 197.843 Dwarkesh Patel

I'm interested to hear more about it. There's this impression that,


197.823 - 219.547 Dwarkesh Patel

that the gap between the best closed source and the best open source models has increased over the last year, where I know the full family of Llama 4 models isn't out yet, but Llama 4 Maverick is ranked 35th on Chatbot Arena, and on a bunch of major benchmarks, it seems like o4-mini or Gemini 2.5 Flash are beating Maverick, which is in the same class.

Chapter 6: What are the challenges of monetizing AGI?

219.888 - 222.892 Dwarkesh Patel

What do you make of that impression? Yeah, well, okay, there's a few things.


223.313 - 236.554 Mark Zuckerberg

I actually think that this has been a very good year for open source overall. If you go back to where we were last year, what we were doing with Llama was the only real super innovative open source model.


Chapter 7: How does Mark Zuckerberg view the role of a CEO in tech innovation?

236.534 - 249.593 Mark Zuckerberg

Now you have a bunch of them in the field. And I think in general, the prediction that this would be the year where open source generally overtakes closed source as the most used models out there, I think is generally on track to be true.


249.653 - 261.009 Mark Zuckerberg

I think the thing that's been sort of an interesting surprise, I think positive in some ways, negative in others, but I think overall good, is that it's not just Llama. There are a lot of good ones out there.


Chapter 8: Is there an alignment between big tech and political figures like Trump?

262.552 - 285.764 Mark Zuckerberg

So... I think that that's quite good. Then there's the reasoning phenomenon, which you basically are alluding to by talking about o3 and o4 and some of the other models. I do think that there is this specialization that's happening, where if you want a model that is sort of the best at math problems or coding or different things like that.


286.404 - 312.171 Mark Zuckerberg

I do think that these reasoning models, with a lot of the ability to just consume more test-time or inference-time compute in order to provide more intelligence, are a really compelling paradigm. But for a lot of the applications that, and we're gonna do that too, we're building a Llama 4 reasoning model and that'll come out at some point. For a lot of the things that we care about,


314.294 - 338.741 Mark Zuckerberg

latency and good intelligence per cost are actually much more important product attributes. If you're primarily designing for a consumer product, people don't necessarily want to wait like half a minute for it to go think through the answer. If you can provide an answer that's generally quite good too in like half a second, then that's great and that's a good trade-off.
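A toy calculation makes the trade-off he describes concrete. All numbers below are invented for illustration; they are not measurements of any real model.

# Toy illustration of the latency vs. intelligence-per-cost trade-off.
# All numbers are made up; they are not real benchmarks of any model.
candidates = [
    # (name, quality score 0-100, $ per 1M tokens, median latency in seconds)
    ("fast-chat-model", 82, 0.20, 0.5),
    ("reasoning-model", 91, 3.00, 30.0),
]

LATENCY_BUDGET_S = 2.0  # roughly what a consumer chat product might tolerate

for name, quality, cost, latency in candidates:
    quality_per_dollar = quality / cost
    fits = latency <= LATENCY_BUDGET_S
    print(f"{name:16s} quality/$: {quality_per_dollar:7.1f}  "
          f"latency: {latency:5.1f}s  fits budget: {fits}")

# Here the fast model wins on both intelligence-per-cost and latency,
# even though the reasoning model scores higher in isolation.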


338.761 - 358.049 Mark Zuckerberg

So I think that both of these are gonna end up being important directions. I am optimistic about integrating the reasoning models with kind of the core language models over time. I think that's sort of the direction that Google has gone in with some of the more recent Gemini models. And I think that that's really promising.


358.61 - 380.127 Mark Zuckerberg

But I think that there's just going to be a bunch of different stuff that goes on. I mean, you also mentioned the whole Chatbot Arena thing, which I think is interesting. And it goes to this challenge around how do you do the benchmarking, right? And basically, how do you know what models are good for which things? And one of the things that we've generally tried to do over the last year

380.107 - 406.606 Mark Zuckerberg

is anchor more of our models in our Meta AI product North Star use cases, because the issue with both kind of open source benchmarks and, you know, any given thing like the LM Arena stuff is that they're often skewed for either a very specific, you know, set of use cases, which are often not actually what any normal person does in your product.

407.789 - 436.24 Mark Zuckerberg

The portfolio of things that they're trying to measure is often weighted differently from what people care about in any given product. And because of that, we've found that trying to optimize too much for that stuff has often led us astray and actually not led towards the highest quality products, the most usage, and the best feedback within Meta AI as people use our stuff.

436.5 - 460.745 Mark Zuckerberg

So we're trying to anchor our North Star in basically the product value that people kind of report to us, what they say that they want, and what their revealed preferences are, using the experiences that we have. So I think sometimes these things don't quite line up. And I think that a lot of them are quite easily, um, gameable, right?

460.765 - 483.128 Mark Zuckerberg

So, I mean, I think on the arena you'll see stuff like Sonnet 3.7. It's like a great model, right? And it's like not near the top. And it was relatively easy for our team to tune a version of Llama 4 Maverick that basically was way at the top, whereas the one that we released, that's kind of the pure model, actually has no tuning for that at all.
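The weighting issue he raises, where a leaderboard's mix of prompts differs from a product's real usage mix, is easy to show with fabricated numbers: the same per-category scores can rank two models in opposite orders depending on the weights.

# Toy example: identical per-category scores rank two models differently under a
# benchmark-style weighting versus a product-usage weighting. Numbers are fabricated.
scores = {
    "model_A": {"coding": 90, "math": 88, "casual_chat": 70},
    "model_B": {"coding": 75, "math": 72, "casual_chat": 90},
}

benchmark_weights = {"coding": 0.4, "math": 0.4, "casual_chat": 0.2}
product_weights = {"coding": 0.1, "math": 0.1, "casual_chat": 0.8}

def weighted(per_category, weights):
    return sum(per_category[c] * w for c, w in weights.items())

for name, per_category in scores.items():
    print(f"{name}: benchmark-weighted {weighted(per_category, benchmark_weights):.1f}, "
          f"product-weighted {weighted(per_category, product_weights):.1f}")

# model_A comes out ahead under the benchmark weighting (85.2 vs 76.8), while
# model_B comes out ahead under the product-usage weighting (86.7 vs 73.8).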
