The a16z Show

Hugging Face's Clem Delangue on Open Source AI and the LLM Bubble | MTS Live

22 May 2026

Transcription

Chapter 1: What is the significance of open source in AI innovation?

0.605 - 20.173 Clem Delangue

The idea of like restricting a technology like AI based on risks is just like, for example, you would say, OK, some people can punch other people. So let's tie down everybody's hands. Why? Because it is too dangerous. Some people can punch. But in reality, you don't want to do that because your hands are so useful.

20.213 - 42.34 Clem Delangue

The way you want to control it is untie everyone and then regulate or fight the bad actors. So, for example, if hacking, that creates cybersecurity risks. It's illegal, right? So you have to fight it, but not by preventing everyone from getting these capabilities. Otherwise, you...

42.438 - 50.895 Clem Delangue

slow down progress, you create massive gaps in terms of controls, in terms of capabilities, and you create actually additional risks.

52.158 - 70.176 Theo Jaffee

This episode originally aired on MTS. Open-source software built much of the modern internet. Linux, Apache, Kubernetes, and even the transformer architecture behind ChatGPT all spread because researchers and developers could study, modify, and improve them in public.

70.896 - 91.023 Theo Jaffee

But AI is increasingly moving in the opposite direction, with the most powerful models distributed behind closed APIs, controlled by a small number of companies. At the same time, China has emerged as one of the biggest contributors to open-source AI, while debates around safety, regulation, and access are becoming more politically charged.

91.624 - 104.845 Theo Jaffee

And now those same tensions are extending into robotics, where AI is beginning to move off the screen and into the physical world. Theo Jaffee and Sofia Puccini speak with Clem Delangue, CEO of Hugging Face.

107.255 - 131.406 Sofia Puccini

We are live here on MTS with Clement Delangue, who is the CEO of Hugging Face, which has been really an incredible resource for anyone who's interested in large language models and especially open weight large language models. I've been a Hugging Face user for a while now. So it's great to have you here. Clem, thanks so much for coming on MTS. Yeah, of course. Thanks for having me. Absolutely.

131.707 - 150.255 Unknown

Okay, so you are a big proponent of open source, and you believe that open source is a very important thing for innovation and competition. So, to start, can you compare and contrast the open source environments in the US and China?

150.235 - 166.236 Clem Delangue

Yeah, so, I mean, historically, the US was super, super strong with open source, right? That's kind of like what led to the current AI revolution, right? Like the T in ChatGPT is actually coming from Transformer, which was open-sourced by Google.

Chapter 2: How do the open source environments in the US and China differ?

193.351 - 217.401 Clem Delangue

If you ask most startups, most academia in the U.S. that are using open source, they're usually using Chinese open source models. You've probably heard of DeepSeek, of Qwen, of Kimi. There are a bunch of companies and organizations in China contributing massively to the field of open source.

218.039 - 223.125 Unknown

Great. So you recently said we're in an LLM bubble. What makes you think that?

224.207 - 257.71 Clem Delangue

Well, I was asked if we were in an AI bubble, and I said we're probably not in a bubble for AI as a general field, but I feel like if there's one specific domain of AI where there's so much investment that there's maybe a risk of over-investing, it's large language models distributed behind APIs, right? Like you see the building of crazy data centers for it.

258.591 - 275.063 Clem Delangue

And obviously you see a lot of revenue growth, but with kind of like uncertain margins and uncertain long-term sustainability and moat for it. So if there is a bubble, it's probably an LLM bubble, but we'll see what happens in the next few months.

275.986 - 302.673 Sofia Puccini

Well, you're a big proponent of open source, you know, as we all know. But do you think that labs should ever restrict releasing their models in an open source way for safety reasons? Like, yeah, in 2022, 23, it was way too early for that. The models at the time were toys. But now we have stuff like Claude Mythos, which supposedly can like really assist people with cyber attacks.

302.693 - 311.361 Sofia Puccini

We have models that are increasing pretty dramatically in bio capability, which could be even scarier. So do you think companies should still be releasing their models open source?

312.472 - 334.218 Clem Delangue

So the interesting thing is that we've had these conversations and this kind of like talking point for a while in AI. When we were early at Hugging Face, I think six, seven years ago, it was GPT-2, and there were already a lot of people saying that it was too dangerous to release in open source at the time, right?

334.238 - 356.586 Clem Delangue

It was six, seven years ago when basically it was nothing more than just an auto-complete. I think we've seen progressively that these were quite overblown. And I think they're also overblown today, right? And the whole point is that, you know, Mythos, I think when it was announced, was it like three weeks ago, a month ago?

356.566 - 382.219 Clem Delangue

It was crazy dangerous and now it's starting to be deployed kind of like everywhere, right? I think they just gave access to the first international organization in South Korea, I think yesterday or something like that. And probably in a few weeks or in a few months, everyone is going to be using Mythos and not kind of like destroying the world as a result.

Chapter 3: What is the current state of the large language model (LLM) bubble?

421.504 - 451.339 Clem Delangue

Whereas kind of like if you make it more open, actually it's usually easier for the defenders to react and kind of like make the whole system safer. So that's kind of like what we see with each release, where there are always kind of like overblown concerns before and then progressively we all just adapt and the benefits kind of like outweigh the risks.

452.281 - 467.59 Unknown

Yeah, it feels like we'll still be dealing with this problem in like 50 years where somebody releases like some sort of like open source robotics, you know, robot or program or something. And then everyone is like, no, you shouldn't have done that. It's so risky. And then we'll just adapt again.

467.688 - 493.228 Clem Delangue

It's kind of like the story of technology, you know, like, I mean, the idea of like restricting a technology like AI based on risks is just like, for example, you would say, okay, some people can punch other people. So let's tie down everybody's hands. Because it's too dangerous. Some people can punch. But in reality, you don't want to do that because your hands are so useful.

493.288 - 525.885 Clem Delangue

They're creating so many good things in the world. You need your hands. The way you want to control it is untie everyone, give the freedom to everyone, and then regulate or fight the bad actors. So, for example, if hacking creates cybersecurity risks, I mean, it's illegal, right? You have to make it illegal. You have to fight it, but not by preventing everyone from getting these capabilities.

527.046 - 539.14 Clem Delangue

Because otherwise you slow down progress, you create massive gaps in terms of controls, in terms of capabilities, and you create actually additional risks.

539.491 - 558.073 Sofia Puccini

Well, right now on the topic of regulation, President Trump is in China where he will be meeting with Xi Jinping over the next couple of days. And they're going to be discussing, among other things, AI regulation and international AI agreements. So what do you hope to get out of this in terms of open source?

559.117 - 586.087 Clem Delangue

Yeah, I mean, I'm excited to see conversations about open source AI. Probably there's going to be some conversations about distillation, about collaborations between the two countries. I hope, you know, both countries will be able to agree on fostering more transparency, more openness, to kind of like help more people access this technology.

586.708 - 608.682 Clem Delangue

I'm glad that Jensen hopped into the plane and joined these conversations because I think he has a lot of the right perspectives on this topic to kind of like basically create more collaboration between countries and kind of like share progress.

609.944 - 627.902 Unknown

Yeah, I'm curious about your robotics push. So you guys launched LeRobot in 2024. And you've talked about how robotics is the next frontier unlocked by AI and all of this stuff. How do you sort of see this playing out? And what is the role of open source?

Chapter 4: Should AI labs restrict the release of models for safety reasons?

750.291 - 763.577 Sofia Puccini

But why wasn't GitHub the GitHub of AI? It seems like they've kind of fumbled a lot of things in the AI realm. So why do you think Hugging Face became sort of the go-to place for model developers to deploy models and not GitHub?

764.3 - 789.553 Clem Delangue

Yeah, I mean, I don't blame them. They have a lot on their plates. I think with the coding assistant, they've kind of been dealing with their own set of issues. The reality is that hosting and sharing AI artifacts is quite different than hosting code. So even if people have been calling us the GitHub of AI, I think it's two very different things.

790.055 - 821.043 Clem Delangue

For example, for us, the volume of files, of data that we're dealing with is much, much larger than what GitHub is doing. For example, just last week, we added two petabytes of data to the platform. As a matter of comparison, it's the equivalent of 500,000 two-hour movies that have been uploaded to Hugging Face just last week.

821.844 - 845.168 Clem Delangue

So you have a lot of structural differences, and we managed to build our infrastructure capabilities in a way that makes it just better for people that are building in AI to use Hugging Face to host their models, their data sets, both publicly but also privately. We have a lot of private usage now.

845.148 - 853.501 Clem Delangue

So that's kind of like some of the reasons why we managed to do it, whereas GitHub focused on other things.

853.521 - 866.802 Sofia Puccini

Totally. Well, that's pretty cool. We love Hugging Face. And we really appreciate your early support of MTS and our drops. So it was great to have you on today. Clem, thanks so much for coming on MTS.

869.566 - 890.163 Theo Jaffee

Thanks for listening to this episode of the a16z podcast. If you liked this episode, be sure to like, comment, subscribe, leave us a rating or review, and share it with your friends and family. For more episodes, go to YouTube, Apple Podcasts, and Spotify. Follow us on X at a16z, and subscribe to our Substack at a16z.substack.com.

890.544 - 923.502 Theo Jaffee

Thanks again for listening, and I'll see you in the next episode.
