
Stefano Ermon

👤 Speaker
359 total appearances

Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

Yeah, that's the benefit, I think, of being in academia: everything is open, and you're allowed to publish all of your work. That's the whole point, advancing the field together as a community. I love that aspect, and I think a lot of the researchers do. And I could sense a lot of unhappiness from colleagues and other researchers in industry, working in the big labs, as the publication policies started to tighten and people were not allowed to publish anymore. I think a lot of people were not happy.

Yeah.

At the moment, we're going after what we think of as instant AI: applications of LLMs where latency is critical, which typically means there's a human in the loop and the human cannot wait. And that human could be a developer. So we're seeing a lot of usage of Mercury models in IDEs, where you're essentially providing suggestions or edits to the code directly to a developer. There you maybe have a few hundred milliseconds of latency budget, and you want to provide the best possible suggestion within that budget. But it could also be customer support, voice agents, edtech.
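The latency-budget trade-off described here can be sketched in a few lines: given several candidate models, serve the highest-quality one whose expected latency still fits the budget. This is an illustrative toy, not Inception's actual serving logic; the model names, latency estimates, and quality scores below are made up.

```python
# Hypothetical catalog of suggestion models: name -> (estimated latency in ms, quality score).
# Both numbers are invented for illustration; they are not real benchmarks.
MODELS = {
    "small-fast": (40, 0.70),
    "medium": (120, 0.82),
    "large-slow": (600, 0.95),
}

def best_within_budget(models, budget_ms):
    """Pick the highest-quality model whose estimated latency fits the budget.

    Returns the model name, or None if nothing fits.
    """
    candidates = [(quality, name)
                  for name, (latency_ms, quality) in models.items()
                  if latency_ms <= budget_ms]
    return max(candidates)[1] if candidates else None

# A few hundred milliseconds of budget, as in the IDE example:
print(best_within_budget(MODELS, 200))  # medium
print(best_within_budget(MODELS, 50))   # small-fast
```

The point of a faster decoder (diffusion rather than autoregressive, in Mercury's case) is that it shifts every model's latency estimate down, so a larger, higher-quality model fits inside the same budget.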

I want to ask about that.

Yeah, in any other situation where you have to give an answer, where you have to interact with a human in real time, latency becomes critical, and the game becomes, again, what's the best-quality result you can provide within the latency budget for a reasonable cost. And that's where we dominate existing autoregressive solutions, and that's where we're seeing a lot of the initial traction. I think eventually, as the intelligence of the models keeps improving, as we do more R&D, as we catch up with frontier-quality models, there are going to be more and more applications that we can go after. But right now, we're going after latency-sensitive applications.

Yeah, and you're absolutely right that diffusion does work, and works really well, for speech and music generation. I know that some of the open-source models, and some of the state-of-the-art closed-source models, are based on diffusion for text-to-speech.

I didn't even know that.