Stefano Ermon

👤 Speaker
359 total appearances

Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

So, you know, the agent can type, the agent can click, the agent can, you know, check out an item for you or can book a flight for you.

So it needs to take a bunch of actions, let's say on a website or maybe on your apps or on some apps on your phone, depending on what the environment is.

Yeah, yeah, exactly.

So existing models, yeah, they take a mix of structured information about what kind of menus are available, where the buttons are, and images, and then they use that and they map it to an action.

So you can imagine something similar where instead you have a diffusion model that processes the same inputs but then produces the answer not one token at a time, but through this refinement process, which makes a lot of sense because people are already using diffusion policies and flow-based policies.
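To make the contrast concrete, here is a toy sketch of the two decoding styles described above. It is not Mercury's algorithm, which the speaker does not detail here: `toy_denoiser`, the fixed confidence scores, and the rule of committing the two most confident positions per step are all assumptions made for illustration, and a real diffusion model can also revise tokens it has already produced.

```python
MASK = "_"

def toy_denoiser(seq, target, confidence):
    """Hypothetical stand-in for a trained model: for every still-masked
    position, propose a token together with a confidence score."""
    return {i: (target[i], confidence[i])
            for i, tok in enumerate(seq) if tok == MASK}

def autoregressive_decode(target):
    """Emit one token per step, left to right."""
    seq, steps = [], 0
    for tok in target:
        seq.append(tok)
        steps += 1
    return seq, steps

def refinement_decode(target, confidence, per_step=2):
    """Start fully masked and, at each step, fill in the `per_step`
    positions the toy denoiser is most confident about."""
    seq = [MASK] * len(target)
    steps = 0
    while MASK in seq:
        proposals = toy_denoiser(seq, target, confidence)
        best = sorted(proposals, key=lambda i: proposals[i][1],
                      reverse=True)[:per_step]
        for i in best:
            seq[i] = proposals[i][0]
        steps += 1
    return seq, steps

# Hypothetical action sequence for a web agent.
target = ["open", "site", "search", "flight", "select", "book"]
confidence = [0.9, 0.4, 0.8, 0.3, 0.6, 0.7]

ar_seq, ar_steps = autoregressive_decode(target)
rf_seq, rf_steps = refinement_decode(target, confidence, per_step=2)
print("autoregressive steps:", ar_steps)
print("refinement steps:", rf_steps)
```

Both routines end with the same sequence; the refinement loop simply needs fewer passes because each pass fills several positions at once.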

Like if you look at RL and robotics right now, one of the best approaches for controlling robots and kind of like implementing the policy that decides what actions the robots take is based on diffusion.

It's based on flow models, more or less the same thing.

And so that's another data point that really excites me and gives me even more confidence that we are on the right track and that we really need to do this at some point.

So those trade-offs change in the sense that at the end of the day, hallucinations happen because you're building a statistical model.

And so whenever you fit a statistical model to data, there is a certain regime where maybe you're interpolating, but then you might need to extrapolate and then mistakes happen unless you have a perfect model, but a perfect model is never actually possible.

And because you're fitting a different model, even if you use the same data, you're going to get a different kind of behavior.

So it's going to interpolate, it's going to extrapolate in different ways.
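As a small illustration of that point, the sketch below fits two different toy models to the same five data points and compares them inside and outside the training range. The data (y = x squared), the two model families (a least-squares line and a nearest-neighbor lookup), and the query points are assumptions chosen for the example; they say nothing about how any language model behaves in detail.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """Nearest-neighbor lookup: return the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def truth(x):
    return x * x

# Same training data for both models.
xs = [0, 1, 2, 3, 4]
ys = [truth(x) for x in xs]

line = fit_line(xs, ys)
nearest = fit_nearest(xs, ys)

for label, x in [("inside the data (x=3)", 3), ("outside the data (x=10)", 10)]:
    print(label,
          "| truth:", truth(x),
          "| line:", line(x),
          "| nearest:", nearest(x))
```

Inside the training range both models stay close to the truth; far outside it both are wrong, and each is wrong in its own way.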

What we're seeing is that, you know, they still make mistakes.

If you try a Mercury model, it's not perfect.

It does hallucinate.

But it does so in different ways.

And it's hard to quantify how.