Stefano Ermon

👤 Speaker
359 total appearances

Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

It's a very different way of using the models at inference time.

It's very fast.

It's very fast because the key thing is, in an autoregressive model, you get one token at a time.

You have to process a massive neural network with hundreds of billions or trillions of parameters, and at the end you only get a single token.

Very inefficient if you think of it that way.

You still need big neural networks, and on each forward pass you still need to evaluate the whole thing, but at the end you are able to modify more than one token.

And so as long as you don't need too many denoising steps, too many diffusion steps, this can be really, really fast.
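
The trade-off described here can be sketched as a simple count of forward passes. This is a rough illustration, not a description of how Mercury works: it assumes every forward pass costs about the same, and the function names and the `tokens_per_pass` parameter are hypothetical.

```python
def count_autoregressive_passes(num_tokens: int) -> int:
    """Autoregressive decoding: one forward pass yields one new token."""
    passes = 0
    generated = 0
    while generated < num_tokens:
        generated += 1  # a single token comes out of each pass
        passes += 1
    return passes


def count_parallel_passes(num_tokens: int, tokens_per_pass: int) -> int:
    """Diffusion-style decoding (assumed): each pass finalizes several tokens."""
    passes = 0
    remaining = num_tokens
    while remaining > 0:
        remaining -= min(tokens_per_pass, remaining)  # several tokens per pass
        passes += 1
    return passes


if __name__ == "__main__":
    n = 1000
    ar = count_autoregressive_passes(n)
    par = count_parallel_passes(n, tokens_per_pass=10)
    print(f"autoregressive: {ar} passes, parallel: {par} passes")
```

Under these assumptions the number of passes shrinks by roughly the number of tokens finalized per pass, which is why the number of denoising steps matters so much for speed.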

Yeah, so it started, I mean, I've always been passionate about diffusion models.

The original idea came out of my lab at Stanford back in 2019.

Back then, pretty much all the image generative models were based on GANs, generative adversarial networks, which were very difficult to train and very unstable.

There is this kind of game between a generator and a discriminator.

It's a pretty complex kind of model to train, and there are all kinds of issues in scaling that approach up.

And we came up with this alternative approach of training the model to remove noise and then generating in a coarse-to-fine way.
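
A toy sketch of that idea, under strong simplifying assumptions: the data is one-dimensional and Gaussian, so the ideal denoiser has a closed form and no neural network is trained. The constants `MU` and `S`, the noise schedule, and the step size are all hypothetical choices for this illustration, not values from the episode.

```python
import random
import statistics

MU, S = 3.0, 0.5  # hypothetical toy data distribution: N(MU, S^2)


def denoise(x: float, sigma: float) -> float:
    """Best guess of the clean value given x = clean + N(0, sigma^2) noise.

    For Gaussian data this posterior mean has a closed form; in a real
    diffusion model a trained neural network would play this role.
    """
    return (S**2 * x + sigma**2 * MU) / (S**2 + sigma**2)


def generate(rng: random.Random,
             sigmas=(10.0, 5.0, 2.0, 1.0, 0.5, 0.1, 0.01),
             steps_per_level: int = 50) -> float:
    """Start from noise and refine through decreasing noise levels."""
    x = rng.gauss(0.0, sigmas[0])  # almost pure noise
    for sigma in sigmas:  # coarse to fine
        step = 0.1 * (S**2 + sigma**2)  # step size tuned for this toy problem
        for _ in range(steps_per_level):
            # The denoiser's correction, rescaled, points toward likely data.
            score = (denoise(x, sigma) - x) / sigma**2
            x += 0.5 * step * score + step**0.5 * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [generate(rng) for _ in range(1000)]
print(statistics.mean(samples), statistics.stdev(samples))
```

The generated samples should land close to the toy data distribution: the early, high-noise levels settle the rough location and the later, low-noise levels sharpen it.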

And we showed that it was working much better, and eventually that took off and everybody switched to diffusion models for image and video generation: Midjourney, Sora, Stable Diffusion.

They were all based on those original ideas from my lab at Stanford.

And since then, I have tried to see: can we get this to work for text and code generation as well?

And it took a few years to figure out how to do it properly.

But then in 2024, we kind of had a breakthrough.