
Stefano Ermon

Speaker
359 total appearances


Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

And so the costs are actually significantly lower.


Yeah, so that really depends more on the architecture than whether it's a diffusion model or an autoregressive model.


Right now, as I mentioned, we're still using self-attention, which unfortunately scales pretty poorly with the context length.
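The scaling problem he's pointing at is visible in a minimal self-attention sketch (illustrative NumPy, not Mercury's implementation): the score matrix has one entry per pair of tokens, so compute and memory grow as O(n²) in the sequence length n.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence x of shape (n, d)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # The (n, n) score matrix is the bottleneck: it is quadratic
    # in the sequence length n, in both compute and memory.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 8, 4
x = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (8, 4): one output per token, but O(n^2) work inside
```

Doubling the context length quadruples the size of the score matrix, which is why long-context inference gets expensive regardless of whether the model is autoregressive or diffusion-based.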


So I would say there is no difference: a diffusion model is neither better nor worse than an autoregressive model as you think about longer context.


Our models support roughly 100K tokens of context length, and we could potentially scale that up further.


Again, it's not something that is very different between an autoregressive model and a diffusion model; it's more a function of the underlying architecture.


And in fact, we can actually use alternative architectures that scale better with respect to the context like state-space models or other attention variants that are more efficient.
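As a toy illustration of why such alternatives scale better (a sketch of the general idea, not Inception's architecture): a linear state-space-style recurrence carries a fixed-size hidden state through the sequence, so total cost is O(n) in length and the per-step memory is constant.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal linear state-space recurrence over a sequence.

    h_t = A @ h_{t-1} + B @ x_t   (fixed-size state: O(1) memory per step)
    y_t = C @ h_t                 (total cost linear in sequence length)
    """
    h = np.zeros(A.shape[0])
    ys = []
    for t in range(x.shape[0]):
        h = A @ h + B @ x[t]
        ys.append(C @ h)
    return np.stack(ys)

rng = np.random.default_rng(1)
n, d_in, d_state, d_out = 8, 4, 6, 4
x = rng.normal(size=(n, d_in))
A = 0.9 * np.eye(d_state)            # stable state transition
B = rng.normal(size=(d_state, d_in))
C = rng.normal(size=(d_out, d_state))
y = ssm_scan(x, A, B, C)
print(y.shape)  # (8, 4)
```

Unlike attention, nothing here grows with the square of the sequence length; the trade-off, as he notes below, is that the fixed-size state must decide what to keep from the past.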


We have some preliminary results; everything is compatible with different kinds of backbones, but it's not in production at the moment.


Nothing in particular. I think it's just a fundamental problem for which it's going to be hard to get a real breakthrough.


There are just inherent trade-offs.


I think of them in terms of sufficient statistics: what do you store about your past? You want to remember the things that are useful and discard the things that are not.


And that's just fundamentally a hard problem.


There is some kind of no-free-lunch trade-off involved: ahead of time, you don't know what you should remember and what you should discard. Some things are going to be useful for one purpose and not useful for another.


And so I think it's a fundamentally very difficult problem where you have to make trade-offs.
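The sufficient-statistics framing above can be made concrete with a deliberately simple sketch (my example, not his): to answer mean queries over a stream, a fixed-size (sum, count) pair is all you need to keep; everything else about the history can be discarded. The catch he describes is that this summary is only sufficient for the questions you anticipated.

```python
class RunningMean:
    """Fixed-size summary of a stream: sufficient for the mean,
    useless for, say, the median -- the trade-off is what you keep."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, x):
        self.total += x
        self.count += 1

    def mean(self):
        return self.total / self.count

rm = RunningMean()
for value in [2.0, 4.0, 6.0]:
    rm.update(value)
print(rm.mean())  # 4.0
```

A long-context model with a compressed state faces the same dilemma at scale: the summary it keeps determines which future questions it can answer.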