Stefano Ermon

Speaker
359 total appearances

Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

We were matching the perplexity, but we were able to be like 10 times faster.

That was super exciting to me.

And I really wanted to see what happens if you train something bigger than a GPT-2 model, and whether it's possible to build something commercially viable.

And that's why I started the company to scale things up.

The arithmetic intensity of inference workloads that we have today with an autoregressive model is very bad.

The utilization is very low and that's why people are building massive data centers or even building custom chips, AI inference chips that are better suited for that kind of work.
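
As a rough illustration of the arithmetic-intensity point, the sketch below estimates FLOPs per byte of weight traffic for a single forward pass. It is a back-of-the-envelope model, not a description of how Mercury works: the function name, the 7-billion-parameter size, and the 2-byte weights are all hypothetical, and it ignores activations and KV-cache traffic. The `tokens_per_pass` parameter is only a stand-in for any scheme that processes several token positions per read of the weights.

```python
def decode_arithmetic_intensity(n_params: float,
                                bytes_per_param: float,
                                tokens_per_pass: int) -> float:
    """Approximate FLOPs per byte of weight traffic for one forward pass.

    Assumes roughly one multiply-add (2 FLOPs) per weight per token and
    that every weight is read from memory once per pass. Both are
    simplifying assumptions for illustration only.
    """
    flops = 2.0 * n_params * tokens_per_pass
    bytes_moved = n_params * bytes_per_param
    return flops / bytes_moved


# Hypothetical 7B-parameter model with 16-bit (2-byte) weights.
one_at_a_time = decode_arithmetic_intensity(7e9, 2, 1)   # one token per pass
many_at_once = decode_arithmetic_intensity(7e9, 2, 32)   # 32 positions per pass

print(one_at_a_time, many_at_once)
```

Under these assumptions, producing one token per pass does about one FLOP per byte read, which is far below what modern accelerators need to keep their compute units busy. Doing more work per pass raises the ratio proportionally.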

Basically, if you can generate more tokens per second,

What this means is that for the same amount of hardware, for the same number of GPUs, you can produce more tokens.

And so the cost per token is going to go down.
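
The relationship between throughput and cost per token can be sketched with simple arithmetic. The hourly GPU price and the throughput figures below are made-up numbers chosen only to show the shape of the calculation, and the function name is hypothetical. They are not figures from the episode.

```python
def cost_per_million_tokens(gpu_cost_per_hour: float,
                            tokens_per_second: float) -> float:
    """Hardware cost of generating one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical: the same GPU at the same hourly price, at two throughputs.
baseline = cost_per_million_tokens(gpu_cost_per_hour=2.0, tokens_per_second=100.0)
faster = cost_per_million_tokens(gpu_cost_per_hour=2.0, tokens_per_second=1000.0)

print(f"baseline: ${baseline:.2f} per million tokens")
print(f"faster:   ${faster:.2f} per million tokens")
```

With hardware cost held fixed, cost per token is inversely proportional to tokens per second, so a tenfold throughput gain gives a tenfold lower cost per token in this simplified model.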

And that's why we're able to serve our models much more cheaply than what you would get because we make better use of the existing hardware.

So now the Mercury models that we have in production are significantly larger.

They've been trained on more data.

That's going to enable Mercury models to be even smarter.

It's going to have much better planning and kind of like reasoning capabilities.

And so that's going to enable a lot of agentic use cases that people really care about.

They're going to make them really, really fast.

Thank you.

Pleasure to be here.

Good to see you again.

โ† Previous Page 1 of 18 Next โ†’