Stefano Ermon
We were matching the perplexity, but we were able to be like 10 times faster.
That was super exciting to me.
And I really wanted to see what happens if you train something bigger than a GPT-2 model, and whether it was possible to build something commercially viable.
And that's why I started the company to scale things up.
The arithmetic intensity of the inference workloads we have today with an autoregressive model is very bad.
The utilization is very low and that's why people are building massive data centers or even building custom chips, AI inference chips that are better suited for that kind of work.
Basically, if you can generate more tokens per second, what this means is that for the same amount of hardware, for the same number of GPUs, you can produce more tokens.
And so the cost per token is going to go down.
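The cost argument here is simple arithmetic: at a fixed hardware cost, higher throughput directly divides the cost per token. A minimal sketch, using purely illustrative numbers (the GPU price and token rates below are hypothetical, not figures from the conversation):

```python
# Illustrative cost-per-token arithmetic. The hourly GPU cost and
# throughput numbers are hypothetical examples, not real figures.

def cost_per_token(gpu_hourly_cost: float, tokens_per_second: float) -> float:
    """Cost per generated token for one GPU at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_cost / tokens_per_hour

# Same hardware (same hourly cost), two throughputs: a baseline
# server and one that generates tokens 10x faster.
baseline = cost_per_token(gpu_hourly_cost=2.0, tokens_per_second=100)
faster = cost_per_token(gpu_hourly_cost=2.0, tokens_per_second=1000)

print(f"baseline: ${baseline:.2e}/token, faster: ${faster:.2e}/token")
# At fixed hardware cost, 10x the throughput means 1/10 the cost per token.
```

The point of the sketch is that throughput and cost per token are exact reciprocals once the hardware spend is fixed, which is why faster generation translates directly into cheaper serving.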
And that's why we're able to serve our models much more cheaply than what you would get elsewhere, because we make better use of the existing hardware.
So now the Mercury models that we have in production are significantly larger.
They've been trained on more data.
That's going to enable Mercury models to be even smarter.
They're going to have much better planning and reasoning capabilities.
And so that's going to enable a lot of agentic use cases that people really care about, and make them really, really fast.
Thank you.
Pleasure to be here.
Good to see you again.