Stefano Ermon

Speaker
359 total appearances

Podcast Appearances

The Neuron: AI Explained
Diffusion for Text: Why Mercury Could Make LLMs 10x Faster

Yeah.

And so, you know, it does make a lot of sense.

One of the challenges, if you wanted to go straight from voice to voice, is that these kinds of interactions often still involve tool calls.

So if you're doing customer support, you might still need to be able to query a database, check a calendar for availability, or look up the menu to get the prices.

And so there still needs to be some text, I think, some code involved, which makes it a little bit more tricky to develop.

But we are very excited about eventually getting to something that is actually multimodal.

The existing Mercury models are just text only or code only.

But we know diffusion models work really well for image, video, music.

And so if we put everything together, we could get to something truly phenomenal at handling different kinds of modalities, and have a real world model that understands everything and puts together all the possibilities, the learnings and the signals from all the different modalities. But it's definitely something we want to do at some point.

Yeah, that would be so awesome. Would that be something that would be useful for a robotic kind of situation, or would that be more for simulations that you can use to train robots, in your opinion?

It could be a mix. It could be decision making, like if you're using video or other kinds of sensors as input and then using the model to make decisions or analyze what's going on in the surroundings.

It's a very useful kind of application of this technology.

In fact, we've already heard from some early adopters that they would love for our models to have image inputs because they're building computer agents.

And so that's another space where you really need to be quick.

You need to be able to interact fast with whatever software, whatever application the agent interacts with.

But it's important not to just look at the text and the HTML code, let's say, of a web page, but to actually see what's happening.

And so that would open up a lot of other applications, I think.

For computer use, no, it would be more like controlling the actions.