Stefano Ermon
Yeah, that's the benefit, I think, of being in academia: everything is open and you're allowed to publish all of your work. That's the whole point, advancing the field together as a community. I love that aspect, and I think a lot of researchers do. And I could sense a lot of unhappiness from colleagues and other researchers in industry, working in the big labs, as the publication policies started to tighten and people were not allowed to publish anymore.
I think a lot of people were not happy.
Yeah.
At the moment, we're going after what we think of as instant AI: applications of LLMs where latency is critical, which typically means there's a human in the loop and the human cannot wait.
And that human could be a developer.
So we're seeing a lot of usage of Mercury models in IDEs where you're essentially providing suggestions or edits to the code, for example, directly to a developer.
And there you maybe have a few hundred milliseconds of latency budget, and you want to be able to provide the best possible suggestion within the latency budget.
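The idea of serving the best possible answer within a fixed latency budget can be sketched in a few lines. This is a minimal illustration, not Mercury's actual serving logic: the model functions, their latencies, and the 200 ms budget are all hypothetical stand-ins, and the fallback strategy (try the slower, higher-quality model first, then fall back to a fast one on timeout) is just one simple way to enforce a budget.

```python
import asyncio

async def fast_model(prompt: str) -> str:
    # Hypothetical low-latency model call (~50 ms).
    await asyncio.sleep(0.05)
    return "fast suggestion"

async def best_model(prompt: str) -> str:
    # Hypothetical higher-quality but slower model call (~500 ms).
    await asyncio.sleep(0.5)
    return "best suggestion"

async def suggest(prompt: str, budget_s: float = 0.2) -> str:
    # Try the best model within the latency budget; if it can't
    # answer in time, fall back to the fast model so the developer
    # in the IDE still gets a timely suggestion.
    try:
        return await asyncio.wait_for(best_model(prompt), timeout=budget_s)
    except asyncio.TimeoutError:
        return await fast_model(prompt)

print(asyncio.run(suggest("def add(a, b):")))  # budget exceeded, so the fast model answers
```

A production system might instead race both models concurrently and keep whichever acceptable result arrives first, rather than paying the timeout before falling back.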
But it could also be customer support, voice agents, edtech.
I want to ask about that.
Yeah, in any other situation where you have to give an answer or interact with a human in real time, latency becomes critical, and the game becomes, again, what's the best quality result you can provide within the latency budget for a reasonable cost.
And that's where we dominate existing autoregressive solutions.
And that's where we're seeing a lot of the initial traction.
I think eventually, as the intelligence of the models keeps improving, as we do more R&D, as we catch up with frontier quality models, I think there's going to be more and more applications that we can go after.
But right now, we're going after latency-sensitive applications.
Yeah, and I mean, you're absolutely right that diffusion actually does work and it works really, really well for speech and music generation.
I know that some of the open-source models, and actually some of the state-of-the-art closed-source models, are based on diffusion for text-to-speech.
I didn't even know that.