Yannis Antonoglou
Yeah, this is a really technical difference between what we mean by dense models and what we mean by MoEs, and how MoEs might sound like much larger models but at the same time be quite efficient when it comes to inference.
So at the heart of it, you can think of a mixture of experts model as many, many models put right next to each other. And then there's what we call a router, which is really a system that selects which of the models to route each forward pass to.
So when I have a dense model, I guess attention is the idea of looking at every token in the past and trying to predict the next token based on all of them. There's an everything-to-everything connection. That's the fully connected dense model.
In the mixture of experts, you have many of these models, one right next to the other, and a system that selects during inference, at runtime, which of the models to route the input to.
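As a minimal sketch of that routing idea (the dimensions, the argmax router, and the single weight matrix standing in for each expert are all illustrative assumptions, not any particular model's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 16, 4

# Each "expert" here is just one weight matrix standing in for a small model.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

# The router scores each expert for a given input vector.
router_w = rng.normal(size=(d_model, n_experts))

def moe_forward(x):
    scores = x @ router_w               # one score per expert
    chosen = int(np.argmax(scores))     # route to the highest-scoring expert
    return experts[chosen] @ x, chosen  # only that expert actually runs

x = rng.normal(size=d_model)
y, which = moe_forward(x)
print("routed to expert", which)
```

The efficiency he mentions falls out of the last line of `moe_forward`: the parameters of the unchosen experts sit idle, so the compute per forward pass is much smaller than the total parameter count suggests.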
No, no, no, that's not correct, actually. Because GPT-4 or GPT-4.5 could also be mixture of experts models.
It's more about the architecture. The architecture is as if you took and trained many small LLMs, put them together, and then trained a system that can route between them. And this way you have many, many experts contributing to the final output.
Although, you know, you don't actually do it that way; you don't have full separate models. You just have them interspersed, right? As in, you have layers of experts, and they contribute to the same final output.
Yeah, so you can think of it as a path that the model can take, using different experts along the way to produce the final output.
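A rough sketch of that path, assuming a toy stack where every layer is an MoE layer with its own router picking one expert (the number of layers, the expert count, and all shapes are invented for the illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_experts, n_layers = 16, 4, 3

# Each layer has its own set of experts and its own router.
layers = [([rng.normal(size=(d, d)) for _ in range(n_experts)],
           rng.normal(size=(d, n_experts))) for _ in range(n_layers)]

x = rng.normal(size=d)
path = []
for experts, router_w in layers:
    k = int(np.argmax(x @ router_w))   # a possibly different expert per layer
    path.append(k)
    x = np.tanh(experts[k] @ x)        # only the chosen expert runs

print("experts used along the way:", path)  # the "path" through the model
```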
No, it's actually at the level of the token that you do that. So with the mixture of experts, I guess the idea is that each expert learns something from the data, but it's not at that level of abstraction. It's at a lower level of abstraction.
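To make the token-level point concrete, here's a minimal per-token routing sketch (top-1 routing; the sequence length, widths, and expert count are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
seq_len, d, n_experts = 6, 8, 3

tokens = rng.normal(size=(seq_len, d))            # one vector per token
router_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

# The routing decision is made per token, not per query or per task:
# neighbouring tokens in the same sequence can land on different experts.
assignments = np.argmax(tokens @ router_w, axis=1)
outputs = np.stack([experts[e] @ t for t, e in zip(tokens, assignments)])

print("expert chosen for each token:", assignments)
```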