Andy Halliday

Japan Claims AGI, Pentagon Adopts Gemini, and MIT Designs New Medicines

with a new technology that they've put forward called sparse attention.

So as you know, in transformer models, the attention mechanism is what makes it possible for you to throw just any kind of text, all these words, many of which are

The Daily AI Show

you know, in human language, not that important.

All the "and"s, "the"s, and "is"es, and all those things, you know, those can be ignored to a large degree.

And the attention mechanisms determine which tokens are actually important to the reasoning task that's there.
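To make the idea of attention deciding which tokens matter concrete, here is a minimal sketch of scaled dot-product attention weights for a single query. The token list and the two-dimensional vectors are hand-picked toy values, not taken from any real model, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Toy, hand-picked vectors (assumption: not from a real model).
tokens = ["the", "cat", "and", "sat"]
keys = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.1], [1.0, 2.0]]
query = [1.0, 1.0]

weights = attention_weights(query, keys)
for token, weight in zip(tokens, weights):
    print(f"{token:>4}: {weight:.3f}")
```

With these toy values, the content words "cat" and "sat" receive most of the weight, while the filler words "the" and "and" receive very little.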

Well, sparse attention uses this concept of, you know, cutting back the total number of tokens that are necessary by selectively

using the ones that have the most import to the inference process.
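One generic way to attend only to the most important tokens is top-k selection: score every token, keep the k highest, and compute attention over just those. The sketch below illustrates that general idea only. It is not DeepSeek's implementation, which the discussion here does not describe, and the function name, vectors, and choice of k are all hypothetical.

```python
import math

def top_k_sparse_attention(query, keys, values, k):
    """Attend only to the k keys with the highest scores for this query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]
    # Indices of the k highest-scoring tokens; the rest are skipped entirely.
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept tokens only.
    peak = max(scores[i] for i in kept)
    exps = {i: math.exp(scores[i] - peak) for i in kept}
    total = sum(exps.values())
    dim = len(values[0])
    output = [sum(exps[i] / total * values[i][d] for i in kept) for d in range(dim)]
    return sorted(kept), output

# Toy, hand-picked vectors (assumption: not from a real model).
keys = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.1], [1.0, 2.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
query = [1.0, 1.0]

kept, output = top_k_sparse_attention(query, keys, values, k=2)
print(kept, output)
```

Note that this sketch still scores every key before discarding most of them, so it only saves work in the softmax and value mixing. A practical system would need a cheaper way to pick candidates; that part is left out here.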

So the new sparse attention, DeepSeek Sparse Attention, they call it that, is an advanced attention mechanism that selectively prioritizes relevant input data in your prompt.

So it reduces computational overhead.

And it maintains high accuracy across longer contexts.
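A rough way to see where the reduced overhead comes from is to count how many query-key pairs get scored. The sketch below assumes dense attention scores every pair (n times n) and that a sparse variant keeps a fixed number of keys per query (n times k). The value of k is a hypothetical illustration, and the cost of choosing which keys to keep is ignored.

```python
def attention_pairs(context_len, kept_per_query=None):
    """Count query-key pairs scored: all n*n, or n*k when each query keeps k keys."""
    if kept_per_query is None:
        return context_len * context_len
    return context_len * min(kept_per_query, context_len)

for n in (8_000, 128_000):
    dense = attention_pairs(n)
    sparse = attention_pairs(n, kept_per_query=2_048)  # hypothetical k
    print(f"n={n:>7}: dense={dense:,} sparse={sparse:,} ratio={dense / sparse:.1f}x")
```

Under these assumptions the gap widens as the context grows, because the dense count grows with the square of the length while the sparse count grows linearly.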

Because obviously, if you can compress to the most relevant using selective sparse attention, then you can put a much larger prompt through that gives you a larger context

delivery by the user. And then, if you can maintain accuracy and reasoning across that longer context, that's an improvement in your performance. So it allows the model to process large data sets efficiently, and it allows for larger context. But, uh, they're claiming that they beat Gemini 3.0 Pro, but they didn't beat Gemini 3.0 Pro Thinking.

Okay, so they're saying, oh, here, our thinking model, the Speciale model, beats Gemini 3.0 Pro.

Well...

Yeah, okay, it does beat that marginally, but still the state of the art is currently Gemini 3.0 Pro Thinking. Okay, so you give Gemini 3.0 Pro just a little more budget, not using this sparse attention mechanism that we know of, necessarily, but you give a little more budget for thinking, and it is clearly the winner that way.

But anyway, I thought it was very interesting, this sparse attention approach.

It's a concept that's being used in many ways, especially in the context of mixture-of-experts models, where the sparsity is that you're going to activate only those areas of the model first

that are relevant experts to the process of inference for the prompt that's been given.
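The mixture-of-experts kind of sparsity can be sketched the same way: a router scores the experts, and only the top few are actually run. In the sketch below the experts are plain scalar functions standing in for sub-networks, and the gate scores are hard-coded rather than produced by a learned router. All names and values are hypothetical.

```python
import math

def route_to_experts(gate_scores, experts, x, top_k=2):
    """Run only the top_k highest-scoring experts and mix their outputs."""
    chosen = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:top_k]
    # Softmax over the chosen experts only; unchosen experts are never called.
    peak = max(gate_scores[i] for i in chosen)
    exps = {i: math.exp(gate_scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    output = sum(exps[i] / total * experts[i](x) for i in chosen)
    return sorted(chosen), output

# Hypothetical experts: simple scalar functions standing in for sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x * x, lambda x: -x]
gate_scores = [0.1, 2.0, 2.0, -1.0]  # assumption: would come from a learned router

chosen, y = route_to_experts(gate_scores, experts, x=3.0)
print(chosen, y)
```

Here only two of the four experts do any work for this input, which is the efficiency argument being made: most of the model stays idle for any given prompt.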

So making things more efficient, generally speaking, and DeepSeek has moved that forward.

Not to cut down DeepSeek: it's really an amazing open-source model, and it's being used widely in enterprise in the United States and around the world as the foundation model for any kind of fine-tuning and creation of a customized LLM that's operating in the enterprise context.
