Andy Halliday
There'll be a quiz on this afterward.
DeepSeek has just introduced a new technique in LLM inference that dramatically advances its capability in pure reasoning.
So, you know, the Chinese companies, starved of the scaling compute you'd have if you could acquire the top-end data center infrastructure like the NVIDIA Blackwell chips and so on, have innovated around efficiencies along two different dimensions, and,
I'll circle back to this, but one of those two dimensions is the use of sparsity.
Now, sparsity is the opposite of density in the terminology of AI.
Dense means that you're using every layer of the network in each inference run.
That's a dense, deep neural network.
And sparsity means you're only activating certain portions of it.
So if you have a 100 billion parameter model, any one inference run dynamically assesses which portions of that deep neural network, which layers of the LLM, have to be activated.
And this has given rise to the primary architecture for LLMs today, which is called mixture of experts.
So the only experts that are activated in this context are the ones which are relevant to the query.
And that reduces the computational overhead, makes for a more efficient and effective inference run, reduces the cost in both energy and compute time, and allows a larger context window to be executed.
So all of those things are improving on the efficiency scale.
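To make the routing concrete, here is a minimal sketch of top-k mixture-of-experts gating in Python with NumPy. The expert count, dimensions, and random weights are toy assumptions for illustration, not DeepSeek's actual configuration; the point is just that only the selected experts' parameters do any work per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "experts": each is just a small feed-forward weight matrix.
# In a real MoE layer, these replace the single dense FFN block.
num_experts, d_model, top_k = 8, 16, 2
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(num_experts)]
gate_w = rng.standard_normal((d_model, num_experts)) * 0.1  # learned router weights

def moe_layer(x):
    """Route a token vector x through only top_k of num_experts experts."""
    logits = x @ gate_w                        # router scores, one per expert
    top = np.argsort(logits)[-top_k:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the selected experts only
    # Only top_k expert matmuls actually run; the other experts' parameters
    # stay idle, which is the sparsity that cuts compute cost per token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_layer(token)
print(f"activated {top_k}/{num_experts} experts, output shape {out.shape}")
```

In a dense model, every token would pay for all eight matrix multiplies; here it pays for two, which is where the energy and compute-time savings come from.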
The second dimension has to do with memory.
And this is where the new DeepSeek technique comes in.
We know that models just left as a dense model, injected only with your prompt and whatever additional context you type in at the time of inference, can be subject to hallucinations.
And so we like to ground that with a retrieval-augmented generation model, where you have an external memory, a database that is going to be referenced for context.
And semantic relevance is used to selectively retrieve the relevant components of the ground-truth data in that retrieval-augmented generation store, typically a vector database, in order to achieve that semantic retrieval.
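Here is a minimal sketch of that semantic retrieval step against a toy vector store. The `embed` function is a stand-in hash-based embedding, not a real embedding model, and the document set is invented for illustration; in practice you'd use a learned embedding model and a proper vector database.

```python
import numpy as np

def embed(text):
    """Toy embedding: hash words into a fixed-size unit vector.
    Stands in for a real embedding model in this sketch."""
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

# A toy "vector database": grounding documents stored with their embeddings.
docs = [
    "DeepSeek uses a mixture of experts architecture for sparse inference.",
    "Retrieval augmented generation grounds model output in external data.",
    "Blackwell is NVIDIA's data center GPU architecture.",
]
index = [(d, embed(d)) for d in docs]

def retrieve(query, k=1):
    """Return the k documents most semantically relevant to the query,
    ranked by cosine similarity (vectors are already unit-normalized)."""
    q = embed(query)
    scored = sorted(index, key=lambda pair: float(q @ pair[1]), reverse=True)
    return [doc for doc, _ in scored[:k]]

# The retrieved text would be prepended to the prompt as grounding context.
print(retrieve("how does retrieval augmented generation reduce hallucinations?"))
```

The retrieved passages get injected into the prompt at inference time, so the model's answer is anchored to external data rather than whatever it can recall from its weights.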