Eno Reyes

Speaker
513 total appearances

Podcast Appearances

The Neuron: AI Explained
This AI Agent Builds Better Code Than Most Developers (Factory AI)

But you quickly realize that that does not work.

Yeah.

And so then we started to try to figure out, you know, what are the individual dimensions that matter for a compression of a situation?

Like, you know, it's going to lose some information.

So what matters to preserve?

How can we design a system that doesn't just summarize, but sort of provides a very high-quality, balanced block of information from which the agent can then resume its task with little to no issues?

And so, you know, you can look at all these things, like the accuracy: whether or not the actual compressed artifact is accurate.

You can look at its context awareness, like what information is or isn't included about what is currently happening, what has previously happened.

You can look at the artifact trail.

Are the files and the logs and the critical, like, you know, singular information pieces present in the summarization? You can look at completeness and continuity and instruction following.

Anyway, you can look at all these different dimensions, and you can start to evaluate different strategies for context compression.
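The dimensions listed above can be sketched as a simple scoring rubric for comparing compression strategies. This is only an illustrative sketch: the 0-to-1 scale, the unweighted average, and the names StrategyScores and rank_strategies are assumptions, not something described in the episode.

```python
from dataclasses import dataclass
from statistics import mean

# Dimension names come from the discussion above.
# The 0-1 scale and equal weighting are assumptions for illustration.
DIMENSIONS = (
    "accuracy",
    "context_awareness",
    "artifact_trail",
    "completeness",
    "continuity",
    "instruction_following",
)


@dataclass
class StrategyScores:
    """Per-dimension scores for one (hypothetical) compression strategy."""
    name: str
    scores: dict  # dimension name -> score in [0, 1]

    def overall(self) -> float:
        # Refuse to average if any dimension was never scored.
        missing = [d for d in DIMENSIONS if d not in self.scores]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")
        return mean(self.scores[d] for d in DIMENSIONS)


def rank_strategies(candidates):
    """Return the candidates ordered from best to worst overall score."""
    return sorted(candidates, key=lambda c: c.overall(), reverse=True)
```

Each candidate strategy gets one StrategyScores entry, and rank_strategies orders them; a real evaluation might weight the dimensions differently.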

And so what we found is that using this method called probe-based evaluation, which is a fairly popular way to evaluate LLMs, where you basically, you know, the idea is you can ask, like, very focused questions with rubric-based evaluations, using another LLM to extract information from some state or from some answer that has previously been executed.

And so, you know, an example probe might be, you know, I have an agent session and then I compact that.

And then a probe would ask, what was the file that had the bug inside of it, right?

And so if you can answer that question, right?

And LLMs are now good enough such that it's basically a binary, like yes or no: it either has the information or it doesn't, right?
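The probe idea described above, compacting a session and then asking focused questions with a binary pass or fail, might look roughly like the following. This is a minimal sketch under stated assumptions: Probe, substring_judge, and run_probes are hypothetical names, and the judge here is a plain substring check standing in for the LLM-based judge mentioned in the conversation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Probe:
    """One focused question about a compacted agent session."""
    question: str
    expected: str  # the fact the compressed context should still contain


# A judge takes the compressed context and a probe and returns pass/fail.
Judge = Callable[[str, Probe], bool]


def substring_judge(compressed: str, probe: Probe) -> bool:
    # ASSUMPTION: stand-in for an LLM judge. It passes only if the
    # expected fact appears verbatim (case-insensitive) in the context.
    return probe.expected.lower() in compressed.lower()


def run_probes(compressed: str, probes: List[Probe],
               judge: Judge = substring_judge) -> float:
    """Return the fraction of probes the compressed context can answer."""
    if not probes:
        raise ValueError("need at least one probe")
    passed = sum(judge(compressed, p) for p in probes)
    return passed / len(probes)
```

A probe such as "What was the file that had the bug inside of it?" would carry the file name as its expected value, and swapping substring_judge for a model-backed judge leaves run_probes unchanged.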