Eno Reyes
But you quickly realize that that does not work.
Yeah.
And so then we started to try to figure out, you know, what are the individual dimensions that matter for a compression of a situation?
Like, you know, it's going to lose some information.
So what matters to preserve?
How can we design a system that doesn't just summarize, but provides a high-quality, well-balanced block of information that the agent can then use to resume its task with little to no issues?
And so, you know, you can look at all these things, like accuracy: whether or not the actual compressed artifact is accurate.
You can look at its context awareness, like what information is or isn't included about what is currently happening, what has previously happened.
You can look at the artifact trail: are the files, the logs, and the critical, singular pieces of information present in the summarization? You can look at completeness, and continuity, and instruction following.
Anyway, you can look at all these different dimensions, and you can start to evaluate different strategies for context compression.
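The dimensions listed above could be captured in a simple scoring structure. This is a hypothetical sketch, not anything from an actual product: the field names come from the discussion, but the 0-to-1 scale and the unweighted mean are illustrative assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class CompressionRubric:
    """Hypothetical rubric for scoring a context-compression artifact."""
    accuracy: float               # is the compressed artifact factually accurate?
    context_awareness: float      # does it cover what is / was happening?
    artifact_trail: float         # are files, logs, key identifiers preserved?
    completeness: float           # is anything essential missing?
    continuity: float             # can the agent resume without re-deriving state?
    instruction_following: float  # are the original instructions still respected?

    def overall(self) -> float:
        # Unweighted mean across dimensions (a simplifying assumption).
        vals = [getattr(self, f.name) for f in fields(self)]
        return sum(vals) / len(vals)


score = CompressionRubric(
    accuracy=1.0,
    context_awareness=0.8,
    artifact_trail=0.6,
    completeness=0.9,
    continuity=0.7,
    instruction_following=1.0,
)
```

In practice each dimension's score would come from an evaluator rather than being assigned by hand; the dataclass just makes the dimensions explicit so different compression strategies can be compared on the same axes.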
And so we used this method called probe-based evaluation, which is a fairly popular way to evaluate LLMs. The idea is that you ask very focused questions, with rubric-based evaluations, using another LLM to extract information from some state or from some answer that was previously produced.
And so, you know, an example might be: I have an agent session, and then I compact it. And then a probe would ask, what was the file that had the bug inside of it, right?
And if the compacted context can answer that question, you know the information survived. LLMs are now good enough that it's basically binary: yes or no, it either has the information or it doesn't, right?
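The probe idea described above can be sketched in a few lines. This is a minimal illustration, not the speaker's actual implementation: `run_probe`, `keyword_judge`, and the sample strings are all made up, and the judge here is a trivial substring check standing in for "ask another LLM and grade yes/no", so the example runs without any API.

```python
def run_probe(compacted_context: str, probe: str, expected: str, judge) -> bool:
    """Return True if the judge finds the expected fact in the compacted context."""
    return judge(compacted_context, probe, expected)


def keyword_judge(context: str, probe: str, expected: str) -> bool:
    # Stand-in for an LLM judge: a real setup would prompt another model
    # with the probe question and grade its answer against a rubric.
    return expected.lower() in context.lower()


# Hypothetical compacted session summary and probes.
compacted = "Fixed a null check in src/parser.py; tests now pass."
probes = [
    ("Which file had the bug?", "src/parser.py"),
    ("What command reproduced the failure?", "pytest -k parser"),
]

results = {q: run_probe(compacted, q, exp, keyword_judge) for q, exp in probes}
# The first probe survives compaction; the second fact was lost.
```

The binary yes/no framing from the transcript is what makes this tractable: each probe is a pass/fail check on whether one specific piece of information survived compression, and a compression strategy is scored by how many probes it passes.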