Eno Reyes

Speaker
513 total appearances

Podcast Appearances

The Neuron: AI Explained
This AI Agent Builds Better Code Than Most Developers (Factory AI)

And you sort of still, to some extent, have to think a little bit about, you know, what if it does, what if it gives the right answer, but it's not parsed correctly?

So there's a little bit of work required to make sure this is robust, but generally fairly reliable way to determine does the LLM actually know at this point in time what happened?
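The parsing concern above can be sketched roughly as follows. This is an illustrative assumption, not the evaluation harness described in the episode: the helper names `normalize` and `answer_matches` are hypothetical, and the idea is simply to compare a model's answer to an expected fact leniently, so that a correct answer wrapped in backticks or trailing punctuation is not scored as a miss.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop quotes/backticks, collapse whitespace, trim trailing punctuation."""
    text = text.strip().lower()
    text = re.sub(r"[`\"']", "", text)
    text = re.sub(r"\s+", " ", text)
    return text.rstrip(".!?,;: ")


def answer_matches(model_output: str, expected: str) -> bool:
    """Return True if the expected fact appears in the model output after normalization.

    Hypothetical helper: a real harness would likely need stricter or
    task-specific matching than a substring check.
    """
    return normalize(expected) in normalize(model_output)


if __name__ == "__main__":
    print(answer_matches("The file is `src/auth/login.ts`.", "src/auth/login.ts"))
    print(answer_matches("I don't know.", "src/auth/login.ts"))
```

A substring check like this is deliberately forgiving; it trades some false positives for fewer cases where a right answer is marked wrong because of formatting.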

And so, when we evaluated our compaction method and compared it to OpenAI's compression strategy, as well as Claude Code's compression strategy, we found that we had generally built one that was, across all of these dimensions, much stronger at instruction following, continuity, completeness, but most importantly, just accuracy and context awareness, right?

Yeah.

Where it was just able to recall all of the critical pieces of information quite well.

Well, not just faster.

I think actually speed was relatively similar across the board.

But the two things that really matter is like the quality of the compression and how much it actually compresses, right?

Basically like the token reduction efficiency.

And we do have the worst token reduction efficiency.

You know, OpenAI is 99.3%, Claude Code's was 98.7%, and ours was 98.6%, so like 0.1 off. Maybe that's within the error bars, right? But the overall quality, right, you can basically take all of these characteristics and you can build sort of a quality score that just says, you know, across all these dimensions, which one is stronger.
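As a rough sketch of the two measures mentioned here: token reduction can be read as the fraction of tokens removed by compaction, and a quality score can be read as some aggregate over the evaluation dimensions. Both function names are hypothetical, the unweighted mean is an assumption (the episode does not say how the dimensions are combined), and the token counts in the usage lines are made-up illustrative values, not results from the evaluation.

```python
def token_reduction(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of tokens removed by compaction (0.986 means 98.6% reduction)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1.0 - compressed_tokens / original_tokens


def quality_score(dimension_scores: dict[str, float]) -> float:
    """Aggregate per-dimension scores into one number.

    Assumption: a plain unweighted mean. The actual weighting used in the
    evaluation discussed in the episode is not stated.
    """
    if not dimension_scores:
        raise ValueError("need at least one dimension")
    return sum(dimension_scores.values()) / len(dimension_scores)


if __name__ == "__main__":
    # Illustrative numbers only: a 100,000-token history compacted to 1,400 tokens.
    print(f"{token_reduction(100_000, 1_400):.1%}")
    # Placeholder scores for two of the dimensions named in the episode.
    print(quality_score({"accuracy": 1.0, "continuity": 0.5}))
```

Keeping the two measures separate makes it possible for a method to lose slightly on reduction while still winning on quality.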

I think that like probably the most important thing we learned was just how much structure matters.

Right.

So I think that probably the biggest failure case is generic summarization.

And I think that the worst-performing techniques in our evaluation were the ones that basically just treat all content as equally compressible.

It's just one big summary and let the LLM figure it out.

A file path could be very low entropy information, but it's probably the most important piece of information an agent needs.
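One way to picture the point about structure is a compaction step that treats file paths differently from ordinary prose: the prose gets summarized, while paths are pulled out and kept verbatim. The sketch below is an assumption for illustration only and is not the compaction method discussed in the episode; `compact`, `PATH_RE`, and the `summarize` callback (a stand-in for an LLM call) are all hypothetical names.

```python
import re
from typing import Callable

# Hypothetical, deliberately simple pattern: slash-separated segments ending
# in a file extension, e.g. src/auth/login.ts
PATH_RE = re.compile(r"(?:[\w.-]+/)+[\w.-]+\.\w+")


def compact(messages: list[str], summarize: Callable[[str], str]) -> str:
    """Summarize the conversation but keep every file path verbatim.

    `summarize` stands in for whatever produces the prose summary;
    here it is just a caller-supplied function.
    """
    paths: list[str] = []
    for msg in messages:
        for path in PATH_RE.findall(msg):
            if path not in paths:  # keep first-seen order, no duplicates
                paths.append(path)
    summary = summarize("\n".join(messages))
    files = "\n".join(f"- {p}" for p in paths)
    return f"Summary:\n{summary}\n\nFiles referenced:\n{files}"


if __name__ == "__main__":
    history = [
        "Edited src/auth/login.ts to fix the bug.",
        "Ran tests in tests/test_login.py",
    ]
    print(compact(history, lambda text: "Fixed a login bug and ran tests."))
```

The design choice is that the summarizer never gets the chance to drop or paraphrase a path, since the paths are carried in their own section regardless of what the summary says.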