Nathaniel Whittemore
Here's what we found.
One...
Hybrid legal agents can beat frontier models on quality and cost by routing selectively to a frontier advisor.
We tested a hybrid setup where GLM 5.1 served as the primary worker, routing tasks to Opus 4.7 as an advisor only when needed.
GLM invoked Opus sparingly, just 0.83 times per task on average.
The hybrid setup beat Opus on both quality and cost.
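To make the worker-plus-advisor pattern concrete, here is a minimal sketch in Python. It is not the setup from the study: the escalation signal (`needs_advice`), the retry cap, and the stub models are all assumptions made up for illustration, and no real model API is called. It only shows the shape of the idea: a cheaper worker handles every task, asks the advisor only when it flags that it needs help, and the system tracks how many advisor calls were made per task.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WorkerResult:
    answer: str
    needs_advice: bool  # hypothetical signal: the worker asks for help


@dataclass
class RoutingStats:
    tasks: int = 0
    advisor_calls: int = 0

    @property
    def advisor_calls_per_task(self) -> float:
        # Average advisor invocations per task (0.0 if nothing has run yet).
        return self.advisor_calls / self.tasks if self.tasks else 0.0


# Hypothetical callable types standing in for real model clients.
Worker = Callable[[str, Optional[str]], WorkerResult]
Advisor = Callable[[str, str], str]


def run_task(task: str, worker: Worker, advisor: Advisor,
             stats: RoutingStats, max_advice: int = 3) -> str:
    """Let the worker attempt the task, escalating to the advisor on request."""
    stats.tasks += 1
    result = worker(task, None)
    calls = 0
    while result.needs_advice and calls < max_advice:
        advice = advisor(task, result.answer)
        stats.advisor_calls += 1
        calls += 1
        result = worker(task, advice)  # retry with the advisor's guidance
    return result.answer


# Stub models for demonstration only (assumed behavior, not real models).
def stub_worker(task: str, advice: Optional[str]) -> WorkerResult:
    if "hard" in task and advice is None:
        return WorkerResult(answer="draft: " + task, needs_advice=True)
    suffix = " (with advice)" if advice else ""
    return WorkerResult(answer="final: " + task + suffix, needs_advice=False)


def stub_advisor(task: str, draft: str) -> str:
    return "check the governing clause for: " + task


stats = RoutingStats()
tasks = ["easy lease summary", "hard indemnity clause", "easy NDA check"]
answers = [run_task(t, stub_worker, stub_advisor, stats) for t in tasks]
print(answers)
print(stats.advisor_calls_per_task)
```

In this toy run only one of the three tasks escalates, so the advisor is called about a third of a time per task. The 0.83 figure quoted above is the same kind of average, measured on the real benchmark.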
They also found that post-training can push open models to frontier-level legal performance.
With a little bit of post-training on Kimi's K2.6 model, they were able to move Kimi ahead of Opus on their legal agent benchmark, and to do so at a cost 11 times lower than Opus alone.
Writes Patrick Oyo, this is the multi-model routing thesis proved in production on one of the hardest benchmarks in enterprise AI.
The insight isn't that open source beat frontier.
It's that smart routing beat brute force.
Using the most expensive model for every task is not a quality strategy.
It's a laziness tax.
The teams building routing layers that send each task to the right model at the right cost are now demonstrably ahead on both dimensions simultaneously.
Inference optimization just became a first-class competitive advantage.
Legal proved it first because the stakes forced the discipline.
Now, luckily for enterprise AI buyers, the infrastructure required for this sort of routing and even post-training is very quickly becoming productized.
Software development company Factory just released a new product this week called Factory Router, which they say picks the right model for every task automatically.
They write,
A higher token bill does not mean more work is getting done.