Andy Halliday
So an 8-billion-parameter model trained with that system surpassed GPT-5 and Claude Opus 4.1 on Humanity's Last Exam.
So that's a pretty good accomplishment.
I'm not comparing it to GPT-5.1 or Opus 4.5.
The research happens to have been done before the release of those models.
So who knows whether those models would be a little bit better on Humanity's Last Exam, but it scored above GPT-5 and Claude Opus 4.1 while being two and a half times faster and more efficient.
Like the orchestration process does something better than large model inference.
Okay, so even when they used unseen tools, tools that it's not familiar with, the orchestrator adapted to that well, showing that it can work with changing tool sets and be kind of exploratory, or discover new tools to use.
And so...
ToolOrchestra is this new approach.
It's a new old approach.
Like, we've seen this emerging, I think: taking smaller dedicated models in a strapped architecture that puts them all together with a commander, right?
This orchestrator, the tool orchestrator.
And when you do that, you get faster, more efficient, and smarter results, right, than you do with even the then most advanced frontier models.
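To make the "commander" idea concrete, here is a minimal sketch of an orchestrator that routes each task to a small dedicated tool. It is an illustration only, not the method from the research discussed here: the names `Tool`, `Orchestrator`, `calculator`, and `generalist` are hypothetical, and simple keyword matching stands in for the trained 8-billion-parameter model that would make the routing decision in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tool:
    """A specialist the orchestrator can call (hypothetical stand-in for a model or tool)."""
    name: str
    keywords: List[str]        # routing hints; a real orchestrator would learn these
    run: Callable[[str], str]  # the work the specialist does


def calculator(task: str) -> str:
    # Toy specialist: adds every integer that appears in the task.
    return str(sum(int(tok) for tok in task.split() if tok.isdigit()))


def generalist(task: str) -> str:
    # Stand-in for falling back to a large, expensive general model.
    return f"general answer for: {task}"


class Orchestrator:
    def __init__(self, tools: List[Tool], fallback: Tool) -> None:
        self.tools = list(tools)
        self.fallback = fallback

    def register(self, tool: Tool) -> None:
        # Tool sets can change at runtime; new tools become routable immediately.
        self.tools.append(tool)

    def route(self, task: str) -> Tool:
        # Assumption: keyword overlap replaces the learned routing policy.
        text = task.lower()
        best, best_score = self.fallback, 0
        for tool in self.tools:
            score = sum(1 for kw in tool.keywords if kw in text)
            if score > best_score:
                best, best_score = tool, score
        return best

    def handle(self, task: str) -> Tuple[str, str]:
        tool = self.route(task)
        return tool.name, tool.run(task)


orchestrator = Orchestrator(
    tools=[Tool("calculator", ["add", "sum"], calculator)],
    fallback=Tool("generalist", [], generalist),
)

# A tool added after construction, mimicking a previously unseen tool.
orchestrator.register(Tool("shouter", ["shout"], lambda task: task.upper()))
```

Calling `orchestrator.handle("add 2 and 40")` sends the task to the calculator, while a task that matches no keywords falls through to the generalist. The point of the sketch is the shape: a cheap decision-maker in front, specialists behind it, and the large model used only when nothing else fits.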
I'm throwing up this comment from Gareth: Andrej Karpathy just posted in the past several days about an LLM council that he created.
And it's a similar notion, right?
You have a group of LLMs, smaller ones, although he might be using full scale, you know, frontier models as part of his council.
But setting them up in such a way that they can talk to each other, and are directed to communicate with each other, is really important.
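As a rough sketch of what "talking to each other" can look like in code, the example below has each council member answer independently, then shows every member its peers' answers and lets it revise, and finally takes a majority vote. This is an assumption made for illustration, not a description of Karpathy's actual implementation; `run_council`, `confident`, and `follower` are hypothetical names, and plain functions stand in for real LLM calls.

```python
from collections import Counter
from typing import Callable, Dict, List

# A member takes the question plus its peers' answers and returns its own answer.
Member = Callable[[str, List[str]], str]


def confident(question: str, peers: List[str]) -> str:
    # Stub member that always gives the same answer.
    return "4"


def follower(question: str, peers: List[str]) -> str:
    # Stub member that starts out wrong, then adopts the most common peer answer.
    if not peers:
        return "5"
    return Counter(peers).most_common(1)[0][0]


def run_council(question: str, members: Dict[str, Member]) -> str:
    # Round 1: every member answers without seeing anyone else.
    first = {name: member(question, []) for name, member in members.items()}

    # Round 2: every member sees the other members' answers and may revise.
    revised = {}
    for name, member in members.items():
        peers = [answer for other, answer in first.items() if other != name]
        revised[name] = member(question, peers)

    # Final step (assumed): majority vote over the revised answers.
    return Counter(revised.values()).most_common(1)[0][0]


council = {"a": confident, "b": confident, "c": follower}
final_answer = run_council("What is 2 + 2?", council)
```

Here the member that starts with a wrong answer corrects itself once it sees what the others said, which is the benefit of directing the models to communicate instead of polling them in isolation.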
So, yeah.