Andy Halliday
Literally one year ago.
And that was the first major model release to expand inference-time computation: instead of just doing one pass through the model, it runs an internal multi-step deliberative process, with planning and refinement, before actually generating the final output.
And that's the way models work today, and, as Beth just said, they're approximating human-level intelligence in many respects now when it comes to reasoning.
And they make mistakes just like humans do.
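(To make that distinction concrete, here is a minimal illustrative sketch of single-pass generation versus an inference-time deliberation loop. The generate function is a hypothetical stand-in for one model call, not a real API.)

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for one forward pass of a language model."""
    return f"<model output for: {prompt!r}>"

def single_pass(question: str) -> str:
    # Pre-reasoning-model style: one pass, and the first thing
    # sampled is the final answer.
    return generate(question)

def deliberate(question: str, steps: int = 3) -> str:
    # Reasoning-model style sketch: spend extra inference-time compute
    # on hidden planning and refinement passes before committing to a
    # final answer.
    plan = generate(f"Plan an approach to: {question}")
    for _ in range(steps):
        plan = generate(f"Critique and refine this plan:\n{plan}")
    return generate(f"Using this plan, answer the question:\n{question}\n{plan}")

print(deliberate("Why is the sky blue?"))
```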
So anyway, that's the lead of the news.
And I just want to tell you that you can see this report and make your own digest of it.
It's at openrouter.ai/state-of-ai.
So check that out.
I can paste that in the private chat, and you guys can put it up in the public chat too.
Well, I have a couple of tidbits from internal leaks, probably from the same source, about what Garlic is.
And early internal runs show it outperforming both Gemini 3 and Anthropic's Opus 4.5, the state-of-the-art leading frontier models, on coding and logic, while staying small enough to deploy cheaply.
So the interesting part of this is that they're going to put something out that's in the lead position and also less expensive, which has implications for the entire house of cards being built here, where it costs so much to put together the inference machines that support these models.
And yet per-token costs are being driven down by competition.
You know, why would OpenAI put out a less expensive model that's at the state of the art?
It's that they're trying to get the momentum back from Gemini.
And that's one of the things that's going to happen here: it's probably going to be less expensive and probably more efficient.
And that makes it possible for them not to dig their hole a lot deeper.
But nonetheless, we really should be seeing inference costs rising in line with the useful exploitation of that inference.