Nathaniel Whittemore
Now, part of the reason that the media is so interested in this story is speculation around how this could slow revenue growth for both OpenAI and Anthropic, heading right into their IPOs, and then further, how a slowdown in revenue growth and perhaps an underperforming IPO could change the capital markets' appetite to continue to put money into those companies, which could have downstream impacts on that AI build-out, which could make the problem worse, etc., etc.
However, for our purposes today, we're not interested in the market discourse side of the conversation.
What I want to focus on is how companies are actually adapting and getting more token efficient.
Now, part one of this is a simple recognition that the efficiency and cost of intelligence are just as important as the raw underlying intelligence when it comes to AI in practice.
Perplexity's CEO, Aravind Srinivas, recently argued on CNBC that the single metric that would determine the winner of the AI race was which company can provide the most token value per watt per user.
He continued, "...whoever is able to maximize this particular objective really well by balancing accuracy, latency, cost, privacy, and intelligence all together, they're going to win. That's what's going to win long term."
Again, when it comes to AI in practice, it's not just raw intelligence, but the efficiency with which that intelligence is delivered that's going to really matter.
Now, we're starting to see efficiency considerations show up in other areas of the discourse like benchmarking as well.
Up until this year, pretty much the only thing people cared about when it came to benchmarks was the highest overall number in raw intelligence.
That's what state-of-the-art meant.
However, as we've moved into the agent paradigm, even the benchmarking companies themselves are spending a lot more time on the efficiency of intelligence as well.
For example, increasingly the most important chart from Artificial Analysis is not just their leaderboard score, but their intelligence versus output tokens used four-quadrant chart.
This one is a little bit easier if you're looking at it, but for those of you who are listening, I'll try to describe it.
On the y-axis, we have the raw score on the Artificial Analysis Intelligence Index.
That's the aggregate score across all of Artificial Analysis's tests, which at the moment has Claude Opus 4.8 on max setting and 5.5 on extra high setting up at the very top, scoring between 60 and 62.
On the x-axis are the output tokens used in all of the tests that make up the Artificial Analysis Intelligence Index, with fewer obviously being better.
The top left quadrant then represents a combination of highest scores and best token efficiency and paints quite a different picture than just the intelligence index alone.
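For anyone who wants to try that reading of the chart on their own numbers, here is a minimal sketch in Python. It assumes the quadrants are split at the median of each axis, which may not be how Artificial Analysis draws its chart, and the model names and figures are invented placeholders, not real benchmark results.

```python
from statistics import median

# Hypothetical (model, index score, output tokens used) rows, invented for
# illustration only; these are not Artificial Analysis figures.
MODELS = [
    ("model-a", 61.0, 120_000_000),
    ("model-b", 58.0, 40_000_000),
    ("model-c", 45.0, 35_000_000),
    ("model-d", 47.0, 110_000_000),
]


def quadrant(score, tokens, score_cut, token_cut):
    """Name the quadrant for one model, with score on y and tokens on x."""
    vertical = "top" if score >= score_cut else "bottom"
    horizontal = "left" if tokens <= token_cut else "right"
    return f"{vertical}-{horizontal}"


def classify(models):
    """Map each model name to its quadrant."""
    # Assumption: split the chart at the median of each axis.
    score_cut = median(score for _, score, _ in models)
    token_cut = median(tokens for _, _, tokens in models)
    return {
        name: quadrant(score, tokens, score_cut, token_cut)
        for name, score, tokens in models
    }


if __name__ == "__main__":
    for name, quad in classify(MODELS).items():
        print(name, quad)
```

In this made-up data, only model-b lands in the top-left quadrant: it scores above the median while using fewer output tokens than the median, even though model-a has the higher raw score.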