Nathaniel Whittemore
Today, we're diving deeper on the big AI theme of the moment, which is token efficiency.
Now, you might have heard this term coming up a lot more recently.
Matthew Berman recently tweeted, everyone is talking about token efficiency now.
I made an argument on Twitter yesterday that every AI business is now, and for the foreseeable future, a token efficiency business.
In other words, every company that is selling services or products around AI is somehow and in some way going to be trying to help companies be better at allocating AI budgets effectively to get the most value from the raw capability that the AI represents.
Now, there are a ton of stories right now about advanced early AI adopter companies shifting their strategies as token consumption goes way up in the agent era.
Walmart, as we discussed this week, has started to cap usage of their internal AI tool because employees were using it too much.
Uber, as we discussed just yesterday, has set a $1,500 a month limit on spend on tools like Claude Code.
And the whole issue of token cost is starting to come home to roost for the big labs.
At their enterprise event on Tuesday, OpenAI's Sam Altman said that AI budgeting had recently become, in his words, a "huge issue" for some companies, even though cost was something that, quote, never came up earlier in the year.
Now, again, none of this is particularly surprising if you look at the underlying dynamics.
The move from assisted AI to deploying lots of agents to do things for us has meant a significant increase in the amount of AI being consumed, represented by the number of AI tokens being used.
However, the number of AI tokens being consumed is limited by the number of AI tokens that get produced, which is limited by a whole supply chain of things like power and inputs and components.
And unfortunately for all of us, we are in the very early days of the build-out of that infrastructure and are very likely to be, over the course of the next half decade at least, living in a situation of some sort of token shortage.
And what happens in a market economy?
When there's more demand for something than there is supply of something, the price goes up.
Or, in the case of the labs, it manifests as shifting people off of subsidized per-seat plans and over onto API pricing, meaning that they are paying for all of the tokens they're consuming.
And because that consumption can be effectively unlimited, that's why you're seeing companies start to impose caps.
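To make the per-seat versus API pricing point concrete, here is a minimal sketch in Python. The seat price and the per-token price are made-up numbers for illustration, not any lab's actual pricing; only the $1,500 monthly limit comes from the discussion above. The function names are hypothetical too.

```python
# Rough comparison of flat per-seat pricing and metered API pricing.
# SEAT_PRICE_USD and PRICE_PER_MILLION_TOKENS are hypothetical placeholders.
SEAT_PRICE_USD = 200.0            # assumed flat monthly subscription price
PRICE_PER_MILLION_TOKENS = 10.0   # assumed blended API price per 1M tokens
MONTHLY_CAP_USD = 1500.0          # the monthly spend limit mentioned above


def api_cost(tokens: int) -> float:
    """Metered cost in USD for a given number of tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


def tokens_allowed_under_cap(cap_usd: float = MONTHLY_CAP_USD) -> int:
    """How many tokens a spend cap buys at the assumed API price."""
    return int(cap_usd / PRICE_PER_MILLION_TOKENS * 1_000_000)


def breakeven_tokens() -> int:
    """Usage level above which metered pricing costs more than the flat seat."""
    return int(SEAT_PRICE_USD / PRICE_PER_MILLION_TOKENS * 1_000_000)


if __name__ == "__main__":
    # Light assisted use, heavier use, and heavy agent-style use.
    for monthly_tokens in (5_000_000, 50_000_000, 500_000_000):
        cost = api_cost(monthly_tokens)
        status = "over cap" if cost > MONTHLY_CAP_USD else "within cap"
        print(f"{monthly_tokens:>12,} tokens -> ${cost:,.2f} ({status})")
```

Under these assumed prices, a flat seat stops covering its own usage past 20 million tokens a month, and the cap is reached at 150 million tokens. The real thresholds depend entirely on actual pricing, but the shape is the same: on metered pricing, cost scales with consumption, and a cap is the only thing that bounds it.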