James Kynge
So it's a word or part of a word or a punctuation mark that is represented by a token ID.
So a token ID is assigned to a bit of a word like that.
And all of the large language models use these token IDs, or these tokens, to generate text.
So every time you put a question into your large language model, what is actually happening in the technology is that a whole load of tokens are being generated in order to answer your question.
So that's really it.
But now we've got the rise of agentic AI. These are pieces of software that actually do tasks.
So for instance, you could ask an AI agent to book your entire holiday.
You just write, "Please book me a holiday in France," with a few more instructions.
It'll book your air tickets, it'll book your hotel, book your buses, book your taxis, book everything that you need.
That is performing a task rather than simply replying to a query that you had.
And of course, that uses many, many more tokens than simply asking a question.
So let me come to the crucial aspect of this, the cost differential between Chinese-generated tokens and US-generated tokens.
So just to give you a little bit of an example, MiniMax and Moonshot, these are two Chinese AI models.
Their cost per million output tokens is about two US dollars to three US dollars.
Conversely, Anthropic's Claude Sonnet 4.5, that's a big American LLM, costs about 15 US dollars per million output tokens.
So there's about a six-fold gap between the US and China.
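The arithmetic behind that gap can be sketched in a few lines of Python. This is only an illustration using the approximate prices quoted in the discussion. The function names and the 500,000-token agent task are assumptions made up for the example, not figures from the conversation.

```python
# Approximate prices per million output tokens (USD), as quoted in the discussion.
CHINESE_PRICE_RANGE = (2.0, 3.0)  # MiniMax and Moonshot
US_PRICE = 15.0                   # Claude Sonnet 4.5


def cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost in US dollars of generating a given number of output tokens."""
    return output_tokens / 1_000_000 * price_per_million


def price_ratio(us_price: float, chinese_price: float) -> float:
    """How many times more expensive the US price is than the Chinese price."""
    return us_price / chinese_price


low, high = CHINESE_PRICE_RANGE
midpoint = (low + high) / 2

# The gap depends on which end of the 2-3 dollar range is used.
ratio_vs_high = price_ratio(US_PRICE, high)     # smallest gap
ratio_vs_mid = price_ratio(US_PRICE, midpoint)  # the "about six-fold" figure
ratio_vs_low = price_ratio(US_PRICE, low)       # largest gap

# Hypothetical agent task: the 500,000 output tokens is an assumed number.
AGENT_TASK_TOKENS = 500_000
us_task_cost = cost_usd(AGENT_TASK_TOKENS, US_PRICE)
chinese_task_cost = cost_usd(AGENT_TASK_TOKENS, midpoint)

print(f"Gap: {ratio_vs_high:.1f}x to {ratio_vs_low:.1f}x (midpoint {ratio_vs_mid:.1f}x)")
print(f"Assumed agent task: ${us_task_cost:.2f} vs ${chinese_task_cost:.2f}")
```

Taking the midpoint of the two-to-three-dollar range gives the six-fold figure. The ends of the range give a gap of between five and seven and a half times.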