Nathaniel Whittemore
Now, what's valuable about this paper is, as Gleason put it, that it provides a framework with an actual trackable numeric score against which people can assess progress.
For example, if all market actors accepted this framework, which of course won't happen, then when GPT-6 came out, instead of the inevitable endless debates about whether we had hit a wall again, you could theoretically just look and see how much it had improved from GPT-4's 27% and GPT-5's 58%.
And yet, at the same time, there is one shortfall that could prove highly problematic.
Again, as Rohan Paul put it, the scope is cognitive ability, not motor control or economic output, so a high score does not guarantee business value.
In fact, increasingly other AGI definitions have fallen back on economic value as the most important proxy for intelligence.
Sometimes that's because more complex notions like continuous learning or performing tasks outside of the training set are too difficult to define.
One prominent example came from OpenAI's contract dispute with Microsoft.
Their agreement originally had Microsoft losing access to OpenAI's technology once AGI was achieved.
The problem was, of course, that the definition of AGI from OpenAI was pretty vague.
It defined AGI as, quote, highly autonomous systems that outperform humans at most economically valuable work.
The OpenAI board also had sole discretion to declare that AGI had been achieved.
This was viewed as an unfalsifiable claim that could cost Microsoft tens of billions of dollars.
The two companies ultimately settled on changing the definition of AGI to use a financial measurement as a proxy.
They decided that AGI would be deemed to have been achieved when OpenAI developed software that could generate $100 billion in profits.
Earlier this week, during the controversy around the Andrej Karpathy interview, Elon Musk revealed that he has a similar definition.
He posted on X that AGI is, quote, "...capable of doing anything a human with a computer can do, but not smarter than all humans and computers combined."
He said it's probably three to five years away.
He also put forward his belief that Grok 5 has a 10% chance of meeting this definition, and that the odds are rising.
Now, I think there are of course merits to both economic and functional definitions of AGI.
The functional definition laid out in the new paper establishes the areas where current models are lacking and the new capabilities they will need to achieve AGI.