
Nathaniel Whittemore


Content creator Lewis Gleason wrote, "What's powerful here is that this framework lets us track AGI like a scorecard.

For the first time, we have a framework that turns AGI from a buzzword into a measurable spectrum.

Instead of arguing 'are we close to AGI?', we can now ask how much cognitive ground remains before parity."
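To make the scorecard idea concrete, here is a minimal sketch, not the paper's actual methodology or numbers, of how a framework like this turns AGI into a measurable spectrum: rate each cognitive domain against a human baseline, then average. The domain list and scores below are illustrative assumptions only.

```python
# Minimal sketch of an AGI "scorecard": score each cognitive domain
# from 0-100 against an adult-human baseline, then average.
# Domains and numbers are illustrative assumptions, not the paper's data.

ILLUSTRATIVE_SCORES = {
    "math": 95,                  # near ceiling: olympiad-level performance
    "coding": 90,                # near ceiling: contest-level performance
    "language": 85,
    "visual_understanding": 40,  # still nascent, per the discussion below
    "auditory_understanding": 35,
    "long_term_memory": 10,      # the biggest gap identified
}

def agi_score(scores: dict[str, float]) -> float:
    """Average the per-domain scores into one number on a 0-100 scale."""
    return sum(scores.values()) / len(scores)

def biggest_gap(scores: dict[str, float]) -> str:
    """The lowest-scoring domain has the most headroom to pull the average up."""
    return min(scores, key=scores.get)

if __name__ == "__main__":
    print(f"Overall score: {agi_score(ILLUSTRATIVE_SCORES):.1f} / 100")
    print(f"Largest remaining gap: {biggest_gap(ILLUSTRATIVE_SCORES)}")
```

The arithmetic also previews a point made below: on a framework like this, further gains in domains that are already near ceiling barely move the aggregate, while improvement in a low-scoring domain moves it a lot.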

Now, one of the interesting things about this framework is its focus on what's missing rather than on highlighting a model's frontier abilities.

Over the summer, for example, GPT-5 and Gemini 2.5 Pro achieved gold-medal-level performances at the International Mathematical Olympiad and the International Collegiate Programming Contest.

The leading models, then, are already at a human level, indeed a very advanced human level, when it comes to math and coding.

Importantly, though, while achieving that level was a huge milestone on the path to AGI, further progress in those areas isn't going to make a big difference based on this center's approach to an AGI definition.

In contrast, audio and visual understanding is still very nascent and needs to improve dramatically before AI models could be considered anywhere close to AGI.

Google has made incredible strides with their multimodal models over the past year, and visual understanding seems to be developing quickly.

The Veo 3 family of models and Sora 2 are also able to add appropriate audio to generated videos, implying strong auditory understanding.

The big area that is so clearly missing, the biggest hole by a mile, is memory.

The paper, in fact, describes this as perhaps the most significant bottleneck.

Now, of course, this is a huge area of focus for the labs.

Anthropic recently introduced their Skills feature, which offers a more efficient way of storing and accessing memory, but we've yet to see a model that can intelligently store and retrieve information at anywhere close to a human level.

In fact, when people critique how far ahead of actual model capabilities they believe the hype has gotten, the criticism tends to come around to this part of cognition: models don't have memory, and they can't learn in the way that humans do.

Commenting on the paper's exploration of memory, Rohan Paul noted, "...they show that today's systems often fake memory by stuffing huge context windows and fake precise recall by leaning on retrieval from external tools, which hides real gaps in storing new facts and recalling them without hallucinations."

They emphasize that both GPT-4 and GPT-5 fail to form lasting memories across sessions and still mix in wrong facts when retrieving, which limits dependable learning and personalization over days or weeks.
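To illustrate the distinction Paul is drawing, here is a minimal sketch, entirely illustrative rather than anything from the paper or any lab's actual system, contrasting "memory" as context stuffing with a persistent store that survives across sessions. The class and function names are assumptions made for this example.

```python
import json
from pathlib import Path

def context_stuffing(history: list[str], new_message: str) -> str:
    """'Memory' by re-sending the entire conversation every turn.

    Nothing persists outside the prompt: once the session ends, or the
    history is truncated to fit the context window, the fact is gone.
    """
    return "\n".join(history + [new_message])  # sent to the model as one big context

class PersistentMemory:
    """Memory that survives across sessions by writing facts to disk."""

    def __init__(self, path: str = "memory.json"):
        self.path = Path(path)
        self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts, indent=2))

    def recall(self, key: str) -> str | None:
        # Returns None rather than guessing -- the failure mode flagged above
        # is retrieval that mixes in wrong facts.
        return self.facts.get(key)

# Usage: a fact stored in one session is still there in the next.
memory = PersistentMemory()
memory.remember("user_timezone", "US/Eastern")
print(memory.recall("user_timezone"))   # "US/Eastern", even after a restart
print(memory.recall("user_birthday"))   # None, rather than a fabricated answer
```

The point of the contrast is that the first approach only looks like memory, and it degrades as soon as the context fills up, while genuine long-term memory requires durable storage plus retrieval that doesn't invent what it can't find.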

Anyone who has thought they had locked in core knowledge and context about themselves with an LLM, only to have it feed back a response with none of that understanding built in, will understand what a big problem this actually is.