Andy Halliday

Speaker
8321 total appearances

Podcast Appearances

The Daily AI Show
Spotify Engineers Stopped Writing Code

So, Deep Think.

I'm going to share my screen here quickly just to show you what it did.

But on the far left over here, you see the ARC-AGI scale, and ARC-AGI-2 is the one that we're looking at here.

ARC-AGI is a really complex set of logic puzzles, basically. ARC-AGI-1, you know, was ultimately replaced by ARC-AGI-2 just less than a year ago.

ARC-AGI-2 tests for advanced reasoning and knowledge, and it's part of the set of metrics being used to try to determine whether AI is getting to artificial general intelligence or not. When it first came out, not so long ago, none of the models could score much above 5% on this thing.

Now, Gemini 3 Deep Think has 85% on the ARC-AGI-2 test.

What?

On Humanity's Last Exam, it's outperforming Claude Opus 4.6 Thinking Max by 8.4 percentage points.

And, you know, it's against GPT 5.2.

All these things will be updated, obviously, once GPT 5.3 gets in the game here a little bit more.

But this is a major advance, Gemini 3 Deep Think.

And, you know, OpenAI is struggling to maintain its lead overall while being inherently more expensive and underfunded compared to major players like Gemini.

And, you know, that's Google behind them, right?

Google has new money to support this.

And they're basically surpassing what OpenAI is doing here, in my view.

It has value to me.

And let me just add something about Google Deep Think.

So on top of Google Deep Think's reasoning capabilities, Google Deep Think has basically...