Andy

👤 Person
20707 total appearances

Podcast Appearances

The Daily AI Show
Who Is Winning The AI Model Wars?

And, you know, including in the MathArena Apex mathematics test, Gemini 3 Pro scored 23.4%, which was...

compared to GPT-5.1's 1%.

That's how big an advantage Google Gemini 3 Pro set that way.

And then in addition to that, Gemini 3 Pro, again, a slightly larger model than just the standard Gemini 3 that comes out,

Gemini 3 Pro, and I'm not sure how to parse those because there's also Gemini 3 Deep Think, which is the winner in the ARC-AGI-2 thing.

I'm going to talk about that in a second.

But Gemini 3 Pro set a new high score for AI models in Tracking AI's offline IQ test, meaning it cannot use search or anything else.

It has to use its internal reasoning capabilities.

It's offline AI.

And it surpassed Grok 4 Expert Mode's 126, achieving a 130-point IQ on a reasoning test, basically. So that's Google Gemini 3. So then Anthropic's coming back strong with Claude Opus 4.5.

It reclaimed the top spot on key coding benchmarks like SWE Bench Verified, software engineering.

81%, edging out Gemini 3 Pro's 76%.

So significantly better than Gemini 3 Pro.

And at the same time,

The important takeaway for Claude Opus 4.5 is they cut their prices.

Opus was their top model before, and it was expensive in the 4.1.

The new Opus 4.5 is a third the cost of what Opus 4.1 was before.

So they're making it affordable, and it is the top model.

A state-of-the-art model when it comes to coding benchmarks. Okay, but then OpenAI came out with Codex Max using 5.1, and that one introduced this context compaction technology that allows for breakthroughs in the context continuity of agentic work using

Basically, there's no end to how long this thing can go on reasoning without losing track of what it's doing.
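
To make the compaction idea concrete, here is a minimal sketch of how a context window might be kept under a budget by folding older messages into a summary. This is an illustration only, not how Codex Max actually does it: the names `CompactingContext`, `estimate_tokens`, and `summarize` are hypothetical, the token count is a crude word count, and the summarizer is a stand-in for what would presumably be a model-written summary.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (real tokenizers differ).
    return len(text.split())


def summarize(messages: list[str]) -> str:
    # Hypothetical stand-in for a model-written summary:
    # keep only the first three words of each older message.
    return "SUMMARY: " + " | ".join(" ".join(m.split()[:3]) for m in messages)


@dataclass
class CompactingContext:
    budget: int            # estimated-token threshold that triggers compaction
    keep_recent: int = 2   # newest messages that are never compacted
    messages: list[str] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(estimate_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        over_budget = self.total_tokens() > self.budget
        if over_budget and len(self.messages) > self.keep_recent:
            # Fold everything except the newest messages into one summary line.
            old = self.messages[:-self.keep_recent]
            recent = self.messages[-self.keep_recent:]
            self.messages = [summarize(old)] + recent


ctx = CompactingContext(budget=20, keep_recent=2)
for i in range(10):
    ctx.add(f"step {i} did some long piece of work here")
```

However many steps are added, the window holds one summary line plus the most recent messages, which is roughly the sense in which the work can keep going without the context growing unbounded. What gets lost in each summary is the real design question, and this sketch does not attempt to answer it.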