Andy Halliday

Speaker
3893 total appearances

Podcast Appearances

The Daily AI Show
Gemini 3 Goes Live, Bezos Backs Prometheus, and Nvidia Drops Apollo

And also we now have very competent tools seamlessly interacting with the model and bringing in web search and real-time data to augment the pre-training data at the cutoff.

So you don't really need to worry about, oh, I'm going to not have as much recent information in the model anymore if I use Gemini with this old thing, and maybe some other model has a more recent cutoff date.

That's receding in importance in my view.

So we got Gemini 3 today, and we'll be hearing much more about it in the coming days as the people who really have the time and attention to make direct comparisons with all the other major models start doing that work for us.

But yesterday, or maybe the day before, Grok 4.1 was released by xAI.

And it had already gone under the name Quasar Flux on LM Arena.

It had already gone to the number one ranking overall for user preference.

So now the question is, in LM Arena, will Gemini 3.0 bump past Grok 4.1?

Now, I don't use Grok at all.

And I've never understood why I would.

But here are some interesting points about Grok 4.1's release.

So it achieved the highest emotional intelligence score among tested systems, optimizing for personality traits like empathy and conversational tone.

So it has that going for it.

It reduced the hallucination rate from the prior model's 12% to 4% and cut factual errors by 66% compared to the prior version of Grok.

And it also saw a significant upgrade in creative writing tasks, ranking just behind ChatGPT 5.1 on the Creative Writing v3 benchmark.

So it's really a state-of-the-art model in many respects, and it was at the top of the LM Arena leaderboard, in effect for user preference on the responses that people were getting in the kind of blind comparison that you get on LM Arena.

Yeah, it was there at the top.

Somebody took a screenshot of it, I'm sure.