Jaeden Schaefer

Speaker
2075 total appearances

Podcast Appearances

AI in Business
Google Launches Gemini 3.1 and YouTube AI

With a lot of the leaderboards, basically anytime these AI companies can test their own model on a benchmark, it feels like they're cheating, like they're being scammy in some way. And I don't mean to call the kettle black, but I feel like Anthropic, Google, and OpenAI have all been caught doing some form of this over the last few years. So I don't really put as much stock in those screenshots where they're like, "We scored 72 on this exam," and it turns out they skipped a couple of the questions that the model probably didn't do well on.

Anyways, I'm not saying this is Google, but there is an AI company that has done this.

And so when it comes to these companies testing themselves, I trust them a lot less than the real-world leaderboards.

So some examples of those are when they essentially have side-by-side comparisons of their model versus another model.

And they just have people vote blindly on which response they like better.

And when a new model really starts crushing it on those types of leaderboards, I put stock in that, because this is actual people blind testing and saying that the model is better.

So that's great.

One of these real-world leaderboards comes from a company called Mercor. Their CEO, Brendan Foody, was posting about this.

He says that Gemini 3.1 Pro is now the number one model on their leaderboard, which is called the Apex Agents Leaderboard.

It's basically a benchmark that is designed to measure how well these AI systems handle professional knowledge-based tasks.

And he says this is basically just showing how quickly this can move into a lot of the systems that agents are using to improve real work.

So what's interesting to me is it feels like Google's putting a lot of stock in this field of knowledge-based tasks.

They're doing a lot with education.

And it seems like it's paying off in the benchmarks.

With this whole release, the competition with OpenAI and Anthropic is obviously really heating up.

Everyone's rolling out new systems, and it feels like they're only months apart.