8930 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 4
Confidence: High

Podcast Appearances

Deep Questions with Cal Newport
Did AI Just “Solve” Math? (Let’s Take a Closer Look) | AI Reality Check

Right.

I don't think this is the right way to get this type of result.

So I think they're using this to try to... it's a new model they're about to release.

So just like Anthropic made the claim that Mythos can find bugs that no other model could before, even though it turns out that, actually, largely they could.

so that you would be more willing to just use their Mythos model in general.

I think this is OpenAI.

I'd be like, hey, it solved the math problem.

Hopefully companies will now pay to use this model.

I think that's what's going on here.

I want to point out, for example,

Right after this OpenAI announcement, Google DeepMind put out their own paper announcing the Alpha Pro Nexus, which is a modular architecture system of the type I'm talking about in which you have LLMs tuned on math, you have proof solvers, you have really complicated control logic, you have agents and sub-agents that are systematically... The control logic, I believe, if I understand it correctly, is helping to systematically guide the types of prompts to the LLMs and what spaces to explore, and then taking the answers and running through the proof solver and giving feedback.
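The propose-verify-feedback loop described above can be illustrated with a toy sketch. This is not the system from the announcement: `propose`, `verify`, and `controller` are hypothetical names, the "LLM" is a plain generator, and the "proof solver" is a divisibility check on a factoring problem chosen only because answers are cheap to verify.

```python
from typing import Iterable, Optional


def propose(region: range, rejected: set) -> Iterable[int]:
    """Hypothetical stand-in for a math-tuned LLM: suggests untried candidates."""
    return (c for c in region if c not in rejected)


def verify(n: int, candidate: int) -> bool:
    """Hypothetical stand-in for a proof solver: accepts only checkable answers."""
    return 1 < candidate < n and n % candidate == 0


def controller(n: int, regions: list, budget: int) -> Optional[int]:
    """Control logic: picks which region to explore, checks every proposal,
    and feeds rejections back so they are not proposed again."""
    rejected = set()
    checks = 0
    for region in regions:
        for candidate in propose(region, rejected):
            if checks >= budget:
                return None  # out of budget, report the problem as unsolved
            checks += 1
            if verify(n, candidate):
                return candidate
            rejected.add(candidate)  # feedback to the proposer
    return None


print(controller(91, [range(2, 5), range(5, 10)], budget=20))
```

The point of the sketch is only the division of labor: the proposer can be wrong as often as it likes, because nothing is reported as solved unless the verifier accepts it.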

And this is the new, cutting-edge modular architecture.

They just announced they ran it on 353 open problems, open Erdős problems, the same type of problems this was from, and it solved nine of them.

The other ones it couldn't solve, but it solved nine of them.

And it was pretty cheap to run.

These are small models.

These are not 20-trillion-parameter general reasoning LLMs.

So that probably is the way to do math.

This doesn't discount the importance of the problem.

I'm just saying the fact that it's a pure LLM is new, but maybe not that important.