Cal Newport
Right.
I don't think this is the right way to get this type of result.
So I think they're using this to try to... it's a new model they're about to release.
So just like Anthropic made the claim that Mythos can find bugs that no other model could before, even though it turns out that actually, largely, they could, so that you would be more willing to just use their Mythos model in general.
I think this is OpenAI being like, hey, it solved the math problem.
Hopefully companies will now pay to use this model.
I think that's what's going on here.
I want to point out, for example, that right after this OpenAI announcement, Google DeepMind put out their own paper announcing the Alpha Pro Nexus, which is a modular architecture system of the type I'm talking about, in which you have LLMs tuned on math, you have proof solvers, you have really complicated control logic, you have agents and sub-agents that are systematically...
The control logic, I believe, if I understand it correctly, is helping to systematically guide the types of prompts to the LLMs and what spaces to explore, and then taking the answers and running them through the proof solver and giving feedback.
And this is the new, cutting-edge modular architecture.
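[Editor's sketch: the loop described here (control logic picks a region to explore, prompts a proposer, runs the answer through a checker, and feeds the result back) can be illustrated with a toy example. This is not the system from the paper. Every name below (`run_controller`, `toy_proposer`, `toy_verifier`, `Feedback`) is hypothetical, the "LLM" is a bisection routine, and the "proof solver" is a check that a number squares to a target.]

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Region = Tuple[int, int]  # hypothetical stand-in for "a space to explore"


@dataclass
class Feedback:
    ok: bool
    message: str


History = List[Tuple[int, Feedback]]


def run_controller(
    propose: Callable[[Region, History], Optional[int]],
    verify: Callable[[int], Feedback],
    regions: List[Region],
    budget_per_region: int,
) -> Optional[int]:
    """Toy control logic: for each region, ask the proposer for a candidate,
    check it with the verifier, and pass the verifier's feedback back in."""
    for region in regions:
        history: History = []
        for _ in range(budget_per_region):
            candidate = propose(region, history)
            if candidate is None:  # proposer has exhausted this region
                break
            feedback = verify(candidate)
            if feedback.ok:
                return candidate
            history.append((candidate, feedback))
    return None


def toy_proposer(region: Region, history: History) -> Optional[int]:
    """Stand-in for the model: bisects the region using past feedback."""
    lo, hi = region
    for cand, fb in history:
        if fb.message == "too small":
            lo = max(lo, cand + 1)
        elif fb.message == "too large":
            hi = min(hi, cand - 1)
    return None if lo > hi else (lo + hi) // 2


def toy_verifier(candidate: int, target: int = 1369) -> Feedback:
    """Stand-in for the proof checker: accepts only an exact square root."""
    square = candidate * candidate
    if square == target:
        return Feedback(True, "verified")
    return Feedback(False, "too small" if square < target else "too large")


result = run_controller(toy_proposer, toy_verifier, [(0, 20), (21, 100)], 8)
print(result)  # prints 37
```

[The point of the sketch is only the division of labor: the proposer never decides whether an answer is correct, the verifier does, and the controller decides where to look next.]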
They just announced they ran it on 353 open problems, open Erdős problems, the same type of problem this one was from, and it solved nine of them.
The other ones it couldn't solve, but it solved nine of them.
And it was pretty cheap to run.
These are small models.
These are not 20-trillion-parameter general reasoning LLMs.
So that probably is the way to do math.
This doesn't discount the importance of the problem.
I'm just saying the fact that it's a pure LLM is new, but maybe not that important.