Dr. Alexander Wissner-Gross
Podcast Appearances
But the downside is, at least certainly with the earlier set of the xAI Grok models, they really smell like they've been benchmarked on a few hand-curated benchmarks.
And I don't know whether that's in fact the ground truth behind the scenes, but reading between the lines of the Elon quote that it was built incorrectly the first time, something like that would be my suspicion.
And now that there is new leadership and, as we talked about on the last episode, the VP heading Starlink at SpaceX is now the president of xAI and gutting the engineering team, I would expect that they're taking a look at making sure (this is purely speculative, admittedly) that benchmarking for particular benchmarks isn't what happens.
And I think in this era of general reasoning models, where some would say Meta's new models, the first under Alexandr Wang's leadership, maybe have a bit of a smell of, what, data orientation, data-oriented fine-tuning versus reasoning-model orientation, xAI, if it wants to stay in the frontier, which right now is three labs plus xAI plus Meta (question mark, question mark), really can't afford to not have the world's strongest reasoning models and can't afford to just benchmax to vanity benchmarks anymore.
I mean, to Elon's credit, at least he's being transparent about the number of parameters in the models.
The other frontier labs, by and large, no longer report the number of parameters in the models.
So I think there are a few things that are worth noting here.
One is that he's going up to 10 trillion parameters.
The other frontier labs, certainly the top three-ish, no longer report whether they go up to 10-trillion-parameter models.
For example, in the last episode, we were talking quite a bit about Mythos.
I don't know how many parameters are in the Mythos model.
I could speculate based on cost, but I just don't know the ground truth.
So I do think knowing that we're now going up to 10 trillion versus 1 trillion, where historically approximately 1 trillion was the widely reported soft ceiling, or one-and-a-half-trillion-ish soft ceiling, on the number of parameters, I think this is an important element of transparency.
I think it's also, at the same time, worth noting, now that we have access (thank you, Elon) to the number of parameters, that the ceiling in terms of the number of parameters is very much intact after all of this time.
The fact that an aspirational frontier lab is still maxing out at 10 trillion parameters means that the parameter scaling race seems to be over.
If it had continued, remember, for a while there, as with the clock-speed scaling race, sort of ending early in the mid-2000s or late '90s, depending on how you count.