Rob Wiblin
And right now, companies do release benchmark results when they release models.
So they say, for example, Claude Opus 4 was released, and it has a model card that says it got this score on this hacking benchmark, this score on the software engineering benchmark, and so on, as part of a report about whether it's dangerous.
Or GPT-5 had the same thing.
I think it's great that they do that.
But in my ideal world, they would release their highest internal benchmark scores on some regular calendar cadence.
So every three months, they would say: we've achieved this score on this hacking benchmark, this score on the software engineering benchmark, this score on an autonomy benchmark.
And that's because, as you said, danger could manifest from purely internal deployment.
Because if they have an AI agent that's sufficiently good at AI R&D, they could use that to go much faster internally.
And then other capabilities and therefore other risks might come online much faster than people were previously expecting.
So it's not ideal to have your report card for the model come out when you release it to the public, unless there's some sort of guarantee that you're not sitting on a product that's substantially more powerful than the public product.
So maybe it's fine to release your model card and system card along with the product if you also separately have a guarantee that there won't be too much of a gap between the internal and external models.
So that's on the end of things that are currently discussed: it's how I would tweak the information that's already reported to make it somewhat more helpful for this concern.
But then there's a bunch of other stuff that is not currently reported that I think would ideally be really great to know.
Stuff like: how much, and how, are they using AI systems internally?
So one thing I'm very interested in: companies will sometimes report, kind of as a brag, the percentage of lines of code that are written by their AI systems.
Various CEOs have said things like, internally, 90% of our lines of code are written by AIs.
I think it'd be great to have systematic reporting of those kinds of metrics, but they aren't the ideal metric I'd be interested in.