Dr. Richard Moulange
Podcast Appearances
AI did much, much better.
So back in early 2025, when the paper was released, OpenAI's best model at the time, one of the o-series models, o1, o3, I think it was o3, got something like 45%.
The best AI systems were scoring double what top virology experts scored when answering in their own area of expertise about these tacit knowledge problems.
Why has this petri dish gone wrong?
Or what is going on in this experiment that doesn't make sense?
This is huge because this put paid to the claim that tacit knowledge barriers would always and inevitably be something that could never be overcome.
The eval doesn't answer everything about tacit knowledge.
You're quite right.
You talked about holding a pipette or how to sort of pour a particular kind of gel.
These are sort of very physical things that are not easy to test in an eval.
But the test really does get at an awful lot of difficult knowledge that humans themselves say is a huge blocker on modern state-of-the-art work.
And we know these are blockers because the experts didn't do very well.
And models could do much, much better.
Yes and no.
It moved certain people in the community a lot.
And people really woke up to, oh, we thought it would be a few years until this tacit knowledge thing really started kicking in.
Oh, it looks like we're here already.
And I'll note it's not just that AI has been much better than individual experts.
They even went back and got teams of experts together.
And the teams still weren't as good as the best AI.