Tamay Besiroglu
Podcast Appearances
I think if you look at their capability profile and compare it to a random job in the economy, I agree they are better at doing the sort of coding tasks that would be involved in R&D than they are at a random job in the economy.
But in absolute terms, I don't think they're that good.
I think they are good at things that maybe impress us about human coders.
If you wanted to see what makes a person a really impressive coder, you might look at their competitive programming performance.
I mean, in fact, companies often hire people based on, if they're relatively junior, based on their performance on these kinds of problems.
But that is just impressive in the human distribution.
So if you look in absolute terms at what are the skills you need to actually automate the process of being a researcher, then what fraction of those skills do the AI systems actually have, even in coding?
Like a lot of coding is you have a very large code base you have to work with.
The instructions are very kind of vague.
For example, you mentioned a METR eval in which, because they needed to make it an eval, all the tasks have to be kind of compact and closed and have clear instructions and evaluation metrics, like here's a model, get its loss on this data set as low as possible, or whatever.
Or here's another model and its embedding matrix has been scrambled, just fix it to recover most of its original performance, et cetera.
Those are not problems that you actually work on in AI R&D.
They're very artificial problems.
Now, if a human was good at doing those problems, you would infer, I think logically, that that human is likely to actually be a good researcher.
But if an AI is able to do them, you can't make the same inference, because the AI lacks so many other competences that a human would have, not just a researcher but any ordinary human, competences that we don't think about in the process of research.
So our view would be automating research is, first of all, more difficult than people give it credit for.
I think you need more skills to do it, and definitely more than models are displaying right now.