Adam Brown
solve the maths question.
That tends to be the typical structure of these problems.
So you need to be able to do both.
The bit that maybe only LLMs can do, and that wouldn't be so easy for other systems, is step one: turning the question into a maths problem.
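As a rough illustration of that two-step structure (a minimal sketch, not anyone's actual system: `ask_llm` is a hypothetical stand-in for whatever model API you use, and sympy handles the solving step):

```python
# Sketch of the two-step structure: (1) the LLM turns a word problem
# into a maths problem, (2) a conventional solver finishes it.
import sympy

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for any chat-completion API.
    raise NotImplementedError("plug in your model API here")

def solve_word_problem(question: str):
    # Step 1: the part only an LLM does well -- formalisation.
    equation_text = ask_llm(
        "Rewrite this word problem as a single equation in x, "
        "with no commentary:\n" + question
    )
    # Step 2: the part any computer algebra system can do -- solving.
    lhs, rhs = equation_text.split("=")
    x = sympy.Symbol("x")
    return sympy.solve(sympy.Eq(sympy.sympify(lhs), sympy.sympify(rhs)), x)
```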
I think if you ask them hard research problems, you certainly can come up with problems that they can't solve.
That's for sure.
But it's pretty noticeable, as we have tried to develop evaluations for these models, how much that has changed. As recently as a couple of years ago, certainly three years ago, you could just scrape from the internet any number of totally standard high school maths problems that they couldn't do.
And now we need to hire PhDs in whatever field, and they come up with one great problem a day or something.
As these LLMs have got stronger, the difficulty of evaluating their performance has increased.
Generally, you see positive transfer between domains: if you make them better at one thing, they become better at other things too.
It is possible to make a model that is really, really good at one very particular thing you care about, and at some stage you hit a Pareto frontier and start degrading performance on other metrics.
But generally speaking, there's positive transfer between abilities across all domains.
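A toy way to picture that trade-off (a minimal sketch under made-up task names and weights, not any lab's training recipe): the objective is a weighted sum of per-task losses, and pushing one weight to an extreme is where the Pareto frontier starts to bite.

```python
# Toy multi-task objective: a weighted sum of per-task losses.
# Balanced weights tend to give positive transfer via shared
# representations; extreme weights trade the other tasks away.
import torch

def multitask_loss(losses, weights):
    # Sum of weight * loss over every task in the training mixture.
    return sum(weights[t] * losses[t] for t in losses)

losses = {"maths": torch.tensor(0.9),
          "coding": torch.tensor(1.1),
          "prose": torch.tensor(0.7)}

balanced = multitask_loss(losses, {"maths": 1.0, "coding": 1.0, "prose": 1.0})
lopsided = multitask_loss(losses, {"maths": 10.0, "coding": 0.1, "prose": 0.1})
```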
People are trying that. There's an effort by Shirley Ho at Flatiron that is basically that exact plan: they take the pipeline of all the data that comes out of these astronomical observatories, plug it into a transformer, and see what happens.
You can come up with all sorts of reasons in advance why that might not work, but you could also have come up with reasons in advance why large language models wouldn't work, and yet they do.
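To make "plug them into a transformer" concrete, here is a minimal sketch assuming the observatory output has been reduced to fixed-length numeric sequences; all names, shapes, and hyperparameters are illustrative, not the Flatiron group's actual pipeline:

```python
# Rough sketch of feeding astronomical time series into a standard
# transformer encoder. Shapes and names are illustrative only.
import torch
import torch.nn as nn

class AstroTransformer(nn.Module):
    def __init__(self, n_features: int = 16, d_model: int = 128):
        super().__init__()
        # Project raw per-timestep measurements (flux, position, ...)
        # into the model dimension, in place of a text tokeniser.
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_features)  # next-step prediction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features) sequence of observations
        return self.head(self.encoder(self.embed(x)))

model = AstroTransformer()
batch = torch.randn(2, 100, 16)   # two mock observation sequences
prediction = model(batch)         # same shape: per-step predictions
```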