Francois Chollet
And does that pattern keep continuing as you keep getting bigger and bigger?
To the extent that the new patterns you're giving the model to learn are a good match for what it has learned before.
If you present something that's actually novel, that is not in its training distribution, like an ARC puzzle, for instance, it will fail.
Possibly.
Why not?
But you know, if these models were actually capable of synthesizing novel programs, however simple, they should be able to do ARC.
Because for any ARC task, if you write down the solution program in Python, it's not a complex program.
It's extremely simple.
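As an illustration of what such a solution program looks like, here is a short Python sketch for a hypothetical ARC-style task (invented for this example, not drawn from the actual test set), using the list-of-lists grid representation the benchmark uses:

```python
# Hypothetical ARC-style task (illustrative only):
# "Every row containing a non-zero cell gets entirely filled with that value."
# Grids are lists of lists of ints, matching the ARC JSON format.

def solve(grid):
    out = [row[:] for row in grid]  # copy the input grid
    for r, row in enumerate(grid):
        marks = [v for v in row if v != 0]
        if marks:
            # Flood the whole row with the first non-zero value found.
            out[r] = [marks[0]] * len(row)
    return out

example_in = [
    [0, 0, 3],
    [0, 0, 0],
    [7, 0, 0],
]
print(solve(example_in))
# [[3, 3, 3], [0, 0, 0], [7, 7, 7]]
```

The point being made in the transcript is exactly this: the transformation rule is a few lines of straightforward code once you have spotted it.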
And humans can figure it out.
So why can LLMs not do it?
Quite possibly, yes.
Honestly, what I would like to see is an LLM-type model solving ARC at, like, 80%, but after having only been trained on core-knowledge-related stuff.
Let me rephrase that.
Only trained on information that is not explicitly trying to anticipate what's going to be in the ARC test set.
Yes, that is the point.
So if ARC were a perfect, flawless benchmark, it would be impossible to anticipate what's in the test set.
And, you know, ARC was released more than four years ago, and so far it's been resistant to memorization.
So I think it has, to some extent, passed the test of time.
But I don't think it's perfect.