Steven Byrnes
I bring these up because I think the LLM-focused discourse sometimes has far too narrow a notion of what problem continual learning is supposed to be solving.
They tend to think the problem is about losing track of information, not failing to build new knowledge, and they propose to solve it with strategies like making the context window longer, as Dario Amodei recently mused, or better scratchpads with retrieval-augmented generation (RAG), and so on.
But real continual learning also includes the ways that AlphaZero changes after a million games of self-play or the ways that a human brain changes after 20 years in a new career.
There is no system of scratchpads that you can give to a 15-year-old such that it would be an adequate substitute for them spending the next 20 years growing into a 35-year-old world expert in some field.
Likewise, there is no context window that can turn GPT-2 into GPT-5.
Suppose you took an actual country of geniuses in a data center, completely sealed them from the outside world, and gave them a virtual reality environment to hang out in for the equivalent of 100 years.
What would you find when you unsealed it?
There would be whole new ways of thinking about the world and everything in it.
Entirely new fields of science, schools of philosophy, and so on.
Can a bunch of LLMs do that?
Well, consider this thought experiment.
Suppose you take a whole new field of science, wildly different from anything in the training data, and put a giant textbook for this field purely in an LLM context window, with no weight updates at all.
Will this LLM be able to understand, criticize, and build on this field?
My opinion is: absolutely not. This implies that merely increasing context lengths is definitely not sufficient for a real country of geniuses in a data center, when the data center is sealed shut for the equivalent of 100 years, contra Dario, who seems to think it's at least in the realm of possibility that more context by itself is sufficient to get continual learning at a country-of-geniuses level.
If we're talking about what a sealed country of human geniuses could do over the course of, like, one minute, rather than over the course of 100 years, then, yeah, sure, maybe that could be reproduced with future LLMs.
See von Oswald et al. 2022 on how so-called in-context learning can imitate a small number of steps of actual weight updates.
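To make that citation concrete: one of the von Oswald et al. constructions shows that a single linear self-attention layer, with hand-set weights, can produce the same prediction as one step of gradient descent on an in-context linear regression task. Here is a minimal toy sketch of that equivalence; the dimensions, learning rate `eta`, and random data are illustrative assumptions, not from the transcript.

```python
import numpy as np

# Toy illustration (assumed setup, not the paper's exact experiment):
# one gradient-descent step on in-context linear regression equals an
# unnormalized linear-attention readout with keys x_i, values y_i, query x_q.

rng = np.random.default_rng(0)
d, n, eta = 4, 32, 0.1          # feature dim, number of in-context examples, step size

X = rng.normal(size=(n, d))     # in-context example inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true                  # in-context example targets y_i
x_q = rng.normal(size=d)        # query input

# One GD step on L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2, starting from w = 0:
#   w_1 = (eta / n) * sum_i y_i * x_i
w_1 = (eta / n) * (X.T @ y)
gd_pred = w_1 @ x_q             # prediction after one weight update

# The same number via linear attention: scores are dot products x_i . x_q,
# and the output is the score-weighted sum of the values y_i.
scores = X @ x_q
attn_pred = (eta / n) * (scores @ y)

print(gd_pred, attn_pred)
assert np.isclose(gd_pred, attn_pred)
```

The two computations are algebraically identical, which is the sense in which "in-context learning" here imitates a weight update: the weights of the network never change, but the forward pass computes what one small gradient step would have predicted.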
Why real continual learning can't be copied by an imitation learner