Steven Byrnes
I bring these up because I think the LLM-focused discourse sometimes has far too narrow a notion of what problem continual learning is supposed to be solving.
They tend to think the problem is about losing track of information, not failing to build new knowledge, and they propose to solve it with strategies like making the context window longer, as Dario Amodei recently mused, or better scratchpads with retrieval-augmented generation (RAG), and so on.
But real continual learning also includes the ways that AlphaZero changes after a million games of self-play or the ways that a human brain changes after 20 years in a new career.
There is no system of scratchpads that you can give to a 15-year-old such that it would be an adequate substitute for them spending the next 20 years growing into a 35-year-old world expert in some field.
Likewise, there is no context window that can turn GPT-2 into GPT-5.
Suppose you took an actual country of geniuses in a data center, completely sealed them from the outside world, and gave them a virtual reality environment to hang out in for the equivalent of 100 years.
What would you find when you unsealed it?
There would be whole new ways of thinking about the world and everything in it.
Entirely new fields of science, schools of philosophy, and so on.
Can a bunch of LLMs do that?
Well, consider this thought experiment.
Suppose you take a whole new field of science, wildly different from anything in the training data, and put a giant textbook for this field purely in an LLM context window, with no weight updates at all.
Will this LLM be able to understand, criticize, and build on this field?
My opinion is: absolutely not. This implies that merely increasing context lengths is definitely not sufficient for a real country of geniuses in a data center, when the data center is sealed shut for the equivalent of 100 years, contra Dario, who seems to think it's at least in the realm of possibility that more context by itself is sufficient to get continual learning at a country-of-geniuses level.
If we're talking about what a sealed country of human geniuses could do over the course of, like, one minute, rather than over the course of 100 years, then, yeah, sure, maybe that could be reproduced with future LLMs.
See von Oswald et al. 2022 on how so-called in-context learning can imitate a small number of steps of actual weight updates.
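To make that citation concrete: one of the von Oswald et al. constructions shows that a single linear self-attention layer, with hand-set weights, can produce the same prediction as one step of gradient descent on an in-context linear regression task. Here is a minimal toy sketch of that equivalence; the dimensions, learning rate `eta`, and random data are illustrative assumptions, not from the transcript.

```python
import numpy as np

# Toy illustration (assumed setup, not the paper's exact experiment):
# one gradient-descent step on in-context linear regression equals an
# unnormalized linear-attention readout with keys x_i, values y_i, query x_q.

rng = np.random.default_rng(0)
d, n, eta = 4, 32, 0.1          # feature dim, number of in-context examples, step size

X = rng.normal(size=(n, d))     # in-context example inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true                  # in-context example targets y_i
x_q = rng.normal(size=d)        # query input

# One GD step on L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2, starting from w = 0:
#   w_1 = (eta / n) * sum_i y_i * x_i
w_1 = (eta / n) * (X.T @ y)
gd_pred = w_1 @ x_q             # prediction after one weight update

# The same number via linear attention: scores are dot products x_i . x_q,
# and the output is the score-weighted sum of the values y_i.
scores = X @ x_q
attn_pred = (eta / n) * (scores @ y)

print(gd_pred, attn_pred)
assert np.isclose(gd_pred, attn_pred)
```

The two computations are algebraically identical, which is the sense in which "in-context learning" here imitates a weight update: the weights of the network never change, but the forward pass computes what one small gradient step would have predicted.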
Why real continual learning can't be copied by an imitation learner