
Stephen McAleese


LessWrong (Curated & Popular)
"IABIED Book Review: Core Arguments and Counterarguments" by Stephen McAleese

Quote

The popular 2006 textbook Pattern Recognition and Machine Learning uses a simple example from polynomial regression.

There are infinitely many polynomials of order equal to or greater than the number of data points which interpolate the training data perfectly, and almost all such polynomials are terrible at extrapolating to unseen points.
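The textbook's point can be reproduced in a few lines. The following is an illustrative sketch (not from the review itself), using NumPy to fit a degree-9 polynomial through 10 noisy points: training error is essentially zero, but the fit is wildly wrong outside the training interval.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy training points drawn from a simple underlying function.
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.1, size=10)

# A degree-9 polynomial has 10 coefficients, so it can pass through
# all 10 training points almost exactly (near-zero training error)...
coeffs = np.polyfit(x_train, y_train, deg=9)
train_err = np.max(np.abs(np.polyval(coeffs, x_train) - y_train))

# ...but it behaves wildly outside the training interval.
x_test = np.array([1.2, 1.5])
test_err = np.max(np.abs(np.polyval(coeffs, x_test) - np.sin(2 * np.pi * x_test)))

print(f"max train error: {train_err:.3g}")  # tiny
print(f"max test error:  {test_err:.3g}")   # large
```

The specific function and noise level are arbitrary choices for illustration; any degree-(n-1) interpolant of n noisy points shows the same behavior.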

End quote.

However, in practice large neural networks trained with SGD reliably generalize.

Counting the number of possible models is irrelevant because it ignores the inductive bias of the optimizer and the loss landscape which favor simpler, generalizing models.

While there are theoretically a vast number of bad, overfitting models, they usually exist in sharp and isolated regions of the landscape.

Good, generalizing models typically reside in flat regions of the loss landscape, where small changes to the parameters don't significantly increase error.

An optimizer like SGD doesn't pick a model at random.

Instead, it tends to be pulled into a vast, flat basin of attraction while avoiding the majority of non-generalizing solutions.
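One concrete mechanism behind this bias (an illustrative toy model, not the review's own argument, and related to but simpler than SGD's noise): gradient descent with a fixed step size is unstable in a sharp minimum, where the large curvature makes each step overshoot, but settles happily into a flat one.

```python
def gd(grad, x0, lr=0.02, steps=50):
    """Plain gradient descent with a fixed step size."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def sharp_grad(x):
    return 200.0 * x  # gradient of the sharp loss 100 * x**2 (high curvature)

def flat_grad(x):
    return 0.5 * x    # gradient of the flat loss 0.25 * x**2 (low curvature)

# Same optimizer, same step size, same starting point.
x_sharp = gd(sharp_grad, x0=0.1)  # lr * curvature > 2: iterates blow up
x_flat = gd(flat_grad, x0=0.1)    # lr * curvature < 2: converges toward 0

print(f"|x| after 50 steps, sharp minimum: {abs(x_sharp):.3g}")
print(f"|x| after 50 steps, flat minimum:  {abs(x_flat):.3g}")
```

The divergence threshold is the classic stability condition lr × curvature < 2 for gradient descent on a quadratic; the loss functions and constants here are made up for illustration.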

Additionally, larger networks generalize better because of the blessing of dimensionality.

High dimensionality increases the relative volume of flat, generalizing minima, biasing optimizers toward them.
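The volume claim can be made concrete with back-of-the-envelope arithmetic: if a flat basin is wider than a sharp one along each parameter direction, the ratio of their volumes grows exponentially with the number of parameters. A hypothetical sketch with made-up widths:

```python
# Model each basin as a box whose side length is its width in every
# parameter direction (a deliberate simplification; the widths below
# are made-up illustrative numbers, not measured quantities).
flat_width, sharp_width = 2.0, 1.0

for d in (1, 10, 100):
    ratio = (flat_width / sharp_width) ** d
    print(f"d = {d:>3}: flat/sharp volume ratio = {ratio:.3g}")
```

Even a modest 2:1 width advantage per dimension yields a volume ratio of 2^100 ≈ 10^30 at 100 parameters, which is the sense in which high dimensionality favors flat minima.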

This phenomenon contradicts the counting argument, which predicts that larger models, with more possible bad models, would be less likely to generalize.

This argument is based on an ML analogy which I'm not sure is highly relevant to AI alignment.

Still, I think it's interesting because it shows that intuitive theoretical arguments that seem correct can still be completely wrong.

I think the lesson is that real-world evidence often beats theoretical models, especially for new and counterintuitive phenomena like neural network training.

Subheading: Arguments based on the aligned behavior of modern LLMs.

One of the most intuitive arguments against AI alignment being difficult is the abundant evidence of helpful, polite, and aligned behavior from large language models (LLMs) such as GPT-5.

For example, the authors of the essay "AI is easy to control" use the moral reasoning capabilities of GPT-4 as evidence that human values are easy to learn and deeply embedded in modern AIs.

The moral judgments of current LLMs already align with common sense to a high degree, and LLMs usually show an appropriate level of uncertainty when presented with morally ambiguous scenarios.

This strongly suggests that, as an AI is being trained, it will achieve a fairly strong understanding of human values well before it acquires dangerous capabilities like self-awareness, the ability to autonomously replicate itself, or the ability to develop new technologies.