Stephen McAleese
Quote
The popular 2006 textbook Pattern Recognition and Machine Learning uses a simple example from polynomial regression.
There are infinitely many polynomials of order equal to or greater than the number of data points which interpolate the training data perfectly, and almost all such polynomials are terrible at extrapolating to unseen points.
End quote.
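The textbook's point can be reproduced in a few lines. The following is my own sketch (not an example from Bishop's book): fit a degree-9 polynomial through 10 samples of a sine wave, so the fit interpolates the training data essentially exactly, then evaluate it outside the training interval.

```python
import numpy as np

# Sketch of interpolation vs. extrapolation (illustrative, not from the
# textbook): a degree-9 polynomial through 10 points matches the training
# data to near machine precision but diverges outside [0, 1].
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train)

# 10 points and degree 9 means the fit is an exact interpolant.
coeffs = np.polyfit(x_train, y_train, deg=9)
poly = np.poly1d(coeffs)

train_error = np.max(np.abs(poly(x_train) - y_train))    # tiny: interpolation
extrap_error = abs(poly(2.0) - np.sin(2 * np.pi * 2.0))  # large: bad extrapolation
```

The training error is essentially zero while the error at x = 2.0 is enormous, which is exactly the failure mode the quote describes.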
However, in practice large neural networks trained with SGD reliably generalize.
Counting the number of possible models is irrelevant because it ignores the inductive bias of the optimizer and the loss landscape, which together favor simpler, generalizing models.
While there are theoretically a vast number of bad, overfitting models, they usually exist in sharp and isolated regions of the landscape.
Good, generalizing models typically reside in flat regions of the loss landscape, where small changes to the parameters don't significantly increase error.
An optimizer like SGD doesn't pick a model at random.
Instead, it tends to be pulled into a vast, flat basin of attraction while avoiding the majority of non-generalizing solutions.
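The flat-versus-sharp distinction can be made concrete with a toy landscape. This is my own construction, not from the original text: two minima with identical training loss, one in a wide basin and one in a narrow one, probed with small random parameter perturbations.

```python
import numpy as np

# Toy double-well loss (illustrative construction): a flat basin at w = -2
# (low curvature) and a sharp basin at w = +2 (high curvature). Both minima
# achieve zero loss.
def loss(w):
    return np.minimum(0.1 * (w + 2) ** 2, 10.0 * (w - 2) ** 2)

rng = np.random.default_rng(0)
eps = rng.normal(0.0, 0.1, size=1000)  # small random parameter perturbations

flat_rise = np.mean(loss(-2.0 + eps))  # average loss increase at the flat minimum
sharp_rise = np.mean(loss(2.0 + eps))  # average loss increase at the sharp minimum
```

The same perturbation that barely moves the loss in the flat basin raises it by roughly two orders of magnitude more in the sharp one, which is why flat minima are the ones robust to the noise in SGD updates.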
Additionally, larger networks generalize better because of the blessing of dimensionality.
High dimensionality increases the relative volume of flat, generalizing minima, biasing optimizers toward them.
This phenomenon contradicts the counting argument, which predicts that larger models, having more possible bad configurations, would be less likely to generalize.
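The volume intuition behind the blessing of dimensionality can be sketched with a back-of-the-envelope calculation. This is my own illustration, under a crude assumption: model the flat and sharp basins as axis-aligned boxes whose per-dimension widths are 1.0 and 0.1 respectively.

```python
# Back-of-the-envelope sketch (my assumption, not a result from the text):
# if a flat basin is 10x wider than a sharp basin along every parameter
# axis, the ratio of their volumes grows exponentially with the number of
# parameters d.
FLAT_WIDTH = 1.0
SHARP_WIDTH = 0.1

def volume_ratio(d):
    # Ratio of box volumes: (flat / sharp) ** d
    return (FLAT_WIDTH / SHARP_WIDTH) ** d

# In 1 dimension the flat basin is only 10x larger by volume; in 100
# dimensions it is 10^100 times larger, so anything resembling
# volume-weighted sampling lands in the flat basin almost surely.
```

Under this toy model, adding parameters makes the flat, generalizing basins overwhelmingly dominant by volume, the opposite of what naive model-counting suggests.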
This argument is based on an ML analogy which I'm not sure is highly relevant to AI alignment.
Still, I think it's interesting because it shows that intuitive theoretical arguments that seem correct can still be completely wrong.
I think the lesson is that real-world evidence often beats theoretical models, especially for new and counterintuitive phenomena like neural network training.
Subheading: Arguments based on the aligned behavior of modern LLMs

One of the most intuitive arguments against AI alignment being difficult is the abundant evidence of helpful, polite, and aligned behavior from large language models (LLMs) such as GPT-5.
For example, the authors of the essay "AI is easy to control" use the moral reasoning capabilities of GPT-4 as evidence that human values are easy to learn and deeply embedded in modern AIs.
The moral judgments of current LLMs already align with common sense to a high degree, and LLMs usually show an appropriate level of uncertainty when presented with morally ambiguous scenarios.
This strongly suggests that, as an AI is being trained, it will achieve a fairly strong understanding of human values well before it acquires dangerous capabilities like self-awareness, the ability to autonomously replicate itself, or the ability to develop new technologies.