Stephen McAleese
An inner optimization process, such as training a model by gradient descent, then trains each AI architecture variant produced by the outer search process.
Instead, the author believes that human engineers will perform the work of the outer optimizer by manually designing learning algorithms and writing code.
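This two-level structure can be sketched as a toy bilevel search (a hypothetical illustration, not code from the post): the outer loop proposes architecture variants, here just the hidden width of a tiny network, and the inner loop trains each variant by gradient descent; the outer loop keeps whichever variant trains best.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for this sketch): learn y = 2x + 1.
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] + 1

def train_inner(width, steps=200, lr=0.1):
    """Inner optimizer: gradient descent on a one-hidden-layer tanh net."""
    W1 = rng.normal(0, 0.5, size=(1, width))
    b1 = np.zeros(width)
    W2 = rng.normal(0, 0.5, size=(width,))
    b2 = 0.0
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)              # hidden activations
        pred = h @ W2 + b2
        err = pred - y
        # Backpropagation written out by hand for the tiny net.
        gW2 = h.T @ err / len(X)
        gb2 = err.mean()
        dh = np.outer(err, W2) * (1 - h**2)
        gW1 = X.T @ dh / len(X)
        gb1 = dh.mean(axis=0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1
    h = np.tanh(X @ W1 + b1)
    return np.mean((h @ W2 + b2 - y) ** 2)    # final training loss

# Outer optimizer: search over architecture variants (hidden widths)
# and keep the variant whose inner training reaches the lowest loss.
candidates = [1, 4, 16]
losses = {w: train_inner(w) for w in candidates}
best = min(losses, key=losses.get)
```

The point of the author's argument is that, in practice, the role of this outer loop is played by human researchers choosing architectures, not by an automated search.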
The author gives three arguments for why the outer optimizer is more likely to involve human engineering than an automated search process like evolution:
1. Most learning algorithms and AI architectures developed so far (for example, SGD and transformers) were invented by human engineers rather than found by an automated optimization process.
2. Running learning algorithms and training ML models is often extremely expensive, so searching over possible learning algorithms or AI architectures, as evolution does, would be prohibitively expensive.
3. Learning algorithms are often simple (SGD, for example), making it tractable for human engineers to design them.
However, one reason why I personally find the evolution analogy relevant is that the RLHF training process widely used today appears to be a bilevel optimization process similar to evolution:
1. Like evolution optimizing the genome, the first step of RLHF is to learn a reward function from a dataset of binary preference labels.
2. This learned reward function is then used to train the final model. This step is analogous to an organism's lifetime learning, where behavior is adjusted to maximize a reward function fixed in the outer optimization stage.
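The two stages above can be sketched in miniature (a toy illustration with made-up numbers and a scalar "action", not a description of any production RLHF system): stage one fits a Bradley-Terry reward model to binary preference labels, and stage two then does gradient ascent on the frozen learned reward.

```python
import numpy as np

rng = np.random.default_rng(1)

def features(a):
    """Hand-picked features of a scalar 'action' (an assumption of this toy)."""
    return np.stack([a, a**2], axis=-1)

# Hidden 'human' reward, used only to generate preference labels.
def true_reward(a):
    return -(a - 0.7) ** 2

# --- Stage 1 (outer level): learn a reward model from preference labels ---
pairs = rng.uniform(-1, 1, size=(500, 2))
labels = (true_reward(pairs[:, 0]) > true_reward(pairs[:, 1])).astype(float)

d = features(pairs[:, 0]) - features(pairs[:, 1])
w = np.zeros(2)
for _ in range(1000):
    p = 1 / (1 + np.exp(-(d @ w)))            # Bradley-Terry preference prob
    w += 0.1 * d.T @ (labels - p) / len(pairs)  # logistic-regression step

def learned_reward(a):
    return features(a) @ w

# --- Stage 2 (inner level): adjust behavior against the frozen reward ---
a = 0.0
for _ in range(1000):
    grad = w[0] + 2 * w[1] * a                # d/da of w . [a, a^2]
    a += 0.02 * grad                          # gradient ascent on learned reward
```

The reward model is fixed once stage one ends, just as the genome is fixed over an organism's lifetime; all of stage two's adaptation happens against that frozen objective.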
Arguments against counting arguments
One argument for AI doom that I described above is a counting argument.
Because the space of misaligned goals is astronomically larger than the tiny space of aligned goals, we should expect AI alignment to be highly improbable by default.
In the post "Counting arguments provide no evidence for AI doom", the authors challenge this argument using an analogy to machine learning.
A similar counting argument can be constructed that purports to show that neural network generalization is highly unlikely.
Yet in practice, trained neural networks routinely generalize.
Before the deep learning revolution, many theorists believed that models with millions of parameters would simply memorize data rather than learn patterns.
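The failure mode of this kind of counting argument can be shown on a toy problem (a hypothetical example of my own, not the one from the post): among all functions consistent with the training data, most fail to generalize, yet a simple learner still picks the generalizing one.

```python
import itertools
import numpy as np

# Domain: all 3-bit inputs; true rule: the label is the first bit.
X = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
y = X[:, 0]

train, test = np.arange(6), np.arange(6, 8)   # hold out two inputs

# Counting argument: 2**2 = 4 labelings of the held-out inputs are each
# consistent with the training set, and only 1 of them matches the true
# rule, so "most" consistent functions fail to generalize.
n_consistent = 2 ** len(test)

# Yet a simple learner (least squares with a bias term) recovers the
# generalizing rule rather than one of the memorizing alternatives.
A = np.column_stack([X[train], np.ones(len(train))])
coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
preds = np.column_stack([X[test], np.ones(len(test))]) @ coef > 0.5
```

Counting consistent functions says nothing about which one the learner's inductive bias actually selects, which is the authors' point.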
The authors cite a classic example from regression.