Stephen McAleese
If ASI alignment is extremely difficult, we should stop ASI progress to avoid creating an ASI which would be misaligned with high probability and catastrophic for humanity in expectation.
If AI alignment is easy, we should build an ASI to bring about a futuristic utopia.
Therefore, one's belief about the difficulty of the AI alignment problem is a key crux for deciding how we should govern the future of AI development.
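To make this crux concrete, here is a minimal expected-value sketch in Python. The function name and all probabilities and utilities are illustrative placeholders of my own, not numbers from the book:

```python
# Toy expected-value model of the decision to build an ASI.
# All numbers below are illustrative placeholders, not estimates from the book.

def ev_of_building_asi(p_misaligned: float,
                       utility_utopia: float = 1.0,
                       utility_extinction: float = -10.0) -> float:
    """Expected utility of building an ASI, given P(misalignment)."""
    p_aligned = 1.0 - p_misaligned
    return p_aligned * utility_utopia + p_misaligned * utility_extinction

# If alignment is easy (low P(misalignment)), building looks positive in
# expectation; if alignment is hard, building is strongly negative.
for p in (0.01, 0.5, 0.99):
    print(f"P(misaligned) = {p:.2f} -> EV = {ev_of_building_asi(p):+.2f}")
```

Under these toy numbers, the sign of the expected value flips as P(misalignment) rises, which is exactly why the difficulty of alignment is the crux.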
Background arguments to the key claim

To avoid making this post too long, I'm going to assume that the following arguments made by the book are true.
- General intelligence is extremely powerful. Humans are the first entities to have high general intelligence, and they used it to transform the world to better satisfy their own goals.
- ASI is possible and likely to be created in the near future. The laws of physics permit an ASI to be built, and economic incentives make it likely to be built soon because doing so would be profitable.
- A misaligned ASI would cause human extinction, which would be an undesirable outcome.
- It's possible that an ASI could be misaligned and have alien goals; conversely, it's also possible to create an ASI aligned with human values (see the orthogonality thesis).
The book explains these arguments in detail in case you want to learn more about them.
I'm assuming these arguments are true because I haven't seen high-quality counter-arguments against them, and I doubt such counter-arguments exist. In contrast, the book's claim that successfully aligning an ASI with human values is difficult and unlikely to succeed seems more controversial: it is less obvious to me, and I have seen high-quality counter-arguments against it. Therefore, it's the claim I'm focusing on in this post.
The following section focuses on what I think is one of the key claims and cruxes of the book: that solving the AI alignment problem would be extremely difficult and that the first ASI would almost certainly be misaligned and harmful to humanity rather than aligned and beneficial.
The Key Claim: ASI alignment is extremely difficult to solve
The key claim of the book is that building an ASI would lead to the extinction of humanity. Why? Because the authors believe that the AI alignment problem is so difficult that we are very unlikely to successfully aim the first ASI at a desirable goal.