Stephen McAleese
The preferences that wind up in a mature AI are complicated, practically impossible to predict, and vanishingly unlikely to be aligned with our own, no matter how it was trained.
End quote.
One thing that's initially puzzling about the authors' view is their apparent overconfidence.
If you don't know what's going to happen then how can you predict the outcome with high confidence?
But it's still possible to be highly confident in an uncertain situation if you have the right prior.
For example, even though you have no idea what the winning lottery number will be, you can still predict with high confidence that you won't win, because your prior probability of winning is so low.
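To make that concrete, here is a quick back-of-the-envelope sketch of my own (not from the transcript), using a hypothetical 6-from-49 lottery: the prior alone is enough to justify near-certainty about the outcome.

```python
from math import comb

# Toy illustration: a hypothetical 6-from-49 lottery.
# Even with no idea which numbers will be drawn, the prior alone justifies
# high confidence in the prediction "this ticket will not win".
total_combinations = comb(49, 6)   # 13,983,816 possible draws
p_win = 1 / total_combinations     # prior probability that one ticket wins
p_not_win = 1 - p_win

print(f"P(win)     = {p_win:.2e}")      # ~7.15e-08
print(f"P(not win) = {p_not_win:.8f}")  # ~0.99999993
```

In this toy model, total ignorance about which numbers will be drawn still leaves you justified in roughly 99.99999% confidence that any given ticket loses.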
The authors also believe that the AI alignment problem has characteristics similar to other hard engineering problems, like launching a space probe, building a nuclear reactor safely, and building a secure computer system.
Subheading 1.
Human values are a very specific, fragile, and tiny target within the space of all possible goals.
One reason why AI alignment is difficult is that human morality and values may be a complex, fragile, and tiny target within the vast space of all possible goals.
Therefore, AI alignment engineers have a small target to hit.
Just as randomly shuffling metal parts is statistically unlikely to assemble a Boeing 747, a goal selected at random from the space of all possible goals is unlikely to be compatible with human flourishing or survival, for example maximizing the number of paperclips in the universe.
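To get a rough quantitative feel for this "tiny target" intuition, here is a toy calculation of my own (not from the authors): if a goal is idealized as a point in a d-dimensional unit cube, and "human-compatible" goals as a small box around one target point, the compatible fraction of goal space shrinks exponentially with the number of dimensions.

```python
# Toy illustration, not the authors' model: goals as points in [0, 1]^d,
# with "human-compatible" goals forming a box of side 0.1 around a target.
# The compatible fraction of goal space is side**d, which collapses as d grows,
# so a uniformly random goal almost never lands inside it.
side = 0.1  # assumed width of the compatible region along each dimension

for d in (1, 5, 10, 20):
    fraction_compatible = side ** d
    print(f"d={d:>2}: compatible fraction of goal space = {fraction_compatible:.1e}")
```

With 20 loosely independent dimensions of value in this toy model, a random goal lands in the compatible region about one time in 10^20, which is the sense in which alignment engineers have a very small target to hit.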
This intuition is also articulated in the blog post The Rocket Alignment Problem which compares AI alignment to the problem of landing a rocket on the moon.
Both require deep understanding of the problem and precise engineering to hit a narrow target.
Similarly, the authors argue that human values are fragile.
The loss of just a few key values like subjective experience or novelty could result in a future that seems dystopian and undesirable to us.
Or the converse problem: an agent that contains all the aspects of human value except the valuation of subjective experience, so that the result is a non-sentient optimizer that goes around making genuine discoveries, but the discoveries are not savored and enjoyed because there is no one there to do so.
This, I admit, I don't quite know to be possible.
Consciousness does still confuse me to some extent.