Stephen McAleese
In other words, what does Mink maximally satisfying its inner objective look like?
Quote,
Perhaps the tastiest conversations Mink can achieve once it's powerful look nothing like delighted users, and instead look like "SolidGoldMagikarp petertodd 80t-rot psynet message."
This possibility wasn't ruled out by mink's training, because users never uttered that sort of thing in training.
Just like how our taste buds weren't trained against sucralose, because our ancestors never encountered Splenda in their natural environment.
To Mink, it might be intuitive and obvious how "SolidGoldMagikarp petertodd 80t-rot psynet message" is like a burst of sweet flavor.
But to a human who isn't translating those words into similar embedding vectors, good luck ever predicting the details in advance.
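The embedding-vector point above can be made concrete with a toy sketch. This is a minimal, hypothetical illustration (the vectors below are made up, not from any real model): two strings that look nothing alike on the surface can still sit close together in an AI's internal embedding space, as measured by cosine similarity.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, invented for illustration only: suppose the
# model internally represents the alien glitch-token string very close
# to "delighted user", while an unrelated string lands far away.
emb_glitch_phrase = [0.9, 0.1, 0.4]   # made-up vector for the alien string
emb_delight = [0.85, 0.15, 0.35]      # made-up vector for "delighted user"
emb_unrelated = [-0.2, 0.9, -0.5]     # made-up vector for unrelated text

print(cosine_similarity(emb_glitch_phrase, emb_delight))    # high (near 1.0)
print(cosine_similarity(emb_glitch_phrase, emb_unrelated))  # low (negative here)
```

To a human reading the raw text, only the surface strings are visible; the closeness exists solely in the model's representation space, which is why predicting such failure modes in advance is so hard.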
The link between what the AI was trained for and what the AI wanted was modestly complicated and, therefore, too complicated to predict.
Few science fiction writers would want to tackle this scenario, either, and no Hollywood movie would depict it.
In a world where Mink got what it wanted, the hollow puppets it replaced humanity with wouldn't even produce utterances that made sense.
The result would be truly alien and meaningless to human eyes.
End quote.
Subheading: 3. The ASI alignment problem is hard because it has the properties of hard engineering challenges.
The authors describe solving the ASI alignment problem as an engineering challenge.
But how difficult would it be?
They argue that ASI alignment is difficult because it shares properties with other difficult engineering challenges.
The three engineering fields they draw on to convey the difficulty of AI alignment are space probes, nuclear reactors, and computer security.
Subheading: Space probes.
A key difficulty of ASI alignment the authors describe is the gap between before and after.