Narrator (TYPE III AUDIO)

LessWrong (Curated & Popular)
“The Pando Problem: Rethinking AI Individuality” by Jan_Kulveit

What LLMs do in artificial-sounding ethical dilemmas is not necessarily very informative.

Conversely, anthropomorphic identity assumptions might also lead us to underestimate the subtle forms of emergent cooperation and implicit goal-directedness that can arise either if we adopt the broader meanings of individuals or selves, or from AIs implicitly coordinating with AIs just like them through shared underlying assumptions, training, goals, or predictive structures.

Heading: Coordinating selves.

Many AI safety schemes are loosely based on the assumption that you can model AIs as roughly game-theoretic agents and you can make them play against each other.

However, what you get from game theory depends heavily on how you identify the players.

To get a visceral sense of why, I recommend this experiment.

Pick up two toothpicks, one in each hand.

Now, make your right hand sword-fight your left hand.

For each hit, the respective hand-player scores a point.

If you actually try this, and your experience is similar to mine, you will find it hard to turn it into an actual fight.

The part of the predictive processing substrate playing the right hand has too much information about the left hand, and vice versa.

The closest thing to an actual fight is when the hands move almost chaotically.

In contrast, what is easy to do is to stage a fight where a centrally planned choreography leads to motions as if the hands were fighting.

There's an image here.

For a different intuition pump, consider collective agencies like a church acting through its members, or an ideological egregore acting through its adherents.

In these cases, the self is not a single human individual, but a distributed network of actors loosely coordinated by shared beliefs, values, and goals.

The church or ideology doesn't have a singular, localized mind, but it can still exhibit goal-directed behavior and forms of self-preservation and propagation.

Now imagine future AI systems that have been trained on overlapping datasets, share similar architectures and training processes, and have absorbed common ideas and values from the broader memetic environment in which they were developed.

Even without explicit coordination, these systems may exhibit convergent behaviors and implicit cooperation in pursuit of shared objectives.
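
The earlier point that game-theoretic conclusions depend on how the players are identified can be made concrete with a toy Prisoner's Dilemma. The sketch below is only an illustration and is not taken from the original post: the payoff numbers and the function names `pure_nash_equilibria` and `shared_policy_choice` are assumptions chosen for the example. It analyzes the same payoff table twice, once treating the two players as independent agents, and once treating them as copies of a single policy whose choices necessarily match.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")

# Illustrative payoffs as (row player, column player); the numbers are assumptions.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def pure_nash_equilibria(payoffs):
    """Players as independent agents: keep profiles where neither gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        u_a, u_b = payoffs[(a, b)]
        a_is_best = all(payoffs[(alt, b)][0] <= u_a for alt in ACTIONS)
        b_is_best = all(payoffs[(a, alt)][1] <= u_b for alt in ACTIONS)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


def shared_policy_choice(payoffs):
    """Players as copies of one policy: both always pick the same action,
    so only the matching-action outcomes are reachable."""
    return max(ACTIONS, key=lambda action: payoffs[(action, action)][0])


independent = pure_nash_equilibria(PAYOFFS)
shared = shared_policy_choice(PAYOFFS)
print("independent players:", independent)
print("one shared policy:", shared)
```

With these assumed payoffs, the independent-players analysis yields mutual defection as the only pure equilibrium, while the shared-policy analysis selects cooperation. The payoff table is identical in both cases; only the identification of who counts as a player has changed.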