Narrator (TYPE III AUDIO)

👤 Speaker
266 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
“The Pando Problem: Rethinking AI Individuality” by Jan_Kulveit

Heading: Individuality in AI Systems

While individuality in AI systems may seem human, it is not.

To be clear, it also isn't the same as in plants.

Some intuitions may generalize from humans, some from plants, and some aspects of AIs are different and weirder.

To unpack this, let's start with the most intuitive ways we could conceptualize what that something we are interacting with and talking about is.

Individual conversational instance.

In each conversation you have with an AI, such as ChatGPT, you can understand the conversational counterpart as an individual.

This individuality arises in real-time as context-specific interactions shape the AI's immediate behavior and apparent personality.

The personality is somewhat ephemeral, may change, or be entirely replaced.

Model-wide individuality.

Alternatively, we might view all conversational instances derived from the same underlying trained model, the same neural network weights, as a single something, a model, an individual: Claude Sonnet.

This is probably the most common conceptualization in AI safety, with people asking if models can exfiltrate themselves, rogue deploy, or similar.

Model family.

An extension of the above.

If we assume a model is continually getting updated and fine-tuned but keeps somewhat consistent character, name, and deployment, it may still be natural to consider it a single individual.

The answers to what an individual AI could mean become stranger if we add some nuance to our understanding of LLM psychology.

The three-layer model views LLMs as operating through distinct but interacting layers rather than having a single coherent self.

The surface layer consists of trigger-action patterns, almost reflexive responses.

The character layer maintains a stable personality through statistical patterns that make certain responses more probable than others.

The deepest predictive ground layer represents the fundamental prediction machinery, a vast prediction system derived from processing billions of texts, minimizing prediction error.