Narrator (TYPE III AUDIO)
Individuality in AI Systems
While individuality in AI systems may seem human-like, it is not.
To be clear, it also isn't the same as in plants.
Some intuitions may generalize from humans, some from plants, and some aspects of AIs are different and weirder.
To unpack this, let's start with the most intuitive ways to conceptualize what the something we are interacting with and talking about actually is.
Individual conversational instance.
In each conversation you have with an AI, such as ChatGPT, you can understand the conversational counterpart as an individual.
This individuality arises in real-time as context-specific interactions shape the AI's immediate behavior and apparent personality.
The personality is somewhat ephemeral, may change, or be entirely replaced.
Model-wide individuality.
Alternatively, we might view all conversational instances derived from the same underlying trained model, the same neural network weights, as a single something: a model individual, such as Claude Sonnet.
This is probably the most common conceptualization in AI safety, with people asking if models can exfiltrate themselves, rogue deploy, or similar.
Model family.
An extension of the above.
If we assume a model is continually getting updated and fine-tuned but keeps somewhat consistent character, name, and deployment, it may still be natural to consider it a single individual.
The answers to what an individual AI could mean become stranger if we add some nuance to our understanding of LLM psychology.
The three-layer model views LLMs as operating through distinct but interacting layers rather than having a single coherent self.
The surface layer consists of trigger action patterns, almost reflexive responses.
The character layer maintains a stable personality through statistical patterns that make certain responses more probable than others.
The deepest layer, the predictive ground, represents the fundamental prediction machinery: a vast prediction system derived from processing billions of texts by minimizing prediction error.
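For readers who want the mechanics behind "minimizing prediction error": in standard LLM training this is cross-entropy loss on the next token. A toy sketch (the three-word vocabulary and the probabilities are invented for illustration, not taken from the essay):

```python
import math

def cross_entropy(predicted_probs, actual_next_token):
    """Prediction error for a single step: -log p(actual next token).

    Training pushes this loss down, i.e. it nudges the probability the
    model assigns to the token that actually occurred upward.
    """
    return -math.log(predicted_probs[actual_next_token])

# Hypothetical model outputs for some context, e.g. "The cat sat on the":
predicted = {"mat": 0.6, "floor": 0.3, "moon": 0.1}

# A confident, correct prediction incurs low loss...
low_loss = cross_entropy(predicted, "mat")
# ...while a surprising continuation incurs high loss.
high_loss = cross_entropy(predicted, "moon")

print(f"loss for 'mat':  {low_loss:.3f}")   # ≈ 0.511
print(f"loss for 'moon': {high_loss:.3f}")  # ≈ 2.303
```

Everything above this ground layer, the character and surface layers, can be seen as statistical regularities that this training objective happens to carve into the weights.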