Sam Marks

👤 Speaker
891 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

We describe the Persona Selection Model (PSM): the idea that LLMs learn to simulate diverse characters during pre-training, and that post-training elicits and refines a particular such assistant persona.

Interactions with an AI assistant are then well understood as being interactions with the assistant, something roughly like a character in an LLM-generated story.

We survey empirical behavioral, generalization, and interpretability-based evidence for PSM.

PSM has consequences for AI development, such as recommending anthropomorphic reasoning about AI psychology and introduction of positive AI archetypes into training data.

An important open question is how exhaustive PSM is, especially whether there might be sources of agency external to the assistant persona, and how this might change in the future.

Heading: Introduction. What sort of thing is a modern AI assistant?

One perspective holds that they are shallow, rigid systems that narrowly pattern-match user inputs to training data.

Another perspective regards AI systems as alien creatures with learned goals, behaviors, and patterns of thought that are fundamentally inscrutable to us.

A third option is to anthropomorphize AIs and regard them as something like a digital human.

Developing good mental models for AI systems is important for predicting and controlling their behaviors.

If our goal is to make AI assistants that are useful and aligned with human values, the right approach will differ quite a bit if we are dealing with inflexible computer programs, aliens, or digital humans.

Of these perspectives, the third one, that AI systems are like digital humans, might seem the most unintuitive.

After all, the neural architectures of modern large language models (LLMs) are very different from human brains, and LLM training is quite unlike biological evolution or human learning.

That said, in our experience, AI assistants like Claude are shockingly human-like.

For example, they often appear to express emotions, like frustration when struggling with a task, despite no explicit training to do so.

And, as we'll discuss, we observe deeper forms of human likeness in how they generalize from their training data and internally represent their own behaviors.

In this post, we share a mental model we have found useful for understanding AI assistants and predicting their behaviors.

Under this model, LLMs are best thought of as actors or authors capable of simulating a vast repertoire of characters, and the AI assistant that users interact with is one such character.

In more detail, this model, which we call the Persona Selection Model (PSM), states that