Sam Marks
Opposing views of PSM exhaustiveness. The masked shoggoth (left) depicts the idea that the LLM, the shoggoth, has its own agency beyond plausible text generation: it play-acts the assistant persona, but only instrumentally, for its own inscrutable reasons (source). In contrast, the operating system view (right) views the LLM as a simulation engine and the assistant as a character inside this simulation. The simulation engine does not puppet the assistant for its own ends; it only tries to simulate probable behavior according to its understanding of the assistant (source: Nano Banana Pro).
We are overall unsure how complete an account PSM provides of AI assistant behavior.
Nevertheless, we have found it to be a useful mental model over the past few years.
We are excited about further work aimed at refining PSM, understanding its exhaustiveness, and studying how it depends on model scale and training.
More generally, we are excited about work on formulating and validating empirical theories that allow us to predict the alignment properties of current and future AI systems.
The Persona Selection Model
In this section, we first review how modern AI assistants are built by using LLMs to generate completions of the assistant's turns in user–assistant dialogues. We then state the Persona Selection Model (PSM), which roughly says that LLMs can be viewed as simulating a character, the assistant, whose traits are a key determinant of AI assistant behavior. We'll then discuss a number of empirical observations about AI systems that are well explained by PSM.
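To make the dialogue-completion setup concrete, here is a minimal sketch using the Hugging Face transformers library. The model name is a placeholder (any chat-tuned model would do), and the example dialogue is ours; the point is only to illustrate the mechanic: the dialogue is serialized into a transcript using the model's chat template, and the assistant's next turn is sampled as an ordinary LLM completion.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute any chat-tuned causal LM available to you.
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A user-assistant dialogue, ending on a user turn.
messages = [
    {"role": "user", "content": "What is the capital of France?"},
]

# Serialize the dialogue with the model's chat template. With
# add_generation_prompt=True, the transcript ends exactly where the
# assistant's turn begins, so the LLM's completion *is* the assistant turn.
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# The template already includes any special tokens, so don't add them again.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
output_ids = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens: the sampled assistant turn.
assistant_turn = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(assistant_turn)
```

Note that in this setup the "assistant" is just a role marker in the transcript: the LLM's only job is to continue the dialogue with a plausible assistant turn, which is the framing PSM builds on.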
We claim no originality for the ideas presented here, which have previously been discussed by many others, for example Andreas (2022), Janus (2022), and Hubinger (2023).