Sam Marks

👤 Speaker
891 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

Opposing views of PSM exhaustiveness.

The masked shoggoth, left, depicts the idea that the LLM, the shoggoth, has its own agency beyond plausible text generation.

It play-acts the assistant persona, but only instrumentally for its own inscrutable reasons.

Source

In contrast, the operating system view, right, views the LLM as being like a simulation engine and the assistant like a person inside this simulation.

The simulation engine does not puppet the assistant for its own ends.

It only tries to simulate probable behavior according to its understanding of the assistant.

Source: Nano Banana Pro.

Overall, we are unsure how complete an account PSM provides of AI assistant behavior.

Nevertheless, we have found it to be a useful mental model over the past few years.

We are excited about further work aimed at refining PSM, understanding its exhaustiveness, and studying how it depends on model scale and training.

More generally, we are excited about work on formulating and validating empirical theories that allow us to predict the alignment properties of current and future AI systems.

Heading The Persona Selection Model

In this section, we first review how modern AI assistants are built by using LLMs to generate completions to assistant turns in user-to-assistant dialogues.

We then state the Persona Selection Model (PSM), which roughly says that LLMs can be viewed as simulating a character, the assistant, whose traits are a key determiner of AI assistant behavior.

We'll then discuss a number of empirical observations regarding AI systems that are well explained by PSM.

We claim no originality for the ideas presented here, which have been previously discussed by many others, for example Andreas, 2022.

Janus, 2022.

Hubinger, 2023.