Sam Marks

Speaker
891 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

Lay out a spectrum of views on the exhaustiveness of PSM, ranging from the popular masked Shoggoth view that attributes substantial non-persona agency to the LLM itself, to an antithetical operating system view under which all agency originates from the assistant persona.

Discuss conceptual considerations around the exhaustiveness of PSM and how it might change in the future.

For instance, one reason for PSM to be exhaustive is that personas provide an especially simple way for the LLM to fit the post-training objective.

Survey some relevant empirics.

While these empirical observations don't settle the question of how exhaustive PSM is, we use them as an opportunity to concretely ground the views we discuss.

Our discussion in this section is especially informal, relying heavily on evocative analogies.

There is no well-established definition of agency or goal-directed behavior, and it's possible that these abstractions are unsuitable in ways that obscure important weaknesses in our analysis.

We nevertheless put these informal questions about the exhaustiveness of PSM forward for future study.

Subheading: Shoggoths, Actors, Operating Systems, and Authors

In this section, we describe a spectrum of perspectives on LLM agency.

Roughly speaking, the views here vary on two axes.

One, non-persona agency ascribed to the LLM itself.

At one extreme is the Shoggoth view, which assigns substantial agency to the underlying LLM.

At the other is the operating system view, which assigns none.

In the middle is the router view, where there is some limited non-persona agency in the choice of which persona to enact, but the AI's behavior is always locally persona-like.

Two, other sources of persona-like agency. There may be an interior persona sitting between the assistant and the outer LLM.

For example, even a pre-trained LLM might enact an actor persona, which is itself enacting the assistant.