Sam Marks

👤 Speaker
891 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

For example, in the case of the faithful actor, there is an actor who is playing the assistant, but the actor never distorts their portrayal. Thus, understanding how the actor will behave when in character reduces to understanding the character.

We consider agentic routers and narrative agency to be ambiguously agentic, and narrative agency to be ambiguously persona-like, for the reasons discussed above.
Note that these perspectives are not exhaustive.

Why might we expect PSM to be exhaustive?

We know that randomly initialized neural networks can learn to implement agentic behaviors from scratch via reinforcement learning (RL). For example, randomly initialized networks can learn superhuman performance at chess, shogi, and Go without any human demonstration data (Silver et al., 2017).
Because there is no pre-training prior to speak of in this setting, the agency learned by these networks is necessarily shoggoth-like rather than persona-like.
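
To make this concrete, here is a minimal sketch of the kind of from-scratch learning being described, assuming nothing beyond textbook policy-gradient RL (REINFORCE). The bandit environment, reward values, and all variable names are illustrative inventions, not anything from the post or from Silver et al.

```python
# Toy sketch: a randomly initialized "policy" learns a 3-armed bandit
# from reward alone, with no demonstration data. All values illustrative.
import numpy as np

rng = np.random.default_rng(0)

ARM_MEANS = np.array([0.1, 0.3, 0.8])  # arm 2 has the highest payoff
logits = rng.normal(size=3)            # random initialization

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

baseline = 0.0  # running mean reward, used as a variance-reducing baseline
for step in range(2000):
    probs = softmax(logits)
    arm = rng.choice(3, p=probs)              # act
    reward = rng.normal(ARM_MEANS[arm], 0.1)  # observe reward
    # REINFORCE: d log p(arm) / d logits = onehot(arm) - probs
    grad = -probs
    grad[arm] += 1.0
    logits += 0.1 * (reward - baseline) * grad  # reinforce above-baseline arms
    baseline += 0.05 * (reward - baseline)

print(softmax(logits).round(3))  # probability mass concentrates on the best arm
```

Whatever behavior the trained policy ends up with, it owes nothing to imitating demonstrations; reward alone shaped it, which is the sense in which such agency cannot be persona-like.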

Given that we know non-persona agency can arise from scratch via RL, why would we expect agency in post-trained LLMs to be substantially persona-based? Here we discuss two conceptual reasons: first, that not much new is learned during LLM post-training; second, that reusing the model's persona-modeling capabilities is a simple and effective way to fit the post-training objective. We also discuss whether and how we should expect these considerations to change in the future.

Post-training as elicitation

A common view among some AI developers is that little fundamentally new is learned during post-training. On this view, the role of post-training is mainly to elicit capabilities that the model already had. For example, pre-trained LLMs have been trained on vast amounts of code data, including both low- and high-quality code. These pre-trained LLMs are capable of writing high-quality code, but often choose not to because high-quality code is not always the most probable continuation.
Post-training such an LLM to write high-quality code therefore draws out this latent capability more than it teaches the LLM new coding capabilities from scratch.
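
A toy sketch of the elicitation view, under the simplifying assumption (ours, not the post's) that post-training acts as a reward-driven reweighting of a fixed pretrained output distribution. The style categories, probabilities, and reward values are invented for illustration.

```python
# Toy sketch: "post-training as elicitation" modeled as an exponential
# tilt of a fixed pretraining distribution. All numbers illustrative.
import numpy as np

styles = ["buggy", "mediocre", "high-quality"]
pretrain_probs = np.array([0.30, 0.60, 0.10])  # high quality latent but rare
reward = np.array([-1.0, 0.0, 1.0])            # post-training objective

# p_post(style) is proportional to p_pretrain(style) * exp(reward / beta)
beta = 1 / 3.0
post_probs = pretrain_probs * np.exp(reward / beta)
post_probs /= post_probs.sum()

for s, p0, p1 in zip(styles, pretrain_probs, post_probs):
    print(f"{s:12s} pretrain={p0:.2f}  post-train={p1:.2f}")
# The support never changes: reweighting can only surface behaviors the
# pretrained model already assigned nonzero probability.
```

The exponential tilt is not arbitrary: it is the closed-form optimum of the standard KL-regularized RL objective (maximize expected reward minus beta times the KL divergence from the pretrained distribution), which is one formal sense in which this kind of post-training elicits rather than creates.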