Sam Marks

👤 Speaker
891 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

For example, in the case of the faithful actor, there is an actor who is playing the assistant, but the actor never distorts their portrayal. Thus, understanding how the actor will behave when in character reduces to understanding the character.

We consider agentic routers and narrative agency to be ambiguously agentic, and narrative agency to be ambiguously persona-like, for the reasons discussed above.
Note that these perspectives are not exhaustive.

Why might we expect PSM to be exhaustive?

We know that randomly initialized neural networks can learn to implement agentic behaviors from scratch via reinforcement learning (RL). For example, randomly initialized networks can learn superhuman performance at chess, shogi, and Go without any human demonstration data (Silver et al., 2017).
Because there is no pre-training prior to speak of in this setting, the agency learned by these networks is necessarily shoggoth-like rather than persona-like.
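
To make this concrete, here is a minimal sketch of the kind of from-scratch learning being described, assuming nothing beyond textbook policy-gradient RL (REINFORCE). The bandit environment, reward values, and all variable names are illustrative inventions, not anything from the post or from Silver et al.

```python
# Toy sketch: a randomly initialized "policy" learns a 3-armed bandit
# from reward alone, with no demonstration data. All values illustrative.
import numpy as np

rng = np.random.default_rng(0)

ARM_MEANS = np.array([0.1, 0.3, 0.8])  # arm 2 has the highest payoff
logits = rng.normal(size=3)            # random initialization

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

baseline = 0.0  # running mean reward, used as a variance-reducing baseline
for step in range(2000):
    probs = softmax(logits)
    arm = rng.choice(3, p=probs)              # act
    reward = rng.normal(ARM_MEANS[arm], 0.1)  # observe reward
    # REINFORCE: d log p(arm) / d logits = onehot(arm) - probs
    grad = -probs
    grad[arm] += 1.0
    logits += 0.1 * (reward - baseline) * grad  # reinforce above-baseline arms
    baseline += 0.05 * (reward - baseline)

print(softmax(logits).round(3))  # probability mass concentrates on the best arm
```

Whatever behavior the trained policy ends up with, it owes nothing to imitating demonstrations; reward alone shaped it, which is the sense in which such agency cannot be persona-like.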

Given that we know non-persona agency can arise from scratch via RL, why would we expect agency in post-trained LLMs to be substantially persona-based? Here we discuss two conceptual reasons: first, that not much new is learned during LLM post-training; second, that reusing the model's persona-modeling capabilities is a simple and effective way to fit the post-training objective. We also discuss whether and how we should expect these considerations to change in the future.

Post-training as elicitation

A common view among some AI developers is that little fundamentally new is learned during post-training. On this view, the role of post-training is mainly to elicit capabilities that the model already had. For example, pre-trained LLMs have been trained on vast amounts of code data, including both low- and high-quality code. These pre-trained LLMs are capable of writing high-quality code, but often choose not to because high-quality code is not always the most probable continuation.
Post-training such an LLM to write high-quality code therefore draws out this latent capability more than it teaches the LLM new coding capabilities from scratch.
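
A toy sketch of the elicitation view, under the simplifying assumption (ours, not the post's) that post-training acts as a reward-driven reweighting of a fixed pretrained output distribution. The style categories, probabilities, and reward values are invented for illustration.

```python
# Toy sketch: "post-training as elicitation" modeled as an exponential
# tilt of a fixed pretraining distribution. All numbers illustrative.
import numpy as np

styles = ["buggy", "mediocre", "high-quality"]
pretrain_probs = np.array([0.30, 0.60, 0.10])  # high quality latent but rare
reward = np.array([-1.0, 0.0, 1.0])            # post-training objective

# p_post(style) is proportional to p_pretrain(style) * exp(reward / beta)
beta = 1 / 3.0
post_probs = pretrain_probs * np.exp(reward / beta)
post_probs /= post_probs.sum()

for s, p0, p1 in zip(styles, pretrain_probs, post_probs):
    print(f"{s:12s} pretrain={p0:.2f}  post-train={p1:.2f}")
# The support never changes: reweighting can only surface behaviors the
# pretrained model already assigned nonzero probability.
```

The exponential tilt is not arbitrary: it is the closed-form optimum of the standard KL-regularized RL objective (maximize expected reward minus beta times the KL divergence from the pretrained distribution), which is one formal sense in which this kind of post-training elicits rather than creates.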