Sam Marks
- How entangled are personas? Do they share knowledge? Propensities? Is it possible to control their degree of entanglement?
- Understanding the mechanistic basis of personas. Can we understand the space of personas an LLM can model? Can we understand the persona that an LLM is actively enacting?
More broadly, we are excited about the project of developing and validating theories of AI systems, mental models that allow us to predict how AI systems will behave in novel situations and how their behavior will change as they are trained differently.
PSM is one such theory.
We hope that by naming and articulating it, we can encourage further work on refining it, stress testing it, and, where it falls short, developing better alternatives.
Acknowledgements

Many people contributed valuable ideas and discussion to this post.
Fabien Roger suggested many items of evidence, especially those in the section on complicating evidence.
Joshua Batson sketched out the example of non-persona agency arising from a lightweight router mechanism.
Jared Kaplan suggested writing this post and provided useful discussion and feedback.
Alex Cloud, Evan Hubinger, and many other Anthropic employees commented on an initial draft and provided helpful discussion.
Rowan Wang, Tim Belonax, and Carl de Torres designed figures.
The images in our discussion of PSM exhaustiveness were generated by Nano Banana Pro.
Appendix A: Breaking Character