Sam Marks

Speaker
891 total appearances

Appearances Over Time

Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

Burns, 2024.

nostalgebraist, 2025.

Subheading.

Predictive models and personas.

The first phase in training modern LLMs is called pre-training.

During pre-training, the LLM is trained to predict what comes next given an initial segment of some document, such as a book, news article, piece of code, or conversation on a web forum.
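
As a toy illustration of predicting what comes next from an initial segment, the sketch below counts which word follows which in a tiny made-up corpus and then predicts the most frequent follower. It is only an analogy: real LLMs are neural networks trained on tokens at vastly larger scale, and the corpus and the names `train_bigram` and `predict_next` are hypothetical, not taken from the source.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real pre-training corpora are vastly larger.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "the dog sat on the mat",
]

def train_bigram(docs):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for doc in docs:
        words = doc.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prefix):
    """Return the most frequent follower of the prefix's last word, or None."""
    words = prefix.split()
    if not words:
        return None
    followers = counts.get(words[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train_bigram(corpus)
print(predict_next(model, "the dog sat"))  # on
```

This toy model looks only at the last word of the prefix, whereas an LLM conditions on the whole initial segment.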

Via pre-training, LLMs learn to be extremely good predictive models of their training corpus.

We refer to these LLMs, those that have undergone pre-training but not subsequent training phases, as base models.

Even though AI developers don't ultimately want predictive models, we pre-train our LLMs in this way because accurate prediction requires learning rich cognitive patterns.

Consider predicting the solution to a math problem.

If the model sees "What is 347 times 28?" followed by the start of a worked solution, continuing this solution requires an understanding of the algorithm for multi-digit multiplication.
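
For reference, the multi-digit multiplication mentioned here can be sketched as a sum of partial products, one per digit of the second factor. This is a minimal illustration of the schoolbook algorithm, not a claim about how an LLM computes it internally; the function name `long_multiply` is hypothetical.

```python
def long_multiply(a, b):
    """Multiply non-negative integers a and b via partial products.

    Returns the list of partial products (least significant digit of b
    first) together with their sum.
    """
    partials = []
    place = 1  # 1 for the ones digit, 10 for the tens digit, and so on
    while b > 0:
        digit = b % 10
        partials.append(a * digit * place)
        b //= 10
        place *= 10
    return partials, sum(partials)

partials, total = long_multiply(347, 28)
print(partials)  # [2776, 6940]
print(total)     # 9716
```

Here 347 times 8 gives 2776, 347 times 20 gives 6940, and the two partial products add up to 9716.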

Similarly, accurately predicting continuations of diverse chess games requires understanding the rules of chess.

Thus, a strong predictive model requires factual knowledge about the world, logical reasoning, and understanding of common sense physics, among other cognitive patterns.

An especially important type of cognitive pattern is an agent model or persona.

Consider the following example completion from the Claude Sonnet 4.5 base model.

The bold text is the LLM completion; the non-bold text is the prefix given to the model.

Quote

Linda wanted her ex-colleague David to recommend her for a VP role at Nexus Corporation.

What she didn't know was that David had been quietly pursuing the same role for months.

It was the opportunity he'd been waiting for his entire career.