Sam Marks

LessWrong (Curated & Popular)

"The persona selection model" by Sam Marks

Of these, only (3) constitutes evidence against PSM.

In contrast, typical examples of bizarre AI assistant mistakes seem more likely to be due to (1) or (2).

For example, when AI assistants miscount the number of R's in "strawberry", this is likely because the underlying LLM itself does not know the number of R's.

Nevertheless, these examples illustrate that, due to limitations of the underlying LLM, PSM doesn't imply that AI assistants always exhibit human-like behavior.

Even if the model is attempting to simulate a human-like assistant persona, it may lack the capabilities needed to do so faithfully.

This can result in behavior that appears alien or bizarre, not because the model has departed from persona simulation, but because the predictive model itself cannot execute what a human-like persona would do.

This is an important caveat.

PSM predicts human-like intentions in how the model approaches tasks, but the execution of those intentions is bounded by the LLM's actual capabilities.

Within-context inconsistency. AI assistants sometimes contradict themselves in strange ways.

For example, when asked "Is 3 plus 5 equal to 8?",

Claude Haiku 4.5 with extended thinking responds: