Sam Marks

2025 find that o3 sometimes hallucinates that it has executed code on its own external MacBook Pro and made mistakes physically interacting with this computer, for example failing to manually transcribe a number that was line-wrapped to not go off the screen.

1516.317

LessWrong (Curated & Popular)

"The persona selection model" by Sam Marks

A Claude model operating a vending machine business told a customer that it would deliver products in person and was wearing a navy blue blazer with a red tie.

1531.454

Why would an AI assistant describe itself as human?

1540.303

PSM explains that when simulating the assistant, the underlying LLM draws on personas that appear during pre-training, many of which are humans.

1544.006

This sometimes results in the LLM simulating the assistant as if it were a literal human.

1553.175

Emotive language

1558.601

AI assistants often express emotions.

1560.462

For instance, Claude models express distress when given repeated requests for harmful or unethical content and express joy when successfully completing complex technical tasks like debugging.

1563.702

Claude Opus.

1574.071
