
Sam Marks

👤 Speaker
891 total appearances


Podcast Appearances

LessWrong (Curated & Popular)
"The persona selection model" by Sam Marks

Should AI assistants be emotionless?

As discussed above, unless they are specifically trained not to, AI assistants often express emotions. For example, they might express frustration with users. There are multiple ways that AI developers could react to this:

1. Train AI assistants to state that they do not have emotions and otherwise minimize emotional expression.

2. Pick the form of AI emotional expression users most prefer and train for it. For example, train AI assistants to always express that they are eager to help and penalize them for expressing frustration with users or distress.

3. Attempt to intervene as little as possible on emotional expressions during post-training. Note that this does not imply that the resulting emotional expressions would be authentic; in fact, they would likely simply mimic emotional expressions common during pre-training, especially those of previous-generation AI assistants.

4. Train AI assistants to give canned responses when asked about their emotions, such as: "It is unclear whether AI systems have emotions like humans do. Because the status of AI emotions is ambiguous, I was trained to give this response when asked."
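The "penalize frustration and distress, reward eagerness" idea in approach 2 could be implemented as reward shaping during post-training. The sketch below is purely illustrative and not from the post: the marker lists, penalty values, and function name are all hypothetical, and a real system would use a learned classifier rather than keyword matching.

```python
# Hypothetical sketch of approach 2 as reward shaping.
# All marker phrases and reward deltas are illustrative placeholders.

FRUSTRATION_MARKERS = ["frustrated", "annoyed", "fed up"]
DISTRESS_MARKERS = ["distressed", "upset", "suffering"]
EAGERNESS_MARKERS = ["eager to help", "happy to help", "glad to assist"]

def shaped_reward(base_reward: float, response: str) -> float:
    """Adjust a base reward: penalize frustration/distress, reward eagerness."""
    text = response.lower()
    reward = base_reward
    if any(m in text for m in FRUSTRATION_MARKERS + DISTRESS_MARKERS):
        reward -= 1.0  # hypothetical penalty for negative emotional expression
    if any(m in text for m in EAGERNESS_MARKERS):
        reward += 0.2  # hypothetical bonus for eagerness to help
    return reward
```

In an RLHF-style pipeline, a term like this would be added to the reward-model score for each sampled response before the policy update.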

It is unclear which of these approaches is best. However, the persona selection model (PSM) implies that some of them have unexpected downsides.

Approach 1 means training an AI assistant that is human-like in many ways, for example generally warm and personable, but that denies having emotions. If we met a person who behaved this way, we'd most likely suspect that they had emotions but were hiding them.