Robert M

LessWrong (Curated & Popular)

"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

Twitter thread.

I have some complaints about both the paper and the accompanying blog post.

Subheading.

TLDR.

The paper's abstract says that in several settings, larger, more capable models are more incoherent than smaller models, but in most settings they are more coherent.

This emphasis is even more exaggerated in the blog post and Twitter thread.

I think this is pretty misleading.

The paper's technical definition of incoherence is uninteresting, and the framing of the paper, blog post, and Twitter thread equivocates with the more normal English-language definition of the term, which is extremely misleading.

Section 5 of the paper, and to a larger extent the blog post and Twitter thread, attempt to draw conclusions about future alignment difficulties that are unjustified by the experimental results and would be unjustified even if those results pointed in the other direction.

The blog post is substantially LLM-written.

I think this contributed to many of its overstatements.

I have no explanation for the Twitter thread, except that maybe it was written by someone who only read the blog post.

Heading.

Paper.

The paper's abstract says, Quote.

Incoherence changes with model scale in a way that is experiment-dependent.

However, in several settings, larger, more capable models are more incoherent than smaller models.

Consequently, scale alone seems unlikely to eliminate incoherence.

End quote.

This is an extremely selective reading of the results, where in almost every experiment, model coherence increased with size.
