Robert M

"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

To the extent that the survey says anything interesting, it says that coherence as understood by the survey takers is unrelated to the ability of various agents to cause harm to other agents.

LessWrong (Curated & Popular)

Heading.

Blog.

First of all, the blog post seems to be substantially the output of an LLM.

In context, this is not that surprising, but it is annoying to read, and I also think this might have contributed to some of the more significant exaggerations or unjustified inferences.

Let me quibble with a couple sections.

First, "Why should we expect incoherence? LLMs as dynamical systems".

Quote.

A key conceptual point: LLMs are dynamical systems, not optimizers.

When a language model generates text or takes actions, it traces trajectories through a high-dimensional state space.

It has to be trained to act as an optimizer and trained to align with human intent.

It's unclear which of these properties will be more robust as we scale.

Constraining a generic dynamical system to act as a coherent optimizer is extremely difficult.

End quote.

The paper has a similar section, with an even zanier claim.

Quote.

The set of dynamical systems that act as optimizers of a fixed loss is measure zero in the space of all dynamical systems.

End quote.

This seems to me like a vacuous attempt at defining away the possibility of building superintelligence, or perhaps coherent optimizers.

I will spend no effort on its refutation, Claude 4.5 Opus being capable of doing a credible job.
