Robert M

👤 Speaker
195 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

To the extent that the survey says anything interesting, it says that coherence as understood by the survey takers is unrelated to the ability of various agents to cause harm to other agents.

Heading.

Blog.

First of all, the blog post seems to be substantially the output of an LLM.

In context, this is not that surprising, but it is annoying to read, and I also think this might have contributed to some of the more significant exaggerations or unjustified inferences.

Let me quibble with a couple sections.

First, "Why should we expect incoherence? LLMs as dynamical systems."

Quote.

A key conceptual point: LLMs are dynamical systems, not optimizers.

When a language model generates text or takes actions, it traces trajectories through a high-dimensional state space.

It has to be trained to act as an optimizer and trained to align with human intent.

It's unclear which of these properties will be more robust as we scale.

Constraining a generic dynamical system to act as a coherent optimizer is extremely difficult.

End quote.

The paper has a similar section, with an even zanier claim.

Quote.

The set of dynamical systems that act as optimizers of a fixed loss is measure zero in the space of all dynamical systems.

End quote.

This seems to me like a vacuous attempt at defining away the possibility of building superintelligence, or perhaps coherent optimizers.

I will spend no effort on its refutation, Claude 4.5 Opus being capable of doing a credible job.