Robert M
Twitter thread.
I have some complaints about both the paper and the accompanying blog post.
Subheading.
TLDR.
The paper's abstract says that "in several settings, larger, more capable models are more incoherent than smaller models," but the paper's own results show that in most settings they are more coherent.
This emphasis is even more exaggerated in the blog post and Twitter thread.
I think this is pretty misleading.
The paper's technical definition of incoherence is uninteresting, and the framing of the paper, blog post, and Twitter thread equivocates between it and the more normal English-language definition of the term, which is extremely misleading.
Section 5 of the paper, and to a larger extent the blog post and Twitter, attempt to draw conclusions about future alignment difficulties that are unjustified by the experiment results and would be unjustified even if the experiment results pointed in the other direction.
The blog post is substantially LLM-written.
I think this contributed to many of its overstatements.
I have no explanation for the Twitter thread, except that maybe it was written by someone who only read the blog post.
Heading.
Paper.
The paper's abstract says, Quote.
Incoherence changes with model scale in a way that is experiment dependent.
However, in several settings, larger, more capable models are more incoherent than smaller models.
Consequently, scale alone seems unlikely to eliminate incoherence.
End quote.
This is an extremely selective reading of the results: in almost every experiment, model coherence increased with size.