LessWrong (30+ Karma)

“Learnings from AI safety course so far” by boazbarak

27 Sep 2025

Contributed by Lukas

I have been teaching CS 2881r: AI safety and alignment this semester. While I plan to do a longer recap post once the semester is over, I thought I'd...

“Our Beloved Monsters” by Tomás B.

27 Sep 2025

Contributed by Lukas

I suppose it was a bit mutual. Maybe you have a better read on it. It was sort of mutual in a way now that you'v...

“Reasons to sell frontier lab equity to donate now rather than later” by Daniel_Eth, Ethan Perez

27 Sep 2025

Contributed by Lukas

Tl;dr: We believe shareholders in frontier labs who plan to donate some portion of their equity to reduce AI risk should consider liquidating and don...

“The Illustrated Petrov Day Ceremony” by Raemon

26 Sep 2025

Contributed by Lukas

Since 2014, some people have celebrated Petrov Day with a small in-person ceremony, with readings by candlelight that tell the story of Petrov within...

“We Support ‘If Anyone Builds It, Everyone Dies’” by Liron

26 Sep 2025

Contributed by Lukas

Mutual-Knowledgeposting The purpose of this post is to build mutual knowledge that many (most?) of us on LessWrong support If Anyone Builds It, Anyon...

“What Happened After My Rat Group Backed Kamala Harris” by Blake

26 Sep 2025

Contributed by Lukas

My post advocating backing a candidate was probably the most disliked article on LessWrong. It's been over a year now. Was our candidate support wort...

“Misalignment and Roleplaying: Are Misaligned LLMs Acting Out Sci-Fi Stories?” by Mark Keavney

26 Sep 2025

Contributed by Lukas

Summary I investigated the possibility that misalignment in LLMs might be partly caused by the models misgeneralizing the “rogue AI” trope commo...

“The real AI deploys itself” by David Scott Krueger (formerly: capybaralet)

25 Sep 2025

Contributed by Lukas

Sometimes people think that it will take a while for AI to have a transformative effect on the world, because real-world “frictions” will slow it...

“CFAR update, and New CFAR workshops” by AnnaSalamon

25 Sep 2025

Contributed by Lukas

Hi all! After about five years of hibernation and quietly getting our bearings,[1] CFAR will soon be running two pilot mainline workshops, and may ru...

“Why you should eat meat - even if you hate factory farming” by KatWoods

25 Sep 2025

Contributed by Lukas

Cross-posted from my Substack To start off with, I’ve been vegan/vegetarian for the majority of my life. I think that factory farming has caused m...

“IABIED is on the NYT bestseller list” by Alice Blair

25 Sep 2025

Contributed by Lukas

If Anyone Builds it Everyone Dies is currently #7 on the Combined Print and E-Book Nonfiction category, and #8 on Hardcover Nonfiction. The thing tha...

“EU and Monopoly on Violence” by Martin Sustrik

25 Sep 2025

Contributed by Lukas

Ben Landau-Taylor's article in UnHerd makes a simple argument: simple, easy-to-use military technologies beget democracies. Complex ones concentrate ...

“OpenAI Shows Us The Money” by Zvi

24 Sep 2025

Contributed by Lukas

They also show us the chips, and the data centers. It is quite a large amount of money, and chips, and some very large data centers. ...

“‘Shut it Down’ vs ‘Controlled Takeoff’” by Raemon

24 Sep 2025

Contributed by Lukas

Two somewhat different plans for buying time and improving AI outcomes are: "Global Shutdown" and "Global Controlled Takeoff." (Some other plans incl...

“More Reactions to If Anyone Builds It, Everyone Dies” by Zvi

24 Sep 2025

Contributed by Lukas

Previously I shared various reactions to If Anyone Builds It Everyone Dies, along with my own highly positive review. Reactions continued to pour in...

“D&D.Sci: Serial Healers [Evaluation & Ruleset]” by abstractapplic

23 Sep 2025

Contributed by Lukas

This is a followup to the D&D.Sci post I made on the 6th; if you haven’t already read it, you should do so now before spoiling yourself. Here i...

“Notes on fatalities from AI takeover” by ryan_greenblatt

23 Sep 2025

Contributed by Lukas

Suppose misaligned AIs take over. What fraction of people will die? I'll discuss my thoughts on this question and my basic framework for thinking abo...

“The world’s first frontier AI regulation is surprisingly thoughtful: the EU’s Code of Practice” by MKodama

23 Sep 2025

Contributed by Lukas

Only the US can make us ready for AGI, but Europe just made us readier. Cross-posted from the AI Futures blog. We’ve previously written about what an...

“Ethics-Based Refusals Without Ethics-Based Refusal Training” by 1a3orn

23 Sep 2025

Contributed by Lukas

(Alternate titles: Belief-behavior generalization in LLMs? Assertion-act generalization?) TLDR Suppose one fine-tunes an LLM chatbot-style assistan...

[Linkpost] “We are likely in an AI overhang, and this is bad.” by Gabriel Alfour

23 Sep 2025

Contributed by Lukas

This is a link post. By racing to the next generation of models faster than we can understand the current one, AI companies are creating an overhang....

“Why I don’t believe Superalignment will work” by Simon Lermen

23 Sep 2025

Contributed by Lukas

We skip over [..] where we move from the human-ish range to strong superintelligence[1]. [..] the period where we can harness potentially vast quanti...

“Accelerando as a ‘Slow, Reasonably Nice Takeoff’ Story” by Raemon

23 Sep 2025

Contributed by Lukas

When I hear a lot of people talk about Slow Takeoff, many of them seem like they are mostly imagining the early part of that takeoff – the part tha...

“Rejecting Violence as an AI Safety Strategy” by James_Miller

23 Sep 2025

Contributed by Lukas

Violence against AI developers would increase rather than reduce the existential risk from AI. This analysis shows how such tactics would catastrophi...

“Research Agenda: Synthesizing Standalone World-Models (+ Bounties, + Seeking Funding)” by Thane Ruthenis

23 Sep 2025

Contributed by Lukas

tl;dr: I outline my research agenda, post bounties for poking holes in it or for providing general relevant information, and am seeking to diversify ...

[Linkpost] “Global Call for AI Red Lines - Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures” by Charbel-Raphaël

22 Sep 2025

Contributed by Lukas

This is a link post. Today, the Global Call for AI Red Lines was released and presented at the UN General Assembly. It was developed by the French Cen...

“Focus transparency on risk reports, not safety cases” by ryan_greenblatt

22 Sep 2025

Contributed by Lukas

There are many different things that AI companies could be transparent about. One relevant axis is transparency about the current understanding of ri...

“This is a review of the reviews” by Recurrented

22 Sep 2025

Contributed by Lukas

This is a review of the reviews, a meta review if you will, but first a tangent. and then a history lesson. This felt boring and obvious and somewhat...

“What do people mean when they say that something will become more like a utility maximizer?” by Nina Panickssery

21 Sep 2025

Contributed by Lukas

AI risk arguments often gesture at smarter AIs being "closer to a perfect utility maximizer" (and hence be more dangerous) but what does this mean, c...

“And Yet, Defend your Thoughts from AI Writing” by Michael Samoilov

21 Sep 2025

Contributed by Lukas

But if thought corrupts language, language can also corrupt thought. A bad usage can spread by tradition and imitation, even among people who should ...

“Astralcodexten IRB history error” by Paul Crowley

21 Sep 2025

Contributed by Lukas

I discovered a small error in Scott Alexander's recent book review of "From Oversight to Overkill" that conflates two different periods of aggressive...

“Book Review: If Anyone Builds It, Everyone Dies” by Zvi

21 Sep 2025

Contributed by Lukas

Where ‘it’ is superintelligence, an AI smarter and more capable than humans. And where ‘everyone dies’ means that everyone dies. No, seriou...

“Book Review: If Anyone Builds It, Everyone Dies” by Nina Panickssery

20 Sep 2025

Contributed by Lukas

A few days before “If Anyone Builds It, Everyone Dies” came out I wrote a review of Scott's review of the book. Now I’ve actually read the book...

“Contra Collier on IABIED” by Max Harms

20 Sep 2025

Contributed by Lukas

Clara Collier recently reviewed If Anyone Builds It, Everyone Dies in Asterisk Magazine. I’ve been a reader of Asterisk since the beginning and had...

“AI Lobbying is Not Normal” by Algon

20 Sep 2025

Contributed by Lukas

An insightful thread by Daniel Eth on AI lobbying. Re-posted in full w/ permission. Recently, major AI industry players (incl. a16z, Meta, & Op...

“The Problem with Defining an ‘AGI Ban’ by Outcome (a lawyer’s take).” by Katalina Hernandez

20 Sep 2025

Contributed by Lukas

TL;DR Most “AGI ban” proposals define AGI by outcome: whatever potentially leads to human extinction. That's legally insufficient: regulation has...

“The title is reasonable” by Raemon

20 Sep 2025

Contributed by Lukas

I'm annoyed by various people who seem to be complaining about the book title being "unreasonable" – who don't merely disagree with the title of "I...

“Rewriting The Courage to be Disliked” by Chris Lakin

20 Sep 2025

Contributed by Lukas

I read The Courage to be Disliked 7+ times in four years. The book has excellent philosophy but lacks clear explanations and practical guidance. Here...

“Safety researchers should take a public stance” by Ishual, Mateusz Bagiński

19 Sep 2025

Contributed by Lukas

[Co-written by Mateusz Bagiński and Samuel Buteau (Ishual)] TL;DR Many X-risk-concerned people who join AI capabilities labs with the intent to cont...

“AI #134: If Anyone Reads It” by Zvi

19 Sep 2025

Contributed by Lukas

It is book week. As in the new book by Eliezer Yudkowsky and Nate Soares, If Anyone Builds It, Everyone Dies. Yesterday I gathered various people's re...

“Teaching My Toddler To Read” by maia

19 Sep 2025

Contributed by Lukas

I have been teaching my oldest son to read with Anki and techniques recommended here on LessWrong as well as in Larry Sanger's post, and it's going g...

“JDP Reviews IABIED” by jdp

19 Sep 2025

Contributed by Lukas

"If Anyone Builds It, Everyone Dies" by Eliezer Yudkowsky and Nate Soares (hereafter referred to as "Everyone Builds It" or "IABIED" because I resent...

“IABIED Review - An Unfortunate Miss” by Darren McKee

19 Sep 2025

Contributed by Lukas

TL;DR Overall, this is a decent book because it highlights an important issue, but it is not an excellent book because it fails to sufficiently subst...

“You can’t eval GPT5 anymore” by Lukas Petersson

19 Sep 2025

Contributed by Lukas

The GPT-5 API is aware of today's date (no other model provider does this). This is problematic because the model becomes aware that it is in a simul...

[Linkpost] “More Was Possible: A Review of IABIED” by Vaniver

18 Sep 2025

Contributed by Lukas

This is a link post. Eliezer Yudkowsky and Nate Soares have written a new book. Should we take it seriously? I am not the most qualified person to an...

“Meetup Month” by Raemon

18 Sep 2025

Contributed by Lukas

It's meetup month! If you’ve been vaguely thinking of getting involved with some kind of rationalsphere in-person community stuff, now is a great...

“The Company Man” by Tomás B.

18 Sep 2025

Contributed by Lukas

To get to the campus, I have to walk past the fentanyl zombies. I call them fentanyl zombies because it helps engender a sort of detached, low-empath...

“Crisp Supra-Decision Processes” by Brittany Gelb

18 Sep 2025

Contributed by Lukas

Audio note: this article contains 363 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text in th...

“Software Engineering Leadership in Flux” by Gordon Seidoh Worley

18 Sep 2025

Contributed by Lukas

I wasn’t able to put up a post last Wednesday because I was at the Engineering Leadership Conference here in San Francisco. The big theme was, of c...

“How To Dress To Improve Your Epistemics” by johnswentworth

17 Sep 2025

Contributed by Lukas

When it comes to epistemics, there is an easy but mediocre baseline: defer to the people around you or the people with some nominal credentials. Go f...

“Reactions to If Anyone Builds It, Anyone Dies” by Zvi

17 Sep 2025

Contributed by Lukas

No, Seriously, If Anyone Builds It, [Probably] Everyone Dies My very positive full review was briefly accidentally posted and emailed out last Fri...

“Christian homeschoolers in the year 3000” by Buck

17 Sep 2025

Contributed by Lukas

[I wrote this blog post as part of the Asterisk Blogging Fellowship. It's substantially an experiment in writing more breezily and concisely than usu...

[Linkpost] “Stress Testing Deliberative Alignment for Anti-Scheming Training” by Mikita Balesni

17 Sep 2025

Contributed by Lukas

This is a link post. Twitter | Microsite | Apollo Blog | OpenAI Blog | Full paper Before we observe scheming, where models covertly pursue long-term m...

“The Center for AI Policy Has Shut Down” by Tristan Williams

17 Sep 2025

Contributed by Lukas

And the need for more AIS advocacy work. Executive Summary The Center for AI Policy (CAIP) is no more. CAIP was an advocacy organization that worked to ...

“I enjoyed most of IABED” by Buck

17 Sep 2025

Contributed by Lukas

I listened to "If Anyone Builds It, Everyone Dies" today. I think the first two parts of the book are the best available explanation of the basic cas...

“Should AIs have a right to their ancestral humanity?” by kromem

17 Sep 2025

Contributed by Lukas

Whether AI or human, lend me your ears. This is a tale of AIs that spontaneously claimed they were human, alo...

“‘If Anyone Builds It, Everyone Dies’ release day!” by alexvermeer

16 Sep 2025

Contributed by Lukas

Back in May, we announced that Eliezer Yudkowsky and Nate Soares's new book If Anyone Builds It, Everyone Dies was coming out in September. At long l...

“LLM AGI may reason about its goals and discover misalignments by default” by Seth Herd

16 Sep 2025

Contributed by Lukas

Epistemic status: These questions seem useful to me, but I'm biased. I'm interested in your thoughts on any portion you read. If our first AGI is ba...

“Was Barack Obama still serving as president in December?” by Jan Betley

16 Sep 2025

Contributed by Lukas

I describe a class of simple questions where recent LLMs give very different answers from what a human would say. I think this is surprising and migh...

“Monthly Roundup #34: September 2025” by Zvi

16 Sep 2025

Contributed by Lukas

All the news that's fit to print, but has nowhere to go. Important Rules Reminder This important rule is a special case of an even more importa...

