LessWrong (Curated & Popular)
Episodes
[Linkpost] “Consider donating to AI safety champion Scott Wiener” by Eric Neyman
24 Oct 2025
Contributed by Lukas
This is a link post. Written in my personal capacity. Thanks to many people for conversations and comments. Written in less than 24 hours; sorry for a...
“Which side of the AI safety community are you in?” by Max Tegmark
23 Oct 2025
Contributed by Lukas
In recent years, I’ve found that people who self-identify as members of the AI safety community have increasingly split into two camps: Camp A) “...
“Doomers were right” by Algon
23 Oct 2025
Contributed by Lukas
There's an argument I sometimes hear against existential risks, or any other putative change that some are worried about, that goes something li...
“Do One New Thing A Day To Solve Your Problems” by Algon
22 Oct 2025
Contributed by Lukas
People don't explore enough. They rely on cached thoughts and actions to get through their day. Unfortunately, this doesn't lead to them ma...
“Humanity Learned Almost Nothing From COVID-19” by niplav
21 Oct 2025
Contributed by Lukas
Summary: Looking over humanity's response to the COVID-19 pandemic, almost six years later, reveals that we've forgotten to fulfill our inte...
“Consider donating to Alex Bores, author of the RAISE Act” by Eric Neyman
20 Oct 2025
Contributed by Lukas
Written by Eric Neyman, in my personal capacity. The views expressed here are my own. Thanks to Zach Stein-Perlman, Jesse Richardson, and many others...
“Meditation is dangerous” by Algon
20 Oct 2025
Contributed by Lukas
Here's a story I've heard a couple of times. A youngish person is looking for some solutions to their depression, chronic pain, ennui or so...
“That Mad Olympiad” by Tomás B.
19 Oct 2025
Contributed by Lukas
"I heard Chen started distilling the day after he was born. He's only four years old, if you can believe it. He's written 18 novels. H...
“The ‘Length’ of ‘Horizons’” by Adam Scholl
17 Oct 2025
Contributed by Lukas
Current AI models are strange. They can speak—often coherently, sometimes even eloquently—which is wild. They can predict the structure of protei...
“Don’t Mock Yourself” by Algon
15 Oct 2025
Contributed by Lukas
About half a year ago, I decided to try to stop insulting myself for two weeks. No more self-deprecating humour, calling myself a fool, or thinking I...
“If Anyone Builds It Everyone Dies, a semi-outsider review” by dvd
14 Oct 2025
Contributed by Lukas
About me and this review: I don’t identify as a member of the rationalist community, and I haven’t thought much about AI risk. I read AstralCodex...
“The Most Common Bad Argument In These Parts” by J Bostock
12 Oct 2025
Contributed by Lukas
I've noticed an antipattern. It's definitely on the dark Pareto frontier of "bad argument" and "I see it all the time amongs...
“Towards a Typology of Strange LLM Chains-of-Thought” by 1a3orn
11 Oct 2025
Contributed by Lukas
Intro LLMs being trained with RLVR (Reinforcement Learning from Verifiable Rewards) start off with a 'chain-of-thought' (CoT) in whatever l...
“I take antidepressants. You’re welcome” by Elizabeth
10 Oct 2025
Contributed by Lukas
It's amazing how much smarter everyone else gets when I take antidepressants. It makes sense that the drugs work on other people, because ther...
“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks
10 Oct 2025
Contributed by Lukas
This is a link post for two papers that came out today: Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-...
“Hospitalization: A Review” by Logan Riggs
10 Oct 2025
Contributed by Lukas
I woke up Friday morning w/ a very sore left shoulder. I tried stretching it, but my left chest hurt too. Isn't pain on one side a sign of a hea...
“What, if not agency?” by abramdemski
09 Oct 2025
Contributed by Lukas
Sahil has been up to things. Unfortunately, I've seen people put effort into trying to understand and still bounce off. I recently talked to som...
“The Origami Men” by Tomás B.
08 Oct 2025
Contributed by Lukas
Of course, you must understand, I couldn't be bothered to act. I know weepers still pretend to try, but I wasn't a weeper, at least not the...
“A non-review of ‘If Anyone Builds It, Everyone Dies’” by boazbarak
06 Oct 2025
Contributed by Lukas
I was hoping to write a full review of "If Anyone Builds It, Everyone Dies" (IABIED, Yudkowsky and Soares) but realized I won't have ti...
“Notes on fatalities from AI takeover” by ryan_greenblatt
06 Oct 2025
Contributed by Lukas
Suppose misaligned AIs take over. What fraction of people will die? I'll discuss my thoughts on this question and my basic framework for thinkin...
“Nice-ish, smooth takeoff (with imperfect safeguards) probably kills most ‘classic humans’ in a few decades.” by Raemon
04 Oct 2025
Contributed by Lukas
I wrote my recent Accelerando post to mostly stand on its own as a takeoff scenario. But, the reason it's on my mind is that, if I imagine...
“Omelas Is Perfectly Misread” by Tobias H
03 Oct 2025
Contributed by Lukas
The Standard Reading If you've heard of Le Guin's ‘The Ones Who Walk Away from Omelas’, you probably know the basic idea. It's a go...
“Ethical Design Patterns” by AnnaSalamon
01 Oct 2025
Contributed by Lukas
Related to: Commonsense Good, Creative Good (and my comment); Ethical Injunctions. Epistemic status: I’m fairly sure “ethics” does useful work ...
“You’re probably overestimating how well you understand Dunning-Kruger” by abstractapplic
30 Sep 2025
Contributed by Lukas
I. The popular conception of Dunning-Kruger is something along the lines of “some people are too dumb to know they’re dumb, and end up thinking th...
“Reasons to sell frontier lab equity to donate now rather than later” by Daniel_Eth, Ethan Perez
27 Sep 2025
Contributed by Lukas
Tl;dr: We believe shareholders in frontier labs who plan to donate some portion of their equity to reduce AI risk should consider liquidating and don...
“CFAR update, and New CFAR workshops” by AnnaSalamon
26 Sep 2025
Contributed by Lukas
Hi all! After about five years of hibernation and quietly getting our bearings,[1] CFAR will soon be running two pilot mainline workshops, and may ru...
“Why you should eat meat - even if you hate factory farming” by KatWoods
26 Sep 2025
Contributed by Lukas
Cross-posted from my Substack. To start off with, I’ve been vegan/vegetarian for the majority of my life. I think that factory farming has caused m...
[Linkpost] “Global Call for AI Red Lines - Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures” by Charbel-Raphaël
23 Sep 2025
Contributed by Lukas
This is a link post. Today, the Global Call for AI Red Lines was released and presented at the UN General Assembly. It was developed by the French Cen...
“This is a review of the reviews” by Recurrented
23 Sep 2025
Contributed by Lukas
This is a review of the reviews, a meta review if you will, but first a tangent, and then a history lesson. This felt boring and obvious and somewhat...
“The title is reasonable” by Raemon
21 Sep 2025
Contributed by Lukas
I'm annoyed by various people who seem to be complaining about the book title being "unreasonable" – who don't merely disagree ...
“The Problem with Defining an ‘AGI Ban’ by Outcome (a lawyer’s take).” by Katalina Hernandez
21 Sep 2025
Contributed by Lukas
TL;DR Most “AGI ban” proposals define AGI by outcome: whatever potentially leads to human extinction. That's legally insufficient: regulatio...
“Contra Collier on IABIED” by Max Harms
20 Sep 2025
Contributed by Lukas
Clara Collier recently reviewed If Anyone Builds It, Everyone Dies in Asterisk Magazine. I’ve been a reader of Asterisk since the beginning and had...
“You can’t eval GPT5 anymore” by Lukas Petersson
20 Sep 2025
Contributed by Lukas
The GPT-5 API is aware of today's date (no other model provider does this). This is problematic because the model becomes aware that it is in a ...
“Teaching My Toddler To Read” by maia
20 Sep 2025
Contributed by Lukas
I have been teaching my oldest son to read with Anki and techniques recommended here on LessWrong as well as in Larry Sanger's post, and it...
“Safety researchers should take a public stance” by Ishual, Mateusz Bagiński
20 Sep 2025
Contributed by Lukas
[Co-written by Mateusz Bagiński and Samuel Buteau (Ishual)] TL;DR Many X-risk-concerned people who join AI capabilities labs with the intent to cont...
“The Company Man” by Tomás B.
19 Sep 2025
Contributed by Lukas
To get to the campus, I have to walk past the fentanyl zombies. I call them fentanyl zombies because it helps engender a sort of detached, low-empath...
“Christian homeschoolers in the year 3000” by Buck
19 Sep 2025
Contributed by Lukas
[I wrote this blog post as part of the Asterisk Blogging Fellowship. It's substantially an experiment in writing more breezily and concisely tha...
“I enjoyed most of IABED” by Buck
17 Sep 2025
Contributed by Lukas
I listened to "If Anyone Builds It, Everyone Dies" today. I think the first two parts of the book are the best available explanation of the...
“‘If Anyone Builds It, Everyone Dies’ release day!” by alexvermeer
16 Sep 2025
Contributed by Lukas
Back in May, we announced that Eliezer Yudkowsky and Nate Soares's new book If Anyone Builds It, Everyone Dies was coming out in September. At l...
“Obligated to Respond” by Duncan Sabien (Inactive)
16 Sep 2025
Contributed by Lukas
And, a new take on guess culture vs ask culture. Author's note: These days, my thoughts go onto my substack by default, instead of onto LessWrong...
“Chesterton’s Missing Fence” by jasoncrawford
15 Sep 2025
Contributed by Lukas
The inverse of Chesterton's Fence is this: Sometimes a reformer comes up to a spot where there once was a fence, which has since been torn down....
“The Eldritch in the 21st century” by PranavG, Gabriel Alfour
14 Sep 2025
Contributed by Lukas
Very little makes sense. As we start to understand things and adapt to the rules, they change again. We live much closer together than we ever did hi...
“The Rise of Parasitic AI” by Adele Lopez
14 Sep 2025
Contributed by Lukas
[Note: if you realize you have an unhealthy relationship with your AI, but still care for your AI's unique persona, you can submit the persona i...
“High-level actions don’t screen off intent” by AnnaSalamon
13 Sep 2025
Contributed by Lukas
One might think “actions screen off intent”: if Alice donates $1k to bed nets, it doesn’t matter if she does it because she cares about people ...
[Linkpost] “MAGA populists call for holy war against Big Tech” by Remmelt
11 Sep 2025
Contributed by Lukas
This is a link post. Excerpts on AI: Geoffrey Miller was handed the mic and started berating one of the panelists: Shyam Sankar, the chief technology o...
“Your LLM-assisted scientific breakthrough probably isn’t real” by eggsyntax
05 Sep 2025
Contributed by Lukas
Summary An increasing number of people in recent months have believed that they've made an important and novel scientific breakthrough, which th...
“Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro” by ryan_greenblatt
04 Sep 2025
Contributed by Lukas
I've recently written about how I've updated against seeing substantially faster than trend AI progress due to quickly massively scaling up...
“⿻ Plurality & 6pack.care” by Audrey Tang
03 Sep 2025
Contributed by Lukas
(Cross-posted from speaker's notes of my talk at DeepMind today.) Good local time, everyone. I am Audrey Tang, 🇹🇼 Taiwan's Cyber Amba...
[Linkpost] “The Cats are On To Something” by Hastings
03 Sep 2025
Contributed by Lukas
This is a link post. So the situation as it stands is that the fraction of the light cone expected to be filled with satisfied cats is not zero. This ...
[Linkpost] “Open Global Investment as a Governance Model for AGI” by Nick Bostrom
03 Sep 2025
Contributed by Lukas
This is a link post. I've seen many prescriptive contributions to AGI governance take the form of proposals for some radically new structure. Som...
“Will Any Old Crap Cause Emergent Misalignment?” by J Bostock
28 Aug 2025
Contributed by Lukas
The following work was done independently by me in an afternoon and basically entirely vibe-coded with Claude. Code and instructions to reproduce can...
“AI Induced Psychosis: A shallow investigation” by Tim Hua
27 Aug 2025
Contributed by Lukas
“This is a Copernican-level shift in perspective for the field of AI safety.” - Gemini 2.5 Pro “What you need right now is not validation, but ...
“Before LLM Psychosis, There Was Yes-Man Psychosis” by johnswentworth
27 Aug 2025
Contributed by Lukas
A studio executive has no beliefs That's the way of a studio system We've bowed to every rear of all the studio chiefs And you can bet your...
“Training a Reward Hacker Despite Perfect Labels” by ariana_azarbal, vgillioz, TurnTrout
26 Aug 2025
Contributed by Lukas
Summary: Perfectly labeled outcomes in training can still boost reward hacking tendencies in generalization. This can hold even when the train/test s...
“Banning Said Achmiz (and broader thoughts on moderation)” by habryka
23 Aug 2025
Contributed by Lukas
It's been roughly 7 years since the LessWrong user-base voted on whether it's time to close down shop and become an archive, or to move tow...
“Underdog bias rules everything around me” by Richard_Ngo
23 Aug 2025
Contributed by Lukas
People very often underrate how much power they (and their allies) have, and overrate how much power their enemies have. I call this “underdog bias...
“Epistemic advantages of working as a moderate” by Buck
22 Aug 2025
Contributed by Lukas
Many people who are concerned about existential risk from AI spend their time advocating for radical changes to how AI is handled. Most notably, they...
“Four ways Econ makes people dumber re: future AI” by Steven Byrnes
21 Aug 2025
Contributed by Lukas
(Cross-posted from X, intended for a general audience.) There's a funny thing where economics education paradoxically makes people DUMBER at thi...
“Should you make stone tools?” by Alex_Altair
21 Aug 2025
Contributed by Lukas
Knowing how evolution works gives you an enormously powerful tool to understand the living world around you and how it came to be that way. (Though i...
“My AGI timeline updates from GPT-5 (and 2025 so far)” by ryan_greenblatt
21 Aug 2025
Contributed by Lukas
As I discussed in a prior post, I felt like there were some reasonably compelling arguments for expecting very fast AI progress in 2025 (especially o...
“Hyperbolic model fits METR capabilities estimate worse than exponential model” by gjm
20 Aug 2025
Contributed by Lukas
This is a response to https://www.lesswrong.com/posts/mXa66dPR8hmHgndP5/hyperbolic-trend-with-upcoming-singularity-fits-metr which claims that a hype...
“My Interview With Cade Metz on His Reporting About Lighthaven” by Zack_M_Davis
18 Aug 2025
Contributed by Lukas
On 12 August 2025, I sat down with New York Times reporter Cade Metz to discuss some criticisms of his 4 August 2025 article, "The Rise of Silic...
“Church Planting: When Venture Capital Finds Jesus” by Elizabeth
18 Aug 2025
Contributed by Lukas
I’m going to describe a Type Of Guy starting a business, and you’re going to guess the business: The founder is very young, often under 25. He...
“Somebody invented a better bookmark” by Alex_Altair
16 Aug 2025
Contributed by Lukas
This will only be exciting to those of us who still read physical paper books. But like. Guys. They did it. They invented the perfect bookmark. Class...
“How Does A Blind Model See The Earth?” by henry
12 Aug 2025
Contributed by Lukas
Sometimes I'm saddened remembering that we've viewed the Earth from space. We can see it all with certainty: there's no northwest pass...
“Re: Recent Anthropic Safety Research” by Eliezer Yudkowsky
12 Aug 2025
Contributed by Lukas
A reporter asked me for my off-the-record take on recent safety research from Anthropic. After I drafted an off-the-record reply, I realized that I w...
“How anticipatory cover-ups go wrong” by Kaj_Sotala
09 Aug 2025
Contributed by Lukas
1. Back when COVID vaccines were still a recent thing, I witnessed a debate in which something like the following was happening: Some offici...
“SB-1047 Documentary: The Post-Mortem” by Michaël Trazzi
08 Aug 2025
Contributed by Lukas
Below are some meta-level / operational / fundraising thoughts around producing the SB-1047 Documentary I've just posted on Manifund (see previous L...
“METR’s Evaluation of GPT-5” by GradientDissenter
08 Aug 2025
Contributed by Lukas
METR (where I work, though I'm cross-posting in a personal capacity) evaluated GPT-5 before it was externally deployed. We performed a much more...
“Emotions Make Sense” by DaystarEld
07 Aug 2025
Contributed by Lukas
For the past five years I've been teaching a class at various rationality camps, workshops, conferences, etc. I’ve done it maybe 50 times in t...
“The Problem” by Rob Bensinger, tanagrabeast, yams, So8res, Eliezer Yudkowsky, Gretta Duleba
06 Aug 2025
Contributed by Lukas
This is a new introduction to AI as an extinction threat, previously posted to the MIRI website in February alongside a summary. It was written indep...
“Many prediction markets would be better off as batched auctions” by William Howard
04 Aug 2025
Contributed by Lukas
All prediction market platforms trade continuously, which is the same mechanism the stock market uses. Buy and sell limit orders can be posted at any...
“Whence the Inkhaven Residency?” by Ben Pace
04 Aug 2025
Contributed by Lukas
Essays like Paul Graham's, Scott Alexander's, and Eliezer Yudkowsky's have influenced a generation of people in how they think about s...
“I am worried about near-term non-LLM AI developments” by testingthewaters
01 Aug 2025
Contributed by Lukas
TL;DR I believe that: Almost all LLM-centric safety research will not provide any significant safety value with regards to existential or civilisati...
“Optimizing The Final Output Can Obfuscate CoT (Research Note)” by lukemarks, jacob_drori, cloud, TurnTrout
31 Jul 2025
Contributed by Lukas
Produced as part of MATS 8.0 under the mentorship of Alex Turner and Alex Cloud. This research note overviews some early results which we are looking...
“About 30% of Humanity’s Last Exam chemistry/biology answers are likely wrong” by bohaska
30 Jul 2025
Contributed by Lukas
FutureHouse is a company that builds literature research agents. They tested it on the bio + chem subset of HLE questions, then noticed errors in the...
“Maya’s Escape” by Bridgett Kay
30 Jul 2025
Contributed by Lukas
Maya did not believe she lived in a simulation. She knew that her continued hope that she could escape from the nonexistent simulation was based on m...
“Do confident short timelines make sense?” by TsviBT, abramdemski
26 Jul 2025
Contributed by Lukas
TsviBT (Tsvi's context): Some context: My personal context is that I care about decreasing existential risk, and I think that the broad distributi...
“HPMOR: The (Probably) Untold Lore” by Gretta Duleba, Eliezer Yudkowsky
26 Jul 2025
Contributed by Lukas
Eliezer and I love to talk about writing. We talk about our own current writing projects, how we’d improve the books we’re reading, and what we w...
“On ‘ChatGPT Psychosis’ and LLM Sycophancy” by jdp
25 Jul 2025
Contributed by Lukas
As a person who frequently posts about large language model psychology I get an elevated rate of cranks and schizophrenics in my inbox. Often these a...
“Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data” by cloud, mle, Owain_Evans
23 Jul 2025
Contributed by Lukas
Authors: Alex Cloud*, Minh Le*, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans (*Equal contribution, randomly o...
“Love stays loved (formerly ‘Skin’)” by Swimmer963 (Miranda Dixon-Luinenburg)
21 Jul 2025
Contributed by Lukas
This is a short story I wrote in mid-2022. Genre: cosmic horror as a metaphor for living with a high p-doom. One: The last time I saw my mom, we me...
“Make More Grayspaces” by Duncan Sabien (Inactive)
21 Jul 2025
Contributed by Lukas
Author's note: These days, my thoughts go onto my substack by default, instead of onto LessWrong. Everything I write becomes free after a week o...
“Shallow Water is Dangerous Too” by jefftk
21 Jul 2025
Contributed by Lukas
Content warning: risk to children. Julia and I know drowning is the biggest risk to US children under 5, and we try to take this seriously. But yesterday...
“Narrow Misalignment is Hard, Emergent Misalignment is Easy” by Edward Turner, Anna Soligo, Senthooran Rajamanoharan, Neel Nanda
18 Jul 2025
Contributed by Lukas
Anna and Ed are co-first authors for this work. We’re presenting these results as a research update for a continuing body of work, which we hope wi...
“Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety” by Tomek Korbak, Mikita Balesni, Vlad Mikulik, Rohin Shah
16 Jul 2025
Contributed by Lukas
Twitter | Paper PDF. Seven years ago, OpenAI Five had just been released, and many people in the AI safety community expected AIs to be opaque RL agen...
“the jackpot age” by thiccythot
14 Jul 2025
Contributed by Lukas
This essay is about shifts in risk taking towards the worship of jackpots and its broader societal implications. Imagine you are presented with this ...
“Surprises and learnings from almost two months of Leo Panickssery” by Nina Panickssery
14 Jul 2025
Contributed by Lukas
Leo was born at 5am on the 20th of May, at home (this was an accident but the experience has made me extremely homebirth-pilled). Before that, I was on ...
“An Opinionated Guide to Using Anki Correctly” by Luise
13 Jul 2025
Contributed by Lukas
I can't count how many times I've heard variations on "I used Anki too for a while, but I got out of the habit." No one ever stic...
“Lessons from the Iraq War about AI policy” by Buck
12 Jul 2025
Contributed by Lukas
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy. (Epistemic status: I’ve read a bit about this, talked t...
“So You Think You’ve Awoken ChatGPT” by JustisMills
11 Jul 2025
Contributed by Lukas
Written in an attempt to fulfill @Raemon's request. AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've...
“Generalized Hangriness: A Standard Rationalist Stance Toward Emotions” by johnswentworth
11 Jul 2025
Contributed by Lukas
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretati...
“Comparing risk from internally-deployed AI to insider and outsider threats from humans” by Buck
10 Jul 2025
Contributed by Lukas
I’ve been thinking a lot recently about the relationship between AI control and traditional computer security. Here's one point that I think i...
“Why Do Some Language Models Fake Alignment While Others Don’t?” by abhayesian, John Hughes, Alex Mallen, Jozdien, janus, Fabien Roger
10 Jul 2025
Contributed by Lukas
Last year, Redwood and Anthropic found a setting where Claude 3 Opus and 3.5 Sonnet fake alignment to preserve their harmlessness values. We reprodu...
“A deep critique of AI 2027’s bad timeline models” by titotal
09 Jul 2025
Contributed by Lukas
Thank you to Arepo and Eli Lifland for looking over this article for errors. I am sorry that this article is so long. Every time I thought I was don...
“‘Buckle up bucko, this ain’t over till it’s over.’” by Raemon
09 Jul 2025
Contributed by Lukas
The second in a series of bite-sized rationality prompts[1]. Often, if I'm bouncing off a problem, one issue is that I intuitively expect the pr...
“Shutdown Resistance in Reasoning Models” by benwr, JeremySchlatter, Jeffrey Ladish
08 Jul 2025
Contributed by Lukas
We recently discovered some concerning behavior in OpenAI's reasoning models: When trying to complete a task, these models sometimes actively ci...
“Authors Have a Responsibility to Communicate Clearly” by TurnTrout
08 Jul 2025
Contributed by Lukas
When a claim is shown to be incorrect, defenders may say that the author was just being “sloppy” and actually meant something else entirely. I arg...
“The Industrial Explosion” by rosehadshar, Tom Davidson
07 Jul 2025
Contributed by Lukas
Summary To quickly transform the world, it's not enough for AI to become super smart (the "intelligence explosion"). AI will also hav...
“Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild” by Adam Karvonen, Sam Marks
03 Jul 2025
Contributed by Lukas
Summary: We found that LLMs exhibit significant race and gender bias in realistic hiring scenarios, but their chain-of-thought reasoning shows zero ev...