LessWrong (Curated & Popular)

Technology Society & Culture

Episodes

Showing 1-100 of 743

"Prompt injection in Google Translate reveals base model behaviors behind task-specific fine-tuning" by megasilverfist

09 Feb 2026

Contributed by Lukas

tl;dr Argumate on Tumblr found you can sometimes access the base model behind Google Translate via prompt injection. The result replicates for me, an...

"Near-Instantly Aborting the Worst Pain Imaginable with Psychedelics" by eleweek

08 Feb 2026

Contributed by Lukas

Psychedelics are usually known for many things: making people see cool fractal patterns, shaping 60s music culture, healing trauma. Neuroscientists u...

"Post-AGI Economics As If Nothing Ever Happens" by Jan_Kulveit

07 Feb 2026

Contributed by Lukas

When economists think and write about the post-AGI world, they often rely on the implicit assumption that parameters may change, but fundamentally, s...

"IABIED Book Review: Core Arguments and Counterarguments" by Stephen McAleese

05 Feb 2026

Contributed by Lukas

The recent book “If Anyone Builds It Everyone Dies” (September 2025) by Eliezer Yudkowsky and Nate Soares argues that creating superintelligent A...

"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

04 Feb 2026

Contributed by Lukas

Author's note: this is somewhat more rushed than ideal, but I think getting this out sooner is pretty important. Ideally, it would be a bit less...

"Conditional Kickstarter for the “Don’t Build It” March" by Raemon

03 Feb 2026

Contributed by Lukas

tl;dr: You can pledge to join a big protest to ban AGI research at ifanyonebuildsit.com/march, which only triggers if 100,000 people sign up. The If ...

"How to Hire a Team" by Gretta Duleba

01 Feb 2026

Contributed by Lukas

A low-effort guide I dashed off in less than an hour, because I got riled up. Try not to hire a team. Try pretty hard at this. Try to find a more e...

"The Possessed Machines (summary)" by L Rudolf L

29 Jan 2026

Contributed by Lukas

The Possessed Machines is one of the most important AI microsites. It was published anonymously by an ex-lab employee, and does not seem to have spr...

"Ada Palmer: Inventing the Renaissance" by Martin Sustrik

28 Jan 2026

Contributed by Lukas

Papal election of 1492 For over a decade, Ada Palmer, a history professor at University of Chicago (and a science-fiction writer!), struggled to teach...

"AI found 12 of 12 OpenSSL zero-days (while curl cancelled its bug bounty)" by Stanislav Fort

28 Jan 2026

Contributed by Lukas

This is a partial follow-up to AISLE discovered three new OpenSSL vulnerabilities from October 2025. TL;DR: OpenSSL is among the most scrutinized and...

"Dario Amodei – The Adolescence of Technology" by habryka

28 Jan 2026

Contributed by Lukas

Dario Amodei, CEO of Anthropic, has written a new essay on his thoughts on AI risk of various shapes. It seems worth reading, even if just for unders...

"AlgZoo: uninterpreted models with fewer than 1,500 parameters" by Jacob_Hilton

27 Jan 2026

Contributed by Lukas

Audio note: this article contains 78 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text i...

"Does Pentagon Pizza Theory Work?" by rba

27 Jan 2026

Contributed by Lukas

As soon as modern data analysis became a thing, the US government has had to deal with people trying to use open source data to uncover its secrets. ...

"The inaugural Redwood Research podcast" by Buck, ryan_greenblatt

27 Jan 2026

Contributed by Lukas

After five months of me (Buck) being slow at finishing up the editing on this, we’re finally putting out our inaugural Redwood Research podcast. I ...

"Canada Lost Its Measles Elimination Status Because We Don’t Have Enough Nurses Who Speak Low German" by jenn

26 Jan 2026

Contributed by Lukas

This post was originally published on November 11th, 2025. I've been spending some time reworking and cleaning up the Inkhaven posts I'm mo...

"Deep learning as program synthesis" by Zach Furman

24 Jan 2026

Contributed by Lukas

Audio note: this article contains 73 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text i...

"Why I Transitioned: A Response" by marisa

24 Jan 2026

Contributed by Lukas

Fiora Sunshine's post, Why I Transitioned: A Case Study (the OP) articulates a valuable theory for why some MtFs transition. If you are MtF and ...

"Claude’s new constitution" by Zac Hatfield-Dodds

22 Jan 2026

Contributed by Lukas

Read the constitution. Previously: 'soul document' discussion here. We're publishing a new constitution for our AI model, Claude. It...

[Linkpost] "“The first two weeks are the hardest”: my first digital declutter" by mingyuan

20 Jan 2026

Contributed by Lukas

This is a link post. It is unbearable to not be consuming. All through the house is nothing but silence. The need inside of me is not an ache, it is c...

"What Washington Says About AGI" by zroe1

20 Jan 2026

Contributed by Lukas

I spent a few hundred dollars on Anthropic API credits and let Claude individually research every current US congressperson's position on AI. Th...

"Precedents for the Unprecedented: Historical Analogies for Thirteen Artificial Superintelligence Risks" by James_Miller

19 Jan 2026

Contributed by Lukas

Since artificial superintelligence has never existed, claims that it poses a serious risk of global catastrophe can be easy to dismiss as fearmongeri...

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

19 Jan 2026

Contributed by Lukas

Boaz Barak, Gabriel Wu, Jeremy Chen, Manas Joglekar [Linkposting from the OpenAI alignment blog, where we post more speculative/technical/informal r...

"Backyard cat fight shows Schelling points preexist language" by jchan

16 Jan 2026

Contributed by Lukas

Two cats fighting for control over my backyard appear to have settled on a particular chain-link fence as the delineation between their territories. ...

"How AI Is Learning to Think in Secret" by Nicholas Andresen

09 Jan 2026

Contributed by Lukas

On Thinkish, Neuralese, and the End of Readable Reasoning In September 2025, researchers published the internal monologue of OpenAI's GPT-o3 as ...

"On Owning Galaxies" by Simon Lermen

08 Jan 2026

Contributed by Lukas

It seems to be a real view held by serious people that your OpenAI shares will soon be tradable for moons and galaxies. This includes eminent thinker...

"AI Futures Timelines and Takeoff Model: Dec 2025 Update" by elifland, bhalstead, Alex Kastner, Daniel Kokotajlo

06 Jan 2026

Contributed by Lukas

We’ve significantly upgraded our timelines and takeoff models! It predicts when AIs will reach key capability milestones: for example, Automated Co...

"In My Misanthropy Era" by jenn

05 Jan 2026

Contributed by Lukas

For the past year I've been sinking into the Great Books via the Penguin Great Ideas series, because I wanted to be conversant in the Great Conv...

"2025 in AI predictions" by jessicata

03 Jan 2026

Contributed by Lukas

Past years: 2023 2024 Continuing a yearly tradition, I evaluate AI predictions from past years, and collect a convenience sample of AI predictions ma...

"Good if make prior after data instead of before" by dynomight

27 Dec 2025

Contributed by Lukas

They say you’re supposed to choose your prior in advance. That's why it's called a “prior”. First, you’re supposed to say how p...

"Measuring no CoT math time horizon (single forward pass)" by ryan_greenblatt

27 Dec 2025

Contributed by Lukas

A key risk factor for scheming (and misalignment more generally) is opaque reasoning ability. One proxy for this is how good AIs are at solving math p...

"Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance" by ryan_greenblatt

23 Dec 2025

Contributed by Lukas

Prior results have shown that LLMs released before 2024 can't leverage 'filler tokens'—unrelated tokens prior to the model's fi...

"Turning 20 in the probable pre-apocalypse" by Parv Mahajan

23 Dec 2025

Contributed by Lukas

Master version of this on https://parvmahajan.com/2025/12/21/turning-20.html I turn 20 in January, and the world looks very strange. Probably, thing...

"Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment" by Cam, Puria Radmard, Kyle O’Brien, David Africa, Samuel Ratnam, andyk

23 Dec 2025

Contributed by Lukas

TL;DR LLMs pretrained on data about misaligned AIs themselves become less aligned. Luckily, pretraining LLMs with synthetic data about good AIs helps...

"Dancing in a World of Horseradish" by lsusr

22 Dec 2025

Contributed by Lukas

Commercial airplane tickets are divided up into coach, business class, and first class. In 2014, Etihad introduced The Residence, a premium experienc...

"Contradict my take on OpenPhil’s past AI beliefs" by Eliezer Yudkowsky

21 Dec 2025

Contributed by Lukas

At many points now, I've been asked in private for a critique of EA / EA's history / EA's impact and I have ad-libbed statements that ...

"Opinionated Takes on Meetups Organizing" by jenn

21 Dec 2025

Contributed by Lukas

Screwtape, as the global ACX meetups czar, has to be reasonable and responsible in his advice giving for running meetups. And the advice is great! It...

"How to game the METR plot" by shash42

21 Dec 2025

Contributed by Lukas

TL;DR: In 2025, we were in the 1-4 hour range, which has only 14 samples in METR's underlying data. The topic of each sample is public, making i...

"Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers" by Sam Marks, Adam Karvonen, James Chua, Subhash Kantamneni, Euan Ong, Julian Minder, Clément Dumas, Owain_Evans

20 Dec 2025

Contributed by Lukas

TL;DR: We train LLMs to accept LLM neural activations as inputs and answer arbitrary questions about them in natural language. These Activation Oracl...

"Scientific breakthroughs of the year" by technicalities

17 Dec 2025

Contributed by Lukas

A couple of years ago, Gavin became frustrated with science journalism. No one was pulling together results across fields; the articles usually didn...

"A high integrity/epistemics political machine?" by Raemon

17 Dec 2025

Contributed by Lukas

I have goals that can only be reached via a powerful political machine. Probably a lot of other people around here share them. (Goals include “ensu...

"How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)" by Kaj_Sotala

16 Dec 2025

Contributed by Lukas

How it started I used to think that anything that LLMs said about having something like subjective experience or what it felt like on the inside was ...

“My AGI safety research—2025 review, ’26 plans” by Steven Byrnes

15 Dec 2025

Contributed by Lukas

Previous: 2024, 2022 “Our greatest fear should not be of failure, but of succeeding at something that doesn't really matter.” –attributed ...

“Weird Generalization & Inductive Backdoors” by Jorio Cocola, Owain_Evans, dylan_f

14 Dec 2025

Contributed by Lukas

This is the abstract and introduction of our new paper. Links: 📜 Paper, 🐦 Twitter thread, 🌐 Project page, 💻 Code Authors: Jan Betley*, J...

“Insights into Claude Opus 4.5 from Pokémon” by Julian Bradshaw

13 Dec 2025

Contributed by Lukas

Credit: Nano Banana, with some text provided. You may be surprised to learn that ClaudePlaysPokemon is still running today, and that Claude still hasn...

“The funding conversation we left unfinished” by jenn

13 Dec 2025

Contributed by Lukas

People working in the AI industry are making stupid amounts of money, and word on the street is that Anthropic is going to have some sort of liquidit...

“The behavioral selection model for predicting AI motivations” by Alex Mallen, Buck

11 Dec 2025

Contributed by Lukas

Highly capable AI systems might end up deciding the future. Understanding what will drive those decisions is therefore one of the most important ques...

“Little Echo” by Zvi

09 Dec 2025

Contributed by Lukas

I believe that we will win. An echo of an old ad for the 2014 US men's World Cup team. It did not win. I was in Berkeley for the 2025 Secular So...

“A Pragmatic Vision for Interpretability” by Neel Nanda

08 Dec 2025

Contributed by Lukas

Executive Summary The Google DeepMind mechanistic interpretability team has made a strategic pivot over the past year, from ambitious reverse-engine...

“AI in 2025: gestalt” by technicalities

08 Dec 2025

Contributed by Lukas

This is the editorial for this year's "Shallow Review of AI Safety". (It got long enough to stand alone.) Epistemic status: subjectiv...

“Eliezer’s Unteachable Methods of Sanity” by Eliezer Yudkowsky

07 Dec 2025

Contributed by Lukas

"How are you coping with the end of the world?" journalists sometimes ask me, and the true answer is something they have no hope of underst...

“An Ambitious Vision for Interpretability” by leogao

06 Dec 2025

Contributed by Lukas

The goal of ambitious mechanistic interpretability (AMI) is to fully understand how neural networks work. While some have pivoted towards more pragma...

“6 reasons why ‘alignment-is-hard’ discourse seems alien to human intuitions, and vice-versa” by Steven Byrnes

04 Dec 2025

Contributed by Lukas

Tl;dr AI alignment has a culture clash. On one side, the “technical-alignment-is-hard” / “rational agents” school-of-thought argues that we s...

“Three things that surprised me about technical grantmaking at Coefficient Giving (fka Open Phil)”

03 Dec 2025

Contributed by Lukas

Coefficient Giving's Technical AI Safety team is hiring grantmakers. I thought this would be a good moment to share som...

“MIRI’s 2025 Fundraiser” by alexvermeer

02 Dec 2025

Contributed by Lukas

MIRI is running its first fundraiser in six years, targeting $6M. The first $1.6M raised will be matched 1:1 via an SFF grant. Fundraiser ends at mid...

“The Best Lack All Conviction: A Confusing Day in the AI Village”

01 Dec 2025

Contributed by Lukas

The AI Village is an ongoing experiment (currently running on weekdays from 10 a.m. to 2 p.m. Pacific time) in which frontier language models are giv...

“The Boring Part of Bell Labs” by Elizabeth

30 Nov 2025

Contributed by Lukas

It took me a long time to realize that Bell Labs was cool. You see, my dad worked at Bell Labs, and he has not done a single cool thing in his life e...

[Linkpost] “The Missing Genre: Heroic Parenthood - You can have kids and still punch the sun”

30 Nov 2025

Contributed by Lukas

This is a link post. I stopped reading when I was 30. You can fill in all the stereotypes of a girl with a book glued to her face during every meal, e...

“Writing advice: Why people like your quick bullshit takes better than your high-effort posts”

30 Nov 2025

Contributed by Lukas

Right now I’m coaching for Inkhaven, a month-long marathon writing event where our brave residents are writing a blog post every single day for the...

“Claude 4.5 Opus’ Soul Document”

30 Nov 2025

Contributed by Lukas

Summary As far as I understand and uncovered, a document for the character training for Claude is compressed in Claude's weights. The full docum...

“Unless its governance changes, Anthropic is untrustworthy”

29 Nov 2025

Contributed by Lukas

Anthropic is untrustworthy. This post provides arguments, asks questions, and documents some examples of Anthropic's leadership being misleading...

“Alignment remains a hard, unsolved problem”

27 Nov 2025

Contributed by Lukas

Thanks to (in alphabetical order) Joshua Batson, Roger Grosse, Jeremy Hadfield, Jared Kaplan, Jan Leike, Jack Lindsey, Monte MacDiarmid, Francesco Mo...

“Video games are philosophy’s playground” by Rachel Shu

26 Nov 2025

Contributed by Lukas

Crypto people have this saying: "cryptocurrencies are macroeconomics' playground." The idea is that blockchains let you cheaply spin u...

“Stop Applying And Get To Work” by plex

24 Nov 2025

Contributed by Lukas

TL;DR: Figure out what needs doing and do it, don't wait on approval from fellowships or jobs. If you... Have short timelines Have been struggl...

“Gemini 3 is Evaluation-Paranoid and Contaminated”

23 Nov 2025

Contributed by Lukas

TL;DR: Gemini 3 frequently thinks it is in an evaluation when it is not, assuming that all of its reality is fabricated. It can also reliably output...

“Natural emergent misalignment from reward hacking in production RL” by evhub, Monte M, Benjamin Wright, Jonathan Uesato

22 Nov 2025

Contributed by Lukas

Abstract We show that when large language models learn to reward hack on production RL environments, this can result in egregious emergent misalignme...

“Anthropic is (probably) not meeting its RSP security commitments” by habryka

21 Nov 2025

Contributed by Lukas

TLDR: An AI company's model weight security is at most as good as its compute providers' security. Anthropic has committed (with a bit of a...

“Varieties Of Doom” by jdp

20 Nov 2025

Contributed by Lukas

There has been a lot of talk about "p(doom)" over the last few years. This has always rubbed me the wrong way because "p(doom)" did...

“How Colds Spread” by RobertM

19 Nov 2025

Contributed by Lukas

It seems like a catastrophic civilizational failure that we don't have confident common knowledge of how colds spread. There have been a number ...

“New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence” by Aaron_Scher, David Abecassis, Brian Abeyta, peterbarnett

19 Nov 2025

Contributed by Lukas

TLDR: We at the MIRI Technical Governance Team have released a report describing an example international agreement to halt the advancement towards a...

“Where is the Capital? An Overview” by johnswentworth

17 Nov 2025

Contributed by Lukas

When a new dollar goes into the capital markets, after being bundled and securitized and lent several times over, where does it end up? When society...

“Problems I’ve Tried to Legibilize” by Wei Dai

17 Nov 2025

Contributed by Lukas

Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk m...

“Do not hand off what you cannot pick up” by habryka

17 Nov 2025

Contributed by Lukas

Delegation is good! Delegation is the foundation of civilization! But in the depths of delegation madness breeds and evil rises. In my experience, t...

“7 Vicious Vices of Rationalists” by Ben Pace

17 Nov 2025

Contributed by Lukas

Vices aren't behaviors that one should never do. Rather, vices are behaviors that are fine and pleasurable to do in moderation, but tempting to ...

“Tell people as early as possible it’s not going to work out” by habryka

17 Nov 2025

Contributed by Lukas

Context: Post #4 in my sequence of private Lightcone Infrastructure memos edited for public consumption This week's principle is more about how ...

“Everyone has a plan until they get lied to the face” by Screwtape

16 Nov 2025

Contributed by Lukas

"Everyone has a plan until they get punched in the face." - Mike Tyson (The exact phrasing of that quote changes, this is my favourite.) I...

“Please, Don’t Roll Your Own Metaethics” by Wei Dai

14 Nov 2025

Contributed by Lukas

One day, when I was interning at the cryptography research department of a large software company, my boss handed me an assignment to break a pseu...

“Paranoia rules everything around me” by habryka

14 Nov 2025

Contributed by Lukas

People sometimes make mistakes [citation needed]. The obvious explanation for most of those mistakes is that decision makers do not have access to th...

“Human Values ≠ Goodness” by johnswentworth

12 Nov 2025

Contributed by Lukas

There is a temptation to simply define Goodness as Human Values, or vice versa. Alas, we do not get to choose the definitions of commonly used words;...

“Condensation” by abramdemski

12 Nov 2025

Contributed by Lukas

Condensation: a theory of concepts is a model of concept-formation by Sam Eisenstat. Its goals and methods resemble John Wentworth's natural abs...

“Mourning a life without AI” by Nikola Jurkovic

10 Nov 2025

Contributed by Lukas

Recently, I looked at the one pair of winter boots I own, and I thought “I will probably never buy winter boots again.” The world as we know it p...

“Unexpected Things that are People” by Ben Goldhaber

09 Nov 2025

Contributed by Lukas

Cross-posted from https://bengoldhaber.substack.com/ It's widely known that Corporations are People. This is universally agreed to be a good thi...

“Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals” by Alexa Pan, ryan_greenblatt

06 Nov 2025

Contributed by Lukas

According to the Sonnet 4.5 system card, Sonnet 4.5 is much more likely than Sonnet 4 to mention in its chain-of-thought that it thinks it is being ev...

“Publishing academic papers on transformative AI is a nightmare” by Jakub Growiec

06 Nov 2025

Contributed by Lukas

I am a professor of economics. Throughout my career, I was mostly working on economic growth theory, and this eventually brought me to the topic of t...

“The Unreasonable Effectiveness of Fiction” by Raelifin

06 Nov 2025

Contributed by Lukas

[Meta: This is Max Harms. I wrote a novel about China and AGI, which comes out today. This essay from my fiction newsletter has been slightly modifie...

“Legible vs. Illegible AI Safety Problems” by Wei Dai

05 Nov 2025

Contributed by Lukas

Some AI safety problems are legible (obvious or understandable) to company leaders and government policymakers, implying they are unlikely to deploy ...

“Lack of Social Grace is a Lack of Skill” by Screwtape

04 Nov 2025

Contributed by Lukas

1.  I have claimed that one of the fundamental questions of rationality is “what am I about to do and what will happen next?” One of the domains...

[Linkpost] “I ate bear fat with honey and salt flakes, to prove a point” by aggliu

04 Nov 2025

Contributed by Lukas

This is a link post. Eliezer Yudkowsky did not exactly suggest that you should eat bear fat covered with honey and sprinkled with salt flakes. What he...

“What’s up with Anthropic predicting AGI by early 2027?” by ryan_greenblatt

04 Nov 2025

Contributed by Lukas

As far as I'm aware, Anthropic is the only AI company with official AGI timelines[1]: they expect AGI by early 2027. In their recommendations (f...

[Linkpost] “Emergent Introspective Awareness in Large Language Models” by Drake Thomas

03 Nov 2025

Contributed by Lukas

This is a link post. New Anthropic research (tweet, blog post, paper): We investigate whether large language models can introspect on their internal ...

[Linkpost] “You’re always stressed, your mind is always busy, you never have enough time” by mingyuan

03 Nov 2025

Contributed by Lukas

This is a link post. You have things you want to do, but there's just never time. Maybe you want to find someone to have kids with, or maybe you ...

“LLM-generated text is not testimony” by TsviBT

03 Nov 2025

Contributed by Lukas

Crosspost from my blog. Synopsis When we share words with each other, we don't only care about the words themselves. We care also—even primar...

“Why I Transitioned: A Case Study” by Fiora Sunshine

02 Nov 2025

Contributed by Lukas

An Overture Famously, trans people tend not to have great introspective clarity into their own motivations for transition. Intuitively, they tend to ...

“The Memetics of AI Successionism” by Jan_Kulveit

31 Oct 2025

Contributed by Lukas

TL;DR: AI progress and the recognition of associated risks are painful to think about. This cognitive dissonance acts as fertile ground in the memeti...

“How Well Does RL Scale?” by Toby_Ord

30 Oct 2025

Contributed by Lukas

This is the latest in a series of essays on AI Scaling. You can find the others on my site. Summary: RL-training for LLMs scales surprisingly poorly...

“An Opinionated Guide to Privacy Despite Authoritarianism” by TurnTrout

30 Oct 2025

Contributed by Lukas

I've created a highly specific and actionable privacy guide, sorted by importance and venturing several layers deep into the privacy iceberg. I ...

“Cancer has a surprising amount of detail” by Abhishaike Mahajan

30 Oct 2025

Contributed by Lukas

There is a very famous essay titled ‘Reality has a surprising amount of detail’. The thesis of the article is that reality is filled, just filled...

“AIs should also refuse to work on capabilities research” by Davidmanheim

29 Oct 2025

Contributed by Lukas

There's a strong argument that humans should stop trying to build more capable AI systems, or at least slow down progress. The risks are plausib...

“On Fleshling Safety: A Debate by Klurl and Trapaucius.” by Eliezer Yudkowsky

27 Oct 2025

Contributed by Lukas

(23K words; best considered as nonfiction with a fictional-dialogue frame, not a proper short story.) Prologue: Klurl and Trapaucius were members of ...

“EU explained in 10 minutes” by Martin Sustrik

24 Oct 2025

Contributed by Lukas

If you want to understand a country, you should pick a similar country that you are already familiar with, research the differences between the two a...

“Cheap Labour Everywhere” by Morpheus

24 Oct 2025

Contributed by Lukas

I recently visited my girlfriend's parents in India. Here is what that experience taught me: Yudkowsky has this facebook post where he makes som...
