LessWrong posts by zvi
Episodes
“Dating Roundup #7: Back to Basics” by Zvi
01 Sep 2025
Contributed by Lukas
There's quite a lot in the queue since last time, so this is the first large chunk of it, which focuses on apps and otherwise finding an initial conn...
“AI #131 Part 2: Various Misaligned Things” by Zvi
29 Aug 2025
Contributed by Lukas
It doesn’t look good, on many fronts, especially taking a stake in Intel. We continue. Table of Contents America Extorts 10% of Intel. Nic...
“AI #131 Part 1: Gemini 2.5 Flash Image is Cool” by Zvi
28 Aug 2025
Contributed by Lukas
Once again we’ve reached the point where the weekly update needs to be split in two. Thus, the alignment and policy coverage will happen tomorrow. ...
“Are They Starting To Take Our Jobs?” by Zvi
27 Aug 2025
Contributed by Lukas
Is generative AI making it harder for young people to find jobs? My answer is: Yes, definitely, in terms of for any given job that exists finding ...
“Reports Of AI Not Progressing Or Offering Mundane Utility Are Often Greatly Exaggerated” by Zvi
26 Aug 2025
Contributed by Lukas
In the wake of the confusions around GPT-5, this week had yet another round of claims that AI wasn’t progressing, or AI isn’t or won’t create m...
“Arguments About AI Consciousness Seem Highly Motivated And At Best Overconfident” by Zvi
25 Aug 2025
Contributed by Lukas
I happily admit I am deeply confused about consciousness. I don’t feel confident I understand what it is, what causes it, which entities have it, ...
“DeepSeek v3.1 Is Not Having a Moment” by Zvi
22 Aug 2025
Contributed by Lukas
What if DeepSeek released a model claiming 66 on SWE and almost no one tried using it? Would it be any good? Would you be able to tell? Or would we ge...
“AI #130: Talking Past The Sale” by Zvi
21 Aug 2025
Contributed by Lukas
One potentially big event was that DeepSeek came out with v3.1. Initial response was very quiet, but this is DeepSeek and there are some strong score...
“AI Companion Conditions” by Zvi
20 Aug 2025
Contributed by Lukas
The conditions are: Lol, we’re Meta. Or lol we’re xAI. This expands upon many previous discussions, including the AI Companion Piece. Lol We’...
“Monthly Roundup #33: August 2025” by Zvi
19 Aug 2025
Contributed by Lukas
I got suckered into paying attention to multiple non-AI political stories this month: The shooting of the messenger, in violation of the most sacred ...
“GPT-5: The Reverse DeepSeek Moment” by Zvi
18 Aug 2025
Contributed by Lukas
Everyone agrees that the release of GPT-5 was botched. Everyone can also agree that the direct jump from GPT-4o and o3 to GPT-5 was not of similar si...
“Spending Too Much Time At Airports” by Zvi
15 Aug 2025
Contributed by Lukas
In honor of Nate Silver's analysis of when to leave for the airport, and because it's been an intense week, I thought I’d offer my thoughts on vari...
“GPT-5s Are Alive: Synthesis” by Zvi
13 Aug 2025
Contributed by Lukas
What do I ultimately make of all the new versions of GPT-5? The practical offerings and how they interact continues to change by the day. I expect m...
“GPT-5s Are Alive: Outside Reactions, the Router and the Resurrection of GPT-4o” by Zvi
12 Aug 2025
Contributed by Lukas
A key problem with having and interpreting reactions to GPT-5 is that it is often unclear whether the reaction is to GPT-5, GPT-5-Router or GPT-5-Thi...
“GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card” by Zvi
11 Aug 2025
Contributed by Lukas
GPT-5 was a long time coming. Is it a good model, sir? Yes. In practice it is a good, but not great, model. Or rather, it is several good models re...
“OpenAI’s GPT-OSS Is Already Old News” by Zvi
08 Aug 2025
Contributed by Lukas
That's on OpenAI. I don’t schedule their product releases. Since it takes several days to gather my reports on new models, we are doing our coverag...
“AI #128: Four Hours Until Probably Not The Apocalypse” by Zvi
07 Aug 2025
Contributed by Lukas
Brace for impact. We are presumably (checks watch) four hours from GPT-5. That's the time you need to catch up on all the other AI news. In another...
“Opus 4.1 Is An Incremental Improvement” by Zvi
06 Aug 2025
Contributed by Lukas
Claude Opus 4 has been updated to Claude Opus 4.1. This is a correctly named incremental update, with the bigger news being ‘we plan to release su...
“Childhood and Education #13: College” by Zvi
05 Aug 2025
Contributed by Lukas
There's a time and a place for everything. It used to be called college. Table of Contents The Big Test. Testing, Testing. Legalized Cheati...
“On Altman’s Interview With Theo Von” by Zvi
04 Aug 2025
Contributed by Lukas
Sam Altman talked recently to Theo Von. Theo is genuinely engaging and curious throughout. This made me wa...
“The Week in AI Governance” by Zvi
01 Aug 2025
Contributed by Lukas
There was enough governance related news this week to spin it out. The EU AI Code of Practice Anthropic, Google, OpenAI, Mistral, Aleph Alpha, ...
“AI #127: Continued Claude Code Complications” by Zvi
31 Jul 2025
Contributed by Lukas
Due to Continued Claude Code Complications, we can report Unlimited Usage Ultimately Unsustainable. May I suggest using the API, where Anthropic's ye...
“Childhood and Education: College Admissions” by Zvi
30 Jul 2025
Contributed by Lukas
Table of Contents College Applications. The College Application Essay (Is) From Hell. Don’t Guess The Teacher's Password, Ask For It Explici...
“Spilling the Tea” by Zvi
29 Jul 2025
Contributed by Lukas
The Tea app is or at least was on fire, rapidly gaining lots of users. This opens up two discussions, one on the game theory and dynamics of Tea, one...
“AI Companion Piece” by Zvi
28 Jul 2025
Contributed by Lukas
AI companions, other forms of personalized AI content and persuasion and related issues continue to be a hot topic. What do people use companions for?...
“America’s AI Action Plan Is Pretty Good” by Zvi
25 Jul 2025
Contributed by Lukas
No, seriously. If you look at the substance, it's pretty good. I’ll go over the whole thing in detail, including the three executive actions implem...
“AI #126: Go Fund Yourself” by Zvi
24 Jul 2025
Contributed by Lukas
The big AI news this week came on many fronts. Google and OpenAI unexpectedly got 2025 IMO Gold using LLMs under test conditions, rather than a tool...
“GPT Agent Is Standing By” by Zvi
23 Jul 2025
Contributed by Lukas
OpenAI now offers 400 shots of ‘agent mode’ per month to Pro subscribers. This incorporates and builds upon OpenAI's Operator. Does that give us...
“Google and OpenAI Get 2025 IMO Gold” by Zvi
22 Jul 2025
Contributed by Lukas
Congratulations, as always, to everyone who got to participate in the 2025 International Mathematical Olympiad, and especially to the gold and other ...
“Monthly Roundup #32: July 2025” by Zvi
21 Jul 2025
Contributed by Lukas
Welcome to the monthly roundup of things that don’t fit into other categories and don’t rise to the level of their own posts. Bad News When...
“On METR’s AI Coding RCT” by Zvi
18 Jul 2025
Contributed by Lukas
METR ran a proper RCT experiment seeing how much access to Cursor (using Sonnet 3.7) would accelerate coders working on their own open source repos. ...
“AI #125: Smooth Criminal” by Zvi
17 Jul 2025
Contributed by Lukas
One story has towered over things this week. Unleash the Grok also known as the anime waifu codependent AI girlfriend Ani, also known as MechaHitler, ...
“Kimi K2” by Zvi
16 Jul 2025
Contributed by Lukas
While most people focused on Grok, there was another model release that got uniformly high praise: Kimi K2 from Moonshot.ai. It's definitely a good ...
“Grok 4 Various Things” by Zvi
15 Jul 2025
Contributed by Lukas
Yesterday I covered a few rather important Grok incidents. Today is all about Grok 4's capabilities and features. Is it a good model, sir? It's not...
“Worse Than MechaHitler” by Zvi
14 Jul 2025
Contributed by Lukas
Grok 4, which has excellent benchmarks and which xAI claims is ‘the world's smartest artificial intelligence,’ is the big news. If you set aside...
“OpenAI Model Differentiation 101” by Zvi
11 Jul 2025
Contributed by Lukas
LLMs can be deeply confusing. Thanks to a commission, today we go back to basics. How did we get such a wide array of confusingly named and labeled ...
“AI #124: Grokless Interlude” by Zvi
10 Jul 2025
Contributed by Lukas
Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4. There are some impressive ...
“No, Grok, No” by Zvi
09 Jul 2025
Contributed by Lukas
It was the July 4 weekend. Grok on Twitter got some sort of upgrade. Elon Musk: We have improved @Grok significantly. You should notice a differenc...
“Balsa Update: Springtime in DC” by Zvi
08 Jul 2025
Contributed by Lukas
Today's post is an update from my contractor at Balsa Research, Jennifer Chen. I offer guidance and make strategic choices, but she's the one who make...
“On Alpha School” by Zvi
07 Jul 2025
Contributed by Lukas
The epic 18k word writeup on Austin's flagship Alpha School is excellent. It is long, but given the blog you’re reading now, if you have interest i...
“Housing Roundup #12” by Zvi
04 Jul 2025
Contributed by Lukas
Abundance and YIMBY are on the march. Things are looking good. The wins are each small, but every little bit helps. There are lots of different littl...
“AI #123: Moratorium Moratorium” by Zvi
03 Jul 2025
Contributed by Lukas
The big AI story this week was the battle over the insane AI regulatory moratorium, which came dangerously close to passing. Ultimately, after Senato...
“Congress Asks Better Questions” by Zvi
02 Jul 2025
Contributed by Lukas
Back in May I did a dramatization of a key and highly painful Senate hearing. Now, we are back for a House committee meeting. It was entitled ‘Auth...
“AI Moratorium Stripped From BBB” by Zvi
01 Jul 2025
Contributed by Lukas
The insane attempted AI moratorium has been stripped from the BBB. That doesn’t mean they won’t try again, but we are good for now. We should use...
“Substack and Other Blog Recommendations” by Zvi
30 Jun 2025
Contributed by Lukas
Substack recommendations are remarkably important, and the actual best reason to write here instead of elsewhere. As in, even though I have never ma...
“Childhood and Education #11: The Art of Learning” by Zvi
27 Jun 2025
Contributed by Lukas
In honor of the latest (always deeply, deeply unpopular) attempts to destroy tracking and gifted and talented programs, and other attempts to get chil...
“AI #122: Paying The Market Price” by Zvi
26 Jun 2025
Contributed by Lukas
If you are Meta, and you want to attract top AI talent, you have a problem, because no one wants to work for you or on your products. So it is going ...
“Love Island USA Season 7 Episode 20: What Could The Producers Be Thinking” by Zvi
26 Jun 2025
Contributed by Lukas
(Note: This is NOT being posted on my Substack or Wordpress, but I do want a record of it that is timestamped and accessible for various reasons, so ...
“Analyzing A Critique Of The AI 2027 Timeline Forecasts” by Zvi
24 Jun 2025
Contributed by Lukas
There was what everyone agrees was a high quality critique of the timelines component of AI 2027, by the LessWrong user and Substack writer Titotal. ...
“Childhood and Education #10: Behaviors” by Zvi
23 Jun 2025
Contributed by Lukas
Edition #9, that School is Hell, turned out to hit quite the nerve. Thus, I’m going to continue with the system of making the roundups have more f...
“AI #121 Part 2: The OpenAI Files” by Zvi
20 Jun 2025
Contributed by Lukas
You can find Part 1 here. This resumes the weekly, already in progress. The primary focus here is on the future, including policy and alignment, but ...
“AI #121 Part 1: New Connections” by Zvi
19 Jun 2025
Contributed by Lukas
That's right. I said Part 1. The acceleration continues. I do not intend to let this be a regular thing. I will (once again!) be raising the bar for...
“Gemini 2.5 Pro: From 0506 to 0605” by Zvi
18 Jun 2025
Contributed by Lukas
Google recently came out with Gemini-2.5-0605, to replace Gemini-2.5-0506, because I mean at this point it has to be the companies intentionally fuck...
“o3 Turns Pro” by Zvi
17 Jun 2025
Contributed by Lukas
You can now have o3 throw vastly more compute at a given problem. That's o3-pro. Should you have o3 throw vastly more compute at a given problem, if...
“RTFB: The RAISE Act” by Zvi
16 Jun 2025
Contributed by Lukas
The RAISE Act has overwhelmingly passed the New York Assembly (95-1 among Democrats and 24-21 among Republicans) and New York Senate (37-1 among Democ...
“Monthly Roundup #31: June 2025” by Zvi
13 Jun 2025
Contributed by Lukas
It's always a nice break to see what else is going on out there. Bad News Study finds sleep in male full-time workers falls as income rises, wi...
“AI #120: While o3 Turned Pro” by Zvi
12 Jun 2025
Contributed by Lukas
This week we got o3-Pro. As is my custom, I’m going to wait a bit so we can gather more information, especially this time since it runs so slowly. ...
“The Dream of a Gentle Singularity” by Zvi
11 Jun 2025
Contributed by Lukas
Thanks For the Memos Sam Altman offers us a new essay, The Gentle Singularity. It's short (if a little long to quote in full), so given you read my p...
“Give Me a Reason(ing Model)” by Zvi
10 Jun 2025
Contributed by Lukas
Are we doing this again? It looks like we are doing this again. This time it involves giving LLMs several ‘new’ tasks including effectively a Tow...
“Dwarkesh Patel on Continual Learning” by Zvi
09 Jun 2025
Contributed by Lukas
A key question going forward is the extent to which making further AI progress will depend upon some form of continual learning. Dwarkesh Patel offer...
“DeepSeek-r1-0528 Did Not Have a Moment” by Zvi
06 Jun 2025
Contributed by Lukas
When r1 was released in January 2025, there was a DeepSeek moment. When r1-0528 was released in May 2025, there was no moment. Very little talk. He...
“AI #119: Goodbye AISI?” by Zvi
05 Jun 2025
Contributed by Lukas
AISI is being rebranded highly non-confusingly as CAISI. Is it the end of AISI and a huge disaster, or a tactical renaming to calm certain people dow...
“Dating Roundup #6” by Zvi
04 Jun 2025
Contributed by Lukas
Previously: #1, #2, #3, #4, #5 Dating Roundup #4 covered dating apps. Roundup #5 covered opening without them. Dating Roundup #6 covers everything ...
“In Which I Make the Mistake of Fully Covering an Episode of the All-In Podcast” by Zvi
03 Jun 2025
Contributed by Lukas
I have been forced recently to cover many statements by US AI Czar David Sacks. Here I will do so again, for the third time in a month. I would much...
“Letting Kids Be Kids” by Zvi
30 May 2025
Contributed by Lukas
Letting kids be kids seems more and more important to me over time. Our safetyism and paranoia about children is catastrophic on way more levels than...
“AI #118: Claude Ascendant” by Zvi
29 May 2025
Contributed by Lukas
The big news of this week was of course the release of Claude 4 Opus. I offered two review posts: One on safety and alignment, and one on mundane uti...
“Fun With Veo 3 and Media Generation” by Zvi
28 May 2025
Contributed by Lukas
Since Claude 4 Opus things have been refreshingly quiet. Video break! The First Good AI Videos First up we have Prompt Theory, made with Veo 3,...
“Dating Roundup #5: Opening Day” by Zvi
27 May 2025
Contributed by Lukas
Previously: #1, #2, #3, #4. Since we all know that dating apps are terrible, the wise person seeks to meet prospective dates in other ways, ideally ...
“Claude 4 You: The Quest for Mundane Utility” by Zvi
26 May 2025
Contributed by Lukas
How good are Claude Opus 4 and Claude Sonnet 4? They’re good models, sir. If you don’t care about price or speed, Opus is probably the best mod...
“Claude 4 You: Safety and Alignment” by Zvi
25 May 2025
Contributed by Lukas
Unlike everyone else, Anthropic actually Does (Some of) the Research. That means they report all the insane behaviors you can potentially get their m...
“AI #117: OpenAI Buys Device Maker IO” by Zvi
22 May 2025
Contributed by Lukas
What a week, huh? America signed a truly gigantic chip sales agreement with UAE and KSA that could be anything from reasonable to civilizational suic...
“Google I/O Day” by Zvi
21 May 2025
Contributed by Lukas
What did Google announce on I/O day? Quite a lot of things. Many of them were genuinely impressive. Google is secretly killing it on the actual techn...
“The Codex of Ultimate Vibing” by Zvi
20 May 2025
Contributed by Lukas
While we wait for wisdom, OpenAI releases a research preview of a new software engineering agent called Codex, because they previously released a lig...
“America Makes AI Chip Diffusion Deal with UAE and KSA” by Zvi
19 May 2025
Contributed by Lukas
Our government, having withdrawn the new diffusion rules, has now announced an agreement to sell massive numbers of highly advanced AI chips to UAE a...
“Regarding South Africa” by Zvi
16 May 2025
Contributed by Lukas
The system prompt being modified by an unauthorized person in pursuit of a ham-fisted political point very important to Elon Musk once already doesn’...
“AI #116: If Anyone Builds It, Everyone Dies” by Zvi
15 May 2025
Contributed by Lukas
If Anyone Builds It, Everyone Dies is the title of the new book coming September 16 from Eliezer Yudkowsky and Nate Soares. The ‘it’ in question i...
“Fighting Obvious Nonsense About AI Diffusion” by Zvi
14 May 2025
Contributed by Lukas
Our government is determined to lose the AI race in the name of winning the AI race. The least we can do, if prioritizing winning the race, is to tr...
“Monthly Roundup #30: May 2025” by Zvi
13 May 2025
Contributed by Lukas
I hear word a bunch of new frontier AI models are coming soon, so let's do this now. Table of Contents Programming Environments Require Magic...
“A Live Look at the Senate AI Hearing” by Zvi
12 May 2025
Contributed by Lukas
Today's post will be a little different. This past week, Sam Altman and others testified at a US Senate hearing on AI competitiveness. He...
“Cheaters Gonna Cheat Cheat Cheat Cheat Cheat” by Zvi
09 May 2025
Contributed by Lukas
Cheaters. Kids these days, everyone says, are all a bunch of blatant cheaters via AI. Then again, look at the game we are forcing them to play, and h...
“AI #115: The Evil Applications Division” by Zvi
08 May 2025
Contributed by Lukas
It can be bleak out there, but the candor is very helpful, and you occasionally get a win. Zuckerberg is helpfully saying all his dystopian AI vision...
“OpenAI Claims Nonprofit Will Retain Nominal Control” by Zvi
07 May 2025
Contributed by Lukas
Your voice has been heard. OpenAI has ‘heard from the Attorney Generals’ of Delaware and California, and as a result the OpenAI nonprofit will re...
“Zuckerberg’s Dystopian AI Vision” by Zvi
06 May 2025
Contributed by Lukas
You think it's bad now? Oh, you have no idea. In his talks with Ben Thompson and Dwarkesh Patel, Zuckerberg lays out his vision for our AI future. I ...
“GPT-4o Sycophancy Post Mortem” by Zvi
05 May 2025
Contributed by Lukas
Last week I covered that GPT-4o was briefly an (even more than usually) absurd sycophant, and how OpenAI responded to that. Their explanation at tha...
“OpenAI Preparedness Framework 2.0” by Zvi
02 May 2025
Contributed by Lukas
Right before releasing o3, OpenAI updated its Preparedness Framework to 2.0. I previously wrote an analysis of the Preparedness Framework 1.0. I stil...
“AI #114: Liars, Sycophants and Cheaters” by Zvi
01 May 2025
Contributed by Lukas
Gemini 2.5 Pro is sitting in the corner, sulking. It's not a liar, a sycophant or a cheater. It does excellent deep research reports. So why does it h...
“GPT-4o Responds to Negative Feedback” by Zvi
30 Apr 2025
Contributed by Lukas
Whoops. Sorry everyone. Rolling back to a previous version. Here's where we are at this point, now that GPT-4o is no longer an absurd sycophan...
“Dating Roundup #4: An App for That” by Zvi
29 Apr 2025
Contributed by Lukas
Previously: #1, #2, #3. As time goes by, the fundamental things in life are still the same, and yet they change quite a lot with the times. But they ...
“GPT-4o Is An Absurd Sycophant” by Zvi
28 Apr 2025
Contributed by Lukas
GPT-4o tells you what it thinks you want to hear. The results of this were rather ugly. You get extreme sycophancy. Absurd praise. Mystical experien...
“Worries About AI Are Usually Complements Not Substitutes” by Zvi
25 Apr 2025
Contributed by Lukas
A common claim is that concern about [X] ‘distracts’ from concern about [Y]. This is often used as an attack to cause people to discard [X] conce...
“AI #113: The o3 Era Begins” by Zvi
24 Apr 2025
Contributed by Lukas
Enjoy it while it lasts. The Claude 4 era, or the o4 era, or both, are coming soon. Also, welcome to 2025, we measure eras in weeks or at most months...
“o3 Is a Lying Liar” by Zvi
23 Apr 2025
Contributed by Lukas
I love o3. I’m using it for most of my queries now. But that damn model is a lying liar. Who lies. This post covers that fact, and so...
“You Better Mechanize” by Zvi
22 Apr 2025
Contributed by Lukas
Or you had better not. The question is which one. This post covers the announcement of Mechanize, the skeptical response from those worried AI might ...
“Crime and Punishment #1” by Zvi
21 Apr 2025
Contributed by Lukas
This seemed like a good next topic to spin off from monthlies and make into its own occasional series. There's certainly a lot to discuss regarding c...
“o3 Will Use Its Tools For You” by Zvi
18 Apr 2025
Contributed by Lukas
OpenAI has finally introduced us to the full o3 along with o4-mini. Greg Brockman (OpenAI): Just released o3 and o4-mini! These models feel incredibly...
“AI #112: Release the Everything” by Zvi
17 Apr 2025
Contributed by Lukas
OpenAI has upgraded its entire suite of models. By all reports, they are back in the game for more than images. GPT-4.1 and especially GPT-4.1-mini a...
“GPT-4.1 Is a Mini Upgrade” by Zvi
16 Apr 2025
Contributed by Lukas
Yesterday's news alert, nevertheless: The verdict is in. GPT-4.1-Mini in particular is an excellent practical model, offering strong performance at a...
“OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing” by Zvi
15 Apr 2025
Contributed by Lukas
Three big OpenAI news items this week were the FT article describing the cutting of corners on safety testing, the OpenAI former employee amicus brief...
“Monthly Roundup #29: April 2025” by Zvi
14 Apr 2025
Contributed by Lukas
In Monthly Roundup #28 I made clear I intend to leave the Trump administration out of my monthly roundups, for both better and worse, outside of my f...
“On Google’s Safety Plan” by Zvi
11 Apr 2025
Contributed by Lukas
Google Lays Out Its Safety Plans I want to start off by reiterating kudos to Google for actually laying out its safety plan. No matter how good th...