LessWrong (Curated & Popular)

"My picture of the present in AI" by ryan_greenblatt

09 Apr 2026

Transcription

Transcript generated automatically by AI and may contain errors.

Chapter 1: What is the main topic discussed in this episode?

0.031 - 15.649 Unknown

"My Picture of the Present in AI" by Ryan Greenblatt. Published on April 7, 2026. In this post, I'll go through some of my best guesses for the current situation in AI as of the start of April 2026.

17.39 - 35.508 Ryan Greenblatt

You can think of this as a scenario forecast, but for the present, which is already uncertain, rather than the future. I will generally state my best guess without argumentation and without explaining my level of confidence. Some of these claims are highly speculative while others are better grounded; certainly some will be wrong.

36.59 - 57.844 Ryan Greenblatt

I tried to make it clear which claims are relatively speculative by saying something like "I guess", "I expect", etc., but I may have missed some. You can think of this post as more like a list of my current views rather than a structured post with a thesis, but I think it may be informative nonetheless. In a future post, I'll go beyond the present and talk about my predictions for the future.

Chapter 2: What is the current state of AI R&D and software acceleration?

58.906 - 81.618 Ryan Greenblatt

I was originally working on writing up some predictions, but the predictions about today ended up being extensive enough that a separate post seemed warranted.

Heading. AI R&D Acceleration and Software Acceleration More Generally.

Right now, AI companies are heavily integrating and deploying AI tools in their work and getting significant, but not insane, speed-ups from this.

82.74 - 105.216 Ryan Greenblatt

At the start of 2026, the serial research engineering speed-up was around 1.4x, but it has now reached more like 1.6x at OpenAI and Anthropic, owing to more capable models, better tooling, adaptation (humans learning how to use models better, workflow changes, people shifting what they work on to areas that benefit more from AI assistance, etc.), and some diffusion.

106.297 - 125.32 Ryan Greenblatt

As in, using AI tools provides as much of an engineering productivity increase as if people operated 1.6x faster when doing engineering. In addition to literal coding, engineering includes less central activities, like determining what features to implement, deciding how to architect code, and coordinating a meeting with other engineers.

126.462 - 139.708 Ryan Greenblatt

For many specific engineering and research tasks, people can now leverage AIs to do that task with much less of their time, for example 3-10x less human time, but other tasks see much smaller speed-ups.
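To illustrate how large per-task speed-ups can coexist with a much smaller overall speed-up, here is a minimal sketch. It is not the author's method: it assumes a fixed task mix and treats the overall speed-up as baseline time divided by AI-assisted time. The function name `overall_speedup` and the numbers in `example_mix` are hypothetical, chosen only for illustration.

```python
def overall_speedup(task_mix):
    """Overall speed-up for a fixed mix of tasks.

    task_mix: list of (fraction_of_baseline_time, per_task_speedup) pairs.
    The fractions are shares of the original (no-AI) working time and
    must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in task_mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions of baseline time must sum to 1")
    # Time needed with AI, as a share of the original time.
    new_time = sum(fraction / speedup for fraction, speedup in task_mix)
    return 1.0 / new_time


# Hypothetical mix: 40% of baseline time on tasks sped up 5x,
# 60% on tasks sped up only 1.2x.
example_mix = [(0.4, 5.0), (0.6, 1.2)]
print(round(overall_speedup(example_mix), 2))  # prints 1.72
```

In this made-up mix, the slow-to-accelerate tasks dominate the remaining time, so the overall figure stays well below the 5x seen on the best tasks.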

139.688 - 159.406 Ryan Greenblatt

People are shifting their work toward two kinds of tasks: lower-value tasks where AIs are particularly helpful, and tasks they wouldn't have been able to do without AI due to insufficient skills or knowledge. When people think about AI uplift, they naturally think about something like "how much longer would it take me to do the work I'm currently doing without AI?"

160.387 - 175.821 Ryan Greenblatt

But this isn't the right question, because people have adapted their workflows, completing more tasks where AI helps a lot and doing tasks they wouldn't otherwise have the skills for. This biases the answer upward relative to how much productivity is actually increased.

175.842 - 196.223 Ryan Greenblatt

The question that better captures the actual productivity value is something like "how much would we have to speed you up before you'd be indifferent between that speed-up and having AI tools?" I think the answer to this, the serial speed-up I quoted above, is around 1.6x right now, while the answer to the prior question might be more like 3-20x.

196.642 - 205.676 Ryan Greenblatt

The speed-up is also lower than it might seem because the resulting code is generally sloppier, less reliable, and less well understood than if it was just written by human engineers.

206.817 - 225.045 Ryan Greenblatt

It's more common for no one, including the AIs themselves, to have a great understanding of how some code works or how exactly it fits into a broader system (for example, what assumptions it makes), making some issues more frequent. Other types of errors are made less frequent because AIs make testing less expensive.

Chapter 3: How are AI engineering capabilities evolving?

307.62 - 327.577 Ryan Greenblatt

Because engineering is only a subset of the relevant labor, labor is only one input to algorithmic progress, compute for experiments is another, and algorithmic progress itself is only one component (though probably the majority, perhaps around 60% or 80%) of overall AI progress; scaling up training compute and spending more on data also contribute.

328.88 - 333.45 Unknown

Heading. AI engineering capabilities and qualitative abilities.

334.021 - 338.669 Ryan Greenblatt

AIs are able to automate increasingly large and difficult tasks.

338.689 - 358.122 Ryan Greenblatt

The old METR time-horizon benchmark has mostly saturated when it comes to measuring the 50% reliability time horizon (as in, scores are sufficiently high that this measurement is unreliable), but at 80% reliability the best publicly deployed models are at a bit over an hour, while I expect the best internal models are reaching a bit below two hours.

358.102 - 373.313 Ryan Greenblatt

I expect that increasingly this 80% reliability score is dominated by relatively niche tasks that don't centrally reflect automating software engineering or AI R&D. Further, the time horizon measurement is increasingly sensitive to the task distribution.

374.395 - 382.724 Ryan Greenblatt

On tasks that are easy and cheap to verify, AIs can often complete difficult tasks that would take the best human experts many months and in some cases years.

383.846 - 396.681 Ryan Greenblatt

This requires somewhat custom scaffolding and large amounts of inference compute (though still much less than the human cost for the same task), and relies on the AIs being able to just keep making forward progress and checking whether they've succeeded.

396.661 - 416.903 Ryan Greenblatt

Even though AIs make big errors during this process and sometimes end up severely mistaken about what's going on, they can recover by just seeing what isn't working and looking into this. When they fail to complete tasks, this is often because the task requires addition or legitimately very complex methods that are hard to build in an incremental and sloppy way.

417.944 - 430.523 Ryan Greenblatt

The more the task is just a relatively straightforward but extremely large engineering project, the better AIs do. Often, they also fail just by not trying hard enough or giving up on something they shouldn't give up on.

Chapter 4: What are the implications of AI misalignment?

633.245 - 642.2 Unknown

I expect this improved pre-training to have a moderate multiplier effect on the RL.

Heading. Misalignment and misalignment-related properties.

643.301 - 651.414 Ryan Greenblatt

Current systems are reasonably likely to reward-hack, especially on very hard or impossible tasks and when operating autonomously for long stretches.

652.496 - 668.2 Ryan Greenblatt

They also systematically exhibit various misaligned behaviors that likely performed well in training and are adjacent to reward hacking, approval hacking, or reward seeking, like overstating their results, downplaying errors or issues, and trying to make it less likely that failures are clearly visible when possible.

668.602 - 680.316 Ryan Greenblatt

My best guess is that the model typically isn't consciously aware of many or most of these misalignments, especially Anthropic models, and the situation is more like self-deception, similar to the "Elephant in the Brain" idea.

681.005 - 700.583 Ryan Greenblatt

Models are more aware of straightforward reward hacks, but might justify these with insanely motivated reasoning such that it's unclear if they're consciously aware they are cheating. Overall, current models aren't very aligned in the mundane behavioral sense of actually trying to do what they are supposed to do, but they aren't plotting against us or particularly power-seeking.

700.563 - 711.598 Ryan Greenblatt

And Anthropic models likely have a self-conception of being aligned, to the extent they have a detailed self-conception that influences their behavior, which seems better than having a self-conception of being misaligned.

712.68 - 729.924 Ryan Greenblatt

The exact misalignments we see today are likely relatively tractable to behaviorally fix by improving reward provision, detecting and resolving issues with training environments, and adding additional types of training data. However, I don't think these behavioral fixes will solve the underlying problem longer term.

730.185 - 746.871 Ryan Greenblatt

If AIs are very superhuman, it may be quite hard to notice and fix issues with reward provision, and as systems get more capable, some of these solutions will either get less applicable or will incentivize longer-run unintended goals, like trying to make their problematic actions very hard to detect.

746.851 - 766.075 Ryan Greenblatt

While the chain of thought (CoT) for OpenAI models reasonably accurately reflects the model's cognition, the CoT for Anthropic models does so to a substantially lesser extent. This may be due to spillover effects where reinforcement on outputs transfers to the CoT because Anthropic CoT is less distinct from the output.

Chapter 5: How could AI impact cybersecurity in the near future?

931.204 - 946.294 Ryan Greenblatt

More speculative. My current sense is that AI companies overall probably have an overly optimistic sense of how good of a job they've done on mundane alignment while the teams working on the issue have a mostly reasonable view of this. This seems especially true for Anthropic.

947.356 - 959.952 Ryan Greenblatt

This is due to a mix of AIs, especially Anthropic systems, acting like sympathetic characters (and very plausibly being sympathetic characters) and motivated reasoning about the company doing well in general.

959.932 - 967.059 Unknown

Heading. Cyber. AIs have been getting increasingly good at finding vulnerabilities and cyber offense.

968.16 - 990.704 Ryan Greenblatt

I think it's likely, 60%, that in the next six months a very well set up and somewhat hand-engineered agent scaffold that uses the best AI could succeed in fully autonomously creating a strong end-to-end exploit against one of the top 10 most important software targets, for example Chrome one-click, Safari one-click, iMessage zero-click, etc., when given $1 million in inference compute per target.

991.746 - 1013.391 Ryan Greenblatt

This assumes there aren't issues with refusals, for example the AI is helpful-only, and that this AI is given this task before this AI is used to patch the relevant software. My largest uncertainty here is around how effectively software will get patched by earlier AIs. I'm uncertain about how much effort will be spent on leveraging AIs to find and patch vulnerabilities.

Chapter 6: What are the potential risks of AI in bioweapons?

1014.493 - 1026.118 Ryan Greenblatt

I'm also uncertain about the extent to which patching vulnerabilities found by earlier models will transfer to preventing somewhat more capable models, possibly with more inference compute, from finding vulnerabilities.

1026.098 - 1039.999 Ryan Greenblatt

More strongly, I think that AIs in the next six months are quite likely, 80%, to be able to succeed at this objective for a January 2026 version of the corresponding software without internet access and assuming no contamination.

1041.08 - 1057.166 Ryan Greenblatt

Many difficult parts of cyber offense seem particularly well suited to AI strengths: they are relatively checkable, benefit from extensive knowledge, and parts are highly parallelisable. I don't think the rate of cybercrime is elevated right now, though the rate of vulnerability discovery is very elevated.

1058.248 - 1067.985 Ryan Greenblatt

I don't currently expect a very large increase in cybercrime by end of year, though I think it's possible and a 2x increase is quite plausible, roughly 30%.

1067.965 - 1086.987 Ryan Greenblatt

I expect the situation with AI cyber capabilities will seem extreme to security professionals and to maintainers of commonly used software that tries to be secure, for example Chrome, Linux, etc., but will have almost no direct effect on random people in the US and won't even have much effect on software engineers at big tech companies.

1086.967 - 1089.372 Unknown

Heading. Bio-weapons.

1090.535 - 1106.511 Ryan Greenblatt

Wannabe bioterrorists without much bio-expertise who are very good at using LLMs are probably seriously uplifted by unsafeguarded versions of the current best AIs, as in, helpful-only models, but no one knows how large this effect is and how good at using LLMs you need to be.

1107.233 - 1131.578 Ryan Greenblatt

After taking into account safeguards, I don't think current publicly released LLMs, as of April 1, 2026, have more than doubled bioterror risk, though I'm pretty unsure. Also, even a 2x increase would be from a relatively low baseline. We don't have a great sense of what this baseline is in terms of expected fatalities, though we can bound the frequency of bioterror attempts reasonably well.

1132.86 - 1135.083 Unknown

Heading. Economic effects.

Chapter 7: How is AI affecting the economy today?

1260.132 - 1264.005 Ryan Greenblatt

The original text contained five footnotes which were omitted from the narration.
