
Ryan Greenblatt

Speaker
243 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

However, it seems possible that most people at GDM are actually using Anthropic models as part of a compute deal, which could make their speedup substantially larger.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

While the serial engineering speedup is 1.6x, the overall speedup to AI progress is much smaller, more like 1.15x or 1.2x.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

This is because engineering is only a subset of the relevant labor, labor is only one input to algorithmic progress (compute for experiments is another), and algorithmic progress itself is only one component of overall AI progress, though probably the majority, perhaps around 60% or 80%. Scaling up training compute and spending more on data also contribute.
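The gap between a 1.6x serial engineering speedup and a 1.15x to 1.2x overall speedup can be illustrated with an Amdahl's-law style calculation. This is only a sketch under a strong simplifying assumption that is not from the source: some fixed fraction of overall progress is sped up by 1.6x and the rest is not sped up at all. The function names and the single-fraction model are hypothetical.

```python
def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl-style overall speedup when only part of the work gets faster.

    Assumption (not from the source): progress splits cleanly into a fraction
    that is accelerated by `local_speedup` and a remainder that is unchanged.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / local_speedup)


def implied_fraction(overall: float, local_speedup: float) -> float:
    """Invert the formula: the accelerated fraction that yields `overall`."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / local_speedup)


if __name__ == "__main__":
    # 1.6x is the serial engineering speedup quoted in the text;
    # 1.15x and 1.2x are the quoted overall speedups.
    for target in (1.15, 1.2):
        fraction = implied_fraction(target, 1.6)
        print(f"overall {target}x implies about {fraction:.0%} of progress accelerated")
```

Under this toy model, an overall speedup well below 1.6x simply means that only a minority of what drives progress is being accelerated, which matches the qualitative argument in the quote.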

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

AIs are able to automate increasingly large and difficult tasks.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

The old METR time horizon benchmark has mostly saturated when it comes to measuring the 50% reliability time horizon (that is, scores are high enough that this measurement is unreliable), but at 80% reliability the best publicly deployed models are at a bit over an hour, while I expect the best internal models are reaching a bit below two hours.
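To see why the same model has a shorter time horizon at 80% reliability than at 50%, here is a minimal sketch. It assumes success probability follows a logistic curve in the log of task length; the source does not describe the fitting procedure, so treat this as an illustrative assumption. The 50% horizon of 480 minutes and the slope of 0.6 are made-up values, not measurements.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def success_probability(task_minutes: float, horizon_50_minutes: float, slope: float) -> float:
    """Assumed logistic success curve in log2(task length).

    Equals 0.5 when task_minutes == horizon_50_minutes and falls as tasks get longer.
    """
    x = slope * (math.log2(horizon_50_minutes) - math.log2(task_minutes))
    return 1.0 / (1.0 + math.exp(-x))


def horizon_at(reliability: float, horizon_50_minutes: float, slope: float) -> float:
    """Task length (minutes) at which the assumed curve predicts `reliability`."""
    return horizon_50_minutes * 2.0 ** (-logit(reliability) / slope)


if __name__ == "__main__":
    H50, SLOPE = 480.0, 0.6  # hypothetical parameters for illustration only
    for reliability in (0.5, 0.8):
        minutes = horizon_at(reliability, H50, SLOPE)
        print(f"{reliability:.0%} reliability horizon: {minutes:.0f} minutes")
```

A shallower slope widens the gap between the two horizons, which is one way a benchmark can look saturated at 50% while still being informative at 80%.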

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

I expect that increasingly this 80% reliability score is dominated by relatively niche tasks that don't centrally reflect automating software engineering or AI R&D.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Further, the time horizon measurement is increasingly sensitive to the task distribution.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

On tasks that are easy and cheap to verify, AIs can often complete difficult tasks that would take the best human experts many months and in some cases years.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

This requires somewhat custom scaffolding and large amounts of inference compute (though still much less than the human cost for the same task), and relies on the AIs being able to just keep making forward progress and checking whether they've succeeded.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Even though AIs make big errors during this process and sometimes end up severely mistaken about what's going on, they can recover by just seeing what isn't working and looking into this.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

When they fail to complete tasks, this is often because the task requires addition or legitimately very complex methods that are hard to build in an incremental and sloppy way.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

The more the task is just a relatively straightforward but extremely large engineering project, the better AIs do.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Often, they also fail just by not trying hard enough or giving up on something they shouldn't give up on.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Because current RL isn't very well targeted towards getting AIs to operate effectively in these massive inference compute scaffolds, AIs have somewhat degenerate tendencies in these scaffolds. For example, they get into attractor states where they become convinced of some false belief (for example, that something isn't possible), and they are bad at delegating to sub-agents (for instance, giving overly specific instructions based on guessing from limited context rather than letting the sub-agent figure things out, or assuming context the sub-agent doesn't have).

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Reward hacking and similar tendencies caused by bad RL incentives (for example, agents giving up on some tasks they were assigned and making up some excuse for why it isn't feasible) amplify these issues. Reward hacks often get fixed by having agents iteratively inspect the work, but sometimes they persist, with all the agents claiming the reward hack is okay or can't be removed even though they know at some level that it's cheating or unintended.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Adding a human to the loop, even a human with minimal context, can help substantially by noticing and correcting some of these issues, as well as by making it easier to apply more inference compute without needing more infrastructure scaffolding, for example by doing multiple runs in parallel and picking the best one or picking the one that didn't reward hack.
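The "multiple runs in parallel, pick the best one or the one that didn't reward hack" step can be sketched as a simple selection rule. Everything here is hypothetical: the `RunResult` fields, the idea that each run carries a numeric score, and the assumption that a reviewer (human or otherwise) has already flagged reward hacking. The source does not specify how runs are scored or flagged.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RunResult:
    """One parallel attempt at a task (hypothetical record layout)."""
    run_id: str
    score: float               # assumed quality score, higher is better
    flagged_reward_hack: bool  # assumed to be set by a reviewer's inspection


def pick_best_run(runs: Sequence[RunResult]) -> Optional[RunResult]:
    """Return the highest-scoring run that was not flagged, or None if all were flagged."""
    clean_runs = [run for run in runs if not run.flagged_reward_hack]
    if not clean_runs:
        return None
    return max(clean_runs, key=lambda run: run.score)


if __name__ == "__main__":
    runs = [
        RunResult("a", score=0.95, flagged_reward_hack=True),   # top score, but cheated
        RunResult("b", score=0.80, flagged_reward_hack=False),
        RunResult("c", score=0.60, flagged_reward_hack=False),
    ]
    best = pick_best_run(runs)
    print(best.run_id if best else "no clean run")
```

The point of the sketch is that the flag is applied before the score is compared, so a run that games the score cannot win on score alone.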

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

Relative to benchmarks and easy and cheap-to-verify tasks, AIs do worse on randomly sampled engineering tasks from within AI companies.

LessWrong (Curated & Popular)
"My picture of the present in AI" by ryan_greenblatt

This is especially true if we weight by value or undo a recent shift towards doing more work that AIs are especially good at.