Welcome to the Future of Life Institute podcast.
This episode is a cross-post from the Cognitive Revolution podcast featuring Nathan Labenz interviewing Ryan Kidd. Ryan is the co-executive director of MATS, which is one of the largest AI safety research talent pipelines in the world. Please enjoy.
Ryan Kidd, co-executive director at MATS, welcome to the Cognitive Revolution. Thanks so much.
I'm glad to be here.
I'm excited for this conversation. So I've mentioned a couple of times that I've been a personal donor to MATS.
I think it's actually the first time we've ever met and spoken, but your reputation certainly precedes you. And I've seen a lot of great work come out of the program and a lot of great reviews. And in my research as part of the Survival and Flourishing Fund recommender group, I also got a lot of great commentary on the importance of MATS as a talent pipeline into the AI safety research field.
So, big supporter of your work from afar, and I appreciate the fact that you guys have come on as a sponsor of the podcast recently as well. This conversation is not technically a part of that deal, but we're in one of these OpenAI-style circular-flow-of-funds sorts of things where we're somehow both inflating one another's revenue.
I like to think we didn't buy our way onto the podcast. Yeah.
No, the enthusiasm definitely is real, because I've heard so many great things over time. So, excited to get into this. I thought we would maybe just start with kind of the big picture from your perspective. And I think, you know, having watched some of your previous talks, I know that you play sort of a portfolio strategy, where you're not saying,
"I have a very specific, narrow prediction, and I'm trying to maximize the value of this organization, this program, for that very hyper-specific prediction." It seems like you're more saying, well, there's a lot of uncertainty out there in the space, and we're going to try to be valuable across a range of those scenarios as much as we can be.
With that said, you can kind of speak on behalf of yourself or on behalf of mentors or the community as a whole. Where are you guys right now? Where are we in terms of timelines, so to speak? And how has your strategy evolved over the last year or so as we've gained more information on where we are relative to the singularity?
Yeah, okay. So I don't like to have opinions here, or I don't like to have opinions very loudly. And the reason for that, I think, is because, as you say, we are somewhat like a hedge fund or something, or maybe an index fund, right?
More like the latter, which is to say we have a broad portfolio, we adopt a bunch of different theories of change as valid, and we try to, you know, have our thumb in a hundred pies.
So I would say, in terms of MATS's institutional opinion on this, we definitely tend to go with things like Metaculus, prediction markets, and the Forecasting Research Institute, FRI, and their predictions and so on.
So the current Metaculus prediction for strong AGI, I think it's called: I think you can ignore most of the requirements of the test and just look at one of them, the two-hour adversarial Turing test. That's predicted somewhere around mid-2033. Okay, so I think that is probably the best bet we have for when AGI of that nature occurs.
Now, just two or three days ago, the AI Futures Project dropped a new report, which two MATS fellows worked on: one is a lead author, one is a contributing author. So, very excited about that. That was an update to their model, and I think they predicted something between 2030 and 2032, depending upon how you define AGI.
They broke it down into all these milestones: automated coders that can do all the coding stuff, top-expert-dominating AI across all these fields, and so on.
So I think, I don't know, somewhere around 2033 seems like a decent bet. But also, you know, Nathan Young recently compiled all these different forecasting platforms.
So Metaculus, Manifold, another Metaculus poll that was for weak AGI, which is a bit less demanding on the Turing test, and all these others, I don't know, some Kalshi market on when OpenAI will achieve it. And he came out with an average of 2030. Now, I don't know, I still like the Metaculus 2033, but I wouldn't bet against 2030 in terms of nearness of AGI.
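To make that kind of cross-platform averaging concrete, here is a minimal sketch in Python; the individual platform figures below are illustrative placeholders, not numbers quoted in the episode.

```python
# Minimal sketch: average several platforms' central AGI-year forecasts
# into a single headline number, as in the aggregation described above.
# All values are illustrative assumptions, not quotes from the episode.
forecasts = {
    "Metaculus (strong AGI)": 2033,
    "Metaculus (weak AGI)": 2028,
    "Manifold": 2030,
    "Kalshi-style market": 2029,
}

average_year = sum(forecasts.values()) / len(forecasts)
print(f"Simple average across platforms: {average_year:.0f}")  # ~2030 with these inputs
```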
As for superintelligence, it's complicated, right? Could be six months or less. Could be a very hard takeoff after this AGI thing, if it's like a very software-only singularity scenario where you don't need a big hardware scale-up, you aren't limited by compute, it's just recursive self-improvement or something, algorithmic improvement, AIs improving the algorithms that train AIs, and it's like, well, that's a fast feedback loop, right?
Or you might need a lot more experimentation, right? You might need massive hardware scale-ups. You might need just staggeringly more compute than exists in the world, in which case that could take you a decade to get your singularity.
I currently think that 2033 is a decent central estimate, in terms of the median, for what we're preparing for. But obviously, 20% chance by 2028, I think that's the Metaculus prediction. That's a lot, right? So we should definitely be considering scenarios that are sooner, right? In particular, I think the sooner AGI happens, the more dangerous it might be, right?
The less time we have to do critical technical research to prepare, the less time we have to implement policy solutions. And, I don't know, if it's happening during a transition period for the U.S. government, it could be even wilder. So I would say, median bet on 2033-ish, but really care a lot about the impacts of AI; front-load your concern to pre-2033 scenarios.
And I think that MATS mentors, we haven't surveyed them, but I think if we were to poll them, we'd get something similar.
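As a rough illustration of why a 2033 median with a roughly 20% chance by 2028 means front-loading concern, here is a minimal sketch that fits a lognormal "years until AGI" distribution to those two figures and reads off nearer-term probabilities. The lognormal form and the 2025 reference year are assumptions for illustration, not anything stated in the episode.

```python
# Minimal sketch: fit a lognormal "years until AGI" distribution to a
# median of 2033 and a 20th percentile of 2028, then read off how much
# probability mass sits in nearer-term years. Assumptions: lognormal
# shape and a reference year of 2025.
import math

NOW = 2025                      # assumed reference year for "years from now"
median_years = 2033 - NOW       # 8 years -> median of the distribution
p20_years = 2028 - NOW          # 3 years -> 20th percentile

# Lognormal parameters: median = exp(mu); 20th percentile = exp(mu + sigma * z20)
z20 = -0.8416                   # 20th-percentile z-score of a standard normal
mu = math.log(median_years)
sigma = (math.log(p20_years) - mu) / z20

def prob_agi_by(year: int) -> float:
    """P(AGI arrives by `year`) under the fitted lognormal."""
    z = (math.log(year - NOW) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for year in (2028, 2030, 2033, 2040):
    print(f"P(AGI by {year}) ~ {prob_agi_by(year):.0%}")
```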
Yeah, I always say I'd rather prepare for the shorter timelines and then have a little extra time; I'm sure we'll find ways that we'll still need it. But it does seem wise to me to play to that sort of first-quartile range of possible outcomes.
Is there still any room in the program or in the community? Like, if I showed up and I was a 2063 guy, would I be sort of out on an island on my own? And are there any work streams going on right now in the sort of brain-computer interface vein, or the sort of... deep, totalizing interpretability? Obviously interpretability has, you know, different flavors, right?
But with the recent turn toward more pragmatic interpretability, I wonder if there's any space left for the sort of, you know, "we really want to understand everything" kind of interpretability, or if the general understanding is that it's probably going to take too long for us to really be excited about pushing it right now.
Yeah, it's a good question. I actually think there is plenty of room for this, and here's why. The mainline kind of meta-strategy that the AI safety community seems to be pursuing on the whole, and we're talking in terms of funding, in terms of sheer numbers of people and resources deployed, not necessarily in terms of LessWrong posts written or something, right?
But in terms of resources deployed, it's this AI control strategy, which is where basically you build what's perhaps better called an alignment MVP, a term coined by Jan Leike, former head of superalignment at OpenAI, now co-lead of alignment science at Anthropic.
And an alignment MVP is an AI system that is a minimum viable product for accelerating the pace of alignment research differentially over capabilities research, such that we get the right outcome. So basically, you're getting AIs to do your homework. And there's been a lot of debate on this. There's a very strong camp in the direction of, like, this just never will work, because as soon as an AI system is strong enough to be useful, it's dangerous, right? I think, you know, Claude Code shows this is not the case, at least for software engineering. But for people who think that aligning AI systems requires serious research taste,
they would probably say that Claude Code is nowhere near there, right? That AI systems generally are nowhere near that level of research taste. Now, all of the things that you're mentioning that pay off only in 2063 scenarios, presumably they only pay off over that kind of time period, not necessarily because of, like, I don't know, human challenge trials or something.
Maybe that makes a difference if you're interested in, like, I don't know, making humans more intelligent with genetic engineering or some of the crazy things that are being tossed around. But if you're mainly interested in, like, "oh, this thing is going to take decades of technical work," maybe you can compress those decades into a really short period of AI labor, right?
If you can get them to run faster, massively parallelize things, and, in general, just get them to do your homework, those 2063 AI alignment plans might be automatable over a shorter period of time. And so we should definitely be pursuing those, because the more we do to raise the waterline of understanding on these different scenarios...
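A minimal back-of-the-envelope sketch of the "compress decades of work into a short period of AI labor" point; every number here is a hypothetical assumption for illustration, not a figure from the episode.

```python
# Minimal sketch: calendar time needed = required researcher-years
# divided by (parallel AI researcher-equivalents * per-copy serial speedup).
# All numbers are hypothetical assumptions.
required_researcher_years = 10 * 38   # e.g. ~10 researchers working until a 2063-style payoff
parallel_copies = 1_000               # assumed AI researcher-equivalents run in parallel
serial_speedup = 5                    # assumed per-copy speedup over a human researcher

calendar_years = required_researcher_years / (parallel_copies * serial_speedup)
print(f"~{required_researcher_years} researcher-years compresses to ~{calendar_years:.2f} calendar years")
```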