Joe Carlsmith

👤 Person
1218 total appearances

Podcast Appearances

Dwarkesh Podcast
Joe Carlsmith - Otherness and control in the age of AGI

So our alignment work, control, cybersecurity, general epistemics, maybe some coordination applications, stuff like that.

There's a bunch of stuff you can do with AIs.

that in principle could kind of differentially accelerate our security with respect to the sorts of considerations we're talking about.

If you have the AIs that are capable of that, and you can successfully elicit that capability in a way that's not sort of being sabotaged or like messing with you in other ways, and they can't yet take over the world or do some other sort of really problematic form of power seeking, then I think

if we were really committed, we could like, you know, really go, go hard, put a ton of resources, really differentially direct this like glut of AI productivity towards these sort of security factors.

And, and hopefully kind of control and understand, you know, do a lot of these things you're talking about for kind of making sure our AIs don't kind of take over or mess with us in the meantime.

And I think we have a lot of tools there.

I think you have to

You have to really try though.

It's possible that those sorts of measures just don't happen or don't happen at the level of kind of commitment and diligence and like seriousness that you would need.

Especially if things are like moving really fast and there's other sort of competitive pressures and like, you know, the compute, ah, this is going to take compute to do these like intensive, all these experiments on the AIs and stuff and that compute, we could use that for experiments for the, you know, the next, the next scaling step and stuff like that.

So, yeah.

you know, I do, I am, I'm not here saying like, this is impossible, especially for that band of AIs.

It's just, I think you have to, you have to try really hard.

Yeah.

it is the case that by the time we're building superintelligence, we'll have like much better.

Uh, I mean, even right now, like when you, when you look at like labs talking about how they're planning to align the AIs, no one is saying like, we're going to do RLHF.

Um, you know, at the least you're talking about scalable oversight.

Um, you're, you have like some hope about, uh, interpretability.

You have like automated red teaming.