Joe Carlsmith

Speaker
1218 total appearances

Podcast Appearances

Dwarkesh Podcast
Joe Carlsmith - Otherness and control in the age of AGI

So we don't want to, like, chain civilization to, like, a barbarous past or whatever.

Like everyone should agree on that, including, and the people who are interested in alignment also agree on that.

Um, so, uh, obviously there's a concern that people, like, don't engage in that process, or that something shuts down the process of reflection.

But I think everyone agrees we want that.

And so that will lead potentially to something that is quite different from our, uh, current conception of what's valuable.

Um, and, uh,

There's a question of how different.

And I think there are also questions about what exactly are we talking about with reflection?

I have an essay on this where I think this is not... I don't actually think there's a kind of off-the-shelf, pre-normative notion of reflection that you can just be like, oh, obviously you take an agent, you stick it through reflection, and then you get values, right?

Like, no.

There's a bunch of...

types of reflect.

I mean, I think that really there's just a bunch of, there's like a whole pattern of empirical facts about, like: take an agent, put it through some process of, like, reflection, all sorts of things, ask it questions.

There's like also, and then that'll go in all sorts of directions for a given empirical case.

And then you have to look at the pattern of outputs and be like, okay, what do I make of that?

Um, but overall I think we should expect, like, even the good futures, I think, will be quite weird.

Um, and they might even be incomprehensible.

Like,

to us. I don't, I don't think so. Like, so, I mean, there's different types of incomprehensible. So say I show up in the future and this is all computers, right? I'm like, okay, all right. And then they're like, we're, we ran, we're running, like, creatures on the computers. I'm like, so I have to somehow get in there and see, like, what's actually going on with the computers or something like that. Maybe I can actually see, maybe I actually understand what's going on in the computers, but I don't yet know what values I should be using to evaluate that. So it can be the case that you don't

us, if we showed up, would not be very good at, like, recognizing goodness or badness.