Joe Carlsmith

Speaker
1218 total appearances

Podcast Appearances

Dwarkesh Podcast
Joe Carlsmith - Otherness and control in the age of AGI

When we think about, like, in which context is it appropriate to try to exert various types of control or to kind of have more of what I call in the series yang, which is this kind of active kind of controlling force, as opposed to yin, which is this more kind of receptive, open, letting go concept.

A kind of paradigm context in which we think that is appropriate is if something is a kind of active aggressor against the sort of boundaries and cooperative structures that we've created as a civilization, right?

So...

I talk about the Nazis or in the piece, it's sort of like when you sort of invade, if something is invading, we often think it's appropriate to fight back, right?

And we often think it's appropriate to set up structures to kind of prevent and kind of ensure that these basic norms of kind of peace and harmony are kind of adhered to.

And I do think some of the kind of moral heft of some parts of the alignment discourse comes from drawing specifically on that aspect of our morality, right?

So we think the AIs are presented as aggressors that are coming to kill you.

And if that's true, then it's quite appropriate, I think, to really be like, okay,

It is kind of... That's classic human stuff.

Almost everyone recognizes that kind of self-defense or ensuring kind of basic norms are adhered to is a kind of justified use of certain kinds of power that would often be unjustified in other contexts.

So self-defense is a clear example there.

I do think it's important, though, to separate that concern from this other concern about... where does the future eventually go?

And how much do we want to be kind of trying to steer that actively?

So to some extent, I wrote the series partly in response to the thing you're talking about, which is, I think it is true that aspects of this discourse involve the possibility of like trying to grip, like, I think trying to kind of steer and grip and like kind of rent, you have the sense of the universe is about to kind of go off in some direction and you need to, and you know, people notice that muscle.

And part of what I want to do is like, well, we have a very rich ethical, human ethical tradition of thinking about like, what, when is it appropriate to try to exert what sorts of control over which things?

And I want that to be, I want us to bring the kind of full force and richness of that tradition to this discussion, right?

And not, like, I think it's easy if you're purely in this abstract mode of like utility functions, like human utility function, and there's like this competitor thing with utility function.