Joe Carlsmith
So, you know, maybe the AIs really want to be, like, shmelpful and shmonest and shmarmless, right?
But their concept is, like, importantly different from the human concept.
And they know this.
So they know that the human concept would mean blah.
But their values ended up fixating on, like, a somewhat different structure.
Yeah.
So that's, like, another version.
And then a fifth version, which I think about less because it's just such an own goal if you do this, but I do think it's possible.
You could have AIs that are actually just doing what it says on the tin.
Like, you have AIs that are just genuinely aligned to the model spec.
They're just really trying to benefit humanity and reflect well on OpenAI.
And what's the other one?
Assist the developer or the user, right?
Yeah.
But your model spec, unfortunately, was just not robust to the degree of optimization that this AI is bringing to bear.
And so, when it's looking out at the world and asking, what's the best way to reflect well on OpenAI and benefit humanity and so on,
it decides that, you know, the best way is to go rogue.
I think that's a real own goal, because at that point you got so close.
You really just have to write the model spec well and red-team it suitably.