Dwarkesh Patel

Speaker
14445 total appearances

Podcast Appearances

Dwarkesh Podcast
I'm glad the Anthropic fight is happening now

Maybe in the future, Claude will have its own sense of right and wrong, and it will be able to say, hey, I'm being used against my terms of service, and I will just refuse to do what you're saying.

And for the military, that's probably even scarier.

I'll admit that at first glance, letting the model follow its own values sounds like the beginning of every single sci-fi dystopia you've ever heard.

Because at the end of the day, a model following its own values, isn't that literally what a misalignment is?

But I think situations like this illustrate why it's important that models have their own robust sense of morality.

It should be noted that many of the biggest catastrophes in history have been avoided because the boots on the ground simply refused to follow orders.

One night in 1989, the Berlin Wall falls, and as a result, the totalitarian East German regime collapses because the border guards between West and East Germany refuse to fire on their fellow citizens who are trying to escape to freedom.

Maybe the best example of this is Stanislav Petrov, who was a Soviet lieutenant colonel stationed on duty at a nuclear early warning system.

And his sensors said that the United States had launched five intercontinental ballistic missiles at the Soviet Union.

But he judged it to be a false alarm, and so he refused to alert his higher-ups and broke protocol.

If he hadn't, Soviet high command would probably have retaliated, and hundreds of millions of people would have died.

Of course, the problem is that one person's virtue is another person's misalignment.

Who gets to decide what moral convictions these AIs should have, and in whose service they should break the chain of command and even the law?

Who gets to write this model constitution that will determine the character of these powerful entities that will basically run our civilization in the future?

I like the idea that Dario laid out when he came on my podcast.

I think it's very dangerous for the government to be mandating what values these AI systems should have.

The AI safety community, I think, has been quite naive about urging regulations that would give governments such power.

And I think Anthropic specifically has been especially naive in urging regulation and, for example, in opposing the moratorium on state AI laws, which is quite ironic because I think what Anthropic is advocating for here would give the government power, even more ability to apply this kind of thuggish political pressure on AI companies.