
Daniel Kokotajlo

Speaker
See mentions of this person in podcasts
617 total appearances

Podcast Appearances

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model - Scott Alexander & Daniel Kokotajlo

And that one is one where, you know, you're still trying different things.

There's failure and success and experimentation.

And then there's another where it's like the thing has happened, and now you send the probe out, and then you look out at the night sky six months later, and you see something occluding the sun.

In your story, you have basically two different scenarios after some point.

So yeah, what is a sort of crucial turning point and what happens in these two scenarios?

So in the world where they're getting deployed through the economy, but they are misaligned.

And you, you're, you know, people in charge, at least at this moment, think that they are in a good position with regard to misalignment.

It just seems with even smart humans, they get caught in weird ways because they don't have logical omniscience.

They don't realize the consequences of the way they did something which just obviously gave them away.

And there is this, with lying, there is this thing where it's just really hard to keep an inconsistent false world model alive.

working with the people around you, and that's why psychopaths often get caught.

And so if you have all these AIs that are deployed to the economy and they're all working towards this big conspiracy, I feel like one of them who's siloed or loses internet access and has to confabulate a story will just get caught, and then you're like, wait, what the fuck?

And then, you know, you catch it before it's, like, taken over the world.

So it is the case that certain things that people would have considered egregious misalignment in the past are happening.

But also certain things which people who are especially worried about misalignment said would be impossible to solve have just been solved in the normal course of getting more capabilities.

Like Eliezer had that thing about, can you even specify what you want the AI to do without the AI totally misunderstanding you and then just converting the universe to paperclips?

And now, just by the nature of GPT-4 having to understand natural language, it totally has a common sense understanding of what you're trying to make it do, right?

So I think this sort of like trend cuts both ways, basically.