
Dwarkesh

Speaker
1735 total appearances


Podcast Appearances

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model – Scott Alexander & Daniel Kokotajlo

That's not the Terminator scenario.

That's just one of these natural consequences of how we train it.

And I think that once a thousand of these natural consequences of training add up, the AI is evil in the same way that once the AI can do chess and philosophy and all these other things, eventually you have to admit it's intelligent.

Yeah.

So I think that each individual failure will maybe make the national news.

Maybe people will say, "Oh, it's so strange that GPT-7 did this particular thing," and then they'll train it away, and then it won't do that thing.

And there will be some point in the process of becoming superintelligent at which it makes, I don't want to say the last mistake, because you'll probably have a gradually decreasing number of mistakes approaching some asymptote, but the last mistake that anyone worries about.

And after that, it will be able to do its own thing.

Yeah, I think the alignment community did not really expect LLMs.

I mean, if you look in Bostrom's Superintelligence, there's a discussion of Oracle AIs, which are sort of like LLMs.

I think that came as a surprise.

I think one of the reasons I'm more hopeful than I used to be is that LLMs are great compared to the kind of reinforcement learning self-play agents that they expected.

I do think that now we are kind of starting to move away from the LLMs to those reinforcement learning agents.

We're going to face all of these problems again.

So...

I am the writer and the celebrity spokesperson for this scenario.

I am the only person on the team who is not a genius forecaster.

And maybe related to that, my p(doom) is the lowest of anyone on the team.