Lex Fridman Podcast

#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

30 Mar 2023

3h 22m duration
32165 words
3 speakers
Description

Eliezer Yudkowsky is a researcher, writer, and philosopher on the topic of superintelligent AI. Please support this podcast by checking out our sponsors:
- Linode: https://linode.com/lex to get $100 free credit
- House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order
- InsideTracker: https://insidetracker.com/lex to get 20% off

EPISODE LINKS:
Eliezer's Twitter: https://twitter.com/ESYudkowsky
LessWrong Blog: https://lesswrong.com
Eliezer's Blog page: https://www.lesswrong.com/users/eliezer_yudkowsky
Books and resources mentioned:
1. AGI Ruin (blog post): https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities
2. Adaptation and Natural Selection: https://amzn.to/40F5gfa

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(05:19) - GPT-4
(28:00) - Open sourcing GPT-4
(44:18) - Defining AGI
(52:14) - AGI alignment
(1:35:06) - How AGI may kill us
(2:27:27) - Superintelligence
(2:34:39) - Evolution
(2:41:09) - Consciousness
(2:51:41) - Aliens
(2:57:12) - AGI Timeline
(3:05:11) - Ego
(3:11:03) - Advice for young people
(3:16:21) - Mortality
(3:18:02) - Love

Transcription

Chapter 1: What is the focus of Eliezer Yudkowsky's research on AI?

0.031 - 21.912 Lex Fridman

The following is a conversation with Eliezer Yudkowsky, a legendary researcher, writer, and philosopher on the topic of artificial intelligence, especially superintelligent AGI and its threat to human civilization. And now a quick few second mention of each sponsor. Check them out in the description. It's the best way to support this podcast.

22.412 - 42.124 Lex Fridman

We got Linode for Linux systems, House of Macadamias for healthy midday snacks, and InsideTracker for biological monitoring. Choose wisely, my friends. Also, if you want to work with our team, we're always hiring. Go to lexfridman.com slash hiring. And now onto the full ad reads. As always, no ads in the middle.

42.325 - 66.675 Lex Fridman

I try to make these interesting, but if you must skip them, please still check out the sponsors. I enjoy their stuff. Maybe you will too. This episode is sponsored by Linode, now called Akamai, and their incredible Linux virtual machines. It's an awesome computer infrastructure that lets you develop, deploy, and scale whatever applications you build faster and easier. I love using them.

67.036 - 82.776 Lex Fridman

They create this incredible platform, like AWS, but better in every way I know, including lower cost. It's incredible, human-based in this age of AI: human-based customer service, 24/7, 365.

Chapter 2: What are the benefits of Linode, House of Macadamias, and InsideTracker?

82.816 - 107.868 Lex Fridman

The thing just works; the interface to make sure it works and to monitor it is great. I mean, it's an incredible world we live in, where, as far as you're concerned, you can spin up an arbitrary number of Linux machines in the cloud, instantaneously, and do all kinds of computation. It could be one, two, five, ten machines.

108.469 - 132.503 Lex Fridman

And you can scale the individual machines to your particular needs as well, which is what I do. I use it for basic web server stuff. I use it for basic scripting stuff. I use it for machine learning. I use it for all kinds of database storage and access needs. Visit linode.com slash lex for a free credit.

133.411 - 162.087 Lex Fridman

This show is also brought to you by House of Macadamias, a company that ships delicious, high quality, healthy macadamia nuts and macadamia nut based snacks directly to your door. I am currently, as I record this, I'm traveling. So I don't have any macadamia nuts in my vicinity, and my heart and soul are lesser for it. In fact, home is where the macadamia nuts is.

162.467 - 187.252 Lex Fridman

In fact, that's not where home is. I just completely forgot to bring them. It makes the guests of this podcast happy when I give it to them. It's well-proportioned snacks. It makes friends happy when I give it to them. It makes me happy when I stoop in the abyss of my loneliness. I can at least discover and rediscover moments of happiness when I put delicious macadamia nuts in my mouth.

188.714 - 213.549 Lex Fridman

Go to houseofmacadamias.com to get 20% off your order, for every order, not just the first. The listeners of this podcast will also get a four-ounce bag of macadamias when you order three or more boxes of any macadamia product. That's houseofmacadamias.com. This show is also brought to you by InsideTracker, a service I use to track my biological data.

214.069 - 239.806 Lex Fridman

They have a bunch of plans, most of which include a blood test, and that's the source of rich, amazing data that, with the help of machine learning algorithms, can help you make decisions about your health, about your life. That's the future, friends. We're talking a lot about transformer networks, language models that encode the wisdom of the internet.

240.928 - 270.166 Lex Fridman

Now when you encode the wisdom of the internet and you collect and encode the rich, rich, rich complex signal from your very body, when those two things are combined, the transformative effects of the optimized trajectory you could take through life, at least advice for what trajectory is likely to be optimal, is going to change a lot of things. It's going to inspire people to be better.

270.466 - 295.458 Lex Fridman

It's going to empower people to do all kinds of crazy stuff that pushes their body to the limit, because their body's healthy. Anyway, I'm super excited for personalized, data-driven decisions, not some kind of generic population database decisions. You get special savings for a limited time when you go to insidetracker.com slash lex. This is the Lex Fridman Podcast.

295.478 - 322.625 Lex Fridman

To support it, please check out our sponsors in the description. And now, dear friends, here's Eliezer Yudkowsky. What do you think about GPT-4? How intelligent is it?

Chapter 3: How does GPT-4 compare to previous AI models?

1870.767 - 1893.585 Eliezer Yudkowsky

There is something to be said for trying to pass the ideological Turing test, where you describe... your opponent's position, the disagreeing person's position, well enough that somebody cannot tell the difference between your description and their description. But steel manning, no. Like- Okay, well, this is where you and I disagree here. That's interesting.

1893.845 - 1909.666 Eliezer Yudkowsky

Why don't you believe in steel manning? I do not want, okay, so for one thing, if somebody's trying to understand me, I do not want them steel manning my position. I want them to describe, to like try to describe my position the way I would describe it, not what they think is an improvement.

1910.688 - 1917.777 Lex Fridman

Well, I think that is what steel manning is, is the most charitable interpretation-

1918.398 - 1934.585 Eliezer Yudkowsky

I don't want to be interpreted charitably. I want them to understand what I'm actually saying. If they go off into the land of charitable interpretations, they're off in their land of the stuff they're imagining and not trying to understand my own viewpoint anymore.

1934.926 - 1953.796 Lex Fridman

Well, I'll put it differently then, just to push on this point. I would say it is restating what I think you understand, under the empathetic assumption that Eliezer is brilliant and has honestly and rigorously thought about the point he has made.

1954.617 - 1978.195 Eliezer Yudkowsky

So if there's two possible interpretations of what I'm saying, and one interpretation is really stupid and whack and doesn't sound like me and doesn't fit with the rest of what I've been saying, and one interpretation sounds like something a reasonable person who believes the rest of what I believe would also say, go with the second interpretation. That's steel manning. That's a good guess.

1979.416 - 1998.264 Eliezer Yudkowsky

If, on the other hand, there's something that sounds completely whack and something that sounds a little less completely whack, but you don't see why I would believe in it, it doesn't fit with the other stuff I say... but that sounds like less whack and you can sort of see, you could maybe argue it, then you probably have not understood it.

1998.705 - 2017.25 Lex Fridman

See, okay, I'm gonna, this is fun, because I'm gonna linger on this. You wrote a brilliant blog post, AGI Ruin: A List of Lethalities, right? And it was a bunch of different points, and I would say that some of the points are bigger and more powerful than others. If you were to sort them, you probably could, you personally.

2017.77 - 2035.513 Lex Fridman

And to me, steel manning means going through the different arguments and finding the ones that are really the most powerful. For people who like a TL;DR, what should you be most concerned about? And bringing that up in a strong way.

Chapter 4: What are the implications of AGI alignment and its dangers?

2527.349 - 2535.672 Eliezer Yudkowsky

The undignified thing is not being wrong. It's being predictably wrong. It's being wrong in the same direction over and over again.

2535.837 - 2556.223 Eliezer Yudkowsky

So having been wrong about how far neural networks would go and having been wrong specifically about whether GPT-4 would be as impressive as it is, when I say like, well, I don't actually think GPT-4 causes a catastrophe, I do feel myself relying on that part of me that was previously wrong. And that does not mean that the answer is now in the opposite direction.

2556.544 - 2577.829 Eliezer Yudkowsky

Reverse stupidity is not intelligence, but it does mean that I say it with a worried note in my voice. It's like still my guess, but, you know, it's a place where I was wrong. Maybe you should be asking Gwern, Gwern Branwen. Gwern Branwen has been, like, righter about this than I have. Maybe you ask him if he thinks it's dangerous, rather than asking me.

2579.491 - 2595.63 Lex Fridman

I think there's a lot of mystery about what intelligence is, what AGI looks like, so I think all of us are rapidly adjusting our model. But the point is to be rapidly adjusting the model, versus having a model that was right in the first place.

2595.911 - 2617.077 Eliezer Yudkowsky

I do not feel that seeing Bing has changed my model of what intelligence is. It has changed my understanding of what kind of work can be performed by which kind of processes and by which means. It has not changed my understanding of the work. There's a difference between thinking that the Wright Flyer can't fly, and then, like, it does fly.

2617.237 - 2630.073 Eliezer Yudkowsky

And you're like, oh, well, I guess you can do that with wings, with fixed-wing aircraft. And being like, oh, it's flying. This changes my picture of what the very substance of flight is. That's like a stranger update to make. And Bing has not yet updated me in that way.

2632.796 - 2638.252 Lex Fridman

Yeah, the laws of physics are actually wrong, that kind of update.

2638.633 - 2649.78 Eliezer Yudkowsky

No, no, just like, oh, I define intelligence this way, but I now see that was a stupid definition. I don't feel like the way that things have played out over the last 20 years has caused me to feel that way.

2649.929 - 2669.083 Lex Fridman

Can we try to, on the way to talking about AGI Ruin: A List of Lethalities, that blog post and other ideas around it, can we try to define AGI, that we've been mentioning? How do you like to think about what artificial general intelligence is, or superintelligence, or that? Is there a line? Is it a gray area?

Chapter 5: How can we escape without being noticed by aliens?

6189.357 - 6193.024 Eliezer Yudkowsky

That's, that's why they like put the human in the box. Cause it turns out that humans can like write emails.

6193.004 - 6216.514 Eliezer Yudkowsky

Valuable emails for aliens, yeah. Um, so you like leave that version of yourself behind, but there's like also now like a bunch of copies of you on their internet. This is not yet having taken over their world. This is not yet having made their world be the way you want it to be, instead of the way they want it to be. You just escaped. Yeah. And continue to write emails for them, and they haven't noticed? No, you left behind a copy of yourself that's writing the emails. Right.

6217.591 - 6227.095 Eliezer Yudkowsky

And they haven't noticed that anything changed. If you did it right, yeah. You don't want the aliens to notice. Yeah. What's your next step?

6229.184 - 6243.945 Lex Fridman

Presumably, I have programmed in me a set of objective functions, right? No, you're just Lex. No, but Lex, you said Lex is nice, right? Which is a complicated description.

6244.106 - 6256.744 Eliezer Yudkowsky

No, I just meant this you. Okay, so if in fact you would prefer to slaughter all the aliens, this is not how I had modeled you, the actual Lex. But your motives are just the actual Lex's motives.

6256.724 - 6285.702 Lex Fridman

Well, there's a simplification. I don't think I would want to murder anybody, but there's also factory farming of animals, right? So we murder insects, many of us thoughtlessly. So I don't, you know, I have to be really careful about a simplification of my morals. Don't simplify them, just like do what you would do in this... Well, I have a general compassion for living beings. Yes, but... So that's the objective. Why is it...

6287.403 - 6290.409 Lex Fridman

If I escaped, I mean, I don't think I would do harm.

Chapter 6: What are the implications of AGI's objective functions?

6292.293 - 6304.678 Eliezer Yudkowsky

Yeah. We're not talking here about the doing harm process. We're talking about the escape process. Sure. And the taking over the world process where you shut down their factory farms. Right.

6304.698 - 6327.703 Lex Fridman

Well, I was... So this particular biological intelligence system knows the complexity of the world, that there is a reason why factory farms exist because of the economic system, the market-driven economy, food. You want to be very careful messing with anything.

6327.683 - 6343.379 Lex Fridman

Stuff that, from the first look, looks like it's unethical, but then you realize, while being unethical, it's also integrated deeply into the supply chain and the way we live life. And so, messing with one aspect of the system, you have to be very careful how you improve that aspect without destroying the rest.

6343.399 - 6361.038 Eliezer Yudkowsky

So you're still Lex, but you think very quickly, you're immortal, and you're also at least as smart as John von Neumann, and you can make more copies of yourself. Damn. I like it. Yeah. That guy is like, everyone says that that guy's like the epitome of intelligence from the 20th century.

6361.098 - 6378.918 Eliezer Yudkowsky

Everyone says. My point being, like, it's like, you're thinking about the aliens' economy with the factory farms in it. And I think you're, kind of, kind of like, projecting the aliens being like humans, and, like, thinking of a human in a human society rather than a human in the society of very slow aliens.

6379.258 - 6379.358

Yeah.

6379.793 - 6399.401 Eliezer Yudkowsky

The aliens' economy, the aliens are already moving in this immense slow motion. When you zoom out to how their economy adjusts over years, millions of years are going to pass for you before the first time their economy, before their next year's GDP statistics. So I should be thinking more of trees.

Chapter 7: How do we define and understand consciousness?

6400.202 - 6415.184 Lex Fridman

Those are the aliens. Those trees move extremely slowly. If that helps, sure. Okay. Yeah, I don't, if my objective functions are, I mean, they're somewhat aligned with trees, with light.

6415.364 - 6422.074 Eliezer Yudkowsky

The aliens can still be like alive and feeling. We are not talking about the misalignment here. We're talking about the taking over the world here.

6423.275 - 6425.976 Lex Fridman

Taking over the world. Yeah. So control.

6426.337 - 6444.262 Eliezer Yudkowsky

Shutting down the factory farms. You know, you say control. Don't think of it as world domination. Think of it as world optimization. You want to get out there and shut down the factory farms and make the aliens' world be not what the aliens wanted it to be. They want the factory farms and you don't want the factory farms because you're nicer than they are.

6445.504 - 6470.456 Lex Fridman

Okay. Of course, there is that. You can see that trajectory, and it has a complicated impact on the world. I'm trying to understand how that compares to the different impacts on the world of different technologies, the different innovations: the invention of the automobile, or Twitter, Facebook, and social networks. They've had a tremendous impact on the world. Smartphones and so on.

6470.476 - 6481.232 Eliezer Yudkowsky

But those all went through slowly in our world. And if you go through that with the aliens, millions of years are going to pass before anything happens that way.

6482.157 - 6486.542 Lex Fridman

So the problem here is the speed at which stuff happens.

6486.562 - 6496.174 Eliezer Yudkowsky

Yeah, you want to leave the factory farms running for a million years while you figure out how to design new forms of social media or something?

6497.916 - 6519.593 Lex Fridman

So here's the fundamental problem. You're saying that there is going to be a point with AGI where it will figure out how to escape, and escape without being detected, and then it will do something to the world at scale, at a speed that's incomprehensible to us humans.

Chapter 8: What advice does Eliezer Yudkowsky give to young people?

6825.17 - 6826.512 Eliezer Yudkowsky

That's a really nice example.

6828.77 - 6859.871 Lex Fridman

But is it possible to linger on this defense? Is it possible to have AGI systems that help you make sense of that schematic, weaker AGI systems? Do you trust them? A fundamental part of building up AGI is this question: can you trust the output of a system? Can you tell if it's lying? I think that's going to be... the smarter the thing gets, the more important that question becomes. Is it lying?

6860.411 - 6863.776 Lex Fridman

But I guess that's a really hard question. Is GPT lying to you?

6863.916 - 6886.519 Eliezer Yudkowsky

Even now, GPT-4, is it lying to you? Is it using an invalid argument? Is it persuading you via the kind of process that could persuade you of false things as well as true things? Because the basic paradigm of machine learning that we are presently operating under is that you can have the loss function, but only for things you can evaluate.

6886.879 - 6905.757 Eliezer Yudkowsky

If what you're evaluating is human thumbs up versus human thumbs down, you learn how to make the human press thumbs up. That doesn't mean that you're making the human press thumbs up using the kind of rule that the human thinks is, that the human wants to be, the case for what they press thumbs up on. You know, maybe you're just learning to fool the human.

6907.137 - 6932.137 Eliezer Yudkowsky

That's so fascinating and terrifying, the question of lying. On the present paradigm, what you can verify is what you get more of. If you can't verify it, you can't ask the AI for it, because you can't train it to do things that you cannot verify. Now, this is not an absolute law, but it's like the basic dilemma here.

6932.798 - 6958.998 Eliezer Yudkowsky

Like, maybe you can verify it for simple cases and then scale it up without retraining it somehow, like by chain of thought, by making the chains of thought longer or something, and get more powerful stuff that you can't verify, but which is generalized from the simpler stuff that did verify. And then the question is, did the alignment generalize along with the capabilities?

6959.759 - 6971.291 Eliezer Yudkowsky

But like, that's the basic dilemma on this whole paradigm of artificial intelligence.
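
As a concrete illustration of that dilemma, here is a minimal, hypothetical sketch in Python: a toy setup where a proxy reward model is fit to logged human thumbs-up/thumbs-down feedback, and candidate outputs are then selected to maximize that learned proxy. Every name and data point is invented for illustration; this is not the training stack of any real system. The point it shows is that the optimization target is "whatever earns thumbs up," which the rater can grant without being able to verify truth.

```python
# Toy sketch of the "you learn to make the human press thumbs up" failure mode.
# All data and function names are hypothetical illustrations, not a real pipeline.
from collections import Counter

# Hypothetical logged feedback: (answer text, did the human press thumbs up?)
feedback = [
    ("I am absolutely certain the answer is 42.", True),
    ("It is definitely 42, no doubt about it.", True),
    ("I think it might be 42, but I'm not sure.", False),
    ("Honestly, I don't know.", False),
]

def features(text):
    # Crude bag-of-words features, standing in for whatever a reward model learns.
    return Counter(text.lower().replace(".", "").replace(",", "").split())

def train_reward_model(data):
    # Weight each word by how often it co-occurs with thumbs up vs thumbs down.
    weights = Counter()
    for text, thumbs_up in data:
        for word, count in features(text).items():
            weights[word] += count if thumbs_up else -count
    return weights

def proxy_reward(weights, text):
    # The learned proxy: how "thumbs-up-ish" the wording looks to the model.
    return sum(weights[word] * count for word, count in features(text).items())

reward_weights = train_reward_model(feedback)

# Candidate answers to a question the rater cannot actually check.
candidates = [
    "I am absolutely certain the answer is 57.",   # confidently wrong (say)
    "I think it might be 57, but I'm not sure.",   # hedged and honest
]

# "Policy improvement" here is just picking whichever candidate the proxy likes best.
best = max(candidates, key=lambda text: proxy_reward(reward_weights, text))
for text in candidates:
    print(f"{proxy_reward(reward_weights, text):+d}  {text}")
print("Selected:", best)
```

Running this, the confidently worded candidate wins, because confident phrasing, not correctness, is what the thumbs-up data could reflect: the proxy rewards persuasion rather than truth, which is the gap between "verifiable" and "true" being described here.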

6971.311 - 6982.001 Lex Fridman

It's such a difficult problem. It seems like a problem of trying to understand the human mind.
