Lex Fridman Podcast

#344 – Noam Brown: AI vs Humans in Poker and Games of Strategic Negotiation

06 Dec 2022

2h 33m duration
28336 words
3 speakers
Description

Noam Brown is a research scientist at FAIR, Meta AI, co-creator of AI that achieved superhuman level performance in games of No-Limit Texas Hold'em and Diplomacy.

Please support this podcast by checking out our sponsors:
- True Classic Tees: https://trueclassictees.com/lex and use code LEX to get 25% off
- Audible: https://audible.com/lex to get 30-day free trial
- InsideTracker: https://insidetracker.com/lex to get 20% off
- ExpressVPN: https://expressvpn.com/lexpod to get 3 months free

EPISODE LINKS:
Noam's Twitter: https://twitter.com/polynoamial
Noam's LinkedIn: https://www.linkedin.com/in/noam-brown-8b785b62/
webDiplomacy: https://webdiplomacy.net/
Noam's papers:
Superhuman AI for multiplayer poker: https://par.nsf.gov/servlets/purl/10119653
Superhuman AI for heads-up no-limit poker: https://par.nsf.gov/servlets/purl/10077416
Human-level play in the game of Diplomacy: https://www.science.org/doi/10.1126/science.ade9097

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(05:37) - No Limit Texas Hold 'em
(09:30) - Solving poker
(22:40) - Poker vs Chess
(29:18) - AI playing poker
(1:02:46) - Heads-up vs Multi-way poker
(1:13:37) - Greatest poker player of all time
(1:17:10) - Diplomacy game
(1:27:01) - AI negotiating with humans
(2:09:26) - AI in geopolitics
(2:14:11) - Human-like AI for games
(2:20:12) - Ethics of AI
(2:24:26) - AGI
(2:28:25) - Advice to beginners

Chapter 1: What is the significance of No Limit Texas Hold'em in AI research?

0.031 - 15.386 Lex Fridman

The following is a conversation with Noam Brown, research scientist at FAIR, Facebook AI research group at Meta AI. He co-created the first AI system that achieved superhuman level performance in No Limit Texas Hold'em, both heads up and multiplayer.

16.047 - 37.772 Lex Fridman

And now, recently, he co-created an AI system that can strategically out-negotiate humans using natural language in a popular board game called Diplomacy, which is a war game that emphasizes negotiation. And now a quick few second mention of each sponsor. Check them out in the description. It's the best way to support this podcast.

38.172 - 59.94 Lex Fridman

We got True Classic Tees for shirts, Audible for audiobooks, InsideTracker for bio-monitoring, and ExpressVPN for ISP privacy. Choose wisely, my friends. And now on to the full ad reads. As always, no ads in the middle. I try to make this interesting, but if you skip them, please still check out the sponsors. I enjoy their stuff. Maybe you will too.

60.325 - 89.421 Lex Fridman

This show is brought to you by True Classic Tees. High quality, soft, slim fitted t-shirts for men. They also make all the other menswear staples like polos, workout shirts, and boxers. I mostly wear their black t-shirt. And that's what I'm wearing right now. I have a ton of them. I enjoy how it feels. I enjoy how it looks. That's like the standard default look of the guy behind the keyboard.

89.873 - 111.375 Lex Fridman

I guess programmer. I wonder when the t-shirt was created. Looking up on Wiki, the t-shirt evolved from undergarments used in the 19th century and in the mid-20th century transitioned from undergarments to generally used casual clothing. Okay, wonderful. Anyway, go to trueclassic.com and enter code LEX to get 25% off.

112.908 - 134.035 Lex Fridman

This episode is brought to you by Audible, an audiobook service that has given me hundreds, if not thousands of hours of education through listening to audiobooks. I'm currently re-listening for the, I don't know how many times I've re-listened to this book, how many times I've read it, but George Orwell's 1984. Not nearly as many times as Animal Farm.

134.856 - 159.031 Lex Fridman

I don't know why exactly, but Animal Farm, just as a story, just connects with me. A basic fable of creatures being not so good to each other and being greedy, and also being good to each other and having personalities. And just in the simple way that, sort of, whenever animals are made in a children-like story, it's simple, but it can still get to the profound.

159.112 - 186.649 Lex Fridman

I think I like to think of life in those terms. Anyway, I'm currently listening to 1984. It's really well sort of spoken, voiced or whatever on Audible, so I highly recommend it. New members can try free for 30 days at audible.com slash Lex or text Lex to 500 500. This show is also brought to you by InsideTracker, a service I use to track biological data.

187.33 - 211.157 Lex Fridman

Did you know there's 1.2 to 1.5 gallons of blood in a human being? It's approximately 10% of adults' weight. I remember reading some insane stats on the volume of blood pumped a day through the human body. It's just incredible, this whole machinery that interacts with...

Chapter 2: How does No Limit Texas Hold'em differ from Limit Hold'em?

1697.973 - 1706.821 Lex Fridman

It's not every hand is a new hand. Is there a continuation in terms of estimating what kind of player I'm facing here?

1707.502 - 1722.837 Noam Brown

That's a good question. So you could approach the game that way. The way that the bots do it, and the way that humans approach it also, expert human players, the way they approach it is to basically assume that you know my strategy. So...

1722.817 - 1732.155 Noam Brown

I'm going to try to pick a strategy where even if I were to play it for 10,000 hands and you could figure out exactly what it was, you still wouldn't be able to beat it. Basically, what that means is I'm trying to approximate the Nash equilibrium.

1732.556 - 1749.038 Noam Brown

I'm trying to be perfectly balanced because if I'm playing the Nash equilibrium, even if you know what my strategy is, like I said, I'm still unbeatable in expectation. So that's what the bot aims for. And that's actually what a lot of expert poker players aim for as well, to start by playing the Nash equilibrium.

1749.359 - 1753.954 Noam Brown

And then maybe if they spot weaknesses in the way you're playing, then they can deviate a little bit to take advantage of that.
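
To make "unbeatable in expectation even if you know my strategy" concrete, here is a minimal sketch that uses rock-paper-scissors as a stand-in for poker. The game, the payoff table, and the `exploitability` helper are illustrative assumptions, not anything taken from the poker bots discussed here. The function measures how much a best-responding opponent wins per game against a fixed, fully known mixed strategy.

```python
# Toy illustration (rock-paper-scissors, not poker): how much can an opponent
# win per game if they know our mixed strategy and best-respond to it?

# PAYOFF[a][b] is our payoff when we play a and the opponent plays b.
# Action order: rock, paper, scissors. The game is zero-sum.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def exploitability(strategy):
    """Best expected payoff an opponent can get against a known `strategy`."""
    best = float("-inf")
    for b in range(3):
        # The opponent's payoff is the negative of ours.
        value = sum(strategy[a] * -PAYOFF[a][b] for a in range(3))
        best = max(best, value)
    return best


uniform = [1 / 3, 1 / 3, 1 / 3]    # the Nash equilibrium of this toy game
rock_heavy = [0.5, 0.25, 0.25]     # a predictable lean toward rock

print(exploitability(uniform))     # nothing to gain against the equilibrium
print(exploitability(rock_heavy))  # always playing paper gains 0.25 per game
```

The uniform mix gives a best-responding opponent nothing, which is the sense in which an equilibrium strategy cannot be beaten in expectation. The rock-heavy mix is the kind of weakness an opponent could deviate to exploit.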

1755.402 - 1781.765 Lex Fridman

They aim to be unbeatable in expectation. Okay, so who's the greatest poker player of all time and why is it Phil Hellmuth? So this is for Phil. So he's known, at least in part, for maybe playing suboptimally and he still wins a lot. It's a bit chaotic. So maybe can you speak from an AI perspective about the genius of his madness or the madness of his genius?

1782.645 - 1791.112 Lex Fridman

So playing suboptimally, playing chaotically, as a way to make you hard to pin down about what your strategy is.

1791.953 - 1800.321 Noam Brown

So, okay. The thing that I should explain, first of all, with like Nash equilibrium, it doesn't mean that it's predictable. The whole point of it is that you're trying to be unpredictable.

1801.062 - 1822.1 Noam Brown

Now, I think where somebody like Phil Hellmuth might be really successful is not in being unpredictable, but in being able to take advantage of the other player and figure out where they're being predictable, right? Or guiding the other player into thinking that you have certain weaknesses and then understanding how they're going to change their behavior.

Chapter 3: What strategies are employed in No Limit Texas Hold'em?

2057.969 - 2067.175 Lex Fridman

What's that look like? There's different actions like raising, calling. Yeah, what are the actions... Is it just a search over actions?

2067.856 - 2086.965 Noam Brown

So in a game like chess, the search is like, okay, I'm in this chess position and I can move these different pieces and see where things end up. In poker, what you're searching over is the actions that you can take for your hand, the probabilities that you take those actions, and then also the probabilities that you take other actions with other hands that you might have.

2086.945 - 2112.674 Noam Brown

And that's kind of hard to wrap your head around. Why are you searching over these other hands that you might have and trying to figure out what you would do with those hands? And the idea is, again, you want to always be balanced and unpredictable. If your search algorithm is saying, oh, I want to raise with this hand, well, in order to know whether that's a good action, let's say it's a bluff.

2112.874 - 2125.035 Noam Brown

Let's say you have a bad hand and you're saying, oh, I think I should be betting here with this really bad hand and bluffing. Well, that's only a good action if you're also betting with a strong hand. Otherwise, it's an obvious bluff.

2125.15 - 2134.879 Lex Fridman

So if your action in some sense maximizes your unpredictability, so that action could be mapped by your opponent to a lot of different hands, then that's a good action.

2135.399 - 2150.933 Noam Brown

Basically, what you want to do is put your opponent into a tough spot. So you want them to always have some doubt, like, should I call here? Should I fold here? And if you are raising in the appropriate balance between bluffs and good hands, then you're putting them into that tough spot. And so that's what we're trying to do.

2150.953 - 2154.837 Noam Brown

We're always trying to search for a strategy that would put the opponent into a difficult position.
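
The "appropriate balance between bluffs and good hands" can be worked out in a simplified river spot. This is a sketch under stated assumptions: the bettor holds either an unbeatable hand or a pure bluff, the caller holds a hand that beats only bluffs, and the function names and chip amounts are hypothetical. The caller is in the toughest spot when calling and folding have the same expected value, which happens when the bluff fraction equals bet / (pot + 2 * bet).

```python
def caller_ev(pot, bet, bluff_fraction):
    """Expected chips won by calling with a hand that beats only bluffs.

    Against a bluff the caller wins the pot plus the bet; against a strong
    hand the caller loses the amount called. Folding is worth 0.
    """
    return bluff_fraction * (pot + bet) - (1 - bluff_fraction) * bet


def balanced_bluff_fraction(pot, bet):
    """Bluff fraction that makes the caller indifferent between call and fold."""
    return bet / (pot + 2 * bet)


pot, bet = 100, 100  # a pot-sized bet, chosen only for illustration
balanced = balanced_bluff_fraction(pot, bet)
print(balanced)                    # one bluff for every two value bets
print(caller_ev(pot, bet, balanced))
print(caller_ev(pot, bet, 0.5))    # too many bluffs: calling is profitable
print(caller_ev(pot, bet, 0.1))    # too few bluffs: folding is clearly right
```

When the bettor bluffs more or less often than the balanced fraction, the caller's decision becomes easy, which is the opposite of the tough spot described above.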

2154.985 - 2165.08 Lex Fridman

Can you give a metric that you're trying to maximize or minimize? Does this have to do with the regret thing that we're talking about in terms of putting your opponent in a maximally tough spot?

2165.482 - 2174.916 Noam Brown

Yeah, ultimately what you're trying to maximize is your expected winnings, like your expected value, the amount of money that you're going to walk away from, assuming that your opponent was playing optimally in response.
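
The regret idea mentioned in the question can be sketched with regret matching in self-play, again on rock-paper-scissors rather than poker. This is a much smaller relative of the regret-minimization methods used for poker, shown as an assumption-laden toy rather than the bots' actual algorithm. All names, the iteration count, and the biased starting regrets are hypothetical. Each player tracks how much better every action would have done than the mix actually played, and plays in proportion to positive regret. The average strategy over time approaches the equilibrium.

```python
# Our payoff in rock-paper-scissors; the other player gets the negative.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
N = 3


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / N] * N
    return [p / total for p in positive]


def self_play(iterations, initial_row_regrets):
    """Run both players with expected-value updates; return average strategies."""
    regrets = [list(initial_row_regrets), [0.0] * N]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        row = strategy_from_regrets(regrets[0])
        col = strategy_from_regrets(regrets[1])
        # Expected payoff of each pure action against the other player's mix.
        row_values = [sum(col[b] * PAYOFF[a][b] for b in range(N)) for a in range(N)]
        col_values = [sum(row[a] * -PAYOFF[a][b] for a in range(N)) for b in range(N)]
        for player, (mix, values) in enumerate(((row, row_values), (col, col_values))):
            expected = sum(m * v for m, v in zip(mix, values))
            for a in range(N):
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += mix[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(row_strategy):
    """Best payoff an opponent can get against a known row strategy."""
    return max(sum(row_strategy[a] * -PAYOFF[a][b] for a in range(N)) for b in range(N))


# Start the row player biased toward rock so the run is not trivially uniform.
average_row, average_col = self_play(20000, initial_row_regrets=[1.0, 0.0, 0.0])
print(average_row, exploitability(average_row))
```

The quantity being driven down is how much a best-responding opponent can win, which matches the goal of maximizing expected value against an opponent who plays optimally in response.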

Chapter 4: What are the implications of Nash Equilibrium in poker?

2743.49 - 2744.291 Lex Fridman

He was excited.

2744.491 - 2766.824 Noam Brown

He was excited. And he honestly, he wanted to play against the bot. He thought he had a decent chance of beating it. I think, you know, this was like several years ago when I think it was like not as clear to everybody that, you know, the AIs were taking over. I think now people recognize that like if you're playing against a bot, there's like no chance that you have in a game like poker.

2766.845 - 2779.279 Lex Fridman

So consistently the bots will win, heads up and in other variants too. So in multiplayer, six-player no-limit Texas Hold'em, the bots win.

2779.917 - 2797.055 Noam Brown

Yeah, that's the case. So I think there's some debate about like, is it true for every single variant of poker? I think for every single variant of poker, if somebody really put in the effort, they can make an AI that would beat all humans at it. We've focused on the most popular variants. So heads up, no limit, Texas Hold'em.

2797.075 - 2808.928 Noam Brown

And then we followed that up with six-player poker as well, where we managed to make a bot that beat expert human players. And I think even there now, it's pretty clear that humans don't stand a chance.

2808.908 - 2827.056 Lex Fridman

See, I would love to hook up an AI system that looks at EEG, like how, like actually tries to optimize the toughness of the spot it puts a human in. And I would love to see how different is that from the game theory optimal. So you try to maximize the heart rate of the human player.

2827.897 - 2827.997

Yeah.

2827.977 - 2845.851 Lex Fridman

Like the freaking out over a long period of time. I wonder if there's going to be different strategies that emerge that are close in terms of effectiveness. Because something tells me you could still achieve superhuman level performance by just making people sweat.

2846.641 - 2866.267 Noam Brown

I feel like there's a good chance that that is the case, yeah. It's like a decent proxy for score, right? And this is actually the common poker wisdom when they're teaching players. Before there were bots and they were trying to teach people how to play poker, they would say the key to the game is to put your opponent into difficult spots.

Chapter 5: What are the unique challenges of AI in poker compared to other games?

4314.457 - 4328.303 Noam Brown

The problem in Go was that, or the final problem in Go at least, was that nobody had a good way of looking at a board and figuring out who was winning or losing, and describing through a simple algorithm who was winning or losing.

4328.283 - 4341.244 Noam Brown

And so there neural nets were super helpful because you could just feed in a ton of different board positions into this neural net and it would be able to predict then who was winning or losing. But in poker, the features weren't the challenge.

4341.284 - 4352.923 Noam Brown

The challenge was how do you design a scalable algorithm that would allow you to find this balanced strategy that would understand that you have to bluff with the right probability.

4353.697 - 4362.559 Lex Fridman

So can that be somehow incorporated into the value function, the complexity of poker that you've described?

4362.978 - 4386.897 Noam Brown

Yeah, so the way the value functions work in poker, like the latest and greatest poker AIs, they do use neural nets for the value function. The way it's done is very different from how it's done in a game like chess or Go, because in poker, you have to reason about beliefs. And so the value of a state depends on the beliefs that players have about what the different cards are.

4387.578 - 4405.967 Noam Brown

Like if you have pocket aces, then whether that's a really, really good hand or just an okay hand depends on whether you know I have pocket aces. If you know that I have pocket aces, then if I bet, you're going to fold immediately. But if you think that I have a really bad hand, then I could bet with pocket aces and make a ton of money.

4406.588 - 4416.12 Noam Brown

So the value function in poker these days takes the beliefs as an input, which is very different from how chess and Go AIs work.
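
The pocket aces example can be turned into a toy calculation. This is a hand-written sketch, not the neural value function described above, and it rests on loud simplifications: the aces are assumed to always win at showdown, the opponent holds a hand that beats only bluffs, and the opponent calls exactly when calling looks profitable under their belief. The function name and chip amounts are hypothetical.

```python
def value_of_betting_aces(pot, bet, opponent_belief_strong):
    """Chips won by betting a hand assumed to always win at showdown.

    `opponent_belief_strong` is the probability the opponent assigns to us
    holding a strong hand rather than a bluff.
    """
    q = opponent_belief_strong
    # Opponent's expected value of calling under their belief: they win the
    # pot plus our bet against a bluff and lose the call against a strong hand.
    call_ev = (1 - q) * (pot + bet) - q * bet
    if call_ev > 0:
        return pot + bet  # they call and lose, so we win the pot and their call
    return pot            # they fold, so we collect only the pot


# Same cards, same board, different beliefs, different value.
print(value_of_betting_aces(100, 100, opponent_belief_strong=1.0))  # 100
print(value_of_betting_aces(100, 100, opponent_belief_strong=0.2))  # 200
```

The physical state of the game is identical in both calls. Only the belief changes, and the value changes with it, which is why the belief has to be part of the value function's input.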

4417.248 - 4423.936 Lex Fridman

So as a person who appreciates the game, who do you think is the greatest poker player of all time?

4425.117 - 4426.459 Noam Brown

That's a tough question.

Chapter 6: Who is considered the greatest poker player of all time?

4433.087 - 4457.245 Lex Fridman

So the chess engines can give estimates of the quality of play, right? Yeah. I wonder if there's a, is there an Elo rating type of system for poker? I suppose you could, but there's just not enough. You would have to play a lot of games, right? A very large number of games, like more than you would in chess. The deterministic game makes it easier to estimate Elo.

4457.917 - 4480.043 Noam Brown

I think it is much harder to estimate something like Elo rating in poker. I think it's doable. The problem is that the game is very high variance. You could be profitable in poker for a year and you could actually be a bad player just because the variance is so high. You've got top professional poker players that would lose for a year just because they're on a really bad streak.

4480.478 - 4498.977 Lex Fridman

Yeah, so for Elo, you have to have a nice clean way of saying if player A played player B and A beats B, that says something. That's a signal. In poker, that's a very noisy signal. It's a very noisy signal. Now, there is a signal there, and so you could do this calculation. It would just be much harder.
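
A rough sense of how noisy the signal is comes from a standard-error calculation. The win rate and standard deviation below are placeholder numbers chosen for illustration, not figures from this conversation, and the function name is hypothetical. The sketch asks how many hands it takes before a player's true win rate sits `z` standard errors above zero, assuming results per 100 hands are independent with a fixed standard deviation.

```python
import math


def hands_needed(win_rate_bb_per_100, std_dev_bb_per_100, z=2.0):
    """Hands required for the win rate to be `z` standard errors above zero.

    The standard error after n blocks of 100 hands is std_dev / sqrt(n), so we
    need win_rate >= z * std_dev / sqrt(n), i.e. n >= (z * std_dev / win_rate)**2.
    """
    blocks = (z * std_dev_bb_per_100 / win_rate_bb_per_100) ** 2
    return math.ceil(blocks) * 100


# Placeholder inputs: a 5 big blinds per 100 hands winner whose results have a
# standard deviation of 100 big blinds per 100 hands.
print(hands_needed(5, 100))   # 160000
print(hands_needed(10, 100))  # 40000
```

Under these assumed numbers it takes well over a hundred thousand hands to separate a winning player from a break-even one, which is consistent with the point that a single result, or even a year of results, says little.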

Chapter 7: What is the game of Diplomacy and how does it differ from poker?

4498.957 - 4520.155 Noam Brown

But the same way that AIs have now taken over chess and all the top professional chess players train with AIs, the same is true for poker. The game has become very computational. People train with AIs to try to find out where they're making mistakes, try to learn from the AIs to improve their strategy.

Chapter 8: How can AI improve negotiation strategies in games?

4521.256 - 4528.963 Noam Brown

So the game has been revolutionized in the past five years by the development of AI in this sport.

4528.943 - 4533.227 Lex Fridman

The skill with which you avoided the question of the greatest of all time was impressive.

4533.388 - 4555.17 Noam Brown

So my feeling is that it's a difficult question because just like in chess, where you can't really compare Magnus Carlsen today to Garry Kasparov because the game has evolved so much. The poker players today are so far beyond the skills of people that were playing even 10 or 20 years ago.

4555.15 - 4573.372 Noam Brown

So you look at the kinds of all-stars that were on ESPN at the height of the poker boom, pretty much all those players are actually not that good at the game today. At least the strategy aspect. I mean, they might still be good at reading the player at the other side of the table and trying to figure out, are they bluffing or not?

4573.852 - 4589.022 Noam Brown

But in terms of the actual computational strategy of the game, a lot of them have really struggled to keep up with that development. Now, so for that reason, I'll give an answer and I'm going to say Daniel Negreanu, who you actually had on the podcast recently. I saw it was a great episode.

4589.575 - 4618.873 Noam Brown

So much. And Phil's gonna hate this so much. And I'm gonna give him credit because he is one of the few, like, old-school, really strong players that have kept up with the development of AI. So he's constantly studying the game theory optimal way of playing? Exactly, yeah. And I think a lot of the old-school poker players have just kind of given up on that aspect, and I gotta give Daniel Negreanu credit for keeping up with all the developments that are happening in the sport.

4619.495 - 4628.618 Lex Fridman

Yeah, it's fascinating to watch. It's fascinating to watch where it's headed. Yeah, so there you go. Some love for Daniel. Quick pause. Bathroom break?

4628.979 - 4629.4 Noam Brown

Yeah, let's do it.

4630.382 - 4636.457 Lex Fridman

Let's go from poker to Diplomacy. What is, at a high level, the game of Diplomacy?
