A new poker-playing bot has been developed by the same group of researchers who presented Libratus back in 2017, and this time it has defeated a field of top poker players in six-handed no-limit Texas hold’em (NLHE).
Facebook worked with researchers from Carnegie Mellon University, led by CMU professor Tuomas Sandholm and his graduate student Noam Brown, to stage a showdown between their latest poker AI, called “Pluribus,” and some of the top poker players in the industry. The results, published earlier this month in the journal Science, were remarkable. The bot beat its opponents, something no other AI had achieved before (previous AI programs could only win at two-player poker). The outcome is expected to have major implications for AI research and for the game of poker itself.
The Two Trials
Poker embodies the challenges of hidden information well, so researchers use it as a benchmark in developing artificial intelligence. Until Pluribus came along, however, no machine had beaten human players in multiplayer no-limit Texas hold’em, the most popular form of poker.
Pluribus battled some of the best poker pros in a 6-max NLHE format. The pros are 6-max specialists who have each won at least $1 million in their professional poker careers.
Pluribus underwent two different trials. In the first, five humans played against one copy of the AI (5H+1AI). In the second, one human faced five copies of Pluribus (1H+5AI). The copies were not allowed to communicate and did not know who they were playing against, which prevented any collusion.
Victory for Pluribus
The results of the two trials showed that Pluribus’ win rate was significantly higher than that of the human poker players.
Some of the players who joined in on the experiment were Anthony Gregg, Dong Kim, Greg Merson, Jacob Toole, Jason Les, Jimmy Chou, Linus Loeliger, Michael Gagliano, Nick Petrangelo, Sean Ruane, Seth Davies and Trevor Savage, with each player using a nickname during play.
In the 5H+1AI experiment, a total of 10,000 hands were played over the course of 12 days. In the 1H+5AI test, poker players Darren Elias and Chris Ferguson each played 5,000 hands against five copies of Pluribus.
When the results were tallied, it turned out that, if each chip were worth a dollar, the bot would have beaten the humans for about $5 per hand and nearly $1,000 per hour, according to Noam Brown’s Facebook AI blog post.
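The two win-rate figures can be cross-checked against each other. The short Python sketch below is illustrative arithmetic only: the function name `implied_hands_per_hour` is a hypothetical helper, and the dollar figures are the rounded ones quoted above. It derives the pace of play those figures imply, which works out to 200 hands per hour, or 18 seconds per hand, in line with the roughly 20 seconds per hand quoted later in this article for Pluribus’ self-play.

```python
def implied_hands_per_hour(dollars_per_hand: float, dollars_per_hour: float) -> float:
    """Hands per hour implied by a per-hand and a per-hour win rate."""
    return dollars_per_hour / dollars_per_hand


# Rounded figures quoted in the article (assumes one chip equals one dollar).
hands_per_hour = implied_hands_per_hour(dollars_per_hand=5.0, dollars_per_hour=1000.0)
seconds_per_hand = 3600 / hands_per_hour

print(f"{hands_per_hour:.0f} hands per hour, {seconds_per_hand:.0f} seconds per hand")
```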
What the Poker Pros Think of AI Pluribus
Here’s what the participants of the experiment had to say about the latest poker bot.
Seth Davies: “The most stimulating thing about playing against Pluribus was responding to its complex preflop strategies. Unlike humans, Pluribus used multiple raise sizes preflop. Attempting to respond to nonlinear open ranges was a fun challenge that differs from human games.”
Jason Les: “It is an absolute monster bluffer. I would say it’s a much more efficient bluffer than most humans. And that’s what makes it so difficult to play against. You’re always in a situation with a ton of pressure that the AI is putting on you and you know it’s very likely it could be bluffing here.”
Jimmy Chou: “Whenever playing the bot, I feel like I pick up something new to incorporate into my game. As humans I think we tend to oversimplify the game for ourselves, making strategies easier to adopt and remember. The bot doesn’t take any of these shortcuts and has an immensely complicated/balanced game tree for every decision.”
Chris Ferguson: “Pluribus is a very hard opponent to play against. It’s really hard to pin him down on any kind of hand. He’s also very good at making thin value bets on the river. He’s very good at extracting value out of his good hands.”
Darren Elias: “It’s just me and then five versions of this AI poker bot, which I would play against every day, thousands of hands. It was improving very rapidly, where it went from being a mediocre player to basically a world-class-level poker player in a matter of days and weeks. Which was pretty scary.”
Pluribus in a Nutshell
For now, it is safe to say that one of the world’s best poker players doesn’t have a poker face.
Pluribus’ core, known as its blueprint strategy, was built through self-play, that is, competition against copies of itself. The same approach was used to train OpenAI Five, a team of five neural networks that trained for the equivalent of 45,000 years and was able to defeat a pro eSports team in the video game Dota 2.
Pluribus teaches itself from scratch, using a type of reinforcement learning similar to that used by AlphaZero, DeepMind’s Go-playing AI. It begins by playing poker at random and improves as it determines which actions win more money. After each hand, it looks back at how it played and works out whether it would have made more money with different actions, such as raising instead of just calling a bet. If the alternatives would have led to better results, it becomes more likely to choose them in similar situations in the future.
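The hindsight loop described above resembles regret matching, the building block of the counterfactual regret minimization family of algorithms on which Pluribus’ published method is based. As an illustration only, the Python sketch below applies plain regret matching to rock-paper-scissors instead of poker. It is not Pluribus’ code: the names `strategy_from_regrets` and `self_play`, the toy game, and the deliberately lopsided starting regrets (player 0 begins by always throwing rock) are all assumptions made for this example.

```python
# PAYOFF[a][b] is the payoff to a player choosing action a against action b.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
NUM_ACTIONS = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action is regretted yet: fall back to playing uniformly at random.
    return [1.0 / NUM_ACTIONS] * NUM_ACTIONS


def self_play(iterations, initial_regrets):
    """Run two regret-matching players against each other.

    Returns each player's average strategy over all iterations.
    """
    regrets = [list(r) for r in initial_regrets]
    strategy_sums = [[0.0] * NUM_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            opponent = strategies[1 - player]
            # Expected payoff of each action against the opponent's strategy.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(NUM_ACTIONS))
                for a in range(NUM_ACTIONS)
            ]
            # Expected payoff of the strategy that was actually used.
            played_value = sum(
                strategies[player][a] * action_values[a] for a in range(NUM_ACTIONS)
            )
            for a in range(NUM_ACTIONS):
                # Regret: how much better action a would have done in hindsight.
                regrets[player][a] += action_values[a] - played_value
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average = self_play(10_000, initial_regrets=[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
for player, strategy in enumerate(average):
    print(player, [round(p, 3) for p in strategy])
```

In this toy game the players’ average strategies drift toward an even mix of rock, paper and scissors, which is the unexploitable way to play it. Poker adds hidden cards, betting rounds and far more decision points, so the real system needs considerably more machinery than this sketch.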
By playing trillions of hands of poker against itself, Pluribus formulated a basic strategy that it follows in matches. For every decision it makes, Pluribus compares the state of the game with its blueprint and looks a few moves ahead to see how the hand might play out. It then decides whether it can improve on the blueprint. Being self-taught without human input, Pluribus employs a few strategies that human players would not think of using.
The success of Pluribus is mainly due to its efficiency. When playing poker, it runs on just two central processing units (CPUs). By contrast, Libratus used 100 CPUs, and DeepMind’s original Go bot used almost 2,000 CPUs, when they first beat top pro players. When Pluribus plays against itself, it plays a hand in about 20 seconds, roughly twice as fast as professional human players.
Another advantage of Pluribus is its cost: computing its blueprint strategy would cost only about $144 at cloud-computing rates. By comparison, Libratus relied on a $9.65 million supercomputer, which was very expensive to run.
Of course, the significance of this experiment does not end at poker. Beating the best human pros in a six-handed game is a huge accomplishment in itself, but it also shows that the underlying techniques could serve purposes other than playing cards. According to Brown, the results show that AI can operate at ‘superhuman’ levels in scenarios with multiple participants and limited access to information, and the approach could possibly be applied anywhere from investment banking and negotiation strategies to self-driving car technology.
With Pluribus, multiplayer poker now joins the ranks of games such as chess and Go, in which the world’s best human players have been defeated by artificial intelligence.