AI program beats pros in six-player poker

IMAGE: Noam Brown, a Facebook AI research scientist who is finishing his Ph.D. at Carnegie Mellon. Credit: Noam Brown

The game was six-player no-limit Texas Hold'em, which is now the most popular form of poker.

Pluribus - so named because it takes on many opponents at once - learns by playing against itself over and over and remembering which strategies worked best. The experiments took two forms: in one, Pluribus went up against five poker pros at once, while in the other, five copies of Pluribus sat at the table against a single pro. Among the pros it faced was Chris "Jesus" Ferguson, who has won six World Series of Poker events.

But the tables have turned: an algorithm trained by Facebook AI and Carnegie Mellon University beat the poker professionals in the latest no-limit Texas Hold'em showdown.

In the rounds where one human elite played 5,000 hands of poker against five copies of Pluribus, the AI beat the human by 32 milli big blinds per game. "The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems," the researchers said. A research paper about Pluribus was published Thursday in the journal Science.
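
For readers unfamiliar with the unit, "milli big blinds per game" is the standard way poker researchers report win rates: thousandths of a big blind won per hand played. The short Python sketch below, with made-up inputs, shows how such a figure is computed; it is an illustration, not the study's evaluation code.

```python
# Illustrative only: how a milli-big-blinds-per-game (mbb/game) win rate is
# computed. The inputs below are made up, not Pluribus' actual results.

def mbb_per_game(total_won_in_big_blinds: float, hands_played: int) -> float:
    """Win rate expressed as thousandths of a big blind per hand played."""
    return 1000 * total_won_in_big_blinds / hands_played

# Example: winning 160 big blinds over 5,000 hands works out to 32 mbb/game.
print(mbb_per_game(160, 5_000))  # -> 32.0
```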

The equilibrium-based approach that cracked two-player games didn't work in six-player poker games like no-limit Texas Hold 'em, Brown says.

Pluribus' algorithms produced some surprising features in its strategy, and the pros seemed intrigued by them. While humans usually avoid so-called "donk betting" - ending one round of betting with a call and then opening the next round with a bet - Pluribus embraced the tactic.

"It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose", poker player Michael Gagliano said.

Artificial Intelligence (AI) has once again proved that it can match what humans do, and even beat them at some of it.

That conclusion comes from a series of experiments that Brown ran pitting Pluribus against human opponents. Each player received at least $0.40 per hand just for playing and as much as $1.60 per hand, depending on performance.

Pluribus registered a solid win with statistical significance, which is particularly impressive given its opposition, said Darren Elias, one of the pros who took on the bot.

These were not just average players, either; each of the pros had won more than $1 million playing poker professionally. As for the AI, it figures out which actions lead to better outcomes as it plays.

Michael "Gags" Gagliano, who has earned almost $2 million in career earnings, also competed against Pluribus. "And that's what makes it so hard to play against", said one of the humans, Jason Les. "Bots/AI are an important part in the evolution of poker, and it was wonderful to have first-hand experience in this large step toward the future". "No other popular recreational game captures the challenges of hidden information as effectively and as elegantly as poker", the researchers write.

And unlike in chess or Go, the computer does not have access to all the information available: it cannot see its opponents' cards. Poker is a bigger challenge than those games because it is an incomplete-information game: players can't be certain which cards are in play, and opponents can and will bluff. That makes the new approach more relevant to "real-world" problems, which often involve missing information and multiple players. Multi-player games also present fundamental additional issues for AI beyond those in two-player games.
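
To make the hidden-information point concrete, the toy Python sketch below counts how many two-card holdings an opponent could have given only the cards one player can see. The card encoding and helper function are invented for this illustration; they are not taken from Pluribus.

```python
from itertools import combinations

# Toy illustration of hidden information in poker: many possible opponent
# holdings are consistent with everything a single player can observe.

DECK = [rank + suit for rank in "23456789TJQKA" for suit in "cdhs"]

def possible_opponent_hands(my_cards, board):
    """All two-card holdings an opponent could have, given only what we can see."""
    seen = set(my_cards) | set(board)
    unseen = [c for c in DECK if c not in seen]
    return list(combinations(unseen, 2))

# Holding Ah Kd on a 2c 7s Jd flop, 47 cards remain unseen, so the opponent
# could hold any of 1,081 different two-card combinations.
print(len(possible_opponent_hands(["Ah", "Kd"], ["2c", "7s", "Jd"])))  # -> 1081
```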

In two-player zero-sum games, AI tends to approximate a Nash equilibrium, which guarantees, on average, a result no worse than a tie; the AI then comes out ahead once its opponent errs and can no longer maintain the equilibrium. Bluffing complicates matters further: a successful bluff can dramatically change a poker game in your favor, but bluff too often and the deception becomes predictable. Poker player Sean Ruane commented on Pluribus' relentless consistency.

Pluribus learned poker by playing copies of itself.
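
What "playing copies of itself" looks like in miniature: the sketch below runs regret-matching self-play on rock-paper-scissors, where the average strategy drifts toward the equilibrium of playing each move a third of the time. This is only a toy illustration of the self-play idea; Pluribus' actual training uses a far more elaborate form of counterfactual regret minimization over the full game of poker.

```python
import random

# Toy self-play sketch: regret matching in rock-paper-scissors. This is not
# Pluribus' training code -- just an illustration of how repeated self-play
# plus "remembering which strategies worked best" (tracked as regrets)
# pushes the average strategy toward an equilibrium (here, uniform play).

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

def payoff(a, b):
    """+1 if action a beats b, -1 if it loses, 0 on a tie."""
    return 0 if a == b else (1 if (a, b) in BEATS else -1)

def strategy_from_regrets(regrets):
    """Play in proportion to positive regret; fall back to uniform."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1 / 3] * 3

regrets = [[0.0] * 3, [0.0] * 3]        # cumulative regrets, one row per player
strategy_sums = [[0.0] * 3, [0.0] * 3]  # running totals for the average strategy

for _ in range(50_000):
    strategies = [strategy_from_regrets(r) for r in regrets]
    choices = [random.choices(range(3), weights=s)[0] for s in strategies]
    for p in (0, 1):
        played, opp = choices[p], ACTIONS[choices[1 - p]]
        got = payoff(ACTIONS[played], opp)
        for a in range(3):
            # Regret: how much better action a would have done than what we played.
            regrets[p][a] += payoff(ACTIONS[a], opp) - got
            strategy_sums[p][a] += strategies[p][a]

total = sum(strategy_sums[0])
print([round(s / total, 3) for s in strategy_sums[0]])  # ~[0.333, 0.333, 0.333]
```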

The equilibrium guarantee doesn't hold when you add in more players. Limited-lookahead search, meanwhile, is a standard approach in perfect-information games, but it is extremely challenging in imperfect-information games.
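
For contrast, here is what a bare-bones limited-lookahead (depth-limited) search looks like in a perfect-information setting, using a tiny stick-taking game invented for the example. The reason this recipe is so hard to carry over to poker is that the searcher can no longer enumerate "the" current state when opponents' cards are hidden.

```python
# Bare-bones depth-limited minimax for a tiny perfect-information game
# (Nim-style: remove 1-3 sticks, taking the last stick wins). In
# imperfect-information poker this approach breaks down, because the
# searcher cannot enumerate the true state when opponents' cards are hidden.

def legal_moves(sticks):
    return [m for m in (1, 2, 3) if m <= sticks]

def evaluate(sticks, maximizing):
    # Crude heuristic used when the depth limit cuts the search off:
    # positions where sticks % 4 == 0 are losing for the player to move.
    to_move_is_losing = (sticks % 4 == 0)
    return -1 if (to_move_is_losing == maximizing) else 1

def depth_limited_value(sticks, depth, maximizing):
    if sticks == 0:
        # The previous player took the last stick and won.
        return -1 if maximizing else 1
    if depth == 0:
        return evaluate(sticks, maximizing)
    values = [depth_limited_value(sticks - m, depth - 1, not maximizing)
              for m in legal_moves(sticks)]
    return max(values) if maximizing else min(values)

def best_move(sticks, depth=3):
    return max(legal_moves(sticks),
               key=lambda m: depth_limited_value(sticks - m, depth - 1, False))

print(best_move(10))  # -> 2, leaving a multiple of 4 for the opponent
```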

Pluribus also seeks to be unpredictable.

So Pluribus calculates how it would act with every possible hand it could hold and then computes a strategy balanced across all of those possibilities. The matches themselves were scored with AIVAT, a variance-reduction technique: if the bot is dealt a really strong hand, for example, AIVAT will subtract a baseline value from its winnings to counter the good luck.
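
The luck-correction idea can be pictured with a much simpler baseline adjustment than the real AIVAT procedure: subtract an estimate of what the dealt cards are worth on average, so that good and bad luck largely cancels out of the score. The hand categories and baseline numbers below are invented purely for illustration.

```python
# Simplified sketch of variance reduction by baseline subtraction. AIVAT itself
# is considerably more sophisticated; the hand classes and baseline values here
# are invented for the example, not taken from the actual evaluation.

# Hypothetical expected value (in big blinds) of being dealt each class of hand,
# before any betting skill comes into play.
BASELINE_BB = {"premium": 2.0, "medium": 0.2, "weak": -0.8}

def luck_adjusted_result(hand_class: str, actual_winnings_bb: float) -> float:
    """Credit skill, not luck: actual result minus the expected value of the deal."""
    return actual_winnings_bb - BASELINE_BB[hand_class]

# Winning 3 bb with a premium hand counts for less than winning 1 bb with a weak one.
print(luck_adjusted_result("premium", 3.0))  # -> 1.0
print(luck_adjusted_result("weak", 1.0))     # -> 1.8
```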

The AI also made remarkably efficient use of computation. Pluribus builds on and incorporates large parts of the technology and code behind Libratus, the two-player poker bot that Brown and Carnegie Mellon professor Tuomas Sandholm developed earlier. Sandholm has licensed much of that technology to startups; one, called Strategic Machine, pursues business and gaming applications.

VIDEO: Pluribus sets a trap for professional poker players and wins (video from Carnegie Mellon University, courtesy of Facebook).

Training Pluribus cost the equivalent of less than $150 worth of cloud computing resources; other recent AI milestone projects have required the equivalent of millions of dollars' worth of computing to train.
