The Machines Are Coming For Poker

PITTSBURGH — The sound of shuffling stacks and splashing piles of colorful chips filled the cavernous poker room of the Rivers Casino. It was noon on a Wednesday, and on one side of the hall, dozens of men at a handful of tables were peering at their cards, placing bets and taking one another’s money. On the other side sat two players, roped off from the rest. They were carefully deploying sophisticated poker strategies and tactics, drawing on the sum of human knowledge about the game. Yet they held no cards and stacked no chips. Their faces were lit by the blue glow of computer screens, and their opponent, an artificial intelligence program running on a brand new Hewlett Packard supercomputer, sat unblinking in a suburb 15 miles away. The two poker pros were playing for more than money. Pride, and the future of poker, was on the line.

Those two players, Jason Les and Daniel McAulay, are part of a four-person team, along with Jimmy Chou and Dong Kim, taking on Libratus, a poker superprogram. The pros and the program are both experts in a type of poker called heads-up no-limit Texas Hold ’em. The game is one-on-one, and each player is dealt two private cards and uses up to five shared, public cards to make the best hand. Each round, players can place bets of any size, up to and including their entire cache of dollars (in this case digital and ersatz).

“It’s such a good form of poker,” Les told me. “You’re involved in so many pots, there are so many tough spots, and bluffing is so important. It really is the purest form.”

By the end of the month, the human quartet will have played 120,000 hands against “the bot,” as they all called it. It’s a rematch of 2015’s showdown, during which 80,000 hands were played and the humans — two of whom are back this year — outplayed a similar bot. But this year’s A.I. iteration is stronger and faster. And next year’s will be stronger and faster still. In the long run, it’s a lost cause — the four pros admit that a bot will win eventually. But they aren’t about to give up without a fight.

For the past few decades, humans have ceded thrones to artificial intelligence in games of all kinds. In 1995, a program called Chinook won a man vs. machine world checkers championship. In 1997, Garry Kasparov, probably the best (human) chess player of all time, lost a match to an IBM computer called Deep Blue. In 2007, checkers was “solved,” mathematically ensuring that no human would ever again beat the best machine.¹ In 2011, Ken Jennings and Brad Rutter were routed on “Jeopardy!” by another IBM creation, Watson. And last March, a human champion of Go, Lee Sedol, fell to a Google program in devastating and bewildering fashion.

Poker may be close to all we have left. Computers have yet to beat humans in a major no-limit competition like this. “I think of it as the last frontier within the visible horizon,” an eager Tuomas Sandholm said as we sat at an empty poker table.

Sandholm, a computer science professor at Carnegie Mellon University, isn’t much of a poker player. But he kept a watchful eye over the action on the digital tables as he sipped a coffee. His uniform, standard issue for academics (blazer, tie, chinos, glasses), contrasted with that of the players (hoodies, T-shirts, jeans, sneakers). Sandholm and his Ph.D. student Noam Brown, who together wrote the code for Libratus (Latin for “balanced” or “powerful”), were in high spirits. After that morning’s session, the first of the match, their algorithmic creation was off to an early $46,346 lead.

For computer scientists, poker is an artificial intelligence test bed. Unlike chess, say, it’s a game of imperfect information: You don’t know what cards your opponent holds. This is attractive to both players (it’s fun!) and programmers (it’s a challenge that more closely cleaves to real-world applications).

Humans have already lost their dominance in some versions of poker. Limit Hold ’em, in which there is a fixed bet size in each round, has around $10^{14}$ possible decision points, the situations in which a player must make a move. A bot bested top humans at that game in 2008, and it was essentially solved two years ago. No-limit Hold ’em is an A.I. frontier in part because of its size. While a novice can easily learn the rules in a single sitting, the game is said to have something like $10^{160}$ decision points, allowing for remarkably deep strategic ideas. Suppose that every single atom in the universe were an entire other universe unto itself. If you counted up all the atoms in all those universes, you’d get a number similar to the number of possible decision points in heads-up no-limit Hold ’em.
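The comparison can be checked with simple arithmetic. As a rough sketch, assume the commonly cited estimate of about $10^{80}$ atoms in the observable universe (that figure is an assumption here, not one given by the researchers):

$$
\underbrace{10^{80}}_{\text{atoms per universe}} \times \underbrace{10^{80}}_{\text{universes}} = 10^{80+80} = 10^{160}
$$

By the same arithmetic, the no-limit game is larger than the limit game by a factor of roughly $10^{160} / 10^{14} = 10^{146}$.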

Sandholm told me that he thinks his poker bot may in the future be applicable to cybersecurity and warfare — two other imperfect information “games.”

Libratus taught itself heads-up no-limit Hold ’em through something called “reinforcement learning.” After being told only the rules, it played itself over and over, trying to win the most money from itself, trillions of times. It’s a bit like shadowboxing in front of the mirror for a few months, furiously and alone, before getting into a ring with Mike Tyson at Caesars Palace.
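To give a flavor of what learning by self-play can look like, here is a minimal sketch on a far smaller game, rock-paper-scissors. It uses regret matching, a simple technique from the family of regret-minimization methods common in poker research. It is not Libratus's actual code or algorithm, which the article does not detail. The function names, the starting bias toward rock and the iteration count are all illustrative assumptions.

```python
# Toy self-play on rock-paper-scissors using regret matching.
# This is an illustrative stand-in, NOT Libratus's method.

# PAYOFF[a][b] is the payoff to a player choosing action a against action b,
# with 0 = rock, 1 = paper, 2 = scissors. The game is symmetric.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        # No action is regretted yet, so mix uniformly.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Have two copies of the learner play each other; return average strategies.

    initial_regrets is a hypothetical starting bias (toward rock) for the
    first copy, so that the learners have something to correct.
    """
    n = 3
    regrets = [list(initial_regrets), [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        # Both copies choose their mix from regrets accumulated so far.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(n))
                for a in range(n)
            ]
            # Expected payoff of the mix this copy actually used.
            expected = sum(strategies[player][a] * values[a] for a in range(n))
            for a in range(n):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

Calling `self_play(100_000)` returns each copy's average strategy, which should drift toward playing rock, paper and scissors about a third of the time each, the unexploitable mix for this game. The real match involves a vastly larger game and far more computation.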

There is one obvious advantage that computers hold over humans in this game: randomization. In certain situations, it’s optimal to play a “mixed strategy,” randomly choosing from a set of options. If you hold a very strong hand, for example, it may be optimal to mix between betting various healthy amounts, hoping to get paid off, and checking, hoping for a lucrative check-raise. This keeps the opponent guessing and dodges effective counter-strategies. Humans, however, even if they know something like that might be optimal, are no good at picking randomly. They also can’t effectively deploy a large menu of bet sizes without getting hopelessly lost and tangled in the thicket of their own best-laid plans.
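For a computer, playing a mixed strategy amounts to a weighted random draw. The sketch below shows the idea with Python's standard `random` module. The action names and probabilities are made-up illustrations for a very strong hand, not figures from Libratus or from the players.

```python
import random
from collections import Counter

# Hypothetical mixed strategy for a very strong hand. The actions and their
# probabilities are illustrative assumptions, not taken from any real bot.
MIXED_STRATEGY = {
    "check": 0.25,         # hoping for a check-raise
    "bet_half_pot": 0.35,
    "bet_pot": 0.30,
    "all_in": 0.10,
}


def sample_action(strategy, rng):
    """Draw one action at random, weighted by its probability."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def empirical_frequencies(strategy, n_hands, seed=0):
    """Play the mix over many hands and report how often each action occurred."""
    rng = random.Random(seed)
    counts = Counter(sample_action(strategy, rng) for _ in range(n_hands))
    return {a: counts[a] / n_hands for a in strategy}
```

Over many hands, `empirical_frequencies(MIXED_STRATEGY, 100_000)` should land close to the intended probabilities, with no pattern from one hand to the next for an opponent to exploit. That is the part humans find hard to do in their heads.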

On the way to meet the players for dinner after their first day, I explained to my Uber driver, Rodney, what I’d been doing at the casino. He had some thoughts on the matter: “Man put his faith in the machine and called the machine the genius,” he said. “But the genius is the one who created the machine.”

The four human players are trying to prove Rodney’s point, and a hotel restaurant in downtown Pittsburgh has become the armory of a gaming Alamo. It was there that the quartet, fueled by charred cauliflower and Malbec, plotted its last stand. It was the players’ third meal there in 24 hours. At a marble-topped table, amid a din of jazz, they readied their intellectual garrison. Les and Kim pulled out their laptops. The four players are taking advantage of one of humankind’s advantages: We’re social creatures. They’ve banded together and are using the bot’s own weapons — and the hard data it generates — against it.

The players asked that I not report on strategically substantive portions of the conversation we had that night — the match is ongoing, and the battle plans require secrecy. After each day’s session, a log of every played hand is digitally delivered to the players. On that Wednesday, as they loaded up the data, color-coded numbers, heat and line graphs, and probabilities flashed into the window of their analysis software. They got to work dissecting the bot’s pre-flop strategy and its three-bet tendencies. They reviewed all the biggest hands of the day, digesting them nearly instantly, like a chess master might play a game over in his head in seconds. They looked to plug the leaks in their own strategy and rip open those in the bot’s. Slowly but surely, it seemed they were chipping away at Libratus’s game plan.

“Are you guys super-geniuses or something?” the waiter asked.

“We play poker professionally,” Chou responded. The waiter took that as a “yes.”

And if a dozen combined pounds of brain hold their own versus a nearly $10 million supercomputer, the waiter wouldn’t be wrong. The processor in the computer on which I’m writing this article has four cores. The poker-playing supercomputer has 400 nodes, each with 28 cores, Sandholm said. Nevertheless, I asked the four humans if they’d be willing to play Libratus not for digital dollars but for cold, hard cash the next day. They unanimously and enthusiastically said “yes.” One of humankind’s weaknesses: pride.

Should the humans prevail at the end of the month, another battle will inevitably await. In early January, days before my trip to Pittsburgh, a team of researchers at the University of Alberta uploaded a working paper describing DeepStack, a deep-learning artificial intelligence that, the paper says, has defeated a group of poker professionals at heads-up no-limit. Those players aren’t known to be specialists in the game, but it’s a victory of a certain kind for the computer nonetheless. A battle, if not the war.

Computer scientist Michael Bowling, who is a co-author of the DeepStack paper and was designated to receive inquiries about it, would not comment on the project because the work is under peer review.

Should the humans fail in their contest against Libratus, it may spell the death of the very game they love. If a robot is better than even the best humans, why would a novice human attempt to master the game? Already, the community of heads-up no-limit players is dwindling. Les estimated that there are at most 20 “good” players in the world. The game is not especially conducive to tournament or casino play, and after Black Friday decimated the online game in the U.S., there isn’t much easy money floating around to sustain the pros. One of the players in the match against Libratus has slept with his laptop so that an alert will wake him if a juicy game becomes available.

For now, the best bot has to run on an expensive supercomputer. But that might not be the case for long — one day it might fit on a laptop or in a pocket. “There may come a time when the bots are that good,” McAulay said. “And then, I think, the game will die.”

“I really hope we beat it,” he said.

Footnotes

¹ Checkers has 500 billion billion possible positions.

FiveThirtyEight
