Chess’s New Best Player Is A Fearless, Swashbuckling Algorithm

Chess is an antique, about 1,500 years old, according to most historians. As a result, its evolution seems essentially complete, a hoary game now largely trudging along. That’s not to say that there haven’t been milestones. In medieval Europe, for example, the squares on the board were made to alternate black and white. In the 15th century, the queen got her modern powers.¹

And in the 20th century came the computer. Chess was simple enough (not many rules, smallish board) and complicated enough (many possible games) to make a fruitful test bed for artificial intelligence programs. This attracted engineering brains and corporate money. In 1997, they broke through: IBM’s Deep Blue supercomputer defeated the world champion, Garry Kasparov. Humans can’t hold a candle to supercomputers, or even smartphones, in competition anymore. Top human players do, however, lean on computers in training, relying on them for guidance, analysis and insight. Computer engines now mold the way the game is played at its highest human levels: calculating, stodgy, defensive, careful.

Or at least that’s how it has been. If you read headlines from the chess world last month, though, you’d think the game had been jolted forward again by an unexpected quantum leap. But to where?

The revolutionary is known as AlphaZero. It’s a new reinforcement learning algorithm, built on a neural network and developed by DeepMind, Google’s secretive artificial intelligence subsidiary. Unlike other top programs, which receive extensive input and fine-tuning from programmers and chess masters, drawing on the wealth of accumulated human chess knowledge, AlphaZero is exclusively self-taught. It learned to play solely by playing against itself, over and over and over, for 44 million games. It kept track of which strategies led to a win, favoring those, and which didn’t, casting those aside. After just four hours of this tabula rasa training, it clobbered the top chess program, an engine called Stockfish, winning 28 games, drawing 72 and losing zero. These results were described last month in a paper posted on arXiv, a repository of scientific research.
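To put that 28-72-0 result in more familiar terms, here is a rough back-of-the-envelope sketch. It turns the match record into a score percentage and then into an approximate rating gap using the standard logistic Elo formula. The formula is an assumption brought in for illustration; the article reports only the raw results, and the function names are hypothetical.

```python
import math

def match_score(wins: int, draws: int, losses: int) -> float:
    """Fraction of available points scored (win = 1, draw = 0.5, loss = 0)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_gap(score: float) -> float:
    """Rating difference implied by a score under the logistic Elo model.

    Assumption: the usual Elo expectation E = 1 / (1 + 10 ** (-gap / 400)),
    solved here for gap. Only defined for 0 < score < 1.
    """
    return 400 * math.log10(score / (1 - score))

# AlphaZero's reported record against Stockfish.
score = match_score(wins=28, draws=72, losses=0)
gap = elo_gap(score)

print(f"score: {score:.2f}")
print(f"implied rating gap: about {gap:.0f} points")
```

On this reading, a 100-game match with no losses works out to 64 percent of the points, which the Elo model treats as a gap of roughly 100 rating points. It is a single match under unusual conditions, so the figure is best taken as a loose illustration.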

Within hours, the chess world descended, like the faithful to freshly chiseled tablets of stone, on the sample of 10 computer-versus-computer games published in the paper’s appendix. Two broad themes emerged: First, AlphaZero adopted an all-out attacking style, making many bold material sacrifices to set up positional advantages. Second, elite chess may therefore not be as prone to dull draws as we thought. It will still be calculating, yes, but not stodgy, defensive and careful. Chess may yet have some evolution to go.

For a taste of AlphaZero’s prowess, consider the following play from one of the published games. It’s worth emphasizing here just how good Stockfish, which is open source and was developed by a small team of programmers, is. It won the 2016 Top Chess Engine Championship, the premier computer tournament, and no human player who has ever lived would stand a chance against it in a match.

It was AlphaZero’s turn to move, armed with the white pieces, against Stockfish with the black, in the position below:

AlphaZero is already behind by two pawns, and its bishop is, in theory, less powerful than one of Stockfish’s rooks. It’s losing badly on paper. AlphaZero moved its pawn up a square, to g4 — innocuous enough. But now consider Stockfish’s black position. Any move it makes leaves it worse off than if it hadn’t moved at all! It can’t move its king, or its queen, without disaster. It can’t move its rooks because its f7 pawn would die and its king would be in mortal danger. It can’t move any of its other pawns without them being captured. It can’t do anything. But that’s the thing about chess: You have to move. This situation is known as zugzwang, German for “forced move.” AlphaZero watches while Stockfish walks off its own plank. Stockfish chose to move its pawn forward to d5; it was immediately captured by the white bishop as the attack closed further in.

You could make an argument that that game, and the other games between the two computers, were some of the strongest contests of chess, over hundreds of years and billions of games, ever played.

But were they fair? After the AlphaZero research paper was published, some wondered if the scales were tipped in AlphaZero’s favor. Chess.com received a lengthy comment from Tord Romstad, one of Stockfish’s creators. “The match results by themselves are not particularly meaningful,” Romstad said. He cited the fact that the games were played giving each program one minute per move — a rather odd decision, given that games get much more complicated as they go on and that Stockfish was programmed to be able to allocate its time wisely. Players are typically allowed to distribute their allotted time across their moves as they see fit, rather than being hemmed in to a specific amount of time per turn. Romstad also noted that an old version of Stockfish was used, with settings that hadn’t been properly tested and data structures insufficient for those settings.
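Romstad’s point about time can be made concrete with a toy comparison. The sketch below contrasts a fixed one-minute-per-move budget with a budget of the same total size spread according to how hard each position is. The complexity weights are invented for illustration, and the proportional rule is a stand-in; it is not how Stockfish’s actual time management works.

```python
def fixed_allocation(num_moves: int, seconds_per_move: float = 60.0) -> list[float]:
    """Every move gets the same thinking time, as in the AlphaZero match."""
    return [seconds_per_move] * num_moves

def weighted_allocation(complexity: list[float], total_seconds: float) -> list[float]:
    """Split one overall time budget in proportion to each move's weight.

    Hypothetical rule: a move judged twice as complex gets twice the time.
    """
    total_weight = sum(complexity)
    return [total_seconds * weight / total_weight for weight in complexity]

# Made-up complexity weights for five consecutive moves; the fourth is critical.
complexity = [1, 1, 2, 4, 2]

fixed = fixed_allocation(len(complexity))
flexible = weighted_allocation(complexity, total_seconds=sum(fixed))

print("fixed:   ", fixed)
print("flexible:", flexible)
```

Both schedules spend the same five minutes in total. The flexible one simply moves time from the easy moves to the critical one, which is the freedom a fixed per-move limit takes away.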

Romstad called the comparison of Stockfish to AlphaZero “apples to orangutans.” A computer analysis of the zugzwang game, for example, reveals that Stockfish, according to Stockfish, made four inaccuracies, four mistakes and three blunders. Not all iterations of Stockfish are created equal.

DeepMind declined to comment for this article, citing the fact that its AlphaZero research is under peer review.

Strong human players want to see more, ideally with the playing field more level. “I saw some amazing chess, but I also know we did not get the best possible,” Robert Hess, an American grandmaster, told me. “This holds true for human competition as well: If you gave Magnus [Carlsen] and Fabiano [Caruana] 24 hours per move, would there be any wins? How few mistakes? In being practical, we sacrifice perfection for efficiency.”

Chess.com surveyed a number of top grandmasters, who were assembled this month for a tournament in London (the home of DeepMind), about what AlphaZero means for their profession. Sergey Karjakin, the Russian world championship runner-up, said he’d pay “maybe $100,000” for access to the program. One chess commentator joked that Russian president Vladimir Putin might help Karjakin access the program to prepare for next year’s Candidates Tournament. Maxime Vachier-Lagrave, the top French player, said it was “worth easily seven figures.” Wesley So, the U.S. national champion, joked that he’d call Rex Sinquefield, the wealthy financier and chess philanthropist, to see how much he’d pony up.

“I don’t think this changes the landscape of human chess much at all for the time being,” the grandmaster Hess told me. “We don’t have the ability to memorize everything, and the games themselves were more or less perfect models of mostly known concepts.”

In some aesthetic ways, though, AlphaZero represents a computer shift toward the human approach to chess. Stockfish evaluated 70 million positions per second, a brute-force number suited to hardware, while AlphaZero evaluated only 80,000, relying on its “intuition,” as a human grandmaster would. Moreover, AlphaZero’s style of play, relentless aggression, was thought to be “refuted” by stodgy engines like Stockfish, leading to the careful and draw-prone style that currently dominates the top ranks of competitive chess.
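The gap between those two search speeds is easier to appreciate per move than per second. The arithmetic below multiplies each reported rate by the one minute per move used in the match. It assumes, purely for illustration, that each engine searched at its reported rate for the full minute; the constant and function names are hypothetical.

```python
SECONDS_PER_MOVE = 60  # the match gave each program one minute per move

# Search speeds as reported in the article.
STOCKFISH_POSITIONS_PER_SECOND = 70_000_000
ALPHAZERO_POSITIONS_PER_SECOND = 80_000

def positions_per_move(positions_per_second: int, seconds: int = SECONDS_PER_MOVE) -> int:
    """Positions examined in one move, assuming a constant search rate."""
    return positions_per_second * seconds

stockfish_total = positions_per_move(STOCKFISH_POSITIONS_PER_SECOND)
alphazero_total = positions_per_move(ALPHAZERO_POSITIONS_PER_SECOND)
ratio = stockfish_total // alphazero_total

print(f"Stockfish: {stockfish_total:,} positions per move")
print(f"AlphaZero: {alphazero_total:,} positions per move")
print(f"Stockfish examined {ratio} times as many positions")
```

Under these assumptions, Stockfish looked at about 4.2 billion positions on each move to AlphaZero’s 4.8 million, a ratio of 875 to 1, and still did not win a game.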

But maybe it’s more illustrative to say that AlphaZero played like neither a human nor a computer, but like an alien — some sort of chess intelligence which we can barely fathom. “I find it very positive!” David Chalmers, a philosopher at NYU who studies AI and the singularity, told me. “Just because it’s alien to us now doesn’t mean it’s something that humans could never have gotten to.”

In the middle of the AlphaZero paper is a diagram called Table 2. It shows the 12 most popular chess openings played by humans, along with how frequently AlphaZero “discovered” and played those openings during its intense tabula rasa training. These openings are the result of extensive human study and trial (blood, sweat and tears) spread across the centuries and around the globe. AlphaZero taught them to itself one by one: the English Opening, the French, the Sicilian, the Queen’s Gambit, the Caro-Kann.

The diagram is a haunting image, as if a superfast algorithm had taught itself English in an afternoon and then re-created, almost by accident, full stanzas of Keats. But it’s also reassuring. That we even have a theory of the opening moves in chess is an artifact of our status as imperfect beings. There is a single right and best way to begin a chess game. Mathematical theory tells us so. We just don’t know what it is. Neither does AlphaZero.

Yet.

DeepMind was also responsible for the program AlphaGo, which has bested the top humans in Go, that other, much more complex ancient board game, to much anguish and consternation. An early version of AlphaGo was trained, in part, by human experts’ games — tabula inscripta. Later versions, including AlphaZero, stripped out all traces of our history.

“For a while, for like two months, we could say to ourselves, ‘Well, the Go AI contains thousands of years of accumulated human thinking, all the rolled up knowledge of heuristics and proverbs and famous games,’” Frank Lantz, the director of NYU’s Game Center, told me. “We can’t tell that story anymore. If you don’t find this terrifying, at least a little, you are made of stronger stuff than me. I find it terrifying, but I also find it beautiful. Everything surprising is beautiful in a way.”

Footnotes

¹ Long ago, the queen could move only one square diagonally at a time.

FiveThirtyEight
