Computer scientists at DeepMind have developed an artificially intelligent bot capable of defeating the world’s best players at StarCraft II, the popular real-time strategy video game.
New research published today in Nature describes AlphaStar, the first artificially intelligent agent capable of playing StarCraft II at the grandmaster level. Developed by DeepMind, the system is ranked above 99.8 percent of active players on Battle.net, the official game server of StarCraft II. This is obviously a big deal for the StarCraft II community, but the system’s proficiency also represents an important achievement for AI researchers, as similar approaches could be applied in the real world to solve complicated problems, or to expand the scope of machine intelligence.
UK-based DeepMind, which is owned by Google’s parent company Alphabet Inc., previously developed systems capable of playing chess, Go, and shogi at a superhuman level, but StarCraft II presented an entirely different set of challenges.
Released by Blizzard Entertainment in 2010, StarCraft II is a science fiction-themed real-time strategy video game in which two players compete against each other. Gamers can choose to play as one of three alien species—Terrans, Protoss, and Zerg—each with their own strengths, weaknesses, and idiosyncrasies.
StarCraft II has attracted the interest of AI researchers owing to its complex and open-ended gameplay. Unlike chess and Go, players have imperfect information in terms of what’s going on, making it similar to poker in that respect. The game also involves a massive decision space, as there are upwards of 10^26 possible actions available to players at each time step. Players can invoke thousands of actions before the game is either won or lost.
StarCraft II also involves game-theoretic scenarios and long-term planning, along with the challenge posed by real-time gameplay. Thus, the game is considered a “grand challenge” among AI researchers. To win, players scramble to collect resources, which they use to build bases and structures, and to develop powerful new tech to defeat their opponent. The game is not turn-based; it unfolds in real time. Much of the map is hidden from players, requiring them to scout their opponent’s moves and adjust their strategies accordingly. Games typically last around 5 to 20 minutes, but matches sometimes last for an hour or more.
All of that is partly why AI agents have historically failed to equal the best human players, even when the game is simplified. To finally create a system capable of playing at a high level, computer scientist Oriol Vinyals and his colleagues at DeepMind trained a neural network with general-purpose learning algorithms, namely a combination of imitation learning and reinforcement learning.
Imitation learning is exactly what it sounds like: an AI learns by imitating human gameplay. This strategy alone allowed AlphaStar to play better than 84 per cent of StarCraft II players. Reinforcement learning works by rewarding a system for progress toward a designated goal. By gaining or losing points, the system gradually adopts effective strategies, or policies, for achieving that goal.
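The reinforcement learning half of that description can be illustrated with a toy example. The sketch below is not AlphaStar's training method, which is far more elaborate; it is a minimal epsilon-greedy learner, and the three strategies and their win probabilities (`HYPOTHETICAL_WIN_PROBS`) are invented purely for illustration. The learner gains a point for a win and loses a point for a loss, and over many games its value estimates come to favour the strategy that wins most often.

```python
import random


def train_bandit(win_probs, episodes=5000, epsilon=0.1, seed=0):
    """Estimate the average reward of each strategy from +1/-1 outcomes."""
    rng = random.Random(seed)
    values = [0.0] * len(win_probs)  # running average reward per strategy
    counts = [0] * len(win_probs)    # how often each strategy was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random strategy.
            action = rng.randrange(len(win_probs))
        else:
            # Exploit: pick the strategy with the best estimate so far.
            action = max(range(len(win_probs)), key=lambda a: values[a])
        # Gain a point for a win, lose a point for a loss.
        reward = 1.0 if rng.random() < win_probs[action] else -1.0
        counts[action] += 1
        # Incremental update of the running average.
        values[action] += (reward - values[action]) / counts[action]
    return values


# Hypothetical win rates for three made-up strategies (assumption, not data).
HYPOTHETICAL_WIN_PROBS = [0.2, 0.5, 0.8]

values = train_bandit(HYPOTHETICAL_WIN_PROBS)
best = max(range(len(values)), key=lambda a: values[a])
print("preferred strategy:", best)
print("estimated values:", [round(v, 2) for v in values])
```

The learner is never told which strategy is best; it infers this from the points it gains and loses, which is the core idea the article describes.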
As part of its training, AlphaStar continually played itself in order to enhance its gamesmanship even further, and to devise even better strategies and counter-strategies.
In an early test of the system back in December 2018, the researchers at DeepMind pitted AlphaStar against two world-class players, Grzegorz “MaNa” Komincz and Dario “TLO” Wünsch from Team Liquid, both of whom were defeated handily.
The ultimate challenge, however, was for AlphaStar to achieve grandmaster status by playing under standard professional tournament conditions. Specifically, the system had to view the StarCraft II world through a camera, compete as any of the three alien species at a high level, use the same maps as the human players, apply an action rate comparable to human gameplay (a rate approved by Wünsch), and play on the Battle.net game server, among other stipulations.
Under these conditions, AlphaStar still managed to play at a high level, achieving the grandmaster rank for all three of the StarCraft alien species. It’s the first time an AI has achieved this level for a professionally played e-sport, and it did so without any of the previous restrictions, such as operating under a simplified version of the game.
“This is an extremely impressive AI achievement on a challenging two-player imperfect-information game that has a large number of actions to choose from at every point and the game lasts for thousands of actions,” Tuomas Sandholm, a professor of computer science at Carnegie Mellon University who wasn’t involved with the research, wrote in an email to Gizmodo. “Their AI starts by imitating human play and then continues to improve on its own using reinforcement learning.”
In a press release, professional StarCraft II player Diego “Kelazhur” Schwimer called the AI agent an “intriguing and unorthodox player, one with the reflexes and speed of the best pros but strategies and a style that are entirely its own.” Team Liquid’s Grzegorz “MaNa” Komincz, another professional player, said it was “exciting to see the agent develop its own strategies differently from the human players.”
Yet despite AlphaStar’s impressive performance, Sandholm believes there’s still room for improvement. And indeed, comments made by StarCraft II professionals hinted at possible weaknesses in the system.
“I’ve found AlphaStar’s gameplay incredibly impressive—the system is very skilled at assessing its strategic position, and knows exactly when to engage or disengage with its opponent,” Wünsch, professional StarCraft II player for Team Liquid, said. “And while AlphaStar has excellent and precise control, it doesn’t feel superhuman—certainly not on a level that a human couldn’t theoretically achieve. Overall, it feels very fair—like it is playing a ‘real’ game of StarCraft.”
“The approach is not as sophisticated on the strategic, game-theoretic aspects as recent AI milestones in poker, so the AI is likely exploitable,” Sandholm said. “It would be interesting to see an evaluation where humans can knowingly practice against the AI as a group for tens of thousands of games to try to find weaknesses in the AI.”
Sandholm’s team is also responsible for developing Pluribus, an AI capable of defeating poker pros at six-player Texas Hold’em. His team put Pluribus’ predecessor, the two-player Libratus AI, through this kind of test, and even after that intense testing, “even the top professionals were not able to beat… Libratus, although they had 120,000 game repetitions to try to do so,” explained Sandholm. Afterwards, “Libratus beat a team of strong professionals in a match in China despite them having scraped all those prior matches from the video stream and having analysed them computationally,” said Sandholm, to which he added: “In two-player zero-sum games, game-theoretic strategies are unbeatable even if the opponent knows your strategy.”
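Sandholm's remarks about exploitability and game-theoretic strategies can be made concrete with a much smaller two-player zero-sum game. The sketch below uses rock-paper-scissors as a stand-in (an assumption for illustration; it models neither Libratus nor AlphaStar). The hypothetical helper `exploitability` computes the most an opponent can win per game, on average, once it knows your mixed strategy.

```python
# Payoff to the replying player: rows are the reply, columns the opponent's
# move, both in the order rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper
]


def exploitability(strategy):
    """Best average payoff per game for an opponent who knows `strategy`.

    `strategy` is a list of three probabilities for rock, paper, scissors.
    """
    return max(
        sum(p * PAYOFF[reply][move] for move, p in enumerate(strategy))
        for reply in range(3)
    )


uniform = [1 / 3, 1 / 3, 1 / 3]   # the game-theoretic (equilibrium) strategy
biased = [0.5, 0.25, 0.25]        # a player who favours rock

print("uniform:", exploitability(uniform))
print("biased:", exploitability(biased))
```

Against the uniform strategy no reply does better than break even, even with full knowledge of it. The rock-heavy strategy can be beaten by always playing paper, which is the kind of weakness a group of humans might uncover after enough practice games.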
For the DeepMind team to progress even further, Sandholm recommended they study real-time games involving more than two players, similar to what his team achieved with Pluribus and the game of six-player Texas Hold’em poker.
These new insights into AI could be applied elsewhere to help systems solve complex, real-world problems, and to improve the generalisations of machine intelligence. With each passing breakthrough, however, there can only be fewer domains in which humans remain superior to AI.