According to Vox, DeepMind, a British AI company owned by Google's parent Alphabet, has conquered chess and Go and turned to sophisticated real-time games. Now it has reached a new milestone by beating StarCraft II's professional human players 10 games to 1.
AlphaStar, a new AI system developed by DeepMind, has now competed against several professional StarCraft II players, an impressive demonstration of how far AI capabilities have come. StarCraft is a complex strategy game that requires players to weigh hundreds of options at any moment, make strategic choices whose payoffs unfold over a long campaign, and operate in a rapidly changing environment with imperfect information. More than 200,000 competitive StarCraft games are played every day.
DeepMind's games were broadcast live on YouTube and Twitch, and since the videos were released on Tuesday, they have drawn intense interest from gamers and AI enthusiasts alike. The result was striking: AlphaStar beat the human players 10 games to 1. Its success surprised observers. The AI did make some mistakes, a few obvious and a few simply strange, but it still came out on top.
Is any game beyond AI's reach?
Three years ago, the AI lab DeepMind caused a worldwide sensation with its neural-network-based system AlphaGo. DeepMind had been acquired by Google in 2014 and now operates as an independent subsidiary of Alphabet, Google's parent company. AlphaGo surpassed all human Go experts and played strategies that amazed and fascinated professional Go players. A year later, DeepMind introduced AlphaZero, an improved AI system for two-player games that could be trained to master Go, chess, and other games with similar properties.
Chess and Go share features that make the same machine learning techniques directly applicable to both. They are two-player games of perfect information, meaning no part of the game state is hidden from either player. On each turn, a player makes a single decision: in chess, which piece to move where; in Go, where to place the next stone.
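This structure, perfect information and one decision per alternating turn, is what lets a single game-solving recipe apply across such games. As a toy illustration (not DeepMind's method), here is a minimal sketch that exactly solves a trivial perfect-information game: a subtraction game where players alternately remove 1 to 3 stones from a pile, and whoever takes the last stone wins.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def player_to_move_wins(stones):
    """True if the player about to move can force a win from this position."""
    if stones == 0:
        # No stones left: the previous player took the last one and won.
        return False
    # Try every legal move; if any move leaves the opponent in a
    # losing position, the current player can force a win.
    return any(not player_to_move_wins(stones - take)
               for take in (1, 2, 3) if take <= stones)
```

The same exhaustive "consider every reply" logic underlies classical chess and Go engines; chess and Go are simply far too large to solve this way, which is where learned evaluation comes in.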
Modern competitive computer games like StarCraft are far more complex. They typically require players to make many decisions at once, including deciding where to focus their attention, and they involve incomplete information: you don't know what your opponent is doing or what you will face next.
These features make such games a well-suited testing platform for AI. Deep learning systems of the kind DeepMind specializes in require enormous amounts of data to develop their capabilities, including data about how people actually play. People have been playing StarCraft and StarCraft II online for 20 years. For AI, these games pose a greater challenge than chess or Go, but enough data is available to make that challenge surmountable.
For this reason, AI labs are increasingly interested in testing their creations in online games. OpenAI has been working on AI systems that can take on professional players in Dota 2. DeepMind partnered with Blizzard Entertainment as early as 2017 to release tools for training AI systems in games like StarCraft. Today, we are seeing the results of those efforts.
StarCraft has several game modes, but the competitive mode is a two-player game. Each player starts with some basic resources, builds a base, sends out scouts, and, when ready, dispatches troops to attack the enemy base. The first player to destroy all of the opponent's buildings wins. Some StarCraft games end very quickly: a player can build an army early, send it before the opponent is ready, and win within five minutes.
Other games can last more than an hour. We watched AlphaStar play early games with a fast, aggressive strategy, as well as games that ran significantly longer, with both sides fielding huge armies and producing advanced weapons. None of these games lasted more than half an hour, though, so we never got to see how AlphaStar handles the late game. But that is only because nobody could hold out against AlphaStar long enough to get there.
Today, DeepMind released videos of 10 previously unpublicized games played between AlphaStar and professional players over the past few months, then broadcast a live game between the latest version of AlphaStar and a top professional player. AlphaStar's first five games were against the professional player TLO. For those matches, DeepMind trained a series of AI agents over a week of real time (the equivalent of roughly 200 years of StarCraft experience each), each with a slightly different focus, and then chose the best agents to face the human.
After the five games against TLO, the DeepMind team retrained AlphaStar. After 14 days of real-time training, the winners that emerged from the league-style training environment had accumulated the equivalent of roughly 200 years of experience, and the difference was obvious. The AI no longer made glaring tactical mistakes. Its decisions still did not always make sense to human observers, but it was hard to point to any clear errors.
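The process described above, training a population of agents against one another and picking the strongest to face the human, can be sketched in miniature. This is a heavily simplified, hypothetical illustration of the selection idea only: the "agents" here are random-skill stand-ins rather than neural networks, and every name and number is made up.

```python
import random

def play_match(a, b):
    """Stand-in for a full StarCraft game: the higher-skill agent usually wins."""
    p_a_wins = a["skill"] / (a["skill"] + b["skill"])
    return a if random.random() < p_a_wins else b

def train_league(n_agents=10, n_matches=1000, seed=0):
    """Run a round-robin-style league and return the agent with the best record."""
    random.seed(seed)
    league = [{"name": f"agent_{i}", "skill": random.uniform(0.1, 1.0), "wins": 0}
              for i in range(n_agents)]
    for _ in range(n_matches):
        a, b = random.sample(league, 2)   # pick two distinct agents to play
        play_match(a, b)["wins"] += 1     # credit the winner
    # Select the strongest agent to play the exhibition match against the human.
    return max(league, key=lambda agent: agent["wins"])
```

In DeepMind's actual setup, each match is a full game of StarCraft II and each "skill" is a trained neural network that keeps learning from its league results; the sketch only captures the play-everyone-then-select loop.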
Then DeepMind went back to the drawing board. In those 10 games, the AI had a huge advantage that human players lack: it could see every visible part of the map at once, while humans must move the camera around to see it. So DeepMind trained a new version of AlphaStar that had to operate its own camera, again running the equivalent of roughly 200 years of training and selecting the best agent from self-play.
This new AlphaStar lost to MaNa in the live game that followed. It seemed severely hampered by the need to operate its own camera, and it showed few of the striking strategies earlier versions of AlphaStar had used. The defeat may have been a disappointing outcome for DeepMind, but this agent had trained for only seven days; given further training, it would likely become competitive again. DeepMind found that the camera-managing AI was only slightly weaker and was catching up.
The current AlphaStar model certainly has weaknesses. In fact, many of the flaws of the early AlphaStar agents are reminiscent of AlphaGo's early games. Early versions of AlphaGo usually won, but often made errors that humans could recognize. The DeepMind team kept improving the system, and today AlphaZero makes essentially no mistakes a human would notice.
Clearly, AlphaStar still has room to improve at StarCraft. Its advantage over humans comes largely from the fact that, as a computer, it is better at micromanagement. Its armies excel at flanking and trapping human forces, partly because it can direct troops in five places at once, something no human can do.
Few of the tactics AlphaStar used in these games are common in professional play, because the AI's success came not from outplaying humans within human limitations but from finding tactics suited to its own strengths. Although its per-minute actions and reaction time were technically within human range, it still seemed to hold an advantage thanks to its greater precision. A fairer approach might be to limit AlphaStar's capabilities further.
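One concrete way to "limit AlphaStar's capabilities further" would be a hard cap on actions within a sliding time window, so the agent cannot exceed a human-comparable actions-per-minute (APM) rate. This is a hypothetical sketch of that idea, not DeepMind's actual mechanism; the cap and window values are illustrative.

```python
from collections import deque

class ApmLimiter:
    """Allow at most `max_actions` actions per sliding window of `window_seconds`."""

    def __init__(self, max_actions=300, window_seconds=60.0):
        self.max_actions = max_actions
        self.window = window_seconds
        self.timestamps = deque()  # times of actions still inside the window

    def try_act(self, now):
        """Return True (and record the action) if acting at time `now` is allowed."""
        # Drop actions that have aged out of the sliding window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_actions:
            self.timestamps.append(now)
            return True
        return False
```

A cap like this addresses burst rate but not precision: an agent limited to 300 APM can still place every click perfectly, which is why precision, not raw speed, is the harder advantage to neutralize.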
Humans still hold advantages over even the best AI in many respects. MaNa, for example, adjusted his play based on AlphaStar's first five games, which may have given him an edge in the live match. AlphaStar cannot do that: we do not yet have good training methods that let an AI learn a great deal from a single game and apply those lessons in the next.