AlphaZero, DeepMind's AI, can quickly master any game such as chess, Go, and shogi without human assistance

OpenAI's algorithms had already managed to beat a team of five of the strongest amateur players at the strategy game Dota 2, which was itself a significant feat. But the artificial intelligence industry does not intend to stop there. Last July, DeepMind, Google's UK-based subsidiary, designed algorithms that proved able to beat human teams at Quake III. The method used to train these algorithms is the one that is becoming the standard in the field: reinforcement learning. It consists of having the algorithm run an enormous number of trials and letting it draw its own conclusions about what to do and what not to do.

In 2016, DeepMind introduced AlphaGo, the first artificial intelligence (AI) system able to beat a human champion at Go, a board game that originated in China. Nearly three years after that indisputable feat, DeepMind made the headlines of the journal Science with AlphaZero, its game-playing AI. Recall that in December 2017, Google's subsidiary had already drawn attention with an AI program called AlphaZero, meant to bring it a little closer to its ultimate goal: creating a highly versatile and autonomous artificial intelligence able to learn and to solve complex problems on its own. AlphaZero was announced as the successor of AlphaGo Zero, the Google subsidiary's first self-taught AI, presented in October 2017. AlphaGo Zero, for its part, represents the ultimate evolution of the AlphaGo program, which defeated all the great masters of the game of Go.
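To make the trial-and-error principle concrete, here is a minimal, purely illustrative sketch (a toy example, not DeepMind's code): tabular Q-learning on a tiny five-cell corridor, where the agent starts knowing nothing and learns only from the outcomes of repeated trials.

```python
import random

random.seed(0)

# Toy environment: a 5-cell corridor; only reaching the rightmost cell
# gives a reward. The agent learns purely from repeated trials.
N_STATES = 5
ACTIONS = [-1, +1]          # move left or move right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Mostly follow the best known action, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda x: q[(s, x)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Update the value estimate from this trial's observed outcome.
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
        s = s2

# After training, the learned policy moves right in every cell.
print([max(ACTIONS, key=lambda x: q[(s, x)]) for s in range(N_STATES - 1)])
```

The agent is never told that "right" is good; the preference emerges solely from the rewards its own trials produce, which is the essence of reinforcement learning.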
For comparison, after an intense training phase, it took AlphaGo Zero only three days to surpass its predecessor, AlphaGo. It should also be noted that while AlphaGo exploited both supervised learning and reinforcement learning, AlphaGo Zero uses reinforcement learning alone. In addition, since the creation of AlphaGo Zero, the AIs developed by DeepMind use a single unified deep neural network that merges the decision-making (policy) network with the separate value network found in previous versions, in order to obtain more powerful AIs. Like AlphaGo Zero, the AI program named AlphaZero has the particularity of evolving and improving through tabula rasa reinforcement learning. With this method, the program is only given the basic rules of the game (the way the pieces move on the board, for example), but no established data about applicable strategies or tactics. As a result, the program is forced to improve on its own by playing against itself at a relentless pace. In other words, AlphaZero was not designed to be especially skilled at one specific task, but rather to behave like a first draft of a more generalist version of its predecessors. Unlike its predecessor AlphaGo, AlphaZero is not limited to the game of Go: the same algorithm is also a champion at chess and at shogi, a game very close to chess that is very popular in Japan. The researchers pitted fully trained instances of AlphaZero against Stockfish, Elmo, and the previous version of AlphaGo Zero at chess, shogi, and Go respectively. Each program was run on the hardware for which it was designed. This time, AlphaZero did not have to face humans, but other programs, all considered superior to the best human champions: Stockfish at chess, Elmo at shogi, and AlphaGo Zero (a program designed by DeepMind in 2017) at Go. AlphaZero defeated AlphaGo Zero, winning 61% of the matches.
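The "unified network" idea can be sketched in a few lines. The following is an illustrative toy (the layer sizes and names are assumptions, not DeepMind's architecture): a single shared body feeds two heads, a policy head that outputs move probabilities and a value head that outputs an expected game outcome.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only: a flattened board, one hidden layer, one
# output per candidate move.
BOARD_CELLS, HIDDEN, MOVES = 64, 128, 64

W_body = rng.standard_normal((BOARD_CELLS, HIDDEN)) * 0.01
W_policy = rng.standard_normal((HIDDEN, MOVES)) * 0.01
W_value = rng.standard_normal((HIDDEN, 1)) * 0.01

def forward(board):
    """board: flat array of cell encodings -> (move probabilities, value)."""
    h = np.tanh(board @ W_body)            # shared representation
    logits = h @ W_policy
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                   # softmax: policy head
    value = np.tanh(h @ W_value)[0]        # scalar in [-1, 1]: value head
    return probs, value

probs, value = forward(rng.standard_normal(BOARD_CELLS))
print(probs.shape, round(float(probs.sum()), 6))
```

The point of merging the two heads is that both predictions are computed from the same learned features of the position, so improving one tends to improve the other.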
This shows that a general approach can match the performance of an algorithm that exploits the symmetries of the board to generate eight times more data, the researchers note. AlphaZero won decisively in every case, including at chess, with 155 wins and only 6 losses over 1,000 games (the rest ended in a draw). At shogi, AlphaZero beat Elmo, winning 98.2% of the games when playing black and 91.2% overall.
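The "eight times more data" remark refers to the eight symmetries of a square board: four rotations, each optionally mirrored. A short sketch of how one self-play position can be turned into eight training examples (AlphaZero's general approach notably does not rely on this trick):

```python
import numpy as np

def symmetries(board):
    """Return the 8 orientations of a square board: 4 rotations x mirror."""
    out = []
    for k in range(4):
        r = np.rot90(board, k)
        out.append(r)
        out.append(np.fliplr(r))
    return out

# A tiny 3x3 "position" with distinct cells, so every orientation differs.
board = np.arange(9).reshape(3, 3)
augmented = symmetries(board)
print(len(augmented))
```

For Go this augmentation is sound because the rules are symmetric; for chess and shogi it is not (pawns only move forward), which is one reason a symmetry-free general method matters.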
The research, published in the journal Science, was conducted by a team led by David Silver of DeepMind. The paper is accompanied by a commentary by Murray Campbell, an artificial intelligence researcher at the IBM Thomas J. Watson Research Center in Yorktown Heights, New York. AlphaZero searches only 60,000 positions per second at chess and shogi, compared with 60 million for Stockfish and 25 million for Elmo. AlphaZero compensates for this low number of evaluations by using its deep neural network to focus much more selectively on the most promising variations. To illustrate the potential of its AI, DeepMind had stated in 2017 that, starting from nothing but the basic rules of chess, AlphaZero needed only four hours of training and 44 million games to get the better of Stockfish, one of the best chess engines currently available. Likewise, two hours and 24 million games were enough for AlphaZero to defeat Elmo, the best program at shogi (a chess variant played in Japan); and it took only eight hours of training and 21 million games against itself to defeat AlphaGo Lee, the first AI to have dominated a human Go player. The most impressive aspect of AlphaZero is that, unlike previous generations of game-playing software, it was neither hand-programmed nor trained on data from games played by humans. The algorithm started from the rules of the game alone, playing hundreds of thousands of games against itself. This training phase, using a technique called reinforcement learning, mobilized 5,000 processors for nine hours for chess, twelve hours for shogi, and thirteen days for Go. In the end, not only is the machine better than the best programs in the world, but its playing style is entirely unprecedented. AlphaZero plays in an extremely innovative way, neither like a human nor like a machine, with a very dynamic game strategy, DeepMind's founder and CEO Demis Hassabis told Les Echos.
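How can 60,000 evaluations per second beat 60 million? Because the network's move probabilities steer the tree search toward promising branches. A minimal sketch of a PUCT-style selection rule of the kind AlphaZero's search uses (the field names, constants, and numbers below are illustrative assumptions, not values from the paper):

```python
import math

def select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where U is driven by the
    network's prior P and shrinks as the child accumulates visits N."""
    total_visits = sum(ch["N"] for ch in children)
    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] else 0.0              # mean value
        u = c_puct * ch["P"] * math.sqrt(total_visits + 1) / (1 + ch["N"])
        return q + u
    return max(range(len(children)), key=lambda i: score(children[i]))

# An unexplored move with a strong network prior outranks a mediocre,
# heavily visited one, so search effort flows to promising variations.
children = [
    {"P": 0.70, "N": 0,  "W": 0.0},   # promising according to the network
    {"P": 0.20, "N": 50, "W": 10.0},  # explored, mean value 0.2
    {"P": 0.10, "N": 5,  "W": -1.0},  # explored, losing
]
print(select(children))  # -> 0
```

As visit counts grow, the exploration term U fades and the empirical value Q takes over, so the search spends almost all of its limited evaluations on the handful of lines that matter.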
This work effectively closes a chapter of artificial intelligence research spanning several decades, writes Campbell, a member of the team that designed IBM's Deep Blue, which defeated Garry Kasparov, then world chess champion, in 1997. Artificial intelligence researchers must now turn to a new generation of games to meet the next challenges, he adds.

Source: Science
