Go's complexity is too high for straightforward data-driven learning approaches
Combination of supervised learning from human expert games and reinforcement learning from self-play, used to train value networks that assess board positions and policy networks that restrict the space of candidate moves
Specialized AI: learned skills are not transferable to other contexts
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves.
A new approach to computer Go introduces value networks to evaluate board positions and policy networks to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games and reinforcement learning from games of self-play. A new search algorithm is also introduced that combines Monte Carlo simulation with the value and policy networks. Using this search algorithm, the computer program AlphaGo, developed by Google DeepMind, achieved a 99.8% winning rate against other Go programs.
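To make the idea of a search that combines Monte Carlo simulation with value and policy networks more concrete, here is a minimal sketch in Python. It is not AlphaGo's implementation. It assumes a toy stone-taking game in place of Go, and the names policy_net, value_net, rollout, search, c_puct and mix are hypothetical placeholders. The two "networks" are trivial stand-in functions, and the selection rule and the blending of value estimate with simulation outcome are simplified assumptions chosen for illustration.

```python
import math
import random

# Toy game (an assumption for illustration, not Go): players alternately
# remove 1-3 stones from a pile; whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def policy_net(stones):
    """Hypothetical stand-in for a policy network: uniform move priors."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}

def value_net(stones):
    """Hypothetical stand-in for a value network: value in [-1, 1] for the
    player to move. Returns an uninformative 0.0 here."""
    return 0.0

def rollout(stones, rng):
    """Monte Carlo simulation: random play until the pile is empty.
    Returns +1 if the player to move at the start wins, else -1."""
    sign = 1
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return sign  # the player who just moved took the last stone
        sign = -sign

class Node:
    def __init__(self, stones, prior):
        self.stones = stones
        self.prior = prior      # prior probability from the policy
        self.visits = 0
        self.value_sum = 0.0    # from the view of the player who moved here
        self.children = {}

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def search(root_stones, simulations=2000, c_puct=1.5, mix=0.5, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones, prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: follow mean value plus a prior-weighted exploration bonus.
        while node.children:
            total = node.visits
            node = max(
                node.children.values(),
                key=lambda ch: ch.q()
                + c_puct * ch.prior * math.sqrt(total) / (1 + ch.visits),
            )
            path.append(node)
        if node.stones == 0:
            value = -1.0  # the player to move has already lost
        else:
            # Expansion: create children with priors from the policy.
            for move, p in policy_net(node.stones).items():
                node.children[move] = Node(node.stones - move, p)
            # Evaluation: blend the value estimate with one simulation outcome.
            value = (1 - mix) * value_net(node.stones) \
                + mix * rollout(node.stones, rng)
        # Backup: flip the sign at each level, since the players alternate.
        for n in reversed(path):
            value = -value
            n.visits += 1
            n.value_sum += value
    # Play the most-visited move at the root.
    return max(root.children, key=lambda m: root.children[m].visits)
```

Calling search(5) returns the number of stones to remove from a pile of five. In a real system the two stand-in functions would be replaced by trained networks; here they only show where the policy prior narrows the search and where the value estimate and the simulation result are combined.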