The problem with this approach is that most real-world situations, and even some games, don’t have a simple set of rules governing how they operate. So some researchers have tried to get around the problem with an approach that models how a particular game or environment will affect an outcome and then uses that knowledge to make a plan. The drawback of this system is that some domains are so complex that modeling every aspect is nearly impossible. This has proven to be the case with most Atari games, for instance.
In a way, MuZero combines the best of both worlds. Rather than modeling everything, it only attempts to consider those factors that are important to making a decision. As DeepMind points out, this is something you do as a human being. When most people look out the window and see dark clouds forming on the horizon, they generally don’t get caught up thinking about things like condensation and pressure fronts. They instead think about how they should dress to stay dry if they go outside. MuZero does something similar.
MuZero takes three factors into account when it has to make a decision: the outcome of its previous decision, the current position it finds itself in and the best course of action to take next. That seemingly simple approach makes MuZero the most effective algorithm DeepMind has made to date. In its testing, DeepMind found MuZero was as good as AlphaZero at chess, Go and shogi, and better than all of the company’s previous algorithms, including Agent57, at Atari games. It also found that the more time MuZero had to consider an action, the better it performed. DeepMind also ran tests in which it limited the number of simulations MuZero could complete before committing to a move in Ms Pac-Man. Even then, it found MuZero was still able to achieve good results.
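To make those three factors a little more concrete, here is a toy sketch in Python. It is not DeepMind’s implementation. MuZero learns its estimates with neural networks and plans with a tree search, whereas everything below is a hand-written, hypothetical stand-in: the number-line world, the `GOAL` and `DISCOUNT` constants, and the `dynamics`, `prediction` and `plan` functions are all assumptions made for illustration. The sketch only shows how a reward for the last move, a value for the current position and a policy hint about the next move can be combined under a limited simulation budget.

```python
from typing import Dict, Tuple

State = int
Action = int

GOAL = 5        # hypothetical target position on a number line
DISCOUNT = 0.9  # hypothetical discount applied to future value


def dynamics(state: State, action: Action) -> Tuple[State, float]:
    """Stand-in for a learned model: returns the next state and the
    reward for the move just made (the outcome of the last decision)."""
    next_state = state + action
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def prediction(state: State) -> Tuple[Dict[Action, float], float]:
    """Stand-in for learned estimates: a policy (which move looks best)
    and a value (how good the current position is)."""
    if state < GOAL:
        policy = {+1: 0.7, -1: 0.3}
    elif state > GOAL:
        policy = {+1: 0.3, -1: 0.7}
    else:
        policy = {+1: 0.5, -1: 0.5}
    value = DISCOUNT ** abs(GOAL - state)
    return policy, value


def plan(state: State, num_simulations: int) -> Action:
    """One-step lookahead with a budget: simulate only the moves the
    policy rates highest, then score each as reward + discounted value."""
    policy, _ = prediction(state)
    ranked = sorted(policy, key=policy.get, reverse=True)
    candidates = ranked[:max(1, num_simulations)]

    scores: Dict[Action, float] = {}
    for action in candidates:
        next_state, reward = dynamics(state, action)
        _, value = prediction(next_state)
        scores[action] = reward + DISCOUNT * value
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # With a budget of one simulation the planner leans on the policy alone;
    # with two it can compare both moves before committing.
    print(plan(3, num_simulations=1))
    print(plan(3, num_simulations=2))
```

In this toy, the policy decides where a small budget gets spent, which is one plausible reading of why a planner might still do reasonably well when its simulations are capped. A larger budget simply lets it check more options before it commits.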
Putting up high scores in Atari games is all well and good, but what about the practical applications of DeepMind’s latest research? In a word, they could be groundbreaking. While we’re not there yet, MuZero is the closest researchers have come to developing a general-purpose algorithm. The subsidiary says MuZero’s learning capabilities could one day help it tackle complex problems in fields like robotics where there aren’t straightforward rules.