**EigenGame maps out a new approach to solve fundamental ML problems.**

Modern AI systems approach tasks like recognising objects in images and predicting the 3D structure of proteins as a diligent student would prepare for an exam. By training on many example problems, they minimise their mistakes over time until they achieve success. But this is a solitary endeavour and only one of the known forms of learning. Learning also takes place by interacting and playing with others. It’s rare that a single individual can solve extremely complex problems alone. By allowing problem solving to take on these game-like qualities, previous DeepMind efforts have trained AI agents to play Capture the Flag and achieve Grandmaster level at Starcraft. This made us wonder if such a perspective modeled on game theory could help solve other fundamental machine learning problems.

Today at ICLR 2021 (the International Conference on Learning Representations), we presented “EigenGame: PCA as a Nash Equilibrium,” which received an Outstanding Paper Award. Our research explored a new approach to an old problem: we reformulated principal component analysis (PCA), a type of eigenvalue problem, as a competitive multi-agent game we call EigenGame. PCA is typically formulated as an optimisation problem (or single-agent problem); however, we found that the multi-agent perspective allowed us to develop new insights and algorithms which make use of the latest computational resources. This enabled us to scale to massive data sets that previously would have been too computationally demanding, and offers an alternative approach for future exploration.

### PCA as a Nash equilibrium

First described in the early 1900s, PCA is a long-standing technique for making sense of the structure of high-dimensional data. This approach is now ubiquitous as a first step in the data-processing pipeline and makes it easy to cluster and visualise data. It can also be a useful tool for learning low-dimensional representations for regression and classification. More than a century later, there are still compelling reasons to study PCA.

Firstly, data was originally recorded by hand in paper notebooks, and now it is stored in data centres the size of warehouses. As a result, this familiar analysis has become a computational bottleneck. Researchers have explored randomised algorithms and other directions to improve how PCA scales, but we found that these approaches have difficulty scaling to massive datasets because they are unable to fully harness recent deep-learning-centric advances in computation — namely access to many parallel GPUs or TPUs.

Secondly, PCA shares a common solution with many important ML and engineering problems, namely the singular value decomposition (SVD). By approaching the PCA problem in the right way, our insights and algorithms apply more broadly across the branches of the ML tree.

Figure 1. With SVD at its roots, the tree of knowledge encompasses many fundamental ideas in machine learning including PCA, Least Squares, Spectral Clustering, Proto Value Functions, Latent Semantic Indexing, and Sorting.

As with any board game, in order to reinvent PCA as a game we need a set of rules and objectives for players to follow. There are many possible ways to design such a game; however, important ideas come from PCA itself: the optimal solution consists of *eigenvectors* which capture the important variance in the data and are orthogonal to each other.

Figure 2. Each player wants to align with a direction of maximum variance (larger data spread) but also remain perpendicular to players higher up in the hierarchy (all players with a lower number).

In EigenGame each player controls an eigenvector. Players increase their score by explaining variance within the data but are penalised if they’re too closely aligned to other players. We also establish a hierarchy: Player 1 only cares about maximising variance, whereas other players also have to worry about minimising their alignment with players above them in the hierarchy. This combination of rewards and penalties defines each player’s utility.

Figure 3. Summarising the utility of each player above.

With appropriately designed** ****Var**** **and

terms, we can show that:**Align**

- If all players play optimally, together they achieve the Nash equilibrium of the game, which is the PCA solution.
- This can be achieved if each player maximises their utility independently and simultaneously using gradient ascent.

Figure 4. EigenGame guides each player along the unit sphere from the empty circles to the arrows in parallel. Blue is player 1. Red is player 2. Green is player 3.

This independence property of simultaneous ascent is particularly important because it allows for the computation to be distributed across dozens of Google Cloud TPUs, enabling both data- and model-parallelism. This makes it possible for our algorithm to adapt to truly large-scale data. EigenGame finds the principal components in a matter of hours for hundred-terabyte datasets comprising millions of features or billions of rows.

Figure 5. Each coloured square is a separate device. (L) Each player lives and computes updates on a single device. (R) Each player is copied to multiple devices and computes updates using independent batches of data; the different updates are then averaged to form a more robust update direction.

### Utilities, updates, and everything in between

By thinking about PCA from a multi-agent perspective, we were able to propose scalable algorithms and novel analyses. We also uncovered a surprising connection to Hebbian Learning — or, how neurons adapt when learning. In EigenGame, each player maximising their utilities gives rise to update equations that are similar to update rules derived from Hebbian models of synaptic plasticity in the brain. Hebbian updates are known to converge to the PCA solution but are not derived as the gradient of any utility function. Game theory gives us a fresh lens to view Hebbian learning, and also suggests a continuum of approaches to machine learning problems.

On one end of the ML continuum is the well-developed path of proposing an objective function that can be optimised: Using the theory of convex and non-convex optimisation, researchers can reason about the global properties of the solution. On the other end, pure connectionist methods and update rules inspired by neuroscience are specified directly, but analysis of the entire system can be more difficult, often invoking the study of complicated dynamical systems.

Game theoretic approaches like EigenGame sit somewhere in between. Player updates are not constrained to be the gradient of a function, only a best response to the current strategies of the other players. We’re free to design utilities and updates with desirable properties — for example, specifying updates which are unbiased or accelerated — while ensuring the Nash property still allows us to analyse the system as a whole.

Figure 6: Allowing multiple utilities bridges the gap between optimisation approaches and dynamical systems.

EigenGame represents a concrete example of designing the solution to a machine learning problem as the output of a large multi-agent system. More generally, designing machine learning problems as multi-agent games is a challenging *mechanism design* problem; however, researchers have already used the class of two-player, zero-sum games to solve machine learning problems. Most notably, the success of generative adversarial networks (GANs) as an approach to generative modelling has driven interest in the relationship between game theory and machine learning.

EigenGame moves beyond this to the more complex many-player, general-sum setting. This enables more obvious parallelism for greater scale and speed. It also presents a quantitative benchmark for the community to test novel multi-agent algorithms alongside richer domains, such as Diplomacy and Soccer.

We hope our blueprint for designing utilities and updates will encourage others to explore this direction for designing new algorithms, agents, and systems. We’re looking forward to seeing what other problems can be formulated as games and whether the insights we glean will further improve our understanding of the multi-agent nature of intelligence.

For more details see our paper EigenGame: PCA as a Nash Equilibrium and our follow-up work EigenGame Unloaded: When playing games is better than optimising. This blog post is based on joint work with Thore Graepel, a research group lead at DeepMind and Chair of Machine Learning at University College London.

We would like to thank Rob Fergus for their technical feedback on this post as well as Sean Carlson, Jon Fildes, Dominic Barlow, Mario Pinto, and Emma Yousif for pulling this all together.

Custom figures by Jim Kynvin and Adam Cain.