Mean payoff game

In game theory, a mean payoff game is a zero-sum game played on the vertices of a weighted directed graph. The game is played as follows: at the start of the game, a token is placed on one of the vertices of the graph. Each vertex is assigned to either the Maximizer of the Minimizer. The player that controls the current vertex the token is on, may choose one outgoing edge along which the token moves next. In doing so, the Minimizer pays the maximizer the number that is on the edge. Then, again, the player controlling the next vertex the token gets can choose where it goes, and this continues indefinitely. The objective for the Maximizer is to maximize their long term average payoff, and the Minimizer has the opposite objective.

Formal definition

A mean payoff game consists of a graph $G=(V_{Max}\cup V_{Min},E)$ , and a function $w:E\to \mathbb {R}$ where $V=V_{Max}\cup V_{Min}$ is the set of vertices, which are partitioned between the players, and where $w(e)$ is the weight of an edge. Often, the graph is assumed to be sinkless, which means that every vertex has at least one outgoing edge. A play is a possible outcome of the game, which is an inifinite walk on the graph, we could write this as a sequence of edges: $\pi =e_{1},e_{2},e_{3},\ldots$ where the head of $e_{i}$ equals the tail of $e_{i+1}$ . The objective value of the game can then be written as follows:

$O(\pi )=\liminf _{t\to \infty }{\frac {1}{t}}\sum _{i=1}^{t}w(e_{i})$

A strategy for the Maximizer is a function $\sigma :FW_{Max}\to E$ , where $FW_{v}$ is the set of finite walks that start at the initial vertex and end at some vertex $v\in V_{Max}$ , which returns an outgoing edge of the end vertex $v$ . A strategy $\tau$ for the Minimizer can be defined analogously. If both players fix a strategy, say they pick strategies $\sigma$ and $\tau$ , then the outcome of the game is fixed, and the resulting play is the path $\pi _{\sigma \tau }$ .

One of the fundamental results for mean payoff games is that they are positionally determined.^[1] This means in our case that the game has a unique value, and that each player has a strategy that can attain the value, and that strategy is positional, e.g. it only depends on the current vertex the token is on. In formulas, the following equation holds for the value $V(G,w)$ :

$V(G,w)=\max _{\sigma \in {\text{(positional Max strategies)}}}\inf _{\tau \in {\text{(Min strategies)}}}O(\pi _{\sigma \tau })=\min _{\tau \in {\text{(positional Min strategies)}}}\sup _{\sigma \in {\text{(Max strategies)}}}O(\pi _{\sigma \tau })$

Solving mean payoff games

Solving a mean payoff game can mean several things, although in practice finding one often also yields the other:

Determining the value of one or all starting vertices
Determining the optimal strategy for one or both players
Determining the set of starting vertices from which the Maximizer can guarantee a value of at least 0. Doing so is called finding the zero-mean partition (and is also related to solving energy games)

It is a major open problem in computer science whether there exists a polynomial time algorithm for solving any of the above problems. These problems are one of the few to be contained in both the classes NP and coNP^[2] but not known to be in P. Currently, the fastest algorithm is a randomized strategy improvement algorithm, which runs in time $O(2^{\sqrt {n\log(n)}})$ ,^[3] where $n$ is the number of Maximizer vertices. The best deterministic algorithms run in time $m2^{n/2}$ ,^[4] where now $m$ is the number of edges and $n$ the total number of vertices.

Three of the most well-known algorithms for solving mean payoff games are the following (each of which has their own slight variants):

The GKK algorithm.^[5] Its main idea is to add a potential to every vertex in the graph, and slowly increase the potential on some nodes until we find the zero-mean partition. This also gives us the energy values in an energy game.
Value iteration.^[6] Its main idea is to give each vertex a value, and update the values via fixed point iteration. When the fixed point is reached, the zero-mean partition and energy values can be found.
Strategy improvement.^[3] Its main idea is to start with an arbitrary Maximizer strategy, and assign it a valuation. Then, by repeatedly making changes to the strategy that improve its valuation, we end up with a strategy for the Maximizer than guarantees nonzero payoff wherever it is possible.

References

^ Ehrenfeucht, A.; Mycielski, J. (June 1979). "Positional strategies for mean payoff games". International Journal of Game Theory. 8 (2): 109–113. doi:10.1007/BF01768705. ISSN 0020-7276.
^ Zwick, Uri; Paterson, Mike (1996-05-20). "The complexity of mean payoff games on graphs". Theoretical Computer Science. 158 (1): 343–359. doi:10.1016/0304-3975(95)00188-3. ISSN 0304-3975.
^ ^a ^b Björklund, Henrik; Vorobyov, Sergei (2007-01-15). "A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games". Discrete Applied Mathematics. 29th Symposium on Mathematical Foundations of Computer Science MFCS 2004. 155 (2): 210–229. doi:10.1016/j.dam.2006.04.029. ISSN 0166-218X.
^ Ohlmann, Pierre (2022), Kulikov, Alexander S.; Raskhodnikova, Sofya (eds.), "The GKK Algorithm is the Fastest over Simple Mean-Payoff Games", Computer Science – Theory and Applications, Lecture Notes in Computer Science, vol. 13296, Cham: Springer International Publishing, pp. 269–288, doi:10.1007/978-3-031-09574-0_17, ISBN 978-3-031-09573-3, retrieved 2024-08-20
^ Gurvich, V. A.; Karzanov, A. V.; Khachivan, L. G. (1988-01-01). "Cyclic games and an algorithm to find minimax cycle means in directed graphs". USSR Computational Mathematics and Mathematical Physics. 28 (5): 85–91. doi:10.1016/0041-5553(88)90012-2. ISSN 0041-5553.
^ Brim, L.; Chaloupka, J.; Doyen, L.; Gentilini, R.; Raskin, J. F. (April 2011). "Faster algorithms for mean-payoff games". Formal Methods in System Design. 38 (2): 97–118. doi:10.1007/s10703-010-0105-x. ISSN 0925-9856.
^ Puri, Anuj (1995). Theory of Hybrid Systems and Discrete Event Systems. EECS Department, University of California, Berkeley.
^ Allamigeon, Xavier; Benchimol, Pascal; Gaubert, Stéphane; Joswig, Michael (January 2014). "Combinatorial Simplex Algorithms Can Solve Mean Payoff Games". SIAM Journal on Optimization. 24 (4): 2096–2117. arXiv:1309.5925. doi:10.1137/140953800. ISSN 1052-6234.

Mean payoff game

Formal definition

Solving mean payoff games

Related games and problems

References

Further reading