Rationality and Randomness (extended abstract)

Francesco Pasquale
Università di Roma “Tor Vergata”, Roma, Italy
pasquale@mat.uniroma2.it

Imagine you are sitting in a room with a large number of people attending a conference, and the organizers give you and all the other people in the room a small piece of paper, asking each person to write down a number between 0 and 100. All the numbers will then be collected, and a prize will be given to the person who has written the number closest to half of the average of all the numbers. What number would you write on your piece of paper?

Game theory [16] provides a powerful and elegant framework for predicting the outcome of situations like the one described above. Intuitively speaking, in order to choose a number to write on her piece of paper, a person in the room could reason as follows: Since the numbers are constrained to be between 0 and 100, their average will be a number between 0 and 100 as well, so half of the average will be a number between 0 and 50; hence, if I am rational, I will certainly not write a number larger than 50. Now, if everyone in the room is rational and writes a number not larger than 50, then the average itself will be not larger than 50, and half of the average will be not larger than 25; so, if I am rational and I believe that all the other people in the room are rational, I will not write a number larger than 25. If everyone writes a number not larger than 25, . . .

A few more steps of reasoning along these lines and we have the game-theoretic prediction of the outcome: Every person in the room writes 0 on her piece of paper. This is indeed the unique Nash equilibrium [14] of the game. However, if you try it out in any real scenario, you will see for yourself how far from the truth such a prediction is.
The main reason is that it relies on the very strong assumption that rationality is common knowledge [6] among the agents: everyone is rational, everyone knows that everyone is rational, everyone knows that everyone knows that everyone is rational, and so forth.

Bounded rationality and the Logit dynamics

Several branches of game theory try to relax the rationality assumption in order to obtain more reliable predictions. For example, evolutionary game theory [15] applies the game-theoretic framework to repeated games in biological contexts and introduces new solution concepts (evolutionarily stable strategies), while behavioral game theory [9] takes a more experimental approach: observe the behavior of real players and extend the theory appropriately to explain the observed outcomes.

A simple and elegant way to model players with limited rationality in repeated games is the Logit dynamics, introduced in [8] and inspired by statistical mechanics: Starting from some initial strategy profile, at each round a player is selected uniformly at random, and she updates her current strategy according to a probability distribution biased toward strategies promising higher payoffs. More formally, let n be the number of players; for i = 1, . . . , n let S_i be the set of strategies of player i and let u_i : S_1 × · · · × S_n → R be her utility function. If x = (x_1, . . . , x_n) ∈ S_1 × · · · × S_n is the current strategy profile and player i is selected for the update, then she plays strategy y ∈ S_i with probability proportional to e^{β u_i(x_{-i}, y)}, where (x_{-i}, y) is the vector obtained from x by replacing the i-th entry x_i with y, and β ≥ 0 is a parameter measuring the rationality level of the players. Indeed, it is easy to see that for β = 0 players pick one of their strategies uniformly at random, i.e., they have no rationality at all, while for β → ∞ players tend to best respond to the current strategies of the other players, i.e., they are fully rational.

The Logit dynamics defines an ergodic Markov chain over the space of strategy profiles (more precisely, a family of ergodic Markov chains indexed by β), whose unique stationary distribution π_β we proposed [5] as an equilibrium solution concept for the game. The “prediction” of such a solution concept is that, in the long run, for any subset A ⊆ S_1 × · · · × S_n of the state space, the fraction of time that the system spends in A is given by its stationary probability π_β(A). This is certainly a rougher prediction than that of a Nash equilibrium, but it is likely much more realistic in several cases (just as a statement like “The next economic crisis will happen in 2059” cannot be considered a serious prediction today, while a statement like “We should expect at least two global economic crises per century” is certainly weaker but also much more credible, while still carrying some non-negligible information). Once we model an evolving system of agents as an ergodic Markov chain, the unique stationary distribution of the chain becomes a powerful solution concept for the system, since it allows us to make predictions on its long-run behavior regardless of the starting state.
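As an illustration, the single-round update rule can be sketched in a few lines of Python. This is a hypothetical helper written for this abstract (not code from the cited papers); `strategy_sets` and `utility` stand for the S_i and u_i above:

```python
import math
import random

def logit_probs(payoffs, beta):
    """Probability of each candidate strategy under the logit rule:
    proportional to exp(beta * payoff)."""
    m = max(payoffs)  # subtract the max payoff for numerical stability
    weights = [math.exp(beta * (p - m)) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def logit_round(profile, strategy_sets, utility, beta, rng=random):
    """One round of the Logit dynamics: a uniformly random player i
    resamples her strategy y with probability ~ exp(beta * u_i(x_-i, y))."""
    i = rng.randrange(len(profile))
    candidates = strategy_sets[i]
    payoffs = [utility(i, profile[:i] + (y,) + profile[i + 1:])
               for y in candidates]
    y = rng.choices(candidates, weights=logit_probs(payoffs, beta))[0]
    return profile[:i] + (y,) + profile[i + 1:]
```

For β = 0 the weights are all equal (uniform choice), while for large β virtually all the mass sits on the best responses, matching the two extremes discussed above.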
However, the strength of this solution concept is also its weakness: The predictive appeal of the stationary distribution essentially vanishes if the mixing time [13] (the time it takes the chain to get close to its stationary distribution, starting from an arbitrary state) is too long with respect to the time-scale we are interested in. In [2, 3] we classified some classes of strategic games with respect to the mixing time of their Logit dynamics, distinguishing between polynomial and exponential (in the number of players) mixing. For the games whose Logit dynamics has exponential mixing time, in order to be able to make predictions at polynomial time-scales, in [4] we introduced a generalization of the concept of stationary distribution of a Markov chain, which we called metastable probability distribution.

Metastability is a word that can have slightly different meanings in different scientific disciplines. Nevertheless, all the meanings are somehow related to evolving systems hanging around “persistent” configurations that are out of their main equilibrium. Our definition in [4] is no exception: Roughly speaking, we define a probability distribution µ to be metastable for a Markov chain with transition matrix P if µP is close to µ (with respect to an appropriate distance function). Among all possible metastable distributions, the interesting ones are those quickly reached from some subset of the state space. As a simple example, consider the following process: Start with a sequence of n bits (x_1, . . . , x_n) ∈ {0, 1}^n; at each round pick an index i ∈ [n] uniformly at random and replace x_i with either 0 or 1, each with probability 1/2; stop when you reach either the all-zeros sequence 0 = (0, . . . , 0) or the all-ones sequence 1 = (1, . . . , 1). This is a lazy random walk on the hypercube (see, e.g., [10]) modified by making 0 and 1 absorbing states.
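A quick way to get a feeling for this process is to simulate it. The sketch below is illustrative code written for this abstract (not from the cited papers):

```python
import random

def absorbing_walk(x, rounds, rng=random):
    """Lazy random walk on {0,1}^n with the all-zeros and all-ones
    sequences made absorbing: each round, rewrite a uniformly random
    coordinate with a fresh uniform random bit, stopping early if an
    absorbing state is reached."""
    x = list(x)
    n = len(x)
    for _ in range(rounds):
        if sum(x) in (0, n):
            break  # absorbed at 0...0 or 1...1
        i = rng.randrange(n)
        x[i] = rng.randrange(2)
    return x
```

Running it from, say, a balanced string with n = 50 for a few multiples of n log n rounds, one typically finds the walk still unabsorbed, with a number of ones hovering around n/2, exactly as the uniform (metastable) distribution would predict.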
Clearly, starting from any state x ∈ {0, 1}^n, the process will eventually end up in one of the two absorbing states. However, for every initial state x ≠ 0, 1, the process will run for an exponential number of rounds, in expectation, before being absorbed; can’t we say anything about the distribution xP^t at a shorter time-scale? Yes, indeed: It is easy to see that, after t = Θ(n log n) rounds, xP^t gets close to the uniform distribution U over {0, 1}^n, and it stays close to U for an exponential number of rounds.

In [4] we introduced and studied metastable probability distributions as solution concepts for strategic games. However, the usefulness of looking at the “metastable phase” of evolving systems goes beyond the domains of game theory and Markov chains. In the next section we briefly describe an example of a simple dynamics whose metastable-phase analysis allowed us to solve an intriguing computational task in a distributed way [7].

Community detection via averaging

Consider the following simple rules, executed synchronously, in discrete rounds, by the nodes of an undirected graph: Each node initially holds a value (say, a rational number) and then, at each round, it updates its value to the average of the values held by its neighbors. It comes as no surprise that, under mild assumptions on the graph (namely, that it is connected and non-bipartite), all nodes converge to the same value in the long run. Maybe it is much less obvious that, in its metastable regime, this simple dynamics can be used to efficiently solve the community detection problem [12, 11, 1] in a fully distributed way. Let us consider a graph formed by two good expanders connected by a sparse cut.
For the sake of concreteness and simplicity, let us assume we have a “very regular” graph G = (V, E) with an even number n of nodes, partitioned into two blocks of equal size, V_1 and V_2, such that each node has exactly a neighbors in its own block and exactly b neighbors in the other block, for some positive integers a and b with a ≫ b. Let µ_1 and µ_2 be the averages of the initial values of the nodes in V_1 and V_2, respectively. Intuitively speaking, by running the averaging dynamics on such a graph the following happens: In a first phase, the values of all nodes in V_1 quickly converge to a value close to µ_1, while those of the nodes in V_2 converge to a value close to µ_2. After that initial phase, the system enters a metastable regime in which all values slowly converge toward the global average (µ_1 + µ_2)/2; if the two averages µ_1 and µ_2 are different, say µ_1 < µ_2, then in this second phase the value of each node in V_1 increases at every round, while the value of each node in V_2 decreases. Thus, if we augment the averaging dynamics with a “clustering criterion” asking each node to raise, at each round, a blue flag or a red flag depending on whether its value has increased or decreased since the last round, then as soon as the system enters the metastable regime all nodes in one of the blocks raise the red flag and all nodes in the other block raise the blue one. We thus have a fully distributed protocol that allows the nodes to perform a perfect reconstruction of the two clusters of the graph.

To make the intuitions in the above paragraph a bit more concrete, we can use some linear algebra: Let x^(t) = (x_u^(t), u ∈ V) be the vector where x_u^(t) is the value of node u at round t. It is easy to see that the averaging dynamics is defined by the recursion x^(t+1) = P x^(t), where P = A/(a + b) and A is the adjacency matrix of the graph. The vector of values at round t is thus x^(t) = P^t x^(0). Since P is a symmetric matrix with real entries, all its eigenvalues are real.
Since P is stochastic (the entries of each row sum to 1), all its eigenvalues lie between −1 and 1, and the vector 1 = (1, u ∈ V), i.e., the vector with all entries equal to 1, is an eigenvector with eigenvalue 1. Moreover, for our “very regular” graph G, the partition indicator vector χ = (1, u ∈ V_1; −1, u ∈ V_2), taking value +1 on all the nodes of one block and −1 on all the nodes of the other block, is also an eigenvector, with eigenvalue (a − b)/(a + b). Under the assumption a ≫ b, which formalizes the fact that the two blocks are internally well-connected with respect to the cut between them, (a − b)/(a + b) is the second-largest eigenvalue λ_2 of P. The vector of values at round t can thus be written as

x^(t) = α_1 1 + α_2 λ_2^t χ + e^(t),

where α_1 and α_2 are suitable coefficients depending only on the initial values (namely, α_1 is the global average (µ_1 + µ_2)/2 of the initial values and α_2 = (µ_1 − µ_2)/2 is half the difference of the initial averages), and e^(t) is a vector orthogonal to 1 and χ. When we consider the difference between the vectors in two consecutive rounds, the component in the direction of 1 cancels, and what is left is

x^(t−1) − x^(t) = α_2 λ_2^(t−1) (1 − λ_2) χ + e^(t−1) − e^(t).

Since the vectors e^(t), for t = 0, 1, . . . , are orthogonal to 1 and χ, the norm of e^(t) goes to zero at least as fast as λ_3^t, where λ_3 is the third-largest eigenvalue of P in absolute value. Hence, if the coefficient α_2 is non-zero, after a number of rounds depending on 1/(λ_2 − λ_3) it holds that |α_2 λ_2^(t−1) (1 − λ_2)| > |e^(t−1)(u) − e^(t)(u)| for all nodes u. This implies that, for each node u, the sign of x_u^(t−1) − x_u^(t) is equal to the sign of α_2 χ_u: In other words, it is positive for all the nodes in one of the blocks and negative for all the nodes in the other block.
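The whole argument can be checked on a toy instance of the “very regular” graph. The sketch below is illustrative code written for this abstract, under the simplifying assumption that each block is a complete graph (so a = n/2 − 1 plays the role of a good expander) and the b cross edges are laid out cyclically; the helper names are hypothetical:

```python
import random

def two_block_graph(m, b):
    """Adjacency lists for a 'very regular' graph: two complete blocks
    {0..m-1} and {m..2m-1} (so a = m - 1 neighbors inside the block),
    plus b cross edges per node, laid out cyclically.  Assumes b <= m."""
    adj = [[] for _ in range(2 * m)]
    for base in (0, m):                       # complete graph inside each block
        for i in range(m):
            adj[base + i] += [base + j for j in range(m) if j != i]
    for i in range(m):                        # sparse cyclic cut between blocks
        for j in range(b):
            u, v = i, m + (i + j) % m
            adj[u].append(v)
            adj[v].append(u)
    return adj

def averaging_round(x, adj):
    """One synchronous round: every node replaces its value with the
    average of its neighbors' values (the recursion x^(t+1) = P x^(t))."""
    return [sum(x[v] for v in adj[u]) / len(adj[u]) for u in range(len(x))]

def run_with_flags(x, adj, rounds):
    """Run the averaging dynamics and return the last round's flags:
    True where the value increased since the previous round."""
    flags = [False] * len(adj)
    for _ in range(rounds):
        x_new = averaging_round(x, adj)
        flags = [x_new[u] > x[u] for u in range(len(adj))]
        x = x_new
    return flags
```

Starting from random ±1 values and running for a modest number of rounds, the flags become constant on each block and opposite across the two blocks, provided the two initial block averages differ.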
Finally, in order to ensure in a distributed way that α_2 = (µ_1 − µ_2)/2 is non-zero, we can use randomness: If each node chooses its initial value as +1 or −1 with probability 1/2, then the two initial averages differ by Θ(1/√n) with high probability.

Some final remarks

Evolving systems governed by simple local rules that produce observable global phenomena can be fruitfully studied from a computational perspective. On the one hand, they seem suitable for modeling natural phenomena emerging in different fields (physics, economics, biology); on the other hand, they can be used as design principles and building blocks for lightweight distributed protocols for relevant computational tasks. Randomness plays a fundamental role in this kind of processes, either in a Bayesian sense, to model the intrinsic uncertainty about the local rules executed by the agents of the system, or as a distributed symmetry-breaking tool. The most interesting phenomena often cannot be studied by means of limiting behaviors and equilibria, because they appear at shorter time-scales, i.e., when systems are in their metastable phases.

References

1. Emmanuel Abbe and Colin Sandon. Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In Proc. of the 56th IEEE Ann. Symp. on Foundations of Computer Science (FOCS’15), pages 670–688. IEEE, 2015.
2. Vincenzo Auletta, Diodato Ferraioli, Francesco Pasquale, Paolo Penna, and Giuseppe Persiano. Logit dynamics with concurrent updates for local interaction potential games. Algorithmica, 73(3):511–546, 2015.
3. Vincenzo Auletta, Diodato Ferraioli, Francesco Pasquale, Paolo Penna, and Giuseppe Persiano. Convergence to equilibrium of Logit dynamics for strategic games. Algorithmica, 76(1):110–142, 2016.
4. Vincenzo Auletta, Diodato Ferraioli, Francesco Pasquale, and Giuseppe Persiano. Metastability of Logit dynamics for coordination games. In Proc. of the ACM-SIAM Symp. on Discrete Algorithms (SODA’12), pages 1006–1024. SIAM, 2012.
5. Vincenzo Auletta, Diodato Ferraioli, Francesco Pasquale, and Giuseppe Persiano. Mixing time and stationary expected social welfare of Logit dynamics. Theory of Computing Systems, 53(1):3–40, 2013.
6. Robert J. Aumann. Agreeing to disagree. The Annals of Statistics, pages 1236–1239, 1976.
7. Luca Becchetti, Andrea Clementi, Emanuele Natale, Francesco Pasquale, and Luca Trevisan. Find your place: Simple distributed algorithms for community detection. In Proc. of the ACM-SIAM Symp. on Discrete Algorithms (SODA’17), pages 940–959. SIAM, 2017.
8. Lawrence E. Blume. The statistical mechanics of strategic interaction. Games and Economic Behavior, 5(3):387–424, 1993.
9. Colin Camerer. Behavioral game theory: Experiments in strategic interaction. Princeton University Press, 2003.
10. Persi Diaconis, Ronald L. Graham, and John A. Morrison. Asymptotic analysis of a random walk on a hypercube with many dimensions. Random Structures & Algorithms, 1(1):51–72, 1990.
11. Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174, 2010.
12. Michelle Girvan and Mark E. J. Newman. Community structure in social and biological networks. Proc. of the National Academy of Sciences, 99(12):7821–7826, 2002.
13. David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov chains and mixing times. American Mathematical Society, 2009.
14. John Nash. Non-cooperative games. Annals of Mathematics, pages 286–295, 1951.
15. J. Maynard Smith. The theory of games and the evolution of animal conflicts. Journal of Theoretical Biology, 47(1):209–221, 1974.
16. John von Neumann and Oskar Morgenstern. Theory of games and economic behavior. Princeton University Press, 1944.