Understanding Reasoning Using Utility Proportional Beliefs

Christian Nauerz
EpiCenter, Maastricht University
c.nauerz@maastrichtuniversity.nl

Abstract. Traditionally, very little attention has been paid to the reasoning process that underlies a game-theoretic solution concept. When modeling bounded rationality in one-shot games, however, the reasoning process can be a great source of insight. The reasoning process itself can provide testable assertions, which provide more insight than the fit to experimental data alone. Based on Bach and Perea’s [1] concept of utility proportional beliefs, we analyze the players’ reasoning process and find three testable implications: (1) players form an initial belief that is the basis for further reasoning; (2) players reason by alternately considering their own and their opponent’s incentives; (3) players perform only a few rounds of deliberate reasoning.

Keywords: Epistemic game theory, interactive epistemology, solution concepts, bounded rationality, utility proportional beliefs, reasoning.

1 Introduction

Most of the ongoing research in game theory focuses on the prediction of players’ choices. In this paper, we take another perspective and look at the players’ reasoning process rather than their decisions. This perspective can be especially helpful for understanding experimental data from one-shot games without opportunities for learning or coordinating. Existing bounded rationality concepts (e.g. Quantal Response Equilibrium (QRE) [5] or Cognitive Hierarchy Models (CHM) [3]) focus mainly on the prediction of empirical frequencies rather than on accurately mimicking players’ reasoning process. Therefore, these concepts do not provide a clear rationale for why players select certain choices. The underlying idea of a concept might give a hint but not a clear insight. Most often, these models are evaluated by comparing their fit to the data.
This method, however, tells us little about the validity of individual features of the models. A deeper insight into the reasoning process helps to understand a concept’s characteristics and therefore which features work well and which do not. A concept that makes clear assertions about the reasoning process can thus be tested much more rigorously. In fact, a good concept should present clear assertions about the reasoning process that can be tested individually and in their interaction with each other. In this paper, we discuss a solution concept that is based on a general idea and provides detailed assertions about the reasoning process that can be tested individually.

2 Utility Proportional Beliefs

Bach and Perea [1], henceforth BP, suggest a concept for bounded rationality that builds on a simple idea: the differences in probabilities a player assigns to his opponent’s choices should be proportional to the differences in the opponent’s utilities for these choices. BP formalize the solution concept using the type-based approach to epistemic game theory. Here we only introduce the main definition and focus on the two-player case. For a more formal treatment, consult BP. Before stating the definition of utility proportional beliefs, we need to introduce some further notation. By $I = \{1, 2\}$ we denote the set of players, by $C_j$ player $j$’s finite choice set, by $T_j$ player $j$’s set of types, and by $U_i : C_i \times C_j \to \mathbb{R}$ player $i$’s utility function. The best and the worst possible utilities of player $j$ are denoted by $\bar{u}_j := \max_{c \in C} u_j(c)$ and $\underline{u}_j := \min_{c \in C} u_j(c)$. The expression $(b_i(t_i))(c_j \mid t_j)$ gives the probability that player $i$’s type $t_i$ assigns to $j$’s choice $c_j$ given that $j$ is of type $t_j$, where $t_i \in T_i$ and $t_j \in T_j$.

Definition 1. Let $i, j \in I$ be the two players, and $\lambda_j \in \mathbb{R}$ such that $\lambda_j \geq 0$.
A type $t_i \in T_i$ of player $i$ expresses $\lambda_j$-utility-proportional-beliefs, if

$$(b_i(t_i))(c_j \mid t_j) - (b_i(t_i))(c'_j \mid t_j) = \frac{\lambda_j}{\bar{u}_j - \underline{u}_j} \left( u_j(c_j, t_j) - u_j(c'_j, t_j) \right) \qquad (2.1)$$

for all $t_j \in T_j(t_i)$, for all $c_j, c'_j \in C_j$.

The definition directly corresponds to the idea of utility proportional beliefs: the difference in the probabilities player $i$ assigns to the opponent’s choices is equal to the difference of the utilities times the proportionality factor $\lambda_j / (\bar{u}_j - \underline{u}_j)$. BP give an intuitive interpretation of $\lambda_j$ as a measure of the sensitivity of a player’s beliefs to differences in the opponent’s utilities. Note that there exists an upper bound for $\lambda_j$, called $\lambda_j^{\max}$. It is the maximum value of $\lambda_j$ for which equation (2.1) yields well-defined probability measures. The lower limit of $\lambda_j$ is 0.

The concept of common belief in $\lambda$-utility-proportional-beliefs requires that both players entertain utility proportional beliefs, that both players believe their opponent holds utility proportional beliefs, that both players believe their opponents believe that their opponents do so, and so on. BP introduce an algorithm to find exactly those beliefs that are possible under common belief in $\lambda$-utility-proportional-beliefs. The algorithm iteratively deletes beliefs so that only the beliefs that are possible under common belief in $\lambda$-utility-proportional-beliefs survive. BP show in their Theorem 2 that beliefs are unique in the two-player case. Using their Lemma 4, we find an explicit expression for the unique beliefs under common belief in $\lambda$-utility-proportional-beliefs instead of using their algorithm. This expression reveals clues about the reasoning process players might go through to obtain utility proportional beliefs.

3 Reasoning Process

To introduce the formula for the players’ beliefs, some more notation needs to be fixed. We denote the number of choices of player $i$ by $n = |C_i|$ and the number of choices of player $j$ by $m = |C_j|$. Moreover, let $N = \{1, \ldots, n\}$ and $M = \{1, \ldots, m\}$.
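With the number of choices m at hand, Definition 1 can be made concrete for a single opponent type. Summing equation (2.1) over all of j's choices shows that each probability equals 1/m plus the scaled deviation of that choice's utility from the average utility. The following is a minimal sketch of this computation; the utilities, the value of the proportionality factor, and the function name `proportional_belief` are hypothetical and chosen only for illustration. Exact fractions are used so that (2.1) can be checked without rounding.

```python
from fractions import Fraction

def proportional_belief(utilities, lam, u_best, u_worst):
    """Belief over the opponent's choices implied by equation (2.1).

    utilities: the opponent's utility for each of her choices (hypothetical)
    lam: proportionality factor, assumed small enough for valid probabilities
    u_best, u_worst: the opponent's best and worst possible utilities
    """
    m = len(utilities)
    mean = sum(utilities) / Fraction(m)
    scale = Fraction(lam) / (u_best - u_worst)
    # Each probability is 1/m plus the scaled deviation from the mean utility.
    return [Fraction(1, m) + scale * (u - mean) for u in utilities]

# Hypothetical example: three choices with utilities 4, 1, 1 on a 0..4 scale.
utilities = [4, 1, 1]
lam = Fraction(1, 2)
belief = proportional_belief(utilities, lam, u_best=4, u_worst=0)

# Check equation (2.1) for every pair of choices.
scale = lam / (4 - 0)
for a in range(3):
    for b in range(3):
        assert belief[a] - belief[b] == scale * (utilities[a] - utilities[b])
```

In this example the choice with the highest utility receives probability 7/12 and the other two receive 5/24 each, so the probabilities sum to one.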
Let $i_n$ denote the $n \times 1$ vector $i_n = (\frac{1}{n}, \ldots, \frac{1}{n})$. Let $C_i = \{c_i^1, \ldots, c_i^n\}$ and $C_j = \{c_j^1, \ldots, c_j^m\}$, so that we can denote player $i$’s normalized $n \times m$ utility matrix by

$$U_i^{norm} = \frac{1}{\bar{u}_i - \underline{u}_i} \begin{bmatrix} U_i(c_i^1, c_j^1) & \cdots & U_i(c_i^1, c_j^m) \\ \vdots & & \vdots \\ U_i(c_i^n, c_j^1) & \cdots & U_i(c_i^n, c_j^m) \end{bmatrix}.$$

The $m \times m$ centering matrix $Z_m$ has $\frac{m-1}{m}$ on the diagonal and $-\frac{1}{m}$ off the diagonal. Intuitively, the centering matrix subtracts the mean from the columns of a matrix when left-multiplied. We define the matrix $G_j := \lambda_j Z_m U_j^{norm}$ since it will be useful for developing a more intuitive understanding. By left-multiplying the normalized utility matrix $U_j^{norm}$ with the centering matrix $Z_m$, one obtains a matrix in which the average of its column has been subtracted from every element. Note that the rows of $U_j^{norm}$ correspond to $j$’s choices and the columns to $i$’s choices. The same holds for the matrix $Z_m U_j^{norm}$, only that now each element represents the relative goodness of a choice given an opponent’s choice. Therefore, the matrix $G_j$ gives the goodness of a choice given a belief about the opponent’s choice, scaled by the sensitivity $\lambda_j$ to differences in the opponent’s utilities.

Now we can state the formula for $i$’s beliefs about $j$’s choices under common belief in $\lambda$-utility-proportional-beliefs:

$$\begin{aligned} \beta_i &= \sum_{k=0}^{\infty} (G_j G_i)^k (i_m + G_j i_n) \qquad (3.1) \\ &= (i_m + G_j i_n) + G_j G_i (i_m + G_j i_n) + G_j G_i [G_j G_i (i_m + G_j i_n)] + \cdots, \end{aligned}$$

where $\beta_i$ is an $m \times 1$ vector with the probabilities that player $i$ assigns to player $j$’s choices. We see that the expression $(i_m + G_j i_n)$ is repeated several times. In the second term, this expression is adjusted by left-multiplying with the matrix $G_j G_i$. In the third term, the second term is adjusted by left-multiplying with $G_j G_i$, and so on. Therefore, we call $(i_m + G_j i_n)$ the initial belief, $\beta_i^{initial}$. It shows how player $i$ constructs her beliefs about player $j$ without taking into account that player $j$ reasons about her.
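The objects introduced so far can be assembled numerically. The sketch below builds the centering matrix, the goodness matrices for a hypothetical 2 x 2 game, and partial sums of series (3.1), whose first term alone is the initial belief. The utility matrices, the value 1/2 for both proportionality factors, and all function names are assumptions made for illustration. Each player's utility matrix is stored with her own choices as rows, and a game with constant utilities (zero spread) is not handled.

```python
from fractions import Fraction as F

def centering(m):
    """m x m centering matrix: (m-1)/m on the diagonal, -1/m elsewhere."""
    return [[F(m - 1, m) if r == c else F(-1, m) for c in range(m)]
            for r in range(m)]

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def goodness(U, lam):
    """G = lam * Z * U_norm for a utility matrix U whose rows are the
    owner's choices and whose columns are the opponent's choices."""
    entries = [u for row in U for u in row]
    spread = max(entries) - min(entries)  # best minus worst utility
    U_norm = [[F(u) / spread for u in row] for row in U]
    ZU = mat_mul(centering(len(U)), U_norm)
    return [[lam * z for z in row] for row in ZU]

def belief_partial_sum(G_j, G_i, terms):
    """Sum of the first `terms` terms of series (3.1)."""
    m, n = len(G_j), len(G_i)
    # First term: the initial belief i_m + G_j i_n.
    term = [F(1, m) + g for g in mat_vec(G_j, [F(1, n)] * n)]
    total = [F(0)] * m
    GG = mat_mul(G_j, G_i)
    for _ in range(terms):
        total = [t + x for t, x in zip(total, term)]
        term = mat_vec(GG, term)  # next term: left-multiply by G_j G_i
    return total

# Hypothetical 2x2 game; rows are the owner's choices in each matrix.
U_i = [[3, 0], [0, 1]]
U_j = [[2, 0], [0, 4]]
lam = F(1, 2)  # assumed to be admissible for both players

G_i = goodness(U_i, lam)
G_j = goodness(U_j, lam)
initial = belief_partial_sum(G_j, G_i, 1)    # initial belief only
two_terms = belief_partial_sum(G_j, G_i, 2)  # initial belief plus one adjustment
```

Because the columns of each goodness matrix sum to zero, every partial sum remains a vector whose entries add up to one, which can be confirmed by summing the returned lists.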
In the initial belief, player $i$ starts off by assigning equal probability to each of her opponent’s choices. Then she adjusts her belief by adding the term $G_j i_n$, which represents the goodness of $j$’s choices when $j$ assigns equal probability to all of $i$’s choices.

To emphasize the reasoning process, we define $\beta_i^k$ as the belief that player $i$ holds after the $k$th reasoning step,

$$\beta_i^0 := \beta_i^{initial}, \qquad \beta_i^k := \beta_i^{initial} + G_j G_i \beta_i^{k-1},$$

such that $\lim_{k \to \infty} \beta_i^k = \beta_i$ holds. To obtain a more intuitive understanding, we rewrite $\beta_i^k$ as follows:

$$\beta_i^k = i_m + G_j (i_n + G_i \beta_i^{k-1}).$$

We see that player $i$ first takes $j$’s perspective, which is reflected in the expression $i_n + G_i \beta_i^{k-1}$. Here player $j$ forms a belief about player $i$ given $i$’s belief about $j$ from the previous reasoning step. First, $j$ assigns equal probability to all of $i$’s choices. Then she corrects these beliefs by the goodness of $i$’s choices given $i$’s belief about $j$ from the previous reasoning step. The result is a new belief of $j$ about $i$. Then player $i$ takes her own perspective and assigns equal probability to all of $j$’s choices. These probabilities are then again corrected by the goodness of $j$’s choices given the new belief of $j$ about $i$. The process continues in the same fashion for the subsequent reasoning steps.

It is also important to note that later reasoning steps are less important for the final belief than earlier ones. Define $\lambda_i = \alpha_i \lambda_i^{\max}$ with $\alpha_i \in [0, 1)$, and analogously for player $j$, and note that $G_j = \alpha_j \lambda_j^{\max} Z_m U_j^{norm}$, so that (3.1) can be written as

$$\beta_i = \sum_{k=0}^{\infty} \left( \alpha_i \alpha_j \lambda_i^{\max} \lambda_j^{\max} Z_m U_j^{norm} Z_n U_i^{norm} \right)^k \beta_i^{initial}.$$

Since $\alpha_i, \alpha_j \in [0, 1)$, later terms in $\sum_{k=0}^{\infty} (\alpha_i \alpha_j \lambda_i^{\max} \lambda_j^{\max} Z_m U_j^{norm} Z_n U_i^{norm})^k$ will be smaller than earlier ones and therefore less important for the final belief $\beta_i$. This also has an important implication for the meaning of the proportionality factor $\lambda_i$: the lower the value of $\lambda_i$, the fewer steps of reasoning a player needs to approximate the final belief within a reasonable bound.
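The alternation of perspectives and the shrinking weight of later steps can be traced numerically. The following sketch takes the two goodness matrices as given and computes the belief after each reasoning step, then counts how many steps are needed until the belief changes by less than a tolerance. The matrix entries, the tolerance, and the function names are hypothetical and serve only as an illustration. Halving both matrices, which corresponds to halving both proportionality factors, is used to illustrate the claim that lower sensitivity requires fewer steps.

```python
from fractions import Fraction as F

def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def reasoning_steps(G_j, G_i, steps):
    """Return player i's beliefs about j after 0, 1, ..., `steps` steps."""
    m, n = len(G_j), len(G_i)
    i_m, i_n = [F(1, m)] * m, [F(1, n)] * n
    # Step 0: initial belief, formed without considering j's reasoning.
    belief = [p + g for p, g in zip(i_m, mat_vec(G_j, i_n))]
    history = [belief]
    for _ in range(steps):
        # j's perspective: uniform belief about i, corrected by G_i.
        j_about_i = [p + g for p, g in zip(i_n, mat_vec(G_i, belief))]
        # i's perspective: uniform belief about j, corrected by G_j.
        belief = [p + g for p, g in zip(i_m, mat_vec(G_j, j_about_i))]
        history.append(belief)
    return history

def steps_until_stable(G_j, G_i, tol, max_steps=50):
    """Smallest k at which the belief moves by less than tol, else None."""
    history = reasoning_steps(G_j, G_i, max_steps)
    for k in range(1, len(history)):
        change = max(abs(a - b) for a, b in zip(history[k], history[k - 1]))
        if change < tol:
            return k
    return None

# Hypothetical goodness matrices (columns sum to zero, as centering implies).
G_j = [[F(1, 8), F(-1, 4)], [F(-1, 8), F(1, 4)]]
G_i = [[F(1, 4), F(-1, 12)], [F(-1, 4), F(1, 12)]]

history = reasoning_steps(G_j, G_i, 3)
steps_high = steps_until_stable(G_j, G_i, F(1, 10000))

# Halve both matrices, i.e. halve both proportionality factors.
half = lambda G: [[g / 2 for g in row] for row in G]
steps_low = steps_until_stable(half(G_j), half(G_i), F(1, 10000))
```

In this example each additional step changes the belief by one eighth of the previous change, so the belief stabilizes after only a few rounds, and it does so sooner once both matrices are halved.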
The same holds true for her opponent’s proportionality factor $\lambda_j$: the lower its value, the fewer reasoning steps are needed.

4 Connections to Psychology

These features correspond closely to findings in the psychology literature. In his book “Thinking, Fast and Slow”, Kahneman [4] advocates the idea that reasoning happens in two distinct ways. Following Stanovich [6], he calls the two modes of thinking System 1 and System 2. Note that the word “system” does not indicate an actual system but only serves as a label for different modes of thinking. System 1 is an automatic and mostly unconscious way of thinking that demands little computational capacity. System 2 describes the idea of deliberate reasoning. It comes into play when controlled analytical thinking is needed. Table 1 summarizes the properties of the two systems according to [6].

System 1                                                   System 2
associative                                                rule-based
holistic                                                   analytic
relatively undemanding of cognitive capacity               demanding of cognitive capacity
relatively fast                                            relatively slow
acquisition by biology, exposure, and personal experience  acquisition by cultural and formal tuition

Table 1. Properties of System 1 & 2

The concept of System 1 describes the unconscious first reaction to a situation, which happens almost immediately and without demanding a lot of cognitive resources. Moreover, Kahneman [4] argues that the beliefs formed by System 1 are the basis for conscious reasoning within System 2. This is consistent with our findings since the initial belief does not take into account any strategic interaction. This belief can be seen as an automatic initial reaction to the game. The deliberate reasoning process described above can be imagined as being executed by System 2 using the findings of System 1, in this case the initial belief. Taking another player’s perspective requires deliberate reasoning and can hardly be done automatically.

Finally, we showed that the final belief can be approximated with finitely many steps of reasoning. This feature is closely related to the problem of limited working memory.
Baddeley [2] defines working memory as "... [A] brain system that provides temporary storage and manipulation of the information necessary for such complex cognitive tasks as language comprehension, learning, and reasoning." Since working memory is critical for reasoning but bounded, human beings can perform only a limited number of reasoning steps without the support of tools. Therefore, a model resembling human reasoning should not predict an infinite number of reasoning steps.

References

1. Bach, C.W., Perea, A.: Utility proportional beliefs. International Journal of Game Theory, pp. 1–22 (2014)
2. Baddeley, A.: Working memory. Science 255(5044), 556–559 (1992)
3. Camerer, C.F., Ho, T.H., Chong, J.K.: A cognitive hierarchy model of games. The Quarterly Journal of Economics 119(3), 861–898 (2004)
4. Kahneman, D.: Thinking, fast and slow. Macmillan (2011)
5. McKelvey, R.D., Palfrey, T.R.: Quantal response equilibria for normal form games. Games and Economic Behavior 10(1), 6–38 (1995)
6. Stanovich, K.E., West, R.F., et al.: Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences 23(5), 645–665 (2000)