Understanding Reasoning Using Utility Proportional Beliefs

                                      Christian Nauerz

                              EpiCenter, Maastricht University
                            c.nauerz@maastrichtuniversity.nl


      Abstract. Traditionally, very little attention has been paid to the reasoning
      process that underlies a game-theoretic solution concept. When modeling bounded
      rationality in one-shot games, however, the reasoning process can be a great source
      of insight. The reasoning process itself can provide testable assertions, which
      provide more insight than the fit to experimental data alone. Based on Bach and
      Perea's [1] concept of utility proportional beliefs, we analyze the players' reasoning
      process and find three testable implications: (1) players form an initial belief that
      is the basis for further reasoning; (2) players reason by alternately considering
      their own and their opponent's incentives; (3) players perform only a few rounds of
      deliberate reasoning.


Keywords: Epistemic game theory, interactive epistemology, solution concepts, bounded
rationality, utility proportional beliefs, reasoning.


1    Introduction

Most ongoing research in game theory focuses on predicting players' choices. In this
paper, we take another perspective and look at the players' reasoning process rather than
their decisions. This perspective is especially helpful for understanding experimental data
from one-shot games without opportunities for learning or coordination. Existing bounded
rationality concepts (e.g. Quantal Response Equilibrium (QRE) [5] or Cognitive Hierarchy
Models (CHM) [3]) focus mainly on predicting empirical frequencies rather than on
accurately mimicking players' reasoning process. Therefore, these concepts do not provide
a clear rationale for why players select certain choices. The underlying idea of a concept
might give a hint but no clear insight. Most often, these models are evaluated by
comparing their fit to the data. This method, however, tells us little about the validity
of individual features of the models. A deeper insight into the reasoning process could
help to better understand a concept's characteristics and therefore which features work
well and which do not. A concept that makes clear assertions about the reasoning process
can thus be tested much more rigorously. In fact, a good concept should present clear
assertions about the reasoning process that can be tested individually and in their
interaction with each other. In this paper, we discuss a solution concept that is based on
a general idea and provides detailed assertions about the reasoning process that can be
tested individually.


2    Utility Proportional Beliefs

Bach and Perea [1], henceforth BP, suggest a concept for bounded rationality that builds
on a simple idea: the differences in the probabilities a player assigns to his opponent's
choices should be proportional to the differences in the opponent's utilities for these
choices. BP formalize the solution concept using the type-based approach to epistemic
game theory. Here we introduce only the main definition and focus on the two-player case.
For a more formal treatment, consult BP. Before stating the definition of utility
proportional beliefs, we need to introduce some notation.
    By $I = \{1, 2\}$ we denote the set of players, by $C_j$ player $j$'s finite choice
set, by $T_j$ player $j$'s set of types, and by $U_i : C_i \times C_j \to \mathbb{R}$ player
$i$'s utility function. The best and the worst possible utilities of player $j$ are denoted
by $\bar{u}_j := \max_{c \in C} u_j(c)$ and $\underline{u}_j := \min_{c \in C} u_j(c)$. The
expression $(b_i(t_i))(c_j \mid t_j)$ gives the probability that player $i$'s type $t_i$
assigns to $j$'s choice $c_j$ given that $j$ is of type $t_j$, where $t_i \in T_i$ and
$t_j \in T_j$.
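Definition 1. Let $i, j \in I$ be the two players, and $\lambda_j \in \mathbb{R}$ such that
$\lambda_j \geq 0$. A type $t_i \in T_i$ of player $i$ expresses
$\lambda_j$-utility-proportional-beliefs, if
\[
(b_i(t_i))(c_j \mid t_j) - (b_i(t_i))(c'_j \mid t_j)
  = \frac{\lambda_j}{\bar{u}_j - \underline{u}_j} \bigl( u_j(c_j, t_j) - u_j(c'_j, t_j) \bigr)
  \tag{2.1}
\]
for all $t_j \in T_j(t_i)$, for all $c_j, c'_j \in C_j$.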
The definition directly corresponds to the idea of utility proportional beliefs: the
difference in the probabilities player $i$ assigns to two of the opponent's choices is equal
to the difference of the utilities times the proportionality factor
$\lambda_j / (\bar{u}_j - \underline{u}_j)$. BP give an intuitive interpretation of
$\lambda_j$ as a measure of the sensitivity of a player's beliefs to differences in the
opponent's utilities. Note that there exists an upper bound for $\lambda_j$, called
$\lambda_j^{\max}$. It is the maximum value of $\lambda_j$ for which equation (2.1) yields
well-defined probability measures. The lower limit of $\lambda_j$ is 0.
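
As a minimal sketch of what equation (2.1) implies, suppose the opponent's type is fixed, so
that the utilities $u_j(c_j, t_j)$ are plain numbers. The pairwise condition (2.1), together
with the requirement that the probabilities sum to one, then gives
$b(c_j) = 1/m + \frac{\lambda_j}{\bar{u}_j - \underline{u}_j}(u_j(c_j) - \text{average utility})$
with $m = |C_j|$. The function name `proportional_belief` and all numbers below are
hypothetical and chosen only for illustration. The snippet does not compute BP's
$\lambda_j^{\max}$; it only shows that too large a $\lambda_j$ produces negative entries.

```python
def proportional_belief(utilities, lam, u_best, u_worst):
    """Belief over the opponent's choices implied by (2.1) for fixed utilities.

    utilities: the opponent's utility for each of her choices (hypothetical input)
    lam:       proportionality factor lambda_j >= 0
    u_best, u_worst: the opponent's best and worst possible utilities
    """
    m = len(utilities)
    mean_u = sum(utilities) / m
    scale = lam / (u_best - u_worst)
    # Start from the uniform belief and shift each entry by the scaled
    # deviation of that choice's utility from the average utility.
    return [1.0 / m + scale * (u - mean_u) for u in utilities]


# Hypothetical utilities of the opponent for three choices.
utilities = [4.0, 2.0, 0.0]

belief = proportional_belief(utilities, lam=0.5, u_best=4.0, u_worst=0.0)
print(belief)     # entries sum to one; pairwise differences follow (2.1)

uniform = proportional_belief(utilities, lam=0.0, u_best=4.0, u_worst=0.0)
print(uniform)    # lam = 0 ignores the utilities entirely

too_large = proportional_belief(utilities, lam=2.0, u_best=4.0, u_worst=0.0)
print(too_large)  # contains a negative entry, so it is not a probability measure
```

With `lam=0` the belief is uniform, which matches the lower limit of $\lambda_j$; the last
call illustrates why an upper bound on $\lambda_j$ is needed.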
    The concept of common belief in $\lambda$-utility-proportional-beliefs requires that
both players entertain utility proportional beliefs, that both players believe their
opponent holds utility proportional beliefs, that both players believe their opponent
believes that they do so, and so on. BP introduce an algorithm to find exactly those
beliefs that are possible under common belief in $\lambda$-utility-proportional-beliefs. The
algorithm iteratively deletes beliefs so that only those beliefs that are possible under
common belief in $\lambda$-utility-proportional-beliefs survive.
    BP show in their Theorem 2 that beliefs are unique in the two-player case. Using their
Lemma 4, we find an explicit expression for the unique beliefs under common belief in
$\lambda$-utility-proportional-beliefs instead of using their algorithm. This expression
reveals clues about the reasoning process players might go through to obtain utility
proportional beliefs.

3    Reasoning Process
To introduce the formula for the player's beliefs, some more notation needs to be fixed. We
denote the number of choices of player $i$ by $n = |C_i|$ and the number of choices of
player $j$ by $m = |C_j|$. Moreover, let $N = \{1, \dots, n\}$ and $M = \{1, \dots, m\}$. Let
$i_n$ denote the $n \times 1$ vector $i_n = (\frac{1}{n}, \dots, \frac{1}{n})$. Let
$C_i = \{c_i^1, \dots, c_i^n\}$ and $C_j = \{c_j^1, \dots, c_j^m\}$ so that we can denote
player $i$'s $n \times m$ utility matrix by
\[
U_i^{\mathrm{norm}} = \frac{1}{\bar{u}_i - \underline{u}_i}
\begin{bmatrix}
U_i(c_i^1, c_j^1) & \cdots & U_i(c_i^1, c_j^m) \\
\vdots            &        & \vdots            \\
U_i(c_i^n, c_j^1) & \cdots & U_i(c_i^n, c_j^m)
\end{bmatrix}.
\]

The $m \times m$ centering matrix $Z_m$ has $\frac{m-1}{m}$ on the diagonal and
$-\frac{1}{m}$ off the diagonal. Intuitively, the centering matrix subtracts the column mean
from each column of a matrix when it is left-multiplied. We define the matrix
$G_j := \lambda_j Z_m U_j^{\mathrm{norm}}$, since it will be useful for developing a more
intuitive understanding. By left-multiplying the normalized utility matrix
$U_j^{\mathrm{norm}}$ with the centering matrix $Z_m$, one obtains a matrix in which the
average of its column has been subtracted from every element. Note that the rows of
$U_j^{\mathrm{norm}}$ correspond to $j$'s choices and the columns to $i$'s choices, as the
dimensions of the product $Z_m U_j^{\mathrm{norm}}$ require. The same holds for the matrix
$Z_m U_j^{\mathrm{norm}}$, only that now each element represents the relative goodness of a
choice given an opponent's choice. Therefore, the matrix $G_j$ gives the goodness of a
choice given a belief about the opponent's choice, scaled by the sensitivity $\lambda_j$ to
differences in the opponent's utilities.
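
The construction of $G_j$ can be sketched in a few lines of plain Python. The payoff numbers
are a hypothetical example with $m = 3$ choices for player $j$ and $n = 2$ choices for
player $i$, and the helper names (`centering_matrix`, `normalize`, `goodness_matrix`) are
illustrative, not taken from BP. The sketch assumes that $U_j$ is stored with one row per
choice of $j$ and one column per choice of $i$, so that the product with $Z_m$ is defined,
and that normalization divides by the spread between the largest and the smallest entry.

```python
def centering_matrix(m):
    """m x m matrix with (m - 1)/m on the diagonal and -1/m elsewhere."""
    return [[(1.0 if r == c else 0.0) - 1.0 / m for c in range(m)]
            for r in range(m)]


def matmul(A, B):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def normalize(U):
    """Divide every utility by (best utility - worst utility)."""
    flat = [u for row in U for u in row]
    spread = max(flat) - min(flat)
    return [[u / spread for u in row] for row in U]


def goodness_matrix(U, lam):
    """G = lam * Z * U_norm, where Z removes the mean of each column."""
    Z = centering_matrix(len(U))
    return [[lam * x for x in row] for row in matmul(Z, normalize(U))]


# Hypothetical utilities of player j: rows are j's choices, columns are i's choices.
U_j = [[4.0, 0.0],
       [2.0, 2.0],
       [0.0, 3.0]]

G_j = goodness_matrix(U_j, lam=0.5)
for row in G_j:
    print(row)

# Each column sums to (numerically) zero because its mean has been removed.
print([sum(col) for col in zip(*G_j)])
```

A positive entry of `G_j` marks a choice of $j$ that does better than $j$'s average choice
against the corresponding choice of $i$; a negative entry marks one that does worse.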
    Now we can state the formula for $i$'s beliefs about $j$'s choices under common belief
in $\lambda$-utility-proportional-beliefs:
\[
\begin{aligned}
\beta_i &= \sum_{k=0}^{\infty} (G_j G_i)^k (i_m + G_j i_n) \\
        &= (i_m + G_j i_n) + G_j G_i (i_m + G_j i_n)
           + G_j G_i [G_j G_i (i_m + G_j i_n)] + \cdots,
\end{aligned}
\tag{3.1}
\]
where $\beta_i$ is an $m \times 1$ vector with the probabilities that player $i$ assigns to
player $j$'s choices.
    We see that the expression $(i_m + G_j i_n)$ is repeated several times. In the second
term, this expression is adjusted by left-multiplying it with the matrix $G_j G_i$. In the
third term, the second term is adjusted by left-multiplying it with $G_j G_i$, and so on.
Therefore, we call $(i_m + G_j i_n)$ the initial belief, $\beta_i^{\mathrm{initial}}$. It
shows how player $i$ constructs her beliefs about player $j$ without taking into account
that player $j$ reasons about her. Player $i$ starts off by assigning equal probability to
each of her opponent's choices. Then she adjusts her belief by adding the term $G_j i_n$,
which represents the goodness of $j$'s choices when $j$ assigns equal probability to all of
$i$'s choices.
    To emphasize the reasoning process, we define $\beta_i^k$ as the belief that player $i$
holds after the $k$th reasoning step,
\[
\beta_i^0 := \beta_i^{\mathrm{initial}}, \qquad
\beta_i^k := \beta_i^{\mathrm{initial}} + G_j G_i \beta_i^{k-1},
\]
such that $\lim_{k \to \infty} \beta_i^k = \beta_i$ holds. To obtain a more intuitive
understanding we rewrite $\beta_i^k$ as follows:
\[
\beta_i^k = i_m + G_j (i_n + G_i \beta_i^{k-1}).
\]
    We see that first player $i$ takes $j$'s perspective, which is reflected in the
expression $i_n + G_i \beta_i^{k-1}$. Here player $j$ forms a belief about player $i$ given
$i$'s belief about $j$ from the previous reasoning step. First $j$ assigns equal probability
to all of $i$'s choices. Then she corrects these beliefs by the goodness of $i$'s choices
given $i$'s belief about $j$ from the previous reasoning step. The result is a new belief of
$j$ about $i$. Then player $i$ takes her own perspective and assigns equal probability to
all of $j$'s choices. These probabilities are then again corrected by the goodness of $j$'s
choices given the new belief of $j$ about $i$. The process then continues in the same
fashion for the subsequent reasoning steps.
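
The alternating structure of the recursion can be made concrete with a short sketch. It
follows $\beta_i^k = i_m + G_j (i_n + G_i \beta_i^{k-1})$ literally: in every step, player
$j$'s belief about $i$ is formed first, and then player $i$'s belief about $j$ is updated.
The game, the values $\lambda_i = \lambda_j = 0.5$, and all function names are hypothetical.
The values were picked small enough that every belief in this example stays nonnegative; no
attempt is made to compute $\lambda^{\max}$. Subtracting column means is used as an
equivalent shortcut for left-multiplying with the centering matrix.

```python
def goodness_matrix(U, lam):
    """lam * Z * U_norm, computed by normalizing U and removing column means."""
    flat = [u for row in U for u in row]
    spread = max(flat) - min(flat)
    norm = [[u / spread for u in row] for row in U]
    col_means = [sum(col) / len(U) for col in zip(*norm)]
    return [[lam * (x - mu) for x, mu in zip(row, col_means)] for row in norm]


def matvec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def reasoning_steps(U_i, U_j, lam_i, lam_j, steps):
    """Return [beta^0, beta^1, ..., beta^steps], player i's beliefs about j."""
    n, m = len(U_i), len(U_j)
    G_i = goodness_matrix(U_i, lam_i)   # n x m
    G_j = goodness_matrix(U_j, lam_j)   # m x n
    i_n = [1.0 / n] * n
    i_m = [1.0 / m] * m
    beta = add(i_m, matvec(G_j, i_n))   # initial belief: no reasoning about j's reasoning
    history = [beta]
    for _ in range(steps):
        # j's perspective: uniform belief about i, corrected by the goodness of
        # i's choices under i's belief from the previous step.
        j_about_i = add(i_n, matvec(G_i, beta))
        # i's perspective: uniform belief about j, corrected by the goodness of
        # j's choices under j's new belief about i.
        beta = add(i_m, matvec(G_j, j_about_i))
        history.append(beta)
    return history


# Hypothetical 2 x 3 game: rows of U_i are i's choices, rows of U_j are j's choices.
U_i = [[3.0, 1.0, 0.0],
       [0.0, 2.0, 4.0]]
U_j = [[4.0, 0.0],
       [2.0, 2.0],
       [0.0, 3.0]]

history = reasoning_steps(U_i, U_j, lam_i=0.5, lam_j=0.5, steps=5)
for k, beta in enumerate(history):
    print(k, [round(p, 6) for p in beta])
```

Printing the whole history rather than only the last belief makes it easy to see how much
each additional reasoning step still changes player $i$'s belief.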
    It is also important to note that later reasoning steps will be less important for the
final belief than earlier ones. Define $\lambda_i = \alpha_i \lambda_i^{\max}$ with
$\alpha_i \in [0, 1)$, and analogously $\lambda_j = \alpha_j \lambda_j^{\max}$, and note that
$G_j = \alpha_j \lambda_j^{\max} Z_m U_j^{\mathrm{norm}}$, so that (3.1) can be written as
\[
\beta_i = \sum_{k=0}^{\infty}
  (\alpha_i \alpha_j \lambda_i^{\max} \lambda_j^{\max}
   Z_m U_j^{\mathrm{norm}} Z_n U_i^{\mathrm{norm}})^k \beta_i^{\mathrm{initial}}.
\]
Since $\alpha_i, \alpha_j \in [0, 1)$, later terms in
$\sum_{k=0}^{\infty} (\alpha_i \alpha_j \lambda_i^{\max} \lambda_j^{\max} Z_m U_j^{\mathrm{norm}} Z_n U_i^{\mathrm{norm}})^k$
will be smaller than earlier ones and therefore less important for the final belief
$\beta_i$. This also has an important implication for the meaning of the proportionality
factor $\lambda_i$: the lower the value of $\lambda_i$, the fewer steps of reasoning a player
needs to approximate the final belief within a reasonable bound. The same holds true for
her opponent's proportionality factor.
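
The effect of the proportionality factors on the number of reasoning steps can be
illustrated numerically. The sketch below iterates the recursion until two consecutive
beliefs differ by less than a tolerance and reports how many steps that took for several
values of $\lambda_i = \lambda_j$. The game, the tolerance `tol`, and the function names are
hypothetical; the $\lambda$ values are set directly rather than derived from
$\lambda^{\max}$, which is not computed here.

```python
def goodness_matrix(U, lam):
    """lam * Z * U_norm, computed by normalizing U and removing column means."""
    flat = [u for row in U for u in row]
    spread = max(flat) - min(flat)
    norm = [[u / spread for u in row] for row in U]
    col_means = [sum(col) / len(U) for col in zip(*norm)]
    return [[lam * (x - mu) for x, mu in zip(row, col_means)] for row in norm]


def matvec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def steps_to_converge(U_i, U_j, lam_i, lam_j, tol=1e-6, max_steps=1000):
    """Number of reasoning steps until consecutive beliefs differ by less than tol."""
    n, m = len(U_i), len(U_j)
    G_i = goodness_matrix(U_i, lam_i)
    G_j = goodness_matrix(U_j, lam_j)
    i_n, i_m = [1.0 / n] * n, [1.0 / m] * m
    beta = [a + b for a, b in zip(i_m, matvec(G_j, i_n))]  # initial belief
    for k in range(1, max_steps + 1):
        j_about_i = [a + b for a, b in zip(i_n, matvec(G_i, beta))]
        new_beta = [a + b for a, b in zip(i_m, matvec(G_j, j_about_i))]
        if max(abs(x - y) for x, y in zip(new_beta, beta)) < tol:
            return k
        beta = new_beta
    return max_steps  # tolerance not reached within max_steps


# Hypothetical 2 x 3 game: rows of U_i are i's choices, rows of U_j are j's choices.
U_i = [[3.0, 1.0, 0.0],
       [0.0, 2.0, 4.0]]
U_j = [[4.0, 0.0],
       [2.0, 2.0],
       [0.0, 3.0]]

steps = {lam: steps_to_converge(U_i, U_j, lam, lam) for lam in (0.1, 0.3, 0.5)}
print(steps)
```

In this example the step count does not decrease as $\lambda$ grows, because every
additional step shrinks the remaining correction by a factor proportional to
$\lambda_i \lambda_j$.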


4    Connections to Psychology

These features correspond closely to findings in the psychology literature. In his book
“Thinking, Fast and Slow”, Kahneman [4] advocates the idea that reasoning happens in two
distinct ways. He calls the two modes of thinking System 1 and System 2, following
Stanovich [6]. Note that the word “system” does not indicate an actual system but only
serves as a label for different modes of thinking. System 1 is an automatic and mostly
unconscious way of thinking that demands little computational capacity. System 2 describes
deliberate reasoning. It comes into play when controlled analytical thinking is needed.
Table 1 summarizes the properties of the two systems according to [6].


          System 1                                                   System 2
          ---------------------------------------------------------  ------------------------------------------
          associative                                                rule-based
          holistic                                                   analytic
          relatively undemanding of cognitive capacity               demanding of cognitive capacity
          relatively fast                                            relatively slow
          acquisition by biology, exposure, and personal experience  acquisition by cultural and formal tuition

                             Table 1. Properties of System 1 & 2


    The concept of System 1 describes the unconscious first reaction to a situation, which
happens almost immediately and without demanding a lot of cognitive resources. Moreover,
Kahneman [4] argues that the beliefs formed by System 1 are the basis for conscious
reasoning within System 2. This is consistent with our findings, since the initial belief
does not take into account any strategic interaction and can therefore be seen as an
automatic initial reaction to the game. The deliberate reasoning process described above
can be imagined as being executed by System 2 using the output of System 1, in this case
the initial belief. Taking another player's perspective requires deliberate reasoning and
can hardly be done automatically. Finally, we showed that the final belief can be
approximated with finitely many steps of reasoning. This feature is closely related to the
problem of limited working memory. Baddeley [2] defines working memory as "... [A] brain
system that provides temporary storage and manipulation of the information necessary
for such complex cognitive tasks as language comprehension, learning, and reasoning."
Since working memory is critical for reasoning but bounded, human beings can perform only
a limited number of reasoning steps without the support of tools. Therefore, a model
resembling human reasoning should not predict an infinite number of reasoning steps.


References
1. Bach, C.W., Perea, A.: Utility proportional beliefs. International Journal of Game Theory,
   pp. 1–22 (2014)
2. Baddeley, A.: Working memory. Science 255(5044), 556–559 (1992)
3. Camerer, C.F., Ho, T.H., Chong, J.K.: A cognitive hierarchy model of games. The Quarterly
   Journal of Economics 119(3), 861–898 (2004)
4. Kahneman, D.: Thinking, Fast and Slow. Macmillan (2011)
5. McKelvey, R.D., Palfrey, T.R.: Quantal response equilibria for normal form games. Games
   and Economic Behavior 10(1), 6–38 (1995)
6. Stanovich, K.E., West, R.F., et al.: Individual differences in reasoning: Implications for the
   rationality debate? Behavioral and Brain Sciences 23(5), 645–665 (2000)