Strategic learning towards equilibrium. Exploratory analysis and models

Oleksii P. Ignatenko¹
¹ Institute of Software Systems NAS Ukraine, 40 Academician Glushkov Ave., Kyiv, 03187, Ukraine

Abstract
This paper deals with the strategic behavior of people, as observed in experiments. The research question of this work is how players (mainly children) learn in complex strategic situations which they have never faced before. We examine data from different games played during popular lectures about game theory and present findings about players' progress in strategic learning while competing with other players. Four "pick a number" games were investigated, all with similar-looking rules but very different properties. These games were introduced to very different groups of listeners. The data gathered is available in an open repository for replication and analysis. We analyse the data and propose an agent-based model of the beauty contest game that explains the observed behavior. Finally, we discuss the findings, propose hypotheses to investigate, and formulate open questions for future research.

Keywords
behavioral game theory, guessing game, k-beauty contest, active learning, R, agent-based modeling

1. Introduction

Game theory is a field of science which investigates decision-making under uncertainty. The source of the uncertainty can be the strategic structure, e.g. the probability of certain events or lack of information about future possibilities, or it can be the decisions of other agents. In the latter case we can talk about interdependence, that is, when the actions of some players affect the payoffs of others. Such situations arise around us every day and we, consciously or unconsciously, take part in them. Success relies heavily on our perception of the actions of other players. The problem is how we can know the future actions of other players.
Truth be told, we can't, but we can start with some assumptions that will eventually help create a framework, model or theory of "mind" which will predict future (reasonable) actions. Game theory proposed an approach which is now under question (especially from the side of experimental and behavioral economics); nevertheless, we will start from the standard notions and then proceed to the experimental data. One can expect that other players will play "reasonably", and by this game theory means that they will try to achieve a better result in some agent's sense. This idea is captured by the term rationality. Every rational player must calculate the best possible result, taking into account the rules of the game and the interests of other participants; in other words, think strategically. It is well known from theory that a rational player will play a Nash equilibrium (NE) if there is one, which is very useful in games where a unique NE exists. The notion of rationality was indeed fundamental for the development of game theory, especially its mathematical part. However, the problems with this notion are also quite numerous.

CoSinE 2021: 9th Illia O. Teplytskyi Workshop on Computer Simulation in Education, co-located with the 17th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2021), October 1, 2021, Kherson, Ukraine
o.ignatenko@gmail.com (O. P. Ignatenko)
https://www.nas.gov.ua/EN/PersonalSite/Pages/default.aspx?PersonID=0000004947 (O. P. Ignatenko)
0000-0001-8692-2062 (O. P. Ignatenko)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
First of all, it is very demanding, because it presupposes that an agent has complete, transitive preferences and is actually capable of computing the equilibrium in a given strategic situation. But this is simply not feasible in many real situations (for example, we know a NE exists in chess, but still no one can compute it). Secondly, and probably more importantly, there are many games where the NE is a poor prediction of actual human behavior. In this paper we investigate data from such games and discuss the difference. All this makes decision making a very interesting problem to investigate. This is a rich area of research, where the theoretical constructions of game theory seem to fail and experimental data shows unusual patterns. These patterns are persistent and usually do not depend on age, education, country and other factors. Over the last 25 years, behavioral game theory has examined in numerous studies bounded rationality (the closest concept to the rationality of game theory), cognitive distortions and the heuristics people use to reason in strategic situations. For example, we can note the surveys of Crawford et al. [1] and Mauersberger and Nagel [2]. There is also a comprehensive description of the field of behavioral game theory by Camerer [3]. We will concentrate on guessing games, which are a notable part of this research because of their simplicity for players and the easy analysis of their rules from a game-theoretic perspective. In this paper we present results of games played during 2018–2021 as part of popular lectures about game theory. The audience of these lectures was quite heterogeneous, but we can distinguish the following main groups: (1) children at schools (strong mathematical schools, ordinary schools, alternative education schools); (2) students (bachelor and master levels); (3) mixed adults with almost any background; (4) adults with business background; (5) participants of a Data Science School; (6) participants of summer STEM camps for children.
We propose a framework of four types of games, each presenting one idea or concept of game theory. These games were introduced to players with no prior knowledge (at least in the vast majority) of the theory. On the other hand, the games have simple formulations and clear winning rules, which makes them intuitively understandable even for kids. This makes these games a perfect choice to test the ability for strategic thinking and to investigate the process of understanding complex concepts during play, with immediate application to practice. This dual learning, as we can name it, shows how players try-and-learn in real conditions and react to the challenges of interaction with other strategic players. In the last section we build an agent-based model using the NetLogo environment and analyse its relevance to the experimental data. First we start with some definitions.

1.1. Game theory definitions and assumptions

We will consider games in strategic or normal form in a non-cooperative setup. Non-cooperativeness here does not imply that the players do not cooperate, but it means that any cooperation must be self-enforcing, without any coordination among the players. The strict definition is as follows.

A non-cooperative game in strategic (or normal) form is a triplet 𝐺 = {𝒩, {𝑆ᵢ}ᵢ∈𝒩, {𝑢ᵢ}ᵢ∈𝒩}, where:

• 𝒩 is a finite set of players, 𝒩 = {1, . . . , 𝑁};
• 𝑆ᵢ is the set of admissible strategies for player 𝑖;
• 𝑢ᵢ : 𝑆 → ℝ is the utility (payoff) function for player 𝑖, with 𝑆 = 𝑆₁ × · · · × 𝑆ₙ (the Cartesian product of the strategy sets).

A game is said to be static if the players take their actions only once, independently of each other. In some sense, a static game is a game without any notion of time, where no player has any knowledge of the decisions taken by the other players. Even though, in practice, the players may have made their strategic choices at different points in time, a game is still considered static if no player has any information on the decisions of others.
In contrast, a dynamic game is one where the players have some (full or imperfect) information about each others' choices and can act more than once. In this work we deal with static repeated games, which means that the same game is played twice (sometimes three times) with the same players. Agent rationality is a very important issue; sometimes it is called full rationality (to distinguish it from bounded rationality, a less restrictive notion). When a fully rational agent tries to find the best action, it usually depends on the action of another self-interested agent. So the first agent must form beliefs about the second agent's beliefs about the beliefs of the first agent, and so on. Such constructions seem too complicated, but they are in fact the basis for the predictions of classical game theory, which assumes all agents to be fully rational. One quite famous result by Aumann [4] is that for an arbitrary perfect-information extensive-form game, the only behavior compatible with (1) common knowledge of rationality and (2) each agent best responding to their knowledge is for each agent to play according to the strategy obtained by backward induction. Aumann and Brandenburger [5] similarly showed that common knowledge of rationality, the game payoffs, and the other agents' beliefs is a sufficient condition for Nash equilibrium in an arbitrary game. In this regard, the most accepted solution concept for a non-cooperative game is that of a Nash equilibrium, introduced by John F. Nash [6]. Loosely speaking, a Nash equilibrium is a state of a non-cooperative game where no player can improve its utility by changing its strategy, if the other players maintain their current strategies. Of course players also use information and beliefs about other players, so we can say that (in a Nash equilibrium) beliefs and incentives are important to understand why players choose strategies in real situations.
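The "no profitable unilateral deviation" condition is easy to state in code. The following Python sketch (the payoff matrices are a hypothetical 2x2 example, not data from our games) enumerates deviations to test whether a profile is a Nash equilibrium:

```python
def is_nash(u1, u2, profile):
    """Check the NE condition for a two-player game given as payoff matrices:
    u1[a][b] and u2[a][b] are the payoffs when player 1 plays a and player 2 plays b."""
    a, b = profile
    # player 1: is there a better row against b?
    if any(u1[x][b] > u1[a][b] for x in range(len(u1))):
        return False
    # player 2: is there a better column against a?
    if any(u2[a][y] > u2[a][b] for y in range(len(u2[0]))):
        return False
    return True

# A prisoner's-dilemma-style example (hypothetical payoffs): 0 = cooperate, 1 = defect
u1 = [[3, 0], [5, 1]]
u2 = [[3, 5], [0, 1]]
equilibria = [(a, b) for a in (0, 1) for b in (0, 1) if is_nash(u1, u2, (a, b))]
print(equilibria)  # [(1, 1)]: mutual defection is the unique NE
```

In this toy example mutual defection is the only profile where neither player gains by deviating alone, which is exactly the state described above.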
The NE is the core concept of game theory, but it often differs from what experiments, and sometimes reality, show. In some games humans demonstrate convergence to equilibrium, but in others they do not. The gap between similarly looking games is slim and not easy to catch. We will consider guessing games as a playground to work with players' behavior.

2. Guessing games history

In the early 1990s Rosemarie Nagel started a series of experiments on guessing games, summarized in [7]. She wasn't the first to invent the games; they had been used during lectures by different game theory researchers (for example Moulin [8]). In a recent work [9] the authors provide extensive research into the origins of the guessing game, with unexpected links to the editor of the French magazine "Jeux & Stratégie" Alain Ledoux, who, as far as is known today, was the first to use the rules and then publish an article about the unusual patterns observed [10]. In any case, the work of Nagel [7] was the first experimental attempt to investigate the hidden patterns in the guessing game, and in that work the framework of k-level models was proposed. Later, Ho et al. [11] gave the game the name "p-beauty contest", inspired by Keynes' comparison of stock market instruments and newspaper beauty contests. The beauty contest game (BCG) has become an important tool to measure the "depth of reasoning" of a group of people using simple abstract rules. To begin with, we should note that behavioral game theory aims to develop models which explain human behavior in a game-theoretic setup more accurately, based both on experiments and theory [3]. There are two main approaches to the problem of replacing full rationality with bounded rationality. The first view is to consider boundedness as error. For example, the quantal response notion [12] or 𝜖-equilibrium [13] assume that agents make an error by choosing a non-optimal strategy profile. They play a near-optimal response, because they do not have the capacity to calculate the exact best action.
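The logit form of quantal response can be sketched in a few lines. In this Python illustration (the precision parameter lam and the payoffs are assumptions for demonstration, not taken from the paper), choice probabilities are proportional to exponentiated payoffs, so costly errors are less likely than cheap ones:

```python
import math

def logit_choice(payoffs, lam):
    """Logit quantal response: P(s) is proportional to exp(lam * u(s)).
    lam = 0 gives uniform random play; as lam grows, play approaches best response."""
    weights = [math.exp(lam * u) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

payoffs = [1.0, 2.0, 4.0]          # hypothetical utilities of three strategies
print(logit_choice(payoffs, 0.0))  # uniform: each strategy has probability 1/3
print(logit_choice(payoffs, 5.0))  # almost all mass on the best strategy
```

The single parameter lam thus interpolates between the 0-level (random) player and the fully rational best responder.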
The second approach is to treat bounded rationality as a structural property of an agent's reasoning process. One of the most prominent classes of models of this type is the iterative models scheme. It includes the k-level reasoning [7, 14], cognitive hierarchy [15] and quantal cognitive hierarchy [16] models. All these models consider boundedness an immanent part of reasoning. Each agent has a non-negative integer level representing the degree of strategic reasoning (i.e., modeling of recursive beliefs) of which the agent is capable. Level-0 agents are nonstrategic: they do not model other agents' beliefs or actions at all; level-1 agents model level-0 agents' actions; level-2 agents model the beliefs and actions of level-1 agents; and so forth [17]. In this work we support the latter idea, analyzing experimental data to estimate how the numbers of players at different levels change with learning and teaching.

2.1. Learning models

Recently, game theorists began to actively research the process of reasoning towards the equilibrium. Two prominent simple learning models are reinforcement and belief learning (e.g., fictitious play). In reinforcement, strategies have numerical attraction levels which are reinforced (increased) when a strategy is chosen and the result is good. Reinforcement is a good model of animal learning but does not gracefully accommodate the fact that people often choose strategies that have not been directly reinforced. In fictitious play, players form beliefs based on a weighted average of what others have done in the past, and best-respond given their beliefs. Remarkably, weighted fictitious play is equivalent to a generalized reinforcement model in which unchosen strategies are reinforced by the forgone payoffs they would have yielded. There are many other approaches; we will mention one which enriches 0-level reasoning [16]. Specifically, the authors investigate general rules that can be used to induce a level-0 specification from the normal-form description of an arbitrary game. We can also note the work by Gill and Prowse [18], where participants were tested on cognitive abilities and character skills before the experiments. The authors then performed statistical analysis to understand the impact of such characteristics on the quality of strategic decision making (using a p-beauty contest game with multiple rounds). In a more recent work by Fe et al. [19] even more elaborate experiments are presented. It is interesting that in the mentioned paper the experiments are very strict and rigorous (as close to laboratory purity as possible), in contrast to the games played in our research. But at the end of the day the results do not differ very much. As far as we know, there are not many works on game theory experiments with children. In our previous works [20, 21] we presented data from games with participants 15–18 years old. There is a master's thesis by Povea and Citak [22] studying the behaviour of children aged 8–11 in a beauty contest game with ten repetitions. The authors found evidence that children are able to play a beauty contest game using not only cognitive skills but also empathy. To deal with these problems, computer simulation, mainly agent-based modeling (ABM), can be used. Agent-based models are essentially a tool to discover patterns in behavior that emerge from simple rules, i.e. micro behavior. Agent-based modeling of the guessing game is not a very developed area of research; for example, see the paper by Nichols and Radzicki [23].

3. Experiments setup

We claim that our setup is closer to reality than to the laboratory, and this is the point of this research: how people learn in real-world situations. All games were played under the following conditions:

1. Games were played during a lecture about game theory. Participants were asked not to comment on or discuss their choice until they submitted it.
However, this rule wasn't enforced, so they usually had this possibility if they wanted it;
2. Participants were not rewarded for winning. The winner was announced (and so got some "good feelings"), but no more;
3. During some early games we used pieces of paper and got some percentage of joking or trash submissions, usually very small. Later we switched to Google Forms, which is a better tool to control submissions (for example, only natural numbers are allowed);
4. Google Forms makes multiple submissions possible (under different names), since we didn't have time for verification, but the total number of submissions allows this to be controlled to some extent.

The aim of this setup was to leave participants free to explore the rules and give them the flexibility to make decisions in an uncertain environment. We think it is closer to real-life learning without immediate rewards than laboratory experiments are. Naturally, this setup has strong and weak sides. Let's summarize both. The strong sides are:

1. This setup allows us to measure how people make decisions in "almost real" circumstances and to understand the (possible) difference from laboratory experiments;
2. These games are part of an integrated approach to active learning, where games are mixed with explanations of the concepts of game theory (rationality, expected payoff, Nash equilibrium etc.), which allows participants to combine experience with theory;

Table 1
Summary of the first game by experiment id and type of players.
Explanation of the columns is given in the text.

Id  Type             Age    Round  Average  Winning  Zlevel  Median  Count  Irrationality
1   Alternative H    12-14  1      66.7     44.5     69.23   78      13     46.15
1   Alternative H    12-14  2      3.91     2.61     0       3.5     12     0
2   Alternative M    12-14  2      42.82    28.54    23.52   45.0    17     0
2   Alternative M    12-14  2      24.37    16.24    0       26.5    16     0
3   Adults                  1      40.57    27.05    31.57   40.0    19     5.26
4   Alternative H    12-14  1      52.54    35.03    63.63   55      11     9.09
4   Alternative H    12-14  2      15.41    10.27    8.33    6       12     8.33
5   Adults                  1      22.98    15.32    11.76   17.0    102    0
6   TechSchool       16-18  1      43.41    28.94    35.29   45.0    51     3.92
6   TechSchool       16-18  2      46.5     30.99    35.48   29.0    62     32.25
7   Math lyceum      16-18  1      30.58    20.38    16      27.5    50     2.0
7   Math lyceum      16-18  2      14.26    9.5      5.26    7       57     5.26
8   Math lyceum      15-16  1      37.06    24.71    20.68   33.0    29     3.44
8   Math lyceum      15-16  2      26.20    17.47    10.34   17.0    29     6.89
9   Math lyceum      14-16  1      42.0     27.99    44.44   42.5    18     11.11
9   Math lyceum      14-16  2      23.1     15.39    5.0     19.0    20     0
10  Ordinary school  14-16  1      48.69    32.46    46.15   46.5    26     0
10  Ordinary school  14-16  2      19.78    13.18    0       22.0    23     0
11  DS conference           1      37.25    24.83    28.33   33.0    60     8.33
11  DS conference           2      21.44    14.29    15.78   9.0     57     12.28
12  Students                1      42.40    28.27    33.33   40.0    27     3.7
13  Students                1      27.37    18.24    12.5    25.5    8      0
13  Students                2      8.62     5.74     0       8.5     8      0
14  Math lyceum      14-16  1      41.05    27.37    22.22   35.0    18     11.11
14  Math lyceum      14-16  2      17.23    11.49    5.88    13.0    17     0
15  Adults                  1      34.32    22.88    20.73   30.0    82     1.21
15  Adults                  2      12.48    8.32     2.19    8.0     91     2.19
16  Adults                  1      43.05    28.70    33.96   40.0    53     1.88
16  Adults                  2      14.69    9.79     1.88    11.0    53     1.88
17  Adults                  1      50.33    33.55    41.66   50.0    12     8.33
17  Adults                  2      13.50    8.99     0       12.0    46     0
18  Math lyceum      14-16  1      41.72    27.81    36.36   37.0    11     9.09
18  Math lyceum      14-16  2      26.36    17.57    0       30.0    11     0
19  Math lyceum      14-16  1      29.43    19.62    13.63   25.0    44     0
19  Math lyceum      14-16  2      27.25    18.16    20.45   9.5     44     20.45

3. Freedom and responsibility. The rules do not regulate manipulations of the conditions.
So this setup allows us (indirectly) to measure the preferences of players: do they prefer to cheat the rules, just make a random choice without thinking, or put effort into solving the task. The weak sides are:

1. Some percentage of players made "garbage" decisions, for example choosing an obviously worse option just to spoil the efforts of others;
2. Kids had (and often used) the possibility to discuss their decision with their neighbors;
3. Sometimes participants (especially kids) lost concentration and didn't think about the game, but made a random choice or didn't make a decision at all;
4. Even with the simplest rules, sometimes participants failed to understand the game the first time. We suppose this is due to the conditions of a lecture with (usually) 30–40 persons around.

3.1. Rules

All games have the same preamble: participants are asked to choose an integer in the range 1–100, margins included. Note that some setups investigated in the references use a range starting with 0, but the difference is small. To allow quick calculation of the choices we used a QR code with a link to a Google form where participants input their number. All answers were anonymous (players indicated nicknames so the winners could be announced, but then all records were anonymized). The winning condition is specific to each game:

1. p-beauty contest. The winning number is the closest to 𝑝 times the average. Usually 𝑝 = 2/3, but we used other setups as well;
2. Two-equilibrium game. The winning number is the furthest from 𝑝 times the average. Usually 𝑝 = 1 if not explicitly mentioned;
3. Coordination with assurance. The winning number is the number chosen by plurality. In case of a tie, the lower number wins;
4. No-equilibrium game. The winning number is the smallest unique one.

All these games are well known in game theory; we refer to [21], where their properties are summarized, and do not repeat them here.

4. Results and data analysis

In this section we present a summary of the data gathered during the games.

4.1.
First game

A summary of the results of the first game is given in Table 1. The column descriptions are:

• id is the id of the experiment;
• type is the type of group. Alternative H and M stand for alternative schools (not in the governmental system) with humanitarian and mathematical directions respectively. Math lyceum also covers summer camps with participants from different lyceums;
• age is the approximate age of participants, indicated only for children, to distinguish a possible borderline between stages of strategic reasoning;
• round is the round of the game;
• average is the average of the choices;
• winning number is the average * 0.66;
• zlevel is the percentage of players choosing numbers bigger than 50; it is an estimate of the share of 0-level players in this round. As one can expect, it declines with the round;
• median is the median of the choices (sometimes it is more informative than the average);
• count is the number of choices;
• irrationality is the percentage of choices bigger than 90.

Almost all winning numbers fall (roughly) within the experimental margins obtained by Nagel [7]: a winning number no bigger than 36 and no smaller than 18 in the first round. There were two exceptions in our experiments. One was a Facebook on-line game (15.3), where players can read information about the game in, for example, Wikipedia. The other was the alternative humanitarian school (40.1), where participants seemingly did not grasp the rules the first time.

4.1.1. Metrics and analysis

The first metric to observe is the percentage of "irrational choices": choices that can't win in (almost) any case. Let us explain. Imagine that all players choose 100. It is implausible in practice but not forbidden. In this case everybody wins, but if just one player deviates to a smaller number, he or she will win and the others will lose. So playing numbers bigger than 66 is not rational, unless you do not want to win.
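The columns of Table 1 are simple statistics of the raw submissions. A Python sketch of how one round is summarized (the thresholds 50 and 90 and the multiplier 0.66 come from the column definitions above; the sample data is made up, and the repository scripts are in R):

```python
from statistics import mean, median

def round_summary(choices):
    """Compute the Table 1 columns for one round of the first game."""
    n = len(choices)
    avg = mean(choices)
    return {
        "average": avg,
        "winning": 0.66 * avg,                                    # roughly 2/3 of the average
        "zlevel": 100 * sum(c > 50 for c in choices) / n,         # % above 50, the 0-level estimate
        "median": median(choices),
        "count": n,
        "irrationality": 100 * sum(c > 90 for c in choices) / n,  # % of choices above 90
    }

print(round_summary([10, 33, 50, 66, 95]))  # made-up submissions, not experimental data
```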
And here we come to an important point: in all previous experiments this metric drops in the second round and is usually very low (less than 5%) [11]. But in our case there are experiments where this metric becomes higher or changes only slightly, and the initial values are much higher than expected. So here we should include a factor of special behavior, which we can call "let's show this lecturer how we can cheat his test!". What is more interesting, this behavior is more pronounced among adults than kids. It is also interesting to see the distribution of choices for different types of groups. We summarize the choices in histograms (figure 1). Among models of strategic thinking we adopt the theory of k-levels. According to this idea, 0-level reasoning means that players make random choices (drawn from a uniform distribution), and k-level reasoning means that players best-respond to the reasoning of the previous level. So 1-level reasoning is to play 33, which is the best response to the belief that the average will be 50; 2-level is the best response to the belief that players will play 33; and so on. As we can see from figure 1, some spikes in the choices are predicted very well, but this depends on the background of the players. The best prediction is for attendees of the Data Science conference, which presumes a high level of cognitive skills and a computer science background. In figure 2 we can see boxplots defined by the number of players with different levels of perception. The levels are defined in the next subsection, but we can already see a pattern of behavior. The number of "irrational" choices (big numbers) decreases, as does the number of "next-to-win-but-bigger" numbers. The number of 2-level choices, especially after the explanation of the equilibrium concept, grows substantially, while the number of "too smart" choices (choices from [1,5]) stays basically the same.
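The k-level recursion described above boils down to repeated multiplication by p. A minimal Python sketch, assuming a uniform level-0 anchor with mean 50:

```python
def k_level_choice(k, p=2/3, anchor=50):
    """Level-0 is assumed uniform on [1,100] (mean about 50); each higher level
    best-responds to the previous one, multiplying the anticipated average by p."""
    choice = anchor
    for _ in range(k):
        choice *= p
    return choice

print([round(k_level_choice(k)) for k in range(5)])  # [50, 33, 22, 15, 10]
```

These values are exactly the spikes one looks for in the histograms of figure 1.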
An interesting hypothesis, which needs to be tested in detail, can be formulated: a higher number of choices from [50,100] in the first round leads to a higher number of choices from [1,5] in the second round, and vice versa.

Figure 1: Histogram of choices for each round.

Another metric [24] is how much smaller the winning choice is in the second round than in the first. According to the concept of multi-level reasoning, every player in this game tries their best to win but cannot make all the steps to the winning idea. So there are players who have only 0-level reasoning; they choose random numbers. First-level players choose 33, which is the best response to players of 0-level, and so on. Based on the result of the first round and, in fact, the explanation of the Nash equilibrium, players should know that it is better to choose much lower numbers. But the graph shows that the decrease is quite moderate. Only the students show good performance in this respect. And the tech school shows a (small) increase in the winning number in the second round!

4.1.2. Levels of reasoning analysis

Another point about the process of learning in this game is how players' decisions are distributed over the space of strategies. We claim that there is a distinct difference in the changes between the first and second rounds for different groups. To perform this analysis we apply the idea of k-level thinking, simplified as follows. First, we define b-level players: players who choose numbers from the range [50,100]. These are beginner players, who do not understand the rules (play randomly), do not expect to win, or want to lose intentionally (for reasons discussed above). The substantiation for this range is that numbers higher than 50 did not win in any game.

Figure 2: Change in winning number for number of participants.

The second level we call m-level; it covers the range [18,50]. It is for players with middle levels of reasoning; usually the first-round winning number is in this range (and in part of the second rounds as well).
The third level is h-level, covering the range [5,18]. It is for high-level reasoning. Finally, inf-level (the [1,5] range) is for the "almost common knowledge" level of thinking. Calculating the number of players at each level for each game, we can estimate the change (as a percentage of the number of players) in adopting the different strategy levels. There are some limitations to this approach:

• the number of players changed between rounds, since not everyone participated (it was an option, not an obligation);
• the limits of the ranges are not defined by a model or data. How to define the levels in the best way can be a future direction of research.

Results are presented in table 2. What conclusions can we draw from this data? There is no clear-cut difference in the changes, but at least we can summarise a few points:

Table 2
Summary of change in strategy levels

Type                      b-difference  m-difference  h-difference  inf-difference
Alternative humanitarian  -72           -8            0             72
Alternative mathematical  -24           -6            30            -6
Alternative humanitarian  -52           0             17            43
Math lyceum               -9            -36           24            34
Math lyceum               -10           -24           28            7
Ordinary school           -49           12            20            4
DS conference attendees   -14           -32           14            27
MS students               -12           -50           50            12
Alternative mathematical  -17           -34           23            23
DS conference attendees   -17           -30           23            35
Business                  -32           -17           21            28

• Usually, after the first round and the explanation of the equilibrium concept, there is a decrease in the b-level and m-level;
• Symmetrically, there is an increase in the two other levels, but sometimes it is more distributed and sometimes it goes (almost) all to the inf-level;
• The last situation is more likely to happen in schools, where kids are less critical of new knowledge;
• Usually the second-round winning choice is in the realm of the h-level, so the groups with the biggest increase in this parameter are the ones with better understanding.

4.1.3. Size and winning choice

This game is indeed rich for investigation; let us formulate the last (in this paper) finding about it. Can we in some way establish a connection between the number of players and the winning number (actually, the strategies players choose during the game)?
To clarify our idea, see figure 3. It is a scatter plot of a two-dimensional variable: the x-axis is the number of participants in the game and the y-axis is the winning choice per round. Different colors mark the different types of groups where the games were played. As we can observe, the first and second rounds form two separate clusters. This is expected and tells us that players learned about the equilibrium concept between rounds and applied it in practice. There is also a mild tendency for smaller groups to have bigger winning numbers; at least the variation is bigger. It is still too bold to claim a connection between the size of the group and the winning number, but probably the reason is that when the size of the group is bigger, the number of "irrational" players increases. It can be due to a stable percentage of such persons in any group or to other reasons, but it is an interesting connection to investigate.

Figure 3: Change in winning number for number of participants.

4.1.4. Intentionally irrational?

Another interesting finding is that after the first round finished, after observing the result and listening to the explanation of the NE, the number of players who choose over 90 (an obviously non-winning choice) increases. This is actually not accidental; the data from [22] also show an increase in rounds 2–5. We believe that this is quite an important part of the play. This phenomenon is especially visible in high school children with a strong math background (usually they have more freedom and self-confidence in choosing non-standard strategies).

4.2. Second game

In the second game the key point is to understand that almost all strategies are dominated. One can see that the average can be bigger or smaller than 50, and accordingly the winning choice will be 1 or 100. It is worth noting that the popular nature of these experiments and the freedom to participate make data gathering not easy. For example, many participants simply didn't make any decision in the second game. Results are summarised in table 3.
We refine players' decisions to see how many players made rationalizable choices (Bernheim [25]), that is, best responses to some strategy profile of the other players. In this game only two best responses are possible (in pure strategies), namely 1 and 100. This is a remarkable result: players without prior communication come to an almost perfect mixed equilibrium, with almost the same percentage choosing 1 and 100. This is even more striking taking into account the lack of any prior knowledge about mixed strategies and mixed equilibrium: kids play it intuitively and without any communication.

Table 3
Second game. Rationalizable choices summary

Type                      Average  Choose 100  Choose 1  Count
Adults                    46.5     24.3%       24.34%    115
Alternative mathematical  43.8     25.9%       27.9%     27
Business                  50.6     29.3%       29.3%     99
DS conference attendees   37.4     15.8%       36.8%     114
Math lyceum               48.5     22.7%       24.7%     154
Ordinary school           51.2     30.4%       30.4%     23

To illustrate the mixed Nash learning by groups, we plot the dependency between the percentage of choices of 1 and of 100.

4.3. Third game

The third game is simpler than the first two; it is a coordination game where players should coordinate without a word. And, as predicted by Schelling [26], they usually do. The data presented in [27] show that 1 is the natural coordination point, with one exception: the tech school decided that it would be funny to choose the number 69. Probably the age (11th grade) is to blame here. We can also note attempts to coordinate around 7, 50 and 100. An interesting and paradoxical result, which is expected from general theory, is that with fewer options coordination is in fact more difficult. Consider the variant where players had to choose an integer from [1,10], only 10 choices. Compared to the game with 100 possible choices, coordination was very tricky: two numbers got almost the same result.

4.4. Fourth game

Here we just note that the winning numbers were: 12, 2, 4, 20.
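The winning rules of the third and fourth games are easy to state precisely. A Python sketch (an illustration only; the submission data shown is hypothetical and the repository scripts are in R):

```python
from collections import Counter

def plurality_winner(choices):
    """Third game: the number chosen by plurality; ties go to the lower number."""
    counts = Counter(choices)
    return min(counts, key=lambda c: (-counts[c], c))

def smallest_unique_winner(choices):
    """Fourth game: the smallest number chosen exactly once (None if every number repeats)."""
    counts = Counter(choices)
    unique = [c for c in choices if counts[c] == 1]
    return min(unique) if unique else None

print(plurality_winner([1, 1, 69, 69, 69, 7]))     # 69 wins by plurality
print(smallest_unique_winner([2, 2, 3, 5, 3, 7]))  # 5 is the smallest unique choice
```

Note that the fourth rule may produce no winner at all, which reflects the absence of a pure-strategy equilibrium discussed below.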
Since no equilibrium was theoretically found here, at this stage we can only gather data and formulate hypotheses about where to find one. All experimental data and the R file for the graphs can be accessed in the open repository [27].

5. Agent-based model

The existence of irrational behavior challenges the basic game-theoretic assumptions of self-interest and the capability to calculate the best option. In other words, real people do not think like machines or algorithms. They form hypotheses or expectations using simple rules. These rules are influenced by emotions and social norms, and can also change depending on feedback (be reinforced). This use of inductive reasoning leads to two issues. First, what rules do people follow? Second, supposing we know these rules, how do we model the behavior of many interacting, heterogeneous agents in such a situation?

5.1. NetLogo

NetLogo is an environment for agent-based modeling and simulation based on a high-level programming language. It is freely available, open-source, multi-platform software. In NetLogo agents live in a "world": a rectangular grid where agents of different types can be created. There are four types of agents: patches, turtles, links and the observer. Patches are fixed pieces of "ground" that make up the world. Turtles can move through patches any distance and in any direction at every time step. Links connect at least two agents and are represented by lines. The observer is the overseeing agent that gives the other agents instructions and makes changes in the world; it is responsible for holding a set of global variables. Time passes as discrete steps called ticks, in which all agents can act.

5.2.
Model

Epstein [28] defines the following characteristics of an agent-based model: (1) heterogeneity: agents differ in some ways; (2) autonomy: each agent makes its own decisions; (3) explicit space: agents interact in a given environment; (4) local interaction: agents generally interact with their neighbors and immediate environment; (5) bounded rationality: agents have limited information and computing power, and agent behavior is generated by simple rules that may adapt over time; (6) non-equilibrium dynamics.

As a first step towards a model of guessing-game players we propose an agent-based model (it can be downloaded from the repository with the experimental data) with characteristics 1, 2 and 5. The structure of the model is as follows. Agents have three properties: level, irrationality and choice. Level is the agent's current level of understanding, irrationality is a flag variable that enables possible irrational behavior, and choice is the current number chosen. The setup procedure allows creating different distributions of initial levels, and p is the game multiplier. At each step every agent chooses a number according to simple rules. The code is uploaded to github; here we just informally explain the main idea. The agent's level defines the number: agents with level 0 choose randomly from [1,100], level 1 agents choose from [45,55], and so on. The margins are adjustable and will be tuned on real data in future work. After the round the winner is determined, and each player whose choice is bigger than the winning number increases its level of understanding. In this model we consider only five levels, where level 5 means common knowledge and the player chooses 1. When there is no irrationality in the model we observe the typical convergence to equilibrium (figure 4, left), and this is a stable pattern. But, as we already know from the experiments, this is not what we observe in real life, so irrational behavior was included to match the pattern in the data.
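The informal rules above can also be sketched outside NetLogo. The following Python re-implementation is a minimal sketch under stated assumptions: the choice bands beyond levels 0 and 1, the winning rule (closest to p times the mean, with p = 2/3), and the anger probability are illustrative placeholders to be tuned on real data.

```python
import random

# Illustrative choice bands per reasoning level; levels 0 and 1 follow
# the text, the intermediate bands are our assumptions.
BANDS = {0: (1, 100), 1: (45, 55), 2: (25, 40), 3: (12, 24), 4: (2, 11)}

class Agent:
    def __init__(self):
        self.level = 0        # current level of understanding (0..5)
        self.angry = False    # irrational mode: choose 100 next round
        self.choice = None

    def choose(self, rng):
        if self.angry:
            self.angry = False
            self.choice = 100          # intentionally "bad" move
        elif self.level >= 5:
            self.choice = 1            # common knowledge: equilibrium play
        else:
            lo, hi = BANDS[self.level]
            self.choice = rng.randint(lo, hi)
        return self.choice

def play_round(agents, rng, p=2/3, anger_prob=0.1):
    """One round: the choice closest to p * mean wins; losers above the
    winning number learn (level up), and a loser may turn 'angry'."""
    choices = [a.choose(rng) for a in agents]
    target = p * sum(choices) / len(choices)
    winning = min(choices, key=lambda c: abs(c - target))
    for a in agents:
        if a.choice > winning:
            a.level = min(a.level + 1, 5)
        if a.choice != winning and rng.random() < anger_prob:
            a.angry = True
    return winning

rng = random.Random(1)
agents = [Agent() for _ in range(30)]
wins = [play_round(agents, rng) for _ in range(15)]
print(wins)  # winning numbers tend to fall towards the equilibrium choice 1
```

With anger_prob set to 0 the winning number declines steadily, as in figure 4 (left); with a positive anger probability occasional jumps appear, yet convergence still wins out, as in figure 4 (right).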
Irrationality in our model is implemented as "anger": a player who is currently losing sometimes goes into irrational mode and chooses 100 in the next round. This leads to an interesting pattern (figure 4, right): the winning number sometimes increases in the second round, but convergence to equilibrium is nevertheless inevitable.

Figure 4: Plots of players' winning choice over rounds.

6. Conclusions

In this paper we have presented experimental data, an analysis and explanation of the observed patterns, and an agent-based model that can be calibrated in future work. We were able to confirm the existence of a pattern in decision making: every group behaves in almost the same way when dealing with an unknown strategic situation. We also found some deviations, which is an interesting point of difference from the more "laboratory" setups of existing research. Finally, we built an agent-based model and showed how simple rules can lead to complex player behavior. The main findings of the paper are the following:

1. To learn the rules you need to break them. Participants chose obviously non-winning moves (> 66) partly because the situation was new and they had trouble understanding the rules. But a high percentage of such choices was also present in the second round, when the players knew exactly what was going on. This effect was especially notable for high-school students and adults and almost absent for special math schools and kids below 9th grade. We can formulate the hypothesis that high school is the age of experimentation, when children discover new things and are not afraid to do so.

2. If we consider the winning number as the decision of the group, we can see that groups learn fast and steadily. Even if some outliers choose 100, the mean still declines with every round. It seems there is an unspoken competition between players that improves the aggregated decision even when no prize is at stake. A scenario in which all participants choose higher numbers is actually plausible, but this did not happen in any experiment.
The closest case was the Tech school, where a bunch of pupils (possibly coordinating) switched to 100 and still only managed to keep the mean at the same level.

3. Still, it seems there is a stable percentage of people who choose around 100, and this is not about learning how to play the game. We think this is something like a "-1 level" of reasoning, when a player intentionally plays a "bad move", and it is an essential part of the model. If we neglect such persons and their motivation, our model will not be correct.

4. An agent-based model is the key element of computational analysis of game-theoretic models. We created a model with simple rules which shows behavior close to reality. Now we can simulate different rules and understand the limitations of the model.

References

[1] V. P. Crawford, M. A. Costa-Gomes, N. Iriberri, Structural models of nonequilibrium strategic thinking: Theory, evidence, and applications, Journal of Economic Literature 51 (2013) 5–62. doi:10.1257/jel.51.1.5.
[2] F. Mauersberger, R. Nagel, Levels of reasoning in Keynesian beauty contests: a generative framework, in: C. Hommes, B. LeBaron (Eds.), Handbook of Computational Economics, volume 4 of Handbook of Computational Economics, Elsevier, 2018, pp. 541–634.
[3] C. F. Camerer, Behavioral game theory: Experiments in strategic interaction, The Roundtable Series in Behavioral Economics, Princeton University Press, 2003.
[4] R. J. Aumann, Backward induction and common knowledge of rationality, Games and Economic Behavior 8 (1995) 6–19. doi:10.1016/S0899-8256(05)80015-6.
[5] R. Aumann, A. Brandenburger, Epistemic conditions for Nash equilibrium, Econometrica 63 (1995) 1161–1180. URL: http://www.jstor.org/stable/2171725.
[6] H. W. Kuhn, J. C. Harsanyi, R. Selten, J. W. Weibull, E. Van Damme, J. F. Nash Jr., P. Hammerstein, The work of John Nash in game theory, Journal of Economic Theory 69 (1996) 153–185. doi:10.1006/jeth.1996.0042.
[7] R.
Nagel, Unraveling in guessing games: An experimental study, The American Economic Review 85 (1995) 1313–1326. URL: https://www.cs.princeton.edu/courses/archive/spr09/cos444/papers/nagel95.pdf.
[8] H. Moulin, Game theory for the social sciences, 2nd ed., New York University Press, New York, 1986.
[9] R. Nagel, C. Bühren, B. Frank, Inspired and inspiring: Hervé Moulin and the discovery of the beauty contest game, Mathematical Social Sciences 90 (2017) 191–207.
[10] A. Ledoux, Concours résultats complets. Les victimes se sont plu à jouer le 14 d'atout, Jeux & Stratégie 2 (1981) 10–11.
[11] T.-H. Ho, C. Camerer, K. Weigelt, Iterated dominance and iterated best response in experimental "p-beauty contests", The American Economic Review 88 (1998) 947–969.
[12] C. F. Camerer, T.-H. Ho, J.-K. Chong, Sophisticated experience-weighted attraction learning and strategic teaching in repeated games, Journal of Economic Theory 104 (2002) 137–188.
[13] K. Leyton-Brown, Y. Shoham, Essentials of game theory: A concise, multidisciplinary introduction, Synthesis Lectures on Artificial Intelligence and Machine Learning 2 (2008) 1–88. doi:10.2200/S00108ED1V01Y200802AIM003.
[14] M. Costa-Gomes, V. P. Crawford, B. Broseta, Cognition and behavior in normal-form games: An experimental study, Econometrica 69 (2001) 1193–1235.
[15] C. F. Camerer, T.-H. Ho, J.-K. Chong, A cognitive hierarchy model of games, The Quarterly Journal of Economics 119 (2004) 861–898. doi:10.1162/0033553041502225.
[16] J. R. Wright, K. Leyton-Brown, Predicting human behavior in unrepeated, simultaneous-move games, Games and Economic Behavior 106 (2017) 16–37.
[17] J. R. Wright, K. Leyton-Brown, Models of level-0 behavior for predicting human behavior in games, CoRR abs/1609.08923 (2016). arXiv:1609.08923.
[18] D. Gill, V. Prowse, Cognitive ability, character skills, and learning to play equilibrium: A level-k analysis, Journal of Political Economy 124 (2016) 1619–1676. doi:10.1086/688849.
[19] E. Fe, D. Gill, V.
L. Prowse, Cognitive skills, strategic sophistication, and life outcomes, Working Paper Series 448, The University of Warwick, 2019.
[20] O. Ignatenko, Guessing games experiments in school education and their analysis, CEUR Workshop Proceedings 2732 (2020) 881–892.
[21] O. P. Ignatenko, Guessing games experiments in Ukraine. Learning towards equilibrium, in: S. Semerikov, V. Osadchyi, O. Kuzminska (Eds.), Proceedings of the Symposium on Advances in Educational Technology, AET 2020, University of Educational Management, SciTePress, Kyiv, 2022.
[22] E. Povea, F. Citak, Children in the beauty contest game: behaviour and determinants of game performance, Master's thesis, Norwegian School of Economics, 2019. URL: https://openaccess.nhh.no/nhh-xmlui/handle/11250/2611651.
[23] M. W. Nichols, M. J. Radzicki, An Agent-Based Model of Behavior in "Beauty Contest" Games, Working Paper 07-010, University of Nevada, 2007.
[24] W. Güth, M. Kocher, M. Sutter, Experimental 'beauty contests' with homogeneous and heterogeneous players and with interior and boundary equilibria, Economics Letters 74 (2002) 219–228. doi:10.1016/S0165-1765(01)00544-4.
[25] B. D. Bernheim, Rationalizable strategic behavior, Econometrica 52 (1984) 1007–1028.
[26] T. C. Schelling, The Strategy of Conflict: With a New Preface by The Author, Harvard University Press, 1980.
[27] O. P. Ignatenko, Data from experiments, 2021. URL: https://github.com/ignatenko/GameTheoryExperimentData.
[28] J. M. Epstein, Agent-based computational models and generative social science, Complexity 4 (1999) 41–60.