Strategic learning towards equilibrium. Exploratory analysis and models

Oleksii P. Ignatenko¹
¹ Institute of Software Systems NAS Ukraine, 40 Academician Glushkov Ave., Kyiv, 03187, Ukraine

Abstract
This paper deals with the strategic behavior of people, as observed in experiments. The research question of this work is how players (mainly children) learn in complex strategic situations which they have never faced before. We examine data from different games played during popular lectures about game theory and present findings about players' progress in strategic learning while competing with other players. Four "pick a number" games were investigated, all with similar-looking rules but very different properties. These games were introduced to very different groups of listeners. The data gathered is available in an open repository for replication and analysis. We analyse the data and propose an agent-based model of the beauty contest game that explains the observed behavior. Finally, we discuss the findings, propose hypotheses to investigate, and formulate open questions for future research.

Keywords
behavioral game theory, guessing game, k-beauty contest, active learning, R, agent-based modeling

1. Introduction

Game theory is a field of science which investigates decision-making under uncertainty. The source of the uncertainty can be the strategic structure, e.g. the probability of certain events or lack of information about future possibilities, or it can be the decisions of other agents. In the latter case we can talk about interdependence, that is, when the actions of some players affect the payoffs of others. Such situations arise around us every day and we, consciously or unconsciously, take part in them. Success relies heavily on our perception of the actions of other players. The problem is how we can know the future actions of other players.
Truth be told, we can't, but we can start with some assumptions that will eventually help create a framework, model or theory of "mind" which will predict future (reasonable) actions. Game theory proposed an approach which is now under question (especially from the side of experimental and behavioral economics); nevertheless, we will start from the standard notions and then proceed to the experimental data. One can expect that other players will play "reasonably", and by this game theory means that they will try to achieve a better result in some agent's sense. This idea is captured by the term rationality. Every rational player must calculate the best possible result, taking into account the rules of the game and the interests of other participants; in other words, think strategically. It is well known from theory that a rational player will play a Nash equilibrium (NE) if there is one, which is very useful in games where a unique NE exists. The notion of rationality was indeed fundamental for the development of game theory, especially its mathematical part. However, the problems with this notion are also quite numerous.

CoSinE 2021: 9th Illia O. Teplytskyi Workshop on Computer Simulation in Education, co-located with the 17th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2021), October 1, 2021, Kherson, Ukraine
o.ignatenko@gmail.com (O. P. Ignatenko)
https://www.nas.gov.ua/EN/PersonalSite/Pages/default.aspx?PersonID=0000004947 (O. P. Ignatenko)
0000-0001-8692-2062 (O. P. Ignatenko)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
First of all, it is very demanding, because it presupposes that an agent has complete, transitive preferences and is actually capable of computing the equilibrium in a given strategic situation. But this is simply not feasible in many real situations (for example, we know a NE exists in chess, but still no one can compute it). Secondly, and probably more importantly, there are many games where the NE is a poor prediction of actual human behavior. In this paper we investigate data from such games and discuss the difference. All this makes decision making a very interesting problem to investigate. This is a rich area of research, where the theoretical constructions of game theory seem to fail and experimental data shows unusual patterns. These patterns are persistent and usually do not depend on age, education, country and other factors. Over the last 25 years, behavioral game theory has examined in numerous studies bounded rationality (the closest concept to the rationality of game theory), cognitive distortions and the heuristics people use to reason in strategic situations. For example, we can note the surveys of Crawford et al. [1] and Mauersberger and Nagel [2]. There is also a comprehensive description of the field of behavioral game theory by Camerer [3]. We will concentrate on guessing games, which are a notable part of this research because of their simplicity for players and the easy analysis of their rules from a game-theoretic perspective. In this paper we present results of games played during 2018–2021 as part of popular lectures about game theory. The audience of these lectures was quite heterogeneous, but we can distinguish the following main groups: (1) children at schools (strong mathematical schools, ordinary schools, alternative education schools); (2) students (bachelor and master levels); (3) mixed adults with almost any background; (4) adults with business background; (5) participants of a Data Science School; (6) participants of summer STEM camps for children.
We propose a framework of four types of games, each presenting one idea or concept of game theory. These games were introduced to players with no prior knowledge (at least in the vast majority) of the theory. On the other hand, the games have simple formulations and clear winning rules, which makes them intuitively understandable even for kids. This makes these games a perfect choice to test the ability for strategic thinking and to investigate the process of understanding complex concepts during play, with immediate application to practice. This dual learning, as we can name it, shows how players try-and-learn in real conditions and react to the challenges of interaction with other strategic players. In the last section we build an agent-based model using the NetLogo environment and analyse its relevance to the experimental data. First we start with some definitions.

1.1. Game theory definitions and assumptions

We will consider games in strategic or normal form in a non-cooperative setup. Non-cooperativeness here does not imply that the players do not cooperate, but it means that any cooperation must be self-enforcing, without any coordination among the players. The strict definition is as follows.

A non-cooperative game in strategic (or normal) form is a triplet 𝐺 = {𝒩, {𝑆ᵢ}ᵢ∈𝒩, {𝑢ᵢ}ᵢ∈𝒩}, where:

• 𝒩 is a finite set of players, 𝒩 = {1, . . . , 𝑁};
• 𝑆ᵢ is the set of admissible strategies for player 𝑖;
• 𝑢ᵢ : 𝑆 → ℝ is the utility (payoff) function for player 𝑖, with 𝑆 = 𝑆₁ × · · · × 𝑆ₙ (the Cartesian product of the strategy sets).

A game is said to be static if the players take their actions only once, independently of each other. In some sense, a static game is a game without any notion of time, where no player has any knowledge of the decisions taken by the other players. Even though, in practice, the players may have made their strategic choices at different points in time, a game is still considered static if no player has any information on the decisions of others.
In contrast, a dynamic game is one where the players have some (full or imperfect) information about each others' choices and can act more than once. In this work we deal with static repeated games, which means that the same game is played twice (sometimes three times) with the same players. Agent rationality is a very important issue; sometimes it is called full rationality (to distinguish it from bounded rationality, a less restrictive notion). When a fully rational agent tries to find the best action, it usually depends on the action of another self-interested agent. So the first agent must form beliefs about the second agent's beliefs about the beliefs of the first agent, and so on. Such constructions seem too complicated, but they are in fact the basis for the predictions of classical game theory, which assumes all agents to be fully rational. One quite famous result by Aumann [4] is that for an arbitrary perfect-information extensive-form game, the only behavior compatible with (1) common knowledge of rationality and (2) each agent best responding to their knowledge is for each agent to play according to the strategy obtained by backward induction. Aumann and Brandenburger [5] similarly showed that common knowledge of rationality, the game payoffs, and the other agents' beliefs is a sufficient condition for Nash equilibrium in an arbitrary game. In this regard, the most accepted solution concept for a non-cooperative game is that of a Nash equilibrium, introduced by John F. Nash [6]. Loosely speaking, a Nash equilibrium is a state of a non-cooperative game where no player can improve its utility by changing its strategy, if the other players maintain their current strategies. Of course players also use information and beliefs about other players, so we can say that (in a Nash equilibrium) beliefs and incentives are important to understand why players choose strategies in real situations.
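The "no profitable unilateral deviation" condition is easy to state in code. The following Python sketch (the payoff matrices are a hypothetical 2x2 example, not data from our games) enumerates deviations to test whether a profile is a Nash equilibrium:

```python
def is_nash(u1, u2, profile):
    """Check the NE condition for a two-player game given as payoff matrices:
    u1[a][b] and u2[a][b] are the payoffs when player 1 plays a and player 2 plays b."""
    a, b = profile
    # player 1: is there a better row against b?
    if any(u1[x][b] > u1[a][b] for x in range(len(u1))):
        return False
    # player 2: is there a better column against a?
    if any(u2[a][y] > u2[a][b] for y in range(len(u2[0]))):
        return False
    return True

# A prisoner's-dilemma-style example (hypothetical payoffs): 0 = cooperate, 1 = defect
u1 = [[3, 0], [5, 1]]
u2 = [[3, 5], [0, 1]]
equilibria = [(a, b) for a in (0, 1) for b in (0, 1) if is_nash(u1, u2, (a, b))]
print(equilibria)  # [(1, 1)]: mutual defection is the unique NE
```

In this toy example mutual defection is the only profile where neither player gains by deviating alone, which is exactly the state described above.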
The NE is the core concept of game theory, but it often differs from what experiments, and sometimes reality, show. In some games humans demonstrate convergence to equilibrium, but in others they do not. The gap between similarly looking games is slim and not easy to catch. We will consider guessing games as a playground to work with players' behavior.

2. Guessing games history

In the early 1990s Rosemarie Nagel started a series of experiments on guessing games, summarized in [7]. She wasn't the first to invent the games; they had been used during lectures by different game theory researchers (for example Moulin [8]). In a recent work [9] the authors provide extensive research into the origins of the guessing game, with unexpected links to the editor of the French magazine "Jeux & Stratégie" Alain Ledoux, who, as far as is known today, was the first to use the rules and then publish an article about the unusual patterns observed [10]. In any case, the work of Nagel [7] was the first experimental attempt to investigate the hidden patterns in the guessing game, and in that work the framework of k-level models was proposed. Later, Ho et al. [11] gave the game the name "p-beauty contest", inspired by Keynes' comparison of stock market instruments and newspaper beauty contests. The beauty contest game (BCG) has become an important tool to measure the "depth of reasoning" of a group of people using simple abstract rules. To begin with, we should note that behavioral game theory aims to develop models which explain human behavior in a game-theoretic setup more accurately, based both on experiments and theory [3]. There are two main approaches to the problem of replacing full rationality with bounded rationality. The first view is to consider boundedness as error. For example, the quantal response notion [12] or 𝜖-equilibrium [13] assume that agents make an error by choosing a non-optimal strategy profile. They play a near-optimal response, because they do not have the capacity to calculate the exact best action.
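The logit form of quantal response can be sketched in a few lines. In this Python illustration (the precision parameter lam and the payoffs are assumptions for demonstration, not taken from the paper), choice probabilities are proportional to exponentiated payoffs, so costly errors are less likely than cheap ones:

```python
import math

def logit_choice(payoffs, lam):
    """Logit quantal response: P(s) is proportional to exp(lam * u(s)).
    lam = 0 gives uniform random play; as lam grows, play approaches best response."""
    weights = [math.exp(lam * u) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

payoffs = [1.0, 2.0, 4.0]          # hypothetical utilities of three strategies
print(logit_choice(payoffs, 0.0))  # uniform: each strategy has probability 1/3
print(logit_choice(payoffs, 5.0))  # almost all mass on the best strategy
```

The single parameter lam thus interpolates between the 0-level (random) player and the fully rational best responder.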
The second approach is to treat bounded rationality as a structural property of an agent's reasoning process. One of the most prominent classes of models of this type is the iterative models scheme. It includes the k-level reasoning [7, 14], cognitive hierarchy [15] and quantal cognitive hierarchy [16] models. All these models consider boundedness an immanent part of reasoning. Each agent has a non-negative integer level representing the degree of strategic reasoning (i.e., modeling of recursive beliefs) of which the agent is capable. Level-0 agents are nonstrategic: they do not model other agents' beliefs or actions at all; level-1 agents model level-0 agents' actions; level-2 agents model the beliefs and actions of level-1 agents; and so forth [17]. In this work we support the latter idea, analyzing experimental data to estimate how the numbers of players at different levels change with learning and teaching.

2.1. Learning models

Recently, game theorists began to actively research the process of reasoning towards the equilibrium. Two prominent simple learning models are reinforcement and belief learning (e.g., fictitious play). In reinforcement, strategies have numerical attraction levels which are reinforced (increased) when a strategy is chosen and the result is good. Reinforcement is a good model of animal learning but does not gracefully accommodate the fact that people often choose strategies that have not been directly reinforced. In fictitious play, players form beliefs based on a weighted average of what others have done in the past, and best-respond given their beliefs. Remarkably, weighted fictitious play is equivalent to a generalized reinforcement model in which unchosen strategies are reinforced by the forgone payoffs they would have yielded. There are many other approaches; we will mention one which enriches 0-level reasoning [16]. Specifically, the authors investigate general rules that can be used to induce a level-0 specification from the normal-form description of an arbitrary game. We can also note the work by Gill and Prowse [18], where participants were tested on cognitive abilities and character skills before the experiments. The authors then performed statistical analysis to understand the impact of such characteristics on the quality of strategic decision making (using a p-beauty contest game with multiple rounds). In a more recent work by Fe et al. [19] even more elaborate experiments are presented. It is interesting that in the mentioned paper the experiments are very strict and rigorous (as close to laboratory purity as possible), in contrast to the games played in our research. But at the end of the day the results do not differ very much. As far as we know, there are not many works on game theory experiments with children. In our previous works [20, 21] we presented data from games with participants 15–18 years old. There is a master's thesis by Povea and Citak [22] studying the behaviour of children aged 8–11 in a beauty contest game with ten repetitions. The authors found evidence that children are able to play a beauty contest game using not only cognitive skills but also empathy. To deal with these problems, computer simulation, mainly agent-based modeling (ABM), can be used. Agent-based models are essentially a tool to discover patterns in behavior that emerge from simple rules, i.e. micro behavior. Agent-based modeling of the guessing game is not a very developed area of research; for example, see the paper by Nichols and Radzicki [23].

3. Experiments setup

We claim that our setup is closer to reality than to the laboratory, and this is the point of this research: how people learn in real-world situations. All games were played under the following conditions:

1. Games were played during a lecture about game theory. Participants were asked not to comment on or discuss their choice until they submitted it.
However, this rule wasn't enforced, so they usually had this possibility if they wanted it;
2. Participants were not rewarded for winning. The winner was announced (and so got some "good feelings"), but no more;
3. During some early games we used pieces of paper and got some percentage of joking or trash submissions, usually very small. Later we switched to Google Forms, which is a better tool to control submissions (for example, only natural numbers are allowed);
4. Google Forms makes multiple submissions possible (under different names), since we didn't have time for verification, but the total number of submissions allows this to be controlled to some extent.

The aim of this setup was to leave participants free to explore the rules and give them the flexibility to make decisions in an uncertain environment. We think it is closer to real-life learning without immediate rewards than laboratory experiments are. Naturally, this setup has strong and weak sides. Let's summarize both. The strong sides are:

1. This setup allows us to measure how people make decisions in "almost real" circumstances and to understand the (possible) difference from laboratory experiments;
2. These games are part of an integrated approach to active learning, where games are mixed with explanations of the concepts of game theory (rationality, expected payoff, Nash equilibrium etc.), which allows participants to combine experience with theory;

Table 1
Summary of the first game by experiment id and type of players.
Explanation of the columns is given in the text.

Id  Type             Age    Round  Average  Winning  Zlevel  Median  Count  Irrationality
1   Alternative H    12-14  1      66.7     44.5     69.23   78      13     46.15
1   Alternative H    12-14  2      3.91     2.61     0       3.5     12     0
2   Alternative M    12-14  2      42.82    28.54    23.52   45.0    17     0
2   Alternative M    12-14  2      24.37    16.24    0       26.5    16     0
3   Adults                  1      40.57    27.05    31.57   40.0    19     5.26
4   Alternative H    12-14  1      52.54    35.03    63.63   55      11     9.09
4   Alternative H    12-14  2      15.41    10.27    8.33    6       12     8.33
5   Adults                  1      22.98    15.32    11.76   17.0    102    0
6   TechSchool       16-18  1      43.41    28.94    35.29   45.0    51     3.92
6   TechSchool       16-18  2      46.5     30.99    35.48   29.0    62     32.25
7   Math lyceum      16-18  1      30.58    20.38    16      27.5    50     2.0
7   Math lyceum      16-18  2      14.26    9.5      5.26    7       57     5.26
8   Math lyceum      15-16  1      37.06    24.71    20.68   33.0    29     3.44
8   Math lyceum      15-16  2      26.20    17.47    10.34   17.0    29     6.89
9   Math lyceum      14-16  1      42.0     27.99    44.44   42.5    18     11.11
9   Math lyceum      14-16  2      23.1     15.39    5.0     19.0    20     0
10  Ordinary school  14-16  1      48.69    32.46    46.15   46.5    26     0
10  Ordinary school  14-16  2      19.78    13.18    0       22.0    23     0
11  DS conference           1      37.25    24.83    28.33   33.0    60     8.33
11  DS conference           2      21.44    14.29    15.78   9.0     57     12.28
12  Students                1      42.40    28.27    33.33   40.0    27     3.7
13  Students                1      27.37    18.24    12.5    25.5    8      0
13  Students                2      8.62     5.74     0       8.5     8      0
14  Math lyceum      14-16  1      41.05    27.37    22.22   35.0    18     11.11
14  Math lyceum      14-16  2      17.23    11.49    5.88    13.0    17     0
15  Adults                  1      34.32    22.88    20.73   30.0    82     1.21
15  Adults                  2      12.48    8.32     2.19    8.0     91     2.19
16  Adults                  1      43.05    28.70    33.96   40.0    53     1.88
16  Adults                  2      14.69    9.79     1.88    11.0    53     1.88
17  Adults                  1      50.33    33.55    41.66   50.0    12     8.33
17  Adults                  2      13.50    8.99     0       12.0    46     0
18  Math lyceum      14-16  1      41.72    27.81    36.36   37.0    11     9.09
18  Math lyceum      14-16  2      26.36    17.57    0       30.0    11     0
19  Math lyceum      14-16  1      29.43    19.62    13.63   25.0    44     0
19  Math lyceum      14-16  2      27.25    18.16    20.45   9.5     44     20.45

3. Freedom and responsibility. The rules do not regulate manipulations of the conditions.
So this setup allows us (indirectly) to measure the preferences of players: do they prefer to cheat the rules, just make a random choice without thinking, or put effort into solving the task. The weak sides are:

1. Some percentage of players made "garbage" decisions, for example choosing an obviously worse option just to spoil the efforts of others;
2. Kids had (and often used) the possibility to discuss their decision with their neighbors;
3. Sometimes participants (especially kids) lost concentration and didn't think about the game, but made a random choice or didn't make a decision at all;
4. Even with the simplest rules, sometimes participants failed to understand the game the first time. We suppose this is due to the conditions of a lecture with (usually) 30–40 persons around.

3.1. Rules

All games have the same preamble: participants are asked to choose an integer in the range 1–100, margins included. Note that some setups investigated in the references use a range starting with 0, but the difference is small. To allow quick calculation of the choices we used a QR code with a link to a Google form where participants input their number. All answers were anonymous (players indicated nicknames so the winners could be announced, but then all records were anonymized). The winning condition is specific to each game:

1. p-beauty contest. The winning number is the closest to 𝑝 times the average. Usually 𝑝 = 2/3, but we used other setups as well;
2. Two-equilibrium game. The winning number is the furthest from 𝑝 times the average. Usually 𝑝 = 1 if not explicitly mentioned;
3. Coordination with assurance. The winning number is the number chosen by plurality. In case of a tie, the lower number wins;
4. No-equilibrium game. The winning number is the smallest unique one.

All these games are well known in game theory; we refer to [21], where their properties are summarized, and do not repeat them here.

4. Results and data analysis

In this section we present a summary of the data gathered during the games.

4.1.
First game

A summary of the results of the first game is given in Table 1. The column descriptions are:

• id is the id of the experiment;
• type is the type of group. Alternative H and M stand for alternative schools (not in the governmental system) with humanitarian and mathematical directions respectively. Math lyceum also covers summer camps with participants from different lyceums;
• age is the approximate age of participants, indicated only for children, to distinguish a possible borderline between stages of strategic reasoning;
• round is the round of the game;
• average is the average of the choices;
• winning number is the average * 0.66;
• zlevel is the percentage of players choosing numbers bigger than 50; it is an estimate of the share of 0-level players in this round. As one can expect, it declines with the round;
• median is the median of the choices (sometimes it is more informative than the average);
• count is the number of choices;
• irrationality is the percentage of choices bigger than 90.

Almost all winning numbers fall (roughly) within the experimental margins obtained by Nagel [7]: a winning number no bigger than 36 and no smaller than 18 in the first round. There were two exceptions in our experiments. One was a Facebook on-line game (15.3), where players can read information about the game in, for example, Wikipedia. The other was the alternative humanitarian school (40.1), where participants seemingly did not grasp the rules the first time.

4.1.1. Metrics and analysis

The first metric to observe is the percentage of "irrational choices": choices that can't win in (almost) any case. Let us explain. Imagine that all players choose 100. It is implausible in practice but not forbidden. In this case everybody wins, but if just one player deviates to a smaller number, he or she will win and the others will lose. So playing numbers bigger than 66 is not rational, unless you do not want to win.
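The columns of Table 1 are simple statistics of the raw submissions. A Python sketch of how one round is summarized (the thresholds 50 and 90 and the multiplier 0.66 come from the column definitions above; the sample data is made up, and the repository scripts are in R):

```python
from statistics import mean, median

def round_summary(choices):
    """Compute the Table 1 columns for one round of the first game."""
    n = len(choices)
    avg = mean(choices)
    return {
        "average": avg,
        "winning": 0.66 * avg,                                    # roughly 2/3 of the average
        "zlevel": 100 * sum(c > 50 for c in choices) / n,         # % above 50, the 0-level estimate
        "median": median(choices),
        "count": n,
        "irrationality": 100 * sum(c > 90 for c in choices) / n,  # % of choices above 90
    }

print(round_summary([10, 33, 50, 66, 95]))  # made-up submissions, not experimental data
```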
And here we come to an important point: in all previous experiments this metric drops in the second round and is usually very low (less than 5%) [11]. But in our case there are experiments where this metric becomes higher or changes only slightly, and the initial values are much higher than expected. So here we should include a factor of special behavior, which we can call "let's show this lecturer how we can cheat his test!". What is more interesting, this behavior is more pronounced among adults than kids. It is also interesting to see the distribution of choices for different types of groups. We summarize the choices in histograms (figure 1). Among models of strategic thinking we adopt the theory of k-levels. According to this idea, 0-level reasoning means that players make random choices (drawn from a uniform distribution), and k-level reasoning means that players best-respond to the reasoning of the previous level. So 1-level reasoning is to play 33, which is the best response to the belief that the average will be 50; 2-level is the best response to the belief that players will play 33; and so on. As we can see from figure 1, some spikes in the choices are predicted very well, but this depends on the background of the players. The best prediction is for attendees of the Data Science conference, which presumes a high level of cognitive skills and a computer science background. In figure 2 we can see boxplots defined by the number of players with different levels of perception. The levels are defined in the next subsection, but we can already see a pattern of behavior. The number of "irrational" choices (big numbers) decreases, as does the number of "next-to-win-but-bigger" numbers. The number of 2-level choices, especially after the explanation of the equilibrium concept, grows substantially, while the number of "too smart" choices (choices from [1,5]) stays basically the same.
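The k-level recursion described above boils down to repeated multiplication by p. A minimal Python sketch, assuming a uniform level-0 anchor with mean 50:

```python
def k_level_choice(k, p=2/3, anchor=50):
    """Level-0 is assumed uniform on [1,100] (mean about 50); each higher level
    best-responds to the previous one, multiplying the anticipated average by p."""
    choice = anchor
    for _ in range(k):
        choice *= p
    return choice

print([round(k_level_choice(k)) for k in range(5)])  # [50, 33, 22, 15, 10]
```

These values are exactly the spikes one looks for in the histograms of figure 1.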
An interesting hypothesis, which needs to be tested in detail, can be formulated: a higher number of choices from [50,100] in the first round leads to a higher number of choices from [1,5] in the second round, and vice versa.

Figure 1: Histogram of choices for each round.

Another metric [24] is how much smaller the winning choice is in the second round than in the first. According to the concept of multi-level reasoning, every player in this game tries their best to win but cannot make all the steps to the winning idea. So there are players who have only 0-level reasoning; they choose random numbers. First-level players choose 33, which is the best response to players of 0-level, and so on. Based on the result of the first round and, in fact, the explanation of the Nash equilibrium, players should know that it is better to choose much lower numbers. But the graph shows that the decrease is quite moderate. Only the students show good performance in this respect. And the tech school shows a (small) increase in the winning number in the second round!

4.1.2. Levels of reasoning analysis

Another point about the process of learning in this game is how players' decisions are distributed over the space of strategies. We claim that there is a distinct difference in the changes between the first and second rounds for different groups. To perform this analysis we apply the idea of k-level thinking, simplified as follows. First, we define b-level players: players who choose numbers from the range [50,100]. These are beginner players, who do not understand the rules (play randomly), do not expect to win, or want to lose intentionally (for reasons discussed above). The substantiation for this range is that numbers higher than 50 did not win in any game.

Figure 2: Change in winning number for number of participants.

The second level we call m-level; it covers the range [18,50]. It is for players with middle levels of reasoning; usually the first-round winning number is in this range (and in part of the second rounds as well).
The third level is h-level, covering the range [5,18]. It is for high-level reasoning. Finally, inf-level (the [1,5] range) is for the "almost common knowledge" level of thinking. Calculating the number of players at each level for each game, we can estimate the change (as a percentage of the number of players) in adopting the different strategy levels. There are some limitations to this approach:

• the number of players changed between rounds, since not everyone participated (it was an option, not an obligation);
• the limits of the ranges are not defined by a model or data. How to define the levels in the best way can be a future direction of research.

Results are presented in table 2. What conclusions can we draw from this data? There is no clear-cut difference in the changes, but at least we can summarise a few points:

Table 2
Summary of change in strategy levels

Type                      b-difference  m-difference  h-difference  inf-difference
Alternative humanitarian  -72           -8            0             72
Alternative mathematical  -24           -6            30            -6
Alternative humanitarian  -52           0             17            43
Math lyceum               -9            -36           24            34
Math lyceum               -10           -24           28            7
Ordinary school           -49           12            20            4
DS conference attendees   -14           -32           14            27
MS students               -12           -50           50            12
Alternative mathematical  -17           -34           23            23
DS conference attendees   -17           -30           23            35
Business                  -32           -17           21            28

• Usually, after the first round and the explanation of the equilibrium concept, there is a decrease in the b-level and m-level;
• Symmetrically, there is an increase in the two other levels, but sometimes it is more distributed and sometimes it goes (almost) all to the inf-level;
• The last situation is more likely to happen in schools, where kids are less critical of new knowledge;
• Usually the second-round winning choice is in the realm of the h-level, so the groups with the biggest increase in this parameter are the ones with better understanding.

4.1.3. Size and winning choice

This game is indeed rich for investigation; let us formulate the last (in this paper) finding about it. Can we in some way establish a connection between the number of players and the winning number (actually, the strategies players choose during the game)?
To clarify our idea, see figure 3. It is a scatter plot of a two-dimensional variable: the x-axis is the number of participants in the game and the y-axis is the winning choice per round. Different colors mark the different types of groups where the games were played. As we can observe, the first and second rounds form two separate clusters. This is expected and tells us that players learned about the equilibrium concept between rounds and applied it in practice. There is also a mild tendency for smaller groups to have bigger winning numbers; at least the variation is bigger. It is still too bold to claim a connection between the size of the group and the winning number, but probably the reason is that when the size of the group is bigger, the number of "irrational" players increases. It can be due to a stable percentage of such persons in any group or to other reasons, but it is an interesting connection to investigate.

Figure 3: Change in winning number for number of participants.

4.1.4. Intentionally irrational?

Another interesting finding is that after the first round finished, after observing the result and listening to the explanation of the NE, the number of players who choose over 90 (an obviously non-winning choice) increases. This is actually not accidental; the data from [22] also show an increase in rounds 2–5. We believe that this is quite an important part of the play. This phenomenon is especially visible in high school children with a strong math background (usually they have more freedom and self-confidence in choosing non-standard strategies).

4.2. Second game

In the second game the key point is to understand that almost all strategies are dominated. One can see that the average can be bigger or smaller than 50, and accordingly the winning choice will be 1 or 100. It is worth noting that the popular nature of these experiments and the freedom to participate make data gathering not easy. For example, many participants simply didn't make any decision in the second game. Results are summarised in table 3.
We refine players' decisions to see how many players made rationalizable choices (Bernheim [25]), that is, best responses to some strategy profile of the other players. In this game only two best responses are possible (in pure strategies), namely 1 and 100. This is a remarkable result: players without prior communication come to an almost perfect mixed equilibrium, with almost the same percentage choosing 1 and 100. This is even more striking taking into account the lack of any prior knowledge about mixed strategies and mixed equilibrium: kids play it intuitively and without any communication.

Table 3
Second game. Rationalizable choices summary

Type                      Average  Choose 100  Choose 1  Count
Adults                    46.5     24.3%       24.34%    115
Alternative mathematical  43.8     25.9%       27.9%     27
Business                  50.6     29.3%       29.3%     99
DS conference attendees   37.4     15.8%       36.8%     114
Math lyceum               48.5     22.7%       24.7%     154
Ordinary school           51.2     30.4%       30.4%     23

To illustrate the mixed Nash learning by groups, we plot the dependency between the percentage of choices of 1 and of 100.

4.3. Third game

The third game is simpler than the first two; it is a coordination game where players should coordinate without a word. And, as predicted by Schelling [26], they usually do. The data presented in [27] show that 1 is the natural coordination point, with one exception: the tech school decided that it would be funny to choose the number 69. Probably the age (11th grade) is to blame here. We can also note attempts to coordinate around 7, 50 and 100. An interesting and paradoxical result, which is expected from general theory, is that with fewer options coordination is in fact more difficult. Consider the variant where players had to choose an integer from [1,10], only 10 choices. Compared to the game with 100 possible choices, coordination was very tricky: two numbers got almost the same result.

4.4. Fourth game

Here we just note that the winning numbers were: 12, 2, 4, 20.
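The winning rules of the third and fourth games are easy to state precisely. A Python sketch (an illustration only; the submission data shown is hypothetical and the repository scripts are in R):

```python
from collections import Counter

def plurality_winner(choices):
    """Third game: the number chosen by plurality; ties go to the lower number."""
    counts = Counter(choices)
    return min(counts, key=lambda c: (-counts[c], c))

def smallest_unique_winner(choices):
    """Fourth game: the smallest number chosen exactly once (None if every number repeats)."""
    counts = Counter(choices)
    unique = [c for c in choices if counts[c] == 1]
    return min(unique) if unique else None

print(plurality_winner([1, 1, 69, 69, 69, 7]))     # 69 wins by plurality
print(smallest_unique_winner([2, 2, 3, 5, 3, 7]))  # 5 is the smallest unique choice
```

Note that the fourth rule may produce no winner at all, which reflects the absence of a pure-strategy equilibrium discussed below.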
Since no equilibrium was theoretically found here, at this stage we can only gather data and formulate hypotheses about where to find one. All experimental data and the R file for the graphs can be accessed in the open repository [27].

5. Agent-based model

The existence of irrational behavior challenges the basic game-theoretic assumptions of self-interest and the capability to calculate the best option. In other words, real people do not think like machines or algorithms. They form hypotheses or expectations using simple rules. These rules are influenced by emotions and social norms, and can also change depending on feedback (be reinforced). This use of inductive reasoning leads to two issues. First, what rules do people follow? Second, supposing we know these rules, how do we model the behavior of many interacting, heterogeneous agents in such a situation?

5.1. NetLogo

NetLogo is an environment for agent-based modeling and simulation based on a high-level programming language. It is freely available, open-source, multi-platform software. In NetLogo agents live in a "world": a rectangular grid where agents of different types can be created. There are four types of agents: patches, turtles, links and the observer. Patches are fixed pieces of "ground" that make up the world. Turtles can move through patches any distance and in any direction at every time step. Links connect at least two agents and are represented by lines. The observer is the overseeing agent that gives the other agents instructions and makes changes in the world; it is responsible for holding a set of global variables. Time passes as discrete steps called ticks, in which all agents can act.

5.2.
Model

Epstein [28] defines the following characteristics of an agent-based model: (1) heterogeneity: agents differ in some ways; (2) autonomy: each agent makes its own decisions; (3) explicit space: agents interact in a given environment; (4) local interaction: agents generally interact with their neighbors and immediate environment; (5) bounded rationality: agents have limited information and computing power, and agent behavior is generated by simple rules that may adapt over time; (6) non-equilibrium dynamics.

As a first step towards a model of guessing-game players we propose an agent-based model (it can be downloaded from the repository with the experimental data) with characteristics 1, 2 and 5. The structure of the model is as follows. Agents have three properties: level, irrationality and choice. Level is the agent's current level of understanding, irrationality is a flag variable that enables possible irrational behavior, and choice is the current number chosen. The setup procedure allows creating different distributions of initial levels, and p is the game multiplier. At each step every agent chooses a number according to simple rules. The code is uploaded to github; here we just informally explain the main idea. The agent's level defines the number: agents with level 0 choose randomly from [1,100], level 1 agents choose from [45,55], and so on. The margins are adjustable and will be tuned on real data in future work. After the round the winner is determined, and each player whose choice is bigger than the winning number increases its level of understanding. In this model we consider only five levels, where level 5 means common knowledge and the player chooses 1. When there is no irrationality in the model we observe the typical convergence to equilibrium (figure 4, left), and this is a stable pattern. But, as we already know from the experiments, this is not what we observe in real life, so irrational behavior was included to match the pattern in the data.
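The informal rules above can also be sketched outside NetLogo. The following Python re-implementation is a minimal sketch under stated assumptions: the choice bands beyond levels 0 and 1, the winning rule (closest to p times the mean, with p = 2/3), and the anger probability are illustrative placeholders to be tuned on real data.

```python
import random

# Illustrative choice bands per reasoning level; levels 0 and 1 follow
# the text, the intermediate bands are our assumptions.
BANDS = {0: (1, 100), 1: (45, 55), 2: (25, 40), 3: (12, 24), 4: (2, 11)}

class Agent:
    def __init__(self):
        self.level = 0        # current level of understanding (0..5)
        self.angry = False    # irrational mode: choose 100 next round
        self.choice = None

    def choose(self, rng):
        if self.angry:
            self.angry = False
            self.choice = 100          # intentionally "bad" move
        elif self.level >= 5:
            self.choice = 1            # common knowledge: equilibrium play
        else:
            lo, hi = BANDS[self.level]
            self.choice = rng.randint(lo, hi)
        return self.choice

def play_round(agents, rng, p=2/3, anger_prob=0.1):
    """One round: the choice closest to p * mean wins; losers above the
    winning number learn (level up), and a loser may turn 'angry'."""
    choices = [a.choose(rng) for a in agents]
    target = p * sum(choices) / len(choices)
    winning = min(choices, key=lambda c: abs(c - target))
    for a in agents:
        if a.choice > winning:
            a.level = min(a.level + 1, 5)
        if a.choice != winning and rng.random() < anger_prob:
            a.angry = True
    return winning

rng = random.Random(1)
agents = [Agent() for _ in range(30)]
wins = [play_round(agents, rng) for _ in range(15)]
print(wins)  # winning numbers tend to fall towards the equilibrium choice 1
```

With anger_prob set to 0 the winning number declines steadily, as in figure 4 (left); with a positive anger probability occasional jumps appear, yet convergence still wins out, as in figure 4 (right).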
Irrationality in our model is implemented as "anger": a player who is currently losing sometimes goes into irrational mode and chooses 100 in the next round. This leads to an interesting pattern (figure 4, right): the winning number sometimes increases in the second round, but convergence to equilibrium is nevertheless inevitable.

Figure 4: Plots of players' winning choice over rounds.

6. Conclusions

In this paper we have presented experimental data, an analysis and explanation of the observed patterns, and an agent-based model that can be calibrated in future work. We were able to confirm the existence of a pattern in decision making: every group behaves in almost the same way when dealing with an unknown strategic situation. We also found some deviations, which is an interesting point of difference from the more "laboratory" setups of existing research. Finally, we built an agent-based model and showed how simple rules can lead to complex player behavior. The main findings of the paper are the following:

1. To learn the rules you need to break them. Participants chose obviously non-winning moves (> 66) partly because the situation was new and they had trouble understanding the rules. But a high percentage of such choices was also present in the second round, when the players knew exactly what was going on. This effect was especially notable for high-school students and adults and almost absent for special math schools and kids below 9th grade. We can formulate the hypothesis that high school is the age of experimentation, when children discover new things and are not afraid to do so.

2. If we consider the winning number as the decision of the group, we can see that groups learn fast and steadily. Even if some outliers choose 100, the mean still declines with every round. It seems there is an unspoken competition between players that improves the aggregated decision even when no prize is at stake. A scenario in which all participants choose higher numbers is actually plausible, but this did not happen in any experiment.
The closest case was the Tech school, where a bunch of pupils (possibly coordinating) switched to 100 and still only managed to keep the mean at the same level.

3. Still, it seems there is a stable percentage of people who choose around 100, and this is not about learning how to play the game. We think this is something like a "-1 level" of reasoning, when a player intentionally plays a "bad move", and it is an essential part of the model. If we neglect such persons and their motivation, our model will not be correct.

4. An agent-based model is the key element of computational analysis of game-theoretic models. We created a model with simple rules which shows behavior close to reality. Now we can simulate different rules and understand the limitations of the model.

References

[1] V. P. Crawford, M. A. Costa-Gomes, N. Iriberri, Structural models of nonequilibrium strategic thinking: Theory, evidence, and applications, Journal of Economic Literature 51 (2013) 5–62. doi:10.1257/jel.51.1.5.
[2] F. Mauersberger, R. Nagel, Levels of reasoning in Keynesian beauty contests: a generative framework, in: C. Hommes, B. LeBaron (Eds.), Handbook of Computational Economics, volume 4 of Handbook of Computational Economics, Elsevier, 2018, pp. 541–634.
[3] C. F. Camerer, Behavioral game theory: Experiments in strategic interaction, The Roundtable Series in Behavioral Economics, Princeton University Press, 2003.
[4] R. J. Aumann, Backward induction and common knowledge of rationality, Games and Economic Behavior 8 (1995) 6–19. doi:10.1016/S0899-8256(05)80015-6.
[5] R. Aumann, A. Brandenburger, Epistemic conditions for Nash equilibrium, Econometrica 63 (1995) 1161–1180. URL: http://www.jstor.org/stable/2171725.
[6] H. W. Kuhn, J. C. Harsanyi, R. Selten, J. W. Weibull, E. Van Damme, J. F. Nash Jr., P. Hammerstein, The work of John Nash in game theory, Journal of Economic Theory 69 (1996) 153–185. doi:10.1006/jeth.1996.0042.
[7] R.
Nagel, Unraveling in guessing games: An experimental study, The American Economic Review 85 (1995) 1313–1326. URL: https://www.cs.princeton.edu/courses/archive/spr09/cos444/papers/nagel95.pdf.
[8] H. Moulin, Game theory for the social sciences, 2nd ed., New York University Press, New York, 1986.
[9] R. Nagel, C. Bühren, B. Frank, Inspired and inspiring: Hervé Moulin and the discovery of the beauty contest game, Mathematical Social Sciences 90 (2017) 191–207.
[10] A. Ledoux, Concours résultats complets. Les victimes se sont plu à jouer le 14 d'atout, Jeux & Stratégie 2 (1981) 10–11.
[11] T.-H. Ho, C. Camerer, K. Weigelt, Iterated dominance and iterated best response in experimental "p-beauty contests", The American Economic Review 88 (1998) 947–969.
[12] C. F. Camerer, T.-H. Ho, J.-K. Chong, Sophisticated experience-weighted attraction learning and strategic teaching in repeated games, Journal of Economic Theory 104 (2002) 137–188.
[13] K. Leyton-Brown, Y. Shoham, Essentials of game theory: A concise, multidisciplinary introduction, Synthesis Lectures on Artificial Intelligence and Machine Learning 2 (2008) 1–88. doi:10.2200/S00108ED1V01Y200802AIM003.
[14] M. Costa-Gomes, V. P. Crawford, B. Broseta, Cognition and behavior in normal-form games: An experimental study, Econometrica 69 (2001) 1193–1235.
[15] C. F. Camerer, T.-H. Ho, J.-K. Chong, A cognitive hierarchy model of games, The Quarterly Journal of Economics 119 (2004) 861–898. doi:10.1162/0033553041502225.
[16] J. R. Wright, K. Leyton-Brown, Predicting human behavior in unrepeated, simultaneous-move games, Games and Economic Behavior 106 (2017) 16–37.
[17] J. R. Wright, K. Leyton-Brown, Models of level-0 behavior for predicting human behavior in games, CoRR abs/1609.08923 (2016). arXiv:1609.08923.
[18] D. Gill, V. Prowse, Cognitive ability, character skills, and learning to play equilibrium: A level-k analysis, Journal of Political Economy 124 (2016) 1619–1676. doi:10.1086/688849.
[19] E. Fe, D. Gill, V.
L. Prowse, Cognitive skills, strategic sophistication, and life outcomes, Working Paper Series 448, The University of Warwick, 2019.
[20] O. Ignatenko, Guessing games experiments in school education and their analysis, CEUR Workshop Proceedings 2732 (2020) 881–892.
[21] O. P. Ignatenko, Guessing games experiments in Ukraine. Learning towards equilibrium, in: S. Semerikov, V. Osadchyi, O. Kuzminska (Eds.), Proceedings of the Symposium on Advances in Educational Technology, AET 2020, University of Educational Management, SciTePress, Kyiv, 2022.
[22] E. Povea, F. Citak, Children in the beauty contest game: behaviour and determinants of game performance, Master's thesis, Norwegian School of Economics, 2019. URL: https://openaccess.nhh.no/nhh-xmlui/handle/11250/2611651.
[23] M. W. Nichols, M. J. Radzicki, An Agent-Based Model of Behavior in "Beauty Contest" Games, Working Paper 07-010, University of Nevada, 2007.
[24] W. Güth, M. Kocher, M. Sutter, Experimental 'beauty contests' with homogeneous and heterogeneous players and with interior and boundary equilibria, Economics Letters 74 (2002) 219–228. doi:10.1016/S0165-1765(01)00544-4.
[25] B. D. Bernheim, Rationalizable strategic behavior, Econometrica 52 (1984) 1007–1028.
[26] T. C. Schelling, The Strategy of Conflict: With a New Preface by The Author, Harvard University Press, 1980.
[27] O. P. Ignatenko, Data from experiments, 2021. URL: https://github.com/ignatenko/GameTheoryExperimentData.
[28] J. M. Epstein, Agent-based computational models and generative social science, Complexity 4 (1999) 41–60.