Simulation of Minority Game in TuCSoN

Enrico Oliva    Mirko Viroli    Andrea Omicini
ALMA MATER STUDIORUM—Università di Bologna
via Venezia 52, 47023 Cesena, Italy
E-mail: {enrico.oliva, mirko.viroli, andrea.omicini}@unibo.it

I. APPLICATION DOMAIN

The Minority Game (MG) is a mathematical model inspired by the "El Farol Bar" problem introduced by Brian Arthur [1]. It is based on a simple scenario where, at each step, a set of agents casts a boolean vote that conceptually splits them into two classes: the agents in the smaller class win. In this game, a rational agent keeps track of previous votes and victories, and has the goal of winning throughout the steps of the game—for which a rational strategy has to be figured out.

One of the most important applications of MG is in market models: [2], for instance, uses MG as a coarse-grained model of financial markets to study their fluctuation phenomena and statistical properties. Even though the model is coarse-grained and provides an over-simplified micro-scale description, it nonetheless captures the most relevant features of system interaction, and generates collective properties that are quite similar to those of the real system.

Another point of view, presented e.g. by [3], considers the MG as a point in the space of Resource Allocation Games (RAG). In that work a generalisation of MG is presented that relaxes the constraints on the number of resources, studying how the system behaves within a given range.

MG can be considered a social simulation that aims to reproduce a simplified human scenario. In principle, a logic-based approach built on BDI agents makes it easier to explicitly model a variety of diverse social behaviours.

As shown by [4], a multiagent system (MAS) can be used to realise an MG simulation—there, BDI agents provide for rationality and planning. An agent-based simulation is particularly useful when the simulated systems include autonomous entities that are diverse, thus making it difficult to exploit the traditional framework of mathematical equations.

In order to implement MG simulations we adopt the TuCSoN infrastructure for agent coordination [5], which introduces tuple centres as artifact representatives. A tuple centre is a programmable coordination medium living in the MAS environment, used by agents that interact by exchanging tuples (logic tuples, in the case of TuCSoN logic tuple centres). As we are not much concerned with the mere issues of agent intelligence, we rely here on a weak form of rationality, through logic-based agents adopting pre-compiled plans called operating instructions [6].

II. LOGIC ARCHITECTURE

Fig. 1. TuCSoN Simulation Framework for MG (player, monitor and tuning agents sharing a coordination artifact)

The architecture proposed for MAS simulation is based on TuCSoN [5], an infrastructure for the coordination of MASs. TuCSoN provides agents with an environment made of logic tuple centres, which are logic-based programmable tuple spaces. The language used to program the coordination behaviour of tuple centres is ReSpecT, which specifies how a tuple centre has to react to an observable event (e.g. when a new tuple is inserted) and how it has to change the tuple-set state accordingly [7]. Tuple centres are a possible incarnation of the coordination artifact notion [8], representing a device that persists independently of the agent life-cycle and provides services to let agents participate in social activities.

In our simulation framework we adopt logic-based agents, namely, agents built in a logic programming style, keeping a knowledge base (KB) of facts and acting according to rules—rules and facts thus forming a logic theory. The implementation is based on tuProlog technology (http://tuprolog.alice.unibo.it) for Java-Prolog integration, and relies on its inference capabilities for agent rationality. Agents roughly follow the BDI architecture, as the KB models agent beliefs while rules model agent intentions.
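To make the notion of such a logic theory concrete, the following is a minimal sketch, of our own, of what a player agent's tuProlog theory might look like. Apart from choice(S,X), which appears later as the precondition of the play action in the operating instructions, the predicate names and values (memory/1, history/1, strategy/2, lookup/3) are illustrative assumptions rather than the actual implementation.

  % Beliefs (facts): game parameters and recent outcomes -- illustrative names.
  memory(3).                      % memory size m
  history([1,-1,1]).              % the last m winning sides, most recent first
  strategy(s1, [([1,-1,1], -1),   % strategy s1 maps each remembered history
                ([-1,1,1],  1)]). % to a vote; only part of the table is shown

  % lookup(Key, Table, Value): first value paired with Key in a list of pairs.
  lookup(K, [(K,V)|_], V) :- !.
  lookup(K, [_|T], V) :- lookup(K, T, V).

  % Rule (intention): choice(S,X) binds X to the vote that strategy S
  % prescribes for the current history, defaulting to +1 if unmapped.
  choice(S, X) :-
      history(H),
      strategy(S, Table),
      ( lookup(H, Table, X) -> true ; X = 1 ).

With this theory, the query choice(s1, X) binds X to -1, the vote prescribed by strategy s1 for the remembered history [1,-1,1].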
Three kinds of agents are used in our simulation: player agents, monitor agents and tuning agents (as depicted in Figure 1); all the agents share the same coordination artifact. The agent types differ in role and behaviour: player agents play the MG; the monitor agent is an observer of interactions that visualises the progress of the system; the tuning agent can change some rules or parameters of coordination, and drives the simulation to new states. Note that the main advantage of allowing dynamic tuning of parameters, instead of running separate simulations, lies in the possibility of tackling emergent aspects that would not necessarily appear in new runs.

The main control loop of a player agent is a sequence of actions: observing the world, updating its KB, scheduling the next intention, elaborating and executing a plan. To connect agent mental states with interactions we use, as usual, the concepts of action preconditions and perception effects.

Fig. 2. Agent Architecture (the agent mental state—beliefs, desires, intentions—is connected to actions through preconditions and to perceptions through effects)

III. MINORITY GAME PERFORMANCE

To track the performance of an MG system, the most interesting quantity is the variance, defined as σ² = ⟨[A(t) − ⟨A(t)⟩]²⟩, where ⟨·⟩ denotes the time average: it measures the variability of the bets A(t) around their average value ⟨A(t)⟩. In particular, the normalised variance ρ = σ²/N is considered. Figure 3 shows a typical evolution of the game.

Fig. 3. Typical time evolution of the original MG with N = 51, m = 5 and s = 2

Generally speaking, the variance is the inverse of global efficiency: as the variance decreases, agent coordination improves and more agents win. The variance is affected in interesting ways by the parameters of the model, such as the number of agents (N), the memory (m) and the number of strategies (s): in particular, the fluctuation of the variance is shown to depend only on the ratio α = 2^m/N between the number of possible memory configurations and the number N of agents.

Fig. 4. Variance of the Game with 11 Random Agents

The results of observations suggest that the behaviour of MG can be classified into two phases: an information-rich asymmetric phase, and an unpredictable or symmetric phase. A phase transition is located where σ²/N attains its minimum (αc = 1/2), and it separates the symmetric phase with α < αc from the asymmetric phase with α > αc.

All these cases have been observed with the TuCSoN simulation framework described in the next section.
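As a back-of-the-envelope reference for these quantities (a worked example of ours, assuming the standard MG convention that A(t) denotes the sum of the N agents' ±1 votes at step t): if the votes are independent and unbiased, the attendance has zero mean and variance N, so σ²/N = 1. Effective coordination is precisely what pushes σ²/N below this value, which is why the reference variance used later by the tuning agent is measured with randomly playing agents (cf. Figure 4).

  % Variance of the attendance and its value under purely random voting.
  \[
    A(t) = \sum_{i=1}^{N} a_i(t), \quad a_i(t) \in \{-1,+1\},
    \qquad
    \sigma^2 = \bigl\langle [\, A(t) - \langle A(t) \rangle \,]^2 \bigr\rangle .
  \]
  \[
    \langle a_i \rangle = 0, \quad \mathrm{Var}[a_i] = 1
    \;\Longrightarrow\;
    \sigma^2_{\mathrm{rand}} = \sum_{i=1}^{N} \mathrm{Var}[a_i(t)] = N,
    \qquad
    \sigma^2_{\mathrm{rand}} / N = 1 .
  \]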
IV. THE SIMULATION FRAMEWORK

The construction of MG simulations with MASs is based on the TuCSoN framework and on tuProlog as the inferential engine used to program logic agents. The main innovative aspect of this MG simulation is the possibility of studying the evolution of the system under particular and diverse kinds of agent behaviour at the micro level, imposed through coordination parameters that are changed on the fly.

A. Operating Instructions

Each agent has an internal plan, structured as an algebraic composition of allowed actions (with their preconditions) and perceptions (with their effects), that enables the agent to use the coordination artifact to play the MG. This plan can be seen as Operating Instructions [6], a formal description based on a Labelled Transition System (LTS) that the agent reads to understand what its step-by-step behaviour should be. Through an inference process, the agent accordingly chooses the next action to execute, thus performing the cycle described in Section II.

Operating instructions are expressed by the following theory:

  firststate(agent(first,[])).
  definitions([
    def(first,[],...),
    def(main,[S],
      [act(out(play(X)),pre(choice(S,X))),
       per(in(result(Y)),eff(res(Y))),
       agent(main,[S])]
    ),
    ...
  ]).

The first part of the operating instructions is expressed by the term first, where the agent reads the game parameters stored in the KB and randomly creates its own set of strategies. In the subsequent part, main, the agent executes its main cycle. It first puts the tuple play(X) in the tuple space, where X = ±1 is the agent's vote. The precondition of this action, choice(S,X), is used to bind X in the KB to the value currently chosen by the agent according to strategy S. Then, the agent retrieves the overall result of the game in the tuple result(Y) and updates its KB accordingly. After this perception, the cycle is iterated again.

B. Tuple Centre Behaviour

The interaction protocol between agents and the coordination artifact is then simply structured as follows. First, each agent puts the tuple encoding its vote. When the tuples of all agents have been received, the tuple centre checks them, computes the result of the game—either 1 or −1 is winning—and prepares a result tuple to be read by the agents.

The ReSpecT program for this behaviour is loaded into the tuple centre by a configuration agent at bootstrap, through the set_spec() operation. The following ReSpecT reaction is fired when an agent inserts the tuple play(X), and triggers the whole behaviour:

  reaction(out(play(X)),(
    in_r(count(Y)),
    Z is Y+1,
    in_r(sum(M)),
    V is M+X,
    out_r(sum(V)),
    out_r(count(Z))
  )).

This reaction takes the bet (X), counts the bets received so far (Z) and computes the partial result of the game (V). When all the agents have played, the artifact produces the tuple winner(R,NS,NumberOfLoss,MemorySize,last/more), which is the main tuple of MG coordination:

  reaction(out_r(count(X)),(
    rd_r(numag(Num)),
    X =:= Num,
    in_r(totcount(T)),
    P is T+1,
    rd_r(game(G)),
    in_r(sum(A)),
    out_r(sum(0)),
    rd_r(countsession(CS)),
    in_r(count(Y)),
    out_r(count(0)),
    %% calculate variance
    in_r(qsum(SQ)),
    NSQ is A*A+SQ,
    out_r(qsum(NSQ)),
    %% calculate mean
    in_r(totsum(R)),
    NewS is R+A,
    out_r(totsum(NewS)),
    rd_r(numloss(NumberOfLoss)),
    rd_r(mem(MemorySize)),
    out_r(winner(A,P,CS,NumberOfLoss,MemorySize,G)),
    out_r(totcount(P))
  )).

The winner tuple contains the result of the game (R), the number of steps played (NS), the two tuning parameters (NumberOfLoss and MemorySize), and a constant that tells the agents whether they have to stop or to play further (last/more). Figure 5 reports the graphical interface of the monitor agent, which during its lifetime reads the winner tuple and plots the variance.

Fig. 5. Interface of the Monitor Agent
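Both reactions above assume that the tuple centre has been bootstrapped with a set of counter and parameter tuples. The listing below is a plausible initial tuple set inferred from the tuples read and consumed by the reactions; the concrete values are illustrative only, and the meanings noted for game/1 and countsession/1 are our assumptions, since the paper does not spell them out.

  % Illustrative bootstrap tuple set (values are examples, not the paper's):
  numag(11).        % number of player agents
  mem(3).           % memory size m
  numstr(2).        % number of strategies per agent
  numloss(5).       % losses tolerated before a strategy is recalculated
  count(0).         % bets received in the current step
  sum(0).           % partial result (sum of the +1/-1 bets of the step)
  totcount(0).      % number of completed steps
  totsum(0).        % running sum of step results, used for the mean
  qsum(0).          % running sum of squared results, used for the variance
  game(more).       % assumed: whether agents must play further (last/more)
  countsession(1).  % assumed: identifier of the current game session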
C. Tuning the Simulation

In a classical MG simulation there are a number of parameters that can affect the system behaviour; these are explicitly represented in the tuple centre in the form of tuples: the number of agents numag(X), the memory size mem(X), and the number of strategies numstr(X). In our framework, we have introduced as a further parameter the number of wrong moves after which a single agent should recalculate its own strategy, represented as a tuple numloss(X). Such a threshold is seemingly useful to break the symmetry in the strategy space when the system is in a pathological state, i.e., when all agents have the same behaviour and the game oscillates between its minimum and maximum values.

In our framework, it is possible to dynamically tune the coordination rules by changing the numloss and mem coordination parameters, which are stored as tuples in the coordination artifact. The simulation architecture built in this way, in fact, allows some game configuration parameters—such as the dimension of the agent memory—to be changed on the fly, with no need to stop the simulation and re-program the agents.

By changing these parameters, the tuning agent can drive the system from one equilibrium state to another, by controlling agent strategies, the dimension of the memory, or the number of losses that an agent can accept before discarding a strategy. This agent observes the system variance, and decides whether and how to change the tuning parameters: the reference variance is calculated by first making the agents play the game randomly—see Figure 4. The new values of the parameters are stored in the tuple centre through the tuples numloss(NumberOfLoss) and mem(MemorySize); the rules of coordination react and update the information that will be read by the agents.
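As a concrete illustration of this update step, a tuning agent could replace a parameter tuple with a consume-then-emit pair of tuple-centre operations, in the same spirit as the player's out/in actions. The fragment below is our own hypothetical sketch, not the paper's code (in the actual framework the tuning is driven by a human through the tuning interface, as described in Section IV-D); the new values mirror those used in the tuned run of Figure 6 (threshold S = 5, then memory m = 5).

  % Hypothetical tuning step on the coordination artifact:
  in(numloss(_))     % withdraw the current loss-threshold tuple
  out(numloss(5))    % publish the new threshold S = 5
  in(mem(_))         % withdraw the current memory-size tuple
  out(mem(5))        % publish the new memory size m = 5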
D. Simulation Results

The results of the tuned simulation in Figures 6 and 7 show how the system changes its equilibrium state and achieves a better value of the variance. In Figure 6, the first phase of equilibrium is followed by a second one, obtained by changing the threshold parameter to S = 5; finally, a third phase is obtained by changing the dimension of the memory to m = 5. In this simulation the tuning agent is played by a human who observes the evolution of the system and acts through the tuning interface to change the coordination parameters, such as the threshold of losses and the memory, hopefully finding new and better configurations. The introduction of the threshold of losses in the agent behaviour is useful when the game is played by few agents: these parameters enable the system to evolve and the agents to cooperate better.

Fig. 6. Variance of the System with Initial Parameters N = 5 and m = 3

Fig. 7. System Evolution of the Variance in Figure 6

V. PERSPECTIVES

In this paper, we aim at introducing new perspectives on agent-based simulation by adopting a novel MAS meta-model based on agents and artifacts, and by applying it to Minority Game simulation. We implement and study MG over the TuCSoN coordination infrastructure, and show some benefits of the artifact model in terms of flexibility and controllability of the simulation. In particular, in this work we focus on the possibility of building a feedback loop on the rules of coordination, driving the system to a new and better equilibrium state.

We foresee some new perspectives in the use of the TuCSoN simulation framework in an industrial environment. The first one is to use the system to drive manufacturing in case of limited resources. In this scenario each agent is a half-processed item, whose production has to be completed as fast as possible, and whose access to the resources is regulated by dedicated resource artifacts. Another possible perspective is to evaluate product demand and production in order to drive industry through market fluctuations. In our framework we could model a market scenario by minority rules, and then try to evaluate demand. Furthermore, all such applications would benefit from using a logic-based approach rather than an equation-based approach.

VI. ACKNOWLEDGEMENTS

The first author of this paper would like to thank Dr. Peter McBurney and the Department of Computer Science at the University of Liverpool for their scientific support and their hospitality during his stay in Liverpool, when this paper was mostly written.

REFERENCES

[1] W. B. Arthur, "Inductive reasoning and bounded rationality (the El Farol problem)," American Economic Review, vol. 84, no. 2, pp. 406–411, May 1994.
[2] D. Challet, M. Marsili, and Y.-C. Zhang, "Modeling market mechanism with minority game," Physica A: Statistical and Theoretical Physics, vol. 276, no. 1–2, pp. 284–315, Feb. 2000. [Online]. Available: http://dx.doi.org/10.1016/S0378-4371(99)00446-X
[3] H. V. D. Parunak, S. Brueckner, J. Sauter, and R. Savit, "Effort profiles in multi-agent resource allocation," pp. 248–255.
[4] W. Renz and J. Sudeikat, "Modeling Minority Games with BDI agents – a case study," in Multiagent System Technologies, ser. LNCS, T. Eymann, F. Klügl, W. Lamersdorf, M. Klusch, and M. N. Huhns, Eds. Springer, 2005, vol. 3550, pp. 71–81. 3rd German Conference (MATES 2005), Koblenz, Germany, 11–13 Sept. 2005, Proceedings. [Online]. Available: http://www.springerlink.com/link.asp?id=y62q174g56788gh8
[5] A. Omicini and F. Zambonelli, "Coordination for Internet application development," Autonomous Agents and Multi-Agent Systems, vol. 2, no. 3, pp. 251–269, Sept. 1999.
[6] M. Viroli and A. Ricci, "Instructions-based semantics of agent mediated interaction," in 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., vol. 1. New York, USA: ACM, 19–23 July 2004, pp. 102–109. [Online]. Available: http://portal.acm.org/citation.cfm?id=1018409.1018737
[7] A. Omicini and E. Denti, "Formal ReSpecT," Electronic Notes in Theoretical Computer Science, vol. 48, pp. 179–196, June 2001.
[8] A. Omicini, A. Ricci, M. Viroli, C. Castelfranchi, and L. Tummolini, "Coordination artifacts: Environment-based coordination for intelligent agents," in 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., vol. 1. New York, USA: ACM, 19–23 July 2004, pp. 286–293. [Online]. Available: http://portal.acm.org/citation.cfm?id=1018409.1018752
[9] N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004). New York, USA: ACM, 19–23 July 2004.