Simulation of Minority Game in TuCSoN

Enrico Oliva    Mirko Viroli    Andrea Omicini
ALMA MATER STUDIORUM—Università di Bologna
via Venezia 52, 47023 Cesena, Italy
E-mail: {enrico.oliva, mirko.viroli, andrea.omicini}@unibo.it

I. APPLICATION DOMAIN

The Minority Game (MG) is a mathematical model inspired by the "El Farol Bar" problem introduced by Brian Arthur [1]. It is based on a simple scenario where, at each step, a set of agents casts a boolean vote that conceptually splits them into two classes: the agents in the smaller class win. In this game, a rational agent keeps track of previous votes and victories, and has the goal of winning throughout the steps of the game—for which a rational strategy has to be figured out.

One of the most important applications of MG is in market models: [2], for instance, uses MG as a coarse-grained model of financial markets to study their fluctuation phenomena and statistical properties. Even though the model is coarse-grained and provides an over-simplified micro-scale description, it nonetheless captures the most relevant features of system interaction, and generates collective properties that are quite similar to those of the real system.

Another point of view, presented e.g. by [3], considers the MG as a point in the space of Resource Allocation Games (RAG). In that work a generalisation of MG is presented that relaxes the constraints on the number of resources, studying how the system behaves within a given range.

MG can be considered a social simulation that aims to reproduce a simplified human scenario. In principle, a logic-based approach built on BDI agents makes it easier to explicitly model a variety of diverse social behaviours.

As shown by [4], a multiagent system (MAS) can be used to realise an MG simulation—there, BDI agents provide for rationality and planning. An agent-based simulation is particularly useful when the simulated systems include autonomous entities that are diverse, thus making it difficult to exploit the traditional framework of mathematical equations.

In order to implement MG simulations we adopt the TuCSoN infrastructure for agent coordination [5], which introduces tuple centres as artifact representatives. A tuple centre is a programmable coordination medium living in the MAS environment, used by agents that interact by exchanging tuples (logic tuples, in the case of TuCSoN logic tuple centres). As we are not much concerned with the mere issues of agent intelligence, we rely here on a weak form of rationality, through logic-based agents adopting pre-compiled plans called operating instructions [6].

II. LOGIC ARCHITECTURE

Fig. 1. TuCSoN Simulation Framework for MG (player, monitor and tuning agents sharing a coordination artifact)

The architecture proposed for MAS simulation is based on TuCSoN [5], an infrastructure for the coordination of MASs. TuCSoN provides agents with an environment made of logic tuple centres, which are logic-based programmable tuple spaces. The language used to program the coordination behaviour of tuple centres is ReSpecT, which specifies how a tuple centre has to react to an observable event (e.g. when a new tuple is inserted) and how it has to change the tuple-set state accordingly [7]. Tuple centres are a possible incarnation of the coordination artifact notion [8], representing a device that persists independently of the agent life-cycle and provides services to let agents participate in social activities.

In our simulation framework we adopt logic-based agents, namely, agents built in a logic programming style, keeping a knowledge base (KB) of facts and acting according to rules—rules and facts thus forming a logic theory. The implementation is based on tuProlog technology (http://tuprolog.alice.unibo.it) for Java-Prolog integration, and relies on its inference capabilities for agent rationality. Agents roughly follow the BDI architecture, as the KB models agent beliefs while rules model agent intentions.
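To make the notion of such a logic theory concrete, the following is a minimal sketch, of our own, of what a player agent's tuProlog theory might look like. Apart from choice(S,X), which appears later as the precondition of the play action in the operating instructions, the predicate names and values (memory/1, history/1, strategy/2, lookup/3) are illustrative assumptions rather than the actual implementation.

  % Beliefs (facts): game parameters and recent outcomes -- illustrative names.
  memory(3).                      % memory size m
  history([1,-1,1]).              % the last m winning sides, most recent first
  strategy(s1, [([1,-1,1], -1),   % strategy s1 maps each remembered history
                ([-1,1,1],  1)]). % to a vote; only part of the table is shown

  % lookup(Key, Table, Value): first value paired with Key in a list of pairs.
  lookup(K, [(K,V)|_], V) :- !.
  lookup(K, [_|T], V) :- lookup(K, T, V).

  % Rule (intention): choice(S,X) binds X to the vote that strategy S
  % prescribes for the current history, defaulting to +1 if unmapped.
  choice(S, X) :-
      history(H),
      strategy(S, Table),
      ( lookup(H, Table, X) -> true ; X = 1 ).

With this theory, the query choice(s1, X) binds X to -1, the vote prescribed by strategy s1 for the remembered history [1,-1,1].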
Three kinds of agents are used in our simulation: player agents, monitor agents and tuning agents (as depicted in Figure 1); all the agents share the same coordination artifact. The agent types differ in role and behaviour: player agents play the MG; the monitor agent is an observer of interactions that visualises the progress of the system; the tuning agent can change some rules or parameters of coordination, and drives the simulation to new states. Note that the main advantage of allowing dynamic tuning of parameters, instead of running separate simulations, lies in the possibility of tackling emergent aspects that would not necessarily appear in new runs.

The main control loop of a player agent is a sequence of actions: observing the world, updating its KB, scheduling the next intention, elaborating and executing a plan. To connect agent mental states with interactions we use, as usual, the concepts of action preconditions and perception effects.

Fig. 2. Agent Architecture (the agent mental state—beliefs, desires, intentions—is connected to actions through preconditions and to perceptions through effects)

III. MINORITY GAME PERFORMANCE

To track the performance of an MG system, the most interesting quantity is the variance, defined as σ² = ⟨[A(t) − ⟨A(t)⟩]²⟩, where ⟨·⟩ denotes the time average: it measures the variability of the bets A(t) around their average value ⟨A(t)⟩. In particular, the normalised variance ρ = σ²/N is considered. Figure 3 shows a typical evolution of the game.

Fig. 3. Typical time evolution of the original MG with N = 51, m = 5 and s = 2

Generally speaking, the variance is the inverse of global efficiency: as the variance decreases, agent coordination improves and more agents win. The variance is affected in interesting ways by the parameters of the model, such as the number of agents (N), the memory (m) and the number of strategies (s): in particular, the fluctuation of the variance is shown to depend only on the ratio α = 2^m/N between the number of possible memory configurations and the number N of agents.

Fig. 4. Variance of the Game with 11 Random Agents

The results of observations suggest that the behaviour of MG can be classified into two phases: an information-rich asymmetric phase, and an unpredictable or symmetric phase. A phase transition is located where σ²/N attains its minimum (αc = 1/2), and it separates the symmetric phase with α < αc from the asymmetric phase with α > αc.

All these cases have been observed with the TuCSoN simulation framework described in the next section.
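As a back-of-the-envelope reference for these quantities (a worked example of ours, assuming the standard MG convention that A(t) denotes the sum of the N agents' ±1 votes at step t): if the votes are independent and unbiased, the attendance has zero mean and variance N, so σ²/N = 1. Effective coordination is precisely what pushes σ²/N below this value, which is why the reference variance used later by the tuning agent is measured with randomly playing agents (cf. Figure 4).

  % Variance of the attendance and its value under purely random voting.
  \[
    A(t) = \sum_{i=1}^{N} a_i(t), \quad a_i(t) \in \{-1,+1\},
    \qquad
    \sigma^2 = \bigl\langle [\, A(t) - \langle A(t) \rangle \,]^2 \bigr\rangle .
  \]
  \[
    \langle a_i \rangle = 0, \quad \mathrm{Var}[a_i] = 1
    \;\Longrightarrow\;
    \sigma^2_{\mathrm{rand}} = \sum_{i=1}^{N} \mathrm{Var}[a_i(t)] = N,
    \qquad
    \sigma^2_{\mathrm{rand}} / N = 1 .
  \]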
IV. THE SIMULATION FRAMEWORK

The construction of MG simulations with MASs is based on the TuCSoN framework and on tuProlog as the inferential engine used to program logic agents. The main innovative aspect of this MG simulation is the possibility of studying the evolution of the system under particular and diverse kinds of agent behaviour at the micro level, imposed through coordination parameters that are changed on the fly.

A. Operating Instructions

Each agent has an internal plan, structured as an algebraic composition of allowed actions (with their preconditions) and perceptions (with their effects), that enables the agent to use the coordination artifact to play the MG. This plan can be seen as Operating Instructions [6], a formal description based on a Labelled Transition System (LTS) that the agent reads to understand what its step-by-step behaviour should be. Through an inference process, the agent accordingly chooses the next action to execute, thus performing the cycle described in Section II.

Operating instructions are expressed by the following theory:

  firststate(agent(first,[])).
  definitions([
    def(first,[],...),
    def(main,[S],
      [act(out(play(X)),pre(choice(S,X))),
       per(in(result(Y)),eff(res(Y))),
       agent(main,[S])]
    ),
    ...
  ]).

The first part of the operating instructions is expressed by the term first, where the agent reads the game parameters stored in the KB and randomly creates its own set of strategies. In the subsequent part, main, the agent executes its main cycle. It first puts the tuple play(X) in the tuple space, where X = ±1 is the agent's vote. The precondition of this action, choice(S,X), is used to bind X in the KB to the value currently chosen by the agent according to strategy S. Then, the agent retrieves the overall result of the game in the tuple result(Y) and updates its KB accordingly. After this perception, the cycle is iterated again.

B. Tuple Centre Behaviour

The interaction protocol between agents and the coordination artifact is then simply structured as follows. First, each agent puts the tuple encoding its vote. When the tuples of all agents have been received, the tuple centre checks them, computes the result of the game—either 1 or −1 is winning—and prepares a result tuple to be read by the agents.

The ReSpecT program for this behaviour is loaded into the tuple centre by a configuration agent at bootstrap, through the set_spec() operation. The following ReSpecT reaction is fired when an agent inserts the tuple play(X), and triggers the whole behaviour:

  reaction(out(play(X)),(
    in_r(count(Y)),
    Z is Y+1,
    in_r(sum(M)),
    V is M+X,
    out_r(sum(V)),
    out_r(count(Z))
  )).

This reaction takes the bet (X), counts the bets received so far (Z) and computes the partial result of the game (V). When all the agents have played, the artifact produces the tuple winner(R,NS,NumberOfLoss,MemorySize,last/more), which is the main tuple of MG coordination:

  reaction(out_r(count(X)),(
    rd_r(numag(Num)),
    X =:= Num,
    in_r(totcount(T)),
    P is T+1,
    rd_r(game(G)),
    in_r(sum(A)),
    out_r(sum(0)),
    rd_r(countsession(CS)),
    in_r(count(Y)),
    out_r(count(0)),
    %% calculate variance
    in_r(qsum(SQ)),
    NSQ is A*A+SQ,
    out_r(qsum(NSQ)),
    %% calculate mean
    in_r(totsum(R)),
    NewS is R+A,
    out_r(totsum(NewS)),
    rd_r(numloss(NumberOfLoss)),
    rd_r(mem(MemorySize)),
    out_r(winner(A,P,CS,NumberOfLoss,MemorySize,G)),
    out_r(totcount(P))
  )).

The winner tuple contains the result of the game (R), the number of steps played (NS), the two tuning parameters (NumberOfLoss and MemorySize), and a constant that tells the agents whether they have to stop or to play further (last/more). Figure 5 reports the graphical interface of the monitor agent, which during its lifetime reads the winner tuple and plots the variance.

Fig. 5. Interface of the Monitor Agent
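Both reactions above assume that the tuple centre has been bootstrapped with a set of counter and parameter tuples. The listing below is a plausible initial tuple set inferred from the tuples read and consumed by the reactions; the concrete values are illustrative only, and the meanings noted for game/1 and countsession/1 are our assumptions, since the paper does not spell them out.

  % Illustrative bootstrap tuple set (values are examples, not the paper's):
  numag(11).        % number of player agents
  mem(3).           % memory size m
  numstr(2).        % number of strategies per agent
  numloss(5).       % losses tolerated before a strategy is recalculated
  count(0).         % bets received in the current step
  sum(0).           % partial result (sum of the +1/-1 bets of the step)
  totcount(0).      % number of completed steps
  totsum(0).        % running sum of step results, used for the mean
  qsum(0).          % running sum of squared results, used for the variance
  game(more).       % assumed: whether agents must play further (last/more)
  countsession(1).  % assumed: identifier of the current game session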
C. Tuning the Simulation

In a classical MG simulation there are a number of parameters that can affect the system behaviour; these are explicitly represented in the tuple centre in the form of tuples: the number of agents numag(X), the memory size mem(X), and the number of strategies numstr(X). In our framework, we have introduced as a further parameter the number of wrong moves after which a single agent should recalculate its own strategy, represented as a tuple numloss(X). Such a threshold is seemingly useful to break the symmetry in the strategy space when the system is in a pathological state, i.e., when all agents have the same behaviour and the game oscillates between its minimum and maximum values.

In our framework, it is possible to dynamically tune the coordination rules by changing the numloss and mem coordination parameters, which are stored as tuples in the coordination artifact. The simulation architecture built in this way, in fact, allows some game configuration parameters—such as the dimension of the agent memory—to be changed on the fly, with no need to stop the simulation and re-program the agents.

By changing these parameters, the tuning agent can drive the system from one equilibrium state to another, by controlling agent strategies, the dimension of the memory, or the number of losses that an agent can accept before discarding a strategy. This agent observes the system variance, and decides whether and how to change the tuning parameters: the reference variance is calculated by first making the agents play the game randomly—see Figure 4. The new values of the parameters are stored in the tuple centre through the tuples numloss(NumberOfLoss) and mem(MemorySize); the rules of coordination react and update the information that will be read by the agents.
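As a concrete illustration of this update step, a tuning agent could replace a parameter tuple with a consume-then-emit pair of tuple-centre operations, in the same spirit as the player's out/in actions. The fragment below is our own hypothetical sketch, not the paper's code (in the actual framework the tuning is driven by a human through the tuning interface, as described in Section IV-D); the new values mirror those used in the tuned run of Figure 6 (threshold S = 5, then memory m = 5).

  % Hypothetical tuning step on the coordination artifact:
  in(numloss(_))     % withdraw the current loss-threshold tuple
  out(numloss(5))    % publish the new threshold S = 5
  in(mem(_))         % withdraw the current memory-size tuple
  out(mem(5))        % publish the new memory size m = 5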
D. Simulation Results

The results of the tuned simulation in Figures 6 and 7 show how the system changes its equilibrium state and achieves a better value of the variance. In Figure 6, the first phase of equilibrium is followed by a second one, obtained by changing the threshold parameter to S = 5; finally, a third phase is obtained by changing the dimension of the memory to m = 5. In this simulation the tuning agent is played by a human who observes the evolution of the system and acts through the tuning interface to change the coordination parameters, such as the threshold of losses and the memory, hopefully finding new and better configurations. The introduction of the threshold of losses in the agent behaviour is useful when the game is played by few agents: these parameters enable the system to evolve and the agents to cooperate better.

Fig. 6. Variance of the System with Initial Parameters N = 5 and m = 3

Fig. 7. System Evolution of the Variance in Figure 6

V. PERSPECTIVES

In this paper, we aim at introducing new perspectives on agent-based simulation by adopting a novel MAS meta-model based on agents and artifacts, and by applying it to Minority Game simulation. We implement and study MG over the TuCSoN coordination infrastructure, and show some benefits of the artifact model in terms of flexibility and controllability of the simulation. In particular, in this work we focus on the possibility of building a feedback loop on the rules of coordination, driving the system to a new and better equilibrium state.

We foresee some new perspectives in the use of the TuCSoN simulation framework in an industrial environment. The first one is to use the system to drive manufacturing in case of limited resources. In this scenario each agent is a half-processed item, whose production has to be completed as fast as possible, and whose access to the resources is regulated by dedicated resource artifacts. Another possible perspective is to evaluate product demand and production in order to drive industry through market fluctuations. In our framework we could model a market scenario by minority rules, and then try to evaluate demand. Furthermore, all such applications would benefit from using a logic-based approach rather than an equation-based approach.

VI. ACKNOWLEDGEMENTS

The first author of this paper would like to thank Dr. Peter McBurney and the Department of Computer Science at the University of Liverpool for their scientific support and their hospitality during his stay in Liverpool, when this paper was mostly written.

REFERENCES

[1] W. B. Arthur, "Inductive reasoning and bounded rationality (the El Farol problem)," American Economic Review, vol. 84, no. 2, pp. 406–411, May 1994.
[2] D. Challet, M. Marsili, and Y.-C. Zhang, "Modeling market mechanism with minority game," Physica A: Statistical and Theoretical Physics, vol. 276, no. 1–2, pp. 284–315, Feb. 2000. [Online]. Available: http://dx.doi.org/10.1016/S0378-4371(99)00446-X
[3] H. V. D. Parunak, S. Brueckner, J. Sauter, and R. Savit, "Effort profiles in multi-agent resource allocation," pp. 248–255.
[4] W. Renz and J. Sudeikat, "Modeling Minority Games with BDI agents – a case study," in Multiagent System Technologies, ser. LNCS, T. Eymann, F. Klügl, W. Lamersdorf, M. Klusch, and M. N. Huhns, Eds. Springer, 2005, vol. 3550, pp. 71–81. 3rd German Conference (MATES 2005), Koblenz, Germany, 11–13 Sept. 2005, Proceedings. [Online]. Available: http://www.springerlink.com/link.asp?id=y62q174g56788gh8
[5] A. Omicini and F. Zambonelli, "Coordination for Internet application development," Autonomous Agents and Multi-Agent Systems, vol. 2, no. 3, pp. 251–269, Sept. 1999.
[6] M. Viroli and A. Ricci, "Instructions-based semantics of agent mediated interaction," in 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., vol. 1. New York, USA: ACM, 19–23 July 2004, pp. 102–109. [Online]. Available: http://portal.acm.org/citation.cfm?id=1018409.1018737
[7] A. Omicini and E. Denti, "Formal ReSpecT," Electronic Notes in Theoretical Computer Science, vol. 48, pp. 179–196, June 2001.
[8] A. Omicini, A. Ricci, M. Viroli, C. Castelfranchi, and L. Tummolini, "Coordination artifacts: Environment-based coordination for intelligent agents," in 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., vol. 1. New York, USA: ACM, 19–23 July 2004, pp. 286–293. [Online]. Available: http://portal.acm.org/citation.cfm?id=1018409.1018752
[9] N. R. Jennings, C. Sierra, L. Sonenberg, and M. Tambe, Eds., 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004). New York, USA: ACM, 19–23 July 2004.