CLUSTERING AGENT OPTIMIZATION RESULTS IN DYNAMIC SCENARIOS André Restivo a Luı́s Paulo Reis ab a Faculdade de Engenharia da Universidade do Porto Rua Dr. Roberto Frias 4200-465 Porto Portugal b Laboratory of Artificial Intelligence and Computer Science Rua do Campo Alegre, 823 4150-180 Porto Portugal {arestivo,lpreis}@fe.up.pt Abstract The application of optimization algorithms to parameter driven simulations and agents has been thoroughly explored in literature. However, classical optimization algorithms do not take into account the fact that simulations normally have dynamic scenarios. This paper analyzes the possibility of using the classical optimization methods, combined with clustering techniques, in order to optimize parameter driven agents, in simulations having dynamic scenarios. This will be accomplished by optimizing the agents in several random static scenarios and clustering the optimum results of each of these optimizations in order to find a set of typical solutions for the agent parametrization problem. These typical solutions can then be used in dynamic scenario simulations as references that will help the agents adapt to scenario changes. The results of this approach show that, in some cases, it is possible to improve the outcome of simulations in dynamic environments while still using the classical methods developed for static scenarios. 1 Introduction Agents normally have parameters that can be changed in order to tweak their performance. Finding the parameters that yield the best results is a recurrent problem that anyone developing agents has already faced. Several optimization algorithms have been developed and studied over the years such as: Sim- ulated Annealing, Tabu Search and Genetic Algorithms [7]. However, most of the times, these algorithms will only optimize agent parameters for a certain static scenario [1]. If the scenario being analyzed by the agents changes, the optimum parameters will probably also change. Simulation scenarios are often also defined by a set of parameters. In cases where this does not happen, parameters describing the scenario can sometimes be extracted. In this way, it can be said that a simulation normally has two sets of parameters that will influence its outcome: environment, or scenario, parameters (that are normally out of the agent’s control) and agent parameters that can be changed in order to get better results. One way of optimizing agents living in dynamic simulations, would be to run the optimization process against several different static scenarios and then have the agent constantly use the optimum parameters found for the scenario that most resembles the current one. This solution has several drawbacks like, for example, the overhead caused by the constant changing of parameters. The proposed way, described in this paper, of tackling this problem is to use clustering algorithms in order to minimize the number of different parameter sets, thus minimizing the number of times parameters need to be changed without losing responsiveness or efficiency. We will show that, in certain simulations, using this minimized set of parameter configurations can improve agent performance. a1 s1 parameters parameters scenario agent a2 s2 Figure 1: Optimum Agent Parameters per Scenario (clustered) 2 Clustering Clustering can be defined as grouping sets of elements that are similar in some way. A simple way of achieving this objective is by obtaining maximum intra-cluster similarity and maximum inter-cluster dissimilarity. Clustering methods have to deal with two different problems: membership (whether an element belongs to a certain cluster or not) and how many clusters to create. Most methods only deal with the first of these problems but some strategies have been developed to determine the number of clusters and cluster membership at the same time [3]. To develop an agent optimization system that can cope with scenario changes, solutions from an optimization problem must be grouped together, having in mind that different solutions apply to different initial conditions of the simulation. The objective is to find out which different kinds of solutions exist and to which type of problems each one of them can be applied (see Figure 1). To implement the aggregation of scenarios, the K-Means clustering algorithm was used [4]. The K-Means algorithm starts with the creation of k clusters with initial centroids estimated using some kind of heuristic. Elements are then assigned to the cluster with the nearest centroid (usually using simple euclidean distance measurements). New cluster centroids are then calculated and the process repeats until convergence is met (i.e. no element changes cluster). The major drawbacks of this method are: the fact that the number of clusters must be predeter- mined; having a poor performance as distances to the cluster centroids must always be recalculated in each step; and its results being very dependent of the initial choice of cluster centroids. 3 Scenario Clustering − → Optimizing n different scenarios for a simulation S, will produce a set of results R , in the form − → R = (r1,..., rn ). Each one of those results r is the union of three other values r = (− →s ,− → p , v), where − →s is a set of environmental parameters, in the form − → s = (s1 , . . . , sm ), − → p is a set containing the agent parameters that optimize that scenario, in the form − → p = (p1 , . . . , pk ), and each v is the best value achieved in that optimization. The values n, m, and k represent, respectively, the number of different scenarios optimized, the number of different environmental parameters and the number of different parameters optimized in each scenario. − → − → If we take all − → p from the set R we get a set P containing all good solutions for the problem (although each solution is optimum only for a specific scenario). Having the same approach we can − → − → derive a set S from all the −→ s values from R . − → Hypothesis: In certain simulations, given a set P , containing the optimum solutions − → − → − → for a set of representative scenarios S , it is possible to construct subsets P 1 , . . . , P q whose elements are similar, inside each one of those subsets, but dissimilar to elements of the other subsets. In other words, what we are stating is, that for some simulations, classes of solutions should emerge from the set of all possible solutions. It is important to notice that some simulations do Parametrization Scenario Figure 2: Continuous scenario selection not have a clear set of different classes of solutions. Some simulations can be so erratic that similar scenarios have completely different solutions and others might have solutions so evenly scattered throughout the solution space that no classes can be identified. 4 Scenario Adaptation One way of implementing scenario adaptation will be by simply listing all scenarios tested with their optimum parameter as calculated by the optimizer in the form: S(i) = si1 , . . . , sin , pi1 , . . . , pim (1) Agents could then just find the scenario from the listing that is nearest to the present simulation conditions (see Figure 2). A simple metric, like the euclidean distance, could be used to determinate the correct scenario. This approach, although simple, has some disadvantages: • The scenario listing might be to extensive. This could make finding the best scenario com- putationally impossible. • In some simulations constantly changing parameters can be complicated or undesired. This method would force simulations to change their parameters only due to small scenarios fluc- tuations. The second problem could be solved by only changing the agent parameters in regular intervals (see Figure 3). This would unfortunately cause other problems: • In simulations where conditions do not change often, at least not dramatically, but that need a swift response when they do, this method could cause a slow reaction to environmental changes. • Even if not as regularly this method would force configuration changes even when not strictly necessary. A different approach would be to only change the configuration when the current scenario had an optimum configuration that differed significantly from the current one (or when the current scenario changed dramatically). The problem with this approach is how to evaluate if a configuration/scenario differs greatly from another one. With no sense of scale it becomes tricky to make any type of decision on when to change to another parameter configuration. One solution would be to first analyze, for example, average distances between elements. A better one would be to classify the solutions and group them according to their similarity. A different method of implementing adaptiveness could be developed by using the optimum parameter and scenario classes that the aggregator module produces. This could be done, if for every scenario class we had a representative parameter configuration. This configuration could be Parametrization Scenario Figure 3: Time based scenario selection for example the centroid of the configuration class or the tested configuration most near to the centroid. To use this method we would need the aggregator to output its results in the following form: S(i) = sci1 , . . . , scin , pi1 , . . . , pim (2) Where scij is the coordinate j of the centroid of scenario class i and pik is the value of the parameter k that was found to be optimum for that same scenario class. Some different methods can be used to find which parameter configuration to use for each scenario class: • Test all parameters configurations found for that class against all scenarios. This method might reveal impracticable due to performance reasons; • Find the nearest tested parameter configuration to the class centroid; • Select a sample of the parameter configurations of that class and test them against the class scenarios; • Select random parameter configurations nearer the class centroid and test them against the class scenarios; • Apply an optimization algorithm but this time use the average result from the complete scenario class instead of testing one scenario at a time. The advantages of this method over the Nearest Scenario approach are that we can react to sudden changes quickly and, at the same time, we are not changing the simulation configuration constantly as a reaction to small environmental changes. Then main concern one has to have when using the Nearest Aggregate approach, is to be sure that the parameter configuration chosen for each scenario is good enough for all the elements in that scenario. This can be easily done by testing that configuration against all representatives of the scenario class. As we do not know the scale of the simulation results, we need some kind of input from the user to be able to determine how much of a loss is admissible. 5 Test Case Scenario In order to find a simulation that allowed the testing of all the desired aspects of the implemented system, several characteristics were sought: • Different Scenarios - The simulation had to have different scenarios that required com- pletely different parameter configurations. This would allow testing the optimization module, as well as the aggregation module; Figure 4: Traffic Simulation Interface • Parametrized Configurations - The agents should have a set of configurable parameters. The optimum configuration of these parameters should depend on the current scenario; • Dynamic Scenarios - It should be possible to create a dynamic scenario. In this way the simulation would experiment different parameter configuration needs during each run. This would allow testing the scenario adaptation capabilities of the system; • Simulation Speed - A simulation run should not take too long in order to allow the maxi- mum number of tests possible; • Stochasticity - A stochastic simulation would allow testing if the simulation adapted well to this type of situations. Several possible simulations were considered, and in the end the choice was to implement a very simple traffic simulation system. This simulation has all the characteristics listed previously and was fairly easy to implement. The model chosen was very simple: • Several roads, each with only one lane in either direction; • Roads could be either vertical or horizontal; • Each car would enter the city in a certain lane and exit the city in that same lane. Lane changing, or turning, were not considered to keep the simulation simple; • Traffic lights at each road intersection. A traffic light could be in one of three states: open for vertical traffic, open for horizontal traffic or changing states (yellow light); • Cars would follow the car in their front according to a simple driver model [8]. Besides having the characteristics just listed, in this simulation it was expected that classes of solutions (containing different agent parameter configurations) would emerge. 6 Tool Implementation A generic optimization system has been implemented to validate the ideas behind this paper. Four main points are behind the construction of this prototype: • In the simulation optimization field, researchers often develop their own optimization systems. This happens mainly because it is relatively easy to develop a fairly decent optimizer from scratch. However having a generic optimization system available would allow the researcher to optimize his simulation with several, and perhaps more advanced, optimization methods and use some already developed analysis tools. Random Optimizer Simulation Simulation Simulation Scenario & Runners Simulation Simulation Runners Generator Evaluator Simulation Runners Dynamic Scenario Simulation Clustering Generator Figure 5: Generic Optimizer and Aggregator Architecture • No system is ever generic enough for everyone. So, having the possibility of extending a system is crucial when developing a generic system. In particular this optimization system should allow new optimization methods to be added easily. • Simulation based optimizations are always, or almost always, CPU intensive. This happens because optimization algorithms need to run simulations, which sometimes are already CPU intensive, numerous times, in order to achieve good results. An usable optimization system must take this into account.Distribution is the most obvious way of working around this problem, so a good optimization system should allow the distribution of workload across different machines. • Simulations are used many times to optimize a set of parameters that will be later used in the real world. Most optimization systems approaches do not take into account the fact that there are environment changes occurring every time in real life applications. As explained before, a simulation normally receives a set of parameters and outputs a set of results. Besides that, most simulations have a set of environment parameters that can be user adjusted or randomly selected. A generic optimization system must cover as many different configurations as possible. However, creating a system that is to complicated to use should be avoided. So, a few concessions had to be made in order to keep the system simple. However, the system allows different simulation configurations to be used by means of system extensibility. The system is composed by several modules that will be introduced in the following paragraphs and are explained in full detail in [5]. To begin, a module that will interact with the different type of simulations was needed. This module is, first of all, distributable so we can have several instances of it running in different machines. It should also be able to receive a binary file for a certain simulation, instructions on how to run it with different parameters and how to gather results from it. A second module, that works closely with the one just described, is responsible for evaluating simulation runs. Another task of this module is to mask the fact that simulations are normally stochastic in nature. Optimization is obviously the major goal of the system so an optimization module is essential. This module uses the evaluator module in order to get the results from various simulation runs and use these results to find the optimum parameters for a given scenario. A final module aggregates and analyzes results from the optimizer, in order to create dynamic optimization schedules for the simulation, and allow the user to better understand how parameters and scenarios influence each others. The clustering module receives the optimum parameters for each scenario from the optimizer and outputs the dynamic optimization schedules that are then used by a simulation having dynamic scenarios. Figure 5 captures the various modules and their interaction. Table 1: Adaptative Simulation Results Avg/Max Waiting Time Max Queue Size Configuration Changes Single 28/87s 610 cars 0 Complete 30/94s 130 cars 1116 Clustered 23/82s 17 cars 162 7 Results Several different scenarios were randomly generated with an optimization algorithm (in this case a Genetic Algorithm [2]) applied to each one of them. This generated approximately one hundred different scenarios and their respective agent optimum parameters. These results were then aggregated into several representative clusters using the K-Means algo- rithm. This step, produced a set with only nine different scenarios. An altered version of the traffic light simulator was developed, that instead of receiving a static traffic light schedule, would receive a set of scenario and agent parameter configuration pairs. This version of the simulator would constantly select the scenario, from that set, that most resembled the current scenario and use the traffic light configuration that was considered as the best for that particular case. This setting was run with three different configuration files: • The complete set of scenarios (one hundred) and their optimum parameter configurations. • A significantly smaller set of scenarios created with base in the larger set using the aggregation process. • A single optimum parameter configuration that was calculated as being the best for an average scenario. Table 1 captures the most important results from each test run. Using a smaller subset of results, one would expect fewer changes in the used configuration. In fact Table 1 shows exactly that. Using the complete set of results the simulation changed parameters 1116 different times against 162 times using the clustered subset. As less information was available, theoretically the simulation could not adapt as well as in the previous method. However, using the clustered results did not affect the simulation performance and even made it more efficient. There is a simple reason that explain this, at first glance, awkward behaviour. This happened because constantly changing parameters can lead to a loss in perfor- mance. In this particular case each time the parameters changed the timing of the traffic lights would be affected momentarily causing smaller, or longer, traffic light patterns than the optimum ones. In other scenarios the cost of changing parameters could be even higher (e.g. if the agents had to change their planning every time the parameters changed). 8 Conclusions Agent based simulations, involving complex and dynamic scenarios, have poor results when single static configurations are used. One way of coping with this problem is to have a large set of configurations that will be used in each specific scenario. The problem is that some simulations do not cope well with configuration changes. This can be true either because changing the configuration is expensive or because a period of instability is created when the configuration is changed. In this paper, it has been shown that using clustering algorithms to create smaller, but still meaningful, sets of configurations is a valid method that can be used to minimize the number of possible configurations thus minimizing the number of configuration changes. In the particular simulation tested it has also been shown that using this same method, better results can be achieved. The K-Means clustering algorithm has been shown as an effective clustering algorithm for this particular problem. More complex simulations might require different clustering algorithms. Other tactics to minimize the number of times the configuration is changed have been discussed like, for example, only changing the configuration when the current scenario differed significantly from the last scenario where the configuration has been changed. It has been explained that this alternative can create some other problems, like estimating how much change is needed for a scenario to be significantly different from another. Other problems would exist if two close scenarios needed radically different approaches. Another approach would be to analyze if a configuration change is needed from time to time. This alternative approach would not work well if the simulation was very dynamic and quick responses were needed when the scenario changes. 9 Future Work The only clustering method implemented as support for this dissertation has been the K-Means algorithm. Further testing could be done with different clustering algorithms or with variations of the used algorithm. In this paper it has been assumed that clusters always have a n-dimensional spherical form. This is not always true. Optimization results can form very different types of clusters like ellipsoids and stripes or even have a completely different formations that can only be defined mathematically. In the later case, clustering algorithms would no longer be useful and other methods had to be studied. The method presented for scenario adaptation is just one of many different possible approaches. Several other methods could be researched. One obvious problem that needs to be tackled is that of simulations where it is hard to make runs in static scenarios and are difficult to assess using small time sections of the simulation. For example, in the Robo Soccer Simulated League [6], the scenario could be seen as a set of different parameters, representing things like the behavior of the other team, remaining time or ball position. In this case it would be impossible to run the simulation with a static scenario as some of these parameters would change throughout the simulation. 10 Acknowledgements This work was partially supported by FCT Project FCT/POSC/EIA/57671/2004 (ABSES - Agent Based Simulation of Ecological Systems). References [1] Farhad Azadivar. A tutorial on simulation optimization. In WSC ’92: Proceedings of the 24th conference on Winter simulation, pages 198–204, New York, NY, USA, 1992. ACM Press. [2] Thomas Bäck and Hans-Paul Schwefel. Evolutionary computation: An overview. In T. Fukuda, T. Furuhashi, and D. B. Fogel, editors, Proceedings of 1996 IEEE International Conference on Evolutionary Computation (ICEC ’96), Nagoya, pages 20–29, Piscataway NJ, 1996. IEEE Press. [3] C. Fraley and A. E. Raftery. How many clusters? Which clustering method? Answers via model-based cluster analysis. The Computer Journal, 41(8):578–588, 1998. [4] A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: a review. ACM Computing Surveys, 31(3):264–323, 1999. [5] André Restivo. Dynamic scenario simulation optimization. Master’s thesis, Faculdade de En- genharia da Universidade do Porto, 2006. [6] RoboCup. Robocup international homepage (http://www.robocup.org/), 2006. [7] J. Swisher, P. Hyden, S. Jacobson, and L. Scruben. A survey of simulation optimization tech- niques and procedures. In K. Kang J.A. Joines, R.R. Barton and P.A. Fishwick, editors, Pro- ceedings of the 2000 Winter Simulation Conference, 2000. [8] Martin Treiber, Ansgar Hennecke, and Dirk Helbing. Congested traffic states in empirical observations and microscopic simulations. Physical Review E, 62:1805, 2000.