Self-Explaining Agents in Virtual Training

Maaike Harbers1,2, Karel van den Bosch2, and John-Jules Meyer1

1 Utrecht University, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
2 TNO, Kampweg 5, 3796 DE Soesterberg, The Netherlands
maaike@cs.uu.nl, jj@cs.uu.nl, karel.vandenbosch@tno.nl

Abstract. Virtual training systems are increasingly used for the training of complex, dynamic tasks. To give trainees the opportunity to train autonomously, intelligent agents are used to generate the behavior of the virtual players in the training scenario. For effective training, however, trainees should also be supported in the reflection phase of the training. We therefore propose to use self-explaining agents, which are able to generate and explain their own behavior. The explanations aim to give a trainee insight into other players' perspectives, such as their perception of the world and the motivations for their actions, and thus to facilitate learning. Our project investigates the possibilities of self-explaining agents in virtual training systems, and their effects on learning.

1 Introduction

Virtual training is used to train people for complex, dynamic tasks in which fast decision making is required, e.g. crisis management, military missions or fire-fighting. In a typical virtual training session, a trainee has to accomplish a given mission and must therefore interact with other virtual players, e.g. team members, opponents, or neutral participants. Currently, in most virtual training these players are controlled by other trainees or by instructors. However, using intelligent agents instead of humans gives trainees more flexibility to train where and whenever they want, and it reduces costs. Fire-fighters, for example, could train during a night shift, when they spend most of their time waiting for an alarm.

Intelligent agents can only (partly) replace humans if they are able to generate believable behavior, which may be complex. Moreover, trainees should be supported to reflect on the training, because reflection promotes learning [13]. Reflection can be provoked by providing explanations of the virtual players' behavior, or the possibility to request such explanations, which can give a trainee insight into their perspectives. Such a facility requires intelligent agents that are able to explain their actions, so-called self-explaining agents.

In this paper, we present a PhD project addressing the topic of self-explaining agents in virtual training. In section 2, we discuss related work, and in section 3 we formulate our research question. In section 4, we discuss our approach in more detail and present the results achieved so far. We end the paper with a conclusion and an outline of future research in section 5.

2 Related work

A lot of research has been done on intelligent tutoring systems (ITSs) [11, 10], a topic closely related to self-explaining agents. ITSs teach students how to solve a problem or execute a task by giving explanations during and after task execution. ITSs have been successfully designed for the training of well-structured skills and tasks such as programming or mathematics. In contrast, the tasks trained in virtual training systems are usually real-world, complex and dynamic. The space of possible actions of a trainee is large, and often there is no single 'right' way to accomplish a task.
So instead of explanations that give hints and recipes of what is to be done, as provided by ITSs, explanations in virtual training should give insight into the processes in the training scenario. Trainees can use these to make sense of the situation and construct their own picture of what is going on, and thus reflect on their own performance.

A few proposals for self-explaining agents in virtual training systems have been made. The first, called Debrief [9], has been implemented as part of a fighter pilot simulation and allows trainees to ask for an explanation of any of the artificial fighter pilot's actions. To generate an answer, Debrief repeatedly and systematically modifies the recalled situation and observes the effects on the agent's decisions. From these observations, Debrief determines which factors were responsible for 'causing' the decisions. Debrief thus derives what must have been the agent's underlying beliefs for an action, but sometimes an action has several possible explanations. If (some of) the agent's reasoning steps were made explicit instead of being derived from observable behavior, the actual reasons for executing an action could be given.

A more recently developed account of self-explaining agents is the XAI explanation component [14]. The XAI system has been incorporated into a simulation-based training system for commanding a light infantry company. After a training session, trainees can select a time and an entity, and ask questions about the entity's state. However, the questions concern the entity's physical state, e.g. its location or health, not its mental state. A second version of the XAI system [4, 1] was developed to overcome the shortcomings of the first; it claims to support domain independence, modularity, and the ability to explain the motivations behind entities' actions. This second XAI system is applicable to different simulation-based training systems, and for the generation of explanations it depends on information made available by the simulation. Most simulations, however, do not represent agents' goals, nor the preconditions and effects of actions, so explanations in terms of agents' reasons still cannot be given.

3 Research question

The previous section showed that ITSs usually support the learning of well-structured tasks, in which it is clear which actions are right and wrong. In the virtual training systems we focus on, however, training tasks can be achieved in many different ways, and they require a different type of feedback than that provided by ITSs. Self-explaining agents might be a good alternative, but we believe that the existing accounts lack some crucial aspects. They either only give explanations about the agent's physical state, or they derive information about the agent's mental state from its behavior or from the simulation. We believe that an agent's behavior is best explained by its actual underlying motivations, i.e. information about its mental state, and that the explanation component should therefore have direct access to this information rather than having to reconstruct it from its effects. To address these shortcomings, the PhD project presented in this paper poses the following research question: How can we develop self-explaining agents and how can they be applied in virtual training systems to support training?

The question is two-fold: the first part is a technical question, and the second part focuses more on educational aspects. In the remainder of this paper we discuss our methodology and the results achieved so far.
4 Our approach

To obtain direct access to an agent's reasons for executing an action, we believe that behavior explanation should be connected to the generation of behavior. The deliberation steps that are taken to generate an action are also best suited to explain that action, and when these deliberation steps are understandable, the explanations should be as well. To obtain understandable deliberation steps, the agent's reasoning elements should have some level of abstraction. For instance, the description "an agent is opening a door" is more useful for understanding its behavior than "an agent is moving object x from position (x1,y1,z1) to (x2,y2,z2)".

We have chosen a BDI-based (beliefs, desires, intentions) approach [12], so that our agents reason with concepts such as beliefs, desires and plans, and also provide explanations in these terms. We chose the BDI approach because it matches the way humans give explanations. Humans adopt a certain 'stance' or 'mode of construal' for explaining and predicting phenomena, and different stances are chosen to explain different phenomena [7]. Dennett, for example, distinguishes the mechanical, the design, and the intentional stance [3]. The intentional stance considers entities as having beliefs, desires and other mental contents, and it is the most natural stance for explaining the behavior of humans, or of virtual characters that behave like humans. To understand the behavior of agents, it only matters whether they behave as if they had beliefs and desires. However, agents that have to generate understandable explanations based on their deliberation should also have actual beliefs and desires and reason with them.

The BDI approach defines an agent's reasoning elements, but it does not specify how actions are generated from an agent's goals and beliefs, i.e. how planning works. For an account of planning, we have looked at the GPGP (generalized partial global planning) approach [8]. GPGP is a framework for the coordination of small teams of agents and makes use of task structures; TAEMS (task analysis, environment modeling and simulation) is the language used to represent these task structures. The underlying model of the GPGP approach can be represented conceptually as an extended AND/OR goal tree in which tasks are decomposed into subtasks, which are in turn decomposed, etc. The leaves of the tree are primitive (non-decomposable) actions. Following the GPGP approach, we structure the possible goals, plans and actions of an agent in a task hierarchy. The task at the top of the hierarchy is an agent's goal, the subtasks are possible plans for reaching that goal, and the leaves are the agent's actions. Consequently, a task hierarchy is defined for each of the agent's goals. The actions that an agent executes to achieve a goal depend on three aspects, explained below: its beliefs, the nature of the task-subtask relation, and its preferences.

Fig. 1. The goal, plans and actions (boxes), and beliefs (circles) of a fire-fighting agent

First, the GPGP model is designed for a team of agents, whereas we take a single-agent perspective. Therefore, in our model the beliefs of a single agent can be added to the task hierarchy, to form the conditions that determine which tasks can possibly be executed. Second, three task-subtask relations can be distinguished. A task can be executed when:

– all of its subtasks are executed (All),
– one of its subtasks is executed (One), or
– all of its subtasks are executed in a specific order (Order).

Third, the agent's preferences determine the order in which subtasks are executed under an All relation, and which subtask is executed under a One relation.

Figure 1 shows the model of a simple fire-fighting agent. The agent's main goal is to handle the incident, and it has several plans available to achieve this goal. Its current beliefs determine how the agent 'walks through' the hierarchy. The first step is to choose between saving a victim (if the agent believes that there is a victim) and extinguishing a fire (if the agent believes that there is a fire and no victim, because saving victims is preferred over extinguishing fires). To save a victim, the agent first has to search for the victim and then carry it away. To extinguish a fire, it can use either water or foam, depending on its beliefs about the availability of water and foam.
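As an illustration, the sketch below shows one possible way to represent and traverse such a belief-conditioned task hierarchy for the fire-fighting agent of Figure 1. It is a minimal sketch in plain Python rather than our actual agent implementation; all task and belief names are illustrative assumptions, preferences are encoded simply as the order in which subtasks are listed, and the All and Order relations are collapsed into "expand every applicable subtask in list order".

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Set

class Relation(Enum):
    ALL = auto()    # task is done when all applicable subtasks are executed
    ONE = auto()    # task is done when one subtask is executed
    ORDER = auto()  # all subtasks, in the listed order
    LEAF = auto()   # primitive, non-decomposable action

@dataclass
class Task:
    name: str
    relation: Relation = Relation.LEAF
    # Belief condition: the task is applicable if this holds in the current belief base.
    condition: Callable[[Set[str]], bool] = lambda beliefs: True
    # Subtasks, listed in the agent's order of preference.
    subtasks: List["Task"] = field(default_factory=list)

def select_actions(task: Task, beliefs: Set[str]) -> List[str]:
    """Walk the hierarchy and collect the primitive actions the agent would execute."""
    if task.relation is Relation.LEAF:
        return [task.name]
    applicable = [sub for sub in task.subtasks if sub.condition(beliefs)]
    if task.relation is Relation.ONE:
        applicable = applicable[:1]  # the most preferred applicable subtask
    # For ALL and ORDER we simply expand every applicable subtask in list order,
    # so preference order doubles as execution order in this simplified sketch.
    actions: List[str] = []
    for sub in applicable:
        actions += select_actions(sub, beliefs)
    return actions

# The fire-fighting agent of Figure 1 (task and belief names are illustrative).
handle_incident = Task(
    "handle the incident", Relation.ONE,
    subtasks=[
        Task("save victim", Relation.ORDER,
             condition=lambda b: "victim" in b,
             subtasks=[Task("search for victim"), Task("carry victim away")]),
        Task("extinguish fire", Relation.ONE,
             condition=lambda b: "fire" in b and "victim" not in b,
             subtasks=[
                 Task("use water", condition=lambda b: "water available" in b),
                 Task("use foam", condition=lambda b: "foam available" in b),
             ]),
    ])

print(select_actions(handle_incident, {"fire", "foam available"}))  # -> ['use foam']
```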
The same agent model can also be used to generate explanations of the agent's actions. The actions that result from an agent's deliberation process can be explained by the beliefs, goals and reasoning steps that were involved in that process. For instance, extinguishing a fire with foam could be explained as follows: "I wanted to handle the incident, and I believed there was a fire and no victim, therefore I wanted to extinguish the fire, and I believed there was foam and no water, therefore I used foam." Such explanations can become quite long, which is not desirable [7]. Therefore, the most informative elements of an explanation have to be selected, e.g. "I used foam because there was no water".
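To make this concrete, the following sketch (again plain Python with illustrative names, not our actual implementation) assembles the full explanation from a recorded deliberation trace and then applies a naive filter that keeps only the beliefs of the last, most specific step.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Step:
    """One deliberation step: the goal that was adopted and the beliefs that licensed it."""
    goal: str
    beliefs: List[str]

def full_explanation(trace: List[Step], action: str) -> str:
    """Verbalize every deliberation step that led to the action."""
    parts = [f"I wanted to {s.goal}, and I believed {' and '.join(s.beliefs)}" for s in trace]
    return ", therefore ".join(parts) + f", therefore I {action}"

def filtered_explanation(trace: List[Step], action: str) -> str:
    """Naive filter: keep only the beliefs of the last, most specific step."""
    return f"I {action} because {' and '.join(trace[-1].beliefs)}"

# Deliberation trace behind 'extinguishing the fire with foam' (illustrative content).
trace = [
    Step("handle the incident", ["there was a fire", "there was no victim"]),
    Step("extinguish the fire", ["there was foam", "there was no water"]),
]

print(full_explanation(trace, "used foam"))
# I wanted to handle the incident, and I believed there was a fire and there was no victim,
# therefore I wanted to extinguish the fire, and I believed there was foam and there was no water,
# therefore I used foam
print(filtered_explanation(trace, "used foam"))
# I used foam because there was foam and there was no water
```

Note that this naive filter still returns both beliefs of the last step, whereas the desired short explanation keeps only "there was no water"; selecting the genuinely most informative element requires the kind of filters discussed in the conclusion.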
For the implementation of our agent model it is required, first, that the agent's reasoning elements and deliberation steps can be explicitly represented in the programming language, and second, that the agent has access to this information. We have chosen to implement our model in the agent programming language 2APL [2]. 2APL is a BDI-based programming language, so the goals, plans and beliefs of an agent can be represented explicitly. Moreover, a 2APL agent is capable of introspection into its own beliefs and goals. For more details on the agent model and its implementation, see [5, 6].

5 Conclusion

We argued that virtual training of complex and dynamic tasks requires intelligent agents that can explain their behavior in such a way that it helps trainees to improve their understanding. So far, we have developed an agent model capable of generating and explaining behavior, and we have implemented this model.

We are currently reviewing the literature on cognitive behavior research to determine which information people find most useful in explanations. Based on the outcome of this study, we want to develop filters that select the most useful information from longer explanations. Furthermore, we want to extend the agent model with factors such as emotions, personality or social contracts, which may also influence an agent's behavior. Explanations referring to these properties may help trainees to become more sensitive to them and more understanding of them. The next step will be to connect the self-explaining agent to an existing virtual training system and to perform user experiments. We believe that our approach can create a learning tool that does not currently exist, and which fulfills the requirements of autonomous training of complex and dynamic tasks. Our goal in this project is to demonstrate that the self-explaining agents we are developing deliver appropriate and useful explanations, leading to improved learning.

Acknowledgements

This research has been supported by the GATE project, funded by the Netherlands Organization for Scientific Research (NWO) and the Netherlands ICT Research and Innovation Authority (ICT Regie).

References

1. M. Core, D. Traum, H. Lane, W. Swartout, J. Gratch, and M. van Lent. Teaching negotiation skills through practice and reflection with virtual humans. Simulation, 82(11), 2006.
2. M. Dastani. 2APL: a practical agent programming language. Autonomous Agents and Multi-Agent Systems, 16(3):214–248, 2008.
3. D. Dennett. The Intentional Stance. MIT Press, 1987.
4. D. Gomboc, S. Solomon, M. G. Core, H. C. Lane, and M. van Lent. Design recommendations to support automated explanation and tutoring. In Proc. of the 14th Conf. on Behavior Representation in Modeling and Simulation, Universal City, CA, 2005.
5. M. Harbers, K. van den Bosch, F. Dignum, and J.-J. Meyer. A cognitive model for the generation and explanation of behavior in virtual training systems. In Proc. of ExaCt 2008, Patras, Greece, forthcoming.
6. M. Harbers, F. Dignum, K. van den Bosch, and J.-J. Meyer. Explaining simulations through self-explaining agents. In Proc. of EPOS 2008, Lisbon, Portugal, forthcoming.
7. F. Keil. Explanation and understanding. Annual Review of Psychology, 57:227–254, 2006.
8. V. Lesser, K. Decker, N. Carver, A. Garvey, D. Neiman, M. Nagendra Prasad, and T. Wagner. Evolution of the GPGP/TAEMS domain-independent coordination framework. Autonomous Agents and Multi-Agent Systems, 9:87–143, 2004.
9. W. Lewis Johnson. Agents that learn to explain themselves. In Proc. of the 12th Nat. Conf. on Artificial Intelligence, pages 1257–1263, 1994.
10. T. Murray. Authoring intelligent tutoring systems: An analysis of the state of the art. International Journal of Artificial Intelligence in Education, 10:98–129, 1999.
11. M. Polson and J. Richardson. Foundations of Intelligent Tutoring Systems. Lawrence Erlbaum Associates, Mahwah, NJ, 1988.
12. A. Rao and M. Georgeff. Modeling rational agents within a BDI-architecture. In J. Allen, R. Fikes, and E. Sandewall, editors, Proc. of the 2nd Internat. Conf. on Principles of Knowledge Representation and Reasoning, pages 473–484, San Mateo, CA, 1991. Morgan Kaufmann.
13. D. Schön. Educating the Reflective Practitioner. Jossey-Bass, San Francisco, 1987.
14. M. van Lent, W. Fisher, and M. Mancuso. An explainable artificial intelligence system for small-unit tactical behavior. In Proc. of IAAI 2004, Menlo Park, CA, 2004. AAAI Press.