I see what you see: Integrating eye tracking into Hanabi playing agents

Eva Tallula Gottwald
egottwald@mills.edu
Mills College
Oakland, CA, USA

Markus Eger and Chris Martens
meger@ncsu.edu, crmarten@ncsu.edu
Principles of Expressive Machines Lab
NC State University
Raleigh, NC, USA

Abstract

Humans' eye movements convey a lot of information about their intentions, often unconsciously. Intelligent agents that cooperate with humans in various domains can benefit from interpreting this information. This paper contains a preliminary look at how eye tracking could be useful for agents that play the cooperative card game Hanabi with human players. We outline several situations in which an AI agent can utilize gaze information, and present an outlook on how we plan to integrate this with reimplementations of contemporary Hanabi agents.

Introduction

Humans often give non-verbal cues to indicate their intentions (Land and Hayhoe 2001) or augment their verbal communication, often subconsciously. It would therefore be beneficial for the usability of computational systems to be able to interpret such signals. However, the subtle, subconscious use of signaling and the lack of simple test domains make interpreting these signals very challenging. For many other AI techniques, games have served as a test environment, because they provide a low-risk, high-fidelity environment and often have a clear performance metric that can be used to measure success. We propose that games involving communication can be used as test environments for the interpretation of non-verbal cues given by humans.

One example of a game that relies heavily on inter-player communication is Hanabi (Bauza 2010), a cooperative card game in which players collaborate to build fireworks represented by cards with ranks from 1 to 5 in five colors. Unlike in traditional card games, players hold their cards facing away from them, i.e. every player sees every other player's cards, but not their own. On a player's turn, they may give a hint to another player about the contents of that player's hand. These hints are limited to telling the other player either which of their cards have a particular color, or which have a particular rank. For example, player A may tell player B which of their cards are red and which are not, but may not point out only a subset of the red cards. Giving a hint expends a hint token, of which the players initially collectively have eight. Instead of giving a hint on their turn, players may also opt to play a card by choosing any card from their hand and putting it on the table. If the played card is the next card in numerical order of its corresponding color, e.g. if a blue 4 was played and the highest blue card currently on the table is a 3, the card is added to the board; otherwise it is discarded and a mistake is noted. When there is no card of a particular color on the board, a 1 is considered to be the next card in numerical order. Players may also opt to outright discard a card instead of playing it; this recovers one hint token. After players play or discard a card they draw a new card from the deck to restock their hand. The game ends once the players collectively have made 3 mistakes, or when the deck has been exhausted, plus one extra round. The score of the players equals the number of cards on the board, for a maximum of 25 points if all five cards in each of the five colors have been played successfully.

Even though the game provides the players with very limited communication, when human players play the game, they typically follow the same strategy as in normal conversation, using Grice's maxims of communication (Grice 1975):

• The maxim of quantity, by giving necessary hints, but not more
• The maxim of quality, which is enforced by the rules (players may not lie)
• The maxim of relation, by not giving hints that are not relevant to the current state of the game
• The maxim of manner, by trying to avoid hints that could be misinterpreted

However, when games are closely observed, players also often provide clues about their behavior in ways that are not strictly part of game play, such as hesitation, visibly deciding between two players to give hints to, etc. While there has been significant research into Hanabi game play, including how to build agents that play the game well with human players, the interpretation of non-verbal communication during game play has been understudied.

In this paper we present preliminary work that aims to integrate eye tracking into agents that play Hanabi with human players. We have implemented a 2-player version of Hanabi in Unity that integrates with a Tobii eye tracker. We will present our hypotheses of how eye tracking information could be utilized by AI agents, the eye tracking information we have available, and some initial observations about players' gaze behavior.

Related work

Hanabi has been of interest to several AI researchers because of its cooperative nature, its hidden information and its limited communication channels. One approach to the game is to purely optimize the score the agents obtain. Cox et al. (2015) have devised a logical/mathematical protocol to convey a large amount of information using the limited communication Hanabi allows, scoring close to a perfect score in most games. While this approach only works in games with 5 players, Bouzy (2017) presents an improved version that also works with fewer players. Walton-Rivers et al. (2017) present a comparison of several different approaches, including several based on Monte Carlo Tree Search (Browne et al. 2012), focusing on how they perform in simulated games. However, while agents using the techniques discussed by these authors obtain very high scores when playing with each other, the protocols they use are very hard for humans to follow, and certainly not what a human player would intuitively expect.

Another approach for building Hanabi playing agents is more in line with how human players approach the game. Van den Bergh et al. (2015) present several agents using simple if-then rules defined by experts. Osawa (2015) describes agents that follow an expert-informed protocol, while also deducing how information obtained from the other players should be interpreted by having a model of possible interpretations. Note that Walton-Rivers et al. also included several rule-based approaches, including Osawa's and van den Bergh's, in their comparison, some of which did not perform much worse than their Monte Carlo Tree Search variants. Eger et al. (2017) specifically investigated how AI agents interact with human players, noting that agents that exhibit intentional behavior score higher when playing with a human player than those simply following their own protocol.

It has been noted that humans use the gaze of other people to determine their intentions, feelings, etc., starting from as young as 3-4 years (Baron-Cohen 1997). In order to make the interaction with computers more natural, it is therefore of great interest to research the integration of gaze into human-computer interaction (Poole and Ball 2006). Bader and Beyerer (2011) report how users' mental models change their gaze behavior to be more forward-looking, indicating their intentions, as they become more familiar with a task. Hristova and Grinberg (2005) showed that players who are more likely to cooperate in an Iterated Prisoner's Dilemma scenario are also more likely to look at the payouts, while players less likely to cooperate looked more at the computer's moves. While this indicates that player behavior can be predicted from their gaze, the game under consideration was very simple. In the next section we will explore how eye tracking could be used in a more complex domain.

Eye Tracking for Hanabi

In Hanabi, when receiving a hint from another player, it is essential to determine the intention behind that hint in order to interpret it correctly. The work cited above does that either by assuming that the player follows a fixed protocol, or by explicitly or implicitly enumerating all possible current game states given what is known about the hidden information, and determining in which game state a player would give the hint they gave. However, as mentioned, humans often indicate their intentions with their gaze. We therefore postulate that an AI agent with access to eye tracking information can perform better than one without.

Consider, for example, the case where a hint can be interpreted to indicate either a playable card, or a card that should be discarded. By using the player's gaze, the AI agent might be able to disambiguate between the two options. If the card should be played, it is more likely that the player looked at the board where that card should go, whereas a card that should be discarded might prompt the player to look at the other discarded cards more.

We are interested in going beyond this very basic example and looking at more complex cases. Hypothetically interesting scenarios include:

• The player's gaze going back and forth between two cards in the AI agent's hand before giving a hint including one of them. Depending on what the AI agent already knows about the two cards, it may be able to infer additional information. For example, since there is only one copy of each 5, players often give hints to prevent them from being discarded, especially when they think that the person holding the 5 is likely to discard it. However, this is in tension with giving hints that have a more immediate effect on game play. An AI agent could deduce this tension by observing which options a player is considering.
• Because players know how many copies of each card are in the deck, counting cards that were discarded, played or are in the other players' hands can be used to narrow down which cards are in a player's own hand. By tracking which cards a player looks at before performing a play or discard action, it is possible to determine which possibilities they are considering. For example, consider a player who knows that they have a 4, but not which color it is. When they look at all discarded 4s and a particular card in the AI agent's hand before playing their own 4, it is possible to deduce that the card in the AI agent's hand might also be a 4. In particular, if the color of the player's 4 was ambiguous, the AI agent might infer that its card is a 4 of a color that would help disambiguate the color of the player's 4.
• When the AI agent draws a new card, the duration of the human player's gaze can be used to determine how immediately useful a card is likely to be. This is particularly true if the players are waiting to draw a specific card, such as a missing 1, or if a card that can only be played later in the game, such as a 4, is drawn early. We believe that players' gaze will linger for a shorter time on cards that are not immediately useful. However, if a card is useful, the player has to scan the other cards in the AI agent's hand to determine which hint to give to unambiguously indicate the usefulness of the new card.

To be able to integrate these scenarios into an agent that plays Hanabi, we need to be able to track the player's gaze (to determine what they are focusing on), including which of multiple options it changes between (to determine decision making), and how long it rests in a particular spot (to determine interest or disinterest in a particular option). Additionally, because of the inherent uncertainty of the information obtained, the agent must not take this information as fact, but rather only use it as guidance to help determine player intentions.

Existing Hanabi agents interpret hints that they are given by determining in which situation, or in service of which goal, the other player would give such a hint. If the agent determines that there are multiple applicable situations, it needs to break this tie in some way. The conservative approach would be to refrain from choosing any particular situation and to continue game play with the information obtained, as is done by Osawa's (2015) Outer State Player. Alternatively, in the approach used by Eger et al. (2017), ambiguities are resolved by assuming that players prefer actionable hints that advance the game. Eye tracking data can be used in addition to these options to provide additional weight to each possible situation, without being the sole deciding factor. This is particularly appealing because it would allow our approach to be integrated in multiple existing and new agent designs.

Implementation

In order to test our hypotheses about how to integrate gaze into Hanabi agents, we implemented the 2-player version of Hanabi in Unity with support for a Tobii eye tracker (https://developer.tobii.com/tobii-unity-sdk/). Using the eye tracker, we are able to determine where a player's gaze lingers with reasonable precision to determine which card they are focusing on. Figure 1 shows the user interface of our implementation, as well as an example of where a player's gaze lands on the screen. Note that this data comes from a rough development version which does not currently filter out any noise. We are still in the process of tweaking gaze duration thresholds, and considering additional techniques to reduce the noise, starting with a simple low-pass filter. However, even with this noisy data, one can already see that players focus on particular cards more than others. Another, not entirely unexpected, observation we have made is that a player's gaze is drawn towards UI elements that move or pop up, such as when they are given a hint, or when they click on a pop-up menu.

Figure 1: Unity implementation of Hanabi with eye tracking. (a) A screenshot of our Hanabi implementation during game play. (b) A heatmap of player gaze behavior overlaid on the game screen.

In our current version of the game, the AI agent performs its moves randomly, with our main focus being obtaining and interpreting eye tracking data. In the next section we will discuss how we plan on incorporating this information into the agent's decisions.

Future Work

So far, our efforts have been focused on creating an implementation of Hanabi that incorporates eye tracking. For future work, we want to explore how players' gaze behavior lines up with the situations outlined above, and test the hypothesis that players' gaze can be used by an agent to improve its behavior. While we have already identified that players tend to look at certain UI elements when they become interactable, we have yet to determine how closely a player's gaze correlates with their intentions, and in what way. However, the main advantage of having eye tracking information available is not that it is necessarily an accurate indicator of a player's intention, but rather that it can be used in addition to other techniques. We therefore plan to reimplement Osawa's (2015) and Eger et al.'s (2017) agents, and use eye tracking for cases in which their approaches have to decide between two or more possibilities. To validate this approach, we plan on performing a user study to compare the different approaches. Each participant will play several games with the same agent type, where the agents will ignore the eye tracking information in some games, while in others it is taken into account. We believe that taking gaze into account will allow the AI agents to perform better when playing with human players, but the games could also provide insights into how gaze differs between different players, if at all. Additionally, the score in the game is not the only relevant variable. We will therefore also perform a survey to ask participants if they perceived the AI to understand them better, or play more rationally.

One limitation of this approach, and of any ethical experimental setup in general, is that the participants are necessarily aware of the eye tracker, and may adapt their behavior. Our experimental design therefore only compares games in which eye tracking information is present, but ignored by the AI agents, with games in which it is utilized. By comparing the scores from games in which gaze information is ignored with prior work, we can nevertheless determine whether players also changed their in-game behavior.

Finally, we are also considering applications beyond games, such as assisting users of software tools. By determining users' intentions, help can be given in a more contextual way.

References

Bader, T., and Beyerer, J. 2011. Influence of users mental model on natural gaze behavior during human-computer interaction. In 2nd Workshop on Eye Gaze in Intelligent Human Machine Interaction, 25–32.

Baron-Cohen, S. 1997. Mindblindness: An essay on autism and theory of mind. MIT Press.

Bauza, A. 2010. Hanabi.

Bouzy, B. 2017. Playing Hanabi near-optimally. In Advances in Computer Games, 51–62. Springer.

Browne, C. B.; Powley, E.; Whitehouse, D.; Lucas, S. M.; Cowling, P. I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.; and Colton, S. 2012. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games 4(1):1–43.

Cox, C.; De Silva, J.; Deorsey, P.; Kenter, F. H.; Retter, T.; and Tobin, J. 2015. How to make the perfect fireworks display: Two strategies for Hanabi. Mathematics Magazine 88(5):323–336.

Eger, M.; Martens, C.; and Alfaro Córdoba, M. 2017. An intentional AI for Hanabi. In Computational Intelligence and Games (CIG), 2017 IEEE Conference on, 68–75. IEEE.

Grice, H. P. 1975. Logic and conversation. Syntax and Semantics 41–58.

Hristova, E., and Grinberg, M. 2005. Information acquisition in the iterated prisoners dilemma game: An eye-tracking study. In Proceedings of the 27th Annual Conference of the Cognitive Science Society, 983–988. Lawrence Erlbaum, Hillsdale, NJ.

Land, M. F., and Hayhoe, M. 2001. In what ways do eye movements contribute to everyday activities? Vision Research 41(25-26):3559–3565.

Osawa, H. 2015. Solving Hanabi: Estimating hands by opponent's actions in cooperative game with incomplete information. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.

Poole, A., and Ball, L. J. 2006. Eye tracking in HCI and usability research. In Encyclopedia of Human Computer Interaction. IGI Global. 211–219.

van den Bergh, M.; Spieksma, F.; and Kosters, W. 2015. Hanabi, a co-operative game of fireworks. Bachelor's thesis, Universiteit Leiden.

Walton-Rivers, J.; Williams, P. R.; Bartle, R.; Perez-Liebana, D.; and Lucas, S. M. 2017. Evaluating and modelling Hanabi-playing agents. In Evolutionary Computation (CEC), 2017 IEEE Congress on, 1382–1389. IEEE.
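
As a supplementary illustration of the play rule described in the Introduction, the following minimal Python sketch checks whether a card is the next one in numerical order for its color, records mistakes, and computes the score. It is not taken from the Unity implementation; the class name `Board`, its methods, and the color names other than red and blue are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Only red and blue are named in the text; the other color names are assumed.
COLORS = ("red", "green", "blue", "yellow", "white")
MAX_RANK = 5
MAX_MISTAKES = 3


@dataclass
class Board:
    """Hypothetical board state: highest rank played per color, plus mistakes."""
    top: dict = field(default_factory=lambda: {c: 0 for c in COLORS})
    mistakes: int = 0

    def is_playable(self, color: str, rank: int) -> bool:
        # A card is playable if it is the next rank for its color.
        # With no card of that color on the board (top == 0), that is a 1.
        return rank == self.top[color] + 1

    def play(self, color: str, rank: int) -> bool:
        """Play a card; return True on success, otherwise note a mistake."""
        if self.is_playable(color, rank):
            self.top[color] = rank
            return True
        self.mistakes += 1
        return False

    def score(self) -> int:
        # The score equals the number of cards on the board.
        return sum(self.top.values())

    def lost_by_mistakes(self) -> bool:
        return self.mistakes >= MAX_MISTAKES
```

For instance, on a fresh board a blue 1 is accepted, while a blue 3 played directly afterwards counts as a mistake because the next blue card in order would be a 2.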
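
The idea from the section "Eye Tracking for Hanabi" that gaze should give additional weight to each candidate interpretation of a hint, without being the sole deciding factor, can be sketched as a simple linear mixture. This is only one possible way to combine the two sources and is not the authors' method; the function name `combine_with_gaze`, the `gaze_weight` parameter and its default value are hypothetical.

```python
def combine_with_gaze(prior, gaze_share, gaze_weight=0.3):
    """Blend an agent's own weights over hint interpretations with gaze evidence.

    prior:       dict mapping interpretation name -> non-negative weight
                 assigned by the underlying agent (for example, rule-based).
    gaze_share:  dict mapping interpretation name -> amount of gaze evidence
                 supporting it (for example, time spent looking at the
                 related screen area). Missing keys count as zero.
    gaze_weight: hypothetical mixing factor in [0, 1). Keeping it strictly
                 below 1 means gaze can never be the sole deciding factor.
    """
    if not 0.0 <= gaze_weight < 1.0:
        raise ValueError("gaze_weight must be in [0, 1)")
    total_prior = sum(prior.values())
    if total_prior <= 0:
        raise ValueError("prior weights must sum to a positive value")
    total_gaze = sum(gaze_share.get(name, 0.0) for name in prior)

    combined = {}
    for name, weight in prior.items():
        p = weight / total_prior
        # Without any usable gaze evidence, fall back to the agent's prior.
        g = gaze_share.get(name, 0.0) / total_gaze if total_gaze > 0 else p
        combined[name] = (1.0 - gaze_weight) * p + gaze_weight * g
    return combined
```

In the example of a hint that could mean either "play" or "discard", an agent that considers both equally likely and observes that the player looked mostly at the board would shift its weights towards "play", while the agent's own assessment still contributes the larger share.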
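
The Implementation section mentions a simple low-pass filter and gaze duration thresholds for deciding which card a player is focusing on. The sketch below shows one way this could look, using an exponential moving average as the low-pass filter and summing dwell time per rectangular screen region. It is written in Python for illustration rather than in the C# used by Unity, it does not use the Tobii SDK, and all names (`Region`, `low_pass`, `dwell_times`, `focused`), the smoothing factor and the threshold value are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned screen rectangle, e.g. the area of one card."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def low_pass(samples, alpha=0.5):
    """Exponential moving average over (t_ms, x, y) gaze samples.

    alpha in (0, 1]: smaller values smooth more; alpha == 1 leaves the
    data unchanged.
    """
    smoothed = []
    sx = sy = None
    for t, x, y in samples:
        if sx is None:
            sx, sy = x, y
        else:
            sx = alpha * x + (1 - alpha) * sx
            sy = alpha * y + (1 - alpha) * sy
        smoothed.append((t, sx, sy))
    return smoothed


def dwell_times(samples, regions):
    """Total time in milliseconds that the gaze rested in each region.

    Each sample is counted for the interval until the next sample, so the
    final sample contributes no time.
    """
    totals = {r.name: 0 for r in regions}
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        for r in regions:
            if r.contains(x, y):
                totals[r.name] += t1 - t0
                break
    return totals


def focused(totals, threshold_ms=300):
    """Names of regions whose dwell time reaches the (assumed) threshold."""
    return [name for name, d in totals.items() if d >= threshold_ms]
```

A typical use would be to pass the raw samples through `low_pass`, compute `dwell_times` over the card regions, and treat only the regions returned by `focused` as candidates for what the player was considering. Because the measurements are noisy, the result is best treated as guidance rather than fact, in line with the paper's own caveat.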