=Paper=
{{Paper
|id=Vol-1382/paper23
|storemode=property
|title=A Game-based Model for Human-robots Interaction
|pdfUrl=https://ceur-ws.org/Vol-1382/paper23.pdf
|volume=Vol-1382
|dblpUrl=https://dblp.org/rec/conf/woa/MuranoS15
}}
==A Game-based Model for Human-robots Interaction==
1 Proc. of the 16th Workshop “From Object to Agents” (WOA15) June 17-19, Naples, Italy A game-based model for human-robots interaction∗ Aniello Murano and Loredana Sorrentino Dipartimento di Ingegneria Elettrica e Tecnologie dell’Informazione Università degli Studi di Napoli Federico II, Italy {aniello.murano,loredana.sorrentino}@unina.it I. A BSTRACT Solving a game corresponds to check whether a designed player has a winning strategy in the game, i.e. a sequence of Game theory has exhibited to be a fruitful metaphor to moves that let him satisfy the winning condition no matter reason about multi-player systems. Two kinds of games are how the other players behave. In the literature, we distinguish mainly studied and adopted: turn-based and concurrent. They between the case the condition is given as an “external entity”, differ on the way the players are allowed to move. However, in for example via a formula of a logic [1], [14], or internally real scenarios, there are very simple interplays among players as a condition over the states of the arena. While external whose modeling does not fit well in any of these settings. conditions offer some modularity and allow to formalize very In this paper we introduce a novel game-based framework intricate targets, they require solutions with a very high com- to model and reason about the interaction between robots and plexity [12]. Internal conditions instead offer a good balance humans. This framework combines all positive features of both between expressiveness and complexity and this is the setting turn-based and concurrent games. Over this game model we we consider in this paper. study the reachability problem. To give an evidence of the usefulness of the introduced A very simple and largely used (internal) winning condition framework, we use it to model the interaction between a consists of labeling some states of the arena as “good” and human and a team of two robots, in which the former tries then consider as target the reachability of at least one of them. to run away from the latter. We also provide an algorithm that Properly speaking, the resulting setting is called a reachability decides in polynomial time whether at least one robot catches game. These games have been exploited in both the (two- the human. player) turn-based and (multi-player) concurrent settings and fruitfully applied in several interesting real scenarios. How- ever, there are specific situations that cannot be casted in any II. I NTRODUCTION of these settings. In particular, this happens when we want In recent years, game theory has exhibited to be a fruit- to model the interplay among agents with a different essence. ful metaphor in multi-agent system modeling and reasoning, This is the case, for example in human-robot interaction. To where the evolution of the entire system emerges from the give an evidence of this necessity, we discuss along the paper coordination of moves taken by all agents being involved [1], a scenario involving the interaction between a human and two [20], [21], [22], [15], [23], [7], [8]. In the simplest setting, robots. The shape of the arena is a maze and the three players we consider finite-state games consisting of just two players are initially placed in three different positions. The goal of (or agents), conventionally named Player 1 and Player 2. the human is to run away from the robots, while the robots Depending on the possible interactions among the players, have the opposite goal. Therefore, looking at the game from games can be either turn-based or concurrent. In the former the robots side, the good states are those in which at least case, the moves of the players are interleaved. In the latter one of the robots and the human sit in the same place in the case, instead, the players move simultaneously. In a turn- same moment. We assume that whenever the human decides to based game, the states of the game are partitioned into those move, none of the robots can interfere and vice-versa. In other belonging to Player 1 and those belonging to Player 2. When words the human uses the game on its side as turn-based while the game is at a state of Player i, only Player i determines the robots use it as being concurrent. We introduce a novel the next state. In a concurrent game, instead, the two players model framework that is able to represent efficiently such a choose, simultaneously and independently, their moves and the scenario and provide a solution algorithm that can decide in next state of the game depends on the combination of such polynomial time whether the robots have a winning strategy, moves. i.e., they have a sequence of moves that, no matter how the A game consists of two main objects, the arena and the human behaves, they reach a desired state. winning condition. The arena is used to describe the players, the game states (configurations), and the possible evolution of Related work. Robotic technology is quickly advancing and the game in accordance to the moves the players can take. The this rapid progress inevitably is having a huge effect over winning condition is used to express the objective the players the people. The interaction between human and robots is aim to achieve. a complex issue that challenges both engineering and cog- nitive science. In several settings, such an interaction has ∗ This work is partially supported by the FP7 EU project 600958-SHERPA. been modeled in terms of a suitable interplay between all 146 2 Proc. of the 16th Workshop “From Object to Agents” (WOA15) June 17-19, Naples, Italy actors involved (see [5] for a survey). Several models in this 0 1 2 3 4 context take inspiration from the field of open-system formal verifications [9], [19], [11], [16]. A system is considered open 0 H R1 if it has an ongoing interaction with an external unpredictable environment and the system behavior fully depends on this 1 interaction. To verify an open system means to check that the system is correct no matter how the environment behaves. 2 Several models based on two-player games (system vs. en- vironment) have been proposed in order to model such an interaction as well as suitable algorithms to check whether 3 the system is correct (i.e. wins the game) [3], [24]. In this context, multi-agent games have been also proposed in order to model and reason about multiple players able to play in a 4 R2 cooperative or adversarial manner [1], [14]. The game setting we consider in this paper has several con- Fig. 1. Maze Game. nections with planning problems as well [2], [4], [6]. Indeed, planning can be rephrased as the problem of synthesizing a strategy (the plan) for an agent to achieve a determined task an interplay, the game reaches a configuration in which both within an environment. Often the environment is hostile and he and one of the robots sit in the same room. Accordingly, consists of an aggregation of several entities working together. we label all such configurations as good ones (as seen by the By casting such a scenario in our model setting one can see robots). Thus, deciding whether the team of robots can defeat the environment as the team of cooperative agents working the human corresponds to solve a reachability game. against the one aiming for the planning. It is important to note that the scenario we have discussed Since the environment can be seen as an adversarial player is neither (just) turn-based, as the robots move simultaneously, the correlation the our setting follows nor concurrent, as the human moves independently from the robots. Moreover, the discussed game involves three players and it is not trivially reducible to a two-player one because of III. C ASE S TUDY the particular target: at least one of the robots has to catch the In this section we introduce a simple but effective human- human. robot interaction scenario that will be used along the paper as To model this scenario, we introduce in the next section an application for the game-model framework we introduce. a novel game model in which all players, except a designed The scenario is described in the following and depicted in one, work in team and move simultaneously. The designed Figure 1. one instead will move alternately and independently from the The interaction takes place in a maze and involves tree other players. players: a human H and two robots R1 and R2 . The goal of the human is to escape from both the robots. The robots work IV. T HE CONCEIVED MODEL in team and aim just the opposite. For simplicity we assume In this section we introduce a novel multi-player game- the maze divided in square rooms and we start by considering based framework that is suitable to represent, under a unique that the players sit all in different rooms. All players from structure, both the players moving turn-based and those acting every room can access to an adjacent one unless there is a concurrently. This framework, which we call hybrid, oppor- wall (drawn with a bold line in the figure). We assume to tunely combines and extends the main features behind con- have an interleaving of moves between the human and the current game structures [1] and two-player turn-based games team of robots. Hence, the robots move simultaneously. We (see [18] for an introduction). assume that all players can move in the four directions up, Definition 1 (Hybrid Game models): A hybrid reachability down, left, and right, unless the shape of the maze forbids it. game structure is a tuple G =< Ag, Ac, St, s0 , tr, Stf >, Let us now discuss how such an human-robot interaction can where Ag is a finite non-empty set of agents, partitioned into be rephrased in terms of a game. We make this reasoning more two teams Ag0 and Ag1 . Ac and St are enumerable non- formal in the next section. Starting from an initial position in empty sets of actions and states, respectively, and s0 ∈ St the maze, all players by taking the allowed moves can change is a designated initial state. The set of states is partitioned their position. Each possible placing of the players can be seen in St0 and St1 as those states belonging to the teams Ag0 as a configuration of the game and the starting one is usually or Ag1 , respectively. For i ∈ {0, 1}, let Dci = Agi * Ac denoted the initial configuration. Clearly, we can move in one to be the set of decisions of team i, i.e., partial functions step from one configuration to another only if we have moves describing the choices of an action by all agents in Agi . Then, that allow it. In particular, as seen from the human, moves are tr : Dci ×Sti → St1−i denotes the transition function mapping interleaved in a turn-based fashion, while they are taken in a every pair of decisions δ ∈ Dci and state s ∈ Sti for the team concurrent way as seen by the robots. All legal moves can be i to a successive state over the deterministic graph belonging collected by an opportune data structure or a table. Following the the adversary team. Finally, we define Stf ⊆ St the subset the target of the robots, we have that the human loses if along of terminal (or accepting) states. 147 3 Proc. of the 16th Workshop “From Object to Agents” (WOA15) June 17-19, Naples, Italy To better clarify the meaning of the above formalization, let 0 H R1 us discuss some examples over the maze. From the initial state, which belongs to the team Ag1 , by using the decision lr, the 1 R2 game moves to the state (((0, 0)(0, 1)(1, 1)), 0). In accordance with the flag, this is now a state belonging to the team Ag0 and thus this team (the human) takes the turn to move. From 0 1 2 this state, the human agent has two available moves, that Fig. 2. A Reduced Maze Game. are d and r. In the first case the game moves in the state (((1, 0)(0, 1)(1, 1)), 1) and in the second case it moves in the state (((0, 0)(0, 1)(1, 1)), 1), both belonging to the players in the coalition Ag1 . And so on. In Figure 3, we report the com- We now show how the case study we have presented in putations of the game. It is easy to observe that the accepting the previous section can be easily and formally described by states are ((1, 0), (0, 2)(1, 0)), 0), ((1, 0), (0, 0)(1, 0)), 0) and means of a hybrid game model G. ((0, 1), (0, 1)(1, 1)), 1) since one of the two robots and the human are both in the room. For a sake of clarity, instead of considering the scenario depicted in Figure 1, we consider a reduced one as reported in ((0, 0), (0, 2)(1, 0)), 1) Figure 2. Also, we allow the robots to move only horizontally (left and right), while the human is still able to move in all directions. While this new scenario may look too naive, it will lr avoid a bunch of tedious calculations in the sequel. We model the human-robots interaction over this maze by setting Ag0 as ((0, 0), (0, 1)(1, 1)), 0) the team consisting of the unique player human and Ag1 as the team of robots R1 and R2 . We consider the maze as a grid- board made by K × J positions, with K = {0, 1} and J = d r {0, 1, 2}. In each position, zero, one, or more than one player can sit. States are just a set of positions of the three players ((1, 0), (0, 1)(1, 1)), 1) ((0, 1), (0, 1)(1, 1)), 1) plus a flag we use to recall which team is playing in that state. Formally, we have as set of states St = ((K × J)3 ) × {0, 1}. rl ll Sti contains those states having the flag equal to i. The initial state is just the initial position of the players. ((1, 0), (0, 2)(1, 0)), 0) ((1, 0), (0, 0)(1, 0)), 0) Accordingly to the picture depicted in Figure 2, as- suming R1 and R2 are the next to move, the state is Fig. 3. A game model G for the simplify maze in Figure 2. ((0, 0), (1, 0), (0, 2), 1). We assume in our example that this is the initial state. The possible actions for the robots are set to r for right and l for left. For the possible actions of the human we set u for up, d for down, l for left, and r for V. T HE PROPOSED SOLUTION right. Decisions are defined accordingly and must respect the Under the conceived model, we can handle all possible limitations imposed by the shape of the maze. As far as the targets that can be represented in terms of reachability, i.e., set of accepting states concerns, recall that the target of the the players in the coalition Ag1 set some configurations of the robots is to reach a configuration where at least one of them game as “good” and aim to reach them. These configurations catches the human, being both siting in the same position. are those represented by states Stf in the model. The coalition This means that Stf must contain all those states in which Ag1 wins the game if its players have a winning strategy, i.e., the first pair of coordinates (corresponding to the position of they can force the game, by means of a sequence of moves, the human) is equal to the second or third pair. Formally, to reach a state in Stf , no matter how the players in the team Stf = {((a, b), (c, d), (e, f ), i) |(a = c and b = d) OR Ag1 behave. Deciding a game means deciding whether the (a = e and b = f )}. Finally, it remains to define the transition coalition Ag1 wins the game. relation. For the sake of exposition, we only report the part In this section we provide an algorithm to decide a game corresponding to the team Ag0 . Note that this is coherent with under the hybrid framework we have introduced. This algo- the shape of the maze. rithm aims to find the set of states of the model from which the players in the coalition Ag1 win the game, that is the set of tr(δ, ((i, j)(k, l)(m, n)), 0) = states reach1 (Stf ). As the complement of this set contains the states from which players in the coalition Ag0 win the game, as a corollary we obtain that our game model is zero-sum (i.e. (((i − 1, j)(k, l)(m, n)), 1), if δ = u and i > 0; (((i, j − 1)(k, l)(m, n)), 1), from each node only one team can win the game). if δ = l and j > 0; The algorithm proceeds as follows. We start from the (((i, j + 1)(k, l)(m, n)), 1), if δ = r and j < 2; set Stf containing all winning states for players in Ag1 . Then, in a backward manner, we recursively build the set (((i + 1, j)(k, l)(m, n)), 1), if δ = d and i < 1. 148 4 Proc. of the 16th Workshop “From Object to Agents” (WOA15) June 17-19, Naples, Italy reachi1 (Stf ) containing all states s ∈ St such that players usefulness of the introduced framework by means of a case in Ag1 can force a visit from s to a state in the set Stf in study. less than i steps. Formally, we have that reachi1 (Stf ) = {s ∈ This work opens to several future directions. First, it would St |Ag1 can force a visit from s to Stf in less than i moves}. be interesting to reasoning about quantitative aspects about the Formally, we have the following. human-robots interaction. For example, to determine what is the best strategy to perform. reach01 (Stf ) = Stf ; Another direction would be to consider other possible win- ning conditions. In particular, one could import some winning reachi+1 i 1 (Stf ) = reach1 (Stf ) ∪ conditions studied in formal verification such as the Büchi {s ∈ St1 |∃s0 ∈ St : tr(Dc1 , s) = s0 ∧ s0 ∈ reachi1 (Stf )} ∪ and the parity ones (see [17], [10] for an introduction) or {s ∈ St0 |∀s0 ∈ St : tr(Dc0 , s) = s0 ∧ s0 ∈ reachi1 (Stf )}. external winning conditions given in terms of formulas of a suitable logic. Some scenarios along these lines have been already investigated in the case of turn-based and concurrent In words, from the set reachi1 (Stf ) we select all states that game settings [9], [19] and showed to have some useful appli- have incident edges in this set. From each of these states, cations [13], [14]. We plan, as future work, to investigate them say s, if it belongs to the coalition Ag1 , then, this state in the settings of multi-robots and human-robots interactions. is immediately added to reachi+1 1 (Stf ) (as we may move from s to reachi1 (Stf ) and thus reach Stf ). Conversely, if s is a state belonging to the coalition Ag0 , then it is added R EFERENCES i+1 to reach1 (Stf ) only if all outgoing edges from s fall in [1] R. Alur, T. Henzinger, and O. Kupferman. Alternating-Time Temporal reachi1 (Stf ) (i.e., from s, players in Ag0 are forced to move Logic. Journal of the ACM, 49(5):672–713, 2002. [2] D. Calvanese and G. De Giacomo and M.Y Vardi. Reasoning about to reach1 (Stf )). i actions and planning in LTL action theories. KR, 2:593–602, 2002. Finally we have that [3] L. de Alfaro, T. Henzinger, and O. Kupferman. Concurrent reachability games. In Foundations of Computer Science, 1998. Proceedings. 39th |St| Annual Symposium on, pages 564–575. IEEE, 1998. reach1 (Stf ) = reach1 (Stf ). [4] G. De Giacomo and M. Y. Vardi. Automata-theoretic approach to planning for temporally extended goals. pages 226–238, 2000. As the calculation of reach1 (Stf ) requires a number of [5] M. A. Goodrich and A. C. Schultz. Human-robot interaction: a survey. iterations linear in number of states St, we have that the whole Foundations and trends in human-computer interaction, 1(3):203–275, 2007. algorithm requires at most quadratic time in the size of the [6] H. Geffner and B. Bonet. A concise introduction to models and methods model, as reported in the following theorem. for automated planning. Synthesis Lectures on Artificial Intelligence and Theorem 1: Given a hybrid reachability game, it can be Machine Learning, 8(1):1–141, 2013. [7] J. Gutierrez and M. Wooldridge. Equilibria of concurrent games on decided in quadratic time. event structures. In Proceedings of the Joint Meeting of the Twenty- By a matter of calculation, one can see that by applying Third EACSL Annual Conference on Computer Science Logic (CSL) and the algorithm above over our reduced case study, the coalition the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), page 46. ACM, 2014. Ag1 wins the game from every state. In fact, for each state [8] J. Van Benthem. Logic games: From tools to models of interaction. of the model, there exists always a winning strategy for the Institute for Logic, Language and Computation (ILLC), University of players in the team Ag1 . Amsterdam, 2007. [9] W. Jamroga and A. Murano. On Module Checking and Strategies. In Autonomous Agents and MultiAgent Systems’14, pages 701–708. Inter- VI. D ISCUSSION AND F UTURE W ORKS national Foundation for Autonomous Agents and Multiagent Systems, 2014. In the last years, human-robots interaction is receiving large [10] O. Kupferman, M. Vardi, and P. Wolper. An Automata Theoretic attention in several research fields. An important aspect in this Approach to Branching-Time Model Checking. Journal of the ACM, 47(2):312–360, 2000. study is to come up with appropriate models to design and [11] O. Kupferman, M. Vardi, and P. Wolper. Module Checking. Information reasoning about how such interactions can take place and how and Computation, 164(2):322–344, 2001. they affect the future behavior of the involved actors. In this [12] F. Mogavero, A. Murano, G. Perelli, and M. Vardi. Reasoning About Strategies: On the Model-Checking Problem. Transactions On Compu- setting, game theory is a powerful framework that is able to tational Logic, 15(4):34:1–42, 2014. formalize the interplay between the human and the robots in [13] F. Mogavero, A. Murano, and L. Sorrentino. On Promptness in a very natural way. Parity Games. In Logic for Programming Artificial Intelligence and Reasoning’13, LNCS 8312, pages 601–618. Springer, 2013. In this paper, we have introduced a game model framework [14] F. Mogavero, A. Murano, and M. Vardi. Reasoning About Strategies. that allows to represent and reasoning about scenarios in In Foundations of Software Technology and Theoretical Computer Sci- which the interaction between the humans and the robots ence’10, LIPIcs 8, pages 133–144. Leibniz-Zentrum fuer Informatik, 2010. results in a hybrid two-team game: the game between the [15] S. Parsons and M. Wooldridge. Game theory and decision theory in two teams is turn-based while all players in each team play multi-agent systems. Autonomous Agents and Multi-Agent Systems, concurrently among them. We have observed that classic turn- 5(3):243–254, 2002. [16] P. Schobbens. Alternating-Time Logic with Imperfect Recall. 85(2):82– based and concurrent games are not powerful enough to 93, 2004. model such a setting. Over the conceived model, we study the [17] W. Thomas. Automata on infinite objects. Handbook of theoretical reachability problem and show that it is solvable in quadratic computer science, 2, 1990. [18] W. Thomas. Monadic Logic and Automata: Recent Developments. 1998. time. Therefore, with no extra cost with respect to classic [19] W. Jamroga and A. Murano. Module checking of strategic ability. pages turn-based and concurrent games. We give an evidence of the 227–235, 2015. 149 5 Proc. of the 16th Workshop “From Object to Agents” (WOA15) June 17-19, Naples, Italy [20] M. Wooldridge. Intelligent Agents. In G. Weiss, editor, Multiagent Systems. A Modern Approach to Distributed Artificial Intelligence. MIT Press: Cambridge, Mass, 1999. [21] M. Wooldridge. Computationally Grounded Theories of Agency. In International Conference on MultiAgent Systems’00, pages 13–22. IEEE Computer Society, 2000. [22] M. Wooldridge. An Introduction to Multi Agent Systems. John Wiley & Sons, 2002. [23] Y. Jiang and J. Hu and D. Lin. Decision making of networked multiagent systems for interaction structures. Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, 41(6):1107–1121, 2011. [24] W. Zielonka. Infinite Games on Finitely Coloured Graphs with Appli- cations to Automata on Infinite Trees. Theoretical Computer Science, 200(1-2):135–183, 1998. 150