Verifying strategies in dialogue games (Extended abstract) Anna Sawicka1 , Magdalena Kacprzak2 , Marcin Dziubiński3 , and Katarzyna Budzynska4,5 1 Faculty of Computer Science, Polish-Japanese Institute of Information Technology, Poland asawicka@pjwstk.edu.pl 2 Faculty of Computer Science, Białystok University of Technology, Poland m.kacprzak@pb.edu.pl 3 Institute of Informatics, University of Warsaw, Poland m.dziubinski@mimuw.edu.pl 4 Institute of Philosophy and Sociology, Polish Academy of Sciences, Poland 5 School of Computing, University of Dundee, UK budzynska.argdiap@gmail.com Abstract. The aim of the paper is to consider a game-theoretic properties of strategies available to players in dialogue games. We treat speech acts formulated in a dialogue as moves in a dialogue game and rules for their appropriateness as rules of this game. We would like to analyze dialogical systems as a Nash-style games, and explore verification possibilities for some properties of such games e.g. dominant strategies and Nash equilibrium. Since automatic verification of such games could be of high complexity, we consider a model checking approach, which is commonly applied method allowing for verification of different systems and their properties. This method is based on solutions, which are trying to deal with the state explosion problem. Keywords: formal natural dialogues, dialogue games, game theory, dominant strategy, Nash equilibrium 1 Introduction Formal dialogue systems study players’ decisions during the communication process. To specify such a system, we typically use such elements as protocol, locution and commitment rules, which allow us to define legal moves in a particular type of dia- logue game. Some of these systems examine dialogue types that satisfy certain con- straints, e.g. prohibition of fallacies (see e.g. [7, 12, 22]), the others concentrate on the requirements that dialogical communication must serve specific goals, such as persua- sion, negotiation, information-seeking, inquiry, and so on (see e.g. [21, 14, 18, 19, 8, 1]). Even though strategies in games get a lot of attention [5, 6, 11, 17, 24, 1, 4, 9], only a few works try to combine a game-theoretic approach with dialogue games [15, 20, 13, 16, 10]. In these approaches is emphasized that we can investigate not only whether the participants behave correctly (according to the rules of the game) but also whether they behave successfully (according to the goals of the player). We continue the study pre- sented in [10] in which strategy (consistent with given protocol) should be chosen so that the result of the dialogue will be most advantageous and rational for a given player. The method of such application using Mackenzie’s DC system was presented in this paper. The application of the game-theoretic approach to dialogue system modelling con- sists of three steps: the selected dialogue system needs to be expressed in game-theoretic notions, it must then be supplemented with the specification of the system’s proper- ties, such as players’ preferences or payoffs, which are necessary for exploiting game- theoretic tools (DC needed to be expanded to include the following elements: (1) a goal for the dialogue, (2) termination and outcome rules; and (3) players’ preferences), and finally system must be studied in solutions for the DC system in view of players goals. We shall assume that the reader is familiar with the basic concepts of the game theory like dominant strategy, Nash equilibrium, sets of strategies, etc. (these notions were presented in a more detailed way in [10]). The concept of solution defines sets of strategy profiles which represent stable outcomes of the game. We focus on two types of solutions which determine the rules for playing a specific game: dominant strategies and Nash equilibrium. Given a selected specification of DC (e.g. player preferences), if there exists a solution in dominant strategies, then it provides the rules of how to play a DC game in order to win regardless of the moves of the opponent. However, if such a solution does not exist, but there is a Nash equilibrium for this game, then the player’s victory is not certain, but the Nash equilibrium gives rules on how to play rationally. The purpose of our current research is to propose a model checking technique to verify communication protocols used by autonomous agents interacting using dialogue games. For modelling strategies we are going to use a variant of the logic of actions and graded beliefs A G n [2]. Whereas, for verification of properties of dialogue games we plan to apply methods and techniques explored in the software tool Perseus [3]. 2 The game-theoretic framework 2.1 The model The model is a game in extensive form, formalizing the DC argumentation system cre- ated by Mackenzie [12]. To define the model we use the following standard notation. Given a set Σ , the set of all finite sequences over Σ is denoted by Σ ∗ and the set of all infinite sequences over Σ is denoted by Σ ω . The empty sequence is denoted by ε and the operation of concatenation is denoted by ·. Given sets A and B, C ⊆ A, D ⊆ B, and a → − → − function f : A → B, symbols f (C), f −1 (D) will denote the image and inverse image of f . The preliminary step of defining the game is to specify the following parameters of the game: the set of statements, the set of locutions and the relation of immediate consequence on the set of statements. Let S0 be a non-empty and countable set called the set of atomic statements. The set of statements, S[S0 ], is a minimal set such that: – S0 ⊆ S. – If s ∈ S, then ¬s ∈ S. (Negation) – If T ⊆ S, then T ∈ S. (Conjunction) V – If s,t ∈ S, then s → t ∈ S. (Conditional) The set of locutions, L[S0 ], is then defined as follows: L[S0 ] = S[S0 ] ∪ {Qs : s ∈ S[S0 ]} ∪ {Ws : s ∈ S[S0 ]} ∪ {Ys : s ∈ S[S0 ]} ∪ {Rs : s ∈ S[S0 ]}, where Q (question), W (withdrawal), Y (challenge), and R (resolution demand) are operators used for locution construction. A relation of immediate consequence, 7→⊆ 2S[S0 ] × 2S[S0 ] is a binary relation on the set of statements such that for all s,t ∈ S, {s, s → t} 7→ {t}. Given a set of atomic statements, S0 , and a relation of immediate consequence, 7→, the (argumentation) game is a tuple Γ[S0 ,7→] = hP, π, H, T, (-i )i∈P , (Ai )i∈P , (αi )i∈P i where – P = {W, B} is the set of players. – H ⊆ L[S0 ]∗ ∪ L[S0 ]ω is the set of histories. A history is a (finite or infinite) sequence of locutions from L[S0 ]. The set of finite histories in H is denoted by H̄. – π : H̄ → P ∪ {∅} is the player function assigning to each finite history the player who moves after it, or ∅, if no player is to move. The set of histories at which player i ∈ P is to move is Hi = → −π −1 (i). → − – T = π −1 (∅) ∪ (H ∩ L[S0 ]ω ) is the set of terminal histories. A terminal history is a history after which no player is to move, hence it consists of the set of finite histories mapped to ∅ by the player function and the set of all infinite histories. – -i ⊆ T × T is the preference relation of player i defined on the set of terminal histories.6 – Ai = L[S0 ] is the set of actions of player i ∈ P. – αi : Hi → 2Ai is the admissible actions function of player i ∈ P, determining the set of actions that i can choose from after history h ∈ Hi . In what follows, we will assume that the set of atomic statements S0 is fixed and omit it, writing S rather than S[S0 ] and L rather than L[S0 ]. We are defining the properties of the player function for DC system: π(h) ∈ {W∅} if |h| is even and π(h) ∈ {B, ∅} if |h| is odd. Additionally, the rules of dialogue place restrictions on when the game can terminate: if π(h · l) = ∅, then l ∈/ {Qs, Ys, Rs}. The remaining specification of game termination depends on the context in which the dialogue system is used, as discussed below. The admissible actions function of the players is determined by the rules of dia- logue. In the case of the DC system, the rules of dialogue are defined using the notion of players’ commitment sets. The commitment set function of player i is a function Ci : L∗ → L, assigning to each finite sequence of locutions h ∈ L∗ the commitment set Ci (h) of i at h. The commitment set function of i ∈ P is defined inductively at follows (where, given i ∈ P, {−i} = P \ {i}). The details of the definition of the commitment set function are presented in [10]. The set of immediate consequence conditionals ICC = { T → s : T 7→ {s}}. A set V of statements, T ⊆ S, is immediately inconsistent if there exists a finite subset T 0 ⊆ T and a statement s such that ¬s ∈ T 0 and T 0 7→ {s}. For more details, see [12]. The details of the definition of the admissible actions function αi of player i ∈ P are presented in [10]. 6 The preference relation is a total preorder, i.e. it is total and transitive. The set of histories, H, is the maximal set of sequences from L∗ ∪ Lω satisfying the following: – ε ∈ H. – For any h1 · h2 ∈ H with h1 ∈ L∗ and h2 ∈ L∗ ∪ Lω , h1 ∈ H. – For any h1 · s · h2 ∈ H with h1 ∈ L∗ , h2 ∈ L∗ ∪ Lω and s ∈ L, s ∈ απ(h1 ) (h1 ). Two elements of the dialogue modelling game remain to define. These are player pref- erences and the termination rules that describe which finite histories are mapped to ∅. Note that every strategy profile S̄ = (SW , SB ) determines a unique terminal history hS̄ such that for each strategy s ∈ SW ∪ SB , finite history h0 ∈ H̄ and history h00 ∈ H with hS̄ = h0 · a · h00 , a = Sπ(h0 ) (h0 ). Player i ∈ P prefers strategy profile S̄ to strategy profile S̄0 , S̄0 -i S̄, if hS̄0 -i hS̄ . Thus to define players’ preferences, we need to specify their preferences on terminal histories and the next section is intended to illustrate how this can be defined. 2.2 Systems with pure persuasion Persuasion dialogues are dialogues aimed at resolving conflicts of opinion between at least two participants [21]. In this paper we restrict our attention to two participants games. A conflict of opinion is with regard to a statement, t ∈ S, called a topic. The proponent holds a positive view on t, the opponent doubts t. According to [21], the conflict is resolved if all players have the same point of view. If the dialogue ends, either one of the players is the winner of the dialogue (in which case the other player is the loser), or the dialogue ends in a tie, in which case neither of the players is a winner or a loser. If the dialogue does not end, then neither of the players is a winner or a loser. We consider a particular type of persuasion dialogue, called pure persuasion, as described in [10]. By combining the DC system with pure persuasion we can complete the definition of the argumentation game Γ as follows. The first mover, player W, is the proponent and the second mover, player B, is the opponent. According to the description above, the game should end if one of the following happens: (i) The commitment sets of both players contain the topic t, or (ii) neither player’s commitment set contains the topic t. Since, according to the rules of DC system the most recent statements of any player are added to the commitment sets of both players, we adjust these rules of termination by allowing the next mover to respond and withdraw these statements from his set of commitments. If he is unable to do so, the game will terminate after his move. Similarly, if at any point in the game neither player’s commitment set contains t, the next mover has the opportunity to add t to at least one of the commitment sets. If he is unable to do so, the game terminates after his move. Formally, this amounts to a definition of finite terminal histories, which are defined as follows. A finite history h ∈ H̄ is terminal, i.e. π(h) = ∅, if one of the following conditions is satisfied: Tprop : |h| is even and t ∈ CW (h) ∩CB (h), Topp : |h| is odd and t ∈ / CW (h) ∪CB (h). Note that in the DC system the set of admissible actions at each non-terminal his- tory is non-empty. Therefore a finite history can be terminal only if one of the above conditions is satisfied. Having defined finite terminal histories we will now define the preferences of the players. Let HWwin denote the set of finite histories for which condition T prop is satisfied win and let HB denote the set of finite histories for which condition Topp is satisfied. Set HWwin contains the terminal histories at which player W, the proponent, is the winner, and set HBwin contains the terminal histories at which player W, the opponent, is the winner. We assume the following preference relation on terminal histories. Given h1 , h2 ∈ T , h1 -W h2 if h2 ∈ HWwin or h ∈ win win 2 / HB and h1 ∈ HB , and win h1 -B h2 if h2 ∈ HB or h2 ∈ win / HW and h1 ∈ HWwin . The above means that each player prefers a terminal history at which he wins to that at which he does not win, and each player prefers a history at which the opponent does not win to one at which the opponent wins. These preferences of the players and termination rules imply that a rational propo- nent (player W) should start with an action that results in the topic t being added to the commitment set of at least one of the players. The similar analysis can be done for the second type of persuasive dialogue, i.e., a conflict resolution dialogue, in which both participants attempt to reach an outcome whereby either both of them have the topic t in their commitment sets or both have the negation of the topic, ¬t, in their commitment sets [10]. 3 Strategies and solutions 3.1 Definitions A strategy of a player i is a function from player i’s histories to the set of actions Si : Hi → L, such that for all h ∈ Hi , Si (h) ∈ αi (h). Thus a strategy is a contingent plan that determines a player’s move at each of his histories. The set of strategies of player i is denoted by Si . A strategy profile S̄ = (Si , S−i ) is a pair of strategies chosen by each of the players, S̄ ∈ Si × S−i . Solution concepts define sets of strategy profiles which represent stable outcomes of the game. Below we define two basic solution concepts and illustrate them in the context of pure persuasion. A strategy Si is dominant for player i if for all Si0 ∈ Si and all S j ∈ S j with j = −i, (Si0 , S j ) -i (Si , S j ). A strategy is strictly dominant if the property above holds with strict inequality. A strat- egy profile S̄ = (Si , S−i ) is a solution in (strictly) dominant strategies iff Si is dominant for player i and S−i is dominant for player −i.7 A strategy profile S̄ = (Si , S−i ) is a Nash equilibrium if for all i ∈ P and for all Si0 ∈ Si , (Si0 , S−i ) -i (Si , S−i ). Note that if the game has a solution in dominant strategies, then it also has a Nash equilibrium (but the reverse is not necessarily true). 7 Note that the solution in strictly dominant strategies, if it exists, is unique. 3.2 Solutions to pure persuasive dialogues Consider a game Γ[S0 ,7→] defined for some set of atomic statements S0 and a relation of immediate consequence 7→. Suppose that players’ preferences and finite terminal histories are defined as for a persuasive dialogue. Let us assume first that the topic, t, is an immediate consequence conditional, t ∈ ICC. Consider a strategy s1 ∈ SW such that s1 (ε) = t. Note that after history h = t the commitment sets of both players contain t and the opponent cannot remove t from the commitment set of either of the players, because t ∈ ICC, . The game ends in two rounds and the outcome most preferred by player W is achieved. Thus, any s1 as de- scribed above is a dominant strategy of the proponent. Moreover, for any s2 ∈ SB , the outcome of the game from strategy profile s̄ = (s1 , s2 ) is the same: the commitment sets of proponent and opponent contain t. Changing a strategy does not guarantee to any player that he can obtain a strictly preferred outcome. Hence, s̄ is a Nash equilibrium of the game and leads to an outcome whereby the proponent wins. Next, let us consider the case that the topic, t, is not an immediate consequence conditional, t ∈/ ICC. Let s2 ∈ SB be a strategy such that for any non-terminal history h with t ∈ CW (h) ∩CB (h) and |h| odd, s2 (h) = Wt. In this case Wt ∈ αB (h), because the last action of h must have been t. On the other hand, let s1 ∈ SW be a strategy such that for any non-terminal history h with t ∈ / CW (h) ∪CB (h), s1 (h) = t. Note that the strategy profile s̄ = (s1 , s2 ) is a Nash equilibrium of the game. This occurs because changing strategy does not give any of the players his most preferred outcome. Note that this equilibrium results in an infinite history. The detailed solutions to conflict resolution dialogue with the DC system was pre- sented in [10]. 3.3 Example In order to recall the illustration of the results from [10], let us consider a version of the Battle of the Sexes game (cf. [16]) in which the couple – Wilma (W) and Brian (B) – have to decide on their plans for the day. Brian thinks they should go to a soccer match, but Wilma thinks they should go to the ballet. Even though Wilma prefers the ballet to the soccer, she would still rather attend a soccer match than stay at home. Similarly, Brian likes the soccer match more than the ballet, but he prefers the ballet to staying home. We assume the set of atomic statements to be such that {p, q, r,t} ⊆ S0 , where p means that they should go to the ballet, q means that they should go to the soccer match, r means that Wilma is too sick for the outdoors, and t means that Brian’s ex-wife will be at the ballet. Moreover, {q → ¬p, p → ¬q, r → ¬q,t → ¬p} ⊆7→. This game has a topic p. Wilma is the proponent of p and Brian is the opponent of p. Let us consider the strategy profile s0 = (s01 , s02 ) where s01 is a strategy of Wilma’s in which she states p and r as soon as it is possible and s02 is a strategy of Brian’s in which he states q as soon as it is possible, never attacking Wilma’s statement by r stating t. More formally, s1 is a strategy such that (a) s1 (ε) = p and (b) for any non- terminal history h, if r ∈ αW (h) then s1 (h) = r, while s2 is a strategy such that for any move number locution of W locution of B CW CB 1 p - p p 2 - Wp p - 3 r - p, r r 4 - q p, q, r q, r 5 r → ¬q - p, q, r, r → ¬q q, r, r → ¬q 6 - R(q ∧ r ∧ (r → ¬q)) p, q, r, r → ¬q q, r, r → ¬q 7 Wq - p, r, r → ¬q q, r, r → ¬q 8 - Wq p, r, r → ¬q r, r → ¬q 9 p - p, r, r → ¬q p, r, r → ¬q 10 - no move p, r, r → ¬q p, r, r → ¬q Table 1. A play for the Battle of the Sexes game non-terminal history h, (a) if q ∈ αB (h) then s2 (h) = q, and (b) if h = h0 · r · h00 then s2 (h) 6= t. A play in line with this strategy profile is given in Table 1. This play begins with the Wilma’s statement that they should go to the ballet. Brian withdraws this statement from his commitment set. Next, Wilma states that she is too sick for the outdoors. Brian replies that they should go to the soccer match. Then Wilma states that they cannot go to the soccer match since she is too sick. So, Brian asks for a resolution of the conflict. Wilma then withdraws her statement that they should go to the soccer match. Brian, not wanting to put his wife at risk, also gives up the idea of going to the soccer match. Then Wilma repeats her statement that they should go to the ballet and finally Brian agrees, i.e. he makes no move removing p from his commitment set. The strategy profile s0 = (s01 , s02 ) described above is a Nash equilibrium of the game (i.e. no player wants to change his/her strategy, assuming that the other player does not) and it achieves Wilma’s most preferred outcome. 4 Verification of strategies In this paper, we continue the study how existing dialogue systems can be formalized in terms of game theory and we take Mackenzie’s DC argumentation system as a repre- sentation of such a game. Furthermore, we analyze the obtained results how solutions in dominant strategies and solutions in a Nash equilibrium can be used for the analysis of dialogue games. We consider a model checking approach as a verification mean for these properties, since automatic verification of these games is connected with excessive complexity. Model checking is commonly applied method allowing for verification of different systems and their properties and there are also some solutions applied within model checking, which are trying to deal with the state explosion problem. The solution in our case is software system using this approach which allows to examine selected di- alogue game systems with respect to game-theory properties, by semantically verifying satisfaction of formulas of the A G n language which describe considered properties in a given model. We will adopt A G n logic proposed by Budzynska and Kacprzak [2]. Cur- rently, we are designing the application, which will expand the possibilities of earlier verification software tool Perseus [3]. We are planning to study functionality of exist- ing model checking techniques, like Bounded Model Checking [23], for verification of agent communication modelled in terms of game semantics. References 1. E. Black and A. Hunter. An inquiry dialogue system. Autonomous Agents and Multi-Agent Systems, 19(2):173–209, 2009. 2. K. Budzyńska and M. Kacprzak. A logic for reasoning about persuasion. Fundamenta Informaticae, 85:51–65, 2008. 3. K. Budzyńska, M. Kacprzak, and P. Rembelski. Perseus. software for analyzing persuasion process. Fundamenta Informaticae, 2009. 4. D. E. Chukwuemeka, F. Guerin, T. J. Norman, and P. Edwards. A framework for learning argumentation strategies. In Proceedings of the Third International Workshop on Argumen- tation in Multi-Agent Systems, pages 151–154, 2006. 5. J. Devereux and C. Reed. Strategic argumentation in rigorous persuasion dialogue. In P. McBurney, I. Rahwan, S. Parsons, and N. Maudet, editors, ArgMAS, volume 6057 of Lec- ture Notes in Computer Science, pages 94–113. Springer, 2009. 6. S. Gabrielli, R. Maimone, P. Forbes, and S. Wells. Exploring change strategies for sustainable urban mobility. CHI 2013 Designing Social Media for Change Workshop, 2013. 7. C. L. Hamblin. Fallacies. Methuen and Co. Ltd, 1970. 8. A. Hussain and F. Toni. Bilateral agent negotiation with information-seeking. In Proc. of the 5th European Workshop on Multi-Agent Systems, 2007. 9. M. Kacprzak and K. Budzynska. Reasoning about dialogical strategies. In M. Graa, C. Toro, R. J. Howlett, and L. C. Jain, editors, KES (Selected Papers), volume 7828 of Lecture Notes in Computer Science, pages 171–184. Springer, 2012. 10. M. Kacprzak, M. Dziubiński, and K. Budzynska. Strategies in dialogues: A game-theoretic approach. In Proc. of COMMA, 2014. 11. K. Larson and I. Rahwan. Welfare properties of argumentation-based semantics. In Pro- ceedings of the 2nd COMSOC, 2008. 12. J. D. Mackenzie. Question-begging in non-cumulative systems. J. of Phil. Logic, 8:117–133, 1979. 13. P.-A. Matt and F. Toni. A game-theoretic measure of argument strength for abstract argu- mentation. In S. Hlldobler, C. Lutz, and H. Wansing, editors, JELIA, volume 5293 of Lecture Notes in Computer Science, pages 285–297. Springer, 2008. 14. H. Prakken. Formal systems for persuasion dialogue. The Knowledge Eng. Review, 21:163– 188, 2006. 15. A. D. Procaccia and J. S. Rosenschein. Extensive-form argumentation games. In M. P. Gleizes, G. A. Kaminka, A. Now, S. Ossowski, K. Tuyls, and K. Verbeeck, editors, EUMAS, pages 312–322, 2005. 16. I. Rahwan and K. Larson. Argumentation and game theory. In Argumentation in AI, pages 321–339. Springer, 2009. 17. I. Rahwan, K. Larson, and F. Tohme. A characterisation of strategy-proofness for grounded argumentation semantics. In Proceedings of the 21st IJCAI, 2009. 18. I. Rahwan, S. Ramchurn, N. Jennings, P. McBurney, S. Parsons, and E. Sonenberg. Argumentation-based negotiation. Knowledge Engineering Review, (18(4)):343–375, 2003. 19. S. D. Ramchurn, N. R. Jennings, and C. Sierra. Persuasive negotiation for autonomous agents: A rhetorical approach. In IJCAI Workshop on Comp. Models of Natural Argument, Acapulco, Mexico, 2003. 20. R. Riveret, H. Prakken, A. Rotolo, and G. Sartor. Heuristics in argumentation: A game theory investigation. In P. Besnard, S. Doutre, and A. Hunter, editors, COMMA, volume 172 of Frontiers in Artificial Intelligence and Applications, pages 324–335. IOS Press, 2008. 21. D. Walton and E. Krabbe. Commitment in Dialogue: Basic Concepts of Interpersonal Rea- soning. SUNY series in Logic and Language. State University of New York Press, 1995. 22. D. N. Walton. Logical Dialogue-games And Fallacies. 23. B. Woźna-Szcześniak. On the sat-based verification of communicative commitments. In Trends in Contemporary Computer Science, pages 171–182. Bialystok University of Tech- nology, 2014. 24. T. Yuan, D. Moore, and A. Grierson. A conversational agent system as a test-bed to study the philosophical model dc. In 3rd Workshop on Computational Models of Natural Argument (CMNA03), 2003.