=Paper=
{{Paper
|id=Vol-2659/persiani
|storemode=property
|title=Mediating Joint Intentions with a Dialogue Management System
|pdfUrl=https://ceur-ws.org/Vol-2659/persiani.pdf
|volume=Vol-2659
|authors=Michele Persiani,Maitreyee Tewari
|dblpUrl=https://dblp.org/rec/conf/ecai/PersianiT20
}}
==Mediating Joint Intentions with a Dialogue Management System==
Mediating Joint Intentions with a Dialogue Management System

Michele Persiani, Umeå University, Umeå, Sweden, michelep@cs.umu.se
Maitreyee Tewari, Umeå University, Umeå, Sweden, maittewa@cs.umu.se

Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT

A necessary skill that enables machines to take part in decision-making processes with their users is the ability to participate in the mediation of joint intentions. This paper presents a formalisation of an architecture to create and mediate joint intentions with an artificial agent. The proposed system is loosely based on the framework of we-intentions and is embodied in a combination of plan recognition techniques that identify the user intention, and a reinforcement learning network that learns how to best interact with the inferred intention.

KEYWORDS

Joint Intentions, Robotics, Goal Recognition, Reinforcement Learning

1 INTRODUCTION

The socio-technological evolution of human society has motivated the integration of robots into social and personal spaces. Hence, it is becoming a pressing requirement for social robotics to understand human intentions and to adapt to social values and needs.

Among other reasons, humans interact to understand and mediate intentions with other human participants [16]. A successful mediation of intentions enables participants to decide on profitable collaborations, to manage expectations, or to decide whether to trust the other participant. Natural language dialogues are among the primitive modes [4] of human-human interaction, and are also consistently used to mediate intentions. Dialogue management strategies have exploited joint intention theory for building team dialogues [15]. However, this work views joint intentions with an accent on joint task planning [13] for a human and a robot participant, rather than on the communicative protocols involved.

The objective of this work is to model the mediation of intentions for Human-Robot Interaction (HRI) in a household scenario, and is loosely based on the framework of we-intentions [17]. Within this scenario, we explore cases where a person could need assistance from a robot, for example in cooking, in finding different objects in the house, or in preparing for a visit to the supermarket, the doctor or a friend. For instance, the person might say "I want to prepare a salad" to a robot, possibly having an intention for the robot to help her in cooking the dinner.

[Figure 1: Creation of a joint intention with a robot. During its turn each participant adds and removes tasks (or primitive actions) from the shared intention π. Participants specify what they or the other will do until a common agreement is met.]

Hence, we explore the following research question: how can joint intentions be created with machines? The motivation behind this research question belongs to the desired specifications of AI systems, including the need for an integrated cognition and collaboration mechanism, and for a natural interaction between humans and AI systems. Ultimately, it is about investigating the boundaries between the eco-system of AI and that of human beings.

To answer the research question, we formalised joint intentions in the context of shared task planning and defined dialogue act functions [6]. This formalism offers a turn-based interaction scheme that allows two participants (a human and a robot) to mediate an intention regarding a shared task.

2 METHODOLOGY

Some previous work [7, 11] proposed team rationality for building collaborative multi-agent systems. For example, in [11] the authors used SharedPlans [5] and Propose Trees to model collaboration as a multi-agent planning problem, where a rational team performs an action only if its benefits outweigh its cost.
In [2] the authors formalised communication protocols using joint intention theory. The authors used joint persistent goals and persistent weak achievement goals to build joint intentions, and speech acts such as request, offer, inform, confirm, refuse, acknowledge, and standing-offer for their mediation.

As described later, we propose certain assumptions to lift some of the complexity that previous research utilizes in the context of joint intention theory. We believe that such complexities, while theoretically sound, make implementations on real systems difficult and brittle; for this reason, we utilize a simplification of previous work's formalizations for our needs. The rest of this section provides our simplified formalisation of mediated joint intentions and briefly reasons about the constraints it poses.

Our proposed approach is based on predicate logic combined with planning, and is influenced by the logic-based semantics proposed in [1, 2, 14]. Agents are represented by x, y, ..., x1, x2, ..., y1, y2, ..., and their actions by a1, a2, ..., an. An intention of a single agent x is a plan π = {a0, a1, ..., an} of actions together with a goal g the agent is committed to [16]; the intention is partially observed through O ⊆ π. Know(x, p) ≡ p ∧ Bel(x, p) represents the knowledge of agents, and MutBel(x, y, p) that x and y share a mutual belief about p. In our formulation an agent's intention is represented by the predicate Intend(x, g, O), and a joint intention by JointIntend(x, y, g, O). An agent has an intention if the following holds:

Intend(x, g, O) ≡ Know(x, ∃π. O ⊆ π ∧ Goal(π) = g ∧ Commit(x, π))    (1)

i.e. not only is it true that the agent has an intention and is committed to it, but the agent also holds a belief about it. The set O is the explicit subset of π to which the agent is known to have already committed; it contains past observations or declarations about future commitments regarding π. This expresses that attributing an intention to agents does not require them to make their full intention π explicit, but only a part of it (see Figure 1), with the full intention instead being inferred by grounding the observed commitments in the task space.

A joint intention is an intention shared by the agents x and y with the same goal g. Therefore, a joint intention is a plan π = {ax0, ay0, ax1, ..., ayn} together with a goal g, where the actions in π can be allocated to either participant x or y. Furthermore, the involved agents have a mutual belief MutBel about each other's commitments. Hence, two agents hold a joint intention if the following holds:

JointIntend(x, y, g, O) ≡ Intend(x, g, O) ∧ Intend(y, g, O) ∧ MutBel(x, y, JointIntend(x, y, g, O))    (2)

By this formulation x and y are allowed to have separate beliefs and inference mechanisms through which they find π, but they are bound to have the same goal and observed commitments. Notice that this is a simplification of how joint intentions have previously been formalized in the literature, to which we refer the reader. Nevertheless, this formalization is sufficient for our purpose of creating a dialogue manager that allows the mediation of joint intentions. In the context of a dialogue between two agents x and y we further make the following assumption:

⊨ ∃O ∃g JointIntend(x, y, g, O)    (3)

which translates as: there is always a joint intention between x and y. This assumption, while quite strong, is reasonable for our context, as the proposed dialogue manager (DM) is specifically tailored to mediate joint intentions. During every dialogue a joint intention is always obtained, and when the user leaves the conversation there is always an intention that was formed and is shared with the DM. Furthermore, it is always the case that the user utilises the DM to instantiate joint intentions. We do not take into consideration the cases in which the joint intention is bootstrapped or terminated, as shown for example in [1].
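To make the formalisation concrete, the following Python sketch encodes an intention as a goal-plan-observation triple and checks the structural part of Eq. 2. The names (Intention, joint_intend) are illustrative assumptions rather than part of our implementation, and mutual belief is taken to hold by Eq. 3 rather than modelled explicitly.

```python
from dataclasses import dataclass

Action = str  # e.g. "(chop lettuce)" in a PDDL-like domain

@dataclass
class Intention:
    agent: str
    goal: str        # goal g the agent is committed to
    plan: set        # full plan pi, possibly inferred rather than declared
    observed: set    # O: commitments observed or declared so far

    def consistent(self) -> bool:
        # Intend(x, g, O) requires O to be a subset of pi (Eq. 1)
        return self.observed <= self.plan

def joint_intend(x: Intention, y: Intention) -> bool:
    """Structural check behind JointIntend(x, y, g, O) (Eq. 2): both agents
    intend the same goal with the same observed commitments. MutBel is
    assumed to hold by the standing assumption of Eq. 3."""
    return (x.consistent() and y.consistent()
            and x.goal == y.goal
            and x.observed == y.observed)

# Example: user and robot share the goal "prepare-salad" and one commitment.
user = Intention("user", "prepare-salad", {"(chop lettuce)", "(mix salad)"}, {"(chop lettuce)"})
robot = Intention("dm", "prepare-salad", {"(chop lettuce)", "(mix salad)"}, {"(chop lettuce)"})
assert joint_intend(user, robot)
```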
Following the given definitions, we propose an interaction mechanism that allows the two participants to collaboratively build O by adding and removing actions from it. Currently, we make the following assumptions: 1) for every trial two participants are present, namely a human user and a Dialogue Manager (DM), which can for example be integrated in a house robot; 2) the DM is modelled to be user-initiated, with the user always proposing the first action that enters the set O.

Having the observed set O in the form of a partial plan, the DM can infer the most likely full intention π by utilizing plan recognition techniques, as described below. This inference is based on the current state of the world, which we assume to be available to the DM in the form of truth predicates. A possible architecture for maintaining an updated world description is not provided by this paper, but it can for example be implemented as in [3].

2.1 Goal Recognition and Plan Generation

At every turn of the dialogue the agent is required to infer the joint plan π in order to participate in its mediation. For this purpose, we utilize plan recognition techniques based on the Planning Domain Definition Language (PDDL) [8]. PDDL belongs to the group of planning techniques known as classical planning, and allows non-hierarchical task domains to be created easily.

For a given task domain we select the set G of possible goals a user can pursue. Examples of possible goals for Figure 1 are to prepare dinner, lunch or breakfast. Plan recognition is achieved by a modified version of the method proposed in [10], with the following differences: 1) we allow the PDDL planner to plan using partially instantiated actions¹, and 2) the observations O are treated as a set rather than a sequence. Given a possibly empty set of observations O, goal recognition is performed as:

ĝ = argmax_{g ∈ G} C(∅, g) / C(O, g)    (4)

where C(O, g) is the cost of a plan achieving g constrained to contain O, and C(∅, g) is the cost of an optimal plan achieving g without being constrained by O. Hence, 0 ≤ C(∅, g)/C(O, g) ≤ 1 gives an indication of how costly it is to deviate from an optimal plan achieving g in order to comply with O. Finally, for the inferred goal ĝ we obtain π as the optimal plan achieving ĝ while being constrained to contain the observations O.

¹ We define a PDDL action as partially instantiated if not all of its arguments are grounded in the task domain. An action is fully instantiated when all arguments are grounded.
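A minimal sketch of this recognition step, assuming a hypothetical plan_cost wrapper around an off-the-shelf classical planner and strictly positive plan costs:

```python
# Sketch of Eq. 4: rank candidate goals by how cheaply each can be achieved
# while honouring the observed commitments O. `plan_cost` is a hypothetical
# wrapper around a PDDL planner returning the cost of an optimal plan for
# `goal` constrained to contain the actions in `must_include`.

def plan_cost(domain, goal, must_include=frozenset()):
    raise NotImplementedError("call an off-the-shelf PDDL planner here")

def recognize_goal(domain, candidate_goals, observations):
    """Return the goal maximising C(∅, g) / C(O, g) over the set G."""
    def score(g):
        return plan_cost(domain, g) / plan_cost(domain, g, observations)
    return max(candidate_goals, key=score)
```

The joint plan π then follows from the same constrained planner call for the inferred goal ĝ.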
2.2 Mediation of Joint Intention

The agent and the user need a medium through which to communicate their joint intention and to negotiate goals g and commitments O. To this end, we formalise a finite-state negotiation dialogue strategy with the following dialogue acts: offer, counter-offer, accept, and reject. The dialogue strategy is implemented within a spoken dialogue system (SDS)².

[Figure 2: A traditional Spoken Dialogue Management System.]

Traditionally, an SDS consists of components that recognise and synthesise speech, while natural language understanding (NLU) transforms the human-generated natural language into knowledge for the machine. A dialogue manager (DM) makes decisions based on the NLU output and on other components such as the dialogue history, a database, etc. Natural language generation (NLG) receives the decision from the DM, transforms it into a human-understandable format, and sends it to the speech synthesizer.

When the user produces the first utterance, it is transformed from speech to text and arrives at the NLU component. The NLU transforms the text into knowledge (semantic roles) and assigns the dialogue act offer². An offer from the user instantiates the plan π through plan recognition, and creates a joint intention as described by JointIntend.

We define five dialogue acts, Offer_a, Offer_g, Counter-offer, Accept and Reject, with which both the user and the DM can mediate the intention's goal g and commitments O. Table 1 contains the preconditions and effects of these dialogue acts with respect to three sets: θ is the set of pending offers, while R and O are respectively the sets of rejected and accepted commitments. Offer_a represents an offer about an action, Offer_g an offer about a goal, and Counter-offer states that an action a1 is not accepted and that an alternative a2 is proposed instead. Accept and Reject are used to accept or reject proposed commitments.

Table 1: Speech acts for the SDS that allow mediating actions with respect to the sets of offered, rejected and accepted commitments.

Dialogue Act             | Precondition              | Effect
Offer_a, x, a            | a ∉ θ ∧ a ∉ R ∧ a ∉ O     | a ∈ θ
Offer_g, x, ĝ            | ∅                         | g = ĝ
Counter-offer, x, a1, a2 | a1 ∈ θ ∧ a2 ∉ R ∧ a2 ∉ O  | a1 ∉ θ ∧ a2 ∈ θ
Accept, x, a             | a ∈ θ                     | a ∉ θ ∧ a ∈ O
Reject, x, a             | a ∈ θ                     | a ∉ θ ∧ a ∈ R

² At this stage we define a finite-state SDS, and only the user can initiate a dialogue, using an offer together with a goal or an action.
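The transition rules of Table 1 map directly onto set operations. The sketch below is a hypothetical encoding of those preconditions and effects over θ, R and O, not a prescribed interface:

```python
# Each method applies one dialogue act from Table 1: it checks the act's
# precondition against the sets theta (pending offers), R (rejected) and
# O (accepted commitments), then applies its effect.

class Mediation:
    def __init__(self):
        self.theta, self.rejected, self.accepted = set(), set(), set()
        self.goal = None

    def offer_action(self, a):
        # Offer_a: requires a ∉ θ ∧ a ∉ R ∧ a ∉ O; effect a ∈ θ
        if a not in self.theta | self.rejected | self.accepted:
            self.theta.add(a)

    def offer_goal(self, g):
        # Offer_g: no precondition; the mediated goal becomes g
        self.goal = g

    def counter_offer(self, a1, a2):
        # Counter-offer: requires a1 ∈ θ ∧ a2 ∉ R ∧ a2 ∉ O;
        # effect: a1 leaves θ and a2 enters it
        if a1 in self.theta and a2 not in self.rejected | self.accepted:
            self.theta.remove(a1)
            self.theta.add(a2)

    def accept(self, a):
        # Accept: requires a ∈ θ; effect: a moves from θ to O
        if a in self.theta:
            self.theta.remove(a)
            self.accepted.add(a)

    def reject(self, a):
        # Reject: requires a ∈ θ; effect: a moves from θ to R
        if a in self.theta:
            self.theta.remove(a)
            self.rejected.add(a)
```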
Since a dialogue policy based on finite-state machines is not realizable, as it would need to consider all possible intentions as well as the state of the task, we propose to learn the DM's dialogue policy with reinforcement learning methods. This approach is not new in the context of dialogue management; with this method the user is simulated by an Agenda [12].

2.3 Learning the Agent Strategy with Reinforcement Learning (RL)

At every turn, a Q-network [18] evaluates the currently inferred π together with the actions in the sets θ, R and O and the current PDDL state, selecting which dialogue act to perform through an ε-greedy policy computed on the dialogue acts' expected returns. In RL, agents learn which policy to adopt by maximising the reward they receive during each episode. The current version of the reward function is:

R = −αT + β · |π ∩ π̄| / |π ∪ π̄| + γ · C(π̄, ḡ) / C(π, g)    (5)

where T is the number of dialogue turns, and π̄ and ḡ form the user's original desired joint intention (held by the Agenda). The first term penalises every turn the interaction takes, pushing interactions to be as short as possible. The second term evaluates how similar the final mediated intention is to the one the user had as the objective of the interaction. The third term evaluates the cost of the resulting mediated intention compared to the cost of the user's original one. The weights α, β and γ determine how the three components of the reward function are balanced. Notice that the system cannot access π̄ and ḡ, which are only used at the end of every interaction for evaluation. Thus, the DM learns to mediate and improve the unobservable user intention π̄, ḡ.
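A sketch of this episode-level reward, assuming unit action costs in place of planner-computed plan costs and treating plans as action sets for the similarity term:

```python
# Sketch of the reward of Eq. 5. `pi` is the mediated plan; `pi_user` is the
# Agenda's hidden target plan, available only for end-of-episode evaluation.
# Unit action costs stand in for the planner's plan costs C(π, g) and C(π̄, ḡ).

def cost(plan):
    return len(plan)  # illustrative unit-cost stand-in

def episode_reward(turns, pi, pi_user, alpha=0.1, beta=1.0, gamma=1.0):
    turn_penalty = -alpha * turns                       # -αT: favour short dialogues
    similarity = len(pi & pi_user) / len(pi | pi_user)  # |π ∩ π̄| / |π ∪ π̄|
    cost_ratio = cost(pi_user) / cost(pi)               # C(π̄, ḡ) / C(π, g)
    return turn_penalty + beta * similarity + gamma * cost_ratio

# Example: a 4-turn dialogue whose mediated plan shares two of four actions
# with the user's hidden target plan.
r = episode_reward(4, {"chop", "mix", "serve"}, {"chop", "mix", "plate"})
print(r)  # -0.4 + 1.0 * (2/4) + 1.0 * (3/3) = 1.1
```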
3 FUTURE WORK

The research is still in its early stages and we are currently implementing the described system. We have developed the goal recognition and reinforcement learning components together with a simple user Agenda. The Agenda is based on PDDL and simulates how the user would modify the joint intention during their turn, while having a randomly generated joint intention as objective. Initial experiments gave positive results, in the sense that the RL agent is able to learn the structure of the problem for simple scenarios, and successfully maximises the obtainable rewards.

Several investigations remain open: what is the Q-network learning? Does our current setting allow any generalisation? The current implementation requires hundreds of episodes to converge; can the process be made faster or simpler? How can we facilitate online adaptation to real users?

The encapsulation of the joint intention model within an SDS is still to be implemented. For early prototypes of the system we plan to implement the dialogue manager as described in Section 2.2. Later versions could see the implementation of a more complete SDS, for example through a POMDP model [9]. This could allow dialogues that are not strictly related to the mediation of the joint intention, but rather more flexible and intuitive for the user. Finally, an investigation of the soundness of this approach in real scenarios, for example in user studies, is still to be performed.

ACKNOWLEDGMENTS

This work has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 721619 for the SOCRATES project.

REFERENCES

[1] Philip R. Cohen. 2019. Foundations of Collaborative Task-Oriented Dialogue: What's in a Slot?. In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. 198–209.
[2] Philip R. Cohen and Hector J. Levesque. 1990. Intention is choice with commitment. Artificial Intelligence 42, 2-3 (1990), 213–261.
[3] Sandra Devin and Rachid Alami. 2016. An implemented theory of mind to improve human-robot shared plans execution. In 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 319–326.
[4] New World Encyclopedia. 2017. Dialogue. //www.newworldencyclopedia.org/p/index.php?title=Dialogue&oldid=1007366 [Online; accessed 24-January-2020].
[5] Barbara Grosz and Sarit Kraus. 1996. Collaborative plans for complex group action. Artificial Intelligence (1996).
[6] Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson. 2017. Dialogue Act Annotation with the ISO 24617-2 Standard. Springer International Publishing, 109–135.
[7] Ece Kamar, Ya'akov Gal, and Barbara J. Grosz. 2009. Incorporating helpful behavior into collaborative planning. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Springer Verlag.
[8] Drew McDermott. 2003. The Formal Semantics of Processes in PDDL. In Proc. ICAPS Workshop on PDDL. Trento, Italy.
[9] Michael Frederick McTear, Zoraida Callejas, and David Griol. 2016. The Conversational Interface. Vol. 6. Springer.
[10] Miguel Ramírez and Hector Geffner. 2010. Probabilistic plan recognition using off-the-shelf classical planners. In Twenty-Fourth AAAI Conference on Artificial Intelligence.
[11] Timothy W. Rauenbusch and Barbara J. Grosz. 2003. A Decision Making Procedure for Collaborative Planning. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '03). Association for Computing Machinery, 1106–1107.
[12] Jost Schatzmann, Blaise Thomson, and Steve Young. 2007. Statistical user simulation with a hidden agenda. In Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue. 273–282.
[13] David P. Schweikard and Hans Bernhard Schmid. 2013. Collective Intentionality. In The Stanford Encyclopedia of Philosophy (Summer 2013 ed.), Edward N. Zalta (Ed.). Metaphysics Research Lab, Stanford University.
[14] Ira A. Smith, Philip R. Cohen, Jeffrey M. Bradshaw, Mark Greaves, and Heather Holmback. 1998. Designing conversation policies using joint intention theory. In Proceedings International Conference on Multi Agent Systems (Cat. No. 98EX160). IEEE, 269–276.
[15] Rajah Subramanian, Sanjeev Kumar, and Philip Cohen. 2006. Integrating Joint Intention Theory, Belief Reasoning, and Communicative Action for Generating Team-Oriented Dialogue. In Proceedings of the National Conference on Artificial Intelligence, Vol. 21. Boston, USA.
[16] Michael Tomasello, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28, 5 (2005), 675–691.
[17] Raimo Tuomela. 2005. We-intentions revisited. Philosophical Studies 125, 3 (2005), 327–369.
[18] Hado van Hasselt, Arthur Guez, and David Silver. 2016. Deep reinforcement learning with double Q-learning. In Thirtieth AAAI Conference on Artificial Intelligence.