=Paper= {{Paper |id=Vol-2659/persiani |storemode=property |title=Mediating joint intention with a dialogue management system |pdfUrl=https://ceur-ws.org/Vol-2659/persiani.pdf |volume=Vol-2659 |authors=Michele Persiani,Maitreyee Tewari |dblpUrl=https://dblp.org/rec/conf/ecai/PersianiT20 }} ==Mediating joint intention with a dialogue management system== https://ceur-ws.org/Vol-2659/persiani.pdf
Mediating Joint Intentions with a Dialogue Management System
                         Michele Persiani                                                      Maitreyee Tewari
                         Umeå University                                                        Umeå University
                          Umeå, Sweden                                                           Umeå, Sweden
                        michelep@cs.umu.se                                                     maittewa@cs.umu.se

ABSTRACT
A necessary skill which enables machines to take part in decision making processes with their users is the ability to participate in the mediation of joint intentions. This paper presents a formalisation of an architecture to create and mediate joint intentions with an artificial agent. The proposed system is loosely based on the framework of we-intentions and is embodied in a combination of plan recognition techniques to identify the user intention, and a reinforcement learning network which learns how to best interact with the inferred intention.

KEYWORDS
Joint Intentions, Robotics, Goal Recognition, Reinforcement Learning

Figure 1: Creation of a joint intention with a robot. During its turn each participant adds and removes tasks (or primitive actions) from the shared intention π. Participants specify what they or the other will do until a common agreement is met.

1   INTRODUCTION
The socio-technological evolution of human society has motivated the integration of robots into social and personal spaces. Hence, it is becoming a pressing requirement for social robotics to understand human intentions and to adapt to social values and needs.

Among other reasons, humans interact to understand and mediate intentions with other human participants [16]. A successful mediation of intentions enables participants to establish profitable collaborations, to manage expectations, or to decide whether to trust the other participant. Natural language dialogues are among the primitive modes [4] of human-human interaction, and are also consistently used to mediate intentions. Dialogue management strategies have exploited joint intention theory for building team dialogues [15]. However, this work views joint intentions with an accent on joint task planning [13] for a human and a robot participant, rather than on the communicative protocols involved.

The objective of this work is to model the mediation of intentions for Human-Robot Interaction (HRI) in a household scenario, and it is loosely based on the framework of we-intentions [17]. Within this scenario, we explore cases where a person could need assistance from a robot, such as cooking, finding different objects in the house, or preparing for a visit to the supermarket, a doctor or a friend. For instance, the person might say "I want to prepare a salad" to a robot, possibly intending for the robot to help her cook dinner.

Hence, we explore the following research question: how can joint intentions be created with machines? The motivation behind this research question stems from desired specifications of AI systems, including the need for an integrated cognition and collaboration mechanism, and for natural interaction between humans and AI systems. Ultimately, it is about investigating the boundaries between the eco-system of AI and that of human beings.

To answer this research question, we formalise joint intentions in the context of shared task planning and define dialogue act functions [6]. This formalism offers a turn-based interaction scheme that allows two participants (a human and a robot) to mediate an intention regarding a shared task.

2   METHODOLOGY
Some of the previous work [7, 11] proposed team rationality for building collaborative multi-agent systems. For example, in [11] the authors used SharedPlans [5] and Propose Trees to model collaboration as a multi-agent planning problem, where a rational team will perform an action only if its expected benefit outweighs its cost. In [2] the authors formalised communication protocols using joint intention theory: they used joint persistent goals and persistent weak achievement goals to build joint intentions, and speech acts such as request, offer, inform, confirm, refuse, acknowledge, and standing-offer for their mediation.

Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


As later described, we propose certain assumptions to lift some of the complexity that previous research utilizes in the context of joint intention theory. We believe that such complexities, while theoretically sound, make implementations on real systems difficult and brittle; for this reason, we utilize a simplification of previous work's formalizations for our needs. The rest of the section provides our simplified formalisation of mediated joint intentions and briefly reasons about the constraints posed.

Our proposed approach is based on predicate logic combined with planning, and is influenced by the logic-based semantics proposed in [1, 2, 14]. Agents are represented by x, y, . . . , x₁, x₂, . . . , y₁, y₂, . . . and their actions by a₁, a₂, . . . , aₙ. An intention of a single agent x is a plan π = {a₀, a₁, . . . , aₙ} of actions together with a goal g the agent is committed to [16], and the intention is partially observed through O ⊆ π.

Know(x, p) ≡ p ∧ Bel(x, p) represents the knowledge of agents, and MutBel(x, y, p) that x and y share a mutual belief about p. In our formulation an agent's intention is represented by the predicate Intend(x, g, O), and a joint intention by JointIntend(x, y, g, O). An agent has an intention if the following holds:

    Intend(x, g, O) ≡ Know(x, ∃π O ⊆ π ∧
                              Goal(π) = g ∧                    (1)
                              Commit(x, π))

i.e. not only is it true that the agent has an intention and is committed to it, but the agent also holds a belief about it. The set O is an explicit subset of π to which the agent is known to have already committed, and it contains past observations or declarations about future commitments regarding π.

This formulation expresses that providing agents with an intention does not require them to make their full intention π explicit, but only a part of it (see Figure 1), with the full intention being instead inferred by grounding the observed commitments in the task space.

A joint intention is an intention shared by the agents x and y with the same goal g. Therefore, a joint intention is a plan π = {a_{x0}, a_{y0}, a_{x1}, . . . , a_{yn}} together with a goal g, where the actions in π can be allocated to either participant x or y. Furthermore, the involved agents hold a mutual belief MutBel about each other's commitments. Hence, two agents hold a joint intention if the following holds:

    JointIntend(x, y, g, O) ≡ Intend(x, g, O) ∧ Intend(y, g, O) ∧
                  MutBel(x, y, JointIntend(x, y, g, O))        (2)

By this formulation x and y are allowed to have separate beliefs and inference mechanisms through which they find π, but are bound to have the same goal and observed commitments. Notice that this is a simplification of how joint intentions have previously been formalized in the literature, to which we refer the reader. Nevertheless, this formalization is sufficient for our purpose of creating a dialogue manager that allows the mediation of joint intentions. In the context of a dialogue between two agents x and y we further make the following assumption:

    |= ∃O ∃g JointIntend(x, y, g, O)                           (3)

which translates as: there is always a joint intention between x and y. This assumption, while quite strong, is reasonable for our context, as the proposed DM is specifically tailored to mediate joint intentions. During every dialogue a joint intention is always obtained, and when the user leaves the conversation there is always an intention that was formed and is shared with the DM. Furthermore, it is always the case that the user utilises the DM to instantiate joint intentions. We do not take into consideration the cases in which the joint intention is bootstrapped or terminated, as for example shown in [1].

Following the given definitions, we propose an interaction mechanism that allows the two participants to collaboratively build O by adding or removing actions from it. Currently, we make the following assumptions: 1) for every trial two participants are present, a human user and a Dialogue Manager (DM), which can be integrated for example in a house robot; 2) the DM is modelled to be user-initiated, i.e. the user always proposes the first action that enters the set O.

Having an observed set O in the form of a partial plan, the DM can infer the most likely full intention π by utilizing plan recognition techniques, as later described. This inference is based on the current state of the world, which we assume to be available to the DM in the form of truth predicates. A possible architecture for maintaining an updated world description is not provided by this paper, but can for example be implemented as in [3].

2.1     Goal Recognition and Plan Generation
At every turn of the dialogue the agent is required to infer the joint plan π to be able to participate in its mediation. For this purpose, we utilize plan recognition techniques based on the Planning Domain Definition Language (PDDL) [8]. PDDL belongs to the group of planning techniques known as classical planning, and allows non-hierarchical task domains to be created easily.

For a given task domain we select the set G of possible goals a user can pursue. Examples of possible goals for Figure 1 are to prepare dinner, lunch or breakfast. Plan recognition is achieved by a modified version of the method proposed in [10], with the following differences: 1) we allow the PDDL planner to plan using partially instantiated actions¹, and 2) the observations O are treated as a set rather than a sequence. Given a possibly empty set of observations O, goal recognition is performed as:

    ĝ = argmax_{g ∈ G} C(∅, g) / C(O, g)                       (4)

where C(O, g) is the cost of a plan achieving g and constrained to contain O, and C(∅, g) is the cost of an optimal plan achieving g without being constrained by O. Hence, 0 ≤ C(∅, g)/C(O, g) ≤ 1 indicates how costly it is to deviate from an optimal plan achieving g for compliance with O. Finally, for an inferred goal ĝ we obtain π as the optimal plan achieving ĝ while being constrained to contain the observations O.

¹ We define a PDDL action as partially instantiated if not all of its arguments are grounded in the task domain. An action is fully instantiated when all arguments are grounded.
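The ranking in Eq. 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: `plan_cost`, the cost table, and the goal and action names are hypothetical stand-ins for calls to an optimal PDDL planner constrained to include the observed actions.

```python
COSTS = {
    # (frozenset of observed actions, goal) -> optimal plan cost.
    # Stubbed values standing in for real planner output.
    (frozenset(), "prepare_dinner"): 4.0,
    (frozenset({"wash_vegetables"}), "prepare_dinner"): 4.0,
    (frozenset(), "prepare_breakfast"): 3.0,
    (frozenset({"wash_vegetables"}), "prepare_breakfast"): 6.0,
}

def plan_cost(observations, goal):
    """C(O, g): cost of an optimal plan for `goal` constrained to
    contain `observations` (stubbed; a real system calls a planner)."""
    return COSTS[(frozenset(observations), goal)]

def recognize_goal(observations, goals):
    """Return the goal maximising C(∅, g) / C(O, g), as in Eq. 4."""
    def score(g):
        return plan_cost(frozenset(), g) / plan_cost(observations, g)
    return max(goals, key=score)

goals = ["prepare_dinner", "prepare_breakfast"]
print(recognize_goal({"wash_vegetables"}, goals))  # -> prepare_dinner
```

With these stub costs, observing `wash_vegetables` does not change the optimal dinner plan (ratio 1.0) but forces a detour for breakfast (ratio 0.5), so dinner is the inferred goal.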


Figure 2: A traditional Spoken Dialogue Management System.

2.2     Mediation of Joint Intention
The agent and the user have to use a medium to communicate their joint intention, and to negotiate goals g and commitments O. In order to do so, we formalise a finite-state negotiation dialogue strategy with the following dialogue acts: offer, counter-offer, accept, and reject. The dialogue strategies will be implemented with a spoken dialogue management system (SDS)².

Traditionally, an SDS (Figure 2) consists of components for speech recognition and synthesis; a natural language understanding (NLU) component that transforms human-generated natural language into knowledge for the machine; a dialogue manager (DM) that makes decisions based on the NLU output and on other components such as the dialogue history, databases, etc.; and a natural language generation (NLG) component that receives the decision from the DM, transforms it into a human-understandable format, and sends it to the speech synthesizer.

When the user produces their first utterance, it is transformed from speech to text and arrives at the NLU component. The NLU transforms the text into knowledge (semantic roles) and assigns the dialogue act offer². An offer from the user instantiates the plan π by triggering plan recognition, and creates a joint intention as described by JointIntend.

² At this stage we define a finite-state SDS in which only the user can initiate a dialogue, using an offer together with a goal or an action.

We define five dialogue acts, Offer_a, Offer_g, Counter-offer, Accept and Reject, with which both user and DM can mediate the intention's goal g and commitments O. Table 1 contains the preconditions and effects of these dialogue acts with respect to three sets: θ is the set of current offers, while R and O are respectively the sets of rejected and accepted commitments. An Offer_a represents an offer about an action, an Offer_g an offer about a goal, and a Counter-offer means that an action a₁ is not accepted and an alternative a₂ is proposed instead. Accept and Reject are used to accept or reject proposed commitments.

    Dialogue Act               Precondition                 Effect
    Offer_a, x, a              a ∉ θ ∧ a ∉ R ∧ a ∉ O        a ∈ θ
    Offer_g, x, ĝ              ∅                            g = ĝ
    Counter-offer, x, a₁, a₂   a₁ ∈ θ ∧ a₂ ∉ R ∧ a₂ ∉ O     a₁ ∉ θ ∧ a₂ ∈ θ
    Accept, x, a               a ∈ θ                        a ∉ θ ∧ a ∈ O
    Reject, x, a               a ∈ θ                        a ∉ θ ∧ a ∈ R

Table 1: Speech acts for the SDS that allow actions to be mediated with respect to the sets of offered, rejected and accepted commitments.

Since a dialogue policy based on finite-state machines is not realizable, as it would need to consider all possible intentions as well as the state of the task, we propose to learn the DM dialogue policy with reinforcement learning methods. This approach is not new in the context of dialogue management; in this method the user is simulated by an Agenda [12].

2.3     Learning the Agent Strategy with Reinforcement Learning (RL)
At every turn, a Q-Network [18] evaluates the currently inferred π together with the actions in the sets θ, R and O and the current PDDL state, selecting which dialogue act to perform through an ϵ-greedy policy computed on the dialogue acts' expected returns. In RL, agents learn which policy to adopt by maximising the reward they receive during each episode. The current version of the reward function is:

    R = −αT + β |π ∩ π̄| / |π ∪ π̄| + γ C(π̄, ḡ) / C(π, g)      (5)

where π̄ and ḡ form the user's original desired joint intention (held by the Agenda). The first term penalises every turn that the interaction takes, hence making interactions as short as possible. The second term evaluates how similar the final mediated intention is to the one the user had as objective for the interaction. The third term instead evaluates the cost of the resulting mediated intention compared to the user's original one. α, β and γ determine how the three components of the reward function are weighted. Notice that the system cannot access π̄ and ḡ, which are instead only used at the end of every interaction for evaluation. Thus, the DM learns to mediate and improve towards the unobservable user intention π̄, ḡ.

3     FUTURE WORK
This research is still in its early stages and we are currently implementing the described system. We have developed the goal recognition and reinforcement learning components together with a simple user Agenda. The Agenda is based on PDDL and simulates how the user would modify the joint intention during their turn, while having as objective a randomly generated joint intention.

Initial experiments gave positive results, in the sense that the RL agent is able to learn the structure of the problem for simple scenarios and successfully maximises the possible rewards. Several investigations remain open: what is the Q-Network learning? Does our current setting allow any generalisation? The current implementation requires hundreds of episodes to converge. Can
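The transition semantics of Table 1 can be sketched as a small state object over the three sets θ, R and O. This is an illustrative sketch under the paper's preconditions and effects, not the authors' implementation; the class and action names are hypothetical, and Offer_g (which replaces the goal ĝ) is omitted for brevity.

```python
class Mediation:
    """State of the mediated commitments: offered (theta), rejected
    and accepted sets, updated by the dialogue acts of Table 1."""

    def __init__(self):
        self.theta, self.rejected, self.accepted = set(), set(), set()

    def offer(self, a):
        # Precondition: a ∉ θ ∧ a ∉ R ∧ a ∉ O. Effect: a ∈ θ.
        if a not in self.theta | self.rejected | self.accepted:
            self.theta.add(a)

    def counter_offer(self, a1, a2):
        # Precondition: a1 ∈ θ ∧ a2 ∉ R ∧ a2 ∉ O.
        # Effect: a1 ∉ θ ∧ a2 ∈ θ (a2 replaces a1 on the table).
        if a1 in self.theta and a2 not in self.rejected | self.accepted:
            self.theta.remove(a1)
            self.theta.add(a2)

    def accept(self, a):
        # Precondition: a ∈ θ. Effect: a ∉ θ ∧ a ∈ O.
        if a in self.theta:
            self.theta.remove(a)
            self.accepted.add(a)

    def reject(self, a):
        # Precondition: a ∈ θ. Effect: a ∉ θ ∧ a ∈ R.
        if a in self.theta:
            self.theta.remove(a)
            self.rejected.add(a)

# Replaying the exchange of Figure 1 with illustrative action names:
m = Mediation()
m.offer("prepare_salad")                      # user: "I want to prepare a salad"
m.accept("prepare_salad")
m.offer("fry_eggs")                           # DM: "Ok, I can fry some eggs"
m.counter_offer("fry_eggs", "set_table")      # user: "No, set the table instead"
m.accept("set_table")                         # DM: "OK"
print(sorted(m.accepted))  # -> ['prepare_salad', 'set_table']
```

Guarding each act with its Table 1 precondition keeps the three sets disjoint, so a commitment is always in exactly one of θ, R or O.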


the process be made faster or simpler? How can we facilitate online adaptation with real users?

Encapsulation of the joint intention model within the SDS is still to be implemented. For early prototypes of the system we plan to implement the dialogue manager as described in Section 2.2. Later versions could see the implementation of a more complete SDS, through for example a POMDP model [9]. This would allow dialogues that are not strictly related to the mediation of the joint intention, but are more flexible and intuitive for the user. Finally, an investigation of the soundness of this approach in real scenarios, for example in user studies, is still to be performed.

ACKNOWLEDGMENTS
This work has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 721619 for the SOCRATES project.

REFERENCES
 [1] Philip R. Cohen. 2019. Foundations of Collaborative Task-Oriented Dialogue: What's in a Slot?. In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. 198–209.
 [2] Philip R. Cohen and Hector J. Levesque. 1990. Intention is choice with commitment. Artificial Intelligence 42, 2-3 (1990), 213–261.
 [3] Sandra Devin and Rachid Alami. 2016. An implemented theory of mind to improve human-robot shared plans execution. In 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 319–326.
 [4] New World Encyclopedia. 2017. Dialogue. www.newworldencyclopedia.org/p/index.php?title=Dialogue&oldid=1007366 [Online; accessed 24-January-2020].
 [5] Barbara Grosz and Sarit Kraus. 1996. Collaborative plans for complex group action. Artificial Intelligence (1996).
 [6] Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson. 2017. Dialogue Act Annotation with the ISO 24617-2 Standard. Springer International Publishing, 109–135.
 [7] Ece Kamar, Ya'akov Gal, and Barbara J. Grosz. 2009. Incorporating helpful behavior into collaborative planning. In Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Springer Verlag.
 [8] Drew McDermott. 2003. The Formal Semantics of Processes in PDDL. In Proc. ICAPS Workshop on PDDL. Trento, Italy.
 [9] Michael Frederick McTear, Zoraida Callejas, and David Griol. 2016. The Conversational Interface. Vol. 6. Springer.
[10] Miguel Ramírez and Hector Geffner. 2010. Probabilistic plan recognition using off-the-shelf classical planners. In Twenty-Fourth AAAI Conference on Artificial Intelligence.
[11] Timothy W. Rauenbusch and Barbara J. Grosz. 2003. A Decision Making Procedure for Collaborative Planning. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '03). Association for Computing Machinery, 1106–1107.
[12] Jost Schatzmann, Blaise Thomson, and Steve Young. 2007. Statistical user simulation with a hidden agenda. In Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue. 273–282.
[13] David P. Schweikard and Hans Bernhard Schmid. 2013. Collective Intentionality. In The Stanford Encyclopedia of Philosophy (Summer 2013 ed.), Edward N. Zalta (Ed.). Metaphysics Research Lab, Stanford University.
[14] Ira A. Smith, Philip R. Cohen, Jeffrey M. Bradshaw, Mark Greaves, and Heather Holmback. 1998. Designing conversation policies using joint intention theory. In Proceedings International Conference on Multi Agent Systems. IEEE, 269–276.
[15] Rajah Subramanian, Sanjeev Kumar, and Philip Cohen. 2006. Integrating Joint Intention Theory, Belief Reasoning, and Communicative Action for Generating Team-Oriented Dialogue. In Proceedings of the National Conference on Artificial Intelligence, Vol. 21.
[16] Michael Tomasello, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. 2005. Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28, 5 (2005), 675–691.
[17] Raimo Tuomela. 2005. We-intentions revisited. Philosophical Studies 125, 3 (2005), 327–369.
[18] Hado Van Hasselt, Arthur Guez, and David Silver. 2016. Deep reinforcement learning with double Q-learning. In Thirtieth AAAI Conference on Artificial Intelligence.