Ontology-Driven Context Interpretation and Conflict Resolution in Dialogue-Based Home Care Assistance

Georgios Meditskos, Efstratios Kontopoulos, Stefanos Vrochidis, Ioannis Kompatsiaris

Information Technologies Institute, Centre for Research and Technology – Hellas
Thessaloniki, Greece
{gmeditsk, skontopo, stefanos, ikom}@iti.gr

Abstract. In this paper we present a framework for conversational awareness and conflict resolution in spoken dialogue systems for home care assistance. Conversational awareness is supported through OWL ontologies for capturing conversational modalities, while interpretation and incremental context enrichment are facilitated through Description Logics reasoning. Conflict resolution further assists the interaction with end users, facilitating exception handling and context prioritisation by coupling defeasible logics with medical and profile information.

Keywords: Ontologies, defeasible logics, dialogue-based systems, healthcare.

1 Introduction

Spoken dialogue systems aim to assist end users in satisfying their information needs, hiding the complexity of knowledge representation and query languages. Of the numerous domains of interest, conversational assistance in healthcare is a notable case where natural language interfaces provide unique solutions to patients and medical experts. In addition, multimodal dialogue-based systems overcome the limitations of dialogue systems that use speech as the only means of communication, collecting and analysing information from multiple sources and modalities.

The presented framework focuses on enriching multimodal dialogue-based agents with (a) intelligent context aggregation for conversation understanding, and (b) resolution of domain inconsistencies and conflicts. To this end, OWL ontologies are used for modelling multimodal information (e.g.
verbal and non-verbal modalities) and the semantics that underpin the interpretation logic, while defeasible logics [1] provide the non-monotonic semantics needed to deliver advanced conflict resolution strategies.

2 Related Work

In the domain of natural language interfaces and dialogue-based systems, ontologies such as WordNet and BabelNet provide the vocabulary and semantics for content disambiguation [2]. Ontologies and Description Logics (DL) [3] have also been used in NLP for co-reference resolution [4]. In multimodal fusion, ontologies are used for fusing multi-level contextual information [5]. For example, [6] presents a framework for coupling audio-visual cues with multimedia ontologies. Relevant approaches for multimedia analysis tasks are also described in [7]. As far as defeasible reasoning is concerned, the non-monotonic semantics of the logic has mainly been used for building argumentative dialogue-based systems [8] or for resolving conflicting arguments through counterarguments [9]. Through the use of DL reasoning for conversational awareness and defeasible rules for conflict resolution, this work focuses on conversation understanding and high-level conflict resolution.

3 Ontology-Driven Conversational Awareness

Contextual information, such as multimedia information (e.g. speech analysis, named entities and concepts) and video analysis (e.g. gestures, facial expressions), is mapped to ontological entities in a hierarchical manner. The topic hierarchy defines the way conversational observations can lead to the derivation of high-level interpretations. In terms of DL semantics, the Topic (root) class is defined as:

    Topic ≡ ∃contains.Observation    (1)

For example, the recognition of a topic that indicates a pain problem is defined as:

    PainTopic ≡ Topic ⊓ ∃contains.HurtReference    (2)
    HurtSpoken ⊑ HurtReference    (3)

Topics can be further specialized hierarchically by defining additional contains property restrictions.
For example, for the recognition of a specific kind of pain, e.g. a headache detected through language analysis and deictic gestures, (2) can be extended as:

    HeadacheTopic ≡ PainTopic ⊓ ∃contains.HeadReference    (4)
    HeadReference ≡ HeadDeictic ⊔ HeadSpoken    (5)

The hierarchical topic decomposition also facilitates the descriptive modelling of topic-related semantics, i.e. modelling information that does not directly define the conversational topic but is useful for driving the interaction with the user (see Section 5). Descriptive context is modelled in terms of the DescriptiveContext hierarchy, whose root class is defined as:

    DescriptiveContext ≡ ∃requires.Concept    (6)

The descriptive context of a topic is specified through one or more requires property assertions about domain concepts. For example, PainTopic can be further associated with structures denoting the intensity or the part of the body:

    PainTopic ⊑ DescriptiveContext ⊓ (∃requires.Intensity ⊔ ∃requires.BodyPart)    (7)

Similarly, the descriptive context of headache may contain structures relevant to sleep quality or coffee consumption:

    HeadacheTopic ⊑ DescriptiveContext ⊓ (∃requires.SleepQuality ⊔ ∃requires.CoffeeConsumption)    (8)

4 Context-based Reasoning and Conflict Resolution

Context-based reasoning aims at coupling the semantics of conversational awareness with background knowledge, such as medical and profile information, in order to acquire a better understanding of the situation, resolve conflicts and provide the most plausible responses. Each conversational topic t is associated with a defeasible rule base D_t that handles domain contextual semantics. Assuming that T is the set of all conversational topics supported (∀t ∈ T, t ⊑ Topic), we define

    ∀t ∈ T, D_t = {r_i : A(r_i) ↬ C(r_i)}

where r_i is a unique label of the rule, A(r_i) is the antecedent, C(r_i) is the consequent and ↬ indicates the rule type: strict (→), defeasible (⇒) or defeater (↝).
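To make the rule-base definition more concrete, the following is a minimal Python sketch of one possible encoding, populated with the headache rules r1 to r6 that appear in Section 5. It is not SPINdle's API or algorithm: the names `Rule`, `negate`, `conflicts_with` and `conclusions` are hypothetical, rules are applied in a single step against the given facts (no chaining), strict rules receive no special treatment, and C is read as a set of mutually exclusive literals. The assumption that all profile and sensor facts hold is also ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    label: str             # unique label r_i
    antecedent: frozenset  # A(r_i): literals that must all hold
    consequent: str        # C(r_i): a literal; a leading "~" marks negation
    kind: str              # "strict", "defeasible" or "defeater"


def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit


def conflicts_with(lit, conflict_sets):
    """Literals that cannot hold together with lit."""
    out = {negate(lit)}
    for c in conflict_sets:
        if lit in c:
            out |= set(c) - {lit}
    return out


def conclusions(facts, rules, superior, conflict_sets=()):
    """Simplified, single-step defeasible provability (illustrative only).

    A literal is concluded if an applicable non-defeater rule supports it and
    every applicable attacking rule is beaten by a superior applicable
    supporter.
    """
    applicable = [r for r in rules if r.antecedent <= facts]
    proved = set()
    for r in applicable:
        if r.kind == "defeater":
            continue  # defeaters only block conclusions, they never support one
        q = r.consequent
        against = conflicts_with(q, conflict_sets)
        attackers = [s for s in applicable if s.consequent in against]
        supporters = [t for t in applicable
                      if t.consequent == q and t.kind != "defeater"]
        if all(any((t.label, s.label) in superior for t in supporters)
               for s in attackers):
            proved.add(q)
    return proved


H = "HeadacheTopic"
rules = [
    Rule("r1", frozenset({H}), "recommendSleep", "defeasible"),
    Rule("r2", frozenset({H}), "recommendNoCoffee", "defeasible"),
    Rule("r3", frozenset({H}), "recommendMildPainkillers", "defeasible"),
    Rule("r4", frozenset({H, "Profile_CaffeineIntolerance"}),
         "~recommendNoCoffee", "defeater"),
    Rule("r5", frozenset({H, "Profile_Migraines"}),
         "recommendStrongPainkillers", "defeasible"),
    Rule("r6", frozenset({H, "Logs_GoodSleepQuality"}),
         "~recommendSleep", "defeater"),
]
# Assumed facts: the topic was detected and all profile/sensor facts hold.
facts = {H, "Profile_CaffeineIntolerance", "Profile_Migraines",
         "Logs_GoodSleepQuality"}
result = conclusions(
    facts, rules,
    superior={("r5", "r3")},
    conflict_sets=[{"recommendMildPainkillers", "recommendStrongPainkillers"}],
)
print(result)  # {'recommendStrongPainkillers'}
```

Under these assumptions, the two defeaters block `recommendSleep` and `recommendNoCoffee`, and the superiority r5 > r3 lets the strong painkiller recommendation win over the mild one, which agrees with the outcome reported for the use case.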
Intuitively, the detection of t triggers the inference mechanisms of the defeasible rule base D_t.

5 Use Case

We describe a simulated evaluation of our framework, which is part of the KRISTINA agent [10] (Fig. 1) and involves interaction with users at home in order to acquire information about their condition and suggest treatments for frequent problems. In one of the evaluation scenarios, the user informs the agent about feeling pain (“I feel pain”). The Dialogue Manager (DM) collects the incoming verbal observation, which involves a hurt reference captured by language analysis, and builds the current context:

    Topic(t1), HurtSpoken(h1), contains(t1, h1)    (9)

The context is then passed to Conversational Awareness to interpret the topic, which, according to axioms (2) and (3), classifies t1 in the PainTopic class. Next, the available descriptive context is collected. According to (7), PainTopic is associated with the Intensity and BodyPart concepts, which are sent back to DM to decide upon next steps.

In our scenario, it is assumed that DM decides to further enrich the current conversational context by asking the user where it hurts. The user points to his head and says: “It hurts here”. Again, a hurt spoken reference is detected from speech analysis, as well as a deictic gesture to the head. Both observations are added to the topic instance t1 of (9) using contains assertions:

    HurtSpoken(h2), HeadDeictic(hd1), contains(t1, h2), contains(t1, hd1)    (10)

The new contextual information is passed to Conversational Awareness to reason again on the current context. The enriched context now satisfies (4), and HeadacheTopic becomes the current conversational topic, which is sent back to DM along with its descriptive context, SleepQuality and CoffeeConsumption (derived from (8)).
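The two interpretation steps above can be mimicked with a small Python sketch. This is not a DL reasoner: it hard-codes axioms (1) to (5) as lookups and checks, assumes every contained individual is an Observation, and reports the descriptive context per topic as in the walkthrough. The names `TopicInstance`, `classify`, `SUBSUMPTIONS` and `DESCRIPTIVE_CONTEXT` are hypothetical.

```python
from dataclasses import dataclass, field

# Axioms (3) and (5): asserted observation type -> reference type it implies.
SUBSUMPTIONS = {
    "HurtSpoken": "HurtReference",   # axiom (3)
    "HeadDeictic": "HeadReference",  # axiom (5)
    "HeadSpoken": "HeadReference",   # axiom (5)
}

# Descriptive context per topic, as read off axioms (7) and (8).
DESCRIPTIVE_CONTEXT = {
    "PainTopic": ["Intensity", "BodyPart"],
    "HeadacheTopic": ["SleepQuality", "CoffeeConsumption"],
}


@dataclass
class TopicInstance:
    name: str
    contains: dict = field(default_factory=dict)  # observation id -> type

    def add(self, obs_id, obs_type):
        """Record a contains(topic, observation) assertion."""
        self.contains[obs_id] = obs_type


def classify(topic):
    """Return the most specific topic class under axioms (1), (2) and (4)."""
    if not topic.contains:
        return None  # axiom (1): a Topic contains at least one Observation
    refs = {SUBSUMPTIONS.get(t, t) for t in topic.contains.values()}
    if "HurtReference" not in refs:
        return "Topic"
    if "HeadReference" not in refs:
        return "PainTopic"        # axiom (2)
    return "HeadacheTopic"        # axiom (4)


t1 = TopicInstance("t1")
t1.add("h1", "HurtSpoken")        # context (9)
first = classify(t1)
print(first, DESCRIPTIVE_CONTEXT[first])

t1.add("h2", "HurtSpoken")        # context (10)
t1.add("hd1", "HeadDeictic")
second = classify(t1)
print(second, DESCRIPTIVE_CONTEXT[second])
```

The first call yields PainTopic with Intensity and BodyPart, and the second yields HeadacheTopic with SleepQuality and CoffeeConsumption. In the actual framework this classification is delegated to an OWL reasoner rather than to hand-written checks.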
Fig. 1. Conceptual architecture of the simulated evaluation. Arrows visualize interactions.

DM decides not to further enrich the context (e.g. by asking questions about sleep problems or coffee consumption habits) and propagates the current context (HeadacheTopic) to Context-based Reasoning for generating appropriate responses. The generic defeasible rule base for HeadacheTopic involves the following defeasible rules for relevant treatment recommendations:

    r1: HeadacheTopic ⇒ recommendSleep
    r2: HeadacheTopic ⇒ recommendNoCoffee
    r3: HeadacheTopic ⇒ recommendMildPainkillers

According to the user's profile, he suffers from frequent migraines and caffeine intolerance. Therefore, the following personalized rules are also considered:

    r4: HeadacheTopic, Profile_CaffeineIntolerance ↝ ¬recommendNoCoffee
    r5: HeadacheTopic, Profile_Migraines ⇒ recommendStrongPainkillers
    r5 > r3 and C = {recommendMildPainkillers, recommendStrongPainkillers}

In addition, the scenario involves a sleep sensor that monitors night sleep quality and provides an assessment every morning. The following defeater enriches context-based reasoning by fusing sleep quality information that overrides r1:

    r6: HeadacheTopic, Logs_GoodSleepQuality ↝ ¬recommendSleep

The rule base of the example (via SPINdle [11]) finally recommends that the user should take strong painkillers for his headache, since he suffers from migraines, overriding other plausible recommendations based on profile and sleep-related information.

6 Conclusions

In this paper we presented a framework for conversational awareness and conflict resolution in spoken dialogue systems combining ontologies and defeasible reasoning.
OWL is used to model multimodal input and the semantics that underpin the conversational logic, while defeasible rules provide the non-monotonic semantics needed to deliver intuitive knowledge representation and advanced conflict resolution.

We are currently conducting pilots for collecting additional data and evaluating the framework with more use cases. In parallel, we are working towards further enrichment of the fusion and interpretation capabilities of the framework, so as to support additional use cases, e.g. taking into account emotions and facial expressions.

Acknowledgements. This work has been partially supported by the H2020-645012 project “KRISTINA: A Knowledge-Based Information Agent with Social Competence and Human Interaction Capabilities”.

References

1. Maier, F., Nute, D.: Well-founded semantics for defeasible logic. Synthese. 176, 243–274 (2010).
2. Damljanović, D., Agatonović, M., Cunningham, H., Bontcheva, K.: Improving habitability of natural language interfaces for querying ontologies with feedback and clarification dialogues. Web Semant. Sci. Serv. Agents World Wide Web. 19, 1–21 (2013).
3. Baader, F.: The description logic handbook: theory, implementation, and applications. Cambridge University Press (2003).
4. Prokofyev, R., Tonon, A., Luggen, M., Vouilloz, L., Difallah, D.E., Cudré-Mauroux, P.: SANAPHOR: Ontology-Based Coreference Resolution. In: International Semantic Web Conference. pp. 458–473. Springer (2015).
5. Dourlens, S., Ramdane-Cherif, A., Monacelli, E.: Multi levels semantic architecture for multimodal interaction. Appl. Intell. 38, 586–599 (2013).
6. Perperis, T., Giannakopoulos, T., Makris, A., Kosmopoulos, D.I., Tsekeridou, S., Perantonis, S.J., Theodoridis, S.: Multimodal and ontology-based fusion approaches of audio and visual processing for violence detection in movies. Expert Syst. Appl. 38, 14102–14116 (2011).
7. Atrey, P.K., Hossain, M.A., El Saddik, A., Kankanhalli, M.S.: Multimodal Fusion for Multimedia Analysis: A Survey. Multimed. Syst. 16, 345–379 (2010).
8. Modgil, S., Prakken, H.: The ASPIC+ framework for structured argumentation: a tutorial. Argum. Comput. 5, 31–62 (2014).
9. Prakken, H.: On dialogue systems with speech acts, arguments, and counterarguments. In: European Workshop on Logics in Artificial Intelligence. pp. 224–238. Springer (2000).
10. Wanner, L., Blat, J., Dasiopoulou, S., et al.: Towards a Multimedia Knowledge-Based Agent with Social Competence and Human Interaction Capabilities. In: Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction. pp. 21–26. ACM (2016).
11. Lam, H.-P., Governatori, G.: The making of SPINdle. In: International Workshop on Rules and Rule Markup Languages for the Semantic Web. pp. 315–322. Springer (2009).