<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ontology-Driven Context Interpretation and Conflict Resolution in Dialogue-Based Home Care Assistance</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgios Meditskos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Efstratios Kontopoulos</string-name>
          <email>skontopo@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefanos Vrochidis</string-name>
          <email>stefanos@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioannis Kompatsiaris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Technologies Institute, Centre for Research and Technology - Hellas Thessaloniki</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we present a framework for conversational awareness and conflict resolution in spoken dialogue systems for home care assistance. Conversational awareness is supported through OWL ontologies that capture conversational modalities, while interpretation and incremental context enrichment are facilitated through Description Logics reasoning. Conflict resolution further assists the interaction with end users, facilitating exception handling and context prioritisation by coupling defeasible logics with medical and profile information.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontologies</kwd>
        <kwd>defeasible logics</kwd>
        <kwd>dialogue-based systems</kwd>
        <kwd>healthcare</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Spoken dialogue systems aim to assist end users in satisfying their information needs,
hiding the complexity of knowledge representation and query languages. Of the
numerous domains of interest, conversational assistance in healthcare is a notable case where
natural language interfaces provide unique solutions to patients and medical experts. In
addition, multimodal dialogue-based systems overcome the limitations of dialogue
systems that use speech as the only communication means, collecting and analysing
information from multiple sources and modalities.</p>
      <p>The presented framework focuses on enriching multimodal dialogue-based agents
with (a) intelligent context aggregation for conversation understanding, and (b) resolution
of domain inconsistencies and conflicts. To this end, OWL ontologies are
used for modelling multimodal information (e.g. verbal and non-verbal modalities) and
the semantics that underpin the interpretation logic, while defeasible logics [1] provide
the non-monotonic semantics needed to deliver advanced conflict resolution strategies.</p>
      <p>In the domain of natural language interfaces and dialogue-based systems, ontologies
such as WordNet and BabelNet provide the vocabulary and semantics for content
disambiguation [2]. Ontologies and Description Logics (DL) [3] have also been used in
NLP for co-reference resolution [4]. In multimodal fusion, ontologies are used for
fusing multi-level contextual information [5]. For example, [6] presents a framework for
coupling audio-visual cues with multimedia ontologies. Relevant approaches are also
described in [7] for multimedia analysis tasks. As far as defeasible reasoning is
concerned, the non-monotonic semantics of the logic has been mainly used for building
argumentative dialogue-based systems [8] or resolving conflictual arguments through
counterarguments [9]. Through the use of DL reasoning for conversational awareness
and defeasible rules for conflict resolution, this work focuses on conversation
understanding and high-level conflict resolution.</p>
    </sec>
    <sec id="sec-2">
      <title>Ontology-Driven Conversational Awareness</title>
      <p>Contextual information, such as multimedia information (e.g. speech analysis, named
entities and concepts) and video analysis (e.g. gestures, facial expressions), is mapped
to ontological entities in a hierarchical manner. The topic hierarchy defines the way
conversational observations can lead to the derivation of high-level interpretations. In
terms of DL semantics, the Topic (root) class is defined as:</p>
      <disp-formula id="eq1"><label>(1)</label>Topic ≡ ∃contains.Observation</disp-formula>
      <p>For example, the recognition of a topic that indicates a pain problem is defined as:</p>
      <disp-formula id="eq2"><label>(2)</label>PainTopic ≡ Topic ⊓ ∃contains.HurtReference</disp-formula>
      <disp-formula id="eq3"><label>(3)</label>HurtSpoken ⊑ HurtReference</disp-formula>
      <p>Topics can be further specialized hierarchically by defining additional contains
property restrictions. For example, to recognize certain symptoms of pain, e.g. a
headache, based on language analysis and deictic gestures, (2) can be extended as:</p>
      <disp-formula id="eq4"><label>(4)</label>HeadacheTopic ≡ PainTopic ⊓ ∃contains.HeadReference</disp-formula>
      <disp-formula id="eq5"><label>(5)</label>HeadReference ≡ HeadDeictic ⊔ HeadSpoken</disp-formula>
      <p>The hierarchical topic decomposition also facilitates the descriptive modelling of
topic-related semantics, i.e. the modelling of information that does not directly
define the conversational topic but helps to drive the interaction
with the user (see the Use Case section). Descriptive context is modelled in terms of the
DescriptiveContext hierarchy, whose root class is defined as:</p>
      <disp-formula id="eq6"><label>(6)</label>DescriptiveContext ≡ ∃requires.Concept</disp-formula>
      <p>The descriptive context of a topic is specified through one or more requires
property assertions about domain concepts. For example, PainTopic can be further
associated with structures denoting the intensity or the part of the body:</p>
      <disp-formula id="eq7"><label>(7)</label>PainTopic ⊑ DescriptiveContext ⊓ (∃requires.Intensity ⊔ ∃requires.BodyPart)</disp-formula>
      <p>Similarly, the descriptive context of headache may contain structures relevant to
sleep quality or coffee consumption:</p>
      <disp-formula id="eq8"><label>(8)</label>HeadacheTopic ≡ DescriptiveContext ⊓ (∃requires.SleepQuality ⊔ ∃requires.CoffeeConsumption)</disp-formula>
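      <p>As a rough illustration of how the Topic, PainTopic and HeadacheTopic axioms above drive
interpretation, the following Python sketch hand-codes the class hierarchy as lookup tables. It is
not a DL reasoner and not the framework's implementation: the names REFERENCE_OF,
DESCRIPTIVE_CONTEXT, classify and descriptive_context are assumptions made for this example,
every input is assumed to be an Observation, and the disjunction in the descriptive context axioms
is flattened into a list of candidate concepts.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Hand-coded illustration of the topic axioms; a real system would use a DL reasoner.

# Observation class -> reference class it is subsumed by.
REFERENCE_OF = {
    "HurtSpoken": "HurtReference",   # HurtSpoken is a HurtReference
    "HeadDeictic": "HeadReference",  # HeadReference is HeadDeictic or HeadSpoken
    "HeadSpoken": "HeadReference",
}

# Topic class -> concepts named in its descriptive context axiom.
DESCRIPTIVE_CONTEXT = {
    "PainTopic": ["Intensity", "BodyPart"],
    "HeadacheTopic": ["SleepQuality", "CoffeeConsumption"],
}


def classify(observation_types):
    """Return the most specific topic class for the contained observations."""
    if not observation_types:
        return None  # a Topic must contain at least one observation
    references = {REFERENCE_OF.get(t) for t in observation_types}
    topic = "Topic"
    if "HurtReference" in references:
        topic = "PainTopic"
        if "HeadReference" in references:
            topic = "HeadacheTopic"
    return topic


def descriptive_context(topic):
    """Concepts listed for the topic itself (inherited context is ignored here)."""
    return DESCRIPTIVE_CONTEXT.get(topic, [])
```
]]></preformat>
      <p>Under these assumptions, a topic containing only a HurtSpoken observation is classified as
PainTopic, and adding a HeadDeictic observation to the same topic moves it to HeadacheTopic.</p>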
    </sec>
    <sec id="sec-3">
      <title>Context-based Reasoning and Conflict Resolution</title>
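      <p>Context-based reasoning aims at coupling the semantics of conversational awareness
with background knowledge, such as medical and profile information, in order to
acquire a better understanding of the situation, resolve conflicts and provide the most
plausible responses. Each conversational topic <italic>t</italic> is associated with a defeasible rule
base <italic>R</italic><sub><italic>t</italic></sub> that handles domain contextual semantics. Assuming that <italic>T</italic> is the set of all
supported conversational topics (∀<italic>t</italic> ∈ <italic>T</italic>, <italic>t</italic> ⊑ Topic), we define</p>
      <disp-formula>∀<italic>t</italic> ∈ <italic>T</italic>, <italic>R</italic><sub><italic>t</italic></sub> = {<italic>r</italic><sub><italic>i</italic></sub> : <italic>A</italic>(<italic>r</italic><sub><italic>i</italic></sub>) ↬ <italic>C</italic>(<italic>r</italic><sub><italic>i</italic></sub>)}</disp-formula>
      <p>where <italic>r</italic><sub><italic>i</italic></sub> is a unique label of the rule, <italic>A</italic>(<italic>r</italic><sub><italic>i</italic></sub>) is the antecedent, <italic>C</italic>(<italic>r</italic><sub><italic>i</italic></sub>) is the consequent
and ↬ indicates the rule type: strict (→), defeasible (⇒) or defeater (↝). Intuitively, the
detection of <italic>t</italic> triggers the inference mechanisms of the defeasible rule base <italic>R</italic><sub><italic>t</italic></sub>.</p>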
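      <p>The association between topics and rule bases described above can be mirrored by a small
data structure. The Python sketch below is only one possible encoding: the names RuleType, Rule,
RULE_BASES and rule_base_for are hypothetical, and the literals a, b and c are placeholders
rather than rules taken from the framework.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum


class RuleType(Enum):
    STRICT = "->"
    DEFEASIBLE = "=>"
    DEFEATER = "~>"


@dataclass(frozen=True)
class Rule:
    label: str             # unique label of the rule
    antecedent: frozenset  # literals that must hold for the rule to apply
    consequent: str        # literal supported (or, for defeaters, blocked)
    kind: RuleType

    def __str__(self):
        body = ", ".join(sorted(self.antecedent))
        return f"{self.label}: {body} {self.kind.value} {self.consequent}"


# Hypothetical rule bases keyed by topic class; a, b, c are placeholder literals.
RULE_BASES = {
    "HeadacheTopic": [
        Rule("r1", frozenset({"a"}), "b", RuleType.DEFEASIBLE),
        Rule("r2", frozenset({"a", "c"}), "-b", RuleType.DEFEATER),
    ],
}


def rule_base_for(topic):
    """Detecting a topic selects its rule base (empty if none is defined)."""
    return RULE_BASES.get(topic, [])
```
]]></preformat>
      <p>Keying the rule bases by topic class keeps the lookup that follows topic detection to a
single dictionary access, and the frozen dataclass makes each rule hashable and immutable.</p>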
    </sec>
    <sec id="sec-4">
      <title>Use Case</title>
      <p>We describe a simulated evaluation of our framework, which is part of the KRISTINA
agent [10] (Fig. 1). The evaluation involves interaction with users at home in order to acquire
information about their condition and suggest treatments for frequent problems. In one
of the evaluation scenarios, the user informs the agent about feeling pain (“I feel pain”).
The Dialogue Manager (DM) collects the incoming verbal observation, which involves
a hurt reference captured by language analysis, and builds the current context:</p>
      <disp-formula id="eq9"><label>(9)</label>Topic(t1), HurtSpoken(h1), contains(t1, h1)</disp-formula>
      <p>The context is then passed to Conversational Awareness, which interprets the topic and,
according to axioms (2) and (3), classifies t1 in the PainTopic class. Next, the
available descriptive context is collected. According to (7), PainTopic is associated with
the Intensity and BodyPart concepts, which are sent back to the DM to decide on the next
steps. In our scenario, the DM decides to further enrich the current
conversational context by asking the user where it hurts. The user points to his head and
says: “It hurts here”. Again, a spoken hurt reference is detected by speech analysis,
together with a deictic gesture towards the head. Both observations are added to the topic
instance t1 of (9) using contains assertions:</p>
      <disp-formula id="eq10"><label>(10)</label>HurtSpoken(h2), HeadDeictic(hd1), contains(t1, h2), contains(t1, hd1)</disp-formula>
      <p>The new contextual information is passed to Conversational Awareness to reason
again over the current context. The enriched context now satisfies (4), so
HeadacheTopic becomes the current conversational topic and is sent back to the DM along
with its descriptive context, SleepQuality and CoffeeConsumption (derived by (8)). The DM
decides not to enrich the context further (e.g. by asking questions about sleep problems
or coffee consumption habits) and propagates the current context (HeadacheTopic) to
Context-based Reasoning for generating appropriate responses.</p>
      <fig id="fig1">
        <label>Fig. 1</label>
        <caption>
          <p>Component labels recovered from the figure: verbal and non-verbal observations, Conversational Awareness, Topic Understanding, DL reasoning, Descriptive Context, Context-based Reasoning, Conflict Resolution, defeasible rules, User Profile, Ontology-based Question Answering, Generic Content, Language Generation, Avatar.</p>
        </caption>
      </fig>
      <p>The generic defeasible logic rule base for HeadacheTopic contains defeasible rules for
relevant treatment recommendations, labelled <italic>r</italic><sub>1</sub> to <italic>r</italic><sub>6</sub>, together with a superiority
relation that includes <italic>r</italic><sub>5</sub> &gt; <italic>r</italic><sub>3</sub>. [Rule bodies and the remaining superiority pairs are not recoverable from the source.]</p>
      <p>According to the profile of the elderly user, he suffers from frequent migraines and caffeine
intolerance. Therefore, personalized rules are also considered. [Rules not recoverable from the source.]</p>
      <p>In addition, the scenario involves a sleep sensor that monitors night sleep quality and
provides an assessment every morning. A defeater enriches context-based
reasoning by fusing sleep quality information that overrides <italic>r</italic><sub>1</sub>. [Defeater not recoverable from the source.]</p>
      <p>Evaluated with SPINdle [11], the rule base of the example finally recommends that the user
should take strong painkillers for his headache, since he suffers from migraines,
overriding other plausible recommendations based on profile and sleep-related information.</p>
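      <p>To make the interplay of defeasible rules, superiority and defeaters in this scenario more
concrete, the following Python sketch evaluates a toy rule set. It is a heavily simplified
approximation of defeasible reasoning (only defeasible rules and defeaters, antecedents matched
against facts only, no rule chaining) and does not reproduce SPINdle. All rule labels (x1 to x5),
literals, facts and the superiority pair are invented for illustration and are not the rules of
the use case.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import namedtuple

# kind is "defeasible" or "defeater"; a leading "-" on a literal marks negation.
Rule = namedtuple("Rule", "label antecedent consequent kind")


def negate(literal):
    """Return the complementary literal."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def conclusions(facts, rules, superiority):
    """Literals that win their conflicts.

    superiority is a set of (stronger_label, weaker_label) pairs.
    Simplification: antecedents are matched against facts only (no chaining).
    """
    applicable = [r for r in rules if set(r.antecedent).issubset(facts)]
    result = set()
    for literal in {r.consequent for r in applicable}:
        # Defeaters can block a conclusion but never support one.
        support = [r for r in applicable
                   if r.consequent == literal and r.kind != "defeater"]
        attack = [r for r in applicable if r.consequent == negate(literal)]
        # Every attacking rule must be beaten by some supporting rule.
        if support and all(
            any((s.label, a.label) in superiority for s in support)
            for a in attack
        ):
            result.add(literal)
    return result


# Hypothetical rules, invented for this sketch only.
RULES = [
    Rule("x1", ("headache",), "suggest_coffee", "defeasible"),
    Rule("x2", ("headache",), "suggest_rest", "defeasible"),
    Rule("x3", ("caffeine_intolerance",), "-suggest_coffee", "defeasible"),
    Rule("x4", ("headache", "migraine_history"),
         "suggest_strong_painkiller", "defeasible"),
    Rule("x5", ("good_sleep",), "-suggest_rest", "defeater"),
]
SUPERIORITY = {("x3", "x1")}  # the profile-based rule overrides the generic one
FACTS = {"headache", "migraine_history", "caffeine_intolerance", "good_sleep"}
```
]]></preformat>
      <p>With these invented inputs, conclusions(FACTS, RULES, SUPERIORITY) keeps the painkiller
recommendation and the negated coffee suggestion, while the rest suggestion is blocked by the
sleep-related defeater. Removing good_sleep from the facts lets the rest suggestion through.</p>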
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper we presented a framework for conversational awareness and conflict
resolution in spoken dialogue systems combining ontologies and defeasible reasoning.
OWL is used to model multimodal input and the semantics that underpin the
conversational logic, while defeasible rules provide the non-monotonic semantics needed to
deliver intuitive knowledge representation and advanced conflict resolution.</p>
      <p>We are currently conducting pilots for collecting additional data and evaluating the
framework with more use cases. In parallel, we are working towards further enrichment
of the fusion and interpretation capabilities of the framework, so as to support additional
use cases, e.g. taking into account emotions and facial expressions.</p>
      <p>Acknowledgements. This work has been partially supported by the H2020-645012
project “KRISTINA: A Knowledge-Based Information Agent with Social Competence
and Human Interaction Capabilities”.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Maier</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nute</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Well-founded semantics for defeasible logic</article-title>
          .
          <source>Synthese</source>
          .
          <volume>176</volume>
          ,
          <fpage>243</fpage>
          -
          <lpage>274</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Damljanović</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agatonović</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Improving habitability of natural language interfaces for querying ontologies with feedback and clarification dialogues</article-title>
          .
          <source>Web Semant. Sci. Serv. Agents World Wide Web</source>
          .
          <volume>19</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The description logic handbook: theory, implementation, and applications</article-title>
          . Cambridge University Press (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Prokofyev</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tonon</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luggen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vouilloz</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Difallah</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cudré-Mauroux</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>SANAPHOR: Ontology-Based Coreference Resolution</article-title>
          . In: International Semantic Web Conference. pp.
          <fpage>458</fpage>
          -
          <lpage>473</lpage>
          . Springer (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dourlens</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramdane-Cherif</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monacelli</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Multi levels semantic architecture for multimodal interaction</article-title>
          .
          <source>Appl. Intell</source>
          .
          <volume>38</volume>
          ,
          <fpage>586</fpage>
          -
          <lpage>599</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Perperis</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giannakopoulos</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Makris</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosmopoulos</surname>
            ,
            <given-names>D.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsekeridou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perantonis</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Theodoridis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Multimodal and ontology-based fusion approaches of audio and visual processing for violence detection in movies</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>38</volume>
          ,
          <fpage>14102</fpage>
          -
          <lpage>14116</lpage>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Atrey</surname>
            ,
            <given-names>P.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hossain</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El Saddik</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kankanhalli</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          :
          <article-title>Multimodal Fusion for Multimedia Analysis: A Survey</article-title>
          .
          <source>Multimed. Syst</source>
          .
          <volume>16</volume>
          ,
          <fpage>345</fpage>
          -
          <lpage>379</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Modgil</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prakken</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>The ASPIC+ framework for structured argumentation: a tutorial</article-title>
          .
          <source>Argum. Comput</source>
          .
          <volume>5</volume>
          ,
          <fpage>31</fpage>
          -
          <lpage>62</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Prakken</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>On dialogue systems with speech acts, arguments, and counterarguments</article-title>
          . In:
          <source>European Workshop on Logics in Artificial Intelligence</source>
          . pp.
          <fpage>224</fpage>
          -
          <lpage>238</lpage>
          . Springer (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wanner</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dasiopoulou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.:
          <article-title>Towards a Multimedia Knowledge-Based Agent with Social Competence and Human Interaction Capabilities</article-title>
          . In:
          <source>Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction</source>
          . pp.
          <fpage>21</fpage>
          -
          <lpage>26</lpage>
          . ACM (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>H.-P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Governatori</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The making of SPINdle</article-title>
          . In:
          <source>International Workshop on Rules and Rule Markup Languages for the Semantic Web</source>
          . pp.
          <fpage>315</fpage>
          -
          <lpage>322</lpage>
          . Springer (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>