<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Conversational Assistants for Elderly Users - The Importance of Socially Cooperative Dialogue</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefan Kopp</string-name>
          <email>skopp@uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katharina Cyra</string-name>
          <email>katharina.cyra@uni-due.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Franz Kummert</string-name>
          <email>franz@techfak.uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lars Schillingmann</string-name>
          <email>lschilli@techfak.uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mara Brandt</string-name>
          <email>mbrandt@techfak.uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Farina Freigang</string-name>
          <email>farina.freigang@uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christiane Opfermann</string-name>
          <email>christiane.opfermann@uni-due.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carolin Straßmann</string-name>
          <email>carolin.strassmann@uni-due.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ramin Yaghoubzadeh</string-name>
          <email>ryaghoub@techfak.uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hendrik Buschmeier</string-name>
          <email>hbuschme@uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicole Krämer</string-name>
          <email>nicole.kraemer@uni-due.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karola Pitsch</string-name>
          <email>karola.pitsch@uni-due.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduard Wall</string-name>
          <email>ewall@techfak.uni-bielefeld.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bielefeld University</institution>
          ,
          <addr-line>CITEC</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Duisburg-Essen</institution>
        </aff>
      </contrib-group>
      <fpage>10</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>Conversational agents can provide valuable cognitive and/or emotional assistance to elderly users or people with cognitive impairments, who often have difficulties in organizing and following a structured day schedule. Previous research showed that a virtual assistant that can interact in spoken language would be a desirable help for these users. However, these user groups pose specific requirements for spoken dialogue interaction that existing systems hardly meet. This paper presents work on a virtual conversational assistant that was designed for, and together with, elderly as well as cognitively handicapped users. It has been specifically developed to enable 'socially cooperative dialogue': adaptive and aware conversational interaction in which mutual understanding is co-constructed and ensured collaboratively. The technical approach is described and results of evaluation studies are reported.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>In recent years, politics and society have placed emphasis on ways
to enable a longer autonomous and self-determined life for elderly
people. One approach is the development of assistive technology.
However, such technology has often focused on supporting physical
tasks (e.g., fetching or lifting objects, moving around) and has
struggled with questions of human–machine interaction and user
acceptance. The goal of the KOMPASS project (which started in 2015)
was to develop a virtual assistant (‘Billie’) to accompany and guide a
user throughout the day. The system has been specifically designed
for, and together with, two user groups: elderly users who live
autonomously in their home environment but are on the verge of
needing home assistance services, and cognitively handicapped
users who are already supported by professional care-givers. What
both user groups have in common are mild cognitive impairments
that create a need for support with autonomously organizing and
following a structured day schedule.</p>
      <p>
        While technical means of supporting this are already available,
many elderly users have little prior experience with assistive
systems. Applying such technology thus requires overcoming a
‘digital barrier’, both for the individual users and for their
care-providing environment. The KOMPASS project built on pre-studies
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] suggesting that natural spoken-language interaction with a
virtual agent may be desirable and acceptable for these user groups.
Building and applying conversational agents for these user groups,
however, raises its own challenges. Elderly users often have
selectively impaired abilities, e.g., for auditory perception, articulation,
adapting to a recommended interaction style, adhering to a clean
turn-taking structure, or comprehending content of high
information density [
        <xref ref-type="bibr" rid="ref35 ref37">35, 37</xref>
        ]. We thus set out to develop a conversational
agent that provides a dialogue style that enables robust and reliable,
yet acceptable spoken-language interactions with these user groups.
We refer to this special quality as ‘socially cooperative dialogue’
[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ].
      </p>
      <p>In this paper we present our approach and report on results
obtained in evaluation studies. After discussing related work in
the next section, Sect. 3 points out requirements before Sect. 4
describes our approach to modeling socially cooperative dialogue in
the virtual assistant ‘Billie’. Section 5 presents results and lessons
learned from several evaluation studies carried out with users in
the lab environment as well as in their real home environment
(ongoing), showing how conversational agents can be built to achieve
the interaction abilities needed to provide elderly users and mildly
cognitively impaired persons with successful assistance.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>
        Several conversational assistants have been developed for
care-related settings such as companionship for people living alone [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ],
assistance in multilingual care-giving/receiving [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], or pain and
affect management [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. An increasing body of work suggests that
spoken interaction with users with cognitive impairments seems,
in general, to be feasible and accepted by the user group, though
some requirements need to be met. Meis [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] noted that older
subjects want a spoken-dialogue helper to have a name and to react
contingently to social affordances such as expressions of gratitude.
Miehle and colleagues [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] noted that conversational assistants are
required to speak sufficiently loudly and at an appropriate pace, but
were accepted as interlocutors in a study with elderly people.
Bickmore and colleagues [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] analyzed long-term interactions of older
adults with an agent-based coaching system that used spoken language
output while user input was given through a touchscreen, and found
the system effective in the short term. Sidner and colleagues [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] attempted to
identify preferred domains of conversation or joint activity based
on this system design. Yaghoubzadeh and colleagues [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] reported
that older adults and people with cognitive impairments are able
to successfully ground information. Explicit confirmation patterns
and a low information density (one information unit per utterance)
enabled users to detect and repair more of the system’s
language-understanding problems and subsequent errors.
      </p>
      <p>
        A well-discussed concept that is central for successful human–
human dialogue is ‘grounding’ [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Researchers have attempted to
model it computationally for conversational agents, discretely as
a finite-state process [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] or probabilistically using Bayesian
networks [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Recent work on real-time dialogue systems has addressed
more advanced issues, such that discourse context is taken into account
[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], partial and overlapping utterances can be grounded
incrementally [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], groundedness can be estimated from multimodal
feedback cues [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or information from multiple modalities related
to socio-emotional aspects such as attention and engagement is
taken into account [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>REQUIREMENTS</title>
    </sec>
    <sec id="sec-4">
      <title>Interactional tasks</title>
      <p>
        The goal of the KOMPASS project was to develop a conversational
assistant that helps users organize and keep track of their schedule
for the day. This goal was identified by our application partner
(v. Bodelschwingh Foundation Bethel, Bielefeld, Germany) as an
important need of their respective client groups. The conversational
agent system ‘Billie’ thus offers several functions within the domain
of schedule management. Users can enter various kinds of
appointments (single appointments, or recurrent appointments with a
specification of recurrence, e.g., weekly or biweekly, and of how long
the recurrence is to be reiterated), they can choose to be reminded
of them, including setting a time for the reminder, and they can edit
appointments that have already been entered. Editing comprises the
following sub-tasks: users can change any of the appointment values
‘start time’, ‘end time’, ‘topic’ and ‘duration’, they can delete or
replace appointments within the calendar, and they can query their
entered appointments for any point in time (same day, same week,
forthcoming or previous days and weeks). Moreover, the agent system
provides user-tailored suggestions for leisure-time activities [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ] in order
to promote a more active life.
      </p>
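      <p>For illustration, the following minimal sketch (in Python, with hypothetical names; it is not the actual KOMPASS data model) captures the appointment values and recurrence options described above:</p>
      <preformat>
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Appointment:
    topic: str                              # editable value 'topic'
    start: datetime                         # editable value 'start time'
    duration: timedelta                     # editable value 'duration'
    recurrence: Optional[timedelta] = None  # e.g. weekly or biweekly
    recur_until: Optional[datetime] = None  # how long it is reiterated
    reminder_at: Optional[datetime] = None  # time chosen for the reminder

    @property
    def end(self):
        # editable value 'end time', derived from 'start time' + 'duration'
        return self.start + self.duration
      </preformat>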
    </sec>
    <sec id="sec-5">
      <title>Socially cooperative dialogue</title>
      <p>
        In line with other previous work, focus groups and pre-studies that
we carried out within a user-centered design process made clear
that the assistive functions of the system should preferably be
accessible and realized through easy-to-use spoken dialogue interaction [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Further analyses of several quasi-experimental studies (run as a
Wizard-of-Oz scenario in 2015, as a semi-autonomous study in
2017, and as an ongoing long-term study; see Section 5) pointed to
the fact that spoken-language interaction, while being generally
preferred, raises a number of (well-known) challenges that tend
to be amplified in our user groups. The conversational assistant
therefore has to fulfill specific requirements to be acceptable to the
users and successful with respect to the various sub-tasks of
schedule management:
• The system has to be able to deal with long, extensive user
utterances. Interruptions and barge-ins by the user must be
possible at all times, in particular when they are instrumental
in solving the communicative task at hand. Overall,
turn-taking has to be cooperative, such that interruptions
by the system should be foreshadowed through nonverbal
behavior [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]. Simultaneously, the system must be robust to
non-cooperative turn-taking behavior of the user, such that
fights over the turn are avoided (generally by yielding the
turn to the user).
• Generally, the system must work to ensure the dynamic
coordination of understanding and grounding in dialogue.
Feedback by the system to user input must be provided in a
timely manner to prevent long user turns, and must clearly
mark the system’s current level of understanding [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]; user feedback must
be continuously processed and interpreted for indicators of
miscommunication.
• The handling of understanding problems on the part of the
system or the user is crucial. User-initiated displays of
non-understanding (e.g., "Sorry" or "Can you repeat please") must
always be possible and handled properly by the system. For
non-specific system displays of non-understanding, a
variation of error-handling strategies must be available, e.g.,
in the form of reprompts and non-understanding notifications,
combined with more restrictive clarification sequences
depending on the sub-task and local move (see the sketch
after this list). While a reprompt may be beneficial for
problem solving when first issued [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ],
a lack of progress in problem solving without a change in
error-recovery strategy may lead to further complications
(cf. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]). This holds especially in action contexts like
appointment suggestions, in which user responses can address
social matters like willingness, availability, disposition and
deontic authority. Thus, error handling on the part of the
system should employ strategies that clarify the user’s
agreement or resistance [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
• Topic shifts by the user must be possible and followed readily
by the system. This requires the system to also keep track
of non-settled discourse segments and to return to them at
appropriate points in time (when the user does not pursue
other discourse goals) and with a cooperative and gentle
entrance strategy, possibly repeating and rectifying parts
of the discourse unit that have been discussed already.
Generally, the system should avoid topic shifts. If they are
unavoidable, e.g., because a previous discourse topic has not
been settled yet, they should be of as small a topical distance
as possible and must be marked explicitly.
      </p>
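      <p>The following sketch illustrates the escalation of error-recovery strategies referred to in the third requirement above (a minimal Python sketch with hypothetical strategy names; it is not the implemented policy):</p>
      <preformat>
# Illustrative escalation policy for system displays of non-understanding:
# a reprompt may help when first issued, but repeating it without a change
# of strategy risks further complications.
RECOVERY_LADDER = (
    "reprompt",                  # open request to repeat or rephrase
    "notify_non_understanding",  # explicit display of non-understanding
    "restrictive_clarification", # e.g. a yes/no question about one slot
)

def next_recovery_move(failed_attempts):
    """Escalate instead of repeating the same strategy without progress."""
    index = min(failed_attempts, len(RECOVERY_LADDER) - 1)
    return RECOVERY_LADDER[index]
      </preformat>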
      <p>
        The requirements listed above are examples (necessary, but most
probably not sufficient ones) of a specific dialogue quality that we
deem necessary for our user groups. We refer to this dialogue quality
as ‘socially cooperative’ [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] and note that it goes well beyond classical
notions like grounding in dialogue, as it implies a specific role that
the agent has to fulfill consistently throughout the interaction. This
role entails a range of collaborative-supportive action policies, e.g.,
for readily following topic shifts, yielding turns, adhering to rules
of politeness, and adapting dialogue structures thoroughly to the
needs of the user.
      </p>
    </sec>
    <sec id="sec-6">
      <title>APPROACH AND IMPLEMENTATION</title>
    </sec>
    <sec id="sec-7">
      <title>Architecture</title>
      <p>
        To account for the requirements identified above, we have based
the conversational assistant on an interaction framework that aims
to support incrementality (to quickly update and relay discussed
information), provisions for the representation and resolution of
uncertainty (resulting from input and unclear grounding), and an
explicit representation of topics, structured hierarchically in
units intuitive to laymen. The overall architecture is shown in Fig. 2.
It is built on top of the IPAACA middleware [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], a distributed,
platform-independent implementation of a general model for
incremental dialogue processing proposed by Schlangen and Skantze
[
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. This provides the back-end for the connection of the core
dialogue management components to input (including ASR, tagger
and parser, eye tracker, keyboard/mouse/touch, etc.), multimodal
fusion, behavior planning (NLG, gaze, gesture, calendar) and output
realization (synthesis, graphical components/GUI changes, control
of animated characters, etc.).
      </p>
      <p>[Figure 2: Overall system architecture. Input modules (Nuance Dragon ASR, Tobii 4C eye tracking, head-gesture recognition, filled-pause detection) feed fusion and interpretation, natural language understanding, and the estimation of contact and engagement; the ‘flexdiam’ dialogue manager (Issues, planning, flow control, user model) connects via the central timeboard (with tiers such as asrstate, userWords, userGaze, floor, agentWords) to behavior generation, natural language generation, gaze and gesture output, and the calendar.]</p>
      <p>The architecture is built around a ‘timeboard’, a central
representation that captures temporal information about interactional
events on different tiers. Importantly, these tiers hold rewindable
representations of certain and uncertain variables (probability
distributions) with generic metrics – like entropy – that serve as the
basis for local decision heuristics. Event-driven observers are used
to derive new events from interval relations between existing ones,
and trigger higher-level functions, most centrally the dialogue
manager proper, but also a contribution manager, which schedules
queued communicative intentions when the floor situation allows.</p>
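      <p>A minimal sketch of the timeboard idea follows (simplified, not the actual implementation; the tier names follow Fig. 2):</p>
      <preformat>
import math

class Tier:
    """One timeboard tier holding time-stamped interval events."""
    def __init__(self, name):
        self.name = name
        self.intervals = []           # (t_start, t_end, payload) tuples

    def add(self, t_start, t_end, payload):
        # payload is either a certain value or a dict that maps
        # hypotheses to probabilities (an uncertain variable)
        self.intervals.append((t_start, t_end, payload))

def entropy(distribution):
    """Shannon entropy, a generic metric for local decision heuristics."""
    return -sum(p * math.log2(p) for p in distribution.values() if p)

# tier names as shown in Fig. 2
timeboard = {name: Tier(name) for name in
             ("asrstate", "userWords", "userGaze", "floor", "agentWords")}
timeboard["userWords"].add(1.2, 1.9, {"swimming": 0.7, "skimming": 0.3})
      </preformat>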
    </sec>
    <sec id="sec-8">
      <title>Dialogue management</title>
      <p>
        To realize the required dialogue management abilities, the ‘flexdiam’
[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] system has been developed. Following an issue-based approach,
it generally pursues a single joint task and discourse model for both
interactants. The basic structure of the joint task and discourse
model is a forest of hierarchically interdependent agents termed
‘Issues’, together with generic update rules that transform
this forest after dialogue-management invocations. When an Issue
is instantiated, it is at the same time made a child of the Issue that
created it. Any path from a leaf Issue to the root corresponds to a
nested (sub-)topic of discussion. Any number of topics can be active
at any one time and will be considered valid points of reference in
parallel, if applicable according to their grounding state. To that
end, any Issue can be in one of five states (new, entered, fulfilled,
failed, obsolete).
      </p>
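      <p>The following sketch illustrates this structure (simplified; it is not flexdiam’s actual code): each Issue carries one of the five states and becomes a child of the Issue that instantiated it, so that the path from a leaf to the root names a nested topic of discussion.</p>
      <preformat>
STATES = ("new", "entered", "fulfilled", "failed", "obsolete")

class Issue:
    """One node of the joint task and discourse forest (simplified)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.state = "new"
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # instantiation creates the edge

    def path_to_root(self):
        """The nested (sub-)topic of discussion this Issue belongs to."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path

root = Issue("schedule_management")
add = Issue("add_appointment", parent=root)
start = Issue("clarify_start_time", parent=add)
print(start.path_to_root())  # ['clarify_start_time', 'add_appointment', ...]
      </preformat>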
      <p>Invocations that trigger local processing in Issues come in two
flavors: input handling (e.g., prompt request, NLU parse) and plan
structure updates (e.g., child issue progressed, completed or
invalidated). Issues will decide along their local path in the hierarchy,
and based on the current global context, whether they can provide
a plan to handle an invocation. If an Issue cannot handle an input
handling invocation locally, a preference is marked to let its parent
handle it instead. Partial localized processing does not preclude
propagation through the hierarchy, though. This allows for situated
partial interpretation and processing, which is most specific and
situation-dependent in the leaves, and most generic and general in
the roots of the forest.</p>
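      <p>A sketch of this upward delegation (assumed behavior derived from the description above; try_handle is a hypothetical local handler):</p>
      <preformat>
def handle_input(issue, invocation):
    """Walk from a leaf Issue toward the root until some Issue can
    provide a plan; leaves are most specific, roots most generic."""
    node = issue
    while node is not None:
        plan = node.try_handle(invocation)  # hypothetical local handler
        if plan is not None:
            return plan
        node = node.parent                  # prefer the parent instead
    return None  # no active Issue fits: treat as a discourse transition
      </preformat>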
      <p>If a user contribution does not fit well into any active Issue, a
discourse transition based on user initiative can be assumed to have
taken place. Depending on the situation, this could be construed
as either a forward-looking contribution (if anticipated by the
currently invoked entrance point or a direct ancestor) or a real topic
jump. A new branch is then created, marked as entered, and moved
to the top of the entry-point priority queue.</p>
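      <p>Illustratively (building on the Issue sketch above; the queue handling is simplified):</p>
      <preformat>
from collections import deque

entry_points = deque()  # active entry points, highest priority first

def open_topic_branch(root, name):
    """User-initiated discourse transition: open a new branch, mark it
    as entered, and promote it in the entry-point priority queue."""
    branch = Issue(name, parent=root)   # Issue as in the sketch above
    branch.state = "entered"
    entry_points.appendleft(branch)
    return branch
      </preformat>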
      <p>Note that the present system is suited to quick, interactive
approaches to spoken interaction and to modeling real-world
applications within limited domains. Manual extension is quite
straightforward. Incremental processing and the handling of uncertain
input and information derived from it have received special focus;
the ‘output’ side employs a similar notion of indeterminate state
until evidence for communicative success provides a precondition
for attesting grounding. Communicative plans are capable of
employing several modalities, and the implemented suite of basic
Issues for grounding problems can be fine-tuned to cover a wide
space of varying explicitness, verbosity, and conversational styles,
which can be used to seed user models that best suit the
estimated capabilities and preferences of our specific user groups. This
extends to information density (configurable via different options
for packaging and different approaches to confirmation requests),
but also to discourse structure: explicit ratification for topic jumps
beyond a distance threshold (and implicit acceptance by means of
contingent continuation by the user) are currently in development.</p>
    </sec>
    <sec id="sec-9">
      <title>Socio-communicative signal processing</title>
      <p>
        Human communication is highly multi-modal, and thus the ability
to process this variety of information is very important to facilitate
communication with a virtual agent. Therefore, several modules
in our architecture recognize visual communication signals and
non-verbal socio-emotional speech cues. Confirmations play an
important role in the dialogue structure of the interaction with ‘Billie’.
We therefore focused on the recognition of natural confirmation
signals, like nodding and non-lexical confirmations like “mhm”,
which are typically not recognized by automatic speech recognition.
To detect non-lexical confirmations, the speech signal is
segmented into speech intervals using voice activity detection (VAD).
Our module then extracts acoustic features from each interval and
classifies the result using a Support Vector Machine
(SVM). If confirmations are detected, the component sends
messages via the IPAACA middleware to inform other components [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Further, the system is able to detect human head nods based on
dynamic time warping and estimations of head pose angles from
facial landmark features [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. In addition, face detection is used
to verify contact with the user. A cue-aggregation module combines
signals detected in individual modalities to derive a higher-level
interpretation using a Bayesian network (see Fig. 3). Currently, the
system detects whether the user signals confirmation by combining
non-lexical confirmation and nod detection. User contact is detected by
combining face detection and eye-contact-related cues.
      </p>
      <p>
        Regarding output behavior, the conversational assistant ‘Billie’ is
enhanced with multimodal cues such as gestures, facial expressions
and head movements in order to produce more natural behavior
and to make it easily accessible, understandable and helpful for
the human user. The cues were selected based on an analysis of
their form, function and frequency in natural interaction data [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Specifically, we simulate cues that serve the pragmatic functions to
convey and mark emphasis, de-emphasis and (un-)certainty of the
speaker. These functions have been successfully mapped onto the
conversational agent [
        <xref ref-type="bibr" rid="ref10 ref12">10, 12</xref>
        ]. For the calendar domain, key phrases
were selected and are accompanied by these multimodal functions
where applicable (emphasis: “Let’s continue!”, de-emphasis: “Good, this is
canceled”, uncertainty: “Did you say ‘swimming’?”; see Fig. 4). The
multimodal expressiveness has been evaluated with participants
in lab-based studies (e.g., showing better information uptake) and
will be evaluated systematically during a long-term study.
      </p>
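      <p>The following sketch outlines the confirmation-detection pipeline described above (illustrative only: the feature extraction and the trained model are placeholders, and the real component communicates its results via IPAACA):</p>
      <preformat>
import numpy as np
from sklearn.svm import SVC

svm = SVC()  # placeholder; assume a model trained offline on "mhm" data

def detect_confirmations(vad_segments, extract_features):
    """Classify VAD speech intervals as non-lexical confirmations."""
    confirmations = []
    for segment in vad_segments:              # intervals from the VAD
        features = extract_features(segment)  # acoustic feature vector
        label = svm.predict(np.array([features]))[0]
        if label == 1:                        # 1 = confirmation ("mhm")
            confirmations.append(segment)
            # the real component now sends an IPAACA message to
            # inform the other components
    return confirmations
      </preformat>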
      <p>
        In addition to the assistant ‘Billie’ itself, we designed different
states as well as visual and sound features of the week-based calendar
that support the dialogue between user and agent multimodally. As
previous analyses have shown [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], users orient toward the calendar
area of the interface while entering appointments, as this is the
area of interest for the ongoing interactional task. Thus, to design
a responsive and recipient system that supports the dialogue not
only with verbal and non-verbal features of the agent, the calendar
provides visual cues (mainly through highlighting) of the system’s
comprehension hypotheses even before words are uttered by
the agent (see Fig. 5). These visual updates represent the system’s
status regarding the understanding of user input. Furthermore, a
sound was added to mark the successful entry of appointments.
      </p>
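      <p>A sketch of this coupling (hypothetical interface names; the actual GUI component differs): as soon as a comprehension hypothesis exists, the calendar area is highlighted, and a sound marks a committed entry.</p>
      <preformat>
def on_comprehension_hypothesis(hypothesis, calendar_view):
    """Highlight the calendar area as soon as a hypothesis about the
    user's appointment exists, before the agent utters anything."""
    day = hypothesis.get("day")
    if day is not None:
        calendar_view.highlight(day=day, start=hypothesis.get("start"))

def on_appointment_entered(calendar_view):
    calendar_view.play_sound("entry_success")  # marks the successful entry
      </preformat>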
    </sec>
    <sec id="sec-10">
      <title>Data Recording</title>
      <p>Our system architecture is designed to be used in both lab and field
environments. As manual recording control is not feasible in long-term
field studies, the architecture contains an automatic recording
module that starts recording when the user starts interacting with
the system by pushing a button. Recording is automatically stopped
when the user says goodbye and, in order to ensure users’ privacy,
when the user is not visible to the system and does not react to system
prompts for an extended period of time, or when a system error is
detected. The recordings comprise five video and five audio tracks,
which are compressed in real time using hardware acceleration.
One video track (depicted in Fig. 6) is a 4-in-1 overview of the other
video tracks. System and interaction log files are archived after each
session.</p>
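      <p>The start/stop logic can be summarized as follows (a simplified sketch; the timeout value is an assumption, not the deployed setting):</p>
      <preformat>
class RecordingController:
    """Simplified start/stop logic for the automatic recording module."""
    def __init__(self, recorder, absence_timeout_s=300.0):
        self.recorder = recorder                    # wraps the A/V pipeline
        self.absence_timeout_s = absence_timeout_s  # assumed value

    def on_interaction_button(self):
        self.recorder.start()       # user initiates an interaction

    def on_event(self, event, absent_seconds=0.0):
        stop = (
            event == "user_said_goodbye"
            or event == "system_error"
            # privacy: user invisible and unresponsive for too long
            or (event == "user_absent"
                and absent_seconds >= self.absence_timeout_s)
        )
        if stop:
            self.recorder.stop()
      </preformat>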
    </sec>
    <sec id="sec-11">
      <title>EVALUATION RESULTS</title>
      <p>
        In our project we followed an iterative design-implement-evaluate
approach that comprised a number of empirical evaluations of
individual sub-systems of the agent [
        <xref ref-type="bibr" rid="ref10 ref17 ref18 ref25 ref26 ref3 ref30 ref5 ref8 ref9">3, 5, 8–10, 17, 18, 25, 26, 30</xref>
        ].
Further, to gather training data for the socio-emotional signal
recognition components and to get insights into users’ reactions
to potential interaction problems, an initial Wizard-of-Oz study
with 53 participants from the different user groups (18 senior
participants, 19 participants with cognitive impairments, and 16 controls
from the local student population) was carried out. In order to elicit
participants’ reactions, the agent’s behavior in this study followed
a script that was designed to create a number of typical interaction
problems. Participants could negotiate and enter their own
appointments, but could also use previously prepared appointment cards.
In the following, we report on more recent studies that were carried
out to evaluate the socially cooperative dialogue abilities of the
full-blown conversational assistant in less restricted interactions.
      </p>
    </sec>
    <sec id="sec-12">
      <title>Lab-based evaluation</title>
      <p>
        Based on first insights gathered from the WOz study described
above, as well as on knowledge acquired in preparatory studies
[
        <xref ref-type="bibr" rid="ref13 ref35">13, 35</xref>
        ], a first version of a semi-autonomous agent was evaluated
in a laboratory setting. This study investigated whether participants
from the different user groups – without specific instructions –
were able to carry out calendar-related tasks through spoken
interaction with the agent. Furthermore, the study’s objective was
to investigate how participants manage transitions between bigger
topics/issues, how the length of participants’ utterances varies given
different confirmation strategies, how users react to communication
of uncertainty, and whether the agent’s ways of guiding the users’
attention (via voice, manual gesture, gaze, calendar highlighting,
and sounds) are effective.
      </p>
      <p>
        We employed a system that autonomously handled dialogue
management for entering appointments as well as for making
appointment suggestions. A human ‘wizard’ was included only for
controlling transitions between global modes – the user entering
appointments, auto-generated partial suggestions of appointments
by the agent (“Would you like to do something on Saturday?”), and
closing the interaction. 44 participants took part in the study: 19
older adults (SEN) aged about 75 and above; 15 cognitively impaired
adults (CIM) of working age; and 10 students serving as a control group
(CTL). The task was free-form entering of appointments. All
subjects managed to enter the required number of appointments into
the calendar. The number of final entries averaged 10.4, 8.5, and
8.9 for CTL, SEN and CIM, respectively (including up to two
agent-recommended items). Older adults and the group with impairments
on average spent about 20% longer on a topic than controls; some
participants from the CIM group made long hesitations in isolated
instances (up to tens of seconds; see [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] for a detailed discussion).
Still, the socially cooperative dialogue abilities of the agent enabled
the users to conduct successful repair with the agent and to settle
on acceptable solutions in every subtask.
      </p>
      <p>The study was also conducted to gain qualitative insights into
the repair, revision and meta-communicative patterns exhibited by
the user groups. Further, we wanted to observe temporal aspects
of the planning and verbalizing of an appointment to learn about
participants’ practices in different phases of appointment entry
(to eventually design the timing and turn-taking of the system).
Analyses of the data are still ongoing and focus, among other things,
(1) on the analysis of gaze when interactional trouble occurs, (2) on
the system’s repair strategies, and (3) on the different multimodal
states and uptake strategies of the system (as a representation of
its recipiency) with effects on the turn production of the users.</p>
    </sec>
    <sec id="sec-13">
      <title>Field study</title>
      <p>
        A long-term field study is currently ongoing to evaluate the
system’s performance and effects, as well as how participants adopt
and handle it in their home environment over a 15-day period. For
this study we implemented and apply a fully autonomous system
with no additional aid by a human wizard (see Fig. 7 for the setup).
This study comprises an ethnographic component focusing on daily
life management in the homes of seniors living alone and of people
with cognitive impairments in supported living [
        <xref ref-type="bibr" rid="ref1 ref7">1, 7</xref>
        ]. These
analyses also aim to shed light on broader questions such as: What are
the challenges when a novel technology is brought into a household
of the target groups? What are the participants’ experiences and
expectations of a technological assistant? What are the effects of
the assistant on participants’ daily routines and in particular their
schedule management? And, an overall topic considered
throughout the project, what are the issues with regard to privacy and data
protection for our special user groups?
      </p>
      <p>Overall, the field study followed an iterative approach consisting
of three phases. In the pre-pilot study, we aim to acquaint
participants with the system and to identify their specific expectations
and needs. Researchers conduct semi-structured interviews in the
apartments of the participants and discuss the possible placement of the
system in the apartment using a full-size paper prototype. Next, a
pilot study evaluates the feasibility of the main study with a special
focus on the robustness of the system and its performance outside
of the lab. To this end, the prototype system is set up in the
apartments of the participants for a period of about 48 hours. In addition
to the system evaluation, we want to learn about the acceptance
of the system by the participants and their assessment of the
dialogue design. This provides the basis for further optimization of
the dialogue design and the preparation of the main study, for which
the prototype system is placed in the participants’ apartment for
a period of about 15 days (including setup and dismounting days).
Participants are asked to manage their daily schedule together with
‘Billie’ and to jot down their impressions of the system in a
research diary (freely as well as in response to structured questions).
After the period of using the system, participants give a final
assessment in the form of a semi-structured interview.</p>
      <p>First results. The study design described above has been carried
out with one female senior person as the first long-term study. The
system was in use in the participant’s home for 13 days
(excluding a setup and a dismounting day), during which she interacted
61 times with the system, for a total duration of 284 minutes and
46 seconds. She used the system a mean of 4.8 times
per day (SD = 1.7, Min = 2, Max = 8). Although usage over the
duration of the study varied between days, it did not differ much
between the first seven days (M = 5.3, SD = 1.8, Min = 3, Max = 8)
and the last six days (M = 4.2, SD = 1.6, Min = 2, Max = 6),
suggesting that the participant did not lose interest.</p>
      <p>Interaction durations varied greatly, with the shortest interaction
lasting only 6 seconds and the longest lasting 18 minutes
and 34 seconds. The mean interaction duration is 4:26 minutes (SD =
5:11). This is mainly because interactions differ depending
on the type of activity. Interactions that are initiated by the user
are typically longer, whereas agent-initiated reminder interactions
can often be handled quickly and usually do not lead to longer
dialogues. Pending a detailed analysis of the actual interaction
logs, we differentiate between reminder-based interactions and
other interactions by setting a threshold of 120 seconds. The 33
shorter – probably reminder – interactions have a mean duration
of 45 seconds (SD = 0:33). The 28 longer interactions have a mean
duration of 8:57 minutes (SD = 4:37). This differentiation gives
further insight into daily usage of the system: a mean of 2.5 (SD =
1.7) of the interactions per day were reminders, and a mean of 2.2
(SD = 1.0) of the interactions lasted longer than 120 seconds.</p>
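      <p>The split itself is a simple duration heuristic (illustrative sketch, not the analysis script actually used):</p>
      <preformat>
def split_interactions(durations_s, threshold_s=120):
    """Heuristically separate (probably) reminder-based interactions
    from other interactions by their duration in seconds."""
    reminders = [d for d in durations_s if threshold_s > d]
    others = [d for d in durations_s if d >= threshold_s]
    return reminders, others
      </preformat>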
      <p>The durations of the interactions indicate that the participant
used the conversational assistant quite a lot. After the 13 days of
usage she had entered 67 unique events in her calendar, five of
which were serial events (yielding a total of 132 events displayed
in the calendar). The large number of reminder-based interactions
(33) also indicates that she successfully used this function of the
system.</p>
      <p>As the first main study has just ended, further analyses regarding
changes to daily life and routines are still pending. However, the
pre-pilot and pilot studies have already been carried out with other
elderly participants, who could interact with the prototype system
for three days at home. Various assessments and concerns were
gathered from these participants as anecdotal feedback:
• Concerning social presence and relationship: “We greet
each other kindly every morning and he asks what he
can do for me. It’s great.”
• Concerning the dialogue: “Whenever we had problems in a
conversation, we could resolve them. Was really nice.”
• After coming out of coma and being isolated in hospital:
“I would have been glad about having such a guy next to
my bed.” (in order to practice speaking and to have social
interactions)</p>
      <p>• Concerning the size of the current setup and affordability:
“Who wants to have this in the living room? Not me, and I
wouldn’t buy it either, if it’s a lot of money.”
• Concerning the duration of acquaintance: “He doesn’t know
anything about me. If that’s possible, one has to use such a
device for at least half a year to get used to it.”</p>
    </sec>
    <sec id="sec-14">
      <title>CONCLUSIONS</title>
      <p>The present work has explored how conversational agents can be
used to provide cognitive or emotional assistance to elderly users.
We have focused in particular on the use of spoken-language
dialogue as a preferred way of interacting with technical systems (as
indicated by studies by others as well as ourselves). Yet, enabling
successful and acceptable dialogue with these user groups raises
several challenges, and communication problems quickly abound
with off-the-shelf dialogue system technology. However, our
findings indicate that virtual assistants can still be an effective and
acceptable help if they provide abilities for the kind of socially
cooperative dialogue needed to resolve these issues. The key
insight of the present project is how conversational agents can be
built such that this is possible for the majority of issues, even for
the special user groups of persons with a mild cognitive
impairment (and often also additional motoric or perceptual handicaps).
This requires numerous things, from the processing of subtle,
multimodal and context-dependent communication-relevant signals,
to generating them in combination with visual cues (calendar), to
enabling a highly flexible dialogue with responsive turn-taking,
communicative feedback, and (pro-)active strategies for avoiding
communication problems as well as repairing them. One
prerequisite for achieving this was a high degree of user involvement
throughout the design and implementation phases, which also helped
a lot in increasing acceptance and willingness to participate in the
project.</p>
    </sec>
    <sec id="sec-15">
      <title>ACKNOWLEDGMENTS</title>
      <p>This research was supported by the German Research Foundation
(DFG) in the Cluster of Excellence ‘Cognitive Interaction
Technology’ (EXC 277) and by the German Federal Ministry of Education
and Research (BMBF) in the project ‘KOMPASS’ (FKZ 16SV7271K).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Antje</given-names>
            <surname>Amrhein</surname>
          </string-name>
          , Katharina Cyra, and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Processes of reminding and requesting in supporting people with special needs. Human practices as basis for modeling a virtual assistant?</article-title>
          .
          <source>In Proceedings 1st ECAI Workshop on Ethics in the Design of Intelligent Agents. The Hague, The Netherlands</source>
          ,
          <fpage>18</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Timothy W.</given-names>
            <surname>Bickmore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rebecca A.</given-names>
            <surname>Silliman</surname>
          </string-name>
          , Kerrie Nelson, Debbie M. Cheng, Michael Winter, Lori Henault, and
          <string-name>
            <given-names>Michael K.</given-names>
            <surname>Paasche-Orlow</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>A randomized controlled trial of an automated exercise coach for older adults</article-title>
          .
          <source>Journal of the American Geriatrics Society</source>
          <volume>61</volume>
          (
          <year>2013</year>
          ),
          <fpage>1676</fpage>
          -
          <lpage>1683</lpage>
          . https://doi.org/10.1111/jgs.12449
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Mara</given-names>
            <surname>Brandt</surname>
          </string-name>
          , Britta Wrede, Franz Kummert, and
          <string-name>
            <given-names>Lars</given-names>
            <surname>Schillingmann</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Confirmation detection in human-agent interaction using non-lexical speech cues</article-title>
          .
          <source>Presented at the AAAI Fall Symposium on Natural Communication for Human-Robot Collaboration</source>
          . (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Hendrik</given-names>
            <surname>Buschmeier</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Co-constructing grounded symbols: Feedback and incremental adaptation in human-agent dialogue</article-title>
          .
          <source>Künstliche Intelligenz</source>
          <volume>27</volume>
          (
          <year>2013</year>
          ),
          <fpage>137</fpage>
          -
          <lpage>143</lpage>
          . https://doi.org/10.1007/s13218-013-0241-8
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Hendrik</given-names>
            <surname>Buschmeier</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Communicative listener feedback in human-agent interaction: artificial speakers need to be attentive and adaptive</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems</source>
          . Stockholm, Sweden.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Herbert H.</given-names>
            <surname>Clark</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <source>Using Language</source>
          . Cambridge University Press, Cambridge, UK. https://doi.org/10.1017/CBO9780511620539
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Katharina</given-names>
            <surname>Cyra</surname>
          </string-name>
          , Antje Amrhein, and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Fallstudien zur Alltagsrelevanz von Zeit- und Kalenderkonzepten</article-title>
          . In Mensch und Computer 2016 Kurzbeiträge. Aachen, Germany,
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . https://doi.org/10.18420/muc2016-mci-0253
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Katharina</given-names>
            <surname>Cyra</surname>
          </string-name>
          and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Dealing with long utterances: How to interrupt the user in a socially acceptable manner?</article-title>
          .
          <source>In Proceedings of the 5th International Conference on Human Agent Interaction. Bielefeld, Germany</source>
          ,
          <fpage>341</fpage>
          -
          <lpage>345</lpage>
          . https://doi.org/10.1145/3125739.3132586
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Katharina</given-names>
            <surname>Cyra</surname>
          </string-name>
          and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Dealing with 'long turns' produced by users of an assistive system: How missing uptake and recipiency lead to turn increments</article-title>
          .
          <source>In Proceedings of the 26th IEEE International Symposium on Robot and Human Interactive Communication</source>
          . Lisbon, Portugal,
          <fpage>329</fpage>
          -
          <lpage>334</lpage>
          . https://doi.org/10.1109/ROMAN.2017.8172322
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Farina</given-names>
            <surname>Freigang</surname>
          </string-name>
          , Sören Klett, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Pragmatic multimodality: Effects of nonverbal cues of focus and certainty in a virtual human</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Intelligent Virtual Agents</source>
          . Stockholm, Sweden,
          <fpage>142</fpage>
          -
          <lpage>155</lpage>
          . https://doi.org/10.1007/978-3-319-67401-8_16
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Farina</given-names>
            <surname>Freigang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Analysing the modifying functions of gesture in multimodal utterances</article-title>
          .
          <source>In Proceedings of the 4th Conference on Gesture and Speech in Interaction (GESPIN)</source>
          . Nantes, France,
          <fpage>107</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Farina</given-names>
            <surname>Freigang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>This is what's important: Using speech and gesture to create focus in multimodal utterance</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Intelligent Virtual Agents</source>
          . Los Angeles, CA, USA,
          <fpage>96</fpage>
          -
          <lpage>109</lpage>
          . https://doi.org/10.1007/978-3-319-47665-0_9
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Marcel</given-names>
            <surname>Kramer</surname>
          </string-name>
          , Ramin Yaghoubzadeh, Stefan Kopp, and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>A conversational virtual human as autonomous assistant for elderly and cognitively impaired users? Social acceptability and design considerations</article-title>
          .
          <source>In Proceedings of INFORMATIK 2013</source>
          . Koblenz, Germany,
          <fpage>1105</fpage>
          -
          <lpage>1119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Gregor</given-names>
            <surname>Mehlmann</surname>
          </string-name>
          , Kathrin Janowski, and
          <string-name>
            <given-names>Elisabeth</given-names>
            <surname>André</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Modeling grounding for interactive social companions</article-title>
          .
          <source>Künstliche Intelligenz</source>
          <volume>30</volume>
          (
          <year>2016</year>
          ),
          <fpage>45</fpage>
          -
          <lpage>52</lpage>
          . https://doi.org/10.1007/s13218-015-0397-5
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Markus</given-names>
            <surname>Meis</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Nutzerzentrierte Entwicklung eines Erinnerungsassistenten</article-title>
          .
          <source>Presented at Abschlusssymposium Niedersächsischer Forschungsverbund Gestaltung altersgerechter Lebenswelten</source>
          .
          (
          <year>2013</year>
          ). https://www.altersgerechte-lebenswelten.de/
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Juliana</given-names>
            <surname>Miehle</surname>
          </string-name>
          , Ilker Bagci, Wolfgang Minker, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Ultes</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A social companion and conversation partner for elderly</article-title>
          .
          <source>In Proceedings of the 8th International Workshop On Spoken Dialogue Systems. Farmington</source>
          , PA, USA.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Christiane</given-names>
            <surname>Opfermann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Karola</given-names>
            <surname>Pitsch</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Reprompts as error handling strategy in human-agent-dialog? User responses to a system's display of nonunderstanding</article-title>
          .
          <source>In Proceedings of the 26th IEEE International Symposium on Robot and Human Interactive Communication</source>
          . Lisbon, Portugal,
          <fpage>310</fpage>
          -
          <lpage>316</lpage>
          . https://doi.org/10.1109/ROMAN.2017.8172319
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Christiane</given-names>
            <surname>Opfermann</surname>
          </string-name>
          , Karola Pitsch, Ramin Yaghoubzadeh, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>The communicative activity of 'making suggestions' as an interactional process: Towards a dialog model for HAI</article-title>
          .
          <source>In Proceedings of the 5th International Conference on Human Agent Interaction. Bielefeld, Germany</source>
          ,
          <fpage>161</fpage>
          -
          <lpage>170</lpage>
          . https://doi.org/10.1145/3125739.3125752
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Tim</given-names>
            <surname>Paek</surname>
          </string-name>
          and
          <string-name>
            <given-names>Eric</given-names>
            <surname>Horvitz</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Conversation as action under uncertainty</article-title>
          .
          <source>In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence</source>
          . Stanford, CA, USA,
          <fpage>455</fpage>
          -
          <lpage>464</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Lazlo</given-names>
            <surname>Ring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lin</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kathleen</given-names>
            <surname>Totzke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Bickmore</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Social support agents for older adults: longitudinal affective computing in the home</article-title>
          .
          <source>Journal on Multimodal User Interfaces</source>
          <volume>9</volume>
          (
          <year>2014</year>
          ),
          <fpage>79</fpage>
          -
          <lpage>88</lpage>
          . https://doi.org/10.1007/s12193-014-0157-0
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>David</given-names>
            <surname>Schlangen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Timo</given-names>
            <surname>Baumann</surname>
          </string-name>
          , Hendrik Buschmeier, Okko Buß, Stefan Kopp, Gabriel Skantze, and
          <string-name>
            <given-names>Ramin</given-names>
            <surname>Yaghoubzadeh</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Middleware for incremental processing in conversational agents</article-title>
          .
          <source>In Proceedings of the 11th Annual Meeting of the Special Interest Group in Discourse and Dialogue</source>
          . Tokyo, Japan,
          <fpage>51</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>David</given-names>
            <surname>Schlangen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gabriel</given-names>
            <surname>Skantze</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>A general, abstract model of incremental dialogue processing</article-title>
          .
          <source>Dialogue and Discourse</source>
          <volume>2</volume>
          (
          <year>2011</year>
          ),
          <fpage>83</fpage>
          -
          <lpage>111</lpage>
          . https://doi.org/10.5087/dad.2011.105
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Candace</given-names>
            <surname>Sidner</surname>
          </string-name>
          , Timothy Bickmore, Charles Rich, Barbara Barry, Lazlo Ring, Morteza Behrooz, and
          <string-name>
            <given-names>Mohammad</given-names>
            <surname>Shayganfar</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Demonstration of an always-on companion for isolated older adults</article-title>
          .
          <source>In Proceedings of the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue</source>
          . Metz, France,
          <fpage>148</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Gabriel</given-names>
            <surname>Skantze</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Error Handling in Spoken Dialogue Systems: Managing Uncertainty, Grounding and Miscommunication</article-title>
          .
          <source>Ph.D. Dissertation</source>
          . Computer Science and Communication, Department of Speech, Music and Hearing, KTH, Stockholm, Sweden.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Carolin</given-names>
            <surname>Straßmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nicole C.</given-names>
            <surname>Krämer</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A categorization of virtual agent appearances and a qualitative study on age-related user preferences</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Intelligent Virtual Agents</source>
          . Stockholm, Sweden,
          <fpage>413</fpage>
          -
          <lpage>422</lpage>
          . https://doi.org/10.1007/978-3-319-67401-8_51
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Carolin</given-names>
            <surname>Straßmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Astrid</given-names>
            <surname>Rosenthal von der Pütten</surname>
          </string-name>
          , Ramin Yaghoubzadeh, Rafael Kaminski, and
          <string-name>
            <given-names>Nicole</given-names>
            <surname>Krämer</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The effect of an intelligent virtual agent's nonverbal behavior with regard to dominance and cooperativity</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Intelligent Virtual Agents</source>
          . Los Angeles, CA, USA,
          <fpage>15</fpage>
          -
          <lpage>28</lpage>
          . https://doi.org/10.1007/978-3-319-47665-0_2
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>David R.</given-names>
            <surname>Traum</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>A Computational Theory of Grounding in Natural Language Conversation</article-title>
          .
          <source>Ph.D. Dissertation</source>
          . University of Rochester, Rochester, NY, USA.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Maria</given-names>
            <surname>Velana</surname>
          </string-name>
          , Sascha Gruss,
          <string-name>
            <given-names>Georg</given-names>
            <surname>Layher</surname>
          </string-name>
          , et al.
          <year>2017</year>
          .
          <article-title>The SenseEmotion database: A multimodal database for the development and systematic validation of an automatic pain- and emotion-recognition system</article-title>
          .
          <source>In Proceedings of the 4th IAPR TC 9 Workshop on Pattern Recognition of Social Signals in Human-Computer Interaction</source>
          . Cancun, Mexico,
          <fpage>127</fpage>
          -
          <lpage>139</lpage>
          . https://doi.org/10.1007/978-3-319-59259-6
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Visser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>David R.</given-names>
            <surname>Traum</surname>
          </string-name>
          , David DeVault, and Rieks op den Akker.
          <year>2014</year>
          .
          <article-title>A model for incremental grounding in spoken dialogue systems</article-title>
          .
          <source>Journal on Multimodal User Interfaces</source>
          <volume>8</volume>
          (
          <year>2014</year>
          ),
          <fpage>61</fpage>
          -
          <lpage>73</lpage>
          . https://doi.org/10.1007/s12193-013-0147-7
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Wall</surname>
          </string-name>
          , Lars Schillingmann, and
          <string-name>
            <given-names>Franz</given-names>
            <surname>Kummert</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Online nod detection in human-robot interaction</article-title>
          .
          <source>In Proceedings of the 26th IEEE International Symposium on Robot and Human Interactive Communication</source>
          . Lisbon, Portugal,
          <fpage>811</fpage>
          -
          <lpage>817</lpage>
          . https://doi.org/10.1109/ROMAN.2017.8172396
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Leo</given-names>
            <surname>Wanner</surname>
          </string-name>
          , Elisabeth André,
          <string-name>
            <given-names>Josep</given-names>
            <surname>Blat</surname>
          </string-name>
          , et al.
          <year>2017</year>
          .
          <article-title>KRISTINA: A knowledge-based virtual conversation agent</article-title>
          .
          <source>In Proceedings of the 15th International Conference on Practical Applications of Agents and Multi-Agent Systems</source>
          . Porto, Portugal,
          <fpage>284</fpage>
          -
          <lpage>295</lpage>
          . https://doi.org/10.1007/978-3-319-59930-4_23
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Ramin</given-names>
            <surname>Yaghoubzadeh</surname>
          </string-name>
          , Hendrik Buschmeier, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Socially cooperative behavior for artificial companions for elderly and cognitively impaired people</article-title>
          .
          <source>In Proceedings of the 1st International Symposium on Companion-Technology</source>
          . Ulm, Germany,
          <fpage>15</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Ramin</given-names>
            <surname>Yaghoubzadeh</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Towards graceful turn management in human-agent interaction for people with cognitive impairments</article-title>
          .
          <source>In Proceedings of the 7th Workshop on Speech and Language Processing for Assistive Technologies</source>
          . San Francisco, CA, USA,
          <fpage>26</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Ramin</given-names>
            <surname>Yaghoubzadeh</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Enabling robust and fluid spoken dialogue with cognitively impaired users</article-title>
          .
          <source>In Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue</source>
          . Saarbrücken, Germany,
          <fpage>273</fpage>
          -
          <lpage>283</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Ramin</given-names>
            <surname>Yaghoubzadeh</surname>
          </string-name>
          , Marcel Kramer, Karola Pitsch, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Virtual agents as daily assistants for elderly or cognitively impaired people</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Intelligent Virtual Agents</source>
          . Edinburgh, United Kingdom,
          <fpage>79</fpage>
          -
          <lpage>91</lpage>
          . https://doi.org/10.1007/978-3-642-40415-3_7
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Ramin</surname>
            <given-names>Yaghoubzadeh</given-names>
          </string-name>
          , Karola Pitsch, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Kopp</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Adaptive grounding and dialogue management for autonomous conversational assistants for elderly users</article-title>
          .
          <source>In Proceedings of the 15th International Conference on Intelligent Virtual Agents</source>
          . Delft, The Netherlands,
          <fpage>28</fpage>
          -
          <lpage>38</lpage>
          . https://doi.org/10.1007/978-3-319-21996-7_3
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Victoria</given-names>
            <surname>Young</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alex</given-names>
            <surname>Mihailidis</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Difficulties in automatic speech recognition of dysarthric speakers and implications for speech-based applications used by the elderly: A literature review</article-title>
          .
          <source>Assistive Technology</source>
          <volume>22</volume>
          (
          <year>2010</year>
          ),
          <fpage>99</fpage>
          -
          <lpage>112</lpage>
          . https://doi.org/10.1080/10400435.2010.483646
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>