<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ICMLA.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Empathic Response Generation in Chatbots</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Timo Spring</string-name>
          <email>timo.spring@students.unibe.ch</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacky Casas</string-name>
          <email>jacky.casas@hes-so.ch</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karl Daher</string-name>
          <email>karl.daher@hes-so.ch</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Mugellini</string-name>
          <email>elena.mugellini@hes-so.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Omar Abou Khaled</string-name>
          <email>omar.aboukhaled@hes-so.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HES-SO, University of Applied Sciences, Western Switzerland</institution>
          ,
          <addr-line>Fribourg</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>HES-SO, University of Applied Sciences, Western Switzerland</institution>
          ,
          <addr-line>Fribourg</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Bern</institution>
          ,
          <addr-line>Bern</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>00011</volume>
      <fpage>1120</fpage>
      <lpage>1125</lpage>
      <abstract>
        <p />
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Abstract</title>
      <p>Recent years show an increasing
popularity of chatbots, with latest efforts aiming
to make them more empathic and
humanlike, finding application for example in
customer service or in treating mental
illnesses. Such empathic chatbots can
understand the user’s emotional state and
respond to it on an appropriate emotional
level. This survey provides an overview
of existing approaches used for emotion
detection and empathic response
generation. These approaches raise at least
one of the following profound challenges:
the lack of quality training data,
balancing emotion and content level
information, considering the full end-to-end
experience and modelling emotions throughout
conversations. Furthermore, only a few
approaches actually cover response
generation. We state that these approaches are
not yet empathic in that they either
mirror the user’s emotional state or leave it up
to the user to decide the emotion category
of the response. Empathic response
generation should select appropriate emotional
responses more dynamically and express
them accordingly, for example using
emojis.</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
<p>Chatbots are everywhere, from booking a flight
online to checking the balance of a bank account.
Most of these interactions are still transactional
in nature, for example when ordering a pizza.
Furthermore, interactions with chatbots are usually
short and therefore do not resemble normal
human-like conversations. Hence, recent efforts aim
to create more personalised chatbots for deeper and
emotionally charged conversations. Beyond providing
or offering specific services, such chatbots can make
users feel better, rendering the overall interaction
more natural and human-like. Empathic chatbots find
application, for example, in customer service or in
the treatment of mental illnesses.</p>
<p>Chatbots for customer service are a growing
trend: Gartner predicts that by 2020 about 25% of
customer service requests will be handled by chatbots
(https://www.gartner.com/en/newsroom/press-releases/2018-02-19-gartner-says-25-percent-of-customer-service-operations-will-use-virtual-customer-assistants-by-2020).
Xu et al. (2017) have analysed one million service
requests made over Twitter. The authors note that
about 40% of the requests express emotions, attitudes
or opinions rather than seek specific information.
In addition, the average response time for customer
service requests is about 6.5 hours, yet 72% of users
who file a request expect a response within an hour.
Thus, empathic chatbots could improve customer
support by reducing response times, reacting to
specific user emotions and lowering overall costs.</p>
      <p>Empathic chatbots also show potential in
diagnosing and treating mental illnesses. According
to the Swiss Health Observatory
(https://www.obsan.admin.ch/de/publikationen/psychische-gesundheit),
one out of five Swiss suffers from at least a slight
depression. In the United States of America, nearly
one in five adults suffers from some form of mental
illness (https://www.nimh.nih.gov/health/statistics/mental-illness.shtml),
causing economic costs of around $210 billion
annually. A lack of mental health professionals and
psychiatrists makes it difficult to detect and treat
affected individuals. Empathic chatbots offer good
accessibility, scale to a vast public with a low
entrance barrier, and can help detect and treat
mental illness faster.</p>
      <p>
        There are already some noteworthy
advancements in the field of empathic chatbots that treat
or detect mental illnesses. Woebot
        <xref ref-type="bibr" rid="ref11">(Fitzpatrick et al.,
2017)</xref>
        , for example, is a chatbot from Stanford
University that uses methods from Cognitive
Behavioural Therapy (CBT) to provide step-by-step
guidance to users with anxiety or depression.
Another noteworthy chatbot is Replika
(https://replika.ai), a digital companion whose main
goal is to provide someone to talk to 24/7 and to
help users tackle certain resolutions, for example
being more social. The underlying code of Replika
is open source.
      </p>
      <p>One important aspect in designing empathic
chatbots is understanding what empathy actually
is. In this survey, we consider empathic chatbots to
use affective empathy as defined by Liu and
Sundar (2018): the chatbot detects and understands
the user's emotions and responds to them on an
appropriate emotional level. Liu and Sundar (2018)
observe in their study that the expression of either
sympathy or empathy from a health advice chatbot
is favoured over an unemotional response.</p>
      <p>
        With chatbots becoming more and more
humanlike, it gets difficult for people to distinguish
online conversations with bots from those with
humans. This has lately become problematic
(https://www.nytimes.com/2018/12/04/opinion/chatbots-ai-democracy-freespeech.html),
since bots are increasingly being misused for
political propaganda and the manipulation of people.
The Computational Propaganda Research Project
(COMPROP, https://comprop.oii.ox.ac.uk) at the
University of Oxford devotes its time to
investigating how chatbots and other algorithms are
used to manipulate the public and form opinions,
yielding multiple reports on the matter.
Woolley and Guilbeault (2017) analysed
the usage of chatbots during the 2016
presidential election in the United States using a
quantitative network analysis of over 17 million tweets.
The authors state that chatbots in fact showed a
measurable influence during the election, either
by manufacturing online popularity or by
democratizing propaganda. Thus, governments of
several countries have started to introduce
regulations to fight these kinds of online
manipulation
        <xref ref-type="bibr" rid="ref14">(Howard et al., 2018)</xref>
        . However, chatbots
oftentimes remain a widely-accepted tool for
propaganda
        <xref ref-type="bibr" rid="ref11 ref13 ref17 ref39 ref4 ref5">(Woolley and Guilbeault, 2017)</xref>
        .
      </p>
      <p>The rest of this survey is structured as follows.
Section 2 introduces the different stages in the
interaction with empathic chatbots. For each stage,
we present the most common and noteworthy
approaches. In Section 2.3, we focus on the state
of empathic response generation and outline
shortcomings. Finally, we conclude and discuss the
survey in Section 3.</p>
    </sec>
    <sec id="sec-3">
      <title>The Four Stages of Empathic Chatbots</title>
      <p>
        We partition the interaction with an empathic
chatbot in four stages — the emotion expression by the
user in text format, the emotion detection and
response generation by the chatbot and the response
or rather emotion expression from the chatbot back
to the user in text format. An overview of the
stages can be seen in Figure 1. Each stage
requires special attention to ensure a proper
end-to-end user experience. The following section
presents each stage and points out common
approaches, challenges and shortcomings.</p>
      <p>Emotions are a complex construct, and the
ability to detect emotions in text is heavily dependent
on how these emotions are expressed. Even for
humans, it can be tricky to guess the emotional
state of a text message. There are three major
challenges when it comes to emotions: they are
context-sensitive by nature, they can be
multi-layered within a sentence, and they can be implicit.
Hence, emotions are perceived differently based
on contextual and personal circumstances, such as
the culture, age, sex, education, previous
experiences and other individual parameters
        <xref ref-type="bibr" rid="ref24">(Ben-Zeev,
2000; Oatley et al., 2006)</xref>
        .
      </p>
      <p>In normal face-to-face conversations, emotions
are also expressed over the tonality of the speaker,
body language, gestures and facial expressions.
However, when focusing solely on the emotion
expression in text, much of the information carried
by these non-verbal cues is lost. This can lead to
misinterpretations of the other person's emotions
when communicating over text messages only.</p>
      <p>Words carrying a strong emotional charge, such
as kisses for love, tears for sadness, or wow for
surprise, can help interpret the emotional state.
Such word associations are also used in emotion
lexicons and word embeddings.</p>
      <p>The usage of emojis can help amplify or
transport emotional meaning in text-based
conversations, but can also pose additional
interpretation challenges, for example when multiple
contradicting emojis are used. We further discuss
emojis in the context of response expression in
Section 2.4.</p>
      <sec id="sec-3-1">
        <title>2.2 Emotion Detection</title>
        <p>In the emotion detection stage, we try to classify
and map an utterance to an emotional category.
It is important to note that emotion detection is
strongly tied to response generation, as similar
approaches are used for both stages.</p>
        <p>
          One of the first challenges is setting the
number of emotion categories to be used for
classification. There is no universally accepted model of
emotions and the number of emotions differs
drastically depending on the underlying model. One
of the most popular models in emotion detection
is Ekman’s six basic emotions — happiness,
sadness, fear, anger, disgust, and surprise
          <xref ref-type="bibr" rid="ref9">(Ekman,
1992)</xref>
          . Other popular models include Plutchik’s
wheel of emotions
          <xref ref-type="bibr" rid="ref26">(Plutchik, 1991)</xref>
          or Parrot’s
Emotional Layers
          <xref ref-type="bibr" rid="ref25">(Parrott, 2001)</xref>
          consisting of
thirty-one different emotions. However, the
latter two models are seldom used for emotion
detection, since more emotion categories mean
additional complexity for the classification task.
        </p>
        <p>Seyeditabari et al. (2018) review existing works
and approaches in the field of emotion detection
and provide a good overview of the state of the art
of emotion detection in text. They list different
resources used for detecting emotions in text such
as labelled text, emotion lexicons, or word
embeddings and elaborate on common approaches used
for emotion detection. Seyeditabari et al. (2018)
conclude that there is still potential for improving
emotion detection in text. Thereby, the complex
nature of emotion expression, the shortage of
quality data and inefficient models pose the main
challenges for future work.</p>
        <p>In this survey, we distinguish three major
approaches used for emotion detection in text
— rule-based, non-neural machine learning and
deep learning.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2.1 Rule-Based Approaches</title>
        <p>Rule-based approaches mainly use emotion
lexicons or word embeddings. Both approaches are
based on keyword lookup from text to detect the
underlying emotion. A rule-based approach is
only as good as its parsing algorithm and the
quality of the lexical resource used for the lookup.
Emotion lexicons list emotion-bearing words and
classify them to single or multiple emotional
categories. Word embeddings, on the other hand, also
take into account frequently co-occurring words
that are semantically similar.</p>
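        <p>As an illustration, a minimal keyword lookup against a hypothetical toy lexicon might look as follows; a real system would use a resource such as WordNet-Affect and a proper tokeniser:</p>

```python
from collections import Counter

# Hypothetical toy emotion lexicon mapping keywords to Ekman categories.
LEXICON = {
    "love": "happiness", "kisses": "happiness", "wow": "surprise",
    "tears": "sadness", "hate": "anger", "afraid": "fear",
}

def detect_emotion(utterance: str) -> str:
    """Count emotion-bearing keywords and return the majority category."""
    tokens = utterance.lower().split()
    hits = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    # Sentences without any emotional keyword cannot be classified.
    return hits.most_common(1)[0][0] if hits else "none"

print(detect_emotion("i hate mondays"))
print(detect_emotion("the invoice is attached"))
```

        <p>The second call illustrates the limitation discussed below: without an emotional keyword, the lookup yields no classification.</p>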
        <p>
          Emotion lexicons can be built from scratch.
However, there exist good off-the-shelf
solutions, one of the most popular being
WordNet-Affect
          <xref ref-type="bibr" rid="ref34">(Strapparava et al., 2004)</xref>
          . These off-the-shelf solutions differ
tremendously in their number of entries. WordNet-Affect
contains close to five thousand words, whereas
DepecheMood
          <xref ref-type="bibr" rid="ref20 ref8">(Liu and Zhang, 2012)</xref>
          , another
popular lexicon, contains more than thirty-five
thousand words. Nonetheless, the quality of the
lexicon is not solely dependent on its size. The
vocabulary used for the lexicon also impacts its
quality. Bandhakavi et al. (2017) argue that
general-purpose lexicons such as WordNet-Affect do
not perform as well as domain-specific emotion
lexicons. Therefore, a smaller domain-specific
lexicon might yield better results than a larger
general-purpose lexicon. LIWC-based lexicons
(Linguistic Inquiry and Word Count) are also
widely used, since these dictionaries list
grammatical, psychological, and content word categories,
and thus also emotion categories with thousands
of entries
          <xref ref-type="bibr" rid="ref20 ref8">(Chung and Pennebaker, 2012)</xref>
          .
        </p>
        <p>
          The idea behind word embeddings is similar to
emotion lexicons. Each word is represented as a
vector in a vector space, whereby frequently
co-occurring words are considered semantically
similar and therefore lie close together in the vector space
          <xref ref-type="bibr" rid="ref28">(Seyeditabari et al., 2018)</xref>
          . Among the most popular word
embedding methods is word2vec
          <xref ref-type="bibr" rid="ref23">(Mikolov et al.,
2013)</xref>
          . Word embeddings are also often used to
train machine learning models, like LSTM, which
usually take word vectors as inputs.
        </p>
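        <p>The notion of closeness in the vector space can be sketched with cosine similarity; the three-dimensional vectors below are invented for illustration, whereas real word2vec embeddings are learned from corpora and have hundreds of dimensions:</p>

```python
import math

# Invented toy embeddings; real vectors are learned from co-occurrence data.
EMB = {
    "happy":  [0.9, 0.1, 0.0],
    "joyful": [0.85, 0.2, 0.05],
    "sad":    [-0.8, 0.1, 0.1],
}

def cosine(u, v):
    """Cosine similarity: 1 for identical directions, -1 for opposite ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Semantically similar words lie closer together in the vector space.
assert cosine(EMB["happy"], EMB["joyful"]) > cosine(EMB["happy"], EMB["sad"])
```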
        <p>
          Both rule-based approaches are straightforward.
However, there are some drawbacks to them. The
emotional meaning of keywords can be ambiguous
and is context-sensitive. The sentences She hates
me, and I hate her, could both be classified as
anger based on the keyword hate. However, when
looking at the sentence level information, the first
utterance could also be perceived as sad.
Ignoring the syntactic structure and semantics of the
whole sentence can therefore lead to
misinterpretations. Furthermore, sentences without any
emotional keywords cannot be classified, even if they
contain an implicit expression of emotions,
for example in the form of a metaphor
          <xref ref-type="bibr" rid="ref18">(Kao et al.,
2009)</xref>
          . As a consequence, emotion lexicons in
particular often lack accuracy compared to more
complex approaches.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>2.2.2 Non-Neural Machine-Learning</title>
        <p>
Unlike rule-based approaches, non-neural
machine-learning approaches try to detect
emotions using trained classifiers, such as the
Support Vector Machine (SVM)
          <xref ref-type="bibr" rid="ref38">(Teng et al.,
2006)</xref>
          , Naive Bayes, or Decision Trees.
        </p>
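        <p>As a sketch of this learning-based idea, the following is a minimal multinomial Naive Bayes classifier trained on a handful of invented utterances; the training examples and labels are purely illustrative, and real systems would train on corpora such as ISEAR:</p>

```python
import math
from collections import Counter, defaultdict

# Invented training utterances; real systems use annotated corpora.
TRAIN = [
    ("i am so happy today", "joy"),
    ("what a wonderful surprise party", "joy"),
    ("i cried all night", "sadness"),
    ("i feel so alone and sad", "sadness"),
]

class NaiveBayes:
    def fit(self, data):
        self.word_counts = defaultdict(Counter)   # per-class word frequencies
        self.class_counts = Counter()             # class priors
        self.vocab = set()
        for text, label in data:
            self.class_counts[label] += 1
            for w in text.split():
                self.word_counts[label][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, text):
        def log_prob(label):
            prior = math.log(self.class_counts[label] / sum(self.class_counts.values()))
            total = sum(self.word_counts[label].values())
            # Laplace smoothing so unseen words do not zero out the probability.
            return prior + sum(
                math.log((self.word_counts[label][w] + 1) / (total + len(self.vocab)))
                for w in text.split()
            )
        return max(self.class_counts, key=log_prob)

clf = NaiveBayes().fit(TRAIN)
print(clf.predict("i am sad"))
```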
        <p>
          We distinguish between supervised and
unsupervised learning. Unsupervised approaches are
an evolution of rule-based approaches and
learn from text data that is not
annotated with emotion labels. Most commonly,
these approaches use movie dialogues
          <xref ref-type="bibr" rid="ref13 ref3">(Banchs,
2017; Honghao et al., 2017)</xref>
          or children’s fairy
tales
          <xref ref-type="bibr" rid="ref19">(Kim et al., 2010)</xref>
          to build emotional lexicons
and train their models.
        </p>
        <p>
          Supervised approaches, on the other hand, learn
from labelled data such as Twitter messages.
Common labels are annotations, hashtags or
emojis. There exist a few good sources for
emotionally labelled text, one of the most prominent
being the Swiss Center for Affective Sciences
(https://www.unige.ch/cisa/), which provides
datasets like the International Survey On
Emotion Antecedents And Reactions (ISEAR) and
other useful tools for emotion detection. Other
well-known datasets are EmotiNet
          <xref ref-type="bibr" rid="ref2">(Balahur et al.,
2011)</xref>
          and SemEval-2007
          <xref ref-type="bibr" rid="ref33">(Strapparava and
Mihalcea, 2007)</xref>
          .
        </p>
        <p>As stated by Seyeditabari et al. (2018), one of
the major challenges for supervised approaches
is the lack of quality training data. Oftentimes,
these datasets are unbalanced in terms of
emotion categories. Banchs (2017) analyses the large
movie dialogue dataset MovieDiC and concludes
that emotions such as love or joy occur much
more frequently than fear or surprise. Classifiers
trained on such datasets will therefore
underperform on these emotion categories.</p>
      </sec>
      <sec id="sec-3-4">
        <title>2.2.3 Deep Learning Approaches</title>
        <p>
The most recent effective advances in the field of
emotion detection have been made
using deep learning
          <xref ref-type="bibr" rid="ref29">(Xu et al., 2017)</xref>
          .
        </p>
        <p>
Oftentimes, deep learning approaches cover
both emotion detection and response
generation, for example when using an
encoder-decoder architecture
          <xref ref-type="bibr" rid="ref27">(Serban et al., 2015)</xref>
          . This
architecture consists of two stages — the
encoding and decoding stage. In the encoding stage, the
raw text input is turned into a feature
representation, usually in the form of a vector. The vector
is then used as an input for the decoding stage to
generate a response by applying the same
strategies as in the encoding stage, but in the opposite
direction.
        </p>
        <p>
A well-known approach applying the
encoding-decoding architecture is Long Short-Term Memory
(LSTM)
          <xref ref-type="bibr" rid="ref17">(Jithesh et al., 2017)</xref>
          . LSTM is a
Recurrent Neural Network (RNN) that can capture
long-term dependencies and store sequential
information over a longer time. It can retain or forget
the previous state and memorise extracted
information from the input data depending on its
importance
          <xref ref-type="bibr" rid="ref29 ref35">(Xu et al., 2017; Sun et al., 2019)</xref>
          .
        </p>
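        <p>The gating behaviour described above can be sketched as a single LSTM step in NumPy; the dimensions and random weights below are toy values for illustration, whereas a real model uses trained parameters:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step: gates decide what to forget, write and expose.

    W, U, b hold the stacked parameters for the input (i), forget (f),
    output (o) and candidate (g) gates.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # all four gate pre-activations at once
    i = sigmoid(z[0:n])                 # input gate: what to write
    f = sigmoid(z[n:2 * n])             # forget gate: what to keep from c_prev
    o = sigmoid(z[2 * n:3 * n])         # output gate: what to expose as h
    g = np.tanh(z[3 * n:4 * n])         # candidate cell content
    c = f * c_prev + i * g              # retain old state and add new info
    h = o * np.tanh(c)                  # hidden state passed to the next step
    return h, c

# Toy dimensions: 3-dimensional inputs, 2-dimensional hidden state.
rng = np.random.default_rng(0)
d_in, d_h = 3, 2
W = rng.normal(size=(4 * d_h, d_in))
U = rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)

h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):    # encode a 5-token "sentence"
    h, c = lstm_step(x, h, c, W, U, b)
print(h)  # the final hidden state summarises the whole sequence
```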
        <p>
          The commonly used Sequence to Sequence
(Seq2Seq) model also uses LSTMs and the
Encoding-Decoding architecture
          <xref ref-type="bibr" rid="ref36">(Sutskever et al.,
2014)</xref>
          . One LSTM serves as the encoder,
transforming the raw text input into a fixed-length
vector representation, whereas another LSTM
decodes it into a variable-length text output
          <xref ref-type="bibr" rid="ref1 ref21 ref28 ref29 ref31 ref35 ref6 ref7">(Cho et al., 2014; Xu et al., 2017; Chan and
Lui, 2018)</xref>
          .
        </p>
        <p>To improve the model's efficiency, Chan and
Lui (2018) investigate different approaches to
embedding emotional information in Seq2Seq
models. Different styles, positionings, and embeddings
of emotional information are tested. The authors
conclude that the positioning generally matters
and impacts emotion detection.</p>
      </sec>
      <sec id="sec-3-5">
        <title>2.3 Response Generation</title>
        <p>
          One of the most difficult tasks for empathic
chatbots is generating an empathic response. Firstly,
because it faces similar challenges as the emotion
detection stage. Secondly, because it not only has
to ensure that the response is appropriate in terms
of content level information, but also in terms of
emotion level information. This balancing act is
tremendously difficult, as one usually has to
sacrifice accuracy for one of the information levels,
when trying to optimise the other
          <xref ref-type="bibr" rid="ref29 ref30">(Xu et al., 2017;
Zhou et al., 2017)</xref>
          .
        </p>
        <p>In terms of empathic response generation, we
distinguish between two strategies —
retrievalbased approaches and dynamic generation.</p>
      </sec>
      <sec id="sec-3-6">
        <title>2.3.1 Retrieval-Based Approaches</title>
        <p>These approaches look up common responses to
the user’s utterance in conversation datasets.
However, this method is very limited in its
applicability. Similar inputs yield the same responses,
making the conversation repetitive and less
natural. Furthermore, huge datasets of emotional
conversations are required for such systems to
achieve acceptable results. As such datasets are
scarce, these approaches tend to yield
responses such as I don't know in cases where no
candidate response can be found.</p>
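        <p>A minimal retrieval-based responder can be sketched as follows, with a hypothetical two-entry response store, word-overlap matching, and the typical I don't know fallback; real systems index large conversation corpora:</p>

```python
# Hypothetical response store; real systems index large conversation corpora.
PAIRS = [
    ("i lost my luggage", "I am so sorry to hear that, let me help you."),
    ("my flight was great", "Wonderful, glad to hear you enjoyed it!"),
]

def retrieve(utterance: str, min_overlap: int = 1) -> str:
    """Return the stored reply whose prompt shares the most words."""
    words = set(utterance.lower().split())
    best, score = None, 0
    for prompt, reply in PAIRS:
        overlap = len(words & set(prompt.split()))
        if overlap > score:
            best, score = reply, overlap
    # Fallback when no candidate response can be found.
    return best if score >= min_overlap else "I don't know"

print(retrieve("the airline lost my luggage again"))
print(retrieve("what is the meaning of life"))
```

        <p>Note how a near-duplicate input always retrieves the same canned reply, which is exactly the repetitiveness criticised above.</p>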
        <p>
          A more advanced version of retrieval-based
systems uses word embeddings on the input text to
find the closest candidate responses, thus
yielding slightly more diverse responses
          <xref ref-type="bibr" rid="ref11 ref13 ref17 ref39 ref4 ref5">(Bartl and
Spanakis, 2017)</xref>
          . However, it still requires lots of
emotionally charged sample conversations.
        </p>
        <p>
          Empathic response generation in general
requires similar datasets as emotion detection, but
with a bigger focus on conversation and dialogue
turns. Movie dialogues are a good source for
emotionally charged conversations. However, they
oftentimes do not resemble daily conversation and
seem more artificial and theatrical
          <xref ref-type="bibr" rid="ref21 ref28 ref31 ref35 ref6">(Chan and Lui,
2018)</xref>
          . Furthermore, emotions in movie dialogues
are after all still acted and not naturally occurring,
which might also have an impact on the quality of
the training data.
        </p>
        <p>
          Other common datasets include chat
conversations, for example from Twitter service
requests
          <xref ref-type="bibr" rid="ref29">(Xu et al., 2017)</xref>
          that might yield more
natural conversations.
        </p>
      </sec>
      <sec id="sec-3-7">
        <title>2.3.2 Dynamic Generation</title>
        <p>These approaches are strongly tied to the
deep-learning approaches used for emotion
detection in Section 2.2.3 and are usually based
on the encoder-decoder architecture, such as the
Sequence-to-Sequence model.</p>
        <p>
The input sentence is encoded on a
word-by-word basis by embedding each word separately,
whilst taking into account already encoded words
via hidden states. Thereby, the hidden state after
the last word yields a vector representation of the
whole input sequence, encapsulating all relevant
sentence level information. Semantically similar
sentences are therefore close to each other in a
vector space
          <xref ref-type="bibr" rid="ref36">(Sutskever et al., 2014)</xref>
          .
        </p>
        <p>The decoder will then use the sentence vector
or rather sentence embedding to produce an
output sentence using inverted encoding mechanisms
on a word-by-word basis. This allows
encoding-decoding architectures to generate variable-length
responses. Thereby, the decoder considers already
decoded words to ensure that the generated response
is also grammatically correct. To find appropriate
output words given the encoded input, vocabularies
or word embeddings are used.</p>
        <p>
However, the longer the input sequence, the
more challenging it becomes to capture its full
meaning in a single sentence embedding. For an input sentence
of 30 words, the decoder would have to consider,
what was encoded 30 steps ago, just to decode the
first word. This long-range dependency problem
oftentimes results in poor responses, like I don’t
know
          <xref ref-type="bibr" rid="ref21 ref28 ref31 ref35 ref6">(Chan and Lui, 2018)</xref>
          .
        </p>
        <p>
          Attention mechanisms are commonly used
to tackle the issue of long-range
dependencies
          <xref ref-type="bibr" rid="ref36">(Sutskever et al., 2014)</xref>
          . Using attention, the
decoder has direct access to the hidden state of
each encoded word and can weight each word
accordingly. This allows the decoder to attend
to and weight relevant parts of an input sentence
when generating the output. This mechanism is
also applied in Neural Machine Translation
          <xref ref-type="bibr" rid="ref1">(Bahdanau et al., 2014)</xref>
          .
        </p>
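        <p>Scaled dot-product attention over encoder states can be sketched as follows; the vectors are hand-picked toy values so that the query clearly resembles one encoder state:</p>

```python
import numpy as np

def attention(query, keys, values):
    """Scaled dot-product attention over the encoder hidden states.

    The decoder query is scored against every encoder state (keys), and
    the values are averaged with the resulting softmax weights, giving
    the decoder direct access to every position of the input sentence.
    """
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ values, weights

# Toy example: 4 encoder states with 3-dimensional hidden vectors.
encoder_states = np.array([[1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0],
                           [0.0, 0.0, 1.0],
                           [1.0, 1.0, 0.0]])
query = np.array([0.0, 0.0, 10.0])  # strongly resembles encoder state 2

context, weights = attention(query, encoder_states, encoder_states)
print(weights.argmax())  # → 2: almost all weight goes to the matching state
```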
        <p>
          However, these Recurrent Neural Networks
(RNN) still suffer from the vanishing gradient
problem, that causes issues with long-range
dependencies
          <xref ref-type="bibr" rid="ref12">(Hochreiter, 1998)</xref>
          . LSTMs also apply
attention and in addition can retain and
forget information, therefore handling the long-range
dependency problem better than other approaches.
        </p>
        <p>
          All these mechanisms are essentially required to
ensure an appropriate response in terms of content
level information. When we also want to consider
emotion level information, we add additional
complexity to the model. Emotions either have to be
additionally encoded during the encoding stage or
fed directly to the decoding stage to generate
emotional responses
          <xref ref-type="bibr" rid="ref30">(Zhou et al., 2017)</xref>
          .
        </p>
        <p>The Emotional Chatting Machine as proposed
by Zhou et al. (2017) is a recent and noteworthy
approach for assessing the emotional state of
conversations and to generate appropriate emotional
responses. Therefore, it belongs to the dynamic
generation approaches. The ECM deep
learning algorithm is trained on 22,300 Chinese blog
posts that are manually annotated with Ekman's
six basic emotions.</p>
        <p>In terms of architecture, the ECM is based on
the Seq2Seq model with an encoding and
decoding phase. In addition, Zhou et al. (2017)
introduce an internal and external memory to the model
to capture changes in the emotion state throughout
the sentence and to map explicit emotion
expressions to emotion categories. Figure 2 provides a
good overview of the ECM architecture. As
input, the ECM requires the user’s text message and
one of the Ekman’s six emotion categories to
condition the response. Based on the input message,
the ECM will generate an appropriate response
and condition it using the input emotion
category. Zhou et al. (2017) benchmark the ECM with
other approaches, such as the traditional Seq2Seq
model or lexicon-based approaches and show that
ECM performs best across all emotion categories.
However, it lags slightly behind on the content
level, which the authors attribute to an imbalance in
the training set.</p>
        <p>One drawback of the ECM is that the input
emotion has to be set manually. This hardwiring of the
output emotion can be useful, if the chatbot should
always respond in the same emotional state, or
express certain personality traits such as being angry
all the time. However, if we want the chatbot to
dynamically react to the user’s emotions and make
the interaction natural, then we have to change the
emotion category based on the emotional state of
the user’s message automatically.</p>
      </sec>
      <sec id="sec-3-8">
        <title>2.3.3 Empathic Responses</title>
        <p>As discussed in Section 1, empathy requires
understanding of the user’s emotion and replying to
them on an appropriate emotional level. Using
emotion detection, we can achieve good results in
understanding the user’s emotions. The difficult
part is actually selecting the appropriate emotion
to condition the response with.</p>
        <p>
One approach could be to simply mirror
the user's emotion. However, in human
conversations, empathy finds expression when
one tries to feel with the other person, not
necessarily like the other person. One's own emotion
must not be confused with the other person's
emotion. When resonating with the other person's emotion,
one is still aware that it might differ from one's
personal emotion
          <xref ref-type="bibr" rid="ref1 ref32 ref36">(Singer and Klimecki, 2014)</xref>
          . If
someone is sad, you might understand this sadness
and try to cheer them up, instead of responding in
a sad way as well; it does not mean that you
necessarily feel sad too. Thus,
simply mirroring the user's emotion does not
necessarily yield empathic responses.
        </p>
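        <p>The point that the response emotion should not simply mirror the detected emotion can be illustrated with a deliberately simple hand-written selection policy; the mapping below is our own illustrative assumption, not an approach taken from the literature:</p>

```python
# Hand-crafted policy: the response emotion supports the user rather than
# mirroring them. The mapping itself is an illustrative assumption.
RESPONSE_EMOTION = {
    "sadness":   "comforting",     # cheer the user up instead of being sad too
    "anger":     "calm",
    "fear":      "reassuring",
    "happiness": "happiness",      # shared joy may reasonably be mirrored
    "surprise":  "curious",
    "disgust":   "understanding",
}

def select_response_emotion(detected: str) -> str:
    """Pick the emotion category used to condition the generated response."""
    return RESPONSE_EMOTION.get(detected, "neutral")

# Detecting sadness does not produce a sad response.
assert select_response_emotion("sadness") != "sadness"
print(select_response_emotion("sadness"))
```

        <p>A fixed table like this is of course far from a learned, context-dependent policy, but it already behaves more empathically than plain mirroring.</p>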
        <p>This is also an important aspect with regards to
a chatbot’s personality, since in these cases, one
should think about the chatbot’s own emotion, as
well as how it might resonate on someone else’s
emotions.</p>
        <p>Another crucial aspect is taking into account the
user’s emotional evolution throughout the whole
conversation. When just considering the latest
user utterance for emotion detection,
misinterpretations or frequent switches in the emotions
expressed by the chatbot’s response might occur. For
example, the user could genuinely be in a bad
mood, but laugh at a joke one just made. If we
would consider just the latest user utterance to
detect the user’s emotion and condition the
response accordingly, the chatbot’s expressed
emotion would switch from negative to positive within
a single sentence. Modelling the user’s emotion
over a longer period might also be important when
applying empathic chatbots in treating mental
illnesses, or when building personality profiles to
monitor the emotional state of the user.</p>
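        <p>One simple way to model the user's emotion over a whole conversation is an exponential moving average over per-utterance valence scores, so that a single laugh does not flip the overall mood; the scores below are invented for illustration:</p>

```python
def track_mood(valences, alpha=0.3):
    """Exponentially weighted moving average of per-utterance valence.

    valences: scores in [-1, 1] assumed to come from an emotion detector.
    A single positive utterance only nudges, not flips, the overall mood.
    """
    mood = 0.0
    history = []
    for v in valences:
        mood = alpha * v + (1 - alpha) * mood
        history.append(round(mood, 3))
    return history

# A genuinely bad mood with one laugh at a joke in between.
history = track_mood([-0.8, -0.7, -0.9, 0.9, -0.8])
print(history)
# The smoothed mood stays negative even after the positive utterance.
assert history[3] < 0
```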
        <p>We observe that only little research actually
focusses on the generation of empathic responses
in Computer Science, compared to the efforts
done for emotion detection. There exist some
approaches, such as the Encoder-Decoder
architecture, that cover emotion detection as well as
emotional response generation. Nonetheless, an
emotional response is not necessarily an empathic
response as elaborated before.</p>
        <p>
          How humans generate empathic responses is still an open field of research in neuroscience
          <xref ref-type="bibr" rid="ref21 ref28 ref31 ref35 ref6">(Shamay-Tsoory and Lamm, 2018)</xref>
          . As with emotions, there is no universally accepted model of empathic responses, except that empathy is heavily context-dependent
          <xref ref-type="bibr" rid="ref1 ref32 ref36">(Singer and Klimecki, 2014)</xref>
          .
        </p>
      </sec>
      <sec id="sec-3-9">
        <title>Response Expression</title>
        <p>
          In a face-to-face conversation, non-verbal cues such as facial expressions or gestures can help indicate a person’s emotional state. With chatbots, however, such information is missing and we have to rely solely on the user’s text to detect the emotional state. Similar constraints apply to the chatbot’s response expression: it is difficult to convey the intended emotion of the generated response back to the user in text form. Some approaches try to simulate non-verbal cues by displaying the chatbot as a 3D simulation of a person
          <xref ref-type="bibr" rid="ref37">(Tatai et al., 2003)</xref>
          . We note that, in general, chatbots do not express responses in any way other than text. Because such non-verbal cues are missing in traditional electronic messaging systems, people use emojis to supply them. Emoji usage has increased heavily over the past years: in 2017, Facebook revealed that on an average day over five billion emojis are sent over Messenger alone (https://www.adweek.com/digital/facebook-world-emoji-day-stats-theemoji-movie-stickers/). Hu et al. (2017) state that the main reason for using emojis in messages is to express emotions or to strengthen expressions.
        </p>
        <p>Emojis could therefore also be considered when detecting the user’s emotion. However, two challenges arise when using emojis as emotional labels. First, the emoji label can contradict the emotional state perceived from the text, for example in a sarcastic utterance; Figure 3 shows how emojis can lead to such contradicting interpretations. Second, emojis are prone to cultural differences, as stated by Ljubešić and Fišer (2016). Chatbots with a global scope should therefore take into account that emojis might be used and perceived differently depending on the country.</p>
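The first challenge could be detected heuristically by comparing the polarity suggested by the text with the polarity suggested by the emojis. The following sketch uses tiny made-up polarity tables (`TEXT_POLARITY` and `EMOJI_POLARITY` are illustrative stand-ins for real lexicons) and flags a sign disagreement as a possible sarcasm cue:

```python
# Sketch: flag utterances where emoji polarity contradicts text polarity,
# a possible sarcasm cue. Both polarity tables are tiny illustrative
# stand-ins, not real sentiment lexicons.
import re

TEXT_POLARITY = {"great": 1, "love": 1, "terrible": -1, "stuck": -1, "traffic": -1}
EMOJI_POLARITY = {"😊": 1, "😂": 1, "😠": -1, "😢": -1}

def polarity_conflict(utterance):
    """Return True when the text and the emojis point in opposite directions."""
    words = re.findall(r"[a-z']+", utterance.lower())
    text_score = sum(TEXT_POLARITY.get(w, 0) for w in words)
    emoji_score = sum(EMOJI_POLARITY.get(ch, 0) for ch in utterance)
    return text_score * emoji_score < 0

assert polarity_conflict("Great, stuck in traffic again 😂")
assert not polarity_conflict("I love this 😊")
```

A flagged conflict would not resolve the sarcasm by itself, but it tells the pipeline that the emoji should not be trusted blindly as an emotion label.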
        <p>
          DeepMoji
          <xref ref-type="bibr" rid="ref10">(Felbo et al., 2017)</xref>
          is an impressive tool that translates text into a set of emojis expressing a similar emotional state, returning the five most likely emojis together with their probabilities. It is trained on 1.2 billion tweets containing emojis and uses an LSTM to predict the most appropriate emojis. The generated emojis could be mapped to different emotion categories and used to express emotions in the response to the user. We leave the validation of this method for future work.
        </p>
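Such a mapping could, for instance, aggregate the probability mass of the predicted emojis per emotion category. In the sketch below, both the emoji-to-emotion table and the example top-5 prediction are illustrative assumptions, not DeepMoji's actual output or a validated mapping:

```python
# Sketch: collapse a DeepMoji-style top-5 emoji prediction into emotion
# categories by summing probability mass per category. The mapping table
# and the example prediction are illustrative assumptions.

EMOJI_TO_EMOTION = {
    "😂": "joy", "😊": "joy", "😢": "sadness", "💔": "sadness", "😠": "anger",
}

def emotion_from_emojis(predictions):
    """predictions: list of (emoji, probability) pairs.
    Returns the emotion category with the largest total mass, or None."""
    mass = {}
    for emoji, prob in predictions:
        emotion = EMOJI_TO_EMOTION.get(emoji)
        if emotion:
            mass[emotion] = mass.get(emotion, 0.0) + prob
    return max(mass, key=mass.get) if mass else None

top5 = [("😢", 0.35), ("💔", 0.25), ("😂", 0.20), ("😊", 0.10), ("😠", 0.10)]
assert emotion_from_emojis(top5) == "sadness"  # 0.60 vs. 0.30 joy, 0.10 anger
```

The resulting category could then condition the response, or the top emojis could be appended to the response text directly to express the emotion non-verbally.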
        <p>We state that the usage of emojis in conversational agents might help make them more human-like and help express non-verbal cues that would otherwise go missing. Future work should therefore focus on validating this hypothesis.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and Conclusion</title>
      <p>We note that current state-of-the-art approaches
face the following major challenges:</p>
      <p>1. Shortage of quality training data — Machine-learning algorithms for emotion detection and empathic response generation require an extensive amount of annotated training data. Existing datasets are scarce and oftentimes unbalanced across emotions; chatbots trained on such datasets will therefore perform poorly on the underrepresented emotions. Using annotated data from social media has proven to yield good results, but it also suffers from an unbalanced emotion distribution. To generate human-like responses, natural conversations should be used for training, as opposed to artificial and theatrical movie dialogues.</p>
      <p>2. Emotion level and content level — Generating responses that are grammatically correct and reflect the appropriate content is already complex; using domain-specific training data can help improve accuracy. If the response should also reflect the emotional level and detect possibly implicit or multi-layered emotions, the complexity increases even further. Improving one of the two levels — emotion or content — without sacrificing accuracy on the other is very challenging. For response generation, we note that existing approaches mainly focus on content-level information and consider emotions only as additional information during encoding.</p>
      <p>3. Considering the full end-to-end experience — In order to achieve good results, one has to consider the impact of all four stages: emotion expression by the user, emotion detection by the chatbot, response generation by the chatbot, and appropriate response expression back to the user. Only by considering the full end-to-end experience can chatbots be made more human-like and empathic.</p>
      <p>Future work should investigate the integration
of emojis into the full end-to-end experience —
from emotion detection to response expression.</p>
      <p>4. Modelling emotions throughout conversations — We state that when selecting emotions to condition the response, one should not consider only the emotion detected from the latest user message. Taking into account the evolution of the user’s emotion throughout the whole conversation, and possibly even over several previous conversations, prevents frequent changes in the chatbot’s expressed emotions and helps model the user’s long-term emotional state.</p>
      <p>Furthermore, more efforts should be devoted to
understanding empathy and how chatbots can
generate empathic responses instead of just emotional
responses.</p>
      <p>In this survey, we have presented the state-of-the-art of empathy and especially of empathic response generation in chatbots and pointed out several noteworthy approaches. We identified the four stages of the interaction with a chatbot and underlined the importance of taking all stages into account when creating empathic chatbots.</p>
      <p>We note that there exist many different approaches to tackling the problem of emotion detection, but only few for empathic response generation. Overall, deep-learning algorithms, such as the Emotional Chatting Machine (ECM), tend to yield the best results. Even so, there is still potential for improvement, as the ECM only generates emotional responses, not empathic ones.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. https://arxiv.org/abs/1409.0473.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>Alexandra Balahur, Jesús M. Hermida, Andrés Montoyo, and Rafael Muñoz. 2011. EmotiNet: A knowledge base for emotion detection in text built on the appraisal theories. In Natural Language Processing and Information Systems, Springer Berlin Heidelberg, pages 27-39. https://doi.org/10.1007/978-3-642-22327-3_4.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>Rafael E. Banchs. 2017. On the construction of more human-like chatbots: Affect and emotion analysis of movie dialogue data. In 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE. https://doi.org/10.1109/apsipa.2017.8282245.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>Anil Bandhakavi, Nirmalie Wiratunga, Stewart Massie, and Deepak Padmanabhan. 2017. Lexicon generation for emotion detection from text. IEEE Intelligent Systems 32(1):102-108. https://doi.org/10.1109/mis.2017.22.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>A. Bartl and G. Spanakis. 2017. A retrieval-based dialogue system utilizing utterance and context embeddings. In 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA). Aaron Ben-Zeev. 2000. The Subtlety of Emotions. The MIT Press. https://doi.org/10.7551/mitpress/6548.001.0001.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>Yin Hei Chan and Andrew Kwok Fai Lui. 2018. Encoding emotional information for sequence-to-sequence response generation. In 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD). IEEE. https://doi.org/10.1109/icaibd.2018.8396177.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics. https://doi.org/10.3115/v1/d14-1179.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>Cindy K. Chung and James W. Pennebaker. 2012. Linguistic inquiry and word count (LIWC). In Applied Natural Language Processing, IGI Global, pages 206-229. https://doi.org/10.4018/978-1-60960-741-8.ch012.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>Paul Ekman. 1992. An argument for basic emotions. Cognition and Emotion 6(3-4):169-200. https://doi.org/10.1080/02699939208411068.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>Bjarke Felbo, Alan Mislove, Anders Søgaard, Iyad Rahwan, and Sune Lehmann. 2017. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. https://doi.org/10.18653/v1/d17-1169.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): A randomized controlled trial. JMIR Mental Health 4(2):e19. https://doi.org/10.2196/mental.7785.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>Sepp Hochreiter. 1998. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6(02):107-116. https://doi.org/10.1142/S0218488598000094.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>Honghao Wei, Yiwei Zhao, and Junjie Ke. 2017. Building chatbot with emotions. http://web.stanford.edu/class/cs224s/reports.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>Philip N. Howard, Bence Kollanyi, Samantha Bradshaw, and Lisa-Maria Neudert. 2018. Social media, news and political information during the US election: Was polarizing content concentrated in swing states? CoRR abs/1802.03573.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>http://arxiv.org/abs/1802.03573.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>Tianran Hu, Han Guo, Hao Sun, Thuy-vy Thi Nguyen, and Jiebo Luo. 2017. Spice up your chat: The intentions and sentiment effects of using emoji. arXiv preprint arXiv:1703.02860. https://arxiv.org/abs/1703.02860.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>V Jithesh, M Justin Sagayaraj, and K G Srinivasa. 2017. LSTM recurrent neural networks for high resolution range profile based radar target classification. In 2017 3rd International Conference on Computational Intelligence &amp; Communication Technology (CICT). IEEE. https://doi.org/10.1109/ciact.2017.7977298.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>Edward Chao-Chun Kao, Chun-Chieh Liu, Ting-Hao Yang, Chang-Tai Hsieh, and Von-Wun Soo. 2009. Towards text-based emotion detection: a survey and possible improvements. In 2009 International Conference on Information Management and Engineering. IEEE, pages 70-74. https://doi.org/10.1109/icime.2009.113.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>Sunghwan Mac Kim, Alessandro Valitutti, and Rafael A. Calvo. 2010. Evaluation of unsupervised emotion models to textual affect recognition. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text (CAAGET '10). Association for Computational Linguistics, Stroudsburg, PA, USA, pages 62-70. http://dl.acm.org/citation.cfm?id=1860631.1860639.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>Bing Liu and Lei Zhang. 2012. A survey of opinion mining and sentiment analysis. In Mining Text Data, Springer US, pages 415-463. https://doi.org/10.1007/978-1-4614-3223-4_13.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>Bingjie Liu and S. Shyam Sundar. 2018. Should machines express sympathy and empathy? Experiments with a health advice chatbot. Cyberpsychology, Behavior, and Social Networking 21(10):625-636. https://doi.org/10.1089/cyber.2018.0110.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>Nikola Ljubešić and Darja Fišer. 2016. A global analysis of emoji usage. In Proceedings of the 10th Web as Corpus Workshop. Association for Computational Linguistics. https://doi.org/10.18653/v1/w16-2610.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. https://arxiv.org/abs/1301.3781.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Keith</given-names>
            <surname>Oatley</surname>
          </string-name>
          , Dacher Keltner, and
          <string-name>
            <surname>Jennifer M Jenkins</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Understanding emotions</article-title>
          .
          <source>Blackwell publishing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>W Gerrod</given-names>
            <surname>Parrott</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Emotions in social psychology: Essential readings</article-title>
          . Psychology Press.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Robert</given-names>
            <surname>Plutchik</surname>
          </string-name>
          .
          <year>1991</year>
          .
          <article-title>The emotions</article-title>
          . University Press of America.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>Iulian Vlad Serban, Alessandro Sordoni, Yoshua Bengio, Aaron C. Courville, and Joelle Pineau. 2015. Hierarchical neural network generative models for movie dialogues. CoRR abs/1507.04808. http://arxiv.org/abs/1507.04808.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>Armin Seyeditabari, Narges Tabari, and Wlodek Zadrozny. 2018. Emotion detection in text: a review. arXiv preprint arXiv:1806.00674. https://arxiv.org/abs/1806.00674.</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Anbang</given-names>
            <surname>Xu</surname>
          </string-name>
          , Zhe Liu, Yufan Guo, Vibha Sinha, and
          <string-name>
            <given-names>Rama</given-names>
            <surname>Akkiraju</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A new chatbot for customer service on social media</article-title>
          .
          <source>In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI 17</source>
          . ACM Press. https://doi.org/10.1145/3025453.3025496.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, and Bing Liu. 2017. Emotional chatting machine: Emotional conversation generation with internal and external memory. arXiv preprint arXiv:1704.01074. https://arxiv.org/abs/1704.01074.</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>Simone Shamay-Tsoory and Claus Lamm. 2018. The neuroscience of empathy - from past to present and future. Neuropsychologia 116:1-4. Special Issue: The Neuroscience of Empathy. https://doi.org/10.1016/j.neuropsychologia.2018.04.034.</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>Tania Singer and Olga M. Klimecki. 2014. Empathy and compassion. Current Biology 24(18):R875-R878. https://doi.org/10.1016/j.cub.2014.06.054.</mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>Carlo Strapparava and Rada Mihalcea. 2007. SemEval-2007 task 14. In Proceedings of the 4th International Workshop on Semantic Evaluations - SemEval '07. Association for Computational Linguistics. https://doi.org/10.3115/1621474.1621487.</mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <given-names>Carlo</given-names>
            <surname>Strapparava</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Valitutti</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>WordNet-Affect: an affective extension of WordNet</article-title>
          .
          <source>In Proceedings of LREC</source>
          , volume
          <volume>4</volume>
          , pages
          <fpage>1083</fpage>
          -
          <lpage>1086</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <given-names>Xiao</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chen</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lian</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Dynamic emotion modelling and anomaly detection in conversation based on emotional transition tensor</article-title>
          .
          <source>Information Fusion</source>
          <volume>46</volume>
          :
          <fpage>11</fpage>
          -
          <lpage>22</lpage>
          . https://doi.org/10.1016/j.inffus.2018.04.001.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Oriol</given-names>
            <surname>Vinyals</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Quoc V.</given-names>
            <surname>Le</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Sequence to sequence learning with neural networks</article-title>
          .
          <source>CoRR abs/1409.3215</source>
          . http://arxiv.org/abs/1409.3215.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <given-names>Gábor</given-names>
            <surname>Tatai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Annamária</given-names>
            <surname>Csordás</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Árpád</given-names>
            <surname>Kiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Attila</given-names>
            <surname>Szaló</surname>
          </string-name>
          , and
          <string-name>
            <given-names>László</given-names>
            <surname>Laufer</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Happy chatbot, happy user</article-title>
          .
          <source>In Intelligent Virtual Agents</source>
          , Springer Berlin Heidelberg, pages
          <fpage>5</fpage>
          -
          <lpage>12</lpage>
          . https://doi.org/10.1007/978-3-540-39396-2_2.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <string-name>
            <given-names>Zhi</given-names>
            <surname>Teng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Fuji</given-names>
            <surname>Ren</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Shingo</given-names>
            <surname>Kuroiwa</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Retracted: Recognition of emotion with SVMs</article-title>
          .
          <source>In Lecture Notes in Computer Science</source>
          , Springer Berlin Heidelberg, pages
          <fpage>701</fpage>
          -
          <lpage>710</lpage>
          . https://doi.org/10.1007/978-3-540-37275-2_87.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <string-name>
            <given-names>Samuel C.</given-names>
            <surname>Woolley</surname>
          </string-name>
          and
          <string-name>
            <given-names>Douglas R.</given-names>
            <surname>Guilbeault</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Computational propaganda in the United States of America: Manufacturing consensus online</article-title>
          .
          <source>Computational Propaganda Research Project</source>
          , page 22. http://blogs.oii.ox.ac.uk/politicalbots/wpcontent/uploads/sites/89/2017/06/CompropUSA.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>