<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Happy Together: Learning and Understanding Appraisal From Natural Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Arun Rajendran</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chiyu Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Abdul-Mageed</string-name>
          <email>muhammad.mageed@ubc.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Natural Language Processing Lab The University of British Columbia</institution>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper, we explore various approaches for learning two types of appraisal components from happy language. We focus on `agency' of the author and the `sociality' involved in happy moments based on the HappyDB dataset. We develop models based on deep neural networks for the task, including uni- and bi-directional long short-term memory networks, with and without attention. We also experiment with a number of novel embedding methods, such as embedding from neural machine translation (as in CoVe) and embedding from language models (as in ELMo). We compare our results to those acquired by several traditional machine learning methods. Our best models achieve 87.97% accuracy on agency and 93.13% accuracy on sociality, both of which are significantly higher than our baselines.</p>
      </abstract>
      <kwd-group>
        <kwd>Emotion</kwd>
        <kwd>emotion detection</kwd>
        <kwd>sentiment analysis</kwd>
        <kwd>language models</kwd>
<kwd>text classification</kwd>
        <kwd>agency</kwd>
        <kwd>sociality</kwd>
        <kwd>appraisal theory</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        Emotion is an essential part of human experience that affects both individual and
group decision making. For this reason, it is desirable to understand the language
of emotion and to develop tools that aid such an understanding. Although there has
recently been work on detecting human emotion in text data [
        <xref ref-type="bibr" rid="ref1 ref2 ref20">20, 1,
2</xref>
        ], we still lack a deeper understanding of the various components related to emotion.
Available emotion detection tools have so far been based on theories of basic
emotion such as the work of Paul Ekman and colleagues (e.g., [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) and extensions
of these (e.g., Robert Plutchik's models [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). Emotion theory, however, has
more to offer than mere categorization of human experience based on valence
(e.g., anger, joy, sadness). As such, computational treatments of emotion have yet
to benefit from existing (e.g., psychological) theories by building models that
capture the nuances these theories offer. Our work focuses on the cognitive appraisal
theory [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], in which Roseman posits the existence of five appraisal components,
including that of `agency'. Agency refers to whether a stimulus is caused by the
individual (self-caused), by another individual (other-caused), or merely by the
situation (circumstance-caused). Identifying the exact type of agency related
to an emotion is useful in that it helps determine the target of the emotion (i.e.,
another person or some other type of entity). We focus on agency since it was
recently labeled as an extension of the HappyDB dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] as part of the
CL-Aff shared task [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] (https://sites.google.com/view/affcon2019/cl-aff-shared-task).
      </p>
      <p>The CL-Aff shared task distribution of HappyDB also includes labels for the
concept of `sociality'. Sociality refers to whether or not people other than the
author are involved in the emotion situation. Identifying the type of sociality
associated with an emotion further enriches our knowledge of the emotion
experience. For example, an emotion experience with a sociality value of `yes' (i.e.,
other people are involved) could teach us about social groups (e.g., families)
and the range of emotions expressed during specific types of situations (e.g.,
weddings, deaths). Overall, agency and sociality are two concepts that we believe
to be useful. Predictions of these concepts can be added to a computational
toolkit that can be run on large datasets to derive useful insights. To the best
of our knowledge, no previous work has investigated learning these two concepts from
language data. In this paper, we thus aim to pioneer this learning task by
developing novel deep learning models for predicting agency and sociality.</p>
      <p>Moreover, we train attention-based models that are able to assign weights
to the features contributing to a given task. In other words, we are able to identify
the words most relevant to each of the two concepts of agency and sociality.
This not only enriches our knowledge about the distribution of these language
items over each of these concepts, but also provides us with intuition about
what our models learn (i.e., model interpretability). Interpretability is becoming
increasingly important, especially for deep learning models, since many of these
models are currently deployed in various real-life domains. Being able to identify
why a model is making a certain decision helps us explain model decisions to
end users, including by showing them examples of attention-based outputs.</p>
      <p>
        In modeling agency and sociality, we experiment with various machine
learning methods, both traditional and deep learning-based. In this way, we are able
to establish strong baselines for this task as well as report competitive models.
Our deep learning models are based on recurrent neural networks [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. We
also exploit frameworks with novel embedding methods, including embeddings
from neural machine translation as in CoVe [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and embeddings from language
models as in ELMo [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Additionally, we investigate the utility of fine-tuning
our models using the recently proposed ULMFiT model [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Overall, we offer the following contributions: (1) we develop successful models
for identifying the novel concepts of agency and sociality in happy language, and
(2) we probe our models to offer meaningful interpretations (in the form of
visualizations) of the contribution of different words to the learning tasks, thereby
supporting model interpretability. The rest of the paper is organized as follows:
Section 2 describes our dataset and data splits. In Section 3, we describe our
methods, and in Section 4 we provide our results. We offer attention-based model
visualizations in Section 5, and we conclude in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <sec id="sec-2-1">
        <title>Dataset</title>
        <p>
          HappyDB [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] is a dataset of about 100,000 `happy moments' crowd-sourced via
Amazon's Mechanical Turk, where each worker was asked to describe, in a complete
sentence, "what made them happy in the past 24 hours". Each worker was asked to
describe three such moments. In particular, we exploit the agency and sociality
annotations provided on the dataset as part of the recent CL-Aff shared task
(https://sites.google.com/view/affcon2019/cl-aff-shared-task?authuser=0), associated
with the AAAI-19 workshop on affective content analysis
(https://sites.google.com/view/affcon2019/).
        </p>
        <p>For this particular shared task, 10,560 moments labeled for agency and
sociality were available as labeled training data. (There were also 72,326 moments
available as unlabeled training data, but we did not use these in our work.) In
addition, 17,215 moments were used as test data. Test labels were not released, and
teams were expected to submit the predictions of their systems on the test split. For
our models, we split the labeled data into an 80% training set (8,448 moments)
and a 20% development set (2,112 moments). We train our models on train and
tune parameters on dev. For our system runs, we submit labels from the
models trained only on the 8,448 training data points. The distribution of the labeled
data is as follows: agency (`yes'=7,796; `no'=2,764), sociality (`yes'=5,625; `no'=
4,935).</p>
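        <p>For concreteness, the following is a minimal sketch of one way to produce such an 80/20 split with scikit-learn (our illustration, not part of the shared task distribution); the file name and column names (happy_moments.csv, agency) are assumptions, and stratifying on the label is our choice here.</p>
        <preformat># Sketch of the 80/20 train/dev split described above. Assumes a CSV
# with the moment text and an `agency' label column (hypothetical names).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("happy_moments.csv")  # hypothetical file name

# Stratify on the label so train and dev keep the same class balance.
train_df, dev_df = train_test_split(
    df, test_size=0.20, stratify=df["agency"], random_state=42
)
print(len(train_df), len(dev_df))  # roughly 8,448 and 2,112 moments</preformat>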
      </sec>
      <sec id="sec-2-2">
        <title>Methods</title>
        <sec id="sec-2-2-1">
          <title>Traditional Machine Learning Models</title>
          <p>We develop multiple basic machine learning models, including Naive Bayes,
a linear Support Vector Machine (LinSVM), and Logistic Regression (LogReg). For
each model, we have two settings: (a) we use a bag-of-words (BOW) approach
(with n-gram values from 1 to 4), and (b) we combine the BOW with a TF-IDF
transformation. These are strong baselines, due to our use of higher-order
n-grams (with n up to 4). We use the default parameters of Scikit-learn
(https://scikit-learn.org/) to train all the classical machine learning models. We also
use an ensemble method that takes the prediction labels from each of the
classifiers and takes the majority vote among the different model predictions to decide
the final prediction. We report results in terms of binary classification
accuracy.</p>
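          <p>As a rough illustration of this baseline setup (a sketch under our own assumptions, not the exact code used), the snippet below builds BOW and BOW+TF-IDF pipelines in scikit-learn with default parameters and combines three classifiers by hard majority voting; train_texts, train_labels, dev_texts, and dev_labels are assumed to come from the split in Section 2.</p>
          <preformat># Sketch of the baseline classifiers and the majority-vote ensemble.
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Setting (a): bag of words with 1- to 4-grams; setting (b) adds TF-IDF.
models = [
    ("nb", make_pipeline(CountVectorizer(ngram_range=(1, 4)),
                         MultinomialNB())),
    ("svm", make_pipeline(CountVectorizer(ngram_range=(1, 4)),
                          TfidfTransformer(), LinearSVC())),
    ("logreg", make_pipeline(CountVectorizer(ngram_range=(1, 4)),
                             TfidfTransformer(), LogisticRegression())),
]

# Hard (majority) voting over the individual classifiers' predicted labels.
ensemble = VotingClassifier(estimators=models, voting="hard")
ensemble.fit(train_texts, train_labels)
print(accuracy_score(dev_labels, ensemble.predict(dev_texts)))</preformat>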
        </sec>
        <sec id="sec-2-2-2">
          <title>Deep Learning</title>
          <p>
            We apply various models based on deep neural networks. All our deep learning
models are variations of recurrent neural networks (RNNs), which have
achieved remarkable performance on text classification tasks such as sentiment
analysis and emotion detection [
            <xref ref-type="bibr" rid="ref1 ref11 ref16 ref18 ref19 ref21">19, 16, 11, 1, 21, 18</xref>
            ]. RNNs and their variants are
able to capture sequential dependencies, especially in time-series data. One
weakness of basic RNNs, however, is that gradients either vanish or explode
as time gaps become larger. Long short-term memory (LSTM) networks [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]
were developed to address this limitation. We also use a bidirectional LSTM
(BiLSTM). A BiLSTM extends the unidirectional LSTM network by offering a
second layer in which the hidden-to-hidden states flow in the opposite chronological
order [
            <xref ref-type="bibr" rid="ref22">22</xref>
            ]. Overall, our systems can be categorized as follows: (1) systems tuning
simple pre-trained embeddings; (2) systems tuning embeddings from neural
machine translation (NMT); (3) systems tuning embeddings from language models
(LM); and (4) systems directly fine-tuning language models (ULMFiT).
          </p>
          <p>Exploiting Simple GloVe Embeddings. For the embedding layer, we obtain
300-dimensional embedding vectors for tokens using GloVe's Common Crawl
pre-trained model [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]. GloVe embeddings are global vectors for word
representation based on the frequencies of pairs of co-occurring words. In this setting, we fix
the embedding layer in our deep learning models to the pre-trained GloVe
embeddings. We apply four architectures (i.e., LSTM, LSTM with attention (LSTM-A),
BiLSTM, and BiLSTM with attention (BiLSTM-A)) to learn to classify agency and
sociality, respectively. For each model, we optimize the number of layers and the
number of hidden units within each layer to obtain the best performance. We experiment
with layers from the set {1, 2} and hidden units from the set {128, 256, 512}.
Each setting was run with batch size 64 and dropout 0.75 for 20 epochs.</p>
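          <p>A minimal PyTorch sketch of this LSTM-with-attention family follows (our illustration of the general architecture, not released code); glove_weights is assumed to be a precomputed vocabulary-by-300 tensor of GloVe vectors.</p>
          <preformat># Sketch: LSTM classifier with additive self-attention over a frozen
# GloVe embedding layer. `glove_weights' (vocab_size x 300) is assumed.
import torch
import torch.nn as nn

class AttnLSTMClassifier(nn.Module):
    def __init__(self, glove_weights, hidden=256, layers=1, dropout=0.75):
        super().__init__()
        self.embed = nn.Embedding.from_pretrained(glove_weights, freeze=True)
        self.lstm = nn.LSTM(300, hidden, num_layers=layers, batch_first=True)
        self.attn = nn.Linear(hidden, 1)  # one attention score per time step
        self.drop = nn.Dropout(dropout)
        self.out = nn.Linear(hidden, 2)   # binary label: agency or sociality

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))       # (B, T, hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # (B, T, 1), over T
        context = (weights * h).sum(dim=1)            # weighted sum of states
        return self.out(self.drop(context)), weights.squeeze(-1)

# The returned per-word weights are what we visualize in Section 5.</preformat>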
          <p>Embeddings from NMT. McCann et al. [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] proposed CoVe, an approach for learning
contextualized word embeddings directly from machine translation models. CoVe
not only contains the word-level information from GloVe but also information
learned with an LSTM in the context of the MT task. CoVe is trained on three
different MT datasets: the 2016 WMT multimodal dataset, the 2016 IWSLT training
set, and the 2017 WMT news track training set. On top of CoVe, we use an LSTM
with attention. Our hyperparameters for CoVe are shown in Table 1.</p>
          <p>Embeddings from LM. Peters et al. [
            <xref ref-type="bibr" rid="ref14 ref6">14, 6</xref>
            ] introduced ELMo, a model based
on learning embeddings directly from language models. The pre-training with
language models provides ELMo with both complex characteristics of words and
the usage of these words across various linguistic contexts. ELMo is
trained on the One Billion Word Benchmark dataset [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ], and these embeddings are
employed as our input layer. More specifically, we extract the third layer of the
ELMo representation and experiment with it using an LSTM-with-attention
network. This is our best model, and the only model we submitted for
the competition. We provide its hyperparameters in Table 1.</p>
          <p>Fine-Tuning an LM: ULMFiT. Transfer learning is extensively used in the field
of computer vision to improve the ability of models to learn from new data.
Inspired by this idea, Howard and Ruder [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] present ULMFiT (http://nlp.fast.ai/), which fine-tunes a
pretrained language model (trained on the Wikitext-103 dataset). With ULMFiT,
we use a forward language model. We use the same network architecture and
hyperparameters (except the dropout ratio and number of epochs) that Howard and
Ruder used, as we report in Table 1.</p>
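          <p>For the ELMo setting, the embeddings can be obtained with AllenNLP [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ], roughly as sketched below; the option and weight file paths are assumptions (any compatible released ELMo checkpoint works the same way).</p>
          <preformat># Sketch of extracting ELMo representations with AllenNLP.
from allennlp.modules.elmo import Elmo, batch_to_ids

options = "elmo_options.json"  # assumed local path to released options file
weights = "elmo_weights.hdf5"  # assumed local path to released weights file

# One mixed output representation; reading a single layer (as we do with
# the third layer) requires accessing the per-layer activations instead.
elmo = Elmo(options, weights, num_output_representations=1, dropout=0.0)

sentences = [["I", "had", "lunch", "with", "my", "coworkers"]]
char_ids = batch_to_ids(sentences)  # shape: (batch, words, max_chars)
embeddings = elmo(char_ids)["elmo_representations"][0]  # (1, 6, 1024)</preformat>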
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Results</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>Tables 2, 3, and 4 report the performance of our models with simple GloVe
embeddings; the best of these settings acquires the best accuracy (0.9181) on the
sociality task. These results suggest that sociality is an easier task than agency.
One confounding factor is that the sociality training data is more balanced than
the agency training data (with the majority class at 0.5327 for sociality vs. 0.7382
for agency).</p>
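      <p>These majority-class figures follow directly from the label counts given in Section 2, as the following quick check shows:</p>
      <preformat># Majority-class baselines implied by the label counts in Section 2.
agency_yes, agency_no = 7796, 2764
social_yes, social_no = 5625, 4935

print(agency_yes / (agency_yes + agency_no))  # 0.73825..., about 0.7382
print(social_yes / (social_yes + social_no))  # 0.53267..., about 0.5327</preformat>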
      <p>[Table: accuracy of the LSTM, BiLSTM, LSTM-A, and BiLSTM-A models with
GloVe embeddings, for layer counts {1, 2} and hidden units {128, 256, 512}; the
numeric cells are not recoverable in this version.]</p>
      <p>Next, we present our results with the CoVe-, ELMo-, and ULMFiT-trained
models in Table 5, which reports accuracy, AUC score, and F1 score (for the
positive class in binary classification) on the validation set. From
Tables 2, 3, 4, and 5, it can be observed that our CoVe, ELMo, and ULMFiT
models lead to (a) significant performance improvements over
traditional machine learning models and (b) sizable improvements over deep
learning models with simple GloVe embeddings.</p>
      <p>Among the systems with pre-trained embeddings (described in Section 3.2),
ELMo performs best. One nuance is that ELMo outperforms the ULMFiT
model, which fine-tunes a language model rather than the embeddings. One
probable explanation is the impact of the attention mechanism used in the LSTM
model with ELMo embeddings, which is crucial for this particular task and is not
present in the ULMFiT model. We now turn to probing our models further by
visualizing the attention weights for words in our data.</p>
      <p>For interpretability, and to acquire a better understanding of the two important
concepts of agency and sociality, we provide attention-based visualizations of 24
example sentences from our data. In each example, the color intensity corresponds
to the self-attention weights assigned by our model (LSTM-A). Figures 1
(hand-picked) and 2 (randomly picked) provide examples from the agency data,
for the positive and then the negative class, respectively. As the figures demonstrate,
the model's attentions are relatively intuitive. For example, for the positive-class
cases (hand-picked), the model attends to words such as `my', `with', and
`coworkers', which refer to (or establish a connection with) the agent. Figures 3
and 4 provide similar visualizations for the sociality task. Again, the attention
weights cast some intuitive light on the concept of sociality. The model, for
example, attends to words like `daughter', `grandson', `members', and `family'
in the hand-picked positive cases. Also, in the hand-picked negative examples, the
model attends to words referring to non-persons such as `book', `mail', `workout',
and `dog'.</p>
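      <p>As a sketch of how such a visualization can be produced (our own minimal formulation), the tokens of a sentence can be rendered with an intensity proportional to the weights returned by the attention classifier sketched in Section 3:</p>
      <preformat># Sketch: print per-word attention weights as a simple text heatmap.
# `tokens' and `weights' are assumed to come from the attention model.
tokens = ["I", "had", "lunch", "with", "my", "coworkers"]
weights = [0.05, 0.05, 0.15, 0.25, 0.20, 0.30]  # illustrative values

peak = max(weights)
for tok, w in zip(tokens, weights):
    bar = "#" * int(10 * w / peak)  # longer bar = stronger attention
    print(f"{tok:>12s} {w:.2f} {bar}")</preformat>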
      <p>[Figures 1 and 2: (a) examples of happy moments with a positive agency label;
(b) examples of happy moments with a negative agency label.]</p>
      <p>[Figures 3 and 4: (a) examples of happy moments with a positive sociality label;
(b) examples of happy moments with a negative sociality label.]</p>
      <p>Next, in Figure 5, we provide the top 35 words the model attends to in
the positive classes of the agency and sociality datasets. Again, many
of these words are intuitively relevant to each task. For example, for agency, the
model attends to words referring to others the agent is interacting with
(e.g., `girlfriend', `friend', `mother', and `family') and social activities the agent
is possibly contributing to (e.g., `lunch', `trip', `party', and `dinner'). Similarly,
for sociality, the model attends to verbs indicating being socially involved
(e.g., `told', `came', `bought', and `took') and to others/social groups (e.g., `friends',
`son', `family', and `daughter'). Clearly, the two concepts of agency and sociality
are not orthogonal: the words the model attends to in each case indicate some
degree of overlap between the two concepts.</p>
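      <p>A figure like Figure 5 can be produced by aggregating the per-word attention weights over the positive-class examples and keeping the top 35 words; a minimal sketch follows (our own formulation, with model, dev_batches, and vocab assumed from the setup in Section 3):</p>
      <preformat># Sketch: aggregate attention weight per word over the positive class.
from collections import Counter

totals = Counter()
for token_ids, labels in dev_batches:        # assumed iterable of batches
    _, attn = model(token_ids)               # (B, T) attention weights
    for ids_row, attn_row, label in zip(token_ids, attn, labels):
        if label != 1:                       # keep the positive class only
            continue
        for tok_id, w in zip(ids_row.tolist(), attn_row.tolist()):
            totals[vocab[tok_id]] += w       # vocab maps ids to strings

print(totals.most_common(35))                # the top 35 attended words</preformat>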
      <sec id="sec-3-1">
        <title>Conclusion</title>
        <p>In this paper, we reported successful models for learning agency and sociality in
a supervised setting. We also presented extensive visualizations based on the
models' self-attention that enhance our understanding of these two concepts as
well as of model decisions (i.e., interpretability). In the future, we plan to develop
models for the same tasks based on more sophisticated attention mechanisms.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Acknowledgement</title>
        <p>We acknowledge the support of the Natural Sciences and Engineering Research
Council of Canada (NSERC). The research was partially enabled by support
from WestGrid (www.westgrid.ca) and Compute Canada (www.computecanada.ca).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abdul-Mageed</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Emonet: Fine-grained emotion detection with gated recurrent neural networks</article-title>
          .
          <source>In: Proceedings of the 55th Annual</source>
          <article-title>Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</article-title>
          .
          <source>vol. 1</source>
          , pp.
          <volume>718</volume>
          –
          <issue>728</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Alhuzali</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abdul-Mageed</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Enabling deep learning of emotion with first-person seed expressions</article-title>
          .
          <source>In: Proceedings of the Second Workshop on Computational Modeling of People's Opinions</source>
          , Personality, and Emotions in Social Media. pp.
          <volume>25</volume>
          –
          <issue>35</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Asai</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evensen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golshan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halevy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopatenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepanov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suhara</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>HappyDB: A corpus of 100,000 crowdsourced happy moments</article-title>
          .
          <source>arXiv preprint arXiv:1801.07746</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chelba</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schuster</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ge</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brants</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koehn</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>One billion word benchmark for measuring progress in statistical language modeling</article-title>
          .
          <source>arXiv preprint arXiv:1312.3005</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ekman</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>An argument for basic emotions</article-title>
          .
          <source>Cognition &amp; emotion 6(3-4)</source>
          ,
          <volume>169</volume>
          –
          <fpage>200</fpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gardner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grus</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tafjord</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dasigi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmitz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.S.:</given-names>
          </string-name>
          <article-title>AllenNLP: A deep semantic natural language processing platform</article-title>
          .
          <source>In: ACL workshop for NLP Open Source Software</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Graves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Supervised sequence labelling</article-title>
          .
          <source>In: Supervised Sequence Labelling with Recurrent Neural Networks</source>
          , pp.
          <volume>5</volume>
          –
          <fpage>13</fpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hochreiter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Long short-term memory</article-title>
          .
          <source>Neural computation 9(8)</source>
          ,
          <volume>1735</volume>
          –
          <fpage>1780</fpage>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Howard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruder</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Universal language model fine-tuning for text classification</article-title>
          . In:
          <article-title>Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</article-title>
          .
          <source>vol. 1</source>
          , pp.
          <volume>328</volume>
          –
          <issue>339</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jaidka</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mumick</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chhaya</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The CL-Aff Happiness Shared Task: Results and Key Insights</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Affective Content Analysis @ AAAI (AffCon2019)</source>
          . Honolulu, Hawaii
          (
          <year>January 2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Recurrent neural network for text classification with multi-task learning</article-title>
          .
          <source>In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence</source>
          . pp.
          <volume>2873</volume>
          –
          <fpage>2879</fpage>
          . AAAI Press (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>McCann</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradbury</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
          </string-name>
          , R.:
          <article-title>Learned in translation: Contextualized word vectors</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          . pp.
          <volume>6294</volume>
          –
          <issue>6305</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.:
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In: EMNLP</source>
          . vol.
          <volume>14</volume>
          , pp.
          <volume>1532</volume>
          –
          <issue>1543</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iyyer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gardner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Deep contextualized word representations</article-title>
          .
          <source>arXiv preprint arXiv:1802.05365</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Plutchik</surname>
            ,
            <given-names>R.:</given-names>
          </string-name>
          <article-title>The psychology and biology of emotion</article-title>
          . HarperCollins College Publishers (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ji</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Context-sensitive Twitter sentiment classification using neural network</article-title>
          .
          <source>In: AAAI</source>
          . pp.
          <volume>215</volume>
          –
          <issue>221</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Roseman</surname>
            ,
            <given-names>I.J.:</given-names>
          </string-name>
          <article-title>Cognitive determinants of emotion: A structural theory</article-title>
          .
          <source>Review of personality &amp; social psychology</source>
          (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Samy</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El-Beltagy</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanien</surname>
          </string-name>
          , E.:
          <article-title>A context integrated model for multilabel emotion detection</article-title>
          .
          <source>Procedia computer science 142</source>
          ,
          <volume>61</volume>
          –
          <fpage>71</fpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Tai</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , C.D.:
          <article-title>Improved semantic representations from tree-structured long short-term memory networks</article-title>
          .
          <source>arXiv preprint arXiv:1503.00075</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Volkova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bachrach</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Inferring perceived demographics from user emotional tone and user-environment emotional contrast</article-title>
          .
          <source>In: Proceedings of the 54th ACL</source>
          . vol.
          <volume>1</volume>
          , pp.
          <volume>1567</volume>
          –
          <issue>1578</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Cached long short-term memory neural networks for document-level sentiment classification</article-title>
          .
          <source>In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>1660</volume>
          –
          <issue>1669</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tian</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qi</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Attention-based bidirectional long short-term memory networks for relation classification</article-title>
          .
          <source>In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          .
          <source>vol. 2</source>
          , pp.
          <volume>207</volume>
          –
          <issue>212</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>