<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Emotion and sentiment analysis of tweets using BERT</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Claudia Diamantini</string-name>
          <email>c.diamantini@univpm.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Mircoli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Chiorrini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Potena</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università Politecnica delle Marche</institution>
          , Ancona,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The huge diffusion of social networks has made available an unprecedented amount of publicly-available user-generated data, which may be analyzed in order to determine people's opinions and emotions. In this paper we investigate the use of Bidirectional Encoder Representations from Transformers (BERT) models for both sentiment analysis and emotion recognition of Twitter data. We define two separate classifiers for the two tasks and we evaluate the performance of the obtained models on real-world tweet datasets. Experiments show that the models achieve an accuracy of 0.92 and 0.90 on, respectively, sentiment analysis and emotion recognition.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        In the last decade, the great diffusion of social networks, personal
blogs and review sites has made available a huge amount of
publicly-available user-generated content. Such data is considered
authentic, as in the above contexts people usually feel free to
express their thoughts. Therefore, the analysis of this
user-generated content provides valuable information about the opinions
of users on a large variety of topics and products, allowing
firms to address typical marketing problems such as, for instance,
the evaluation of customer satisfaction or the measurement of
the impact of a new marketing campaign on brand perception.
Moreover, the analysis of customers’ opinions about a certain
product can be a driver for open innovation, as it helps business
owners to find out possible issues and can possibly suggest
new interesting features. For this reason, in the last years many
researchers (e.g., [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]) focused on techniques for the
automatic analysis of writer’s opinions and emotions,
generally referred to as, respectively, sentiment analysis and emotion
analysis.
      </p>
      <p>
        Sentiment analysis is the process of automatic extraction of
writer’s opinions and their characterization in terms of polarity:
positive, negative and neutral. On the other hand, emotion analysis
has the goal of recognizing the emotion expressed in the text. This
task is usually more difficult than sentiment analysis given the
greater variety of classes and the more subtle differences between
them. Although in literature such tasks have been addressed
through both lexicon-based [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and learning-based approaches
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the latter have shown better performance in terms of
classification. For this reason, recent works have focused on large deep
learning models [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In order to be accurately trained, such
models require large corpora of labelled data, which are usually
scarce and expensive to build [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        As a consequence, pre-trained models that only need a
fine-tuning phase with a smaller dataset have been widely used. In
particular, many neural networks composed of a task-agnostic
pre-trained word embedding layer (e.g., GloVe [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]) and a
task-specific neural architecture have been proposed, but the
improvement of these models measured by accuracy or F1 score has
reached a bottleneck [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Nevertheless, recent architectures based
on the Transformer [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] have shown further room for improvement.
      </p>
      <p>
        In the present paper, we investigate the enhancement in terms
of classification accuracy of Bidirectional Encoder Representations
from Transformers (BERT) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], one of the most popular
pre-trained language models based on the Transformer, on both the tasks
of sentiment analysis and emotion recognition. To this purpose,
we propose two BERT-based architectures for text classification
and we fine-tune them in order to evaluate their performance. In
the rest of the work we focus on data collected from microblogging
platforms and, in particular, from Twitter. The main reasons of
this choice are the wide availability of tweets (as opposed to, for
instance, Facebook posts, due to different data policies) and the
fact that such data are usually challenging to analyze due to the
presence of slang, typos and abbreviations (e.g., "btw" for "by the
way") and hence represent a good benchmark for text classifiers.
      </p>
      <p>The rest of the paper is structured as follows: the next section
presents some relevant related work on sentiment analysis and
emotion recognition. The architecture of the models used for
both tasks is proposed in Section 3, while Section 4 reports the
results of the experimental evaluation of the models on
real-world datasets of tweets. Finally, Section 5 draws conclusions
and discusses future work.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
    </sec>
    <sec id="sec-3">
      <title>Sentiment analysis</title>
      <p>
        With the ever increasing amount of user generated content available
online, the field of automatic sentiment analysis has become a
topic of increasing research interest. As in many other fields, deep
learning techniques are being widely used for sentiment analysis,
as demonstrated by the presence of various surveys regarding the
subject over the last years [
        <xref ref-type="bibr" rid="ref12 ref16 ref24">12, 16, 24</xref>
        ]. The first complex task that
sentiment analysis must tackle is the vector representation of
words, which is typically performed through word embeddings:
a technique which transforms the words in a vocabulary into
vectors of continuous real numbers.
      </p>
      <p>
        The most commonly used word embeddings are Word2Vec 1
and Global Vector (GloVe) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>
        Word2Vec is a neural network that learns the word embeddings
from text, and contains both the continuous bag-of-words (CBoW)
model [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and the Skip-gram model [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Given a set of context
words (e.g. “the girl is _ an apple,” where “_” denotes the target
word) the CBoW predicts the target word (e.g., “eating”), conversely
the Skip-gram model, given the target word, predicts the context
words.
      </p>
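      <p>To make the difference concrete, the following minimal sketch (an illustration, not the models' actual training code) enumerates the training pairs that CBoW and Skip-gram would derive from a toy sentence with a context window of size 1:</p>
```python
# Illustrative sketch: training pairs for CBoW vs. Skip-gram (window = 1).
# CBoW predicts the target word from its context; Skip-gram does the reverse.

def cbow_pairs(tokens, window=1):
    """(context words, target word) pairs, as used by CBoW."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((tuple(context), target))
    return pairs

def skipgram_pairs(tokens, window=1):
    """(target word, context word) pairs, as used by Skip-gram."""
    return [(t, c) for ctx, t in cbow_pairs(tokens, window) for c in ctx]

sentence = "the girl is eating an apple".split()
print(cbow_pairs(sentence)[3])       # → (('is', 'an'), 'eating')
print(skipgram_pairs(sentence)[:2])  # → [('the', 'girl'), ('girl', 'the')]
```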
      <p>GloVe is trained on the non-zero entries of a global word-word
co-occurrence matrix, rather than on the entire sparse matrix or
on individual context windows in a large corpus.</p>
      <p>
        Subsequent works have focused on further refining the idea
of embedding. In [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ] the authors proposed models that learn
sentiment-specific word embeddings (SSWE). In these embeddings,
the sentiment information is embedded in the learned word
vectors as well as the semantics. The authors of [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] designed
and trained a neural network that learns a sentiment-related
embedding representation through the integration of sentiment
supervision both at document level and at word level. A further
refinement of semantics-oriented word vectors has been proposed
in [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] which integrates the word embedding model with standard
matrix factorization through a projection level.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Emotion analysis</title>
      <p>Though there is no universal agreement over which are the
primary emotions of human beings, the scientific community
is giving ever increasing attention to the specific problem of
emotion recognition.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] a bilingual attention network model has been proposed
for code-switched emotion prediction. In particular, a document
level representation of each post has been built using a Long
Short-Term Memory (LSTM) model, while the informative words
from the context have been captured through the attention
mechanism.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the authors used distant supervision to automatically
build a dataset for emotion detection and trained a fine-grained
emotion detection system using Gated Recurrent Unit (GRU)
network.
      </p>
      <p>
        Another approach [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] focused on learning better
representations of emotional contexts by using millions of emoji occurrences
in social media to pre-train neural models.
      </p>
      <p>
        Even more recently a Bidirectional Encoder Representations
from Transformers (BERT) model has been proposed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This
pre-trained BERT model has provided, without any substantial
task-specific architecture modifications, state of the art
performances over various NLP tasks.
      </p>
      <p>
        In the sentiment analysis field, BERT has been mostly used in
aspect-based sentiment analysis such as in [
        <xref ref-type="bibr" rid="ref15 ref25 ref31">15, 25, 31</xref>
        ], while few
authors focused on emotion analysis.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the authors performed a comparative analysis of various
pre-trained transformer models, including BERT, for the text
emotion recognition problem. However, our work differs from
the previous ones as we evaluate the performance of emotion
classification when applied to social content, which is usually
more challenging.
1 https://code.google.com/archive/p/word2vec/
      </p>
    </sec>
    <sec id="sec-5">
      <title>MODEL</title>
      <p>In this section we describe the proposed model for the tasks
of emotion and sentiment analysis. The model is built by
fine-tuning BERT on specific datasets of tweets developed for such
tasks. Since tweets usually contain words that are irrelevant for
text classification, a text preprocessing phase is needed in order
to remove:
• mentions: users often cite other Twitter usernames in their
tweets through the character ’@’ in order to direct their
messages;
• urls: urls are very common in tweets, both for media (i.e.,
pictures and videos) and links to other webpages;
• retweets: users often resend tweets they consider relevant
to their followers. Retweets are usually marked with the
prefix "RT" and hence are easily identifiable.</p>
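      <p>A minimal sketch of this preprocessing phase (the regex patterns are our own illustration; the exact rules used in the implementation may differ):</p>
```python
import re

def preprocess_tweet(text):
    """Remove the retweet marker, mentions and urls from a tweet."""
    text = re.sub(r"\bRT\b:?\s*", "", text)            # retweet prefix "RT"
    text = re.sub(r"@\w+:?\s*", "", text)              # mentions (@username)
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # urls
    return re.sub(r"\s+", " ", text).strip()           # collapse whitespace

print(preprocess_tweet("RT @user: check this out https://t.co/abc btw"))
# → "check this out btw"
```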
      <p>After the preprocessing phase, data can be used as input to
train task-specific BERT-based models. The architecture of a
generic BERT model consists of a series of bi-directional
multilayer encoder-based Transformers. Nowadays, several pre-trained
BERT models are available. Table 1 shows the main BERT models
as a function of the number of layers L (i.e. the number of
encoders) and the number of hidden units H. Smaller models are
intended for environments with limited computational resources,
since bigger models have a large number of trainable parameters:
a model of average size like BERT-Base has approximately 110
million trainable parameters, while BERT-Large has more than
340 million parameters.</p>
      <p>Specifically, the reference model used in this work is
BERT-Base, both in the uncased and the cased version. The uncased
version implies that text is converted to lowercase before the
word tokenization process (e.g., Michael Jackson becomes michael
jackson) and accents are ignored. The architecture of the
BERT-Base model consists of 12 encoders, each composed of 8 layers: 4
multi-head self-attention layers and 4 feed forward layers. We
extended such a model by adding a fully connected layer and a
softmax layer for classification, as reported in Figure 1. The
architecture is common to both the sentiment and emotion
classifiers: the only difference between the two models is represented
by the last softmax layer, in which the number of neurons is
equal to the number of classes (i.e., 3 for sentiment analysis and
4 for emotion recognition).</p>
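      <p>As an illustration of the final classification stage, the following minimal sketch (plain Python with made-up dimensions, not the actual fine-tuned model) applies a fully connected layer followed by softmax to a pooled representation:</p>
```python
import math, random

def dense_softmax_head(pooled, weights, bias):
    """Fully connected layer + softmax over the output classes."""
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    exps = [math.exp(z - max(logits)) for z in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
hidden, classes = 8, 4  # 4 output neurons for emotion recognition (3 for sentiment)
pooled = [random.uniform(-1, 1) for _ in range(hidden)]
W = [[random.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(classes)]
b = [0.0] * classes
probs = dense_softmax_head(pooled, W, b)
print(probs)  # four class probabilities summing to 1
```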
    </sec>
    <sec id="sec-6">
      <title>EXPERIMENTS</title>
      <p>In this section we present some experimental results aimed at
evaluating the performance of the proposed BERT-based models.
The results for the emotion analysis and the sentiment analysis
tasks are discussed separately.</p>
    </sec>
    <sec id="sec-7">
      <title>Experimental setting</title>
      <p>
        The proposed models have been evaluated through two different
datasets, namely Go et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for sentiment analysis and the Tweet
Emotion Intensity dataset [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] for emotion recognition. The same
criteria have been used for the experiments: in particular, each
dataset has been split through a stratified sampling into train
(80%), dev (10%) and test (10%) sets. Moreover, we tested both the
uncased and the cased version of BERT. Experiments have been
performed on a laptop with 2x2.2GHz CPU, 8GB RAM and a
Nvidia Geforce 740M graphics card: execution times reported in
the following subsections refer to such hardware configuration.
      </p>
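      <p>The stratified 80/10/10 split described above can be sketched as follows (an illustration under the assumption of per-class shuffling with a fixed seed; the paper does not specify its implementation):</p>
```python
import random
from collections import defaultdict

def stratified_split(samples, labels, seed=42):
    """Split into train/dev/test (80/10/10) preserving class proportions."""
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    rng = random.Random(seed)
    train, dev, test = [], [], []
    for y, items in by_class.items():
        rng.shuffle(items)
        n = len(items)
        a, b = int(0.8 * n), int(0.9 * n)
        train += [(s, y) for s in items[:a]]
        dev   += [(s, y) for s in items[a:b]]
        test  += [(s, y) for s in items[b:]]
    return train, dev, test

data, labels = list(range(100)), ["pos"] * 50 + ["neg"] * 50
tr, dv, te = stratified_split(data, labels)
print(len(tr), len(dv), len(te))  # → 80 10 10
```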
      <p>We evaluated the models by means of two metrics: classification
accuracy and F1 score. Let n_ij be the number of data items belonging
to the i-th class which have been classified as j-th class, let C be
the number of classes and let N be the total number of data items.
The accuracy achieved by a classifier is computed as
Accuracy = (1/N) · Σ_{i=1..C} n_ii. Precision and recall of the i-th
class are determined as P_i = n_ii / Σ_{j=1..C} n_ji and
R_i = n_ii / Σ_{j=1..C} n_ij, and the F1 score of the i-th class is
equal to F1_i = 2 · P_i · R_i / (P_i + R_i). Therefore, the F1 score
achieved by a classification model is defined as the average of the
per-class scores: F1 = (1/C) · Σ_{i=1..C} F1_i.</p>
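      <p>These metrics can be computed directly from the confusion matrix; a minimal sketch, where n[i][j] denotes the number of items of the i-th class classified as the j-th class:</p>
```python
def evaluate(n):
    """Accuracy and macro-averaged F1 from confusion matrix n (true i, predicted j)."""
    C = len(n)
    N = sum(sum(row) for row in n)
    accuracy = sum(n[i][i] for i in range(C)) / N
    f1s = []
    for i in range(C):
        predicted_i = sum(n[j][i] for j in range(C))  # column sum: predicted as i
        actual_i = sum(n[i][j] for j in range(C))     # row sum: truly of class i
        p = n[i][i] / predicted_i if predicted_i else 0.0
        r = n[i][i] / actual_i if actual_i else 0.0
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return accuracy, sum(f1s) / C

acc, f1 = evaluate([[8, 2], [1, 9]])
print(round(acc, 2), round(f1, 2))  # → 0.85 0.85
```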
    </sec>
    <sec id="sec-8">
      <title>4.2 Emotion analysis</title>
      <p>In order to evaluate the performance of the proposed architecture
on the emotion analysis task we considered the Tweet Emotion
Intensity dataset, which consists of 6755 tweets labelled with
respect to the following four emotions: anger, fear, happiness,
sadness. Since samples in the original dataset were not equally
distributed among classes, we balanced the training set by applying
the undersampling technique. In particular, we randomly chose
1300 tweets from each class. We also filtered out 974 meaningless
tweets, e.g. tweets only containing non-ASCII characters or very
short tweets. As a result, we obtained a training+dev set of 5200
equally distributed tweets and a test set of 581 tweets with the
class distribution reported in Figure 2.</p>
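      <p>The undersampling step can be sketched as follows (our own illustration: a fixed number of tweets drawn at random from each class):</p>
```python
import random
from collections import defaultdict

def undersample(tweets, labels, per_class=1300, seed=0):
    """Balance the training set by random undersampling of each class."""
    by_class = defaultdict(list)
    for t, y in zip(tweets, labels):
        by_class[y].append(t)
    rng = random.Random(seed)
    balanced = []
    for y, items in sorted(by_class.items()):
        balanced += [(t, y) for t in rng.sample(items, min(per_class, len(items)))]
    return balanced

tweets = [f"tweet {i}" for i in range(50)]
labels = ["anger"] * 30 + ["fear"] * 20
sample = undersample(tweets, labels, per_class=10)
print(len(sample))  # → 20
```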
      <p>Each occurrence in the dataset is associated not only with
an emotion label but also with a parameter called intensity, which
represents the intensity of the emotion. Specifically, this parameter
is a value between 0 and 1 that indicates the degree of intensity
with which the author of the tweet felt that emotion. Figure 3
shows the histogram of the occurrences for different lengths
(in characters) of tweets; the length of 452 characters represents
the upper-bound of lengths present in the dataset and is actually
an isolated case given by a single tweet, while the average length
ranges between 9 and 50 characters. Since the model requires
defining a maximum length for the sequence of input characters
(the max_seq_length parameter), analyzing the histogram we
decided to set max_seq_length=95.</p>
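      <p>Choosing max_seq_length from the length histogram amounts to covering almost all tweets while discarding the isolated outlier; a sketch based on a high percentile of the length distribution (the percentile threshold is our assumption, as the paper selected the value by inspecting the histogram):</p>
```python
import math

def length_cutoff(lengths, coverage=0.99):
    """Smallest length that covers the given fraction of samples."""
    ordered = sorted(lengths)
    idx = max(0, math.ceil(coverage * len(ordered)) - 1)
    return ordered[idx]

# Hypothetical length distribution with one isolated outlier, as in the dataset:
lengths = [20] * 90 + [60] * 9 + [452]
print(length_cutoff(lengths))  # → 60
```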
      <p>After a preliminary phase of hyperparameter tuning aimed
at determining the best values for the hyperparameters of our
model, we trained our classifier by using the values reported in
Table 2.</p>
      <p>Training required about 5’30"/epoch, while the prediction of
the emotion related to a tweet in the test set took approximately 0.4
seconds. We trained the model for a variable number of epochs,
ranging from 1 to 6. The reason for choosing such a small number
of epochs is the fact that pre-trained models usually need a short
fine-tuning phase in order not to overfit the data.</p>
      <p>
        We evaluated both the uncased and the cased version of
BERT-Base by using the same hyper-parameter configuration. Training
and validation loss as a function of the number of epochs are
reported, respectively, in Figure 4 for the uncased version and in
Figure 5 for the cased one. It has to be noticed that, in line with
expectations, in both cases the optimal training is reached in
only 2 epochs. In fact, starting from the third epoch, even if the
training error diminishes, the validation loss begins to increase:
a phenomenon which is usually correlated with overfitting.
The confusion matrices for the uncased and cased versions are
respectively shown in Tables 3 and 4. The uncased BERT has
accuracy = 0.89 and F1 = 0.89, while the cased version has accuracy
= 0.90 and F1 = 0.91: hence, the cased version shows slightly higher
performance. Table 4 shows that the happiness class has the
highest precision, while the highest recall is reached by the
sadness class, which also has the best average metrics. Happy
tweets seem to be the most difficult to detect, since the
happiness class has the lowest recall (0.85). Nevertheless, the
difference with respect to the other classes is rather small. Generally
speaking, the performance of the classifier seems promising.</p>
      <p>
        The performance of the sentiment analysis classifier has been
evaluated on the dataset proposed by Go et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Such dataset is
composed of a training set of 1,600,000 tweets annotated through
distant supervision (by considering emoticons in text) and a test
set of 430 manually-annotated tweets. We only considered the
latter, since that dataset has been annotated by humans and
hence it is more reliable. Each tweet has been annotated with
respect to its polarity (i.e., positive, negative or neutral); the
class distribution is reported in Table 5. The dataset is slightly
imbalanced and the neutral is the minority class. However, it
does not represent a problem since we are more interested in
detecting emotion-bearing tweets.
      </p>
      <p>Coherently with the approach proposed in Section 3, we
preprocessed the dataset in order to remove noisy words like links,
hashtags, retweets and mentions. Similarly to Section 4.2, we
analyzed the length (in terms of characters) of each tweet. In
this case, we set max_seq_length=82, since tweets were shorter
than those in the Tweet Emotion Intensity Dataset on average.
We performed a phase of hyperparameter tuning through a grid
search and we determined the best configuration (see Table 6).</p>
      <p>Due to the smaller size of the dataset, the time required by
training was smaller: in particular, it took about 1’15"/epoch. We
trained the model for a variable number of epochs - from 1 to 6
- and we noticed a behavior similar to the emotion recognition
task. As it can be observed in Figures 6 and 7, the validation loss
reached its minimum value after a single epoch, then started to
increase, probably due to overfitting. A possible explanation of
such phenomenon is that the dataset was small if compared to
the number of parameters of the model and hence the classifier
rapidly overfitted. Nevertheless, further investigations with larger
datasets are required.</p>
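      <p>The observed trend suggests a simple early-stopping rule; a minimal sketch (our own illustration) that selects the epoch with the lowest validation loss:</p>
```python
def best_epoch(val_losses):
    """1-based epoch with the minimum validation loss (early stopping)."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

# Hypothetical validation losses shaped like the trend described above:
# minimum after one epoch, then increasing due to overfitting.
print(best_epoch([0.45, 0.52, 0.61, 0.70]))  # → 1
```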
      <p>The confusion matrices for the uncased and cased versions are
respectively reported in Tables 7 and 8. In this case, the uncased
and cased BERT have similar performance, both in terms of
accuracy (0.92) and F1 (0.92): hence, the cased version provides
no improvement over the uncased one.</p>
      <p>
        It can be observed that the largest part of misclassified tweets
is composed of emotion-bearing texts that are, instead, classified
as neutral. This phenomenon can be justified by considering that
there are sentences that are weakly polarized (e.g., for the lack of
strongly polarized adjectives, such as "wonderful" or "ugly") and
sentences containing slang, which are really difficult to properly
classify. It is remarkable that polarity inversions, i.e. positive
sentences classified as negative and vice versa, are quite rare
(1.8%). In fact, polarity inversions are usually more costly in
terms of classification error as they correspond to completely
misrepresenting the user’s opinion. Such performance can be compared
to those presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], where traditional machine learning
algorithms are applied to the same dataset. It can be noticed
that the best model proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], i.e., SVM, has an accuracy
of 0.82. Therefore, the use of BERT leads to a remarkable 0.10
improvement in terms of accuracy.
      </p>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSION</title>
      <p>The goal of this work was the evaluation of the use of Bidirectional
Encoder Representations from Transformers (BERT) models for
both sentiment analysis and emotion recognition of Twitter data.
We defined an architecture composed of BERT-Base followed
by a final classification stage and we fine-tuned the model for
the above-mentioned tasks. We measured the performance of
our classifiers by considering two datasets of tweets and we
obtained a remarkable 92% accuracy for sentiment analysis and
a 90% accuracy for emotion analysis, from which it was possible
to deduce that BERT’s language modeling power significantly
contributes to achieving good text classification.</p>
      <p>
        In future work, we plan to improve the performance of our
classifiers by determining the best number of layers and neurons
in the final classification layers (i.e., fully connected layers).
We also intend to extend the experimentation by considering
larger datasets, such as the SemEval 2017 Task 4 [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] dataset
for sentiment analysis and the EmoBank [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] dataset for emotion
analysis. This is particularly important for the sentiment analysis
task, in which we observed a sudden increase of the validation
loss after the first epoch, probably due to overfitting. Although
the models reach high accuracy and the approach seems promising,
a comparison with other state-of-the-art classifiers will be useful
to thoroughly evaluate the performance of our approach. We also
intend to investigate the impact of BERT-Base by replacing it
with other BERT distributions (e.g., BERT-Large) or traditional
word embeddings, such as Word2Vec [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] or GloVe [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>The authors would like to thank the students Federico Filipponi,
Leonardo Lucarelli and Alessandrino Manilii for their help in
implementing the architecture for emotion recognition.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Muhammad</given-names>
            <surname>Abdul-Mageed</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lyle</given-names>
            <surname>Ungar</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Emonet: Fine-grained emotion detection with gated recurrent neural networks. In Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: Long papers</article-title>
          ).
          <fpage>718</fpage>
          -
          <lpage>728</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Acheampong</given-names>
            <surname>Francisca</surname>
          </string-name>
          <string-name>
            <given-names>Adoma</given-names>
            ,
            <surname>Nunoo-Mensah Henry</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Wenyu</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Comparative Analyses of Bert, Roberta, Distilbert, and Xlnet for TextBased Emotion Recognition</article-title>
          .
          <source>In 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)</source>
          . IEEE,
          <fpage>117</fpage>
          -
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Qurat</given-names>
            <surname>Tul</surname>
          </string-name>
          <string-name>
            <surname>Ain</surname>
          </string-name>
          , Mubashir Ali, Amna Riaz, Amna Noureen, Muhammad Kamran, Babar Hayat, and
          <string-name>
            <given-names>A</given-names>
            <surname>Rehman</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Sentiment analysis using deep learning techniques: a review</article-title>
          .
          <source>Int J Adv Comput Sci Appl 8</source>
          ,
          <issue>6</issue>
          (
          <year>2017</year>
          ),
          <fpage>424</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Sven</given-names>
            <surname>Buechel</surname>
          </string-name>
          and
          <string-name>
            <given-names>Udo</given-names>
            <surname>Hahn</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Emobank: Studying the impact of annotation perspective and representation format on dimensional emotion analysis</article-title>
          .
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume</source>
          <volume>2</volume>
          ,
          <string-name>
            <given-names>Short</given-names>
            <surname>Papers</surname>
          </string-name>
          .
          <fpage>578</fpage>
          -
          <lpage>585</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ming-Wei</surname>
            <given-names>Chang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:1810.04805 (2018).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ming-Wei</surname>
            <given-names>Chang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          .
          <source>In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers).
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . https://doi.org/10.18653/v1/N19-1423
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Diamantini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mircoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Potena</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Storti</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Semantic disambiguation in a social information discovery system</article-title>
          .
          <source>In 2015 International Conference on Collaboration Technologies and Systems</source>
          , CTS 2015.
          <fpage>326</fpage>
          -
          <lpage>333</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Bjarke</given-names>
            <surname>Felbo</surname>
          </string-name>
          , Alan Mislove, Anders Søgaard, Iyad Rahwan, and
          <string-name>
            <given-names>Sune</given-names>
            <surname>Lehmann</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm</article-title>
          .
          <source>arXiv preprint arXiv:1708.00524</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Go</surname>
          </string-name>
          , Richa Bhayani, and
          <string-name>
            <given-names>Lei</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Twitter sentiment classification using distant supervision</article-title>
          .
          <source>CS224N project report, Stanford</source>
          <volume>1</volume>
          ,
          <issue>12</issue>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ali</given-names>
            <surname>Hasan</surname>
          </string-name>
          , Sana Moin, Ahmad Karim, and
          <string-name>
            <given-names>Shahaboddin</given-names>
            <surname>Shamshirband</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Machine learning-based sentiment analysis for Twitter accounts</article-title>
          .
          <source>Mathematical and Computational Applications</source>
          <volume>23</volume>
          ,
          <issue>1</issue>
          (
          <year>2018</year>
          ),
          <fpage>11</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Maha</given-names>
            <surname>Heikal</surname>
          </string-name>
          , Marwan Torki, and
          <string-name>
            <given-names>Nagwa</given-names>
            <surname>El-Makky</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Sentiment analysis of Arabic Tweets using deep learning</article-title>
          .
          <source>Procedia Computer Science</source>
          <volume>142</volume>
          (
          <year>2018</year>
          ),
          <fpage>114</fpage>
          -
          <lpage>122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Doaa</given-names>
            <surname>Mohey El-Din Mohamed Hussein</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A survey on sentiment analysis challenges</article-title>
          .
          <source>Journal of King Saud University-Engineering Sciences</source>
          <volume>30</volume>
          ,
          <issue>4</issue>
          (
          <year>2018</year>
          ),
          <fpage>330</fpage>
          -
          <lpage>338</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Zhao</surname>
            <given-names>Jianqiang</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gui</surname>
            <given-names>Xiaolin</given-names>
          </string-name>
          , and Zhang Xuejun.
          <year>2018</year>
          .
          <article-title>Deep convolution neural networks for Twitter sentiment analysis</article-title>
          .
          <source>IEEE Access</source>
          <volume>6</volume>
          (
          <year>2018</year>
          ),
          <fpage>23253</fpage>
          -
          <lpage>23260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Xin</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lidong</given-names>
            <surname>Bing</surname>
          </string-name>
          , Wenxuan Zhang, and
          <string-name>
            <given-names>Wai</given-names>
            <surname>Lam</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Exploiting BERT for end-to-end aspect-based sentiment analysis</article-title>
          .
          <source>arXiv preprint arXiv:1910.00883</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Xinlong</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xingyu</given-names>
            <surname>Fu</surname>
          </string-name>
          , Guangluan Xu,
          <string-name>
            <surname>Yang</surname>
            <given-names>Yang</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Jiuniu</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Jin</given-names>
          </string-name>
          , Qing Liu, and
          <string-name>
            <given-names>Tianyuan</given-names>
            <surname>Xiang</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Enhancing BERT representation with context-aware embedding for aspect-based sentiment analysis</article-title>
          .
          <source>IEEE Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          ),
          <fpage>46868</fpage>
          -
          <lpage>46876</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Walaa</given-names>
            <surname>Medhat</surname>
          </string-name>
          , Ahmed Hassan, and
          <string-name>
            <given-names>Hoda</given-names>
            <surname>Korashy</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Sentiment analysis algorithms and applications: A survey</article-title>
          .
          <source>Ain Shams Engineering Journal</source>
          <volume>5</volume>
          ,
          <issue>4</issue>
          (
          <year>2014</year>
          ),
          <fpage>1093</fpage>
          -
          <lpage>1113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>arXiv preprint arXiv:1310.4546</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mircoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cucchiarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Diamantini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Potena</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Automatic emotional text annotation using facial expression analysis</article-title>
          .
          <source>In CEUR Workshop Proceedings</source>
          , Vol.
          <volume>1848</volume>
          .
          <fpage>188</fpage>
          -
          <lpage>196</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Saif M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          and
          <string-name>
            <given-names>Felipe</given-names>
            <surname>Bravo-Marquez</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Emotion intensities in tweets</article-title>
          .
          <source>arXiv preprint arXiv:1708.03696</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Pennington</surname>
          </string-name>
          , Richard Socher, and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GloVe: Global vectors for word representation</article-title>
          .
          <source>In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</source>
          .
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Ana</given-names>
            <surname>Reyes-Menendez</surname>
          </string-name>
          , José Ramón Saura, and
          <string-name>
            <given-names>Cesar</given-names>
            <surname>Alvarez-Alonso</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Understanding #WorldEnvironmentDay user opinions in Twitter: A topic-based sentiment analysis approach</article-title>
          .
          <source>International Journal of Environmental Research and Public Health</source>
          <volume>15</volume>
          ,
          <issue>11</issue>
          (
          <year>2018</year>
          ),
          <fpage>2537</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Sara</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          , Noura Farra, and
          <string-name>
            <given-names>Preslav</given-names>
            <surname>Nakov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>SemEval-2017 Task 4: Sentiment Analysis in Twitter</article-title>
          .
          <source>In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)</source>
          . Association for Computational Linguistics
          , Vancouver, Canada,
          <fpage>502</fpage>
          -
          <lpage>518</lpage>
          . https://doi.org/10.18653/v1/S17-2088
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Kim</given-names>
            <surname>Schouten</surname>
          </string-name>
          and
          <string-name>
            <given-names>Flavius</given-names>
            <surname>Frasincar</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Survey on aspect-level sentiment analysis</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>28</volume>
          ,
          <issue>3</issue>
          (
          <year>2015</year>
          ),
          <fpage>813</fpage>
          -
          <lpage>830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Chi</given-names>
            <surname>Sun</surname>
          </string-name>
          , Luyao Huang, and
          <string-name>
            <given-names>Xipeng</given-names>
            <surname>Qiu</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence</article-title>
          .
          <source>arXiv preprint arXiv:1903.09588</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Duyu</given-names>
            <surname>Tang</surname>
          </string-name>
          , Furu Wei, Bing Qin, Nan Yang, Ting Liu, and
          <string-name>
            <given-names>Ming</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Sentiment embeddings with applications to sentiment analysis</article-title>
          .
          <source>IEEE transactions on knowledge and data Engineering</source>
          <volume>28</volume>
          ,
          <issue>2</issue>
          (
          <year>2015</year>
          ),
          <fpage>496</fpage>
          -
          <lpage>509</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Duyu</given-names>
            <surname>Tang</surname>
          </string-name>
          , Furu Wei, Nan Yang,
          <string-name>
            <given-names>Ming</given-names>
            <surname>Zhou</surname>
          </string-name>
          , Ting Liu, and
          <string-name>
            <given-names>Bing</given-names>
            <surname>Qin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Learning sentiment-specific word embedding for Twitter sentiment classification</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          .
          <fpage>1555</fpage>
          -
          <lpage>1565</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Ashish</given-names>
            <surname>Vaswani</surname>
          </string-name>
          , Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
          <string-name>
            <given-names>Lukasz</given-names>
            <surname>Kaiser</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Illia</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Attention is all you need</article-title>
          .
          <source>arXiv preprint arXiv:1706.03762</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Leyi</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rui</given-names>
            <surname>Xia</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Sentiment lexicon construction with representation learning based on hierarchical sentiment supervision</article-title>
          .
          <source>In Proceedings of the 2017 conference on empirical methods in natural language processing</source>
          .
          <fpage>502</fpage>
          -
          <lpage>510</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Zhongqing</given-names>
            <surname>Wang</surname>
          </string-name>
          , Yue Zhang, Sophia Lee,
          <string-name>
            <given-names>Shoushan</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Guodong</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A bilingual attention network for code-switched emotion prediction</article-title>
          .
          <source>In Proceedings of COLING</source>
          <year>2016</year>
          ,
          <source>the 26th International Conference on Computational Linguistics: Technical Papers</source>
          .
          <fpage>1624</fpage>
          -
          <lpage>1634</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Hu</given-names>
            <surname>Xu</surname>
          </string-name>
          , Bing Liu, Lei Shu, and
          <string-name>
            <given-names>Philip S.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>BERT post-training for review reading comprehension and aspect-based sentiment analysis</article-title>
          .
          <source>arXiv preprint arXiv:1904.02232</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Lei</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Shuai Wang, and
          <string-name>
            <given-names>Bing</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Deep learning for sentiment analysis: A survey</article-title>
          .
          <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>
          <volume>8</volume>
          ,
          <issue>4</issue>
          (
          <year>2018</year>
          ),
          <fpage>e1253</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Wei</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Quan Yuan, Jiawei Han, and
          <string-name>
            <given-names>Jianyong</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Collaborative multi-level embedding learning from reviews for rating prediction</article-title>
          .
          <source>In IJCAI</source>
          , Vol.
          <volume>16</volume>
          .
          <fpage>2986</fpage>
          -
          <lpage>2992</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>