<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cataldo Musto</string-name>
          <email>cataldo.musto@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <email>giovanni.semeraro@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Polignano</string-name>
          <email>marco.polignano@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari Aldo Moro</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The exponential growth of available online information provides computer scientists with many new challenges and opportunities. A recent trend is to analyze people's feelings, opinions and orientation about facts and brands: this is done by exploiting Sentiment Analysis techniques, whose goal is to classify the polarity of a piece of text according to the opinion of the writer. In this paper we propose a lexicon-based approach for sentiment classification of Twitter posts. Our approach is based on the exploitation of widespread lexical resources such as SentiWordNet, WordNet-Affect, MPQA and SenticNet. In the experimental session the effectiveness of the approach was evaluated against two state-of-the-art datasets. Preliminary results provide interesting outcomes and pave the way for future research in the area.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Opinion Mining</kwd>
        <kwd>Semantics</kwd>
        <kwd>Lexicons</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Thanks to the exponential growth of available online information, many new
challenges and opportunities arise for computer scientists. A recent trend is to
analyze people's feelings, opinions and orientation about facts and brands: this is
done by exploiting Sentiment Analysis [
        <xref ref-type="bibr" rid="ref13 ref8">13, 8</xref>
        ] techniques, whose goal is to classify
the polarity of a piece of text according to the opinion of the writer.
      </p>
      <p>
        State-of-the-art approaches for sentiment analysis are broadly classified in
two categories: supervised approaches [
        <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
        ] learn a classification model from
a set of labeled data, while unsupervised (or lexicon-based) ones [
        <xref ref-type="bibr" rid="ref18 ref4">18, 4</xref>
        ] infer the sentiment conveyed by a piece of text from the polarity
of the words (or phrases) which compose it. Even if recent work in the area
showed that supervised approaches tend to outperform unsupervised ones (see the
recent SemEval 2013 and 2014 challenges [
        <xref ref-type="bibr" rid="ref10 ref15">10, 15</xref>
        ]), the latter have the advantage
of avoiding the laborious step of labeling training data.
      </p>
      <p>However, these techniques rely on (external) lexical resources which
map words to a categorical (positive, negative, neutral) or
numerical sentiment score, which is used by the algorithm to obtain the overall
sentiment conveyed by the text. Clearly, the effectiveness of the whole approach
strongly depends on the quality of the lexical resource it relies on. As a
consequence, in this work we investigated the effectiveness of some widespread
lexical resources in the task of sentiment classification of microblog posts.</p>
    </sec>
    <sec id="sec-3">
      <title>State-of-the-art Resources for Lexicon-based Sentiment Analysis</title>
      <p>
        SentiWordNet: SentiWordNet [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is a lexical resource devised to support
Sentiment Analysis applications. It provides an annotation based on three numerical
sentiment scores (positivity, negativity, neutrality) for each WordNet synset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Clearly, given that this lexical resource provides a synset-based sentiment
representation, different senses of the same term may have different sentiment scores.
As shown in Figure 1, the term terrible is provided with two different sentiment
associations. In this case, SentiWordNet needs to be coupled with a Word Sense
Disambiguation (WSD) algorithm to identify the most appropriate meaning.
      </p>
      <p>
        WordNet-Affect: WordNet-Affect [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is a linguistic resource for a lexical
representation of affective knowledge. It is an extension of WordNet which labels
affect-related synsets with affective concepts defined as A-Labels (e.g. the
term euphoria is labeled with the concept positive-emotion, the noun illness
is labeled with physical state, and so on). The mapping is performed on the
basis of a domain-independent hierarchy (a fragment is provided in Figure 2)
of affective labels automatically built relying on WordNet relationships.
      </p>
      <p>
        MPQA: MPQA Subjectivity Lexicon [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] provides a lexicon of 8,222 terms
(labeled as subjective expressions), gathered from several sources. This lexicon
contains a list of words, along with their POS tags, labeled with polarity
(positive, negative, neutral) and intensity (strong, weak).
      </p>
      <p>
        SenticNet: SenticNet [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a lexical resource for concept-level sentiment
analysis. It relies on Sentic Computing [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a novel multi-disciplinary paradigm
for Sentiment Analysis. Differently from the previously mentioned resources,
SenticNet is able to associate polarity and affective information also to complex
concepts such as accomplishing goal, celebrate special occasion and so on. At
present, SenticNet provides sentiment scores (in a range between -1 and 1) for
14,000 common sense concepts. The sentiment conveyed by each term is defined
on the basis of the intensity of sixteen basic emotions, defined in a model called
Hourglass of Emotions (see Figure 3).
      </p>
    </sec>
    <sec id="sec-4">
      <title>Methodology</title>
      <p>Typically, lexicon-based approaches for sentiment classification are based on the
insight that the polarity of a piece of text can be obtained from
the polarity of the words which compose it. However, due to the complexity of
natural languages, such a simple approach is likely to fail since many facets of the
language (e.g., the presence of negation) are not taken into account. As a
consequence, we propose a more fine-grained approach: given a Tweet T, we split
it in several micro-phrases m<sub>1</sub>, ..., m<sub>n</sub> according to the splitting cues occurring in
the content. As splitting cues we used punctuation, adverbs and conjunctions.
Whenever a splitting cue is found in the text, a new micro-phrase is built.</p>
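      <p>The splitting step can be illustrated with a minimal Python sketch. The cue words below are placeholders: the paper names the cue categories (punctuation, adverbs, conjunctions) but not the actual word list, so <monospace>CUE_WORDS</monospace>, <monospace>PUNCTUATION</monospace> and <monospace>split_micro_phrases</monospace> are hypothetical names introduced only for illustration.</p>
      <preformat>
```python
import re

# Hypothetical cue list: the paper gives the cue categories, not the words.
CUE_WORDS = {"and", "but", "or", "however", "although", "though"}
PUNCTUATION = re.compile(r"[.,;:!?]+")


def split_micro_phrases(tweet):
    """Split a tweet into micro-phrases (lists of lowercase tokens)."""
    # Turn every punctuation run into a standalone "|" marker token.
    tokens = PUNCTUATION.sub(" | ", tweet.lower()).split()
    phrases, current = [], []
    for token in tokens:
        if token == "|" or token in CUE_WORDS:
            # A splitting cue closes the current micro-phrase.
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(token)
    if current:
        phrases.append(current)
    return phrases


print(split_micro_phrases("I love this phone, but the battery is not good!"))
```
      </preformat>
      <p>In this sketch the cues themselves are dropped from the output; whether the original system keeps them inside the micro-phrases is not stated.</p>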
      <sec id="sec-4-1">
        <title>Description of the approach</title>
        <p>
          Given such a representation, we define the sentiment S conveyed by a Tweet T as
the sum of the polarity conveyed by each of the micro-phrases m<sub>i</sub> which compose
it. In turn, the polarity of each micro-phrase depends on the sentiment score
of each term in the micro-phrase, labeled as score(t<sub>j</sub>), which is obtained from
one of the lexical resources described above. In this preliminary formulation of
the approach we did not take into account any valence shifters [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] except
negation. When a negation is found in the text, the polarity of the whole
micro-phrase is inverted. No heuristics have been adopted to deal with
language intensifiers and downtoners, or to detect irony [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
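        <p>Fig. 3. The Hourglass of Emotions</p>
        <p>We defined four different implementations of such an approach: basic,
normalized, emphasized and emphasized-normalized. In the basic formulation, the
sentiment of the Tweet is obtained by first summing the polarity of each
micro-phrase. Then, the score is normalized through the length of the whole Tweet.
In this case the micro-phrases are just exploited to invert the polarity when a
negation is found in the text.</p>
        <disp-formula><label>(1)</label><tex-math>S_{basic}(T) = \frac{\sum_{i=1}^{n} pol_{basic}(m_i)}{|T|}</tex-math></disp-formula>
        <disp-formula><label>(2)</label><tex-math>pol_{basic}(m_i) = \sum_{j=1}^{k} score(t_j)</tex-math></disp-formula>
        <p>In the normalized formulation, the micro-phrase-level scores are normalized
by using the length of the single micro-phrase, in order to weigh the
micro-phrases differently according to their length.</p>
        <disp-formula><label>(3)</label><tex-math>S_{norm}(T) = \sum_{i=1}^{n} pol_{norm}(m_i)</tex-math></disp-formula>
        <disp-formula><label>(4)</label><tex-math>pol_{norm}(m_i) = \frac{\sum_{j=1}^{k} score(t_j)}{|m_i|}</tex-math></disp-formula>
        <p>The emphasized version is an extension of the basic formulation which gives
a bigger weight to the terms t<sub>j</sub> belonging to specific POS categories:</p>
        <disp-formula><label>(5)</label><tex-math>S_{emph}(T) = \frac{\sum_{i=1}^{n} pol_{emph}(m_i)}{|T|}</tex-math></disp-formula>
        <disp-formula><label>(6)</label><tex-math>pol_{emph}(m_i) = \sum_{j=1}^{k} score(t_j) \cdot w_{pos(t_j)}</tex-math></disp-formula>
        <p>where w<sub>pos(t<sub>j</sub>)</sub> is greater than 1 if pos(t<sub>j</sub>) = adverbs, verbs, adjectives,
otherwise 1.</p>
        <p>Finally, the emphasized-normalized is just a combination of the second
and third version of the approach:</p>
        <disp-formula><label>(7)</label><tex-math>S_{emphNorm}(T) = \sum_{i=1}^{n} pol_{emphNorm}(m_i)</tex-math></disp-formula>
        <disp-formula><label>(8)</label><tex-math>pol_{emphNorm}(m_i) = \frac{\sum_{j=1}^{k} score(t_j) \cdot w_{pos(t_j)}}{|m_i|}</tex-math></disp-formula>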
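        <p>The four variants differ only in two switches (emphasis weight and micro-phrase normalization), so they can be sketched with one pair of Python functions. This is an illustrative reading of the formulation, not the authors' code: the negation list, the POS tag names, the choice of counting |T| as the number of tokens across all micro-phrases, and the function names are all assumptions.</p>
        <preformat>
```python
NEGATIONS = {"not", "no", "never"}        # hypothetical negation cues
EMPHASIZED_POS = {"ADV", "VERB", "ADJ"}   # adverbs, verbs, adjectives


def micro_phrase_polarity(phrase, score, emphasis=1.0, normalize=False):
    """phrase: list of (token, pos) pairs; score: callable token -> float."""
    total = 0.0
    for token, pos in phrase:
        weight = emphasis if pos in EMPHASIZED_POS else 1.0
        total += score(token) * weight
    if normalize and phrase:
        total /= len(phrase)              # divide by |m_i|
    # Negation handling: invert the polarity of the whole micro-phrase.
    if any(token in NEGATIONS for token, _ in phrase):
        total = -total
    return total


def tweet_sentiment(phrases, score, emphasis=1.0, normalize=False):
    """Sum micro-phrase polarities; basic/emphasized variants divide by |T|."""
    total = sum(micro_phrase_polarity(p, score, emphasis, normalize)
                for p in phrases)
    if not normalize:
        n_tokens = sum(len(p) for p in phrases)   # assumed definition of |T|
        total = total / n_tokens if n_tokens else 0.0
    return total


# Toy lexicon with made-up scores, for illustration only.
toy = {"love": 1.0, "good": 0.5}
score = lambda token: toy.get(token, 0.0)
phrases = [[("i", "PRON"), ("love", "VERB"), ("it", "PRON")],
           [("not", "ADV"), ("good", "ADJ")]]
for name, kwargs in [("basic", {}),
                     ("normalized", {"normalize": True}),
                     ("emphasized", {"emphasis": 1.5}),
                     ("emphasized-normalized", {"emphasis": 1.5, "normalize": True})]:
    print(name, tweet_sentiment(phrases, score, **kwargs))
```
        </preformat>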
      </sec>
      <sec id="sec-4-2">
        <title>Lexicon-based Score Determination</title>
        <p>Regardless of the variant which is adopted, the effectiveness of the whole
approach strictly depends on the way score(t<sub>j</sub>) is calculated. For each lexical
resource, a different way to determine the sentiment score is adopted.</p>
        <p>As regards SentiWordNet, t<sub>j</sub> is processed through an NLP pipeline to get its
POS tag. Next, all the synsets of the term with that POS are extracted.
Finally, score(t<sub>j</sub>) is calculated as the weighted average of all the sentiment scores
of the synsets.</p>
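        <p>The weighting scheme for the SentiWordNet average is not specified. The sketch below assumes two things that are not in the text: the sentiment of a synset is taken as positivity minus negativity, and each synset is weighted by the inverse of its sense rank, so that more frequent senses count more. The input scores are invented for illustration.</p>
        <preformat>
```python
def sentiwordnet_score(synset_scores):
    """synset_scores: (positivity, negativity) pairs in sense-rank order."""
    weighted_sum = 0.0
    weight_total = 0.0
    for rank, (positivity, negativity) in enumerate(synset_scores, start=1):
        weight = 1.0 / rank               # assumed rank-based weighting
        weighted_sum += weight * (positivity - negativity)
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


# Two invented senses of one adjective, most frequent sense first.
print(sentiwordnet_score([(0.0, 0.625), (0.125, 0.25)]))
```
        </preformat>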
        <p>If WordNet-Affect is chosen as lexical resource, the algorithm tries to map the
term t<sub>j</sub> to one of the nodes of the affective hierarchy. The hierarchy is climbed
until a match is obtained. In that case, the term inherits the sentiment score
(extracted from SentiWordNet) of the A-Label it matches. Otherwise, it is
ignored.</p>
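        <p>One possible reading of the hierarchy climb is sketched below: a term is first mapped to its most specific affective label, then the parents of that label are visited until one with an associated score is found. The hierarchy fragment, the labels with scores and the numeric values are toy data invented for the example, and the function name is hypothetical.</p>
        <preformat>
```python
# Toy fragment of an affective hierarchy (child label -> parent label).
PARENT = {"euphoria": "joy", "joy": "positive-emotion",
          "positive-emotion": "emotion"}
# Labels for which a sentiment score is available (invented value).
LABEL_SCORE = {"positive-emotion": 0.6}
# Term-to-label mapping (invented entry).
TERM_LABEL = {"euphoric": "euphoria"}


def wordnet_affect_score(term):
    """Climb the hierarchy from the term's label; None means 'ignore term'."""
    label = TERM_LABEL.get(term)
    while label is not None:
        if label in LABEL_SCORE:
            return LABEL_SCORE[label]
        label = PARENT.get(label)         # move one level up
    return None


print(wordnet_affect_score("euphoric"), wordnet_affect_score("table"))
```
        </preformat>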
        <p>The determination of the score with MPQA is quite straightforward,
since the algorithm first associates the correct POS tag to the term t<sub>j</sub>, then
looks for it in the lexicon. If found, the term is assigned a score
according to its categorical label.</p>
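        <p>The numeric values assigned to the MPQA categorical labels are not reported, so the mapping below is purely an assumption chosen for illustration, as are the two toy lexicon entries and the function name.</p>
        <preformat>
```python
# Assumed mapping from (polarity, intensity) to a numeric score.
LABEL_TO_SCORE = {
    ("positive", "strong"): 1.0,
    ("positive", "weak"): 0.5,
    ("neutral", "strong"): 0.0,
    ("neutral", "weak"): 0.0,
    ("negative", "weak"): -0.5,
    ("negative", "strong"): -1.0,
}

# Toy lexicon keyed by (term, POS tag); entries are invented.
MPQA_TOY = {("good", "ADJ"): ("positive", "weak"),
            ("hate", "VERB"): ("negative", "strong")}


def mpqa_score(term, pos, lexicon=MPQA_TOY):
    """Look up the POS-tagged term; terms not in the lexicon score 0."""
    label = lexicon.get((term, pos))
    return LABEL_TO_SCORE[label] if label is not None else 0.0


print(mpqa_score("good", "ADJ"), mpqa_score("hate", "VERB"))
```
        </preformat>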
        <p>A similar approach is performed for SenticNet, since the knowledge base
is queried and the polarity associated with the term is obtained. However, given
that SenticNet also models common sense concepts, the algorithm tries to match
more complex expressions (such as bigrams and trigrams) before looking for simple
unigrams.</p>
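        <p>Matching longer expressions first can be sketched as a greedy left-to-right scan. The in-memory dictionary stands in for the SenticNet knowledge base (the real resource was queried remotely), and its entries and polarity values are invented; the function name is hypothetical.</p>
        <preformat>
```python
def senticnet_scores(tokens, concepts, max_n=3):
    """Return (matched expression, polarity) pairs, preferring longer n-grams."""
    matches = []
    i = 0
    while len(tokens) > i:
        for n in range(max_n, 0, -1):     # trigrams, then bigrams, then unigrams
            candidate = " ".join(tokens[i:i + n])
            if len(tokens) >= i + n and candidate in concepts:
                matches.append((candidate, concepts[candidate]))
                i += n                    # skip the tokens just consumed
                break
        else:
            i += 1                        # no concept starts at this token
    return matches


toy_concepts = {"celebrate special occasion": 0.7, "special": 0.3, "happy": 0.8}
print(senticnet_scores(["we", "celebrate", "special", "occasion", "happy"],
                       toy_concepts))
```
        </preformat>
        <p>Because the trigram is consumed as a whole, the unigram "special" inside it is not scored a second time.</p>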
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental Evaluation</title>
      <p>In the experimental session we evaluated the effectiveness of the lexical
resources described above in the task of sentiment classification of microblog posts.
Specifically, we evaluated the accuracy of our lexicon-based approach by varying both
the four lexical resources and the four versions of the algorithm.</p>
      <p>
        Dataset and Experimental Design: experiments were performed by
exploiting the SemEval-2013 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and Stanford Twitter Sentiment (STS) datasets [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
The SemEval-2013<sup>1</sup> dataset consists of 14,435 Tweets already split in training (8,180
Tweets) and test data (3,255). Tweets have been manually annotated and are
classified as positive, neutral and negative. The STS dataset contains more than
1,600,000 Tweets, already split in training and test sets, but the test set is
considerably smaller than the training set (only 359 Tweets). In this case Tweets have been
collected through the Twitter APIs<sup>2</sup> and automatically labeled according to the
emoticons they contained.
      </p>
      <p>Even if our approach can work in a totally unsupervised manner, we used
training data to learn positive and negative classification thresholds through
a simple greedy strategy. For SemEval-2013 all the data were used to learn
the thresholds, while for STS only 10,000 random Tweets were exploited, due
to computational issues. As regards the emphasis-based approach, the boosting
factor w is set to 1.5 after a rough tuning (the score of adjectives, adverbs and
nouns is increased by 50%).</p>
      <sec id="sec-5-1">
        <title>1 www.cs.york.ac.uk/semeval-2013/task2/ 2 https://dev.twitter.com/</title>
        <p>
          As regards the lexical resources, the latest versions of
MPQA, SentiWordNet and WordNet-Affect were downloaded, while SenticNet
was invoked through the available REST APIs<sup>3</sup>. Some statistics about the
coverage of the lexical resources are provided in Table 1. For POS tagging
of Tweets, we adopted TwitterNLP<sup>4</sup> [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], a resource specifically developed for
POS tagging of microblog posts. Finally, the effectiveness of the approaches was
evaluated by calculating both accuracy and F1-measure [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] on the test sets, while
statistical significance was assessed through McNemar's test<sup>5</sup>.
        </p>
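        <p>The threshold-learning step is described only as a simple greedy strategy on training data. One way such a strategy could look is a coordinate-wise search that moves one threshold at a time and keeps a move only when training accuracy improves. The search procedure, the candidate grid and all function names below are assumptions, and the scores and labels are toy data.</p>
        <preformat>
```python
def classify(score, neg_threshold, pos_threshold):
    """Three-class decision from a sentiment score and two thresholds."""
    if score > pos_threshold:
        return "positive"
    if neg_threshold > score:
        return "negative"
    return "neutral"


def accuracy(scores, labels, neg_threshold, pos_threshold):
    hits = sum(classify(s, neg_threshold, pos_threshold) == y
               for s, y in zip(scores, labels))
    return hits / len(labels)


def learn_thresholds(scores, labels, candidates):
    """Greedy coordinate search: accept a move only if accuracy increases."""
    neg_t, pos_t = min(candidates), max(candidates)
    best = accuracy(scores, labels, neg_t, pos_t)
    improved = True
    while improved:
        improved = False
        for c in candidates:
            if c >= neg_t:                # try c as the positive threshold
                acc = accuracy(scores, labels, neg_t, c)
                if acc > best:
                    best, pos_t, improved = acc, c, True
            if pos_t >= c:                # try c as the negative threshold
                acc = accuracy(scores, labels, c, pos_t)
                if acc > best:
                    best, neg_t, improved = acc, c, True
    return neg_t, pos_t, best


toy_scores = [-0.8, -0.5, -0.05, 0.0, 0.1, 0.6, 0.9]
toy_labels = ["negative", "negative", "neutral", "neutral", "neutral",
              "positive", "positive"]
print(learn_thresholds(toy_scores, toy_labels,
                       [-0.6, -0.3, -0.1, 0.0, 0.2, 0.4, 0.7]))
```
        </preformat>
        <p>The loop stops as soon as a full pass over the candidates yields no gain, so it may settle on a local optimum; that is the usual trade-off of a greedy search against an exhaustive one.</p>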
        <p>Discussion of the Results: results of the experiments on SemEval-2013
data are provided in Figure 4. Due to space reasons, we only report accuracy
scores. Results show that the best-performing configuration is the one based
on SentiWordNet which exploits both emphasis and normalization. By
comparing all the variants, it emerges that the introduction of emphasis leads to
an improvement in 7 out of 8 comparisons (0.4% on average). Differences are
statistically significant only by considering the introduction of emphasis on the
normalized approach with SenticNet (p &lt; 0.0001) and SentiWordNet (p &lt; 0.0008).
On the other hand, the introduction of normalization leads to an improvement
only in 1 out of 4 comparisons, by using the WordNet-Affect resource (p &lt; 0.04).
By comparing the effectiveness of the different lexical resources, it emerges that
SentiWordNet performs significantly better than both SenticNet and
WordNet-Affect (p &lt; 0.0001). However, even if the gap with MPQA appears quite large
(0.7%, from 58.24 to 58.98), the difference is not statistically significant (p &lt; 0.5).
To sum up, the analysis performed on SemEval-2013 showed that SentiWordNet
and MPQA are the best-performing lexical resources on such data.</p>
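        <p>The p-values above come from McNemar's test, which compares two classifiers on the same test items using only the items on which they disagree. The text does not say which form of the test was used; the sketch below shows the exact (binomial) two-sided version, with a hypothetical function name and toy inputs.</p>
        <preformat>
```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-item correctness flags."""
    # b: items only classifier A gets right; c: items only B gets right.
    b = sum(1 for a, bb in zip(correct_a, correct_b) if a and not bb)
    c = sum(1 for a, bb in zip(correct_a, correct_b) if bb and not a)
    n = b + c
    if n == 0:
        return 1.0                        # the classifiers never disagree
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least this unbalanced.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy example: A alone is right on 8 items, B alone on 2 items.
a_flags = [True] * 8 + [False] * 2
b_flags = [False] * 8 + [True] * 2
print(mcnemar_exact(a_flags, b_flags))
```
        </preformat>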
        <p>Figure 5 shows the results of the approaches on the STS dataset. Due to the
small number of Tweets in the test set, results have a smaller statistical
significance. In this case, the best-performing lexical resource is SenticNet, which
obtained 74.65% accuracy, greater than that obtained by the other
lexical resources. However, the gap is statistically significant only if compared to
WordNet-Affect (p &lt; 0.00001) and almost significant with respect to MPQA
(p &lt; 0.11). Finally, even if the gap with SentiWordNet is around 2% (72.42%
accuracy), the difference does not seem statistically significant (p &lt; 0.42).</p>
      </sec>
      <sec id="sec-5-2">
        <title>3 http://sentic.net/api/ 4 http://www.ark.cs.cmu.edu/TweetNLP/ 5 http://en.wikipedia.org/wiki/McNemar's_test</title>
        <p>Differently from SemEval-2013 data, it emerges that the introduction of emphasis
leads to an improvement only in 2 comparisons (+0.28% only on MPQA and
WordNet-Affect), while in all the other cases no improvement was noted. The
introduction of normalization produced an improvement in 3 out of 4 comparisons
(average improvement of 0.6%, peak of 1.2% on MPQA). In all these cases, no
statistical differences emerged on varying the approaches on the same lexical
resource.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>In this paper we provided a thorough comparison of lexicon-based approaches for
sentiment classification of microblog posts. Specifically, four widespread lexical
resources and four different variants of our algorithm have been evaluated against
two state-of-the-art datasets.</p>
      <p>Even if the results have been somewhat conflicting, some interesting
behavioral patterns were noted: MPQA and SentiWordNet emerged as the
best-performing lexical resources on those data. This is an interesting outcome since
even a resource with a smaller coverage such as MPQA can produce results which are
comparable to a general-purpose lexicon such as SentiWordNet. This is probably due
to the fact that subjective terms, which MPQA strongly relies on, play a key role
in sentiment classification. On the other hand, results obtained by
WordNet-Affect were not good. This is partially due to the very small coverage of the
lexicon, but it is likely that the choice of basing sentiment classification only on
affective features filters out a lot of relevant terms. Finally, results obtained by
SenticNet were really interesting since it was the best-performing configuration
on STS and the worst-performing one on SemEval data. Further analysis of the
results showed that this behavior was due to the fact that SenticNet can hardly
classify neutral Tweets (only 20% accuracy on that data), and this negatively
affected the overall results on a three-class classification task. Further analyses
are needed to investigate this behavior.</p>
      <p>As future work, we will extend the analysis by evaluating more lexical
resources as well as more datasets. Moreover, we will refine our technique for
threshold learning and we will try to improve our algorithm by modeling more
complex syntactic structures as well as by introducing a word-sense
disambiguation strategy to make our approach semantics-aware.</p>
      <p>Acknowledgments. This work fulfils the research objectives of the project
"VINCENTE - A Virtual collective INtelligenCe ENvironment to develop
sustainable Technology Entrepreneurship ecosystems" funded by the Italian
Ministry of University and Research (MIUR).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Baccianella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Esuli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fabrizio</given-names>
            <surname>Sebastiani</surname>
          </string-name>
          .
          <article-title>SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining</article-title>
          .
          <source>In Proceedings of LREC</source>
          , volume
          <volume>10</volume>
          , pages
          <fpage>2200</fpage>
          –
          <lpage>2204</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Erik</given-names>
            <surname>Cambria</surname>
          </string-name>
          and
          <string-name>
            <given-names>Amir</given-names>
            <surname>Hussain</surname>
          </string-name>
          . Sentic computing. Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Erik</surname>
            <given-names>Cambria</given-names>
          </string-name>
          , Daniel Olsher, and
          <string-name>
            <given-names>Dheeraj</given-names>
            <surname>Rajagopal</surname>
          </string-name>
          .
          <article-title>Senticnet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis</article-title>
          .
          <source>AAAI, Quebec City</source>
          , pages
          <volume>1515</volume>
          –
          <fpage>1521</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Xiaowen</given-names>
            <surname>Ding</surname>
          </string-name>
          , Bing Liu, and
          <string-name>
            <surname>Philip S Yu</surname>
          </string-name>
          .
          <article-title>A holistic lexicon-based approach to opinion mining</article-title>
          .
          <source>In Proceedings of the 2008 International Conference on Web Search and Data Mining</source>
          , pages
          <volume>231</volume>
          –
          <fpage>240</fpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Alec</given-names>
            <surname>Go</surname>
          </string-name>
          , Richa Bhayani, and
          <string-name>
            <given-names>Lei</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <article-title>Twitter sentiment classification using distant supervision</article-title>
          .
          <source>CS224N Project Report</source>
          , Stanford, pages
          <volume>1</volume>
          –
          <fpage>12</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Xia</given-names>
            <surname>Hu</surname>
          </string-name>
          , Lei Tang,
          <string-name>
            <given-names>Jiliang</given-names>
            <surname>Tang</surname>
          </string-name>
          , and Huan Liu.
          <article-title>Exploiting social relations for sentiment analysis in microblogging</article-title>
          .
          <source>In Proceedings of the sixth ACM international conference on Web search and data mining</source>
          , pages
          <volume>537</volume>
          –
          <fpage>546</fpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Alistair</given-names>
            <surname>Kennedy</surname>
          </string-name>
          and
          <string-name>
            <given-names>Diana</given-names>
            <surname>Inkpen</surname>
          </string-name>
          .
          <article-title>Sentiment classification of movie reviews using contextual valence shifters</article-title>
          .
          <source>Computational Intelligence</source>
          ,
          <volume>22</volume>
          (
          <issue>2</issue>
          ):
          <volume>110</volume>
          –
          <fpage>125</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Bing</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lei</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>A survey of opinion mining and sentiment analysis</article-title>
          .
          <source>In Mining Text Data</source>
          , pages
          <volume>415</volume>
          –
          <fpage>463</fpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>George</surname>
            <given-names>A</given-names>
          </string-name>
          Miller.
          <article-title>WordNet: a lexical database for english</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>38</volume>
          (
          <issue>11</issue>
          ):
          <volume>39</volume>
          –
          <fpage>41</fpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Preslav</surname>
            <given-names>Nakov</given-names>
          </string-name>
          , Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, and Theresa Wilson.
          <article-title>SemEval-2013 task 2: Sentiment analysis in Twitter</article-title>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Olutobi</given-names>
            <surname>Owoputi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Brendan</given-names>
            <surname>O'Connor</surname>
          </string-name>
          , Chris Dyer, Kevin Gimpel,
          <string-name>
            <given-names>Nathan</given-names>
            <surname>Schneider</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Noah A.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <article-title>Improved part-of-speech tagging for online conversational text with word clusters</article-title>
          .
          <source>In HLT-NAACL</source>
          , pages
          <volume>380</volume>
          –
          <fpage>390</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Pak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Paroubek</surname>
          </string-name>
          .
          <article-title>Twitter as a corpus for sentiment analysis and opinion mining</article-title>
          .
          <source>In LREC</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Bo</given-names>
            <surname>Pang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lillian</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <article-title>Opinion mining and sentiment analysis</article-title>
          .
          <source>Foundations and trends in information retrieval</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          -2):1–
          <fpage>135</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Antonio Reyes, Paolo Rosso, and
          <string-name>
            <given-names>Tony</given-names>
            <surname>Veale</surname>
          </string-name>
          .
          <article-title>A multidimensional approach for detecting irony in twitter</article-title>
          .
          <source>Language Resources and Evaluation</source>
          ,
          <volume>47</volume>
          (
          <issue>1</issue>
          ):
          <volume>239</volume>
          –
          <fpage>268</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Sara</surname>
            <given-names>Rosenthal</given-names>
          </string-name>
          , Preslav Nakov, Alan Ritter, and
          <string-name>
            <given-names>Veselin</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          .
          <article-title>Semeval-2014 task 9: Sentiment analysis in twitter</article-title>
          .
          <source>Proc. SemEval</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Fabrizio</given-names>
            <surname>Sebastiani</surname>
          </string-name>
          .
          <article-title>Machine learning in automated text categorization</article-title>.
          <source>ACM Computing Surveys (CSUR)</source>
          ,
          <volume>34</volume>
          (
          <issue>1</issue>
          ):1–
          <fpage>47</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Carlo</given-names>
            <surname>Strapparava</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Valitutti</surname>
          </string-name>
          .
          <article-title>WordNet-Affect: an affective extension of WordNet</article-title>
          .
          <source>In LREC</source>
          , volume
          <volume>4</volume>
          , pages
          <fpage>1083</fpage>
          –
          <fpage>1086</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Maite</surname>
            <given-names>Taboada</given-names>
          </string-name>
          , Julian Brooke, Milan Tofiloski, Kimberly Voll, and
          <string-name>
            <given-names>Manfred</given-names>
            <surname>Stede</surname>
          </string-name>
          .
          <article-title>Lexicon-based methods for sentiment analysis</article-title>
          .
          <source>Computational linguistics</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <volume>267</volume>
          –
          <fpage>307</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Janyce</surname>
            <given-names>Wiebe</given-names>
          </string-name>
          , Theresa Wilson, and
          <string-name>
            <given-names>Claire</given-names>
            <surname>Cardie</surname>
          </string-name>
          .
          <article-title>Annotating expressions of opinions and emotions in language</article-title>
          .
          <source>Language resources and evaluation</source>
          ,
          <volume>39</volume>
          (
          <issue>2-3</issue>
          ):
          <volume>165</volume>
          –
          <fpage>210</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>