<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Sentiment Analysis for Real-time Applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Javi Fernandez</string-name>
          <email>javifm@ua.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Alicante</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we present a supervised hybrid approach for Sentiment Analysis in Real-time Applications. The main goal of this work is to design an approach which employs very few resources but obtains near state-of-the-art results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Recent years have seen the birth of Social Networks and Web 2.0. They have made it easier for people to share aspects of, and opinions about, their everyday life. This subjective information can be interesting for general users, brands and organisations. However, the vast amount of information (for example, over 500 million messages per day on Twitter<sup>1</sup>) makes it difficult for traditional sentiment analysis systems to process this subjective information in real time. The performance of sentiment analysis tools has therefore become increasingly critical.</p>
      <p>The main goal of our work is to design a sentiment analysis approach oriented to real-time applications: an approach that balances efficiency and quality. It must employ very few resources, in order to process as many texts as possible. This will also make sentiment analysis more accessible for everybody. In addition, the quality of the approach should be close to state-of-the-art results. In the following sections we explain our approach in detail. Section 2 briefly describes the related work in the field and introduces our work. In Section 3 we detail the approach we propose. Finally, Section 4 concludes the paper and outlines future work.</p>
      <p>This research work has been partially funded by the University of Alicante, Generalitat Valenciana, Spanish Government, Ministerio de Educación, Cultura y Deporte and Ayudas Fundación BBVA a equipos de investigación científica 2016 through the projects TIN2015-65100-R, TIN2015-65136-C2-2-R, PROMETEOII/2014/001, GRE1601: Plataforma inteligente para recuperación, análisis y representación de la información generada por usuarios en Internet and Análisis de Sentimientos Aplicado a la Prevención del Suicidio en las Redes Sociales (ASAP).</p>
      <p><sup>1</sup> www.internetlivestats.com/twitter-statistics</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Two main approaches can be followed: machine learning and lexicon-based
        <xref ref-type="bibr" rid="ref11 ref15 ref16 ref19 ref2 ref21 ref3 ref6 ref8 ref9">(Taboada et al., 2011; Medhat, Hassan, and Korashy, 2014; Mohammad, 2015; Ravi and Ravi, 2015)</xref>
        . Machine learning approaches treat polarity classification as a text categorisation problem. Texts are usually represented as vectors of features, and the quality of the results depends on the features chosen. If a labelled training set of documents is needed, the approach is defined as supervised learning; if not, it is defined as unsupervised learning. These approaches perform very well in the domain they are trained on, but their performance drops when the same classifier is used in a different domain
        <xref ref-type="bibr" rid="ref22">(Pang and Lee, 2008; Tan et al., 2009)</xref>
        . In addition, if the number of features is large, efficiency drops dramatically. Lexicon-based approaches make use of dictionaries of opinionated words and phrases to discern the polarity of a text. In these approaches, each word in the dictionary is assigned a score for each sentiment (e.g. positivity and negativity). To detect the polarity of a text, the scores of its words are combined, and the polarity with the greatest score is chosen. These dictionaries can be generated manually
        <xref ref-type="bibr" rid="ref23">(Tong, 2001)</xref>
        , semi-automatically from an initial seed of opinionated words
        <xref ref-type="bibr" rid="ref1 ref14">(Kim, Rey, and Hovy, 2004; Baccianella, Esuli, and Sebastiani, 2010)</xref>
        , or automatically from a labelled dataset
        <xref ref-type="bibr" rid="ref1 ref13 ref4">(Jijkoun, de Rijke, and Weerkamp, 2010; Cruz et al., 2013)</xref>
        . The major disadvantage of lexicon-based approaches is their inability to find opinion words with domain- and context-specific orientations, although the last generation method (automatic, from a labelled dataset) helps to solve this problem
        <xref ref-type="bibr" rid="ref15 ref2 ref3 ref6 ref8">(Medhat, Hassan, and Korashy, 2014)</xref>
        . Lexicon-based approaches are usually faster than machine learning ones, as the combination of scores is normally a predefined mathematical function. We decided to use a hybrid approach, trying to take advantage of the categorisation quality of machine learning and the speed of lexicon-based approaches.
      </p>
      <p>
        Most current sentiment analysis approaches employ words, n-grams and phrases as information units for their models, either as features for machine learning approaches or as dictionary entries in lexicon-based approaches. However, words and n-grams struggle to represent the flexibility and sequentiality of human language. This is the reason why we decided to use skipgrams. Skipgram modelling is a technique whereby n-grams are formed (bigrams, trigrams, etc.), but in addition to using adjacent sequences of words, it also allows some words to be skipped
        <xref ref-type="bibr" rid="ref10">(Guthrie et al., 2006)</xref>
        . In this way, skipgrams are new terms that retain part of the sequentiality of the terms, but in a more flexible way than n-grams
        <xref ref-type="bibr" rid="ref5 ref6 ref8">(Fernandez et al., 2014)</xref>
        . Note that an n-gram can be defined as a 0-skip-n-gram, that is, a skipgram where the number of allowed skips k is 0. For example, the sentence "I love healthy food" has two word-level trigrams: "I love healthy" and "love healthy food". However, there is one important trigram implied by the sentence that was not captured: "I love food". The use of skipgrams allows the word "healthy" to be skipped, providing the mentioned trigram.
      </p>
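      <p>The skipgram extraction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function name <monospace>skipgrams</monospace> and its signature are assumptions, and the number of skips of a term is taken to be the number of words left out between its first and last word.</p>
      <preformat><![CDATA[
```python
from itertools import combinations


def skipgrams(tokens, n, k):
    """Yield (term, skips) pairs for all terms of 1..n words with at most k skips.

    A term is a tuple of words taken in order from `tokens`; `skips` is the
    number of words left out between the first and the last word of the term
    (an assumed reading of "number of skips").
    """
    for size in range(1, n + 1):
        for start in range(len(tokens) - size + 1):
            # The last word may lie at most size - 1 + k positions after the first.
            stop = min(len(tokens), start + size + k)
            for rest in combinations(range(start + 1, stop), size - 1):
                positions = (start,) + rest
                skips = positions[-1] - positions[0] + 1 - size
                yield tuple(tokens[i] for i in positions), skips


sentence = "I love healthy food".split()
# Keep only the three-word terms, with up to one skipped word.
trigrams = {term: skips for term, skips in skipgrams(sentence, 3, 1) if len(term) == 3}
```
]]></preformat>
      <p>With k = 0 the function yields only the two adjacent trigrams of the example sentence; with k = 1 it also yields ("I", "love", "food"), marked as having one skip.</p>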
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>Our contribution consists of a hybrid approach which creates a lexicon from a labelled dataset and builds a polarity classifier from the dataset and the generated lexicon using machine learning techniques. Its architecture can be seen in Figure 1. In the following subsections we explain the different parts of our approach in detail.</p>
      <p>Figure 1: Architecture of the approach. Components: Dataset, Tokenisation, Lexicon Generation, Lexicon, Supervised Learning, Classifier.</p>
      <p>3.1 Tokenisation</p>
      <p>
We tried to employ the minimum number of external linguistic tools, both to minimise the possible propagation of external errors and to avoid the extra time they can consume. The tokenisation process starts by obtaining all the words in the text. We only extract words containing alphabetic characters. Numbers, punctuation symbols and emoticons are not considered at this moment, but we are studying the best way to include them in the future. The only external resource we employ for the tokenisation process is a stemmer, used to obtain the most general form of the words we extracted. We preferred a stemmer over a lemmatiser because stemmers are much faster
        <xref ref-type="bibr" rid="ref15 ref2 ref3 ref6 ref8">(Balakrishnan and Lloyd-Yemoh, 2014)</xref>
        and require fewer resources, which is one of the goals of our approach. Specifically, we used the Snowball<sup>2</sup> implementation for each language.
      </p>
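      <p>As a rough illustration of this tokenisation step, the sketch below keeps only runs of alphabetic characters and passes each word through a stemming function. The names <monospace>tokenise</monospace> and <monospace>stem</monospace> are hypothetical, lower-casing is an added assumption, and the default stemmer is a placeholder that returns the word unchanged; a Snowball stemmer for the target language could be plugged in instead.</p>
      <preformat><![CDATA[
```python
import re

# Runs of Unicode letters: \w minus digits and the underscore.
WORD = re.compile(r"[^\W\d_]+")


def tokenise(text, stem=lambda word: word):
    """Return the alphabetic words of `text`, lower-cased and passed through `stem`.

    Numbers, punctuation and emoticons are dropped. Lower-casing is an
    assumption of this sketch; `stem` defaults to the identity function.
    """
    return [stem(word.lower()) for word in WORD.findall(text)]


tokens = tokenise("Worst movie EVER!!! 10/10 :)")
# tokens == ["worst", "movie", "ever"]
```
]]></preformat>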
      <p>Once we have the words in the text, we combine them using skipgram modelling to obtain multiword terms. We use two variables in this work: n is the maximum number of words when building a new term, and k is the maximum number of skips. Note that n = 3 includes all the terms with 1, 2 and 3 words, and k = 3 includes all the terms with up to 3 skips (including terms with no skips).</p>
      <p>3.2 Lexicon generation</p>
      <p>
In summary, our sentiment lexicon consists of a list of terms for each polarity, each with a score indicating how strongly that term is related to that polarity. To build this lexicon we need a polarity-labelled dataset, which provides both the terms in the lexicon and their scores. Many term scoring techniques exist
        <xref ref-type="bibr" rid="ref15 ref2 ref25 ref3 ref6 ref8">(Yang and Pedersen, 1997; Chandrashekar and Sahin, 2014)</xref>
        , and the majority of them employ probabilities to calculate the scores. However, they do not take full advantage of skipgram modelling, because they give the same importance to terms whose words were adjacent as to terms whose words were not adjacent (because some words were skipped). Because of this, we created our own scoring formula.
      </p>
      <p><sup>2</sup> snowball.tartarus.org
      </p>
      <p>First, we describe our counting formulas. Usually, to count the number of documents in which a term t occurs, we loop over the dataset and add 1 each time we find that term in a document. Instead, we add a value that decreases as the number of skips grows. This is what Equations 1 and 2 do, where D is the labelled dataset, |D| is the number of documents in D, d is a document in D, D<sub>p</sub> is the subset of documents in D labelled with polarity p, |t| is the number of words in term t, δ(t, d) is the number of skips of term t in document d, and [t ∈ d] is 1 if t occurs in d and 0 otherwise.</p>
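      <p>
        <disp-formula>
          <label>(1)</label>
          <tex-math><![CDATA[C(t) = \sum_{d \in D} [t \in d] \cdot \frac{|t|}{|t| + \delta(t, d)}]]></tex-math>
        </disp-formula>
      </p>
      <p>
        <disp-formula>
          <label>(2)</label>
          <tex-math><![CDATA[C(t, p) = \sum_{d \in D_p} [t \in d] \cdot \frac{|t|}{|t| + \delta(t, d)}]]></tex-math>
        </disp-formula>
      </p>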
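      <p>The skip-weighted counting of Equations 1 and 2 can be sketched as follows. This is an illustrative reading, not the original implementation: the helper names are hypothetical, and when a term occurs several times in the same document the occurrence with the fewest skips is used, which is an assumption the text does not state.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def min_skips(tokens, n, k):
    """Map each term (1..n words, at most k skips) to its smallest skip count in one document."""
    best = {}
    for size in range(1, n + 1):
        for start in range(len(tokens) - size + 1):
            stop = min(len(tokens), start + size + k)
            for rest in combinations(range(start + 1, stop), size - 1):
                positions = (start,) + rest
                term = tuple(tokens[i] for i in positions)
                skips = positions[-1] - positions[0] + 1 - size
                # Assumption: keep the occurrence with the fewest skips.
                best[term] = min(skips, best.get(term, skips))
    return best


def weighted_counts(dataset, n, k):
    """Return (C_t, C_tp) for an iterable of (tokens, polarity) pairs.

    Each document adds |t| / (|t| + skips) instead of 1 for every term it contains.
    """
    c_t = defaultdict(float)   # C(t), summed over the whole dataset
    c_tp = defaultdict(float)  # C(t, p), summed over documents labelled p
    for tokens, polarity in dataset:
        for term, skips in min_skips(tokens, n, k).items():
            weight = len(term) / (len(term) + skips)
            c_t[term] += weight
            c_tp[term, polarity] += weight
    return c_t, c_tp


# Toy dataset, invented for illustration only.
dataset = [
    ("worst movie ever".split(), "neg"),
    ("worst ever".split(), "neg"),
    ("best movie ever".split(), "pos"),
]
c_t, c_tp = weighted_counts(dataset, n=2, k=10)
```
]]></preformat>
      <p>In this toy dataset the term ("worst", "ever") contributes 2/3 for the first document, where one word is skipped, and 1 for the second, where the words are adjacent.</p>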
      <p>With these counting formulas, the number of skips is taken into account, and we can build our final scoring formula, shown in Equation 3, where s(t, p) is the score of term t for polarity p, and θ is a factor that gives more relevance to terms that appear a larger number of times. This factor depends on the size and the domain of the dataset.</p>
      <p>
        <disp-formula>
          <label>(3)</label>
          <tex-math><![CDATA[s(t, p) = \frac{C(t, p)}{C(t)} \cdot \frac{C(t, p)}{C(t, p) + \theta}]]></tex-math>
        </disp-formula>
      </p>
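      <p>A small sketch of the scoring formula may help to see the effect of the frequency factor. The function names and the parameter name <monospace>theta</monospace> are assumptions made for this example, and the counts used are invented.</p>
      <preformat><![CDATA[
```python
def score(c_tp, c_t, theta):
    """Score of a term for a polarity: C(t, p) / C(t) * C(t, p) / (C(t, p) + theta)."""
    if c_tp == 0:
        return 0.0  # the term never occurs with this polarity
    return (c_tp / c_t) * (c_tp / (c_tp + theta))


def build_lexicon(c_t, c_tp, theta):
    """Map every (term, polarity) key of the C(t, p) counts to its score."""
    return {(term, p): score(value, c_t[term], theta)
            for (term, p), value in c_tp.items()}


# Invented counts: a frequent term and a rare one, both strongly tied to one polarity.
frequent = score(c_tp=9.0, c_t=10.0, theta=1.0)
rare = score(c_tp=1.0, c_t=1.0, theta=1.0)
```
]]></preformat>
      <p>Although the rare term occurs only with one polarity, its score is lower than that of the frequent term, because the second fraction penalises small counts. Larger values of the factor make this penalty stronger.</p>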
      <p>
        At the end of this process we have a list of skipgrams with a score for each polarity: our sentiment lexicon. Table 1 shows an example of a dictionary built using the Movie Reviews dataset
        <xref ref-type="bibr" rid="ref18">(Pang, Lee, and Vaithyanathan, 2002)</xref>
        , with n = 2 and k = 10. In this example, we show only the best five terms for each polarity.
      </p>
      <sec id="sec-3-1">
        <title>Negative</title>
        <p>this mess
worst movie
is terrible
ludicrous
waste</p>
      </sec>
      <sec id="sec-3-2">
        <title>Positive</title>
        <p>outstanding
is terrific
finest
breathtaking
is excellent</p>
      </sec>
      <sec id="sec-3-3">
        <title>Score</title>
        <p>.862
.826
.823
.803
.795</p>
        <p>3.3 Supervised learning</p>
        <p>We use machine learning techniques to create a model able to classify the polarity of new texts. The documents in the dataset are employed as training instances, and the labelled polarities are used as categories. However, in contrast with usual text classification approaches, we do not create one feature per term; we create one feature per polarity. In other words, we have the same number of features and categories. Our hypothesis is that this number of features is enough to obtain a decent system quality with a low latency. The weight of each feature is calculated as specified in Equation 4, where w(d, p) is the weight of the feature for polarity p in document d.</p>
        <p>
          <disp-formula>
            <label>(4)</label>
            <tex-math><![CDATA[w(d, p) = \sum_{t \in d} s(t, p) \cdot \frac{|t|}{|t| + \delta(t, d)}]]></tex-math>
          </disp-formula>
        </p>
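        <p>The feature weighting can be sketched along the same lines. The code below is only an illustration under stated assumptions: the helper names are hypothetical, the lexicon is a plain dictionary keyed by (term, polarity), the scores are invented for the example and are not those of the Movie Reviews lexicon, and a repeated term is counted once with its smallest number of skips.</p>
        <preformat><![CDATA[
```python
from itertools import combinations


def min_skips(tokens, n, k):
    """Map each term (1..n words, at most k skips) to its smallest skip count."""
    best = {}
    for size in range(1, n + 1):
        for start in range(len(tokens) - size + 1):
            stop = min(len(tokens), start + size + k)
            for rest in combinations(range(start + 1, stop), size - 1):
                positions = (start,) + rest
                term = tuple(tokens[i] for i in positions)
                skips = positions[-1] - positions[0] + 1 - size
                best[term] = min(skips, best.get(term, skips))
    return best


def feature_weights(tokens, lexicon, polarities, n, k):
    """Return one weight per polarity: the skip-discounted sum of lexicon scores."""
    weights = dict.fromkeys(polarities, 0.0)
    for term, skips in min_skips(tokens, n, k).items():
        discount = len(term) / (len(term) + skips)
        for p in polarities:
            # Terms missing from the lexicon contribute nothing.
            weights[p] += lexicon.get((term, p), 0.0) * discount
    return weights


# Invented scores, for illustration only.
lexicon = {
    (("worst",), "negative"): 0.8,
    (("worst", "ever"), "negative"): 0.6,
    (("movie",), "positive"): 0.3,
}
weights = feature_weights("worst movie ever".split(), lexicon,
                          ["positive", "negative"], n=2, k=10)
```
]]></preformat>
        <p>Here "worst ever" is found with one skipped word, so its score is multiplied by 2/3 before being added to the negative weight. The resulting dictionary has exactly one entry per polarity, which is the whole feature vector passed to the classifier.</p>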
        <p>Table 2 shows an example of feature weighting for the text "worst movie ever", again using the scores of a dictionary built from the Movie Reviews dataset with n = 2 and k = 10. The final weights (positive = 1.48, negative = 3.40) are employed as feature weights for the machine learning process.</p>
        <p>
          To build our model we employed Support Vector Machines (SVM), as they have proved to be effective on text categorisation tasks
          <xref ref-type="bibr" rid="ref17 ref20 ref24 ref7">(Sebastiani, 2002; Mohammad, Kiritchenko, and Zhu, 2013)</xref>
          . Specifically, we used the Weka<sup>3</sup>
          <xref ref-type="bibr" rid="ref12">(Hall et al., 2009)</xref>
          default implementation with the default parameters (linear kernel, C = 1, ε = 0.1).
        </p>
        <p><sup>3</sup> www.cs.waikato.ac.nz/ml/weka</p>
        <p>Table 2: Feature weighting for the text "worst movie ever". Rows: worst; movie; ever; worst movie; worst ever; movie ever; weight (w).</p>
        <p>4 Conclusions</p>
        <p>In this paper we presented a supervised hybrid approach for Sentiment Analysis in Twitter. We built a sentiment lexicon from a polarity dataset using statistical measures. We employed skipgrams as information units, to enrich the sentiment lexicon with combinations of words that do not appear explicitly in the text. The lexicon created was used in conjunction with machine learning techniques to create a polarity classifier.</p>
        <p>Preliminary performance experiments have shown a speed acceptable for real-time applications<sup>4</sup>. Processing speeds range from 1,000 documents per second in the worst cases (long texts, large values of n and k) to 10,000 in the best cases (short texts, low values of n and k). These numbers are good enough to work with extensively used platforms like Twitter, where users generate over 500 million tweets per day (almost 6,000 tweets per second)<sup>5</sup>.</p>
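        <p>The per-second figure follows from dividing the daily volume by the number of seconds in a day. As a quick check of the arithmetic, assuming the traffic is spread evenly over the day:</p>
        <p>
          <disp-formula>
            <tex-math><![CDATA[\frac{500\,000\,000\ \text{tweets}}{24 \times 60 \times 60\ \text{s}} = \frac{5 \times 10^{8}}{86\,400\ \text{s}} \approx 5\,787\ \text{tweets per second}]]></tex-math>
          </disp-formula>
        </p>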
        <p>
          Moreover, experiments with different datasets have also obtained promising results
          <xref ref-type="bibr" rid="ref11 ref15 ref19 ref2 ref3 ref5 ref5 ref6 ref6 ref7 ref8 ref8 ref9 ref9">(Fernandez et al., 2013; Fernandez, Gomez, and Martínez-Barco, 2014; Fernandez et al., 2014; Gutierrez, Tomas, and Fernandez, 2015; Fernandez et al., 2015)</xref>
          . Experiments with the Movie Reviews dataset
          <xref ref-type="bibr" rid="ref18">(Pang, Lee, and Vaithyanathan, 2002)</xref>
          , which contains long texts in English with 2-level polarity, obtained an accuracy of 86.7%, while experiments with the TASS 2012 dataset
          <xref ref-type="bibr" rid="ref24 ref4 ref7">(Villena-Roman and García-Morera, 2013)</xref>
          , which contains Spanish tweets with 6-level polarity, obtained 64.7%.
        </p>
        <p><sup>4</sup> Using a MacBook Pro 2.4 GHz i5 with 8 GB RAM</p>
        <p><sup>5</sup> www.internetlivestats.com/twitter-statistics</p>
        <p>As future work, we plan to study new methods to calculate and combine the weights of the skipgrams. We also want to add more features to the machine learning algorithm, while keeping their number small in order to avoid increasing the latency. In addition, we want to include external resources and tools, such as knowledge from existing sentiment lexicons, but always with a focus on real-time applications. We will also extend our study to different corpora and domains, to confirm the robustness of the approach.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Baccianella</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Esuli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Sebastiani</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining</article-title>
          .
          <source>In LREC</source>
          , volume
          <volume>10</volume>
          , pages
          <fpage>2200</fpage>
          -
          <lpage>2204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Balakrishnan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Lloyd-Yemoh</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Stemming and lemmatization: a comparison of retrieval performances</article-title>
          .
          <source>Lecture Notes on Software Engineering</source>
          ,
          <volume>2</volume>
          (
          <issue>3</issue>
          ):
          <fpage>262</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Chandrashekar</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Sahin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A survey on feature selection methods</article-title>
          .
          <source>Computers &amp; Electrical Engineering</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ):
          <fpage>16</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Cruz</surname>
            ,
            <given-names>F. L.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Troyano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Enríquez</surname>
          </string-name>
          , F. J. Ortega, and
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Vallejo</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Long autonomy or long delay? the importance of domain in opinion mining</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>40</volume>
          (
          <issue>8</issue>
          ):
          <fpage>3174</fpage>
          -
          <lpage>3184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , and P. Martínez-Barco.
          <year>2014</year>
          .
          <article-title>A supervised approach for sentiment analysis using skipgrams</article-title>
          .
          <source>In 11th International Workshop on Natural Language Processing and Cognitive Science (NAACL).</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Martínez-Barco</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GPLSI: Supervised Sentiment Analysis in Twitter using Skipgrams</article-title>
          .
          <source>In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)</source>
          , pages
          <fpage>294</fpage>
          -
          <lpage>299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Martínez-Barco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Montoyo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Muñoz</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Sentiment analysis of Spanish tweets using a ranking algorithm and skipgrams</article-title>
          .
          <source>In XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural (SEPLN 2013)</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Martínez-Barco</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GPLSI: Supervised Sentiment Analysis in Twitter using Skipgrams</article-title>
          .
          <source>In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)</source>
          , pages
          <fpage>294</fpage>
          -
          <lpage>299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tomas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Martínez-Barco</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Evaluating a sentiment analysis approach from a business point of view.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Guthrie</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Allison</surname>
          </string-name>
          , W. Liu, L. Guthrie, and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wilks</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>A Closer Look at Skip-gram Modelling</article-title>
          .
          <source>In 5th International Conference on Language Resources and Evaluation (LREC 2006)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , D. Tomas, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Benefits of using ranking skip-gram techniques for opinion mining approaches</article-title>
          .
          <source>In eChallenges e-2015 Conference</source>
          ,
          <year>2015</year>
          , pages
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfahringer</surname>
          </string-name>
          , P. Reutemann, and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The weka data mining software: an update</article-title>
          .
          <source>ACM SIGKDD explorations newsletter</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ):
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Jijkoun</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , M. de Rijke, and
          <string-name>
            <given-names>W.</given-names>
            <surname>Weerkamp</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Generating focused topic-specific sentiment lexicons</article-title>
          .
          <source>In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>585</fpage>
          -
          <lpage>594</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.-M.</given-names>
          </string-name>
          , M. Rey, and
          <string-name>
            <given-names>E.</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Determining the Sentiment of Opinions</article-title>
          .
          <source>In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004)</source>
          , page 1367.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Medhat</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , A. Hassan, and
          <string-name>
            <given-names>H.</given-names>
            <surname>Korashy</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Sentiment Analysis Algorithms and Applications: a Survey</article-title>
          .
          <source>Ain Shams Engineering Journal</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Sentiment analysis: Detecting valence, emotions, and other affectual states from text</article-title>
          .
          <source>Emotion Measurement</source>
          , pages
          <fpage>201</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kiritchenko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets</article-title>
          .
          <source>In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2013)</source>
          .
        </mixed-citation>
        <mixed-citation>
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Opinion Mining and Sentiment Analysis</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          ,
          <volume>2</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>135</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , L. Lee, and
          <string-name>
            <given-names>S.</given-names>
            <surname>Vaithyanathan</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Thumbs up? Sentiment Classification using Machine Learning Techniques</article-title>
          .
          <source>In Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)</source>
          , pages
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Ravi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Ravi</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>A survey on opinion mining and sentiment analysis: tasks, approaches and applications</article-title>
          .
          <source>Knowledge-Based Systems</source>
          ,
          <volume>89</volume>
          :
          <fpage>14</fpage>
          –
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Machine Learning in Automated Text Categorization</article-title>
          .
          <source>ACM Computing Surveys (CSUR)</source>
          ,
          <volume>34</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>47</lpage>
          ,
          <month>3</month>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Taboada</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brooke</surname>
          </string-name>
          , M. Tofiloski, K. Voll, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stede</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Lexicon-based methods for sentiment analysis</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>267</fpage>
          –
          <lpage>307</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , X. Cheng, Y. Wang, and
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Adapting Naive Bayes to Domain Adaptation for Sentiment Analysis</article-title>
          .
          <source>Advances in Information Retrieval</source>
          , pages
          <fpage>337</fpage>
          –
          <lpage>349</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Tong</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          <year>2001</year>
          .
          <article-title>An operational system for detecting and tracking opinions in on-line discussion</article-title>
          .
          In
          <source>Working Notes of the ACM SIGIR 2001 Workshop on Operational Text Classification</source>
          , volume 1, page 6.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Villena-Román</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>García-Morera</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>TASS 2013 - Workshop on Sentiment Analysis at SEPLN 2013: An overview</article-title>
          . In
          <source>XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural (SEPLN 2013)</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>J. O.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>A comparative study on feature selection in text categorization</article-title>
          .
          In
          <source>ICML</source>
          , volume
          <volume>97</volume>
          , pages
          <fpage>412</fpage>
          –
          <lpage>420</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>