Aspect based Sentiment Analysis of Spanish Tweets

Análisis de Sentimientos de Tweets en Español basado en Aspectos

Oscar Araque, Ignacio Corcuera, Constantino Román, Carlos A. Iglesias y J. Fernando Sánchez-Rada
Grupo de Sistemas Inteligentes, Departamento de Ingeniería de Sistemas Telemáticos,
Universidad Politécnica de Madrid (UPM), España
Avenida Complutense, nº 30, 28040 Madrid, España
{oscar.aiborra, ignacio.cplatas, c.romang}@alumnos.upm.es
{cif, jfernando}@dit.upm.es

Resumen: En este artículo se presenta la participación del Grupo de Sistemas Inteligentes (GSI) de la Universidad Politécnica de Madrid (UPM) en el taller de Análisis de Sentimientos centrado en tweets en español: el TASS 2015. Este año se han propuesto dos tareas que hemos abordado con el diseño y desarrollo de un sistema modular adaptable a distintos contextos. Este sistema emplea tecnologías de Procesado de Lenguaje Natural (NLP) así como de aprendizaje automático, dependiendo además de tecnologías desarrolladas previamente en nuestro grupo de investigación. En particular, hemos combinado un amplio número de rasgos y léxicos de polaridad para la detección de sentimiento, junto con un algoritmo basado en grafos para la detección de contextos. Los resultados experimentales obtenidos tras la consecución del concurso resultan prometedores.

Palabras clave: Aprendizaje automático, Procesado de lenguaje natural, Análisis de sentimientos, Detección de aspectos

Abstract: This article presents the participation of the Intelligent Systems Group (GSI) at Universidad Politécnica de Madrid (UPM) in the Sentiment Analysis workshop focused on Spanish tweets, TASS 2015. This year two challenges have been proposed, which we have addressed with the design and development of a modular system that is adaptable to different contexts. This system employs Natural Language Processing (NLP) and machine-learning technologies, relying also on technologies previously developed in our research group. In particular, we have used a wide number of features and polarity lexicons for sentiment detection. With regard to aspect detection, we have relied on a graph-based algorithm. Once the challenge has come to an end, the experimental results are promising.

Keywords: Machine learning, Natural Language Processing, Sentiment analysis, Aspect detection

1 Introduction

In this article we present our participation in the TASS 2015 challenge (Villena-Román et al., 2015a). This work deals with two different tasks, which are described next.

The first task of this challenge, Task 1 (Villena-Román et al., 2015b), consists of determining the global polarity at message level. Within this task there are two evaluations: one in which 6 polarity labels are considered (P+, P, NEU, N, N+, NONE), and another one with 4 polarity labels (P, N, NEU, NONE). P stands for positive, N means negative and NEU is neutral. The "+" symbol is used for intensification of the polarity, and NONE means absence of sentiment polarity. This task provides a corpus (Villena-Román et al., 2015b) which contains a total of 68,000 tweets written in Spanish, covering a diversity of subjects.

The second and last task, Task 2 (Villena-Román et al., 2015b), aims to detect the sentiment polarity at aspect level using three labels (P, N and NEU). Within this task, two corpora (Villena-Román et al., 2015b) are provided: the SocialTV and STOMPOL corpora. We have restricted ourselves to the SocialTV corpus in this edition. This corpus contains 2,773 tweets captured during the 2014 Final of the Copa del Rey championship¹. Along with the corpus, a set of aspects that appear in the tweets is given. This list is essentially composed of football players, coaches, teams, referees, and other football-related concepts such as crowd, authorities, match and broadcast.

¹ www.en.wikipedia.org/wiki/2014_Copa_del_Rey_Final

The complexity presented by the challenge has led us to develop a modular system in which each component can work separately. We have developed and experimented with each module independently, and later combined them depending on the task (1 or 2) we want to solve.

The rest of the paper is organized as follows. Section 2 reviews the research on sentiment analysis in the Twitter domain. Section 3 briefly describes the general architecture of the developed system. Section 4 describes the module developed to address Task 1 of this challenge, and Section 5 explains the additional modules necessary to address Task 2. Finally, Section 6 concludes the paper and presents some conclusions regarding our participation in this challenge, as well as future work.

2 Related Work

Focusing on the scope of TASS, many researchers have experimented on the TASS corpora with different approaches to evaluate the performance of their systems. Vilares et al. (2014) present a system relying on machine-learning classification for the sentiment analysis tasks, and on a heuristics-based approach for aspect-based sentiment analysis. Another example of classification through machine learning is the work of Hurtado and Pla (2014), in which they use Support Vector Machines (SVM) with remarkable results. It is common to incorporate linguistic knowledge into these systems, as proposed by Urizar and Roncal (2013), who also employ lexicons in their work. Balahur and Perea-Ortega (2013) deal with this problem using dictionaries and data translated from English to Spanish, as well as machine-learning techniques. An interesting procedure is performed by Vilares, Alonso, and Gómez-Rodríguez (2013): they combine semantic information with psychological knowledge extracted from dictionaries and use these features to train a machine-learning algorithm. Fernández et al. (2013) employ a ranking algorithm using bigrams together with a skipgram scorer, which allows them to create sentiment lexicons that retain the context of the terms. A different approach relies on the Word2Vec model, used by Montejo-Ráez, García-Cumbreras, and Díaz-Galiano (2014), in which each word is represented in a 200-dimensional space, without using any lexical or syntactical analysis: this allows them to develop a fairly simple system with reasonable results.

3 System architecture

One of our main goals is to design and develop an adaptable system that can function in a variety of situations. As already mentioned, this has led us to a system composed of several modules that can work separately. Since the challenge proposes two different tasks (Villena-Román et al., 2015b), we use each module where necessary.

Our system is divided into three modules:

• Named Entity Recognizer (NER) module. The NER module detects the entities within a text and classifies them as one of the possible entities. A more detailed description of this module and the set of given entities is presented in Section 5, as it is used in Task 2.

• Aspect and Context detection module. This module is in charge of detecting the remaining aspects (aspects that are not entities and therefore cannot be detected as such) and the contexts of all aspects. This module is described in greater detail in Section 5, since it is only used for tackling Task 2.

• Sentiment Analysis module. As the name suggests, the goal of this module is to classify the given texts using sentiment polarity labels. This module is based on combining NLP and machine-learning techniques and is used in both Task 1 and Task 2. It is explained in more detail next.

3.1 Sentiment Analysis module

The sentiment analysis module relies on an SVM machine-learning model that is trained with data composed of features extracted from the TASS datasets: the General corpus for Task 1 and the SocialTV corpus for Task 2 (Villena-Román et al., 2015b).
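As an illustration of how such a module can be assembled (not the exact implementation used in our system), the following minimal sketch assumes scikit-learn; the extract_features function, the training examples and the parameter values are hypothetical placeholders for the feature set described in Section 3.1.1.

```python
# Minimal sketch of an SVM-based polarity classifier, assuming scikit-learn.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

def extract_features(tweet):
    # Illustrative only: the real feature set adds n-grams, POS tags,
    # lexicon scores, negation marks, etc. (see Section 3.1.1).
    tokens = tweet.split()
    return {
        "n_tokens": len(tokens),
        "n_allcaps": sum(1 for t in tokens if t.isupper()),
        "n_hashtags": sum(1 for t in tokens if t.startswith("#")),
    }

pipeline = Pipeline([
    ("vectorizer", DictVectorizer()),  # feature dicts -> sparse vectors
    ("svm", LinearSVC(C=1.0)),         # linear SVM polarity classifier
])

# Hypothetical training examples standing in for the TASS General corpus.
tweets = ["Qué gran partido!!", "Vaya arbitraje tan malo...", "Mañana hay final"]
labels = ["P", "N", "NONE"]

pipeline.fit([extract_features(t) for t in tweets], labels)
print(pipeline.predict([extract_features("GOLAZO en el último minuto!!")]))
```

Keeping the vectorizer and the classifier in a single pipeline makes it straightforward to retrain the same model on different corpora, which is how the module is reused for Task 2 (Section 5.3).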
3.1.1 Feature Extraction

We have used different approaches to design the feature extraction. The reference document for the development of the feature extraction was the work by Mohammad, Kiritchenko, and Zhu (2013). With this in mind, the features extracted from each tweet to form a feature vector are:

• N-grams: combinations of contiguous sequences of one, two and three tokens, consisting of words, lemmas and stems. As this information can be difficult to handle due to the huge number of N-grams that can be formed, we set a minimum frequency of three occurrences for an N-gram to be considered.

• All-caps: the number of words with all characters in upper case that appear in the tweet.

• POS information: the frequency of each part-of-speech tag.

• Hashtags: the number of hashtag terms.

• Punctuation marks: these marks are frequently used to increase the sentiment of a sentence, especially in the Twitter domain. The presence or absence of these marks (? and !) is extracted as a feature, as well as their relative position within the document.

• Elongated words: the number of words that have one character repeated more than two times.

• Emoticons: the system uses an Emoticon Sentiment Lexicon, which has been developed by Hogenboom et al. (2013).

• Lexicon resources: for each token w, we used the sentiment score score(w) to determine:
  1. The number of words with score(w) ≠ 0.
  2. The polarity of each word with score(w) ≠ 0.
  3. The total score of all the polarities of the words with score(w) ≠ 0.
  The best way to increase the coverage with respect to the detection of words with polarity is to combine several lexicon resources. The lexicons used are: Elhuyar Polar Lexicon (Urizar and Roncal, 2013), ISOL (Martínez-Cámara et al., 2013), Sentiment Spanish Lexicon (SSL) (Veronica Perez Rosas, 2012), SOCAL (Taboada et al., 2011) and ML-SentiCON (Cruz et al., 2014).

• Intensifiers: an intensifier dictionary (Cruz et al., 2014) has been used when calculating the polarity of a word, increasing or decreasing its value.

• Negation: explained in Section 3.1.2.

• Global polarity: this score is the sum of the scores from the emoticon analysis and from the lexicon resources.
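A few of the surface and lexicon-based features above can be sketched as follows; this is an illustration under our own simplifying assumptions, where `lexicon` stands for a hypothetical combined {word: polarity score} dictionary built from the resources listed in the last bullet.

```python
# Sketch of some surface and lexicon features from the list above (illustrative).
import re

def surface_and_lexicon_features(tokens, lexicon):
    # Scores of the tokens with score(w) != 0 in the combined lexicon.
    polar = [lexicon[t.lower()] for t in tokens if lexicon.get(t.lower(), 0) != 0]
    return {
        "all_caps": sum(1 for t in tokens if t.isupper() and len(t) > 1),
        "hashtags": sum(1 for t in tokens if t.startswith("#")),
        "elongated": sum(1 for t in tokens if re.search(r"(.)\1{2,}", t)),
        "punctuation": sum(t.count("!") + t.count("?") for t in tokens),
        "n_polar_words": len(polar),            # words with score(w) != 0
        "lexicon_score": sum(polar),            # total polarity score
        "positive_words": sum(1 for s in polar if s > 0),
        "negative_words": sum(1 for s in polar if s < 0),
    }

tokens = "GOLAAAZO de Bale !! #FinalCopa".split()
print(surface_and_lexicon_features(tokens, {"golaaazo": 2.0, "fatal": -2.0}))
```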
3.1.2 Negation

An important feature that has been used to develop the classifier is the treatment of negations. This approach takes into account the role of negation words or phrases, as they can alter the polarity value of the words or phrases they precede.

The polarity of a word changes if it is included in a negated context. To detect a negated context we have used a set of negation words, which has been manually compiled by us. Besides, detecting the context requires deciding how many tokens are affected by the negation. For this, we have followed the proposal by Pang, Lee, and Vaithyanathan (2002).

Once the negated context is defined, two features are affected by it: N-grams and lexicon scores. A negation mark is added to these features, implying that their value is negated (e.g., positive becomes negative, +1 becomes -1). This approximation is based on the work by Saurí and Pustejovsky (2012).
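A simplified sketch of this kind of negated-context marking, in the spirit of Pang, Lee, and Vaithyanathan (2002), is shown below: tokens following a negation cue receive a negation mark until the next punctuation sign. The cue list here is only illustrative; the system uses a manually compiled Spanish negation list.

```python
# Simplified negation-scope marking (illustrative cue list, not the actual one).
NEGATION_CUES = {"no", "nunca", "jamás", "ni", "tampoco"}
SCOPE_END = {".", ",", ";", ":", "!", "?"}

def mark_negation(tokens):
    marked, in_scope = [], False
    for tok in tokens:
        if tok.lower() in NEGATION_CUES:
            in_scope = True          # open a negated context
            marked.append(tok)
        elif tok in SCOPE_END:
            in_scope = False         # punctuation closes the context
            marked.append(tok)
        else:
            marked.append(tok + "_NEG" if in_scope else tok)
    return marked

print(mark_negation("no me gusta nada este partido , pero el gol fue bueno".split()))
# ['no', 'me_NEG', 'gusta_NEG', 'nada_NEG', 'este_NEG', 'partido_NEG', ',', 'pero', ...]
```

The marked tokens then feed the N-gram features, while the lexicon score of a marked token is sign-flipped.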
4 Task 1: Sentiment analysis at global level

4.1 Experiments and results

The competition allows the submission of up to three experiments (runs) for each corpus. With this in mind, three experiments have been developed in this task, attending to the lexicons that fit the corpus best:

• RUN-1: one lexicon adapts particularly well to the corpus, the ElhPolar lexicon. We decided to use only this dictionary in the first run.

• RUN-2: in this run the two lexicons with the best results in our experiments, ElhPolar and ISOL, are combined.

• RUN-3: the last run uses a mix of all the lexicons employed in our experiments.

Experiment    Accuracy    F1-Score
6labels       61.8        50.0
6labels-1k    48.7        44.6
4labels       69.0        55.0
4labels-1k    65.8        53.1

Table 1: Results of RUN-1 in Task 1

Experiment    Accuracy    F1-Score
6labels       61.0        49.5
6labels-1k    48.0        44.0
4labels       67.9        54.6
4labels-1k    64.6        53.1

Table 2: Results of RUN-2 in Task 1

Experiment    Accuracy    F1-Score
6labels       60.8        49.3
6labels-1k    47.9        43.7
4labels       67.8        54.5
4labels-1k    64.6        48.7

Table 3: Results of RUN-3 in Task 1

5 Task 2: Aspect-based sentiment analysis

This task is an extension of Task 1 in which sentiment analysis is performed at the aspect level. The goal in this task is to detect the different aspects that can appear in a tweet and afterwards analyze the sentiment associated with each aspect.

For this, we used a pipeline that takes the provided corpus as input and produces the sentiment-annotated corpus as output. This pipeline can be divided into three major modules that work in a sequential manner: first the NER, second the Aspect and Context detection, and third the Sentiment Analysis, as described below.

5.1 NER

The goal of this module is to detect the words that represent a certain entity from the given set of entities, which can be identified as a person (players and coaches) or an organization (teams).

For this module we used the Stanford CRF NER (Finkel, Grenager, and Manning, 2005). It includes a Spanish model trained on news data. To adapt the model, we trained it instead with the training dataset (Villena-Román et al., 2015b) and a gazette. The model is trained with two labels: Person (PER) and Organization (ORG). The gazette entries were collected from the training dataset, resulting in a list of all the ways the entities (players, teams or coaches) were named. We verified the performance of the Stanford NER by means of cross-validation on the training data, obtaining an average F1-Score of 91.05%.

As the goal of the NER module is to detect the words that represent a specific entity, we used a list of all the ways these entities were named. In this way, once the Stanford NER detects a generic entity, our extended NER module searches this list and decides on the particular entity by matching the pattern of the entity words.
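The resolution step can be sketched as a simple gazette lookup over the spans returned by the CRF tagger; the gazette entries and names below are illustrative examples, not the actual lists collected from the training data, and the Stanford NER call itself is abstracted away.

```python
# Sketch of gazette-based resolution of particular entities (illustrative data).
GAZETTE = {
    "cristiano": ("Cristiano_Ronaldo", "player"),
    "cr7": ("Cristiano_Ronaldo", "player"),
    "ancelotti": ("Ancelotti", "coach"),
    "real madrid": ("Real_Madrid", "team"),
    "barça": ("FC_Barcelona", "team"),
}

def resolve_entities(ner_spans):
    """ner_spans: list of (surface_text, generic_label) pairs, e.g. PER/ORG spans."""
    resolved = []
    for surface, generic in ner_spans:
        entry = GAZETTE.get(surface.lower())
        if entry is not None:  # map the surface form to a particular entity
            resolved.append({"surface": surface, "entity": entry[0],
                             "type": entry[1], "ner_label": generic})
    return resolved

print(resolve_entities([("CR7", "PER"), ("Real Madrid", "ORG")]))
```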
5.2 Aspect and Context detection

This module aims to detect the aspects that are not entities, and thus have not been detected by the NER module. To achieve this, we have composed a dictionary using the training dataset (Villena-Román et al., 2015b) which contains all the ways in which the aspects (including the entities formerly detected) are named. Using this dictionary, this module can detect words that are related to a specific aspect. Although the NER module already detects entities such as players, coaches or teams, this module can detect them too: it treats the entities detected by the NER module as more relevant than its own recognitions, thus combining the aspect/entity detection capacity of both modules.

As for the context detection, we have implemented a graph-based algorithm (Mukherjee and Bhattacharyya, 2012) that allows us to extract the set of words related to an aspect from a sentence, even if the sentence mentions different aspects and mixed emotions. The context of an aspect is the set of words related to that aspect. Besides, we have extended this algorithm in such a way that it allows us to configure the scope of the context detection.

Combining these two approaches (aspect and context detection), this module is able to detect the word or words that identify an aspect and to extract the context of that aspect. This context allows us to isolate the sentiment associated with the aspect, which is very useful for sentiment analysis at the aspect level.

We have obtained an accuracy of 93.21% in this second step of the pipeline with the training dataset (Villena-Román et al., 2015b). As for the test dataset (Villena-Román et al., 2015b), we obtained an accuracy of 89.27%².

² We calculated this metric using the output granted by the TASS uploading page www.daedalus.es/TASS2015/private/evaluate.php.
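As a rough illustration of the idea (the actual module uses the graph-based algorithm of Mukherjee and Bhattacharyya (2012), which this sketch does not reproduce), aspect detection can be approximated by a dictionary lookup and context extraction by a configurable token window around each matched aspect term; the aspect dictionary below is hypothetical.

```python
# Simplified aspect matching plus windowed context extraction (illustrative only;
# the real system derives the context from a word graph, not a fixed window).
ASPECT_DICT = {
    "casillas": "Casillas",
    "árbitro": "Referee",
    "arbitro": "Referee",
    "afición": "Crowd",
}

def detect_aspects_with_context(tokens, window=3):
    results = []
    for i, tok in enumerate(tokens):
        aspect = ASPECT_DICT.get(tok.lower())
        if aspect is not None:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            results.append({"aspect": aspect, "term": tok,
                            "context": tokens[lo:hi]})  # words passed to the classifier
    return results

tweet = "Gran parada de Casillas pero el árbitro estuvo fatal".split()
for hit in detect_aspects_with_context(tweet):
    print(hit["aspect"], "->", " ".join(hit["context"]))
```

The `window` parameter plays the role of the configurable context scope mentioned above: a wider window captures more sentiment-bearing words at the risk of mixing the contexts of nearby aspects.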
5.3 Sentiment analysis

The sentiment analysis module is the end of the processing pipeline. This module is in charge of assigning polarity values to the detected aspects using the context of each aspect. We have used the same model as in Task 1 to analyse every aspect detected in Task 2, given that the aspect contexts detected in Task 2 are similar to the texts analysed in Task 1.

Nevertheless, even though we use the same model, it needs to be trained with the proper data. For this, we extracted the aspects and contexts from the training dataset, computed the corresponding features (explained in Section 3), and then trained the model with them. In this way, the trained classifier is fed aspect contexts, which it classifies into one of the three labels (as mentioned: positive, negative and neutral).

5.4 Results

By connecting these three modules together, we obtain a system that is able to recognize entities and aspects, detect the context in which they are enclosed, and classify them at the aspect level. The performance of this system is shown in Table 4. The different RUNs represent separate adjustments of the same experiment, in which several parameters are tuned in order to obtain the best performance.

Experiment    Accuracy    F1-Score
RUN-1         63.5        60.6
RUN-2         62.1        58.4
RUN-3         55.7        55.8

Table 4: Results of each run in Task 2

As can be seen in Table 4, the global performance obtained is fairly positive, as our system ranked first in F1-Score and second in Accuracy.

6 Conclusions and future work

In this paper we have described the participation of the GSI in the TASS 2015 challenge (Villena-Román et al., 2015a). Our proposal relies on both NLP and machine-learning techniques, applying them jointly to obtain a satisfactory result in the rankings of the challenge. We have designed and developed a modular system that relies on technologies previously developed in our group (Sánchez-Rada, Iglesias, and Gil, 2015). These characteristics make the system adaptable to different conditions and contexts, a feature that proves very useful in this competition given the diversity of tasks (Villena-Román et al., 2015b).

As future work, our aim is to improve aspect detection by including semantic similarity based on the lexical resources available in the Linguistic Linked Open Data Cloud. To this aim, we will also integrate vocabularies such as Marl (Westerski, Iglesias, and Tapia, 2011). In addition, we are working on improving sentiment detection based on the social context of users within the MixedEmotions project.

Acknowledgement

This research has been partially funded by the EC through the H2020 project MixedEmotions (Grant Agreement no: 141111) and by the Spanish Ministry of Industry, Tourism and Trade through the project Calista (TEC2012-32457). We would like to thank Maite Taboada as well as the rest of the researchers for providing us with their valuable lexical resources.

References

Balahur, A. and José M. Perea-Ortega. 2013. Experiments using varying sizes and machine translated data for sentiment analysis in Twitter.

Cruz, Fermín L., José A. Troyano, Beatriz Pontes, and F. Javier Ortega. 2014. Building layered, multilingual sentiment lexicons at synset and lemma levels. Expert Systems with Applications, 41(13):5984–5994.

Fernández, J., Y. Gutiérrez, J. M. Gómez, P. Martínez-Barco, A. Montoyo, and R. Muñoz. 2013. Sentiment analysis of Spanish tweets using a ranking algorithm and skipgrams.

Finkel, Jenny Rose, Trond Grenager, and Christopher Manning. 2005. Incorporating non-local information into information extraction systems by Gibbs sampling. pages 363–370.

Hogenboom, A., D. Bal, F. Franciscar, M. Bal, F. De Jong, and U. Kaymak. 2013. Exploiting emoticons in polarity classification of text.

Hurtado, Ll. and F. Pla. 2014. ELiRF-UPV en TASS 2014: Análisis de sentimientos, detección de tópicos y análisis de sentimientos de aspectos en Twitter.
Martínez-Cámara, E., M. Martín-Valdivia, M. D. Molina-González, and L. Ureña López. 2013. Bilingual experiments on an opinion comparable corpus. WASSA 2013, 87.

Mohammad, Saif M., Svetlana Kiritchenko, and Xiaodan Zhu. 2013. NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Second Joint Conference on Lexical and Computational Semantics (*SEM), volume 2, pages 321–327.

Montejo-Ráez, A., M. A. García-Cumbreras, and M. C. Díaz-Galiano. 2014. Participación de SINAI Word2Vec en TASS 2014.

Mukherjee, Subhabrata and Pushpak Bhattacharyya. 2012. Feature specific sentiment analysis for product reviews. Volume 7181 of Lecture Notes in Computer Science, pages 475–487. Springer.

Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up?: Sentiment classification using machine learning techniques. In Proc. of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10, pages 79–86. Association for Computational Linguistics.

Sánchez-Rada, J. Fernando, Carlos A. Iglesias, and Ronald Gil. 2015. A Linked Data Model for Multimodal Sentiment and Emotion Analysis. In 4th Workshop on Linked Data in Linguistics: Resources and Applications.

Saurí, Roser and James Pustejovsky. 2012. Are you sure that this happened? Assessing the factuality degree of events in text. Computational Linguistics, 38(2):261–299.

Taboada, Maite, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede. 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2):267–307.

Urizar, Xabier Saralegi and Iñaki San Vicente Roncal. 2013. Elhuyar at TASS 2013.

Perez Rosas, Veronica, Carmen Banea, and Rada Mihalcea. 2012. Learning sentiment lexicons in Spanish. In Proc. of the International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey.

Vilares, D., M. A. Alonso, and C. Gómez-Rodríguez. 2013. LyS at TASS 2013: Analysing Spanish tweets by means of dependency parsing, semantic-oriented lexicons and psychometric word-properties.

Vilares, David, Yerai Doval, Miguel A. Alonso, and Carlos Gómez-Rodríguez. 2014. LyS at TASS 2014: A prototype for extracting and analysing aspects from Spanish tweets.

Villena-Román, Julio, Janine García-Morera, Miguel A. García-Cumbreras, Eugenio Martínez-Cámara, M. Teresa Martín-Valdivia, and L. Alfonso Ureña-López, editors. 2015a. Proc. of TASS 2015: Workshop on Sentiment Analysis at SEPLN, number 1397 in CEUR Workshop Proceedings, Aachen.

Villena-Román, Julio, Janine García-Morera, Miguel A. García-Cumbreras, Eugenio Martínez-Cámara, M. Teresa Martín-Valdivia, and L. Alfonso Ureña-López. 2015b. Overview of TASS 2015.

Westerski, Adam, Carlos A. Iglesias, and Fernando Tapia. 2011. Linked Opinions: Describing Sentiments on the Structured Web of Data. In Proc. of the 4th International Workshop on Social Data on the Web.