TASS 2018: Workshop on Semantic Analysis at SEPLN, September 2018, pp. 45-49

INGEOTEC solution for Task 1 in TASS'18 competition
Solución del grupo INGEOTEC para la tarea 1 de la competencia TASS'18

Daniela Moctezuma (1), José Ortiz-Bejar (3), Eric S. Tellez (2), Sabino Miranda-Jiménez (2), Mario Graff (2)
(1) CONACYT-CentroGEO, (2) CONACYT-INFOTEC, (3) UMSNH
dmoctezuma@centrogeo.edu.mx, jortiz@umich.mx, eric.tellez@infotec.mx, sabino.miranda@infotec.mx, mario.graff@infotec.mx

Resumen: Sentiment analysis on social networks consists of analyzing the messages that users publish on those networks and determining their polarity (e.g., positive, negative, or a similar but finer-grained range of sentiments). Every language has characteristics that can hinder polarity analysis, such as the natural ambiguity of pronouns, synonymy, and polysemy; additionally, since social networks are a rather informal communication medium, messages usually carry a large number of errors and lexical variants that hinder analysis with traditional approaches. This paper presents the participation of the INGEOTEC team in TASS'18. The proposed solution is based on several subsystems orchestrated by our genetic programming system EvoMSA.
Palabras clave: automatic text categorization, genetic programming, sentiment analysis, polarity classification

Abstract: Sentiment analysis over social networks determines the polarity of messages published by users. In this sense, a message can be classified as positive or negative, or under a similar scheme with more fine-grained labels. Each language has characteristics that hinder the correct determination of sentiment, such as the natural ambiguity of pronouns, synonymy, and polysemy.
Additionally, given that messages in social networks are quite informal, they tend to be plagued with lexical errors and lexical variations that make it difficult to determine sentiment using traditional approaches. This paper describes our participating system in TASS'18. Our solution is composed of several subsystems, independently collected and trained, combined with our EvoMSA genetic programming system.
Keywords: text categorization, genetic programming, sentiment analysis, polarity classification

ISSN 1613-0073. Copyright © 2018 by the paper's authors. Copying permitted for private and academic purposes.

1 Introduction

Sentiment Analysis is an active research area that performs the computational analysis of people's feelings or beliefs expressed in texts, such as emotions, opinions, attitudes, and appraisals, among others (Liu and Zhang, 2012). In social media, people share their opinions and sentiments. In addition to the inherent polarity, these feelings also have an intensity. As in previous years, TASS'18 organizes a task on four-level polarity classification in tweets. This year, the InterTASS corpus has been expanded with two more subsets, namely a dataset containing tweets from Costa Rica and another one coming from Peruvian tweeters. Therefore, there are three varieties of the Spanish language: Spain (ES), Peru (PE), and Costa Rica (CR). Moreover, several subtasks are also introduced:

• Subtask-1: Monolingual ES: training and test using the InterTASS ES dataset.
• Subtask-2: Monolingual PE: training and test using the InterTASS PE dataset.
• Subtask-3: Monolingual CR: training and test using the InterTASS CR dataset.
• Subtask-4: Cross-lingual: training on one dataset and testing on a different one.

These subtasks are mostly based on separating language varieties between the training and test datasets. Martínez-Cámara et al. (2018) detail TASS'18 Task 1 and its associated datasets.

This paper details the Task 1 solution of our INGEOTEC team. Our approach consists of a number of subsystems whose individual predictions are combined through a non-linear expression using our EvoMSA genetic programming system. It is worth mentioning that we tackle both Task 1 (this one) and Task 4 (good or bad news) with a similar scheme, that is, the same resources, the same portfolio of algorithms, and the same hyper-parameters; of course, we use each task's training set to learn and optimize for that task.

The manuscript is organized as follows. Section 2 details the subsystems that compose our solution. Section 3 presents our results, and finally, Section 4 summarizes and concludes this report.

2 System Description

Our participating system is a combination of several sub-systems that tackle the polarity categorization of tweets independently; all these independent predictions are then combined using our EvoMSA genetic programming system. The rest of this section details these sub-systems and resources.

2.1 EvoMSA

EvoMSA (https://github.com/INGEOTEC/EvoMSA) is a multilingual sentiment analysis system based on genetic text classifiers, domain-specific resources, and a genetic programming combiner of the parts. The first one, namely B4MSA (Tellez et al., 2017), performs a hyper-parameter optimization over a large search space of possible models. It uses a meta-heuristic to solve a combinatorial optimization problem over the configuration space; the selected model is described in Table 1. On the other hand, EvoDAG (Graff et al., 2016; Graff et al., 2017) is a classifier based on Genetic Programming with semantic operators, which makes the final prediction through a combination of all the decision-function values. Domain-specific resources can also be added under the same scheme. Figure 1 shows the architecture of EvoMSA. In the first part, a set of different classifiers is trained with the datasets provided by the contest and with other resources as additional knowledge, i.e., the idea is to be able to integrate any other kind of related knowledge into the model. In this case, we used tailor-made lexicons for the aggressiveness task: aggressive words and affective words (positive and negative); see Section 2.2 for more details. The precise configuration of our benchmarked system is described in Section 3.

Figure 1: Architecture of our EvoMSA framework

Table 1: Example of the set of configurations for text modeling

  Text transformation   Value
  remove diacritics     yes
  remove duplicates     yes
  remove punctuation    yes
  emoticons             group
  lowercase             yes
  numbers               group
  urls                  group
  users                 group
  hashtags              none
  entities              none

  Term weighting
  TF-IDF                yes
  Entropy               no

  Tokenizers
  n-words               {1, 2}
  q-grams               {2, 3, 4}
  skip-grams            —

2.2 Lexicon-based models

To introduce extra knowledge into our approach, we used two lexicon-based models. The first, the Up-Down model, counts affective words; that is, it produces two indexes for a given text: one for positive words and another for negative words. We created the positive-negative lexicon from several Spanish affective lexicons (de Albornoz, Plaza, and Gervás, 2012; Sidorov et al., 2013; Perez-Rosas, Banea, and Mihalcea, 2012); we also enriched this lexicon with the Spanish WordNet (Fernández-Montraveta, Vázquez, and Fellbaum, 2008).
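The Up-Down counting just described reduces a text to two integers. A minimal sketch, where the word lists are illustrative placeholders rather than the actual lexicon (which is built from the cited Spanish affective resources and the Spanish WordNet):

```python
# Minimal sketch of the Up-Down model: two indexes per text, one counting
# positive-lexicon hits and one counting negative-lexicon hits.
# The word sets below are illustrative only, not the real lexicon.
POSITIVE = {"feliz", "bueno", "excelente"}
NEGATIVE = {"triste", "malo", "horrible"}

def up_down_indexes(text):
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos, neg

print(up_down_indexes("Un día feliz pero con final triste"))  # (1, 1)
```

These two indexes are part of the feature set that EvoDAG later combines with the other subsystems' outputs.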
The other, a Bernoulli model, was created to predict aggressiveness using a lexicon of aggressive words; we built this lexicon by gathering common aggressive words in Spanish. These indexes and this prediction, along with B4MSA's (µTC) outputs, are the input to the EvoDAG system.

2.3 EvoDAG

EvoDAG (https://github.com/mgraffg/EvoDAG) is a Genetic Programming system specifically tailored to tackle classification problems on very large and high-dimensional vector spaces. EvoDAG uses the principles of Darwinian evolution to create models represented as a directed acyclic graph (DAG). Due to lack of space, we refer the reader to (Graff et al., 2016), where EvoDAG is broadly described. It is important to mention that EvoDAG has no information regarding whether input Xi comes from a particular class decision function; consequently, from EvoDAG's point of view, all inputs are equivalent.

2.4 FastText

FastText (Joulin et al., 2017) is a tool to create text classifiers and learn a semantic vocabulary from a given collection of documents; this vocabulary is represented with a collection of high-dimensional vectors, one per word. It is worth mentioning that FastText is robust to lexical errors, since out-of-vocabulary words are represented as the combination of vectors of sub-words, that is, a kind of character q-grams limited in context to words. Nonetheless, the main reason for including FastText in our system is to overcome the small training set that comes with Task 4, which is accomplished using the pre-trained vectors computed on the Spanish content of Wikipedia (Bojanowski et al., 2016). We use these vectors to create document vectors, one vector per document. A document vector is, roughly speaking, a linear combination of the word vectors that compose the document into a single vector of the same dimension. These document vectors were used as input to an SVM with a linear kernel, and we use the decision function as input to EvoMSA.
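The document-vector construction just described can be sketched as follows; the toy two-dimensional word vectors stand in for the pre-trained Spanish-Wikipedia fastText vectors, so all names and values here are illustrative assumptions:

```python
import numpy as np

# Toy stand-ins for pre-trained word vectors; the actual system uses
# fastText vectors trained on the Spanish Wikipedia (Bojanowski et al., 2016).
WORD_VECS = {
    "buen": np.array([0.9, 0.1]),
    "dia": np.array([0.4, 0.4]),
    "malo": np.array([-0.8, 0.2]),
}

def doc_vector(tokens, vecs, dim=2):
    """Average the word vectors of a document into one vector of the same
    dimension; unknown words are simply skipped in this sketch."""
    known = [vecs[t] for t in tokens if t in vecs]
    return np.mean(known, axis=0) if known else np.zeros(dim)

print(doc_vector(["buen", "dia"], WORD_VECS))  # [0.65 0.25]
```

In the full system, these document vectors feed a linear-kernel SVM, and the SVM's decision-function values become one of EvoMSA's inputs.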
3 Experiments and results

The following tables show the performance of our system on the InterTASS dataset. We also show the performance of a number of selected systems to provide context for our solution. The tables always show the top-k best results that include our system, i.e., we always show the best systems, but we sometimes omit results ranked below ours. All our results are marked in bold to improve readability.

Please recall that the InterTASS dataset is split according to each sub-task. Table 2 shows the performance on the monolingual datasets. For instance, the results of training with Spain-InterTASS and testing on tweets generated by people from Spain are shown in Table 2a, where we reached the seventh position out of nine participating teams. For the other Spanish varieties, Tables 2b and 2c show the results of training with the CR and PE subsets, respectively. Our team achieved the fourth position among eight teams in CR, and the third among eight participants in PE.

Table 2: Monolingual subtasks

(a) Subtask-1, Spain dataset (ES)

  Team's name     Macro-F1  Accuracy
  ELiRF-UPV       0.503     0.612
  RETUYT-InCo     0.499     0.549
  Atalaya         0.476     0.544
  UNSA dajo       0.472     0.6
  UNSA UCSP DaJo  0.472     0.6
  MEFaMAF         0.46      0.55
  INGEOTEC        0.445     0.53
  ABBOT           0.409     0.482
  ITAINNOVA       0.383     0.433

(b) Subtask-2, Costa Rica's dataset (CR)

  Team's name     Macro-F1  Accuracy
  RETUYT-InCo     0.504     0.537
  ELiRF-UPV       0.482     0.561
  Atalaya         0.475     0.582
  INGEOTEC        0.474     0.522
  MEFaMAF         0.418     0.512
  ABBOT           0.408     0.46

In contrast, the results of training with the ES subset and testing on the ES, CR, and PE subsets are presented in Tables 3a, 3b, and 3c, respectively.
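The ranking metric in these tables, Macro-F1, is the unweighted mean of per-class F1 scores, so each of the four polarity levels weighs equally regardless of its frequency. A self-contained sketch, with labels illustrative of a four-level polarity scheme:

```python
# Macro-F1: unweighted mean of per-class F1, computed from scratch.
def macro_f1(y_true, y_pred, labels):
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Illustrative gold and predicted four-level polarity labels.
y_true = ["P", "N", "NEU", "P", "NONE", "N"]
y_pred = ["P", "N", "P",   "P", "NONE", "NEU"]
print(round(macro_f1(y_true, y_pred, ["P", "N", "NEU", "NONE"]), 3))  # 0.617
```

Because the mean is unweighted, a class with zero F1 (here NEU) drags the score down as much as a frequent class would, which is why the competition tables can diverge noticeably between Macro-F1 and Accuracy.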
(c) Subtask-3, Peruvian dataset (PE)

  Team's name     Macro-F1  Accuracy
  RETUYT-InCo     0.472     0.494
  Atalaya         0.462     0.451
  INGEOTEC        0.439     0.447
  ELiRF-UPV       0.438     0.461
  UNSA dajo       0.413     0.319

Our team achieved the best result in the cross-lingual task with Peruvian tweets, and also reached the second-best results on the ES (Spain) and CR (Costa Rica) subsets. The performance of our method in the cross-lingual Task 4 is shown in Table 3. For instance, Table 3a shows our performance on the ES subset; here, we achieved the second position among three teams. In general, the number of participants was smaller than in the monolingual tasks. Table 3b shows the ranking of the four participating teams on the Peruvian subset, where we reached the best Macro-F1 score. Finally, we reached the second rank on the Costa Rica subset, just below RETUYT-InCo.

4 Conclusions

It is worth mentioning that we used the same scheme, explained in Section 2, to tackle all subtasks. Note that EvoMSA allows the training set to be changed as specified for each subtask, so we can optimize the pipeline for each particular objective.

Regarding the obtained results, our approach performs better when it is trained with tweets from Spain and tested on other Spanish varieties. However, it is not clear whether this performance is due to the data or to an inherent feature of the Spanish variation.

Acknowledgements

The authors would like to thank the Laboratorio Nacional de GeoInteligencia for partially funding this work.
Table 3: Performance comparison of the cross-lingual (subtask-4) benchmark over three different test corpora

(a) Spain's variation (ES)

  Team's name     Macro-F1  Accuracy
  RETUYT-InCo     0.471     0.555
  INGEOTEC        0.445     0.53
  Atalaya         0.441     0.485

(b) Peruvian variation (PE)

  Team's name     Macro-F1  Accuracy
  INGEOTEC        0.447     0.506
  RETUYT-InCo     0.445     0.514
  Atalaya         0.438     0.523
  ITAINNOVA       0.367     0.382

(c) Costa Rica's variation (CR)

  Team's name     Macro-F1  Accuracy
  RETUYT-InCo     0.476     0.569
  INGEOTEC        0.454     0.538
  Atalaya         0.453     0.565
  ITAINNOVA       0.409     0.440

References

Bojanowski, P., E. Grave, A. Joulin, and T. Mikolov. 2016. Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606.

de Albornoz, J. C., L. Plaza, and P. Gervás. 2012. SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis. In Proceedings of LREC 2012, pages 3562-3567.

Fernández-Montraveta, A., G. Vázquez, and C. Fellbaum. 2008. The Spanish version of WordNet 3.0. In Text Resources and Lexical Knowledge, pages 175-182. Mouton de Gruyter.

Graff, M., E. S. Tellez, S. Miranda-Jiménez, and H. J. Escalante. 2016. EvoDAG: A semantic genetic programming Python library. In 2016 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), pages 1-6, November.

Graff, M., E. S. Tellez, H. J. Escalante, and S. Miranda-Jiménez. 2017. Semantic Genetic Programming for Sentiment Analysis. In O. Schütze, L. Trujillo, P. Legrand, and Y. Maldonado, editors, NEO 2015, number 663 in Studies in Computational Intelligence, pages 43-65. Springer International Publishing. DOI: 10.1007/978-3-319-44003-3_2.

Joulin, A., E. Grave, P. Bojanowski, and T. Mikolov. 2017. Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 427-431. Association for Computational Linguistics, April.

Liu, B. and L. Zhang. 2012. A Survey of Opinion Mining and Sentiment Analysis, pages 415-463. Springer US, Boston, MA.

Martínez-Cámara, E., Y. Almeida-Cruz, M. C. Díaz-Galiano, S. Estévez-Velarde, M. A. García-Cumbreras, M. García-Vega, Y. Gutiérrez, A. Montejo Ráez, A. Montoyo, R. Muñoz, A. Piad-Morffis, and J. Villena-Román. 2018. Overview of TASS 2018: Opinions, health and emotions. In Proceedings of TASS 2018: Workshop on Semantic Analysis at SEPLN (TASS 2018), volume 2172 of CEUR Workshop Proceedings, Sevilla, Spain, September. CEUR-WS.

Perez-Rosas, V., C. Banea, and R. Mihalcea. 2012. Learning sentiment lexicons in Spanish. In LREC, volume 12, page 73.

Sidorov, G., S. Miranda-Jiménez, F. Viveros-Jiménez, A. Gelbukh, N. Castro-Sánchez, F. Velásquez, I. Díaz-Rangel, S. Suárez-Guerra, A. Treviño, and J. Gordon. 2013. Empirical study of machine learning based approach for opinion mining in tweets. In Proceedings of the 11th Mexican International Conference on Advances in Artificial Intelligence - Volume Part I, MICAI'12, pages 1-14, Berlin, Heidelberg. Springer-Verlag.

Tellez, E. S., S. Miranda-Jiménez, M. Graff, D. Moctezuma, R. R. Suárez, and O. S. Siordia. 2017. A simple approach to multilingual polarity classification in Twitter. Pattern Recognition Letters, 94:68-74.