TASS 2017: Workshop on Semantic Analysis at SEPLN, septiembre 2017, págs. 13-21

Overview of TASS 2017
Resumen de TASS 2017

Eugenio Martínez-Cámara¹, Manuel C. Díaz-Galiano², M. Ángel García-Cumbreras², Manuel García-Vega², Julio Villena-Román³
¹ Ubiquitous Knowledge Processing Lab (UKP-TUDA), Department of Computer Science, Technische Universität Darmstadt
² Grupo de Investigación SINAI, Universidad de Jaén, Jaén, Spain
³ MeaningCloud, Madrid, Spain
¹ camara@ukp.informatik.tu-darmstadt.de, ² {mcdiaz, magc, mgarcia}@ujaen.es, ³ julio.villena@sngular.team

Abstract: This paper describes TASS 2017, the sixth edition of the Workshop on Semantic Analysis at SEPLN 2017. Its main aim is to encourage the research and development of new resources, algorithms and techniques for different tasks of semantic analysis in Spanish. In this paper, we present the proposed tasks, the generated datasets, and a summary of the submitted systems.
Keywords: TASS 2017, sentiment analysis, semantic analysis

Resumen: Este artículo describe la sexta edición del Taller de Análisis Semántico en la SEPLN, conocido como TASS 2017. TASS tiene como objetivo principal incentivar la investigación y desarrollo de recursos, técnicas, algoritmos y herramientas para tareas relacionadas con el análisis semántico en español. A continuación, se describen las tareas propuestas para la edición 2017, así como los corpus creados y utilizados, los distintos participantes y los resultados obtenidos.
Palabras clave: TASS 2017, análisis de opiniones, análisis semántico

1 Introduction

For some years now, Natural Language Processing (NLP) researchers have been working on discovering the meaning of utterances from different perspectives. One of those perspectives is the understanding of subjective, or rather opinion, information. The task of Sentiment Analysis (SA) is the result of this line of study, and it is defined as the computational treatment of opinion, sentiment and subjectivity in text (Pang and Lee, 2008).

However, the potential semantic information encoded in an utterance is so rich and broad that several new NLP tasks have arisen, such as argumentation mining, stance classification, irony detection, or the tasks considered in the different editions of our sibling workshop SemEval.¹

Spanish is the second language in the world by number of native speakers and the second by total number of speakers. Nevertheless, the progress of NLP research in Spanish lags far behind the advances in other languages such as English. Consequently, TASS² (Taller de Análisis de Sentimientos en la SEPLN / Workshop on Sentiment Analysis at SEPLN) was born in 2012 with the aim of fostering the development of specific NLP techniques for the computational treatment of opinions in text written in Spanish. The previous editions in 2016 (García-Cumbreras et al., 2016), 2015 (Villena-Román et al., 2015), 2014 (Villena Román et al., 2015), 2013 (Villena-Román et al., 2014) and 2012 (Villena-Román et al., 2013) have yielded outstanding linguistic resources, such as the General Corpus of TASS and several datasets for the task of polarity classification at aspect level, used as a reference for Spanish by a great number of research groups and companies. Additionally, a research community has grown around TASS that usually participates in the workshop and contributes vivid discussions about the state of the art and the next challenges in SA in Spanish.

¹ http://alt.qcri.org/semeval2018/
² http://www.sepln.org/workshops/tass

ISSN 1613-0073 Copyright © 2017 by the paper's authors. Copying permitted for private and academic purposes.
The organization committee of the workshop has updated its name in the 2017 edition because of the need to widen the gamut of semantic tasks covered by TASS. The new name of TASS is Workshop on Semantic Analysis at SEPLN (Taller de Análisis Semántico en la SEPLN), which keeps the acronym TASS. The change of name is a call to researchers working on other semantic tasks (argumentation mining, irony detection, stance classification, etc.) to organize a shared task for the treatment of semantic information in Spanish in the next edition.

The 2017 edition proposes two subtasks: polarity classification at document (tweet) level (Task 1) and aspect-level polarity classification (Task 2). Apart from reusing several datasets of previous editions, a new dataset was specifically generated for this edition. The new dataset, called InterTASS, is composed of more than 2,000 tweets annotated at four opinion intensity levels (positive, neutral, negative and none). Further details about the tasks and the datasets are given in Sections 2 and 3 respectively.

The 2017 edition has attracted the participation of 11 teams, mainly from Spain and America. Most of the systems follow the state of the art in SA, which is the use of a deep learning architecture. Most of the teams participated in Task 1, and only a few of them in Task 2, which is an indication that polarity classification at aspect level is a tough task.

The rest of this paper is organized as follows. Section 2 presents the two subtasks of TASS 2017 in more detail. Section 3 describes the datasets and how we created them. Section 4 presents the submitted systems and the results they reached. Finally, Section 5 concludes and points out future work in TASS.
2 Tasks

TASS 2017 proposed two tasks addressing the challenge of SA in Twitter in Spanish.

2.1 Task 1. Sentiment Analysis at Tweet level

This main task focused on the evaluation of polarity classification systems at tweet level in Spanish. Systems were evaluated on three different datasets: the two test sets of the General Corpus of TASS³ and a new corpus, InterTASS, which was specifically developed in 2017 for the task (see Section 3).

Datasets were annotated with 4 different polarity labels (positive, negative, neutral and none), and systems had to identify the intensity of the opinion expressed in each tweet as one of those 4 intensity levels. For the two test sets of the General Corpus of TASS, which was annotated with 6 polarity tags, a direct translation from P+ into P and from N+ into N was performed, so that the evaluation is consistent with InterTASS and based on 4 levels of polarity intensity.

All datasets were divided into training, development and test sets, which were provided to participants in order to train and evaluate their systems. Systems were allowed to use any set of data for training, i.e. the training set of InterTASS, training sets from previous editions of TASS or other sets of tweets. However, using the test set of InterTASS or the test sets of previous editions as training data was obviously forbidden. Apart from that, participants could use any kind of linguistic resource for the development of their classification model.

Participants were expected to submit 3 experiments per evaluation set, so each participating team could submit a maximum of 9 result files. Results had to be submitted in a plain text file with the following format:

tweetid \t polarity

Allowed values for polarity were P, NEU, N and NONE.

Accuracy and the macro-averaged versions of Precision, Recall and F1 were used as evaluation measures. Systems were ranked by the Macro-F1 and Accuracy measures.

³ The entire test set is annotated with 4 classes; the 1k test set is also annotated with 4 classes.
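The scoring scheme for Task 1 is simple enough to sketch in a few lines. The following is an illustrative re-implementation, not the official TASS scorer; note that it averages the per-class F1 values, while a scorer could equally compute F1 from the macro-averaged Precision and Recall.

```python
from collections import defaultdict

LABELS = ("P", "NEU", "N", "NONE")

def parse_run(lines):
    """Parse the submission format: one `tweetid \t polarity` pair per line."""
    return dict(line.strip().split("\t") for line in lines if line.strip())

def evaluate(gold, pred):
    """Accuracy plus macro-averaged Precision, Recall and F1 over the 4 labels."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    hits = 0
    for tweet_id, g in gold.items():
        p = pred.get(tweet_id)
        if p == g:
            hits += 1
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    prec = {l: tp[l] / (tp[l] + fp[l]) if tp[l] + fp[l] else 0.0 for l in LABELS}
    rec = {l: tp[l] / (tp[l] + fn[l]) if tp[l] + fn[l] else 0.0 for l in LABELS}
    f1 = {l: 2 * prec[l] * rec[l] / (prec[l] + rec[l]) if prec[l] + rec[l] else 0.0
          for l in LABELS}
    n = len(LABELS)
    return {"accuracy": hits / len(gold),
            "macro_p": sum(prec.values()) / n,
            "macro_r": sum(rec.values()) / n,
            "macro_f1": sum(f1.values()) / n}

gold = parse_run(["1\tP", "2\tN", "3\tNEU", "4\tNONE"])
pred = parse_run(["1\tP", "2\tN", "3\tN", "4\tNONE"])
scores = evaluate(gold, pred)  # 3 of the 4 tweets are labelled correctly
```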
2.2 Task 2. Aspect-based Sentiment Analysis

This second task proposed the development of aspect-based polarity classification systems. Two datasets from previous editions were used to evaluate the systems: Social-TV and STOMPOL (see Section 3). The two datasets are annotated with the aspect, the main category of the aspect, and the polarity of the opinion about the aspect. Systems had to classify the opinion about the given aspect into 3 different polarity labels (positive, negative, neutral).

Participants were expected to submit up to 3 experiments for each corpus, each in a plain text file with the following format:

tweetid \t aspect \t polarity

Allowed polarity values were P, NEU and N.

For evaluation, exact match on a single label combining "aspect-polarity" was used. Similarly to Task 1, the macro-averaged versions of Precision, Recall and F1, and Accuracy were the evaluation measures, and Macro-F1 was used for ranking the systems.

3 Datasets

TASS 2017 provides four datasets to the participants for the evaluation of their systems. Three of the datasets were used in previous editions, and a new dataset, namely InterTASS, was created for TASS 2017. The datasets will be made freely available to the community after the workshop.⁴

3.1 InterTASS

The International TASS Corpus (InterTASS) is a new corpus released this year for the general task (Task 1). The goal of the TASS organization is the creation of a corpus of tweets written in the varieties of Spanish spoken in Spain and in different Hispano-American countries. The first version of InterTASS, released in TASS 2017, is only composed of tweets posted in Spain and written in the Spanish spoken in Spain.

More than 500,000 tweets were collected, from July 2016 to January 2017, using a set of keywords. The downloaded set of tweets was filtered in order to meet the following requirements:

• the language of the tweets must be Spanish,⁵
• each tweet must contain at least one adjective, and
• the minimum length of each tweet must be four words.

Then, the general sentiment of a random selection of tweets was manually annotated by five annotators. We used a scale of 4 levels of polarity: positive (p), neutral (neu), negative (n) and no sentiment tag (none). Each tweet was finally annotated by at least three annotators. When a tweet was given the same tag by two or more annotators, the process ended; if not, each annotator revised the tweet again, until it was given the same tag by two or more annotators. The annotation resulted in a corpus of 3,413 tweets, which was split into three datasets: training, development and test. Table 1 shows the size of each dataset of the InterTASS corpus.

Corpus        #Tweets
Training        1,008
Development       506
Test            1,899
Total           3,413

Table 1: Number of tweets in each dataset of InterTASS

⁴ Further information for requesting the datasets at: http://www.sepln.org/workshops/tass/.
⁵ We used the Python library langdetect.
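The agreement loop described above (a tag is accepted as soon as at least two annotators assign it; otherwise the tweet goes back for another revision round) amounts to a simple majority check. A minimal sketch; the function name is chosen here for illustration:

```python
from collections import Counter

def adjudicate(labels):
    """Return the agreed polarity tag if at least two annotators gave it,
    or None to signal that another revision round is needed."""
    tag, count = Counter(labels).most_common(1)[0]
    return tag if count >= 2 else None

agreed = adjudicate(["P", "P", "NEU"])   # two annotators agree on P
pending = adjudicate(["P", "NEU", "N"])  # no agreement yet
```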
The three datasets of the corpus are distributed as three XML files. Figure 1 shows an example of a tweet from an InterTASS XML file. Each tweet includes its ID (tweetid), the creation date (date) and the user ID (user). Due to restrictions in the Twitter API Terms of Service,⁶ it is forbidden to redistribute a corpus that includes text content or information about users. However, it is allowed if those fields are removed and IDs (including tweet IDs and user IDs) are provided instead. The actual message content can be easily obtained by querying the Twitter API with the tweetid.

The training set was released so that the participants could train and validate their models. The test corpus was provided without any annotation and has been used to evaluate the results. The InterTASS statistics are shown in Table 2.

        Training   Dev.    Test
P            317    156     642
NEU          133     69     216
N            416    219     767
NONE         138     62     274
Total      1,008    506   1,899

Table 2: Number of tweets in each dataset and class of InterTASS

⁶ https://dev.twitter.com/terms/api-terms
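Reconstructing a usable corpus from the distributed files therefore means parsing the XML and re-fetching the text by tweet ID. The snippet below illustrates the idea with Python's standard library; the tag names are assumptions based on the fields described above, not the exact InterTASS schema.

```python
import xml.etree.ElementTree as ET

# Illustrative sample in the spirit of the InterTASS files; the tag names
# (tweets, tweet, tweetid, user, date, lang, sentiment/polarity/value) are
# assumed for this example.
SAMPLE = """
<tweets>
  <tweet>
    <tweetid>768212591105703936</tweetid>
    <user>martitarey13</user>
    <date>2016-08-23 22:25:29</date>
    <lang>es</lang>
    <sentiment><polarity><value>N</value></polarity></sentiment>
  </tweet>
</tweets>
"""

def load_labels(xml_text):
    """Collect (tweetid, polarity) pairs; the tweet text itself must be
    re-fetched from the Twitter API using the tweetid."""
    root = ET.fromstring(xml_text)
    return [(t.findtext("tweetid"), t.findtext("sentiment/polarity/value"))
            for t in root.iter("tweet")]

pairs = load_labels(SAMPLE)
```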
[Figure 1: A tweet from the XML file of the InterTASS corpus. The entry contains the fields tweetid (768212591105703936), user (martitarey13), content ("@estherct209 jajajaja la tuya y la d mucha gente seguro !! Pero yo no puedo sin mi melena me muero"), date (2016-08-23 22:25:29), lang (es), the polarity value (N) and the agreement indicator (AGREEMENT).]

3.2 General corpus

The General Corpus of TASS has 68,000 tweets, written in Spanish by about 150 well-known personalities and celebrities of the world of politics, economy, communication, mass media and culture, between November 2011 and March 2012. The details of the corpus are described in (Villena-Román et al., 2015; García-Cumbreras et al., 2016). Figure 2 shows a tweet from the General Corpus of TASS.

[Figure 2: A tweet from the XML file of the General Corpus of TASS. The entry contains an anonymized user ID (0000000000), the content ("q es adicto al drama! Ja ja ja"), the date (2011-12-02T02:59:03), the language (es), the polarity value (P+), the agreement indicator (AGREEMENT) and the topic (entretenimiento).]

3.3 Social-TV Corpus

The Social-TV corpus was gathered during the 2014 Final of the Copa del Rey championship in Spain between Real Madrid and F.C. Barcelona, played on 16 April 2014 at the Mestalla Stadium in Valencia. After filtering out useless information, a subset of 2,773 tweets was selected. The details of the corpus are described in (Villena-Román et al., 2015; García-Cumbreras et al., 2016).

All tweets were manually annotated with 31 different aspects and their sentiment polarity. The corpus was randomly divided into a training set (1,773 tweets) and a test set (1,000 tweets), with a similar distribution of both aspects and sentiments. Figure 3 shows a tweet from the Social-TV corpus.

[Figure 3: A tweet from the XML file of the Social-TV corpus, with aspect-level sentiment annotations (P) on the content "Para mi, ISCO ha hecho un partidazo. El mejor partido desde que llego al Real Madrid."]
3.4 STOMPOL

STOMPOL (corpus of Spanish Tweets for Opinion Mining at aspect level about POLitics) is a corpus of Spanish tweets developed for research in opinion mining at aspect level. Each tweet in the corpus has been manually annotated with the sentiment polarity at aspect level by two annotators, and by a third one in case of disagreement.

The corpus is composed of 1,284 tweets, and has been divided into a training set (784 tweets), which is provided for building and validating the systems, and a test set (500 tweets) that is used for evaluation. The details of the corpus are described in (Villena-Román et al., 2015; García-Cumbreras et al., 2016). Figure 4 shows a tweet from the STOMPOL corpus.

[Figure 4: A tweet from the XML file of the STOMPOL corpus, with aspect-level annotations on the content "@rosadiezupyd lamenta que el #empleo no termine de estabilizarse y dice que el #paro 'sigue siendo dramático' http://t.co/1xdS3UjJWk #EPA".]

4 Participants and Results

Most of the systems submitted to TASS 2017 are based on the use of deep learning techniques, the state of the art in SA in Twitter. However, some of the systems are based on traditional machine learning methods, and others are meta-classifiers whose inputs are the outputs of deep learning systems and traditional machine learning algorithms. We depict the main features of the submitted systems in the subsequent paragraphs.

Table 3, Table 4 and Table 5 show the results reached by the submitted systems in Task 1, using the test sets of the InterTASS corpus and the General Corpus (full test and 1k test). Table 6 and Table 7 show the results reached by the submitted systems in Task 2, using the test sets of the Social-TV corpus and the STOMPOL corpus respectively.

Hurtado, Pla, and González (2017) participated in the two tasks. They submitted the same system for both tasks; the only difference between the tasks lies in the characteristics of the input. The input for the first task is the entire tweet, while the input for the second task is the context of the aspects, which is previously determined. The authors created a set of domain-specific word embeddings following the approach of Tang (2015). This word embedding set is used jointly with a general-domain set of embeddings to represent the tokens of the tweets. The authors evaluated three different neural network architectures: the first one is a multilayer perceptron (MLP), the second encodes the tweets with a convolutional neural network (CNN), and the third with a long short-term memory (LSTM) recurrent neural network (RNN). The performance of each configuration depends on the evaluation dataset.

Cerón-Guzmán (2017) submitted an ensemble classifier system for the first task. The author generated quantitative features from the tweets, used lists of opinion-bearing words such as iSOL (Molina-González et al., 2013), and inverted the polarity of words following a window-shifting approach for negation handling. The base classifiers of the ensemble system were Logistic Regression and SVM. The system followed two ensemble strategies, namely stacking and maximum classification confidence. The maximum confidence strategy outperformed the stacking strategy, and it reached the highest accuracy value on the test set of the InterTASS dataset.
Montañés Salas et al. (2017) used the FastText classifier (Joulin et al., 2016) for classifying only the test set of the InterTASS dataset. The authors applied a traditional pre-processing to the input tweets; notably, words with an emotional meaning were substituted by their synonyms from a list of words with an emotional meaning (Bradley and Lang, 1999).

Rosá et al. (2017) participated in the two tasks. Concerning the first task, the authors submitted three systems: 1) an SVM classifier with word embeddings and quantitative linguistic properties as features; 2) a deep neural network grounded on the use of a CNN for encoding the input tweets; and 3) the combination of the two previous classifiers by selecting the output class with the highest mean probability over the two classifiers. The third strategy outperformed the other ones on two test sets of Task 1. Regarding Task 2, the authors submitted two SVM classifiers mainly grounded on the use of word embeddings.
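The combination strategy of the third run can be written down directly: average the class distributions produced by the two classifiers and pick the class with the highest mean. A minimal sketch; the probability values are made up for illustration:

```python
def combine(svm_probs, cnn_probs):
    """Average the per-class probabilities of two classifiers and return
    the class with the highest mean probability."""
    mean = {c: (svm_probs[c] + cnn_probs[c]) / 2 for c in svm_probs}
    return max(mean, key=mean.get)

label = combine({"P": 0.6, "NEU": 0.1, "N": 0.2, "NONE": 0.1},
                {"P": 0.2, "NEU": 0.2, "N": 0.5, "NONE": 0.1})
```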
García-Vega et al. (2017) submitted four systems for the classification of the test set of the InterTASS dataset. The first two systems are an SVM classifier that uses word embeddings as features. The difference between these two systems lies in the use of additional tweets from the users of the training set; the intention of the authors was to introduce the language use of each user into the classification. The last two systems are deep neural networks grounded on the use of an LSTM RNN for encoding the meaning of the input tweets. The first neural architecture uses word embeddings as features, and the second one the TF-IDF value of each word of the tweets.

Moctezuma et al. (2017) based their participation on an ensemble of SVM classifiers combined into a non-linear model created with genetic programming, to tackle the task of global polarity classification at tweet level. They used the B4MSA algorithm, a baseline supervised learning system based on an SVM classifier with an entropy-based term-weighting scheme. Additionally, they used EvoDAG, a genetic programming system that combines all the decision values predicted by the B4MSA systems. They also used two external datasets to train the B4MSA algorithm.

Navas-Loro and Rodríguez-Doncel (2017) participated only in Task 1. They experimented with two classification algorithms, Multinomial Naïve Bayes and Sequential Minimal Optimization for SVM. Furthermore, they used morphosyntactic analyses for negation detection, along with lexicons and dedicated preprocessing techniques for detecting and correcting frequent errors and expressions in tweets.

System                          M-F1    Acc.
ELiRF-UPV-run1                  0.493   0.607
RETUYT-svm cnn                  0.471   0.596
ELiRF-UPV-run3                  0.466   0.597
ITAINNOVA-model4                0.461   0.576
jacerong-run-2                  0.460   0.602
jacerong-run-1                  0.459   0.608
INGEOTEC-evodag 001             0.457   0.507
RETUYT-svm                      0.457   0.583
tecnolengua-sent only           0.456   0.582
ELiRF-UPV-run2                  0.450   0.436
ITAINNOVA-model3                0.445   0.561
RETUYT-cnn3                     0.443   0.558
SINAI-w2v-nouser                0.442   0.575
tecnolengua-run3                0.441   0.576
tecnolengua-sent only fixed     0.441   0.595
ITAINNOVA-model2                0.436   0.576
LexFAR-run3                     0.432   0.541
LexFAR-run1                     0.430   0.539
jacerong-run-3                  0.430   0.576
SINAI-w2v-user                  0.428   0.569
INGEOTEC-evodag 002             0.403   0.515
OEG-victor2                     0.395   0.451
OEG-victor0                     0.383   0.433
OEG-laOEG                       0.377   0.505
LexFAR-run2                     0.372   0.490
GSI-sent64-189                  0.371   0.524
SINAI-embed-rnn2                0.333   0.391
GSI-sent64-149-ant-2            0.306   0.479
GSI-sent64-149-ant              0.000   0.000

Table 3: Task 1 InterTASS corpus results

System                          M-F1    Acc.
INGEOTEC-evodag 003             0.577   0.645
jacerong-run-1                  0.569   0.706
jacerong-tass 2016-run 3        0.568   0.705
ELiRF-UPV-run2                  0.549   0.659
ELiRF-UPV-run3                  0.548   0.725
RETUYT-svm cnn                  0.546   0.674
jacerong-run-2                  0.545   0.701
ELiRF-UPV-run1                  0.542   0.666
RETUYT-cnn                      0.541   0.638
RETUYT-cnn3                     0.539   0.654
tecnolengua-run3                0.528   0.657
tecnolengua-final               0.517   0.632
tecnolengua-531F1 no ngrams     0.508   0.652
INGEOTEC-evodag 001             0.447   0.514
OEG-victor2                     0.389   0.496
INGEOTEC-evodag 002             0.364   0.449
OEG-laOEG                       0.346   0.407
GSI-64sent99ally                0.324   0.434

Table 4: Task 1 General Corpus of TASS (full test) results
Araque et al. (2017) proposed, for Task 1, an RNN architecture composed of LSTM cells followed by a feed-forward network. The architecture makes use of two different types of features: word embeddings and sentiment lexicon values. The recurrent architecture allows them to process text sequences of different lengths, while the lexicon inserts sentiment information directly into the system. Two variations of this architecture were used: an LSTM that iterates over the input word vectors, and, on the other hand, one that combines the input word vectors with polarity values from a sentiment lexicon.

Tume Fiestas and Sobrevilla Cabezudo (2017) proposed, for Task 2, an approach based on word embeddings for polarity classification at aspect level. They used word embeddings to obtain the similarity between words selected from a training set, and built a model to classify the polarity of each aspect of each tweet. Their results show that the more tweets are used, the better the accuracy obtained.

Reyes-Ortiz et al. (2017) proposed, for Task 1, a system that uses machine learning, specifically the support vector machines algorithm, and lexicons of semantic polarities at lemma level for Spanish. Features extracted from the lexicons are represented with the bag-of-words model and weighted using a term frequency measure at tweet level.

Moreno-Ortiz and Pérez Hernández (2017) proposed, for Task 1, a classification model based on the Lingmotif Spanish lexicon, combined with a number of formal text features, both general and CMC-specific, as well as single-word and n-gram keywords. They used a logistic regression classifier trained with the optimal set of features, and an SVM classifier on the same feature set. Sentiment features are obtained with the Lingmotif SA engine (sentiment feature set, text feature set and keywords feature set).

System                          M-F1    Acc.
RETUYT-svm                      0.562   0.700
RETUYT-cnn4                     0.557   0.694
RETUYT-cnn2                     0.555   0.694
INGEOTEC-evodag 003             0.526   0.595
tecnolengua-run3                0.521   0.638
ELiRF-UPV-run1                  0.519   0.630
jacerong-tass 2016-run 3        0.518   0.625
jacerong-run-1                  0.508   0.678
jacerong-run-2                  0.506   0.673
ELiRF-UPV-run2                  0.504   0.596
tecnolengua-final               0.488   0.618
tecnolengua-run4                0.483   0.612
ELiRF-UPV-run3                  0.477   0.588
INGEOTEC-evodag 002             0.439   0.431
INGEOTEC-evodag 001             0.388   0.486
OEG-victor3b                    0.367   0.386
OEG-victor2                     0.366   0.412
OEG-laOEG                       0.346   0.448
GSI-run-1                       0.327   0.558
GSI-64sent99ally                0.321   0.499

Table 5: Task 1 General Corpus of TASS (1k test) results

System                  M-F1    Acc.
ELiRF-UPV-run3          0.537   0.615
ELiRF-UPV-run2          0.513   0.600
ELiRF-UPV-run1          0.476   0.625
RETUYT-svm2             0.426   0.595
RETUYT-svm              0.413   0.493

Table 6: Task 2 Social-TV corpus results

System                  M-F1    Acc.
ELiRF-UPV-run1          0.537   0.615
RETUYT-svm2             0.508   0.590
ELiRF-UPV-run3          0.486   0.578
ELiRF-UPV-run2          0.486   0.541
C100T-PUCP-run3         0.445   0.528
C100T-PUCP-run1         0.415   0.563
C100T-PUCP-run2         0.414   0.517
RETUYT-svm              0.377   0.514

Table 7: Task 2 STOMPOL corpus results

5 Conclusion and Future work

TASS was the first workshop on sentiment analysis focused on the processing of texts written in Spanish.
In this edition, 11 teams participated with a total of 123 runs, most of them in the InterTASS task. In any case, the released corpora and the reports from the participants will certainly be helpful for other research groups approaching these tasks.

Future work will mainly go in two directions. On the one hand, the organization of one or more shared tasks for the treatment of semantic information in Spanish, like those mentioned above (argumentation mining, irony detection and stance classification). On the other hand, the extension and improvement of the InterTASS corpus. This corpus has been received with great interest (almost 90% of the experiments were carried out on the first task), so an exhaustive analysis of the behavior of the corpus in this task will show the right way towards a new version of the corpus.

TASS corpora will be released after the workshop for free use by the research community.

Acknowledgement

This research work is partially supported by the project REDES (TIN2015-65136-C2-1-R) and a grant from the Fondo Europeo de Desarrollo Regional (FEDER).

References

Araque, O., R. Barbado, J. F. Sánchez-Rada, and C. A. Iglesias. 2017. Applying recurrent neural networks to sentiment analysis of Spanish tweets. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Bradley, M. M. and P. J. Lang. 1999. Affective norms for English words (ANEW): Stimuli, instruction manual, and affective ratings. Technical report, Center for Research in Psychophysiology, University of Florida.

Cerón-Guzmán, J. A. 2017. Classifier ensembles that push the state-of-the-art in sentiment analysis of Spanish tweets. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

García-Cumbreras, M. A., J. Villena-Román, E. Martínez-Cámara, M. C. Díaz-Galiano, M. T. Martín-Valdivia, and L. A. Ureña López. 2016. Overview of TASS 2016. In TASS 2016: Workshop on Sentiment Analysis at SEPLN, pages 13-21.

García-Vega, M., A. Montejo-Ráez, M. C. Díaz-Galiano, and S. M. Jiménez-Zafra. 2017. SINAI en TASS 2017: Clasificación de la polaridad de tweets integrando información de usuario. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Hurtado, L.-F., F. Pla, and J.-A. González. 2017. ELiRF-UPV en TASS 2017: Análisis de sentimientos en Twitter basado en aprendizaje profundo. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Joulin, A., E. Grave, P. Bojanowski, and T. Mikolov. 2016. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.

Moctezuma, D., M. Graff, S. Miranda-Jiménez, E. S. Tellez, A. Coronado, C. N. Sánchez, and J. Ortiz-Bejar. 2017. A genetic programming approach to sentiment analysis for Twitter: TASS'17. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Molina-González, M. D., E. Martínez-Cámara, M. T. Martín-Valdivia, and J. M. Perea-Ortega. 2013. Semantic orientation for polarity classification in Spanish reviews. Expert Systems with Applications, 40(18):7250-7257.

Montañés Salas, R. M., R. del Hoyo Alonso, J. Vea-Murguía Merck, R. Aznar Gimeno, and F. J. Lacueva-Pérez. 2017. FastText como alternativa a la utilización de deep learning en corpus pequeños. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Moreno-Ortiz, A. and C. Pérez Hernández. 2017. Tecnolengua Lingmotif at TASS 2017: Spanish Twitter dataset classification combining wide-coverage lexical resources and text features. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Navas-Loro, M. and V. Rodríguez-Doncel. 2017. OEG at TASS 2017: Spanish sentiment analysis of tweets at document level. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Pang, B. and L. Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.

Reyes-Ortiz, J. A., F. Paniagua-Reyes, B. Priego-Sánchez, and M. Tovar. 2017. LexFAR en la competencia TASS 2017: Análisis de sentimientos en Twitter basado en lexicones. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Rosá, A., L. Chiruzzo, M. Etcheverry, and S. Castro. 2017. RETUYT en TASS 2017: Análisis de sentimientos de tweets en español utilizando SVM y CNN. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Tang, D. 2015. Sentiment-specific representation learning for document-level sentiment analysis. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, WSDM '15, pages 447-452, New York, NY, USA. ACM.

Tume Fiestas, F. and M. A. Sobrevilla Cabezudo. 2017. C100TPUCP at TASS 2017: Word embedding experiments for aspect-based sentiment analysis in Spanish tweets. In Proceedings of TASS 2017: Workshop on Sentiment Analysis at SEPLN co-located with the 33rd SEPLN Conference (SEPLN 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Villena-Román, J., J. García-Morera, M. A. García-Cumbreras, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña López. 2015. Overview of TASS 2015. In TASS 2015: Workshop on Sentiment Analysis at SEPLN, pages 13-21.

Villena-Román, J., J. García-Morera, S. Lana-Serrano, and J. C. González-Cristóbal. 2014. TASS 2013 - a second step in reputation analysis in Spanish. Procesamiento del Lenguaje Natural, 52(0):37-44, March.

Villena-Román, J., S. Lana-Serrano, E. Martínez-Cámara, and J. C. González-Cristóbal. 2013. TASS - Workshop on sentiment analysis at SEPLN. Procesamiento del Lenguaje Natural, 50:37-44.

Villena Román, J., E. Martínez Cámara, J. García Morera, and S. M. Jiménez Zafra. 2015. TASS 2014 - the challenge of aspect-based sentiment analysis. Procesamiento del Lenguaje Natural, 54(0):61-68.