=Paper= {{Paper |id=Vol-1397/overview |storemode=property |title=Overview of TASS 2015 |pdfUrl=https://ceur-ws.org/Vol-1397/overview.pdf |volume=Vol-1397 |dblpUrl=https://dblp.org/rec/conf/sepln/Villena-RomanGC15 }} ==Overview of TASS 2015== https://ceur-ws.org/Vol-1397/overview.pdf
TASS 2015, September 2015, pp. 13-21                                            received 20-07-15, revised 24-07-15, accepted 25-07-15




                                          Overview of TASS 2015

                                              Resumen de TASS 2015
              Julio Villena Román                                        Miguel Ángel García Cumbreras
             Janine García Morera                                           Eugenio Martínez Cámara
                  Daedalus, S.A.                                            M. Teresa Martín Valdivia
               28031 Madrid, Spain                                            L. Alfonso Ureña López
   {jvillena, jgarcia, cdepablo}@daedalus.es                                     Universidad de Jaén
                                                                                  23071 Jaén, Spain
                                                                      {magc, emcamara, laurena, maite}@uja.es

          Resumen: Este artículo describe la cuarta edición del taller de evaluación experimental TASS
          2015, enmarcada dentro del congreso internacional SEPLN 2015. El principal objetivo de TASS
          es promover la investigación y el desarrollo de nuevos algoritmos, recursos y técnicas para el
          análisis de sentimientos en medios sociales (concretamente en Twitter), aplicado al idioma
          español. Este artículo describe las tareas propuestas en TASS 2015, así como el contenido de los
          corpus utilizados, los participantes en las distintas tareas y los resultados generales obtenidos y
          el análisis de estos resultados.
          Palabras clave: TASS 2015, análisis de opiniones, medios sociales

          Abstract: This paper describes TASS 2015, the fourth edition of the Workshop on Sentiment
          Analysis at SEPLN. Its main objective is to promote research on and development of new
          algorithms, resources and techniques in the field of sentiment analysis in social media
          (specifically Twitter), focused on the Spanish language. This paper presents the tasks
          proposed in TASS 2015, the contents of the generated corpora, the participant groups, and
          the results together with their analysis.
          Keywords: TASS 2015, sentiment analysis, social media.


      1      Introduction

TASS is an experimental evaluation workshop, a satellite event of the annual SEPLN
Conference, that aims to promote research on sentiment analysis systems in social
media, focused on the Spanish language. The fourth edition will be held on September 15th,
2015 at the University of Alicante, Spain.
   Sentiment analysis (SA) can be defined as the computational treatment of opinion,
sentiment and subjectivity in texts (Pang & Lee, 2002). It is a hard task because even humans
often disagree on the sentiment of a given text, and it is harder still when the text has only
140 characters (Twitter messages or tweets).
   Text classification techniques, although studied and improved for a longer time, still
need more research effort and resources in order to build better models and improve on
current results. Polarity classification has usually been tackled following two main
approaches. The first one applies machine learning algorithms in order to train a polarity
classifier using a labelled corpus (Pang et al. 2002). This approach is also known as the
supervised approach. The second one is known as semantic orientation, or the unsupervised
approach, and it integrates linguistic resources in a model in order to identify the valence of
the opinions (Turney 2002).
   The aim of TASS is to provide a competitive forum where the newest research works in the
field of SA in social media, specifically focused on Spanish tweets, are shown and discussed by
the scientific and business communities.
   The rest of the paper is organized as follows. Section 2 describes the corpora
provided to participants. Section 3 presents the tasks of TASS 2015. Section 4
describes the participants, and the overall results are presented in Section 5. Finally, the last
section draws some conclusions and outlines future directions.

Published at http://ceur-ws.org/Vol-1397/. CEUR-WS.org is a serial publication with a recognized ISSN               ISSN 1613-0073
     J. Villena Román, J. García Morera, M. Á. García Cumbreras, E. Martínez Cámara, M. T. Martín Valdivia, L. A. Ureña López


    2     Corpus

TASS 2015 experiments are based on three corpora, specifically built for the different
editions of the workshop.

   2.1 General corpus

The general corpus contains over 68.000 tweets, written in Spanish, about 150 well-known
personalities and celebrities from the worlds of politics, economy, communication, mass media
and culture, collected between November 2011 and March 2012. Although the context of extraction
has a Spain-focused bias, the diverse nationality of the authors, including people from Spain,
Mexico, Colombia, Puerto Rico, USA and many other countries, gives the corpus
global coverage of the Spanish-speaking world.
Each tweet includes its ID (tweetid), the creation date (date) and the user ID (user). Due to
restrictions in the Twitter API Terms of Service (https://dev.twitter.com/terms/api-terms), it is
forbidden to redistribute a corpus that includes text contents or information about users.
However, redistribution is allowed if those fields are removed and IDs (including tweet IDs and
user IDs) are provided instead. The actual message content can be easily obtained by querying
the Twitter API with the tweetid.
The general corpus has been divided into a training set (about 10%) and a test set (90%). The
training set was released so that participants could train and validate their models. The test
corpus was provided without any tagging and has been used to evaluate the results.
Obviously, it was not allowed to use the test data from previous years to train the systems.
Each tweet was tagged with its global polarity (positive, negative or neutral sentiment) or no
sentiment at all. A set of 6 labels has been defined: strong positive (P+), positive (P),
neutral (NEU), negative (N), strong negative (N+) and one additional no sentiment tag
(NONE).
In addition, there is also an indication of the level of agreement or disagreement of the
expressed sentiment within the content, with two possible values: AGREEMENT and
DISAGREEMENT. This is especially useful to tell whether a neutral sentiment comes
from neutral keywords or whether the text contains positive and negative sentiments at the same
time.
Moreover, the polarity values related to the entities that are mentioned in the text are also
included where applicable. These values are likewise tagged with 6 possible
values and include the level of agreement as related to each entity.
This corpus is also based on a selection of topics, that is, thematic areas such as "política"
("politics"), "fútbol" ("soccer"), "literatura" ("literature") or "entretenimiento"
("entertainment"). Each tweet in both the training and test set has been assigned to one or
several of these topics (most messages are associated with just one topic, due to the short
length of the text).
   All tagging has been done semiautomatically: a baseline machine learning
model is first run and then all tags are manually checked by human experts. In the case of the
polarity at entity level, due to the high volume of data to check, this tagging has only been done
for the training set.

   Table 1 shows a summary of the training and test corpora provided to participants.

   Attribute                 Value
   Tweets                    68.017
   Tweets (test)             60.798 (89%)
   Tweets (train)            7.219 (11%)
   Topics                    10
   Users                     154
   Date start (train)        2011-12-02
   Date end (train)          2012-04-10
   Date start (test)         2011-12-02
   Date end (test)           2012-04-10

              Table 1: Corpus statistics

   Users were journalists (periodistas), politicians (políticos) or celebrities (famosos).
The only language involved this year was Spanish (es).
   The list of topics that have been selected is the following:
   • Politics (política)
   • Entertainment (entretenimiento)
   • Economy (economía)
   • Music (música)
   • Soccer (fútbol)
   • Films (películas)
   • Technology (tecnología)
   • Sports (deportes)
   • Literature (literatura)
   • Other (otros)
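
As a quick sanity check on the General corpus split reported in Table 1, the sketch below recomputes the train and test shares from the tweet counts. It assumes that the 7.219 (11%) row is the training set, which is what the "about 10%" figure in the text implies, and that the dots in the paper's figures are thousands separators. The constant and function names are illustrative only.

```python
# Tweet counts as reported in Table 1 ('.' in the paper is a thousands separator).
TOTAL_TWEETS = 68017
TEST_TWEETS = 60798
TRAIN_TWEETS = 7219  # assumption: the 7.219 (11%) row is the training set


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# The two subsets should add up to the full corpus.
assert TRAIN_TWEETS + TEST_TWEETS == TOTAL_TWEETS

print(share(TEST_TWEETS, TOTAL_TWEETS), share(TRAIN_TWEETS, TOTAL_TWEETS))
```

Under these assumptions the rounded shares match the 89% and 11% given in Table 1.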




   The corpus is encoded in XML. Figure 1 shows the information of two sample tweets.
The first tweet is only tagged with the global polarity, as the text contains no mention of any
entity, but the second one is tagged with both the global polarity of the message and the
polarity associated with each of the entities that appear in the text (UPyD and Foro Asturias).

   Figure 1: Sample tweets (General corpus)

2.2   Social-TV corpus

The Social-TV corpus was collected during the 2014 Final of the Copa del Rey championship in
Spain between Real Madrid and F.C. Barcelona, played on 16 April 2014 at Mestalla
Stadium in Valencia. After filtering out useless information, a subset of 2.773 tweets was
selected.
All tweets were manually tagged with the aspects and their sentiment polarity. Tweets may
cover more than one aspect.
   The list of the 31 aspects that have been defined is the following:
   • Afición (supporters)
   • Árbitro (referee)
   • Autoridades (authorities)
   • Entrenador (coach)
   • Equipo - Atlético de Madrid (Team - Atlético de Madrid)
   • Equipo - Barcelona (Team - Barcelona)
   • Equipo - Real Madrid (Team - Real Madrid)
   • Equipo (any other team)
   • Jugador - Alexis Sánchez (Player - Alexis Sánchez)
   • Jugador - Álvaro Arbeloa (Player - Álvaro Arbeloa)
   • Jugador - Andrés Iniesta (Player - Andrés Iniesta)
   • Jugador - Ángel Di María (Player - Ángel Di María)
   • Jugador - Asier Ilarramendi (Player - Asier Ilarramendi)
   • Jugador - Carles Puyol (Player - Carles Puyol)
   • Jugador - Cesc Fábregas (Player - Cesc Fábregas)
   • Jugador - Cristiano Ronaldo (Player - Cristiano Ronaldo)
   • Jugador - Dani Alves (Player - Dani Alves)
   • Jugador - Dani Carvajal (Player - Dani Carvajal)
   • Jugador - Fábio Coentrão (Player - Fábio Coentrão)
   • Jugador - Gareth Bale (Player - Gareth Bale)
   • Jugador - Iker Casillas (Player - Iker Casillas)
   • Jugador - Isco (Player - Isco)
   • Jugador - Javier Mascherano (Player - Javier Mascherano)
   • Jugador - Jesé Rodríguez (Player - Jesé Rodríguez)
   • Jugador - José Manuel Pinto (Player - José Manuel Pinto)
   • Jugador - Karim Benzema (Player - Karim Benzema)
   • Jugador - Lionel Messi (Player - Lionel Messi)
   • Jugador - Luka Modric (Player - Luka Modric)
   • Jugador - Marc Bartra (Player - Marc Bartra)
   • Jugador - Neymar Jr. (Player - Neymar Jr.)
   • Jugador - Pedro Rodríguez (Player - Pedro Rodríguez)
   • Jugador - Pepe (Player - Pepe)
   • Jugador - Sergio Busquets (Player - Sergio Busquets)
   • Jugador - Sergio Ramos (Player - Sergio Ramos)



   • Jugador - Xabi Alonso (Player - Xabi Alonso)
   • Jugador - Xavi Hernández (Player - Xavi Hernández)
   • Jugador (any other player)
   • Partido (Football match)
   • Retransmisión (broadcast)
    Sentiment polarity has been tagged from the point of view of the person who writes the
tweet, using 3 levels: P, NEU and N. No distinction is made between cases where the author
does not express any sentiment and cases where he/she expresses a sentiment that is neither
positive nor negative.
    The Social-TV corpus was randomly divided into a training set (1.773 tweets) and a test set
(1.000 tweets), with a similar distribution of both aspects and sentiments. The training set
was released previously, and the test corpus was provided without any tagging and has been used
to evaluate the results provided by the different systems.
    Figure 2 shows the information of three sample tweets in the training set.

  Figure 2: Sample tweets (Social-TV corpus)

2.3     STOMPOL corpus
    STOMPOL (corpus of Spanish Tweets for Opinion Mining at aspect level about POLitics)
is a corpus of Spanish tweets prepared for research on the challenging task of opinion
mining at aspect level. The tweets were gathered from 23rd to 24th of April 2015, and
are related to one of the following political aspects that appear in political campaigns:
• Economics (Economía): taxes, infrastructure, markets, labor policy...
• Health System (Sanidad): hospitals, public/private health system, drugs, doctors...
• Education (Educacion): state school, private school, scholarships...
• Political party (Propio_partido): anything good (speeches, electoral programme...) or
     bad (corruption, criticism) related to the entity
• Otros_aspectos (Other aspects): electoral system, environmental policy...
   Each aspect is related to one or several entities that correspond to one of the main
political parties in Spain, which are:
• Partido_Popular (PP)
• Partido_Socialista_Obrero_Español (PSOE)
• Izquierda_Unida (IU)
• Podemos
• Ciudadanos (Cs)
• Unión_Progreso_y_Democracia (UPyD)

   Each tweet in the corpus has been manually tagged by two annotators, and by a third one in
case of disagreement, with the sentiment polarity at aspect level. Sentiment polarity has
been tagged from the point of view of the person who writes the tweet, using 3 levels: P,
NEU and N. Again, no distinction is made between no sentiment and a neutral sentiment
(neither positive nor negative). Each political aspect is linked to its corresponding political
party and its polarity.

   Figure 3 shows the information of two sample tweets.

   Figure 3: Sample tweets (STOMPOL corpus)

   These three corpora will be made freely available to the community after the workshop.
To request them, please send an email to tass@daedalus.es, filling in the TASS Corpus License
agreement with your email, affiliation (institution, company or any kind of organization) and
a brief description of your research objectives. You will then be given a password to download
the files from the password-protected area. The only requirement is to include a citation to a
relevant paper and/or the TASS website.

      3      Description of tasks
   First of all, we are interested in evaluating the evolution of the different approaches for SA
and text classification in Spanish during these years. So, the traditional SA at global level task
will be repeated, reusing the same corpus,



to compare results. Moreover, we want to foster research on fine-grained
polarity analysis at aspect level (aspect-based SA, one of the new requirements of the market
of natural language processing in these areas). So, two legacy tasks will be repeated, to
compare results, and a new corpus has been created for the second task.
    Participants are expected to submit up to 3 results of different experiments for one or both
of these tasks, in the appropriate format described below.
    Along with the submission of experiments, participants have been invited to submit a paper
to the workshop in order to describe their experiments and discuss the results with the
audience in a regular workshop session.
    The two proposed tasks are described next.

3.1   (legacy) Task 1: Sentiment Analysis at Global Level

   This is the same task as in previous editions. It consists of performing an automatic
polarity classification to determine the global polarity of each message in the test set of the
General corpus. Participants have been provided with the training set of the General
corpus so that they may train and validate their models. There will be two different evaluations:
one based on 6 different polarity labels (P+, P, NEU, N, N+, NONE) and another based on just 4
labels (P, N, NEU, NONE).
Participants are expected to submit (up to 3) experiments for the 6-labels evaluation, but are
also allowed to submit (up to 3) specific experiments for the 4-labels scenario.
   Results must be submitted in a plain text file with the following format:
        tweetid \t polarity

where polarity can be:
 • P+, P, NEU, N, N+ and NONE for the 6-labels case
 • P, NEU, N and NONE for the 4-labels case.

The same test corpus of previous years will be used for the evaluation, to allow for comparison
among systems. Accuracy, macroaveraged precision, macroaveraged recall and
macroaveraged F1-measure have been used to evaluate each run.
   Notice that there are two test sets: the complete set and the 1k set, a subset of the first
one. The reason is that, to deal with the problem of the imbalanced distribution of labels between
the training and test set, a selected test subset containing 1.000 tweets with a similar
distribution to the training corpus was extracted to be used for an alternate evaluation of the
performance of systems.

3.2       (legacy) Task 2: Aspect-based sentiment analysis

Participants have been provided with a corpus tagged with a series of aspects, and systems
must identify the polarity at the aspect level. Two corpora have been provided: the Social-TV
corpus, used in TASS 2014, and the new STOMPOL corpus, collected in 2015
(described above). Both corpora have been split into training and test sets, the first one
for building and validating the systems, and the second for evaluation.
Participants are expected to submit up to 3 experiments for each corpus, each in a plain
text file with the following format:

   tweetid \t aspect \t polarity
   [for the Social-TV corpus]

   tweetid \t aspect-entity \t polarity
   [for the STOMPOL corpus]

   Allowed polarity values are P, N and NEU.
   For evaluation, a single label combining "aspect-polarity" has been considered. As in the
first task, accuracy, macroaveraged precision, macroaveraged recall and
macroaveraged F1-measure have been calculated for the global result.

      4     Participants and Results
This year 35 groups registered (as compared to 31 groups last year) but unfortunately only 17
groups (14 last year) sent their submissions. The list of active participant groups is shown in
Table 2, including the tasks in which they have participated.
   Fourteen of the seventeen participant groups sent a report describing their experiments and
results achieved. Papers were reviewed and included in the workshop proceedings.
References are listed in Table 3.
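
To make the Task 1 submission format and the evaluation measures of Section 3.1 concrete, here is a minimal sketch in Python. It is not the organizers' evaluation script: the function names `parse_run` and `evaluate` are hypothetical, a tweet missing from a run is assumed to count as an error, and macroaveraged F1 is taken here as the mean of the per-label F1 values, since the exact definition used by the organizers is not stated in this section.

```python
from collections import Counter

LABELS_6 = ("P+", "P", "NEU", "N", "N+", "NONE")
LABELS_4 = ("P", "NEU", "N", "NONE")


def parse_run(text, labels):
    """Parse 'tweetid<TAB>polarity' lines into a {tweetid: polarity} dict."""
    run = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) != 2 or fields[1] not in labels:
            raise ValueError(f"line {line_no}: bad submission line {line!r}")
        run[fields[0]] = fields[1]
    return run


def evaluate(gold, pred, labels):
    """Accuracy plus macroaveraged precision, recall and F1 over `labels`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    correct = 0
    for tweetid, gold_label in gold.items():
        pred_label = pred.get(tweetid)  # assumption: a missing tweet is an error
        if pred_label == gold_label:
            tp[gold_label] += 1
            correct += 1
        else:
            fn[gold_label] += 1
            if pred_label is not None:
                fp[pred_label] += 1

    def ratio(num, den):
        return num / den if den else 0.0

    precisions = [ratio(tp[l], tp[l] + fp[l]) for l in labels]
    recalls = [ratio(tp[l], tp[l] + fn[l]) for l in labels]
    f1s = [ratio(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": ratio(correct, len(gold)),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,  # assumption: mean of per-label F1
    }


# Toy example with made-up tweet IDs, 4-labels scenario.
gold = {"1": "P", "2": "N", "3": "NEU", "4": "NONE"}
run = parse_run("1\tP\n2\tN\n3\tP\n4\tNONE\n", LABELS_4)
scores = evaluate(gold, run, LABELS_4)
print(scores["accuracy"])  # 0.75
```

The same `evaluate` function could in principle be applied to Task 2 by keying the dictionaries on (tweetid, aspect) pairs and using the combined "aspect-polarity" string as the label, although that usage is likewise only a sketch.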



           Group                  Task 1   Task 2
           LIF                       X
           ELiRF                     X       X
           GSI                       X       X
           LyS                       X       X
           DLSI                      X
           GTI-GRAD                  X
           ITAINNOVA                 X
           SINAI-EMMA                X
           CU                        X
           TID-spark                 X       X
           BittenPotato              X
           SINAI_wd2v                X
           DT                        X
           GAS-UCR                   X
           UCSP                      X
           SEDEMO                    X
           INGEOTEC                  X
           Total groups             17        4

            Table 2: Participant groups

Group                   Report
ELiRF                   ELiRF-UPV en TASS 2015: Análisis de Sentimientos en Twitter
GSI                     Aspect based Sentiment Analysis of Spanish Tweets
LyS                     LyS at TASS 2015: Deep Learning Experiments for Sentiment Analysis on Spanish Tweets
DLSI                    Evaluating a Sentiment Analysis Approach from a Business Point of View
GTI-GRAD                GTI-Gradiant at TASS 2015: A Hybrid Approach for Sentiment Analysis in Twitter
ITAINNOVA               Ensemble algorithm with syntactical tree features to improve the opinion analysis
SINAI-EMMA              SINAI-EMMA: Vectores de Palabras para el Análisis de Opiniones en Twitter
CU                      Spanish Twitter Messages Polarized through the Lens of an English System
BittenPotato            BittenPotato: Tweet sentiment analysis by combining multiple classifiers
SINAI_wd2v              Participación de SINAI DW2Vec en TASS 2015
DT                      DeustoTech Internet at TASS 2015: Sentiment analysis and polarity classification in Spanish tweets
UCSP                    Comparing Supervised Learning Methods for Classifying Spanish Tweets
INGEOTEC                Sentiment Analysis for Twitter: TASS 2015

            Table 3: Participant reports

      5      Results
   Results for each task are described next.

5.1       Task 1: Sentiment Analysis at Global Level

Submitted runs and results for Task 1, evaluation based on 5 polarity levels with the
whole General test corpus, are shown in Table 4. Accuracy, macroaveraged precision,
macroaveraged recall and macroaveraged F1-measure have been used to evaluate each
individual label and rank the systems.

       Run Id                          Acc
       LIF-Run-3                       0.672
       LIF-Run-2                       0.654
       ELiRF-run3                      0.659
       LIF-Run-1                       0.628
       ELiRF-run1                      0.648
       ELiRF-run2                      0.658
       GSI-RUN-1                       0.618
       run_out_of_date                 0.673
       GSI-RUN-2                       0.610
       GSI-RUN-3                       0.608
       LyS-run-1                       0.552
       DLSI-Run1                       0.595
       Lys-run-2                       0.568
       GTI-GRAD-Run1                   0.592
       Ensemble exp1.1                 0.535
       SINAI-EMMA-1                    0.502
       INGEOTEC-M1                     0.488
       Ensemble exp3_emotions          0.549
TID-spark               Sentiment Classification                            CU-Run-1                                     0.495
                        using Sociolinguistic                               TID-spark-1                                  0.462
                                                                            BP-wvoted-v2_1                               0.534
                        Clusters
                                                                            Ensemble exp2_emotions                       0.524

BP-voted-v2                 0.535
SINAI_wd2v_500              0.474
SINAI_wd2v_300              0.474
BP-wvoted-v1                0.522
BP-voted-v1                 0.522
BP-rbf-v2                   0.514
Lys-run-3                   0.505
BP-rbf-v1                   0.494
CU-Run-2-CompMod            0.362
DT-RUN-1                    0.560
DT-RUN-3                    0.557
DT-RUN-2                    0.545
GAS-UCR-1                   0.342
UCSP-RUN-1                  0.273
BP-wvoted-v2                0.009

Table 4: Results for Task 1, 5 levels, whole test corpus

    As previously described, an alternate evaluation of the performance of
systems was done using a new selected test subset containing 1,000 tweets with
a similar distribution to the training corpus. Results are shown in Table 5.
    For a more in-depth evaluation, results are also calculated with the
classification reduced to 3 levels (POS, NEU, NEG) plus no sentiment (NONE),
merging P and P+ into one category and N and N+ into another. The same double
evaluation, using the whole test corpus and the new selected corpus, has been
carried out; results are shown in Tables 6 and 7.

Run Id                      Acc
ELiRF-run2                  0.488
GTI-GRAD-Run1               0.509
LIF-Run-2                   0.516
GSI-RUN-1                   0.487
GSI-RUN-2                   0.48
GSI-RUN-3                   0.479
LIF-Run-1                   0.481
ELiRF-run1                  0.476
SINAI_wd2v                  0.389
ELiRF-run3                  0.477
INGEOTEC-M1                 0.431
Ensemble exp1 1K            0.405
LyS-run-1                   0.428
Ensemble exp2 1K            0.384
Lys-run-3                   0.430
Lys-run-2                   0.434
SINAI-EMMA-1                0.411
CU-Run-1-CompMod            0.419
Ensemble exp3 1K            0.396
TID                         0.400
BP-voted-v1                 0.408
DLSI-Run1                   0.385
CU-Run-2                    0.397
BP-wvoted-v1                0.416
BP-rbf-v1                   0.418
SEDEMO-E1                   0.397
DT-RUN-1                    0.407
DT-RUN-2                    0.408
DT-RUN-3                    0.396
GAS-UCR-1                   0.338
INGEOTEC-E1                 0.174
INGEOTEC-E2                 0.168

Table 5: Results for Task 1, 5 levels, selected 1k corpus

Run Id                      Acc
LIF-Run-3                   0.726
LIF-Run-2                   0.725
ELiRF-run3                  0.721
LIF-Run-1                   0.710
ELiRF-run1                  0.712
ELiRF-run2                  0.722
GSI-RUN-1                   0.690
run_out_of_date             0.725
GSI-RUN-2                   0.679
GSI-RUN-3                   0.678
DLSI-Run1                   0.655
LyS-run-1                   0.664
GTI-GRAD-Run1               0.695
TID-spark-1                 0.594
INGEOTEC-M1                 0.613
UCSP-RUN-2                  0.594
UCSP-RUN-3                  0.613
Ensemble exp2_3_SPARK       0.591
UCSP-RUN-1                  0.602
CU-RUN-1                    0.597
Ensemble exp1_3_SPARK       0.610
UCSP-RUN-1-ME               0.600
BP-wvoted-v1                0.593
BP-voted-v1                 0.593
Ensemble exp3_3             0.594
DT-RUN-2                    0.625
SINAI_wd2v                  0.619
SINAI_wd2v_2                0.613
BP-rbf-v1                   0.602
Lys-run-2                   0.599
DT-RUN-3                    0.608
UCSP-RUN-1-NB               0.560
SINAI_w2v                   0.604
UCSP-RUN-1-DT               0.536
CU-Run2-CompMod             0.481
DT-RUN-1                    0.490
UCSP-RUN-2-ME               0.479
SINAI_d2v                   0.429
GAS-UCR-1                   0.446

Table 6: Results for Task 1, 3 levels, whole test corpus
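
The Task 1 evaluation described above (merging P and P+, and N and N+, before scoring, then reporting accuracy and macroaveraged precision, recall and F1-measure) can be sketched as follows. This is a minimal illustration under stated assumptions, not the organizers' evaluation script: the label strings in `COLLAPSE`, the function names `collapse` and `evaluate`, and the choice to compute the macroaveraged F1 as the mean of the per-label F1 values are all hypothetical.

```python
from collections import Counter

# Assumed label names: P+, P, NEU, N, N+ and NONE for the 5-level scheme,
# POS, NEU, NEG and NONE after merging, following the description in the text.
COLLAPSE = {
    "P+": "POS", "P": "POS",
    "NEU": "NEU",
    "N": "NEG", "N+": "NEG",
    "NONE": "NONE",
}


def collapse(labels):
    """Map 5-level polarity labels (plus NONE) onto the 3-level scheme."""
    return [COLLAPSE[label] for label in labels]


def evaluate(gold, predicted):
    """Return accuracy and macroaveraged precision, recall and F1."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was expected but missed

    labels = sorted(set(gold) | set(predicted))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        prec = tp[label] / predicted_count if predicted_count else 0.0
        rec = tp[label] / gold_count if gold_count else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,  # assumption: mean of per-label F1
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    gold = ["P+", "P", "N", "NEU", "NONE", "N+"]
    predicted = ["P", "P", "N+", "NONE", "NONE", "N"]
    print("5 levels:", evaluate(gold, predicted))
    print("3 levels:", evaluate(collapse(gold), collapse(predicted)))
```

Because merging turns confusions between P and P+ (or N and N+) into correct answers, a run can never score lower accuracy in the 3-level setting than in the 5-level one on the same tweets, which is consistent with the higher figures in the 3-level tables.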
Run Id                      Acc
LIF-Run-1                   0.632
ELiRF-run2                  0.610
LIF-Run-2                   0.692
BP-wvoted-v1                0.632
GSI-RUN-1                   0.658
GTI-GRAD-Run1               0.674
BP-voted-v1                 0.611
LyS-run-1                   0.634
TID-spark-1                 0.649
DLSI-Run1                   0.637
ELiRF-run1                  0.645
DT-RUN-1                    0.601
GSI-RUN-2                   0.646
GSI-RUN-3                   0.647
ELiRF-run3                  0.595
Ensemble exp3 1K 3          0.614
UCSP-RUN-2                  0.586
Ensemble exp2 1K 3          0.611
Ensemble exp1 1K 3          0.503
INGEOTEC-M1                 0.595
CU-Run-2-CompMod            0.600
CU-RUN-1                    0.578
SINAI_wd2v_2_500            0.641
UCSP-RUN-1                  0.582
SINAI_w2v                   0.627
UCSP-RUN-3                  0.626
SINAI_wd2v                  0.633
BP-rbf-v1                   0.611
UCSP-RUN-1-NB               0.636
UCSP-RUN-1-ME               0.626
Lys-run-2                   0.605
DT-RUN-2                    0.583
DT-RUN-3                    0.571
UCSP-RUN-1-DR               0.495
UCSP-RUN-2-NB               0.559
UCSP-RUN-2-ME               0.509
DT-RUN-1                    0.514
GAS-UCR-1                   0.556
SINAI_d2v                   0.510

Table 7: Results for Task 1, 3 levels, selected 1k corpus

5.2 Task 2: Aspect-based Sentiment Analysis
Submitted runs and results for Task 2, with the Social-TV and STOMPOL corpora,
are shown in Tables 10 and 11. Accuracy, macroaveraged precision, macroaveraged
recall and macroaveraged F1-measure are used to evaluate each individual label
and to rank the systems.

Run Id                      Acc
GSI-RUN-1                   0.635
GSI-RUN-2                   0.621
GSI-RUN-3                   0.557
ELiRF-run1                  0.655
LyS-run-1                   0.610
TID-spark-1                 0.631
GSI-RUN-1                   0.533
Lys-run-2                   0.522

Table 10: Results for Task 2, Social-TV corpus

Run Id                      Acc
ELiRF-run1                  0.633
LyS-run-1                   0.599
Lys-run-2                   0.540
TID-spark-1                 0.557

Table 11: Results for Task 2, STOMPOL corpus

6 Conclusions and Future Work
    TASS was the first workshop about SA focused on the processing of texts
written in Spanish. This area clearly attracts great interest from research
groups and companies: this fourth edition has had a greater impact in terms of
registered groups, and the number of participants that submitted experiments
to the 2015 tasks has increased.
    The developed corpora and gold standards, together with the reports from
participants, should be helpful for other research groups approaching these
tasks.
    TASS corpora will be released after the workshop for free use by the
research community. To date, the 2014 corpora have been downloaded by more
than 60 research groups, 25 of them from outside Spain, coming from academia
and also from private companies that use the corpus as part of their product
development. We expect to reach a similar impact with this year's corpus.

Acknowledgements
This work has been partially supported by a grant from the Fondo Europeo de
Desarrollo Regional (FEDER), the ATTOS (TIN2012-38536-C03-0) and Ciudad2020
(INNPRONTA IPT-20111006) projects from the Spanish Government, and the
AORESCU project (P11-TIC-7684 MO).




References

Villena-Román, Julio; Lana-Serrano, Sara; Martínez-Cámara, Eugenio;
   González-Cristobal, José Carlos. 2013. TASS - Workshop on Sentiment
   Analysis at SEPLN. Revista de Procesamiento del Lenguaje Natural, 50,
   pp 37-44.
   http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4657.
Villena-Román, Julio; García-Morera, Janine; Lana-Serrano, Sara;
   González-Cristobal, José Carlos. 2014. TASS 2013 - A Second Step in
   Reputation Analysis in Spanish. Revista de Procesamiento del Lenguaje
   Natural, 52, pp 37-44.
   http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4901.
Vilares, David; Doval, Yerai; Alonso, Miguel A.; Gómez-Rodríguez, Carlos.
   LyS at TASS 2014: A Prototype for Extracting and Analysing Aspects from
   Spanish tweets. In Proc. of the TASS workshop at SEPLN 2014. 16-19
   September 2014, Girona, Spain.
Perea-Ortega, José M.; Balahur, Alexandra. Experiments on feature
   replacements for polarity classification of Spanish tweets. In Proc. of
   the TASS workshop at SEPLN 2014. 16-19 September 2014, Girona, Spain.
Hernández Petlachi, Roberto; Li, Xiaoou. Análisis de sentimiento sobre
   textos en Español basado en aproximaciones semánticas con reglas
   lingüísticas. In Proc. of the TASS workshop at SEPLN 2014. 16-19
   September 2014, Girona, Spain.
Montejo-Ráez, A.; García-Cumbreras, M.A.; Díaz-Galiano, M.C. Participación
   de SINAI Word2Vec en TASS 2014. In Proc. of the TASS workshop at SEPLN
   2014. 16-19 September 2014, Girona, Spain.
Hurtado, Lluís F.; Pla, Ferran. ELiRF-UPV en TASS 2014: Análisis de
   Sentimientos, Detección de Tópicos y Análisis de Sentimientos de Aspectos
   en Twitter. In Proc. of the TASS workshop at SEPLN 2014. 16-19 September
   2014, Girona, Spain.
Jiménez Zafra, Salud María; Martínez Cámara, Eugenio; Martín Valdivia,
   M. Teresa; Ureña López, L. Alfonso. SINAI-ESMA: An unsupervised approach
   for Sentiment Analysis in Twitter. In Proc. of the TASS workshop at SEPLN
   2014. 16-19 September 2014, Girona, Spain.
San Vicente Roncal, Iñaki; Saralegi Urizar, Xabier. Looking for Features for
   Supervised Tweet Polarity Classification. In Proc. of the TASS workshop at
   SEPLN 2014. 16-19 September 2014, Girona, Spain.

