=Paper=
{{Paper
|id=Vol-1397/overview
|storemode=property
|title=Overview of TASS 2015
|pdfUrl=https://ceur-ws.org/Vol-1397/overview.pdf
|volume=Vol-1397
|dblpUrl=https://dblp.org/rec/conf/sepln/Villena-RomanGC15
}}
==Overview of TASS 2015==
TASS 2015, September 2015, pp 13-21. Received 20-07-15, revised 24-07-15, accepted 25-07-15.
Overview of TASS 2015 (Spanish title: Resumen de TASS 2015)
Julio Villena Román, Janine García Morera
Daedalus, S.A., 28031 Madrid, Spain
{jvillena, jgarcia, cdepablo}@daedalus.es
Miguel Ángel García Cumbreras, Eugenio Martínez Cámara, M. Teresa Martín Valdivia, L. Alfonso Ureña López
Universidad de Jaén, 23071 Jaén, Spain
{magc, emcamara, laurena, maite}@uja.es
Abstract: This paper describes TASS 2015, the fourth edition of the Workshop on Sentiment Analysis at SEPLN, held as part of the SEPLN 2015 international conference. The main objective of TASS is to promote research on and development of new algorithms, resources and techniques for sentiment analysis in social media (specifically Twitter), focused on the Spanish language. This paper presents the tasks proposed in TASS 2015, the contents of the corpora used, the participant groups, and the overall results together with their analysis.
Keywords: TASS 2015, sentiment analysis, social media.
1 Introduction

TASS is an experimental evaluation workshop, a satellite event of the annual SEPLN Conference, whose aim is to promote research on sentiment analysis systems in social media, focused on the Spanish language. The fourth edition will be held on September 15th, 2015 at the University of Alicante, Spain.

Sentiment analysis (SA) can be defined as the computational treatment of opinion, sentiment and subjectivity in texts (Pang & Lee, 2002). It is a hard task because even humans often disagree on the sentiment of a given text, and it is harder still when the text has only 140 characters (Twitter messages or tweets).

Text classification techniques, although studied and improved for a longer time, still need more research effort and resources in order to build better models and improve current results. Polarity classification has usually been tackled following two main approaches. The first one applies machine learning algorithms in order to train a polarity classifier using a labelled corpus (Pang et al. 2002). This approach is also known as the supervised approach. The second one is known as semantic orientation, or the unsupervised approach, and it integrates linguistic resources in a model in order to identify the valence of the opinions (Turney 2002).

The aim of TASS is to provide a competitive forum where the newest research works in the field of SA in social media, specifically focused on Spanish tweets, are shown and discussed by the scientific and business communities.

The rest of the paper is organized as follows. Section 2 describes the corpora provided to participants. Section 3 presents the tasks of TASS 2015. Section 4 describes the participants, and the overall results are presented in Section 5. Finally, the last section gives some conclusions and future directions.
Published at http://ceur-ws.org/Vol-1397/. CEUR-WS.org is a serial publication with recognized ISSN 1613-0073.
J. Villena Román, J. García Morera, M. Á. García Cumbreras, E. Martínez Cámara, M. T. Martín Valdivia, L. A. Ureña López
2 Corpus

TASS 2015 experiments are based on three corpora, specifically built for the different editions of the workshop.

2.1 General corpus

The general corpus contains over 68,000 tweets, written in Spanish, about 150 well-known personalities and celebrities of the world of politics, economy, communication, mass media and culture, between November 2011 and March 2012. Although the context of extraction has a Spain-focused bias, the diverse nationality of the authors, including people from Spain, Mexico, Colombia, Puerto Rico, USA and many other countries, gives the corpus global coverage of the Spanish-speaking world.

Each tweet includes its ID (tweetid), the creation date (date) and the user ID (user). Due to restrictions in the Twitter API Terms of Service (https://dev.twitter.com/terms/api-terms), it is forbidden to redistribute a corpus that includes text contents or information about users. However, it is valid if those fields are removed and IDs (including tweet IDs and user IDs) are provided instead. The actual message content can be easily obtained by making queries to the Twitter API using the tweetid.

The general corpus has been divided into a training set (about 10%) and a test set (90%). The training set was released so that participants could train and validate their models. The test corpus was provided without any tagging and has been used to evaluate the results. Obviously, it was not allowed to use the test data from previous years to train the systems.

Each tweet was tagged with its global polarity (positive, negative or neutral sentiment) or no sentiment at all. A set of 6 labels has been defined: strong positive (P+), positive (P), neutral (NEU), negative (N), strong negative (N+) and one additional no sentiment tag (NONE).

In addition, there is also an indication of the level of agreement or disagreement of the expressed sentiment within the content, with two possible values: AGREEMENT and DISAGREEMENT. This is especially useful to make out whether a neutral sentiment comes from neutral keywords or whether the text contains positive and negative sentiments at the same time.

Moreover, the polarity values related to the entities that are mentioned in the text are also included where applicable. These values are similarly tagged with 6 possible values and include the level of agreement as related to each entity.

This corpus is based on a selection of a set of topics: thematic areas such as "política" ("politics"), "fútbol" ("soccer"), "literatura" ("literature") or "entretenimiento" ("entertainment"). Each tweet in both the training and test set has been assigned to one or several of these topics (most messages are associated with just one topic, due to the short length of the text).

All tagging has been done semiautomatically: a baseline machine learning model is first run and then all tags are manually checked by human experts. In the case of the polarity at entity level, due to the high volume of data to check, this tagging has only been done for the training set.

Table 1 shows a summary of the training and test corpora provided to participants.

Attribute Value
Tweets 68,017
Tweets (test) 60,798 (89%)
Tweets (train) 7,219 (11%)
Topics 10
Users 154
Date start (train) 2011-12-02
Date end (train) 2012-04-10
Date start (test) 2011-12-02
Date end (test) 2012-04-10

Table 1: Corpus statistics

Users were journalists (periodistas), politicians (políticos) or celebrities (famosos). The only language involved this year was Spanish (es).

The list of topics that have been selected is the following:
• Politics (política)
• Entertainment (entretenimiento)
• Economy (economía)
• Music (música)
• Soccer (fútbol)
• Films (películas)
• Technology (tecnología)
• Sports (deportes)
• Literature (literatura)
• Other (otros)
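The annotation scheme of the General corpus described in Section 2.1 (tweet ID, date, user, six polarity labels, an agreement flag, topics and optional entity-level polarity) can be pictured as a small record type. The following is only a sketch: the class and field layout is hypothetical and is not the XML schema of the corpus, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Label inventories as listed in Section 2.1.
POLARITY_LABELS: Tuple[str, ...] = ("P+", "P", "NEU", "N", "N+", "NONE")
AGREEMENT_LABELS: Tuple[str, ...] = ("AGREEMENT", "DISAGREEMENT")


def _check(value: str, allowed: Tuple[str, ...], what: str) -> None:
    """Raise ValueError if value is not one of the allowed labels."""
    if value not in allowed:
        raise ValueError(f"invalid {what} {value!r}; expected one of {allowed}")


@dataclass
class EntityPolarity:
    """Polarity of one entity mentioned in a tweet (hypothetical record)."""
    entity: str
    polarity: str
    agreement: str

    def __post_init__(self) -> None:
        _check(self.polarity, POLARITY_LABELS, "polarity")
        _check(self.agreement, AGREEMENT_LABELS, "agreement")


@dataclass
class TaggedTweet:
    """One annotated tweet; only IDs are stored, never the tweet text."""
    tweetid: str
    date: str
    user: str
    polarity: str
    agreement: str
    topics: List[str] = field(default_factory=list)
    entities: List[EntityPolarity] = field(default_factory=list)

    def __post_init__(self) -> None:
        _check(self.polarity, POLARITY_LABELS, "polarity")
        _check(self.agreement, AGREEMENT_LABELS, "agreement")


# Made-up example values, not taken from the corpus.
example = TaggedTweet(
    tweetid="0000000000",
    date="2011-12-02T00:00:00",
    user="example_user",
    polarity="NEU",
    agreement="DISAGREEMENT",
    topics=["política"],
    entities=[
        EntityPolarity("UPyD", "P", "AGREEMENT"),
        EntityPolarity("Foro Asturias", "N", "AGREEMENT"),
    ],
)
```

The record deliberately has no text field, mirroring the redistribution constraint described above: the message content has to be fetched separately using the tweetid.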
The corpus is encoded in XML. Figure 1 shows the information of two sample tweets. The first tweet is only tagged with the global polarity, as the text contains no mentions of any entity, but the second one is tagged with both the global polarity of the message and the polarity associated with each of the entities that appear in the text (UPyD and Foro Asturias).

Figure 1: Sample tweets (General corpus)

2.2 Social-TV corpus

The Social-TV corpus was collected during the 2014 Final of the Copa del Rey championship in Spain between Real Madrid and F.C. Barcelona, played on 16 April 2014 at Mestalla Stadium in Valencia. After filtering out useless information, a subset of 2,773 tweets was selected.

All tweets were manually tagged with the aspects and their sentiment polarity. Tweets may cover more than one aspect.

The list of the 31 aspects that have been defined is the following:
• Afición (supporters)
• Árbitro (referee)
• Autoridades (authorities)
• Entrenador (coach)
• Equipo - Atlético de Madrid (Team - Atlético de Madrid)
• Equipo - Barcelona (Team - Barcelona)
• Equipo - Real Madrid (Team - Real Madrid)
• Equipo (any other team)
• Jugador - Alexis Sánchez (Player - Alexis Sánchez)
• Jugador - Álvaro Arbeloa (Player - Álvaro Arbeloa)
• Jugador - Andrés Iniesta (Player - Andrés Iniesta)
• Jugador - Ángel Di María (Player - Ángel Di María)
• Jugador - Asier Ilarramendi (Player - Asier Ilarramendi)
• Jugador - Carles Puyol (Player - Carles Puyol)
• Jugador - Cesc Fábregas (Player - Cesc Fábregas)
• Jugador - Cristiano Ronaldo (Player - Cristiano Ronaldo)
• Jugador - Dani Alves (Player - Dani Alves)
• Jugador - Dani Carvajal (Player - Dani Carvajal)
• Jugador - Fábio Coentrão (Player - Fábio Coentrão)
• Jugador - Gareth Bale (Player - Gareth Bale)
• Jugador - Iker Casillas (Player - Iker Casillas)
• Jugador - Isco (Player - Isco)
• Jugador - Javier Mascherano (Player - Javier Mascherano)
• Jugador - Jesé Rodríguez (Player - Jesé Rodríguez)
• Jugador - José Manuel Pinto (Player - José Manuel Pinto)
• Jugador - Karim Benzema (Player - Karim Benzema)
• Jugador - Lionel Messi (Player - Lionel Messi)
• Jugador - Luka Modric (Player - Luka Modric)
• Jugador - Marc Bartra (Player - Marc Bartra)
• Jugador - Neymar Jr. (Player - Neymar Jr.)
• Jugador - Pedro Rodríguez (Player - Pedro Rodríguez)
• Jugador - Pepe (Player - Pepe)
• Jugador - Sergio Busquets (Player - Sergio Busquets)
• Jugador - Sergio Ramos (Player - Sergio Ramos)
• Jugador - Xabi Alonso (Player - Xabi Alonso)
• Jugador - Xavi Hernández (Player - Xavi Hernández)
• Jugador (any other player)
• Partido (Football match)
• Retransmisión (broadcast)

Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. No distinction is made between cases when the author does not express any sentiment and cases when he/she expresses a sentiment that is neither positive nor negative.

The Social-TV corpus was randomly divided into a training set (1,773 tweets) and a test set (1,000 tweets), with a similar distribution of both aspects and sentiments. The training set was released previously, and the test corpus was provided without any tagging and has been used to evaluate the results provided by the different systems.

The following figure shows the information of three sample tweets in the training set.

Figure 2: Sample tweets (Social-TV corpus)

2.3 STOMPOL corpus

STOMPOL (corpus of Spanish Tweets for Opinion Mining at aspect level about POLitics) is a corpus of Spanish tweets prepared for research on the challenging task of opinion mining at aspect level. The tweets were gathered from 23rd to 24th of April 2015, and are related to one of the following political aspects that appear in political campaigns:
• Economics (Economía): taxes, infrastructure, markets, labor policy...
• Health System (Sanidad): hospitals, public/private health system, drugs, doctors...
• Education (Educacion): state school, private school, scholarships...
• Political party (Propio_partido): anything good (speeches, electoral programme...) or bad (corruption, criticism) related to the entity
• Other aspects (Otros_aspectos): electoral system, environmental policy...

Each aspect is related to one or several entities that correspond to one of the main political parties in Spain, which are:
• Partido_Popular (PP)
• Partido_Socialista_Obrero_Español (PSOE)
• Izquierda_Unida (IU)
• Podemos
• Ciudadanos (Cs)
• Unión_Progreso_y_Democracia (UPyD)

Each tweet in the corpus has been manually tagged with the sentiment polarity at aspect level by two annotators, and by a third one in case of disagreement. Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. Again, no difference is made between no sentiment and a neutral sentiment (neither positive nor negative). Each political aspect is linked to its corresponding political party and its polarity.

Figure 3 shows the information of two sample tweets.

Figure 3: Sample tweets (STOMPOL corpus)

These three corpora will be made freely available to the community after the workshop. Please send an email to tass@daedalus.es filling in the TASS Corpus License agreement with your email, affiliation (institution, company or any kind of organization) and a brief description of your research objectives, and you will be given a password to download the files in the password protected area. The only requirement is to include a citation to a relevant paper and/or the TASS website.

3 Description of tasks

First of all, we are interested in evaluating the evolution of the different approaches for SA and text classification in Spanish during these years. So, the traditional SA at global level task will be repeated again, reusing the same corpus,
to compare results. Moreover, we want to foster research on fine-grained polarity analysis at aspect level (aspect-based SA, one of the new requirements of the natural language processing market in these areas). So, two legacy tasks will be repeated again, to compare results, and a new corpus has been created for the second task.

Participants are expected to submit up to 3 results of different experiments for one or both of these tasks, in the appropriate format described below.

Along with the submission of experiments, participants have been invited to submit a paper to the workshop in order to describe their experiments and discuss the results with the audience in a regular workshop session.

The two proposed tasks are described next.

3.1 (legacy) Task 1: Sentiment Analysis at Global Level

This is the same task as in previous editions. This task consists of performing an automatic polarity classification to determine the global polarity of each message in the test set of the General corpus. Participants have been provided with the training set of the General corpus so that they may train and validate their models. There will be two different evaluations: one based on 6 different polarity labels (P+, P, NEU, N, N+, NONE) and another based on just 4 labels (P, N, NEU, NONE).

Participants are expected to submit (up to 3) experiments for the 6-labels evaluation, but are also allowed to submit (up to 3) specific experiments for the 4-labels scenario.

Results must be submitted in a plain text file with the following format:

tweetid \t polarity

where polarity can be:
• P+, P, NEU, N, N+ and NONE for the 6-labels case
• P, NEU, N and NONE for the 4-labels case.

The same test corpus of previous years will be used for the evaluation, to allow for comparison among systems. Accuracy, macroaveraged precision, macroaveraged recall and macroaveraged F1-measure have been used to evaluate each run.

Notice that there are two test sets: the complete set and the 1k set, a subset of the first one. The reason is that, to deal with the problem of the imbalanced distribution of labels between the training and test set, a selected test subset containing 1,000 tweets with a similar distribution to the training corpus was extracted to be used for an alternate evaluation of the performance of systems.

3.2 (legacy) Task 2: Aspect-based sentiment analysis

Participants have been provided with a corpus tagged with a series of aspects, and systems must identify the polarity at the aspect level. Two corpora have been provided: the Social-TV corpus, used in TASS 2014, and the new STOMPOL corpus, collected in 2015 (described above). Both corpora have been split into a training and a test set, the first one for building and validating the systems, and the second for evaluation.

Participants are expected to submit up to 3 experiments for each corpus, each in a plain text file with the following format:

tweetid \t aspect \t polarity
[for the Social-TV corpus]

tweetid \t aspect-entity \t polarity
[for the STOMPOL corpus]

Allowed polarity values are P, N and NEU.

For evaluation, a single label combining "aspect-polarity" has been considered. Similarly to the first task, accuracy, macroaveraged precision, macroaveraged recall and macroaveraged F1-measure have been calculated for the global result.

4 Participants and Results

This year 35 groups registered (as compared to 31 groups last year), but only 17 groups (14 last year) sent their submissions. The list of active participant groups is shown in Table 2, including the tasks in which they have participated.

Fourteen of the seventeen participant groups sent a report describing their experiments and the results achieved. Papers were reviewed and included in the workshop proceedings. References are listed in Table 3.
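The Task 1 submission format (one "tweetid \t polarity" pair per line) and the evaluation measures named in Section 3.1 can be sketched as follows. This is a minimal illustration and not the organizers' scoring script: the function names `parse_run` and `evaluate` are hypothetical, tweets missing from a run are assumed to count as errors, and macroaveraged F1 is assumed here to be the mean of the per-label F1 values, which is one common convention; the paper does not state which definition was used.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_run(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'tweetid<TAB>polarity' lines into {tweetid: polarity}."""
    run: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Exactly two tab-separated fields are expected; anything else
        # raises ValueError from the unpacking below.
        tweetid, polarity = line.split("\t")
        run[tweetid.strip()] = polarity.strip()
    return run


def evaluate(gold: Dict[str, str], predicted: Dict[str, str]) -> Dict[str, float]:
    """Accuracy and macroaveraged precision, recall and F1 over the gold labels."""
    if not gold:
        raise ValueError("gold standard is empty")
    labels = sorted(set(gold.values()))
    tp: Counter = Counter()
    fp: Counter = Counter()
    fn: Counter = Counter()
    correct = 0
    for tweetid, gold_label in gold.items():
        pred_label = predicted.get(tweetid)  # None if the run skipped this tweet
        if pred_label == gold_label:
            correct += 1
            tp[gold_label] += 1
        else:
            fn[gold_label] += 1
            if pred_label is not None:
                fp[pred_label] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if (p + r) else 0.0)

    n = len(labels)
    return {
        "accuracy": correct / len(gold),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,  # assumption: mean of per-label F1
    }


if __name__ == "__main__":
    # Toy gold standard and run with made-up IDs and labels.
    gold = {"1": "P", "2": "N", "3": "NEU", "4": "P"}
    run = parse_run(["1\tP\n", "2\tP\n", "3\tNEU\n", "4\tN\n"])
    print(evaluate(gold, run))
```

For Task 2 the same scoring could be reused by treating the combined "aspect-polarity" string as the label, as the task description suggests, although the key identifying each item would then need to include more than the tweetid.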
Group Task 1 Task 2
LIF X
ELiRF X X
GSI X X
LyS X X
DLSI X
GTI-GRAD X
ITAINNOVA X
SINAI-EMMA X
CU X
TID-spark X X
BittenPotato X
SINAI_wd2v X
DT X
GAS-UCR X
UCSP X
SEDEMO X
INGEOTEC X
Total groups 17 4

Table 2: Participant groups

Group Report
ELiRF ELiRF-UPV en TASS 2015: Análisis de Sentimientos en Twitter
GSI Aspect based Sentiment Analysis of Spanish Tweets
LyS LyS at TASS 2015: Deep Learning Experiments for Sentiment Analysis on Spanish Tweets
DLSI Evaluating a Sentiment Analysis Approach from a Business Point of View
GTI-GRAD GTI-Gradiant at TASS 2015: A Hybrid Approach for Sentiment Analysis in Twitter
ITAINNOVA Ensemble algorithm with syntactical tree features to improve the opinion analysis
SINAI-EMMA SINAI-EMMA: Vectores de Palabras para el Análisis de Opiniones en Twitter
CU Spanish Twitter Messages Polarized through the Lens of an English System
TID-spark Sentiment Classification using Sociolinguistic Clusters
BittenPotato BittenPotato: Tweet sentiment analysis by combining multiple classifiers
SINAI_wd2v Participación de SINAI DW2Vec en TASS 2015
DT DeustoTech Internet at TASS 2015: Sentiment analysis and polarity classification in Spanish tweets
UCSP Comparing Supervised Learning Methods for Classifying Spanish Tweets
INGEOTEC Sentiment Analysis for Twitter: TASS 2015

Table 3: Participant reports

5 Results

Results for each task are described next.

5.1 Task 1: Sentiment Analysis at Global Level

Submitted runs and results for Task 1, evaluation based on 5 polarity levels with the whole General test corpus, are shown in Table 4. Accuracy, macroaveraged precision, macroaveraged recall and macroaveraged F1-measure have been used to evaluate each individual label and to rank the systems.

Run Id Acc
LIF-Run-3 0.672
LIF-Run-2 0.654
ELiRF-run3 0.659
LIF-Run-1 0.628
ELiRF-run1 0.648
ELiRF-run2 0.658
GSI-RUN-1 0.618
run_out_of_date 0.673
GSI-RUN-2 0.610
GSI-RUN-3 0.608
LyS-run-1 0.552
DLSI-Run1 0.595
Lys-run-2 0.568
GTI-GRAD-Run1 0.592
Ensemble exp1.1 0.535
SINAI-EMMA-1 0.502
INGEOTEC-M1 0.488
Ensemble exp3_emotions 0.549
CU-Run-1 0.495
TID-spark-1 0.462
BP-wvoted-v2_1 0.534
Ensemble exp2_emotions 0.524
BP-voted-v2 0.535
SINAI_wd2v_500 0.474
SINAI_wd2v_300 0.474
BP-wvoted-v1 0.522
BP-voted-v1 0.522
BP-rbf-v2 0.514
Lys-run-3 0.505
BP-rbf-v1 0.494
CU-Run-2-CompMod 0.362
DT-RUN-1 0.560
DT-RUN-3 0.557
DT-RUN-2 0.545
GAS-UCR-1 0.342
UCSP-RUN-1 0.273
BP-wvoted-v2 0.009

Table 4: Results for Task 1, 5 levels, whole test corpus

As previously described, an alternate evaluation of the performance of systems was done using a new selected test subset containing 1,000 tweets with a similar distribution to the training corpus. Results are shown in Table 5.

In order to perform a more in-depth evaluation, results are also calculated considering the classification in only 3 levels (POS, NEU, NEG) plus no sentiment (NONE), merging P and P+ into one category and N and N+ into another. The same double evaluation, using the whole test corpus and the selected 1k corpus, has been carried out; results are shown in Tables 6 and 7.

Run Id Acc
ELiRF-run2 0.488
GTI-GRAD-Run1 0.509
LIF-Run-2 0.516
GSI-RUN-1 0.487
GSI-RUN-2 0.48
GSI-RUN-3 0.479
LIF-Run-1 0.481
ELiRF-run1 0.476
SINAI_wd2v 0.389
ELiRF-run3 0.477
INGEOTEC-M1 0.431
Ensemble exp1 1K 0.405
LyS-run-1 0.428
Ensemble exp2 1K 0.384
Lys-run-3 0.430
Lys-run-2 0.434
SINAI-EMMA-1 0.411
CU-Run-1-CompMod 0.419
Ensemble exp3 1K 0.396
TID 0.400
BP-voted-v1 0.408
DLSI-Run1 0.385
CU-Run-2 0.397
BP-wvoted-v1 0.416
BP-rbf-v1 0.418
SEDEMO-E1 0.397
DT-RUN-1 0.407
DT-RUN-2 0.408
DT-RUN-3 0.396
GAS-UCR-1 0.338
INGEOTEC-E1 0.174
INGEOTEC-E2 0.168

Table 5: Results for Task 1, 5 levels, selected 1k corpus

Run Id Acc
LIF-Run-3 0.726
LIF-Run-2 0.725
ELiRF-run3 0.721
LIF-Run-1 0.710
ELiRF-run1 0.712
ELiRF-run2 0.722
GSI-RUN-1 0.690
run_out_of_date 0.725
GSI-RUN-2 0.679
GSI-RUN-3 0.678
DLSI-Run1 0.655
LyS-run-1 0.664
GTI-GRAD-Run1 0.695
TID-spark-1 0.594
INGEOTEC-M1 0.613
UCSP-RUN-2 0.594
UCSP-RUN-3 0.613
Ensemble exp2_3_SPARK 0.591
UCSP-RUN-1 0.602
CU-RUN-1 0.597
Ensemble exp1_3_SPARK 0.610
UCSP-RUN-1-ME 0.600
BP-wvoted-v1 0.593
BP-voted-v1 0.593
Ensemble exp3_3 0.594
DT-RUN-2 0.625
SINAI_wd2v 0.619
SINAI_wd2v_2 0.613
BP-rbf-v1 0.602
Lys-run-2 0.599
DT-RUN-3 0.608
UCSP-RUN-1-NB 0.560
SINAI_w2v 0.604
UCSP-RUN-1-DT 0.536
CU-Run2-CompMod 0.481
DT-RUN-1 0.490
UCSP-RUN-2-ME 0.479
SINAI_d2v 0.429
GAS-UCR-1 0.446

Table 6: Results for Task 1, 3 levels, whole test corpus
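The 3-level evaluation merges P+ with P and N+ with N while keeping NEU and NONE unchanged. A minimal sketch of that label collapsing is shown below; the names `SIX_TO_FOUR`, `collapse_label` and `collapse_run` are hypothetical and are not part of any official TASS tooling.

```python
from typing import Dict

# Mapping from the 6-label scheme to the 4-label scheme
# (three polarity levels plus NONE).
SIX_TO_FOUR: Dict[str, str] = {
    "P+": "P",
    "P": "P",
    "NEU": "NEU",
    "N": "N",
    "N+": "N",
    "NONE": "NONE",
}


def collapse_label(label: str) -> str:
    """Map one 6-scheme label to its 4-scheme counterpart."""
    try:
        return SIX_TO_FOUR[label]
    except KeyError:
        raise ValueError(f"unknown polarity label: {label!r}") from None


def collapse_run(run: Dict[str, str]) -> Dict[str, str]:
    """Collapse every label of a {tweetid: label} run."""
    return {tweetid: collapse_label(label) for tweetid, label in run.items()}


if __name__ == "__main__":
    # Made-up tweet IDs and labels.
    print(collapse_run({"1": "P+", "2": "N+", "3": "NEU", "4": "NONE"}))
```

Applying the same mapping to both the gold standard and a submitted 6-label run would make the two directly comparable under the coarser scheme.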
Run Id Acc
LIF-Run-1 0.632
ELiRF-run2 0.610
LIF-Run-2 0.692
BP-wvoted-v1 0.632
GSI-RUN-1 0.658
GTI-GRAD-Run1 0.674
BP-voted-v1 0.611
LyS-run-1 0.634
TID-spark-1 0.649
DLSI-Run1 0.637
ELiRF-run1 0.645
DT-RUN-1 0.601
GSI-RUN-2 0.646
GSI-RUN-3 0.647
ELiRF-run3 0.595
Ensemble exp3 1K 3 0.614
UCSP-RUN-2 0.586
Ensemble exp2 1K 3 0.611
Ensemble exp1 1K 3 0.503
INGEOTEC-M1 0.595
CU-Run-2-CompMod 0.600
CU-RUN-1 0.578
SINAI_wd2v_2_500 0.641
UCSP-RUN-1 0.582
SINAI_w2v 0.627
UCSP-RUN-3 0.626
SINAI_wd2v 0.633
BP-rbf-v1 0.611
UCSP-RUN-1-NB 0.636
UCSP-RUN-1-ME 0.626
Lys-run-2 0.605
DT-RUN-2 0.583
DT-RUN-3 0.571
UCSP-RUN-1-DR 0.495
UCSP-RUN-2-NB 0.559
UCSP-RUN-2-ME 0.509
DT-RUN-1 0.514
GAS-UCR-1 0.556
SINAI_d2v 0.510

Table 7: Results for Task 1, 3 levels, selected 1k corpus

5.2 Task 2: Aspect-based Sentiment Analysis

Submitted runs and results for Task 2, with the Social-TV and STOMPOL corpora, are shown in Tables 10 and 11. Accuracy, macroaveraged precision, macroaveraged recall and macroaveraged F1-measure have been used to evaluate each individual label and to rank the systems.

Run Id Acc
GSI-RUN-1 0.635
GSI-RUN-2 0.621
GSI-RUN-3 0.557
ELiRF-run1 0.655
LyS-run-1 0.610
TID-spark-1 0.631
GSI-RUN-1 0.533
Lys-run-2 0.522

Table 10: Results for Task 2, Social-TV corpus

Run Id Acc
ELiRF-run1 0.633
LyS-run-1 0.599
Lys-run-2 0.540
TID-spark-1 0.557

Table 11: Results for Task 2, STOMPOL corpus

6 Conclusions and Future Work

TASS was the first workshop about SA focused on the processing of texts written in Spanish. Clearly this area attracts great interest from research groups and companies, as this fourth edition has had a greater impact in terms of registered groups, and the number of participants that submitted experiments to the 2015 tasks has increased.

In any case, the developed corpora and gold standards, and the reports from participants, will surely be helpful for other research groups approaching these tasks.

TASS corpora will be released after the workshop for free use by the research community. As of 2014, the corpora had been downloaded by more than 60 research groups, 25 of them outside Spain, coming both from academia and from private companies that use the corpus as part of their product development. We expect to reach a similar impact with this year's corpus.

Acknowledgements

This work has been partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER), the ATTOS (TIN2012-38536-C03-0) and Ciudad2020 (INNPRONTA IPT-20111006) projects from the Spanish Government, and the AORESCU project (P11-TIC-7684 MO).