=Paper=
{{Paper
|id=Vol-1896/p10_c100tpucp_tass2017
|storemode=property
|title=C100TPUCP at TASS 2017: Word Embedding Experiments for Aspect-Based Sentiment Analysis in Spanish Tweets
|pdfUrl=https://ceur-ws.org/Vol-1896/p10_c100tpucp_tass2017.pdf
|volume=Vol-1896
|authors=Franco Tume Fiestas,Marco A. Sobrevilla Cabezudo
}}
==C100TPUCP at TASS 2017: Word Embedding Experiments for Aspect-Based Sentiment Analysis in Spanish Tweets==
TASS 2017: Workshop on Semantic Analysis at SEPLN, September 2017, pages 85-90
Franco Tume Fiestas (a20110060@pucp.pe) and Marco A. Sobrevilla Cabezudo (msobrevilla@pucp.edu.pe)
Grupo de Reconocimiento de Patrones e Inteligencia Artificial Aplicada
Pontificia Universidad Católica del Perú, Lima, Perú
Abstract: Aspect-Based Sentiment Analysis studies people's opinions about the different aspects of a given entity. This task is challenging and highly relevant for the Natural Language Processing community. In this paper, we report the participation of the C100T-PUCP team in the second task of TASS 2017, on aspect-based sentiment analysis. In this edition, we used word embeddings to compute the similarity between words selected from a training set of tweets about Spanish political parties, and built a model to classify the polarity of each aspect of each tweet. The results showed that training the model with more examples is beneficial under this approach. Moreover, the proposed approach avoids a problem of classical methods, which are tied to a specific training data set.
Keywords: Word Embeddings, Sentiment Analysis, Twitter
ISSN 1613-0073. Copyright © 2017 by the paper's authors. Copying permitted for private and academic purposes.

1 Introduction

The 6th edition of the TASS workshop consists of two sentiment-analysis tasks on Spanish tweets: (1) polarity classification at the global level, and (2) aspect-based sentiment analysis, in which the goal is to predict the polarity of tweets with respect to a set of identified aspects (Martínez-Cámara et al., 2017).

The task of polarity classification has been approached in many different ways. One of them consists in representing a word as a vector and using that vector to measure similarity with other words. This technique, known as word embeddings, is a well-known way to obtain vector representations of words in natural language processing. Although widely used for English, there are few implementations of this approach for Spanish. Many studies use deep learning as the main approach to sentiment analysis, measuring similarity against a bag of words that represents each sentiment (Alvarez-López et al., 2016). From this comparison, a feature vector can be built and used with classical machine learning algorithms. Our system uses this kind of approach to classify the sentiment of each aspect of each tweet presented in the task.

This paper summarizes the participation of the C100T-PUCP team from Pontificia Universidad Católica del Perú in the second task of the workshop. In this edition, we propose a word embedding-based approach to aspect-level polarity classification. First, we train a word embeddings model on a politics corpus to measure similarity between tweet words. Then, we explore a feature selection method for unbalanced data. Finally, we run experiments with several classifiers using the word embeddings and the obtained features.

This paper is organized as follows: an overview of related work is given in Section 2. Section 3 presents the analyzed corpus and its class distribution. The system is described in Section 4. Section 5 shows the experiments and results, and finally, Section 6 presents conclusions and future work.

2 Related Work

In TASS 2014, an Aspect Detection task and an Aspect-based Sentiment Analysis task were proposed (Román et al., 2015). The corpus, called Social-TV, was composed of tweets in Spanish related to the final game of the "Copa del Rey". Two works were submitted to these tasks. The first one proposed a method to detect aspects by matching tweet content against a pre-specified set of features related to the football domain (Vilares et al., 2014); this method obtained an F1-measure of 0.854. To identify the polarity of each aspect, the same authors used a supervised method with syntax-based features, which obtained an F1-measure of 0.546. The second work proposed an aspect detection method based on a list of features and a set of regular expressions (Hurtado and Pla, 2014), which obtained an F1-measure of 0.909. For polarity detection, these authors proposed a supervised method whose features were a list of positive and negative terms and a list of words from the training corpus sorted by TF-IDF; it obtained an F1-measure of 0.587.

In TASS 2015, only the Aspect-based Sentiment Analysis task was proposed, but a new corpus was added to the evaluation (Villena-Román et al., 2015). This corpus, called STOMPOL, was composed of tweets in Spanish related to politics. In this edition, a method similar to the one presented in TASS 2014 was proposed (Hurtado, Pla, and Buscaldi, 2015); it added a dictionary and an SVM algorithm, and showed an accuracy of 65.50% on the Social-TV corpus and 63.3% on the STOMPOL corpus (the F-measure was not reported). Another method used a set of lexical and morphosyntactic features in a supervised learning algorithm (Araque et al., 2015). The problem was divided into three steps: (1) identifying entities, (2) getting the context (using a graph-based algorithm), and (3) running the supervised learning algorithm. This method obtained an accuracy of 63.5% and an F-measure of 0.606 on the Social-TV corpus. Finally, a third work proposed a deep learning-based approach (Vilares et al., 2015). These authors used an LSTM neural network for polarity detection, obtaining an accuracy of 61.00% on the Social-TV corpus and 59.9% on the STOMPOL corpus; the F-measure was not reported.

In TASS 2016, two proposals were submitted to the Aspect-based Sentiment Analysis task (Villena-Román et al., 2016), run on the STOMPOL corpus. The first one applied a supervised algorithm with Aspect, Lemma, POS tag, Negation, and Word Token features extracted from the training corpus (Alvarez-López et al., 2016), obtaining an F1-measure of 0.463. The other proposal experimented with different supervised algorithms using the same features as in TASS 2015 (Hurtado and Pla, 2016); their best method obtained an F1-measure of 0.526.

3 STOMPOL Corpus

The STOMPOL corpus is composed of tweets in Spanish about the Spanish elections of 2015. This corpus was presented in TASS 2015 (Villena-Román et al., 2015). Each tweet is related to one of the following aspects: Economics, Health System, Education, Political Party, or Others. Each aspect is labeled with one of three sentiments: positive, negative, or neutral. The distribution of aspects in the training data, and of sentiments per aspect, is shown in Table 1. As shown, this dataset has unbalanced classes and aspects; for example, there are many more samples for the Political Party aspect and for negative sentiment.

4 System Description

The system presented in this edition of TASS preprocesses each tweet by removing stopwords with the NLTK toolkit; URLs and special characters are also removed. After this, all words are passed through the Freeling lemmatizer, version 4.0. Hashtags and user mentions are kept. This preprocessing also completes the tokenization of all tweets.

Aspect-based Sentiment Analysis was tackled as a classification problem. Support vector machines (SVM) and adaptive boosting (AdaBoost) classifiers were used, because previous work showed that they behave well when classifying long feature vectors. For these models, the scikit-learn implementations were used: sklearn.ensemble.AdaBoostClassifier for adaptive boosting and sklearn.svm.SVC, with a polynomial kernel, for the SVM.

Each vector was filled with the cosine similarity between each feature and the 50 most important words for each sentiment, selected with a probabilistic appearance metric (Liu, Loh, and Sun, 2009), to give context to the aspect. The similarity was calculated over the vector representations of the words, obtained with Mikolov's Word2vec model. Finally, each model was verified using 10-fold cross-validation and a learning curve, to check that our models are not over-fitted.

4.1 Word Embeddings Generation

A word embeddings model was created using Mikolov's Word2vec model (Mikolov et al., 2013) in its GenSim implementation. The sentences given to the model were from the same domain as the corpus, i.e., politics. Tweets were selected using search queries related to political parties from South America and Spain, in a range from 2012 to 2015. Even though many tweets were collected, much more data was needed; for that reason, online news websites were also scraped. Specifically, we obtained texts from the Spanish newspapers "El país" (https://elpais.com/), "ABC" (http://www.abc.es/), and "20 Minutos" (http://www.20minutos.es/). From these sites, only political news was selected, with no time restriction.

After this, we had 400 MB of data, comprising about 5 million sentences, 30 million words, and 1.5 million unique words. The corpus was preprocessed with the following steps:

• removing stopwords and special characters that are common in tweets, such as "..." and URLs
• removing words of only one character
• removing numbers
• lemmatizing words with Freeling (http://nlp.lsi.upc.edu/freeling/), keeping all user mentions and hashtags

Additionally, mentions of particular entities in the news data were replaced with their Twitter user ids. This makes the news data as similar as possible to the Twitter data, so that the embeddings capture the same relations between words. For example, if a news item mentions Pablo Iglesias, we replace the name with his Twitter user id, and we do the same for other common political figures.

To create the word embeddings from the corpus, we used the Word2vec model (Mikolov et al., 2013). This model offers two neural network architectures to generate the embeddings; in this case, the Skip-gram architecture was used, because it is recommended when the training data is small. After this, we tested different values for the model hyperparameters: the minimum word count, the context window, and the vector size.
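The preprocessing steps listed above can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it substitutes a small inline stopword set for NLTK's Spanish list, and the Freeling lemmatization step is only marked by a comment.

```python
import re

# Tiny illustrative stopword set; the paper uses NLTK's Spanish stopwords.
STOPWORDS = {"de", "la", "el", "en", "y", "a", "que", "los", "se", "un"}

URL_RE = re.compile(r"https?://\S+")
# Keep word characters (including accented letters) plus the @/# prefixes
# of mentions and hashtags; everything else counts as a special character.
TOKEN_RE = re.compile(r"[@#]?\w+", re.UNICODE)

def preprocess(tweet: str) -> list[str]:
    """Tokenize a tweet as Section 4.1 describes: strip URLs and special
    characters, drop stopwords, one-character tokens and numbers, and
    keep user mentions and hashtags."""
    text = URL_RE.sub(" ", tweet.lower())
    out = []
    for tok in TOKEN_RE.findall(text):
        if tok.startswith(("@", "#")):
            out.append(tok)                  # keep mentions and hashtags
        elif tok in STOPWORDS or len(tok) <= 1 or tok.isdigit():
            continue                         # drop stopword / 1-char / number
        else:
            out.append(tok)                  # Freeling lemmatization would go here
    return out

tokens = preprocess(
    "El partido ganó 3 escaños... ver https://t.co/xyz #eleccion2015 @usuario"
)
# tokens == ['partido', 'ganó', 'escaños', 'ver', '#eleccion2015', '@usuario']
```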
Aspects Tweets Negative Neutral Positive
Economics 117 78 20 19
Health 21 7 10 4
Education 30 16 12 2
Political Party 777 431 176 170
Others 98 54 22 22
Total 1043 586 240 217
Table 1: STOMPOL Corpus distribution
The values selected through experimentation are shown in Table 2.

Hyperparameter       Value
Minimum Word Count   100
Vector Size          300
Context Window       6

Table 2: Hyperparameter Values

4.2 Aspect-based Sentiment Analysis

For this task, a window of three words around the previously identified aspect in the training corpus was selected to extract the features. Each tweet in the training corpus was preprocessed in the same way as for the word embeddings model: removing stopwords, special characters, and URLs, then lemmatizing while keeping user mentions and hashtags. The analysis was based on the cosine similarity of these words with the words selected as features for detecting the sentiment.

In this step, we first needed to create a dictionary of the words that best represent each sentiment, classified using the sentiment labels in the training data. We used two metrics, TF-IDF and probabilistic occurrence (Liu, Loh, and Sun, 2009), to estimate how relevant a word is to a sentiment, so that these words could form a dictionary for each sentiment. This gave us more words to compare against the words selected in the context of the detected aspect. These metrics were also applied to build a second dictionary per aspect, relating each aspect to each sentiment, so that a more versatile feature set was extracted.

After selecting the features, two vectors were created: (1) based on probabilistic occurrence, and (2) based on probabilistic occurrence per aspect. Each vector was filled in two ways: one was the traditional approach, i.e., a bag-of-words vector filled with 1 if the word feature occurs in the window; the other filled each feature with the highest similarity value to that feature. Each vector was then used to train SVM and AdaBoost models. In total, we submitted three runs:

• Run 1: a vector filled with the cosine similarity between each word and the words in the polarity dictionaries, used to train an SVM model with a gamma value of 0.31622776601683794, C of 1, and degree 2.

• Run 2: the same vector, with hyperparameters gamma 0.0316, C 1122.018, and degree 2.

• Run 3: an AdaBoost classifier with a modified vector that contains the most representative words of each sentiment per aspect, plus a one-hot encoding of the aspect. This model used a Naive Bayes classifier as the weak learner. The AdaBoost hyperparameters, selected experimentally, were a learning rate of 0.000001 and 100 estimators.

5 Results and Discussion

For the experiments, an SVM classifier and an AdaBoost classifier were tested with each set of features prepared from the previously created dictionaries. To evaluate these approaches, F1-score (F1), precision (P), and recall (R) were used. These metrics evaluate how well the model predicts, based on how many true positives it produces and, inversely, how many false positives it produces.
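The similarity-based feature filling behind Run 1 can be sketched as follows. The dictionary words and embeddings are invented for illustration; in the real system, the vectors come from the trained word2vec model.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_features(window, dictionary, vectors):
    """One feature per dictionary word: the best cosine similarity
    between that word and any word in the aspect's context window.
    Out-of-vocabulary words contribute nothing (the feature stays 0)."""
    feats = np.zeros(len(dictionary))
    for j, dword in enumerate(dictionary):
        if dword not in vectors:
            continue
        sims = [cosine(vectors[w], vectors[dword]) for w in window if w in vectors]
        feats[j] = max(sims, default=0.0)
    return feats

# Toy 8-dimensional embeddings standing in for the 300-dimensional
# word2vec vectors.
rng = np.random.default_rng(0)
vectors = {w: rng.normal(size=8) for w in ["bueno", "malo", "ganar", "crisis"]}
dictionary = ["bueno", "malo"]   # per-sentiment top words
window = ["ganar", "crisis"]     # 3-word window around the aspect

x = similarity_features(window, dictionary, vectors)
```

These feature vectors would then feed `sklearn.svm.SVC(kernel="poly", C=1, gamma=0.31622776601683794, degree=2)`, matching the Run 1 configuration.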
On the other hand, accuracy simply measures how many predictions are correct. Furthermore, to evaluate sentiment polarity detection, the TASS evaluation page was used. This page reports macro-averaged F1-score, recall, and precision, as well as accuracy, based on how well each aspect of a specific entity is predicted.

The results for polarity classification are presented in Table 3. In this case, an SVM classifier and an AdaBoost classifier with a Naive Bayes base model were tested. Also, as discussed earlier, the data is not well balanced, so SMOTE oversampling was applied for these methods. The SVM-run2 model was built with the features based on the probabilistic weights of the words. It predicts the sentiment from the similarity between the words selected as sentiment representatives and each word that represents a feature in the vector, keeping the best score among these words. This vector was also used for the other models, except for ADA-run3.

SVM-run1 assumes that a word may have a different meaning in each context, so it uses a vector containing words that represent all the sentiments for each aspect, plus a feature indicating which element the tweet came from. This model performs better than its counterpart SVM that only uses the top words over the whole training set, and it has better accuracy than ADA-run3, but not a better F1-score.

Execution P R F1 Ac.
run1 0.418 0.413 0.415 0.563
run2 0.412 0.426 0.416 0.517
run3 0.452 0.438 0.445 0.528

Table 3: Results of the different experiments

As seen, although the model was not trained with a large amount of data, it still had almost all the words in its vocabulary. This suggests that the sentiment polarity errors were probably caused by the unbalanced data per sentiment and aspect: the sentiments were mostly negative, with few neutral or positive examples. Even so, this approach gave results close to those of other participants that use domain-specific systems.

6 Conclusions and Future Work

In this 6th edition of TASS, we tried a word embedding approach to classify the sentiment of an aspect. With this approach, we generate a feature vector of how similar each selected word is to the features in the vector. The performance of the models was compared with those of the other TASS participants.

Although the model was not trained with as much data as it required, it captured the similarity for most of the words in the training set, and the results were close to those of the other participants. These results may give an idea of how well word embeddings can perform in sentiment analysis, since word embeddings are not tied to a specific set of words, unlike traditional methods (using bag-of-words).

Given these results, we propose testing this approach with training data that is better balanced across aspects and sentiments. Also, gathering more sentences to train the word2vec model could improve the differentiation of the words used as features. This improvement could yield better results because, as we saw, the models can learn to predict a class very well when they have plenty of information about it.

References

Alvarez-López, T., M. F. Gavilanes, S. García-Méndez, J. Juncal-Martínez, and F. J. González-Castano. 2016. GTI at TASS 2016: Supervised approach for aspect based sentiment analysis in Twitter. In Proceedings of TASS 2016: Workshop on Semantic Analysis at SEPLN (TASS 2016), pages 53-57.

Araque, O., I. Corcuera, C. Román, C. A. Iglesias, and J. F. Sánchez-Rada. 2015. Aspect based sentiment analysis of Spanish tweets. In Proceedings of TASS 2015: Workshop on Semantic Analysis at SEPLN (TASS 2015), pages 29-34.

Hurtado, L.-F. and F. Pla. 2014. ELiRF-UPV en TASS 2014: Análisis de sentimientos, detección de tópicos y análisis de sentimientos de aspectos en Twitter. Procesamiento del Lenguaje Natural.

Hurtado, L.-F. and F. Pla. 2016. ELiRF-UPV en TASS 2016: Análisis de sentimientos en Twitter. In Proceedings of TASS 2016:
Workshop on Semantic Analysis at SEPLN (TASS 2016), pages 47-51.

Hurtado, L.-F., F. Pla, and D. Buscaldi. 2015. ELiRF-UPV en TASS 2015: Análisis de sentimientos en Twitter. In Proceedings of TASS 2015: Workshop on Semantic Analysis at SEPLN (TASS 2015), pages 75-79.

Liu, Y., H. T. Loh, and A. Sun. 2009. Imbalanced text classification: A term weighting approach. Expert Systems with Applications, 36(1):690-701.

Martínez-Cámara, E., M. C. Díaz-Galiano, M. A. García-Cumbreras, M. García-Vega, and J. Villena-Román. 2017. Overview of TASS 2017. In J. Villena Román, M. A. García Cumbreras, E. Martínez-Cámara, M. C. Díaz-Galiano, and M. García Vega, editors, Proceedings of TASS 2017: Workshop on Semantic Analysis at SEPLN (TASS 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.org.

Mikolov, T., K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Román, J. V., E. M. Cámara, J. G. Morera, and S. M. J. Zafra. 2015. TASS 2014 - the challenge of aspect-based sentiment analysis. Procesamiento del Lenguaje Natural, 54:61-68.

Vilares, D., Y. Doval, M. A. Alonso, and C. Gómez-Rodríguez. 2014. LyS at TASS 2014: A prototype for extracting and analysing aspects from Spanish tweets. In Proceedings of TASS 2014: Workshop on Semantic Analysis at SEPLN (TASS 2014).

Vilares, D., Y. Doval, M. A. Alonso, and C. Gómez-Rodríguez. 2015. LyS at TASS 2015: Deep learning experiments for sentiment analysis on Spanish tweets. In Proceedings of TASS 2015: Workshop on Semantic Analysis at SEPLN (TASS 2015), pages 47-52.

Villena-Román, J., M. Á. G. Cumbreras, E. M. Cámara, M. C. Díaz-Galiano, M. T. Martín-Valdivia, and L. A. U. López. 2016. Overview of TASS 2016. In Proceedings of TASS 2016: Workshop on Semantic Analysis at SEPLN (TASS 2016), volume 1702 of CEUR Workshop Proceedings. CEUR-WS.org.

Villena-Román, J., J. García-Morera, M. A. G. Cumbreras, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. U. López. 2015. Overview of TASS 2015. In Proceedings of TASS 2015: Workshop on Semantic Analysis at SEPLN (TASS 2015), volume 1397 of CEUR Workshop Proceedings, pages 13-21. CEUR-WS.org.