=Paper=
{{Paper
|id=Vol-1896/p3_tecnolengua_tass2017
|storemode=property
|title=Tecnolengua Lingmotif at TASS 2017: Spanish Twitter Dataset Classification Combining Wide-Coverage Lexical Resources and Text Features
|pdfUrl=https://ceur-ws.org/Vol-1896/p3_tecnolengua_tass2017.pdf
|volume=Vol-1896
|authors=Antonio Moreno-Ortiz,Chantal Pérez Hernández
}}
==Tecnolengua Lingmotif at TASS 2017: Spanish Twitter Dataset Classification Combining Wide-Coverage Lexical Resources and Text Features==
TASS 2017: Workshop on Semantic Analysis at SEPLN, September 2017, pages 35-42
Tecnolengua Lingmotif at TASS 2017: Spanish
Twitter Dataset Classification Combining
Wide-coverage Lexical Resources and Text Features
Tecnolengua Lingmotif en TASS 2017: Clasificación de
polaridad de tuits en español combinando recursos léxicos de
amplia cobertura con rasgos textuales.
Antonio Moreno-Ortiz & Chantal Pérez Hernández
University of Málaga
Spain
{amo, mph}@uma.es
Abstract: In this paper we describe our participation in the TASS 2017 shared task on polarity classification of Spanish tweets. For this task we built a classification model based on the Lingmotif Spanish lexicon and combined it with a number of formal text features, both general and CMC-specific, as well as single-word and n-gram keywords, achieving above-average results across all three datasets. We report the results of our experiments with different combinations of these feature sets and machine learning algorithms (logistic regression and SVM).
Keywords: sentiment analysis, twitter, polarity classification
Resumen: En este artículo describimos nuestra participación en la tarea de clasificación de polaridad de tweets en español del TASS 2017. Para esta tarea hemos desarrollado un modelo de clasificación basado en el lexicón español de Lingmotif, combinado con una serie de rasgos formales de los textos, tanto generales como específicos de la comunicación mediada por ordenador (CMC), junto con palabras y unidades fraseológicas clave, lo que nos ha permitido obtener unos resultados por encima de la media en los tres conjuntos de la prueba. Mostramos los resultados de nuestros experimentos con diferentes combinaciones de conjuntos de funciones y algoritmos de aprendizaje automático (regresión logística y SVM).
Palabras clave: análisis de sentimiento, twitter, clasificación de polaridad
1 Introduction

The use of microblogging sites in general, and Twitter in particular, has become so well established that it is now a common source to poll user opinion and even social happiness (Abdullah et al., 2015). Its relevance as a social hub can hardly be overestimated, and it is now common for traditional media to reference Twitter trending topics as an indicator of social concerns and interests.

It is not surprising, then, that Twitter datasets are increasingly being used for sentiment analysis shared tasks. The SemEval series of shared tasks included Sentiment Analysis of English Twitter content in 2013 (Nakov et al., 2013) and added other languages in later editions. The TASS Workshop on Sentiment Analysis at SEPLN series started in 2012 and has continued on a yearly basis, thus being a milestone not only for Spanish Twitter content, but for sentiment analysis in general.

The General Corpus of TASS, consisting of over 68,000 polarity-annotated tweets, was published for TASS 2013 (Villena Román et al., 2013), which also introduced aspect-based sentiment analysis. Its creation followed certain design criteria in terms of topics (politics, football, literature, and entertainment) and users.

TASS 2017 (Martínez-Cámara et al., 2017) keeps the Spain-only General Corpus of TASS and introduces a new international corpus of Spanish tweets, named InterTASS. The InterTASS corpus adds considerable difficulty to the tasks, not only because of its multi-varietal nature, but also because, unlike the General Corpus of TASS, its content has not been filtered nor its users selected, which introduces many and varied decoding issues.
1.1 Classification tasks

TASS 2017 proposes two classification tasks. Task 1 focuses on sentiment analysis at the tweet level, while Task 2 deals with aspect-based sentiment classification. We took part in Task 1, since we have not yet tackled aspect-based sentiment analysis. The aim of this task is the automatic classification of tweets into one of four levels: positive, negative, neutral, and none.

The neutral/none distinction introduces added difficulty to the classification task. Tweets annotated as none are supposed to express no sentiment whatsoever, as in informative or declarative texts, whereas the neutral category is meant to qualify tweets where both positive and negative opinion is expressed, but they cancel each other out, resulting in a neutral overall message.

We believe this distinction is too fuzzy to be annotated reliably. First, a precise balance of polarity is hardly ever found in any message where sentiment is expressed: the message is usually "negative/positive situation x, somehow counterbalanced by positive/negative situation y", with an entailment that the result is tilted to either side. The following are examples of tweets tagged as neutral in the training set:

• 768547351443169284 Parece que las cosas no te van muy bien, espero que todo mejore, que todo el mundo merece ser feliz.
• 770417499317895168 No hay nada más bonito q separarse d una persona y q al tiempo t diga q t echa de menos... pero a mi no m va a pasar

We also found a number of examples where tweets that clearly fell into the none category were wrongly annotated as neutral:

• 768588061496209408 Estas palabras, del Poema, INSTANTES, son de Nadine Stair. Escritora norteamericana, a la q le gustan los helados.
• 767846757996847104 pues imaginate en una casa muy grande
• 769993102442524674 Ninguno de los clubes lo hizo oficial pero se dice que sí

These annotation issues are to be expected, due to the added cognitive load that is placed on the annotators, as other researchers have pointed out (Mohammad and Bravo-Marquez, 2017a). Also, the presence of the none class makes it more difficult to compare results with those of other sentiment classification shared tasks, where this class is not considered.

1.2 Lexicon-based Sentiment Analysis

Within sentiment analysis it is common to distinguish corpus-based approaches from lexicon-based approaches. Although a combination of both methods can be found in the literature (Riloff, Patwardhan, and Wiebe, 2006), lexicon-based approaches are usually preferred for sentence-level classification (Andreevskaia and Bergler, 2007), whereas corpus-based, statistical approaches are preferred for document-level classification.

Using sentiment dictionaries has a long tradition in the field. WordNet (Fellbaum, 1998) has been a recurrent source of lexical information (Kim and Hovy, 2004; Hu and Liu, 2004; Adreevskaia and Bergler, 2006), either used directly as a source of lexical information or for sentiment lexicon construction. Other common lexicons used in English sentiment analysis research include the General Inquirer (Stone and Hunt, 1963), MPQA (Wilson, Wiebe, and Hoffmann, 2005), and Bing Liu's Opinion Lexicon (Hu and Liu, 2004). Yet other researchers have used a combination of existing lexicons or created their own (Hatzivassiloglou and McKeown, 1997; Turney, 2002). The use of lexicons has sometimes been straightforward, where the mere presence of a sentiment word determines a given polarity. However, negation and intensification can alter the valence or polarity of that word (the terms valence and polarity are used inconsistently in the literature; we use polarity to refer to the binary positive/negative distinction, and valence to a value of intensity on a scale). Modification of sentiment in context has also been widely recognized and dealt with by some researchers (Kennedy and Inkpen, 2006; Polanyi and Zaenen, 2006; Choi and Cardie, 2008; Taboada et al., 2011).

However, the valence of a given word may vary greatly from one domain to another, a fact well recognized in the literature (Aue and Gamon, 2005; Pang and Lee, 2008; Choi, Kim, and Myaeng, 2009), which causes problems when a sentiment lexicon is the only source of knowledge. A number of solutions have been proposed, mostly using ad hoc dictionaries, sometimes created automatically from a domain-specific corpus (Tai and Kao, 2013; Lu et al., 2011).

Our approach to using a lexicon takes some ideas from the aforementioned approaches. We describe it in the next section.
2 System description

Our system for this polarity classification task relies on the availability of rich sets of lexical, sentiment, and (formal) text features, rather than on highly sophisticated algorithms. We basically used a logistic regression classifier trained on the optimal set of features after many feature combinations were tried on the training set. We also tried an SVM classifier on the same feature sets, but we consistently obtained poorer results compared to the logistic regression classifier. Parameter fine-tuning on each classifier was very limited; we simply performed a grid search on the C parameter, which yielded 100 as the optimal value. For the SVM classifier we found the RBF kernel (gamma=0.001, C=100) to perform better than the linear kernel (C=1000). We mostly focused on feature selection and combination.
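To make this setup concrete, the sketch below reproduces the reported configuration with scikit-learn: a grid search over C for the logistic regression classifier (which we report above as yielding C=100), and SVM baselines with the RBF (gamma=0.001, C=100) and linear (C=1000) kernels. The feature matrix and labels are random placeholders standing in for the feature sets described in the following sections; this is an illustrative sketch, not the actual Lingmotif Learn code.

# Minimal sketch of the classifier setup described above (not the actual
# Lingmotif Learn code). X is a numeric feature matrix built from the
# sentiment, text, and keyword features; y holds the polarity labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder data: 100 tweets x 10 features, 4 polarity classes.
rng = np.random.default_rng(0)
X = rng.random((100, 10))
y = rng.choice(["P", "N", "NEU", "NONE"], size=100)

# Grid search over C for the logistic regression classifier;
# the paper reports C=100 as the optimum found this way.
lr_search = GridSearchCV(LogisticRegression(max_iter=1000),
                         param_grid={"C": [0.01, 0.1, 1, 10, 100, 1000]},
                         cv=5)
lr_search.fit(X, y)
print("best C for logistic regression:", lr_search.best_params_)

# SVM baselines with the kernels and parameters reported above.
svm_rbf = SVC(kernel="rbf", gamma=0.001, C=100).fit(X, y)
svm_linear = SVC(kernel="linear", C=1000).fit(X, y)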
We obtained good results on the three test datasets, with some important differences between the InterTASS and General datasets. Results, however, were not as good as we had anticipated based on our experiments on the training datasets. We discuss this in section 3 below. Here we describe our general system architecture and feature sets.

This TASS shared task is our first experience with Twitter data sentiment classification proper, although we had related experience from our recent participation in the WASSA-2017 Shared Task on Emotion Intensity (Mohammad and Bravo-Marquez, 2017b). From that shared task we learnt the relevance and impact that other, non-lexical text features can have in microblogging texts. Since our focus was on identifying the predictive power of classification features, and we intended to perform many experiments with feature combinations, we designed a simple tool to facilitate this.

This tool, Lingmotif Learn, is a GUI-enabled convenience tool that manages datasets and uses the Python-based scikit-learn (Pedregosa et al., 2011) machine learning toolkit. It facilitates loading and preprocessing datasets, running the text through the Lingmotif SA engine, and feeding the resulting data into one of several machine learning algorithms. Lingmotif Learn is able to extract both sentiment features and non-sentiment features, such as raw text metrics and keywords, and it makes it easy to experiment with different feature set combinations.

Figure 1: Lingmotif Learn

2.1 The Lingmotif tool

Sentiment features are returned by the Lingmotif SA engine. Lingmotif (Moreno-Ortiz, 2017a) is a user-friendly, multilingual text analysis application with a focus on sentiment analysis that offers several modes of text analysis. It is not specifically geared towards any particular type of text or domain. It can analyze long documents, such as narratives; medium-sized ones, such as political speeches and debates; and short to very short texts, such as user reviews and tweets. For each of these, the tool offers different outputs and metrics.

For large collections of short texts, such as Twitter datasets, it provides a multi-document mode whose default output is classification. In the current publicly available version this classification is entirely based on the Text Sentiment Score (TSS), which attempts to summarize the text's overall polarity on a 0-100 scale. TSS is calculated as a function of the text's positive and negative scores and the sentiment intensity, which reflects the proportion of sentiment to non-sentiment lexical items in the text. Specific details on TSS calculation can be found in Moreno-Ortiz (2017a); a description of its applications is found in Moreno-Ortiz (2017b).
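The exact TSS formula is defined in Moreno-Ortiz (2017a) and is not reproduced here; the toy function below is purely illustrative, combining the same ingredients (positive and negative scores, and intensity as the proportion of sentiment-bearing items) into a 0-100 value under assumptions of our own.

# Purely illustrative: NOT Lingmotif's actual TSS formula, just a toy
# function with the same ingredients (positive score, negative score,
# and the proportion of sentiment-bearing items) and the same 0-100 range.
def toy_sentiment_score(pos_score: float, neg_score: float,
                        sentiment_items: int, lexical_items: int) -> float:
    if lexical_items == 0 or (pos_score + neg_score) == 0:
        return 50.0  # no lexical evidence: neutral midpoint
    # Polarity balance in [-1, 1], weighted by sentiment intensity
    # (proportion of sentiment items among all lexical items).
    balance = (pos_score - neg_score) / (pos_score + neg_score)
    intensity = sentiment_items / lexical_items
    return 50.0 + 50.0 * balance * intensity

# Example: mildly positive tweet where 3 of 8 lexical items carry sentiment.
print(toy_sentiment_score(pos_score=4.0, neg_score=1.0,
                          sentiment_items=3, lexical_items=8))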
Lingmotif results are generated as an HTML/Javascript document, which is saved locally to a predefined location and automatically sent to the user's default browser for immediate display. Internally, the application generates results as an XML document containing all the relevant data; this XML document is then parsed against one of several available XSL templates and transformed into the final HTML.

Lingmotif Learn simply plugs into the internally generated XML document to retrieve the desired sentiment analysis data, and appends the data to each tweet as features.
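As an illustration of this step, the sketch below pulls per-tweet values out of an XML results document and turns them into a feature dictionary. The element and attribute names are hypothetical; Lingmotif's actual XML structure is not documented here.

# Hypothetical sketch of harvesting per-tweet sentiment data from an XML
# results document; the element and attribute names are invented for
# illustration and do not reflect Lingmotif's real output format.
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<results>
  <doc id="768547351443169284" tss="62" pos_score="4" neg_score="1"/>
  <doc id="770417499317895168" tss="38" pos_score="1" neg_score="3"/>
</results>
"""

def extract_features(xml_text: str) -> dict:
    """Return {tweet_id: feature_dict} from the (hypothetical) XML output."""
    root = ET.fromstring(xml_text)
    features = {}
    for doc in root.iter("doc"):
        features[doc.get("id")] = {
            "pos.sc": float(doc.get("pos_score")),
            "neg.sc": float(doc.get("neg_score")),
        }
    return features

print(extract_features(SAMPLE_XML))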
2.2 Sentiment features

Table 1 summarizes the sentiment-related feature set generated by the Lingmotif engine.

tss          Text Sentiment Score
tsi          Text Sentiment Intensity
sent.it      Number of lexical items
pos.sc       Positive score
neg.sc       Negative score
pos.it       Number of positive items
neg.it       Number of negative items
neu.it       Number of neutral items
split1.tss   TSS for split 1 of text
split2.tss   TSS for split 2 of text
sentences    Number of sentences
shifters     Number of sentiment shifters
Table 1: Sentiment feature set

Most of these features are included in the original Lingmotif engine, but for this occasion we experimented with text splits to test the relevance of the position of sentiment words in the tweet. The features split1.tss and split2.tss are the combined sentiment score for each half of the tweet. The assumption was that sentiment words used towards the end of the tweet may have more weight on the overall tweet polarity, which might be helpful especially for the P/N/NEU distinction, since neutral tweets are supposed to have some balance between positivity and negativity. In our tests with the training set, however, adding these features did not improve results. We also experimented with 3 splits, with the same results. These features were thus discarded for test set classification.
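A minimal sketch of the half-split idea, assuming a toy word-level lexicon as a stand-in for the scores produced by the Lingmotif engine:

# Minimal sketch of the half-split idea behind split1.tss / split2.tss:
# score each half of the tweet separately. The tiny lexicon and the
# scoring function are toy stand-ins for the Lingmotif engine.
TOY_LEXICON = {"feliz": 1.0, "bonito": 1.0, "mejore": 0.5, "triste": -1.0}

def half_scores(tweet: str) -> tuple:
    tokens = tweet.lower().split()
    mid = len(tokens) // 2
    halves = (tokens[:mid], tokens[mid:])
    return tuple(sum(TOY_LEXICON.get(tok, 0.0) for tok in half)
                 for half in halves)

# The second half of this example carries the sentiment-bearing word.
print(half_scores("Parece que las cosas no van bien, espero que todo mejore"))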
Some of these features are in fact redundant. Notably, tss already encapsulates pos.sc, neg.sc, and neu.it. In our tests, the classifier performed better using just the pos.sc and neg.sc values than our calculated tss, so we only used these two features.

2.3 Text features

Raw text features have been used successfully in sentiment analysis shared tasks (e.g. Mohammad, Kiritchenko, and Zhu (2013), Kiritchenko et al. (2014)), including previous editions of TASS (Cerón-Guzmán, 2016). The role of some of them is rather obvious; the presence of emoticons or exclamation marks, for example, usually signals (strong) sentiment or opinion, thus being a good candidate predictor for the none-vs-rest distinction. The role of others, however, is not as clear. For example, we consistently obtained better results using the gram.items feature, whereas the number of lexical items was not a good predictor. The number of verbs, adjectives and adverbs also proved to be useful, whereas the number of nouns did not.

Table 2 contains the full list of text features we experimented with.

sentences      Number of sentences
tt.ratio       Type/Token ratio
lex.items      Number of lexical items
gram.items     Number of grammatical items
vb.items       Number of verbs
nn.items       Number of nouns
nnp.items      Number of proper nouns
jj.items       Number of adjectives
rb.items       Number of adverbs
chars          Number of characters
intensifiers   Number of intensifiers
contrasters    Number of contrast words
emoticons      Number of emoticons/emojis
all.caps       Number of upper case words
char.ngrams    Number of character ngrams
x.marks        Number of exclamation marks
q.marks        Number of question marks
quote.marks    Number of quotation marks
susp.marks     Number of suspension marks
x.marks.seqs   Number of x.marks sequences
q.marks.seqs   Number of q.marks sequences
xq.marks.seqs  Number of x/q marks sequences
handles        Number of Twitter handles
hashtags       Number of hashtags
urls           Number of URLs
Table 2: Text feature set
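Most of the surface features in Table 2 can be counted with simple regular expressions, as in the sketch below; the exact patterns and the emoticon inventory used by Lingmotif Learn may differ.

# Sketch of counting a few of the surface features from Table 2 with
# regular expressions; the exact patterns used by Lingmotif Learn may differ.
import re

def text_features(tweet: str) -> dict:
    return {
        "x.marks": tweet.count("!"),
        "q.marks": tweet.count("?"),
        "susp.marks": len(re.findall(r"\.{3}|…", tweet)),
        "all.caps": len(re.findall(r"\b[A-ZÁÉÍÓÚÑ]{2,}\b", tweet)),
        "handles": len(re.findall(r"@\w+", tweet)),
        "hashtags": len(re.findall(r"#\w+", tweet)),
        "urls": len(re.findall(r"https?://\S+", tweet)),
        "chars": len(tweet),
    }

print(text_features("Estas palabras, del Poema, INSTANTES... @usuario #poesía!"))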
2.4 Keyword features

In order to account for words and expressions that convey sentiment but may not be included in the sentiment lexicon, we experimented with automatic keyword extraction for each of the classes in the training set. Automatic keyword and keyphrase extraction is a well-developed field and a number of tools and methodologies have been proposed. Hasan and Ng (2014) provide a good overview of the state-of-the-art techniques for keyphrase extraction.

We used a very simple approach that consisted in comparing frequencies of single words and ngrams (2 to 4 words) on a one-vs-rest basis for each of our four classes, for words and ngrams with a minimum frequency of 2. We calculated and ranked keyness based on the chi-square statistic, and then manually removed irrelevant results. We ended up with a list of 100 keywords and 100 keyphrases for each class. We did the same for Twitter handles. Table 3 lists the resulting feature set.

p.kw           Positive keywords
p.ng.kw        Positive ngram keywords
p.handles      Positive handles
n.kw           Negative keywords
n.ng.kw        Negative ngram keywords
n.handles      Negative handles
neu.kw         Neutral keywords
neu.ng.kw      Neutral ngram keywords
neu.handles    Neutral handles
none.kw        None keywords
none.ng.kw     None ngram keywords
none.handles   None handles
Table 3: Keywords feature set
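The following sketch illustrates the one-vs-rest chi-square keyness ranking described above for single words; 2-4 word ngrams and Twitter handles would be ranked in the same way. It is an illustrative helper, not the actual extraction script, and the manual filtering step is not shown.

# Sketch of one-vs-rest chi-square keyness for single words, as described
# above; ngrams (2-4 words) and handles would be ranked the same way.
# Illustrative only, not the actual extraction script.
from collections import Counter

def keyness(target_texts, rest_texts, min_freq=2, top_n=100):
    target = Counter(w for t in target_texts for w in t.lower().split())
    rest = Counter(w for t in rest_texts for w in t.lower().split())
    n_target, n_rest = sum(target.values()), sum(rest.values())
    scored = []
    for word, a in target.items():
        if a < min_freq:
            continue
        b = rest.get(word, 0)            # frequency in the other classes
        c, d = n_target - a, n_rest - b  # remaining tokens in each corpus
        n = a + b + c + d
        # 2x2 chi-square: observed vs. expected under independence.
        chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
        scored.append((chi2, word))
    return [w for _, w in sorted(scored, reverse=True)[:top_n]]

positive = ["que todo mejore que todo el mundo merece ser feliz", "muy feliz hoy"]
others = ["ninguno de los clubes lo hizo oficial", "pues imaginate en una casa"]
print(keyness(positive, others, min_freq=1, top_n=5))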
Using the keywords feature set improved results considerably in our tests with the training set. However, this improvement did not transfer well to the test sets, especially in the case of the InterTASS dataset. We discuss this issue further in section 3.
3 Experiments and Results

Tables 4, 5, and 6 show our results for each of the test sets. Although performance is strong across all three, there is clearly a difference between the General TASS datasets, on the one hand, and the InterTASS dataset on the other.

Experiment        Macro-F1   Accuracy
sent-only         0.456      0.582
run3              0.441      0.576
sent-only-fixed   0.441      0.595
Table 4: Official results for the InterTASS test set

Experiment        Macro-F1   Accuracy
run3              0.528      0.657
final             0.517      0.632
no ngrams         0.508      0.652
Table 5: Official results for the General TASS test set

Experiment        Macro-F1   Accuracy
run3              0.521      0.638
final             0.488      0.618
run4              0.483      0.612
Table 6: Official results for the General TASS-1k test set

We believe this is due to two main reasons. First, the General training set (7,218 tweets) is much larger than the InterTASS training set (1,514 tweets, using both the training and development datasets). This of course provides a much more solid training base for the former than the latter. All our models were trained on a single dataset in which both training datasets (General and InterTASS) were merged. Perhaps better results would have been obtained by training on each dataset separately.

The other reason for the poorer performance on the InterTASS test set concerns the very different nature of the datasets. The General Corpus of TASS consists of tweets generated by public figures (artists, politicians, journalists) with a large number of followers. Such Twitter users are more predictable both in terms of the content of their tweets and the language they use, and they are all Castilian Spanish speakers. Most of these tweets contain very compact but carefully chosen language, expressing the user's opinion or evaluation of politically or socially relevant events. The InterTASS corpus, on the other hand, shows much more variability: first, the tweets were collected not only from Spain but from several Latin American countries, which introduces important lexical variability. Second, no user selection is apparent; tweets were randomly collected from the whole Spanish-speaking user base. This introduces spelling errors and a much more colloquial and chatty language. Non-lexical linguistic features, such as exclamation marks, emojis or emoticons, are recurrent, as are user-to-user messages, which are of course hard to decode, since they presuppose certain privately shared knowledge. These issues have obviously affected the performance of all TASS participants, as is clear from the final leader board.
We obtained the best results for the General datasets with our run3 experiment, where we combined a selection of features from the three feature sets listed in Tables 1, 2, and 3. This selection was in fact the optimal one we found during our cross-validation tests on the training dataset. Table 7 lists the feature set used in this experiment.

pos.sc, neg.sc, vb.items, jj.items, rb.items, gram.items, n.chars, intensifiers, contrasters, p.kw, p.ng.kw, p.handles, n.kw, n.ng.kw, n.handles, neu.kw, neu.ng.kw, neu.handles, none.kw, none.ng.kw, none.handles, emoticons, all.caps, char.ngrams, x.marks, q.marks, susp.marks, hashtags, handles, urls
Table 7: run3 experiment feature set

Concerning the InterTASS test set, the best results were obtained with the sent-only experiment, where a reduced set of features was used. We list these features in Table 8.

pos.sc, neg.sc, vb.items, jj.items, rb.items, gram.items, n.chars, intensifiers, contrasters, handles, emoticons, all.caps, char.ngrams, x.marks, q.marks, susp.marks, urls, hashtags
Table 8: sent-only experiment feature set

We obtained better results for the InterTASS test set using this reduced set of features because the keyword sets were causing noise, since they had been extracted from the whole training set, which contained a much larger proportion of tweets from the General TASS dataset.

Another important aspect is the large difference that we encountered between our own tests on the training datasets and our final (official) results. For the General Corpus of TASS, we consistently obtained very high F1 scores (upwards of 0.73) using the keyword set, but scores much closer to the official results without them. This is a clear indication of model overfitting, with an obvious negative impact on the classification of the test set. After this became apparent with our first results upload, we corrected it by reducing the sets of keywords, keyphrases and user handles, which resulted in better overall results.

4 Conclusions

This shared task has allowed us to assess the usefulness of many different features as predictors of polarity classification in Spanish tweets. The differing sizes and characteristics of the training and test datasets determined our results to some extent, but we also felt we overfitted our model with too large a selection of keywords, which produced overoptimistic results in our tests.

Our results are on par with those of other participants who used technically more sophisticated systems, which is also an indication of the salient role that curated, high-quality lexical resources play in sentiment analysis.

We also experienced the negative impact of model overfitting and learnt how to limit its effects. We plan to use this knowledge in future versions of Lingmotif, which currently uses sentiment features exclusively. It is obvious that combining those with other formal features can improve results considerably.

Acknowledgments

This research was supported by Spain's MINECO through the funding of project Lingmotif2 (FFI2016-78141-P).

References

Abdullah, S., E. L. Murnane, J. M. Costa, and T. Choudhury. 2015. Collective smile: Measuring societal happiness from geolocated images. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, CSCW '15, pages 361–374, New York, NY, USA. ACM.

Adreevskaia, A. and S. Bergler. 2006. Mining WordNet for fuzzy sentiment: Sentiment tag extraction from WordNet glosses. In 11th Conference of the European Chapter of the Association for Computational Linguistics, pages 209–216.

Andreevskaia, A. and S. Bergler. 2007. CLaC and CLaC-NB: Knowledge-based and corpus-based approaches to sentiment tagging. In Proceedings of the 4th International Workshop on Semantic Evaluations, SemEval '07, pages 117–120, Stroudsburg, PA, USA. Association for Computational Linguistics.
Aue, A. and M. Gamon. 2005. Customizing sentiment classifiers to new domains: A case study. Borovets, Bulgaria.

Cerón-Guzmán, J. A. 2016. JACERONG at TASS 2016: An ensemble classifier for sentiment analysis of Spanish tweets at global level. In Proceedings of TASS 2016: Workshop on Sentiment Analysis at SEPLN co-located with 32nd SEPLN Conference (SEPLN 2016), pages 35–39, Salamanca, Spain. SEPLN.

Choi, Y. and C. Cardie. 2008. Learning with compositional semantics as structural inference for subsentential sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pages 793–801, Stroudsburg, PA, USA.

Choi, Y., Y. Kim, and S.-H. Myaeng. 2009. Domain-specific sentiment analysis using contextual feature generation. In Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pages 37–44, Hong Kong, China. ACM.

Fellbaum, C., editor. 1998. WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA; London, May.

Hasan, K. S. and V. Ng. 2014. Automatic keyphrase extraction: A survey of the state of the art. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1262–1273.

Hatzivassiloglou, V. and K. R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of the Eighth Conference of the European Chapter of the Association for Computational Linguistics, pages 174–181, Madrid, Spain. Association for Computational Linguistics.

Hu, M. and B. Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 168–177, Seattle, WA, USA. ACM.

Kennedy, A. and D. Inkpen. 2006. Sentiment classification of movie reviews using contextual valence shifters. Computational Intelligence, 22(2):110–125.

Kim, S.-M. and E. Hovy. 2004. Determining the sentiment of opinions. In Proceedings of the 20th International Conference on Computational Linguistics, page 1367, Geneva, Switzerland. Association for Computational Linguistics.

Kiritchenko, S., X. Zhu, C. Cherry, and S. Mohammad. 2014. NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 437–442, Dublin, Ireland, August. Association for Computational Linguistics and Dublin City University.

Lu, Y., M. Castellanos, U. Dayal, and C. Zhai. 2011. Automatic construction of a context-aware sentiment lexicon: An optimization approach. In Proceedings of the 20th International Conference on World Wide Web, WWW '11, pages 347–356, New York, NY, USA. ACM.

Martínez-Cámara, E., M. C. Díaz-Galiano, M. Á. García-Cumbreras, M. García-Vega, and J. Villena-Román. 2017. Overview of TASS 2017. In J. Villena Román, M. Á. García Cumbreras, E. Martínez-Cámara, M. C. Díaz Galiano, and M. García Vega, editors, Proceedings of TASS 2017: Workshop on Semantic Analysis at SEPLN (TASS 2017), volume 1896 of CEUR Workshop Proceedings, Murcia, Spain, September. CEUR-WS.

Mohammad, S. and F. Bravo-Marquez. 2017a. Emotion intensities in tweets. In Proceedings of the Sixth Joint Conference on Lexical and Computational Semantics (*SEM), Vancouver, Canada.

Mohammad, S. and F. Bravo-Marquez. 2017b. WASSA-2017 shared task on emotion intensity. In Proceedings of the EMNLP 2017 Workshop on Computational Approaches to Subjectivity, Sentiment, and Social Media, Copenhagen, Denmark, September.
Mohammad, S. M., S. Kiritchenko, and X. Zhu. 2013. NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of the Seventh International Workshop on Semantic Evaluation Exercises (SemEval-2013), Atlanta, Georgia, USA, June.

Moreno-Ortiz, A. 2017a. Lingmotif: A user-focused sentiment analysis tool. Procesamiento del Lenguaje Natural, 58(0):133–140, March.

Moreno-Ortiz, A. 2017b. Lingmotif: Sentiment analysis for the digital humanities. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pages 73–76, Valencia, Spain, April. Association for Computational Linguistics.

Nakov, P., Z. Kozareva, A. Ritter, S. Rosenthal, V. Stoyanov, and T. Wilson. 2013. SemEval-2013 Task 2: Sentiment analysis in Twitter. In Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, Georgia, USA, June.

Pang, B. and L. Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1–135.

Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and É. Duchesnay. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, November.

Polanyi, L. and A. Zaenen. 2006. Contextual valence shifters. In J. G. Shanahan, Y. Qu, and J. Wiebe, editors, Computing Attitude and Affect in Text: Theory and Applications, volume 20 of The Information Retrieval Series, pages 1–10. Springer, Dordrecht, The Netherlands.

Riloff, E., S. Patwardhan, and J. Wiebe. 2006. Feature subsumption for opinion analysis. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP '06, pages 440–448, Stroudsburg, PA, USA. Association for Computational Linguistics.

Stone, P. J. and E. B. Hunt. 1963. A computer approach to content analysis: Studies using the General Inquirer system. In Proceedings of the May 21-23, 1963, Spring Joint Computer Conference, AFIPS '63 (Spring), pages 241–256, New York, NY, USA. ACM.

Taboada, M., J. Brooke, M. Tofiloski, K. Voll, and M. Stede. 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2):267–307.

Tai, Y.-J. and H.-Y. Kao. 2013. Automatic domain-specific sentiment lexicon generation with label propagation. In Proceedings of the International Conference on Information Integration and Web-based Applications & Services, IIWAS '13, pages 53:53–53:62, New York, NY, USA. ACM.

Turney, P. D. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 417–424, Philadelphia, USA.

Villena Román, J., S. Lana Serrano, E. Martínez Cámara, and J. C. González Cristóbal. 2013. TASS - Workshop on sentiment analysis at SEPLN. Procesamiento del Lenguaje Natural, 50:37–44.

Wilson, T., J. Wiebe, and P. Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pages 347–354, Stroudsburg, PA, USA. Association for Computational Linguistics.