=Paper= {{Paper |id=Vol-1749/paper_036 |storemode=property |title=Exploiting Emotive Features for the Sentiment Polarity Classification of tweets |pdfUrl=https://ceur-ws.org/Vol-1749/paper_036.pdf |volume=Vol-1749 |authors=Lucia C. Passaro,Alessandro Bondielli,Alessandro Lenci |dblpUrl=https://dblp.org/rec/conf/clic-it/PassaroBL16a }} ==Exploiting Emotive Features for the Sentiment Polarity Classification of tweets== https://ceur-ws.org/Vol-1749/paper_036.pdf

Exploiting Emotive Features for the
Sentiment Polarity Classification of tweets

Lucia C. Passaro, Alessandro Bondielli and Alessandro Lenci
CoLing Lab, Dipartimento di Filologia, Letteratura e Linguistica
University of Pisa (Italy)
lucia.passaro@for.unipi.it
alessandro.bondielli@gmail.com
alessandro.lenci@unipi.it

Abstract

English. This paper describes the CoLing Lab system for the participation in the constrained run of the EVALITA 2016 SENTIment POLarity Classification Task (Barbieri et al., 2016). The system extends the approach in (Passaro et al., 2014) with emotive features extracted from ItEM (Passaro et al., 2015; Passaro and Lenci, 2016) and FB-NEWS15 (Passaro et al., 2016).

Italiano. Questo articolo descrive il sistema sviluppato all'interno del CoLing Lab per la partecipazione al task di EVALITA 2016 SENTIment POLarity Classification Task (Barbieri et al., 2016). Il sistema estende l'approccio descritto in (Passaro et al., 2014) con una serie di features emotive estratte da ItEM (Passaro et al., 2015; Passaro and Lenci, 2016) e FB-NEWS15 (Passaro et al., 2016).

1 Introduction

Social media and microblogging services are extensively used for rather different purposes, from news reading to news spreading, from entertainment to marketing. As a consequence, the study of how sentiments and emotions are expressed in such platforms, and the development of methods to automatically identify them, have emerged as a major area of interest in the Natural Language Processing community. Twitter presents many linguistic and communicative peculiarities. A tweet, in fact, is a short informal text (at most 140 characters), in which the frequency of creative punctuation, emoticons, slang, specific terminology, abbreviations, links and hashtags is higher than in other domains and platforms. Twitter users post messages from many different media, including their smartphones, and they "tweet" about a great variety of topics, unlike what can be observed in other sites, which appear to be tailored to a specific group of topics (Go et al., 2009).

The paper is organized as follows: Section 2 describes the architecture of the system, as well as the preprocessing and the features designed in (Passaro et al., 2014). Section 3 shows the additional features extracted from the emotive VSM and from LDA. Section 4 shows the classification paradigm, and the last sections are left for results and conclusions.

2 Description of the system

The system extends the approach in (Passaro et al., 2014) with emotive features extracted from ItEM (Passaro et al., 2015; Passaro and Lenci, 2016) and FB-NEWS15 (Passaro et al., 2016). The main goal of the work is to evaluate the contribution of a distributional affective resource to estimate the valence of words. The CoLing Lab system for polarity classification includes the following basic steps: (i) a preprocessing phase, to separate linguistic and nonlinguistic elements in the target tweets; (ii) a feature extraction phase, in which the relevant characteristics of the tweets are identified; (iii) a classification phase, based on a Support Vector Machine (SVM) classifier with a linear kernel.

2.1 Preprocessing

The aim of the preprocessing phase is the identification of the linguistic and nonlinguistic elements in the tweets and their annotation.
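As a rough illustration of this identification step, the following Python sketch separates links and emoticons from the rest of a tweet and annotates them. The regular expressions, the function name `split_tweet` and the example URL are assumptions made for this example; the paper does not publish its actual rules, and treating only these two kinds of element as nonlinguistic is a simplification.

```python
import re

# Illustrative patterns only (assumptions): the system's real rules are not given.
LINK_RE = re.compile(r"https?://\S+")
EMOTICON_RE = re.compile(r"[:;=]-?[()*DPp]")


def split_tweet(tweet):
    """Separate nonlinguistic elements (links, emoticons) from the text.

    Returns the remaining linguistic text and a list of
    (kind, surface_form) annotations in left-to-right order.
    """
    found = [(m.start(), "link", m.group()) for m in LINK_RE.finditer(tweet)]
    # Blank out links with same-length padding so that offsets are preserved
    # and characters inside a URL are never mistaken for an emoticon.
    masked = LINK_RE.sub(lambda m: " " * len(m.group()), tweet)
    found += [(m.start(), "emoticon", m.group())
              for m in EMOTICON_RE.finditer(masked)]
    found.sort()  # order by position in the tweet
    # Remove emoticons too, then collapse the whitespace left behind.
    text = " ".join(EMOTICON_RE.sub(" ", masked).split())
    return text, [(kind, form) for _, kind, form in found]


text, elements = split_tweet(
    ":-( quando ci vediamo? mi manchi anche tu! :*:* http://example.com/foto"
)
```

Here `text` keeps only the words that would be passed on to linguistic annotation, while `elements` records each link and emoticon so that later feature extraction can count and classify them.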
While the preprocessing of nonlinguistic elements such as links and emoticons is limited to their identification and classification (cf. Section 2.2.4), the treatment of the linguistic material required the development of a dedicated rule-based procedure, whose output is a normalized text that is subsequently fed to a pipeline of general-purpose linguistic annotation tools. The following rules have been applied in the linguistic preprocessing phase:

• Emphasis: tokens presenting repeated characters like bastaaaa "stooooop" are replaced by their most probable standardized forms (i.e. basta "stop");

• Links and emoticons: they are identified and removed;

• Punctuation: linguistically irrelevant punctuation marks are removed;

• Usernames: the users cited in a tweet are identified and normalized by removing the @ symbol and capitalizing the entity name;

• Hashtags: they are identified and normalized by simply removing the # symbol.

The output of this phase is a set of linguistically standardized tweets, which are subsequently POS-tagged with the Part-Of-Speech tagger described in (Dell'Orletta, 2009) and dependency-parsed with the DeSR parser (Attardi et al., 2009).

2.2 Feature extraction

The inventory of features can be organized into six classes. The five classes of features described in this section were designed in 2014; the sixth class, described in the next section, comprises the emotive and LDA features.

2.2.1 Lexical features

Lexical features represent the occurrence of bad words or of words that are either highly emotional or highly polarized. Relevant lemmas were identified from two in-house built lexicons (cf. below), and from Sentix (Basile and Nissim, 2013), a lexicon of sentiment-annotated Italian words. Lexical features include:

ItEM seeds: Lexicon of 347 highly emotional Italian words built by exploiting an online feature elicitation paradigm (Passaro et al., 2015). The features are, for each emotion, the total count of strongly emotional tokens in each tweet.

Bad words lexicon: By exploiting an in-house built lexicon of common Italian bad words, we reported, for each tweet, the frequency of bad words belonging to a selected list, as well as the total amount of these lemmas.

Sentix: Sentix (Sentiment Italian Lexicon; Basile and Nissim, 2013) is a lexicon for Sentiment Analysis in which 59,742 lemmas are annotated for their polarity and intensity, among other information. Polarity scores range from −1 (totally negative) to 1 (totally positive), while Intensity scores range from 0 (totally neutral) to 1 (totally polarized). Both these scores appear informative for the classification, so that we derived, for each lemma, a Combined score Cscore calculated as follows:

$$C_{score} = \mathit{Intensity} \cdot \mathit{Polarity} \qquad (1)$$

Depending on their Cscore, the selected lemmas have been organized into several groups:

• strongly positives: 1 ≥ Cscore > 0.25
• weakly positives: 0.25 ≥ Cscore > 0.125
• neutrals: 0.125 ≥ Cscore ≥ −0.125
• weakly negatives: −0.125 > Cscore ≥ −0.25
• highly negatives: −0.25 > Cscore ≥ −1

Since Sentix relies on WordNet sense distinctions, it is not uncommon for a lemma to be associated with more than one ⟨Intensity, Polarity⟩ pair, and consequently with more than one Cscore.

In order to handle this phenomenon, the lemmas have been split into three different ambiguity classes. Lemmas with only one entry, or whose entries are all associated with the same Cscore value, are marked as "Unambiguous" and associated with their Cscore.

Ambiguous cases were treated by inspecting, for each lemma, the distribution of the associated Cscores: lemmas which had a Majority Vote (MV) were marked as "Inferable" and associated with the Cscore of the MV. If there was no MV, lemmas were marked as "Ambiguous" and associated with the mean of the Cscores. To isolate a reliable set of polarized words, we focused only on the Unambiguous or Inferable lemmas and selected only the
250 most frequent according to the PAISÀ corpus (Lyding et al., 2014), a large collection of Italian web texts.

Other Sentix-based features in the CoLing Lab model are: the number of tokens for each Cscore group, the Cscore of the first token in the tweet, the Cscore of the last token in the tweet and the count of lemmas that are represented in Sentix.

2.2.2 Negation

Negation features have been developed to encode the presence of a negation and the morphosyntactic characteristics of its scope.

The inventory of negative lemmas (e.g. "non") and patterns (e.g. "non ... mai") has been extracted from (Renzi et al., 2001). The occurrences of these lemmas and structures have been counted and inserted as features to feed the classifier.

In order to characterize the scope of each negation, we used the dependency-parsed tweets produced by DeSR (Attardi et al., 2009). The scope of a negative element is assumed to be its syntactic head or the predicative complement of its head, in case the latter is a copula. Although it is clearly a simplifying assumption, the preliminary experiments show that this could be a rather cost-effective strategy in the analysis of linguistically simple texts like tweets.

This information has been included in the model by counting the number of negation patterns encountered in each tweet, where a negation pattern is composed of the PoS of the negated element plus the number of negative tokens depending on it and, in case it is covered by Sentix, its Polarity, its Intensity and its Cscore value.

2.2.3 Morphological features

The linguistic annotation produced in the preprocessing phase has also been exploited to populate the following morphological statistics: (i) number of sentences in the tweet; (ii) number of linguistic tokens; (iii) proportion of content words (nouns, adjectives, verbs and adverbs); (iv) number of tokens for each Part-of-Speech.

2.2.4 Shallow features

This group of features has been developed to describe distinctive characteristics of web communication. The group includes:

Emoticons: We used the lexicon LexEmo to mark the most common emoticons, such as :-( and :-), marked with their polarity score: 1 (positive), −1 (negative), 0 (neutral). LexEmo is used both to identify emoticons and to annotate their polarity. Emoticon-related features are the total amount of emoticons in the tweet, the polarity of each emoticon in sequential order and the polarity of each emoticon in reversed order. For instance, in the tweet :-( quando ci vediamo? mi manchi anche tu! :*:* ":-( when are we going to meet up? I miss you, too :*:*" there are three emoticons, the first of which (:-() is negative while the others are positive (:*; :*). Accordingly, the classifier has been fed with the information that the polarity of the first emoticon is −1, that of the second emoticon is 1 and the same goes for the third emoticon. In the same way, another group of features specifies that the polarity of the last emoticon is 1, as is that of the last but one emoticon, while the last but two has a polarity score of −1.

Links: These features contain a shallow classification of links performed using simple regular expressions applied to URLs, to classify them as follows: video, images, social and other. We also use as a feature the absolute number of links for each tweet.

Emphasis: The features report on the number of emphasized tokens presenting repeated characters like bastaaaa, the average number of repeated characters in the tweet, and the cumulative number of repeated characters in the tweet.

Creative Punctuation: Sequences of contiguous punctuation characters, like !!!, !?!?!?!!?!????! or ......., are identified and classified as a sequence of dots, exclamation marks, question marks or mixed. For each tweet, the features correspond to the number of sequences belonging to each group and their average length in characters.

Quotes: The number of quotations in the tweet.

2.2.5 Twitter features

This group of features describes some Twitter-specific characteristics of the target tweets.
Topic: This information marks whether a tweet has been retrieved via a specific political hashtag or keywords. It is provided by the organizers as an attribute of the tweet;

Usernames: The number of @username mentions in the tweet;

Hashtags: Hashtags play the role of organizing the tweets around a single topic, so that they are useful in determining their polarity (i.e. a tweet containing hashtags like #amore "#love" and #felice "#happy" is expected to be positive and a tweet containing hashtags like #ansia "#anxiety" and #stressato "#stressedout" is expected to be negative). This group of features registers the presence of a hashtag belonging to the list of the hashtags with a frequency higher than 1 in the training corpus.

3 Introducing emotive and LDA features

In order to add emotive features to the CoLing Lab model, we created an emotive lexicon from the corpus FB-NEWS15 (Passaro et al., 2016) following the strategy illustrated in (Passaro et al., 2015; Passaro and Lenci, 2016). The starting point is a set of seeds strongly associated with one or more emotions of a given taxonomy, which are used to build centroid distributional vectors representing the various emotions.

In order to build the distributional profiles of the words, we extracted the list T of the 30,000 most frequent nouns, verbs and adjectives from FB-NEWS15. The lemmas in T were subsequently used as targets and contexts in a square matrix of co-occurrences extracted within a five-word window (±2 words, centered on the target lemma). In addition, we extended the matrix to the nouns, adjectives and verbs in the corpus of tweets (i.e. lemmas not belonging to T).

For each ⟨emotion, PoS⟩ pair we built a centroid vector from the vectors of the seeds belonging to that emotion and PoS, obtaining in total 24 centroids.¹ Starting from these spaces, several groups of features have been extracted. The simplest ones include general statistics such as the number of emotive words and the emotive score of a tweet. More sophisticated features are aimed at inferring the degree of distinctivity of a word as well as its polarity from its own emotive profile.

¹ Following the configuration in (Passaro et al., 2015; Passaro and Lenci, 2016), the co-occurrence matrix has been re-weighted using the Pointwise Mutual Information (Church and Hanks, 1990), and in particular the Positive PMI (PPMI), in which negative scores are changed to zero (Niwa and Nitta, 1994). We constructed different word spaces according to PoS because the context that best captures the meaning of a word differs depending on the word to be represented (Rothenhäusler and Schütze, 2007).

Number of emotive words: The number of words belonging to the emotive Facebook spaces;

Emotive/words ratio: The ratio between the number of emotive words and the total number of words in the tweet;

Strongly emotive words: Number of words having a high (greater than 0.4) emotive score for at least one emotion;

Tweet emotive score: Score calculated as the ratio between the number of strongly polarized words and the number of the content words in the tweet (Eq. 2). The feature assumes values in the interval [0, 1]. In the absence of strongly emotive words, the default value is 0.

$$E(\mathit{Tweet}) = \frac{\mathit{Count}(\text{Strongly emotive words})}{\mathit{Count}(\text{Content words})} \qquad (2)$$

Maximum values: The maximum emotive value for each emotion (8 features);

Quartiles: The features take into account the distribution of the emotive words in the tweet. For each emotion, the list of the emotive words has been ordered according to the emotive scores and divided into quartiles (e.g. the fourth quartile contains the most emotive words and the first quartile the least emotive ones). Each feature registers the count of the words belonging to the pair ⟨emotion, quartile⟩ (32 features in total);

ItEM seeds: Boolean features registering the presence of words belonging to the words used as seeds to build the vector space models. In particular, the features include the top 4 frequent words for each emotion (32 boolean features in total);

Distinctive words: 32 features corresponding to the top 4 distinctive words for each emotion. The degree of distinctivity of a word for a given emotion is calculated starting from the VSM normalized using Z-scores. In particular, the feature corresponds to the proportion
of the emotion emotion_i against the sum of the total emotion scores [e_1, ..., e_8];

Polarity (count): The number of positive and negative words. The polarity of a word is calculated by applying Eq. 3, in which positive emotions are assumed to be JOY and TRUST, and negative emotions are assumed to be DISGUST, FEAR, ANGER and SADNESS.

$$\mathit{Polarity}(w) = \frac{\mathit{JOY} + \mathit{TRUST}}{2} - \frac{\mathit{DISGUST} + \mathit{FEAR} + \mathit{ANGER} + \mathit{SADNESS}}{4} \qquad (3)$$

Polarity (values): The polarity (calculated using Eq. 3) of the emotive words in the tweet. The maximum number of emotive words is assumed to be 20;

LDA features: This group of features includes 50 features referring to the topic distribution of the tweet. The LDA model has been built on the FB-NEWS15 corpus (Passaro et al., 2016), which is organized into 50 clusters of thematically related news created with LDA (Blei et al., 2003), using the Mallet implementation (McCallum, 2002). Each feature refers to the association between the text of the tweet and a topic extracted from FB-NEWS15.

4 Classification

We used the same paradigm used in (Passaro et al., 2014). In particular, we chose to base the CoLing Lab system for polarity classification on the SVM classifier with a linear kernel implementation available in Weka (Witten and Frank, 2011), trained with the Sequential Minimal Optimization (SMO) algorithm introduced by Platt (Platt, 1999).

The classification task proposed by the organizers could be approached either by building two separate binary classifiers relying on two different models (one judging the positiveness of the tweet, the other judging its negativeness), or by developing a single multiclass classifier where the possible outcomes are Positive Polarity (Task POS:1, Task NEG:0), Negative Polarity (Task POS:0, Task NEG:1), Mixed Polarity (Task POS:1, Task NEG:1) and No Polarity (Task POS:0, Task NEG:0). In Evalita 2014 (Passaro et al., 2014) we tried both approaches in our development phase, and found no significant difference, so that we opted for the more economical setting, i.e. the multiclass one.

5 Results

Although this model is not optimal according to the global ranking, if we focus on the recognition of the negative tweets (i.e. the NEG task), it ranks fifth (F1-score), and first if we consider class 1 of the NEG task (i.e. NEG, F.sc. 1). This trend is reversed if we consider the POS task, which is the worst performing class of this system.

Task     Class   Precision   Recall     F-score
POS      0       0.8548      0.7682     0.8092
POS      1       0.264       0.3892     0.3146
POS      task    0.5594      0.5787     0.5619
NEG      0       0.7688      0.6488     0.7037
NEG      1       0.5509      0.6883     0.612
NEG      task    0.65985     0.66855    0.6579
GLOBAL           0.609625    0.623625   0.6099

Table 1: System results.

Due to the great difference in performance between the results obtained by performing a 10-fold cross validation on the training data and the official results, we suspected that the system was overfitting the training data, so that we performed different feature ablation experiments, in which we included only the lexical information derived from ItEM and FB-NEWS15 (i.e. we removed the features relying on Sentix, Negation and Hashtags; cf. Table 2). The results demonstrate on the one hand that significant improvements can be obtained by using lexical information, especially to recognize negative texts. On the other hand, the results highlight the overfitting of the submitted model, probably due to the overlap between Sentix and the emotive features.

Task     Class   Precision   Recall     F-score
POS      0       0.8518      0.8999     0.8752
POS      1       0.3629      0.267      0.3077
POS      task    0.60735     0.58345    0.59145
NEG      0       0.8082      0.6065     0.693
NEG      1       0.5506      0.7701     0.6421
NEG      task    0.6794      0.6883     0.66755
GLOBAL           0.643375    0.635875   0.6295

Table 2: System results for a filtered model.

The advantages of using only the lexical features derived from ItEM are the following: (i) the emotional values of the words can be easily updated; (ii) the VSM can be extended to increase the lexical coverage of the resource; (iii) the system is "lean" (it can do more with less).
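The aggregate rows of Table 1 can be related to the per-class rows with a short Python sketch. It assumes, based only on the numbers in the table and not on any statement in the text, that each task score is the unweighted mean of its two per-class F-scores and that the global score is the unweighted mean of the two task scores. The names `f_score`, `task_scores` and `TABLE_1` are introduced here for illustration.

```python
def f_score(precision, recall):
    """F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Per-class (precision, recall) pairs copied from Table 1.
TABLE_1 = {
    "POS": {0: (0.8548, 0.7682), 1: (0.2640, 0.3892)},
    "NEG": {0: (0.7688, 0.6488), 1: (0.5509, 0.6883)},
}


def task_scores(table):
    """Average the per-class F-scores within each task, then across tasks.

    The averaging scheme is an assumption inferred from the table.
    """
    per_task = {
        task: mean(f_score(p, r) for p, r in classes.values())
        for task, classes in table.items()
    }
    return per_task, mean(per_task.values())


per_task, global_f = task_scores(TABLE_1)
```

Under this assumption the recomputed POS, NEG and global F-scores agree with the values reported in Table 1 to about three decimal places, the small differences being due to rounding of the published precision and recall figures.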
6 Conclusions

The CoLing Lab system presented in 2014 (Passaro et al., 2014) has been enriched with emotive features derived from a distributional, corpus-based resource built from the social media corpus FB-NEWS15 (Passaro et al., 2016). In addition, the system exploits LDA features extracted from the same corpus. Additional experiments demonstrated that, by removing most of the non-distributional lexical features derived from Sentix, the performance can be improved. As a consequence, with a relatively low number of features the system reaches satisfactory performance, with top scores in recognizing negative tweets.

References

Giuseppe Attardi, Felice Dell'Orletta, Maria Simi, and Joseph Turian. 2009. Accurate dependency parsing with a stacked multilayer perceptron. In Proceedings of EVALITA 2009 Evaluation of NLP and Speech Tools for Italian 2009, Reggio Emilia (Italy). Springer.

Francesco Barbieri, Valerio Basile, Danilo Croce, Malvina Nissim, Nicole Novielli, and Viviana Patti. 2016. Overview of the EVALITA 2016 SENTiment POLarity Classification Task. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016), Napoli (Italy). Academia University Press.

Valerio Basile and Malvina Nissim. 2013. Sentiment analysis on Italian tweets. In Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 100–107, Atlanta.

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3:993–1022.

Kenneth W. Church and Patrick Hanks. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics, 16:22–29.

Felice Dell'Orletta. 2009. Ensemble system for part-of-speech tagging. In Proceedings of EVALITA 2009 Evaluation of NLP and Speech Tools for Italian 2009, Reggio Emilia (Italy). Springer.

Alec Go, Richa Bhayani, and Lei Huang. 2009. Twitter sentiment classification using distant supervision. Processing, pages 1–6.

Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, and Vito Pirrelli. 2014. The PAISÀ Corpus of Italian Web Texts. In Proceedings of the 9th Web as Corpus Workshop (WaC-9), pages 36–43, Gothenburg (Sweden). Association for Computational Linguistics.

Andrew K. McCallum. 2002. Mallet: A machine learning for language toolkit. http://mallet.cs.umass.edu.

Yoshiki Niwa and Yoshihiko Nitta. 1994. Co-occurrence vectors from corpora vs. distance vectors from dictionaries. In Proceedings of the 15th International Conference on Computational Linguistics, pages 304–309, Kyoto (Japan).

Lucia C. Passaro and Alessandro Lenci. 2016. Evaluating context selection strategies to build emotive vector space models. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož (Slovenia). European Language Resources Association (ELRA).

Lucia C. Passaro, Gianluca E. Lebani, Emmanuele Chersoni, and Alessandro Lenci. 2014. The CoLing Lab system for sentiment polarity classification of tweets. In Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 & of the Fourth International Workshop EVALITA 2014, pages 87–92, Pisa (Italy).

Lucia C. Passaro, Laura Pollacci, and Alessandro Lenci. 2015. ItEM: A vector space model to bootstrap an Italian emotive lexicon. In Proceedings of the Second Italian Conference on Computational Linguistics CLiC-it 2015, pages 215–220, Trento (Italy).

Lucia C. Passaro, Alessandro Bondielli, and Alessandro Lenci. 2016. FB-NEWS15: A topic-annotated Facebook corpus for emotion detection and sentiment analysis. In Proceedings of the Third Italian Conference on Computational Linguistics CLiC-it 2016, Napoli (Italy). To appear.

John C. Platt. 1999. Advances in Kernel Methods, chapter Fast Training of Support Vector Machines Using Sequential Minimal Optimization, pages 185–208. MIT Press, Cambridge, MA, USA.

Lorenzo Renzi, Giampaolo Salvi, and Anna Cardinaletti. 2001. Grande grammatica italiana di consultazione. Number v. 1. Il Mulino.

Klaus Rothenhäusler and Hinrich Schütze. 2007. Part of speech filtered word spaces. In Sixth International and Interdisciplinary Conference on Modeling and Using Context.

Ian H. Witten and Eibe Frank. 2011. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition.