                                 Samskara
    Minimal structural features for detecting subjectivity and polarity in
                               Italian tweets

                                 Irene Russo, Monica Monachini
             Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC CNR)
                                            Lari Lab
                           firstname.secondname@ilc.cnr.it




                     Abstract

    English. Sentiment analysis classification tasks strongly depend on the
    properties of the medium that is used to communicate opinionated content.
    Twitter has some limitations that force the user to exploit structural
    properties of this social network with features that have pragmatic and
    communicative functions. Samskara is a system that uses minimal structural
    features to classify Italian tweets as instantiations of a textual genre,
    obtaining good results for subjectivity classification, while polarity
    classification needs substantial improvements.

    Italiano. I compiti di classificazione a livello di sentiment analysis
    dipendono fortemente dalle proprietà del mezzo usato per comunicare
    contenuti d'opinione. Vi sono limiti oggettivi in Twitter che forzano
    l'utente a sfruttare le proprietà strutturali del mezzo assegnando ad
    alcuni elementi funzioni pragmatiche e comunicative. Samskara è un sistema
    che si propone di classificare i tweet italiani come se appartenessero a un
    genere testuale, interpretandoli come elementi caratterizzati da strutture
    minimali e ottenendo buoni risultati nella classificazione della
    soggettività, mentre la classificazione della polarità ha bisogno di
    sostanziali miglioramenti.

1 Introduction

After 15 years of NLP work on the topic, Sentiment Analysis is still a relevant
task, mainly because every day we witness an exponential growth of opinionated
content on the web that requires computational systems to be managed. Once
detected, extracted and classified, opinionated content can also be labeled as
positive or negative, but additional categories (ambiguous, neutral, etc.) are
possible. Resources and methodologies created for the detection and
classification of subjectivity and polarity in reviews do not yield good
results on different data, such as tweets or comments about news from online
fora.

There are several reasons for this: first and foremost, opinions can be
expressed more or less explicitly depending on the context; lexical cues from
lexical resources such as SentiWordNet (Baccianella et al., 2010) or the
General Inquirer (Stone, 1966) can be useless when people express their points
of view in complex and subtle ways. Secondly, different media and platforms
impose different constraints on the structure of the content expressed.

Twitter's character limit forces the use of abbreviations and the omission of
syntactic elements, but users try to exploit these limitations creatively, for
example by adding pragmatic functions with emoticons.

Features and functionalities anchoring the text to extra-linguistic dimensions
(such as mentions and pictures in tweets, or like/agree reactions from other
users in online debates) should be considered in Sentiment Analysis
classification tasks because of their communicative functions.

In this paper we present Samskara, a Lari lab system for the classification of
Italian tweets that took part in two tasks at Sentipolc2016 (Task 1,
subjectivity classification, and Task 2, polarity classification). The system
is described in Section 2, with results presented in Section 2.2, where we also
discuss its limitations.
2 System description

Samskara is a classification system based on a minimal set of features that
addresses subjectivity and polarity classification of Italian tweets. Tweets
are considered as instantiations of a textual genre, namely they have specific
structural properties with communicative and pragmatic functions. In our
approach, focusing on the structural properties means:

  • abstracting the task from the lexical values of single words, which can be
    a deceptive cue because of lexical sparseness, ambiguity of words, use of
    jargon and ironic exploitations of words;

  • taking into account features used in authorship attribution to represent
    abstract patterns characterizing different styles, e.g. PoS tag n-gram
    frequencies (Stamatatos, 2009) [1];

  • choosing a PoS tagset that includes tags peculiar to tweets as a textual
    genre, i.e. interjection and emoticon.

[1] For the moment we think that sequences of syntactic relations are not
useful because of the poor performance of Italian syntactic parsers on tweets.

More generally, we want to capture high-level linguistic and extra-linguistic
properties of tweets, also considering basic sequential structures in the form
of bigram sequences.

2.1 Data analysis, data preprocessing and feature selection

Before starting with the selection of features, data analysis of the training
set helped in the investigation of several hypotheses.

Polarised lexical items have been widely used in sentiment analysis
classification (Liu and Zhang, 2012), but resources in this field either list
values at sense level (such as SentiWordNet) or conflate the senses in a single
entry (such as General Inquirer and LIWC). Without an efficient word sense
disambiguation module, using SentiWordNet is difficult. One strategy is to sum
all the values and to select a threshold for words that are tagged as polarised
in text. That, however, overestimates positive/negative content, without
finding a clear boundary between, for example, positive and negative tweets.
Considering the Italian version of LIWC2015 (Pennebaker et al., 2015), we see
that frequencies are unable to distinguish between positive and negative tweets
in the Sentipolc2016 training data (see Table 1).

        class   tokens     LIWC+           LIWC-
        pos     92295      234 (0.26%)     225 (0.25%)
        neg     114435     78 (0.07%)      683 (0.6%)

Table 1: Absolute and relative frequencies of Italian LIWC2015 lemmas in
positive and negative tweets (Sentipolc2016 training set).

To avoid this, we defined for internal use a subset of SentiWordNet 3.0
(Baccianella et al., 2010) that we call SWN Core, selected as follows (a sketch
of this filtering step is given below):

  • we keep all the words corresponding to senses that are polarised;

  • from the set above, we keep all the words corresponding to senses that
    display single-valued polarity (i.e. they are always positive or always
    negative);

  • from the set above, we delete all the words that also have a neutral sense;

  • we sum polarity values for every lemma, in order to have a single value,
    for example, for lemmas listed in SWN with two different positive values or
    three different negative values.

The English SWN Core is composed of 6640 exclusively positive lemmas and 7603
exclusively negative lemmas. Since in these lists items have a polarity value
ranging from 0.125 to 3.25, with the idea of selecting lemmas that are strongly
polarised we set 0.5 as a threshold; as a consequence of this decision we have
1844 very positive and 3272 very negative lemmas. After deletion of multiword
expressions, these strongly opinionated words were translated into Italian
using Google Translate, manually checked and annotated with PoS and polarity.
We then cleaned the lists, deleting lemmas that appear twice, lemmas that have
been translated as multiword expressions and lemmas that do not have polarity
in Italian. At the end we have 890 positive and 1224 negative Italian lemmas.
Considering their frequencies in the training set (see Table 2), we find that
only negative items are distinctive.
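The construction of SWN Core can be summarised in a short sketch. The snippet
below is illustrative rather than the code actually used by the authors: it
assumes the standard tab-separated SentiWordNet 3.0 dump (columns POS, ID,
PosScore, NegScore, SynsetTerms, Gloss) and omits the later steps (deletion of
multiword expressions, translation into Italian with Google Translate, manual
checking and annotation).

    import csv
    from collections import defaultdict

    def build_swn_core(swn_path, threshold=0.5):
        """Illustrative reconstruction of the SWN Core filtering."""
        pos_scores = defaultdict(list)   # lemma -> positive sense scores
        neg_scores = defaultdict(list)   # lemma -> negative sense scores
        neutral = set()                  # lemmas with at least one neutral sense

        with open(swn_path, encoding="utf-8") as f:
            for row in csv.reader(f, delimiter="\t"):
                if not row or row[0].startswith("#"):
                    continue
                pos, neg = float(row[2]), float(row[3])
                for term in row[4].split():
                    lemma = term.rsplit("#", 1)[0]
                    if pos == 0.0 and neg == 0.0:
                        neutral.add(lemma)            # sense with no polarity
                    if pos > 0.0:
                        pos_scores[lemma].append(pos)
                    if neg > 0.0:
                        neg_scores[lemma].append(neg)

        very_pos, very_neg = set(), set()
        for lemma in set(pos_scores) | set(neg_scores):
            if lemma in neutral:
                continue                              # drop words with a neutral sense
            if lemma in pos_scores and lemma not in neg_scores:
                if sum(pos_scores[lemma]) >= threshold:
                    very_pos.add(lemma)               # single-valued, strongly positive
            elif lemma in neg_scores and lemma not in pos_scores:
                if sum(neg_scores[lemma]) >= threshold:
                    very_neg.add(lemma)               # single-valued, strongly negative
        return very_pos, very_neg
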
Because of the presence of ironic tweets, positive lemmas tend to occur in
tweets that have been tagged as negative. The exploitation of positive words in
ironic communication is a well-known phenomenon (Dews and Winner, 1995): the
positive literal meaning is subverted by the negative intended meaning, and
neglecting this aspect of the Sentipolc2016 training set could lower
classification performance. If we allow positive items from SWN Core in the
system, the classification of negative tweets is made more difficult.

                SWN Core+        SWN Core-
        obj     536 (0.76%)      264 (0.37%)
        subj    2307 (1.4%)      1608 (1%)
        pos     1055 (4.8%)      200 (0.9%)
        neg     839 (2%)         1096 (2.6%)

Table 2: Absolute and relative frequencies of SWN Core lemmas in the
Sentipolc2016 training set.

As we mention above, structural properties of tweets can be treated as
sequences of PoS tags. To reduce data sparseness and to include dedicated tags
for Twitter, we chose the tagset proposed by PoSTWITA, an EVALITA 2016 task
(Bosco et al., 2016). It looks promising because it contains categories that:

  • can easily be tagged in a preprocessing step with regular expressions (for
    example MENTION and LINK);

  • are suitable for noisy data, tagging uniformly items that can be written in
    several, non-predictable ways (ahahahha, haha as INTJ);

  • have communicative and pragmatic functions, such as emoticon and
    interjection (see Table 4).

We preprocessed all the tweets in the training set, substituting elements that
are easy to find, such as mentions, hashtags, emails, links and emoticons (all
tags included in PoSTWITA).

After that, the Sentipolc2016 training set was tagged with TreeTagger (Schmid,
1997); TreeTagger tags were converted to the PoSTWITA tagset (see Table 3), and
additional PoSTWITA tags were added by building dedicated lists for them that
include items from the PoSTWITA training set plus additional items selected by
the authors (see Table 4).

        TreeTagger          PoSTWITA
        AUX[A-Za-z]+        AUX
        DET[A-Za-z]+        DET
        PRO[A-Za-z]+        PRON
        NPR[A-Za-z]+        PROPN
        PUN                 PUNCT
        SENT                PUNCT
        VER[A-Za-z]+cli     VERB CLIT
        VER[A-Za-z]+        VERB

Table 3: Comparison between the TreeTagger and PoSTWITA tagsets.

Thanks to TreeTagger all the words are lemmatized, so all the lemmas included
in the negative counterpart of SWN Core can be substituted with the tag
VERYNEG. At this point, with the intention of obtaining a minimal sequence of
significant tags, we created 4 versions of the training set according to 4
minimal structures, deleting all lemmas and leaving only PoS tags:

  • minimal structure 1 (MSTRU1): EMO, MENTION, HASHTAG, URL, EMAIL;

  • minimal structure 2 (MSTRU2): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ;

  • minimal structure 3 (MSTRU3): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ, ADJ, ADV;

  • minimal structure 4 (MSTRU4): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ, VERYNEG.

We performed classification experiments with these features and obtained better
results with MSTRU4 (see Section 2.2).

In Samskara each tweet is represented as a sequence including its EMO,
MENTION, HASHTAG, URL, EMAIL, PROPN (proper noun) and INTJ tags and its VERYNEG
lemmas from SWN Core (the tweet in example 1 is represented as in example 2).
This minimal, very compact way to represent a tweet is very convenient because
it partially avoids the noise introduced by the PoS tagger (only VERYNEG and
PROPN are elements that need to be properly tagged with this tool).

  (1) @FGoria Mario Monti Premier! #Italiaresiste.

  (2) MENTION PROPN HASHTAG.
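To make the preprocessing and the reduction to a minimal structure concrete,
the sketch below shows how a raw tweet could be mapped to its MSTRU4 sequence,
as in examples (1) and (2). The regular expressions are simplified, and
pos_tag is a placeholder for the actual TreeTagger run plus the tagset
conversion of Table 3; none of this code is taken from the system itself.

    import re

    # Simplified patterns for the Twitter-specific tags assigned by regular
    # expressions during preprocessing (the real patterns may differ).
    PATTERNS = [
        ("MENTION", re.compile(r"@\w+")),
        ("HASHTAG", re.compile(r"#\w+")),
        ("URL",     re.compile(r"https?://\S+")),
        ("EMAIL",   re.compile(r"\S+@\S+\.\S+")),
        ("EMO",     re.compile(r"[:;=]-?[\)\(DPp]+")),
    ]

    MSTRU4 = {"EMO", "MENTION", "HASHTAG", "URL", "EMAIL",
              "PROPN", "INTJ", "VERYNEG"}

    def to_mstru4(tweet, pos_tag, swn_core_negative):
        """Reduce one tweet to its MSTRU4 tag sequence.

        `pos_tag` stands in for TreeTagger plus the conversion of Table 3 and
        must return a (lemma, PoSTWITA-tag) pair for a token; `swn_core_negative`
        is the set of negative SWN Core lemmas replaced by VERYNEG.
        """
        sequence = []
        for token in tweet.split():
            tag = next((t for t, p in PATTERNS if p.match(token)), None)
            if tag is None:
                lemma, tag = pos_tag(token)          # e.g. PROPN, INTJ, VERB, ...
                if lemma.lower() in swn_core_negative:
                    tag = "VERYNEG"
            if tag in MSTRU4:
                sequence.append(tag)
        return sequence

    # With a suitable `pos_tag`, the tweet of example (1) is reduced to a
    # sequence like the one in example (2): MENTION PROPN HASHTAG.
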
                        new tag                 type                     examples
                         PART                 particle                        ’s
                         EMO                 emoticon            :DD, :-)))), u u
                         INTJ               interjection             ah, boh, oddioo
                         SYM                  symbol                      %, &, <
                         CONJ         coordinating conjunction     ebbene, ma, oppure
                        SCONJ        subordinating conjunction   nonostante, mentre, come

              Table 4: Examples of lemmas tagged according to Twitter-specific PoSTWITA tags.


Additional features for the classification of subjective and of positive or
negative tweets are listed in Table 5, where BOOL marks a boolean feature and
NUM a numeric feature (numeric features correspond to absolute frequencies).
The features have been selected with their communicative function in mind: a1,
for example, is useful because opinionated content tends to be communicated in
discussions with other users, while we chose a2 because neutral tweets often
advertise newspaper articles in a non-opinionated way, placing the link at the
end of the tweet; a URL in other positions is captured by a6 and a6_1. Together
with emoticons, interjections are items that signal the presence of opinionated
content. Given the kind of asynchronous communication that characterizes them,
tweets can contain questions that do not expect an answer, i.e. rhetorical
questions (a8_1), which make the tweet opinionated.

        features   description                                                   type
        a1         the tweet starts with MENTION                                 BOOL
        a2         the tweet ends with a LINK                                    BOOL
        a3         the tweet has PoS of type PUNCT                               BOOL
        a3_1       number of PoS of type PUNCT in each tweet                     NUM
        a4         the tweet has PoS of type VERYNEG                             BOOL
        a4_1       number of PoS of type VERYNEG in each tweet                   NUM
        a5         the tweet has PoS of type INTJ                                BOOL
        a5_1       number of PoS of type INTJ in each tweet                      NUM
        a6         the tweet has PoS of type URL                                 BOOL
        a6_1       number of PoS of type URL in each tweet                       NUM
        a7         the tweet has PoS of type EMOTICON                            BOOL
        a7_1       number of PoS of type EMOTICON in each tweet                  NUM
        a8_1       the tweet contains a question                                 BOOL
        a8_2       the tweet contains a question at the end                      BOOL
        a9         the tweet contains two consecutive exclamation marks ('!!')   BOOL
        a10        the tweet contains connectives such as anzitutto,
                   comunque, dapprima, del resto                                 BOOL

Table 5: Additional features for subjectivity and polarity classification of
tweets.
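The features in Table 5 are cheap to compute from the tag sequence and the raw
text. The sketch below derives a subset of them (the ones later grouped as FS1
in Section 2.2); the exact tests, such as how a question is detected, are our
own illustrative choices rather than the system's.

    def additional_features(text, tags):
        """Compute a subset of the Table 5 features for one tweet.

        `tags` is the tweet's full PoSTWITA-style tag sequence (before the
        reduction to MSTRU4) and `text` its raw content; boolean features are
        encoded as 0/1, numeric features as absolute counts.
        """
        return {
            "a1":   int(tags[:1] == ["MENTION"]),   # starts with a mention
            "a2":   int(tags[-1:] == ["URL"]),      # ends with a link
            "a4":   int("VERYNEG" in tags),
            "a4_1": tags.count("VERYNEG"),
            "a6":   int("URL" in tags),
            "a6_1": tags.count("URL"),
            "a7":   int("EMO" in tags),             # EMOTICON tag in Table 5
            "a7_1": tags.count("EMO"),
            "a8_1": int("?" in text),               # contains a question
            "a9":   int("!!" in text),              # two consecutive '!'
        }
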

2.2 Results and Discussion

The system adopts the Weka [2] library, which allows experiments with different
classifiers. Due to the better performance of Naive Bayes (default settings,
10-fold cross-validation) with respect to Support Vector Machines, we chose the
former; the best performances were obtained with MSTRU4, using frequencies of
PoS unigrams and bigrams as features. We took part in Sentipolc2016 only with a
constrained run, choosing slightly different sets of features for the
subjectivity and polarity evaluations.

Adding the additional features of Table 5, we selected for Task 1 a subset of
them after an ablation test. More specifically, feature set 1 (FS1 in Table 6)
is composed of features a1, a2, a4, a4_1, a6, a6_1, a7, a7_1, a8_1 and a9.
System performance is reported in terms of F-score, according to the measure
adopted by the task organizers (Barbieri et al., 2016). Results on the training
data look promising for Task 1 and less promising for Task 2 (see Tables 6 and
7). We did not succeed in optimising features for the polarity detection
sub-task: the performance on the training set was not satisfying, but we
nevertheless decided to submit results for Task 2 on the test set using all the
features.

[2] http://www.cs.waikato.ac.nz/ml/weka/

                        MSTRU4 + FS1
        obj F-score        0.532
        subj F-score       0.811
        Avg F-score        0.724

Table 6: Classification results for Task 1 obtained on the Sentipolc2016
training set.

                        MSTRU4 + AllF
        pos F-score        0.424
        neg F-score        0.539
        both F-score       0.047
        neu F-score        0.526
        Avg F-score        0.48

Table 7: Classification results for Task 2 obtained on the Sentipolc2016
training set.

The official results submitted for the competition are reported in Table 8.
Samskara was first among the constrained systems for subjectivity
classification, while, not surprisingly, the performance in Task 2 was poor.
The Task 2 results can be explained by the absence in the system of structural
features that are meaningful for the positive-negative distinction, or by the
unsuitability of such a minimal approach for the task. It is possible that
richer semantic features are necessary for the detection and classification of
polarity, and that the treatment of polarised lexical items should be revised,
for example by representing each lemma as a sentiment-specific word embedding
(SSWE) encoding sentiment information (Tang et al., 2014).

                   F-score    Rank
        Task 1     0.7184     1
        Task 2     0.5683     13

Table 8: Classification results for Task 1 and Task 2 on the Sentipolc2016
test set.

With Samskara we show that the classification of tweets should take into
account structural properties of content on social media, especially properties
that have communicative and pragmatic functions. The minimal features we
selected for Samskara were successful for the classification of subjective
Italian tweets. The system is based on a minimal set of features that are easy
to retrieve and tag; the classification system is efficient and fast for Task 1
and, as such, promising for real-time processing of big data streams.
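As a concrete illustration of the constrained setup described above, the
authors ran Naive Bayes with 10-fold cross-validation in Weka over PoS unigram
and bigram frequencies; the sketch below approximates an equivalent pipeline
with scikit-learn, purely for illustration (the additional features of Table 5
would be appended to the count vectors and are omitted here).

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    def evaluate(sequences, labels, folds=10):
        """Cross-validate Naive Bayes over PoS unigram/bigram counts.

        `sequences` are space-joined MSTRU4 tag sequences, one per tweet
        (e.g. "MENTION PROPN HASHTAG"), and `labels` the gold classes; this
        scikit-learn pipeline only approximates the Weka configuration used
        in the paper.
        """
        model = make_pipeline(
            CountVectorizer(ngram_range=(1, 2),    # unigram and bigram tag counts
                            token_pattern=r"\S+",  # tags are whitespace-separated
                            lowercase=False),
            MultinomialNB(),
        )
        scores = cross_val_score(model, sequences, labels,
                                 cv=folds, scoring="f1_macro")
        return scores.mean()
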
References

Stefano Baccianella, Andrea Esuli and Fabrizio Sebastiani. 2010. SentiWordNet
  3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.
  In Proceedings of the Seventh International Conference on Language Resources
  and Evaluation (LREC'10).

Francesco Barbieri, Valerio Basile, Danilo Croce, Malvina Nissim, Nicole
  Novielli and Viviana Patti. 2016. Overview of the EVALITA 2016 SENTiment
  POLarity Classification Task. In Pierpaolo Basile, Anna Corazza, Franco
  Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni
  Semeraro and Rachele Sprugnoli, editors, Proceedings of the Third Italian
  Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation
  Campaign of Natural Language Processing and Speech Tools for Italian. Final
  Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale
  (AILC).

Cristina Bosco, Fabio Tamburini, Andrea Bolioli and Alessandro Mazzei. 2016.
  Overview of the EVALITA 2016 Part Of Speech on TWitter for ITAlian Task. In
  Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina
  Nissim, Viviana Patti, Giovanni Semeraro and Rachele Sprugnoli, editors,
  Proceedings of the Third Italian Conference on Computational Linguistics
  (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and
  Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione
  Italiana di Linguistica Computazionale (AILC).

Shelly Dews and Ellen Winner. 1995. Muting the meaning: A social function of
  irony. Metaphor and Symbolic Activity, 10(1):3-19.

Bing Liu and Lei Zhang. 2012. A Survey of Opinion Mining and Sentiment
  Analysis. In C. C. Aggarwal and C. Zhai, editors, Mining Text Data, pp.
  415-463. Springer, US.

James W. Pennebaker, Ryan L. Boyd, Kayla Jordan and Kate Blackburn. 2015. The
  Development and Psychometric Properties of LIWC2015.

Helmut Schmid. 1997. Probabilistic Part-of-Speech Tagging Using Decision
  Trees. In New Methods in Language Processing, UCL Press, pp. 154-164.

Efstathios Stamatatos. 2009. A Survey of Modern Authorship Attribution Methods.
  Journal of the American Society for Information Science and Technology.

Philip J. Stone. 1966. The General Inquirer: A Computer Approach to Content
  Analysis. The MIT Press.

Duyu Tang, Furu Wei, Nan Yang, Ming Zhou, Ting Liu and Bing Qin. 2014.
  Learning Sentiment-Specific Word Embedding for Twitter Sentiment
  Classification. In Proceedings of the 52nd Annual Meeting of the Association
  for Computational Linguistics.