                                 Samskara
    Minimal structural features for detecting subjectivity and polarity in
                               Italian tweets

                                 Irene Russo, Monica Monachini
             Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC CNR)
                                            Lari Lab
                           firstname.secondname@ilc.cnr.it




                     Abstract

    English. Sentiment analysis classification tasks strongly depend on the
    properties of the medium that is used to communicate opinionated content.
    Twitter has some limitations that force the user to exploit structural
    properties of this social network with features that have pragmatic and
    communicative functions. Samskara is a system that uses minimal structural
    features to classify Italian tweets as instantiations of a textual genre,
    obtaining good results for subjectivity classification, while polarity
    classification needs substantial improvements.

    Italiano. I compiti di classificazione a livello di sentiment analysis
    dipendono fortemente dalle proprietà del mezzo usato per comunicare
    contenuti d'opinione. Vi sono limiti oggettivi in Twitter che forzano
    l'utente a sfruttare le proprietà strutturali del mezzo assegnando ad
    alcuni elementi funzioni pragmatiche e comunicative. Samskara è un sistema
    che si propone di classificare i tweet italiani come se appartenessero a un
    genere testuale, interpretandoli come elementi caratterizzati da strutture
    minimali e ottenendo buoni risultati nella classificazione della
    soggettività, mentre la classificazione della polarità ha bisogno di
    sostanziali miglioramenti.

1 Introduction

After 15 years of NLP work on the topic, Sentiment Analysis is still a relevant
task, mainly because every day we witness an exponential growth of opinionated
content on the web that requires computational systems to be managed. Once
detected, extracted and classified, opinionated content can also be labeled as
positive or negative, but additional categories (ambiguous, neutral, etc.) are
possible. Resources and methodologies created for the detection and
classification of subjectivity and polarity in reviews do not yield good
results on different data, such as tweets or comments about news from online
fora.

There are several reasons for this: first and foremost, opinions can be
expressed more or less explicitly depending on the context; lexical cues from
lexical resources such as SentiWordNet (Baccianella et al., 2010) or the
General Inquirer (Stone, 1966) can be useless when people express their points
of view in complex and subtle ways. Secondly, different media and platforms
impose different constraints on the structure of the content expressed.

Twitter's character limit forces the use of abbreviations and the omission of
syntactic elements, but users try to exploit these limitations creatively, for
example by adding pragmatic functions with emoticons.

Features and functionalities anchoring the text to extra-linguistic dimensions
(such as mentions and pictures in tweets, or like/agree reactions from other
users in online debates) should be considered in Sentiment Analysis
classification tasks because of their communicative functions.

In this paper we present Samskara, a Lari lab system for the classification of
Italian tweets that took part in two tasks at Sentipolc2016 (Task 1,
subjectivity classification, and Task 2, polarity classification). The system
is described in Section 2, with results presented in Section 2.2, where we also
discuss its limitations.
2 System description

Samskara is a classification system based on a minimal set of features that
addresses subjectivity and polarity classification of Italian tweets. Tweets
are considered as instantiations of a textual genre, namely they have specific
structural properties with communicative and pragmatic functions. In our
approach, focusing on the structural properties means:

  • abstracting the task from the lexical values of single words, which can be
    a deceptive cue because of lexical sparseness, ambiguity of words, use of
    jargon and ironic exploitations of words;

  • taking into account features used in authorship attribution to represent
    abstract patterns characterizing different styles, e.g. PoS tag n-gram
    frequencies (Stamatatos, 2009) [1];

  • choosing a PoS tagset that includes tags peculiar to tweets as a textual
    genre, i.e. interjection and emoticon.

[1] For the moment we think that sequences of syntactic relations are not
useful because of the poor performance of Italian syntactic parsers on tweets.

More generally, we want to capture high-level linguistic and extra-linguistic
properties of tweets, also considering basic sequential structures in the form
of bigram sequences.

2.1 Data analysis, data preprocessing and feature selection

Before starting with the selection of features, data analysis of the training
set helped in the investigation of several hypotheses.

Polarised lexical items have been widely used in sentiment analysis
classification (Liu and Zhang, 2012), but resources in this field either list
values at sense level (such as SentiWordNet) or conflate the senses in a single
entry (such as General Inquirer and LIWC). Without an efficient word sense
disambiguation module, using SentiWordNet is difficult. One strategy is to sum
all the values and to select a threshold for words that are tagged as polarised
in text. That, however, overestimates positive/negative content, without
finding a clear boundary between, for example, positive and negative tweets.
Considering the Italian version of LIWC2015 (Pennebaker et al., 2015), we see
that frequencies are unable to distinguish between positive and negative tweets
in the Sentipolc2016 training data (see Table 1).

        class   tokens     LIWC+           LIWC-
        pos     92295      234 (0.26%)     225 (0.25%)
        neg     114435     78 (0.07%)      683 (0.6%)

Table 1: Absolute and relative frequencies of Italian LIWC2015 lemmas in
positive and negative tweets (Sentipolc2016 training set).

To avoid this, we defined for internal use a subset of SentiWordNet 3.0
(Baccianella et al., 2010) that we call SWN Core, selected as follows (a sketch
of this filtering step is given below):

  • we keep all the words corresponding to senses that are polarised;

  • from the set above, we keep all the words corresponding to senses that
    display single-valued polarity (i.e. they are always positive or always
    negative);

  • from the set above, we delete all the words that also have a neutral sense;

  • we sum polarity values for every lemma, in order to have a single value,
    for example, for lemmas listed in SWN with two different positive values or
    three different negative values.

The English SWN Core is composed of 6640 exclusively positive lemmas and 7603
exclusively negative lemmas. Since in these lists items have a polarity value
ranging from 0.125 to 3.25, with the idea of selecting lemmas that are strongly
polarised we set 0.5 as a threshold; as a consequence of this decision we have
1844 very positive and 3272 very negative lemmas. After deletion of multiword
expressions, these strongly opinionated words were translated into Italian
using Google Translate, manually checked and annotated with PoS and polarity.
We then cleaned the lists, deleting lemmas that appear twice, lemmas that have
been translated as multiword expressions and lemmas that do not have polarity
in Italian. At the end we have 890 positive and 1224 negative Italian lemmas.
Considering their frequencies in the training set (see Table 2), we find that
only negative items are distinctive.
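The construction of SWN Core can be summarised in a short sketch. The snippet
below is illustrative rather than the code actually used by the authors: it
assumes the standard tab-separated SentiWordNet 3.0 dump (columns POS, ID,
PosScore, NegScore, SynsetTerms, Gloss) and omits the later steps (deletion of
multiword expressions, translation into Italian with Google Translate, manual
checking and annotation).

    import csv
    from collections import defaultdict

    def build_swn_core(swn_path, threshold=0.5):
        """Illustrative reconstruction of the SWN Core filtering."""
        pos_scores = defaultdict(list)   # lemma -> positive sense scores
        neg_scores = defaultdict(list)   # lemma -> negative sense scores
        neutral = set()                  # lemmas with at least one neutral sense

        with open(swn_path, encoding="utf-8") as f:
            for row in csv.reader(f, delimiter="\t"):
                if not row or row[0].startswith("#"):
                    continue
                pos, neg = float(row[2]), float(row[3])
                for term in row[4].split():
                    lemma = term.rsplit("#", 1)[0]
                    if pos == 0.0 and neg == 0.0:
                        neutral.add(lemma)            # sense with no polarity
                    if pos > 0.0:
                        pos_scores[lemma].append(pos)
                    if neg > 0.0:
                        neg_scores[lemma].append(neg)

        very_pos, very_neg = set(), set()
        for lemma in set(pos_scores) | set(neg_scores):
            if lemma in neutral:
                continue                              # drop words with a neutral sense
            if lemma in pos_scores and lemma not in neg_scores:
                if sum(pos_scores[lemma]) >= threshold:
                    very_pos.add(lemma)               # single-valued, strongly positive
            elif lemma in neg_scores and lemma not in pos_scores:
                if sum(neg_scores[lemma]) >= threshold:
                    very_neg.add(lemma)               # single-valued, strongly negative
        return very_pos, very_neg
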
Because of the presence of ironic tweets, positive lemmas tend to occur in
tweets that have been tagged as negative. The exploitation of positive words in
ironic communication is a well-known phenomenon (Dews and Winner, 1995): the
positive literal meaning is subverted by the negative intended meaning, and
neglecting this aspect of the Sentipolc2016 training set could lower
classification performance. If we allow positive items from SWN Core in the
system, the classification of negative tweets is made more difficult.

                SWN Core+        SWN Core-
        obj     536 (0.76%)      264 (0.37%)
        subj    2307 (1.4%)      1608 (1%)
        pos     1055 (4.8%)      200 (0.9%)
        neg     839 (2%)         1096 (2.6%)

Table 2: Absolute and relative frequencies of SWN Core lemmas in the
Sentipolc2016 training set.

As we mention above, structural properties of tweets can be treated as
sequences of PoS tags. To reduce data sparseness and to include dedicated tags
for Twitter, we chose the tagset proposed by PoSTWITA, an EVALITA 2016 task
(Bosco et al., 2016). It looks promising because it contains categories that:

  • can easily be tagged in a preprocessing step with regular expressions (for
    example MENTION and LINK);

  • are suitable for noisy data, tagging uniformly items that can be written in
    several, non-predictable ways (ahahahha, haha as INTJ);

  • have communicative and pragmatic functions, such as emoticon and
    interjection (see Table 4).

We preprocessed all the tweets in the training set, substituting elements that
are easy to find, such as mentions, hashtags, emails, links and emoticons (all
tags included in PoSTWITA).

After that, the Sentipolc2016 training set was tagged with TreeTagger (Schmid,
1997); TreeTagger tags were converted to the PoSTWITA tagset (see Table 3), and
additional PoSTWITA tags were added by building dedicated lists for them that
include items from the PoSTWITA training set plus additional items selected by
the authors (see Table 4).

        TreeTagger          PoSTWITA
        AUX[A-Za-z]+        AUX
        DET[A-Za-z]+        DET
        PRO[A-Za-z]+        PRON
        NPR[A-Za-z]+        PROPN
        PUN                 PUNCT
        SENT                PUNCT
        VER[A-Za-z]+cli     VERB CLIT
        VER[A-Za-z]+        VERB

Table 3: Comparison between the TreeTagger and PoSTWITA tagsets.

Thanks to TreeTagger all the words are lemmatized, so all the lemmas included
in the negative counterpart of SWN Core can be substituted with the tag
VERYNEG. At this point, with the intention of obtaining a minimal sequence of
significant tags, we created 4 versions of the training set according to 4
minimal structures, deleting all lemmas and leaving only PoS tags:

  • minimal structure 1 (MSTRU1): EMO, MENTION, HASHTAG, URL, EMAIL;

  • minimal structure 2 (MSTRU2): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ;

  • minimal structure 3 (MSTRU3): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ, ADJ, ADV;

  • minimal structure 4 (MSTRU4): EMO, MENTION, HASHTAG, URL, EMAIL, PROPN,
    INTJ, VERYNEG.

We performed classification experiments with these features and obtained better
results with MSTRU4 (see Section 2.2).

In Samskara each tweet is represented as a sequence including its EMO,
MENTION, HASHTAG, URL, EMAIL, PROPN (proper noun) and INTJ tags and its VERYNEG
lemmas from SWN Core (the tweet in example 1 is represented as in example 2).
This minimal, very compact way to represent a tweet is very convenient because
it partially avoids the noise introduced by the PoS tagger (only VERYNEG and
PROPN are elements that need to be properly tagged with this tool).

  (1) @FGoria Mario Monti Premier! #Italiaresiste.

  (2) MENTION PROPN HASHTAG.
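To make the preprocessing and the reduction to a minimal structure concrete,
the sketch below shows how a raw tweet could be mapped to its MSTRU4 sequence,
as in examples (1) and (2). The regular expressions are simplified, and
pos_tag is a placeholder for the actual TreeTagger run plus the tagset
conversion of Table 3; none of this code is taken from the system itself.

    import re

    # Simplified patterns for the Twitter-specific tags assigned by regular
    # expressions during preprocessing (the real patterns may differ).
    PATTERNS = [
        ("MENTION", re.compile(r"@\w+")),
        ("HASHTAG", re.compile(r"#\w+")),
        ("URL",     re.compile(r"https?://\S+")),
        ("EMAIL",   re.compile(r"\S+@\S+\.\S+")),
        ("EMO",     re.compile(r"[:;=]-?[\)\(DPp]+")),
    ]

    MSTRU4 = {"EMO", "MENTION", "HASHTAG", "URL", "EMAIL",
              "PROPN", "INTJ", "VERYNEG"}

    def to_mstru4(tweet, pos_tag, swn_core_negative):
        """Reduce one tweet to its MSTRU4 tag sequence.

        `pos_tag` stands in for TreeTagger plus the conversion of Table 3 and
        must return a (lemma, PoSTWITA-tag) pair for a token; `swn_core_negative`
        is the set of negative SWN Core lemmas replaced by VERYNEG.
        """
        sequence = []
        for token in tweet.split():
            tag = next((t for t, p in PATTERNS if p.match(token)), None)
            if tag is None:
                lemma, tag = pos_tag(token)          # e.g. PROPN, INTJ, VERB, ...
                if lemma.lower() in swn_core_negative:
                    tag = "VERYNEG"
            if tag in MSTRU4:
                sequence.append(tag)
        return sequence

    # With a suitable `pos_tag`, the tweet of example (1) is reduced to a
    # sequence like the one in example (2): MENTION PROPN HASHTAG.
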
                        new tag                 type                     examples
                         PART                 particle                        ’s
                         EMO                 emoticon            :DD, :-)))), u u
                         INTJ               interjection             ah, boh, oddioo
                         SYM                  symbol                      %, &, <
                         CONJ         coordinating conjunction     ebbene, ma, oppure
                        SCONJ        subordinating conjunction   nonostante, mentre, come

              Table 4: Examples of lemmas tagged according to Twitter-specific PoSTWITA tags.


Additional features for the classification of subjective and of positive or
negative tweets are listed in Table 5, where BOOL marks a boolean feature and
NUM a numeric feature (numeric features correspond to absolute frequencies).
The features have been selected with their communicative function in mind: a1,
for example, is useful because opinionated content tends to be communicated in
discussions with other users, while we chose a2 because neutral tweets often
advertise newspaper articles in a non-opinionated way, placing the link at the
end of the tweet; a URL in other positions is captured by a6 and a6_1. Together
with emoticons, interjections are items that signal the presence of opinionated
content. Given the kind of asynchronous communication that characterizes them,
tweets can contain questions that do not expect an answer, i.e. rhetorical
questions (a8_1), which make the tweet opinionated.

        features   description                                                   type
        a1         the tweet starts with MENTION                                 BOOL
        a2         the tweet ends with a LINK                                    BOOL
        a3         the tweet has PoS of type PUNCT                               BOOL
        a3_1       number of PoS of type PUNCT in each tweet                     NUM
        a4         the tweet has PoS of type VERYNEG                             BOOL
        a4_1       number of PoS of type VERYNEG in each tweet                   NUM
        a5         the tweet has PoS of type INTJ                                BOOL
        a5_1       number of PoS of type INTJ in each tweet                      NUM
        a6         the tweet has PoS of type URL                                 BOOL
        a6_1       number of PoS of type URL in each tweet                       NUM
        a7         the tweet has PoS of type EMOTICON                            BOOL
        a7_1       number of PoS of type EMOTICON in each tweet                  NUM
        a8_1       the tweet contains a question                                 BOOL
        a8_2       the tweet contains a question at the end                      BOOL
        a9         the tweet contains two consecutive exclamation marks ('!!')   BOOL
        a10        the tweet contains connectives such as anzitutto,
                   comunque, dapprima, del resto                                 BOOL

Table 5: Additional features for subjectivity and polarity classification of
tweets.
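The features in Table 5 are cheap to compute from the tag sequence and the raw
text. The sketch below derives a subset of them (the ones later grouped as FS1
in Section 2.2); the exact tests, such as how a question is detected, are our
own illustrative choices rather than the system's.

    def additional_features(text, tags):
        """Compute a subset of the Table 5 features for one tweet.

        `tags` is the tweet's full PoSTWITA-style tag sequence (before the
        reduction to MSTRU4) and `text` its raw content; boolean features are
        encoded as 0/1, numeric features as absolute counts.
        """
        return {
            "a1":   int(tags[:1] == ["MENTION"]),   # starts with a mention
            "a2":   int(tags[-1:] == ["URL"]),      # ends with a link
            "a4":   int("VERYNEG" in tags),
            "a4_1": tags.count("VERYNEG"),
            "a6":   int("URL" in tags),
            "a6_1": tags.count("URL"),
            "a7":   int("EMO" in tags),             # EMOTICON tag in Table 5
            "a7_1": tags.count("EMO"),
            "a8_1": int("?" in text),               # contains a question
            "a9":   int("!!" in text),              # two consecutive '!'
        }
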

2.2 Results and Discussion

The system adopts the Weka [2] library, which allows experiments with different
classifiers. Due to the better performance of Naive Bayes (default settings,
10-fold cross-validation) with respect to Support Vector Machines, we chose the
former; the best performances were obtained with MSTRU4, using frequencies of
PoS unigrams and bigrams as features. We took part in Sentipolc2016 only with a
constrained run, choosing slightly different sets of features for the
subjectivity and polarity evaluations.

Adding the additional features of Table 5, we selected for Task 1 a subset of
them after an ablation test. More specifically, feature set 1 (FS1 in Table 6)
is composed of features a1, a2, a4, a4_1, a6, a6_1, a7, a7_1, a8_1 and a9.
System performance is reported in terms of F-score, according to the measure
adopted by the task organizers (Barbieri et al., 2016). Results on the training
data look promising for Task 1 and less promising for Task 2 (see Tables 6 and
7). We did not succeed in optimising features for the polarity detection
sub-task: the performance on the training set was not satisfying, but we
nevertheless decided to submit results for Task 2 on the test set using all the
features.

[2] http://www.cs.waikato.ac.nz/ml/weka/

                        MSTRU4 + FS1
        obj F-score        0.532
        subj F-score       0.811
        Avg F-score        0.724

Table 6: Classification results for Task 1 obtained on the Sentipolc2016
training set.

                        MSTRU4 + AllF
        pos F-score        0.424
        neg F-score        0.539
        both F-score       0.047
        neu F-score        0.526
        Avg F-score        0.48

Table 7: Classification results for Task 2 obtained on the Sentipolc2016
training set.

The official results submitted for the competition are reported in Table 8.
Samskara was first among the constrained systems for subjectivity
classification, while, not surprisingly, the performance in Task 2 was poor.
The Task 2 results can be explained by the absence in the system of structural
features that are meaningful for the positive-negative distinction, or by the
unsuitability of such a minimal approach for the task. It is possible that
richer semantic features are necessary for the detection and classification of
polarity, and that the treatment of polarised lexical items should be revised,
for example by representing each lemma as a sentiment-specific word embedding
(SSWE) encoding sentiment information (Tang et al., 2014).

                   F-score    Rank
        Task 1     0.7184     1
        Task 2     0.5683     13

Table 8: Classification results for Task 1 and Task 2 on the Sentipolc2016
test set.

With Samskara we show that the classification of tweets should take into
account structural properties of content on social media, especially properties
that have communicative and pragmatic functions. The minimal features we
selected for Samskara were successful for the classification of subjective
Italian tweets. The system is based on a minimal set of features that are easy
to retrieve and tag; the classification system is efficient and fast for Task 1
and, as such, promising for real-time processing of big data streams.
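As a concrete illustration of the constrained setup described above, the
authors ran Naive Bayes with 10-fold cross-validation in Weka over PoS unigram
and bigram frequencies; the sketch below approximates an equivalent pipeline
with scikit-learn, purely for illustration (the additional features of Table 5
would be appended to the count vectors and are omitted here).

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    def evaluate(sequences, labels, folds=10):
        """Cross-validate Naive Bayes over PoS unigram/bigram counts.

        `sequences` are space-joined MSTRU4 tag sequences, one per tweet
        (e.g. "MENTION PROPN HASHTAG"), and `labels` the gold classes; this
        scikit-learn pipeline only approximates the Weka configuration used
        in the paper.
        """
        model = make_pipeline(
            CountVectorizer(ngram_range=(1, 2),    # unigram and bigram tag counts
                            token_pattern=r"\S+",  # tags are whitespace-separated
                            lowercase=False),
            MultinomialNB(),
        )
        scores = cross_val_score(model, sequences, labels,
                                 cv=folds, scoring="f1_macro")
        return scores.mean()
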
References

Stefano Baccianella, Andrea Esuli and Fabrizio Sebastiani. 2010. SentiWordNet
  3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.
  In Proceedings of the Seventh International Conference on Language Resources
  and Evaluation (LREC'10).

Francesco Barbieri, Valerio Basile, Danilo Croce, Malvina Nissim, Nicole
  Novielli and Viviana Patti. 2016. Overview of the EVALITA 2016 SENTiment
  POLarity Classification Task. In Pierpaolo Basile, Anna Corazza, Franco
  Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni
  Semeraro and Rachele Sprugnoli, editors, Proceedings of the Third Italian
  Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation
  Campaign of Natural Language Processing and Speech Tools for Italian. Final
  Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale
  (AILC).

Cristina Bosco, Fabio Tamburini, Andrea Bolioli and Alessandro Mazzei. 2016.
  Overview of the EVALITA 2016 Part Of Speech on TWitter for ITAlian Task. In
  Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina
  Nissim, Viviana Patti, Giovanni Semeraro and Rachele Sprugnoli, editors,
  Proceedings of the Third Italian Conference on Computational Linguistics
  (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and
  Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione
  Italiana di Linguistica Computazionale (AILC).

Shelly Dews and Ellen Winner. 1995. Muting the meaning: A social function of
  irony. Metaphor and Symbolic Activity, 10(1):3-19.

Bing Liu and Lei Zhang. 2012. A Survey of Opinion Mining and Sentiment
  Analysis. In C. C. Aggarwal and C. Zhai, editors, Mining Text Data, pp.
  415-463. Springer, US.

James W. Pennebaker, Ryan L. Boyd, Kayla Jordan and Kate Blackburn. 2015. The
  Development and Psychometric Properties of LIWC2015.

Helmut Schmid. 1997. Probabilistic Part-of-Speech Tagging Using Decision
  Trees. In New Methods in Language Processing, UCL Press, pp. 154-164.

Efstathios Stamatatos. 2009. A Survey of Modern Authorship Attribution Methods.
  Journal of the American Society for Information Science and Technology.

Philip J. Stone. 1966. The General Inquirer: A Computer Approach to Content
  Analysis. The MIT Press.

Duyu Tang, Furu Wei, Nan Yang, Ming Zhou, Ting Liu and Bing Qin. 2014.
  Learning Sentiment-Specific Word Embedding for Twitter Sentiment
  Classification. In Proceedings of the 52nd Annual Meeting of the Association
  for Computational Linguistics.