<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Sentiment Polarity Analyser based on a Lexical-Probabilistic Approach</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we propose an unsupervised approach to automatically classify the sentiment polarity of texts, which can be documents or tweets related to the user's favorite hashtags. The system combines probabilistic and lexicon-based approaches. We first apply the Latent Dirichlet Allocation (LDA) model to discover two vectors of terms relevant to two topics (presumably positive and negative), and then we calculate the polarity of the associated sentiment using the SentiWordNet resource. Experiments were conducted first on an English dataset; the system was then integrated into an application and tested on Italian. Results show that the system can partition the polarity with good accuracy.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Information from online social networks and micro-blogging platforms, such as
Twitter, is of interest to many research fields, from social science to computer science. In
particular, in the field of linguistic analysis, several frameworks for detecting sentiment
in social media have been developed for different application purposes. For instance,
tweets have been used for opinion mining about products, for monitoring political
sentiment [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], for detecting moods in a given geographical area [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and so on. The
recent integration of social media with Digital Libraries (DL) opens the way to
new types of applications. One of these is the application of sentiment
analysis to digital documents in order to understand the relations between opinions and
other factors (e.g., location and gender), so as to support the administrator of the
DL in social marketing and advertising.
      </p>
      <p>
        The main goal of the work presented in this paper is to develop an unsupervised
approach to analyzing the sentiment polarity of a set of text messages, for
instance reviews about items or tweets corresponding to a set of hashtags. In
addition, we had to deal with a constraint regarding language: in our
application the tweets to be analyzed were written in both Italian and English. To
find the sentiment polarity of a set of texts that may be of
different kinds, we combine a probabilistic approach with a lexicon-based one.
As a first step we apply Latent Dirichlet Allocation (LDA), a probabilistic graphical
model that mines hidden semantics from a set of documents [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It is a topic
model that represents documents as bags of words and looks for semantic
dependencies between words. In our approach we run LDA over tweet collections
so as to obtain two topics, which are expected to correspond to two different sentiments. In
particular, we discover two vectors of terms characterizing the two topics, presumably
positive and negative. These vectors are then analyzed from the polarity point of view
using the SentiWordNet resource [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The polarity of the resulting vectors is then used to
determine the global sentiment polarity of the set of hashtags. To determine
the accuracy of the results obtained with this approach, we first conducted
experiments on an English dataset; the system was then integrated into an
application and tested on Italian. Results show that the system can partition the
topic/polarity with good accuracy.
      </p>
      <p>The paper is structured as follows. Section 2 presents the motivation for this work.
Section 3 describes how the Sentiment Polarity Analyzer was developed.
Section 4 reports the results of experiments conducted on both
English and Italian. Finally, conclusions and directions for future work are given
in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Motivations for the proposed approach</title>
      <p>
        Machine learning-based techniques for sentiment classification can be supervised or
unsupervised. In the former case, a ‘training set’ of documents annotated
with the correct sentiment is needed, and performance can be evaluated on a
separate ‘test set’. In the supervised setting, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] successfully used Naive Bayes (NB),
Maximum Entropy (ME) and Support Vector Machines (SVM) to classify film
reviews as positive or negative. As features they used term vectors obtained without
stemming or stopword removal, and considered only single terms appearing at least 4
times in the corpus and bi-grams appearing at least 7 times. They also implemented a
simple mechanism to recognize negations that invert the polarity. In
the unsupervised setting, [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] proposed an algorithm that classifies reviews based on the
average polarity of sentences containing adjectives or adverbs. After PoS
tagging, pairs of words including one adjective or adverb are extracted by checking their
correspondence with pre-selected PoS patterns. Then, the polarity of each extracted
expression is estimated using a formula based on Pointwise Mutual Information
(PMI) applied to the results returned by an Internet search engine. The final outcome
is ‘recommended’ if the average polarity is positive, or ‘not recommended’
if it is negative. A similar approach was used for sentence-level sentiment
classification in [18], leveraging the co-occurrence of terms of known polarity in the
sentence, but using a different likelihood index. Instead of comparing a word with a
single known term, subsets of manually classified adjectives are used, and the polarity
of the sentence is determined from its score: positive if above a given threshold,
or negative if below another threshold. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] determine the polarity of a sentence
based on the polarity of the individual opinion words it includes, using a set of adjectives
of known polarity and WordNet. If an adjective in the sentence has unknown polarity,
the system looks up its synonyms and antonyms. The list of known adjectives is
expanded if the search is successful.
      </p>
      <p>
        Although supervised learning is commonly used in text categorization, and hence in
Sentiment Analysis, there has recently been increased use of unsupervised or
semi-supervised approaches to sentiment classification, in order to address the problem
of domain dependency and the need for annotated datasets [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In the unsupervised
case, the system takes unlabeled data and tries to find meaningful correlations among
them. To this end various techniques, both probabilistic and non-probabilistic, have
been used, including Latent Semantic Indexing (LSI) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], probabilistic
LSI [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and Latent Dirichlet Allocation (LDA).
      </p>
      <p>
        Among the unsupervised approaches proposed in the literature, those
based on topic models seem appropriate for addressing the sentiment
classification problem. In particular, LDA is the most recently
developed and widely used technique, and it has worked well in capturing these
semantics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It is a probabilistic generative topic model
based on the assumption that each document is a mixture of latent topics
and each topic is a probability distribution over words. For each latent
topic T, the model learns a conditional distribution p(w|T) for the probability that
word w occurs in T. One can obtain a k-dimensional vector representation of words
by first training a k-topic model and then filling the matrix with the p(w|T) values
(normalized to unit length). The result is a word–topic matrix in which the rows are
taken to represent word meanings. However, since LDA models topics and
is not tied to word meanings, there is no guarantee that the discovered word
vectors identify words denoting the polarity of the sentiment. Some recent work
introduces extensions of LDA that capture sentiment in addition to topical information
[
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
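      <p>As an illustration of the word–topic matrix just described, the following minimal Python sketch fills one row per word with its p(w|T) values and normalizes each row to unit length. It is not the LingPipe-based implementation used in this work; the function name word_topic_matrix and the toy probabilities are assumptions introduced here for illustration only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def word_topic_matrix(topic_word_probs, vocabulary):
    """Build a word-by-topic matrix from per-topic distributions p(w|T).

    topic_word_probs: list of dicts, one per topic, mapping word -> p(w|T).
    Returns a dict mapping each word to its k-dimensional unit-length row.
    """
    matrix = {}
    for word in vocabulary:
        row = [probs.get(word, 0.0) for probs in topic_word_probs]
        norm = math.sqrt(sum(value * value for value in row))
        # All-zero rows are left untouched to avoid dividing by zero.
        matrix[word] = [value / norm for value in row] if norm > 0 else row
    return matrix


# Toy two-topic model; the probabilities are made up for illustration.
topics = [
    {"good": 0.6, "bad": 0.1, "film": 0.3},
    {"good": 0.1, "bad": 0.6, "film": 0.3},
]
vectors = word_topic_matrix(topics, ["good", "bad", "film"])
```
]]></preformat>
      <p>In this toy example a word such as "film", which is equally likely under both topics, gets equal components, which illustrates why topical vectors alone need not separate sentiment.</p>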
      <p>
        In our approach we use LDA to extract two word vectors that ideally
represent the words characterizing two topics corresponding to the polarity of the
considered set of tweets. Then, in order to identify the sentiment content of the
discovered vectors, we rely on the SentiWordNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] affective lexicon to
give an affective weight to the words in the vectors.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Sentiment Polarity Analyzer</title>
      <p>Sentiment Polarity Analyzer (SPA) is a system that analyzes the sentiment of a set
of text messages (tweets, posts, etc.) using an approach that combines a topic model,
LDA, with SentiWordNet. In particular:</p>
      <p>i) LDA is used to extract the word vectors for two topics, which ideally
represent the words relevant to the two different polarities of the dataset;</p>
      <p>ii) SentiWordNet is used to assign a weight to the sentiment polarity of each
single word.</p>
      <p>The process is composed of four main phases:</p>
      <p>1. Input: the input to the system can be a dataset extracted from Twitter according to
the selected hashtag(s) or a text file that contains text messages of any type in English
and/or Italian. To extract tweets we use the Twitter4J<sup>1</sup> API. For each execution, the
application extracts up to 8000 tweets, to which we apply the following filters: a)
removal of retweets, b) detection of the tweet language (English or Italian), c) removal of
tweets already extracted in a previous run, d) pre-processing of each tweet
by removing hashtags and URLs, and discarding tweets shorter than 10 characters.</p>
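      <p>The four filters of the input phase can be sketched as follows. This is a minimal Python illustration, not the Twitter4J-based Java code of SPA. The dictionary keys (id, text, lang, is_retweet), the function name filter_tweets, the removal of whole hashtag tokens, and the choice to measure length after cleaning are all assumptions of this sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"#\w+")
MIN_LENGTH = 10  # tweets shorter than 10 characters are discarded


def filter_tweets(tweets, seen_ids, languages=("en", "it")):
    """Apply filters a) to d) to a list of tweet dicts.

    Each tweet is assumed to be a dict with the hypothetical keys
    'id', 'text', 'lang' and 'is_retweet'. seen_ids is the set of ids
    extracted in previous runs and is updated in place.
    """
    kept = []
    for tweet in tweets:
        if tweet["is_retweet"]:             # a) drop retweets
            continue
        if tweet["lang"] not in languages:  # b) keep English and Italian only
            continue
        if tweet["id"] in seen_ids:         # c) drop tweets already extracted
            continue
        # d) strip URLs and hashtags, then normalise whitespace
        text = URL_PATTERN.sub(" ", tweet["text"])
        text = HASHTAG_PATTERN.sub(" ", text)
        text = " ".join(text.split())
        if len(text) < MIN_LENGTH:
            continue
        seen_ids.add(tweet["id"])
        kept.append({"id": tweet["id"], "lang": tweet["lang"], "text": text})
    return kept
```
]]></preformat>
      <p>Passing the same seen_ids set to successive calls reproduces the behaviour of filter c) across execution runs.</p>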
      <p>
        2. Word vector extraction: using the LingPipe framework [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] we run LDA on the given input with
the number of topics set to two. In this way we extract the two word
vectors. Some parameters can be set, such as the minimum number of occurrences of a
token in the document. This parameter is important for very large
datasets, where it should be set to a high value so that the polarity is not
influenced by very rare terms. Conversely, for an
input with little text it is important to set a low threshold in order to avoid
skipping words that are important for the considered topic.
      </p>
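      <p>The effect of the minimum-occurrence parameter can be illustrated with a short Python sketch that removes rare tokens before topic extraction. It does not use the LingPipe API; the function name filter_rare_tokens is an assumption, and the default of 5 mirrors the default value reported in the evaluation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def filter_rare_tokens(documents, min_count=5):
    """Drop tokens that occur fewer than min_count times in the corpus.

    documents: list of token lists, one list per text message.
    """
    counts = Counter(token for doc in documents for token in doc)
    return [[token for token in doc if counts[token] >= min_count]
            for doc in documents]


# Tiny made-up corpus: "rare" occurs once and is removed with min_count=2.
corpus = [["good", "film", "rare"], ["good", "film"], ["good"]]
filtered = filter_rare_tokens(corpus, min_count=2)
```
]]></preformat>
      <p>Raising min_count shrinks the vocabulary seen by LDA, which is the trade-off described above for large versus small inputs.</p>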
      <p>3. Polarity Analyzer: each of the extracted vectors is analyzed in terms of
sentiment polarity using SentiWordNet. At the end of this phase the system returns
the positive/negative polarity for each topic (word vector) identified by LDA.</p>
      <sec id="sec-3-1">
        <title>1 Twitter4J: http://twitter4j.org/en/index.html</title>
        <p>In order
to deal with text written in either English or Italian we used an automatic translation
service (Java Google Translate Text-to-Speech<sup>2</sup>). To determine the polarity of
each word/term, we considered each possible use of the word in the SentiWordNet
classes (noun, verb, adverb, adjective) and combined the class scores to compute
a global polarity of the word/term, as in (1) and (2):</p>
        <disp-formula id="eq-1"><label>(1)</label><tex-math>\mathrm{class\_score}(w_c) = \frac{1}{n} \sum_{w_a \in S_c} \mathrm{score}(w_a)</tex-math></disp-formula>
        <disp-formula id="eq-2"><label>(2)</label><tex-math>\mathrm{polarity}(w) = \frac{1}{j} \sum_{c \in C} \mathrm{class\_score}(w_c)</tex-math></disp-formula>
        <p>(1) expresses the average polarity of a word w for a class c, where:</p>
        <p>• w<sub>c</sub> is the word w in the class c;</p>
        <p>• w<sub>a</sub> is the word w in the meaning a;</p>
        <p>• S<sub>c</sub> is the set of synsets of w<sub>c</sub> in c;</p>
        <p>• score(w<sub>a</sub>) is the polarity score of w<sub>a</sub>;</p>
        <p>• n is the number of w<sub>a</sub> in S<sub>c</sub>.</p>
        <p>(2) denotes the global average of the polarity of w, where C is the set of classes and
j is the number of classes.</p>
        <p>4. Heuristic Evaluation of Results: using SentiWordNet, the system extracts two
further word vectors made of terms that have a strong polarity weight. These are merged
into a single vector, which is then split by polarity into two new vectors, yielding
the actual positive and negative scores of the dataset. To evaluate the performance of the
proposed approach, three evaluations are performed on the results.</p>
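        <p>The word-level computation of (1) and (2) in phase 3 can be sketched in Python as follows. This is an illustrative sketch rather than the SPA code: the synset scores are passed in directly instead of being read from SentiWordNet, the function names are hypothetical, and skipping classes in which the word has no synsets (so that j counts only the classes where the word occurs) is an assumption.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def class_score(synset_scores):
    """Equation (1): mean polarity score of the word's synsets in one class."""
    return sum(synset_scores) / len(synset_scores)


def global_polarity(scores_by_class):
    """Equation (2): mean of the class scores over the classes considered.

    scores_by_class maps a class (noun, verb, adverb, adjective) to the
    list of polarity scores of the word's synsets in that class. Classes
    with no synsets are skipped, which is an assumption of this sketch.
    """
    class_scores = [class_score(scores)
                    for scores in scores_by_class.values() if scores]
    if not class_scores:
        return 0.0
    return sum(class_scores) / len(class_scores)


# Made-up scores for one word: two noun senses and one adjective sense.
example = {"noun": [0.25, 0.75], "verb": [], "adjective": [0.5]}
word_polarity = global_polarity(example)
```
]]></preformat>
        <p>Here the noun class averages to 0.5 and the adjective class to 0.5, so the global polarity of the example word is 0.5.</p>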
        <p>The First Heuristic aims at “evaluating whether LDA is suitable to determine the
two topic word vectors as denoting two opposite sentiment polarities”. To this end
the two vectors v0 and v1 are analyzed in terms of polarity with SentiWordNet in order
to determine a positive and a negative score for each vector. The evaluation
checks the condition that the vectors have opposite polarity:
score_pos(v0) - score_neg(v0) &gt; 0 AND score_pos(v1) - score_neg(v1) &lt; 0.</p>
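        <p>The condition of the First Heuristic translates directly into a boolean check. The Python sketch below assumes that each vector has already been reduced to a pair (score_pos, score_neg); the function name opposite_polarity is hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def opposite_polarity(v0, v1):
    """First heuristic: v0 leans positive while v1 leans negative.

    v0 and v1 are (score_pos, score_neg) pairs for the two topic vectors.
    """
    pos0, neg0 = v0
    pos1, neg1 = v1
    return (pos0 - neg0 > 0) and (pos1 - neg1 < 0)


# Made-up scores: the first pair satisfies the condition, the second does not.
separated = opposite_polarity((0.6, 0.2), (0.1, 0.4))
not_separated = opposite_polarity((0.6, 0.2), (0.5, 0.4))
```
]]></preformat>
        <p>As stated, the condition is asymmetric: it holds only when v0 is the positive vector and v1 the negative one.</p>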
        <p>The Second Heuristic aims at “evaluating whether the performance of LDA on a
single topic may be improved”. We created two vectors composed of relevant words
in order to increase the semantic consistency of the terms. A term is selected as a
relevant positive word if (pos_score ≥ 0.5
AND pos_score &gt; neg_score) OR (pos_score - neg_score ≥ 0.25). We apply an
analogous rule for the selection of negative words. Table 1 gives an example of the
application of these rules (words are translated from Italian).</p>
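        <p>The relevance rule of the Second Heuristic can be written as a pair of predicates. In the Python sketch below the positive rule follows the text; the negative rule is assumed to be its mirror image (the two scores swapped), since the text only says that an analogous rule is applied. The function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def is_relevant_positive(pos_score, neg_score):
    """Rule from the text for selecting a relevant positive word."""
    return ((pos_score >= 0.5 and pos_score > neg_score)
            or (pos_score - neg_score >= 0.25))


def is_relevant_negative(pos_score, neg_score):
    """Assumed mirror image of the positive rule (scores swapped)."""
    return is_relevant_positive(neg_score, pos_score)


# Made-up SentiWordNet-style scores for illustration.
strong_positive = is_relevant_positive(0.5, 0.125)
weak_word = is_relevant_positive(0.125, 0.0)
strong_negative = is_relevant_negative(0.0, 0.625)
```
]]></preformat>
        <p>A word with scores (0.125, 0.0) is rejected by both predicates, which is how weakly polarized terms are kept out of the relevant-word vectors.</p>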
      </sec>
      <sec id="sec-3-2">
        <title>2 Java Google Translate Text-to-Speech: gtranslateapi-1.0.jar</title>
        <p>Topic0 relevant words: shame, situation, good, suck, sick, affected.</p>
        <p>The Third Heuristic aims at “evaluating the distance of the dataset polarity extracted
with SentiWordNet compared to the polarity of the LDA vectors.” To this end we
merged the two vectors of relevant words and automatically created two vectors
containing the sets of terms characterizing the polarity of the dataset (see Table 2).</p>
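        <p>In this way it is possible to determine the polarity of each vector. In particular,
considering the example reported in Table 2, for Topic0 (T0) we have:</p>
        <disp-formula><tex-math>\mathrm{pos\_polarity}(T_0) = \frac{n^{\circ}\ \text{positive words}}{n^{\circ}\ \text{total words}} = 0.17</tex-math></disp-formula>
        <disp-formula><tex-math>\mathrm{neg\_polarity}(T_0) = \frac{n^{\circ}\ \text{negative words}}{n^{\circ}\ \text{total words}} = 0.83</tex-math></disp-formula>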
        <p>In the same way, the vector T1 expresses a positive polarity of 62.5% and a
negative one of 37.5%. We can therefore say that LDA extracts relevant words that allow
the sentiment polarity to be distinguished: in this example T0 can be regarded as the
negative word vector, since its polarity is 83% negative, and T1 as the positive one.</p>
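        <p>The polarity of a vector as a share of positive and negative words can be sketched in Python as follows. The word counts used here (1 positive out of 6 words for T0, 5 positive out of 8 for T1) are assumptions chosen only because they reproduce the percentages reported above; the function name vector_polarity is hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def vector_polarity(labels):
    """Return (positive share, negative share) of a vector of word labels.

    labels: list with one "pos" or "neg" label per relevant word.
    """
    total = len(labels)
    positive = sum(1 for label in labels if label == "pos")
    negative = sum(1 for label in labels if label == "neg")
    return positive / total, negative / total


# Assumed label counts that match the reported figures.
t0_labels = ["pos"] + ["neg"] * 5
t1_labels = ["pos"] * 5 + ["neg"] * 3

t0_pos, t0_neg = vector_polarity(t0_labels)
t1_pos, t1_neg = vector_polarity(t1_labels)
print(round(t0_pos, 2), round(t0_neg, 2))  # 0.17 0.83
print(t1_pos, t1_neg)                      # 0.625 0.375
```
]]></preformat>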
        <p>Sentiment Polarity Analyser (SPA) has been implemented in Java both as an
application and as a web service that can be used by any other application that needs it.
Its interface is illustrated in Figure 2 and is composed of 4 main sections:</p>
        <p>• Selection of the dataset and start of the analysis;</p>
        <p>• Log of execution steps;</p>
        <p>• Polarity score of the dataset (the TAG CLOUD buttons allow reading the
word vectors characterizing the topic);</p>
        <p>• Graph section illustrating the trend of the topic polarity.</p>
        <p>SPA can be used not only to extract the polarity of a dataset but also to monitor
a hashtag or a set of hashtags over time. In this case results are presented as a graph
that shows the trend of the sentiment around that topic. Figure 3 reports an example of
the monitoring of the hashtag “Renzi” (the Italian premier) from the 1st to the 30th
of October 2014. Note that the positive trend goes down after the 15th of
October, the day on which the “Legge di Stabilità” (new taxes) was issued.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Evaluation</title>
      <p>Our approach does not aim at classifying a single post or short text message as
positive or negative. Given the goal of our application, it aims instead at analyzing and
monitoring the polarity trend of a topic or a set of topics, and can therefore be seen
as a tool for determining the degree of liking for a certain topic. To determine
how well SPA performs this task we evaluated the tool on 4 datasets for which
we know the sentiment polarity mixture; the results are shown in Table 3.</p>
      <p>Table 3. Test datasets: sentiment140<sup>3</sup>, filmup<sup>4</sup>, cornell_polarity<sup>5</sup>, large_movie_review<sup>6</sup>. Columns: # Test, Dataset, % dataset polarity.</p>
      <p>In Table 4, test #1 shows an error of about 4% using the vectors extracted with
LDA, while for the characterizing vectors the error is about the same but the polarity
of the dataset is not defined. This may be due to the number of words in the vectors,
which depends on the minimum number of occurrences of the tokens in LDA, set
by default to 5. The results of test #2 are encouraging, since the error for both types of
vectors is about 3% and LDA identifies the negative topic with about the same
polarity as the original dataset (38.86% vs. 38.46%). We have similar results in
tests #3 and #4. After these results we ran further experiments in which we increased
the minimum number of occurrences of the tokens (up to 500).
While this unbalanced the polarity of the LDA-extracted vectors (increasing the
error to 9%), the characterizing vectors reached the correct mixture or topic polarity,
in particular for the large_movie_review dataset.</p>
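      <sec id="sec-4-1">
        <table-wrap>
          <table>
            <thead>
              <tr><th># Test</th><th>Dataset</th><th>min Token count</th><th>% error variation with LDA</th><th>% error variation with characterizing vectors</th></tr>
            </thead>
            <tbody>
              <tr><td>1</td><td>sentiment140</td><td>Default → 500</td><td>4.85</td><td>2.2</td></tr>
              <tr><td>2</td><td>cornell_polarity</td><td>Default → 300</td><td>5.77</td><td>2.17</td></tr>
              <tr><td>3</td><td>l_movie_review</td><td>Default → 1000</td><td>10.03</td><td>0.5</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>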
      <sec id="sec-4-2">
        <p>3 Sentiment140 dataset: http://help.sentiment140.com/for-students/</p>
        <p>4 http://filmup.it [<xref ref-type="bibr" rid="ref13">13</xref>]</p>
        <p>5 Cornell Polarity Dataset 1.0: http://www.cs.cornell.edu/people/pabo/movie-review-data/</p>
        <p>6 Large Movie Review Dataset: http://ai.stanford.edu/~amaas/data/sentiment/</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusions and Future Work Directions</title>
      <p>We conclude that the methodology presented in this paper is a feasible approach to
monitoring trends of sentiment about a particular topic or a set of topics.
SPA uses an unsupervised approach to automatically classify the
sentiment polarity of text messages, documents and tweets. The flexibility of SPA
allows its use in different application domains where there is a need to determine
or monitor the polarity of a dataset. The system combines
probabilistic and lexicon-based approaches. We first apply the Latent Dirichlet
Allocation (LDA) model to discover two vectors of terms relevant to two topics
(presumably positive and negative), and then we calculate the polarity of the
associated sentiment using the SentiWordNet resource. Experiments were
conducted first on an English dataset; the system was then integrated into an
application and tested on Italian. Results show that the system can partition the
polarity with good accuracy.</p>
      <p>
        The work presented here is the first prototype of the
system, and we are aware of its limitations. To improve the performance of the
proposed approach, an affective lexical resource for Italian is necessary in order to
avoid problems due to translation. Another important issue is negation,
which needs particular attention. Most of the proposed solutions are based on
heuristics similar to those used to handle the AND and BUT connectors. A possible
solution could be a switch to an approach based on semantics
(bag-of-concepts [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]), although this would require a different methodology for sentiment
classification. In future work we plan to integrate our implementation into a Digital
Library Management System and to perform experiments on a dataset of Italian
tweets [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] in order to compare our results with those obtained in the context of EVALITA
(Evaluation of NLP and Speech Tools for Italian, www.evalita.it).
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This work fulfils the research objectives of the PON02_00563_3489339 project
"PUGLIA@SERVICE", funded by the Italian Ministry of University and Research
(MIUR).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Tumasjan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sprenger</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sandner</surname>
            ,
            <given-names>P. G.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Welpe</surname>
            ,
            <given-names>I. M.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>Predicting elections with twitter: What 140 characters reveal about political sentiment</article-title>
          .
          <source>In Proc. of 4th ICWSM</source>
          ,
          <fpage>178</fpage>
          -
          <lpage>185</lpage>
          . AAAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>M. R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>K. D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Dodds</surname>
            ,
            <given-names>P. S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Danforth</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>The Geography of Happiness: Connecting Twitter Sentiment and Expression, Demographics, and Objective Characteristics of Place</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>8</volume>
          (
          <issue>5</issue>
          ),
          <fpage>05</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Jordan</surname>
          </string-name>
          , “
          <article-title>Latent Dirichlet allocation</article-title>
          ,”
          <source>Journal of Machine Learning Research</source>
          , vol.
          <volume>3</volume>
          , pp.
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Baccianella</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esuli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining</article-title>
          . In Nicoletta Calzolari et al., editor,
          <source>Proceedings of LREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Vaithyanathan</surname>
          </string-name>
          , “
          <article-title>Thumbs up? Sentiment classification using machine learning techniques</article-title>
          ,” in
          <source>Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.D.</given-names>
            <surname>Turney</surname>
          </string-name>
          et al.
          <article-title>Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews</article-title>
          .
          <source>In Proceedings of the 40th annual meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>417</fpage>
          -
          <lpage>424</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          , “
          <article-title>Mining and summarizing customer reviews</article-title>
          ,”
          <source>in Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>168</fpage>
          -
          <lpage>177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Deerwester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Dumais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Furnas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Landauer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Harshman</surname>
          </string-name>
          .
          <article-title>Indexing by latent semantic analysis</article-title>
          .
          <source>J AM SOC INFORM SCI</source>
          ,
          <volume>41</volume>
          :
          <fpage>391</fpage>
          -
          <lpage>407</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Hofmann</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Probabilistic Latent Semantic Indexing</article-title>
          .
          <source>In: Proceedings of the 22nd International ACM Conference on Research and Development in Information Retrieval</source>
          , pp.
          <fpage>50</fpage>
          -
          <lpage>57</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Joint sentiment/topic model for sentiment analysis</article-title>
          .
          <source>In Proceeding of the 18th ACM Conference on Information and Knowledge Management</source>
          , pages
          <fpage>375</fpage>
          -
          <lpage>384</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Sentiment analysis with global topics and local dependency</article-title>
          .
          <source>In Proceedings of AAAI</source>
          , pages
          <fpage>1371</fpage>
          -
          <lpage>1376</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Alias-i</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>LingPipe 4.1.0</article-title>
          . http://alias-i.com/lingpipe.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Casoto</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dattolo</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Omero</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pudota</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tasso</surname>
            <given-names>C..</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>A new machine learning based approach for sentiment classification of Italian documents</article-title>
          . In M. Agosti,
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          , and C. Thanos, editors,
          <source>IRCDL</source>
          , pages
          <fpage>77</fpage>
          -
          <lpage>82</lpage>
          .
          <publisher-name>DELOS: an Association for Digital Libraries</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Jay Kuan-Chieh</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chi-En</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Richard Tzong-Han</given-names>
            <surname>Tsai</surname>
          </string-name>
          .
          <article-title>Improve Polarity Detection of Online Reviews with Bag-of-Sentimental-Concepts</article-title>
          .
          <source>ESWC</source>
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Basile</surname>
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nissim</surname>
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Sentiment Analysis on Italian Tweets</article-title>
          .
          <source>Proc. of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          , pages
          <fpage>100</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>