<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Feature Weighting Strategies in Sentiment Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Olena Kummer</string-name>
          <email>olena.zubaryeva@unine.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacques Savoy</string-name>
          <email>jacques.savoy@unine.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Rue Emile-Argand 11</institution>
          ,
          <addr-line>CH-2000 Neuchâtel</addr-line>
        </aff>
      </contrib-group>
      <fpage>48</fpage>
      <lpage>55</lpage>
      <abstract>
<p>In this paper we propose an adaptation of the Kullback-Leibler divergence score for sentence-level sentiment and opinion classification. We propose to use the obtained score with an SVM model, applying different thresholds for pruning the feature set. We argue that pruning the feature set for sentiment analysis (SA) may be detrimental to classifier performance on short text. As an alternative, we consider a simple additive scheme that takes all of the features into account. Accuracy rates over 10-fold cross-validation indicate that the latter approach outperforms the SVM classification scheme.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Opinion Detection</kwd>
        <kwd>Kullback-Leibler divergence</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>In this paper we consider sentiment and opinion classification at the sentence level.
Sentiment analysis of user reviews, and of short text in general, is of interest
for many practical reasons: it represents a rich resource for marketing research,
for social analysts, and for anyone interested in following public opinion. Opinion
mining can also be useful in a variety of other applications and platforms, such as
recommendation systems, product ad placement strategies, question answering,
and information summarization.</p>
      <p>The suggested approach is based on a supervised learning scheme that uses
feature selection techniques and weighting strategies to classify sentences into
two categories (opinionated vs. factual or positive vs. negative). Our main
objective is to propose a new weighting technique and classification scheme able to
achieve comparable performance to popular state-of-the-art approaches, and to
provide a decision that can be understood by the final user (instead of justifying
the decision by considering the distance difference between selected examples).</p>
<p>The rest of the article is organized as follows. Section 2 reviews the
related literature. Section 3 presents our adaptation of the
Kullback-Leibler divergence score for opinion/sentiment classification.
Section 4 describes the experimental setup and the corpora used.
Sections 5 and 6 present experiments with and analysis of the proposed weighting
measure, with the SVM model and with the additive classification scheme respectively.
Finally, Section 7 draws conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Literature</title>
      <p>
        As a first step in machine learning algorithms such as SVM, naïve Bayes,
or k-Nearest Neighbors, one often applies feature weighting and/or selection based on the
computed weights. Selecting features decreases the dimensionality
of the feature space, and thus the computational cost, and can also reduce
overfitting of the learning scheme to the training data. Several studies address
the feature selection question. Forman [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] reports an extensive evaluation of
various schemes in text classification tasks. Dave et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] evaluate
linguistic and statistical measures, as well as weighting schemes that improve
feature selection.
      </p>
      <p>
        Kennedy et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] use the General Inquirer [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to classify reviews based on the
number of positive and negative terms they contain. The General Inquirer
assigns each word sense a label from the following set: positive, negative,
overstatement, understatement, or negation. Negations reverse the term polarity,
while overstatements and understatements intensify or diminish the strength of
the semantic orientation.
      </p>
      <p>
        In a study carried out by Su et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] on the MPQA (Multi-Perspective
Question Answering) and movie review corpora, it is shown that publicly available
sentiment lexicons can achieve performance on par with supervised
techniques. They discuss opinion and subjectivity definitions across different lexicons
and claim that it is possible to avoid annotation and training corpora entirely for
sentiment classification. Overall, it has to be noted that opinion words identified
with corpus-based approaches do not necessarily carry the
opinion itself in all situations. For example, in "He is looking for a good camera on the
market", the word good does not indicate that the sentence is opinionated
or expresses a positive sentiment.
      </p>
      <p>
        Pang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] propose to first separate subjective sentences from the rest of the
text. They assume that two consecutive sentences have similar subjectivity
labels, as an author is inclined not to change sentence subjectivity too often.
Labeling all sentences as objective or subjective is thus reformulated as the
task of finding the minimum s-t cut in a graph. They carried out experiments
on movie reviews and movie plot summaries mined from the Internet Movie
Database (IMDb), achieving an accuracy of around 85%.
      </p>
      <p>
        A variation of the SVM method was adopted by Mullen et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], who use
WordNet syntactic relations together with topic relevance to calculate
subjectivity scores for words in text. They report an accuracy of 86% on the Pang
et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] movie review dataset. An improvement of one of the IR metrics is
proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The so-called "Delta TFIDF" metric is used as a weighting scheme
for features; it takes into account how words are distributed in
the positive vs. negative training corpora. As a classifier, they use an SVM on the
movie review corpus.
      </p>
      <p>
        Paltoglou et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] explore IR weighting measures on publicly available
movie review datasets. They obtain good performance with BM25 and smoothing,
showing that it is important to use term weighting functions that scale sublinearly
with the number of times a term occurs in the document. They underline
that document frequency smoothing is a significant factor.
      </p>
    </sec>
    <sec id="sec-3">
      <title>KL Score</title>
      <p>
        In our experiments we adopted a feature selection measure described in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
that is based on the Kullback-Leibler divergence (KL-divergence). In
that paper, the author seeks a measure that lowers the score of features
whose distribution in the individual training documents of a given class
differs from their distribution in the whole corpus. Such a scoring
function selects features that are representative of all documents
in the class, leading to more homogeneous classes. The scoring measure based
on KL-divergence introduced in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] yields an improvement over MI with naïve
Bayes on the Reuters dataset, frequently used as a text classification benchmark.
      </p>
      <p>
        Schneider [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] defines the KL-divergence score of a feature ft over
a set of training documents S = d1, ..., d|S| and classes cj, j = 1, ..., |C|
in the following way:
<p>KLt(f) = K˜t(S) − K˜Lt(S) (1)
where K˜t(S) is the average divergence of the distribution of ft in the individual
training documents from that in all training documents. The difference KLt(f) in
Equation 1 is larger if the distribution of a feature ft is similar in documents
of the same class and dissimilar in documents of different classes.</p>
      <p>K˜t(S) is defined in the following way:</p>
<p>K˜t(S) = −p(ft) log q(ft) (2)
where p(ft) is the probability of occurrence of feature ft in the training set.
This probability can be estimated as the number of occurrences of ft in all
training documents, divided by the total number of features. Let Njt be the
number of documents in cj that contain ft, and let Nt be the sum of Njt over
j = 1, ..., |C|. Then q(ft|cj) = Njt/|cj| and q(ft) = Nt/|S|. The second term
of Equation 1 is defined as follows:</p>
      <p>K˜Lt(S) = − Σj=1..|C| p(cj) p(ft|cj) log q(ft|cj) (3)
where p(cj) is the prior probability of category cj, and p(ft|cj) is the probability
that feature ft appears in a document belonging to category cj. Using
maximum likelihood estimation with Laplace smoothing, Schneider [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
obtains:
      </p>
      <p>
        p(ft|cj) = (1 + Σdi∈cj n(ft, di)) / (|V| + Σt'=1..|V| Σdi∈cj n(ft', di)) (4)
where |V| is the training vocabulary size (the number of features indexed) and
n(ft, di) is the number of occurrences of ft in di. It is important to note that the
aforementioned average divergence calculations are really approximations based
on two assumptions: the number of occurrences of ft is the same in all
documents containing ft, and all documents in class cj have the same length.
These two assumptions may prove detrimental for the classification of long texts,
as noted by the author himself [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], but turn out quite effective in a sentence
classification setup, where a phrase mostly consists of features that occur once
and sentence lengths vary little. Note that the
computation of p(ft|cj) should be done on a feature set with outliers removed,
since outliers occur in all or almost all sentences in the corpora.
      </p>
      <p>
        In sentence-based classification, pruning the feature set can be quite
detrimental to classification accuracy. This holds when the training
set is not large enough to guarantee that features important for classification
are not discarded. We therefore propose to modify the KL-divergence
measure for sentiment and opinion classification. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] the measure calculates the difference
between the average divergence of the distribution of ft in individual training
documents and the global distribution, averaged over all training
documents in all classes. For the sentiment/opinion classification task it is interesting
instead to calculate the difference between the average divergence in one class and the
distribution over all classes. We can therefore obtain the average divergence of
the distribution of ft for each of the classification categories (j ∈ {POS, NEG}):
K˜Ltj(S) = Njt · p˜d(ft|cj) log (p˜d(ft|cj) / p(ft|cj)) (5)
Substituting K˜LtPOS(S) and K˜LtNEG(S) into Equation 1 for each category, we
obtain measures that evaluate how different the distribution of feature ft in
one category is from that in the whole training set.
      </p>
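      <p>Equation 5 reduces, for each class, to a single product per feature. A hedged sketch follows; the per-document estimate p˜d(ft|cj) is taken as a given input here, since it rests on Schneider's equal-occurrence assumptions:

```python
import math

def kl_tilde_class(N_jt, p_doc, p_class):
    """Per-class average divergence of feature f_t (Equation 5, illustrative).

    N_jt    -- number of class-j sentences containing f_t
    p_doc   -- p~_d(f_t | c_j), the per-document estimate (assumed given)
    p_class -- the smoothed class estimate p(f_t | c_j)
    """
    if p_doc == 0.0:
        return 0.0  # features absent from the class contribute nothing
    return N_jt * p_doc * math.log(p_doc / p_class)
```
</p>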
      <p>KLtPOS(f) = K˜tPOS(S) − K˜LtPOS(S) (6)</p>
      <p>KLtNEG(f) = K˜tNEG(S) − K˜LtNEG(S) (7)
In this way we obtain two sums, Σ KLtPOS(f) and Σ KLtNEG(f), over the features
present in the sentence. The difference of the two sums (denoted further as the
KL score) serves as a prediction of the category to which the sentence is
most similar.</p>
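      <p>The resulting decision rule is a comparison of two sums. A minimal sketch, in which the score dictionaries are hypothetical stand-ins for the trained per-feature values KLtPOS(f) and KLtNEG(f):

```python
def classify(sentence_feats, kl_pos, kl_neg):
    """Additive scheme: sum the per-feature KL scores for each category
    and let the sign of the difference (the KL score) pick the category."""
    score_pos = sum(kl_pos.get(f, 0.0) for f in sentence_feats)
    score_neg = sum(kl_neg.get(f, 0.0) for f in sentence_feats)
    return "POS" if score_pos - score_neg >= 0 else "NEG"
```

Features of an unseen sentence are simply looked up in the trained dictionaries; features never seen in training contribute zero to both sums.</p>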
    </sec>
    <sec id="sec-4">
      <title>Experimental Setup and Dataset Description</title>
      <p>
        We use a setup with unigram indexing, short stop word elimination (several
prepositions and verb forms: a, the, it, is, of) and the Porter
stemmer [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. All reported experiments use a 10-fold cross-validation setup.
      </p>
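      <p>This indexing step can be sketched as follows (stemming is left out; a Porter stemmer implementation is assumed to be applied to the surviving tokens afterwards):

```python
import re

# The short stop list from the setup above.
STOP_WORDS = {"a", "the", "it", "is", "of"}

def index_sentence(text):
    """Lowercase unigram tokenization with short stop word elimination."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```
</p>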
      <p>
        In our study we use three publicly available datasets, chosen for their
popularity as benchmarks in SA research. The first is the Sentence
Polarity dataset v1.0 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. It contains 5331 positive and 5331 negative snippets
of movie reviews; each review is one sentence long. The Subjectivity dataset
contains 5000 subjective and 5000 objective sentences [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        As a third dataset, containing newspaper articles, we use the MPQA
dataset [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The difficulty with the MPQA dataset is that the annotation unit
is the phrase, which can be a word, part of a sentence, a clause, a
full sentence, or a long phrase. In order to obtain a dataset with the sentence
as the classification unit, we used the approach proposed by Wilson et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
They define sentence-level opinion classification in terms of the phrase-level
annotations. A sentence is considered opinionated if:
1. it contains a "GATE direct-subjective" annotation with the attribute
intensity not in ['low', 'neutral'] and without the attribute 'insubstantial'; or
2. it contains a "GATE expressive-subjectivity" annotation with the
attribute intensity not in ['low'].
      </p>
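      <p>The two conditions above can be expressed as a small filter over the phrase-level annotations. A sketch, assuming each annotation is available as a dictionary whose keys ('type', 'intensity', 'insubstantial') are hypothetical names, not the actual MPQA file format:

```python
def is_opinionated(annotations):
    """Sentence-level opinion label derived from phrase-level annotations,
    following the two conditions above (field names are hypothetical)."""
    for a in annotations:
        if (a["type"] == "GATE_direct-subjective"
                and a["intensity"] not in ("low", "neutral")
                and not a.get("insubstantial", False)):
            return True
        if (a["type"] == "GATE_expressive-subjectivity"
                and a["intensity"] != "low"):
            return True
    return False
```
</p>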
      <p>
        Corpus statistics as reported in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]: there are 15,991
subjective expressions from 425 documents, containing 8,984 sentences. After
parsing, we obtained 6,123 opinionated and 4,989 factual sentences.
      </p>
    </sec>
    <sec id="sec-5">
      <title>KL Score and SVM</title>
      <p>
        We were interested in evaluating the features selected by our method with an
SVM classifier. As pointed out in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], SVM is able to learn a model
independent of the dimension of the space when few irrelevant features are present.
Experiments on text categorization show that even features
ranked low by their information gain (IG) are still relevant and contain the information
needed for successful classification. Another particularity of text classification
tasks in the context of the SVM method is the sparsity of the input vector,
especially when the input instance is a sentence rather than a document.
      </p>
      <p>
        Joachims [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] observed that text classification problems are usually
linearly separable, so much of the research on text classification uses
linear kernels [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In our experiments we used the SVMlight implementation with
a linear kernel and the soft-margin constant cost = 2.0 [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. We chose this cost
value based on experimental results. Generally, a low cost value (0.01 by default)
indicates a larger error tolerance during training; as the cost value grows,
the SVM model assigns a larger penalty to margin errors.
      </p>
      <p>We also experimented with other kernel types, namely the radial
basis function kernel. In our experiments, learning the SVM model with
this kernel takes substantially longer and gives approximately the same
level of performance as the linear kernel.
1 http://www.cs.cornell.edu/people/pabo/movie-review-data
2 http://www.cs.pitt.edu/mpqa/</p>
      <p>We prune the ranked features by score, keeping at least 60% of
the feature set, since further pruning leads to a drastic degradation in accuracy:
with fewer features in the training model, some testing sentences end up
represented by only one or two features. Pruning the feature set to the top
60% or 80% of ranked features did not improve the accuracy of the KL score
with the SVM model. In the next section, we discuss possible reasons for the
degradation in accuracy when pruning the feature set for sentence-level SA
classification, and propose a simple additive classification scheme.</p>
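      <p>The pruning referred to above amounts to keeping a top-ranked fraction of the features by score; a minimal sketch:

```python
def prune(scores, keep_fraction):
    """Keep the top-ranked fraction of features by score.

    scores        -- dict mapping feature -> weighting score
    keep_fraction -- e.g. 0.6 or 0.8, as in the experiments above
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return set(ranked[:k])
```
</p>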
    </sec>
    <sec id="sec-6">
      <title>Additive Classification Model Based on KL Score</title>
      <p>In text classification, after calculating a score between every feature and every
category, the usual next steps are to sort the features by score, choose the best k
features, and use them to train the classifier. For sentence-level sentiment
classification, pruning the feature set may eliminate infrequent features (those with
only several occurrences) and thus lose information needed to classify new
instances. The use of feature selection measures differs between topical text
classification and opinion/sentiment analysis in several respects. First, the aim in topical
text classification is to find the set of topic-specific features that describe
the classification category. In sentiment classification, though, the markers of
opinion can be carried both by topic-specific features and by context words, which may
show only small differences in distribution across categories due to the short text
length. In the movie review domain, for example, topic-specific features would
be movie, film, flick, while context words would be long, short, horror, satisfy, give
up.</p>
      <p>
        Second, the usual text classification methods are designed for documents
of at least several hundred words, assuming that features that can aid
classification repeat several times across the text. The format of a
sentence does not allow the same assumption: opinion or sentiment
polarity can be expressed by a single word/feature. There is substantial
evidence from several studies that the presence/absence of a feature is a better
indicator than tf scores [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>Thus, for effective classification, the model should identify features that are
strong indicators of opinion/sentiment, take into account the relations between
the features in each category, and be able to adjust the scores of infrequent
features in order to expand the set of features that strongly indicate
sentiment.</p>
      <p>As the classification model we use a simple additive score over the features in the
sentence, computed for each category. Our aim is to determine how well the
KL score behaves as a feature weighting measure, based on feature distribution
across classification categories, for sentence-level sentiment and opinion
classification.</p>
      <p>From the results presented in Table 2, we can see that a simple
classification scheme based on summing the feature scores per classification
category outperforms the SVM model on the sentence datasets. As
we deal with a small number of features, it is advantageous to use all of them
when taking a classification decision. Compared with the results in Table 1,
we achieve an improvement in accuracy for the Polarity and Subjectivity
datasets. Nevertheless, the SVM model gives better results on the MPQA
corpus. This may be due to differences in stylistics and in opinion annotation and
expression between the movie and newspaper domains: the former is usually much more
expressive, containing more sentiment-related words, than the latter.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>In this article we suggest a new adaptation of the Kullback-Leibler divergence
score as a weighting measure for sentiment and opinion classification. We use the
proposed score, named the KL score, for feature weighting with the SVM model.
The experiments showed that pruning the feature set does not improve
SVM performance. Taking into account the differences between topical and
sentiment classification of short text, we proposed a simple classification scheme
based on summing, for each classification category, the scores of the features
present in the sentence. Surprisingly, this scheme yields better results than
the SVM.</p>
      <p>Based on three well-known test collections in the domain (the Sentence
Polarity, Subjectivity and MPQA datasets), we suggested a new way of computing
feature weights that can later be used with SVM or other supervised
classification schemes relying on feature weights. The proposed score and
classification model were successfully applied in two different contexts (sentiment
and opinion) and two domains (movie reviews and news articles).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Forman</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>An extensive empirical study of feature selection metrics for text classification</article-title>
          .
          <source>The Journal of Machine Learning Research, Special Issue on Variable and Feature Selection</source>
          , vol.
          <volume>3</volume>
          , pp.
          <fpage>1289</fpage>
          -
          <lpage>1305</lpage>
          . (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dave</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawrence</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennock</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          :
          <article-title>Mining the peanut gallery: opinion extraction and semantic classification of product reviews</article-title>
          .
          <source>In Proceedings of the WWW Conference</source>
          , pp.
          <fpage>519</fpage>
          -
          <lpage>528</lpage>
          . (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kennedy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inkpen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Sentiment classification of movie reviews using contextual valence shifters</article-title>
          .
          <source>In Journal of Computational Intelligence</source>
          , vol.
          <volume>22</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>110</fpage>
          -
          <lpage>125</lpage>
          . (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          :
          <article-title>The General Inquirer: a computer approach to content analysis</article-title>
          . The MIT Press. (
          <year>1966</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markert</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>From words to senses: a case study of subjectivity recognition</article-title>
          .
          <source>In Proceedings of the 22nd International Conference on Computational Linguistics</source>
          , pp.
          <fpage>825</fpage>
          -
          <lpage>832</lpage>
          . (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A sentimental education: sentiment analysis using subjectivity summarization based on Minimum Cuts</article-title>
          .
          <source>In Proceedings of the 42nd Annual Meeting of the ACL</source>
          . (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mullen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis using Support Vector Machines with diverse information sources</article-title>
          .
          <source>In Proceedings of the EMNLP Conference</source>
          , pp.
          <fpage>412</fpage>
          -
          <lpage>418</lpage>
          . (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vaithyanathan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Thumbs up?: sentiment classification using machine learning techniques</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          . (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Martineau</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Delta TFIDF: an improved feature space for sentiment analysis</article-title>
          .
          <source>In Proceedings of the AAAI Conference on Weblogs and Social Media</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Paltoglou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thelwall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A study of information retrieval weighting schemes for sentiment analysis</article-title>
          .
          <source>In Proceedings of the 48th Annual Meeting of the ACL</source>
          , pp.
          <fpage>1386</fpage>
          -
          <lpage>1395</lpage>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          :
          <article-title>A new feature selection score for multinomial naïve Bayes text classification based on KL-divergence</article-title>
          .
          <source>Proceedings of the 42nd Annual Meeting of the ACL</source>
          . (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Porter</surname>
            ,
            <given-names>M. F.</given-names>
          </string-name>
          :
          <article-title>Readings in information retrieval</article-title>
          . Morgan Kaufmann Publishers Inc. (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales</article-title>
          .
          <source>In Proceedings of the 43rd Annual Meeting of ACL</source>
          , pp.
          <fpage>115</fpage>
          -
          <lpage>124</lpage>
          . (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoffmann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Recognizing contextual polarity in phrase-level sentiment analysis</article-title>
          .
          <source>In Proceedings of the HLT and EMNLP</source>
          , pp.
          <fpage>354</fpage>
          -
          <lpage>362</lpage>
          . (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Joachims</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Text categorization with Support Vector Machines: learning with many relevant features</article-title>
          .
          <source>In Proceedings of the European Conference on Machine Learning</source>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>142</lpage>
          . (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Joachims</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Making large-scale (SVM) learning practical</article-title>
          .
          <source>Advances in Kernel Methods - Support Vector Learning</source>
          , pp.
          <fpage>169</fpage>
          -
          <lpage>184</lpage>
          . MIT Press, Cambridge, MA. (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Opinion mining and sentiment analysis</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          , vol.
          <volume>2</volume>
          (
          <issue>1</issue>
          -
          <fpage>2</fpage>
          ). (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>