<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>New York City, USA, July</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Enhanced Sentiment Classification of Telugu Text using ML Techniques</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sandeep Sricharan Mukku</string-name>
          <email>sandeep.mukku@research.iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nurendra Choudhary</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radhika Mamidi</string-name>
          <email>radhika.mamidi@iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LTRC, IIIT Hyderabad</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>10</volume>
      <issue>2016</issue>
      <fpage>29</fpage>
      <lpage>34</lpage>
      <abstract>
        <p>With the growing amount of information and the availability of opinion-rich resources, it is often difficult for an ordinary user to analyse what others think. Analysing this information to determine what people in general think or feel about a product or service is the problem of sentiment analysis. Sentiment analysis, or sentiment polarity labelling, is an emerging field in which accuracy is essential. In this paper, we explore various Machine Learning techniques for classifying Telugu sentences into positive or negative polarities.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Recently there has been a proliferation of World Wide Web sites that emphasize user-generated content, with users as the potential content contributors. "What people think and feel" is important information for marketing and business operations, as it helps make a product or service better. There are also many comments and blog posts about trending activity in social media. People try to analyse this information and draw conclusions from it. To better analyse and classify this information, researchers are actively working on sentiment analysis. Sentiment analysis, or polarity classification, is an effort to classify a given text into polarities, either positive or negative. The majority of the work in sentiment classification has been done for English. There has been very little contribution for regional languages, especially Indian languages.</p>
      <p>Telugu is a Dravidian language native to India, with about 75 million native speakers. It ranks fifteenth in the Ethnologue list of most-spoken languages worldwide<sup>1</sup>. Many websites and blogs are now rich in Telugu content. In our work, we classify the polarity of Telugu sentences using various Machine Learning techniques, viz. Naive Bayes, Logistic Regression, SVM (Support Vector Machines), MLP (Multi-Layer Perceptron) Neural Network, Decision Trees and Random Forest. We built models for two classification tasks: a binary task of classifying sentiment into positive and negative polarities, and a ternary task of classifying sentiment into positive, negative and neutral polarities. The algorithm and formulation are explained in detail in later sections.</p>
      <p><sup>1</sup> http://www.ethnologue.com/statistics/size</p>
    </sec>
    <sec id="sec-2">
      <p>The rest of the paper is organised as follows. In Section 2, we discuss previous and related work. In Section 3, we describe the datasets used. In Section 4, we discuss our methodology, which includes pre-processing, training and output. In Section 5, we present the framework of our work, which includes the tools and the different Machine Learning techniques used. In Section 6, we present our experiments and discuss the results. Finally, we conclude and discuss future directions of this work.</p>
      <sec id="sec-2-1">
        <title>2 Related Work</title>
        <p>Sentiment classification is a difficult task on which a lot of research has been done. In this section we survey some of the methodologies and approaches used to address sentiment analysis and polarity classification. Our work is motivated by most of these works.</p>
        <p>An enhanced Naive Bayes model is used for sentiment classification in English [Narayanan et al., 2013]. Their approach combines effective negation handling, feature selection by mutual information and word n-grams, which resulted in a significant improvement in accuracy.</p>
        <p>[Maas et al., 2011] learn word vectors for sentiment analysis and use a Logistic Regression classifier as the predictor. Their methodology can capture both continuous and multi-class sentiment information as well as non-sentiment annotations.</p>
        <p>[Mullen and Collier, 2004] use support vector machines (SVMs) to bring together diverse sources of potentially pertinent information, including several favorability measures for phrases and adjectives and, where available, knowledge of the topic of the text. Predicting the helpfulness of online reviews is another area, where [Lee and Choeh, 2014] use a back-propagation multilayer perceptron neural network. This work motivated us to use a multilayer perceptron (MLP) neural network for sentiment classification.</p>
        <p>[Le and Mikolov, 2014], in their work on distributed representations of sentences and documents, build fixed-length paragraph or sentence vectors, which are quite useful for our work. We used the Doc2Vec tool for pre-processing the data. Its usage is explained in detail in later sections of the paper.</p>
        <p>[Das and Bandyopadhyay, 2010] propose several computational techniques to generate sentiment lexicons in Indian languages (including Bengali, Hindi and Telugu) automatically and semi-automatically. [Das and Bandyopadhyay, 2011] propose a tool, Dr Sentiment, which automatically creates the PsychoSentiWordNet by involving the internet population. The PsychoSentiWordNet is an extension of SentiWordNet that presently holds human psychological knowledge on a few aspects along with sentiment knowledge.</p>
      </sec>
      <sec id="sec-2-2">
        <title>3 Dataset</title>
        <p>In this section, we describe the raw corpus and the annotated data used in our experiments. Both are domain independent.</p>
        <sec id="sec-2-2-1">
          <title>3.1 Raw Corpus</title>
          <p>A corpus of 721,785 raw Telugu sentences was provided by the Indian Languages Corpora Initiative (ILCI)<sup>2</sup>. These sentences were used to train the Doc2Vec model (as described in the next section) for generating sentence vectors.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>3.2 Annotated Data</title>
          <p>The corpus consists of Telugu sentences, each with a corresponding polarity tag. It contains 1,644 sentences: 1,068 positive, 219 negative and 357 neutral. These sentences are used to train, test and evaluate the classifier models.</p>
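          <p>As a quick check on the class balance, the reported counts can be turned into majority-class baselines. The sketch below is not part of the original experiments: it uses only the counts stated for the corpus, and the helper name majority_baseline is our own. Always predicting "positive" would score roughly 65% on the ternary task and roughly 83% on the binary task, which is useful context when reading classifier accuracies on this data.</p>
          <preformat><![CDATA[
```python
# Class counts reported for the annotated corpus.
counts = {"positive": 1068, "negative": 219, "neutral": 357}

total = sum(counts.values())  # 1644 sentences in all


def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Ternary task: all three polarities.
ternary = majority_baseline(counts)
# Binary task: neutral sentences are left out.
binary = majority_baseline({k: v for k, v in counts.items() if k != "neutral"})

print(f"total sentences: {total}")
print(f"ternary majority baseline: {ternary:.4f}")
print(f"binary majority baseline:  {binary:.4f}")
```
]]></preformat>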
          <p>The corpus was prepared from raw data taken from Telugu newspapers<sup>3</sup>. This raw data was first annotated by two native Telugu speakers separately. The annotations were then merged by a third native speaker, who also validated them. The annotation uses three polarity tags: Positive, Negative and Neutral.</p>
          <p>We measured inter-annotator agreement using Cohen's kappa coefficient<sup>4</sup>. The annotation consistency (kappa value) was 0.92, which indicates almost perfect agreement.</p>
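          <p>For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two annotators with the agreement expected by chance from their label frequencies. The following is a minimal sketch of that computation; the two label lists are hypothetical and are not taken from our annotated data, and the function name cohen_kappa is our own.</p>
          <preformat><![CDATA[
```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations for six sentences (NOT the paper's data).
annotator_1 = ["pos", "pos", "neg", "neu", "pos", "neg"]
annotator_2 = ["pos", "pos", "neg", "neu", "neu", "neg"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```
]]></preformat>
          <p>Here the annotators agree on five of six items (observed agreement 5/6) while chance agreement is 1/3, so kappa is 0.75.</p>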
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>4 Methodology</title>
        <p>In this section we explain the steps involved in our approach. The Doc2Vec tool (see Section 5.1) gives a semantic representation of a sentence with respect to a dataset, meaning that the vector of a sentence represents its meaning. Therefore, a classifier that partitions this semantic space according to the training data can classify future instances of the same kind, which gives a solution to the problem of sentiment analysis.</p>
        <sec id="sec-2-3-1">
          <title>4.1 Pre-processing</title>
          <p>We converted the annotated sentences into 200-dimensional sentence feature vectors. For this we used the Doc2Vec tool provided by Gensim<sup>5</sup>, a Python module.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <p><sup>2</sup> http://sanskrit.jnu.ac.in/ilci/index.jsp <sup>3</sup> http://www.w3newspapers.com/india/telugu/ <sup>4</sup> http://en.wikipedia.org/wiki/Cohen%27s_kappa <sup>5</sup> https://radimrehurek.com/gensim/index.html</p>
      <p>Doc2Vec takes a raw corpus as input and gives a distributional semantic representation of its sentences. A Doc2Vec model is trained on the raw corpus (see Section 3.1). The sentences alone are taken from the annotated data and passed through the trained Doc2Vec model, which returns a sentence vector for each of them. We maintained the correspondence between sentences and their tags during this conversion.</p>
      <sec id="sec-3-0">
        <title>4.2 Training</title>
        <p>In the pre-processing phase we converted each sentence of the annotated data into a sentence vector, so each sentence vector has a corresponding tag attached to it. The task is thereby reduced to a binary or ternary classification problem, for which we use various Machine Learning classifiers. The algorithms are explained in the following section.</p>
        <p>The classifier models are trained using the sentence vectors and their corresponding tags. The models are evaluated using 5-fold cross-validation, in which the data is divided into training and testing sets in the ratio 4:1. The model thus obtained is ready to classify any sentence vector.</p>
      </sec>
      <sec id="sec-3-1">
        <title>4.3 Output</title>
        <p>In this section we discuss the final pipeline, which gives the resultant tag for a given input Telugu sentence. The input sentence is converted into a sentence vector using a Doc2Vec model. This sentence vector is given to the trained classifier model, which returns the output tag.</p>
      </sec>
      <sec id="sec-3-1a">
        <title>5.1 Doc2Vec</title>
        <p>Sentence Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences. In [Le and Mikolov, 2014], the algorithm represents each document by a dense vector which is trained to predict words in the document. Machine learning algorithms typically require text to be represented as a fixed-length vector. The most common fixed-length vector representations for texts are bag-of-words (BOW) and bag-of-n-grams [Harris, 1954], which are popular because they are simple and often accurate. We do not use bag-of-words because it loses the word order, so different sentences with the same words have exactly the same representation. We also do not use bag-of-n-grams: although it considers word order in a short context, it suffers from high dimensionality and data sparsity. Sentence vectors have several advantages: they can be learned from unlabeled data and they take the word order into consideration. Doc2Vec is a tool that converts sentences into sentence vectors. It helps in pre-processing and training of data.</p>
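        <p>The word-order limitation of bag-of-words can be seen with a small example. The sketch below uses two hypothetical English sentences (chosen only for readability, not drawn from our corpus) that contain the same words in a different order: their bag-of-words representations are identical, while their bigram representations differ. The helper names are our own.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def bag_of_words(sentence):
    """Map a sentence to word counts, discarding word order."""
    return Counter(sentence.lower().split())


def bag_of_bigrams(sentence):
    """Map a sentence to counts of adjacent word pairs."""
    words = sentence.lower().split()
    return Counter(zip(words, words[1:]))


# Hypothetical sentences: same words, opposite meaning.
s1 = "the film was good not bad"
s2 = "the film was bad not good"

same_bow = bag_of_words(s1) == bag_of_words(s2)
same_bigrams = bag_of_bigrams(s1) == bag_of_bigrams(s2)
print(same_bow, same_bigrams)
```
]]></preformat>
        <p>The bigram representation separates the two sentences, but its vocabulary is the set of observed word pairs, which grows much faster than the word vocabulary. This is the dimensionality and sparsity cost mentioned above.</p>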
      </sec>
      <sec id="sec-3-2">
        <title>5.2 ML Techniques</title>
        <p>We used the scikit-learn toolkit, which provides implementations of all these techniques.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Naive Bayes</title>
        <p>The Naive Bayes (NB) classifier is a probabilistic classifier based on Bayes' theorem. It estimates the probability of a class given the observed features, under the simplifying assumption that the features are conditionally independent given the class. The Naive Bayes classifier works very effectively for linearly separable problems and also performs reasonably well on problems that are not linearly separable.</p>
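        <p>To make the idea concrete, the following is a minimal sketch of a Gaussian Naive Bayes classifier on real-valued vectors. The Gaussian variant is an assumption on our part, chosen because sentence vectors are continuous; the text above does not state which variant was used. The two-dimensional vectors are hypothetical stand-ins for the 200-dimensional sentence vectors, and all function names are our own.</p>
        <preformat><![CDATA[
```python
import math
from collections import defaultdict


def fit_gaussian_nb(vectors, labels):
    """Estimate a prior, per-feature means and variances for each class."""
    grouped = defaultdict(list)
    for vec, label in zip(vectors, labels):
        grouped[label].append(vec)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(vectors), means, variances)
    return model


def predict_gaussian_nb(model, vec):
    """Return the class with the highest log posterior for one vector."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        # Features are treated as independent given the class.
        for x, m, v in zip(vec, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical 2-dimensional "sentence vectors" (illustration only).
X = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
     [-1.0, -0.8], [-0.9, -1.1], [-1.2, -1.0]]
y = ["positive", "positive", "positive", "negative", "negative", "negative"]
nb_model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(nb_model, [0.9, 1.0]))
```
]]></preformat>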
      </sec>
      <sec id="sec-3-4">
        <title>Logistic Regression</title>
        <p>Logistic Regression (LR) is a logistic model, extensible to multiple classes, that estimates the probability of a response from one or more independent predictor variables that determine the outcome. The expected value of the response is formed from a combination of the values taken by the predictors. We set the regularization parameter C to 1.0.</p>
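        <p>The role of C can be illustrated with the L2-regularised objective that scikit-learn documents for logistic regression, in which C multiplies the data term and so acts as the inverse of the regularization strength (a smaller C means stronger regularization). The sketch below assumes that default L2 penalty and a binary task; the weights and two-dimensional vectors are hypothetical, and the function names are our own.</p>
        <preformat><![CDATA[
```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, vec):
    """Probability of the positive class for one feature vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)


def regularised_loss(weights, bias, vectors, labels, C=1.0):
    """L2 penalty plus C times the summed log loss (labels are 0 or 1)."""
    penalty = 0.5 * sum(w * w for w in weights)
    log_loss = 0.0
    for vec, label in zip(vectors, labels):
        p = predict_proba(weights, bias, vec)
        log_loss -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return penalty + C * log_loss


# Hypothetical weights and vectors, for illustration only.
w, b = [2.0, -1.0], 0.0
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, 0]
print(predict_proba(w, b, X[0]))
print(regularised_loss(w, b, X, y, C=1.0))
```
]]></preformat>
        <p>Raising C increases the weight of the data term relative to the penalty, so the fitted weights are allowed to grow larger.</p>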
      </sec>
      <sec id="sec-3-5">
        <title>Support Vector Machine (SVM)</title>
        <p>The SVM classifier is a supervised learning model that constructs a set of hyperplanes in a high-dimensional space to separate the data into classes. In its basic form, SVM is a non-probabilistic linear classifier. SVM models are closely related to neural networks. For each input row, the trained SVM predicts the class to which that row belongs.</p>
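        <p>A minimal sketch of the linear case may help: prediction depends only on which side of the separating hyperplane a vector falls, and the hinge loss used in training is zero once a vector is on the correct side with a margin of at least one. The hyperplane and vectors below are hypothetical, and the sketch does not cover kernels or the training procedure.</p>
        <preformat><![CDATA[
```python
def decision_function(weights, bias, vec):
    """Signed score of a vector relative to the separating hyperplane."""
    return sum(w * x for w, x in zip(weights, vec)) + bias


def predict(weights, bias, vec):
    """Class +1 on the non-negative side of the hyperplane, otherwise -1."""
    return 1 if decision_function(weights, bias, vec) >= 0 else -1


def hinge_loss(weights, bias, vec, label):
    """Zero when the vector is correctly classified with margin >= 1."""
    return max(0.0, 1.0 - label * decision_function(weights, bias, vec))


# Hypothetical hyperplane x1 + x2 = 0 in two dimensions.
w, b = [1.0, 1.0], 0.0
print(predict(w, b, [2.0, 1.0]), predict(w, b, [-1.0, -0.5]))
print(hinge_loss(w, b, [0.2, 0.3], 1))
```
]]></preformat>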
      </sec>
      <sec id="sec-3-6">
        <title>Multi-Layer Perceptron (MLP) Neural Network</title>
        <p>A multilayer perceptron (MLP) is a feed-forward artificial neural network model that maps sets of input data onto an appropriate set of outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next. Feed-forward means that the data flows in only one direction, in our case forward from input to output.</p>
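        <p>The forward pass of such a network can be sketched in a few lines. The layer sizes, weights, ReLU hidden activation and softmax output below are all assumptions made for illustration; the text above does not specify the architecture that was used.</p>
        <preformat><![CDATA[
```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: every input feeds every output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Hidden-layer activation (an assumption of this sketch)."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn output scores into class probabilities that sum to one."""
    shifted = [math.exp(v - max(values)) for v in values]
    total = sum(shifted)
    return [s / total for s in shifted]


def mlp_forward(vec, layers):
    """Pass a vector through the layers in one direction, input to output."""
    hidden = vec
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


# Hypothetical network: 3 inputs, 2 hidden nodes, 2 output classes.
layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probs = mlp_forward([1.0, 0.5, -1.0], layers)
print(probs)
```
]]></preformat>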
      </sec>
      <sec id="sec-3-7">
        <title>Decision Trees</title>
        <p>A decision tree (DT) is a decision support tool that uses a tree-like model of decisions and their likely outcomes. Each internal (non-leaf) node of the tree is labeled with an input feature, and each leaf is labeled with a class. In our work, decision trees give less accurate results because they overfit the training data. We set the tree depth to 20 for each decision tree.</p>
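        <p>The structure described above can be sketched as a nested tuple, where internal nodes test one feature against a threshold and leaves hold a class. The hand-built tree below is hypothetical and is not learned from data; it only illustrates how a prediction follows a root-to-leaf path and what the depth of a tree measures.</p>
        <preformat><![CDATA[
```python
# A tree is either a class label (leaf) or a tuple
# (feature_index, threshold, left_subtree, right_subtree).


def predict_tree(tree, vec):
    """Follow feature tests from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = right if vec[feature] > threshold else left
    return tree


def depth(tree):
    """Number of feature tests on the longest root-to-leaf path."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[2]), depth(tree[3]))


# Hypothetical hand-built tree over a 2-dimensional vector.
toy_tree = (0, 0.0,
            "negative",
            (1, 0.5, "neutral", "positive"))
print(predict_tree(toy_tree, [0.4, 0.9]), depth(toy_tree))
```
]]></preformat>
        <p>A deeper tree can carve the training data into ever smaller regions, which is why an unconstrained or very deep tree tends to overfit.</p>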
      </sec>
      <sec id="sec-3-8">
        <title>Random Forest</title>
        <p>Random Forest (RF) is an ensemble of decision trees. A random forest constructs multiple decision trees and takes each of their scores into consideration when giving the final output. Decision trees tend to overfit a given dataset, so they give good results on training data but poor results on testing data. Random forests reduce overfitting because multiple decision trees are involved. We set the n_estimators parameter to 100.</p>
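        <p>The way individual trees are combined can be sketched with a simple majority vote. The three "trees" below are hypothetical stand-ins (plain functions that each look at one feature) rather than trained models, and hard voting is a simplification: scikit-learn's random forest averages the trees' class probabilities instead of counting votes.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


def forest_predict(trees, vec):
    """Each tree votes; the forest returns the most common vote."""
    return majority_vote([tree(vec) for tree in trees])


# Hypothetical stand-ins for trained trees: each inspects one feature.
trees = [
    lambda v: "positive" if v[0] > 0 else "negative",
    lambda v: "positive" if v[1] > 0 else "negative",
    lambda v: "positive" if v[2] > 0 else "negative",
]
print(forest_predict(trees, [0.7, -0.2, 0.3]))
```
]]></preformat>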
      </sec>
      <sec id="sec-3-9">
        <title>Adaboost Ensemble</title>
        <p>The core principle of AdaBoost (AB) is to fit a sequence of weak learners (i.e., models that are only slightly better than random guessing, such as small decision trees) on repeatedly modified versions of the data. The predictions from all of them are then combined through a weighted majority vote (or sum) to produce the final prediction.</p>
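        <p>The re-weighting and weighted vote can be illustrated with a small sketch of discrete AdaBoost for a binary task with labels -1 and +1. The one-dimensional data and the threshold "stumps" used as weak learners are hypothetical, and this is not the configuration used in our experiments. No single stump classifies the toy data correctly, but the weighted vote of three stumps does.</p>
        <preformat><![CDATA[
```python
import math


def stump(threshold, sign):
    """Weak learner on one number: `sign` above the threshold, else `-sign`."""
    return lambda x: sign if x > threshold else -sign


def adaboost(xs, ys, candidates, rounds=3):
    """Discrete AdaBoost for labels in {-1, +1}; returns (alpha, learner) pairs."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        def weighted_error(h):
            return sum(w for w, x, y in zip(weights, xs, ys) if h(x) != y)
        # Choose the weak learner with the lowest weighted error.
        best = min(candidates, key=weighted_error)
        err = weighted_error(best)
        if err >= 0.5:
            break  # no candidate is better than random guessing
        if err == 0:
            ensemble.append((1.0, best))  # a perfect learner; nothing to fix
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Modify the data: misclassified points gain weight, the rest lose it.
        weights = [w * math.exp(-alpha * y * best(x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Weighted vote of all weak learners."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical 1-dimensional data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
candidates = [stump(t, s) for t in (0.5, 2.5, 4.5) for s in (1, -1)]
ensemble = adaboost(xs, ys, candidates, rounds=3)
print([ensemble_predict(ensemble, x) for x in xs])
```
]]></preformat>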
        <sec id="sec-3-9-1">
          <title>6 Experiments and Results</title>
          <p>We use 5-fold cross-validation. The experiments are performed four times (trials) to improve the validity of the results. In each trial, the sentences are assigned to the five parts at random. These experiments are performed in the Training Step (see Fig. 2).</p>
          <p>The results are given below as tables. For binary classification, Random Forest, Logistic Regression and Support Vector Machines give good results. The Random Forest classifier is preferred because it has a more intuitive design and is easy to understand. For ternary classification, Logistic Regression gives good results. The experiments were conducted for four trials, each with five iterations (Itr), and the results are tabulated. The last column of each table gives the average (Avg) over the five iterations of each trial for every technique.</p>
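          <p>The evaluation protocol (four trials, each a fresh random division into five parts, with an average per trial) can be sketched as follows. To avoid inventing results, the sketch does not use any of our classifiers: the stand-in "classifier" simply predicts the most frequent training label, applied to the class counts of the annotated corpus. The function name run_trial and the seeds are our own choices.</p>
          <preformat><![CDATA[
```python
import random
from collections import Counter
from statistics import mean

# Labels rebuilt from the reported class counts of the annotated corpus.
labels = ["positive"] * 1068 + ["negative"] * 219 + ["neutral"] * 357


def run_trial(labels, k=5, seed=0):
    """One trial: shuffle, split into k folds, score each fold in turn."""
    order = list(range(len(labels)))
    random.Random(seed).shuffle(order)  # random division into parts
    folds = [order[i::k] for i in range(k)]
    accuracies = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        # Stand-in classifier: always predict the most frequent training label.
        majority = Counter(labels[idx] for idx in train).most_common(1)[0][0]
        correct = sum(labels[idx] == majority for idx in test)
        accuracies.append(100.0 * correct / len(test))
    return accuracies


# Four trials, five iterations each, plus the per-trial average.
results = {trial: run_trial(labels, k=5, seed=trial) for trial in range(1, 5)}
for trial, itr in results.items():
    print(f"Trial-{trial}", [round(a, 2) for a in itr], "Avg", round(mean(itr), 2))
```
]]></preformat>
          <p>Each iteration trains on four parts and tests on the remaining one, which gives the 4:1 ratio; replacing the stand-in with a real classifier leaves the rest of the loop unchanged.</p>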
          <p>Trial-1 (rows NB, LR, SVM, MLP, DT, RF), Itr 1 accuracies (%): 75.93, 80.93, 85.24, 72.83, 73.30, 85.22.</p>
          <p>In a few cases the data may contain sentences that carry no sentiment. We therefore also considered a neutral polarity, which leads to a ternary sentiment classification problem. The following are the accuracies obtained with all three polarities: positive, negative and neutral.</p>
        </sec>
      </sec>
      <sec id="sec-3-10">
        <title>Conclusion</title>
        <p>Telugu is an agglutinative language. Considering this fact, we have achieved good results. Sentiment analysis has not yet been tried on agglutinative Dravidian languages. Since our work is the first attempt of this kind, we are not able to discuss comparative results. This approach produces a more focused and accurate sentiment summary of a given Telugu sentence, which is useful for users. The approach is not restricted to any domain, and small modifications in the pre-processing would be sufficient to use this algorithmic formulation for other domains or languages.</p>
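        <sec id="sec-3-10-1">
          <title>Future Work</title>
          <list list-type="bullet">
            <list-item><p>To build a dictionary of frequently occurring positive and negative words and construct a lexicon-based system using it.</p></list-item>
            <list-item><p>To integrate a morphological analyser to address the issue of agglutination.</p></list-item>
            <list-item><p>To test the system for different Indian languages.</p></list-item>
            <list-item><p>To work on the trending code-mixed data.</p></list-item>
          </list>
        </sec>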
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Bakliwal et al.,
          <year>2011</year>
          ]
          <string-name>
            <given-names>Akshat</given-names>
            <surname>Bakliwal</surname>
          </string-name>
          , Piyush Arora, Ankit Patil, and
          <string-name>
            <given-names>V</given-names>
            <surname>Verma</surname>
          </string-name>
          .
          <article-title>Towards enhanced opinion classification using nlp techniques</article-title>
          .
          <source>In Proceedings of the 5th international joint conference on natural language processing (IJCNLP)</source>
          .
          <source>Chiang Mai, Thailand</source>
          , pages
          <fpage>101</fpage>
          -
          <lpage>107</lpage>
          . Citeseer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Balamurali, 2012]
          <string-name>
            <given-names>AR</given-names>
            <surname>Balamurali</surname>
          </string-name>
          .
          <article-title>Cross-lingual sentiment analysis for indian languages using linked wordnets</article-title>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Das and Bandyopadhyay, 2010]
          <string-name>
            <given-names>Amitava</given-names>
            <surname>Das</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sivaji</given-names>
            <surname>Bandyopadhyay</surname>
          </string-name>
          .
          <article-title>Sentiwordnet for indian languages</article-title>
          .
          <source>Asian Federation for Natural Language Processing</source>
          , China, pages
          <fpage>56</fpage>
          -
          <lpage>63</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Das and Bandyopadhyay, 2011]
          <string-name>
            <given-names>Amitava</given-names>
            <surname>Das</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sivaji</given-names>
            <surname>Bandyopadhyay</surname>
          </string-name>
          .
          <article-title>Dr sentiment knows everything!</article-title>
          <source>In Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies: systems demonstrations</source>
          , pages
          <fpage>50</fpage>
          -
          <lpage>55</lpage>
          . Association for Computational Linguistics,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Das et al., ]
          <string-name>
            <given-names>Dipankar</given-names>
            <surname>Das</surname>
          </string-name>
          , Soujanya Poria, Chandra Mohan Dasari, and
          <string-name>
            <given-names>Sivaji</given-names>
            <surname>Bandyopadhyay</surname>
          </string-name>
          .
          <article-title>Building resources for multilingual affect analysis-a case study on hindi, bengali and telugu</article-title>
          . In Workshop Programme, page
          <fpage>54</fpage>
          . Citeseer.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Go et al.,
          <year>2009</year>
          ]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Go</surname>
          </string-name>
          , Richa Bhayani, and
          <string-name>
            <given-names>Lei</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <article-title>Twitter sentiment classification using distant supervision</article-title>
          .
          <source>CS224N Project Report</source>
          , Stanford,
          <volume>1</volume>
          :
          <fpage>12</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
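          [Harris, 1954]
          <string-name>
            <given-names>Zellig S</given-names>
            <surname>Harris</surname>
          </string-name>
          .
          <article-title>Distributional structure</article-title>
          .
          <source>Word</source>
          ,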
          <volume>10</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>146</fpage>
          -
          <lpage>162</lpage>
          ,
          <year>1954</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Joshi et al.,
          <year>2010</year>
          ]
          <string-name>
            <given-names>Aditya</given-names>
            <surname>Joshi</surname>
          </string-name>
          , AR Balamurali, and
          <string-name>
            <given-names>Pushpak</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          .
          <article-title>A fall-back strategy for sentiment analysis in hindi: a case study</article-title>
          .
          <source>Proceedings of the 8th ICON</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
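          [Le and Mikolov, 2014]
          <string-name>
            <given-names>Quoc V</given-names>
            <surname>Le</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <article-title>Distributed representations of sentences and documents</article-title>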
          .
          <source>arXiv preprint arXiv:1405.4053</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
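          [Lee and Choeh, 2014]
          <string-name>
            <given-names>Sangjae</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>Joon Yeon</given-names>
            <surname>Choeh</surname>
          </string-name>
          .
          <article-title>Predicting the helpfulness of online reviews using multilayer perceptron neural networks</article-title>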
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>41</volume>
          (
          <issue>6</issue>
          ):
          <fpage>3041</fpage>
          -
          <lpage>3046</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Liu, 2010]
          <string-name>
            <given-names>Bing</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Sentiment analysis and subjectivity</article-title>
          .
          <source>Handbook of natural language processing</source>
          ,
          <volume>2</volume>
          :
          <fpage>627</fpage>
          -
          <lpage>666</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Liu, 2012]
          <string-name>
            <given-names>Bing</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Sentiment analysis and opinion mining</article-title>
          .
          <source>Synthesis lectures on human language technologies</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>167</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Maas et al.,
          <year>2011</year>
          ] Andrew L Maas, Raymond E Daly, Peter T Pham, Dan Huang, Andrew Y Ng, and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Potts</surname>
          </string-name>
          .
          <article-title>Learning word vectors for sentiment analysis</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume</source>
          <volume>1</volume>
          , pages
          <fpage>142</fpage>
          -
          <lpage>150</lpage>
          . Association for Computational Linguistics,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Mullen and Collier, 2004]
          <string-name>
            <given-names>Tony</given-names>
            <surname>Mullen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nigel</given-names>
            <surname>Collier</surname>
          </string-name>
          .
          <article-title>Sentiment analysis using support vector machines with diverse information sources</article-title>
          .
          <source>In EMNLP</source>
          , volume
          <volume>4</volume>
          , pages
          <fpage>412</fpage>
          -
          <lpage>418</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Narayanan et al.,
          <year>2013</year>
          ]
          <string-name>
            <given-names>Vivek</given-names>
            <surname>Narayanan</surname>
          </string-name>
          , Ishan Arora, and
          <string-name>
            <given-names>Arjun</given-names>
            <surname>Bhatia</surname>
          </string-name>
          .
          <article-title>Fast and accurate sentiment classification using an enhanced naive bayes model</article-title>
          .
          <source>In Intelligent Data Engineering and Automated Learning-IDEAL 2013</source>
          , pages
          <fpage>194</fpage>
          -
          <lpage>201</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Pang and Lee, 2008]
          <string-name>
            <given-names>Bo</given-names>
            <surname>Pang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lillian</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <article-title>Opinion mining and sentiment analysis</article-title>
          .
          <source>Foundations and trends in information retrieval</source>
          ,
          <volume>2</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>135</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Patra et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Braja Gopal</given-names>
            <surname>Patra</surname>
          </string-name>
          , Dipankar Das, Amitava Das, and
          <string-name>
            <given-names>Rajendra</given-names>
            <surname>Prasath</surname>
          </string-name>
          .
          <article-title>Shared task on sentiment analysis in indian languages (sail) tweets-an overview</article-title>
          .
          <source>In Mining Intelligence and Knowledge Exploration</source>
          , pages
          <fpage>650</fpage>
          -
          <lpage>655</lpage>
          . Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Wilson et al.,
          <year>2005</year>
          ] Theresa Wilson, Janyce Wiebe, and
          <string-name>
            <given-names>Paul</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          .
          <article-title>Recognizing contextual polarity in phrase-level sentiment analysis</article-title>
          .
          <source>In Proceedings of the conference on human language technology and empirical methods in natural language processing</source>
          , pages
          <fpage>347</fpage>
          -
          <lpage>354</lpage>
          . Association for Computational Linguistics,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>