<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Be Conscientious, Express your Sentiment!</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabio Celli</string-name>
          <email>fabio.celli@unitn.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Zaga</string-name>
          <email>cristina.zaga@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Trento</institution>
          ,
          <addr-line>Corso Bettini 31, 38068 Rovereto</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper addresses the issue of how personality recognition can be helpful for sentiment analysis. We exploited the corpus for sentiment analysis released for SemEval 2013 and automatically annotated it with personality labels by means of an unsupervised system for personality recognition. We validated the automatic annotation on a small set of Twitter users, whose personality types were collected by means of an online test. Results show that hashtag position and conscientiousness are the best predictors of sentiment in Twitter.</p>
      </abstract>
      <kwd-group>
        <kwd>Personality Recognition</kwd>
        <kwd>Twitter</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Data Mining</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In psychology, personality is seen as an affect processing system [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] that
characterises a unique individual [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], while sentiment analysis is an NLP task for tracking
the mood of the public about products or topics [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Since psychologists suggest
that personality is related to some aspects of mood [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we expect that
personality traits will help in a sentiment analysis task. In this paper, we exploit the
correlations between language and personality provided by Golbeck et al. 2011
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and Quercia et al. 2011 [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] to predict personality labels in a Twitter dataset
for sentiment analysis [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. We use a system for personality recognition [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to
annotate personality labels in Twitter. Our goal is to test whether personality
types can be good predictors of sentiment polarity.
      </p>
      <p>
        The paper is structured as follows: in subsection 1.1 we introduce related
work, and in section 2 we present the dataset and describe the method used for
the annotation with personality labels. In section 3 we report the results of our
experiments and draw some conclusions.
      </p>
      <p>
        In the last decade sentiment analysis and opinion mining have strongly attracted the
attention of the scientific community, and Twitter is a microblogging website
that has been considered a very rich source of data for opinion mining and
sentiment analysis [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, it is very challenging to extract linguistic information
from Twitter [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The 140-character limit of tweets has led to sentence-level
sentiment analysis. Kouloumpis et al. 2011 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] have shown that in the
microblogging domain, features from common NLP tools may be less useful as sentiment clues than
the presence of intensifiers, emoticons, abbreviations and hashtags. Given these
results, more and more attention has recently been given to the wide variety of
user-defined hashtags [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The uniqueness of the microblogging genre has also led
researchers to design NLP tools that make use of a number of domain-specific
features including abbreviations, hashtags, emoticons and symbols [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Personality recognition [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a computational task that consists in the
automatic classification of authors' personality traits from pieces of text they
wrote. Most scholars use the Big5 model [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This model describes
personality along five traits formalized as bipolar scales: extroversion (sociable or shy),
neuroticism (calm or neurotic), agreeableness (friendly or uncooperative),
conscientiousness (organized or careless) and openness to experience (insightful or
unimaginative).
      </p>
      <p>
        The first applications in this field were on offline essay texts [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and on
blogs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In recent years the interest of the scientific community has moved towards the
application of personality recognition in social networks, including Twitter [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ],
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In particular, these studies extracted correlations between language and personality
traits from Twitter, which we exploited for the annotation of the data.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Dataset, Annotation and Experiments</title>
      <sec id="sec-2-1">
        <title>Data</title>
        <p>We used the dataset released by Wilson et al. 2013 for the SemEval-2013 task
B<fn><label>1</label><p>http://www.cs.york.ac.uk/semeval-2013/task2/</p></fn>. The purpose of this task is to classify whether a tweet has positive, negative,
or neutral sentiment. Gold standard sentiment labels are provided with the data. The dataset
consists of Twitter status IDs, and the task organizers provided a Python script
that downloads the data, if available. The final data includes the following
information: tweet ID; user ID; topic; sentiment polarity; tweet text. We downloaded
and cleaned the data, removing unavailable tweets. The data is split into training
and test sets; details are reported in Table 1.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>set</th><th>instances</th><th>missing</th><th>total</th></tr>
            </thead>
            <tbody>
              <tr><td>training</td><td>5747</td><td>495</td><td>5252</td></tr>
              <tr><td>test</td><td>687</td><td>123</td><td>564</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>For each user in the dataset we have
just one text, which is not enough for personality recognition. In order to get
more tweets, we exploited user IDs and automatically collected all the tweets we
found on their pages. We collected an average of 12 tweets per user.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Annotation of Personality Types</title>
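      <p>
        For the annotation of personality labels in the dataset, we exploited the system
described in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It is an unsupervised instance-based personality
recognition system. Given as input a set of correlations between language cues and Big5
personality traits, and a set of users and their texts, the system generates
personality labels for each user, adapting the correlations to the data at hand. We
exploited the correlations between tweets and personality traits taken from [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]
and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. We used only the correlations with a p-value below .05, reported in
Table 2. These correlations, which represent the initial model for the unsupervised
system, include language-independent features, such as punctuation,
Twitter-specific features, such as following and followers count, and features from LIWC
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>feature</th><th>ext.</th><th>agr.</th><th>con.</th><th>neu.</th><th>ope.</th></tr>
          </thead>
          <tbody>
            <tr><td>future</td><td>.227</td><td>-.100</td><td>-.286*</td><td>.118</td><td>.142</td></tr>
            <tr><td>you</td><td>.068</td><td>.364*</td><td>.252*</td><td>-.212</td><td>-.020</td></tr>
            <tr><td>article</td><td>-.039</td><td>-.139</td><td>-.071</td><td>-.154</td><td>.396*</td></tr>
            <tr><td>negate</td><td>-.020</td><td>.048</td><td>-.374*</td><td>.081</td><td>.040</td></tr>
            <tr><td>family</td><td>.338*</td><td>.020</td><td>-.126</td><td>.096</td><td>.215</td></tr>
            <tr><td>humans</td><td>.204</td><td>-.011</td><td>.055</td><td>-.113</td><td>.251*</td></tr>
            <tr><td>sad</td><td>.154</td><td>-.203</td><td>-.253*</td><td>.230</td><td>-.111</td></tr>
            <tr><td>cause</td><td>.224</td><td>-.258*</td><td>-.155</td><td>-.004</td><td>.264*</td></tr>
            <tr><td>certain</td><td>.112</td><td>-.117</td><td>-.069</td><td>-.074</td><td>.347*</td></tr>
            <tr><td>hear</td><td>.042</td><td>-.041</td><td>.014</td><td>.335*</td><td>-.084</td></tr>
            <tr><td>feel</td><td>.097</td><td>-.127</td><td>-.236*</td><td>.244*</td><td>.005</td></tr>
            <tr><td>body</td><td>.031</td><td>.083</td><td>-.079</td><td>.122</td><td>-.299*</td></tr>
            <tr><td>achieve</td><td>-.005</td><td>-.240*</td><td>-.198</td><td>-.070</td><td>.008</td></tr>
            <tr><td>religion</td><td>-.152</td><td>-.151</td><td>-.025</td><td>.383*</td><td>-.073</td></tr>
            <tr><td>death</td><td>-.001</td><td>.064</td><td>-.332*</td><td>-.054</td><td>.120</td></tr>
            <tr><td>filler</td><td>.099</td><td>-.186</td><td>-.272*</td><td>.080</td><td>.120</td></tr>
            <tr><td>! marks</td><td>-.021</td><td>-.025</td><td>.260*</td><td>.317*</td><td>-.295*</td></tr>
            <tr><td>parentheses</td><td>-.254*</td><td>-.048</td><td>-.084</td><td>.133</td><td>-.302*</td></tr>
            <tr><td>? marks</td><td>.263*</td><td>-.050</td><td>.024</td><td>.153</td><td>-.114</td></tr>
            <tr><td>words</td><td>.285*</td><td>-.065</td><td>-.144</td><td>.031</td><td>.200</td></tr>
            <tr><td>followers</td><td>.15*</td><td>.02</td><td>.10</td><td>-.19*</td><td>.05</td></tr>
            <tr><td>following</td><td>.13*</td><td>.07</td><td>.08</td><td>-.17*</td><td>.05</td></tr>
          </tbody>
        </table>
      </table-wrap>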
      <p>The outputs of the system are one personality label for each user and the
annotated input text. Labels are formalized as 5-character strings, each character
representing one trait of the Big5. Each character in the string can take 3
possible values: positive pole of the scale (y), negative pole (n) and missing/balanced
(o). For example, the label "ynooy" stands for an extrovert, neurotic and
open-minded person. The annotation is a classification task with 3 target classes.</p>
      <p>The pipeline of the personality recognition system, depicted in Figure 1,
has three phases: preprocessing, processing and evaluation. In the preprocessing
phase, the system samples 20% of the input unlabeled data, computing the
average distribution of each feature of the correlation set, then assigns personality
labels to the sampled data according to the correlations.</p>
      <p>In the processing phase, the system generates one personality label for each
text in the dataset, mapping the features in the correlation set to specific
personality trait poles, according to the correlations. Instances are compared to
the distribution of features sampled during the preprocessing phase and filtered
accordingly. Only features occurring more than the average are mapped to
personality traits. For example, a text containing more exclamation marks than
average will fire positive correlations with conscientiousness and neuroticism and
a negative correlation with openness to experience (see Table 2).</p>
      <p>The system keeps track of the firing rate of each single feature/correlation
and computes personality scores for each trait, mapping positive scores into "y",
negative scores into "n" and missing or balanced values into "o" labels.</p>
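      <p>The processing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and variable names are hypothetical, the order of the traits inside the label is assumed, and the scoring rule (adding +1 or -1 for each fired correlation, according to its sign) is an assumption, since the text does not specify how firing rates are turned into scores. The correlation values are the significant entries of two rows of Table 2.</p>
      <preformat><![CDATA[
```python
# Illustrative sketch only: names and the +1/-1 scoring are assumptions.
# Significant correlations for two features, copied from Table 2.
CORRELATIONS = {
    "exclamation_marks": {"con": 0.260, "neu": 0.317, "ope": -0.295},
    "question_marks": {"ext": 0.263},
}
TRAITS = ("ext", "neu", "agr", "con", "ope")  # assumed order of the label


def label_text(feature_values, feature_averages):
    """Return a 5-character y/n/o label for one text."""
    scores = dict.fromkeys(TRAITS, 0)
    for feature, value in feature_values.items():
        # A feature fires only when it occurs more often than the average
        # estimated in the preprocessing phase.
        if value > feature_averages.get(feature, 0.0):
            for trait, r in CORRELATIONS.get(feature, {}).items():
                scores[trait] += 1 if r > 0 else -1
    return "".join(
        "y" if scores[t] > 0 else "n" if scores[t] < 0 else "o"
        for t in TRAITS
    )


label = label_text(
    {"exclamation_marks": 3, "question_marks": 0},
    {"exclamation_marks": 1.0, "question_marks": 0.5},
)
print(label)
```
]]></preformat>
      <p>In this toy input only the exclamation marks exceed their average, so only the conscientiousness, neuroticism and openness positions of the label are set; the remaining traits stay at "o".</p>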
      <p>In the evaluation phase, the system compares all the personality labels
generated for each single tweet of each user and retrieves one generalized label per
user by computing the majority class for each trait. This is why the system can
evaluate personality only for users that have at least two tweets; the other users
are discarded. In the evaluation phase the system also computes average confidence
and variability. Average confidence is defined as the coverage of the majority
class of the personality trait over the count of all the user's texts and gives
a measure of the robustness of the personality hypothesis. Variability instead
provides information about how much one author tends to write expressing the
same personality traits in all the texts. It is defined as <inline-formula><tex-math>\mathit{var} = \frac{\mathit{avg\_conf}}{T}</tex-math></inline-formula>, where T
is the count of all the user's texts.</p>
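      <p>The aggregation performed in the evaluation phase can be illustrated with a short sketch. The names below are hypothetical, the confidence is assumed to be averaged over the five traits, and the variability is computed by reading the formula in the text as average confidence divided by T, which is an interpretation rather than a confirmed definition.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def aggregate_user(text_labels):
    """Combine per-text labels into (user label, avg confidence, variability).

    Returns None for users with fewer than two texts, who are discarded.
    """
    total = len(text_labels)
    if total < 2:
        return None
    majority, confidences = [], []
    for position in range(5):
        counts = Counter(label[position] for label in text_labels)
        value, count = counts.most_common(1)[0]  # majority class for the trait
        majority.append(value)
        confidences.append(count / total)  # coverage of the majority class
    avg_conf = sum(confidences) / len(confidences)
    variability = avg_conf / total  # assumed reading of the formula
    return "".join(majority), avg_conf, variability


result = aggregate_user(["ynooy", "ynooy", "yoooy", "nnooy"])
print(result)
```
]]></preformat>
      <p>Ties between classes are not handled in any principled way here; the sketch simply keeps whichever tied value was seen first.</p>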
    </sec>
    <sec id="sec-4">
      <title>Validation of Personality Labels</title>
      <p>
        In order to validate the annotation of the data, we developed a website<sup>2</sup> with
a short version of the Big5 test, the BFI-10 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. We collected a gold-standard
test set, with the personality scores of 20 Twitter users, their tweets and data.
We computed random and majority baselines with 3 target classes (y, n, o), and
then ran the system on the gold-standard test set. Results, reported in Table
3, show that the average F-measure is in line with the results reported in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Conscientiousness and openness to experience are the best predicted traits; in
particular, conscientiousness has the highest precision. Agreeableness instead
performs poorly: we explain this with the fact that it is the trait for which
we have fewer features.
      </p>
      <table-wrap>
        <label>Table 3</label>
        <table>
          <thead>
            <tr><th></th><th>P</th><th>R</th><th>F1</th></tr>
          </thead>
          <tbody>
            <tr><td>random</td><td>0.359</td><td>0.447</td><td>0.392</td></tr>
            <tr><td>majority</td><td>0.39</td><td>1</td><td>0.455</td></tr>
            <tr><td>extroversion</td><td>0.595</td><td>1</td><td>0.746</td></tr>
            <tr><td>neuroticism</td><td>0.595</td><td>1</td><td>0.746</td></tr>
            <tr><td>agreeableness</td><td>0.371</td><td>0.5</td><td>0.426</td></tr>
            <tr><td>conscientiousness</td><td>0.621</td><td>0.693</td><td>0.655</td></tr>
            <tr><td>openness</td><td>0.606</td><td>0.833</td><td>0.702</td></tr>
            <tr><td>avg.</td><td>0.558</td><td>0.805</td><td>0.655</td></tr>
          </tbody>
        </table>
      </table-wrap>
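      <p>As a reading aid for Table 3, the sketch below recomputes F1 from precision and recall, assuming that the F1 column is the usual harmonic mean of the two. The function name is hypothetical; the two input pairs are the extroversion and conscientiousness rows of the table.</p>
      <preformat><![CDATA[
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


extroversion_f1 = round(f1_score(0.595, 1.0), 3)
conscientiousness_f1 = round(f1_score(0.621, 0.693), 3)
print(extroversion_f1, conscientiousness_f1)
```
]]></preformat>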
    </sec>
    <sec id="sec-5">
      <title>Experiments and Discussion</title>
      <p>
        We ran two different binary classification tasks, task A: subjectivity detection,
and task B: sentiment polarity classification. The former is the task of
distinguishing between neutral texts and texts containing sentiment; the latter is the
classical opinion mining classification between positive and negative. As
features, we used the five personality traits, Twitter statistics (followers, following,
tweets), emoticons (positive/negative), hashtag position (hashtag initial, hashtag
final) and Twitter part-of-speech tags obtained by means of a part-of-speech
tagger designed for Twitter [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].<fn><label>2</label><p>http://personality.altervista.org/p.php</p></fn>
      </p>
      <p>
        As a first experiment we ran feature selection in Weka [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], removing topics
and using the correlation-based subset evaluation algorithm [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] with a
greedy stepwise feature space search. This algorithm evaluates the worth of a subset of
attributes by considering the individual predictive ability of each feature along
with the degree of redundancy between them. Results are reported in Table 4:
we see that hashtag position is very helpful, while the only personality trait
which is a good predictor of sentiment is conscientiousness. We then ran a classification
experiment, reported in Table 5, where we predicted the target classes
using the features selected in the feature selection phase. Taking the majority
baseline (zero rule) as reference, we observe that the best improvement over the baseline has
been achieved in task A (distinction between neutral/subjective), from 0.467 to 0.663 with bayes, while task B
(positive/negative) shows only a small improvement, from 0.55 to 0.612 with ripper.
      </p>
      <table-wrap>
        <label>Table 5</label>
        <table>
          <thead>
            <tr><th>algorithm</th><th>task A (f1)</th><th>task B (f1)</th></tr>
          </thead>
          <tbody>
            <tr><td>bl (zero rule)</td><td>0.467</td><td>0.55</td></tr>
            <tr><td>trees</td><td>0.619</td><td>0.571</td></tr>
            <tr><td>bayes</td><td>0.663</td><td>0.598</td></tr>
            <tr><td>svm</td><td>0.632</td><td>0.555</td></tr>
            <tr><td>ripper</td><td>0.629</td><td>0.612</td></tr>
          </tbody>
        </table>
      </table-wrap>
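      <p>The idea behind correlation-based subset evaluation, rewarding features that predict the class while penalizing redundancy among them, can be illustrated with the merit heuristic commonly associated with this family of methods. The sketch below is not Weka's implementation: the function name is hypothetical and the correlation values in the usage lines are invented for illustration only.</p>
      <preformat><![CDATA[
```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset.

    feature_class_corr: correlation of each feature with the class.
    feature_feature_corr: pairwise correlations between the features.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    mean_cf = sum(feature_class_corr) / k
    mean_ff = (
        sum(feature_feature_corr) / len(feature_feature_corr)
        if feature_feature_corr
        else 0.0
    )
    # High class correlation raises the merit, redundancy lowers it.
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


single = cfs_merit([0.5], [])
redundant_pair = cfs_merit([0.5, 0.5], [1.0])
independent_pair = cfs_merit([0.5, 0.5], [0.0])
print(single, redundant_pair, independent_pair)
```
]]></preformat>
      <p>With these invented values, adding a perfectly redundant second feature leaves the merit unchanged, while adding an uncorrelated one increases it, which is the behaviour the paragraph above describes.</p>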
      <sec id="sec-5-1">
        <title>Conclusions and Future Work</title>
        <p>In this paper we attempted to exploit personality traits, and a few other linguistic
cues, including hashtags, to predict subjectivity and sentiment polarity in
Twitter. The best performing team at SemEval 2013 achieved an f1 of .889 for
task A and of .69 for task B. While our results are far from the best ones in task
A, they are in line with the results of the shared task for task B. It is interesting
that conscientiousness is one of the features we exploited for task B.</p>
        <p>The performance of the personality recognition system is far from perfect,
but we still successfully exploited one specific trait of personality to classify
sentiment. In the future we wish to improve the performance of the personality recognition
system by adding more correlations, and to extend the exploitation of personality
and hashtags to other domains, such as irony detection.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>Adelstein</surname> <given-names>J.S.</given-names></string-name>,
          <string-name><surname>Shehzad</surname> <given-names>Z.</given-names></string-name>,
          <string-name><surname>Mennes</surname> <given-names>M.</given-names></string-name>,
          <string-name><surname>DeYoung</surname> <given-names>C.G.</given-names></string-name>,
          <string-name><surname>Zuo</surname> <given-names>X-N.</given-names></string-name>,
          <string-name><surname>Kelly</surname> <given-names>C.</given-names></string-name>,
          <string-name><surname>Margulies</surname> <given-names>D.S.</given-names></string-name>,
          <string-name><surname>Bloomfield</surname> <given-names>A.</given-names></string-name>,
          <string-name><surname>Gray</surname> <given-names>J.R.</given-names></string-name>,
          <string-name><surname>Castellanos</surname> <given-names>X.F.</given-names></string-name>
          and
          <string-name><surname>Milham</surname> <given-names>M.P.</given-names></string-name>
          <article-title>Personality Is Reflected in the Brain's Intrinsic Functional Architecture</article-title>
          .
          In <source>PLoS ONE</source> <volume>6</volume>:(<issue>11</issue>)
          ,
          <fpage>1</fpage>–<lpage>12</lpage>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Aitken Harris</surname>, <given-names>J.</given-names></string-name>, and
          <string-name><surname>Lucia</surname>, <given-names>A.</given-names></string-name>
          <article-title>The relationship between self-report mood and personality</article-title>
          .
          <source>Personality and Individual Differences</source>
          ,
          <volume>35</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1903</fpage>–<lpage>1909</lpage>
          . (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Celli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Rossi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>The role of emotional stability in Twitter conversations</article-title>
          .
          <source>In Proceedings of the Workshop on Semantic Analysis in Social Media</source>
          . (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Celli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <article-title>Adaptive Personality Recognition from Text</article-title>
          . Lambert Academic Publishing. Saarbrücken. (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>Costa</surname>, <given-names>P. T.</given-names></string-name> and
          <string-name><surname>McCrae</surname>, <given-names>R. R.</given-names></string-name>
          <article-title>Normal personality assessment in clinical practice: The neo personality inventory</article-title>
          .
          <source>Psychological assessment</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <fpage>5</fpage>
          . (
          <year>1992</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Golbeck</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robles</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edmondson</surname>
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name><surname>Turner</surname> <given-names>K.</given-names></string-name>
          <article-title>Predicting Personality from Twitter</article-title>
          .
          <source>In Proc. of International Conference on Social Computing</source>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gimpel</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mills</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eisenstein</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heilman</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yogatama</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flanigan</surname>
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name><surname>Smith</surname> <given-names>N. A.</given-names></string-name>
          <article-title>Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments</article-title>
          .
          <source>In Proceedings of the Annual Meeting of the Association for Computational Linguistics</source>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>Hall</surname> <given-names>M. A.</given-names></string-name>
          <article-title>Correlation-based Feature Subset Selection for Machine Learning</article-title>
          . Hamilton, New Zealand. (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Jiang</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>X.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zhao</surname>
            <given-names>T</given-names>
          </string-name>
          .
          <article-title>Target-dependent Twitter sentiment classification</article-title>
          .
          <source>In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</source>
          , (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name><surname>Kouloumpis</surname>, <given-names>E.</given-names></string-name>,
          <string-name><surname>Wilson</surname>, <given-names>T.</given-names></string-name>, and
          <string-name><surname>Moore</surname>, <given-names>J.</given-names></string-name>
          <article-title>Twitter sentiment analysis: The Good the Bad and the OMG!</article-title>
          <source>In Proc. of ICWSM</source>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mairesse</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mehl</surname>
            ,
            <given-names>M. R.</given-names>
          </string-name>
          , and
          <string-name><surname>Moore</surname>, <given-names>R. K.</given-names></string-name>
          <article-title>Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text</article-title>
          .
          <source>In Journal of Artificial Intelligence Research</source>
          ,
          <volume>30</volume>
          . (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Maynard</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rout</surname>
            <given-names>D.</given-names>
          </string-name>
          <article-title>Challenges in developing opinion mining tools for social media</article-title>
          .
          <source>In Proceedings of NLP can u tag user generated content</source>
          . (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Oberlander</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Nowson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Whose thumb is it anyway? classifying author personality from weblog text</article-title>
          .
          <source>In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics ACL</source>
          . (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Owoputi</surname>
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gimpel</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>N.A.</given-names>
          </string-name>
          <article-title>Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters</article-title>
          .
          <source>In Proceedings of NAACL</source>
          . (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Pak</surname>
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name><surname>Paroubek</surname> <given-names>P.</given-names></string-name>
          <article-title>Twitter as a Corpus for Sentiment Analysis and Opinion Mining</article-title>
          .
          <source>In Proceedings of LREC</source>
          . (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Pang</surname>
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name><surname>Lee</surname> <given-names>L.</given-names></string-name>
          <article-title>Opinion Mining and Sentiment Analysis</article-title>
          .
          <source>In Foundations and Trends in Information Retrieval</source>
          .
          <volume>2</volume>
          (
          <issue>12</issue>
          ). (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J. W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chung</surname>
            ,
            <given-names>C. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ireland</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzales</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Booth</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          <article-title>The development and psychometric properties of LIWC2007</article-title>
          . Austin, TX, LIWC.Net. (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Quercia</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosinski</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stillwell</surname>
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Crowcroft</surname>
            <given-names>J.</given-names>
          </string-name>
          <article-title>Our Twitter Profiles, Our Selves: Predicting Personality with Twitter</article-title>
          .
          <source>In Proceedings of SocialCom2011</source>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Rammstedt</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>John</surname>
            ,
            <given-names>O. P.</given-names>
          </string-name>
          <article-title>Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German</article-title>
          .
          <source>Journal of Research in Personality</source>
          ,
          <volume>41</volume>
          (
          <issue>1</issue>
          ),
          <fpage>203</fpage>
          -
          <lpage>212</lpage>
          . (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Tausczik</surname>
            ,
            <given-names>Y. R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J. W.</given-names>
          </string-name>
          <article-title>The psychological meaning of words: LIWC and computerized text analysis methods</article-title>
          .
          <source>Journal of Language and Social Psychology</source>
          ,
          <volume>29</volume>
          (
          <issue>1</issue>
          ),
          <fpage>24</fpage>
          -
          <lpage>54</lpage>
          . (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Vinodhini</surname>
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chandrasekaran</surname>
            <given-names>R. M.</given-names>
          </string-name>
          <article-title>Sentiment Analysis and Opinion Mining: A Survey</article-title>
          . In
          <source>International Journal</source>
          .
          <volume>2</volume>
          (
          <issue>6</issue>
          ). (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>Topic sentiment analysis in twitter: a graph-based hashtag sentiment classification approach</article-title>
          .
          <source>In Proceedings of the 20th ACM international conference on Information and knowledge management</source>
          . (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozareva</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenthal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ritter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>SemEval-2013 task 2: Sentiment analysis in twitter</article-title>
          .
          <source>In Proceedings of the International Workshop on Semantic Evaluation, SemEval</source>
          .
          (Vol.
          <volume>13</volume>
          ). (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Witten</surname>
            <given-names>I.H.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Frank</surname>
            <given-names>E.</given-names>
          </string-name>
          <article-title>Data Mining: Practical Machine Learning Tools and Techniques with Java implementations</article-title>
          . Morgan Kaufmann, (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>