=Paper=
{{Paper
|id=Vol-2253/paper04
|storemode=property
|title=Is Big Five Better than MBTI? A Personality Computing Challenge Using Twitter Data
|pdfUrl=https://ceur-ws.org/Vol-2253/paper04.pdf
|volume=Vol-2253
|authors=Fabio Celli,Bruno Lepri
|dblpUrl=https://dblp.org/rec/conf/clic-it/CelliL18
}}
==Is Big Five Better than MBTI? A Personality Computing Challenge Using Twitter Data==
<pdf width="1500px">https://ceur-ws.org/Vol-2253/paper04.pdf</pdf>
<pre>
                         Is Big Five better than MBTI?
              A personality computing challenge using Twitter data

                  Fabio Celli                                       Bruno Lepri
         FBK - MobS and Profilio Company                            FBK - MobS
                  Trento, Italy                                     Trento, Italy
                celli@fbk.eu                                      lepri@fbk.eu


Abstract

English. Personality computing from text has become popular in Natural Language
Processing (NLP). Big5 and MBTI are two popular models for assessing gold-standard
personality types, but there is still no comparison of the two in personality computing.
With this paper, we provide for the first time a comparison of the two models from a
computational perspective. To do that we exploit two multilingual datasets collected from
Twitter in English, Italian, Spanish and Dutch.

Italiano (English translation). Automatic personality recognition has become popular in
the computational linguistics communities. The Big Five and MBTI tests are two different
models for assessing personality, but there is still no real comparison of the two in the
field of automatic personality recognition. In this article we provide for the first time
a comparison of the two models from a computational point of view. To do this we collected
Twitter data in English, Italian, Spanish and Dutch in two parallel corpora annotated with
the two tests.

1   Introduction

The last decade has been characterized by the rise of personality computing in Natural
Language Processing (NLP) (Vinciarelli and Mohammadi, 2014): for example, several works
have dealt with the automatic prediction of personality traits of authors from different
pieces of text they wrote in emails, blogs or social media (Mairesse et al., 2007;
Iacobelli et al., 2011; Schwartz et al., 2013; Rangel Pardo et al., 2015). Personality
computing is also broadening its application to many fields in academia as well as in
industry, including security (Golbeck et al., 2011), human resources (Turban et al.,
2017), advertising (Celli et al., 2017) and deception detection (Fornaciari et al., 2013).
Historically, there are two popular but very different psychological tests to assess
personality: (i) the Big Five (Costa and McCrae, 1985; Costa and McCrae, 2008), which is
widely accepted in academia, and (ii) the Myers Briggs Type Indicator (MBTI) (Myers and
Myers, 2010), which is very popular and widely used in industry. The Big Five model
defines personality along 5 bipolar scales: Extraversion (sociable vs. shy); Emotional
Stability (secure vs. neurotic); Agreeableness (friendly vs. ugly); Conscientiousness
(organized vs. careless); Openness to Experience (insightful vs. unimaginative). In
contrast, the MBTI defines 4 binary classes that combine into 16 personality types:
Extraversion/Introversion, Sensing/Intuition, Perception/Judging, Feeling/Thinking.
Correlation analyses of the personality measures showed that Big Five Extraversion was
correlated with MBTI Extraversion-Introversion, Openness to Experience was correlated with
Sensing-Intuition, Agreeableness with Thinking-Feeling and Conscientiousness with
Judging-Perceiving (Furnham et al., 2003). A reason for the recently gained popularity of
MBTI is the fact that it is easier to collect gold-standard labelled data about MBTI than
about Big Five, as an MBTI type is a 4-letter coding (e.g., INTJ) that could be retrieved
with simple queries. In a field like personality computing, where data is costly and
difficult to collect, this is an enormous advantage.
In this paper we address the question whether it is easier to predict Big Five or MBTI
classes with a machine learning approach. To do so, we collect two Twitter datasets in
English, Italian, Dutch and Spanish, one annotated with the Big Five personality types and
one with MBTI. We believe that this
work will be useful for the scientific community of personality computing to better
understand the heuristic power of the two models when applied to machine learning tasks.
The paper is structured as follows: in the next section we provide an overview of related
works in the field of personality computing in NLP, in Section 3 we describe the datasets
we used, in Section 4 we report the results of our experiments and in Section 5 we draw
some conclusions.

2   Related Work

Brief overview of personality computing. The research in personality computing from text
began more than a decade ago with a few pioneering works recognizing personality traits
(Big Five traits) from blogs (Oberlander and Nowson, 2006) and self presentations
(Mairesse et al., 2007). Other related fields have developed in the same years, like
personality computing from multimodal and social signals, such as recorded meetings
(Pianesi et al., 2008). In that period the research on MBTI was limited to finding
correlates between personality types and behavioral expectations, such as job preference
(Cohen et al., 2013). Thus, MBTI was marginally used for personality computing until 2015
(Luyckx and Daelemans, 2008); while many works demonstrated the validity of Big Five for
the automatic prediction of personality from different sources, including Twitter (Quercia
et al., 2011) (Pratama and Sarno, 2015) (Qiu et al., 2012). The most common features used
by researchers to perform such tasks were extracted from text, such as sentiment (Basile
and Nissim, 2013), Part of Speech (PoS) tags, psycholinguistic tags (LIWC) (Tausczik and
Pennebaker, 2010) and from metadata, such as number of followers, density of subject's
network, hashtags, Likes and profile pictures. The rise of personality computing by means
of the Big Five model brought fruitful collaborations between the communities of computer
science and personality psychology (Back et al., 2010), and very interesting findings came
out: for example that several personal characteristics extracted from social media
profiles such as education, religion, marital status and the number of political
preferences have really high correlations with personality types (Kosinski et al., 2013),
or that popular users in social media are both extroverts and emotionally stable as well
as high in Openness, while influential ones tend to be high in Conscientiousness (Quercia
et al., 2012).

Overview of datasets. The scarcity of data annotated with gold standard personality
labels, difficult and costly to collect, was a major problem and the few large datasets
available (MyPersonality, about 75K users, and Essays, about 2K users) soon became
standard benchmarks (Celli et al., 2013). These available datasets covered mainly English
language, while all the other datasets were much smaller, around 200 or 300 instances. In
this scenario a dataset of 1500 instances collected by means of a simple Twitter search
came out, and it was in English and annotated with MBTI labels (Plank and Hovy, 2015).
This demonstrated that MBTI labels are very common and easy to retrieve from Twitter,
unlike Big Five labels. Soon thereafter, TwiSty came out (Verhoeven et al., 2016), a
multilanguage dataset of 17K instances annotated with MBTI and including Italian, Dutch,
Portuguese, French and Spanish.

State of the art. The MBTI model formalizes personality types as classes, while Big Five
as scores. Despite this, works in computer science and computational linguistics split
between those who use scores (Golbeck et al., 2011) and those who turn Big Five scores
into binary classes in order to have a better control on class distribution and
easier-to-interpret prediction tasks (Mairesse et al., 2007) (Segalin et al., 2017). In
particular, Mairesse et al. obtained an average of 57% accuracy in the prediction of Big
Five classes using the LIWC psycholinguistic features, also reporting that Openness to
Experience was the easiest trait to model. Verhoeven et al. (Verhoeven et al., 2013)
obtained an F-measure of 72% in the prediction of Big Five using trigrams and ensemble
methods in a small Facebook dataset trained on a larger essays dataset. In a following
study, Verhoeven et al. (Verhoeven et al., 2016) obtained an average F-measure of 63.8% in
the prediction of MBTI on Twitter in multiple languages using word and character n-grams.
Again, Farnadi et al. (Farnadi et al., 2013) obtained an average accuracy of 58.6% in
predicting Big Five classes on the same dataset using mostly metadata. Finally, Plank and
Hovy (Plank and Hovy, 2015) used words and Twitter metadata to predict
Extraversion/Introversion and Feeling/Thinking with 72% and 61% accuracy, respectively.
They reported that the best performing features are the linguistic ones.
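
Turning continuous Big Five scores into binary classes, as several of the works above do,
is often done with a median split. The following is a minimal sketch of that idea, not the
procedure of any specific cited work: the function name, the example scores and the rule
for ties at the median are all assumptions made for illustration.

```python
from statistics import median


def median_split(scores):
    """Map continuous trait scores to binary labels around the sample median.

    Scores strictly above the median become "high", all others "low".
    Sending ties at the median to "low" is an assumption of this sketch.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical Extraversion scores on a 1-5 scale (illustrative values only).
extraversion = [2.5, 3.0, 3.5, 4.0, 4.5, 1.0]
labels = median_split(extraversion)
print(labels)
```

With an even number of distinct scores the two classes come out the same size, which is
why a median split gives balanced classes by construction.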
The different settings and datasets used by previous works in the field make it impossible
to compare the results. Here, we aim to fill this gap.

3   Datasets

We collected from Twitter two multilingual datasets, of 900 users each, one annotated with
MBTI and one with Big Five. First we collected the Big Five set by means of queries with
Twitter advanced search [1], retrieving the results of different Big Five tests, ranging
from the short 10-item test to the 44-item test. The languages of the tweets were English,
Italian, Spanish and Dutch, so we replicated the language distribution in the MBTI set
using a portion of TwiSty (Verhoeven et al., 2016) and Plank's corpus (Plank and Hovy,
2015). The details about language distributions are reported in Figure 1.

Figure 1: Distribution of the languages in the two datasets. The x-axis represents the
number of users.

As expected there are many more tweets containing the results of the MBTI with respect to
the Big Five. We use a concatenation of all tweets of a user, and a limit of 40 tweets per
user in order to balance those who have too many tweets and those that have few. In the
end we used two comparable datasets with 900 users each, 265K words in the Big Five one
and 290K words in the MBTI one. The classes are balanced in the Big Five set, as we
obtained them with a median split from the original scores; on the contrary, in the MBTI
set there is a strong imbalance in the distribution of Sensing/Intuition and
Feeling/Thinking, reported also in Plank's corpus. In the experiments, described in the
next section, we balance the classes of both datasets and test different combinations of
the features to evaluate the performance of machine learning algorithms in the prediction
of classes derived from the two different personality models.

[1] https://twitter.com/search-advanced

4   Experiments, Results, Discussion and Limitations

Experimental settings. We compared the performance of algorithms for the prediction of Big
Five and MBTI classes in 9 binary classification tasks. To do so, we used the following
features:
   - Character n-grams (1000 features): we extracted from tweets 1000 character bi-grams
     and tri-grams with a minimum frequency of 3. We did not remove stopwords and
     punctuation;
   - LIWC match ratio (68 features): we computed the ratio of matches of the words in the
     LIWC dictionaries in all the four languages. LIWC provides mapping from words to 68
     psycholinguistic categories, including words about others, self, space, time,
     society, family, friendship, sex, and functional words, among others;
   - Metadata (10 features): this feature set includes the followers/following ratio,
     favorite/tweets ratio, listed/tweets ratio, link color, text color, border color,
     background color, hashtag/words ratio, retweet ratio, whether the profile picture is
     the default one or not.
As feature selection procedure we used a subset selection algorithm (Hall and Smith, 1998)
that reduces the degree of redundancy. We balanced the classes assigning weights to the
instances in the data so that each class has the same total weight.

For the classification we compared SVMs and a meta-classifier that automatically finds the
best performing algorithm for the task (Thornton et al., 2013). As evaluation setting we
used a 10-fold cross validation, as metric we reported accuracy and averages. For the
maximum comparability we also reported the average on the four Big Five traits correlated
with MBTI (avg4): extraversion, openness, agreeableness and conscientiousness.

Results and discussion. Results reported in Table 1 show that, on average, SVMs have
higher performance in the prediction of MBTI classes with respect to Big Five, but there
is much variability in the prediction of Big Five traits. In particular, we obtained very
good performances for Emotional Stability and Agreeableness using an SVM with polynomial
kernel and Random Sub Spaces respectively, but poor with simple SVMs, indicating that the
space is not linearly separable. On the contrary, the predictions of the MBTI seem to be
more stable, in contrast to the results of Plank and Hovy. We suggest that this different
result is due to three factors: class balancing, the use of LIWC and the subset feature
selection. It is interesting to note that the reference to others is the best feature for
the prediction of Big Five Extraversion and first person pronouns for the prediction of
Emotional Stability/Neuroticism. We explain the predictive power of words about death for
Agreeableness and Conscientiousness with the fact that this feature is correlated to the
negative poles of these traits. The presence of different languages might affect
negatively the performance so we ran an experiment using only English (650 users for each
set).
Results, reported in Table 2, show that the effect of language variety is minimal, given
that English is the most represented language in the datasets. It is interesting to note
the changes in the best features: hashtag ratio is in English the best feature for
Extraversion Big Five, while in the previous experiment it was the best feature for
Extraversion MBTI. Here the best feature for Extraversion MBTI is anger, which is a clue
for the negative class of this trait: Introversion. It is also interesting to note that
words about feelings become in English the best feature for Agreeableness, although the
performance decreases a little bit with respect to the experiment with all languages.

  trait     baseline    svm     auto    auto alg.    best feature
  extr.     49.6        61.8    66.4    lr           others
  stab.     49.8        59.6    74.8    svmk         I
  agree.    49.6        61.1    73.3    rss          death
  consc.    49.8        60.3    61.6    sdg          death
  open.     49.6        53.1    59.4    nb           ngrams
  avg4      49.7        59.0    65.1    -            -
  avg       49.7        59.1    67.0    -            -
  E-I       49.5        63.9    64.7    sdg          hashratio
  S-N       49.2        66.3    68.6    bag          negate
  F-T       49.8        63.0    63.0    svm          self
  P-J       49.5        61.7    63.5    nb           self
  avg       49.5        63.7    64.9    -            -

Table 1: Results of the experiments with all the languages and 900 instances per set. Big
Five is in the upper part of the Table and MBTI is below. We report accuracies for Support
Vector Machines (svm) and AutoWeka (auto), a meta-classifier that automatically finds the
best algorithm and settings for the task. The auto meta-classifier used Logistic
Regression (lr), Support Vector Machines with polynomial kernel (svmk), Random Sub Spaces
(rss), Stochastic Gradient Descent Regression (sdg), Naive Bayes (nb) and Bagging (bag).
We also report average accuracy of Big Five traits correlated to MBTI (avg4):
Extraversion, Openness to Experience, Agreeableness and Conscientiousness. The best
features for the predictions are: words about others (others), first person singular
pronoun (I), words about death (death), ngrams (ngrams), words about self (self), negation
words (negate), hashtag ratio (hashratio).

  trait     baseline    svm      best feature
  extr.     49.6        66.1     hashratio
  stab.     49.6        62.9     I
  agree.    49.6        59.7     feel
  consc.    49.4        60.2     ngrams
  open.     49.5        60.3     ngrams
  avg4      49.6        61.5     -
  avg       49.6        61.8     -
  E-I       49.7        61.3     anger
  S-N       48.4        68.5     we
  F-T       49.3        68.6     self
  P-J       49.6        60.2     I
  avg       49.5        64.6     -

Table 2: Results of the experiments with English only and 650 instances per set. Big Five
is in the upper part of the Table and MBTI is below. We report accuracy for the majority
baseline and Support Vector Machines (svm). The best features for the predictions are:
hashtag ratio (hashratio), first person singular pronoun (I), words about feelings (feel),
ngrams (ngrams), words about self (self), negation words (negate), words about anger
(anger), first person plural pronoun (we).

Limitations. In order to compare the two personality models, we forced the Big Five
outcome, originally scores, into classes. This is one of the reasons why it is more
difficult to predict Big Five classes than MBTI, but it is interesting to note that the
performance of some Big Five traits can be boosted using non-linear models. Another
limitation is related to the fact that we collected different users in the two datasets,
with the risk of having some individuals in one dataset or the other that are easier to
classify. In any case, it is impossible to collect data of the same users annotated with
both MBTI and Big Five with Twitter queries; this is something that could be done only
with a costly data collection effort, which we hope future work will undertake.

5   Conclusion

In this paper we provide for the first time a comparison of Big Five and MBTI from a
personality computing perspective. To do so we use two multilingual Twitter datasets, one
annotated with Big Five classes and one with MBTI classes. For the first time, we provide
evidence that algorithms trained on MBTI could have better performances than those trained
on the Big Five, although the Big Five is much more informative and has great variability
in performance depending also on the algorithm used for the prediction. We make available
the files used for the experiments [2], in order to grant the replicability or improvement
of the results.

[2] http://personality.altervista.org/fabio.htm
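
The class balancing used in the experiments (weights assigned to instances so that each
class has the same total weight) can be illustrated with a short sketch. The paper does
not name the implementation it used, so the weighting formula, the function name and the
toy labels below are assumptions chosen for illustration only.

```python
from collections import Counter


def balancing_weights(labels):
    """Return one weight per instance so that every class has the same total weight.

    Assumed scheme: the overall weight (number of instances) is divided equally
    among the classes, and each class shares its part equally among its instances.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return [total / (n_classes * counts[y]) for y in labels]


# Hypothetical imbalanced Sensing/Intuition labels: 8 users "N", 2 users "S".
labels = ["N"] * 8 + ["S"] * 2
weights = balancing_weights(labels)

# Sum the weights per class to check that both classes carry the same total.
per_class = {}
for y, w in zip(labels, weights):
    per_class[y] = per_class.get(y, 0.0) + w
print(per_class)
```

Under this scheme the sum of all weights still equals the number of instances, so the
effective dataset size is unchanged while the minority class counts as much as the
majority class.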
Acknowledgments

The work of Fabio Celli and Bruno Lepri was partly funded by EIT Digital by City Enabler
for Digital Urban Services (CEDUS) and by EIT Distributed Ledger Invoice (DLI).

References

Mitja D Back, Juliane M Stopfer, Simine Vazire, Sam Gaddis, Stefan C Schmukle, Boris
  Egloff, and Samuel D Gosling. 2010. Facebook profiles reflect actual personality, not
  self-idealization. Psychological science.

Valerio Basile and Malvina Nissim. 2013. Sentiment analysis on italian tweets. WASSA 2013,
  page 100.

Fabio Celli, Fabio Pianesi, David Stillwell, and Michal Kosinski. 2013. Workshop on
  computational personality recognition: Shared task. In WCPR in conjunction to ICWSM
  2013.

Fabio Celli, Pietro Zani Massani, and Bruno Lepri. 2017. Profilio: Psychometric profiling
  to boost social media advertising. In Proceedings of the 2017 ACM on Multimedia
  Conference, pages 546–550. ACM.

Yuval Cohen, Hana Ornoy, and Baruch Keren. 2013. Mbti personality types of project
  managers and their success: A field survey. Project Management Journal, 44(3):78–87.

Paul T Costa and Robert R McCrae. 1985. The NEO personality inventory: Manual, form S and
  form R. Psychological Assessment Resources.

Paul T Costa and Robert R McCrae. 2008. The revised neo personality inventory (neo-pi-r).
  In G.J. Boyle, G Matthews and D. Saklofske (Eds.). The SAGE handbook of personality
  theory and assessment, 2:179–198.

Golnoosh Farnadi, Susana Zoghbi, Marie-Francine Moens, and Martine De Cock. 2013.
  Recognising personality traits using facebook status updates. In Proceedings of the
  workshop on computational personality recognition (WCPR13) at the 7th international AAAI
  conference on weblogs and social media (ICWSM13). AAAI.

Tommaso Fornaciari, Fabio Celli, and Massimo Poesio. 2013. The effect of personality type
  on deceptive communication style. In Intelligence and Security Informatics Conference
  (EISIC), 2013 European, pages 1–6. IEEE.

Adrian Furnham, Joanna Moutafi, and John Crump. 2003. The relationship between the revised
  neo-personality inventory and the myers-briggs type indicator. Social Behavior and
  Personality: an international journal, 31(6):577–584.

Jennifer Golbeck, Cristina Robles, Michon Edmondson, and Karen Turner. 2011. Predicting
  personality from twitter. In Privacy, security, risk and trust (passat), 2011 ieee third
  international conference on and 2011 ieee third international conference on social
  computing (socialcom), pages 149–156. IEEE.

Mark A Hall and Lloyd A Smith. 1998. Practical feature subset selection for machine
  learning. Springer.

Francisco Iacobelli, Alastair J Gill, Scott Nowson, and Jon Oberlander. 2011. Large scale
  personality classification of bloggers. In Affective Computing and Intelligent
  Interaction, pages 568–577. Springer.

Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes
  are predictable from digital records of human behavior. Proceedings of the National
  Academy of Sciences, 110(15):5802–5805.

Kim Luyckx and Walter Daelemans. 2008. Personae: a corpus for author and personality
  prediction from text. In Proceedings of the 6th International Conference on Language
  Resources and Evaluation. Marrakech, Morocco: European Language resources Association.

François Mairesse, Marilyn A Walker, Matthias R Mehl, and Roger K Moore. 2007. Using
  linguistic cues for the automatic recognition of personality in conversation and text.
  Journal of Artificial Intelligence Research, 30(1):457–500.

Isabel Briggs Myers and Peter B Myers. 2010. Gifts differing: Understanding personality
  type. Davies-Black Publishing.

Jon Oberlander and Scott Nowson. 2006. Whose thumb is it anyway?: classifying author
  personality from weblog text. In Proceedings of the COLING/ACL on Main conference poster
  sessions, pages 627–634. Association for Computational Linguistics.

Fabio Pianesi, Nadia Mana, Alessandro Cappelletti, Bruno Lepri, and Massimo Zancanaro.
  2008. Multimodal recognition of personality traits in social interactions. In
  Proceedings of the 10th international conference on Multimodal interfaces, pages 53–60.
  ACM.

Barbara Plank and Dirk Hovy. 2015. Personality traits on twitter - or - how to get 1,500
  personality tests in a week. 6TH Workshop on computational approaches to subjectivity,
  sentiment and social media analysis WASSA 2015, page 92.

Bayu Yudha Pratama and Riyanarto Sarno. 2015. Personality classification based on twitter
  text using naive bayes, knn and svm. In Data and Software Engineering (ICoDSE), 2015
  International Conference on, pages 170–174. IEEE.

Lin Qiu, Han Lin, Jonathan Ramsay, and Fang Yang. 2012. You are what you tweet:
  Personality expression and perception on twitter. Journal of Research in Personality,
  46(6):710–718.

Daniele Quercia, Michal Kosinski, David Stillwell, and Jon Crowcroft. 2011. Our twitter
  profiles, our selves: Predicting personality with twitter. In Privacy, security, risk
  and trust (passat), 2011 ieee third international conference on and 2011 ieee third
  international conference on social computing (socialcom), pages 180–185. IEEE.

Daniele Quercia, Renaud Lambiotte, David Stillwell, Michal Kosinski, and Jon Crowcroft.
  2012. The personality of popular facebook users. In Proceedings of the ACM 2012
  conference on Computer Supported Cooperative Work, pages 955–964. ACM.

Francisco Manuel Rangel Pardo, Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, and
  Walter Daelemans. 2015. Overview of the 3rd author profiling task at pan 2015. In
  Cappellato L., Ferro N., Jones G., San Juan E. (Eds.) CLEF 2015 Labs and Workshops,
  Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1391, pages 1–8.

Andrew H Schwartz, Johannes C Eichstaedt, Margaret L Kern, Lukasz Dziurzynski, Stephanie M
  Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin EP
  Seligman, et al. 2013. Personality, gender, and age in the language of social media: The
  open-vocabulary approach. PloS one, 8(9):773–791.

Cristina Segalin, Fabio Celli, Luca Polonio, Michal Kosinski, David Stillwell, Nicu Sebe,
  Marco Cristani, and Bruno Lepri. 2017. What your facebook profile picture reveals about
  your personality. In Proceedings of the 2017 ACM on Multimedia Conference, pages
  460–468. ACM.

Yla R Tausczik and James W Pennebaker. 2010. The psychological meaning of words: Liwc and
  computerized text analysis methods. Journal of Language and Social Psychology,
  29(1):24–54.

Chris Thornton, Frank Hutter, Holger H Hoos, and Kevin Leyton-Brown. 2013. Auto-weka:
  Combined selection and hyperparameter optimization of classification algorithms. In
  Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and
  data mining, pages 847–855. ACM.

Daniel B Turban, Timothy R Moake, Sharon Yu-Hsien Wu, and Yu Ha Cheung. 2017. Linking
  extroversion and proactive personality to career success: The role of mentoring received
  and knowledge. Journal of Career Development, 44(1):20–33.

Ben Verhoeven, Walter Daelemans, and Tom De Smedt. 2013. Ensemble methods for personality
  recognition. In Proc of Workshop on Computational Personality Recognition, AAAI Press,
  Menlo Park, CA, pages 35–38.

Ben Verhoeven, Walter Daelemans, and Barbara Plank. 2016. Twisty: a multilingual twitter
  stylometry corpus for gender and personality profiling. In Proceedings of the 10th
  Annual Conference on Language Resources and Evaluation (LREC 2016)/Calzolari, Nicoletta
  [edit.]; et al., pages 1–6.

Alessandro Vinciarelli and Gelareh Mohammadi. 2014. A survey of personality computing.
  IEEE Transactions on Affective Computing, 5(3):1–1.

</pre>