<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the Celebrity Profiling Task at PAN 2019</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matti Wiegmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bauhaus-Universität Weimar</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>German Aerospace Center</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Leipzig University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Celebrity profiling is author profiling applied to celebrities. The focus on celebrities has several advantages: Celebrities are prolific social media users supplying lots of writing samples, lots of personal details are public knowledge, and they try to build a consistent public persona either themselves or with the help of agents. In addition, a number of demographics apply only to this group of people. In this overview of the first shared task on celebrity profiling at PAN 2019, we survey and evaluate eight submitted models that try to predict the gender, the year of birth, the fame, and the occupation of 48,335 English-speaking celebrities based on text obtained from their Twitter timelines. Anticipating some key results, we can report that the models work well for predicting binary gender and for distinguishing the most famous celebrities from the less famous ones. Also, the occupations sports, politics, and performer are easily identified. The models work less well for the prediction of rare demographics such as non-binary gender and occupations that are not single-topic (e.g., manager, science, and professional). Predicting the year of birth works best for the years between ca. 1980 and 2000 (i.e., ages ca. 20-40), but less well for older celebrities, and not at all for younger ones.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Author profiling aims to correlate writing style with author demographics. It has
applications in marketing, forensic linguistics, psycholinguistics, and the social sciences.
In particular, today’s omnipresence of social media and the resulting availability of text
data from large portions of the population have caused a surge of interest in profiling
technology. On social media, celebrities occupy an exalted position. Rallying up to millions of
followers, they serve as role models to many and exert a direct influence on public
opinion, sometimes for the better, e.g., by lending their voices to the disenfranchised, and
sometimes for the worse. Unsurprisingly, the “rich and famous” are subjects of research
in the social sciences and economics alike, especially with regard to their presence on
social media. The celebrity profiling task at PAN 2019 introduces this population for
the first time to the author profiling community.</p>
      <p>The task was to predict four demographics of celebrities, given their history of
tweets on Twitter:
– Gender as male, female, or, for the first time, non-binary.
– Year of birth within a novel, variable-bucket evaluation scheme.
– Fame as rising, star, or superstar.
– Occupation or “claim to fame,” as sports, performer, creator, politics, manager,
science, professional, or religious.</p>
      <p>The evaluation data for this task was sampled from the Webis Celebrity Profiling
Corpus 2019 [49]: 48,335 Twitter timelines of celebrities with on average 2,181 tweets per
celebrity. The labels gender, year of birth, and occupation were obtained from Wikidata,
the degree of fame was derived from the follower count.</p>
      <p>As a quick overview, 92 teams registered for the task, 12 showed some sign of
activity, e.g., by requesting a virtual machine, and eight made a successful software
submission. Performance was measured using cRank, the harmonic mean of the
macro-averaged multi-class F1 for gender, fame, and occupation, and a leniently calculated F1 for
year of birth. This measure is stricter than average accuracy, since it rewards consistent
results and emphasizes performance on classes reflecting rare demographics. The winning
submission achieved a cRank of 0.593. Most submissions prefer
feature-based machine learning utilizing word-level features over neural approaches, reporting
higher performance of the former in preliminary experiments.</p>
      <p>After reviewing related work, we give a more detailed description of the task, the
construction of the task’s evaluation data, and the reasoning underlying our performance
measures in Section 3. In Section 4, we survey the software submissions, in Section 5,
we report the evaluation results and carry out an in-depth analysis with regard to the
performance of different approaches and individual demographics of the task.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        The study of author profiling techniques has a rich history, with the pioneering works
done by Pennebaker et al. [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], Koppel et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Schler et al. [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ], and Argamon et al.
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], focusing on age, gender, and personality from genres with longer, grammatical
documents such as blogs and essays. Table 1 overviews most of the works done in
author profiling over the past 20 years, reporting on text genre, author count, word
count, and the demographics studied. The most commonly used genre in recent years is
Twitter tweets, first used in 2011 to predict gender [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and age [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Later work also used
Facebook posts [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Reddit [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and Sina Weibo [47]. Recently added demographics
include education [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], ethnicity [46], family status [47], income [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], occupation [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ],
location of origin [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], religion [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], and location of residence [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        At PAN, author profiling has been studied since 2013, covering different
demographics including age and gender [
        <xref ref-type="bibr" rid="ref35 ref36 ref39">36, 35, 39</xref>
        ], personality [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], language variety [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ],
genres including blogs, reviews, and social media posts [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ], and cross-domain
prediction [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. Profiling research related to aspects such as behavioral traits [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], medical
conditions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and native language identification (NLI) has been excluded from our
survey, since these have developed into subfields in their own right.
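      </p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>An asterisk (*) marks an estimation based on an average of 12.7 words per tweet from the reported number of tweets; a ‘?’ indicates unavailable information.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Dataset (word count)</th><th>Demographics</th></tr>
          </thead>
          <tbody>
            <tr><td>31,750*</td><td>Gender</td></tr>
            <tr><td>2,156</td><td>Age, Birthplace, Gender, Education, Extroversion, Nat. lang., Occupation</td></tr>
            <tr><td>145</td><td>Age, Education, Gender, Personality</td></tr>
            <tr><td>15,587*</td><td>Age, Gender, Politics</td></tr>
            <tr><td>23,717*</td><td>Politics</td></tr>
            <tr><td>1,195</td><td>Dialect, Gender</td></tr>
            <tr><td>17,195*</td><td>Education, Residence</td></tr>
            <tr><td>24,861</td><td>Personality (MBTI)</td></tr>
            <tr><td>16,785*</td><td>Age, Education, Gender, Income, Race</td></tr>
            <tr><td>2,178</td><td>Age, Education, Gender, Personality (Big Five), Religion</td></tr>
            <tr><td>1,195</td><td>Gender</td></tr>
            <tr><td>31,011*</td><td>Personality (Big Five)</td></tr>
            <tr><td>10k</td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>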
The task’s goal was to evaluate technology to predict the four demographics gender,
year of birth, degree of fame, and occupation of a celebrity from their history of tweets
on Twitter. Participants were given a large training dataset comprising 33,836 celebrities
with up to 3,200 tweets each, and submissions were evaluated on a test dataset
comprising 14,499 celebrities using our TIRA evaluation service [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. Performance was
evaluated using a combination of the multi-class F1-scores of each demographic.
The data used for this task was sampled from the Webis Celebrity Profiling
Corpus 2019 [49], which links the Twitter accounts of celebrities with their corresponding
Wikidata entries. A celebrity in this corpus is defined as a person who has a verified
Twitter account and who is notable as per Wikipedia’s notability criteria. Given the
list of all verified Twitter accounts, they were heuristically linked to their respective
Wikidata entries by matching Twitter’s free-form name and the “@”-handle with
Wikidata’s item name, omitting ambiguous matches, non-person, and memorial accounts.
An evaluation of the matching heuristic revealed a very low error rate of 0.6%. In
total, the corpus contains 71,706 celebrities as Twitter-Wikidata matches, where Wikidata
supplies 239 different demographics, albeit sparsely distributed. For each celebrity, we
crawled all available tweets from their timelines,<fn><label>1</label><p>Twitter’s API limits access to only the latest 3,200 tweets. According to the total number of tweets noted in the respective user profile, this is the complete timeline in 98.05% of cases.</p></fn> and then filtered out all celebrities
who declared a non-English profile language<fn><label>2</label><p>Note that the dataset still contains some bilingual and non-English tweeting celebrities.</p></fn> or were born before 1940, as well as all
tweets that did not contain mainly text. Finally, to compile the evaluation data for our
task, we sampled all celebrities for the most widely available demographics, namely
gender, occupation, and year of birth. Figure 1 shows histograms for each demographic
in the sample of the corpus used for this task.<fn><label>3</label><p>The high number of celebrities for year of birth 2000 is an error in Wikidata that we noticed only at the time of writing. We removed them in our subsequent analyses.</p></fn> Altogether, the evaluation data comprises
48,335 celebrities with an average of 2,181 tweets each.
      </p>
      <p>[Figure 2: Number of celebrities per number of followers (logarithmic axis from 10 to 100M followers), overlaid with a log-normal distribution.]</p>
      <p>Wikidata employs a high number of labels for certain demographics. To render the
prediction tasks feasible, we simplified the labels as follows:
– Gender. From the eight different gender-related Wikidata labels, we kept male and
female and merged the remaining six into non-binary.
– Fame. To determine the degree of fame, we calculated the distribution of follower
counts shown in Figure 2, overlaid a matching log-normal distribution, and used its
standard deviation to separate the three classes rising (less than 1,000 followers),
star (more than 1,000 and less than 100,000 followers), and superstar (more than
100,000 followers).<fn><label>4</label><p>We attribute the gap under the left half of the log-normal distribution curve to the fact that rising celebrities are less likely to possess a verified Twitter account and are thus missing from our corpus.</p></fn>
– Occupation. The 1,379 different occupations were grouped into eight classes:
1. Sports for occupations participating in professional sports, primarily athletes.
2. Performer for creative activities primarily involving a performance like acting,
entertainment, TV hosts, and musicians.
3. Creator for creative activities with a focus on creating a work or piece of art, for
example, writers, journalists, designers, composers, producers, and architects.
4. Politics for politicians and political advocates, lobbyists, and activists.
5. Manager for executives in companies and organizations.
6. Science for people working in science and education.
7. Professional for specialist professions like cooks and plumbers.
8. Religious for professions in the name of a religion.
We arrived at these groups of celebrity occupations by reconstructing the graph
induced by Wikidata’s “subclass of” property, connecting all occupations in the
corpus. By manually analyzing the graph, we identified the most reasonable sub-structures of
closely connected professions.
– Age. Unlike the profiling literature on age prediction, we did not define a static set
of age groups, but used the year of birth between 1940 and 2012 as extracted from
Wikidata’s “date of birth” property.</p>
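      <p>The fame classes described above can be sketched as a simple threshold function. This is only an illustration: the function name is hypothetical, and since the text does not say how exactly 1,000 or exactly 100,000 followers are treated, the sketch assumes that these boundary values fall into the higher class.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def fame_class(followers):
    """Map a follower count to one of the three fame classes."""
    # Assumption: the boundary values 1,000 and 100,000 themselves
    # are assigned to the higher class; the text leaves this open.
    if followers < 1_000:
        return "rising"
    if followers < 100_000:
        return "star"
    return "superstar"
```
]]></preformat>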
      <p>The different demographics in the dataset are not entirely independent. While the
correlations of some class combinations, like year of birth and fame, or gender and fame,
are insignificant, others have notable dependencies: Figure 4 in Appendix B shows that
there is a clear imbalance between gender and occupation, and occupation and year
of birth. Female celebrities tend to be younger and are more likely to have a performing or
creator occupation, while male celebrities strongly tend to be famous for sports when
young, and for politics and religion otherwise. Celebrities working in performing
occupations like acting or music tend to be more famous than others.</p>
      <p>We split the sampled data 70:30 into a training dataset of 33,836 celebrities and a
test dataset of 14,499 celebrities (test dataset 1); from the latter we sub-sampled another
small-scale test dataset of 956 authors (test dataset 2).</p>
      <sec id="sec-2-1">
        <title>Performance Measures</title>
        <p>In previous years at PAN, the performance of author profiling approaches has been
measured as the average of the accuracies measured for each demographic in question.
This measure is unfit for celebrity profiling, since the demographics are imbalanced
and some have many classes. To measure participant performance, we instead average
the per-demographic performance using the harmonic mean, promoting a consistent
performance across demographics:
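          <disp-formula><tex-math>\mathrm{cRank} = \frac{4}{\frac{1}{F_{1,\mathrm{fame}}} + \frac{1}{F_{1,\mathrm{occupation}}} + \frac{1}{F_{1,\mathrm{gender}}} + \frac{1}{F_{1,\mathrm{birthyear}}}}.</tex-math></disp-formula></p>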
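        <p>As a minimal sketch of the harmonic mean above, the following function combines four per-demographic F1 scores into cRank. The function name and the example scores in the usage note are hypothetical, and returning zero when any score is zero is an assumption made here to avoid a division by zero.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def c_rank(f1_fame, f1_occupation, f1_gender, f1_birthyear):
    """Harmonic mean of the four per-demographic F1 scores."""
    scores = [f1_fame, f1_occupation, f1_gender, f1_birthyear]
    # Assumption: a zero score on any demographic yields a cRank of zero,
    # which is also the limit of the harmonic mean as a score approaches zero.
    if min(scores) == 0:
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)
```
]]></preformat>
        <p>With three perfect scores and one score of 0.5, this sketch gives 4 / (1 + 1 + 1 + 2) = 0.8, whereas the arithmetic mean would be 0.875; a single weak demographic thus lowers cRank more strongly than it lowers a plain average.</p>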
        <p>Let T denote the set of class labels of a given demographic (e.g., gender), where
t ∈ T is a given class label (e.g., female). The prediction performance for T ∈ {gender,
fame, occupation} is measured using the macro-averaged multi-class F1-score. This
measure averages the harmonic mean of precision and recall over all classes of a
demographic, weighting each class equally and thus promoting correct predictions of small classes:
          <disp-formula><tex-math>F_{1,T} = \frac{2}{|T|} \sum_{t_i \in T} \frac{\mathrm{precision}(t_i) \cdot \mathrm{recall}(t_i)}{\mathrm{precision}(t_i) + \mathrm{recall}(t_i)}.</tex-math></disp-formula></p>
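        <p>The macro-averaged F1 can be sketched in a few lines of plain Python. This is an illustrative implementation rather than the evaluation code used for the task; the function name is hypothetical, and counting a class with undefined precision or recall as an F1 of zero is an assumption.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    total = 0.0
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Assumption: undefined precision or recall counts as zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            total += 2 * precision * recall / (precision + recall)
    # Every class contributes equally, regardless of its size.
    return total / len(classes)
```
]]></preformat>
        <p>For a toy example with three male, two female, and one non-binary author, where one male and the non-binary author are predicted as female, the accuracy is 4/6, but the macro-averaged F1 is only about 0.49, because the missed non-binary class contributes an F1 of zero with the same weight as the two larger classes.</p>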
        <p>We also apply this measure to evaluate the prediction performance for the
demographic T = year of birth, but change the computation of true positives: we count a
predicted year as correct if it is within an m-window of the true year, where m increases
linearly from 2 to 9 years with the true age of the celebrity in question:
          <disp-formula><tex-math>m = (-0.1 \cdot \mathrm{truth} + 202.8).</tex-math></disp-formula>
This way of measuring the prediction performance for the age demographic addresses
a shortcoming of the fixed-age-interval scheme: Defining strict age intervals (i.e.,
10-20 years, 20-30, etc.) overly penalizes small prediction errors made at the interval
boundaries, such as predicting an age of 21 instead of 20. Furthermore, we decided
against combining precise predictions with an error function like mean squared error,
since we presume that age prediction gets more difficult with increasing age as people
mature and their writing style presumably changes more slowly over the years.</p>
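        <p>The lenient matching for year of birth can be illustrated as follows. The window formula is taken from the text; the function names are hypothetical, and two details are assumptions of this sketch: the window is treated as symmetric (a prediction counts as correct if |predicted − truth| ≤ m), and m is not rounded.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def window(truth_year):
    """Tolerance m in years for a given true year of birth."""
    return -0.1 * truth_year + 202.8


def is_hit(predicted_year, truth_year):
    """Check whether a predicted year counts as a true positive."""
    # Assumption: symmetric window around the true year, m left unrounded.
    return abs(predicted_year - truth_year) <= window(truth_year)
```
]]></preformat>
        <p>Under these assumptions, a celebrity born in 1940 has a tolerance of about 8.8 years and one born in 2008 a tolerance of about 2 years, so an error of five years is accepted for the former but rejected for the latter.</p>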
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Survey of the Submitted Approaches</title>
      <p>
        Eight participants submitted software to this task, six of whom also submitted
notebooks describing their approach. Five of these six approaches are based on traditional
feature engineering, and three also report negative experiments with deep learning
models, whereas only Pelzer [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] employed a neural language model (ULMFiT). The most
popular algorithm choices are logistic regression and support vector machines (SVM);
the most popular features are exclusively based on content, and only
Moreno-Sandoval et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] also added grammatical and custom features. To cope with the
small classes in the gender and occupation demographics, two participants resorted to
oversampling the classes during training, one to downsampling, and one applied class
weighting. Three participants grouped the years of birth into eight maximum-sized
intervals and predicted these instead. The most popular preprocessing steps are the
replacement or removal of hyperlinks, mentions, hashtags, and emojis, while stop words
and punctuation are rarely touched. Below, each approach is described in more detail.
      </p>
      <p>
        Radivchev et al. [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] use support vector machines to predict fame and occupation
and logistic regression to predict year of birth and gender, using tf-idf vectors of the
10,000 most frequent bigrams of 500 randomly selected tweets per celebrity as
features. The authors determined class priors to cope with small classes in gender and
occupation prediction and grouped the years of birth into eight intervals, reversing the
window function used for performance measurement. Tweets are preprocessed by
removing retweets and all symbols except letters, numbers, @’s, and #’s, replacing
hyperlinks with &lt;url&gt; and mentions with &lt;user&gt;, collapsing spaces, and adding a &lt;sep&gt;
token at the end of each tweet. The optimal configuration of learning algorithms for
each demographic was determined via grid search over several hyperparameter settings
for both the SVM and logistic regression. The authors tried multiple alternative
approaches, reporting sub-par results for preserving retweets and replacing emojis with
&lt;emoji&gt; during preprocessing, using character 3-grams and 4-grams as features, and
employing multi-layered perceptrons or a deep pyramid CNN on GloVe embeddings.
      </p>
      <p>
        Moreno-Sandoval et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] use logistic regression to predict fame, gender, and
year of birth, and a multinomial naive Bayes model to predict occupation, using
n-gram features with a minimum frequency of 9 for gender, 6 for year of birth, 3 for
occupation, and none for fame, as well as the following features: the average number of emojis,
hashtags, mentions, hyperlinks, retweets, and words per tweet, the average word length, the lexical diversity,
the kurtosis and skew of word length and word count, respectively, and the number of
tweets written in each of the grammatical persons: the first, second, and third person
singular and the first and third person plural. Years of birth are combined into eight
larger intervals and oversampled. Texts were preprocessed for fame, gender,
and year of birth by replacing hashtags, mentions, hyperlinks, and emojis
with special tokens. The model configurations described above were obtained by
testing several combinations of (1) the five algorithms naive Bayes, Gaussian naive Bayes,
complement naive Bayes, logistic regression, and random forest, (2) whether to
apply preprocessing, (3) whether to apply oversampling, and (4) whether to include the additional features.
      </p>
      <p>
        Martinc et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] use logistic regression for all four demographics, with tf-idf
vectors of word unigrams, word-bounded character trigrams, and 4-character suffix
trigrams of the first 100 tweets per timeline as features. The suffix trigrams were based on
the 10%-80% most frequent words and were weighted with 0.8, the character trigrams
4-80% with 0.4-weighting, and the word unigrams 10-80% with 0.8-weighting. No
resampling was applied and all years were predicted without regrouping. The text for both
trigram features was preprocessed by replacing hashtags, mentions, and hyperlinks with
special tokens and the text for the word unigrams by additionally removing all
punctuation and stop words. The authors determined the logistic regression algorithm to
be optimal after performing a grid search over different hyperparameter combinations
of linear SVMs, SVMs with RBF kernel, logistic regression, random forest, and
gradient boosting classifiers. Experiments with BERT-based fine-tuning approaches were
reported as non-competitive.
      </p>
      <p>
        Asif et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] utilize one model for each combination of the four demographics
and the 50 languages the authors detected in the dataset, using the most discriminative
words as features. To determine the best learning algorithm for each combination, the
authors selected the best-performing one after testing support vector machines, logistic
regression, decision trees, Gaussian naive Bayes, random forests, and k-nearest
neighbor classifiers. The most discriminative word features for each demographic were
determined by aggregating word counts for all users of one class, normalizing these counts
by the frequency of the class, and summing the pairwise intra-class distance in relative
frequencies. This calculation results in a ranking of words for each demographic,
indicating which words are more frequently used by members of one class compared to
members of all other classes, where the occurrences of the highest-ranking words were
used as features. All tweets are preprocessed by removing hyperlinks, punctuation, stop
words, numbers, alphanumeric words, escape characters, #’s, and @’s.
      </p>
      <p>
        Petrik and Chuda [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] use multiple random forest classifiers with 200 decision trees
based on the tf-idf vectors of the top 5000 1-, 2-, and 3-grams. To train the models,
the authors used the synthetic minority oversampling technique in combination with
Tomek links to balance the examples for each class. The timeline text is preprocessed
by removing mentions and stop words, collapsing letter repetitions, and replacing
hyperlinks and emojis with special tokens. Additionally, the authors report on experiments
with RCNNs, which did not deliver promising results and were hence discarded.
      </p>
      <p>
        Pelzer [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] applies a transfer learning strategy by training a ULMFiT instance on
the celebrity timelines. The classifiers constructed from this instance predict a class
for every tweet in a given timeline and use the majority of all per-tweet predictions to
infer the celebrity’s demographic. The author further refined the model by regrouping
the year of birth into fewer classes and downsampling the examples of all demographics
to get a more balanced training dataset. The author reports slow prediction times of
8 minutes per celebrity; this approach was only evaluated on the second, small-scale
test dataset.
      </p>
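      <p>The aggregation of per-tweet predictions into one label per celebrity, as described for the ULMFiT-based approach, amounts to a majority vote. The following sketch is not taken from the submission; the function name is hypothetical, and resolving ties in favor of the label that occurs first in the timeline is an assumption.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def majority_label(per_tweet_predictions):
    """Return the label predicted most often across a timeline's tweets."""
    if not per_tweet_predictions:
        raise ValueError("timeline has no predictions")
    # most_common(1) returns a list with one (label, count) pair.
    # Assumption: ties go to the label encountered first.
    return Counter(per_tweet_predictions).most_common(1)[0][0]
```
]]></preformat>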
      <p>
        Baselines. Since this is the first edition of the task, we did not resort to providing
or reimplementing other unproven models as baselines. Instead, we created three sets
of random predictions to compare participant predictions against: (1) baseline-uniform
randomly draws from a uniform distribution of all classes and reflects the data-agnostic
lower bound, (2) baseline-rand randomly selects a class according to the prior
likelihood of appearance in the test dataset, and (3) baseline-mv always predicts the majority
class of the test dataset.
      </p>
      <p>
        Table 2a shows the performance of the eight participants who submitted software
to the celebrity profiling task, ranked by the cRank score. The winning approach by
Radivchev et al. [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] achieves 0.593 on the first and 0.559 on the second test dataset,
closely followed by Moreno-Sandoval et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] with 0.505 on the first, and Pelzer
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] with 0.499 on the second test dataset. All submitted approaches beat the baselines,
most by a significant margin. The performance measured for our two test datasets is
quite similar, comparing participants who submitted runs for both. The scores are less
varied on the second test dataset: The leading participant’s performance is lower, and
Petrik and Chuda’s approach improves slightly, overtaking that of Fernquist as fourth
in the ranking. These differences can be attributed to the smaller size of the second test
dataset and less to the fact that the second dataset contains exclusively English tweets.
To verify this claim, we compiled an English-only version of the first test dataset, and
the results, shown in Table 4, Appendix A, are nearly identical for all participants.
      </p>
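      <p>The three random baselines introduced earlier in this section (baseline-uniform, baseline-rand, and baseline-mv) can be sketched as follows. The function names and signatures are hypothetical and this is not the code used to produce the reported baseline scores; as in the description, the prior and the majority class are taken from the test labels.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import Counter


def baseline_uniform(classes, n, rng):
    """Draw n predictions uniformly from all classes."""
    return [rng.choice(classes) for _ in range(n)]


def baseline_rand(test_labels, n, rng):
    """Draw n predictions according to the class prior in the test labels."""
    counts = Counter(test_labels)
    labels = list(counts)
    weights = [counts[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n)


def baseline_mv(test_labels, n):
    """Always predict the majority class of the test labels."""
    majority = Counter(test_labels).most_common(1)[0][0]
    return [majority] * n
```
]]></preformat>
      <p>Passing a seeded generator such as random.Random(0) as rng makes the two random baselines reproducible across runs.</p>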
      <p>Table 2b shows the accuracies for all submitted approaches, allowing for a
comparison of the general, unweighted correctness of class predictions with the cRank measure.
Accuracies are generally higher for all participants, a natural consequence of the
imbalanced dataset and the existence of small classes. This can be seen by comparing the
results of the baseline-mv, which is almost competitive under accuracy but irrelevant
under cRank. The differences in the per-demographic performance can be explained
further by inspecting the class-wise F1 shown in Table 3. An important observation is
that the top three approaches succeed more frequently in predicting small classes
correctly, greatly benefiting cRank without notably impacting accuracy. We assume that
the good performance on small classes is due to downsampling and the class
weighting applied by the top two approaches, whereas models without these strategies mostly
fit toward the majority classes. Overfitting toward the majority class is also the likely
explanation for the difference in ranking between accuracy and cRank.</p>
      <p>Gender. Predicting the binary sex of an author is a widely studied benchmark task
for author profiling approaches. All participants achieved a respectable accuracy in
predicting celebrity gender, frequently surpassing 0.9 accuracy, while F1 scores are near
the 0.6-0.7 range. Table 3 and Table 5, Appendix A, show the class-wise F1 for all
demographics, which explain the achieved performance values on gender prediction: The
best approaches are best at predicting non-binary gender, while binary gender
classification is close to fit for practical use. Interestingly, the averaged confusion matrix for
gender in Figure 3 shows that non-binary celebrities are mostly misclassified as female,
which cannot be explained by imbalanced data and thus warrants further research.</p>
      <p>Year of Birth. Our approach to age prediction, departing from fixed-size intervals to
a lenient evaluation of year of birth prediction, notably impacts participant performance.
Some models reduce the difficulty of the task by reconstructing intervals and using
classification algorithms with notably better performance than the alternatively used
strategy of predicting each year individually. No submission tries to solve the prediction
with regression algorithms. The confusion matrices for the winning model exemplified
in Figure 5, Appendix B, illustrate the difficulty of predicting the year of birth, with that
approach especially struggling to separate celebrities born before the 1980s. This is a
well-known difficulty and has been addressed in our evaluation by being more forgiving
on older celebrities.</p>
      <p>Fame. The degree of fame is a particularly imbalanced demographic, as reflected in the
accuracy, where only four participants could beat the baseline-mv on the first test dataset
and only three on the second. In contrast, participants are much better at
separating classes correctly, as shown by their F1 scores, although there is a trend toward the
majority class, as can be seen in the confusion matrix in Figure 3. We cannot claim that
this task is solved but we have shown that both the most and least famous celebrities
can partially be distinguished by their writing.</p>
      <p>Occupation. As with the other demographics, occupation was predicted far
better than the baselines by all participants and the results were highly influenced by the
performance on small classes, although not exclusively. All models work better on
occupations with a clear topic, like performer containing actors and musicians, sports,
and politics. For occupations that cover multiple topics, like creator, manager,
professional, and science, all models are rather weak while still beating the baselines. Ignoring
the trend toward majority classes, the confusion matrix for the winning approach in
Figure 5, Appendix B, and the averaged one in Figure 3 both show that science is frequently
confused with politics and creator, religious with creator, and creator with performer.</p>
      <sec id="sec-3-1">
        <title>Discussion</title>
        <p>In general, all submitted approaches work better the more examples there are and the
more clearly classes can be separated by topic. The final ranking was influenced the
most by the resampling strategies to avoid fitting to majority classes and the addition of
grammatical or stylistic features to avoid the misclassification of occupations without a
coherent common topic. From a classification perspective, we see the most potential for
improvement in using all available text data to build celebrity representations, instead
of just excerpts, while still excelling at finding small classes, for example, by using few-shot
models like prototypical or highway networks. From an author profiling perspective,
much is still unclear about the expression of fame, non-binary gender, and non-topical
occupation groups. The best algorithms can partially separate these demographics and
errors are systematic rather than random, but a more fundamental understanding of
differences in writing is necessary to make progress.</p>
        <p>[Figure 3: Averaged confusion matrices for occupation, gender, and fame.]</p>
        <p>Although we are satisfied with the results of the celebrity profiling task and the
insights gained, we see some opportunities to refine our task setup. For the next iteration,
we will consider narrowing the range of years of birth to 1940-2000, omitting
occupations religious and professional, and revising the fame boundaries. The existence of
several small classes turned out to be the major challenge of this task. We see this as an
important aspect of author profiling and especially forensics, since correctly identifying
rare demographics is most desirable in practice. A certain degree of class imbalance is
hence necessary, although the degree of imbalance across all four demographics impeded a
reliable evaluation and kept participants from focusing on the small classes in particular.
To improve the general robustness and ease of use of our dataset, we will remove all
non-English tweets and celebrities supplying too little text.</p>
        <p>Besides the prediction of small classes, year of birth prediction has been a major
factor influencing algorithm performance. The intention behind our approach was to
overcome the inherent weakness of interval-based age prediction and to provide an
incentive to participants to develop more fine-grained predictions. Participants did not
take up this incentive; instead, they simply defined their own interval-based classes
derived from our scoring formula. For the next iteration, we will consider a distance-based
performance scoring for year of birth prediction.</p>
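        <p>One possible form of a distance-based score is sketched below in Python. It is not the
measure adopted for the task: the function names, the linear decay, and the default tolerance of
10 years are hypothetical choices for illustration. The score is 1 for an exact prediction and
falls linearly to 0 as the absolute error in years approaches the tolerance.</p>

```python
def distance_score(true_year: int, predicted_year: int, tolerance: int = 10) -> float:
    """Hypothetical distance-based score in [0, 1] for one year-of-birth prediction.

    The score decays linearly with the absolute error in years and is 0
    once the error reaches the (assumed) tolerance.
    """
    if not tolerance > 0:
        raise ValueError("tolerance must be positive")
    error = abs(true_year - predicted_year)
    return max(0.0, 1.0 - error / tolerance)


def mean_distance_score(true_years, predicted_years, tolerance: int = 10) -> float:
    """Average the per-celebrity scores over a collection of predictions."""
    pairs = list(zip(true_years, predicted_years))
    if not pairs:
        raise ValueError("at least one prediction is required")
    total = sum(distance_score(t, p, tolerance) for t, p in pairs)
    return total / len(pairs)
```

        <p>Unlike a hit-or-miss interval check, such a score rewards a prediction that is off by
two years more than one that is off by eight, so binning years into coarse classes no longer
yields the same credit as a fine-grained estimate.</p>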
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Outlook</title>
      <p>In the celebrity profiling task at PAN 2019, we invited participants to predict the
demographics gender, year of birth, fame, and occupation of 48,335 Twitter timelines of
celebrities. Eight participants submitted models and six submitted notebooks
describing their approach. Participants found traditional machine learning on content-based
features to be most reliable, where the best-performing models added some style-based
features and resampled the training examples to compensate for class imbalance. Although
a lot of progress has been made in this task, several open challenges remain: (1) a
reliable prediction of rare demographics, like non-binary gender, very young
celebrities born after 2000, and “rising” stars, (2) the prediction of occupations without clear
topical separation, like professional, manager, science, and creator, and (3) the
discrimination of authors born before 1980.</p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <p>We want to thank our participants for their effort and dedication and the CLEF
organizers for hosting PAN and the celebrity profiling task.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Tables</title>
      <p>[Figure panel: Fame over Occupation. Recoverable labels: fame (star, superstar); gender (male, female, nonbinary); occupation (performer, creator, sports, manager, politics, science, professional, religious).]</p>
      <sec id="sec-5-1">
        <title>Gender over Birthyear</title>
        <p>[Figure panel: gender (male, female, nonbinary) plotted over year of birth; axis ticks include 1960 and 1980.]</p>
      </sec>
      <sec id="sec-5-2">
        <title>Occupation over Birthyear</title>
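        <p>[Figure panel: occupation plotted over year of birth; axis ticks include 1940 and 2000.]</p>
        <p>[Figures: plots with the axis label "Predicted Class". Recoverable labels: gender (female, male, nonbinary); fame (superstar, star, rising); occupation (sports, performer, creator, politics, manager, science, professional, religious); birthyear.]</p>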
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Automatically Profiling the Author of an Anonymous Text</article-title>
          .
          <source>Commun. ACM</source>
          <volume>52</volume>
          (
          <issue>2</issue>
          ),
          <fpage>119</fpage>
          -
          <lpage>123</lpage>
          (
          <year>Feb 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Asif</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shahzad</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramzan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Najib</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Word Distance Approach for Celebrity profiling-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Bergsma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Post</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yarowsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Stylometric analysis of scientific articles</article-title>
          .
          <source>In: HLT-NAACL</source>
          . pp.
          <fpage>327</fpage>
          -
          <lpage>337</lpage>
          .
          The Association for Computational Linguistics (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Burger</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henderson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zarrella</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Discriminating Gender on Twitter</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>1301</fpage>
          -
          <lpage>1309</lpage>
          . ACM (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Losada</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (eds.):
          <article-title>CLEF 2019 Labs and Workshops, Notebook Papers</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS.org (Sep
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Carmona</surname>
            ,
            <given-names>M.Á.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guzmán-Falcón</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Escalante</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pineda</surname>
            ,
            <given-names>L.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reyes-Meza</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sulayes</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          :
          <article-title>Overview of MEX-A3T at ibereval 2018: Authorship and aggressiveness analysis in mexican spanish tweets</article-title>
          .
          <source>In: IberEval@SEPLN. CEUR Workshop Proceedings</source>
          , vol.
          <volume>2150</volume>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>96</lpage>
          . CEUR-WS.org (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Choudhury</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gamon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Counts</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horvitz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Predicting depression via social media</article-title>
          .
          <source>In: ICWSM</source>
          . The AAAI Press (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Ciot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sonderegger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruths</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Gender inference of twitter users in non-english contexts</article-title>
          .
          <source>In: EMNLP</source>
          . pp.
          <fpage>1136</fpage>
          -
          <lpage>1145</lpage>
          . ACL (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Emmery</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chrupala</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Simple queries as distant labels for predicting gender on twitter</article-title>
          .
          <source>In: NUT@EMNLP</source>
          . pp.
          <fpage>50</fpage>
          -
          <lpage>55</lpage>
          . Association for Computational Linguistics (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Estival</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaustad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchinson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Author profiling for english emails</article-title>
          (12
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Estival</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaustad</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchinson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>TAT: an author profiling tool with application to arabic emails</article-title>
          .
          <source>In: ALTA</source>
          . pp.
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
          . Australasian Language Technology Association (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Fatima</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anwar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nawab</surname>
            ,
            <given-names>R.M.A.</given-names>
          </string-name>
          :
          <article-title>Multilingual author profiling on facebook</article-title>
          .
          <source>Inf. Process. Manage</source>
          .
          <volume>53</volume>
          (
          <issue>4</issue>
          ),
          <fpage>886</fpage>
          -
          <lpage>904</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Gjurkovic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snajder</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Reddit: A gold mine for personality prediction</article-title>
          .
          <source>In: PEOPLES@NAACL-HTL</source>
          . pp.
          <fpage>87</fpage>
          -
          <lpage>97</lpage>
          . Association for Computational Linguistics (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Kapociute-Dzikiene</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Utka</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarkute</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Authorship attribution and author profiling of lithuanian literary texts</article-title>
          .
          <source>In: BSNLP@RANLP</source>
          . pp.
          <fpage>96</fpage>
          -
          <lpage>105</lpage>
          . INCOMA Ltd. Shoumen, Bulgaria (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shimoni</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Automatically Categorizing Written Texts by Author Gender</article-title>
          .
          <source>Literary and Linguistic Computing</source>
          <volume>17</volume>
          (
          <issue>4</issue>
          ),
          <fpage>401</fpage>
          -
          <lpage>412</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reganti</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhatia</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maheshwari</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Aggression-annotated corpus of hindi-english code-mixed data</article-title>
          .
          <source>In: LREC. European Language Resources Association (ELRA)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Litvinova</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seredin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litvinova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zagorovskaya</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Differences in type-token ratio and part-of-speech frequencies in male and female russian written texts</article-title>
          .
          <source>In: Proceedings of the Workshop on Stylistic Variation</source>
          . pp.
          <fpage>69</fpage>
          -
          <lpage>73</lpage>
          . Association for Computational Linguistics (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Martinc</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Škrlj</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pollak</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Who is hot and who is not? Profiling celebs on Twitter-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Mikros</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Authorship Attribution and Gender Identification in Greek Blogs</article-title>
          .
          <source>In: Selected papers of the VIIIth International Conference on Quantitative Linguistics (QUALICO)</source>
          . pp.
          <fpage>21</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Moreno-Sandoval</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Puertas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaza-del-Arco</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pomares-Quimbaya</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alvarado-Valencia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ureña-López</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Celebrity Profiling on Twitter using Sociolinguistic Features-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosé</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Author Age Prediction from Text Using Linear Regression</article-title>
          .
          <source>In: Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities</source>
          . pp.
          <fpage>115</fpage>
          -
          <lpage>123</lpage>
          . ACM (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Peersman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Vaerenbergh</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Predicting Age and Gender in Online Social Networks</article-title>
          .
          <source>In: Proceedings of the 3rd international workshop on Search and mining user-generated contents</source>
          . pp.
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          . SMUC '11, ACM, New York, NY, USA (
          <year>2011</year>
          ), http://doi.acm.org/10.1145/2065023.2065035
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Pelzer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Celebrity Profiling with Transfer Learning-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehl</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Niederhoffer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Psychological aspects of natural language use: Our words, our selves</article-title>
          .
          <source>Annual Review of Psychology</source>
          <volume>54</volume>
          ,
          <fpage>547</fpage>
          -
          <lpage>577</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Petrik</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chuda</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Twitter feeds profiling with TF-IDF-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Plank</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Personality traits on twitter - or - how to get 1,500 personality tests in a week</article-title>
          .
          <source>In: WASSA@EMNLP</source>
          . pp.
          <fpage>92</fpage>
          -
          <lpage>98</lpage>
          . The Association for Computer Linguistics (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiegmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>TIRA Integrated Research Architecture</article-title>
          . In:
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (eds.)
          <source>Information Retrieval Evaluation in a Changing World - Lessons Learned from 20 Years of CLEF</source>
          . Springer (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Preotiuc-Pietro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lampos</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aletras</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>An analysis of the user occupational class through twitter content</article-title>
          .
          <source>In: ACL (1)</source>
          . pp.
          <fpage>1754</fpage>
          -
          <lpage>1764</lpage>
          . The Association for Computer Linguistics (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Preotiuc-Pietro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hopkins</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          :
          <article-title>Beyond binary labels: Political ideology prediction of twitter users</article-title>
          .
          <source>In: ACL (1)</source>
          . pp.
          <fpage>729</fpage>
          -
          <lpage>740</lpage>
          . Association for Computational Linguistics (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Preotiuc-Pietro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          :
          <article-title>User-level race and ethnicity predictors from twitter text</article-title>
          . In:
          <string-name>
            <surname>Bender</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Derczynski</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isabelle</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the 27th International Conference on Computational Linguistics</source>
          , COLING 2018, Santa Fe, New Mexico, USA, August 20-26, 2018. pp.
          <fpage>1534</fpage>
          -
          <lpage>1545</lpage>
          . Association for Computational Linguistics (
          <year>2018</year>
          ), https://aclanthology.info/papers/C18-1130/c18-1130
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Radivchev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Celebrity Profiling using TF-IDF, Logistic Regression, and SVM-Notebook for PAN at CLEF 2019</article-title>
          . In: [5]
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Ramos</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neto</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>B.B.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monteiro</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paraboni</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dias</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Building a corpus for personality-dependent natural language understanding and generation</article-title>
          . In:
          <source>LREC</source>
          . European Language Resources Association (ELRA) (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Celli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Overview of the 3rd Author Profiling Task at PAN 2015</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>San Juan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2015 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 8-11 September, Toulouse, France. CEUR Workshop Proceedings, CEUR-WS.org (Sep
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 6th Author Profiling Task at PAN 2018: Cross-domain Authorship Attribution and Style Change Detection</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nie</surname>
            ,
            <given-names>J.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soulier</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2018 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 10-14 September, Avignon, France. CEUR Workshop Proceedings, CEUR-WS.org (Sep
          <year>2018</year>
          ), http://ceur-ws.org/Vol-2125/
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chugur</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trenkmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Overview of the 2nd Author Profiling Task at PAN 2014</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halvey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kraaij</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2014 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 15-18 September, Sheffield, UK. CEUR Workshop Proceedings, CEUR-WS.org (Sep
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inches</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Overview of the Author Profiling Task at PAN 2013</article-title>
          . In:
          <string-name>
            <surname>Forner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tufis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2013 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 23-26 September, Valencia, Spain. CEUR-WS.org (Sep
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2017 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 11-14 September, Dublin, Ireland. CEUR Workshop Proceedings, CEUR-WS.org (Sep
          <year>2017</year>
          ), http://ceur-ws.org/Vol-1866/
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (eds.)
          <source>Working Notes Papers of the CLEF 2017 Evaluation Labs</source>
          . CEUR Workshop Proceedings, vol.
          <volume>1866</volume>
          . CLEF and CEUR-WS.org (Sep
          <year>2017</year>
          ), http://ceur-ws.org/Vol-1866/
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations</article-title>
          . In:
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2016 Evaluation Labs and Workshop - Working Notes Papers</source>
          , 5-8 September, Évora, Portugal. CEUR Workshop Proceedings, CEUR-WS.org (Sep
          <year>2016</year>
          ), http://ceur-ws.org/Vol-1609/16090750.pdf
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Rosenthal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McKeown</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Age prediction in blogs: A study of style, content, and online behavior in pre- and post-social media generations</article-title>
          . In:
          <source>ACL</source>
          . pp.
          <fpage>763</fpage>
          -
          <lpage>772</lpage>
          . The Association for Computer Linguistics (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Schler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Effects of Age and Gender on Blogging</article-title>
          . In: AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs. pp.
          <fpage>199</fpage>
          -
          <lpage>205</lpage>
          . AAAI (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <surname>Schwartz</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eichstaedt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kern</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dziurzynski</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramones</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kosinski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stillwell</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seligman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ungar</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach</article-title>
          . In:
          <source>PLoS ONE</source>
          <volume>8</volume>
          (
          <issue>9</issue>
          ): e73791 (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <surname>Tighe</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          :
          <article-title>Modeling personality traits of filipino twitter users</article-title>
          . In:
          <source>PEOPLES@NAACL-HLT</source>
          . pp.
          <fpage>112</fpage>
          -
          <lpage>122</lpage>
          . Association for Computational Linguistics (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Clips stylometry investigation (CSI) corpus: A dutch corpus for the detection of age, gender, personality, sentiment and deception in text</article-title>
          . In:
          <string-name>
            <surname>Calzolari</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choukri</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Declerck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loftsson</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maegaard</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mariani</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Moreno,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>