<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jakab Buda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flora Bolonyai</string-name>
          <email>f.bolonyai@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eötvös Loránd University</institution>
          ,
          <addr-line>Budapest</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>In this notebook, we summarize how we prepared our software for the PAN 2020 Profiling Fake News Spreaders on Twitter task. Our final software is a stacking ensemble classifier of five machine learning models: four of them use word n-grams as features, while the fifth is based on statistical features extracted from the Twitter feeds. The software we uploaded to the TIRA platform achieved an accuracy of 75% in English and 80.5% in Spanish. Our overall accuracy of 77.75% tied for first place in the competition.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The aim of the PAN 2020 Profiling Fake News Spreaders on Twitter task [
        <xref ref-type="bibr" rid="ref9">12</xref>
        ] was
to investigate whether the author of a given Twitter feed is likely to spread fake news.
The training and test sets of the task consisted of English and Spanish Twitter feeds
[
        <xref ref-type="bibr" rid="ref10">13</xref>
        ].
      </p>
      <p>We used an ensemble of different machine learning models to provide a prediction
for each user. Each sub-model treats the Twitter feed of a user as a single unit and
estimates the probability that the user is a fake news spreader.
For the final predictions, these probabilities are combined using a logistic regression.</p>
      <p>In Section 2 we present some related works on profiling fake news spreaders. In
Section 3 we describe our approach in detail together with the extracted features and
models. In Section 4 we present our results. In Section 5 we discuss some potential
future work and in Section 6 we conclude our notebook.</p>
    </sec>
    <sec>
      <title>2 Related Works</title>
      <p>
        Using word n-gram variables for author profiling has been shown to be effective
[
        <xref ref-type="bibr" rid="ref11 ref12 ref15 ref6">3, 5, 9, 14, 15, 18</xref>
        ], especially with TF-IDF weighting [
        <xref ref-type="bibr" rid="ref17">20</xref>
        ]. Identifying fake news
based on such features has been tested earlier [1]. Statistical features, such as the
number of punctuation marks [
        <xref ref-type="bibr" rid="ref12 ref16">15, 19</xref>
        ], medium-specific symbols (for example,
hashtags and at signs in tweets, or links in digital texts) [
        <xref ref-type="bibr" rid="ref11 ref12 ref14 ref16">7, 8, 14, 15, 17, 19</xref>
        ], emoticons
[
        <xref ref-type="bibr" rid="ref11 ref13 ref16">7, 8, 14, 16, 19</xref>
        ] or stylistic features [8] are also commonly used for text classification
purposes.
      </p>
      <p>
        SVMs [
        <xref ref-type="bibr" rid="ref11 ref12 ref6">3, 5, 9, 14, 15</xref>
        ], XGBoost [
        <xref ref-type="bibr" rid="ref18">21</xref>
        ], logistic regression [
        <xref ref-type="bibr" rid="ref16">19</xref>
        ] and random forest
[2] models are commonly used for author profiling and text classification purposes.
Although the state-of-the-art results for many text classification tasks are achieved
with transformer-based language models [
        <xref ref-type="bibr" rid="ref8">4, 11</xref>
        ], these models are computationally very
expensive and tend to perform better on tasks where text semantics is more
important. Ghanem et al. proposed an emotionally infused LSTM model to detect
false information in social media and news articles. Their model yielded
state-of-the-art results on three datasets, but it is also computationally expensive [6], so
experimenting with lighter approaches still has practical benefits.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3 Our Approach</title>
      <sec id="sec-2-1">
        <title>3.1 The corpus and the environment setup</title>
        <sec id="sec-2-1-1">
          <title>3.1.1 The corpus</title>
          <p>
            The corpus for the PAN 2020 Profiling Fake News Spreaders on Twitter task [
            <xref ref-type="bibr" rid="ref9">12</xref>
            ]
consists of one English and one Spanish corpus, each containing 300 XML files. Each
of these files contains 100 tweets from one author. Because of the moderate size of the
corpus, we wanted to avoid splitting it into a training and a development set.
Therefore, we used cross-validation to limit overfitting. In contrast to
earlier editions of the PAN competition, the dataset this year came pre-cleaned: all
URLs, hashtags and user mentions in the tweets were replaced with standardized tokens.
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>3.1.2 Environment setup</title>
          <p>We developed our software in Python (version 3.7). To build our
models we mainly used the following packages: scikit-learn (https://scikit-learn.org/),
xgboost (https://xgboost.readthedocs.io/), emoji (https://pypi.org/project/emoji/),
lexical-diversity (https://pypi.org/project/lexical-diversity/),
pandas (https://pandas.pydata.org/) and numpy (https://numpy.org/).
Our code is available on GitHub (https://github.com/pan-webis-de/bolonyai20).</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>3.2 Our models</title>
        <sec id="sec-2-2-1">
          <title>3.2.1 N-gram models</title>
          <p>We experimented with a number of machine learning models based on word
n-grams extracted from the text. Specifically, we investigated the performance of
regularized logistic regressions (LR), random forests (RF), XGBoost classifiers
(XGB) and linear support vector machines (SVM). For all four models, we ran an
extensive grid search combined with five-fold cross-validation to find the optimal text
preparation method, vectorization technique and model hyperparameters. We tested the
same parameter values for the English and Spanish data. We investigated two text
cleaning methods for all models. The first method (M1) removed all
non-alphanumeric characters (except #) from the text, while the second method (M2)
removed most non-alphanumeric characters (except #) but kept emoticons and emojis.
Both methods converted the text to lower case. Regarding the vectorization of the
corpus, we tested different word n-gram ranges (unigrams only, bigrams only, and
unigrams together with bigrams) and different thresholds for the minimum overall
document frequency of the word n-grams
(3, 4, 5, 6, 7, 8, 9, 10) included as features. Table 1 lists the model
hyperparameter values tested during the training phase.</p>
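          <p>The search described above can be sketched as follows for the logistic regression model. This is a minimal illustration under stated assumptions, not necessarily the code used for the submission: TF-IDF weighting, whitespace tokenization, the helper names (clean_m1, clean_m2, build_search), the short emoticon list and the use of the Unicode "Symbol, other" category as a stand-in for emoji detection are all assumptions made for the example.</p>
          <preformatted>
```python
import unicodedata

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical emoticon list for illustration only; far from complete.
EMOTICONS = {":)", ":(", ":D", ";)", ":P"}


def _clean(text, keep_emojis):
    """Lower-case the text and blank out every character that is not kept."""
    tokens = []
    for token in text.split():
        if keep_emojis and token in EMOTICONS:
            tokens.append(token)  # keep the emoticon untouched
            continue
        tokens.append("".join(
            ch if ch.isalnum() or ch == "#"
            or (keep_emojis and unicodedata.category(ch) == "So") else " "
            for ch in token.lower()
        ))
    # Collapse the runs of spaces left behind by removed characters.
    return " ".join(" ".join(tokens).split())


def clean_m1(text):
    """M1: drop every non-alphanumeric character except '#'."""
    return _clean(text, keep_emojis=False)


def clean_m2(text):
    """M2: like M1, but keep emoticons and emojis (approximation)."""
    return _clean(text, keep_emojis=True)


def build_search():
    """Grid search over cleaning method, n-gram range, min_df and C."""
    pipeline = Pipeline([
        # Split on whitespace so that '#' tokens and emojis survive.
        ("tfidf", TfidfVectorizer(tokenizer=str.split, token_pattern=None)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    param_grid = {
        "tfidf__preprocessor": [clean_m1, clean_m2],
        "tfidf__ngram_range": [(1, 1), (2, 2), (1, 2)],
        "tfidf__min_df": [3, 4, 5, 6, 7, 8, 9, 10],
        "clf__C": [0.1, 1, 10, 100, 1000],
    }
    return GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
```
          </preformatted>
          <p>Calling build_search().fit(feeds, labels), where feeds holds one concatenated string per author, would expose the selected combination in best_params_. Searching over the cleaning function as just another grid parameter keeps text preparation inside the cross-validation loop.</p>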
          <p>
            For the early bird testing phase conducted through TIRA [
            <xref ref-type="bibr" rid="ref7">10</xref>
            ], we simply chose the
model and parameter combination in each language that had the highest accuracy
during the cross-validation and fitted these models on the entire training set.
However, the accuracy of our models was approximately 5 percentage points lower on the test set than in cross-validation.
          </p>
        </sec>
      </sec>
      <sec id="sec-2-3">
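        <p>In the early bird test, accuracy was 79% on the test set versus 83% in
cross-validation for the Spanish dataset, and 69% versus 76% for the English dataset,
so we used a different approach during the final testing phase.</p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Hyperparameter values tested for each n-gram model during the training phase.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Model</th><th>Name (Python parameter name)</th><th>Values</th></tr>
            </thead>
            <tbody>
              <tr><td>LR</td><td>Regularization coefficient (C)</td><td>{0.1, 1, 10, 100, 1000}</td></tr>
              <tr><td>RF</td><td>Number of boosting rounds (B)</td><td>{100, 300, 400}</td></tr>
              <tr><td>RF</td><td>Minimum number of cases on each leaf (min_samples_leaf)</td><td>{5, 6, 7, 8, 9, 10}</td></tr>
              <tr><td>SVM</td><td>Regularization coefficient (C)</td><td>{1, 10, 100, 1000}</td></tr>
              <tr><td>XGB</td><td>Learning rate (eta)</td><td>{0.01, 0.1, 0.3}</td></tr>
              <tr><td>XGB</td><td>Number of estimators (n_estimators)</td><td>{200, 300}</td></tr>
              <tr><td>XGB</td><td>Maximum depth of a tree (max_depth)</td><td>{3, 4, 5, 6}</td></tr>
              <tr><td>XGB</td><td>Subsample ratio (subsample)</td><td>{0.6, 0.7, 0.8}</td></tr>
              <tr><td>XGB</td><td>Subsample ratio of columns (colsample_bytree)</td><td>{0.5, 0.6, 0.7}</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-2-3-1">
          <table-wrap>
            <label>Table 2</label>
            <caption>
              <p>Best text cleaning method and n-gram range for each model and language.</p>
            </caption>
            <table>
              <thead>
                <tr><th>Language</th><th>Model</th><th>Text cleaning</th><th>N-grams</th></tr>
              </thead>
              <tbody>
                <tr><td>EN</td><td>LR</td><td>M1</td><td>uni- and bigrams</td></tr>
                <tr><td>EN</td><td>RF</td><td>M2</td><td>uni- and bigrams</td></tr>
                <tr><td>EN</td><td>SVM</td><td>M1</td><td>uni- and bigrams</td></tr>
                <tr><td>EN</td><td>XGB</td><td>M1</td><td>uni- and bigrams</td></tr>
                <tr><td>ES</td><td>LR</td><td>M1</td><td>bigrams</td></tr>
                <tr><td>ES</td><td>RF</td><td>M1</td><td>uni- and bigrams</td></tr>
                <tr><td>ES</td><td>SVM</td><td>M1</td><td>bigrams</td></tr>
                <tr><td>ES</td><td>XGB</td><td>M1</td><td>uni- and bigrams</td></tr>
              </tbody>
            </table>
          </table-wrap>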
          <p>The ensemble method we used for the final version of our software (described in
detail in Section 3.2.3) required the best text cleaning and vectorization parameters
and hyperparameters for each model. These are summarized in Table 2, where
parameter names follow the relevant Python package or function (see Table 1 for detailed descriptions).</p>
        </sec>
        <sec id="sec-2-3-2">
          <title>3.2.2 User-wise statistical model</title>
          <p>Apart from the n-gram based models, we constructed a model based on statistical
variables describing all 100 tweets of each author, thus giving one more
prediction per author. The variables used in this model are as follows:
            <list list-type="bullet">
              <list-item><p>the mean length of the author's 100 tweets, both in words and in characters;</p></list-item>
              <list-item><p>the minimum length of the author's 100 tweets, both in words and in characters;</p></list-item>
              <list-item><p>the maximum length of the author's 100 tweets, both in words and in characters;</p></list-item>
              <list-item><p>the standard deviation of the length of the author's 100 tweets, both in words and in characters;</p></list-item>
              <list-item><p>the range of the length of the author's 100 tweets, both in words and in characters;</p></list-item>
              <list-item><p>the number of retweets by the author in the dataset;</p></list-item>
              <list-item><p>the number of URL links by the author in the dataset;</p></list-item>
              <list-item><p>the number of hashtags by the author in the dataset;</p></list-item>
              <list-item><p>the number of mentions by the author in the dataset;</p></list-item>
              <list-item><p>the number of emojis by the author in the dataset;</p></list-item>
              <list-item><p>the number of the author's 100 tweets that end with an ellipsis;</p></list-item>
              <list-item><p>a stylistic feature, the type-token ratio (TTR), which measures the lexical diversity of the author (since each author has 100 tweets, the number of tokens per author does not vary enough to cause large differences in the TTRs by itself).</p></list-item>
            </list>
          </p>
          <p>This gives a total of 17 statistical variables. Since we used an XGBoost classifier,
which is tree-based and therefore insensitive to the scale of its inputs, we did not
normalize the variables, and the linear correlation between the variables
posed no problem.</p>
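          <p>The 17 variables listed above could be computed along the following lines. This is a minimal sketch using only the standard library, not necessarily the authors' implementation. The function name author_features, the "RT" prefix test for retweets, the token strings #URL#, #HASHTAG# and #USER#, the population standard deviation and the Unicode-category emoji count (instead of the emoji package) are assumptions made for the example.</p>
          <preformatted>
```python
import statistics
import unicodedata


def author_features(tweets):
    """Return 17 user-wise statistics for one author's list of tweets."""
    words = [len(t.split()) for t in tweets]
    chars = [len(t) for t in tweets]
    tokens = [tok.lower() for t in tweets for tok in t.split()]
    features = {}
    # Five length statistics, each in words and in characters (10 values).
    for name, lengths in (("words", words), ("chars", chars)):
        features["mean_" + name] = statistics.mean(lengths)
        features["min_" + name] = min(lengths)
        features["max_" + name] = max(lengths)
        features["std_" + name] = statistics.pstdev(lengths)
        features["range_" + name] = max(lengths) - min(lengths)
    # Assumed conventions: retweets start with "RT", and the pre-cleaned
    # corpus marks links, hashtags and mentions with the tokens below.
    features["retweets"] = sum(t.startswith("RT ") for t in tweets)
    features["urls"] = sum(t.count("#URL#") for t in tweets)
    features["hashtags"] = sum(t.count("#HASHTAG#") for t in tweets)
    features["mentions"] = sum(t.count("#USER#") for t in tweets)
    # Rough emoji count: characters in the Unicode "Symbol, other" category.
    features["emojis"] = sum(
        unicodedata.category(ch) == "So" for t in tweets for ch in t
    )
    # Tweets ending in three dots or in the single ellipsis character.
    features["end_ellipses"] = sum(
        t.rstrip().endswith(("...", "\u2026")) for t in tweets
    )
    # Type-token ratio over all tokens of the author.
    features["type_token_ratio"] = len(set(tokens)) / len(tokens)
    return features
```
          </preformatted>
          <p>Applying author_features to each author's tweets yields one row per author, which can be passed to a gradient-boosted tree classifier without scaling.</p>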
          <p>To find the best hyperparameter set, we used a five-fold cross-validated grid search
and then refitted the best model on the entire training set. The cross-validated accuracies
achieved this way were 70% and 74% for the English and Spanish data, respectively.
Table 3 contains the best hyperparameters found.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
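        <table-wrap>
          <label>Table 3</label>
          <caption>
            <p>Best hyperparameters found for the XGBoost model based on statistical features.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Parameter name</th><th>EN</th><th>ES</th></tr>
            </thead>
            <tbody>
              <tr><td>Column sample by node</td><td>0.9</td><td>0.8</td></tr>
              <tr><td>Column sample by tree</td><td>0.2</td><td>0.8</td></tr>
              <tr><td>gamma</td><td>1</td><td>4</td></tr>
              <tr><td>Learning rate</td><td>2</td><td>0.3</td></tr>
              <tr><td>Max depth</td><td>2</td><td>3</td></tr>
              <tr><td>Min child weight</td><td>4</td><td>5</td></tr>
              <tr><td>Number of estimators</td><td>200</td><td>100</td></tr>
              <tr><td>alpha</td><td>0.1</td><td>0.3</td></tr>
              <tr><td>Subsample</td><td>0.8</td><td>0.8</td></tr>
            </tbody>
          </table>
        </table-wrap>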
        <sec id="sec-2-4-1">
          <title>3.2.3 Stacking ensemble</title>
          <p>After identifying the best hyperparameters for the five models with
cross-validation, we had to find a reliable ensemble method. To avoid overfitting the
ensemble to the training set, we did not train it on the predictions of the five
final models, which had already seen every training user. Instead, we created a dataset
of out-of-fold predictions. We refitted the five
sub-models with the cross-validated hyperparameters five times, each time on a different chunk of the
original training data consisting of the tweets of 240 users. The predictions
given by these refitted models to the 60 remaining users were appended to the training
data of the ensemble model. This training set therefore contained predictions for all
300 users in the training data, but for each model type these predictions came from five different
fitted models, none of which had seen the user it predicted. The sample created this way can be interpreted as
an approximation of a sample from the distribution of the predictions of the final five
models on the test set. We created a test set for the ensemble with the same method but with a different
split of the training data.</p>
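          <p>The construction of the ensemble training and test sets can be sketched with out-of-fold predictions. The example below is a hedged illustration rather than the submitted code: the synthetic data from make_classification stands in for the real features, only three scikit-learn sub-models are used (XGBoost is left out to keep the example free of extra dependencies), stratified folds are an assumption, and the helper name out_of_fold_probabilities is hypothetical.</p>
          <preformatted>
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC


def out_of_fold_probabilities(models, X, y, seed):
    """One column per model: probabilities predicted for held-out users."""
    # Five folds of 300 users: fit on 240, predict the remaining 60.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    columns = [
        cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
        for model in models
    ]
    return np.column_stack(columns)


# Synthetic stand-in for 300 authors (assumption, not the PAN corpus).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
sub_models = [
    LogisticRegression(C=1.0, max_iter=1000),
    RandomForestClassifier(n_estimators=100, min_samples_leaf=5,
                           random_state=0),
    SVC(kernel="linear", probability=True, random_state=0),
]

# Training set for the ensemble, and a test set from a different split.
meta_train = out_of_fold_probabilities(sub_models, X, y, seed=1)
meta_test = out_of_fold_probabilities(sub_models, X, y, seed=2)

# Logistic regression on the sub-model probabilities as the ensemble.
ensemble = LogisticRegression().fit(meta_train, y)
accuracy = ensemble.score(meta_test, y)
```
          </preformatted>
          <p>Because every row of meta_train comes from a model that did not see that user during fitting, the ensemble is trained on predictions that resemble those the final models produce on unseen users.</p>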
          <p>We then used these constructed training and test sets to find the best ensemble
from the following three methods: majority voting, linear regression of predicted
probabilities (this includes the simple mean), and a logistic regression model. The
best and most reliable results were given by the logistic model; therefore, we used this
model as our final ensemble method. Table 4 summarizes the logistic regression
coefficients for the probabilistic predictions of each model for both languages.</p>
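          <p>To illustrate why the choice of combination rule matters, the following minimal sketch contrasts majority voting with the simple mean of the predicted probabilities for a single user. The probability values and the function names are hypothetical and chosen only so that the two rules disagree.</p>
          <preformatted>
```python
def majority_vote(probabilities, threshold=0.5):
    """Predict 1 when more than half of the sub-models exceed the threshold."""
    votes = sum(p > threshold for p in probabilities)
    return int(2 * votes > len(probabilities))


def mean_probability(probabilities, threshold=0.5):
    """Predict 1 when the average probability exceeds the threshold."""
    return int(sum(probabilities) / len(probabilities) > threshold)


# Hypothetical sub-model outputs for one user (LR, RF, SVM, XGB, statistical).
user = [0.9, 0.45, 0.45, 0.4, 0.95]
vote_label = majority_vote(user)     # only two of the five models vote 1
mean_label = mean_probability(user)  # the mean probability is about 0.63
```
          </preformatted>
          <p>Here the vote yields 0 while the mean yields 1, because voting discards how confident each sub-model is. A logistic regression on the probabilities additionally learns a separate weight for each sub-model.</p>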
          <p>The validity of this method is backed by the fact that our results on the training sets
(an accuracy of 75% and 81% for the English and Spanish set respectively) were only
slightly better than the final test results.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 Results</title>
      <p>As mentioned in Section 3, we tested two versions of our software. For the early
bird testing, we used the single best n-gram model for each language based on our cross-validated grid
search (a random forest classifier for the English set and a support vector machine
classifier for the Spanish set). With these models, accuracy on the test set was noticeably
lower than the cross-validated accuracy, which was one of the reasons why we decided
to combine several models in our final software. As Table 5 shows, relying on several
models and a statistically based ensemble method proved to be a good
solution. First, the cross-validated accuracies of our final models were almost the
same as their accuracies on the test set, and second, our final software
reached a higher accuracy in both languages than our early bird solution.</p>
    </sec>
    <sec id="sec-4">
      <title>5 Future Work</title>
      <p>One question left unanswered by this project is why
our models are better at identifying fake news spreaders
who tweet in Spanish. This holds for all of our individual models, regardless of the
features they used, and for the final ensemble model as well. We assume that it
would be beneficial to conduct qualitative research on the tweets in the
dataset to better understand why fake news spreaders who tweet in Spanish are more
distinguishable from regular users than those who tweet in English.</p>
      <p>
        Another promising direction for achieving higher accuracy in profiling fake news
spreaders is to develop software that can determine whether a single tweet
should be considered fake news. It is reasonable to assume that even users
labelled as fake news spreaders post only some tweets that can be considered fake
news, while the rest of their posts are regular tweets. Therefore, from the
perspective of our approach, the current dataset is likely to contain a lot of noise. If
we were able to identify fake news at the level of tweets, we could build a model
that uses this information to give predictions for each tweet.
This approach was unfortunately not feasible with the PAN20 Fake News
Spreaders dataset [
        <xref ref-type="bibr" rid="ref10">13</xref>
        ], as it did not provide labels for single tweets and,
additionally, all URL links, hashtags and user mentions, which could have provided
valuable clues about the credibility of a tweet, were replaced by standardized tokens
in the text. Moreover, even if we had access to these tweets in their original form,
manual labeling would be a tedious process even for the “small” dataset of 300 users.
Nevertheless, it would be interesting to investigate how software that can decide
whether a single tweet is fake news would perform in this task.
      </p>
    </sec>
    <sec id="sec-5">
      <title>6 Conclusion</title>
      <p>
        In this notebook, we summarized how we prepared our software for the
PAN 2020 Profiling Fake News Spreaders on Twitter task [
        <xref ref-type="bibr" rid="ref9">12</xref>
        ]. Initially, we looked
at a number of machine learning models using n-grams as features. To find the best
parameters for the models, we conducted an extensive grid search combined with
cross-validation. After finding the models achieving the highest accuracy during the
cross-validation, we fitted these on the entire training set. However, we realized
during the early bird testing phase that this approach resulted in a noticeably lower
accuracy on the test set than in cross-validation. Therefore, for our
final software, we created a stacking
ensemble of five sub-models. Four of these sub-models (a logistic regression, a
support vector machine classifier, a random forest classifier and an XGBoost
classifier) used word n-grams as features, while the fifth model (another XGBoost
model) used statistical features extracted from the Twitter feed. For each sub-model,
we used grid search and cross-validation to find the best performing parameters and
fitted the models on the entire training data with these parameters. To get a final
prediction for each user, we trained a logistic regression that used the probabilistic
predictions of the sub-models as features. Using the ensemble model, we
achieved nearly the same accuracy on the test set as during the cross-validation process.
Overall, our final software identified fake news spreaders with 75%
accuracy among users who tweet in English, and with 80.5% accuracy among users
who tweet in Spanish. Our overall accuracy of 77.75% tied for the highest
performance in the competition.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Ahmed</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traore</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques</article-title>
          . In:
          <string-name><surname>Traore</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Woungang</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Awad</surname>, <given-names>A.</given-names></string-name>
          (eds.)
          <source>Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. ISDDC 2017. Lecture Notes in Computer Science</source>
          , vol
          <volume>10618</volume>
          . Springer, Cham (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><surname>Aravantinou</surname>, <given-names>C.</given-names></string-name>,
          <string-name><surname>Simaki</surname>, <given-names>V.</given-names></string-name>,
          <string-name><surname>Mporas</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Megalooikonomou</surname>, <given-names>V.</given-names></string-name>:
          <article-title>Gender Classification of Web Authors Using Feature Selection and Language Models</article-title>.
          In: <source>Speech and Computer. Lecture Notes in Computer Science</source>, pp.
          <fpage>226</fpage>-<lpage>33</lpage> (<year>2015</year>)
        </mixed-citation>
      </ref>
      <ref>
        <mixed-citation>
          <string-name><surname>Boulis</surname>, <given-names>C.</given-names></string-name>,
          <string-name><surname>Ostendorf</surname>, <given-names>M.</given-names></string-name>:
          <article-title>A quantitative analysis of lexical differences between genders in telephone conversations</article-title>.
          In: <source>Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL '05</source>.
          Morristown, NJ, USA: Association for Computational Linguistics, pp.
          <fpage>435</fpage>-<lpage>442</lpage> (<year>2005</year>)
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name><surname>Devlin</surname>, <given-names>J.</given-names></string-name>,
          <string-name><surname>Chang</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Lee</surname>, <given-names>K.</given-names></string-name>,
          <string-name><surname>Toutanova</surname>, <given-names>K.</given-names></string-name>:
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>.
          <source>NAACL-HLT</source> (<year>2019</year>)
        </mixed-citation>
      </ref>
      <ref>
        <mixed-citation>
          <string-name><surname>Garera</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Yarowsky</surname>, <given-names>D.</given-names></string-name>:
          <article-title>Modeling Latent Biographic Attributes in Conversational Genres</article-title>.
          In: <source>Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP</source>, pp.
          <fpage>710</fpage>-<lpage>718</lpage> (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>An Emotional Analysis of False Information in Social Media and News Articles</article-title>
          .
          <source>In: ACM Transactions on Internet Technology (TOIT)</source>
          vol.
          <volume>20</volume>
          no.
          <issue>2</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name><surname>Gonzalez-Gallardo</surname>, <given-names>C. E.</given-names></string-name>,
          <string-name><surname>Torres-Moreno</surname>, <given-names>J. M.</given-names></string-name>,
          <string-name><surname>Rendon</surname>, <given-names>A. M.</given-names></string-name>,
          <string-name><surname>Sierra</surname>, <given-names>G.</given-names></string-name>:
          <article-title>Efficient social network multilingual classification using character, POS n-grams and Dynamic Normalization</article-title>.
          In: <source>IC3K 2016 - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management</source>.
          SciTePress, pp.
          <fpage>307</fpage>-<lpage>314</lpage> (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref>
        <mixed-citation>
          <string-name><surname>Marquardt</surname>, <given-names>J.</given-names></string-name>,
          <string-name><surname>Farnadi</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Vasudevan</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Moens</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Davalos</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Teredesai</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>De Cock</surname>, <given-names>M.</given-names></string-name>:
          <article-title>Age and gender identification in social media</article-title>.
          In: <source>CLEF 2014 working notes</source>, pp.
          <fpage>1129</fpage>-<lpage>1136</lpage> (<year>2014</year>)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          9.
          <string-name>
            <surname>Peersman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name><surname>Daelemans</surname>, <given-names>W.</given-names></string-name>,
          <string-name><surname>Van Vaerenbergh</surname>, <given-names>L.</given-names></string-name>:
          <article-title>Predicting Age and Gender in Online Social Networks</article-title>.
          <source>In: Proceedings of the 3rd International Workshop on Search and Mining User-Generated Contents</source>
          . New York, NY, USA: Association for Computing Machinery, pp.
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          10.
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiegmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>TIRA Integrated Research Architecture</article-title>
          . In:
          <string-name><surname>Ferro</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Peters</surname>, <given-names>C.</given-names></string-name>
          (eds.)
          <source>Information Retrieval Evaluation in a Changing World - Lessons Learned from 20 Years of CLEF</source>
          . Springer (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          11.
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name><surname>Amodei</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Sutskever</surname>, <given-names>I.</given-names></string-name>
          :
          <article-title>Language Models are Unsupervised Multitask Learners</article-title>
          . OpenAI, San Francisco, CA, (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          12.
          <string-name>
            <surname>Rangel</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giachanou</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            <given-names>P</given-names>
          </string-name>
          .
          <article-title>Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter</article-title>
          . In:
          <string-name><given-names>L.</given-names> <surname>Cappellato</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Eickhoff</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Ferro</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Névéol</surname></string-name>
          (eds.)
          <source>CLEF 2020 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings</source>
          . CEUR-WS.org
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Profiling Fake News Spreaders on Twitter [Data set]</article-title>
          .
          <source>Zenodo</source>
          . http://doi.org/10.5281/zenodo.3692319 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yarowsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shreevats</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Classifying latent user attributes in Twitter</article-title>
          . In:
          <source>SMUC '10: Proceedings of the 2nd International Workshop on Search and Mining User-Generated Contents</source>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          15.
          <string-name>
            <surname>Santosh</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bansal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shekhar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varma</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Author Profiling: Predicting Age and Gender from Blogs. Notebook for PAN at CLEF 2013</article-title>
          . In: Working Notes for CLEF 2013 Conference. (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sboev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litvinova</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voronina</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gudovskikh</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rybka</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Deep Learning Network Models to Categorize Texts According to Author's Gender and to Identify Text Sentiment</article-title>
          . In:
          <source>Proceedings - 2016 International Conference on Computational Science and Computational Intelligence, CSCI 2016</source>
          . Institute of Electrical and Electronics Engineers Inc., pp.
          <fpage>1101</fpage>
          -
          <lpage>1106</lpage>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          17.
          <string-name>
            <surname>Schler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Effects of Age and Gender on Blogging</article-title>
          . In:
          <source>AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs</source>
          . American Association for Artificial Intelligence (AAAI)
          , pp.
          <fpage>199</fpage>
          -
          <lpage>205</lpage>
          . (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          18.
          <string-name>
            <surname>Stout</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musters</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pool</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Author Profiling based on Text and Images. Notebook for PAN at CLEF 2018</article-title>
          . In:
          <source>Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum</source>
          . (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          19.
          <string-name>
            <surname>Volkova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bachrach</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>On Predicting Sociodemographic Traits and Emotions from Communications in Social Networks and Their Implications to Online Self Disclosure</article-title>
          . In:
          <source>Cyberpsychology, Behavior, and Social Networking</source>
          (Mary Ann Liebert Inc.) 2015/12, pp.
          <fpage>726</fpage>
          -
          <lpage>736</lpage>
          . (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          20.
          <string-name>
            <surname>Yildiz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A comparative study of author gender identification</article-title>
          . In:
          <source>Turkish Journal of Electrical Engineering and Computer Science</source>
          <volume>27</volume>
          , pp.
          <fpage>1052</fpage>
          -
          <lpage>1064</lpage>
          . (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          21.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>Hotel reviews sentiment analysis based on word vector clustering</article-title>
          . In:
          <source>2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA)</source>
          , Beijing, pp.
          <fpage>260</fpage>
          -
          <lpage>264</lpage>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>