<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Profiling Fake News Spreaders on Twitter based on TFIDF Features and Morphological Process</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohamed Lichouri</string-name>
          <email>m.lichouri@crstdla.dz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mourad Abbas</string-name>
          <email>m.abbas@crstdla.dz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Besma Benaziz</string-name>
          <email>b.benaziz@crstdla.dz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Linguistics Department</institution>
          ,
          <addr-line>CRSTDLA, Algiers.</addr-line>
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe our experiments on profiling fake news spreaders on Twitter based on TF-IDF features and morphological processes such as stemming, lemmatization and part-of-speech tagging. We compared a set of classifiers. The best results were achieved with the LSVC model, which yielded an F1-score of 76% for Spanish and 58.50% for English.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Fast posting, quick access and free publishing of news on social media are a strong
motivation to spread news in various fields. However, spreading news on
social media is a double-edged sword because it can be used either for beneficial
purposes or for harmful ones (fake news).</p>
      <p>
        According to [21], false information is categorized into eight types: fabricated,
propaganda, conspiracy theories, hoaxes, biased or one-sided, rumors,
clickbait, and satire news. Twitter has recently detected a campaign [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] organized by
agencies from two different countries to affect the results of the 2016 U.S.
presidential election. Social media allows users to hide their real profiles,
which gives them a safe space to spread whatever comes to mind.
The ability to infer the characteristics of social media users is a growing field of
interest called author profiling. There are three main types of fake news contributors:
social bots, trolls, and cyborg users [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. A social bot is an automated social
media account managed by an algorithm and designed to create posts without
human intervention [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For example, "studies show that social bots distorted the
2016 US presidential election discussions on a large scale, and around 19 million
bot accounts tweeted in support of either Trump or Clinton in the week
leading up to the election day" [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Similarly, according to Marc Jones and Alexei
Abrahams [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a plague of Twitter bots is roiling the Middle East [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>
        The troll is another kind of user who spreads false news among societies
across the Internet. Trolls aim to disrupt online communities
and provoke consumers into an emotional response [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. For instance, there
has been evidence that claims "1,000 Russian trolls were paid to spread fake news
on Hillary Clinton," which reveals how actual people are performing information
manipulation in order to change the views of others [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The troll differs from the
bot because the troll is a real user, while the bot is automated software.
Combining bots and trolls produces a third type, the cyborg user, which is no less
dangerous than the other two. Cyborg accounts are registered
by real users but use programs to perform activities on social media, with the
possibility of switching between the two modes [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        In this paper, we are interested in profiling fake news spreaders on Twitter [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
for two languages, English and Spanish, using a machine learning model [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>The paper is organized as follows: in section 2, we present the main works
related to profiling fake news spreaders. In section 3, we describe the dataset used
in our experiments as well as the preprocessing steps that we followed. Our system
architecture, including feature extraction and classification models, is presented
in section 4. We summarize the experiments and the results in section
5 and we conclude in section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        Author profiling is a problem of growing importance, as it can be helpful
for combating fake news. Indeed, it allows us to differentiate between real and
fictitious users, or even to reach everyone who posted fake news. Many works have
studied the possibility of inferring age and gender from formal
texts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A writer's age and gender can show through their publications,
in both the ideas expressed and the diversity of linguistic characteristics. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the authors
found that women, at least in English, use the first person singular
more than men, who use more determiners because they talk about tangible
things. This allowed the authors to build LIWC (Linguistic Inquiry and
Word Count), which is effective in author profiling. In [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], a study of 71,000
blogs showed that the linguistic features in blogs are related to age and gender.
The authors obtained an accuracy of about 80% for gender and about 75% for
age. Author profiling tasks have been organized for many years at PAN (http://webis.de).
Indeed, in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the authors describe a large corpus collected from social networks,
and its characteristics, to address the problem of identifying age and gender. Rangel
et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] continued to focus on age and gender, with the aim of
analysing the adaptability of the detection approaches across
different genres. For this purpose, a corpus with four different parts (sub-corpora)
was compiled: social media, Twitter, blogs, and hotel reviews. In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], two
new languages were added, Italian and Dutch, along with a new subtask on
personality recognition, to enrich the results obtained previously. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the
objective was to predict age and gender from a cross-genre perspective. For this
purpose, a corpus from Twitter was provided for training, and different
corpora from social media, blogs, essays, and reviews were provided for
evaluation. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the objective was to address gender and language variety
identification. For this purpose, a corpus from Twitter was provided for
four different languages: Arabic, English, Portuguese, and Spanish.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the authors propose an emotionally infused deep learning network that
uses emotional features to identify false information on Twitter and in news article
sources. They compared the language of false news with that of real news from an
emotional perspective, considering a set of false information types (propaganda,
hoax, clickbait, and satire) from social media and online news article sources.
The results show that detecting suspicious news on Twitter is harder than
detecting it in news articles.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Dataset and Preprocessing</title>
      <p>The dataset is saved and organized as XML files. It is composed of thousands of
tweets from several authors (Twitter users). In fact, 500 XML files (corresponding
to 500 authors) are provided for English and the same number is reserved for
Spanish. Each file includes 100 tweets, which means that the total number of
tweets for both English and Spanish is 100,000 tweets written by 1000 authors.
Each XML file is named with an alphanumeric author ID and tagged with one of two
labels: 0 or 1. We performed a basic and necessary text preprocessing step, namely
the removal of punctuation and emojis. Table 1 summarizes some statistics
about the training set for both English and Spanish. Figure 1 illustrates
the different steps of our proposed system, which includes preprocessing, features
extraction and model training.</p>
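      <p>The preprocessing step described above (punctuation and emojis removal) could be sketched as follows. This is only a minimal illustration, not the code used in our experiments: the function name clean_tweet, the list of emoji code point ranges and the choice to replace removed characters with a space are assumptions made for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
import string

# Assumption: these Unicode ranges cover the most common emojis;
# the list is illustrative and not exhaustive.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters (flags)
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)

# Assumption: ASCII punctuation plus the Spanish inverted marks,
# each replaced by a space so that adjacent words are not merged.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation + "¡¿"})


def clean_tweet(text):
    """Remove emojis and punctuation, then collapse repeated whitespace."""
    text = EMOJI_PATTERN.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())
```
]]></preformat>
      <p>In such a sketch, the 100 tweets of one author file would each be passed through clean_tweet before being joined into a single document for that author.</p>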
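      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>PAN Train set statistics for both English and Spanish</p>
        </caption>
        <table>
          <thead>
            <tr><th></th><th>English</th><th>Spanish</th></tr>
          </thead>
          <tbody>
            <tr><td># authors (XML files)</td><td>300</td><td>300</td></tr>
            <tr><td># sentences per author (XML file)</td><td>30,000</td><td>30,000</td></tr>
            <tr><td># words per author (XML file)</td><td>717,596</td><td>786,965</td></tr>
            <tr><td>Max # word per author (XML file)</td><td>3,636</td><td>5,373</td></tr>
            <tr><td>Min # word per author (XML file)</td><td>1,524</td><td>1,603</td></tr>
            <tr><td>Max # char per author (XML file)</td><td>12,962</td><td>23,588</td></tr>
            <tr><td>Min # char per author (XML file)</td><td>5,238</td><td>5,799</td></tr>
          </tbody>
        </table>
      </table-wrap>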
    </sec>
    <sec id="sec-4">
      <title>System architecture</title>
      <p>
        Our approach uses four processes. The input texts first undergo
stop words removal. We then apply three
additional morphological processes: stemming, lemmatization and part-of-speech
tagging. After many trials combining these processes, we
found that the best performance is obtained by
concatenating the text outputs of the three aforementioned processes,
together with the output of stop words removal, into a single text array. Inspired by [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], in
which a union of TF-IDF features gave better results for text classification,
we chose the union of three TF-IDF features (word 5-grams, char 5-grams,
and char-within-word-boundary 5-grams, char_wb). In addition, we used three classifiers, namely
Linear Support Vector Classification (LSVC), a linear model with Stochastic Gradient
Descent (SGD), and the Ridge Classifier (RDG) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. We used the default
configuration for the parameters of each of these classifiers.
      </p>
      <p>Figure 1. Proposed system. Stages labeled in the diagram: English/Spanish input, emojis removal, stop words removal, morphological process, features extraction, classification model.</p>
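      <p>The feature union and the three classifiers described in this section could be assembled with scikit-learn roughly as follows. This is a hedged sketch rather than our exact implementation: reading "5-grams" as an n-gram range of 1 to 5 is an assumption, and the helper names concatenate_views, build_features and build_models, as well as the list of processor callables, are hypothetical and introduced only for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import RidgeClassifier, SGDClassifier
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

# Assumption: "5-grams" is read here as n-grams of length 1 to 5.
NGRAM_RANGE = (1, 5)


def concatenate_views(text, processors):
    """Join the outputs of several text processors into one string.

    `processors` is a hypothetical list of callables (for example wrappers
    around a stemmer, a lemmatizer and a POS tagger), each mapping a string
    to a string.
    """
    return " ".join(process(text) for process in processors)


def build_features():
    """Union of word, char and char_wb TF-IDF features."""
    return FeatureUnion([
        ("word", TfidfVectorizer(analyzer="word", ngram_range=NGRAM_RANGE)),
        ("char", TfidfVectorizer(analyzer="char", ngram_range=NGRAM_RANGE)),
        ("char_wb", TfidfVectorizer(analyzer="char_wb", ngram_range=NGRAM_RANGE)),
    ])


def build_models():
    """One pipeline per classifier, each left at its default parameters."""
    classifiers = {
        "LSVC": LinearSVC(),
        "SGD": SGDClassifier(),
        "RDG": RidgeClassifier(),
    }
    return {
        name: Pipeline([("features", build_features()), ("clf", clf)])
        for name, clf in classifiers.items()
    }
```
]]></preformat>
      <p>Each pipeline returned by build_models can be fitted on a list of per-author documents and their 0/1 labels with fit, and queried with predict. Keeping the vectorizers inside the pipeline ensures that the TF-IDF vocabulary is learned from the training documents only.</p>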
    </sec>
    <sec id="sec-5">
      <title>Experiments and results</title>
      <p>In order to validate our approach, we split the training data into two sets: 80% for
training (240 documents) and 20% for testing (60 documents). We tried different
classifiers: Linear SVC, SGD and RDG. Results are presented in Table 2 for the
English and Spanish datasets, which clearly shows that Linear SVC and
SGD outperformed the RDG classifier.</p>
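      <p>The 80/20 hold-out validation described above can be illustrated with the following sketch. It is not our experimental code: the function name holdout_f1, the stratified split, the fixed random seed and the plain word-level TF-IDF features are assumptions chosen to keep the example short.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def holdout_f1(documents, labels, seed=0):
    """Train on 80% of the authors and score F1 on the remaining 20%.

    Returns the training size, the test size and the F1-score.
    Assumption: the split is stratified by label and seeded for repeatability.
    """
    x_train, x_test, y_train, y_test = train_test_split(
        documents, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(x_train, y_train)
    score = f1_score(y_test, model.predict(x_test))
    return len(x_train), len(x_test), score
```
]]></preformat>
      <p>With 300 author documents per language, this split yields 240 training documents and 60 test documents, matching the sizes reported above.</p>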
      <p>Comparing the results in Table 2 and Table 3, we clearly notice that the
LSVC model performance dropped by 41.42% and 24% for English and Spanish,
respectively. The RDG classifier is more or less efficient, since the recorded score
was 76.00% for Spanish and 61.50% for English. The likely reason behind these results
is the lack of data.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper, we presented our approach for identifying authors who tend to
spread fake news. We carried out many experiments that led us to select the
best features, composed of a union of three TF-IDF features (word 5-grams,
char 5-grams and char_wb 5-grams), in addition to three important
morphological features: stemming, lemmatization and part-of-speech tagging. Our system
achieved an F1-score of 76% for Spanish and 58.50% for English, which could be
improved by increasing the size of the training dataset.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abbas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lichouri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freihat</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          :
          <article-title>ST MADAR 2019 shared task: Arabic fine-grained dialect identification</article-title>
          .
          <source>In: Proceedings of the Fourth Arabic Natural Language Processing Workshop</source>
          . pp.
          <volume>269</volume>
          -
          <issue>273</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Burger</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henderson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zarrella</surname>
          </string-name>
          , G.:
          <article-title>Discriminating gender on twitter</article-title>
          .
          <source>In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>1301</volume>
          -
          <issue>1309</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Cheng, J.,
          <string-name>
            <surname>Bernstein</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Danescu-Niculescu-Mizil</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leskovec</surname>
          </string-name>
          , J.:
          <article-title>Anyone can become a troll: Causes of trolling behavior in online discussions</article-title>
          .
          <source>In: Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing</source>
          . pp.
          <volume>1217</volume>
          -
          <issue>1230</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varol</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Menczer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flammini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The rise of social bots</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>59</volume>
          (
          <issue>7</issue>
          ),
          <volume>96</volume>
          -
          <fpage>104</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>An emotional analysis of false information in social media and news articles</article-title>
          .
          <source>ACM Transactions on Internet Technology (TOIT) 20(2)</source>
          ,
          <volume>1</volume>
          -
          <fpage>18</fpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meyerhoff</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The handbook of language and gender</article-title>
          , vol.
          <volume>25</volume>
          . John Wiley &amp; Sons (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>M.O.:</given-names>
          </string-name>
          <article-title>The gulf information war| propaganda, fake news, and fake trends: The weaponization of twitter bots in the gulf crisis</article-title>
          .
          <source>International journal of communication 13</source>
          ,
          <issue>27</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lichouri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freihat</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Megtouf</surname>
            ,
            <given-names>D.E.H.</given-names>
          </string-name>
          :
          <article-title>Word-level vs sentence-level language identification: Application to Algerian and Arabic dialects</article-title>
          .
          <source>Procedia Computer Science</source>
          <volume>142</volume>
          ,
          <issue>246</issue>
          -
          <fpage>253</fpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Nerbonne</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>The secret life of pronouns. what our words say about us</article-title>
          .
          <source>Literary and Linguistic Computing</source>
          <volume>29</volume>
          (
          <issue>1</issue>
          ),
          <volume>139</volume>
          -
          <fpage>142</fpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
          </string-name>
          , E.:
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <volume>2825</volume>
          -
          <fpage>2830</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehl</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          , Niederhoffer, K.G.:
          <article-title>Psychological aspects of natural language use: Our words, our selves</article-title>
          .
          <source>Annual review of psychology 54(1)</source>
          ,
          <volume>547</volume>
          -
          <fpage>577</fpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiegmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Tira integrated research architecture</article-title>
          .
          <source>In: Information Retrieval Evaluation in a Changing World</source>
          , pp.
          <volume>123</volume>
          -
          <fpage>160</fpage>
          . Springer (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Neveol</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
          <article-title>CLEF 2020 Labs and Workshops, Notebook Papers</article-title>
          .
          <source>CEUR Workshop Proceedings (Sep</source>
          <year>2020</year>
          ),
          <article-title>CEUR-WS</article-title>
          .org
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inches</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Overview of the author profiling task at PAN 2013</article-title>
          .
          <source>In: CLEF Conference on Multilingual and Multimodal Information Access Evaluation</source>
          . pp.
          <volume>352</volume>
          -
          <fpage>365</fpage>
          .
          <string-name>
            <surname>CELCT</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Overview of the 3rd author profiling task at PAN 2015</article-title>
          . In: CLEF. p.
          <year>2015</year>
          . sn (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trenkmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , et al.:
          <article-title>Overview of the 2nd author profiling task at PAN 2014</article-title>
          .
          <source>In: CEUR Workshop Proceedings</source>
          . vol.
          <volume>1180</volume>
          , pp.
          <volume>898</volume>
          -
          <fpage>927</fpage>
          . CEUR Workshop Proceedings (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 4th author profiling task at PAN 2016: cross-genre evaluations</article-title>
          .
          <source>Working Notes Papers of the CLEF</source>
          <year>2016</year>
          ,
          <volume>750</volume>
          -
          <fpage>784</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Schler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.W.:</given-names>
          </string-name>
          <article-title>Effects of age and gender on blogging</article-title>
          . In: AAAI spring symposium:
          <article-title>Computational approaches to analyzing weblogs</article-title>
          .
          <source>vol. 6</source>
          , pp.
          <volume>199</volume>
          -
          <issue>205</issue>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Shu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sliva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Liu, H.:
          <article-title>Fake news detection on social media: A data mining perspective</article-title>
          .
          <source>ACM SIGKDD explorations newsletter 19(1)</source>
          ,
          <volume>22</volume>
          -
          <fpage>36</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
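      <ref id="ref20">
        <mixed-citation>
          20.
          <article-title>Vijaya Gadde and Yoel Roth: Enabling further research of information operations on Twitter</article-title>
          . https://blog.twitter.com/en_us/topics/company/2018/enabling-further-research-of-information-operations-on-twitter.html (2018), online; accessed 25 July 2020
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. Zannettou, S., Sirivianos, M., Blackburn, J., Kourtellis, N.: The web of false information: Rumors, fake news, hoaxes, clickbait, and various other shenanigans. Journal of Data and Information Quality (JDIQ) 11(3), 1-37 (2019)
        </mixed-citation>
      </ref>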
    </ref-list>
  </back>
</article>