<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting Fake News Spreaders in Social Networks via Linguistic and Personality Features</article-title>
      </title-group>
      <contrib-group>
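        <contrib contrib-type="author">
          <string-name>Anu Shrestha</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Spezzano</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abishai Joy</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science Department, Boise State University</institution>
        </aff>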
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>This paper addresses the problem of automatically detecting fake news spreaders in social networks such as Twitter. We model the problem as a binary classification task and consider several groups of features, including writing style, word and character n-grams, BERT semantic embedding, and sentiment analysis, which are computed from the set of tweets each user authored. Our approach is evaluated on the dataset made available by the PAN at CLEF 2020 shared task on profiling fake news spreaders, which provided labeled data in both English and Spanish. Experimental results show that we can detect fake news spreaders with an accuracy of 0.73 in English and 0.77 in Spanish when our approach is evaluated with 10-fold cross-validation on the provided training set, and with an accuracy of 0.71 in English and 0.76 in Spanish when the model is trained on the whole training set and tested on the provided test set. We also investigate the role of psycho-linguistic (LIWC) and personality features in detecting fake news spreaders and find that personality features are informative of user sharing behavior, achieving an accuracy of 0.72 in English and 0.80 in Spanish when evaluated with 10-fold cross-validation on the provided training set.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Fake news sharing has become a concerning problem in online social networks in
recent years. Research has found that fake news is more likely to go viral than real news,
spreading both faster and wider [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] and is threatening public health [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], emergency
management and response [
        <xref ref-type="bibr" rid="ref21 ref6">21,6</xref>
        ], election outcomes [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and is responsible for a
general decline in trust that citizens of democratic societies have for online platforms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Surprisingly, bots are equally responsible for spreading real and fake news, and the
considerable spread of fake news on Twitter is caused by human activity [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Fake news
is successful mainly because people are not able to distinguish it from truthful
information [
        <xref ref-type="bibr" rid="ref12 ref7">12,7</xref>
        ] and often share news online without even reading its content [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Also, even
if people recognize news as fake, they are more likely to share it if they have seen it
repeatedly than if it is novel [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Thus, being able to identify fake news spreaders in social networks is
key to effectively mitigating the spread of misinformation. Examples of strategies that
could be implemented include assisting fake news spreaders with credibility indicators
to lower their fake news sharing intent [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], and mitigation campaigns, e.g., targeting the
most influential real news spreaders to maximize the spread of real news [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        This paper describes the approach we implemented and submitted to the Profiling Fake
News Spreaders on Twitter shared task at PAN at CLEF 2020 [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
Specifically, we propose a machine-learning-based approach that considers several groups
of features, including features capturing the Twitter writing style, word and character
n-grams, the BERT semantic embedding of the tweets, and the sentiment expressed in
the tweets. Experimental results show that our approach can detect fake news spreaders
with an accuracy of 0.73 on the English dataset and 0.77 on the Spanish dataset when
evaluated with 10-fold cross-validation on the training set, and with an accuracy
of 0.71 for English and 0.76 for Spanish when evaluated on the provided test set.
      </p>
      <p>Furthermore, the paper investigates the role of psycho-linguistic (LIWC) and
personality features (including Big Five personality traits, needs, and values) in
detecting fake news spreaders. Our additional experimental results on the provided training set
show that we can achieve an accuracy of 0.72 with LIWC or personality features on the
English dataset, while personality features achieve an accuracy of 0.80 on the Spanish
dataset, outperforming our submitted approach on this specific language. These results
suggest that psycho-linguistic and personality features are valuable characteristics to
consider when profiling fake news spreaders in social media.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Several studies have been conducted to understand the characteristics of regular
(non-malicious) users that correlate with fake news spreading behavior in social networks.
Vosoughi et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] revealed that the users responsible for fake news spread had, on
average, significantly fewer followers, followed significantly fewer people, and were
significantly less active on Twitter. Shrestha and Spezzano showed that social network
properties help in identifying active fake news spreaders [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Shu et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] analyzed
user profiles to understand the characteristics of users that are likely to trust/distrust
fake news. They found that, on average, users who share fake news tend to be registered
for a shorter time than the ones who share real news and that bots are more likely to
post a piece of fake news than a real one, even though users who spread fake news
are still more likely to be humans than bots. They also showed that real news spreaders
tend to be more popular and that older people and females are more likely
to spread fake news. Guess et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] also analyzed user demographics as predictors of
fake news sharing on Facebook and found political orientation, age, and social
media usage to be the most relevant. Specifically, people are more likely to share articles
they agree with (e.g., right-leaning people tended to share more fake news because the
majority of the fake news considered in the study were from 2016 and pro-Trump),
seniors tend to share more fake news, probably because they lack the digital media literacy
skills that are necessary to assess online news truthfulness, and the more people post
on social media, the less likely they are to share fake news, most likely because they
are familiar with the platform and they know what they share. Yaqub et al. [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]
analyzed open-ended responses where users who participated in the study explained the
rationale behind their intent to share true, false, and satire headlines. The most
frequent motivations for sharing or not sharing news are (1) the interest or lack of interest
in the news, (2) the potential of generating discussion among friends, (3) the
fact that the news is not relevant to the user’s life, and (4) the perceived news
credibility, especially as a motivation for not sharing news. Giachanou et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] addressed the
problem of discriminating between fake news spreaders and fact-checkers (a problem
slightly different from the one addressed in this paper) and proposed an approach based
on a convolutional neural network to process the user Twitter feed in combination with
features representing user personality traits and linguistic patterns used in their tweets.
Beyond user and news characteristics, Ma et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] also analyzed the characteristics of
diffusion networks to explain users’ news sharing behavior. They found opinion
leadership, news preference, and tie strength to be the most important factors in predicting
news sharing, while homophily hampered news sharing in users’ local networks. Also,
people driven by gratifications of information seeking, socializing, and status-seeking
were more likely to share news in social media platforms [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Dataset</title>
      <p>
        We carried out our experiments on the dataset provided by the PAN’20 shared task for
Profiling Fake News Spreaders on Twitter [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. This dataset contains a balanced set of
users along with their Twitter feed and known ground truth about the users. Specifically,
users that shared some fake news in the past are labeled as fake news spreaders, and all
other users are labeled as real news spreaders. The dataset has been collected in two languages, namely
English and Spanish, and consists of a training set and a test set. For each considered language,
the training set includes 300 users with 100 tweets for each user, resulting in 30,000
English tweets and 30,000 Spanish tweets. The test set contains data for 200 users in
each language. As recommended by the shared task, the English and Spanish datasets
have been treated separately in our experiments.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Features for Detecting Fake News Spreaders</title>
      <p>This section presents the features we considered for detecting fake news spreaders. As
the dataset files are in raw XML format, we parsed them with the
xml.etree.ElementTree<sup>1</sup> library. After extraction, we pre-processed the content (user
tweets) as required by each feature, since some features
require cleaned text while others require the raw text in order to capture
underlying details of the author’s writing. We considered four different types of features
for this task, as explained in detail below.</p>
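      <p>The parsing step can be sketched as follows. This is only an illustration under stated assumptions: the layout of the sample file (an author root element with a lang attribute and one document element per tweet), the sample content, and the helper name read_user_tweets are all hypothetical, and the actual dataset files may be organized differently.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one user file with an assumed author/documents/document layout.
SAMPLE_USER_XML = """
<author lang="en">
  <documents>
    <document>RT Breaking: example headline #URL#</document>
    <document>Just a normal tweet</document>
  </documents>
</author>
"""


def read_user_tweets(xml_text):
    """Return the language code and the raw tweet texts of one user."""
    root = ET.fromstring(xml_text.strip())
    lang = root.get("lang")
    # iter() walks the whole tree, so the nesting depth of document does not matter.
    tweets = [doc.text or "" for doc in root.iter("document")]
    return lang, tweets


lang, tweets = read_user_tweets(SAMPLE_USER_XML)
print(lang, len(tweets))
```
]]></preformat>
      <p>Keeping the raw tweet text at this stage leaves each feature group free to apply its own cleaning later.</p>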
      <p><sup>1</sup> https://docs.python.org/3/library/xml.etree.elementtree.html</p>
      <p>Style. This first set of features captures the writing style of the set of tweets authored by
the same user. Specifically, we computed the average number of certain words, items,
and characters per user tweet, namely the average number of (1) words, (2)
characters, (3) lowercase words, (4) uppercase words, (5) lowercase characters, (6)
uppercase characters, (7) stop words, (8) punctuation symbols, (9) hashtags, (10) URLs,
(11) mentions, and (12) emojis and smileys. We also considered (13) the percentage of
user tweets that are retweets and (14) the percentage of user tweets that share
breaking news.</p>
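      <p>A few of the style features listed above could be computed roughly as in the sketch below. The regular expressions, the rule that a retweet starts with "RT ", and the function name style_features are assumptions made for illustration; the paper does not specify how each item was detected, and only a subset of the 14 features is shown.</p>
      <preformat><![CDATA[
```python
import re
import string

# Assumed patterns; the paper does not specify how each item was detected.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def style_features(tweets):
    """Average a few per-tweet counts over a non-empty list of tweets."""
    n = len(tweets)
    totals = {"words": 0, "chars": 0, "uppercase_words": 0,
              "punctuation": 0, "hashtags": 0, "urls": 0, "mentions": 0}
    retweets = 0
    for tweet in tweets:
        words = tweet.split()
        totals["words"] += len(words)
        totals["chars"] += len(tweet)
        totals["uppercase_words"] += sum(w.isupper() for w in words)
        totals["punctuation"] += sum(ch in string.punctuation for ch in tweet)
        totals["hashtags"] += len(HASHTAG_RE.findall(tweet))
        totals["urls"] += len(URL_RE.findall(tweet))
        totals["mentions"] += len(MENTION_RE.findall(tweet))
        retweets += tweet.startswith("RT ")
    features = {"avg_" + name: total / n for name, total in totals.items()}
    # Percentage of the user's tweets that are retweets (assumed "RT " prefix).
    features["pct_retweets"] = 100.0 * retweets / n
    return features


features = style_features(["RT @user Check this https://t.co/x #news", "hello WORLD!"])
print(features)
```
]]></preformat>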
      <p>N-grams. The second group of features includes TF-IDF based n-grams for both words
and characters. We concatenated all the tweets by each user to form a single document
per user. For each document, we removed tokens like ‘RT,’ ‘Via,’ and ‘&amp;amp’ and
replaced emojis and smileys with the corresponding English words. Next, each document
was converted to lowercase, stop words and punctuation symbols were removed, and the
remaining words were lemmatized into root words using the NLTK toolkit<sup>2</sup>. Finally,
we computed the TF-IDF vector representation for both word and character n-grams<sup>3</sup>. We
experimented with different parameters, including the number of terms and the length
of the n-grams, from uni-grams to tri-grams. We obtained the best results for both characters and
words with uni-grams and by including all the terms (max_df = 1.0).</p>
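      <p>To make the TF-IDF weighting concrete, the sketch below computes word uni-gram TF-IDF vectors with the standard library only. It uses the smoothed inverse document frequency idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalization, which is the default documented for scikit-learn's TfidfVectorizer. It is an illustration rather than the code used in our experiments: whitespace tokenization, the toy documents, and the function name tfidf_unigrams are assumptions.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_unigrams(documents):
    """Return the sorted vocabulary and one L2-normalized TF-IDF row per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(token for tokens in tokenized for token in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocabulary}
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocabulary, rows


# Hypothetical per-user documents (all tweets of a user concatenated).
user_docs = ["fake news spread fast", "real news travel slow", "news news news"]
vocabulary, matrix = tfidf_unigrams(user_docs)
print(vocabulary)
```
]]></preformat>
      <p>In this toy example the term "news" occurs in every document, so within the first document it receives a lower weight than the rarer term "fake".</p>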
      <p>
        Tweet embedding. The third group of features includes the embedding of tweets as
computed by using the state-of-the-art BERT NLP model. Specifically, we used the pre-trained
multilingual model provided by SBERT [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] to address both languages. The extracted
embedding was reduced to 10 features via principal component analysis (PCA)<sup>4</sup>. Then,
we averaged the embeddings of all the tweets of a single user to generate a single
embedding representation per user that captures the semantics of all the user tweets. Before
computing this embedding, each tweet was pre-processed to remove strings frequently
used on Twitter, like ‘RT,’ ‘Via,’ and ‘&amp;amp,’ and to replace emojis and smileys with the
corresponding English words.
      </p>
      <p>
        Sentiment analysis. As people express their emotions, appraisals, and sentiments
towards a news item or article through the choice of words in their tweets, we leveraged
sentiment analysis as another feature. In this case too, we pre-processed each tweet to
remove ‘RT,’ ‘Via,’ and ‘&amp;amp.’ However, we did not replace emojis and smileys with
the corresponding English text for this feature, since emojis add to the sentiment
value of the text. Then, for each user, we measured the average sentiment across all their
tweets. We used the Valence Aware Dictionary and sEntiment Reasoner (VADER) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
a library specifically built for capturing sentiments expressed in social media texts, for
English tweets (compound value) and sentiment-analysis-spanish<sup>5</sup> for Spanish tweets.
      </p>
      <sec id="sec-4-1">
        <p><sup>2</sup> https://www.nltk.org/</p>
        <p><sup>3</sup> https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html</p>
        <p><sup>4</sup> https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html</p>
        <p><sup>5</sup> https://pypi.org/project/sentiment-analysis-spanish/</p>
        <p>We built our classification model as an ensemble of classifiers. Specifically, we
considered each group of features separately and, by performing 10-fold cross-validation on
the training set, we chose the best classifier for that group of features among Support
Vector Machine (SVM) with linear kernel, Logistic Regression (with default
parameters), Random Forest, and Extra Trees (both with 500 estimators and minimum sample
leaf parameter equal to 1).</p>
        <p>According to the results shown in Tables 1a and 1b, for English, we chose
Extra Trees as the best classifier for the style features, SVM for both n-grams and tweet
embedding, and Logistic Regression for sentiment analysis. Likewise, for Spanish, we
chose SVM as the best classifier for the style features, Extra Trees for n-grams, Random
Forest for tweet embedding, and Logistic Regression for sentiment analysis.</p>
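        <p>The 10-fold cross-validation used above to compare classifiers can be sketched with the standard library as follows. The contiguous, unshuffled folds, the one-feature toy data, and the stand-in classifier fit_threshold are assumptions for illustration only; the paper does not state whether folds were shuffled or stratified, and the real candidates were the four classifiers listed above.</p>
        <preformat><![CDATA[
```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) pairs over contiguous, near-equal folds."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, X, y, n_folds=10):
    """Mean accuracy over the folds of the model returned by fit."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), n_folds):
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def fit_threshold(train_X, train_y):
    """Hypothetical stand-in classifier: cut one feature midway between class means."""
    pos = [x for x, label in zip(train_X, train_y) if label == 1]
    neg = [x for x, label in zip(train_X, train_y) if label == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: int(x > cut)


# Toy data: 300 users, one feature, alternating labels so every fold is balanced.
X = [float(i % 2) + 0.1 * (i % 3) for i in range(300)]
y = [i % 2 for i in range(300)]
folds = list(k_fold_indices(len(X)))
accuracy = cross_val_accuracy(fit_threshold, X, y)
print(len(folds), accuracy)
```
]]></preformat>
        <p>Running cross_val_accuracy once per candidate classifier and keeping the one with the highest mean accuracy mirrors the selection procedure described above.</p>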
        <p>We see that, for English Twitter users, the best performing set of features is n-grams
(accuracy of 0.72), followed by BERT tweet embedding (accuracy of 0.69), sentiment
(accuracy of 0.66), and style-based features (accuracy of 0.64). For Spanish Twitter
users, style-based features and n-grams are jointly the best sets of features (both with
an accuracy of 0.75), followed by BERT tweet embedding (accuracy of 0.74) and sentiment
(accuracy of 0.57). Overall, the different performance of the features across English and
Spanish may reflect cultural differences in the use of Twitter in general, and
in sharing news in particular.</p>
        <p>
          For the final model, we first trained, for each group of features, the corresponding
best classifier selected from Tables 1a and 1b. Then, we used the prediction
probabilities of the four selected classifiers as input features to a majority voting classifier
that combined the predictions of four base estimators, namely SVM, Logistic
Regression, Random Forest, and Extra Trees. Table 2 reports the resulting accuracy values of
the final majority voting classifier in two cases. First, when we evaluate our model with
10-fold cross-validation on the training set (i.e., we trained on 90% of the training set
and tested on the remaining 10%, repeated the experiment 10 times, and averaged the
results), we achieve an accuracy of 0.73 and 0.77 for the English and Spanish datasets,
respectively. As we can see, the accuracy of the features combined by our final
model improves over the accuracy values achieved by each group of features
individually, as reported in Tables 1a and 1b. Second, when we train on the whole training set
and test on the test set, we achieve an accuracy of 0.71 for English and 0.76 for Spanish
(run executed on TIRA [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]).
        </p>
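        <p>The final voting step can be illustrated with the minimal sketch below. It assumes hard voting over the labels predicted by the four base estimators and resolves an even split in favor of the fake news spreader class; both choices, the toy votes, and the function name majority_vote are assumptions, since the voting variant and the tie handling are not specified above.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def majority_vote(predictions, tie_break=1):
    """Return the most common label; tie_break is used on an even split (an assumption)."""
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


# Hypothetical labels from the four base estimators for three users
# (1 = fake news spreader, 0 = real news spreader).
votes_per_user = [
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 0],  # 2-2 split, resolved by tie_break
]
final_labels = [majority_vote(votes) for votes in votes_per_user]
print(final_labels)
```
]]></preformat>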
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Investigating the Role of Psycho-linguistic and Personality Features in Detecting Fake News Spreaders</title>
      <p>After the submission of our run, we performed some additional experiments to
investigate how other groups of features, such as psycho-linguistic and personality features,
would perform in detecting fake news spreaders. Specifically, we considered the set of
psycho-linguistic features computed by the Linguistic Inquiry and Word Count (LIWC)
tool and personality features extracted by using the IBM Watson Personality Insights
service<sup>6</sup>. To compute these two sets of features, tweets were pre-processed in the same
way as for the tweet embedding features from Section 4. These features are described
in detail below.</p>
      <p>
        Linguistic Inquiry and Word Count (LIWC). LIWC is a transparent text analysis tool
that counts words in psychologically meaningful categories. The tool works for
different languages, including English and Spanish. We used LIWC2015 for English (93
features) and LIWC2007 for Spanish (90 features). LIWC computes different measures
for analyzing the cognitive, affective, and grammatical processes in the text. The LIWC
features can be divided into four main categories [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]:
      </p>
      <sec id="sec-6-1">
        <p>Linguistic features refer to features that represent the functionality of text, such as
the average number of words per sentence and the rate of misspelling. This
category of features also includes negations as well as part-of-speech (Adjective, Noun,
Verb, Conjunction) frequencies.</p>
        <p>Punctuation features include the occurrences of periods, commas, question marks,
exclamation marks, quotation marks, etc. in the text.</p>
        <p>Psychological features target emotional, social, and cognitive processes. The
affective processes (positive and negative emotions), social processes, cognitive
processes, perceptual processes, biological processes, time orientations, relativity,
personal concerns, and informal language (swear words, nonfluencies) can be used
to scrutinize the emotional part of the text.</p>
        <p>Summary features define the frequency of words that reflect the thoughts, perspective,
and honesty of the writer. This category consists of Analytical thinking, Clout, Authenticity,
Emotional tone, Words per sentence, Words with more than six letters, and Dictionary
words.</p>
        <p><sup>6</sup> https://cloud.ibm.com/apidocs/personality-insights</p>
        <p>
          To compute the LIWC feature set for each user in the dataset, we first computed the
LIWC features for each tweet and then averaged the features for the same user.
        </p>
        <p>
          Personality Features. The IBM Watson Personality Insights service uses linguistic
analytics to infer individuals’ intrinsic personality characteristics, including Big Five
personality traits, Needs, and Values, from digital communications such as social media
posts. The tool supports different languages, including English and Spanish.
In our case, we concatenated all the tweets of a given user into a single document to
compute their personality characteristics. The features computed by this service are detailed
in the following (we considered the raw scores provided by the service):
        </p>
        <p>
          Big Five. The Big Five personality traits, also known as the five-factor model (FFM)
and the OCEAN model, are a widely used taxonomy to describe people’s
personality traits [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. The five basic personality dimensions described by this taxonomy are
openness to experience, conscientiousness, extraversion, agreeableness, and
neuroticism. For each personality dimension, IBM Watson Personality Insights also
provides a set of six additional facet features. For instance, the facets of agreeableness
include altruism, cooperation, modesty, morality, sympathy, and trust.
        </p>
        <p>
          Needs. These features describe the needs of a user as inferred from the text they wrote and
include excitement, harmony, curiosity, ideal, closeness, self-expression, liberty,
love, practicality, stability, challenge, and structure.
        </p>
        <p>Values. These features describe the motivating factors that influence a person’s decision
making. They include self-transcendence, conservation, hedonism, self-enhancement,
and openness to change.</p>
        <p>Tables 3a and 3b report the accuracy achieved by the four considered classifiers
with LIWC and personality features as input for detecting fake news spreaders. We were
able to evaluate these features only on the provided training set, and the reported results are
the averaged values of 10-fold cross-validation. As we can see, for the English dataset,
LIWC and personality features both achieve their best accuracy of 0.72 with Random
Forest, which is the same as the accuracy achieved by n-gram features (cf. Table 1a)
and slightly lower than the accuracy of 0.73 achieved by our final submitted approach
on the training set (cf. Table 2). In the case of the Spanish dataset, LIWC achieves its
best accuracy of 0.78 with Random Forest, while personality features achieve their
best accuracy of 0.80 with Extra Trees. In this case, both LIWC and personality features
outperform our submitted approach on the training set, which achieves an accuracy of
0.77 (cf. Table 2), with personality features reaching the highest accuracy
among all the groups of features we tried for this shared task.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>We addressed the problem of automatically detecting Twitter users keen on spreading
fake news as part of the PAN at CLEF 2020 shared task on profiling fake news spreaders
in two languages, namely English and Spanish. We first proposed an approach leveraging
several groups of features, including writing style, word and character n-grams, BERT
semantic embedding, and sentiment analysis, which are computed from the Twitter feed
provided for each user. This approach achieved an accuracy of 0.73 in English and
0.77 in Spanish when evaluated with 10-fold cross-validation on the provided training
set, and an accuracy of 0.71 in English and 0.76 in Spanish when evaluated on the
provided test set. We also investigated the role of psycho-linguistic (LIWC) and personality
features on the same task. We showed that personality traits are important
characteristics to consider when modeling user sharing behavior, as they achieved an accuracy of
0.72 in English and 0.80 in Spanish when evaluated with 10-fold cross-validation on
the provided training set. Overall, the task of detecting fake news spreaders turned out
to be very challenging. Even though we tried several sets of features, we could not achieve an
accuracy higher than 0.80. One possible explanation is that some users who
spread fake news do not do it intentionally; hence they are hard to differentiate from
users who never shared fake news. Also, the accuracy gap between English and Spanish
may indicate that users have different news spreading behaviors across different cultures.</p>
      <p>Future work will be devoted to analyzing the role of additional sets of features for
detecting fake news spreaders in social networks, including behavioral features
describing the user activity on Twitter and social network features.</p>
      <p>Acknowledgements. This work has been partially supported by the National Science Foundation under
Awards no. 1943370 and 1820685.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          Edelman:
          <article-title>Edelman trust barometer global report</article-title>
          . Available at: https://www.edelman.com/sites/g/files/aatuss191/files/2019-02/2019_Edelman_Trust_Barometer_Global_Report_2.pdf
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Effron</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raj</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Misinformation and Morality: Encountering Fake-News Headlines Makes Them Seem Less Unethical to Publish and Share</article-title>
          .
          <source>Psychological Science</source>
          (Nov
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gabielkov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramachandran</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaintreau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Legout</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Social clicks: What and who gets read on twitter</article-title>
          ?
          <source>ACM SIGMETRICS Performance Evaluation Review</source>
          <volume>44</volume>
          (
          <issue>1</issue>
          ),
          <fpage>179</fpage>
          -
          <lpage>192</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rissola</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crestani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The role of personality and linguistic patterns in discriminating between fake news spreaders and fact checkers</article-title>
          .
          <source>In: Natural Language Processing and Information Systems: 25th International Conference on Applications of Natural Language to Information Systems, NLDB</source>
          <year>2020</year>
          , Saarbrücken, Germany, June 24-26,
          <year>2020</year>
          , Proceedings. p.
          <fpage>181</fpage>
          . Springer Nature
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Guess</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nagler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tucker</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Less than you think: Prevalence and predictors of fake news dissemination on facebook</article-title>
          .
          <source>Science Advances</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <elocation-id>eaau4586</elocation-id>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamba</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumaraguru</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Faking sandy: characterizing and identifying fake images on twitter during hurricane sandy</article-title>
          .
          <source>In: Proceedings of the 22nd international conference on World Wide Web</source>
          . pp.
          <fpage>729</fpage>
          -
          <lpage>736</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Horne</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nevo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>O</given-names>
            <surname>'Donovan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Adali</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <article-title>Rating reliability and bias in news articles: Does AI assistance help everyone?</article-title>
          <source>In: Proceedings of the Thirteenth International Conference on Web and Social Media, ICWSM 2019</source>
          . pp.
          <fpage>247</fpage>
          -
          <lpage>256</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hutto</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilbert</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>VADER: A parsimonious rule-based model for sentiment analysis of social media text</article-title>
          .
          <source>In: Eighth International AAAI Conference on Weblogs and Social Media</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Isaak</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanna</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          :
          <article-title>User data privacy: Facebook, Cambridge Analytica, and privacy protection</article-title>
          .
          <source>Computer</source>
          <volume>51</volume>
          (
          <issue>8</issue>
          ),
          <fpage>56</fpage>
          -
          <lpage>59</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>News sharing in social media: The effect of gratifications and prior experience</article-title>
          .
          <source>Computers in Human Behavior</source>
          <volume>28</volume>
          (
          <issue>2</issue>
          )
          ,
          <fpage>331</fpage>
          -
          <lpage>339</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goh</surname>
            ,
            <given-names>D.H.</given-names>
          </string-name>
          :
          <article-title>Understanding news sharing in social media from the diffusion of innovations perspective</article-title>
          .
          <source>In: 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing</source>
          . pp.
          <fpage>1013</fpage>
          -
          <lpage>1020</lpage>
          . IEEE (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottfried</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sumida</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Distinguishing between factual and opinion statements in the news</article-title>
          . Pew Research Center (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Neuman</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Computational personality analysis: Introduction, practical applications and novel directions</article-title>
          . Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pennebaker</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyd</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jordan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blackburn</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>The development and psychometric properties of LIWC2015</article-title>
          .
          <source>Tech. rep.</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiegmann</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>TIRA Integrated Research Architecture</article-title>
          . In:
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (eds.)
          <source>Information Retrieval Evaluation in a Changing World</source>
          .
          <series>The Information Retrieval Series</series>
          , Springer, Berlin Heidelberg New York (
          <month>Sep</month>
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Névéol</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.)
          <source>CLEF 2020 Labs and Workshops, Notebook Papers</source>
          .
          <series>CEUR Workshop Proceedings</series>
          (
          <month>Sep</month>
          <year>2020</year>
          ), CEUR-WS.org
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Reimers</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurevych</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Sentence-BERT: Sentence embeddings using Siamese BERT-networks</article-title>
          .
          <source>arXiv preprint arXiv:1908.10084</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Shrestha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spezzano</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Online misinformation: from the deceiver to the victim</article-title>
          .
          <source>In: ASONAM '19: International Conference on Advances in Social Networks Analysis and Mining</source>
          . pp.
          <fpage>847</fpage>
          -
          <lpage>850</lpage>
          . ACM (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Shu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mahudeswaran</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Fakenewsnet: A data repository with news content, social context and dynamic information for studying fake news on social media</article-title>
          .
          <source>arXiv preprint arXiv:1809.01286</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Shu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Understanding user profiles on social media for fake news detection</article-title>
          .
          <source>In: 1st IEEE International Workshop on Fake MultiMedia (FakeMM 2018)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Spiro</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fitzhugh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutton</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pierski</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greczek</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butts</surname>
            ,
            <given-names>C.T.</given-names>
          </string-name>
          :
          <article-title>Rumoring during extreme events: A case study of Deepwater Horizon 2010</article-title>
          .
          <source>In: Proceedings of the 4th Annual ACM Web Science Conference</source>
          . pp.
          <fpage>275</fpage>
          -
          <lpage>283</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Vogel</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Viral misinformation threatens public health</article-title>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Vosoughi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roy</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aral</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The spread of true and false news online</article-title>
          .
          <source>Science</source>
          <volume>359</volume>
          (
          <issue>6380</issue>
          ),
          <fpage>1146</fpage>
          -
          <lpage>1151</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Yaqub</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kakhidze</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brockman</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Memon</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patil</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Effects of credibility indicators on social media news sharing intent</article-title>
          .
          <source>In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          . CHI '20
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>