<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Epidemiology Inspired Framework for Fake News Mitigation in Social Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bhavtosh Rath</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaideep Srivastava</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Minnesota</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Research in fake news detection and prevention has gained a lot of attention over the past decade, with most models using features generated from content and propagation paths. Complementary to these approaches, in this position paper we outline a framework inspired from the domain of epidemiology that proposes to identify people who are likely to become fake news spreaders. The proposed framework can serve as motivation to build fake news mitigation models, even for the scenario when fake news has not yet originated. Some models based on the framework have been successfully evaluated on real world Twitter datasets and can provide motivation for new research directions.</p>
      </abstract>
      <kwd-group>
        <kwd>Fake news spreaders</kwd>
        <kwd>Social networks</kwd>
        <kwd>Epidemiology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>1. Introduction</p>
      <p>The wide adoption of social media platforms like Facebook, Twitter and WhatsApp has resulted in the creation of behavioral big data, motivating researchers to propose various computational models for combating fake news. So far, most research has focused on determining the veracity of information using features extracted manually or automatically through techniques such as deep learning. We propose a novel fake news prevention and control framework that incorporates people’s behavioral data along with their network structure. As in epidemiology, models proposed within the framework cover the entire life cycle of spreading: before the fake news originates, after the fake news starts spreading, and containment of its further spreading. The framework is not to be confused with popular information diffusion based models [<xref ref-type="bibr" rid="ref1">1</xref>], because those models a) usually categorize certain nodes and cannot be generalized to all nodes, b) consider only the propagation paths but not the underlying graph structure, and c) can be generalized to information diffusion and need not be particular to fake news spreading.</p>
      <p>Related Work: The literature on fake news detection and prevention strategies is vast, and can be divided broadly into three categories: content-based, propagation-based and user-based.</p>
      <p>In the content-based approach, the problem is formulated as identifying whether the content of spreading information is fake or not. Most proposed models rely on linguistic or visual features. While earlier work relied mostly on hand-engineering relevant features, more recently deep learning based models have gained popularity as they can automatically generate relevant features.</p>
      <p>Propagation-based approaches consider propagation paths of fake news and are mostly inspired by information diffusion and cascade models. They are used to understand how information spreading patterns can help distinguish fake news from true news. These models are usually integrated with content-based features to improve prediction performance. The majority of computational models for fake news detection from these two categories are summarized in [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
      <p>User-based approaches focus more on people’s psychology. While user-specific features can be included as part of content-based models, there has also been some research exploring behavior patterns of individuals who spread fake news. Behavioral principles like naive realism and confirmation bias (at the individual level) have been found to make fake news perceived as true, as stated in [<xref ref-type="bibr" rid="ref3">3</xref>]. A phenomenon called the echo chamber effect (at the group level) has also been found to reinforce people’s pre-existing biases, making them averse to accepting opposing opinions [<xref ref-type="bibr" rid="ref4">4</xref>]. The role of bots in fake news spreading has also been studied. More recently, work has been done to identify fake news spreaders [<xref ref-type="bibr" rid="ref5">5</xref>]; it focuses on modelling linguistic features but does not integrate the underlying network structure. Not many computational models have been proposed that use historical behavioral data to explore the psychological concepts that make people vulnerable to spreading fake news, a gap which our proposed framework can be used to address.</p>
      <p>Proceedings of the CIKM 2020 Workshops, October 19-20, Galway, Ireland. Editors of the Proceedings: Stefan Conrad, Ilaria Tiddi. email: rathx082@umn.edu (B. Rath); srivasta@umn.edu (J. Srivastava)</p>
      <p>© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      <p>A major limitation of existing models is that they rely on the presence of fake news to generate meaningful features, which makes it difficult to model fake news mitigation strategies. Our framework proposes models using two important components that do not rely on the presence of fake news: the underlying network structure and people’s historical behavioral data.</p>
      <p>The rest of the paper is organized as follows. We explain how epidemiological concepts can be mapped directly to the problem of fake news spreading and mitigation. We then explain proposed models for detecting fake news spreaders using the Community Health Assessment model and summarize current and future research based on these ideas. Finally, we give our concluding remarks.</p>
      <p>2. Epidemiology Inspired Framework</p>
      <p>Epidemiology is the field of medicine that deals with the incidence, distribution and control of infection among populations. In the proposed framework, fake news is analogous to infection, the social network is analogous to the population, and the likelihood of people believing a news endorser in their immediate neighborhood is analogous to their vulnerability to getting infected when exposed. We consider fake news as a pathogen that intends to infect as many people as possible. An important assumption we make is that fake news of all kinds is generalized as a single infection, unlike in epidemiology where people have different levels of immunity against different kinds of infections (i.e. the framework is information agnostic). Also, we do not distinguish bots in the network population.</p>
      <p>The likelihood of a person getting infected (i.e. believing and spreading the fake news) depends on two important factors: a) the likelihood of trusting a news endorser (a person is more likely to spread a news item without verifying its claim if it is endorsed by a neighbor they trust); and b) the density of their neighborhood: similar to how high population density increases the likelihood of infection spreading, a modular network structure is more prone to fake news spreading.</p>
      <p>After the infection spreading is identified, there is a need to decontaminate the population. A medicinal cure is used to treat the infected population and thus prevent further spreading of infection. In the context of fake news, refutation news can serve this purpose. Refutation news can be defined as true news that fact-checks a fake news item. Contents from popular fact-checking websites (https://www.snopes.com/, https://www.politifact.com/) are examples of refutation news. In epidemiology the medicine can have two purposes: as a control mechanism (i.e. medication), with the intention to cure infected people (i.e. explicitly inform the fake news spreaders about the refutation news), and as a prevention mechanism (i.e. immunization), with the intention to prevent the uninfected population from becoming infection carriers in future (i.e. prevent the unexposed population from becoming fake news spreaders). An infected person is said to have recovered if they decide to retract the fake news they shared, to share the refutation news, or both. The mapping of epidemiological concepts to the context of fake news spreading is summarized in Table 1.</p>
      <p>3. Contributions</p>
      <p>In this section we show how the framework has been applied so far and how it is used to propose relevant models.</p>
      <p>3.1. Community Health Assessment model</p>
      <p>
        A social network has the characteristic property to exhibit community structures that are formed based on
inter-node interactions. Communities tend to be modular groups where within-group members are highly connected and across-group members are loosely connected. Thus members within a community tend to have a higher degree of trust among each other than members across different communities. If such a community is exposed to fake news propagating in its vicinity, the likelihood of all community members getting infected is high. It is therefore important to identify vulnerable individuals that lie in the path of fake news spread in order to limit the overall spreading of fake news in the network. The idea is illustrated in Figure 1. In the context of Twitter, a directed edge from one node to another represents that the first node follows the second. Information thus flows from the followed node to the follower when the follower decides to retweet information endorsed by the followed node. The goal is to identify nodes that are likely to believe and spread the fake news. The subscript of a node denotes the community it belongs to. Motivated by the ease of spreading within a community, we proposed the Community Health Assessment model. The model identifies three types of nodes with respect to a community: neighbor, boundary and core nodes, which are explained below:
      </p>
      <p>1. Neighbor nodes: These nodes are directly connected to at least one node of the community. They are not a part of the community.</p>
      <p>2. Boundary nodes: These are community nodes that are directly connected to at least one neighbor node. It is important to note that only community nodes that have an outgoing edge towards a neighbor node are boundary nodes.</p>
      <p>3. Core nodes: These are community nodes that are only connected to members within the community.</p>
      <p>The neighbor, boundary and core nodes for the communities in Figure 1 are listed in Table 2.</p>
      <p>Figure 1: Motivating example. Red nodes denote fake news spreaders.</p>
      <p>3.2. Assessment, identification and prevention</p>
      <p>To model a person’s likelihood to endorse fake news based on their belief in the endorser, we applied the Trust in Social Media (TSM) algorithm. It assigns a pair of complementary trust scores, called trustingness and trustworthiness, to every node in a social network. While trustingness quantifies the propensity of a node to trust its neighbors, trustworthiness quantifies the willingness of the neighbors to trust the node. Implementation details for the algorithm can be found in [<xref ref-type="bibr" rid="ref6">6</xref>]. Below we propose three phases for the framework and summarize the models implemented so far, along with future directions.</p>
      <p>1. Vulnerability assessment of population: As in epidemiology, it is important to identify individuals and groups that are vulnerable to fake news before the spreading begins. Borrowing ideas from the community health assessment model, we proposed metrics that quantify the vulnerability of nodes and communities in a network. Through experiments on real-world information spreading networks on Twitter, we showed that our proposed metrics are more effective in identifying fake news spreaders than true news spreaders, confirming our hypothesis that fake news relies strongly on inter-personal trust to propagate while true news does not. Details regarding the model implementation can be found in [<xref ref-type="bibr" rid="ref7">7</xref>].</p>
      <p>
        2. Identification of fake news spreaders: While determining the veracity of information has been widely
researched, it is equally important to determine the authenticity of the people who are spreading information. A model is proposed for automatically identifying people who spread fake news by leveraging the concept of believability (i.e. the extent to which the propagated information is likely to be perceived as truthful). With the retweet network edge-weighted by believability scores, network representation learning is used to generate node embeddings, which are then used to classify users as fake news spreaders or not with a recurrent neural network classifier. Based on experiments on a very large real-world rumor dataset collected from Twitter, we could effectively identify false information spreaders. Further details can be found in [<xref ref-type="bibr" rid="ref8">8</xref>].
      </p>
      <p>3. Prevention and control of infection spreading: The motivation for this problem can be explained through Figure 1. A neighbor node of community 3 is a fake news spreader. A boundary node of community 3 is exposed and likely to start fake news spreading in the community. To prevent such a scenario it is important to predict boundary nodes of all communities in a network that are likely to become fake news spreaders when the infection has reached neighbor nodes. Similarly, consider the scenario where a node of community 3 is a fake news spreader. Members of the community that are immediate followers of this node are now exposed to the fake news, and the remaining community members are two steps away. Due to their close proximity they too are vulnerable to believing the spreader and causing the infection to spread throughout the community. Thus it is important to identify core nodes that are likely to become spreaders when the infection has reached boundary nodes. The scenarios are explained in Figure 2 by applying the community health assessment model. Nodes inside the dotted oval denote core nodes, nodes between the dotted and solid ovals denote boundary nodes, and nodes outside the solid oval denote neighbor nodes. Panel (a) shows the scenario where fake news has reached the two neighbor nodes (highlighted in red). Three boundary nodes (circled in red) are exposed to the fake news. In (b), two out of three exposed boundary nodes become spreaders, which marks the beginning of fake news spreading within the community. In (c), one of the two exposed core nodes becomes a spreader.</p>
      <p>Figure 2 panels: (a) Fake news reaches neighbor nodes. (b) Fake news reaches boundary nodes. (c) Fake news reaches core nodes.</p>
      <p>Thus, using the community health assessment model we can build models that predict both exposed nodes (i.e. boundary nodes) and unexposed nodes (i.e. core nodes) that would likely become fake news spreaders after infection spreading has begun (i.e. fake news has reached neighbor nodes). Effective mitigation strategies could then be deployed against predicted spreaders.</p>
      <p>4. Conclusion</p>
      <p>
        In this position paper we proposed a novel epidemiology inspired framework and showed how the community health assessment model can be used to build models for fake news mitigation, a problem less explored than fake news detection. What makes it different from most existing research is that a) it proposes a more spreader-centric modelling approach instead of a content-centric approach, and b) it does not rely on features extracted from fake news, thus serving as motivation to build fake news mitigation strategies even for the scenario when fake news has not yet originated. Recent work applying a few of these ideas has shown encouraging results, which serves as motivation to pursue the idea further. A limitation of our model is that it does not incorporate the dynamic nature of social network structure. As part of future work we would like to eliminate the presence of bots, since we focus on modeling psychological and sociological properties based on behavioral data.
      </p>
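      <p>As an illustration of the node taxonomy in Section 3.1, the following minimal sketch classifies nodes as neighbor, boundary or core for a single community. It is not the implementation used in [7]. The function name classify_nodes, the (follower, followed) edge orientation and the toy graph are assumptions made for this example, and core nodes are taken literally as members with no edge to or from a non-member.</p>
      <preformat>
```python
from typing import Hashable, Iterable, Set, Tuple

# Assumed edge orientation: (follower, followed), i.e. follower -> followed.
Edge = Tuple[Hashable, Hashable]


def classify_nodes(
    edges: Iterable[Edge], community: Iterable[Hashable]
) -> Tuple[Set[Hashable], Set[Hashable], Set[Hashable]]:
    """Return (neighbor, boundary, core) node sets for one community."""
    members = set(community)
    neighbors, boundary, linked_outside = set(), set(), set()
    for follower, followed in edges:
        if follower in members and followed not in members:
            # A member follows an outsider: outgoing edge towards a neighbor node.
            neighbors.add(followed)
            boundary.add(follower)
            linked_outside.add(follower)
        elif followed in members and follower not in members:
            # An outsider follows a member: the outsider is still a neighbor node.
            neighbors.add(follower)
            linked_outside.add(followed)
    # Core nodes are members with no edge to or from any non-member.
    core = members - linked_outside
    return neighbors, boundary, core


# Hypothetical toy graph: b follows outsider a; c and d only follow members.
toy_edges = [("b", "a"), ("c", "b"), ("d", "b"), ("d", "c")]
neighbors, boundary, core = classify_nodes(toy_edges, {"b", "c", "d"})
print(sorted(neighbors), sorted(boundary), sorted(core))
```
      </preformat>
      <p>Under this literal reading, a community member that is only followed by outsiders is neither a boundary node nor a core node; whether such nodes should instead be counted as core is a design choice left open here.</p>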
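      <p>The exposure scenario described in phase 3 of Section 3.2 (immediate followers of a spreader are exposed, the remaining members are two steps away) can be sketched as a breadth-first traversal along follower links. This is only an illustrative sketch, not a model from the paper: the helper name steps_from_spreaders, the (follower, followed) edge orientation and the toy graph are hypothetical, and the code measures distance only, not the likelihood that an exposed node actually becomes a spreader.</p>
      <preformat>
```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set, Tuple

# Assumed edge orientation: (follower, followed). Information flows from
# the followed node to the follower.
Edge = Tuple[Hashable, Hashable]


def steps_from_spreaders(
    edges: Iterable[Edge], spreaders: Iterable[Hashable]
) -> Dict[Hashable, int]:
    """Map each reachable node to its follow-hop distance from the nearest spreader."""
    followers_of: Dict[Hashable, Set[Hashable]] = {}
    for follower, followed in edges:
        followers_of.setdefault(followed, set()).add(follower)

    steps = {node: 0 for node in spreaders}
    queue = deque(steps)
    while queue:
        node = queue.popleft()
        for follower in followers_of.get(node, ()):
            if follower not in steps:
                # First visit in breadth-first order gives the shortest distance.
                steps[follower] = steps[node] + 1
                queue.append(follower)
    return steps


# Hypothetical community: c and d follow spreader b; e follows c; f follows d.
toy_edges = [("c", "b"), ("d", "b"), ("e", "c"), ("f", "d")]
distances = steps_from_spreaders(toy_edges, ["b"])
exposed = {node for node, hops in distances.items() if hops == 1}
print(sorted(exposed))
```
      </preformat>
      <p>Nodes at distance 1 correspond to the exposed nodes in the text, and nodes at distance 2 to the members that are two steps away.</p>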
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Jin</surname>
          </string-name>
          , E. Dougherty,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saraf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          ,
          <article-title>Epidemiological modeling of news and rumors on twitter</article-title>
          ,
          <source>in: Proceedings of the 7th workshop on social network mining and analysis</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ruchansky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Y. Liu,
          <article-title>Combating fake news: A survey on identification and mitigation techniques</article-title>
          ,
          <source>ACM Transactions on Intelligent Systems and Technology (TIST) 10</source>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          , H. Liu,
          <article-title>Fake news detection on social media: A data mining perspective</article-title>
          ,
          <source>ACM SIGKDD explorations newsletter 19</source>
          (
          <year>2017</year>
          )
          <fpage>22</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Del Vicario</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vivaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bessi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zollo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Scala</surname>
          </string-name>
          , G. Caldarelli, W. Quattrociocchi,
          <article-title>Echo chambers: Emotional contagion and group polarization on facebook</article-title>
          ,
          <source>Scientific reports 6</source>
          (
          <year>2016</year>
          )
          <fpage>37825</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ghanem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          , E. Manjavacas,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Specht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          , et al.,
          <article-title>Shared tasks on authorship analysis at PAN 2020</article-title>
          ,
          <source>in: European Conference on Information Retrieval</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>508</fpage>
          -
          <lpage>516</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huh</surname>
          </string-name>
          ,
          <article-title>Trustingness &amp; trustworthiness: A pair of complementary trust measures in a social network</article-title>
          ,
          <source>in: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>549</fpage>
          -
          <lpage>554</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>Evaluating vulnerability to fake news in social networks: A community health assessment model</article-title>
          ,
          <source>in: 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>432</fpage>
          -
          <lpage>435</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Gao</surname>
          </string-name>
          , J. Ma, J. Srivastava,
          <article-title>Utilizing computational trust to identify rumor spreaders on twitter</article-title>
          ,
          <source>Social Network Analysis and Mining</source>
          <volume>8</volume>
          (
          <year>2018</year>
          )
          <fpage>64</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>