<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Framework for Complex In uence Propagation based on the F 2DLT Class of Di usion Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio Calio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Tagarelli</string-name>
          <email>tagarellig@dimes.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. Computer Engineering</institution>
          ,
          <addr-line>Modeling, Electronics</addr-line>
          ,
          <institution>and System Engineering, University of Calabria</institution>
          ,
          <addr-line>Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>What are the key-features that enable an information diffusion model to explain the inherent complex dynamics of real-world propagation phenomena? To answer the above question, we discuss a novel class of stochastic Linear Threshold (LT) di usion models, which are designed to capture the following aspects in in uence propagation scenarios: trust/distrust in the user relationships, changes in adopting one or alternative information items, hesitation towards adopting an information item over time, latency in the propagation, time horizon for the unfolding of the di usion process, and multiple cascades of information that might occur competitively. Around all such aspects, our de ned Friend-Foe Dynamic LT (F 2DLT ) class comprises a non-competitive model as well as two competitive models, which are able to represent semi-progressivity and non-progressivity, respectively, in the propagation process. The above key-constituents embedded in our models make them unique in the literature of di usion models, including epidemic models. To validate our models through real-world networks, we also discuss di erent strategies for the selection of the initial in uencers to mimic non-competitive and competitive di usion scenarios, inspired by the widely-known problem of limitation of misinformation spread. Finally, we describe a web-based simulation environment for testing the proposed di usion models.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Since the early applications in viral marketing, the development of information
di usion models and related optimization methods has provided e ective
support to address a variety of in uence propagation problems. Nowadays, due to
the shrinking boundary between real and online social life [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the presence
of multiple, competitive spreaders over the Web, which could also act towards
Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0). This volume is published
and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.
misinformation, deciding whether a source of information is reliable or not has
become a critical task. The di culty in assessing the reliability and
trustworthiness of the source generating or sharing a piece of information increases the
likelihood of people to be deceived by a spreading information. In addition to
this, the tendency to access information from like-minded sources [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] leads users
to be trapped in information bubbles, eventually causing forms of network
polarization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Polarization can be contrasted via actions devoted to fact-checking or
misinformation debunking, but time plays a crucial role in this game, as cognitive
phenomena of con rmation bias may easily arise.
      </p>
      <p>Understanding the trustworthiness of the source of an information item is
often challenging, as it requires tracking the information item of its
originator, which could be unfeasible in many cases. Therefore, it beomes essential to
capture the e ect of trust/distrust relationships on both the user behavior and
propagation dynamics. One big question hence arises:</p>
      <p>What are the key-features that make a di usion model able to explain
the inherent dynamic, and often competitive, nature of real-world
propagation phenomena?</p>
      <p>
        Contributions. To answer the above question, we discuss a class of
stochastic di usion models, named Friend-Foe Dynamic Linear Threshold Models (for
short, F 2DLT ), which has been originally proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Our models are
inspired by the classic Linear Threshold (LT) model, whereby an individual can
decide to take an action as a result of the exposure to multiple sources of in
uence. Major key-features of our models is that the information di usion graph
is de ned on top of a trust network, where trust is encoded into the in uence
probabilities, the response of a user to the in uencing attempts is described by
a time-varying activation threshold function, and also a quiescence function is
introduced to model the latency or delay in the propagation.
      </p>
      <p>Our F 2DLT models are designed to deal with non-competitive and
competitive propagation scenarios. For competitive campaign scenarios, one model is
semi-progressive, which assumes that a user, once activated, is only allowed to
switch to a di erent campaign, and another model is non-progressive, i.e., it
requires a user to have always the support of her/his in-neighbors to keep the
activation state with a certain campaign.</p>
      <p>To evaluate our models, we also devised four seed selection strategies, which
mimic di erent, realistic scenarios of in uence propagation. Experimental
evaluation conducted on real-world networks, also including comparison with
stochastic epidemic models, has provided interesting ndings on the meaningfulness and
uniqueness of our proposed models.
2
2.1</p>
    </sec>
    <sec id="sec-2">
      <title>Overview</title>
      <p>The F 2DLT class of di usion models</p>
      <p>Trust Network︎
Online social network</p>
      <p>Users︎
exogenous and
dynamic quiescence ︎
factors︎
exogenous and
dynamic
activationthreshold factors︎
. . .
of this framework are the following: (i) a set of online social network users; (ii) a
trust network, possibly inferred from a social network; (iii) user-behavior
characteristics that provide information for incorporating two main aspects into the
di usion process, namely activation-threshold and quiescence; (iv) information
related to one or multiple competing campaigns.</p>
      <p>Also, the information di usion process is supposed to have a certain time
horizon, and before its expiration, users respond to the network's stimuli which
may lead to being active in favor of one campaign or the opposing one. According
to their own behavior, users may have the possibility of switching from the
adoption of a campaign's item to that of another one.</p>
      <p>Putting it all together, our F 2DLT based framework embeds all the above
aspects that are essential to explain complex propagation phenomena, i.e.,
competitive di usion, non-progressivity, time-aware activation, delayed propagation,
and trust/distrust relations.</p>
      <p>Notably, in our setting, we tend to reject as true in general, the principle
\I agree with my friends' idea and disagree with my foes' idea" (which also
resembles the adage \the enemy of my enemy is my friend"), since this would
imply that the behavior of a user should be completely determined by the stimuli
coming from her/his neighbors. In other terms, friends should be responsible for
the in uenceability and activation of a user, as opposed to foes, which should
instead impact on delayed propagation. Therefore, in our models, the trusted
connections and distrusted connections play di erent roles: only friends can exert
a degree of (positive) in uence, whereas foes can only contribute to increase the
user's hesitation to commit with the propagation process.</p>
      <p>
        One assumption of our framework is the availability of trust relationships
between the users involved in the information di usion context. Although this
may not hold in practice, a few studies have been recently developed to infer
a trust network from the time-varying interactions observed in evolving social
networks, such as [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>Activation-threshold function. Given a directed, weighted network
representing an information di usion graph built on top of a trust network, every
node v is associated with an activation-threshold, v 2 (0; 1], which corresponds
to the e ort of activating the node which is needed in terms of cumulative in
uence. The activation-threshold function g, de ned over the set of nodes V and
the time interval of di usion T , enhances this concept. For each node v at time
t 2 T , it is de ned as:</p>
      <p>g(v; t) = v + #( v; t):</p>
      <p>Thus, a node v is activated when its perceived in uences exceed the threshold
v, plus the time-evolving activation term, #( ; ), which models the dynamic
response of users at any activation attempt. Our proposed models can in principle
embrace any de nition for #( ; ); here, we focus on two main scenarios.</p>
      <p>
        Biased . Modeled as a non-decreasing monotone function, it captures the
tendency of a user to consolidate her/his belief, according to the con rmation-bias
principle [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We de ne this function as follows:
g(v; t) = v + #( v; t) = v +
min
1
v ; t
tlvast
(1)
where tlvast denotes the last time v was activated and 0 denotes the increment
in the value of g(v; t) for consecutive time steps. Clearly, the value increases as
a node keeps staying in the same active state.
      </p>
      <p>Unbiased . In applications such as customer retention or churn prediction, a
user could revise her/his uncertainty to activate over time. We model this in
such a way that, for each v, the value of the function is maximum (i.e., 1) just
after the activation, i.e., at time t = tlvast + 1, then it starts to decrease from the
following time steps:
g(v; t) = v + #( v; t) = v + exp(
(t
tlvast
1))
vI[t
tlvast = 1]
(2)
Quiescence function. The quiescence value quanti es the latency in the
propagation through a particular node. It is de ned as a non-decreasing and
monotone function q : V; T 7! T , such that for every v 2 V; t 2 T , with v
activated at time t:
q(v; t) = v +
(N in(v); t);
where v 2 T represents an exogenous term modeling the user's hesitation in
being committed with the propagation process, and (N in(v); t) determines an
We assume the second additive term in Eq. (1) is zero if
C0
(3)
additional delay proportional to the amount of v's neighbors that are distrusted
and active, when the activation attempt is performed by the v's trusted
neighbors. Here we focus on the following function:
q(v; t) = v + (N in(v); t) = v + exp
X</p>
      <p>jwuvj
u2St 1
where</p>
      <p>0 quanti es the user sensitivity towards negative in uence.
2.2</p>
    </sec>
    <sec id="sec-3">
      <title>Di usion Models</title>
      <p>
        The F 2DLT class includes three di usion models [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which are now brie y and
informally described. The rst one refers to a single-item propagation scenario,
while the other two models can dela with competitive scenarios, i.e., two or
multiple campaigns spreading informative items over the network in a competitive
fashion. Figure 2 shows the life-cycle of a node in the non-competitive scenario
and in competitive ones, respectively. It should be noted that all the models share
a similar two-phase activation process. In fact, when a node is activated by its
neighbors, it always becomes quiescent. Its permanence in this state depends on
Eq. 3. After the quiescent time is expired, a node can be regarded as active,
regardless of its activation campaign, and is considered fully committed to the
propagation process. While the non-competitive model, dubbed nC-F 2DLT , is
a progressive model (i.e., once a node becomes active cannot be deactivated in
subsequent times), the semi-progressive competitive model, spC-F 2DLT , allows
a node to switch from a campaign to another, even multiple times. Finally, the
fully non-progressive competitive model, npC-F 2DLT , additionally allows for
node deactivation.
      </p>
      <p>
        Theoretical properties of the models. The non-competitive, progressive
model nC-F 2DLT is proven to be equivalent to LT with Quiescence Time i.e.,
the activation function in nC-F 2DLT is monotone and submodular. On the
other hand, the competitive spC-F 2DLT and npC-F 2DLT can be reduced,
via graph serialization, to a progressive Homogeneous Competitive LT, with
monotone, non-submodular activation function i.e., the activation function in
spC-F 2DLT and npC-F 2DLT is monotone but not submodular [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>Evaluation methodology</title>
        <p>
          Data. We used four real-world, publicly available networks, namely: Epinions,
Slashdot, Wiki-Con ict, and Wiki-Vote | please refer to [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] for details about
the datasets. All networks are originally directed and signed; in addition, the two
Wikipedia-based networks also have timestamped edges. The in uence
probability are derived so that to a network having a higher fraction of positive edges will
correspond to an equivalent network where users are more willing to be involved
in the propagation process.
        </p>
        <p>Seed selection strategies. We de ned four seed selection strategies, each of
which mimics a di erent, realistic scenario of in uence propagation.</p>
        <p>Exogenous and malicious sources of information. This method, hereinafter
referred to as M-Sources, aims at simulating the presence of multiple sources of
malicious information within the network. Here, an exogenous source is meant
as a node without incoming links. Formally, given a budget k, the method
selects the top-k users in a ranking solution determined as r(v) = (W =(W +
W +)) log(jN out(v)j) for every v such that N in(v) = ;. Here, W +, resp. W
denote the sum of trust, resp. distrust outgoing weights.</p>
        <p>Exogenous and in uential trusted sources of information. Analogously to
the previous method, this one, dubbed I-Sources, is concerned with exogenous
sources with emphasis on trusted links. Hence, the ranking function is: r(v) =
(W +=(W + W +)) log(jN out(v)j).</p>
        <p>Stress triads. This strategy is based on the notion of structural balance in
triads. In a typical stress-triad con guration there is a node v with two incoming
connections, from a node z with negative weight and from a node u with positive
weight, and such that z is connected to u via a positive link. In this case, z is
regarded as a stress-node since it could activate v through u, despite being
negatively linked to v. The Stress-Triads strategy selects seeds based on the
number of stress-triads that are involved in as stress-nodes.</p>
        <p>
          Newcomers. We call a node v 2 V as a newcomer if all of its incoming edges
are timestamped as less recent than its oldest outgoing edge. The start-time of
v is the oldest timestamp associated with its incoming edges. Within the set
of newcomers nodes, we further de ne two separate strategies: Least-New and
Most-New. Each one consists in the selection of the the top-k newcomers with
highest out-degree among those nodes with the oldest and the newest start-time,
respectively. Both strategies require temporal information upon edges.
Major Findings and Usage Recommendations. We pursued di erent goals
of evaluations depending on the particular di usion model. More speci cally, we
investigated the impact of the negative in uence on the nal active set in the
non-competitive scenario. For the competitive setting, our main goal was to
understand the e ect of the con rmation-bias e ect, in the context of
misinformation spreading. Therefore, we will have a good and a malicious actor. Finally,
we conducted a comprehensive comparative evaluation of our non-competitive
model against the Independent Cascade model as well as stochastic
individualcontact epidemic models SIR and SEIR [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>Here we summarize the major ndings emerged from the experiments:
F1 The setting of the dynamic activation-threshold function and quiescence
function plays a crucial role in positive-in uence propagation and
negativein uence/misinformation limitation.</p>
        <p>F2 The average user's sensitivity in the negative in uence perceived from
distrusted neighbors ( ) makes the seed identi cation more aware of the negative
in uence spread.</p>
        <p>F3 The con rmation-bias e ect ( ) may lead the \stronger" campaign to
increase its spread capability.</p>
        <p>F4 The non-progressive competitive model npC-F 2DLT appears to be less
sensitive to the increase of and tends to favor deactivation events (for users
previously activated by the weaker campaign) over switched events.</p>
        <p>F5 If compared with classic di usion models as IC, SIR, and SEIR, the
nC-F 2DLT model tends to favor a slower di usion, since the propagation
process lasts consistently longer than IC and the epidemic models.</p>
        <p>F6 Threshold models are more suitable when it is required to model
propagation phenomena where it is important to capture the di erence in behavior of
each individual. On the contrary, epidemic models does not exhibit this
important capability, as they are more suitable to represent single contagions.</p>
        <p>F7 I-Sources reveals higher spread capability, followed by Stress-Triads. On
the contrary, the newcomers-based strategies show a signi cant spread capability,
coupled with a negligible e ect on the negative in uence.</p>
        <p>The above results provide evidence that our class of trust-aware models o ers
an opportunity to better represent the complexity of real-world propagation
phenomena. As a consequence, our models can pave the way for the development
of sophisticated methods to solve misinformation spread limitation and related
optimization problems. Also, our models can pro tably be used in a variety
of applications whereby there is an emergence to predict the time required to
debunk fake information, or to estimate how people are a ected by the spread of
competitive opposite opinions through a social network.
4</p>
      </sec>
      <sec id="sec-3-2">
        <title>Web-based Simulation Environment</title>
        <p>As part of our contributions to the research project NextShop PON Grant No.
F/050374/01-03/X32, we developed a web application of our di usion models,
available at http://people.dimes.unical.it/andreatagarelli/f2dlt/ . It
is a simulation environment that allows a user to execute a di usion model
in the F 2DLT class, and get an interactive view of a propagation process. As
input, an in uence graph can be either uploaded from a le or synthetically
generated; in the latter case, the user is asked to provide the following
information: (i) number of nodes, (ii) number of edges, (iii) percentage of negative
edges, and (iv) graph-generator model, by choosing among the Erd}os{Renyi,
Watts-Strogatz, and Barabasi-Albert models. Once the network is created, it is
displayed with positive (trust) edges represented in black, and negative (distrust)
edges represented in red. Also, after selecting a F 2DLT model, both the
activation threshold function and the quiescence function can be con gured by setting
#( ; ),and ( ; ). Finally, the early adopters, i.e., the seeds of the propagation
process, can be selected by clicking upon speci c nodes of the displayed network.
While the di usion process is running, the application updates the layout of the
network, and nodes will dynamically change their color accordingly to their
activation status. Moreover, the application will show a number of statistics about
the propagation process, as well as the temporal evolution of the propagation
process.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Anagnostopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bessi</surname>
          </string-name>
          , G. Caldarelli,
          <string-name>
            <given-names>M.</given-names>
            <surname>Del Vicario</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Scala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zollo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Quattrociocchi</surname>
          </string-name>
          .
          <article-title>Viral misinformation: The role of homophily and polarization</article-title>
          .
          <source>In Proc. WebConf</source>
          , pp.
          <volume>355</volume>
          {
          <issue>356</issue>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Bessi</surname>
          </string-name>
          , G. Caldarelli,
          <string-name>
            <given-names>M.</given-names>
            <surname>Del Vicario</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Scala</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Quattrociocchi</surname>
          </string-name>
          .
          <article-title>Social determinants of content selection in the age of (mis)information</article-title>
          .
          <source>In Proc. SocInfo</source>
          , pp.
          <volume>259</volume>
          {
          <issue>268</issue>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Antonio Calio and
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          .
          <article-title>Complex in uence propagation based on trustaware dynamic linear threshold models</article-title>
          .
          <source>Applied Network Science</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <volume>14</volume>
          :1{
          <fpage>14</fpage>
          :
          <fpage>41</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>K.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. De Francisci Morales</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Gionis</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Mathioudakis</surname>
          </string-name>
          .
          <article-title>The e ect of collective attention on controversial debates on social media</article-title>
          .
          <source>In Proc. ACM Web Science</source>
          , pp.
          <volume>43</volume>
          {
          <issue>52</issue>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>D.</given-names>
            <surname>Koutra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Bennett</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Horvitz</surname>
          </string-name>
          .
          <article-title>Events and controversies: In uences of a shocking news event on information seeking</article-title>
          .
          <source>In Proc. WebConf</source>
          , pp.
          <volume>614</volume>
          {
          <issue>624</issue>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Domenico</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          .
          <article-title>Generalized preference learning for trust network inference</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>7</volume>
          :
          <fpage>174583</fpage>
          {
          <fpage>174596</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>