=Paper=
{{Paper
|id=Vol-2646/32-paper
|storemode=property
|title=A Framework for Complex Influence Propagation based on the F2DLT Class of Diffusion Models
|pdfUrl=https://ceur-ws.org/Vol-2646/32-paper.pdf
|volume=Vol-2646
|authors=Antonio Caliò,Andrea Tagarelli
|dblpUrl=https://dblp.org/rec/conf/sebd/CalioT20
}}
==A Framework for Complex Influence Propagation based on the F2DLT Class of Diffusion Models==
A Framework for Complex Influence Propagation based on the F2DLT Class of Diffusion Models (Discussion Paper)

Antonio Caliò and Andrea Tagarelli
Dept. Computer Engineering, Modeling, Electronics, and System Engineering, University of Calabria, Rende (CS), Italy
{a.calio, tagarelli}@dimes.unical.it

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.

Abstract. What are the key features that enable an information diffusion model to explain the inherent complex dynamics of real-world propagation phenomena? To answer this question, we discuss a novel class of stochastic Linear Threshold (LT) diffusion models, designed to capture the following aspects of influence propagation scenarios: trust/distrust in user relationships, changes in adopting one or alternative information items, hesitation towards adopting an information item over time, latency in the propagation, a time horizon for the unfolding of the diffusion process, and multiple cascades of information that might occur competitively. Around all such aspects, our Friend-Foe Dynamic LT (F2DLT) class comprises a non-competitive model as well as two competitive models, which are able to represent semi-progressivity and non-progressivity, respectively, in the propagation process. These key constituents embedded in our models make them unique in the literature of diffusion models, including epidemic models. To validate our models on real-world networks, we also discuss different strategies for the selection of the initial influencers that mimic non-competitive and competitive diffusion scenarios, inspired by the widely known problem of limiting the spread of misinformation. Finally, we describe a web-based simulation environment for testing the proposed diffusion models.

1 Introduction

Since the early applications in viral marketing, the development of information diffusion models and related optimization methods has provided effective support to address a variety of influence propagation problems. Nowadays, due to the shrinking boundary between real and online social life [2] and the presence of multiple, competitive spreaders over the Web, which could also act towards misinformation, deciding whether a source of information is reliable or not has become a critical task. The difficulty in assessing the reliability and trustworthiness of the source generating or sharing a piece of information increases the likelihood that people are deceived by spreading information. In addition, the tendency to access information from like-minded sources [5] leads users to be trapped in information bubbles, eventually causing forms of network polarization [4]. Polarization can be contrasted via actions devoted to fact-checking or misinformation debunking, but time plays a crucial role in this game, as cognitive phenomena of confirmation bias may easily arise.

Understanding the trustworthiness of the source of an information item is often challenging, as it requires tracing the information item back to its originator, which could be unfeasible in many cases. Therefore, it becomes essential to capture the effect of trust/distrust relationships on both user behavior and propagation dynamics.
One big question hence arises: What are the key features that make a diffusion model able to explain the inherent dynamic, and often competitive, nature of real-world propagation phenomena?

Contributions. To answer the above question, we discuss a class of stochastic diffusion models, named Friend-Foe Dynamic Linear Threshold Models (for short, F2DLT), which was originally proposed in [3]. Our models are inspired by the classic Linear Threshold (LT) model, whereby an individual can decide to take an action as a result of the exposure to multiple sources of influence. Major key features of our models are that the information diffusion graph is defined on top of a trust network, where trust is encoded into the influence probabilities, the response of a user to influencing attempts is described by a time-varying activation-threshold function, and a quiescence function is introduced to model the latency or delay in the propagation.

Our F2DLT models are designed to deal with non-competitive and competitive propagation scenarios. For competitive campaign scenarios, one model is semi-progressive, i.e., it assumes that a user, once activated, is only allowed to switch to a different campaign, and another model is non-progressive, i.e., it requires a user to always have the support of her/his in-neighbors to keep the activation state with a certain campaign.

To evaluate our models, we also devised four seed selection strategies, which mimic different, realistic scenarios of influence propagation. Experimental evaluation conducted on real-world networks, also including a comparison with stochastic epidemic models, has provided interesting findings on the meaningfulness and uniqueness of our proposed models.

2 The F2DLT class of diffusion models

2.1 Overview

Figure 1 sketches the conceptual architecture of a framework for information diffusion and influence propagation based on our proposed models.

[Fig. 1: Illustration of the F2DLT framework.]

Key constituents of this framework are the following: (i) a set of online social network users; (ii) a trust network, possibly inferred from a social network; (iii) user-behavior characteristics that provide information for incorporating two main aspects into the diffusion process, namely activation-threshold and quiescence; (iv) information related to one or multiple competing campaigns. Also, the information diffusion process is supposed to have a certain time horizon, and before its expiration, users respond to the network's stimuli, which may lead them to become active in favor of one campaign or the opposing one. According to their own behavior, users may have the possibility of switching from the adoption of a campaign's item to that of another one.

Putting it all together, our F2DLT-based framework embeds all the above aspects that are essential to explain complex propagation phenomena, i.e., competitive diffusion, non-progressivity, time-aware activation, delayed propagation, and trust/distrust relations. Notably, in our setting, we tend to reject as true in general the principle "I agree with my friends' ideas and disagree with my foes' ideas" (which also resembles the adage "the enemy of my enemy is my friend"), since this would imply that the behavior of a user should be completely determined by the stimuli coming from her/his neighbors. In other terms, friends should be responsible for the influenceability and activation of a user, as opposed to foes, which should instead impact on delayed propagation. Therefore, in our models, trusted connections and distrusted connections play different roles: only friends can exert a degree of (positive) influence, whereas foes can only contribute to increasing the user's hesitation to commit to the propagation process. A minimal sketch of how such a signed influence graph could be represented is given below.
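To make the above constituents concrete, the following is a minimal Python sketch (not part of the original paper) of a data structure for the signed, weighted influence graph that the F2DLT models operate on. The class and field names (InfluenceGraph, theta, tau) are illustrative assumptions rather than identifiers from the authors' implementation; positive edge weights stand for trusted (friend) links and negative weights for distrusted (foe) links.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class InfluenceGraph:
    """Hypothetical container for the signed, weighted influence graph sketched above."""
    # in_weights[v][u] = w_uv, the influence weight that u exerts on v
    # (positive = trusted/friend link, negative = distrusted/foe link)
    in_weights: Dict[str, Dict[str, float]] = field(default_factory=dict)
    theta: Dict[str, float] = field(default_factory=dict)  # activation thresholds in (0, 1]
    tau: Dict[str, int] = field(default_factory=dict)       # exogenous quiescence delays

    def add_edge(self, u: str, v: str, w: float) -> None:
        self.in_weights.setdefault(v, {})[u] = w

    def trusted_in_neighbors(self, v: str):
        return [u for u, w in self.in_weights.get(v, {}).items() if w > 0]

    def distrusted_in_neighbors(self, v: str):
        return [u for u, w in self.in_weights.get(v, {}).items() if w < 0]
```

For instance, graph.add_edge("u", "v", -0.3) would record a foe link through which u can only add to v's hesitation, while a positive weight would record a friend link through which u can exert influence on v.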
One assumption of our framework is the availability of trust relationships between the users involved in the information diffusion context. Although this may not hold in practice, a few studies have recently been developed to infer a trust network from the time-varying interactions observed in evolving social networks, such as [6].

Activation-threshold function. Given a directed, weighted network representing an information diffusion graph built on top of a trust network, every node v is associated with an activation threshold θ_v ∈ (0, 1], which corresponds to the effort, in terms of cumulative influence, that is needed to activate the node. The activation-threshold function g, defined over the set of nodes V and the time interval of diffusion T, enhances this concept. For each node v at time t ∈ T, it is defined as g(v, t) = θ_v + ϑ(θ_v, t). Thus, a node v is activated when its perceived influence exceeds the threshold θ_v plus the time-evolving activation term ϑ(·,·), which models the dynamic response of users to any activation attempt. Our proposed models can in principle embrace any definition for ϑ(·,·); here, we focus on two main scenarios, for which a small code sketch is also given below.

Biased. Modeled as a non-decreasing monotone function, it captures the tendency of a user to consolidate her/his belief, according to the confirmation-bias principle [1]. We define this function as follows:

g(v, t) = θ_v + ϑ(θ_v, t) = θ_v + δ × min{ (1 − θ_v)/δ, t − t_v^last }   (1)

where t_v^last denotes the last time v was activated and δ ≥ 0 denotes the increment in the value of g(v, t) for consecutive time steps; the second additive term in Eq. (1) is assumed to be zero if δ = 0. Clearly, the value increases as long as a node keeps staying in the same active state.

Unbiased. In applications such as customer retention or churn prediction, a user could revise her/his uncertainty to activate over time. We model this in such a way that, for each v, the value of the function is maximum (i.e., 1) just after the activation, i.e., at time t = t_v^last + 1, and then it starts to decrease from the following time steps:

g(v, t) = θ_v + ϑ(θ_v, t) = θ_v + exp(−δ(t − t_v^last − 1)) − θ_v · I[t − t_v^last = 1]   (2)
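As a minimal illustration of Eqs. (1)-(2), the following Python sketch computes the time-varying threshold g(v, t) for both variants; parameter names (theta_v, t_last, delta) are illustrative, and this is a direct reading of the formulas above rather than the authors' reference implementation.

```python
import math

def activation_threshold(theta_v: float, t: int, t_last: int, delta: float,
                         mode: str = "biased") -> float:
    """Time-varying activation threshold g(v, t) = theta_v + vartheta(theta_v, t)."""
    if mode == "biased":
        # Eq. (1): the threshold grows while v keeps the same active state,
        # saturating at 1; the additive term is assumed to be zero when delta = 0.
        if delta == 0:
            return theta_v
        return theta_v + delta * min((1.0 - theta_v) / delta, t - t_last)
    else:
        # Eq. (2): maximum (i.e., 1) right after activation (t = t_last + 1),
        # then decaying back towards theta_v at later time steps.
        indicator = 1.0 if (t - t_last) == 1 else 0.0
        return theta_v + math.exp(-delta * (t - t_last - 1)) - theta_v * indicator
```

For example, with theta_v = 0.4 and delta = 0.1, the biased threshold rises by 0.1 for each step the node remains in the same active state and saturates at 1 after six steps.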
Quiescence function. The quiescence value quantifies the latency in the propagation through a particular node. It is defined as a non-decreasing monotone function q : V × T → T such that, for every v ∈ V and t ∈ T, with v activated at time t:

q(v, t) = τ_v + ψ(N_in^−(v), t),

where τ_v ∈ T represents an exogenous term modeling the user's hesitation in being committed to the propagation process, and ψ(N_in^−(v), t) determines an additional delay proportional to the amount of v's neighbors that are distrusted and active when the activation attempt is performed by v's trusted neighbors. Here we focus on the following function:

q(v, t) = τ_v + ψ(N_in^−(v), t) = τ_v + exp( λ × Σ_{u ∈ N_in^−(v) ∩ S_{t−1}} |w_uv| )   (3)

where λ ≥ 0 quantifies the user sensitivity towards negative influence, and S_{t−1} denotes the set of nodes that are active at the previous time step.

2.2 Diffusion Models

The F2DLT class includes three diffusion models [3], which are now briefly and informally described. The first one refers to a single-item propagation scenario, while the other two models can deal with competitive scenarios, i.e., two or multiple campaigns spreading information items over the network in a competitive fashion. Figure 2 shows the life-cycle of a node in the non-competitive scenario and in the competitive ones, respectively.

[Fig. 2: Life-cycle of a node for the non-competitive model (a) and the competitive models (b). Straight lines in (b) represent the transitions common to both spC-F2DLT and npC-F2DLT, while dashed lines refer to npC-F2DLT only.]

It should be noted that all the models share a similar two-phase activation process. In fact, when a node is activated by its neighbors, it always becomes quiescent. Its permanence in this state depends on Eq. (3). After the quiescence time has expired, the node is regarded as active, regardless of its activation campaign, and is considered fully committed to the propagation process. While the non-competitive model, dubbed nC-F2DLT, is a progressive model (i.e., once a node becomes active it cannot be deactivated at subsequent times), the semi-progressive competitive model, spC-F2DLT, allows a node to switch from one campaign to another, even multiple times. Finally, the fully non-progressive competitive model, npC-F2DLT, additionally allows for node deactivation.

Theoretical properties of the models. The non-competitive, progressive model nC-F2DLT is proven to be equivalent to LT with Quiescence Time, i.e., the activation function in nC-F2DLT is monotone and submodular. On the other hand, the competitive spC-F2DLT and npC-F2DLT can be reduced, via graph serialization, to a progressive Homogeneous Competitive LT with a monotone, non-submodular activation function, i.e., the activation functions in spC-F2DLT and npC-F2DLT are monotone but not submodular [3]. A simplified sketch of a single diffusion step, combining the activation-threshold and quiescence functions, is given below.
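To show how the activation-threshold and quiescence functions interact within the two-phase activation process, here is a simplified Python sketch of one time step of a non-competitive (nC-F2DLT-style) diffusion. It reuses the hypothetical InfluenceGraph and activation_threshold sketches introduced above, assumes that the perceived influence is the sum of weights from active trusted in-neighbors, and omits the competitive and non-progressive mechanics detailed in [3]; it is an illustrative reading of the model, not the authors' algorithm.

```python
import math
# InfluenceGraph and activation_threshold come from the earlier sketches.

def quiescence_delay(graph, v: str, active: set, lam: float) -> float:
    # Eq. (3): exogenous hesitation tau_v plus a term that grows with the total
    # (absolute) weight of v's distrusted in-neighbors that are currently active.
    negative_pressure = sum(abs(w) for u, w in graph.in_weights.get(v, {}).items()
                            if w < 0 and u in active)
    return graph.tau.get(v, 0) + math.exp(lam * negative_pressure)

def diffusion_step(graph, active, quiescent, last_activation, t, delta, lam, mode="biased"):
    """One simplified step of the inactive -> quiescent -> active life-cycle."""
    # 1. Quiescent nodes whose quiescence time has expired become active.
    for v, wake_time in list(quiescent.items()):
        if t >= wake_time:
            del quiescent[v]
            active.add(v)
            last_activation[v] = t
    # 2. Inactive nodes become quiescent when the cumulative influence received
    #    from active, trusted in-neighbors reaches their current threshold g(v, t).
    for v in graph.theta:
        if v in active or v in quiescent:
            continue
        influence = sum(w for u, w in graph.in_weights.get(v, {}).items()
                        if w > 0 and u in active)
        g_v = (activation_threshold(graph.theta[v], t, last_activation[v], delta, mode)
               if v in last_activation else graph.theta[v])  # base threshold if never active
        if influence >= g_v:
            quiescent[v] = t + quiescence_delay(graph, v, active, lam)
    return active, quiescent
```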
3 Evaluation methodology

Data. We used four real-world, publicly available networks, namely Epinions, Slashdot, Wiki-Conflict, and Wiki-Vote; please refer to [3] for details about the datasets. All networks are originally directed and signed; in addition, the two Wikipedia-based networks also have timestamped edges. The influence probabilities are derived so that a network with a higher fraction of positive edges corresponds to an equivalent network whose users are more willing to be involved in the propagation process.

Seed selection strategies. We defined four seed selection strategies, each of which mimics a different, realistic scenario of influence propagation.

Exogenous and malicious sources of information. This method, hereinafter referred to as M-Sources, aims at simulating the presence of multiple sources of malicious information within the network. Here, an exogenous source is meant as a node without incoming links. Formally, given a budget k, the method selects the top-k users in a ranking solution determined as r(v) = (W̄^− / (W̄^− + W̄^+)) log(|N^out(v)|) for every v such that N^in(v) = ∅, where W̄^+ and W̄^− denote the sum of trust and distrust outgoing weights, respectively.

Exogenous and influential trusted sources of information. Analogously to the previous method, this one, dubbed I-Sources, is concerned with exogenous sources, with emphasis on trusted links. Hence, the ranking function is r(v) = (W̄^+ / (W̄^− + W̄^+)) log(|N^out(v)|). A small sketch of both ranking functions is given after the list of strategies below.

Stress triads. This strategy is based on the notion of structural balance in triads. In a typical stress-triad configuration there is a node v with two incoming connections, one from a node z with negative weight and one from a node u with positive weight, and such that z is connected to u via a positive link. In this case, z is regarded as a stress-node, since it could activate v through u despite being negatively linked to v. The Stress-Triads strategy selects seeds based on the number of stress-triads in which they are involved as stress-nodes.

Newcomers. We call a node v ∈ V a newcomer if all of its incoming edges are timestamped as less recent than its oldest outgoing edge. The start-time of v is the oldest timestamp associated with its incoming edges. Within the set of newcomer nodes, we further define two separate strategies, Least-New and Most-New. Each one selects the top-k newcomers with highest out-degree among those nodes with the oldest and the newest start-time, respectively. Both strategies require temporal information on edges.
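As a concrete reading of the ranking functions used by M-Sources and I-Sources, the following Python sketch scores exogenous nodes (nodes with no incoming links) and returns the top-k of them; it reuses the hypothetical InfluenceGraph sketch above and is only an illustration of the formulas, not the authors' implementation.

```python
import math
# InfluenceGraph comes from the earlier sketch.

def rank_exogenous_sources(graph, all_nodes, k: int, malicious: bool = True):
    """Top-k exogenous seeds: malicious=True gives M-Sources, False gives I-Sources."""
    # Collect outgoing weights per node and record which nodes have incoming links.
    out_weights = {v: [] for v in all_nodes}
    has_incoming = set()
    for v, in_nb in graph.in_weights.items():
        if in_nb:
            has_incoming.add(v)
        for u, w in in_nb.items():
            out_weights.setdefault(u, []).append(w)
    scores = {}
    for v in all_nodes:
        if v in has_incoming or not out_weights.get(v):
            continue  # keep only exogenous nodes with at least one outgoing link
        w_pos = sum(w for w in out_weights[v] if w > 0)        # total trust weight
        w_neg = sum(abs(w) for w in out_weights[v] if w < 0)   # total distrust weight
        if w_pos + w_neg == 0:
            continue
        ratio = (w_neg if malicious else w_pos) / (w_pos + w_neg)
        scores[v] = ratio * math.log(len(out_weights[v]))      # r(v) as defined above
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For example, with k = 5 and malicious=True this returns up to five source-like nodes whose outgoing links are predominantly distrust edges, mimicking malicious spreaders.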
Major Findings and Usage Recommendations. We pursued different evaluation goals depending on the particular diffusion model. More specifically, we investigated the impact of negative influence on the final active set in the non-competitive scenario. For the competitive setting, our main goal was to understand the effect of confirmation bias in the context of misinformation spreading; therefore, we considered a good actor and a malicious one. Finally, we conducted a comprehensive comparative evaluation of our non-competitive model against the Independent Cascade (IC) model as well as the stochastic individual-contact epidemic models SIR and SEIR [3].

Here we summarize the major findings that emerged from the experiments:

F1 The setting of the dynamic activation-threshold function and quiescence function plays a crucial role in positive-influence propagation and negative-influence/misinformation limitation.

F2 The average user's sensitivity to the negative influence perceived from distrusted neighbors (λ) makes the seed identification more aware of the negative influence spread.

F3 The confirmation-bias effect (δ) may lead the "stronger" campaign to increase its spread capability.

F4 The non-progressive competitive model npC-F2DLT appears to be less sensitive to increases of δ and tends to favor deactivation events (for users previously activated by the weaker campaign) over switch events.

F5 Compared with classic diffusion models such as IC, SIR, and SEIR, the nC-F2DLT model tends to favor a slower diffusion, since the propagation process lasts consistently longer than with IC and the epidemic models.

F6 Threshold models are more suitable when it is required to model propagation phenomena where it is important to capture the difference in behavior of each individual. On the contrary, epidemic models do not exhibit this important capability, as they are more suitable to represent single contagions.

F7 I-Sources reveals the highest spread capability, followed by Stress-Triads. By contrast, the newcomer-based strategies show a significant spread capability coupled with a negligible effect on the negative influence.

The above results provide evidence that our class of trust-aware models offers an opportunity to better represent the complexity of real-world propagation phenomena. As a consequence, our models can pave the way for the development of sophisticated methods to solve misinformation spread limitation and related optimization problems. Also, our models can profitably be used in a variety of applications where there is a need to predict the time required to debunk fake information, or to estimate how people are affected by the spread of competing, opposite opinions through a social network.

4 Web-based Simulation Environment

As part of our contributions to the research project NextShop PON Grant No. F/050374/01-03/X32, we developed a web application of our diffusion models, available at http://people.dimes.unical.it/andreatagarelli/f2dlt/ . It is a simulation environment that allows a user to execute a diffusion model of the F2DLT class and get an interactive view of a propagation process. As input, an influence graph can be either uploaded from a file or synthetically generated; in the latter case, the user is asked to provide the following information: (i) number of nodes, (ii) number of edges, (iii) percentage of negative edges, and (iv) graph-generator model, choosing among the Erdős–Rényi, Watts-Strogatz, and Barabási-Albert models. Once the network is created, it is displayed with positive (trust) edges represented in black and negative (distrust) edges represented in red. Also, after selecting an F2DLT model, both the activation-threshold function and the quiescence function can be configured by setting ϑ(·,·) and ψ(·,·). Finally, the early adopters, i.e., the seeds of the propagation process, can be selected by clicking on specific nodes of the displayed network. While the diffusion process is running, the application updates the layout of the network, and nodes dynamically change their color according to their activation status. Moreover, the application shows a number of statistics about the propagation process, as well as its temporal evolution.

[Fig. 3: Screenshot of our web-based simulation environment.]

References

1. A. Anagnostopoulos, A. Bessi, G. Caldarelli, M. Del Vicario, F. Petroni, A. Scala, F. Zollo, and W. Quattrociocchi. Viral misinformation: The role of homophily and polarization. In Proc. WebConf, pp. 355-356, 2015.
2. A. Bessi, G. Caldarelli, M. Del Vicario, A. Scala, and W. Quattrociocchi. Social determinants of content selection in the age of (mis)information. In Proc. SocInfo, pp. 259-268, 2014.
3. A. Caliò and A. Tagarelli. Complex influence propagation based on trust-aware dynamic linear threshold models. Applied Network Science, 4(1):14:1-14:41, 2019.
4. K. Garimella, G. De Francisci Morales, A. Gionis, and M. Mathioudakis. The effect of collective attention on controversial debates on social media. In Proc. ACM Web Science, pp. 43-52, 2017.
5. D. Koutra, P. N. Bennett, and E. Horvitz. Events and controversies: Influences of a shocking news event on information seeking. In Proc. WebConf, pp. 614-624, 2015.
6. D. Mandaglio and A. Tagarelli. Generalized preference learning for trust network inference. IEEE Access, 7:174583-174596, 2019.