Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
    July 27th, 2015 - Buenos Aires, Argentina


                Influence maximization based on the least influential spreaders
    Didier A. Vega-Oliveros, Lilian Berton, Alneu de Andrade Lopes and Francisco A. Rodrigues
                         Instituto de Ciências Matemáticas e de Computação
                          Universidade de São Paulo - Campus de São Carlos
                                   13560-970 São Carlos, SP - Brazil
                         Email: {davo, lberton, alneu, francisco}@icmc.usp.br

                           Abstract                                         Some individuals can have a higher social influence than
                                                                         others, according to their position, topological character-
     The emergence of social media increases the need                    istics in the network and dynamic [Kitsak et al., 2010;
     for the recognization of social influence mainly mo-                González-Baillón et al., 2011; Borge-Holthoefer et al., 2012].
     tivated by online advertising, political and health                 A lot of researches have tried to identify influence in social
     campaigns, recommendation systems, epidemio-                        networks [Burt, 1992; Sabidussi, 1996; Kitsak et al., 2010;
     logical study, etc. In spreading processes, it is pos-              Borge-Holthoefer et al., 2012]. It was found that the final
     sible to define the most central or influential ver-                fraction of informed individuals is not significantly affected
     tices according to the network topology and dy-                     by chosen the spreaders by topological features [Moreno et
     namic. On the other hand, the least influential                     al., 2004]. However, incorporating characteristics of real sce-
     spreaders have been disregarded. This paper aims                    narios as the activity of spreader or apathy with the new infor-
     to maximize the mean of information propagation                     mation [Borge-Holthoefer et al., 2012], or exposure to multi-
     on the network by recognizing the non influential                   ple sources and individual’s thresholds [González-Baillón et
     individuals by making them better spreader. Ex-                     al., 2011] lead to the emergence of influential spreaders in the
     perimental results confirm that selecting 0.5% of                   information spreading dynamic.
     least influential spreaders in three social networks
     (google+, hamsterster and advogato) and rewiring                       In the other hand, the least influential individuals of the
     one connection to some important vertex, increase                   network are ignored in the literature. Let consider the case of
     the propagation over the entire network.                            a scientific collaboration network. Some research groups are
                                                                         more prominent than others. If a researcher with few publi-
                                                                         cation or contacts visits a prominent group, by the homophily
1   Introduction                                                         phenomenon he/she can consequently improve his/her publi-
Nowadays social media, such as blogs, social networks,                   cation capacity. Based on this, we measure the vertices’ in-
sharing sites, etc., can quickly spread rumors or informa-               fluence and recognize the least influential spreaders. Thereby,
tion [Castellano et al., 2009; González-Baillón et al., 2011],         we propose to turn those individuals into better spreaders by
reaching millions of users in the Global village. Propagation            connecting to the most central nodes.
process evolving information or rumors can be both benefi-                  The main contributions of this paper are: i) confirm the ex-
cial or destructive in scenarios like stock market, advertising          istence of least influential individuals in the network; ii) char-
and marketing, adoption of technologies, politics and national           acterize the least influential spreaders by analyzing central-
security, among other.                                                   ity measures, such as degree, betweenness, closeness, pager-
   Many of the approaches to information or rumor spread-                ank, eigenvector, k-core, clustering coefficient and struc-
ing [Moreno et al., 2004; Castellano et al., 2009; González-            tural holes; iii) select a proportion of least influential users
Baillón et al., 2011; Borge-Holthoefer et al., 2012] concen-            from google+, hamsterster and advogato social networks and
trate in how ideas are shared among individuals and what                 make them better spreaders rewiring one of its connection to
are the conditions that allow a large dissemination. For this            a more central individual; iv) analyze the impact of different
reason, they are understood as being equivalent to dynam-                proportions in the rewiring process and how it increases the
ics like epidemics spreading, in the sense that individuals              mean of information spreading on the network.
would be psychological ‘infected’ with some idea or opin-                   The paper is organized as follows: Section 2 describes the
ion [Daley and Kendall, 1964; Maki and Thompson, 1973;                   centrality measures considered for network characterization
Barrat et al., 2008; Castellano et al., 2009]. The rumor                 and the spreading process (SIR model); Section 3 presents the
spreading process assumes that the population can be divided             simulations in three social networks (google+, hamsterster
into groups with different states, which is generally described          and advogato), recognizing the least influential users, mak-
as the Susceptible (inactive), Infectious (spreader) and Re-             ing them potential spreaders and increasing the information
covered (stifler) (SIR) model [Daley and Kendall, 1964;                  capacity of the networks. Finally, Section 4 reports the final
Maki and Thompson, 1973].                                                remarks.
Copyright c 2015 for the individual papers by the papers’ authors.
Copying permitted for private and academic purposes. This volume
is published and copyrighted by its editors

                                                                     3
      Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
      July 27th, 2015 - Buenos Aires, Argentina


2     Definitions                                                         the importance of the vertices given the quality of its connec-
A social network can be represented as a static unweighted                tions [Bonacich, 1972].
network G = (V, E), such as V = {v1 , v2 , ... vn } is the                   Pagerank centrality (PR) derives from a Markovian pro-
set of N vertices or actors and E = {e1 , e2 , ... em } is the            cess that follows a random walk navigation through the net-
set of M edges or connections. The adjacency matrix A is                  work. It expresses the importance of the vertices considering
the mathematical entity that represents the connections of the            the probability of arriving at certain vertex after a large num-
network. We consider undirected networks, that means i, j                 ber of steps. The pagerank was initially proposed to rank web
∈ V , Aij = Aji = 1 if i and j are connected, or 0 otherwise.             pages [Brin and Page, 1998] and the idea is to simulate the be-
   A walk between a pair of vertices (i, j) is a consecutive              havior of a user that is surfing on the net. The user navigates
sequence that starts at i and ends in j, so that any vertices are         following the links available at the current page or, she/he can
visited more than once. The distance or length of the walk                jump to other pages by typing a new URL in the browser. In
is defined as the number of edges contained in the sequence.              social networks it can be approached like the more cited or
Two vertices are neighbors if they are connected in a walk of             renowned individuals.
length 1. The shortest distance between two vertices is known                Closeness centrality (CL) is the average of the shortest
as the shortest path or geodesic path `ij . The shortest paths            paths from each vertex to the rest of the network [Sabidussi,
can be computed by the Dijkstra, BellmanFord algorithms or                1996]. Formally, the closeness centrality is the inverse of the
by a Breadth-first search method [Cormen et al., 2009]. A                              Xshortest paths from i to all the vertices, i.e.,
                                                                          average of the
component is the largest sub-set of vertices from the network             CLi = N/         `ij . Thereby, vertices that are closer to the
in which exist at least one walk between each pair of vertices,                        j6=i
but never connect to another component. A connected net-                  others have higher closeness centrality. For the sake of calcu-
work has only one component. In the case that i and j belong              lating the centrality for disconnected networks, the average is
to different components, it is assumed that `ij = ∞. For                  considered by component.
this reason, here we considered the largest component of the                 k-core centrality (KC) describes the topology of the net-
network.                                                                  work in terms of sub network decomposition in cores. The
2.1    Centrality measures                                                core of k order (Hk ) is the set of vertices where for each
                                                                          vertex i in Hk its degree ki ≥ k. It means, k is the maxi-
Several measures have been proposed to describe the impor-
                                                                          mum core that i can belongs, with Kc(i) = k and Hk been
tance or centrality of a vertex in the network [Costa et al.,
                                                                          the largest set of vertices with this property [Seidman, 1983].
2007]. These centrality measures are defined considering par-
                                                                          The main core is the set of vertices with the largest k-core
ticular definitions of influence [Newman, 2003]. It is assumed
                                                                          value from the network, and these vertices are the most cen-
that vertices with higher centrality measure are more suitable
                                                                          tral. Vertices with low values of k-core are commonly located
to influence in the opinion of others. Among some of the
                                                                          at the periphery of the network. Not necessary all the high-
possibilities, we have the popularity of a vertex (degree cen-
                                                                          degree vertices have higher k-core values. For instance, hubs
trality) [Costa et al., 2007], the proximity or how close an
                                                                          located in the periphery would have small values of k-core,
individual is to the others (closeness centrality) [Sabidussi,
                                                                          or vertices with larger k-core value could have not so large
1996], the trusted vertices in the transmission of informa-
                                                                          degree [Kitsak et al., 2010].
tion (betweenness centrality) [Freeman, 1977], the proximity
of vertices to the network core (k-core) [Seidman, 1983] or                  Clustering coefficient (CT) or transitivity is a common
even the renowned that an individual has (pagerank central-               property in real-world networks. In social networks it means
ity) [Brin and Page, 1998]. In the following, we present the              that if A has one friend that is also friend of B, there is a strong
centrality measures adopted in this paper.                                tendency that A being also a friend of B. In topology terms it
   Degree centrality (DG) considers the number of connec-                 is the presence of triangles (cycles of order three) in the net-
tions or relations of a vertex. The set of vertices connected             work. The clustering coefficient [Watts and Strogatz, 1998]
to a certain vertex i is defined as the neighborhood and the              for a vertex CTi is defined as the number of triangles centered
connection degree DGi represents the size of its neighbor-                on i over the maximum number of possible connections for i.
hood [Costa et al., 2007]. It means, the higher the degree, the           CTi has value 1 if all neighbors of i are interconnected.
more popular the vertex.                                                     Structural Holes (HO) considers all the vertices as an ego
   Betweenness centrality (BE) is related to the capacity of              network, where connections no related with each vertex have
information transmission of vertices. For a vertex j this mea-            not direct effect. The key factor is the redundancy that each
sure quantified the number of shortest paths that pass through            vertex has in its neighborhood, evaluating if its position and
j between all pair of vertices (i, k) [Freeman, 1977], with               connections brings some opportunities. It means that the suc-
i, j and k different. The centrality measure expresses how                cess of an individual i within a social network or organization
much the vertex j works as bridge, meaning how confidence                 is related to access local bridges (trusted people) [Burt, 1992].
or trusted is j in the network.                                           If removed i from the network, a structural hole will happen
   Eigenvector centrality (EV) takes into account that ver-               in the local neighborhood. These individuals are important to
tices with same degree have different levels of importance                the connectivity of local regions.
according to the importance of their neighbors. The eigen-                   The network can be characterized by averagePn the centrality
vector centrality is the principal eigenvector associated with            measures of all vertices i.e., hDGi = N1 i=1 DGi , and is
the greatest eigenvalue of the adjacency matrix A. It describes           similar for the other measures.


                                                                      4
      Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
      July 27th, 2015 - Buenos Aires, Argentina


                                        Table 1: Structural properties of the complex networks
   Network           N        hDGi      hCT i     hHOi         hBEi             hCLi         hEV i              hKCi        hP Ri
  hamsterster      2000        16.1     0.5399 0.2898 2.588 × 103 1.43 × 10−4                 0.011             9.287    5.00 × 10−4
   advogato        5054        15.6     0.2525 0.3700 5.747 × 103 6.19 × 10−5 6.82 × 10−3                       8.137    1.98 × 10−4
   google+         23613       3.32     0.1742 0.8112 3.581 × 104 1.10 × 10−5 2.30 × 10−3                       1.669    4.21 × 10−5


2.2    Spreading process                                                  et al., 2012]. These analytical models make assumptions
Spreading is a pervasive process in society and several models            about the network structure such as the degree correlation
have been developed in order to understand the propagation                or distribution, compartments or class of vertices with same
of ideas or opinions through social networks [Castellano et               probabilities, homogeneous mixing or mean field theory.
al., 2009]. In classical rumor spreading models the ignorant              Notwithstanding all of them claim that their numerical so-
or inactive (S) are those who remain unaware of the informa-              lutions agree with the MC simulations, so we adopt this ap-
tion, the spreaders (I) are those who disseminate the ideas,              proach.
and the stifler (R) are those who know the information but
lose the interest in spreading it. All vertices have the same             3     Maximization of least influential spreaders
probability β for convincing their neighbors and probability              We analyze the spreading capacity in the microscopic and
γ for stopping to be active as propagator.                                macroscopic scales of the propagation. For each vertex i ∈
   The Maki-Thompson (MT) [Maki and Thompson, 1973]                       V , we calculate the final fraction of stifler ϕi∞ . This quan-
rumor approach was employed to model the spreading pro-                   tity represents how long the rumor propagates in the net-
cess. In the MT whenever an active spreader i contacts a                  work starting in i. Each ϕi∞ were average over 90 realiza-
vertex j that is inactive, the latter will become active with             tions. For the macroscopic propagation scale, the ϕV∞ repre-
a fixed probability β. Otherwise, when j knows about the                  sents thePmean of spreading capacity for all vertices, it means
rumor, it means j is a spreader or a stifler, the vertex i will           ϕV∞ = ( i∈V ϕi∞ )/N .
turn into a stifler with probability γ. The behavior when the
spreader stops to propagate is understood because the infor-              3.1    Dataset
mation is too much known (contacting spreaders) or without                We employ the following social networks: the hamster-
novelty (contacting stifler). The general rules of contact can            ster [Kunegis, 2014], an undirected and unweighted network
be represented as:                                                        based on the website data from HAMSTERSTER . COM. The
                    
                                    β
                                                                          vertices are the users of the system and the edges represent
                     I i + S j −→ I j ,
                                                                         a relationship among users. It consists of friend and fam-
                                    γ
                       I i + Rj −→ Ri ,                     (1)           ily relationship between the users of the website. The ad-
                    
                                  γ
                       I i + I j −→ Ri ,                                  vogato [kon, 2014], an online community platform for devel-
                                                                          opers of free software launched in 1999. Vertices are users of
where i and j are neighbors and the operator “+” means the                advogato and the directed edges represent trust relationships.
contact between them.                                                     Finally, the google+ [Gpl, 2014] contains Google Plus user-
   In terms of Monte Carlo (MC) implementation, let con-                  user links. The directed edge denotes that one user has the
sider a constant population of N vertices in all time steps.              other in his circles, but we assume the network as undirected.
Each vertex can be only in one state, that is I i (t) = 1 if i ∈          We always consider the main component for these networks.
I, otherwise I i (t) = 0, and S i (t) + I i (t) + Ri (i) = 1. The            The topological characteristics of these networks are sum-
macroscopic fraction of ignorant (ψ(t)), spreaders (φ(t)) and             marized in Table 1, with the corresponding averages of
                                                      PN
stifler (ϕ(t)) over time is calculated as ψ(t) = N1 i=1 S i (t),          the centralities: degree hDGi, clustering coefficient hCT i,
that is similar to the other states and always fulfill ψ(t) +             structural holes hHOi, betweenness hBEi, closeness hCLi,
φ(t) + ϕ(t) = 1. We assume that infection and recovering do               eigenvector hEV i, k-core hKCi and pagerank hP Ri. The
not occur during the same discrete time window or step. Also              google+ is more an egocentric network (hHOi) than others,
the case when a spreader randomly contacts one neighbor per               where vertices are very close (hCLi) and most of them have
unit time, the contact process, was adopted.                              few connections. The hamsterster is a more sparse network
   The initial setup for the propagation is ψ(0) = 1 − 1/N ,              with more triangles (hCT i) and connections between users.
φ(0) = 1/N and ϕ(0) = 0. Each simulation begins with a                    And, advogato is a more dense network and in the middle
uniformly selection of vertices as defined in the initial setup.          term of the previous.
At each time step, all spreaders uniformly select one of its
neighbors and try to infect it with probability β, or stop the            3.2    Propagation analysis
propagating with probability γ. The simulations run until the             For evaluating the impact of the least influential users in
end of the propagation process is reached, when φ∞ = 0.                   the networks, we employ the z-score normalization over the
   Different theoretical models have been proposed for mod-               spreading capacities ϕi∞ and sort the values in ascending or-
eling the rumor dynamics in networks [Moreno et al., 2004;                der. The z-score indicates how many standard deviations an
Barrat et al., 2008; Castellano et al., 2009; Borge-Holthoefer            element is according to the mean. The z-score of a raw value


                                                                      5
      Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
      July 27th, 2015 - Buenos Aires, Argentina


                                               0
                                                       hamsterster                                   Table 2: k-means clustering result for the google+ network
                                                       advogato                                       Measures        Cluster 1 (10%)     Cluster 2 (67%)     Cluster 3 (23%)
                                             −2        google+
           z−score of informed individuals
                                                                                                         DG           1.029 ± 1.001        1.128 ± 0.386      10.553 ± 72.435
                                             −4                                                          BE           0.00 ± 0.0004        0.00 ± 0.0005       0.0023 ± 0.023
                                                                                                         CL              0.5 ± 0.00           0.5 ± 0.00        0.501 ± 0.019
                                             −6                                                          PR          0.0007 ± 0.0007      0.0007 ± 0.0002     0.0043 ± 0.0299
                                                                                                         EV             0.00 ± 0.00       0.0021 ± 0.0037     0.0117 ± 0.0202
                                             −8                                                          KC              0.083 ± 0         0.094 ± 0.032       0.292 ± 0.178
                                                                                                         CT             0.00 ± 0.00          0.00 ± 0.00       0.746 ± 0.312
                                             −10                                                         HO           0.991 ± 0.0291       0.942 ± 0.166       0.360 ± 0.145
                                                                                                     Propagation    2612.274 ± 332.095   3112.858 ± 56.965   3135.407 ± 37.310
                                             −12


                                             −14
                                                   0,001            0,01               0,1   1
                                                                 fraction of individuals             Table 3: Propagation improvements for the real networks: 1st
                                                                                                     place are in bold and 2nd place are underlined
Figure 1: Comparative z-score of the propagation capacity in func-                                     Network        Rewired (%)      DG        BE        EV        PR
tion of the proportion of least influential spreaders                                                                      5         1.0817    1.0932    1.0810    1.0844
                                                                                                      hamsterster        2.5         1.0804    1.0817    1.0794    1.0783
                                                                                                                         0.5         1.0493    1.0510    1.0479    1.0489
x is z = x−µσ , where: µ is the mean and σ is the standard                                                                 5         1.0600    1.0594    1.0599    1.0594
deviation of the population. The impact is shown in Fig-                                               advogato          2.5         1.0529    1.0556    1.0549    1.0557
ure 1, where considering 10% of least influential spreaders,                                                             0.5         1.0450    1.0439    1.0453    1.0447
the mean decreases around 3 standard deviations. This de-                                                                  5         1.0124    1.0158    1.0050    1.0112
                                                                                                       google+           2.5         1.0143    1.0102    1.0103    1.0171
creasing behavior continues successively. The less influential                                                           0.5         1.0241    1.0190    1.0187    1.0266
spreaders impact the overall mean of spreading capacity.
   We aim to recognize these least influential spreaders in or-
der to improve their influence. Following, we considered the
centrality measures and analyzed the topological influence
                                                                                                     3.4      Cluster analysis
over the propagation and data clustering to find patterns on                                         We perform a clustering analysis in order to discover the
these individuals.                                                                                   group of least influential spreaders according to the topolog-
                                                                                                     ical characteristics of the vertices. As objects in the same
3.3    Topological analysis                                                                          group are more similar to each other than those in other
Z-score was calculated for each centrality measure and net-                                          groups, we aim to identify common properties of the elements
work (Figure 2 a-c). The fraction of individuals was sorted                                          that belong to the group with lowest propagation rates.
in ascending order for each centrality. Vertices with lowest                                            We apply the popular k-means clustering [Macqueen,
eigenvector centrality always led the lowest z-score values.                                         1967] that partitions data into k mutually exclusive clusters.
   Jaccard coefficient was employed in order to measure the                                          Each cluster is defined by its members and by its centroid
proportion of least influential spreaders contained in each                                          that represents the point to which the sum of distances from
group of vertices with lower centrality measure (Figure 2 d-f).                                      all objects in the cluster is minimized. We used the Weka1
The Jaccard coefficient compares the similarity and diversity                                        implementation with the standard configuration.
of sample sets. It measures similarity between finite sample                                            The result for the google+ network is shown in Table 2.
sets, and is defined as the size of the intersection divided by                                      As we set the parameter k = 3 the k-means generates three
the size of the union of the sample sets: J(A, B) = |A∩B|  |A∪B| .                                   clusters. We are mainly interested in the cluster 1 because
Here, for each centrality measure A represents the proportion                                        has the individuals with the lowest propagation mean. By
of vertices with lowest values and B the fraction of least influ-                                    comparing the mean of the measures among clusters 1, 2 and
ential vertices. We define the least influential users as the ver-                                   3, we observe that the values of EV ≈ 0 and HO ≈ 1 are
                                                  ϕV                                                 detachable in comparison to other clusters.
tices that achieved propagation values ϕi∞ ≤ 2∞ . Selecting
less than 10% of vertices with lowest eigenvector centrality,                                           When performed the cluster analysis in hamsterster and
it was obtained around half of the less influential spreaders.                                       advogato, the more significant measures were CL < 0.5 and
   The topological properties considering 1% of vertices with                                        also EV ≈ 0. In all the case, individuals with eigenvector
least influential spreading and lowest EV value were ana-                                            values close to zero are in the cluster with lowest spread-
lyzed. We verify if these vertices also have low values in                                           ing capacity. If we aim to find a cluster with non influen-
other centralities. For each centrality measure we normal-                                           tial spreaders in the network, we can run the k-means algo-
ize the values by its own average. Therefore, it is possible                                         rithm until some group partition achieve an eigenvector mean
to compare the networks and the maximum values achieved                                              hEV i < 1/n. This pattern confirms the results of the topo-
(Figure 2 g-i). We observe that the topological properties of                                        logical analysis. Hence, we generalize that the lower the
individuals with lowest EV values are a subset of the values                                         eigenvector value, the lower the spreading capacity of ver-
presented for the least influential spreaders group. Unlike ex-                                      tices.
pected, vertices with BE, DG or PR above the mean may be
                                                                                                        1
also less influential spreaders.                                                                            http://www.cs.waikato.ac.nz/ml/weka


                                                                                                 6
     Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
     July 27th, 2015 - Buenos Aires, Argentina


                                                                          1                                                                                                                                                                                                                                                                               2
                                                                                                                                                                                                           0

                                                                          0                                                                                                                                                                                                                                                                               0


                                       z−score of informed individuals


                                                                                                                                                                       z−score of informed individuals


                                                                                                                                                                                                                                                                                                     z−score of informed individuals
                                                                                                                                                                                                          −2
                                                                                                                                                                                                                                                                                                                                                         −2
                                                                         −1                                                                                                                               −4
                                                                                                                                                                                                                                                                                                                                                         −4
                                                                         −2                                                                                                                               −6
                                                                                                                                                DG                                                                                                                                    DG                                                                                                                                          DG
                                                                                                                                                BE                                                                                                                                    BE                                                                 −6                                                                       BE
                                                                         −3                                                                     CL                                                        −8                                                                          CL                                                                                                                                          CL
                                                                                                                                                PR                                                                                                                                    PR                                                                 −8                                                                       PR
                                                                                                                                                EV                                                                                                                                    EV                                                                                                                                          EV
                                                                         −4                                                                     KC                                                       −10                                                                          KC                                                                                                                                          KC
                                                                                                                                                                                                                                                                                                                                              −10
                                                                                                                                                CT                                                                                                                                    CT                                                                                                                                          CT
                                                                                                                                                HO                                                       −12                                                                          HO                                                                                                                                          HO
                                                                         −5                                                                                                                                                                                                                                                                   −12
                                                                           0              0.2           0.4          0.6           0.8               1                                                                       0.1          0.2          0.3              0.4            0.5                                                       0                         0.1            0.2          0.3            0.4             0.5
                                                                                                    fraction of individuals                                                                                                            fraction of individuals                                                                                                                        fraction of individuals


                                                                                                 (a) hamsterster                                                                                                                     (b) advogato                                                                                                                                    (c) google+
                                                      0.45                                                                                                                                                0.4                                                                                                                                            0.7
                                                                                                                                                DG                                                                                                                                    DG                                                                                                                                          DG
                                                                         0.4                                                                    BE                                                       0.35                                                                         BE                                                                                                                                          BE
                                                                                                                                                                                                                                                                                                                                                         0.6
                                                                                                                                                CL                                                                                                                                    CL                                                                                                                                          CL
                                                      0.35                                                                                      PR                                                                                                                                    PR                                                                                                                                          PR
                                                                                                                                                                                                          0.3
                                                                                                                                                EV                                                                                                                                    EV                                                                 0.5                                                                      EV
            Jaccard coefficient


                                                                                                                                                              Jaccard coefficient


                                                                                                                                                                                                                                                                                                               Jaccard coefficient
                                                                         0.3                                                                    KC                                                                                                                                    KC                                                                                                                                          KC
                                                                                                                                                                                                         0.25
                                                                                                                                                CT                                                                                                                                    CT                                                                 0.4                                                                      CT
                                                      0.25                                                                                      HO                                                                                                                                    HO                                                                                                                                          HO
                                                                                                                                                                                                          0.2
                                                                         0.2                                                                                                                                                                                                                                                                             0.3
                                                                                                                                                                                                         0.15
                                                      0.15
                                                                                                                                                                                                                                                                                                                                                         0.2
                                                                                                                                                                                                          0.1
                                                                         0.1
                                                                                                                                                                                                         0.05                                                                                                                                            0.1
                                                      0.05

                                                                          0                                                                                                                                0                                                                                                                                              0
                                                                           0              0.2           0.4          0.6           0.8               1                                                      0                0.1           0.2          0.3             0.4             0.5                                                                0               0.1            0.2          0.3            0.4             0.5
                                                                                                    fraction of individuals                                                                                                            fraction of individuals                                                                                                                        fraction of individuals


                                                                                                 (d) hamsterster                                                                                                                     (e) advogato                                                                                                                                    (f) google+

                                                                                                                CL                                                                                                                                 CL                                                                                                                                              CL
                                                                                                                 0.64                                                                                                                                  0.97                                                                                                                                         1

                                                                               PR                                                         BE                                                                     PR                                                             BE                                                                             PR                                                            BE
                                                                                          1.2                                                                                                                                1.26                                                                                                                                          21.41
                                                                                                                 0.43                    0.77                                                                                                          0.65                    3.51                                                                                                                 0.67                    41.76
                                                                                                  0.8                                                                                                                                0.84                                                                                                                                           14.27
                                                                                                                                0.51                                                                                                                                  2.34                                                                                                                                         27.84
                                                                                                                 0.21                                                                                                                                  0.32                                                                                                                                         0.33
                                                                                                         0.4                                                                                                                                0.42                                                                                                                                            7.14
                                                                                                                        0.26                                                                                                                                  1.17                                                                                                                                         13.92
           EV                                                              0          0            0                                                     DG   EV                                          0.1         0.06          0.03                                                       DG   EV 0.07                                                         0.04           0.02                                                    DG
                                                                                                                         0.14          0.29      0.43                                                                                                          0.09          0.17      0.26                                                                                                                 5.22          10.44     15.66
                                                                                                  0.22                                                                                                                               0.08                                                                                                                                            0.4
                                                                                                                1.16                                                                                                                                   0.9                                                                                                                                          0.4
                                                                                                         0.62                                                                                                                               1.32                                                                                                                                             0
                                                                                          0.43                                                                                                                               0.16                                                                                                                                          0.8
                                                                                                                        2.32                                                                                                                                  1.8                                                                                                                                          0.8
                                                                               0.65                      1.23                                                                                                    0.25                       2.64                                                                                                                1.2                          0
                                                                           KC                                                   3.47      HO                                                                KC                                                        2.7       HO                                                                         KC                                                      1.19      HO

                                                                                                         1.85                                                                                                                               3.96                                                                                                                                             0
                                                                                                                CT               lower influence                                                                                                   CT                 lower influence                                                                                                              CT               lower influence
                                                                                                                                 lower EV                                                                                                                             lower EV                                                                                                                                      lower EV


                                                                                                 (g) hamsterster                                                                                                                     (h) advogato                                                                                                                                    (i) google+

Figure 2: Propagation analysis for the real social networks hamsterster, advogato and google+: (a-c) the impact of vertices with low
centrality in the propagation; (d-f) the similarity between vertices with lower centrality and the less influential spreaders; (g-i) the topological
characteristic of the least influential spreaders and vertices with lowest EV values. For hamsterster and advogato were considered β = 0.3
and γ = 0.1, and for google+ β = 0.5 and γ = 0.1, in all simulations

                                                                         0.5                                                                                                                              0.5                                                                                                                                             1

                                                                                                                                                                                                                                                                                                                                                          0
                                                                          0                                                                                                                                0
            z−score of informed individuals


                                                                                                                                                              z−score of informed individuals


                                                                                                                                                                                                                                                                                                                       z−score of informed individuals


                                                                                                                                                                                                                                                                                                                                                         −1
                                                                                                                                                                                                         −0.5
                                                     −0.5                                                                                                                                                                                                                                                                                                −2
                                                                                                                                                                                                          −1
                                                                                                                                                                                                                                                                                                                                                         −3
                                                                         −1
                                                                                                                                                                                                         −1.5                                                                                                                                            −4
                                                                                                                                                DG                                                                                                                                    DG                                                                                                                                          DG
                                                     −1.5                                                                                       BE                                                                                                                                    BE                                                                                                                                          BE
                                                                                                                                                                                                                                                                                                                                                          0
                                                                                                                                                                                                          −2
                                                                                                                                                CL                                                                                                                                    CL                                                                                                                                          CL
                                                                         −2                                                                     PR                                                                                                                                    PR                                                                 −6                                                                       PR
                                                                                                                                                EV                                                       −2.5                                                                         EV                                                                                                                                          EV
                                                                                                                                                                                                                                                                                                                                                         −7
                                                                                                                                                KC                                                                                                                                    KC                                                                                                                                          KC
                                                     −2.5                                                                                                                                                 −3
                                                                                                                                                CT                                                                                                                                    CT                                                                 −8                                                                       CT
                                                                                                                                                HO                                                                                                                                    HO                                                                                                                                          HO
                                                                         −3                                                                                                                              −3.5                                                                                                                                            −9
                                                                           0              0.2           0.4          0.6           0.8               1                                                       0               0.2           0.4          0.6             0.8                1                                                               0               0.2            0.4          0.6            0.8              1
                                                                                                    fraction of individuals                                                                                                            fraction of individuals                                                                                                                        fraction of individuals


                                                                                                 (a) hamsterster                                                                                                                     (b) advogato                                                                                                                                    (c) google+

Figure 3: Propagation analysis for the real social networks when maximizing the social influence rewiring 0.5% of least influential spreaders
considering vertices with highest DG centrality: the impact of low centrality values in the propagation mean


                                                                                                                                                                                                                                                   7
      Proceedings of the 1st International Workshop on Social Influence Analysis (SocInf 2015)
      July 27th, 2015 - Buenos Aires, Argentina


3.5    Influence maximization                                             [Borge-Holthoefer et al., 2012] Javier Borge-Holthoefer, Sandro
For improving the spreading capacity of the network we se-                   Meloni, Bruno Gonçalves, and Yamir Moreno. Emergence of
                                                                             Influential Spreaders in Modified Rumor Models. Journal of Sta-
lected 5%, 2.5% and 0.5% of the vertices with lower EV. For
                                                                             tistical Physics, 151(1-2):383–393, September 2012.
each vertex only one of its edges was randomly selected and
rewired to an influential vertex of the network. We randomly              [Brin and Page, 1998] S. Brin and L. Page. The anatomy of a large-
consider vertices with highest DG, BE, EV and PR central-                    scale hypertextual web search engine. Computer Networks and
                                                                             ISDN Systems, V:107–117, 1998.
ity and exchange only one edge of each influencer. For each
measure and proportion of lowest EV vertices to be rewired,               [Burt, 1992] Ronald S. Burt. Structural holes: The social structure
we have the resulting spreading capacity (ϕV∞ ) normalized by                of competition. Harvard University Press, Cambridge, MA, 1992.
the mean of the non maximized case (Table 3). When con-                   [Castellano et al., 2009] Claudio Castellano, Santo Fortunato, and
necting the least influential spreaders to vertices with highest             Vittorio Loreto. Statistical Physics of Social Dynamics. Reviews
DG centrality leads to 6 second places and 1 first place re-                 of Modern Physics, 81(2):591–646, may 2009.
sults. Hence, the visual results selecting 0.5% of lowest EV              [Cormen et al., 2009] Thomas H. Cormen, Charles E. Leiserson,
connecting to vertices with highest DG centrality are shown                  Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms.
in Figure 3. The overall mean of propagation was increased                   The MIT Press, 3 edition, 2009.
for each network. The z-score also increases for the other                [Costa et al., 2007] L.D.F. Costa, F.A. Rodrigues, G Travieso, and
measures in the entire network and not only for the 0.5% of                  P.R. Villas Boas. Characterization of complex networks: A sur-
the selected vertices (Figure 3). For the best cases the im-                 vey of measurements. Adv. in Physics, 56:167–242, 2007.
provements achieved an overall mean larger than the highest               [Daley and Kendall, 1964] D. J. Daley and D. G. Kendall. Epi-
spreading value of the non maximized case.                                   demics and Rumours. Nature, 204(4963):1118–1118, 1964.
                                                                          [Freeman, 1977] L C Freeman. A set of measures of centrality
4     Final remarks                                                          based on betweenness. Sociometry, 40:35–41, 1977.
This paper explored the least influential spreaders of a net-             [González-Baillón et al., 2011] Sandra González-Baillón, Borge-
work considering centrality measures and analyzing topolog-                  Holthoefer, Javier Rivero, and Yamir Moreno. The dynamics of
ical and data clustering approaches. The results indicate that               Protest Recruitment through an Online Network. Scientific Re-
vertices with eigenvector value EVi < 1/N have little in-                    ports, 1:197:1–7, 2011.
fluence in the network. We selected 5%, 2.5% and 0.5% of                  [Gpl, 2014] Google+ network dataset – KONECT, dec 2014.
these individuals and rewired only one edge to an influential             [Kitsak et al., 2010] Maksim Kitsak, Lazaros K. Gallos, Shlomo
vertex. This approach improves propagation capacity of those                 Havlin, Fredrik Liljeros, Lev Muchnik, H. Eugene Stanley, and
vertices and the mean of the propagation in three social net-                Hernán A. Makse. Identification of influential spreaders in com-
works considered: google+, hamsterster and advogato. More                    plex networks. Nature Physics, 6(11):888–893, August 2010.
efficient targeted actions can be performed to improve the dif-           [kon, 2014] Advogato network dataset – KONECT, October 2014.
fusion on the network. Selecting a few non influential users
and presenting them to one influencer, they can change their              [Kunegis, 2014] J. Kunegis. Hamsterster full network dataset-
action, become better spreaders and improve the overall diffu-               KONECT, jan 2014.
sion capacity. For example, the promotion of interchange of               [Macqueen, 1967] J. Macqueen. Some methods for classification
students to prominent institutions, or famous/popular people                 and analysis of multivariate observations. In In 5-th Berkeley
giving talks in common places or to specific groups of indi-                 Symp. on Math. Stats. and Prob., pages 281–297, 1967.
viduals, can produce a considerable impact in the diffusion of            [Maki and Thompson, 1973] D. P. Maki and M Thompson. Math-
ideas and benefits. The motivation or incentive for the most                 ematical Models and Applications, with Emphasis on the Social,
important vertices may be monetary, ideological causes, in-                  Life, and Management Sciences. Prentice-Hall, 1973.
crease publications and impact, among others. This is a work              [Moreno et al., 2004] Yamir Moreno, Maziar Nekovee, and
in progress that shows promising experimental results. New                   Amalio F. Pacheco. Dynamics of rumor spreading in complex
paths of study can be developed by analyzing the least influ-                networks. Physical Review E, 69(6):066130, jun 2004.
ential spreaders and dynamics.                                            [Newman, 2003] M E J Newman. The structure and function of
                                                                             complex networks. SIAM Review, 45:1–8, 2003.
Acknowledgments                                                           [Sabidussi, 1996] G Sabidussi. The centrality index of a graph. Psy-
This research was supported by National Council for Scientific and           chometrika, 31:581–603, 1996.
Technological Development (CNPq) grant: 140688/2013-7 and Sao             [Seidman, 1983] S Seidman. Network structure and minimum de-
Paulo Research Foundation (FAPESP) grant: 2011/21880-3.                      gree. Social networks, 5(3):269–287, 1983.
                                                                          [Watts and Strogatz, 1998] Duncan J Watts and Steven H Stro-
References                                                                   gatz. Collective dynamics of s̀mall-world’ networks. Nature,
[Barrat et al., 2008] Alain Barrat, MarseilleMarc Barthélemy, and           393(6684):440–442, 1998.
  Alessandro Vespignani. Dynamical Processes on Complex Net-
  works. Cambridge University Press, 2008.
[Bonacich, 1972] Phillip Bonacich. Factoring and weighting ap-
  proaches to clique identification. Journal of Mathematical Soci-
  ology, 2:113–120, 1972.


                                                                      8