<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>End-to-end influence maximization for HIV prevention</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bryan Wilder</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Onasch-Vera</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juliana Hudson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose Luna</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicole Wilson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robin Petering</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Darlene Woo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Milind Tambe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Rice</string-name>
          <email>ericr@usc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Artificial Intelligence in Society, University of Southern California</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This work aims to overcome the challenges in deploying influence maximization to support community-driven interventions. Influence maximization is a crucial technique used in preventative health interventions, such as HIV prevention amongst homeless youth. Drop-in centers for homeless youth train a subset of youth as peer leaders who will disseminate information about HIV through their social networks. The challenge is to find a small set of peer leaders who will have the greatest possible influence. While many algorithms have been proposed for influence maximization, none can be feasibly deployed by a service provider: existing algorithms require costly surveys of the entire social network of the youth to provide input data, and high-performance computing resources to run the algorithm itself. Both requirements are crucial bottlenecks to widespread use of influence maximization in real-world interventions. To address the above challenges, this paper introduces the CHANGE agent for influence maximization. CHANGE handles the end-to-end process of influence maximization, from data collection to peer leader selection. Crucially, CHANGE surveys only a fraction of the youth to gather network data and minimizes computational cost while providing comparable performance to previously proposed algorithms. We carried out a pilot study of CHANGE in collaboration with a drop-in center serving homeless youth in a major U.S. city. CHANGE surveyed only 18% of the youth to construct its social network. However, the peer leaders it selected reached just as many youth as previously field-tested algorithms which surveyed the entire network. This is the first real-world study of a network sampling algorithm for influence maximization.</p>
      </abstract>
      <kwd-group>
        <kwd>Influence maximization, HIV prevention, social networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper presents and field-tests a novel, practical agent for influence
maximization, the challenge of selecting a small set of seed nodes in a social network
who will diffuse information to many others. Such techniques have important
applications ranging from preventative health [
        <xref ref-type="bibr" rid="ref1 ref13">13, 1</xref>
        ] to international development
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. (An extended version of this paper has been accepted as a full paper at
AAMAS 2018.)
      </p>
      <p>
        We are particularly motivated by the challenge of preventing HIV spread
among homeless youth [
        <xref ref-type="bibr" rid="ref10 ref11 ref18">11, 18, 10</xref>
        ] (although our contributions would also assist
other public health interventions). Here, influence maximization is used to select
homeless youth who will serve as peer leaders and spread messages about HIV
prevention through their social network. Pilot studies in this domain have shown
that algorithmic approaches have great promise, substantially outperforming
status-quo heuristics [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. However, current algorithms have a high barrier to
entry: they require a great deal of time to gather the complete social network
of the youth, expertise to select appropriate parameters, and computational
power to run the algorithms. None of these are likely available to the
resource-strained service providers who will ultimately be the ones to deploy influence
maximization.
      </p>
      <p>Gathering network data is particularly onerous because it requires
individually surveying upwards of a hundred youth. Further, network collection is more
time-intensive than simple survey methods, requiring days of work from a
dedicated team of social work researchers. It is not feasible for service providers with
many other responsibilities.</p>
      <p>
        The other barriers are also serious impediments to wide-scale adoption of
influence maximization. Service providers will not have access to the high-performance
computing resources required by previous algorithms, where high computational
cost is often incurred to find solutions robust to unknown parameters. For
instance, DOSIM, the state-of-the-art algorithm for robust influence maximization
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], takes hours of runtime on a high-performance computing system. In reality,
a deployed system would need to run in minutes on a laptop.
      </p>
      <p>This paper presents CHANGE (CompreHensive Adaptive Network samplinG
for social influencE), a novel, end-to-end agent for influence maximization which
addresses the above barriers via a set of algorithmic contributions. CHANGE is
easy to deploy, but this simplicity is crucially enabled by a series of insights into
the social structure of homeless youth (which may be useful for other vulnerable
populations). We conducted a pilot test of CHANGE's performance in a real
deployment by a drop-in center serving homeless youth in a major U.S. city.
CHANGE was used to plan a series of interventions designed to spread HIV
awareness among the youth. CHANGE obtained comparable influence spread to
state-of-the-art algorithms while surveying only 18% of nodes for network data.</p>
      <p>Overall, CHANGE offers a practical, field-tested vehicle for deployed
influence maximization which drastically lowers the barrier to entry. To our
knowledge, this is the first real-world pilot study of a network sampling algorithm
for influence maximization and only the second-ever field test of any influence
maximization algorithm.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        The influence maximization problem was introduced by Kempe et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and has
been extensively studied since then [
        <xref ref-type="bibr" rid="ref12 ref3 ref4 ref5">3, 12, 4, 5</xref>
        ]. Most work has focused on
algorithms which are scalable to extremely large networks, primarily in the context
of online viral marketing. Recently, HIV prevention (and preventative health
more broadly) has emerged as a new application area for influence
maximization which brings its own set of research challenges. Yadav et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] proposed
HEALER, a POMDP-based algorithm for selecting influential peer leaders.
Subsequently, Wilder et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] introduced the DOSIM algorithm which uses robust
optimization to account for uncertainty about the true probability of influence
propagation. Our approach to parameter robustness is similar to techniques used
in robust MDP planning [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], though the domains are entirely different.
      </p>
      <p>
        Yadav et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] conducted a real-world pilot study of HEALER and DOSIM,
and found that both algorithms significantly outperformed the status-quo
heuristic used by agencies (selecting high-degree nodes). However, neither algorithm
addresses any of the challenges described above. Both assume that the entire
social network is provided as input, which is unrealistic in practice due to the
enormous effort required. Further, only DOSIM handles uncertainty about the
probability of influence spread, and its method for doing so is extremely
computationally intensive (see Section 4.3).
      </p>
      <p>
        Separate work by Wilder et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] considered network data collection. They
proposed the ARISEN algorithm which samples a portion of youth in the network
to collect data from. While ARISEN can be theoretically analyzed for certain
network structures, it is not practically suited to deployment because it relies
on querying a sequence of specific youth who may be difficult to locate (see
Section 4.2). Moreover, ARISEN does not consider either parameter uncertainty
or execution errors (the possibility that some peer leaders will not attend), both
of which we incorporate into CHANGE.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Problem description</title>
      <p>Motivating domain: Our work is designed to overcome the challenges in
deploying influence maximization techniques to support community-driven
interventions. We are specifically motivated by the challenge of raising awareness
about HIV among homeless youth. Typically, an HIV awareness intervention
will be provided by a drop-in center or other organization which serves homeless
youth. Each intervention is a day-long class followed by weekly hour-long
meetings. Hence (as is typical in many intervention domains), the service provider
will almost never have enough resources to deliver the intervention to all of the
youth that frequent the center; instead, the intervention is usually delivered to
15-20% of the population. (Note that while CHANGE directly surveys 18% of
youth, they name others as friends, resulting in a larger sampled graph.) Further,
limitations on space and personnel mean that the intervention can typically be
delivered to only 4-6 youth at a given time, so the training is broken up over a
series of small sessions. These youth are trained as peer leaders who communicate
with other youth about HIV prevention. This amplifies the reach of the
intervention through the social network of the homeless youth. The question is which
youth will make the most effective peer leaders, able to reach the greatest number
of their peers. This is an influence maximization problem, which we now formalize.</p>
      <p>
        Influence: The youth have a social network represented as a graph G =
(V, E). Each youth is initially inactive, meaning that they have not received
information about HIV prevention. Once nodes are activated by the intervention,
they have a chance to influence their peers. We model this process through a
variant on the classical independent cascade model (ICM) which has been used
by previous work on HIV prevention and better reflects realistic time dynamics
[
        <xref ref-type="bibr" rid="ref15 ref16 ref17">16, 15, 17</xref>
        ]. The process unfolds over discrete time steps t = 1...T, where T is
a time horizon. There is a propagation probability p. When a node becomes
active, it attempts to activate each of its neighbors. Each attempt succeeds
independently with probability p. Activation attempts are made at each time
step until either the neighbor is influenced or the time horizon is reached.
      </p>
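This time-expanded cascade variant is straightforward to simulate directly. The sketch below is an illustrative Python rendering (not the authors' code; the adjacency-list `neighbors` format is our assumption): at each of the T steps, every active node makes a fresh activation attempt, with success probability p, on each still-inactive neighbor.

```python
import random

def simulate_cascade(neighbors, seeds, p, T, rng=random):
    """One run of the time-expanded ICM variant: at each of T steps,
    every active node makes an independent activation attempt (success
    probability p) on each still-inactive neighbor."""
    active = set(seeds)
    for _ in range(T):
        newly = set()
        for v in active:
            for u in neighbors.get(v, ()):
                if u not in active and u not in newly and rng.random() < p:
                    newly.add(u)
        active |= newly
    return active
```

Averaging `len(simulate_cascade(...))` over many runs yields a Monte Carlo estimate of expected influence spread, which is how the planning algorithms below evaluate candidate seed sets.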
      <p>
        Note that the assumption that p is uniform across edges is without much loss
of generality. As noted by He and Kempe [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], a uniform p is equivalent to each edge drawing
an individual propagation probability i.i.d. from a distribution with mean p.
This is because the following processes are analytically equivalent: (1) propagate
influence with probability p and (2) draw a propagation probability q from a
distribution with E[q] = p and then propagate influence with probability q.
Hence, our model actually subsumes any stochastic model where the propagation
probabilities are drawn from a common prior.
      </p>
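This equivalence is easy to check empirically for a single activation attempt (it follows from the law of total probability). The sketch below is our own illustration, with an arbitrarily chosen uniform prior whose mean is 0.4; both processes produce the same activation frequency.

```python
import random

rng = random.Random(0)
N, p = 200_000, 0.4

# Process (1): activate with fixed probability p.
fixed = sum(rng.random() < p for _ in range(N)) / N

# Process (2): draw q from a prior with E[q] = p (here Uniform[0.1, 0.7]),
# then activate with probability q.
def drawn_attempt():
    q = rng.uniform(0.1, 0.7)
    return rng.random() < q

drawn = sum(drawn_attempt() for _ in range(N)) / N
# Both empirical frequencies concentrate around p = 0.4.
```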
      <p>Interventions: At each time step t = 1...T, the algorithm selects a seed set
A_t containing up to K nodes. However, each seed node may or may not actually
attend the intervention. This problem is particularly acute with homeless youth
since a number of factors could prevent a given youth from attending (e.g.,
being arrested, running out of money for a bus ticket, etc.). Hence, we assume
that each node v has a hidden state x_v ∈ {present, absent}. Each node's state
is drawn independently from some prior distribution D. For simplicity, we will
take D to set each node to be present with probability q. However, all of our
analysis applies to arbitrary distributions. For each v ∈ A_t, if x_v = present, then
v is activated. Nothing occurs if x_v = absent. Note that an absent node can still
become activated by others, since they may still be in contact with others in
the social network. After the set A_t is chosen, the intervention occurs and the
hidden state of each v ∈ A_t is observed. We denote the set of all observations
received at time t as O_t. The algorithm may use this information to plan the
next intervention. In other words, the problem is adaptive. To model adaptivity,
we introduce the notion of a policy, which maps from past actions and
observations to the action that should be taken next. Let A = {S ⊆ V : |S| ≤ K} be
the set of all possible actions. A history is the current sequence of actions chosen
and observations received, denoted by σ_t = ((A_1, O_1), (A_2, O_2), ..., (A_t, O_t)). Let
Σ be the set of all possible histories. A policy is a mapping π : Σ → A. Let
A(σ_t) = (A_1...A_t) be the sequence of actions taken and O(σ_t) = (O_1...O_t) be the
corresponding observations (whether each peer leader was present or absent).
Recall that youth are trained in groups of 4-6; the policy selects a group of
youth to invite given who was trained previously. We denote the objective as
f(A(σ) | O(σ)): the expected number of nodes influenced by the seed nodes
in A(σ), conditioned on the observations in O(σ). We overload notation and
let f(π) = E[f(A(σ) | O(σ))] be the expected reward from running policy π,
where the expectation ranges over the hidden state x (which determines π's
actions) as well as the influence process. Our goal is to find a policy π maximizing
f(π).</p>
      <p>Uncertainty about network structure and parameters: We consider
extensions to the core adaptive influence maximization problem which account
for the lack of information endemic in field deployments. First, we consider the
case where the structure of the network (the set of edges E) is unknown. To address
this challenge, we give our agent a budget of M queries to run before conducting
the intervention. Each query may target either a uniformly random node, or the
neighbor of a node already queried. When a node is queried, it reveals all of its
edges. The goal is to use the M queries to uncover a set of edges which suffice
to identify influential nodes.</p>
      <p>We then consider an unknown propagation probability. Here, we take a robust
optimization approach and look for a policy which performs well across a range
of possible values for p. More detail on this part of the problem can be found in
Section 4.3.</p>
    </sec>
    <sec id="sec-4">
      <title>CHANGE: a new agent for influence maximization in the field</title>
      <p>We now introduce the CHANGE agent for end-to-end influence maximization.
Figure 1 illustrates the three components of the agent. We start with the last
component, peer leader selection, since the other components exist to provide
the data that the peer leader selection algorithm requires. Peer leader selection
is performed by an adaptive greedy algorithm (Algorithm 1), which handles
the chance that some peer leaders may not attend the intervention and plans
solutions using the observations obtained so far. Algorithm 1 requires as input
a (sample of the) social network and a propagation probability p. We subsequently
introduce Algorithms 2 and 3 to provide these inputs.</p>
      <sec id="sec-4-1">
        <title>Adaptive greedy planning</title>
        <p>Algorithm 1 Adaptive greedy
1: for t = 1...T do
2:   A_t = ∅
3:   for k = 1...K do  //greedily select seeds for action t
4:     v* = argmax_{v ∈ V} Δ(A_t ∪ {v} | σ_{t-1}) - Δ(A_t | σ_{t-1})
5:     A_t = A_t ∪ {v*}
6:   execute A_t and observe O_t
7:   σ_t = σ_{t-1} + (A_t, O_t)  //add action/observation to history</p>
        <p>Given as input the graph G and propagation probability p, finding the
optimal policy is a difficult planning problem. There are 2^n possible hidden states
and O(n^K) possible actions. While it is possible to formulate the problem as a
POMDP, these exponentially large state and action spaces place even small
instances beyond the reach of off-the-shelf solvers. Hence, we exploit the structure of
the problem to formulate a scalable greedy algorithm which obtains (provably)
near-optimal solutions.</p>
        <p>Pseudocode for adaptive greedy, our online planning algorithm, can be found
in Algorithm 1. Algorithm 1 selects the action at each step which maximizes the
expected gain in influence spread, conditioned on the observations received so
far. Then, it waits until this action has been executed, observes which peer
leaders attended the intervention, and greedily plans the next step. Formally, let
Δ(A_t | σ_{t-1}) = f(A(σ_{t-1}) ∪ A_t | O(σ_{t-1})) - f(A(σ_{t-1}) | O(σ_{t-1})) denote the
expected marginal gain of selecting A_t at time t. The greedy policy is to select
A_t = argmax_{|A| ≤ K} Δ(A | σ_{t-1}) (the outer loop of Algorithm 1). However,
computing the maximizing action is itself computationally intractable (as there are
O(n^K) possible choices). Hence, Algorithm 1 uses an additional greedy inner loop
which greedily selects the elements of A_t one at a time (lines 3-5). Note that Δ
can be computed by averaging over random simulations of both the hidden
state (which nodes are present/absent) and how influence spreads via the
ICM.</p>
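A minimal Python sketch of this procedure (illustrative only, not the deployed implementation): marginal gains Δ are estimated by Monte Carlo simulation over the cascade, only attending seeds are treated as active, and q is the attendance probability from Section 3.

```python
import random

def mc_spread(neighbors, seeds, p, T, runs=300, rng=random):
    """Monte Carlo estimate of expected spread from `seeds` under the ICM variant."""
    total = 0
    for _ in range(runs):
        active = set(seeds)
        for _ in range(T):
            active |= {u for v in active for u in neighbors.get(v, ())
                       if rng.random() < p}
        total += len(active)
    return total / runs

def adaptive_greedy(neighbors, V, K, T, p, q, rng=random):
    """Each step: greedily build a K-node action maximizing estimated marginal
    spread given which earlier invitees attended, then observe attendance."""
    attended = set()  # seeds who actually showed up so far
    for t in range(T):
        A_t = set()
        for _ in range(K):  # inner greedy loop (lines 3-5 of Algorithm 1)
            rest = [v for v in V if v not in attended and v not in A_t]
            v_star = max(rest, key=lambda v:
                         mc_spread(neighbors, attended | A_t | {v}, p, T, rng=rng))
            A_t.add(v_star)
        # execute A_t: each invited node is present independently with prob. q
        attended |= {v for v in A_t if rng.random() < q}
    return attended
```

Because attendance is observed between rounds, later rounds automatically re-plan around no-shows, which is exactly the adaptivity the formalism above describes.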
        <p>We prove the following theorem, which shows that greedy planning is
sufficient to obtain a guaranteed approximation ratio:</p>
        <p>Theorem 1. Let π_G be Algorithm 1's greedy policy and π* be an optimal policy.
It holds that f(π_G) ≥ ((e-1)/(2e-1)) f(π*).</p>
        <p>
          A proof may be found in the supplemental material. We use the adaptive
submodularity framework of Golovin and Krause, which generalizes the classical
notion of a submodular set function to adaptive policies. Their framework does
not straightforwardly apply to our problem since our algorithm selects a sequence
of actions, not a set. The order in which actions are selected matters since peer
leaders who are selected earlier will have more time to influence others. We show
that our problem can be reformulated as maximizing an adaptive submodular set
function subject to a more complex set of constraints (a partition matroid). This
is the first approximation guarantee for adaptive influence maximization under
execution errors, which is a well-known challenge in domains such as ours [
          <xref ref-type="bibr" rid="ref15 ref17">15,
17</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Network collection</title>
        <p>Algorithm 2 Network sampling
1: input: vertex set V, budget M
2: E = ∅  //set of edges observed
3: S = ∅  //set of nodes surveyed
4: for i = 1...M/2 do
5:   Sample v uniformly at random from V \ S
6:   S = S ∪ {v}
7:   E = E ∪ {(v, u) : u ∈ N(v)}
8:   Sample u uniformly at random from N(v) \ S
9:   E = E ∪ {(u, w) : w ∈ N(u)}
10:  S = S ∪ {u}
11: return E</p>
        <p>The adaptive greedy algorithm assumes that the graph G is fully specified.
However, in order for an intervention to be deployed in practice, the social network
needs to be laboriously gathered by interviewing the entire population of
homeless youth (potentially hundreds of youth in total). This is not practical for a
service provider to carry out on their own. We present an approach (Algorithm
2) which randomly samples a small number of youth to survey. Our procedure is
easy for a service provider to implement in the field without much computational
assistance. This simplicity is enabled by underlying insights about the structure
of homeless youth social networks, which may assist with intervention design in
other vulnerable populations.</p>
        <p>We assume that the service provider has the ability to survey up to M youth.
Each youth, when surveyed, reveals all of their edges. Algorithm 2 chooses M/2
nodes uniformly at random from the population to survey (line 5). For each
surveyed node, it chooses a uniformly random neighbor to survey as well (line 8).
Lastly, it returns the graph consisting of the reported edges. The intuition for
why this procedure succeeds is that it leverages the friendship paradox: a
phenomenon where a random node's neighbor has more friends, on average, than the
node itself. Essentially, high-degree nodes are overrepresented when we sample
a random neighbor instead of a uniformly random node. Thus, Algorithm 2 is
disproportionately likely to find central nodes in the network who will reveal
many edges and may be good potential seeds.</p>
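Algorithm 2 can be rendered in a few lines of Python. This is an illustrative sketch (not the deployed implementation); it adds a guard for the case where all of a surveyed youth's friends have already been surveyed, which the pseudocode leaves implicit.

```python
import random

def sample_network(V, neighbors, M, rng=random):
    """Survey M/2 uniformly random youth plus one random friend of each,
    returning the set of edges revealed by the surveyed nodes."""
    E, S = set(), set()
    for _ in range(M // 2):
        v = rng.choice([x for x in V if x not in S])
        S.add(v)
        E |= {(v, u) for u in neighbors[v]}
        unseen = [u for u in neighbors[v] if u not in S]
        if unseen:  # guard: N(v) \ S may be empty
            u = rng.choice(unseen)
            S.add(u)
            E |= {(u, w) for w in neighbors[u]}
    return E
```

Because the second survey of each pair targets a random neighbor, high-degree youth are surveyed disproportionately often, which is exactly the friendship-paradox effect described above.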
      </sec>
      <sec id="sec-4-3">
        <title>Parameter robustness</title>
        <p>Algorithm 3 Robust parameter selection
1: input: parameter values p_1...p_L
2: for i = 1...L do
3:   for j = 1...L do
4:     g(p_i, p_j) = value obtained by Algorithm 1 using p_i, evaluated under p_j
5: return argmax_{i=1...L} min_{j=1...L} g(p_i, p_j) / g(p_j, p_j)</p>
        <p>
          A further complication is that the adaptive greedy algorithm assumes that
the propagation probability p is known, in order to calculate the marginal gain Δ.
However, p is never known precisely in practice; each intervention takes months
to deploy, so we are unlikely to observe the many repeated cascades needed for
learning-based approaches. Previous work has attempted to resolve this dilemma
via robust influence maximization [
          <xref ref-type="bibr" rid="ref15 ref3 ref6 ref9">6, 3, 9, 15</xref>
          ], which finds a seed set that
performs well in the worst case over an uncertainty set of possible parameters.
However, the only previous work which addresses robust influence
maximization in an adaptive domain is the DOSIM algorithm. DOSIM requires hours or
even days of runtime on a high-performance computing cluster because it needs
to brute-force over a grid of possible parameter settings. Such computational
expense is far beyond the capabilities of the average service provider,
motivating the development of lightweight but effective heuristics for robust influence
maximization.
        </p>
        <p>Algorithm 3 gives the heuristic used by CHANGE. It searches for a good
nominal value of the parameter p which (when given to Algorithm 1) will
result in high performance no matter what the true value of p actually is.
We first discretize the interval [0, 1] into L points p_1...p_L. Let g(p_i, p_j)
denote the expected influence obtained when we run adaptive greedy planning
based on propagation probability p_i, but the true parameter is p_j. We then find
p* = argmax_{i=1...L} min_{j=1...L} g(p_i, p_j) / g(p_j, p_j). Here, g(p_i, p_j) / g(p_j, p_j) is the ratio of the value
obtained by planning with parameter p_i to the value that could have been obtained
if we knew the true parameter p_j; p* is the parameter which maximizes the
worst-case value of this ratio. Notably, this procedure requires only L^2 runs of adaptive
greedy; we take L = 10 in practice. By contrast, DOSIM requires O(n/ε^3) runs
of a greedy algorithm to achieve approximation error ε. This quickly reaches
thousands (or tens of thousands) of runs even for moderately sized networks
and requires high-performance computing resources.</p>
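The grid search of Algorithm 3 is a one-liner once g is available. Below is an illustrative sketch where `g` is passed in as a callable (in CHANGE, each g(p_i, p_j) would itself be a simulation-based estimate from a run of adaptive greedy); the linear-degradation `g` in the usage lines is purely hypothetical.

```python
def robust_parameter(ps, g):
    """Return the nominal p maximizing the worst-case ratio of the value
    achieved by planning with p to the value achievable with the true p."""
    return max(ps, key=lambda pi: min(g(pi, pj) / g(pj, pj) for pj in ps))

# Hypothetical example: if value degrades linearly with parameter mismatch,
# the middle of the grid is the robust choice.
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
best = robust_parameter(grid, lambda pi, pj: 1.0 - abs(pi - pj))
```

Note that this evaluates g exactly L^2 times, matching the cost stated above.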
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Pilot study procedure</title>
      <p>The major contribution of this work is carrying out a pilot study which
tests the CHANGE agent in a field deployment at a real drop-in center serving
homeless youth in a major U.S. city. Here, we outline the procedure followed for
the pilot study. There were two studies: the feasibility study and the intervention
study. In the feasibility study, we tested just the first component of CHANGE
(network data collection) to validate that it works in practice to gather
high-quality data. In the intervention study, we carried out actual interventions with
homeless youth at the center. This step used all three components of the CHANGE
agent: we gathered the network, found a robust set of parameters, and then
carried out interventions.</p>
      <p>
        For each of the studies, we enrolled (respectively) 72 and 64 youth. Each
youth was paid $20 to enroll in the study (all monetary incentives were the same
as prior studies [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]). We ran CHANGE's data collection mechanism, randomly
sampling a subset of youth to query for ties. Each youth who enrolled was also
asked to complete a baseline survey. As part of this survey, we also gathered
the full network consisting of ties from all of the youth. We emphasize that
this data was collected just for analysis. We did not use the full network to plan
interventions, and we would not expect an agency to conduct this step in a regular
deployment.
      </p>
      <p>
        In the intervention study, trained social workers delivered the Have You
Heard intervention, previously published in the public health literature [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
The social workers conducted a day-long class with the selected youth, covering
HIV awareness and prevention, and training the youth as peer leaders to
communicate with other youth at the agency. Peer leaders were paid $60. Three sets
of peer leaders were selected by CHANGE, with approximately 4 peer leaders in
each set. This matches the size of trainings used in previous influence
maximization pilot studies [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Table 1 reports specific values on the number of youth
enrolled, queried for edges, and trained as peer leaders for our pilot test as well
as pilot tests of previous algorithms. One month after the start of the study, we
conducted a follow up survey with all of the youth who initially enrolled. Some
youth were lost to follow up (see Table 1). We asked the youth whether they
had received information about HIV prevention from a peer who was part of the
study. Youth were paid $20 to respond to the follow up survey. We emphasize
that all aspects of the intervention study (the training materials for peer
leaders, survey instruments, etc.) are identical to Yadav et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], so our results are
directly comparable.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Pilot study results</title>
      <p>
        We focus here on results from the intervention study, which tested the entirety
of the CHANGE agent. In this study, we recruited a population of 64 homeless
youth from a drop-in center. Table 1 gives the total number of youth recruited
for different activities, as well as the corresponding figures for previous pilot
tests of the HEALER and DOSIM algorithms by Yadav et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. We gathered
the full social network from all 64 youth, and in parallel ran Algorithm 2 with a
budget of M = 12 youth to collect a sampled network (querying 18.75% of youth
in total for links). Only the sampled network was used to plan interventions; the
full network was gathered only for analysis. We then ran the CHANGE
policy for three steps, training 10 total peer leaders (15.6% of the network). This
percentage is comparable to previous studies (HEALER and DOSIM trained
approximately 17% of the network each). However, HEALER and DOSIM used
the entire network to plan their intervention, compared to the 18.75% of
sampled youth used by CHANGE. At one month, we conducted a follow-up survey
to assess whether youth received information about HIV prevention from the
peer leaders. 54.7% of youth were retained in the follow-up survey, which is a
somewhat lower percentage than in previous studies. Nevertheless, we obtain a
population of 34 youth who provided follow-up data.
      </p>
      <sec id="sec-6-1">
        <title>Influence spread results</title>
        <p>
          We now present our core result: the number of youth who received a message
about HIV prevention. We examine the percentage of youth in the follow-up
group who were not peer leaders (and hence eligible to become influenced) who
reported receiving information. Figure 2 shows this percentage for our pilot
study of CHANGE as well as the percentages reported by Yadav et al. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]
in their pilot studies of the state of the art algorithms HEALER and DOSIM.
CHANGE reached 80% of non-peer leaders compared to approximately 70%
for each of HEALER and DOSIM. Thus, CHANGE was able to reach just as
many youth while gathering data from only 18.75% of the network. The 10%
difference between CHANGE and HEALER/DOSIM could be attributable to
random variation; we do not claim that CHANGE is actually more effective than
algorithms which gather the entire network. Nevertheless, this result provides
empirical evidence that CHANGE can perform comparably to existing
state-of-the-art influence maximization agents while drastically reducing the amount of
data required.
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Discussion and conclusion</title>
      <p>This paper presents the CHANGE agent for influence maximization, a
multiagent problem with many applications in preventative health and other domains.
CHANGE addresses major barriers to the deployment of influence
maximization by service providers through a series of algorithmic contributions, backed
by simulation results on real-world networks. We then conducted a real-world
pilot study of CHANGE with a drop-in center serving homeless youth, the first
such pilot study of sampling-based influence maximization and only the second
study testing any influence maximization agent in the real world. CHANGE
obtained comparable influence spread to previously field-tested algorithms, but
surveyed only 18% of youth to obtain network data. CHANGE shows empirical
promise in delivering high-quality influence maximization solutions in a manner
which can be feasibly implemented by a service provider.</p>
      <p>While the algorithms underlying CHANGE are easy to implement, they draw
on a series of insights into the social behavior of homeless youth. One lesson
learned from our study is that, to be successful in the field, algorithms must be
designed with their target population and setting in mind. CHANGE both
navigates challenges specific to homeless youth (e.g., the difficulty of locating youth
to query for edges or to serve as peer leaders) and leverages properties of their
social network (the friendship paradox). Our experience shows that accounting for
both challenges and opportunities in the target population is crucial to producing
a practically deployable algorithm.</p>
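The friendship paradox leveraged above is the well-known observation that a randomly chosen neighbor of a random node tends to have higher degree than a random node itself. A minimal simulation (not from the paper; it uses a synthetic G(n, p) random graph rather than a real youth network) illustrates the effect:

```python
import random

def friendship_paradox_gap(n=1000, p=0.02, samples=5000, seed=0):
    """Compare the mean degree of a uniformly random node with the mean
    degree of a random neighbor of a random node on a G(n, p) graph."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i in range(n):                    # build the random graph
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    nodes = [i for i in range(n) if adj[i]]  # ignore isolated nodes
    mean_node_deg = sum(len(adj[i]) for i in nodes) / len(nodes)
    neighbor_degs = []
    for _ in range(samples):              # sample a neighbor of a random node
        v = rng.choice(nodes)
        u = rng.choice(adj[v])
        neighbor_degs.append(len(adj[u]))
    mean_neighbor_deg = sum(neighbor_degs) / len(neighbor_degs)
    return mean_node_deg, mean_neighbor_deg

node_deg, nbr_deg = friendship_paradox_gap()
print(node_deg, nbr_deg)  # the neighbor mean exceeds the node mean
```

Because sampling a neighbor is size-biased toward high-degree nodes, querying the friends of surveyed youth is a cheap way to find well-connected peer-leader candidates without mapping the whole network; the gap is even larger on heavy-tailed real-world networks than on this homogeneous random graph.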
      <p>Acknowledgments: This research was supported by MURI Grant
W911NF-11-1-0332, the California HIV/AIDS Research Program, and an NSF Graduate
Research Fellowship.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piazza</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mekos</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valente</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Peers, schools, and adolescent cigarette smoking</article-title>
          .
          <source>Journal of adolescent health 29(1)</source>
          ,
          <volume>22</volume>
          –
          <fpage>30</fpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Banerjee</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandrasekhar</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duflo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jackson</surname>
            ,
            <given-names>M.O.</given-names>
          </string-name>
          :
          <article-title>The diffusion of microfinance</article-title>
          .
          <source>Science</source>
          <volume>341</volume>
          (
          <issue>6144</issue>
          ),
          <volume>1236498</volume>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Scalable influence maximization for prevalent viral marketing in large-scale social networks</article-title>
          .
          <source>In: KDD</source>
          . pp.
          <volume>1029</volume>
          –
          <fpage>1038</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delling</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pajor</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Werneck</surname>
            ,
            <given-names>R.F.</given-names>
          </string-name>
          :
          <article-title>Sketch-based influence maximization and computation: Scaling up with guarantees</article-title>
          .
          <source>In: CIKM</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakshmanan</surname>
            ,
            <given-names>L.V.</given-names>
          </string-name>
          :
          <article-title>CELF++: Optimizing the greedy algorithm for influence maximization in social networks</article-title>
          .
          <source>In: WWW</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kempe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Robust influence maximization</article-title>
          .
          <source>In: KDD</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kempe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kleinberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tardos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Maximizing the spread of influence through a social network</article-title>
          .
          <source>In: KDD</source>
          . pp.
          <volume>137</volume>
          –
          <fpage>146</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kwak</surname>
            ,
            <given-names>J.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varakantham</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maheswaran</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tambe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hayes</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wood</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Becerik-Gerber</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Towards robust multi-objective optimization under model uncertainty for energy conservation</article-title>
          .
          <source>In: AAMAS ATES workshop</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lowalekar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varakantham</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Robust influence maximization</article-title>
          .
          <source>In: AAMAS</source>
          . pp.
          <volume>1395</volume>
          –
          <issue>1396</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milburn</surname>
            ,
            <given-names>N.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rotheram-Borus</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          :
          <article-title>Pro-social and problematic social network influences on HIV/AIDS risk behaviours among newly homeless youth in Los Angeles</article-title>
          .
          <source>AIDS care 19(5)</source>
          ,
          <volume>697</volume>
          –
          <fpage>704</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tulbert</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cederbaum</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barman-Adhikari</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milburn</surname>
            ,
            <given-names>N.G.</given-names>
          </string-name>
          :
          <article-title>Mobilizing homeless youth for HIV prevention: a social network analysis of the acceptability of a face-to-face and online social networking intervention</article-title>
          .
          <source>Health education research 27(2)</source>
          ,
          <volume>226</volume>
          –
          <fpage>236</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Influence maximization: Near-optimal time complexity meets practical efficiency</article-title>
          .
          <source>In: SIGMOD</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Valente</surname>
            ,
            <given-names>T.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pumpuang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Identifying opinion leaders to promote behavior change</article-title>
          .
          <source>Health Education &amp; Behavior</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Wilder</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Immorlica</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tambe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Maximizing influence in an unknown social network</article-title>
          .
          <source>In: AAAI Conference on Artificial Intelligence</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Wilder</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yadav</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Immorlica</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tambe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Uncharted but not uninfluenced: Influence maximization with an uncertain network</article-title>
          .
          <source>In: AAMAS</source>
          . pp.
          <volume>1305</volume>
          –
          <issue>1313</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yadav</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>A.X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tambe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Using social networks to aid homeless shelters: Dynamic influence maximization under uncertainty</article-title>
          .
          <source>In: AAMAS</source>
          . pp.
          <volume>740</volume>
          –
          <issue>748</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Yadav</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilder</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petering</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craddock</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yoshioka-Maxwell</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hemler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Onasch-Vera</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tambe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Influence maximization in the field: The arduous journey from emerging to deployed application</article-title>
          .
          <source>In: AAMAS</source>
          . pp.
          <volume>150</volume>
          –
          <issue>158</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Young</surname>
            ,
            <given-names>S.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Online social networking technologies, HIV knowledge, and sexual risk and testing behaviors among homeless youth</article-title>
          .
          <source>AIDS and Behavior</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ),
          <volume>253</volume>
          –
          <fpage>260</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>