<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">End to end influence maximization for HIV prevention</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Bryan</forename><surname>Wilder</surname></persName>
							<email>bwilder@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Laura</forename><surname>Onasch-Vera</surname></persName>
							<email>onaschve@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Juliana</forename><surname>Hudson</surname></persName>
							<email>jnhudson@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jose</forename><surname>Luna</surname></persName>
							<email>joseluna@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nicole</forename><surname>Wilson</surname></persName>
							<email>wilsonnj@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Robin</forename><surname>Petering</surname></persName>
							<email>petering@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Darlene</forename><surname>Woo</surname></persName>
							<email>darlenew@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Milind</forename><surname>Tambe</surname></persName>
							<email>tambe@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Eric</forename><surname>Rice</surname></persName>
							<email>ericr@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Artificial Intelligence in Society</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">End to end influence maximization for HIV prevention</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">9F3E0FA00C8FF3F46562210E05D1ADB8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T22:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Influence maximization</term>
					<term>HIV prevention</term>
					<term>social networks</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This work aims to overcome the challenges in deploying influence maximization to support community driven interventions. Influence maximization is a crucial technique used in preventative health interventions, such as HIV prevention amongst homeless youth. Dropin centers for homeless youth train a subset of youth as peer leaders who will disseminate information about HIV through their social networks. The challenge is to find a small set of peer leaders who will have the greatest possible influence. While many algorithms have been proposed for influence maximization, none can be feasibly deployed by a service provider: existing algorithms require costly surveys of the entire social network of the youth to provide input data, and high performance computing resources to run the algorithm itself. Both requirements are crucial bottlenecks to widespread use of influence maximization in real world interventions. To address the above challenges, this paper introduces the CHANGE agent for influence maximization. CHANGE handles the end-to-end process of influence maximization, from data collection to peer leader selection. Crucially, CHANGE only surveys a fraction of the youth to gather network data and minimizes computational cost while providing comparable performance to previously proposed algorithms. We carried out a pilot study of CHANGE in collaboration with a drop-in center serving homeless youth in a major U.S. city. CHANGE surveyed only 18% of the youth to construct its social network. However, the peer leaders it selected reached just as many youth as previously field-tested algorithms which surveyed the entire network. This is the first real-world study of a network sampling algorithm for influence maximization.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>This paper presents and field-tests a novel, practical agent for influence maximization, the challenge of selecting a small set of seed nodes in a social network who will diffuse information to many others. Such techniques have important applications ranging from preventative health <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b0">1]</ref> to international development <ref type="bibr" target="#b1">[2]</ref>.</p><p>We are particularly motivated by the challenge of preventing HIV spread among homeless youth <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b17">18,</ref><ref type="bibr" target="#b9">10]</ref> (although our contributions would also assist other public health interventions). Here, influence maximization is used to select homeless youth who will serve as peer leaders and spread messages about HIV prevention through their social network. Pilot studies in this domain have shown that algorithmic approaches have great promise, substantially outperforming status-quo heuristics <ref type="bibr" target="#b16">[17]</ref>. However, current algorithms have a high barrier to entry: they require a great deal of time to gather the complete social network of the youth, expertise to select appropriate parameters, and computational power to run the algorithms. None of these are likely available to resourcestrained service providers who will ultimately be the ones to deploy influence maximization.</p><p>Gathering network data is particularly onerous because it requires individually surveying upwards of a hundred youth. Further, network collection is more time intensive than simple survey methods, requiring days of time for a dedicated team of social work researchers. It is not feasible for service providers with many other responsibilities.</p><p>The other barriers are also serious impediments to wide-scale adoption of influence maximization. Service providers will not have access to the high-performance computing resources required by previous algorithms, where high computational cost is often incurred to find solutions robust to unknown parameters. For instance, DOSIM, the state of the art algorithm for robust influence maximization <ref type="bibr" target="#b14">[15]</ref>, takes hours of runtime on a high-performance computing system. In reality, a deployed system would need to run in minutes on a laptop. This paper presents CHANGE (CompreHensive Adaptive Network samplinG for social influencE), a novel, end-to-end agent for influence maximization which addresses the above barriers via a set of algorithmic contributions. CHANGE is easy to deploy, but this simplicity is crucially enabled by a series of insights into the social structure of homeless youth (which may be useful for other vulnerable populations). We conducted a pilot test of CHANGE's performance in a real deployment by a drop-in center serving homeless youth in a major U.S. city. CHANGE was used to plan a series of interventions designed to spread HIV awareness among the youth. CHANGE obtained comparable influence spread to state of the art algorithms while surveying only 18% of nodes for network data.</p><p>Overall, CHANGE offers a practical, field-tested vehicle for deployed influence maximization which drastically lowers the barrier to entry. To our knowledge, this is the first real-world pilot study of a network sampling algorithm for influence maximization and only the second ever field test of any influence maximization algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The influence maximization problem was introduced by Kempe et al. <ref type="bibr" target="#b6">[7]</ref>, and has been extensively studied since then <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. Most work has focused on algorithms which are scalable to extremely large networks, primarily in the context of online viral marketing. Recently, HIV prevention (and preventative health more broadly) has emerged as a new application area for influence maximization which brings its own set of research challenges. Yadav et al. <ref type="bibr" target="#b15">[16]</ref> proposed HEALER, a POMDP-based algorithm for selecting influential peer leaders. Subsequently, Wilder et al. <ref type="bibr" target="#b14">[15]</ref> introduced the DOSIM algorithm which uses robust optimization to account for uncertainty about the true probability of influence propagation. Our approach to parameter robustness is similar to techniques used in robust MDP planning <ref type="bibr" target="#b7">[8]</ref>, though the domains are entirely different.</p><p>Yadav et al. <ref type="bibr" target="#b16">[17]</ref> conducted a real-world pilot study of HEALER and DOSIM, and found that both algorithms significantly outperformed the status-quo heuristic used by agencies (selecting high-degree nodes). However, neither algorithm addresses any of the challenges described above. Both assume that the entire social network is provided as input, which is unrealistic in practice due to the enormous effort required. Further, only DOSIM handles uncertainty about the probability of influence spread, and its method for doing so is extremely computationally intensive (see Section 4.3).</p><p>Separate work by Wilder et al. <ref type="bibr" target="#b13">[14]</ref> considered network data collection. They proposed the ARISEN algorithm which samples a portion of youth in the network to collect data from. While ARISEN can be theoretically analyzed for certain network structures, it is not practically suitable to deployment because it relies on querying a sequence of specific youth who may be difficult to locate (see Section 4.2). Moreover, ARISEN does not consider either parameter uncertainty or execution errors (the possibility that some peer leaders will not attend), both of which we incorporate into CHANGE.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Problem description</head><p>Motivating domain: Our work is designed to overcome the challenges in deploying influence maximization techniques to support community-driven interventions. We are specifically motivated by the challenge of raising awareness about HIV among homeless youth. Typically, an HIV awareness intervention will be provided by a drop in center or other organization which serves homeless youth. Each intervention is a day-long class followed by weekly hour-long meetings. Hence (as is typical in many intervention domains), the service provider will almost never have enough resources to deliver the intervention to all of the youth that frequent the center; instead, the intervention is usually delivered to 15-20% of the population 2 . Further, limitations on space and personnel mean that the intervention can typically be delivered to only 4-6 youth at a given time, so the training is broken up over a series of small sessions. These youth are trained as peer leaders who communicate with other youth about HIV prevention. This amplifies the reach of the intervention through the social network of the homeless youth. The question is which youth will make the most effective peer leaders, able to reach the greatest number of their peers. This is an influence maximization problem, which we now formalize.</p><p>Influence: The youth have a social network represented as a graph G = (V, E). Each youth is initially inactive, meaning that they have not received information about HIV prevention. Once nodes are activated by the intervention, they have a chance to influence their peers. We model this process through a variant on the classical independent cascade model (ICM) which has been used by previous work on HIV prevention and better reflects realistic time dynamics <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b16">17]</ref>. The process unfolds over discrete time steps t = 1...T , where T is a time horizon. There is a propagation probability p. When a node becomes active, it attempts to activate each of its neighbors. Each attempt succeeds independently with probability p. Activation attempts are made at each time step until either the neighbor is influenced or the time horizon is reached.</p><p>Note that the assumption that p is uniform across edges is without much loss. As noted by He and Kempe <ref type="bibr" target="#b5">[6]</ref>, a uniform p is equivalent to each edge drawing an individual propagation probability i.i.d. from a distribution with mean p. This is because the following processes are analytically equivalent: (1) propagate influence with probability p and (2) draw a propagation probability q from a distribution with E[q] = p and then propagate influence with probability q. Hence, our model actually subsumes any stochastic model where the propagation probabilities are drawn from a common prior.</p><p>Interventions: At each time step t = 1...T , the algorithm selects a seed set A t containing up to K nodes. However, each seed node may or may not actually attend the intervention. This problem is particularly acute with homeless youth since a number of factors could prevent a given youth from attending (e.g., being arrested, running out of money for a bus ticket, etc.). Hence, we assume that each node v has a hidden state x v ∈ {present, absent}. Each node's state is drawn independently from some prior distribution D. For simplicity, we will take D to set each node to be present with probability q. However, all of our analysis applies to arbitrary distributions. For each</p><formula xml:id="formula_0">v ∈ A t , if x v = present, then v is activated. Nothing occurs if x v = absent.</formula><p>Note that an absent node can still become activated by others, since they may still be in contact with others in the social network. After the set A t is chosen, the intervention occurs and the hidden state of each v ∈ A t is observed. We denote the set of all observations received at time t as O t . The algorithm may use this information to plan the next intervention. In other words, the problem is adaptive. To model adaptivity, we introduce the notion of a policy. A policy maps from past actions and observations to the action that should be taken next. Let A = {S ⊆ V : |S| ≤ K} be the set of all possible actions. A history is the current sequence of actions chosen and observations received, denoted by ψ t = ((A 1 , O 1 ), (A 2 , O 2 ), ...(A t , O t )). Let Ψ be the set of all possible histories. A policy is a mapping π : Ψ → A. Let A(ψ t ) = (A 1 ...A t ) be the sequence of actions taken and O(ψ) = (O 1 ...O t ) be the corresponding observations (whether each peer leader was present or absent). Recall that youth are trained in groups of 4-6; the policy selects a group of youth to invite given who was trained previously. We denote the objective as f (A(ψ)|O(ψ)). f is the expected number of nodes influenced by the seed nodes in A(ψ) conditioned on the observations in O(ψ). We overload notation and let f (π) = E ψ∼π [f (A(ψ)|O(ψ))] be the expected reward from running policy π, where the expectation ranges over the hidden state x (which determines π's actions) as well as the influence process. Our goal is to find a policy maximizing f (π).</p><p>Uncertainty about network structure and parameters: We consider extensions to the core adaptive influence maximization problem which account for the lack of information endemic in field deployments. First, we consider the case where the structure of the network (the edges E) are unknown. To address this challenge, we give our agent a budget of M queries to run before conducting the intervention. Each query may target either a uniformly random node, or the neighbor of a node already queried. When a node is queried, it reveals all of its edges. The goal is to use the M queries to uncover a set of edges which suffice to identify influential nodes.</p><p>We then consider an unknown propagation probability. Here, we take a robust optimization approach and look for a policy which performs well across a range of possible values for p. More detail on this part of the problem can be found in Section 4.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">CHANGE: a new agent for influence maximization in the field</head><p>We now introduce the CHANGE agent for end-to-end influence maximization. Figure <ref type="figure" target="#fig_0">1</ref> illustrates the three components of the agent. We start with the last component, peer leader selection, since the other components exist to provide the data that the peer leader selection algorithm requires. Peer leader selection is performed by an adaptive greedy algorithm (Algorithm 1), which handles the chance that some peer leaders may not attend the intervention and plans solutions using the observations obtained so far. Algorithm 1 requires as input a (sample of) social network and a propagation probability p. We subsequently introduce Algorithms 2 and 3 to provide these inputs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Adaptive greedy planning</head><p>Algorithm 1 Adaptive greedy Given as input the graph G and propagation probability p, finding the optimal policy is a difficult planning problem. There are 2 n possible hidden states and n K possible actions. While it is possible to formulate the problem as a POMDP, these exponentially large state and action spaces place even small instances beyond the reach of off-the-self solvers. Hence, we exploit the structure of the problem to formulate a scalable greedy algorithm which obtains (provably) near-optimal solutions.</p><p>Pseudocode for adaptive greedy, our online planning algorithm, can be found in Algorithm 1. Algorithm 1 selects the action at each step which maximizes the expected gain in influence spread, conditioned on the observations received so far. Then, it waits until this action has been executed, observes which peer leaders attended the intervention, and greedily plans the next step. Formally, let</p><formula xml:id="formula_1">∆(A t |ψ t−1 ) = f (A(ψ t−1 ) ∪ A t |O(ψ t−1 )) − f (A(ψ t−1 )|O(ψ t−1 )</formula><p>) denote the expected marginal gain to selecting A t at time t. The greedy policy is to select A t = arg max |A|≤K ∆(A|ψ t−1 ) (the outer loop of Algorithm 1). However, computing the maximizing action is itself computationally intractable (as there are n K possible choices). Hence, Algorithm 1 uses an additional greedy inner loop which greedily selects the elements of A t one at a time (lines 3-5). Note that ∆ can be computed by averaging over random simulations over both the hidden state (which nodes are present/absent) as well as how influence spreads via the ICM.</p><p>We prove the following theorem, which shows that greedy planning is sufficient to obtain a guaranteed approximation ratio: Theorem 1. Let π G be Algorithm 1's greedy policy and π * be an optimal policy. It holds that f (π G ) ≥ e−1 2e−1 f (π * ).</p><p>A proof may be found in the supplemental material. We use the adaptive submodularity framework of Golovin and Krause, which generalizes the classical notion of a submodular set function to adaptive policies. Their framework does not straightforwardly apply to our problem since our algorithm selects a sequence of actions, not a set. The order in which actions are selected matters since peer leaders who are selected earlier will have more time to influence others. We show that our problem can be reformulated as maximizing an adaptive submodular set function subject to a more complex set of constraints (a partition matroid). This is the first approximation guarantee for adaptive influence maximization under execution errors, which is a well-known challenge in domains such as ours <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b16">17]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Network collection</head><p>Algorithm 2 Network sampling 1: input: vertex set V , budget M 2: E = ∅ //set of edges observed 3: S = ∅ //set of nodes surveyed 4: for i = 1... M 2 do 5:</p><p>Sample v uniformly at random from V \ S 6: S = S ∪ {v} 7:</p><formula xml:id="formula_2">E = E ∪ {(v, u) : u ∈ N (v)} 8:</formula><p>Sample u uniformly at random from N (v) \ S 9:</p><p>E = E ∪ {(u, w) : w ∈ N (u)} 10: S = S ∪ {u} 11: return E</p><p>The adaptive greedy algorithm assumes that the graph G is fully specified. However, in order for an intervention to deployed in practice, the social network needs to be laboriously gathered by interviewing the entire population of homeless youth (potentially hundreds of youth in total). This is not practical for a service provider to carry out on their own. We present an approach (Algorithm 2) which randomly samples a small number of youth to survey. Our procedure is easy for a service provider to implement in the field without much computational assistance. This simplicity is enabled by underlying insights about the structure of homeless youth social networks, which may assist with intervention design in other vulnerable populations.</p><p>We assume that the service provider has the ability to survey up to M youth. Each youth, when surveyed, reveals all of their edges. Algorithm 2 chooses M 2 nodes uniformly at random from the population to survey (line 5). For each surveyed node, it choses a uniformly random neighbor to survey as well (line 8). Lastly, it returns the graph consisting of the reported edges. The intuition for why this procedure succeeds is that it leverages the friendship paradox : a phenomena where a random node's neighbor has more friends, on average, than the node itself. Essentially, high-degree nodes are overrepresented when we sample a random neighbor instead of a uniformly random node. Thus, Algorithm 2 is disproportionately likely to find central nodes in the network who will reveal many edges and may be good potential seeds. for j = 1...L do 4:</p><p>g(pi, pj) = value obtained by Algorithm 1 using pi evaluated under pj 5: return arg maxi=1...L minj=1...L g(p i ,p j ) g(p j ,p j )</p><p>A further complication is that the adaptive greedy algorithm assumes that the propagation probability p is known, in order to calculate the marginal gain ∆. However, p is never known precisely in practice; each intervention takes months to deploy so we are unlikely to observe the many repeated cascades needed to for learning-based approaches. Previous work has attempted to resolve this dilemma via robust influence maximization <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b14">15]</ref> which finds a seed set which performs well in the worst case over an uncertainty set of possible parameters. However, the only previous work which addresses robust influence maximization in an adaptive domain is the DOSIM algorithm. DOSIM requires hours or even days of runtime on a high-performance computing cluster because it needs to brute force over a grid of possible parameter settings. Such computational expense is far beyond the capabilities of the average service provider, motivating the development of lightweight but effective heuristics for robust influence maximization.</p><p>Algorithm 3 gives the heuristic used by CHANGE. It searches for a good nominal value of the parameter p, which (when given to Algorithm 1) will result in high performance no matter what the true value of p actually is. We first discretize the interval [0, 1] into L points p 1 ...p L . Let g(p i , p j ) denote the expected influence obtained when we run adaptive greedy planning based on propagation probability p i , but the true parameter is p j . We then find p * = arg max i=1...L min j = 1...L g(pi,pj ) g(pj ,pj ) . Here, g(pi,pj ) g(pj ,pj ) , is the ratio of the value based on planning with parameter p i to the value that could have been obtained if we new the true parameter p j . p * is the parameter which maximizes the worstcase value of this ratio. Notably, this procedure requires only L 2 runs of adaptive greedy; we take L = 10 in practice. By contrast, DOSIM requires O n 3 runs of a greedy algorithm to achieve approximation error . This quickly reaches thousands (or tens of thousands) of runs even for moderately sized networks and requires high-performance computing resources.</p><p>5 Pilot study procedure The major contribution of this work is carrying out a pilot study which tests the CHANGE agent in a field deployment at a real drop in center serving homeless youth in a major U.S. city. Here, we outline the procedure followed for the pilot study. There were two studies, the feasibility study and the intervention study. In the feasibility study, we just tested the first component of CHANGE (network data collection) to validate that it works in practice to gather highquality data. In the intervention study, we carried out actual interventions with homeless youth at the center. This step used all three steps of the CHANGE agent: we gathered the network, found a robust set of parameters, and then carried out interventions.</p><p>For each of the studies, we enrolled (respectively) 72 and 64 youth. Each youth was paid $20 to enroll in the study (all monetary incentives were the same as prior studies <ref type="bibr" target="#b16">[17]</ref>). We ran CHANGE's data collection mechanism, randomly sampling a subset of youth to query for ties. Each youth who enrolled was also asked to complete a baseline survey. As part of this survey, we also gathered the full network consisting of ties from all of the youth. We emphasize that this data was collected just for analysis. We did not use the full network to plan interventions, and we would not expect an agency to conduct this step in a regular deployment.</p><p>In the intervention study, trained social workers delivered the Have You Heard intervention, previously published in the public health literature <ref type="bibr" target="#b10">[11]</ref>. The social workers conducted a day-long class with the selected youth, covering HIV awareness and prevention, and training the youth as peer leaders to communicate with other youth at the agency. Peer leaders were paid $60. Three sets of peer leaders were selected by CHANGE, with approximately 4 peer leaders in each set. This matches the size of trainings used in previous influence maximization pilot studies <ref type="bibr" target="#b16">[17]</ref>. Table <ref type="table" target="#tab_1">1</ref> reports specific values on the number of youth enrolled, queried for edges, and trained as peer leaders for our pilot test as well as pilot tests of previous algorithms. One month after the start of the study, we conducted a follow up survey with all of the youth who initially enrolled. Some youth were lost to follow up (see Table <ref type="table" target="#tab_1">1</ref>). We asked the youth whether they had received information about HIV prevention from a peer who was part of the study. Youth were paid $20 to respond to the follow up survey. We emphasize that all aspects of the intervention study (the training materials for peer leaders, survey instruments, etc.) are identical to Yadav et al. <ref type="bibr" target="#b16">[17]</ref>, so our results are directly comparable.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Pilot study results</head><p>We focus here on results from the intervention study, which tested the entirety of the CHANGE agent. In this study, we recruited a population of 64 homeless youth from a drop-in center. Table <ref type="table" target="#tab_1">1</ref> gives the total number of youth recruited for different activities, as well as the corresponding figures for previous pilot tests of the HEALER and DOSIM algorithms by Yadav et al. <ref type="bibr" target="#b16">[17]</ref>. We gathered the full social network from all 64 youth, and in parallel ran Algorithm 2 with a budget of M = 12 youth to collect a sampled network (querying 18.75% of youth in total for links). Only the sampled network was used to plan interventions; the full network was gathered only for analysis. We then ran the CHANGE policy for three steps, training 10 total peer leaders (15.6% of the network). This percentage is comparable to previous studies (HEALER and DOSIM trained approximately 17% of the network each). However, HEALER and DOSIM used the entire network to plan their intervention, compared to the 18.75% of sampled youth used by CHANGE. At one month, we conducted a follow-up survey to assess whether youth received information about HIV prevention from the peer leaders. 54.7% of youth were retained in the follow-up survey, which is a somewhat lower percentage than in previous studies. Nevertheless, we obtain a population of 34 youth who provided follow-up data. We now present our core result: the number of youth who received a message about HIV prevention. We examine the percentage of youth in the follow-up group who were not peer leaders (and hence eligible to become influenced) who reported receiving information. Figure <ref type="figure" target="#fig_2">2</ref> shows this percentage for our pilot study of CHANGE as well as the percentages reported by Yadav et al. <ref type="bibr" target="#b16">[17]</ref> in their pilot studies of the state of the art algorithms HEALER and DOSIM. CHANGE reached 80% of non-peer leaders compared to approximately 70% for each of HEALER and DOSIM. Thus, CHANGE was able to reach just as many youth while gathering data from only 18.75% of the network. The 10% difference between CHANGE and HEALER/DOSIM could be attributable to random variation; we do not claim that CHANGE is actually more effective than algorithms which gather the entire network. Nevertheless, this result provides empirical evidence that CHANGE can perform comparably to existing state of the art influence maximization agents while drastically reducing the amount of data required.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Influence spread results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Discussion and conclusion</head><p>This paper presents the CHANGE agent for influence maximization, a multiagent problem with many applications in preventative health and other domains. CHANGE addresses major barriers to the deployment of influence maximization by service providers through a series of algorithmic contributions, backed by simulation results on real-world networks. We then conducted a real-world pilot study of CHANGE with a drop-in center serving homeless youth, the first such pilot study of sampling-based influence maximization and only the second study testing any influence maximization agent in the real world. CHANGE obtained comparable influence spread to previously field tested algorithms, but surveyed only 18% of youth to obtain network data. CHANGE has empirical promise in delivering high-quality influence maximization solutions in a manner which can be feasibly implemented by a service provider.</p><p>While the algorithms underlying CHANGE are easy to implement, they draw on a series of insights into the social behavior of homeless youth. One lesson learned from our study is that, to be successful in the field, algorithms must be designed with their target population and setting in mind. CHANGE both navigates challenges specific to homeless youth (e.g., the difficulty of locating youth to query for edges or serve as peer leaders) and leverages properties of their social network (the friendship paradox). Our experience shows that accounting for both challenges and opportunities in the target population is crucial to produce a practically deployable algorithm.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Illustration of the CHANGE agent.</figDesc><graphic coords="5,188.88,115.83,237.60,126.08" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>4. 3</head><label>3</label><figDesc>Parameter robustness Algorithm 3 Robust parameter selection 1: input: parameter values p1...pL 2: for i = 1...L do 3:</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Percentage of youth who were not peer leaders reached by each algorithm in its respective real-world pilot test.</figDesc><graphic coords="10,235.68,491.69,144.00,94.67" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 .</head><label>1</label><figDesc>Number of youth recruited, trained, and retained for follow-up in each study. CHANGE refers to the study conducted in this work to test the CHANGE agent. The other columns are taken from Yadav et al.<ref type="bibr" target="#b16">[17]</ref>, who conducted pilot tests of HEALER and DOSIM.</figDesc><table><row><cell></cell><cell cols="3">CHANGE HEALER DOSIM</cell></row><row><cell cols="2">Youth recruited 64</cell><cell>62</cell><cell>56</cell></row><row><cell cols="2">Queried for links 18.75%</cell><cell>100%</cell><cell>100%</cell></row><row><cell>PL trained</cell><cell>15.6%</cell><cell>17.7%</cell><cell>17.85%</cell></row><row><cell>Retained</cell><cell>54.7%</cell><cell>73%</cell><cell>73%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">An extended version of this paper has been accepted as a full paper at AAMAS</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2018" xml:id="foot_1">.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2">Note that while CHANGE directly surveys ∼18% of youth, they name others as friends, resulting in a larger sampled graph.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgments: This research was supported by MURI Grant W911NF-11-1-0332, the California HIV/AIDS Research Program, and a NSF Graduate Research Fellowship.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Peers, schools, and adolescent cigarette smoking</title>
		<author>
			<persName><forename type="first">C</forename><surname>Alexander</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Piazza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Mekos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Valente</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of adolescent health</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="22" to="30" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The diffusion of microfinance</title>
		<author>
			<persName><forename type="first">A</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Chandrasekhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Duflo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">O</forename><surname>Jackson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science</title>
		<imprint>
			<biblScope unit="volume">341</biblScope>
			<biblScope unit="page">1236498</biblScope>
			<date type="published" when="2013">6144. 2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Scalable influence maximization for prevalent viral marketing in large-scale social networks</title>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1029" to="1038" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Sketch-based influence maximization and computation: Scaling up with guarantees</title>
		<author>
			<persName><forename type="first">E</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Delling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pajor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F</forename><surname>Werneck</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
			<publisher>CIKM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Celf++: optimizing the greedy algorithm for influence maximization in social networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">V</forename><surname>Lakshmanan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>WWW</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Robust influence maximization</title>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kempe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">KDD</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Maximizing the spread of influence through a social network</title>
		<author>
			<persName><forename type="first">D</forename><surname>Kempe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kleinberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">É</forename><surname>Tardos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="137" to="146" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Towards robust multi-objective optimization under model uncertainty for energy conservation</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Y</forename><surname>Kwak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Varakantham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Maheswaran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tambe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Becerik-Gerber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAMAS ATES workshop</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Robust influence maximization</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lowalekar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Varakantham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAMAS</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1395" to="1396" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Pro-social and problematic social network influences on hiv/aids risk behaviours among newly homeless youth in los angeles</title>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">G</forename><surname>Milburn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Rotheram-Borus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AIDS care</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="697" to="704" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Mobilizing homeless youth for hiv prevention: a social network analysis of the acceptability of a face-to-face and online social networking intervention</title>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tulbert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cederbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barman Adhikari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">G</forename><surname>Milburn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Health education research</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="226" to="236" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Influence maximization: Near-optimal time complexity meets practical efficiency</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
			<publisher>SIGMOD</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Identifying opinion leaders to promote behavior change</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">W</forename><surname>Valente</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pumpuang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Health Education &amp; Behavior</title>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Maximizing influence in an unknown social network</title>
		<author>
			<persName><forename type="first">B</forename><surname>Wilder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Immorlica</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tambe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Uncharted but not uninfluenced: Influence maximization with an uncertain network</title>
		<author>
			<persName><forename type="first">B</forename><surname>Wilder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Immorlica</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tambe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AAMAS</title>
		<imprint>
			<biblScope unit="page" from="1305" to="1313" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Using social networks to aid homeless shelters: Dynamic influence maximization under uncertainty</title>
		<author>
			<persName><forename type="first">A</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><surname>Xin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tambe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AAMAS</title>
		<imprint>
			<biblScope unit="page" from="740" to="748" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Influence maximization in the field: The arduous journey from emerging to deployed application</title>
		<author>
			<persName><forename type="first">A</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wilder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Petering</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Craddock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yoshioka-Maxwell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hemler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Onasch-Vera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tambe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Woo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AAMAS</title>
		<imprint>
			<biblScope unit="page" from="150" to="158" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Online social networking technologies, hiv knowledge, and sexual risk and testing behaviors among homeless youth</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">D</forename><surname>Young</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rice</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AIDS and Behavior</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="253" to="260" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
