Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia Heuristic Algorithms for Influence Maximization in Partially Observable Social Networks Sebastian Stein1 , Soheil Eshghi2 , Setareh Maghsudi2 , Leandros Tassiulas2 , Rachel K. E. Bellamy3 , and Nicholas R. Jennings4 1 University of Southampton, Southampton, UK ss2@ecs.soton.ac.uk 2 Yale University, New Haven, USA {soheil.eshghi, setareh.maghsudi, leandros.tassiulas}@yale.edu 3 IBM T.J. Watson Research, Yorktown Heights, NY, USA rachel@us.ibm.com 4 Imperial College London, UK n.jennings@imperial.ac.uk Abstract. We consider the problem of selecting the most influential members within a social network, in order to disseminate a message as widely as possi- ble. This problem, also referred to as seed selection for influence maximization, has been under intensive investigation since the emergence of social networks. Nonetheless, a large body of existing research is based on the assumption that the network is completely known, whereas little work considers partially observable networks. Yet, due to many issues including the extremely large size of current networks and privacy considerations, assuming full knowledge of the network is rather unrealistic. Despite this, an influencer often wishes to distribute its mes- sage far beyond the boundaries of the known network. In this preliminary study, we propose a set of novel heuristic algorithms that specifically target nodes at this boundary, in order to maximize influence across the whole network. We show that these algorithms outperform the state of the art by up to 38% in networks with partial observability. 1 Introduction Today’s vast online social networks connect people across geographical, cultural and organizational boundaries. This offers a tremendous opportunity for quickly dissemi- nating a message to a global audience with only a limited budget. This could be ex- ploited for product marketing, but also for quickly mobilizing large crowds to help during humanitarian crises [15], for collecting data through participatory sensing [23] or for supporting machine intelligence through human computation and crowdsourcing [20]. However, tapping into this vast distributed processing and information resource requires organizations to effectively disseminate a message (e.g., a call for action or participation) far beyond their usual domain of direct influence (e.g., their existing cus- tomers or social media followers). This can be achieved by asking key individuals to Copyright c 2017 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. 20 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia spread the message to contacts within their respective social networks, ideally creating far-reaching cascades of repeated messages. How to choose such key individuals within a large network has been widely studied in the computer science literature and is typically referred to as influence maximiza- tion (IM) [7, 12, 10, 3]. This body of work provides models that allow the prediction of network spread, given a fully known network, enabling those who wish to have their messages reach the most people, e.g., advertisers, marketers or public health profes- sionals, to choose the best few people to target with their initial messaging. Although IM has attracted a great deal of attention in the past few years, the majority of existing work focuses on known networks, rather than the real-world situation where the full network is not known. In essence, little work considers partially observable net- works, where only part of a network is visible to the decision maker. This is especially relevant, for instance, when an organization may be aware of its own members’ network links, but has little information about the wider network. In this paper, we take the first steps towards addressing this problem. Specifically, we survey the state of the art and propose a new model for IM in partially observable networks. Furthermore, we propose a number of heuristic algorithms that target nodes at the boundary of the known network and, through an empirical evaluation on a real- world network, we show that these can outperform the current state of the art by up to 38% in settings with very limited observability. 2 Related Work In the following we first introduce the general influence maximization problem in fully observable settings and then we detail work that has looked at partially observable set- tings. 2.1 IM in Fully Observable Networks In the influence maximization problem, a (fully observable) network is specified along with an activation function that describes the spread of influence. Initially, no individu- als in this network are activated, i.e., influenced by the message. Given this, the decision maker picks a fixed number of individuals as seeds. These become activated and spread the message, i.e., activate their neighbors, according to the specified activation function. The aim of the decision maker is to maximize the total number of activated individuals [10]. While solving this problem is NP-hard in general, there are several computationally efficient approximate solutions with performance guarantees [10, 14, 6, 21, 3]. Some of these scale to realistic-sized networks with millions of nodes or more. This work focuses on partially observable networks, and hence we avoid a detailed discussion of the state-of-the-art IM schemes for fully observable networks. Nonethe- less, we describe the state-of-the-art IM algorithms we use in our numerical analysis. Currently, the most successful IM algorithms with theoretical performance guaran- tees [1] are TIM and TIM+ [19], and IMM [18]. These algorithms are based on the concept of reverse reachable sets [3], defined below. 21 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia Definition 1 (Reverse Reachable Set). Let v be a node in some graph G, and g be a graph obtained by removing each edge e in G with probability 1 − p(e). The reverse reachable (RR) set for v in g is the set of nodes in g that can reach v. That is, for each node u in the RR set, there is a directed path from u to v in g. Theoretically, it can be shown that if an RR set generated for v overlaps with a node set S with probability ρ, then when we use S as the seed set to run an influence propagation process on G, v will be activated with probability ρ. These algorithms first determine how many RR sets need to be generated to pro- vide statistical guarantees for the IM problem. Choosing too many RR sets increases the computational complexity, while choosing too few RR sets affects the algorithm’s performance guarantees. TIM/TIM+ predetermine the required number of RR sets and then generate them, while IMM produces RR sets one after another until a certain stop- ping condition is satisfied. Once the RR sets are generated, a greedy heuristic for the max-cover problem derived from these RR sets is applied to select the seeds, which is guaranteed to at least be within a factor (1 − 1e ) of the optimal case. 2.2 IM in Partially Observable Networks Influence maximization under uncertainty corresponds to a general class of problems, where given limited network information and a fixed budget, the desire is to allocate the budget to a set of seeds, so that the spread of influence is maximized. In particular, the probability of influence among each pair of nodes and/or the network structure might be (partially) unknown. This problem has been investigated in the literature in two general settings: One shot: Here, seed selection is performed only once given the available informa- tion. For example, robust influence maximization [5] addresses uncertainty in the edge influence probability, which is represented as an interval in which the true probability may lie. Its goal is to maximize the worst-case ratio between the influence spread of the chosen seeds and the optimal set of seeds. Repeated: Here, seeds are selected incrementally, so that sampling/learning approaches can be used to acquire information about the a priori unknown propagation character- istics. These work can be classified into two sub-groups: – Bandit-based approaches: In [16], a combinatorial bandit approach is used to model IM, where the agent selects a set of nodes as seeds at every round, and incurs a re- gret due to not selecting the optimal set. Existing regret-minimizing algorithms are used to solve the problem. In another bandit-based IM method [11], the seed selection scheme has two phases: (i) Exploration using a conventional influence maximization algorithm and (ii) Exploitation using a bandit algorithm. In influence maximization with semi-bandit feedback [22], upon activating the seeds, the learner only observes the influenced part of the network and uses an algorithm based on the well-known UCB strategy to select seeds. This latter algorithm can also accom- modate a combinatorial bandit setting, since at every round a set of nodes can be selected. 22 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia – Other approaches: [9] learn the diffusion probability in an information-cascade model by likelihood maximization, and develop an expectation maximization seed selection algorithm. [17] investigates a scenario in which seed selection is per- formed under partial feedback, where at every round the result of past action(s) is not completely revealed. This creates a trade-off between delay (waiting time to see a desired partial outcome) and efficiency of seed selection in terms of spread of influence. This is somewhat similar to the trade-off in influence maximization in changing networks [24], which is between recency and accuracy of network infor- mation and efficiency of spread. Finally, [13] proposes a heuristic algorithm which at every round probes a set of nodes which have a high probability of large degrees. All probed nodes are ranked based on their degree, and nodes with the highest ranks are selected as seeds. In this work, we explore influence maximization under a type of uncertainty which has not been investigated so far. In our setting, a part (or some parts) of a network is known (e.g., individuals that belong to the decision maker’s organization), while the rest is completely unobservable. We investigate the applicability of the state-of-the-art approaches, and propose new heuristic algorithms for seed selection. 3 Model and Problem We consider a dense network modeled by a directed graph G(V, E, w) and called an influence network: – V = {1, 2, . . . , N } indicates the set of nodes. Each node represents a network entity, for instance an individual. G represents a social network; – E ⊆ V × V is the set of directed edges; – w : E → [0, 1] is a weight function, where w(i, j) is the likelihood (possibility) of j being influenced by i. Alternatively and with a slight change of notation, w can be defined as the weight matrix. In an influence network, initially a seed set S ⊂ V becomes activated. Every activated node causes its neighbors to become (and stay) activated according to the independent cascade model [8]. In this model, an activated node i ∈ V might activate each neighbor 1 j with probability w(i, j). In a weighted cascade model, we have w(i, j) = |{(k,j)∈E}| , where |{(k, j) ∈ E}| is the in-degree of j. It should be mentioned that in addition to the independent- and weighted cascade models, the spread of influence through a network can be described by some other mod- els, one of them being the well-known linear threshold model. In this work, however, we confine our attention to the cascade influence propagation model. Departing from existing models in the literature, we assume that the network is only partially observable. Specifically, we assume that the decision maker, who selects the seeds, observes only one part of the network, denoted by G0 = (V 0 , E 0 , w0 ). Naturally we have V 0 ⊆ V, E 0 ⊆ E(⊆ V 0 × V 0 ), and w0 : E 0 → [0, 1]. Note that the range of function w0 includes the weights of the edges belonging to E 0 . 23 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia One specific case of partially observable networks is the organization-partitioned model of network. In such networks, a subset of nodes, O ⊆ V, reveal their neighbors (and possibly the relevant edge-weights) to the decision makers, e.g., because they are members of the decision maker’s organization. We denote this subset of nodes as fully observable nodes, while the neighbors of nodes in O (that are not also in O) are denoted as boundary nodes. We assume only nodes in O can be selected as initial seeds. Figure 1 provides an illustrative example of an organization-partitioned network. Concluding this section, we define the general problem of influence maximization in partially observable networks as follows: Given a specific activation function, a seed set size M and the observable part G 0 of a network G, it is desired to find a seed set S ⊆ V 0 (with |S| ≤ M ), so that in the underlying network G, the total number of activated nodes through seeding S is maximized. 2 3 7 1 6 4 8 5 9 Fig. 1. Illustrative example of an organization-partitioned network, which is a spacial case of partially observable networks. The light nodes are observable. Note that there is only a weak tie between the observable and unobservable parts of the networks, which makes the influence maximization, and thereby the seed selection, particularly challenging. In this paper, we confine our attention to heuristic-based IM (described in the fol- lowing section), and leave the investigation of inference-based IM in partially observ- able network (where the unseen part of the network is reconstructed using generative models) for our future work. 3.1 Heuristic-based Influence Maximization Here the structure of the problem is such that reliable inference is not possible. This might arise, for instance, due to restricted availability of samples or limited computa- tional capacity. In such circumstances, instead of inference, we explore simple rules for selecting seed nodes: – Random Selection: In this case, we select the seeds simply at random. – Random with Neighbor Activation: Here, we first select the seeds at random, and then for each seed, we activate one of its neighbors. This approach is based on the friendship paradox [25], first discovered by S. L. Feld. It states that on an average basis, most people have fewer friends than their friends have. Thus, if we select some seed at random, it is beneficial to activate one of its neighbors, instead of the original node itself, due to possible larger number of connections. 24 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia Network Generative generation network model algorithm Filter Influence Maximisation Spread model Algorithm Filtered network Spread algorithm Inferred network(s) Selected seed nodes Message spread Fig. 2. Benchmarking framework – Selection based on (Weighted) Degree: In this heuristic, we first rank the nodes in the known part of the network based on their degree, so that nodes with larger number of neighbors have higher ranks. Then we select node with highest rank as seeds. Here we use the intuition that nodes with larger number of connections are very likely to be highly influential. In the settings we considered in this paper, neigh- bours typically had non-negligible influence on each other. However, in some real- world social networks, users may have large numbers of followers but actually exert little influence on them [4]. Here, links could be pruned based on historical influ- ence interactions, but we leave this to future work. In the weighted version of this approach, we still rank the nodes based on their degree; however, in order to improve the influence probability in the unknown part of the network, we attach a higher weight for neighbours that are in the boundary set. For example, when w = 3, then node 6 in Figure 1 would have a weighted degree of 4, while node 7 would only have a weighted degree of 3 (since all its neighbours are fully observable). – State of the Art with Limited Visibility: Here we simply use the IM algorithms which are originally developed for known networks, for instance TIM and IMM de- scribed in Section 2.1. In doing so, we intend to investigate the extent to which the state of the art is applicable to partially known networks. Similar to the activation based on degree, TIM and IMM can be used in the weighted version. In essence, we run the TIM and IMM in the known part of the network as conventional; how- ever, for selecting the RR sets, we weight the edges with connections to outside the boundary, although such connections are not completely visible to us. 25 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia 4 Experimental Evaluation In order to evaluate influence maximization algorithms in partially observable networks, we implemented a comprehensive benchmarking framework, as shown in Figure 2 (here, dashed lines indicate the use of a probabilistic model, and solid lines indicate data flow). Briefly, the figure shows how a full network is generated, then filtered to a partially observable network. The IM algorithm then selects seed nodes (potentially using inference) and the outcome is simulated. We implemented a range of algorithms for generating synthetic networks and filtering these. However, due to the limited space, we only report on a subset of our results. In the following, we first describe our experi- mental setup and then outline the results. 4.1 Experimental Setup We use the NetHept5 dataset, a network of 15k nodes and 31k edges (representing citations within the high energy physics theory community). We choose this because it constitutes a real-world dataset of reasonable size and because it has been widely used to benchmark influence maximization algorithms [18]. Throughout our experiments, we will generate partially observable networks with varying numbers of fully observable nodes (i.e., nodes that are observable and whose neighbours are also observable — see Section 3). We do this by initialising the set of fully observable nodes (O) to contain a number of seed nodes, chosen uniformly at random. In our experiment, we initially choose 4 seed nodes, but our results are similar for other settings. Then, we iteratively add one additional node to O at a time, either by picking a node from O and then adding one of its neighbours that is currently not in O to O (both uniformly at random), or, when no such neighbours exist, by adding a random node to O. This continues until O contains a target number of nodes. Again, due to space reasons, we focus on the weighted cascade model (see Sec- tion 3), although we have observed broadly similar trends more generally in the inde- pendent cascade model (for various edge weight distributions). We evaluate most of the heuristics described in Section 3.1. Here, we use Random and Random Neighbour to denote Random Selection (without or with Neighbour Ac- tivation). Weighted(1) represents selection based on degree, while Weighted(w) is the weighted version that attaches a relative weight w to boundary nodes. Finally, IMM(1) is the state-of-the-art IMM algorithm described in Section 2, and IMM(w) similarly represents the weighted version. We choose IMM here, due to its consistently high per- formance compared to other state-of-the-art algorithms [1]. We leave the evaluation of centrality-based heuristics and other state-of-the-art approaches in partially observable settings to future work. To evaluate each influence maximization algorithm, we generate a partially observ- able network, run the algorithm on it and then simulate the influence spread on the chosen set of seeds 1,000 times to obtain an average spread. We repeat this process 1,000 times (to evaluate a wide range of partially observable networks) and present 5 This can be downloaded at https://www.microsoft.com/en-us/research/ wp-content/uploads/2016/02/weic-graphdata.zip. 26 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia the overall average spread. The 95% confidence intervals are typically smaller than an absolute deviation of 2 from the average value, so are not plotted on the graphs. 4.2 Results Figures 3 and 4 show the average spread of all algorithms in a setting with five seed nodes, as we vary the network visibility, i.e., the proportion of fully observable nodes (|O| / |V|). Figure 3 focuses on networks with particularly low observability (up to 10%), while Figure 4 shows the full range of proportions. 180 160 140 IMM(1) 120 IMM(2) Average Spread IMM(5) 100 Random RandomNeighbour Weighted(1) 80 Weighted(2) Weighted(5) 60 40 20 2% 4% 6% 8% 10% Network Visibility Fig. 3. Average spread in partially observable networks for 5 seeds (up to 10% visibility). Here, several interesting trends emerge. As expected, the state-of-the-art IMM algo- rithm performs will throughout all settings. However, looking more closely at cases with low observability (Figure 3), some of the heuristic approaches outperform it. Specifi- cally, the degree-based heuristics perform consistently well, sometimes achieving an up to 19% higher average spread than IMM. For example, when 1% of nodes are fully observable, IMM achieves an average spread of 112.12 ± 1.78 (with 95% confidence interval), while Weighted(2) achieves 133.19 ± 1.95.6 This is likely because IMM fo- cuses on maximising its influence only within the known part of the network. In settings with low visibility, this is a highly inaccurate view and the degree-based heuristics offer a better estimate of what nodes are influential across the full network. 6 A t-test confirms there is a statistically significant difference at the p < 0.0001 level. 27 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia 200 175 150 IMM(1) 125 IMM(2) Average Spread IMM(5) Random 100 RandomNeighbour Weighted(1) 75 Weighted(2) Weighted(5) 50 25 0 0% 20% 40% 60% 80% 100% Network Visibility Fig. 4. Average spread in partially observable networks for 5 seeds (up to 100% visibility). As visibility rises (also continuing in Figure 4), this difference becomes less pro- nounced, and by 10% visibility, they achieve a similar performance. As visibility rises further, IMM(1) eventually achieves the highest performance (from 50% visibility on- wards). This is not surprising, since it is known to perform well in fully observable networks. Looking specifically at the effect of adding higher weights for boundary nodes to the heuristics, this can lead to a significant increase in the average spread. However, the performance is sensitive to the exact parameter value, and for networks with higher visibility, a high weight can indeed lead to a decrease in performance, and is most pronounced for Weighted(5) in settings with 20–50% visibility. This is because the boundary nodes actually decrease in importance in the network as more of it is known to the algorithm. Note that the relatively good performance of degree heuristics is surprising, as it goes against the distinction between influential and high-degree nodes that has been observed in some settings [4]. This may be due to the fact that NetHept, as a citation network, has high-degree assortativity [2, Chapter 7], meaning that higher degree nodes are likely to be connected to other high-degree nodes and vice versa. This would mean that targeting a high-degree node can lead to influence spread via its high-degree neigh- bors (including within the unseen part of the network). Promisingly, adding the weight to IMM significantly increases the average spread of IMM — up to 6% in some cases, and IMM(2) consistently performs better or as well as IMM across all settings. However, as for the degree-based heuristic, choosing a very 28 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia large weight can lead to a decrease in performance for the same reasons as mentioned above. Both random heuristics achieve a low level of performance, but using neighbour activation achieves a significantly higher spread than not using it (up to 100% more in some cases). This is potentially promising in settings where very little is known about the network, but where neighbours of known nodes can be activated. Note the apparent decrease in performance for larger networks is due to the way we filter nodes based on their neighbours — more connected nodes are likely to be included in our fully observable set earlier. Thus, when sampling a random node in a network with smaller visibility, it is more likely to be well connected, than when sampling from a network with full visibility. 40 IMM(1) 35 IMM(2) IMM(5) 30 Random RandomNeighbour Average Spread per Seed 25 Weighted(1) Weighted(2) 20 Weighted(5) 15 10 5 0 0 20 40 60 80 100 Number of Seeds Fig. 5. Average spread for varying numbers of seeds for 1% visibility. Finally, Figure 5 shows the average spread per initial seed chosen as the number of initial seeds is increased (in a setting with 1% visibility). This highlights that our heuristic techniques achieve the highest performance gains over the state of the art in settings with fewer initial seeds (a gain of up to 38% when there is just a single initial seed). Overall, these are promising results, showing that in settings where large parts of the network are not observable and where only few seeds can be chosen, the state-of- the-art algorithm does not necessarily perform best. Instead simple heuristics perform well, and both those heuristics and the current state of the art benefit from explicitly favouring nodes at the boundary of the known network. It should also be noted that the 29 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia heuristics are several order of magnitude faster than IMM — a typical run of IMM took about 0.1–0.2 seconds, while the degree-based heuristics typically completed within 0.2–0.3ms. 5 Conclusions and Future Work In this paper, we presented initial results on influence maximization in partially observ- able networks. We showed that the current state of the art does not necessarily produce the best results, especially when only small parts of the network are visible. Indeed, we examined some settings where a degree-based heuristic leads to a 38% improvement over the state of the art. In future work, we will examine these settings in more detail, to derive more prin- cipled inference-based algorithms that work well in a range of settings and that offer some performance guarantees. We will look at influence maximization over different types of real-world networks to understand whether the relative performance of our simple heuristic depends on any parameters of the underlying network (e.g., degree assortativity). We will then consider the case where the decision maker has access to a generative network model that can be used for inference about the structure of the unseen part of the network (conditional on the observed part) before application of IM algorithms. Two specific examples of such generative models include: – Sampling: An inference model is used to sample multiple snapshots of the unseen network. Across all snapshots, the average marginal influence of all nodes in V 0 is calculated and a greedy IM approach is applied. – Maximum Likelihood: The model is used to calculate a single maximum likeli- hood network and an existing IM approach is applied to this. Finally, we will investigate repeated settings, as described in Section 2.2, where seeds are selected incrementally and where information about the unknown part of the network is gradually revealed. Acknowledgments This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence under Agreement Number W911NF-16-3-0001. The views and conclusions contained in this document are those of the authors and should not be in- terpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copy-right notation hereon. The work of S. M. was supported by a post-doctoral fellowship from the Ger- man Research Foundation (DFG) under Grant Number MA 7111/1-1. 30 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia References 1. Arora, A., Galhotra, S., Ranu, S.: Debunking the myths of influence maximization: An in- depth benchmarking study. In: Proceedings of the 2017 ACM International Conference on Management of Data. pp. 651–666. ACM (2017) 2. Barabási, A.L.: Network science. Cambridge University Press (2016) 3. Borgs, C., Brautbar, M., Chayes, J., Lucier, B.: Maximizing social influence in nearly opti- mal time. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. pp. 946–957. SIAM (2014) 4. Cha, M., Haddadi, H., Benevenuto, F., Gummadi, P.K.: Measuring user influence in twitter: The million follower fallacy. (2010) 5. Chen, W., Lin, T., Tan, Z., Zhao, M., Zhou, X.: Robust influence maximization. In: Proceed- ings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 795–804. KDD ’16, ACM, New York, NY, USA (2016) 6. Chen, W., Yuan, Y., Zhang, L.: Scalable influence maximization in social networks under the linear threshold model. In: Data Mining (ICDM), 2010 IEEE 10th International Conference on. pp. 88–97. IEEE (2010) 7. Domingos, P., Richardson, M.: Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data min- ing. pp. 57–66 (2001) 8. Goldenberg, J., Libai, B., Muller, E.: Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing letters 12(3), 211–223 (2001) 9. K. Satio, R.N., Kimora, M.: Prediction of information diffusion probabilities for independent cascade model. In: Knowledge-Based Intelligent Information and Engineering Systems. pp. 67–75. Springer (2008) 10. Kempe, D., Kleinberg, J.M., Tardos, E.: Maximizing the spread of influence through a social network. In: KDD. pp. 137–146 (2003) 11. Lei, S., Maniu, S., Mo, L., Cheng, R., Senellart, P.: Online influence maximization. In: Pro- ceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 645–654. KDD ’15, ACM (2015) 12. M. Richardson, P.D.: Mining knowledge-sharing sites for viral marketing. In: Proceedings of the Eighth Intl. Conf. on Knowledge Discovery and Data Mining. pp. 61–70 (2002) 13. Mihara, S., Tsugawa, S., Ohsaki, H.: Influence maximization problem for unknown social networks. In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. pp. 1539–1546. ASONAM ’15, ACM (2015) 14. Mossel, E., Roch, S.: On the submodularity of influence in social networks. In: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing. pp. 128–134. ACM (2007) 15. Okolloh, O.: Ushahidi, or ’testimony’: Web 2.0 tools for crowdsourcing crisis information. Participatory learning and action 59(1), 65–70 (2009) 16. S. Vaswani, L.L., Schmidt, M.: Influence maximization with bandits. In: Proceedings of the 2015 International Conference on Neural Information Processing Systems. pp. 1–6 (2015) 17. Tang, S.: No time to observe: Adaptive influence maximization with partial feedback. CoRR abs/1609.00427 (2016) 18. Tang, Y., Shi, Y., Xiao, X.: Influence maximization in near-linear time: A martingale ap- proach. In: Proceedings of the 2015 ACM SIGMOD International Conference on Manage- ment of Data. pp. 1539–1554. ACM (2015) 19. Tang, Y., Xiao, X., Shi, Y.: Influence maximization: Near-optimal time complexity meets practical efficiency. In: Proceedings of the 2014 ACM SIGMOD international conference on Management of data. pp. 75–86. ACM (2014) 31 Proceedings of the 3rd International Workshop on Social Influence Analysis (SocInf 2017) August 19th, 2017 - Melbourne, Australia 20. Von Ahn, L.: Human computation. In: Data Engineering, 2008. ICDE 2008. IEEE 24th In- ternational Conference on. pp. 1–2. IEEE (2008) 21. Wang, C., Chen, W., Wang, Y.: Scalable influence maximization for independent cascade model in large-scale social networks. Data Mining and Knowledge Discovery 25(3), 545 (2012) 22. Wen, Z., Kveton, B., Valko, M.: Influence maximization with semi-bandit feedback. CoRR abs/1605.06593 (2016) 23. Zenonos, A., Stein, S., Jennings, N.R.: Coordinating measurements for air pollution moni- toring in participatory sensing settings. In: Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. pp. 493–501 (2015) 24. Zhuang, H., Sun, Y., Tang, J., Zhang, J., Sun, X.: Influence maximization in dynamic social networks. In: Data Mining (ICDM), 2013 IEEE 13th International Conference on. pp. 1313– 1318. IEEE (2013) 25. Zuckerman, E.W., Jost, J.T.: What makes you think you’re so popular? Self-evaluation main- tenance and the subjective side of the “friendship paradox”. Social Psychology Quarterly pp. 207–223 (2001) 32