=Paper=
{{Paper
|id=Vol-2154/paper8
|storemode=property
|title=Strategic Attacks on Trust Models via Bandit Optimization
|pdfUrl=https://ceur-ws.org/Vol-2154/paper8.pdf
|volume=Vol-2154
|authors=Taha D. Gunes,Long Tran-Thanh,Timothy J. Norman
|dblpUrl=https://dblp.org/rec/conf/atal/GunesTN18
}}
==Strategic Attacks on Trust Models via Bandit Optimization==
Taha D. Güneş, Long Tran-Thanh, Timothy J. Norman
t.d.gunes@soton.ac.uk, l.tran-thanh@soton.ac.uk, t.j.norman@soton.ac.uk
Agents, Interaction and Complexity Research Group, Electronics and Computer Science, University of Southampton, UK

==Abstract==

Trust and reputation systems are designed to mitigate the risks associated with decisions to rely upon systems over which there is no direct control. The effectiveness of trust models is typically evaluated against relatively shallow metrics regarding the sophistication of potential attacks. In reality, such systems may be open to strategic attacks, which need to be investigated in depth if trust model resilience is to be more fully understood. Here, we devise an orchestrated attack strategy for a specific state-of-the-art statistical trust model (HABIT). We evaluate how these intelligent attack strategies can influence predictions of target trustworthiness by this model. Our conjecture is that this approach represents a stronger benchmark for the assessment of trust models in general.

==1 Introduction==

Interactions among autonomous agents inherently involve the risk of encountering behaviour that is unexpected and, potentially, damaging to the goals of the principal. To this end, models of trust are devised to support agents in making decisions regarding with whom to interact. This decision process may, however, be vulnerable to orchestrated attack from adversaries. Existing trust models are, to a substantial extent, not rigorously evaluated with respect to adversarial attack. The typical argument used with respect to attacks on a trust model is: our model is robust to, say, whitewashing attacks; during evaluation, we introduce a number of agents that perform poorly and then return with a new identity; we then demonstrate that our model is, by some metric, robust to this attack [BNS10].
The assumption is that we have a relatively small number (with respect to the population) of agents that follow a rather naive attack strategy. Of more concern for computational trust models is an intelligent, strategic attacker with a specific aim in mind, potentially generating a high pay-off. To our knowledge, there is very little research that has explored this kind of attack and the potential impact it may have on computational models of trust. All the leading statistical models of trust [TLRJ12, RPC06, FBZ13, BNS10, KSGM03] have been evaluated using either (adapted) real-world datasets or via simulation. The metrics of evaluation typically include the mean absolute error of an assessment of trustworthiness, or precision and recall of a classification mechanism, with assessment against a set of straightforward attacks.

Copyright © by the paper's authors. Copying permitted only for private and academic purposes. In: R. Cohen, M. Sensoy, and T. J. Norman (eds.): Proceedings of the 20th International Workshop on Trust in Agent Societies, Stockholm, Sweden on July 14-15, 2018, published at http://ceur-ws.org

As argued by Jøsang [Jøs12, JG09], however, this type of analysis requires additional investigation to ensure robustness in real-world scenarios:

"Many studies on robustness in the research literature suffer from the authors' desire to put their own trust and reputation system (TRS) designs in a positive light, with the result that the robustness analyses often are too superficial and fail to consider realistic attacks. Publications providing comprehensive robustness analyses are rare."

The small number of studies that attempt to investigate the robustness of trust models include Kerr & Cohen [KC09], Wang et al. [WMLZ14], Muller et al. [MWLZ16], Bidgoly & Ladani [BL16] and Ruan & Durresi [RD16]. The attacks considered in these studies are rule-based, and not necessarily representative of what a sophisticated attacker is capable of.
Kerr & Cohen, for example, employ a playbook approach, in which the attacker selects strategies using a similarity metric to assess situations for which engineered strategies have been defined as relevant [KC09]. Wang et al. define a stochastic process to model the attacker's strategy in selecting from representative attack actions [WMZL15]. We take a different approach, using the trust model employed by the decision maker (the principal) to assess service providers given the available evidence. A malicious service provider (the attacker), targeting a decision maker, intends to influence their reasoning so that assessments of service providers are shifted to the attacker's benefit. From the attacker's perspective, therefore, the goal is to find what kinds of evidence have most influence on these assessments. This may involve self-promotion by the attacker, or slandering competitors [HZNR09], but we make no prior assumptions about which strategies are most useful. In essence, the aim is to find sets of actions that can affect the evidence used by the principal so as to shift the trust system towards the malicious motive.

We ground our attack model on a hierarchical optimistic bandit optimization mechanism, presenting this algorithm in Section 3. This is used to strategically sample the space of possible attacks on a trust model, where we cannot assume that the attack space is convex. In Section 4, we assess the performance of this model against HABIT [TLRJ12], a generic model that has been shown to be highly effective in challenging scenarios and benchmarked against other leading algorithms, and we discuss our findings in Section 5. Before giving an overview of our approach, however, we formulate the trust problem itself and show how HABIT exploits the available evidence to make trust decisions.

==2 Trust Model==

Trust assessment is the problem of producing a numeric rating (or ranking) for each of a set of agents, $A = \{1, \ldots, n\}$, which are assumed to be functionally equivalent.
In other words, we do not take into account questions of risk in making delegation decisions; we focus on the prediction of performance on some problem. Different models use different kinds of evidence in making these assessments, the most simple being a determination of whether the provision of a service was a success or a failure. Other models use quality ratings, or take into account multiple dimensions of performance such as quality and time to delivery. In general, however, these are observations made by the principal, $tr$ (the truster), of the agent concerned (often referred to as the trustee), $te$, over some period of time (e.g. from time 0 to time $t$): $O_{tr \to te}^{0:t}$. We assume that an interaction occurs at each time step. All the information that is, in principle, available to form assessments is $E = \{O_{i \to j}^{0:t} \mid (i, j) \in R\}$, where $R$ is the set of all agent pairs $(i, j)$. We, therefore, assume, in common with most trust models, that these observations are made by a single principal of the performance of a single agent. Generally, the goal of a statistical trust model is to estimate the expectation of $p(O_{tr \to te} \mid E)$: the observation that the principal, $tr$, expects in a future interaction with $te$, given the evidence. Of course, the principal may have limited memory of past encounters, or it may have limited, biased or no access to others' observations (reports of others' direct observations).

The HABIT model, proposed by Teacy et al. [TLRJ12], is a generic statistical trust model, compatible with both continuous and discrete forms of information and claims (evidence), that has been empirically demonstrated to be robust to malicious and noisy information. The model can be instantiated with different types of distribution so that it can consume different types of observation; e.g. success/failure, or a qualitative performance rating.
The generic model assumes that the reported performance of each trustee (reputation), $\theta_{\cdot \to te}$, is dependent on a hyper-parameter vector $\phi$, which is intended to account for the group behaviour of all trustees. Thus, $\theta_{\cdot \to te}$ is a set consisting of one parameter vector for each principal with a specific trustee, $\theta_{\cdot \to te} = \{\theta_{i \to te} \mid i \in A\}$. These parameter vectors define the probability that an interaction of principal $i$ with trustee $j$ will be successful, $P(O_{i \to j} = \text{success}) = \theta_{i \to j}^{\text{success}}$. The aim is to calculate the trustworthiness of trustee $te$ from the perspective of truster $tr$ given all evidence $E$: that is, $E[\theta_{tr \to te} \mid E]$.

Here, we adopt the Dirichlet Process [Fer73] instance of the HABIT model. In this instance, trust assessments are calculated analytically rather than through variational inference or Monte Carlo methods. The generative model is $\theta_{tr \to te} \sim \mathrm{Dir}(\alpha_{tr \to te})$ and $O_{tr \to te} \sim \mathrm{Cat}(\theta_{tr \to te})$.¹ Given all evidence $E$, the trustworthiness of trustee $te$ is calculated as:

$$E[\theta_{tr \to te} \mid E] = c_0 \frac{\mathrm{Beta}(O_{tr \to te}^{0:t})}{\mathrm{Beta}(\emptyset)} E[\mathrm{Dir}(\theta_{tr \to te} \mid O_{tr \to te}^{0:t})] + \sum_{j=1}^{n} \frac{\mathrm{Beta}(O_{tr \to j}^{0:t} \cup O_{tr \to te}^{0:t})}{\mathrm{Beta}(O_{tr \to te}^{0:t})} E[\mathrm{Dir}(\theta_{tr \to te} \mid O_{tr \to j}^{0:t} \cup O_{tr \to te}^{0:t})] \quad (1)$$

where:

$$\mathrm{Dir}(\theta_{i \to j} \mid O_{i \to j}) = \frac{\prod_{l=1}^{k} (\theta_{i \to j}^{(l)})^{\alpha_l + n_l}}{\mathrm{Beta}(\alpha + n)} \quad (2)$$

is the posterior Dirichlet distribution given the evidence, which increments each $\alpha_l$ by the number of times, $n_l$, that outcome $l$ was observed;

$$\mathrm{Beta}(\alpha) = \frac{\prod_{l=1}^{k} \Gamma(\alpha_l)}{\Gamma\!\left(\sum_{l=1}^{k} \alpha_l\right)} \quad (3)$$

is the multivariate Beta function; and $c_0$ is the chosen Dirichlet process constant, which gives more weight to the direct observation information $O_{tr \to te}^{0:t}$ than to other observations. Lastly, the expectation of a Dirichlet distribution with hyper-parameters $\alpha = \langle \alpha_1, \ldots, \alpha_k \rangle$ is:

$$E[\mathrm{Dir}(\theta_{tr \to te} = l \mid O_{i \to j})] = \frac{\alpha_l + n_l}{\sum_{i=1}^{k} (\alpha_i + n_i)} \quad (4)$$

For this investigation, since the evaluation of each interaction is binary, the number of outcome values is $k = 2$. The idea behind Equation 1 is the assumption that there is a correlation between the behaviour of the trustee of interest, $te$, and that of the other trustees.
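The Dirichlet-process computation above (Equations 1–4) can be sketched directly in code. The following is a minimal illustration, not the authors' implementation: the function names are ours, the multivariate Beta function is evaluated in log space via `math.lgamma` for numerical stability, and we assume the mixture weights in Equation 1 are normalised so that the result is a valid expectation.

```python
import math

def log_beta(alpha):
    # Log of the multivariate Beta function (Equation 3)
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def dir_expect(alpha, counts, l):
    # Expectation of the posterior Dirichlet, each alpha_l incremented
    # by the outcome count n_l (Equation 4)
    post = [a + n for a, n in zip(alpha, counts)]
    return post[l] / sum(post)

def habit_expect(alpha, counts_te, counts_others, c0, l=0):
    """Sketch of Equation 1 (Dirichlet-process instance of HABIT).

    counts_te:     outcome counts of direct observations of the target trustee
    counts_others: outcome-count vectors for the truster's other trustees
    """
    # Direct-observation component, weighted by c0 * Beta(O_te) / Beta(empty)
    w_direct = c0 * math.exp(
        log_beta([a + n for a, n in zip(alpha, counts_te)]) - log_beta(alpha))
    num = w_direct * dir_expect(alpha, counts_te, l)
    den = w_direct
    # Components sharing statistical strength with each other trustee j,
    # weighted by Beta(O_j u O_te) / Beta(O_te)
    for counts_j in counts_others:
        merged = [a + b for a, b in zip(counts_te, counts_j)]
        w_j = math.exp(
            log_beta([a + n for a, n in zip(alpha, merged)])
            - log_beta([a + n for a, n in zip(alpha, counts_te)]))
        num += w_j * dir_expect(alpha, merged, l)
        den += w_j
    return num / den  # normalised mixture of posterior expectations
```

For example, with a uniform prior $\alpha = (1, 1)$, five successful direct observations, and two similarly behaved trustees, the predicted success probability rises well above the prior mean of 0.5, illustrating how the model pools evidence across trustees.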
The equation takes account of this similarity between trustees.

==3 Attacker Model==

To misguide a truster, an attacker introduces new information into the system or alters previous historical information. We denote an alteration of the evidence as $E'$. For the attacker, the objective is to find the most rewarding modification $E^* \in \mathcal{X}$ in an attack space $\mathcal{X} = \{E'_1, \ldots, E'_{|\mathcal{X}|}\}$:

$$E^* = \arg\max_{E'} \underbrace{E[\theta_{tr \to te} \mid E'] - E[\theta_{tr \to te} \mid E]}_{\text{trust gain}} \quad (5)$$

An attack is an alteration in the evidence available to a principal that misguides the principal's predictions about the trustworthiness of other parties. Formally, we define an attack as $E' = \{A(O_{i \to j}^{t'}) \mid (i, j) \in R \text{ and } t' \in \{0, \ldots, t\}\}$, where $A : \{\text{success}, \text{failure}\} \to \{\text{success}, \text{failure}\}$ is the manipulation function. Finding the most rewarding modification $E^*$ in a brute-force manner is not feasible, since the search space grows rapidly with the number of trustees and the interaction representation (e.g. a larger $k$). If we assume full control over the evidence, the number of possible attacks is:

$$|\mathcal{X}| = \binom{\rho + (|O_{tr \to \cdot}^{0:t}| \cdot k) - 1}{(|O_{tr \to \cdot}^{0:t}| \cdot k) - 1} \quad (6)$$

by weak compositions of $\rho$ [HM04], where $\rho$ is the attack power: i.e. the number of observations that will be added to the evidence by the attack. For instance, when the trust system has 20 service providers ($|O_{tr \to \cdot}^{0:t}| = 20$) and binary opinions ($k = 2$), the number of attacks grows with the attack power as follows:

¹ $\mathrm{Cat}(\theta_{tr \to te})$ is a categorical distribution, such that $p(O_{tr \to te}) = \theta_{tr \to te}$.

Table 1: Space size $|\mathcal{X}|$ versus attack power $\rho$
 ρ = 1: 38
 ρ = 2: 741
 ρ = 3: 9880
 ρ = 4: 101270
 ρ = 5: 850668
 ρ = 6: 6096454
 ρ = 7: 38320568
 ρ = 8: 215553195
 ρ = 9: 1101716330

Instead of assuming full control, we introduce an additional parameter, $s$, to denote the number of trust links that the attacker can access. This reduces the space, such that:

$$|\mathcal{X}| = \binom{\rho - 1}{s - 1} \binom{|O_{tr \to \cdot}^{0:t}| \cdot k}{s} \quad (7)$$

The first term counts the strict compositions of $\rho$ into $s$ parts, i.e. the number of ways to divide the $\rho$ attack observations into $s$ parts. This number is then multiplied by the number of ways the divided attack can be applied to the evidence that the principal has. Even though this second parameter reduces the space, exploring it in a brute-force manner is still not feasible.

We consider exploring this space as an optimisation problem, which is strongly non-convex (and in many cases discrete rather than continuous, e.g. when the scores are discrete values); thus, we cannot rely on standard gradient-based solutions either. Hence we turn to sampling-based techniques, among which bandit optimisation is one of the most promising for this problem. We adapt the hierarchical optimistic optimization strategy HOO (cf. Algorithm 1) of Bubeck et al. [BMSS11], which provides low regret bounds with quadratic computational complexity, to strategically sample this space and estimate the mean-payoff gain of attacks in the attack space $\mathcal{X}$.

The main idea of this strategy is to estimate the gains of attacks accurately near the maxima, the partitions that we are interested in, and only loosely in the remaining partitions of the attack space. To achieve this, the optimization strategy builds a binary tree that represents partitions of the attack space $\mathcal{X}$ (as shown in Algorithm 1). Deeper nodes of the tree represent smaller subsets of attacks in the space. Each round a new node is added, carrying statistical information that is updated every round. This statistical information is used to select the next region to explore; thus, the space is explored incrementally within the most rewarding sub-regions. The nodes in the binary tree are indexed by integer pairs $(h, i)$, where $h$ is the depth of a node and $i$ is the index of the node among all possible nodes at that depth (in the range $1 \le i \le 2^h$).
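The space-size formulas of Equations 6 and 7 are easy to check numerically. The sketch below uses Python's `math.comb`; note that reproducing the figures in Table 1 requires 38 observation slots (i.e. 19 trustees × $k = 2$), which suggests the target trustee's own link is excluded from the count — that reading is an assumption on our part.

```python
from math import comb

def space_size_full(rho, n_links, k=2):
    # Equation 6: weak compositions of the attack power rho over the
    # n_links * k observation slots the attacker could touch
    m = n_links * k
    return comb(rho + m - 1, m - 1)

def space_size_limited(rho, n_links, s, k=2):
    # Equation 7: strict compositions of rho into s parts, multiplied by
    # the number of ways to choose which s trust links are affected
    return comb(rho - 1, s - 1) * comb(n_links * k, s)

# Reproduces the first rows of Table 1 (38, 741, 9880, ...) with n_links = 19
sizes = [space_size_full(rho, 19) for rho in range(1, 7)]
```

Restricting the attacker to $s = 2$ links shrinks the space dramatically for larger budgets: for $\rho = 9$ the count drops from roughly $1.1 \times 10^9$ to $\binom{8}{1}\binom{38}{2} = 5624$.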
Therefore, the root node is represented as $(0, 1)$. The children of the root node are denoted $(1, 1)$ and $(1, 2)$; in general, the children of $(h, i)$ are the nodes $(h + 1, 2i - 1)$ and $(h + 1, 2i)$. A region of $\mathcal{X}$ is associated with each node, written $P_{h,i} \subset \mathcal{X}$, and must satisfy the constraint $P_{h,i} = P_{h+1,2i-1} \cup P_{h+1,2i}$ for all $h \ge 0$ and $1 \le i \le 2^h$. The statistical information stored in the nodes is:

* $T_{h,i}$: the number of times an attack from the region $(h, i)$ or its descendants has been explored.
* $U_{h,i}$: an initial estimate for the region $(h, i)$, which is the sum of:
** $\hat{\mu}_{h,i}$: the average reward received in the region associated with $(h, i)$;
** $\sqrt{(2 \ln n) / T_{h,i}}$: the uncertainty of this average; and
** $\nu_1 \rho^h$: the maximum variation of the mean-payoff function within the region $P_{h,i}$.
* $B_{h,i}$: the actual estimate of the mean-payoff function, calculated from $U_{h,i}$.

In particular, with the stored statistical information, the algorithm progressively explores the region as follows. Each round, the region to explore is selected by picking the node with the highest B-value (lines 6–10 of Algorithm 1). The selected region is played and the corresponding reward is received (lines 15–17). The tree is then updated with the collected reward (lines 19–33). In detail: the path (a set of nodes) followed to select the region is updated with the reward in lines 19–21; the tree structure $\mathcal{T}$ of the space $\mathcal{X}$ is traversed and the initial estimates of the subregions are updated in lines 23–25; initial optimistic estimates for the new descendants of the selected node $(H, I)$ are set in lines 26–27; and the B-values are updated in lines 29–32.

Algorithm 1: The HOO Strategy
 Parameters: two real numbers ν₁ > 0 and ρ ∈ (0, 1); attack space X; all evidence E.
 Auxiliary function LEAF(T): outputs a leaf of T.
 Initialisation: T = {(0, 1)} and B₍₁,₁₎ = B₍₁,₂₎ = +∞.
  1: for n = 1, 2, ... do                          ▷ for each round n
  2:   (h, i) ← (0, 1)
  3:   P ← {(h, i)}
  4:   while (h, i) ∈ T do
  5:     if B₍h+1,2i−1₎ > B₍h+1,2i₎ then
  6:       (h, i) ← (h + 1, 2i − 1)
  7:     else if B₍h+1,2i−1₎ < B₍h+1,2i₎ then
  8:       (h, i) ← (h + 1, 2i)
  9:     else
 10:       (h, i) ← choose a child randomly
 11:     end if
 12:     P ← P ∪ {(h, i)}
 13:   end while
 14:   (H, I) ← (h, i)
 15:   arbitrarily choose one of the attacks E′ in X from the partition P₍H,I₎
 16:   Y ← E[θ_{tr→te} | E′] − E[θ_{tr→te} | E]    ▷ receive the reward for the selected attack
 17:   T ← T ∪ {(H, I)}
 18:   for (h, i) ∈ P do
 19:     T₍h,i₎ ← T₍h,i₎ + 1
 20:     μ̂₍h,i₎ ← (1 − 1/T₍h,i₎) μ̂₍h,i₎ + Y/T₍h,i₎  ▷ update the mean μ̂₍h,i₎ of node (h, i)
 21:   end for
 22:   for (h, i) ∈ T do
 23:     U₍h,i₎ ← μ̂₍h,i₎ + sqrt((2 ln n)/T₍h,i₎) + ν₁ ρ^h
 24:   end for
 25:   B₍H+1,2I−1₎ ← +∞
 26:   B₍H+1,2I₎ ← +∞
 27:   T′ ← T
 28:   while T′ ≠ {(0, 1)} do
 29:     (h, i) ← LEAF(T′)
 30:     B₍h,i₎ ← min{ U₍h,i₎, max{ B₍h+1,2i−1₎, B₍h+1,2i₎ } }
 31:     T′ ← T′ \ {(h, i)}
 32:   end while
 33: end for

==4 Simulation Results==

To evaluate the performance of our attack model, we conducted empirical experiments investigating how prior evidence and the number of trustees in the environment affect the impact of an attack, and how our attack model performs before converging to the best possible attack. Each experiment was repeated 1000 times with different trustee reputation parameters $\theta_{tr \to te}$, and the average was taken to minimise the influence of the random trustee parameters. All experiments used a simulated environment in which an attacker wants to influence a truster's trust assessment of a single trustee. The targeted truster makes trust assessments with the HABIT model, as described in Section 2, and the attacker modifies the truster's observations of the environment using our attack model, as described in Section 3. In each experiment, the number of rounds allocated to the HOO strategy was equal to the number of possible attacks.
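For concreteness, Algorithm 1 can be transcribed into a short Python sketch. This is an illustration only, not the experimental implementation: it exercises HOO on a toy one-dimensional mean-payoff function over [0, 1] rather than on an attack space, and the class name and dyadic-interval partitioning are our own choices.

```python
import math
import random

class HOO:
    """Minimal sketch of the HOO strategy (Algorithm 1); naming follows the paper."""
    def __init__(self, v1=1.0, rho=0.5):
        self.v1, self.rho = v1, rho
        self.tree = {(0, 1)}                   # explored nodes (h, i)
        self.T, self.mu = {}, {}               # visit counts, empirical means
        self.B = {(1, 1): math.inf, (1, 2): math.inf}
        self.n = 0

    def select(self):
        # Walk down the tree, following the larger B-value (lines 2-14)
        h, i = 0, 1
        path = [(h, i)]
        while (h, i) in self.tree:
            left, right = (h + 1, 2 * i - 1), (h + 1, 2 * i)
            bl, br = self.B.get(left, math.inf), self.B.get(right, math.inf)
            if bl > br:
                h, i = left
            elif bl < br:
                h, i = right
            else:
                h, i = random.choice([left, right])   # tie: pick a child randomly
            path.append((h, i))
        return (h, i), path

    def update(self, node, path, reward):
        # Add the played node, refresh means and U-/B-values (lines 17-32)
        self.n += 1
        self.tree.add(node)
        H, I = node
        self.B[(H + 1, 2 * I - 1)] = math.inf   # optimistic init for new children
        self.B[(H + 1, 2 * I)] = math.inf
        for nd in path:
            t = self.T.get(nd, 0) + 1
            self.T[nd] = t
            self.mu[nd] = (1 - 1 / t) * self.mu.get(nd, 0.0) + reward / t
        U = {nd: self.mu[nd]
             + math.sqrt(2 * math.log(self.n) / self.T[nd])
             + self.v1 * self.rho ** nd[0]
             for nd in self.tree}
        for h, i in sorted(self.tree, key=lambda nd: -nd[0]):   # deepest first
            child_b = max(self.B.get((h + 1, 2 * i - 1), math.inf),
                          self.B.get((h + 1, 2 * i), math.inf))
            self.B[(h, i)] = min(U[(h, i)], child_b)

def cell_midpoint(h, i):
    # Node (h, i) covers the dyadic interval [(i-1)/2^h, i/2^h]; play its midpoint
    return (2 * i - 1) / 2 ** (h + 1)

def payoff(x):
    # Deterministic toy mean-payoff function with its maximum at x = 0.7
    return 0.5 - abs(x - 0.7)

random.seed(0)
hoo = HOO(v1=1.0, rho=0.5)
best_x, best_y = 0.5, -math.inf
for _ in range(400):
    node, path = hoo.select()
    x = cell_midpoint(*node)
    y = payoff(x)
    if y > best_y:
        best_x, best_y = x, y
    hoo.update(node, path, y)
```

After 400 rounds the best sampled point lies close to the peak at 0.7, and the tree has grown by exactly one node per round, mirroring lines 14–17 of Algorithm 1; in the attack setting, the arm played at line 15 would instead be an evidence modification $E'$ drawn from the selected partition.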
We set a high number of rounds to ensure convergence of the HOO strategy, although convergence was in fact achieved in a smaller number of rounds (as shown in Figure 2). The truster only had prior information about each trustee from previous direct observations. Owing to the computational time available for the experiments, $s$, the number of trust links affected by the attack, was set to 2.

[Figure 1: Trust gains by attacks along two varying dimensions: (a) prior evidence, with $n = 20$ trustees and $t \in \{3, 5, 7, 10\}$; (b) number of trustees $n \in \{7, 14, 20\}$, with $|O_{tr \to \cdot}^{0:t}| = 2$. Both panels plot the maximum trust gain against the impact of the attack, ρ.]

===4.1 Prior evidence===

To determine the effect of prior evidence in the environment on our attack strategy, we ran a series of experiments in which the prior information the truster had about each trustee was varied over the values 3, 5, 7 and 10. The impact of the attack, $\rho$, ranged from 1 to 12. As shown in Figure 1a, when the number of prior direct interactions exceeds the impact parameter ($|O_{tr \to \cdot}^{0:t}| > \rho$), the impact of attacks remained below 0.10. Increasing trust gains were achieved when the attack budget overwhelmed the prior information available.

===4.2 Population in the environment===

The HABIT trust model takes advantage of similarities in the behaviours of trustees: its published results show that when correlation exists (similar reputation parameters, $\theta_{\cdot \to te}$), the model achieves more accurate trust assessments as the number of trustees grows. This strength of the model turns out to be a disadvantage under an orchestrated attack: our attack model collected higher trust gains with the generated attacks in more crowded trustee populations (as shown in Figure 1b).
===4.3 Performance over time===

We investigated the performance of the attack strategy in each round, observing the rewards of the strategy at intermediate steps of the search for an optimal attack. For each of the varying numbers of trustees, the HOO strategy behaved similarly in terms of trust gain and cumulative reward. As the number of trustees increased the space grew, and the number of rounds required grew proportionally. As can be observed from Figure 2, a set of eligible attacks was explored throughout the rounds; these can be used at any point, depending on the resources of the attacker.

[Figure 2: Trust gains and cumulative rewards of the HOO strategy in each round, for a varying number of trustees ((a) n = 7, (b) n = 14, (c) n = 20) and ρ = 20. Error bars show the standard error of the mean trust gain.]

==5 Discussion==

In our investigation, we considered a trust problem in which an observer (truster) makes inferences about the risk of interacting with service providers (trustees). The orchestrated attack rested on the following assumptions:

Total Visibility: We assume that the attacker can observe the whole environment, but has the ability to modify only a subset of the information in the environment. Many trust and reputation systems do, in practice, publish such historical knowledge. In addition, the trust model that we employ requires this assumption to make assessments. Therefore, for these empirical experiments, our environment was fully observable. This may not be the case in some trust and reputation systems.
Static Environment: As in many trust systems, the dimension of time in interactions is meaningful. This dimension can be exploited by on-off attacks, in which agents delay interactions while profiting from malicious behaviour. Attacks in which, for instance, service providers leave the system after a malicious period and create a new identity to whitewash a previous poor reputation [WMLZ14] are not considered here. One direction for future research is to investigate these types of attack and to devise approaches that detect them and strengthen models of trust.

Knowledge about the Trust Model: We assume either that the attacker already knows the trust model employed by the system, or that the attacker creates a virtual equivalent of the scenario in the targeted trust system. This is a strong assumption, and we would like to investigate further the case where such information is not available to the attacker. From the attacker's perspective, the question of how to predict the trust model used by the trust system — and, from the perspective of the trust system's moderators, the question of how to devise a trust system that hides its predictability — is intriguing and on our future research agenda.

Heuristics to Reduce the Attack Space: Heuristics can be employed to reduce the attack space, thus yielding higher performance for our attack model. For instance, knowing that the targeted system uses the HABIT model, an attacker may incorporate a similarity metric and create a smaller attack space, in which convergence can be achieved with fewer rounds. This is a realistic addition to our attack model that we would like to explore.

It is important to note that the attack strategy we devised is compatible with other computational trust models. Other areas for future research are: investigating other trust models and trust-aware decision processes (e.g.
such as [GNTT17, SRR15]), verifying the effectiveness of our attack strategy under more realistic assumptions, and creating defensive mechanisms against such sophisticated attacks.

==6 Conclusion==

In this paper, we have introduced a new evaluation technique, with a generic attack model, for models of trust. We investigated the performance of our adaptation of the HOO strategy as an attack model for finding optimal attacks on a selected trust model. Our experiments showed that our attack model can exploit the strengths of the employed trust model. In addition, this is the first contribution to show the potential of devising a more realistic benchmark for models of trust. In future research, we will address the assumptions discussed above and, at the same time, improve the performance of our attack strategy.

==References==

[BL16] Amir Jalaly Bidgoly and Behrouz Tork Ladani. Modeling and quantitative verification of trust systems against malicious attackers. The Computer Journal, 59(7):1005–1027, 2016.

[BMSS11] Sébastien Bubeck, Rémi Munos, Gilles Stoltz, and Csaba Szepesvári. X-armed bandits. J. Mach. Learn. Res., 12:1655–1695, July 2011.

[BNS10] Chris Burnett, Timothy J. Norman, and Katia Sycara. Bootstrapping trust evaluations through stereotypes. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '10, pages 241–248, Richland, SC, 2010. International Foundation for Autonomous Agents and Multiagent Systems.

[FBZ13] Hui Fang, Yang Bao, and Jie Zhang. Misleading opinions provided by advisors: Dishonesty or subjectivity. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI '13, pages 1983–1989. AAAI Press, 2013.

[Fer73] Thomas S. Ferguson. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, pages 209–230, 1973.

[GNTT17] Taha D. Güneş, Timothy J. Norman, and Long Tran-Thanh. Budget limited trust-aware decision making.
In Gita Sukthankar and Juan A. Rodriguez-Aguilar, editors, Autonomous Agents and Multiagent Systems, pages 101–110, Cham, 2017. Springer International Publishing.

[HM04] Silvia Heubach and Toufik Mansour. Compositions of n with parts in a set. Congressus Numerantium, 168:127, 2004.

[HZNR09] Kevin Hoffman, David Zage, and Cristina Nita-Rotaru. A survey of attack and defense techniques for reputation systems. ACM Comput. Surv., 42(1):1:1–1:31, December 2009.

[JG09] Audun Jøsang and Jennifer Golbeck. Challenges for robust trust and reputation systems. In Proceedings of the 5th International Workshop on Security and Trust Management (STM 2009), Saint Malo, France, page 52. Citeseer, 2009.

[Jøs12] Audun Jøsang. Robustness of trust and reputation systems: Does it matter? In Theo Dimitrakos, Rajat Moona, Dhiren Patel, and D. Harrison McKnight, editors, Trust Management VI, pages 253–262, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.

[KC09] Reid Kerr and Robin Cohen. Smart cheaters do prosper: Defeating trust and reputation systems. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '09, pages 993–1000, Richland, SC, 2009. International Foundation for Autonomous Agents and Multiagent Systems.

[KSGM03] Sepandar D. Kamvar, Mario T. Schlosser, and Hector Garcia-Molina. The EigenTrust algorithm for reputation management in P2P networks. In Proceedings of the 12th International Conference on World Wide Web, WWW '03, pages 640–651, New York, NY, USA, 2003. ACM.

[MWLZ16] Tim Muller, Dongxia Wang, Yang Liu, and Jie Zhang. How to use information theory to mitigate unfair rating attacks. In Sheikh Mahbub Habib, Julita Vassileva, Sjouke Mauw, and Max Mühlhäuser, editors, Trust Management X, pages 17–32, Cham, 2016. Springer International Publishing.

[RD16] Yefeng Ruan and Arjan Durresi. A survey of trust management systems for online social communities - trust modeling, trust inference and attacks.
Know.-Based Syst., 106(C):150–163, August 2016.

[RPC06] Kevin Regan, Pascal Poupart, and Robin Cohen. Bayesian reputation modeling in e-marketplaces sensitive to subjectivity, deception and change. In Proceedings of the 21st National Conference on Artificial Intelligence, AAAI '06, pages 1206–1212. AAAI Press, 2006.

[SRR15] Sandip Sen, Anton Ridgway, and Michael Ripley. Adaptive budgeted bandit algorithms for trust development in a supply-chain. In Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, AAMAS '15, pages 137–144, Richland, SC, 2015. International Foundation for Autonomous Agents and Multiagent Systems.

[TLRJ12] W. T. Luke Teacy, Michael Luck, Alex Rogers, and Nicholas R. Jennings. An efficient and versatile approach to trust and reputation using hierarchical Bayesian modelling. Artif. Intell., 193:149–185, December 2012.

[WMLZ14] Dongxia Wang, Tim Muller, Yang Liu, and Jie Zhang. Towards robust and effective trust management for security: A survey. In Proceedings of the 2014 IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications, TRUSTCOM '14, pages 511–518, Washington, DC, USA, 2014. IEEE Computer Society.

[WMZL15] Dongxia Wang, Tim Muller, Jie Zhang, and Yang Liu. Quantifying robustness of trust systems against collusive unfair rating attacks using information theory. In Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI '15, pages 111–117. AAAI Press, 2015.