Trust Assessment Through Continuous Behaviour Recognition

Gurleen Kaur, University of Aberdeen, King's College, Aberdeen, UK (g.kaur@abdn.ac.uk)
Timothy J. Norman, University of Aberdeen, King's College, Aberdeen, UK (t.j.norman@abdn.ac.uk)
Katia Sycara, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA (katia@cs.cmu.edu)

Abstract

Computational models of trust typically assume that an assessment of the trustworthiness of an individual can be formed by learning from the outcomes of a sequence of atomic tasks, as well as from other evidence such as reports from third parties. Further, they assume that an agent's trustworthiness can be modelled by a single probability distribution. In this paper we explore alternative mechanisms that allow these assumptions to be relaxed. We propose a trust assessment model based on Markov Switching Regimes, where direct and third-party observations of an agent's behaviour follow interrelated Autoregressive processes. We argue that this offers a richer model of trustworthiness and a means to combine trust assessment with within-task monitoring.

Copyright © by the paper's authors. Copying permitted only for private and academic purposes. In: R. Cohen, R. Falcone and T. J. Norman (eds.): Proceedings of the 17th International Workshop on Trust in Agent Societies, Paris, France, 05-MAY-2014, published at http://ceur-ws.org

1 Introduction

In dynamic and open systems, diverse autonomous agents interact with their peers to achieve dependent individual or shared objectives. In such an environment, agents might behave in an untrustworthy manner, delivering unsatisfactory performance, whether this be in performing a task or delivering an information service. When choosing future partners to rely upon, therefore, agents must consider their likely future behaviour. The uncertainties underpinning these decisions are often captured through computational models of trust. Various forms of evidence have been posited as appropriate to inform trust assessments, including past observations of behaviour given contractual expectations [cZYC09, LD12], assessments from third parties, correlations among agents' behaviour [BNS10, LDRL09], and other contextual factors.

We take a probabilistic approach to modelling trust assessment, and there is an extensive literature on mechanisms of this kind. Such models build, either implicitly or explicitly, upon the Beta model [JI02, TPJL06] and its multivariate generalisation, the Dirichlet model [JH07, RPC06]. These models assume that the outcome of a delegated task/goal may be modelled either as a binary variable, representing success or failure, or as a multivariate outcome. The underlying Bayesian framework of these models assumes that an agent's behaviour can be approximated by a single, static probability distribution. Based on the outcomes of these interactions, the parameters of the prior distribution are updated over time, yielding the posterior distribution. The assumption that an agent's behaviour is static and represented by a single probability distribution throughout future interactions may, however, not be reasonable in many situations. A service provider may modify its behaviour over time according to changes occurring in its environment. A sensor system, for example, may provide highly trusted target tracking data unless it believes that the target it is asked to track is from a specific organisation.
Rather than looking at an interaction between a consumer and a service provider in a macroscopic way, and considering it as a single entity with an end result of failure or success, we take a more refined view. An interaction between two agents is considered as a sequence of events/sub-tasks, and we assume that the consumer may (partially) monitor progress periodically. After observing and evaluating progress a number of times, it is possible for the agent to build a picture of the service provider's behaviour, and hence predict, to some extent, the likely future progress or result (success/failure) of the delegated task. Behaviour detection helps to provide answers to questions such as whether a delegator/consumer wants to continue with the current interaction/task allocation, or what types of task to delegate to this provider in the future, and when. It could also capture any learning over time on the part of the agent that may change/improve its performance.

Consider a scenario where an agent enters into a contract with two agents capable of tracking objects of interest within an environment. The environment is an area of coastline around a port, and the sensor agents may be unmanned aerial vehicles (UAVs) or ground-based sensor systems. Suppose that two UAVs are delegated the task of identifying, tracking and reporting the location of unauthorised boats within the area. This surveillance task may continue for a substantial period of time, with target tracking data (observations) being provided by one or both UAVs during various sub-periods of the on-going task. Similar tasks, initiated by other agents, may be active at the same time. Given that this contract will continue over a period of time, there may be opportunities to observe the agents' behaviour, such as through correlations between observations reported by the two UAVs. This monitoring could show that the probability of one of the UAVs providing accurate reports has decreased. Options to recruit an additional or replacement sensor agent may then be considered. A traditional trust model can only learn about trust in UAVs identifying and tracking boats by counting the number of boats successfully identified and tracked, and the number of failures. It cannot tell us anything about the probability of continued success given observations of the agent's behaviour while reporting.

A commonly-employed class of models applied to the analysis of time series of this kind are those based upon Hidden Markov Models (HMMs) [SS13]. An important drawback to the application of HMMs in trust assessment, however, is the maximum likelihood approach used in the parameter estimation of standard HMMs. This approach is based on the 'equally likely' assumption: it assumes each observation in the training data set is of equal importance for a future prediction, no matter how large the training set. This is counter to our intuition that more recent observations should carry more weight of evidence for a trust assessment, although this approach might work well when training sets are relatively small or if the data series studied is not time-sensitive in this manner. Beta- or Dirichlet-based models of trust assessment do not suffer from this limitation, however: they tend to use the principle of exponential decay, which discounts past observations and places more weight on recent evidence.
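To make this discounting principle concrete, the following is a minimal Python sketch of a Beta-based trust estimate with an exponential decay (forgetting) factor. The function name and the value λ = 0.9 are illustrative assumptions, not taken from any of the cited models.

```python
# Minimal sketch: Beta reputation with exponential decay (forgetting factor).
# 'lam' (0 < lam <= 1) discounts older evidence; lam = 1 recovers the
# standard Beta model with the 'equally likely' weighting discussed above.

def beta_trust(outcomes, lam=0.9):
    """outcomes: list of 0/1 interaction results, oldest first.
    Returns the expected trustworthiness E[Beta(r + 1, s + 1)]."""
    r = s = 0.0
    for o in outcomes:
        r = lam * r + o          # decayed count of successes
        s = lam * s + (1 - o)    # decayed count of failures
    return (r + 1.0) / (r + s + 2.0)

print(beta_trust([1, 1, 1, 0, 0]))  # recent failures pull the estimate down
print(beta_trust([0, 0, 1, 1, 1]))  # recent successes pull the estimate up
```

The same sequence of outcomes in a different order yields a different estimate, which is precisely the recency sensitivity that the 'equally likely' assumption rules out.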
In this paper we explore the requirements of a trust assessment model where the relationships among time series variables influence an assessment, but also where these statistical relationships are subject to change over time. We discuss a model that integrates an autoregressive process with a Markov switching model [Ham94] to exploit evidence from continuous behavioural monitoring. The autoregressive Markov switching model relaxes the standard HMM conditional independence assumption by allowing observed variables that depend on the current state to also depend on past outputs/observations. In this way the autoregressive process weighs more recent observations more highly, thus explicitly modelling some of the dynamic behaviour we are interested in. Our conjecture is that this combination may lead to more accurate predictions. Before exploring an initial trust assessment model, we formalise the underlying mechanisms that we build upon: Autoregression and Markov switching models.

2 Autoregressive and Markov Switching Models

2.1 Autoregression and Vector Autoregression

Decision makers need to be guided by predictions about how the environment is likely to change. In forecasting the values of important environmental variables, we can assume that the historical behaviour of a variable over time contains information about its future development. This history of behaviour may be records of successes/failures to meet requests, more structured outcomes, or more fine-grained samples of performance as a request is being satisfied. Given the assumption that evidence of past performance can help in estimating future behaviour, we can employ various methods to model, analyse and forecast variables of interest. An autoregressive process [Ham94] is one such tool, which we refer to as AR(p), where p is the length of the window of past values that we use to predict the next value of some variable.

DEFINITION 1 If we know the parameters of an autoregressive process (\alpha_1, \ldots, \alpha_p), and if we have a sequence of p past observations of the variable of interest, y, the autoregressive equation of order p can be used to estimate the value of y at time t:

y_t = c + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \cdots + \alpha_p y_{t-p} + \varepsilon_t    (1)

where c is a constant, and \varepsilon_t is white noise with zero mean and finite variance.

In the real world, however, the value of one variable depends not only on its own past values but also on the past values of other variables. In order to model these interdependencies, we can extend this univariate autoregressive process to model a set of time series variables simultaneously. The vector autoregression model, VAR(p) [Ham94], is an extension of the univariate autoregressive process, AR(p), to model dynamic multivariate time series data, and hence capture the linear interdependencies among multiple time series variables. The only prior knowledge that we require to employ this approach is that the variables can affect each other over time.

DEFINITION 2 A vector autoregressive process of order p, VAR(p), can be written as

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t    (2)

where y_t = (y_{1t}, \ldots, y_{Kt})' is a K \times 1 vector of time series variables, c = (c_1, \ldots, c_K)' is a K \times 1 vector of constants (intercepts), each A_i is a time-invariant K \times K coefficient matrix,

A_i = \begin{pmatrix} a_{11,i} & \cdots & a_{1K,i} \\ \vdots & \ddots & \vdots \\ a_{K1,i} & \cdots & a_{KK,i} \end{pmatrix}

and u_t = (u_{1t}, \ldots, u_{Kt})' is a K \times 1 vector of error terms satisfying:

• E(u_t) = 0: every error term has mean zero.
• E(u_t u_t') = \Sigma: the contemporaneous variance-covariance matrix of the error terms is \Sigma (a K \times K positive-semidefinite matrix).

• E(u_t u_s') = 0 for any t \neq s: there is no correlation across time; in particular, no serial correlation in individual error terms.

Rather than being interested in a single variable, y, here each y_t represents a vector of the variables that we assume to depend on each other. Our coefficient matrices, A_1, \ldots, A_p, model the extent to which each variable influences the others over the window of length p. We can now apply this kind of model to explore domains in which there are multiple, interdependent variables. If, for example, we anticipate that variables y_1 and y_2 are interrelated, we can use domain data to learn the vector c (intercepts) and the matrices A_1, \ldots, A_p (coefficients). We can then use this model to anticipate how variables y_1 and y_2 change together. For example, when we try to estimate the trustworthiness of a target agent, we generally use two main sources of evidence: direct experiences and third-party reputational reports. Suppose that y_1 represents our direct observation of an agent's behaviour and y_2 is some aggregation of third-party reports. Using this model, we can exploit the interdependencies between these two variables to predict the future values of y_1, as in the sketch below.
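To indicate how such a model might be estimated in practice, the following sketch simulates two coupled series and fits a VAR(1). It assumes the statsmodels Python library; the simulated coefficients and noise scales are arbitrary, illustrative values.

```python
# Sketch: fitting a VAR(p) to two interrelated series, assuming statsmodels;
# y[:, 0] plays the role of direct observations (y1), y[:, 1] of aggregated
# third-party reports (y2).
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
T = 500
y = np.zeros((T, 2))
for t in range(1, T):                      # simulate two coupled AR(1) series
    y[t, 0] = 0.5 * y[t-1, 0] + 0.3 * y[t-1, 1] + rng.normal(scale=0.1)
    y[t, 1] = 0.2 * y[t-1, 0] + 0.6 * y[t-1, 1] + rng.normal(scale=0.1)

results = VAR(y).fit(1)                    # estimate c and A1 of a VAR(1)
print(results.coefs)                       # A1: the learned interdependencies
print(results.forecast(y[-1:], steps=3))   # predict y1 and y2 three steps ahead
```

The recovered coefficient matrix A_1 approximates the simulated coupling, and the forecast uses both series jointly, which is exactly the property we exploit for trust assessment.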
2.2 Switching regime models

In the vector autoregression model, VAR(p), it is assumed that the parameters of the model, capturing the interrelationships among the variables of interest, are fixed; i.e. the vector c (intercepts) and the matrices A_1, \ldots, A_p (coefficients) do not vary. There may, however, be periods in which the interrelationships among variables change significantly. Switching regime models are used to capture these changing dynamics of the observed variables of interest. For example, the dynamic behaviour of an agent may fluctuate between trustworthy and untrustworthy states, depending on its environment. The agent in question may be trustworthy when it has a medium or low load due to other commitments, but untrustworthy when it has a high load. These unobserved states (low, medium or high load) may be reflected in the patterns of behaviour that are observed either by direct experience or in reputational reports. Using these observations, and clustering techniques, the fluctuations in behaviour can be modelled and the underlying unobserved state/regime can be detected with some level of confidence. The parameters of the observed series are time-varying (they take different values in each of a predetermined number of regimes or states of the environment), and so fitting a linear model for each regime may be used to approximate the non-linear data. The regime at any point in time is an unobserved variable, but the stochastic process that determines the unobserved regime variable is known. In this work we consider a vector autoregression time series model with changes in regime, where the unobserved regime variable is generated by a discrete-state homogeneous ergodic Markov chain.

Markov Switching Vector Autoregressive Model

DEFINITION 3 An M-state Markov switching, p-lag vector autoregressive process, MS(M)-VAR(p) [Kro00], is given by

y_t = \nu(s_t) + A_1(s_t) y_{t-1} + \cdots + A_p(s_t) y_{t-p} + u_t    (3)

where:

• y_t is a K-dimensional time series vector, i.e. y_t = (y_{1t}, \ldots, y_{Kt})'.

• M is a finite number of predetermined, feasible regimes or states.

• The unobserved regime variable at time t is s_t, and s_t \in \{1, 2, \ldots, M\} follows a discrete, M-state homogeneous ergodic Markov chain.

• The K-dimensional intercept vector, \nu(s_t), the autoregressive parameter matrices, A_1(s_t), \ldots, A_p(s_t), and the variance-covariance matrix, \Sigma(s_t), vary according to the regime (state) of the environment, which is controlled by the unobserved regime variable s_t at time t.

• u_t \sim NID(0, \Sigma(s_t)): the error terms u_t are independent and normally distributed with variance-covariance matrix \Sigma(s_t), which also depends on the unobserved regime variable s_t.

The two main components of the Markov switching vector autoregressive model are, therefore:

1. A Markov chain as the regime generating process for the unobserved state s_t.

2. A Gaussian vector autoregression as the data generating process of the observed variable y_t, conditional on the unobserved regime s_t.

The parameters of the VAR(p) will, therefore, be time-varying, but the process is time-invariant conditional on an unobserved state s_t.

Regime Generation Process

The unobserved regime s_t in a Markov switching model is assumed to be generated by an ergodic Markov chain with a finite number of predetermined feasible states, say M, s_t \in \{1, 2, \ldots, M\}, which is defined by the transition probabilities

p_{ij} = Pr(s_{t+1} = j \mid s_t = i), \quad \sum_{j=1}^{M} p_{ij} = 1, \quad \forall i, j \in \{1, 2, \ldots, M\}    (4)

We collect all the transition probabilities between the states in the transition matrix, P:

P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1M} \\ p_{21} & p_{22} & \cdots & p_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ p_{M1} & p_{M2} & \cdots & p_{MM} \end{pmatrix}

Using this law for the regime generating process, the evolution of the unobserved regime can be inferred from the observed time series data using clustering techniques. Let \xi_t be the vector representation of the unobserved regime variable s_t \in \{1, 2, \ldots, M\} at time t. If s_t = j, then the unobserved regime vector \xi_t is the j-th column of an M \times M identity matrix. The M-dimensional vector \xi_t can also be written as \xi_t = (I(s_t = 1), \ldots, I(s_t = M))', where I is the indicator function.

Data Generating Process

The time series process of the observed variable y_t at time t is governed by the underlying hidden regime of the environment, \xi_t. Therefore, for a given regime \xi_t and previous values of the observed variables up to time t-1, Y_{t-1} = (y_{t-1}', y_{t-2}', \ldots, y_1', y_0', y_{-1}', \ldots, y_{1-p}')', the conditional probability density function of y_t is given by p(y_t \mid \xi_t, Y_{t-1}). In the definition of the MS(M)-VAR(p) process we assumed that, for each regime s_t at time t, the error terms u_t are normally distributed with mean 0 and a variance that depends on the regime. This implies that the conditional probability density function of y_t, given an unobserved regime \xi_t, will also be normally distributed. We collect all these Gaussian conditional densities of y_t in an M-dimensional vector, \eta_t:

\eta_t = p(y_t \mid \xi_t, Y_{t-1}) = (p(y_t \mid \xi_t = \iota_1, Y_{t-1}), p(y_t \mid \xi_t = \iota_2, Y_{t-1}), \ldots, p(y_t \mid \xi_t = \iota_M, Y_{t-1}))'

A short simulation of this regime generating and data generating process is sketched below.
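The following is a minimal simulation sketch of the two components above for an MS(2)-VAR(1) process. All parameter values (transition matrix, intercepts, coefficient matrices, noise scales) are illustrative assumptions, not estimates from data.

```python
# Sketch: simulating the two components of an MS(2)-VAR(1) data generating
# process with illustrative parameters.
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.95, 0.05],    # transition matrix: row i gives Pr(s_{t+1} | s_t = i)
              [0.10, 0.90]])
nu = [np.array([1.0, 1.0]), np.array([-1.0, -0.5])]   # intercept nu(s) per regime
A1 = [0.5 * np.eye(2), 0.2 * np.eye(2)]               # AR matrix A1(s) per regime
sigma = [0.1, 0.3]                                    # noise scale per regime

T = 300
y = np.zeros((T, 2))
states = np.zeros(T, dtype=int)
s = 0
for t in range(1, T):
    s = rng.choice(2, p=P[s])          # 1. Markov chain generates the regime s_t
    states[t] = s
    y[t] = nu[s] + A1[s] @ y[t-1] + rng.normal(scale=sigma[s], size=2)
                                       # 2. regime-conditional Gaussian VAR step
print(np.bincount(states))             # time spent in each regime
```

In estimation the problem runs in the opposite direction: only y is observed, and the regime path in `states` must be inferred, which is the task of the EM procedure described next.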
2.2.1 Parameter Estimation

The parameters of the MS(M)-VAR(p) are estimated using the Expectation Maximisation (EM) algorithm introduced by Dempster et al. [DLR77]. This is an iterative technique used to obtain maximum likelihood estimates of a model's parameters where the observed time series data depend on some unobserved or hidden variable. This two-step algorithm involves an expectation step, in which the optimal inference of the unobserved regime sequence is determined, and a maximisation step, in which the parameters of the model are updated using the maximum likelihood approach.

Expectation Step (E step)

Suppose the full observation data up to time T are known, and let \lambda be the parameter vector (to be estimated). During each iteration j, the unobserved states \xi_t are estimated by their smoothed probabilities, \hat{\xi}_{t|T} = Pr(\xi_t \mid Y_T, \lambda^{(j-1)}). These conditional probabilities are calculated using a forward recursive filtering and a backward recursive smoothing algorithm (see below). The filtered probability is the conditional probability of the hidden regime \xi_t given the observed sample data Y_t = (y_t', y_{t-1}', \ldots, y_{1-p}')' up to time t, and the model parameters: \hat{\xi}_{t|t} = Pr(\xi_t \mid Y_t) [Ham94].

\hat{\xi}_{t|t} = \frac{\eta_t \odot \hat{\xi}_{t|t-1}}{\mathbf{1}_M' (\eta_t \odot \hat{\xi}_{t|t-1})}    (5)

The forward recursive filter can be used to infer the hidden regime for time t' \geq t given the observed data set up to time t. The optimal m-period forecast of \xi_{t+m} is given by \hat{\xi}_{t+m|t} = (P')^m \hat{\xi}_{t|t}, where P is the transition matrix. Similarly, the smoothed probability is the conditional probability of the hidden regime \xi_t given the observed sample data Y_T = (y_T', y_{T-1}', \ldots, y_{1-p}')' up to time T, and the model parameters. Smoothing is, therefore, a backward recursive process that infers unobserved states by including the sample information previously neglected in filtering [Kro00]:

\hat{\xi}_{t|T} = Pr(\xi_t \mid Y_T) = \left( P \left( \hat{\xi}_{t+1|T} \oslash \hat{\xi}_{t+1|t} \right) \right) \odot \hat{\xi}_{t|t}    (6)

where \oslash and \odot denote element-wise division and multiplication respectively.

Maximisation Step (M step)

In the E step, the parameter vector, \lambda, was taken to be fixed and known. Within the M step we compute the maximum likelihood estimate of the model parameters. The parameter vector \lambda contains the VAR parameters (i.e. intercept, autoregressive matrices and error variance) and the initial and transition probabilities of the underlying hidden Markov chain. The log likelihood function is given by L(\lambda \mid Y, \xi) := p(Y_T \mid \lambda, \xi). To maximise the value of this function, the latent/hidden variable is substituted by its expected value, \hat{\xi}_{t|T}. This means that the conditional regime probability Pr(\xi_t \mid Y_T, \lambda) is replaced by the smoothed probabilities calculated in the previous expectation step, thus eliminating non-linearities. The parameters of this function are derived by solving the first-order conditions of a constrained log likelihood function (see Krolzig [Kro00] for a detailed analytical solution).

2.2.2 Forecasting

Despite being a non-linear model, an attractive feature of the Markov switching vector autoregression is its simplicity of forecasting. To obtain the optimal h-step forecast, the mean squared prediction error (MSPE) criterion may be used (i.e. we minimise the squares of the forecast errors):

\hat{y}_{t+h|t} := \arg\min_{\tilde{y}} E\left[ (y_{t+h} - \tilde{y})^2 \mid Y_t \right]

Given the information Y_t up to time t, therefore, the optimal h-step forecast of the observed time series is given by the conditional mean:

\hat{y}_{t+h|t} = E[y_{t+h} \mid Y_t]

Since the data generating process is non-linear, the MSPE optimal forecast is not a linear predictor of the observed temporal data (see Krolzig [Kro00] for further details). A sketch of the filtering and regime forecasting steps is given below.
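The sketch below implements one step of the forward filter of Equation (5) and the m-period regime forecast \hat{\xi}_{t+m|t} = (P')^m \hat{\xi}_{t|t}. The transition matrix, prior, and density values are illustrative placeholders.

```python
# Sketch: one forward (Hamilton) filter step, Eq. (5), and the m-period
# regime forecast from Section 2.2.1.
import numpy as np

def filter_step(xi_pred, eta):
    """Eq. (5): update regime probabilities xi_{t|t} from the one-step
    prediction xi_{t|t-1} and the regime-conditional densities eta_t."""
    num = eta * xi_pred            # element-wise product, eta_t (.) xi_{t|t-1}
    return num / num.sum()         # normalise by 1'(eta_t (.) xi_{t|t-1})

def regime_forecast(P, xi_filt, m):
    """Optimal m-period forecast: xi_{t+m|t} = (P')^m xi_{t|t}."""
    return np.linalg.matrix_power(P.T, m) @ xi_filt

P = np.array([[0.95, 0.05], [0.10, 0.90]])
xi_pred = P.T @ np.array([0.5, 0.5])     # prediction step: xi_{t|t-1} = P' xi_{t-1|t-1}
xi_filt = filter_step(xi_pred, np.array([0.8, 0.1]))  # illustrative densities eta_t
print(xi_filt)                           # filtered regime probabilities
print(regime_forecast(P, xi_filt, 5))    # regime probabilities 5 steps ahead
```

The backward smoother of Equation (6) would then run over the stored filtered and predicted probabilities; it is omitted here for brevity.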
3 A simple Markov switching trust assessment model

We may now demonstrate how this general model, MS(M)-VAR(p), may be applied to trust assessment. Consider a system with n agents, where A = \{a_1, a_2, \ldots, a_n\} is the set of all agents. Agents interact with one another and work together to accomplish various tasks. Direct experiences from these transactions can aid both parties, say a_i and a_j, in forming opinions about the trustworthiness of each other. We consider these experiences to be both the results of monitoring actions during a transaction and the transaction outcomes. The rating that an agent gives to an element of an interaction (assessed through monitoring) may belong to a discrete set of values or be drawn from a continuous range, such as [0, 1]. Over time, interactions between agents produce a history of direct evaluations and ratings, thus forming a time series of observations of that variable. We may discretise the series of observations made by agent a_i of agent a_j such that for each time period [t-1, t], a_i may make a direct trust evaluation of a_j. A simple method would be to compute the average of the outcomes of observations made during that time period, but other aggregation methods are possible. We refer to the direct observation made by agent a_i of agent a_j at time t as Y_{ij}(t).

Evaluating the trustworthiness of an agent is a time-varying process; more recent behaviour should have greater influence on a trust assessment. An autoregressive process offers a means to model this. The output variable of this process at time t depends on its own previous values, thus capturing any positive or negative effect of the observed data. The dynamic, unobserved behaviour of a service provider may also change with time; changes that may be manifest in observations acquired through monitoring. If positive observations are made then the agent is more likely to be behaving in a trustworthy manner, and vice versa. These observations may, however, change dramatically away from the long-run mean. This volatility could indicate a change in the regime of the observed temporal process. The series is, therefore, assumed to be dependent on an unobserved stochastic process. This unobserved process models the actual state of the agent's behaviour, and the evaluations of trustworthiness at time t are the observed data determined by this hidden variable at that time. If two agents have interacted with each other within the society, then they will have a temporal data set reflecting their observations of the encounter. Based on these direct experiences, an agent may build a model to predict likely future behaviour.

Other forms of evidence may be exploited during trust assessment. Opinions, derived from behavioural observations, about the target of a trust assessment may be acquired from third parties. Taking into consideration such indirect evaluations of a service provider's behaviour may improve the accuracy of a trust assessment. Although useful evidence, the use of third-party reports is not without its risks. It is possible that a recommender can provide misleading or biased feedback about other agents in the society, unintentionally or otherwise. These reputation reports may undervalue the true behaviour of a peer or represent an unjustifiably positive opinion. It is notoriously difficult to detect misleading recommendation reports, but failing to consider all feedback from peers has its own risks. Ideally we would want to weigh the recommendations received from different agents to mitigate the influence of biases and misleading reports, if not eliminate their effects entirely. We now discuss a simple means to take such evidence into account within an MS(M)-VAR(p) process.
At time t, if agent a_i wants to assess the behaviour of another agent a_j, then it may seek opinions about a_j from all the other agents in the system. An aggregation of this feedback may then be integrated with its own view of the target agent, a_j. Let R_{ij}(t) denote agent a_i's estimate of the community opinion of agent a_j at time t, obtained by aggregating the recommendation reports from its peers in the environment. A simple method of obtaining R_{ij}(t) is to compute the weighted average of all the reputation reports received:

R_{ij}(t) = \frac{1}{\sum_{a_x \in A,\ a_x \neq a_i, a_j} W_{ix}(t)} \sum_{a_x \in A,\ a_x \neq a_i, a_j} W_{ix}(t)\, Y_{xj}(t)    (7)

Here, W_{ix}(t) is the weight given by agent a_i to agent a_x's reputation report and Y_{xj}(t) is the behavioural observation of agent a_j reported by a_x at time t (a sketch of this computation is given at the end of this section). These aggregated reports may then be exploited as a second variable within the MS(M)-VAR(p) process upon which the predicted trustworthiness of the target agent depends.

There are, of course, other means to aggregate third-party opinions. The use of stereotypes [BNS10, LDRL09], for example, may obviate the need to maintain weights for the opinions of other agents. Stereotypes can be used to weight reports based on the group to which the source belongs. Alternatively, we could model each stereotypical group as a variable within the MS(M)-VAR(p) process, each of which influences the variable representing the trustworthiness of the target agent.

The MS(M)-VAR(p) process relies on a series of data points within the window of length p to forecast the variable of interest, which, in our case, is the trustworthiness of the target agent. In trust assessment this is a challenge to the application of the model, because there may be significant gaps in direct interaction between agents. Although it is less likely that there are no third-party opinions to exploit, the observed time series data from direct interactions will have missing values. To deal with this challenge, various interpolation techniques may be employed, such as regression or splines [HK10]. Suppose an interaction between two agents in the society ends at time t and starts again at time t + 4; we are missing observations for 3 time steps. We first estimate the model parameters using the data set up to time t, and then forecast the missing observations. Missing data are then replaced with these predicted values. We can then use this full time series of direct interactions, along with the reputation reports, for trust assessment.
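The following is a minimal sketch of the weighted average of Equation (7). The agent names, report values and weights are hypothetical.

```python
# Sketch of Eq. (7): agent a_i's aggregate R_ij(t) of peer reports about a_j,
# weighted by a_i's weight W_ix(t) for each reporting peer a_x.
def aggregate_reports(reports, weights):
    """reports: {a_x: Y_xj(t)}  behaviour of a_j as reported by peer a_x
    weights: {a_x: W_ix(t)}  a_i's weight for a_x's reports
    Assumes a_i and a_j are already excluded from 'reports'."""
    total_w = sum(weights[x] for x in reports)
    return sum(weights[x] * reports[x] for x in reports) / total_w

reports = {"a3": 0.9, "a4": 0.4, "a5": 0.7}
weights = {"a3": 1.0, "a4": 0.2, "a5": 0.8}   # a4 is weakly trusted as a source
print(aggregate_reports(reports, weights))    # R_ij(t), pulled towards a3 and a5
```

Because a_4's low weight shrinks its contribution, a single biased reporter has limited influence on R_{ij}(t), which is precisely the mitigation discussed above.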
4 Illustration

To demonstrate the approach we propose, we simulated a multi-agent system in which reputation reports are exchanged, interactions occur (over periods of time), and agents assess the trustworthiness of potential partners. We simulate ten agents with different behaviour profiles. We investigated the process whereby one agent, a_1, attempts to evaluate the trustworthiness of another, a_2. We simulated relatively long-term interactions happening frequently between agents, to minimise the need for data imputation/interpolation. These transactions between a_1 and a_2 produce a time series of direct observations. In parallel, a_1 interacts with other agents in the society, producing other observations over time of their behaviour. At each time step, agent a_1 will query other agents in the society for observations regarding a_2. Agent a_1 aggregates these third-party reports using the simple weighted average method described above. This provides a time series of aggregated reputation reports for a_2.

Using these two time series, we try to evaluate the switches in the unobserved state of agent a_2's behaviour. For the sake of simplicity, we assume there are two possible hidden states: trustworthy and untrustworthy. The purpose is, therefore, to predict this unobserved state. To select the vector autoregression lag length, we use the AIC criterion. The table below shows that, in this simple illustration, VAR(1) had the smallest AIC value. Given that we consider two variables (direct observations, DO, and aggregated third-party reports, RR), we use an MS(2)-VAR(1) process; i.e. a two-state Markov switching vector autoregressive model with a lag of 1 over these two variables.

AIC and BIC values for VAR

VAR Lag    AIC Value    BIC Value
1          -2.314199    -2.263234
2          -2.298469    -2.213529
3          -2.289073    -2.170156
4          -2.282472    -2.129579
5          -2.272324    -2.085454

We collect the values acquired for our variables of interest (DO and RR) in a vector y_t = (DO_t, RR_t)'. The vector autoregression with lag 1 creates two VAR equations, in which the temporal variable DO changes and its future development depends on its own past outcomes and on the past outcomes of the RR variable. Thus a marginal change in the RR variable will affect DO, hence affecting the unobserved/hidden state of the agent's level of trust.

[Figure 1: Regime Probabilities. Three panels over 500 time steps: direct interaction outcomes, aggregated reputation reports, and the probabilities of the two regimes.]

In Figure 1, the first time series represents the direct observations of agent a_1 regarding a_2. The second series is the time series of aggregated reputational reports about a_2 received by agent a_1 from the other agents in the society. These reports can be biased, and so we weight each report according to agent a_1's level of trust in the report provider. The third panel shows the switches between the unobserved states of the agent's trust and how these unobserved regimes evolve over time based on the observed data. The black curve represents the probability that service provider a_2 is in the trustworthy state, and the green curve the probability that a_2 is in the untrustworthy state. It can be seen from these graphs that there is a high correlation between direct interaction outcomes (the DO variable) and the hidden states: if the direct interaction ratings are high, then the probability of being in the trustworthy state is higher. At the same time, the second series (aggregated reputational reports) also has the ability to pull down or push up the probability of being in a particular unobserved state. A sketch of how such regime probabilities might be recovered with off-the-shelf tools is given below.
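To indicate how regime probabilities of this kind might be recovered in practice, the sketch below fits a two-regime Markov switching autoregression to synthetic DO and RR series. It assumes the statsmodels Python library, which to our knowledge implements only univariate Markov switching autoregressions; we therefore model DO with RR as an exogenous regressor rather than as a full MS(2)-VAR(1), so this is an approximation of the model used in our illustration, and all data here are synthetic.

```python
# Sketch: recovering regime probabilities as in Figure 1, assuming statsmodels.
# A univariate Markov switching AR on DO, with RR as an exogenous regressor,
# stands in for the full MS(2)-VAR(1); estimation and smoothing are internal.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
T = 500
states = (np.sin(np.arange(T) / 40.0) > 0).astype(int)   # synthetic regime path
do = np.where(states == 1, 6.0, 2.0) + rng.normal(scale=0.8, size=T)
rr = 0.7 * do + rng.normal(scale=0.8, size=T)            # correlated reports

mod = sm.tsa.MarkovAutoregression(do, k_regimes=2, order=1, exog=rr[:, None])
res = mod.fit()
# Smoothed regime probabilities; the ordering of the regime columns is
# arbitrary, so which column is the 'trustworthy' state must be checked
# against the fitted regime means.
print(res.smoothed_marginal_probabilities[:10])
```

The smoothed probabilities play the role of the third panel of Figure 1, switching towards one regime when DO and RR are jointly high and towards the other when they are jointly low.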
5 Discussion

The most prominent existing research on probabilistic models of trust is grounded upon the Beta reputation model or its multivariate extensions (for interactions with multiple outcomes). These use either Beta or Dirichlet distributions to represent the probability distribution over interaction outcomes [JI02, JH07, MMH02]. These models have also been extended to deal with deception and unfair ratings [WJI05, TPJL06, RPC06]. Generally, these models assume that agents' behaviour is static; i.e. represented by a fixed probability distribution. The limitations of this assumption are mitigated by treating recent interaction outcomes as more representative of the likely future behaviour of an agent; e.g. the use of an exponential decay or forgetting factor in Jøsang & Ismail [JI02].

Recently, a number of trust assessment models based on Hidden Markov Models (HMMs) have been proposed. El Salamouny & Sassone [SS13], for example, propose an HMM-based model for evaluating trust that exploits direct experiences and reputational reports from an agent's peers. Moe et al. [MTK08] propose a trust model that combines an HMM with reinforcement learning: after learning about the environment from its RL module, the parameters of the HMM module are re-estimated to detect an agent's behaviour more reliably. Sassone et al. [SKN07] propose a computational trust model based on HMMs, comparing this with existing probabilistic computational trust models and demonstrating that the non-HMM-based models were unable to deal with dynamic behaviour. A similar study by Moe et al. [MHK09] provides results of a comparison of the effectiveness of Beta models with decay factors and an HMM-based trust approach, the conclusion being that the latter was more realistic and effective in dynamic environments. These HMM-based trust models focus on the interaction history without considering the context of the interaction. Liu & Datta [LD12], however, propose an HMM-based context-aware trust model to predict an agent's trustworthiness in dynamic environments. Empirical assessment of this model, which uses multiple discriminant analysis to select appropriate features of the context, demonstrates that it outperforms standard HMM-based models in detecting dynamic behaviour patterns.

We have presented an early exploration of the use of autoregression and Markov switching methods in trust assessment. There are a number of simplifications in how we have applied these techniques to the trust assessment problem, such as in the aggregation of third-party observations. Existing models propose clever fusion techniques to aggregate these reports, such as those used in TRAVOS [TPJL06], or evaluate reports separately and aggregate the results [SS13]. Here, although we use a simple weighted average approach to combine reports, the use of VAR enables us to model this time series simultaneously with that from direct experience, and to model the effects that these temporal variables have on each other. In addition to exploring refinements of our model, we need to thoroughly investigate the accuracy of forecasting the future dynamic behaviour of a target agent.

6 Conclusion

We have proposed a novel approach to the development of computational models of trust grounded upon a Markov Switching Regime model (equivalent to an HMM) where the observed data follow an Autoregressive process. The means by which we generate the data that drives the Markov switching model relaxes the assumption used in all HMM-based models of trust: that each observation is of equal importance to the assessment of trust. By considering that the observed data follow an autoregressive process, we place more weight on more recent evidence. The use of a Markov switching model enables us to model the non-linear behaviour of a target agent by constructing a vector of linear models of behaviour, given (ideally) distinct behavioural states or regimes, along with a model of how the agent's behaviour switches between these states/regimes. This means we are not relying on the assumption made by most non-HMM-based models of trust: that the behaviour of each agent can be modelled by a single, static probability distribution. Further, we do not need to treat interactions (or transactions) between agents as atomic, and use final outcomes as evidence for future assessments.
We can exploit the techniques we propose to monitor the progress of longer-term delegated tasks, informing interim decisions regarding the dependency between agents.

Acknowledgements

This research was sponsored by Selex ES.

References

[BNS10] C. Burnett, T. J. Norman, and K. Sycara. Bootstrapping trust evaluations through stereotypes. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, pages 241–248, 2010.

[cZYC09] M. Şensoy, J. Zhang, P. Yolum, and R. Cohen. POYRAZ: Context-aware service selection under deception. Computational Intelligence, 25(4):335–366, 2009.

[DLR77] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.

[Ham94] J. D. Hamilton. Time Series Analysis. Princeton University Press, 1994.

[HK10] J. Honaker and G. King. What to do about missing values in time series cross-section data. American Journal of Political Science, 54:561–581, 2010.

[JH07] A. Jøsang and J. Haller. Dirichlet reputation systems. In Proceedings of the International Conference on Availability, Reliability and Security, pages 112–119, 2007.

[JI02] A. Jøsang and R. Ismail. The beta reputation system. In Proceedings of the 15th Bled Electronic Commerce Conference, 2002.

[Kro00] H.-M. Krolzig. Predicting Markov-switching vector autoregressive processes. Economics Series Working Papers 2000-W31, University of Oxford, Department of Economics, 2000.

[LD12] X. Liu and A. Datta. Modeling context aware dynamic trust using hidden Markov model. In Proceedings of the 26th AAAI Conference on Artificial Intelligence, pages 1938–1944, 2012.

[LDRL09] X. Liu, A. Datta, K. Rzadca, and E.-P. Lim. StereoTrust: A group based personalized trust model. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 7–16, 2009.

[MHK09] M. E. G. Moe, B. E. Helvik, and S. J. Knapskog. Comparison of the Beta and the hidden Markov models of trust in dynamic environments. In E. Ferrari, N. Li, E. Bertino, and Y. Karabulut, editors, Trust Management III, volume 300 of IFIP Advances in Information and Communication Technology, pages 283–297. Springer, 2009.

[MMH02] L. Mui, M. Mohtashemi, and A. Halberstadt. A computational model of trust and reputation. In Proceedings of the 35th Annual Hawaii International Conference on System Sciences, pages 2431–2439, 2002.

[MTK08] M. E. G. Moe, M. Tavakolifard, and S. J. Knapskog. Learning trust in dynamic multiagent environments using HMMs. In Proceedings of the 13th Nordic Workshop on Secure IT Systems, 2008.

[RPC06] K. Regan, P. Poupart, and R. Cohen. Bayesian reputation modeling in e-marketplaces sensitive to subjectivity, deception and change. In Proceedings of the 21st National Conference on Artificial Intelligence, pages 1206–1212, 2006.

[SKN07] V. Sassone, K. Krukow, and M. Nielsen. Towards a formal framework for computational trust. In F. S. de Boer, M. M. Bonsangue, S. Graf, and W.-P. de Roever, editors, Formal Methods for Components and Objects, volume 4709 of Lecture Notes in Computer Science, pages 175–184. Springer, 2007.

[SS13] E. El Salamouny and V. Sassone. An HMM-based reputation model. In A. Awad, A. Hassanien, and K. Baba, editors, Advances in Security of Information and Communication Networks, volume 381 of Communications in Computer and Information Science, pages 111–121. Springer, 2013.
[TPJL06] W. T. L. Teacy, J. Patel, N. R. Jennings, and M. Luck. TRAVOS: Trust and reputation in the context of inaccurate information sources. Journal of Autonomous Agents and Multi-Agent Systems, 12:183–198, 2006.

[WJI05] A. Whitby, A. Jøsang, and J. Indulska. Filtering out unfair ratings in Bayesian reputation systems. Icfai Journal of Management Research, 4(2):48–64, 2005.