A Cluster-based Integrated Trust Establishment Model
                    for Intelligent Agents

                                                 Julian Templeton
                                  Electrical Engineering and Computer Science
                                          University of Ottawa, Canada
                                              jtemp005@uottawa.ca
                                                   Thomas Tran
                                  Electrical Engineering and Computer Science
                                          University of Ottawa, Canada
                                                ttran@uottawa.ca


                                                          Abstract

                         This paper presents a cluster-based version of the Integrated Trust
                         Establishment (ITE) model. Despite ITE’s strong performance in sim-
                         ulated environments, the trust establishment model’s robust method
                         for dynamically updating its improvement and disimprovement rate
                         hyperparameters can be improved to better handle trustors of vary-
                         ing behaviours. By modifying ITE’s dynamic hyperparameter update
                         process with a cluster-based approach, the model sees improved perfor-
                         mance by better meeting the needs of varied trustors in an environment.
                         This improvement is exhibited by comparing ITE with the newly pro-
                         posed Cluster-Based ITE (CBITE) in simulated tests. The effects of
                         this cluster-based approach of performing updates based on groupings
                         of trustors is seen to be more effective than performing the updates in-
                         dependently. Trustees using CBITE more accurately meet the desires
                         of varied sets of trustors than ITE.


1    Introduction
Intelligent agents are becoming increasingly robust and can be equipped with a variety of tools to help commu-
nicate with other agents and to determine the trustworthiness of these agents. Since the agents which populate a
Multi-Agent System (MAS) collaborate with other agents to accomplish tasks, the ability to gauge the trustwor-
thiness of other agents is imperative. This active research topic has driven the discovery of many different trust
models, some of which are presented within surveys such as [YSL+ 13]. The agents which reside within MASs
can be referred to as a trustor or as a trustee. A trustor consumes a service from trustees whereas a trustee
provides services to trustors. Agents can be both a trustor or trustee and utilize their calculated trust values of
other agents to understand which trustees can be trusted when selecting interaction partners.

Copyright   © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC
BY 4.0).
In: R. Falcone, J. Zhang, and D. Wang (eds.): Proceedings of the 22nd International Workshop on Trust in Agent Societies, London,
UK on May 3-7, 2021, published at http://ceur-ws.org


                                                                1
    Despite there being significant trust evaluation research, the concept of trust establishment that is presented
in [Sen13] has been less explored. Within [Sen13], Sen proposes the use of trust establishment alongside trust
evaluation within the trust management module which agents contain. The goal of trust establishment is to allow
a trustee to be able to learn how to improve and maintain their trust with trustors in an environment. This will
help trustees become viable options within a MAS and will improve the quality of agent interactions within the
MAS. Although relatively unexplored, there are a number of trust establishment models that have been proposed
to help trustees become more trustworthy in an environment. A recent, state-of-the-art trust establishment model
is the Integrated Trust Establishment (ITE) model [AT20]. This model attempts to actively balance the amount
of resources which a trustee spends during interactions with specific trustors, referred to as the Utility Gain (UG),
with the trustee’s trust in an environment. This is done by using robust equations with many hyperparameters
and by using concepts such as the engagement of a trustor. This approach differs to an earlier model which has
trustees classify the type of a trustor to determine how to calculate the UG that will be provided the trustor in
an e-commerce environment [TCLK14].
    ITE uses two dynamic hyperparameters, denoted by α and β, to represent the rates at which a trustee should
increase or decrease the UG that is provided to a trustor. These values are dynamically updated by using the
Rate of Change (RoC) of the direct feedback that is obtained from trustors for all transactions performed between
the trustors and the trustee at two different time steps. The RoC values are calculated at specific time steps
and provide insight into the general satisfaction of the trustors within an environment. Although this approach
has provided strong simulation results, using singular RoC values to perform the updates for all trustors will
inaccurately update a trustee’s behaviours towards specific trustors when there are varied trustor behaviours in
the environment. This occurs because the varied behaviours can disrupt the RoC that is calculated at any time
step. Even if the RoC is computed for individual trustors, rather than for all trustors, these trustors may be
temporarily shifting their behaviours or may be acting maliciously. By grouping trustors that display similar
behaviours at a given time step and updating the α and β values for each trustor in a group based only on
the behaviour changes observed within their group, the α and β variables will be more accurately updated. By
replacing ITE’s dynamic variable update process with this cluster-based approach, ITE’s performance will be
improved in varied environments. This cluster-based approach is applied to the Cluster-Based ITE (CBITE)
trust establishment model that is presented in this paper. A simulation environment will be defined and used to
compare ITE to CBITE in several simulated tests which will be presented and discussed. This will result in a
modified ITE that helps improve performance in varied environments.

2   Background Information
Prior to the presentation of CBITE, we will outline some background information to better understand ITE’s
design and the expectations for the environments and agents which use ITE. ITE uses a combination of ideas
from previous trust establishment models and expands upon them to improve its capabilities. Two of the core
trust establishment models which ITE expands upon are the Reinforcement Learning based Trust Establishment
(RLTE) model [AT15] and the Acting as a Trustee Using Implicit Feedback (ATeIF) model [AT17]. These
models showcase how a trustee’s behaviour can be more accurately updated for individual trustors by using a
combination of explicit and implicit trustor feedback, by calculating the retention of trustors, and by predicting
criteria weights for the multiple criteria of a task. Each of these concepts are used by ITE to improve its
robustness of helping with trust establishment. To help exhibit ITE’s dynamic variable updating mechanism at
a high-level and to display how it will compare to CBITE’s approach, Figure 1 presents a visual overview of
how the approaches differ. As previously mentioned, ITE uses dynamically updated α and β variables to help
determine the rates at which updates should be performed. Figure 1 showcases that ITE focuses on using a
single RoC value to update its α and β values while the newly proposed CBITE focuses on updating the α and
β values for similar trustors which have been clustered together. Performing the updates with a single RoC can
cause issues since that value may inaccurately adjust the trustee’s behaviour towards specific trustors.
   Each of the models described in this paper are decentralized. This means that they are contained within the
trustee itself, rather than central entities in the environment(s). Each model is also capable of handling tasks
with multiple criteria. A task in the environment, denoted by s, consists of one or more criterion values ci . For
a trust establishment model to help a trustee update their behaviour towards trustors for tasks with multiple
criteria, the model must be capable of understanding how to balance the importance of each criterion. This is
more complicated than understanding single criteria tasks but is important to help the model be more applicable
in domains such as e-commerce in which criteria such as the time of delivery and the quality of the item are


                                                         2
                                 ITE


               Previously                                                Calculate the new            Update the and
               Calculated                                                  overall RoC                     values
                  RoC

                                CBITE
                                                        Cluster 1                            RoC 1

                                                            ...                               ...     Update the and
                                 Cluster trustors into k             Calculate the RoC for
               Transaction                                                                           values for each trustor
                                        clusters         Cluster k       each cluster        RoC k
                  Data                                                                                   in each cluster


         Figure 1: A high-level overview of ITE and CBITE’s α and β variable updating mechanisms


represented by unique UG values.
   Both ITE and our cluster-based implementation CBITE are designed to be run in singular or distributed
MASs that are open and dynamic. This means that the MAS(s) allow agents to be self-interested, diverse, and
deceptive while also allowing agents to freely leave and join the environment [BNS11]. Thus, it is important
to test the effectiveness of these models with trustors which exhibit more varied behaviours. Although trustor
attacks do not occur in the simulations performed within this paper or ITE’s paper, it is also important to note
that a model that is designed for open MASs may also need to help agents handle trustor attacks [SS05]. The
environment used throughout this paper is also decentralized. This means that trustor communication, trust
evaluation, and trust establishment will be handled by the agents themselves, rather than a centralized entity.
   In the environment, intelligent agents will communicate with one another to collaborate for specific tasks.
A trustor will evaluate the trustworthiness of trustees with a trust evaluation model and select zero or more
trustees to interact with at each time step. After a trustee receives an interaction request for a task, the trustee
proposes the UG that will be provided to the trustor for that task. This amount is computed by the trust
establishment model to assist the trustee in providing accurate amounts to specific trustors. Although trustees
may have limits on the amount of UG that can be provided at each time step or on the total number of trustors
that can be interacted with at once, this paper assumes that trustees can serve any number of trustors at a given
time step and can always provide the necessary UG. Trustors receive the proposed UG and decide whether to
accept the proposed amount and perform the transaction with the trustee or to reject the UG that is provided.
Accepted interactions are then performed by trustees and the trustors receive the UG for each criterion in the
task. Finally, the trustor’s satisfaction with the transaction will either be provided to the trustee or will be
predicted by the trustee. The complete transactional data is then used by the trust establishment model to
adjust how the trustee behaves towards the specific trustor.
   The agent itself is expected to be capable enough of running the selected trust establishment model. One
requirement from ITE is that the agent can store the transactional data from transactions that have occurred
within the past H active time steps. The hyperparameters from ITE and CBITE allow the models to be tuned
for specific agent capabilities, but still require some memory and computational resources. Furthermore, CBITE
uses a cluster-based approach to updating ITE’s dynamic improvement and disimprovement rate variables and
hence requires the ability to utilize some Unsupervised Machine Learning algorithm. Any clustering algorithm
can be used; thus, it is important that the agent can run a clustering algorithm that will perform well when
grouping trustors based on a set of input values.


3   Cluster-Based Trust Establishment
With an understanding of the environment that is being used and the agents which reside within them, we will
now present the cluster-based approach to updating ITE’s dynamic variables for improved performance. Before
presenting this cluster-based process, ITE’s current architecture will be described. This will provide context to
the effects that the cluster-based approach has on the rest of the system and flesh out CBITE’s complete design.
Following the description of ITE’s architecture, we will present the cluster-based process and describe the issues
from ITE that it aims to correct.


                                                                     3
3.1   ITE’s Architecture
Since CBITE’s general design is the same as ITE’s, we will first describe and present the key components from
ITE. Since this paper proposes a new approach to updating the dynamic improvement and disimprovement
rate variables used by ITE, rather than proposing a completely unique model, this section will provide written
descriptions for certain ITE components rather than fully presenting each component of the architecture. Each
formula used within this subsection and the full details of each of ITE’s components can be found in Aref’s paper
on ITE [AT20].
   As mentioned in section 2 of this paper, ITE uses the concept of trustor retention to help more accurately
perform updates to trustee y’s behaviour towards trustor x. By computing the retention values of individual
trustors, a trustee can compute a value that is used to determine the likelihood of the trustor interacting with
the trustee for a specified task s at time step t. From this, a trustee can classify whether a trustor x is engaged
with the trustee, unengaged with the trustee, or neither engaged nor unengaged with the trustee. This is later
used to help determine the optimal methods of updating the trustee’s behaviour to improve or maintain trust
with that trustor.
   ITE and several previous trust establishment models, such as RLTE and ATeIF, predict how much a trustor
weighs each criterion of a task through predicted relative weight (rw) values. Each trustor represents their overall
expectations for a criterion of a specific task by assigning a weight to them. This weight is called a rw value,
which represents the trustor’s weight of the criterion divided by the trustor’s demand for the criterion. Since
these values represent a trustor’s expectations, ITE attempts to predict these rw values continuously such that
they can be used to provide the appropriate UG for each criterion. The function that is used for updating this
predicted rw value stored by trustee y for trustor x for a criterion of a task sci is displayed below.
                               
                               
                               
                                (1 + α) ∗ rwxy (sci , tra0 )    SATxy (s, tra0 ) < Ω and rwxy (sci , tra0 ) ≥ Φ
                                                          0
                                             y
                               (1 + β) ∗ rwx (sci , tra )       SATxy (s, tra0 ) ≥ θ and rwxy (sci , tra0 ) < Φ
                               
                               
                               
             rwxy (sci , req) = (1 + γ ∗ α) ∗ rwxy (sci , tra0 ) SATxy (s, tra0 ) < Ω and rwxy (sci , tra0 ) < Φ (1)
                               (1 + ζ ∗ β) ∗ rwxy (sci , tra0 ) SATxy (s, tra0 ) ≥ θ and rwxy (sci , tra0 ) ≥ Φ
                               
                               
                               
                               
                               (1 + λ ∗ α) ∗ rwy (s , tra0 ) else
                               
                                                  x ci

  where,

   rwxy (sci , req) is the predicted rw value assigned to trustor x by trustee y for a criterion of a task sci for the
    interaction req which occurs after the previous interaction tra0

   α and β are the improvement and disimprovement variables which are dynamically updated by the model

   γ, ζ, and λ are scaling variables that help more accurately update the predicted rw in specific scenarios

   Ω is the trustee’s engagement threshold and θ is the trustee’s un-engagement threshold. These are used
    when analyzing the direct feedback that the trustor has provided from the previous interaction (denoted by
    SATxy (s, tra0 ))

   Φ is the trustee’s threshold to determine whether a predicted rw is considered to be low

   Thus, using equation 1, ITE updates its predicted rw values for each criterion of a task for a specific trustor.
This equation is important since the values are used by the trustee to determine how many resources should be
spent to meet the expectations of the trustor. Furthermore, this equation uses the α and β variables to determine
how to perform these updates accurately. Since these variables will be dynamically updated by ITE to actively
manage the rates at which the predicted rw values are updated at, performing more accurate updates to these
variables will help ensure that the predicted rw values are updated more accurately to account for any behaviour
shifts from trustors.
   When deciding the amount of UG to provide to a trustor for a criterion of a task, the trustee computes
an improvement value to state how much should be given to that specific trustor for the specific criterion
when added to the minimum amount of UG that can be provided. This improvement value is denoted by
Improvementyx (sci , req), which is the improvement to be provided to x by y for sci during the interaction req.
This value is the weighted combination of an explicit improvement value and a implicit improvement value.
The explicit improvement value proposes to provide a UG for a criterion based on the predicted rw values from


                                                          4
equation 1. This is the explicit improvement since the predicted rw value uses the feedback received from a
transaction to know how to adjust itself.
   Although the explicit improvement is a good approximation of the amount to be provided to the trustor,
implicit improvement values for each criterion are continuously updated to reflect how much UG should be
added based on implicitly retrieved information. A implicit improvement value for a specific criterion uses the α
and β values, the trustor’s engagement, and the predicted rw value for the corresponding criterion to continuously
update itself. Thus, this is another example of how the dynamic α and β variables directly affect how the trustee
adjusts its behaviours towards individual trustors. The implicit improvement values that are assigned to each
criterion are updated via the equation below.
                                       
                                                           y        0         y         y
                                       (1 + α) ∗ I Impx (sci , tra ) x ∈ Xue and rwx (sci ) > Φ
                                       
                  I Impyx (sci , req) = (1 + β) ∗ I Impyx (sci , tra0 ) x ∈ Xey and rwxy (sci ) < ψ            (2)
                                        I Impyx (sci , tra0 )
                                       
                                                                        else
                                       

  where,

   I Impyx (sci , req) is the implicit feedback value assigned to trustor x by trustee y for sci , for the interaction
    req which occurs after interaction tra0

   ψ is the trustee’s threshold to determine whether a predicted rw is considered to be high

   Xue
      y
        is the set of all trustors unengaged with trustee y and Xey is the set of all trustors engaged with trustee
    y

   Using the improvement function, ITE computes the total UG to be provided to trustor x for an interaction
by adding each UG value that is to be provided for each criterion. The total UG to be provided for task s is
denoted by U Gyx (s, req) and is the summation of the UG values that are provided for each criterion, which will
be denoted by ugxy (sci , req) in equation 4. Below is the equation for computing the total UG.
                                              p
                                              X
                           U Gyx (s, req) =         Improvementyx (sci , req) + M inU Gy (sci )                    (3)
                                              i=1

  where,

   U Gyx (s, req) is the total UG provided by y to x for task s during interaction req

   p represents the total number of criteria within task s

   Improvementyx (sci , req) + M inU Gy (sci ) is the UG provided for a single criterion of task s, also represented
    by ugxy (sci , req)

   If the proposed UG amount is accepted by the trustor, the trustee is notified and provides the UG to the
trustor. After performing the service, the trustor may provide a satisfaction value to the trustee to signify the
trustor’s satisfaction with the transaction. This is the direct feedback that is obtained by trustees and is used
by ITE to update a trustee’s behaviour towards trustors. Although this value can be predicted if not received
or not trusted, the trustors will always provide this value within the scope of this paper. This satisfaction value
is denoted by SATxy (s, tra) and is calculated with the following equation.
                                                          p
                                                          X wx (sc ) ∗ ug y (sc , tra)
                                                                             x
                                    SATxy (s, tra) =                i               i
                                                                                                                   (4)
                                                          i=1
                                                                        dx (sci )

   Where wx (sci ) is the trustor’s weighing of sci and dx (sci ) is the trustor’s demand of sci

   Given satisfaction values from trustors (equation 4), ITE uses these values to calculate the average satisfaction
rate that the trustee has received for a specific task. In ITE’s paper, the calculated satisfaction rate below is not
explicitly stated to be computed for only satisfaction values from a single trustor or for the satisfaction values
from all trustors in the environment. Thus, the equation that will be used within this paper for ITE is set such


                                                                5
that the satisfaction rate is computed for all satisfaction values that are stored for a specific task s. This is
calculated with the equation seen below.
                                                           y
                                                         PNtr     y
                                                          j=1 SATx (s, j)
                                          SATxy (s) =           y                                               (5)
                                                              Ntr
  where,
     y
   Ntr is the total number of transactions that y has performed for a specific task s
   SATxy (s, j) is the satisfaction between trustee y and trustor x for the transaction j of task s

   ITE then computes the general satisfaction rate within the environment to determine whether the average
satisfaction of trustors has increased or decreased since it has been last calculated. This is the RoC of trustor
satisfaction in the environment for all trustors over two different points of time.

                                                          SATxy
                                                           d
                                                     y
                                                   g =                                                          (6)
                                                          SATxy

   Where SATxy is the satisfaction rate (equation 5) that is calculated after the previous satisfaction rate SATxy
             d
    is calculated

   Given this RoC value, ITE then updates its α and β variables to reflect how the trustee should update its
behaviours based on the general satisfaction of trustors in the environment(s). This is a robust method of
performing more accurate updates to the predicted rw values from equation 1 and to the implicit improvement
values from equation 2. Accurately updating these values will help to ensure a more accurate UG is provided
from equation 3. Within this paper, the updates to α and β will be done after a full round of transactions
between the trustors and trustee. This ensures that the changes being made will more accurately reflect the
environment. Although performing the updates in small batches may help stabilize the performance, similarly
to what has been proposed by Mini-Batch Stochastic Gradient Descent, this ITE implementation will perform
the updates after a full time step of transactions are completed [Rud16]. Below are the two update equations
for α and β.

                                                  α̂ = α − ln(g y )                                             (7)

                                                  β̂ = β + ln(g y )                                             (8)
   Performing these updates will ensure that ITE updates trustor behaviours at a rate that reflects the general
satisfaction in the environment. The issue with this approach, which is addressed by CBITE, is that these
updates do not work well when in a varied environment. When more trustor behaviours exist, the RoC from
equation 6 may not accurately reflect the general environment at a given time step and may incorrectly shift the
values of the α and β variables. Even if the RoC is computed for individual trustors, rather than for all trustors,
the same problem can exist when a trustor has a temporary shift in their behaviour that will result in the α and
β variables being updated too drastically.

3.2   CBITE’s Architecture
With ITE’s architecture presented and the dynamic improvement and disimprovement rate variable update
process presented, the cluster-based approach to improve ITE’s functionality will be proposed. CBITE uses the
same overall architecture, but modifies the methodology that is used for dynamically updating α and β. Thus,
the changes presented by CBITE directly affect equations 5, 6, 7, and 8 from ITE. The changes made to α
and β also indirectly affect the results from equations 1, 2, and 4. In CBITE, the α and β variables are stored
independently within each trustor model as αxy and βxy . This allows each trustor to have the variables tuned
in different ways at each time step without affecting the general methodology that is used by the trustee to
adjust itself towards other trustors. For each task s, CBITE will perform the following process. To simplify the
notation used, the following variables and formulas will omit the task s from their definitions, but it is important
to understand that this update process is done independently for each task.


                                                          6
    First, CBITE will require the use of an Unsupervised Machine Learning algorithm, represented by Φmodel , to
group the trustors which have interacted with the trustee at a given time step, represented by the set Xinteracted ,
into k distinct clusters. The Unsupervised Machine Learning algorithm can be any clustering algorithm but
should be able to cluster trustors well from a selected set of inputs. Since CBITE is a decentralized model, the
clustering algorithm will need to be fit with the appropriate data before the clustering is performed. In open
and dynamic environments, the algorithm choice will be important, but for this paper CBITE will utilize the
K-means algorithm to keep the process simple. K-means is a well known and well researched distance-based
algorithm that has flaws, such as its inability to effectively deal with outliers, but is a good candidate for use in
the simulations that will be performed [KM14]. Details regarding K-means can be found in Machine Learning
literature such as Flach’s book which describes many different Machine Learning approaches [Fla12]. When
selecting a clustering approach to use for Φmodel , surveys such as [RCC+ 19] and [XW05] can provide insight
into which algorithm will best suit the target environment and task due to their detailed comparisons between
different algorithms. For this implementation of CBITE, the clustering algorithm will cluster trustors using the
trustor satisfaction that has been obtained following a completed transaction (equation 4) and the UG which
the trustee has provided to the trustor in that same completed transaction (equation 3). The collection of the k
clusters will be denoted as C.
    For each of the k clusters, CBITE computes the average satisfaction rate for that cluster group. This directly
extends equation 5 by using both the average satisfaction from all transactions that have been performed during
or before the previous time step and the average satisfaction that has been received by the trustee from the
trustors that are grouped within the specified cluster at the current time step. This allows each cluster to have
independent average satisfaction rates that accurately reflect the change in satisfaction from similar behaving
trustors at a given time step.
                                                  P|T R|                  P|ci |
                                                     q     SATTy Rq +           j   SATxyj
                                    CBSATcyi =                                                                      (9)
                                                                |T R| + |ci |
  where,

   |ci | is the number of trustors in cluster i (where cluster i contains trustors defined by xj )

   SATxyj is the SAT provided by trustor xj to trustee y for the task that has been completed during the
    current time step t

   |T R| is the total number of transactions for a specific task that have been completed between the trustee
    and trustors before the current time step

   T Rq is the qth transaction of a specific task that has been completed by the trustee before the current time
    step

   For each cluster-based average satisfaction rate from equation 9, CBITE calculates the general satisfaction
rate of trustors in a specific cluster. This is a direct modification to equation 6 to consider the average satisfaction
change within specific clusters rather than the average satisfaction change within the entire environment. This
will help ensure that any updates that are made when using the cluster-based general satisfaction rate will not
consider any drastic changes in average satisfaction from dissimilar trustors. If any relatively dissimilar trustors
are grouped together, most of the clustered trustors should be similar enough to ensure a representative general
satisfaction rate.

                                                            CBSATcyi
                                                 CBgcyi =                                                          (10)
                                                                SATxy
   Using the cluster-based general satisfaction rate from equation 10 allows the αxy and βxy variables that are
assigned to individual trustors to be updated based on the RoC obtained from a group of similar trustors at a
specific time step. Since this process is only done for trustors who have completed a transaction with the trustee
at the time step, this cluster-based RoC will ensure that similar trustors receive fine-tuned updates. This process
ensures that the αxy and βxy values are not updated by too much if there are temporary behaviour shifts and
ensures that a trustor is accurately updated based on the group that they are clustered into. The αxy and βxy
values are updated as seen below only for trustors within the corresponding cluster.


                                                            7
                                                cxy = αy − ln(CBgcy )
                                                α                                                                 (11)
                                                       x           i


                                                cxy = β y + ln(CBgcy )
                                                β                                                                 (12)
                                                       x            i


   For trustors that have not completed a transaction with the trustee at a specific time step, they will still be
updated by averaging the general satisfaction rates that have been calculated for each cluster of trustors. This
ensures that the trustors are still being updated based on the general trends on the environment even if the
trustor has not interacted with the trustee.
                                                              Pk    y
                                                               i CBgci
                                               avgCBgcy =                                                          (13)
                                                                 k
   Using this averaged cluster-based general satisfaction rate, the αxy and βxy variables of all non-clustered trustors
are updated to ensure that the trustee adjusts its behaviour accordingly for every trustor in the environment.

                                              cxy = αy − ln(avgCBgcy )
                                              α                                                                   (14)
                                                     x


                                              cxy = β y + ln(avgCBgcy )
                                              β                                                                   (15)
                                                     x

   Thus, this cluster-based approach to updating the dynamic improvement and disimprovement rate variables
allows for more fine-tuned updates to be performed towards trustors that have interacted with the trustee at
a given time step. This occurs without compromising the model’s ability at also updating the variables for
trustors which infrequently interact with the trustee. This methodology will allow for improved performance
over ITE by being able to better reflect how a trustee should update its behaviour towards individual trustors
which exhibit varied behaviours in an environment. Furthermore, this approach can be optimized for a specific
environment or task by selecting the appropriate algorithm to use for Φmodel . As an example, perhaps the
CBITE implementation will decide to use one clustering algorithm for tasks with many criteria values and will
use a different clustering algorithm otherwise. Each Φmodel can also be specifically tuned for a task which the
trustee performs in the environment.

4     System Evaluation
Following the presentation of CBITE’s architecture and the changes that have been made when compared to
ITE, this section will describe the simulation environment that is used and will present the results obtained from
comparative simulated tests between ITE and CBITE. These tests will highlight the improvements made by the
cluster-based approach and analyze how each model performs in varied environments.

4.1   Simulation Definition
The simulation environment that will be used in this section is designed to be similar to the environment that
has been used to evaluate ITE in its paper. It uses the concept of assigning activity levels and demand levels
to trustors to allow different behaviours to be present during the simulations. There are three different activity
levels that are available to be assigned to trustors which each provide the probability for the trustor to actively
seek interaction partners at a given time step. The trustor can have a high activity level, a regular activity level,
or a low activity level. These each represent different probabilities of the trustor choosing whether or not to be
inactive for a specific time step.
   Each trustor is also assigned one of three demand levels which is used to determine a range of possible values
that can be used for the trustor’s demand values that are assigned to the various criteria of a task. These values
are then used to compute both the trustor’s rw for each criterion of a task and the satisfaction that is provided
to trustors (the dx (sci ) value from equation 4). The demand level also indicates the probability that is used to
determine whether a transaction is good or bad. Different than the trustor’s satisfaction, a good transaction is a
transaction in which each UG value that is provided by a trustee to a trustor for the criteria of a task meet the
specified demand percentage that is contained by the trustor via their demand level. Thus, a good transaction
is a transaction which satisfies the overall demand of a trustor. A bad transaction is any transaction that is not
a good transaction. A trustor is assigned either a low demand level, a regular demand level, or a high demand
level to increase the variety of behaviours within the environment.


                                                          8
   CBITE will be defined in the same manner described in section 3.2 and will use the K-means clustering
algorithm. Since the simulations will consist of a single task with 10 criteria values, only one K-means model
will be needed for each trustee. To help produce varied trustor behaviours, each trustor within the environment
is assigned with a randomly selected activity and demand level (where there is a limited amount of each activity
and demand type that can be assigned to the trustors). This random assignment is based on a random seed
which ensures that the same trustor configuration is used when testing each model. This will produce a variety of
unique behaviours that will help with evaluating how well ITE and CBITE can perform in varied environments.
Since each performed test uses different trustor to trustee ratios, the simulated environments will capture the
performance of each model in varying scenarios.
   The simulation environments are programmed in Python using the Mesa Multi-agent Modelling library [MK15]
and use the same scheduling design that is used in ITE’s Java based simulations which use the MASON library
[LCRP+ 05]. Each of the tests that are performed run exactly the same, except that the trust establishment
model which is used will be different. The instantiation of a simulation uses a random seed to ensure that
everything is instantiated the same way for a specific test.
   Trustors typically will use a trust value to determine which trustees to interact with at a specific time step.
In these simulations, any active trustors will interact with each trustee during a time step. Doing this helps the
simulation capture more data regarding the interactions without requiring a large increase to the total number of
agents. Due to the diverse environment, this also showcases how well the trust establishment model can satisfy
the trustors without simply providing much more UG to each trustor in the environment.
   To calculate trust values, each trustor uses the simple trust evaluation model that is used within ITE’s paper
[AT20]. Within the environment, the maximum trust is 1 and the minimum trust is 0. The total trust that is
assigned to a trustee by a trustor is the weighted combination of the direct and indirect trust values for that
trustee. The direct trust of a trustee for a specific trustor is equal to the average satisfaction that the trustor
has computed for each transaction with the trustee (equation 4). The indirect trust that is used by a trustor
for a trustee is the average of the direct trust values from all other trustors regarding that trustee. Although
in a real environment this communication will not always be possible, in this simulation all agents can freely
communicate and cooperate with one another. Below is the equation to retrieve the total trust of a trustee.

                                 trustyx = direct trust ∗ 0.5 + indirect trust ∗ 0.5                             (16)

   When running the simulations, the average direct trust values of all trustees from all trustors, the average UG
that is provided to all trustors from each transaction, and the rate of good transactions out of all transactions will
be tracked. These will provide insight into the performance of each model by determining how well the models
manage the trade-off between providing more UG in exchange for increased trust and how well the models can
accurately meet all the needs of trustors by analyzing the rate of good transactions. Table 1 exhibits the core
hyperparameter values that are used throughout the simulation. The selected hyperparameter values are set to
closely match those used within ITE’s paper for a more accurate comparison. Any parameter values not exclusive
to CBITE are used between both models.

                            Table 1: Core hyperparameter values for the simulations

                                    Parameter                  Assigned Value
                                    Number of agents (N)       100
                                    α (initial value)          0.02
                                    β (initial value)          -0.01
                                    Number of clusters (k)     5
                                    θ                          0.75
                                    λ                          0.1
                                    ζ                          1
                                    γ                          1
                                    Ω                          0.5
                                    Φ                          0.25
                                    ψ                          0.5


                                                          9
4.2   Simulation Results

To display the performance benefits from CBITE, we will compare CBITE to ITE in three tests. Each test will
use a different trustor to trustee ratio to see how well each model handles varying amounts of trustor behaviours.
In the tests, each agent is only considered to be a trustor or be a trustee. In the first test, there will be 12
trustors and 88 trustees. This allows a comparative analysis of how well each model performs when there are
far more trustees than trustors. This limits the amount of trustor behaviours in the environment as well. Next,
the models will be compared when there are 24 trustors and 76 trustees to better understand how each model
performs when there are more trustor behaviours that are introduced in the environment. Finally, the third
test will involve 36 trustors and 64 trustees to introduce even more diversity into the environment while still
maintaining a good number of trustees. As more trustors are introduced into the environment, the number of
interactions drastically increase and will affect the tracked metrics accordingly. Figure 2 exhibits the results that
have been obtained from the simulated tests.


        (a) Average Direct Trust:            (b) Average Direct Trust:             (c) Average Direct Trust:
         12 Trustors, 88 Trustees             24 Trustors, 76 Trustees              36 Trustors, 64 Trustees


       (d) Average Delivered UG:            (e) Average Delivered UG:             (f) Average Delivered UG:
         12 Trustors, 88 Trustees             24 Trustors, 76 Trustees              36 Trustors, 64 Trustees


       (g) Good Transaction Rate:           (h) Good Transaction Rate:            (i) Good Transaction Rate:
         12 Trustors, 88 Trustees             24 Trustors, 76 Trustees              36 Trustors, 64 Trustees


                      Figure 2: Comparative results between ITE (black) and CBITE (red)


                                                         10
   Figure 2 showcases that CBITE’s cluster-based approach to updating the dynamic variables changes the
observed performance in all metrics. It important to note that the results have a more significant difference
when there are more trustors and less trustees in the environment. This makes sense given that the cluster-based
approach will perform more accurate updates when there are more trustors to be clustered at each time step.
Overall, CBITE receives a minor decrease in the average direct trust of trustees, but also reduces the UG that is
being provided. This decrease in trust makes sense given the corresponding reduction in the UG that is provided
to trustors. Despite this, the overall decrease in trust and provided UG from CBITE, which is further reduced
when there are more trustors in the environment, is not too significant when compared to ITE’s results.
   When comparing the average direct trust and the average provided UG from the third test, see plots (c) and
(f), to the first two tests, see plots (a), (b), (d), and (e), we see that CBITE is learning how to maintain a
high enough trust value without providing too much UG. When analyzing ITE’s performance, plots (c) and (f)
demonstrate that ITE does not always attempt to balance the provided UG with the corresponding trust when
working with a diverse set of trustors. Although increased trust is beneficial, the amount of UG that is spent
for it can be problematic depending on the resources that are available to the trustee.
   Despite the lower average direct trust and provided UG, CBITE receives a higher good transaction rate for
most of all three tests. The good transaction rate also increases early on, without seeing the initial dip that
is observed by ITE when there are more varied trustor behaviours in the environment (see plots (h) and (i)).
Furthermore, since the good transaction rates from both models in plot (g) converge at the end, this also displays
that CBITE benefits more from operating in diverse MASs. The drastic improvement to the good transaction
rate in plots (h) and (i) by CBITE is indicative of CBITE’s improved ability at better meeting all the needs of a
trustor. Since the corresponding average provided UG is lower, see plots (e) and (f), it is clear that CBITE can
better learn the expectations of trustors while providing less overall UG in the environment. Thus, trustees using
CBITE are shown to better learn how to shift their behaviours towards specific trustors in varied environments
when compared to ITE.

5   Discussion and Conclusion
From the simulation results that have been observed in the previous section, the cluster-based approach to
updating ITE’s α and β variables for each trustor results in specific performance improvements that are more
prevalent as more trustor behaviours are present in the environment. Unlike ITE, the good transaction rates
that are observed from CBITE in more diverse environments immediately increase and are quickly stabilized.
This observation clearly indicates that the independent updates that are performed for groups of similar trustors
ensures that the fine-tuned general satisfaction rate from equation 10 better represents the needs of similar
trustors. Given the decreased UG that is being provided and the corresponding trust decrease, CBITE is
allowing trustees to prioritize specific trustors as interaction partners.
   The stabilized CBITE results in the second and third tests, as seen in plots (b), (c), (e), (f), (h), and (i) of
Figure 2, showcases that trustees can learn varied behaviours in the environment quicker than ITE. Although
the trustors from the simulations do not actively change their behaviours, CBITE’s ability to quickly learn the
needs of trustors in an environment will allow CBITE to update itself more quickly to meet shifting trustor needs
when compared to ITE. Despite the observed results being less drastic in less diverse environments, CBITE still
achieves comparable results with the benefit of potential large increases to performance as new agents enter the
environment. This is because a trust establishment model that is designed for open and dynamic MASs should
be capable of quickly learning the behaviours of a trustor to ensure that the trustee appropriately allocates UG
to that trustor.
   Although the simulated environment is not distributed, the same tests can be performed in simulated dis-
tributed environments with an appropriate method for agent communication. Thus, similarly to ITE, CBITE is
a functional trust establishment model in both singular and distributed MASs. The tests performed in this paper
provide insight into how CBITE improves upon ITE’s ability to handle varied trustor behaviours. Since the tests
use the same concepts that are used by ITE’s paper, this ensures a fair and accurate comparison between the
two. However, as more advancements are made in trust establishment model research the models can also be
tested based on their performance when handling trustor attacks or when splitting the types of trust establish-
ment and trust evaluation models that are used in the same simulated environment. This can further exhibit
any improvements that can be made to CBITE to ensure that the ITE-based model is as robust as possible.
Similarly, CBITE can be tested with real datasets which may contain larger sets of dissimilar trustors. Since
CBITE is intended to help when there are similar trustor behaviours in an environment, performing these tests


                                                        11
can help with determining the optimal clustering algorithms and corresponding hyperparameters to use when
working in these different scenarios. Further exploration regarding the tuning of the hyperparameters that are
used by ITE-based models can also lead to improved guidance on how to use the models to best target expected
agent behaviours in different domains.
   To conclude, the cluster-based update process that is applied to ITE provides a refinement to ITE’s initial
design by allowing trustees to quickly learn how to completely satisfy trustors without providing too much
UG. In this paper, a trust establishment model named CBITE is presented and contains the new cluster-
based improvement and disimprovement rate update module. Despite using a simplistic Unsupervised Machine
Learning algorithm in the simulations, CBITE still provides improvements when adjusting a trustee’s behaviour
in varied environments. As trust establishment model research continues to advance, CBITE will demonstrate
how trustor-based clustering can improve trust establishment model performance and will allow for further
refinements to be made at optimizing the trust establishment model.

References
[AT15]      A Aref and T Tran. Rlte: A reinforcement learning based trust establishment model. In 2015 IEEE
            Trustcom/BigDataSE/ISPA, pages 694–701, 2015.
[AT17]      A Aref and T Tran. Acting as a trustee for internet of agents in the absence of explicit feedback.
            In MCETECH 2017, pages 3–23, 05 2017.
[AT20]      A Aref and T Tran. An integrated trust establishment model for the internet of agents. Knowledge
            and Information Systems, 62:79–105, 2020.
[BNS11]     C Burnett, TJ Norman, and K Sycara. Trust decision-making in multi-agent systems. In Twenty-
            Second International Joint Conference on Artificial Intelligence, pages 115–120, 2011.
[Fla12]     P Flach. Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge
            University Press, 2012.
[KM14]      M Kaushik and B Mathur. Comparative study of k-means and hierarchical clustering techniques.
            International Journal of Software and Hardware Research in Engineering, 2(6):93–98, 2014.
[LCRP+ 05] S Luke, C Cioffi-Revilla, L Panait, K Sullivan, and G Balan. Mason: A multiagent simulation
           environment. SIMULATION, 81(7):517–527, 2005.
[MK15]      D Masad and J Kazil. Mesa: An agent-based modeling framework. In 14th PYTHON in Science
            Conference, pages 53–60, 2015.
[RCC+ 19]   MZ Rodriguez, CH Comin, D Casanova, OM Bruno, DR Amancio, LdF Costa, and FA Rodrigues.
            Clustering algorithms: A comparative approach. PLoS ONE, 14, 2019.
[Rud16]     S Ruder. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747,
            2016.
[Sen13]     S Sen. A comprehensive approach to trust management. In Proceedings of the 2013 International
            Conference on Autonomous Agents and Multi-Agent Systems, (AAMAS ’13), page 797–800, Rich-
            land, SC, 2013.
[SS05]      J Sabater and C Sierra. Review on computational trust and reputation models. Artificial Intelligence
            Review, 24(1):33–60, 2005.
[TCLK14]    T Tran, R Cohen, E Langlois, and P Kates. Establishing trust in multiagent environments: Realizing
            the comprehensive trust management dream. TRUST@ AAMAS, 1740:35–43, 2014.
[XW05]      R Xu and D Wunsch. Survey of clustering algorithms. IEEE Transactions on Neural Networks,
            16(3):645–678, 2005.
[YSL+ 13]   H Yu, Z Shen, C Leung, C Miao, and VR Lesser. A survey of multi-agent trust management systems.
            IEEE Access, 1:35–50, 2013.


                                                      12