Prior to Trust: Frequentist and Bayesian Views of Trust in AI

Mattia Petrolo1,∗, Ekaterina Kubyshkina2 and Giuseppe Primiero2

1 University of Lisbon, CFCUL, Alameda da Universidade, 1649-004 Lisbon, Portugal
2 Logic, Uncertainty, Computation and Information Lab, PhilTech Research Center, Philosophy Department, University of Milan, Via Festa del Perdono 7, 20122 Milan, Italy

Abstract
The notions of trust and trustworthiness in the field of AI are currently the focus of a collective, interdisciplinary effort for clarification. In this work, we contribute to this ongoing debate by identifying two senses in which an agent might place trust in an AI system. The first sense, referring to trustworthiness as formalised in previous work, considers the results of tests conducted on the system alongside the agent’s expectations. The second sense extends the former by factoring in the agent’s “pragmatic” background when considering these tests. We argue that these two forms of trust can be understood in relation to well-known approaches in statistical inference: the first aligns with a frequentist interpretation, while the second reflects a Bayesian view of trust.

Keywords
Trustworthy AI, Reliable AI, Statistical inference

1. Introduction

The concepts of trust and trustworthiness in AI are currently the subject of an interdisciplinary effort of conceptual, formal and procedural clarification. This is evident from the increasing attention these notions are receiving across various fields of research. AI engineers, for instance, are working to incorporate properties into AI systems to enhance their trustworthiness (see, e.g., [1]). Meanwhile, philosophers are engaged in defining Trustworthy AI (TAI), exploring its epistemological and ethical implications, and debating whether it is even possible to discuss TAI without committing a category mistake (see, e.g., [2], and [3] for a critical discussion). Logicians, on the other hand, are developing formal systems to capture the complex and elusive concepts of trust and TAI (see, e.g., [4], [5]). Finally, sociologists are examining the societal impacts of trusting AI-based technologies (see, e.g., [6]).

The challenge of understanding trust and trustworthiness in AI is not purely theoretical. The widely discussed European proposal for the Artificial Intelligence Act [7], inspired by the Ethics Guidelines for Trustworthy AI [8], stipulates that AI systems and the information they generate must be, among other things, reliable, transparent, and trustworthy. However, these terms have distinct and not always shared definitions. This lack of clarity, together with the absence of a solid theoretical foundation, is a source of potential misunderstandings that could affect the social perception of AI systems. Without a clear definition of these concepts, there is a genuine risk that the Guidelines and the AI Act may lack the practical relevance necessary for meaningful implementation. Given that these frameworks aim to regulate issues of critical importance to human well-being and governance, a deeper analysis and clarification of the notions of trust and trustworthiness in AI is essential.

In this paper, we contribute to this ongoing debate by identifying two senses in which an agent might place trust in an AI system. The first sense considers the results of tests conducted on the system alongside the agent’s expectations. The second extends the former by factoring in the agent’s “pragmatic” background when considering these tests.
We argue that these two forms of trust can be understood in relation to well-known approaches in statistical inference: the first aligns with a frequentist interpretation, while the second reflects a Bayesian view of trust.

2. Two accounts of trust in AI systems

When it comes to trusting an AI system, particularly a Machine Learning (ML) system, there are at least two distinct ways in which an agent can do so. To illustrate this, we borrow and slightly modify an example from [1]. Imagine a classifier 𝒞 designed to identify pictures of wolves, and assume that 𝒞 has already been evaluated as trustworthy according to some relevant metrics. Now, suppose we present two images to the classifier: one of a wolf and the other of a Siberian Husky. Upon processing, 𝒞 classifies both images as wolves. Let us now consider two agents, 𝐴1 and 𝐴2, both aware that 𝒞 has been deemed trustworthy, and both receiving the same classification output. The only difference between the two agents is that 𝐴2 is an expert dog trainer, while 𝐴1 is not. At this point, their reactions diverge: 𝐴1 trusts the output, while 𝐴2 does not. How can we explain the difference in their responses?

To address this, let us first examine what allows 𝐴1 to trust 𝒞, starting from some conditions we take to be necessary for 𝒞 to count as trustworthy. In what follows, we adopt the notion of trustworthiness for non-deterministic computational systems proposed in [9], [4], [5]. In this framework, a non-deterministic process is trustworthy for a given output when the frequency of that output, over a specified number of trials, does not deviate beyond an acceptable threshold from its expected probability. This understanding relies on a series of tests performed on the system and on the alignment of these tests with the expectations an agent has regarding the system’s behavior. Note that this interpretation is not necessarily constrained to a sharp measure of probability and could be extended naturally to a graded version.

In this context, trustworthiness is always indexed by both an agent and an output. Different agents may have (or assume) varying expectations about a system’s performance, leading them to assess the trustworthiness of the same system differently. Similarly, a system may be deemed trustworthy for certain outputs but not for others. For example, an ML system might be well-trained to provide accurate answers about historical events, but not about current ones. Thus, it can be considered trustworthy in relation to historical outputs while being untrustworthy for current ones.

Two conditions are necessary for this notion of trustworthiness to induce an epistemic state: first, 𝐴1 knows that 𝒞 is trustworthy, meaning that the behavior displayed by 𝒞 is as expected by the agent in any epistemic scenario; second, 𝐴1 has evidence that 𝒞 is trustworthy, giving them a justification to accept the output as correct.
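To make this threshold-based reading concrete, the following sketch illustrates the kind of check involved. It is a minimal illustration of ours, not the formal apparatus of [9], [4], [5]; the function name, the test data, and the numerical values are purely hypothetical.

```python
def is_trustworthy(outputs, target, expected_prob, threshold):
    """Illustrative frequency-based trustworthiness check: the observed
    frequency of `target` over the recorded trials must not deviate from
    the expected probability by more than the acceptable threshold."""
    observed_freq = outputs.count(target) / len(outputs)
    return abs(observed_freq - expected_prob) <= threshold


# Hypothetical test run: 100 labelled wolf images, 97 classified correctly;
# the agent expects an accuracy of 0.95 and tolerates a deviation of 0.05.
tests = ["wolf"] * 97 + ["not-wolf"] * 3
print(is_trustworthy(tests, "wolf", expected_prob=0.95, threshold=0.05))  # True
```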
With this understanding of trustworthiness in place, we can characterize a first form of trust in an AI system, which we will refer to as 𝑡𝑟𝑢𝑠𝑡1:

An agent 𝐴 𝑡𝑟𝑢𝑠𝑡𝑠1 an AI system 𝑆 iff 𝐴 has evidence that 𝑆 produces an output in accordance with the behavior of 𝑆 as expected by 𝐴.

This notion of trust is widely referenced in the literature on evaluating AI trustworthiness (see, for example, [5]). In this context, trustworthiness is viewed as a crucial component, while other aspects of trust are set aside.

Let us return to our example of the classifier and examine the reasons why the second agent, 𝐴2, might not trust the classifier, unlike the first agent, 𝐴1. Assume that both agents possess the same knowledge about 𝒞. However, 𝐴2 may have additional beliefs regarding the potential inaccuracies of the output, even if they acknowledge that such inaccuracies are expected. We argue that these additional beliefs, which lead to the divergence in trust between 𝐴1 and 𝐴2, stem from the specific pragmatic background of 𝐴2. By pragmatic background, we refer to the set of beliefs an agent holds prior to interacting with the AI system. These beliefs may be shaped by various factors, including education, experience, cultural context, and moral or ethical principles. Based on this understanding, we can characterize an extended form of trust in an AI system, which we will refer to as 𝑡𝑟𝑢𝑠𝑡2:

An agent 𝐴 𝑡𝑟𝑢𝑠𝑡𝑠2 an AI system 𝑆 iff 𝐴 𝑡𝑟𝑢𝑠𝑡𝑠1 𝑆 and the output of 𝑆 is compatible with the pragmatic background of 𝐴.

As is evident from this characterization, 𝑡𝑟𝑢𝑠𝑡2 extends 𝑡𝑟𝑢𝑠𝑡1 by incorporating the agent’s belief set and comparing it with the output provided by the AI system.

With the definitions of 𝑡𝑟𝑢𝑠𝑡1 and 𝑡𝑟𝑢𝑠𝑡2 established, let us revisit our motivating example of the classifier and compare the two forms of trust held by 𝐴1 and 𝐴2. Viewing the example through the lens of these definitions, we can assert that 𝐴1 𝑡𝑟𝑢𝑠𝑡𝑠1 𝒞 to classify both pictures as wolves. Furthermore, 𝐴1 also 𝑡𝑟𝑢𝑠𝑡𝑠2 𝒞 for the same classification, because the received output does not contradict their pragmatic background. In contrast, while 𝐴2 𝑡𝑟𝑢𝑠𝑡𝑠1 𝒞 to classify both pictures as wolves, their situation diverges: 𝐴2 does not 𝑡𝑟𝑢𝑠𝑡2 𝒞 for this classification. This divergence may stem, for instance, from 𝐴2’s background as a dog trainer who has previously trained a Siberian Husky. In this context, a single error from 𝒞 does not undermine its overall trustworthiness, and 𝐴2 is aware of this. Thus, 𝐴2 maintains 𝑡𝑟𝑢𝑠𝑡1 in 𝒞. However, 𝐴2 recognizes the Siberian Husky and believes, based on their education and experience, that a Siberian Husky is not a wolf. Consequently, they would not base further reasoning or actions on this erroneous output. In this sense, 𝐴2 does not 𝑡𝑟𝑢𝑠𝑡2 𝒞, as the output contradicts their pragmatic background.

3. Trust in AI via statistical inference

As is evident from our definitions of 𝑡𝑟𝑢𝑠𝑡1 and 𝑡𝑟𝑢𝑠𝑡2, these notions are not mutually exclusive; rather, 𝑡𝑟𝑢𝑠𝑡1 is included in 𝑡𝑟𝑢𝑠𝑡2. The primary distinction is that while 𝑡𝑟𝑢𝑠𝑡1 is based solely on the correspondence between data obtained from a sufficient number of tests and an agent’s expectations about an AI system, 𝑡𝑟𝑢𝑠𝑡2 incorporates the agent’s overall background into the reasoning process. From this perspective, we argue that the two kinds of trust discussed in this article naturally correspond to two forms of statistical inference: the frequentist and the Bayesian approach.
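Before developing this correspondence, the distinction between the two notions can be rendered schematically. The sketch below is ours and rests on deliberately crude assumptions: the pragmatic background is modelled as a set of believed propositions, and compatibility as the absence of an explicit denial of the output.

```python
def trust_1(has_evidence: bool, output_as_expected: bool) -> bool:
    """trust_1: the agent has evidence that the system produces an output
    in accordance with the behaviour the agent expects of it."""
    return has_evidence and output_as_expected


def trust_2(trust1_holds: bool, output: str, pragmatic_background: set) -> bool:
    """trust_2: trust_1 plus compatibility of the output with the agent's
    pragmatic background (here: no believed proposition denies the output)."""
    return trust1_holds and f"not: {output}" not in pragmatic_background


claim = "the Husky picture shows a wolf"
# A1 (non-expert): nothing in the background contradicts the output.
print(trust_2(trust_1(True, True), claim, set()))              # True
# A2 (dog trainer): the background contains the denial, so trust_2 fails
# even though trust_1 holds.
print(trust_2(trust_1(True, True), claim, {f"not: {claim}"}))  # False
```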
Frequentists define the probability of an event as the limit of its relative frequency over a large number of trials, whereas Bayesians extend probabilities to account for varying degrees of certainty about statements (see, e.g., [10] for more details). The fundamental difference between these approaches lies in their treatment of probabilities: frequentists analyze probabilities purely as calculations based on data defined over a sample space of possible outcomes, while Bayesians add the dimension of an agent’s knowledge about those data.

As previously noted, 𝑡𝑟𝑢𝑠𝑡1 relies on knowledge of a system’s trustworthiness, as discussed in [9], [4], [5]. In this context, trustworthiness is established through a post-hoc verification strategy that evaluates the reliability of an AI system’s behavior in statistical terms, alongside adherence to an evaluation criterion (see [11]). Specifically, this verification employs two comparison terms. First, it uses a formal expression to denote the observable behavior of a (possibly) non-deterministic system over a finite number of executions. Second, it incorporates a transparent model of the expected behavior, which is normatively or ethically desirable given the observed model and the input data. This second model serves as a benchmark for evaluating the first, observed model, and the formal verification process measures the distance between the two models. From this perspective, assessing trustworthiness is fundamentally tied to considering the 𝑃-value of the results in frequentist terms, that is, to measuring the probability of obtaining the observed results under the assumption that the null hypothesis is true, namely the hypothesis that there is no genuine relationship between the two sets of data. In this context, it emphasizes the need to evaluate trustworthiness on the basis of a sufficient number of distinct tests performed on the system.

Let us reconsider our example in frequentist terms. In order to establish 𝑡𝑟𝑢𝑠𝑡1, an agent takes into account the probability of getting a result which identifies a wolf as a wolf (let us dub it Result) over a sufficient number of tests (Data):

𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) = #𝑅𝑒𝑠𝑢𝑙𝑡 / #𝐷𝑎𝑡𝑎

Then, the agent verifies whether 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) lies within an acceptable threshold of the expected probability for Result. In our example, both agents evaluate that the observed frequency of Result sits within an acceptable threshold of its theoretical counterpart, thereby inferring trustworthiness. Notice that, even though for 𝐴2 the present output does not count as an occurrence of Result, the difference is so small that 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) still falls within the acceptable threshold.

Since 𝑡𝑟𝑢𝑠𝑡1 is included within 𝑡𝑟𝑢𝑠𝑡2, trustworthiness, and by extension frequentist statistical reasoning, plays a significant role in establishing 𝑡𝑟𝑢𝑠𝑡2. However, a distinguishing feature of 𝑡𝑟𝑢𝑠𝑡2 is its incorporation of the agent’s pragmatic background in the evaluation. This pragmatic background reflects the current state of the agent’s knowledge, not only regarding the results of testing an AI system (i.e., the data) but also encompassing prior information and hypotheses about the system and about acceptable outcomes. From this perspective, the pragmatic background can be viewed as a prior probability, which represents the probability assigned to an output before receiving the relevant information.
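Anticipating the Bayesian reading developed below, the following sketch, with purely illustrative numbers of our own choosing, shows how two agents confronted with the same output but starting from different priors (their different pragmatic backgrounds) arrive at different posterior degrees of belief that the classification is correct.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a binary hypothesis:
    P(Result | Data) = P(Data | Result) * P(Result) / P(Data),
    where P(Data) is expanded by the law of total probability."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence


# A1 (non-expert): high prior that the classification is correct.
print(round(posterior(prior=0.9, likelihood_if_true=0.95, likelihood_if_false=0.5), 3))  # ~0.945
# A2 (dog trainer): low prior, given their beliefs about Siberian Huskies.
print(round(posterior(prior=0.1, likelihood_if_true=0.4, likelihood_if_false=0.9), 3))   # ~0.047
```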
In this broad sense of pragmatic background, 𝑡𝑟𝑢𝑠𝑡2 seems to align with a Bayesian interpretation of the probability of the output, as it incorporates the dimension of the agent’s prior credence, or degrees of belief, which can subsequently be updated.

Returning to our example of the classifier 𝒞, we can say that 𝐴1 and 𝐴2 establish their 𝑡𝑟𝑢𝑠𝑡1 in 𝒞 based on the overall trustworthy behavior of the classifier, which is measured in frequentist terms. In the case of 𝑡𝑟𝑢𝑠𝑡2, however, the agents 𝐴1 and 𝐴2 appear to have different priors, specifically, differing knowledge and assumptions about dogs and wolves, which influence their attitudes toward the output. From this perspective, the conditional belief (posterior belief, in Bayesian terms) of 𝐴1 and 𝐴2 differs once it is calculated via Bayes’ theorem:

𝑃(𝑅𝑒𝑠𝑢𝑙𝑡 ∣ 𝐷𝑎𝑡𝑎) = 𝑃(𝐷𝑎𝑡𝑎 ∣ 𝑅𝑒𝑠𝑢𝑙𝑡) × 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) / 𝑃(𝐷𝑎𝑡𝑎)

That is, the conditional belief in the event, 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡 ∣ 𝐷𝑎𝑡𝑎), is calculated by multiplying the agent’s prior belief 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) by the likelihood 𝑃(𝐷𝑎𝑡𝑎 ∣ 𝑅𝑒𝑠𝑢𝑙𝑡) that Data would be observed if Result were true, and dividing by 𝑃(𝐷𝑎𝑡𝑎). Clearly, 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡 ∣ 𝐷𝑎𝑡𝑎) is significantly different for 𝐴1 and 𝐴2, given that 𝑃(𝑅𝑒𝑠𝑢𝑙𝑡) differs between them and that the likelihood for 𝐴1 is much higher than that for 𝐴2.

4. Conclusion

The association of the two forms of trust we introduced with established methods of statistical inference supports the distinction between 𝑡𝑟𝑢𝑠𝑡1 and 𝑡𝑟𝑢𝑠𝑡2. These forms of trust allow for a focus on different objectives when evaluating an AI system. Specifically, 𝑡𝑟𝑢𝑠𝑡1 can be seen as measuring the reliability of a system with respect to a benchmark, while 𝑡𝑟𝑢𝑠𝑡2 involves an agent’s attitudes and hypotheses, which may not be directly tied to the AI system itself.

A notable aspect of our analysis is the relationship between trust and trustworthiness, where trust inherently presupposes trustworthiness. In our framework, trustworthiness is always relative to the agent. From this perspective, it seems natural to assert that if an agent trusts an AI system, they must perceive it as trustworthy. However, the reverse is not necessarily true: an agent may consider a system trustworthy without actually placing their trust in it. Formally, both 𝑡𝑟𝑢𝑠𝑡1 and 𝑡𝑟𝑢𝑠𝑡2 can be applied depending on the desired level of abstraction in the model. The development of a formal framework to represent these two types of trust remains a topic for future research.

Acknowledgments

The authors would like to thank two anonymous reviewers for their comments. All authors acknowledge the support of the Project PRIN2020 BRIO - Bias, Risk and Opacity in AI (2020SSKZ7R) awarded by the Italian Ministry of University and Research (MUR). Giuseppe Primiero is further funded through the project PRIN2022 SMARTEST - Simulation of Probabilistic Systems for the Age of the Digital Twin (2022E8Y4X) awarded by the Italian Ministry of University and Research (MUR). The research of Ekaterina Kubyshkina is funded under the “Foundations of Fair and Trustworthy AI” Project of the University of Milan. Giuseppe Primiero and Ekaterina Kubyshkina are further funded by the Department of Philosophy “Piero Martinetti” of the University of Milan under the Project “Departments of Excellence 2023-2027” awarded by the Ministry of University and Research (MUR).
Mattia Petrolo acknowledges the financial support of the FCT – Fundação para a Ciência e a Tecnologia (2022.08338.CEECIND; R&D Unit Grants UIDB/00678/2020 and UIDP/00678/2020) and the French National Research Agency (ANR) through the Project ANR-20-CE27-0004.

References

[1] M. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in: KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.

[2] J. M. Durán, K. R. Jongsma, Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI, Journal of Medical Ethics 47 (2021) 329–335.

[3] G. Zanotti, M. Petrolo, D. Chiffi, V. Schiaffonati, Keep trusting! A plea for the notion of trustworthy AI, AI & Society (2023). doi:10.1007/s00146-023-01789-9.

[4] F. A. D’Asaro, G. Primiero, Probabilistic typed natural deduction for trustworthy computations, in: D. Wang, R. Falcone, J. Zhang (Eds.), Proceedings of the 22nd International Workshop on Trust in Agent Societies (TRUST 2021), co-located with the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021), CEUR Workshop Proceedings, 2021, pp. 1–12.

[5] F. A. D’Asaro, F. Genco, G. Primiero, Checking trustworthiness of probabilistic computations in a typed natural deduction system, 2023. URL: https://arxiv.org/pdf/2206.12934.pdf, CoRR abs/2206.12934.

[6] J. Dacon, Are you worthy of my trust? A socioethical perspective on the impacts of trustworthy AI systems on the environment and human society, 2023. URL: https://arxiv.org/pdf/2309.09450, CoRR abs/2309.09450.

[7] Regulation (EU) 2024/... of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (AI Act), European Commission, 2024. URL: https://artificialintelligenceact.eu/the-act/.

[8] Ethics Guidelines for Trustworthy AI of the High-Level Expert Group on Artificial Intelligence, AI HLEG, 2019. URL: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai.

[9] E. Kubyshkina, G. Primiero, A possible worlds semantics for trustworthy non-deterministic computations, International Journal of Approximate Reasoning 172 (2024) 1–24. doi:10.1016/j.ijar.2024.109212.

[10] A. Hájek, Interpretations of probability, in: E. N. Zalta, U. Nodelman (Eds.), The Stanford Encyclopedia of Philosophy, Winter 2023 ed., 2023.

[11] G. Primiero, BRIO: il ruolo della logica nella costruzione di una IA equa, in: L. Marinucci, C. Caporale (Eds.), Spiegabilità e Intelligenza Artificiale. Una prospettiva multidisciplinare, Etica della ricerca, bioetica, biodiritto, biopolitica, CNR Edizioni, to appear.