Reasoning about algorithmic opacity

Ekaterina Kubyshkina¹, Mattia Petrolo²
¹ University of Milan, Philosophy Department, Via Festa del Perdono 7, 20122 Milan, Italy
² Federal University of ABC, Alameda da Universidade, s/n, São Bernardo do Campo, SP 09606-045, Brazil

1st Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming (BEWARE-22), co-located with AIxIA 2022, University of Udine, Udine, Italy, 2022
ekaterina.kubyshkina@unimi.it (E. Kubyshkina); mattia.petrolo@ufabc.edu.br (M. Petrolo)

Abstract
A recurring problem discussed in explainable AI is the so-called epistemic opacity problem, that is, a problem about the epistemic accessibility and reliability of algorithms. In the present work, we provide an original epistemological characterization of the opacity of algorithms based on a tripartite analysis of their components. Against this background, we introduce a formal framework by modifying the neighborhood semantics for evidence logic introduced in [1]. This setting allows one to reason about an agent's epistemic attitudes toward an algorithm and to investigate what conditions should be met to achieve epistemic transparency.

Keywords: Transparent AI, epistemic opacity, epistemic logic, evidence models, neighborhood semantics

1. Introduction

The explosion in the use of computational algorithms in several domains of human life prompted the development of explainable AI and, more generally, of what can be labelled a human-centered approach to algorithms, which aims to shed light on the nature of AI models. From this perspective, as Seaver puts it, “it is not the algorithm, narrowly defined, that has sociocultural effects, but algorithmic systems - intricate, dynamic arrangements of people and code” ([2], pp. 418-419). A recurring problem discussed in this approach, in one form or another, is the so-called epistemic opacity problem, that is, a problem about the epistemic accessibility and reliability of algorithms. In the present work, our aim is to provide an original epistemological and logical characterization of the epistemic opacity of algorithms, in order to investigate under which conditions this form of opacity can be eliminated.

2. A definition of epistemic opacity

To characterize the epistemic opacity of algorithms, we follow the methodology proposed by Durán and Formanek [3] and adapt Humphreys' definition of an epistemically opaque process:

[A] process is epistemically opaque relative to a cognitive agent X at time t just in case X does not know at t all of the epistemically relevant elements of the process ([4], p. 618).

This characterization of opacity crucially relies on the fact that an agent “X does not know”, which, in turn, presupposes an account of what knowledge is. Unfortunately, Humphreys leaves this question open. Traditional epistemology takes knowledge to be justified true belief. This analysis has been challenged by Gettier's famous counterexample (see [5]), which shows that luck-dependent cases of justified true belief cannot count as genuine knowledge. To avoid this problem, our characterization of opacity must take special care in spelling out the nature of the justificatory component involved in the analysis.
Moreover, in order to specify what the “epistemically relevant elements” of algorithms are, the definition has to take into account their specific structure. Cormen et al. [6] describe an algorithm as follows: “Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output” ([6], p. 5). Although informal and sketchy, this description highlights three fundamental elements of an algorithm: its input, procedure, and output. We argue that a sound characterization of epistemic opacity (and epistemic transparency) for algorithms has to take these elements into account. Our proposal is as follows:

Definition 2.1 (Epistemological). An algorithm is epistemically opaque relative to an epistemic agent A at time t just in case at t, A does not have an epistemic justification for I, or an epistemic justification for P, or an epistemic justification for O; where I, P, O express the algorithm's input, procedure, and output, respectively.

One important feature of the previous definition is that the components I, P, and O of the algorithm are, at least in principle, independent. As a consequence, the lack of epistemic justification for any one component constitutes a sufficient condition for epistemic opacity.

Let us clarify the definition via an example of a specific facial recognition algorithm used in AI. Convolutional neural networks (CNNs, see [7]) are a biologically inspired class of neural networks used in facial recognition. CNNs consist of three kinds of layers: an input layer, an output layer, and several hidden layers (e.g., convolutional, pooling, and fully connected layers). Following Definition 2.1, a CNN can be opaque for three different reasons. First, an agent using the model might not know the set of images used to train the neural network. This input opacity is external to the procedure of the algorithm and depends on choices made by the algorithm's designers, which are not necessarily accessible to the user. Thus, input opacity is not about a single image inserted by the user, but rather about the relationship established between this image and the dataset on which the algorithm is trained. Second, the agent using the model might not have epistemic access to the procedures of the hidden layers. For instance, she may lack the information about what a convolution operation is. This procedure opacity is internal to the algorithm and presupposes some form of epistemic access to the inner workings of the algorithm. Finally, the epistemic agent using the algorithm might notice that the output does not fit with her current set of beliefs. For instance, she knows that two faces are the same, even though the output of the CNN claims otherwise. This output opacity is external to the procedure of the algorithm and arises when the user compares the output with her previously acquired beliefs. In principle, the three conditions can occur separately, but in most real-world algorithms these forms of opacity are entangled, which raises the complexity of the epistemic opacity problem. On the basis of this characterization, in the vast majority of cases, the algorithms with which we interact on a daily basis are epistemically opaque. In what follows, we introduce a formal framework to reason about an agent's epistemic attitudes towards opaque algorithms and investigate what conditions should be met to achieve epistemic transparency.
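To make the disjunctive structure of Definition 2.1 explicit, the following minimal Python sketch encodes the agent's justification status for I, P, and O as three boolean flags. The names (Justification, epistemically_opaque) are purely illustrative and not part of the paper's framework.

```python
from dataclasses import dataclass

@dataclass
class Justification:
    """The agent's epistemic justification status for the three components."""
    input_justified: bool       # justification for I
    procedure_justified: bool   # justification for P
    output_justified: bool      # justification for O

def epistemically_opaque(j: Justification) -> bool:
    # Definition 2.1: lacking justification for ANY single component is
    # sufficient for epistemic opacity.
    return not (j.input_justified and j.procedure_justified and j.output_justified)

# Procedure opacity alone (e.g., no access to the CNN's hidden layers)
# already renders the algorithm opaque.
print(epistemically_opaque(Justification(True, False, True)))  # True
```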
3. A formal framework

Definition 2.1 can be considered a tentative epistemological characterization of epistemic opacity. However, to reason formally about the epistemic attitudes of an agent toward opacity, one needs to recast that definition in logical terms. To do so, we borrow some tools from epistemic logic and match each component of Definition 2.1 with an epistemic modality. Roughly, the fact that an agent has an epistemic justification for I can be logically represented by 𝐾𝜙, that is, the agent “knows the input 𝜙” of the algorithm. The fact that an agent has an epistemic justification for P can be represented by □𝜙, that is, the agent “has evidence for the procedure 𝜙”. Finally, the fact that an agent has an epistemic justification for O can be represented by 𝐵𝜙, that is, the agent “believes in the output 𝜙”. Let us now present this formal framework more carefully.

The semantics we are proposing is a modification of the neighborhood semantics for evidence logic provided by van Benthem et al. [1]. The main difference between our proposal and [1] is that we need to fix three separate domains: for inputs, for procedures, and for outputs. For inputs we fix a domain consisting of variables 𝑎, 𝑏, 𝑐, ... ∈ 𝐴𝑡_in. These variables denote data inserted into the algorithm. In practice, the data can take the form of sentences, pictures, diagrams, etc. For this reason, 𝑎, 𝑏, 𝑐, ... do not designate only propositions. For procedures we fix a domain consisting of variables 𝛼, 𝛽, 𝛾, ... ∈ 𝐴𝑡_pr. These variables denote procedures used by the algorithm. As procedures are not just propositions, 𝛼, 𝛽, 𝛾, ... are not just propositional variables. For outputs we fix a domain of propositional variables 𝑝, 𝑞, 𝑟, ... ∈ 𝐴𝑡_out, assuming that the outputs of algorithms always take propositional form. The language ℒ is defined as follows:

𝜙 ∶= 𝑝 | 𝛼 | 𝑎 | ¬𝜙 | 𝜙 ∧ 𝜙 | 𝐵𝜙 | □𝜙 | 𝐾𝜙

where 𝑝 ∈ 𝐴𝑡_out, 𝛼 ∈ 𝐴𝑡_pr, 𝑎 ∈ 𝐴𝑡_in, and 𝜙 in 𝐵𝜙 is defined on the domain 𝐴𝑡_out, 𝜙 in □𝜙 is defined on the domain 𝐴𝑡_pr, and 𝜙 in 𝐾𝜙 is defined on the domain 𝐴𝑡_in.

The intended interpretation of 𝐵𝜙 is “the agent believes in the output 𝜙,” where 𝜙 stands for a proposition. The interpretation of □𝜙 is “the agent has evidence for procedure 𝜙.” By “having evidence for a procedure” we mean that the agent has an understanding of a particular way of producing an output based on a given input. The interpretation of 𝐾𝜙 is “the agent knows the data 𝜙,” where by “knowing the data” we mean that the agent is aware of the inputs of the algorithm. From this perspective, we are not dealing with propositional factive knowledge in this case. For instance, an agent may know the data used by the algorithm as an input even if these data are incorrect. In what follows, we indicate by 𝐴𝑡 the union of 𝐴𝑡_out, 𝐴𝑡_pr, and 𝐴𝑡_in, and we add a superscript to variables 𝑝^𝑥, 𝛼^𝑥, 𝑎^𝑥 in order to mark them, respectively, as the output, the procedure, and the input of the same algorithm 𝑥.

In order to interpret the language ℒ we use the evidence models introduced by van Benthem et al. [1].

Definition 3.1. An evidence model is a tuple ℳ = ⟨𝑊, 𝐸, 𝑉⟩, where 𝑊 is a non-empty set of worlds, 𝐸 ⊆ 𝑊 × 𝔭(𝑊) is an evidence relation, and 𝑉 ∶ 𝐴𝑡 → 𝔭(𝑊) is a valuation function. We write 𝐸(𝑤) for the set {𝑋 | 𝑤𝐸𝑋}. Two constraints are imposed on the evidence sets: for each 𝑤 ∈ 𝑊, ∅ ∉ 𝐸(𝑤) and 𝑊 ∈ 𝐸(𝑤).
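As a reading aid, here is a minimal Python sketch of Definition 3.1 for finite models. The class name and the convention of defaulting the evidence family of unlisted worlds to {𝑊} are our own illustrative assumptions, not part of [1].

```python
class EvidenceModel:
    """A finite evidence model M = <W, E, V> in the sense of Definition 3.1."""

    def __init__(self, worlds, E, V):
        self.W = frozenset(worlds)
        # E(w): the family of evidence sets available at w; worlds without an
        # explicit entry default to {W} (an illustrative convenience).
        self.E = {w: [frozenset(X) for X in E.get(w, [self.W])] for w in self.W}
        # V maps every atomic variable (input, procedure, or output atom)
        # to the set of worlds where it holds.
        self.V = {atom: frozenset(ws) for atom, ws in V.items()}
        # The two constraints on evidence sets from Definition 3.1.
        for w in self.W:
            assert frozenset() not in self.E[w], "empty evidence is excluded"
            assert self.W in self.E[w], "W itself must be available as evidence"
```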
Definition 3.2. A 𝑤-scenario is a maximal collection 𝒳 ⊆ 𝐸(𝑤) that has the finite intersection property: for each finite subfamily {𝑋_1, ..., 𝑋_𝑛} ⊆ 𝒳, ∩_{1≤𝑖≤𝑛} 𝑋_𝑖 ≠ ∅.

Definition 3.3. Let ℳ = ⟨𝑊, 𝐸, 𝑉⟩ be an evidence model. Truth of a formula 𝜙 ∈ ℒ is defined as follows:
• ℳ, 𝑤 ⊧ 𝑝 iff 𝑤 ∈ 𝑉(𝑝);
• ℳ, 𝑤 ⊧ 𝛼 iff 𝑤 ∈ 𝑉(𝛼);
• ℳ, 𝑤 ⊧ 𝑎 iff 𝑤 ∈ 𝑉(𝑎);
• ℳ, 𝑤 ⊧ ¬𝜙 iff ℳ, 𝑤 ⊭ 𝜙;
• ℳ, 𝑤 ⊧ 𝜙 ∧ 𝜓 iff ℳ, 𝑤 ⊧ 𝜙 and ℳ, 𝑤 ⊧ 𝜓;
• ℳ, 𝑤 ⊧ □𝜙 iff there exists 𝑋 such that 𝑤𝐸𝑋 and for all 𝑣 ∈ 𝑋, ℳ, 𝑣 ⊧ 𝜙;
• ℳ, 𝑤 ⊧ 𝐵𝜙 iff for each 𝑤-scenario 𝒳 and for all 𝑣 ∈ ∩𝒳, ℳ, 𝑣 ⊧ 𝜙;
• ℳ, 𝑤 ⊧ 𝐾𝜙 iff for all 𝑣 ∈ 𝑊, ℳ, 𝑣 ⊧ 𝜙.

Satisfiability and validity are defined as usual. The main peculiarity of our approach lies in distinguishing three types of domains for variables and restricting the application of each modal operator to its corresponding domain. Technically, we have defined the same valuation function for all three types of variables, and the evidence sets are defined so that they can contain any type of variable. Conceptually, however, the satisfiability of each type of variable in a world represents a different situation. In particular, ℳ, 𝑤 ⊧ 𝑝 means that the proposition 𝑝 is true in a world 𝑤, which is standard. By ℳ, 𝑤 ⊧ 𝛼 we mean that the procedure 𝛼 belongs to the world 𝑤. This, in turn, by the definition of ¬, means that ℳ, 𝑤 ⊧ ¬𝛼 should be interpreted as the fact that the procedure 𝛼 does not belong to the world 𝑤. Similarly, ℳ, 𝑤 ⊧ 𝑎 means that 𝑎 belongs to the world 𝑤, and ℳ, 𝑤 ⊧ ¬𝑎 means that 𝑎 does not belong to the world 𝑤.

These considerations are useful for understanding the definitions of 𝐵, □, and 𝐾. In particular, ℳ, 𝑤 ⊧ 𝐵𝜙 means that, after considering all the evidence for and against 𝜙, the truth of 𝜙 is consistent with this evidence. The condition for ℳ, 𝑤 ⊧ □𝜙 states that an agent has evidence for a procedure 𝜙 iff 𝜙 is present in some evidence set available in 𝑤. Notice that it is possible for an agent to have evidence both for 𝜙 and for ¬𝜙. This is in line with our informal reading, because an agent can have evidence that a procedure is applicable in one context but inapplicable in another. Finally, ℳ, 𝑤 ⊧ 𝐾𝜙 means that the data is present in all worlds considered by the agent, i.e., the agent has full access to the data in all contexts.

On the basis of the previous definitions, we are able to define opaque algorithms semantically. We do not need to define an algorithm within our semantics; rather, we take it as a system 𝑥 comprising all the inputs, procedures, and outputs associated with 𝑥. Following Definition 2.1, an opaque algorithm (𝑂𝑥) is one for which the agent lacks a justification for at least one of its components. We can now rephrase this definition in logical terms.

Definition 3.4 (Logical). An algorithm 𝑥 is epistemically opaque relative to an epistemic agent in a world 𝑤 of ℳ if: ℳ, 𝑤 ⊧ 𝑂𝑥 iff ℳ, 𝑤 ⊧ ¬𝐾𝜙_1^𝑥 ∨ ¬□𝜙_2^𝑥 ∨ ¬𝐵𝜙_3^𝑥.

Now we are able to provide semantic models representing transparent and opaque algorithms.

Example 3.1 (Transparent algorithm). Let ℳ = ⟨𝑊, 𝐸, 𝑉⟩ be such that 𝑊 = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝐸(𝑤) = {{𝑤, 𝑣_1, 𝑣_2, 𝑣_3}}, and 𝑉(𝑎^𝑥) = 𝑉(𝛼^𝑥) = 𝑉(𝑝^𝑥) = 𝑊. Clearly, in this model we have ℳ, 𝑤 ⊧ 𝐾𝑎^𝑥, ℳ, 𝑤 ⊧ □𝛼^𝑥, and ℳ, 𝑤 ⊧ 𝐵𝑝^𝑥. Thus, ℳ, 𝑤 ⊧ ¬𝑂𝑥.
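The truth conditions of Definition 3.3 can be prototyped on top of the EvidenceModel sketch above. For finite models the finite intersection property of Definition 3.2 reduces to a non-empty intersection of the whole subfamily, which the hypothetical scenarios helper exploits; formulas are represented as nested tuples, a purely illustrative encoding.

```python
from itertools import combinations

def scenarios(model, w):
    """All w-scenarios: maximal subfamilies of E(w) with non-empty intersection
    (equivalent to the finite intersection property when E(w) is finite)."""
    family = model.E[w]
    candidates = [set(sub) for r in range(1, len(family) + 1)
                  for sub in combinations(family, r)
                  if frozenset.intersection(*sub)]
    return [c for c in candidates if not any(c < d for d in candidates)]

def holds(model, w, phi):
    """Definition 3.3, with formulas given as atoms or tuples such as ('K', 'a_x')."""
    if isinstance(phi, str):                       # atomic p, alpha, or a
        return w in model.V[phi]
    op = phi[0]
    if op == 'not':
        return not holds(model, w, phi[1])
    if op == 'and':
        return holds(model, w, phi[1]) and holds(model, w, phi[2])
    if op == 'box':                                # evidence for a procedure
        return any(all(holds(model, v, phi[1]) for v in X) for X in model.E[w])
    if op == 'B':                                  # belief in an output
        return all(holds(model, v, phi[1])
                   for sc in scenarios(model, w)
                   for v in frozenset.intersection(*sc))
    if op == 'K':                                  # knowledge of the input data
        return all(holds(model, v, phi[1]) for v in model.W)
    raise ValueError(f"unknown formula: {phi!r}")

# Example 3.1: the transparent model satisfies all three modal claims at w.
W = {'w', 'v1', 'v2', 'v3'}
M1 = EvidenceModel(W, {'w': [W]}, {'a_x': W, 'alpha_x': W, 'p_x': W})
assert all(holds(M1, 'w', f) for f in [('K', 'a_x'), ('box', 'alpha_x'), ('B', 'p_x')])
```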
Example 3.2 (Opaque algorithm - 1). In this example we provide a model of an algorithm whose opacity is due to the lack of epistemic justification for the input, in the presence of epistemic justifications for the procedure and the output. Let ℳ = ⟨𝑊, 𝐸, 𝑉⟩ be such that 𝑊 = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝐸(𝑤) = {{𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, {𝑤, 𝑣_1, 𝑣_2}}, and 𝑉(𝑎^𝑥) = {𝑤, 𝑣_1, 𝑣_3}, 𝑉(𝛼^𝑥) = {𝑤, 𝑣_1, 𝑣_2}, 𝑉(𝑝^𝑥) = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}. In this model we have ℳ, 𝑤 ⊧ ¬𝐾𝑎^𝑥, because ℳ, 𝑣_2 ⊧ ¬𝑎^𝑥; and we have ℳ, 𝑤 ⊧ □𝛼^𝑥 and ℳ, 𝑤 ⊧ 𝐵𝑝^𝑥. Thus, ℳ, 𝑤 ⊧ 𝑂𝑥.

Example 3.3 (Opaque algorithm - 2). Now we model a situation in which an algorithm is opaque due to the lack of epistemic justification for the procedure, in the presence of epistemic justifications for the input and the output. Let ℳ = ⟨𝑊, 𝐸, 𝑉⟩ be such that 𝑊 = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝐸(𝑤) = {{𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, {𝑤, 𝑣_1, 𝑣_2}}, and 𝑉(𝑎^𝑥) = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝑉(𝛼^𝑥) = {𝑣_1}, 𝑉(𝑝^𝑥) = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}. In this model we have ℳ, 𝑤 ⊧ ¬□𝛼^𝑥, because for all 𝑋 such that 𝑤𝐸𝑋 there exists 𝑣 ∈ 𝑋 such that ℳ, 𝑣 ⊭ 𝛼^𝑥; and we have ℳ, 𝑤 ⊧ 𝐾𝑎^𝑥 and ℳ, 𝑤 ⊧ 𝐵𝑝^𝑥. Thus, ℳ, 𝑤 ⊧ 𝑂𝑥.

Example 3.4 (Opaque algorithm - 3). Here we consider a model in which an algorithm is opaque because the agent lacks an epistemic justification for the output, in the presence of justifications for both the input and the procedure. Let ℳ = ⟨𝑊, 𝐸, 𝑉⟩ be such that 𝑊 = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝐸(𝑤) = {{𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, {𝑤, 𝑣_1, 𝑣_2}}, and 𝑉(𝑎^𝑥) = {𝑤, 𝑣_1, 𝑣_2, 𝑣_3}, 𝑉(𝛼^𝑥) = {𝑤, 𝑣_1, 𝑣_2}, 𝑉(𝑝^𝑥) = {𝑤, 𝑣_1, 𝑣_3}. We have ℳ, 𝑤 ⊧ ¬𝐵𝑝^𝑥, because 𝑣_2 ∈ ∩𝒳 and ℳ, 𝑣_2 ⊭ 𝑝^𝑥; and we have ℳ, 𝑤 ⊧ 𝐾𝑎^𝑥 and ℳ, 𝑤 ⊧ □𝛼^𝑥. Thus, ℳ, 𝑤 ⊧ 𝑂𝑥.
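Under the same illustrative encoding, the three opaque models can be checked against Definition 3.4: in each of them exactly one of the three modal claims fails at 𝑤, so 𝑂𝑥 holds. The opaque helper below is again a hypothetical convenience, not part of the formal system.

```python
def opaque(model, w, a='a_x', alpha='alpha_x', p='p_x'):
    """Definition 3.4: O x holds iff at least one of K a^x, box alpha^x, B p^x fails."""
    return (not holds(model, w, ('K', a))
            or not holds(model, w, ('box', alpha))
            or not holds(model, w, ('B', p)))

W = {'w', 'v1', 'v2', 'v3'}
E = {'w': [W, {'w', 'v1', 'v2'}]}   # the evidence family shared by Examples 3.2-3.4

M2 = EvidenceModel(W, E, {'a_x': {'w', 'v1', 'v3'}, 'alpha_x': {'w', 'v1', 'v2'}, 'p_x': W})
M3 = EvidenceModel(W, E, {'a_x': W, 'alpha_x': {'v1'}, 'p_x': W})
M4 = EvidenceModel(W, E, {'a_x': W, 'alpha_x': {'w', 'v1', 'v2'}, 'p_x': {'w', 'v1', 'v3'}})

# Input, procedure, and output opacity respectively; all three models are opaque at w.
assert all(opaque(M, 'w') for M in (M2, M3, M4))
```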
4. Conclusion and future work

We provided an original epistemological definition of algorithmic opacity based on a tripartite analysis of algorithms. On the basis of this definition, we introduced a formal framework that allows one to analyze the epistemic attitudes of an agent towards a possibly opaque algorithm. The transition from the epistemological to the formal framework is made by respecting the tripartite structure of I, P, and O and by attributing to them the 𝐾, □, and 𝐵 modalities, respectively. Let us call this analysis the IPO model of algorithmic opacity.

In future work, we aim to deepen the IPO model, both from an epistemological and from a formal perspective. Regarding the former, in the literature on algorithmic opacity it is often pointed out that one of the major difficulties in analyzing opacity is its multilayered nature. Some authors have proposed taxonomies of different forms of epistemic opacity. For instance, Burrell [8] distinguishes between intentional, illiterate, and intrinsic opacity. From this perspective, we intend to compare the epistemological definition we introduced with the forms of opacity analyzed in the literature, in order to understand whether our definition is general enough to encompass all possible forms of opacity. From a formal point of view, we have adapted the evidence models to provide a general semantic framework to reason about an agent's epistemic attitudes toward an algorithm. The next natural step in this investigation is to introduce a logical system for reasoning about opacity and prove its completeness with respect to the evidence models. Moreover, other approaches for dealing with the notion of evidence have recently been proposed, for instance by Carnielli and Rodrigues [9] and Artemov [10]. Carnielli and Rodrigues [9] provide an interpretation of paraconsistent and paracomplete logics in terms of evidence. Even though this reading permits one to deal with possibly inconsistent evidence, which is also possible using the semantics we adopted, the relation between evidence and justification is less straightforward in this logical framework. For this reason, we consider our approach more promising for the aims of the current work. Artemov [10] introduced a logic of justification by supplementing the standard modalities of epistemic and doxastic logic with explicit terms, which provide the reasons for believing a proposition. It seems that the IPO model can also be represented in this framework. We leave the task of adapting the logic of justification to the analysis of opaque algorithms for future investigations.

Acknowledgments

The authors acknowledge the support of the Project PRIN2020 BRIO - Bias, Risk and Opacity in AI (2020SSKZ7R), awarded by the Italian Ministry of University and Research (MUR). The research of Ekaterina Kubyshkina is funded under the “Foundations of Fair and Trustworthy AI” Project of the University of Milan. Mattia Petrolo gratefully acknowledges the support of the French National Research Agency (ANR) through the Project ANR-20-CE27-0004.

References

[1] J. van Benthem, D. Fernández-Duque, E. Pacuit, Evidence logic: A new look at neighborhood structures, Advances in Modal Logic (2012).
[2] N. Seaver, Knowing algorithms, in: J. Vertesi, D. Ribes (Eds.), Digital STS: A Field Guide, Princeton University Press, 2019, pp. 412–422.
[3] J. Durán, N. Formanek, Grounds for trust: Essential epistemic opacity and computational reliabilism, Minds and Machines 28 (2018) 645–666.
[4] P. W. Humphreys, The philosophical novelty of computer simulation methods, Synthese 169 (2009) 615–626.
[5] E. Gettier, Is justified true belief knowledge?, Analysis 23 (1963) 121–123.
[6] T. Cormen, C. E. Leiserson, R. Rivest, C. Stein, Introduction to Algorithms, MIT Press, Cambridge, MA, 2009.
[7] Y. LeCun, Y. Bengio, Convolutional networks for images, speech, and time series, in: M. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, 2nd ed., The MIT Press, 1998, pp. 255–258.
[8] J. Burrell, How the machine ‘thinks’: Understanding opacity in machine learning algorithms, Big Data & Society 2 (2016) 1–12.
[9] W. Carnielli, A. Rodrigues, An epistemic approach to paraconsistency: a logic of evidence and truth, Synthese 196 (2019) 3789–3813.
[10] S. Artemov, The logic of justification, The Review of Symbolic Logic 1 (2008) 477–513.