Proof-checking bias in labeling methods

Giuseppe Primiero¹,∗, Fabio D'Asaro²
¹ Logic, Uncertainty, Computation and Information Group, Department of Philosophy, University of Milan, Via Festa del Perdono, 7, Milano, 20122, Italy
² Ethos Group, Department of Human Sciences, University of Verona, Via S. Francesco, 22, Verona, 37129, Italy

BEWARE: Joint BRIO, ME&E and AWARE Workshop @ AIxIA 2022, November 28 - December 2, 2022, University of Udine, Udine, Italy
∗ Corresponding author: giuseppe.primiero@unimi.it (G. Primiero); fabioaurelio.dasaro@univr.it (F. D'Asaro)

Abstract
We introduce a typed natural deduction system designed to formally verify the presence of bias in automatic labeling methods. The system relies on a "data-as-terms" and "labels-as-types" interpretation of formulae, with derivability contexts encoding probability distributions on training data. Bias is understood as the divergence that the expected probabilistic labeling by a classifier trained on opaque data displays from the fairness constraints set by a transparent dataset.

Keywords
Bias, Classifiers, Formal Verification

1. Introduction

The formal verification of AI systems is a growing field of research, and its applications are essential to the deployment of safe systems in several domains with high societal impact. This theme is attracting more and more attention from the community at large, not just from formal verification specialists, see e.g. the recent [1]. Essential to this task is reasoning about probabilistic processes, in order to accommodate the uncertainty and non-deterministic behavior characteristic of AI systems, and to model desirable forms of accuracy and trustworthiness. This objective can be formulated in terms of formal methods to prove trustworthy properties of interest, see e.g. [2, 3, 4]; or with linear temporal logic properties defined over Markov decision processes, e.g. with reinforcement learning methods [5] or with imprecise probabilities [6]. A different approach consists in exploiting a proof-theoretic semantics and the tradition of lambda-calculi and the Curry-Howard isomorphism to interpret computational processes as proof terms and output values as their types. Under the general heading of proof-checking techniques, we can identify untyped λ-calculi [7, 8, 9], probabilistic event λ-calculi [10], calculi for Bayesian learning [11], and calculi with types or natural deduction systems [12, 13, 14]. In this latter tradition, the calculus TPTND [15, 16] offers a novel interpretation of a probabilistic typed natural deduction system specifically designed to formally check the post-hoc validity of a computational process executed under a possibly opaque distribution of outputs against a transparent and desirable model. Here trustworthiness is interpreted as an admissible distance between the probability assigned to an observed output and the probability that the intended behavior should display. The threats that AI systems pose to reliable and safe knowledge are varied and, among these, the potential bias of such systems is a rising phenomenon.
Dating back to the analysis of bias in deterministic systems (see e.g. [17]), several recent studies have shown that training and evaluation data for classification algorithms are often biased. [18] illustrates algorithmic discrimination based on gender and race; [19] lists different types of bias for data and algorithms; [20] shows that the most well-known AI datasets are full of labeling errors. From a formal viewpoint, the trustworthiness of an AI system may also be considered from the point of view of the absence of bias. Given an automated labeling method, one might ask whether it displays any bias and, if so, whether its level stays below a maximum threshold considered admissible. Under this perspective, a new task is the design of systems that make it possible to formally verify the absence of bias, or its presence within limits considered admissible in the application domain. Consider as an example the potential of automatic AI-based classification in recruiting, where the aim is to automatically select the best applicants out of huge pools of résumés. Notoriously this is not without risks, as shown by the infamous Amazon case, where the selection process was not gender-neutral because the models were trained on patterns in resumes submitted over a 10-year period in which most applicants were men, thereby reflecting and consolidating male dominance in the tech industry [21]. An important task is therefore to devise methods where gender balance (or similar sensitive attributes) is treated as a safety property to be automatically checked.

In this paper we present TPTND-KL, a system re-interpreting TPTND to reason about the labeling of random variables and datapoints, with the task of inferring the presence of bias in classification methods. The underlying interpretation of formulas can be dubbed "datapoints-as-terms" and "labels-as-types". In this context, we understand bias as a measure of the divergence (relative entropy) between a given opaque probabilistic assignment of labels to a datapoint and the results given by a distribution of labels assumed to be correct, or at least desirable. This interpretation is inspired by [22].

The rest of the paper is structured as follows. In Section 2 we introduce the language, stressing the novel interpretation with respect to the previous version of the system and the introduction of the new Bias operator. In Section 3 we reformulate the proof system under the novel isomorphism between syntactic terms and semantic interpretation. In Section 4 we extend the proof system with rules for checking the presence of bias, with particular attention to the novel distance measure chosen for this application. In Section 5 we provide a detailed example to show the checking method at work. We conclude with limitations of this approach and further topics of research.

2. Formal Preliminaries

The logic TPTND-KL is a typed natural deduction system built from the following syntax:

Definition 1 (Syntax).
    Variable Set  ::=  x | x_T | x̃_T | x*_T | ⟨X, X⟩ | fst(X) | snd(X) | [X]X | X.X | X.T
    Data Set      ::=  t | ⟨T, T⟩ | fst(T) | snd(T) | [X]T | T.T′ | Bias(T)
    Label Set     ::=  α | α_r | ⊥ | (α × β)_r | (α + β)_r | (α → β)[r′]_r
    Assignments   ::=  {} | C, x : α | C, x : α_a

It includes: (random) variables of the form x_t with an associated datapoint t, annotated with indicators for being the correctly labeled variable x* or the wrongly labeled variable x̃, and composed as pairs, first and second projection, abstraction and application (also with terms); terms t standing for datapoints, composed as pairs, first and second projection, abstraction (with respect to a variable of interest) and application, and taken as argument of a Bias operator; a label set with labels α decorating variables and terms and indexed by a real value r, composed as conjunction, disjunction and implication; finally, an assignment function which lists a probability distribution C of labels to variables.

Example formulas of TPTND-KL have the following form:

• x : α ⊢ y : β_0.1 : the random variable y has probability 0.1 of having label β, provided x has label α, i.e. P(y = β | x = α) = 0.1.
• Γ ⊢ p_100 : F̃_0.35 : under distribution Γ (e.g. of a given population), for 100 people one expects the label female to be assigned 35% of the time.
• Γ ⊢ p_100 : F_30 : for 100 people in our dataset the label female was assigned 30 times.
• ỹ_t : M_0.5, y*_t : F_0.35 ⊢ t : M_0.3 : the datapoint t is labeled M with probability 0.3, provided that t is wrongly labeled M with probability 0.5 and correctly labeled F with probability 0.35, i.e. P(t = M) = 0.3, P(y_t = M | ỹ_t ≠ M) = 0.5 and P(y_t = F | ỹ_t = F) = 0.35.

The main novelty of TPTND-KL (and what differentiates it from its ancestor system TPTND) is the interpretation of its formulae in the style of "datapoints-as-terms" and "labels-as-types", and the ability to express the probability of label assignment under a given distribution of labels for random variables, including true and false positive rates of labeling.
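To make the intended reading of these judgements concrete, the following is a minimal Python sketch (not part of TPTND-KL itself) of how a distribution context and the expected versus observed labeling judgements above could be represented; the names Judgment and GAMMA, and the value chosen for the label M, are our own illustrative assumptions.

from dataclasses import dataclass

@dataclass(frozen=True)
class Judgment:
    term: str        # a datapoint or population term, e.g. "p"
    size: int        # number of datapoints the term stands for
    label: str       # the assigned label, e.g. "F"
    value: float     # a probability (expected labeling) or a count (observed labeling)
    expected: bool   # True for an expected labeling (tilde), False for an observed one

# Gamma: a distribution assigning probabilities to the labels of a random variable
# (the value for M is an illustrative complement of the value for F).
GAMMA = {"F": 0.35, "M": 0.65}

# Gamma |- p_100 : F~_0.35 -- label F expected to be assigned 35% of the time to 100 people.
expected = Judgment(term="p", size=100, label="F", value=GAMMA["F"], expected=True)

# Gamma |- p_100 : F_30 -- label F actually assigned 30 times out of 100.
observed = Judgment(term="p", size=100, label="F", value=30, expected=False)

print(expected)
print(observed)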
3. Proof system

Judgements are defined through rules for the associated logical operator. The construction rules for probability distributions of the label set mimic those of TPTND [15], see Figure 1. Contexts are generated by judgements of the form ⊢ x : α_r and ⊢ x_t : α_r, expressing assignments of probabilities to labeled variables for given terms. An empty set {} represents the base case (rule "base") to define distributions. Random variables with given theoretical probability values assigned to labels can be added to a distribution as long as they respect the standard additivity requirement on probabilities ("extend"). Labels are intended as exclusive (a single label for a datapoint) and categorical (at least a label for each datapoint). To denote an unknown distribution ("unknown") we use the (transfinite) construction assigning to all values of interest the interval of all possible probability values, provided the additivity condition of "extend" is preserved. Such a construction expresses formally the probability distribution underlying an opaque classifier.

  (base)
      ───────────────────
      {} :: distribution

  (extend)
      Γ :: distribution    x : α_a ∉ Γ    Σ_{x : ρ_p ∈ Γ} p ≤ 1
      ──────────────────────────────────────────────────────────
      Γ, x : α_a :: distribution

  (unknown)
      α :: label   …   ω :: label
      ─────────────────────────────────────────────
      {x : α_[0,1], …, x : ω_[0,1]} :: distribution

Figure 1: Distribution construction rules.

  (label assignment)
      ──────────────────────────────────
      Γ, x*_t : α_a, x̃_t : β_b ⊢ t : α

  (I+)
      Γ, x_t : α_a, x_t : β_b ⊢ t : α
      ───────────────────────────────
      Γ ⊢ t : (α + β)

  (E+_R)
      Γ ⊢ t : (α + β)    Γ ⊢ t : (α → ⊥)
      ──────────────────────────────────
      Γ ⊢ t : β

  (E+_L)
      Γ ⊢ t : (α + β)    Γ ⊢ t : (β → ⊥)
      ──────────────────────────────────
      Γ ⊢ t : α

  (I→)
      Γ, x*_t : α_a ⊢ t : β
      ─────────────────────────
      Γ ⊢ [x*]t : (α → β)_a

  (E→)
      Γ ⊢ [x*]t : (α → β)_a    ⊢ t : α
      ────────────────────────────────
      Γ ⊢ t.[t : α] : β

Figure 2: Categorical labeling rules.

Under the interpretation of TPTND-KL, categorical label assignment for each datapoint is made under a probability distribution with true and false positive rates for exclusive labels, see Figure 2. The "label assignment" rule axiomatically declares that the datapoint t is classified as α under a distribution of labels which assigns true positive rate a for t to be α and false positive rate b for t to be β. This rule states the observed behavior of a classifier under a transparent confusion matrix for the available label set. In this version of our calculus, we do not include the joint probability of distinct labels under independent distributions. I+ allows us to express that a single datapoint is assigned exclusively one of two (or more) labels included in the same distribution; correspondingly, E+ extracts the unique label from such a disjoint set, when the other options have non-positive probability (⊥). Finally, → expresses the dependency of label assignment on feature satisfaction: if the label β for term t depends on the label α being correctly assigned to t with a certain probability a, then t has label (α → β) with probability a; from this, provided t is labeled α, the term t is labeled β. Consider, for example, the probability of being assigned the label female by a classifier depending on the probability that the feature long_hair be greater than some positive value (and in a bad classifier that probability might be 1).

The rules can now be generalised for the expected assignment of labels for a given population of interest, see Figure 3. Again, these rules are adapted from their counterparts in TPTND. The "expected labeling" rule axiomatically declares that under a transparent distribution of labels which assigns true positive rate a for t to be α and false positive rate b for t to be β, in a population of n datapoints t the label α is expected to be assigned with probability a (the expected probability being denoted by the tilde, ã). Note that this rule sets an expected assignment given a known probability distribution, which can be taken to be the one on which a classifier is trained: such expected probabilistic classification can be designed to embed specific fairness constraints, like demographic parity or equal opportunity. In the following, we will use this expected behavior as a constraint to check the fairness of a given unknown classifier. The "sampling" rule says that the frequency value f of label α appears for n datapoints t, where f is the number of cases in which α is actually assigned in those n cases by means of n occurrences of the "label assignment" rule. As a general case, the distribution under which t is labeled can be taken to be unknown; hence we consider the simple observation of a classifier at work without knowing its inner structure. This will turn out to be crucial in our bias checking rule.
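As an informal illustration of the side conditions just discussed, the following Python sketch mimics the additivity check of the "extend" rule and the frequency count of the "sampling" rule; the list-of-triples encoding of a distribution and the function names are our own assumptions, not part of the calculus.

def extend(distribution, var, label, prob):
    """Mimic the "extend" rule: add var : label_prob to a distribution, provided the
    probabilities already assigned to var, plus the new one, do not exceed 1
    (a natural reading of the additivity requirement)."""
    if any(v == var and l == label for (v, l, _) in distribution):
        raise ValueError(f"{var} : {label} already occurs in the distribution")
    if sum(p for (v, _, p) in distribution if v == var) + prob > 1:
        raise ValueError("additivity condition on probabilities violated")
    return distribution + [(var, label, prob)]

def sampling(assignments, label):
    """Mimic the "sampling" rule: f is the number of the n individual
    label-assignment steps in which the given label was actually assigned."""
    return sum(1 for a in assignments if a == label)

# Build a distribution step by step, e.g. {x : M_(1/3), x : F_(2/3)}.
gamma = extend(extend([], "x", "M", 1/3), "x", "F", 2/3)

# Ten observed "label assignment" steps, then one "sampling" step: here f = 7,
# i.e. the judgement Gamma |- t_10 : F_7.
print(sampling(["M", "F", "F", "M", "F", "F", "F", "M", "F", "F"], label="F"))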
The "update" rule serves the purpose of considering multiple classifications to render the change in frequency of any given label with different sets of members of a given population of interest. Rule I+ introduces disjunction of possible labels: intuitively, if under a distribution Γ a population of n elements t has assigned label α with an expected probability ã, and label β is assigned with expected probability b̃, then the expected probability of label α or label β is ã + b̃. By E+_R (respectively E+_L): if under a distribution Γ a population of n datapoints t is assigned label α or label β with expected probability c̃, and the former (respectively, the latter) label has probability ã (respectively, b̃), then in that population the latter (respectively, the former) label will be assigned with probability c̃ − ã (respectively c̃ − b̃). This formally expresses the categoricity of labeling. The rule I→ now says that: if label β is assigned with expected probability b̃ in a population of size n, provided that in that population label α is correctly assigned with probability a, then we express such a dependency with a term [x*]t_n which is assigned label (α → β) with probability b̃ provided a. The corresponding elimination E→ allows us to verify the expected probability of label β: it considers a term [x*]t_n labeled (α → β) with expected probability b̃ depending on the probability a of label α being assigned to a given element x (e.g. a feature), and when such labeling occurs with probability ã, it computes the expected probability ã · b̃ for the n datapoints in the population to be labeled β. Notice that this rule might be made invalid if constrained so that the expected frequency of α must be higher than a given threshold in order for the probability of β being assigned to be positive.

  (expected labeling)
      ───────────────────────────────────
      x*_t : α_a, x̃_t : β_b ⊢ t_n : α_ã

  (sampling)
      Γ ⊢ t¹ : α¹   …   Γ ⊢ tⁿ : αⁿ
      ─────────────────────────────      where f = |{α¹, …, αⁿ | αⁱ = α}|
      Γ ⊢ t_n : α_f

  (update)
      Γ ⊢ t_n : α_f    Γ ⊢ t_m : α_f′
      ──────────────────────────────────────────────
      Γ ⊢ t_{n+m} : α_{f·(n/(n+m)) + f′·(m/(n+m))}

  (I+)
      Γ ⊢ t_n : α_ã    Γ ⊢ t_n : β_b̃
      ──────────────────────────────
      Γ ⊢ t_n : (α + β)_{ã+b̃}

  (E+_L)
      Γ ⊢ t_n : (α + β)_c̃    Γ ⊢ t_n : α_ã
      ─────────────────────────────────────
      Γ ⊢ t_n : β_{c̃−ã}

  (E+_R)
      Γ ⊢ t_n : (α + β)_c̃    Γ ⊢ t_n : β_b̃
      ─────────────────────────────────────
      Γ ⊢ t_n : α_{c̃−b̃}

  (I→)
      Γ, x*_t : α_a ⊢ t_n : β_b̃
      ──────────────────────────────
      Γ ⊢ [x*]t_n : (α → β)_[a]b̃

  (E→)
      Γ ⊢ [x*]t_n : (α → β)_[a]b̃    y_u : α_a ⊢ u_n : α_ã
      ─────────────────────────────────────────────────────
      Γ ⊢ t_n.[u_n : α] : β_{ã·b̃}

Figure 3: Population labeling rules.

4. Checking for Bias

We now provide a set of introduction and elimination rules for the Bias operator ranging over a label assignment. Let us consider a binary label set L = {α, β}, a variable set X := {x_1, …, x_n} and a data set D := {t, u, v, …, z}. Consider now the expression ⊢ x̃_t : β_b, saying that β is the wrong label for the datapoint t with probability b: this therefore expresses the False Positive Rate for t having label β. And consider the expression ⊢ x*_t : α_a, saying that α is the correct label for the datapoint t with probability a: this therefore expresses the True Positive Rate for any datapoint t with label α. Here b = 1 − a. Now let us assume we have a probability distribution for L in which such information is available over a population, i.e.

    Γ, x* : α_a, x̃ : β_b :: distribution

Assume the function h implemented by a classifier using this distribution is a desirable one, meaning that the expected classification by an application of the rule "expected labeling" over the classes denoted by the available labels will behave following this distribution, which might reflect a fairness constraint we require on a protected group.
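Continuing the earlier sketch, and assuming a binary, exclusive and categorical label set, one such desirable distribution with true positive rate a and false positive rate b = 1 − a, together with the expected labeling it licenses over a population of n datapoints, could be encoded as follows; the class and attribute names are again our own illustrative choices.

from dataclasses import dataclass

@dataclass(frozen=True)
class FairBinaryDistribution:
    """Gamma, x* : alpha_a, x~ : beta_b :: distribution, for a binary label set."""
    alpha: str   # label whose true positive rate is recorded
    beta: str    # the alternative label
    tpr: float   # a = true positive rate for alpha

    @property
    def fpr(self):
        # b = 1 - a: the false positive rate for beta in the binary, exclusive case.
        return 1 - self.tpr

    def expected_labeling(self, n):
        """The "expected labeling" rule: over a population of n datapoints, the
        label alpha is expected with probability a (written t_n : alpha_a~)."""
        return {"population": n, "label": self.alpha, "expected_probability": self.tpr}

# A fair reference distribution in which F is correctly assigned with probability 2/3.
gamma = FairBinaryDistribution(alpha="F", beta="M", tpr=2/3)
print(gamma.expected_labeling(140))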
For completeness, we denote the desirable Γ satisfying our fairness constraints as

    Γ :: fair_distribution

  (IB)
      Δ ⊢ t : α_f    D_KL(Δ ∥ Γ) = ε > π
      ──────────────────────────────────
      Γ, Δ ⊢ Bias(t : α_f)

  (EB)
      Δ ⊢ Bias(u_n : α_f)
      ───────────────────────────────────────────────
      Γ, x*_u : α_{[0,1]−[a−ε(n), a+ε(n)]} ⊢ u_n : α_f

Figure 4: Bias checking rules.

Consider now the observation for a given datapoint t assigning label α with frequency f:

    Δ ⊢ t : α_f

which we might consider biased if the divergence of f under Δ from what is dictated by the constraint on the labels encoded in Γ is significant:

Definition 2 (Bias labeling). The assignment of label α shown with frequency f for a population of n datapoints t under the distribution generated by the possibly opaque training set Δ is biased if the divergence of f from the probability a generated by the correct label assigned by a transparent distribution Γ is greater than a given threshold considered appropriate for the labeling task at hand.

We now use the Kullback-Leibler divergence, which notoriously is a measure of how the probability distribution Δ differs from Γ, or the expected excess surprise from using Γ as a model when the actual distribution is Δ. Fixing a constraint that defines the fairness of Γ, and for each term t and label α common to Δ and Γ, we obtain:

    D_KL(Δ ∥ Γ) = Σ_{α ∈ Δ} (Δ ⊢ t : α_f) · log( (Δ ⊢ t : α_f) / (Γ, x*_t : α_a :: fair_distribution) )

Using this measure, we want to check whether its value surpasses some reference threshold π which is considered appropriate for the classification task. This means considering sufficiently low values of π for sensitive classifications, e.g. gender or other socially and culturally relevant properties. Definition 2 is then expressed by rule IB in Figure 4. In this rule ε is the sum of the divergence between Δ and Γ computed for each label α common to both. On the other hand, in the presence of a population of n datapoints where the frequency f of label α under distribution Δ is considered biased, rule EB infers a family of probability values in the interval [0, 1], excluding the interval [a − ε(n), a + ε(n)] for output α ∈ Γ, within which u_n : α_f can be correctly sampled from Δ. Note that this rule will not allow us to derive a fair value for label α, but rather determines the conditions of the distribution under which the current frequency can be considered admissible.

5. Example

Consider the context Γ = {x_t : M_{1/3}, x_t : F_{2/3}}, representing the reference distribution of gender in a population of interest in which 1/3 of the population is male and 2/3 is female. Assume now a classifier trained on some data that distributes according to Δ, which is inaccessible, but which we know is – or can assume to be – similar to the one described by Γ. The classifier on Δ, when used, displays the following behavior:

    Δ ⊢ t_140 : M_60        Δ ⊢ t_140 : F_80

that is, out of 140 datapoints, 60 are labelled as M and 80 are labelled as F. For readability we write Γ(M) = 1/3, Γ(F) = 2/3, Δ(M) = 60/140 = 3/7 and Δ(F) = 80/140 = 4/7. We want to check whether our classifier using Δ is fair with respect to the constraints set by Γ. We calculate the Kullback–Leibler divergence as follows:

    D_KL(Δ ∥ Γ) = Σ_{x ∈ {F,M}} Δ(x) · log₂(Δ(x) / Γ(x))                                    (1)
                = Δ(F) · log₂(Δ(F) / Γ(F)) + Δ(M) · log₂(Δ(M) / Γ(M))                       (2)
                = (4/7) · log₂((4/7) / (2/3)) + (3/7) · log₂((3/7) / (1/3)) ≈ 0.028         (3)

To determine when we would consider the classification produced over Δ biased, we need to set a threshold π. Let π = 0.2. Then, since D_KL(Δ ∥ Γ) < π, we can conclude that Γ, Δ ⊢ Fair(t_140 : M_60), i.e. that the classification of male individuals produced by a classifier under the distribution Δ is fair when compared against Γ.
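The check carried out in this example can be reproduced with a short Python sketch of the comparison behind rule IB: compute D_KL(Δ ∥ Γ) over the labels common to the two distributions and compare it with the threshold π. The dictionary encoding of Γ and Δ and the function names are our own illustrative choices.

from math import log2

def kl_divergence(delta, gamma):
    """D_KL(Delta || Gamma) = sum over common labels of Delta(x) * log2(Delta(x) / Gamma(x))."""
    return sum(delta[x] * log2(delta[x] / gamma[x]) for x in delta if x in gamma)

def biased(delta, gamma, pi):
    """Rule IB, informally: flag the labeling under Delta as biased iff its divergence
    from the fair reference Gamma exceeds the threshold pi."""
    return kl_divergence(delta, gamma) > pi

gamma = {"M": 1/3, "F": 2/3}          # Gamma :: fair_distribution
delta = {"M": 60/140, "F": 80/140}    # observed: Delta |- t_140 : M_60 and t_140 : F_80

print(round(kl_divergence(delta, gamma), 3))   # 0.028
print(biased(delta, gamma, pi=0.2))            # False, i.e. Gamma, Delta |- Fair(t_140 : M_60)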
To further illustrate this example, let us re-consider Equation (2), but this time assume that we do not know how many datapoints out of the 140 are labelled as M, i.e.

    Δ ⊢ t_140 : M_n        Δ ⊢ t_140 : F_{140−n}

Recall that Γ(M) = 1/3 and Γ(F) = 2/3, and consider that, since we are dealing with a binary classification task, Δ(F) = 1 − Δ(M). Then we may rewrite Equation (2) as

    D_KL(Δ ∥ Γ) = (1 − Δ(M)) · log₂( 3(1 − Δ(M)) / 2 ) + Δ(M) · log₂( 3 · Δ(M) )

and if we let Δ(M) vary in the interval [0, 1], we can plot D_KL(Δ ∥ Γ) as in Figure 5, which provides a more concrete illustration of which values are considered to be biased in our domain.

[Figure 5: The Kullback–Leibler divergence for a fixed reference distribution of 1/3 males and 2/3 females, and for a classifier that classifies a fraction Δ(M) of the population as male. If we set a threshold π = 0.2 (depicted as the red line in the figure), Δ(M) should belong to the interval [0.11, 0.59] in order to be considered unbiased (see the thick black line on the x axis). Note that, for the particular case where Δ(M) = 3/7 as in our previous example, the bias is in fact acceptable (see the teal dot on the x axis).]

6. Conclusion

In this paper we have introduced TPTND-KL, a natural deduction system whose formulas express the expected labeling of data, based on probability distributions of labels over training sets. We have shown how to model proof-theoretically a checking procedure to measure the distance that a given opaque classifier displays from the constraints set on the expected labeling by a transparent probabilistic distribution.

Our research leaves room for further developments. To start with, the example used throughout this paper is somewhat simplistic. A more precise characterization of the types of biases and of the fairness metrics that we can model within our language is required: for instance, one can take advantage of the use of the implication to build dependent variables to model more complex examples where multi-attribute biases and intersectionality occur. It is worth stressing here that currently our framework allows us to deal with fairness-of-outcome, as we can only compare the output distribution (i.e., the outcome of the classification process) with a reference distribution that we assume to be fair. It remains to be explored whether other definitions of fairness can be modelled without substantially modifying TPTND-KL. Other problems that we would like to tackle in a future version of this work are: adapting the framework to regression tasks, as at the moment it only works for classification, since the set of types is tacitly assumed to be finite; and investigating how the reference thresholds can be defined or extracted in less arbitrary ways. This latter point will require studying how the framework behaves with real data, and to this end an implementation of this work is very much needed. Another future step is the formulation of meta-theoretical results, like progress and termination on the syntactic transformations determined by our proof systems, to prove under which conditions a given measure of bias will never exceed a certain threshold; finally, the integration of the bias checking procedure developed for our system with other non-automatic parameters and information quality criteria would be desirable.
Acknowledgments

This research has been partially funded by the Project PRIN2020 BRIO - Bias, Risk and Opacity in AI (2020SSKZ7R) awarded by the Italian Ministry of University and Research (MUR). The author Fabio Aurelio D'Asaro acknowledges the funding and support of PON "Ricerca e Innovazione" 2014-2020 (PON R&I FSE-REACT EU), Azione IV.6 "Contratti di ricerca su tematiche Green" in the context of the project titled "Il miglioramento dell'algorithmic fairness in ambito assicurativo: Un'analisi concettuale a partire da uno studio di caso".

References

[1] S. A. Seshia, D. Sadigh, S. S. Sastry, Toward verified artificial intelligence, Commun. ACM 65 (2022) 46–55. URL: https://doi.org/10.1145/3503914. doi:10.1145/3503914.
[2] R. Alur, T. A. Henzinger, P.-H. Ho, Automatic symbolic verification of embedded systems, IEEE Trans. Softw. Eng. 22 (1996) 181–201. URL: https://doi.org/10.1109/32.489079. doi:10.1109/32.489079.
[3] M. Z. Kwiatkowska, G. Norman, D. Parker, Prism: Probabilistic symbolic model checker, in: Proceedings of the 12th International Conference on Computer Performance Evaluation, Modelling Techniques and Tools, TOOLS '02, Springer-Verlag, Berlin, Heidelberg, 2002, pp. 200–204.
[4] A. Termine, G. Primiero, F. A. D'Asaro, Modelling accuracy and trustworthiness of explaining agents, in: S. Ghosh, T. Icard (Eds.), Logic, Rationality, and Interaction - 8th International Workshop, LORI 2021, Xi'an, China, October 16-18, 2021, Proceedings, volume 13039 of Lecture Notes in Computer Science, Springer, 2021, pp. 232–245. URL: https://doi.org/10.1007/978-3-030-88708-7_19. doi:10.1007/978-3-030-88708-7_19.
[5] Q. Gao, D. Hajinezhad, Y. Zhang, Y. Kantaros, M. M. Zavlanos, Reduced variance deep reinforcement learning with temporal logic specifications, in: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 237–248. URL: https://doi.org/10.1145/3302509.3311053. doi:10.1145/3302509.3311053.
[6] A. Termine, A. Antonucci, G. Primiero, A. Facchini, Logic and model checking by imprecise probabilistic interpreted systems, in: A. Rosenfeld, N. Talmon (Eds.), Multi-Agent Systems - 18th European Conference, EUMAS 2021, Virtual Event, June 28-29, 2021, Revised Selected Papers, volume 12802 of Lecture Notes in Computer Science, Springer, 2021, pp. 211–227. URL: https://doi.org/10.1007/978-3-030-82254-5_13. doi:10.1007/978-3-030-82254-5_13.
[7] G. Bacci, R. Furber, D. Kozen, R. Mardare, P. Panangaden, D. S. Scott, Boolean-valued semantics for the stochastic λ-calculus, in: A. Dawar, E. Grädel (Eds.), Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09-12, 2018, ACM, 2018, pp. 669–678. URL: https://doi.org/10.1145/3209108.3209175. doi:10.1145/3209108.3209175.
[8] J. Borgström, U. D. Lago, A. D. Gordon, M. Szymczak, A lambda-calculus foundation for universal probabilistic programming, in: J. Garrigue, G. Keller, E. Sumii (Eds.), Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, ACM, 2016, pp. 33–46. URL: https://doi.org/10.1145/2951913.2951942. doi:10.1145/2951913.2951942.
[9] P. H. A. de Amorim, D. Kozen, R. Mardare, P. Panangaden,
M. Roberts, Universal semantics for the stochastic λ-calculus, in: 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, IEEE, 2021, pp. 1–12. URL: https://doi.org/10.1109/LICS52264.2021.9470747. doi:10.1109/LICS52264.2021.9470747.
[10] M. Antonelli, U. D. Lago, P. Pistone, Curry and Howard meet Borel, in: C. Baier, D. Fisman (Eds.), LICS '22: 37th Annual ACM/IEEE Symposium on Logic in Computer Science, Haifa, Israel, August 2 - 5, 2022, ACM, 2022, pp. 45:1–45:13. URL: https://doi.org/10.1145/3531130.3533361. doi:10.1145/3531130.3533361.
[11] F. Dahlqvist, D. Kozen, Semantics of higher-order probabilistic programs with conditioning, Proc. ACM Program. Lang. 4 (2020) 57:1–57:29. URL: https://doi.org/10.1145/3371125. doi:10.1145/3371125.
[12] A. D. Pierro, A type theory for probabilistic λ-calculus, in: From Lambda Calculus to Cybersecurity Through Program Analysis, 2020.
[13] M. Boričić, Sequent calculus for classical logic probabilized, Archive for Mathematical Logic 58 (2019) 119–136.
[14] S. Ghilezan, J. Ivetić, S. Kašterović, Z. Ognjanović, N. Savić, Probabilistic reasoning about simply typed lambda terms, in: International Symposium on Logical Foundations of Computer Science, Springer, 2018, pp. 170–189.
[15] F. A. D'Asaro, G. Primiero, Probabilistic typed natural deduction for trustworthy computations, in: D. Wang, R. Falcone, J. Zhang (Eds.), Proceedings of the 22nd International Workshop on Trust in Agent Societies (TRUST 2021) Co-located with the 20th International Conferences on Autonomous Agents and Multiagent Systems (AAMAS 2021), London, UK, May 3-7, 2021, volume 3022 of CEUR Workshop Proceedings, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-3022/paper3.pdf.
[16] F. A. D'Asaro, G. Primiero, Checking trustworthiness of probabilistic computations in a typed natural deduction system, CoRR abs/2206.12934 (2022). URL: https://doi.org/10.48550/arXiv.2206.12934. doi:10.48550/arXiv.2206.12934. arXiv:2206.12934.
[17] B. Friedman, H. Nissenbaum, Bias in computer systems, ACM Trans. Inf. Syst. 14 (1996) 330–347.
[18] J. Buolamwini, T. Gebru, Gender shades: Intersectional accuracy disparities in commercial gender classification, in: Conference on Fairness, Accountability and Transparency, FAT 2018, 23-24 February 2018, New York, NY, USA, volume 81, PMLR, 2018, pp. 77–91.
[19] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A survey on bias and fairness in machine learning, ACM Comput. Surv. 54 (2021) 115:1–115:35. URL: https://doi.org/10.1145/3457607. doi:10.1145/3457607.
[20] C. G. Northcutt, A. Athalye, J. Mueller, Pervasive label errors in test sets destabilize machine learning benchmarks, 2021. Preprint at https://arxiv.org/pdf/2103.14749.pdf.
[21] J. Dastin, Amazon scraps secret AI recruiting tool that showed bias against women, Reuters, 2018.
[22] H. Jiang, O. Nachum, Identifying and correcting label bias in machine learning, CoRR abs/1901.04966 (2019). URL: http://arxiv.org/abs/1901.04966. arXiv:1901.04966.