Proof-checking bias in labeling methods

Giuseppe Primiero¹,∗, Fabio D'Asaro²
¹ Logic, Uncertainty, Computation and Information Group, Department of Philosophy, University of Milan, Via Festa del Perdono, 7, Milano, 20122, Italy
² Ethos Group, Department of Human Sciences, University of Verona, Via S. Francesco, 22, Verona, 37129, Italy

BEWARE: Joint BRIO, ME&E and AWARE Workshop @ AIxIA 2022, November 28 - December 2, 2022, University of Udine, Udine, Italy
∗ Corresponding author: giuseppe.primiero@unimi.it (G. Primiero); fabioaurelio.dasaro@univr.it (F. D'Asaro)

Abstract
We introduce a typed natural deduction system designed to formally verify the presence of bias in automatic labeling methods. The system relies on a "data-as-terms" and "labels-as-types" interpretation of formulae, with derivability contexts encoding probability distributions on training data. Bias is understood as the divergence that the expected probabilistic labeling by a classifier trained on opaque data displays from the fairness constraints set by a transparent dataset.

Keywords
Bias, Classifiers, Formal Verification

1. Introduction

The formal verification of AI systems is a growing field of research, and its applications are essential to the deployment of safe systems in several domains with high societal impact. This theme is attracting more and more attention from the community at large, not just from formal verification specialists, see e.g. the recent [1]. Essential to this task is reasoning about probabilistic processes, in order to accommodate the uncertainty and non-deterministic behavior characteristic of AI systems, and to model desirable forms of accuracy and trustworthiness. This objective can be formulated in terms of formal methods to prove trustworthy properties of interest, see e.g. [2, 3, 4]; or with linear temporal logic properties defined over Markov decision processes, e.g. with reinforcement learning methods [5] or with imprecise probabilities [6]. A different approach consists in exploiting a proof-theoretic semantics and the tradition of lambda-calculi and the Curry-Howard isomorphism to interpret computational processes as proof terms and output values as their types. Under the general heading of proof-checking techniques, we can identify untyped λ-calculi [7, 8, 9], probabilistic event λ-calculi [10], calculi for Bayesian learning [11], and calculi with types or natural deduction systems [12, 13, 14]. In this latter tradition, the calculus TPTND [15, 16] offers a novel interpretation of a probabilistic typed natural deduction system specifically designed to formally check the post-hoc validity of a computational process executed under a possibly opaque distribution of outputs against a transparent and desirable model. Here trustworthiness is interpreted as an admissible distance between the probability assigned to an observed output and the probability that the intended behavior should display. The threats that AI systems pose to reliable and safe knowledge are varied and, among these, the potential bias of such systems is a rising phenomenon.
Dating back to the analysis of bias in deterministic systems (see e.g. [17]), several recent studies have shown that training and evaluation data for classification algorithms are often biased. [18] illustrates algorithmic discrimination based on gender and race; [19] lists different types of bias for data and algorithms; [20] shows that the most well-known AI datasets are full of labeling errors. From a formal viewpoint, the trustworthiness of an AI system may also be considered from the point of view of the absence of bias. Given an automated labeling method, one might ask whether it displays any bias and, if so, whether its level stays below a maximum threshold considered admissible. Under this perspective, a new task is the design of systems that make it possible to formally verify the absence of bias, or its presence within limits considered admissible in the application domain. Consider as an example the potential of automatic AI-based classification in recruiting, where the aim is to automatically select the best applicants out of huge pools of résumés. Notoriously this is not without risks, as shown by the infamous Amazon case, where the selection process was not gender-neutral because the models were trained on patterns in resumes submitted over a 10-year period in which most applicants were men, thereby reflecting and consolidating male dominance in the tech industry [21]. An important task is therefore to devise methods where gender balance (or similar sensitive attributes) is treated as a safety property to be automatically checked.

In this paper we present TPTND-KL, a system re-interpreting TPTND to reason about the labeling of random variables and datapoints, with the task of inferring the presence of bias in classification methods. The underlying interpretation of formulas can be dubbed "datapoints-as-terms" and "labels-as-types". In this context, we understand bias as a measure of the divergence (relative entropy) between a given opaque probabilistic assignment of labels to a datapoint and the results given by a distribution of labels assumed to be correct, or at least desirable. This interpretation is inspired by [22].

The rest of the paper is structured as follows. In Section 2 we introduce the language, stressing the novel interpretation with respect to the previous version of the system and the introduction of the new Bias operator. In Section 3 we reformulate the proof system under the novel isomorphism between syntactic terms and semantic interpretation. In Section 4 we extend the proof system with rules for checking the presence of bias, with particular attention to the novel distance measure chosen for this application. In Section 5 we provide a detailed example to show the checking method at work. We conclude with limitations of this approach and further topics of research.

2. Formal Preliminaries

The logic TPTND-KL is a typed natural deduction system built from the following syntax:

Definition 1 (Syntax).
    Variable Set  ::=  x | x_T | x̃_T | x*_T | ⟨X, X⟩ | fst(X) | snd(X) | [X]X | X.X | X.T
    Data Set      ::=  t | ⟨T, T⟩ | fst(T) | snd(T) | [X]T | T.T′ | Bias(T)
    Label Set     ::=  α | α_r | ⊥ | (α × β)_r | (α + β)_r | (α → β)[r′]_r
    Assignments   ::=  {} | C, x : α | C, x : α_a

It includes: (random) variables of the form x_t with an associated datapoint t, annotated with indicators for being the correctly labeled variable x* or the wrongly labeled variable x̃, and composed as pairs, first and second projection, abstraction and application (also with terms); terms t standing for datapoints, composed as pairs, first and second projection, abstraction (with respect to a variable of interest) and application, and taken as argument of a Bias operator; a label set with labels α decorating variables and terms and indexed by a real value r, composed as conjunction, disjunction and implication; finally, an assignment function which lists a probability distribution C of labels to variables.

Example formulas of TPTND-KL have the following form:

• x : α ⊢ y : β_0.1 : the random variable y has probability 0.1 of having label β, provided x has label α, i.e. P(y = β | x = α) = 0.1.
• Γ ⊢ p_100 : F̃_0.35 : under distribution Γ (e.g. of a given population), for 100 people one expects the label female to be assigned 35% of the time.
• Γ ⊢ p_100 : F_30 : for 100 people in our dataset the label female was assigned 30 times.
• ỹ_t : M_0.5, y*_t : F_0.35 ⊢ t : M_0.3 : the datapoint t is labeled M with probability 0.3, provided that t is wrongly labeled M with probability 0.5 and correctly labeled F with probability 0.35, i.e. P(t = M) = 0.3, P(y_t = M | ỹ_t ≠ M) = 0.5 and P(y_t = F | ỹ_t = F) = 0.35.

The main novelty of TPTND-KL (and what differentiates it from its ancestor system TPTND) is the interpretation of its formulae in the style of "datapoints-as-terms" and "labels-as-types", and the ability to express the probability of label assignment under a given distribution of labels for random variables, including true and false positive rates of labeling.
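To make the intended reading of these judgements concrete, the following is a minimal Python sketch (not part of TPTND-KL itself) of how a distribution context and the expected versus observed labeling judgements above could be represented; the names Judgment and GAMMA, and the value chosen for the label M, are our own illustrative assumptions.

from dataclasses import dataclass

@dataclass(frozen=True)
class Judgment:
    term: str        # a datapoint or population term, e.g. "p"
    size: int        # number of datapoints the term stands for
    label: str       # the assigned label, e.g. "F"
    value: float     # a probability (expected labeling) or a count (observed labeling)
    expected: bool   # True for an expected labeling (tilde), False for an observed one

# Gamma: a distribution assigning probabilities to the labels of a random variable
# (the value for M is an illustrative complement of the value for F).
GAMMA = {"F": 0.35, "M": 0.65}

# Gamma |- p_100 : F~_0.35 -- label F expected to be assigned 35% of the time to 100 people.
expected = Judgment(term="p", size=100, label="F", value=GAMMA["F"], expected=True)

# Gamma |- p_100 : F_30 -- label F actually assigned 30 times out of 100.
observed = Judgment(term="p", size=100, label="F", value=30, expected=False)

print(expected)
print(observed)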
3. Proof system

Judgements are defined through rules for the associated logical operator. The construction rules for probability distributions of the label set mimic those of TPTND [15], see Figure 1. Contexts are generated by judgements of the form ⊢ x : α_r and ⊢ x_t : α_r, expressing assignments of probabilities to labeled variables for given terms. An empty set {} represents the base case (rule "base") to define distributions. Random variables with given theoretical probability values assigned to labels can be added to a distribution as long as they respect the standard additivity requirement on probabilities ("extend"). Labels are intended as exclusive (a single label for a datapoint) and categorical (at least a label for each datapoint). To denote an unknown distribution ("unknown") we use the (transfinite) construction assigning to all values of interest the interval of all possible probability values, provided the additivity condition of "extend" is preserved. Such a construction expresses formally the probability distribution underlying an opaque classifier.

  (base)
      ───────────────────
      {} :: distribution

  (extend)
      Γ :: distribution    x : α_a ∉ Γ    Σ_{x : ρ_p ∈ Γ} p ≤ 1
      ──────────────────────────────────────────────────────────
      Γ, x : α_a :: distribution

  (unknown)
      α :: label   …   ω :: label
      ─────────────────────────────────────────────
      {x : α_[0,1], …, x : ω_[0,1]} :: distribution

Figure 1: Distribution construction rules.

  (label assignment)
      ──────────────────────────────────
      Γ, x*_t : α_a, x̃_t : β_b ⊢ t : α

  (I+)
      Γ, x_t : α_a, x_t : β_b ⊢ t : α
      ───────────────────────────────
      Γ ⊢ t : (α + β)

  (E+_R)
      Γ ⊢ t : (α + β)    Γ ⊢ t : (α → ⊥)
      ──────────────────────────────────
      Γ ⊢ t : β

  (E+_L)
      Γ ⊢ t : (α + β)    Γ ⊢ t : (β → ⊥)
      ──────────────────────────────────
      Γ ⊢ t : α

  (I→)
      Γ, x*_t : α_a ⊢ t : β
      ─────────────────────────
      Γ ⊢ [x*]t : (α → β)_a

  (E→)
      Γ ⊢ [x*]t : (α → β)_a    ⊢ t : α
      ────────────────────────────────
      Γ ⊢ t.[t : α] : β

Figure 2: Categorical labeling rules.

Under the interpretation of TPTND-KL, categorical label assignment for each datapoint is made under a probability distribution with true and false positive rates for exclusive labels, see Figure 2. The "label assignment" rule axiomatically declares that the datapoint t is classified as α under a distribution of labels which assigns true positive rate a for t to be α and false positive rate b for t to be β. This rule states the observed behavior of a classifier under a transparent confusion matrix for the available label set. In this version of our calculus, we do not include the joint probability of distinct labels under independent distributions. I+ allows us to express that a single datapoint is assigned exclusively one of two (or more) labels included in the same distribution; correspondingly, E+ extracts the unique label from such a disjoint set, when the other options have non-positive probability (⊥). Finally, → expresses the dependency of label assignment on feature satisfaction: if the label β for term t depends on the label α being correctly assigned to t with a certain probability a, then t has label (α → β) with probability a; from this, provided t is labeled α, the term t is labeled β. Consider, for example, the probability of being assigned the label female by a classifier depending on the probability that the feature long_hair be greater than some positive value (and in a bad classifier that probability might be 1).

The rules can now be generalised for the expected assignment of labels for a given population of interest, see Figure 3. Again, these rules are adapted from their counterparts in TPTND. The "expected labeling" rule axiomatically declares that under a transparent distribution of labels which assigns true positive rate a for t to be α and false positive rate b for t to be β, in a population of n datapoints t the label α is expected to be assigned with probability a (the expected probability being denoted by the tilde, ã). Note that this rule sets an expected assignment given a known probability distribution, which can be taken to be the one on which a classifier is trained: such expected probabilistic classification can be designed to embed specific fairness constraints, like demographic parity or equal opportunity. In the following, we will use this expected behavior as a constraint to check the fairness of a given unknown classifier. The "sampling" rule says that the frequency value f of label α appears for n datapoints t, where f is the number of cases in which α is actually assigned in those n cases by means of n occurrences of the "label assignment" rule. As a general case, the distribution under which t is labeled can be taken to be unknown; hence we consider the simple observation of a classifier at work without knowing its inner structure. This will turn out to be crucial in our bias checking rule.
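As an informal illustration of the side conditions just discussed, the following Python sketch mimics the additivity check of the "extend" rule and the frequency count of the "sampling" rule; the list-of-triples encoding of a distribution and the function names are our own assumptions, not part of the calculus.

def extend(distribution, var, label, prob):
    """Mimic the "extend" rule: add var : label_prob to a distribution, provided the
    probabilities already assigned to var, plus the new one, do not exceed 1
    (a natural reading of the additivity requirement)."""
    if any(v == var and l == label for (v, l, _) in distribution):
        raise ValueError(f"{var} : {label} already occurs in the distribution")
    if sum(p for (v, _, p) in distribution if v == var) + prob > 1:
        raise ValueError("additivity condition on probabilities violated")
    return distribution + [(var, label, prob)]

def sampling(assignments, label):
    """Mimic the "sampling" rule: f is the number of the n individual
    label-assignment steps in which the given label was actually assigned."""
    return sum(1 for a in assignments if a == label)

# Build a distribution step by step, e.g. {x : M_(1/3), x : F_(2/3)}.
gamma = extend(extend([], "x", "M", 1/3), "x", "F", 2/3)

# Ten observed "label assignment" steps, then one "sampling" step: here f = 7,
# i.e. the judgement Gamma |- t_10 : F_7.
print(sampling(["M", "F", "F", "M", "F", "F", "F", "M", "F", "F"], label="F"))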
The "update" rule serves the purpose of considering multiple classifications to render the change in frequency of any given label with different sets of members of a given population of interest. Rule I+ introduces disjunction of possible labels: intuitively, if under a distribution Γ a population of n elements t has assigned label α with an expected probability ã, and label β is assigned with expected probability b̃, then the expected probability of label α or label β is ã + b̃. By E+_R (respectively E+_L): if under a distribution Γ a population of n datapoints t is assigned label α or label β with expected probability c̃, and the former (respectively, the latter) label has probability ã (respectively, b̃), then in that population the latter (respectively, the former) label will be assigned with probability c̃ − ã (respectively c̃ − b̃). This formally expresses the categoricity of labeling. The rule I→ now says that: if label β is assigned with expected probability b̃ in a population of size n, provided that in that population label α is correctly assigned with probability a, then we express such a dependency with a term [x*]t_n which is assigned label (α → β) with probability b̃ provided a. The corresponding elimination E→ allows us to verify the expected probability of label β: it considers a term [x*]t_n labeled (α → β) with expected probability b̃ depending on the probability a of label α being assigned to a given element x (e.g. a feature), and when such labeling occurs with probability ã, it computes the expected probability ã · b̃ for the n datapoints in the population to be labeled β. Notice that this rule might be made invalid if constrained so that the expected frequency of α must be higher than a given threshold in order for the probability of β being assigned to be positive.

  (expected labeling)
      ───────────────────────────────────
      x*_t : α_a, x̃_t : β_b ⊢ t_n : α_ã

  (sampling)
      Γ ⊢ t¹ : α¹   …   Γ ⊢ tⁿ : αⁿ
      ─────────────────────────────      where f = |{α¹, …, αⁿ | αⁱ = α}|
      Γ ⊢ t_n : α_f

  (update)
      Γ ⊢ t_n : α_f    Γ ⊢ t_m : α_f′
      ──────────────────────────────────────────────
      Γ ⊢ t_{n+m} : α_{f·(n/(n+m)) + f′·(m/(n+m))}

  (I+)
      Γ ⊢ t_n : α_ã    Γ ⊢ t_n : β_b̃
      ──────────────────────────────
      Γ ⊢ t_n : (α + β)_{ã+b̃}

  (E+_L)
      Γ ⊢ t_n : (α + β)_c̃    Γ ⊢ t_n : α_ã
      ─────────────────────────────────────
      Γ ⊢ t_n : β_{c̃−ã}

  (E+_R)
      Γ ⊢ t_n : (α + β)_c̃    Γ ⊢ t_n : β_b̃
      ─────────────────────────────────────
      Γ ⊢ t_n : α_{c̃−b̃}

  (I→)
      Γ, x*_t : α_a ⊢ t_n : β_b̃
      ──────────────────────────────
      Γ ⊢ [x*]t_n : (α → β)_[a]b̃

  (E→)
      Γ ⊢ [x*]t_n : (α → β)_[a]b̃    y_u : α_a ⊢ u_n : α_ã
      ─────────────────────────────────────────────────────
      Γ ⊢ t_n.[u_n : α] : β_{ã·b̃}

Figure 3: Population labeling rules.

4. Checking for Bias

We now provide a set of introduction and elimination rules for the Bias operator ranging over a label assignment. Let us consider a binary label set L = {α, β}, a variable set X := {x_1, …, x_n} and a data set D := {t, u, v, …, z}. Consider now the expression ⊢ x̃_t : β_b, saying that β is the wrong label for the datapoint t with probability b: this therefore expresses the False Positive Rate for t having label β. And consider the expression ⊢ x*_t : α_a, saying that α is the correct label for the datapoint t with probability a: this therefore expresses the True Positive Rate for any datapoint t with label α. Here b = 1 − a. Now let us assume we have a probability distribution for L in which such information is available over a population, i.e.

    Γ, x* : α_a, x̃ : β_b :: distribution

Assume the function h implemented by a classifier using this distribution is a desirable one, meaning that the expected classification by an application of the rule "expected labeling" over the classes denoted by the available labels will behave following this distribution, which might reflect a fairness constraint we require on a protected group.
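Continuing the earlier sketch, and assuming a binary, exclusive and categorical label set, one such desirable distribution with true positive rate a and false positive rate b = 1 − a, together with the expected labeling it licenses over a population of n datapoints, could be encoded as follows; the class and attribute names are again our own illustrative choices.

from dataclasses import dataclass

@dataclass(frozen=True)
class FairBinaryDistribution:
    """Gamma, x* : alpha_a, x~ : beta_b :: distribution, for a binary label set."""
    alpha: str   # label whose true positive rate is recorded
    beta: str    # the alternative label
    tpr: float   # a = true positive rate for alpha

    @property
    def fpr(self):
        # b = 1 - a: the false positive rate for beta in the binary, exclusive case.
        return 1 - self.tpr

    def expected_labeling(self, n):
        """The "expected labeling" rule: over a population of n datapoints, the
        label alpha is expected with probability a (written t_n : alpha_a~)."""
        return {"population": n, "label": self.alpha, "expected_probability": self.tpr}

# A fair reference distribution in which F is correctly assigned with probability 2/3.
gamma = FairBinaryDistribution(alpha="F", beta="M", tpr=2/3)
print(gamma.expected_labeling(140))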
For completeness, we denote the desirable Γ satisfying our fairness constraints as

    Γ :: fair_distribution

  (IB)
      Δ ⊢ t : α_f    D_KL(Δ ∥ Γ) = ε > π
      ──────────────────────────────────
      Γ, Δ ⊢ Bias(t : α_f)

  (EB)
      Δ ⊢ Bias(u_n : α_f)
      ───────────────────────────────────────────────
      Γ, x*_u : α_{[0,1]−[a−ε(n), a+ε(n)]} ⊢ u_n : α_f

Figure 4: Bias checking rules.

Consider now the observation for a given datapoint t assigning label α with frequency f:

    Δ ⊢ t : α_f

which we might consider biased if the divergence of f under Δ from what is dictated by the constraint on the labels encoded in Γ is significant:

Definition 2 (Bias labeling). The assignment of label α shown with frequency f for a population of n datapoints t under the distribution generated by the possibly opaque training set Δ is biased if the divergence of f from the probability a generated by the correct label assigned by a transparent distribution Γ is greater than a given threshold considered appropriate for the labeling task at hand.

We now use the Kullback-Leibler divergence, which notoriously is a measure of how the probability distribution Δ differs from Γ, or the expected excess surprise from using Γ as a model when the actual distribution is Δ. Fixing a constraint that defines the fairness of Γ, and for each term t and label α common to Δ and Γ, we obtain:

    D_KL(Δ ∥ Γ) = Σ_{α ∈ Δ} (Δ ⊢ t : α_f) · log( (Δ ⊢ t : α_f) / (Γ, x*_t : α_a :: fair_distribution) )

Using this measure, we want to check whether its value surpasses some reference threshold π which is considered appropriate for the classification task. This means considering sufficiently low values of π for sensitive classifications, e.g. gender or other socially and culturally relevant properties. Definition 2 is then expressed by rule IB in Figure 4. In this rule ε is the sum of the divergence between Δ and Γ computed for each label α common to both. On the other hand, in the presence of a population of n datapoints where the frequency f of label α under distribution Δ is considered biased, rule EB infers a family of probability values in the interval [0, 1], excluding the interval [a − ε(n), a + ε(n)] for output α ∈ Γ, within which u_n : α_f can be correctly sampled from Δ. Note that this rule will not allow us to derive a fair value for label α, but rather determines the conditions of the distribution under which the current frequency can be considered admissible.

5. Example

Consider the context Γ = {x_t : M_{1/3}, x_t : F_{2/3}}, representing the reference distribution of gender in a population of interest in which 1/3 of the population is male and 2/3 is female. Assume now a classifier trained on some data that distributes according to Δ, which is inaccessible, but which we know is – or can assume to be – similar to the one described by Γ. The classifier on Δ, when used, displays the following behavior:

    Δ ⊢ t_140 : M_60        Δ ⊢ t_140 : F_80

that is, out of 140 datapoints, 60 are labelled as M and 80 are labelled as F. For readability we write Γ(M) = 1/3, Γ(F) = 2/3, Δ(M) = 60/140 = 3/7 and Δ(F) = 80/140 = 4/7. We want to check whether our classifier using Δ is fair with respect to the constraints set by Γ. We calculate the Kullback–Leibler divergence as follows:

    D_KL(Δ ∥ Γ) = Σ_{x ∈ {F,M}} Δ(x) · log₂(Δ(x) / Γ(x))                                    (1)
                = Δ(F) · log₂(Δ(F) / Γ(F)) + Δ(M) · log₂(Δ(M) / Γ(M))                       (2)
                = (4/7) · log₂((4/7) / (2/3)) + (3/7) · log₂((3/7) / (1/3)) ≈ 0.028         (3)

To determine when we would consider the classification produced over Δ biased, we need to set a threshold π. Let π = 0.2. Then, since D_KL(Δ ∥ Γ) < π, we can conclude that Γ, Δ ⊢ Fair(t_140 : M_60), i.e. that the classification of male individuals produced by a classifier under the distribution Δ is fair when compared against Γ.
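The check carried out in this example can be reproduced with a short Python sketch of the comparison behind rule IB: compute D_KL(Δ ∥ Γ) over the labels common to the two distributions and compare it with the threshold π. The dictionary encoding of Γ and Δ and the function names are our own illustrative choices.

from math import log2

def kl_divergence(delta, gamma):
    """D_KL(Delta || Gamma) = sum over common labels of Delta(x) * log2(Delta(x) / Gamma(x))."""
    return sum(delta[x] * log2(delta[x] / gamma[x]) for x in delta if x in gamma)

def biased(delta, gamma, pi):
    """Rule IB, informally: flag the labeling under Delta as biased iff its divergence
    from the fair reference Gamma exceeds the threshold pi."""
    return kl_divergence(delta, gamma) > pi

gamma = {"M": 1/3, "F": 2/3}          # Gamma :: fair_distribution
delta = {"M": 60/140, "F": 80/140}    # observed: Delta |- t_140 : M_60 and t_140 : F_80

print(round(kl_divergence(delta, gamma), 3))   # 0.028
print(biased(delta, gamma, pi=0.2))            # False, i.e. Gamma, Delta |- Fair(t_140 : M_60)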
To further illustrate this example, let us re-consider Equation (2), but this time assume that we do not know how many datapoints out of the 140 are labelled as M, i.e.

    Δ ⊢ t_140 : M_n        Δ ⊢ t_140 : F_{140−n}

Recall that Γ(M) = 1/3 and Γ(F) = 2/3, and consider that, since we are dealing with a binary classification task, Δ(F) = 1 − Δ(M). Then we may rewrite Equation (2) as

    D_KL(Δ ∥ Γ) = (1 − Δ(M)) · log₂( 3(1 − Δ(M)) / 2 ) + Δ(M) · log₂( 3 · Δ(M) )

and if we let Δ(M) vary in the interval [0, 1], we can plot D_KL(Δ ∥ Γ) as in Figure 5, which provides a more concrete illustration of which values are considered to be biased in our domain.

[Figure 5: The Kullback–Leibler divergence for a fixed reference distribution of 1/3 males and 2/3 females, and for a classifier that classifies a fraction Δ(M) of the population as male. If we set a threshold π = 0.2 (depicted as the red line in the figure), Δ(M) should belong to the interval [0.11, 0.59] in order to be considered unbiased (see the thick black line on the x axis). Note that, for the particular case where Δ(M) = 3/7 as in our previous example, the bias is in fact acceptable (see the teal dot on the x axis).]

6. Conclusion

In this paper we have introduced TPTND-KL, a natural deduction system whose formulas express the expected labeling of data, based on probability distributions of labels over training sets. We have shown how to model proof-theoretically a checking procedure to measure the distance that a given opaque classifier displays from the constraints set on the expected labeling by a transparent probabilistic distribution.

Our research leaves room for further developments. To start with, the example used throughout this paper is somewhat simplistic. A more precise characterization of the types of biases and of the fairness metrics that we can model within our language is required: for instance, one can take advantage of the use of the implication to build dependent variables to model more complex examples where multi-attribute biases and intersectionality occur. It is worth stressing here that currently our framework allows us to deal with fairness-of-outcome, as we can only compare the output distribution (i.e., the outcome of the classification process) with a reference distribution that we assume to be fair. It remains to be explored whether other definitions of fairness can be modelled without substantially modifying TPTND-KL. Other problems that we would like to tackle in a future version of this work are: adapting the framework to regression tasks, as at the moment it only works for classification, since the set of types is tacitly assumed to be finite; and investigating how the reference thresholds can be defined or extracted in less arbitrary ways. This latter point will require studying how the framework behaves with real data, and to this end an implementation of this work is very much needed. Another future step is the formulation of meta-theoretical results, like progress and termination on the syntactic transformations determined by our proof systems, to prove under which conditions a given measure of bias will never exceed a certain threshold; finally, the integration of the bias checking procedure developed for our system with other non-automatic parameters and information quality criteria would be desirable.
Acknowledgments

This research has been partially funded by the Project PRIN2020 BRIO - Bias, Risk and Opacity in AI (2020SSKZ7R) awarded by the Italian Ministry of University and Research (MUR). The author Fabio Aurelio D'Asaro acknowledges the funding and support of PON "Ricerca e Innovazione" 2014-2020 (PON R&I FSE-REACT EU), Azione IV.6 "Contratti di ricerca su tematiche Green" in the context of the project titled "Il miglioramento dell'algorithmic fairness in ambito assicurativo: Un'analisi concettuale a partire da uno studio di caso".

References

[1] S. A. Seshia, D. Sadigh, S. S. Sastry, Toward verified artificial intelligence, Commun. ACM 65 (2022) 46–55. URL: https://doi.org/10.1145/3503914. doi:10.1145/3503914.
[2] R. Alur, T. A. Henzinger, P.-H. Ho, Automatic symbolic verification of embedded systems, IEEE Trans. Softw. Eng. 22 (1996) 181–201. URL: https://doi.org/10.1109/32.489079. doi:10.1109/32.489079.
[3] M. Z. Kwiatkowska, G. Norman, D. Parker, Prism: Probabilistic symbolic model checker, in: Proceedings of the 12th International Conference on Computer Performance Evaluation, Modelling Techniques and Tools, TOOLS '02, Springer-Verlag, Berlin, Heidelberg, 2002, pp. 200–204.
[4] A. Termine, G. Primiero, F. A. D'Asaro, Modelling accuracy and trustworthiness of explaining agents, in: S. Ghosh, T. Icard (Eds.), Logic, Rationality, and Interaction - 8th International Workshop, LORI 2021, Xi'an, China, October 16-18, 2021, Proceedings, volume 13039 of Lecture Notes in Computer Science, Springer, 2021, pp. 232–245. URL: https://doi.org/10.1007/978-3-030-88708-7_19. doi:10.1007/978-3-030-88708-7_19.
[5] Q. Gao, D. Hajinezhad, Y. Zhang, Y. Kantaros, M. M. Zavlanos, Reduced variance deep reinforcement learning with temporal logic specifications, in: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 237–248. URL: https://doi.org/10.1145/3302509.3311053. doi:10.1145/3302509.3311053.
[6] A. Termine, A. Antonucci, G. Primiero, A. Facchini, Logic and model checking by imprecise probabilistic interpreted systems, in: A. Rosenfeld, N. Talmon (Eds.), Multi-Agent Systems - 18th European Conference, EUMAS 2021, Virtual Event, June 28-29, 2021, Revised Selected Papers, volume 12802 of Lecture Notes in Computer Science, Springer, 2021, pp. 211–227. URL: https://doi.org/10.1007/978-3-030-82254-5_13. doi:10.1007/978-3-030-82254-5_13.
[7] G. Bacci, R. Furber, D. Kozen, R. Mardare, P. Panangaden, D. S. Scott, Boolean-valued semantics for the stochastic λ-calculus, in: A. Dawar, E. Grädel (Eds.), Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09-12, 2018, ACM, 2018, pp. 669–678. URL: https://doi.org/10.1145/3209108.3209175. doi:10.1145/3209108.3209175.
[8] J. Borgström, U. D. Lago, A. D. Gordon, M. Szymczak, A lambda-calculus foundation for universal probabilistic programming, in: J. Garrigue, G. Keller, E. Sumii (Eds.), Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, ACM, 2016, pp. 33–46. URL: https://doi.org/10.1145/2951913.2951942. doi:10.1145/2951913.2951942.
[9] P. H. A. de Amorim, D. Kozen, R. Mardare, P. Panangaden,
M. Roberts, Universal semantics for the stochastic λ-calculus, in: 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, IEEE, 2021, pp. 1–12. URL: https://doi.org/10.1109/LICS52264.2021.9470747. doi:10.1109/LICS52264.2021.9470747.
[10] M. Antonelli, U. D. Lago, P. Pistone, Curry and Howard meet Borel, in: C. Baier, D. Fisman (Eds.), LICS '22: 37th Annual ACM/IEEE Symposium on Logic in Computer Science, Haifa, Israel, August 2 - 5, 2022, ACM, 2022, pp. 45:1–45:13. URL: https://doi.org/10.1145/3531130.3533361. doi:10.1145/3531130.3533361.
[11] F. Dahlqvist, D. Kozen, Semantics of higher-order probabilistic programs with conditioning, Proc. ACM Program. Lang. 4 (2020) 57:1–57:29. URL: https://doi.org/10.1145/3371125. doi:10.1145/3371125.
[12] A. D. Pierro, A type theory for probabilistic λ-calculus, in: From Lambda Calculus to Cybersecurity Through Program Analysis, 2020.
[13] M. Boričić, Sequent calculus for classical logic probabilized, Archive for Mathematical Logic 58 (2019) 119–136.
[14] S. Ghilezan, J. Ivetić, S. Kašterović, Z. Ognjanović, N. Savić, Probabilistic reasoning about simply typed lambda terms, in: International Symposium on Logical Foundations of Computer Science, Springer, 2018, pp. 170–189.
[15] F. A. D'Asaro, G. Primiero, Probabilistic typed natural deduction for trustworthy computations, in: D. Wang, R. Falcone, J. Zhang (Eds.), Proceedings of the 22nd International Workshop on Trust in Agent Societies (TRUST 2021) Co-located with the 20th International Conferences on Autonomous Agents and Multiagent Systems (AAMAS 2021), London, UK, May 3-7, 2021, volume 3022 of CEUR Workshop Proceedings, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-3022/paper3.pdf.
[16] F. A. D'Asaro, G. Primiero, Checking trustworthiness of probabilistic computations in a typed natural deduction system, CoRR abs/2206.12934 (2022). URL: https://doi.org/10.48550/arXiv.2206.12934. doi:10.48550/arXiv.2206.12934. arXiv:2206.12934.
[17] B. Friedman, H. Nissenbaum, Bias in computer systems, ACM Trans. Inf. Syst. 14 (1996) 330–347.
[18] J. Buolamwini, T. Gebru, Gender shades: Intersectional accuracy disparities in commercial gender classification, in: Conference on Fairness, Accountability and Transparency, FAT 2018, 23-24 February 2018, New York, NY, USA, volume 81, PMLR, 2018, pp. 77–91.
[19] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A survey on bias and fairness in machine learning, ACM Comput. Surv. 54 (2021) 115:1–115:35. URL: https://doi.org/10.1145/3457607. doi:10.1145/3457607.
[20] C. G. Northcutt, A. Athalye, J. Mueller, Pervasive label errors in test sets destabilize machine learning benchmarks, 2021. Preprint at https://arxiv.org/pdf/2103.14749.pdf.
[21] J. Dastin, Amazon scraps secret AI recruiting tool that showed bias against women, Reuters, 2018.
[22] H. Jiang, O. Nachum, Identifying and correcting label bias in machine learning, CoRR abs/1901.04966 (2019). URL: http://arxiv.org/abs/1901.04966. arXiv:1901.04966.