<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Fairness-aware Naive Bayes Classifier for Data with Multiple Sensitive Features</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Stelios</forename><surname>Boulitsakis-Logothetis</surname></persName>
							<email>stelios.b.logothetis@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Durham</orgName>
								<address>
									<settlement>Durham</settlement>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">In</forename><forename type="middle">T</forename><surname>Kido</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Durham</orgName>
								<address>
									<settlement>Durham</settlement>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>Eds.</roleName><forename type="first">K</forename><surname>Takadama</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Durham</orgName>
								<address>
									<settlement>Durham</settlement>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Fairness-aware Naive Bayes Classifier for Data with Multiple Sensitive Features</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">FF254766CA8830F52E1BA603342BAF1F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T20:23+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Fairness-aware machine learning seeks to maximise utility in generating predictions while avoiding unfair discrimination based on sensitive attributes such as race, sex, religion, etc. An important line of work in this field is enforcing fairness during the training step of a classifier. A simple yet effective binary classification algorithm that follows this strategy is two-naive-Bayes (2NB), which enforces statistical parityrequiring that the groups comprising the dataset receive positive labels with the same likelihood. In this paper, we generalise this algorithm into N-naive-Bayes (NNB) to eliminate the simplification of assuming only two sensitive groups in the data and instead apply it to an arbitrary number of groups.</p><p>We propose an extension of the original algorithm's statistical parity constraint and the post-processing routine that enforces statistical independence of the label and the single sensitive attribute. Then, we investigate its application on data with multiple sensitive features and propose a new constraint and post-processing routine to enforce differential fairness, an extension of established group-fairness constraints focused on intersectionalities. We empirically demonstrate the effectiveness of the NNB algorithm on US Census datasets and compare its accuracy and debiasing performance, as measured by disparate impact and DF-ϵ score, with similar group-fairness algorithms. Finally, we lay out important considerations users should be aware of before incorporating this algorithm into their application, and direct them to further reading on the pros, cons, and ethical implications of using statistical parity as a fairness criterion.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Today, countless machine learning-based systems are in use that autonomously make decisions or aid human decisionmakers in applications that significantly impact individuals' lives. This has made it vital to develop ways of ensuring these models are trustworthy, ethical, and fair. The field of fairness-aware machine learning is centered on enhancing the fairness, explainability, and auditability of ML models. A goal many research works in this field share is to maximise utility in generating predictions while avoiding discrimination against people based on specific sensitive attributes, such as race, sex, religion, nationality, etc.</p><p>Researchers have devised many formalisations to try and capture intuitive notions of fairness, each with different priorities and limitations. We summarise the ones we will men-tion here in Table <ref type="table">1</ref>. Traditionally, the proposed notions have been classified into two categories. The simplest and most well-studied, group fairness, is based on defining distinct protected groups in the given data. Then, for each of these groups, a user-selected statistical constraint must be satisfied. This has notable disadvantages: It requires groups to be treated fairly in aggregate, but this guarantee does not necessarily extend to individuals <ref type="bibr" target="#b2">(Awasthi et al. 2020)</ref>. Further, different statistical constraints prioritise different aspects of fairness. Many of them have also been shown to be incompatible with each other, making the choice even more difficult for users. Finally, the choice of the protected groups that should be considered is an open question <ref type="bibr" target="#b4">(Blum et al. 2018;</ref><ref type="bibr" target="#b20">Kleinberg, Mullainathan, and Raghavan 2017)</ref>.</p><p>An orthogonal notion to group fairness is individual fairness. Put simply, this notion requires that "similar individuals be treated similarly" <ref type="bibr" target="#b8">(Dwork et al. 2012)</ref>. This approach addresses the previous lack of any individual-level guarantees. However, it requires strong functional assumptions and still requires the step of choosing an underlying metric over the dataset features <ref type="bibr" target="#b2">(Awasthi et al. 2020)</ref>.</p><p>Alternative models of fairness have been proposed to address the disadvantages of the two traditional definitions. One model is causal fairness, which examines the unfair causal effect the sensitive attribute value may have on the prediction made by an algorithm <ref type="bibr" target="#b21">(Mhasawade and Chunara 2021)</ref>. Another, which is explored in this paper, is differential fairness (DF). This is an extension of the established group fairness concepts that applies them to the case of intersectionalities, meaning groups that are defined by multiple overlapping sensitive attributes <ref type="bibr" target="#b9">(Foulds et al. 2020;</ref><ref type="bibr" target="#b22">Morina et al. 2019)</ref>.</p><p>A similar model is statistical parity subgroup fairness (SF), which focuses on mitigating intersectional bias by applying group fairness to the case of infinitely many, very small subgroups <ref type="bibr" target="#b19">(Kearns et al. 2018)</ref>. SF and DF are notable because they both enable a more nuanced understanding of unfairness than when a single sensitive attribute and broad, coarse groups are considered. A key difference between them, however, is DF's focus on minority groups. The SF measure of subgroup parity weighs larger groups more heavily than very small ones, while DF-parity considers all groups equally. This means DF can provide greater protection to very small minority groups since, in SF, their impact on the overall score is reduced <ref type="bibr" target="#b9">(Foulds et al. 2020)</ref>.</p><p>Despite the lack of consensus on any universal notion of fairness, research has proceeded using the existing models. A major line of work in the development of fair learning algorithms is enforcing fairness during the training step of a classifier <ref type="bibr" target="#b7">(Donini et al. 2018)</ref>. A simple yet effective algorithm that follows this strategy is Calders and Verwer's twonaive-Bayes algorithm <ref type="bibr">(Calders and Verwer 2010) (2NB)</ref>. This algorithm was originally proposed as one of three ways of pursuing fairness in naive Bayes classification. It received further attention in the 2013 publication <ref type="bibr" target="#b18">(Kamishima et al. 2013</ref>) which asserted its effectiveness in enforcing group fairness in binary classification and explored its underlying statistics. It works by training separate naive Bayes classifiers for each of the two (by assumption) groups comprise the dataset, the privileged and the non-privileged group. Then, the algorithm iteratively assesses the fairness of the combined model and makes small changes to the observed probabilities in the direction of making them more fair <ref type="bibr" target="#b11">(Friedler et al. 2019)</ref>.</p><p>A recent publication exploring the arguments for and against statistical parity <ref type="bibr" target="#b24">(Räz 2021</ref>) has served as motivation to re-visit algorithms based around it. Statistical parity (also referred to as demographic parity or independence) is a group fairness notion which requires that the groups comprising the dataset receive positive labels with the same likelihood. An assumption that is at the core of 2NB and many other research works, however, is that of a single, binary sensitive feature <ref type="bibr" target="#b23">(Oneto, Donini, and Pontil 2020)</ref>. This assumption has been noted to rarely hold in the real world, and eliminating it is one of the essential goals of the previously introduced notions of differential fairness and subgroup parity fairness <ref type="bibr" target="#b9">(Foulds et al. 2020;</ref><ref type="bibr" target="#b19">Kearns et al. 2018)</ref>.</p><p>This opens the question of how 2NB can be applied to data with multiple, overlapping sensitive attributes while avoiding oversimplification. The 2NB algorithm is applicable to a wide range of tasks and its effectiveness, even in comparison to more complex algorithms, has been demonstrated <ref type="bibr" target="#b18">(Kamishima et al. 2013;</ref><ref type="bibr" target="#b11">Friedler et al. 2019)</ref>. At the same time, its' design is sufficiently elegant and intuitive to be approachable to practitioners across many disciplines -an important advantage. Thus, extending the algorithm to cover more use cases will be the focus of this work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Contributions</head><p>This paper seeks to build upon Calders and Verwer's work by exploring the following:</p><p>• We adapt the original 2NB structure and balancing routine to support multiple, polyvalent (categorical) sensitive features. • We use this new property of the algorithm to apply it to differential fairness. • To support the above, we examine the extended algorithm's performance on real-world US Census data. • Finally, we lay out important considerations users should be aware of before using this algorithm. We draw upon the literature to lay out the pros, cons, and ethical implications of using statistical parity as a fairness criterion.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Name Definition Statistical Parity</head><p>Likelihood of positive prediction given group membership should be equal for all groups.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Disparate Impact</head><p>Mean ratio of positive predictions for each pair of groups should be 1 or greater than p%.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Subgroup Fairness</head><p>Group fairness applied to infinite number of very small groups.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Differential Fairness</head><p>Group fairness applied to groups defined by multiple overlapping sensitive attributes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Individual Fairness</head><p>Distance between the likelihood of outcomes between any two individuals should be no greater than similarity distance between them.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Causal Fairness</head><p>Use of causal modelling to find effect of sensitive attributes on predictions.</p><p>Table <ref type="table">1</ref>: Some notable formalisations of fairness.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Related Work</head><p>Naive Bayes Naive Bayes is a probabilistic data mining and classification algorithm. In spite of its relative simplicity, it has been shown to be very competent in real-world applications that require classification or class probability estimation and ranking<ref type="foot" target="#foot_0">1</ref> . Various strategies have been explored for improving the algorithm's performance by weakening its conditional independence assumption. These include structure extension, attribute weighting, etc. These techniques focus on maximising accuracy or averaged conditional log likelihood <ref type="bibr" target="#b15">(Jiang 2011)</ref>. Calders and Verwer's proposal of composing multiple naive Bayes models instead aims to enforce independence of predictions with respect to a binary sensitive feature, thus satisfying the statistical parity constraint between the two groups <ref type="bibr" target="#b5">(Calders and Verwer 2010)</ref>.</p><p>Fair Classification There is a large body of research into designing learning methods that do not use sensitive information in discriminatory ways <ref type="bibr" target="#b23">(Oneto, Donini, and Pontil 2020)</ref>. As mentioned, various formalisations of fairness exist but the most well-studied one is group fairness <ref type="bibr" target="#b4">(Blum et al. 2018)</ref>. Many algorithms designed around this notion are introduced as part of the comparative experiment in Section 3.</p><p>A more recent proposal, differential fairness (DF), extends existing group fairness concepts to protect subgroups defined by intersections of and by individual sensitive attributes. The original papers by <ref type="bibr" target="#b9">(Foulds et al. 2020</ref>) and <ref type="bibr" target="#b22">(Morina et al. 2019</ref>) explore the context of intersectionality, and provide comparisons of DF with established concepts. The first paper asserts DF's distinction from subgroup parity and demonstrates its usefulness in protecting small minority groups. The latter paper gives methods to robustly estimate the DF metrics and proposes a post-processing technique to enforce DF on classifiers.</p><p>Humanistic Analysis A line of work that is parallel to fair algorithm development focuses on analysing these proposals from an ethical, philosophical, and moral standpoint. A recent such publication, which examines statistical parity among other notions, and which motivated and influenced this paper, is by <ref type="bibr" target="#b14">Hertweck, Heitz, and Loi (Hertweck, Heitz, and Loi 2021)</ref>. They propose philosophicallygrounded criteria for justifying the enforcement of independence/statistical parity in a given task. They include scenarios where enforcing statistical parity is ethical and justified, as well as counter-examples where the criteria are met but independence should not be enforced. As with many similar works, they conclude by directing the reader to strike a balance between fairness and utilitarian concerns (such as accuracy) in their task. <ref type="bibr" target="#b13">(Heidari et al. 2019</ref>) do similar work, laying out the moral assumptions underlying several popular notions of fairness. In <ref type="bibr" target="#b24">(Räz 2021</ref>), Räz critically examines the advantages and shortcomings of statistical parity as a fairness criterion and makes an overall positive case for it.</p><p>(Friedler, Scheidegger, and Venkatasubramanian 2016) introduce the concept of distinct worldviews which influence how we pursue fairness. One of them is that We're All Equal (WAE) i.e. there is no association between the construct (the latent feature that is truly relevant for the prediction) and the sensitive attribute. The orthogonal worldview is that What You See Is What You Get, wherein the observed labels are accurate reflections of the construct. In <ref type="bibr" target="#b27">(Yeom and Tschantz 2021)</ref>, Yeom and Tschantz give a measure of disparity amplification and dissect the popular group fairness models of statistical parity, equalised odds, calibration, and predictive parity through the lens of worldviews. They argue that under WAE, statistical parity is required to eliminate disparity amplification. However, deviating from this worldviews introduces inaccuracy when we enforce parity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">N-Naive-Bayes Algorithm</head><p>The proposed N-naive-Bayes algorithm is a supervised binary classifier that allows the enforcement of a statistical fairness constraint in its predictions. Given an (ideally large) training set of labelled instances, the algorithm partitions the data based on sensitive attribute value and trains a separate naive Bayes sub-estimator on each of the sub-sets. This is an extension of the original two-naive-Bayes structure, where exactly two sub-estimators are trained. The next step of the training stage is for the conditional probabilities P (Y |S) to be empirically estimated from the training set. Where N s is the number of instances that belong to group s, and N y,s the number of instances of that group that have label y, the empirical conditional probability 2 is given as:</p><formula xml:id="formula_0">P (y|s) = N y,s + α N s + 2 * α (1)</formula><p>Finally, the algorithm modifies the joint distribution P (Y, S) to enforce the given fairness constraint. Then, the 2 Equation (1) gives a smoothed empirical probability, where the constant α is the parameter of a symmetric Dirichlet prior with concentration parameter 2 * α, since a binary label is assumed.</p><p>final predicted class probabilities, for a sample xs (where x is the feature vector excluding the sensitive feature s), is:</p><formula xml:id="formula_1">P (y|xs) = P (x|y) * P (s|y) * P (y) (2) = C s (x) * P (s|y) * P (y) (3) = C s (x) * P (s ∩ y) (4)</formula><p>Where C s is the the sub-estimator for sensitive group s ∈ S.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Enforcing Statistical Parity</head><p>To satisfy the statistical parity constraint, the original 2NB algorithm runs a heuristic post-processing routine that iteratively adjusts the conditional probabilities P (Y |S) of the groups in the direction of making them equal. During its execution, this probability-balancing routine alternates between reducing N (Y = 1, S = 1) and increasing N (Y = 1, S = 0) depending on the number of positive labels outputted by the model at each iteration. This is to try and keep the resultant marginal distribution of Y stable. Once balancing is complete, the value of P (S|Y ) can be induced from N y,s similar to (1). The first contribution of this paper is to extend this routine to suit the polyvalent definition of statistical parity we will use: Definition 1. Statistical (Conditional) Parity for Polyvalent S (Ritov, Sun, and Zhao 2017): For predicted binary labels ŷ and polyvalent sensitive feature S, statistical (conditional) parity requires<ref type="foot" target="#foot_2">3</ref> :</p><formula xml:id="formula_2">P (ŷ = 1|s) = P (ŷ = 1|s ′ ) ∀ s, s ′ ∈ S<label>(5)</label></formula><p>We modify the probability-balancing routine to subtract and add probability to the group with the highest (max) and lowest (min) current P (Y = 1|s) respectively. These probabilities are re-computed with each iteration, and the max and min groups re-selected. Further, we introduce the constraint that only groups designated by the user as privileged can receive a reduction in their likelihood of getting a positive label<ref type="foot" target="#foot_3">4</ref> . This is to avoid making any assumptions about which groups it would be appropriate to demote positive instances of. It allows the balancing routine to terminate immediately if it over-corrects, or if the data is such that P (ŷ = 1|s np ) &gt; P (ŷ = 1|s p ) to begin with, as is the case in the well-known UCI Adult dataset, for example. This gives us the final form of our statistical parity criterion: Definition 2. Statistical Parity Criterion for NNB:</p><p>For predicted binary labels ŷ and sensitive feature S:</p><formula xml:id="formula_3">P (ŷ = 1|s p ) = P (ŷ = 1|s np ) ∀ (s p , s np ) ∈ S p × S np (6)</formula><p>Where S p and S np are the sub-sets of all privileged and nonprivileged sub-groups of S respectively. We adapt the above definition into a score that the algorithm can minimise: Recalculate P (Y |S), disc, s max , s min 13: end while Note that the above criterion can easily be relaxed to apply the four-fifths rule for removing disparate impact (or its more general form, the p% rule <ref type="bibr" target="#b28">(Zafar et al. 2017</ref>)) instead of perfect statistical parity. For the purposes of this paper, however, we explore the effect of statistical parity in its base form.</p><p>We also note the definition of disparate impact we use in the evaluation stage: Definition 3. Disparate Impact (Mean) for Polyvalent S:</p><formula xml:id="formula_4">1 |S p × S np | (sp,snp) P (ŷ = 1|s np ) P (ŷ = 1|s p )</formula><p>Algorithm 1 describes the extended probability balancing heuristic for enforcing parity. The values of s p , s np in the parity criterion (Equation <ref type="formula">7</ref>) are referred to as s max and s min respectively. At each iteration, the routine determines these groups and adjusts their conditional probabilities. A further modification from the original is that the proportion by which the probabilities are adjusted with each iteration is now proportional to the size of the group itself, instead of the size of the opposite group. In experiments, this yields a great performance improvement, especially where the distribution of samples over S is very imbalanced.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Enforcing Differential Fairness</head><p>An alternative measure of fairness we explore is differential fairness, as given in <ref type="bibr" target="#b9">(Foulds et al. 2020</ref>). Definition 4. A classifier is ϵ-differentially fair if:</p><formula xml:id="formula_5">e −ϵ ≤ P (ŷ|s) P (ŷ|s ′ ) ≤ e ϵ ∀ s, s ′ ∈ S, ŷ ∈ Y (8)</formula><p>The (smoothed) empirical differential fairness score, from the empirical counts in the data, assuming a binary label, is:</p><formula xml:id="formula_6">e −ϵ ≤ N (ŷ, s) + α N (s) + β N (s ′ ) + β N (ŷ, s ′ ) + α ≤ e ϵ ∀ s, s ′ ∈ S, ŷ ∈ Y (9)</formula><p>This is used in experiments to estimate the value of ϵ (the ϵ-score) from the predicted labels on the dataset<ref type="foot" target="#foot_4">5</ref> . In experiments we set β = 2 * α and substitute with the observed conditional probability estimates from the dataset. An additional measure given in <ref type="bibr" target="#b9">(Foulds et al. 2020)</ref> to assess fairness from the standpoint of intersectionality is differential fairness bias amplification. This measure gives an indication of how much a black-box classifier increases the unfairness over the original data <ref type="bibr" target="#b9">(Foulds et al. 2020;</ref><ref type="bibr" target="#b29">Zhao et al. 2017</ref>). Definition 5. Differential Fairness Bias Amplification A classifier C satisfies (ϵ 2 − ϵ 1 )-DF bias amplification w.r.t. dataset D if C is ϵ 2 -DF fair and D is ϵ 1 -DF fair.</p><p>To adjust the joint distribution P (Y, S) to minimise satisfy DF-fairness and minimise the ϵ-score, we propose a new heuristic probability-balancing routine and associated discrimination score. The distinction from the balancing routine given in Algorithm 1 is that this focuses on outputting a narrower range of probabilities, while still avoiding negatively impacting groups that are designated as nonprivileged. To form the new discrimination score, we apply the principle of separating privileged and non-privileged sub-groups of S from the previous section to the ϵ-score definition:</p><formula xml:id="formula_7">e −ϵ ≤ P (ŷ = 1|s np ) P (ŷ = 1|s p ) ≤ e ϵ ∀ (s p , s np ) ∈ S p × S np<label>(10)</label></formula><p>We then express this restricted ϵ-score as the maximum of two ratios: e ϵ = max(ρ d , ρ u ), where for (s p , s np ) ∈ S p × S np :</p><formula xml:id="formula_8">ρ d = max P (ŷ = 1|s np ) P (ŷ = 1|s p ) , ρ u = max P (ŷ = 1|s p ) P (ŷ = 1|s np )<label>(11)</label></formula><p>The execution of the proposed balancing routine is determined by these ratios. If ρ d is greater, then the nonprivileged sub-group with smallest probability at that iteration receives an increase in probability. If ρ u is greater, then the privileged group with highest probability receives a decrease in probability. These conditions can be expected to alternate as the conditional probabilities P (Y |S) converge. Iteration continues until ρ d is close to zero. The s max and s min groups are determined as in the previous section.</p><p>This routine disregards the number of positive labels the model produces, while Algorithm 1 attempts to keep that number close to the number of positive labels in the training data. This allows it to avoid situations where a single, non-privileged sub-group with small probability would require the probabilities of the privileged groups to be reduced significantly. In such cases, other non-privileged sub-groups might maintain much higher probabilities, therefore giving a poor ϵ-score. An further difference is the proportion by Recalculate P (Y |S), ρ d , ρ u , s max , s min 11: end while which each N y,s is modified grows/decreases exponentially. In experiments, this allows the routine escape local minima that occur during the adjustment of P (Y |S) and lead to inefficiency. This routine does, however, offer a theoretical accuracy trade-off compared to Algorithm 1, which we investigate in the following section.</p><p>Finally, note that all the above probability-balancing routines (including Calders and Verwer's original one) are based around the assumption that the distribution of labels over the sensitive feature(s) in the training set is reflective of the test setting. This assumption is not unique to this model (see <ref type="bibr" target="#b0">(Agarwal et al. 2018;</ref><ref type="bibr" target="#b12">Hardt, Price, and Srebro 2016)</ref>), and under it, we can conclude that minimising the given fairness measure on the training set generalises to the test data <ref type="bibr" target="#b26">(Singh et al. 2021</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experimental Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Setup</head><p>We implement NNB in Python within the scikit-Learn framework, using Gaussian naive Bayes as the subestimator. We then evaluate its performance in two experiments.</p><p>For both experiments, we use real-world data from the US Census Bureau<ref type="foot" target="#foot_5">6</ref> . <ref type="bibr" target="#b6">(Ding et al. 2021</ref>) define several classification tasks on this data, each involving a sub-set of the total features available. We consider two:</p><p>• Income: Predict whether an individual's income is above $50,000. The data for this problem is filtered so that it serves as a comparable replacement to the wellknown UCI Adult dataset. • Employment: Predict whether an individual is employed</p><p>The details of which features are included in each task and what filtering takes place can be found in the paper <ref type="bibr" target="#b6">(Ding et al. 2021</ref>) and the associated page on GitHub<ref type="foot" target="#foot_6">7</ref> . To evaluate NNB we use data from the 2018 census in the state of California. The sensitive feature(s) used in each task are indicated after its name, e.g. Income-Race-Sex is the Income task using race and sex as the sensitive features. To best capture intersectional fairness when using multiple sensitive features, we follow the approach from <ref type="bibr" target="#b9">(Foulds et al. 2020)</ref> and define each group s as a tuple of the sub-groups of each sensitive feature that each sample belongs to.</p><p>First Experiment This experiment compares NNB's performance with other algorithms. The comparison includes "vanilla" models as baselines for performance, and several group-fairness-aware algorithms that have a similar focus to NNB -ensuring non-discrimination across protected groups by optimising metrics such as statistical parity or disparate impact. Specifically, we consider the following:</p><p>• GaussianNB, DecisionTree, LR, SVM: scikit-Learn's Gaussian naive Bayes, Decision Trees, Logistic Regression, and SVM. • Feldman-DT, Feldman-NB: A pre-processing algorithm that aims to remove disparate impact. It equalises the marginal distributions of the subsets of each attribute with each sensitive value <ref type="bibr" target="#b8">(Feldman et al. 2015)</ref>. The resulting "repaired" data is then used to train scikit-Learn classifiers -Decision Trees (DT) and Gaussian naive Bayes (NB). • Kamishima: An in-processing method that introduces a regularisation term to logistic regression to enforce independence of labels from the sensitive feature <ref type="bibr" target="#b17">(Kamishima et al. 2012</ref>). • ZafarAccuracy, ZafarFairness: An inprocessing algorithm that applies fairness constraints to convex margin-based classifiers <ref type="bibr" target="#b28">(Zafar et al. 2017</ref>) . Specifically, we test two variations of a modified logistic regression classifier: The first maximises accuracy subject to fairness (disparate impact) constraints, while the latter prioritises removing disparate impact. • 2NB: Calders and Verwer's original algorithm, using the same GaussianNB sub-estimator as NNB. • NNB-Parity, NNB-DF: N-naive-Bayes tuned to satisfy statistical parity using Algorithm 1, and DF-parity using Algorithm 2.</p><p>For the comparison we use the benchmark provided by <ref type="bibr" target="#b11">(Friedler et al. 2019)</ref>. The fairness-aware algorithms are tuned via grid-search to optimise accuracy. The performance of the algorithms is then measured over ten random train-test splits of the data.</p><p>Second Experiment This experiment demonstrates how NNB performs in finer detail. We consider GaussianNB, NNB-Parity, and NNB-DF as before, and we further include 2NB, the original two-naive-Bayes algorithm implemented identically to NNB. Finally, we include Perfect as a secondary baseline, to illustrate the scores that would be achieved by a perfect classifier.</p><p>To evaluate the performance of the above algorithms, we note the mean and variance of the following measures over 10 random train-test splits: accuracy, AUC, disparate impact score (mean of the DI between all privileged and nonprivileged groups), statistical parity score (as defined in 2), Figure <ref type="figure">1</ref>: Scatter plots of accuracy vs. disparate impact for Income-Race and vs. ϵ-score for Income-Race-Sex DF-ϵ (as defined in 4), DF-bias amplification score (as defined in 5). We also compare the resultant distribution of labels over groups of S on a single random train-test split.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results</head><p>First Experiment Figure <ref type="figure">1</ref> gives the accuracy vs. the disparate impact and DF-ϵ scores on the Income-Race and Income-Race-Sex tasks. Figure <ref type="figure" target="#fig_1">2</ref> shows the same for Employment-Race and Employment-Race-Sex. It can be seen that on Income-Race, NNB results in a higher DI score than 2NB and has often over-favoured non-privileged groups causing a score &gt; 1. Its accuracy is on-par with 2NB and the baseline naive Bayes, DT, and LR models. Feldman's algorithm with Decision Trees results similar disparate impact score in some splits, but lower accuracy. The same is true for the DF-ϵ score on this task. On Income-Race-Sex, NNB-DF beats out all other algorithms in achieving DI ∼ 1, however NNB-Parity has higher accuracy than both NNB-DF and naive Bayes. NNB-DF is also the most successful at minimising the ϵ-score for this task, though again this comes at the cost of lower accuracy than the baseline model.</p><p>On Employment-Race all naive Bayes models achieve similar accuracy, while DT and LR-based models rank higher, and SVM the highest. The same can be observed for Employment-Race-Sex, and for both tasks NNB-DF again gives the ϵ-scores closest to zero.</p><p>Second Experiment Table <ref type="table" target="#tab_0">2</ref> gives the scores achieved on the Income-Race task, and Table <ref type="table" target="#tab_1">3</ref> gives the same Employment-Race-Sex. On Income-Race, both NNB models gave an improved parity score compared to the perfect classifier and GaussianNB. NNB and 2NB also gave improved disparate impact scores over the baseline models, but 2NB under-corrected while the NNB models gave a score &gt; 1 indicating they favoured the non-privileged groups over the privileged group.</p><p>NNB-Parity and NNB-DF gave similar disparate impact scores, but the former gave higher accuracy while the latter produced a narrower range of positive label proportions, and thus better parity, ϵ, and DF-bias amplification scores. The evident accuracy trade-off is more pronounced in the latter task, with NNB-Parity achieving an accuracy of 0.7445 ± 0.00, and NNB-DF achieving 0.7199 ± 0.00. On Employment-Race-Sex, NNB-DF outperformed NNB-Parity on all scores. This was also the case for Employment-Race, where both models had similar accuracy but NNB-DF displayed less over-correction in its disparate impact score (1.0336 ± 0.0001 versus 1.2760 ± 0.0002), in addition to the expected improvement in ϵ-score (0.1068 ± 0.001 versus 0.3434 ± 0.0001). This suggests the DF balancing routine is better suited for the Employment task than the parity-based routine.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Discussion</head><p>In this work we presented an extension of the two-naive-Bayes algorithm, adapting it to suit datasets with multiple, polyvalent sensitive features. We applied the proposed N-naive-Bayes structure to intersectionality and differential fairness by giving an alternative probability-balancing routine. Our experiments on real-world datasets yielded favourable results and demonstrated the effectiveness and the differences between the parity and DF-based approaches.</p><p>We conclude by laying out key considerations users should take into account before using N-naive-Bayes:</p><p>Statistical Parity as a Fairness Criterion Statistical parity stands opposed to the (aggregate) accuracy of a classifier, except in degenerate cases where the data is already fair, so it is recommended that a balance between the two is pursued <ref type="bibr" target="#b14">(Hertweck, Heitz, and Loi 2021)</ref>. This also applies to the extended, but still parity-based, DF measure that was explored in Section 2. In their worldview-based analysis, Yeom and Tschantz caution us that even under WAE, blind enforcement of statistical parity can introduce new discrimination into the system (Yeom and Tschantz 2021). Thus, users must be aware of the ethical implications of using parity as a core fairness constraint, the possible impact it may have on individuals, and the moral objections these individuals may justifiably raise.</p><p>We recommend further reading on the advantages and disadvantages of group fairness in general <ref type="bibr" target="#b24">(Räz 2021;</ref><ref type="bibr" target="#b8">Dwork et al. 2012;</ref><ref type="bibr" target="#b13">Heidari et al. 2019)</ref>, as well as parity specifically <ref type="bibr" target="#b14">(Hertweck, Heitz, and Loi 2021;</ref><ref type="bibr" target="#b27">Yeom and Tschantz 2021)</ref>, so users can make informed decisions on how to apply statistical parity and N-naive-Bayes to their application.  Limitations of NNB N-naive-Bayes (as with two-naive-Bayes) has inherent limitations. The algorithm does not automatically make a classification task fair when it is applied. This is only considered to be possible by doing extensive domain-specific investigation <ref type="bibr" target="#b12">(Hardt, Price, and Srebro 2016)</ref>. Rather, the algorithm introduces a form of affirmative action to the task, increasing and decreasing the likelihood of different groups receiving a positive label in an attempt to satisfy the given parity constraint. This intentional manipulation of the original distribution over the data can be done to correct for structural biases in the data, for the purposes of compliance with regulations, or even as part of an effort to counteract historical inequalities.</p><p>Users should always consider the implications of estimating probability distributions for each group separately (as is done at the beginning of the training stage), as well as the mechanism behind any post-facto probability tuning they decide on. Further, users should understand the implications of affirmative action, its downstream effects, and ensure it is appropriate to their application. As a starting point for further reading, see <ref type="bibr" target="#b8">(Dwork et al. 2012;</ref><ref type="bibr" target="#b19">Kannan, Roth, and Ziani 2019)</ref>. Sociological and legal works such as <ref type="bibr" target="#b16">(Kalev, Dobbin, and Kelly 2006;</ref><ref type="bibr" target="#b1">Anderson 2003)</ref> are also recommended.</p><p>Finally, the explicit choice of sensitive features to consider when enforcing statistical parity is a simplification of the real world and should be done carefully. One should consider the ontology behind observed values in the dataset: race, for example, has varying definitions, each of which comes with its own assumptions. Further, identifying groups in the data using a set of observable qualities, whatever those may be, also carries implicit assumptions about how all the factors involved interact with each other and the validity of decomposing them into discrete features <ref type="bibr">(Barocas, Hardt, and Narayanan 2019, Ch. 5</ref>).</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>N</head><label></label><figDesc>disc = max P (ŷ = 1|s p ) − min P (ŷ = 1|s np ) (7) Algorithm 1: Pseudocode for a probability-balancing routine to enforce statistical parity 1: Calculate the parity score, disc, of the predicted classes by the current model and store s max , s min 2: while disc &gt; disc 0 do 3: Let numpos be the number of positive samples by the current model 4: if numpos &lt; the number of positive samples in the training set then 5: N (y = 1, s min ) + = ∆ * N (y = 0, s min ) 6: N (y = 0, s min ) − = ∆ * N (y = 0, s min ) (y = 1, s max ) + = ∆ * N (y = 1, s max ) 9:N (y = 0, s max ) − = ∆ * N (y = 1, s max ) any N (y, s) is now negative, rollback the changes and terminate 12:</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Algorithm 2 :N</head><label>2</label><figDesc>Pseudocode for a probability-balancing routine to enforce DF parity 1: Calculate the ratios ρ d , ρ u empirically from the predicted classes by the current model, store s max , s min 2: while ρ d &gt; disc 0 do 3: if ρ u ≤ ρ d then 4: N (y = 0, s min ) − = ∆ * N (y = 0, s min ) 5: N (y = 1, s min ) + = ∆ * N (y = 1, s min ) (y = 0, s max ) + = ∆ * N (y = 0, s max ) 8: N (y = 1, s max ) − = ∆ * N (y = 1, s max )</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Scatter plots of accuracy vs. disparate impact for Employment-Race and vs. ϵ-score for Employment-Race-Sex</figDesc><graphic coords="7,79.20,238.96,453.59,183.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="6,79.20,54.00,453.59,183.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="6,79.20,238.96,453.59,183.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2 :</head><label>2</label><figDesc>Parity 0.8114 ± 0.00 0.7480 ± 0.00 1.0810 ± 0.0006 0.1984 ± 0.0005 0.4580 ± 0.0045 −0.4840 ± 0.0041 NNB-DF 0.8138 ± 0.00 0.7380 ± 0.00 1.0636 ± 0.0007 0.1530 ± 0.0008 0.3112 ± 0.0048 −0.6308 ± 0.0035 Perfect 1.0000 ± 0.00 1.0000 ± 0.00 0.6975 ± 0.0005 0.2950 ± 0.0001 0.9420 ± 0.0048 0.0000 ± 0.0000 Scores Achieved on Income with Race as the Sensitive Feature GaussianNB 0.8159 ± 0.00 0.7273 ± 0.00 1.0228 ± 0.0001 0.3000 ± 0.0001 0.4994 ± 0.0003 0.0922 ± 0.0016 2NB 0.8112 ± 0.00 0.7202 ± 0.00 0.9352 ± 0.0001 0.2951 ± 0.0001 0.4818 ± 0.0002 0.0746 ± 0.0015 NNB-Parity 0.7820 ± 0.00 0.7241 ± 0.00 1.2990 ± 0.0004 0.2478 ± 0.0007 0.3971 ± 0.0013 −0.0101 ± 0.0005 NNB-DF 0.7909 ± 0.00 0.7251 ± 0.00 1.0601 ± 0.0002 0.1272 ± 0.0009 0.1840 ± 0.0018 −0.2232 ± 0.0011 Perfect 1.0000 ± 0.00 1.0000 ± 0.00 0.8643 ± 0.0001 0.1782 ± 0.0002 0.4072 ± 0.0014 0.0000 ± 0.0000</figDesc><table><row><cell></cell><cell>AUC</cell><cell>Accuracy</cell><cell>DI</cell><cell>Parity</cell><cell>DF-ϵ</cell><cell>DF-amp</cell></row><row><cell cols="6">GaussianNB 0.8270 ± 0.00 0.7503 ± 0.00 0.6304 ± 0.0001 0.4222 ± 0.0000 1.4100 ± 0.0012</cell><cell>0.4680 ± 0.0045</cell></row><row><cell>2NB</cell><cell cols="5">0.8223 ± 0.00 0.7577 ± 0.00 0.8930 ± 0.0013 0.3606 ± 0.0000 0.9774 ± 0.0016</cell><cell>0.0353 ± 0.0059</cell></row><row><cell cols="2">NNB-AUC</cell><cell>Accuracy</cell><cell>DI</cell><cell>Parity</cell><cell>DF-ϵ</cell><cell>DF-amp</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3 :</head><label>3</label><figDesc>Scores Achieved on Employment with Race and Sex as the Sensitive Features</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"> Recent, novel applications include (Valdiviezo-Diaz et al.   </note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2019" xml:id="foot_1">;<ref type="bibr" target="#b8">Feng et al. 2018;</ref><ref type="bibr" target="#b22">Niazi et al. 2019)</ref> among others.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">The cited definition requires this to hold for all values of ŷ, however for a binary label it is sufficient to check ŷ = 1.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">A similar constraint is explored by<ref type="bibr" target="#b28">(Zafar et al. 2017</ref>).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">Note that this definition produces noisier estimates for subgroups with fewer members.<ref type="bibr" target="#b22">(Morina et al. 2019)</ref> shows that as the dataset grows, the given estimate converges to the true value, and that this happens regardless of the chosen smoothing parameters. However, for small or imbalanced datasets, more robust estimation methods should be used.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://www.census.gov/programssurveys/acs/microdata/documentation.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">7 https://github.com/zykls/folktables</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A Reductions Approach to Fair Classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Beygelzimer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dudik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Langford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning</title>
				<editor>
			<persName><forename type="first">J</forename><surname>Dy</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<meeting>the 35th International Conference on Machine Learning<address><addrLine>Stockholm, Sweden</addrLine></address></meeting>
		<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="60" to="69" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Integration, Affirmative Action, and Strict Scrutiny</title>
		<author>
			<persName><forename type="first">E</forename><surname>Anderson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">New York University Law Review</title>
		<imprint>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1195" to="1271" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Beyond Individual and Group Fairness</title>
		<author>
			<persName><forename type="first">P</forename><surname>Awasthi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Cortes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mansour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mohri</surname></persName>
		</author>
		<idno>CoRR, abs/2008.09490</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<ptr target="http://www.fairmlbook.org" />
		<title level="m">Fairness and Machine Learning</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note>fairmlbook</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">On Preserving Non-Discrimination When Combining Expert Advice</title>
		<author>
			<persName><forename type="first">A</forename><surname>Blum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gunasekar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lykouris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Srebro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS&apos;18</title>
				<meeting>the 32nd International Conference on Neural Information Processing Systems, NIPS&apos;18<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="8386" to="8397" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Three naive Bayes approaches for discrimination-free classification</title>
		<author>
			<persName><forename type="first">T</forename><surname>Calders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Verwer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data Mining and Knowledge Discovery</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="277" to="292" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Retiring Adult: New Datasets for Fair Machine Learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schmidt</surname></persName>
		</author>
		<idno>CoRR, abs/2108.04884</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Empirical Risk Minimization under Fairness Constraints</title>
		<author>
			<persName><forename type="first">M</forename><surname>Donini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Oneto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ben-David</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shawe-Taylor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pontil</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS&apos;18</title>
				<meeting>the 32nd International Conference on Neural Information Processing Systems, NIPS&apos;18<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2796" to="2806" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Prediction of Slope Stability using Naive Bayes Classifier</title>
		<author>
			<persName><forename type="first">C</forename><surname>Dwork</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pitassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Reingold</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Feldman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Friedler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Moeller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scheidegger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">;</forename><surname>; Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD &apos;15</title>
				<meeting>the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD &apos;15<address><addrLine>New York, NY, USA; New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2012">2012. 2015. 2018</date>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="941" to="950" />
		</imprint>
	</monogr>
	<note>Certifying and Removing Disparate Impact</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">An Intersectional Definition of Fairness</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Foulds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Keya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 36th International Conference on Data Engineering (ICDE), 1918-1921</title>
				<meeting><address><addrLine>Dallas, Texas, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">On the (im)possibility of fairness</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Friedler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scheidegger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
		<idno>CoRR, abs/1609.07236</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page">16</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A Comparative Study of Fairness-Enhancing Interventions in Machine Learning</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Friedler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scheidegger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Choudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Hamilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19</title>
				<meeting>the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="329" to="338" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Equality of Opportunity in Supervised Learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Price</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Srebro</surname></persName>
		</author>
		<idno>9781510838819</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS&apos;16</title>
				<meeting>the 30th International Conference on Neural Information Processing Systems, NIPS&apos;16<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc. ISBN</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="3323" to="3331" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A Moral Framework for Understanding Fair ML through Economic Models of Equality of Opportunity</title>
		<author>
			<persName><forename type="first">H</forename><surname>Heidari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Loi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Gummadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19</title>
				<meeting>the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery. ISBN</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page">9781450361255</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">On the Moral Justification of Statistical Parity</title>
		<author>
			<persName><forename type="first">C</forename><surname>Hertweck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Heitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Loi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery. ISBN</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page">9781450383097</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Random one-dependence estimators</title>
		<author>
			<persName><forename type="first">L</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition Letters</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="532" to="539" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Best Practices or Best Guesses? Assessing the Efficacy of Corporate Affirmative Action and Diversity Policies</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kalev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Dobbin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kelly</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">American Sociological Review</title>
		<imprint>
			<biblScope unit="volume">71</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="589" to="617" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Fairness-Aware Classifier with Prejudice Remover Regularizer</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kamishima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Akaho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Asoh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sakuma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Machine Learning and Knowledge Discovery in Databases</title>
				<editor>
			<persName><forename type="first">P</forename><forename type="middle">A</forename><surname>Flach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>De Bie</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Cristianini</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin, Heidelberg; Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="35" to="50" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">The Independence of Fairness-Aware Classifiers</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kamishima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Akaho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Asoh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sakuma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 IEEE 13th International Conference on Data Mining Workshops, ICDMW &apos;13</title>
				<meeting>the 2013 IEEE 13th International Conference on Data Mining Workshops, ICDMW &apos;13<address><addrLine>USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society. ISBN</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page">9781479931422</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kannan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ziani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kearns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Neel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19</title>
				<editor>
			<persName><forename type="first">J</forename><surname>Dy</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<meeting>the Conference on Fairness, Accountability, and Transparency, FAT* &apos;19<address><addrLine>New York, NY, USA; Stockholm, Sweden</addrLine></address></meeting>
		<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2018">2019. 2018</date>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="2564" to="2572" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Inherent Trade-Offs in the Fair Determination of Risk Scores</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kleinberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mullainathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Raghavan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">8th Innovations in Theoretical Computer Science Conference (ITCS 2017)</title>
				<editor>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>Papadimitriou</surname></persName>
		</editor>
		<meeting><address><addrLine>Germany</addrLine></address></meeting>
		<imprint>
			<publisher>Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="page">23</biblScope>
		</imprint>
	</monogr>
	<note>Leibniz International Proceedings in Informatics (LIPIcs)</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Causal Multi-Level Fairness</title>
		<author>
			<persName><forename type="first">V</forename><surname>Mhasawade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Chunara</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, AIES &apos;21</title>
				<meeting>the 2021 AAAI/ACM Conference on AI, Ethics, and Society, AIES &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery. ISBN</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page">9781450384735</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Hotspot diagnosis for solar photovoltaic modules using a Naive Bayes classifier</title>
		<author>
			<persName><forename type="first">G</forename><surname>Morina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Waton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Marusic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Georgatzis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">A K</forename><surname>Niazi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Akhtar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Athar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.01468</idno>
	</analytic>
	<monogr>
		<title level="j">Solar Energy</title>
		<imprint>
			<biblScope unit="volume">190</biblScope>
			<biblScope unit="page" from="34" to="43" />
			<date type="published" when="2019">2019. 2019</date>
		</imprint>
	</monogr>
	<note>Auditing and Achieving Intersectional Fairness in Classification Problems</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">General Fair Empirical Risk Minimization</title>
		<author>
			<persName><forename type="first">L</forename><surname>Oneto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Donini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pontil</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Neural Networks (IJCNN), 1-8</title>
				<meeting><address><addrLine>Glasgow, United Kingdom</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Group Fairness: Independence Revisited</title>
		<author>
			<persName><forename type="first">T</forename><surname>Räz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery. ISBN</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page">9781450383097</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">On conditional parity as a notion of non-discrimination in machine learning</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Ritov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhao</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1706.08519</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">A Collaborative Filtering Approach Based on Naïve Bayes Classifier</title>
		<author>
			<persName><forename type="first">H</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mhasawade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Chunara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cobos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lara-Cabrera</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2021. 2019</date>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="108581" to="108592" />
		</imprint>
	</monogr>
	<note>Fairness Violations and Mitigation under Covariate Shift</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Avoiding Disparity Amplification under Different Worldviews</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yeom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Tschantz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery. ISBN</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page">9781450383097</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Fairness Constraints: Mechanisms for Fair Classification</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">B</forename><surname>Zafar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Valera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Rogriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Gummadi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th International Conference on Artificial Intelligence and Statistics</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</editor>
		<meeting>the 20th International Conference on Artificial Intelligence and Statistics<address><addrLine>Ft. Lauderdale, FL, USA</addrLine></address></meeting>
		<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="962" to="970" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yatskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ordonez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chang</surname></persName>
		</author>
		<idno>CoRR, abs/1707.09457</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
