
10.1145/3287560.3287586

fairret: a Framework for Differentiable Fairness Regularization Terms

MaryBeth Defrance

Maarten Buyl

Tijl De Bie

Ghent University, Belgium

2024


Current fairness toolkits in machine learning only admit a limited range of fairness definitions and have seen little integration with automatic differentiation libraries, despite the central role these libraries play in modern machine learning pipelines. We present a framework of fairness regularization terms (fairrets) which quantify bias as modular, flexible objectives that are easily integrated in automatic differentiation pipelines. By employing a general definition of fairness through linear-fractional statistics, many group fairness definitions can be enforced. Experiments show minimal loss of predictive power compared to baselines. Our contribution includes a PyTorch implementation of the fairret library.

Keywords: fairness, machine learning, library, automatic differentiation, fairness definitions

1. Introduction

The field of AI fairness has been concerned with formalizing ethical concepts of discrimination and bias in technical definitions that can be assessed and pursued in AI systems [1]. A popular paradigm for this formalization in binary classification is to use group fairness definitions [2], which require the model's predictions to treat people from different sensitive groups similarly.

Despite ample research on group fairness definitions and methods to achieve them, an easy-to-use and flexible implementation has not yet been realized. Popular fairness toolkits such as Fairlearn [3] and AIF360 [4] expect the underlying model in the form of scikit-learn Estimators [5] that can be retrained at will in fairness meta-algorithms, but this aligns poorly with the paradigm of automatic differentiation libraries like PyTorch [6], which have become the bedrock of modern machine learning pipelines. These toolkits only integrate with automatic differentiation in their implementations of adversarial fairness [7], but these still require full control over the training process and lack generality in the fairness notions they can enforce.

We formally propose the fairret framework in an effort to resolve these issues. At its core, the framework uses fairness regularization terms (fairrets) that can be easily integrated into PyTorch-based pipelines (an example is given in Appendix A). They pursue any fairness definition expressed as a parity between statistics in a linear-fractional notation, which covers all group fairness definitions considered by Verma and Rubin [2]. Thus, all these definitions are fully compatible with any fairret in any differentiable model.

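[Figure 1: A fairret $R_\gamma$ combines a fairness definition ($\gamma$) with a regularization term ($R$). Fairness definitions: DP, PP, Cond. DP, FORP, EOpp, Acc. Eq., Pred. Eq., Tr. Eq., or a custom (linear-fractional) statistic. Regularization term: Norm.]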

In contrast to Fairlearn and AIF360, our proposed fairrets act as a loss term that can simply be added within a training step. Two PyTorch-specific projects with goals similar to ours are FairTorch [8] and the Fair Fairness Benchmark (FFB) [9]. However, neither presents a formal framework and both only support a limited range of fairness definitions.

This work is an extended abstract of a full paper [10] presented at the International Conference on Learning Representations (ICLR) 2024. The implementation of our framework is available at https://github.com/aida-ugent/fairret, which we are currently extending into a full library.

2. How to build your fairret

A fairret is defined by two elements. The first is the fairness definition it aims to satisfy. The second is the method used to evaluate the model with regard to that fairness definition. Figure 1 illustrates this combination and lists the definitions and methods already integrated into the framework.

2.1. Fairness definitions

Let $X \in \mathbb{R}^{d_x}$ denote the feature vector of an individual, $S \in \mathbb{R}^{d_s}$ their sensitive feature vector and $Y \in \{0, 1\}$ a binary output label. We want to learn a probabilistic classifier $f$ such that its predictions $f(X)$ match $Y$ while minimizing disparities over different $S$. Our definition of sensitive features $S$ as real-valued, $d_s$-dimensional vectors allows us to take a mix of multiple sensitive traits into account, both discrete and continuous. Categorical sensitive features are one-hot encoded, e.g. by encoding 'white' or 'non-white' as the vectors $S = (1, 0)^\top$ and $S = (0, 1)^\top$ respectively. The variable $S_k$ denotes the $k$-th sensitive feature.

We use a simplified version of the solution from Celis et al. [11] to express fairness definitions as a parity between linear-fractional statistics $\gamma$:

$$\gamma(k; f) = \frac{E[S_k(\alpha_0(X, Y) + f(X)\,\beta_0(X, Y))]}{E[S_k(\alpha_1(X, Y) + f(X)\,\beta_1(X, Y))]} \tag{1}$$

with $\alpha_0$, $\alpha_1$, $\beta_0$, and $\beta_1$ functions that do not depend on $S$ or $f$. Table 1 shows the statistic $\gamma$ for a range of fairness definitions, defined through their $\alpha$ and $\beta$ functions.

The set $\mathcal{F}_\gamma$ of probabilistic classifiers that adhere to the fairness definition is expressed as

$$\mathcal{F}_\gamma \triangleq \{ f : \mathbb{R}^{d_x} \to \{0, 1\} \mid \forall k \in [d_s] : \gamma(k; f) = \overline{\gamma}(f) \}. \tag{2}$$
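To make the linear-fractional form of Eq. (1) concrete, the statistic can be estimated from samples as a ratio of two $S_k$-weighted sums. The following is a minimal sketch in plain Python, not the fairret library's implementation: the function name `linear_fractional_statistic`, the helper functions and the toy data are all hypothetical. The two coefficient choices shown are assumptions that follow directly from Eq. (1): with $\alpha_0 = 0$, $\beta_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0$ the ratio becomes the positive rate used in demographic parity, and with $\beta_0 = \alpha_1 = Y$ it becomes the true positive rate.

```python
def linear_fractional_statistic(k, feats, sens, labels, preds,
                                alpha_0, beta_0, alpha_1, beta_1):
    """Empirical estimate of gamma(k; f) as written in Eq. (1).

    sens[i][k] is S_k for sample i and preds[i] is f(X) for sample i.
    The alpha/beta arguments are functions of (x, y) only.
    """
    num = 0.0
    den = 0.0
    for x, s, y, p in zip(feats, sens, labels, preds):
        num += s[k] * (alpha_0(x, y) + p * beta_0(x, y))
        den += s[k] * (alpha_1(x, y) + p * beta_1(x, y))
    # Both expectations share the same 1/n factor, so it cancels in the ratio.
    return num / den


# Hypothetical coefficient functions (illustrative, not taken from Table 1).
def zero(x, y):
    return 0.0


def one(x, y):
    return 1.0


def label(x, y):
    return float(y)


# Toy data: six individuals, two one-hot encoded sensitive groups.
feats = [[0.0]] * 6  # features are unused by the coefficient functions above
sens = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
labels = [1, 0, 1, 1, 1, 0]
preds = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1]

# Positive rate per group: E[S_k f(X)] / E[S_k].
positive_rate = [
    linear_fractional_statistic(k, feats, sens, labels, preds,
                                zero, one, one, zero)
    for k in range(2)
]

# True positive rate per group: E[S_k Y f(X)] / E[S_k Y].
true_positive_rate = [
    linear_fractional_statistic(k, feats, sens, labels, preds,
                                zero, label, label, zero)
    for k in range(2)
]

print(positive_rate)
print(true_positive_rate)
```

On this toy data the positive rates are approximately 0.6 and 0.37, and the true positive rates approximately 0.8 and 0.5, so both statistics differ between the two groups.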

A classifier belongs to $\mathcal{F}_\gamma$ when the statistic $\gamma(k; f)$ for each sensitive feature $k$ equals the overall statistic

$$\overline{\gamma}(f) \triangleq \frac{E[\alpha_0(X, Y) + f(X)\,\beta_0(X, Y)]}{E[\alpha_1(X, Y) + f(X)\,\beta_1(X, Y)]}$$

computed independently of the sensitive features. By fixing $\overline{\gamma}$ to a constant $c \in \mathbb{R}$, any fairness definition can be enforced with a linear constraint:

$$\gamma(k; f) = c \iff E[S_k(\alpha_0(X, Y) - c\,\alpha_1(X, Y) + f(X)(\beta_0(X, Y) - c\,\beta_1(X, Y)))] = 0 \tag{3}$$

2.2. Regularization terms

The bias of a parameterized, probabilistic classifier $h$ is quantified as a fairret that can be minimized through automatic differentiation, in addition to any existing loss function $\mathcal{L}$:

$$\min_h \; \mathcal{L}(h) + \lambda R_\gamma(h) \tag{4}$$

where $R_\gamma(h)$ is the fairret for the fairness definition with statistic $\gamma$ and strength $\lambda \in \mathbb{R}_{>0}$.

The fairret framework admits many kinds of regularizers, due to the practical form of the statistics $\gamma$. Two types are currently integrated, namely violation and projection fairrets.

We first discuss the Norm fairret, a type of violation fairret:

$$R_\gamma(h) \triangleq \left\lVert \left( \frac{\gamma(k; h)}{\overline{\gamma}(h)} - 1 \right)_{k \in [d_s]} \right\rVert \tag{5}$$

with $\lVert \cdot \rVert$ a norm over $\mathbb{R}^{d_s}$. Such a regularization term has been proposed several times [17, 18, 19], though without the same degree of modularity with respect to $\gamma$.

Second, an example of a projection fairret is the $D_{\mathrm{KL}}$-projection:

$$R_\gamma(h) \triangleq \min_{f \in \mathcal{F}_\gamma(\overline{\gamma}(h))} E[D_{\mathrm{KL}}(f(X) \,\|\, h(X))] \tag{6}$$

with $D_{\mathrm{KL}}$ the Kullback-Leibler divergence and $\mathcal{F}_\gamma(\overline{\gamma}(h))$ the set of fair classifiers whose statistics are fixed to the constant $\overline{\gamma}(h)$. The fairret maps $h$ onto the closest fair model $f \in \mathcal{F}_\gamma(\overline{\gamma}(h))$. Projection fairrets generalize some prior work [20, 21, 22] to all definitions with linear-fractional statistics, as they are enforced with linear constraints using Eq. (3).

3. Experiments

Experiments were conducted on the LawSchool¹ and ACSIncome [23] datasets. Each dataset has multiple sensitive features, including some continuous ones. Figure 2 shows the results of the experiments. Each point represents a specific fairret optimized for that statistic with a certain strength $\lambda$. A Naive baseline with $\lambda = 0$ is also included. In the full paper, the evaluation is done on two additional datasets and the fairrets are compared to existing methods [10].

[Figure 2: Results of the experiments. Recoverable axis label: DP violation.]

The results in Figure 2 show that the performance of a fairret depends on the dataset itself and the fairness definition it aims to satisfy. The non-linear (yet still linear-fractional) fairness statistics like predictive parity and treatment equality seem more difficult to minimize. This leads us to conclude that no single fairret can be chosen as the optimal solution, but rather that the best fairret depends on the fairness definition and the dataset.

4. Conclusion

The fairret framework allows for a wide range of fairness definitions by comparing linear-fractional statistics for each sensitive feature. We implement several fairrets and show how they are easily integrated in existing machine learning pipelines utilizing automatic differentiation. More details can be found in the full paper [10].

¹ Curated and published by the SEAPHE project.

Acknowledgments

The research leading to these results has received funding from the Special Research Fund (BOF) of Ghent University (BOF20/DOC/144 and BOF20/IBF/117), from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme, and from the FWO (project no. G0F9816N, 3G042220, G073924N).

A. Code Use Examples
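As a library-independent illustration of what a norm-based fairret evaluates, the sketch below computes the Norm fairret of Section 2.2 for the true positive rate statistic in plain Python. It is not the fairret library's API: the function names `tpr_statistics` and `norm_fairret` and the toy data are hypothetical, the L1 norm is an assumed choice (the text only requires some norm over $\mathbb{R}^{d_s}$), and no gradients are computed. In a PyTorch pipeline the same arithmetic would be carried out on tensors so that automatic differentiation can track it.

```python
def tpr_statistics(sens, labels, preds):
    """Return ([gamma(k; h) for each k], gamma_bar(h)) for the true positive rate.

    Per sensitive feature: E[S_k * Y * h(X)] / E[S_k * Y].
    Overall:               E[Y * h(X)] / E[Y].
    """
    d_s = len(sens[0])
    num = [0.0] * d_s
    den = [0.0] * d_s
    num_all = 0.0
    den_all = 0.0
    for s, y, p in zip(sens, labels, preds):
        for k in range(d_s):
            num[k] += s[k] * y * p
            den[k] += s[k] * y
        num_all += y * p
        den_all += y
    per_feature = [n / d for n, d in zip(num, den)]
    return per_feature, num_all / den_all


def norm_fairret(sens, labels, preds):
    """L1 norm of (gamma(k; h) / gamma_bar(h) - 1) over all sensitive features k."""
    per_feature, overall = tpr_statistics(sens, labels, preds)
    return sum(abs(g / overall - 1.0) for g in per_feature)


# Toy data: six individuals, two one-hot encoded sensitive groups.
sens = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
labels = [1, 0, 1, 1, 1, 0]

biased_preds = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1]   # TPR differs between groups
uniform_preds = [0.5] * 6                        # identical TPR in both groups

biased_value = norm_fairret(sens, labels, biased_preds)
uniform_value = norm_fairret(sens, labels, uniform_preds)

# The regularized objective of Eq. (4) would add lam * fairret to a task loss.
lam = 1.0  # hypothetical strength
print(biased_value, uniform_value, lam * biased_value)
```

The fairret is zero when every group has the same true positive rate as the population and grows with the relative deviation of each group, which is the quantity a training step would add to its loss.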

Listing 1: Example use of the fairret library in a simple PyTorch setup.

Listing 1 displays a code example of how the fairret can easily be deployed in a typical PyTorch [6] setup. It suffices to simply load a subclass of LinearFractionalStatistic and pass it on to a fairret implementation instance such as NormLoss (as defined in Def. 7). The fairret is then used to compute the quantification of unfairness as a loss like any other in PyTorch. In this case, we use the true positive rate statistic to pursue the fairness notion of equalized opportunity (EO).

B. Confidence Ellipses

The confidence ellipses we use in Fig. 2 are uncommon in machine learning literature. Yet, they work well for our purpose of comparing trade-offs between metrics that may be noisy depending on randomness during training and dataset split selection.

Recall that 1-dimensional confidence intervals typically assume a mean estimator to be normally distributed. The confidence interval then denotes the uncertainty of the sample mean using the standard error. Similarly, confidence ellipses assume a 2-dimensional point, i.e. the 2-dimensional mean estimator, to have a multivariate normal distribution that can be characterized through the sample mean and standard error statistics.

Our implementation of the confidence ellipses follows a featured implementation on matplotlib². However, a crucial difference is that this implementation computes a confidence interval for a 2-dimensional random variable based on the covariance matrix for the standard deviation of samples of that variable. Following observations by Schubert and Kirchner [24], we instead want to show the uncertainty of the mean estimator, which should use the standard deviation of that estimator, i.e. the covariance for the standard error. This is accomplished by dividing the covariance matrix in the matplotlib implementation by the number of seeds (5) we use in our experiments.

² https://matplotlib.org/3.7.0/gallery/statistics/confidence_ellipse.html
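The rescaling described above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the authors' plotting code: the function names and the five toy (fairness violation, accuracy) pairs are hypothetical, and the unbiased sample covariance (division by n - 1) is assumed, which matches the default of `np.cov` used in the matplotlib example. Dividing that covariance matrix by the number of seeds gives the covariance of the 2-dimensional mean estimator, from which an ellipse can then be drawn.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Unbiased 2x2 sample covariance matrix of paired observations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def covariance_of_mean(xs, ys):
    """Covariance of the 2-D mean estimator: sample covariance divided by n."""
    n = len(xs)
    return [[c / n for c in row] for row in sample_covariance(xs, ys)]


# Hypothetical results of one configuration over 5 random seeds.
violation = [0.10, 0.12, 0.08, 0.11, 0.09]
accuracy = [0.80, 0.82, 0.79, 0.81, 0.78]

cov_samples = sample_covariance(violation, accuracy)
cov_mean = covariance_of_mean(violation, accuracy)

# Standard errors of the two means are the square roots of the diagonal.
standard_errors = (cov_mean[0][0] ** 0.5, cov_mean[1][1] ** 0.5)
print(cov_mean, standard_errors)
```

Because every entry is divided by the same count, the ellipse keeps its orientation and aspect ratio and only shrinks, by a factor of the square root of 5 along each axis for five seeds.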