<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Measuring and Mitigating Bias for Tabular Datasets with Multiple Protected Attributes ⋆</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Manh</forename><forename type="middle">Khoi</forename><surname>Duong</surname></persName>
							<email>manh.khoi.duong@hhu.de</email>
							<affiliation key="aff0">
								<orgName type="institution">Heinrich Heine University</orgName>
								<address>
									<addrLine>Universitätsstraße 1</addrLine>
									<postCode>40225</postCode>
									<settlement>Düsseldorf</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Conrad</surname></persName>
							<email>stefan.conrad@hhu.de</email>
							<affiliation key="aff0">
								<orgName type="institution">Heinrich Heine University</orgName>
								<address>
									<addrLine>Universitätsstraße 1</addrLine>
									<postCode>40225</postCode>
									<settlement>Düsseldorf</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Measuring and Mitigating Bias for Tabular Datasets with Multiple Protected Attributes ⋆</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9A821F217827FA2D862BBA8BB6EF823E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:37+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Machine Learning</term>
					<term>Bias Mitigation</term>
					<term>Intersectional Discrimination</term>
					<term>Fairness</term>
					<term>AI Act</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Motivated by the recital (67) of the current corrigendum of the AI Act in the European Union, we propose and present measures and mitigation strategies for discrimination in tabular datasets. We specifically focus on datasets that contain multiple protected attributes, such as nationality, age, and sex. This makes measuring and mitigating bias more challenging, as many existing methods are designed for a single protected attribute. This paper comes with a twofold contribution: Firstly, new discrimination measures are introduced. These measures are categorized in our framework along with existing ones, guiding researchers and practitioners in choosing the right measure to assess the fairness of the underlying dataset. Secondly, a novel application of an existing bias mitigation method, FairDo, is presented. We show that this strategy can mitigate any type of discrimination, including intersectional discrimination, by transforming the dataset. By conducting experiments on real-world datasets (Adult, Bank, COMPAS), we demonstrate that de-biasing datasets with multiple protected attributes is possible. All transformed datasets show a reduction in discrimination, on average by 28%. Further, these datasets do not compromise any of the tested machine learning models' performances significantly compared to the original datasets. Conclusively, this study demonstrates the effectiveness of the mitigation strategy used and contributes to the ongoing discussion on the implementation of the European Union's AI Act.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Discrimination in artificial intelligence (AI) applications is a growing concern since the adoption of the AI Act by the European Parliament on March 13, 2024 <ref type="bibr" target="#b0">[1]</ref>. It still remains a significant challenge across numerous domains <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. To prevent biased outcomes, pre-processing methods are often used to mitigate biases in datasets before training machine learning models <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>. The current corrigendum of the AI Act <ref type="bibr" target="#b0">[1]</ref> emphasizes this in Recital (67): " <ref type="bibr">[...]</ref> The data sets should also have the appropriate statistical properties, including as regards the persons or groups of persons in relation to whom the high-risk AI system is intended to be used, with specific attention to the mitigation of possible biases in the data sets <ref type="bibr">[..</ref></p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>.]"</head><p>Since datasets often consist of multiple protected attributes, pre-processing methods should be able to handle these cases. However, only a few works have addressed this issue <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13</ref>] and de-biasing such datasets is still an ongoing research topic. In addition, there is no straightforward approach to managing multiple protected attributes, as shown in Figure <ref type="figure">1</ref>.</p><p>Our paper mainly focuses on how to measure and mitigate discrimination in datasets where multiple protected attributes are present. In our first contribution, we provide a comprehensive categorization of discrimination measuring methods. Besides introducing new measures for some of these cases, we also categorize existing measures from the literature. Some of the listed measures specifically address intersectional discrimination and non-binary groups. The second contribution deals with bias mitigation. For this, we use our published pre-processing framework, FairDo <ref type="bibr" target="#b8">[9]</ref>, that is fairness-agnostic. The fairness-agnostic property makes it possible to define any discrimination measure that should be Color Shape</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Intersectional</head><p>Non-intersectional</p><note type="other">Shape Color</note><p>Figure <ref type="figure">1</ref>: Stick figures can be differentiated by their color and shape. In intersectional discrimination, attributes are intersected, which leads to new subgroups. In non-intersectional, each attribute is treated independently, i.e., colors and shapes are not intersecting in this case.</p><p>minimized. By implementing the introduced measures, we can therefore mitigate biases for multiple protected attributes. Another advantage of FairDo is that it preserves data integrity and does not modify the features of individuals during the optimization process, unlike other methods <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b6">7]</ref>. We evaluated our methodology on popular tabular datasets with fairness concerns, such as Adult <ref type="bibr" target="#b14">[15]</ref>, Bank <ref type="bibr" target="#b15">[16]</ref>, and COMPAS <ref type="bibr" target="#b16">[17]</ref>. We used different discrimination measures to evaluate the effectiveness of the bias mitigation process. Because a successful mitigation process does not guarantee that the outcomes of machine learning models are fair, we trained machine learning models on the transformed datasets and evaluated their predictions regarding fairness and performance. The code for the experiments can be found in the accompanying repository: https://github.com/mkduong-ai/fairdo/evaluation.</p><p>The results of the bias mitigation process as well as the performance of the machine learning models are promising. They indicate that achieving fairness in datasets with multiple protected attributes is possible, and FairDo is a proper framework for this task. Overall, our work contributes technical solutions for stakeholders to enhance the fairness of datasets and machine learning models, aiming for compliance with the AI Act <ref type="bibr" target="#b0">[1]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Preliminaries</head><p>To handle multiple protected attributes, we define 𝒵 = {𝑍 1 , . . . , 𝑍 𝑝 } as a set of protected attributes. It can represent the set of sociodemographic features such as age, gender, and ethnicity. These factors may make individuals vulnerable to discrimination. Each protected attribute 𝑍 𝑘 ∈ 𝒵 is formally a discrete random variable that can take on values from the sample space 𝑔 𝑘 . In this context, we refer 𝑔 𝑘 to groups that describe distinct social categories of a protected attribute. For example, let 𝑍 𝑘 represent gender; then 𝑔 𝑘 is a set containing the genders male, female, and non-binary. To avoid limitations to a particular group fairness notion, we introduce a generalized notation based on the works of Žliobaitė <ref type="bibr" target="#b1">[2]</ref>, Duong and Conrad <ref type="bibr" target="#b8">[9]</ref> in the following.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 2.1 (Treatment).</head><p>Let 𝐸 1 , 𝐸 2 be events and 𝑍 𝑘 be a random variable that can take on values from 𝑔 𝑘 , then we call the conditional probability</p><formula xml:id="formula_0">𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖)</formula><p>treatment, where 𝑖 ∈ 𝑔 𝑘 . 𝐸 1 describes some favorable outcome, such as getting accepted for a job, while 𝐸 2 often represents some additional information about the individual, such as their qualifications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 2.2 (Fairness Criteria).</head><p>With the definition of treatment, we can define fairness criteria that demand equal treatment for different groups. Let 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖) and 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑗) be treatments, then we call the following equation:</p><formula xml:id="formula_1">𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖) = 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑗)</formula><p>a fairness criterion, for all 𝑖, 𝑗 ∈ 𝑔 𝑘 . Definition 2.2 allows us to define various group fairness criteria, including statistical parity <ref type="bibr" target="#b17">[18]</ref>, predictive parity <ref type="bibr" target="#b2">[3]</ref>, equality of opportunity <ref type="bibr" target="#b18">[19]</ref>, etc. They all demand some sort of equal outcome for different groups and can be defined by configuring the events 𝐸 1 , 𝐸 2 . For instance, statistical parity <ref type="bibr" target="#b17">[18]</ref> requires that two different groups have an equal probability of receiving a favorable outcome (𝑌 = 1).</p><p>Example 2.1 (Statistical Parity <ref type="bibr" target="#b17">[18]</ref>). To define statistical parity for the attribute 𝑍 𝑘 using our notation, we set 𝐸 1 := (𝑌 = 1) and 𝐸 2 := Ω. By setting 𝐸 2 to the sample space Ω, we compare the probabilities of the event 𝑌 = 1 across different groups without conditioning on any additional event:</p><formula xml:id="formula_2">𝑃 (𝑌 = 1 | Ω, 𝑍 𝑘 = 𝑖) = 𝑃 (𝑌 = 1 | Ω, 𝑍 𝑘 = 𝑗) ⇐⇒ 𝑃 (𝑌 = 1 | 𝑍 𝑘 = 𝑖) = 𝑃 (𝑌 = 1 | 𝑍 𝑘 = 𝑗),</formula><p>where 𝑖, 𝑗 ∈ 𝑔 represent different groups.</p><p>In real-world applications, achieving equal probabilities for certain outcomes is not always possible. Due to variations in sample sizes in the groups, it is common to yield unequal treatments, even when they are similar. Thus, existing literature <ref type="bibr" target="#b1">[2]</ref> uses the absolute difference to quantify the strength of discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 2.3 (Disparity</head><formula xml:id="formula_3">). Let 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖) and 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑗) be two treatments, then we refer to 𝛿 𝑍 𝑘 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 ) = |𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖) − 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑗)|</formula><p>as the disparity, for all 𝑖, 𝑗 ∈ 𝑔 𝑘 . Trivially, 𝛿 𝑍 𝑘 is commutative regarding 𝑖, 𝑗. In practice, it prevents reverse discrimination due to the absolute value.</p><p>Definition 2.4 (Discrimination). We use 𝜓 : D → R to denote some discrimination measure that quantifies the discrimination inherent in any dataset 𝒟 ∈ D. A dataset 𝒟 consists of features, protected attributes, and labels for each individual. The explicit form of 𝜓 depends on the cases introduced in Section 3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Measuring Discrimination for Multiple Attributes</head><p>We found that numerous scenarios arise when dealing with multiple protected attributes. We categorize these scenarios based on the number of groups, denoted as |𝑔|, and the number of protected attributes, denoted as |𝒵|. By going through all cases, we present possible approaches from the literature as well as our own suggestions to measure discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Single Protected Attribute (|𝒵| = 1)</head><p>In the case of having only one protected attribute, i.e., |𝒵| = |{𝑍 1 }| = 1, we distinguish between cases by the number of available groups |𝑔| in the dataset. We categorize the cases by |𝑔| = 0, 1, 2, and |𝑔| &gt; 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">No Groups (|𝑔| = 0)</head><p>When there are no groups, the measurement of discrimination is impossible if no assumptions are being made. Discrimination can be assessed through proxy variables <ref type="bibr" target="#b19">[20]</ref>; however, this approach can be imprecise and may introduce new biases. This case is equivalent to having no protected attribute, i.e., |𝒵| = 0.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Single Group (|𝑔| = 1)</head><p>Similarly to the case of having no groups, discrimination cannot be measured when having only one group. For this, we propose practices where prior information can be incorporated:</p><p>1. No discrimination: As no difference towards any other group can be measured, returning a discrimination score of 0 is one viable option.</p><formula xml:id="formula_4">𝜓(𝒟) = 0.<label>(1)</label></formula><p>2. Difference to optimal treatment: Another way is to return the absolute difference of the group's outcome to the optimal treatment. For example, group 𝑖 has an 80% chance of receiving the favorable treatment. Ideally, having a 100% chance would represent the optimal scenario. Therefore, the discrimination score is 20% in this case. It is given by:</p><formula xml:id="formula_5">𝜓(𝒟) = |𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑖) − 1|.<label>(2)</label></formula><p>3. Difference to expected treatment: We can use the expected treatment as a reference point. For example, we know that a company has a 50% acceptance rate for job applications. Now a machine learning classifier is trained to predict whether an applicant will be accepted and the model's predictions result in a 60% acceptance rate for group 𝑖. Hence, the model is positively biased towards group 𝑖 by 10%. This can be formulated as:</p><formula xml:id="formula_6">𝜓(𝒟) = |𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑖) − 𝑝 expect. |,<label>(3)</label></formula><p>where 𝑝 expect. is the expected treatment. It can describe the average treatment across all groups <ref type="bibr" target="#b20">[21]</ref> or some other prior information that is not included in the dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.3.">Binary Groups (|𝑔| = 2)</head><p>Without using any prior information, we can calculate the discrimination score by taking the absolute difference between the treatments of the two groups, as advised by Žliobaitė <ref type="bibr" target="#b1">[2]</ref>. The discrimination measure 𝜓 is then simply given by the disparity as mentioned in Definition 2.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.4.">Non-binary Groups (|𝑔| &gt; 2)</head><p>While the case for binary attributes is straightforward, it becomes non-trivial for non-binary attributes that arise naturally in real-world data. We can fall back to |𝑔| = 2 by calculating the absolute difference between every distinct group 𝑖, 𝑗 ∈ 𝑔. Because the discrimination between 𝑖 and 𝑗 is the same as between 𝑗 and 𝑖, only</p><formula xml:id="formula_7">(︀ |𝑔|<label>2</label></formula><p>)︀ pairs need to be compared and we use an aggregation function agg (1) to report the differences <ref type="bibr" target="#b1">[2]</ref>. Lum et al. <ref type="bibr" target="#b21">[22]</ref> refers to measures that aggregate or summarize discrimination scores as meta-metrics. The aggregate can be the sum or maximum function, depending on the use case. The result for a single protected attribute 𝑍 𝑘 with two or more groups can be computed as follows:</p><formula xml:id="formula_8">𝜓(𝒟) = agg (1) 𝑖,𝑗∈𝑔 𝑘 ,𝑖&lt;𝑗 𝛿 𝑍 𝑘 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 ),<label>(4)</label></formula><p>where 𝛿 𝑍 𝑘 is the disparity as defined in Definition 2.3 and 𝑖 &lt; 𝑗 ensures that each pair is considered only once (assuming label-encoded groups). According to Žliobaitė <ref type="bibr" target="#b1">[2]</ref> and her personal discussions with legal experts, she advocates using the maximum function, i.e.,</p><formula xml:id="formula_9">𝜓(𝒟) = max 𝑖,𝑗∈𝑔 𝑘 ,𝑖&lt;𝑗 𝛿 𝑍 𝑘 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 )<label>(5)</label></formula><p>= max</p><formula xml:id="formula_10">𝑖∈𝑔 𝑘 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑖) − min 𝑗∈𝑔 𝑘 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 𝑘 = 𝑗).<label>(6)</label></formula><p>Equation ( <ref type="formula" target="#formula_9">5</ref>) describes the maximum discrimination obtainable between two groups. An alternative and equivalent formulation is given in Equation ( <ref type="formula" target="#formula_10">6</ref>) <ref type="bibr" target="#b6">[7]</ref>. The latter is computationally more efficient as it requires 𝒪(2|𝑔|) operations compared to 𝒪(|𝑔| 2 ) operations for the former.</p><p>A more general approach to measuring discrimination is to calculate some form of correlation coefficient between the protected attribute and the outcome. The correlation coefficient can be calculated using Pearson's correlation <ref type="bibr" target="#b22">[23]</ref>, Spearman or Kendall's rank correlation <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25]</ref>. The discrimination measure can then be defined as the absolute value of the correlation coefficient:</p><formula xml:id="formula_11">𝜓(𝒟) = |Corr(𝐸 1 , 𝑍 𝑘 )|.<label>(7)</label></formula><p>This approach can be applied to any number of groups. Fairlearn provides a pre-processing method that removes the correlation between the protected attribute and the outcome by transforming the data <ref type="bibr" target="#b6">[7]</ref>. However, the given approach violates data integrity constraints as categorical attributes are transformed into continuous values. Moreover, zero correlation does not imply independence between two variables.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Multiple Protected Attributes (|𝒵| &gt; 1)</head><p>There are several ways to measure discrimination for multiple protected attributes (|𝒵| &gt; 1). Based on the works of Kearns et al. <ref type="bibr" target="#b20">[21]</ref>, Yang et al. <ref type="bibr" target="#b10">[11]</ref> and Kang et al. <ref type="bibr" target="#b12">[13]</ref>, we categorize them into two approaches: intersectional and non-intersectional (see Figure <ref type="figure">1</ref>). Intersectional approaches consider the intersection of identities. The overlapping of such identities forms subgroups <ref type="bibr" target="#b20">[21]</ref>. Non-intersectional approaches treat each protected attribute independently <ref type="bibr" target="#b10">[11]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Intersectional Discrimination</head><p>The central idea of intersectionality is that individuals experience overlapping forms of oppression or privilege based on the combination of multiple social categories they belong to. In the following, we will introduce definitions to formulate intersectional discrimination, which is based on the work of Kearns et al. <ref type="bibr" target="#b20">[21]</ref>. </p><formula xml:id="formula_12">𝛿 ^𝒵 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 ) = |𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑖 1 , . . . , 𝑍 𝑝 = 𝑖 𝑝 ) − 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑗 1 , . . . , 𝑍 𝑝 = 𝑗 𝑝 )|.</formula><p>Similarly to Equation (4), we can calculate the discrimination score for multiple protected attributes by aggregating disparities across all subgroups. A subgroup can be treated like a normal group. According to Definition 3.1, there are theoretically at least 2 𝑝 subgroups, where 𝑝 is the number of protected attributes. However, not all subgroups may be available in the dataset. For unavailable subgroups, the disparity cannot be calculated as the corresponding treatment is undefined. Let us denote the set of available subgroups as 𝐺 avail ⊆ 𝑔 1 × . . . × 𝑔 𝑘 . To finally capture the discrepancies across all available subgroup pairs, an aggregation function agg (1) is applied to the subgroup disparities 𝛿 ^𝒵 :</p><p>𝜓 intersect (𝒟) = agg (1)   𝑖,𝑗∈𝐺 avail 𝛿 ^𝒵 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 ). Equation ( <ref type="formula" target="#formula_13">8</ref>) represents the aggregated discrimination between all available subgroups in the dataset. When using the maximum function as the aggregator, the calculations are equivalent to Equation ( <ref type="formula" target="#formula_9">5</ref>) and Equation <ref type="bibr" target="#b5">(6)</ref>. The only difference is that the conditionals are now subgroups instead of groups:</p><formula xml:id="formula_14">𝜓 intersect (𝒟) = max 𝑖,𝑗∈𝐺 avail 𝛿 ^𝑍𝑘 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 )<label>(9)</label></formula><p>= max</p><formula xml:id="formula_15">𝑖∈𝐺 avail 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑖 1 , . . . , 𝑍 𝑝 = 𝑖 𝑝 ) − min 𝑗∈𝐺 avail 𝑃 (𝐸 1 | 𝐸 2 , 𝑍 1 = 𝑗 1 , . . . , 𝑍 𝑝 = 𝑗 𝑝 ).</formula><p>Kang et al. <ref type="bibr" target="#b12">[13]</ref> also dealt with intersectional discrimination in their work by introducing a multivariate random variable 𝑍 where each dimension represents a protected attribute. Their fairness objective is to minimize the mutual information between the outcome and the multivariate random variable. By minimizing the mutual information, the outcome is independent of the protected attributes, which is a desirable property for fairness <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b25">26]</ref>. In this context, zero mutual information implies the absence of intersectional discrimination <ref type="bibr" target="#b12">[13]</ref>. However, this approach relies on expensive techniques to approximate the mutual information. Using our notation, their formulation can be written as <ref type="bibr" target="#b12">[13]</ref>:</p><formula xml:id="formula_16">𝜓 MI (𝒟) = MI(𝐸 1 , 𝑍),<label>(10)</label></formula><p>where MI denotes the mutual information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Non-intersectional Discrimination</head><p>The problem with measuring discrimination for intersectional groups is that it has an upward bias when using meta-metrics <ref type="bibr" target="#b21">[22]</ref>. This is because the number of subgroups grows exponentially with the number of protected attributes. This leads to many subgroups where the number of samples in each subgroup is possibly small, resulting in larger noise in the treatment estimates <ref type="bibr" target="#b21">[22]</ref>. Besides intersectional groups, Yang et al. <ref type="bibr" target="#b10">[11]</ref> listed a non-intersectional definition of groups, called independent groups. Building on the definition of independent groups, we propose an appropriate approach to measure discrimination for this type of groups. It is more suitable when dealing with a large number of subgroups or when intersectional discrimination is not deemed important. Our nonintersectional approach treats each protected attribute independently and aggregates the discrimination scores across all protected attributes. For this, a second aggregate function with agg (2) is introduced, yielding the following equation: (2)   𝑍 𝑘 ∈𝒵</p><formula xml:id="formula_17">𝜓 indep (𝒟) = agg</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>{︃</head><p>agg (1)   𝑖,𝑗∈𝑔 𝑘 ,𝑖&lt;𝑗</p><formula xml:id="formula_18">𝛿 𝑍 𝑘 (𝑖, 𝑗, 𝐸 1 , 𝐸 2 ) }︃ . (<label>11</label></formula><formula xml:id="formula_19">)</formula><p>The first-level aggregator agg (1) aggregates disparities within a protected attribute, considering unique pairs of groups 𝑖 and 𝑗. The second-level aggregator agg (2) then combines the results across all protected attributes. By applying both operators, we obtain a discrimination measure that captures disparities between groups across multiple attributes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Example</head><p>Let us consider a dataset with two protected attributes, age and sex (see Table <ref type="table" target="#tab_1">1</ref>). The set of protected attributes is 𝒵 = {𝑍 1 , 𝑍 2 } = {Age, Sex} and the set of available subgroups in the dataset is 𝐺 avail = {Old, Young} × {Male, Female}. We measure discrimination using statistical disparity. For simplicity, all aggregation functions are set to the maximum function. The intersectional approach yields the following discrimination score:</p><formula xml:id="formula_20">𝜓 intersect (𝒟) = max 𝑖,𝑗∈𝐺 avail 𝛿 ^𝒵 (𝑖, 𝑗, (𝑌 = 1), Ω)<label>(12)</label></formula><p>= max 𝑖,𝑗∈𝐺 avail 𝛿 ^{Age, Sex} (𝑖, 𝑗, (𝑌 = 1), Ω)</p><p>= max</p><formula xml:id="formula_21">𝑖∈𝐺 avail 𝑃 (𝑌 = 1 | 𝑍 1 = 𝑖 1 , 𝑍 2 = 𝑖 2 ) − min 𝑗∈𝐺 avail 𝑃 (𝑌 = 1 | 𝑍 1 = 𝑗 1 , 𝑍 2 = 𝑗 2 ) = |𝑃 (𝑌 = 1 | Age = Old, Sex = Male) − 𝑃 (𝑌 = 1 | Age = Young, Sex = Male)| = 1,</formula><p>while the discrimination score for the non-intersectional approach is given by:</p><formula xml:id="formula_22">𝜓 indep (𝒟) = max 𝑍 𝑘 ∈𝑍 {︂ max 𝑖,𝑗∈𝑔 𝑘 ,𝑖&lt;𝑗</formula><p>𝛿 𝑍 𝑘 (𝑖, 𝑗, (𝑌 = 1), Ω)</p><formula xml:id="formula_23">}︂ (<label>13</label></formula><formula xml:id="formula_24">)</formula><p>= max {︀ 𝛿 Age (Old, Young, (𝑌 = 1), Ω), 𝛿 Sex (Male, Female, (𝑌 = 1), Ω) }︀ = max{|0.5 − 0.5|, |0.5 − 0.5|} = max{0, 0} = 0.</p><p>The non-intersectional approach yields a discrimination score of 0 because the disparities for both protected attributes are 0. This is quite different from the intersectional approach, which reports a discrimination score of 1. As seen, the results can differ depending on the approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head><p>Our experimentation follows a pipeline consisting of data pre-processing, bias mitigation, model training, and evaluation. To mitigate bias in tabular datasets with multiple protected attributes, we used the sampling method, FairDo <ref type="bibr" target="#b8">[9]</ref>, that constructs fair datasets by selectively sampling data points. The method is very flexible and only requires the user to define the discrimination measure that should be minimized. In our case, we are interested in a dataset that has minimal bias across multiple protected attributes. The experiments revolve around the following research questions:</p><p>• RQ1 Is it possible to yield a fair dataset with FairDo, where bias for multiple protected attributes is reduced? • RQ2 Are machine learning models trained on fair datasets more fair in their predictions than those trained on original datasets?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experimental Setup</head><p>Datasets and Pre-processing The tabular datasets employed in our experiments include the Adult <ref type="bibr" target="#b14">[15]</ref>, Bank <ref type="bibr" target="#b15">[16]</ref>, and COMPAS <ref type="bibr" target="#b16">[17]</ref> datasets. They are known for their use in fairness research and contain multiple protected attributes. We pre-processed the datasets by applying one-hot encoding to categorical variables and label encoding to protected attributes. Table <ref type="table" target="#tab_2">2</ref> shows important characteristics of the datasets after pre-processing. Each dataset was divided into training and testing sets using an 80/20 split, respectively. We ensured that the split was stratified (if possible) based on protected attributes to maintain representativeness across different groups in both sets. Bias Mitigation Applying the bias mitigation method FairDo <ref type="bibr" target="#b8">[9]</ref> to the datasets can be regarded as a pre-processing step, too. This is because the method simply returns a dataset that is fair with respect to the given discrimination measure. FairDo <ref type="bibr" target="#b8">[9]</ref> offers a variety of options to mitigate bias, and we chose the undersampling method that removes samples. In this option, the optimization objective is stated as <ref type="bibr" target="#b8">[9]</ref>: min</p><formula xml:id="formula_25">𝒟 fair ⊆𝒟 𝜓(𝒟 fair ),<label>(14)</label></formula><p>where 𝒟 is the training set of Adult, Bank, or COMPAS, and 𝜓 is the fairness objective function. We experimented with both 𝜓 intersect and 𝜓 indep as objectives functions. Bias mitigation is only applied to the training set and the testing set remains unchanged. FairDo internally uses genetic algorithms to select a subset of the training set that minimizes the objective function. We used the same settings and operators as provided in the package and only adjusted the population size (200) and the number of generations (400).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model Training</head><p>We utilized the scikit-learn library <ref type="bibr" target="#b26">[27]</ref> to train various machine learning classifiers, namely Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN). These classifiers were trained on both the original and fair datasets. Classifiers trained on the original datasets serve as a baseline for comparison. We used the default hyperparameters given by scikit-learn package for each classifier.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Evaluation Metrics</head><p>We evaluated the models' predictions on fairness and performance using the test set. For fairness, we assessed 𝜓 intersect and 𝜓 indep . For the classifiers' performances, we report the area under the receiver operating characteristic curve (AUROC) <ref type="bibr" target="#b27">[28]</ref>, where higher values indicate better performances. Because removing data points can compromise the overall quality of the data, we also report the number of subgroups before and after bias mitigation to check for representativeness.</p><p>Trials For each dataset and discrimination measure combination, the bias mitigation process was repeated 10 times. The results were averaged over the trials to obtain a more robust evaluation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Results</head><p>Fair Dataset Generation Table <ref type="table" target="#tab_3">3</ref> shows the average discrimination before and after mitigating bias in the training sets. On all datasets, discrimination was reduced after applying FairDo. Without considering group intersections, discrimination was reduced by 7%, 19%, and 25% for Adult, Bank, and COMPAS, respectively. When considering intersectionality, the discrimination was reduced by 15%, 18%, and 83%. Hence, discrimination was reduced by 28% on average across all datasets, thus answering RQ1 positively. When comparing the discrimination scores, it can be observed that the intersectional discrimination scores are generally higher. This is because in the intersectional setting, more subgroups are considered, which potentially leads to larger differences between them <ref type="bibr" target="#b20">[21]</ref>.</p><p>We also report the number of subgroups before and after bias mitigation to assess the impact of the undersampling method on the dataset. The removal of subgroups can only be observed in the intersectional setting. In the COMPAS dataset 5.2 out of 34 subgroups were removed on average, indicating the largest amount of subgroups removed across all datasets. While the Bank dataset consists of 48 subgroups, only 1.8 subgroups were removed on average. Because the COMPAS dataset's initial intersectional discrimination score is 100%, removing more subgroups seems inevitable to reduce bias.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model Performance and Fairness</head><p>Figure <ref type="figure" target="#fig_0">2</ref> shows the results of the classifiers' performances on the test set. The classifiers' performances are displayed on the y-axis, while the discrimination values are shown on the x-axis. We note that the axes do not share the same scale across the subfigures for analytical purposes.</p><p>Classifiers trained on fair datasets did not suffer a significant decline in performance compared to those trained on original datasets. In all cases, only a slight decrease of 1%-3% in performance can be noted. This indicates that the bias mitigation process does not compromise the dataset's fidelity and, therefore, the classifiers' performances. Regarding discrimination, a significant reduction is evident. The x-axis scales are much larger than the y-axis scales, suggesting that changes in discrimination are larger than changes in performance. For example, the RF classifier trained on the Bank dataset (Figure <ref type="figure" target="#fig_0">2g</ref>) shows a decrease in intersectional discrimination from 38% to 15%, while the performance only decreases by 2%. Similar results can be observed for the other classifiers and datasets as well, successfully addressing RQ2. The results suggest that FairDo can be reliably used to mitigate bias in tabular datasets for various measures that consider multiple protected attributes. Still, we advise users to carefully perform similar analyses when applying the method to their datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Discussion</head><p>The results of our experiments show that the presented measures detect discrimination in datasets with multiple protected attributes differently. When using the intersectional discrimination measure, more groups are identified and compared to each other. While subgroups are not ignored by this measure, measuring higher discrimination scores by random chance becomes more likely <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>. In contrast, treating each protected attribute separately prevents this issue but may lead to overlooking discrimination. The choice of measure is up to the stakeholders and depends on the context of the dataset and the regulations that apply to the AI system. We generally recommend using the intersectional discrimination measure if the number of individuals in each subgroup is large enough to draw statistically significant conclusions. Otherwise, treating each protected attribute separately is more suitable.</p><p>By using the mitigation strategy FairDo <ref type="bibr" target="#b8">[9]</ref>, the resulting datasets in the experiments have improved statistical properties regarding fairness. Whether intersectionality was considered or not, reducing discrimination in datasets was possible. At the current state, the AI Act <ref type="bibr" target="#b0">[1]</ref> does not explicitly mention intersectional discrimination nor how to deal with multiple protected attributes generally. While recital (67) states that datasets "should [...] have the appropriate statistical properties", it does not specify what these properties are. Hence, our work serves as an initial guideline for what these properties could be and how to achieve them in practice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>Datasets often come with multiple protected attributes, which makes measuring and mitigating discrimination more challenging. Most existing studies only deal with a single protected attribute, and works that consider multiple protected attributes often focus on intersectionality. In opposition to this, we proposed a new non-intersectional measure that treats each protected attribute separately. This is more suitable when the number of subgroups is too large or the number of individuals in each subgroup is small. We used both intersectional and non-intersectional measures as objectives and applied the FairDo framework to mitigate discrimination in multiple datasets. The experiments show that discrimination was reduced in all datasets and on average by 28%. Machine learning models trained on the bias-mitigated datasets also improved their fairness while maintaining performance compared to models trained on the original datasets.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Results on the test set. The x-axis represents the discrimination values (legend indicates used measure) and the y-axis represents the classifiers' performances. We compare the pre-processed (fair) data with the original data. The points/stars represent averages, and the error bars display the standard deviations of the AUROC and discrimination values over 10 trials.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Example dataset of individuals receiving a favorable (𝑌 = 1) or unfavorable (𝑌 = 0) outcome. The dataset shows four individuals with their respective age group and sex.</figDesc><table><row><cell cols="2">Individual Age</cell><cell>Sex</cell><cell>Outcome (𝑌 )</cell></row><row><cell>1</cell><cell>Old</cell><cell>Male</cell><cell>1</cell></row><row><cell>2</cell><cell>Old</cell><cell>Female</cell><cell>0</cell></row><row><cell>3</cell><cell cols="2">Young Male</cell><cell>0</cell></row><row><cell>4</cell><cell cols="2">Young Female</cell><cell>1</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Overview of Datasets</figDesc><table><row><cell>Dataset</cell><cell cols="2">Samples Feats. Label</cell><cell cols="3">Protected Attributes</cell><cell>Description</cell></row><row><cell>Adult [15]</cell><cell>32 561</cell><cell>21 Income</cell><cell cols="3">Race: White, Black, Asian-</cell><cell>Indicates individuals</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Pacific-Islander,</cell><cell>American-</cell><cell>earning over $50,000</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">Indian-Eskimo, Other</cell><cell>annually</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Sex: Male, Female</cell><cell></cell></row><row><cell>Bank [16]</cell><cell>41 188</cell><cell>50 Term</cell><cell>Job:</cell><cell cols="2">Admin, Blue-Collar,</cell><cell>Shows whether the</cell></row><row><cell></cell><cell></cell><cell>deposit</cell><cell cols="3">Technician, Services, Manage-</cell><cell>client has subscribed</cell></row><row><cell></cell><cell></cell><cell>subscription</cell><cell cols="3">ment, Retired, Entrepreneur,</cell><cell>to a term deposit.</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Self-Employed,</cell><cell>Housemaid,</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">Unemployed, Student, Unknown</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Marital Status:</cell><cell>Divorced,</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">Married, Single, Unknown</cell></row><row><cell>COMPAS [17]</cell><cell>7 214</cell><cell>13 2-year</cell><cell cols="3">Race: African-American, Cau-</cell><cell>Displays individuals</cell></row><row><cell></cell><cell></cell><cell>recidivism</cell><cell cols="3">casian, Hispanic, Other, Asian,</cell><cell>that were rearrested</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Native American</cell><cell></cell><cell>for a new crime</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">Sex: Male, Female</cell><cell></cell><cell>within 2 years after</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">Age Category: &lt;25, 25-45, &gt;45</cell><cell>initial arrest.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc>Average discrimination and number of subgroups before and after pre-processing the training sets with FairDo.</figDesc><table><row><cell>Dataset</cell><cell>Metric</cell><cell cols="4">Disc. Before Disc. After Subgroups Before Subgroups After</cell></row><row><cell>Adult</cell><cell>𝜓 indep</cell><cell>20%</cell><cell>13%</cell><cell>10</cell><cell>10</cell></row><row><cell></cell><cell>𝜓 intersect</cell><cell>31%</cell><cell>16%</cell><cell>10</cell><cell>10</cell></row><row><cell>Bank</cell><cell>𝜓 indep</cell><cell>24%</cell><cell>5%</cell><cell>48</cell><cell>48</cell></row><row><cell></cell><cell>𝜓 intersect</cell><cell>33%</cell><cell>15%</cell><cell>48</cell><cell>46.2</cell></row><row><cell cols="2">COMPAS 𝜓 indep</cell><cell>30%</cell><cell>5%</cell><cell>34</cell><cell>34</cell></row><row><cell></cell><cell>𝜓 intersect</cell><cell>100%</cell><cell>17%</cell><cell>34</cell><cell>28.8</cell></row></table></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Original ψ multi.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fair ψ multi.</head><p>Original ψ intersect. Fair ψ intersect. </p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<ptr target="Ac-cessed:17" />
		<title level="m">Artificial Intelligence Act, Corrigendum</title>
				<imprint>
			<date type="published" when="2024-04-19">19 April 2024. 2024. May 2024</date>
		</imprint>
	</monogr>
	<note>European Commission</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Measuring discrimination in algorithmic decision making</title>
		<author>
			<persName><forename type="first">I</forename><surname>Žliobaitė</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data Mining and Knowledge Discovery</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="1060" to="1089" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Fairness beyond disparate treatment &amp; disparate impact: Learning classification without disparate mistreatment</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">B</forename><surname>Zafar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Valera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Gomez</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Gummadi</surname></persName>
		</author>
		<idno type="DOI">10.1145/3038912.3052660</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th International Conference on World Wide Web</title>
				<meeting>the 26th International Conference on World Wide Web</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Algorithmic decision making and the cost of fairness</title>
		<author>
			<persName><forename type="first">S</forename><surname>Corbett-Davies</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pierson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Feller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Goel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Huq</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="797" to="806" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Fairness and Machine Learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<ptr target="org" />
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>fairmlbook</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Certifying and removing disparate impact</title>
		<author>
			<persName><forename type="first">M</forename><surname>Feldman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Friedler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Moeller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scheidegger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining</title>
				<meeting>the 21th ACM SIGKDD international conference on knowledge discovery and data mining</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="259" to="268" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Fairlearn: A toolkit for assessing and improving fairness in AI</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bird</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dudík</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Edgar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Horn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lutz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Milan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sameki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Walker</surname></persName>
		</author>
		<idno>MSR- TR-2020-32</idno>
		<ptr target="https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
		<respStmt>
			<orgName>Microsoft</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A reductions approach to fair classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Beygelzimer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dudík</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Langford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="60" to="69" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Towards fairness and privacy: A novel data pre-processing optimization framework for non-binary protected attributes</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Duong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Conrad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Data Science and Machine Learning</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Benavides-Prado</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Erfani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Fournier-Viger</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><forename type="middle">L</forename><surname>Boo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Koh</surname></persName>
		</editor>
		<meeting><address><addrLine>Singapore, Singapore</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Nature</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="105" to="120" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Bayesian Modeling of Intersectional Fairness: The Variance of Bias</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Foulds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Keya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
		<idno type="DOI">10.1137/1.9781611976236.48</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="424" to="432" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Fairness with overlapping groups</title>
		<author>
			<persName><forename type="first">F</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cisse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Koyejo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS &apos;20</title>
				<meeting>the 34th International Conference on Neural Information Processing Systems, NIPS &apos;20<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Fair classification with noisy protected attributes: A framework with provable guarantees</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">E</forename><surname>Celis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Keswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">K</forename><surname>Vishnoi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 38th International Conference on Machine Learning</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Meila</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</editor>
		<meeting>the 38th International Conference on Machine Learning<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">139</biblScope>
			<biblScope unit="page" from="1349" to="1361" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Infofair: Information-theoretic intersectional fairness</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Maciejewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Tong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Big Data (Big Data)</title>
				<imprint>
			<date type="published" when="2021">2022. 2021</date>
			<biblScope unit="page" from="1455" to="1464" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Learning fair representations</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Swersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pitassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dwork</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International conference on machine learning</title>
				<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="325" to="333" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Scaling up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid</title>
		<author>
			<persName><forename type="first">R</forename><surname>Kohavi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD&apos;96</title>
				<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="1996">1996</date>
			<biblScope unit="page" from="202" to="207" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A data-driven approach to predict the success of bank telemarketing</title>
		<author>
			<persName><forename type="first">S</forename><surname>Moro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cortez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rita</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Decision Support Systems</title>
		<imprint>
			<biblScope unit="volume">62</biblScope>
			<biblScope unit="page" from="22" to="31" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Machine bias</title>
		<author>
			<persName><forename type="first">J</forename><surname>Larson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Angwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mattu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kirchner</surname></persName>
		</author>
		<ptr target="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Building classifiers with independency constraints</title>
		<author>
			<persName><forename type="first">T</forename><surname>Calders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Kamiran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pechenizkiy</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICDMW.2009.83</idno>
	</analytic>
	<monogr>
		<title level="m">2009 IEEE International Conference on Data Mining Workshops</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="13" to="18" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Equality of opportunity in supervised learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Price</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Srebro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A survey on bias and fairness in machine learning</title>
		<author>
			<persName><forename type="first">N</forename><surname>Mehrabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Morstatter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Saxena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lerman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Galstyan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="1" to="35" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Preventing fairness gerrymandering: Auditing and learning for subgroup fairness</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kearns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Neel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">S</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning</title>
				<editor>
			<persName><forename type="first">J</forename><surname>Dy</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Krause</surname></persName>
		</editor>
		<meeting>the 35th International Conference on Machine Learning</meeting>
		<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="2564" to="2572" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">De-biasing &quot;bias&quot; measurement</title>
		<author>
			<persName><forename type="first">K</forename><surname>Lum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bower</surname></persName>
		</author>
		<idno type="DOI">10.1145/3531146.3533105</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;22</title>
				<meeting>the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT &apos;22<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="379" to="389" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Notes on regression and inheritance in the case of two parents</title>
		<author>
			<persName><forename type="first">K</forename><surname>Pearson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the Royal Society of London</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="page" from="240" to="242" />
			<date type="published" when="1895">1895</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">The proof and measurement of association between two things</title>
		<author>
			<persName><forename type="first">C</forename><surname>Spearman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">American Journal of Psychology</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="72" to="101" />
			<date type="published" when="1904">1904</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">A new measure of rank correlation</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Kendall</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrika</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="81" to="93" />
			<date type="published" when="1938">1938</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Fairness in supervised learning: An information theoretic approach</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ghassami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khodadadian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kiyavash</surname></persName>
		</author>
		<idno type="DOI">10.1109/ISIT.2018.8437807</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Symposium on Information Theory (ISIT)</title>
				<imprint>
			<publisher>IEEE Press</publisher>
			<date type="published" when="2018">2018. 2018</date>
			<biblScope unit="page" from="176" to="180" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine learning in Python</title>
		<author>
			<persName><forename type="first">F</forename><surname>Pedregosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gramfort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thirion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Grisel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Blondel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Prettenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Weiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dubourg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanderplas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cournapeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Perrot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Duchesnay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">An introduction to ROC analysis</title>
		<author>
			<persName><forename type="first">T</forename><surname>Fawcett</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.patrec.2005.10.010</idno>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition Letters</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="861" to="874" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
