<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">ASDF-Dashboard: Automated Subgroup Detection and Fairness Analysis</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jero</forename><surname>Schäfer</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution">Goethe University</orgName>
								<address>
									<settlement>Frankfurt am Main</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Lena</forename><surname>Wiese</surname></persName>
							<email>lwiese@cs.uni-frankfurt.de</email>
							<affiliation key="aff0">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution">Goethe University</orgName>
								<address>
									<settlement>Frankfurt am Main</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">LWDA&apos;22: Lernen</orgName>
								<address>
									<addrLine>Wissen, Daten, Analysen. October 05-07</addrLine>
									<postCode>2022</postCode>
									<settlement>Hildesheim</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">ASDF-Dashboard: Automated Subgroup Detection and Fairness Analysis</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">45F37BF219CD79F8BB217782C198AACE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T15:48+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Artificial Intelligence</term>
					<term>Fairness</term>
					<term>Clustering</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The importance of an equal treatment of individuals by AI models grows drastically due to the demands of modern society. The potential discrimination or favoritism of specific groups of individuals is one of the common perspectives for the evaluation of model behavior. However, most of the available fairness tools require human intervention in the selection of subgroups of interest and therefore expert knowledge.</p><p>In this paper we propose a new tool, the ASDF-Dashboard, which automates the process of subgroup fairness assessment. It automates the subgroup detection by applying a method based on unsupervised clustering algorithms and pattern extraction, easing the usage also for non-expert users.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Research conducted over the past decades has enabled a level of AI featuring complex systems with hundreds of possible applications and an ever-growing interest in their further development. Great effort has gone into improving the developed techniques and algorithms with the goal of optimizing their performance. The more powerful and faster technologies available today facilitate the expressiveness and performance of such AI systems, making them omnipresent and essential in the modern world. Machine learning methods already outperform humans in certain tasks, and AI-supported decision making is no longer a rarity in sensitive fields like finance or medicine. However, the consideration of the societal impact of such potentially life-changing decisions has become an increasingly important objective and, thus, needs to be evaluated in a transparent and critical way. More precisely, AI systems must not only be designed and optimized for performance, accuracy and quality but must also reckon with aspects like transparency, explainability or fairness.</p><p>Fair machine learning models have to treat different individuals equally regardless of their sensitive characteristics, e.g., their gender, race or ethnicity. It is crucial to ensure that no individual experiences discrimination or favoritism by the model's choices as a consequence of their membership in a certain population. Nevertheless, it is challenging to test the behavior of a model for fairness against subgroups when considering the intersections of (sensitive) characteristics, as this causes an exponentially large number of potentially discriminated or favored subgroups to test.
Furthermore, it is not obvious, in general, which characteristics of a dataset or which intersections induce subgroups suffering discrimination by the model, and it usually is infeasible to test each possible intersectional subgroup for a fair treatment. Hence, an automated suggestion of subgroups for the fairness testing is desirable.</p><p>In this work we propose the ASDF-Dashboard tool for the automated subgroup fairness analysis of binary classification models. It implements the previously contributed methodology of automatic subgroup detection using an unsupervised clustering and a subsequent entropy-based pattern extraction <ref type="bibr" target="#b0">[1]</ref> in a user-friendly, web-based interface. The results of the subgroup fairness assessment are visualized in different charts in our dashboard to give the user deep insights into the behavior of the tested binary classification model regarding the detected subgroups. In the following, we outline related work on (automated) subgroup fairness evaluation tools and frameworks in Section 2. Section 3 then presents the definitions of subgroup fairness metrics based on pattern-induced subgroups (Section 3.1) and the previously developed methodology of entropy-based pattern extraction (Section 3.2). We introduce our ASDF-Dashboard and describe its functionality in Section 4. Section 5 briefly refers to our experimental results <ref type="bibr" target="#b0">[1]</ref> and discusses the implementation of the ASDF-Dashboard. Finally, a conclusion and potential directions for future work are given in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Quite a few machine learning tools exist to support AI developers, data scientists and also end users in realizing and understanding a model's behavior when presented with data. Such tools enable the enhancement of the model in the development process, deep analyses of AI systems under various criteria and features, and transparency to end users, who can get an idea of how their data is processed. In particular, the latter point is increasingly interesting as our modern society demands more transparency in AI and an equal treatment of individuals regardless of, for example, their gender or ethnicity. This directly leads to the concept of AI fairness, which can be tested and visualized using diverse tools. A common drawback of such supporting tools is that they are not designed for non-expert users, who lack deeper knowledge and, thus, cannot perform the required interaction with the tool appropriately.</p><p>The Boxer <ref type="bibr" target="#b1">[2]</ref> tool provides the functionality to analyze and compare models for their behavior on the same task in an interactive fashion. It is able to identify intersectional bias in the predictions of the models for subgroups of interest as specified by the tool user. This functionality is also offered by the Fairkit-learn <ref type="bibr" target="#b2">[3]</ref> toolkit in a similar way to monitor the performance and fairness of potentially discriminating models. Models for graph mining tasks can be investigated with a tool called FairRankVis <ref type="bibr" target="#b3">[4]</ref>, which allows the user to explore visualizations of model fairness wrt. individuals and subgroups. Another approach is provided by the What-if tool <ref type="bibr" target="#b4">[5]</ref>, which performs a subgroup fairness analysis and automatically optimizes the classification threshold of the considered model based on the results of the fairness analysis.
Morina et al. <ref type="bibr" target="#b5">[6]</ref> developed a framework that delivers multiple intersectional fairness metrics and estimators. However, none of the previously mentioned tools or frameworks is able to perform a subgroup fairness analysis of a given model automatically, as they all require human intervention when it comes to detecting the intersectional bias. In each of these tools, the user has to specify the subgroups of interest manually before a subgroup fairness metric is applied. Our approach, in contrast, facilitates the subgroup fairness analysis by an automated detection of subgroups for the assessment of the classifier's fairness.</p><p>The FairVis tool <ref type="bibr" target="#b6">[7]</ref> suggests subgroups to the user that were detected automatically by clustering the data with the k-means clustering algorithm and extracting patterns of instance prototypes. The prototypes describe the makeup of the clusters, and the corresponding patterns are obtained from the dominant features matching most of the subgroup members. This means that the aggregation into a cluster leaves most of the individuals of this subgroup with a uniform value for the dominant attribute, which is then extracted as a pattern to match data in the whole dataset. Our ASDF-Dashboard extends the approach behind FairVis by also offering different clustering algorithms for the initial subgroup detection and by refining the pattern extraction with a more intuitive method to quantify the uniformity of a certain feature. Instead of ranking the features by their cluster feature entropy, we apply a configurable, global threshold to identify dominant features independent of the feature domains.</p><p>Another approach to automatically detect subgroups uses frequent-pattern mining on the dataset.
The Divexplorer <ref type="bibr" target="#b7">[8]</ref> tool searches possible patterns to evaluate differences in the model's behavior between subgroups and the whole population in the dataset. The search space of possible patterns is explored exhaustively while considering only patterns with a specific degree of support and dropping less supported patterns. The model fairness regarding the subgroups is then evaluated as the difference in prediction probabilities using the FPR or FNR. Similarly, the DENOUNCER <ref type="bibr" target="#b8">[9]</ref> tool generates possible patterns by traversing the pattern graph and searches for the most general patterns which have support above a given threshold and define subgroups where the model performs poorly (low accuracy). As the space of patterns grows exponentially with the number of features and highly depends on the complexity of the domains of the features, the detection of subgroups and the assessment of the model fairness wrt. the detected subgroups can be very time consuming. Hence, the support thresholds need to be defined very carefully to prune the search space appropriately while also generating patterns inducing meaningful subgroups.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Automated Subgroup Fairness</head><p>The ASDF-Dashboard automatically assesses the subgroup fairness of a binary classifier on a given dataset. To this end, the system detects subgroups in the data by computing a clustering of the data. The found clusters are then either treated as subgroups themselves or, alternatively, general patterns are derived from the clusters. The obtained patterns also induce subgroups for an evaluation of the classifier's fairness. This procedure facilitates the assessment as no set of protected attributes has to be predefined and the intersections of multiple protected attributes are covered implicitly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Subgroup Fairness Metrics</head><p>Formally, we denote a dataset as 𝒟 = {𝑥 1 , . . . , 𝑥 𝑛 } of 𝑛 instances over a set of attributes 𝒜 = {𝐴 1 , . . . , 𝐴 𝑝 } with the possible values 𝑣 ∈ 𝐷𝑜𝑚(𝐴 𝑗 ) for 𝐴 𝑗 ∈ 𝒜. Given a dataset 𝒟 and a subset of protected attributes {𝐴 1 , . . . , 𝐴 𝑞 } ⊆ 𝒜, we define a pattern 𝑃 = (𝑎 1 , . . . , 𝑎 𝑞 ) ∈ 𝐷𝑜𝑚(𝐴 1 ) × • • • × 𝐷𝑜𝑚(𝐴 𝑞 ) over 𝒟 such that an instance 𝑥 = (𝑣 1 , . . . , 𝑣 𝑝 ) satisfies 𝑃 if its attribute values match the pattern values (𝑣 𝑖 = 𝑎 𝑖 for 𝑖 ∈ {1, ..., 𝑞}). Then, 𝑃 partitions 𝒟 into a protected subgroup 𝒟 𝑃 = {𝑥 ∈ 𝒟 | 𝑥 ⊨ 𝑃 } and an unprotected subgroup 𝒟 𝑃 ¯= {𝑥 ∈ 𝒟 | 𝑥 ⊭ 𝑃 } = 𝒟 ∖ 𝒟 𝑃 <ref type="bibr" target="#b0">[1]</ref>. With this notion of patterns that induce subgroups, a binary classification model 𝑀 ^that was trained on a dataset 𝒟 to predict the class 𝑦 ^= 𝑀 ^(𝑥) ∈ {0, 1} of an input instance 𝑥 can be evaluated for its fairness wrt. the performance on the subgroups. The probabilities under which a model 𝑀 ^predicts the positive/negative class label for an instance 𝑥 are denoted as P(𝑦 ^= 1) and P(𝑦 ^= 0), respectively. The classifier 𝑀 ^predicts the class label 𝑐 ∈ {0, 1} for the protected subgroup with probability P(𝑦 ^= 𝑐 | 𝑥 ∈ 𝒟 𝑃 ) and correct or wrong predictions given the real label 𝑔 ∈ {0, 1} are expressed as P(𝑦 ^= 𝑐 | 𝑦 = 𝑔, 𝑥 ∈ 𝒟 𝑃 ). The probabilities for the unprotected group 𝒟 𝑃 ¯are expressed analogously.</p><p>Many subgroup fairness metrics quantify the model fairness by using values derived from confusion matrices <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref> such as the positive predictive value (PPV) or the true positive rate (TPR). Barocas et al. <ref type="bibr" target="#b11">[12]</ref> further categorize subgroup fairness metrics by the three criteria "independence", "separation" and "sufficiency", which relate to most of the proposed fairness definitions.
Regarding independence, a fair classifier satisfies non-discrimination if the classification is statistically independent of the membership in the protected or unprotected subgroup. The rate of acceptance (or denial), i.e., P(𝑦 ^= 1 | 𝑥 ∈ 𝒟 𝑃 ) (or 𝑦 ^= 0), is then equal between the two subgroups. Separation extends this category by also considering a potential correlation between the subgroup membership and the ground-truth class such that the protected and unprotected subgroup should experience equal TPRs and FPRs. Finally, sufficiency requires an independence of the probability for the ground-truth class given a positive or negative prediction. This results in the same positive/negative predictive values for the protected and unprotected subgroup. The ASDF-Dashboard computes three different subgroup fairness metrics for a broader analysis and investigation of the classification model, namely, statistical parity, equal opportunity and equalized odds. In the following, the formulas of these criteria are given in the context of our notion of patterns and the induced protected and unprotected subgroups as introduced in <ref type="bibr" target="#b0">[1]</ref>.</p><p>Statistical parity (Eq. 1) is satisfied if the protected subgroup 𝒟 𝑃 has the same chance for the prediction of a positive outcome (𝑦 ^= 1) as the unprotected subgroup 𝒟 𝑃 ¯ <ref type="bibr" target="#b10">[11]</ref>:</p><formula xml:id="formula_0">P(𝑦 ^= 1 | 𝑥 ∈ 𝒟 𝑃 ) = P(𝑦 ^= 1 | 𝑥 ∈ 𝒟 𝑃 ¯)<label>(1)</label></formula><p>This definition requires that a fair classifier predicts the favorable label (𝑦 ^= 1) with a probability independent of the protected attribute values. The same is also implied for the unfavorable label (𝑦 ^= 0) due to the complementary probability. However, if instances fall into multiple protected groups, statistical parity tends to magnify the bias of the classifier against them <ref type="bibr" target="#b12">[13]</ref>.</p><p>Equal opportunity (Eq.
2) judges a classifier based on the probability of giving instances 𝑥 of the favorable class (𝑦 = 1) a correct prediction, i.e., 𝑥 is assigned the favorable class label by classifier 𝑀 ^. Formally, it is fulfilled if</p><formula xml:id="formula_1">P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ) = P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ¯)<label>(2)</label></formula><p>Assuming equal opportunity, the TPRs for instances regardless of their subgroup membership have to coincide. From Equation 2 it also follows that the probability of a false prediction of the unfavorable class, given that 𝑥 actually is a member of the favorable class, has to be equal between the subgroups (FNR) as</p><formula xml:id="formula_3">P(𝑦 ^= 0 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ) = 1 − P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ).</formula><p>The equalized odds subgroup fairness metric extends the equal opportunity definition by additionally forcing the equality of the subgroups' FPRs:</p><formula xml:id="formula_4">P(𝑦 ^= 1 | 𝑦 = 0, 𝑥 ∈ 𝒟 𝑃 ) = P(𝑦 ^= 1 | 𝑦 = 0, 𝑥 ∈ 𝒟 𝑃 ¯)<label>(3)</label></formula><p>Thus, equalized odds is satisfied if the probabilities of correct positive predictions (Eq. 2) and incorrect positive predictions (Eq. 3) are the same for the protected and unprotected subgroup.</p><p>The previously defined fairness criteria can be used to derive metrics that quantify the subgroup fairness of the binary classifier instead of enforcing the strict equality only. Hence, the model can then be considered fair if the probabilities for an equal treatment are similar and unfair if they are not close. The ASDF-Dashboard relaxes the three fairness criteria as shown in Table <ref type="table">1</ref>.</p><p>For 𝐹 𝑠𝑝𝑑 and 𝐹 𝑒𝑜𝑑 the probability for an instance of the protected subgroup 𝒟 𝑃 is subtracted from the probability for an instance of the unprotected subgroup 𝒟 𝑃 ¯.
The equalized odds metric 𝐹 𝑎𝑜𝑑 is computed as the average of the equal opportunity metric and the difference between the probability for an incorrect positive prediction by the classifier on the unprotected and protected subgroup. These relaxations of the three fairness criteria are implemented as fairness metrics in the "AI Fairness 360" toolkit <ref type="bibr" target="#b13">[14]</ref>. Alternatively, ratios of the subgroup probabilities can be computed, e.g., as applied in the 𝜖-differential fairness definitions <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b14">15]</ref> of statistical parity, equal opportunity or equalized odds. Whenever one of these fairness metrics given a pattern 𝑃 over dataset 𝒟 is close to zero, the classifier produces fair results on individuals from the subgroup 𝒟 𝑃 as they are treated similarly to the rest of the population. Fairness metric values less than zero indicate favoritism of the individuals from 𝒟 𝑃 over the rest of the population due to a higher probability for a positive prediction. If the fairness metrics, in contrast, yield a value greater than zero, the classifier discriminates against individuals from the protected subgroup 𝒟 𝑃 according to the underlying fairness definition.</p></div>
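<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the relaxed metrics of Table 1 concrete, the following is a minimal Python sketch (our own illustration, not the ASDF-Dashboard implementation; the function names and the boolean protected-group mask are assumptions) computing 𝐹 𝑠𝑝𝑑 , 𝐹 𝑒𝑜𝑑 and 𝐹 𝑎𝑜𝑑 as unprotected-minus-protected differences from the ground-truth labels and the classifier's predictions:</p><p>

```python
import numpy as np

def rate(y_pred, y_true, mask, given=None):
    """Estimate P(y_hat = 1 | y = given, x in subgroup); given=None drops the condition on y."""
    sel = mask if given is None else mask & (y_true == given)
    return y_pred[sel].mean()

def fairness_metrics(y_true, y_pred, protected):
    """Relaxed subgroup fairness metrics (unprotected minus protected, cf. Table 1)."""
    unprot = ~protected
    f_spd = rate(y_pred, y_true, unprot) - rate(y_pred, y_true, protected)           # statistical parity diff.
    f_eod = rate(y_pred, y_true, unprot, 1) - rate(y_pred, y_true, protected, 1)     # equal opportunity (TPR) diff.
    fpr_diff = rate(y_pred, y_true, unprot, 0) - rate(y_pred, y_true, protected, 0)  # FPR difference
    f_aod = 0.5 * (fpr_diff + f_eod)                                                 # average (equalized) odds diff.
    return f_spd, f_eod, f_aod
```

</p><p>As described above, values greater than zero then indicate discrimination against the protected subgroup and values less than zero indicate favoritism.</p></div>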
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Pattern Extraction for Subgroup Detection</head><p>The unsupervised task of detecting meaningful groups in a dataset 𝒟 can be performed by computing a clustering 𝒞 = {𝐶 1 , . . . , 𝐶 𝑘 } that divides 𝒟 into such groups of similar instances. The groups are so-called clusters, and a pair of instances 𝑥 1 and 𝑥 2 that belong to the same cluster shares some similarity. The cluster structure and degree of similarity between the individuals in the same group depend on the clustering type, distance measure and parameter selection. Our ASDF-Dashboard computes such a clustering either in an automated fashion or controlled by user-specified parameters. After the clusters are found, we employ our notion of the previously defined patterns and the induced protected and unprotected subgroups to assess the classifier's fairness.</p><p>Based on the clustering 𝒞, a pattern can be extracted that partitions the dataset 𝒟 into protected and unprotected subgroups according to the clusters. To this end, a clustering-based pattern <ref type="bibr" target="#b0">[1]</ref> 𝑃 𝒞 𝑖 = (𝑖) is defined over the artificial cluster label attribute 𝐴 𝒞 for each cluster 𝐶 𝑖 ∈ 𝒞. These patterns map the instances 𝑥 ∈ 𝐶 𝑖 to protected subgroups 𝒟 𝑃 𝒞 𝑖 and the subgroup fairness of 𝑀 ^is then calculated for a fairness metric 𝐹 by averaging over all clusters:</p><formula xml:id="formula_5">𝐹 ¯(𝑃 𝒞 ) = 1 𝑘 • 𝑘 ∑︁ 𝑖=1 |𝐹 (𝑃 𝒞 𝑖 )| for 𝐹 ∈ {𝐹 𝑠𝑝𝑑 , 𝐹 𝑒𝑜𝑑 , 𝐹 𝑎𝑜𝑑 }<label>(4)</label></formula><p>As an alternative method, we refined the clustering-based subgroup detection by deriving more sophisticated patterns from the clusters <ref type="bibr" target="#b0">[1]</ref>. These patterns describe the makeup of the found clusters and map instances from 𝒟 to the protected subgroups.
The patterns are therefore derived from the most meaningful attributes that dominate a cluster 𝐶 𝑖 ∈ 𝒞, i.e., the majority of instances 𝑥 ∈ 𝐶 𝑖 have the same value 𝑣 𝑗 ∈ 𝐷𝑜𝑚(𝐴 𝑗 ) for the dominant attribute 𝐴 𝑗 ∈ 𝒜. The dominant features are determined by calculating the normalized cluster feature entropy <ref type="bibr" target="#b0">[1]</ref></p><formula xml:id="formula_6">𝐻 𝑖,𝑗 = − 1 log 2 |𝐷𝑜𝑚(𝐴 𝑗 )| • ∑︁ 𝑣∈𝐷𝑜𝑚(𝐴 𝑗 ) 𝑁 𝑖,𝑗,𝑣 𝑁 𝑖 • log 2 (︂ 𝑁 𝑖,𝑗,𝑣 𝑁 𝑖 )︂<label>(5)</label></formula><p>where 𝑁 𝑖 is the size of 𝐶 𝑖 and 𝑁 𝑖,𝑗,𝑣 denotes the number of instances 𝑥 ∈ 𝐶 𝑖 that have value 𝑣 for attribute 𝐴 𝑗 . The closer 𝐻 𝑖,𝑗 is to zero, the more instances with the same value for feature 𝐴 𝑗 are contained in the cluster 𝐶 𝑖 , and a single value for 𝐴 𝑗 is found in all instances if 𝐻 𝑖,𝑗 = 0. If 𝐻 𝑖,𝑗 is close to 1, this indicates more variation in the feature values across the instances in the cluster 𝐶 𝑖 . The set of dominant features of a cluster 𝐶 𝑖 is determined as 𝐴 𝑖 = {𝐴 𝑗 ∈ 𝒜 | 𝐻 𝑖,𝑗 ≤ 𝑡} for some threshold 0 ≤ 𝑡 ≤ 1. An entropy-based pattern</p><formula xml:id="formula_7">𝑃 𝑡 𝑖 = (𝑎 1 , . . . , 𝑎 𝑞 ) ∈ 𝐷𝑜𝑚(𝐴 1 ) × • • • × 𝐷𝑜𝑚(𝐴 𝑞 )</formula><p>is then obtained for each cluster 𝐶 𝑖 ∈ 𝒞 by extracting the most frequent value of each of the dominant features 𝐴 𝑗 ∈ 𝐴 𝑖 . These patterns 𝑃 𝑡 𝑖 map all instances 𝑥 ∈ 𝒟 that exactly match the most frequent values of the dominant features of cluster 𝐶 𝑖 to the protected subgroup 𝒟 𝑃 𝑡 𝑖 . However, if all candidate features exceed the threshold 𝑡, i.e., 𝐴 𝑖 = ∅, no pattern can be extracted. In contrast to the clustering-based patterns, the protected subgroup does not exclusively contain individuals from 𝐶 𝑖 but also individuals from other clusters that match the dominant attributes' values.
Here, the normalization of the feature entropy ensures that an appropriate global threshold can be set ignoring differing sizes of the active domains of the attributes throughout the dataset <ref type="bibr" target="#b0">[1]</ref>. In the following, we also refer to the subgroups induced by clustering-based patterns as clusters or clustering-based subgroups and to the subgroups induced by entropy-based patterns as entropy-based subgroups. </p></div>
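<div xmlns="http://www.tei-c.org/ns/1.0"><p>The entropy-based pattern extraction of Equation 5 can be sketched in a few lines of Python. This is an illustrative reimplementation under our own naming (normalized_entropy, extract_pattern and matches are not names from the ASDF-Dashboard), normalizing by the active domain size observed in the whole dataset:</p><p>

```python
import numpy as np
import pandas as pd

def normalized_entropy(values, domain_size):
    """Normalized cluster feature entropy H_ij (Eq. 5): 0 = one uniform value, 1 = maximal variation."""
    if domain_size < 2:
        return 0.0  # a single-valued active domain is trivially uniform
    freqs = pd.Series(values).value_counts(normalize=True)
    return float(-(freqs * np.log2(freqs)).sum() / np.log2(domain_size))

def extract_pattern(df, labels, cluster_id, t=0.65):
    """Entropy-based pattern P_i^t: most frequent value of every dominant feature (H_ij <= t)."""
    cluster = df[labels == cluster_id]
    return {col: cluster[col].mode().iloc[0]
            for col in df.columns
            if normalized_entropy(cluster[col], df[col].nunique()) <= t}

def matches(df, pattern):
    """Boolean mask of the protected subgroup D_{P_i^t}: all rows matching every pattern value."""
    mask = np.ones(len(df), dtype=bool)
    for col, val in pattern.items():
        mask &= (df[col] == val).to_numpy()
    return mask
```

</p><p>Note that matches() is applied to the whole dataset, so the induced protected subgroup can contain individuals from other clusters; an empty dictionary returned by extract_pattern() signals the case 𝐴 𝑖 = ∅ in which no pattern can be extracted.</p></div>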
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">ASDF-Dashboard</head><p>Our ASDF-Dashboard<ref type="foot" target="#foot_0">1</ref> implements the subgroup fairness analysis based on the proposed methodology of clustering- and entropy-based subgroups. It is hosted as a publicly accessible web application that supports an automatic, user-friendly subgroup fairness analysis and provides a broad visualization of the subgroup fairness results. Registered users can upload their datasets, which also contain the ground-truth labels as well as the labels obtained as predictions of their binary classifier, to the system. The ASDF-Dashboard further provides a tabular view of each of the uploaded datasets that can be used to interactively browse the uploaded data using sorting and column filters to get a better insight into the structure of the data. Figure <ref type="figure" target="#fig_1">1</ref> shows an exemplary table for the COMPAS <ref type="bibr" target="#b15">[16]</ref> dataset, which is provided in the FairVis <ref type="bibr" target="#b6">[7]</ref> repository including the predicted labels but without the clustering labels. Each of the table rows corresponds to one defendant whose recidivism for a period of 2 years was predicted by a binary classifier.</p><p>To perform the subgroup fairness analysis, at least a dataset 𝒟, the positive (favorable) class label (0 or 1) and the entropy threshold (0 ≤ 𝑡 ≤ 1) for the pattern extraction have to be specified in the control tile, which is depicted at the top of Figure <ref type="figure" target="#fig_2">2</ref>. Then, the ASDF-Dashboard can already compute the subgroup fairness of the given classifier automatically. Optionally, the user can also specify the categorical columns by selection, which need to be one-hot encoded for the computation of the clustering as the distance measures most commonly require vectors of numeric data as input.
If no categorical attributes are selected, the system automatically detects them to apply the one-hot encoding. Furthermore, the fully numeric features of the selected dataset are then also scaled using min-max normalization before computing the clustering. However, clustering a mixture of numeric and categorical attributes is very sensitive to the choice of algorithm and distance metric as there is no consensus on the optimal technique <ref type="bibr" target="#b16">[17]</ref>. Therefore, we decided on the usual processing involving encoding and scaling. In addition to the automatic subgroup fairness calculation, the classifier's fairness can also be evaluated manually by choosing a clustering algorithm and specifying its parameters. In case of the automatic fairness assessment, the SLINK clustering algorithm is applied to dataset 𝒟 = {𝑥 1 , . . . , 𝑥 𝑛 } with the desired number of clusters 2 ≤ 𝑘 ≤ ⌊√(𝑛/2)⌋, which we estimate beforehand using the x-means clustering algorithm.</p><p>(Caption of Figure 1: As an example, the COMPAS <ref type="bibr" target="#b15">[16]</ref> dataset including the predicted class (column "out") and ground-truth class (column "class") is shown.)</p><p>Figure <ref type="figure" target="#fig_2">2</ref> shows the subgroup fairness analysis on the COMPAS dataset with entropy threshold 𝑡 = 0.65, favorable class label 0 (no recidivism within two years) and the three categorical columns "c_charge_degree" (felony or misdemeanor), "race" (african-american, caucasian, asian, hispanic or other) and "sex" (female or male) using the SLINK clustering algorithm (agglomerative clustering with single linkage) with 𝑘 = 30. The average absolute values of statistical parity, equal opportunity, equalized odds, accuracy and the difference between the subgroup and global accuracy (accuracy error) over both the clustering-based (red) and entropy-based subgroups (blue) are shown by the radar chart in the left tile of the dashboard (Figure <ref type="figure" target="#fig_2">2</ref>).
The average absolute values give a good insight into violations of any of the fairness definitions by discrimination or favoritism throughout all the detected subgroups. The tile next to it displays the sizes of the clusters and entropy-based subgroups as bars. Here, it can be clearly seen that the two types of subgroups do not coincide in general as the sizes differ. For instance, the cluster 𝐶 14 is much smaller than the entropy-based subgroup 𝒟 𝑃 𝑡 14 , as 𝑃 𝑡 14 = ("Felony", "African-American", "Male") is a common attribute pattern in the dataset matching 1836 individuals.</p><p>To get an overview of the subgroups, the extracted entropy-based patterns are displayed in a table (Figure <ref type="figure" target="#fig_3">3</ref>). Each row in the table corresponds to one entropy-based pattern 𝑃 𝑡 𝑖 extracted from cluster 𝐶 𝑖 ∈ 𝒞. The columns of the table represent all the features 𝐴 𝑗 ∈ 𝒜 of the dataset 𝒟 that were found to be dominant in at least one of the clusters 𝐶 𝑖 ∈ 𝒞, i.e., 𝐻 𝑖,𝑗 ≤ 0.65 for some 𝑖 ∈ {1, ..., 𝑘}. The remaining features are not relevant for the entropy-based patterns and are, thus, not listed in the table. The minus symbol ("-") indicates that a feature does not occur in a pattern due to its high normalized entropy. The column "id" identifies cluster 𝐶 𝑖 and the corresponding entropy-based pattern 𝑃 𝑡 𝑖 . For example, the entropy-based subgroup 8 in Figure <ref type="figure" target="#fig_3">3</ref> is defined by 𝑃 𝑡 8 = ("Felony", "Caucasian", "Female", −1) for the set of dominant attributes 𝐴 8 = {"c_charge_degree", "race", "sex", "days_b_screening_arrest"}. Each table row can also be expanded to show an embedded table listing the individual fairness metrics.
These fairness metrics can also be investigated individually for each cluster and entropy-based subgroup in the chart displayed in the left tile of Figure <ref type="figure" target="#fig_4">4</ref>. The individual fairness metrics are visualized by clicking on one of the subtables in the pattern table or on one of the cluster-/entropy-based subgroup size bars. Here, the selected cluster and entropy-based subgroup are 𝒟 𝑃 𝒞 0 and 𝒟 𝑃 𝑡 0 , respectively. The bars reveal that they share a similar accuracy score of ≈ 65%. However, the tested classifier slightly discriminates against the protected instances 𝑥 ∈ 𝒟 𝑃 𝑡 0 as compared to the unprotected instances according to the three subgroup fairness metrics.</p></div>
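<div xmlns="http://www.tei-c.org/ns/1.0"><p>The preprocessing and clustering steps described in this section (one-hot encoding of the categorical columns, min-max scaling, SLINK clustering) can be sketched as follows. This is a simplified illustration using pandas and scikit-learn, not the deployed implementation; the estimation of 𝑘 via x-means is omitted here, so 𝑘 is passed in directly:</p><p>

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import AgglomerativeClustering

def cluster_dataset(df, categorical=None, k=8):
    """One-hot encode categoricals, min-max scale all features, then run SLINK
    (agglomerative clustering with single linkage) to detect candidate subgroups."""
    if categorical is None:
        # mimic the automatic detection of categorical attributes
        categorical = df.select_dtypes(exclude="number").columns.tolist()
    encoded = pd.get_dummies(df, columns=categorical)
    X = MinMaxScaler().fit_transform(encoded.astype(float))
    return AgglomerativeClustering(n_clusters=k, linkage="single").fit_predict(X)
```

</p><p>The returned cluster labels then play the role of the artificial cluster label attribute 𝐴 𝒞 from Section 3.2.</p></div>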
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Evaluation</head><p>In our experiments <ref type="bibr" target="#b0">[1]</ref> we tested our system using the COMPAS<ref type="foot" target="#foot_1">2</ref> dataset version from FairVis <ref type="bibr" target="#b6">[7]</ref> (𝑛 = 6172, 𝑝 = 7), an updated version of the Statlog German Credit<ref type="foot" target="#foot_2">3</ref> dataset (𝑛 = 1000, 𝑝 = 20) and the Medical Expenditure Panel Survey (MEPS) <ref type="foot" target="#foot_3">4</ref> dataset from panel 19 of 2015 (𝑛 = 15830, 𝑝 = 40). For each of the datasets we compared multiple clustering algorithms for the automated subgroup detection, namely, k-Means, DBSCAN, OPTICS, Spectral Clustering, SLINK, Ward, BIRCH, SSC-BP, SSC-OMP, and EnSC. Based on the dataset, small sets of individual parameter values (usually the number of clusters and the main parameters, e.g., 𝜖 for DBSCAN) were tested in a grid-search fashion for the detection of subgroups. We chose to report the subgroup fairness results for each parameter setting only for the run that had maximal clustering performance and showed the highest fairness violation. To this end, we measured the silhouette score 𝑆 𝒞 of clustering 𝒞 and the mean absolute error between the prediction accuracy of classifier 𝑀 ^on the clustering-based subgroups in comparison to the global accuracy (Eq. 7 <ref type="bibr" target="#b0">[1]</ref>).</p><p>As our previous experiments <ref type="bibr" target="#b0">[1]</ref> have shown, our proposed subgroup detection methods are applicable to the automated subgroup fairness analysis of a binary classifier. The applied clustering algorithms showed varying performance as measured by the mean absolute error in prediction accuracy on the clusters, and sometimes multiple algorithms provided equally good performance in different settings. The SLINK clustering algorithm yielded a strong overall performance at detecting unfairly treated subgroups.
In fact, it outperformed the other clustering algorithms in many of the experimental settings, including different datasets and subgroup fairness metrics. Due to these outstanding results, we use it for the fully automated subgroup fairness analysis in our tool. Additionally, the ASDF-Dashboard offers users the opportunity to select and configure any of the clustering algorithms for the subgroup detection.</p><p>The visualizations of the fairness analysis results support the comprehension of the classification model's behavior when presented with individuals of different subgroups in the data. The ASDF-Dashboard presents various charts with the fairness metric values to cover diverse aspects. Users can investigate the characteristics of the detected subgroups, i.e., the sizes of the clustering- and entropy-based subgroups and the extracted patterns for each cluster, as well as the subgroup fairness metrics measured for each subgroup individually. The rankings of the clusters or entropy-based subgroups allow direct access to the most discriminated or favored subgroups with respect to a given subgroup fairness metric. Additionally, the global fairness values are displayed to the user as an overall judgement of the classifier's fairness. However, our tool is limited to fairness assessment for the task of binary classification and cannot be applied to multi-class settings, which require different subgroup fairness definitions and metrics. Another limitation is that datasets and models cannot be uploaded separately for modular combinations of dataset and model; instead, only a dataset that already contains the predictions is uploaded.</p></div>
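The model selection used in the evaluation (a grid search over clustering configurations, scored by silhouette and by the deviation of per-cluster accuracy from global accuracy) can be sketched as follows. This is an illustrative re-implementation with scikit-learn, not the ASDF-Dashboard's actual code; the candidate parameter grids and the tie-breaking rule are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.metrics import silhouette_score

def accuracy_mae(y_true, y_pred, labels):
    """Mean absolute error between per-cluster prediction accuracy
    and the global accuracy (cf. Eq. 7 in [1])."""
    correct = (y_true == y_pred)
    global_acc = correct.mean()
    cluster_ids = [c for c in np.unique(labels) if c != -1]  # skip DBSCAN noise
    return float(np.mean([abs(correct[labels == c].mean() - global_acc)
                          for c in cluster_ids]))

def grid_search_clusterings(X, y_true, y_pred):
    """Grid search over a small set of clustering configurations; return the
    run with the best silhouette score, breaking ties by the largest
    accuracy deviation (i.e., the strongest fairness violation)."""
    candidates = [KMeans(n_clusters=k, n_init=10, random_state=0) for k in (2, 4, 8)]
    candidates += [DBSCAN(eps=eps) for eps in (0.5, 1.0)]
    results = []
    for model in candidates:
        labels = model.fit_predict(X)
        if len(set(labels) - {-1}) < 2:
            continue  # silhouette needs at least two clusters
        results.append((silhouette_score(X, labels),
                        accuracy_mae(y_true, y_pred, labels),
                        model))
    return max(results, key=lambda r: (r[0], r[1]))
```

In the same spirit, the other algorithms listed above (OPTICS, BIRCH, SLINK via single-linkage AgglomerativeClustering, etc.) could simply be appended to the candidate list.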
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this work we presented the ASDF-Dashboard for carrying out a subgroup fairness analysis of a binary classifier. Our tool is able to detect meaningful subgroups that are treated unfairly by the classification model, as measured by three common subgroup fairness metrics. The detection of discriminated or favored subgroups uses unsupervised clustering and an entropy-based pattern approach to automatically identify subgroups of similar instances with as little user interaction as possible. After the subgroup fairness assessment, users can explore the visualizations of the analysis results in various ways, including global and local fairness measurements. Future research could revise and further improve the subgroup detection methods by testing more clustering algorithms and datasets to gain more insights into the performance and robustness of the proposed methods in various scenarios. Another future direction is a qualitative comparison between the clustering-based approach and alternatives such as frequent pattern mining; in particular, an investigation of the properties of the extracted patterns could yield valuable information. Furthermore, it might also be beneficial to derive a cluster validation index based on a subgroup fairness criterion that allows selecting the best of multiple clustering models for the subgroup detection.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Table 1 :</head><label>1</label><figDesc>Fairness metrics for pattern-induced subgroups. Statistical parity: 𝐹 𝑠𝑝𝑑 (𝑃 ) = P(𝑦 ^= 1 | 𝑥 ∈ 𝒟 𝑃 ¯) − P(𝑦 ^= 1 | 𝑥 ∈ 𝒟 𝑃 ). Equal opportunity: 𝐹 𝑒𝑜𝑑 (𝑃 ) = P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ¯) − P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ). Equalized odds: [︀ P(𝑦 ^= 1 | 𝑦 = 0, 𝑥 ∈ 𝒟 𝑃 ¯) − P(𝑦 ^= 1 | 𝑦 = 0, 𝑥 ∈ 𝒟 𝑃 ) + P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ¯) − P(𝑦 ^= 1 | 𝑦 = 1, 𝑥 ∈ 𝒟 𝑃 ) ]︀</figDesc></figure>
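The three subgroup fairness metrics of Table 1 can be computed directly from a boolean mask marking the protected subgroup. The sketch below is a hedged re-implementation for illustration, not the tool's code; the function names and the equalized-odds aggregation (average of the two group differences) are assumptions and may differ from the paper's exact equation.

```python
import numpy as np

def rate(y_hat, mask):
    """P(y_hat = 1) restricted to the instances selected by mask."""
    return float(y_hat[mask].mean()) if mask.any() else 0.0

def subgroup_fairness(y_true, y_hat, protected):
    """Statistical parity, equal opportunity and (an averaged form of)
    equalized odds for one protected subgroup (boolean mask)."""
    unprot = ~protected
    spd = rate(y_hat, unprot) - rate(y_hat, protected)          # statistical parity
    tpr_diff = (rate(y_hat, unprot & (y_true == 1))
                - rate(y_hat, protected & (y_true == 1)))       # equal opportunity
    fpr_diff = (rate(y_hat, unprot & (y_true == 0))
                - rate(y_hat, protected & (y_true == 0)))
    # one common equalized-odds aggregation; the paper's exact form may differ
    eq_odds = 0.5 * (fpr_diff + tpr_diff)
    return spd, tpr_diff, eq_odds
```

A positive difference means the unprotected group receives the favorable outcome more often, i.e., the protected subgroup is discriminated against under that metric.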
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Dataset inspection. View for the inspection of an uploaded dataset in the ASDF-Dashboard. As an example, the COMPAS<ref type="bibr" target="#b15">[16]</ref> dataset including the predicted class (column "out") and ground-truth class (column "class") is shown here.</figDesc><graphic coords="8,89.79,84.59,415.71,202.87" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Subgroup fairness analysis. The upper tile initiates the fairness analysis of a classifier by selecting a dataset (incl. predictions) and setting the positive (favorable) class label, the entropy threshold 𝑡, the categorical features to one-hot encode and optionally the clustering algorithm with parameters. The radar chart (bottom left) displays global fairness metrics for both clustering- and entropy-based subgroups and the bar chart (bottom right) visualizes the detected subgroup sizes.</figDesc><graphic coords="9,89.79,84.59,415.70,195.21" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Pattern information. The extracted entropy-based patterns are listed as rows in a table. The table columns represent the features found to be dominant in at least one cluster. The minus symbol ("-") indicates that a feature does not occur in a pattern due to its high normalized entropy.</figDesc><graphic coords="9,89.79,362.69,415.72,119.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Individual subgroup metrics. The subgroup fairness metric values are shown by the bar plots (left) for an individual cluster and subgroup induced by the extracted entropy-based pattern (here: cluster/subgroup 0). The dropdown menu enables the selection of statistical parity, equal opportunity, equalized odds, or prediction accuracy to visualize the top five clusters or entropy-based subgroups regarding the selected criterion, e.g., the five clusters with the lowest statistical parity values are shown.</figDesc><graphic coords="10,89.79,84.59,415.71,117.76" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Consider a clustering-based pattern 𝑃 𝒞 𝑖 and an entropy-based pattern 𝑃 𝑡 𝑖 extracted from the same cluster 𝐶 𝑖 ∈ 𝒞 of dataset 𝒟. The two patterns induce the different protected subgroups 𝒟 𝑃 𝒞 𝑖 and 𝒟 𝑃 𝑡 𝑖 , respectively. Generally, there might be instances 𝑥 ∈ 𝐶 𝑖 with 𝑥 ⊭ 𝑃 𝑡 𝑖 such that 𝑥 ∈ 𝒟 𝑃 𝒞 𝑖 but 𝑥 ∉ 𝒟 𝑃 𝑡 𝑖 . Conversely, there might also be individuals from other clusters 𝐶 𝑗 , 𝑗 ̸ = 𝑖 that satisfy 𝑥 ⊨ 𝑃 𝑡 𝑖 and are thus members of the protected subgroup 𝒟 𝑃 𝑡 𝑖 but not of 𝒟 𝑃 𝒞 𝑖 . However, the protected subgroups of two entropy-based patterns 𝑃 𝑡 𝑖 and 𝑃 𝑡 𝑗 might share some individuals or even coincide due to the same dominant features and most frequent values in both 𝐶 𝑖 and 𝐶 𝑗 . This is not possible for the protected subgroups of two clustering-based patterns 𝑃 𝒞 𝑖 and 𝑃 𝒞 𝑗 as we assume a hard partitional clustering with disjoint clusters, i.e., 𝒟 𝑃 𝒞 𝑖 ∩ 𝒟 𝑃 𝒞 𝑗 = ∅. Two entropy-based patterns, in contrast, might be identical (𝑃 𝑡 𝑖 = 𝑃 𝑡 𝑗 ), which causes an induction of the same subgroup 𝒟 𝑃 𝑡 𝑖 = 𝒟 𝑃 𝑡 𝑗 .</figDesc></figure>
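The pattern semantics described in the caption above (pattern satisfaction 𝑥 ⊨ 𝑃 and the protected subgroup 𝒟 𝑃 induced by a pattern) can be illustrated with a minimal sketch. The dict-based pattern representation and the function names are hypothetical, for illustration only, not the tool's API.

```python
def satisfies(x, pattern):
    """x ⊨ P: the instance matches every feature/value pair of the pattern."""
    return all(x.get(feature) == value for feature, value in pattern.items())

def induced_subgroup(dataset, pattern):
    """D_P: all instances of the dataset that satisfy the pattern."""
    return [x for x in dataset if satisfies(x, pattern)]
```

As the caption notes, two identical entropy-based patterns induce the same subgroup, whereas clustering-based subgroups are disjoint by construction of the hard partitional clustering.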
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://github.com/jeschaef/ASDF-Dashboard</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/poloclub/FairVis/blob/master/models/processed/compas_out.csv</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://archive.ics.uci.edu/ml/datasets/South+German+Credit+%28UPDATE%29</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-183</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Clustering-Based Subgroup Detection for Automated Fairness Analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wiese</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">New Trends in Database and Information Systems</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Chiusano</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Cerquitelli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Wrembel</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Nørvåg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Catania</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Vargas-Solar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Zumpano</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="45" to="55" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Boxer: Interactive comparison of classifier results</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gleicher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barve</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Heimerl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Graphics Forum</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="181" to="193" />
			<date type="published" when="2020">2020</date>
			<publisher>Wiley Online Library</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Fairkit-learn: A Fairness Evaluation and Comparison Toolkit</title>
		<author>
			<persName><forename type="first">B</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Brun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">44th International Conference on Software Engineering Companion (ICSE &apos;22 Companion)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">FairRankVis: A Visual Analytics Framework for Exploring Algorithmic Fairness in Graph Mining Models</title>
		<author>
			<persName><forename type="first">T</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Tong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Maciejewski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Visualization and Computer Graphics</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="368" to="377" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The what-if tool: Interactive probing of machine learning models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wexler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pushkarna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bolukbasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wattenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Viégas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wilson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Visualization and Computer Graphics</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="56" to="65" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Morina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Waton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Marusic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Georgatzis</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.01468</idno>
		<title level="m">Auditing and Achieving Intersectional Fairness in Classification Problems</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">FAIRVIS: Visual Analytics for Discovering Intersectional Bias in Machine Learning</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Cabrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Epperson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hohman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kahng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Morgenstern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">H</forename><surname>Chau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Visual Analytics Science and Technology (VAST)</title>
				<imprint>
			<date type="published" when="2019">2019. 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence</title>
		<author>
			<persName><forename type="first">E</forename><surname>Pastor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>De Alfaro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Baralis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 International Conference on Management of Data</title>
				<meeting>the 2021 International Conference on Management of Data</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1400" to="1412" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">DENOUNCER: Detection of Unfairness in Classifiers</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Moskovitch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jagadish</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the VLDB Endowment</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="2719" to="2722" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A Systematic Approach to Group Fairness in Automated Decision Making</title>
		<author>
			<persName><forename type="first">C</forename><surname>Hertweck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Heitz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2021 8th Swiss Conference on Data Science (SDS), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Fairness Definitions Explained</title>
		<author>
			<persName><forename type="first">S</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Rubin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2018 IEEE/ACM International Workshop on Software Fairness (FairWare), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<ptr target="http://www.fairmlbook.org" />
		<title level="m">Fairness and Machine Learning, fairmlbook</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Failures of Fairness in Automation Require a Deeper Understanding of Human-ML Augmentation</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Teodorescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Morse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Awwad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">C</forename><surname>Kane</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">MIS Quarterly</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K E</forename><surname>Bellamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Dey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hind</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">C</forename><surname>Hoffman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Houde</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kannan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lohia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mojsilovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Ramamurthy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Richards</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Saha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sattigeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Varshney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<title level="m">AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">An Intersectional Definition of Fairness</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Foulds</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Keya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 36th International Conference on Data Engineering (ICDE)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="1918" to="1921" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Machine Bias</title>
		<author>
			<persName><forename type="first">J</forename><surname>Angwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Larson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mattu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kirchner</surname></persName>
		</author>
		<ptr target="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing" />
		<imprint>
			<date type="published" when="2016-06-27">2016. 27 June 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Distance-based clustering of mixed data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Van De Velden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Iodice D'enza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Markos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Wiley Interdisciplinary Reviews: Computational Statistics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">e1456</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
