<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Correlation Clustering with Fairness Constraints</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Gullo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lucio La Cava</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Mandaglio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Tagarelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. Computer Engineering</institution>
          ,
          <addr-line>Modeling, Electronics, and Systems Engineering (DIMES)</addr-line>
          ,
          <institution>University of Calabria</institution>
          ,
          <addr-line>Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Information Engineering</institution>
          ,
          <addr-line>Computer Science, and Mathematics (DISIM)</addr-line>
          ,
          <institution>University of L'Aquila</institution>
          ,
          <addr-line>Coppito (AQ)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Fairness in data analysis is a well-established and growing area of research, offering tools to understand and mitigate specific forms of bias in decision-making systems. One of the key challenges in this context is fair clustering, i.e., grouping data objects that are similar according to a common feature space, while avoiding biasing the clusters against or towards particular types of classes or sensitive features. In this work, we discuss a correlation-clustering method we recently introduced and analyze its performance in fairness-aware scenarios. We compare it with representative state-of-the-art fair clustering approaches, considering both standard clustering quality metrics and fairness-related measures. Empirical results on publicly available datasets show that the method produces clusterings of higher quality according to traditional validation criteria, while also accounting for fairness considerations.</p>
      </abstract>
      <kwd-group>
        <kwd>clustering</kwd>
        <kwd>fairness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>We live in an era in which machine learning is increasingly pervasive in our society.</title>
        <p>Every day, we interact with machine learning systems (often without being aware of it), and these systems are gaining growing decision-making power in various aspects of our lives. For instance, they are used to support, or even replace, human decision makers in financial, medical, and legal domains.</p>
        <p>
          Given their critical role, machine learning systems should ensure reliable and fair behavior, avoiding
discrimination against those who are subject to their decisions. However, a major challenge arises
from the data on which these systems are trained: such data are often intrinsically biased, typically
due to flawed or unbalanced data collection processes. It is therefore important to prevent machine
learning algorithms from being influenced by, or amplifying, these biases. For example, the work in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
addresses this issue through the notion of disparate impact, which refers to the principle that no group
of individuals should be (even indirectly) disadvantaged by the decisions of an automated system.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>In this work, we focus on the problem of fair clustering.</title>
        <p>Fair clustering is an unsupervised machine learning task whose goal is twofold: (i) as in standard clustering, similar objects are assigned to the same cluster, whereas dissimilar objects are assigned to different clusters; and (ii) the resulting clusters are not dominated by a specific type of sensitive data class (e.g., individuals sharing the same gender).</p>
      </sec>
      <sec id="sec-1-3">
        <title>Our key assumption is that the fair clustering problem can be effectively addressed within the framework of correlation clustering [2, 3]. This well-established approach partitions the vertices of a graph into clusters by maximizing intra-cluster similarity and minimizing inter-cluster similarity, based on pairwise weights that encode positive or negative co-association.</title>
        <p>
          Contributions. In this paper, we discuss the use of correlation clustering for fair clustering, with
a focus on our recent algorithm [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], and compare it to a selection of state-of-the-art methods from
diverse algorithmic paradigms. Our analysis explores how emphasizing fairness objectives can influence
the clustering quality, as measured by classic clustering-validation criteria. Our findings show that
our proposed algorithm yields higher-quality solutions than competing methods from a clustering
perspective, while also accounting for fairness-related aspects.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Although of relatively recent definition, the problem of fairness in clustering has received considerable attention in the literature [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. With their seminal work, Chierichetti et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] were among the first to formalize the notions around fair clustering and the related problem, following the disparate-impact doctrine [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Their main contribution is a general pre-processing step, i.e., fairlets decomposition, to enable traditional algorithms (e.g., k-center and k-median) to meet fairness principles. Following that forerunner work, fairness has become pervasive in the clustering landscape [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], leading to fairness-aware variants of numerous traditional clustering formulations, such as k-center [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], k-means [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], k-median [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], spectral clustering [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and hierarchical clustering [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      </p>
      <p>
        The phenomenon of fairness in clustering has also been extended to alternative approaches, such as
correlation clustering. In this regard, Ahmadian et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is the first work to leverage the correlation
clustering model for the fair clustering task. More specifically, it takes a complete and undirected graph
as input, where vertices are assigned a (single) label representing a given protected class attribute
(e.g., sex or ethnicity), and the goal is to provide a fair representation of each considered label in
the resulting clusters. Recently, Mandaglio et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] proposed to model the fair clustering problem
of a relational dataset as a correlation clustering instance. Given a set of objects, defined over a set
of features, Mandaglio et al. build an associated correlation clustering instance by considering the
similarity between the tuples. Although Ahmadian et al.'s and Mandaglio et al.'s approaches aim to cluster different types of data (graphs and tuples, respectively), both reduce the original problem to a correlation clustering instance. However, Mandaglio et al.'s formulation is more general than Ahmadian et al.'s, since the former deals with an arbitrary number of labels (or sensitive attributes), while the latter is limited to a single-label setting.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Preliminaries</title>
      <p>
        Correlation clustering. The correlation clustering problem, originally introduced by Bansal et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
consists of clustering the set of vertices of a graph whose edges are assigned two nonnegative weights,
named positive-type and negative-type weights, respectively. Such weights express the advantage
of putting any two connected vertices into the same cluster (positive-type weight) or into separate
clusters (negative-type weight). The objective is to partition the vertices so as to minimize the sum of
the negative-type weights between vertices within the same cluster plus the sum of the positive-type
weights between vertices in separate clusters (Min-CC).
      </p>
      <p>
        Problem 1 (Min-CC [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]). Given an undirected graph G = (V, E), with vertex set V and edge set E ⊆ V × V, and weights w⁺_uv, w⁻_uv ∈ ℝ⁺₀ for all edges (u, v) ∈ E, find a clustering f : V → ℕ⁺ that minimizes:
      </p>
      <p>∑_{(u,v)∈E, f(u)=f(v)} w⁻_uv + ∑_{(u,v)∈E, f(u)≠f(v)} w⁺_uv.   (1)</p>
      <p>
        Min-CC is APX-hard [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], but admits approximation algorithms [
        <xref ref-type="bibr" rid="ref16 ref17 ref2">16, 2, 17</xref>
        ] with guarantees depending on the type of input graph. On general graphs and weights, the best known approximation factor is O(log |V|) [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ], provided by a linear programming approach. Conversely, constant-factor approximation algorithms are possible if the graph is complete and the edge weights satisfy the probability constraint, i.e., w⁺_uv + w⁻_uv = 1 for all u, v ∈ V. Among these, the one providing the best trade-off between efficiency and theoretical guarantees is the Pivot algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which simply picks a random vertex u, builds a cluster composed of u and all the vertices v such that an edge with w⁺_uv &gt; w⁻_uv exists, and removes that cluster from the graph. The process is repeated until the graph becomes empty.
      </p>
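      <p>The Pivot procedure just described can be sketched in a few lines. The following is a minimal, self-contained Python illustration under the notation above; the function and variable names are ours, not from the original paper:</p>
      <preformat>
```python
import random

def pivot(vertices, w_pos, w_neg, seed=0):
    """Pivot for Min-CC: pick a random pivot u, cluster it with every
    remaining vertex v whose positive-type weight w+ exceeds its
    negative-type weight w-, remove the cluster, and repeat."""
    rng = random.Random(seed)
    remaining = set(vertices)
    clustering, label = {}, 0
    while remaining:
        u = rng.choice(sorted(remaining))
        cluster = {u} | {v for v in remaining - {u}
                         if w_pos[frozenset((u, v))] > w_neg[frozenset((u, v))]}
        for v in cluster:
            clustering[v] = label  # assign the same cluster id to the whole cluster
        remaining -= cluster
        label += 1
    return clustering
```
      </preformat>
      <p>For instance, on a 4-vertex complete instance where only the pairs (0, 1) and (2, 3) have w⁺ &gt; w⁻, the algorithm returns those two clusters regardless of which pivots are drawn.</p>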
      <sec id="sec-3-1">
        <title>This algorithm has O(|E|) time complexity, and it achieves a factor-5 expected approximation guarantee for Min-CC</title>
        <p>
          under the probability constraint or if a global weight bound holds on the overall edge weights [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <sec id="sec-3-1-1">
          <title>We next discuss how to solve the fair clustering problem via a Min-CC approach.</title>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Algorithm 1: CCBounds</title>
        <p>Input: Set of objects 𝒪, sensitive attributes F, non-sensitive attributes ¬F, Min-CC algorithm 𝒜</p>
        <p>Output: Clustering f of 𝒪</p>
        <p>1: compute s_¬F(x, y) and s_F(x, y), ∀ x, y ∈ 𝒪, as in Eqs. (3)–(4)</p>
        <p>2: build the Min-CC instance I = ⟨G = (𝒪, 𝒪 × 𝒪), {s_¬F(x, y), s_F(x, y)}_{x,y ∈ 𝒪}⟩</p>
        <p>3: f ← run 𝒜 on I</p>
      </sec>
      <sec id="sec-3-3">
        <title>Problem statement.</title>
        <p>Let 𝒪 = {o₁, · · · , o_N} be a set of N objects defined over a set of attributes. The latter is assumed to be divided into two sets, F and ¬F. The F set contains fairness-aware, or sensitive, attributes such as those identifying sex, race, religion, or relationship status in a citizen database, and any other attribute over which fairness is to be ensured. ¬F denotes the attributes that are relevant to the task of interest, and thus can be regarded as non-sensitive. In both cases, we assume that part of the attributes might be numerical, and the others categorical (binary or multi-value). We use subscripts n and c to distinguish the two types, therefore F = F_n ∪ F_c and ¬F = ¬F_n ∪ ¬F_c.</p>
        <p>
          We consider a clustering task whose goal is to partition the input objects with a twofold objective: (i) minimize the inter-cluster similarity according to the non-sensitive attributes ¬F; (ii) minimize the intra-cluster similarity according to the sensitive attributes F. The former objective corresponds to the typical clustering objective, since dissimilar objects should belong to different clusters. Pursuing the second objective, instead, helps distribute objects that are similar in terms of sensitive attributes across different clusters, thus fostering the formation of clusters that are equally represented in terms of the sensitive attributes. This is beneficial to ensure that the distribution of groups defined on sensitive attributes within each cluster approximates the distribution across the dataset. Formally, the problem we tackle in this work is:
        </p>
        <p>
          Problem 2 (Fair-CC [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]). Given a set of objects 𝒪, two subsets of attributes F and ¬F, and an object similarity function s_X(·, ·) defined over the subspace X of the attribute set, find a clustering f* to minimize:
        </p>
        <p>∑_{x,y∈𝒪, f(x)=f(y)} s_F(x, y) + ∑_{x,y∈𝒪, f(x)≠f(y)} s_¬F(x, y).   (2)</p>
        <p>
          The objective in Eq. (2) corresponds to solving a complete Min-CC instance where the set of vertices corresponds to the objects in 𝒪 and, for each pair of vertices x and y, the positive-type (resp. negative-type) correlation-clustering weight corresponds to the similarity score between the two vertices according to the non-sensitive (resp. sensitive) attributes.
        </p>
        <p>4. Algorithm</p>
        <p>
          The Fair-CC problem requires a similarity function over a set of attributes. We quantify the similarity between objects x and y, based on non-sensitive and sensitive attributes, by means of the following s_¬F(x, y) and s_F(x, y) measures, respectively:
        </p>
        <p>s_¬F(x, y) := e^{η⁺} (α_¬F · sⁿ_¬F(x, y) + (1 − α_¬F) · sᶜ_¬F(x, y)),   (3)</p>
        <p>s_F(x, y) := e^{η⁻} (α_F · sⁿ_F(x, y) + (1 − α_F) · sᶜ_F(x, y)),   (4)</p>
        <p>
          where α_F = |F_n|/(|F_n| + |F_c|) and α_¬F = |¬F_n|/(|¬F_n| + |¬F_c|) are coefficients to weight the numerical and categorical similarities proportionally to the number of involved attributes, and η⁺ = |¬F|/(|F| + |¬F|) − 1 and η⁻ = |F|/(|F| + |¬F|) − 1 are smoothing factors to penalize correlation-clustering weights that are computed on a small number of attributes. The latter is reasonable as, in a fair clustering task, we usually have fewer sensitive attributes, and it should be avoided that negative-type weights dominate the positive-type ones. The exponential function enables a mild smoothing, which is desirable.
        </p>
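        <p>The weighting scheme of Eqs. (3)–(4) can be sketched as follows. This is a hedged illustration: the per-attribute base similarities (1 − |difference| on normalized numerical values, exact match on categorical values) are our own assumptions for the sake of a runnable example, not prescribed by the equations; the function name is ours as well.</p>
        <preformat>
```python
import math

def weighted_similarity(x, y, num_attrs, cat_attrs, n_other_attrs):
    """Eqs. (3)-(4): alpha-weighted mix of numerical and categorical
    similarities over a subspace X, exponentially smoothed by eta.
    n_other_attrs is the size of the complementary subspace (|not-X|)."""
    n_x = len(num_attrs) + len(cat_attrs)           # |X|
    alpha = len(num_attrs) / n_x if n_x else 0.0    # |X_n| / (|X_n| + |X_c|)
    eta = n_x / (n_x + n_other_attrs) - 1.0         # |X| / (|X| + |not-X|) - 1
    # assumed base similarities: 1 - |x - y| on normalized numeric values,
    # fraction of matching categorical values
    s_n = (sum(1 - abs(x[a] - y[a]) for a in num_attrs) / len(num_attrs)) if num_attrs else 0.0
    s_c = (sum(x[a] == y[a] for a in cat_attrs) / len(cat_attrs)) if cat_attrs else 0.0
    return math.exp(eta) * (alpha * s_n + (1 - alpha) * s_c)
```
        </preformat>
        <p>With a single sensitive attribute and five non-sensitive ones, the e^{η⁻} factor shrinks the sensitive-side score, which is exactly the penalization for weights computed on few attributes.</p>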
        <sec id="sec-3-3-1">
          <title>As Fair-CC is an instance of Min-CC, it can be solved by Min-CC algorithms.</title>
          <p>
            Our proposed algorithm [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ], dubbed CCBounds and presented in Algorithm 1, consists of building a Min-CC instance with vertices as the input data objects and edge weights as the similarity scores, and then running a Min-CC algorithm 𝒜 on such a Min-CC instance.
          </p>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Theoretical remarks. Let T_𝒜(𝒪) be the running time of the algorithm 𝒜 on the set of data objects 𝒪.</title>
        <p>
          CCBounds runs in O(|𝒪|²|A| + T_𝒜(𝒪)) time, since it needs to compute a similarity score, over the |A| attributes, for each pair of objects in 𝒪, and then solve the resulting Min-CC instance through algorithm 𝒜. Also, the space complexity of CCBounds is O(|𝒪|²), for storing the similarity scores in memory. The specific Min-CC algorithm 𝒜 used in CCBounds is the one proposed in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], since it provides (under the probability constraint or the global weight bound stated in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]) a constant-factor approximation guarantee in expectation. Also, taking linear time in the size of the input graph, it is, to the best of our knowledge, the most efficient algorithm in the Min-CC literature. As a result of this choice, the time complexity of CCBounds becomes O(|𝒪|²|A|).
        </p>
        <sec id="sec-3-4-1">
          <title>Another appealing aspect of the fact that Fair-CC is an instance of Min-CC is that Fair-CC inherits the following theoretical result:</title>
          <p>
            Theorem 1 ([
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]). If the condition (|𝒪|(|𝒪| − 1)/2)⁻¹ ∑_{x,y∈𝒪} (s_¬F(x, y) + s_F(x, y)) ≥ max_{x,y∈𝒪} |s_¬F(x, y) − s_F(x, y)| holds on the similarity scores and the oracle 𝒜 is an α-approximation algorithm for Min-CC, then CCBounds is an α-approximation algorithm for Fair-CC.
          </p>
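          <p>The condition of Theorem 1 is straightforward to verify on a concrete instance: the average pairwise total weight must dominate the largest pairwise gap between the two similarity scores. A small check, assuming pair-keyed similarity dictionaries (the naming is ours, for illustration only):</p>
          <preformat>
```python
from itertools import combinations
from math import comb

def theorem1_condition_holds(objs, s_nf, s_f):
    """True iff the mean of s_nF(x,y) + s_F(x,y) over all pairs is at least
    the maximum of |s_nF(x,y) - s_F(x,y)| over all pairs."""
    pairs = list(combinations(objs, 2))
    avg_total = sum(s_nf[p] + s_f[p] for p in pairs) / comb(len(objs), 2)
    max_gap = max(abs(s_nf[p] - s_f[p]) for p in pairs)
    return avg_total >= max_gap
```
          </preformat>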
        </sec>
        <sec id="sec-3-4-2">
          <title>The above theorem provides an approximation guarantee on the Fair-CC objective (cf. Eq. (2)), which combines the cluster-quality measure (first summation) and the fairness-related objective (second summation). It is not known how this guarantee translates into each individual objective, e.g., the fairness objective. This is a challenging open question, which we defer to future studies.</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Fairness Evaluation</title>
      <sec id="sec-4-1">
        <title>In this section, we summarize the most commonly adopted metrics for the evaluation of fairness aspects in clustering.</title>
        <p>
          We focus on algorithm-independent measures, i.e., measures able to generalize across multiple methods, following a group-level approach under the disparate-impact doctrine [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          Balance. It is one of the most adopted evaluation metrics for fairness in clustering, initially proposed by Chierichetti et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] in a context with one sensitive attribute with two protected groups. It has been successively generalized to ℓ protected groups by Bera et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. According to the latter, the balance of a clustering solution C can formally be defined as follows [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]:
        </p>
        <p>balance(C) = min_{c∈C, i∈[ℓ]} min{r_{c,i}, 1/r_{c,i}} ∈ [0, 1],   (5)</p>
        <p>
          where r_{c,i} is the ratio between the proportion of the objects belonging to a given protected group i in the considered dataset and in a given cluster c ∈ C.
        </p>
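        <p>Eq. (5) translates directly into code. A sketch, with our own naming conventions (clusters as lists of object ids, a dictionary mapping each object to its protected group); a group entirely missing from some cluster yields balance 0:</p>
        <preformat>
```python
def balance(clusters, group_of):
    """min over clusters c and groups g of min(r, 1/r), where r is the
    dataset-vs-cluster ratio of the proportion of group g."""
    n = sum(len(c) for c in clusters)
    labels = sorted(set(group_of.values()))
    dataset_prop = {g: sum(1 for o in group_of if group_of[o] == g) / n
                    for g in labels}
    best = 1.0
    for c in clusters:
        for g in labels:
            prop_c = sum(1 for o in c if group_of[o] == g) / len(c)
            if prop_c == 0:
                return 0.0  # fully unbalanced cluster: group g is absent
            r = dataset_prop[g] / prop_c
            best = min(best, r, 1 / r)
    return best
```
        </preformat>
        <p>Two clusters that each mirror the dataset-level group proportions score 1.0; a clustering that segregates the groups scores 0.0.</p>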
        <p>
          In such a formulation, the lower and upper bounds of a cluster indicate the fully unbalanced and perfectly balanced scenarios, respectively: the former indicates the case where all the objects in the cluster pertain to the same protected group, whereas the latter denotes an equal number of objects from each of the protected groups. Therefore, the higher the balance, the better the obtained solution in terms of equality. Additionally, the considered generalization allows us to obtain a comprehensive evaluation of the balance of our clustering solutions, as it looks at the dataset context, i.e., it will return high scores provided that the balances of the clustering and the input dataset are comparable.
        </p>
        <p>
          Average Euclidean Fairness. This metric was introduced by Abraham et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to estimate unfairness by assessing the deviation between the representation of groups, focusing on the sensitive attributes, in the whole dataset and in the given clustering solution. It expresses the cluster-size-weighted average of cluster-level deviations (i.e., Euclidean distances) between two frequency (sensitive-)attribute vectors, namely v_D, which is computed over the entire set of objects, and v_c, which is computed for each cluster c ∈ C, focusing on a sensitive attribute a ∈ F. Formally, it is defined as:
        </p>
        <p>AEF(C) = (∑_{c∈C} |c| × d(v_D, v_c)) / |𝒪|,   (6)</p>
        <p>
          where d represents the Euclidean distance between the frequency attribute vectors. Since a can be multi-valued, such a formulation is suited to scenarios where there are multiple protected groups. Also, as this measure is a deviation, smaller values correspond to better solutions.
        </p>
        <p>¹ https://github.com/Ralyhu/globalCC (implementation of CCBounds)</p>
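        <p>Eq. (6) can be sketched as follows, again with our own naming (clusters as lists of object ids, a dictionary mapping objects to their sensitive group):</p>
        <preformat>
```python
import math
from collections import Counter

def avg_euclidean_fairness(clusters, group_of):
    """Cluster-size-weighted average Euclidean distance between the
    dataset-level and cluster-level group-frequency vectors (Eq. (6))."""
    objs = [o for c in clusters for o in c]
    labels = sorted(set(group_of.values()))

    def freq_vector(members):
        cnt = Counter(group_of[o] for o in members)
        return [cnt[g] / len(members) for g in labels]

    v_dataset = freq_vector(objs)
    total = sum(len(c) * math.dist(v_dataset, freq_vector(c)) for c in clusters)
    return total / len(objs)
```
        </preformat>
        <p>A clustering whose clusters all mirror the dataset-level group frequencies scores 0, the best possible value.</p>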
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Experimental Methodology</title>
      <p>6.1. Competing Methods</p>
      <sec id="sec-5-1">
        <title>In the following, we briefly overview the competing methods we included in our experiments. For each of those methods, we used publicly available code, which we adopted “as-is”, i.e., without making any changes or optimizations.</title>
        <p>
          Fair Clustering Through Fairlets [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. This method, here dubbed Fairlets, is one of the pillars of fair clustering. It is based on the notion of fairlets decomposition, that is, a grouping of the input objects into fairlets, i.e., minimal subsets of objects that satisfy a given fairness definition, while preserving the clustering objective. Given a good fairlets decomposition, this approach requires traditional clustering algorithms (i.e., k-center or k-median) to be applied on the centers of the obtained fairlets, to yield the “fair” solutions. Fairlets supports two types of fairlets decomposition: an accurate one based on min-cost flow (MCF), and a more efficient one. We hereinafter refer to these decompositions as MCF decomposition and vanilla decomposition, respectively. A major limitation of Fairlets is that it can handle a single sensitive binary attribute only. We will discuss the impact of this limitation in more detail in Section 7.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>HST-based Fair Clustering [11].</title>
        <p>This approach, here dubbed HST-FC, focuses on the k-median formulation, and employs a quad-tree decomposition to embed the objects in a tree metric, called HST.</p>
      </sec>
      <sec id="sec-5-3">
        <title>By leveraging such a tree, HST-FC computes an approximate fairlets decomposition.</title>
        <p>A fair clustering is ultimately obtained by running k-median algorithms on the produced fairlets. Like Fairlets, HST-FC suffers from the limitation that it deals with one binary sensitive attribute only.</p>
        <p>
          Fair Correlation Clustering [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. This method, here dubbed Signed, introduces a fairlet-based reduction for the graph-clustering scenario with respect to the problem of correlation clustering, leading to the concept of correlation clustering with fairness constraints. Specifically, given a signed graph, i.e., an undirected graph with edges labeled as positive or negative, the algorithm performs a fairlets decomposition (under different fair settings) over the set of vertices. The produced decomposition is used, together with the original graph, to build a reduced (complete and unweighted) correlation clustering instance, where the vertices correspond to the produced fairlets and the sign of the edge between any two fairlets is set according to the majority sign of the edges between vertices within those two fairlets. A clustering on this reduced correlation clustering instance is computed through local-search optimization starting from all singleton clusters, and then expanded into a solution of the original problem. As a fair setting for the fairlets decomposition, we consider the most common case of fair decomposition, where clusters are required not to be dominated by a sensitive data class. As the Signed method requires a signed graph as input, we perform the following preprocessing step to make the relational data compatible with this format. We derive a complete graph whose vertices are the original data objects, and an edge (x, y) is labeled as positive with probability p⁺_xy = max{0, s_¬F(x, y) − s_F(x, y)} and as negative with probability 1 − p⁺_xy, where the similarity functions are the ones defined in Eqs. (3)–(4). We point out that, although we could adapt the same weighting strategy as CCBounds to obtain the edge attributes, we discarded this choice as our experiments showed that it favors the emergence of a degenerate clustering solution (i.e., a single output cluster), due to the strong predominance of positive weights on the edges.
        </p>
        <p>6.2. Data</p>
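        <p>The preprocessing step just described, i.e., turning the two similarity scores of each pair into a random positive/negative edge label, can be sketched as follows (function name and data layout are ours, for illustration):</p>
        <preformat>
```python
import random

def to_signed_graph(pairs, s_nf, s_f, seed=0):
    """Label each pair (x, y) as '+' with probability
    p+ = max(0, s_nF(x, y) - s_F(x, y)), and as '-' otherwise."""
    rng = random.Random(seed)
    signs = {}
    for xy in pairs:
        p_pos = max(0.0, s_nf[xy] - s_f[xy])
        signs[xy] = '+' if rng.random() < p_pos else '-'
    return signs
```
        </preformat>
        <p>Pairs where the non-sensitive similarity fully dominates (p⁺ = 1) always become positive edges, while pairs dominated by sensitive similarity (p⁺ = 0) always become negative edges.</p>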
      </sec>
      <sec id="sec-5-4">
        <title>We considered five real-world relational datasets, which have been commonly used in the fair clustering literature.</title>
        <p>The main characteristics of these datasets are summarized in Table 1. As reported in the table, in our evaluation we focused on a smaller subset of the original attributes; note that this is a common practice, which is adopted, among others, by the competing methods outlined above. Adult2 reports information about the 1994 US Census. For each tuple, representing an individual, we considered age, fnlwgt, education-num, capital-gain, and hours-per-week as non-sensitive attributes, and sex (i.e., male or female) as a sensitive attribute.</p>
        <p>Bank2 provides details on phone calls involving direct marketing campaigns of a Portuguese banking institution, to assess whether a bank term deposit will be subscribed or not. We considered attributes age, balance, and duration as non-sensitive, and marital status (i.e., married or not) as sensitive.
CreditCard3 concerns customer credit card services to estimate customer attrition. We considered attributes customer_age, dependent_count, avg_utilization_ratio, and total_relationship_count as non-sensitive, and sex as sensitive.</p>
        <p>Diabetes2 reports diabetic patient records, for which we considered age and time_in_hospital as non-sensitive attributes, and sex as a sensitive attribute.</p>
        <p>Student2 contains student performances in Mathematics and Portuguese language in the secondary education of two Portuguese schools. We considered age, study_time, and absences as non-sensitive attributes, and sex as sensitive.</p>
        <p>6.3. Evaluation Goals</p>
      </sec>
      <sec id="sec-5-5">
        <title>Our evaluation objectives concern both fairness and quality aspects of clustering.</title>
        <p>In the first case, we use the fairness metrics defined in Section 5, which allow us to have a group-wide overview of how a method behaves in terms of fairness principles. In the second case, we assess the quality of clustering by means of intra- and inter-cluster similarity, considering both the sensitive and non-sensitive attributes, as described below. Finally, we evaluate running times.</p>
        <p>Intra/Inter-cluster similarity. As stated in Section 3, we take into account the intra-cluster, resp. inter-cluster, similarity among objects to properly distribute them into clusters, focusing on either their sensitive or non-sensitive attributes (cf. Eqs. (3) and (4)). We define the following aggregated scores to have an overall measure of goodness of the clusters:</p>
        <p>intra(¬F) = (1/|Ω|) ∑_{x,y∈Ω} s_¬F(x, y),   intra(F) = (1/|Ω|) ∑_{x,y∈Ω} s_F(x, y),   (7)</p>
        <p>inter(¬F) = (1/|Θ|) ∑_{x,y∈Θ} s_¬F(x, y),   inter(F) = (1/|Θ|) ∑_{x,y∈Θ} s_F(x, y),   (8)</p>
        <p>where Ω = {x, y ∈ 𝒪 | f(x) = f(y)} and Θ = {x, y ∈ 𝒪 | f(x) ≠ f(y)}. In particular, to obtain fair clusters, we need to maximize (resp. minimize) the inter(F), resp. intra(F), scores, so that objects having the same set of sensitive attributes will not be clustered together, but rather will be well-distributed across clusters.</p>
      </sec>
      <sec id="sec-5-6">
        <title>2https://archive.ics.uci.edu/ml/datasets/</title>
      </sec>
      <sec id="sec-5-7">
        <title>3https://www.kaggle.com/sakshigoyal7/credit-card-customers</title>
        <p>Conversely, we require to maximize, resp. minimize, the intra(¬F), resp. inter(¬F), scores, to ensure that objects with the same set of non-sensitive attributes will be clustered close to each other and not scattered across different clusters.</p>
      </sec>
      <sec id="sec-5-8">
        <title>Running times. We measure the running times of CCBounds and the competing methods while executing them on the Cresco6 cluster.4</title>
        <p>6.4. Hyper-parameters and Configurations</p>
        <p>
          Data sampling and attribute selection. To test the selected competing methods under different conditions, and to run even the most computationally expensive approaches, we adopt the sampling strategy proposed in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Specifically, by sampling (without replacement) we extracted 1k or 10k tuples from the original full set of tuples, preserving some desired ratio between the protected classes. The details of the sampling strategy used in our experiments are reported in Table 2, where the selected fair attributes and split ratio (i.e., the fraction of tuples pertaining to different sensitive attribute values) are, whenever possible, the same as in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Also, both Fairlets and HST-FC require two integers p and q as input, whose ratio p/q corresponds to the minimum balance required of each cluster yielded by these algorithms. The configuration of these parameters, inspired by [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], is reported in Table 2.
        </p>
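        <p>The ratio-preserving sampling just described can be sketched as follows, for the binary-attribute case considered in our experiments (function name and argument layout are ours):</p>
        <preformat>
```python
import random

def stratified_sample(objects, group_of, n, ratio, groups, seed=0):
    """Sample n objects without replacement, preserving a desired split
    ratio (e.g. 0.8) of the first protected group over the second."""
    rng = random.Random(seed)
    first = [o for o in objects if group_of[o] == groups[0]]
    second = [o for o in objects if group_of[o] == groups[1]]
    k = round(n * ratio)
    return rng.sample(first, k) + rng.sample(second, n - k)
```
        </preformat>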
      </sec>
      <sec id="sec-5-9">
        <title>We highlight that, as described so far, we focus on a single, binary sensitive attribute, so as to match the minimum requirements supported by all competing methods.</title>
        <p>Nonetheless, some approaches (including our CCBounds) can deal with multiple values assigned to a single sensitive attribute.</p>
      </sec>
      <sec id="sec-5-10">
        <title>Number of clusters.</title>
        <p>While Fairlets and HST-FC require a hyper-parameter k in input, denoting the desired number of output clusters, the same does not apply to the correlation clustering-based approaches. Thus, to create a reasonable comparative environment, we use the (rounded) average number of clusters returned by CCBounds in ten iterations as the k parameter for Fairlets and HST-FC. Moreover, we inherit the value of k from the nearest subset when the correlation clustering-based approaches run out of memory.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7. Results</title>
      <p>Table 3 summarizes the results achieved by CCBounds and the competing methods. With the exception of very high running times and out-of-memory errors (indicated with NA and OOM, respectively), all reported measurements correspond to averages over 10 runs of the tested algorithms. The similarity values (Eqs. (7)–(8)) were obtained by using Euclidean and Jaccard similarities for numerical and categorical attributes, respectively.</p>
      <p>[Table 3: for each dataset (Adult-1k/10k/Full, Bank-1k/10k/Full, CreditCard-1k/10k, Diabetes-1k/10k/Full, Student-1k) and each method (CCBounds, Fairlets, HST-FC, Signed), the table reports intra(¬F) ↑, intra(F) ↓, inter(¬F) ↓, inter(F) ↑, computed over the set of selected sensitive attributes or the set of non-sensitive attributes (cf. Table 1), and running time (s) ↓. For each criterion, bold values correspond to the best-performing methods (possibly up to the second decimal point).]</p>
      <sec id="sec-6-1">
        <p>As discussed in Section 6.1, we report results only for the vanilla fairlets decomposition, since the min-cost-flow (MCF) counterpart has very high running times (more than 7 minutes on the smallest dataset, i.e., Student-1k) and produces solutions that are very similar to the vanilla ones (results not shown for the sake of brevity).</p>
      </sec>
      <sec id="sec-6-2">
        <p>As for the balance, we notice that, although CCBounds does not match the high scores obtained by the “fairness-native” methods (i.e., Fairlets and HST-FC), it still scores comparably to its direct competitor, i.e., Signed. Exceptions arise in the case of Student-1k and Diabetes-1k, where CCBounds settles for lower scores, and on some large datasets, where Signed does not terminate in reasonable time while our CCBounds still obtains good results in reasonable time. The picture changes when we consider small yet heavily unbalanced datasets (i.e., CreditCard-1k, with an 80:20 ratio): here, although several competing methods struggle to obtain high scores, CCBounds achieves the second-best balance score. Overall, since the balance obtained by CCBounds ranges from 0.45 to 0.613 across all evaluation scenarios, we can conclude that it is able to guarantee satisfactory balance scores.</p>
      </sec>
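      <p>To make the balance criterion concrete, the following minimal sketch computes the balance of a clustering with respect to a binary sensitive attribute, in the spirit of the fairlets notion of Chierichetti et al. [6] (the function name <italic>cluster_balance</italic> and the exact handling of single-group clusters are our own assumptions):</p>

```python
from collections import Counter

def cluster_balance(labels, groups):
    """Balance of a clustering w.r.t. a binary sensitive attribute:
    per cluster, the ratio between the minority and majority group
    counts; overall, the worst (minimum) such ratio across clusters."""
    clusters = {}
    for lab, g in zip(labels, groups):
        clusters.setdefault(lab, Counter())[g] += 1
    balance = 1.0
    for counts in clusters.values():
        values = sorted(counts.values())
        if len(values) < 2:  # a cluster containing one group only has balance 0
            return 0.0
        balance = min(balance, values[0] / values[-1])
    return balance
```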
      <sec id="sec-6-4">
        <p>In the case of avg. Euclidean fairness, CCBounds obtains very good scores under different scenarios: it is among the best-performing approaches on the Adult-1k, Adult-Full, and Bank-1k datasets, and outperforms all the other methods by an order of magnitude on Bank-10k and Bank-Full. Conversely, on the remaining datasets, CCBounds does not reach the top scores of some competitors.</p>
        <p>Considering the similarity computed on the sensitive attributes, CCBounds does not achieve the best intra-cluster similarity, meaning that it tends to group slightly more objects sharing the same sensitive attribute value than the other methods. Nevertheless, its inter-cluster similarities are comparable with those of the other methods, indicating that CCBounds is still able to properly separate the objects into clusters when accounting for the sensitive attribute. When we instead focus on the similarity computed on the non-sensitive attributes, CCBounds achieves the best performance in all the considered evaluation scenarios, yielding very high-quality clusters.</p>
      </sec>
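      <p>One possible way to compute the average intra- and inter-cluster similarities used in this kind of validation is sketched below (our own illustration with a hypothetical <italic>intra_inter_similarity</italic> helper; the paper's exact estimator may aggregate differently):</p>

```python
import itertools

def intra_inter_similarity(labels, sim_matrix):
    # Average pairwise similarity within clusters (intra) and
    # across different clusters (inter), given a precomputed
    # symmetric similarity matrix over the objects.
    intra, inter = [], []
    for i, j in itertools.combinations(range(len(labels)), 2):
        (intra if labels[i] == labels[j] else inter).append(sim_matrix[i][j])
    avg = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return avg(intra), avg(inter)
```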
      <sec id="sec-6-5">
        <p>Finally, we also investigated running times, finding Fairlets to be the best performer, followed by HST-FC and CCBounds, which both guarantee reasonable running times. Although CCBounds has quadratic time complexity due to the pairwise similarity calculations (cf. Section 4), we performed these time-consuming steps in parallel. On the contrary, Signed requires excessively long execution times, often proving infeasible in practice, and produces an abnormal number of clusters, which is particularly large even on the smallest 1k datasets. Overall, although the observed running times should be taken with a grain of salt due to the (lack of) code optimizations, the main findings are consistent with the time complexities of the corresponding methods.</p>
      </sec>
      <sec id="sec-6-6">
        <title>Discussion.</title>
        <p>A number of remarks arise from our experimental evaluation. First, although native fairness-aware approaches are able to produce clustering solutions that optimize fairness notions, we found that such a capability comes at a cost, as the produced clusters are often far from being qualitatively good. On the other hand, CCBounds proved to be effective and versatile: it was the best among the tested approaches at finding good-quality clusters, while not excessively penalizing fairness-related aspects.</p>
        <p>Second, although we unveiled the quality weaknesses of the native fair-clustering approaches, we also shed light on how the approaches based on correlation clustering might suffer from computational issues, being slower than the other methods and requiring more memory. This is particularly evident with Signed, which is unable to terminate on any dataset with more than 10k tuples; memory usage is instead kept under control by CCBounds, which fails only on Diabetes-Full (containing more than 100k tuples, cf. Section 6.2), thanks to the numerous optimizations adopted under the hood. However, such a dataset makes it difficult to calculate similarities even for traditional and more efficient approaches, despite the computing capabilities at our disposal.</p>
        <p>Finally, our approach achieves fairness-aware performance comparable to its direct competitor (Signed), while outperforming all considered methods in producing high-quality clusters, without compromising fairness.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>8. Conclusions</title>
      <p>We showed how the correlation clustering method CCBounds [<xref ref-type="bibr" rid="ref4">4</xref>] effectively addresses fair clustering. Experiments on real data confirm the quality of its solutions, which outperform those of competing methods on standard clustering metrics while preserving fairness.</p>
      <p>
        Future work includes evaluating CCBounds with multiple protected values, exploring alternative similarity definitions, and extending its applicability to more complex scenarios with multiple sensitive attributes and/or uncertainty [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], as well as to polarization detection [21], enhancing its practicality and versatility.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>D. Mandaglio and A. Tagarelli are partly supported by the PNRR Future AI Research (FAIR) project (H23C22000860006, M4C21.3 spoke 9). L. La Cava is supported by project SERICS (PE00000014), under the MUR National Recovery and Resilience Plan funded by the EU NextGenerationEU.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Feldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Friedler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Moeller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Scheidegger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Venkatasubramanian</surname>
          </string-name>
          ,
          <article-title>Certifying and removing disparate impact</article-title>
          ,
          <source>in: Proc. ACM KDD Conf.</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>259</fpage>
          -
          <lpage>268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chawla</surname>
          </string-name>
          ,
          <article-title>Correlation clustering</article-title>
          ,
          <source>Mach. Learn</source>
          .
          <volume>56</volume>
          (
          <year>2004</year>
          )
          <fpage>89</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gullo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          ,
          <article-title>A combinatorial multi-armed bandit approach to correlation clustering</article-title>
          ,
          <source>DAMI</source>
          <volume>37</volume>
          (
          <year>2023</year>
          )
          <fpage>1630</fpage>
          -
          <lpage>1691</lpage>
          . doi:10.1007/S10618-023-00937-5.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gullo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. La</given-names>
            <surname>Cava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          ,
          <article-title>When correlation clustering meets fairness constraints</article-title>
          ,
          <source>in: Proceedings of International Conference on Discovery Science</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>302</fpage>
          -
          <lpage>317</lpage>
          . doi:10.1007/978-3-031-18840-4_22.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chhabra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Masalkovaitė</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mohapatra</surname>
          </string-name>
          ,
          <article-title>An overview of fairness in clustering</article-title>
          ,
          <source>IEEE Access 9</source>
          (
          <year>2021</year>
          )
          <fpage>130698</fpage>
          -
          <lpage>130720</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Chierichetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lattanzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vassilvitskii</surname>
          </string-name>
          ,
          <article-title>Fair clustering through fairlets</article-title>
          ,
          <source>in: Proc. NIPS Conf</source>
          .,
          <year>2017</year>
          , pp.
          <fpage>5029</fpage>
          -
          <lpage>5037</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Bera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chakrabarty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Negahbani</surname>
          </string-name>
          ,
          <article-title>Fair algorithms for clustering</article-title>
          ,
          <source>in: Proc. NIPS Conf</source>
          .,
          <year>2019</year>
          , pp.
          <fpage>4955</fpage>
          -
          <lpage>4966</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I. O.</given-names>
            <surname>Bercea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Groß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rösner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>On the cost of essentially fair clusterings</article-title>
          ,
          <source>in: Proc. APPROX/RANDOM Conf.</source>
          ,
          <year>2019</year>
          , pp. 18:1-18:22.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kleindessner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Awasthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Morgenstern</surname>
          </string-name>
          ,
          <article-title>Fair k-center clustering for data summarization</article-title>
          ,
          <source>in: Proc. ICML Conf</source>
          .,
          <year>2019</year>
          , pp.
          <fpage>3448</fpage>
          -
          <lpage>3457</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Abraham</surname>
          </string-name>
          , D. P,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Sundaram</surname>
          </string-name>
          ,
          <article-title>Fairness in clustering with multiple sensitive attributes</article-title>
          ,
          <source>in: Proc. EDBT Conf</source>
          .,
          <year>2020</year>
          , pp.
          <fpage>287</fpage>
          -
          <lpage>298</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Backurs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Indyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Onak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schieber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vakilian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <article-title>Scalable fair clustering</article-title>
          ,
          <source>in: Proc. ICML Conf</source>
          .,
          <year>2019</year>
          , pp.
          <fpage>405</fpage>
          -
          <lpage>413</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kleindessner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Samadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Awasthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Morgenstern</surname>
          </string-name>
          ,
          <article-title>Guarantees for spectral clustering with fairness constraints</article-title>
          ,
          <source>in: Proc. ICML Conf</source>
          .,
          <year>2019</year>
          , pp.
          <fpage>3458</fpage>
          -
          <lpage>3467</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmadian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Epasto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Knittel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mahdian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Moseley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vassilvitskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Fair hierarchical clustering</article-title>
          ,
          <source>in: Proc. NIPS Conf</source>
          .,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmadian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Epasto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mahdian</surname>
          </string-name>
          ,
          <article-title>Fair correlation clustering</article-title>
          ,
          <source>in: Proc. AISTATS Conf</source>
          .,
          <year>2020</year>
          , pp.
          <fpage>4195</fpage>
          -
          <lpage>4205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gullo</surname>
          </string-name>
          ,
          <article-title>Correlation clustering with global weight bounds</article-title>
          ,
          <source>in: Proc. ECML-PKDD Conf.</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>499</fpage>
          -
          <lpage>515</lpage>
          . doi:10.1007/978-3-030-86520-7_31.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Charikar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>Aggregating inconsistent information: Ranking and clustering</article-title>
          ,
          <source>JACM</source>
          <volume>55</volume>
          (
          <year>2008</year>
          )
          23:1-23:27.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Charikar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Guruswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wirth</surname>
          </string-name>
          ,
          <article-title>Clustering with qualitative information</article-title>
          ,
          <source>JCSS</source>
          <volume>71</volume>
          (
          <year>2005</year>
          )
          <fpage>360</fpage>
          -
          <lpage>383</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Demaine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Emanuel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fiat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Immorlica</surname>
          </string-name>
          ,
          <article-title>Correlation clustering in general weighted graphs</article-title>
          ,
          <source>TCS</source>
          <volume>361</volume>
          (
          <year>2006</year>
          )
          <fpage>172</fpage>
          -
          <lpage>187</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Charikar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>Aggregating inconsistent information: ranking and clustering</article-title>
          ,
          <source>in: Proc. ACM STOC Symp</source>
          .,
          <year>2005</year>
          , pp.
          <fpage>684</fpage>
          -
          <lpage>693</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gullo</surname>
          </string-name>
          ,
          <article-title>In and out: Optimizing overall interaction in probabilistic graphs under clustering constraints</article-title>
          ,
          <source>in: Proc. ACM KDD Conf.</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1371</fpage>
          -
          <lpage>1381</lpage>
          . doi:10.1145/3394486.3403190.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L. La</given-names>
            <surname>Cava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mandaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tagarelli</surname>
          </string-name>
          ,
          <article-title>Polarization in decentralized online social networks</article-title>
          ,
          <source>in: Proceedings of the 16th ACM Web Science Conference</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>48</fpage>
          -
          <lpage>52</lpage>
          . doi:10.1145/3614419.3644013.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>