Proceedings of the Conference on Technology Ethics 2020 - Tethics 2020

Algorithmic Fairness and its Limits in Group-Formation

Long paper

Otto Sahlgren1[0000-0001-7789-2009] and Arto Laitinen2[0000-0002-4514-7298]
1 Department of Philosophy at Tampere University, Tampere, Finland
2 Department of Philosophy at Tampere University, Tampere, Finland
otto.sahlgren@tuni.fi

Abstract. Algorithmic group-formation has become a flourishing research area in the computer sciences, and more recently in the field of data mining and fair machine learning. Application domains for algorithmic solutions to grouping range widely, from team-recommendation and formation in work settings to ability-grouping in education. Recent work has also focused on fairness in group-formation. We briefly review the literature on algorithmic team-formation and consider fairness in different group-formation contexts. We articulate different dimensions and constraints that are relevant for fair group-formation and discuss the tension between utility and fairness. Many of the problems and limitations regarding formal definitions of fairness explicated in the fair machine learning literature apply also in the context of group-formation. We suggest some limits to the relevance of fairness in general and algorithmic fairness in particular. We argue that algorithmic fairness is less relevant to some groups because of the way they come into existence or because fairness is not a central value for them. Other central values are subjective rights; autonomy or liberty; legitimacy and authority; solidarity; and diversity, each of which can be in tension with optimal fairness-and-utility. But within acceptable limits, we argue that fairness is indeed a valuable goal that may be in tension with maximization of the relevant types of utility.
Keywords: algorithmic group-formation, team-recommendation, machine learning, fairness, political philosophy, ethics

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Algorithmic group-formation has become a flourishing research area in the computer sciences, and more recently in the field of data mining and fair machine learning (ML). Application domains for algorithmic solutions to grouping range widely, from team-recommendation and formation in work settings (Chen & Lin, 2004; Lappas et al., 2009; Barnabò et al., 2019) to ability-grouping in education (see Carmel & Ben-Shahar, 2017). Relevant questions in this field include: How do we form optimal groups algorithmically while adhering to specific criteria or desiderata, such as utility or diversity? How do we ensure that the process or its outcome is fair? What is fairness in this context? Little attention has been paid, however, to other morally significant aspects of algorithmic group-formation, such as questions of rights and entitlements, autonomy and solidarity. This paper bridges this gap by drawing lessons from political philosophy that are relevant for algorithmic group-formation.

In Section Two we briefly review the literature on algorithmic team-formation and consider fairness in different group-formation contexts. We articulate dimensions and constraints relevant for fair algorithmic group-formation and discuss the tension between utility and fairness. We consider different metrics for fairness and argue that many of the problems and limitations regarding the formal definitions of fairness explicated in the fair ML literature apply also in the context of group-formation.
In Section Three we discuss key concepts and ideas from political philosophy relevant for fairness in algorithmic group-formation, and introduce novel viewpoints on algorithmic group-formation alongside considerations of fairness and utility. Recent research has drawn lessons from political philosophy concerning fairness (see Binns, 2018; Narayanan, 2018). This paper goes further and draws also on debates on group-formation (sometimes under the headings of teams, communities, or even societies), which suggest some limits to the relevance of fairness in general and algorithmic fairness in particular. We argue that fairness is less relevant to many forms of groups, either because of the way the groups come into existence (making algorithmic solutions redundant) or because fairness is not the only central value for them (making fairness-optimization unjustified). Other central values are subjective rights; autonomy or liberty; legitimacy and authority; solidarity; and diversity, each of which can be in tension with optimal fairness-and-utility. But within acceptable limits, we argue that fairness (which can come in many forms) is indeed a valuable goal that may be in tension with maximization of the relevant types of utility (which can also come in many forms).

2 Fairness in algorithmic group-formation

The Team Formation Problem was first formulated in Lappas et al. (2009), who offered algorithmic solutions for determining an optimal team given a task, a set of individuals, and a set of relations between them capturing their compatibility. Majumder et al. (2012) build on this approach, adding a capacity constraint which requires that no individual is overloaded by an assigned task. More recently, Bulmer et al. (2020) consider scenarios where, instead of one task, there are multiple tasks to be assigned to multiple teams, and full coverage of project tasks is not required (instead, optimal coverage of project tasks is pursued).
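To illustrate the structure of this optimization problem, the following sketch solves a toy instance: given a task requiring a set of skills and pairwise communication costs between candidates, find a team that covers the skills at minimal intra-team cost. All names, skills, and cost values are invented for illustration, and the brute-force search merely stands in for the approximation algorithms that Lappas et al. (2009) actually propose; the general problem is NP-hard.

```python
from itertools import combinations

# Toy instance (illustrative data): each candidate has a skill set, and a
# symmetric pairwise "communication cost" encodes how poorly two people collaborate.
skills = {
    "ana": {"python", "statistics"},
    "bo":  {"design"},
    "cam": {"python", "design"},
    "dee": {"statistics", "writing"},
}
cost = {
    frozenset({"ana", "bo"}): 3,
    frozenset({"ana", "cam"}): 1,
    frozenset({"ana", "dee"}): 2,
    frozenset({"bo", "cam"}): 2,
    frozenset({"bo", "dee"}): 3,
    frozenset({"cam", "dee"}): 3,
}

def team_cost(team):
    """Sum of pairwise communication costs within a team."""
    return sum(cost[frozenset(pair)] for pair in combinations(sorted(team), 2))

def best_covering_team(required):
    """Exhaustively search for the cheapest smallest team covering all required
    skills. Feasible only for tiny instances; the general problem is NP-hard."""
    people = list(skills)
    best = None
    for size in range(1, len(people) + 1):
        for team in combinations(people, size):
            covered = set().union(*(skills[p] for p in team))
            if required <= covered:
                c = team_cost(team)
                if best is None or c < best[0]:
                    best = (c, set(team))
        if best is not None:  # smallest feasible team size found
            break
    return best

print(best_covering_team({"python", "statistics", "design"}))
# → (1, {'ana', 'cam'}): covers all three skills at the lowest pairwise cost
```

The fairness questions discussed below arise precisely because nothing in this objective function constrains who gets selected, only how cheaply the skills are covered.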
Fairness in group-formation and team-recommendation systems has also gained increasing interest (see Anagnostopoulos et al., 2012; Machado & Stefanidis, 2019). We consider different formal definitions and metrics for fairness that are applicable in the context of group-formation below, alongside their limitations and blind spots. First, however, we consider some aspects and constraints of grouping.

2.1 Constraints in group-formation

Decision-making is always constrained by external affordances, circumstances and contextual factors. Group-formation does not differ in this regard: approaches to group-formation are defined and constrained by the nuanced goals, resources and affordances, as well as limitations, of the context, in addition to possible ethical and legal considerations. We next explicate some dimensions and constraints of group-formation. They include, but are not limited to, the following:

(i) the number of tasks/projects/activities/roles to be distributed,
(ii) the number of individuals to be grouped,
(iii) the desired number of groups,
(iv) team-composition and role flexibility (e.g., whether an individual can have several roles, perhaps in several teams),
(v) appropriate criteria of inclusion, or criteria according to which membership is distributed,
(vi) costs and benefits, and
(vii) group-level properties, such as diversity, that ought to be exhibited by some or all groups that are formed.

First, a team-formation task involves one or more (sets of) objectives, tasks, resources, activities, etc., to be assigned. Sometimes there is one task, such as a single case in a law firm to be assigned to a legal team. In other situations, there may be numerous assignments or projects, e.g., different essay topics on a philosophy course.
Second, the number of individuals to be grouped, or from which a subset is to be assigned to a single task, may vary (e.g., size of overall personnel, department sizes, course attendees).

Third, there may be a constraint as to how many groups can (or should) be formed. This will usually, but not necessarily, depend on the number of tasks or projects available as well as on contextual affordances. In certain contexts, there may be an optimal or desirable number of groups (e.g., the number of groups in child day-care) due to resource constraints (e.g., personnel and facility affordances). This may be more or less flexible, however (e.g., there is no principled constraint on how many learners can be grouped for a certain project in a philosophy class).

Fourth, depending on the situation, individuals may be grouped into only one group at a time or they may belong to several groups simultaneously. For example, in some cases it might be required that workers split their time between different projects, and in others workers may be assigned to only one group/task at a time.

Fifth, the inclusion criteria for group-membership may vary from context to context. In product development tasks, a given project may require a certain heterogeneous set of attributes, such as a range of different skills. Workers with rare skills or high-level expertise may be in a position of advantage in the competition for a place in the team. Sometimes criteria for inclusion may not be skill-related. In education, group essay topics, for example, might be assigned to groups based not on their skills but on group-members' (aggregated) preferences. The question regarding the selection of inclusion criteria is, of course, significant from the point of view of fairness, given that certain individual attributes, such as 'race', are commonly excluded from such criteria for ethical and/or legal reasons.
Sixth, there may be varying costs and benefits associated with specific aspects of group-formation. Different kinds of errors may bring about different costs or harms in algorithmic decision-making. As such, it may be appropriate to prioritize the mitigation or elimination of certain errors, such as false positives in recidivism risk assessment or false negatives in algorithmic medical diagnosis.1 This applies also in group-formation: assigning individuals into groups that are a poor fit for them, or excluding them from groups that they would have been best suited for, may bring about different costs. In algorithmic team-formation, coordination or communication costs are often estimated when forming collaborative groups (see, e.g., Anagnostopoulos et al., 2012; Lappas et al., 2009). Furthermore, certain group compositions may be more beneficial and/or more just than others. There is a long debate in the educational sciences regarding ability-grouping, for example, which has centered on the question of whether homogeneous or heterogeneous learning groups are most beneficial for learning (e.g., Steenbergen-Hu, 2016; see also Wichmann et al., 2016). Empirical questions regarding diversity in groups (what kinds of groups bring about the best utility) are accompanied by ethical ones, e.g., should groups be diverse along certain lines even if this results in costs in terms of utility or productivity?

Lastly, the grouping process, and the quality of its outcomes, may be evaluated along several lines. Ounnas et al.
(2007) offer a metrics framework which includes the following sets of metrics: (i) formation metrics that tell us how well individual groups, or all groups in the cohort, are formed under given constraints (including metrics for perceived satisfaction with the formation), (ii) productivity metrics that tell us how well the formed groups perform with respect to the task they are assigned to (i.e., outcome quality), and (iii) goal satisfaction metrics that indicate how well an individual goal within a given collaborative task or project, or the set of all goals, is satisfied at the group or cohort level. Okimoto et al. (2015) also present a property of individual groups, which may be desired in several contexts, called k-robustness: a team is k-robust if it can accomplish a given task even if k of its members are removed.

In addition to these metrics, one can of course evaluate grouping and the outcomes of grouping, i.e., the formation, from the point of view of fairness and diversity, and of other morally significant aspects which we consider below. This brief overview of relevant considerations provides different viewpoints on fairness considerations in grouping, although it is by no means exhaustive. Furthermore, algorithmic fairness cannot be considered meaningfully without considering a variety of sociotechnical factors, such as contexts of development and use, institutional practices, accountability and transparency, and the moral value and justification of particular technologies (see Selbst et al., 2019).

1 For discussion on the subject see Corbett-Davies et al. (2017) and Hellman (2019).

2.2 Bias in group-formation

Historical and technical biases in training data and in algorithms may lead to unfair outcomes and discrimination in algorithmic decision-making processes (Angwin et al., 2016; d'Alessandro et al., 2017).
Furthermore, once built into the system, biases may be continuously reproduced and amplified through feedback-loops (see, e.g., Lum & Isaac, 2016). Given that algorithmic solutions to group-formation rely on data which may introduce bias into the model, issues of bias and unfairness prove worrisome in algorithmic group-formation as well. The formation of work teams for projects may utilize data on individuals' work experience (e.g., past projects), assessments of work performance (e.g., employee evaluations), and social connection information (see, e.g., Liu et al., 2014). While often natural choices for such a task, these types of information are prone to contain biases due to discrimination at the workplace (see, e.g., Barocas & Selbst, 2016), whether by employers or co-workers, and thus risk unfairness in team-recommendation if not appropriately scrutinized, checked for bias, and sanitized.

In addition to possible bias, the fairness of group-formation also depends on whether the data used is suitable for predicting task-performance, suitability, or other relevant qualities in the first place. For example, team-formation in the context of work has been conducted on the basis of individuals' personality profiles as measured with the Myers-Briggs Type Indicator (MBTI) (see Chen & Lin, 2004; Yilmaz et al., 2015). While inter-individual compatibility within groups plausibly affects the performance or productivity of groups, there are limits to the utility of certain types of information with respect to the objective of group-formation. There is, for example, significant controversy regarding the MBTI's reliability and validity, as well as its value in the context of work-related issues such as career planning (Pittenger, 1993).

2.3 Definitions, metrics, and limitations

A plethora of metrics and formal definitions for fairness have been introduced in the fair ML literature for measuring unwanted biases in processing and outcome distributions.
Common approaches to measuring fairness in ML models include estimating outcome class equity, error rate disparities, and the illegitimate or undesired effects sensitive attributes have on the generated outputs (Verma & Rubin, 2018). Most of these metrics are formulated with contexts in mind that involve allocating goods or opportunities by ranking or scoring individuals and assigning them into positive or negative outcome classes accordingly (e.g., binary classification).

In team-formation, when the goal is to form a single team from a set of individuals (e.g., one team of experts), several metrics introduced in the fair ML literature prove applicable. This is because the formation process is, in these cases, formally analogous to those described above (e.g., binary classification). When this is the case, we can treat membership in the formed group as a commodity to be (fairly) distributed between the individuals, akin to other goods or benefits. (The desirability of membership may vary depending on whether being assigned to the group is a burden or a benefit.) When there are multiple groups to be formed and multiple goods or activities to be distributed to those groups (e.g., in multiclass classification or clustering tasks), there are other considerations, such as diversity or proportionality of shares both within and across groups, that matter for fairness.

Before we move on to consider distinct metrics and definitions for fairness, it should be noted that the objective (or desirable outcome) of group-formation is in many cases maximal utility. In work and education, utility may be understood, e.g., as teams' efficiency in solving a task, their aggregated skill-level or the coverage of a required set of skills – or, more generally, as productivity or efficiency.
This is the case with many approaches in the computer science literature, which do not necessarily consider fairness in team-formation or diversity within and across teams. We consider some exceptions below. In Section Three, we consider other aspects of group-formation which bear on the normative considerations regarding algorithmic grouping (and grouping in general).

Blindness to sensitive information

A first definition of fairness to be considered here – and plausibly one aligning with common intuitions about non-discrimination – is fairness through unawareness (see Kusner et al., 2017). Essentially, this approach requires that information about sensitive attributes is not used in algorithmic decision-making. On this view, a team-formation process, or its outcomes, are fair provided that sensitive information is not considered. As information about sensitive attributes may be redundantly encoded in seemingly neutral data, however, and as other factors may function as proxies for sensitive traits, unawareness is often not sufficient to ensure fairness (Dwork et al., 2012). To take an example from the context of education, data on students' language skills may correlate with immigrant status and ethnicity, and proficiency in the use of technology may correlate with socio-economic status by functioning as a proxy for access to technology (see, e.g., Carmel & Ben-Shahar, 2017). The use of such data in team-recommendations or ability-grouping may lead to discrimination or undue denials of access to educational opportunities for members of disadvantaged groups even if the grouping process is nominally blind to sensitive attributes.

The underlying intuition behind fairness through unawareness may be better understood in terms of conditional independence or counterfactuals.
For example, we might require that an individual's predicted outcome not differ in a possible world where she does not belong to a sensitive group (other factors remaining identical). This notion of fairness is captured by counterfactual fairness (Kusner et al., 2017). Similarly, we might hold that a fair process or outcome of group-formation is one that is not affected by information about sensitive attributes, whether directly or by proxy. However, it may be that some factors on which sensitive attributes do have an effect (in the model) are reasonable or legitimate criteria for inclusion (see Kilbertus et al., 2017; Nabi & Shpitser, 2018). While gender may affect, e.g., an individual's choice of education, there are arguably cases where information about educational background is relevant for group-formation, e.g., in forming interdisciplinary collaboration teams.

Equity and diversity

Unawareness and/or conditional independence of sensitive attributes may be laudable in many cases, but it may be inconsistent with explicitly promoting diversity; equitable group-formation may require accounting for individuals' sensitive attributes. Statistical parity – a simple definition of equity in outcome classes – is satisfied when members of the relevant groups (typically protected groups, e.g., gender groups) are equally likely to receive a positive prediction (see Dwork et al., 2012). This approach to fair team-formation is taken in Barnabò et al. (2019), where fair single-team formation is defined through equitable outcomes: each protected group should have equal representation in the formed team. This requirement could be extended to multi-team formation settings: across the entire set of formed groups, each group should satisfy statistical parity.
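Statistical parity in a formed team can be audited directly from selection rates. The sketch below (all attribute names and data are invented for illustration) computes the rate at which members of each group in a candidate pool were selected for a single formed team; passing several attributes at once audits intersectional subgroups.

```python
def selection_rates(candidates, team, attrs):
    """Selection rate P(selected | group) for each group, where a group is
    defined by the values of the given attribute(s). Passing more than one
    attribute audits intersectional subgroups."""
    groups = {}
    for person, features in candidates.items():
        key = tuple(features[a] for a in attrs)
        sel, tot = groups.get(key, (0, 0))
        groups[key] = (sel + (person in team), tot + 1)
    return {key: sel / tot for key, (sel, tot) in groups.items()}

# Illustrative candidate pool and one formed team.
candidates = {
    "p1": {"gender": "f", "eth": "a"},
    "p2": {"gender": "f", "eth": "b"},
    "p3": {"gender": "m", "eth": "a"},
    "p4": {"gender": "m", "eth": "b"},
    "p5": {"gender": "f", "eth": "b"},
    "p6": {"gender": "m", "eth": "a"},
}
team = {"p1", "p3", "p5", "p6"}

# Parity holds for gender taken alone (both rates are 2/3)...
print(selection_rates(candidates, team, ["gender"]))
# ...but masks disparities between intersectional subgroups:
# here ("m", "a") is always selected and ("m", "b") never is.
print(selection_rates(candidates, team, ["gender", "eth"]))
```

As this toy data shows, a quota can be met for each unitary attribute while some intersectional subgroup is left out entirely.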
Enforcing statistical parity may result in outcomes which are "blatantly unfair" to individuals, however, as it does not guarantee that the most suitable individuals are selected (Dwork et al., 2012, p. 215). Equal representation, thus, may come at a cost to accuracy, recommendation relevance, or other measures of utility. Furthermore, statistical parity with respect to unitary protected attributes does not ensure that intersectional groups enjoy comparatively equal representation in the formed group. Naïve adherence to a parity metric may lead to what Kearns and colleagues (2017) call 'fairness gerrymandering': the quota for some value of a protected attribute (e.g., women) may be filled, but some subset of the individuals in that group (e.g., women of color) may still be comparatively underrepresented within the quota.

Notably, equity or diversity is a placeholder term in the formal sense; i.e., there are multiple dimensions or "currencies" of diversity in any given group. Typically, diversity is discussed mainly in terms of ethnicity or gender, perhaps because a lack of diversity in those regards is a persistent problem in many contexts. The currency of diversity may, nonetheless, also be one of skill, vocation, etc. In other words, the notion of diversity has in itself no predetermined content but, rather, functions as a formal property of a formation. Hence, a group may be diverse along one dimension but not along another. A product development team may be diverse in terms of skill (e.g., it is interdisciplinary) but homogeneous in terms of gender (e.g., exclusively male), and vice versa. As such, the relevant question regarding fairness is whether group-formations are diverse in morally significant ways – i.e., what kinds of similarities and dissimilarities we ought to find within and across the formed groups.2

Fair allocations and burdens

Fairness in group-formation may range beyond considerations of equity and diversity.
For many groups, diversity is not a stringent moral requirement. It may not be morally required of, e.g., interest-groups that serve and promote the rights of certain demographic or vocational groups. (However, diversity along some dimensions of identity may still be morally praiseworthy in these cases.) Alongside diversity, fair distributions of tasks both within and between groups seem significant from the point of view of justice. For fair team-formation in collaborative work it might be desirable, for example, to have an individual fairness constraint requiring that each individual has a reasonable workload (irrespective of the overall workload of her team), but also a formation-level constraint requiring that different teams' burdens are roughly equal (see, e.g., Anagnostopoulos et al., 2012).

Machado and Stefanidis (2019) define fairness in terms of teams' equal suitability for the projects they are assigned: the match between project requirements and team skill-sets should be (roughly) equal across all formed pairs of teams and projects. This approach captures the idea that, when forming multiple teams, it seems appropriate to ensure that the formation process does not prioritize any group over another in terms of group-to-task match success. However, this notion of fairness can be satisfied even if every group is poorly suited to complete the task it is assigned. Similarly, an ML model may be calibrated, in that its scores estimate the underlying probability (e.g., of suitability or relevance) equally well across different individuals, but still perform inadequately in terms of the overall accuracy of its predictions.3

2 See Green & Hu (2018) for a critical discussion of the concept of similarity in fair ML.
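The gap between well-distributed scores and well-distributed errors can be made concrete by auditing error rates per group. In the sketch below (group labels and records are invented for illustration), false positive and false negative rates of suitability predictions are computed separately for two groups; systematically downgraded assessments of one group surface as a higher false negative rate for that group.

```python
def group_error_rates(records):
    """False positive / false negative rate per group.
    records: iterable of (group, predicted_suitable, actually_suitable)."""
    stats = {}
    for group, pred, actual in records:
        fp, fn, neg, pos = stats.get(group, (0, 0, 0, 0))
        if actual:
            pos += 1
            fn += not pred   # suitable individual predicted unsuitable
        else:
            neg += 1
            fp += pred       # unsuitable individual predicted suitable
        stats[group] = (fp, fn, neg, pos)
    return {g: {"fpr": fp / neg if neg else 0.0,
                "fnr": fn / pos if pos else 0.0}
            for g, (fp, fn, neg, pos) in stats.items()}

# Illustrative records: biased performance assessments lead the model to
# under-predict suitability for group "w".
records = [
    ("m", True, True), ("m", True, True), ("m", False, False),
    ("m", True, False),
    ("w", False, True), ("w", True, True), ("w", False, True),
    ("w", False, False),
]
print(group_error_rates(records))
# group "w" has a false negative rate of 2/3 versus 0 for group "m"
```

Such an audit measures only disparity between groups; as noted above, it says nothing about whether the predictions are accurate overall.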
Furthermore, if different demographic groups differ in terms of their base rates in the data, calibration is impossible to satisfy simultaneously with error rate parity (Chouldechova, 2017; Kleinberg et al., 2016). For group-formation this means that fair algorithmic assessments of individuals' suitability for different teams/tasks may conflict with equal error rates in team-recommendation. To provide a hypothetical example, if women workers' performance has been systematically downplayed in assessments by their employers (i.e., the data exhibits gender bias), this may lead to a higher rate of errors in team-recommendations for women.

Preferences

One can also approach fairness from the point of view of preference-satisfaction. Political philosophers leaning towards welfarist and utilitarian stances, in particular, argue that the extent to which distributions of goods and opportunities are just depends on whether individuals' preferences are appropriately respected and/or satisfied (Cohen, 1989; see also Gosepath, 2011). Preference-satisfaction is a key factor in recommender systems, where the utility of a model is often measured as the relevance of the recommendations it generates for users. Whether individuals' preferences are satisfied, and to what extent, is arguably significant when we consider how, e.g., product development teams or learning groups should be formed. We might deem it appropriate in an educational context that each group of students receives, from a pool of different essay topics, the topic it most prefers. The extent to which individuals (and by extension, groups) are satisfied with the group they are assigned to, or with the overall formation of groups, can be measured via self-assessment questionnaires, for example (see Ounnas et al., 2007).

Preference-satisfaction can also play a part in fairness considerations, as Dworkin (1981) argues in defending a resource-focused theory of equality.
Dworkin entertains a hypothetical auction in which individuals can, with an equal endowment of resources, acquire bundles of goods and commodities. If no individual is envious of another's bundle (envy-freeness), the resulting distribution can be deemed just even if individuals' bundles are non-identical. Envy-freeness has been operationalized in the recommender system literature as a way of measuring the fairness of package-to-group recommendations at the level of both individuals and groups (Serbos et al., 2017). Envy-freeness approaches can be used to assess whether sets of tasks algorithmically distributed to groups are fair for individuals in those groups and across the overall formation.

A limitation of preference-based notions of fairness is that they are not sensitive to entitlements. Sometimes certain individuals may have priority in having their preferences satisfied over someone else's. Institutional entitlements that come with, say, seniority or work-related merits may warrant deviations from fair allocation (e.g., priority in choosing which projects to work on). It may also be that some preferences should not be recognized as legitimate claims to membership in a group (e.g., a preference to work only with individuals of a certain gender or ethnicity).

3 For discussion on calibration and fairness, see Pleiss et al. (2017).

3 Rights, autonomy, legitimacy, solidarity, and diversity

The phenomenon of group-formation has a long and rich history of debates in political philosophy. In the subsections that follow we start from these debates. We argue that algorithmic fairness is irrelevant to many forms of groups, either because of the way the groups come into existence or because fairness is not a central value for them. To anticipate, the values that come closest to fairness and utility are solidarity and diversity.
They can be seen as good-making features in group-formation and, arguably, as being of both intrinsic and instrumental value: groups may perform better when high levels of solidarity and an appropriate degree of heterogeneity or diversity are present (Figure 1).

Figure 1. Values to be jointly optimized

But arguably, there are other considerations (from individual rights and liberties to legitimate authority) that either pre-empt or override questions of utility, fairness, solidarity and diversity, or concern the question of whose decision the group-formation is (Figure 2).

Figure 2. Normative considerations that complicate the picture

3.1 Kinds of groups

Groups come in many shapes and sizes. Some groups (call them "feature-based groups") consist of people with a certain property: the group of all left-handed people includes people who are left-handed. Some groups (call them "emergent" groups) emerge out of interaction, say, a group of people who regularly meet for cigarettes outside a library, or who share a train commute and chat at the station waiting for the train. For such groups, the problem of managing group-formation (algorithmically or otherwise) cannot directly arise (of course, managing the features or interactions will indirectly affect the constitution of these groups). Only some groups are such that membership is independent of the features of individuals or their patterns of interaction. Some groups (call them "birthright groups") we are born into (a family, a state in contemporary understandings; see below), and only some memberships are granted intentionally or entered into voluntarily later in life (call them "intentionally formed groups"). The issue of group-choice or managed team-formation is at home concerning such groups. Fairness, algorithmic or otherwise, can be action-guiding only in such contexts.
(We can, nonetheless, in a backward-looking sense, assess the fairness or utility of membership-allocations even in contexts where they are not action-guiding: see section 3.6 below.) Fairness is, however, to be balanced with other considerations, such as rights, autonomy or liberty, solidarity, legitimate authority, and diversity, even in those contexts. Thus, the conflict between (kinds of) utility and (kinds of) fairness does not capture the whole normative shape of matchmaking and group-formation.

3.2 The relevance of rights in group-formation

Michael Walzer (1983) in his theory of justice discusses state membership (citizenship) as the first good to be distributed justly, because other distributions depend heavily on this membership. Similarly, family membership (not a merely biological phenomenon, as it involves the status of a family member, as, e.g., adoption or rejection suggest) is another high-stakes decision. The issue of fairness is not practically relevant for those who already are citizens of this rather than that state, or members of this rather than that family: questions of birthright silence any considerations of fairness. For those who are stateless, or children without a family, a question of rights is the most relevant consideration. They have the right to become a member of a state, and thereby to have the package of rights that the state grants to citizens. Corresponding to the rights of stateless individuals, each state may have a duty to naturalize stateless persons, although there may be questions of fairly sharing the burden.

Plato's Republic implicitly thematizes the conflict between fairness-and-utility and modern subjective rights, by neglecting the latter. Plato suggests that newborn children are collectively raised outside families and then distributed to the societal tasks that are optimal for them, thus fulfilling justice (understood as everyone being in their right place).
This violates such modern subjective rights as the right of parents to raise their children (and the corresponding rights of children), the right of everyone to choose their career, and the right of everyone to freely choose their spouse. Hegel, for example, in the Philosophy of Right (1821) holds that a rational whole (e.g., in terms of fairness and utility) should be formed in such a way that such modern subjective rights are also respected, namely via the exercise of those rights. That prevents the state from being a totalitarian matchmaker allocating the right foster parents to the right children, or the right spouses to each other. Such matchmaking would violate subjective rights, and however optimal the allocation, it would be practically irrelevant.

It is not only such high-stakes decisions as career, family or state-membership that modern rights govern; there is a general freedom of association that modern constitutions declare. This takes us to a very closely related principle in group-formation: liberty or autonomy. Note how these values significantly lessen the practical relevance of optimal matchmaking in many contexts of modern life.

Many groups have, moreover, the right to choose their members. This has been discussed in the debates on group rights. The extent to which groups have a right to choose their members limits the extent to which outsiders have subjective rights to become members. It also, in a different way, may limit the relevance of utility and fairness: if a group has a right to choose its members, it has the right to do so in a non-optimal and, to some extent, unfair manner. And if two groups have a right to choose their members, they have a right to take into consideration utility and fairness from their own perspectives, rather than optimizing for the whole system of two groups and a set of applicants. The group's right may, though, be limited by the rights of individuals.
It is important not to jump to the conclusion that utility or fairness would not matter: in some cases (outside birthrights) rights concern who has the right to make the decision, whereas utility and fairness help determine what is the optimal decision to be made. The relevance of rights may thus be threefold: (i) birthrights may pre-empt any deliberation over how to intentionally form groups; (ii) sometimes subjective rights may override considerations of utility or fairness in group-formation; and (iii) subjective and group rights may point to a distinction between two questions: who has a right to make a decision, and what decision should be made.

3.3 The relevance of autonomy and liberty in group-formation

One principle for group-formation is voluntariness: individuals are free to form new associations, and the membership profile of different groups is based on who freely enters or exits.4 Importantly, in modern contexts individuals have the freedom to choose their spouses and occupations (together with the choices of others). Similarly, within an established group, a new taskforce may be formed and its members chosen via volunteering for the task. This may lead to suboptimal formations: the volunteers may not be the most suitable, and unfairness may accumulate via voluntary choices. And this typically does not pre-empt the need to manage the team-formation: the number of volunteers may differ from the number of slots available. And so, fairness and utility may be action-guiding after questions of voluntariness have been taken into account. Voluntariness need not always override utility and fairness - if a leader or manager has the call, they may decide to go for the optimal solution even when volunteering has not been at stake.
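The ordering just described - voluntariness considered first, with utility guiding selection among (or beyond) the volunteers - can be illustrated with a minimal sketch. The code is our illustration, not a method from the paper: the candidate names, utility scores, and selection rule are hypothetical stand-ins for domain-specific metrics.

```python
# Illustrative sketch (hypothetical data): voluntariness treated as a
# lexically prior consideration, with utility breaking ties within pools.

def select_team(candidates, utility, volunteered, slots):
    """Fill `slots` positions, preferring volunteers; rank each pool
    by a (hypothetical) utility score, highest first."""
    volunteers = sorted((c for c in candidates if c in volunteered),
                        key=lambda c: utility[c], reverse=True)
    others = sorted((c for c in candidates if c not in volunteered),
                    key=lambda c: utility[c], reverse=True)
    # Volunteers come first; if they outnumber the slots, utility decides
    # among them; if too few volunteer, remaining slots are filled from
    # non-volunteers by utility.
    return (volunteers + others)[:slots]

candidates = ["ann", "bo", "cai", "dee"]
utility = {"ann": 0.9, "bo": 0.4, "cai": 0.7, "dee": 0.8}
volunteered = {"bo", "cai"}
team = select_team(candidates, utility, volunteered, 3)
# → ["cai", "bo", "ann"]: both volunteers are taken despite lower
#   utility, and the last slot goes to the best non-volunteer.
```

Note how the sketch encodes exactly the point in the text: the selected team is suboptimal in pure utility terms (bo is chosen over dee), which is the price of honoring voluntariness.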
So, in addition to pre-empting and overriding (in cases where rights are at stake), the considerations (fairness, utility, and voluntariness) may be in conflict as separate values to be jointly optimized. The manager may choose to prioritize voluntariness even when the individual would not, strictly speaking, have a corresponding right; such decisions may be motivated by the expected utility via motivational patterns, and so on.

3.4 Legitimacy, authority, and coercion in matchmaking

In political philosophy, the difference between legitimacy and justice (or fairness) has been fruitfully discussed in recent decades. The interim upshot is that the question of legitimate governance depends on input legitimacy (whether the governor has been chosen in a legitimate procedure, and whether the governor has legitimate authority to make decisions in some realm) and output legitimacy (whether the decisions respect individual rights and are sufficiently good in terms of utility and fairness); some add also throughput legitimacy (how the decision-making process proceeds) (Schmidt & Wood, 2019).

One special case of authority is the democratic authority of the group to make binding decisions concerning itself: the decisions are illegitimate if they do not respect the rights of individuals. But to the extent that they are legitimate, they bind everyone - even those who voted against (Christiano, 2008). Another special case is a hierarchical organization such as a military order of rank. If a captain has authority over a group of soldiers, then the captain's decision gives the soldiers an overriding reason to act accordingly. Even if a soldier knew a better route from A to B, it is irrelevant if the captain has ordered that another route be taken.

4 Some liberties are afforded by memberships (citizenship, organizational role, marital status); some may be independently grounded human rights.
The captain's command is a stronger reason. The whole point of authority-relations is to enable such orders to be overriding reasons (Raz, 1990). There is ongoing debate about the extent to which normal workplaces should be like democracies or like military units in this respect, but what is common to both is that questions of legitimate authority are different from questions of fairness and utility. They concern the question of who is to make a decision, not the question of which decision would be optimal. The flipside of the coin is that they do not diminish the relevance of fairness and utility in assessing which decision to make.

3.5 Solidarity, diversity, and functioning groups

The concept of solidarity has a long history in the vocabularies of political philosophy and the social sciences (see, e.g., Laitinen & Pessi, eds., 2015). In the liberal tradition, individual rights, liberty, and fairness (justice or equality) have been much more central. Solidarity (or fraternity) has remained a less central concept, but it is very relevant for the functioning of groups. Solidarity and the value of community are more often stressed by critics of the social contract tradition, who often emphasize organic formations. Solidarity typically exists when individuals have shared interests and needs, alongside possible responsibilities and inclinations to promote and partake in the projects of others they identify with (Smiley, 2017, Sec. 3). For modern philosophers, such as Rousseau, the key factor driving the formation of societies is normatively motivated solidarity (Wrong, 1994). Of course, there is little reason to think that solidarity-based group-formation would not occur with regard to smaller communities and groups; e.g., interest-groups for people with disabilities bring together people who share an interest in supporting those with disabilities and their families, and in pursuing justice for them.
In many cases, however, group solidarity is an emergent feature of a group with a history, and thereby more like a desideratum than an active principle of membership-distribution. Of course, higher levels of solidarity can be anticipated and worked for, and especially known hostilities and obstacles to solidarity should be taken into account in group-formation. Our suggestion is that, for the purposes of this paper, solidarity is to be treated as one aspect of “utility” - one aspect of functioning teams, rather than a principle that might override or conflict with that aim. It may nonetheless be in tension with other aspects of utility, which stresses the importance of pluralism in defining utility: utility-and-fairness-and-high-level-of-solidarity can be a meaningful goal.

The concept of diversity is to some extent similar (see van Parijs, ed., 2003). Whereas solidarity sometimes points towards homogeneous groups, diversity is an important value in itself. So, the aim of group-formation should be something like utility-and-fairness-and-high-level-of-solidarity-and-appropriate-diversity. And it should be pursued within the limits of rights, liberties, and legitimate authority.

3.6 Practically (ir)relevant assessability

In this Section, we have aimed to show that there are values other than utility narrowly conceived and fairness to be accounted for in group-formation. Solidarity and diversity are important aims in themselves, complementing the aims of fairness and narrow utility. Sometimes we can assess the optimality of matchmaking in all these respects, but it is practically irrelevant because the group-formation in question is determined in some other way than by deliberate management (e.g., by birth or spontaneous emergence).
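One crude way to read the composite aim recapped above is as a multi-objective score over candidate groupings, with rights, liberties, and legitimate authority acting as side-constraints rather than as weighted terms. The sketch below is our illustration, not a method from the literature; the component scores, the weights, and the `respects_rights` flag are hypothetical placeholders.

```python
# Illustrative sketch (hypothetical data): a scalarized "joint aim" for
# comparing candidate groupings. A weighted sum is one crude aggregation
# among many; rights and legitimacy are modeled as hard side-constraints,
# not as terms to be traded off.

ASPECTS = ("utility", "fairness", "solidarity", "diversity")

def joint_score(scores, weights):
    """Weighted sum of aspect scores, each assumed normalized to [0, 1]."""
    return sum(weights[a] * scores[a] for a in ASPECTS)

def admissible(grouping):
    """Hypothetical side-constraint check: only rights-respecting
    groupings are candidates at all."""
    return grouping.get("respects_rights", False)

weights = {"utility": 0.4, "fairness": 0.3, "solidarity": 0.15, "diversity": 0.15}
candidates = [
    {"name": "A", "respects_rights": True,
     "scores": {"utility": 0.9, "fairness": 0.5, "solidarity": 0.6, "diversity": 0.4}},
    {"name": "B", "respects_rights": True,
     "scores": {"utility": 0.7, "fairness": 0.8, "solidarity": 0.7, "diversity": 0.6}},
]
best = max((g for g in candidates if admissible(g)),
           key=lambda g: joint_score(g["scores"], weights))
# → grouping "B" (score 0.715 vs. 0.66 for "A")
```

Even this toy example makes the pluralism point visible: changing the weights changes which grouping wins, so any such aggregation embeds a contestable normative choice rather than a neutral optimum.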
Further normative considerations, such as rights, liberties, autonomy, and legitimate authority, may concern the question of whose decision it is, rather than what the optimal decision would be. They may also be considerations that pre-empt (e.g. in the case of birthrights) or override (in the case of some other rights) questions of fairness, utility, solidarity, or diversity. This does not mean that the fairness or utility of group compositions cannot be assessed in such cases. Analogously to how the natural beauty of sunsets or wildernesses can be assessed even though these values have not made a practical difference to their formation, we can assess fairness and optimality in groups that emerge without being guided by such values. It would be wrong to say that utility or fairness does not matter in such cases; what can be said is that they are not practically and effectively guiding the group-formation.

4 Conclusions

In this paper we discussed algorithmic fairness in different group-formation contexts, and the limits that other normative considerations (rights, autonomy and liberty, legitimate authority, solidarity, diversity) pose. Extending the discussion of fairness to algorithmic group-formation, we argued that many technical notions of fairness introduced in the fair ML literature, such as equal equity, counterfactual fairness, and envy-freeness, can be applied in the context of group-formation. The specific nature and setting of a group-formation task, as well as the respective computational approach, will determine the extent of their applicability. Assessment at the level of individuals and comparisons both within and between groups may be relevant, and the limitations of technical fairness approaches - the informational scope of distinct metrics and trade-offs between values of interest - should be recognized and carefully considered.
Furthermore, drawing on debates in political philosophy, we argued that fairness is not action-guiding in all contexts of group-formation, because questions of rights and legitimacy may override claims to fairness or maximal utility, or render them less relevant. Even where it is not action-guiding, fairness is a value to be jointly optimized alongside others, and it may serve as a lens for assessing decisions regarding the formation of groups.

References

Anagnostopoulos, A., Becchetti, L., Castillo, C., Gionis, A., & Leonardi, S. (2012). Online team formation in social networks. Proceedings of the 21st International Conference on World Wide Web, 839-848.

Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016, May 23). Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica. Retrieved from https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

Barnabò, G., Fazzone, A., Leonardi, S., & Schwiegelshohn, S. (2019). Algorithms for fair team formation in online labour marketplaces. Companion Proceedings of The 2019 World Wide Web Conference, 484-490.

Barocas, S., & Selbst, A. D. (2016). Big data's disparate impact. California Law Review, 104, 671-732.

Binns, R. (2018). Fairness in machine learning: Lessons from political philosophy. Proceedings of Machine Learning Research, 81, 1-11.

Bulmer, J., Fritter, M., Gao, Y., & Hui, B. (2020). FASTT: Team formation using fair division. Canadian Conference on Artificial Intelligence, 92-104.

Carmel, Y. H., & Ben-Shahar, T. H. (2017). Reshaping ability grouping through big data. Vanderbilt Journal of Entertainment & Technology Law, 20, 87-128.

Chen, S. J., & Lin, L. (2004). Modeling team member characteristics for the formation of a multifunctional team in concurrent engineering. IEEE Transactions on Engineering Management, 51(2), 111-124.

Chouldechova, A.
(2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), 153-163.

Christiano, T. (2008). The Constitution of Equality: Democratic Authority and Its Limits. Oxford University Press.

Cohen, G. A. (1989). On the currency of egalitarian justice. Ethics, 99(4), 906-944.

Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797-806.

d'Alessandro, B., O'Neil, C., & LaGatta, T. (2017). Conscientious classification: A data scientist's guide to discrimination-aware classification. Big Data, 5(2), 120-134.

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012). Fairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 214-226.

Dworkin, R. (1981). What is equality? Part 1: Equality of welfare. Philosophy & Public Affairs, 185-246.

Gosepath, S. (2011). Equality. The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.). Retrieved from https://plato.stanford.edu/archives/spr2011/entries/equality/. [Accessed 31.5.2020]

Green, B., & Hu, L. (2018). The myth in the methodology: Towards a recontextualization of fairness in machine learning. Proceedings of the Machine Learning: The Debates Workshop.

Hegel, G. W. F. (1991 [1821]). Elements of the Philosophy of Right. Cambridge: Cambridge University Press.

Hellman, D. (2019). Measuring algorithmic fairness. Virginia Public Law and Legal Theory Research Paper No. 2019-39; Virginia Law and Economics Research Paper No. 2019-15; Virginia Law Review, forthcoming. Retrieved from https://ssrn.com/abstract=3418528.

Kearns, M., Neel, S., Roth, A., & Wu, Z. S. (2017). Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. arXiv preprint arXiv:1711.05144.
Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., & Schölkopf, B. (2017). Avoiding discrimination through causal reasoning. Advances in Neural Information Processing Systems.

Kleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.

Kusner, M. J., Loftus, J. R., Russell, C., & Silva, R. (2017). Counterfactual fairness. Advances in Neural Information Processing Systems, 1-11.

Laitinen, A., & Pessi, A. B. (Eds.) (2015). Solidarity: Theory and Practice. Lanham: Lexington Books.

Lappas, T., Liu, K., & Terzi, E. (2009). Finding a team of experts in social networks. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 467-476.

Liu, H., Qiao, M., Greenia, D., Akkiraju, R., Dill, S., Nakamura, T., Song, Y., & Nezhad, H. M. (2014). A machine learning approach to combining individual strength and team features for team recommendation. 2014 13th International Conference on Machine Learning and Applications, 213-218.

Lum, K., & Isaac, W. (2016). To predict and serve? Significance, 13(5), 14-19.

Machado, L., & Stefanidis, K. (2019). Fair team recommendations for multidisciplinary projects. 2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI), 293-297.

Majumder, A., Datta, S., & Naidu, K. V. M. (2012). Capacitated team formation problem on social networks. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1005-1013.

Nabi, R., & Shpitser, I. (2018). Fair inference on outcomes. Thirty-Second AAAI Conference on Artificial Intelligence.

Narayanan, A. (2018). Translation tutorial: 21 fairness definitions and their politics. Proceedings of the Conference on Fairness, Accountability and Transparency, New York, USA.

Okimoto, T., Schwind, N., Clement, M., Ribeiro, T., Inoue, K., & Marquis, P. (2015). How to form a task-oriented robust team.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 395-403.

Ounnas, A., Millard, D. E., & Davis, H. C. (2007). A metrics framework for evaluating group formation. Proceedings of the 2007 International ACM Conference on Supporting Group Work, 221-224.

Pittenger, D. J. (1993). Measuring the MBTI… and coming up short. Journal of Career Planning and Employment, 54(1), 48-52.

Plato (2000). The Republic. Cambridge: Cambridge University Press.

Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., & Weinberger, K. Q. (2017). On fairness and calibration. Advances in Neural Information Processing Systems, 5680-5689.

Rawls, J. (1971). A Theory of Justice. Harvard University Press.

Raz, J. (1990). Practical Reason and Norms. Oxford: Oxford University Press.

Schmidt, V., & Wood, M. (2019). Conceptualizing throughput legitimacy. Public Administration, 97(4), 727-740.

Selbst, A. D., Boyd, D., Friedler, S. A., Venkatasubramanian, S., & Vertesi, J. (2019). Fairness and abstraction in sociotechnical systems. Proceedings of the Conference on Fairness, Accountability, and Transparency, 59-68.

Serbos, D., Qi, S., Mamoulis, N., Pitoura, E., & Tsaparas, P. (2017). Fairness in package-to-group recommendations. Proceedings of the 26th International Conference on World Wide Web, 371-379.

Smiley, M. (2017). Collective responsibility. The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.). Retrieved from . [Accessed 31.5.2020]

Steenbergen-Hu, S., Makel, M. C., & Olszewski-Kubilius, P. (2016). What one hundred years of research says about the effects of ability grouping and acceleration on K-12 students' academic achievement: Findings of two second-order meta-analyses. Review of Educational Research, 86(4), 849-899.

van Parijs, P., et al. (Eds.) (2003). Cultural Diversity Versus Economic Solidarity: Is There a Tension? How Must It Be Resolved? De Boeck Supérieur.

Verma, S., & Rubin, J. (2018).
Fairness definitions explained. 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), 1-7.

Walzer, M. (1983). Spheres of Justice. Basic Books.

Wichmann, A., Hecking, T., Elson, M., Christmann, N., Herrmann, T., & Hoppe, H. U. (2016). Group formation for small-group learning: Are heterogeneous groups more productive? Proceedings of the 12th International Symposium on Open Collaboration (OpenSym '16), 1-4.

Wrong, D. (1994). The Problem of Order. Macmillan.

Yilmaz, M., Al-Taei, A., & O'Connor, R. V. (2015). A machine-based personality oriented team recommender for software development organizations. European Conference on Software Process Improvement, 75-86.