Integrating Fairness in AI Development: Technical Insights from the fAIr by design Framework

Mira Reisinger, MA, and Rania Wazir, PhD
leiwand.ai, Vienna, Austria

Abstract
This paper discusses the necessity of integrating fairness into the development of trustworthy AI systems, focusing on methods and tools designed within the fAIr by design project, a collaborative approach to guide development teams towards the creation of non-discriminatory AI systems. Practical applications, challenges, and recommendations based on real-world use cases are shared from a data science and machine learning team perspective. The paper advocates for continuous learning, diverse team assembly, and ongoing monitoring to ensure AI systems remain fair and inclusive across the whole life cycle of an AI system.

Keywords
Fairness in AI Development, Assurance Case, Use Cases, Ethical decision-making

EWAF’24: European Workshop on Algorithmic Fairness, July 01–03, 2024, Mainz, Germany
mira.reisinger@leiwand.ai (M. Reisinger); rania.wazir@leiwand.ai (R. Wazir)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

Artificial Intelligence (AI) has become increasingly integrated into many aspects of society, from communications to recruitment[1] to justice systems[2]. However, there are serious concerns regarding unwanted biases and discrimination embedded in AI systems. These concerns are not unfounded: numerous examples of biased and discriminatory applications, and their significant negative impacts on individuals and communities, have been documented [3, 4].

The challenge begins with defining fairness, a concept that proves elusive across disciplines. Drawing upon Mehrabi et al.[2], fairness in decision-making is ideally the absence of prejudice or favoritism towards any individual or group based on inherent or acquired characteristics. The evolving nature of fairness definitions highlights the challenge of addressing the intertwined issues of bias, fairness, and discrimination. It requires stakeholders and developers to work together to develop context-specific definitions of fairness and non-discrimination, including acceptable thresholds and measurable metrics. Verma and Rubin[5] offer insights into defining and measuring fairness, involving various metrics based on predicted outcomes, similarity measures, or causal reasoning. Recognizing the dynamic nature of fairness, influenced by societal norms and technological advancements, adds complexity to the effort of implementing fairness in AI systems, but it also enables AI systems to be tuned to balance fairness with performance. This balanced approach requires ongoing dialogue and adjustment between ethical principles and system efficiency, with continuous alignment ensured through monitoring and iterative enhancements.

The research consortium fAIr by design[6], funded through the Austrian Research Promotion Agency (FFG)[7], has devised a framework aimed at helping organizations incorporate fairness requirements into their development processes. This paper reflects on the use of the tools and methods[8, 1] introduced to three AI system development teams to ensure adherence to fairness and trustworthiness principles.
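To make the notion of measurable fairness metrics more concrete, the following minimal sketch computes two commonly used group-fairness metrics on binary predictions. It is an illustrative Python example, not part of the fAIr by design tooling; the data, function names, and choice of metrics are our own, and, as argued above, appropriate metrics and thresholds must be agreed with stakeholders for each specific context.

```python
# Illustrative only: two group-fairness metrics for binary predictions and a
# binary protected attribute. Metric choice and thresholds are context-specific.
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_difference(y_true, y_pred, group):
    """Absolute difference in true-positive rates between the two groups."""
    tpr = [y_pred[(group == g) & (y_true == 1)].mean() for g in (0, 1)]
    return abs(tpr[0] - tpr[1])

# Toy data; real evaluations require representative, well-documented samples.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)

print("Demographic parity difference:", demographic_parity_difference(y_pred, group))
print("Equal opportunity difference:", equal_opportunity_difference(y_true, y_pred, group))
```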
2. Leveraging Assurance Cases for Fairness in AI Systems

This section describes a strategy employed by the fAIr by design team to navigate the challenges of developing algorithms with embedded fairness constraints. Drawing on the adapted Assurance Case method [8], the strategy includes a comprehensive discussion of the purpose and role of an AI system, the analysis and identification of potential fairness and discrimination risks, thorough testing for those risks, and the adoption of mitigation strategies.

2.1. The Assurance Case method (AC)

Originating from safety engineering, the Assurance Case method offers a structured pathway for translating high-level goals into specific, actionable, and verifiable technical specifications [9, 10]. It is a collaborative approach that fosters a holistic perspective on fairness in AI development and promotes the creation of ethically sound and socially responsible systems. In the fAIr by design project, the AC facilitates close cooperation among the social science, data science, and start-up development teams, serving as a valuable tool for integrating fairness into AI systems.

The components of the method are claims, sub-claims, evidence, and reasoning. They enable the project team to identify and systematically integrate a comprehensive understanding of the AI system and its fairness needs through the definition of sub-claims, each supported by (technical) evidence. The overall process included developing evidence and tests, a thorough exploration of model selection, fairness metrics, and the application of mitigation strategies.

The AC emphasizes the crucial role of integrating "challengers" (technical or non-technical) into the AI development process to identify and point out aspects that are unclear and can lead to unfairness. Having a comprehensive understanding of each component of the AI system from the outset is invaluable, and the involvement of "challengers" enhances this understanding by ensuring that fairness considerations are addressed. The "challenger" can assist in clarifying data needs, testing protocols, evaluation metrics, and mitigation strategies, laying the groundwork for in-depth fairness testing. This also equips the team to manage potential future challenges.

3. Practical Applications and Use Cases

This section reports on the application of the methodologies and tools developed during fAIr by design. The implementation of the AC was carried out together with social scientists and legal experts, but the learnings and recommendations below center on the technical application from a data science perspective.

3.1. Learnings and Recommendations

At the onset of developing a fair AI system, it is important, from both a technical and a social-science standpoint, to engage in a series of critical inquiries and clarifications with the use case partner, which include:

• The definition of fairness from the partner’s perspective.
• The characteristics that constitute a fair AI system for them.
• The (envisioned) structure of the AI system and its components.
• The risks identified and the sub-claims that can be substantiated with technical evidence.
• The feasible tests, along with a potential responsible person.
• The prerequisites for conducting these tests, including necessary data, knowledge, and resources.
• The applicable metrics and thresholds for fairness (see the sketch after this list).
• The mitigation strategies to be employed should these thresholds not be met.
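One way to keep the answers to these questions actionable is to record each sub-claim together with its metric, threshold, responsible person, and evidence-producing test, so that the checks can be re-run throughout the life cycle. The following sketch is a hypothetical Python illustration; the structure, names, and threshold are our own simplification and not an interface prescribed by the Assurance Case handbook [8].

```python
# Hypothetical sketch: Assurance Case sub-claims recorded as testable artefacts.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SubClaim:
    statement: str                # what the AI system is claimed to satisfy
    metric: str                   # fairness metric backing the claim
    threshold: float              # acceptable value agreed with the use case partner
    responsible: str              # person responsible for running the test
    measure: Callable[[], float]  # test producing the (technical) evidence

    def passes(self) -> bool:
        return self.measure() <= self.threshold

def report(claims: List[SubClaim]) -> None:
    for c in claims:
        status = "PASS" if c.passes() else "FAIL -> apply mitigation strategy"
        print(f"{c.statement} [{c.metric} <= {c.threshold}, {c.responsible}]: {status}")

# Example usage with a stubbed measurement; in practice the measure would be
# computed on the evaluation data agreed during the clarification phase.
report([
    SubClaim(
        statement="Selection rates do not differ strongly across groups",
        metric="demographic parity difference",
        threshold=0.05,
        responsible="ML engineer",
        measure=lambda: 0.03,
    )
])
```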
3.1.1. Tools and Resources

The use of methods and tools developed within the Assurance Case, or of openly available resources[11, 12] such as the AI canvas and ethics checklists, is particularly recommended. The Data Science Ethics Checklist from deon[12], applied as an iterative process, has proven to significantly support technical progress. If done well, it can become part of the documentation requirements for high-risk AI systems[13]. The checklist helps clarify the data science and machine learning maturity level of the partners, and it aligns well with the need for clarity and efficacy common in development teams[1].

3.1.2. Assessing and Building Knowledge

All steps of fairness testing require good data quality and broad machine learning and data science expertise (e.g., ablation studies, hyper-parameter optimization, or inverse relation modeling). Addressing biases in training and evaluation data is vital to prevent AI systems from replicating or exacerbating existing inequities. For some companies it can be a challenge to build the necessary knowledge around data quality, systematic evaluations, and the testing of model components; this should be addressed from the onset.

3.1.3. Assemble a Diverse and Competent Team

Building a proficient team that specializes in appropriate data science and machine learning methods is essential to effectively circumvent fairness testing pitfalls. It requires a concerted effort to integrate knowledge on fairness in AI and to delineate clear responsibilities among senior management and development teams. Fairness, much like other critical quality criteria, must be integrated into a wide array of business processes, gaining prominence especially in high-risk AI systems[13]. The establishment of interdisciplinary teams is key to facilitating in-depth discussions and making informed decisions regarding fairness.

3.1.4. Continuous Validation and Monitoring

The establishment of regular auditing and accountability mechanisms is pivotal in upholding non-discriminatory practices in AI. Continuous monitoring and evaluation of AI systems enable organizations to proactively identify and address potential biases or discriminatory outcomes, thereby demonstrating their commitment to fairness and ethical AI development.
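As a concrete (and deliberately simplified) illustration of such continuous monitoring, the sketch below recomputes a group-fairness metric on each new batch of decisions and raises an alert when the agreed threshold is exceeded. The metric, threshold, and interfaces are hypothetical assumptions for illustration, not a monitoring component shipped with fAIr by design.

```python
# Hypothetical monitoring sketch: flag batches whose fairness metric drifts
# beyond the threshold agreed in the Assurance Case.
import logging
import numpy as np

logging.basicConfig(level=logging.INFO)
THRESHOLD = 0.05  # assumed maximum demographic parity difference

def demographic_parity_difference(y_pred, group):
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def monitor_batch(y_pred, group, batch_id):
    value = demographic_parity_difference(y_pred, group)
    if value > THRESHOLD:
        # In practice: notify the responsible person and trigger the agreed mitigation.
        logging.warning("Batch %s: metric %.3f exceeds threshold %.2f", batch_id, value, THRESHOLD)
    else:
        logging.info("Batch %s: metric %.3f within threshold", batch_id, value)

# Example with synthetic weekly batches.
rng = np.random.default_rng(1)
for week in range(1, 4):
    preds = rng.integers(0, 2, size=500)
    groups = rng.integers(0, 2, size=500)
    monitor_batch(preds, groups, batch_id=f"week-{week}")
```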
3.1.5. You think you know what fairness means – until you ask others

Another critical aspect of promoting non-discriminatory AI development is fostering collaboration and transparency. Organizations should actively seek partnerships with diverse stakeholders, including ethicists, community representatives, and regulatory bodies, to gain insights into the potential biases and discriminatory risks in AI systems. By promoting transparency in AI development processes, organizations can build trust and accountability with the public, while ensuring that fairness remains an integral part of the AI life cycle. This also serves as a quality assurance tool, ensuring that the product being developed actually satisfies market needs.

3.2. Challenges and Limitations

Collaborative efforts with partners have provided valuable insights into the practical challenges and strategies for developing non-discriminatory AI systems. These collaborations shed light on the significance of data science and machine learning maturity, as well as the willingness to invest time, effort, and resources into crafting fair, non-discriminatory, and trustworthy AI systems. However, navigating fairness throughout the AI life cycle presents challenges, including the dynamic nature of fairness definitions and the intricacies of measuring fairness.

The fAIr by design use cases focus on small companies (SMEs), start-ups, and cultural organizations/NGOs, each with distinct applications, domains, and developmental stages. They underscore the diverse landscape within which fairness considerations are embedded and emphasize the importance of contextual understanding and adaptability. We have not, however, been able to work with larger, more established organizations, which typically have more advanced data management structures but may face other challenges.

4. Conclusion

The journey towards non-discriminatory AI development requires the concerted efforts of organizations, policymakers, and society as a whole. fAIr by design, including the Assurance Case, provides a structured approach to the development of fair AI systems. Recognizing the contextual nature of fairness, the adoption of ethical AI principles should be accompanied by continuous training for development teams, enabling them to incorporate ethical guidelines into their day-to-day decision-making. In this project we have seen how beneficial it is for social scientists, data scientists, and use case partners to work together.

Moving forward, the exploration of cross-industry collaborations and the proposition of structured frameworks for engaging organizations in fairness discussions are promising directions for the advancement of non-discriminatory AI. The potential to draw on diverse perspectives and expertise could ultimately lead to industry-wide standardized approaches, bringing us closer to a future where ethical principles in AI are upheld across the board.

References

[1] S. Cepeda, L. Kunze, G. Leimüller, L. Müller-Kress, M. Stöger, R. Wazir, Requirements for developing fair AI systems, https://www.fairbydesign.eu/publications, 2022. [Accessed 29-03-2024].
[2] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A survey on bias and fairness in machine learning, ACM Comput. Surv. 54 (2021). doi:10.1145/3457607.
[3] C. O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Crown Publishing Group, New York, NY, USA, 2016.
[4] J. Dressel, H. Farid, The accuracy, fairness, and limits of predicting recidivism, Science Advances 4 (2018) eaao5580. doi:10.1126/sciadv.aao5580.
[5] S. Verma, J. Rubin, Fairness definitions explained, in: Proceedings of the International Workshop on Software Fairness, FairWare ’18, Association for Computing Machinery, New York, NY, USA, 2018, pp. 1–7. doi:10.1145/3194770.3194776.
[6] fAIr by design, https://www.fairbydesign.eu, 2024. [Accessed 29-03-2024].
[7] fAIr by design - Entwicklung eines neuen Prozessmodells für nicht-diskriminierende Künstlicher Intelligenz | FFG, https://www.ffg.at/content/fair-design-entwicklung-eines-neuen-prozessmodells-fuer-nicht-diskriminierende-kuenstlicher, 2021. [Accessed 29-03-2024].
[8] L. Kunze, G. Leimüller, L. Müller-Kress, M. P. Hauer, Method handbook: Assurance cases for fair AI systems, https://www.fairbydesign.eu/publications, 2024. [Accessed 29-03-2024].
[9] M. P. Hauer, L. Müller-Kress, G. Leimüller, K. Zweig, Using assurance cases to assure the fulfillment of non-functional requirements of AI-based systems - lessons learned, in: 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), IEEE, 2023.
[10] M. P. Hauer, R. Adler, K. Zweig, Assuring fairness of algorithmic decision making, in: 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2021, pp. 110–113. doi:10.1109/ICSTW52544.2021.00029.
[11] J. Unadkat, What is Test Driven Development (TDD)?, https://www.browserstack.com/guide/what-is-test-driven-development, 2023. [Accessed 29-03-2024].
[12] deon, Data science ethics checklist, https://deon.drivendata.org/, 2019. [Accessed 29-03-2024].
[13] European Parliament and Council of European Union, Regulation (EU) no 2024/1689, Annex III: High-risk AI systems referred to in Article 6(2), https://eur-lex.europa.eu/eli/reg/2024/1689/oj, 2024.