<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alessandro Simonetta</string-name>
          <email>alessandro.simonetta@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Cristina Paoletti</string-name>
          <email>mariacristina.paoletti@gmail.com</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tsuyoshi Nakajima</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering Shibaura Institute of Technology</institution>
          ,
          <addr-line>Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Enterprise Engineering, University of Rome Tor Vergata</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Italian Space Agency</institution>
          ,
          <addr-line>via del Politecnico snc, Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Professional Association of Italian Actuaries</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>AI is an enabling technology that can be utilized in various fields with impressive results. However, its adoption carries risk factors that can be mitigated through the adoption of quality standards. It is no coincidence that the new ISO/IEC 25059 includes a specific quality model for AI systems. This article describes a research approach that proposes a way to prevent a lack of quality in training data from propagating into the deductions of an AI system. The approach is based on the concept of completeness from ISO/IEC 25012 and can be referred to the ISO/IEC 5259-2 characteristics of diversity, representativeness, and similarity for input dataset evaluation, and to ISO/IEC 25059 functional correctness for output results evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Fairness</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Completeness</kwd>
        <kwd>ISO/IEC 25012</kwd>
        <kwd>Maximum Completeness</kwd>
        <kwd>Bias</kwd>
        <kwd>Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The vast availability of data and tools has allowed the
construction of predictive and classification models that form
the foundation of Automated Decision-Making (ADM)
systems. Many business decisions rely on
recommendations generated by software systems, and in some cases
these decisions are entirely automated. The notion that
being algorithm-based promotes decision neutrality is
quite prevalent. However, since the decision-making path
of an AI system is heavily influenced by the data used
during the learning phase, biases present in the data can
sometimes transfer into the choices proposed by the
system. In the literature, it has been demonstrated that
the use of AI systems trained on biased datasets can lead
to situations of discrimination [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The risk of skewed outcomes primarily stemming
from imbalanced datasets has also been studied, and it
can be mitigated by the introduction of synthetic data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Learning algorithms construct the model based on the
training data, so such disproportion can lead to
conclusions that deviate from reality [
        <xref ref-type="bibr" rid="ref3 ref4">3,4</xref>
        ]. On the other hand, in
some situations it is challenging to obtain homogeneous,
proportional, and, most importantly, representative data.
In these cases, the ISO standards that can help us are [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]:
      </p>
      <p>
        • ISO 31000:2018 Risk management — Guidelines [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
• ISO/IEC 25000:2014 Systems and software
engineering — Systems and software Quality
Requirements and Evaluation (SQuaRE) — Guide to
SQuaRE [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
• ISO/IEC 27002:2022 Information security,
cybersecurity and privacy protection — Information
security controls [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
• ISO/IEC DIS 5259-2 Artificial Intelligence — Data
Quality for Analysis and Machine Learning (ML)
— Part 2: Data Quality Measures [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>IWESQ 2023. ∗Corresponding author. †These authors
contributed equally. © 2023 Copyright for this paper by
its authors. Use permitted under Creative Commons License.
CEUR Workshop Proceedings (ceur-ws.org, ISSN 1613-0073).</p>
      <p>
        Specifically, ISO 31000 includes risk management
principles that allow for the assessment of both the risk of
using incomplete data during the learning phase and the
risk associated with unfair predictions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Other kinds
of risks, such as the ability to protect data from
information leakage, are left for further study. ISO/IEC 27002
offers two possible new approaches for proactive
security: threat detection and machine learning/artificial
intelligence systems. Initially, the ISO/IEC 25010 software
quality model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] did not encompass quality
characteristics of AI systems. However, starting from 2023, the
SQuaRE series is enriched with the quality model for AI
systems: the ISO/IEC 25059 standard. Table 1 presents the
new sub-characteristics identified by the working group
and their scope in relation to the original standard [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Table 1: New sub-characteristics for AI systems and their scope in relation to the original standard. Characteristics of the software product model (4.2): Functional suitability, Usability, Reliability, Security. Characteristics of the quality in use (4.1): Satisfaction, Absence and mitigation of risks (* in the process of being published).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Fairness Evaluation in ML Outputs</title>
      <p>Evaluating fairness in machine learning models is a very
sensitive and important issue. The goal is to ensure that
models yield results that are independent of group
membership and do not perpetuate or, in some cases, even
exacerbate existing societal inequalities.</p>
      <p>There are two different approaches: measuring the
intensity of output errors or measuring the overall
direction of errors. The first approach focuses on assessing
disparate or unfair errors among different categories,
ethnicities, or groups. The second approach evaluates
whether the model tends to make errors in a particular
direction or towards a specific group, ethnicity, or other
sensitive attribute. Bias or fairness metrics can be used
to evaluate this overall direction.</p>
      <p>In the case of classification algorithms, the confusion
matrix allows for the calculation of the number of true
positives (TP), true negatives (TN), false positives (FP),
and false negatives (FN).</p>
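      <p>As an illustration (not code from the paper), the per-group confusion-matrix counts can be computed directly; the function name and data layout here are hypothetical:</p>
      <preformat>
```python
from collections import Counter

def group_confusion_counts(y_true, y_pred, groups):
    """Per-group confusion-matrix counts: TP, TN, FP and FN for
    each value of the sensitive attribute (illustrative sketch)."""
    counts = {}
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, Counter())
        if t == 1 and p == 1:
            c["TP"] += 1
        elif t == 0 and p == 0:
            c["TN"] += 1
        elif t == 0 and p == 1:
            c["FP"] += 1
        else:
            c["FN"] += 1
    return counts
```
      </preformat>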
    </sec>
    <sec id="sec-4">
      <title>3. Statistical Evaluation Methods on Output</title>
      <p>
        In a classification or decision scenario, statistical criteria
allow us to evaluate discrimination in terms of
statistical expressions involving the random variables A
(sensitive attribute), Y (target variable), and R (the classifier
or score). Therefore, it is easy to determine whether a
criterion is satisfied or not by calculating the joint
distribution of these random variables. Starting from the
definition of independence introduced in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], for there to be independence between two values a and b
of the sensitive attribute, we need to verify that the
probability of the outcome has the same value in both cases
A = a and A = b:
P(R = r | A = a) = P(R = r | A = b)   (9)
Mutual information allows the measurement of relationships
between the probabilities mentioned in (9). Indeed, it can
measure the mutual information between A and R, which is
the amount of information one random variable reveals
about the other.
      </p>
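      <p>A minimal sketch of checking the independence criterion (9) on a finite sample, assuming binary decisions R; the function names are illustrative, not from the paper:</p>
      <preformat>
```python
def acceptance_rates(R, A):
    """P(R = 1 | A = a) estimated for each value a of the
    sensitive attribute A, given binary decisions R."""
    rates = {}
    for a in set(A):
        selected = [r for r, g in zip(R, A) if g == a]
        rates[a] = sum(selected) / len(selected)
    return rates

def independence_gap(R, A):
    """Largest difference between group acceptance rates;
    0 means criterion (9) holds exactly on this sample."""
    rates = acceptance_rates(R, A)
    return max(rates.values()) - min(rates.values())
```
      </preformat>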
      <p>Therefore, the condition of independence between the
random variables A and R, as indicated in (9), can be
expressed in terms of mutual information:
I(A, R) = H(A) + H(R) − H(A, R)   (12)
where H(R) and H(A) are the entropies associated with R
and A, respectively:
H(R) = −∑_{i=1}^{n} P(r_i) log(P(r_i))
H(A) = −∑_{i=1}^{m} P(a_i) log(P(a_i))
Instead, the third term in equation (12) is the joint
entropy:
H(A, R) = −∑_{i=1,j=1}^{m,n} P(a_i ∩ r_j) log(P(a_i ∩ r_j))</p>
      <p>
        The other indices can also be expressed by mutual
information; in particular, referring to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], separation is calculated by:
I(A, R | Y) = H(A, Y) + H(R, Y) − H(A, R, Y) − H(Y)
sufficiency by:
I(A, Y | R) = H(A, R) + H(Y, R) − H(A, Y, R) − H(R)
and, conditioning on a single value Y = y, by:
I(A, R | Y = y) = H(A | Y = y) + H(R | Y = y) − H(A, R | Y = y)
      </p>
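      <p>These entropy identities translate directly into code. A small plug-in estimator over observed samples (illustrative; base-2 logarithms assumed):</p>
      <preformat>
```python
from collections import Counter
from math import log2

def entropy(xs):
    """Plug-in Shannon entropy H(X), in bits, from a list of outcomes."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def mutual_information(A, R):
    """I(A, R) = H(A) + H(R) - H(A, R); zero iff A and R are
    independent in the sample (equation 12)."""
    return entropy(A) + entropy(R) - entropy(list(zip(A, R)))

def separation(A, R, Y):
    """I(A, R | Y) = H(A, Y) + H(R, Y) - H(A, R, Y) - H(Y)."""
    return (entropy(list(zip(A, Y))) + entropy(list(zip(R, Y)))
            - entropy(list(zip(A, R, Y))) - entropy(Y))
```
      </preformat>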
    </sec>
    <sec id="sec-5">
      <title>5. Data Quality Measures for Input</title>
      <p>
        The underlying idea of this research is to find a way to
anticipate disparities in the final outcomes of an AI
system by evaluating the training sets from the
perspective of data quality (ISO/IEC 25012). In particular,
it has been observed how concepts of completeness,
heterogeneity (Gini index), diversity (Shannon or Simpson
index) or imbalance (imbalance ratio) can be used as
predictive markers to highlight the risk that a data defect
may propagate within the learning system. One can build a
fairness vector whose components are the fairness indices
and examine the relationships between different vectors. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a method was used to match treatment groups based on
the Pearson correlation index.
      </p>
      <p>
        Initially, in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Gini indices, imbalance ratios, and Shannon and
Simpson indices were used to analyze data quality issues
in the learning data. For fairness measures,
independence and separation measures - consisting of the
components True Positive Rate (TPR) and False Positive
Rate (FPR) - were considered, using the average of
distances between probabilities as a criterion for
synthesizing values (11).
      </p>
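      <p>The data-quality indicators named above can be sketched as minimal plug-in computations; note that conventions for the Simpson index vary, and here it is taken as the concentration (sum of squared proportions):</p>
      <preformat>
```python
from collections import Counter
from math import log

def _proportions(xs):
    """Relative frequency of each class in the sample."""
    n = len(xs)
    return [c / n for c in Counter(xs).values()]

def gini_index(xs):
    """Gini heterogeneity: 1 - sum(p_i^2)."""
    return 1 - sum(p * p for p in _proportions(xs))

def shannon_index(xs):
    """Shannon diversity: -sum(p_i * ln(p_i))."""
    return -sum(p * log(p) for p in _proportions(xs))

def simpson_index(xs):
    """Simpson concentration: sum(p_i^2) (one common convention)."""
    return sum(p * p for p in _proportions(xs))

def imbalance_ratio(xs):
    """Ratio between the largest and smallest class frequencies."""
    counts = Counter(xs).values()
    return max(counts) / min(counts)
```
      </preformat>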
      <p>The research revealed that the Gini index has good
predictive capability for low values of the TPR component
of Separation. The imbalance ratio indicator has good
predictive capability for separation but not for
independence. The Shannon index showed an acceptable level of
prediction for the independence measure, excellent for
the separation measure, but was completely ineffective
for the FPR measure of separation. The Simpson index
did not appear to be useful as a predictive bias measure.</p>
      <p>The results were quite encouraging, so there was an
attempt to improve the approach by acting on two fronts:
the calculation method of fairness measures and the
quality index of the input data to the learning system.</p>
      <p>
        Regarding the calculation method for fairness
measures, the use of a central tendency index could mask
compensated errors, so three different approaches were
attempted: using the maximum disparity between
probability values (MinMax method [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]), using the distance
between groups of similar probabilities (k-means and
DBSCAN), and using mutual information.
      </p>
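      <p>To illustrate why a central tendency index can mask compensated errors, compare an average-distance summary with the MinMax spread. This is a sketch of the idea only; the exact formulas may differ from [13]:</p>
      <preformat>
```python
def minmax_disparity(group_values):
    """Maximum disparity between per-group probability values
    (the MinMax idea): the spread max - min."""
    vals = list(group_values.values())
    return max(vals) - min(vals)

def mean_pairwise_distance(group_values):
    """Average absolute distance over all pairs of groups: the
    central-tendency summary that can understate the worst gap."""
    vals = list(group_values.values())
    pairs = [(a, b) for i, a in enumerate(vals) for b in vals[i + 1:]]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)
```
      </preformat>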
      <p>
        As for the quality index selected in ISO/IEC 25012, we
chose the characteristic of completeness, particularly the
concept of maximum completeness as defined in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
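      <p>ISO/IEC 25012 completeness can be estimated per attribute as the share of non-missing values. The helper below is a hedged sketch; in particular, reducing the per-attribute values to a single "maximum completeness" figure is an assumption made here for illustration, not the exact definition from [10]:</p>
      <preformat>
```python
def completeness(column):
    """ISO/IEC 25012-style completeness of one attribute:
    fraction of non-missing values."""
    return sum(v is not None for v in column) / len(column)

def max_completeness(columns):
    """Hypothetical reduction: best completeness over all
    attributes (see [10] for the actual definition)."""
    return max(completeness(col) for col in columns)
```
      </preformat>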
      <p>
        The study demonstrated that the use of maximum
completeness and the MinMax measurement system provided
the best predictive capability for fairness indices:
independence, separation, sufficiency, and overall accuracy
equality. Additionally, the use of the MinMax technique
showed better sensitivity compared to mutual
information and the DBSCAN clustering system, as shown in
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In the realm of AI systems, data governance and data
quality are extremely important concepts. Since AI
algorithms rely on learning datasets, the quality of input data
can impact the outcomes. In this article, we have seen
how completeness can serve as a good predictor of errors
in the outputs of an ML system. In this context, it is clear
that the definition of guidelines for the application of
data governance and data quality in AI systems is crucial.
Addressing bias in the data of technological systems is a
significant challenge in the digital age, as the decisions
made by algorithms can have substantial societal and
personal implications, which can be measured according
to international ISO/IEC standards.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Simonetta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vetrò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Paoletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Torchiano</surname>
          </string-name>
          ,
          <article-title>Integrating square data quality model with iso 31000 risk management to measure and mitigate software bias</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>3114</volume>
          (
          <year>2021</year>
          ) pp.
          <fpage>17</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] International Organization for Standardization,
          <article-title>”ISO 31000:2018(en) Risk management - Guidelines”</article-title>
          ,
          <year>2018</year>
          . URL: https://www.iso.org/iso-31000-risk-management.html.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] International Organization for Standardization,
          <article-title>”ISO/IEC 25000:2014 Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - Guide to SQuaRE”</article-title>
          ,
          <year>2014</year>
          . URL: https://www.iso.org/standard/64764.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] International Organization for Standardization,
          <article-title>”ISO/IEC 27002:2022 Information security, cybersecurity and privacy protection - Information security controls”</article-title>
          ,
          <year>2022</year>
          . URL: https://www.iso.org/standard/75652.html.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] International Organization for Standardization,
          <article-title>”ISO/IEC DIS 5259-2 Artificial intelligence - Data quality for analytics and machine learning (ML) - Part 2: Data quality measures”</article-title>
          , under development. URL: https://www.iso.org/standard/81860.html.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] International Organization for Standardization,
          <source>”ISO/IEC 25010 Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models”</source>
          ,
          <year>2011</year>
          . URL: https://www.iso.org/standard/81860.html.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] International Organization for Standardization,
          <article-title>”ISO/IEC 25059:2023 Software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - Quality model for AI systems”</article-title>
          ,
          <year>2023</year>
          . URL: https://www.iso.org/standard/80655.html.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Barocas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <source>Fairness and machine learning</source>
          ,
          <year>2020</year>
          . URL: https://fairmlbook. org/, chapter: Classification.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Larson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mattu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kirchner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Angwin</surname>
          </string-name>
          , Compas recidivism dataset,
          <year>2016</year>
          . URL: https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Simonetta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Paoletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Venticinque</surname>
          </string-name>
          ,
          <article-title>The use of maximum completeness to estimate bias in AI-based recommendation systems</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>3360</volume>
          (
          <year>2022</year>
          ) pp.
          <fpage>76</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Steinberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O</given-names>
            <surname>'Callaghan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lattimore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>McCalman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Caetano</surname>
          </string-name>
          ,
          <article-title>Fast fair regression via efficient approximations of mutual information</article-title>
          , CoRR abs/2002.06200 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/2002.06200.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vetrò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Torchiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mecati</surname>
          </string-name>
          ,
          <article-title>A data quality approach to the identification of discrimination risk in automated decision making systems</article-title>
          ,
          <source>Government Information Quarterly</source>
          <volume>38</volume>
          (
          <year>2021</year>
          )
          101619. URL: https://www.sciencedirect.com/science/article/pii/S0740624X21000551. doi:10.1016/j.giq.2021.101619.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Simonetta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nakajima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Paoletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Venticinque</surname>
          </string-name>
          ,
          <article-title>Fairness metrics and maximum completeness for the prediction of discrimination</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>3356</volume>
          (
          <year>2022</year>
          ) pp.
          <fpage>13</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>