<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Accountability Driven Development for Machine Learning Systems?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chiu Pang Fung</string-name>
          <email>C.P.Fung@leeds.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wei Pang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iman Naja</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Milan Markovic</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Edwards</string-name>
          <email>p.edwardsg@abdn.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing, University of Leeds</institution>
          ,
          <addr-line>Leeds, LS2 9JT</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Mathematical and Computer Sciences, Heriot-Watt University</institution>
          ,
          <addr-line>Edinburgh, EH14 4AS</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Natural and Computing Sciences, University of Aberdeen</institution>
          ,
          <addr-line>Aberdeen, AB24 3UE</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>With the rapid deployment of Machine Learning (ML) systems into diverse domains such as healthcare and autonomous driving, important questions regarding accountability for incidents resulting from ML errors remain largely unresolved. To improve the accountability of ML systems, we introduce a framework called Accountability Driven Development (ADD). Our framework reuses the Behaviour Driven Development (BDD) approach to describe testing scenarios and system behaviours in natural language during ML system development, and it guides and compels developers and intended users to actively record necessary accountability information in the design and implementation stages. In this paper, we illustrate how to transform accountability requirements into specific scenarios and provide a syntax to describe them. The use of natural language allows non-technical collaborators, such as stakeholders and non-ML domain experts, to engage deeply in ML system development and to provide more comprehensive evidence supporting the system's accountability. The framework also attributes responsibility to the whole project team, including the intended users, rather than placing the entire accountability burden on ML engineers alone. Moreover, the framework can be considered a combination of system testing and acceptance testing, making development more efficient. We hope this work attracts more engineers to adopt our idea, enabling them to create more accountable ML systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Behaviour Driven Development</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Model Card</kwd>
        <kwd>Accountability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        There are many definitions of accountability in AI systems; following our
previous work [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], we define accountability as follows: "The ability to inspect,
review or otherwise interrogate an AI system with the goal of (i) making
processes associated with each of its life cycle stages transparent; (ii) demonstrating
compliance with hard laws (i.e. laws and regulations) and soft laws (i.e.
standards and guidelines); and (iii) aiding investigations into the cause(s) of failure
or erroneous decisions and supporting the identification of responsible parties."
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This definition can be adapted to the narrower scope of ML systems.
Currently, ML system development is a straightforward process [
        <xref ref-type="bibr" rid="ref17 ref4">17, 4</xref>
        ] (Figure 1).
In such a process, the relevant personnel within the project are in charge
of different tasks based on their roles. Decision makers care about the purpose;
stakeholders are concerned with the business value of the potential ML
application; domain experts explain what the data means; and data scientists and ML
engineers focus on the technical perspective, including data processing, model training
and evaluation. This division of labor may cause ambiguity in the accountability
of ML systems. Many data scientists and ML engineers are keen on fulfilling
system performance requirements rather than considering
accountability issues. The Model Card framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] offers a standardized documentation
procedure to capture useful information such as a model's purpose, owners, data
details and algorithmic details, but the information it collects is not
comprehensive with respect to the accountability of ML systems. The lack of information
about a system's explainability, transparency, fairness and performance makes it
difficult to audit and investigate potential failures of ML systems.
An ML system's life cycle can be divided into four high-level stages: Design,
Implementation, Deployment, and Operation &amp; Maintenance [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In this paper, we
focus on illustrating how our ADD framework asks the ML system to generate this
information in the implementation stage, and we give examples
based on a simulation in which our framework is used to develop an automated
decision-making system for mortgage applications.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background and related work</title>
      <p>
        Artificial Intelligence (AI) systems which employ ML models are being put to
use in various applications affecting our daily lives. However, few companies and
organizations are dedicating resources to mitigating AI risks, despite some of them
increasingly having to manage such risks [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The same is true in AI/ML research:
many researchers focus on improving the
performance of ML algorithms rather than looking into the accountability of these
systems. Nevertheless, research on ML accountability has been emerging in recent years.
Hajian et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] propose an approach to processing raw data to prevent
discrimination in AI-based intrusion and crime detection. Datta et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] introduce
the Quantitative Input Influence (QII) metric to improve the transparency of
decision-making systems by measuring the influence of the input features. Bach
et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] propose Layer-Wise Relevance Propagation to explain deep neural
networks. Ribeiro et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] propose the Local Interpretable Model-Agnostic
Explanations (LIME) algorithm to locally explain the predictions of classifiers and
regressors by training an interpretable model to approximate the original model.
Afterwards, Shrikumar et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] propose Deep Learning Important FeaTures
(DeepLIFT). Finally, Lundberg et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] propose the SHapley Additive
exPlanations (SHAP) method. These articles sparked the surge in ML explainability
research, and all of the above work makes it feasible to collect information about an ML
system's explainability. For the other aspect, transparency, Mitchell
et al. introduce the Model Card [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to create documentation for ML models. Their
framework allows people to look deeply into a model's development and its
performance.
      </p>
      <p>
        On the other hand, software development has a methodology called
Behaviour Driven Development (BDD) [
        <xref ref-type="bibr" rid="ref12 ref16 ref6">12, 6, 16</xref>
        ], which evolved from Test Driven
Development (TDD) [
        <xref ref-type="bibr" rid="ref11 ref2">2, 11</xref>
        ]. Like TDD, BDD is a test-driven
development approach, but BDD designs user story scenarios instead of test
cases; it uses natural language to describe the software system's behaviours, forcing
the behaviours to fulfill the corresponding requirements in all scenarios used to test the
system. As a test-first approach, BDD designs the scenarios at the
beginning of development, when no code exists yet, so the first test
must fail until developers start to write code. The use of natural language to
describe user story scenarios in BDD creates a bridge among software designers,
developers and users, encouraging all team members,
both developers and non-technical collaborators, to work together with fewer technical
barriers. Inspired by BDD, we use natural language to describe an ML
system's behaviours in the form of user story scenarios designed according
to both technical and accountability requirements, making the ML system
fulfill those requirements. This means the ML system must generate information
about its performance, explainability, transparency, etc. For example, given a
performance requirement such as accuracy, when the ML system passes the
corresponding scenario test it must generate information on accuracy. This information
can be collected to improve the system's accountability. If there is a failure related to
the system's accuracy after deployment, we can trace back what kind of
accuracy test was done and what conditions had been set in the corresponding testing
scenario, and this information supports failure investigation and auditing.
      </p>
    </sec>
    <sec id="sec-3">
      <title>ML System Development with ADD</title>
      <sec id="sec-3-1">
        <title>The work ow of ADD</title>
        <p>In our framework (Figure 2), starting from the task/requirements analysis,
the development team and potential users should discuss not only technical
requirements (e.g. performance requirements such as accuracy and efficiency) but
also the accountability requirements concerning the model's transparency,
explainability and fairness, and document all of the requirements. Then, based on the
requirements, user stories and test scenarios are designed to start the testing.
As in BDD, the first test must fail because no ML system has been developed yet.
Afterwards, the developers start to implement the ML system and test it again,
iterating until the developed system passes all the tests. Eventually the system can be
considered to have fulfilled all the requirements and is ready to be deployed. In
this sense, the testing in our framework can be considered a combination of system
testing and acceptance testing.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Technical and Accountability Requirements</title>
        <p>
          Technical requirements normally refer to performance characteristics. Some of
these requirements originate from potential users, such as performance
requirements, safety requirements and hardware architecture requirements. However, to
make the technical requirements more complete, the development team should
proactively propose extra technical requirements for ML systems, such as metrics
for specific algorithms that do not violate the customer's requirements. For example,
if a customer asks for 90% accuracy for a classifier, the development team
should offer a confusion matrix analysis. Regarding transparency, a document
extended from the Model Card [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to collect more information should be
established to record details of the entire model development process, such as the
requirements analysis and the useful information generated when the system passes
the tests. Regarding explainability, there should be requirements to
explain how and why an output is generated by the system. For example, in the
automated decision-making system for mortgage applications, when an output is
obtained, the weights of all the features (features in this paper refer to the applicant's
income, occupation, age, etc.) of the corresponding applicant should be listed and
recorded. Regarding fairness, it can be proposed, in accordance with the law, that the
system cannot discriminate based on ethnicity, gender, etc.
        </p>
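The confusion matrix offered beyond the customer's accuracy requirement can be made concrete. A minimal self-contained sketch, assuming a binary classifier; compute_confusion_matrix and accuracy_from_matrix are illustrative helper names, not an existing API:

```python
# Sketch: the customer asks only for accuracy, but the team proactively
# records the full confusion matrix as extra accountability evidence.
def compute_confusion_matrix(predictions, labels):
    """Binary confusion matrix as a dict of counts."""
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, y in zip(predictions, labels):
        if p == 1 and y == 1:
            cm["tp"] += 1      # true positive
        elif p == 1 and y == 0:
            cm["fp"] += 1      # false positive
        elif p == 0 and y == 0:
            cm["tn"] += 1      # true negative
        else:
            cm["fn"] += 1      # false negative
    return cm

def accuracy_from_matrix(cm):
    # The customer-facing accuracy figure is derivable from the richer matrix,
    # so recording the matrix subsumes the original requirement.
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total
```

Recording the matrix rather than the single accuracy number lets auditors later distinguish, say, a system that declines too many valid applications from one that approves too many invalid ones.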
      </sec>
      <sec id="sec-3-3">
        <title>Design user story scenarios</title>
        <p>In BDD, user story scenario tests relate to sub-modules of the whole
software system. For example, in a supermarket management system, selling a
product relates to both the cash flow management module and the stock management
module, so the scenario descriptions are more comprehensive. In ADD, we
consider the three blocks (Data Collection, Data Analysis/Pre-processing and Model
Construction; Figure 2) inside the dotted rectangle as a single component in
development. This simplifies the design: for example, we can ask this component
to provide data distribution information rather than asking the Data
Analysis/Pre-processing block to do so. Not only the normal requirements but also the edge
cases must be covered in the scenario design. For example, in the edge
case where an applicant with no features enters the system, an
error message "Wrong input given, cannot create proper output" should
be given rather than a normal output such as "Mortgage application is approved"
or "Mortgage application is declined". Also, all details of the scenario design
must be recorded, as they capture which situations have been considered for the
use of the system, which is useful for accountability.</p>
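The edge case above could be checked as follows. This is an illustrative sketch; decide_mortgage and its placeholder decision rule are assumptions standing in for the real trained system:

```python
# Sketch of the edge-case scenario: an applicant record with no features must
# yield the error message, never a normal approve/decline output.
def decide_mortgage(applicant_features):
    """Return a decision string for one applicant (dict of feature values)."""
    if not applicant_features:
        # Edge case from the scenario design: refuse to produce a decision.
        return "Wrong input given, cannot create proper output"
    # Placeholder decision rule standing in for the trained model:
    score = sum(applicant_features.values())
    return ("Mortgage application is approved" if score > 0
            else "Mortgage application is declined")
```

A scenario test would then assert that the empty-input case produces the error message, so the system can never silently approve or decline a malformed application.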
      </sec>
      <sec id="sec-3-4">
        <title>Syntax for describing scenarios and system's behaviour</title>
        <p>The ML technical terms may be difficult to understand, and obscure
statistical methods, evaluation standards and metrics keep non-technical personnel out
of the loop. These non-technical collaborators may have a better understanding
of other accountability factors, such as legal and ethical requirements, which
can help improve a system's accountability. Removing this technical barrier
is necessary to gain contributions to accountability from both technical and
non-technical personnel in the development stages. In ADD, a method similar
to BDD is used: we use natural language to describe user story scenarios
and the behaviours of the system. We give two examples below to explain how
to translate a fairness requirement into scenarios and how to describe the system's
behaviour in each scenario. In the second example, the system is forced to
generate explanation information. (We note that in these examples we just
want to demonstrate how ADD can be used, and we do not consider the profit
requirements of the intended users, the banks. We also note that the
examples below are the starting point for engaging all parties, and they will be further
refined after discussions among the relevant people, including users and ML
developers/designers. The final version of the scenarios may be produced after several
iterations until all parties reach a consensus. If a consensus cannot be reached,
the information on the disagreement also needs to be recorded.)</p>
        <sec id="sec-3-4-1">
          <title>Title: Producing non-discriminatory output</title>
        </sec>
        <sec id="sec-3-4-2">
          <title>As a Bank.</title>
        </sec>
        <sec id="sec-3-4-3">
          <title>I want a system that produces fair outputs for different applicants regardless of their ethnicity, so that the system is not racially biased.</title>
        </sec>
        <sec id="sec-3-4-4">
          <title>Scenario 1: The system should produce fair outputs for applicants from different ethnic groups.</title>
        </sec>
        <sec id="sec-3-4-5">
          <title>Given that a mortgage application is approved for applicant A,</title>
        </sec>
        <sec id="sec-3-4-6">
          <title>When another applicant B, with the same features but different ethnicity, applies for the same mortgage amount,</title>
        </sec>
        <sec id="sec-3-4-7">
          <title>Then the mortgage application should be approved for applicant B.</title>
        </sec>
        <sec id="sec-3-4-8">
          <title>Scenario 2: The system should produce fair outputs for applicants from different ethnic groups.</title>
        </sec>
        <sec id="sec-3-4-9">
          <title>Given that a mortgage application from applicant A is declined,</title>
        </sec>
        <sec id="sec-3-4-10">
          <title>When another applicant B with the same features but different ethnicity applies for the same mortgage amount,</title>
        </sec>
        <sec id="sec-3-4-11">
          <title>Then the mortgage application from applicant B should be declined.</title>
        </sec>
        <sec id="sec-3-4-12">
          <title>Title: Providing explanations for outputs</title>
        </sec>
        <sec id="sec-3-4-13">
          <title>As a Bank.</title>
        </sec>
        <sec id="sec-3-4-14">
          <title>I want a system that provides an explanation for every mortgage application decision, so that we know why each output is produced and can explain it to our clients.</title>
        </sec>
        <sec id="sec-3-4-15">
          <title>Scenario 1: The system should provide an explanation for a successful application.</title>
        </sec>
        <sec id="sec-3-4-16">
          <title>Given an applicant with all necessary features,</title>
        </sec>
        <sec id="sec-3-4-17">
          <title>When the mortgage application is approved,</title>
        </sec>
        <sec id="sec-3-4-18">
          <title>Then the system should create a report with all the weights corresponding to the features of the applicant.</title>
        </sec>
        <sec id="sec-3-4-19">
          <title>Scenario 2: The system should provide an explanation for a failed application.</title>
        </sec>
        <sec id="sec-3-4-20">
          <title>Given an applicant with all necessary features,</title>
        </sec>
        <sec id="sec-3-4-21">
          <title>When the mortgage application is declined,</title>
        </sec>
        <sec id="sec-3-4-22">
          <title>Then the system should create a report with all the weights corresponding to the features of the applicant.</title>
        </sec>
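The two explanation scenarios can likewise be sketched as code. The linear per-feature weights below are a stand-in for any attribution method (e.g. LIME or SHAP scores in a real system); explain_decision is a hypothetical helper:

```python
# Sketch of the explanation scenarios: whatever the decision, the system emits
# a report pairing each applicant feature with its weighted contribution.
def explain_decision(applicant_features, weights):
    """Return a per-feature contribution report for one decision."""
    report = {
        name: value * weights.get(name, 0.0)
        for name, value in applicant_features.items()
    }
    # The total score summarises how the contributions combine into a decision.
    report["_total_score"] = sum(report.values())
    return report
```

Because the same report is produced for both approved and declined applications, a single step definition can serve Scenario 1 and Scenario 2.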
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and Future work</title>
      <p>There is an urgent need to improve the accountability of ML systems during the
development stage. In this paper, we briefly introduce our ADD framework for
developing ML systems and explain how ADD can facilitate the capture of
accountability information regarding a system's performance, explainability, fairness and
transparency. We believe ADD can help reduce misunderstanding among users, ML
system designers and developers by engaging them and confirming the
accountability requirements.</p>
      <p>
        In the future, we are going to further improve this framework by studying
the requirements and scenario design steps (Sections 3.2 and
3.3) more deeply, developing an example model and producing a full report by extending the
Model Card framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Furthermore, we point out that ADD is a generic
methodology for facilitating accountability in ML, and therefore we will extend
our framework to the whole life cycle of ML systems, making it more widely
applicable. Finally, we are considering developing a Cucumber-like [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] tool (Cucumber is
a tool that supports BDD) specifically for ML system development.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Binder</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montavon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klauschen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , Muller,
          <string-name>
            <given-names>K.R.</given-names>
            ,
            <surname>Samek</surname>
          </string-name>
          , W.:
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          .
          <source>PloS one 10(7)</source>
          ,
          <year>e0130140</year>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Beck</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Test-driven development: by example</article-title>
          . Addison-Wesley Professional
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Datta</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zick</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems</article-title>
          .
          <source>In: 2016 IEEE symposium on security and privacy (SP)</source>
          . pp.
          <fpage>598</fpage>
          –
          <lpage>617</lpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Google</surname>
          </string-name>
          :
          <article-title>Machine learning workflow</article-title>
          . https://cloud.google.com/ai-platform/docs/ml-solutions-overview,
          <source>online; Accessed Mar 25</source>
          ,
          <year>2021</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hajian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Domingo-Ferrer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez-Balleste</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Discrimination prevention in data mining for intrusion and crime detection</article-title>
          .
          <source>In: 2011 IEEE Symposium on Computational Intelligence in Cyber Security (CICS)</source>
          . pp.
          <fpage>47</fpage>
          –
          <lpage>54</lpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Haring</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Ruiter</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Behavior driven development: Beter dan test driven development</article-title>
          .
          <source>Java</source>
          Magazine p.
          <fpage>29</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lundberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>S.I.</given-names>
          </string-name>
          :
          <article-title>A unified approach to interpreting model predictions</article-title>
          .
          <source>arXiv preprint arXiv:1705.07874</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Mckinsey</surname>
          </string-name>
          :
          <article-title>The state of AI in 2020</article-title>
          . https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/global-survey-the-state-of-ai-in-2020, online;
          <source>Accessed Mar 28</source>
          ,
          <year>2021</year>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaldivar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnes</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasserman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchinson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spitzer</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raji</surname>
            ,
            <given-names>I.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gebru</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Model cards for model reporting</article-title>
          .
          <source>In: Proceedings of the conference on fairness, accountability, and transparency</source>
          . pp.
          <fpage>220</fpage>
          –
          <lpage>229</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Naja</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markovic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cottrill</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>A semantic framework to support AI system accountability and audit</article-title>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Newkirk</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vorontsov</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          :
          <article-title>Test-driven development in Microsoft .NET</article-title>
          , vol.
          <volume>1</volume>
          . Microsoft Press Redmond, WA (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>North</surname>
          </string-name>
          , D.:
          <article-title>Introducing BDD</article-title>
          . http://dannorth.net/introducingbdd, online;
          <source>Accessed Mar 30</source>
          ,
          <year>2021</year>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>"Why should I trust you?": Explaining the predictions of any classifier</article-title>
          .
          <source>In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining</source>
          . pp.
          <fpage>1135</fpage>
          –
          <lpage>1144</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Shrikumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenside</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kundaje</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Learning important features through propagating activation differences</article-title>
          .
          <source>In: International Conference on Machine Learning</source>
          . pp.
          <fpage>3145</fpage>
          –
          <lpage>3153</lpage>
          .
          <string-name>
            <surname>PMLR</surname>
          </string-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>SmartBear</surname>
          </string-name>
          :
          <article-title>What is Cucumber</article-title>
          . https://cucumber.io/docs/guides/overview/,
          <source>online; Accessed Mar 28</source>
          ,
          <year>2021</year>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Solis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>A study of the characteristics of behaviour driven development</article-title>
          .
          <source>In: 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications</source>
          . pp.
          <fpage>383</fpage>
          –
          <lpage>387</lpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cui</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Machine learning for networking: Workflow, advances and opportunities</article-title>
          .
          <source>IEEE Network</source>
          <volume>32</volume>
          (
          <issue>2</issue>
          ),
          <fpage>92</fpage>
          –
          <lpage>99</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>