<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Law Data Science and Ethics: the CRIKE Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Silvana Castano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mattia Falduti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfio Ferrara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Montanelli</string-name>
          <email>stefano.montanelli@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università degli Studi di Milano, DI</institution>
          ,
          <addr-line>Via Celoria 18, 20135 Milano</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>4</volume>
      <issue>2019</issue>
      <abstract>
        <p>In the era of big data, research activity on data science focuses on large datasets to produce knowledge supporting decision-making processes in different application domains and contexts. Data science practices and outputs have a tremendous impact on a variety of fields, raising new ethical issues that become crucial. In this paper, we address the ethical issues related to the ethics of data, the ethics of algorithms, and the ethics of practices in the context of our data science approach for case-law decisions (CLDs) processing called CRIKE (CRIme Knowledge Extraction). In particular, we discuss the ethical issues that need to be faced when dealing with knowledge extracted from CLDs for descriptive analysis purposes and for predictive usage of data extracted from CLDs.</p>
      </abstract>
      <kwd-group>
        <kwd>case law analysis</kwd>
        <kwd>data science ethics</kwd>
        <kwd>ethics of data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In the era of big data, research activity on data science focuses on the collection,
processing, and interpretation of large datasets to produce knowledge for
decision-making processes in different application domains and contexts. This is
both stimulated and made possible by the continuous
production of data coming from disparate data sources and locations and by
the availability of web-based technologies for data storage, integration, analysis,
and mining, thus enabling behavior and trend prediction as well as descriptive
statistics for facts and events. Ethical issues play a crucial role in data
science processes, and addressing them improves the social impact and the scientific quality of data
science practices and outputs. For example, in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], a framework is proposed
for the enforcement of ethical oversight over the dissemination and use of Big
and Open Data. The framework is grounded on the importance of encouraging
critical thinking and ethical reflection among the researchers involved in data
processing practices. As discussed by Floridi and Taddeo in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the main ethical challenges
in data science can be classified as follows: i) ethics of data, focused on the collection
and analysis of large datasets; ii) ethics of algorithms, focused on the complexity and
autonomy of algorithms; and iii) ethics of practices, aimed at drafting ethical
frameworks to shape professional codes, strategies, and policies. On this ground,
in this paper we address ethical issues in the context of our data science approach
for case-law decisions (CLDs) processing called CRIKE (CRIme Knowledge
Extraction). The CRIKE approach has been conceived for processing large datasets
of CLDs coming from diverse law sources (e.g., first instance, Court of Appeal) to
automatically discover applications of legal abstract terms in the texts of court
decisions. CRIKE relies on the LATO ontology, where abstract terms and decision
verdicts are formally defined by means of concepts and relations. A detailed
description of the LATO ontology design and of the CRIKE knowledge extraction
processes is provided in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The CRIKE process workflow covers all the phases of
a conventional data science process: i) data collection, where CLDs are collected
and stored in digital format for subsequent analysis; ii) knowledge extraction,
where CLD texts are processed to extract knowledge in the form of relevant
terminology corresponding to the concepts in the LATO ontology; and iii) target-oriented
practices, where knowledge extracted from CLDs can be exploited both for
descriptive analysis purposes, by classifying CLDs, and for predictive usage of CLDs,
by enforcing learning procedures.
      </p>
      <p>
        Ethical issues of different nature, impact, and implications are
involved in processing CLDs using CRIKE. As a general consideration, we
observe that CLDs involve individuals such as judges, accused people, and
possibly other individuals involved in the crime description (e.g., witnesses).
The processing of CLDs should thus avoid the following prominent ethical issues: i) violation of
individual privacy as well as prohibited secondary uses of personal data; ii)
individual classification based on data revealing racial or ethnic origin, political
opinions, religious or philosophical beliefs, as well as trade union membership,
genetic and biometric data, data concerning health or data concerning sex life or
sexual orientation; iii) unfairness of prediction algorithms in CLD analytics
approaches focused not only on pure data analysis, but also on court outcome
prediction, judge profiling, and automatic legal decision making. For example,
an ethical issue envisaged in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] is propensity, that is, what could or should be done to prevent an act
on the basis of predictions about what people are likely to do. As
discussed in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], what if big data analytics predict that a certain person has a
95% likelihood of being involved in domestic violence? An ethical issue here
has to do with the ethical role of those setting the threshold and of the data
scientists writing the algorithm that calculates the likelihood based on the observation
of certain variables available in the underlying dataset.
      </p>
      <p>
        After describing the overall CRIKE process workflow (Section 2), the goal of
the paper is to provide a finer classification of the ethical issues involved in the
CRIKE process workflow by referring to the classification introduced in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and
its application in the framework of CRIKE (Section 3). Finally, we conclude
by discussing our future work (Section 4).
      </p>
    </sec>
    <sec id="sec-4">
      <title>The CRIKE approach to Case-Law Decisions Processing</title>
      <p>The CRIKE approach to CLDs processing is articulated in six main
activities, as shown in Figure 1. The Collection of CLDs activity is devoted to the
tasks and procedures used for acquiring and preprocessing CLDs from a qualified
source, such as the Court of Milan. Usually, in the Italian context,
CLDs are provided in the form of images of the paper documents. The quality of
these documents is highly variable. Thus, OCR and other ad hoc solutions for
data cleaning are required to obtain a plain-text version of each CLD together
with a limited set of metadata (including a CLD identifier and the date). In the
Storage of CLDs activity, digital documents are stored in a database. In CRIKE,
we exploit MongoDB to store, for each CLD, the raw text, the available
metadata, as well as the sentences and single words obtained from sentence and word
tokenization of the raw CLD text.</p>
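      <p>As an illustration only, the kind of per-CLD record described above could be sketched as follows. The field names, the identifier, the date, and the regular-expression tokenizers are assumptions made for this example; the paper does not specify the schema or the tokenizers used in CRIKE.</p>
      <preformat>
```python
import re


def build_cld_record(cld_id, date, raw_text):
    """Return a dict with raw text, metadata, sentences, and words of one CLD."""
    # Naive sentence split: a run of non-terminator characters plus an optional
    # terminator. Abbreviations such as "art." would be split incorrectly.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", raw_text) if s.strip()]
    # Naive word tokenization: lowercase alphanumeric runs, one list per sentence.
    words = [re.findall(r"\w+", s.lower()) for s in sentences]
    return {
        "metadata": {"cld_id": cld_id, "date": date},  # hypothetical field names
        "raw_text": raw_text,
        "sentences": sentences,
        "words": words,
    }


record = build_cld_record(
    "CLD-0001",    # hypothetical identifier
    "2017-05-12",  # hypothetical date
    "The court finds the defendant guilty. The offense is minor.",
)
print(record["sentences"])
print(record["words"][1])
```
      </preformat>
      <p>A dictionary of this shape could then be written to a MongoDB collection through a driver; that step is omitted here so that the sketch stays self-contained.</p>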
      <p>[Figure 1: The six activities of the CRIKE approach: 1) Collection of CLDs; 2) Storage of CLDs; 3) LATO ontology design; 4) Ontology-based knowledge extraction; 5) CLDs classification; 6) Predictive use of CLDs.]</p>
      <p>CRIKE is based on the LATO ontology, which drives the process of knowledge
extraction from CLDs. The third activity is the LATO ontology design, with
the goal of conceptualizing legal concepts and the related controlled vocabulary.
Then, working with LATO and with the contents of the CLD database, we
extract knowledge from the CLDs (Ontology-based knowledge extraction activity).
The goal of this activity is to retrieve occurrences of the legal concepts, as they are
defined in LATO, within the CLD document collection and to extract the relevant
terminology used by the judge to articulate those concepts in each specific CLD.
Knowledge extracted from CLDs constitutes the input for the subsequent activity
of CLDs classification (Fig. 1, activity 5). The goal is to classify CLDs according to the
concepts of interest in LATO, to measure the relevance of terms extracted from
CLD texts with respect to LATO concepts, and to associate terminology with
the final decision of the judge. This activity is the basis for calculating a degree
of correlation between terminology, concepts, and decisions. Based on this
analysis, it is then possible to enforce learning procedures to make a predictive
use of CLDs with respect to specific legal concepts (Predictive use of CLDs). Both
activities 5 and 6 are target-driven, in that the classification and predictive use of
CLDs are customized according to the final use of CLD data (e.g., to study
the interpretation given by courts to a specific legal concept, or to predict a decision
given some facts).</p>
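      <p>To make the idea of ontology-driven extraction and classification more concrete, the following minimal sketch searches CLD sentences for the terminology attached to each concept and labels the CLD with the concepts found. It is not the CRIKE implementation: the toy concept-to-term mapping, the sample sentences, and the function names are hypothetical, and the real LATO ontology is richer than a dictionary of phrases.</p>
      <preformat>
```python
import re
from collections import Counter

# Hypothetical excerpt of a concept-to-terminology mapping; the actual LATO
# ontology is not reproduced here.
TOY_ONTOLOGY = {
    "minor offense": ["minor offense", "modest quantity"],
    "good faith": ["good faith"],
}


def extract_concepts(sentences, ontology):
    """Count, per concept, the sentences that mention one of its terms."""
    hits = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        for concept, terms in ontology.items():
            # Whole-phrase match on word boundaries over the lowercased sentence.
            if any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms):
                hits[concept] += 1
    return hits


def classify(sentences, ontology):
    """Label a CLD with every concept found at least once, sorted by name."""
    return sorted(extract_concepts(sentences, ontology))


cld_sentences = [
    "The defendant held a modest quantity of the substance.",
    "The court therefore qualifies the fact as a minor offense.",
    "No aggravating circumstance applies.",
]
print(extract_concepts(cld_sentences, TOY_ONTOLOGY))
print(classify(cld_sentences, TOY_ONTOLOGY))
```
      </preformat>
      <p>Because the matching is a plain phrase search over a human-editable mapping, every label can be traced back to the sentence and the term that produced it.</p>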
    </sec>
    <sec id="sec-5">
      <title>Dealing with ethics in CRIKE</title>
      <p>
        To highlight and discuss ethical issues in processing CLDs, we map the data
science ethics framework proposed by Floridi and Taddeo in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] onto the CRIKE activity
workflow, resulting in the three-layer framework shown in Fig. 2: i) ethics of
data, involving ethical issues related to the collection and analysis of large CLD
datasets; ii) ethics of algorithms, involving ethical issues related to the complexity and
autonomy of CRIKE algorithms; and iii) ethics of practices, more strictly related
to ethics in the target-oriented classification and prediction activities of CRIKE.
      </p>
      <p>[Figure 2: The CRIKE activity workflow mapped onto three ethics layers. Source of CLDs: Milan Court and Court of Appeal. Activities: 1) Collection of CLDs; 2) Storage of CLDs; 3) LATO ontology design; 4) Ontology-based knowledge extraction; 5) CLDs classification; 6) Predictive use of CLDs. Layers: ethics of data, ethics of algorithms, ethics of practices.]</p>
      <sec id="sec-5-2">
        <title>Ethics of data</title>
        <p>
Ethics of data primarily refers to the source providing data as well as to the
procedures used for data acquisition and storage. In terms of data acquisition,
working in the legal domain, and in particular the Italian legal domain, requires us
to acquire data from a specific, secure, and certified source. Both laws and CLDs
have an institutional creator, which should be accessed by directly interacting
with the public administration offices in order to acquire genuine data in terms
of data format and completeness. In CRIKE, we process CLDs obtained directly
from the involved Courts (the Court of Milan and the Court of Appeal). The direct
access to the administration databases guarantees the institutional provenance
of data as well as their integrity. A second relevant issue concerning ethics of data
involves personal data. In particular, criminal CLDs may contain three different
categories of personal data, namely (i) identification data, (ii) special categories
of personal data, and (iii) criminal records. Identification data are defined by
the General Data Protection Regulation (GDPR) as those data describing an
identifiable person [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Special categories of personal data are described in
Article 9 of the GDPR as "those data revealing racial or ethnic origin, political
opinions, religious or philosophical beliefs, as well as trade union membership,
genetic and biometric data, data concerning health or data concerning sex life
or sexual orientation". Criminal records are the records concerning a person's
criminal history. This last category of personal data is protected by Article
10, where the GDPR specifies that "access to those data is permitted only under the
control of an official authority or when the processing is authorized by European
Union or Member State law". The aim of the regulation is to protect personal
data against illicit handling. In particular, the main ethical issues related to CLDs
acquisition and storage concern the risks associated both with the privacy of groups
of people and with the re-identification of individuals. Specifically, the risk associated
with groups regards the possibility of combining data and grouping individuals,
for example, by committed crime, by race or nationality, by spoken language
or dialect, by age or gender. These activities could violate group privacy and
could permit re-identification through inference [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Concerning the re-identification
of individuals, the main risk is to violate the right to be forgotten, as discussed
in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. These issues are faced in different research fields. For example, [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] presents
an estimation of the re-identification risk for data sharing policies of the Health
Insurance Portability and Accountability Act (HIPAA) Privacy Rule, as well as
an evaluation of the risk of a specific re-identification attack using voter
registration lists. In general, uncontrolled re-identification risks can lead to a
dangerous loss of information control and to privacy violations, since
information privacy concerns specifically the capacity of an individual to maintain
control of his or her information [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. Since privacy regulation is based on the
notion of meaningful consent, trust in data acquisition and processing
is a crucial issue [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. In particular, the topic of privacy in accessing
individual criminal history information is addressed in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], where the authors define
policies for providing public access to individual criminal records in Spain and
the USA, considering access to court records, protection of honor, privacy and
personal data, free speech, and rehabilitation. In this context, CRIKE is
compliant with privacy regulation in that it is conceived to detect exclusively
legal concepts inside the CLDs and to group the CLDs by legal concepts and
their application. In addition, we want to extract legal knowledge by automatically
considering the verdicts. In other terms, we consider only legal terminology and
crime argumentation. Our goals are not related to personal data, either directly or
indirectly. Knowledge extraction and text mining activities only aim to find
applications of legal concepts and how they are expressed by judges inside the various
CLDs. Moreover, due to the particular type of data and the agreement we signed
with the involved Court administrations, our dataset is closed and it can be
neither shared nor published. The CLDs database is protected against external attacks,
in that it is stored on a stand-alone machine accessed only by a restricted number
of authorized researchers with given time restrictions. These restrictions were
mandatory to sign the agreement with the involved public administration
offices for CLDs acquisition and use. We note that our dataset avoids the group
privacy issues in that we obtained a whole set of CLDs, rather than only selected
CLDs targeted to a specific topic or objective to be analysed, such as all
CLDs related to a specific crime or to a specific group of crimes.
        </p>
      </sec>
      <sec id="sec-5-3">
        <title>Ethics of algorithms</title>
        <p>
The ethical issues related to the design and implementation of algorithms that
process criminal data are transparency, accountability, and discrimination. First, in
terms of transparency, the risk is to use or implement processes and algorithms
that are unclear, incomprehensible, and unrepeatable [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. Transparency is
related to the concepts of accessibility and comprehensibility of information, as
reported in [
          <xref ref-type="bibr" rid="ref21 ref24">21, 24</xref>
          ]. Real-world algorithmic decision-making processes designed
to maximize fairness and transparency are described in the Open Algorithm
(OPAL) project [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Transparency alone is insufficient, on one side, because
companies would not reveal and disseminate proprietary algorithms so as not to lose
their competitive edge, and on the other side, because of the so-called
transparency paradox [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. This refers to the fact that it is clear what machine
learning algorithms do in taking decisions about, for example, credit, medical
diagnosis, personalized recommendations, advertising, or job opportunities, but it
is still less clear how these decisions are taken [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. This issue is directly related
to accountability, which is the problem of attributing the blame for problems
and errors of very complex systems to specific individuals [
          <xref ref-type="bibr" rid="ref13 ref17">13, 17</xref>
          ].
        </p>
        <p>
          A further issue to be addressed is how, and on whom, to enforce accountability
for discriminatory outcomes of data analysis. Handling criminal data in fact means
facing the risk of associating criminal behavior with groups of
individuals on the basis of their race, religion, cultural background, language, age, or
gender. An example of a discriminatory data mining outcome in ranking job
candidates is described in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The authors demand caution in the use of data mining
techniques and advocate that it should be part of a comprehensive set
of strategies for combating discrimination in the workplace and for promoting
fair treatment and equality. Other interesting contributions on this issue are the
idea of Classification with No Discrimination (CND) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and the proposal of
a guideline for researchers and anti-discrimination data analysts on concepts,
problems, application areas, datasets, methods, and approaches from a
multidisciplinary perspective, as presented in [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. A discussion of algorithm fairness
issues on criminal data analysis and racial disparities, in particular focusing on
the problem of designing an algorithm for pretrial release decisions, is given
in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Since CRIKE knowledge extraction enforces an ontology-based approach
with LATO, we comply with the need for transparency in terms of
comprehensibility and human intervention. In particular, we decided to base the process of
knowledge extraction mainly on quite simple functionalities for searching LATO
terminology within the CLD documents in order to guarantee a transparent and
easily repeatable process. We handle CRIKE accountability issues by arguing
that LATO can be changed and modified directly by the designer to influence
CRIKE results. Moreover, the system is open and still under definition. Our
goal is to preserve human intervention and direct control over the system
behavior and over the achieved results. Furthermore, in order to avoid the reported
discriminatory risks, we base the knowledge extraction and classification processes
only on general legal concepts and their application, by considering for instance the crime
paragraph, article, verdict, and the related terminology.
        </p>
      </sec>
      <sec id="sec-5-4">
        <title>Ethics of practices</title>
        <p>
The issues concerning the ethics of practices are related to the use of the
outcomes of data analysis. In particular, we need to face risks concerning anonymity
and informed consent, secondary use, and data protection. Informed consent
appears insufficient to solve ethical problems related to individual privacy, as
discussed in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], where the authors point out how privacy and big data are simply
incompatible without the definition of new approaches having anonymity as one of
their primary goals from the design stage. In particular, they point out how anonymity
is different from namelessness and reachability. About secondary use, the aim is to
ensure ethical practices fostering both the progress of data science and the
protection of the rights of individuals and groups, as pointed out in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. An example
of the question of privacy and secondary use of data in health research is given
in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] by considering three different levels: informed consent, anonymity, and
public interest mandate. In health research, the reuse of clinical data is a
fast-growing field, recognized as essential to: i) realize the potential for high-quality
healthcare, ii) improve healthcare management, iii) reduce healthcare costs, and
iv) perform effective clinical research [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. In particular, one of the main issues
in this field is the trade-off between the need to keep personal data anonymous
and the need to exploit data to achieve results that could be useful for
citizens, according to the notion of public interest. An example is available in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ],
where the authors describe two court cases (in the US and the UK) about
selling prescription data and the related questions of what constitutes privacy and
what constitutes public interest. Balancing privacy, public interest, and open access raises
ethical and juridical questions in the legal field as well, because Criminal Courts
declare in their decisions what is forbidden and what is allowed. Thus,
according to the European Court of Human Rights, criminal argumentation reported
in CLDs has to be published, accessible, and known by individuals. CRIKE's
results achieved so far are completely anonymized and do not report any
personal or identifying data, because CRIKE works exclusively with legal concepts
formalized in LATO. The CRIKE system has a scientific research aim only and
it respects the GDPR rules for scientific research purposes. We mine CLDs in
order to extract the legal argumentation and the application of juridical terms,
considering the diffusion of legal knowledge as a positive element. For these
reasons, we aim at facilitating access to legal knowledge without pursuing
goals of judge profiling or similar.
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Future work</title>
      <p>Our work on CRIKE is ongoing. So far, we have achieved first, promising results
in automatically extracting and classifying CLDs terminology concerning
drug-related crimes. In particular, we focused on the formalization of legal abstract terms
in LATO in this context. In law articles, legal abstract terms represent
something indeterminate that needs a concrete application to be defined; examples
of abstract terms are good faith, long-term cohabitation, or minor offense case,
where what should be considered good, long-term, or minor requires a concrete
interpretation by the Court in order to be defined. In this context, we defined a
CRIKE process to detect concrete applications of legal abstract terms in CLDs
and to determine how and where the considered legal abstract terms are applied by
judges in their legal argumentation. As discussed in the paper, CRIKE has been
designed from the very beginning to be compliant with guidelines and
regulations concerning ethical issues in the field of data science. Our future work
will keep this as a primary goal of CRIKE. In particular, we aim at evolving
the LATO ontology to include further legal concepts and related terminology by
systematically testing the capability of the system to detect and classify CLDs
against them. Moreover, we aim at exploiting machine learning
techniques to automatically enrich LATO starting from the training set composed of
the CLDs that have been classified through the ontology-driven approach, thus
enforcing a bootstrapping mechanism where each cycle of knowledge extraction
and classification is used to improve the ontology and the subsequent extraction
cycle. Finally, we aim at studying the correlation between the concrete
application of legal abstract terms and the final Court decision, in order to apply a
predictive approach for determining an expected verdict given the concrete facts
that are related to each specific legal concept of interest.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barocas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nissenbaum</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Big Data's End Run around Anonymity and Consent</article-title>
          , pp.
          <fpage>44</fpage>
          –
          <lpage>75</lpage>
          . Cambridge University Press (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Barocas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selbst</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          :
          <article-title>Big data's disparate impact</article-title>
          .
          <source>California Law Review</source>
          <volume>104</volume>
          ,
          <fpage>671</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Benitez</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malin</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Evaluating re-identification risks with respect to the HIPAA privacy rule</article-title>
          .
          <source>Journal of the American Medical Informatics Association: JAMIA</source>
          <volume>17</volume>
          ,
          <fpage>169</fpage>
          –
          <lpage>77</lpage>
          (03
          <year>2010</year>
          ). https://doi.org/10.1136/jamia.2009.000026
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bennett</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          :
          <article-title>The right to be forgotten: Reconciling EU and US perspectives</article-title>
          .
          <source>Berkeley Journal of International Law</source>
          <volume>30</volume>
          ,
          <fpage>161</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Castano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falduti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Crime knowledge extraction: An ontology-driven approach for detecting abstract terms in case law decisions</article-title>
          (
          <year>2019</year>
          ),
          <source>17th International Conference on Artificial Intelligence and Law (ICAIL)</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Corbett-Davies</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pierson</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huq</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Algorithmic decision making and the cost of fairness</article-title>
          .
          <source>In: Proc. of the 23rd ACM SIGKDD Int. Conference on Knowledge Discovery and Data Mining</source>
          . pp.
          <fpage>797</fpage>
          –
          <lpage>806</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <collab>EU Parliament and Council of European Union</collab>
          :
          <article-title>General Data Protection Regulation</article-title>
          (May
          <year>2016</year>
          ), http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L:2016:119:TOC
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Floridi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Open Data, Data Protection, and Group Privacy</article-title>
          .
          <source>Philosophy &amp; Technology</source>
          <volume>27</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>–<lpage>3</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Floridi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taddeo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>What is data ethics?</article-title>
          <source>Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</source>
          <volume>374</volume>
          ,
          <elocation-id>20160360</elocation-id>
          (12
          <year>2016</year>
          ). https://doi.org/10.1098/rsta.2016.0360
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kamiran</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calders</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Classification with no discrimination by preferential sampling</article-title>
          .
          <source>In: Proc. 19th Machine Learning Conf. Belgium and The Netherlands</source>
          . pp.
          <fpage>1</fpage>–<lpage>6</lpage>
          .
          <publisher-name>Citeseer</publisher-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>How should health data be used?: Privacy, secondary use, and big data sales</article-title>
          .
          <source>Cambridge Quarterly of Healthcare Ethics</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          ),
          <fpage>312</fpage>–<lpage>329</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Karst</surname>
            ,
            <given-names>K.L.</given-names>
          </string-name>
          :
          <article-title>"The Files": Legal Controls over the Accuracy and Accessibility of Stored Personal Data</article-title>
          .
          <source>Law and Contemporary Problems</source>
          <volume>31</volume>
          (
          <issue>2</issue>
          ),
          <fpage>342</fpage>–<lpage>376</lpage>
          (
          <year>1966</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kraemer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Overveld</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peterson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Is There an Ethics of Algorithms?</article-title>
          <source>Ethics and Information Technology</source>
          <volume>13</volume>
          (
          <issue>3</issue>
          ),
          <fpage>251</fpage>–<lpage>260</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Leonelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Locating Ethics in Data Science: Responsibility and Accountability in Global and Distributed Knowledge Production Systems</article-title>
          .
          <source>Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</source>
          <volume>374</volume>
          (
          <issue>2083</issue>
          ),
          <elocation-id>20160122</elocation-id>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lepri</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliver</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Letouzé</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pentland</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinck</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Fair, Transparent, and Accountable Algorithmic Decision-Making Processes</article-title>
          .
          <source>Philosophy &amp; Technology</source>
          <volume>31</volume>
          (
          <issue>4</issue>
          ),
          <fpage>611</fpage>–<lpage>627</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lowrance</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Learning from Experience: Privacy and the Secondary Use of Data in Health Research</article-title>
          .
          <source>Journal of Health Services Research &amp; Policy</source>
          <volume>8</volume>
          (
          <issue>1 suppl</issue>
          ),
          <fpage>2</fpage>–<lpage>7</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Matthias</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Responsibility Gap: Ascribing Responsibility for the Actions of Learning Automata</article-title>
          .
          <source>Ethics and Information Technology</source>
          <volume>6</volume>
          (
          <issue>3</issue>
          ),
          <fpage>175</fpage>–<lpage>183</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Meystre</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lovis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Burkle, T.,
          <string-name>
            <surname>Tognola</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budrionis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress</article-title>
          .
          <source>Yearbook of Medical Informatics</source>
          <volume>26</volume>
          (
          <issue>01</issue>
          ),
          <fpage>38</fpage>–<lpage>52</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Nissenbaum</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A Contextual Approach to Privacy Online</article-title>
          .
          <source>Daedalus</source>
          <volume>140</volume>
          (
          <issue>4</issue>
          ),
          <fpage>32</fpage>–<lpage>48</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Romei</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruggieri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A Multidisciplinary Survey on Discrimination Analysis</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          <volume>29</volume>
          (
          <issue>5</issue>
          ),
          <fpage>582</fpage>–<lpage>638</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Rubel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>K.M.L.</given-names>
          </string-name>
          :
          <article-title>Student Privacy in Learning Analytics: an Information Ethics Perspective</article-title>
          .
          <source>The Information Society</source>
          <volume>32</volume>
          (
          <issue>2</issue>
          ),
          <fpage>143</fpage>–<lpage>159</lpage>
          (
          <year>2016</year>
          ), https://doi.org/10.1080/01972243.2016.1130502
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Schermer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>The Limits of Privacy in Automated Profiling and Data Mining</article-title>
          .
          <source>Computer Law &amp; Security Review</source>
          <volume>27</volume>
          (
          <issue>1</issue>
          ),
          <fpage>45</fpage>–<lpage>52</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Spice</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Carnegie Mellon Transparency Reports Make AI Decision-Making Accountable</article-title>
          .
          <source>Tech. rep.</source>
          , Carnegie Mellon University School of Computer Science (
          <year>2016</year>
          ), https://www.cs.cmu.edu/news/carnegie-mellon-transparency-reports-make-aidecision-making-accountable
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Turilli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Floridi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The Ethics of Information Transparency</article-title>
          .
          <source>Ethics and Information Technology</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ),
          <fpage>105</fpage>–<lpage>112</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Van Wel</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Royakkers</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Ethical Issues in Web Data Mining</article-title>
          .
          <source>Ethics and Information Technology</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>129</fpage>–<lpage>140</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Zwitter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Big Data Ethics</article-title>
          .
          <source>Big Data &amp; Society</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ) (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>