<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Case Study on Data Protection for a Cloud- and AI-based Homecare Medical Device</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Philipp Bende</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga Vovk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Caraveo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ludwig Pechmann</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Leucker</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tallinn University of Technology</institution>
          ,
          <addr-line>Tallinn, Estonia</addr-line>
          ,
          <institution>School of Information Technologies, Department of Health Technologies</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>UniTransferKlinik Lübeck GmbH</institution>
          ,
          <addr-line>Lübeck</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Lübeck, Lübeck, Germany, Institute for Software Engineering and Programming Languages</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>To improve the treatment of many diseases, continuous monitoring of patients at home, combined with the ability of doctors to interact with individual cases, demands an increasing number of medical devices connected to the cloud. To support the doctor's duties, such devices may benefit from AI-based diagnosis routines. In order for such devices to be approved and placed on the market, they need to comply with various legal, regulatory, economic, and social requirements. An integral part of these requirements is the protection of the patients' data. In this paper, based on a current use case, we describe a workflow for identifying risks and addressing their mitigation. To this end, we recall the relevant legal, regulatory, economic, and social data protection requirements. We apply our findings to a Homecare OCT device that is intended to be used by elderly patients on a daily basis to take images of their eyes and send them to a cloud- and AI-based system for further analysis. Depending on the result, the patient's ophthalmologist is notified for further dedicated treatment. To perform the risk management, we (i) describe the architecture of the homecare system, (ii) analyze its data flow, (iii) discuss several vectors of attack, and (iv) propose ways to mitigate the risks.</p>
      </abstract>
      <kwd-group>
        <kwd>Data Protection</kwd>
        <kwd>Risk Management</kwd>
        <kwd>Homecare Medical Devices</kwd>
        <kwd>Cloud- and AI-based System</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The rapidly developing field of medical devices using Artificial Intelligence (AI) has great
potential in the healthcare domain by revolutionizing diagnosis, treatment, and patient care
delivery. AI-based medical devices and software can help clinicians diagnose patients’ health
problems more accurately, assess risk, and provide a higher level of support for health care
professionals. Such devices can also help patients by providing better, more affordable and
convenient health care. Despite these benefits, the use of such technologies also
raises concerns, one of which is personal data protection. Achieving an adequate level of
protection requires comprehensive measures, including legal, technical
and administrative ones.</p>
      <p>This article gives an overview of the requirements associated with data protection applicable
in the health care field and provides an example of translation of the requirements to practice.
In this work we present a case study of a homecare medical system and discuss risks concerning
patient data protection associated with such a system, as well as ways to mitigate these risks in
practice.</p>
      <p>We created an overview of publications that address the question of data security in medical
devices and AI. Based on our findings, we can divide publications into papers that focus on legal
aspects of data protection and technical aspects. The first category includes papers that focus
on analyzing the legal requirements and frameworks. For example, in the article “The European
Legal Framework for Medical AI” [21], the authors look into relevant laws, focusing on data
protection. Nevertheless, this and similar papers use a theoretical approach. In contrast, we bring
to the reader’s attention requirements and related regulations but mainly focus on the practical
implementation of those rules. In the second category we put articles focused on technical
aspects, specifically medical devices cybersecurity. For example, the article “Secure health
data sharing for medical cyber-physical systems for the healthcare 4.0” [17] focuses mainly
on cybersecurity and technical aspects, such as encryption methods. Based on this literature
analysis we found out that research on the practical implementation of the requirements in
real-life devices and comprehensive description of risks related to data protection is missing.
Moreover, existing papers often focus on medical devices and networks located in hospitals,
which are usually better protected. In contrast, our work addresses a
home monitoring device that is used outside a secure hospital environment, which
can introduce additional risks to data protection.</p>
      <p>The article is structured as follows. In Section 2 we discuss legal, regulatory, social and
economic requirements that need to be taken into account while dealing with personal data
and requirements for data protection in software as a medical device. Following, Section 3 gives
an overview of the homecare cloud system, including architecture and data flow in the system.
Next, in Section 4 we describe potential risks associated with personal data protection in the
current system and methods to mitigate those risks. Finally, in Section 5, we discuss the main
findings.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Requirements</title>
      <p>Personal data is of great value today, and various requirements are in place to ensure its
protection. In this article, we distinguish four kinds: legal requirements, which are common and
general rules set by law; regulatory requirements, which are set by specific normative acts in the
field; economic requirements, which have a business impact; and social requirements, which include
additional protection measures for sensitive personal data, such as medical and health-care-related data.</p>
      <sec id="sec-2-1">
        <title>2.1. Legal Requirements</title>
        <p>In order to harmonize legal requirements across European countries, the European Commission
introduced the General Data Protection Regulation (GDPR), a law in the privacy protection
field that is mandatory for all EU countries. The regulation applies directly in all EU countries and
does not require additional implementation in national legislation. If
needed, however, countries can issue national laws that complement the GDPR [8].</p>
        <p>Personal data is the central term in data protection laws. One of the core obligations under
GDPR is to provide an adequate security level for personal data. Those measures include but
are not limited to the following: ensuring confidentiality, integrity, availability of data;
implementing pseudonymization, anonymization and encryption; ability to protect from incidents
and minimize risks; process of testing, assessing and evaluating the system. According to
Art. 4(1) GDPR, “personal data” is defined as any information relating to an identified or
identifiable natural person [8].</p>
        <p>Data anonymization is one of the ways to keep the value of data while preserving privacy. GDPR defines
anonymisation as the “process of creating anonymous information”, which means anonymized
information shall not relate to an identified or identifiable natural person or contain personal data. It is
important to emphasize that European data protection legislation applies to personal
data, which means that anonymized data is out of the scope of GDPR, although it can still be
subject to other laws. Nevertheless, anonymized data, or in other words de-identified data, shall
be distinguished from pseudonymized data. Personal data to which pseudonymization methods were
applied and that can still be attributed to a natural person shall be considered information that
may allow identification of a natural person [8]. In addition, the controller shall assess whether
a person is identifiable. To do that, according to Recital 26 GDPR, “account should be taken
of all the means reasonably likely to be used, such as singling out, either by the controller or by
another person to identify the natural person directly or indirectly” [8].</p>
        <p>
          The Article 29 Working Party (WP29) Opinion on Anonymization Techniques, based on
Directive 95/46/EC, understands anonymization as “results from processing personal data in
order to irreversibly prevent identification”. Although the directive is no longer in force,
the given definition is still accurate [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. EU guidelines, such as the Working Party opinion mentioned
above, aim to provide direction on data anonymization. However, the final decision
on which privacy methods to use is the responsibility of the data controller and shall be made
case by case, since there is no universal method that fits all cases [23].
        </p>
        <p>
          In addition to GDPR, some EU countries have issued additional national guidelines on handling
data. One example is the Guidance on health data protection (German:
“Orientierungshilfe zum Gesundheitsdatenschutz”) issued in Germany. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] The document provides a practical
overview of the essential data protection requirements for companies in the healthcare sector.
        </p>
        <p>
          According to ENISA (European Union Agency for Cybersecurity), the choice of anonymization
and pseudonymization methods depends on different parameters, primarily the data protection
level and the utility of the dataset [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The choice of a method may also be influenced by
the complexity associated with a certain scheme in terms of implementation, scalability and
database size.
        </p>
        <p>
          There are multiple ways to protect personal data, for example, anonymization,
pseudonymization, non-disclosure, hashing, encryption or tokenization [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
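        <table-wrap id="tab-1">
          <label>Table 1</label>
          <caption>
            <p>Risk acceptance matrix: probability of occurrence (columns) versus severity of a hazard (rows).</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Severity</th>
                <th>1-Frequent</th>
                <th>2-Occasional</th>
                <th>3-Rare</th>
                <th>4-Unlikely</th>
                <th>5-Unthinkable</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>1-Marginal</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
              </tr>
              <tr>
                <td>2-Minor</td>
                <td>unacceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
              </tr>
              <tr>
                <td>3-Moderate</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
              </tr>
              <tr>
                <td>4-Serious</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>acceptable</td>
                <td>acceptable</td>
              </tr>
              <tr>
                <td>5-Catastrophic</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>unacceptable</td>
                <td>acceptable</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>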
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Regulatory Requirements</title>
        <p>The GDPR, issued by the European Commission, is a major driver for data protection.
Even before the GDPR went into effect in May 2018, manufacturers of medical devices
were already required to address data protection by ISO 13485:2016 and the Medical Device
Regulation (MDR). The MDR requires data protection in cases such as clinical trials, and ISO
13485 requires the manufacturer to ensure the confidentiality of health information
and to implement the necessary methods to do so [11]. This applies to the device itself
as well as to every process in which the manufacturer could have access to
patient data.</p>
        <p>ISO 13485 defines Quality Management Systems (QMS) for medical devices and
ensures that the product is safe, effective and efficient. To this end, the QMS documents the
whole lifecycle, from product concept, development and verification through the post-market
phase to the decommissioning of a product. Each phase of the product life cycle needs
to be covered by risk management activities. Manufacturers usually implement an ISO 14971
compliant risk management process to identify hazards that could result in property damage,
personal injury or death of users and/or patients or even reputation loss for the manufacturer.</p>
        <p>ISO 14971:2012 defines only two types of risk, unacceptable and acceptable, and all risks have
to be mitigated as long as the risk-benefit ratio does not become negative. Therefore, the manufacturer
has to define its risk acceptance criteria, which leads to a risk acceptance matrix. The risk
acceptance matrix in Table 1 shows the correlation between the probability of occurrence and
the severity of a hazard. For probability there are five classes, from frequent, such as on each use, to
unthinkable, which may occur only once in the lifetime of a device. For severity there are also
five classes, from marginal, where there is no harm at all, to catastrophic, which could lead to severe
injury or death. Consequently, a high probability combined with a high severity leads to an unacceptable risk,
while a low probability combined with a low severity may lead to an acceptable risk [12].</p>
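        <p>As an illustration, the risk acceptance matrix in Table 1 can be encoded as a simple lookup. The following Python sketch is not the manufacturer's risk management tooling; the class names follow the text, while the function name risk_acceptance and the data layout are assumptions made for this example.</p>
        <preformat><![CDATA[
```python
# Sketch of the risk acceptance matrix from Table 1.
# Class names follow the text; the lookup helper is hypothetical.

PROBABILITY = ["Frequent", "Occasional", "Rare", "Unlikely", "Unthinkable"]
SEVERITY = ["Marginal", "Minor", "Moderate", "Serious", "Catastrophic"]

# Rows: severity (Marginal .. Catastrophic).
# Columns: probability (Frequent .. Unthinkable).
# True means the combination is an unacceptable risk.
UNACCEPTABLE = [
    [False, False, False, False, False],  # Marginal
    [True,  False, False, False, False],  # Minor
    [True,  True,  False, False, False],  # Moderate
    [True,  True,  True,  False, False],  # Serious
    [True,  True,  True,  True,  False],  # Catastrophic
]


def risk_acceptance(severity, probability):
    """Return 'acceptable' or 'unacceptable' for one hazard."""
    row = SEVERITY.index(severity)
    col = PROBABILITY.index(probability)
    return "unacceptable" if UNACCEPTABLE[row][col] else "acceptable"
```
]]></preformat>
        <p>For example, risk_acceptance("Serious", "Rare") yields "unacceptable", whereas the same severity with probability "Unlikely" is acceptable.</p>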
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Economic Requirements</title>
        <p>We can take a look at economic requirements from two perspectives. First, from the need to
spend resources to apply proper measures to protect data, and second, from the possible losses
in case of incompetence and data breach.</p>
        <p>GDPR defines a data controller as a natural or legal person, public authority, agency or
other body that determines the purposes and means of the processing of personal data, and
requires controllers to maintain the necessary technical and organizational measures to keep personal
data protected [8]. However, since there is no standard set of measures that fits all
cases, and specific requirements may vary from country to country or detailed guidelines may
be missing altogether, in most cases it is still at the data controller’s discretion to select appropriate
measures. Nevertheless, the applied measures shall be appropriate and relevant to the case. One
way to evaluate this is to conduct a risk-based data protection impact assessment. This
procedure helps to identify, analyze and mitigate risks associated with data processing. We
want to point out that no single solution will guarantee both data protection and data utility in every case. Instead, there is
a range of possible measures that can be implemented in a specific case to find a
suitable balance. The appropriate solution depends on many factors, e.g. the type and
scope of the processed data, the risks associated with data processing, with whom data is shared and
the available resources. In each case, a minimum level of protection measures
and a maximum achievable level of security can be defined.</p>
        <p>The minimum level can be achieved with fewer resources but is associated with higher
risks of data loss or unauthorized access. Although the maximum level of security provides
a higher level of data protection, it also has drawbacks, such as high cost of implementation
and maintenance, the need to involve specialists from different fields, lower data
utility due to possible data loss resulting from anonymization, and less convenient data
access from a user’s perspective. Violation of GDPR requirements can have serious financial
consequences for data controllers. In case of infringement of GDPR provisions, a data controller
can receive a fine of up to 20 000 000 EUR or 4% of the total worldwide annual turnover, whichever is higher [8]. In
addition to those fines, a data breach can cause serious financial and reputational losses.
According to IBM Security’s 2020 data breach report, the average cost of a data breach in a health care
organization is $7.13 million, which is 10% more than in 2019 [10].</p>
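        <p>The upper bound of the fine can be made concrete with a short calculation. Art. 83(5) GDPR sets the maximum at whichever of the two amounts is higher; the following Python sketch assumes an undertaking with a known annual turnover, and the function name is hypothetical. It gives only the statutory ceiling, not the fine a supervisory authority would actually impose.</p>
        <preformat><![CDATA[
```python
def max_gdpr_fine_eur(annual_turnover_eur):
    """Statutory ceiling of a fine under Art. 83(5) GDPR, in EUR."""
    fixed_cap = 20_000_000
    # 4% of the total worldwide annual turnover of the preceding year.
    turnover_cap = annual_turnover_eur * 4 / 100
    # The higher of the two amounts is the applicable ceiling.
    return max(fixed_cap, turnover_cap)
```
]]></preformat>
        <p>For a turnover of 100 000 000 EUR the fixed cap of 20 000 000 EUR applies, while for a turnover of 1 000 000 000 EUR the percentage-based cap of 40 000 000 EUR is higher.</p>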
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Social Requirements</title>
        <p>
          According to GDPR, health-related data belong to a special category of data that requires
additional protection measures compared to regular personal data. Such data may reveal
information about the past or current status of a person’s health, including physical or mental
health conditions. This special category may include results of examinations of body or tissue
samples, medical history, treatment details as well as data from health care professionals and
medical devices [8]. Data protection requirements may also be specified in documents at
the national level. For instance, the Guidance on health data protection describes organizational
precautions that must be taken by companies that process sensitive health data in order to
ensure the protection of this data. Those precautions include an obligation on employees to
handle data securely and the creation of a register of all data processing operations.[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]
        </p>
        <p>Certainly, the secondary usage of health data, such as for research purposes, can have
significant social benefits, including developing solutions that improve people’s lives, providing
better support in decision making, and more affordable care. GDPR specifies that personal data
shall be collected for specified, explicit, and legitimate purposes and applies restrictions for the
usage of personal data in a way that is incompatible with those purposes [8]. Nevertheless,
processing for archiving purposes in the public interest, scientific or historical research purposes
is considered compatible with the initial purposes of the collection which means data can be
used for research but may require implementation of safeguard measures [8].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Overview of the Homecare Cloud System</title>
      <p>
        Age-related macular degeneration (AMD) is an eye disease that damages the macula of the
retina and leads to blurred vision or loss of vision in the center of the visual field. There is no cure,
but treatment slows the progression of the disease and reduces vision loss.[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] Treatment is
administered by injecting vascular endothelial growth factor inhibitors into the eye at fixed
intervals or adaptively, if worsening of the disease is detected.[16]
      </p>
      <p>The homecare system aims to provide a solution for the frequent monitoring of AMD patients’
eyes and the AI-based prediction of the course of the disease. Frequent monitoring of the
disease from the patient’s home allows the onset of a worsening of the
disease to be detected and the treatment therefore to be scheduled at the best possible time. To this end, a cloud-based
system is realized that allows different users to interact reliably and securely with multiple cloud
services.</p>
      <p>
        When a patient is diagnosed with AMD, they can get a prescription for a homecare device.
An optician provides the homecare device to the patient, who uses the device once every day
to take a series of optical coherence tomography (OCT)[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] images of their eyes. These OCT
images get uploaded to the cloud, where an AI evaluates the progression of the AMD and
suggests whether further treatment is required. If treatment is required, the patient’s doctor is
notified and can make an appointment for the treatment. Additionally, the patient and their
doctor can view all past images and classification results in the cloud.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Architecture of the Homecare Cloud System</title>
        <p>Figure 1 shows an overview of the cloud system and homecare device architecture. The left
part of the figure shows the typical roles of patient, doctor and homecare device in the system.
These users interact with the homecare system via different interfaces. A patient interacts
with the system by using the homecare device to take a series of OCT images of their eyes.
The homecare device then uploads the raw image data to the cloud system. Within the cloud,
a preprocessing routine reconstructs a 3-dimensional DICOM image from the uploaded data.
Additionally, an AI-based classification service is notified that new images require evaluation.
The classification service evaluates the image and produces a recommendation on whether the disease
has progressed and thus requires further treatment. Current and past images and results can be
accessed and viewed by the patient, as well as their doctor, via a mobile application or a web
front-end application.</p>
        <p>To ensure the integrity of the patient’s data in the cloud, access to the system is restricted to
authenticated and authorized users. Multiple standards for authentication and authorization
exist, of which the three most commonly used are Open Authorization (OAuth) [9], OpenID
Connect [20] and Security Assertion Markup Language (SAML) [13] [15].</p>
        <p>
          There is precedent for using OpenID in conjunction with OAuth2.0 in high-security
environments such as eHealth, eGovernment and banking [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ][14].
        </p>
        <p>For our implementation we utilize a server implementing OAuth2.0 [9] and OpenID
Connect [20]. OAuth2.0 provides role-based access control (RBAC) mechanisms, therefore
allowing only authorized users access to resources [9]. The OpenID Connect standard provides
web-based single-sign-on authentication and cross-domain identity management [20].</p>
        <fig id="fig-1">
          <label>Figure 1</label>
          <caption>
            <p>Overview of the cloud system and homecare device architecture. Component labels recovered from the figure: Web Frontend, OpenID/OAuth2.0, Home OCT interface, Preprocessing service, DICOM Storage Service, Notify, AI Training Service, Deployment and QM Service, Classification Service, AI Classification Algorithm, Classification request Interface, Database, User Interaction Service, Web Back-end.</p>
          </caption>
        </fig>
        <p>An AI-training service is used for further training and improvement, and the deployment
and quality assurance of the AI-model. The AI classification algorithm runs as a cloud service
and provides an interface where other services can request an evaluation of OCT images.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Flow in the Homecare Cloud System</title>
        <p>To identify points of attack, where an intruder could attempt to gain access to a patient’s data
in the system, data- and information flow analysis can be performed. Information flow analysis
results in an overview of which entities have access to the information in the system[19]. In
addition to entities having access to data, a data flow analysis shows which processes and
storage units have access to the data in the system[22]. Therefore, data flow analysis is preferred
over information flow analysis for the identification of possible points of attack. In our data
flow analysis we follow the methodology described by Seifermann et al. [22].</p>
        <p>Figure 2 shows the data flow in the homecare system and visualizes which user is able to
access the data. The homecare device (top-left) takes the patient’s OCT images and sends the
images, as well as metadata describing the device to a preprocessing service running in the
cloud (below). The images are related to the homecare device through the metadata, but not to the
patient.</p>
        <p>The preprocessing service generates a 3-dimensional image file from the raw OCT image
data and the device’s metadata and stores it as a DICOM file in the cloud. The image files can
be retrieved from the storage by specifying the storage path of the requested file. Further, the
image classification service (right of the preprocessing service) gets notified about a new upload.</p>
        <p>When notified, the classification service retrieves the image file from the storage and evaluates
the contained OCT image with an AI service also located in the cloud. The classification result,
whether the patient’s AMD worsened or not, is combined with the patient’s ID to enable a
correlation between a patient and their OCT image.</p>
        <p>The correlation between a patient, their doctor and a homecare device (above and right of
the classification service) is established by the patient’s optician when the patient is initially
entered into the system and provided with a homecare device. The homecare device does not
contain this patient identifying information.</p>
        <p>The view result service (bottom middle) allows authenticated patients and their doctors to
view an OCT image and the corresponding classification result. The user specifies the requested
results (by patient.ID). The results contain the path to the corresponding image files, which are
retrieved from the storage. In addition, a doctor can view the results of all their patients (bottom
right).</p>
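        <p>The data flow described above can be summarized in a small in-memory model. The following Python sketch is a simplification under stated assumptions, not the system's implementation: the class HomecareCloud, its method names, the path scheme and the classifier callable are all hypothetical. It illustrates that the stored image carries only device metadata and that the patient ID is attached to the classification result.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class HomecareCloud:
    """In-memory stand-in for the cloud services (all names are hypothetical)."""
    dicom_storage: dict = field(default_factory=dict)      # path -> DICOM record
    device_to_patient: dict = field(default_factory=dict)  # device ID -> patient ID
    results: dict = field(default_factory=dict)            # patient ID -> [(path, worsened)]

    def assign_device(self, device_id, patient_id):
        """Optician links a device to a patient when the patient is entered."""
        self.device_to_patient[device_id] = patient_id

    def preprocess(self, device_id, raw_images):
        """Store the image with device metadata only and return its path."""
        path = f"dicom/{device_id}/{len(self.dicom_storage)}"
        self.dicom_storage[path] = {"device": device_id, "volume": list(raw_images)}
        return path

    def classify(self, path, classifier):
        """Evaluate the stored image and attach the result to the patient ID."""
        record = self.dicom_storage[path]
        worsened = bool(classifier(record["volume"]))
        patient_id = self.device_to_patient[record["device"]]
        self.results.setdefault(patient_id, []).append((path, worsened))
        return worsened

    def view_results(self, patient_id):
        """Return (image record, result) pairs by following the stored paths."""
        return [
            (self.dicom_storage[path], worsened)
            for path, worsened in self.results.get(patient_id, [])
        ]
```
]]></preformat>
        <p>Authentication and authorization are deliberately left out of this sketch; in the system they are enforced before a request reaches the view result service.</p>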
        <p>Figure 2 also divides the data into three categories
according to its criticality. Non-identifying, non-biometric data does not contain sensitive
information (purple). The OCT images of a patient’s eye are biometric data and contain sensitive
information in pseudonymized form (yellow). The patient’s identifying data, like the patient’s
name, is shown in blue.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Analysis and Mitigation for Potential Risk to the Patients’ Data</title>
      <p>Following the regulatory requirements discussed in Section 2.2, risk management for the
homecare medical system is implemented. This section describes the reasons for processing
sensitive data, potential risks to the data, as well as the measures taken in order to minimize
the risks.</p>
      <sec id="sec-4-1">
        <title>4.1. Reasons to Process Sensitive Data in the System</title>
        <p>GDPR Art. 9 protects personal and biometric data and prohibits the processing of such sensitive
data. An exception is that processing can be permitted if a specific requirement for the processing,
such as medical diagnosis, exists[8].</p>
        <p>For the purpose of medical diagnosis, it is necessary to record and store the patient’s full
name in the system in order for their doctor to search and find a specific patient’s data. When a
new patient is stored in the database, an ID is automatically generated for this patient to
guarantee the uniqueness of the database keys. Other personal information, such as age, sex or
address, is not processed in the system, since it is not required for the use case.</p>
        <p>In addition to the personal identifying data, biometric data in the form of OCT-images of the
patient’s eyes are stored and processed in the system. It is necessary to store and process the
biometric data because the purpose of the system is to detect AMD and evaluate the progress of
the disease. The system creates a historical record of the patient’s disease.</p>
        <fig id="fig-2">
          <label>Figure 2</label>
          <caption>
            <p>Data flow in the homecare system and user access to the data. Labels recovered from the figure: Assign Device to Patient; Enter new Patient to System; Dev-Pat &lt;- Dev-Op + Patient.ID; Patient &lt;- Patient.ID + Patient.Name; HomeOCT Device (takes patient's OCT images, sends raw image data to cloud); Preprocessing Service (DICOM.Image &lt;- Raw Image Data + Metadata); DICOM Storage (stores and retrieves DICOM files, Path &lt;- File.Storage.Location); View &lt;- Result + DICOM(Path); All Results &lt;- View + List(Pat-Doc). User access to the system: H - HomeOCT Device, O - Patient's Optician, A - System Administrator, D - Patient's Doctor, P - Patient.</p>
          </caption>
        </fig>
        <p>Since an AI detects the disease, multiple OCT images are required to train the neural
network. The original training images are not retained in the system; only the neural network’s
weights, which are derived from these training images, are kept.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Potential Risks for Patients’ Data</title>
        <p>Based on the Risk Analysis in Table 1 we identify points of attack in the data flow of the system
as depicted in Figure 2, where a potential intruder or malicious user could attack the system in
order to gain unauthorized access to a patient’s sensitive data.</p>
        <p>We differentiate between an intruder attempting to break into the system and a malicious
user, for example, a rogue system administrator, attempting to access sensitive patient data.</p>
        <p>An intruder could attempt to break into the cloud system or intercept data that is uploaded
from the homecare device to the cloud (see Figure 2 top left arrow). The intruder can also
attempt to identify a patient and find out the patient’s diagnosis.</p>
        <p>AI models are not protected under GDPR. However, an additional vector of attack on an
AI-based system is the attempt to reconstruct training data from the learned weights of the
model [18]. In this scenario, an intruder could request the classification of a specific data sample
and attempt to gain information about the original training images from the network’s response.</p>
        <p>A malicious user on the other hand would already have access to the system itself and could
attempt to access sensitive data or the AI without authorization. Such sensitive data could be
accessed by gaining access to the database storing the patient’s information and diagnosis (see
Figure 1 Database top right) or the patient’s biometric data (see Figure 1 DICOM-Storage top
middle). A similar scenario is a doctor attempting to access data of patients for whom they have no
authorization, such as a different doctor’s patients.</p>
        <p>Additionally, a malicious user would have access to the AI and could attempt to gain
information about the training data from the network’s response, similar to an external intruder but
with full access to the network.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Mitigation of Potential Risks</title>
        <p>According to Table 1, the hazards discussed in Section 4.2 are unacceptable risks due to the severity
of the damage and the frequency of occurrence, and therefore need to be mitigated. Table 2 shows
the mitigation strategy for each hazard in order to reduce the risks to an acceptable level.</p>
        <p>The homecare device sends sensitive biometric data to the cloud over an insecure channel,
where an intruder can intercept the communication. Because all communication is encrypted with
HTTPS, the sensitive data cannot be recovered even if it is intercepted.</p>
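        <p>A minimal sketch of an HTTPS-only upload from the device side, using only the Python
standard library. The function names, the <monospace>application/dicom</monospace> content type,
and the TLS 1.2 minimum are assumptions made for illustration; they are not taken from the
system described here.</p>
        <preformat>
```python
import ssl
import urllib.request
from urllib.parse import urlparse


def tls_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify the server certificate and hostname."""
    context = ssl.create_default_context()  # enables CERT_REQUIRED and hostname checks
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: reject legacy protocols
    return context


def build_upload_request(url: str, scan: bytes) -> urllib.request.Request:
    """Prepare the upload, refusing any URL that is not HTTPS."""
    if urlparse(url).scheme != "https":
        raise ValueError("refusing to send biometric data over an unencrypted channel")
    return urllib.request.Request(
        url,
        data=scan,
        method="POST",
        headers={"Content-Type": "application/dicom"},  # hypothetical payload type
    )


def upload_scan(url: str, scan: bytes) -> int:
    """Send one scan to the cloud and return the HTTP status code."""
    request = build_upload_request(url, scan)
    with urllib.request.urlopen(request, context=tls_context(), timeout=10) as response:
        return response.status
```
        </preformat>
        <p>Rejecting non-HTTPS URLs before any data leaves the device means that a misconfigured
endpoint fails loudly instead of silently falling back to plaintext.</p>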
        <p>In order to prevent an intruder or unauthorized user from gaining access to the databases,
we do not allow direct access to the database but require access through a backend service.
The backend only accepts requests from users who are authenticated and authorized by the
authorization server. This ensures that only authorized users can access the database, and only
the data for which they have permission.</p>
        <p>Since a doctor must be able to access their patients’ data, each patient is mapped to a doctor.
The backend allows requests from a doctor only for the data of patients for which this mapping
exists. This ensures that a doctor can access their own patients’ data but not the data of other
doctors’ patients.</p>
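        <p>The two backend checks described above (authentication, then the doctor-patient mapping)
can be sketched as follows. This is an illustration under stated assumptions: the in-memory
<monospace>sessions</monospace> dictionary stands in for the authorization server, and all class,
field, and role names are hypothetical.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Dict


class AccessDenied(Exception):
    """Raised when a request is unauthenticated or not covered by a mapping."""


@dataclass(frozen=True)
class User:
    user_id: str
    role: str  # "doctor" or "patient" (hypothetical role names)


@dataclass
class Backend:
    sessions: Dict[str, User]  # access token -> user; stand-in for the authorization server
    doctor_of: Dict[str, str]  # patient id -> id of the treating doctor
    records: Dict[str, str]    # patient id -> stored result

    def get_patient_record(self, token: str, patient_id: str) -> str:
        # Step 1: the request must come from an authenticated user.
        user = self.sessions.get(token)
        if user is None:
            raise AccessDenied("unknown or expired token")
        # Step 2: the requesting doctor must be mapped to this patient.
        if user.role != "doctor" or self.doctor_of.get(patient_id) != user.user_id:
            raise AccessDenied("no doctor-patient mapping for this request")
        return self.records[patient_id]


backend = Backend(
    sessions={"token-a": User("dr-a", "doctor"), "token-b": User("dr-b", "doctor")},
    doctor_of={"patient-1": "dr-a"},
    records={"patient-1": "OCT result for patient-1"},
)
print(backend.get_patient_record("token-a", "patient-1"))
```
        </preformat>
        <p>In this sketch, the same call with <monospace>token-b</monospace> raises
<monospace>AccessDenied</monospace>, because no mapping links that doctor to the patient.</p>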
        <p>To ensure the security and integrity of patients’ sensitive data, all patient information is
encrypted in the database. Furthermore, the biometric data is pseudonymized by mapping
it to an ID rather than to the patient’s name. This makes it harder for an intruder to
re-identify a patient, since the intruder would need access to both the biometric data and the
database connecting IDs to patients.</p>
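        <p>The separation between biometric data and identifying data can be illustrated with a small
sketch. It assumes random identifiers and two in-memory dictionaries in place of the real
DICOM storage and patient database; the class and method names are hypothetical, and database
encryption is deliberately left out.</p>
        <preformat>
```python
import secrets


class PseudonymizedStorage:
    """Keeps scans and patient identities in two separate stores linked only by an ID."""

    def __init__(self) -> None:
        self.identity_db = {}      # pseudonymous id -> patient name (protected database)
        self.biometric_store = {}  # pseudonymous id -> list of scans (no names stored here)

    def register_patient(self, name: str) -> str:
        patient_id = secrets.token_hex(16)  # random, so it cannot be derived from the name
        self.identity_db[patient_id] = name
        self.biometric_store[patient_id] = []
        return patient_id

    def store_scan(self, patient_id: str, scan: bytes) -> None:
        self.biometric_store[patient_id].append(scan)

    def re_identify(self, patient_id: str) -> str:
        # Only possible with access to the identity database.
        return self.identity_db[patient_id]


storage = PseudonymizedStorage()
pid = storage.register_patient("Jane Doe")
storage.store_scan(pid, b"oct-volume-bytes")
```
        </preformat>
        <p>An attacker who obtains only <monospace>biometric_store</monospace> sees scans keyed by
random IDs; linking them to a person additionally requires <monospace>identity_db</monospace>.</p>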
        <p>The risk to the integrity of the AI model’s training data is low since the model is not publicly
available and the system allows only selected users a limited number of requests. Even if an
intruder gains access to the model, only pseudonymized biometric data can be recovered; no
information identifying the patient is exposed.</p>
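        <p>Limiting the number of classification requests, which Table 2 sets to one per day for
external users, can be sketched as a per-user daily quota. The class below is a hypothetical
in-memory illustration; a deployed backend would need persistent, concurrency-safe storage.</p>
        <preformat>
```python
import datetime


class DailyQuota:
    """Allows each user at most one classification request per calendar day."""

    def __init__(self) -> None:
        self._last_request = {}  # user id -> date of the most recent accepted request

    def allow(self, user_id: str, today: datetime.date) -> bool:
        if self._last_request.get(user_id) == today:
            return False  # quota for today is already used
        self._last_request[user_id] = today
        return True


quota = DailyQuota()
day_one = datetime.date(2022, 3, 1)
first = quota.allow("external-user", day_one)   # accepted
second = quota.allow("external-user", day_one)  # rejected, same day
```
        </preformat>
        <p>Passing the date in explicitly keeps the sketch deterministic; a real service would
derive it from a trusted server clock.</p>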
        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr>
                <th>Hazard</th>
                <th>Risk acceptance before mitigation</th>
                <th>Mitigation</th>
                <th>Risk acceptance after mitigation</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>An intruder tries to fetch information which is sent by the Home OCT device</td>
                <td>Frequent × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Encrypt all communication with HTTPS</p></list-item>
                  </list>
                </td>
                <td>Frequent × Marginal ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>An intruder attempts to break into the cloud service and accesses data</td>
                <td>Occasional × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Limit system access to authorized users utilizing OpenID and OAuth 2.0 protocols</p></list-item>
                    <list-item><p>Restrict access to database only via backend service</p></list-item>
                    <list-item><p>Backend checks user validation before forwarding data</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Serious ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>Intruder or user gets access to the database in which the patient data is stored</td>
                <td>Occasional × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Restrict access to database only via backend service</p></list-item>
                    <list-item><p>Backend checks user permission before forwarding data</p></list-item>
                    <list-item><p>Encrypt database with secure standards</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Serious ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>A doctor can see results of other doctors’ patients</td>
                <td>Occasional × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Restrict access by mapping each patient with a doctor</p></list-item>
                    <list-item><p>Allow access only to data from patients with correct mapping</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Serious ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>A patient can see results of other patients</td>
                <td>Rare × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Restrict access by mapping each patient with a device</p></list-item>
                    <list-item><p>Allow access only to data from device with correct mapping</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Serious ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>An intruder attempts to gain information about a patient from the database</td>
                <td>Unlikely × Catastrophic ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Encrypt all patient’s sensitive data in the database</p></list-item>
                    <list-item><p>Encrypt patient’s diagnosis</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Moderate ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>An intruder tries to correlate biometric data with the patients</td>
                <td>Rare × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Pseudonymize biometric data by correlating patient’s ID with patient’s sensitive data</p></list-item>
                    <list-item><p>Not storing patient information on the homecare device</p></list-item>
                  </list>
                </td>
                <td>Rare × Moderate ⇒ Acceptable</td>
              </tr>
              <tr>
                <td>An intruder attempts to gain information about the AI’s training data</td>
                <td>Rare × Serious ⇒ Unacceptable</td>
                <td>
                  <list list-type="bullet">
                    <list-item><p>Restrict allowed number of classification requests to one per day for external users</p></list-item>
                    <list-item><p>Restrict access to the weights of the AI model</p></list-item>
                    <list-item><p>Biometric training data is pseudonymized</p></list-item>
                  </list>
                </td>
                <td>Unlikely × Serious ⇒ Acceptable</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
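        <p>The before and after ratings reported in Table 2 can be checked mechanically. The sketch
below encodes only the frequency and severity pairs that appear in Table 2 together with their
stated verdicts; it does not reproduce the full risk matrix of Table 1, so any other combination
deliberately raises an error. Hazard labels are shortened, and all identifiers are hypothetical.</p>
        <preformat>
```python
# Verdicts for the (frequency, severity) pairs that occur in Table 2.
OBSERVED_VERDICTS = {
    ("Frequent", "Serious"): "Unacceptable",
    ("Occasional", "Serious"): "Unacceptable",
    ("Rare", "Serious"): "Unacceptable",
    ("Unlikely", "Catastrophic"): "Unacceptable",
    ("Frequent", "Marginal"): "Acceptable",
    ("Unlikely", "Serious"): "Acceptable",
    ("Unlikely", "Moderate"): "Acceptable",
    ("Rare", "Moderate"): "Acceptable",
}

# (shortened hazard label, rating before mitigation, rating after mitigation)
HAZARDS = [
    ("intercept device upload", ("Frequent", "Serious"), ("Frequent", "Marginal")),
    ("break into cloud service", ("Occasional", "Serious"), ("Unlikely", "Serious")),
    ("access patient database", ("Occasional", "Serious"), ("Unlikely", "Serious")),
    ("doctor sees other doctors' patients", ("Occasional", "Serious"), ("Unlikely", "Serious")),
    ("patient sees other patients", ("Rare", "Serious"), ("Unlikely", "Serious")),
    ("gain patient information from database", ("Unlikely", "Catastrophic"), ("Unlikely", "Moderate")),
    ("correlate biometric data with patients", ("Rare", "Serious"), ("Rare", "Moderate")),
    ("gain information about training data", ("Rare", "Serious"), ("Unlikely", "Serious")),
]


def verdict(rating):
    """Look up the verdict; raises KeyError for pairs not listed in Table 2."""
    return OBSERVED_VERDICTS[rating]


def mitigation_is_sufficient(before, after) -> bool:
    """True if a hazard moves from an unacceptable to an acceptable rating."""
    return verdict(before) == "Unacceptable" and verdict(after) == "Acceptable"


all_mitigated = all(mitigation_is_sufficient(b, a) for _, b, a in HAZARDS)
```
        </preformat>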
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Protection of personal data is a highly regulated field, especially when the data is used in medical
devices and software. Various risks arise from the high value of this data and its sensitive
nature. Compliance becomes more challenging because of the diverse nature of the requirements
and the lack of practical examples for implementation. To fill this gap, our paper demonstrates
how legal, regulatory, and other requirements can be implemented in a real-life project. In
detail, we describe risks and ways to mitigate them. We believe that readers can benefit
from our research by learning how theoretical requirements can be translated into practice.</p>
      <p>In our data flow analysis, we identified two main types of attackers: an intruder, a person outside
the system, and a malicious user, a person inside the system, e.g., an administrator or doctor, who
tries to get access to data without authorization.</p>
      <p>Both types can have different vectors of attack and require mitigation measures to lower
potential risks. The main mitigation strategy includes the following: encrypted communication
between device and cloud, limited system access, restricted access to the databases with a backend
check of the user’s permissions, access mapping between doctor and patient, encryption of sensitive
personal data and pseudonymization where possible, storing patient data in a database encrypted
with secure standards, and avoiding storage of patient information on homecare devices.</p>
      <p>Even though absolute security is not achievable, our analysis shows that
the implementation of the measures mentioned above significantly mitigates the risks of a
successful attack and decreases possible damage in case of intrusion.</p>
      <p>[8] GDPR. Regulation (EU) 2016/679 of the European Parliament and of the Council on
the Protection of Natural Persons with Regard to the Processing of Personal Data and on
the Free Movement of Such Data, and Repealing Directive 95/46/EC, April 2016. [Online;
accessed 01-February-2022].</p>
      <p>[9] Hardt, D. The OAuth 2.0 Authorization Framework. RFC 6749, RFC Editor, October 2012.</p>
      <p>[10] IBM. IBM Report: Compromised Employee Accounts Led to Most Expensive Data Breaches
Over Past Year. [Online, Accessed 24 March 2022].</p>
      <p>[11] Johner, C. Datenschutz im Gesundheitswesen bei medizinischen Daten. [Online, Accessed
25 March 2022].</p>
      <p>[12] Johner, C. ISO 14971 and Risk Management. [Online, Accessed 29 March 2022].</p>
      <p>[13] Lewis, J. E. Web single sign-on authentication using SAML. IJCSI International Journal of
Computer Science Issues 2 (09 2009).</p>
      <p>[14] Lodderstedt, T., Bradley, J., Labunets, A., and Fett, D. OAuth 2.0 Security Best
Current Practice. Internet-Draft draft-ietf-oauth-security-topics-19, Internet Engineering
Task Force, Dec. 2021.</p>
      <p>[15] Naik, N., and Jenkins, P. Securing digital identities in the cloud by selecting an apposite
Federated Identity Management from SAML, OAuth and OpenID Connect. In 2017 11th
International Conference on Research Challenges in Information Science (RCIS) (2017), pp. 163–174.</p>
      <p>[16] Okada, M., Kandasamy, R., Chong, E. W. T., McGuiness, M. B., and Guymer, R. H. The
Treat-and-Extend Injection Regimen Versus Alternate Dosing Strategies in Age-related
Macular Degeneration: A Systematic Review and Meta-analysis. American journal of
ophthalmology 192 (2018), 184–197.</p>
      <p>[17] Qiu, H., Qiu, M., Liu, M., and Memmi, G. Secure health data sharing for medical
cyber-physical systems for the healthcare 4.0. IEEE journal of biomedical and health informatics
24, 9 (2020), 2499–2505.</p>
      <p>[18] Rigaki, M., and Garcia, S. A Survey of Privacy Attacks in Machine Learning, April 2021.
arXiv:2007.07646.</p>
      <p>[19] Sabaliauskaite, G., and Adepu, S. Integrating Six-Step Model with Information Flow
Diagrams for Comprehensive Analysis of Cyber-Physical System Safety and Security. In
2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE)
(2017), pp. 41–48.</p>
      <p>[20] Sakimura, N., Bradley, J., Jones, M., de Medeiros, B., and Mortimore, C. OpenID
Connect 1.0 specification, November 2014. [Online, Accessed 30 March 2022].</p>
      <p>[21] Schneeberger, D., Stöger, K., and Holzinger, A. The European legal framework for
medical AI. In International Cross-Domain Conference for Machine Learning and Knowledge
Extraction (2020), pp. 209–226.</p>
      <p>[22] Seifermann, S., Heinrich, R., Werle, D., and Reussner, R. Detecting violations of
access control and information flow policies in data flow diagrams. Journal of Systems and
Software 184 (2022), 111138.</p>
      <p>[23] Vovk, O., Piho, G., and Ross, P. Anonymization Methods of Structured Health Care Data:
A Literature Review. In International Conference on Model and Data Engineering (2021),
Springer, pp. 175–189.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Apte</surname>
            ,
            <given-names>R. S.</given-names>
          </string-name>
          <article-title>Age-Related Macular Degeneration</article-title>
          .
          <source>The New England journal of medicine</source>
          <volume>385</volume>
          ,
          <issue>6</issue>
          (
          <year>2021</year>
          ),
          <fpage>539</fpage>
          -
          <lpage>547</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <collab>BMWI</collab>
          .
          <source>Orientierungshilfe zum Gesundheitsdatenschutz</source>
          . [Online, Accessed 15 February 2022].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <collab>Datenschutz-Grundverordnung</collab>
          .
          <article-title>Verordnung (EU) 2016/679 des Europäischen Parlaments und des Rates zum Schutz natürlicher Personen bei der Verarbeitung personenbezogener Daten, zum freien Datenverkehr und zur Aufhebung der Richtlinie 95/46/EG (Datenschutz-Grundverordnung)</article-title>
          ,
          <year>April 2016</year>
          . [Online; accessed 16-February-2022].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Domenech</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Comunello</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wangham</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          <article-title>Identity management in eHealth: A case study of web of things application using OpenID connect</article-title>
          .
          <source>In 2014 IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom)</source>
          (
          <year>2014</year>
          ), pp.
          <fpage>219</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <collab>ENISA</collab>
          .
          <article-title>Pseudonymisation techniques and best practices</article-title>
          ,
          <year>November 2019</year>
          . [Online; accessed 11-February-2022].
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <collab>European Commission</collab>
          .
          <source>Article 29 working party opinion 05/2014 on anonymisation techniques</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fujimoto</surname>
            ,
            <given-names>J. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitris</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boppart</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Brezinski</surname>
            ,
            <given-names>M. E.</given-names>
          </string-name>
          <article-title>Optical Coherence Tomography: An Emerging Technology for Biomedical Imaging and Optical Biopsy</article-title>
          .
          <source>Neoplasia</source>
          <volume>2</volume>
          (
          <issue>1-2</issue>
          ) (
          <year>2000</year>
          ),
          <fpage>9</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>