<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Process Mining Meets GDPR Compliance: The Right to be Forgotten as a Use Case</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rashid Zaman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marwan Hassani</string-name>
          <email>m.hassani@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Process Analytics Group, Faculty of Mathematics and Computer Science, Eindhoven University of Technology</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In a bid to ensure the privacy of data subjects' personal data, the General Data Protection Regulation (GDPR) imposes stringent obligations on organizations and businesses that qualify as data controllers and data processors. The regulation additionally bestows on data subjects certain rights over their personal data, with the right to be forgotten generally perceived as the landmark. Fulfilling the GDPR's obligatory requirements and provisioning the data subjects' rights implies considerable changes to existing (pre-GDPR era) business and organizational processes. This is a non-trivial task with several technical as well as procedural challenges. The case is even more complicated for organizations with intertwined or cascaded business processes and for business processes that stretch over multiple organizations. The process mining discipline has been found highly effective in automatic discovery, conformance/compliance analysis, and enhancement of business processes, organizational workflows, and healthcare procedures/guidelines, to name a few. Process mining techniques therefore have great potential to assist and guide the transformation of pre-GDPR era (presumably GDPR non-compliant) business or organizational processes into GDPR-compliant processes, and afterwards to monitor compliance during execution. In addition to the current state-of-the-art offline process mining techniques, stable online conformance checking and online model repair techniques need to be developed to ensure compliance with the regulation. We highlight the challenges associated with implementing the right to be forgotten, and the GDPR in general.</p>
      </abstract>
      <kwd-group>
        <kwd>GDPR Right to be Forgotten</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>With people's increasing dependence on and use of the internet, and with
software systems being integrated to interface with and facilitate these users,
trails are left behind, either implicitly or explicitly. These trails contain usage
behavior information as well as personal data of the users. Traditionally,
organizations have been processing this data to discover patterns and useful
insights that support their business decision-making. Apart from in-house
processing, data has also been outsourced to third parties for processing on behalf
of the outsourcing body or for purposes specific to the outsourced body, probably
not in line with the primary logging purpose. Processing data in this manner,
together with some data breaches, has raised privacy concerns on the customers' side.</p>
      <p>To cope with the privacy issues arising from storage, access and
processing of users' (personal) data, the European Union adopted the General
Data Protection Regulation (GDPR)<sup>1</sup> in 2016, and it has applied since May 2018.
Calling for "privacy by design" and "privacy by default", the GDPR imposes strict
measures on organizations and businesses regarding processing, to ensure the
privacy of users' data. Another set of regulatory obligations of the GDPR, centered
around the data subjects, includes but is not limited to the rights to be informed,
to access personal data, to rectification, to data portability, to restrict
processing, and most importantly to erasure (to be forgotten).</p>
      <p>Organizations are facing challenges in implementing the GDPR. In some
cases, implementing the GDPR requirements to a certain extent proves
disadvantageous to the pre-GDPR era working mechanism of businesses. To cope
with the challenges of moving business processes from GDPR non-compliance to
GDPR compliance, and to ensure that compliance prevails throughout the process
life, the Business Process Re-engineering and functional toolkit
for GDPR compliance (BPR4GDPR) project<sup>2</sup> has been initiated. The resulting
framework of tools and engines will provide support in implementing the major
GDPR provisions and will be applicable to a broad spectrum of business domains
and processes.</p>
      <p>The BPR4GDPR lifecycle (see Figure 1) starts with the process identification
phase and proceeds with the adaptation of business processes to a GDPR-compliant
version. The adaptation phase is followed by continuous monitoring to detect
any execution deviations along the process life. Process mining discovery and
conformance/compliance techniques are the most suitable candidates for
complementing all the mentioned phases. Therefore, a privacy-aware process miner
will be developed as part of the holistic BPR4GDPR framework.</p>
      <p>This paper highlights the challenges associated with GDPR
implementation and compliance in general, and specifically the challenges
associated with granting data subjects the right to be forgotten (referred to as
RTBF in the rest of the text). A process mining perspective on the problem is
presented as well. The scenario is elaborated in the light of an automotive lead
generation use case.</p>
      <p>
        The remainder of this paper is structured as follows. Section 2 provides an
overview of the most relevant process mining techniques and of work done in
connection with implementing the GDPR in the business process landscape. Section 3
briefly discusses the RTBF and its impact on business processes. Section 4 provides
the problem definition. In Section 5 we present the use case for the scenario, while
in Section 6 we present the challenges associated with implementing the GDPR
in general and the RTBF specifically. Section 7 provides an overview of the envisaged
privacy-aware process mining.
      </p>
      <p><sup>1</sup> https://gdpr-info.eu</p>
      <p><sup>2</sup> http://www.bpr4gdpr.eu</p>
      <p>
        Process mining [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], though a relatively young discipline, has strategically
positioned itself in the business process landscape. The three main process mining
techniques, namely Process Discovery, Process Conformance Checking, and
Process Enhancement, have reached a state of maturity, especially on static event
logs of reasonable size. The realization of these techniques for streaming event data
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and for data of massive scale is still evolving. Process conformance and compliance
checking are of prime importance for GDPR compliance; therefore, we
briefly discuss the work done in these areas.
      </p>
      <p>
        Process conformance checking techniques [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] confront an event log (observed
behavior) with a business process model (desired behavior) to detect deviations
and accordingly assess how well the two agree. Process compliance checking [5–8]
aims at analyzing whether the observed behavior was executed in accordance with
requisite business rules, desired practices, regulations, etc. Compliance checking, in
contrast to the control-flow orientation of conformance checking, takes into
consideration various characteristics of the activities, such as their order of execution,
the cardinality of their execution, their predecessor and successor activities, and the
associated data or resource attributes.
      </p>
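      <p>The kind of rule that compliance checking evaluates can be illustrated with a minimal sketch. It is not taken from any of the cited techniques: the two rule types (an ordering rule and a cardinality rule), the function names and the activity names are all assumptions chosen for illustration.</p>
      <preformat>
```python
from collections import Counter


def check_precedence(trace, first, then):
    """True if no occurrence of `then` happens before the first `first`."""
    seen_first = False
    for activity in trace:
        if activity == first:
            seen_first = True
        elif activity == then and not seen_first:
            return False
    return True


def check_max_cardinality(trace, activity, limit):
    """True if `activity` occurs at most `limit` times in the trace."""
    return not (Counter(trace)[activity] > limit)


# Hypothetical traces of one case each (activity names are assumptions).
trace_ok = ["collect_consent", "process_data", "contact_lead"]
trace_bad = ["process_data", "collect_consent", "contact_lead", "contact_lead"]

order_ok = check_precedence(trace_ok, "collect_consent", "process_data")
order_bad = check_precedence(trace_bad, "collect_consent", "process_data")
count_bad = check_max_cardinality(trace_bad, "contact_lead", 1)
```
</preformat>
      <p>In this sketch the second trace violates both rules: data is processed before consent is collected, and the lead is contacted more often than the assumed limit allows.</p>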
      <p>Some rare work specifically on the GDPR's obligatory requirements, or along
similar lines, exists in the context of business processes. Petkovic et al. [9] match
audit trails with the legitimate execution sequences of a model; sequences deviating
from the legitimate execution sequences are considered infringements of the
specified processing purpose(s). Basin et al. [10], apart from algorithmically
bridging process collection to privacy policy, naively classify unused data classes
as exceptions to data minimisation.</p>
    </sec>
    <sec id="sec-2">
      <title>GDPR Right to be Forgotten</title>
      <p>In this section we introduce the RTBF and explain its impact on
existing business processes.</p>
      <sec id="sec-2-1">
        <title>RTBF</title>
        <p>Article 17 of the General Data Protection Regulation, titled "Right to erasure"
(mostly known as the right to be forgotten), entitles data subjects to ask data
controller(s) to erase their personal data and accordingly binds the
controller(s) to erase the requested personal data without undue delay. Although
the right can be denied in some exceptional cases, in the majority of situations and
business cases it remains unavoidable.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Impact of RTBF on Businesses</title>
        <p>The RTBF has a major and far-reaching impact on the majority of business
processes, starting from implementation. Conventionally, with no such obligation
in place, organizations and businesses have been processing people's personal
data, in some cases without explicit consent, to the extent required for their
businesses, retaining it for long periods, and even subletting it to other businesses
to be used for their own processing purposes.</p>
        <p>In process-model-log nomenclature, referring to Figure 2, usually only a
portion (for example "a" and "a′" in the figure) of the log "L" and the model "M" are in
conformance, while the remaining portions (for example "b" of the log "L" and "c" of
the model "M") are non-conforming. Adding the GDPR obligations
"G" to the existing process, and consequently to the model "M", pushes
the log "L" and the model "M" further apart, thus increasing non-conformance. Due to
the severe financial consequences of GDPR irregularities, such non-conformance
is more critical than conventional non-conformance, and therefore
necessitates forward compliance through model adaptation and backward
compliance through monitoring.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Problem Definition</title>
      <p>A pre-GDPR era process model, supposedly GDPR non-compliant, is a Petri
net N = (P, T, F, A), where P is a finite set of places, T is
the set of transitions, F is the set of arcs connecting places to transitions and
transitions to places, and A is the set of activities.</p>
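      <p>As a rough illustration of the tuple N = (P, T, F, A), the following sketch stores the four components as plain sets and adds the usual token-game firing rule. The class name, the marking representation (a dictionary from places to token counts) and the place, transition and activity names are assumptions made for this example only.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PetriNet:
    """N = (P, T, F, A) stored as plain frozen sets."""
    places: frozenset       # P
    transitions: frozenset  # T
    arcs: frozenset         # F, pairs (source, target)
    activities: frozenset   # A

    def preset(self, transition):
        """Input places of a transition."""
        return {src for (src, tgt) in self.arcs if tgt == transition}

    def postset(self, transition):
        """Output places of a transition."""
        return {tgt for (src, tgt) in self.arcs if src == transition}

    def enabled(self, marking, transition):
        """A transition is enabled if every input place holds a token."""
        return all(marking.get(p, 0) >= 1 for p in self.preset(transition))

    def fire(self, marking, transition):
        """Return the successor marking; the input marking is not modified."""
        if not self.enabled(marking, transition):
            raise ValueError(f"{transition!r} is not enabled")
        successor = dict(marking)
        for p in self.preset(transition):
            successor[p] -= 1
        for p in self.postset(transition):
            successor[p] = successor.get(p, 0) + 1
        return successor


# Hypothetical two-step fragment of a lead generation process.
net = PetriNet(
    places=frozenset({"p_start", "p_acquired", "p_contacted"}),
    transitions=frozenset({"t_acquire", "t_contact"}),
    arcs=frozenset({
        ("p_start", "t_acquire"), ("t_acquire", "p_acquired"),
        ("p_acquired", "t_contact"), ("t_contact", "p_contacted"),
    }),
    activities=frozenset({"acquire lead data", "contact lead"}),
)
m0 = {"p_start": 1}
m1 = net.fire(m0, "t_acquire")
m2 = net.fire(m1, "t_contact")
```
</preformat>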
      <p>In the post-GDPR era, processes need to have an erasure mechanism in place
so that data subjects Ds are able to exercise the right to be forgotten. The erasure
mechanism itself can be conceived as a (sub)process and therefore as a Petri net
N<sub>Grtbf</sub> = (P<sub>Grtbf</sub>, T<sub>Grtbf</sub>, F<sub>Grtbf</sub>, A<sub>Grtbf</sub>), where P<sub>Grtbf</sub>, T<sub>Grtbf</sub>, F<sub>Grtbf</sub> and A<sub>Grtbf</sub>
represent the places, transitions, arcs between places and transitions, and
activity labels of the GDPR erasure (sub)process model, respectively.</p>
      <p>The business process model N and the erasure process N<sub>Grtbf</sub> are disjoint in nature,
i.e., T ∩ T<sub>Grtbf</sub> = ∅. Therefore, a blanket GDPR-compliant version of N can be
obtained by overlaying the erasure process N<sub>Grtbf</sub> over the business process model
N, i.e., N + N<sub>Grtbf</sub>. Due to the disjointness of N and N<sub>Grtbf</sub>,
the overlay N + N<sub>Grtbf</sub> differs from the conventional model merging [11] and model
repair [12] techniques.</p>
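      <p>One simple way to read the overlay is as a component-wise union of the two tuples, guarded by the disjointness condition on the transition sets. The sketch below follows that reading, which is an assumption rather than the paper's formal definition; the type name, function name and example elements are hypothetical.</p>
      <preformat>
```python
from typing import NamedTuple


class Net(NamedTuple):
    places: frozenset
    transitions: frozenset
    arcs: frozenset
    activities: frozenset


def overlay(business, erasure):
    """Component-wise union of two nets whose transition sets are disjoint."""
    if not business.transitions.isdisjoint(erasure.transitions):
        raise ValueError("T and T_Grtbf must be disjoint")
    return Net(*(b.union(e) for b, e in zip(business, erasure)))


# Hypothetical business net N and erasure net N_Grtbf.
n = Net(
    frozenset({"p0", "p1"}),
    frozenset({"t_identify"}),
    frozenset({("p0", "t_identify"), ("t_identify", "p1")}),
    frozenset({"identify lead"}),
)
n_rtbf = Net(
    frozenset({"q0", "q1"}),
    frozenset({"t_erase"}),
    frozenset({("q0", "t_erase"), ("t_erase", "q1")}),
    frozenset({"erase personal data"}),
)
blanket = overlay(n, n_rtbf)
```
</preformat>
      <p>The plain union leaves the two parts unconnected; the reset arcs that let the erasure part interrupt the business part are a separate ingredient.</p>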
      <p>From a behavioral point of view, in case of an erasure request, N<sub>Grtbf</sub>
shall cease further processing of the relevant data, in other words reset N.
Therefore, N + N<sub>Grtbf</sub> becomes a reset net (P, T, F, R), where R is the function
defining the reset arcs. An erasure request in marking M shall result, without
undue delay, in a marking M′ such that M′ = (P ∪ P<sub>Grtbf</sub>) \ R(t) + P.</p>
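      <p>The effect of a reset arc can be sketched with the firing rule commonly used for reset nets: consume one token from each input place, empty every place in R(t), then produce one token in each output place. This rule is an assumption brought in for illustration and is not claimed to match the expression above term by term; the place and transition names are hypothetical.</p>
      <preformat>
```python
def fire_reset(marking, preset, postset, reset, transition):
    """Fire `transition` under the usual reset-net rule.

    marking: dict place -> token count
    preset, postset: dict transition -> set of places
    reset: dict transition -> set of places emptied when it fires (R)
    """
    if any(marking.get(p, 0) == 0 for p in preset[transition]):
        raise ValueError(f"{transition!r} is not enabled")
    successor = dict(marking)
    for p in preset[transition]:
        successor[p] -= 1                       # consume input tokens
    for p in reset.get(transition, ()):
        successor[p] = 0                        # reset arcs empty the place
    for p in postset[transition]:
        successor[p] = successor.get(p, 0) + 1  # produce output tokens
    return successor


# Hypothetical erasure transition that wipes all in-progress lead handling.
preset = {"t_erase": {"erasure_requested"}}
postset = {"t_erase": {"data_erased"}}
reset = {"t_erase": {"lead_identified", "lead_contacted"}}

before = {"erasure_requested": 1, "lead_identified": 1, "lead_contacted": 0}
after = fire_reset(before, preset, postset, reset, "t_erase")
```
</preformat>
      <p>After firing, no token remains in the business part of the net, which is the behavioral meaning of "resetting" N in this sketch.</p>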
    </sec>
    <sec id="sec-4">
      <title>Use Case</title>
      <p>Conventionally, car dealerships acquire data from multiple sources
associated with the automotive industry, for instance repair/maintenance service
providers, organizers of automotive showcase events, and automobile manufacturers.
Referring to the non-shaded top-left part of Figure 3, these sources generate and
store their customers' data through transactions and interactions, for
instance people recording interest in vehicles at showcase events.</p>
      <p>Car dealerships acquire data from these multiple sources and process it
to identify quality leads. Identified leads are then contacted for the
purpose of conversion to customers, up-selling or cross-selling (see the shaded
top-right part of Figure 3). In order to be GDPR RTBF compliant, the RTBF process
N<sub>Grtbf</sub> (see the pattern-filled lower part of Figure 3) is annexed to the process
model N, resulting in a blanket GDPR RTBF compliant version (P, T, F, R).</p>
    </sec>
    <sec id="sec-5">
      <title>Challenges</title>
      <p>The challenges associated with GDPR implementation in general, and with blanket
RTBF GDPR adaptation specifically, are presented in the light of the automotive
lead generation use case.</p>
      <sec id="sec-5-1">
        <title>Model Adaptation</title>
        <p>Blanket GDPR RTBF compliant models (P, T, F, R) look trivial to realize but
in essence pose a threat to the working model of many businesses. Devising
process models that protect business interests and ensure GDPR compliance
at the same time is a challenge for the community. An example of such a threat
in automotive lead generation is a customer (promptly) exercising the RTBF after
violating traffic rules during a test drive with the car dealer. This and many other
possible threats resulting from the blanket GDPR adaptation can potentially lead
the system to an inconsistent state. Safe models, which are compliant with the GDPR
and also care for the intricacies of the incumbent business, are intended.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Monitoring</title>
        <p>Conventional conformance/compliance techniques are mainly oriented towards
control flow and the associated characteristics of activities. The GDPR is concerned
with the privacy of the data being accessed and processed during the execution
of business process activities. In the automotive lead generation
process, identifying quality leads involves processing the personal and financial
particulars of the data subjects. Therefore, conformance/compliance techniques
in this perspective need to take into account access policies, processing
policies, data anonymisation and encryption policies, etc., and perform
conformance/compliance checking in the light of these policies. In our use case, the
unit responsible for correspondence with the leads shall have access only to
the lead data necessary for the job in hand, such as correspondence particulars, but not
to the financial details of the leads.</p>
        <p>Non-conformance to the GDPR has severe consequences, particularly financial
ones. Therefore, backward conformance/compliance checking shall not be relied
upon completely. Online conformance checking techniques, which have the potential
to detect deviations at the point in time when they happen, need to be devised
to avoid or mitigate non-conformance. Scalability is going to be a major issue
for online conformance checking techniques.</p>
        <p>It is not viable to take processes offline for adaptation(s) in the light of
non-conformance, unless complete re-designing is unavoidable. Therefore, online
model adaptation techniques need to be developed which automatically
detect any changes in business working and update the process model accordingly
at run-time in the light of the changes detected [13]. A challenge for this approach
will be to distinguish noise from real changes.</p>
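        <p>Checking an access policy event by event, as each event arrives, might look like the sketch below. It only illustrates the idea of reporting a deviation at the moment it happens; the event fields, role names and policy table are assumptions, and no claim is made that this scales to the streaming setting discussed above.</p>
        <preformat>
```python
def monitor(event_stream, access_policy):
    """Yield a violation as soon as an event reads attributes its role may not see."""
    for event in event_stream:
        allowed = access_policy.get(event["role"], frozenset())
        illegal = set(event["accessed"]) - allowed
        if illegal:
            yield event["case"], event["activity"], sorted(illegal)


# Hypothetical policy: which attributes each organizational unit may access.
policy = {
    "correspondence_unit": frozenset({"name", "email", "phone"}),
    "lead_scoring_unit": frozenset({"name", "income", "credit_score"}),
}

# Hypothetical event stream; any iterable works, including a live generator.
stream = [
    {"case": "lead-1", "activity": "score lead",
     "role": "lead_scoring_unit", "accessed": ["name", "income"]},
    {"case": "lead-1", "activity": "contact lead",
     "role": "correspondence_unit", "accessed": ["email", "income"]},
    {"case": "lead-2", "activity": "contact lead",
     "role": "correspondence_unit", "accessed": ["phone"]},
]
violations = list(monitor(stream, policy))
```
</preformat>
        <p>Here only the second event is flagged, because the correspondence unit touches a financial attribute that the assumed policy reserves for lead scoring.</p>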
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Privacy-aware process mining</title>
      <p>Process mining is conventionally oriented towards the control, data and resource
perspectives of processes. A privacy perspective for process mining shall additionally
take into consideration the privacy of the data and data artifacts manipulated by
the process activities during execution. The GDPR has different requirements
for different states of data, for instance mandatory consent acquisition from
data subjects at collection of (minimal) data, processing of data in line with the
acquired consent, encryption of data during lawful transit to third parties or
countries, and anonymisation of data at rest. As shown in Figure 4, the two
main constituents of our envisaged privacy-aware process miner are model
adaptation and conformance checking. In addition to process model(s) and process
event logs, security and privacy policies in an adequate formalism are the third
required input for the miner.</p>
      <p>The model adaptation module shall take into consideration the relevant GDPR
requirements and adapt the process in hand such that it becomes fully GDPR
compliant, while syntactically remaining as close as possible to the existing
version. Considerable changes in the underlying process model can be quantified
using one of the distance measures introduced in [13] to detect concept drifts.
Conformance checking in the privacy-aware process miner kit shall check the
conformance of the event log with the process model, taking into consideration the
relevant security and privacy policies as well.</p>
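      <p>To give a feel for what quantifying a change between two versions of a process can mean, the sketch below compares the directly-follows relations of two sets of traces with a Jaccard distance. This is a stand-in chosen for brevity and is not one of the distance measures of [13]; all names and traces are hypothetical.</p>
      <preformat>
```python
def directly_follows(traces):
    """Set of pairs (a, b) such that b immediately follows a in some trace."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}


def jaccard_distance(x, y):
    """0.0 for identical sets, 1.0 for sets with nothing in common."""
    union = x.union(y)
    if not union:
        return 0.0
    return 1.0 - len(x.intersection(y)) / len(union)


# Hypothetical behavior before and after adding an erasure path.
before = [["acquire", "identify", "contact"]]
after = [["acquire", "identify", "contact"], ["acquire", "erase"]]

drift = jaccard_distance(directly_follows(before), directly_follows(after))
```
</preformat>
      <p>A small value suggests that the adapted process stays close to the existing version, while a value near 1.0 would indicate a substantially different control flow.</p>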
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors have received funding within the BPR4GDPR project from the
European Union's Horizon 2020 research and innovation programme under grant
agreement No 787149.</p>
      <p>8. De Leoni, Massimiliano, and Wil MP van der Aalst. "Aligning event logs and process
models for multi-perspective conformance checking: An approach based on integer
linear programming." Business Process Management. Springer, Berlin, Heidelberg,
2013. 113-129.</p>
      <p>9. Petkovic, Milan, Davide Prandi, and Nicola Zannone. "Purpose control: Did you
process the data for the intended purpose?" Workshop on Secure Data
Management. Springer, Berlin, Heidelberg, 2011.</p>
      <p>10. Basin, David, Søren Debois, and Thomas Hildebrandt. "On purpose and by
necessity: compliance under the GDPR." FC. Springer, Berlin, Heidelberg (2018).</p>
      <p>11. La Rosa, Marcello, et al. "Merging business process models." OTM Confederated
International Conferences "On the Move to Meaningful Internet Systems". Springer,
Berlin, Heidelberg, 2010.</p>
      <p>12. Fahland, Dirk, and Wil MP van der Aalst. "Model repair – aligning process models
to reality." Information Systems 47 (2015): 220-243.</p>
      <p>13. Hassani, Marwan. "Concept Drift Detection Of Event Streams Using An
Adaptive Window." International ECMS Conference On Modelling And Simulation (to
appear).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          :
          <source>Process Mining: Data Science in Action, 2nd edn</source>
          . Springer, Berlin (
          <year>2016</year>
          ). https://doi.org/10.1007/978-3-662-49851-4
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>van Zelst</surname>, <given-names>Sebastiaan J.</given-names></string-name>,
          <string-name><given-names>Alfredo</given-names> <surname>Bolt</surname></string-name>,
          <string-name><given-names>Marwan</given-names> <surname>Hassani</surname></string-name>,
          <string-name><given-names>Boudewijn F.</given-names> <surname>van Dongen</surname></string-name>,
          and
          <string-name><given-names>Wil MP</given-names> <surname>van der Aalst</surname></string-name>.
          "<article-title>Online conformance checking: relating event streams to process models using prefix-alignments.</article-title>"
          <source>International Journal of Data Science and Analytics</source>
          (<year>2017</year>):
          <fpage>1</fpage>-<lpage>16</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>Adriansyah</surname>, <given-names>Arya</given-names></string-name>,
          <string-name><given-names>Boudewijn F.</given-names> <surname>van Dongen</surname></string-name>,
          and
          <string-name><given-names>Wil MP</given-names> <surname>van der Aalst</surname></string-name>.
          "<article-title>Conformance checking using cost-based fitness analysis.</article-title>"
          <source>Enterprise Distributed Object Computing Conference (EDOC), 2011 15th IEEE International</source>.
          IEEE,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>Carmona</surname>, <given-names>Josep</given-names></string-name>,
          <string-name><given-names>Boudewijn</given-names> <surname>van Dongen</surname></string-name>,
          <string-name><given-names>Andreas</given-names> <surname>Solti</surname></string-name>,
          and
          <string-name><given-names>Matthias</given-names> <surname>Weidlich</surname></string-name>.
          <source>Conformance Checking: Relating Processes and Models</source>
          . Springer,
          <year>2018</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>Ramezani</surname>, <given-names>Elham</given-names></string-name>,
          <string-name><given-names>Dirk</given-names> <surname>Fahland</surname></string-name>,
          and
          <string-name><given-names>Wil MP</given-names> <surname>van der Aalst</surname></string-name>.
          "<article-title>Where did I misbehave? Diagnostic information in compliance checking.</article-title>"
          <source>International Conference on Business Process Management</source>
          . Springer, Berlin, Heidelberg,
          <year>2012</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><surname>Ramezani</surname>, <given-names>Elham</given-names></string-name>,
          <string-name><given-names>Dirk</given-names> <surname>Fahland</surname></string-name>,
          and
          <string-name><given-names>Wil MP</given-names> <surname>van der Aalst</surname></string-name>.
          "<article-title>Supporting domain experts to select and configure precise compliance rules.</article-title>"
          <source>International Conference on Business Process Management</source>
          . Springer, Cham,
          <year>2013</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><surname>Taghiabadi</surname>, <given-names>Elham Ramezani</given-names></string-name>,
          et al.
          "<article-title>Compliance checking of data-aware and resource-aware compliance requirements.</article-title>"
          <source>OTM Confederated International Conferences "On the Move to Meaningful Internet Systems"</source>
          . Springer, Berlin, Heidelberg,
          <year>2014</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>