<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PaPPI: Privacy-aware Process Performance Indicators (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Kabierski</string-name>
          <email>martin.kabierski@hu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephan A. Fahrenkrog-Petersen</string-name>
          <email>stephan.fahrenkrog-petersen@hu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Glenn Dittmann</string-name>
          <email>glenn.dittmann@hu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Humboldt-Universität zu Berlin</institution>
          ,
          <addr-line>Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>113</fpage>
      <lpage>117</lpage>
      <abstract>
        <p>The evaluation of recorded process executions using process performance indicators (PPIs) serves as a main driver of process optimization and process monitoring. Yet, since many processes inadvertently record information about the individuals involved in them, the analysis of such data is bound by data protection regulations, such as the GDPR and the CCPA. To enable the analysis of the respective data while conforming to privacy regulations, anonymization techniques can be employed. In this work, we propose PaPPI, Privacy-aware Process Performance Indicators, a Java-based library for the definition and evaluation of process performance indicators under differential privacy. Our toolkit builds upon and extends the PPINOT library for process performance indicators, maintaining the well-established syntax and semantics of PPINOT. This way, we achieve an easy-to-use integration of privacy protection in the computation of process performance indicators.</p>
      </abstract>
      <kwd-group>
        <kwd>process mining</kwd>
        <kwd>performance indicators</kwd>
        <kwd>privacy-awareness</kwd>
        <kwd>differential privacy</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The evaluation of recorded process executions is a main driver for the analysis of
process-centric information systems. Following the common BPM life cycle, such evaluations are
the backbone of any process improvement initiative and guide the re-design of processes. The
analysis of recorded process executions may be based on techniques for conformance checking [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
compliance verification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], or the evaluation of quantifiable metrics of a process's efficiency
and effectiveness, which are commonly referred to as process performance indicators (PPIs) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
These metrics are defined by the process owner in order to communicate and monitor certain
high-level goals. Their evaluation over the recorded process executions, which are typically available in
the form of event logs, enables conclusions on the extent to which these goals are met.
      </p>
      <p>
      <sec id="sec-1-2">
        <p>
          The PPINOT metamodel [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] has been proposed as a general framework for the definition
and evaluation of PPIs. According to the PPINOT model, PPIs are composed from atomic
building blocks, so-called measure definitions. These measures are aggregated in a tree-like
manner to evaluate complex functions defined over the process instances and their attributes. In
Figure 1, we illustrate the PPINOT model with the PPI "Average Duration" that is included in
the public PPINOT example repository.<sup>1</sup> It is based on a measure that captures the time between
the occurrences of two types of events within the data recorded for one process execution. These
values are then aggregated over all process executions by computing the arithmetic mean.
        </p>
        <p>[Figure 1: The PPI "Average Duration" in PPINOT. Aggregation (Function: Mean) over a TimeMeasure (from: "EVENT 2 START MESSAGE", to: "FI closed").]</p>
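        <p>The tree structure of the PPI in Figure 1 can be sketched in plain Java as a base measure evaluated per process execution and a mean computed over all executions. The class and method names below are hypothetical stand-ins chosen for illustration; they are not part of the PPINOT or PaPPI API, and each execution is reduced to the two timestamps the measure needs.</p>
        <preformat><![CDATA[
```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Illustrative sketch only: hypothetical types, NOT PPINOT classes.
final class AverageDurationSketch {

    /** One process execution, reduced to the timestamps of the "from" and "to" events. */
    static final class Execution {
        final Instant from;
        final Instant to;

        Execution(Instant from, Instant to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Base measure: elapsed time in hours between the two events of one execution. */
    static double durationHours(Execution e) {
        return Duration.between(e.from, e.to).toMillis() / 3_600_000.0;
    }

    /** Aggregation: arithmetic mean of the base measure over all executions. */
    static double averageDurationHours(List<Execution> executions) {
        return executions.stream()
                .mapToDouble(AverageDurationSketch::durationHours)
                .average()
                .orElse(Double.NaN); // no executions: the mean is undefined
    }
}
```
]]></preformat>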
        <p>
          The recorded process executions over which PPIs are defined and evaluated often include
sensitive information about individuals involved in the process, such as knowledge workers in
traditional business processes or patients in clinical pathways. Any handling and analysis of this
data has to adhere to data protection regulations, such as the GDPR or the CCPA [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. To this
end, anonymization techniques can be employed to protect the privacy of individuals, while still
supporting the evaluation of the respective data.
        </p>
        <p>
          In recent work, we proposed a framework for privacy protection during the evaluation of
PPIs defined using the PPINOT metamodel [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Specifically, our framework provides a privacy
guarantee in terms of the well-established notion of differential privacy [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. For this purpose,
we proposed multiple privacy-preserving release mechanisms, i.e., functions that add controlled
noise to the true result of a function. In this demo, we present PaPPI, Privacy-aware Process
Performance Indicators, a Java-based library that implements the aforementioned framework.
PaPPI has been designed such that it wraps the publicly available PPINOT library, so that
users can rely on the established syntax and semantics for the definition of PPIs, while still
benefiting from the privacy protection offered by our techniques. In particular, PaPPI enables
the privacy-aware evaluation of any PPI that can be defined using the PPINOT syntax. As
such, PaPPI provides an easy-to-use way to include privacy considerations in the quantitative
analysis of process executions. While PaPPI has not been used in practice yet, we validated its
applicability for real-life scenarios in a case study on a publicly available log file [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>We first illustrate the definition of PPIs in our toolkit (Section 2), before turning to their
evaluation (Section 3). We then elaborate on the availability of our library (Section 4), before we
conclude (Section 5).</p>
        <preformat>1 //Load Log
2 LogProvider log = new MXMLLog(new FileInputStream(new File("simulation_logs.mxml")),null);
3
4 //PPI Definition
5 TimeMeasure duration = new TimeMeasure();
6 duration.setFrom(new TimeInstantCondition("EVENT 2 START MESSAGE", GenericState.START));
7 duration.setTo(new TimeInstantCondition("FI closed", GenericState.END));
8 duration.setUnitOfMeasure(TimeUnit.HOURS);
9
10 PrivacyAwareAggregatedMeasure privatizedAvg = new
11 PrivacyAwareAggregatedMeasure();
12 privatizedAvg.setBaseMeasure(duration);
13 privatizedAvg.setAggregationFunction(PrivacyAwareAggregator.AVG_LAP);
14 privatizedAvg.setEpsilon(0.1);
15 privatizedAvg.setBoundaryEstimation(BoundaryEstimator.MINMAX);
16 privatizedAvg.setId("AvgDuration");
17
18 //PPI Evaluation
19 MeasureEvaluator evaluator = new PrivacyAwareLogMeasureEvaluator(log);
20 evaluator.eval(privatizedAvg, new SimpleTimeFilter(Period.MONTHLY,1, false));</preformat>
        <p>Algorithm 1: Definition and evaluation of a PPI in PaPPI.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Defining PPIs</title>
      <p>The definition of PPIs in PaPPI closely follows the syntax of PPINOT, i.e., the different types
of measure definitions are composed in a tree-like structure to form more complex evaluation
functions. In particular, the set of available measure types consists of base measures, which
are evaluated over single process instances; aggregation measures, which aggregate information
retrieved over multiple process instances using predefined functions (Avg, Sum, Min, or Max);
and derived measures, which are user-defined functions to be evaluated over single or multiple
process instances. The MeasureDefinition classes of PPINOT are extended to include, for
each multi-instance measure in the defined tree, additional information about whether and how
it should be privatized during evaluation. In particular, if a measure is to be privatized, the value of
the privacy parameter ε, the chosen differentially private release mechanism, and a method for
estimating the bounds of the input data need to be defined. Algorithm 1, starting at line 5, shows the
definition of the PPI of Figure 1 using PaPPI. Here, we specify that the evaluation of the
aggregation measure shall be privatized using ε = 0.1, with a boundary estimation based on the
minimum and maximum value of the inputs, and the Laplace mechanism to calculate the average
of the inputs.</p>
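      <p>To make the interplay of the privacy parameter, the input bounds, and the Laplace mechanism concrete, the following is a minimal textbook-style sketch of a Laplace-noised mean. It is not PaPPI's implementation: the class and method names are hypothetical, the bounds are passed in explicitly rather than estimated, and the sketch assumes that the number of inputs is public, so that changing one clamped value shifts the mean by at most (hi - lo) / n.</p>
      <preformat><![CDATA[
```java
import java.util.Random;

// Illustrative sketch only: hypothetical names, NOT the PaPPI implementation.
final class LaplaceMeanSketch {

    /** Draws one sample from Laplace(0, scale) by inverting the CDF. */
    static double laplace(double scale, Random rng) {
        double u;
        do {
            u = rng.nextDouble() - 0.5;   // uniform in [-0.5, 0.5)
        } while (u == -0.5);              // avoid log(0) at the boundary
        return -scale * Math.signum(u) * Math.log(1.0 - 2.0 * Math.abs(u));
    }

    /** Sensitivity of the mean of n values in [lo, hi] when one value changes (n assumed public). */
    static double meanSensitivity(double lo, double hi, int n) {
        return (hi - lo) / n;
    }

    /** Mean of the inputs clamped to [lo, hi], released with Laplace noise of scale sensitivity / epsilon. */
    static double privateMean(double[] values, double lo, double hi, double epsilon, Random rng) {
        if (values.length == 0 || epsilon <= 0.0 || hi < lo) {
            throw new IllegalArgumentException("need at least one value, epsilon > 0, and lo <= hi");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.max(lo, Math.min(hi, v)); // clamping enforces the assumed bounds
        }
        double trueMean = sum / values.length;
        double scale = meanSensitivity(lo, hi, values.length) / epsilon;
        return trueMean + laplace(scale, rng);
    }
}
```
]]></preformat>
      <p>In this sketch, a smaller privacy parameter or a wider interval [lo, hi] increases the noise scale, while more inputs decrease it. For real use, a cryptographically secure source of randomness would be preferable to java.util.Random.</p>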
    </sec>
    <sec id="sec-3">
      <title>3. Evaluating PPIs</title>
      <p>For the subsequent evaluation of PPIs, we extended the LogMeasureEvaluator class of PPINOT
to enable the invocation of the specified privatized measures during evaluation. Using this new
evaluator, the evaluation of a given PPI definition, or a set thereof, can be conducted as shown in
lines 19 and 20 of Algorithm 1. Here, we specify that the PPI privatizedAvg shall be evaluated
in monthly time segments for the respective log.</p>
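      <p>The idea of evaluating a measure per monthly time segment can be sketched as follows. The types are hypothetical and independent of PPINOT's SimpleTimeFilter; in particular, which timestamp assigns a process instance to a segment, and the use of UTC, are assumptions made for this example. A privacy-aware evaluation would apply the chosen release mechanism to each segment instead of the plain mean used here.</p>
      <preformat><![CDATA[
```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative sketch only: hypothetical types, NOT PPINOT classes.
final class MonthlySegmentsSketch {

    /** A per-instance measure value together with the timestamp that assigns it to a segment. */
    static final class Sample {
        final Instant timestamp;
        final double value;

        Sample(Instant timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Groups samples by calendar month (UTC) and computes the mean of each group. */
    static SortedMap<YearMonth, Double> meanPerMonth(List<Sample> samples) {
        SortedMap<YearMonth, double[]> sumAndCount = new TreeMap<>();
        for (Sample s : samples) {
            YearMonth month = YearMonth.from(s.timestamp.atZone(ZoneOffset.UTC));
            double[] acc = sumAndCount.computeIfAbsent(month, m -> new double[2]);
            acc[0] += s.value; // running sum
            acc[1] += 1.0;     // running count
        }
        SortedMap<YearMonth, Double> means = new TreeMap<>();
        sumAndCount.forEach((month, acc) -> means.put(month, acc[0] / acc[1]));
        return means;
    }
}
```
]]></preformat>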
      <p>For a given PPI, the evaluator first determines whether the provided PPI definition is admissible
for privatization, i.e., whether evaluating the PPI with the measures marked for privatization would
properly protect all information accessed from the log. Should this not be the case, the evaluation
stops and informs the user about the problematic PPI. A given PPI definition is considered
non-admissible if either not all information retrieved from process instances would be privatized or
retrieved information would be privatized more than once during evaluation. The privatized
results can either be printed or saved in a .csv file, such as the one shown in Figure 2, which has
been generated by Algorithm 1. The file contains, for each PPI and time segment, the value obtained by the
privatized evaluation.</p>
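      <p>One way to read the admissibility condition is as a check on the measure tree: every path from the root to a leaf, where information is retrieved from process instances, must pass through exactly one privatized measure. The sketch below implements this reading with a hypothetical Node type. It is an interpretation of the condition stated above, not the check implemented in PaPPI.</p>
      <preformat><![CDATA[
```java
import java.util.List;

// Illustrative sketch only: one possible reading of admissibility, NOT the PaPPI implementation.
final class AdmissibilitySketch {

    /** A measure definition in the PPI tree; leaves retrieve information from process instances. */
    static final class Node {
        final String id;
        final boolean privatized;
        final List<Node> children;

        Node(String id, boolean privatized, List<Node> children) {
            this.id = id;
            this.privatized = privatized;
            this.children = children;
        }
    }

    /** Admissible iff every root-to-leaf path contains exactly one privatized node. */
    static boolean isAdmissible(Node root) {
        return check(root, 0);
    }

    private static boolean check(Node node, int privatizedAbove) {
        int count = privatizedAbove + (node.privatized ? 1 : 0);
        if (count > 1) {
            return false;          // information would be privatized more than once
        }
        if (node.children.isEmpty()) {
            return count == 1;     // retrieved information must be privatized exactly once
        }
        for (Node child : node.children) {
            if (!check(child, count)) {
                return false;
            }
        }
        return true;
    }
}
```
]]></preformat>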
      <p>
        Concerning the release mechanisms, we currently provide implementations of the mechanisms
proposed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], i.e., the Laplace mechanism and the Interval Mechanism for aggregation measures,
as well as the Sample-and-Aggregate mechanism for multi-instance derived measures. Because the
implementation uses a factory pattern to invoke the evaluation of the measure definitions
of a PPI based on the aforementioned specifications, it can be extended
with additional release mechanisms by providing the factory with a mapping from the
identifier chosen for the mechanism in the PPI definition to its implementation.
      </p>
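      <p>The extension point described above can be pictured as a registry that maps mechanism identifiers to constructors. The interface and class names in this sketch are hypothetical and do not mirror PaPPI's actual factory; the sketch only illustrates how a new release mechanism could be made available under a chosen identifier.</p>
      <preformat><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch only: hypothetical names, NOT PaPPI's factory.
final class MechanismFactorySketch {

    /** A release mechanism turns the inputs of a measure into a (noisy) released value. */
    interface ReleaseMechanism {
        double release(double[] inputs, double epsilon);
    }

    private final Map<String, Supplier<ReleaseMechanism>> registry = new HashMap<>();

    /** Registers a constructor for a mechanism under the identifier used in PPI definitions. */
    void register(String identifier, Supplier<ReleaseMechanism> constructor) {
        if (registry.putIfAbsent(identifier, constructor) != null) {
            throw new IllegalArgumentException("Identifier already registered: " + identifier);
        }
    }

    /** Creates the mechanism registered under the given identifier. */
    ReleaseMechanism create(String identifier) {
        Supplier<ReleaseMechanism> constructor = registry.get(identifier);
        if (constructor == null) {
            throw new IllegalArgumentException("Unknown release mechanism: " + identifier);
        }
        return constructor.get();
    }
}
```
]]></preformat>
      <p>With such a registry, adding a mechanism amounts to a single register call, and the evaluator only needs the identifier stored in the PPI definition to obtain an instance.</p>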
    </sec>
    <sec id="sec-4">
      <title>4. Availability</title>
      <p>The library is publicly available on GitHub<sup>2</sup> under the MIT license. There, we also provide
further guidance for installing the library and adding new release mechanisms. Furthermore, we
provide a screencast of the definition and evaluation of PPIs using the library.<sup>3</sup></p>
      <p>We plan to extend the library with additional features, such as an automated selection of release
mechanisms and the support of data-driven privacy-aware PPI definitions from structured file
formats.</p>
      <p><sup>2</sup> https://github.com/MartinKabierski/privacy-aware-ppinot</p>
      <p><sup>3</sup> https://youtu.be/i_WnR-ReVnE</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this demo, we presented PaPPI, a Java-based library that serves as a wrapper around the
PPINOT library, adding privacy protection in the form of differential privacy to the definition and
evaluation of PPIs. We re-used and extended the concepts of the PPINOT metamodel, so that
privacy protection can be easily integrated by users familiar with its definition and evaluation
syntax.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work has received funding from the Deutsche Forschungsgemeinschaft (DFG), grant number
421921612.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carmona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Solti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          ,
          <source>Conformance Checking</source>
          , Springer, Switzerland,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>El Kharbili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K. A.</given-names>
            <surname>de Medeiros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Business process compliance checking: Current state and future challenges</article-title>
          ,
          <source>in: Modellierung betrieblicher Informationssysteme (MobIS 2008)</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>del Río-Ortega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Resinas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cabanillas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ruiz-Cortés</surname>
          </string-name>
          ,
          <article-title>On the definition and design-time analysis of process performance indicators</article-title>
          ,
          <source>Information Systems</source>
          <volume>38</volume>
          (
          <year>2013</year>
          )
          <fpage>470</fpage>
          -
          <lpage>490</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Elkoumy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Fahrenkrog-Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Sani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Koschmider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mannhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>von Voigt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rafiei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>von Waldthausen</surname>
          </string-name>
          ,
          <article-title>Privacy and confidentiality in process mining: threats and research challenges</article-title>
          ,
          <source>ACM Transactions on Management Information Systems (TMIS)</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kabierski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Fahrenkrog-Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          ,
          <article-title>Privacy-aware process performance indicators: Framework and release mechanisms</article-title>
          ,
          <source>in: International Conference on Advanced Information Systems Engineering</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Dwork</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roth</surname>
          </string-name>
          , et al.,
          <article-title>The algorithmic foundations of differential privacy</article-title>
          ,
          <source>Foundations and Trends® in Theoretical Computer Science</source>
          <volume>9</volume>
          (
          <year>2014</year>
          )
          <fpage>211</fpage>
          -
          <lpage>407</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>