=Paper= {{Paper |id=Vol-2973/paper_138 |storemode=property |title=Process Constraint Discovery Based on Regulatory Documents and Process Execution Logs (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-2973/paper_138.pdf |volume=Vol-2973 |authors=Karolin Winter |dblpUrl=https://dblp.org/rec/conf/bpm/Winter21 }} ==Process Constraint Discovery Based on Regulatory Documents and Process Execution Logs (Extended Abstract)== https://ceur-ws.org/Vol-2973/paper_138.pdf
Process Constraint Discovery Based on Regulatory
Documents and Process Execution Logs (Extended
Abstract)
Karolin Winter 1,2
1 Department of Informatics, Technical University of Munich, Germany
2 Faculty of Computer Science, University of Vienna, Austria


Abstract
Although digitalizing business process compliance is more vital than ever, owing to the continuously
increasing amount of regulatory documents and process execution data, full-stack digitalized compliance
management is still out of reach. This thesis aims to bridge this gap by supporting both the modeling
and the discovery phase of business process compliance.

Keywords
Process Compliance, Natural Language Processing, Instance Spanning Constraints, Process Mining




1. Introduction
Business process management has become a key success factor for businesses over the past years
and targets the design, enactment, management and analysis of business processes. Business
processes describe workflows within companies, i.e., they consist of a number of events and activities,
contain decision points, and can involve actors or physical objects [1]. Besides capturing
workflows, business processes usually have to comply with constraints which emerge from
regulatory documents and are the subject of business process compliance. Although digitalizing
business process compliance is more vital than ever, owing to the continuously increasing amount of
regulatory documents and process execution data generated by, e.g., process-aware information
systems, full-stack digitalized compliance management is still out of reach. This thesis aims to
bridge this gap by supporting both the modeling and the discovery phase of business
process compliance. In the following, the contributions of the thesis are described along these
two phases, complemented by a summary of overarching contributions.


2. Business Process Compliance – Modeling Phase
Within the modeling phase, the thesis envisions discovering constraints from regulatory documents
and determining their relation to business processes.

Proceedings of the Demonstration & Resources Track, Best BPM Dissertation Award, and Doctoral Consortium at BPM
2021 co-located with the 19th International Conference on Business Process Management, BPM 2021, Rome, Italy,
September 6-10, 2021
karolin.winter@tum.de (K. Winter)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

2.1. Research Gaps
Several state-of-the-art approaches have addressed the problem of extracting business process
models and constraints from natural language text (cf., e.g., [2, 3]). Most of them take so-called
process descriptions as input, which differ fundamentally from regulatory documents. Yet
regulatory documents constitute the primary source of constraints [4]. Therefore, this thesis
addresses the question: How to extract and represent constraints from regulatory documents in an
automated way? In order to estimate whether constraints were captured correctly during the
modeling phase, the thesis also addresses the question: How to assess compliance between regulatory
documents and process models?

2.2. Contributions
Text fragmentation and constraint discovery [5]. A method chain was developed that combines
existing text mining techniques with the novel idea of applying text fragmentation to improve
the results of, e.g., clustering algorithms. It enables the retrieval of relevant
constraints from regulatory documents and is suited to extracting the main characteristics related
to planning and implementing regulatory documents as well as best practices in existing
standards and guidelines.

Constraint grouping and determination of constraint relations [6]. After constraints
have been discovered from regulatory documents, the thesis presents three options for grouping
them based on, e.g., topics or stakeholders by employing natural language processing techniques.
This facilitates and accelerates the installation and maintenance of regulatory documents and
related processes, shortens the consulting procedure, and reduces the overall implementation
costs. Moreover, it is crucial to enable a content-wise comparison among constraints. Imagine a
regulatory document containing the following constraints, which are spread across the document:
i) “A data owner has 5 days to delete this information.” and ii) “This information has to be
deleted within 5 days by the data owner.” A user reading this document from beginning to end might
not be aware of such redundancies. In the best case this causes additional implementation
effort, in the worst case implementation errors. The thesis therefore provides definitions for
determining redundant, subsuming and contradicting constraints, which are depicted in a
graph-based representation, a constraint network map.
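
To illustrate how a redundancy such as the one between constraints i) and ii) might be flagged automatically, the following minimal sketch compares two constraints by the overlap of their content words. The stop-word list, the five-character truncation used in place of stemming, the Jaccard measure, and the threshold of 0.5 are all assumptions made for this illustration; they are not the definitions of redundancy given in [6].

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "has", "have", "to", "be", "by", "this", "within", "is"}


def content_words(sentence):
    """Return the set of crudely normalized content words of a sentence."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    # Truncating to five characters is a crude stand-in for stemming,
    # so that "delete" and "deleted" map to the same key.
    return {token[:5] for token in tokens if token not in STOP_WORDS}


def jaccard(first, second):
    """Jaccard similarity of the content-word sets of two constraints."""
    a, b = content_words(first), content_words(second)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def possibly_redundant(first, second, threshold=0.5):
    """Flag a pair of constraints as a redundancy candidate (assumed threshold)."""
    return jaccard(first, second) >= threshold


c1 = "A data owner has 5 days to delete this information."
c2 = "This information has to be deleted within 5 days by the data owner."
c3 = "The manager approves the request."  # hypothetical, unrelated constraint

print(possibly_redundant(c1, c2))  # the two example constraints share all content words
print(possibly_redundant(c1, c3))  # no shared content words
```

A word-overlap measure of this kind can only propose candidates: two constraints that differ in a single value, e.g., the number of days, would still overlap strongly although they contradict each other.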

Derivation of mixed graphs [7]. In order to bridge the gap between constraints and business
processes, the thesis employs the concept of mixed graphs. A mixed graph mirrors a paragraph
within a regulatory document that contains at least one constraint and establishes control flow
connections where possible. This yields a representation of regulatory documents that moves towards
formal process models.
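
A mixed graph is a graph that may contain both directed and undirected edges. The following sketch shows one possible data structure, assuming that directed edges stand for control flow connections that could be established and undirected edges link elements whose order is unknown. This interpretation, the class `MixedGraph`, and the activity names are assumptions for illustration, not the construction defined in [7].

```python
from dataclasses import dataclass, field


@dataclass
class MixedGraph:
    """A graph holding directed and undirected edges over the same node set."""
    nodes: set = field(default_factory=set)
    directed: set = field(default_factory=set)    # (source, target) pairs
    undirected: set = field(default_factory=set)  # frozenset({u, v}) pairs

    def add_directed(self, source, target):
        """Add an ordered connection, e.g., a control flow edge."""
        self.nodes.update((source, target))
        self.directed.add((source, target))

    def add_undirected(self, u, v):
        """Add an unordered connection between two nodes."""
        self.nodes.update((u, v))
        self.undirected.add(frozenset((u, v)))

    def successors(self, node):
        """Nodes reachable from `node` via one directed edge."""
        return {t for (s, t) in self.directed if s == node}

    def neighbours(self, node):
        """Nodes linked to `node` by an undirected edge."""
        return {n for edge in self.undirected if node in edge for n in edge if n != node}


# Hypothetical paragraph: the order of the first two activities is known,
# the position of the notification is not.
graph = MixedGraph()
graph.add_directed("receive request", "delete information")
graph.add_undirected("delete information", "notify data owner")

print(graph.successors("receive request"))
print(graph.neighbours("delete information"))
```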

Compliance assessment of regulatory documents and process models [8]. Another
key contribution of the thesis is an approach for assessing the compliance between
regulatory documents and process models, which is the first work of its kind. We developed
a fitness score quantifying the likelihood that a paragraph pertains to a model as well as a
cost score quantifying the distance between the obligations expressed in paragraphs and their
implementation by a model. Within the latter we covered three essential types of compliance
violations: missed obligatory activities, strict order violations, and resource responsibility violations.

3. Business Process Compliance – Discovery Phase
Within the discovery phase, constraints are divided into intra instance constraints, i.e.,
constraints affecting only one process instance, and so-called instance spanning constraints (ISC),
i.e., constraints spanning multiple process instances of one or multiple process types [9]. Here, a
process type is described by a process model, and a process instance represents one
execution of a process type. Process execution logs constitute the data source within the
discovery phase of business process compliance.

3.1. Research Gaps
Current approaches for extracting knowledge from process execution logs are related to
process discovery, e.g., decision mining [10], or declarative process mining, e.g., [11]. Yet these
approaches are only capable of discovering intra instance constraints and do not envision the
discovery of ISC. Therefore, this thesis poses the question: How to extract and represent ISC from
process execution logs? Since not just the discovery of ISC but also a formalization of ISC was
lacking, we further addressed the question: How to support integration, transparency, and usage of ISC?

3.2. Contributions
Categorization of instance spanning constraints [12]. To address ISC discovery from
process execution logs, we established a categorization of ISC based on an extensive set of
real-world ISC examples. We identified five distinct categories. Four of them, i.e., simultaneous
execution, constrained execution, ordered execution and non-concurrent execution, trace back
to the synchronization of instances. The fifth category relates to exception handling of ISC.
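
What distinguishes an ISC from an intra instance constraint is that it can only be evaluated by looking at several instances together. The sketch below checks a non-concurrent execution constraint on a toy log, assuming that each event carries an instance identifier, an activity name, and numeric start and end timestamps. The log layout, the function name, and the activity "use machine" are assumptions for illustration and are not taken from [12].

```python
from itertools import combinations


def overlapping_executions(log, activity):
    """Return pairs of instance ids whose executions of `activity` overlap in time."""
    events = [event for event in log if event["activity"] == activity]
    conflicts = []
    for a, b in combinations(events, 2):
        different_instances = a["instance"] != b["instance"]
        # Two half-open intervals [start, end) overlap if each starts before the other ends.
        overlap = a["start"] < b["end"] and b["start"] < a["end"]
        if different_instances and overlap:
            conflicts.append(tuple(sorted((a["instance"], b["instance"]))))
    return conflicts


# Hypothetical log: three instances use the same machine.
log = [
    {"instance": "i1", "activity": "use machine", "start": 0, "end": 5},
    {"instance": "i2", "activity": "use machine", "start": 3, "end": 8},
    {"instance": "i3", "activity": "use machine", "start": 8, "end": 10},
]

print(overlapping_executions(log, "use machine"))
```

Instances i1 and i2 overlap between time 3 and 5, whereas i3 starts exactly when i2 ends and therefore does not conflict under the half-open interval assumption.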

Algorithms for discovering instance spanning constraints [12, 13]. Within the thesis,
we developed two ISC discovery approaches. The first is semi-automatic and follows the idea of the
decision miner, i.e., it first prepares process execution logs such that so-called ISC decision points
are revealed and then applies decision tree algorithms to determine the underlying
decision rules. The second approach is based on the ISC categorization and is automatic, yet, owing to
well-selected parameters, adjustable to various settings.
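
As a rough impression of what a category-based, parameterized discovery step could look like, the following sketch searches a log for candidates of the simultaneous execution category. It assumes that a log is a list of (instance, activity, start timestamp) tuples and that an activity is a candidate if a sufficiently large share of its executions start together with an execution in another instance. The function, the parameter `min_share`, and the example log are hypothetical; the actual algorithms are described in [12, 13].

```python
from collections import defaultdict


def simultaneous_candidates(log, min_share=0.8):
    """Return activities whose executions mostly start at the same time across instances.

    log:       list of (instance, activity, start) tuples
    min_share: assumed threshold on the share of executions that start
               together with an execution of another instance
    """
    # activity -> start timestamp -> set of instances starting at that time
    by_activity = defaultdict(lambda: defaultdict(set))
    for instance, activity, start in log:
        by_activity[activity][start].add(instance)

    candidates = {}
    for activity, starts in by_activity.items():
        total = sum(len(instances) for instances in starts.values())
        shared = sum(len(instances) for instances in starts.values() if len(instances) > 1)
        share = shared / total
        if share >= min_share:
            candidates[activity] = share
    return candidates


# Hypothetical log: registration happens individually, shipping happens in one batch.
log = [
    ("i1", "register", 1), ("i2", "register", 2), ("i3", "register", 4),
    ("i1", "ship", 9), ("i2", "ship", 9), ("i3", "ship", 9),
]

print(simultaneous_candidates(log))
```

Lowering `min_share` makes the sketch more tolerant of noise in the log at the price of more false candidates, which is the kind of trade-off an adjustable parameter is meant to expose.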

Formalization of instance spanning constraints in terms of patterns [14]. In order to
support the integration, transparency, and usage of ISC, we developed ISC patterns based on
timed colored workflow nets and Proclets. This fosters ISC comparison and consequently an
assessment of whether processes obeyed the imposed ISC during process execution.


4. Overarching Contributions
Business process compliance life cycle. The outlined contributions are each linked to, but not
strictly limited to, one phase of business process compliance. The compliance assessment
approach can, e.g., be employed to check whether executed processes obeyed the constraints
imposed on them by mining a process model from a process execution log. ISC patterns can
also be employed during the modeling phase, and the selected formalism has well-defined
formal execution semantics, enabling a smooth transformation into executable process code.

Relevance and implication to practice. Throughout the thesis, care was taken that the outlined
approaches constitute solutions to relevant practical problems. On the one hand, we evaluated
the concepts based on manifold user stories for three different expert groups; on the other hand,
prototypes such as RegMiner [15] were further improved through interviews with domain experts.

Evaluation and implementation. Within the evaluation, several evaluation methods were employed,
ranging from expert interviews and case studies to common measures like precision and recall.
The artificial as well as real-world data sets stemmed from several domains,
e.g., security, health care, manufacturing or the financial domain, demonstrating the
domain-independent applicability of the presented approaches. Moreover, six command-line as well as
two web-based prototypes [15, 16] were developed. The latter provide a low-threshold entry
for non-technical users, again ensuring the transfer of research results to practice.


5. Conclusion
Within this thesis, novel concepts and algorithms for the (semi-)automatic discovery and
contextualization of compliance constraints from regulatory documents as well as process execution
logs were presented. They constitute a fundamental step towards digitalized compliance
management and can significantly relieve compliance and domain experts. Further research directions
can target exception handling for ISC or compliance assessment of regulatory documents against
further textual documents such as policies or handbooks.


Acknowledgments
This thesis has been partly funded by the Vienna Science and Technology Fund (WWTF) through
project NXT19-003.


References
 [1] M. Dumas, M. La Rosa, J. Mendling, H. A. Reijers, Fundamentals of Business Process
     Management, Second Edition, Springer, 2018. doi:10.1007/978-3-662-56509-4.
 [2] H. van der Aa, C. Di Ciccio, H. Leopold, H. A. Reijers, Extracting declarative process
     models from natural language, in: Advanced Information Systems Engineering, CAiSE,
     2019, pp. 365–382. doi:10.1007/978-3-030-21290-2_23.
 [3] F. Friedrich, J. Mendling, F. Puhlmann, Process model generation from natural language
     text, in: Advanced Information Systems Engineering, CAiSE, 2011, pp. 482–496.
     doi:10.1007/978-3-642-21640-4_36.
 [4] L. T. Ly, S. Rinderle-Ma, D. Knuplesch, P. Dadam, Monitoring business process compliance
     using compliance rule graphs, in: On the Move to Meaningful Internet Systems: OTM,
     2011, pp. 82–99. doi:10.1007/978-3-642-25109-2_7.
 [5] K. Winter, S. Rinderle-Ma, W. Grossmann, I. Feinerer, Z. Ma, Characterizing regulatory
     documents and guidelines based on text mining, in: On the Move to Meaningful Internet
     Systems. OTM, 2017, pp. 3–20. doi:10.1007/978-3-319-69462-7_1.
 [6] K. Winter, S. Rinderle-Ma, Detecting constraints and their relations from regulatory
     documents using NLP techniques, in: On the Move to Meaningful Internet Systems. OTM,
     2018, pp. 261–278. doi:10.1007/978-3-030-02610-3_15.
 [7] K. Winter, S. Rinderle-Ma, Deriving and combining mixed graphs from regulatory
     documents based on constraint relations, in: Advanced Information Systems Engineering,
     CAiSE, 2019, pp. 430–445. doi:10.1007/978-3-030-21290-2_27.
 [8] K. Winter, H. van der Aa, S. Rinderle-Ma, M. Weidlich, Assessing the compliance of
     business process models with regulatory documents, in: Conceptual Modeling, ER, 2020,
     pp. 189–203. doi:10.1007/978-3-030-62522-1_14.
 [9] M. Leitner, J. Mangler, S. Rinderle-Ma, Definition and enactment of instance-spanning
     process constraints, in: Web Information Systems Engineering - WISE, 2012, pp. 652–658.
     doi:10.1007/978-3-642-35063-4_49.
[10] A. Rozinat, W. M. P. van der Aalst, Decision mining in ProM, in: Business Process
     Management, BPM, 2006, pp. 420–425. doi:10.1007/11841760_33.
[11] F. M. Maggi, A. J. Mooij, W. M. P. van der Aalst, User-guided discovery of declarative
     process models, in: Proceedings of the IEEE Symposium on Computational Intelligence
     and Data Mining, CIDM, 2011, pp. 192–199. doi:10.1109/CIDM.2011.5949297.
[12] K. Winter, F. Stertz, S. Rinderle-Ma, Discovering instance and process spanning
     constraints from process execution logs, Inf. Syst. 89 (2020) 101484.
     doi:10.1016/j.is.2019.101484.
[13] K. Winter, S. Rinderle-Ma, Discovering instance-spanning constraints from process
     execution logs based on classification techniques, in: IEEE International Enterprise Distributed
     Object Computing Conference, EDOC, 2017, pp. 79–88. doi:10.1109/EDOC.2017.20.
[14] K. Winter, S. Rinderle-Ma, Defining instance spanning constraint patterns for business
     processes based on Proclets, in: Conceptual Modeling, ER, 2020, pp. 149–163.
     doi:10.1007/978-3-030-62522-1_11.
[15] K. Winter, M. Gall, S. Rinderle-Ma, RegMiner: Taming the complexity of regulatory
     documents for digitalized compliance management, in: BPM-D, 2020, pp. 112–116.
[16] F. Stertz, K. Winter, S. Rinderle-Ma, SVIPEX: A web service for discovering and visualizing
     instance spanning constraints based on process execution logs, in: BPM-D, 2020, pp.
     117–121.