<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>W. M. P. van der Aalst, A. J. M. M. Weijters, Process mining: a research agenda, Computers
in Industry</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1109/CIDM.2013.6597227</article-id>
      <title-group>
        <article-title>Optimization in Digital Pathology: A status report</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patrick Stünke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sabine Leh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Friedemann Leh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Pathology, Process Mining, Workflow Modelling, Event Log Data, Process Analysis, Data Management</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Pathology, Haukeland University Hospital</institution>
          ,
          <addr-line>Bergen</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Research and Development, Haukeland University Hospital</institution>
          ,
          <addr-line>Bergen</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>53</volume>
      <issue>2004</issue>
      <fpage>19</fpage>
      <lpage>24</lpage>
      <abstract>
        <p>Pathology is the study of causes and effects of diseases. It is an integral part of medical diagnostics based on the microscopic analysis of tissue, cells, or body fluids. Like other medical disciplines, pathology is currently undergoing a “digital transformation”, i.e., witnessing a transition from the assessment of physical tissue slides under a microscope towards analysing digital images of the same tissue slides on a computer screen. The recent advent of powerful machine learning methods and tools for digital image analysis opened the door to novel ways of conducting pathological diagnostics. Still, in order to yield a digital image of a specimen, the specimen has to pass through an elaborate multi-stage preparation process in the laboratory. We argue that in order to achieve a holistic framework of digital pathology, one must not only consider digital image analysis techniques, but also means for analysing the process as a whole. Concretely, we propose to analyse the event log data of the laboratory information system in order to understand flow patterns of specimens, find bottlenecks, predict the amounts of incoming samples, and plan resource allocations in an optimal manner. This is highly relevant to meet the ever-increasing number and complexity of specimens that are handled by pathology departments around the world. The data science method working with event data is called process mining. Process mining is a relatively young but growing research discipline that seeks to bridge the gap between classic data science and business process management. It enables the discovery of control flow structures, data flow patterns, resource utilization, process performance, and more. This paper represents a report on the current state of a work-in-progress project on process mining at a large regional hospital in Western Norway.
The main contribution of this report is a list of concrete challenges that we encountered when conducting a process mining project in pathology, some of which, we believe, have received less attention in the literature so far. Concretely, we find current process mining techniques not perfectly suited to be directly applied to the pathology laboratory process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>“Good health and well-being” is the third of the United Nations’ 17 sustainable development
goals. Health care is an integral public service that every government around the globe has to
provide for its population. A growing and increasingly older population combined with the
limited availability of trained medical personnel complicates the delivery of such health care
services, e.g., in its 2013 report the OECD states that health care accounts for roughly 10% of the
gross domestic products of its member states, and it is expected that this number will grow
even further in the future [1]. Information and communication technology (ICT) is seen as an
opportunity to alleviate the aforementioned issue by supporting health care professionals in
their daily work and by offloading repetitive, and thus automatable, tasks onto machines in
order to utilize the limited human resources more efficiently [2].</p>
      <sec id="sec-1-1">
        <title>A traditional application of ICT are information systems [3, 4], which make the “right” information, at the “right time”,</title>
        <p>available to the “right people”. A well-known example of health care information systems are
electronic health record systems [5]. Another application of ICT in healthcare lies in the area of
computer aided diagnostics. The latter is mainly facilitated by recent breakthroughs in the field
of artificial intelligence/machine learning (AI/ML) in the context of medical image analysis [6, 7].</p>
        <p>Pathology is a diagnostic medical discipline that, through microscopy of tissues, cells, and
fluids, often in combination with molecular diagnostics, determines the presence of diseases
as well as morphological and molecular abnormalities. With the increasing availability of
so-called whole slide scanners and image viewing software, pathology, too, becomes more and
more digitized [8], i.e., the examination is not performed under a microscope anymore but on
a computer screen. The latter enables the application of AI/ML methods for automatic image
analysis [9, 10]. Still, in order to arrive at a diagnostic result, specimens first have to undergo an
elaborate preparation process. While “classical data science” methods are commonplace
in image analysis (classification, clustering, etc.), “process data science” is less prevalent. The
latter is also known as process mining [11], i.e., the discovery of process models from event log
data. There are several reports on successful applications of process mining in healthcare, see
[12] for a comprehensive survey, but none for pathology in particular.</p>
        <p>The goal of this paper is to present an ongoing project at the pathology department of a
regional hospital on the west coast of Norway, namely Haukeland universitetssjukehus, in the
following abbreviated as HUS. In this project, the present authors are applying process mining
techniques to the preparation workflow of specimens in the pathology laboratory in order to
analyse cycle times, detect possible bottlenecks, and, in the long run, optimize the flow times of
the samples. The project is still in an early stage, but already from first experiences, we can
report on some issues that have received less attention in the process mining literature. Thus,
the goal of this paper is to shed more light on the possibilities of process mining in pathology,
the intricacies that arise on the organizational, methodical, technical, and social level when
conducting such a project, as well as to present our approach to addressing these problems.</p>
        <sec id="sec-1-1-1">
          <title>The paper is structured as follows: Section 2 introduces the problem and solution domain of</title>
          <p>this project, namely pathology and process mining. Afterwards, section 3 presents the project
itself, its context, and goals. Section 4 presents the challenges, which we have been facing until
now, those we are facing right now, and those we are expecting to encounter in the future.
Section 5 presents our approach to one of our main challenges, i.e., managing sensitive and
heterogeneous data. Eventually, section 6 concludes this paper.</p>
          <p>[Figure 1: The specimen preparation process: 1. Accessioning → 2. Grossing → 3. Processing → 4. Embedding → 5. Sectioning → 6. Staining, transforming Specimens → Cassettes → Processed Cassettes → Blocks → Slides → Stained Slides]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>2.1. Problem Domain: Pathology
The term pathology comprises the two Greek words “pathos” (suffering) and “logos” (study),
hence, it literally translates as the “study of diseases”. Today, it is understood in the more narrow
sense of the “study of causes and effects of diseases”. A pathologist takes the role of a consultant
towards another clinician, who is exerting primary care to a patient. The primary clinician
takes a specimen from the patient, e.g., a tissue sample, and sends it to the pathologist, who
examines the specimen and writes a report, most often with a conclusive diagnosis, which will
help the clinician decide on the further treatment, e.g. whether surgery or chemotherapy has
to be scheduled. The historical development of pathology is closely related to the historical
development of medicine itself and is characterized by several technological breakthroughs.
Starting with cultural changes in Europe during the 16th century, autopsy (i.e., the examination
of human corpses) became possible, elucidating the understanding of the human body, its organs,
and the effects of diseases. With the use of the microscope to study body tissues during the 19th
century, histology, a.k.a. microscopic anatomy, was established as a discipline. Most recently,
methods and techniques within immunohistochemistry and molecular biology enabled further
means to understand and diagnose diseases on the cellular and molecular level. When talking
about pathology, one distinguishes between the sub-disciplines autopsy, histology, cytology
(analysis of cell specimens), and molecular pathology. Here, we will focus on histopathology.</p>
      <p>In order to yield a histological slide, which can be analysed under a microscope, the specimen
has to undergo a preparation process. This process is abstractly visualized in Fig. 1 in the form
of a petri net [13]. When a specimen arrives at the pathology laboratory, it is first assigned
to a case (“Accessioning”), i.e. various metadata (patient data, information about the sample
type, clinical inquiries) are aggregated in the laboratory information system (LIS), a priority is
assigned, and the specimens are labelled with a lab-internal identifier. In most modern labs, this
identifier has the form of an industrial barcode, which enables electronic tracing throughout
the process. When the specimen has been immersed in a fixative solution (e.g., formalin) for
a sufficient amount of time, it can be delivered to the next stage of the process: “Grossing”.
Here, the tissue is examined on a macroscopic level (i.e., “with the naked eye”) for abnormal
findings and marked. In case of larger specimens, slices with findings of interest are selected
from the specimen. Tissues are placed in a cassette and delivered to “Processing”. This step is
performed by a specialized machine that automates dehydration, clearing and infiltration of
the tissue with paraffin wax. Afterwards, the processed tissue is taken to “Embedding”. This
means that it is placed in molten paraffin wax to form a so-called block. The cooled-down
paraffin block is mounted on a microtome, which allows cutting very thin slices (∼3-4 µm) from
the tissue-paraffin-block. The slices are placed on a glass slide and delivered to the “Staining”
process step. Here, the slide is put through different chemicals, which amplify contrasts and
highlight certain biological structures, e.g. haematoxylin stains cell nuclei blue and eosin stains
cell bodies (cytoplasm) red. Finally, a protective cover-slip is mounted on top of the stained
tissue slice and the slide is ready to be analysed by a pathologist.
2.2. Solution Domain: Process Mining</p>
      <sec id="sec-2-1">
        <title>Process Mining is a scientific approach that bridges business process management (BPM) and</title>
        <p>data science. The former is an interdisciplinary field with roots in Taylor’s theory of “Scientific
Management” [14] and gained significant attention during the 90s when enterprise resource
planning software and process-aware information systems were introduced in many
organizations [15, 16]. BPM advocates organizing a business around the services that are delivered
and the processes that are executed. The associated academic discipline is concerned with
all aspects of identifying, analysing, and (re-)designing such business processes. Data science
is another interdisciplinary field that brings together statistics, computer science and other
related disciplines [17]. Its increasing popularity and significance is mainly due to the abundant
availability of “big” data, allowing businesses to gain new insights [18].</p>
        <p>While “classic” data science focuses on the derivation of prediction variables (structural
features) from a set of given predictor variables, process mining is about the discovery of
process models (dynamical features) from event data. Process mining started as a project
proposal in the late 90s at Technical University of Eindhoven and has since then grown into
its own discipline, with an active community holding conferences and workshops. In terms
of publications, [19] and [20] are considered to be the seminal papers in this line of research,
while the textbook [11] provides the most recent comprehensive overview over the field.</p>
        <p>The principal idea of process mining is sketched in Fig. 2: the base data set is called an
event log: a collection of events, where each event must at least contain (i) a case identifier
(to group a set of events w.r.t. a case), (ii) a timestamp (to order the execution of activities
within a case), and (iii) the name of an activity (to identify the activities within a case). The
first step after obtaining an event log is to identify the control flow structure of the process
model, i.e., the order in which activities may be executed; this is called “play-in” [11]. With a
control flow model and an event log at hand, one may do a “replay” [11]. This means to simulate
the execution of the event log on the control flow model, which enables conformance checking
[21], i.e., verifying whether there are deviations between the process model and the event log.</p>
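<p>To make the notion concrete, the minimal event log structure and the grouping of events into traces (the input for play-in) can be sketched in a few lines of Python; the records and field names below are our own illustration, not taken from the project’s LIS:</p>

```python
from collections import defaultdict

# A minimal event log: every event carries (i) a case identifier,
# (ii) a timestamp, and (iii) an activity name.  All rows are invented.
events = [
    {"case": "S-01", "time": "2023-05-02T08:10", "activity": "Accessioning"},
    {"case": "S-02", "time": "2023-05-02T08:15", "activity": "Accessioning"},
    {"case": "S-01", "time": "2023-05-02T09:40", "activity": "Grossing"},
    {"case": "S-01", "time": "2023-05-02T11:05", "activity": "Processing"},
    {"case": "S-02", "time": "2023-05-02T10:20", "activity": "Grossing"},
]

# Group events by case and order them by timestamp: the resulting
# activity sequences ("traces") are the input for control-flow discovery.
traces = defaultdict(list)
for e in sorted(events, key=lambda e: e["time"]):
    traces[e["case"]].append(e["activity"])

print(dict(traces))
# {'S-01': ['Accessioning', 'Grossing', 'Processing'],
#  'S-02': ['Accessioning', 'Grossing']}
```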
        <sec id="sec-2-1-1">
          <title>Moreover, one is able to discover additional perspectives of a process model. These perspectives</title>
          <p>are called data, resource, and time [11]. The data perspective highlights how certain properties of
a process instance affect the paths that the case takes in the control flow structure. The resource
perspective highlights the resources that are required for the execution of certain activities.</p>
          <p>[Figure 2: an Event Log is mined into a Process Model covering control flow, resources, data, and time]</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>The time perspective looks into the execution times of activities and cases. Hence, process mining not only encompasses the discovery of control flow, but also enables the detection of bottlenecks (performance analysis) and hidden dependencies (data flow analysis).</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. The Project’s Goal</title>
      <sec id="sec-3-1">
        <title>According to [22], a (process) data science project may seek answers to one or more of the</title>
        <p>following questions:
• “What happened?”
• “Why did it happen?”
• “What will happen?”
• “What is the best that can happen?”</p>
        <p>These questions can be associated with four activities: report, analyse, predict, and plan, which
are depicted in the upper left quadrant of Fig. 3. These activities also correspond to the sub-goals
of our project at HUS. The overarching objective of the project is to reduce the overall cycle time
in the pathology department, i.e., the time from receiving a specimen to sending a diagnostic
report back. This is a highly relevant concern, because of four key challenges the pathology
department at HUS is facing: long cycle times, an increasing number of specimens, an increasing
number of analyses per specimen, and a more or less constant number of resources. In order
to answer the first question, “What happened?”, a precise model of the current process
is required. The right-hand side of Fig. 3 shows three different types of models. Descriptive
models (e.g., process diagrams, statistical indicators, or plots) are simplified representations
of reality and are a result of the reporting activity. Predictive models (e.g., simulations) allow
making forecasts about the future and are produced in the prediction activity based on existing
descriptive models. They play an important role for achieving prescriptive models. The latter
are also called specifications. They steer how the actual operational tasks are performed. There
are many examples of prescriptive models: they range from more abstract work guidelines (e.g.,
in what order blocks shall be cut on a microtome) over daily plans (e.g., worker assignments
to process steps) to the level of concrete machine instructions (e.g., a computer program that
routes specimens into different pathways). The ultimate objective of this project is to create
such prescriptive models in order to reduce the cycle times in the laboratory.</p>
        <p>[Figure 3: Activities (report, analyse, predict, plan, operate) and the artefacts they produce and consume: descriptive models (process diagrams/maps, statistics such as average cycle time, visualisations such as histograms), predictive models (simulations, queueing models), and prescriptive models (optimisation models (LP, ILP, DP, ...), algorithms and data models, executable workflow models, work guidelines)]</p>
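<p>As an illustration of the simplest kind of descriptive model, the average cycle time can be computed directly from per-case start and end timestamps; the cases and figures below are invented:</p>

```python
from datetime import datetime

# Hypothetical accessioning and report timestamps for three cases; the
# cycle time is the span from receiving a specimen to sending the report.
cases = {
    "S-01": ("2023-05-02T08:10", "2023-05-04T14:00"),
    "S-02": ("2023-05-02T08:15", "2023-05-03T16:30"),
    "S-03": ("2023-05-02T09:00", "2023-05-05T10:00"),
}

def cycle_time_hours(start: str, end: str) -> float:
    fmt = "%Y-%m-%dT%H:%M"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 3600

# A descriptive statistical indicator of the current process.
avg = sum(cycle_time_hours(s, e) for s, e in cases.values()) / len(cases)
print(f"average cycle time: {avg:.1f} h")  # average cycle time: 53.0 h
```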
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Challenges …</title>
      <sec id="sec-4-1">
        <title>For the first phase of our project (i.e., reporting), process mining has been selected as a methodology. In this section, we want to report on our own experiences one and a half years after starting a process mining project in pathology.</title>
        <p>In [11], v.d. Aalst presents the so-called ∗-model of process mining. It describes the
architecture, stages, and activities of a process mining project. It is motivated by CRISP-DM [23], a
cross-industry reference model for conducting data science projects. Fig. 4 contains a graphical
depiction of the ∗-model, taken from [11], augmented with situations where we experienced
resp. expect to experience concrete issues. Our project currently finds itself in stage two of this
model. Thus, this section mainly focuses on the first three issues.
4.1. … until now, …
In the preliminary stage of a ∗-project, one has to justify the purpose of the project and to apply
for data access. Since we are conducting our project within the health care domain, there are
especially strict requirements concerning access to data: the project had to apply for exemption
from the duty of confidentiality, to do a data protection impact assessment, to carry out a risk
analysis and to establish a data management plan.</p>
        <p>[Figure 4 legend: 1. Organizational Issues, 2. Technical Issues, 3. Methodological Issues, 4. Practical Issues, 5. Social Issues]</p>
      </sec>
      <sec id="sec-4-2">
        <title>Issue #1 (Organizational)</title>
      </sec>
      <sec id="sec-4-3">
        <title>There are cyclic dependencies when writing data access applications.</title>
      </sec>
      <sec id="sec-4-4">
        <title>When writing these applications, we experienced that in order to provide the required</title>
        <p>documentation abouwthat kind of data we need to extract ahnowd we are planning to safeguard
privacy concerns, extensive knowledge about the database of the laboratory information system
was required. To overcome this “chicken-and-egg” problem, it was essential to identify key
personalities that have both clearance for accessing the database (because of their regular job
description) and a suficient understanding of the objectives of the process mining project.</p>
      </sec>
      <sec id="sec-4-5">
        <title>From our experience, this can be a challenging endeavour because these persons are often</title>
        <p>occupied with their operational work. A complementary approach is to have the process data
scientists sign respective non-disclosure agreements (NDAs). This requires already having a legal
framework in place. Otherwise, legal personnel have to be involved in the project. In
our case, the solution was to employ the primary technical investigator at the hospital.</p>
      </sec>
      <sec id="sec-4-6">
        <title>After applications are approved, the first stage of the process mining project (extraction) is entered. Here, the goal is to obtain event logs, which can be processed by process mining algorithms.</title>
      </sec>
      <sec id="sec-4-7">
        <title>Issue #2 (Technical)</title>
      </sec>
      <sec id="sec-4-8">
        <title>The source information system generally does not offer a viable event log</title>
        <p>structure.</p>
        <p>The concept of an event log has been introduced in Sect. 2.2. In our case, the LIS logs all
types of analyses performed on specimens, including histological slide preparation. The main
challenge, however, is that the LIS database does not directly provide the relevant events. The
latter have to be extracted by combining records from several tables. In addition, not every
process step is always tracked. For example, in our lab, there is no explicit registration of when
the staining of a slide begins (there is only a notification when it is finished). However, it is
possible to infer the start timestamp when knowing the staining programme that was executed.</p>
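<p>The inference described above amounts to subtracting a programme-specific duration from the logged finish timestamp; the programme names and durations below are invented for illustration:</p>

```python
from datetime import datetime, timedelta

# The LIS only logs when staining is finished.  Knowing which staining
# programme ran, the start timestamp can be inferred by subtracting the
# programme's (here: invented) fixed duration.
PROGRAMME_DURATION = {  # minutes per staining programme (illustrative)
    "HE-standard": 45,
    "HE-rapid": 20,
}

def infer_staining_start(finished_at: datetime, programme: str) -> datetime:
    return finished_at - timedelta(minutes=PROGRAMME_DURATION[programme])

end = datetime(2023, 5, 2, 12, 45)
start = infer_staining_start(end, "HE-standard")
print(start)  # 2023-05-02 12:00:00
```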
      </sec>
      <sec id="sec-4-9">
        <title>In a different situation, i.e., to identify when the grossing or microscopic analysis is started</title>
        <p>and finished, a separate “user access log” table has to be consulted to retrieve this information.
Another issue is that the granularity of the logged events varies greatly, e.g., the system logs
some internal function calls, which are not relevant for our analysis. Furthermore, event
names in the database are at times cryptic and ambiguous, which requires combining
multiple fields and context information to map event records to real lab actions. Moreover,
sometimes case meta-information and resource-specific event attributes are missing (e.g. at
what workstation an activity was performed). Bose et al. [24] discuss such “data quality” issues
and group them into three categories: (i) the event log does not contain events that really
happened, (ii) the event log contains more events than in reality, and (iii) the real events are
concealed in the log. All three categories apply in our case.</p>
      </sec>
      <sec id="sec-4-10">
        <title>Not cleansing the log would result in unwanted results during the process discovery phase.</title>
      </sec>
      <sec id="sec-4-11">
        <title>By conducting several small iterations, where we extracted a small excerpt of raw events from</title>
        <p>the database, mapped them to an event log, and performed process discovery on it, we could
quickly see that a “naive” approach leads to inappropriate results. In our case, it was possible to
assess the “quality” of the event log through the resulting control flow model because we have
a clear understanding about how the general process should look like, s2e.1e.Sect.</p>
      </sec>
      <sec id="sec-4-12">
        <title>Thus, we had to deviate from the principle of “keeping the event data as raw as possible”</title>
        <p>[25] and to design a transformation from the LIS database structure to an appropriate event log.
Designing this transformation, however, required extensive knowledge of both the information
system and the domain. To bridge the gap between the domain and IT experts, we had positive
experiences with regular meetings where both sides could exchange their knowledge and
with the IT experts getting direct “hands-on” experience in the laboratory. Seeing how the lab
technicians work with the LIS helped immensely in understanding how the system digitally
represents the physical actions in reality.
4.2. …, right now, …</p>
        <p>The second stage of ∗ describes the transition from an event log to a control-flow process
model. This is facilitated by a process discovery algorithm.</p>
      </sec>
      <sec id="sec-4-13">
        <title>The existing process mining algorithms are not perfectly suited for the specimen</title>
        <p>preparation workflow in the pathology laboratory.</p>
        <p>There is a plethora of process discovery algorithms, see [11] for an overview. All of these
algorithms are based on the notion of atomic token-based workflow modelling languages, i.e.,
a case is represented as an atomic token that flows through a net structure representing the
control flow. The token may become duplicated if activities are performed in parallel but, in
general, the case is not decomposed during the execution of the process. In pathology, however,
there is a hierarchy of different token types flowing through the lab at the different process steps:
a diagnostic request (i.e., case) may contain multiple specimens, which can become multiple
cassettes/blocks, which again may result in multiple slides. The fact that a pathologist can order
additional analyses in between (i.e., creating new blocks and/or slides) requires considering all
these artefacts on different levels of granularity at the same time.</p>
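<p>The artefact hierarchy can be pictured as a nested structure; any atomic-token (“flattened”) view must commit to a single granularity level. The identifiers below are illustrative:</p>

```python
# One diagnostic request contains several specimens, each yielding
# several blocks, each yielding several slides.  All ids are invented.
request = {
    "id": "R-01",
    "specimens": [
        {"id": "A", "blocks": [
            {"id": "A1", "slides": ["A1-1", "A1-2"]},
            {"id": "A2", "slides": ["A2-1"]},
        ]},
        {"id": "B", "blocks": [
            {"id": "B1", "slides": ["B1-1"]},
        ]},
    ],
}

# A flat, atomic-token view must pick one granularity level, e.g. slides:
slides = [s for sp in request["specimens"] for b in sp["blocks"] for s in b["slides"]]
print(len(slides))  # 4 slides flow through staining for this one case
```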
        <p>When we experimented with the various process discovery algorithms implemented in the
open-source tool ProM, most algorithms produced unwanted results: in most cases they simply
returned a control flow where all activities principally could happen in parallel. The fuzzy miner
[26] algorithm produced the “best” result compared to others, in the sense that it discovered
the general structure of Fig. 1. Still, the algorithm was not able to discover the correct causal
dependencies between less frequent process steps, and when decreasing the abstraction level,
“spurious cycles” appeared on all activities. The latter phenomenon can be explained by the fact
that the process steps happen in parallel while operating with different levels of granularity.</p>
      </sec>
      <sec id="sec-4-14">
        <title>Different granularity levels are discussed in [11, Chap. 5.5], where the aforementioned atomic</title>
        <p>token abstraction is described as “flattening”. The chapter mentions the idea of “proclets”
[27], i.e., disassembling the overall process into several processes operating at different levels
of granularity, and refers to a research project (ACSI project) that promotes the use of such
proclets. However, the referenced website does not seem to be active any more today.</p>
      </sec>
      <sec id="sec-4-15">
        <title>In our case, we are more or less aware of how the control flow must look. Thus, process</title>
        <p>discovery algorithms are actually less interesting for us and we can resort to creating a precise
process model by hand. The latter is a confirming sign that we are dealing with a so-called
“Lasagna” process [11], i.e., a process model with a simple and well-understood control flow.
We discovered that coloured petri nets (CPNs) [28] are an appropriate formalism for our case,
since they naturally model the idea of different types of tokens flowing through the net. Hence,
our immediate next objective is to design the pathology lab process in the form of a coloured
petri net and to extend the notion of event-log replay on petri nets with the notion of different
token types. This is necessary to obtain the performance information of the individual process
steps and different types of specimens.
4.3. … and later
According to the ∗-model, our project is currently in stage two. Yet, we want to give an
outlook on the issues that we are expecting to arise in the coming stages. The third stage is the
creation of an integrated model, i.e. a process model combining the notions of control flow, data,
resources and time. This will, for the first time, allow giving feedback to the original process.
The ∗-model discusses several options for this, namely redesigning (changing the whole process
model), adjusting (changing the process configuration, for example, resource allocations), or
intervening (performing concrete actions during the execution of a process instance).</p>
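<p>A minimal sketch of the coloured-token idea, with transitions that consume one token type and produce another; this is our own toy illustration in Python, not an actual CPN formalism or tool, and the transition table only covers two of the steps from Fig. 1:</p>

```python
from dataclasses import dataclass

# Each token carries a type ("colour") in addition to its case id, so a
# transition can be restricted to the artefact kind it operates on.
@dataclass(frozen=True)
class Token:
    case: str
    kind: str  # "specimen", "cassette", "block", "slide", ...

# Which token kind a transition consumes and which it produces.
TRANSITIONS = {
    "Embedding":  ("cassette", "block"),
    "Sectioning": ("block", "slide"),
}

def fire(transition: str, token: Token) -> Token:
    consumed, produced = TRANSITIONS[transition]
    assert token.kind == consumed, f"{transition} cannot consume a {token.kind}"
    return Token(token.case, produced)

t = fire("Sectioning", fire("Embedding", Token("S-01", "cassette")))
print(t)  # Token(case='S-01', kind='slide')
```

Replaying an event log against such typed transitions is what would allow timing each artefact kind separately.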
      </sec>
      <sec id="sec-4-16">
        <title>Issue #4 (Practical)</title>
      </sec>
      <sec id="sec-4-17">
        <title>It is not exactly clear how process mining observations can be translated into actions.</title>
      </sec>
      <sec id="sec-4-18">
        <title>We are currently uncertain of how we eventually can transfer the analytical results to</title>
        <p>operational results. For instance, there are some physical limitations to what degree a “redesign”
of the process is possible. The literature mentions approaches on how to transit from process
mining to simulation29[]. But, it does not mention specific methodologies for getting to means
of operational support, the final stage∗o.f</p>
      </sec>
      <sec id="sec-4-19">
        <title>Issue #5 (Social)</title>
      </sec>
      <sec id="sec-4-20">
        <title>It is not clear how to best anticipate and mitigate social ramifications.</title>
      </sec>
      <sec id="sec-4-21">
        <title>The final objective is to reduce the overall cycle times via intelligent planning of resources</title>
        <p>and routing of specimens. When automatically assigning tasks to individual workers, both
individual skills, individual preferences for particular tasks and the laboratory’s current need
for specific activities matter. There is a theoretical possibility to assess the performance data of
individual workers. Thus, our project has to safeguard that this contingency remains unfeasible.
Currently, we are hashing all usernames with a random and hidden salt. When designing
reporting solutions, we have to make sure that performance data is only presented aggregated
over multiple cases, such that it is not possible to identify individuals from context information
of a single case. In all of this, it is paramount to include all stakeholders in the project to make
them aware of the technical possibilities and the data stored in the system. Even though this
issue remains in the more distant future, it is important to be aware of it already.</p>
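<p>The pseudonymization step described above can be sketched as follows; the usernames are invented and the exact hashing scheme used in the project may differ:</p>

```python
import hashlib
import secrets

# Pseudonymize usernames with a random, hidden salt: the same user always
# maps to the same pseudonym (so flows can still be analysed), but without
# the salt the hashes cannot be reversed by guessing usernames.
salt = secrets.token_bytes(16)  # kept secret, never stored with the data

def pseudonymize(username: str) -> str:
    return hashlib.sha256(salt + username.encode("utf-8")).hexdigest()

a, b = pseudonymize("jdoe"), pseudonymize("jdoe")
print(a == b, a != pseudonymize("asmith"))  # True True
```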
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Executable Data Management: A Model-based Approach</title>
      <sec id="sec-5-1">
        <title>In Sect. 4 we have seen that the raw event data poses several challenges. First, there is an</title>
        <p>(organizational) challenge in gaining access to it, which necessitates documenting what is
extracted and how sensitive data is protected. Second, there is a (technical) challenge when it
comes to mapping the raw data into an event log so that it can be used for process mining.</p>
        <sec id="sec-5-1-1">
          <title>It turns out that metadata plays a crucial role when addressing these challenges. It serves</title>
          <p>both as documentation and as specification for extraction and transformation. Since it is
required to put the metadata under version control to enable auditing, revision, and iterative
development, one might as well consider utilizing these documents more “directly”. Hence, we
decided to adopt a model-based paradigm [30] and consider these artefacts not only as mere
means of documentation (descriptive) but also as means to configure the extraction and
transformation scripts (prescriptive). Here, the model-based paradigm fits particularly well
with the necessity for metadata descriptions, i.e. instead of encoding extraction and
transformations in program code they are declaratively defined in documents that are
accessible for the domain experts.</p>
          <p>[Figure 5: Overall architecture. Data layer (bottom): LIS database («instanceOf» its .sql schema) → Extract → raw &amp; pseudonymized event data (.csv) → Transform → structured event logs (.xes, «instanceOf» the XES metamodel .xsd) → Replay on the process model (.cpn) → process performance data (.csv). Metadata layer (top): privacy requirements (.xlsx), event-code mappings and event-name interpretations (.xlsx), activity mappings and activity names (domain model, .ecore), connected to the data layer via column and schema/instance references.]</p>
          <p>The resulting architecture is shown in Fig. 5. The bottom half of the figure shows the data
layer. The data “flows” from left to right, starting from the LIS database with the raw data. In
the first step, the contents of relevant tables are exported in the form of comma-separated values
(CSV) files, where the contents of the columns containing sensitive information are hashed. In
the second step, this data is transformed into an event log structure. This transformation step
has to address the challenges related to data quality, see Sect. 4.1. Eventually, the event log is
replayed on the process model to obtain performance data about case and activity durations.</p>
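          <p>The extract and transform steps can be sketched as follows. All column names, event codes, and activity names below are hypothetical placeholders, not taken from the actual LIS schema.</p>

```python
import csv
import hashlib
import io

# Illustrative sketch of the extract/transform steps (all column names,
# event codes, and the salt are hypothetical assumptions).
SENSITIVE = {"username"}                               # columns to pseudonymize
EVENT_CODES = {"GRO": "Grossing", "EMB": "Embedding"}  # LIS code -> activity name
SALT = b"hidden-salt"

def extract_row(row: dict) -> dict:
    """Extract step: hash sensitive columns, keep the rest verbatim."""
    return {k: hashlib.sha256(SALT + v.encode()).hexdigest() if k in SENSITIVE else v
            for k, v in row.items()}

def transform(rows):
    """Transform step: map LIS event codes to activity names, yield event-log entries."""
    for row in rows:
        yield {"case": row["case_id"],
               "activity": EVENT_CODES.get(row["event_code"], row["event_code"]),
               "timestamp": row["timestamp"]}

# A tiny in-memory stand-in for an exported CSV table.
raw = io.StringIO("case_id,event_code,timestamp,username\n"
                  "H-001,GRO,2023-05-02T08:15,alice\n")
log = list(transform(extract_row(r) for r in csv.DictReader(raw)))
```

          <p>In the actual pipeline, which tables and columns feed this step is not hard-coded as above but read from the metadata documents.</p>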
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>The top half of Fig. 5 contains the metadata documents. The database is described via SQL</title>
        <p>Create Table statements, which were manually extracted from a PDF provided by the LIS
supplier. An Excel sheet declares which columns are extracted and which column contents
are hashed. In our case, Excel turned out to be a viable compromise: a tool that domain
experts are familiar with and which, simultaneously, can easily be integrated into automated
toolchains. Similarly, the declarations of how event codes from the LIS map to the individual
process steps are defined in an Excel sheet. For the latter, we first created a domain model
of histopathology. The domain model has the form of a class diagram and is encoded using</p>
        <sec id="sec-5-2-1">
          <title>Ecore [31], a standard serialization format in the context of model-based engineering. Moreover,</title>
          <p>there is the extensible event stream (XES) schema definition [32], which defines a standard for
representing event logs, and the process model defined as a coloured Petri net, see Sect. 4.2.</p>
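          <p>To illustrate the structured event log format, a minimal XES document could look as follows; the case identifier, activity name, and timestamp are invented placeholder values, not actual project data:</p>

```xml
<log xes.version="1.0" xmlns="http://www.xes-standard.org/">
  <trace>
    <string key="concept:name" value="H-001"/>
    <event>
      <string key="concept:name" value="Grossing"/>
      <date key="time:timestamp" value="2023-05-02T08:15:00+01:00"/>
    </event>
  </trace>
</log>
```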
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>All these documents are inter-related because they refer to each other’s elements, e.g. the transition names in the CPN model must correspond to activity names defined in the domain model. These relations are visualized as cyan-coloured links in Fig. 5.</title>
        <p>For the foundation of this infrastructure, we built CorrLang3, an academic prototype
tool addressing semantic interoperability via mediation, based on a textual domain-specific
language, which was developed in the context of the first author’s PhD thesis [33]. The tool
establishes generic relations (the cyan links) between the various metadata documents, which
are interpreted to perform the extraction and transformation on the data level4.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <sec id="sec-6-1">
        <title>To summarize the content and contributions of this state-of-the-project report: we began by</title>
        <p>introducing (digital) pathology with an emphasis on not only focusing on classic data science
for image analysis, but also considering process data science for event data stored in health care
information systems. Concretely, we want to gain insights into the specimen preparation
process in the lab, as it constitutes a significant amount of time within the diagnostic process.</p>
      </sec>
      <sec id="sec-6-2">
        <title>There are several reports on successful applications of process mining in the health care</title>
        <p>domain [12, 34]. Also, the ProHealth workshop series (now KR4HC) offers a significant body of
knowledge about applications of process-centric approaches within healthcare. However, to
our knowledge, none of these works has addressed pathology so far.</p>
      </sec>
      <sec id="sec-6-3">
        <title>The main contributions of this paper are (a) an experience report (Sect. 4) about conducting</title>
        <p>a process mining project in pathology (currently in the reporting phase of Fig. 3 and stage
two of the L∗ model), and (b) a conceptual approach for exploiting project documents as an
executable specification for a data transformation pipeline (Sect. 5). Our experience report
comprises insights that, we believe, have received less attention in the process mining literature.</p>
      </sec>
      <sec id="sec-6-4">
        <title>Especially noteworthy is the conceptual mismatch between atomic token-based workflow modelling languages and the flow of specimens/blocks/slides in the pathology laboratory.</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <sec id="sec-7-1">
        <title>The present study is part of the project “PiV – Pathology services in the Western Norwegian</title>
      </sec>
      <sec id="sec-7-2">
        <title>Health Region: a centre for applied digitization”. The project is funded by the Western Norway</title>
      </sec>
      <sec id="sec-7-3">
        <title>Regional Health Authority. We also would like to thank the anonymous reviewers for their helpful remarks.</title>
      </sec>
      <sec id="sec-7-4">
        <title>3https://www.corrlang.io/</title>
      </sec>
      <sec id="sec-7-5">
        <title>4A demonstration of these definitions is found in https://github.com/webminz/piv-data-mgmt</title>
        <p>[3] P. Haried, C. Claybaugh, H. Dai, Evaluation of health information systems research in
information systems research: A meta-analysis, Health Informatics Journal 25 (2019)
186–202. doi:10.1177/1460458217704259.
[4] R. S. Mans, W. M. P. van der Aalst, N. C. Russell, P. J. M. Bakker, A. J. Moleman,</p>
      </sec>
      <sec id="sec-7-6">
        <title>Process-Aware Information System Development for the Healthcare Domain - Consistency,</title>
        <p>Reliability, and Effectiveness, in: S. Rinderle-Ma, S. Sadiq, F. Leymann (Eds.),</p>
      </sec>
      <sec id="sec-7-7">
        <title>Business Process Management Workshops, Springer, Berlin, Heidelberg, 2010, pp. 635–646.</title>
        <p>doi:10.1007/978-3-642-12186-9_61.
[5] L. Nguyen, E. Bellucci, L. T. Nguyen, Electronic health records implementation: an
evaluation of information system impact and contingency factors, International Journal of
Medical Informatics 83 (2014) 779–796. doi:10.1016/j.ijmedinf.2014.06.011.
[6] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M.
van der Laak, B. van Ginneken, C. I. Sánchez, A survey on deep learning in medical image
analysis, Medical Image Analysis 42 (2017) 60–88. doi:10.1016/j.media.2017.07.005.
[7] J. Ker, L. Wang, J. Rao, T. Lim, Deep Learning Applications in Medical Image Analysis,</p>
        <p>IEEE Access 6 (2018) 9375–9389. doi:10.1109/ACCESS.2017.2788044.
[8] L. Pantanowitz, A. Sharma, A. B. Carter, T. Kurc, A. Sussman, J. Saltz, Twenty Years of</p>
      </sec>
      <sec id="sec-7-8">
        <title>Digital Pathology: An Overview of the Road Travelled, What is on the Horizon, and the</title>
        <p>Emergence of Vendor-Neutral Archives, Journal of Pathology Informatics 9 (2018) 40.
doi:10.4103/jpi.jpi_69_18.
[9] A. Janowczyk, A. Madabhushi, Deep learning for digital pathology image analysis: A
comprehensive tutorial with selected use cases, Journal of Pathology Informatics 7 (2016)
29. doi:10.4103/2153-3539.186902.
[10] D. Komura, S. Ishikawa, Machine Learning Methods for Histopathological Image
Analysis, Computational and Structural Biotechnology Journal 16 (2018) 34–42.
doi:10.1016/j.csbj.2018.01.001.
[11] W. M. P. v. d. Aalst, Process Mining: Data Science in Action, Springer, 2016.
[12] E. Rojas, J. Munoz-Gama, M. Sepúlveda, D. Capurro, Process mining in healthcare: A
literature review, Journal of Biomedical Informatics 61 (2016) 224–236.
doi:10.1016/j.jbi.2016.04.007.
[13] C. A. Petri, Kommunikation mit Automaten, Doctoral Thesis, Technische Hochschule</p>
      </sec>
      <sec id="sec-7-9">
        <title>Darmstadt, 1962.</title>
        <p>[14] F. W. Taylor, The Principles of Scientific Management, Harper &amp; Brothers Publishers, New</p>
        <p>York, 1911.
[15] T. H. Davenport, J. E. Short, The New Industrial Engineering: Information Technology
and Business Process Redesign, MIT Sloan Management Review (1990). URL: https:
//sloanreview.mit.edu/article/the-new-industrial-engineering-information-technology-and-business-process-redesign/.
[16] M. Hammer, Reengineering Work: Don’t Automate, Obliterate, Harvard Business Review
(1990). URL: https://hbr.org/1990/07/reengineering-work-dont-automate-obliterate.
[17] L. Cao, Data Science: A Comprehensive Overview, ACM Computing Surveys 50 (2017)
43:1–43:42. doi:10.1145/3076253.
[18] T. H. Davenport, D. J. Patil, Data Scientist: The Sexiest Job of the 21st Century, Harvard</p>
      </sec>
      <sec id="sec-7-10">
        <title>Business Review (2012). URL: https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>OECD</surname>
          </string-name>
          ,
          <source>Health at a Glance</source>
          <year>2013</year>
          :
          <article-title>OECD Indicators, Organisation for Economic Co-operation and</article-title>
          <string-name>
            <surname>Development</surname>
          </string-name>
          , Paris,
          <year>2013</year>
          . URL: https://www.oecd
          <article-title>-ilibrary.org/social-issues-migration-health/health-at-a-glance-2013_health_glance-2013-en</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>[2] OECD, Improving Health Sector Efficiency: The Role of Information and Communication Technologies, Organisation for Economic Co-operation and</article-title>
          <string-name>
            <surname>Development</surname>
          </string-name>
          , Paris,
          <year>2010</year>
          . URL: https://www.oecd
          <article-title>-ilibrary.org/social-issues-migration-health/improving-health-sector-efficiency_9789264084612-en</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>