<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring Task Execution Patterns in Event Graphs</article-title>
      </title-group>
      <contrib-group>
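        <contrib contrib-type="author">
          <string-name><given-names>Eva L.</given-names> <surname>Klijn</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Felix</given-names> <surname>Mannhardt</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Dirk</given-names> <surname>Fahland</surname></string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eindhoven University of Technology</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>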
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>Classical process mining aims to capture the behavior of a process based on a single dimension: the sequence of activities grouped by process cases. This viewpoint fails to capture how individual actors organize their work across multiple cases. We present a tool that uses the graph database Neo4j to model actor behavior over different cases as an event graph. We then use Neo4j queries to detect task execution patterns in the graph that describe how multiple actors collaborate across multiple cases. Exploring and visualizing these patterns enables the data-driven analysis of tasks, routines, and habits as studied in organizations research.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Process mining focuses on improving processes by
analyzing event data. Classically, recorded events are grouped in an
event log under the viewpoint of one (or more) case identifiers
and ordered by time. The resulting event log describes which
tasks were performed in which process execution viz. case.
Process discovery identifies behavioral patterns and
information along each case and aggregates them into a process model
describing the control-flow perspective [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] of a process, which
can consist of multiple data objects or entities [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Each task is performed by an actor (or resource) working
on the case, which is studied under the resource-perspective
of the process [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. An actor moving from one task in a case
to a task in another case introduces behavior along the
resource perspective and dependencies between tasks of different
cases. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] we showed that the control-flow and
resource perspectives can be studied together as an event graph [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] where
each event is part of two paths, a case path and a resource path,
modeling event dependencies over two behavioral dimensions.
When one or more case and resource paths synchronize, they form a task
execution pattern, which describes the work habits of an actor or
routines, i.e., how one or more actors collaborate over multiple
cases.
      </p>
      <p>
        In this paper, we present a command-line tool for analyzing
such task execution patterns as described in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The tool is
realized in Python 3.7 and publicly available<sup>1</sup>. It connects to
a Neo4j (neo4j.com) database instance to execute queries (1)
for constructing an event graph over the case and resource
dimensions from a classical event log and (2) for detecting various
forms of task execution patterns in the event graph and
aggregating them into high-level events, which can be queried,
visualized, and explored using Neo4j. A screencast<sup>2</sup> and a
detailed instruction manual<sup>1</sup> explain how to use the tool.
      </p>
      <p><sup>1</sup> https://github.com/multi-dimensional-process-mining/event-graph-task-pattern-detection</p>
      <p><sup>2</sup> https://vimeo.com/630382325</p>
    </sec>
    <sec id="sec-2">
      <title>II. TASK EXECUTION PATTERN DETECTION</title>
      <p>The configurable end-to-end workflow is implemented in main.py and shown in Fig. 1.</p>
      <sec id="sec-2-1">
        <title>A. Input &amp; Parameters</title>
        <p>The input is a classical event log in CSV format. The CSV
file must contain columns for the event classifier, timestamp,
case identifier, and resource identifier. The details of the
CSV format must be provided as parameters,
e.g., the filename, column keys, column separator, and
timestamp format. We assume that a Neo4j database has
already been set up and that its credentials are provided as
parameters. The graph labels assigned to entities and
relationships can be customized if desired.</p>
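        <p>The role of these parameters can be illustrated with a minimal sketch that uses only the Python standard library. The names PARAMS and read_event_log, the parameter keys, and the sample column names are hypothetical and are not taken from the tool; the sketch only shows how column keys, separator, and timestamp format might map a CSV file to standardized events.</p>
        <preformat>
```python
import csv
import io
from datetime import datetime

# Hypothetical parameter set; the key names are illustrative, not the tool's own.
PARAMS = {
    "separator": ";",
    "timestamp_format": "%Y-%m-%d %H:%M:%S",
    "columns": {  # maps each required role to the column key used in the CSV file
        "activity": "event",
        "timestamp": "time",
        "case": "case_id",
        "resource": "user",
    },
}


def read_event_log(handle, params):
    """Read a CSV event log and return events with standardized keys."""
    cols = params["columns"]
    reader = csv.DictReader(handle, delimiter=params["separator"])
    # Fail early if one of the four required columns is absent.
    missing = set(cols.values()) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    events = []
    for row in reader:
        events.append({
            "activity": row[cols["activity"]],
            "timestamp": datetime.strptime(
                row[cols["timestamp"]], params["timestamp_format"]
            ),
            "case": row[cols["case"]],
            "resource": row[cols["resource"]],
        })
    return events


# Tiny in-memory sample standing in for a CSV file on disk.
SAMPLE = (
    "event;time;case_id;user\n"
    "A;2021-01-01 09:00:00;c1;r1\n"
    "B;2021-01-01 09:05:00;c1;r1\n"
)
events = read_event_log(io.StringIO(SAMPLE), PARAMS)
```
        </preformat>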
      </sec>
      <sec id="sec-2-2">
        <title>B. Event Graph Creation</title>
        <p>The tool creates the event graph in three steps:</p>
        <p>1. Preprocessing. Preprocess the event data
(PreprocessSelector.py) to make it suitable for import into a Neo4j database
instance by standardizing the name and formatting of the event
classifier and timestamp columns.</p>
        <p>
          2. Initial Graph Creation. Invoke Cypher queries to
construct an event graph (EventGraphConstructor.py) by limiting
the original event graph construction approach [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] to the
resource and case entities. In the resulting event graph, each
event is an Event node that is part of two paths of
directly-follows edges: the path of all events correlated to the same
case entity, and the path of all events correlated to the same
resource entity. For event data over multiple case entities,
a user can also choose to construct a custom event graph
following the original approach [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and may then skip steps
1 and 2.
        </p>
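        <p>The two directly-follows paths can be illustrated outside the database with a small in-memory sketch. This is not the tool's Cypher implementation; the function name directly_follows and the event fields (id, case, resource, time) are assumptions made for illustration.</p>
        <preformat>
```python
from collections import defaultdict


def directly_follows(events, entity_key):
    """Return (source_id, target_id) pairs linking consecutive events of each entity."""
    by_entity = defaultdict(list)
    for ev in events:
        by_entity[ev[entity_key]].append(ev)
    edges = []
    for group in by_entity.values():
        # Order the events of one entity by time, then link each event to its successor.
        group.sort(key=lambda ev: ev["time"])
        for src, tgt in zip(group, group[1:]):
            edges.append((src["id"], tgt["id"]))
    return edges


# Hypothetical events: resource r1 works on case c1 twice, then on case c2.
events = [
    {"id": 1, "case": "c1", "resource": "r1", "time": 1},
    {"id": 2, "case": "c1", "resource": "r1", "time": 2},
    {"id": 3, "case": "c2", "resource": "r1", "time": 3},
    {"id": 4, "case": "c1", "resource": "r2", "time": 4},
]
df_case = directly_follows(events, "case")          # path per case
df_resource = directly_follows(events, "resource")  # path per resource
```
        </preformat>
        <p>In this sample, event 2 has two different successors: event 4 along the case path and event 3 along the resource path, which is the situation the two-dimensional event graph is meant to capture.</p>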
      </sec>
      <sec id="sec-2-3">
        <title>3. High-Level Event Construction</title>
        <p>
          Detect task execution patterns in the event graph and materialize them in the graph
as “high-level event” nodes (HighLevelEventConstructor.py).
For the most basic pattern type, we query for sub-graphs of
event nodes that are all part of the same case path and the same
resource path (i.e., the resource works on the case over one or
more consecutive events); sub-graphs of other pattern types [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]
are found through variations of this query. For each found
subgraph, we create a new HLEvent node linked to the events in
the sub-graph. We lift the directly-follows edges from Event
nodes to HLEvent nodes. This makes it possible to query for larger task
execution patterns as patterns of HLEvent nodes along case
and resource directly-follows edges; see [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] for details.
        </p>
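        <p>For the most basic pattern type, the grouping can be sketched in plain Python as follows. This is an illustrative approximation rather than the tool's Cypher query: two events are placed in the same group when they are linked by both a case and a resource directly-follows edge. The function names and event fields are hypothetical.</p>
        <preformat>
```python
from collections import defaultdict


def df_edges(events, key):
    """Directly-follows pairs (by event id) along one entity dimension."""
    groups = defaultdict(list)
    for ev in sorted(events, key=lambda ev: ev["time"]):
        groups[ev[key]].append(ev["id"])
    return {pair for ids in groups.values() for pair in zip(ids, ids[1:])}


def high_level_events(events):
    """Group events joined by both a case and a resource directly-follows edge."""
    shared = df_edges(events, "case").intersection(df_edges(events, "resource"))
    successor = dict(shared)  # each event has at most one shared successor
    has_predecessor = set(successor.values())
    groups = []
    for ev in sorted(events, key=lambda ev: ev["time"]):
        if ev["id"] in has_predecessor:
            continue  # this event continues a run started earlier
        run = [ev["id"]]
        while run[-1] in successor:
            run.append(successor[run[-1]])
        groups.append(run)
    return groups


# Hypothetical events: r1 performs two consecutive steps on c1, then moves to c2.
events = [
    {"id": 1, "case": "c1", "resource": "r1", "time": 1},
    {"id": 2, "case": "c1", "resource": "r1", "time": 2},
    {"id": 3, "case": "c2", "resource": "r1", "time": 3},
    {"id": 4, "case": "c1", "resource": "r2", "time": 4},
]
hl_events = high_level_events(events)
```
        </preformat>
        <p>Each resulting group corresponds to one candidate high-level event; here events 1 and 2 form one group because the same resource handles them consecutively within the same case.</p>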
        <p>[Fig. 1: Workflow of the tool. Event data; event graph construction; event graph; explore task execution patterns (output: subgraph of task execution instance); explore process executions (output: subgraph of process executions).]</p>
      </sec>
      <sec id="sec-2-4">
        <title>C. Event Graph Exploration</title>
        <p>Once the graph and HLEvents are constructed, the tool
prompts the user to explore the graph for (1) task patterns
of a particular type or (2) patterns occurring in a subset of cases.</p>
        <p>
          1. Exploring task execution patterns. In [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], several task
pattern types have been identified and assigned numbers from 1 to 16.
        </p>
      </sec>
      <sec id="sec-2-5">
        <p>
          GraphExplorer.explore_patterns() prompts the user to
specify which pattern type they wish to explore, e.g., type 4
in Fig. 2. It then returns a list of all distinct task execution
patterns of this type ordered by frequency. The user can select
a specific task pattern to explore further, which will return a
list of all instances (sub-graphs) of that particular task pattern.
The user can select to visualize a specific instance of the
pattern, which is shown in a separate window as a PDF. Fig. 1
(top-right) shows an example of a task execution instance that
shows pattern 8’, i.e., an actor performing the same sequence
of steps for a number of cases one after the other. Other
patterns such as resource interruptions (pattern 2, see [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]) and
case interruptions (pattern 3, see [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]) can also be explored.
        </p>
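        <p>The frequency-ordered listing of distinct patterns can be sketched as a simple count over pattern instances. The record layout (a type number and an activity sequence per instance) and the function name rank_patterns are assumptions for illustration and do not reflect the tool's internal data model.</p>
        <preformat>
```python
from collections import Counter


def rank_patterns(instances, pattern_type):
    """Count distinct activity sequences among instances of one pattern type."""
    counts = Counter(
        tuple(inst["activities"])
        for inst in instances
        if inst["type"] == pattern_type
    )
    # most_common() orders the distinct patterns by descending frequency.
    return counts.most_common()


# Hypothetical pattern instances, e.g., as retrieved from high-level event nodes.
instances = [
    {"type": 4, "activities": ["A", "B"]},
    {"type": 4, "activities": ["A", "B"]},
    {"type": 4, "activities": ["B", "C"]},
    {"type": 2, "activities": ["A"]},
]
ranking = rank_patterns(instances, 4)
```
        </preformat>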
      </sec>
      <sec id="sec-2-6">
        <title>2. Exploring subsets of process executions</title>
        <p>GraphExplorer.explore_cases() lets the user specify the case identifiers
for which they want to explore task execution patterns. The
resulting subgraph of those process executions and task
patterns is then shown in a separate window as a PDF, as in
the bottom-right part of Fig. 1.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. MATURITY &amp; PERFORMANCE</title>
      <p>
        The tool was successfully evaluated in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where it was
used on two real-life event data sets (BPIC’14 and BPIC’17).
The experiments were run on a machine with an Intel i7 CPU @ 2.2GHz
and 32GB RAM. For the larger of the two data
sets (BPIC’17, 237MB), the tool constructed all
graph-related constructs in 140 seconds and returned lists
of executions and instances of all pattern types in under 4
seconds. Subgraph visualizations are generally retrieved
in under 2 seconds, although we have also seen instances
that take almost 60 seconds; performance on this
aspect depends strongly on the complexity of the subgraph to be
visualized. The preprocessing scripts and label settings for the
graph output are easily adaptable. We provide example scripts
for BPIC’14 and BPIC’17.
      </p>
    </sec>
    <sec id="sec-4">
      <title>IV. CONCLUSION</title>
      <p>
        We developed an open-source command-line tool for
exploring task execution patterns in event graphs. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we have
shown that exploring specific execution patterns that
include the behavioral dimensions of both cases and resources
can reveal a complex interplay of cases and actors engaging in
recurrent patterns of work, i.e., routines and habits. This makes
our tool applicable in any process mining use case where
resource information is recorded. Future work is to extend the
subgraph visualization feature to all task execution patterns
introduced in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Still, we covered a core set of patterns that
occur in public datasets. Furthermore, we plan to build an
interactive graphical user interface to enable seamless
interaction with the tool.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Reijers</surname>
          </string-name>
          , Fundamentals of Business Process Management. Springer, 2nd ed.,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Esser</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , “
          <article-title>Multi-dimensional event data in graph databases</article-title>
          ,”
          <source>J. Data Semant.</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>141</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Klijn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mannhardt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , “
          <article-title>Classifying and detecting task executions and routines in processes using event graphs</article-title>
          ,” in
          <source>BPM Forum</source>
          <year>2021</year>
          , vol.
          <volume>427</volume>
          of LNBIP, pp.
          <fpage>212</fpage>
          -
          <lpage>229</lpage>
          , Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>