<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>BPMN Miner 2.0: Discovering Hierarchical and Block-Structured BPMN Process Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Raffaele Conforti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adriano Augusto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcello La Rosa</string-name>
          <email>m.larosa@qut.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marlon Dumas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luciano García-Bañuelos</string-name>
          <email>luciano.garcia@ut.ee</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Queensland University of Technology</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Tartu</institution>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>39</fpage>
      <lpage>43</lpage>
      <abstract>
        <p>We present BPMN Miner 2.0: a tool that extracts hierarchical and block-structured BPMN process models from event logs. Given an event log in XES format, the tool partitions it into sub-logs (one per subprocess) and discovers a BPMN process model from each sub-log using existing techniques for discovering BPMN process models via heuristics nets or Petri nets. A drawback of these techniques is that they often produce spaghetti-like models and in some cases unsound models. Accordingly, BPMN Miner 2.0 applies post-processing steps to remove unsound constructions as well as a technique to block-structure the resulting process models in a behavior-preserving manner. The tool is available as a standalone Java tool as well as a ProM and an Apromore plugin. The target audience of this demonstration includes process mining researchers as well as practitioners interested in exploring the potential of process mining using BPMN.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Discovery</kwd>
        <kwd>BPMN</kwd>
        <kwd>Structured Process Models</kwd>
        <kwd>Hierarchical Process Models</kwd>
        <kwd>ProM</kwd>
        <kwd>Apromore</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        complexity should not be underestimated as this underpins the understandability of the
extracted process model, and so ultimately its value to users. In this respect, empirical
studies [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have shown that besides model size, an important proxy for process model
understandability is the structuredness of the model. This latter observation has led
to the design of discovery algorithms such as the Inductive Miner [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the
Evolutionary Tree Miner [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which discover structured process models by design. Models
discovered using such algorithms may be further away from reality depending on the
degree of unstructuredness of the actual business process the log refers to. For
example, the Inductive Miner tends to over-generalize the behavior in the log, leading to low
precision while maximizing fitness [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. On the other hand, discovery algorithms such
as the Heuristics Miner [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and the Fodina Miner [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] tend to achieve better results
in terms of accuracy but produce more complex and sometimes syntactically incorrect
and unsound process models [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Furthermore, the majority of discovery algorithms
produce models that are represented in languages that are either not widely accepted
among practitioners (e.g. Petri nets), too technical (e.g. Heuristics nets) or too abstract
and over-generalizing (e.g. fuzzy nets or process maps).
      </p>
      <p>
        The BPMN Miner algorithm [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] is designed to address the above limitations. This
algorithm discovers models in the BPMN 2.0 language [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a de jure standard widely
supported by vendors and practitioners. Moreover, by exploiting implicit functional and
foreign-key dependencies between attributes in the event log, the algorithm can generate
hierarchical BPMN models, i.e. models organized over two or more abstraction levels
via the use of subprocess models. In order to further reduce the complexity of the
discovered models, the algorithm exploits an extensive set of notational elements provided
by the BPMN language, such as boundary events (to model interrupting exceptions),
activity markers (loops and multi-instance) and event subprocesses.
      </p>
      <p>
        BPMN Miner relies on existing (baseline) automated process discovery algorithms
to generate an initial model of each subprocess. The baseline algorithms supported are
Inductive Miner, Heuristics Miner, Fodina, ILP Miner and Alpha Miner. Version 2.0 of
BPMN Miner additionally embeds an algorithm that discovers sound and
maximally-structured BPMN models, namely the Structured Miner [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This latter algorithm
removes unsound and unstructured constructs in the discovered model, thus further
increasing its potential usability.
      </p>
      <p>The minimum input required by the tool is the log from which to discover a BPMN
model. By default, the Heuristics Miner is chosen as the baseline discovery algorithm.
It is however possible to customize a number of input parameters in order to tune the
results (see Figure 1).</p>
      <p>Besides the baseline discovery algorithm, one can choose whether the log is
partitioned into sublogs using a noise-tolerant dependency discovery algorithm; if not,
the log is assumed to be noise-free. Additionally, it is possible to
sort the input log based on the timestamp of its events, and to switch on the structuring
of the discovered process model. The latter function is only effective when the baseline
discovery algorithm does not already produce a structured model by design.</p>
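      <p>The timestamp-based sorting option can be illustrated with a minimal sketch. It is not the tool's implementation: the dictionary-based log layout and the function name sort_log_by_timestamp are assumptions made for illustration, while the attribute keys concept:name and time:timestamp follow the standard XES extensions.</p>
      <preformat>
```python
from datetime import datetime


def sort_log_by_timestamp(log):
    """Return a copy of the log with each trace's events ordered by timestamp.

    `log` maps a case identifier to a list of event dictionaries
    (a hypothetical in-memory layout, not the tool's data model).
    """
    return {
        case_id: sorted(
            events,
            key=lambda event: datetime.fromisoformat(event["time:timestamp"]),
        )
        for case_id, events in log.items()
    }


# A one-trace log whose events were recorded out of order.
log = {
    "case-1": [
        {"concept:name": "Ship order", "time:timestamp": "2016-03-02T10:00:00"},
        {"concept:name": "Receive order", "time:timestamp": "2016-03-01T09:00:00"},
    ],
}

sorted_log = sort_log_by_timestamp(log)
print([event["concept:name"] for event in sorted_log["case-1"]])
```
      </preformat>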
      <p>Additional parameters can be set to fine-tune the discovery of BPMN-specific
notational elements, on the basis of a number of heuristics. For example, one can set
tolerance levels for the identification of boundary timer and message events, and of
multi-instance activity markers.</p>
      <p>Once these parameters are set, the algorithm retrieves event attributes which may
be used as primary keys for the partitioning of the log into sublogs. In this context,
a primary key is an attribute which is recurrent in all events that are related to the
activities of a particular subprocess. For example, an attribute “invoice” would be recurrent
across all events related to the handling of the invoice, as part of an overarching
order-to-cash process. All such events would be isolated in a separate sublog, from which the
corresponding subprocess for handling invoices will then be discovered. The user can steer
the use of particular event attributes by selecting or deselecting them from a list that is
automatically populated by the tool.</p>
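      <p>The invoice example above can be sketched as follows. This is only an illustration of grouping events by a chosen key attribute; the actual algorithm derives the keys from functional and foreign-key dependencies, which are not reproduced here. The function name partition_by_key, the event layout and the sample values are assumptions.</p>
      <preformat>
```python
from collections import defaultdict


def partition_by_key(events, key):
    """Split a trace into parent-level events and one sublog per key value.

    Events carrying the attribute `key` are grouped by its value; events
    without it stay at the parent level. Hypothetical helper for illustration.
    """
    parent = []
    sublogs = defaultdict(list)
    for event in events:
        if key in event:
            sublogs[event[key]].append(event)
        else:
            parent.append(event)
    return parent, dict(sublogs)


# One order-to-cash trace in which two invoices are handled.
trace = [
    {"concept:name": "Receive order"},
    {"concept:name": "Create invoice", "invoice": "INV-1"},
    {"concept:name": "Create invoice", "invoice": "INV-2"},
    {"concept:name": "Pay invoice", "invoice": "INV-1"},
    {"concept:name": "Ship order"},
]

parent, sublogs = partition_by_key(trace, "invoice")
for invoice_id, sublog in sublogs.items():
    print(invoice_id, [event["concept:name"] for event in sublog])
```
      </preformat>
      <p>Each resulting sublog would then be handed to the baseline discovery algorithm, while the parent-level events describe the enclosing process.</p>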
      <p>Next, the algorithm assigns a specific primary key to each sublog. If more than
one primary key can be assigned to the same sublog, the user is asked to choose from a
drop-down list, in which the first key is the best-fitting one. At this point the event log is
partitioned into sublogs using the chosen primary keys, and each sublog is passed as input
to the baseline discovery algorithm for model discovery. When the discovery has completed,
each subprocess model is structured separately, if this option is enabled, and the models are
assembled into a single hierarchical BPMN model.</p>
      <p>
        The accuracy and scalability of BPMN Miner have been extensively evaluated
using over 600 event logs, including both artificial and real-life event logs. The majority
of the artificial logs were generated from the SAP R/3 and IBM BIT process model
collections, which group models from a variety of domains, including finance, sales,
accounting, logistics, communication and human resources. The results of these
evaluations are reported in [
        <xref ref-type="bibr" rid="ref1 ref3 ref4">3, 4, 1</xref>
        ].
      </p>
      <p>The results of the experiments show that the tool scales well to large and noisy
real-life logs, performing within reasonable time bounds in the order of minutes. Moreover,
the results indicate a statistically significant improvement of discovery accuracy and
model complexity over all the baseline discovery algorithms supported by the tool.</p>
      <p>BPMN Miner 2.0 is available as an OSGi plugin of the Apromore process model
repository, as a plugin of the ProM Framework, as well as a standalone
command-line Java tool. Apromore is an online open-source ecosystem of advanced capabilities
for managing large process model collections, including process modeling, simulation,
filtering, querying, similarity search, behavioral comparison and model merging. ProM
is the largest open-source process mining framework, with over 300 plugins (in its
latest incarnation) offering process model discovery, conformance checking, variants
and deviance mining, and log analysis capabilities.</p>
      <p>A screencast is available at https://youtu.be/eb0k2RO2PQ8. This video
illustrates different examples and provides a brief explanation of the tool settings along
with the possible outputs that can be obtained by varying the input parameters. BPMN
Miner 2.0 is embedded as an OSGi plugin in the online platform Apromore, which has
been used for the screencast (http://apromore.qut.edu.au). The artificial log
used in the screencast is available at https://goo.gl/AdvnEd.</p>
      <p>BPMN Miner is also available as a ProM plugin (http://promtools.org)
and as a standalone Java tool (http://apromore.org/platform/tools).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Augusto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Conforti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Bruno</surname>
          </string-name>
          .
          <article-title>Automated discovery of structured process models: Discover structured vs. discover and structure</article-title>
          .
          <source>In Proc. of ER</source>
          . Springer,
          <year>2016</year>
          . Preprint available at http://eprints.qut.edu.au/95189/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J.</given-names>
            <surname>Buijs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.F.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          .
          <article-title>On the role of fitness, precision, generalization and simplicity in process discovery</article-title>
          .
          <source>In Proc. of CoopIS</source>
          , volume
          <volume>7565</volume>
          <source>of LNCS</source>
          , pages
          <fpage>305</fpage>
          -
          <lpage>322</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>R.</given-names>
            <surname>Conforti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>García-Bañuelos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          .
          <article-title>Beyond tasks and gateways: Discovering BPMN models with subprocesses, boundary events and activity markers</article-title>
          .
          <source>In Proc. of BPM</source>
          , LNCS,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Raffaele</given-names>
            <surname>Conforti</surname>
          </string-name>
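          ,
          <string-name>
            <given-names>Marlon</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Luciano</given-names>
            <surname>García-Bañuelos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Marcello</given-names>
            <surname>La Rosa</surname>
          </string-name>
          .
          <article-title>BPMN Miner: Automated discovery of BPMN process models with hierarchical structure</article-title>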
          .
          <source>Information Systems</source>
          ,
          <volume>56</volume>
          :
          <fpage>284</fpage>
          -
          <lpage>303</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mäesalu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.A.</given-names>
            <surname>Reijers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenenko</surname>
          </string-name>
          .
          <article-title>Understanding business process models: the costs and benefits of structuredness</article-title>
          .
          <source>In Proc. of CAiSE</source>
          , volume
          <volume>7328</volume>
          <source>of LNCS</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>46</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.J.J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          .
          <article-title>Discovering block-structured process models from event logs - a constructive approach</article-title>
          .
          <source>In Proc. of PETRI NETS</source>
          , volume
          <volume>7927</volume>
          <source>of LNCS</source>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          .
          <article-title>Metrics for Process Models: Empirical Foundations of Verification, Error Prediction, and Guidelines for Correctness</article-title>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          Object Management Group (OMG).
          <article-title>Business Process Model and Notation (BPMN) ver. 2.0</article-title>
          . Object Management Group (OMG),
          <year>January 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          .
          <source>Process Mining - Discovery, Conformance and Enhancement of Business Processes</source>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.K.L.M.</given-names>
            <surname>vanden Broucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De Weerdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanthienen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Baesens</surname>
          </string-name>
          .
          <article-title>Fodina: a robust and flexible heuristic process discovery technique</article-title>
          . http://www.processmining.be/fodina/. Last accessed: 03/27/
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.J.M.M.</given-names>
            <surname>Weijters</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.T.S.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          .
          <article-title>Flexible Heuristics Miner (FHM)</article-title>
          .
          <source>In Proc. of CIDM</source>
          . IEEE,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>