<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Framework for Interactive Mining and Retrieval from Process Traces Doctoral Consortium ICCBR 2016</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>L. Canensi</string-name>
          <email>canensi@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Università di Torino</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>252</fpage>
      <lpage>256</lpage>
      <abstract>
        <p>Copyright © 2016 for this paper by its authors. Copying permitted for private and academic purposes. In Proceedings of the ICCBR 2016 Workshops. Atlanta, Georgia, United States of America.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Processes are everywhere. We can find processes in hospitals, companies,
universities, institutions and so on. Therefore, it is no surprise that the
usage of information systems and enterprise resource planning tools has
been rapidly growing in organizations and companies worldwide, of all
kinds and sizes. The sequences (traces) of actions that have been
completed at an organization are usually stored in the so-called event log. Event
logs constitute a very rich source of experiential knowledge, fundamental
to support different tasks, such as mining a process model, or retrieving
similar traces in order to make predictions on the currently running
process instance. In the Business Process Management (BPM) field, it is
widely recognized that these activities are very important, and can be used
to improve process execution and performance. However, all the works in
the literature treat these activities separately, applying different
theoretical and methodological approaches to each of them.</p>
      <p>Instead, in my PhD thesis I propose a comprehensive approach, aiming
at integrating the construction of the process model and its analysis.</p>
      <p>In particular, my thesis is composed of three steps:
- initial process model construction
- trace retrieval
- interactive process model abstraction and refinement.</p>
      <p>The approach allows the user to take advantage of the integration of
the three activities above; indeed, the initial process model we build is
used as an indexing structure to speed up trace retrieval; moreover, we
exploit a unified methodological solution that supports the retrieval of both traces
and specific paths in the model. We then exploit retrieval results as a
basis to build a more abstract process model, in an interactive fashion.</p>
      <p>Indeed, our innovative approach is able to "reconcile" apparently
heterogeneous needs of business process management, supporting a
user-friendly interaction with domain experts and taking advantage of all the
available knowledge sources in a comprehensive way.</p>
      <p>The following subsections describe in more detail the various steps of
the thesis.</p>
      <p>1.1 Initial process model construction</p>
      <p>
The first part of the work is related to the Process Mining (PM) field.
PM is a research discipline that discovers, monitors, and improves real
processes, by extracting knowledge from traces in the event logs,
readily available from today's systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Each trace consists of an ordered
sequence of activities. The mined process model can be used to
understand, adapt and modify the real process, in order to increase its
performance and quality. There are three main classes of process
mining techniques [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]:
- Discovery: the discovery of new process models based only on the
event log
- Conformance: conformance verification of the recorded behavior with
respect to a provided model
- Enhancement: extension of an existing process model using the
information from the event log
      </p>
      <p>
        Most of the phases of a process life cycle [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] can benefit from PM
techniques: they can be adopted to analyse an existing model, to diagnose
problems, and possibly to adapt, redesign or tune the process model itself.
      </p>
      <p>All these considerations lead us to regard PM as a very important
instrument for modern organizations that need to manage non-trivial
operational processes.</p>
      <p>My research deals with process discovery, the most relevant and widely
used PM activity.</p>
      <p>
        There are many different approaches to process discovery. Different
algorithms have their own specificities: some focus on local relations
between activities in the logs (Heuristic Miner [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]), while others focus on
the whole log (Fuzzy Miner [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Genetic Miner [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). However, all of these
algorithms operate at a single, system-defined level of abstraction.
Instead, in many domains, it would be very important to have the ability
to build and refine models, working at different levels of abstraction.
      </p>
      <p>The contribution of my thesis to the process discovery area consists
in a novel tool that allows the construction of a data structure (called
log-tree), which can be used both as an initial model of the process (to be
possibly abstracted in a further interactive session with the user), and as
an index, to speed up trace retrieval. The log-tree is a new representation
formalism which has well-defined (unambiguous) semantics and
maintains a direct connection between traces in the log and elements in the
process model. Therefore, it is usable both as an index of the traces and as a
standard process model. In order for it to serve as an index, the algorithm
guarantees that the log-tree only includes paths actually recorded as traces in
the event log. To achieve this objective, the algorithm: (1) makes
intensive use of all the available frequency information about the
activities recorded in the event log; (2) properly forks the model into various
branches, on the basis of the different execution contexts, implicitly
represented by subsets of the traces in the event log.
      </p>
      <p>1.2 Trace Retrieval</p>
      <p>
The second part of the thesis deals with trace retrieval. When the input
trace is a currently running process instance, the retrieval of similar,
already completed instances recorded as traces in the log can enable the
user to make predictions about the completion of the current instance, or can
support the recommendation of suitable actions, resources or routing decisions to be adopted
next; these goals are treated in the literature within the operational
support research area [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Trace retrieval has recently been considered in the
Case-Based Reasoning (CBR) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] literature. All approaches use traces as
sources for retrieving and reusing users' experience. For instance, the work
in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposes trace-based reasoning, a CBR approach where cases are
not explicitly stored in a library, but are implicitly recorded as "episodes"
within traces. The paper in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] extends that work, and defines a similarity
measure to compare episodes extracted from traces. These works, however,
do not aim at providing support in business process management, such
as prediction for operational support, or pattern identification for
abstraction. The goal of these tools is therefore usually very different from
ours.
      </p>
      <p>Moreover, current trace retrieval approaches typically take as input
a fully specified trace. This is a severe limitation, because sometimes
the goal is to find traces that fulfil partially specified patterns. We
deal with this issue by means of a powerful query language and query
answering approach.</p>
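      <p>As a purely illustrative sketch (not the query language of the thesis, whose syntax is not given here), a partially specified pattern can be thought of as a trace with wildcards. The Python fragment below assumes two hypothetical wildcard symbols, "?" for exactly one arbitrary activity and "*" for any, possibly empty, run of activities; the function name matches and the sample activities are likewise assumptions made for this example.</p>
      <preformat>
```python
from functools import lru_cache


def matches(pattern, trace):
    """Return True if the trace fulfils the partially specified pattern.

    '?' stands for exactly one arbitrary activity and '*' for any
    (possibly empty) run of activities; both are assumptions of this sketch.
    """
    pattern, trace = tuple(pattern), tuple(trace)

    @lru_cache(maxsize=None)
    def go(i, j):
        # i: position in the pattern, j: position in the trace
        if i == len(pattern):
            return j == len(trace)
        if pattern[i] == "*":
            # '*' either matches nothing or absorbs one more activity
            return go(i + 1, j) or (j != len(trace) and go(i, j + 1))
        if j == len(trace):
            return False
        return pattern[i] in ("?", trace[j]) and go(i + 1, j + 1)

    return go(0, 0)


# Hypothetical event log and query, for illustration only
log = [
    ("admission", "triage", "x-ray", "discharge"),
    ("admission", "triage", "blood test", "surgery", "discharge"),
    ("admission", "discharge"),
]
query = ("admission", "triage", "*", "discharge")
hits = [trace for trace in log if matches(query, trace)]
```
      </preformat>
      <p>In this sketch every trace is scanned in turn; the linear scan is only meant to convey what answering a partially specified query means, not how such queries are answered efficiently.</p>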
      <p>The log-tree is used as an index, allowing fast retrieval from the
available event log. Thanks to its characteristics and methodological solutions,
the tool implements operational support tasks in a flexible, efficient and
user-friendly way.</p>
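      <p>To give an intuition of how a tree built from the log can act as an index, the following Python sketch uses a plain prefix tree as a simplified stand-in for the log-tree. This is an assumption of the example, not the construction algorithm of the thesis: it ignores the frequency information and the execution contexts that the actual algorithm exploits, and the names Node, build_prefix_tree and retrieve_by_prefix are hypothetical. It does share one property with the log-tree, namely that every path from the root corresponds to behavior actually recorded in the log.</p>
      <preformat>
```python
class Node:
    """One activity occurrence in a prefix tree built from an event log."""

    def __init__(self, activity=None):
        self.activity = activity
        self.children = {}   # next activity -> child Node
        self.trace_ids = []  # ids of all traces passing through this node
        self.end_ids = []    # ids of the traces that end exactly here


def build_prefix_tree(log):
    """Insert every trace of the log; shared prefixes share one branch."""
    root = Node()
    for trace_id, trace in enumerate(log):
        node = root
        node.trace_ids.append(trace_id)
        for activity in trace:
            if activity not in node.children:
                node.children[activity] = Node(activity)
            node = node.children[activity]
            node.trace_ids.append(trace_id)
        node.end_ids.append(trace_id)
    return root


def retrieve_by_prefix(root, running_instance):
    """Return the ids of recorded traces that start like the running instance."""
    node = root
    for activity in running_instance:
        node = node.children.get(activity)
        if node is None:
            return []  # no recorded trace starts this way
    return list(node.trace_ids)


# Hypothetical event log, for illustration only
log = [
    ("admission", "triage", "x-ray", "discharge"),
    ("admission", "triage", "surgery", "discharge"),
    ("admission", "discharge"),
]
tree = build_prefix_tree(log)
similar = retrieve_by_prefix(tree, ("admission", "triage"))  # ids 0 and 1
```
      </preformat>
      <p>In this sketch, navigation follows one branch per activity of the running instance, so the cost of locating the matching traces depends on the length of the query rather than on the number of traces in the log.</p>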
      <p>It is worth noting that our approach, besides retrieving traces from
the log-tree, also supports the retrieval of paths from a generic (more abstract)
process model (a graph).</p>
      <p>1.3 Interactive process model abstraction</p>
      <p>
The log-tree is already a process model, which guarantees maximal
precision (i.e., it does not represent any behavior that was not recorded in the
traces). However, in some cases, it may be useful to have a more general
process model that abstracts from negligible details, at the cost of some
loss of precision. The log-tree can therefore be seen as a starting point to generate
a more abstract process model in an interactive working session, where the
user is always allowed to inspect the current output, and possibly
backtrack to the previous step. The ability to retrieve specific traces/paths
in the model, corresponding to properties/situations of interest, is very
useful in supporting the abstraction process, by suggesting portions of
the model that could be merged. To the best of our knowledge, path
retrieval and merging have never been described in the business process
management literature, possibly with the exception of the Fuzzy Miner
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which provides a functionality to cluster (merge) actions into macro
actions. Moreover, no contribution in the literature provides a unifying
framework, where a suite of different facilities is properly integrated as in
our work, to support process mining at different levels of abstraction, and
efficient trace/path retrieval (also responding to abstract query patterns).
      </p>
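      <p>The interactive loop described above, in which actions are merged into macro actions and the user may backtrack to the previous step, can be sketched as follows in Python. For brevity the sketch abstracts the traces themselves rather than a process model graph, and the class AbstractionSession, its methods and the activity names are hypothetical choices made for this example, not part of the tool.</p>
      <preformat>
```python
class AbstractionSession:
    """Keeps a history of abstraction steps so that the user can backtrack."""

    def __init__(self, traces):
        self.history = [[tuple(trace) for trace in traces]]

    @property
    def current(self):
        return self.history[-1]

    def merge(self, activities, macro):
        """Replace each consecutive run of the given activities by one macro action."""
        activities = set(activities)
        abstracted = []
        for trace in self.current:
            out = []
            for activity in trace:
                label = macro if activity in activities else activity
                # collapse a run of merged activities into a single macro step
                if label == macro and out and out[-1] == macro:
                    continue
                out.append(label)
            abstracted.append(tuple(out))
        self.history.append(abstracted)
        return abstracted

    def backtrack(self):
        """Undo the last abstraction step, if there is one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current


# Hypothetical traces, for illustration only
original = [
    ("admission", "x-ray", "blood test", "surgery", "discharge"),
    ("admission", "blood test", "discharge"),
]
session = AbstractionSession(original)
merged = session.merge({"x-ray", "blood test"}, "diagnostics")
restored = session.backtrack()
```
      </preformat>
      <p>Keeping the whole history, rather than only the latest result, is what lets the user inspect the output of a step and return to the previous one if the abstraction turns out to be too coarse.</p>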
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. <uri>http://www.win.tue.nl/ieeetfpm</uri>. IEEE Taskforce on Process Mining:
          <article-title>Process Mining Manifesto</article-title> (last accessed on 4/11/
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , <source>Process Mining: Discovery, Conformance and Enhancement of Business Processes</source>. Springer Publishing Company, Incorporated, 1st ed.,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>W.</given-names>
            <surname>Scacchi</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Mi</surname>
          </string-name>
          , "
          <article-title>Process life cycle engineering: A knowledge-based approach and environment</article-title>
          ,"
          <source>Int. Syst. in Accounting, Finance and Management</source>
          , vol.
          <volume>6</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>107</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Weijters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. A.</given-names>
            <surname>de Medeiros</surname>
          </string-name>
          ,
          <article-title>Process Mining with the Heuristic Miner Algorithm</article-title>
          , WP 166. Eindhoven University of Technology, Eindhoven,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Günther</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          , "
          <article-title>Fuzzy Mining: Adaptive Process Simplification Based on Multi-perspective Metrics</article-title>
          ," in
          <source>Proceedings of the 5th International Conference on Business Process Management, BPM'07</source>
          , (Brisbane, Australia), pp.
          <fpage>328</fpage>
          -
          <lpage>343</lpage>
          , Springer-Verlag,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>A. K. A. D.</given-names>
            <surname>Medeiros</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. J. M. M.</given-names>
            <surname>Weijters</surname>
          </string-name>
          , "
          <article-title>Genetic process mining</article-title>
          ," in
          <source>Applications and Theory of Petri Nets 2005</source>
          , volume
          <volume>3536</volume>
          of Lecture Notes in Computer Science, pp.
          <fpage>48</fpage>
          -
          <lpage>69</lpage>
          , Springer-Verlag,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Aamodt</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Plaza</surname>
          </string-name>
          , "
          <article-title>Case-based reasoning: foundational issues, methodological variations and systems approaches</article-title>
          ,"
          <source>AI Communications</source>
          , vol.
          <volume>7</volume>
          , pp.
          <fpage>39</fpage>
          -
          <lpage>59</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cordier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lefevre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-A.</given-names>
            <surname>Champin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Georgeon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mille</surname>
          </string-name>
          , "
          <article-title>Trace-Based Reasoning - Modeling interaction traces for reasoning on experiences</article-title>
          ," in
          <source>The 26th International FLAIRS Conference</source>
          , May
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>R.</given-names>
            <surname>Zarka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cordier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Egyed-Zsigmond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lamontagne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mille</surname>
          </string-name>
          , "
          <article-title>Similarity Measures to Compare Episodes in Modeled Traces</article-title>
          ," in
          <source>International Case-Based Reasoning Conference (ICCBR 2013)</source>
          (Springer, ed.),
          <source>Lecture Notes in Computer Science</source>
          , pp.
          <fpage>358</fpage>
          -
          <lpage>372</lpage>
          , Springer Berlin Heidelberg, July
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>