<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Meta Model for Process Mining Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>B.F. van Dongen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>W.M.P. van der Aalst</string-name>
          <email>w.m.p.v.d.aalst@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Technology Management, Eindhoven University of Technology</institution>
          <addr-line>P.O. Box 513, NL-5600 MB, Eindhoven</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Modern process-aware information systems store detailed information about processes as they are being executed. This kind of information can be used for very different purposes. The term process mining refers to the techniques and tools used to extract knowledge (e.g., in the form of models) from this information. Several key players in this area have developed sophisticated process mining tools, such as Aris PPM and the HP Business Cockpit, that are capable of using the available information to generate meaningful insights. What most of these commercial process mining tools have in common is that installing and maintaining them requires enormous effort and deep knowledge of the underlying information system. Moreover, information systems log events in different ways. Therefore, the interface between process-aware information systems and process mining tools is far from trivial. It is vital to correctly map and interpret event logs recorded by the underlying information systems. Therefore, we propose a meta model for event logs. We give the requirements for the data that should be available, both informally and formally. Furthermore, we back our meta model up with an XML format called MXML and a tooling framework that is capable of reading MXML files. Although the approach presented in this paper is very pragmatic, it can be seen as a first step towards an ontological analysis of process mining data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>[...] (i) each event refers to an activity (i.e., a well-defined step in the process), (ii) each
event refers to a case (i.e., a process instance), and (iii) events are totally ordered.
Furthermore, all kinds of system-specific data elements can be present in these
event logs. This immediately shows one of the biggest challenges in process
mining research: each information system has its own internal data
structure and its own language to describe that structure. When using
event logs from different systems for process mining, we need to be
able to present the logs in a standardized way, i.e., we need a good
description of such a log. Furthermore, for each information system, a mapping
onto that description has to be provided. In other words, we need a meta model
for process mining. In this paper, we take a first step towards such a process
mining meta model.</p>
      <p>
        Few of the meta models described in literature (e.g., [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]) focus on process
mining. The work of Zur Muehlen [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is closest to the results reported in this
paper. However, our work is more pragmatic and driven by concrete tools and
systems. The application of ProM, our process mining platform, provides us with
insights that are valuable for people using BAM, BPI, and other process mining
tools.
      </p>
      <p>
        In practice, the mapping of event logs from one system to the standard format
is a non-trivial task. It requires the mapping of one meta model onto another.
It can be seen as a form of ontological analysis in the spirit of [
        <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
        ]. Instead
of the Bunge-Wand-Weber ontology, we use a meta model that can be seen
as a starting point for an ontology for process mining. Similar to the work in
[
        <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
        ], the "ontological goodness" and "ontological weaknesses" of process-aware
information systems with respect to event logs can be analyzed by comparing
the different meta models.
      </p>
      <p>The remainder of this paper is organized as follows. In Section 2, we
introduce process-aware information systems and provide a high-level classification
thereof. Then, in Section 3, we introduce an XML format (MXML) for storing
event logs. In Section 4, we introduce the process mining framework that can
work with MXML files. In Section 5, we show an example of a mapping
from the meta model of a widely used information system (Staffware) to
our meta model, and we show how this can be translated into a mapping of the log
to MXML. In that section, we also provide an ontological analysis of the logging
facilities of Staffware. In Section 6, we touch on some of the issues related to these
mappings for other information systems. Finally, we discuss related work and
conclude the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Process-aware information systems</title>
      <p>Process-aware information systems are widely used in practice (cf. ERP, WFM,
CRM, PDM systems). At the basis of most of these systems lies a process model
of some kind. However, the way in which the handling of cases is enforced differs
from system to system. On the one hand, there are systems that enforce a given process
description on all users, while other systems only provide an easy way of
handling access to files. As a result, information systems are used in
very diverse organizations and with all kinds of expectations. Even though each
system has its individual advantages and disadvantages, these systems can be
divided into several groups. In Figure 1, we give four types of information systems
and position them with respect to the structure of the process that is dealt with
and whether they are data driven or process driven. In Figure 2, we give the trade-offs
that are made for each of these four types of systems with respect to flexibility,
support, performance and design effort.</p>
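      <p>[Figure 1: four types of information systems (groupware, ad-hoc workflow, production workflow, case handling), positioned by process structure (unstructured, ad-hoc structured, implicitly structured, explicitly structured) and by data-driven versus process-driven. Figure 2: trade-offs, from high to low, in flexibility, support, performance and design effort.]</p>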
      <p>Production workflow systems such as Staffware are typically used
in organizations where processes are highly standardized and volumes are big
(i.e., a lot of cases are to be dealt with in parallel). These systems not only handle
data, but also enforce that a certain process definition is followed to the letter. Case
handling systems such as FLOWer, on the other hand, are typically used in
environments where people have a good understanding of the complete process. This
allows these so-called "knowledge workers" to handle cases with more flexibility.
In the end, however, the case handling system keeps structure in both the data
involved and the steps required. Ad-hoc workflow systems such as InConcert
allow users to deviate completely from given processes. Process
definitions are still provided, but not enforced at the execution level. They merely
serve as reference models. The final category of systems, i.e., groupware, is the
most flexible one. Systems such as Lotus Notes provide a structured way to store
and retrieve data, but no processes are defined at all.</p>
      <p>Because each information system serves a different purpose and
is used in very different organizations, the systems differ
in their internal data warehousing. In this paper, we are
interested in the event logs that can be generated by process-aware information
systems. Since the information in an event log depends heavily on the internal
data representation of each individual system, it is safe to assume that each
system provides information in its own way. Therefore, we need to provide a
standard for the information we need, and mappings from each system to this
standard. For this, we introduce MXML.</p>
    </sec>
    <sec id="sec-3">
      <title>XML mining format MXML</title>
      <p>As we stated in the introduction, there is a minimal amount of information that
needs to be present in order to do process mining. In this section, we first give
some requirements with respect to this information. From these requirements,
we derive a meta model in terms of a UML class diagram. Then, we introduce a
formal XML definition for event logs, called MXML, to support this meta model.
We conclude the section with an example of an MXML file.</p>
      <sec id="sec-3-1">
        <title>Requirements</title>
        <p>All process-aware information systems have one thing in common, namely the
process specification. For groupware systems, such a specification is nothing more
than an unstructured set of possible activities, while for production workflows this
specification is extremely detailed. For process mining, log files of such systems
are needed as a starting point. First, we give the requirements for the information
needed.</p>
        <p>To distinguish between events that took place and events that were logged, we
refer to the latter as audit trail entries from here on. When events are logged
in some information system, they need to meet the following requirements
in order to be useful in the context of process mining:</p>
        <list list-type="order">
          <list-item><p>Each audit trail entry should be an event that happened at a given point in time. It should not refer to a period of time. For example, starting to work on some work-item in a workflow system would be an event, as would finishing the work-item. The process of working on the work-item itself is not.</p></list-item>
          <list-item><p>Each audit trail entry should refer to one activity only, and activities should be uniquely identifiable.</p></list-item>
          <list-item><p>Each audit trail entry should contain a description of the event that happened with respect to the activity. For example, the activity was started or completed.</p></list-item>
          <list-item><p>Each audit trail entry should refer to a specific process instance (case). We need to know, for example, for which invoice the payment activity was started.</p></list-item>
          <list-item><p>Each process instance should belong to a specific process.</p></list-item>
        </list>
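        <p>As a minimal sketch of how these requirements could be checked mechanically, the Python fragment below models one audit trail entry and reports which requirements it fails. The class name, the field names and the function <monospace>violations</monospace> are illustrative assumptions and not part of MXML or ProM. The uniqueness of activity identifiers (second half of requirement 2) concerns the log as a whole and is not checked here.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class AuditTrailEntry:
    """One logged event; all field names are illustrative assumptions."""
    process: Optional[str]         # requirement 5: process the case belongs to
    case: Optional[str]            # requirement 4: process instance
    activity: Optional[str]        # requirement 2: exactly one activity
    event_type: Optional[str]      # requirement 3: e.g. "start" or "complete"
    timestamp: Optional[datetime]  # requirement 1: a single point in time


def violations(entry: AuditTrailEntry) -> List[int]:
    """Return the numbers of the requirements that the entry fails to meet."""
    failed = []
    if not isinstance(entry.timestamp, datetime):
        failed.append(1)
    if not entry.activity:
        failed.append(2)
    if not entry.event_type:
        failed.append(3)
    if not entry.case:
        failed.append(4)
    if not entry.process:
        failed.append(5)
    return failed
```
]]></preformat>
        <p>An entry that lacks a case identifier, for instance, would be reported as failing requirement 4, which is the situation described later for database-level logs.</p>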
        <p>Using the requirements given above, we are able to make a meta model of
the information that should be provided for process mining.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Mining meta model</title>
        <p>From the requirements given in the previous section, we derive the UML class
diagram of Figure 3. Note that we use the term "Workflow Model Element"
instead of activity, and "Workflow log" instead of event log. This is done for
historic reasons.</p>
        <p>As we stated in the requirements, each audit trail entry contains a description
of the event that generated it. In order to be able to talk about these events in a
standardized way, we developed a transactional model that shows the events that
we assume can appear in a log. This model is based on an analysis of the different
types of logs in real-life systems (e.g., Staffware, SAP, FLOWer, etc.). Figure 4
shows this transactional model.</p>
        <p>When an activity (or Workflow Model Element) is created, it is either
"scheduled" or skipped automatically ("autoskip"). Scheduling an activity means that
control over that activity is handed to the information system. The information
system can now "assign" this activity to a certain person or group of persons. It
is possible to "reassign" an assigned activity to another person or group of
persons. This can be done by the system or by a user. A user can "start" working
on an activity that was assigned to him or her, or some user can decide to "withdraw"
the activity or skip it manually ("manualskip"), which can even happen before
the activity was assigned. The main difference between a withdrawal and a
manual skip is that after a manual skip the activity counts as having been executed
correctly, while after a withdrawal it does not. The user who started an activity
can "suspend" and "resume" the activity several times, but in the end he or
she has to either "complete" or abort ("ate_abort") it. Note that the activity can get
aborted ("pi_abort") at any point during its life cycle. Since we cannot claim to
have captured all possible behavior of all systems, the MXML format has to allow for
user-defined events.</p>
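        <p>The life cycle described above can be sketched as a small state machine. The transition table below is one reading of the prose only, not a copy of Figure 4: the state names are invented for illustration, and it is an assumption that "ate_abort" is only possible while the activity is being worked on.</p>
        <preformat><![CDATA[
```python
# Transitions as read from the prose description (state names are hypothetical).
TRANSITIONS = {
    ("new", "schedule"): "scheduled",
    ("new", "autoskip"): "skipped",
    ("scheduled", "assign"): "assigned",
    ("scheduled", "withdraw"): "withdrawn",
    ("scheduled", "manualskip"): "skipped",
    ("assigned", "reassign"): "assigned",
    ("assigned", "start"): "running",
    ("assigned", "withdraw"): "withdrawn",
    ("assigned", "manualskip"): "skipped",
    ("running", "suspend"): "suspended",
    ("running", "complete"): "completed",
    ("running", "ate_abort"): "aborted",
    ("suspended", "resume"): "running",
}
FINAL_STATES = {"skipped", "withdrawn", "completed", "aborted"}


def replay(event_types):
    """Replay the event types of one activity and return the state reached.

    Raises ValueError on the first event that the table does not allow.
    """
    state = "new"
    for event_type in event_types:
        if event_type == "pi_abort" and state not in FINAL_STATES:
            state = "aborted"  # possible at any point of the life cycle
            continue
        try:
            state = TRANSITIONS[(state, event_type)]
        except KeyError:
            raise ValueError(
                f"'{event_type}' is not allowed in state '{state}'"
            ) from None
    return state
```
]]></preformat>
        <p>Replaying the event types of one activity in log order then shows whether a log respects the assumed life cycle. User-defined events would need an extra rule, which this sketch leaves out.</p>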
      </sec>
      <sec id="sec-3-3">
        <title>XML Structure</title>
        <p>Using the meta model of Figure 3, we can easily derive an XML format for
storing event logs. In Figure 5, a schema definition is given for the format that
is used by our process mining framework ProM.</p>
        <p>Most of the elements in the XML schema can be found in the meta model
and speak for themselves. There are, however, two exceptions. First, there is
the "Data" element. This element allows for storing arbitrary textual data and
contains a list of "Attribute" elements. On every level, it can be used to store
information about the environment in which the log was created. Second, there
is the "Source" element. This element can be used to store information about
the information system the log originated from. It can itself contain a "Data"
element to store information about the information system, for example
configuration settings.</p>
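        <p>To make the role of the "Data" and "Source" elements concrete, the following sketch builds a log skeleton with Python's standard <monospace>xml.etree.ElementTree</monospace> module. The element and attribute names "Source", "Process", "Data" and "Attribute" are taken from the text and the example in this paper. The root element name "WorkflowLog" and the helper names are assumptions made for illustration, since the schema of Figure 5 is not reproduced here.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET


def add_data(parent, attributes):
    """Attach a Data element holding one Attribute child per key/value pair."""
    data = ET.SubElement(parent, "Data")
    for name, value in attributes.items():
        ET.SubElement(data, "Attribute", name=name).text = str(value)
    return data


def build_log(program, source_attributes, process_id, process_attributes):
    """Build a log skeleton: a Source and one Process, each with a Data element."""
    log = ET.Element("WorkflowLog")  # root element name is an assumption
    source = ET.SubElement(log, "Source", program=program)
    add_data(source, source_attributes)
    process = ET.SubElement(log, "Process", id=process_id)
    add_data(process, process_attributes)
    return log


log = build_log("staffware", {"version": "7.0"},
                "main_process", {"description": "complaints handling"})
xml_text = ET.tostring(log, encoding="unicode")
```
]]></preformat>
        <p>Because "Data" is just a list of named text values, the same helper can be reused at every level of the log.</p>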
      </sec>
      <sec id="sec-3-4">
        <title>Example</title>
        <p>We conclude the section about the MXML format with an example of an XML
log file. This example log is the first part of a translation of a Staffware log to
the MXML format (without the standard headers). In Section 5, we introduce
this translation in more detail. The example shows two audit trail entries in a complaints
handling process.</p>
        <preformat>&lt;Source program="staffware"&gt;
  &lt;Data&gt;
    &lt;Attribute name="version"&gt;7.0&lt;/Attribute&gt;
  &lt;/Data&gt;
&lt;/Source&gt;
&lt;Process id="main_process"&gt;
  &lt;Data&gt;
    &lt;Attribute name="description"&gt;complaints handling&lt;/Attribute&gt;
  &lt;/Data&gt;
  &lt;ProcessInstance id="Case 1"&gt;
    &lt;AuditTrailEntry&gt;
      &lt;WorkflowModelElement&gt;Case start&lt;/WorkflowModelElement&gt;
      &lt;EventType unknowntype="case_event"&gt;unknown&lt;/EventType&gt;
      &lt;Timestamp&gt;2002-04-16T11:06:00.000+01:00&lt;/Timestamp&gt;
    &lt;/AuditTrailEntry&gt;
    &lt;AuditTrailEntry&gt;
      &lt;WorkflowModelElement&gt;Register complaint&lt;/WorkflowModelElement&gt;
      &lt;EventType&gt;schedule&lt;/EventType&gt;
      &lt;Timestamp&gt;2002-04-16T11:16:00.000+01:00&lt;/Timestamp&gt;
      &lt;originator&gt;jvluin@staffw&lt;/originator&gt;
    &lt;/AuditTrailEntry&gt;</preformat>
        <p>
          [...] up by a good tool. For this, the ProM framework [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] has been developed. The
ProM framework is a "pluggable" environment for process mining. It allows for
interaction between a large number of so-called plug-ins. A plug-in is basically
the implementation of an algorithm that is of some use in the process mining
area, where the implementation conforms to the framework. When dealing with
log files, the framework requires them to be in the MXML format.<sup>1</sup>
        </p>
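        <p>As a rough illustration of what reading a log in this format involves, the sketch below parses a small MXML fragment and yields one tuple per audit trail entry. It is not ProM's implementation. The fragment mirrors the example above, with closing tags and an assumed "WorkflowLog" root element added only so that it parses. The function name <monospace>read_entries</monospace> is hypothetical.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Closing tags and the WorkflowLog root are added here to make the
# fragment well-formed; both are assumptions, not part of the example.
MXML = """<WorkflowLog>
  <Process id="main_process">
    <ProcessInstance id="Case 1">
      <AuditTrailEntry>
        <WorkflowModelElement>Case start</WorkflowModelElement>
        <EventType unknowntype="case_event">unknown</EventType>
        <Timestamp>2002-04-16T11:06:00.000+01:00</Timestamp>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Register complaint</WorkflowModelElement>
        <EventType>schedule</EventType>
        <Timestamp>2002-04-16T11:16:00.000+01:00</Timestamp>
        <originator>jvluin@staffw</originator>
      </AuditTrailEntry>
    </ProcessInstance>
  </Process>
</WorkflowLog>"""


def read_entries(xml_text):
    """Yield (process, case, activity, event type, timestamp, originator)."""
    root = ET.fromstring(xml_text)
    for process in root.iter("Process"):
        for instance in process.iter("ProcessInstance"):
            for entry in instance.iter("AuditTrailEntry"):
                event = entry.find("EventType")
                # User-defined events carry their name in "unknowntype".
                if event.text == "unknown":
                    event_type = event.get("unknowntype")
                else:
                    event_type = event.text
                yield (process.get("id"), instance.get("id"),
                       entry.findtext("WorkflowModelElement"), event_type,
                       entry.findtext("Timestamp"),
                       entry.findtext("originator"))


entries = list(read_entries(MXML))
```
]]></preformat>
        <p>Each tuple carries the process, the case, the activity and the event type, which is exactly the information demanded by the requirements of Section 3.1.</p>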
        <p>
          The ProM framework can read log files in the MXML format. Through the
Import plug-ins a wide variety of models can be loaded ranging from a Petri net to
LTL formulas. The Mining plug-ins do the actual mining and the result is stored
as a Frame. These frames can be used for visualization, e.g., displaying a Petri
net [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], an EPC [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] or a Social network [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], or further analysis or conversion.
The Analysis plug-ins take a mining result and analyze it, e.g., calculating a
place invariant for a resulting Petri net. The Conversion plug-ins take a mining
result and transform it into another format, e.g., transforming an EPC into a
Petri net.
        </p>
        <p>Using the ProM framework, we have seen promising results with respect to
the applicability of process mining in real business environments. In the next
section, we present a small case study thereof.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Case: Staffware logs to MXML</title>
      <p>In this section, we show that it is possible to actually extract MXML log files
from commercial systems, since this greatly improves the practical applicability
of the ProM framework. As a case study, we show that this is possible for a
commercial workflow system called TIBCO Staffware Process Suite (Staffware
for short). Through this example, we show why we call our model from Figure 3
a meta model. We consider the data model used by Staffware as an instantiation
of the meta model. The Staffware log file should then be seen as an instantiation
of the Staffware data model. By studying the mapping between the Staffware
data model and our meta model, we give in Section 5.2 a translation from the
Staffware log file to the MXML file.</p>
      <sec id="sec-4-1">
        <title>A Staffware log file</title>
        <p>In Table 2, we show an example of a Staffware log file. It consists of one complete
case of a complaint handling process and a second, incomplete case of the same
process.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Mapping Staffware to MXML</title>
        <p>In order to map Staffware logs to MXML, we first need to map the internal data
model of Staffware onto our model from Figure 3. In Figure 6, we show the
Staffware data model, and we show how the mapping should be done.</p>
        <p><sup>1</sup> For more information about ProM, we refer to http://www.processmining.org.</p>
        <table-wrap>
          <label>Table 2</label>
          <preformat>Case 1
Directive Description  Event         User           yyyy/mm/dd hh:mm
--------------------------------------------------------------------
                       Start         jvluin@staffw  2002/04/16 11:06
Register complaint     Processed To  jvluin@staffw  2002/04/16 11:16
Register complaint     Released By   jvluin@staffw  2002/04/16 11:26
Evaluate complaint     Processed To  jvluin@staffw  2002/04/16 11:36
Evaluate complaint     Released By   jvluin@staffw  2002/04/16 11:46
                       Terminated                   2002/04/16 11:56
Case 2
Directive Description  Event         User           yyyy/mm/dd hh:mm
--------------------------------------------------------------------
                       Start         jvluin@staffw  2002/04/16 12:36
Register complaint     Processed To  jvluin@staffw  2002/04/16 12:46
Register complaint     Expired       jvluin@staffw  2002/04/17 13:07
Register complaint     Withdrawn     jvluin@staffw  2002/04/17 13:07</preformat>
        </table-wrap>
        <p>When this mapping is available, it is almost trivial to map log files to MXML.
Each audit in Staffware contains the logs of only one procedure (or process), of
which each audit trail is described separately. These audit trails can easily be
mapped onto "process instances". Each line of text in the audit file of Table 2
contains four columns. The first column can be mapped onto the "workflow
model element", since its entries refer to the manual steps of which the process is
composed. The third and fourth columns are mapped onto the "originator"
and "timestamp" respectively. In Table 3, we show the mapping for the events
that can appear in the second column of Table 2, and what to do with the
beginning and the end of a case, i.e., the "Start" and "Terminated" events
respectively. The beginning and end of a case are special situations, since they
are lines of text without an associated manual step. Therefore, we have to create
virtual manual steps.</p>
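        <p>The line-by-line mapping described in this section, combined with the event mapping of Table 3, can be sketched as follows. This is an illustration under stated assumptions, not the converter used with ProM: the regular expression assumes that a line ends with a timestamp and contains one of the six Staffware events of Table 3, the names <monospace>EVENT_MAP</monospace> and <monospace>map_line</monospace> are hypothetical, and no time zone is emitted because the Staffware log does not record one.</p>
        <preformat><![CDATA[
```python
import re
from datetime import datetime

# Table 3: Staffware event -> (workflow model element, event type, unknowntype).
# None as element means "take the step name from the log line".
# The values are copied from Table 3, including the virtual steps.
EVENT_MAP = {
    "Start": ("case start", "unknown", "case_event"),
    "Processed To": (None, "schedule", None),
    "Released By": (None, "complete", None),
    "Expired": (None, "unknown", "expire"),
    "Withdrawn": (None, "withdrawn", None),
    "Terminated": ("case end", "unknown", "case_event"),
}

LINE = re.compile(
    r"^(?P<step>.*?)\s*"
    r"(?P<event>" + "|".join(re.escape(e) for e in EVENT_MAP) + r")\s*"
    r"(?P<user>\S+@\S+)?\s*"
    r"(?P<date>\d{4}/\d{2}/\d{2} \d{2}:\d{2})$"
)


def map_line(line):
    """Translate one Staffware audit line into a dict of MXML field values."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised audit line: {line!r}")
    element, event_type, unknown_type = EVENT_MAP[match["event"]]
    when = datetime.strptime(match["date"], "%Y/%m/%d %H:%M")
    return {
        "WorkflowModelElement": element or match["step"],
        "EventType": event_type,
        "unknowntype": unknown_type,
        # The log carries no time zone, so none is emitted here.
        "Timestamp": when.isoformat(timespec="milliseconds"),
        "originator": match["user"],
    }
```
]]></preformat>
        <p>The "Start" and "Terminated" lines have no manual step of their own, so the sketch substitutes the virtual steps from Table 3 for them.</p>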
        <table-wrap>
          <label>Table 3</label>
          <table>
            <thead>
              <tr><th>Staffware event</th><th>ProM workflowmodelelement</th><th>ProM eventtype</th></tr>
            </thead>
            <tbody>
              <tr><td>Start</td><td>"case start"</td><td>unknown: case_event</td></tr>
              <tr><td>Processed To</td><td>as in 2nd column</td><td>schedule</td></tr>
              <tr><td>Released By</td><td>as in 2nd column</td><td>complete</td></tr>
              <tr><td>Expired</td><td>as in 2nd column</td><td>unknown: expire</td></tr>
              <tr><td>Withdrawn</td><td>as in 2nd column</td><td>withdrawn</td></tr>
              <tr><td>Terminated</td><td>"case end"</td><td>unknown: case_event</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>
          In the spirit of [
          <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
          ], we can conduct an ontological analysis of the Staffware
logging capabilities. First, we check for Ontological Incompleteness, also named
Construct Deficit, which exists "unless there is at least one grammatical
construct for each ontological construct" [
          <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
          ]. Staffware is quite complete. The
most important construct deficit is the absence of an event type comparable to
"start" (e.g., a worker who picks up a work-item). As a result, it is impossible
to distinguish "waiting" and "service" times. Second, we check for Ontological
Clarity. This is determined by the extent to which the grammar does not
exhibit one or more of the following deficiencies: (1) Construct Overload exists in
a grammar if one grammatical construct represents more than one ontological
construct, (2) Construct Redundancy exists if more than one grammatical
construct represents the same ontological construct, and (3) Construct Excess exists in
a grammar when a grammatical construct is present that does not map to any
ontological construct [
          <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
          ]. The fact that Staffware uses a start (end) event
to record the creation (completion) of a case can be seen as construct overload.
There is neither construct redundancy nor construct excess. So overall, the
ontological analysis is quite positive. For Staffware, such an analysis may seem trivial.
However, for systems that support a completely different way of logging, the
ontological analysis is less straightforward.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>Meeting the requirements</title>
        <p>In Section 3.1, we gave five requirements for log files in order for them to be
useful for process mining. In this section, we discuss whether these requirements
are met for Staffware, taking the mapping we established and the ontological
analysis into account.</p>
        <list list-type="order">
          <list-item><p>Each audit trail entry refers to a specific point in time. In the Staffware log, each line is an audit trail entry, and since each line has a timestamp, this requirement is met.</p></list-item>
          <list-item><p>Each audit trail entry refers to one activity. In the Staffware log, the activities are described by the first column. However, there are two problems. First, the start and termination of a case do not have an activity name. Second, we cannot conclude that the activity names are unique. In fact, it is possible to have multiple activities with the same label in the definition of a Staffware process. To still meet this requirement, we assume that the start and end of a case belong to a fictive manual step. Furthermore, we simply assume that activities are uniquely identified by their labels.</p></list-item>
          <list-item><p>The second column contains a reference to the event that actually happened, so this requirement is met.</p></list-item>
          <list-item><p>The audit trail entries are sorted per case, so this requirement is met.</p></list-item>
          <list-item><p>The cases all belong to the same process, since a Staffware log file always contains at most one procedure. Therefore, the last requirement is met as well.</p></list-item>
        </list>
        <p>In this section, we have shown that it is possible to make mappings from log
files of commercial systems to MXML. Due to space limitations, we cannot show
the complete result of the translation. However, Table 1 shows a part of the
converted log file.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Known issues</title>
      <p>In Section 5, we have shown that it is possible to map Staffware logs onto MXML.
Our goal is to come up with such mappings for as many
information systems as possible. So far, we have discovered that for most
production workflow systems, making these mappings is almost trivial. However, for
systems that are more data driven than process driven, these mappings become
extremely difficult. For example, information systems like SAP R/3 and
Peoplesoft are capable of logging almost anything on a database level. However, it
has proven to be impossible to discover the case (process instance) to which an
event belongs from the event log.</p>
      <p>Obviously, the internal data structures of complex information systems like
SAP and Peoplesoft have to be able to link database transactions to cases.
However, the way these are linked is usually implementation and release specific,
which makes it almost impossible to derive generic results. On an
implementation-specific level, however, we have seen promising results in both SAP and
Peoplesoft.</p>
      <p>Another issue that needs to be addressed is the situation where not one
system is providing the event logs, but logs are taken from multiple legacy systems.
This of course requires even more bookkeeping and makes it even harder to
restore relations between events, cases, etc. However, data warehousing techniques
may ease the burden.</p>
    </sec>
    <sec id="sec-6">
      <title>Related work</title>
      <p>
        The MXML format presented in this paper is not the only attempt to give
a formalization of data models for event logging. Several papers focus on the
use of meta models in the context of process-aware information systems [
        <xref ref-type="bibr" rid="ref12 ref9">9, 12</xref>
        ].
Most of these meta models however focus on the functionality of these systems
rather than their ability to record event logs. The ontological analysis of different
languages has been the topic of many papers. For example, in [
        <xref ref-type="bibr" rid="ref11 ref4 ref6">4, 6, 11</xref>
        ] the
Bunge-Wand-Weber ontology is used to compare different languages. Again, such an
analysis focuses on the core functionality rather than logging facilities.
      </p>
      <p>We would like to discuss two related approaches in more detail. The first is
an attempt by the Workflow Management Coalition (WFMC) to standardize
the communication between workflow engines and administration and
monitoring tools. The second is the tool Aris PPM (Process Performance Monitor),
developed by IDS Scheer.</p>
      <sec id="sec-6-1">
        <title>Interface 5 of the WFMC reference model</title>
        <p>
          In the area of workflow management, the Workflow Management Coalition has
developed a reference model for communication between the core of a workflow
system, i.e., the Workflow Engine, and several supporting tools. For this, five
interfaces have been developed, of which Interface 5 is of most interest to us.
It is defined as the interface for communication between the workflow engine
and administration and monitoring tools. Unfortunately, a good standard for
this interface has never been developed. A meta model for this interface was
proposed recently in Section 4, page 175 of [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. This model shows how
information in a log file relates to objects created at runtime and objects created
at build time, but it is too high level to be used as a starting point for process
mining.
        </p>
        <p>
          A well-known tool in the area of process performance monitoring is ARIS PPM
(Process Performance Monitor) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], developed by IDS Scheer. ARIS PPM allows
for the visualization, aggregation, and analysis of process instances expressed in
terms of instance EPCs (i-EPCs). An instance EPC describes the control-flow of
a case and provides a graphical representation of the causal relations
between events within the case. In case of parallelism, there may be different
traces having the same instance EPC. Note that in the presence of parallelism,
two subsequent events do not have to be causally related. ARIS PPM exploits
the advantages of having instance EPCs rather than traces to provide additional
management information, i.e., instances can be visualized and aggregated in
various ways.
        </p>
        <p>Typically, Aris PPM communicates with systems like Staffware and SAP
R/3 using a number of custom-made adapters. These adapters, unfortunately,
can only create instance EPCs if the actual process is known. As a result, it
is very time consuming to build adapters. Moreover, the approaches used only
work in environments where explicit process models are available.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>In this paper, we introduced a standard for storing event logs generated by
process-aware information systems. For this, we provided requirements, a data
model and an XML format called MXML. Furthermore, we have shown an
example of a mapping from an event log of a commercial workflow system to
MXML. In Section 4, we introduced the process mining framework ProM. This
framework accepts event logs in the MXML format, and it enables researchers to
implement new process mining techniques and benefit from each other's ideas
without having to care about the information system that generated the event
logs. Furthermore, by mapping event logs from commercial information
systems to MXML, the applicability of process mining in business environments
greatly improves. However, to establish mappings from the log formats of
different information systems to the MXML format, an in-depth evaluation of a large
enough number of these systems is needed.</p>
      <p>We thank INTEROP for supporting this work, which has been conducted in the
context of the INTEROP work package "Domain Ontologies for
Interoperability". More specifically, the paper touches on the issue of how to deal with
multiple data models (and accompanying ontologies) of logs of information systems,
which fits into Subtask 2 of the work package. Furthermore, the
mapping presented deals in particular with issues stated in Subtask 3 of the work package, i.e.,
semantic mapping of ontologies and data models. Since the work is inspired by a
practical problem, it can be used as a starting point for further work in Subtask 4,
i.e., the investigation of the effectiveness of the use of ontologies for
interoperability. As indicated, an evaluation of more systems (FLOWer, FileNet, IBM
Websphere, etc.) is planned in the context of this subtask.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><given-names>W.M.P.</given-names> <surname>van der Aalst</surname></string-name>
          and
          <string-name><given-names>M.</given-names> <surname>Song</surname></string-name>.
          <article-title>Mining Social Networks: Uncovering Interaction Patterns in Business Processes</article-title>.
          In
          <string-name><given-names>J.</given-names> <surname>Desel</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Pernici</surname></string-name>,
          and
          <string-name><given-names>M.</given-names> <surname>Weske</surname></string-name>,
          editors,
          <source>International Conference on Business Process Management (BPM 2004)</source>,
          volume
          <volume>3080</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>244</fpage>–<lpage>260</lpage>.
          Springer-Verlag, Berlin,
          <year>2004</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.F.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Herbst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Maruster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Schimm</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.J.M.M.</given-names>
            <surname>Weijters</surname>
          </string-name>
          .
          <article-title>Workflow Mining: A Survey of Issues and Approaches</article-title>
          .
          <source>Data and Knowledge Engineering</source>
          ,
          <volume>47</volume>
          (
          <issue>2</issue>
          ):
          <fpage>237</fpage>
          –
          <lpage>267</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.J.M.M.</given-names>
            <surname>Weijters</surname>
          </string-name>
          , editors.
          <source>Process Mining</source>
          , Special Issue of Computers in Industry, Volume
          <volume>53</volume>
          , Number 3. Elsevier Science Publishers, Amsterdam,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>I.</given-names>
            <surname>Davies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Milton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          .
          <article-title>Analyzing and Comparing Ontologies with Meta-Models</article-title>
          . In J. Krogstie,
          <string-name>
            <given-names>T.</given-names>
            <surname>Halpin</surname>
          </string-name>
          , and K. Siau, editors,
          <source>Information Modeling Methods and Methodologies</source>
          , pages
          <fpage>1</fpage>
          –
          <lpage>16</lpage>
          . Idea Group,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>B.F.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.M.M.</given-names>
            <surname>Weijters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.K.A.</given-names>
            <surname>de Medeiros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.M.W.</given-names>
            <surname>Verbeek</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.M.P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          .
          <article-title>The PRoM framework: A new era in process mining tool support</article-title>
          . In accepted tool presentation at
          <source>ATPN 2005</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>P.</given-names>
            <surname>Green</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          .
          <article-title>Integrated Process Modeling: An Ontological Evaluation</article-title>
          .
          <source>Information Systems</source>
          ,
          <volume>25</volume>
          (
          <issue>3</issue>
          ):
          <fpage>73</fpage>
          –
          <lpage>87</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <collab>IDS Scheer</collab>
          .
          <article-title>ARIS Process Performance Manager (ARIS PPM): Measure, Analyze and Optimize Your Business Process Performance (whitepaper)</article-title>
          .
          <source>IDS Scheer</source>
          , Saarbruecken, Germany, http://www.ids-scheer.com,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. G. Keller and T. Teufel. SAP R/3 Process Oriented Implementation. Addison-Wesley, Reading MA,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>M.</given-names>
            <surname>Zur Muehlen</surname>
          </string-name>
          .
          <article-title>Workflow-based Process Controlling: Foundation, Design and Application of workflow-driven Process Information Systems</article-title>
          . Logos, Berlin,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. W. Reisig and G. Rozenberg, editors.
          <source>Lectures on Petri Nets I: Basic Models</source>
          , volume
          <volume>1491</volume>
          of Lecture Notes in Computer Science. Springer-Verlag, Berlin,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Green</surname>
          </string-name>
          .
          <article-title>Developing a meta model for the Bunge-Wand-Weber ontological constructs</article-title>
          .
          <source>Information Systems</source>
          ,
          <volume>27</volume>
          (
          <issue>2</issue>
          ):
          <fpage>75</fpage>
          –
          <lpage>91</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosemann</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zur Muehlen</surname>
          </string-name>
          .
          <article-title>Evaluation of Workflow Management Systems - a Meta Model Approach</article-title>
          .
          <source>Australian Journal of Information Systems</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>103</fpage>
          –
          <lpage>116</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>