<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Carmona</string-name>
          <email>jcarmona@lsi.upc.edu</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Architecture Department</institution>
          ,
          <addr-line>UPC</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Software Department</institution>
          ,
          <addr-line>UPC</addr-line>
        </aff>
      </contrib-group>
      <fpage>175</fpage>
      <lpage>190</lpage>
      <abstract>
        <p>The problem of synthesis of Petri nets from transition systems or languages has many applications, ranging from CAD for VLSI to medical applications, among others. The most common algorithms to accomplish this task are based on the theory of regions. However, one of the problems of such algorithms is its space requirements: for real-life or industrial instances, some of the region-based algorithms cannot handle in memory the internal representation of the input or the exploration lattice required. In this paper, the incremental derivation of a basis of regions and the later partitioned basis exploration is presented, which allows the splitting of large inputs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The introduction of the theory of regions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] in the early nineties enabled a
new area of research that strives for transforming language or state-based
representations into event-based ones. This transformation, known as synthesis, was
initially devoted to derive a Petri net whose reachability graph was bisimilar or
isomorphic to the input transition system. A variant of this problem, known as
mining, has weaker requirements: the language of the derived Petri net must be
a superset (maybe proper) of the language of the input transition system [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
The theory of this paper provides algorithms for mining.
      </p>
      <p>Many research has been carried out since the introduction of regions, specially
of theoretical nature, which has brought a better understanding of the theory [3{
6], and has provided meaningful extensions to make it more general [7{10].</p>
      <p>
        As a consequence of the aforementioned theoretical work, tools based on
the theory of regions started to be available by the end of the nineties [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]
and still new ones are developed nowadays [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. These tools are the outcome
of bridging the gap between the theory and practice, and many of them are
used extensively in the academic domain, whereas few are used in industry. The
reasons for the limited success of these tools in industry might be:
1. The algorithms involved are complex, i.e. in general polynomial with the size
of the input [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], that might be prohibitive for large inputs, and the use of
e cient data structures like BDDs or heuristics only alleviates the problem.
2. No high-level techniques are provided to cope with the inherent complexity
of the problem.
log L1
.
.
      </p>
      <p>.</p>
      <p>PN Nn
TS A</p>
      <p>In this work we provide the theoretical basis for deriving high-level
strategies that may allow to handle large speci cations. More concretely, as space
requirements are typically the bottleneck for some of the tools listed above, in
this paper we present an incremental technique that allows splitting the input
into several smaller parts. Moreover, we show how the theory of regions can be
extended algorithmically to combine the regions of each part in order to derive
the regions of the whole input.</p>
      <p>
        The theory presented will be oriented for the problem of mining: given
a set of objects describing behaviors (like a Petri net, a transition systems
or an event log) O1; O2; : : : ; On, we want to obtain a Petri net N such that
L(N ) Sin=1 L(Oi). The traditional way to solve this problem is to generate
a transition system Ai for each Oi, join all these Ai to create a single
transition system, and then apply a synthesis technique to derive a Petri net from a
transition system [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ].
      </p>
      <p>
        In this paper we explore a di erent approach. Instead of working with a
monolithic transition system, we use the fact that a transition system can be
represented by a basis of regions, such that any other region is a linear
combination of the regions in the basis (a recent publication shows an e cient technique
to accomplish this task [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]). Then, bases can be combined to obtain a region
basis for the whole system, from which we can derive the Petri net N .
      </p>
      <p>The general picture of the approach of this paper is outlined on Fig. 1(a).
Some arcs are labeled with an indication of the section or the reference where the
conversion/operation is explained. Basically the rst step is to convert inputs
that are not transition systems into a transition system, and then compute a
region basis from which a PN can be generated. Another possible application
is also shown in Fig. 1(b), where a large TS is split into smaller subsystems of
manageable size, with the only restriction that all the subsystems must include
the initial state and be connected.</p>
      <sec id="sec-1-1">
        <title>Related work</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] an incremental approach was suggested based on the observation that
any region of the union of two transition systems can be expressed as the union
of regions of those systems. As in the approach presented in this paper, the
approach in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] trades space for time, since it must rst compute all the
regions and then obtain the minimal ones, a slower process than nding directly
the minimal regions if the whole transition system ts into memory. The main
drawback of this method is that the complete set of regions of each component
transition system must be stored (either in memory or disk) in order to compute
the set of regions of the union. In this work we propose a faster methodology
based on the fact that the complete set of regions can be succintly represented
by a basis of regions.
1.2
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Organization</title>
        <p>We start by giving the necessary background in Sect. 2. The process of combining
a set of bases to produce a unique basis for the whole system is explained in
Sect. 3. A description of the generation of a PN from a region basis is given in
Sect. 4 and, nally, Sect. 5 concludes this paper.
2
2.1</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <sec id="sec-2-1">
        <title>Finite Transition Systems and Petri Nets</title>
        <p>De nition 1 (Transition system). A transition system (TS) is a tuple hS; ; T; s0i,
where S is a set of states, is an alphabet of actions, T S S is a set
of (labelled) transitions, and s0 2 S is the initial state.</p>
        <p>We use s !es0 as a shortcut for (s; e; s0) 2 T , and we denote its transitive
closure as !. A state s0 is said to be reachable from state s if s !s0. We
extend the notation to transition sequences, i.e., s1 !sn+1 if = e1 : : : en and
(si; ei; si+1) 2 T . We denote #( ; e) the number of times that event e occurs in
. Let A = hS; ; T; s0i be a TS. We consider connected TSs that satisfy the
following axioms: i) S and are nite sets, ii) every event has an occurrence
and iii) every state is reachable from the initial state. The language of a TS A,
L(A), is the set of transition sequences feasible from the initial state.</p>
        <p>For two TSs A1 and A2, when L(A1) L(A2), we will say that A2 is an
over-approximation of A1.</p>
        <p>De nition 2 (Union of TSs). Given two TSs A1 = hS1; 1; T1; s0i and A1 =
hS2; 2; T2; s0i, the union of A1 and A2 is the TS A1 [ A2 = hS1 [ S2; 1 [
2; T1 [ T2; s0i.</p>
        <p>Clearly, the TS A1 [ A2 is an over-approximation of the TSs A1 and A2, i.e.</p>
        <p>L(A1) [ L(A2) L(A1 [ A2).</p>
      </sec>
      <sec id="sec-2-2">
        <title>De nition 3 (Petri net [17]). A Petri net (PN) is a tuple (P; T; W; M0) where</title>
        <p>the sets P and T represent disjoint nite sets of places and transitions,
respectively, and W : (P T ) [ (T P ) ! N is the weighted ow relation. The initial
marking M0 2 NP de nes the initial state of the system.</p>
        <p>A transition t 2 T is enabled in a marking M if 8p 2 P : M (p) W (p; t).
Firing an enabled transition t in a marking M leads to the marking M 0 de ned by
M 0(p) = M (p) W (p; t) + W (t; p), for p 2 P , and is denoted by M !t M 0. The
set of all markings reachable from the initial marking M0 is called its Reachability
Set. The Reachability Graph of N , denoted RG(N ), is a transition system in
which the set of states is the Reachability Set, the events are the transitions of
t
the net and a transition (M1; t; M2) exists if and only if M1 ! M2. We use L(N )
as a shortcut for L(RG(N )).
2.2</p>
      </sec>
      <sec id="sec-2-3">
        <title>Generalized Regions</title>
        <p>
          The theory of regions [
          <xref ref-type="bibr" rid="ref1 ref5">1, 5</xref>
          ] provides a way to derive a Petri net from a transition
system. Intuitively, a region corresponds to a place in the derived Petri net. In
the initial de nition, a region was de ned as a subset of states of the transition
system satisfying a homogeneous relation with respect to the set of events. Later
extensions [
          <xref ref-type="bibr" rid="ref10 ref18 ref7">7, 18, 10</xref>
          ] generalize this de nition to multisets, which is the notion
used in this paper.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>De nition 4 (Multiset, k-bounded Multiset, Subset). Given a set S, a</title>
        <p>multiset r of S is a mapping r : S ! Z. The number r(s) is called the multiplicity
of s in r. Multiset r is k-bounded if all its multiplicities are less or equal than k.
Multiset r1 is a subset of r2 (r1 r2) if 8s 2 S : r1(s) r2(s).</p>
        <p>We de ne the following operations on multisets:</p>
      </sec>
      <sec id="sec-2-5">
        <title>De nition 5 (Multiset operations).</title>
        <p>Maximum power pow(r) = maxs2S r(s)
Minimum power minp(r) = mins2S r(s)
Scalar product (k r)(s) = k r(s), for k 2 Z
Scalar sum (r + k)(s) = r(s) + k, for k 2 Z
Union (r1 [ r2)(s) = max(r1(s); r2(s))
Sum (r1 + r2)(s) = r1(s) + r2(s)</p>
        <p>Subtraction (r1 r2)(s) = r1(s) r2(s)
The operations described above have algebraic properties, e.g., r + r = 2 r and
r1 k r2 = r1 + ( k) r2.</p>
        <p>De nition 6 (Gradient). Let hS; ; T; s0i be a TS. Given a multiset r and a
transition s !es0 2 T , its gradient is de ned as r(s !es0) = r(s0) r(s). If all
the transitions of an event e 2 have the same gradient, we say that the event
e has constant gradient, whose value is denoted as r(e).
0 s3</p>
        <p>a</p>
        <p>
          Example 1. Fig. 2(a) shows a TS. The numbers within the states correspond to
the multiplicity of the multiset r shown. Multiset r is a region because both
events a and b have constant gradient, i.e. r(a) = 2 and r(b) = 3. There is
a direct correspondence between regions and places of a PN. The gradient of the
region describes the ow relation of the corresponding place, and the multiplicity
of the initial state indicates the number of initial tokens [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Fig. 2(b) shows
the place corresponding to the region shown in Fig. 2(a).
        </p>
        <p>We say that region r is normalized if minp(r) = 0. Similarly, it is
nonnegative if minp(r) 0. Any region r can become normalized by subtracting
minp(r) from the multiplicity of all the states.</p>
        <p>De nition 8 (Normalization). We denote by #r the normalization of a region
r, so that #r = r minp(r).</p>
        <p>It is useful to de ne a normalized version of the sum operation between
regions, since it is closed in the class of normalized regions.</p>
        <p>De nition 9 (Normalized sum). Let r1 and r2 be normalized regions, we
denote by r1 r2 their normalized sum, so that r1 r2 =#(r1 + r2).</p>
      </sec>
      <sec id="sec-2-6">
        <title>De nition 10 (Gradient vector). Let r be a region of a TS whose set of</title>
        <p>events is = fe1; e2; : : : ; eng. The gradient vector of r, denoted as (r), is the
vector of the event gradients, i.e. (r) = ( r(e1); r(e2); : : : ; r(en)).
Proposition 1. Gradient vectors have the following properties:
(r1 + r2) =</p>
        <p>(r + k) =
(r1
a
d
s3</p>
        <p>Regions can be partitioned into classes using their gradient vectors.</p>
      </sec>
      <sec id="sec-2-7">
        <title>De nition 11 (Canonical region). Two regions r1 and r2 are said to be</title>
        <p>equivalent if their gradient is the same, i.e. r1 r2 , (r1) = (r2). Given a
region r, the equivalence class of r, is de ned as [r] = frij ri rg. A canonical
region is the normalized region of an equivalence class, i.e. #r.</p>
        <p>
          Example of canonical regions are provided in Fig. 4, where two TSs are shown
in which some regions have been shadowed. For instance, the canonical region
r0 = fs1; s2g has gradient vector (r0) = (1; 0). A PN built from the set of
minimal canonical regions has the same language as a PN built using all the
regions [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], thus it yields the smallest overapproximation with respect to the
language of the TS [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-8">
        <title>De nition 12 (Subregion, Empty region, Minimal canonical region).</title>
        <p>r1 is a subregion of r2, denoted as r1 v r2, if, for any state s, #r1(s) #r2(s).
We denote by ; the region in which all states have zero multiplicity. A minimal
canonical region r satis es that for any other region r0, if r0 v r then r0 ;.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Combining region bases</title>
      <p>In this section we detail how region bases of di erent TSs can be joined, yielding
a region basis of their union. We will illustrate the theory with the running
example shown in Fig. 3. In this case, we assume A is a very large TS that
cannot be easily handled, hence it is split into two smaller subsystems, namely
A1 and A2, so that A = A1 [ A2.
3.1</p>
      <sec id="sec-3-1">
        <title>Basis of regions</title>
        <p>De nition 13 (Region basis). Given a TS, a region basis B = fr1; r2; : : : ; rng
is a minimal subset of the canonical regions of TS such that any region r can be
expressed as a linear combination of B ( i.e. r = Pn
i=1 ci ri, with ci 2 Q; ri 2 B).
r0 region
r1 region</p>
        <p>
          Theorem 1 ([
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]). Let hS; ; T; s0i be a TS. The size of the region basis is less
or equal to min(j j; jSj 1).
Given two systems described by their region basis, we want to obtain the region
basis of their union TS. The work more closely related is [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], in which it is
described how the set of regions in the joined system can be obtained from the
regions of the component systems. This is achieved by introducing the concept
of compatible (standard) regions. In this section we rst review and extend this
concept of compatibility to generalized regions.
        </p>
        <p>De nition 14 (Compatible TSs). Two TSs A1 and A2 are compatible if they
have the same initial state.</p>
        <p>
          Def. 14 is more general than the one in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], where the number of shared
states is restricted to one (the initial state). For instance, systems A1 and A2 of
Fig. 3 are compatible according to this de nition, but not using the de nition
in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], since they share states s0 and s2.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>De nition 15 (Compatible regions, o set). Two regions r1 and r2 from</title>
        <p>two compatible TSs A1 = hS1; 1; T1; s0i and A2 = hS2; 2; T2; s0i are said to be
compatible if:
{ 8e 2 1\ 2; r1 (e) = r2 (e), i.e. they have the same gradient for all common
events, and,
{ 9k 2 Z : 8s 2 S1 \ S2; r2(s) r1(s) = k, i.e. the di erence in multiplicity
of each shared state is equal to a constant, that we call the o set between r1
and r2, denoted as o (r1; r2).</p>
        <p>Two compatible regions are said to be directly compatible if o (r1; r2) = 0, a
fact that we denote as r1 $ r2. Conversely, if o (r1; r2) 6= 0, we say that the
regions are indirectly compatible and we use the following notation r1 ! r2.</p>
        <p>An immediate consequence of Def. 15 is that, if there is only a single shared
state, then any two regions with the same gradient for all common events are
compatible. This is, for instance, the case when TSs represent execution trees
and only the initial state is shared among them.</p>
        <p>Two compatible regions can be made directly compatible by adding the o set
to one of them.</p>
      </sec>
      <sec id="sec-3-3">
        <title>De nition 16 (Directly compatible region). Given two compatible regions</title>
        <p>r1 and r2 with o (r1; r2) &gt; 0, the directly compatible region of r1 with respect to
r2 is r1"r2 = r1 + o (r1; r2).</p>
        <p>De nition 17 (Union of compatible regions). Given two compatible regions
r1 and r2, de ned over sets of states S1 and S2, respectively, with o (r1; r2) &gt; 0,
their union, denoted r1 t r2, is
(r1 t r2)(s) =
((r1"r2 )(s)
r2(s)
if s 2 S1
otherwise</p>
      </sec>
      <sec id="sec-3-4">
        <title>Proposition 2 (Union of compatible regions is a region of union sys</title>
        <p>tem). Given two compatible regions r1 and r2 of two compatible TSs A1 and
A2, r1 t r2 is a region of A1 [ A2. For each shared event, its gradient is equal
to the gradient in r1 or r2, which are equal. For non-shared events of A1 (A2),
their gradient is the gradient in A1 (A2).</p>
        <p>Proof. Assume o (r1; r2) 0. Region r = r1 t r2, where r1"r2 = r1 + o (r1; r2).
The latter entails that, in a shared state s, the multiplicity is the same for
r1"r2 and r2, which is the multiplicity assigned to r. For non-shared states, on
the other hand, the multiplicity assigned in r is the multiplicity of the region
whose TS contains the state. Thus any arc (either going from a shared to
nonshared state or any other combination) has a constant gradient, equal to the
gradient of that event in the corresponding region. Result transfers to r1 because
(r1) = (r1"r2 ). tu
Example 2. In Fig. 5 we show one region of each subsystem of our running
example. They are compatible, since the gradient of the single shared event a is
the same in both regions. More speci cally, they are indirectly compatible, since
their o set is di erent than 0.
s1
a
b
r
s0
s2
s0
s2
q
a
c
d
s3
s1
a
b
r
s0
s2
s0
s2
c
q"r
a
d
s3</p>
        <sec id="sec-3-4-1">
          <title>Indirectly compatible</title>
          <p>regions (r ! q)</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>Directly compatible</title>
          <p>regions (r $ q "r)</p>
          <p>Property 1. Given two compatible non-negative regions r1 and r2. If they are
directly compatible then r1 t r2 is a normalized region, if and only if, one of
them is normalized. If they are not directly compatible, but both of them are
normalized, then again r1 t r2 is a normalized region.</p>
          <p>Proof. If one of them is normalized and they are directly compatible no modi
cation of multiplicities is performed, so the state with 0 multiplicity will remain
untouched and since regions are non-negative, then minp(r1 t r2) = 0.
Conversely, if r1 t r2 is a normalized region and r1 $ r2, then all multiplicities of
either r1 or r2 are greater or equal to 0, and since there is at least one 0
multiplicity, at least one of them must be normalized. If both are normalized but are
not directly compatible, only one of them modi es its multiplicities and we end
up in the same situation.
3.3</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>Incremental algorithm for obtaining a basis</title>
        <p>Given two TSs, A1 and A2. Let fr1; : : : ; rng be the region basis of A1 and
fq1; : : : ; qmg the basis of A2. The set of all regions of the union system A1 [ A2
whose language satis es L(A1 [ A2) L(A1) [ L(A2) is obtained by nding all
non-trivial solutions for xi variables that satisfy:
8ei 2
1 \
2 :
rj (ei) xj =</p>
        <p>qk (ei) xn+k</p>
        <p>X
1 j n</p>
        <p>X
1 k m
i.e. for all common events their gradient must be the same on both systems.</p>
        <p>This system of equations can be rewritten in matrix form as M x = 0, where
x is the column vector of variables xi and M is the following matrix:
0 r1 (e1) : : : rn (e1)</p>
        <p>B r1 (e2) : : : rn (e2)
M = BB@ ...</p>
        <p>r1 (ec) : : : rn (ec)
q1 (e1) : : :
q1 (e2) : : :
.
.</p>
        <p>.
q1 (ec) : : :
qm (e1)1
qm (e2)C</p>
        <p>C
C</p>
        <p>A
qm (ec)
where c is the number of common events in the system, i.e. c = j 1 \ 2j. We
call this matrix the gradient compatibility matrix between A1 and A2.</p>
        <p>Compatibility of regions demands not only the gradients of common events to
be the same, but also that the o set is the same for all shared states. To enforce
such condition, assume that we shift all the regions in the bases fr1; : : : ; rng and
fq1; : : : ; qmg so that for all ri and qj we have that ri(s0) = qj (s0) = 0. Now any
region obtained by combining the regions in the bases will have a 0 multiplicity
in the initial state. Thus, if the o set is the same in all shared states, their
multiplicity must coincide in all remaining shared states.</p>
        <p>If there is a path between s0 and shared state s in which the same events
(and the same number of times but no matter in which order) are red in both
TSs A1 and A2, then the condition is automatically satis ed. Only in the case
where all paths are di erent, we say that shared state s is in con ict, and then
we must explicitly enforce the equality of the multiplicity of s in both systems.</p>
        <p>This condition can be expressed in matrix form using a row for each shared
state in con ict (di erent than s0 since this state is never in con ict). For a shared
state s in con ict, its corresponding row will be of the form (r1(s) : : : rn(s)
q1(s) : : : qm(s)), where all regions ri and qj have been shifted, as said before, so
that ri(s0) = qj (s0) = 0. Let us name as C the matrix containing such rows, we
will call it the shared state con ict matrix between A1 and A2. Now since the
multiplicity of all these states must be the same, their subtraction must be 0.
Thus, C x = 0.</p>
        <p>Theorem 2. Let A1 and A2 be two compatible TSs with region bases fr1; : : : ; rng
and fq1; : : : ; qmg respectively. Let M be their gradient compatibility matrix, C be
the shared state con ict matrix, and x the column vector of variables x1 to xn+m.
Consider an assignment to the variables in x such that MC x = 0 and some
xi 6= 0. This assignment identi es a region rtq = P1 i n rixi tP1 j m qj xn+j
in A1 [ A2.</p>
        <p>Proof. A non-trivial solution x de nes two compatible regions, namely r =
P1 i n rixi and q = P1 j m qj xn+j , in A1 and A2 respectively. These two
regions are compatible according to De nition 15 if M x = 0, because for
common events the gradient is the same and the o set is the same in all shared
states if C x = 0. By Proposition 2, r t q is a region of A1 [ A2. tu</p>
        <p>So the problem reduces to nding the solutions to a homogeneous linear
system. Note that the system does not require to have solutions in the integer
domain. In fact all the solutions are in Q, since all the gradients are integers.
Homogeneous linear systems have one trivial solution (i.e. 0) and in nite
nontrivial solutions. Let yi be all the non-redundant solution vectors, then any
possible solution of the system can be obtained by linear combination of these
solution vectors, since adding or subtracting 0 from 0 does not change its value.
These yi are a basis of the nullspace of MC , and any solution x can be written
as a unique linear combination x = Pi iyi, with i 2 Q.</p>
        <p>
          There are several well-known methods to obtain a basis for the nullspace [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ],
one of the easiest is to put the matrix in reduced row echelon form, determine the
s0
s2
s0
s2
b0
b0
s0
s2
s0
s2
c
c
b02
a
d
b20
a
d
s3
s3
s1
s1
b11
b11
a
b
a
b
s0
s2
s0
s2
b1
b1
s0
s2
s0
s2
c
c
b21
a
d
b12
a
d
s3
s3
Fig. 6. Region basis fb0; b1g (and their coregions f b0; b1g) for system A1 [ A2.
Regions are partitioned so that b0 = b0 [ b0, where bi0 is the part of b0 in Ai.
        </p>
        <p>1 2
free variables, and then, for each free variable xi, derive a vector of the basis by
assigning 1 to xi and 0 to the rest of free variables. The yi columns correspond
to the combination of regions of both system that have the same gradient, and
the resulting combined region is a region basis for the union TS.
Example 3. We will compute the region basis of the union of TSs A1 and A2
shown in Fig. 3. Two possible region basis for these systems are fr0; r1g and
fq0; q1g, show in Fig. 4. The matrix M in this case is
where columns (from left to right) correspond to regions r0; r1; q0; q1 and there
is only a single row that corresponds to the only shared event between A1 and
A2, namely event a.</p>
        <p>The set of shared states is fs0; s2g, thus the shared state con ict matrix C
has only one row because only state s2 is reachable from s0 by ring a di
erent multiset of events. Consequently C = r0(s2) r1(s2) q0(s2) q1(s2) . With
matrices M and C we can now build</p>
        <p>M
C
=
which is an indeterminate system with 2 degrees of freedom. We can write x1 =
2x2 + x3 and x0 = x2 x3, so by changing the values of variables x2 and x3
(-1,0)
(0,1)
(0,-1)
(1,1) (1,-1) (-1,1) (-1,-1)
we can generate all the possible solutions. Given these parameters the region
solution will be (( x2 x3) r0 + (2x2 + x3) r1) t (x2 q0 + x3 q1). Using the
parameter values (1; 0), and (0; 1) for vector (x2; x3) we do obtain the following
regions in the joined system: b0 = ( r0 + 2r1) t q0 and b1 = ( r0 + r1) t q1. The
set fb0; b1g forms a region basis for the A1 [ A2 system (see Fig. 6).
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Generating a PN from a basis</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] an algorithm was presented that allows nding minimal regions by careful
exploration of the region space de ned by the region basis. The fundamental idea
is that the regions in the basis are initially assumed to be minimal, and then
combinations of these regions can only yield a minimal region if the resulting
region is non-normalized, since otherwise the region is a superregion of any of
its component regions.
      </p>
      <p>However this approach cannot be directly used if the number of states in the
monolithic TS is too high to easily perform region operations in memory. The
alternative we propose in this paper is to partition the region operations into
the di erent component TSs, so that each time only the information of one of
the systems is accessed.</p>
      <p>To achieve this partitioning, consider the region basis fb0; : : : ; bng, we denote
by bij the part of region bi that belongs to subsystem j. All the regions in the
basis are assumed to be normalized, but the di erent bij could be non-normalized,
since they are de ned so that, given i, all bij are directly compatible (cf. Def. 15).
For instance region b0 in Fig. 6 is normalized, however b20 it is not.</p>
      <p>Consequently a combination of the basis P c</p>
      <p>i i bij iis ia nboiny-ineoldrsmaalinzoend-nroegrmioanli(zbedy
region only if, for all subsystem j, P c
Petri Nets &amp; Concurrency { 187
13</p>
      <sec id="sec-4-1">
        <title>Partial order</title>
        <p>in A1
(0,0)</p>
        <p>Partial order</p>
        <p>in A2
(0,0)</p>
        <p>Property 1)3. Thus, the strategy would be to test the basis combinations in
the rst subsystem, keeping only the combinations producing non-normalized
regions. Then only these combinations will be tested in the second subsystem,
discarding the ones that yield a normalized region, and the process will continue
until all subsystems have been checked.</p>
        <p>Finally, with only the surviving combinations, the subregion test will be
performed to guarantee that only the minimal regions are found. Again this test
can be distributed among the subsystems, since Pi ci bi Pi c0i bi if, and only
if, for all subsystem j, Pi ci bij Pi c0i bij .</p>
        <p>We will illustrate all this process using our running example. In Fig. 7 we
can see a tree of combinations of the fb0; b1g basis. Each node is a tuple (c0; c1)
describing the coe cients used to obtain a region r as c0 b0 + c1 b1. The second
level of the tree corresponds to the regions in the basis and their coregions (as
shown in Fig. 6). To bound the search space, assume we arbitrarily x that the
coe cients are only allowed to take values in the set f 1; 0; 1g.</p>
        <p>From the four possible combinations in the third level, two of them yield
already normalized regions in A1, namely combinations (1; 1) and ( 1; 1). On
the other hand, combinations (1; 1) and ( 1; 1) correspond to non-normalized
regions in A1, as shown in the gure. Consequently only these two combinations
will be checked in subsystem A2. In this case, both regions, namely, b20 + ( b12)
and b20 +b21, are also non-normalized in A2. Thus these combinations correspond
to non-normalized regions in A1 [ A2, which means that they might be minimal
regions (once normalized).
3 Since Property 1 only holds for non-negative regions, negative ci coe cients are
treated as markers for summing the normalized coregions of bi. For instance 2b1 3b2
will be actually computed as 2 b1 + 3 #( b2). This way all regions are always
nonnegative and Property 1 can be safely used.</p>
        <p>At this point we must check for minimal regions. The list of candidates
includes all the regions in the basis (and their coregions), that is all the
combinations in the second level, as well as combinations (1; 1) and ( 1; 1) that have
been found during the exploration. To check for minimality, we create a partial
order among all these combinations in each subsystem (see Fig. 8). A region is
not minimal if there is a combination, di erent than (0; 0), that appears before
it in all the partial orders. In our example, combinations (1; 0) and ( 1; 0) are
not minimal. Conversely, for instance ( 1; 1) is minimal because, although
combAi2na(ti.ieo.n (b012; 6 1) bp20re+cebd21e)s. it in A1 (i.e. b11 b10 + b11), it is not longer true in
With the minimal regions found we build the mined PN of Fig. 9.</p>
        <p>For a set of subsystems A1; : : : ; An the algorithm can be summarized as
follows:
{ Take the rst subsystem (A1) and explore region space until either
combinations are exhausted (according to the user de ned bounds on the coe cients)
or the border of non-normalized regions is found.
{ Pass to next subsystem the set of combinations of non-normalized regions
and check for normality in this other subsystem.</p>
        <p>If region is normalized, then discard, but mark sibling combinations to
be explored in the current subsystem (since all sibling combinations in
the rst subsystem will remain to be non-normalized).</p>
        <p>On the other hand, if region is non-normalized and we are not in the last
subsystem, then add it to the list of regions that conform the border of
non-normalized regions that will be passed to next subsystem. If we are
already in the last subsystem, then add the combination to the list of
candidate minimal regions, normalize it, and then check the
normalization status in all subsystems. Sibling combinations are scheduled to be
explored in the rst subsystem whose subpart of the region is normalized.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>
        In this paper we have extended the theory developed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to devise an
incremental algorithm for process mining. Given that space requirements are often the
Petri Nets &amp; Concurrency { 189
15
real bottleneck of some of the region-based techniques in the literature, methods
like the one presented in this paper might represent a crucial step into making
the theory applicable in industrial scenarios.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ehrenfeucht</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rozenberg</surname>
          </string-name>
          , G.:
          <article-title>Partial (Set) 2-Structures. Part I, II</article-title>
          .
          <source>Acta Informatica</source>
          <volume>27</volume>
          (
          <year>1990</year>
          )
          <volume>315</volume>
          {
          <fpage>368</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weijters</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maruster</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Work ow mining: Discovering process models from event logs</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>16</volume>
          (
          <issue>9</issue>
          ) (
          <year>2004</year>
          )
          <volume>1128</volume>
          {
          <fpage>1142</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Badouel</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bernardinello</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darondeau</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Polynomial algorithms for the synthesis of bounded nets</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          <volume>915</volume>
          (
          <year>1995</year>
          )
          <volume>364</volume>
          {
          <fpage>383</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hoogers</surname>
            ,
            <given-names>P.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kleijn</surname>
            ,
            <given-names>H.C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thiagarajan</surname>
            ,
            <given-names>P.S.:</given-names>
          </string-name>
          <article-title>An event structure semantics for general petri nets</article-title>
          .
          <source>Theor. Comput. Sci</source>
          .
          <volume>153</volume>
          (
          <issue>1</issue>
          &amp;2) (
          <year>1996</year>
          )
          <volume>129</volume>
          {
          <fpage>170</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Desel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reisig</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>The synthesis problem of Petri nets</article-title>
          .
          <source>Acta Inf</source>
          .
          <volume>33</volume>
          (
          <issue>4</issue>
          ) (
          <year>1996</year>
          )
          <volume>297</volume>
          {
          <fpage>315</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Badouel</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darondeau</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Theory of regions</article-title>
          . In Reisig, W.,
          <string-name>
            <surname>Rozenberg</surname>
          </string-name>
          , G., eds.: Petri Nets. Volume
          <volume>1491</volume>
          of LNCS., Springer (
          <year>1998</year>
          )
          <volume>529</volume>
          {
          <fpage>586</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mukund</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Petri nets and step transition systems</article-title>
          .
          <source>Int. Journal of Foundations of Computer Science</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ) (
          <year>1992</year>
          )
          <volume>443</volume>
          {
          <fpage>478</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Darondeau</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Deriving unbounded petri nets from formal languages</article-title>
          . In Sangiorgi, D., de Simone, R., eds.
          <source>: CONCUR</source>
          . Volume
          <volume>1466</volume>
          of Lecture Notes in Computer Science., Springer (
          <year>1998</year>
          )
          <volume>533</volume>
          {
          <fpage>548</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Bergenthum</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Desel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lorenz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mauser</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Synthesis of petri nets from nite partial languages</article-title>
          .
          <source>Fundam. Inform</source>
          .
          <volume>88</volume>
          (
          <issue>4</issue>
          ) (
          <year>2008</year>
          )
          <volume>437</volume>
          {
          <fpage>468</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Carmona</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cortadella</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kishinevsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>New region-based algorithms for deriving bounded Petri nets</article-title>
          .
          <source>IEEE Transactions on Computers</source>
          <volume>59</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Caillaud</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Synet : A synthesizer of distributable bounded Petri-nets from nite automata</article-title>
          . http://www.irisa.fr/s4/tools/synet/ (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Cortadella</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kishinevsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavagno</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yakovlev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Deriving Petri nets from nite transition systems</article-title>
          .
          <source>IEEE Trans. on Computers</source>
          <volume>47</volume>
          (
          <issue>8</issue>
          ) (
          <year>1998</year>
          )
          <volume>859</volume>
          {
          <fpage>882</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Carmona</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cortadella</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kishinevsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kondratyev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavagno</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yakovlev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A symbolic algorithm for the synthesis of bounded Petri nets</article-title>
          .
          <source>In: Application and Theory of Petri Nets and Other Models of Concurrency</source>
          . (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Sole</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carmona</surname>
          </string-name>
          , J.:
          <article-title>Process mining from a basis of state regions</article-title>
          .
          <source>In: Application and Theory of Petri Nets and other Models of Concurrency. LNCS</source>
          , Springer (
          <year>2010</year>
          )
          <volume>226</volume>
          {
          <fpage>245</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>van der Aalst</surname>
          </string-name>
          , W.,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verbeek</surname>
            , H., van Dongen,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kindler</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , Gunther, C.:
          <article-title>Process mining: a two-step approach to balance between under tting and over tting</article-title>
          .
          <source>Software and Systems Modeling</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Dongen</surname>
            ,
            <given-names>B.F.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Busi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinna</surname>
          </string-name>
          , G.M.,
          <string-name>
            <surname>van der Aalst</surname>
            ,
            <given-names>W.:</given-names>
          </string-name>
          <article-title>An iterative algorithm for applying the theory of regions in process mining</article-title>
          .
          <source>In: Proceedings of the Workshop on Formal Approaches to Business Processes and Web Services (FABPWS'07)</source>
          . (
          <year>2007</year>
          )
          <volume>36</volume>
          {
          <fpage>55</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Murata</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Petri Nets: Properties, analysis and applications</article-title>
          .
          <source>Proceedings of the IEEE (April</source>
          <year>1989</year>
          )
          <volume>541</volume>
          {
          <fpage>580</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Bernardinello</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michelis</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petruni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vigna</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>On the synchronic structure of transition systems</article-title>
          . In Desel, J., ed.:
          <source>Structures in Concurrency Theory, Proc. of the Int. Workshop on Structures in Concurrency Theory</source>
          .
          <article-title>(</article-title>
          <year>1995</year>
          )
          <volume>69</volume>
          {
          <fpage>84</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Kalman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Basic null space calculations</article-title>
          .
          <source>The College Mathematics Journal</source>
          <volume>15</volume>
          (
          <issue>1</issue>
          ) (
          <year>1984</year>
          )
          <volume>42</volume>
          {
          <fpage>47</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>