=Paper= {{Paper |id=Vol-2756/paper18 |storemode=property |title=Finding an Optimal Label-Splitting to Make a Transition System Petri Net Implementable: a Complete Complexity Characterization |pdfUrl=https://ceur-ws.org/Vol-2756/paper_18.pdf |volume=Vol-2756 |authors=Ronny Tredup |dblpUrl=https://dblp.org/rec/conf/ictcs/Tredup20 }} ==Finding an Optimal Label-Splitting to Make a Transition System Petri Net Implementable: a Complete Complexity Characterization== https://ceur-ws.org/Vol-2756/paper_18.pdf
    Finding an Optimal Label-Splitting to Make a
    Transition System Petri Net Implementable: a
       Complete Complexity Characterization?

                                     Ronny Tredup

        Universität Rostock, Institut für Informatik, Theoretische Informatik,
      Albert-Einstein-Straße 22, 18059, Rostock, ronny.tredup@uni-rostock.de



        Abstract. Petri net synthesis is the task of finding a Petri net N for a
        given transition system (TS) A that implements A, i.e., N is an (exact
        net) realization, a language-simulation or an embedding of A according to
        N ’s degree of accuracy. Regardless of the sought implementation, there
        is not always a solution. Label-splitting converts a non-implementable
        TS A into an implementable one by giving edges with the same label
        now different labels. Since increasing the number of labels increases the
        complexity of the net synthesized, it is desired to keep the number of split
        labels small. This means that label-splitting can be considered as decision
        problem that asks for a given TS A and natural number κ whether there
        is an implementable TS B with at most κ labels that is derived from A
        by splitting labels. Recently, Schlachter and Wimmel (2020) [18] showed
        that this problem is NP-complete if an embedding is sought. In this paper,
        we show that this remains true if A is 2-bounded, i.e., every state has at
        most two incoming and two outgoing edges, and that this bound is tight.
        Schlachter and Wimmel also alleged that label-splitting aiming at exact
        realization is NP-complete. In this paper, we prove this conjecture and
        show that label-splitting aiming at language-simulation or realization is
        NP-complete even if A is 1-bounded.


1     Introduction

Petri net synthesis [2, 3, 5, 6] is the task of finding a Petri net N for a given
transition system (TS) A that implements A, i.e., N is an (exact net) realization,
a language-simulation or an embedding of A according to N ’s degree of accuracy.
    Synthesis of Petri nets has applications in many areas like extracting concur-
rency from sequential specifications like TS and languages [4], process discovery [1],
supervisory control [14] or the synthesis of speed independent circuits [12].
    However, regardless of the implementation sought, there is not always an
implementing Petri net. In this case, Label-splitting [3, 11, 17] is an option, i.e,
converting a non-implementable TS A into an implementable one by giving
edges with the same label now different labels. Label-splitting has been originally
?
    Copyright c 2020 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).
2

introduced in [10] for the synthesis of 1-bounded Petri nets and was extended to
k-bounded Petri nets by the same authors in [9]. The corresponding heuristics
have been implemented in the synthesis tool Petrify [12]. In [8], a novel view
of the label-splitting techniques of [9, 10] has been shown, relating them to some
NP-complete problems like chromatic number and weighted set cover. In process
mining, label-splitting is used to handle imprecise labels of event-logs to allow
better fitting models instead of strongly over-approximating ones [15, 16]. Other
applications of label-splitting can be found in [19], where it supports synthesis to
get concise representations of service compositions, exhibiting concurrency.
    Label-splitting is a powerful transformation that always guarantees an imple-
mentable TS. In fact, an exact implementation is possible at the latest when every
edge of the resulting TS is labeled differently. However, increasing the number of
labels split, increases the complexity of the net derived and yields an over-fitting
model. For applications in process-mining, this prevents users from getting a
better insight about the behavior of the process [16]. Thus, it is desired to keep
the number of labels split as small as possible. This means that label-splitting
can be considered as an optimization problem that consist of converting A into
an implementable TS that has as few labels as possible. Equivalently [18], in this
paper, we understand label-splitting as decision problem that asks for a given TS
A and natural number κ whether there is an implementable TS B with at most
κ labels that is derived from A by splitting labels.
    In [18], it has been shown that label-splitting is NP-complete if an embedding
is sought. The presented reduction can be modified, such that the resulting TS is
3-bounded, i.e., every state has at most three incoming and three outgoing edges.
In this paper, we strengthen this result and show that label-splitting aiming at
embedding remains NP-complete if A is 2-bounded, and that this bound is tight.
    In [18], it is also conjectured that label-splitting for realization is intractable.
In this paper, we prove this conjecture and show also that label-splitting for
language-simulation and (exact) realization is NP-complete even if A is 1-bounded.
    We obtain all of our NP-completeness results by reductions from the vertex-
cover problem on cubic graphs, which will be introduced in Section 3.
    This paper is organized as follows. Section 2 introduces necessary definitions
and supports them with examples. Section 3 and Section 4 present the complexity
of label-splitting aiming at exact realization, language-simulation and embedding,
respectively. Finally, Section 5 briefly closes the paper.


2    Preliminaries

This section introduces all necessary definitions and supports them with examples.
    Transition Systems. A (deterministic) initialized transition system (TS,
for short) A = (S, E, δ, ι) is a directed labeled graph with the set of nodes S
(called states), the set of labels E (called events), the partial transition function
δ : S × E −→ S and the initial state ι ∈ S, where every state s ∈ S is
reachable from ι by a directed labeled path. The set of arcs of A is defined
by ∆A = {(s, e, s0 ) | s, s0 ∈ Se ∈ E : δ(s, e) = s0 }. Event e occurs at state
                                                                                               3

s, denoted by s e , if δ(s, e) is defined. We abridge δ(s, e) = s0 by s e s0 . If
w = e1 . . . en ∈ E ∗ , then s w denotes that there are states s = s0 , . . . , sn ∈ S
such that si ei+1 si+1 ∈ A for all i ∈ {0, . . . , n−1}. The language of A is defined by
L(A) = {w ∈ E + | ι w }∪{ε}. Let b ∈ N. A is called b-bounded if, for every state
s ∈ S, there are at most b incoming or outgoing arcs, i.e., |{e ∈ E | e s}| ≤ b
and |{e ∈ E | s e }| ≤ b. A cycle-free 1-bounded TS A is called linear. If A is
not explicitly defined, then we refer to its components by S(A) (states), E(A)
(events) δA (function), ιA (initial state).
       Simulations. Let A and B be TS with the same set of events E. We say B
simulates A, if there is a mapping ϕ : S(A) → S(B) such that ϕ(ιA ) = ιB and
s e s0 ∈ A implies ϕ(s) e ϕ(s0 ) ∈ B; such a mapping is called a simulation
(between A and B). ϕ is an embedding, denoted by A ,→ B, if it is injective; ϕ
is a language-simulation, denoted by A . B, if ϕ(s) e implies s e , implying
L(A) = L(B) [3, p. 67]; ϕ is an isomorphism, denoted by A ∼             = B, if it is bijective
and δA (s, e) = s0 if and only if δB (ϕ(s), e) = ϕ(s0 ) for all s, s0 ∈ S(A), e ∈ E(A).
       Label-splitting. Let A = (S, E, δ, ι) be a TS and E = {e1 , . . . , en } ⊆ E a
set of events. The label-splitting of e1 , . . . , en into the (pairwise distinct) events
e11 , . . . , em            1            mn
               1 , . . . , en , . . . , en S
                1
                                            , where mi ≥ 2 for all i ∈ {1, . . . , n}, yields the
                                              n
event set E = (E \ E) ∪ i=1 {e1i , . . . , em
                   0                                                            0 0
                                                        i }. A TS B = (S, E , δ , ι) is an E -
                                                          i                                     0

label-splitting of A if there is a bijective mapping ψ : ∆B → ∆A such that
ψ((s, e, s0 )) = (s, ei , s0 ) if e = eji for some i ∈ {1, . . . , n}, j ∈ {1, . . . , mi } and,
otherwise, ψ((s, e, s0 )) = (s, e, s0 ) for all (s, e, s0 ) ∈ ∆B . We say E is the set of
events of A that occur split in B.
       Petri nets. A Petri net N = (P, T, f, M0 ) consists of finite and disjoint sets of
places P and transitions T , a (total) flow function f : ((P ×T )∪(P ×T )) → N and
an initial marking M0 : P → N. A transition t ∈ T can fire or occur in a marking
M : P → N, denoted by M t , if M (p) ≥ f (p, t) for all places p ∈ P . The firing of
t in marking M leads to the marking M 0 (p) = M (p)−f (p, t)+f (t, p) for all p ∈ P ,
denoted by M t M 0 . Again, this notation extends to sequences w ∈ T ∗ and the
reachability set RS(N ) = {M | ∃w ∈ T ∗ : M0 w M } contains all of N ’s reachable
markings. The reachability graph of N is the TS AN = (RS(N ), T, δ, M0 ), where
for every reachable marking M of N and transition t ∈ T with M t M 0 the
transition function δ of AN is defined by δ(M, t) = M 0 .
    The following definitions relate TS and Petri nets via the reachability graph.
    Implementations. A Petri net N is a realization, respectively a language-
simulation, respectively an embedding of a TS A if there is a simulation such that
A∼ = AN , respectively A . AN , respectively A ,→ AN .
    Regions. If a TS A is implementable by a Petri net N , then we want to
construct N purely from A. TS represents the behavior of a modeled system
by means of global states (states of TS) and transitions between them (events).
Dealing with a Petri net, we operate with local states (places) and their changing
(transitions), while the global states of a net are markings, i.e., combinations
of local states. Since AN has to simulate A, N ’s transitions correspond to A’s
4

events. The connection between global states in TS and local states in the sought
net is given by regions of TS that mimic places: A region R = (sup, con, pro)
of A = (S, E, δ, ι) consists of the mappings sup : S → N and con, pro : E → N
such that for edge s e s0 of A it holds that con(e) ≤ sup(s) and sup(s0 ) =
sup(s) − con(e) + pro(e). Notice that R is implicitly completely defined by
sup(ι), con and pro: Since A is reachable, for every state s ∈ S, there is a
path ι e1 . . . en sn such that s = sn . Thus, we inductively obtain sup(si+1 ) by
sup(si+1 ) = sup(si ) − con(ei+1 ) + pro(ei+1 ) for all i ∈ {0, . . . , n − 1} and s0 = ι.
Since we can compute sup and, thus, R purely from sup(ι), con and pro, we
often present regions only implicitly. A region R = (sup, con, pro) models a place
p and the corresponding part of the flow function f , i.e., con(e) models f (p, e)
(the number of tokens that e consumes from p), pro(e, p) models f (e, p) (the
number of tokens that e produces on p) and sup(s) models (the number of tokens)
M (p) (that are on p) in the marking M ∈ RS(N ) that corresponds to s ∈ S(A)
via the simulation between A and AN . Every set R of regions of A defines the
synthesized net NAR = (R, E, f, M0 ) with f (R, e) = con(e), f (e, R) = pro(e) and
M0 (R) = sup(ι) for all R = (sup, con, pro) ∈ R and all e ∈ E.
     State and Event State Separation. To ensure that the input behavior is
captured by the synthesized net, we have to distinguish global states, and prevent
the firings of transitions when their corresponding events are not present in TS.
This is stated as so called separation atoms and separation properties. A pair
(s, s0 ) of distinct states of A defines a states separation atom (SSP atom). A region
R = (sup, con, pro) solves (s, s0 ) if sup(s) 6= sup(s0 ). If every SSP atom of A is
solvable then A has the state separation property (SSP, for short). A pair (e, s)
of event e ∈ E and state s ∈ S where e does not occur, that is ¬s e , defines an
event/state separation atom (ESSP atom). A region R = (sup, con, pro) solves
(e, s) if sup(s) < con(e). If every ESSP atom of A is solvable then A has the event
state separation property (ESSP, for short). A set R of regions of A is called a
witness for A’s SSP, respectively ESSP, if for each SSP atom, respectively ESSP
atom, there is a region R in R that solves it.
     The next lemma ([3, p. 162], Proposition 5.10) establishes the connection
between the existence of witnesses and the existence of an implementing net N
in dependence of the testified property. Notice that Petri nets correspond to the
type of nets τP T in [3, p. 130].

Lemma 1 ([3]). Let A be a TS. There is a Petri net N such that A ,→ AN ,
respectively A . AN , respectively A =∼ AN if and only if there is a witness R
that testifies A’s SSP, respectively ESSP, respectively both SSP and ESSP, and
N = NAR .

    By Lemma 1, deciding the existence of an implementing net is equivalent to
deciding if the input TS has the property that corresponds to the implementation.
Moreover, there is an implementing Petri net for a given TS A if and only if
there is a witness R of regions that testifies A’s SSP or ESSP or both according
to the sought-for implementation.
                                                                                                            5

           a           b        b        a        a                       a                           a
 A1 = s0       s1          s2       s3       s4       s5        A2 = q0           q1       A3 = q0         q1
                                                                          a                          a0



Fig. 1: The TSs A1 (Example 1), A2 (Example 2) and A3 (Example 3). All of them are
1-bounded, but only A1 is linear.
                   R
           1               1                      a        a          a            a        a
     a                          b            0         1          2       3            4         5   ...
                                                  b         b         b            b         b


                             R
Fig. 2: Left: The net N = NA   1
                                 (Example 1), where zero-valued flow arcs are omitted.
Right: A sketch of the (infinite) reachability graph AN , which embeds A1 .
                                                                      1   a       1
               a                    0    a                 R1                              R2
                                                                      1       0   1
                                                                          a


                    R
Fig. 3: Left: N1 = NA2
                       (Example 2). Middle: The reachability graph AN1 . Right: The
              R
Net N2 = NA3 (Example 3), whose reachability graph is isomorphic to A3 .




Example 1 (Embedding). The TS A1 of Figure 1 has the SSP, since the following
region R = (sup, con, pro) solves all SSP atoms on one blow: sup(si ) = i for all
i ∈ {0, . . . , 5} and con(a) = con(b) = 0 and pro(a) = pro(b) = 1. In particular, the
set R = {R} is a witness of A1 ’s SSP. The (infinite) reachability graph of the net
NAR1 = (R, {a, b}, f, M0 ) synthesized from R, where f (R, a) = f (R, b) = con(a) =
con(b) = 0 and f (a, R) = f (b, R) = pro(a) = pro(b) = 1 and M0 (R) = 0, embeds
A1 , cf. Figure 2. An embedding ϕ is given by ϕ(si ) = i for all i ∈ {0, . . . , 5}.
    The TS A1 does not have the ESSP, since the atom α = (a, s2 ) is not solvable.
This has already been proven in [7, p. 42], for self-containment we provide the
proof: If R = (sup, con, pro) is a region that solves α, then (1) con(a) ≤ sup(s0 ),
since a occurs at s0 ; (2) sup(s2 ) < con(a), implying sup(s0 ) − (con(a) + con(b)) +
(pro(a) + pro(b)) < con(a), since R solves α; (3) con(a) ≤ sup(s4 ), implying
con(a) ≤ sup(s0 ) − 2(con(a) + con(b)) + 2(pro(a) + pro(b)), since a occurs at s4 .
If we combine (1) and (2), then we get −(con(a) + con(b)) + (pro(a) + pro(b)) < 0;
if we combine (2) and (3), then we get 0 ≤ −(con(a) + con(b)) + (pro(a) + pro(b)).
Since this is a contradiction, R cannot exist and thus α is not solvable.

Example 2 (Language-Simulation). The TS A2 of Figure 1 does not have any
ESSP atom, since the only event a occur at all states. Hence, A2 has the
ESSP. The set R = ∅ is a witness of A2 ’s ESSP. The reachability graph AN1
of the synthesized net N1 = NAR2 = (∅, {a}, f, M0 ) simulates A2 up to language
equivalence, cf. Figure 3. A language-simulation ϕ is defined by ϕ(q0 ) = ϕ(q1 ) = 0.
6

    On the other hand, A2 does not have the SSP, since the SSP atom α = (s0 , s1 )
is not solvable. This can be seen as follows: If R = (sup, con, pro) is a region that
solves α, then (1) sup(s0 ) 6= sup(s1 ); (2) sup(s0 ) = sup(s1 ) − con(a) + con(a);
(3) sup(s1 ) = sup(s0 ) − con(a) + con(a). One easily verifies that the combination
of (2) and (3) implies that sup(s0 ) = sup(s1 ), which contradicts (1). Hence, R
cannot exist and α is not solvable.

Example 3 (Realization of an E 0 -Label Splitting). Let A2 and A3 be in accordance
to Figure 1. The TS A2 has the event set E = {a}. The label-splitting of the event
a into the events a and a0 yields the event set E 0 = (E \ {a}) ∪ {a, a0 } = {a, a0 }.
The TS A3 is an E 0 -label-splitting of A2 : For q0 a q1 ∈ A2 there is exactly one
x ∈ {a, a0 }, namely x = a, such that q0 x q1 ∈ A3 and, for q1 a q0 ∈ A2 , there
is exactly one y ∈ {a, a0 }, namely y = a0 , such that q1 y q0 ∈ A3 . The set {a}
is the set of events of A2 that occur split in A3 .
    The TS A3 has both the SSP and the ESSP. More exactly, A3 has the SSP
atom α1 = (q0 , q1 ) and the ESSP atoms α2 = (a, q1 ) and α3 = (a0 , q0 ). The
region R1 = (sup1 , con1 , pro1 ), where sup1 (q0 ) = 1, sup1 (q1 ) = 0, con1 (a) =
pro1 (a0 ) = 1 and pro1 (a) = con1 (a0 ) = 0 solves α1 and α2 . Moreover, the region
R2 = (sup2 , con2 , pro2 ), where sup2 (q0 ) = 0, sup2 (q1 ) = 1, con2 (a0 ) = pro2 (a) =
1 and pro2 (a0 ) = con2 (a) = 0 solves α3 . Thus, R = {R1 , R2 } is a witness for A3 ’s
SSP and ESSP. Figure 3 depicts the synthesized net N2 = NAR3 and it is easy to
see that its reachability graph AN2 is isomorphic to A3 .


3    Label Splitting for Language-Simulation and Exact
     Realization

The following theorem presents our main result and states that label-splitting
aiming at language-simulation or realization is NP-complete.

Theorem 1. Let A be a linear TS and κ ∈ N. Deciding whether there is an
E 0 -label-splitting B of A that satisfies |E 0 | ≤ κ and allows a Petri net N such
that B . AN or B ∼   = AN is NP-complete.
   The proof of Theorem 1 bases on a polynomial-time reduction of the vertex
cover (VC) problem on cubic graphs, which is known to be NP-complete from [13]:
Cubic Vertex Cover (CVC)
 Input:       a Graph G = (V, Σ) with set of vertices V and edges Σ such that
    every v ∈ V is a member of exactly three distinct σ0 , σ1 , σ2 ∈ Σ, a number
    λ ∈ N.
 Decide:      whether there is a (λ-VC) M ⊆ V satisfying |M | ≤ λ and M ∩ σ 6= ∅
    for all σ ∈ Σ.

Example 4 ( CVC). The instance (G, 3), where G = (V, Σ) such that V =
{v0 , v1 , v2 , v3 } and Σ = {{v0 , v1 }, {v0 , v2 }, {v0 , v3 }, {v1 , v2 }, {v1 , v3 }, {v2 , v3 }},
allows a positive decision, since M = {v0 , v1 , v2 } is a 3-VC of G.
                                                                                                         7

    In the remainder of this paper, if not explicitly stated otherwise, let (G, λ)
be an arbitrary but fixed input of CVC, where G = (V, Σ) has n vertices
V = {v0 , . . . , vn−1 } and m edges Σ = {σ0 , . . . , σm−1 } such that σi = {vi0 , vi1 } for
all i ∈ {0, . . . , m − 1}. For technical reasons, we assume without loss of generality
that i0 < i1 for the vertices vi0 , vi1 of the edge σi for all i ∈ {0, . . . , m − 1}.
    For the proof of Theorem 1, we reduce (G, λ) to a pair (A, κ) of linear TS
A and natural number κ as follows. If there is an E 0 -label-splitting B of A
satisfying |E 0 | ≤ κ that has the ESSP, then G has a λ-VC. Hence, if B allows
a language-simulation (implying its ESSP) or a realization (implying its SSP
and its ESSP), then G has a λ-VC. Conversely, if G has a λ-VC, then there
is an E 0 -label-splitting B of A satisfying |E 0 | ≤ κ that has a set R of regions
that witnesses both B’s ESSP and SSP, which, by Lemma 1, implies both a
language-simulation and a realization for B. Thus, (G, λ) is a yes-instance if and
only if (A, κ) is a yes-instance, according to the implementation sought.
    For a start, we define κ = n + 2(m − 1) + λ, where n + 2(m − 1) is the number
of events of the announced TS A. Hence, λ corresponds to the maximum number
of events of A that could possibly be split for a sought E 0 -label-splitting B. For
every i ∈ {0, . . . , m − 1}, the TS A has the following directed path Ti that uses
the vertices of the edge σi = {vi0 , vi1 } as events:
                        vi 0          vi 1          vi1           vi 0             vi 0
            Ti = ti,0          ti,1          ti,2         ti,3             ti,4            ti,5


We use the states ⊥1 , . . . , ⊥m−1 and events y1 , . . . , ym−1 , z1 , . . . , zm−1 and, for all
i ∈ {1, . . . , m − 1}, the edges ti−1,5 yi ⊥i and ⊥i zi ti,0 to connect T0 , . . . , Tm−1
to finally build A. The resulting linear TS can be sketched as follows:
                 y1            z1            y2           z2             ym−1             zm−1
   A = T0               ⊥1            T1            ⊥2           ...              ⊥m−1            Tm−1

We summarize events and states by Y = {y1 , . . . , ym−1 }, Z = {z1 , . . . , zm−1 } and
⊥ = {⊥1 , . . . , ⊥m−1 }. Notice that A has |V ∪ Y ∪ Z| = n + 2 · (m − 1) events.

Lemma 2. If there is an E 0 -label-splitting B of A that satisfies |E 0 | ≤ κ and
has the ESSP, then G has a λ-vertex cover.

Proof. Let B be an E 0 -label-splitting of A that satisfies |E 0 | ≤ κ and has the
ESSP. Let E ⊆ E 0 be the set of events of A that occur split in B. By definition
of label-splitting, for every e ∈ E, there are at least two distinct events e1 , e2 in
E 0 that originate from the splitting of e and do not correspond to the splitting of
another event e0 ∈ E \ {e}. By |E| = n + 2 · (m − 1) and |E 0 | ≤ n + 2 · (m − 1) + λ,
this implies |E| ≤ λ.
    Let i ∈ {0, . . . , m − 1}. Notice that the TS A1 of Figure 1 and the path Ti
(when considered as a TS) are isomorphic, that is, with the exception of names for
states and events, they are structurally the same. Hence, just like in Example 1,
one argues that if a TS has the path Ti , then it cannot have the ESSP, since the
atom (vi0 , ti,2 ) is not solvable. Consequently, since B has the ESSP, the path Ti
8

            v0           v1          v1           v0           v0                           v0           v2             v2             v0            v0
     t0,0         t0,1        t0,2         t0,3         t0,4        t0,5        t1,0              t1,1          t1,2           t1,3           t1,4           t1,5


            v0           v3          v3           v0           v0                           v1           v2             v2             v1            v1
     t2,0         t2,1        t2,2         t2,3         t2,4        t2,5        t3,0              t3,1          t3,2           t3,3           t3,4           t3,5


            v1           v3          v3           v1           v1                           v2           v3             v3             v2            v2
     t4,0         t4,1        t4,2         t4,3         t4,4        t4,5        t5,0              t5,1          t5,2           t5,3           t5,4           t5,5



 Fig. 4: The paths T0 , . . . , T5 of TS A for Theorem 1 that originates from Example 4.
            v00          v1          v10          v0           v0                           v00          v2             v20            v0            v0
     t0,0         t0,1        t0,2         t0,3         t0,4        t0,5        t1,0              t1,1          t1,2           t1,3           t1,4           t1,5


            v0           v3          v3           v00          v0                           v10          v2             v20            v1            v1
     t2,0         t2,1        t2,2         t2,3         t2,4        t2,5        t3,0              t3,1          t3,2           t3,3           t3,4           t3,5


            v1           v3          v3           v10          v1                           v2           v3             v3             v20           v2
     t4,0         t4,1        t4,2         t4,3         t4,4        t4,5        t5,0              t5,1          t5,2           t5,3           t5,4           t5,5



Fig. 5: According to the reduction for Theorem 1, the paths T00 , . . . , T50 of the E 0 -label-
splitting B of A that originates from the VC M = {v0 , v1 , v2 } of Example 4.


(or an isomorphic version of it) cannot be present in B. Since B is an E 0 -label-
splitting of A, this implies that there is an event e ∈ {vi0 , vi1 } such that e ∈ E.
Since i was arbitrary, this is simultaneously true for all T0 , . . . , Tm−1 . Hence, if
M = V ∩ E, then M ∩ E(Ti ) 6= ∅, implying M ∩ σi =  6 ∅, for all i ∈ {0, . . . , m − 1}.
Finally, by |M | = |V ∩ E| ≤ |E| ≤ λ, we get that M defines a λ-VC of G.
     Conversely, we have to show that if G has a λ-VC, then there is a searched
E 0 -label-splitting for A. Let M = {vj0 , . . . , vjλ−1 } ⊆ V be a λ-VC of G. For every
i ∈ {0, . . . , λ − 1}, we split the event vji into the two events vji and vj0 i . This
                              Sλ−1
yields E 0 = (E \ M ) ∪ i=0 {vji , vj0 i }. To define the announced E 0 -label-splitting
B = (S, E 0 , δ 0 , t0,0 ) of A, it is sufficient to define δ 0 on the states of T0 , . . . , Tm−1 .
For all i ∈ {0, . . . , m − 1}, δ 0 restricted to S(Ti ) yields the path Ti0 as follows:
                                                                                     vi 0            vi1                vi 1           vi00           vi 0
    – if vi0 ∈ M and vi1 6∈ M , then Ti0 = ti,0                                              ti,1          ti,2 ,              ti,3           ti,4            ti,5 ;
                                                                vi00          vi 1                vi01           vi 0            vi0
    – if vi0 , vi1 ∈ M , then Ti0 = ti,0                               ti,1           ti,2 ,             ti,3            ti,4           ti,5 ;
                                                                                     vi 0            vi1                vi01           vi0            vi 0
    – if vi0 6∈ M and vi1 ∈ M , then Ti0 = ti,0                                              ti,1          ti,2 ,              ti,3           ti,4            ti,5 .
   By the following lemma, mappings sup and con, pro defined on the states and
events of T00 , . . . , Tm−10 that, for all i ∈ {0, . . . , m − 1}, behave like a region of Ti0
(when restricted to Ti0 ) can be extended to a region R = (sup0 , con0 , pro0 ) of B:
Lemma 3. Let sup : S \ ⊥ → N and con, pro : E 0 \ (Y ∪ Z) → N be mappings
such that if e ∈ {vi0 , vi00 , vi1 , vi01 } and ti,j e ti,j+1 ∈ B, then sup(ti,j ) ≤ con(e)
and sup(ti,j+1 ) = sup(ti,j ) − con(e) + pro(e) for all i ∈ {0, . . . , m − 1} and j ∈
{0, . . . , 4}. If sup0 , con0 , pro0 are defined as follows, then R = (sup0 , con0 , pro0 ) is
                                                                                                9

a region of B: for all s ∈ S, if s ∈ ⊥, then sup0 (s) = 0, otherwise sup0 (s) = sup(s);
for all e ∈ E 0 and all i ∈ {0, . . . , m − 1}, if e = yi , then con0 (e) = sup(ti−1,4 )
and pro0 (e) = 0; if e = zi , then con0 (e) = 0 and pro0 (e) = sup(ti,0 ); otherwise
con0 (e) = con(e) and pro0 (e) = pro(e).

Proof. By the assumption about sup, con and pro, it is easy to see that s e s0 ∈
B implies sup0 (s) ≤ con0 (e) and sup0 (s0 ) = sup0 (s) − con0 (e) + pro0 (e).

    The next lemma states that linear TSs and thus especially B have the SSP:

Lemma 4. If A = s0 e1 . . . si−1 ei si ei+1 . . . en sn is a linear TS, then A has
the SSP. Moreover, if the event ei occurs exactly once, then the atom (ei , s) is
solvable for all s ∈ {s0 , . . . , sn } \ {si−1 }.

Proof. The following region R = (sup, con, pro) solves all SSP atoms on one blow:
sup(s0 ) = 0; for all j ∈ {1, . . . , n}, con(ej ) = 0 and pro(ej ) = 1.
   The following region R = (sup, con, pro) solve (ei , s) for all s ∈ {s0 , . . . , si−2 }:
sup(s0 ) = 0; for all j ∈ {1, . . . , n}, if j = i, then con(ej ) = i and pro(ej ) = 0;
otherwise con(ej ) = 0 and pro(ej ) = 1.
   The following region R = (sup, con, pro) solve (ei , s) for all s ∈ {si , . . . , sn }:
sup(s0 ) = 1; for all j ∈ {1, . . . , n}, if j = i, then con(ej ) = 1 and pro(ej ) = 0;
otherwise con(ej ) = pro(ej ) = 0.

Lemma 5. There is a set R of regions witnessing the ESSP and the SSP of B.

Proof. By Lemma 4, B has the SSP and the atom (e, s) is solvable for all e ∈ Y ∪Z
and (relevant) s ∈ S.
     Let x ∈ E 0 \ (Y ∪ Z) be arbitrary but fixed. Since G is cubic, there are exactly
three indices i < j < k ∈ {0, . . . , m − 1}, such that x ∈ T`0 for all ` ∈ {i, j, k}.
Let no` (x) denote the number of x’s occurrences in T`0 for all ` ∈ {i, j, k}. The
following region solves (x, s) for all s ∈ B \ (S(Ti0 ) ∪ S(Tj0 ) ∪ S(Tk0 )): If i = 0,
then sup(t0,0 ) = no0 (e), otherwise sup(t0,0 ) = 0. For all e ∈ E 0 and ` ∈ {i, j, k},
if e = z` , then con(e) = 0 and pro(e) = no` (x); if e = x, then con(e) = 1 and
pro(e) = 0; otherwise con(e) = pro(e) = 0.
     It remains to argue that (x, s) is also solvable when both x and s belong to
the same gadget Ti0 , i ∈ {0, . . . , m − 1}. To do so, we proceed as follows. Let
i ∈ {0, . . . , m − 1}; We argue, for all the (possible) cases vi0 ∈ M, vi1 6∈ M and
vi0 6∈ M, vi1 ∈ M and vi0 , vi1 ∈ M , that (x, s) is solvable for all relevant events x ∈
{vi0 , vi1 , vi00 , vi01 } and states s ∈ S(Ti0 ). By the arbitrariness of i, this finally proves
the ESSP for B. By Lemma 3, it suffices to define mappings for T00 , . . . , Tm−1            0
                                                                                                 .
Let S = {t0,0 , . . . , tm−1,0 } and E = E 0 \ (Y ∪ Z). Let’s start with the case
                                                   v        v        v        v0       v
vi0 ∈ M, vi1 6∈ M , which implies Ti0 = ti,0 i0 ti,1 i1 ti,2 , i1 ti,3 i0 ti,4 i0 ti,5 .
    (vi0 and vi00 ): Let j 6= k ∈ {0, . . . , m − 1} \ {i} be such that vi0 ∈ σj ∩ σk . The
following region R = (sup, con, pro) solves (vi0 , s) for all s ∈ {ti,1 , ti,2 , ti,3 , ti,5 }:
For all s ∈ S, if s = ti,0 , then sup(s) = 1, if s ∈ {tj,0 , tk,0 }, then sup(s) = 2;
otherwise sup(s) = 0. For all e ∈ E, if e = vi0 , then con(e) = 1 and pro(e) = 0
10

(notice that noj (e), no` (e) ≤ 2, since vi0 ∈ M ); if e = vi00 , then con(e) = 0 and
pro(e) = 1; otherwise, con(e) = pro(e) = 0.
    The following region R = (sup, con, pro) solves the SSP atom (vi00 , s) for
all s ∈ {ti,0 , ti,1 , ti,2 , ti,4 , ti,5 }: For all s ∈ S, if s ∈ {tj,0 , tk,0 }, then sup(s) = 2;
otherwise sup(s) = 0. For all e ∈ E, if e = vi00 , then con(e) = 2 and pro(e) = 0
(notice that noj (e), no` (e) ≤ 1); if e = vi1 , then con(e) = 0 and pro(e) = 1;
otherwise, con(e) = pro(e) = 0.
    (vi1 ): Let j 6= k ∈ {0, . . . , m−1}\{i} be such that vi1 ∈ σj ∩σk . The following
region R = (sup, con, pro) solves (vi1 , s) for all s ∈ {ti,0 , ti,3 , ti,4 }: For all s ∈ S,
if s ∈ {tj,0 , tk,0 }, then sup(s) = 3; otherwise sup(s) = 0. For all e ∈ E, if e = vi1 ,
then con(e) = 1 and pro(e) = 0 (notice that noj (e), no` (e) ≤ 3); if e = vi0 , then
con(e) = 0 and pro(e) = 2; otherwise, con(e) = pro(e) = 0.
    The following region R = (sup, con, pro) solves (vi1 , ti,5 ): For all s ∈ S,
if s = ti,0 , then sup(s) = 2; if s ∈ {tj,0 , tk,0 }, then sup(s) = 3; otherwise
sup(s) = 0. For all e ∈ E, if e = vi1 , then con(e) = 1 and pro(e) = 0; otherwise,
con(e) = pro(e) = 0.
    Similarly, one finds solving regions for all relevant ESSP atoms (x, s) of Ti0
for the remaining cases vi0 6∈ M, vi1 ∈ M and vi0 , vi1 ∈ M . By the arbitrariness
of i, this proves the lemma.


4     Label Splitting for Embedding

In [18], it has been shown that label-splitting aiming at embedding is NP-complete.
The six strands of the reduction presented in [18] can be consecutively arranged
such that the resulting TS is 3-bounded. Thus, the problem is also NP-complete
for 3-bounded inputs. In this section, we strengthen this result and show that
label-splitting aiming at embedding remains NP-complete even for 2-bounded
inputs and that this bound is tight.

Theorem 2. Let A be a b-bounded TS and κ ∈ N. Deciding if there is an E 0 -
label-splitting B of A that satisfies |E 0 | ≤ κ and allows a Petri net N such that
A ,→ AN is NP-complete if b > 1, otherwise it is polynomial.

   The following lemma paves us the way for being able to prove easily the
polynomial-time statement of Theorem 2:

Lemma 6. Let A = s0 e1 . . . en sn be a directed cycle TS, i.e., s0 = sn and
si 6= sj for all i 6= j ∈ {0, . . . , n − 1}. If there is an i ∈ {0, . . . , n − 1} such that
¬sj ei for all j ∈ {0, . . . , n} \ {i − 1}, that is, ei occurs exactly once in A, then
A has a set R of regions that witnesses its SSP.

Proof. We argue that any SSP atom (s, s0 ) of A is solvable. To do so, we assume
without loss of generality that s0 = s and sj = s0 and i > j, that is,

            A = s0 e1 . . . ej sj ej+1 . . . ej+k si−1 ei si ei+1 . . . ei+` sn
                                                                                          11

where i − 1 = j + k and n = j + k + ` + 1. This assumption is no essential
restriction, since A is a cycle and thus can be transformed in the desired form
by a simple renaming of its states. The following region R = (sup, con, pro)
solves (s0 , sj ): sup(s0 ) = `; for all e ∈ E(A), if e 6= ei , then con(e) = 0 and
pro(e) = 1; if e = ei , then con(e) = j + k + ` and pro(e) = 0. This implies
sup(s0 ) = ` 6= ` + j = sup(sj ), since ` > 0. Thus, R separates (s, s0 ).

     If A = (S, E, δ, ι) is a directed cycle that does not have the SSP, implying that
every event occurs at least twice by Lemma 6, then we obtain an embeddable
E 0 -label-splitting B of A as follows: Let q x q 0 ∈ A be arbitrary but fixed and
E 0 = (E \ {x}) ∪ {x, x0 }. The E 0 -label-splitting B = (S, E 0 , δ, ι) that originates
                                                        0
from A by relabeling the edge q x q 0 by q x q 0 (and nothing else), obviously,
satisfies |E 0 | = |E| + 1. Moreover, since x0 occurs only once, B has the SSP by
Lemma 6. Hence, if A is a 1-bounded TS that has the SSP, which is particularly
implied if it is linear, then (A, κ) is a yes-instance if and only if |E| ≤ κ. Otherwise,
(A, κ) is a yes-instance if and only if |E| + 1 ≤ κ.
      Consequently, to complete the proof of Theorem 2, it remains to consider the
case where A is 2-bounded. To do so, we reduce an input (G, λ) of CVC to a
pair (A, κ) of 2-bounded TS A = (S, E, δ, t0,0 ) and natural number κ such that
G has a λ-VC if and only if there is an E 0 -label-splitting B of A that satisfies
|E 0 | ≤ κ and has a set R of regions that witnesses the SSP of B.
      Let’s introduce (A, κ). For a start, we define κ = n+m−1+λ, where n+m−1
is the number of events of A. Hence, λ corresponds to the maximum number of
events of A that could possibly be split for a sought E 0 -label-splitting B of A.
For every i ∈ {0, . . . , m − 1}, the TS A has the following gadget Ti that uses the
vertices of the edge σi = {vi0 , vi1 } of G as events:

                                         vi0                vi 0
                                                 ti,1
                             Ti = ti,0                             ti,2

                                         vi1     ti,3       vi 1


We use the events y1 , . . . , ym−1 and, for all i ∈ {1, . . . , m−1}, the edge ti−1,2 yi ti,0
to connect T0 , . . . , Tm−1 and build A, which can be sketched as follows:

                            y1           y2          ym−2                 ym−1
              A = T0              T1           ...           Tm−2                Tm−1

The initial state of A is t0,0 . Let Y = {y1 , . . . , ym−1 }. Notice that A is 2-bounded
and has exactly |V ∪ Y | = n + m − 1 events.
   If a TS has any of T0 , . . . , Tm−1 , then it does not have the SSP:

Lemma 7. If A is a TS that has the paths P1 = p0 a p1 a p2 and P2 =
p0 b p3 b p2 with the same starting state p0 and final state p2 , then A does
not have the SSP, since the atom (p1 , p3 ) is not solvable.
12

Proof. If R = (sup, con, pro) is a region of A, then we have (1) sup(p1 ) =
sup(p0 ) − con(a) + pro(a) and (2) sup(p2 ) = sup(p0 ) − 2con(a) + 2pro(a) and (3)
sup(p3 ) = sup(p0 )−con(b)+pro(b) and (4) sup(p2 ) = sup(p0 )−2con(b)+2pro(b).
We subtract (4) from (2), rearrange the resulting equation properly and obtain
−con(b) + pro(b) = −con(a) + pro(a). By (1) and (3), this implies sup(p1 ) =
sup(p3 ). Thus, R does not solve (p1 , p3 ). R was arbitrary, hence the claim.

   In fact, for all i ∈ {0, . . . , m − 1}, if a TS has the gadget Ti , then the SSP
atom (ti,1 , ti,3 ) is not solvable by Lemma 7. By the following lemma, this implies
that a searched E 0 -label-splitting B of A implies a λ-VC of G:

Lemma 8. If there is an E 0 -label-splitting B of A that satisfies |E 0 | ≤ κ and
has the SSP, then G has a λ-VC.


Proof. Let B be a fitting E 0 -label-splitting and E ⊆ E the set of events of A that
occur split in B, implying |E| ≤ λ by |E 0 | ≤ κ. Since B has the SSP, the atom
(ti,1 , ti,3 ) is solvable for all i ∈ {0, . . . , m − 1}. Consequently, by Lemma 7, none
of T0 , . . . , Tm−1 is present in B. Since B is a E 0 -label-splitting of A, this implies
E ∩ E(Ti ) 6= ∅ for all i ∈ {0, . . . , m − 1}. Hence, M = V ∩ E is a λ-VC of G.

     Conversely, we show that if G has a λ-VC, then there is a searched E 0 -label-
splitting for A. Let M = {vj0 , . . . , vjλ−1 } ⊆ V be a λ-VC of G and E 0 be defined
as in Section 3. To obtain the announced E 0 -label-splitting B = (S, E 0 , δ 0 , t0,0 ),
it is sufficient again to define δ 0 on the states of T0 , . . . , Tm−1 . In particular, for
all i ∈ {0, . . . , m − 1}, we get Ti0 from Ti by δ 0 as follows:

                                                         vi 0          vi00                          vi 1          vi 1
  – if vi0 ∈ M and vi1 6∈ M , then ti,0                         ti,1          ti,2 and ti,0                 ti,3          ti,2 ;
                                    vi 0          vi00                           vi1          vi01
  – if vi0 , vi1 ∈ M , then ti,0           ti,1           ti,2 and ti,0                ti,3           ti,2
                                                         vi 0          vi 0                          vi 1          vi01
  – if vi0 6∈ M and vi1 ∈ M , then ti,0                         ti,1          ti,2 and ti,0                 ti,3          ti,2

Lemma 9. There is a set R of regions that witnesses the SSP of B.

Proof. Let i ∈ {0, . . . , m − 1} be arbitrary but fixed. The following region
R = (sup, con, pro) solves (s, s0 ) for all states s =            6 s0 ∈ S that satisfy (s, s0 ) 6∈
{(ti,1 , ti,3 ), (ti,3 , ti,1 ) | i ∈ {0, . . . , m−1}}: sup(t0,0 ) = 0; for all e ∈ E 0 , con(e) = 0
and pro(e) = 1.
    Let i ∈ {0, . . . , m − 1} be arbitrary but fixed. The following region R =
(sup, con, pro) solves (ti,1 , ti,3 ): sup(t0,0 ) = 0; for all e ∈ E 0 , if vi0 ∈ M , then
con(vi0 ) = 0, pro(vi0 ) = 1, con(vi00 ) = 1, pro(vi00 ) = 0 and con(e) = pro(e) = 0 if
e 6∈ {vi0 , vi00 }; otherwise, if vi0 6∈ M , which implies vi1 ∈ M , then con(vi1 ) = 0,
pro(vi1 ) = 1, con(vi01 ) = 1, pro(vi01 ) = 0 and con(e) = pro(e) = 0 if e 6∈ {vi1 , vi01 }.
Since i was arbitrary, the atom (ti,1 , ti,3 ) is solvable for all i ∈ {0, . . . , m − 1}.
                                                                                      13

5    Conclusion
In this paper, we completely characterize the label-splitting problem for Petri
nets for all types of implementations that have previously been studied in the
literature. By doing so, we answer an open question that has been posed in [18,
p. 17]. Moreover, we strengthen the result from [18] to 2-bounded inputs, and
show that this bound is tight. It remains future work to consider the maximum
number κ of labels of the splitting as a parameter in order to investigate the
problem from a parameterized complexity point of view and to determine if this
parameterization makes the label-splitting problem fixed-parameter-tractable.

Acknowledgements. I would like to thank the unknown reviewers for their
valuable comments.

References
 1. van der Aalst, W.M.P.: Process Mining - Discovery, Conformance and Enhancement
    of Business Processes. Springer (2011). https://doi.org/10.1007/978-3-642-19345-3
 2. Badouel, E., Bernardinello, L., Darondeau, P.: Polynomial algorithms for the
    synthesis of bounded nets. In: TAPSOFT. Lecture Notes in Computer Science,
    vol. 915, pp. 364–378. Springer (1995). https://doi.org/10.1007/3-540-59293-8 207
 3. Badouel, E., Bernardinello, L., Darondeau, P.: Petri Net Synthesis. Texts
    in Theoretical Computer Science. An EATCS Series, Springer (2015).
    https://doi.org/10.1007/978-3-662-47967-4
 4. Badouel, E., Caillaud, B., Darondeau, P.: Distributing finite automata
    through Petri net synthesis. Formal Asp. Comput. 13(6), 447–470 (2002).
    https://doi.org/10.1007/s001650200022
 5. Best, E., Devillers, R.R.: Characterisation of the state spaces of live and bounded
    marked graph Petri nets. In: LATA. Lecture Notes in Computer Science, vol. 8370,
    pp. 161–172. Springer (2014). https://doi.org/10.1007/978-3-319-04921-2 13
 6. Best, E., Devillers, R.R.: Synthesis of bounded choice-free petri nets. In: CONCUR.
    LIPIcs, vol. 42, pp. 128–141. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
    (2015). https://doi.org/10.4230/LIPIcs.CONCUR.2015.128
 7. Best, E., Erofeev, E., Schlachter, U., Wimmel, H.: Characterising petri net solv-
    able binary words. In: Kordon, F., Moldt, D. (eds.) Application and Theory of
    Petri Nets and Concurrency - 37th International Conference, PETRI NETS 2016,
    Toruń, Poland, June 19-24, 2016. Proceedings. Lecture Notes in Computer Science,
    vol. 9698, pp. 39–58. Springer (2016). https://doi.org/10.1007/978-3-319-39086-4 4,
    https://doi.org/10.1007/978-3-319-39086-4 4
 8. Carmona, J.: The label splitting problem. Trans. Petri Nets Other
    Model. Concurr. 6, 1–23 (2012). https://doi.org/10.1007/978-3-642-35179-2 1,
    https://doi.org/10.1007/978-3-642-35179-2 1
 9. Carmona, J., Cortadella, J., Kishinevsky, M.: New region-based algorithms for
    deriving bounded petri nets. IEEE Trans. Computers 59(3), 371–384 (2010).
    https://doi.org/10.1109/TC.2009.131, https://doi.org/10.1109/TC.2009.131
10. Cortadella, J., Kishinevsky, M., Kondratyev, A., Lavagno, L., Yakovlev, A.: A region-
    based theory for state assignment in speed-independent circuits. IEEE Transactions
    on Computer-Aided Design of Integrated Circuits and Systems 16(8), 793–812
    (Aug 1997). https://doi.org/10.1109/43.644602
14

11. Cortadella, J., Kishinevsky, M., Lavagno, L., Yakovlev, A.: Deriving petri nets from
    finite transition systems. IEEE Transactions on Computers 47(8), 859–882 (1998)
12. Cortadella, J., Kishinevsky, M., Kondratyev, A., Lavagno, L., Yakovlev, A.: A
    region-based theory for state assignment in speed-independent circuits. IEEE
    Trans. on CAD of Integrated Circuits and Systems 16(8), 793–812 (1997).
    https://doi.org/10.1109/43.644602
13. Fricke, G., Hedetniemi, S.T., Jacobs, D.P.: Independence and irredundance in
    k-regular graphs. Ars Comb. 49 (1998)
14. Holloway, L.E., Krogh, B.H., Giua, A.: A survey of Petri net methods for controlled
    discrete event systems. Discrete Event Dynamic Systems 7(2), 151–190 (1997).
    https://doi.org/10.1023/A:1008271916548
15. Lu, X., Fahland, D., van den Biggelaar, F.J.H.M., van der Aalst, W.M.P.: Han-
    dling duplicated tasks in process discovery by refining event labels. In: Rosa, M.L.,
    Loos, P., Pastor, O. (eds.) Business Process Management - 14th International
    Conference, BPM 2016, Rio de Janeiro, Brazil, September 18-22, 2016. Proceed-
    ings. Lecture Notes in Computer Science, vol. 9850, pp. 90–107. Springer (2016).
    https://doi.org/10.1007/978-3-319-45348-4 6, https://doi.org/10.1007/978-3-319-
    45348-4 6
16. de San Pedro, J., Cortadella, J.: Discovering duplicate tasks in transition systems
    for the simplification of process models. In: Rosa, M.L., Loos, P., Pastor, O. (eds.)
    Business Process Management - 14th International Conference, BPM 2016, Rio de
    Janeiro, Brazil, September 18-22, 2016. Proceedings. Lecture Notes in Computer
    Science, vol. 9850, pp. 108–124. Springer (2016). https://doi.org/10.1007/978-3-
    319-45348-4 7, https://doi.org/10.1007/978-3-319-45348-4 7
17. Schlachter, U., Wimmel, H.: Relabelling LTS for petri net synthesis via solving
    separation problems. Trans. Petri Nets Other Model. Concurr. 14, 222–254 (2019).
    https://doi.org/10.1007/978-3-662-60651-3 9, https://doi.org/10.1007/978-3-662-
    60651-3 9
18. Schlachter, U., Wimmel, H.: Optimal label splitting for embedding an LTS into
    an arbitrary Petri net reachability graph is NP-complete. CoRR abs/2002.04841
    (2020), https://arxiv.org/abs/2002.04841
19. Wang, Y., Nazeem, A., Swaminathan, R.: On the optimal petri net representation for
    service composition. In: IEEE International Conference on Web Services, ICWS 2011,
    Washington, DC, USA, July 4-9, 2011. pp. 235–242. IEEE Computer Society (2011).
    https://doi.org/10.1109/ICWS.2011.40, https://doi.org/10.1109/ICWS.2011.40