<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Alignments meet Linear Algebra</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christopher T. Schwanen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wied Pakusa</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wil M. P. van der Aalst</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chair of Process and Data Science (PADS), RWTH Aachen University</institution>
          ,
          <addr-line>Aachen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Mathematics, Informatics and Technology, Koblenz University of Applied Sciences</institution>
          ,
          <addr-line>Koblenz</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An optimal alignment corresponds to the minimal number of insertions and deletions of events to transform an observed trace into a trace of a process model. For the important class of process trees, we show how we can express the alignment problem as a system of matrices over a commutative semiring. This system of matrices only depends on the process tree, but not on the input trace, and the costs of alignments can be computed by matrix and vector multiplications. Our formulation in terms of linear algebra sheds new light on alignments and allows us to transfer methods from linear algebra into the field of conformance checking. As an application, we show that for a subclass of process trees with unique labels, we can efficiently decide alignment properties by using a symbolic representation of the matrices.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Mining</kwd>
        <kwd>Conformance Checking</kwd>
        <kwd>Alignments</kwd>
        <kwd>Linear Algebra</kwd>
        <kwd>Process Trees</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Costs(σ, T) = v_in^⊤ · M_{σ_1} M_{σ_2} · · · M_{σ_n} · v_fin ∈ S.   (1)</p>
      <p>
Our work is strongly inspired by the notion of weighted automata [5] known from the field of formal
languages and automata theory. In a nutshell, a weighted automaton can (non-deterministically) process
an input along different computation paths where each single path has certain costs. These costs are
the product of the costs of the transitions taken along the path. In the end, the sum over all computation
paths is assigned as the costs of the read input. All arithmetic (sums and products) takes place in some
underlying commutative semiring S that can vary depending on the modeled situation. In this way, a
weighted automaton does not only define a formal language (i.e., a Boolean function), but specifies a
numeric function of the form f : Σ* → S which assigns a certain cost to each input string. Somewhat
simplified, the key idea of our whole construction is to show that the alignment cost function of every
process tree is regular in the sense that it can be expressed by a weighted automaton.</p>
      <p>To benefit from our new construction algorithmically, we have to take into account that process
trees yield a succinct representation of an exponentially large state space. Indeed, while each process
tree defines a regular language, the standard translation from process trees to finite automata results
in an exponential increase in the number of states. Thus, in an explicit representation, the weighted
automata (and matrices (M_a)_{a∈Σ}) become excessively large. The operator responsible for this blow-up
is the shuffle operator, which models independent parallel computations. Hence, our second key idea is
to preserve the compact, symbolic representation of process trees by expressing the translation using
linear algebra and standard matrix/vector operations. Remarkably, we find that for all process tree
operators, there exist corresponding standard linear-algebraic operators that accurately capture the
alignment semantics of process trees. In particular, we show that the shuffle operator used in process
trees corresponds to the well-known Kronecker product of matrices. This insight allows us to define our
translation using only symbolic algebraic expressions for the matrices M_a and vectors v_in and v_fin with
the properties described in Equation (1) and such that the sizes of the symbolic expressions coincide
with the size of the process tree T itself (up to polynomial factors).</p>
      <p>We also showcase first applications. In particular, we prove that for a certain subclass of process
trees with unique labels we can use a symbolic representation of the matrices and vectors to obtain an
efficient alignment algorithm. To this end, we show how to efficiently implement the required matrix
and vector operations in a symbolic way. We also explain how our approach can enable the transfer
of concepts from linear algebra to the field of conformance checking, thereby providing new analysis
techniques for the process mining community. These examples highlight the great potential of our new
formulation for future research on the algorithmic structure of alignments.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Alignments [1] are the state-of-the-art technique for conformance checking; an introduction to the
field is provided in [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. When restricting the class of process models to process trees with unique
labels, the model class produced by the well-known Inductive Miner family [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7–9</xref>
        ] and the de-facto standard in
industrial process mining applications, the computation of alignments becomes tractable [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. But even
for process trees in their general form, alignments can still be computed more efficiently than for other
models, e.g., sound workflow nets [
        <xref ref-type="bibr" rid="ref13 ref15">13, 15</xref>
        ].
      </p>
      <p>
        Weighted automata can be traced back to the 1960s, when formal power series were studied as a
generalization of formal languages, see, e.g., [12]. While classic automata decide whether a given word
is accepted or not (a Boolean property), weighted automata assign to each input a numerical value in a
commutative semiring. The semiring can model, e.g., resources, costs, time, access rights, or a probability
for each input. They have manifold applications, for example in the area of model checking [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or
natural language processing [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. We refer to [5, 6] for a survey and handbook for more details.
      </p>
      <p>
        To algorithmically apply our linear-algebraic formulation, we have to represent the matrices and
vectors in some compact, symbolic way (instead of working with the explicit, exponential-size
representation). A similar approach has been taken, for example, in [
        <xref ref-type="bibr" rid="ref1">10</xref>
        ]. There, the authors show that
computational problems for symbolic matrices are provably very hard. However, in our context, the
matrices have a quite restricted form, so we are still able to implement certain operations efficiently
(and, indeed, in Section 5 we prove that this is possible to a certain extent).
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Preliminaries</title>
      <p>Let N := {0, 1, 2, . . .} be the set of natural numbers. For an n-tuple x ∈ X_1 × · · · × X_n, π_i(x) denotes
the projection on its i-th element, i.e., π_i : X_1 × · · · × X_n → X_i, (x_1, . . . , x_n) ↦ x_i.
Definition 3.1 (Alphabet). An alphabet Σ is a finite, non-empty set of labels (also called activities).
Definition 3.2 (Sequence). Sequences with index set I over a set X are denoted by σ = ⟨σ_i⟩_{i∈I} ∈ X^I.
The length of a sequence σ is written as |σ| and the set of all finite sequences over X is denoted by X*.
For a sequence σ = ⟨σ_i⟩_{i∈I} ∈ X^I, ∑σ is a shorthand for ∑_{i∈I} σ_i. Given two sequences σ and σ′,
σ · σ′ (or σσ′ in short) denotes the concatenation of the two sequences. For x_1, x_2, . . . , x_n ∈ X, we
also use x_1 x_2 · · · x_n to denote a sequence. The restriction of a sequence σ ∈ X* to a set Y ⊆ X is
the subsequence σ|_Y of σ consisting of all elements in Y. A function f : X → Y can be applied to
a sequence σ ∈ X* given the recursive definition f(⟨⟩) := ⟨⟩ and f(⟨x⟩ · σ) := ⟨f(x)⟩ · f(σ). For
a sequence of tuples σ ∈ (X_1 × · · · × X_n)*, π_i*(σ) denotes the sequence of every i-th element of its tuples,
i.e., π_i*(⟨⟩) := ⟨⟩ and π_i*(⟨(x_1, . . . , x_n)⟩ · σ) := ⟨π_i(x_1, . . . , x_n)⟩ · π_i*(σ) = ⟨x_i⟩ · π_i*(σ). As an important
extension of π_i* we write π_i^Y for the composition of π_i* with the restriction to Y, i.e., π_i^Y(σ) := π_i*(σ)|_Y.</p>
      <sec id="sec-3-1">
        <title>3.1. Process Trees</title>
        <p>Languages of traces ℒ ⊆ Σ* correspond to sets of behaviors of a process. The symbols in one trace
correspond to the events or activities that occurred. Here, we study process trees as a modeling mechanism
for business processes. Each process tree T defines a language ℒ(T) ⊆ Σ* of possible process behaviors.
We recall the shuffle operator which captures parallel execution within a process. Formally, this is
defined by taking two traces σ, τ ∈ Σ* and by defining σ ⧢ τ to be the set of all traces obtained by
interleaving the symbols of σ and τ while preserving their relative order. For example, if σ = ab and
τ = cd, then σ ⧢ τ = {abcd, acbd, acdb, cabd, cadb, cdab}.</p>
        <p>Definition 3.3 (Shuffle ⧢). For σ, τ ∈ Σ*, the shuffle σ ⧢ τ of σ and τ is
σ ⧢ τ := {σ_1 τ_1 . . . σ_n τ_n | σ = σ_1 . . . σ_n, τ = τ_1 . . . τ_n, σ_i, τ_i ∈ Σ*, 1 ≤ i ≤ n}.
Let ℒ_1, ℒ_2 ⊆ Σ*. The shuffle of ℒ_1 and ℒ_2 is defined as ℒ_1 ⧢ ℒ_2 := ⋃{σ_1 ⧢ σ_2 | σ_1 ∈ ℒ_1, σ_2 ∈ ℒ_2}.</p>
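        <p>For concrete traces, the shuffle can be enumerated directly. The following Python sketch (the function name is ours, not part of the formal development) computes σ ⧢ τ recursively and reproduces the example above.</p>
        <preformat>
def shuffle(sigma, tau):
    """All interleavings of the traces sigma and tau (Definition 3.3)."""
    if not sigma:
        return {tau}
    if not tau:
        return {sigma}
    # Either the next symbol comes from sigma or from tau.
    return ({sigma[:1] + rest for rest in shuffle(sigma[1:], tau)} |
            {tau[:1] + rest for rest in shuffle(sigma, tau[1:])})

# Example from the text: ab shuffled with cd has six interleavings.
print(sorted(shuffle("ab", "cd")))
# ['abcd', 'acbd', 'acdb', 'cabd', 'cadb', 'cdab']
        </preformat>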
        <p>Definition 3.4 (Process Tree). Let Σ be an alphabet and let τ ∉ Σ be the silent activity. Both the set of
process trees (over Σ) and the language of a process tree T, denoted by ℒ(T), are defined recursively:
• the silent activity τ is a process tree where ℒ(τ) = {⟨⟩},
• each activity a ∈ Σ is a process tree where ℒ(a) = {⟨a⟩},
• for process trees T_1, . . . , T_n, n ∈ N ∖ {0}, the following are also process trees:
– →(T_1, . . . , T_n) where ℒ(→(T_1, . . . , T_n)) = ℒ(T_1) · . . . · ℒ(T_n),
– ×(T_1, . . . , T_n) where ℒ(×(T_1, . . . , T_n)) = ℒ(T_1) ∪ . . . ∪ ℒ(T_n),
– ∧(T_1, . . . , T_n) where ℒ(∧(T_1, . . . , T_n)) = ℒ(T_1) ⧢ . . . ⧢ ℒ(T_n), and
– ⟲(T_1, T_2) where ℒ(⟲(T_1, T_2)) = ℒ(T_1) · (ℒ(T_2) · ℒ(T_1))*.</p>
        <p>The symbols → (sequence), × (exclusive choice), ⟲ (loop), and ∧ (parallel) are process tree operators. A
process tree with unique labels is a process tree where each activity occurs at most once.</p>
        <p>We assume that all operators (→, ×, ∧) are binary (i.e., n = 2 in Definition 3.4). This is no restriction,
since a general process tree can efficiently be transformed into an equivalent binary process tree.</p>
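        <p>To make Definition 3.4 concrete, the following Python sketch (our own encoding of process trees as nested tuples with operator tags 'seq', 'xor', 'par', and 'loop'; not notation from the paper) enumerates ℒ(T) for small trees. Loops are unrolled a bounded number of times, so for trees containing a loop the result is a finite under-approximation of the language.</p>
        <preformat>
from itertools import product

def interleavings(s, t):
    """All shuffles of two traces (cf. Definition 3.3)."""
    if not s:
        return {t}
    if not t:
        return {s}
    return ({s[:1] + r for r in interleavings(s[1:], t)} |
            {t[:1] + r for r in interleavings(s, t[1:])})

def language(tree, max_loop=1):
    """Enumerate L(T) for a process tree given as nested tuples, e.g.
    ('xor', 'a', ('seq', 'b', 'c')); 'tau' denotes the silent activity."""
    if tree == 'tau':                                   # silent activity
        return {''}
    if isinstance(tree, str):                           # labelled leaf
        return {tree}
    op, *children = tree
    langs = [language(c, max_loop) for c in children]
    if op == 'seq':                                     # sequence: concatenation
        return {''.join(ws) for ws in product(*langs)}
    if op == 'xor':                                     # exclusive choice: union
        return set().union(*langs)
    if op == 'par':                                     # parallel: shuffle
        result = {''}
        for lang in langs:
            result = {w for a in result for b in lang for w in interleavings(a, b)}
        return result
    if op == 'loop':                                    # T1 (T2 T1)*, bounded unrolling
        l1, l2 = langs
        result, prefix = set(l1), set(l1)
        for _ in range(max_loop):
            prefix = {p + x + y for p in prefix for x in l2 for y in l1}
            result |= prefix
        return result
    raise ValueError('unknown operator: ' + repr(op))

print(sorted(language(('xor', 'a', ('par', 'b', 'c')))))   # ['a', 'bc', 'cb']
        </preformat>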
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Alignments</title>
        <p>An alignment between a trace σ = σ_1 · · · σ_n ∈ Σ* and a process tree T is a trace γ = γ_1 · · · γ_m which
consists of labels γ_i which are called moves. A move is either a pair of labels (a, a) for a ∈ Σ (which
we call a synchronous move), or a pair (a, ≫) for a ∈ Σ (which we call a log move), or a pair (≫, a)
for a ∈ Σ (which we call a model move). Here ≫ is a special symbol which indicates that an event is
skipped in the trace or model. The first components of the moves in γ should yield the trace σ (when
we remove all skip symbols ≫) and the second components should yield a trace in the language of
the process tree T (again without skip symbols). Intuitively, we aim to modify the trace σ such that
it becomes a trace in the language of the process tree T. From this point of view, a log move (a, ≫)
deletes the symbol a from σ while a model move (≫, a) inserts the symbol a into the trace σ.
Definition 3.5 (Move, Alignment). Let Σ be an alphabet and let ≫ be a fresh symbol not in Σ. We use
≫ to indicate a skip in the trace or model and define Σ_≫ := Σ ∪ {≫} as the alphabet extended by the
skip symbol ≫. We define LM(Σ) ⊆ Σ_≫ × Σ_≫ as the set of all legal moves over Σ, given by</p>
        <p>LM(Σ) := {(a, a) | a ∈ Σ} (synchronous moves)
        ∪ {(a, ≫) | a ∈ Σ} (log moves)
        ∪ {(≫, a) | a ∈ Σ} (model moves).</p>
        <p>An alignment γ ∈ LM(Σ)* between σ ∈ Σ* and a process tree T is a sequence of moves γ =
⟨γ_1, . . . , γ_m⟩ such that π_1^Σ(γ) = σ and π_2^Σ(γ) ∈ ℒ(T).</p>
        <p>As explained, one perspective is that a log move (a, ≫) deletes the symbol a from σ while a model
move (≫, a) inserts the symbol a into the trace σ, so alignments resemble the classic edit distance without
substitutions of characters. We determine the costs c(γ) of an alignment γ by summing up the costs c(γ_i)
of the individual moves γ_i in γ. In this paper, we use the standard cost function where synchronous
moves have cost 0 and where log and model moves have cost 1 (other cost functions are possible in
principle). The set of all alignments between a trace σ ∈ Σ* and a process tree T is denoted by Γ(σ, T).
An optimal alignment γ_opt ∈ Γ(σ, T) is an alignment with minimal costs c(γ_opt) among all alignments
in Γ(σ, T). Let us denote the optimal alignment costs for a trace σ and a process tree T by Costs(σ, T).</p>
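        <p>For intuition, the optimal alignment costs can be computed by brute force whenever ℒ(T) can be enumerated: synchronous moves are free, while log and model moves cost 1, so Costs(σ, T) is the minimum, over all model traces, of the edit distance that allows only insertions and deletions. A small Python sketch (helper names are ours):</p>
        <preformat>
from functools import lru_cache

def align_cost(trace, model_trace):
    """Minimal number of log moves (deletions from the trace) and model moves
    (insertions into the trace) needed to turn trace into model_trace;
    synchronous moves are free.  This is edit distance without substitutions."""
    @lru_cache(maxsize=None)
    def d(i, j):
        if i == len(trace):
            return len(model_trace) - j              # remaining model moves
        if j == len(model_trace):
            return len(trace) - i                    # remaining log moves
        best = min(d(i + 1, j) + 1,                  # log move (a, >>)
                   d(i, j + 1) + 1)                  # model move (>>, a)
        if trace[i] == model_trace[j]:
            best = min(best, d(i + 1, j + 1))        # synchronous move, cost 0
        return best
    return d(0, 0)

def costs(trace, model_language):
    """Costs(sigma, T) for an explicitly enumerated model language L(T)."""
    return min(align_cost(trace, m) for m in model_language)

print(costs('acb', {'ab', 'ba'}))   # 1: keep a and b, delete c
        </preformat>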
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Matrices and Operations</title>
        <p>A commutative semiring S = (S, +, ·, 0, 1) is a set S with two binary operations + and ·, and two
distinguished elements 0 and 1 such that (S, +, 0) is a commutative monoid, (S, ·, 1) is a commutative
monoid, and the multiplication · distributes over the addition +, i.e., a · (b + c) = a · b + a · c and
(a + b) · c = a · c + b · c for all a, b, c ∈ S. In this paper, we fix S as the so-called tropical semiring
S = (N ∪ {∞}, min, +, ∞, 0), where ∞ is the neutral element for addition (min) and 0 is
the neutral element for multiplication (+). We stress that addition is min and multiplication is + in the
tropical semiring. To avoid confusion between the symbolic expressions 0 and 1 for the neutral elements
and 0, 1 ∈ N for the natural numbers 0 and 1, we use the boldface notation <bold>0</bold> to denote the neutral
element for the addition in the semiring, and <bold>1</bold> to denote the neutral element for the multiplication. For
the particular semiring we use throughout this article, it holds that <bold>0</bold> = ∞ and <bold>1</bold> = 0 ∈ N.</p>
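        <p>A minimal sketch of the tropical semiring in Python (names are ours); the asserts check the neutral elements and distributivity on small examples:</p>
        <preformat>
import math

# Tropical (min-plus) semiring: "addition" is min, "multiplication" is +.
ZERO = math.inf   # neutral element of addition (the boldface 0 of the text)
ONE = 0           # neutral element of multiplication (the boldface 1)

def splus(a, b):    # semiring addition
    return min(a, b)

def stimes(a, b):   # semiring multiplication
    return a + b

assert splus(5, ZERO) == 5 and stimes(5, ONE) == 5                   # neutral elements
assert stimes(2, splus(3, 4)) == splus(stimes(2, 3), stimes(2, 4))   # distributivity
        </preformat>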
        <p>We consider matrices with row index set I and column index set J and entries in S as mappings
M : I × J → S and write M_{i,j} = M(i, j) ∈ S for the entry in row i and column j. Here, and in
what follows, we assume that I and J are non-empty finite sets. We denote by ℳ_{I×J}(S) the set
of all matrices with row index set I and column index set J and entries in S. Matrix addition on
ℳ_{I×J}(S) is defined component-wise, as usual. For matrices M ∈ ℳ_{I×J}(S) and N ∈ ℳ_{J×K}(S),
the matrix product M · N ∈ ℳ_{I×K}(S) is defined by (M · N)(i, k) = ∑_{j∈J} M_{i,j} · N_{j,k}. The transpose
of a matrix M ∈ ℳ_{I×J}(S) is the matrix M^⊤ ∈ ℳ_{J×I}(S) with entries M^⊤_{j,i} = M_{i,j}. We denote the
I × I-identity matrix over S by 1_{I×I} ∈ ℳ_{I×I}(S). For this matrix we have 1_{I×I}(i, j) = <bold>1</bold> = 0 if i = j
and 1_{I×I}(i, j) = <bold>0</bold> = ∞ otherwise.</p>
        <p>Vectors are special cases of matrices with a row index set I and the fixed column index set {⊤}, and
we write ℳ_I(S) for the set of all vectors with entries in S. Hence, by default, vectors are considered
as column vectors. If we want to consider a vector as a row vector, we write v^⊤ for the transpose of the
vector v, which is a mapping v^⊤ : {⊤} × I → S with v^⊤(⊤, i) = v(i) for all i ∈ I. For the special case
of {⊤} × {⊤}-matrices M, we identify M with the element M(⊤, ⊤) ∈ S. In particular, for two
I-vectors u, v we have ⟨u, v⟩ = u^⊤ · v ∈ S (standard inner product).</p>
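        <p>Matrix and vector products over the tropical semiring are obtained from the usual formulas by replacing sums with min and products with +. A small Python sketch (our own helper names, matrices as lists of rows, ∞ encoded as math.inf):</p>
        <preformat>
import math

INF = math.inf

def mat_mul(A, B):
    """Matrix product over the tropical semiring: the sum becomes min, the product +."""
    return [[min(A[i][k] + B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(A, v):
    return [min(A[i][k] + v[k] for k in range(len(v))) for i in range(len(A))]

def inner(u, v):
    """Standard inner product of u and v in the tropical semiring."""
    return min(a + b for a, b in zip(u, v))

# The tropical identity matrix has 0 on the diagonal and inf elsewhere.
I = [[0, INF], [INF, 0]]
A = [[1, 0], [INF, 1]]
assert mat_mul(I, A) == A and mat_mul(A, I) == A
        </preformat>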
        <p>Kronecker product. For matrices M ∈ ℳ_{I×J}(S) and N ∈ ℳ_{K×L}(S), the Kronecker product
M ⊗ N ∈ ℳ_{(I×K)×(J×L)}(S) is the matrix with row index set (I × K) and column index set (J × L)
with entries given by (M ⊗ N)((i, k), (j, l)) = M_{i,j} · N_{k,l}. Note that the sizes of the two index sets become
|I × K| = |I| · |K| and |J × L| = |J| · |L|, respectively. For more details on the Kronecker product,
cf. [16]. Most importantly, we make use of the mixed-product property of the Kronecker product, which
states that for matrices A, B, C, D such that the matrix products A · C and B · D exist, we have
(A ⊗ B) · (C ⊗ D) = (A · C) ⊗ (B · D).
Direct sum. For matrices M ∈ ℳ_{I×J}(S) and N ∈ ℳ_{K×L}(S), the direct sum M ⊕ N ∈
ℳ_{(I⊎K)×(J⊎L)}(S) is the matrix with row index set (I ⊎ K) and column index set (J ⊎ L) with
entries given by (M ⊕ N)(i, j) = M_{i,j} if (i, j) ∈ I × J, (M ⊕ N)(k, l) = N_{k,l} if (k, l) ∈ K × L, and <bold>0</bold>
otherwise. Here, ⊎ denotes the disjoint union of sets. Analogously, we define the direct sum of vectors
u ∈ ℳ_I(S) and v ∈ ℳ_K(S) as the vector u ⊕ v ∈ ℳ_{I⊎K}(S) with entries given by (u ⊕ v)(i) = u(i)
if i ∈ I and (u ⊕ v)(k) = v(k) if k ∈ K.</p>
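        <p>Both operations are straightforward to implement over the tropical semiring: Kronecker entries are multiplied (i.e., added), and the off-diagonal blocks of a direct sum are filled with the semiring zero ∞. The following Python sketch (our own helper names) also checks the mixed-product property on a small example:</p>
        <preformat>
import math
INF = math.inf

def kron(A, B):
    """Tropical Kronecker product: entries multiply, i.e. add."""
    return [[A[i][j] + B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def dsum(A, B):
    """Direct sum: block-diagonal matrix, off-blocks filled with the semiring zero (inf)."""
    n1, m1, n2, m2 = len(A), len(A[0]), len(B), len(B[0])
    top = [row + [INF] * m2 for row in A]
    bot = [[INF] * m1 + row for row in B]
    return top + bot

def mat_mul(A, B):
    return [[min(A[i][k] + B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

# Mixed-product property: (A kron B)(C kron D) = (AC) kron (BD).
A, B, C, D = [[0, 2], [1, 0]], [[3]], [[1, 0], [0, 4]], [[2]]
assert mat_mul(kron(A, B), kron(C, D)) == kron(mat_mul(A, C), mat_mul(B, D))
        </preformat>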
        <p>Matrix composition. Let M_1 ∈ ℳ_{I×J}(S), M_2 ∈ ℳ_{I×L}(S), M_3 ∈ ℳ_{K×J}(S) and M_4 ∈ ℳ_{K×L}(S)
be matrices over S. Then we define a new matrix M over the row index set (I ⊎ K) and the column
index set (J ⊎ L) given as
M = ( M_1  M_2 ; M_3  M_4 ),
with entries M(i, j) defined according to the four submatrices. We sometimes use the same notation for
vectors v_1 ∈ ℳ_I(S) and v_2 ∈ ℳ_K(S) to define the new (I ⊎ K)-vector v given as
v = ( v_1 ; v_2 ).
Note that v corresponds to the direct sum v_1 ⊕ v_2. Moreover, note that, on the contrary, the matrix
composition cannot be expressed as a direct sum if M_2 ≠ 0_{I×L} or M_3 ≠ 0_{K×J}.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Construction of Matrix Systems for Process Trees</title>
      <p>Next, we give our main construction to effectively translate a process tree T into a matrix system which
captures the alignment costs for T. Let Σ be the alphabet. We construct a family of Q × Q-matrices
(M_a)_{a∈Σ}, a Q-vector v_in ∈ ℳ_Q(S) and a Q-vector v_fin ∈ ℳ_Q(S) such that the optimal alignment
costs for a trace σ = σ_1 σ_2 σ_3 · · · σ_n ∈ Σ* and T can be expressed as a matrix multiplication problem
as follows:</p>
      <p>Costs(σ, T) = v_in^⊤ · M_{σ_1} M_{σ_2} · · · M_{σ_n} · v_fin ∈ S.   (2)</p>
      <p>In particular, for the empty trace σ = ε, the optimal alignment costs are given by v_in^⊤ v_fin. We
introduce some notation: for a sequence σ = σ_1 σ_2 · · · σ_n we write M_σ to abbreviate the product
M_{σ_1} M_{σ_2} · · · M_{σ_n}. In case σ = ε, we have M_ε = 1_{Q×Q} (the index set Q × Q of the matrices will become
clear from the context). With this notation, Equation (2) can be written as Costs(σ, T) = v_in^⊤ · M_σ · v_fin.</p>
      <p>For the further algorithmic construction, we proceed by recursion on the structure of T.
Non-occurring activities. First, if an activity a ∈ Σ does not occur in T, then we can always set
M_a = 1 · 1_{Q×Q} (no matter which process tree operator we are considering), where 1_{Q×Q} is the Q × Q
identity matrix (and where Q is the appropriate index set). This step is sound since 1 · 1_{Q×Q} naturally
commutes with all Q × Q-matrices and does not affect the product of the remaining matrices. Also, this
matrix ensures that we charge an extra cost of 1 for the necessary log move (a, ≫) to delete the activity
a from the trace (since this activity does not occur in the process tree, this is the only way to align it).</p>
      <p>Labeled leaf. If T = a ∈ Σ is a leaf with label a, we set Q = {0, 1} and define, writing the matrix and
vector entries as explicit elements of N ∪ {∞}:
for b ∈ Σ, b ≠ a:  M_b = ( 1 ∞ ; ∞ 1 ) = 1 · 1_{Q×Q},   and for a:  M_a = ( 1 0 ; ∞ 1 ).
Moreover, we set v_in = ( 0 ; ∞ ) and v_fin = ( 1 ; 0 )
(recall that the semiring zero is ∞ and the semiring one is 0 ∈ N).</p>
      <p>To verify correctness, first note that v_in^⊤ v_fin = 1, which indeed are the alignment costs for T = a
in case σ = ε. Further, note that since M_b = 1 · 1_{Q×Q} for b ≠ a, the matrix M_b commutes with all
Q × Q-matrices. Also, it is readily verified that M_a^k = (k − 1) · M_a for k ≥ 1. Hence, if the trace
σ = σ_1 · · · σ_n contains no a, we have M_σ = n · 1_{Q×Q}, and if σ contains k ≥ 1 many a, we have
M_σ = (n − k) · (k − 1) · M_a = (n − 1) · M_a. Since v_in^⊤ · v_fin = 1 and v_in^⊤ · M_a · v_fin = 0, we have
v_in^⊤ · M_σ · v_fin = n + 1 if σ does not contain a and n − 1 if σ contains a, which proves the correctness of
the construction.</p>
      <p>Silent leaf. For the case of a silent leaf T = τ, we set Q = {0} and define M_a = ( 1 ) for all a ∈ Σ.
Moreover, we let v_in = ( 0 ) and v_fin = ( 0 ). Then, we have v_in^⊤ · M_{σ_1} M_{σ_2} · · · M_{σ_n} · v_fin = n. These are
indeed the optimal alignment costs for the trace σ in the case T = τ.</p>
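      <p>The leaf construction can be checked numerically. The following Python sketch uses the explicit tropical matrices given above (our reading of the construction, with our own helper names) and evaluates v_in^⊤ · M_{σ_1} · · · M_{σ_n} · v_fin by repeated matrix-vector products:</p>
      <preformat>
import math
INF = math.inf

def mat_vec(A, v):
    return [min(A[i][k] + v[k] for k in range(len(v))) for i in range(len(A))]

def inner(u, v):
    return min(a + b for a, b in zip(u, v))

def leaf(label, alphabet):
    """Matrix system for the process tree T = label (a single labelled leaf)."""
    M = {b: [[1, INF], [INF, 1]] for b in alphabet}   # 1 * identity: delete b
    M[label] = [[1, 0], [INF, 1]]                     # one synchronous move possible
    v_in, v_fin = [0, INF], [1, 0]
    return M, v_in, v_fin

def costs(trace, system):
    """Evaluate the product v_in^T M_{s_1} ... M_{s_n} v_fin right to left."""
    M, v_in, v_fin = system
    v = v_fin
    for a in reversed(trace):
        v = mat_vec(M[a], v)
    return inner(v_in, v)

sys_a = leaf('a', {'a', 'b', 'c'})
assert costs('', sys_a) == 1        # one model move
assert costs('a', sys_a) == 0       # one synchronous move
assert costs('bcb', sys_a) == 4     # n + 1: a does not occur in the trace
assert costs('bab', sys_a) == 2     # n - 1: a occurs in the trace
      </preformat>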
      <p>Exclusive Choice. If T = T_1 × T_2, let M_a^i, v_in^i, and v_fin^i for i = 1, 2 be such that
Costs(σ, T_i) = (v_in^i)^⊤ · M^i_{σ_1} M^i_{σ_2} · · · M^i_{σ_n} · v_fin^i.</p>
      <p>Semantically, the exclusive choice operator allows us to choose between the two branches T_1 and T_2.
Hence, we define the matrices M_a and vectors v_in and v_fin as follows:
M_a = M_a^1 ⊕ M_a^2,   v_in = v_in^1 ⊕ v_in^2,   v_fin = v_fin^1 ⊕ v_fin^2.   (3)
Then, we have
v_in^⊤ · M_{σ_1} M_{σ_2} · · · M_{σ_n} · v_fin = (v_in^1 ⊕ v_in^2)^⊤ · (M^1_{σ_1} ⊕ M^2_{σ_1}) · · · (M^1_{σ_n} ⊕ M^2_{σ_n}) · (v_fin^1 ⊕ v_fin^2)
= (v_in^1)^⊤ · M^1_{σ_1} M^1_{σ_2} · · · M^1_{σ_n} · v_fin^1 + (v_in^2)^⊤ · M^2_{σ_1} M^2_{σ_2} · · · M^2_{σ_n} · v_fin^2.</p>
      <p>Recalling that + = min, this expression indeed computes Costs(σ, T), since the addition operation in
the semiring chooses the minimum of the two branches.</p>
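      <p>The following Python sketch (again with our own helper names) combines two leaf systems with direct sums as in Equation (3) and confirms on small traces that the result behaves like the exclusive choice ×(a, b):</p>
      <preformat>
import math
INF = math.inf

def mat_vec(A, v):
    return [min(A[i][k] + v[k] for k in range(len(v))) for i in range(len(A))]

def inner(u, v):
    return min(a + b for a, b in zip(u, v))

def dsum_mat(A, B):
    return ([row + [INF] * len(B[0]) for row in A] +
            [[INF] * len(A[0]) + row for row in B])

def leaf(label, alphabet):
    M = {b: [[1, INF], [INF, 1]] for b in alphabet}
    M[label] = [[1, 0], [INF, 1]]
    return M, [0, INF], [1, 0]

def choice(sys1, sys2, alphabet):
    """T = x(T1, T2): direct sums of the matrices and of the vectors."""
    M1, i1, f1 = sys1
    M2, i2, f2 = sys2
    M = {a: dsum_mat(M1[a], M2[a]) for a in alphabet}
    return M, i1 + i2, f1 + f2

def costs(trace, system):
    M, v_in, v_fin = system
    v = v_fin
    for a in reversed(trace):
        v = mat_vec(M[a], v)
    return inner(v_in, v)

Sigma = {'a', 'b', 'c'}
T = choice(leaf('a', Sigma), leaf('b', Sigma), Sigma)   # T = x(a, b)
assert costs('a', T) == 0      # pick the left branch
assert costs('ab', T) == 1     # keep a, delete b (or vice versa)
assert costs('c', T) == 2      # delete c, insert a (or b)
assert costs('', T) == 1       # one model move in either branch
      </preformat>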
      <p>Sequence. For the sequence operator, i.e., T = T_1 → T_2, let M_a^i, v_in^i, and v_fin^i for i = 1, 2 be such that
Costs(σ, T_i) = (v_in^i)^⊤ · M^i_{σ_1} M^i_{σ_2} · · · M^i_{σ_n} · v_fin^i.</p>
      <p>Let Q be the row and column index set of the matrices M_a^1 and R be the row and column index set of
the matrices M_a^2. Then v_in^1 and v_fin^1 are vectors in ℳ_Q(S) and v_in^2 and v_fin^2 are vectors in ℳ_R(S). We
construct new matrices M_a, for a ∈ Σ, for the tree T with index sets (Q ⊎ R) × (Q ⊎ R) as follows:
M_a = ( M_a^1   M_a^1 · E ; 0_{R×Q}   M_a^2 ),   where E = v_fin^1 · (v_in^2)^⊤ ∈ ℳ_{Q×R}(S).   (4)
Moreover, for c_i = (v_in^i)^⊤ · v_fin^i ∈ S for i = 1, 2, we let
v_in = v_in^1 ⊕ c_1 · v_in^2 ∈ ℳ_{Q⊎R}(S)   and   v_fin = c_2 · v_fin^1 ⊕ v_fin^2 ∈ ℳ_{Q⊎R}(S).   (5)</p>
      <p>Then, we have for a trace σ = σ_1 σ_2 · · · σ_n:
M_{σ_1} · · · M_{σ_n} = ( M^1_{σ_1} · · · M^1_{σ_n}   B ; 0_{R×Q}   M^2_{σ_1} · · · M^2_{σ_n} )
with B = M^1_{σ_1} · · · M^1_{σ_n} E + M^1_{σ_1} · · · M^1_{σ_{n−1}} E M^2_{σ_n} + · · · + M^1_{σ_1} E M^2_{σ_2} · · · M^2_{σ_n}.
Note that B corresponds to all possible decompositions of σ into two segments which are aligned against
the two subtrees T_1 and T_2, where the segment for T_1 must be non-empty. Then we have:
v_in^⊤ · M_{σ_1} · · · M_{σ_n} · v_fin = c_2 · (v_in^1)^⊤ M^1_σ v_fin^1 + (v_in^1)^⊤ · B · v_fin^2 + c_1 · (v_in^2)^⊤ M^2_σ v_fin^2.
This sum corresponds to all possible ways to decompose σ into a sequence σ = α β of two traces
α, β, and to align the first subtrace α against T_1 and the second subtrace β against T_2. The optimal
alignment costs are correctly taken as the minimum over all such decompositions.</p>
      <p>Parallel Operator. For T = T_1 ∧ T_2, let M_a^i, v_in^i, and v_fin^i for i = 1, 2 be such that for all traces σ
Costs(σ, T_i) = (v_in^i)^⊤ · M^i_{σ_1} M^i_{σ_2} · · · M^i_{σ_n} · v_fin^i.</p>
      <p>Let Q be the row and column index set of the matrices M_a^1, and R be the row and column index set of
the matrices M_a^2. Then v_in^1 and v_fin^1 are vectors in ℳ_Q(S) and v_in^2 and v_fin^2 are vectors in ℳ_R(S). We
construct new matrices M_a, for a ∈ Σ, for T with row and column index sets Q × R as follows:
M_a = (M_a^1 ⊗ 1_{R×R}) + (1_{Q×Q} ⊗ M_a^2).
Moreover, we set
v_in = v_in^1 ⊗ v_in^2 ∈ ℳ_{Q×R}(S)   and   v_fin = v_fin^1 ⊗ v_fin^2 ∈ ℳ_{Q×R}(S).   (6)</p>
      <p>In anticipation of our treatment of process trees with unique labels, let us assume that the sets of
labels in the two subtrees T_1 and T_2 are disjoint, i.e., Σ_1 ∩ Σ_2 = ∅ (where Σ_i is the set of labels which
occur in T_i). Assume that a ∉ Σ_1. Then M_a^1 = 1 · 1_{Q×Q} and we get:
M_a = (M_a^1 ⊗ 1_{R×R}) + (1_{Q×Q} ⊗ M_a^2) = (1 · 1_{Q×Q} ⊗ 1_{R×R}) + (1_{Q×Q} ⊗ M_a^2)
= (1_{Q×Q} ⊗ 1 · 1_{R×R}) + (1_{Q×Q} ⊗ M_a^2) = 1_{Q×Q} ⊗ (1 · 1_{R×R} + M_a^2) = 1_{Q×Q} ⊗ M_a^2.
Here we used that 1 · 1_{R×R} + M_a^2 = M_a^2 (easy to verify by induction, since all matrices M_a^2 have the
elements 0, 1 ∈ N on their diagonal). This means that for process trees with unique labels, the definition
can be simplified. In general, the product of the matrices expands to
M_σ = ((M^1_{σ_1} ⊗ 1_{R×R}) + (1_{Q×Q} ⊗ M^2_{σ_1})) · · · ((M^1_{σ_n} ⊗ 1_{R×R}) + (1_{Q×Q} ⊗ M^2_{σ_n}))
= ∑ ( M^1_{α_1} M^1_{α_2} · · · M^1_{α_k} ⊗ M^2_{β_1} M^2_{β_2} · · · M^2_{β_k} : α_1 β_1 α_2 β_2 · · · α_k β_k = σ, α_i, β_i ∈ Σ* ).
Moreover, for each summand M^1_{α_1} M^1_{α_2} · · · M^1_{α_k} ⊗ M^2_{β_1} M^2_{β_2} · · · M^2_{β_k}, we have
v_in^⊤ · ( M^1_{α_1} M^1_{α_2} · · · M^1_{α_k} ⊗ M^2_{β_1} M^2_{β_2} · · · M^2_{β_k} ) · v_fin
= (v_in^1)^⊤ · M^1_{α_1} M^1_{α_2} · · · M^1_{α_k} · v_fin^1 · (v_in^2)^⊤ · M^2_{β_1} M^2_{β_2} · · · M^2_{β_k} · v_fin^2.</p>
      <p>Hence, this term corresponds to the optimal alignment costs where we split the trace σ into two
subtraces α_1 α_2 · · · α_k and β_1 β_2 · · · β_k, for which the shuffle yields σ, and align them against the process
trees T_1 and T_2, respectively. As the sum runs over all possible decompositions of σ, this determines the
optimal alignment costs for the trace σ and the process tree T. Again, in anticipation of the treatment
of process trees with unique labels, note that for the case Σ_1 ∩ Σ_2 = ∅, the product simplifies to:
M_σ = ∏_{σ_i ∉ Σ_2} M^1_{σ_i} ⊗ ∏_{σ_i ∈ Σ_2} M^2_{σ_i}.   (7)
Hence, instead of an exponential number of summands, we get a single term for the matrix M_σ (this is
the key for tractability of the alignment problem for process trees with unique labels).</p>
      <p>Loop. Finally, we consider the case of a loop operator. Let T = T_1 ⟲ T_2, where, by recursion, we
can assume to have already constructed the matrices M_a^1, v_in^1, and v_fin^1 for the process tree T_1 and the
matrices M_a^2, v_in^2, and v_fin^2 for the process tree T_2, with row and column index sets Q and R, respectively.
Then we set
M_a = ( M_a^1   M_a^1 · E_1 ; M_a^2 · E_2   M_a^2 ),   with E_1 = v_fin^1 · (v_in^2)^⊤ and E_2 = v_fin^2 · (v_in^1)^⊤.   (9)
Moreover, for c = ⟨v_in^1, v_fin^1⟩, we set
v_in = (v_in^1 ⊕ c · v_in^2)   and   v_fin = (v_fin^1 ⊕ c · v_fin^2).   (10)</p>
      <p>It can be verified that
M_σ = ( M_σ^{11}   M_σ^{12} ; M_σ^{21}   M_σ^{22} ),
where each block M_σ^{ij} is the sum over all terms of the form M^{i}_{α_0} E M^{·}_{α_1} E · · · M^{j}_{α_k} with
α_0 α_1 α_2 · · · α_k = σ, where k ≥ 0, α_0 ∈ Σ*, α_1, . . . , α_k ∈ Σ+, the upper indices alternate between 1 and 2
(starting with i and ending with j), and the matching jump matrices E_1, E_2 appear in between.</p>
      <p>Each summand of M_σ^{ij} corresponds to a decomposition of the trace σ into segments α_0 α_1 α_2 · · · α_k
where the first segment is aligned against the process tree T_i and the last segment is aligned against
the process tree T_j (in between we have strict alternation). Moreover,
v_in^⊤ · M_σ · v_fin = (v_in^1)^⊤ M_σ^{11} v_fin^1 + c · (v_in^1)^⊤ M_σ^{12} v_fin^2 + c · (v_in^2)^⊤ M_σ^{21} v_fin^1 + c · c · (v_in^2)^⊤ M_σ^{22} v_fin^2.
Note that c = ⟨v_in^1, v_fin^1⟩, i.e., c captures the costs to align
the empty trace ε against the process tree T_1. Hence, the total sum consists of all possible ways to
decompose the trace σ into segments α_0 α_1 α_2 · · · α_k which are aligned against the trees T_1 and T_2 in
an alternating fashion. Depending on which tree we start/end with, c takes care of the costs of aligning
ε against T_1 (since the semantics of the loop operator demands that we have to start and end with T_1;
recall: for T = T_1 ⟲ T_2 we have ℒ(T) = ℒ(T_1) · (ℒ(T_2) · ℒ(T_1))*). Finally, the sum selects the optimal
decomposition and hence the optimal alignment costs for the trace σ and the process tree T.</p>
      <p>We have established the main result of this work:</p>
      <p>Theorem 4.1. There exists an algorithm which takes as input a process tree T over some alphabet Σ and
computes a family of Q × Q-matrices (M_a)_{a∈Σ}, a Q-vector v_in ∈ ℳ_Q(S) and a Q-vector v_fin ∈ ℳ_Q(S),
for some index set Q, such that the optimal alignment costs for a trace σ = σ_1 σ_2 σ_3 · · · σ_n ∈ Σ* and T
can be expressed as a matrix multiplication problem as follows:</p>
      <p>Costs(σ, T) = v_in^⊤ · M_{σ_1} M_{σ_2} · · · M_{σ_n} · v_fin ∈ S.</p>
      <p>Our construction is not efficient. Consider a process tree of the form
T = (((a_1 ∧ a_2) ∧ · · · ) ∧ a_n)
with n activities a_1, a_2, . . . , a_n that can occur in any order. For T, our construction yields matrices
of size s^n (where s is the size of the base matrices for labels a ∈ Σ). This is because the size of the
Kronecker product of two matrices is the product of their sizes. Hence, in general, the only size bound
on M_a that we can get is exponential, i.e., 2^{O(‖T‖)}. In the following, we thus discuss how to avoid
constructing the matrices M_a explicitly and how to work with their symbolic representations instead.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Symbolic Computations with Tree-Structured Matrices and Vectors</title>
      <p>
        Our matrix/vector construction from Section 4 has one drawback: the Kronecker product leads to an
exponential blow-up of the sizes of matrices and vectors. Since the alignment problem on process
trees is NP-complete [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], it is clear that this state explosion cannot be avoided in the general case.
On the other hand, the resulting matrices and vectors have simple descriptions in the form of algebraic
expressions. This suggests that, instead of working with explicit matrices, we could compute with the
symbolic representations of the matrices and vectors. In this section, we investigate this idea.
      </p>
      <sec id="sec-5-1">
        <title>5.1. Tree-structured index sets</title>
        <p>We start by defining tree-structured sets (ts-sets, for short) to index the matrices and vectors used by the
construction in Section 4. We use symbolic representations for the sets, to which we associate a semantics
as (potentially exponential-size) standard sets. In the following, we have to distinguish between the
symbolic representation of a ts-set U and its semantics as a standard set. Hence, to avoid confusion, we
also write rset(U) to denote the standard set that is represented by U. For the case of base sets we have
U = rset(U). Also, each ts-set has a representation size ‖U‖ ∈ N which quantifies the encoding length of
the symbolic representation U.</p>
        <p>Fix a countably infinite set of atoms Atoms = {c_0, c_1, c_2, . . .} which are used as base elements. For
simplicity, we assume that we can encode atoms with unit costs. A tree-structured set (ts-set, for short) is
an expression according to the following inductive definition:
Base case. Each standard (finite, non-empty) set U ⊆ Atoms is a ts-set which represents the set
rset(U) = U. The representation size of the atomic set is ‖U‖ = |U|.</p>
        <p>Disjoint union. For ts-sets U, V, we can construct the ts-set U ⊎ V which represents the (standard) set
rset(U) ⊎ rset(V). The representation size of the set U ⊎ V is ‖U‖ + ‖V‖ + 1.</p>
        <p>Cartesian product. For ts-sets U, V, we can construct the ts-set U × V which represents the (standard)
set rset(U) × rset(V). The representation size of the set U × V is ‖U‖ + ‖V‖ + 1.</p>
        <p>Note that the representation size ‖U‖ of U is linear in the representation sizes of the ts-subsets for all
operations and linear in the cardinality of the set U for the base case. This is in contrast to the size of
the represented set, specifically for the case of the Cartesian product operation U × V. While ‖U × V‖ is
‖U‖ + ‖V‖ + 1, the size of the represented standard set is |rset(U × V)| = |rset(U)| · |rset(V)|.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Tree-structured vectors</title>
        <p>We proceed by defining tree-structured vectors (ts-vectors, for short). Let U be a ts-set. Then a ts-vector v
with ts-index set U has a semantics in form of an rset(U)-vector v : rset(U) → S. To emphasize the
distinction between a ts-vector v (as a symbolic expression) and the represented standard vector (its
semantics), we also denote the vector represented by v as rvec(v). Again, for simplicity, we assume that
we can encode the semiring elements of S with unit costs. As a consequence, the encoding length ‖v‖ ∈ N
of a ts-vector v with ts-index set U corresponds to the representation size of the ts-set U, i.e., ‖v‖ = ‖U‖.</p>
        <sec id="sec-5-2-1">
          <title>5.2.1. Definition of ts-vectors</title>
          <p>The definition of ts-vectors is recursive:
Base case. For a standard set U, each standard vector v : U → S is also a ts-vector v with index set U.
The representation size of the vector is ‖v‖ = ‖U‖.</p>
          <p>Direct sum. Let v_1 be a ts-vector with ts-index set U and v_2 be a ts-vector with ts-index set V. Then
we can construct a new ts-vector v with ts-index set U ⊎ V as v = v_1 ⊕ v_2. The represented vector is
rvec(v) = rvec(v_1) ⊕ rvec(v_2). The representation size of the new vector is ‖v‖ = ‖v_1‖ + ‖v_2‖ + 1.
Kronecker product. Let v_1 be a ts-vector with ts-index set U and v_2 be a ts-vector with ts-index set V.
Then we can construct a new ts-vector v with ts-index set U × V as v = v_1 ⊗ v_2. The represented vector
rvec(v) is the rset(U) × rset(V)-vector given as rvec(v) = rvec(v_1) ⊗ rvec(v_2). The representation size
of the new vector is ‖v‖ = ‖v_1‖ + ‖v_2‖ + 1.</p>
        </sec>
        <sec id="sec-5-2-2">
          <title>5.2.2. Effective Computations with Ts-Vectors</title>
          <p>We show that we can efficiently compute the scalar multiplication, the inner product, and the vector addition (for
non-Kronecker products) of ts-vectors. This means the following: there are polynomial-time algorithms
which take as input the symbolic representation of ts-vectors and output the symbolic representation of
the result of a vector operation on the represented vectors.</p>
          <p>Scalar multiplication. Given a scalar c ∈ S and a ts-vector v. To compute a representation of c v, it
suffices to observe that c (v_1 ⊕ v_2) = c v_1 ⊕ c v_2 and c (v_1 ⊗ v_2) = (c v_1) ⊗ v_2 = v_1 ⊗ (c v_2). Using these
identities, a recursive computation of a representation for c v can trivially be achieved in time O(‖v‖).
Also note that scalar multiplication does not change the representation size of v, i.e., ‖c v‖ = ‖v‖.</p>
          <p>Inner product. Let u, v be ts-vectors with the same ts-index set U. We explain how to efficiently
compute the inner product ⟨u, v⟩ = u^⊤ v = v^⊤ u ∈ S via recursion on the structure of u, v (and U),
where the case for base sets is trivial:
• If u = u_1 ⊕ u_2 and v = v_1 ⊕ v_2 are direct sum vectors, then we have ⟨u, v⟩ = ⟨u_1, v_1⟩ + ⟨u_2, v_2⟩.
Hence, we can recursively compute the inner products ⟨u_1, v_1⟩ and ⟨u_2, v_2⟩ and add the results.
• If u = u_1 ⊗ u_2 and v = v_1 ⊗ v_2 are Kronecker product vectors, then we have:
⟨u, v⟩ = u^⊤ · v = (u_1 ⊗ u_2)^⊤ · (v_1 ⊗ v_2) = (u_1^⊤ · v_1) · (u_2^⊤ · v_2) = ⟨u_1, v_1⟩ · ⟨u_2, v_2⟩.
Hence, we recursively compute the inner products ⟨u_1, v_1⟩ and ⟨u_2, v_2⟩ and multiply the results.
In particular, it follows that the inner product of ts-vectors can be computed in linear time O(‖u‖).</p>
          <p>Addition of ts-vectors without Kronecker products. A final simple observation is that we can
compute a representation for the sum u + v of two ts-vectors u and v over a ts-index set U which are
built without the use of Kronecker products in linear time O(‖u‖ + ‖v‖) as well. We only need to make
use of the identity (u_1 ⊕ u_2) + (v_1 ⊕ v_2) = (u_1 + v_1) ⊕ (u_2 + v_2) for the case of direct sums.</p>
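          <p>The two identities above translate directly into a recursive procedure. The following Python sketch (our own data types for ts-vectors over the tropical semiring) computes the inner product without ever expanding Kronecker products:</p>
          <preformat>
import math
from dataclasses import dataclass
from typing import List, Union

INF = math.inf

# A ts-vector is either an explicit base vector, a direct sum, or a Kronecker
# product of two ts-vectors.

@dataclass
class Base:
    entries: List[float]

@dataclass
class DSum:
    left: 'TSVec'
    right: 'TSVec'

@dataclass
class Kron:
    left: 'TSVec'
    right: 'TSVec'

TSVec = Union[Base, DSum, Kron]

def inner(u: TSVec, v: TSVec) -> float:
    """Inner product of u and v, computed symbolically; u and v must have the
    same shape (i.e., the same ts-index set)."""
    if isinstance(u, Base):
        return min((a + b for a, b in zip(u.entries, v.entries)), default=INF)
    if isinstance(u, DSum):   # direct sums: add the two inner products (+ is min)
        return min(inner(u.left, v.left), inner(u.right, v.right))
    if isinstance(u, Kron):   # Kronecker products: multiply the inner products (* is +)
        return inner(u.left, v.left) + inner(u.right, v.right)
    raise TypeError(u)

# A Kronecker-structured pair of vectors handled without expanding them.
u = Kron(Base([0, 2]), Base([1, INF]))
v = Kron(Base([3, 0]), Base([0, 5]))
assert inner(u, v) == min(0 + 3, 2 + 0) + min(1 + 0, INF + 5)   # = 2 + 1 = 3
          </preformat>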
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Tree-structured matrices</title>
        <p>We shift our attention to tree-structured matrices (ts-matrices, for short). Analogously to ts-vectors, we
consider ts-matrices M with ts-index sets U × V which have a semantics as standard rset(U) × rset(V)-
matrices rmat(M). Also analogously, each ts-matrix has a representation size ‖M‖ ∈ N which
corresponds to its encoding complexity. In contrast to ts-vectors, however, we restrict the application
of one operation (namely, forming matrix compositions) to certain types of input ts-matrices. This
restriction may seem arbitrary, but is required later in order to show that ts-matrices can be efficiently
multiplied. The inductive definition of tree-structured matrices is as follows:
Base case (standard matrices). For standard sets U and V, each mapping M : rset(U) × rset(V) → S
is a ts-matrix over S. Of course, we have rmat(M) = M for the base case. The representation size is
‖M‖ = |rset(U)| · |rset(V)|.</p>
        <p>Kronecker product. Let M be a U × V ts-matrix and N be a W × X ts-matrix (where U, V, W, X
are ts-sets). Then we can construct a new ts-matrix M ⊗ N with ts-index set (U × W) × (V × X),
with the represented matrix being the Kronecker product of the represented matrices, i.e.,
rmat(M ⊗ N) = rmat(M) ⊗ rmat(N). The representation size is ‖M ⊗ N‖ = ‖M‖ + ‖N‖ + 1.
Direct sum. Let M be a U × U square ts-matrix and N be a V × V square ts-matrix (where U and V
are ts-sets). Then we can construct a new square ts-matrix M ⊕ N with index set (U ⊎ V) × (U ⊎ V).
The represented matrix is just rmat(M ⊕ N) = rmat(M) ⊕ rmat(N). The representation size of the new
matrix is ‖M ⊕ N‖ = ‖M‖ + ‖N‖ + 1.
Matrix composition. As mentioned above, for this last operation we restrict the type of input matrices:
we only allow ts-matrices as input that do not contain the Kronecker product operator. Let four ts-matrices
M_1 with index set U × V, M_2 with index set U × X, M_3 with index set W × V, and M_4 with index set W × X
(all index sets U, V, W, X are ts-sets and so are U × V, U × X, W × V, and W × X) be given, where all of
the matrices M_1, M_2, M_3, and M_4 are built without use of the Kronecker product. Then we can construct a
new ts-matrix M with index set (U ⊎ W) × (V ⊎ X) (note that this is a ts-set) as follows:
M = ( M_1  M_2 ; M_3  M_4 ).
The representation size of this new matrix is ‖M‖ = ‖M_1‖ + ‖M_2‖ + ‖M_3‖ + ‖M_4‖ + 1. The represented
matrix rmat(M) has index set rset(U ⊎ W) × rset(V ⊎ X) and is obtained by the matrix composition of
the four submatrices rmat(M_1), rmat(M_2), rmat(M_3), and rmat(M_4) as indicated.</p>
        <sec id="sec-5-3-1">
          <title>5.3.1. Addition of Non-Kronecker-Product Ts-matrices</title>
          <p>A representation of M + N can be computed in time O(‖M‖ + ‖N‖) for two ts-matrices M and N with
the same index set U × V if both matrices M, N are built without the use of Kronecker products (this is
analogous to the case of ts-vectors that we considered above). That is clear for base case matrices, so let
us only check the other cases:
Direct sum. For two direct sum matrices M = M_1 ⊕ M_2 and N = N_1 ⊕ N_2, we have M + N =
(M_1 + N_1) ⊕ (M_2 + N_2), so we can reduce this case recursively to simpler types of matrices.
Matrix composition. For matrices
M = ( M_1  M_2 ; M_3  M_4 )   and   N = ( N_1  N_2 ; N_3  N_4 ),
we have
M + N = ( M_1 + N_1   M_2 + N_2 ; M_3 + N_3   M_4 + N_4 ),
so this case can readily be reduced to simpler types of ts-matrices by recursively computing the sums
of the corresponding submatrices M_1 + N_1, M_2 + N_2, M_3 + N_3, and M_4 + N_4.</p>
        </sec>
        <sec id="sec-5-3-2">
          <title>5.3.2. Matrix-vector multiplication</title>
          <p>It is further possible to compute a ts-vector representation of M v for a given U × V ts-matrix M and a
ts-vector v with index set V in time O(‖M‖). To see this, we, again, recurse on the type of M.
Base case. If M is a base matrix, then V is a base set and thus v is a base ts-vector. Hence, the
standard matrix-vector product can be computed in time O(|U| · |V|), where |U| and |V| are the sizes of
the sets U and V, respectively. Since ‖M‖ = |U| · |V|, the claim follows.</p>
          <p>Direct sum. If M = M_1 ⊕ M_2 is a direct sum matrix, then v = v_1 ⊕ v_2 is a direct sum as well (otherwise,
the index sets would not match). Then we have M v = M_1 v_1 ⊕ M_2 v_2 and we can reduce to simpler cases.
The running time is O(‖M_1‖ + ‖M_2‖) = O(‖M‖) as claimed.</p>
          <p>Kronecker product. If M = M_1 ⊗ M_2 is a Kronecker product matrix, then v = v_1 ⊗ v_2 is a Kronecker
product as well (again, otherwise the index sets would not match). Then M v = M_1 v_1 ⊗ M_2 v_2 due to the
mixed-product property, so we just need to compute a representation for M_1 v_1 and M_2 v_2 and obtain a
representation for M v. The required computation time is O(‖M_1‖ + ‖M_2‖) = O(‖M‖) as claimed.</p>
          <p>Matrix composition. If M = ( M_1  M_2 ; M_3  M_4 ) is a matrix composition, then v = v_1 ⊕ v_2 is a direct sum
(with index set V ⊎ X). By our (restricted) definition of the matrix composition operator for ts-matrices,
no submatrix M_1, M_2, M_3, or M_4 contains the Kronecker product operator. We compute M v as follows:
M v = (M_1 v_1 + M_2 v_2) ⊕ (M_3 v_1 + M_4 v_2).</p>
          <p>Since M_1, M_2, M_3, and M_4 do not contain the Kronecker product operator, the recursive computation
of M_1 v_1, M_2 v_2, M_3 v_1, and M_4 v_2 yields ts-vectors without the Kronecker product operator as well. We
can recursively compute representations for these products in time O(‖M_1‖ + ‖M_2‖ + ‖M_3‖ + ‖M_4‖).
Next, we compute the sums M_1 v_1 + M_2 v_2 and M_3 v_1 + M_4 v_2, which can also be done in time O(‖M_1‖ +
‖M_2‖ + ‖M_3‖ + ‖M_4‖) as shown earlier (since the vectors M_1 v_1, M_2 v_2, M_3 v_1, and M_4 v_2 do not contain
the Kronecker product operator). In total, we obtain a representation for M v in time O(‖M‖), as claimed.
We remark that, although we have explicitly considered a matrix-vector product of the form M v, it
should be clear that by a completely symmetric treatment, we can derive an efficient algorithm for a
matrix-vector product of the form u^⊤ M as well.</p>
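          <p>The case distinction above can be turned into a small recursive procedure. The following Python sketch (our own data types; the tropical semiring as before) computes a symbolic representation of M v, using the mixed-product property for Kronecker-structured matrices and entry-wise min as vector addition in the composition case:</p>
          <preformat>
import math
from dataclasses import dataclass
from typing import List, Union

INF = math.inf

# ts-vectors -----------------------------------------------------------------
@dataclass
class VBase:
    entries: List[float]

@dataclass
class VDSum:
    left: 'TSVec'
    right: 'TSVec'

@dataclass
class VKron:
    left: 'TSVec'
    right: 'TSVec'

TSVec = Union[VBase, VDSum, VKron]

# ts-matrices ----------------------------------------------------------------
@dataclass
class MBase:
    rows: List[List[float]]

@dataclass
class MDSum:
    left: 'TSMat'
    right: 'TSMat'

@dataclass
class MKron:
    left: 'TSMat'
    right: 'TSMat'

@dataclass
class MComp:           # 2x2 block composition with Kronecker-free blocks
    m1: 'TSMat'
    m2: 'TSMat'
    m3: 'TSMat'
    m4: 'TSMat'

TSMat = Union[MBase, MDSum, MKron, MComp]

def vadd(u: TSVec, v: TSVec) -> TSVec:
    """Semiring addition (entry-wise min) of Kronecker-free ts-vectors."""
    if isinstance(u, VBase):
        return VBase([min(a, b) for a, b in zip(u.entries, v.entries)])
    return VDSum(vadd(u.left, v.left), vadd(u.right, v.right))

def matvec(M: TSMat, v: TSVec) -> TSVec:
    """Symbolic matrix-vector product M v, recursing on the shape of M."""
    if isinstance(M, MBase):          # explicit min-plus product
        return VBase([min(r + x for r, x in zip(row, v.entries)) for row in M.rows])
    if isinstance(M, MDSum):          # direct sum acts blockwise on v = v1, v2
        return VDSum(matvec(M.left, v.left), matvec(M.right, v.right))
    if isinstance(M, MKron):          # mixed-product property
        return VKron(matvec(M.left, v.left), matvec(M.right, v.right))
    if isinstance(M, MComp):          # block composition acting on v = v1, v2
        return VDSum(vadd(matvec(M.m1, v.left), matvec(M.m2, v.right)),
                     vadd(matvec(M.m3, v.left), matvec(M.m4, v.right)))
    raise TypeError(M)

# A Kronecker-structured matrix applied to a Kronecker-structured vector.
M = MKron(MBase([[0, 1], [INF, 0]]), MBase([[2]]))
v = VKron(VBase([0, 3]), VBase([1]))
print(matvec(M, v))   # VKron(left=VBase(entries=[0, 3]), right=VBase(entries=[3]))
          </preformat>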
        </sec>
        <sec id="sec-5-3-3">
          <title>5.3.3. Rank-One Matrices</title>
          <p>As a special case, we consider matrices which are formed by the multiplication of two ts-vectors u and v.
More precisely, let u be a ts-vector with index set U and v be a ts-vector with index set V. Then we can
efficiently construct a new ts-matrix M with index set U × V as M = u · v^⊤ such that rmat(M) = rvec(u) · rvec(v)^⊤.
Since the base case is trivial, we only need to consider the cases for direct sums and Kronecker products.
Here, we can make use of the following identities:
(u_1 ⊕ u_2) · (v_1 ⊕ v_2)^⊤ = ( u_1 · v_1^⊤   u_1 · v_2^⊤ ; u_2 · v_1^⊤   u_2 · v_2^⊤ ),   and
(u_1 ⊗ u_2) · (v_1 ⊗ v_2)^⊤ = (u_1 · v_1^⊤) ⊗ (u_2 · v_2^⊤).</p>
          <p>In this way, we can recursively reduce to matrix operations and finally to base case matrices.</p>
        </sec>
        <sec id="sec-5-3-4">
          <title>5.3.4. Matrix multiplication</title>
          <p>Finally, we consider the matrix multiplication of ts-matrices M and N. To this end, let M be a U × V
ts-matrix and N be a V × W ts-matrix. Then we show how we can efficiently compute a representation of
the product M N as a ts-matrix with index set U × W. The required computation time is O(‖M‖ · ‖N‖ · ‖W‖),
which is polynomial in the representation sizes of M and N. We skip the base case (which is obvious),
and focus on the remaining cases where we distinguish with respect to the structure of M:
Direct sum. If M = M_1 ⊕ M_2, then V = V_1 ⊎ V_2. One possible form of N is N = N_1 ⊕ N_2. In this case,
we can reduce the computation recursively to the case of M_1 N_1 and M_2 N_2 by using the identity:
(M_1 ⊕ M_2)(N_1 ⊕ N_2) = M_1 N_1 ⊕ M_2 N_2.</p>
          <p>The other possibility is that N is a matrix composition, i.e., N = ( N_1  N_2 ; N_3  N_4 ). Then, we can use
(M_1 ⊕ M_2) ( N_1  N_2 ; N_3  N_4 ) = ( M_1 N_1   M_1 N_2 ; M_2 N_3   M_2 N_4 )
to reduce the computation to the simpler cases M_1 N_1, M_1 N_2, M_2 N_3, and M_2 N_4.</p>
          <p>Kronecker product. If M = M_1 ⊗ M_2, then N = N_1 ⊗ N_2 is a Kronecker product as well. In this
case, we can use the mixed-product property to reduce the computation to the simpler cases M_1 N_1 and
M_2 N_2 via M N = M_1 N_1 ⊗ M_2 N_2 and recursively obtain a representation of the product M N.
Matrix composition. As a final case, let M = ( M_1  M_2 ; M_3  M_4 ). In this case, N can be a direct sum (but we
covered the symmetric case already above) or a matrix composition N = ( N_1  N_2 ; N_3  N_4 ) as well. In this
case, we observe that:
( M_1  M_2 ; M_3  M_4 ) ( N_1  N_2 ; N_3  N_4 ) = ( M_1 N_1 + M_2 N_3   M_1 N_2 + M_2 N_4 ; M_3 N_1 + M_4 N_3   M_3 N_2 + M_4 N_4 ).</p>
          <p>At this point, we crucially make use of our assumption that the matrix composition operator is only
applied to submatrices that do not contain the Kronecker product operator. First, we recursively compute
the relevant products M_i N_j and then, since the resulting matrices do not contain the Kronecker product
operator, we can compute a representation for the sums M_1 N_1 + M_2 N_3, M_1 N_2 + M_2 N_4, M_3 N_1 + M_4 N_3,
and M_3 N_2 + M_4 N_4. If U = U_1 ⊎ U_2, V = V_1 ⊎ V_2 and W = W_1 ⊎ W_2, then the subcomputations of the
products M_i N_j can be done in time O(‖M_i‖ · ‖N_j‖ · ‖W_j‖) for i, j = 1, 2. The total computation time for
the product subcomputations is thus bounded by O((‖M_1‖ + ‖M_2‖) · (‖N_1‖ + ‖N_2‖) · (‖W_1‖ + ‖W_2‖)) =
O(‖M‖ · ‖N‖ · ‖W‖). Also, we saw above that we can compute the sums of the matrices in linear time.
This yields a total bound of O(‖M‖ · ‖N‖ · ‖W‖) as claimed.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Applications of the Linear-Algebraic Formulation</title>
      <p>The representation of the alignment problem for process trees in terms of matrix/vector products allows
us to apply new algorithmic methods to compute alignments. The obvious ones include the application
of highly optimized matrix multiplication algorithms as implemented in several linear algebra libraries.
We could even adapt the computation to different hardware settings such as GPUs and, of course,
parallelize the computation of the product v_in^⊤ · M_{σ_1} · · · M_{σ_n} · v_fin for long traces σ. Of course, we
might also try to use linear-algebraic structure theory to simplify the product computation in the
first place. For example, since the matrices are structurally quite similar, it might be possible to find
common transformations that yield (partial) representations with respect to eigenspaces. This could
help to speed up the computation in some cases at least. Another advantage of our approach is that the
precompilation of the ts-matrices M_a for all a ∈ Σ is only required once, i.e., we can save time with
preprocessing if we have a single, fixed process tree and want to align many different traces against it.</p>
      <p>Besides such algorithmic considerations, well-established linear-algebraic concepts might yield novel
insights for conformance checking applications. Let us briefly discuss one simple example. Assume we
have constructed the matrix family (M_a)_{a∈Σ} and the vectors v_in, v_fin for a process tree T. Then we can
define that two events a, b ∈ Σ are independent in all trace contexts if the product M_a M_b is equal to M_b M_a.
This actually means that, with respect to alignments of the given process tree, we can swap the order
of a and b without changing the alignment costs, no matter what the context is in which the two
events occur. In this way we can relate the notion of commuting matrices from linear algebra with a
novel notion of causal independence. In fact, it would be interesting to study this independence notion
on real-life event logs to see what it can tell us about the underlying process.</p>
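      <p>Checking this independence notion only requires comparing two tropical matrix products. A minimal Python sketch (the matrices are made-up examples, not derived from a particular process tree):</p>
      <preformat>
import math
INF = math.inf

def mat_mul(A, B):
    """Tropical matrix product: entry (i, j) = min over k of A[i][k] + B[k][j]."""
    return [[min(A[i][k] + B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def independent(Ma, Mb):
    """a and b are independent in all trace contexts iff M_a and M_b commute."""
    return mat_mul(Ma, Mb) == mat_mul(Mb, Ma)

# Made-up 2x2 examples: a matrix of the form 1 * identity commutes with everything,
# while the two leaf-style matrices Ma and Mb below do not commute.
Ma = [[1, 0], [INF, 1]]
Mb = [[1, INF], [0, 1]]
Mc = [[1, INF], [INF, 1]]
print(independent(Ma, Mc))   # True
print(independent(Ma, Mb))   # False
      </preformat>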
      <p>Unfortunately, for the algorithmic approaches, we still face the obstacle discussed in Section 5:
the matrices M_a can become exponentially large (in the size of the underlying process tree) due to the
Kronecker product operator. Given the NP-hardness of the alignment problem for process trees this
is not surprising, and the key question is: how far can we push the linear-algebraic technique from
an algorithmic perspective without running into the exponential blow-up of the Kronecker product
operator? Can we even find new classes of process trees which allow polynomial-time alignment
computations via the linear-algebraic formulation? The symbolic representations that we discussed in
Section 5 constitute a first step towards this very question.</p>
      <p>
        A natural class to consider are process trees with unique labels: for those we recently proved
that alignments can be computed in polynomial time via a dynamic programming algorithm [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Interestingly enough, we already saw in Section 4 that by a purely linear-algebraic argument we could
strongly simplify the matrix expressions for the shuffle operator in the case of unique labels. This is a clear
indication that, indeed, the linear-algebraic approach mirrors the key argument of the polynomial-time
algorithm for process trees with unique labels (cf. Equation (7)), which effectively eliminates the choice for
shuffle operators.
      </p>
      <p>Unfortunately, the symbolic approach as given in Section 5 does not fully cover process trees with
unique labels yet. Nevertheless, in what follows we explain that it does capture a significant and interesting
subclass. We are going to show that we can efficiently construct the matrices (M_a)_{a∈Σ} and vectors
v_in, v_fin and also efficiently compute the product v_in^⊤ · M_{σ_1} · · · M_{σ_n} · v_fin for a given trace σ = σ_1 · · · σ_n
for all process trees T with unique labels in which the shuffle operator does not occur within the scope of
a sequence or loop operator. It is allowed, however, that the shuffle operator occurs within the scope of a
choice operator and, of course, within the scope of other shuffle operators.</p>
      <p>To show this, it suffices to argue that the matrices (M_a)_{a∈Σ} and vectors v_in, v_fin can efficiently be
constructed as ts-matrices and ts-vectors as introduced in Section 5, since for those we proved that
products can be computed in polynomial time. To be more precise, for our construction we guarantee
that the representation size of the resulting matrices M_a for a process tree T is bounded by O(‖T‖^2)
(and the same holds for the vectors v_in and v_fin whose representation size is O(‖T‖)). Let us go through
the different process tree operators and check that the matrices and vectors can be constructed as
required. Since the base cases (labeled leaf and silent leaf) yield explicit, constant-size matrices, we can
trivially encode those as base case ts-matrices and ts-vectors. Let us check the other cases:
Exclusive choice. The matrices and vectors for the exclusive choice operator given in Equation (3)
can obviously be constructed as ts-matrices and ts-vectors using the direct sum operator available for
ts-matrices and ts-vectors.</p>
      <p>Sequence. For the sequence operator, it is not as obvious that we can construct the required matrices
(Equation (4)) and vectors (Equation (5)) as ts-matrices and ts-vectors. The reason is that matrix products,
matrix-vector products, inner products, and scalar products occur in Equation (4) and Equation (5).
Hence, for the construction to go through, we need to show that these operations can be computed
eficiently. The results can then easily be assembled by using the matrix composition operator for
matrices as in Equation (4) and the direct sum operator for the vectors as in Equation (5).</p>
      <p>To show efficient computability, we need to use our assumption that, within the scope of the sequence
operator, no shuffle operator occurs. With this assumption, we know that none of the involved
submatrices and vectors contains the Kronecker product operator. Luckily, we saw in Section 5 that for
such matrices and vectors all required operations are indeed computable in polynomial time on the
symbolic level.</p>
      <p>Loop. The argument for the loop operator is similar to the case of the sequence operator. Again, for
the matrices and vectors in Equation (9) and Equation (10), it is not at all obvious that we can construct
them as ts-matrices and ts-vectors as these involve matrix products and matrix addition. However, with
the assumption that no shuffle operator occurs within the scope of a loop operator, the argument goes
through as in the case of the sequence operator.</p>
      <p>Parallel operator. For the parallel operator, we derived a simplified expression valid for process trees
with unique labels in Equation (7). This expression is already in the form of a Kronecker product and
can thus be constructed as a ts-matrix using the Kronecker product operator. The same holds for the
associated vectors as given in Equation (6).</p>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion</title>
      <p>We have shown that the alignment problem for process trees can be formulated in terms of matrix and
vector products. This sheds new light on the problem, paves the way for new algorithmic techniques,
and enables us to study alignments through the lens of linear algebra. There are several paths to follow
in the future. One open question is if our symbolic approach can be extended to cover all process
trees with unique labels. This would potentially lead to an alternative polynomial-time algorithm for
the alignment problem for process trees with unique labels, a class highly relevant in practice. This
algorithm might be even more efficient if we apply optimized linear algebra libraries to the matrix
products involved or make use of parallelization techniques. Another angle would be to look into other
semirings to verify, for example, that also more general cost functions are covered by our approach. It
is also conceivable to define a semiring which does not only yield optimal alignment costs, but also the
set of optimal alignments. Finally, another question is whether we can generalize the linear-algebraic
approach to other classes of process models such as sound, free-choice workflow nets.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors partly used Copilot for grammar and spell checking.
The authors reviewed and edited the content as needed and take full responsibility for the publication’s
content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Adriansyah</surname>
          </string-name>
          . “
          <article-title>Aligning observed and modeled behavior</article-title>
          .”
          <source>PhD thesis</source>
          . Technische Universiteit Eindhoven,
          <year>2014</year>
          . doi:
          <volume>10</volume>
          .6100/IR770080.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carmona</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. F. van Dongen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Solti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          . Conformance Checking.
          <source>Relating Processes and Models</source>
          . Cham: Springer International Publishing,
          <year>2018</year>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>319</fpage>
          -99414-7.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carmona</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. F. van Dongen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          . “Conformance Checking: Foundations, Milestones and Challenges.” In: Process Mining Handbook. Vol.
          <volume>448</volume>
          . LNBIP. Cham: Springer International Publishing,
          <year>2022</year>
          . Chap.
          <volume>5</volume>
          , pp.
          <fpage>155</fpage>
          -
          <lpage>190</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>031</fpage>
          -08848-
          <issue>3</issue>
          _
          <fpage>5</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Droste</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Gastin</surname>
          </string-name>
          . “
          <article-title>Weighted automata and weighted logics</article-title>
          .”
          <source>In: Theor. Comput. Sci. 380</source>
          .1-
          <fpage>2</fpage>
          (
          <year>2007</year>
          ), pp.
          <fpage>69</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Droste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Kuich</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Vogler</surname>
          </string-name>
          .
          <source>Handbook of Weighted Automata. Monographs in Theoretical Computer Science. An EATCS Series</source>
          . Berlin, Heidelberg: Springer Berlin Heidelberg,
          <year>2009</year>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>642</fpage>
          -01492-5.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Droste</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kuske</surname>
          </string-name>
          . “Weighted automata.”
          <source>In: Handbook of Automata Theory (I.) European Mathematical Society Publishing House</source>
          , Zürich, Switzerland,
          <year>2021</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          .
          <source>Robust Process Mining with Guarantees: Process Discovery, Conformance Checking and Enhancement</source>
          . Vol.
          <volume>440</volume>
          . LNBIP. Cham: Springer International Publishing,
          <year>2022</year>
          . doi: 10.1007/978-3-030-96655-3.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          . “
          <article-title>Discovering Block-Structured Process Models from Event Logs Containing Infrequent Behaviour</article-title>
          .” In:
          <source>Business Process Management Workshops. BPM Workshops 2013</source>
          . Vol.
          <volume>171</volume>
          . LNBIP. Cham: Springer International Publishing,
          <year>2014</year>
          , pp.
          <fpage>66</fpage>
          -
          <lpage>78</lpage>
          . doi: 10.1007/978-3-319-06257-0_6.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S. J. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fahland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          . “
          <article-title>Discovering Block-Structured Process Models from Incomplete Event Logs</article-title>
          .” In:
          <source>Application and Theory of Petri Nets and Concurrency. PETRI NETS 2014</source>
          . Vol.
          <volume>8489</volume>
          . LNCS. Cham: Springer International Publishing,
          <year>2014</year>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>110</lpage>
          . doi: 10.1007/978-3-319-07734-5_6.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <source>Syst. 61.2</source>
          (
          <year>2017</year>
          ), pp.
          <fpage>322</fpage>
          -
          <lpage>351</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Osterholzer</surname>
          </string-name>
          . “
          <article-title>Weighted Automata and Logics for Natural Language Processing</article-title>
          .” In:
          <source>Joint Workshop of the German Research Training Groups in Computer Science</source>
          . Pro Business GmbH,
          <year>2014</year>
          , p.
          <fpage>150</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Schützenberger</surname>
          </string-name>
          . “
          <article-title>On the Definition of a Family of Automata</article-title>
          .” In:
          <source>Inf. Control.</source>
          <volume>4</volume>
          .2-3
          (
          <year>1961</year>
          ), pp.
          <fpage>245</fpage>
          -
          <lpage>270</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C. T.</given-names>
            <surname>Schwanen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pakusa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          . “
          <article-title>Complexity of Alignments on Sound Free-Choice Workflow Nets</article-title>
          .” In:
          <source>Application and Theory of Petri Nets and Concurrency. PETRI NETS 2025</source>
          .
          <year>2025</year>
          . In press.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C. T.</given-names>
            <surname>Schwanen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pakusa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          . “
          <article-title>A Dynamic Programming Approach for Alignments on Process Trees</article-title>
          .” In:
          <source>Process Mining Workshops. ICPM 2024</source>
          . Vol.
          <volume>533</volume>
          . LNBIP. Cham: Springer Nature Switzerland,
          <year>2025</year>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>97</lpage>
          . doi: 10.1007/978-3-031-82225-4_7.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C. T.</given-names>
            <surname>Schwanen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pakusa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. M. P.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          . “
          <article-title>Process Tree Alignments</article-title>
          .” In:
          <source>Enterprise Design, Operations, and Computing. EDOC 2024</source>
          . Vol.
          <volume>15409</volume>
          . LNCS. Cham: Springer International Publishing,
          <year>2025</year>
          . doi: 10.1007/978-3-031-78338-8_16.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ding</surname>
          </string-name>
          . “
          <article-title>On the Kronecker Products and Their Applications</article-title>
          .” In:
          <source>Journal of Applied Mathematics</source>
          <volume>2013</volume>
          (
          <year>2013</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . doi: 10.1155/2013/296185.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>