Incremental NFA Minimization Christian Bianchini1 , Alberto Policriti1,* , Brian Riccardi1 and Riccardo Romanello1 1 University of Udine, Italy Abstract Finite state automata are fundamental objects in Theoretical Computer Science and find their application in Text Processing, Compilers Design, Artificial Intelligence and many other areas. The problem of minimizing such objects can be traced back to the β€˜50s and since then it has been the arena for developing new algorithmic ideas. There are two main paradigms to tackle the problem: top down β€” which builds a descending chain of equivalences by subsequent refinements β€” and bottom up β€” which builds an ascending chain of equivalences by aggregation of classes. The former approach leads to a fast π’ͺ(𝑛 log 𝑛) algorithm, whereas the latter is currently quadratic for any practical application. Nevertheless, the bottom up algorithm enjoys the property of being incremental, i.e. the minimization process can be stopped at any time obtaining a language-equivalent partially minimized automaton. In this work we correct a small mistake in the algorithm given by Almeida et al. in 2014 and we propose a simple, DFS-like and truly quadratic incremental algorithm for minimizing deterministic automata. Furthermore, we adapt the idea to the nondeterministic case obtaining an incremental algorithm which computes the maximum bisimulation relation in time π’ͺ(𝑛2 π‘Ÿ|Ξ£|), where 𝑛 is the number of states, Ξ£ is the alphabet and π‘Ÿ is the degree of nondeterminism, improving by a factor of π‘Ÿ the running time of the fastest known aggregation based algorithm for this problem. Keywords Automata, Bisimulation, Minimization, Incremental 1. Introduction Finite state automata are fundamental objects in Theoretical Computer Science and find their application in Text Processing, Compilers Design, Artificial Intelligence and many other areas. The minimization of an automaton is the process of constructing a new (language-equivalent) automaton which is minimal in the number of states. This problem can be traced back to the β€˜50s by the work of Moore [1]. A fundamental result in Automata Theory is the Myhill-Nerode Theorem [2], establishing that in the deterministic case this minimal automaton is in fact the minimum, up to isomorphism. In the wider setting of nondeterministic automata there is no analog result and finding any state-minimal automaton is PSPACE-complete [3]. For this reason, a practical alternative is the minimization with respect to bisimulation. Bisimilarity is indeed Proceedings of the 23rd Italian Conference on Theoretical Computer Science, Rome, Italy, September 7-9, 2022 * Corresponding author. " bianchini.christian@spes.uniud.it (C. Bianchini); alberto.policriti@uniud.it (A. Policriti); riccardi.brian@spes.uniud.it (B. Riccardi); riccardo.romanello@uniud.it (R. Romanello) ~ http://users.dimi.uniud.it/~alberto.policriti/home (A. Policriti); https://riccardoromanello.github.io (R. Romanello)  0000-0002-2855-1221 (R. Romanello) Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 1 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 a valid choice since in the deterministic case two states are bisimilar if and only if they are Myhill-Nerode equivalent. Thus, the problem of minimizing automata reduces to the problem of computing bisimilarity between states, which in turn is equivalent to determining the coarsest partition of a set stable with respect to some binary relation. The two main paradigms to compute the aforementioned partition are top down and bottom up. Top down algorithms start with the partition that separates states between final and non final and subsequently refine the partition until it is stable. By a careful choice of which block of the partition to split at each refining step, Paige and Tarjan [4] devised an algorithm that computes the maximum bisimulation equivalence in time π’ͺ(π‘š log 𝑛), where 𝑛 is the number of states and π‘š is the number of transitions. The iconic Hopcroft’s Algorithm [5] (which Paige and Tarjan’s solution is based on) deals with the special case of deterministic automata. Furthermore, it has recently been proved that the bisimilarity computation requires Ω(π‘š log 𝑛) time assuming top down algorithms [6]. On the contrary, bottom up solutions start with the finest partition β€” the one where each state constitutes a singleton β€” and proceed by subsequently merging two blocks found to be equivalent. For this reason, the technique is also known in the literature as partition-aggregation. The main advantage of this paradigm is that the algorithm is incremental, i.e. it proceeds in subsequent stages where at the end of each merging step the resulting automaton is language- equivalent to the input one. In this way, the minimization process can be stopped at any time and can be resumed later. The first algorithm of this kind is due to Watson [7]. After a series of improvements Watson and Daciuk [8] reduced the running time to π’ͺ(𝑛2 |Ξ£|𝛼(𝑛)) for deterministic automata with 𝑛 states, alphabet Ξ£ and where 𝛼(𝑛) is related to the inverse of Ackermann’s function [9] which can be treated as a constant for any practical value of 𝑛1 . The main idea is to propagate the definition of bisimilarity: if states 𝑝 and π‘ž are equivalent, then also their transitions by the same character must lead to equivalent states. This is done by a recursive function Equiv which resembles an equivalence algorithm by Hopcroft and Karp [10]. A subsequent work by Almeida et al. [11] aimed at simplifying the algorithm by Watson and Daciuk maintaining its running time. Unfortunately, there is a small mistake in their version of Equiv which leads to a Ω(𝑛3 ) algorithm in the worst case. Above algorithms are focused on the minimization of deterministic automata. The nonde- terministic case was tackled by BjΓΆrklund and Cleophas [12] adapting ideas from Watson and Daciuk. They devised an incremental algorithm for computing bisimilarity in time π’ͺ(𝑛2 π‘Ÿ2 |Ξ£|), where π‘Ÿ is the degree of nondeterminism. In this work we correct the algorithm by Almeida et al. providing a simplified version of the one by Watson and Daciuk and maintaining the quadratic running time. The solution is based on the concept of associated graph whose purpose is twofold: to distill the behaviour of the aforementioned algorithms by interpreting them as graph colorings and to design our own incremental procedure. Having established the clear connection between minimization and graph coloring it is natural to generalize the algorithm for solving the bisimilarity problem on nondeterministic automata. Furthermore, the proposed solution improves by a factor of π‘Ÿ the running time of BjΓΆrklund and Cleophas [13]. 16 1 It holds 𝛼(𝑛) ≀ 5 for 𝑛 ≀ 22 . 2 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 The paper is organized as follows: in the next section we give the basics about partitions, relations, and automata. In Section 3 we briefly describe the algorithm by Almeida et al. and we point out the mistake. In Section 4 we introduce our procedure on the deterministic case. Finally, in Section 5 we lift the algorithm to the nondeterministic case. Some conclusions are drawn in Section 6. 2. Preliminaries 2.1. Relations and Partitions A binary relation from set 𝐴 to set 𝐡 is a subset 𝜌 βŠ† 𝐴 Γ— 𝐡. Its size will be denoted by |𝜌|. We say that π‘Ž ∈ 𝐴 is in relation with 𝑏 ∈ 𝐡 whenever (π‘Ž, 𝑏) ∈ 𝜌, and we denote this by writing π‘ŽπœŒπ‘. If we consider binary relations over 𝐴 (i.e. subsets of 𝐴 Γ— 𝐴), it remains defined the identity relation πœ„π΄ = {(π‘Ž, π‘Ž) : π‘Ž ∈ 𝐴}. An equivalence relation (or equivalence) is a relation which is reflexive, symmetric and transitive. Given π‘Ž ∈ 𝐴, the equivalence class of π‘Ž is the set [π‘Ž] = {𝑏 : π‘ŽπœŒπ‘}. The quotient set 𝐴/𝜌 = {[π‘Ž] : π‘Ž ∈ 𝐴} forms a partition of 𝐴. 2.2. Languages and Automata An alphabet is a non-empty set Ξ£ of symbols. A string is a finite sequence 𝑀 = 𝑀1 . . . 𝑀𝑛 of symbols. Ξ£* is the set of all finite length strings of symbols in Ξ£ and we call a subset 𝐿 βŠ† Ξ£* a language. βŸ¨οΈ€ βŸ©οΈ€ A nondeterministic finite state automaton (NFA) is 𝒩 = 𝑄, Ξ£, π‘ž0 , 𝛿, 𝐹 where 𝑄 is a non- empty finite set of states, Ξ£ is the alphabet, π‘ž0 is the initial state, 𝛿 : 𝑄 Γ— Ξ£ β†’ 2𝑄 is the transition function and 𝐹 βŠ† 𝑄 is the set of final states. The degree of nondeterminism is π‘Ÿ = maxπ‘₯∈Σ,π‘žβˆˆπ‘„ {|𝛿(π‘ž, π‘₯)|}. We say that 𝒩 is complete if |𝛿(π‘ž, π‘₯)| β‰₯ 1 for every state and symbol. In what follows we will assume complete automata: this is not a loss of generalities since it is always possible to complete an automaton by adding one state and suitable transitions. As usual, the transition function can be recursively extended to strings, i.e. 𝛿 * : 𝑄 Γ— Ξ£* β†’ 2𝑄 , still denoted by 𝛿. We say that state π‘ž ∈ 𝑄 accepts a string 𝑀 ∈ Ξ£* if 𝛿(π‘ž, 𝑀) ∩ 𝐹 ΜΈ= βˆ…. The set of strings accepted by π‘ž is denoted by 𝐿(π‘ž). The language accepted by the automaton is 𝐿(𝒩 ) = 𝐿(π‘ž0 ). A minimal automaton accepting 𝐿 has the minimum number of states amongst all automata accepting 𝐿. A deterministic finite state automaton (DFA) is a NFA π’Ÿ with the added condition that for each symbol π‘₯ and each state π‘ž, |𝛿(π‘ž, π‘₯)| = 1. For an automaton 𝒩 we define the equivalence relation ∼ βŠ† 𝑄 Γ— 𝑄 as: 𝑝 ∼ π‘ž ⇐⇒ 𝐿(𝑝) = 𝐿(π‘ž) If 𝑝 ̸∼ π‘ž we say they are distinguishable and if exactly one of the two is final we say they are trivially distinguishable. βŸ¨οΈ€ Given a NFA 𝒩 and βŸ©οΈ€ an equivalence 𝜌 over its states, its quotient is defined as 𝒩 /𝜌 = 𝑄/𝜌, Ξ£, [π‘ž0 ], π›ΏπœŒ , 𝐹/𝜌 where π›ΏπœŒ ([π‘ž], π‘₯) = {[π‘ž β€² ] | π‘ž β€² ∈ 𝛿(π‘ž, π‘₯)}. The Myhill-Nerode Theorem 3 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 [2] estabilishes that the quotient automaton π’Ÿ/ ∼ is well defined and is the unique (up to isomorphism) minimal automaton recognizing 𝐿(π’Ÿ). Definition 1. Given a NFA, a bisimulation is a binary relation over its states such that, for every pair (𝑝, π‘ž) ∈ 𝐡: B1. 𝑝 ∈ 𝐹 ⇐⇒ π‘ž ∈ 𝐹 , B2. βˆ€π‘₯ ∈ Ξ£ βˆ€π‘β€² ∈ 𝛿(𝑝, π‘₯) βˆƒπ‘ž β€² ∈ 𝛿(π‘ž, π‘₯) ∧ (𝑝′ , π‘ž β€² ) ∈ 𝐡, B3. βˆ€π‘₯ ∈ Ξ£ βˆ€π‘ž β€² ∈ 𝛿(π‘ž, π‘₯) βˆƒπ‘β€² ∈ 𝛿(𝑝, π‘₯) ∧ (𝑝′ , π‘ž β€² ) ∈ 𝐡. Two states 𝑝 and π‘ž are said bisimilar if there exists a bisimulation which contains (𝑝, π‘ž). The set of all bisimulations over the states of 𝒩 is denoted by B𝒩 . The union of two bisimulation is also a bisimulation. In fact, the following generalization of this observation is a well-known result. Lemma 1. B𝒩 is closed under union and has a unique largest bisimulation ℬ, which will be called bisimilarity, it is an equivalence relation, and it relates all and only bisimilar states. ℬ βŠ† ∼ and if 𝒩 is deterministic, ℬ coincides with ∼. The above properties β€” combined with efficient algorithms to compute it β€” justify the use of ℬ as an approximation of ∼ for nondeterministic automata. 2.3. Partition Aggregation Given an automaton 𝒩 , our goal is to compute the bisimilarity relation ℬ over its set of states, so that the resulting quotient 𝒩 /ℬ can be returned as the minimized version of 𝒩 . The partition- aggregation strategy will compute an ascending chain πœ„ βŠ† 𝐡1 βŠ† . . . βŠ† 𝐡𝑛 = ℬ of bisimulation-equivalences. A partition-aggregation algorithm proceeds by a sequence of merging steps where at each step 𝑖 the bisimulation 𝐡𝑖 is computed. Since each 𝐡𝑖 is a bisimulation, the minimization process can be stopped at any step obtaining a language-equivalent automaton with no more states than the input one; the minimization process can be resumed later from this intermediate automaton. In this sense, the algorithm is incremental. This property is not shared with top down algorithms that proceed by partition-refinement β€” such as Hopcroft’s Algorithm and its many successors β€” where only the final result is a bisimulation. 3. The Algorithm Proposed by Almeida et al. This section is devoted to a brief description of the algorithm proposed by Almeida et al. It uses the union-find [14, 9] data structure to manage the partition of states, so that finding and merging classes with the Find and Union primitives can be done in π’ͺ(𝛼(𝑛)). Pairs of states are recursively considered until their equivalence is estabilished. Intermediate results are cached, so that queries on pairs of states already found to be (non-)equivalent can be answered in constant time. 4 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 Algorithm 1 Aggregation-based minimization by Almeida et al. 1: function MinimizeAlmeida(𝑄, 𝛿, 𝐹 ) 21: function Equiv(𝑝, π‘ž) 2: for all π‘ž ∈ 𝑄 do 22: if (𝑝, π‘ž) ∈ 𝐸 then 3: Make(π‘ž) 23: return βŠ₯ 4: 𝐸 ← (𝐹 Γ— 𝐹 c ) βˆͺ (𝐹 c Γ— 𝐹 ) 24: if (𝑝, π‘ž) ∈ 𝐻 then 5: 25: return ⊀ 6: for all (𝑝, π‘ž) ∈ 𝑄 Γ— 𝑄 do 26: 7: 𝑓 𝑝 ← Find(𝑝) 27: 𝐻 ← 𝐻 βˆͺ {(𝑝, π‘ž), (π‘ž, 𝑝)} 8: 𝑓 π‘ž ← Find(π‘ž) 28: for all π‘₯ ∈ Ξ£ do 9: if 𝑓 𝑝 ΜΈ= 𝑓 π‘ž ∧ (𝑝, π‘ž) ̸∈ 𝐸 then 29: (𝑝′ , π‘ž β€² ) ← (Find(𝛿(𝑝, π‘₯)),Find(𝛿(π‘ž, π‘₯))) 10: πΈβ†βˆ… 30: if 𝑝′ ΜΈ= π‘ž β€² ∧ (𝑝′ , π‘ž β€² ) ̸∈ 𝐸 then 11: π»β†βˆ… 31: 𝐸 ← 𝐸 βˆͺ {(𝑝′ , π‘ž β€² ), (π‘ž β€² , 𝑝′ )} 12: if Equiv(𝑝, π‘ž) then 32: if Β¬Equiv(𝑝′ , π‘ž β€² ) then 13: for all (𝑝′ , π‘ž β€² ) ∈ 𝐸 do 33: return βŠ₯ 14: Union(𝑝′ , π‘ž β€² ) 34: 15: else 35: 𝐻 ← 𝐻 βˆ– {(𝑝, π‘ž), (π‘ž, 𝑝)} 16: 𝐸 ←𝐸βˆͺ𝐻 36: 𝐸 ← 𝐸 βˆͺ {(𝑝, π‘ž), (π‘ž, 𝑝)} 17: 37: return ⊀ 18: 𝒫 ← {ClassOf(𝑝) : 𝑝 ∈ 𝑄} 38: end function 19: return 𝒫 20: end function At lines 2–3, the identity relation is constructed and pairs of trivially non-equivalent states are added to the memoization table 𝐸. In the main loop at lines 6–16, we iterate over all pairs of states to check for equivalence. If a pair is either on the same class – i.e. is a pair of states already found to be equivalent – or is in the memoization table – i.e. is a pair of distinguishable states – the minimization continues to the next iteration. Otherwise, two empty collections 𝐸 and 𝐻 are prepared, respectively the set of wondering pairs of states and the history pairs. 𝐸 and 𝐻 are considered global variables and can be accessed from Equiv. The recursive function Equiv is responsible for checking if states 𝑝, π‘ž are equivalent and, if so, pairs in 𝐸 are merged. Otherwise, all visited pairs are set to be distinguishable and this information is used to update 𝐸. At the end the partition in equivalence classes is returned. The underlying idea of Equiv(𝑝, π‘ž) is to recursively check the transitions from 𝑝 and π‘ž on all symbols. If two states are found to be cached as distinguishable, the recursion stops returning βŠ₯. If they are found to be in visit, it is useless to continue the visit and nothing can be said (i.e. Equiv returns ⊀ postponing the decision to the upper-level of the recursion). These preliminary checks are at lines 22–25. Next, each π‘₯-transition is checked recursively in the loop at 28–33, stopping when the states are found to be distinguishable. At the end, if 𝑝 and π‘ž are not found to be distinguishable, the pair is removed from the history 𝐻 and added to the β€œwondering” pairs 𝐸 and ⊀ is returned. Detailed proof of the algorithm’s correctness can be found in [11]. On the complexity analysis, the authors claim that the algorithm terminates in time π’ͺ(𝑛2 |Ξ£|𝛼(𝑛)). This comes from the assumption that each pair visited during the recursion of Equiv will be skipped on the subsequent iterations of MinimizeAlmeida (cf. [11, Lemma 4.9]). The assumption is wrong and a family of counterexamples can be constructed such that MinimizeAlmeida terminates in time Θ(𝑛3 |Ξ£|𝛼(𝑛)) (see Fig 1). 5 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 start 1 Figure 1: A small counterexample of [11, Lemma 4.9]: 0 1 pair (6, 7) is visited twice. Consider Equiv(2, 3). Pair (6, 7) is visited a first time by reading 0. 2 3 Symbol 1 leads to (4, 5) which stops the recur- 1 1 sion without inserting (6, 7) in 𝐸. At some 0 4 0/1 0 5 subsequent iteration, (6, 7) is visited again via Equiv(6, 7). By generalization of this automa- 0/1 6 1 7 ton (𝐴1 ) automata 𝐴𝑛 of 8𝑛 βˆ’ 1 states can be 1 provided such that 𝑛 pairs are visited (roughly) 0 0 𝑛2 times each. 4. Deterministic Case In this Section we present our idea to correct the previously presented algorithm, discussing its correctness and complexity. The main point is to run the recursive check on a richer data structure, the associated graph introduced below, whereby the running time of the overall algorithm is going to be π’ͺ(𝑛2 |Ξ£|𝛼(𝑛)). 4.1. The Associated Graph The reason why Algorithm 1 is not quadratic on some automata is the fact that whenever a pair of distinguishable states is found the recursion stops losing reusable information gathered on elements of 𝐸. A graph associated to the automaton clarifies how pairs of states evolve when they are found to either be equivalent or distinguishable. Definition 2. Given a DFA π’Ÿ its associated graph 𝒒 = (𝑉, 𝐴) is defined as: 𝑉 = 𝑄 Γ— 𝑄, 𝐴 = {βŸ¨π‘, π‘žβŸ© β†’ βŸ¨π›Ώ(𝑝, π‘₯), 𝛿(π‘ž, π‘₯)⟩ | 𝑝, π‘ž ∈ 𝑄, π‘₯ ∈ Ξ£} . βŸ¨π‘, π‘žβŸ© is distinguishable if 𝑝 and π‘ž are distinguishable and equivalent otherwise. Coloring 𝒒 with distinguishable vertices black and equivalent vertices white, the problem of computing ∼ can be seen as the problem of correctly coloring the associated graph. The algorithm by Almeida et al. can be described as follows: starts by coloring trivially dis- tinguishable and equivalent vertices in black and white, respectively, and in grey the remaining vertices. At each iteration of the main loop, it considers a grey vertex βŸ¨π‘, π‘žβŸ© and starts a visit of 𝒒 from it. If the visit reaches a black vertex βŸ¨π‘β€² , π‘ž β€² ⟩, the recursion stops and all vertices in the path from βŸ¨π‘, π‘žβŸ© to βŸ¨π‘β€² , π‘ž β€² ⟩ (saved in 𝐻) are colored in black. Otherwise, if all paths lead either to white or grey vertices all visited vertices are colored white. The main issue with Algorithm 1 is that when a black vertex is encountered all information of vertices in 𝐸 βˆ– 𝐻 gets lost. 4.2. The Algorithm for Deterministic Automata We now present the minimization algorithm based on the observations above. 6 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 Algorithm 2 Proposed algorithm for deterministic automata. 1: function MinimizeDfa(𝑄, Ξ£, 𝛿, 𝐹 ) 19: function Equiv(βŸ¨π‘, π‘žβŸ©) ◁ β„‹ and β„Ž global 2: for all βŸ¨π‘, π‘žβŸ© ∈ 𝑄 Γ— 𝑄 do 20: if βŸ¨π‘, π‘žβŸ© ∈ β„‹ then 3: if 𝑝, π‘ž are triv. distinguishable then 21: return ⊀ 4: Color(βŸ¨π‘, π‘žβŸ©) ← Black 22: else if Color(βŸ¨π‘, π‘žβŸ©) = Black then 5: else if 𝑝 = π‘ž then 23: β„Ž ← βŸ¨π‘, π‘žβŸ© 6: Color(βŸ¨π‘, π‘žβŸ©) ← White 24: return βŠ₯ 7: else 25: else if Color(βŸ¨π‘, π‘žβŸ©) = White then 8: Color(βŸ¨π‘, π‘žβŸ©) ← Grey 26: return ⊀ 9: for all βŸ¨π‘, π‘žβŸ© ∈ 𝑄 Γ— 𝑄 do 27: else ◁ here βŸ¨π‘, π‘žβŸ© is Grey and fresh 10: if Color(βŸ¨π‘, π‘žβŸ©) = Grey then 28: Color(βŸ¨π‘, π‘žβŸ©) ← White 11: β„‹ ← EmptyGraph 29: 12: π‘’π‘ž ← Equiv(βŸ¨π‘, π‘žβŸ©) 30: for all π‘₯ ∈ Ξ£ do ◁ in lex. order 13: if Β¬π‘’π‘ž then 31: βŸ¨π‘π‘₯ , π‘žπ‘₯ ⟩ ← βŸ¨π›Ώ(𝑝, π‘₯), 𝛿(π‘ž, π‘₯)⟩ 14: β„‹ ← Reverse(β„‹) 32: AddArc(β„‹, βŸ¨π‘, π‘žβŸ© , βŸ¨π‘π‘₯ , π‘žπ‘₯ ⟩) 15: Visit(β„‹, β„Ž) 33: π‘’π‘ž ← Equiv(βŸ¨π‘π‘₯ , π‘žπ‘₯ ⟩) 16: for all βŸ¨π‘β€² , π‘ž β€² ⟩ ∈ WhiteV(β„‹) do 34: if Β¬π‘’π‘ž then return βŠ₯ 17: Union(𝑝′ , π‘ž β€² ) 35: 36: return ⊀ 18: end function 37: end function The general structure of MinimizeDfa is the same as MinimizeAlmeida rewritten in terms of colorings. The only difference is that instead of maintaining two sets 𝐸 and 𝐻 we maintain the global variable β„‹ which represents the visited portion of 𝒒. The idea is that when Equiv(βŸ¨π‘, π‘žβŸ©) returns to the main loop, after line 15, vertices in β„‹ will be correctly colored, either in White or Black. In Algorithm 1 this information was lost while in Algorithm 2 β„‹ is used to determine extra black vertices. Helper procedures Reverse and Visit perform, respectively, arc-reverse of a graph and the Black-coloring of β„‹ starting from the source vertex β„Ž. Let us analyze the version of Equiv(βŸ¨π‘, π‘žβŸ©) in Algorithm 2. At lines 20–27 some base cases are checked. In particular, if βŸ¨π‘, π‘žβŸ© is Black, then it is stored in the global variable β„Ž and βŠ₯ is returned. Otherwise, if βŸ¨π‘, π‘žβŸ© is White we return ⊀ to continue the downstream inspection. Finally, in case βŸ¨π‘, π‘žβŸ© is Grey, it is colored White and at lines 30–34 the for loop tries to continue the recursive visit by reading each symbol in lexicographic order. Before each recursive call β„‹ is updated by adding arc βŸ¨π‘, π‘žβŸ© β†’ βŸ¨π‘π‘₯ , π‘žπ‘₯ ⟩ β€” we assume vertex βŸ¨π‘π‘₯ , π‘žπ‘₯ ⟩ is added, if not already present. Even though we do not give detailed proofs for space reasons, below we outline the main arguments for complexity and correctness. As far as complexity is concerned it is clear that, summing over all the iterations of the main loop, line 9, the associated graph is visited at most thrice: during the β€œforward” recursion pass and, optionally, during Reverse and Visit. In fact, every vertex starts Grey and gets White during the forward pass of an Equiv-call. Possibly, if the call returns βŠ₯, some White vertices become Black. Altogether, considering the cost of maintaining the union-find data-structure, we have the following: (notice that |𝒒| = 𝑛2 + 𝑛2 |Ξ£|) Theorem 1. MinimizeDfa algorithm terminates in time π’ͺ(𝑛2 |Ξ£|𝛼(𝑛)). To prove correctness we will show the following invariant at line 16: all vertices in β„‹ are correctly colored either Black or White β€” Lemma 2 below. 7 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 We start by showing some properties of the coloring performed by Equiv and Visit. First of all, notice that, since we are in the deterministic case, for every 𝑒 = βŸ¨π‘, π‘žβŸ© ∈ 𝑉 and 𝑀 ∈ Ξ£* there is a unique path in 𝒒 starting from 𝑒 and spelling 𝑀: denote by 𝛿(𝑒, 𝑀) = βŸ¨π›Ώ(𝑝, 𝑀), 𝛿(π‘ž, 𝑀)⟩ the last vertex of this path. Definition 3. Given 𝑒 ∈ 𝑉 and 𝑀 ∈ Ξ£* , we say that 𝑀 for 𝑒 is: 1. simple if 𝑒 ⇝ 𝛿(𝑒, 𝑀) in 𝒒 is simple, 2. avalanche2 if it is simple and vertex 𝛿(𝑒, 𝑀) is Black. If there exists 𝑀 avalanche for 𝑒, denote by av(𝑒) the lexicographically smallest such 𝑀 and name 𝑒 ⇝ 𝛿(𝑒, av(𝑒)) the avalanche path of 𝑒. If Equiv(𝑒0 ) is called in the main loop, line 12, the visit checks all simple words for 𝑒0 in lexicographic order, either until all words are explored or β€” if it exists β€” until av(𝑒0 ) is found. Any vertex that can reach the avalanche path should be colored in Black. Proposition 1. Consider β„‹ upon return of Equiv(𝑒0 ) at line 12. If there exists 𝑒 ∈ β„‹ distin- guishable, then av(𝑒0 ) exists. Furthermore, if all White vertices in 𝒒 βˆ– β„‹ are equivalent, then for every 𝑒 ∈ β„‹, if 𝑒 is distinguishable we have that 𝑒 ⇝ 𝛿(𝑒0 , av(𝑒0 )) exists in β„‹. Proof. Suppose 𝑒 ∈ β„‹ is distinguishable. First of all, we prove that there exists 𝑀 ∈ Ξ£* such that 𝛿(𝑒0 , 𝑀) is Black β€” recall av(𝑒0 ) is the lexicographically smallest such 𝑀. Since 𝑒 ∈ β„‹, there exists 𝑀0 such that 𝑒 = 𝛿(𝑒0 , 𝑀0 ). Since 𝑒 is distinguishable, there exists 𝑀1 such that 𝛿(𝑒, 𝑀1 ) is trivially distinguishable (i.e. Black from the start). Thus, 𝛿(𝑒0 , 𝑀0 𝑀1 ) is Black. Let β„Ž = 𝛿(𝑒0 , av(𝑒0 )), 𝑒 ∈ β„‹ distinguishable and πœ‹ be the 𝒒-path leading 𝑒 to some trivially distinguishable 𝑣. We claim πœ‹ must cross the avalanche path of 𝑒0 (call it 𝛼). Suppose not. Since β„Ž is the only Black vertex in β„‹, 𝑣 ∈ / β„‹. Hence, πœ‹ must traverse some arc 𝑒′ β†’ 𝑒′′ with 𝑒′ ∈ β„‹ and 𝑒 βˆˆβ€²β€² / β„‹. By assumption on πœ‹ and construction of Equiv it follows that 𝑒′ ∈ / 𝛼 and 𝑒′′ is White. Since πœ‹ leads 𝑒′′ to 𝑣 it follows that 𝑒′′ is distinguishable contradicting the hypothesis on White vertices in 𝒒 βˆ– β„‹. Lemma 2. The following hold at the end of each iteration of loop 9–17: D1. {βŸ¨π‘, π‘žβŸ© | Color(βŸ¨π‘, π‘žβŸ©) = Black} ∩ ∼ = βˆ…, D2. {βŸ¨π‘, π‘žβŸ© | Color(βŸ¨π‘, π‘žβŸ©) = White} βŠ† ∼. Proof. Before entering the the loop both properties hold by initialization. D1. Assume (D1) and (D2) hold before Equiv(𝑒0 ). It is sufficient to prove that at the end of the iteration for every βŸ¨π‘, π‘žβŸ© ∈ β„‹ we have that βŸ¨π‘, π‘žβŸ© is Black if and only if βŸ¨π‘, π‘žβŸ© is distinguishable. (β†’) If 𝑒 = βŸ¨π‘, π‘žβŸ© ∈ β„‹ is Black, then it must have been colored by Visit. Therefore, π‘’π‘ž = βŠ₯ and before Reverse there was 𝑒 ⇝ β„Ž in β„‹. Since β„Ž was Black before the Equiv-call, by (D1) it follows β„Ž distinguishable. Thus, βŸ¨π‘, π‘žβŸ© is distinguishable. (←) If 𝑒 = βŸ¨π‘, π‘žβŸ© ∈ β„‹ is distinguishable, then by Prop. 1 it follows that after Reverse and Visit pair βŸ¨π‘, π‘žβŸ© has been colored in Black. 2 The word β€œcatastrophically" leads to the Black vertex. 8 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 D2. It follows from (D1) and the fact that all vertices in β„‹ are either Black or White. Theorem 2. Algorithm 2 is correct and incremental. Proof. Both correctness and incrementality follow from Lemma 2 and the fact that β€” upon termination β€” all vertices of 𝒒 are either Black or White. In particular, (D2) of Lemma 2 proves that Union is always correct. 5. Nondeterministic Case Algorithm 2 is not directly applicable to the nondeterministic case, the reason being that reaching a pair of non-bisimilar states β€” i.e. sufficient condition to color in Black a node by Algorithm 2 β€” is not a sufficient condition now to declare a pair of states distinguishable. To tackle this issue we first turn the associated graph into a bipartite graph. In the definition below, for each state 𝑝 we introduce the shadow state 𝑝 as a distinct copy of the real 𝑝. Definition 4. Let 𝒩 be a complete NFA. The associated graph 𝒒(𝒩 ) is a bipartite directed graph with vertices 𝑉0 βˆͺ 𝑉1 and arcs 𝐴0 βˆͺ 𝐴1 , defined as: 𝑉0 = 𝑄 Γ— 𝑄, 𝑉1 = {βŸ¨π‘, π‘ž, π‘₯⟩ , βŸ¨π‘, π‘ž, π‘₯⟩ | 𝑝, π‘ž ∈ 𝑄, π‘₯ ∈ Ξ£} , 𝐴0 = βŸ¨π‘, π‘žβŸ© β†’ 𝑝′ , π‘ž, π‘₯ , βŸ¨π‘, π‘žβŸ© β†’ 𝑝, π‘ž β€² , π‘₯ | 𝑝′ ∈ 𝛿(𝑝, π‘₯), π‘ž β€² ∈ 𝛿(π‘ž, π‘₯) , {οΈ€ βŸ¨οΈ€ βŸ©οΈ€ βŸ¨οΈ€ βŸ©οΈ€ }οΈ€ 𝐴1 = 𝑝′ , π‘ž, π‘₯ β†’ 𝑝′ , π‘ž β€² , 𝑝, π‘ž β€² , π‘₯ β†’ 𝑝′ , π‘ž β€² | 𝑝′ ∈ 𝛿(𝑝, π‘₯), π‘ž β€² ∈ 𝛿(π‘ž, π‘₯) . {οΈ€βŸ¨οΈ€ βŸ©οΈ€ βŸ¨οΈ€ βŸ©οΈ€ βŸ¨οΈ€ βŸ©οΈ€ βŸ¨οΈ€ βŸ©οΈ€ }οΈ€ βŸ¨π‘, π‘žβŸ© in β€œleft" 𝑉0 will be called equivalent or distinguishable as in Def. 2. The bisimilarity between 𝑝 and π‘ž (Def. 1) can be checked in two steps: 0) choose π‘₯ ∈ Ξ£ and 𝑝′ ∈ 𝛿(𝑝, π‘₯), and 1) respond with suitable π‘ž β€² ∈ 𝛿(π‘ž, π‘₯). The idea is to mimic step 0) traversing arcs of 𝐴0 and step 1) traversing arcs of 𝐴1 . The triplet βŸ¨π‘β€² , π‘ž, π‘₯⟩ ∈ 𝑉1 indicates that we have chosen symbol π‘₯, state 𝑝′ ∈ 𝛿(𝑝, π‘₯), and we are expecting to respond with some π‘ž β€² ∈ 𝛿(π‘ž, π‘₯) (π‘ž provides the information on the state that must respond). 𝐴1 arcs do something similar. Then we need to tune up nondeterminism and Black coloring of vertices in 𝒒. Observe that 𝑒 ∈ 𝑉0 needs only one Black child to be colored in Black, while it needs all children White to be colored in White. Dually, 𝑒 ∈ 𝑉1 behaves the same but with reversed colors. A check will be performed using the variable Doubts(𝑒) which, roughly speaking, counts how many Black neighbours are needed to mark 𝑒 as Black. 5.1. The Algorithm for Nondeterministic Automata We present Algorithm 3 for the nondeterministic case, whose essential ingredients are those of Algorithm 2. Details are left to the reader. First of all, notice that we are actually dealing with four colors: βŠ₯ (never been explored), Grey (in visit), Black (distinguishable) and White (equivalent). 9 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 Algorithm 3 Proposed algorithm for nondeterministic automata, adapted from DFA case. 1: {βŠ₯, Black, White, Grey} ← {βˆ’1, 0, 1, 2} 29: function Equiv(𝑒, 𝑠) ◁ β„‹ is global 2: 30: if Color(𝑒) ΜΈ= βŠ₯ then 3: function MinimizeNfa(𝑄, Ξ£, 𝛿, 𝐹 ) 31: return Color(𝑒) 4: for all βŸ¨π‘, π‘žβŸ© ∈ 𝑄 Γ— 𝑄 do 32: 5: if 𝑝, π‘ž are triv. distinguishable then 33: Color(𝑒) ← Grey 6: Color(βŸ¨π‘, π‘žβŸ©) ← Black 34: Doubts(𝑒) ← 0 7: else if 𝑝 = π‘ž then 35: 8: Color(βŸ¨π‘, π‘žβŸ©) ← White 36: for 𝑣 ∈ Adj(𝒒, 𝑒) ∧ Color(𝑒) ΜΈ= 𝑠 do 9: else 37: π‘π‘œπ‘™ ← Equiv(𝑣, 1 βˆ’ 𝑠) 10: Color(βŸ¨π‘, π‘žβŸ©) ← βŠ₯ 38: if π‘π‘œπ‘™ = 𝑠 then 11: for all βŸ¨π‘, π‘žβŸ© ∈ 𝑄 Γ— 𝑄 do 39: Color(𝑒) ← π‘π‘œπ‘™ 12: β„‹ ← EmptyGraph 40: Doubts(𝑒) ← 0 13: Equiv(βŸ¨π‘, π‘žβŸ© , 0) 41: else if π‘π‘œπ‘™ = Grey then 14: for all 𝑒 ∈ 𝑉0 ∩ β„‹ do 42: AddArc(β„‹, 𝑣, 𝑒) 15: βŸ¨π‘β€² , π‘ž β€² ⟩ ← 𝑒 43: Doubts(𝑒) ← Doubts(𝑒) + 1 16: if Color(𝑒) ΜΈ= Black then 44: 17: ◁ 𝑒 is either Grey or White 45: if Color(𝑒) = Grey then 18: Color(𝑒) ← White 46: if Doubts(𝑒) = 0 then 19: Union(𝑝′ , π‘ž β€² ) 47: Color(𝑒) ← 1 βˆ’ 𝑠 20: end function 48: else if 𝑠 = 0 then 21: 49: Doubts(𝑒) ← 1 22: procedure Relax(𝑣) ◁ β„‹ is global 50: 23: for 𝑒 ∈ Adj(β„‹, 𝑣) do 51: if Color(𝑒) = Black then 24: Doubts(𝑒) ← Doubts(𝑒) βˆ’ 1 52: Relax(𝑒) 25: if Doubts(𝑒) = 0 then 53: return Color(𝑒) 26: Color(𝑒) ← Black 54: end function 27: Relax(𝑒) 28: end procedure The procedure MinimizeNfa is structurally the same as MinimizeDfa, the twist being the usage and maintenance of β„‹. Function Equiv takes two inputs: the current vertex 𝑒 and the β€œside" 𝑠 ∈ {0, 1} of the bipartition. If 𝑒 has already been encountered we return its color. Otherwise, it is colored in Grey with zero Doubts. At lines 36–43 each successor 𝑣 of 𝑒 is recursively visited. In case 𝑠 = 0 (𝑒 ∈ 𝑉0 ) if 𝑣 is recursively found Black, then 𝑒 can be safely marked Black. Otherwise, there is not enough information to safely assign a Black/White color to 𝑒. In particular, if 𝑣 is Grey we add arc 𝑣 β†’ 𝑒 to β„‹ (notice that it is reversed w.r.t. the transition) and we increment Doubts(𝑒) β€” Blackness of 𝑒 depends on the (possible) future Blackness of 𝑣. Case 𝑠 = 1 is dual. Upon leaving the loop at lines 45–49, if 𝑒 is still Grey we consider two cases: if there are no doubts (i.e. Doubts(𝑒) = 0) each of its successors has the same color (either Black or White) and this can be safely assigned to 𝑒; otherwise, in case 𝑒 ∈ 𝑉0 we set Doubts(𝑒) = 1. Proceeding at lines 51–53, if 𝑒 is made Black this information is propagated (Relaxed) to its neighbours in β„‹. Notice that in this case we explicitly define the procedure Relax: in Algorithm 2 the corresponding procedure Visit’s purpose, was to color in Black all vertices reachable from some distinguishable vertex. In Algorithm 3 we also need to consider the doubts of each vertex 𝑣 by coloring in Black only non-doubtful vertices. The complexity is again easy to be determined. Observe that in case the automaton has 𝑛 states and π‘š transitions, |𝒒| ≀ 7π‘›π‘š. Equiv performs the equivalent of a visit of 𝒒 and Relax visits every arc of β„‹ at most once. Hence, their cost over all the execution of Algorithm 3 is 10 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 bounded by the size of 𝒒. Considering the cost of maintaining the union-find data structures and recalling that π‘Ÿ is the degree of nondeterminism, we have the following: Theorem 3. MinimizeNfa terminates in time π’ͺ(π‘›π‘šπ›Ό(𝑛)) βŠ† π’ͺ(𝑛2 |Ξ£|π‘Ÿπ›Ό(𝑛)). Let us briefly discuss correctness. Proposition 2. Let 𝑝 ΜΈ= π‘ž and 𝑒 = βŸ¨π‘, π‘žβŸ© be White. Then, for every arc 𝑒 β†’ 𝑣 ∈ 𝐴0 , vertex 𝑣 is either Grey or White. Proof. Since 𝑝 ΜΈ= π‘ž, there are only two places at which 𝑒 = βŸ¨π‘, π‘žβŸ© changed from Grey to White: Line 18. In this case, upon termination of Equiv(𝑒, 0) the pair is Grey and Doubts(𝑒) = 1 (line 49). In the loop of its Equiv-call every neighbour was recursively found to be either Grey or White β€” or 𝑒 would have been colored in Black. Finally, it cannot be the case that some Grey neighbour 𝑣 became Black afterward: Relax(𝑣) would have colored 𝑒 in Black by setting Doubts(𝑒) = 0. Line 47. In this case, inside the loop of Equiv(𝑒, 0) none of 𝑒’s neighbours were found to be either Black or Grey. Thus, they must all be White. Therefore, every neighbour of 𝑒 is either Grey or White. Proposition 3. At line 19 (end of the main loop), for every 𝑣 ∈ 𝑉1 if 𝑣 is either Grey or White, then there exists 𝑣 β†’ 𝑒′ ∈ 𝐴1 such that 𝑒′ is White. Proof. If 𝑣 is White, then it must have been colored at line 39 after finding a White neighbour. If 𝑣 is Grey, upon termination of Equiv at line 13 some of its neighbours must have been found to be Grey. Since every Grey left node is colored in White before the end of the iteration, it follows again that 𝑣 has some White neighbour. Lemma 3. The following hold at line 19 (end of the main loop): N1. β„›π‘Š = {βŸ¨π‘, π‘žβŸ© | Color(βŸ¨π‘, π‘žβŸ©) = White} βŠ† ℬ, N2. ℛ𝐡 = {βŸ¨π‘, π‘žβŸ© | Color(βŸ¨π‘, π‘žβŸ©) = Black} ∩ ℬ = βˆ…. Proof. N1. By Lemma 1 it is sufficient to prove that β„›π‘Š is a bisimulation. First, notice that pairs violating (B1) are colored in Black from the start. Consider βŸ¨π‘, π‘žβŸ© ∈ β„›π‘Š . If 𝑝 = π‘ž, (B2) and (B3) trivially hold. Otherwise, let π‘₯ ∈ Ξ£ and 𝑝′ ∈ 𝛿(𝑝, π‘₯). From Prop. 2 it follows that 𝑣 = βŸ¨π‘β€² , π‘ž, π‘₯⟩ is either Grey or White. From Prop. 3 it follows that some neighbour 𝑒′ of 𝑣 is in β„›π‘Š . By Def. 4 we have 𝑒′ = βŸ¨π‘β€² , π‘ž β€² ⟩ for some π‘ž β€² ∈ 𝛿(π‘ž, π‘₯). Thus, (B2) holds for βŸ¨π‘, π‘žβŸ©. The very same argument can be used to prove that (B3) holds for βŸ¨π‘, π‘žβŸ©. Hence, β„›π‘Š is a bisimulation. N2. The result follows from (N1) and the fact that vertices in 𝑉0 ∩ β„‹ are either Black or White. 11 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 Theorem 4. MinimizeNfa is correct and incremental. Proof. The inclusion β„›π‘Š βŠ† ℬ is (N1). For the converse inclusion notice that it can be easily checked that, upon termination, every pair βŸ¨π‘, π‘žβŸ© ∈ 𝑉0 is either Black or White. Therefore, ℬ = 𝑉0 ∩ ℬ = (ℛ𝐡 βˆͺ β„›π‘Š ) ∩ ℬ = βˆ… βˆͺ (β„›π‘Š ∩ ℬ) βŠ† β„›π‘Š . Incrementality follows again from (N1). 6. Conclusions Bisimilarity is a fundamental (equivalence) relation among the states of finite automata, finding applications and variants in a number of different areas. Algorithms for computing bisimilarity are a classic and can be subdivided in two categories: top-down and a bottom-up. The former (partition refinement) approach starts with a coarse partition and refines it until the result is produced, while the latter (partition aggregation) starts from a singleton-classes equivalence relation and merges classes until possible. Although algorithms belonging to the bottom-up category are, to the best of our knowledge, still currently asymptotically slower than their alternative ones, aggregation based techniques enjoy the property of being incremental: automata resulting at intermediate stages of the computation are partially minimized yet language-equivalent to the input one. Moreover, partition aggregation algorithms, even though less celebrated than partition refinement ones β€” introduced by Hopcroft and generalized by Paige and Tarjan β€”, are interesting (at least) for two reasons. The first is theoretical: if two methods computes the same relation (just one from β€œabove" and the other from β€œbelow"), why is there a complexity gap? Is there some (hidden) cost involved in maintaining incrementality? The second is practical: some applicative contexts can greatly benefit from having a partially minimized equivalent automaton, especially when, as alternative, long sequences of refinement steps are involved. In this work, while fixing a minor mistake in the algorithm by Almeida et al., we reduced bisimilarity computation to a coloring problem on an associated graph. We then extended the algorithm to nondeterministic case, obtaining a complexity improvement on the best known bound for this case. The time complexity of both algorithms carry the 𝛼(𝑛) factor which we aim to shave off as future work. As a further line of research, it will be interesting to investigate the effect of applying the technique introduced here to the color refinement algorithm (a.k.a. Weisfeiler-Leman-1 algorithm, see [15]), currently implemented using an algorithm by Cardon and Crochemore (see [16]) belonging to the top-down/partition-refinement category. References [1] E. F. Moore, Gedanken-Experiments on Sequential Machines, Princeton University Press, 1956, pp. 129–154. doi:10.1515/9781400882618-006. [2] J. E. Hopcroft, R. Motwani, J. D. Ullman, Pearson Education, 2014. [3] A. R. Meyer, L. J. Stockmeyer, The equivalence problem for regular expressions with squaring requires exponential space, in: SWAT, 1972. 12 Christian Bianchini et al. CEUR Workshop Proceedings 1–13 [4] R. Paige, R. E. Tarjan, Three partition refinement algorithms, SIAM J. Comput. 16 (1987) 973–989. [5] J. E. Hopcroft, An n log n algorithm for minimizing states in a finite automaton, 1971. [6] J. F. Groote, J. Martens, E. P. de Vink, Lowerbounds for bisimulation by partition refinement (2022). URL: https://arxiv.org/abs/2203.07158. [7] B. W. Watson, Taxonomies and toolkits of regular language algorithms, 1995. [8] B. W. Watson, J. Daciuk, An efficient incremental dfa minimization algorithm, Natural Language Engineering 9 (2003) 49 – 64. [9] R. E. Tarjan, Efficiency of a good but not linear set union algorithm, J. ACM 22 (1975) 215–225. doi:10.1145/321879.321884. [10] J. E. Hopcroft, A linear algorithm for testing equivalence of finite automata, volume 114, Defense Technical Information Center, 1971. [11] M. Almeida, N. Moreira, R. Reis, Incremental dfa minimisation, RAIRO - Theoretical Informatics and Applications 48 (2014) 173–186. doi:10.1051/ita/2013045. [12] J. BjΓΆrklund, L. G. Cleophas, Minimization of finite state automata through partition aggregation, in: LATA, 2017. [13] J. BjΓΆrklund, L. Cleophas, Aggregation-based minimization of finite state automata, 2020. doi:10.1007/s00236-019-00363-5. [14] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to algorithms, 3rd edition, 2009, pp. 170–173. [15] M. Grohe, K. Kersting, M. Mladenov, P. Schweitzer, Color refinement and its applications, 2021. [16] A. Cardon, M. Crochemore, Partitioning a graph in o(|a| log2 |v|), Theor. Comput. Sci. 19 (1982) 85–98. doi:10.1016/0304-3975(82)90016-0. 13