=Paper=
{{Paper
|id=Vol-2925/paper3
|storemode=property
|title=Termination of graph rewriting systems through language theory
|pdfUrl=https://ceur-ws.org/Vol-2925/paper3.pdf
|volume=Vol-2925
|authors=Guillaume Bonfante,Miguel Couceiro
|dblpUrl=https://dblp.org/rec/conf/algos/BonfanteC20
}}
==Termination of graph rewriting systems through language theory==
T ERMINATION OF GRAPH REWRITING SYSTEMS THROUGH LANGUAGE THEORY Guillaume Bonfante Miguel Couceiro Université de Lorraine, CNRS, LORIA Université de Lorraine, CNRS, Inria, LORIA F-54000 Nancy, France F-54000 Nancy, France guillaume.bonfante@univ-lorraine.fr miguel.couceiro@loria.fr Dedicated to Maurice Pouzet on the occasion of his 75th birthday A BSTRACT The termination issue that we tackle is rooted in Natural Language Processing where computations are performed by graph rewriting systems (GRS) that may contain a large number of rules, often in the order of thousands. This asks for algorithmic procedures to verify the termination of such systems. The notion of graph rewriting that we consider does not make any assumption on the structure of graphs (they are not “term graphs”, “port graphs” nor “drags”). This lack of algebraic structure led us to proposing two orders on graphs inspired from language theory: the matrix multiset-path order and the rational embedding order. We show that both are stable by context, which we then use to obtain the main contribution of the paper: under a suitable notion of “interpretation”, a GRS is terminating if and only if it is compatible with an interpretation. Keywords Graph Rewriting · Termination · Order · Natural Language Processing 1 Introduction Computer linguists rediscovered a few years ago that graph rewriting is a good model of computation for rule-based systems. They used traditionally terms, see for instance Chomsky’s Syntagmatic Structures [1]. But usual phenomena such as anaphora do not fit really well within such theories. In such situations, graphs behave much better. For examples of graph rewriting in natural language processing, we refer the reader to the parsing procedure by Guillaume and Perrier [2] or the word ordering modeling by Kahane and Lareau [3]. The first named author with Guillaume and Perrier designed a graph rewriting model called GREW [4] that is adapted to natural language processing. The rewriting systems developed by the linguists often contain a huge number of rules, e.g., those synthesized from lexicons (e.g. some rules only apply to transitive verbs). For instance, in [2], several systems are presented, some with more than a thousand of rules. Verifying properties such as termination by hand thus becomes intractable. This fact motivates our framework for tackling the problem of GRS termination. Following the tracks of term rewriting, for which the definition is essentially fixed by the algebraic structure of terms, many approaches to graph rewriting emerged in past years. Some definitions (here meaning semantics) are based on a categorical framework, e.g., the double pushout (DPO) and the single pushout (SPO) models, see [5]. To make use of algebraic potential, some authors make some, possibly weak, hypotheses on graph structures, see for instance the main contribution by Courcelle and Engelfriet [6] where graph decompositions, graph operations and transformations are described in terms of monadic second-order logics (with the underlying decidability/complexity results). In this spirit, Ogawa describes a graph algebra under a limited tree-width condition [7]. Another line of research follows from the seminal work by Lafont [8] on interaction nets. The latter are graphs where nodes have some extra structure: nodes have a label related to some arity and co-arity. Moreover, nodes have some "principal gates" (ports) and rules are actionned via them. One of the main results by Lafont is that rewriting in this setting is (strongly) confluent. This approach has been enriched by Fernandez, Kirchner and Pinaud [9], Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). who implemented a fully operational system called PORGY with strategies and related semantics. Also, it is worth mentioning the graph rewriting as described by Dershowitz and Jouannaud [10]. Here, graphs are seen as a generalization of terms: symbols have a (fixed) arity, graphs are connected via some sprouts/variables as terms do. With such a setting, a good deal of term rewriting theory also applies to graphs. Let us come back to the initial problem: termination of graph rewriting systems in the context of natural language processing. We already mentioned that rule sets are large, which makes manual inspection impossible. Moreover, empirical studies fail to observe some of the underlying hypotheses of the previous frameworks. For instance, there is no clear bound on tree-width: even if input data such as dependency graphs are almost like trees, the property is not preserved along computations. Also, constraints on node degrees are also problematic: graphs are usually sparse, but some nodes may be highly connected. To illustrate, consider the sentence “The woman, the man, the child and the dog eat together”. The verb “eat” is related to four subjects and there is no a priori limit on this phenomenon. Typed versions (those with fixed arity) are also problematic: a verb may be transitive or not. Moreover, rewriting systems may be intrinsically nondeterministic. For instance, if one computes the semantics of a sentence out of its grammatical analysis, it is quite common there are multiple solutions. To further illustrate nondeterminism consider the well known phrasal construction “He saw a man with a telescope” with two clear readings. Some hypotheses are rather unusual for standard computations, e.g., fixed number of nodes. Indeed, nodes are usually related to words or concepts (which are themselves closely related to words). A paraphrase may be a little bit longer than its original version, but its length can be easily bounded by the length of the original sentence up to some linear factor. In GREW, node creations are restricted. To take into account the rare cases for which one needs extra nodes, a “reserve” is allocated at the beginning of the computation. All additional nodes are taken from the reserve. Doing so has some efficiency advantages, but that goes beyond the scope of the paper. Also, node and edge labels, despite being large, remain finite sets: they are usually related to some lexicons. These facts together have an important impact on the termination problem: since there are only finitely many graphs of a given size, rewriting only leads to finitely many outcomes. Thus, deciding termination for a particular input graph is decidable. However, our problem is to address termination in the class of all graphs. The latter problem is often referred to as uniform termination, whereas the former is refereed to as non-uniform. For word rewriting, uniform termination of non increasing systems constituted a well known problem, and was shown to be undecidable by Sénizergues in [11]. This paper proposes a novel approach for termination of graph rewriting. In a former paper [12], we proposed a solution based on label weights. Here, the focus is on the description (and the ordering) of paths within graphs. In fact, paths in a graph can be seen as good old regular languages. The question of path ordering thus translates into a question of regular language orderings. Accordingly, we define the graph multi-set path ordering that is related to that in [13]. Dershowitz and Jouannaud, in the context of drag rewriting, consider a similar notion of path ordering called GPO (see [14]). Our definitions diverge from theirs in that our graph rewriting model is quite different: here, we do not benefit (as they do) from a good algebraic structure. Our graphs have no heads, tails nor hierarchical decomposition. In fact, our ordering is not even well founded! Relating the two notions is nevertheless interesting and left for further work. Plump [15] also defines path orderings for term graphs, but those behave like sets of terms. One of our graph orderings will involve matrices, and orderings on matrices. Nonetheless, as far as we see, there is no relationship with matrix interpretations as defined by Endrullis, Waldmann and Zantema [16]. The paper is organized as follows. In Section 2 we recall the basic background on graphs and graph rewriting systems (GRS) that we will need throughout the paper, and introduce an example that motivated our work. In Section 3 we consider a language theory approach to the termination of GRSs. In particular, we present the language matrix, and the matrix multiset path order (Subsection 3.4) and the rational embedding order (Subsection 3.5). We also propose the notion of stability by context (Subsection 3.6) and show that both orderings are stable under this condition (Subsection 3.7). In Section 4 we propose notion of graph interpretability and show one of our main results, namely, that a GRS is terminating if and only if it is compatible with interpretations. Main contributions: The two main contributions of the paper are the following. 1. We propose two orders on graphs inspired from language theory, and we show that both are monotonic and stable by context. 2. We propose a notion of graph interpretation, and show that GRSs that are terminating are exactly those that are compatible with such interpretations. 2 Notation and Graph Rewriting In this section we recall some general definitions and notation. Given an alphabet ⌃, the set of words (finite sequences of elements of ⌃) is denoted by ⌃⇤ . The concatenation of two words v and w is denoted by v · w. The empty word, being the neutral element for concatenation, is denoted by 1⌃ or, when clear from the context, simply by 1. Note that h⌃⇤ , 1, ·i constitutes a monoid. A language on ⌃ is some subset L ✓ ⌃⇤ . The set of all languages on ⌃ is P(⌃⇤ ). The addition of two languages L, L0 ✓ ⌃⇤ is defined by L + L0 = {w | w 2 L _ w 2 L0 }. The empty language is denoted by 0 and hP(⌃⇤ ), +, 0i is also a (commutative) monoid. Given some word w 2 ⌃⇤ , we will also denote by w the language made of the singleton {w} 2 P(⌃⇤ ). Given two languages L, L0 ✓ ⌃⇤ , their concatenation is defined by L · L0 = {w · w0 | w 2 L ^ w0 2 L0 }. In this way, hP(⌃⇤ ), 1, ·i is also a monoid. Given the distributivity of ⇥ with respect to +, the 5-tuple hP(⌃⇤ ), +, 0, 1, ·i forms a semiring. A preorder on a set X is a binary relation ✓ X 2 that is reflexive (x x, for all x 2 X) and transitive (if x y and y z, then x z, for all x, y, z 2 X). A preorder is a partial order if it is anti-symmetric (if x y and y x, then x = y, for all x, y 2 X). An equivalence relation is a preorder that is symmetric (x y ) y x) . Observe that each preorder induces an equivalence relation ⇠: a ⇠ b if a b and b a. The strict part of is then the relation: x y if and only if x y and ¬(x ⇠ y). We also mention the “dual” preorder ⌫ of defined by: x ⌫ y if and only if y x. A preorder is said to be well-founded if there is no infinite chain · · · x2 x1 or, equivalently, x1 x2 · · · . The remainder of this section may be found in [4] and we refer the reader to it for an extended presentation. We suppose given a (finite) set ⌃N of node labels, a (finite) set ⌃E of edge labels and we define graphs accordingly. A graph is a triple G = hN, E, `i with E ✓ N ⇥ ⌃E ⇥ N and ` : N ! ⌃N is the labeling function of nodes. Note that there may be more than one edge between two nodes, but at most one is labeled with some e 2 ⌃E . In the sequel, we use the e notation m ! n for an edge (m, e, n) 2 E. Given a graph G, we denote by NG , EG and `G respectively its sets of nodes, edges and labeling function. We we will e e also (abusively) use the notation m 2 G and m ! n 2 G instead of m 2 NG and m ! n 2 EG when the context is clear. Furthermore, in | A ~ , a, b are nodes, |, ~ are the respective node labels and A is the edge label (here a b between a and b). The set of graphs on node labels ⌃N and edge labels ⌃E is denoted by G⌃N ,⌃E or G in short. Two graphs G and G0 are said to share their nodes when NG = NG0 . Given two graphs G and G0 such that NG ✓ NG0 , set G J G0 = hNG0 , EG [ EG0 , `i with `(n) = `G (n) if n 2 NG and `(n) = `G0 (n), otherwise. A graph morphism µ between a source graph G and a target graph H is a function µ : NG ! NH that preserves edges e e and labelings, that is, for all m ! n 2 G, µ(m) ! µ(n) 2 H holds, and for any node n 2 G: `G (n) = `H (µ(n)). A basic pattern is a graph, and a basic pattern matching is an injective morphism from a basic pattern P to some graph G. Given such a morphism µ : G ! G0 , we define µ(G) to be the sub-graph of G0 made of the nodes {µ(n) | n 2 NG }, e e of the edges {µ(m) ! µ(n) | m ! n 2 G} and node labels µ(n) 7! `G (n). A pattern is a pair P = hP0 , ~⌫ i made of a basic pattern P0 and a sequence of injective morphisms ⌫i : P0 ! Ni , called negative conditions. The basic pattern describes what must be present in the target graph G, whereas negative conditions say what must be absent in the target graph. Given a pattern P = hP0 , ~⌫ i and a graph G, a pattern morphism is an injective morphism µ : P0 ! G for which there is no morphism ⇠i such that µ = ⇠i ⌫i . Example 1. Consider the basic pattern morphism µ : P0 ! G (colors define the mapping): A ~ C µ g1 A | ~ E | B A | b0 b1 g0 D g2 A The pattern P = hP0 , [⌫]i with ⌫ defined by | A ~ ⌫ | ~ prevents the application of the morphism above. b0 b1 b0 B b1 Indeed, ⇠ = b0 7! g0 , b1 7! g1 is such that ⇠ ⌫ = µ. When there is only one negative condition, we represent the pattern by crossing nodes and edges which are not within the basic pattern. For instance, the pattern P above looks like A | ⇥ ~ that we hope is self-explanatory. b0 B b1 In this paper we think of graph transformations as sequences of “basic commands”. Definition 1 (The command language). There are three basic commands: label(p, ↵) for node renaming, del_edge(p, e, q) for edge deletion and add_edge(p, e, q) for edge creation. In these basic commands, p and q are nodes, ↵ is some node label and e is some edge label. A pattern hP0 , ~⌫ i is compatible with a command whenever all nodes that it involves belong to P0 . Definition 2 (Operational semantics). Given a pattern P = hP0 , ~⌫ i compatible with some command c, and some pattern matching µ : P ! G where G is the graph on which the transformation is applied, we have the following e possible cases: c = label(p, ↵) turns the label of µ(p) into ↵, c = del_edge(p, e, q) removes µ(p) ! µ(q) if it e exists, otherwise does nothing, and c = add_edge(p, e, q) adds the edge µ(p) ! µ(q) if it does not exist, otherwise does nothing. The graph obtained after such an application is denoted by G ·µ c. Given a sequence of commands ~c = (c1 , . . . , cn ), let G ·µ ~c be the resulting graph, i.e., G ·µ ~c = (· · · ((G ·µ c1 ) ·µ c2 ) ·µ · · · cn ). Remark 1. We took a slightly simplified version of patterns. Actually, in [4], we have negative conditions within patterns to avoid some rule applications. But these simplifications should be transparent with respect to termination. Nevertheless, informally, we will omit the right node in a pattern like | ⇥B · · · to say that there is no edge labeled B b0 from node b0 . Definition 3. A rule is a pair R = hP, ~ci made of a pattern and a (compatible) sequence of commands. Such a rule R applies to a graph G via a pattern morphism µ : P ! G. Let G0 = G ·µ ~c, then we write G !R,µ G0 . We define G ! G0 whenever there is a rule R and a pattern morphism µ such that G !R,µ G0 . 2.1 The main example Let ⌃N = {A} and ⌃E = {↵, , T }. For the discussion, we suppose that T is a working label, that is not present in the initial graphs. We want to add a new edge between node n and node 1 each time we find a maximal chain: A within a graph G. Consider the basic pattern P A together with its two ↵ ↵ ↵ ↵ ↵ A A A ··· init = A 1 2 n p q 3 negative conditions ⌫1 = A ⇥ ↵ A ↵ A and ⌫2 = A ⇥ A ↵ A . We consider three rules: p q p q Init: hhPinit , [⌫1 , ⌫2 ]i, (add_edge(p, T, q))i which fires the transitive closure. Follow: h A T A ↵ A , (add_edge(p, T, r), del_edge(p, T, q))i which follows the chain. p q r End: h A T A ↵ ⇥ A , (del_edge(p, T, q), add_edge(q, , p))i which stops the procedure. p q To prevent all pathological cases (e.g., when the edge is misplaced, when two chains are crossing, and so on), we could introduce more sophisticated patterns. But, since that does not change issues around termination, we avoid obscuring rules with such technicalities. Example 2. Consider the graph G = A ↵ A ↵ A . By applying succesively ’Init’, ’Follow’ and ’End’, G rewrites 1 2 3 ↵ ↵ ↵ as: A ↵ A ↵ A ! A A ↵ A ! A 2 A ↵ A ! A 2 A ↵ A 1 2 3 1 T 2 3 1 3 1 3 T 2.2 Three technical facts about Graph Rewriting It is well known that the main issue with graph rewriting definitions is the way the context is related to the pattern image and its rewritten part. We shall tackle this issue with Proposition 1. Self-application Let R = hP, ~ci be the rule made of a pattern P = hP0 , ~⌫ i and a sequence of commands ~c. There is the identity morphism 1P0 : P0 ! P0 , and thus we can apply rule R on P0 itself, that is, P0 !R,1P0 P00 = P0 ·1P0 ~c. We call this latter graph the self-application of R. Rule node renaming To avoid heavy notation, we will use the following trick. Suppose that we are given a rule R = hP, ~ci, a graph G and a pattern morphism µ : P ! G. Let P = hP0 , ~⌫ i. We define Rµ to be the rule obtained by renaming nodes p in P0 to µ(p) (and their references within ~c). For instance, the rule ’Follow’ can be rewritten as F ollowµ = h A T A ↵ A , (add_edge(1, T, 3), del_edge(1, T, 2))i where µ denotes the pattern morphism used 1 2 3 to apply ’Follow’ in the derivation. Observe that: (i) the basic pattern of Rµ is actually µ(P0 ), which is a subgraph of G, (ii) ◆ : µ(P0 ) ! G mapping n 7! n is a pattern matching, and (iii) applying rule Rµ with ◆ is equivalent to applying rule R with µ. In other words, G !R,µ G0 if (and only if) G !Rµ ,◆ G0 . To sum up, we can always rewrite a rule so that its basic pattern is actually a subgraph of G. ↵ ↵ Uniform rules Let us consider rule ’Init’ above. It applies on: A T A , and the result is the graph itself: A T A . p q p q Indeed, we cannot add an already present edge (relative to a label) within a graph. Thus, depending on the graph, the rule will or will not append an edge. Such an unpredictable behavior can be easily avoided by modifying the pattern of ↵ ’Init’ to: AT A . The same issue may come from edge deletions. A uniform rule is one for which commands apply p ⇥ q (that is, modify the graph) for each rule application. Since this is not the scope of the paper, we refer the reader to [4] for a precise definition of uniformity. We will only observe two facts. First, any rule can be replaced by a finite set of uniform rules (using negative conditions as above) that operate identically. Thus, we can always suppose that rules are uniform. Second, the following property holds for uniform rules (see [4]§7 for a proof). Proposition 1. Suppose that G !R,◆ G0 with R = hP, ~ci and P = hP0 , ~⌫ i (the basic pattern P0 being a subgraph of G). Let C be the graph obtained from G by deleting the edges in P0 . Then G = P0 J C and G0 = P00 J C with P00 being the self-application of the rule. Moreover, EC \ EP0 = ; and EC \ EP00 = ;. Throughout the remainder of the paper we assume that all rules are uniform. 3 Termination of Graph Rewriting Systems By a graph rewriting system (GRS) we simply mean a set of graph rewriting rules (see Section 2). A GRS R is said to be terminating if there is no infinite sequence G1 ! G2 ! · · · . Such sequences, whether finite or not, are called derivations. Since there is no node creation (neither node deletion) in our notion of rewriting, any derivation starting from a graph G will lead to graphs whose size is the size of G. Since there are only finitely many such graphs, we can decide the termination for this particular graph G. However, the question we address here is the uniform termination problem (see Section 1). Remark 2. Suppose that we are given a strict partial order , not necessarily well founded. If G ! G0 implies G G0 for all graphs G and G0 , then the system is terminating. Indeed, suppose it is not the case, let G1 ! G2 ! · · · be an infinite reduction sequence. Since there are only finitely many graphs of size of G1 , it means that there are two indices i and j such that Gi ! · · · ! Gj with Gi = Gj . But then, since Gi Gi+1 · · · Gj , we have that Gi Gj = Gi which is a contradiction. A similar argument was exhibited by Dershowitz in [17] in the context of term rewriting. For instance, it is possible to embed rewriting within real numbers rather than natural numbers to prove termination. Let us try to prove the termination of our main example (see Subsection 2.1). Rules such as ’Init’ and ’End’ are “simple”: we put a weight on edge labels ! : ⌃E ! R and we say that the weight of a graph is the sum of the weights of its edges labels. Set !(↵) = 0, !( ) = 2 and !(T ) = 1. Then, rules ’Init’ and ’End’ decrease the weight by 1 and, since rule ’Follow’ keeps the weight constant, it means the two former rules can be applied only finitely many times. Observe that negative weights are no problem with respect to Remark 2. But how do we handle rule ’Follow’? No weights as above can work. 3.1 A language point of view Let G ! G0 be a rule application. The set of nodes stays constant. Let us think of graphs as automata, and let us forget about node labeling for the time being. Let ⌃E be the set of edge labels. Consider a pair of states (nodes), choose one to be the initial state and one to be the final state. Thus the automaton (graph) defines some regular language on ⌃E . In fact, the automaton describes n2 languages (one for each pair of states). Now, let us consider the effect of graph rewriting in terms of languages. Consider an application of the ’Follow’ rule: T ↵ G ! G0 . Any word to state r that goes through the transitions p ! q ! r can be mapped to a shorter one in G0 via T the transition p ! r. The languages corresponding to state r contain shorter words. The remainder of this section is devoted to formalizing this intuition into proper orders on graphs. For that, we will need to count the number of paths between any two states. Hence, we shall introduce N-rational expressions, that is, rational expression with multiplicity. See, e.g., Sakarovitch’s book [18] for an introduction and justifications of the upcoming constructions. We introduce here the basic ideas. 3.2 Formal series A formal series on ⌃ (with coefficients in N) is a (total) function s : ⌃⇤ ! N. Given a word w, s(w) is the multiplicity of w. The set of words s = {w 2 ⌃⇤ | s(w) 6= 0} is the support of s. Given n 2 N, let n be the series defined by n(w) = 0, if w 6= 1, and n(1) = n, where 1 denotes the empty word. The empty language is 0, the language made of the empty word is 1. Moreover, for a 2 ⌃, the series a is given by a(w) = 0 if w 6= a and a(a) = 1. Given two series s and P t, their addition is the series s + t given by s + t(w) = s(w) + t(w), and their product is s · t defined by s · t(w) = u·v=w s(u)t(v). The star operation is defined by s⇤ = 1 + s + s2 + · · · . The monoïd ⌃⇤ being graded, the operation is correctly defined whenever s(1) = 0. Given a series s, let sk be its restriction to words of length less or equal to k, i.e., sk (w) = 0 whenever |w| > k and sk (w) = s(w), otherwise. An N-rational expression on an alphabet ⌃ is built upon the grammar [19]: E ::= a 2 ⌃ | n 2 N | (E + E) | (E · E) | (E⇤ ). Thus, given the constructions mentioned in the previous paragraph, any N-rational expression E 2 E denotes some formal series. To each N-rational expression corresponds P an N-automaton, which is a standard automaton with transitions labeled by a non empty linear combination ik ni ai with ni 2 N and ai 2 ⌃ for all i k. 3.3 The language matrix Let us suppose given an edge label set ⌃E . Let E denote the N-rational expressions over ⌃E . A matrix M of dimension P ⇥ P for some (finite) set P is an array (Mi,j )i2P,j2P whose entries are in E. Let ME be the set of such matrices. Given a graph G, we define the matrix MG of dimension NG ⇥ NG as follows: MGi,j = T1 + · · · + T` with T1 , . . . , T` the set of labels on the transitions between state i and j if such transitions exist, otherwise 0. Let 1P be the unit matrix of dimension P ⇥ P , that is (1P )i,j = 0 if i 6= j else 1. From now on, we abbreviate the notation from 1P to 1 if the context is clear. Then, let MG⇤ = 1 + MG + MG 2 + · · · . Each entry of MG ⇤ is actually an N-regular expression (see Sakarovitch Ch. III, §4 for instance). The (infinite) sum is correctly defined since for all i, j, we have the equality (MG )i,j = T1 + · · · + T` . Thus, 1 62 (MG )i,j . The question about termination can be reformulated in terms of matrices whose entries are languages (with words counted with their multiplicity). To prove the termination of the rewriting system, it is then sufficient to exhibit some order > on matrices such that for any two graphs G ! G0 , we have MG ⇤ > MG⇤ 0 . To prove such a property in the infinite class of finite graphs, we will use the notion of “stable orders”. Recall the ’Follow’ rule and consider the corresponding basic pattern L and its self-application R. Their respective matrices are: ! ! 0 T 0 0 0 T ML = 0 0 ↵ MR = 0 0 ↵ . 0 0 0 0 0 0 Observe that (MR )13 > (ML )13 . This matrix deals with edges/transitions. In order to consider paths, we need to compute ML⇤ and MR⇤ that are given by: ! ! 1 T T ·↵ 1 0 T ⇤ ⇤ ML = 0 1 ↵ MR = 0 1 ↵ . 0 0 1 0 0 1 Any word within MR⇤ ’s entries is a sub-word of the corresponding entry in ML⇤ . Example 3. Consider now a variation of ’Follow’ made of the pattern h A T A ↵ A and commands p q r (add_edge(p, T, r), del_edge(p, T, q))i. By setting L0 as its pattern and R0 as its self-application, we get the following matrices: ! ! (T ↵ )⇤ T (↵ T )⇤ T ↵( T ↵)⇤ (T )⇤ 0 T ( T )⇤ ML⇤ 0 = ↵ (T ↵ )⇤ (↵ T )⇤ ↵( T ↵)⇤ MR⇤ 0 = ↵ (T )⇤ 1 ↵( T )⇤ . (T ↵ )⇤ T (↵ T )⇤ ( T ↵)⇤ (T )⇤ 0 ( T )⇤ Again, words within MR⇤ 0 are sub-words of the corresponding ones in ML⇤ 0 . 3.4 The matrix multiset path order The order we shall introduce in this section is inspired by the notion of multiset path ordering within the context of term rewriting (see for instance [13]). However, in the present context of graph rewriting (to be compared with Dershowitz and Jouannaud’s [14] or with Plump’s [15]), the definition is a bit more complicated. Here, we do not consider an order on letters as it is done for terms. Let E be the word embedding on ⌃⇤ , that is, the smallest partial order such that 1 E w, and if u E v, then (u · w E v · w and w · u E w · v, for all u, v, w 2 ⌃⇤ . This order E can be extended to formal series, that is, the multiset-path ordering, see Dershowitz and Manna [20] or Huet and Oppen [21]. Definition 4 (Multiset path order). The multiset path order is the smallest partial order on finite series such that • if there is w 2 t such that for all v 2 s, v / w, then s E t, and • if r E s and t E u, then r + t E s + u. We write s / t when s E t and s 6= t. Proposition 2. Addition and product are monotonic with respect to the multiset-path order. Moreover, addition is strictly monotonic with respect to E, and if r / s, then r · t / s · t and t · r / t · s, whenever t 6= 0 (otherwise, we have equality). Proof. Addition is monotonic by definition. Actually, we prove now that it is strictly monotonic. Suppose that r / s. We prove that r + t / s + t, by induction (see Definition 4). Suppose that there is w 2 s such that for all v 2 r we have v / w, then r / s. Since r(w) = 0, then (r + t)(w) = t(w) < s(w) + t(w) = (s + t)(w), and we are done. Otherwise, r = r0 + r1 and s = s0 + s1 with r0 E s0 and r1 E s1 . One of the two inequalities must be strict (otherwise r = s). Suppose r0 / s0 . By definition, observe that r1 + t E s1 + t. But then, r + t = r0 + (r1 + t) and s = s0 + (s1 + t) and we apply induction on (r0 , s0 ). As addition is commutative, the result holds. For the product, suppose that r E s and let t be some series. We prove r · t E s · t; the other inequality t · r E t · s is similar. Again, we proceed by induction on Definition 4: • Suppose there is w 2 s such that for all v 2 r, v / w. ByPinduction on t, if t = 0, r · t = 0 E 0 = s · t. Otherwise, t = t0 + v0 for a word v0 . Observe that r · v0 = v2r r(v)v · v0 . Since for all v 2 r, v · v0 / w · v0 , we have r · v0 / w · v0 E s · v0 . Now, r · t = r · (t0 + v0 ) = r · t0 + r · v0 and s · t = s · t0 + s · v0 . By induction, r · t0 E s · t0 and since r · v0 E s · v0 , the result holds. • Otherwise, r = r0 + r1 . In this case, s · r = s · r0 + s · r1 and t · r = t · r0 + t · r1 . The result then follows by induction. To show strict monotonicity, suppose r / s and again proceed by case analysis. Suppose that there is some w 2 s such that for all vP2 r, v / w. Since t 6= 0, it contains at least one word v0 such that t = t0 + v0 . By r / s, r · v0 = P v2r r(v)v·v0 / v2s s(v)v·v0 = s·v0 . Now, r·t0 E s·t0 by monotonicity. Thus r·t = r·t0 +r·v0 /s·t0 +s·v0 = s·t where the strict inequality is due to strict monotonicity of addition. Definition 5 (Matrix multiset-path order). Let M and M 0 be two matrices with dimension P ⇥ P . Write M E M 0 if k 0 k for all k |P | and for all (i, j) 2 P ⇥ P , we have Mi,j E Mi,j . Corollary 1. The addition and the multiplication are monotonic with respect to the matrix multiset-path order. Proof. It follows from Proposition 2 since addition and product of matrices are defined as addition and product of their entries. 3.5 The Rational Embedding Order Let ⌃ be some fixed alphabet. A finite state transducer is a finite state automaton whose transitions are decorated by pairs (a, w), a 2 ⌃, w 2 ⌃⇤ . We refer the reader to the book of Sakarovitch [18] for details. To give the intuition, a transducer works as follows. At the beginning, the current state is set to the initial state of the transducer. The reading “head” is put on the first letter of the input and the output is set to the empty word. At each step of the computation, given some state q, some letter a read and some transition q 0 2 (q, (a, w)), the transducer appends w to the output, shifts the input head to the "right" and sets the current state to q 0 . The computation ends when a) the input word is completely read in which case the result is the output or b) if there is no transition compatible with the current state and the read letter. In this latter case, the computation is said to fail. Thus, compared to a finite state automaton whose computations end on True (a final state) or False (not a final state) given some input word, the transducer output words, thus defining a relation in ⌃⇤ . Given some transducer ⌧ , if for any word, there is at most one successful computation, the transducer outputs at most one word, the relation becomes functional. In that case, we denote the function it computes by [⌧ ]. Definition 6 (Rational Embedding Order). Given two regular languages L and L0 on ⌃, write L . L0 if: • there is an injective function f : L0 ! L that is computed by a transducer ⌧ and • such that |f (w)| |w|, for every w 2 L0 . Such functions (and the corresponding transducers) are said to be decreasing (in [22]). The transducer ⌧ is said to be a witness of L . L0 . We say that a transition of a transducer is deleting when it is of the form a | 1 for some a 2 ⌃. A transducer whose transitions are of the form X | Y , with |Y | |X|, is itself decreasing. If a path corresponding to an input w passes through a deleting transition, then |⌧ (w)| < |w|. In the sequel we will make use of the following results that are direct consequences of Nivat’s Theorem [23] (Prop. 4, §3). Proposition 3. Let [⌧ ] : L ! L0 be computed by a transducer ⌧ , and let L00 be a regular language. Then the following assertions hold. 1. The restriction ⌧|L00 : L00 \ L ! L0 mapping w 7! ⌧ (w) is computable by a transducer. 00 2. The co-restriction ⌧ |L : L ! L0 \ L00 mapping w 7! ⌧ (w) if ⌧ (w) 2 L00 and otherwise undefined, is computable by a transducer. 3. The function ⌧ 0 : L ! L0 defined by ⌧ 0 (w) = ⌧ (w) if w 2 L00 and otherwise undefined, is computable by a transducer. Observe that the identity on ⌃⇤ is computed by a transducer (made of a unique initial/final state with transitions a | a for all a 2 ⌃). Then, the identity on L is obtained by Proposition 3(1,2). Thus we have that . is reflexive. Also, it is well known that both transducers and injective functions can be composed. Hence, we also have that . is transitive. Thus, . is a preorder. However, we do not have anti-reflexivity in general. For instance, we have L1 = A · (A + B)⇤ . L2 = B · (A + B)⇤ . L1 . To see this, consider the following transducer (whose initial state is indicated by an in-arrow, whereas the final one by B|A . This shows that L1 . L2 . Swap ’A’ and ’B’, to see that the reversed relation also holds. A|B an out-arrow): q 0 q1 A|B It is worth noting that there is a simple criterion to ensure a strict inequality. Proposition 4. Suppose L1 . L2 has a witness ⌧ : L2 ! L1 . If ⌧ contains one (accessible and co-accessible) deleting transition, then the relation is strict. Proof. As before, set L1 < L2 whenever L1 . L2 but not L2 . L1 . Ad absurdum, suppose L1 . L2 . L1 with a transducer ✓ : L1 ! L2 and ⌧ as above. Then ✓ ⌧ (the composition of the two transducers) defines an injective function. Let w be the smallest input word from the initial state to a final state through the transition a | 1 in ⌧ . Define the set M <|w| = {u 2 L2 | |u| < |w|}. For any word u, we have |✓ ⌧ (u)| |u|. Thus ✓ ⌧ (M