=Paper= {{Paper |id=Vol-2925/paper3 |storemode=property |title=Termination of graph rewriting systems through language theory |pdfUrl=https://ceur-ws.org/Vol-2925/paper3.pdf |volume=Vol-2925 |authors=Guillaume Bonfante,Miguel Couceiro |dblpUrl=https://dblp.org/rec/conf/algos/BonfanteC20 }} ==Termination of graph rewriting systems through language theory== https://ceur-ws.org/Vol-2925/paper3.pdf
                 T ERMINATION OF GRAPH REWRITING SYSTEMS
                                   THROUGH LANGUAGE THEORY



                     Guillaume Bonfante                                                 Miguel Couceiro
            Université de Lorraine, CNRS, LORIA                           Université de Lorraine, CNRS, Inria, LORIA
                   F-54000 Nancy, France                                             F-54000 Nancy, France
         guillaume.bonfante@univ-lorraine.fr                                    miguel.couceiro@loria.fr




                             Dedicated to Maurice Pouzet on the occasion of his 75th birthday

                                                           A BSTRACT
          The termination issue that we tackle is rooted in Natural Language Processing where computations
          are performed by graph rewriting systems (GRS) that may contain a large number of rules, often in
          the order of thousands. This asks for algorithmic procedures to verify the termination of such systems.
          The notion of graph rewriting that we consider does not make any assumption on the structure of
          graphs (they are not “term graphs”, “port graphs” nor “drags”). This lack of algebraic structure led us
          to proposing two orders on graphs inspired from language theory: the matrix multiset-path order and
          the rational embedding order. We show that both are stable by context, which we then use to obtain
          the main contribution of the paper: under a suitable notion of “interpretation”, a GRS is terminating
          if and only if it is compatible with an interpretation.

Keywords Graph Rewriting · Termination · Order · Natural Language Processing

1    Introduction
Computer linguists rediscovered a few years ago that graph rewriting is a good model of computation for rule-based
systems. They used traditionally terms, see for instance Chomsky’s Syntagmatic Structures [1]. But usual phenomena
such as anaphora do not fit really well within such theories. In such situations, graphs behave much better. For examples
of graph rewriting in natural language processing, we refer the reader to the parsing procedure by Guillaume and
Perrier [2] or the word ordering modeling by Kahane and Lareau [3]. The first named author with Guillaume and Perrier
designed a graph rewriting model called GREW [4] that is adapted to natural language processing.
The rewriting systems developed by the linguists often contain a huge number of rules, e.g., those synthesized from
lexicons (e.g. some rules only apply to transitive verbs). For instance, in [2], several systems are presented, some with
more than a thousand of rules. Verifying properties such as termination by hand thus becomes intractable. This fact
motivates our framework for tackling the problem of GRS termination.
Following the tracks of term rewriting, for which the definition is essentially fixed by the algebraic structure of terms,
many approaches to graph rewriting emerged in past years. Some definitions (here meaning semantics) are based on a
categorical framework, e.g., the double pushout (DPO) and the single pushout (SPO) models, see [5]. To make use of
algebraic potential, some authors make some, possibly weak, hypotheses on graph structures, see for instance the main
contribution by Courcelle and Engelfriet [6] where graph decompositions, graph operations and transformations are
described in terms of monadic second-order logics (with the underlying decidability/complexity results). In this spirit,
Ogawa describes a graph algebra under a limited tree-width condition [7].
Another line of research follows from the seminal work by Lafont [8] on interaction nets. The latter are graphs
where nodes have some extra structure: nodes have a label related to some arity and co-arity. Moreover, nodes have
some "principal gates" (ports) and rules are actionned via them. One of the main results by Lafont is that rewriting
in this setting is (strongly) confluent. This approach has been enriched by Fernandez, Kirchner and Pinaud [9],
 Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
who implemented a fully operational system called PORGY with strategies and related semantics. Also, it is worth
mentioning the graph rewriting as described by Dershowitz and Jouannaud [10]. Here, graphs are seen as a generalization
of terms: symbols have a (fixed) arity, graphs are connected via some sprouts/variables as terms do. With such a setting,
a good deal of term rewriting theory also applies to graphs.
Let us come back to the initial problem: termination of graph rewriting systems in the context of natural language
processing. We already mentioned that rule sets are large, which makes manual inspection impossible. Moreover,
empirical studies fail to observe some of the underlying hypotheses of the previous frameworks. For instance, there is
no clear bound on tree-width: even if input data such as dependency graphs are almost like trees, the property is not
preserved along computations. Also, constraints on node degrees are also problematic: graphs are usually sparse, but
some nodes may be highly connected. To illustrate, consider the sentence “The woman, the man, the child and the
dog eat together”. The verb “eat” is related to four subjects and there is no a priori limit on this phenomenon. Typed
versions (those with fixed arity) are also problematic: a verb may be transitive or not. Moreover, rewriting systems
may be intrinsically nondeterministic. For instance, if one computes the semantics of a sentence out of its grammatical
analysis, it is quite common there are multiple solutions. To further illustrate nondeterminism consider the well known
phrasal construction “He saw a man with a telescope” with two clear readings.
Some hypotheses are rather unusual for standard computations, e.g., fixed number of nodes. Indeed, nodes are usually
related to words or concepts (which are themselves closely related to words). A paraphrase may be a little bit longer
than its original version, but its length can be easily bounded by the length of the original sentence up to some linear
factor. In GREW, node creations are restricted. To take into account the rare cases for which one needs extra nodes, a
“reserve” is allocated at the beginning of the computation. All additional nodes are taken from the reserve. Doing so has
some efficiency advantages, but that goes beyond the scope of the paper. Also, node and edge labels, despite being
large, remain finite sets: they are usually related to some lexicons. These facts together have an important impact on the
termination problem: since there are only finitely many graphs of a given size, rewriting only leads to finitely many
outcomes. Thus, deciding termination for a particular input graph is decidable. However, our problem is to address
termination in the class of all graphs. The latter problem is often referred to as uniform termination, whereas the former
is refereed to as non-uniform. For word rewriting, uniform termination of non increasing systems constituted a well
known problem, and was shown to be undecidable by Sénizergues in [11].
This paper proposes a novel approach for termination of graph rewriting. In a former paper [12], we proposed a solution
based on label weights. Here, the focus is on the description (and the ordering) of paths within graphs. In fact, paths in
a graph can be seen as good old regular languages. The question of path ordering thus translates into a question of
regular language orderings. Accordingly, we define the graph multi-set path ordering that is related to that in [13].
Dershowitz and Jouannaud, in the context of drag rewriting, consider a similar notion of path ordering called GPO
(see [14]). Our definitions diverge from theirs in that our graph rewriting model is quite different: here, we do not
benefit (as they do) from a good algebraic structure. Our graphs have no heads, tails nor hierarchical decomposition. In
fact, our ordering is not even well founded! Relating the two notions is nevertheless interesting and left for further work.
Plump [15] also defines path orderings for term graphs, but those behave like sets of terms.
One of our graph orderings will involve matrices, and orderings on matrices. Nonetheless, as far as we see, there is no
relationship with matrix interpretations as defined by Endrullis, Waldmann and Zantema [16].
The paper is organized as follows. In Section 2 we recall the basic background on graphs and graph rewriting systems
(GRS) that we will need throughout the paper, and introduce an example that motivated our work. In Section 3 we
consider a language theory approach to the termination of GRSs. In particular, we present the language matrix, and the
matrix multiset path order (Subsection 3.4) and the rational embedding order (Subsection 3.5). We also propose the
notion of stability by context (Subsection 3.6) and show that both orderings are stable under this condition (Subsection
3.7). In Section 4 we propose notion of graph interpretability and show one of our main results, namely, that a GRS is
terminating if and only if it is compatible with interpretations.


Main contributions: The two main contributions of the paper are the following.


      1. We propose two orders on graphs inspired from language theory, and we show that both are monotonic and
         stable by context.


      2. We propose a notion of graph interpretation, and show that GRSs that are terminating are exactly those that are
         compatible with such interpretations.
2     Notation and Graph Rewriting

In this section we recall some general definitions and notation. Given an alphabet ⌃, the set of words (finite sequences
of elements of ⌃) is denoted by ⌃⇤ . The concatenation of two words v and w is denoted by v · w. The empty word,
being the neutral element for concatenation, is denoted by 1⌃ or, when clear from the context, simply by 1. Note that
h⌃⇤ , 1, ·i constitutes a monoid.
A language on ⌃ is some subset L ✓ ⌃⇤ . The set of all languages on ⌃ is P(⌃⇤ ). The addition of two languages
L, L0 ✓ ⌃⇤ is defined by L + L0 = {w | w 2 L _ w 2 L0 }. The empty language is denoted by 0 and hP(⌃⇤ ), +, 0i is
also a (commutative) monoid. Given some word w 2 ⌃⇤ , we will also denote by w the language made of the singleton
{w} 2 P(⌃⇤ ). Given two languages L, L0 ✓ ⌃⇤ , their concatenation is defined by L · L0 = {w · w0 | w 2 L ^ w0 2 L0 }.
In this way, hP(⌃⇤ ), 1, ·i is also a monoid. Given the distributivity of ⇥ with respect to +, the 5-tuple hP(⌃⇤ ), +, 0, 1, ·i
forms a semiring.
A preorder on a set X is a binary relation ✓ X 2 that is reflexive (x x, for all x 2 X) and transitive (if x y and
y z, then x z, for all x, y, z 2 X). A preorder is a partial order if it is anti-symmetric (if x y and y x, then
x = y, for all x, y 2 X). An equivalence relation is a preorder that is symmetric (x y ) y x) . Observe that each
preorder induces an equivalence relation ⇠: a ⇠ b if a b and b a. The strict part of is then the relation: x y
if and only if x y and ¬(x ⇠ y). We also mention the “dual” preorder ⌫ of defined by: x ⌫ y if and only if y x.
A preorder is said to be well-founded if there is no infinite chain · · · x2 x1 or, equivalently, x1 x2 · · · .
The remainder of this section may be found in [4] and we refer the reader to it for an extended presentation. We suppose
given a (finite) set ⌃N of node labels, a (finite) set ⌃E of edge labels and we define graphs accordingly. A graph is a
triple G = hN, E, `i with E ✓ N ⇥ ⌃E ⇥ N and ` : N ! ⌃N is the labeling function of nodes. Note that there may
be more than one edge between two nodes, but at most one is labeled with some e 2 ⌃E . In the sequel, we use the
              e
notation m ! n for an edge (m, e, n) 2 E.
Given a graph G, we denote by NG , EG and `G respectively its sets of nodes, edges and labeling function. We we will
                                                   e                                       e
also (abusively) use the notation m 2 G and m ! n 2 G instead of m 2 NG and m ! n 2 EG when the context
is clear. Furthermore, in |    A
                                  ~ , a, b are nodes, |, ~ are the respective node labels and A is the edge label (here
                               a      b
between a and b).
The set of graphs on node labels ⌃N and edge labels ⌃E is denoted by G⌃N ,⌃E or G in short. Two graphs G
and G0 are said to share their nodes when NG = NG0 . Given two graphs G and G0 such that NG ✓ NG0 , set
G J G0 = hNG0 , EG [ EG0 , `i with `(n) = `G (n) if n 2 NG and `(n) = `G0 (n), otherwise.
A graph morphism µ between a source graph G and a target graph H is a function µ : NG ! NH that preserves edges
                                  e                  e
and labelings, that is, for all m ! n 2 G, µ(m) ! µ(n) 2 H holds, and for any node n 2 G: `G (n) = `H (µ(n)).
A basic pattern is a graph, and a basic pattern matching is an injective morphism from a basic pattern P to some graph
G. Given such a morphism µ : G ! G0 , we define µ(G) to be the sub-graph of G0 made of the nodes {µ(n) | n 2 NG },
                        e             e
of the edges {µ(m) ! µ(n) | m ! n 2 G} and node labels µ(n) 7! `G (n).
A pattern is a pair P = hP0 , ~⌫ i made of a basic pattern P0 and a sequence of injective morphisms ⌫i : P0 ! Ni ,
called negative conditions. The basic pattern describes what must be present in the target graph G, whereas negative
conditions say what must be absent in the target graph. Given a pattern P = hP0 , ~⌫ i and a graph G, a pattern morphism
is an injective morphism µ : P0 ! G for which there is no morphism ⇠i such that µ = ⇠i ⌫i .
Example 1. Consider the basic pattern morphism µ : P0 ! G (colors define the mapping):

                                                                                     A        ~        C
                                                                   µ                          g1
                                                    A
                                               |         ~                 E    |        B         A       |
                                               b0        b1                     g0            D            g2



                                                                                A
The pattern P = hP0 , [⌫]i with ⌫ defined by |           A
                                                              ~
                                                                       ⌫
                                                                           |             ~     prevents the application of the morphism above.
                                                    b0        b1           b0   B        b1

Indeed, ⇠ = b0 7! g0 , b1 7! g1 is such that ⇠ ⌫ = µ. When there is only one negative condition, we represent the
pattern by crossing nodes and edges which are not within the basic pattern. For instance, the pattern P above looks like
      A
 |
      ⇥   ~    that we hope is self-explanatory.
 b0   B   b1



In this paper we think of graph transformations as sequences of “basic commands”.
Definition 1 (The command language). There are three basic commands: label(p, ↵) for node renaming,
del_edge(p, e, q) for edge deletion and add_edge(p, e, q) for edge creation. In these basic commands, p and q
are nodes, ↵ is some node label and e is some edge label. A pattern hP0 , ~⌫ i is compatible with a command whenever all
nodes that it involves belong to P0 .
 Definition 2 (Operational semantics). Given a pattern P = hP0 , ~⌫ i compatible with some command c, and some
 pattern matching µ : P ! G where G is the graph on which the transformation is applied, we have the following
                                                                                                                       e
 possible cases: c = label(p, ↵) turns the label of µ(p) into ↵, c = del_edge(p, e, q) removes µ(p) ! µ(q) if it
                                                                                                 e
 exists, otherwise does nothing, and c = add_edge(p, e, q) adds the edge µ(p) ! µ(q) if it does not exist, otherwise
 does nothing. The graph obtained after such an application is denoted by G ·µ c. Given a sequence of commands
~c = (c1 , . . . , cn ), let G ·µ ~c be the resulting graph, i.e., G ·µ ~c = (· · · ((G ·µ c1 ) ·µ c2 ) ·µ · · · cn ).
Remark 1. We took a slightly simplified version of patterns. Actually, in [4], we have negative conditions within
patterns to avoid some rule applications. But these simplifications should be transparent with respect to termination.
Nevertheless, informally, we will omit the right node in a pattern like | ⇥B · · · to say that there is no edge labeled B
                                                                                                                                                             b0

from node b0 .
Definition 3. A rule is a pair R = hP, ~ci made of a pattern and a (compatible) sequence of commands. Such a rule
R applies to a graph G via a pattern morphism µ : P ! G. Let G0 = G ·µ ~c, then we write G !R,µ G0 . We define
G ! G0 whenever there is a rule R and a pattern morphism µ such that G !R,µ G0 .

2.1       The main example

Let ⌃N = {A} and ⌃E = {↵, , T }. For the discussion, we suppose that T is a working label, that is not present
in the initial graphs. We want to add a new edge between node n and node 1 each time we find a maximal chain:
                             A within a graph G. Consider the basic pattern P            A together with its two
     ↵       ↵      ↵      ↵                                                           ↵
 A       A       A     ···
                                                                              init = A
 1             2                                                  n                                                                                                               p   q
                                   3

negative conditions ⌫1 = A ⇥                                          ↵
                                                                              A
                                                                                      ↵
                                                                                                  A       and ⌫2 = A ⇥                       A
                                                                                                                                                 ↵
                                                                                                                                                         A   . We consider three rules:
                                                                              p                   q                                          p           q


Init: hhPinit , [⌫1 , ⌫2 ]i, (add_edge(p, T, q))i which fires the transitive closure.
Follow: h A                T
                                       A
                                               ↵
                                                   A        , (add_edge(p, T, r), del_edge(p, T, q))i which follows the chain.
                   p                   q            r


End: h A           T
                           A
                                       ↵
                                       ⇥       A   , (del_edge(p, T, q), add_edge(q, , p))i which stops the procedure.
           p                   q


To prevent all pathological cases (e.g., when the edge is misplaced, when two chains are crossing, and so on), we
could introduce more sophisticated patterns. But, since that does not change issues around termination, we avoid
obscuring rules with such technicalities.
Example 2. Consider the graph G = A                                                                   ↵
                                                                                                           A
                                                                                                                   ↵
                                                                                                                               A       . By applying succesively ’Init’, ’Follow’ and ’End’, G rewrites
                                                                                          1                2                   3
                                                              ↵                                                        ↵                                      ↵

as:   A
            ↵
                       A
                               ↵
                                           A   ! A                        A
                                                                                  ↵
                                                                                              A       ! A                  2
                                                                                                                                   A
                                                                                                                                         ↵
                                                                                                                                             A   ! A              2
                                                                                                                                                                      A
                                                                                                                                                                          ↵
                                                                                                                                                                              A
      1                2                   3            1     T           2                   3                1                             3       1                        3
                                                                                                                                   T



2.2       Three technical facts about Graph Rewriting

It is well known that the main issue with graph rewriting definitions is the way the context is related to the pattern image
and its rewritten part. We shall tackle this issue with Proposition 1.

Self-application Let R = hP, ~ci be the rule made of a pattern P = hP0 , ~⌫ i and a sequence of commands ~c. There is
the identity morphism 1P0 : P0 ! P0 , and thus we can apply rule R on P0 itself, that is, P0 !R,1P0 P00 = P0 ·1P0 ~c.
We call this latter graph the self-application of R.

Rule node renaming To avoid heavy notation, we will use the following trick. Suppose that we are given a rule
R = hP, ~ci, a graph G and a pattern morphism µ : P ! G. Let P = hP0 , ~⌫ i. We define Rµ to be the rule obtained
by renaming nodes p in P0 to µ(p) (and their references within ~c). For instance, the rule ’Follow’ can be rewritten as
F ollowµ = h A T A ↵ A , (add_edge(1, T, 3), del_edge(1, T, 2))i where µ denotes the pattern morphism used
                           1                   2              3

to apply ’Follow’ in the derivation. Observe that: (i) the basic pattern of Rµ is actually µ(P0 ), which is a subgraph of
G, (ii) ◆ : µ(P0 ) ! G mapping n 7! n is a pattern matching, and (iii) applying rule Rµ with ◆ is equivalent to applying
rule R with µ. In other words, G !R,µ G0 if (and only if) G !Rµ ,◆ G0 . To sum up, we can always rewrite a rule so
that its basic pattern is actually a subgraph of G.

                                                                             ↵                                                   ↵

Uniform rules        Let us consider rule ’Init’ above. It applies on:   A   T   A   , and the result is the graph itself:   A   T   A   .
                                                                         p       q                                           p       q

Indeed, we cannot add an already present edge (relative to a label) within a graph. Thus, depending on the graph, the
rule will or will not append an edge. Such an unpredictable behavior can be easily avoided by modifying the pattern of
                 ↵

’Init’ to:   AT   A . The same issue may come from edge deletions. A uniform rule is one for which commands apply
           p  ⇥   q

(that is, modify the graph) for each rule application. Since this is not the scope of the paper, we refer the reader to [4]
for a precise definition of uniformity. We will only observe two facts.
First, any rule can be replaced by a finite set of uniform rules (using negative conditions as above) that operate identically.
Thus, we can always suppose that rules are uniform.
Second, the following property holds for uniform rules (see [4]§7 for a proof).
Proposition 1. Suppose that G !R,◆ G0 with R = hP, ~ci and P = hP0 , ~⌫ i (the basic pattern P0 being a subgraph of
G). Let C be the graph obtained from G by deleting the edges in P0 . Then G = P0 J C and G0 = P00 J C with P00
being the self-application of the rule. Moreover, EC \ EP0 = ; and EC \ EP00 = ;.

Throughout the remainder of the paper we assume that all rules are uniform.

3     Termination of Graph Rewriting Systems
By a graph rewriting system (GRS) we simply mean a set of graph rewriting rules (see Section 2). A GRS R is said
to be terminating if there is no infinite sequence G1 ! G2 ! · · · . Such sequences, whether finite or not, are called
derivations.
Since there is no node creation (neither node deletion) in our notion of rewriting, any derivation starting from a graph
G will lead to graphs whose size is the size of G. Since there are only finitely many such graphs, we can decide the
termination for this particular graph G. However, the question we address here is the uniform termination problem (see
Section 1).
Remark 2. Suppose that we are given a strict partial order , not necessarily well founded. If G ! G0 implies G G0
for all graphs G and G0 , then the system is terminating. Indeed, suppose it is not the case, let G1 ! G2 ! · · · be an
infinite reduction sequence. Since there are only finitely many graphs of size of G1 , it means that there are two indices i
and j such that Gi ! · · · ! Gj with Gi = Gj . But then, since Gi Gi+1 · · · Gj , we have that Gi Gj = Gi
which is a contradiction.
A similar argument was exhibited by Dershowitz in [17] in the context of term rewriting. For instance, it is possible to
embed rewriting within real numbers rather than natural numbers to prove termination.
Let us try to prove the termination of our main example (see Subsection 2.1). Rules such as ’Init’ and ’End’ are
“simple”: we put a weight on edge labels ! : ⌃E ! R and we say that the weight of a graph is the sum of the weights of
its edges labels. Set !(↵) = 0, !( ) = 2 and !(T ) = 1. Then, rules ’Init’ and ’End’ decrease the weight by 1 and,
since rule ’Follow’ keeps the weight constant, it means the two former rules can be applied only finitely many times.
Observe that negative weights are no problem with respect to Remark 2.
But how do we handle rule ’Follow’? No weights as above can work.

3.1    A language point of view

Let G ! G0 be a rule application. The set of nodes stays constant. Let us think of graphs as automata, and let us forget
about node labeling for the time being. Let ⌃E be the set of edge labels. Consider a pair of states (nodes), choose one to
be the initial state and one to be the final state. Thus the automaton (graph) defines some regular language on ⌃E . In
fact, the automaton describes n2 languages (one for each pair of states).
Now, let us consider the effect of graph rewriting in terms of languages. Consider an application of the ’Follow’ rule:
                                                                    T   ↵
G ! G0 . Any word to state r that goes through the transitions p ! q ! r can be mapped to a shorter one in G0 via
                 T
the transition p ! r. The languages corresponding to state r contain shorter words. The remainder of this section is
devoted to formalizing this intuition into proper orders on graphs. For that, we will need to count the number of paths
between any two states. Hence, we shall introduce N-rational expressions, that is, rational expression with multiplicity.
See, e.g., Sakarovitch’s book [18] for an introduction and justifications of the upcoming constructions. We introduce
here the basic ideas.

3.2   Formal series

A formal series on ⌃ (with coefficients in N) is a (total) function s : ⌃⇤ ! N. Given a word w, s(w) is the multiplicity
of w. The set of words s = {w 2 ⌃⇤ | s(w) 6= 0} is the support of s. Given n 2 N, let n be the series defined by
n(w) = 0, if w 6= 1, and n(1) = n, where 1 denotes the empty word. The empty language is 0, the language made of
the empty word is 1. Moreover, for a 2 ⌃, the series a is given by a(w) = 0 if w 6= a and a(a) = 1.
Given two series s and P
                       t, their addition is the series s + t given by s + t(w) = s(w) + t(w), and their product is s · t
defined by s · t(w) = u·v=w s(u)t(v). The star operation is defined by s⇤ = 1 + s + s2 + · · · . The monoïd ⌃⇤
being graded, the operation is correctly defined whenever s(1) = 0.
Given a series s, let sk be its restriction to words of length less or equal to k, i.e., sk (w) = 0 whenever |w| > k and
sk (w) = s(w), otherwise.
An N-rational expression on an alphabet ⌃ is built upon the grammar [19]:

                                     E ::= a 2 ⌃ | n 2 N | (E + E) | (E · E) | (E⇤ ).

Thus, given the constructions mentioned in the previous paragraph, any N-rational expression E 2 E denotes some
formal series. To each N-rational expression corresponds
                                                     P     an N-automaton, which is a standard automaton with
transitions labeled by a non empty linear combination ik ni ai with ni 2 N and ai 2 ⌃ for all i  k.

3.3   The language matrix

Let us suppose given an edge label set ⌃E . Let E denote the N-rational expressions over ⌃E . A matrix M of dimension
P ⇥ P for some (finite) set P is an array (Mi,j )i2P,j2P whose entries are in E. Let ME be the set of such matrices.
Given a graph G, we define the matrix MG of dimension NG ⇥ NG as follows: MGi,j = T1 + · · · + T` with T1 , . . . , T`
the set of labels on the transitions between state i and j if such transitions exist, otherwise 0.
Let 1P be the unit matrix of dimension P ⇥ P , that is (1P )i,j = 0 if i 6= j else 1. From now on, we abbreviate the
notation from 1P to 1 if the context is clear. Then, let MG⇤
                                                             = 1 + MG + MG    2
                                                                                 + · · · . Each entry of MG
                                                                                                          ⇤
                                                                                                            is actually an
N-regular expression (see Sakarovitch Ch. III, §4 for instance). The (infinite) sum is correctly defined since for all i, j,
we have the equality (MG )i,j = T1 + · · · + T` . Thus, 1 62 (MG )i,j .
The question about termination can be reformulated in terms of matrices whose entries are languages (with words
counted with their multiplicity). To prove the termination of the rewriting system, it is then sufficient to exhibit some
order > on matrices such that for any two graphs G ! G0 , we have MG        ⇤
                                                                              > MG⇤
                                                                                    0 . To prove such a property in the

infinite class of finite graphs, we will use the notion of “stable orders”.
Recall the ’Follow’ rule and consider the corresponding basic pattern L and its self-application R. Their respective
matrices are:                                        !                      !
                                             0 T 0                 0 0 T
                                   ML = 0 0 ↵            MR = 0 0 ↵ .
                                             0 0 0                 0 0 0
Observe that (MR )13 > (ML )13 . This matrix deals with edges/transitions. In order to consider paths, we need to
compute ML⇤ and MR⇤ that are given by:
                                                      !                      !
                                       1 T T ·↵                    1 0 T
                                   ⇤                        ⇤
                                ML = 0 1          ↵       MR = 0 1 ↵ .
                                       0 0        1                0 0 1
Any word within MR⇤ ’s entries is a sub-word of the corresponding entry in ML⇤ .

Example 3. Consider now a variation of ’Follow’ made of the pattern h A                     T
                                                                                                A
                                                                                                    ↵
                                                                                                        A   and commands
                                                                                        p       q       r

(add_edge(p, T, r), del_edge(p, T, q))i. By setting L0 as its pattern and R0 as its self-application, we get the following
matrices:
                                                                   !                                           !
                          (T ↵ )⇤       T (↵ T )⇤     T ↵( T ↵)⇤                    (T )⇤       0   T ( T )⇤
              ML⇤ 0 =    ↵ (T ↵ )⇤       (↵ T )⇤       ↵( T ↵)⇤         MR⇤ 0 =    ↵ (T )⇤      1   ↵( T )⇤        .
                           (T ↵ )⇤      T (↵ T )⇤       ( T ↵)⇤                      (T )⇤      0    ( T )⇤

Again, words within MR⇤ 0 are sub-words of the corresponding ones in ML⇤ 0 .

3.4   The matrix multiset path order

The order we shall introduce in this section is inspired by the notion of multiset path ordering within the context of term
rewriting (see for instance [13]). However, in the present context of graph rewriting (to be compared with Dershowitz
and Jouannaud’s [14] or with Plump’s [15]), the definition is a bit more complicated. Here, we do not consider an order
on letters as it is done for terms.
Let E be the word embedding on ⌃⇤ , that is, the smallest partial order such that 1 E w, and if u E v, then (u · w E v · w
and w · u E w · v, for all u, v, w 2 ⌃⇤ . This order E can be extended to formal series, that is, the multiset-path ordering,
see Dershowitz and Manna [20] or Huet and Oppen [21].
Definition 4 (Multiset path order). The multiset path order is the smallest partial order on finite series such that

       • if there is w 2 t such that for all v 2 s, v / w, then s E t, and
       • if r E s and t E u, then r + t E s + u.

We write s / t when s E t and s 6= t.
Proposition 2. Addition and product are monotonic with respect to the multiset-path order. Moreover, addition is
strictly monotonic with respect to E, and if r / s, then r · t / s · t and t · r / t · s, whenever t 6= 0 (otherwise, we have
equality).

Proof. Addition is monotonic by definition. Actually, we prove now that it is strictly monotonic. Suppose that r / s.
We prove that r + t / s + t, by induction (see Definition 4). Suppose that there is w 2 s such that for all v 2 r we have
v / w, then r / s. Since r(w) = 0, then (r + t)(w) = t(w) < s(w) + t(w) = (s + t)(w), and we are done. Otherwise,
r = r0 + r1 and s = s0 + s1 with r0 E s0 and r1 E s1 . One of the two inequalities must be strict (otherwise r = s).
Suppose r0 / s0 . By definition, observe that r1 + t E s1 + t. But then, r + t = r0 + (r1 + t) and s = s0 + (s1 + t)
and we apply induction on (r0 , s0 ). As addition is commutative, the result holds.
For the product, suppose that r E s and let t be some series. We prove r · t E s · t; the other inequality t · r E t · s is
similar. Again, we proceed by induction on Definition 4:

       • Suppose there is w 2 s such that for all v 2 r, v / w. ByPinduction on t, if t = 0, r · t = 0 E 0 = s · t.
         Otherwise, t = t0 + v0 for a word v0 . Observe that r · v0 = v2r r(v)v · v0 . Since for all v 2 r, v · v0 / w · v0 ,
         we have r · v0 / w · v0 E s · v0 . Now, r · t = r · (t0 + v0 ) = r · t0 + r · v0 and s · t = s · t0 + s · v0 . By
         induction, r · t0 E s · t0 and since r · v0 E s · v0 , the result holds.
       • Otherwise, r = r0 + r1 . In this case, s · r = s · r0 + s · r1 and t · r = t · r0 + t · r1 . The result then follows by
         induction.

To show strict monotonicity, suppose r / s and again proceed by case analysis. Suppose that there is some w 2 s
such that for all vP2 r, v / w. Since t 6= 0, it contains at least one word v0 such that t = t0 + v0 . By r / s, r · v0 =
P
  v2r r(v)v·v0 /     v2s s(v)v·v0 = s·v0 . Now, r·t0 E s·t0 by monotonicity. Thus r·t = r·t0 +r·v0 /s·t0 +s·v0 = s·t
where the strict inequality is due to strict monotonicity of addition.

Definition 5 (Matrix multiset-path order). Let M and M 0 be two matrices with dimension P ⇥ P . Write M E M 0 if
                                                     k      0 k
for all k |P | and for all (i, j) 2 P ⇥ P , we have Mi,j E Mi,j   .
Corollary 1. The addition and the multiplication are monotonic with respect to the matrix multiset-path order.

Proof. It follows from Proposition 2 since addition and product of matrices are defined as addition and product of their
entries.
3.5   The Rational Embedding Order

Let ⌃ be some fixed alphabet. A finite state transducer is a finite state automaton whose transitions are decorated by
pairs (a, w), a 2 ⌃, w 2 ⌃⇤ . We refer the reader to the book of Sakarovitch [18] for details. To give the intuition, a
transducer works as follows. At the beginning, the current state is set to the initial state of the transducer. The reading
“head” is put on the first letter of the input and the output is set to the empty word. At each step of the computation, given
some state q, some letter a read and some transition q 0 2 (q, (a, w)), the transducer appends w to the output, shifts the
input head to the "right" and sets the current state to q 0 . The computation ends when a) the input word is completely
read in which case the result is the output or b) if there is no transition compatible with the current state and the read
letter. In this latter case, the computation is said to fail. Thus, compared to a finite state automaton whose computations
end on True (a final state) or False (not a final state) given some input word, the transducer output words, thus defining
a relation in ⌃⇤ . Given some transducer ⌧ , if for any word, there is at most one successful computation, the transducer
outputs at most one word, the relation becomes functional. In that case, we denote the function it computes by [⌧ ].
Definition 6 (Rational Embedding Order). Given two regular languages L and L0 on ⌃, write L . L0 if:
       • there is an injective function f : L0 ! L that is computed by a transducer ⌧ and
       • such that |f (w)|  |w|, for every w 2 L0 . Such functions (and the corresponding transducers) are said to be
         decreasing (in [22]).
The transducer ⌧ is said to be a witness of L . L0 .
We say that a transition of a transducer is deleting when it is of the form a | 1 for some a 2 ⌃. A transducer whose
transitions are of the form X | Y , with |Y |  |X|, is itself decreasing. If a path corresponding to an input w passes
through a deleting transition, then |⌧ (w)| < |w|.
In the sequel we will make use of the following results that are direct consequences of Nivat’s Theorem [23] (Prop. 4,
§3).
Proposition 3. Let [⌧ ] : L ! L0 be computed by a transducer ⌧ , and let L00 be a regular language. Then the following
assertions hold.
      1. The restriction ⌧|L00 : L00 \ L ! L0 mapping w 7! ⌧ (w) is computable by a transducer.
                                       00
      2. The co-restriction ⌧ |L : L ! L0 \ L00 mapping w 7! ⌧ (w) if ⌧ (w) 2 L00 and otherwise undefined, is
         computable by a transducer.
      3. The function ⌧ 0 : L ! L0 defined by ⌧ 0 (w) = ⌧ (w) if w 2 L00 and otherwise undefined, is computable by a
         transducer.
Observe that the identity on ⌃⇤ is computed by a transducer (made of a unique initial/final state with transitions a | a
for all a 2 ⌃). Then, the identity on L is obtained by Proposition 3(1,2). Thus we have that . is reflexive. Also, it is
well known that both transducers and injective functions can be composed. Hence, we also have that . is transitive.
Thus, . is a preorder.
However, we do not have anti-reflexivity in general. For instance, we have
                                            L1 = A · (A + B)⇤ . L2 = B · (A + B)⇤ . L1 .
To see this, consider the following transducer (whose initial state is indicated by an in-arrow, whereas the final one by
                             B|A


                                   . This shows that L1 . L2 . Swap ’A’ and ’B’, to see that the reversed relation also holds.
                       A|B
an out-arrow): q   0         q1


                             A|B


It is worth noting that there is a simple criterion to ensure a strict inequality.
Proposition 4. Suppose L1 . L2 has a witness ⌧ : L2 ! L1 . If ⌧ contains one (accessible and co-accessible) deleting
transition, then the relation is strict.

Proof. As before, set L1 < L2 whenever L1 . L2 but not L2 . L1 . Ad absurdum, suppose L1 . L2 . L1 with
a transducer ✓ : L1 ! L2 and ⌧ as above. Then ✓ ⌧ (the composition of the two transducers) defines an injective
function. Let w be the smallest input word from the initial state to a final state through the transition a | 1 in ⌧ . Define
the set
         M <|w| = {u 2 L2 | |u| < |w|}.
For any word u, we have |✓ ⌧ (u)|  |u|. Thus ✓ ⌧ (M