<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Afficient Equivalence-Checking Algorithm for a Model of Programs with Commutative and Absorptive Statements</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vladislav Podymov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Research University Higher School of Economics 3 Kochnovsky Proezd</institution>
          ,
          <addr-line>Moscow 125319</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>85</fpage>
      <lpage>96</lpage>
      <abstract>
        <p>We present an efficient equivalence-checking algorithm for a propositional model of programs with semantics based on (what we call) progressive monoids on the finite set of statements generated by relations of a specific form. We consider arbitrary set of relations for commutativity (relations of the form ab = ba for statements a, b) and left absorption (relations of the form ab = b for statements a, b) properties. The main results are a polynomial-time decidability for the equivalence problem in the considered case, and an explicit description of an equivalence-checking algorithm which terminates in time polynomial in size of programs.</p>
      </abstract>
      <kwd-group>
        <kwd>program models</kwd>
        <kwd>equivalence checking</kwd>
        <kwd>semigroups</kwd>
        <kwd>commutativity</kwd>
        <kwd>left absorption</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The paper addresses the functional equivalence problem for imperative programs: given
two programs, do determine whether they compute the same function. This problem is
known to be undecidable in general (follows from Rice’s theorem [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), but is highly
important to have means to determine if given programs are equivalent: theoretically,
the equivalence problem complexity shows what influence the structure of programs
has on their meaning, i.e. which structural properties can be analyzed and which cannot;
practically, equivalence-checking algorithms can be used in a huge variety of program
analysis problems: optimization, verification, refactoring, obfuscation, and so on — in
which any two programs (e.g. original and transformed, or ideal and real) are tested for
having the same functionality.
      </p>
      <p>
        One way to overcome the negativity of Rice’s theorem is to provide sufficient
conditions for program equivalence. For example, an optimizing compiler performs a
sequence of program transformation steps, each step preserves the original program
functionality, and sufficient equivalence conditions may be used to prove transformation
soundness and provide optimization algorithms known to be sound. The range of sound
and efficient optimization capabilities is impressive: compilers have means to remove
some unreachable or dead code, to restructure control flow graph and its linear code
blocks, to optimize a variety of efficiency characteristics [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Nevertheless, there is still
a long way to go. For example, the block of code
if(P(z)) {x=f(x); y=y+z;} else {x=g(x);x=v+z;}
if(!P(z)) y=y+z; else x=v+z;
and a much simpler block y=y+z; x=v+z; are equivalent, if we assume that the
functions P, f, g are side-effect free and terminate, but simplifications of this kind
are unlikely to be done automatically by modern compilers. To prove these blocks are
equvalent, it is sufficient to use the following properties:
1. the value of P(z) remains unchanged;
2. the order of statements x = f(x); and y = y + z; has no effect on the result,
and the same holds for x = v + z; and y = y + z;
3. the statement x = g(x); is dead if x = v + z; acts right after it, and the same
holds for x = f(x); and x = v + z;
      </p>
      <p>We refer to (2) as commutativity, and (3) as left absorption properties for statements,
and focus on these two properties only. Suppose we have a statement a : (x1, . . . , xn) =
f (y1, . . . , yk): “compute f on arguments y1, . . . , yk, and put the result into x1, . . . , xn”.
Let M od(a) be the set {x1, . . . , xn}, and U se(a) be {y1, . . . , yk}. An easy way to
extract commutativity and left absorption for statements is to analyze their U se- and
M od-sets: if U se(a) ∩ M od(b) = M od(a) ∩ U se(b) = M od(a) ∩ M od(b) = ∅, then
a and b are commutative; if M od(a) ⊆ M od(b) and M od(a) ∩ U se(b) = ∅, then b is
left-absorptive for a.</p>
      <p>To investigate the equivalence problem with respect to these properties we use the
following approach: describe a program model which preserves the original program
syntax but has a much simpler abstract semantics preserving some original semantical
properties (programs in such a model are often called program schemata); propose an
equivalence notion for schemata such that if two given schemata are equivalent, then
the original programs are also equivalent; solve the equivalence problem for schemata,
and use the resulting equivalence-checking algorithm to verify the original program
equivalence.</p>
      <p>
        We consider the model called “propositional programs” (PPs for short) proposed in
[
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. A PP is basically a finite automaton such that: vertices are labeled with
statement symbols and correspond to program control locations in which a current data state
is changed; transitions are labeled with all possible valuations for Boolean condition
symbols; vertices are connected in the same way as in the original program. An
interpretation which defines how a PP operates consists of: a set of abstract data states; a
meaning of each statement symbol as a function for data state modification; a
meaning of each condition symbol as a truth-value marking of data states. In general case
these components are defined in terms of Kripke frames and models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], but this paper
focuses on commutativity and left absorption properties only: in this case data states
are elements of a monoid on the set of statements generated by a set of relations of
the form ab = ba (for commutativity) and ab = b (for left absorption) where a, b are
statement symbols. For short, we call such a monoid a CA-monoid. We call two PPs
equvalent in a CA-monoid M if for each interpretation based on M last data states of
their executions are equal. All necessary definitions are given in the further sections.
      </p>
      <p>
        The main goal of the paper is to provide an efficient equivalence-checking
algorithm for PPs operating in CA-monoids. We call an algorithm efficient if it terminates
in a polynomial time. Unfortunately, even for the trivial case when no commutativity
and left absorption is assumed the equivalence problem for PPs is known to be
co-NPcomplete [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. But for a fixed finite signature (alphabet of statements and conditions) for
the trivial case PPs are known to be closely related to finite automata [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and have an
equivalence-checking algorithm having a very low time complexity [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Thus, hereafter
a fixed finite signature is assumed, which means that we study the hardness with respect
to structural program features. Another unpleasant fact is that equivalence-checking
results for automata and PPs are not interchangeable for nontrivial cases, for example, for
arbitrary commutativity the equivalence problem for automata is known to be
undecidable [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], while for schemata it is decidable in polynomial time for a syntactically and
semantically more general case [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        CA-monoids considered in this paper are assumed to be progressive: we cannot
obtain the same element twice appending non-neutral element to the right via composition.
Note that for left-cancellative monoids the progressiveness is the same as the atomicity
of the neutral element. The main result of the paper is an improvement of known
decidability of the equivalence problem in the considered case [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] to polynomial-time
decidability (theorem 2), and an explicit description of a polynomial-time decision algorithm
(section 5). Note that if commutativity and left absorption properties are extracted in the
way showed above, then the resulting CA-monoid is necessarily progressive. Moreover,
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] provides a simple-to-check progressiveness criterion for CA-monoids.
      </p>
      <p>
        To make a few remarks on related work, we may mention polynomial-time
decidability results on equivalence problems for one-counter automata [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
(NLcompleteness), specific context-free languages [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ], specific recursive schemata
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and decidability results for multitape automata [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and k-valued transducers [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
Note that there are much more results on the equivalence problem and quite a bit of them
are mentioned.
      </p>
      <p>The structure of the paper is as follows. Section 2 provides basic definitions.
Section 3 discusses the main idea of the analysis technique and basic properties of the
main object to be analyzed — a graph of consistent computations (GCC). The result
of section 4 is a deep analysis of a GCC which allows to describe an polynomial-time
equivalence-checking algorithm. The main theorem and the algorithm description are
given in section 5.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Basic Definitions</title>
      <p>Hereafter A and L are finite alphabets of basic assignment statements, or actions for
short, and logical conditions respectively. A propositional program (PP)
π = (L, en, ex, T, B)
consists of: a finite set of control locations L including entry en and exit ex; a transition
function T : (L \ {ex}) × L → L; a binding function B : L → A. We write l −→δπ l′
and l →π l′ instead of T (l, δ) = l′ for readability. We define the size |π| of a PP π to be
the number of its control locations: |π| = |L|. The meaning of a PP is briefly discussed
in the introduction except of the following: a logical condition may be understood as
a valuation for all Boolean conditions, which means that exactly one logical condition
holds for each abstract data state.</p>
      <p>
        To define semantics of PPs in general case we should use Kripke frames and
models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], but in the considered case (commutativity and left absorption for actions) we
simplify it using the notation of semigroup theory. A semigroup (S, ∗) consists of
an arbitrary set S of elements, and a binary associative operation on S. A monoid
M = (S, ∗, ε) is a semigroup (S, ∗) together with its neutral element ε. To define how
a PP operates we use a monoid on the set of actions: each action from A is an element of
the monoid, and each other element s can be decomposed into actions (s = a1 ∗ · · · ∗ ak,
k ≥ 0, ai ∈ A for 1 ≤ i ≤ k). We write [a1 . . . ak] to denote such s. Hereafter we use
only monoids on the set of actions. An interpretation I = (M, ξ) for a PP consists of a
monoid M = (S, ∗, ε), and a valuation ξ : S → L for logical conditions.
      </p>
      <p>A path for a program π is a sequence of program locations of the form l1 →π l2 →π
. . . . For paths pt′, pt′′, by pt′ → pt′′ we denote a path which starts with a path pt′ and
continues with a path pt′′. A path is called a trace if l1 is the entry of π. A path is full if
δ1
it either is infinite or leads to the exit of π. A full trace is called a run. A trace l1 −→π
l2 −δ→2 . . . is called an I-trace for an interpretation I if δi = ξ([B(l1) . . . B(li)]) for
i ≥ 1. To simplify the notation, we write B(l1, . . . , li) along with B(l1) . . . B(li), and
[l1, . . . , l1] along with [B(l1, . . . , li)]. The result of a finite I-run rn is the element [rn].
An infinite I-run loops and has the indefinite result ⊥. Two programs are I-equivalent
if the results of their I-runs are equivalent. For a monoid M two programs are M
equivalent if they are I-equivalent for each interpretation I = (M, ξ).</p>
      <p>The equivalence problem investigated in this paper is the M -equivalence problem
for PPs where M is a monoid: given two PPs, to check if they are M -equivalent. The
problem is solved for specific monoids M , which we call progressive CA-monoids. A
monoid is progressive if for each triple of its elements s, s1, s2, if s1 6= ε or s2 6= ε,
then s 6= s ∗ s1 ∗ s2 Note that the elements of a progressive monoid are partially ordered
w.r.t. its operation, i.e. s1 ∗ s2 6= s1 for s2 6= ε. In other words, if we introduce the
following order ≤: s1 ≤ s2 iff ∃s. s2 = s1 ∗ s — then a monoid is ordered iff ≤ is
a non-strict partial order. Along with ≤ we will use naturally induced orders &lt;, ≥, &gt;,
as well as s1ks2 to denote the incomparability of s1, s2. A CA-monoid is a monoid
generated by an arbitrary set of relations of the form ab = ba and ab = b (a, b ∈ A);
the former one is a commutativity relation, the latter is a (left) absorption relation. A
monoid generated by a set of relations can be defined as follows:
– each element is a set of words over the alphabet A;
– g ∈ [h] iff g can be obtained from h with a finite number of rewritings using
generating relations (a subword from the left part of a relation to the right part and
vice versa).</p>
      <p>The main result of the paper is the polynomial-time decidability of the M
equivalence problem for PPs for an arbitrary progressive CA-monoid M together with
an explicit description of a decision algorithm. The description of the algorithm depends
on a description of M , i.e. on a set of commutativity and absorption relations defining
M . The algorithm is polynomial in time in size of input programs.</p>
    </sec>
    <sec id="sec-3">
      <title>Graph of Consistent Computations</title>
      <p>To simplify notation, hereafter we assume an arbitrary progressive CA-monoid M =
(S, ∗, ε) to be defined, as well as PPs
and
π1 = (L1, en1, ex1, T1, B1)
π2 = (L2, en2, ex2, T2, B2).</p>
      <p>
        The main idea of how we analyze the M -equivalence problem for PPs is similar to
those developed in [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and can be described as follows. Suppose we have
an interpretation I = (M, ξ). We construct the I-traces tr1, tr2 of π1 and π2 in the
following manner: we start with one-element I-traces of π1, π2; if [tr1] = [tr2], then
we add the next location to both traces; if [tr1] &gt; [tr2], then we add the next location to
tr2; otherwise we add the next location to tr1. If any of the traces tri is an I-run, then
we do not add anything to it. As the result we have a sequence of pairs of traces leading
either to the I-runs of tr1, tr2, or to an I-run and an I-trace. Consider any such pair
(tr1 → l1, tr2 → l2). To find what results can be obtained with computations starting
in these traces, all we need to know is last program locations of these traces and data
states they lead to. Thus, we replace this pair with a tuple (l1, l2, [tr1 → l1], [tr2 → l2]).
      </p>
      <p>Hereafter we write ⌊s⌋ to denote the set of all minimal members of s, s ∈ S: {g |
g ∈ s, |g| = min |u|} — and ksk to denote the length of words in ⌊s⌋. Also, we write
u∈s
⌊h⌋ instead of ⌊[h]⌋, and call a word h ∈ A∗ a minimal member if h ∈ ⌊h⌋.</p>
      <p>Let (l1, l2, s1, s2) be a tuple as described above, and s be a longest common prefix
s of s1, s2: s ∗ s′1 = s1; s ∗ s′2 = s2; ksk has a maximal possible value. To check the
equivalence of programs, we do not need the results themselves, but to check if the
results are equal. Thus, we replace s1, s2 with s′1, s′2 in the tuple. Due to the construction
rules for tuples, ks′2k ≤ 1, which means that s′2 = [α] for α ∈ A∪{λ}. Then we replace
(l1, l2, s1, s2) with a reduced tuple (l1, l2, s′1, α). Looking over all possible
interpretations I, we connect all possible tuples in a graph Γ , which we call a graph of consistent
computations (GCC) in the following way. The vertices of Γ are all possible tuples
(l1, l2, s, α), where l1, l2 are locations of π1, π2, s is a data state, and α ∈ A ∪ {λ}.
If there exists an interpretation I such that tuples t1, t2 are what we get from adjacent
pairs of traces obtained with the construction rules for I, then t1 and t2 are connected
with an edge. To eliminate the enumeration of all interpretations, we define Γ using M ,
π1, and π2 only, and then prove that to analyze Γ is sufficient to understand whether
π1, π2 are M -equivalent.</p>
      <p>To explain the term “consistent”, we define the notion of consistency for traces of
PPs in the following way. Traces of π1, π2 are (M -)consistent if they are I-traces for
some interpretation I = (M, ξ). Thus, an alternative description of the M -equivalence
problem for PPs is: given two PPs, to check if for each pair of M -consistent runs their
results are equal. The following proposition shows how to check trace consistency
without enumerating of all possible interpretations.
if [tr1′] = [tr2′], then ∃δ. tr1′ −→δπ1 l1 and tr2′ −→δπ2 l2.</p>
      <p>
        Proposition 1 ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). Let M be an progressive monoid, and tr1, tr2 be traces of π1, π2.
Then tr1, tr2 are M -consistent iff for any their prefixes tr1′ →π1 l1, tr2′ →π2 l2 holds:
      </p>
      <p>Now we give an alternative definition of Γ , which does not require enumeration
of all interpretations. The nodes of Γ are 4-tuples of the form v = (l1, l2, s, α) where
li ∈ Li, s ∈ S, and α ∈ A ∪ {λ}. The root of Γ is the node (en1, en2, s, α) where:
if [B1(en1)] ≤ [B2(en2)], then s = ε, otherwise s = [B1(en2)]; if [B2(en2)] ≤
[B1(en1)], then α = λ, otherwise α = B2(en2). Some of the nodes (l1, l2, s, α) are
terminal and have no outgoing edges:
1. l1 = ex1, α 6= λ;
2. l2 = ex2, s 6= ε;
3. l1 = ex1, l2 = ex2, and either s1 6= ε, or α 6= λ.</p>
      <p>For cases (1), (2) (which are not mutually exclusive, but exclude (3)) we call a
node disproving (the equivalence of PPs); Edges outgoing from a nonterminal node
v = (l1, l2, s, α) carry the following labels (one edge for each δ ∈ L): (δ, δ) if l1 6= ex1,
l2 6= ex2, and s = [α] = ε; (ǫ, δ) if l2 6= ex2, α = λ, and either l1 = ex1 or s 6= ε;
(δ, ǫ) otherwise. Here ǫ is a special symbol, ǫ ∈/ L. To finish the definition of Γ , we
(σ1,σ2)
describe the node v′ = (l1′, l2′, s′, α′) s.t. v −−−−→ v′. If σ1 = ǫ, then l1′ = l1 and
s′′ = s, otherwise l1 −σ→1 l1′ and s′′ = s ∗ [B1(l1′)]. If σ2 = ǫ, then l2′ = l2 and α′′ = α,
σ2
otherwise l2 −→ l2′ and α′′ = B2(l2′). If s′′ = [α] ∗ s′′′, then s′ = s′′′ and α′ = λ,
othewise s′ = s′′ and α′ = α′′.</p>
      <p>As the GCC is defined, we should prove that it contains as many traces as needed to
check if π1, π2 are M -equivalent. To do it, we show the correspondence between paths
in Γ and traces of π1, π2 (lemmas 1–3), and then show what paths should we search for
to check whether π1, π2 are M -equivalent (theorem 1).</p>
      <p>To shorten the notation, we give three more definitions. An r-path is a path in Γ
originating from its root. Projections pr1(ω), pr2(ω) of an r-path ω are defined as
follows. pri((en1, en2, s, α)) = eni. Let ω = ω′ −(−σ1−,σ−2→) (l1, l2, s, α). Then:
– pri(ω) = pri(ω′) if σi = ǫ;
– pri(ω) = pri(ω′) −σ→i li otherwise.</p>
      <p>For an infinite r-path ω we define pri(ω) as the shortest r-path s.t. for each finite
prefix ω′′ of ω holds: pri(ω′′) is a prefix of pri(ω). An r-path ω is disproving in two
cases: it leads to a disproving node; it is infinite, and for some its (infinite) tail ωt and
some i ∈ {1, 2} holds: for a label (σ1, σ2) of any edge of ωt holds σi = ǫ, and for any
node (l1, l2, s, α) of ωt the location exi is reachable from li.</p>
      <p>Lemma 1. Let ω be an r-path. Then pr1(ω) and pr2(ω) are consistent traces of π1, π2.
Moreover, if ω leads to a node (l1, l2, s, α), then the following holds: there exists s′ ∈ S
s.t. s′ ∗ s = [pr1(ω)] and s′ ∗ [α] = [pr2(ω)]; if s 6= ε and α 6= λ, then s and [α] are
incomparable.</p>
      <p>Lemma 2. Let rn1, rn2 be consistent runs of π1, π2. Then there is an r-path ω s.t.
pri(ω) is a prefix of rni (for i ∈ {1, 2}), and either ω is infinite, or it leads to a
terminal vertex.
Lemma 3. Let ω be an r-path, i ∈ {1, 2}, and rni = pri(ω) →πi pt be a run of πi.
Then rni and pr3−i(ω) are consistent traces.</p>
      <p>Lemmas 1, 2 follow from proposition 1, the left-cancellativity of M , and the
definition of Γ . To prove lemma 3 by contradiction, we note that if rni and pr3−i(ω) are
not consistent, then due to proposition 1 and lemma 1 there exist a proper prefix pt′ of a
path pt and a proper prefix tr3′−i of pr3−i(ω) s.t. [pri(ω) →πi pt′] = [tr3′−i]. Then by
lemma 1 there does not exist a prefix ω′ of ω s.t. pri(ω) = tr3′−i, which is impossible
by the definition of pri.</p>
      <p>Theorem 1. PPs π1, π2 are M -equivalent iff Γ contains no disproving paths.
Proof. Necessity. If Γ contains a finite disproving path ω, then by lemma 3 the traces
pr1(ω), pr2(ω) can be extended to consistent runs of π1, π2, and by lemma 2 the results
of these runs are distinct. If Γ contains an infinite disproving path ω and i-th
components of edge labels of some its tail are equal to ǫ, then the trace pri(ω) can be extended
to a finite run rni of πi, while pr3−i is an infinite run of π3−i, and by lemma 3 the runs
rni and pr3−i are consistent and have distinct results.</p>
      <p>Sufficiency. If rn1, rn2 are consistent runs of π1, π2 with distinct results, then by
lemma 2 we obtain an r-path ω which is either finite and leads to a disproving vertex
due to lemma 1, or infinite, in which case exactly one of runs rn1, rn2 is infinite, and
ω is an infinite disproving path.</p>
      <p>Note that theorem 1 does not solve the M -equivalence problem for PPs, as Γ is
infinite in general case. The next section provide means to build a polynomial-time
traversal of Γ sufficient to solve the equivalence problem.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Analysis of Graph of Consistent Computations</title>
      <p>The main idea of the further GCC analysis is to partition its nodes into a finite number
of classes and prove that during the traversal it is sufficient to visit polynomially many
nodes in each class, and after it we state either that π1 and π2 are not M -equivalent
(lemma 7) or that other nodes in the class may be ignored in the further traversal (lemma
8). To do it, we introduce and analyze the notion which we called an absorption effect.
An (absorption) effect induced by an element s ∈ S is a function aes : S → S showing
how the elements of S are changed under “pressure” of absorption of s appended to the
right. Formally, aes(s1) = arg min ks2k. To prove the soundness of the definition,
s2:s1∗s=s2∗s
we state a (what we call) commutativity of minimals: if h, g are minimal members of
the same element of M , then g can be obtained from h with commutativity rewritings
only (without any absorption). The soundness is provided by the following lemma.
Lemma 4. An equation X ∗ s = s′ has at most one solution s′′ on X s.t. ks′′k has
minimal possible value among all solutions.</p>
      <p>Proof. Suppose we have at least two such solutions: s′1′, and s′′: s′1′ ∗ s = s′ = s′2′ ∗ s.
2
Let h1 ∈ ⌊s′1′⌋, h2 ∈ ⌊s′2′⌋, and h ∈ ⌊s⌋. Due to the commutativity of minimals, we
can obtain h2h from h1h using commutativity relations only. By symmetry of
commutativity relation, we get the derivation of h−h2− from h−h1− (g− is a reversed g).
By the left-cancellativity of M and the same symmetry, h1 can be rewritten to h2 with
commutativity only, which means s′1′ = s′2′.</p>
      <p>The next step is to introduce a partial order on the set of all effects (which we denote
by ): ae1 ≤ ae2 iff there exist s1, s2 ∈ S s.t. s1 ≤ s2, ae1 = aes1 , and ae2 = aes2 . To
prove that ≤ is a partial order, we state one more algebraic property of M , which we
call the inductive minimality: if h is a minimal member, a ∈ A, and [ah] 6= [h], then ah
is a minimal member.</p>
      <sec id="sec-4-1">
        <title>Lemma 5. The relation ≤ over is a non-strict partial order.</title>
        <p>Proof. By the inductive minimality, an effect ae determines a set of symbols deleted
from a minimal member: if ae(s) = s′ and h ∈ ⌊s⌋, then there exists h′ ∈ ⌊s′⌋ s.t. h′
can be obtained from h by deleting some symbols. If there exist ae1, ae2 s.t. ae1 ≤ ae2
and ae2 ≤ ae1, then ae1 deletes at least those symbols which are deleted by ae2 and vice
versa. It means that for any minimal member h the effects ae1 and ae2 delete the same
symbols from h, and thus ae1([h]) = ae2([h]).</p>
        <p>Another pleasant property of the effects is that they can be (in some sense) computed
by a finite automaton. It means that the set is finite, and that having a short description
of an effect aes and an action a, we can easily compute the description of aes∗[a]. An
automaton A = (Q, q0, TA, SA, MA) (over the alphabet A) consists of: a finite set of
states Q including an initial state q0; a transition function T : Q × A → Q; a set of
labels SA; a marking function MA : Q → SA. We denote by A(q, h) a state to which A
jumps reading the word h ∈ A∗, and use the following shortenings: A(h) is A(q0, h);
MA(q, h) is MA(A(q, h)); MA(h) is MA(A(h)). The notion of automaton used in this
paper is a minor generalization of a well-known finite deterministic acceptor with the
only difference: instead of answers “yes” and “no”, here SA is the set of all possible
answers. We put the following meaning in the phrase “to compute an effect”: |SA| = ||;
ae[h] = ae[g] iff MA(h) = MA(g); if [h] = [g], then A(h) = A(g). We call an automaton
satisfying all these properties an ae-automaton. Note that intuitively the best way to
define such an automaton is to say SA = , but formally the set cannot be used explicitly
in any algorithm as it contains functions with an infinite domain. The description of an
ae-automaton is based on an automata-based recognition of absorption: there exists an
automaton with the following two properties: reading an input word h, it enters a state
labelled with the set {a ∈ A | [a] ∗ [h−] = [h−]}; reading words h, g s.t. [h−] = [g−],
it enters the same state.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Lemma 6. For any progressive CA-monoid there exists an ae-automaton.</title>
        <p>
          Proof. We start with an automaton A stated by the automata-based recognition of
absorption. Then we divide the states of A into equivalence classes: q and q′ are equal iff
∀h ∈ A∗. MA(q, h) = MA(q′, h). Note that we have a finite number N of such classes.
Changing the labels of A to {1, . . . , N }, enumerating the equivalence classes, and
assigning the label i to the i-th class, we get an automaton A′ s.t. MA′ (h) = MA′ (g)
iff aeh− = aeg− . Then by changing the labels again we construct exactly N finite-state
acceptors, the i-th of which says if the original answer is i. Using classical
automatatheoretic results [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], we reverse the language of each acceptor, construct their Cartesian
product and reassign labels back to {1, . . . , N }, and the result is exactly a required
ae-automaton: MA′ (h) = MA′ (g) iff aeh = aeg, and all states were constructed for
elements of S, thus all parts of the definition are satisfied.
        </p>
        <p>Hereafter we assume to be given an ae-automaton A = (Q, q0, TA, SA, MA).
By the definition of A, we write A([h]) along with A(h). The last two notions
required for the analysis of Γ are an effect evolution and a generalized effect
evolution. Intuitively, these notions show how an absorption effect is modified during
a program run from some intermediate run location. An effect evolution ae˙ sh along
the word h = a1 . . . ak ∈ A∗ induced by s ∈ S is defined as follows. We start
with the sequence (aes, aes∗[a1], . . . , aes∗[h]). Then we delete adjacent duplicates. The
result is exactly ae˙ sh. Note that any evolution is a chain in a finite (lemma 6)
partially ordered (lemma 5) set, and thus the number of evolutions is finite. A
generalized effect evolution ae¨qh along the word h = a1 . . . ak ∈ A∗ induced by a
state q ∈ Q is defined in the same way except we start with the sequence of pairs
((aeε, aeMA(q)), (ae[a1], aeMA(q,a1)), . . . , (ae[h], aeMA(q,h))). For the same reasons the set
of all generalized evolutions is finite.</p>
        <p>There is an important property of evolutions. For any full path pt for a PP π =
(L, en, ex, T, B) and any s ∈ S there exists a full path pt′ (which we call reduced) s.t.:
ae˙ sB(pt) = ae˙ sB(pt′); if pt is finite, then |pt′| ≤ |L| · |Q|; if pt is infinite, then pt′ = pt′1′ →
pt′2′ → pt′2′ → . . . where |pt′1′| + |pt′2′| ≤ |L| · |Q|. To prove the existence of a reduced
path pt′, it is sufficient to: mark pt with states of A starting with A(s) modifying it with
the actions of pt; delete loops between two repeated pairs of a program location and
an automaton state; for an infinite path, replace an infinite tail corresponding to the last
effect in the evolution with the repetition of a loop described above. The same property
holds for generalized evolutions with an upper bound |L| · |Q|2 instead of |L| · |Q|. The
following lemmas are the result of the analysis of Γ , as they provide a finite partitioning
of the nodes of Γ required in the equivalence-checking algorithm.</p>
        <p>Lemma 7. Let: q ∈ Q; l1 ∈ L1; l2 ∈ L2; α ∈ A ∪ {λ}; m = |Q|3 · |π1| · |π2|2 + 2;
pairwise-distinct GCC nodes (l1, l2, si, α), i ∈ {1, . . . , m}, be reachable from the GCC
root; A(s1) = · · · = A(sm) = q; li → pti be a full path of πi (i ∈ {1, 2}), and at
least one of these paths is finite; ae¨B1(pt1) = ((ae11, ae21), . . . , (ae1k, ae2k)); ae˙ αB2(pt2) =
q
(ae˙ 1, . . . , ae˙ p); 1 ≤ k′ ≤ k; the elements ae1k′ (si), i ∈ {1, . . . , m}, be pairwise distinct;
{ae1, . . . , aep} ∩ {ae2k′+1, . . . , ae2 } = ∅.</p>
        <p>k</p>
        <p>Then π1 and π2 are not M -equivalent.</p>
        <p>Proof. The main idea is to show that at least for one of the nodes wj traces of a
projection of a path ωj leading to this node can be expanded into consistent runs with
different results. We show in details only one case (for other cases a proof technique
works in the same way, though it looks a bit more complex): pt1 and pt2 are finite,
k′ = k. W.l.o.g. we assume pt1 and pt2 to be reduced paths, thus |pti| ≤ |πi| · |Q|3−i.
By proposition 1 the only way for the runs not to be consistent is if their proper prefixes
lead to a common element s, and next steps are made with distinct conditions δ1, δ2
— we call it a collision. By lemma 3, for each ωp these prefixes are not shorter than
the projections of ωp. Thus, we have [pr1(ωp) → pt′1] = [pr2(ωp) → pt′2] for some
proper prefixes pt′1, pt′2 of pt1 and pt2. But as the elements ae1k(sp), ae1k(sr) are distinct
for p 6= r, the elements [pr1(ωp) → pt′1] and [pr1(ωp) → pt′1] are also distinct (as
M is left-cancellative), while [pr2(ωp) → pt′2] = [pr2(ωr) → pt′2]. Thus, a collision
for different indexes p, r cannot happen for the same prefixes pt′1, pt′2 simultaneously.
On the other hand, we have at most |pt1| · |pt2| ≤ |π1| · |π2| · |Q|3 possible pairs of
lengths for these prefixes, which means that at least for two indexes j′, j′′ there are
no collisions: in other words, expansions for these indexes are consistent. Based on the
same thought about distinction of ae1k(sj′ ), ae1k(sj′′ ) as above for p, r, we conclude that
at least for one of indexes j′, j′′ the results of computations are distinct.
Lemma 8. Let: q ∈ Q; l1 ∈ L1; l2 ∈ L2; α ∈ A ∪ {λ}; m = |Q|3 · |π1| · |π2|2 + 2;
pairwise-distinct GCC nodes vi = (l1, l2, si, α), i ∈ {1, . . . , m}, are reachable from
the GCC root; A(s1) = · · · = A(sm) = q; li → pti is a full path of πi, i ∈ {1, 2};
ae¨qB1(pt1)] = ((ae11, ae21), . . . , (ae1k, ae2k)); ae˙ αB2(pt2) = (ae1, . . . , aes); 1 ≤ k′ &lt; k, 1 ≤
s′ &lt; s; the elements ae1k′ (si), i ∈ {1, . . . , m}, be pairwise different; ae1k′+1(s1) =
· · · = ae1k′+1(sm); {ae1, . . . , aes′ } ∩ {ae2k′+1, . . . , ae2 } = ∅; aes′+1 ∈ {ae2k′+1, . . . , ae2k};
k
ω be a GCC r-path leading to vm; pr1(ω) → pt1 and pr2(ω) → pt2 be consistent runs
having distinct results.</p>
        <p>Then there exists j, 1 ≤ j &lt; m, s.t. vj belongs to some disproving r-path.</p>
        <p>We prove lemma 8 using the same technique as for lemma 7, so due to lack of space
we omit the proof.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Equivalence-Checking Algorithm</title>
      <p>The resulting equivalence-checking algorithm is to traverse Γ in any way. During the
traversal visited nodes are partitioned into classes defined by a state q, locations l1, l2,
and a symbol α (as in lemmas 7, 8): the nodes (l1, l2, s, α) and (l1′, l2′, s′, α′) belong
to the same class iff l1 = l1′, l2 = l2′, A(s) = A(s′), and α = α′. If the traversal is
finished, then the absence of disproving paths can be checked directly, and the answer
is given by theorem 1. If (|Q|3 · |π1| · |π2|2 + 2) nodes are visited in any class, then either
the non-equivalence is stated (lemma 7), or all other nodes in the class are ignored from
now on (lemma 8).</p>
      <p>Let N be the total program size: N = |π1| + |π2|. The number of distinct node
classes is O(N 2). Each class contains O(N 3) nodes. Thus, no more than O(N 5) nodes
are visited during the traversal. A minimal member g ∈ ⌊h⌋ can be constructed in
time O(|h|) using the inductive minimality and lemma 6. Checking if [h] ≤ [g] can be
done in time O((|h| + |g|)2): construct a word g′ ∈ ⌊g⌋; if h = λ, then [h] ≤ [g];
otherwise h = ah′, where a ∈ A; if [ag] = [g] (it can be checked with A), then delete
a from h and repeat the procedure; otherwise check whether g contains a letter a; if
not, then [h] [g]; if yes, then pick the leftmost occurence of a in g: g = g′ag′′; if
each symbol from g′ commutes with a, then delete marked a from h and g and repeat
the procedure; otherwise [h] [g]. The soundness of the latter algorithm is proved in
the appendix. The elimination of a longest prefix can be performed in a similar way,
the difference is that we pick not the leftmost symbol of h but every symbol which can
be shifted to the leftmost position with commutativity relations, and every completion
is successfull. Each element of M can be encoded as any its member (e.g. constructed
by a computation). During the traversal the algorithm constructs words h of length at
most O(N 5), and each element comparison can be done in time O(N 10). For each
vertex the algorithm does at most O(N 3) comparisons of elements of M , and this is the
bottleneck of the algorithm. Thus, the algorithm performs O(N 5) steps, and each step
is done in time O(N 13), and the total time is O(N 18). The last upper bound together
with theorem 1 and lemmas 7, 8 states the validity of the main theorem of this paper.
Theorem 2. Let M be a progressive CA-monoid on a finite set of actions. Then the
M -equivalence problem for propositional programs is decidable in time polynomial in
size of programs.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>We want to present three thoughts concluding the paper. The first one is that the
considered case: propositional programs, commutativity, and left absorption — appears to
be simple enough to have a polynomial-time decidability for the equivalence problem.
The second one is that (nevertheless) this case is complex enough to require a rather
nontrivial technique for its analysis. Finally, the third one is that it was mentioned in the
introduction that there is still a long way to go, and this statement remains unchanged,
as even in a simple example in the introduction we need not only commutativity and
absorption (but also dependencies between statements and conditions) to prove the
equivalence. To be able to prove the equivalence of “real” fragments of code, we should
consider other properties of program primitives, which is the topic of future research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>H.G.</given-names>
          </string-name>
          :
          <article-title>Classes of recursively enumerable sets and their decision problems</article-title>
          .
          <source>Trans. Amer. Math. Soc</source>
          .
          <volume>89</volume>
          ,
          <fpage>25</fpage>
          -
          <lpage>59</lpage>
          (
          <year>1953</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aho</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sethi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ullman</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          : Compilers: Principles, Techniques and
          <string-name>
            <given-names>Tools. Pearson</given-names>
            <surname>Education</surname>
          </string-name>
          , Inc., New York (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          :
          <article-title>An efficient and unified approach to the decidability of equivalence of propositional program schemes</article-title>
          .
          <source>Lect. Notes Comput. Sci</source>
          .
          <volume>1443</volume>
          ,
          <fpage>247</fpage>
          -
          <lpage>258</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>I.M.</given-names>
          </string-name>
          :
          <article-title>On the equivalence checking problem for a model of programs related with muti-tape automata</article-title>
          .
          <source>Lect. Notes Comput. Sci</source>
          .
          <volume>3317</volume>
          , 293âA˘ S¸ -
          <volume>305</volume>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Harel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tiuryn</surname>
            <given-names>J.: Dynamic</given-names>
          </string-name>
          <string-name>
            <surname>Logic</surname>
          </string-name>
          . The MIT Press (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hunt</surname>
            ,
            <given-names>H.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Constable</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>On the computational complexity of program scheme equivalence</article-title>
          .
          <source>SIAM Journal of Computing. 9, N</source>
          <volume>2</volume>
          ,
          <fpage>396</fpage>
          -
          <lpage>416</lpage>
          (
          <year>1980</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Rutledge</surname>
          </string-name>
          , J.D.:
          <article-title>On Ianov's program schemata</article-title>
          .
          <source>J. Assoc. Comp. Mach. 11</source>
          ,
          <issue>1</issue>
          -
          <fpage>9</fpage>
          (
          <year>1964</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hopcroft</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motwani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ullman</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Introduction to automata theory, languages, and computation</article-title>
          . Addison-Wesley, Boston (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kozen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Kleene Algebra withTests and Commutativity Conditions</article-title>
          . In:
          <article-title>Tools and Algorithms for Construction and Analysis of Systems</article-title>
          , Second International Workshop, pp.
          <fpage>14</fpage>
          -
          <lpage>33</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          :
          <article-title>On the decidability of the equivalence problem for monadic recursive programs</article-title>
          .
          <source>Theoretical Informatics and Applications. 34, N</source>
          <volume>2</volume>
          ,
          <fpage>157</fpage>
          -
          <lpage>171</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Glushkov</surname>
            ,
            <given-names>V.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Letichevskii</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          :
          <article-title>Theory of algorithms and discrete processors</article-title>
          .
          <source>Adv. Inf. Syst. Sci. 1</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>58</lpage>
          (
          <year>1969</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Podymov</surname>
            ,
            <given-names>V.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          :
          <article-title>On a semigroup model of sequential programs specified by means of two-tape automata</article-title>
          . Belgorod State University Scientific Bulletin.
          <source>History. Political science. Economics. Information technologies</source>
          .
          <volume>7</volume>
          (
          <issue>78</issue>
          ),
          <source>N 14/1</source>
          ,
          <fpage>94</fpage>
          -
          <lpage>101</lpage>
          (
          <year>2010</year>
          )
          <article-title>(in Russian)</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Bohm</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goller</surname>
            ,
            <given-names>S</given-names>
          </string-name>
          , Jancar,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Equivalence of deterministic one-counter automata is NLcomplete</article-title>
          .
          <source>Proceedings of the forty-fifth annual ACM symposium on Theory of computing</source>
          . 131-
          <fpage>140</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Debski</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fraczak</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Concatenation state machines and simple functions</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          .
          <volume>3317</volume>
          ,
          <fpage>113</fpage>
          -
          <lpage>124</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Bastien</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Czyzowicz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fraczak</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rytter</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Prime normal form and equivalence of simple grammars</article-title>
          .
          <source>Theoretical Computer Science. 363, N</source>
          <volume>2</volume>
          ,
          <fpage>124</fpage>
          -
          <lpage>134</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Harju</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karhumäki</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>The equivalence problem of multitape finite automata</article-title>
          .
          <source>Theoretical Computer Science. 78, N .2</source>
          , pp.
          <fpage>347</fpage>
          -
          <lpage>355</lpage>
          (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. de Souza, R.:
          <article-title>On the decidability of the equivalence for k-valued transducers</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          .
          <volume>5257</volume>
          ,
          <fpage>252</fpage>
          -
          <lpage>263</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>