=Paper= {{Paper |id=Vol-2718/paper13 |storemode=property |title=Pumping Deterministic Monotone Restarting Automata and DCFL |pdfUrl=https://ceur-ws.org/Vol-2718/paper13.pdf |volume=Vol-2718 |authors=František Mráz,Dana Pardubská,Martin Plátek,Jiří Šíma |dblpUrl=https://dblp.org/rec/conf/itat/MrazPPS20 }} ==Pumping Deterministic Monotone Restarting Automata and DCFL== https://ceur-ws.org/Vol-2718/paper13.pdf
           Pumping Deterministic Monotone Restarting Automata and DCFL

                                   František Mráz1 , Dana Pardubská2 ∗, Martin Plátek1 †, and Jiří Šíma3 ‡
                                             1 Charles University, Department of Computer Science

                                            Malostranské nám. 25, 118 00 PRAHA 1, Czech Republic
                                        martin.platek@mff.cuni.cz, frantisek.mraz@mff.cuni.cz
                                    2    Comenius University in Bratislava, Department of Computer Science
                                                  Mlynská Dolina, 84248 Bratislava, Slovakia
                                                      pardubska@dcs.fmph.uniba.sk
                                        3   Institute of Computer Science of the Czech Academy of Sciences,
                                                      P. O. Box 5, 18207 Prague 8, Czech Republic
                                                                    sima@cs.cas.cz

Abstract: We introduce a new type of the deterministic                           R-automaton ends when during scanning the tape it enters
monotone restarting automaton that enables new types of                          a halting state that can be either accepting, in which case it
characterization of the class of deterministic context-free                      accepts the input word, or rejecting, in which case it rejects
languages (DCFL) based on pumping. The characteriza-                             the input word.
tion is obtained through new types of normalizations of                             The first fundamental result on restarting automata was
deterministic monotone restarting automata. This paper is                        a characterization of the class of deterministic context-free
the first step to prepare notions for studying the relation                      languages (DCFL) by a subclass of deterministic mono-
between restarting automata and analog neuron automata,                          tone R-automata. A computation of an R-automaton is
and for studying degrees of non-regularity of DCFL.                              monotone if the distances between the places of rewriting
                                                                                 and the right sentinel are decreasing (maybe not strictly)
                                                                                 during the whole computation. An R-automaton is mono-
1     Introduction
                                                                                 tone if all its computations are monotone.
Restarting automata were introduced in [3] as a linguisti-                          An essential part of this paper is derived from [3] and
cally motivated model of automata that enables to study                          [5]. We study the so-called RP-automata that slightly dif-
so-called analysis by reduction of a natural language. An                        fer from R-automata in [3] and RW-automata in [4]. One
overview of several variants of the model can be found in                        restarting step by RW-automata is by RP-automata substi-
[6].                                                                             tuted by two consecutive steps: by a preparing step, and
   The original model of restarting automata denoted as                          by a restarting step. With such a modification it is easier
R-automaton is a finite state machine equipped with a                            to present their so-called pumping properties.
read/write window of a fixed length k that can move over a                          The paper is structured as it follows. The next section
flexible tape. The word on its tape is always delimited by                       introduces the model of RP-automata and its deterministic
a pair of sentinels c| and $. The automaton works in cycles.                     and monotone variant, and states some basic properties of
Each cycle starts in the initial state with the window on the                    the model. Section 3 introduces pumping instructions and
left end of the tape so it scans the left sentinel c| and the first              pumping restarting automata. Pumping rewriting instruc-
k −1 symbols of the current tape contents, or the rest of the                    tions correspond roughly to “pumping” used in the pump-
tape (when the word is shorter than k − 2). Then the auto-                       ing lemma for context-free languages (cf. [2]) and pump-
maton moves from left to right while changing states ac-                         ing restarting automata are RP-automata that have only
cording to its transition function until it decides to rewrite                   pumping rewriting instructions. We show there that us-
the tape by deleting some symbols scanned by its window                          ing a strong cyclic form of deterministic RP-automata, we
together with removing the cells of the tape containing the                      can check whether a given rewriting instruction is pump-
deleted symbols. Immediately after a rewrite the automa-                         ing by inspecting only computations on words of length
ton “restarts” the computation on its shortened tape, that                       limited by a constant. It follows new characterizations of
is, it enters its initial state and places the read/write win-                   DCFL by deterministic pumping restarting automata. Fi-
dow at the leftmost position again. A computation of an                          nally, we show conditions ensuring that a pumping instruc-
                                                                                 tion causes that an RP-automaton accepts a non-regular
    ∗ The research is partially supported by VEGA 1/0601/20
   † The research was partially supported by the grant of the Czech Sci-
                                                                                 language.
ence Foundation GA19-05704S during the author’s stay at Institute of
Computer Science, Czech Academy of Sciences.                                     2   Definitions and Results
   ‡ The research is partially supported by the grant of the Czech Science

Foundation GA19-05704S
      Copyright c 2020 for this paper by its authors. Use permitted un-
                                                                                 A restarting automaton of type P, or an RP-automaton,
der Creative Commons License Attribution 4.0 International (CC BY                M = (Q, Σ, c,
                                                                                             | $, q0 , k, δ , QA , QR ) (with k-bounded looka-
4.0).                                                                            head) is a device with a finite state control unit with the


                                                                             1
finite set of states Q containing two disjunctive subsets QA ,          sponds unambiguously to one restart state qI .
QR of accepting and rejecting states, respectively. The au-                (4) A halting instruction of the form (q, u) →δ
tomaton is equipped with a head moving on a finite linear               (q0 , HALT), where q0 ∈ QA or q0 ∈ QR , finishes the com-
flexible tape of items (cells). The first item of the tape              putation and causes M to accept or reject, respectively, the
contains always the left sentinel symbol c,    | the last one the       input word.
right sentinel symbol $, and each other item contains a                    Thus, the set of states can be divided into three groups –
symbol from a finite alphabet Σ (not containing c,     | $). The        the halting states QA ∪ QR , the restarting states (involved
head has a flexible read/write window of length at most k               at left-hand side of restarting instructions) and the rest,
(for some k ≥ 1) – M scans k consecutive items or the rest              called the transition states.
of the tape when the distance to the right sentinel $ is less              A configuration of an RP-automaton M is a word αqβ ,
than k. We say that M is of window size k. In the initial               where q ∈ Q, and either α = λ and β ∈ {c}       | · Σ∗ · {$} or
configuration on an input word w ∈ Σ∗ , the tape contains               α ∈ {c}      ∗              ∗
                                                                                | · Σ and β ∈ Σ · {$}; here q represents the cur-
the input word delimited by the sentinels c| and $, the con-            rent state, αβ is the current contents of the tape, and it
trol unit is in the initial state q0 , and the window scans the         is understood that the read/write window contains the first
left sentinel c| and the first k − 1 symbols of the input word          k symbols of β or all symbols of β if |β | < k. An ini-
(or the rest of the tape if the tape contents is shorter than           tial (restarting) configuration is of the form q0 cw$,
                                                                                                                           |     where
k).                                                                     w ∈ Σ∗ . A rewriting configuration is of the form αqI β ,
   The computation of M is controlled by the transition                 where qI is a restarting state.
function                                                                   A computation of M is a sequence C = C0 ,C1 , . . . ,C j of
   δ : (Q \ (QA ∪ QR )) × PC ≤(k) →                                     configurations of M, where C0 is a restarting configuration
                            P(Q × {MVR, PREPARE})                       and C`+1 is obtained from C` by a step of M, for all `,
                        ∪                                               0 ≤ ` < j, denoted as C` `M C`+1 and `∗M is the reflexive
                            {RESTART(v) | v ∈ PC ≤(k−1) }.
                                                                        and transitive closure of the single step relation `M .
  Here P(S) denotes the powerset of the set S, PC (k) is                   In general, an RP-automaton can be nondeterministic,
the set of possible contents of the read/write window of M,             i.e. there can be two or more instructions with the same
where for i, n ≥ 0                                                      left-hand side. If that is not the case, the automaton is
   PC (i) := ({c}  | · Σi−1 ) ∪ Σi ∪ (Σ≤i−1 · {$})                      deterministic. In what follows we are mostly interested in
                     | · Σ≤i−2 · {$}),
                ∪ ({c}                                                  deterministic RP-automata, denoted det-RP.
            n                            k                                 An input word w is accepted by M if there is a computa-
  Σ≤n :=         Σi   and PC ≤(k) :=          PC (i) .
            S                            S
                                                                        tion that starts in the initial configuration with w (bounded
           i=0                          i=0                             by sentinels c,| $) on the tape and finishes in an accepting
   The transition function represents a finite set of four dif-
                                                                        configuration where the control unit is in one of the ac-
ferent types of instructions (transition steps). Let q, q0 , qI
                                                                        cepting states. L(M) denotes the language consisting of
be states from Q, u ∈ PC (k) , w ∈ PC ≤(k) , v ∈ PC ≤(k−1)
                                                                        all words accepted by M; we say that M accepts the lan-
and M be in state q with u be the contents of its read/write
                                                                        guage L(M).
window:
                                                                           Restarting steps divide any computation of an RP-
   (1) A move-right instruction of the form (q, u) →δ                   automaton into certain phases that all start in the initial
(q0 , MVR) is applicable if u 6= $. It causes M to enter the            state in restarting configurations with the read/write win-
state q0 and to move its read/write head one item to the                dow in the leftmost position. In a phase called cycle, the
right.                                                                  head moves to the right along the input list (with its read-
   (2) A preparing instruction (q, u) →δ (qI , PREPARE)                 /write window) until a restart occurs – in that case the com-
changes M’s state to a restarting state qI that determines              putation is resumed in the initial configuration on a new,
the next instruction, which must be a restarting instruction.           shorter, word. The phase from the last restart to the halting
   (3) A restarting instruction I is of the form (qI , w) →δ            configuration is called tail. This immediately implies that
RESTART(v), where |v| < |w| and if w contains any sen-                  any computation of any RP-automaton is finite (ending in
tinel, v contains the corresponding sentinels, too. This in-            a halting state).
struction is applicable if w is a prefix of the contents of the            The next proposition expresses certain lucidness of
read/write window. When executed, M replaces w with                     computations of deterministic RP-automata. The notation
v (hereby it shortens its tape) and restarts – i.e. it enters           u ⇒M v means that there exists a cycle of M starting in the
the initial state and places the window at the leftmost po-             initial configuration with the word u on its tape and finish-
sition so that the first item in the window contains c.| Note           ing in the initial configuration with the word v on its tape;
that the pair (w, v) is unambiguously given by the state qI .           the relation ⇒∗M is the reflexive and transitive closure of
We can assume that all pairs (w, v) such that |v| < |w| and             ⇒M .
the word w can be replaced with v by some RESTART in-                      The validity of the following proposition is obvious.
struction are ordered and that I is the index of (w, v) in
that sequence. Thus, although RP-automaton is in general                Proposition 1. (Correctness preserving property.) Let
nondeterministic, each RESTART instruction of M corre-                  M be a deterministic RP-automaton and u ⇒∗M v for some


                                                                    2
words u, v. Then u ∈ L(M) iff v ∈ L(M).                            restart state qI . Additionally, when cxq
                                                                                                          | I wy$ is the rewrit-
                                                                   ing configuration corresponding to the reduction xwy ⇒I
   By a monotone RP-automaton we mean an RP-                       xvy, for some words x, y, v, w, we will underline the occur-
automaton where the following holds for all computations:          rence of w rewritten in the cycle. That is, we will write
all items which appeared in the read/write window (and             xwy ⇒I xvy. Analogously, α ⇒∗I β indicates a computa-
remained still on the tape) during one cycle will appear in        tion consisting of several such cycles that use the same
the read/write window in the next cycle as well. It means,         restarting instruction I with the restarting state qI .
that during any computation of monotone RP-automaton
the items from the read/write window of prepare configu-
rations do not increase their distances from the right end-        3    Pumping Restarting Automata
marker $.
   Considering a deterministic RP-automaton M, it is for           Let M = (Q, Σ, c,  | $, q0 , k, δ , QA , QR ) be a det-mon-RP-
us convenient to suppose it to be in the strong cyclic form;       automaton, and ι = (qι , w) →δ RESTART(v) a restarting
it means that the words of length less than k, k being the         instruction of M. We say that ι is a pumping instruction of
length of its read/write window, are immediately (hence in         M if w can be written as w = u1 vu2 , for some words u1 , u2 ,
the tail) accepted or rejected, and that M performs at least       u1 6= λ , and, for all x, y ∈ Σ∗ , xu1 vu2 y ⇒ι xvy implies
one cycle (at least one restarting) on any longer word.
   We use the following obvious notation. RP denotes                    xu1j−1 u1 vu2 u2j−1 y ⇒ι xu1j−1 vu2j−1 y, for all j ≥ 1.
the class of all (nondeterministic) RP-automata. Prefix
det- denotes the deterministic version, similarly mon- the          We say that ι has pumping words u1 , u2 . If u2 6= λ we say
monotone version. Prefix scf- denotes the version in the           that ι is a two-side pumping instruction. If u2 = λ we say
strong cyclic form. L (A), where A is some class of au-            that ι is a one-side pumping instruction.
tomata, denotes the class of languages accepted by au-                Note that a preparing instruction determines whether
tomata from A. E.g., the class of languages accepted               a given restarting instruction can be executed and each
by deterministic monotone RP-automata is denoted by                preparing instruction depends on move right instructions
L (det-mon-RP).                                                    that are executed before it. Hence, the property of being a
   Since all computations of RP-automata are finite, the           pumping instruction depends on the whole transition func-
following proposition is obvious.                                  tion of an RP-automaton.
                                                                      Let M be a det-mon-RP-automaton and all restarting in-
Proposition 2. The classes L (det-mon-RP) and                      structions of M are pumping instructions. Then we call
L (det-RP) are closed under complement.                            M a pumping RP-automaton. We denote the property of
                                                                   pumping by the prefix pmp-.
Rejected languages by det-RP-automata. Let M be a
det-RP-automaton, and L(M) = L. We say that the lan-               Example 1. Let M1 = (Q, Σ, c,       | $, q0 , k, δ , QA , QR ) be the
guage L (the complement of L) is rejected by M. We will            RP-automaton with the set of states Q = {q0 , q1 , qA , qR },
often use the fact that the language and its complement            the alphabet Σ = {a, b}, window size k = 2, the set of
can be recognized (distinguished) by the same det-RP-              accepting states QA = {qA }, the set of rejecting states
automaton.                                                         QR = {qR } and the transition function
   The natural question of the decidability of monotonicity        δ (q0 , c$)
                                                                           |   = {(qA , HALT)}, δ (q0 , bb) = {(q0 , MVR)},
for a given RP-automaton is answered in the affirmative:           δ (q0 , ca)
                                                                           |   = {(q0 , MVR)}, δ (q0 , ab) = {(q1 , PREPARE)},
                                                                   δ (q0 , cb)
                                                                           |   = {(q0 , MVR)}, δ (q1 , ab) = {RESTART(λ )},
Theorem 1. There is an algorithm which, given an RP-               δ (q0 , aa) = {(q0 , MVR)}, δ (q0 , a$) = {(qR , HALT)},
automaton M, decides whether M is monotone or not.                 δ (q0 , ba) = {(q0 , MVR)}, δ (q0 , b$) = {(qR , HALT)}.
Proof: The proof is the same as the corresponding                     The automaton has only one restarting instruction
proof from [4]. The only difference between RW- and                ι = δ (q1 , ab) →δ RESTART(λ ). This instruction is
RP-automaton is that the RESTART operation of RW-                  two-side pumping instruction as w = ab can be writ-
automaton is unambiguously split into two consecutive in-          ten as u1 vu2 , where u1 = a and u2 = b are nonempty,
structions PREPARE and RESTART. This splitting has no              v = λ . If q0 cxu  | 1 vu2 y$ = q0 cxaby$
                                                                                                        |         `∗ cxq
                                                                                                                       | 1 u1 vu2 y$ `ι

influence on the decidability proof, since whenever the            q0 cxvy$
                                                                      |       = q0 cxy$
                                                                                    |     for some x, y ∈ {a, b}∗ , then it holds
RESTART operation of RW-automaton is used/simulated                that xu1j vu2j y = xa j b j y ⇒∗ι xy = xvy, for all j ≥ 0, and
in the original proof it can unambiguously be replaced by          xu1j−1 u1 vu2 u2j−1 y ⇒ι xu1j−1 vu2j−1 y, for all j > 0. Evidently,
PREPARE and RESTART operation of RP-automaton.                     the automaton M accepts the Dyck language of correctly
                                                                   paired parentheses, where a and b are the left and right
   In what follows, we consider only deterministic RP-             parenthesis, respectively.
automata. We write α ⇒I β , for words α, β ∈ Σ∗ when                  Note that M1 is not in the strong cycling form.
α was shortened to β in one cycle consisting of several
MVR steps followed by a PREPARE instruction and the                Example 2. Consider the regular language L2 of even
restarting instruction I = (qI , w) →δ RESTART(v) with the         length words over the alphabet {a}. This language can


                                                               3
easily be accepted by the following RP-automaton M2 ,                          j ≥ 0. From (1) it follows that for each j, 1 ≤ j ≤ p+k +1,
M2 = (Q, Σ, c,  | $, q0 , k, δ , QA , QR ) with the set of states             the computation on w( j) proceeds as follows:
Q = {q0 , q1 , qA , qR }, the alphabet Σ = {a}, window size                                          j j                      j    j
                                                                                 q0 cw(
                                                                                    |   j)$ = q0 cxu
                                                                                                 | 1 vu2 y$     `∗M    cxq
                                                                                                                       | j1 u1 vu2 y$
k = 2, the set of accepting states QA = {qA }, the set of                                                                                 j
                                                                                                                `∗M    | 1 q j2 u j−1 vu2 y$
                                                                                                                       cxu
rejecting states QR = {qR } and the transition function
                                                                                                                `∗M    ...
δ (q0 , c$)
        |   = {(qA , HALT)}, δ (q0 , aa) = {(q1 , PREPARE)},                                                                              j
δ (q0 , ca)
        |   = {(q0 , MVR)}, δ (q1 , aa) = {RESTART(λ )},                                                        `∗M    | j−1 q j j u1 vu2 y$
                                                                                                                       cxu
                                                                                                                                        j
δ (q0 , a$) = {(qR , HALT)}.                                                                                    `∗M    | j−1 qι u1 vu2 y$
                                                                                                                       cxu
                                                                                                                                     j−1
The automaton has only one restarting instruction                                                               `∗M       | j−1 vu2 y$
                                                                                                                       q0 cxu
ι = δ (q1 , aa) →δ RESTART(λ ). At first glance this                                                            `∗M    ...
instruction seems to be pumping as it can be applied itera-
tively. The rewritten word w = aa can be written as u1 vu2 ,                  where q ji denotes the state in which the i-th copy of u1 is
where either u1 = aa, v = λ , and u2 = λ , or u1 = a, v = λ ,                 the prefix of the read/write window contents in the first cy-
and u2 = a. If u1 = aa and xu1 vu2 y = xaay ⇒ι xy = xvy                       cle of the computation of M on w( j) and (q j j , u1 vu2 ω) →δ
for some x, y ∈ {a}∗ , then it does not hold                                  (qι , PREPARE) is the preparing instruction followed by
xu1j−1 u1 vu2 u2j−1 y ⇒ι xu1j−1 vu2j−1 y, for all j ≥ 1, as                   the instruction ι in the cycle. Further, let C ji denote the
the deleting realized by the corresponding PREPARE and                        configuration of M corresponding to q ji during the first
RESTART instructions is performed at the beginning of                         cycle on the word w( j).
the word and not around v, for j > 1.                                            Consider the first cycle of the computation of M on
   Similarly, when u1 = a, v = λ , and u2 = a, it                             w( j), where j > p + k + 1. While the whole read/write
can be shown that the condition xu1j−1 u1 vu2 u2j−1 y ⇒ι                      window of M is inside the prefix xu1p+k+1 of w( j), M
                                                                              executes the same instructions as in the first cycle on
xu1j−1 vu2j−1 y, for all j ≥ 1, does not hold.
                                                                              w(p + k + 1). Moreover, the contents of the read/write
   To obtain a pumping RP-automaton accepting the same
                                                                              window of M is the same in all configurations C ji for all i,
language, we can change the automaton a bit so that it
                                                                              1 ≤ i ≤ p + 1, because |uk1 | ≥ k. Since M has p states, there
realizes the RESTART operation at the end of the word.
                                                                              are two positive integers r, d, 1 ≤ r < r + d ≤ p + 1 such
For that we increase the window size to 3 and obtain RP-
                                                                              that q jr = q jr+d and M executes from the configuration
automaton N2 = (Q, Σ, c,     | $, q0 , k, δ , QA , QR ) with the set of
                                                                              C jr+d the same sequence of instruction as between the con-
states Q = {q0 , q1 , qA , qR }, the alphabet Σ = {a}, window
                                                                              figurations C jr and C jr+d (on w( j), where j ≥ r + 2d + k).
size k = 3, the set of accepting states QA = {qA }, the set
                                                                                 Now, we can prove that (1) holds for any j ≥ 1. The
of rejecting states QR = {qR } and the transition function
                                                                              base statement that the condition (1) holds for each j,
δ (q0 , c$)
        |   = {(qA , HALT)}, δ (q0 , ca$)   |     = {(qR , HALT)},
                                                                              1 ≤ j ≤ p + k + 1 is trivially satisfied. Let us suppose
δ (q0 , aaa) = {(q0 , MVR)}, δ (q0 , aa$) ={(q1 , PREPARE)},
                                                                              that (1) holds for each j, 1 ≤ j ≤ n, where n ≥ p + k + 1.
δ (q0 , caa)
        |    = {(q0 , MVR)}, δ (q1 , aa) = {RESTART(λ )}.
                                                                              We will show that (1) holds also for j = n + 1. During
It is easy to see that the new automaton N2 is pumping
                                                                              the first cycle on w(n + 1), the automaton executes be-
as it has single one-side pumping instruction that can be
                                                                              tween the configurations C(n+1)r+d and C(n+1)n+1 on the
applied only at the right end of its tape.
                                                                              word w(n+1) exactly the same sequence of instructions as
   Note that N2 is in the strong cycling form.
                                                                              between the configurations C(n+1−d)r and C(n+1−d)(n+1−d)
  The next lemma follows from the correctness preserving                      on the word w(n + 1 − d). Therefore, it holds w(n + 1) =
property.                                                                     xun1 u1 vu2 un2 y ⇒ι xun1 vun2 y and together with the assumption
                                                                              of the induction step, it holds xu1j−1 u1 vu2 u2j−1 y, for all j,
Lemma 1. Let M = (Q, Σ, c,  | $, q0 , k, δ , QA , QR ) be a det-              1 ≤ j ≤ n + 1. This completes the proof of the lemma.
mon-RP-automaton of window size k, let p = |Q|, and
ι = (qι , w) →δ RESTART(v) be a restarting instruction                           Note that Lemma 1 could be used for testing, whether a
of M. The instruction ι is a pumping instruction of M                         given restarting instruction is pumping, if we were able to
with pumping words u1 , u2 iff for all x, y ∈ Σ∗ such that                    bound the length of x, y in it. As a corollary of the follow-
xu1 vu2 y ⇒ι xvy and each j, 1 ≤ j ≤ p + k + 1 it holds that                  ing proposition and lemma we get that it is the case.
                                                                              Proposition 3. For any deterministic RP-automaton M of
              xu1j−1 u1 vu2 u2j−1 y ⇒ι xu1j−1 vu2j−1 y.          (1)          window size k, there is a deterministic RP-automaton M 0
                                                                              of window size n, n ≥ k, such that M 0 is in the strong cyclic
   Proof: Obviously, if ι is a pumping instruction then for                   form and L(M) = L(M 0 ). In addition, when M is monotone
all x, y ∈ Σ∗ such that xu1 vu2 y ⇒ι xvy the condition (1)                    then M 0 is monotone as well, and when M is pumping then
holds for each j ≥ 1.                                                         M 0 is pumping as well, and if u ⇒M v then u ⇒M0 v.
   To prove the opposite implication, let x, y be words such
that xu1 vu2 y ⇒ι xvy and for each j, 1 ≤ j ≤ p + k + 1 the                      Proof: An RP-automaton is in the strong cyclic form if
condition (1) holds. We will show that the condition (1) is                   it only accepts and rejects in tail computations words of
true also for any j > p + k + 1. Let w( j) = xu1j vu2j y, for all             bounded length. Thus, if the original RP-automaton was


                                                                          4
allowed to accept/reject in a tail computation on a word                any x, y ∈ Σ∗ of arbitrary length such that xu1 vu2 y ⇒ι xvy.
longer than the window size, we have to force it to perform             Then, by applying Lemma 1, we obtain that ι is a pumping
one or more cycles so that in a tail computation it finally             instruction. The proof will be split into two claims.
accepts (rejects) a word that together with the end-markers
fits into the read/write window.                                        Claim 1. Assume that condition (2) holds for all words x,
   We can assume that M always accepts or rejects in a                  y, |x| ≤ m, |y| < k such that xu1 vu2 y ⇒ι xvy. Then condi-
configuration in which it scans the right sentinel $. Oth-              tion (2) holds for all words x, y, |x| ≤ m and y of arbitrary
erwise, we can modify it so that instead of an “original”               length, such that xu1 vu2 y ⇒ι xvy.
accepting (rejecting) state, it would enter a special state
that causes moving to the right end and then accepting (re-                Proof of claim: According to the assumptions of the
jecting). Since the language of words accepted (rejected)               claim, for y of length at most k − 1 the condition (2) holds
in a tail computation is regular, there are finite automata A           trivially. If |y| ≥ k and xu1 vu2 y ⇒ι xvy then during the
and AC accepting these languages.                                       corresponding cycle the automaton M visited at most k −
   A new RP-automaton M 0 will be of window size k0 =                   1 symbols to the right from u1 vu2 , i.e. at most the first
max{nA + 1, nC + 1, nM }, where nA , nC and nM are the                  k − 1 symbols of y. Therefore, xu1 vu2 y0 ⇒ι xvy0 is true
numbers of states of A, AC and M, respectively. When                    for the prefix y0 of y such that y = y0 y00 and |y0 | = k − 1
moving right, the automaton M 0 simultaneously simulates                for some word y00 . Then, according to the assumption of
the computations of A, AC and M. The pumping lemma for                  the claim, it holds xu1j−1 u1 vu2 u2j−1 y0 ⇒ι xu1j−1 vu2j−1 y0 and
regular languages implies that if M accepts or rejects w of             also xu1j−1 u1 vu2 u2j−1 y0 y00 ⇒ι xu1j−1 vu2j−1 y0 y00 , for all j, 1 ≤
length greater than k0 in a tail computation, then w ∈ L(A)              j ≤ p + k + 1, as no symbol of y00 was visited in any of
or w ∈ L(AC ) and for some words x, y, z it holds: w = xyz,             the corresponding cycles. Hence, the condition (2) holds
|yz| < k0 , |y| > 0 and xz ∈ L(A) or xz ∈ L(AC ).                       when we do not restrict the length of y.
   The above modification ensures that when M accepts or
rejects, it has already read the whole tape till the right sen-         Claim 2. Assume that condition (2) holds for all words
tinel. If M would accept or reject but M 0 does not have                x, y, |x| ≤ m, y of arbitrary length, such that xu1 vu2 y ⇒ι
also the left sentinel c| in its read/write window, the au-             xvy. Then the condition (2) holds for all words x and y of
tomaton M 0 , instead of accepting/rejecting, deletes y by              arbitrary lengths, such that xu1 vu2 y ⇒ι xvy.
applying a suitable pair of preparing and restarting in-
struction of the form (qyz , yz$) →δ (qIyz , PREPARE) and                  Proof of claim: We will show that the condition (2)
(qIyz , y) → RESTART(λ ), for some new states qyz , qIyz .              holds for arbitrary x by induction on the length of x.
Obviously, M 0 is in strong cyclic form, L(M 0 ) = L(M) and                Induction basis: For x of length at most m the condition
u ⇒M v implies u ⇒M0 v.                                                 (2) is true trivially.
   The described simulation preserves monotonicity and                     Induction step: Let the claim be true for x of length at
also pumping property, because all added restart opera-                 most n for some n ≥ m. We will show that the condition
tions are pumping and performed at the right end of the                 (2) is true also for x of length n + 1. Let x be a word of
tape.                                                                   length n + 1.
 The next lemma enables extending Lemma 1 to det-                          In the first cycle of M on the word xu1 vu2 y, more than
mon-RP-automata in the strong cyclic form.                              d MVR steps were performed while the read/write window
                                                                        was completely inside the word x. At least two of them
Lemma 2. Let M = (Q, Σ, c,   | $, q0 , k, δ , QA , QR ) be a det-       were according to the same MVR instruction in a state q.
mon-RP-automaton with window size k, let p = |Q|, and                   Hence, we can write x = x1 x2 x3 , for some words x1 , x2 , x3
let ι = (qι , w) →δ RESTART(v) be a restarting instruction              such that |x2 | > 0, |x3 | ≥ k and
of M. There exists a constant m such that ι is a pumping
instruction of M with pumping words u1 , u2 iff for all x, y ∈                  q0 cx
                                                                                   | 1 x2 x3 u1 vu2 y$    `∗M    cx
                                                                                                                 | 1 qx2 x3 u1 vu2 y$
Σ∗ satisfying |x| ≤ m, |y| < k, xu1 vu2 y ⇒ι xvy it holds                                                 `∗M    cx
                                                                                                                 | 1 x2 qx3 u1 vu2 y$
                                                                                                          `∗M    cx
                                                                                                                 | 1 x2 x3 qι u1 vu2 y$
      for each j, 1 ≤ j ≤ p + k + 1 :                                                                     `∗M    q0 cx
                                                                                                                    | 1 x2 x3 vy$,
                                                             (2)
               xu1j−1 u1 vu2 u2j−1 y ⇒ι xu1j−1 vu2j−1 y.
                                                                        where the prefix of length k of x2 x3 is the same as the pre-
   Proof: Obviously, if ι is a pumping instruction then                 fix of length k of x3 . That is, in both above configurations
condition (2) is met for all words x, y such that xu1 vu2 y ⇒ι          with the state q the automaton executes the same MVR in-
xvy and for each j ≥ 1.                                                 struction.
   Let m = d + k + 1, where d denotes the number of                        Therefore, when we leave out x2 and the corresponding
possible pairwise different instructions of M. Evidently,               steps of the cycle, we obtain a valid cycle of M on the
d = p·|PC (k) | ≤ p·(|Σ|+3)k . We will show that if condi-              shorter word x1 x3 u1 vu2 y. Thus, it holds x1 x3 u1 vu2 y ⇒ι
tion (2) is satisfied for all x, y ∈ Σ∗ such that |x| ≤ m, y < k,       x1 x3 vy and we can apply the assumption of the induction
xu1 vu2 y ⇒ι xvy, then condition (2) is satisfied also for              step that the condition (2) holds for words x of length at


                                                                    5
most n for the word x1 x3 to obtain that                                                                T1         @ Tw
                                                                                                                    @
     for each j, 1 ≤ j ≤ p + k + 1 :                                                                   T2             @
              x1 x3 u1j−1 u1 vu2 u2j−1 y ⇒ι x1 x3 u1j−1 vu2j−1 y.                                                       @
                                                                                                                         @
                                                                                                             A
   Obviously, in each of the corresponding cycles we can                                                                         @
                                                                                                              @                      @
insert back x2 between x1 and x3 and the corresponding                                                      A @                       @
sequence of steps to obtain that                                                                             @ @                          @
                                                                                                               @ @                         @
                                                                                            x     u1        v    u2              y
        x1 x2 x3 u1j−1 u1 vu2 u2j−1 y ⇒ι x1 x2 x3 u1j−1 vu2j−1 y

is true for all j, 1 ≤ j ≤ p + k + 1, which completes the                                  Figure 1: The structure of a derivation tree.
proof of the claim.
   By combining Claim 2 and Lemma 1, it follows that the
instruction ι is pumping.                                                           pumping we sketch the construction of M; the construc-
                                                                                    tion will also be helpful for better understanding of later
Corollary 1. Let M be a scf-det-mon-RP-automaton, and                               results and proofs.
ι a restarting instruction of M. It is decidable whether ι is                          It is well known that if L0 is a deterministic context-free
a pumping instruction or not.                                                       language and $ is a symbol not in the alphabet of L0 , then
                                                                                    L0 · {$} is a deterministic prefix-free context-free language
   The characterization of DCFL by R-automata was given                             that can be parsed by a LR(0)-analyzer P0 . This fact was
in [3].                                                                             used in [3]. Here, in order to construct a pmp-det-mon-RP-
                                                                                    automaton, we use a similar construction of an LR(1)-
Theorem 2 ([3]). DCFL = L (det-mon-R).
                                                                                    analyzer P of L0 that is based on an LR(1)-grammar in
  We show that DCFL can even be characterized by                                    Greibach Normal Form. The existence of such analyzer
pmp-RP-automata.                                                                    for any DCFL is proved in [1].
                                                                                       Based on the simulation of P on a word w we can con-
Theorem 3. DCFL = L (pmp-RP) = L (det-mon-RP)                                       struct the derivation tree Tw (the inner vertices of which are
  Proof: The theorem is a consequence of the following                              labeled with nonterminals and leaves correspond to termi-
two lemmas.                                                                         nal symbols). Thus, for any word w ∈ L there is exactly
                                                                                    one derivation tree Tw . The standard pumping lemma for
Lemma 3. L (det-mon-RP) ⊆ DCFL.                                                     context-free languages implies existence of two constants
                                                                                    p, q > 0 such that for any word w with length greater than
   Proof: As the models of det-mon-R- and det-mon-RP-                               p there are (complete) subtrees T1 and T2 of Tw such that T2
automata differ only slightly, we can use here a                                    is a subtree of T1 and roots of both subtrees have the same
slightly modified proof of Lemma 8 in [3] stating that                              label (cf. Fig. 1); in addition, T2 has fewer leaves than
L (det-mon-R) ⊆ DCFL. For a given det-mon-RP-                                       T1 , T1 has at most q leaves and |u1 | > 0. The word u1 is
automaton M, a method from [3] can be used to construct                             nonempty, because the right-hand side of the rule used to
a deterministic push-down automaton P that accepts the                              rewrite the nonterminal A in the root of T1 must start with
same language as M.                                                                 a terminal (the grammar is in Greibach Normal form).
   To show the opposite direction, we use the character-                               Obviously, replacing T1 with T2 , we get the derivation
ization of deterministic context-free languages by means                            tree Tw(0) for a shorter word w(0) (if w = xu1 vu2 y then
of LR(1)-grammars in Greibach Normal Form and LR(1)-                                w(0) = xvy). Analogously, replacing T2 with T1 , we get the
analyzers (cf., e.g., [1])1 .                                                       derivation tree Tw(2) for a longer word w(2) where w(2) =
                                                                                    xu21 vu22 y. If we repeat this replacing of T2 with T1 i-times
Lemma 4. DCFL ⊆ L (pmp-RP).                                                         we obtain the derivation tree Tw(i+1) for a word w(i + 1)
   Proof: The inclusion follows from an analysis of the                             where w(i + 1) = xui+1       i+1
                                                                                                           1 vu2 y.
det-mon-R-automaton M simulating a syntactic analysis                                  The key to a construction of the det-mon-R-automaton
of a DCFL language L in [3]. Each det-mon-R-automaton                               M in [3] is the possibility to identify the leftmost subword
M can be easily converted into a det-mon-RP-automaton                               u1 vu2 corresponding to subtrees T1 and T2 as shown in
M 0 by splitting each restarting instruction of M into one                          Fig. 1 reading from left to right with the help of constant
preparing instruction and one restarting instruction of M 0 .                       size memory only. In its constant size memory M stores all
To see, that the resulting det-mon-RP-automaton M 0 is                              maximal subtrees of the derivation tree with all their leaves
      1 Recall that context-free grammar G = (N, T, P, S) is LR(k) for k ≥ 0
                                                                                    in the buffer. When it identifies a subtree like T1 above, M
if for any string α exists unique partition α = β γu, where α, β , γ ∈ (N ∪
                                                                                    performs the corresponding deleting by a RESTART oper-
T )∗ , u ∈ T ∗ , such that there is a rightmost derivation S ⇒∗r β Au ⇒r αβ u       ation. Obviously, the read/write window of length k > q is
where A is a nonterminal.                                                           sufficient for that.


                                                                                6
   If that is not the case then M forgets the leftmost of               Definition 4. Let M = (Q, Σ, c,      | $, q0 , k, δ , QA , QR ) be a
these subtrees with all its n ≥ 1 leaves, and reads n new               pmp-RP-automaton accepting the language L = L(M).
symbols to the right end of the buffer (performing MVR-                 Let ι = (qr , u1 vu2 ) →δ RESTART(v) be a two-side pump-
instructions). Then M continues constructing the maximal                ing restarting instruction of M with pumping words
subtrees with all leaves in the (updated) buffer (simulating            (strings) u1 , u2 and xu1 vu2 y ⇒ι xvy, for some x, y ∈ Σ∗ .
P). Short words of length less than k are accepted/rejected                Let p be a positive integer. We say that ι is a (p, x, y)-
in tail computations.                                                   distinguishing instruction for M if at least one of the fol-
   To obtain M 0 it is sufficient to use instead of restart-            lowing cases occurs:
                                                                           (I) xum       m               m m p· j / L, for all j ≥ 1,
ing instructions of M the corresponding preparing and                               1 vu2 y ∈ L, and xu1 vu2 u2 y ∈
restarting instructions of RP-automaton. The preparing                  m ≥ 0,
                                                                                                         p· j m m
and restarting instructions can handle LR(1)-analysis in                   (II) xum       m
                                                                                    1 vu2 y ∈ L, and xu1 u1 vu2 y ∈      / L, for all j ≥ 1,
the same way as M the LR(0)-analysis. The resulting RP-                 m ≥ 0,
automaton M 0 preserves the determinism and monotonic-                     (III) xum      m / L, and xum vum u p· j y ∈ L, for all j ≥ 1,
                                                                                     1 vu2 y ∈           1     2 2
ity of M; from the construction it is clear that M 0 is pump-           m ≥ 0,
                                                                           (IV) xum       m / L, and xu p· j um vum y ∈ L, for all j ≥ 1,
ing as well.                                                                        1 vu2 y ∈            1     1   2
   Note. While the previous proof is based on the paper                 m ≥ 0.
[3], a similar proof can be based on the constructions from                We say that ι is a distinguishing instruction for M if
paper [7] based on deterministic list automata.                         there are p, x, y such that ι is a (p, x, y)-distinguishing in-
   The following corollary is a consequence of the previ-               struction for M.
ous theorem and Proposition 3.                                             We say that M is a distinguishing RP-automaton if there
Notation. In what follows, we will write pmsc- instead of               is a distinguishing instruction ι for M. We write dist-RP-
pmp-scf-.                                                               automaton to denote a distinguishing pmp-RP-automaton.
Corollary 2.                                                            Example 4. Let us consider the automaton M1 =
      DCFL = L (pmsc-RP) = L (scf-det-mon-RP).                          (Q, Σ, c,
                                                                               | $, q0 , k, δ , QA , QR ) from Example 1. We will show
                                                                        that M1 is a dist-RP-automaton. The automaton has only
   In what follows, we aim to obtain some conditions                    one restarting instruction ι = (q1 , ab) →δ RESTART(λ ).
for a det-RP-automaton to accept a non-regular lan-                     This instruction is a two-side pumping instruction with
guage. At first, we conjecture, that each pmp-det-mon-RP-               pumping strings u1 = a and u2 = b and ab = xu1 vu2 y ⇒1
automaton that does not have any two-side pumping in-                   xvy = λ , for x = λ , v = λ , y = λ . The instruction ι
struction generates a regular language. On the other hand,              is a (p, x, y)-distinguishing instruction for M1 for p = 1.
a pmp-det-mon-RP-automaton having a two-side pumping                    Namely, for all j ≥ 1, m ≥ 0 it holds
instruction can still accept a regular language. This can be
seen in the following example.                                                        xum   m        m m
                                                                                        1 vu2 y = a b ∈ L(M1 ) and
                                                                                                p· j
Example 3. Let M3 = (Q, Σ, c,              | $, q0 , k, δ , QA , QR )                 xum   m           m m j / L(M ),
                                                                                        1 vu2 u2 y = a b b ∈       1
be the RP-automaton with the set of states
                                                                        and also
Q = {q0 , q1 , qa , qb , qA }, the alphabet Σ = {a, b}, window
size k = 2, the set of accepting states QA = {qA }, the set                           xum   m      m m
                                                                                        1 vu2 y = a b ∈ L(M1 ) and
of rejecting states QR = 0/ and the transition function                                    p· j m
                                                                                      xu1 u1 vu2 y = am a j bm ∈
                                                                                        m                      / L(M1 ).
δ (q0 , c$)
        |   = {(qA , HALT)}, δ (q0 , bb) = {(q0 , MVR)},
δ (q0 , ca)
        |   = {(q0 , MVR)}, δ (q0 , ab) = {(q1 , PREPARE)},
δ (q0 , cb)
        |   = {(q0 , MVR)}, δ (q1 , ab) = {RESTART(λ )},                Theorem 4. Let L = L(M) 6= 0/ be a deterministic context-
δ (q0 , aa) = {(q0 , MVR)}, δ (q0 , a$) = {(qa , PREPARE)},             free language accepted by a dist-RP-automaton M. Then
δ (q0 , ba) = {(q0 , MVR)}, δ (q0 , b$) = {(qb , PREPARE)}              L is a non-regular language.
δ (qa , a) = {RESTART(λ )}, δ (qb , b) = {RESTART(λ )}.
   The automaton differs from the automaton M1 of Ex-                   Proof. To obtain a contradiction, we suppose that M =
ample 1 only slightly. It has two new restarting states qa              (Q, Σ, c,| $, q0 , k, δ , QA , QR ) is a dist-RP-automaton accept-
and qb that enable to delete any symbol to the left from                ing a regular language L = L(M). Since M is a dist-RP-
the right sentinel $. Nevertheless, the first restarting in-            automaton, it has a (p, x, y)-distinguishing instruction ι =
struction ι = (q1 , ab) →δ RESTART(λ ) is still two-side                (qι , u1 vu2 ) →δ RESTART(v), for some x, y, v ∈ Σ∗ , u1 , u2 ∈
pumping instruction (see Example 1). The automaton M3                   Σ+ , qι ∈ Q and p ≥ 1. As L is regular, there exists a de-
is deterministic and monotone. In contrast to M1 , it is in             terministic finite automaton A with nA states accepting the
the strong cyclic form and accepts all words over the al-               language L(A) = L.
phabet {a, b}.                                                             The proof follows by the analysis of the four possible
   Hence, we define a property of two-side pumping in-                  cases of (p, x, y)-distinguishing property of ι:
structions which ensures that the resulting det-mon-RP-                 Case (I): From the definition, for all m ≥ 0, j ≥ 1 it holds
                                                                        xum   m              m m p· j / L. Let q
automaton does accept a non-regular language.                             1 vu2 y ∈ L and xu1 vu2 u2 y ∈             m j denote the



                                                                    7
state of the automaton A in which it reads the first symbol             5    Acknowledgements
of the j-th copy of u2 to the right from v. That is, A is in the
                                  j−1
state qm j after reading xum  1 vu2 , for j ≥ 1. For m > nA ,
                                                                        We thank the anonymous referees whose comments and
there exist integers r, s, 1 ≤ r < s ≤ nA + 1 such that qmr =           suggestions have helped to improve the presentation of this
qms . Then, for all i ≥ 0, it holds qmr = qmr+i·(r−s) and the au-       paper.
                                                     m+i·(s−r)
tomaton A accepts all words of the form xum       1 vu2       y.
                                    m+p·(s−r)
For i = p, we obtain that xum  1  vu2         y ∈  L which con-         References
tradicts the assumption xum       m p· j / L, for j = (s − r).
                             1 vu2 u2 y ∈
                                                                        [1] M. M. Geller, M. A. Harrison, I. M. Havel: Normal forms of
Case (II): From the definition, for all m ≥ 0, j ≥ 1 it holds
                        p· j m m                                            deterministic grammars. Discrete Mathematics, 16(4):313–
xum    m
   1 vu2 y ∈ L, and xu1 u1 vu2 y ∈    / L. Let qm j denote the
                                                                            321 (1976)
state of the automaton A in which it reads the first symbol
                                                                        [2] J. Hopcroft, J. Ullman: Introduction to Automata Theory,
of the j-th copy of u1 to the right from x. That is, A is in the
                                                                            Languages, and Computation; Addison-Wesley (1979)
state qm j after reading xu1j−1 , for j ≥ 1. For m > nA , there         [3] P. Jančar, F. Mráz, M. Plátek, J. Vogel: Restarting Automata,
exist integers r, s, 1 ≤ r < s ≤ nA + 1 such that qmr = qms .               Proceedings of FCT 1995, LNCS 965, Springer, 283–292
Then, for all i ≥ 0, it holds qmr = qmr+i·(r−s) and the au-                 (1995)
                                                 m+i·(s−r)
tomaton A accepts all words of the form xu1           vum2 y.
                                                                        [4] P. Jančar, F. Mráz, M. Plátek, and J. Vogel. On restarting
                             m+p·(s−r) m                                    automata with rewriting. In G. Pǎun and A. Salomaa, edi-
For i = p, we obtain that xu1         vu2 y ∈ L which con-
                          p· j m m                                          tors, New Trends in Formal Language Theory (Control, Co-
tradicts the assumption xu1 u1 vu2 y ∈/ L, for j = (s − r).                 operation and Combinatorics), volume 1218 of LNCS, pages
   Analysis of conditions (III) and (IV) from Definition 4                  119–136. Springer, 1997.
is analogous to the shown cases.                                        [5] P. Jančar, F. Mráz, M. Plátek, J. Vogel: On Monotonic Au-
                                                                            tomata with a Restart Operation. Journal of Automata, Lan-
                                                                            guages and Combinatorics 4(4): 287–311 (1999)
   As each dist-RP-automaton is deterministic and mono-
tone (Definition 4), it accepts a deterministic context-free            [6] F. Otto: Restarting Automata. Recent Advances in Formal
                                                                            Languages and Applications; Studies in Computational In-
language (Lemma 3). Hence, Lemma 3, Theorem 4 and
                                                                            telligence, vol 25, Springer, 269–303 (2006)
Proposition 3 imply the following corollary (⊂ denotes the
                                                                        [7] M. Plátek, J. Vogel: Deterministic list automata and erasing
proper subset relation).
                                                                            graphs. The Prague bulletin of mathematical linguistics 45,
                                                                            27–50 (1986)
Corollary 3. L (dist-RP) = L (scf-dist-RP) ⊂ DCFL.
                                                                        [8] J. Šíma, M. Plátek: One Analog Neuron Cannot Rec-
                                                                            ognize Deterministic Context-Free Languages. Proceedings
   Remark. We conjecture that L (scf-dist-RP) is equal                      of ICONIP 2019, Part III, LNCS 11955, 77–89, Springer
to the set of all non-regular DCFL. But it remains an open                  (2019)
problem.                                                                [9] J. Šíma, M. Plátek: The Simplest Non-Regular Deterministic
                                                                            Context-Free Languages. In preparation.
                                                                        [10] S. Sippu, E. Soisalon-Soininen: Parsing Theory, Volume II:
4    Conclusions                                                            LR(k) and LL(k) Parsing. Monographs in Theoretical Com-
                                                                            puter Science, Volume 20 of EATCS Series, Springer (1990)
We plan in the near future to show that any pmsc-RP-
automaton M which accepts a non-regular language can
be transformed in a scf-dist-RP-automaton, for which all
its two-side pumping instructions are distinguishing. Let
us denote such type of automata as strongly distinguishing
RP-automata (sdist-RP-automata). The sdist-RP-automata
will allow to extend the results from [8] achieved by de-
terministic context-free grammars. Further, it will allow
to introduce some types of degrees of non-regularity of
DCFL, e.g., according to the number of distinguishing in-
structions in a strongly distinguishing RP-automaton. The
combinations of this measure with other measures typi-
cal for restarting automata (e.g., the length of rewriting
windows) will give us natural measures for complexity of
DCFL.
   Finally, sdist-RP-automata will create a nice tool for lo-
calization and measures for syntactic errors in determinis-
tic context-free languages.


                                                                    8