Reversal of regular languages and state complexity

                                                       Juraj Šebej

                       Institute of Computer Science, Faculty of Science, P. J. Šafárik University
                                          Jesenná 5, 04001 Košice, Slovakia
                                               juraj.sebej@gmail.com

Abstract. We study the state complexity of languages            nary worst-case examples for these three operations,
that can be obtained as reversals of regular languages repre-   however he did not present any proofs. Birget in his
sented by deterministic finite automata. We show that the       works [1, 2] examined intersection and union of several
state complexity of the reversal of a regular language with     languages, and also the question of the size of nonde-
state complexity n is between log n and 2n . We first prove     terministic automaton for complements. The system-
that the upper bound is tight in the ternary case. Then we
                                                                atic study of the state complexity of operations on
present binary languages reaching this upper bound on the
reversal. We also obtain some other partial results in the
                                                                regular languages began in the paper by Yu, Zhuang,
binary case.                                                    and Salomaa [24]. This work was followed by papers
                                                                studying state complexity of operations on unary lan-
                                                                guages [17] and on finite languages [3], complexity of
                                                                proportional removals [5], and shuffle in [4].
1    Introduction                                                    Another stream of research is the study of so called
                                                                “magic” numbers, where not only worst-case complex-
Regular languages and finite automata are the old-              ities are important, but also all values that can be
est and the simplest topics in computer science. They           obtained as a corresponding complexity are consid-
have been investigated since the 1950s. Despite their           ered. The problem was stated by Japanese authors
simplicity, some problems are still open. Probably the          Iwama, Kambayashi, and Takaki [9] who asked what
most challenging is the question of how many states             values can be obtained as the size of the minimal de-
are sufficient and necessary for two-way deterministic          terministic automaton equivalent to a given n-state
automata to simulate two-way nondeterministic au-               nondeterministic automaton. The values that cannot
tomata which is connected to the well-known                     be obtained in such a way are called “magic” numbers
DLOGSPACE vs. NLOGSPACE problem.                                in [10]. The following research showed that there are
    Motivating by applications of regular languages in          no magic numbers in the ternary case [12], while a lot
software engineering, programming languages, and                of them exist in the unary case [6]. The binary case is
other areas in computer science, as well as by their im-        still open.
portance in theory, this class of languages is intensively           Similar results for the size of nondeterministic au-
studied in recent years; for the discussion, we refer the       tomata for complements can be found in [19], for the
reader to [8, 23]. Various areas in this field are now          union and intersection in [7], and for the reversal and
deeply and intensively examined. One of such areas              star in [11]. In all cases, the whole range of complexi-
is descriptional complexity which studies the cost of           ties can be obtained, however while in the case of union
description of languages represented by different for-          and intersection the used alphabet is fixed, in the case
mal systems such as deterministic and nondetermin-              of reversal and star, the alphabet grows exponentially
istic finite automata, two-way automata, regular ex-            with n.
pressions, or grammars.                                              In this paper, we continue the study of the state
    Rabin and Scott in 1959 [18] described an algo-             complexity of reversals of regular languages. In 1966,
rithm for the conversion of nondeterministic finite au-         Mirkin [14] pointed out that Lupanov’s ternary worst-
tomata into deterministic automata known as the sub-            case example is a reversal of a deterministic automa-
set construction. The algorithm shows that every                ton, which proves that the complexity of the rever-
n-state nondeterministic automaton can be simulating            sal of a language accepted by a ternary n-state deter-
by at most 2n state deterministic automaton. In 1963,           ministic automaton is 2n . The binary language with
Lupanov [16] proved the optimality of this construc-            more than one accepting state reaching this upper
tion by describing a ternary and even a binary regular          bound has been given in 1983 by Leiss [15]. In 2004,
language accepted by an n-state nondeterministic au-            the paper [20] claimed a binary worst-case example
tomaton that requires exactly 2n deterministic states.          with a single accepting state. Unfortunately, the re-
    Maslov in 1970 [13] considered the state complex-           sult does not hold: in the case of n = 8, the number
ity of union, product, and Kleene star. He gave bi-             of reachable states in the subset automaton for the re-
48      Juraj Šebej

versal is 252 instead of 256. Since the result has been          Two automata are equivalent if they recognize the
used in the literature several times, our first aim is       same language. A dfa (an nfa) M is called minimal if
to present a correct example, and a correct proof. We        every dfa (every nfa, respectively) that is equivalent to
start with an observation that all states in the subset      M has at least as many states as M . It is well-known
automaton corresponding to the nfa that is obtained          that a dfa M = (Q, Σ, δ, s, F ) is minimal if all its
as a reversal of a minimal dfa are pairwise inequiva-        states are reachable from the starting state and no two
lent. We show that the state complexity of the rever-        its different states are equivalent (states p and q are
sal of an n-state dfa language is between log n and 2n ,     equivalent if for all strings w in Σ ∗ , the state δ(p, w) is
and present a ternary worst-case example with a very         accepting if and only if the state δ(q, w) is accepting).
simple proof of reachability of all subsets. In a much       Every regular language has a unique minimal dfa, up
more difficult way, we prove that the upper bound 2n         to the naming of states. However, the same result does
is tight also in the binary case. Our witness automaton      not hold for nfa’s.
has a single accepting state, and is uniformly defined           The state complexity of a regular language is the
for all integers n. Therefore, it can be used in all cases   number of states in its minimal dfa. A regular lan-
where the incorrect result from [20] was used. We next       guage with deterministic state complexity n is called
find binary n-state deterministic automata that need         an n-state dfa language.
n + 1 or n + 2 deterministic states for their reversals.         Every nfa M = (Q, Σ, δ, S, F ) can be transformed
Finally, we present binary 1-, 2-, and 3-state automata      to an equivalent deterministic finite automaton M 0 =
that reach all particular values from log n to 2n as the     (2Q , Σ, δ 0 , s0 , F 0 ) thanks to an algorithm known as the
state complexity of their reversals.                         “subset construction” in the following way. Every state
                                                             of the dfa M 0 is a subset of the state set Q. The start-
                                                             ing state of the dfa M 0 is the set S.SThe transition
2    Preliminaries                                           function δ 0 is defined by δ 0 (R, a) = r∈R δ(r, a) for
                                                             every state R in 2Q and every symbol a in Σ. A state R
This section gives some basic definitions, notations,        in 2Q is an accepting state of the dfa M 0 if it con-
and preliminary results used throughout the paper.           tains at least one accepting state of the nfa M. We
For further details, we refer to [21, 22].                   call the dfa M 0 the subset automaton corresponding
    Let Σ be a finite alphabet and Σ ∗ the set of all        to the nfa M . The subset automaton M 0 need not
strings over the alphabet Σ including the empty              be minimal since some states may be unreachable or
string ε. The length of a string w is denoted by |w|.        equivalent.
A language is any subset of Σ ∗ . We denote the cardi-           We next give the definitions and some preliminary
nality of a finite set A by |A| and its power-set by 2A .    results concerning the reversal operation.
    A deterministic finite automaton (dfa) is a 5-tuple
                                                             Definition 1. The reversal wR of a string w is de-
M = (Q, Σ, δ, s, F ), where Q is a finite set of states,
                                                             fined as follows: εR = ε and if w = a1 a1 · · · an with
Σ is a finite input alphabet, δ is the transition function
                                                             ai ∈ Σ, then wR = an an−1 · · · a2 a1 . The reversal of
that maps Q × Σ to Q, s is the starting state, s ∈ Q,
                                                             a language L is the language LR = {wR | w ∈ L}.
and F is the set of accepting states, F ⊆ Q. In this
                                                                  The reversal of a dfa A = (Q, Σ, δ, s, F ) is the
paper, all dfa’s are assumed to be complete, that is,
                                                             nfa AR obtained from A by reversing all transitions
the next state δ(q, a) is defined for every state q in Q
                                                             and by swapping the role ¡of starting and¢ accept-
and every symbol a in Σ. The transition function δ is
                                                             ing states, that is AR = Q, Σ, δ R , F, {s} , where
generalized to a function from Q×Σ ∗ to Q in a natural
                                                             δ R (q, a) = {p ∈ Q : δ (p, a) = q}.
way. A string w in Σ ∗ is accepted by the dfa M if the
state δ(s, w) is an accepting state of the dfa M . The
                                                             Proposition 1. The reversal of a dfa A recognizes
language accepted by the dfa M , is the set L(M ) =                            R
                                                             the language L (A) .
{w ∈ Σ ∗ | δ(s, w) ∈ F }.
    A nondeterministic finite automaton (nfa) is             Proof. We prove that a string w is in L (A) if and
                                                                                                                R
a 5-tuple M = (Q, Σ, δ, S, F ), where Q, Σ, S and F are      only if the string w is accepted by the nfa AR .
defined identically as for a dfa, S is the set of start-
ing states, and δ is now the nondeterministic tran-
sition function that maps Q × Σ to 2Q . The transi-
tion function can be naturally generalized to the do-
main Q × Σ ∗ . A string w in Σ ∗ is accepted by the
nfa M if the set δ(q0 , w) contains an accepting state
of the nfa M. The language accepted by the nfa M is
L(M ) = {w ∈ Σ ∗ | δ(S, w) ∩ F 6= ∅}.                             Fig. 1. The string wR is accepted by the dfa A.
                                                                                 Reversal of regular languages      49

                                                             in the subset automaton corresponding to the nfa M .
                                                             Then, without loss of generality,there exists a state q
                                                             in Q such that q ∈ S and q ∈   / T . It follows that the
                                                             string wq is accepted by the subset automaton from
                                                             state S but not from state T . Thus the states S and T
    Fig. 2. The string w is accepted by the nfa AR .         are inequivalent.                                      u
                                                                                                                    t

                     R                                       Theorem 2. All states in the subset automaton cor-
    If w is in L (A) , then wR is in L (A), and so           responding to the reversal of a minimal dfa are pair-
the starting state s goes to an accepting state f in F       wise inequivalent.
by wR . It follows that the starting state f of the nfa AR
goes to the accepting state s of AR by w, and so w is        Proof. Let us show that every nfa obtained as the
accepted by AR .                                             reversal of a minimal dfa satisfies the condition in
    Next, if a string w is accepted by the nfa AR , then     Lemma 1. Let q be a state of the nfa. Since state q
there is a starting state f in F that goes to the ac-        is reachable in the given dfa, there exists a string x
cepting state s of AR by w. It turns out, that in the        such that the starting state of the dfa goes to state q
dfa A, the starting state s goes to an accepting state f     by x, as illustrated in Fig. 3.
by wR . Thus the string wR is in the language L (A),
and so the string w is in the language LR (A).           u
                                                         t

    Since a language is regular if and only if it is rec-
ognized by a dfa or, equivalently, by an nfa, we get the
following result.

Corollary 1. The reversal of every regular language
                                                    Fig. 3. State q is reachable in the dfa A (left); p 6= q in the
is a regular language.                                   R
                                                             nfa A (right) implies two distinct conputations of the dfa
    After the construction of nfa for the reversal of        on the string x.
a regular language we can use the subset construction         This means that the string xR is accepted by the
to get a dfa for the reversal. This gives the following nfa from state q, see Fig. 3. We now prove that the
bounds on the size of the dfa.                           string xR is not accepted by the nfa from any other
                                                                                                            R
Theorem 1. Let L be a regular language accepted by state. Assume for contradiction that the string x is
a minimal n-state dfa. Then the minimal dfa for the accepted by the nfa from a state p with p 6= q. It
language LR has at most 2n and at least dlog2 ne states. turns out that the starting state of the dfa might go
                                                         by the string x to state q as well as to state p, which
Proof. Let A be an n-state dfa for a language L. The     is a contradiction since in the dfa we woud have two
            R
reversal A of the dfa A is an n-state nfa for the lan-   different computations on the string x. Hence the nfa
         R
guage L . After applying the subset construction to      satisfies the condition of Lemma 1, and so all states
            R                  n
this nfa A , we get at most 2 -state dfa for the lan-    in  the corresponding  subset automaton are pairwise
         R                R R
guage L . Now since (L ) = L, the lower bound            inquivalent.                                          u
                                                                                                               t
is dlog ne.                                           u
                                                      t

    We now prove quite interesting result that in the        3    Ternary alphabet
subset automaton corresponding to the reversal
of a minimal dfa, all states are pairwise inequivalent.      We start with the upper bound 2n in the ternary case.
This means that we need not prove inequivalence of           The next theorem presents a ternary worst-case exam-
states troughtout the paper.                                 ple for the reversal with a very simple proof of reach-
                                                             ability of all subsets.
Lemma 1. Let for each state q of an nfa there exists
a string wq such that wq is accepted by the nfa from         Theorem 3. For every integer n with n ≥ 3, there
state q, but is not accepted from any other state. Then      exists an n-state dfa A over a three-letter alphabet
in the corresponding subset automaton, all states are        such that the minimal dfa for the reversal of the lan-
pairwise inequivalent.                                       guage L(A) has 2n states.

Proof. Let M = (Q, Σ, δ, S, F ) be an nfa, and let for Proof. Let A be the minimal n-state dfa shown in
each state q in Q, wq be a string that is accepted by M Fig. 4. Construct an nfa for the reversal of the lan-
only from state q. Let S and T be two different subsets guage L(A) from the dfa A by reversing all transitions,
50        Juraj Šebej

             bc                   b                     b                    b                        b         b               b
                                                                                         b
                                                                                     a        a             a           a   a
                                                                              0           1       2   b 3           4           n-1
                  ab         ac       ac           ac                                     a                             a
              0          1        2                     n-1
                                                                        Fig. 6. The binary dfa A reaching the bound 2n .
                                           c
                                                                                                                        a
                                               a
                                                                             b                        b         b               b
                                                                                         b
                                                                                              a             a           a   a
     Fig. 4. The ternary dfa A reaching the bound 2n .                               a
                                                                                                  2                 4           n-1
                                                                                 0        1           b 3
                                                                                          a                             a

                                                                       Fig. 7. The nfa AR for the binary language L(A)R .


                                                                    Theorem 4. For every integer n with n ≥ 2, there
                                                                    exists an n-state dfa A over a two-letter alphabet such
                                                                    that the minimal dfa for the reversal of the language
    Fig. 5. The nfa for the reversal of the language L(A).
                                                                    L(A) has 2n states.
                                                                    Proof. Let us consider a binary n-state dfa A in Fig. 6
                                                                    with states 0, 1, . . . , n − 1, where n ≥ 4, state n is the
and exchanging the starting and accepting states. The               starting state and state 0 is the sole accepting state.
nfa is shown in Fig. 5.                                             For all i = 4, 5 . . . , n − 1, state i goes to state i − 1 by
      Let us show that the corresponding subset automa-             symbol a, and to itself by symbol b. State 3 goes to
ton has 2n reachable and pairwise inequivalent states.              state n − 1 by symbol a, and to state 2 by b. State 2
We first show that every set containing state 0 is reach-           goes to state 1 by a, and to state 3 by b. State 1 goes
able. The proof is by induction on the size of sets. The            to state 0 by both symbols a and b. State 0 goes to
basis, |S| = 1, holds true because state 0 is the start-            state 2 by a, and to itself by b. In the case of n = 2 or
ing state of the subset automaton. Assume that every                n = 3, there are some small changes in the structure
set of size k, 1 ≤ k ≤ n−1, containing state 0 is reach-            of the automaton. If n = 2, then state 0 goes to state 1
able. Let S = {0, i1 , i2 , ..., ik } with 1 ≤ i1 < i2 < · · · <    by symbol a. If n = 3, then state 2 goes to itself by
ik ≤ n − 1 be a set of size k + 1. Consider the set S 0 =           symbol b.
{0, i2 − i1 + 1, ..., ik − i1 + 1}. The set S 0 is of size k            In these two cases, we reverse the dfa A, and after
and contains state 0, and so is reachable by the induc-             the determinisation of the reversal, we get a four-state
tion hypothesis. The set S 0 goes to the set S by bci1 −1           minimal dfa if n = 2 in Fig. 12, and an eight-state
since S 0 goes to {0, 1, i2 − i1 + 1, . . . , ik − i1 + 1} by b,    minimal dfa if n = 3 in Fig. 17.
and then to S by ci1 −1 . It turns out that the set S is                Now let n ≥ 4. Construct an nfa for the reversal
reachable.                                                          of the language L (A) by exchanging the starting and
      We next prove the reachability of sets with-                  accepting states, and by reversing all transitions in the
out state 0. Let S = {i1 , i2 , ..., ik } with 1 ≤ i1 < i2 <        dfa A, see Fig. 7. We are going to show that the cor-
· · · < ik ≤ n − 1. Then the set S is reached from the              responding subset automaton has 2n reachable states.
set {0, i2 − i1 , . . . , ik − i1 }, containing state 0, by ai1 .   To make the proof more understandable, we call the
Finally, the empty set is reached from the set {1} by b.            set of states {0, 1, 2} the first part, and the set of states
This completes the proof since the inequivalence fol-               {3, 4, . . . , n − 1} the second second part of the nfa.
lows from Theorem 2.                                           u
                                                               t        We will consider two cases:
                                                                    1. n = 3k + 1 or n = 3k + 2,
                                                                    2. n = 3k,
4      Binary alphabet and upper bound                              where k is a positive integer.
                                                                        1. If n = 3k + 1 or n = 3k + 2, then the number
The authors of the paper [20] present a binary n-state              of states in the first part is three, while the number
dfa and claim that its reversal requires 2n determinis-             of states in the second part is 3 (k − 1) + 1 or
tic states. Unfortunately, the example does not work:               3 (k − 1)+2. Thus these two numbers are are relativily
in the case of n = 8, the resulting dfa has 252 reachable           prime. First, the set {0, 1} is reached from the starting
states instead of 256. The next theorem describes cor-              set {0} by symbol b. Now we demonstrate how to add
rect binary n-state witness dfa’s with a single accept-             a new state ` to a set {0, 1} ∪ S, where S is a subset of
ing state, uniformly defined for every n with n ≥ 2.                the second part with ` ∈       / S, to get a set {0, 1}∪S ∪
                                                                    {`}. By symbol a, we can rotate states in both parts.
                                                                                     Reversal of regular languages       51

Consider the set {0, 1, `}. Since the sizes of the two          denote the states of this triple by `0 , `1 , `2 . We choose
parts are relatively primes, there exists an integer x          which configaration for this triple we want obtain, and
such that the set {0, 1, `} goes to the set {0, 2, 3} by        show that we set this configuration with the 0-th triple
the string ax . Apply the string ax to the set {0, 1} ∪ S,      set to {0, 1, 2}. When finally setting the first triple, we
and then apply symbol b. We get the set {0, 1, 3} ∪ S 0 ,       also show how to set it with an arbitrary configuration
where S 0 is a rotation of the set S by the string ax .         in the 0-th triple. So, first let ` ≥ 2. A configuration
And now, again, there exists an integer y such that the         in this triple is given by a subset S of {3, 4, 5}. We
set {0, 1, 3} ∪ S 0 goes to the set {0, 1} ∪ S ∪ {`} by ay .    first count the numbers of a’s in the string on a path
So, in this way, we can reach every set {0, 1} ∪ S. Let         from {0, 1, 2} to {0, 1, 2} ∪ S in the dfa B, and denote
us show how to get every subset of states in first part         it by a# . Now consider some starting strings:
without changing the second part. Every set {0, 1} ∪ S                as0 = a3.(k−1−`+1) ,
goes to the set {1, 2} ∪ S as well as to the set {0, 2} ∪ S           as1 = a3.(k−1−`+1)−1 ,
by an appropriate numbers of a’s. Every set {1, 2} ∪ S                as2 = a3.(k−1−`+1)−2 ;
goes to the set {2} ∪ S by bb, and then to {0} ∪ S and          different starting strings are needed because the num-
{1} ∪ S by an appropriate numbers of a’s. Every set             ber of a’s must be a multiply of 3 in the end.
{0, 2} ∪ S goes to the set {0, 1, 2} ∪ S by bb. Finally,              Next we move the `-th triple to the place of first
every set {1} ∪ S goes to the set ∅ ∪ S by bb. This             triple by one the of starting strings as0 , as1 , as2 : if
completes the proof of reachability if n = 3k + 1 or            a# (mod 3) = 0 we use as0 so we get `0 , `1 , `2 at
n = 3k + 2.                                                     the place of the first triple, if a# (mod 3) = 1 we
    2. If n = 3k, we can split the states of the nfa            use as1 so we get `1 , `2 , X at place of first triple, if
into triples, the first part is a triple 0, and the second      a# (mod 3) = 2 we use as2 so we get `2 , X, X at place
part consists of triples 1, 2, . . . , k − 1. We first reach    of first triple where X is a state from some other triple,
the set {0, 1, 2} from the starting set {0} by the string       thus we cannot modify X.
baabb. Let us show how to set a triple in the second                  Next we proceed by the string w and count the
part without changing the other triples. We use the             number of a’s. If the starting string was as0 , after the
automaton B shown in Fig. 8. In automaton B, every              1st, 4th, 7th, . . . symbol a, we apply a rotation arot
set is reachable from the set {0, 1, 2}. Assume we want         where a arot = a3.(k−2) , so that we do not modify
to set the `-th triple with 2 ≤ ` ≤ k − 1, and let us           the other triples. Similarly, if the starting string was
                                                                as1 , we apply the rotation arot after the 2nd, 5th, 8th,
                                                                . . . symbol a. Finally, if the starting string was as2 , we
                                                                apply the rotation after the 3rd, 6th, 9th, . . . symbol a.
                                                                      Now we have set the `-th triple, but still have to
                                                                move the triple to its place `: we just need to apply
                                                                the string a3(`−1) (a back string).
                                                                      So the complete string consists of one of the strat-
                                                                ing strings, a new route string, and a back string. Thus
                                                                in this way, we can set the 0-th triple {0, 1, 2} with all
                                                                triples except for the first triple. We set the first triple
                                                                in a similar way, but now we use paths from {0, 1, 2} to
                                                                every state in the dfa B. That means that all subsets
                                                                are reachable. This completes the proof of reachability
                                                                for n = 3k.                                                u
                                                                                                                           t


                                                                5    Unary alphabet

                                                                We now show that we cannot reduce the size of the
                                                                alphabet to one symbol.

                                                                Theorem 5. The minimal dfa for the reversal of
                                                                every unary n-state dfa language has n states.

Fig. 8. The nfa AR from Proof of Theorem 4 for n = 6 (left      Proof. Every string w in a unary language L consists
up corner), and dfa for L(A)R , the main part of the picture.   only of symbols, for example, a. Therefore, w = wR ,
Red lines correspond to the transitions by symbol b, and        and so L = LR . That means that the reversal of the
blue lines to the transitions by symbol a.                      language L has also complexity n.                 u
                                                                                                                  t
52      Juraj Šebej

6    Binary automata with one, two, and
     three states
In this section, we examine the reversals of regular lan-
guages that can be accepted by one-, two- and three-
state dfa’s. We first observe that the reversal of a one-
state dfa language is the same language. It turns out
that that the reversal of no two-state dfa language can Fig. 10. The dfa A (top left), the reversal of A (bottom
be accepted by a one-state dfa, and so in this case, the left), the subset automaton for the reversal; α = 2.
lower bound log 2 cannot be reached. On the other
hand, we show that all other possible values, that is,
2, 3, and 4, can be obtained as the size of the mini-
mal dfa for the reversal of a two-state binary dfa lan-
guage. We next prove that all values from 2 to 8 can be
reached as the number of states in the minimal dfa rec-
ognizing the reversal of a binary language represented
by a three-state deterministic finite automaton.

Theorem 6. The reversal of every one-state dfa lan- Fig. 11. The dfa A, the reversal of A, the subset automa-
guage is a one-state dfa language.                  ton for the reversal; α = 3.

Proof. Let us prove the theorem by inspecting all one-
state automata. We only have two possibilities shown 7 Binary alphabet
in Fig. 9. If the state is accepting, then the automa-
                                                       In this section, we describe n-state dfa’s whose rever-
                                                       sals need exactly n + 1 and n + 2 deterministic states.
                                                       Notice that by Theorem 5, the reversal of an n-state
                                                       unary language needs exactly n-states.
                                                              Theorem 9. For every integer n with n ≥ 2, there
Fig. 9. The one-state dfa accepting all strings (left), and   exists an n-state dfa A over a two-letter alphabet such
the one-state dfa accepting no strings (right).               that the minimal dfa for the reversal of the
                                                              language L(A) has n + 1 states.

ton accepts all strings. If the state is rejecting, the       Proof. Let n ≥ 2. Consider the n-state dfa A shown
automaton does not accept any string. In both cases,          in Fig. 18 with states 1, 2, . . . , n, of which 1 is the
the reversal is the same language, and so is accepted         starting state and also the sole accepting state. For all
by the same one-state dfa.                            u
                                                      t       i = 1, 2, . . . , n − 1 state i goes by symbol a to state
                                                              i + 1, and state n goes by symbol a to itself. For all
Theorem 7. For each α with 2 ≤ α ≤ 4, there exists            i = 2, 3, . . . , n state i goes by symbol b to state i − 1,
a two-state binary dfa A such that the minimal dfa for        and state 1 goes by b to itself. The dfa A is minimal
the reversal of the language L(A) has exactly α states.       since for two states i, j with i < i, the string bi−1 is
                                                              accepted from state i but not from state j.
Proof. The corresponding automata for α = 2, 3, 4 are             Construct an nfa for the reversal of the lan-
shonw in Fig. 10, Fig. 11, and Fig. 12, respectively.         guage L (A) by swapping the starting and accepting
The figures show a two-state dfa, its reversal, and the       states, and by reversing all transitions in A. Let us
reachable states in the corresponding subset automa-          show that the corresponding subset automaton has
ton. By Theorem 2, the subset automata are minimal.           n + 1 reachable states . The set {1} is reachable be-
                                                              cause it is the starting state in the subset automaton.
Theorem 8. For each α with 2 ≤ α ≤ 8, there exists
                                                              The set {1} goes to the empty set by symbol a, and to
a three-state binary dfa A such that the minimal dfa
                                                              the set {1, 2} by symbol b. Every set {1, 2, . . . , i} with
for the reversal of the language L(A) has exactly
                                                              2 ≤ i ≤ n − 1 goes to the set {1, 2, . . . , i − 1} by a, and
α states.
                                                              to the set {1, 2, . . . , i + 1} by b. The set {1, 2, . . . , n}
Proof. Similarly as in the previous proof, we show            goes to itself by a and b. Thus the sets {1}, {1, 2}, . . .,
the appropriate three-state binary automata for α =           {1, 2, . . . , n}, and the empty set are reachable, while
2, 3, 4, 5, 6, 7, 8 in Figures 13, 14, 15, 16, 17.            no other set is reachable. It follows that the minimal
                                                              dfa for the reversal of L(A) has n + 1 sets.                   u
                                                                                                                             t
                                                                                  Reversal of regular languages         53


                                                            Fig. 17. The dfa A, the reversal of A, the subset automa-
Fig. 12. The dfa A, the reversal of A, the subset automa-   ton for the reversal; α = 8.                           u
                                                                                                                   t
ton for the reversal; α = 4.                           u
                                                       t


                                                            Fig. 18. The n-state binary dfa requiring (n + 1)-state dfa
                                                            for the reversal.


Fig. 13. The dfa A (top left), the reversal of A (bottom
left), the subset automaton for the reversal; α = 2.

                                                            Fig. 19. The n-state binary dfa requiring (n + 2)-state dfa
                                                            for the reversal.


                                                            Proof. Let n ≥ 2. Consider the n-state dfa A shown in
                                                            Fig. 19 with states 1, 2, . . . , n, of which 1 is the starting
                                                            and the sole accepting state. Each state i goes to state
Fig. 14. The dfa A, the reversal of A, the subset automa-   i − 1 by symbol a, except for state 1 that goes to
ton for the reversal; α = 3, 4.                             state n by symbol a. Each state i goes to state 1 by
                                                            symbol b. The dfa A is minimal since for each state i,
                                                            the string ai−1 is accepted only from state i.
                                                                  Construct an nfa for the reversal of the lan-
                                                            guage L (A) by swapping the starting and accepting
                                                            states, and by reversing all transitions in A. Let us
                                                            show that the corresponding subset automaton has
                                                            n+2 reachable states. The set {1} is the starting state
                                                            in the suset automaton. For all i = 1, 2, . . . , n − 1, the
Fig. 15. The dfa A, the reversal of A, the subset automa-   set {i} goes to set {i + 1} by symbol a, and the set {n}
ton for the reversal; α = 5, 6.                             goes to the set {1} by symbol a. The set {1} goes to set
                                                            {1, 2, . . . , n} by symbol b. Each set {i} with i ≥ 2 goes
                                                            to the empty set by symbol b. The set {1, 2, . . . , n}
                                                            goes to itself by symbols a and b. So the sets {1}, {2},
                                                            . . ., {n}, {1, 2, . . . , n}, and the empty set are reach-
                                                            able, while no other set is reachable.                        u
                                                                                                                          t


                                                            8    Conclusions
Fig. 16. The dfa A, the reversal of A, the subset automa-
                                                        We studied the state complexity of languages that can
ton for the reversal; α = 7.
                                                        be obtained as reversals of regular languages repre-
                                                        sented by deterministic finite automata. We showed
                                                        that the state complexity of the reversal of a regu-
Theorem 10. For every integer n with n ≥ 2, there lar language with state complexity n is between log n
exists an n-state dfa A over a two-letter alphabet such and 2n . We gave a simple proof of a fact that the up-
that the minimal dfa for the reversal of the language per bound is tight in the ternary case. Then we pre-
L(A) has n + 2 states.                                  sented binary languages reaching this upper bound on
54      Juraj Šebej

the reversal. Our witness deterministic automata have           14. B.G. Mirkin: On dual automata. Kibernetika (Kiev) 2,
a single accepting state, which can be used in some                 1966, 7–10, (in Russian). English translation: Cyber-
results in the literature instead of an incorrect exam-             netics 2, 1966, 6–9.
ple in [20]. We also obtained some other partial results        15. E. Leiss: Succinct representation of regular languages
in the binary case for one-, two-, and three-state au-              by Boolean automata. Theoret. Comput. Sci. 13, 1981,
                                                                    323–330.
tomata. We described automata, the reversal of which
                                                                16. U.I. Lupanov: A comparison of two types of finite au-
has state complexity n, n + 1, and n + 2. In future,                tomata. Problemy Kibernetiki 9, 1963, 321–326.
we want to do statistics of reachable complexities for          17. G. Pighizzini, J. Shallit: Unary language operations,
the reversal of all automata up to five states. We also             state complexity and Jacobsthal’s function. Internat.
want to find automata, with other complexities then                 J. Found. Comput. Sci. 13, 2002, 145–159.
n, n + 1, n + 2, and 2n , and try to answer the question        18. M. Rabin, D. Scott: Finite automata and their deci-
whether all values from log n to 2n can be reached,                 sion problems. IBM Res. Develop. 3, 1959, 114–129.
or whether there are some “magic numbers” for the               19. A. Szabari: Regular languages and descriptional com-
reversal.                                                           plexity. PhD. Thesis, in preparation.
                                                                20. A. Salomaa, D. Wood, S. Yu: On the state complex-
                                                                    ity of reversals of regular languages. Theoret. Comput.
References                                                          Sci. 320, 2004, 315–329.
                                                                21. M. Sipser: Introduction to the theory of computation.
                                                                    PWS Publishing Company, Boston, 1997.
 1. J.-C. Birget: Intersection and union of regular lan-        22. S. Yu: Chapter 2: Regular languages. In: Rozen-
    guages and state complexity. Inform. Process. Lett. 43,         berg, G., Salomaa, A. (eds.) Handbook of Formal Lan-
    1992, 185–190.                                                  guages - Vol. I, Springer, Heidelberg, 1997, 41–110.
 2. J.-C. Birget: Partial orders on words, minimal ele-         23. S. Yu: A renaissance of automata theory? Bull. Eur.
    ments of regular languages, and state complexity. The-          Assoc. Theor. Comput. Sci. 72, 2000, 270–272.
    oret. Comput. Sci. 119, 1993, 267–291.                      24. S. Yu, Q. Zhuang, K. Salomaa: The state complexity of
 3. C. Câmpeanu, K. Culik II, K. Salomaa, S. Yu: State             some basic operations on regular languages. Theoret.
    complexity of basic operations on finite languages. In:         Comput. Sci. 125, 1994, 315–328.
    WIA’99, LNCS, vol. 2214, 2001, 60–70.
 4. C. Câmpeanu, K. Salomaa, S. Yu: Tight lower bound
    for the state complexity of shuffle of regular languages.
    J. Autom. Lang. Comb. 7, 2002, 303–310.
 5. M. Domaratzki: State complexity and proportional re-
    movals. J. Autom. Lang. Comb. 7, 2002, 455–468.
 6. V. Geffert: Magic numbers in the state hierarchy of
    finite automata. In: Královič, R., Urzyczyn, P. (eds.)
    MFCS 2006. LNCS, vol. 4162, 2006, 412–423.
 7. M. Hricko: Finite automata, regular languages, and
    state complexity. Master’s Thesis. P.J. Šafárik Univer-
    sity in Košice, Slovakia, 2005.
 8. J. Hromkovič: Descriptional complexity of finite au-
    tomata: Concepts and open problems. J. Autom. Lang.
    Comb. 7, 2002, 519–531.
 9. K. Iwama, Y. Kambayashi K. Takaki: Tight bounds on
    the number of states of DFAs that are equivalent to
    n-state NFAs. Theoret. Comput. Sci. 237, 2000, 485–
    494.
10. K. Iwama, A. Matsuura, M. Paterson: A family of
    NFAs which need 2n − α deterministic states. Theo-
    ret. Comput. Sci. 301, 2003, 451–462.
11. G. Jirásková: On the state complexity of complements,
    stars, and reversals of regular languages. In: Ito M.,
    Toyama M. (eds.) DLT 2008. LNCS, vol. 5257,
    Springer, 2008, 431–442.
12. G. Jirásková: Magic numbers and ternary alphabet. In:
    Diekert, V. (ed.) DLT 2009, LNCS, vol. 5583, Springer,
    Heidelberg, 2009, 300–311.
13. A.N. Maslov: Estimates of the number of states of fi-
    nite automata. Soviet Math. Dokl. 11, 1970, 1373–
    1375.