Černý-like problems for finite sets of words ?

                       Arturo Carpi1 and Flavio D’Alessandro2
    1
        Dipartimento di Matematica e Informatica, Università degli Studi di Perugia,
                          Via Vanvitelli 1, 06123 Perugia, Italy.
                                  carpi@dmi.unipg.it
            2
              Dipartimento di Matematica, Università di Roma “La Sapienza”
                        Piazzale Aldo Moro 2, 00185 Roma, Italy.
                               dalessan@mat.uniroma1.it


         Abstract. This paper situates itself in the theory of variable length
         codes and of finite automata where the concepts of completeness and
         synchronization play a central role. In this theoretical setting, we in-
         vestigate the problem of finding upper bounds to the minimal length of
         synchronizing and incompletable words of a finite language X in terms of
         the length of the words of X. This problem is related to two well-known
         conjectures formulated by Černý and Restivo respectively. In particular,
         if Restivo’s conjecture is true, our main result provides a quadratic bound
         for the minimal length of a synchronizing pair of any finite synchronizing
         complete code with respect to the maximal length of its words.
         Keywords: Černý conjecture, synchronizing automaton, incompletable
         word, synchronizing set, complete set


1       Introduction

The concepts of completeness and synchronization play a central role in Com-
puter Science since they appear in the study of several problems on variable
length codes and on finite automata. According to a well-known result of Schü-
tzenberger, the property of completeness provides an algebraic characterization
of finite maximal codes, which are the objects used in Information Theory to
construct optimal sequential codings. Let X be a set of words on an alphabet
A. The set X is complete if any word on the alphabet A is a factor of some word
belonging to X ∗ , otherwise it is incomplete. In the latter case, any word which
is factor of no word of X ∗ is said to be incompletable in X. In [18], Restivo
conjectured that a finite incomplete set X has always an incompletable word
whose length is quadratically bounded by the maximal length of the words of
X. Results on this problem have been obtained in [5, 14, 15, 18].
    The property of synchronization plays a natural role in Information Theory
where it is relevant for the construction of decoders that are able to efficiently
cope with decoding errors caused by noise during the data transmission. A set
?
    This work was partially supported by MIUR project PRIN 2010/2011 “Automi e
    Linguaggi Formali: Aspetti Matematici e Applicativi”.


                                            81
A.Carpi et al. Černý-like problems for finite sets of words

  X is synchronizing if there are two words u, v of X ∗ such that whenever ruvs ∈
  X ∗ , r, s ∈ A∗ , one has also ru, vs ∈ X ∗ . The pair of words (u, v) is called
  synchronizing pair of X.
      In the study of synchronizing sets, the search for synchronizing words of min-
  imal length in a prefix complete code is tightly related to that of synchronizing
  words of minimal length for synchronizing complete deterministic automata and
  the celebrated Černý Conjecture [12] (see also [1–3, 6–12, 16, 17, 20] for some re-
  sults on the problem). In particular, in [2] (see also [3]), Béal and Perrin have
  proved that a complete synchronizing prefix code X on an alphabet of d letters
  with n code-words has a synchronizing word of length O(n2 ).
      In this paper we are interested in finding upper bounds to the minimal lengths
  of incompletable and synchronizing words of a finite set X in terms of the size
  of X. We recall that the size of X is the parameter `(X) defined as the maximal
  length of the words of X.
      Let L be a class of finite languages. For all n, d > 0, we denote by RL (n, d)
  the least positive integer r satisfying the following condition: any incomplete set
  X ∈ L on a d-letter alphabet such that `(X) ≤ n has an incompletable word of
  length r. Similarly, we denote by CL (n, d) the least positive integer c satisfying
  the following condition: any synchronizing set X ∈ L on a d-letter alphabet such
  that `(X) ≤ n has a synchronizing pair (u, v) such that |uv| ≤ c.
      In this context, the main result of this paper provides a bridge between the
  parameters RL (n, d) and CL (n, d). More precisely, denoting by F and by M
  the classes of finite languages and of complete finite codes respectively, we show
  that, for all n, d > 0,

                           CM (n, d) ≤ 2RF (n, d + 1) + 2n − 2.

  In particular, if Restivo’s conjecture is true, the latter bound gives

                                     CM (n, d) = O(n2 ),

  thus providing a quadratic bound in the size of the set for the minimal length
  of a synchronizing pair of a finite synchronizing complete code.
      In the second part of the paper, we study the dependence of the parameters
  RL (n, d) and CL (n, d) upon the number of letters d of the considered alphabet,
  by showing that both the parameters have a low rate of growth. More precisely,
  we show that, for the class L of finite languages (resp. codes, prefix codes), we
  have                                                    
                                         RL (dlog2 den, 2)
                           RL (n, d) ≤                       ,
                                             blog2 dc
  and, for the class L of finite complete languages (resp. codes, prefix codes), we
  have                                                       
                                      CL (dlog2 (d + 1)en, 2)
                        CL (n, d) ≤                             .
                                          blog2 (d − 1)c
  A similar result is obtained also when L is the class of finite (not necessarily
  complete) languages (resp. codes, prefix codes).


                                             82
A.Carpi et al. Černý-like problems for finite sets of words

  2      Preliminaries

  In this section we shortly recall some basic results of the theory of automata
  and of the theory of codes which will be useful in the sequel and we fix the
  corresponding notation used in the paper. The reader can refer to [4, 13] for
  more details.
     Let A be a finite alphabet and let A∗ be the free monoid of words over the
  alphabet A. The identity of A∗ is called the empty word and is denoted by .
  The length of a word w ∈ A∗ is the integer |w| inductively defined by || = 0,
  |wa| = |w| + 1, w ∈ A∗ , a ∈ A. Given w ∈ A∗ and a ∈ A, we denote by |w|a
  the number of occurrences of the letter a in w. For any finite set W of words we
  denote by `(W ) the maximal length of the words of W . The number `(W ) will
  be called the size of W . Given words u, w ∈ A∗ , u is said to be a factor of w if
  w = αuβ, for some α, β ∈ A∗ . The set of all factors of w is denoted by Fact(w).
  Given a set W of words, the set of the factors of all the words of W is denoted
  by Fact(W ). Similarly, given a word w, a word u is said to be a prefix of w if
  w = uβ, for some β ∈ A∗ . A set X is said to be prefix if, for any u, v ∈ X, with
  v = uw, with w ∈ A∗ , one has w = .
  Definition 1. Let X be a subset of A∗ . A pair of words (r, s) is an X-completion
  of a word w if rws ∈ X ∗ . A word having an X-completion is a completable word
  of X; conversely, a word with no X-completion is an incompletable word of X.
  The set X is complete if all words of A∗ are completable words of X; X is
  incomplete, otherwise.
        Another crucial notion of this paper is that of synchronizing set.
  Definition 2. A pair (u, v) ∈ X ∗ × X ∗ is a synchronizing pair of X if for every
  X-completion (r, s) of uv, it holds that

                                        ru, vs ∈ X ∗ .

  The set X is synchronizing if it has a synchronizing pair.
  The notion of synchronizing pair of a set is strictly related to that of constant.
  A word c of X ∗ is said to be a constant of X if, for every u1 , u2 , u3 , u4 ∈ A∗ such
  that u1 cu2 , u3 cu4 ∈ X ∗ , one has u1 cu4 , u3 cu2 ∈ X ∗ . The following result holds.
  Lemma 1. Let X be a subset of A∗ . If (u, v) is a synchronizing pair of X, uv
  is a constant of X. Conversely, if c is a constant of X, (c, c) is a synchronizing
  pair of X.


  2.1     Complete and synchronizing codes

  The notions of complete and synchronizing sets provide a rich structure in the
  case that the set is a code. It is worth to shortly describe some fundamental
  results on such sets. A set X of words over an alphabet A is said to be a (variable
  length) code over A if it fulfills the unique factorization property, that is, for every


                                             83
A.Carpi et al. Černý-like problems for finite sets of words

  word u ∈ X ∗ , there exists a unique sequence x1 , . . . , xk of words of X such that
  u = x1 · · · xk . A well-known example of codes is given by prefix sets. The notion
  of code is strictly related to the one of monomorphism. Let A and B be two
  alphabets. A map h : A∗ → B ∗ is said to be a monomorphism or (sequential)
  encoding if h is injective and, for every u, v ∈ A∗ , one has h(uv) = h(u)h(v).
  The following lemma shows a well-known link between the notion of code and
  that of monomorphism.

  Proposition 1. Let h : A∗ → B ∗ be a monomorphism. Then the set X = h(A)
  is a code over B. If X is a code over B and X has the same cardinality of A, then
  every bijection between A and X can be extended to a (unique) monomorphism
  h : A∗ → B ∗ .

  In view of Proposition 1, the monomorphism defined by a prefix code will be
  called prefix encoding. A submonoid of A∗ is said to be free if it is generated by
  a code. An important result due to Schützenberger provides a characterization
  of a free submonoid of A∗ in terms of a property called stability. A submonoid
  N of A∗ is said to be stable if, for every word u ∈ A∗ , the existence of words
  n1 , n2 , n3 , n4 of N such that n2 = un1 and n4 = n3 u implies u ∈ N .

  Theorem 1. Let N be a submonoid of A∗ . Then N is a free submonoid of A∗
  if and only if N is stable.

  Let us finally recall some well-known results on codes.
  Theorem 2. (Kraft-McMillan inequality) Let A be a d-letter alphabet with d ≥ 2
  and let k1 , . . . , kn be a finite sequence of positive integers such that
                                        n
                                        X
                                              d−ki ≤ 1 .
                                        i=1

  Then k1 , . . . , kn is the sequence of the code-word lengths of a prefix code over A.
  Given a code X over an alphabet A, X is said to be maximal if it is not properly
  contained in any other code over A. Let us recall another important result due
  to Schützenberger. It provides a tight relation between maximal and complete
  codes.
  Theorem 3. Let A be a d-letter alphabet and let X be a regular code of A∗ . The
  following conditions are equivalent:

    – X is a maximal code;
    – X
      P is a complete  code;
               −|x|
    –   x∈X  d      = 1.

  Given a set X over an alphabet A, X is said to be maximal prefix if it is prefix
  and it is not properly contained in any other prefix code over A. In the case of
  prefix codes we get:


                                              84
A.Carpi et al. Černý-like problems for finite sets of words

  Theorem 4. Let X be a prefix code of A∗ and suppose that X is a regular
  language. The following conditions are equivalent:

    – X is maximal prefix;
    – X is right complete, that is, for every word u ∈ A∗ , uA∗ ∩ XA∗ 6= ∅;
    – X is a maximal code.

  It is worth recalling that Theorems 3 and 4 hold in the more general case of thin
  codes that is, for all the subsets X of A∗ such that X ∩ A∗ wA∗ = ∅, for some
  word w ∈ A∗ . Another relevant result concerning synchronizing and complete
  codes is the following.

  Theorem 5. Let X be a synchronizing and complete code. Then there exists a
  constant c ∈ X ∗ such that cA∗ c ⊆ X ∗ . Conversely, if X is a code such that there
  exists a word c ∈ X ∗ with cA∗ c ⊆ X ∗ , then X is synchronizing and complete.


  2.2    Synchronizing automata and the Černý conjecture

  A finite non-deterministic automaton is a tuple A = hQ, A, δ, I, F i where Q is a
  finite set of elements called states, δ is a map

                                     δ : Q × A −→ P(Q),

  where P(Q) is the power set of Q, and I, F are subsets of Q. The map δ is called
  the transition function of A while I is called the set of the initial states and F
  is the set of the final states. The canonical extension of the map δ to the set
  Q × A∗ is still denoted by δ. If P is a subset of Q and u is a word of A∗ , we
  denote by δ(P, u) and δ(P, u−1 ) the sets:

            δ(P, u) = {δ(s, u) | s ∈ P },    δ(P, u−1 ) = {s ∈ Q | δ(s, u) ∈ P }.

  If no ambiguity arises, the sets δ(P, u) and δ(P, u−1 ) are denoted P u and P u−1 ,
  respectively. With the automaton A, we can associate a directed labelled multi-
  graph G = (Q, E), where the set E of edges is defined as E = {(p, a, q) ∈
  Q × A × Q | q ∈ δ(p, a)}. We recall that if p, q ∈ Q, then q ∈ pu with u ∈ A∗ , is
                                                    u
  equivalent to the existence of a path c = p −→ q in G labelled by u. The label of
  c is denoted by ||c||. A word u ∈ A∗ is said to be accepted by A if Iu∩F 6= ∅. The
  language accepted by A, denoted by LA , is the set of all the words accepted by
  A. An automaton A is said to be unambiguous if the following condition holds:
  for every q1 , q2 , q3 , q4 ∈ Q and for every u, v ∈ A∗ one has

                       q2 , q3 ∈ q1 u,   q4 ∈ q2 v ∩ q3 v =⇒ q2 = q3 .

     An automaton A is said to be deterministic if for every q ∈ Q and for every
  a ∈ A, Card(qa) ≤ 1.
     An automaton A is said to be complete if for every u ∈ A∗ , there exists some
  q ∈ Q such that Card(qu) ≥ 1.


                                             85
A.Carpi et al. Černý-like problems for finite sets of words

      An automaton A is said to be transitive if for every p, q ∈ Q, there exists
  u ∈ A∗ such that q ∈ pu. In the sequel, we will only consider automata A such
  that I = F = {1}, that is, with a unique initial and final state denoted 1. In
  particular, the tuple of the automaton A will be simply denoted A = hQ, A, δ, 1i.
  Moreover, in the sequel, we will only consider transitive automata.
      An unambiguous automaton A is said to be synchronizing if there exist two
  words w1 , w2 ∈ A∗ such that Qw1 ∩ Qw2−1 = {1}.
      As is well known, a deterministic automaton A is synchronizing if and only if
  there is a word u such that Card(Qu) = 1. Such a word is said to be a synchro-
  nizing word of A. The following celebrated conjecture has been raised in [12].
  Černý Conjecture. Each synchronizing and complete deterministic automaton
  with n states has a synchronizing word of length (n − 1)2 .
      Let us recall an important problem related to the Černý Conjecture. Let G
  be a finite, directed multigraph with all its vertices of the same outdegree. Then
  G is said to be aperiodic if the greatest common divisor of the lengths of all
  cycles of the graph is 1. The graph G is called a Road coloring graph (RC-graph
  for short) if it is aperiodic and strongly connected. A synchronizing coloring of
  G is a labeling of the edges of G that transforms it into a complete, deterministic
  and synchronizing automaton. The Road coloring problem asks for the existence
  of a synchronizing coloring for every RC-graph. In 2007, Trahtman proved the
  following remarkable result [19].
  Theorem 6. Every RC-graph has a synchronizing coloring.
  Let us conclude this section by recalling some well-known properties of the au-
  tomaton that accepts the submonoid generated by a finite set (see, e.g., [6]). Let
  X be a finite set of words over an alphabet A. By using a standard construction,
  one can associate with X a transitive automaton denoted by AX = hQ, A, δ, 1i
  that accepts X ∗ .

  Lemma 2. Let X be a regular code (resp., prefix set). Then AX is an unambigu-
  ous (resp., deterministic) automaton. Conversely, let A be an automaton such
  that LA = X ∗ where X ∩ X 2 X ∗ = ∅. If A is unambiguous (resp., deterministic),
  then X is a code (resp. a prefix set).

  Incompletable words of a regular set are characterized by the following.
  Lemma 3. Let X be a regular set and A = hQ, A, δ, 1i be a transitive automaton
  accepting X ∗ . Then w ∈ A∗ is a completable word of X if and only if Qw 6= ∅.
     In particular, X is complete if and only if A is complete.
  Similarly it is possible to characterize synchronizing pairs of regular codes [6].
  Lemma 4. Let X be a regular code and A = hQ, A, δ, 1i be a transitive unam-
  biguous automaton accepting X ∗ and w1 , w2 ∈ A∗ . Then (w1 , w2 ) is a synchro-
  nizing pair of X if and only if w1 w2 ∈ X ∗ and Qw1 ∩ Qw2 −1 = {1}.
      Consequently, X is synchronizing if and only if A is synchronizing.


                                             86
A.Carpi et al. Černý-like problems for finite sets of words

  3    The main result
  The main result of this paper is related to a problem that was formulated in [18]
  by Restivo. Let L be a class of finite languages. For all n > 0 we set
                    RL (n) = sup RL (n, d) ,      CL (n) = sup CL (n, d) .
                               d≥1                          d≥1

  In [18], it was conjectured that if F is the class of all finite languages, then
  RF (n) ≤ 2n2 . If we restrict ourselves to prefix codes, we get
  Proposition 2. ([18]) Let P be the class of finite prefix codes. Then
                                        RP (n) ≤ 2n2 .
  However, in the general case, the previous bound was disproved in [14]. A more
  general and larger counterexample is given in [15]. We can thus state a slightly
  weaker version of the problem as follows.
  Conjecture 1. (Restivo’s Conjecture) Let F be the class of all finite languages.
  Then RF (n) = O(n2 ).
      In this context, the main result of this paper is the following.
  Proposition 3. Let M be the class of complete finite codes. For all n, d > 0,
                           CM (n, d) ≤ 2RF (n, d + 1) + 2n − 2.
  Before proving Proposition 3, it is convenient to discuss some interesting conse-
  quences of this result. First, if Restivo’s conjecture is true, we get
                                     CM (n, d) = O(n2 ).
  Moreover, the bound above would be sharp, as we explain below. Consider the
  prefix code Xn = aAn−1 ∪ bAn−2 on the alphabet A = {a, b}. The minimal
  automaton accepting Xn∗ has been studied in [1], where it has been proved that
  the minimal length of its synchronizing words is n2 − 3n + 3. From this, one
  derives that any synchronizing pair (w1 , w2 ) of Xn verifies |w1 w2 | ≥ (n − 1)2 . In
  particular, a synchronizing pair of Xn of minimal length is ((abn−2 )n−1 , ). This
  provides the lower bound
                             CM (n, 2) ≥ CP (n, 2) ≥ (n − 1)2 ,
  for the parameter CM (n, 2).
      It is also worth to do a remark on a recent result by Béal and Perrin. In [2]
  (cf. also [3]), it is proved that a synchronizing complete prefix code X with n
  code-words has a synchronizing word of length 2(n − 2)(n − 3) + 1. This result
  is derived from an upper bound to the length of shortest synchronizing words
  of synchronizing one-cluster automata. However, in view of Proposition 3 and
  Restivo’s conjecture, this bound seems of no help in obtaining a good evaluation
  of the parameter CP (n, 2), as one may have n ' 2`(X) . This suggests that a
  bound in term of the size of X may be more informative than a bound in terms
  of the cardinality.


                                             87
A.Carpi et al. Černý-like problems for finite sets of words

  3.1    Proof of Proposition 3
  Let us now proceed to prove Proposition 3. For this purpose, let X be a finite
  complete synchronizing code over a d-letter alphabet A and let n = `(X). Let
  AX = hQ, A, δ, 1i be the unambiguous automaton that accepts X ∗ . The proof
  of Proposition 3 is based upon the following lemma.
  Lemma 5. Let (v1 , v2 ) be a synchronizing pair of X. Then, there exist words
  w1 , w2 ∈ A∗ such that

              |w1 |, |w2 | ≤ RF (n, d + 1),   Qw1 ⊆ Qv1 ,       Qw2 −1 ⊆ Qv2−1 .

  Indeed, assume that Lemma 5 holds. As X is complete, the word w1 w2 has an X-
  completion (r, s). With no loss of generality, we may suppose that |r|, |s| ≤ n−1.
  Since (v1 , v2 ) is a synchronizing pair, one has

             Q(rw1 ) ∩ Q(w2 s)−1 ⊆ Qw1 ∩ Qw2 −1 ⊆ Qv1 ∩ Qv2 −1 = {1}.

  Moreover, the word rw1 w2 s ∈ X ∗ is accepted by AX and therefore there is a state
  q ∈ Q such that q ∈ 1rw1 and 1 ∈ qw2 s. Thus, q ∈ Q(rw1 ) ∩ Q(w2 s)−1 ⊆ {1},
  that is, q = 1. This proves that rw1 , w2 s ∈ X ∗ and by Lemma 4 (rw1 , w2 s) is a
  synchronizing pair of X.
      Now, our main goal is to prove Lemma 5. For the sake of simplicity, we will
  prove the existence of the word w1 that fulfills the conditions of Lemma 5 since
  the proof of the existence of the word w2 can be obtained by using a symmetric
  construction. The main tool of this proof is a new automaton we construct below.
      Let (v1 , v2 ) be a synchronizing pair of X. If v1 = , the statement is trivially
  verified by w1 = v1 . Thus we assume v1 6=  and set v1 = ua, with u ∈ A∗ and
  a ∈ A.
      Let a0 be a symbol not belonging to A and let A0 = A ∪ {a0 }. We consider
  a new automaton A0 = hQ, A0 , δ 0 , 1i where the transition map δ 0 is defined as
  follows: for every q ∈ Q and a ∈ A, δ 0 (q, a) = δ(q, a) and
                                        (
                                         δ(q, a) ∪ {1} if q ∈
                                                            / δ(Q, u),
                         δ 0 (q, a0 ) =                                              (1)
                                         δ(q, a) \ {1} if q ∈ δ(Q, u).

  It is useful to remark that, by construction, the automaton A0 is still transitive.
  Let Y be the minimal generating set of the language accepted by A0 . Thus,
  LA0 = Y ∗ and Y ∩ Y 2 Y ∗ = ∅.
  Lemma 6. The set Y is incomplete.
  Proof. By (1) one has δ 0 (Q, ua0 ) = δ(Q, ua)\{1} = δ(Q, v1 )\{1} and δ 0 (Q, v2−1 ) =
  δ(Q, v2 ). Taking into account that (v1 , v2 ) is a synchronizing pair of X, one de-
  rives
              δ 0 (Q, ua0 ) ∩ δ 0 (Q, v2−1 ) = (δ(Q, v1 ) ∩ δ 0 (Q, v2−1 )) \ {1} = ∅ .
  It follows that δ 0 (Q, ua0 v2 ) = ∅. This equation proves that the automaton A is
  not complete. Thus, by Lemma 3, Y is an incomplete set.                         t
                                                                                  u


                                              88
A.Carpi et al. Černý-like problems for finite sets of words

  Lemma 7. It holds that `(Y ) ≤ `(X).

  Proof. In order to prove the statement, it is enough to show that, for every
  y ∈ Y, there exists x ∈ X with |y| ≤ |x|.
     Let y = a1 · · · ak ∈ Y , with ai ∈ A0 , for i = 1, . . . , k. Since Y ∩ Y 2 Y ∗ = ∅, in
  the graph of A0 there is a path
                                a            a            a     ak−1       a
                        c0 = 1 −→
                                1     2
                                  q1 −→     3
                                        q2 −→              k
                                              · · · −→ qk −→ 1,

  where, for every i = 1, . . . , k, qi 6= 1. Let us now construct a path c in the graph
  of AX such that ||c|| = x ∈ X, with |x| ≥ |y|, so completing the proof.
                                                           b
      By the definition of A0 , any edge p −→ q of the graph of A0 with b 6= a0 is
                                                               a0
  also an edge of the graph of A. Moreover, if p −→ q is an edge of the graph of A0
                       a
  with q 6= 1, then p −→ q is an edge of the graph of A. Thus, by replacing in c0 ,
                         a0                      a                           k        a
  every transition qi −→ qi+1 , by qi −→ qi+1 and deleting the last edge qk −→ 1,
  we find a path

                                     1   b 2          b             bk−1
                              d = 1 −→ q1 −→ q2 · · · −→ qk
  of the graph of A. Since A is transitive, one can catenate d with a simple path
  from qk to 1. In such a way, we obtain a path c of the graph of A starting and
  ending in 1, with all intermediate states distinct from 1 and length ≥ k + 1. As
  is well known, as A is unambiguous, the label x of such a path is a word of the
  minimal generating set X of X ∗ . Since |x| ≥ k + 1 = |y|, this completes the
  proof.                                                                         t
                                                                                 u

      We now prove the following lemma.
  Lemma 8. Let v be an incompletable word of Y of minimal length. There exists
  a word w1 ∈ A∗ such that

                                    |w1 | ≤ |v|,          Qw1 ⊆ Qv1 .

  Proof. Let v be an incompletable word of Y of minimal length, with the number
  |v|a0 as small as possible. Then, by Lemma 3, one has δ 0 (Q, v) = ∅.
      The letter a0 necessarily occurs in v, since by the completeness of A, one
  has δ 0 (Q, r) = δ(Q, r) 6= ∅ for all r ∈ A∗ . Thus, we can write v = u1 a0 u2 , with
  u1 ∈ A∗ and u2 ∈ A0∗ .
      Let us verify that δ(Q, u1 ) ⊆ δ(Q, u). Indeed, suppose the contrary. Then,
  by (1), one has

                    δ 0 (Q, u1 a0 ) = δ(Q, u1 a) ∪ {1} = δ 0 (Q, u1 a) ∪ {1}

  and consequently, δ 0 (Q, u1 au2 ) ⊆ δ 0 (Q, u1 a0 u2 ) = ∅. Thus, u1 au2 is an incom-
  pletable word of Y , but this contradicts the minimality of |v|a0 .
      We conclude that δ(Q, u1 ) ⊆ δ(Q, u) and therefore taking w1 = u1 a and
  recalling that v1 = ua, one has δ(Q, w1 ) ⊆ δ(Q, v1 ) and |w1 | ≤ |v|. The statement
  follows.                                                                             t
                                                                                       u


                                                     89
A.Carpi et al. Černý-like problems for finite sets of words

          Let us finally remark that Lemma 7 and Lemma 8 yield

                           |w1 | ≤ RF (n, d + 1),    Qw1 ⊆ Qv1 .

  The proof of Lemma 5 is thus complete.


  4        Reduction to the binary case
  The aim of this section is to study how much the parameters RL (n, d) and
  CL (n, d) vary according to the cardinal number d of the alphabet. We start to
  analyze the parameter RL (n, d). In the sequel, B denotes the binary alphabet
  B = {a, b}. The following lemmas are needed for the proof of Proposition 4.
  Lemma 9. Let Y ⊆ A∗ be a complete finite set. Then any word w of A∗ has a
  Y -completion (y, s) with y ∈ Y ∗ .
  Lemma 10. Let h : A∗ → B ∗ be a prefix encoding and Y ⊆ A∗ . The set h(Y )
  is complete if and only if Y and h(A) are complete.
     Encoding a d-letter alphabet on a suitable complete binary prefix code one
  obtains
  Proposition 4. Let L be the class of finite languages (resp. codes). Then
                                                      
                                     RL (dlog2 den, 2)
                       RL (n, d) ≤                       .                         (2)
                                          blog2 dc
  Proof. Let A be a d-letter alphabet and let X be a finite incompletable language
  over A of size n. By Theorems 2 and 3, there exists a maximal prefix code Y
  over B such that Card(Y ) = d and, for every y ∈ Y , blog2 dc ≤ |y| ≤ dlog2 de.
  Let h : A∗ → B ∗ be a monomorphism constructed by Y . In particular, for every
  a ∈ A, we have
                             blog2 dc ≤ |h(a)| ≤ dlog2 de.                     (3)
  By (3) the size of h(X) is not greater than ndlog2 de. By Lemma 10, since X is
  incompletable, h(X) is incompletable as well. Let v be an incompletable word
  in h(X) of minimal length. Hence we have

                                   |v| ≤ RL (dlog2 den, 2).                        (4)

     Since h(A) is a complete prefix code, by Theorem 4, there exist u ∈ A∗ and
  s ∈ B ∗ such that h(u) = vs and by (3)
                                                
                                           |v|
                                |u| ≤              .                         (5)
                                        blog2 dc
      Let us check that u is incompletable. By contradiction, deny. Then r0 us0 ∈
  X , for some r0 , s0 ∈ A∗ . Consequently, h(r0 us0 ) = h(r0 )vsh(s0 ) ∈ h(X ∗ ), thus
      ∗

  implying that v is completable in h(X). Now (2) easily follows from the latter,
  (4) and (5).                                                                        t
                                                                                      u


                                             90
A.Carpi et al. Černý-like problems for finite sets of words

      Let us now analyze the parameter CL (n, d). The next two lemmas are useful
  for this purpose. In particular the following lemma is algebraically similar to
  Lemma 10.

  Lemma 11. Let h : A∗ → B ∗ be a monomorphism and let Y ⊆ A∗ be a complete
  set. The set h(Y ) is synchronizing if and only if Y and h(A) are synchronizing.

  Lemma 12. Let k1 , . . . , kn , d > 0 be such that
                                                           n
                                                           X
                        gcd(k1 , k2 , . . . , kn ) = 1 ,         d−ki = 1.
                                                           i=1

  Then k1 , . . . , kn are the code-word lengths of a synchronizing complete prefix code
  over d letters.

  Remark 1. It is worth noticing that a finite synchronizing complete prefix code
  over d letters satisfies both the conditions of Lemma 12. Indeed,P           by Theorem 3,
                                                                                n
  if X = {x1 , . . . , xn } is a complete code over d letters, one gets i=1 d−ki = 1.
  Moreover, by Theorem 5, there exists a constant x ∈ X for X such that xA∗ x ⊆
                                                                  ∗

  X ∗ . Let γ = gcd(k1 , k2 , . . . , kn ). Since, by the latter result, x2 , xax ∈ X ∗ , with
  a ∈ A, γ should divide both 2|x| and 2|x| + 1, whence γ = 1.

  As an application of the two lemmas above, encoding a d-letter alphabet on a
  suitable complete binary synchronizing code, one obtains the following result:

  Proposition 5. Let L be the class of finite complete languages (resp. codes,
  prefix codes). Then
                                                           
                                    CL (dlog2 (d + 1)en, 2)
                      CL (n, d) ≤                             .            (6)
                                        blog2 (d − 1)c
     A similar bound can be found also in the case where completeness is not
  required:

  Proposition 6. Let L be the class of finite languages (resp. codes, prefix codes).
  Then                                                    
                                   CL (dlog2 (d + 1)en, 2)
                     CL (n, d) ≤                             .                   (7)
                                        dlog2 (d + 1)e

  References
   1. D. S. Ananichev, V. V. Gusev, M. V. Volkov, Slowly synchronizing automata and
      digraphs, in: P. Hliněný, A. Kučera eds., MFCS 2010 Mathematical Foundations of
      Computer Science, Lecture Notes in Comput. Sci. Vol. 6281, pp. 55–65, Springer,
      Berlin, 2010.
   2. M.-P. Béal, D. Perrin, A quadratic upper bound on the size of a synchronizing word
      in one-cluster automata, in: V. Diekert, D. Nowotka eds., DLT 2009 Developments
      in Language Theory, Lecture Notes in Computer Science, Vol. 5583, pp. 81–90,
      Springer, Berlin, 2009.


                                                 91
A.Carpi et al. Černý-like problems for finite sets of words

   3. M.-P. Béal, M. V. Berlinkov, D. Perrin, A quadratic upper bound on the size of
      a synchronizing word in one-cluster automata, Int. J. Found. Comput. Sci., 22,
      277–288, 2011.
   4. J. Berstel, D. Perrin, Ch. Reutenauer, Codes and Automata, Encyclopedia of Math-
      ematics and its Applications, 129, Cambridge University Press, 2009.
   5. J. M. Boë, A. de Luca, A. Restivo, Minimal complete sets of words, Theoret.
      Comput. Sci., 12, 325–332, 1980.
   6. A. Carpi, On synchronizing unambiguous automata, Theoret. Comput. Sci., 60,
      285–296, 1988.
   7. A. Carpi, F. D’Alessandro, The Synchronization Problem for Strongly Transitive
      Automata, in: M. Ito, M. Toyama eds., DLT 2008 Developments in Language The-
      ory, Lecture Notes in Computer Science, Vol. 5257, pp. 240–251, Springer, Berlin,
      2008.
   8. A. Carpi, F. D’Alessandro, Strongly transitive automata and the Černý conjecture
      Acta Informatica, 46, 591–607, 2009.
   9. A. Carpi, F. D’Alessandro, The synchronization problem for locally strongly transi-
      tive automata, in: R. Královič, D. Niwiński eds., MFCS 2009 Mathematical Founda-
      tions of Computer Science, Lecture Notes in Comput. Sci., Vol. 5734, pp. 211–222,
      Springer, Berlin, 2009.
  10. A. Carpi, F. D’Alessandro, On the Hybrid Černý-Road coloring problem and
      Hamiltonian paths, in: Y. Gao, H. Lu, S. Seki, S. Yu eds., DLT 2010 Develop-
      ments in Language Theory, Lecture Notes in Comput. Sci., Vol. 6224, pp. 124–135,
      Springer, Berlin, 2010.
  11. A. Carpi, F. D’Alessandro, Independent sets of words and the synchronization
      problem, Advances in Applied Mathematics, 50, 339–355, 2013.
  12. J. Černý, Poznámka k. homogénnym experimenton s konečnými automatmi, Mat.
      fyz. cas SAV, 14, 208–215, 1964.
  13. A. de Luca, F. D’Alessandro, Teoria degli Automi Finiti, 68, Springer Italia, 2013.
  14. G. Fici, E. V. Pribavkina, J. Sakarovitch, On the Minimal Uncompletable Word
      Problem, CoRR, arXiv: 1002.1928, 2010.
  15. V. V. Gusev, E. V. Pribavkina, On Non-complete Sets and Restivo’s Conjecture, in:
      G. Mauri, A. Leporati eds., DLT 2011 Developments in Language Theory, Lecture
      Notes in Comput. Sci., Vol. 6795, pp. 239–250, Springer, Berlin, 2011.
  16. J. E. Pin, Le problème de la synchronization et la conjecture de Cerny, Thèse de
      3ème cycle, Université de Paris 6, 1978.
  17. J. E. Pin, Sur un cas particulier de la conjecture de Cerny, in: G. Ausiello, C.
      Böhm eds., 5th ICALP Lecture Notes in Computer Science, Vol. 62, pp. 345–352,
      Springer, Berlin, 1978.
  18. A. Restivo, Some remarks on complete subsets of a free monoid, in: A. de Luca ed.,
      Non-Commutative Structures in Algebra and Geometric Combinatorics, Interna-
      tional Colloquium, Arco Felice, July 1978, Quaderni de “La Ricerca Scientifica”,
      CNR, 109, 19–25, 1981.
  19. A. N. Trahtman, The road coloring problem, Israel J. Math., 172, 51–60, 2009.
  20. M. V. Volkov, Synchronizing automata and the Cerny conjecture, in: C. Martı́n-
      Vide, F. Otto, H. Fernau eds., LATA 2008 Language and Automata Theory and
      Applications, Lecture Notes in Comput. Sci., Vol. 5196, pp. 11–27, Springer, Berlin,
      2008.


                                             92