-

Binary 3-compressible automata

Alessandra Cherubini

alessandra.cherubini@polimi.it 1

Andrzej Kisielewicz

andrzej.kisielewicz@math.uni.wroc.pl 0 0 Department of Mathematics and Computer Science, University of Wroclaw 1 Politecnico di Milano , Dipartimento di Matematica

109 120

A finite deterministic automaton A = (Q, Σ, δ) is k-compressible if there is a word w ∈ Σ+ such that the image of the state set Q under the natural action of w is reduced by at least k states. In such case w is called a k-compressing word for A. It is known that, for any alphabet Σ and any k ≥ 2, there exist words that are k-compressing for each k-compressible automaton with the input alphabet Σ. Such words are called k-collapsing. It has been proved that recognizing 2collapsing words over a 2-element alphabet may be done in polynomial time, while recognizing 2-collapsing words over an alphabet of size ≥ 3 is co-NP-complete. A natural question in this context, whether recognizing 3-collapsing words over a 2-element alphabet is easy or hard, has remained open. In this paper we provide results on 3-compressible binary automata, which allow to prove that that the latter problem is co-NP-complete.

Let T (Q) be the full transformation monoid on a finite set Q. For f ∈ T (Q), we define the deficiency df (f ) of f to be the difference between the cardinalities of Q and the image Imf under f , df (f ) = |Q| − |Imf |. At the beginning of 1990’s Sauer and Stone [ 12 ] introduced the property Δk defined as follows. For a finite alphabet Σ and a positive integer k, a word w ∈ Σ+ has the property Δk for Σ if for all homomorphisms φ : Σ+ → T (Q), where Q is any finite set, df (wφ) ≥ k whenever df (vφ) ≥ k for some v ∈ Σ+. They proved the nonobvious fact that such words exist for each positive integer k and for each finite alphabet Σ giving an elegant recursive construction that produces a word whose length is O(22k ). Their construction was improved in [ 9 ], where better but yet unrealistic upper bounds for the length of shortest words with the property Δk were given.

Words with the property Δk have a natural interpretation in the language of finite automata theory. Each complete deterministic automaton A = (Q, Σ, δ) over the alphabet Σ can be viewed as the transformation monoid generated by transformations on Q induced via δ by the letters of Σ. Namely, for each α ∈ Σ we define the induced transformation by qα = δ(q, α). This action of the letters on Q extends naturally into the action of words w ∈ Σ+ on Q which is denoted briefly by qw = δ(q, α).

Conversely, to define an automaton it is enough to assign to any letter of Σ a transformation on Q. Thus an automaton A can be identified with a specific homomorphism φA of Σ+ in T (Q). If there exists a word v ∈ Σ+ such that df (vφA) ≥ k—that is, if |Q| − |Qv| ≥ k—the automaton A is called kcompressible and v is a k-compressing word for A (it k-compresses A). A word that is k-compressing for each k-compressible automaton with the input alphabet Σ is called k-collapsing. Obviously, a word has the property Δk (or witnesses for deficiency k, in the terminology of [ 9 ]) if and only if it is k-collapsing. Hence, besides the original motivations coming from combinatorics and algebra, the interest in such words comes from the fact that they can be seen as universal testers whose action on the set of states of an automaton exposes whether or not the automaton is k-compressible. The problem of the length of the shortest k-collapsing word over an alphabet Σ can be considered as a black-box version of the generalized Cˇ erny´’s conjecture stated by Pin [ 7, 8 ].

In [ 10 ] it is proved that the membership of a given word w ∈ Σ+ to the language of k-collapsing words is decidable. The decision procedure is in the class co-NP and requires linear space, which shows that the language of k-collapsing words is context-sensitive. In [ 11 ] it is shown that it is not context-free even in the very simple case of the language of 2-collapsing words over a 2-letter alphabet. Most results so far concern 2-collapsing words. In particular, 2-collapsing words were characterized in [ 2 ] and in [ 5 ]. From the first characterization, that has a group theoretical flavor, a non-deterministic polynomial algorithm to recognize whether a word w ∈ Σ+ is 2-collapsing was derived [ 3 ]. A refinement of this algorithm was used in [ 4 ] to give the list of shortest 2-collapsing words over a 3-letter alphabet. The second characterization is in terms of systems of permutation conditions and it is used in [ 6 ] to show that the membership problem of the language of 2-collapsing words over an alphabet of size ≥ 3 is a co-NP-complete problem. The algorithms for recognizing whether a word w ∈ Σ+ is 2-collapsing, derived by both the characterizations of 2-collapsing words, become polynomial algorithms when |Σ| = 2 even if w is represented in the compressed form [ 5 ]. In view of these results a question arose whether for k ≥ 3, k-collapsing words over a binary alphabet can also be recognized in polynomial time. This natural question has remained open so far. In this paper we show that the answer is negative. We prove that the problem of recognizing whether a word w ∈ {α, β}+ is 3-collapsing is co-NP-complete.

The difficulty in this case is that we have no characterization of 3-collapsing words similar to that for the case of 2-collapsing words. Yet, we have a certain classification of proper 3-compressible automata on 2 input letters [ 1 ], and we can see that the problem looks differently depending on the class. In some classes it is easy to recognize whether a word w 3-compresses all the automata in the class. There is however at least one class where this problem leads to solving a sort of a system of permutation conditions, similarly as in [ 6 ], and it seems computationally hard.

Our idea is the following. First, we reduce the problem whether a word w is 3-collapsing to the problem of recognizing those words w that 3-compress all automata in a restricted class D of 3-compressible automata. Next, we show that in this restricted class the problem is equivalent to the existence of a solution of a certain system of permutation/transformation conditions similar to those considered in [ 6 ]. Then, using the tools worked out in [ 6 ], we show that the problem of the existence of a solution of such systems is NP-hard. Since our reductions induce suitable polynomial transformations, we obtain a proof that the initial problem is co-NP-complete. In this paper, we present the first part of the plan, showing how to reduce the problem to a problem concerning systems of transformation conditions. The full proof will appear in the extended version of this paper. 2

Preliminaries We deal with binary automata over the alphabet Σ = {α, β}. The letters α and β are identified with transformations they induce on the set Q of the states. The image of q ∈ Q by a letter (transformation) α is denoted qα. The words w = α1α2 . . . αt over Σ are identified with transformations they induce. The inverse image is denoted by xα−1.

We use special notation for concrete transformations α (similar to permutation notation), where round brackets (x1x2 . . . xt) denote a cycle: x1α = x2, . . . , xtα = x1, and square brackets [x1x2 . . . xt] denote a path: x1α = x2, . . . , xt−1α = xt. This notation is not unique: for example [123][42](357) = [ 12 ][423](357). Yet, we always write full cycles, and refer to them as the cycles of α, and usually omit all the fixed points. If x is an element of a cycle in α, then we write Cycα(x) to denote the cycle containing x, or a set of elements in this cycle, and we write |Cycα(x)| to denote the length of this cycle (the number of elements). We will also use a part of the structure of a transformation to speak about transformations of a given form. For example, a transformation (permutation) is of the form β = (12y)(xa)(zb) . . . for some elements y, x, z, a, b ∈ Q if β = (12y)(xa)(zb)τ for some transformation τ . We do not exclude that some of these elements may be equal, and some of these cycles coincide. For example, we may have x = z and (consequently) a = b, in which case the cycle (xa) is the same as the cycle (zb). We may have also x = a, which means that (xa) is, in fact, a fixpoint (x) = (a). But, assuming 1 6= 2, we cannot have x = y, because y is in a cycle of length 3, while x is in a cycle of length 2 or 1. Thus saying that a transformation is of the form β = (a1a2 . . . at) . . . we assume that the length of the cycle is t or a divisor of t.

Treating words over Σ = {α, β} as compositions of transformations on the set Q with 1, 2 ∈ Q, we shall consider systems of transformation conditions of the form stating that the image of 1 by each of words u1, u2, . . . , us belongs to the set {1, 2}. If all transformations in Σ fix 1 or all fix the set {1, 2}, then they form a solution of the system (1), which is called trivial. The problem whether there exists a nontrivial solution for a system of permutation conditions has been proved to be NP-complete in [ 6 ]. Similarly one can prove that the problem is hard if we look for a solution in transformations of given types. We will exploit this in our proof.

We recall that a factor of a word w ∈ Σ+ is a word v ∈ Σ+ such that w = uvz for some u, z ∈ Σ∗. If A is k-compressible at least one letter of its input alphabet has deficiency greater than 0. It is known that each k-collapsing word over a fixed alphabet Σ is k-full [ 12 ], that is, contains each word of length k over the alphabet Σ among its factors. A k-compressible automaton is called proper k-compressible if it cannot be k-compressed by any word of length k. Thus, k-collapsing words are k-full words that k-compress all proper k-compressible automata. In particular, in our consideration we may restrict to proper k-compressible automata.

We will consider types of transformations with regard to which states are sent into the same element, and which states are missing in the image. We say that a transformation α is of type I\M , where M is a subset of Q, and I is a family of disjoint subsets of Q, if M = Q \ Qα is a set of elements missing in the image Qα, while I is the family of those inverse images of elements of Q that have more than one element. In other words, we write that a letter α is of type [x11 , . . . , xj1 ][x12 , . . . , xj2 ] . . . [x1r , . . . , xjr ]\y1, y2, ..., ym, if {y1, . . . ym} = Q \ Im(α) and {x11 , . . . , xj1 }, {x12 , . . . , xj2 }, . . . , {x1r , . . . , xjr } are the equivalence classes of the kernel of transformation induced by α that have more than one element. We say that α ∈ Σ is a permutation letter if it induces a permutation on the set Q of the states, i.e., it has deficiency 0. Otherwise, a letter is a non-permutation.

The following fact leading to a classification of proper 3-compressible automata is not difficult to prove (see e.g. [ 1 ]).

Proposition 1. If A is a proper 3-compressible automaton over the alphabet Σ = {α, β} then each letter in Σ is either a permutation or is one of the following types: 1. [x, y, z]\x, y; 2. [x, y][z, t]\x, z; 3. [x, y]\x; 4. [x, y]\z with za ∈ {x, y}. where x, y, z, t are different states of A.

Since we wish to classify proper 3-compressible automata up to renaming the states in Q, we may assume that Q = {1, 2, . . . , n}, and {x, y, z, t} = {1, 2, 3, 4}. Then, following [ 1 ], we say that an automaton A over a two-letter alphabet Σ = {α, β} is an (i., j.)-automaton, where i, j ∈ {1, 2, 3, 4}, if the letter α is of type i., while the letter β is of type j., above. We say also that A is a (i., p) (or (p, i.)-automaton, if it is an automaton in which the letter α (β) is of type i., while the other letter is a permutation. By Proposition 1, up to renaming the states, each proper 3-compressible automaton over 2 letters is a (t, s)-automaton with some t, s ∈ {1., 2., 3., 4., p}. 3

Automata of Type (3., p) Let A be a (3., p)-automaton over the alphabet Σ = {α, β} and with the state set Q = {1, 2, . . . , n}. Without loss of generality we can also assume that the letter α is of type [ 1, 2 ]\1 (and β is a permutation). Given such an automaton, we call an integer k > 0 a good exponent for the permutation β, or briefly, βgood, if 1βk ∈/ {1, 2}; otherwise, it is called β-bad exponent. We will consider transformation conditions of the form 1v ∈ {1, 2} stating the the image of 1 by the word v is 1 or 2. For a word v ∈ Σ+ and Q1 ⊆ Q, as in [ 1 ], we denote M(v) = Q \ Qv, the set of the states missing in the image of Q under the action of v, and M(Q1, v) = Q \ (Q \ Q1)v, the set of states missing in Q after applying the word v, provided that the set Q1 of states is already missing.

Our first result is the following characterization.

Theorem 1. Let A = hQ, Σ, δi be a proper 3-compressible (3., p)-automaton with Q and Σ as above, and w be a word in Σ+. Then, w does not 3-compresses A if and only if for every factor of w of the form αuα where

u = βk1 αm1 . . . βkt αmt βkt+1 , t ≥ 1, and k1 and kt+1 are β-good, while all other ki are β-bad, the condition 1u ∈ {1, 2} holds.

Proof. Let w = βsαr1 βs1 αr2 . . . βsh−1 αrh βsh , where s, sh ≥ 0, rh > 0, and ri, si > 0 for 1 ≤ i ≤ h − 1. We consider the sequence of missed states by the action of consecutive prefixes of w. Obviously M(βsαr1 ) = {1}. Moreover if s1 is a β-bad exponent then M({1}, bs1 ) = M(βsαr1 βs1 ) = {1βs1 } ⊆ {1, 2}, hence M(βsαr1 βs1 αr2 ) = {1} = M(βsαr1 ). Repeating the same argument it turns out that if there is no β-good exponent among s1, s2, . . . , sh, then w does not 3-compress α.

So we may assume that there is i with 1 ≤ i ≤ h − 1 such that si is a βgood exponent and sj is a β-bad exponent for each j < i. Then we have w = pαri βsi αri+1 βsi+1 . . ., where M(pαri ) = {1}. So M({1}, βsi ) = {1βsi } * {1, 2}, and M({1βsi }, αri+1 ) = {1, x1} where x1 = 1βsi αri+1 6= 1 (since α is of type [ 1, 2 ]\1). Now if si+1 is β-bad, then M(pαri βsi αri+1 βsi+1 ) = M({1, x1}, βsi+1 ) = {y, x1βsi+1 } with y ∈ {1, 2}, whence, whether x1βsi+1 ∈ {1, 2} or not, we have M(pαri βsi αri+1 βsi+1 αri+2 ) = M({y, x1βsi+1 }, αri+2 ) = {1, x2} where x2 = x1βsi+1 αri+2 6= 1. Then until we encounter βsm with a β-good exponent sm the missing state set of each prefix of w ending with α has cardinality 2 and contains 1; in particular, if there no such second β-good exponent, w does not 3-compress α.

So, assume that there is m with i < m ≤ h such that sm is a β-good exponent and each st with i < t < m is a β-bad exponent. Then w = pαri βsi αri+1 vαrm+1 . . ., where v = βsi+1 αri+2 . . . βsm . Now, M(αri βsi αri+1 v) = M({1, x1}, v) = M({1, xm−i}, betasm ) = {1βsm , xm−iβsm }, where xm−iβsm = 1v (since x1 = 1βsi αri+1 , x2 = x1βsi+1 αri+2 , etc.). Hence if 1v ∈/ {1, 2} then M({1βsm , 1v}, α) has cardinality 3, and w is a 3-compressing word. If 1v ∈ {1, 2} then 1vαrm+1 = 1αrm+1 , hence M(pαri βsi αri+1 vαrm+1 ) = {1βsm , 1αrm+1 }. So we are in the analogous situation as with M(pαri βsi αri+1 ), and we can use the above argument with i replaced by m, the prefix p replaced by p1 = pαri βsi . . . βsm−1 , and β-good exponent si replaced by β-good exponent sm. Thus, the reasoning may be repeated for consecutive factors αuα with the property stated in the theorem, and the claim easily follows. tu

The theorem above means that to check whether there exists a (3., p)automaton that fails to be 3-compressed by a given word w, we need to consider all the possibilities for β leading to various sets of good and bad exponents. This means that we consider whether 1 and 2 are in the same orbit of β. If so, we consider various lengths of this orbit and distances between 1 and 2, and if not, the only thing that counts is the length of the orbit of 1. For each such case we establish β-good and β-bad exponents, find all the factors of w of the form described in the theorem above, and form the corresponding system of transformation conditions. If the system has a solution, then w fails to 3-compress the corresponding automaton A with a non-permutation α and a permutation β. The solution represents an automaton that is not 3-compressed by word w. It is nontrivial, if the resulting automaton is 3-compressible (some solutions correspond to automata that fails to be 3-compressible).

In principle, there are infinitely many cases to consider. But given the word w, if k is the largest exponent for β (when w is written in the compact form with exponents), then only k − 1 elements following 1 in the cycle of 1 in β are what really counts. So, in practice, for every word, we may distinguish and consider only finitely many cases.

In fact, we may view α as (almost) a permutation without 1 (1 is not an image of any state; we need only to keep in mind that 1 has the same image as 2). The system of transformation conditions may be treated very similarly to the system of suitable permutation conditions, and the method of trees with distinguished nodes developed in [ 6 ] may be used to solve it. (This is so, in spite of that the two letters involved come in the both cases from very different considerations.) 4

Reduction to a Smaller Class of Automata To prove that solving systems of transformations in Theorem 1 is hard we need to restrict considerations to one class in which the conditions may be stated in a unique and relatively simple form. Therefore, we define the class D to contain all proper 3-compressible automata A over alphabet Σ = {α, β}, such that α is a transformation of type [ 1, 2 ]\1, and β is a permutation of the form β = (12y) . . .. For such permutation the β-good exponents are positive integers k = 2 modulo 3. We will refer to them as to (12y)-good. Other positive integers are (12y)-bad.

In order to reduce our problem to automata in D, to each word w ∈ {α, β}+ we assign a word w∗ = w0w such that w∗ is 3-collapsing if and only if w 3compresses all automata in D. The idea is to find a set of words that 3-compress all automata not in D, and do not 3-compress automata in D. These words will be used to form w0. We wish that these words have no (12y)-good exponents, because then w∗ and w have the same factors described in Theorem 1, which guarantees that they 3-compress the same proper automata in D. This cannot be achieved exactly, and the few exceptions we allow will be handled further in a different way.

We start from (3., p)-automata not in D. The following technical lemma established in [ 1 ] will be useful here.

Lemma 1. ([ 1 ], Lemma 3) Let A be a (3., p)-automaton with α of type [ 1, 2 ]\1. Then A is not 3-compressible if, and only if (up to renaming the states) one of the following conditions holds: (i) β fixes 1 or the set {1, 2}, (ii) β = (13)π for some permutation π on Q \ {1, 3} and 3α = 3, (iii) β = (13)(2)π, β = (132)π or β = (123)π for some permutation π on

Q \ {1, 2, 3} and {2, 3}α = {2, 3}, (iv) β = (13)(24)π or β = (1324)π for some permutation π on Q \ {1, 2, 3, 4} and {3, 4}α = {3, 4}.

In all other cases A is a proper 3-compressible automaton, and one of the words {ababa, abab2a, aba2ba, abab2aba, ab2ab2a, ab2a2b2a, ab2abab2a, ab2aba, ab3aba, abab3a, ab3ab3a} 3-compresses A.

Using it we get the following.

Lemma 2. Each proper 3-compressible (3., p)-automaton A ∈/ D can be 3compressed by a word of the form αβiαkβj α, where i, j ∈ {1, 3, 4}, k ∈ {1, 2, 3} or αβ4αβ2αβ4α.

Proof. Let A be a (3., p)-automaton A ∈/ D, then by (i) of Lemma 1, b fixes neither 1 nor the set {1, 2}.

First, assume that 1β = x ∈/ {1, 2}. Then, M(αβαk) = {1, xαk}. If xαkβ ∈/ {1, 2} for some k then the word αβαkβα 3-compresses the automaton A. Obviously, if xαkβ ∈/ {1, 2}, for some k we can always assume that k ≤ 3. Otherwise, either xα = x or xα2 = x and xαkβ ∈ {1, 2} for all k.

Then let xαkβ ∈ {1, 2} for each k ∈ {1, 2, 3}.

First assume xα = x. If xαβ = xβ = 1 then β = (1x) . . . and by (ii), A is not 3-compressible. So xαβ = xβ = 2 and β = (1x2 . . .) . . .. If β = (1x2) . . . then 2α 6= 2 otherwise A is not 3-compressible by (iii). Then M(αβ1+3hαβ2) = {1, 2} for each h ≥ 0. Since M({1, 2}, α) = {1, 2α} and 2α ∈/ {1, 2, x}, then αβ1+3hαβ2αβ1+3kα 3-compresses A. So let β = (1x2y . . .) . . .. Then M(αβ3α) = {1, yα} and if yαβ ∈/ {1, 2} then αβ3αβα 3-compresses A. We know that yαβ 6= xαβ = 2, so let yαβ = 1. If yα = y then A is not 3-compressible by (iv). If yα 6= y then αβ3αβ4α 3-compresses A.

Then let xα 6= x and xα2 = x. Put xα = y, then yα = x and yαβ = xα2β = xβ ∈ {1, 2}. If xβ = 1 then β = (1x)(y2z . . .) . . . where y, z, 2 are all distinct by (iii) and (iv). Hence αβαβ3α 3-compresses A. So let xβ 6= 1, hence xβ = 2, and again by (iv), β = (1x2z . . . y) . . . where 1, x, 2, z, y are distinct states, then αβαβ4α is a 3-compressing A. This completes all the cases when 1β = x 6= 1, 2.

Lastly, let 1β = 2. Then β = (12xy . . .) . . . where 1, 2, x, y are distinct elements, otherwise A ∈ D. Then M(αβ3α) = {1, yα}. If yαβ3 ∈/ {1, 2} then αβ3αβ3α is 3-compressing, similarly if yαβ4 ∈/ {1, 2} then αβ3αβ4α is 3-compressing. So yαβ3 = 1 and yαβ4 = 2 whence y 6= yα and so yα 6= yα2, and so yα2β3 6= 1. If yα3β3 6= 2 then αβ3α2β3α is 3-compressing, if yα3β3 = 2, then yα3β4 = x and αβ3α2β4α is 3-compressing. tu

For the automata of types other than (3., p) we make use of Proposition 1 and other lemmas established in [ 1 ]. These lemmas have been established in [ 1 ] to compute a bound for the length of the shortest 3-collapsing word. We extract from them information we need to our aim. We have the following. (i) no 3-compressible (i., j.)-automaton with i ∈ {1, 2}, j ∈ {1, 2, 4} or with j ∈ {1, 2}, i ∈ {1, 2, 4} is proper; [1, Lemma 5]; (ii) each proper 3-compressible (1., p)-automaton and each proper 3-compressible (1., 3)-automaton is 3-compressed by αβ2α; hence (by switching the letters), each proper 3-compressible (p., 1) or (3., 1)-automaton is 3 compressed by βα2β; [1, Lemma 1 and Lemma 6]; (iii) each proper 3-compressible (2., p) and each proper 3-compressible (2., 3)automaton is 3-compressed by a word in the set {αβ2α, αβ3α}; each proper 3-compressible (p., 2) and each proper 3-compressible (3., 2)-automaton is 3-compressed by a word in the set {βα2β, βα3β}; [1, Lemma 2 and Lemma 7]; For the future reference we underline words that have an occurrence of β with a (12y)-good exponent inside the word. Here, this is limited only to occurrences of β2. These words will be handled in a different way than those words that have no occurrence β with a (12y)-good exponent or they have such an occurrence only at the beginning or at the end of the word. We will see that only the exact form of underlined words is what really counts in our proof.

In order to find a set of words 3-compressing all (p, 3.)-automata we use again Lemma 1, yet switching the letters α and β. It yields the following set: {βαβαβ, βαβα2β, βαβ2αβ, βαβα2βαβ, βα2βα2β, βα2β2α2β, βα2βαβα2β, βα2βαβ, βα3βαβ, βαβα3β, βα3βα3β}. Some of these words are factors of others, so we may infer the following: (iv) each proper 3-compressible (p, 3.)-automaton is 3-compressed by any word with a factor of the form βαβαβ, βαβ2αβ, βαβα2βαβ, βα2βα2β, βα2β2α2β, βα2βαβα2β, βα3βαβ,βαβα3β, βα3βα3β; [1, Lemma 3];

Similarly we get the following

(v) each proper 3-compressible (4., p) is 3-compressed by any word with a factor of the form αβαβα, α2β2α2, α2βα2, α2β3α, αβ3αβ3α, α2βαβ2α; each proper 3-compressible (p., 4) is 3-compressed by any word with a factor of the form βαβαβ, β2α2β2, β2αβ2, β2α3β, βα3βα3β, β2αβα2β; [1, Lemma 4]; (vi) each proper 3-compressible (3., 3)-automaton is 3-compressed by a word in the set {αβαβ, αβ2αβ, αβα2β, αβ2α2β, βαβα, βα2βα, βαβ2α, βα2β2α}; [1, Lemma 9]; (vii) each proper 3-compressible (3., 4)-automaton is 3-compressed by a word in the set {β2αβ2, β2α2β2, β2α3β2, β2αβαβ2}; each proper 3-compressible (4., 3)-automaton is 3-compressed by a word in the set {α2βα2, α2β2α2, α2β3α2, α2βαβα2}; [1, Lemma 10]; (viii) each proper 3-compressible (4., 4) is 3-compressed by a word in the set {α2βα2, α2β2, β2αβ2, β2α2}; [1, Lemma 11].

Using the lemmas above we can see that any word containing as factors the following words (I) α2βαβ2α, βα2β2α2β, αβ4αβ2αβ4α, (II) αβ3αβ3α, βα3βα3β, β2αβα2β, βαβα2βαβ, βα2βα2β, βα2βαβα2β, βα3βαβ, βαβα3β, βα3βα3β, β2α2β2, β2α3β2, β2αβαβ2, α2β3α2. (III) αβiαkβj α, where i, j ∈ {1, 3, 4}, k ∈ {1, 2, 3}.

3-compresses all 3-compressible automata except those in the D.

To form a single word that 3-compresses all automata not in D it is enough to concatenate all the words listed in (I-III) above. Yet, we wish to have such a word without (12y)-good exponents. So, at this moment, we have only a partial solution. Let wD be the word obtained from concatenation of words in (II) and (III), in arbitrary order, adding the suffix α2βα at the beginning, and replacing all β2 by β3 (more precisely replacing all factors αβ2α by αβ3α; note that the factors αβ2α in the word obtained from concatenation of words in (II) and (III) can only come from the concatenation of words in (II-III) ending and starting with β). Then obviously, wD has all words in (II-III) as factors and has no occurrences of (12y)-good exponents. The fact that it has the suffix α2βα will be used later in the proof. We observe that wD is also 3-full, since all words of length 3 appear as factors in words listed in (II). This means we may state the following.

Proposition 2. Let w ∈ {α, β}∗ be a word such that w∗ = wDw has as factors all the three words listed in (I) above. Then, w 3-compresses all proper automata in D if and only if the word w∗ = wDw described above is 3-collapsing. 5

Reduction to a System of Transformation Conditions Now, our aim is to express the problem of 3-compressibility of proper automata in D in terms of solving a system of transformation conditions. The base for this is Theorem 1. Our attention is restricted to transformations α and β satisfying the following conditions. (C1) α is a transformation of type [ 1, 2 ]\1; (C2) β is a permutation of the form β = (12y) . . . for some y ∈/ {1, 2}; (C3) either 2α or yα is not in {2, y}.

Given a word w ∈ {α, β}+, by u1, u2, . . . , us we denote the set of all factors of w such that αu1α, αu2α, . . . , αusα are all the factors of w defined in Theorem 1 for the permutation β = (12y) . . .. Then we have the following.

Proposition 3. Let w ∈ {α, β}+, and let u1, u2, . . . , us be the factors of w described above. Then, there exists a proper automaton A ∈ D such that w does not 3-compresses A if and only if the system has a solution in transformations α, β on a finite set Q = {1, 2, . . . , n} satisfying the conditions (C1-C3).

Proof. First, suppose that a required solution exists, and let A be an automaton with the state set Q = {1, 2, . . . , n} and two input letters whose transition function is defined by the action of the letters α and β given by the solution. Then conditions (C1) and (C2) mean simply that A ∈ D, provided it is proper 3-compressible. It is 3-compressible by Lemma 1, item (iii). Finally, it is not difficult to see that any word 3-compressing A has a length exceeding 4. Indeed, we need first a letter α to get 1 missing in the image, then a factor β2, to get y missing in the image, and the next α, to get two states missing. Thus A ∈ D and, by Theorem 1, w does not 3-compress A.

Conversely, if A ∈ D and w does not 3-compresses it, then we may assume that its set of the states is Q = {1, 2, . . . , n}. Then, the transformations α and β corresponding to the letters of A satisfy, by definition of D, the condition (C1) and (C2), and as above, by Lemma 1, they satisfy also condition (C3). Thus, by Theorem 1, they form a required solution of the system (1). tu

In such a way the problem of 3-compressibility of automata in D is reduced to solving a system of transformation conditions 1u ∈ {1, 2} with all u of the form u = βkαu0αβ`, where k, ` are (12y)-good exponents, and u0 has no occurrence of (12y)-bad exponents. We are going to show that solving a certain subclass of such systems is computationally hard.

Given words v1, . . . , vs ∈ {α, β}+, we add to them two further words v0 = α2 and vs+1 = αβ4α, and consider the following system of transformation conditions.

1β2v0β2, 1β2v1β2, . . . , 1β2vs+1β2 ∈ {1, 2} (2)

We define a specific decision problem: PROBLEM (*)

Instance: words v1, . . . , vs ∈ {α, β}+ such that each word vi starts and ends with α, and has no occurrence of β with a (12y)-good exponent; Question: Is there a solution (α, β) of the system (2) satisfying conditions (C1C3)?

We have the following.

Theorem 2. Problem (*) formulated above is NP-complete.

The proof ot this theorem uses tools worked out in [ 6 ], where solving systems of such conditions is expressed in terms of coloring trees with distinguished nodes. It will be given in the extended version of the paper. Having this theorem we can can easily prove our main result.

Theorem 3. The problem whether a given word w ∈ {α, β}∗ is 3-collapsing is co-NP-complete.

Proof. First observe that the problem belongs to co-NP class. Indeed, to see that a word w is not 3-collapsing a nondeterministic algorithm needs only to guess the smallest automaton that is not 3-compressed by w. By [10, Theorem 1] such an automaton has not more than 4|w| + 2 states, and the facts that it is 3-compressible and that w does not 3-compress it can be checked easily in polynomial time with respect to |w|.

We transform problem (*) to our problem. Let v1, . . . , vs be an instance of (*). First we form the word

w0 = β2v0β2v0β2v1β2v2β2 . . . β2vsβ2vs+1β2vs+1β2.

Note that this word has doubled occurrences of factors v0 and vs+1. By assumption, the only occurrences of β in w0 with (12y)-good exponents are β2 separating factors v0, v1, v2, . . . , vs, vs+1. Thus, by Proposition 3, system (2) has a solution satisfying conditions (C1-C3) if and only if there exists a proper automaton A ∈ D such that w does not 3-compress A.

Now, we observe that w = wDw0, defined as in Proposition 2, has as factors all the three words listed in (I). Indeed, α2βαβ2α is a factor of w since wD, by definition, has the suffix α2βα, and w0 starts from β2α. The prefix β2v0β2v0β2 = β2α2β2α2β2 of w0 has the second word in (I) as a factor. Finally, the suffix vs+1β2vs+1β2 = αβ4αβ2αβ4αβ2 of w0 has the third word in (I) as a factor. Therefore, by Proposition 2, there exists a proper automaton A ∈ D such that w does not 3-compresses A if and only if the word w∗ = wDw is not 3-collapsing. Obviously, this transformation may be performed in polynomial time, which completes the proof. tu

Frigeri ,

Cherubini and

Liu . Composing short 3-compressing words on a 2 letter alphabet . to appear, (arxiv.org: 1406 .1413v1, 2014 ).

D. S.

Ananichev ,

Cherubini , and

M. V.

Volkov . Image reducing words and subgroups of free groups . Theor. Comput. Sci. , 307 ( 1 ): 77 - 92 , 2003 .

D. S.

Ananichev ,

Cherubini , and

M. V.

Volkov . An inverse automata algorithm for recognizing 2-collapsing words . In Developments in Language Theory , volume 2450 of LNCS , pages 270 - 282 , 2003 .

D. S.

Ananichev and

I. V.

Petrov . Quest for short synchronizing words and short collapsing words . In WORDS. Proc. 4th Int. Conf. , pages 411 - 418 , 2003 .

Cherubini ,

Gawrychowski ,

Kisielewicz , and

Piochi . A combinatorial approach to collapsing words . In MFCS , pages 256 - 266 , 2006 .

Cherubini and

Kisielewicz . Collapsing words, permutation conditions and coherent colorings of trees . Theor. Comput. Sci. , 410 ( 21 -23): 2135 - 2147 , 2009 .

J. E.

Pin . Le probl`eme de la synchronization. contribution a` l'´etudia de la conjecture de Cˇerny´. Th`ese 3e cycle , Paris, 1978 .

J. E.

Pin . Sur le mots synchronisants dans un automata fini . Elektron. Informationverarbeigtung und Kybernetik , 14 : 283 - 289 , 1978 .

S. W.

Margolis ,

J.-E.

Pin , and

M. V.

Volkov . Words guaranteeing minimum image . Internat. J. Foundations Comp. Sci. , 15 : 259 - 276 , 2004 .

10.

I. V.

Petrov . An algorithm for recognition of n-collapsing words . Theoret. Comput. Sci. , 391 ( 1-2 ): 99 - 108 , 2008 .

11.

E. V.

Pribavkina . On some properties of the language of 2-collapsing words . In Developments in Language Theory , volume 3572 of LNCS , pages 374 - 384 , 2005 .

12.

Sauer and

M. G.

Stone . Composing functions to reduce image size . Ars Combinatoria , 1 : 171 - 176 , 1991 .