String Assembling Systems and Watson-Crick Finite Automata∗ András Murvai, György Vaszil Faculty of Informatics, University of Debrecen, Kassai út 26, 4028 Debrecen, Hungary halimifoliumlycium@gmail.com vaszil.gyorgy@inf.unideb.hu Abstract: We explore the relationship of Watson-Crick au- The other model interesting for our investigations is tomata and string assembling systems. In the general case, called Watson-Crick finite automaton, and it was intro- Watson-Crick automata are more powerful, so restricting duced in [2]. Such automata are similar to ordinary finite the study to the stateless variant is of interest. We show automata in the sense that they read strings written on their that the class of languages of stateless Watson-Crick au- tape, and either accept or reject them, thus defining a for- tomata are strictly included in the class of languages of mal language as the set of strings that are accepted. Sim- automata with at least two states, then compare string as- ilarly to sticker systems, Watson-Crick automata work on sembling systems with the stateless variant. It turns out, double stranded strings, thus, they have a double stranded that there are languages that can be generated by string tape that contains two complementary strings which are assembling systems, but not accepted by stateless Watson- read by two separate reading heads being able to move in- Crick automata, but the question concerning the exact re- dependently of each other, one on the upper and one on the lationship of the language classes remains open. lower strand of the tape. Concerning their computational power, these types of automata can describe more compli- cated language classes than ordinary finite automata. As 1 Introduction we will see examples of this later, even some non-context- free context-sensitive languages can be accepted. The double stranded structure of DNA motivated the in- troduction and study of double stranded strings with the tools and techniques of formal language theory, that is, the 2 Preliminaries definition of string operations which model the biochem- ical processes that the corresponding strings can undergo, For a finite alphabet of symbols V , let V ∗ denote the set see the monograph [7] for more details. Two such models, of all strings over V , and let V + = V ∗ \ {λ } where λ related to the topic of this paper are sticker systems and denotes the empty string. We consider two languages Watson-Crick automata. L1 , L2 ⊆ V ∗ equal if they differ in at most the empty string, Sticker systems were introduced in [3], they use dou- L1 \ {λ } = L2 \ {λ }. The length of a word w ∈ V ∗ is de- ble stranded string “pieces” as building blocks to assemble noted by |w|, the number of occurrences of a symbol x ∈ V longer double stranded strings when their single stranded in w is denoted by |w|x . ends stick together and form double strands according to A double stranded string over V is a pair  of strings the Watson-Crick complementarity relation. Such systems w1 (w1 , w2 ) ∈ V ∗ ×V ∗ , it can also be written as , while describe formal languages, sets of double stranded strings  ∗ w2 which can be “stick” together starting from a set of initial V the set of pairs V ∗ ×V ∗ can be written as . pieces. Depending on how we specify the details of the V∗ functioning of the system, simple languages (like regular We  denote  the  lastletters and the   first letters of a string languages which can be described by finite automata), or u u u pair as end( ) and bgn( ): more complicated languages (like recursively enumerable v v v languages which can be described by Turing machines) can be generated by sticker systems. As we will see later, 1. If u = u0 x,  v0 = 0 for some x, y ∈ V, u0 , v0 ∈ V ∗ ,  vy  ux x string assembling systems (one of the computing models then end( 0 ) = . Similarly, the first letters studied in this paper) are similar to this, from a certain v y  y u point of view they can be thought of as special cases of of the pair for u = xu0 and v = yv0 are denoted sticker systems.  0 v   xu x ∗ Research supported by the construction EFOP-3.6.3-VEKOP-16- as bgn( )= where x, y ∈ V, u0 , v0 ∈ V ∗ . yv0 y 2017-00002 supported by the European Union, co-financed by the Eu- ropean Social Fund, and by grant K 120558 of the National Research, 2. Otherwise,if  one of the  strings is empty, we u0 x  Development and Innovation Office of Hungary (NKFIH), financed un- λ λ x der the K 16 funding scheme. have end( 0 ) = , end( ) = , Copyright c 2021 for this paper by its authors. Use permitted un-   v y  y  0 λ  λ der Creative Commons License Attribution 4.0 International (CC BY λ λ xu x bgn( )= , and bgn( )= for 4.0). yv0 y λ λ   x, y ∈ V, u0 , v0 ∈ V ∗ . if and only if, q0 ∈ δ (q, u2 ). We may also write v2 The complementarity relation is a symmetric relation     on the letters of the alphabet ρ ⊆ V ×V . We call the set of u1 u1 u2 ⇒ sequences of pairs of complementary symbols a Watson- v1 v1 v2 Crick domain and denote it with W Kρ (V ), more formally, if we are not interested in the states of the automaton and   x in the rest of the input to be read. W Kρ (V ) = { | x, y ∈ V, (x, y) ∈ ρ}∗ . If the reflexive and transitive closure of ⇒ is denoted by y ⇒∗ , then the language accepted by M is      x1 x2 xn ∈ W Kρ (V ) can also be     The string pair ... w w   1 y y2 yn L(M) = {w ∈ V ∗ | q0 ⇒∗ qf , w w w written as 0 where w = x1 x2 . . . xn , w0 = y1 y2 . . . yn . We   w w where ∈ W Kρ (V ), q f ∈ F}. also denote the complement of x ∈ V by x, that is, (x, x) ∈ w ρ, and the complement of xis x,so x =x, and  (x, x) ∈ ρ. w1 w1 w1 2.2 String assembling systems The difference between and is that w2 w2 w2 only gives a different notation for (w , 1 2w ), while in the Now we recall the definition of string assembling systems from [5]. These also work with double stranded strings,   w1 case of ∈ W Kρ (V ), |w1 | = |w2 | and w2 is the com- but in their case, the complementarity relation is the iden- w2 plement of w1 , w2 = w1 (also w2 = w1 ). tity relation. Similarly to sticker systems mentioned in the introduction, they use a set of double stranded string pieces as building blocks to generate a language. They are able 2.1 Watson-Crick automata to “glue” two double stranded strings together if the last letters of the first string and the first letters of the second Now we recall the definition of Watson-Crick automata string coincide, and these letters overlap in the resulting from [2], see also the monograph [7] for more details. string. The construct M = hV, ρ, Q, q0 , F, δ i is a Watson-Crick The 4-tuple S = hΣ, A, T, Ei is a string assembling sys- automaton (WK automaton in short), where tem (SAS in short), where • V is the input alphabet, • Σ is a finite alphabet, • ρ ⊆ V ×V is the complementarity relation, • A + +  ⊂Σ ×Σ is the finite set of axioms of the form uv u • Q is the nonempty finite set of states, or with u ∈ Σ+ , v ∈ Σ∗ , u uv • q0 ∈ Q is the initial state, • T ⊂ Σ+ × Σ+ is the finite set of assembly units, • F ⊆ Q is the set of accepting states, and • E ⊂ Σ+ ×Σ+is the  finite  set of ending assembly units uv v  ∗ V of the form or with u ∈ Σ∗ , v ∈ Σ+ . • δ : Q× → 2Q is a finite relation, the state tran- v uv V∗ sition relation. A derivation step of the SAS S as above is denoted by ⇒,  and  defined as  follows.  We say that we added the unit xu2 u1  A configuration of sucha WK  automaton is denoted as to the unit , denoted as V∗    w1 w3 w1 yv2 v1 q where ∈ is the part of the in- w2 w4 w2 V∗     u1 u1 u2 put which is already   read,  ∗q∈ Q is the state of the WK au- ⇒ , w3 V v1 v1 v2 tomaton, and ∈ is the part of the input which w4 V∗     is not read yet. x u1 if and only if, = end( ) for some x, y ∈ Σ, and The transition between two configurations is denoted by   y v1 ⇒, and defined as follows. xu2      ∗   ∈ T ∪ E. Moreover, if |u1 u2 | ≥ |v1 v2 |, then u1 u2 u1 u2 u3 V u1 u2 u3 yv2 Let , , ∈ , ∈ can be written as u1 u2 = v1 v2 u0 for some u0 ∈ Σ∗ , or the v1 v2 v3 V∗ v1 v2 v3 0 W Kρ (V ), q, q ∈ Q. Then other way around, if |u1 u2 | ≤ |v1 v2 |, then u1 u2 v0 = v1 v2 for some v0 ∈ Σ∗ .         u1 u2 u3 u1 u2 0 u3 A sequence of derivation steps is a derivation. A deriva- q ⇒ q tion is successful, if after choosing an axiom from A, we v1 v2 v3 v1 v2 v3 add assembling units from T , then close the derivation by and L(M) = L(M 0 ). Based on this result, from now on we adding a unit from E, in such a way that the upper and assume that ρ = ρid , the identity relation. lower string of the string pair which is produced is identi- We can also find similarities in the functioning of the cal. two models. Adding building units to the generated string If the reflexive and transitive closure of ⇒ is denoted by by a SAS from A, T , or E, corresponds to reading sub- ⇒∗ , then the language generated by S is strings of the input by a WK automaton with a transi-     tion from the initial state, a transition between arbitrary +u ∗ w states, and a transition leading to and accepting state, re- L(S) = {w ∈ Σ | ⇒ is a v w spectively. While the states of WK automata are explicitly successful derivation}. given, “states” of a SAS are implicit, they are “hidden” in the overlapping pairs of letters of the building units. Now we introduce some additional notation that will be useful later. The string pair generated in k deriva- 3.1 Known results on WK automata and SAS tion steps, or in the case of WK automata,the  string pair u(k) As we have seen, SAS can basically add three types of read in k transition steps, is denoted by , that is, units to the generated strings, while WK automata can v(k)   u0  u0 u1 . . . uk−1 uk    u   u(0) have an arbitrary number of states, so it seems to be rea- ⇒∗ = (k) . The pair sonable to review results related to the number of states of v0 v0 v1 . . . vk−1 vk v(k) v(0) denotes an axiom, or in the case of WK automata, the WK automata. In [6] the authors show that from the point of view of state complexity, WK automata are more ef-   λ pair , representing the fact that nothing has been read ficient than ordinary finite automata, there is a sequence λ from the input yet. Lk , k ≥ 1 of regular languages, such that the number Let ∆k denote the difference of the length of upper and of states of the automata sequence accepting Lk is un- bounded, while all languages of the sequence can be ac-   u lower element of the string pair added in the kth cepted with WK automata having a fixed number of states. v derivation step, or read in the kth state transition, that Building on these results, in [1] it is shown that arbitrary is,∆k = |u| − |v|, and let ∆(k) denote the difference of the finite languages can be accepted with WK automata hav- generated strings or the strings read on the upper and lower ing two states, and that arbitrary unary regular languages pert of the inputtape in k derivation or state transition can be accepted by WK automata having three states. u Concerning the power of SAS, already in the introduc- steps, that is, for (k) , let ∆(k) = |u(k) | − |v(k) |. v(k) tory paper [5], the authors show that there are unary regu- Let us also denote the class of languages accepted or lar languages that cannot be generated by SAS, but all fi- generated by a computational model X by L (X), so let nite languages can be. From our point of view, especially L (SAS) and L (W K) denote the classes of languages gen- interesting is the result concerning the relationship of lan- erated by SAS and accepted by WK automata, respec- guages accepted by stateless two head finite automata (that tively. is, two head automata with one state) and languages gen- erated by SAS. While the languages accepted by stateless WK automata strictly include the languages accepted by 3 The computational power of stateless two-head automata, Theorem 1 which we are go- Watson-Crick automata and string ing to prove in the next section, can be considered as the assembling systems extension of this result. In the following, we would like to point out some of the 3.2 Comparing the power of WK automata and SAS similarities of WK automata and SAS, and then investigate whether these similarities help us to establish a relation- Let us start by establishing the relationship between lan- ship between their computational power. guages of (unrestricted) WK automata and SAS. Consider the WK automata M = hV, ρ, Q, q0 , F, δ i and Proposition 1. the SAS S = hΣ, A, T, Ei. They both work with double stranded strings, and there is a relation between the upper L (SAS) ⊂ L (W K). and lower strings in both cases. In the case of SAS, this re- lation is the identity relation, in WK automata, on the other Proof. According to Theorem 3.1 of [5], languages gener- hand, the relation can be an arbitrary symmetric relation. ated by SAS can be accepted by nondeterministic one-way This might seem to be an important difference, but it is two-head finite automata, and since one-way two-head fi- not. It is known from [4], that the computational power nite automata can be simulated by WK automata, it is clear of WK automata is not influenced by the complementar- that L (SAS) ⊆ L (W K). ity relation. For any WK automata M, we can construct On the other hand, L (W K) 6= L (SAS), since according an M 0 , such that it uses the relation ρ = {(x, x) | x ∈ V }, Theorem 3.6 in [5] the language L = {a} ∪ {a2n | n ≥ 2} cannot be generated by any SAS, but can be accepted by strings simultaneously. All string pairs corresponding to the following WK automaton. some unit in T have an appropriate transition in δ , so S Let M = h{a}, ρid , {q0 , q f , q f 0 }, q0 , {q f , q f 0 }, δ i where can finish the string generation by adding some unit from     T , which must contain all units that can end the generation, a aaaa δ (q0 , ) = qf , δ (q0 , ) = qf0, so a aaaa   $ aa E = T ∪{ }. δ (q f 0 , ) = qf0. $ aa Now we show that any $w ∈ L(S) can be accepted by M. For all $w, we have a derivation As we have seen, in the general case, the computational power of WK automata is greater than the power of SAS.      $ $u1 $u1 u2   $u1 u2 . . . ut  Now we continue by restricting the power of WK automata ⇒ ⇒ ⇒ ... ⇒ , $ $v1 $v1 v2 $v1 v2 . . . vt by considering its variants with reduced number of states. First, we look at the relationship between stateless WK   xui automata and SAS. where $w = $u1 u2 . . . ut = $v1 v2 . . . vt , ∈ T ∪E, 1 ≤ yvi In the following, we denote the classes of WK automata i ≤ t. Since we created the building units in such a way with m states by W K|Q|=m and by L (W K|Q|=m ) the class that their initial pairs of languages they accept.  of letters contain  all possible com- x u(i−1) binations, even if = end( ) holds, the ability Theorem 1. Let M be a nondeterministic stateless WK y v(i−1)     automaton. Then there exists an SAS S generating the xui ui to add unit depends only on the string pair . language accepted by M with an initial marker, that is,  yvi vi $L(M) = L(S) where $ 6∈ V . ui As δ (q, ) = q, and M is stateless, only the read string Proof. Consider M = hV, ρid , {q}, q, {q}, δ i, we construct  vi ui the SAS S = hΣ, A, T, Ei as follows. pair determines the success of the reading a word. vi • Σ = V ∪ {$} ($ 6∈ V ), The derivation above (besides the leading $ sign) is a suc-   cessful reading of the word w $ • A={ }, $    u1 u1 u2   u1 u2 . . . ut  ⇒ ⇒ ... ⇒ ,   xu   x v1 v1 v2 v1 v2 . . . vt • T ={ | for all ∈ Σ × Σ pairs of letters and yv   y thus w ∈ L(M). u transitions δ (q, ) = q}, v   Although formally the above theorem does not say $ anything about the containment relationship betweem • E = T ∪{ }. $ L (SAS) and L (W K|Q|=1 ), it still establishes some kind of connection between the two languages classes. Moreover, First we show that any w ∈ L(M) can be generated by S. our statement is “stronger” then the similar statement in   accepts M  the  empty word, so S must generate $, thus [5] which states that for all languages L accepted by state- $ $ ∈ A, ∈ E. less two-head finite automata, a corresponding SAS gen- $ $ erating L0 can be constructed in such a way that all w ∈ L Let su and sl denote the already read part of the upper corresponds to w0 ∈ L0 with h(w0 ) = w for a homomor- and lower strands  ofthe input   of M.  For  all possible pairs $su x u phism deleting four symbols from w0 (and not just one, as of letters end( )= , if can be read by M, in our case). $sl  y v xu Let us now relax the requirement of statelessness, and there has to be a unit which can be added by S. We consider WK automata with two states yv have added all possible building units for the reading of  u Lemma 1. There is a WK automaton with two states to S, so if a string pair can be read by M, then the which accepts the language L = {w | w ∈ {a, b}∗ , |w|a = v corresponding building unit can be added by S. 2(1 + 2i), i ≥ 0}. M is always in an accepting state, so the only require- ment for a string to be accepted is the property that it can Proof. The language can be accepted by the following be WK automaton.  read  completely. In the last reading step, reading some u Let M = h{a, b}, ρid , {q0 , q f }, q0 , {q f }, δ i with δ : , the two heads reach the end of the upper and lower v     a a head reaches the end of the word after 1 + 2k, that is, after δ (q0 , ) = q0 , δ (q f , ) = qf , λ   λ   an odd number of reading steps. Therefore, M enters q f b b when started in q0 , so w is accepted. δ (q0 , ) = q0 , δ (q f , ) = qf , x x λ λ Corollary 1. The language L = {w | w ∈ {a, b}∗ , |w|a = δ (q0 , ) = qf , δ (q f , ) = q0 , (1 + 2i)k, i ≥ 0} can be accepted for any k ≥ 2 by a WK xy xy automaton similar to the above, with where x, y ∈ {a, b}.     We first show that L(M) ⊆ L. For any w ∈ L(M) we a a δ (q0 , ) = q0 , δ (q f , ) = qf , have a transition sequence λ   λ   b b           δ (q0 , ) = q0 , δ (q f , ) = qf , w u0 w1 u0 u1 0 w2 x  x q0 ⇒ q 0 ⇒ q ⇒ ...    w v0 w1 v0 v1 w0 δ (q0 , λ ) = q f , δ (q f , λ ) = q0 ,  2  x1 x2 ...xk x1 x2 ...xk u0 u1 ...ut ... ⇒ qf v0 v1 ...vt where x, x1 , x2 , ..., xk ∈ {a, b}. where in the first and last configuration M is in states q0 Thus, L = {w | w ∈ {a, b}∗ , |w|a = 2(1 + 2i), i ≥ 0} can and q f , respectively, and in  oneof the two states  in  the be accepted by a WK automaton with two states, but it ui a b λ cannot be accepted by stateless WK automata, as we will intermediate configurations, ∈{ , , | vi λ x xy show in the following. x, y ∈ {a, b}}, 0 ≤ i ≤ t, correspond to the  reading steps. Lemma 2. If M is a stateless WK automaton, then L(M) =    ui b In any case, by reading = , ∆i = 0, and M L(M)∗ . vi x    ui a remains in the same state, by reading = , ∆i = Proof. Let M = hV, ρid , {q}, q, {q}, δ i be a stateless WK vi λ  automaton. It is clear that L(M) ⊆ L(M)∗ . To show the ui ∞ 1 and M remains in the same state, by reading = converse inclusion, consider L(M)∗ = S L(M)i , and a   vi i=0 λ , ∆i = −2, and M changes to the other state. word w ∈ L(M)∗ . There are three possible cases. xy It is clear that ∆(i) can only increase after a reading step 1. If w ∈ L(M)0 , then w = λ ∈ L(M),     a λ if was read, and can only decrease if was read. 2. if w ∈ L(M)1 , then w ∈ L(M), λ   xy Therefore, in order for ∆(i) = 0, some λ need to be 3. if w ∈ L(M)k , k ≥ 2, then w can be written as w = xy   w1 w2 ...wk , where wi ∈ L(M), 1 ≤ i ≤ k. a read, together with reading twice as many . Thus, if The  first λ   two cases are clear. In the third case, ∆(i) = 0, the already read part of the upper string contains w1 w2 ...wk q is a possible configuration, since w1 ∈ L, an even number of as. w1 w2 ...wk If w is accepted, then ∆(i) = 0 and M must be in state q f . so the two reading heads  can be positioned   after  w1 . w1 ...wi−1 wi ...wk In order for Mto be  in state q f , an odd number (1 + 2k, k ≥ If a configuration q can occur, λ   w1 ...wi−1 wi ...wk 0) of reading had to happen, which means that 2(1+   xy w1 ...wi wi+1 ...wk then also q is a possible configura-   a w ...w  1 i wi+1 ...wk 2k) number of reading had to happen. Thus, M can wi λ tion, since can be read (as wi ∈ L). Therefore, w can only accept words containing 2(1 + 2k) number of as. wi be read in such a way that both reading heads reach the Now we show  that M can  accept any w ∈ L. First it end of the tape, and since there is only one state, M ac- a b needs to read and (choosing the appropriate x) cepts w. λ x until it reaches the end of the upper string. We are in state Using the above lemma, we can show the following. q0 , since we have started in q0 and remained in q0 all the way. The reading head of the lower string is behind the Corollary 2. L = {w | w ∈ {a, b}∗ , |w|a = 2 + 4i, i ≥ 0} upper head by the number of as, by 2(1 + 2k). In  order  cannot be accepted by any stateless WK automaton. λ to catch up with the upper head, we need to read a xy Proof. If L = L(M) for a stateless WK automaton M, then number of times. At each such reading step, M changes L = L∗ , according to the lemma above. Let w ∈ L be such, its state. Since the number if as in w is 2(1 + 2k) and we that |w|a = 2. Then w2 ∈ L∗ , but w2 ∈ / L, so L 6= L∗ , there- read two letters from the lower string, the lower reading fore L 6= L(M), which is a contradiction. Now we show that the language that we found not to be Using that a ≡ b (mod m) if and only if m|(a − b), that in L (W K|Q|=1 ) can be generated by SAS. is, if a − b ≡ 0 (mod m), in the remaining cases we show the congruence (2 − |u(k) |a ) − ∆(k) ≡ 0 (mod 4) holds. Lemma 3. L = {w | w ∈ {a, b}∗ , |w|a = 2 + 4i, i ≥ 0} can     be generated by SAS. xabb xbab xbba • If , , , then |u(k+1) |a = y y y Proof. Let S = hΣ, A, T, Ei with |u(k) |a + 1 and ∆(k+1) = ∆(k) + 3 holds, so 2 − • Σ = {a, b}, |u(k+1) |a − ∆(k+1) = 2 − (|u(k) |a + 1) − (∆(k) + 3) =       2 − |u(k) |a − ∆(k) − 4 ≡ 0 (mod 4) also holds; aa ab ba bba bbb • A={ , , , , },   aa a b bb b xaa • if , then |u(k+1) |a = |u(k) |a + 2 and ∆(k+1) =          y x1 b xabb xbab xbba xaa ∆(k) +2 hold, so 2−|u(k+1) |a −∆(k+1) = 2−(|u(k) |a + • T ={ , , , , ,  x2 x3 y y y y 2) − (∆(k) + 2) = 2 − |u(k) |a − ∆(k) − 4 ≡ 0 (mod 4) x1 a also holds; } where x, x1 , x2 , x3 , x4 , y ∈ Σ, x2 x3 x4   x1 a    a b • if , then |u(k+1) |a = |u(k) |a + 1 and ∆(k+1) = • E ={ , }. x2 x3 x4 a b ∆(k) −1 hold, so 2−|u(k+1) |a −∆(k+1) = 2−(|u(k) |a + 1) − (∆(k) − 1) = 2 − |u(k) |a − ∆(k) ≡ 0 (mod 4) also We have chosen the assembly units of the SAS S in such holds. a way that the difference between the length of the upper and lower string indicates how many as must be added to Thus, all words generated by S belong to the language L. the upper string in order to obtain a correct number. The Now we show that arbitrary words of L can be generated following four cases are possible. by S. The assembling units of S are chosen in such a way that as and bs can follow each other in arbitrary order in 1. At least 3 as are missing: ∆(i) = 3 + 4n, the upper string, and the lower string of each unit can be 2. at least 2 as are missing: ∆(i) = 2 + 4n, arbitrary (only the length of the strings are fixed), so it is possible to add to the lower strings the appropriate letters. 3. at least 1 a is missing: ∆(i) = 1 + 4n, Axioms are chosen as follows. We create the least num- 4. at least 0 a is missing: ∆(i) = 0 + 4n, ber of axioms which satisfy the above mentioned congru- ence and provide the arbitrary   letter order. The idea is to where n ∈ Z and we are after the ith derivation steps. u create axioms of the form with upper strings of a More formally we canexpress the above as follows. v u(k) fixed length at most n ∈ N, and the lower strings being at Consider the string pair obtained in k derivation v(k) most as long as the upper, thus |u| = i, i = 1 . . . n, 1 ≤ steps. Then the congruence |v| ≤ |u|. Then we discard those which do not satisfy the congruence, or which are not necessary for the initiation 2 − |u(k) |a ≡ ∆(k) (mod 4) of the generation.     a b must hold during the whole generating process: |u(k) |a is This way, for i = 1 we have and as the number of as in the upper string. In case of |u(k) |a + a b axioms, but we discard these, because they do not 2 = 4i + j, (i ≥ 0, 1 ≤ j ≤ 4), (4i + 4) − (4i + j) = (4i + satisfy  the congruence.    For  i = 2 we  have  the units 4) − |u(k) |a − 2 indicates the least number of as that we  aa aa ab ab ba ba bb need to add in order for the number to be a multiple of 4. , , , , , , , and a aa a     b ab b ba ∆(k) = (4 − j) + 4n, n ∈ Z indicates this too, so   bb aa ab ba . We keep , , , so we can (4i + 4) − |u(k) |a − 2 ≡ 2 − |u(k) |a ≡ bb aa a b begin the generation of words starting with aa, ab, ∆(k) = (4 − j) + 4n (mod 4). ba. We cannot generate words starting with bb, so for  i = 3 weonly consider   the  unitsstarting   with bb The axioms fulfill this condition. bba bba bba bbb bbb bbb Let us assume, that the congruence  holds  after k deriva- , , , , , . We b  bb bba b bb bbb uk+1   tion steps. After adding the unit in the (k + 1)th bba bbb vk+1 keep , , so we are able to start the gen- step, the congruence still holds, since bb b     eration of words beginning with bb. We can start the x1 b a b generation with all possible combinations of letters, so no • if , , , then |u(k+1) |a = |u(k) |a so x2 x3 a b more axioms are needed. ∆(k) = ∆(k+1) , thus 2−|u(k+1) |a = 2 −|u(k) |a ≡ ∆(k) = We proceed with units in T in a similar manner: we ∆(k+1) also hold. create the least possible number of elements that satisfy the congruence, but we also allow that lower strings are References longer than the upper strings. Let us suppose that the cur- rent string pair satisfies the congruence. If we would like [1] Elena Czeizler, Eugen Czeizler, Lila Kari, and Kai Salomaa. to continue the upper string with b, then On the descriptional complexity of watson-crick automata. Theor. Comput. Sci., 410(35):3250–3260, 2009. • we need to add an appropriate letter also  tothe lower [2] Rudolf Freund, Gheorghe Păun, Grzegorz Rozenberg, and x1 b Arto Salomaa. Watson-crick finite automata. In Harvey Ru- string, so we need a unit of the form . x2 x3 bin and David Harlan Wood, editors, DNA Based Computers, Proceedings of a DIMACS Workshop, Philadelphia, Penn- If we would like to continue the upper string with a, then sylvania, USA, June 23-25, 1997, volume 48 of DIMACS • we need to add Series in Discrete Mathematics and Theoretical Computer  two letters  to the lower string, so we Science, pages 297–327. DIMACS/AMS, 1997. x1 a have the unit , or [3] Lila Kari, Gheorghe Păun, Grzegorz Rozenberg, Arto Salo- x2 x3 x4 maa, and Sheng Yu. DNA computing, sticker systems, and • we need to add further letters to the upper string. If universality. Acta Informatica, 35(5):401–420, May 1998. to the upper string [4] Dietrich Kuske and Peter Weigel. The role of the com- plementarity relation in watson-crick automata and sticker – we would like to add an a, then weneed  to systems. In Cristian Calude, Elena Calude, and Michael J. xabb Dinneen, editors, Developments in Language Theory, 8th In- add three letters, so we have the units ,     y ternational Conference, DLT 2004, Auckland, New Zealand, xbab xbba December 13-17, 2004, Proceedings, volume 3340 of Lec- and ; ture Notes in Computer Science, pages 272–283. Springer, y y 2004. – we would  like to add two as, then we have the xaa [5] Martin Kutrib and Matthias Wendlandt. String assembling unit . systems. RAIRO Theor. Informatics Appl., 46(4):593–613, y 2012. The number of letters needed to be added to the upper or [6] Andrei Păun and Mihaela Păun. State and transition com- lower strings are determined by the congruence. plexity of watson-crick finite automata. In Gabriel Ciobanu With these units, we can generate arbitrary strings of as and Gheorghe Paun, editors, Fundamentals of Computation and bs on the upper string, while the lower can only be Theory, 12th International Symposium, FCT ’99, Iasi, Ro- the same length as the upper if the congruence is satisfied. mania, August 30 - September 3, 1999, Proceedings, volume 1684 of Lecture Notes in Computer Science, pages 409–420. In thiscase  thegeneration can be finished with the ending  Springer, 1999. a b units or . [7] Gheorghe Păun, Grzegorz Rozenberg, and Arto Salomaa. a b DNA Computing - New Computing Paradigms. Texts in Based on the above, we have the next statement. Theoretical Computer Science. An EATCS Series. Springer, 1998. Theorem 2. L (SAS) 6⊆ L (W K|Q|=1 ). Proof. The statement is clear by combining Lemma 3 and Corollary 2. 4 Conclusion We have started to investigate the relationship between the computational power of WK automata and string assem- bling systems. We know that in the general case, WK au- tomata are stronger than SAS, so the comparison with WK automata having a restricted number of states is of interest. First we have shown that WK automata with at least two states are strictly more powerful than stateless WK automata, then constructed a language that can be gener- ated by SAS, but cannot be accepted by stateless WK au- tomata. The exact relationship of the two language classes, that is the question whether there are languages accepted by stateless WK automata which cannot be generated by SAS, remains open.