<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>String Assembling Systems and Watson-Crick Finite Automata</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>András Murvai</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>György Vaszil</string-name>
          <email>vaszil.gyorgy@inf.unideb.hu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Informatics, University of Debrecen</institution>
          ,
          <addr-line>Kassai út 26, 4028 Debrecen</addr-line>
          ,
          <country country="HU">Hungary</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We explore the relationship of Watson-Crick automata and string assembling systems. In the general case, Watson-Crick automata are more powerful, so restricting the study to the stateless variant is of interest. We show that the class of languages of stateless Watson-Crick automata are strictly included in the class of languages of automata with at least two states, then compare string assembling systems with the stateless variant. It turns out, that there are languages that can be generated by string assembling systems, but not accepted by stateless WatsonCrick automata, but the question concerning the exact relationship of the language classes remains open.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The double stranded structure of DNA motivated the
introduction and study of double stranded strings with the
tools and techniques of formal language theory, that is, the
definition of string operations which model the
biochemical processes that the corresponding strings can undergo,
see the monograph [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for more details. Two such models,
related to the topic of this paper are sticker systems and
      </p>
      <sec id="sec-1-1">
        <title>Watson-Crick automata.</title>
        <p>
          Sticker systems were introduced in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], they use
double stranded string “pieces” as building blocks to assemble
longer double stranded strings when their single stranded
ends stick together and form double strands according to
the Watson-Crick complementarity relation. Such systems
describe formal languages, sets of double stranded strings
which can be “stick” together starting from a set of initial
pieces. Depending on how we specify the details of the
functioning of the system, simple languages (like regular
languages which can be described by finite automata), or
more complicated languages (like recursively enumerable
languages which can be described by Turing machines)
can be generated by sticker systems. As we will see later,
string assembling systems (one of the computing models
studied in this paper) are similar to this, from a certain
point of view they can be thought of as special cases of
sticker systems.
        </p>
        <p>Research supported by the construction
EFOP-3.6.3-VEKOP-162017-00002 supported by the European Union, co-financed by the
European Social Fund, and by grant K 120558 of the National Research,
Development and Innovation Office of Hungary (NKFIH), financed
under the K 16 funding scheme.</p>
        <p>Copyright c 2021 for this paper by its authors. Use permitted
under Creative Commons License Attribution 4.0 International (CC BY
4.0).</p>
        <p>
          The other model interesting for our investigations is
called Watson-Crick finite automaton, and it was
introduced in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Such automata are similar to ordinary finite
automata in the sense that they read strings written on their
tape, and either accept or reject them, thus defining a
formal language as the set of strings that are accepted.
Similarly to sticker systems, Watson-Crick automata work on
double stranded strings, thus, they have a double stranded
tape that contains two complementary strings which are
read by two separate reading heads being able to move
independently of each other, one on the upper and one on the
lower strand of the tape. Concerning their computational
power, these types of automata can describe more
complicated language classes than ordinary finite automata. As
we will see examples of this later, even some
non-contextfree context-sensitive languages can be accepted.
2
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>For a finite alphabet of symbols V , let V denote the set
of all strings over V , and let V + = V n fl g where l
denotes the empty string. We consider two languages
L1; L2 V equal if they differ in at most the empty string,
L1 n fl g = L2 n fl g. The length of a word w 2 V is
denoted by jwj, the number of occurrences of a symbol x 2 V
in w is denoted by jwjx.</p>
      <sec id="sec-2-1">
        <title>A double stranded string over V is a pair of strings</title>
        <p>(w1; w2) 2 V</p>
      </sec>
      <sec id="sec-2-2">
        <title>V , it can also be written as pair the set of pairs V</title>
        <p>V</p>
      </sec>
      <sec id="sec-2-3">
        <title>V can be written as .</title>
        <p>V
We denote the last letters and the first letters of a string
u u u</p>
        <p>as end( ) and bgn( ):
v v v
w1 , while
w2
1. If u = u0x; v = v0y for some x; y 2 V; u0; v0 2 V ,
u0x x
then end( ) = . Similarly, the first letters
y
for u = xu0 and v = yv0 are denoted</p>
        <p>where x; y 2 V; u0; v0 2 V .
of the pair
as bgn(
l
xu0
yv0
) =
2. Otherwise, if one of the strings is empty, we
l l u0x x
have end( ) = , end( ) = ,
v0y y l l
) =
, and bgn(
) =</p>
        <p>for
xu0</p>
        <p>The complementarity relation is a symmetric relation
on the letters of the alphabet r V V . We call the set of
sequences of pairs of complementary symbols a
WatsonCrick domain and denote it with W Kr (V ), more formally,
The string pair</p>
        <p>w
case of
written as</p>
        <p>where w = x1x2 : : : xn; w0 = y1y2 : : : yn. We
w0
also denote the complement of x 2 V by x, that is, (x; x) 2
r, and the complement of x is x, so x = x, and (x; x) 2 r.</p>
        <p>w1 w1</p>
        <p>The difference between w2 and ww12 is that w2
only gives a different notation for (w1; w2), while in the
w1</p>
        <p>2 W Kr (V ), jw1j = jw2j and w2 is the
comw2
plement of w1, w2 = w1 (also w2 = w1).
2.1</p>
        <sec id="sec-2-3-1">
          <title>Watson-Crick automata</title>
          <p>
            Now we recall the definition of Watson-Crick automata
from [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ], see also the monograph [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] for more details.
          </p>
          <p>The construct M = hV; r; Q; q0; F; d i is a Watson-Crick
automaton (WK automaton in short), where</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>V is the input alphabet,</title>
        <p>V</p>
      </sec>
      <sec id="sec-2-5">
        <title>V is the complementarity relation,</title>
        <p>r
F</p>
      </sec>
      <sec id="sec-2-6">
        <title>Q is the nonempty finite set of states,</title>
        <p>q0 2 Q is the initial state,</p>
      </sec>
      <sec id="sec-2-7">
        <title>Q is the set of accepting states, and</title>
        <p>V
d : Q</p>
        <p>V
sition relation.</p>
        <p>! 2Q is a finite relation, the state
tran</p>
      </sec>
      <sec id="sec-2-8">
        <title>A configuration of such a WK automaton is denoted as</title>
        <p>ww21 q ww43 where ww21 2 VV is the part of the
input which is already read, q 2 Q is the state of the WK
automaton, and ww34 2 VV is the part of the input which
is not read yet.</p>
      </sec>
      <sec id="sec-2-9">
        <title>The transition between two configurations is denoted by</title>
        <p>), and defined as follows.</p>
        <p>Let uv11 ; uv22 ; uv33 2 VV ; uv11uv22vu33 2
W Kr (V ), q; q0 2 Q. Then
u1 q
v1
u2u3
if and only if, q0 2 d (q; u2 ). We may also write
v2
if we are not interested in the states of the automaton and
in the rest of the input to be read.</p>
        <p>If the reflexive and transitive closure of ) is denoted by
) , then the language accepted by M is</p>
        <p>w
L(M) = fw 2 V j q0 w )
w
w</p>
        <p>q f ;
w
w
where
2 W Kr (V ); q f 2 Fg:
2.2</p>
        <sec id="sec-2-9-1">
          <title>String assembling systems</title>
          <p>
            Now we recall the definition of string assembling systems
from [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]. These also work with double stranded strings,
but in their case, the complementarity relation is the
identity relation. Similarly to sticker systems mentioned in the
introduction, they use a set of double stranded string pieces
as building blocks to generate a language. They are able
to “glue” two double stranded strings together if the last
letters of the first string and the first letters of the second
string coincide, and these letters overlap in the resulting
string.
          </p>
          <p>The 4-tuple S = hS; A; T; Ei is a string assembling
system (SAS in short), where</p>
        </sec>
      </sec>
      <sec id="sec-2-10">
        <title>S is a finite alphabet,</title>
        <p>A
T
E
uv
u</p>
        <p>S+</p>
        <p>or
S+</p>
        <p>S+
of the form</p>
        <p>S+ is the finite set of axioms of the form
u</p>
        <p>with u 2 S+; v 2 S ,
uv
S+ is the finite set of assembly units,
S+ is the finite set of ending assembly units
uv or v with u 2 S ; v 2 S+.</p>
        <p>v uv
to the unit</p>
        <p>A derivation step of the SAS S as above is denoted by
), and defined as follows. We say that we added the unit
xu2
u1 , denoted as
v1
u1
v1
)
u1u2 ;
v1v2
if and only if,
x
y
= end( u1 ) for some x; y 2 S, and</p>
        <p>v1
2 T [ E. Moreover, if ju1u2j
yv2
can be written as u1u2 = v1v2u0 for some u0 2 S , or the
other way around, if ju1u2j jv1v2j, then u1u2v0 = v1v2
for some v0 2 S .</p>
        <p>A sequence of derivation steps is a derivation. A
derivation is successful, if after choosing an axiom from A, we
jv1v2j, then u1u2
yv2
xu2
add assembling units from T , then close the derivation by
adding a unit from E, in such a way that the upper and
lower string of the string pair which is produced is
identical.</p>
        <p>If the reflexive and transitive closure of ) is denoted by
) , then the language generated by S is</p>
        <p>u
L(S) = fw 2 S+ j v
)
w
w</p>
        <p>is a
successful derivationg:</p>
        <p>Now we introduce some additional notation that will
be useful later. The string pair generated in k
derivation steps, or in the case of WK automata, the string pair
read in k transition steps, is denoted by
u(k) , that is,
v(k)
uv00 ) uv00uv11 :: :: :: vukk 11vukk = uv((kk)) . The pair uv((00))
denotes an axiom, or in the case of WK automata, the
l
pair , representing the fact that nothing has been read
l
from the input yet.</p>
        <p>Let Dk denote the difference of the length of upper and
u
lower element of the string pair added in the kth
v
derivation step, or read in the kth state transition, that
is,Dk = juj jvj, and let D(k) denote the difference of the
generated strings or the strings read on the upper and lower
pert of the input tape in k derivation or state transition
steps, that is, for u(k) , let D(k) = ju(k)j jv(k)j.</p>
        <p>v(k)</p>
      </sec>
      <sec id="sec-2-11">
        <title>Let us also denote the class of languages accepted or</title>
        <p>generated by a computational model X by L (X ), so let
L (SAS) and L (W K) denote the classes of languages
generated by SAS and accepted by WK automata,
respectively.
3</p>
        <p>The computational power of
Watson-Crick automata and string
assembling systems
In the following, we would like to point out some of the
similarities of WK automata and SAS, and then investigate
whether these similarities help us to establish a
relationship between their computational power.</p>
        <p>
          Consider the WK automata M = hV; r ; Q; q0; F; d i and
the SAS S = hS; A; T; Ei. They both work with double
stranded strings, and there is a relation between the upper
and lower strings in both cases. In the case of SAS, this
relation is the identity relation, in WK automata, on the other
hand, the relation can be an arbitrary symmetric relation.
This might seem to be an important difference, but it is
not. It is known from [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], that the computational power
of WK automata is not influenced by the
complementarity relation. For any WK automata M, we can construct
an M0, such that it uses the relation r = f(x; x) j x 2 V g,
and L(M) = L(M0). Based on this result, from now on we
assume that r = rid , the identity relation.
        </p>
        <p>We can also find similarities in the functioning of the
two models. Adding building units to the generated string
by a SAS from A, T , or E, corresponds to reading
substrings of the input by a WK automaton with a
transition from the initial state, a transition between arbitrary
states, and a transition leading to and accepting state,
respectively. While the states of WK automata are explicitly
given, “states” of a SAS are implicit, they are “hidden” in
the overlapping pairs of letters of the building units.
3.1</p>
        <p>
          Known results on WK automata and SAS
As we have seen, SAS can basically add three types of
units to the generated strings, while WK automata can
have an arbitrary number of states, so it seems to be
reasonable to review results related to the number of states of
WK automata. In [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] the authors show that from the point
of view of state complexity, WK automata are more
efficient than ordinary finite automata, there is a sequence
Lk; k 1 of regular languages, such that the number
of states of the automata sequence accepting Lk is
unbounded, while all languages of the sequence can be
accepted with WK automata having a fixed number of states.
Building on these results, in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] it is shown that arbitrary
finite languages can be accepted with WK automata
having two states, and that arbitrary unary regular languages
can be accepted by WK automata having three states.
        </p>
        <p>
          Concerning the power of SAS, already in the
introductory paper [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], the authors show that there are unary
regular languages that cannot be generated by SAS, but all
finite languages can be. From our point of view, especially
interesting is the result concerning the relationship of
languages accepted by stateless two head finite automata (that
is, two head automata with one state) and languages
generated by SAS. While the languages accepted by stateless
WK automata strictly include the languages accepted by
stateless two-head automata, Theorem 1 which we are
going to prove in the next section, can be considered as the
extension of this result.
3.2
        </p>
        <p>Comparing the power of WK automata and SAS
Let us start by establishing the relationship between
languages of (unrestricted) WK automata and SAS.</p>
        <sec id="sec-2-11-1">
          <title>Proposition 1.</title>
          <p>L (SAS)</p>
          <p>L (W K).</p>
          <p>
            Proof. According to Theorem 3.1 of [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ], languages
generated by SAS can be accepted by nondeterministic one-way
two-head finite automata, and since one-way two-head
finite automata can be simulated by WK automata, it is clear
that L (SAS) L (W K).
          </p>
          <p>
            On the other hand, L (W K) 6= L (SAS), since according
Theorem 3.6 in [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] the language L = fag [ fa2n j n 2g
cannot be generated by any SAS, but can be accepted by
the following WK automaton.
          </p>
          <p>Let M = hfag; rid ; fq0; q f ; q f 0 g; q0; fq f ; q f 0 g; d i where</p>
        </sec>
      </sec>
      <sec id="sec-2-12">
        <title>As we have seen, in the general case, the computational</title>
        <p>power of WK automata is greater than the power of SAS.
Now we continue by restricting the power of WK automata
by considering its variants with reduced number of states.
First, we look at the relationship between stateless WK
automata and SAS.</p>
      </sec>
      <sec id="sec-2-13">
        <title>In the following, we denote the classes of WK automata</title>
        <p>with m states by W KjQj=m and by L (W KjQj=m) the class
of languages they accept.</p>
        <p>Theorem 1. Let M be a nondeterministic stateless WK
automaton. Then there exists an SAS S generating the
language accepted by M with an initial marker, that is,
$L(M) = L(S) where $ 62 V .</p>
        <p>Proof. Consider M = hV; rid ; fqg; q; fqg; d i, we construct
the SAS S = hS; A; T; Ei as follows.</p>
        <p>First we show that any w 2 L(M) can be generated by S.</p>
      </sec>
      <sec id="sec-2-14">
        <title>M accepts the empty word, so S must generate $, thus</title>
        <p>$ $
$ 2 A; $ 2 E.</p>
        <p>Let su and sl denote the already read part of the upper
and lower strands of the input of M. For all possible pairs
of letters end( $su ) = x , if u can be read by M,
$sl y v</p>
        <p>xu
there has to be a unit which can be added by S. We
yv
have added all possible building units for the reading of
u</p>
        <p>to S, so if a string pair can be read by M, then the
v
corresponding building unit can be added by S.</p>
      </sec>
      <sec id="sec-2-15">
        <title>M is always in an accepting state, so the only require</title>
        <p>ment for a string to be accepted is the property that it can
be read completely. In the last reading step, reading some
u</p>
        <p>, the two heads reach the end of the upper and lower
v</p>
        <p>S = V [ f$g ($ 62 V ),</p>
        <p>$
A = f $ g,</p>
        <p>xu
T = f yv</p>
        <p>x
j for all y</p>
        <p>2 S
transitions d (q;</p>
        <p>$
E = T [ f $ g.</p>
        <p>u
v
) = qg,</p>
      </sec>
      <sec id="sec-2-16">
        <title>S pairs of letters and</title>
        <p>strings simultaneously. All string pairs corresponding to
some unit in T have an appropriate transition in d , so S
can finish the string generation by adding some unit from
T , which must contain all units that can end the generation,
so
$</p>
        <p>E = T [ f $ g:</p>
        <p>Now we show that any $w 2 L(S) can be accepted by
M. For all $w, we have a derivation
$
$ )
$u1
$v1
)
$u1u2
$v1v2
) : : : )
$u1u2 : : : ut ,
$v1v2 : : : vt
xui
xui
yvi
where $w = $u1u2 : : : ut = $v1v2 : : : vt ;
yvi
i t. Since we created the building units in such a way
that their initial pairs of letters contain all possible
combinations, even if x = end( u(i 1) ) holds, the ability
y v(i 1)</p>
        <p>2 T [ E; 1
depends only on the string pair
to add unit
As d (q;
u
i ) = q, and M is stateless, only the read string
vi
pair ui determines the success of the reading a word.</p>
        <p>vi
The derivation above (besides the leading $ sign) is a
successful reading of the word w
ui .
vi
u1
v1
)
u1u2
v1v2
) : : : )
u1u2 : : : ut ,
v1v2 : : : vt
thus w 2 L(M).</p>
      </sec>
      <sec id="sec-2-17">
        <title>Although formally the above theorem does not say</title>
        <p>
          anything about the containment relationship betweem
L (SAS) and L (W KjQj=1), it still establishes some kind of
connection between the two languages classes. Moreover,
our statement is “stronger” then the similar statement in
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] which states that for all languages L accepted by
stateless two-head finite automata, a corresponding SAS
generating L0 can be constructed in such a way that all w 2 L
corresponds to w0 2 L0 with h(w0) = w for a
homomorphism deleting four symbols from w0 (and not just one, as
in our case).
        </p>
      </sec>
      <sec id="sec-2-18">
        <title>Let us now relax the requirement of statelessness, and consider WK automata with two states</title>
        <p>Lemma 1. There is a WK automaton with two states
which accepts the language L = fw j w 2 fa; bg ; jwja =
2(1 + 2i); i 0g.</p>
        <p>Proof. The language can be accepted by the following</p>
      </sec>
      <sec id="sec-2-19">
        <title>WK automaton.</title>
        <p>Let M = hfa; bg; rid ; fq0; q f g; q0; fq f g; d i with d :
where in the first and last configuration M is in states q0
and q f , respectively, and in one of the two states in the
intermediate configurations, uvii 2 f la ; bx ; xly j
x; y 2 fa; bgg; 0 i t, correspond to the reading steps.</p>
        <p>In any case, by reading ui = b , Di = 0, and M
vi x
remains in the same state, by reading
ui
vi
=
a
l
1 and M remains in the same state, by reading
l
xy
l</p>
        <p>, Di = 2, and M changes to the other state.
xy
It is clear that D(i) can only increase after a reading step
a was read, and can only decrease if l was read.
l xy
Therefore, in order for D(i) = 0, some
need to be
a
read, together with reading twice as many . Thus, if
l
D(i) = 0, the already read part of the upper string contains
an even number of as.</p>
        <p>If w is accepted, then D(i) = 0 and M must be in state q f .
In order for M to be in state q f , an odd number (1 + 2k; k
l
0) of reading had to happen, which means that 2(1 +
xy</p>
        <p>a
2k) number of reading had to happen. Thus, M can
l
only accept words containing 2(1 + 2k) number of as.</p>
        <p>Now we show that M can accept any w 2 L. First it
a b
needs to read and (choosing the appropriate x)
l x
until it reaches the end of the upper string. We are in state
q0, since we have started in q0 and remained in q0 all the
way. The reading head of the lower string is behind the
upper head by the number of as, by 2(1 + 2k). In order
l
to catch up with the upper head, we need to read xy a
number of times. At each such reading step, M changes
its state. Since the number if as in w is 2(1 + 2k) and we
read two letters from the lower string, the lower reading
head reaches the end of the word after 1 + 2k, that is, after
an odd number of reading steps. Therefore, M enters q f
when started in q0, so w is accepted.</p>
        <p>Corollary 1. The language L = fw j w 2 fa; bg ; jwja =
(1 + 2i)k; i 0g can be accepted for any k 2 by a WK
automaton similar to the above, with</p>
        <p>Thus, L = fw j w 2 fa; bg ; jwja = 2(1 + 2i); i 0g can
be accepted by a WK automaton with two states, but it
cannot be accepted by stateless WK automata, as we will
show in the following.</p>
        <p>Lemma 2. If M is a stateless WK automaton, then L(M) =
L(M) .</p>
        <p>Proof. Let M = hV; rid ; fqg; q; fqg; d i be a stateless WK
automaton. It is clear that L(M) L(M) . To show the
¥
converse inclusion, consider L(M) = S L(M)i, and a
i=0
word w 2 L(M) . There are three possible cases.
1. If w 2 L(M)0, then w = l 2 L(M),
2. if w 2 L(M)1, then w 2 L(M),
3. if w 2 L(M)k; k 2, then w can be written as w =
w1w2:::wk, where wi 2 L(M); 1 i k.</p>
      </sec>
      <sec id="sec-2-20">
        <title>The first two cases are clear.</title>
      </sec>
      <sec id="sec-2-21">
        <title>In the third case,</title>
        <p>w1 q w2:::wk is a possible configuration, since w1 2 L,
w1 w2:::wk
so the two reading heads can be positioned after w1.</p>
        <p>If a configuration w1:::wi 1 q wi:::wk can occur,
w1:::wi 1 wi:::wk
then also w1:::wi q wi+1:::wk is a possible
configuraw1:::wi wi+1:::wk
tion, since wi can be read (as wi 2 L). Therefore, w can
wi
be read in such a way that both reading heads reach the
end of the tape, and since there is only one state, M
accepts w.</p>
      </sec>
      <sec id="sec-2-22">
        <title>Using the above lemma, we can show the following.</title>
        <p>Corollary 2. L = fw j w 2 fa; bg ; jwja = 2 + 4i; i
cannot be accepted by any stateless WK automaton.
0g
Proof. If L = L(M) for a stateless WK automaton M, then
L = L , according to the lemma above. Let w 2 L be such,
that jwja = 2. Then w2 2 L , but w2 2= L, so L 6= L ,
therefore L 6= L(M), which is a contradiction.</p>
        <p>S = fa; bg,
A = f
T = f</p>
        <p>x1a
b
b g.</p>
      </sec>
      <sec id="sec-2-23">
        <title>Now we show that the language that we found not to be</title>
        <p>in L (W KjQj=1) can be generated by SAS.</p>
        <p>Lemma 3. L = fw j w 2 fa; bg ; jwja = 2 + 4i; i
be generated by SAS.
0g can
Proof. Let S = hS; A; T; Ei with</p>
        <p>;
aa
aa
x1b
x2x3
ab
a
;</p>
        <p>;
xabb
y
ba
b
,
bba
bb
;
xbab
y
,
;
bbb
b
xbba
y
g,
,
g where x; x1; x2; x3; x4; y 2 S,
xaa
y
,</p>
        <p>We have chosen the assembly units of the SAS S in such
a way that the difference between the length of the upper
and lower string indicates how many as must be added to
the upper string in order to obtain a correct number. The
following four cases are possible.</p>
        <p>1. At least 3 as are missing: D(i) = 3 + 4n,
2. at least 2 as are missing: D(i) = 2 + 4n,
3. at least 1 a is missing: D(i) = 1 + 4n,
4. at least 0 a is missing: D(i) = 0 + 4n,</p>
      </sec>
      <sec id="sec-2-24">
        <title>Consider the string pair</title>
        <p>where n 2 Z and we are after the ith derivation steps.</p>
        <p>More formally we can express the above as follows.
u(k)</p>
        <p>obtained in k derivation
v(k)
steps. Then the congruence
2
ju(k)ja</p>
        <p>D(k)
(mod 4)
must hold during the whole generating process: ju(k)ja is
the number of as in the upper string. In case of ju(k)ja +
2 = 4i + j; (i 0; 1 j 4), (4i + 4) (4i + j) = (4i +
4) ju(k)ja 2 indicates the least number of as that we
need to add in order for the number to be a multiple of 4.
D(k) = (4 j) + 4n; n 2 Z indicates this too, so
(4i + 4)
ju(k)ja
2
2</p>
        <p>ju(k)ja
D(k) = (4
j) + 4n
(mod 4):
tion steps. After adding the unit
vk+1
step, the congruence still holds, since</p>
      </sec>
      <sec id="sec-2-25">
        <title>The axioms fulfill this condition.</title>
      </sec>
      <sec id="sec-2-26">
        <title>Let us assume, that the congruence holds after k deriva</title>
        <p>uk+1
in the (k + 1)th
x1b a b
if x2x3 ; a ; b , then ju(k+1)ja = ju(k)ja so
D(k) = D(k+1), thus 2 ju(k+1)ja = 2 ju(k)ja D(k) =
D(k+1) also hold.</p>
        <p>ju(k)ja)</p>
        <p>Using that a b (mod m) if and only if mj(a b), that
is, if a b 0 (mod m), in the remaining cases we show
the congruence (2 D(k) 0 (mod 4) holds.</p>
        <p>xabb xbab xbba
If ; ; , then</p>
        <p>y y y
ju(k)ja + 1 and D(k+1) = D(k) + 3 holds, so 2
ju(k+1)ja D(k+1) = 2 (ju(k)ja + 1) (D(k) + 3) =
2 ju(k)ja D(k) 4 0 (mod 4) also holds;
ju(k+1)ja =
xaa
if</p>
        <p>y
D(k) + 2 hold, so 2
2) (D(k) + 2) = 2
also holds;</p>
        <p>x1a
if
D(k)
1) (D(k)
holds.</p>
        <p>x2x3x4
1 hold, so 2</p>
        <p>1) = 2
, then ju(k+1)ja = ju(k)ja + 2 and D(k+1) =
ju(k+1)ja
ju(k)ja</p>
        <p>D(k+1) = 2
D(k) 4</p>
        <p>(ju(k)ja +
0 (mod 4)
, then ju(k+1)ja = ju(k)ja + 1 and D(k+1) =
ju(k+1)ja
ju(k)ja</p>
        <p>D(k+1) = 2 (ju(k)ja +
D(k) 0 (mod 4) also</p>
      </sec>
      <sec id="sec-2-27">
        <title>Thus, all words generated by S belong to the language L.</title>
      </sec>
      <sec id="sec-2-28">
        <title>Now we show that arbitrary words of L can be generated</title>
        <p>by S. The assembling units of S are chosen in such a way
that as and bs can follow each other in arbitrary order in
the upper string, and the lower string of each unit can be
arbitrary (only the length of the strings are fixed), so it is
possible to add to the lower strings the appropriate letters.</p>
        <p>Axioms are chosen as follows. We create the least
number of axioms which satisfy the above mentioned
congruence and provide the arbitrary letter order. The idea is to
u
create axioms of the form with upper strings of a
v
fixed length at most n 2 N, and the lower strings being at
most as long as the upper, thus juj = i; i = 1 : : : n; 1
jvj juj. Then we discard those which do not satisfy the
congruence, or which are not necessary for the initiation
of the generation.
a b
This way, for i = 1 we have and as
a b
axioms, but we discard these, because they do not
satisfy the congruence. For i = 2 we have the units
aa aa ab ab ba ba bb</p>
        <p>; ; ; ; ; ; , and
a aa a ab b ba b
bb aa ab ba</p>
        <p>. We keep ; ; , so we can
bb aa a b
begin the generation of words starting with aa, ab,
ba. We cannot generate words starting with bb, so
for i = 3 we only consider the units starting with bb
bba bba bba bbb bbb bbb</p>
        <p>; ; ; ; ; . We
b bb bba b bb bbb</p>
        <p>bba bbb
keep ; , so we are able to start the
genbb b
eration of words beginning with bb. We can start the
generation with all possible combinations of letters, so no
more axioms are needed.</p>
        <p>We proceed with units in T in a similar manner: we
create the least possible number of elements that satisfy
the congruence, but we also allow that lower strings are
longer than the upper strings. Let us suppose that the
current string pair satisfies the congruence. If we would like
to continue the upper string with b, then
string, so we need a unit of the form
we need to add an appropriate letter also to the lower
x1b</p>
        <p>.
x2x3
If we would like to continue the upper string with a, then
x2x3x4
have the unit
we need to add two letters to the lower string, so we
x1a</p>
        <p>, or
we need to add further letters to the upper string. If
to the upper string
– we would like to add an a, then we need to
xabb
add three letters, so we have the units ,
y
xbab xbba</p>
        <p>and ;
y y
– we would like to add two as, then we have the
xaa
unit .</p>
        <p>y
The number of letters needed to be added to the upper or
lower strings are determined by the congruence.</p>
      </sec>
      <sec id="sec-2-29">
        <title>With these units, we can generate arbitrary strings of as</title>
        <p>and bs on the upper string, while the lower can only be
the same length as the upper if the congruence is satisfied.
In this case the generation can be finished with the ending
a b
units or :</p>
        <p>a b</p>
      </sec>
      <sec id="sec-2-30">
        <title>Based on the above, we have the next statement.</title>
        <sec id="sec-2-30-1">
          <title>Theorem 2.</title>
          <p>L (SAS) 6</p>
          <p>L (W KjQj=1):
Proof. The statement is clear by combining Lemma 3 and</p>
        </sec>
      </sec>
      <sec id="sec-2-31">
        <title>Corollary 2.</title>
        <p>4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>We have started to investigate the relationship between the
computational power of WK automata and string
assembling systems. We know that in the general case, WK
automata are stronger than SAS, so the comparison with WK
automata having a restricted number of states is of interest.</p>
      <p>First we have shown that WK automata with at least
two states are strictly more powerful than stateless WK
automata, then constructed a language that can be
generated by SAS, but cannot be accepted by stateless WK
automata. The exact relationship of the two language classes,
that is the question whether there are languages accepted
by stateless WK automata which cannot be generated by</p>
      <sec id="sec-3-1">
        <title>SAS, remains open.</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Elena</given-names>
            <surname>Czeizler</surname>
          </string-name>
          , Eugen Czeizler, Lila Kari, and
          <string-name>
            <given-names>Kai</given-names>
            <surname>Salomaa</surname>
          </string-name>
          .
          <article-title>On the descriptional complexity of watson-crick automata</article-title>
          .
          <source>Theor. Comput. Sci.</source>
          ,
          <volume>410</volume>
          (
          <issue>35</issue>
          ):
          <fpage>3250</fpage>
          -
          <lpage>3260</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Rudolf</given-names>
            <surname>Freund</surname>
          </string-name>
          , Gheorghe Pa˘un, Grzegorz Rozenberg, and
          <string-name>
            <given-names>Arto</given-names>
            <surname>Salomaa</surname>
          </string-name>
          .
          <article-title>Watson-crick finite automata</article-title>
          .
          <source>In Harvey Rubin and David</source>
          Harlan Wood, editors,
          <source>DNA Based Computers, Proceedings of a DIMACS Workshop</source>
          , Philadelphia, Pennsylvania, USA, June 23-25,
          <year>1997</year>
          , volume
          <volume>48</volume>
          <source>of DIMACS Series in Discrete Mathematics and Theoretical Computer Science</source>
          , pages
          <fpage>297</fpage>
          -
          <lpage>327</lpage>
          . DIMACS/AMS,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Lila</given-names>
            <surname>Kari</surname>
          </string-name>
          , Gheorghe Pa˘un, Grzegorz Rozenberg, Arto Salomaa, and
          <string-name>
            <given-names>Sheng</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>DNA computing, sticker systems, and universality</article-title>
          .
          <source>Acta Informatica</source>
          ,
          <volume>35</volume>
          (
          <issue>5</issue>
          ):
          <fpage>401</fpage>
          -
          <lpage>420</lpage>
          , May
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Dietrich</given-names>
            <surname>Kuske</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Weigel</surname>
          </string-name>
          .
          <article-title>The role of the complementarity relation in watson-crick automata and sticker systems</article-title>
          . In Cristian Calude, Elena Calude, and Michael J. Dinneen, editors,
          <source>Developments in Language Theory</source>
          , 8th International Conference, DLT 2004, Auckland, New Zealand,
          <source>December 13-17</source>
          ,
          <year>2004</year>
          , Proceedings, volume
          <volume>3340</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>272</fpage>
          -
          <lpage>283</lpage>
          . Springer,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Martin</given-names>
            <surname>Kutrib</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Wendlandt</surname>
          </string-name>
          .
          <article-title>String assembling systems</article-title>
          .
          <source>RAIRO Theor. Informatics Appl.</source>
          ,
          <volume>46</volume>
          (
          <issue>4</issue>
          ):
          <fpage>593</fpage>
          -
          <lpage>613</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Pa</surname>
          </string-name>
          <article-title>˘un and Mihaela Pa˘un. State and transition complexity of watson-crick finite automata</article-title>
          . In Gabriel Ciobanu and Gheorghe Paun, editors,
          <source>Fundamentals of Computation Theory, 12th International Symposium</source>
          , FCT '99,
          <string-name>
            <surname>Iasi</surname>
          </string-name>
          , Romania,
          <source>August 30 - September 3</source>
          ,
          <year>1999</year>
          , Proceedings, volume
          <volume>1684</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>409</fpage>
          -
          <lpage>420</lpage>
          . Springer,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Gheorghe</given-names>
            <surname>Pa</surname>
          </string-name>
          <article-title>˘un, Grzegorz Rozenberg, and Arto Salomaa</article-title>
          .
          <source>DNA Computing - New Computing Paradigms. Texts in Theoretical Computer Science. An EATCS Series</source>
          . Springer,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>