<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hardness of MSA with Selective Column Scoring?</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Verona</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multiple Sequence Alignment (MSA for short) is a well known problem in the eld of computational biology. In order to evaluate the quality of a solution, many di erent scoring functions have been introduced, the most widely used being the Sum-of-pairs score (SP-score). It is known that computing the best MSA under the SP-score measure is NP-hard. In this paper, we introduce a variant of the Column score (de ned in Thompson et al. 1999), which we refer to as Selective Column score: Given a symbol a 2 , the score of the i-th column is one if and only if all symbols of the same column are a, and otherwise zero. The acolumn score of an alignment is then the number of columns made of only character a. We show that nding the optimal MSA under the Selective Column Score is NP-hard for all alphabets of size j j 2 by reducing from MIN-2-SAT.</p>
      </abstract>
      <kwd-group>
        <kwd>Multiple Sequence Alignment</kwd>
        <kwd>Column score</kwd>
        <kwd>NP-completeness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Alignments are fundamental methods for the comparison of biological sequences.
They are extensively employed for assessing similarity among sequences as well
as in subprocedures of more complex operations, like the computation of
approximate overlaps in the context of sequencing.</p>
      <p>De nition 1 (Multiple alignment). Let s1; : : : ; sm; be m strings over some
alphabet . Let - 62 be a gap symbol and 0 = [ f-g: A multiple alignment
S~ of s1; : : : ; sm of length ` is a tuple of strings s~1; : : : ; s~m 2 0 such that
? Copyright c 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0)
1 For a string s and integer i, we denote by s[i] the ith character of s.</p>
      <p>It is useful to visualize the alignment as a matrix, where rows are the strings
s~i: Therefore, for each j = 1; : : : ; ` we refer to the sequence s~1[j]; : : : ; s~m[j] as
the jth column of the alingment.</p>
      <p>Many proposals exist regarding the choice of the most appropriate objective
function that should be optimized when computing an alignment of a set of
strings [1, 11, 12].</p>
      <p>In this paper, we focus on the so called column score analyzed in [12], where
the alignment is assessed by counting the number of columns consisting of a
single character.</p>
      <p>De nition 2 (Column Score). Let s1; : : : ; sm be a sequence of strings from
some alphabet : Given an alignment S~ = s~1; : : : ; s~m of length ` of s1; : : : ; sm;
and a character a 2 ; the a-column score of S~; denoted csa(S~) is the number
of columns made of only character a; i.e.,</p>
      <p>csa(S~) = jfj 2 f1; : : : ; `g j 8i = 1; : : : ; m; s~i[j] = agj:</p>
      <p>The column score of S~; denoted cs(S~) is the number of columns made of a
single character a; i.e.,
cs(S~) = j [a2 fj 2 f1; : : : ; `g j 8i = 1; : : : ; m; s~i[j] = agj =
X csa(S~):
a2
In particular we prove the NP-completeness of the following problem.
Selective Column Score Alignment (SCS-Align)
Input: Strings s1; : : : ; sm; over some alphabet ; an integer M
maxi jsij; a non-negative integer ; and a selected character
a 2 :
Question: Is there an aligment S~ = s~1; : : : ; s~m of s~1; : : : ; s~m such that the
length of S~ is at most M and the a-column score of S~ is at least
?</p>
      <p>In contrast with other objective functions for multiple sequence alignment,
to the best of our knowledge, before our result, nothing was known about the
tractability of computing the best possible alignment under the (selective)
acolumn score.</p>
      <p>Main Result. In this paper, we show that the Selective Column Score
Alignment problem (and several of its variants) is NP-complete.</p>
      <p>Related work. It is not hard to see that without the upper bound M on the
alignment length, computing an alignment of maximum possible column score
is equivalent to nding the longest common subsequence of the input strings,
which is known to be an NP-hard problem [9]. However, in contrast to our
SCSAlign, the Longest Common Subsequence problem is easily solvable in
polynomial time when the output is restricted to subsequences made of a single
given character|note that a a-column score alignment asks for a subsequence of
this type. Another important issue/di erence with respect to modelling column
score alignment as an instance of LCS is that an alignment obtained starting
from a longest common subsequence might have an uncontrollable number of
gaps per row. We show that the hardness result on SCS holds for an equivalent
variant of the problem where we impose a bound of 1 on the number of gaps per
row. In this respect, the length constraint on the solutions of SCS-Align have
some similarity with the gapped common subsequences considered in [4, 5]. In
these papers, however, the authors consider only the special case of two input
strings and do not discuss complexity issues of the general instances.</p>
      <p>The most commonly considered objective function for optimizing multiple
sequence alignments is the so called Sum of Pairs score (SP-score) introduced by
Carrillo and Lipman [3]. The SP-score generalizes the edit distance based score
for pairwise alignments (equivalently multiple alignments of only two strings). An
optimal pairwise alignment can be computed in quadratic time by the Needleman
and Wunsh dynamic programming approach [10]. In contrast, computing an
optimal SP-score based multiple sequence alignment is known to be NP-hard
[13].</p>
      <p>Alternative scoring measures, based on the distance to a consensus string, are
investigated in Li et al. [8]. In its simplest form, given an alignment a consensus
string can be obtained by selecting for each column a majority symbol. The score
of the alignment is the sum of the Hamming distance between each input string
and the consensus sequence. The authors of [8] investigate several variants of
this approach showing that they are in general NP-hard. They also consider a
sort of inverse procedure, where a median string of the input string is computed
that minimizes the sum of gap-bounded edit distance of each string from the
median. Once this string has been found the alignment can be constructed by
optimally aligning each input string to the median.</p>
      <p>Both the above approaches (SP-score and distance from consensus) are based
on pairwise comparisons of strings. More precisely, in the SP-score case each
possible pairwise comparison of the input strings is computed and the corresponding
scores are added up to obtain the nal score. In the case of distance to consensus,
a median string is computed and the alignment score is de ned as the sum of
pairwise alignment scores of the input strings to the median. The column score
is a basic model of scoring functions that are obtained by evaluating the set
of elements in a column, rather than composing evaluations of single pairs of
elements in a column.</p>
      <p>Organization of the paper. In section 2, we prove the NP-completeness of
our basic version of the column score, Selective Column Score Alignment,
by reducing from MIN-2-SAT. In section 3, we show how the reduction can be
extended to variants where there is no selected character, and the bound on the
length of the alignment is substituted with a very sharp bound on the number
of gaps per row of the alignment. Section 4 concludes the paper with some open
questions.
The NP-completeness of Selective Column Score
Alignment
In this section, we establish the complexity of the SCS-Align problem de ned
in the introduction. We provide a reduction from the NP-complete problem
MIN-2-SAT [7].</p>
      <p>Min-2-Sat
Input: A 2-CNF formula and an integer k
Question: Is there an assignment to the variables of
k clauses?
that satis es at most
The Reduction. Let I = ( ; k) be an instance of MIN-2-Sat. Let x1; : : : ; xn be
the variables of and C1; : : : ; Cm be the clauses of : Then, = C1^C2^ ^Cm;
where for each i = 1; : : : ; m; Ci = (l1i _ l2i), with lji 2 fx1; : : : xn; x1; : : : ; xng.</p>
      <p>From I; we now build an instance J = (fs0; : : : ; sn+1g; M; ; a) for
SCSAlign, where each si is a binary string, M is a bound on the alignment length,
a is a character in the alphabet = f0; 1g; and is the bound on the number
of columns of the alignments that are expected to contain only the selected
character a:</p>
      <p>We set M = 8mn+3m+2; a = 0 and build n+2 binary strings: s0; s1; : : : ; sn; sn+1
where js0j = jsn+1j = M and jsij = M 1 for each i = 1; : : : ; n:
More precisely we de ne:</p>
      <p>
        s0 = 1(01)2mn(101)m(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2mn1;
and for each i = 1; : : : ; n;
where, for each j = 1; : : : ; m;
      </p>
      <p>
        si = 0(00)(i 1)2m(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2m(00)(n i)2mci(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) : : : ci(m)(00)(i 1)2m(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2m(00)(n i)2m;
Finally, we de ne
and set
8100 if xi 2 Cj
&gt;
c(j) = &lt;
i
      </p>
      <p>
        010 if xi 2 Cj
&gt;:000 else
sn+1 = 0M ;
= 2mn + m
k:
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        )
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
Example 1: Let = (x1 _ x2) ^ (x2 _ x3) be a 2-CNF formula with m = 2 and
n = 3. The strings s0; : : : ; s4 obtained by the above process are then:
with s1; s2; s3 respectively representing the variables x1; x2; x3.
      </p>
      <p>Remark 1. Let S~ = s~0; s~1; : : : ; s~n+1 be an alignment of length ` M of the
strings s0; : : : ; sn+1: The following three facts are easy implied by the structure
of the instance produced by our reduction.
1. Since js0j = jsn+1j = M we have that it must be ` = M; hence s~0 = s0
and s~n+1 = sn+1: Moreover, for each i = 1; : : : ; n; because of jsij = M 1
it follows that s~i contains exactly one gap. We will use gap(i) to denote
the position of the single gap in s~i; i.e., the index j = 1; : : : ; M; such that
s~i[j] = -:
2. Following from Fact 1. and since s~n+1 is made of only zeroes there exist no
columns with only 1, and the only columns that can contain only 0 are those
corresponding to the positions of the zeroes in s0: In other words, for the
instances produced by our reduction, we have cs0(S~) = cs(S~):
3. For each i = 1; : : : ; n and j = 1; : : : ; M; we say that s~i blocks position j if
s~0[j] = s0[j] = 0 and s~i[j] = 1: A column j is made of only zeroes if and
only if there is no i = 1; : : : ; n such that s~i blocks position j:</p>
      <p>
        For each i = 1; : : : ; n; let siT = si- be the string obtained by adding a gap
at the end of si. Let siF = -si be the string obtained by adding a gap at the
beginning of si: Then, we have
siT = 0(00)(i 1)2m(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2m(00)(n i)2mci(;1T) : : : ci(;mT)(00)(i 1)2m(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2m(00)(n i)2m-;
siF = -(00)(i 1)2m(01)2m(00)(n i)2mci(;1F) : : : ci(;mF)(00)(i 1)2m(01)2m(00)(n i)2m0;
where, for each j = 1; : : : ; m; ci(;jT) is equal to ci(j) (de ned in (
        <xref ref-type="bibr" rid="ref3">3</xref>
        )) and
ci(;jF) =
8010 if xi 2 Cj
&gt;
&lt;
      </p>
      <p>001 if xi 2 Cj
&gt;:000 else</p>
      <p>Let us de ne the head of s0 to be the pre x s0h = s0[1] : : : s0[4mn + 1] =
1(01)2mn:</p>
      <p>Let us de ne the core of s0 to be the substring s0p = s0[4mn + 2] : : : s0[4mn +
3m + 1] = (101)m:</p>
      <p>
        Finally, let us de ne the tail of s0 to be the su x st0 = s0[4mn + 3m +
2] : : : s0[8mn + 3m + 2] = (
        <xref ref-type="bibr" rid="ref10">10</xref>
        )2mn1:
      </p>
      <p>
        By a direct comparison with the de nition of s0 in (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ), we have the following
proposition, stating useful properties of strings siF ; siT :
Proposition 1. For each i = 1; : : : ; n; the string siT blocks
{ exactly 2m positions of the head of s0;
{ no position of the tail of s0;
{ the position of the jth zero in the core of s0 if and only if xi 2 Cj ; i.e., if
the assignment xi = T rue satis es clause Cj :
      </p>
    </sec>
    <sec id="sec-2">
      <title>Analogously, we have that the string siF blocks</title>
      <p>{ exactly 2m positions of the tail of s0;
{ no position of the head of s0;
{ the position of the jth zero in the core of s0 if and only if xi 2 Cj ; i.e., if
the assignment xi = F alse satis es clause Cj :
Moreover, for each i 6= i0 the strings siT ; siF ; siT0 ; siF0 block distinct positions in the
head and tail of s0.</p>
      <p>The following lemma proves that our reductions maps yes instances of
MIN2-Sat to yes-instances of SCS-Align.</p>
      <p>Lemma 1. If there exists an assignment A for that satis es at most k clauses
then there exists an alignment S~ = s~0; : : : ; s~n+1 of s0; : : : ; sn+1 such that the
length of S~ is M and there are at least = 2mn + m k columns made only of
zeroes.</p>
      <p>Proof. Given an assignment A that satis es at most k clauses of ; we de ne
the alignment S~ as follows: for each i = 0; : : : ; n + 1 we let
s~i =
8s
&gt;&lt; iT</p>
      <p>
        si
&gt;:siF
if i = 0 or i = n + 1
if A sets xi = T rue
if A sets xi = F alse
(
        <xref ref-type="bibr" rid="ref5">5</xref>
        )
      </p>
      <p>
        By the de nition of siF and siT and Proposition 1, we have that strings
s~1; : : : ; s~n block exactly 2mn of the positions in the head and the tail of s0:
Moreover, the position of the jth zero in the core of s0 is blocked by some string
s~i if and only if the truth value of xi in A satis es Cj : Since A satis es at most k
clauses, we have that at most k positions of the core of s0 are blocked by some s~i:
It total there are 4mn + m positions in s0 corresponding to a zero character and
at most 2mn + k are blocked by the strings of the alignment. Hence, there are at
least 2mn + m k positions which are not blocked, equivalently 2mn + m k
columns made of only zeros.
Example 2: Using the same CNF formula as Example 1 with the assignment
A = fx1 = T rue; x2 = F alse; x3 = F alseg, by Lemma 1 we get this alignment:
s~0 = s0 : 1(01)4(01)4(01)4101101(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )4(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )4(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )41
s~1 = s1T :
0(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )4(00)4(00)4100000(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )4(00)4(00)4s~2 = s2F : -(00)4(01)4(00)4010001(00)4(01)4(00)40
s~3 = s3F : -(00)4(00)4(01)4000001(00)4(00)4(01)40
s~4 = s4 : 0(00)4(00)4(00)4000000(00)4(00)4(00)40
We can see that the strings s1T ; s2F ; s3F block a total of 13 columns, and knowing
that the assignment A satis es only one clause we have k = 1, therefore it holds
13 2mn + k = 13.
      </p>
      <p>Lemma 2. Let be the maximum column score achievable by an alignment
of length M of the strings s0; : : : ; sn+1: Then, there exists an alignment S~ of
s0; : : : ; sn+1; such that</p>
    </sec>
    <sec id="sec-3">
      <title>1. the length of S~ is M;</title>
      <p>2. cs(S~) = ;
3. for each i = 1; : : : ; n; we have gap(i) 2 f1; M g; where gap(i) denotes the
position of the gap in s~i; i.e., the index j 2 f1; : : : ; M g such that s~i[j] = -:
Proof. Let S~ be an alignment that, among those satisfying conditions 1. and 2.,
has the minimum number of strings s~i violating condition 3. And assume, by
contradiction, that such number is greater than 0.</p>
      <p>Then there exists i 2 f1; 2; : : : ; ng such that the position of the gap in s~i
is not in f1; M g: We are going to show that we can modify s~i moving the gap
in position 1 or M and obtain another alignment satisfying condition 1. and 2.
The existence of such an alignment contradicts the minimality of S~ with respect
to condition 3, implying that in fact in S~ for all i = 1; : : : ; n we must have
gap(i) 2 f1; M g as desired.</p>
      <p>Let ni be the number of positions that s~i blocks altogether in the tail and
the head of s0: We have
ni =
&gt;82m
&gt;
&gt;&gt;&gt;2m
&gt;
&lt;</p>
      <p>4m
&gt;&gt;&gt;2m + d 2 e</p>
      <p>x
&gt;
&gt;
&gt;:2m + d 4m 2x+1 e
if gap(i) &lt; 4m(i 1) + 2;
if gap(i) 4mi + 2 + 4mn + 3m;
if 4mi + 2 gap(i) (i 1)4m + 1 + 4mn + 3m;
if gap(i) = (i 1)4m + 1 + x; 1 x 4m;
if gap(i) = 4m(i 1) + 1 + 4mn + 3m + x; 1 x
4m:</p>
      <p>Note that these numbers refer to positions that can be blocked only by s~i:
Hence, unblocking any of such positions implies a net increase in the column
score of the resulting alignment.</p>
      <p>We apply the following three cases:
Case 1. gap(i) 4mi + 1: In this case s~i blocks 2m positions in the tail of s0
and possibly some positions in the head of s~0: If we modify s~i by moving the
gap to the rst position (equivalently, we set s~i = siF ) we reduce the number
of positions blocked in the tail and head of s~0 to 2m: Moreover, the characters
of the new s~i which are aligned to the core of s~0 do not change. Then, the new
alignment has not decreased the number of columns made of only zeros, i.e., it
satis es conditions 1 and 2 and has one more string satisfying condition 3.
Case 2. gap(i) 4mn + 3m + 4m(i 1) + 2: In this case s~i blocks 2m positions
in the head of s0 and possibly some positions in the tail of s~0: If we modify s~i
by moving the gap to the last position (equivalently, we set s~i = siT ) we reduce
the number of positions blocked altogether in the tail and head of s~0 to only
2m: Moreover, the characters of the new s~i which are aligned to the core of s~0
do not change. Then, the new alignment has not decreased the number columns
made of only zeros, i.e., it satis es conditions 1 and 2 and has one more string
satisfying condition 3.</p>
      <p>Case 3. 4mi + 2 gap(i) 4m(i 1) + 1 + 4mn + 3m: In this case, s~i blocks 2m
positions in the head of s0 and 2m positions in the tail of s0. It is possible that
the position of the gaps allows to align the central zeros of s~i (those belonging
to the substrings ci(j)) to the zeroes of the core of s0: If we modify s~i by moving
the gap to the rst position (equivalently, we set s~i = siF ) we reduce the number
of positions blocked altogether in the tail and head of s~0 to 2m: Even if in the
resulting new alignment the new s~i blocked all the positions in the core of s0;
which are m, the net gain in the number of columns made of only zeroes would
be m: Then, the new alignment has strictly increased the number of columns
made of only zeros, i.e., it satis es conditions 1 and 2 and has one more string
satisfying condition 3.</p>
      <p>We can see that the strings s1T ; s2F ; s3F block a total of 13 columns, and knowing
that the assignment A satis es only one clause we have k = 1, therefore it holds
13 2mn + k = 13.</p>
      <p>Lemma 3. If there exists an alignment S~ of s0; : : : ; sn+1 that has length M and
such that cs(S~) ; then there exists an assignment A for that satis es at
most k clauses.</p>
      <p>Proof. By Lemma 2, we can assume w.l.o.g., that in S~ for each i = 1; : : : ; n,
we have gap(i) 2 f1; M g: Equivalently, for each i = 1; : : : ; n; we have that
s~i 2 fsiT ; siF g:</p>
      <p>We now create the assignment A by setting for each i = 1; : : : ; n;
xi =
(T rue</p>
      <p>
        if s~i = siT
F alse if s~i = siF
(
        <xref ref-type="bibr" rid="ref6">6</xref>
        )
      </p>
      <p>By the de nition of siF and siT and Proposition 1, we have that strings
s~1; : : : ; s~n block exactly 2mn of the positions in the head and the tail of s0:
Hence, exactly 2mn columns corresponding to the non-blocked positions in the
head and tail of s0 are made of only zeroes. Since cs(S~) = 2mn + m k;
we have that at least m k columns corresponding to positions in the core of
s0 must be made of only zeroes. Equivalently, at most k positions in the core of
s0 are blocked by the strings s~1; : : : ; s~n:</p>
      <p>By Proposition 1, we have that the position of the jth zero in the core of s0
is blocked by some string s~i if and only if the truth value of xi in A satis es Cj :
Since at most k positions of the core of s0 are blocked by some s~i; it follows that
A satis es at most k clauses.</p>
      <p>As a result of Lemmas 1 and 3, we have the following.</p>
      <p>Theorem 1. The problem Selective Column Alignment is NP-complete
already over an alphabet of size 2:
3</p>
      <p>Variants and Extensions
In this section we extend the NP-completeness of the SCS-Align to show the
NP-completeness of other variants or reformulations of the problem that are
also of interest. We start by the simple observation that the problem remains
NP-complete even if we consider as objective function the column score, rather
than the a-column score with respect to a speci c character a.</p>
      <p>Bounded Column Score Alignment (BCS-Align)
Input: Strings s1; : : : ; sm over some alphabet ; an integer M
maxi jsij; and a non-negative integer :
Question: Is there an aligment S~ = s~1; : : : ; s~m of s1; : : : ; sm such that the
length of S~ is at most M and the column score of S~ is at least
?</p>
      <p>In fact, due to the presence of the string sn+1 being made of only zeroes as
observed in point 2 of Remark 1, the proof in the previous section also shows
the result in the following corollary. It is enough to observe that sn+1 forces only
columns entirely made of 0 to be accepted.</p>
      <p>Corollary 1. The Bounded Column Score Alignment problem is
NPcomplete</p>
      <p>We can also reformulate the problem by interpreting the bound on the length
of the alignment as a bound on the number of gaps. In fact, the following
formulation is also easily shown to be NP-complete
gap-Bounded Column Score Alignment (gap-CS-Align)
Input: Strings s1; : : : ; sm over some alphabet ; an integer bound g on
the number of allowed gaps per string, and a non-negative integer
bound on the number of uniform columns to be found.</p>
      <p>Question: Is there an aligment S~ = s~1; : : : ; s~m of s~1; : : : ; s~m such that for
each i 2 [m]; s~i contains at most g gaps, and the column score
of S~ is at least ?
Corollary 2. The gap-Bounded Column Score Alignment problem is
NPcomplete
Proof. We use exactly the same reduction employed for the NP-completeness of
SCS-Align. We show that the particular case of gap-CS-Align with g = 1
is NP-complete. The simple observation needed is that in any alignment S~ =
s~0; : : : ; s~n+1 of the strings s0; : : : ; sn+1 no gap can be present in s~0; (nor in s~n+1);
otherwise, at least 2 gaps should be present in each s~i (i = 1; : : : ; n). Hence, we
have that Remark 1 (and in particular point 1.) holds also on the basis of the
bound on the number of gaps.
4</p>
      <p>Final Remarks
We proved that computing the best multiple sequence alignment with respect to
the column score is NP-hard even in a very restricted model where: strings are
binary, we only try to optimize the number of columns of the alignment made
of only zeroes, and we bound the number of gaps per row to 1.</p>
      <p>It would be interesting to better understand the relationships of this problem
to parameterized variants of the Longest Common Supersequence (LCS)
problem [4, 5, 2, 6], since LCS is the natural model of column score based multiple
sequence alignment in the absence of any bounds on the number of gaps or
restrictions on the uniform columns. Another natural open problem regards the
limits of approximability of our problems.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Berkemer</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          , Honer zu Siederdissen,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Stadler</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.F.</surname>
          </string-name>
          :
          <article-title>Compositional properties of alignments</article-title>
          . Mathematics in Computer Science (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Blin</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulteau</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tejada</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vialette</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Hardness of longest common subsequence for sequences with bounded run-lengths</article-title>
          . In: Karkkainen, J.,
          <string-name>
            <surname>Stoye</surname>
            ,
            <given-names>J</given-names>
          </string-name>
          . (eds.) Combinatorial Pattern Matching. pp.
          <volume>138</volume>
          {
          <fpage>148</fpage>
          . Springer Berlin Heidelberg, Berlin, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Carrillo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lipman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The multiple sequence alignment problem in biology</article-title>
          .
          <source>SIAM Journal on Applied Mathematics</source>
          <volume>48</volume>
          (
          <issue>5</issue>
          ),
          <volume>1073</volume>
          {
          <fpage>1082</fpage>
          (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Iliopoulos</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kubica</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Algorithms for computing the longest parameterized common subsequence</article-title>
          . In: Ma,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Zhang</surname>
          </string-name>
          , K. (eds.) Combinatorial Pattern Matching. pp.
          <volume>265</volume>
          {
          <fpage>273</fpage>
          . Springer Berlin Heidelberg, Berlin, Heidelberg (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Iliopoulos</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sohel</surname>
            <given-names>Rahman</given-names>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Algorithms for computing variants of the longest common subsequence problem</article-title>
          .
          <source>Theoretical Computer Science</source>
          <volume>395</volume>
          (
          <issue>2</issue>
          ),
          <volume>255</volume>
          {
          <fpage>267</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Keller, O.,
          <string-name>
            <surname>Kopelowitz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewenstein</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>On the longest common parameterized subsequence</article-title>
          .
          <source>Theoretical Computer Science</source>
          <volume>410</volume>
          (
          <issue>51</issue>
          ),
          <volume>5347</volume>
          {
          <fpage>5353</fpage>
          (
          <year>2009</year>
          ), combinatorial Pattern Matching
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kohli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krishnamurti</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mirchandani</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>The minimum satis ability problem</article-title>
          .
          <source>SIAM Journal on Discrete Mathematics</source>
          <volume>7</volume>
          (
          <issue>2</issue>
          ),
          <volume>275</volume>
          {
          <fpage>283</fpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Finding similar regions in many sequences</article-title>
          .
          <source>Journal of Computer and System Sciences</source>
          <volume>65</volume>
          (
          <issue>1</issue>
          ),
          <volume>73</volume>
          {
          <fpage>96</fpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Maier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The complexity of some problems on subsequences and supersequences</article-title>
          .
          <source>J. ACM</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          ),
          <volume>322</volume>
          ?
          <fpage>336</fpage>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Needleman</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wunsch</surname>
          </string-name>
          , C.D.:
          <article-title>A general method applicable to the search for similarities in the amino acid sequence of two proteins</article-title>
          .
          <source>Journal of Molecular Biology</source>
          <volume>48</volume>
          (
          <issue>3</issue>
          ),
          <volume>443</volume>
          {
          <fpage>453</fpage>
          (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Notredame</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Recent progress in multiple sequence alignment: a survey</article-title>
          .
          <source>Pharmacogenomics</source>
          <volume>3</volume>
          (
          <issue>1</issue>
          ),
          <volume>131</volume>
          {144 (
          <year>January 2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plewniak</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poch</surname>
            ,
            <given-names>O.:</given-names>
          </string-name>
          <article-title>A comprehensive comparison of multiple sequence alignment programs</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>27</volume>
          (
          <issue>13</issue>
          ),
          <volume>2682</volume>
          {
          <volume>2690</volume>
          (07
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>On the complexity of multiple sequence alignment</article-title>
          .
          <source>Journal of Computational Biology</source>
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <volume>337</volume>
          {
          <fpage>348</fpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>