=Paper= {{Paper |id=Vol-2098/paper16 |storemode=property |title= |pdfUrl=https://ceur-ws.org/Vol-2098/paper16.pdf |volume=Vol-2098 |authors=Alexander Kel’manov,Ludmila Mikhailova,Semyon Romanchenko }} ==== https://ceur-ws.org/Vol-2098/paper16.pdf
On a Problem of Choosing Elements in a Family
                of Sequences

    Alexander Kel’manov1,2 , Ludmila Mikhailova1 , and Semyon Romanchenko1
                            1
                            Sobolev Institute of Mathematics,
                      4 Koptyug Ave., 630090 Novosibirsk, Russia
                             2
                               Novosibirsk State University,
                       2 Pirogova St., 630090 Novosibirsk, Russia
               kelm@math.nsc.ru, mikh@math.nsc.ru, rsm@math.nsc.ru



         Abstract. In the problem considered, it is required to minimize the
         sum of elements chosen in a family of finite numerical sequences with
         some constraints on the choice of elements. Namely, given a family of
         L nonnegative N -element sequences and a positive integer J, we need
         to minimize the sum of J intra-sums each of which includes only one
         element in every input sequence with all possible L-permutations of these
         sequences and under some constraints on the choice of elements to be
         included in the general double sum. The problem is related, for example,
         to the distant noise-prove monitoring of several moving objects with
         possible arbitrary displacements (permutations) of these objects. For this
         problem we present an exact polynomial-time algorithm with O(N 5 )
         running time.

         Keywords: Optimal summing · Finite numerical sequences · Permuta-
         tions · Exact polynomial-time algorithm




1     Introduction
In the paper we study one problem of optimal summing of elements chosen in a
family of finite numerical sequences. The subject of this research is algorithmic
approximability of the problem. Our goal is to construct an efficient algorithm
with the guaranteed accuracy for this problem.
    The research is motivated by the absence of available algorithmic results for
the problem under consideration and the importance of the problem, in particu-
lar, in the distant monitoring of some objects (see next Section). Moreover, the
problem considered is a generalization of the Earlier Studied Optimization Prob-
lem (see Next Section). This is another reason for this problem to be studied.
    The paper has the following structure. Section 2 contains the problem formu-
lation, its interpretation and known results for the particular case of the problem.
     Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
    In: S. Belim et al. (eds.): OPTA-SCL 2018, Omsk, Russia, published at http://ceur-ws.org
182     A. Kel’manov et al.

In the Same Section, We Announce Our Result Obtained. Section 3 contains two
auxiliary problems and exact polynomial-time algorithms for them, which are
necessary to justify the proposed algorithm. Finally, in Section 4, we present and
justify our algorithm.


2     Problem Formulation and Related Problems
Everywhere below, R denotes the set of real numbers, R≥ denotes the set of non-
negative real numbers, N denotes the set of positive integer numbers, π denotes
the permutation on L elements, Π denotes the set of all possible permutations
on L elements, πj denotes the j-th permutation, and πj (i), j = 1, . . . , J, i =
1, . . . , L, denotes the i-th element of the j-th permutation.
     We consider the following
                                    n
Problem 1. Given a family S = {s1 (n), . . . , sL (n)} si (n) ∈ R≥ , i = 1, . . . , L, n =
             o
1, . . . , N of sequences and a positive integer J such that JL ≤ N . Find an in-
dex tuple (n1 , . . . , nJL ), where nk ∈ {1, . . . , N }, k = 1, . . . , JL, and a tuple
(π1 , . . . , πJ ) of J permutations on L elements such that
                                                    J X
                                                    X L
         S(n1 , . . . , nJL , π1 , . . . , πJ ) =             sπj (i) (n(j−1)L+i ) −→ min   (1)
                                                    j=1 i=1

subject to constraints

                              1 ≤ n1 < n2 < . . . < nJL ≤ N .                               (2)

   In the particular case of Problem 1, when πj (i) = i for all j = 1, . . . , J,
i = 1, . . . , L, we need to find an index tuple (n1 , . . . , nJL ) that minimizes the
value of
                                 XJ X L
                                        si (n(j−1)L+i )
                                      j=1 i=1

under the same constraints (2). In [3], an exact polynomial algorithm with run-
ning time O(JLN 2 ) was presented for this case. This algorithm implements a
dynamic programming scheme.
     In the optimization model inducing this particular case [3], each element
si (n), n = 1, . . . , N , i = 1, . . . , L, is a squared distance between an element
yn ∈ Rd in a given sample sequence (y1 , . . . , yN ) and a point xi ∈ Rd in a given
collection {x1 , . . . , xL } of L points (patterns). In other words, the i-th sequence
in the family F is the result of comparisons of the i-th point in the collection
{x1 , . . . , xL } with each element in the sample sequence (y1 , . . . , yN ).
     In the applied problem, the point xi corresponds to the i-th object and the
collection {x1 , . . . , xL } of points (patterns) corresponds to the collection of L
objects, while the sample sequence (y1 , . . . , yN ) corresponds to a time series
obtained as a monitoring (or measurement) result. In the family F, the i-th
                On a Problem of Choosing Elements in a Family of Sequences           183

sequence corresponds to the result of comparisons of the i-th object in the pattern
collection {x1 , . . . , xL } with each element of time ordered measurement result.
The applied problem can be interpreted (see [3]) as a noise-proof detection of the
given pattern collection {x1 , . . . , xL } of “pulses” repeated J times in the signal
(y1 , . . . , yN ). In the applied problem, all the “pulses” in the collection have the
same “duration” d and different forms, whereas in each collection repetition,
there are no “pulses” permutations. The interval nm − nm−1 , m = 2, . . . , JL,
between two consecutive “pulses” is not constant, but it is bounded in accordance
with the inequalities (2).
     Problem 1 is induced by a similar but more complex applied problem in
which the “pulses” (objects) in each repeated pattern collection can be arbitrary
reordered. This reordering causes the presence of permutations in the objective
function (1). The permutations correspond to some possible rearrangements of
objects.
     In this paper, we present a polynomial algorithm which finds an optimal
solution of Problem 1 in O(N 5 ) time.


3    Algorithm Foundation
To solve Problem 1 we need two auxiliary problems with exact polynomial al-
gorithms. In addition, we need two well-known problems: (1) Minimum weight
perfect matching in a complete bipartite graph, (2) Nearest neighbor search prob-
lem.
     The first auxiliary problem has the following formulation.
                                      n
Problem 2. Given a family F = {f1 (n), . . . , fL (n)} fi (n) ∈ R≥ , i = 1, . . . , L, n =
                                     o
a, . . . , b − 1; L ≤ b − a, a, b ∈ N of sequences. Find an index tuple (n1 , . . . , nL ),
where ni ∈ {a, . . . , b − 1}, i = 1, . . . , L, and a permutation π ∈ Π such that
                                                  L
                                                  X
                      F (n1 , . . . , nL , π) =         fπ(i) (ni ) → min ,          (3)
                                                  i=1

while the elements of (n1 , . . . , nL ) satisfy the following constraints

                                 a ≤ n1 < . . . < nL < b .

   Let G = (V ∪ U, E, w) be a complete bipartite graph with the vertex parts
V = {va , . . . , vb−1 } and U = {u1 , . . . , ub−a }, and let the edge weights be given
as                                
                                    fj (n), j = 1, . . . , L ,
                         wvn uj =                                                    (4)
                                    0,        j = L + 1, . . . , b − a ,
for every n = a, . . . , b − 1. In this graph, each vertex vn , n = a, . . . , b − 1,
in V corresponds to the index n in {a, . . . , b − 1}. The vertices u1 , . . . , uL in
U correspond to the indices in {1, . . . , L} of the sequences in the input family
F. The vertices uL+1 , . . . , ub−a are auxiliary ones. The weight of every edge
184     A. Kel’manov et al.

between these auxiliary vertices in U and the vertices in V is equal to zero, while
the weight of the edge between the vertex ul , l = 1, . . . , L, and the vertex vn ,
n = a, . . . , b − 1, is equal to the value of the n-th element in the l-th sequence in
the input family F.
    For every perfect matching P in G define a bijection r : {a, . . . , b − 1} −→
{1, . . . , b − a}, where {a, . . . , b − 1} and {1, . . . , b − a} are the sets of indices of
U and V respectively. Let W (P) be the weight of the perfect matching P =
{(vn , ur(n) ), n = a, . . . , b − 1}. The following lemma is true.
Lemma 1. Let P ∗ = {(vn , ur∗ (n) ), n = a, . . . , b − 1} be the minimum weight
perfect matching in G, where r∗ is the optimal bijection. Then in Problem 2, the
optimal tuple η ∗ (a, b) = (n∗1 , . . . , n∗L ), and the optimal permutation π ∗ (a, b) can
be found by the following formulas:
               n∗1 = min{n | r∗ (n) ∈ {1, . . . , L}, n ∈ {a, . . . , b − 1}} ,               (5)
 n∗i = min{n | r∗ (n) ∈ {1, . . . , L}, n ∈ {n∗i−1 + 1, . . . , b − 1}},    i = 2, . . . , L , (6)
and
                            π ∗ (a, b) = (r∗ (n∗1 ), . . . , r∗ (n∗L )) .                     (7)
                        ∗
The optimal value F (a, b) of the objective function (3) can be found as
                                    F ∗ (a, b) = W (P ∗ ) .                                   (8)
   To prove this lemma we establish the correspondence between an admissible
tuple and a permutation in Problem 2 and the perfect matching in the graph G.
Then we prove that the optimal value of the objective function (3) is equal to
the weight of the minimum weight perfect matching in G.
   Note that the minimum weight perfect matching in G can be found by a
well-known algorithm (see, for example [4]) in the polynomial time.
   Thus, we have the following algorithm for auxiliary Problem 2 in a step-by-
step form.
   Algorithm A1 .
   INPUT: a family F of sequences.
   Step 1. Construct G using (4), and find the minimum weight perfect match-
ing P ∗ in G using the well-known algorithm [4].
   Step 2. Find the optimal permutation π ∗ (a, b) and the optimal tuple η ∗ (a, b)
by (5), (6), and (7).
   Step 3. Compute W (P ∗ ) and F ∗ (a, b) by (8).
   OUTPUT: the permutation π ∗ (a, b), the tuple η ∗ (a, b) = (n∗1 , . . . , n∗L ), and
the value F ∗ (a, b).
   The following proposition is true.
Proposition 1. Algorithm A1 finds the optimal solution of Problem 2 in O((b−
a)3 ) time.
    The proof follows immediately from Lemma 1 and well-known time complex-
ity of the algorithm for the problem of searching the minimum weight perfect
matching [4].
    The second auxiliary problem has the following formulation.
                On a Problem of Choosing Elements in a Family of Sequences                185

Problem 3. Given positive integers L, J, and N such that JL ≤ N , a family
Q = {q(n, m) ∈ R≥ | 1 ≤ n < m ≤ N +1; L ≤ m−n ≤ N −(J −1)L, n, m ∈ N}.
Find an index tuple (µ1 , . . . , µJ+1 ), where µj ∈ {1, . . . , N + 1}, j = 1, . . . , J + 1,
such that
                                           XJ
                  Q(µ1 , . . . , µJ+1 ) =      q(µj , µj+1 ) → min,                        (9)
                                             j=1

under the following constraints

                            1 = µ1 < . . . < µJ+1 = N + 1 ,

                              µj+1 − µj ≥ L, j = 1, . . . , J .

   To solve this auxiliary problem we use an algorithm for the well-known
   Nearest neighbor search problem(see, for example, [2] and [1]). Given a family
H = {h(n, m) ∈ R≥ | 1 ≤ n ≤ m ≤ N + 1; n, m ∈ N} and a positive integer J.
Find an index tuple (ν1 , . . . , νJ+1 ), where νj ∈ {1, . . . , N + 1}, j = 1, . . . , J + 1,
such that
                                           XJ
                   H(ν1 , . . . , νJ+1 ) =     h(νj , νj+1 ) → min ,                      (10)
                                             j=1

subject to constraints

                          1 = ν1 ≤ ν2 ≤ . . . ≤ νJ+1 = N + 1 .

  As it is known [1], the Nearest neighbor search problem can be solved in
O(JN 2 ) time.
  The following lemma is true.

Lemma 2. Let (ν1∗ , . . . , νJ+1
                             ∗
                                 ) be the optimal solution of Nearest neighbor search
problem with

  h(n, m) =
       
        q(n, m), n = 1, . . . , N − L + 1,
       
                     m = n + L, . . . , min{N + 1, n + N − (J − 1)L} ,
       
       
       
          +∞,     n = 1, . . . , N + 1, m = n, . . . , min{n + L − 1, N + 1} ,           (11)
          +∞,     n = 1, . . . , 1 + (J − 1)L,
       
       
       
       
                     m = n + N − (J − 1)L + 1, . . . , N + 1 ,
       

and H ∗ be the optimal value of objective function (10). Then the components of
the optimal tuple (µ∗1 , . . . , µ∗J+1 ) and the optimal value Q∗ of the function (9)
can be found by the following formulas

                               µ∗i = νi∗ , i = 1, . . . , J + 1 ,                        (12)

                                         Q∗ = H ∗ .                                      (13)
186      A. Kel’manov et al.

    To prove this lemma we first verify that the optimal tuple in the Nearest
neighbor search problem is an admissible one in Problem 3. Then we show that
this tuple is the optimal solution of Problem 3.
    Thus, we have the following algorithm for auxiliary Problem 3.
    Algorithm A2 .
    INPUT: positive integers L, J, and a family Q.
    Step 1. Find the family H by (11). Compute the optimal value H ∗ and find
the optimal tuple (ν1∗ , . . . , νJ+1
                                  ∗
                                      ) of the Nearest neighbor search problem.
    Step 2. Compute the optimal value Q∗ of the objective function of Problem 3
using (13), and find the optimal tuple (µ∗1 , . . . , µ∗J+1 ) of indices using (12).
    OUTPUT: the tuple (µ∗1 , . . . , µ∗J+1 ) and the value Q∗ .
    The following proposition is true.

Proposition 2. Algorithm A2 finds an optimal solution of Problem 3 in O(JN 2 )
time.

   The validity of Proposition 2 follows from Lemma 2 and time complexity of
the algorithm for the Nearest neighbor search problem [2].


4     Exact Algorithm

Before presenting our algorithm, let us formulate the following lemma.
Lemma 3. Let N , J, and L be the instances of Problem 1. Let η ∗ (a, b), π ∗ (a, b)
be the optimal solution of Problem 2 for each a = 1, . . . , N − L + 1, and b =
a + L, . . . , min{N + 1, a + N − (J − 1)L} with

                    fi (j) = si (j), j = a, . . . , b − 1, i = 1, . . . , L ,

where si (j) is the sequence in the input family F of Problem 1, and F ∗ (a, b) be
the optimal value of the objective function (3).
    Let (µ∗1 , . . . , µ∗J+1 ) be the optimal solution of Problem 3 with

    q(n, m) = F ∗ (n, m), n = 1, . . . , N − L + 1,
                               m = n + L, . . . , min{N + 1, n + N − (J − 1)L} , (14)

in the family Q, and Q∗ be the optimal value of the function (9).
    Then the optimal tuples (n∗1 , . . . , n∗JL ) and (π1∗ , . . . , πL
                                                                      ∗
                                                                        ) of Problem 1 can be
found by the following formulas:

                 (n∗(j−1)L+1 , . . . , n∗jL ) = η ∗ (µ∗j , µ∗j+1 ), j = 1, . . . , J ,   (15)

                             πj∗ = π ∗ (µ∗j , µ∗j+1 ), j = 1, . . . , J ,                (16)
and the optimal value S ∗ of the function (1) can be found as

                                            S ∗ = Q∗ .                                   (17)
                On a Problem of Choosing Elements in a Family of Sequences             187

    We prove Lemma 3 using the results of Section 3.
    Finally, we can present the algorithm for Problem 1.
    Algorithm A.
    INPUT: a family S of sequences and a positive integer J.
    Step 1 (solving the family of Problems 2). For each a = 1, . . . , N −L+1, and
b = a + L, . . . , min{N + 1, a + N − (J − 1)L}, put fi (j) = si (j), j = a, . . . , b − 1,
i = 1, . . . , L. Using Algorithm A1 , find the optimal solution π ∗ (a, b), η ∗ (a, b) and
compute the optimal value F ∗ (a, b) of the objective function (3).
    Step 2 (solving Problem 3). Find the family Q in accordance with (14). Find
the optimal solution (µ∗1 , . . . , µ∗J+1 ) and compute the optimal value Q∗ of the
objective function (9) using Algorithm A2 .
    Step 3. Compute the value S ∗ of the objective function (1) by (17). Con-
struct (n∗1 , . . . , n∗JL ) and (π1∗ , . . . , πL
                                                 ∗
                                                   ) using (15) and (16).
                                     ∗
    OUTPUT: the tuples (n1 , . . . , n∗JL ), (π1∗ , . . . , πL   ∗
                                                                   ) and the value S ∗ .
    The following theorem is true.

Theorem 1. Algorithm A finds an optimal solution of Problem 1 in O(N 5 )
time.

The optimality of the solution follows from Proposition 1, Proposition 2, and
Lemma 3. The total estimate of the running time follows from the inequality
LJ ≤ N an from the time complexity of the three algorithmic steps. Namely,
Step 1, Step 2, and Step 3 require O((N − L + 1)2 (N − (J − 1)L)3 ), O(JN 2 ),
and O(JL) running times respectively.

Remark 1. If J and L are bounded by some constants, then the time complexity
of Algorithm A is O(N 5 ). In a particular case, when L = N we have J = 1 and
the running rime the algorithm is O(N 3 ).


5    Conclusion

In the paper we show that one optimization problem arising in applications con-
nected with the analysis of time series is solvable in a polynomial time. In our
opinion, the main line of the further research on the problem is to construct
faster approximation polynomial-time algorithm with a guaranteed accuracy. In
connection with the known Big data problem, a linear-time or a sublinear-time
approximation algorithm for the considered problem is of specific interest.

Acknowledgement. The study presented in Sections 3 was supported by the
Russian Science Foundation, project 16-11-10041. The study presented in Sec-
tions 2, 4 was supported by the Russian Foundation for Basic Research, projects
16-07-00168, and by the Russian Academy of Science (the Program of Basic
Research), project 0314-2016-0015, and by the Russian Ministry of Science and
Education under the 5-100 Excellence Programme.
188    A. Kel’manov et al.

References
1. Bellman, R.: Dynamic Programming. Princeton University Press (1957)
2. Beresnev, V.L., Gimadi, E.Kh., Dement’ev, V.T.: Extremal Standartization Prob-
   lems. Nauka, Novosibirsk (1978), (in Russian)
3. Kel’manov, A.V., Mikhailova, L.V., Khamidullin, S.A.: A posteriori joint detection
   of a recurring tuple of reference fragments in a quasi-periodic sequence. Comp.
   Math. and Math. Phys. 48(12), 2276–2288 (2008)
4. Papadimitrou, C., Steiglitz, K.: Combinatorial Optimization: Algorithms and Com-
   plexity. Prentice-Hall (1982)