A heuristic algorithm for the double integrator
                       traveling salesman problem

                               Artem P. Baklanov                          Pavel A. Chentsov
                           artem.baklanov@gmail.com                       chentsov.p@mail.ru
                    Krasovskii Institute of Mathematics and Mechanics (Yekaterinburg, Russia)
                                  Ural Federal University (Yekaterinburg, Russia)


                                                        Abstract

                       We consider a version of the traveling salesperson problem for a double
                       integrator: given a set of stationary points, find the fastest tour over
                       this set of points. Control constraints are defined in terms of a convex
                       compact set, velocity is constrained only at the points. To obtain
                       an upper bound for the minimum travel time, we propose a simple
                       transformation of the original problem into a generalized traveling
                       salesman problem. This transformation is based on a discretization of
                       sets of admissible visiting velocities. To solve time-optimal two-point
                       problems, we use the duality of optimal control problems and convex
                       programming. We compare numerically performance of the proposed
                       algorithm and the existing algorithm STOP-GO-STOP.


1    Introduction
The traveling salesman problem (TSP) plays an important role in mathematics and finds numerous applications
in real-life problems. However, modern engineering challenges lead to dynamic versions of the classic TSP
formulation. Namely, trajectories of ‘cities’ and ‘the traveling salesman’ must satisfy dynamics in terms of
differential equations. The first formulation of the TSP with dynamics dealt with cities moving in a straight
line with a constant velocity [1]. In what follows, we use the terminology proposed in papers [2, 3, 4, 7]. Namely,
though distances between each pair of cities do depend on the given tour, we still use the notion of the TSP.
   Advances in the construction of unmanned aerial vehicles (UAVs), underwater autonomous vehicles, unmanned
cars demand a study of the TSP with complex linear or nonlinear dynamics. In this regard, mathematicians and
engineers use models, for example, the Dubins vehicle (a nonholonomic controlled object that can move along
curves on plane with a limited radius of curvature and is unable to reverse). This model describes quite well the
dynamics of UAVs, ships, submarines, etc. At the same time, due to the rapid development of technology and
infrastructure, the TSP for such objects has a great practical importance; this leads to studies of the TSP for
the Dubins vehicle (DTSP) [2, 3, 4]. Also the TSP has been studied for the Reeds-Shepp car and the differential
drive robot [5].
   As the first approximation, dynamics of some objects can be considered as a linear system. For example, it
is possible to obtain a good approximation of spacecraft dynamics in deep space by using the double integrator
model [6]. Thus, the TSP for controlled objects described by linear differential equations is of interest. A special

Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
In: A.A. Makhnev, S.F. Pravdin (eds.): Proceedings of the International Youth School-conference «SoProMat-2017», Yekaterinburg,
Russia, 06-Feb-2017, published at http://ceur-ws.org


                                                             203
case of such problem is the TSP for the double integrator (DITSP). Among possible applications leading to
DITSP, we should note the task of collecting orbital debris.
   Note that the DITSP is poorly studied in comparison with the DTSP. In [7], the double integrator with
bounded velocity and bounded control inputs was considered; asymptotic bounds on the time taken to complete
the fastest tour in the worst case were obtained; a stochastic version of DITSP was considered (i.e., points are
randomly sampled from uniform distribution) and algorithms performing within a constant factor of the optimal
strategy with high probability was proposed. The TSP for affine in control systems was studied in [8]. For these
systems, the lower bound on the minimum expected travel time of visiting of n uniformly distributed points was
obtained; an algorithm generating a tour that system can trace in time that has the same asymptotic order in n
as the lower bound was provided [8]. In this paper we do not consider stochastic settings and asymptotic bounds,
but we propose a heuristic for the DITSP without any bounds on the velocity between cities.
   The complexity of the DITSP and DTSP is caused by ‘simultaneous’ optimization of the total travel time
by a discrete vector parameter (tour) and by a continuous parameter (control). In this regard, they can be
called a discrete-continuous problem with the NP-hard discrete part. Moreover, the following fact complicates
these problems even more. In general, for a given tour it is not possible to partition the multipoint time-optimal
control problem into a set of two-point subproblems without loss of the optimality. To avoid constructions of
time-optimal trajectories for all possible tours in the DTSP, the necessary condition for tour optimality was
proposed in [9].
   Note that N.N. Krasovskii proposed a general approach for solving linear optimal control problems. The
approach is based on the duality of optimal control problems and convex programming [10, 11]; for details
see also [12, ch.6., §9–12]. The approach allows to replace an infinite-dimensional time-optimal control problem
with the finite-dimensional optimization problem. Krasovskii’s approach was further generalized for multipoint
time-optimal control problems; a multipoint analogue of Pontryagin’s minimum principle was obtained in [13,
Theorem 3.3]. Unfortunately, it is still difficult to find the multipoint time-optimal control numerically for a large
number of points. Moreover, even if we imagine that we have learned how to do this efficiently, the problem of
searching through all possible tours remains. In this regard, we introduce a heuristic algorithm based on the
theoretical results [10, 11, 13]

2    Problem statement
By k·k we denote Euclidean norm in R2 . We introduce a mapping π12 by the rule (xi )i∈1,4 7−→ (xi )i∈1,2 : R4 → R2
and a mapping π34 by the rule (xi )i∈1,4 7−→ (xi )i∈3,4 : R4 → R2 . Consider a set of m stationary (distinct) nodes
                                                              4        (i)       (i)
to be visited. Each node is located at coordinates y (i) = (y1 , y2 ) ∈ R2 where i ∈ 1, m. Suppose that for each
node i, i ∈ 1, m, a constraint on (visiting) velocity in terms of non-empty set Vi , Vi ⊂ R2 , is defined. The dynamics
                                                                             4
of the object (‘the traveling salesman’) on the time interval I = [0, ∞[ are set by the differential equations

                               ẋ1 = x3 , ẋ2 = x4 , ẋ3 = u1 , ẋ4 = u2 , x(0) = x0 ∈ R4 .                        (1)

Thus, ‘the traveling salesman’ is defined as the double integrator model on plane. Here an open-loop control
  4
u = (ui )i∈1,2 is a piecewise constant function I → R2 such that u(t) ∈ P ∀t ∈ I, P ⊂ R2 where P is a convex
compact set; let U be the set of all such controls. We suppose that P is given such that the system is controllable.
                                                                                              4
   Let T be the set of all vectors (ti )i∈1,m from I m such that 0 < t1 < t2 ... < tm . By φu = φu (t)t∈I we denote
the trajectory of (1) generated by a control u ∈ U starting from the initial position x0 ; φu (t) ∈ R4 ∀t ∈ I.
   By B we denote the set of all bijections 1, m → 1, m. We say that the triplet (τ, u, b) from T × U × B is
admissible, if ∀i ∈ 1, m                                                           
                                 kπ12 φu (τb(i) ) − y (i) k = 0 & π34 φu (τb(i) ) ∈ Vi ;
                                                                                

we denote the set of all admissible triplets by A.
  We introduce the payoff function J : A → I by the rule: ∀(τ, u, b) ∈ A
                                                                   4
                                                     J(τ, u, b) = τm .

    DITSP. Find the triplet (τ 0 , u0 , b0 ) ∈ A such that
                                           4
                                       J 0 = J(τ 0 , u0 , b0 ) =       min         J(τ, u, b).
                                                                   (τ,u,b)∈A


                                                             204
   Obviously, J 0 corresponds to the minimum time required to visit the given set of points by the double
integrator complying with velocity constraints at nodes. Recall that in DITSP we need to find not only the
optimal permutation (the best tour), but also the time-optimal control to visit all points according to the best
tour. The last ‘control subproblem’, in general, does not allow the partitioning to two-point time-optimal control
problems without losing the optimality. This fact dramatically complicates the exact solution of DITSP. In this
regard, it is rational to propose a heuristic algorithm for DITSP.

3      A heuristic algorithm
The proposed algorithm reflects the classical concept in mathematics: if a continuous problem is hard to solve,
then it is worth to try a discretization to obtain an approximate solution for the original problem. For the DTSP
this concept led to an algorithm based on heading discretization [3]. The proposed heuristic algorithm for the
DITSP consists of the following 4 steps:

    1. We fix a parameter d of the discretization of admissible velocity sets Vi , i ∈ 1, m such that v[j](i) ∈ Vi ∀j ∈
       1, d. We complement the geometric coordinates of the nodes y (i) , i ∈ 1, m with velocity coordinates obtained
       after the discretization in the following way: each original node i is transformed into d nodes with the same
       geometric coordinate y (i) and distinct velocity coordinates v[j](i) , j ∈ 1, d.

    2. We form a path weights matrix using the optimal time as the ‘travel cost’ between cities of different sets.
       This requires us to find time-optimal controls for all two-point problems corresponding to the ‘travel’ from a
       node (y (i) , v[j](i) ) to a node (y (k) , v[l](k) ) where i, k ∈ 1, m, i 6= k, and j, l ∈ 1, d. Now we apply Krasovskii’s
                                                                                                    4                          4
       approach. We fix two different points in the phase space with coordinates a = (y (i) , v[j](i) ) ∈ R4 , b =
       (y (k) , v[l](k) ) ∈ R4 . Our goal is to find the ‘travel cost’ c(i,j),(k,l) from point i to point k. We assume that a
       given point is visited, if the trajectory of the system intersects ε-neighborhood of the point where ε, ε > 0,
       is small enough. We introduce the functional for the minimum time estimation (see [11, p. 131], [12, ch.6.,
       §9–12]): ∀θ ∈ I
                                                    h                    Z θ                       i
                                              4
                                   σ(a, b, θ) = max3 l0 (Φ(θ, 0)a − b) +      min l0 Φ(θ, t)B(t)u dt ,                    (2)
                                              l∈S                         0 u∈P
                                                                                       
                      1   0   θ−t   0              0        0                     (θ − t)u1
               4 0       1    0  θ − t        4 0
                                                              , Φ(θ, t)B(t)u = (θ − t)u2  ,
                                                            0                             
       Φ(θ, t) =                       , B(t) = 
                     0   0    1    0            1        0                      u1 
                      0   0    0    1              0        1                        u2

                                 l0 Φ(θ, t)B(t)u = (l1 (θ − t) + l3 )u1 + (l2 (θ − t) + l4 )u2 ∀l ∈ S 3 ;                    (3)

       here S 3 is the unit 3-sphere. Next we find the minimal time instant θ∗ such that σ(a, b, θ∗ ) < ε; we apply
       iterative procedure for this. Let θ∗ ∈ I. If σ(a, b, θ∗ ) > ε, then on the next iteration we increase θ∗ . If
       σ(a, b, θ∗ ) ≤ 0, then on the next iteration we decrease θ∗ . If 0 < σ(a, b, θ∗ ) ≤ ε, then we stop searching and
       set c(i,j),(k,l) = θ∗ . Note that one can obtain the time-optimal control. If θ∗ is the optimal time, l0 ∈ S 3
       maximizes σ(a, b, θ∗ ), then the optimal control u∗ (·) satisfies the specific version of Pontryagin’s minimum
       principle:
                                   Z θ∗                           Z θ∗
                                        min l00 Φ(θ∗ , t)B(t)u dt =      l0 Φ(θ∗ , t)B(t)u∗ (t) dt.                   (4)
                                    0   u∈P                              0

    3. We solve the (asymmetric) Generalized TSP (GTSP, also known as the set TSP) with the path weights
       matrix obtained in the previous step. The length of the shortest possible tour is an upper bound for the
       value J 0 of DITSP. The optimal permutation (tour) defines velocity of ‘the salesman’ at these points and
       the control that ensures the upper bound.
       To solve the GTSP, one may apply transformation [14] of the GTSP into the standard asymmetric TSP, use
       integer programming algorithm [15] or iterative heuristic algorithm [16].

    4. Repeat steps 1–3 applying heuristics for obtaining a better velocity discretization at nodes. For these
       purposes, one can use swarm optimization, genetic algorithms, etc.


                                                                205
3.1   Special cases
Fix positive p, p > 0. Suppose now that P = {(u1 , u2 ) ∈ R2 |u1 |, |u2 | ≤ p}, then (2) becomes
                                                 h
           σ(a, b, θ) = 2        2
                                  max
                                   2  2
                                                  (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 −
                             l1 +l2 +l3 +l4 =1

                                                          Z θ                                          i
                                                 −p             |(θ − t)l1 + l3 | + |(θ − t)l2 + l4 |dt .
                                                           0
                         2
If P = {(u1 , u2 ) ∈ R       |u1 | + |u2 | ≤ p}, then we have the specific version of (2):
                                                 h
           σ(a, b, θ) = 2        2
                                  max
                                   2  2
                                                  (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 −
                             l1 +l2 +l3 +l4 =1

                                               Z θ
                                                                                                            i
                                          −p              max        |(θ − t)l1 + l3 |; |(θ − t)l2 + l4 |    dt .
                                                  0
                                      2
Finally, for P = {(u1 , u2 ) ∈ R          u21 + u22 ≤ p2 }, we obtain the following version of (2):
                                                 h
           σ(a, b, θ) = 2        2
                                  max
                                   2  2
                                                  (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 −
                             l1 +l2 +l3 +l4 =1

                                                 Z θq
                                                                               2               2 i
                                          −p                     (θ − t)l1 + l3 + (θ − t)l2 + l4 dt .
                                                      0

4     Comparison with STOP-GO-STOP algorithm
Note that for the DITSP, the STOP-GO-STOP heuristic algorithm was proposed in [7]. The algorithm is described
as follows: ‘The vehicle visits the points in the same order as in the optimal Euclidean TSP tour over the same set
of points. Between any pair of points, the vehicle starts at the initial point at rest, follows the shortest-time path
to reach the final point with zero velocity’ [7]. Clearly, if ∀i ∈ 1, m Vi = {(0, 0)}, then the proposed algorithm
and STOP-GO-STOP lead to the same results.
    Let us compare the heuristics in the following simple example. Suppose that we have only two points to
visit: y (1) = (40, 0), y (2) = (80, 0); the initial position √is x0 = (0, 0, 0, 0). Velocity constraints are as follows
V1 = {v ∈ R2 | kvk ≤ 20}, V2 = {v ∈ R2 | kvk ≤ 20 2}; here P = {(u1 , u2 ) ∈ R2 |u1 |, |u2 | ≤ 5}. It is
                                                               √       √         √
easy to see that STOP-GO-STOP computes tSGS = 4 2 + 4 2 = 8 2 as the travel time. We proceed to
the proposed algorithm. Let d = 4; suppose that for the node y (1) we have the following admissible velocities
after the discretization: v[1](1) = (−20, 0),√v[2](1) = (20, 0), v[3] √
                                                                        (1)
                                                                            = (0, −20), v[4](1) √= (0, 20). For the node
                                                                                                                      √
  (2)                              (2)                     (2)                     (2)
y     we have the following: v[1] = (−20 2, 0), v[2]    √       =  (20 √2, 0), v[3]    = (0, −20  2), v[4](2) = (0, 20 2).
Using the heuristic algorithm, we get tHA = 4 + (4 2 − 4) = 4 2 as the travel time. Note that tHA equals the
minimum travel time (value J 0 ) in DITSP for the example input. In this toy example, the results are the same
for P = {(u1 , u2 ) ∈ R2 |u1 | + |u2 | ≤ 5} and P = {(u1 , u2 ) ∈ R2 u21 + u22 ≤ 25}.
    It easy to generalize this example√to show that there exists a class of problems such that STOP-GO-STOP
provides solution with total time T∗ 2n where n is number of nodes and T∗ is the total time of the solution
provided by the heuristic algorithm
                                 √       and T∗ is equal to J 0 in DITSP. The example illustrates special cases for
n = 1 and n = 2; tSGS = tHA 2 · 2.

4.1   Numerical experiments
In this section we describe a preliminary benchmark of the proposed heuristic and STOP-GO-STOP. To perform
calculations, we use PC with Intel i5-2400 and 8Gb of RAM (Windows 7 64-bit). Microsoft Visual C++ 2013
was used to develop software. The main goal of the experiment is to study how control restrictions p influences
the difference in the travel times calculated by the heuristics.
   Experiment setup. We generate 100 random instances of DITSP. In these instances start and finish points
coincide with (0, 0, 0, 0) ∈ R4 . Geometric coordinates of all points are drawn uniformly from rectangle with
coordinates (10,10) and (110,85). In every instance there are exactly 14 points to visit. Every point has 13
vectors of admissible visiting speed: (0, 0) and (4 sin α, 4 cos α) where α = π6 , 2π       12π
                                                                                    6 , ..., 6 . We fix some control


                                                                                206
constraint p in {0.01 · 2n : n = 0, 1, 2, ..., 12} ⊂ [0.01, 40.96]. Then we apply heuristics to calculate routing time
for all instances. These values are used to obtain the corresponding binary logarithm of average calculated time
for the heuristic under constraint p. The results are depicted in Fig. 1. Note that the average calculated time is
quite similar for p ≤ 0.16.


Figure 1: The relation between the control constraint p and binary logarithm of the corresponding average
calculated travel time for the proposed heuristic and STOP-GO-STOP.


5   Conclusion and future work
The paper deals with the hard discrete-continuous problem that currently does not allow to get exact solutions.
In this regard, the proposed heuristic can be useful for applications. Since Krasovskii’s approach was proposed for
a general linear control problem, the heuristic can be used with the controlled object defined by a generic linear
system. To keep things simple, we considered the double integrator model only. The proposed heuristic can
be easily extended to any finite dimensional state space. The duality proposed by N.N. Krasovskii simplifies
the solution of two-point time-optimal control problems for linear systems (comparing to optimal control
methods associated with Pontryagin’s minimum principle [17]). Recall that based on the duality, an analogue
of Pontryagin’s minimum principle for multipoint linear control problems was proposed in [13]; we will use this
result to further develop of the proposed heuristic. We also plan to extend numerical experiments to get a better
understanding of the heuristic performance. From theoretical prospective, it is of interest to study the question
of convergence of results under the discretization procedure and to provide error bounds.

Acknowledgements
The reported study was supported by RFBR research project № 17-08-01385.

References
[1] G.G. Sikharulidze. One generalization of the traveling salesman problem. I. Autom. Remote Control, 32(8),
    part 2: 1265–1271, 1971.

[2] K. Savla, E. Frazzoli, F. Bullo. Traveling salesperson problems for the Dubins vehicle. IEEE Trans. Automat.
    Contr., 53(6): 1378–1391, 2008.

[3] Ny J. Le, E. Feron, E. Frazzoli. On the Dubins traveling salesman problem. IEEE Trans. Automat. Contr.,
    57(1): 265–270, 2012.


                                                         207
[4] X. Ma, D.A. Castanon. Receding horizon planning for Dubins traveling salesman problems. In: Proc. 45th
    IEEE Conf. on Decision and Control, 5453–5458, 2006.

[5] J.J. Enright, E. Frazzoli E. The stochastic traveling salesman problem for the Reeds-Shepp car and the
    differential drive robot. In: Proc. IEEE 45th Conf. on Decision and Control, 3058–3064, 2006.
[6] D.P. Scharf, F.Y. Hadaegh, S.R. Ploen. A survey of spacecraft formation flying guidance and control. Part
    II: control. In: Proc. of the American Control Conference, 2976–2985, 2004.
[7] K. Savla, F. Bullo, E. Frazzoli. Traveling Salesperson Problems for a double integrator. IEEE Trans. Automat.
    Contr., 54(4): 788–793, 2009.
[8] S. Itani, E. Frazzoli, M.A. Dahleh. Travelling Salesperson Problem for dynamic systems. In: Proc. of 17th
    IFAC World Congress, 13318–13323, 2008.
[9] Y.I. Berdyshev. On choice of tour in nonlinear problem of sequential approach. Trudy Instituta Matematiki
    i Mekhaniki, 16(5): 8–15, 2010 (in Russian) = Ю.И. Бердышев. О выборе маршрута в одной нелинейной
    задаче последовательного сближения. Тр. Ин-та математики и механики УрО РАН,16(5): 8–15, 2010.
[10] N.N. Krasovskii. Theory of Motion Control. Nauka, Moscow, 1968 (In Russian). = Н.Н. Красовский.
    Теория управления движением. Наука, Москва, 1968.
[11] N.N. Krasovskii. Game Problems on the Encounter of Motions. Nauka, Moscow, 1970 (In Russian). = Н.Н.
    Красовский. Игровые задачи о встрече движений. Наука, Москва, 1970.
[12] R. Gabasov, F. Kirillova. The Qualitative Theory of Optimal Processes. Marcel Dekker, Inc., New York,
    1976.
[13] Y.I. Berdyshev, A.G. Chentsov. Optimization of a weighted criterion function in one control problem.
    Cybernetics, 22(1): 67–74, 1986.
[14] C.E. Noon, J.C. Bean. An efficient transformation of the generalized traveling salesman problem. Tech.
    Rep. 89-36, University of Michigan, Ann Arbor, 1989.
[15] G. Laporte, Y. Nobert. Generalized traveling salesman problem through n sets of nodes: the asymmetrical
    case. Discrete Applied Mathematics, 18(2): 185–197, 1987.

[16] A.A. Chentsov, A.G. Chentsov. On solution of the problem of successive round of sets by the ‘nonclosed’
    travelling salesman problem. Automation and Remote Control, 11: 1832–1845, 2002.
[17] L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze, E.F. Mishchenko. The Mathematical Theory of
    Optimal Processes. Interscience, New York, 1962.


                                                      208