A heuristic algorithm for the double integrator traveling salesman problem Artem P. Baklanov Pavel A. Chentsov artem.baklanov@gmail.com chentsov.p@mail.ru Krasovskii Institute of Mathematics and Mechanics (Yekaterinburg, Russia) Ural Federal University (Yekaterinburg, Russia) Abstract We consider a version of the traveling salesperson problem for a double integrator: given a set of stationary points, find the fastest tour over this set of points. Control constraints are defined in terms of a convex compact set, velocity is constrained only at the points. To obtain an upper bound for the minimum travel time, we propose a simple transformation of the original problem into a generalized traveling salesman problem. This transformation is based on a discretization of sets of admissible visiting velocities. To solve time-optimal two-point problems, we use the duality of optimal control problems and convex programming. We compare numerically performance of the proposed algorithm and the existing algorithm STOP-GO-STOP. 1 Introduction The traveling salesman problem (TSP) plays an important role in mathematics and finds numerous applications in real-life problems. However, modern engineering challenges lead to dynamic versions of the classic TSP formulation. Namely, trajectories of ‘cities’ and ‘the traveling salesman’ must satisfy dynamics in terms of differential equations. The first formulation of the TSP with dynamics dealt with cities moving in a straight line with a constant velocity [1]. In what follows, we use the terminology proposed in papers [2, 3, 4, 7]. Namely, though distances between each pair of cities do depend on the given tour, we still use the notion of the TSP. Advances in the construction of unmanned aerial vehicles (UAVs), underwater autonomous vehicles, unmanned cars demand a study of the TSP with complex linear or nonlinear dynamics. In this regard, mathematicians and engineers use models, for example, the Dubins vehicle (a nonholonomic controlled object that can move along curves on plane with a limited radius of curvature and is unable to reverse). This model describes quite well the dynamics of UAVs, ships, submarines, etc. At the same time, due to the rapid development of technology and infrastructure, the TSP for such objects has a great practical importance; this leads to studies of the TSP for the Dubins vehicle (DTSP) [2, 3, 4]. Also the TSP has been studied for the Reeds-Shepp car and the differential drive robot [5]. As the first approximation, dynamics of some objects can be considered as a linear system. For example, it is possible to obtain a good approximation of spacecraft dynamics in deep space by using the double integrator model [6]. Thus, the TSP for controlled objects described by linear differential equations is of interest. A special Copyright c by the paper’s authors. Copying permitted for private and academic purposes. In: A.A. Makhnev, S.F. Pravdin (eds.): Proceedings of the International Youth School-conference «SoProMat-2017», Yekaterinburg, Russia, 06-Feb-2017, published at http://ceur-ws.org 203 case of such problem is the TSP for the double integrator (DITSP). Among possible applications leading to DITSP, we should note the task of collecting orbital debris. Note that the DITSP is poorly studied in comparison with the DTSP. In [7], the double integrator with bounded velocity and bounded control inputs was considered; asymptotic bounds on the time taken to complete the fastest tour in the worst case were obtained; a stochastic version of DITSP was considered (i.e., points are randomly sampled from uniform distribution) and algorithms performing within a constant factor of the optimal strategy with high probability was proposed. The TSP for affine in control systems was studied in [8]. For these systems, the lower bound on the minimum expected travel time of visiting of n uniformly distributed points was obtained; an algorithm generating a tour that system can trace in time that has the same asymptotic order in n as the lower bound was provided [8]. In this paper we do not consider stochastic settings and asymptotic bounds, but we propose a heuristic for the DITSP without any bounds on the velocity between cities. The complexity of the DITSP and DTSP is caused by ‘simultaneous’ optimization of the total travel time by a discrete vector parameter (tour) and by a continuous parameter (control). In this regard, they can be called a discrete-continuous problem with the NP-hard discrete part. Moreover, the following fact complicates these problems even more. In general, for a given tour it is not possible to partition the multipoint time-optimal control problem into a set of two-point subproblems without loss of the optimality. To avoid constructions of time-optimal trajectories for all possible tours in the DTSP, the necessary condition for tour optimality was proposed in [9]. Note that N.N. Krasovskii proposed a general approach for solving linear optimal control problems. The approach is based on the duality of optimal control problems and convex programming [10, 11]; for details see also [12, ch.6., §9–12]. The approach allows to replace an infinite-dimensional time-optimal control problem with the finite-dimensional optimization problem. Krasovskii’s approach was further generalized for multipoint time-optimal control problems; a multipoint analogue of Pontryagin’s minimum principle was obtained in [13, Theorem 3.3]. Unfortunately, it is still difficult to find the multipoint time-optimal control numerically for a large number of points. Moreover, even if we imagine that we have learned how to do this efficiently, the problem of searching through all possible tours remains. In this regard, we introduce a heuristic algorithm based on the theoretical results [10, 11, 13] 2 Problem statement By k·k we denote Euclidean norm in R2 . We introduce a mapping π12 by the rule (xi )i∈1,4 7−→ (xi )i∈1,2 : R4 → R2 and a mapping π34 by the rule (xi )i∈1,4 7−→ (xi )i∈3,4 : R4 → R2 . Consider a set of m stationary (distinct) nodes 4 (i) (i) to be visited. Each node is located at coordinates y (i) = (y1 , y2 ) ∈ R2 where i ∈ 1, m. Suppose that for each node i, i ∈ 1, m, a constraint on (visiting) velocity in terms of non-empty set Vi , Vi ⊂ R2 , is defined. The dynamics 4 of the object (‘the traveling salesman’) on the time interval I = [0, ∞[ are set by the differential equations ẋ1 = x3 , ẋ2 = x4 , ẋ3 = u1 , ẋ4 = u2 , x(0) = x0 ∈ R4 . (1) Thus, ‘the traveling salesman’ is defined as the double integrator model on plane. Here an open-loop control 4 u = (ui )i∈1,2 is a piecewise constant function I → R2 such that u(t) ∈ P ∀t ∈ I, P ⊂ R2 where P is a convex compact set; let U be the set of all such controls. We suppose that P is given such that the system is controllable. 4 Let T be the set of all vectors (ti )i∈1,m from I m such that 0 < t1 < t2 ... < tm . By φu = φu (t)t∈I we denote the trajectory of (1) generated by a control u ∈ U starting from the initial position x0 ; φu (t) ∈ R4 ∀t ∈ I. By B we denote the set of all bijections 1, m → 1, m. We say that the triplet (τ, u, b) from T × U × B is admissible, if ∀i ∈ 1, m     kπ12 φu (τb(i) ) − y (i) k = 0 & π34 φu (τb(i) ) ∈ Vi ;   we denote the set of all admissible triplets by A. We introduce the payoff function J : A → I by the rule: ∀(τ, u, b) ∈ A 4 J(τ, u, b) = τm . DITSP. Find the triplet (τ 0 , u0 , b0 ) ∈ A such that 4 J 0 = J(τ 0 , u0 , b0 ) = min J(τ, u, b). (τ,u,b)∈A 204 Obviously, J 0 corresponds to the minimum time required to visit the given set of points by the double integrator complying with velocity constraints at nodes. Recall that in DITSP we need to find not only the optimal permutation (the best tour), but also the time-optimal control to visit all points according to the best tour. The last ‘control subproblem’, in general, does not allow the partitioning to two-point time-optimal control problems without losing the optimality. This fact dramatically complicates the exact solution of DITSP. In this regard, it is rational to propose a heuristic algorithm for DITSP. 3 A heuristic algorithm The proposed algorithm reflects the classical concept in mathematics: if a continuous problem is hard to solve, then it is worth to try a discretization to obtain an approximate solution for the original problem. For the DTSP this concept led to an algorithm based on heading discretization [3]. The proposed heuristic algorithm for the DITSP consists of the following 4 steps: 1. We fix a parameter d of the discretization of admissible velocity sets Vi , i ∈ 1, m such that v[j](i) ∈ Vi ∀j ∈ 1, d. We complement the geometric coordinates of the nodes y (i) , i ∈ 1, m with velocity coordinates obtained after the discretization in the following way: each original node i is transformed into d nodes with the same geometric coordinate y (i) and distinct velocity coordinates v[j](i) , j ∈ 1, d. 2. We form a path weights matrix using the optimal time as the ‘travel cost’ between cities of different sets. This requires us to find time-optimal controls for all two-point problems corresponding to the ‘travel’ from a node (y (i) , v[j](i) ) to a node (y (k) , v[l](k) ) where i, k ∈ 1, m, i 6= k, and j, l ∈ 1, d. Now we apply Krasovskii’s 4 4 approach. We fix two different points in the phase space with coordinates a = (y (i) , v[j](i) ) ∈ R4 , b = (y (k) , v[l](k) ) ∈ R4 . Our goal is to find the ‘travel cost’ c(i,j),(k,l) from point i to point k. We assume that a given point is visited, if the trajectory of the system intersects ε-neighborhood of the point where ε, ε > 0, is small enough. We introduce the functional for the minimum time estimation (see [11, p. 131], [12, ch.6., §9–12]): ∀θ ∈ I h Z θ   i 4 σ(a, b, θ) = max3 l0 (Φ(θ, 0)a − b) + min l0 Φ(θ, t)B(t)u dt , (2) l∈S 0 u∈P       1 0 θ−t 0 0 0 (θ − t)u1 4 0 1 0 θ − t 4 0  , Φ(θ, t)B(t)u = (θ − t)u2  , 0   Φ(θ, t) =   , B(t) =   0 0 1 0  1 0  u1  0 0 0 1 0 1 u2 l0 Φ(θ, t)B(t)u = (l1 (θ − t) + l3 )u1 + (l2 (θ − t) + l4 )u2 ∀l ∈ S 3 ; (3) here S 3 is the unit 3-sphere. Next we find the minimal time instant θ∗ such that σ(a, b, θ∗ ) < ε; we apply iterative procedure for this. Let θ∗ ∈ I. If σ(a, b, θ∗ ) > ε, then on the next iteration we increase θ∗ . If σ(a, b, θ∗ ) ≤ 0, then on the next iteration we decrease θ∗ . If 0 < σ(a, b, θ∗ ) ≤ ε, then we stop searching and set c(i,j),(k,l) = θ∗ . Note that one can obtain the time-optimal control. If θ∗ is the optimal time, l0 ∈ S 3 maximizes σ(a, b, θ∗ ), then the optimal control u∗ (·) satisfies the specific version of Pontryagin’s minimum principle: Z θ∗   Z θ∗ min l00 Φ(θ∗ , t)B(t)u dt = l0 Φ(θ∗ , t)B(t)u∗ (t) dt. (4) 0 u∈P 0 3. We solve the (asymmetric) Generalized TSP (GTSP, also known as the set TSP) with the path weights matrix obtained in the previous step. The length of the shortest possible tour is an upper bound for the value J 0 of DITSP. The optimal permutation (tour) defines velocity of ‘the salesman’ at these points and the control that ensures the upper bound. To solve the GTSP, one may apply transformation [14] of the GTSP into the standard asymmetric TSP, use integer programming algorithm [15] or iterative heuristic algorithm [16]. 4. Repeat steps 1–3 applying heuristics for obtaining a better velocity discretization at nodes. For these purposes, one can use swarm optimization, genetic algorithms, etc. 205 3.1 Special cases Fix positive p, p > 0. Suppose now that P = {(u1 , u2 ) ∈ R2 |u1 |, |u2 | ≤ p}, then (2) becomes h σ(a, b, θ) = 2 2 max 2 2 (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 − l1 +l2 +l3 +l4 =1 Z θ i −p |(θ − t)l1 + l3 | + |(θ − t)l2 + l4 |dt . 0 2 If P = {(u1 , u2 ) ∈ R |u1 | + |u2 | ≤ p}, then we have the specific version of (2): h σ(a, b, θ) = 2 2 max 2 2 (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 − l1 +l2 +l3 +l4 =1 Z θ   i −p max |(θ − t)l1 + l3 |; |(θ − t)l2 + l4 | dt . 0 2 Finally, for P = {(u1 , u2 ) ∈ R u21 + u22 ≤ p2 }, we obtain the following version of (2): h σ(a, b, θ) = 2 2 max 2 2 (a1 + θa3 − b1 )l1 + (a2 + θa4 − b2 )l2 + (a3 − b3 )l3 + (a4 − b4 )l4 − l1 +l2 +l3 +l4 =1 Z θq 2 2 i −p (θ − t)l1 + l3 + (θ − t)l2 + l4 dt . 0 4 Comparison with STOP-GO-STOP algorithm Note that for the DITSP, the STOP-GO-STOP heuristic algorithm was proposed in [7]. The algorithm is described as follows: ‘The vehicle visits the points in the same order as in the optimal Euclidean TSP tour over the same set of points. Between any pair of points, the vehicle starts at the initial point at rest, follows the shortest-time path to reach the final point with zero velocity’ [7]. Clearly, if ∀i ∈ 1, m Vi = {(0, 0)}, then the proposed algorithm and STOP-GO-STOP lead to the same results. Let us compare the heuristics in the following simple example. Suppose that we have only two points to visit: y (1) = (40, 0), y (2) = (80, 0); the initial position √is x0 = (0, 0, 0, 0). Velocity constraints are as follows V1 = {v ∈ R2 | kvk ≤ 20}, V2 = {v ∈ R2 | kvk ≤ 20 2}; here P = {(u1 , u2 ) ∈ R2 |u1 |, |u2 | ≤ 5}. It is √ √ √ easy to see that STOP-GO-STOP computes tSGS = 4 2 + 4 2 = 8 2 as the travel time. We proceed to the proposed algorithm. Let d = 4; suppose that for the node y (1) we have the following admissible velocities after the discretization: v[1](1) = (−20, 0),√v[2](1) = (20, 0), v[3] √ (1) = (0, −20), v[4](1) √= (0, 20). For the node √ (2) (2) (2) (2) y we have the following: v[1] = (−20 2, 0), v[2] √ = (20 √2, 0), v[3] = (0, −20 2), v[4](2) = (0, 20 2). Using the heuristic algorithm, we get tHA = 4 + (4 2 − 4) = 4 2 as the travel time. Note that tHA equals the minimum travel time (value J 0 ) in DITSP for the example input. In this toy example, the results are the same for P = {(u1 , u2 ) ∈ R2 |u1 | + |u2 | ≤ 5} and P = {(u1 , u2 ) ∈ R2 u21 + u22 ≤ 25}. It easy to generalize this example√to show that there exists a class of problems such that STOP-GO-STOP provides solution with total time T∗ 2n where n is number of nodes and T∗ is the total time of the solution provided by the heuristic algorithm √ and T∗ is equal to J 0 in DITSP. The example illustrates special cases for n = 1 and n = 2; tSGS = tHA 2 · 2. 4.1 Numerical experiments In this section we describe a preliminary benchmark of the proposed heuristic and STOP-GO-STOP. To perform calculations, we use PC with Intel i5-2400 and 8Gb of RAM (Windows 7 64-bit). Microsoft Visual C++ 2013 was used to develop software. The main goal of the experiment is to study how control restrictions p influences the difference in the travel times calculated by the heuristics. Experiment setup. We generate 100 random instances of DITSP. In these instances start and finish points coincide with (0, 0, 0, 0) ∈ R4 . Geometric coordinates of all points are drawn uniformly from rectangle with coordinates (10,10) and (110,85). In every instance there are exactly 14 points to visit. Every point has 13 vectors of admissible visiting speed: (0, 0) and (4 sin α, 4 cos α) where α = π6 , 2π 12π 6 , ..., 6 . We fix some control 206 constraint p in {0.01 · 2n : n = 0, 1, 2, ..., 12} ⊂ [0.01, 40.96]. Then we apply heuristics to calculate routing time for all instances. These values are used to obtain the corresponding binary logarithm of average calculated time for the heuristic under constraint p. The results are depicted in Fig. 1. Note that the average calculated time is quite similar for p ≤ 0.16. Figure 1: The relation between the control constraint p and binary logarithm of the corresponding average calculated travel time for the proposed heuristic and STOP-GO-STOP. 5 Conclusion and future work The paper deals with the hard discrete-continuous problem that currently does not allow to get exact solutions. In this regard, the proposed heuristic can be useful for applications. Since Krasovskii’s approach was proposed for a general linear control problem, the heuristic can be used with the controlled object defined by a generic linear system. To keep things simple, we considered the double integrator model only. The proposed heuristic can be easily extended to any finite dimensional state space. The duality proposed by N.N. Krasovskii simplifies the solution of two-point time-optimal control problems for linear systems (comparing to optimal control methods associated with Pontryagin’s minimum principle [17]). Recall that based on the duality, an analogue of Pontryagin’s minimum principle for multipoint linear control problems was proposed in [13]; we will use this result to further develop of the proposed heuristic. We also plan to extend numerical experiments to get a better understanding of the heuristic performance. From theoretical prospective, it is of interest to study the question of convergence of results under the discretization procedure and to provide error bounds. Acknowledgements The reported study was supported by RFBR research project № 17-08-01385. References [1] G.G. Sikharulidze. One generalization of the traveling salesman problem. I. Autom. Remote Control, 32(8), part 2: 1265–1271, 1971. [2] K. Savla, E. Frazzoli, F. Bullo. Traveling salesperson problems for the Dubins vehicle. IEEE Trans. Automat. Contr., 53(6): 1378–1391, 2008. [3] Ny J. Le, E. Feron, E. Frazzoli. On the Dubins traveling salesman problem. IEEE Trans. Automat. Contr., 57(1): 265–270, 2012. 207 [4] X. Ma, D.A. Castanon. Receding horizon planning for Dubins traveling salesman problems. In: Proc. 45th IEEE Conf. on Decision and Control, 5453–5458, 2006. [5] J.J. Enright, E. Frazzoli E. The stochastic traveling salesman problem for the Reeds-Shepp car and the differential drive robot. In: Proc. IEEE 45th Conf. on Decision and Control, 3058–3064, 2006. [6] D.P. Scharf, F.Y. Hadaegh, S.R. Ploen. A survey of spacecraft formation flying guidance and control. Part II: control. In: Proc. of the American Control Conference, 2976–2985, 2004. [7] K. Savla, F. Bullo, E. Frazzoli. Traveling Salesperson Problems for a double integrator. IEEE Trans. Automat. Contr., 54(4): 788–793, 2009. [8] S. Itani, E. Frazzoli, M.A. Dahleh. Travelling Salesperson Problem for dynamic systems. In: Proc. of 17th IFAC World Congress, 13318–13323, 2008. [9] Y.I. Berdyshev. On choice of tour in nonlinear problem of sequential approach. Trudy Instituta Matematiki i Mekhaniki, 16(5): 8–15, 2010 (in Russian) = Ю.И. Бердышев. О выборе маршрута в одной нелинейной задаче последовательного сближения. Тр. Ин-та математики и механики УрО РАН,16(5): 8–15, 2010. [10] N.N. Krasovskii. Theory of Motion Control. Nauka, Moscow, 1968 (In Russian). = Н.Н. Красовский. Теория управления движением. Наука, Москва, 1968. [11] N.N. Krasovskii. Game Problems on the Encounter of Motions. Nauka, Moscow, 1970 (In Russian). = Н.Н. Красовский. Игровые задачи о встрече движений. Наука, Москва, 1970. [12] R. Gabasov, F. Kirillova. The Qualitative Theory of Optimal Processes. Marcel Dekker, Inc., New York, 1976. [13] Y.I. Berdyshev, A.G. Chentsov. Optimization of a weighted criterion function in one control problem. Cybernetics, 22(1): 67–74, 1986. [14] C.E. Noon, J.C. Bean. An efficient transformation of the generalized traveling salesman problem. Tech. Rep. 89-36, University of Michigan, Ann Arbor, 1989. [15] G. Laporte, Y. Nobert. Generalized traveling salesman problem through n sets of nodes: the asymmetrical case. Discrete Applied Mathematics, 18(2): 185–197, 1987. [16] A.A. Chentsov, A.G. Chentsov. On solution of the problem of successive round of sets by the ‘nonclosed’ travelling salesman problem. Automation and Remote Control, 11: 1832–1845, 2002. [17] L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze, E.F. Mishchenko. The Mathematical Theory of Optimal Processes. Interscience, New York, 1962. 208