<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Review of Neural Networks Application in UAV Routing Problems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maksym Ogurtsov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>V.M. Glushkov Institute of Cybernetics of National Academy of Sciences of Ukraine</institution>
          ,
          <addr-line>Akademika Glushkova Avenue, 40, Kyiv, 03187</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>45</fpage>
      <lpage>54</lpage>
      <abstract>
        <p>The paper reviews approaches and methods for using neural networks (NN) to solve combinatorial optimization (CO) problems that arise when operating unmanned aerial vehicles (UAVs), taking their specifics into account. It is shown that neural networks (with supervised learning, reinforcement learning and deep learning) are applicable to many types of CO routing problems (the travelling salesman problem, the VRP in its different versions, etc.) and to other UAV-related CO problems. NNs with reinforcement learning, as well as recurrent NNs with supervised learning, may be used either to solve the combinatorial optimization problems mentioned above directly, or to tune the parameters of combinatorial optimization algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Neural networks</kwd>
        <kwd>combinatorial optimization</kwd>
        <kwd>VRP</kwd>
        <kwd>unmanned aerial vehicles</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>able to perform most of the tasks that are currently carried out by manned aerial vehicles. In Ukraine,
given the difficult political and economic situation, a rational approach is to use
CO methods to increase the efficiency with which UAVs perform such tasks. Moreover, the rapid expansion of
UAV applications in recent years has created new classes of problems, such as route building
for swarms of UAVs performing the same task as a team.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Analysis of recent research and publications</title>
      <p>
        As mentioned earlier, significant progress has recently been made in applying ML, in
particular through deep learning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Ways of using ML to solve CO problems have been
considered by both foreign [
        <xref ref-type="bibr" rid="ref10 ref3 ref4 ref6 ref7 ref8 ref9">3, 4, 6, 7, 8, 9, 10</xref>
        ] and domestic researchers [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
        ]. In this work,
attention is paid mostly to discrete optimization problems: integer optimization with constraints,
i.e., optimization problems with integer or binary variables. Although not all such problems are
hard to solve (for example, the shortest-path problem), we will focus on NP-hard CO
problems. For these problems, the existence of an algorithm that finds an optimal solution in time
polynomial in the size of the input data is considered unlikely. In practice, however, CO
algorithms can find solutions to problems containing up to millions of variables, parameters, and
constraints.
      </p>
      <p>
        How can NP-hard problems be solved in reasonable time? Consider the classical
example of an NP-hard problem, the travelling salesman problem (TSP), which can be stated on
a graph where we look for a minimum-length tour visiting each node once and only once [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
Let us focus on the Euclidean TSP [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In this version, every TSP node has coordinates in
two-dimensional Euclidean space (or in a space with more than two dimensions), and the
objective function for the segment connecting two nodes is the Euclidean
distance between them. Although theoretically this version of the problem is as hard as the
general TSP, approximate solutions can be found efficiently in Euclidean space using
the structure of the graph [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Note that for UAVs, three-dimensional Euclidean space
should often be used instead of two-dimensional space to obtain valid solutions.
      </p>
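      <p>As a minimal illustration of such a heuristic, the greedy nearest-neighbour tour can be built with the same code in two or three dimensions (the function and the sample waypoints below are illustrative, not taken from the reviewed works):</p>

```python
import math

def nearest_neighbour_tour(points):
    """Greedy nearest-neighbour heuristic for the Euclidean TSP.

    Works unchanged for 2-D or 3-D coordinates (UAV routing often
    needs 3-D to obtain valid solutions)."""
    dist = lambda a, b: math.dist(a, b)  # Euclidean distance, any dimension
    unvisited = list(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    # close the cycle: return to the start node
    length = sum(dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
                 for i in range(len(tour)))
    return tour, length

# 3-D waypoints for a hypothetical UAV mission
pts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 1)]
tour, length = nearest_neighbour_tour(pts)
print(tour, round(length, 3))
```

      <p>As noted later in the paper, this nearest-node criterion is intuitive but in most cases far from optimal; it serves only as a baseline for the learned policies discussed below.</p>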
      <p>
        There is a large body of literature on heuristic algorithms, i.e., algorithms designed to compute
practically acceptable solutions of CO problems without guaranteeing optimality. These heuristic
methods are widely used in CO, and some of them are considered in this paper; the most
popular are listed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We should note two main motivations for using ML: approximation
and the creation of new policies.
      </p>
      <p>According to current research, the following types of NN are or were applicable to solving CO problems
(some of them are no longer used for this task):
• Hopfield networks
• Supervised learning
• Unsupervised learning
• Reinforcement learning
• Deep learning</p>
      <p>There are two popular ways to use ML for solving CO problems – with and without expert
knowledge. In this context, the decision-making function (DMF) is called a policy. Given all
available information (the state of the environment, provided this information describes the environment in
enough detail at the current step), the DMF issues an output (possibly stochastically) that determines the
next step of the algorithm. The policy is the function that we want to learn with the help of ML. Let us consider
how the two motivations above determine the parameters of learning.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Selection of previously unsolved parts of the overall problem</title>
      <p>
        The purpose of this work is to determine the applicability of NN to routing problems that
arise when using UAVs, i.e., to highlight promising areas of research on the use of NN to solve CO
problems. NN can be successfully applied to CO tasks that arise when using UAVs, primarily when
planning routes for UAVs and especially for UAV groups. This is because the task of constructing
routes and allocating the points to be visited among vehicles is a classic CO task – the Vehicle
Routing Problem (VRP) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Such a task constantly arises when planning operations that use
UAVs.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Novelty</title>
      <p>Although, as mentioned above, the use of NN for solving CO problems is actively studied all
over the world, their use for UAV-specific tasks has not been studied sufficiently to date.</p>
    </sec>
    <sec id="sec-5">
      <title>5. General scientific significance</title>
      <p>The use of NN to solve UAV-specific CO problems can improve the quality of the resulting
solutions, which will increase the efficiency of UAV use in general and potentially expand
the scope of UAV applications.</p>
    </sec>
    <sec id="sec-6">
      <title>6. NN types, applicable for CO problems</title>
      <p>NN can help improve CO algorithms in two ways. First, the researcher can use expert knowledge
of the CO algorithm to replace resource-intensive calculations with a fast approximation; NN can be
used to compute such approximations. The challenge is to explore the space of possible solutions and
thus be able to choose the policies that have led to the best of these solutions. Second,
when NN are used to solve CO problems directly, they can decompose the problem into
smaller, simpler subproblems.</p>
      <p>Consider the main types of NN that can be used to solve UAV-specific CO problems.</p>
    </sec>
    <sec id="sec-7">
      <title>6.1. Hopfield Network</title>
      <p>The first attempt to formalize the description of nerve-cell function for use in computational
algorithms was the model proposed in 1943 by McCulloch and Pitts. The McCulloch-Pitts model
became the starting point for building a simple unidirectional neural network called the perceptron.
Such a network was proposed and studied by Rosenblatt in the late 1950s and early 1960s. In
1960, Adaline (Adaptive Linear Neuron) systems were proposed. Later, the Hopfield neural network
was suggested.</p>
      <p>
        The Hopfield NN is fully connected, with a symmetric matrix of connections [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. If we modify
the objective function of this NN so that it is equivalent to, for example, the objective function of the
travelling salesman problem for a UAV, and use Lagrange multipliers to set penalties for violating the constraints
of the problem, then such a network can be used to solve many CO problems.
      </p>
      <p>Such a network can be used as auto-associative memory, as a filter, and to solve many CO
problems. Unlike many NN, which produce a response after a fixed number of cycles,
Hopfield networks run until equilibrium is reached, i.e., until the next state of the network equals
the previous one.</p>
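      <p>The run-until-equilibrium behaviour can be sketched as follows (a minimal auto-associative example with Hebbian weights; the weight construction and pattern are illustrative, not a CO objective):</p>

```python
import numpy as np

def hopfield_run(W, state, max_iters=100):
    """Iterate a Hopfield network until the state stops changing.

    W is a symmetric weight matrix with zero diagonal;
    state is a vector of +/-1 neuron outputs."""
    for _ in range(max_iters):
        # synchronous threshold update: y_i = sign(sum_j w_ij * y_j)
        new_state = np.where(W @ state >= 0, 1, -1)
        if np.array_equal(new_state, state):   # equilibrium reached
            return new_state
        state = new_state
    return state

# Hebbian weights storing the pattern (1, -1, 1)
p = np.array([1, -1, 1])
W = np.outer(p, p) - np.eye(3)
print(hopfield_run(W, np.array([1, 1, 1])))  # relaxes to the stored pattern
```

      <p>For a CO problem, W would instead be derived from the problem's objective function and constraint penalties, as described above.</p>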
      <p>Each neuron yi in the network can take one of two states (like the output of a neuron with a
threshold activation function):
yi = ±1.   (1)</p>
      <p>
        A limitation of Hopfield networks is that they are vulnerable to the problem of hyperparameters
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Although such neural networks have several promising properties, their practical application has
remained limited, in most cases to research work.
      </p>
    </sec>
    <sec id="sec-8">
      <title>6.2. Supervised learning</title>
      <p>In supervised learning, the input is a set of input-data (feature) / target pairs, and the
task is to find a function mapping inputs to outputs which, for each input,
produces an output that meets the specified constraints and has an objective-function value as close
as possible to the target. This is how the learning process may be defined. It may be carried out by solving
an optimization problem over a set of available functions. The loss function, i.e., the
measure of discrepancy between the output and target values, can be selected depending on the task
(for example, regression, classification, etc.) and the optimization method.</p>
      <p>Let X and Y, which have a joint probability distribution P, be variables for the input data and
the target. Let ℓ be a loss function to be minimized, and let {fθ | θ ∈ R^p} be a set of (in this case
parametric) ML models for CO. The supervised learning problem is defined as:
min over θ ∈ R^p of E(X,Y)~P [ℓ(Y, fθ(X))].   (2)</p>
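      <p>Problem (2) can be sketched in a few lines: a parametric model fθ, a loss ℓ, and stochastic gradient steps on samples drawn from P (the one-parameter linear model and toy data below are assumptions for illustration):</p>

```python
import random

# Minimal sketch of problem (2): minimize E[l(Y, f_theta(X))] over a
# parametric family, here f_theta(x) = theta * x with squared loss,
# fitted by stochastic gradient descent on sampled (X, Y) pairs.
random.seed(0)
data = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]  # samples with Y = 2X

theta, lr = 0.0, 0.05
for _ in range(500):
    x, y = random.choice(data)          # stochastic sample from P
    grad = 2 * (theta * x - y) * x      # d/dtheta of (theta*x - y)^2
    theta -= lr * grad                  # gradient step
print(round(theta, 2))  # approaches the true slope 2.0
```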
    </sec>
    <sec id="sec-9">
      <title>6.3. Unsupervised learning</title>
      <p>
        In this type of learning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] the NN is not given the targets of the problem it solves. Instead, it tries
to capture some characteristics of the overall distribution of the random variables observed while solving the
problem. Because unsupervised learning has so far received little attention in the CO context and its
immediate use seems difficult, this method will not be considered in detail.
      </p>
    </sec>
    <sec id="sec-10">
      <title>6.4. Reinforcement learning</title>
      <p>
        Supervised learning is not applicable to most CO problems because optimal target outputs are
not available. However, one can compare the quality of a set of solutions using an
evaluation data set for validation and derive a reward for the learning algorithm. It is therefore
possible to adopt the paradigm of neural reinforcement learning (RL) for finding
solutions of CO problems. Unfortunately, even when optimal solutions of a CO
problem are already known and can be used as training data for supervised learning, the
quality of the results is quite poor compared to RL, which receives a corresponding reward
for good solutions [
        <xref ref-type="bibr" rid="ref21 ref3 ref4">3, 4, 21</xref>
        ].
      </p>
      <p>Sometimes it may be complicated to build the reward function. Because RL is built on a dynamic
process, it naturally anticipates the states / transitions that lead to future rewards. However, such
a reward design is difficult and not always effective, as it does not allow learning until the
agent (using some kind of policy, or just randomness) finds a solution. In addition, when an
approximate policy is applied (for example, greedy search), RL is not guaranteed
to find the best possible solution and may fall into local optima.</p>
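      <p>Learning from a delayed reward can be sketched with tabular Q-learning on a toy chain environment (the MDP, its parameters and all names below are illustrative assumptions, not from the reviewed works):</p>

```python
import random

# Tiny tabular Q-learning sketch: the reward arrives only on entering
# the final state, so earlier actions must be credited via the
# discounted value estimates (delayed reward).
random.seed(1)
n_states, actions = 4, [0, 1]          # action 1 moves right, 0 stays
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(200):                   # training episodes
    s = 0
    while s < n_states - 1:
        # epsilon-greedy action selection
        a = random.choice(actions) if random.random() < eps else \
            max(actions, key=lambda a: Q[(s, a)])
        s2 = s + 1 if a == 1 else s
        r = 1.0 if s2 == n_states - 1 else 0.0   # delayed reward
        # temporal-difference update
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

greedy = [max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)]
print(greedy)  # the learned greedy policy moves right in every state
```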
    </sec>
    <sec id="sec-11">
      <title>6.5. Deep learning</title>
      <p>Deep learning is a method of constructing NN consisting of a large number of layers. In the
simplest NN architecture, the feedforward NN, also called a multilayer perceptron
(MLP), the input data are passed sequentially through several layers. Each layer applies an affine
transformation to its input vector, followed by a nonlinear scalar activation function. The
output of the current layer, known as the intermediate activation, is
passed to the next layer.</p>
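      <p>The forward pass just described (affine transform, then a nonlinearity, repeated per layer) can be sketched as follows; the layer sizes and ReLU choice are assumptions for illustration:</p>

```python
import numpy as np

# Sketch of an MLP forward pass: each layer applies an affine map
# W @ x + b followed by a nonlinear scalar function (here ReLU);
# the intermediate activation feeds the next layer.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 3)), np.zeros(4)),   # W1, b1: 3 -> 4
          (rng.standard_normal((2, 4)), np.zeros(2))]   # W2, b2: 4 -> 2

def forward(x, layers):
    for W, b in layers:
        x = np.maximum(W @ x + b, 0.0)  # affine transform + ReLU
    return x

out = forward(np.array([1.0, -0.5, 2.0]), layers)
print(out.shape)  # final activation of the 2-unit output layer
```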
      <p>The operations performed by different layers are independent and can be represented by different
matrices of coefficients. These are trainable, i.e., optimized with, for example, stochastic gradient
descent (SGD), an algorithm for minimizing the selected loss function. Stochasticity
arises from the finite batch of data used to estimate the loss at each SGD step.</p>
      <p>Deep neural networks (DNNs) may be difficult to optimize, so several methods and approaches
have been developed to improve optimization, often by changing architectural aspects of the
network. Because DNNs have a very large capacity, i.e., they can fit essentially any
data set, they are prone to overfitting, and are therefore usually heavily regularized.
Training such networks with SGD also regularizes them through gradient noise, making DNNs
fairly robust to overfitting. In addition, DNNs have many hyperparameters,
and various combinations of them are evaluated (an approach known as hyperparameter
optimization). DNNs depart from more classical ML methods and algorithms by processing all
available input data, for example all pixels of a digital image, during learning, while traditional ML
usually requires a limited number of features specific to the particular task.</p>
      <p>DNN researchers have created various methods for adapting to structured input data
so as to process inputs of varying dimension or size, for example variable-length
sequences.</p>
      <p>Consider the following modern techniques.</p>
    </sec>
    <sec id="sec-12">
      <title>6.5.1. Recurrent neural networks</title>
      <p>The first architecture applicable to CO problems is the recurrent neural network
(RNN). RNNs can work on sequential data by sharing parameters across different stages of the
sequence: the same NN block is applied at each step of the sequence, i.e.,
with the same architecture and parameter values at each time step. The specificity of such a
network is the presence of repeating layers. Each layer takes two sets of data as input: the first is
the activation vector that is the output of the previous layer; the second is the
activation vector from the previous stage of the sequence, as shown in Figure 1.</p>
      <p>On the left side of Figure 1, the black dot represents a one-step delay. A more detailed
depiction of this RNN layer with the previous and next ones is shown on the right. U, V and W are parameter
sets that are reused at each step of the RNN sequence.</p>
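      <p>A minimal sketch of this recurrence, with the parameter sets U, V and W shared across all steps (the dimensions and tanh nonlinearity are assumptions for illustration):</p>

```python
import numpy as np

# Sketch of an RNN: the same block, with shared parameter sets
# U (input -> hidden), W (previous hidden -> hidden) and
# V (hidden -> output), is applied at every step of the sequence.
rng = np.random.default_rng(0)
d_in, d_h = 2, 3
U = rng.standard_normal((d_h, d_in))
W = rng.standard_normal((d_h, d_h))
V = rng.standard_normal((1, d_h))

def rnn(sequence):
    h = np.zeros(d_h)                  # activation from the previous step
    outputs = []
    for x in sequence:                 # same U, V, W reused at each step
        h = np.tanh(U @ x + W @ h)     # combine current input and past state
        outputs.append(V @ h)
    return outputs

seq = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
print(len(rnn(seq)))  # one output per sequence step, any sequence length
```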
    </sec>
    <sec id="sec-13">
      <title>6.5.2. Attention mechanism</title>
      <p>
        One more important technique that makes the model invariant to input data size is the attention
mechanism [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It can be used for data processing where each point of the output data corresponds to
a set of input data. In this context, parameter sharing is used to account for the fact that different
input sets need not have the same dimension. The attention mechanism is used to query
information about all the elements in the set and combine this information for further processing in
the NN, as shown in Figure 2.
      </p>
      <p>In the classical attention mechanism, a query q is computed against a set of input values (vi)i. An affinity
function f is applied to pairs of queries and input values. If f includes
parameters, the attention mechanism is amenable to learning.</p>
      <p>The affinity function takes as input a query (any kind of information used to decide where to
focus attention) and a representation of an element of the input set (both will be used for the next
layer) and produces a scalar. This is done over the whole set, element by element, and
repeated for every other query. The resulting scalars are normalized (for example, using the
softmax function) and used to weight the elements of the set. This weighted sum,
in turn, can be used in the NN.</p>
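      <p>The steps above — affinity scores, softmax normalization, weighted sum — can be sketched as follows (the dot-product affinity is an assumption; any learnable scalar function f would do):</p>

```python
import numpy as np

# Sketch of the attention mechanism: an affinity function f scores
# the query q against every value v_i, the scores are normalized
# with softmax, and the values are combined as a weighted sum.
# The output size does not depend on how many elements the set has.
def attention(q, values):
    scores = np.array([q @ v for v in values])      # affinity f(q, v_i)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax normalization
    return sum(w * v for w, v in zip(weights, values))  # weighted sum

q = np.array([1.0, 0.0])
values = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
out = attention(q, values)
print(out.shape)  # same shape regardless of the number of values
```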
      <p>
        The attention mechanism can be used to construct graph neural networks, i.e., NNs capable of processing
graph-structured input data [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. In this architecture, each node attends to all of its nearest
neighbours. This routine is then repeated several times to collect further information about the nodes.
      </p>
      <p>
        Deep learning and RNN can be used for supervised, unsupervised or reinforcement learning [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-14">
      <title>7. Approaches of using NN for solving UAV CO problems</title>
      <p>There are several approaches to using NN for solving UAV CO problems. Let us start with the
RNN.</p>
    </sec>
    <sec id="sec-15">
      <title>7.1. Usage of RNN</title>
      <p>
        Neural CO can solve CO problems using RL. We will examine two possible approaches
based on policy gradients [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. The first approach, called pre-training with reinforcement (PTR), uses a
training set to optimize an RNN that parameterizes a stochastic policy over solutions,
using the expected reward as the objective. During testing, the learned policy is fixed, and conclusions
about solution quality are drawn using greedy sampling.
      </p>
      <p>
        The second way of using RL for solving CO is active search. It does not involve prior RNN training.
The search begins with a randomly chosen way of building a solution (for example, greedy search) and
then, from run to run, optimizes the RNN parameters on a test data set, also using the expected reward
as the objective, while keeping the best solution found during the search. The combination of PTR and
active search provided the best results in multiple practical comparisons [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
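      <p>The active-search loop can be sketched as follows: perturb the parameters of a solution-building policy on a single test instance, score each run by tour length (the negated reward), and keep the best solution found. The simple linear scoring policy below is an assumption for illustration, not the RNN policy of the cited work:</p>

```python
import math, random

# Illustrative active-search sketch (no pre-training): random local
# perturbations of a parameterized greedy policy, evaluated on one
# TSP test instance, while the best solution found is retained.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(12)]
d = lambda a, b: math.dist(a, b)

def build_tour(theta):
    # greedy policy: pick the unvisited node minimizing a weighted score
    tour, rest = [0], set(range(1, len(pts)))
    while rest:
        last = pts[tour[-1]]
        nxt = min(rest, key=lambda i: theta[0] * d(last, pts[i]) + theta[1] * pts[i][0])
        rest.remove(nxt)
        tour.append(nxt)
    return tour

def length(t):
    return sum(d(pts[t[i]], pts[t[(i + 1) % len(t)]]) for i in range(len(t)))

theta, best_tour = [1.0, 0.0], build_tour([1.0, 0.0])
for _ in range(200):                        # active search iterations
    cand = [theta[0] + random.gauss(0, 0.1), theta[1] + random.gauss(0, 0.1)]
    t = build_tour(cand)
    if length(t) < length(best_tour):       # keep the best solution so far
        best_tour, theta = t, cand
print(length(best_tour) <= length(build_tour([1.0, 0.0])))  # True
```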
    </sec>
    <sec id="sec-16">
      <title>7.2. Approximation and creation of new policies</title>
      <p>There are different ways to use ML to solve CO problems, with two main motivations:
approximation and the creation of new policies. There are also various ways to combine
ML and traditional algorithmic elements.</p>
      <p>As mentioned earlier, there are two main ways of using ML – with and without expert
knowledge. Both motivations are defined within the MDP states/actions described in the reinforcement
learning section.</p>
      <p>When ML is used to obtain approximate solutions, the policy is often taught by imitation
learning, through demonstrations, with data from an expert provided to the ML model. This process
is illustrated in Figure 3. With this approach, the NN is trained not to optimize performance,
but to blindly follow the actions of the expert.</p>
      <p>When using imitation learning, the policy learns how to repeat the success of the expert (trainer),
seeking to minimize the differences in the action space.</p>
      <p>When the goal is to create new policies, i.e., to optimize the solution-finding function
from scratch, the policy can be trained using reinforcement learning, as shown in Figure 4. Even if
we state the learning problem as an MDP solved through RL, this does not limit us to
basic RL algorithms (such as, for example, approximate dynamic programming or
policy gradients) for maximizing the obtainable reward. Alternative optimization
methods, such as greedy algorithms, branch-and-bound, genetic algorithms, metaheuristic approaches,
and direct/local search, can be applied as well to solve RL problems.</p>
      <p>In the RL case a reward signal is used and no expert participates in the training process; only
maximization of the expected sum of future rewards matters.</p>
      <p>It is important to understand that with imitation learning the NN learns from supervised
targets set by the expert for each step of the algorithm (without receiving any reward), whereas with
reinforcement learning the NN learns by receiving a reward (possibly with a delay) and using it as
feedback (without an expert). With imitation learning the NN is told what to
do, while with reinforcement learning the NN is guided by the reward and tries to
maximize it. However, it should be taken into account that the connection between these two
settings is a much more complex and broader issue than the information provided here.</p>
      <p>A comparison of the approaches to using NN for solving UAV CO problems is presented in
Table 1.</p>
    </sec>
    <sec id="sec-17">
      <title>7.3. The branch-and-bound method</title>
      <p>
        NN can be used to improve the quality of solutions by improving the current local solution with
cutting planes (bounds) [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. It is therefore proposed to estimate the improvement associated
with considering specific submatrices. This also affects the quality of the cuts that
can be derived from the same submatrices. In this context, supervised (imitation) offline learning is
used to approximate the optimal solution of the CO problem of choosing a submatrix, and
the trained NN can then be trained fairly quickly (on a small amount of training data) to find
potentially good submatrices – without the computational burden of solving NP-hard
tasks. The potentially best submatrices should lead to the potentially best cutting planes
and, according to [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], it is possible to train NNs specifically to add only the
most promising cutting planes.
      </p>
      <p>
        After analyzing a significant number of heuristics, it can be said that a well-functioning approach
is strong branching [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Before making any branching decision, the
strong branching algorithm looks one step ahead, tentatively evaluating branching on
many candidate variables and computing the current solution to estimate
the potential improvement; used together with
branch-and-bound, strong branching provides the best improvement of the output. However, even if
one does not examine all the variables but only approximates the current value of the solution, this
strategy still requires a large amount of computation time.
      </p>
      <p>
        Considering the TSP on the graph described above, it is easy to
build a greedy heuristic that constructs a route by consecutively selecting nodes among those not yet
visited, thereby building a permutation. Choosing the nearest node is an intuitive
criterion, but in most cases it is far from optimal. Therefore, a greedy heuristic structure is
built in which the node-selection policy is learned by a graph NN, a type of NN capable of
processing an input graph of any bounded size using a message-passing mechanism [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
To select each node, the NN receives a graph representation of the problem, augmented
with features marking the nodes the travelling salesman has already visited (so as not to visit them twice).
The NN then outputs the selected next node. This selected node is used as the training
example for the whole NN (based on RL), with the partial route length used as the reward.
Of course, there are much better algorithms that yield better solutions. However, it should
be noted that, given the above, NN can be used to discover new, potentially more effective
policies.
      </p>
      <p>
        On a two-dimensional Euclidean graph containing up to 100 vertices, RL for the travelling
salesman problem significantly outperforms the supervised learning approach [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and yields
results close to optimal, given enough training time. These results provide insights into how
NN can be used as a tool to solve CO problems, especially those for which heuristics are difficult to
develop.
      </p>
      <p>
        On the knapsack problem, which is NP-hard, the authors of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] used a pretrained NN with RL and active
search to solve all instances of the KNAP50, KNAP100 and KNAP200 problems to optimality.
      </p>
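      <p>For reference, small knapsack instances of this kind can be solved exactly by classical dynamic programming, which provides the optimum that learned heuristics are compared against (the sketch and test instance below are illustrative):</p>

```python
# Classical 0/1 knapsack dynamic programming.
def knapsack(values, weights, capacity):
    # best[c] = best total value achievable with total weight <= c
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # downward pass: each item used at most once
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # classic instance: optimum 220
```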
    </sec>
    <sec id="sec-18">
      <title>8. Conclusions</title>
      <p>Thus, the paper has reviewed approaches to using NN for CO problems. It has been determined
that the use of NN (including deep learning NN) is possible for UAV routing
problems in CO, so they can be used to solve problems that arise when planning UAV operations. At
the same time, recurrent NNs with normalized exponential (softmax) output functions and supervised
learning may be used for finding solutions of CO problems. It has been determined
that the most effective approach to date is the use of RL to solve such problems, and this method is
recommended.</p>
      <p>Prospects for the use of the research results: preparing a mathematical model of a specific CO
problem for UAVs, developing an algorithm for solving this problem using the proposed NN
architecture, and comparing the results with those of classical methods for solving such a CO
problem.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] A. Naseem, S. T. H. Shah, S. A. Khan, A. W. Malik, Decision support system for optimum decision making process in threat evaluation and weapon assignment: Current status, challenges and future directions, Annual Reviews in Control 43 (2017) 169-187. https://doi.org/10.1016/j.arcontrol.2017.03.003</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] L. F. Hulianytskyi, I. I. Riasna, Formalization and classification of combinatorial optimization problems, in: Optimization Methods and Applications (eds. Butenko S., Pardalos P. M., Shylo V.), Cham: Springer International Publishing AG (2017) 239-250. https://doi.org/10.1007/978-3-319-68640-0_11</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] Y. Bengio, A. Lodi, A. Prouvost, Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon, European Journal of Operational Research 290 (2) (2020) 405-421. https://doi.org/10.1016/j.ejor.2020.07.063</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] I. Bello et al., Neural combinatorial optimization with reinforcement learning, arXiv preprint arXiv:1611.09940 (2016).</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fortun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Schweber</surname>
          </string-name>
          <year>1993</year>
          <article-title>Scientists and the legacy of World War II: The case of operations research (OR)</article-title>
          <source>Social Studies of Science</source>
          <volume>23</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>595</fpage>
          -
          <lpage>642</lpage>
          https://doi.org/10.1177/030631293023004001
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xiuju</surname>
          </string-name>
          <year>2004</year>
          <article-title>A noisy chaotic neural network for solving combinatorial optimization problems: Stochastic chaotic simulated annealing</article-title>
          ,
          <source>IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)</source>
          <volume>34</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>2119</fpage>
          -
          <lpage>2125</lpage>
          https://doi.org/10.1109/tsmcb.2004.829778
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Khalil</surname>
          </string-name>
          et al.
          <year>2017</year>
          <article-title>Learning combinatorial optimization algorithms over graphs</article-title>
          <source>Advances in Neural Information Processing Systems</source>
          pp.
          <fpage>6348</fpage>
          -
          <lpage>6358</lpage>
          https://doi.org/10.5555/3295222.3295382
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Koltun</surname>
          </string-name>
          <year>2018</year>
          <article-title>Combinatorial optimization with graph convolutional networks and guided tree search</article-title>
          <source>Advances in Neural Information Processing Systems</source>
          , pp.
          <fpage>539</fpage>
          -
          <lpage>548</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          <year>2021</year>
          <article-title>Solve routing problems with a residual edge-graph attention neural network</article-title>
          ,
          <source>arXiv preprint arXiv:2105.02730</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Braekers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ramaekers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. V.</given-names>
            <surname>Nieuwenhuyse</surname>
          </string-name>
          <year>2016</year>
          <article-title>The Vehicle Routing Problem: State of the Art Classification and Review</article-title>
          <source>Computers &amp; Industrial Engineering</source>
          <volume>99</volume>
          , pp.
          <fpage>300</fpage>
          -
          <lpage>313</lpage>
          https://doi.org/10.1016/j.cie.2015.12.007
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V. P.</given-names>
            <surname>Horbulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Hulianytskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. V.</given-names>
            <surname>Sergienko</surname>
          </string-name>
          <year>2020</year>
          <article-title>Optimization of UAV Team Routes in the Presence of Alternative and Dynamic Depots</article-title>
          <source>Cybernetics and Systems Analysis</source>
          <volume>56</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>195</fpage>
          -
          <lpage>203</lpage>
          https://doi.org/10.1007/s10559-020-00235-8
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>O.</given-names>
            <surname>Turchyn</surname>
          </string-name>
          <year>2007</year>
          <article-title>Comparative analysis of metaheuristics solving combinatorial optimization problems</article-title>
          <source>2007 9th International Conference - The Experience of Designing and Applications of CAD Systems in Microelectronics</source>
          , IEEE. https://doi.org/10.1109/cadsm.2007.4297548
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oliinyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fedorchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stepanenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Goncharenko</surname>
          </string-name>
          <year>2019</year>
          <article-title>Combinatorial optimization problems solving based on evolutionary approach</article-title>
          ,
          <source>2019 IEEE 15th International Conference on the Experience of Designing and Application of CAD Systems (CADSM)</source>
          pp.
          <fpage>41</fpage>
          -
          <lpage>45</lpage>
          https://doi.org/10.1109/cadsm.2019.8779290
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>I. V.</given-names>
            <surname>Sergienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Hulianytskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Sirenko</surname>
          </string-name>
          <year>2009</year>
          <article-title>Classification of applied methods of combinatorial optimization</article-title>
          <source>Cybernetics and Systems Analysis</source>
          <volume>45</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>732</fpage>
          -
          <lpage>741</lpage>
          https://doi.org/10.1007/s10559-009-9134-0
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lin</surname>
          </string-name>
          <year>1965</year>
          <article-title>Computer solutions of the traveling salesman problem</article-title>
          <source>Bell System Technical Journal</source>
          <volume>44</volume>
          (
          <issue>10</issue>
          ), pp.
          <fpage>2245</fpage>
          -
          <lpage>2269</lpage>
          https://doi.org/10.1002/j.1538-7305.1965.tb04146.x
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moll</surname>
          </string-name>
          <year>1996</year>
          <article-title>The Euclidian Traveling Salesman Selection Problem</article-title>
          <source>Operations Research Proceedings 1995</source>
          , Springer, Berlin, Heidelberg, pp. 54-59. https://doi.org/10.1007/978-3-642-80117-4_10
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Larson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Odoni</surname>
          </string-name>
          <year>1981</year>
          <source>Urban Operations Research</source>
          , Monograph.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Naddef</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          <article-title>Rinaldi 2002 Branch-and-cut algorithms for the capacitated VRP The vehicle routing problem</article-title>
          .
          <source>Society for Industrial and Applied Mathematics</source>
          , pp.
          <fpage>53</fpage>
          -
          <lpage>84</lpage>
          https://doi.org/10.1137/1.9780898718515.ch3
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Hopfield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Tank</surname>
          </string-name>
          <year>1985</year>
          <article-title>"Neural" computation of decisions in optimization problems</article-title>
          <source>Biological Cybernetics</source>
          <volume>52</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>141</fpage>
          -
          <lpage>152</lpage>
          https://doi.org/10.1007/BF00339943
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Pawley</surname>
          </string-name>
          <year>1988</year>
          <article-title>On the stability of the travelling salesman problem algorithm of Hopfield and Tank</article-title>
          <source>Biological Cybernetics</source>
          <volume>58</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>63</fpage>
          -
          <lpage>70</lpage>
          https://doi.org/10.1007/bf00363956
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>O.</given-names>
            <surname>Alagoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Schaefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Roberts</surname>
          </string-name>
          <year>2010</year>
          <article-title>Markov decision processes: a tool for sequential decision making under uncertainty</article-title>
          <source>Medical Decision Making</source>
          ,
          <volume>30</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>474</fpage>
          -
          <lpage>483</lpage>
          https://doi.org/10.1177/0272989x09353194
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P.</given-names>
            <surname>Veličković</surname>
          </string-name>
          et al.
          <year>2018</year>
          <article-title>Graph attention networks</article-title>
          <source>International Conference on Learning Representations</source>
          https://doi.org/10.17863/CAM.48429
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Williams</surname>
          </string-name>
          <year>1992</year>
          <article-title>Simple statistical gradient-following algorithms for connectionist reinforcement learning</article-title>
          <source>Machine Learning</source>
          <volume>8</volume>
          (
          <issue>3-4</issue>
          ), pp.
          <fpage>229</fpage>
          -
          <lpage>256</lpage>
          https://doi.org/10.1007/BF00992696
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baltean-Lugojan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Misener</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bonami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tramontani</surname>
          </string-name>
          <year>2018</year>
          <article-title>Strong sparse cut selection via trained neural nets for quadratic semidefinite outer-approximations</article-title>
          , Technical report, Imperial College, London.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>D.</given-names>
            <surname>Applegate</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bixby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chvátal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Cook</surname>
          </string-name>
          <year>2007</year>
          <source>The Traveling Salesman Problem: A Computational Study</source>
          , Princeton University Press. https://doi.org/10.1515/9781400841103
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>