<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Machine Learning Research 6 (2005) 363</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">2374-8486</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oksana Pichugina</string-name>
          <email>oksanapichugina1@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olha Matsiy</string-name>
          <email>olga.matsiy@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National Automobile and Highway University</institution>
          ,
          <addr-line>Str.Yaroslava Mudrogo 25, 61002 Kharkiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>National Aerospace University "Kharkiv Aviation Institute"</institution>
          ,
          <addr-line>17 Chkalova Street, 61070 Kharkiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <volume>904</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Hybrid approaches (Approach 1 and Approach 2) combining Supervised Learning with Linear Combinatorial Optimization are offered to solve the problem of minimizing the total delivery expenditure of supplying customers with batches of rocks. Approach 1 applies multi-class classification to assess the customers' delivery expenditures, while Approach 2 solves an additional regression prediction problem. Numeric characteristics utilizing all available information are proposed, allowing the effectiveness of applying the approaches to be evaluated. The presented way of dealing with classification problems under additional constraints can be extended to a wide class of real-world problems.</p>
      </abstract>
      <kwd-group>
        <kwd>Machine Learning</kwd>
        <kwd>Supervised Learning</kwd>
        <kwd>Combinatorial Optimization</kwd>
        <kwd>Linear Assignment Problem</kwd>
        <kwd>classification</kwd>
        <kwd>regression</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Combinatorial optimization problems (CO problems, COPs) arise in almost every area of practice where decisions are made based on the choice of indivisible objects [1]. Since the absolute majority of real COPs are NP-hard, difficulties in their exact solution arise already in the development of pseudopolynomial algorithms [2, 3, 4, 5, 6, 7, 8, 9]. On the other hand, the presence of additional constraints does not allow developing effective heuristics. Conditionally, the classical problems of Machine Learning, such as classification and clustering, can be included in the class of COPs. At the same time, the main Supervised and Unsupervised Learning methods are heuristic. This is due to the fact that, in these problems, there are no restrictions preventing one from forming an arbitrarily large sample of feasible solutions and applying heuristics to them. Indeed, each sample can be assigned to any class in a classification problem and to an arbitrary cluster in a clustering problem. However, passing from the classical formulations to a practical problem, we invariably face natural constraints imposed on single elements or their combinations. These can be restrictions on the number of elements in some classes, or constraints from above and below on the distance between elements of the same or different classes, etc. In addition, some extra information is often known, such as misclassification penalties and correct-classification rewards. Evidently, it is highly desirable to take it into account in the solution, but this is not always possible when using classical methods of Classification and Cluster Analysis. Meanwhile, having formalized such COPs as discrete or continuous optimization problems, there appears a real opportunity to solve them exactly, or with an accuracy assessment, in a reasonable time by suitable methods. In this paper, we consider a practical problem that can be effectively solved by combining Machine Learning methods with classical CO methods.</p>
      <sec id="sec-1-1">
        <title>Consider the following practical problem.</title>
        <p>2021 Copyright for this paper by its authors.</p>
        <p>Problem statement. Rocks are mined in several quarries and supplied to consumers in batches. It is necessary to develop a delivery plan for a certain set of such batches in order to fulfil the customers' demand and minimize the total delivery expenditure, consisting of delivery costs, expenses for the price of rocks exceeding the customer requests, and penalties for supplying rock batches of lower quality than provided for by the contract with the consumers.</p>
        <p>This problem can be thought of as a classification problem, where the samples are the batches and the classes are the consumers. On the one hand, additional constraints are imposed on the delivery volume to particular consumers; on the other hand, a target function appears expressing the total delivery expenditure.</p>
        <p>The presence of these two essential components removes the problem from the class of standard classification problems and, respectively, does not allow solving it by standard classification methods. Therefore, we need new approaches to solving this problem, and this is what this work is about. This paper offers two hybrid approaches (Approach 1 and Approach 2) combining Supervised Machine Learning with classical Combinatorial Optimization methods, dealing with a generalized Linear Assignment Problem that directly maps the batches to the set of consumers. At the first, Machine Learning, stage, Approach 1 solves one more classification problem, which identifies the class of each rock batch, while Approach 2 solves a regression problem instead, in which the cost of a batch of rocks is predicted.</p>
        <p>Supposedly, the proposed way will allow, on the one hand, using historical data, namely applying Machine Learning to batches of rocks previously mined in these quarries, and, on the other hand, implementing the linear programming method in the second phase of the approaches.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Prerequisites</title>
    </sec>
    <sec id="sec-3">
      <title>2.1. Supervised Learning elements</title>
      <p>Practical Machine Learning (ML) mostly uses Supervised Learning (SL) [10, 11, 12, 13]. In SL, you have an input variable vector x and an output variable y, and you use an SL algorithm to learn the mapping function from the input x to the output y = f(x), where x ∈ X. The goal is to approximate the mapping function f so well that, when you have new input data x′, you can predict the output variable y′ for that data. We come to regression or classification problems depending on which values, continuous or discrete, the function f takes. Particularly, if f : X → R¹, the problem of finding it is a regression problem (RP) [10]. In contrast, if f : X → Y, where |Y| &lt; ∞, the problem under consideration is a classification problem (CP) [10].</p>
      <p>In other words, a CP [10, 14] is a problem of identifying to which class a new instance belongs based
on available class membership for instances in a training set (TS).</p>
      <p>Instances to which regression/classification is applied form a test set (TeS).</p>
      <p>In classification, a TS-instance is given by a feature tuple x and a class label y. A TeS-instance is given by a feature tuple x′, while its class label y′ is unknown and needs to be found.</p>
      <p>In regression, y is a real value representing an outcome, while the task is to predict the real value y′ for the input vector x′.</p>
      <p>x need not be a numeric vector, since the features presented in x-components characterize certain properties of an instance, which can be categorical, ordinal, integer-valued, or real-valued. If categorical or ordinal features are present, a preprocessing stage is required that maps x into Euclidean space: x → x̄ ∈ X̄ ⊆ R^L.</p>
      <p>Let
C = {C0, ..., Cs} (1)
be a set of classes. We will refer to X̄ as an instances' space and to C as a class label space.</p>
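      <p>The preprocessing step above (mapping a categorical feature into Euclidean space) can be sketched as follows; a minimal illustration in Python, where the one_hot helper and the rock-type categories are hypothetical:</p>
      <preformat>
```python
def one_hot(value, categories):
    """Map a categorical feature value into Euclidean space R^L
    as a 0/1 indicator vector of length L = len(categories)."""
    return [1.0 if value == c else 0.0 for c in categories]

# hypothetical categorical rock-type feature
categories = ["granite", "marble", "limestone"]
vector = one_hot("marble", categories)  # [0.0, 1.0, 0.0]
```
      </preformat>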
      <sec id="sec-3-1">
        <p>A class Ci0 ∈ C of cardinality |Ci0| = max_{i∈J_s^0} |Ci| is called a major class (positive class); the one of size |Ci0| = min_{i∈J_s^0} |Ci| is called a minor class (negative class). A classification algorithm (CA) is intended to train a classifier, which is a function mapping the instances' space X̄ into the class label space
Y = {0, ..., s}, (2)
i.e.
f : X̄ → Y. (3)</p>
        <p>The type of the function f, linear or not, defines two large groups of classification algorithms: linear CAs and nonlinear CAs (see the overview in [15, 11]). In addition, a wide group of highly effective methods called ensemble CAs (ECAs) exists [16, 17, 18].</p>
        <p>Popular CAs are:</p>
        <p>Y = 0,..., s
f : X → Y.</p>
        <p>
          (
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
(
          <xref ref-type="bibr" rid="ref3">3</xref>
          )
1. Linear CAs:
a) Logistic Regression (generalized linear model, LR) [19];
b) Linear Support Vector Machine Classification (SVM) [20];
c) Naive Bayes (NB) [21];
d) Linear Discriminant Analysis (LDA) [22];
2. Nonlinear CAs:
a) Decision Tree (DT) [23] - C4.5 [24], CART [25];
b) k-Nearest Neighbors (KNN) [26];
c) Artificial Neural Networks (Deep Learning, DL) [27];
d) Kernel SVM (KSVM) [28];
e) Quadratic Discriminant Analysis (QDA) [22];
3. Ensemble CAs:
a) Bagged Decision Trees (BDT);
b) Random Forest (RF) [25];
c) Extra Trees (ET);
d) Adaptive Boosting (AdaB) [29];
e) Gradient Boosting Machine (GBM) [30];
f) Stochastic Gradient Boosting (SGB) [31];
g) Extreme Gradient Boosting (eXtreme Gradient Boosting,XGBoost, XGB) [30].
A regression algorithm (RA) is targeted to train a regressor, which is a function mapping an
1
instances' space X into a prediction space Z  R :
g : X → Z.
        </p>
        <p>
          (
          <xref ref-type="bibr" rid="ref4">4</xref>
          )
        </p>
        <p>In SL, most of the algorithms are CAs, and some of them are generalized to regression. Among regression algorithms (RAs) are:
1. Linear Regression (LR) [13];
2. Linear Support Vector Machine Regression (SVR) [32];
3. Decision Tree Regression (DTR) [23];
4. Deep Learning Regression (DLR) [33];
5. Random Forest Regression (RFR) [25, 34].</p>
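        <p>To illustrate the two SL tasks with one of the algorithms listed above, here is a minimal pure-Python sketch of k-Nearest Neighbors used both as a classifier and as a regressor; the toy feature tuples, labels and outcomes are invented for the example:</p>
        <preformat>
```python
def knn_predict(train, targets, x, k=3, classify=True):
    """k-Nearest Neighbors: majority vote over the k closest training
    instances (classification) or the mean of their targets (regression)."""
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], x)))
    nearest = [targets[i] for i in order[:k]]
    if classify:
        return max(set(nearest), key=nearest.count)  # class label y'
    return sum(nearest) / k                          # real value z'

# toy training set: feature tuples x_i, class labels y_i, outcomes z_i
X_train = [[0.2, 1.1], [0.4, 0.9], [2.5, 3.0], [2.7, 3.2]]
y_train = [0, 0, 1, 1]
z_train = [10.0, 10.5, 42.0, 41.0]

y_hat = knn_predict(X_train, y_train, [0.3, 1.0], k=2)                  # 0
z_hat = knn_predict(X_train, z_train, [0.3, 1.0], k=2, classify=False)  # 10.25
```
        </preformat>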
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2. Linear Assignment Problem</title>
      <p>Assignment problems (APs) deal with the question of how to assign n items (jobs) to n machines (workers) in the best possible way. APs consist of two components: the assignment as an underlying combinatorial structure and an objective function describing the "best way" [35, 36, 37].</p>
      <p>Mathematically, an assignment can be seen as a bijective mapping of a finite set Jn = {1, ..., n} into itself, i.e., an n-permutation of Jn; these permutations form the set Πn. Assignments can be modelled in different ways: every permutation π ∈ Πn corresponds in a unique way to a permutation matrix X = (xij)i,j with xij = 1 for j = π(i) and xij = 0 for j ≠ π(i). If the objective function is linear in the X-components, the AP is called linear (LAP); if quadratic, it is called a quadratic AP (QAP), etc.</p>
      <p>The LAP integer programming model is [35]: find X = (xij)i,j such that
F(X) = Σ_{i,j=1..n} cij xij → min; (5)
Σ_{j=1..n} xij = 1, i ∈ Jn; (6)
Σ_{i=1..n} xij = 1, j ∈ Jn; (7)
xij ∈ {0, 1}, i, j ∈ Jn. (8)
Here, C = (cij)i,j is the cost matrix of the assignments.</p>
      <p>The LAP is polynomially solvable [36]. It is generalized in different ways, for instance by allowing multiple assignments, when multiple machines can be assigned to one worker [35, 37]. As a result, the classical LAP model (5)-(8) becomes
F(X) = Σ_{i=1..n} Σ_{j=1..m} cij xij → min; (9)
Σ_{j=1..m} xij = 1, i ∈ Jn; (10)
Σ_{i=1..n} xij = nj, j ∈ Jm; (11)
xij ∈ {0, 1}, i ∈ Jn, j ∈ Jm, (12)
where m ≤ n and n1 + ... + nm = n. The AP (9)-(12) is reducible to a LAP with columns taken with multiplicities n1, ..., nm, respectively.</p>
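      <p>As a sketch, the classical LAP above can be solved by a Hungarian-type polynomial algorithm, for example via scipy.optimize.linear_sum_assignment; the 3×3 cost matrix below is invented for the example:</p>
      <preformat>
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# invented 3x3 cost matrix C = (c_ij)
C = np.array([[4.0, 1.0, 3.0],
              [2.0, 0.0, 5.0],
              [3.0, 2.0, 2.0]])

rows, cols = linear_sum_assignment(C)  # optimal assignment, O(n^3)
total_cost = C[rows, cols].sum()       # minimal value of the objective F(X)
```
      </preformat>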
    </sec>
    <sec id="sec-5">
      <title>3. Description of Approaches 1 and 2</title>
    </sec>
    <sec id="sec-6">
      <title>3.1. Modelling</title>
      <p>Let us build a mathematical model of the problem stated in the Introduction. Let C ∈ C be a class of rocks, such as granite, marble, limestone, etc., y = y(C) ∈ Z+ be its number (a label), and z = p(y) be the known price of a rock batch of class y.</p>
      <p>Suppose that several rocks can be mined in a quarry and that rock samples are automatically processed. Namely, their size, shape and spectral characteristics are measured, after which the type of rock is automatically determined. Based on this information, the rock class of the entire batch is predicted. Suppose all the features are numeric; thus, X̄ = X.</p>
      <p>As a training set TS, we use historical data on rock batches for which their type is known. Then xi ∈ X is a vector of the shape, size and spectral characteristics of a training batch (i ∈ Jn), while x′i ∈ X is a vector of the same parameters for a batch involved in our problem, i.e. an element of our test set TeS (i ∈ Jn). Respectively, Y = (yi)i∈Jn are the real labels of TS, and Z = (zi)i∈Jn, where zi = p(yi), are the real costs of the batches i in TS.</p>
      <p>In order to assess the effectiveness of our approaches numerically, we offer two metrics that use the actual labels of the test set. So, let Y″ = (y″i)i∈Jn be the real labels of TeS; then Z″ = (z″i)i∈Jn, where z″i = p(y″i), are the real costs of the batches i in TeS (i ∈ Jn).</p>
      <p>3.1.1. Stage 1</p>
      <p>The first stage differs for the two approaches that we offer.</p>
      <p>Approach 1. Here, the training set has the form TS = {(xi, yi)}i∈Jn, while the test set is TeS = {x′i}i∈Jn. Stage 1 consists of solving a CP on TS and applying the found classifier f to predict the labels of TeS. It results in a multiset Ŷ′ = {ŷ′i}i∈Jn, where ŷ′i = f(x′i), i ∈ Jn. Ŷ′ can be used for predicting the costs of the batches, particularly via the collection of predictions Ẑ′ = {ẑ′i}i∈Jn, where ẑ′i = p(ŷ′i), i ∈ Jn.</p>
      <p>Approach 2. At Stage 1, we set a regression problem. In this case, the training set looks like TS = {(xi, zi)}i∈Jn, while the test set is the same. The result of applying a regression algorithm to the set TS is a regressor g. Applying it to our test set TeS, we obtain a set Z′ = {z′i}i∈Jn, where z′i = g(x′i) is the predicted cost of a rock batch i obtained by Approach 2.</p>
      <p>Note that, in this step, we can use standard regression metrics to evaluate the quality of this
prediction.</p>
      <p>Now, let us move to the second, Combinatorial Optimization, stage.</p>
      <p>3.1.2. Stage 2</p>
      <p>This stage is common to both approaches. The only difference is that, in Approach 1, Ẑ′ is used as the vector of predicted costs while, in Approach 2, it is Z′.</p>
      <p>Suppose that, for the test set TeS, we also know from which quarry k out of a set JK each batch of rocks comes. Accordingly, the locations of the quarries and the distances between them are known, as are the locations of the consumers and the distances between them and the quarries. This makes it possible to find the cost of delivering a batch from each of the quarries to the consumers.</p>
      <p>Let B1, ..., Bm be the customers and A = (akj)k∈JK, j∈Jm be a cost matrix, where akj is the cost of delivery of a batch from the k-th quarry to the consumer Bj.</p>
      <p>Approach 2. With the help of A, let us build another cost matrix
A′ = (a′ij)i∈Jn, j∈Jm, (13)
where a′ij is the predicted (with the help of Approach 2) total delivery expenditure of the batch i to the consumer Bj, j ∈ Jm.</p>
      <p>Here, a′ij includes, as one component, the delivery cost aki j, where ki is the quarry where the batch i was mined.</p>
      <p>The other two components directly depend on how the real cost of the batch relates to the cost of the rocks ordered by the customer and on the penalties that the consumers stipulate for violating the delivery quality.</p>
      <p>So, let bj be the cost of the rock ordered by the customer Bj, vj be the number of ordered rock batches, and dj be the penalty (per unit) for delivering a low-quality batch.</p>
      <p>Then, if z′i &gt; bj, the mining company owners will receive less money for exceeding the ordered quality level of the rock batches. If z′i = bj, there will be no additional costs. Finally, if z′i &lt; bj, the company will have to pay (bj − z′i) dj currency units as a penalty for the delivery of poor-quality rocks. Thus
a′ij = aki j + (z′i − bj), if z′i ≥ bj;
a′ij = aki j + (bj − z′i) dj, if z′i &lt; bj. (14)</p>
      <p>Having filled the matrix A' , we can proceed to the optimization step, in which it is decided where
to send which batch in order to minimize the total delivery expenditure. This can be represented as the
need to assign the batches to the consumers subject to the constraint on fulfilling the customers' orders.</p>
      <p>The mathematical model is: find a set
T = (tij)i∈Jn, j∈Jm, where tij = 1 if the batch i goes to the customer Bj, and tij = 0 otherwise, (15)
of unknowns such that the following holds.</p>
      <sec id="sec-6-1">
        <title>The objective function is</title>
        <p>n m
F'(t ) = a'ij tij → min</p>
        <p>
          i=1 j=1
under constraints:
− the batch is either supplied or not supplied at all, which can be expressed by inequality
m
tij  1, i  Jn ,
j=1
(
          <xref ref-type="bibr" rid="ref13">13</xref>
          )
(
          <xref ref-type="bibr" rid="ref14">14</xref>
          )
(
          <xref ref-type="bibr" rid="ref15">15</xref>
          )
(
          <xref ref-type="bibr" rid="ref16">16</xref>
          )
(
          <xref ref-type="bibr" rid="ref17">17</xref>
          )
− the costumers' orders are obligatory for execution:
n '
tij  v j , j  Jm
i=i2
        </p>
        <p>The problem (15)-(18) is a generalized linear assignment problem that can be solved in time polynomial in the dimension by methods such as the Hungarian algorithm. Let T′* = (t′*ij)i∈Jn, j∈Jm be an optimal solution to this AP. For the consistency of the problem, the condition
Σ_{j=1..m} vj ≤ n (19)
must be satisfied, meaning that the number of rock batches is sufficient to meet the consumer demand.</p>
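        <p>Under this consistency condition, the generalized AP above can be reduced to a rectangular LAP by repeating each customer's column according to its demand. A minimal sketch, assuming scipy is available; the function name, cost data and demands are invented:</p>
        <preformat>
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def solve_batch_assignment(costs, demand):
    """Assign n batches to customers so that customer j receives exactly
    demand[j] batches and each batch is shipped at most once
    (assumes sum(demand) does not exceed n)."""
    costs = np.asarray(costs, dtype=float)
    # repeat column j demand[j] times, then solve a rectangular LAP
    col_of = np.repeat(np.arange(costs.shape[1]), demand)
    rows, cols = linear_sum_assignment(costs[:, col_of])
    plan = [(int(i), int(col_of[j])) for i, j in zip(rows, cols)]
    total = float(costs[:, col_of][rows, cols].sum())
    return plan, total

# invented predicted expenditures for 4 batches and 2 customers
A1 = [[3.0, 8.0],
      [1.0, 6.0],
      [9.0, 2.0],
      [7.0, 4.0]]
plan, total = solve_batch_assignment(A1, demand=[1, 2])  # total == 7.0
```
        </preformat>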
        <p>Approach 1. Here, the difference of this stage in Approach 1 compared to Approach 2 will be
outlined.</p>
        <p>The formula (13) is replaced by
Â′ = (â′ij)i∈Jn, j∈Jm, (20)
where â′ij is the predicted (with the help of Approach 1) total delivery expenditure of the batch i to the consumer Bj, j ∈ Jm.</p>
        <p>The formula (14) becomes
â′ij = aki j + (ẑ′i − bj), if ẑ′i ≥ bj;
â′ij = aki j + (bj − ẑ′i) dj, if ẑ′i &lt; bj. (21)
Respectively, the objective function (16) becomes
F̂′(t) = Σ_{i=1..n} Σ_{j=1..m} â′ij tij → min. (22)
Let a solution to the AP (22) with constraints (15), (17)-(19) be denoted T̂′* = (t̂′*ij)i∈Jn, j∈Jm.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>3.2. Comparison metrics</title>
      <p>To compare the quality of the solutions T′*, T̂′*, we solve one more AP, in which the actual costs of the test set's items are used, and compare its optimal solution with these two. This is an AP subject to the constraints (15), (17), (18) with the objective function
F″(t) = Σ_{i=1..n} Σ_{j=1..m} a″ij tij → min, (23)
where A″ = (a″ij)i∈Jn, j∈Jm is the matrix of the actual total delivery expenditures, i.e., a″ij is the total delivery expenditure for the batch i delivered to the consumer Bj, j ∈ Jm. A″ can be found similarly to (21) as
a″ij = aki j + (z″i − bj), if z″i ≥ bj;
a″ij = aki j + (bj − z″i) dj, if z″i &lt; bj. (24)</p>
      <p>Let T″* be an optimal solution for the AP (15), (17), (18), (23) and F″* = F″(T″*) be its optimal value. Substituting T′*, T̂′* into the target function (23), we get not an estimated but the actual delivery expenditure of implementing the plans T′*, T̂′* in real life. Let us denote F′* = F″(T′*), F̂′* = F″(T̂′*). If F′* = F̂′*, then Approaches 1 and 2 work equally well; if F̂′* &lt; F′*, Approach 1 showed itself better; if F̂′* &gt; F′*, then Approach 2 did.</p>
      <p>To understand how much we lose by applying the predicted costs rather than the real ones, we analyze the increments
Δ′ = F′* − F″*, Δ̂′ = F̂′* − F″*, (25)
representing the absolute errors of applying Approach 2 and Approach 1, respectively. The corresponding relative errors can be found using the already found values:
δ′ = Δ′ / F′*, (26)
δ̂′ = Δ̂′ / F̂′*. (27)</p>
      <p>In our opinion, the metrics (26), (27) highlight the specifics of the problem under consideration better than any classification metrics applicable to multi-class classification problems [14, 15, 38, 39], such as accuracy, balanced accuracy, recall, precision, F-score, G-mean, etc.</p>
    </sec>
    <sec id="sec-8">
      <title>4. Discussion of Results and Future work</title>
      <p>In the absolute majority of research literature, Combinatorial Optimization, which deals with exact solutions, and Machine Learning exist as two separate research fields. Perhaps this is because combinatorial optimization problems are highly complex, including the design of qualitative heuristics, while Machine Learning is entirely based on heuristics.</p>
      <p>The main contribution of this paper is an innovative approach to combining these two research domains. The second one is the mathematical models developed here. Together with the schemes of both approaches, they provide a road map for implementation in programming languages such as Python, where powerful Machine Learning and Integer Linear Programming libraries are available.</p>
      <p>Areas of further work are:
− generalization to a wider class of practical problems;</p>
      <p>− collecting data on the locations of Ukrainian open pits and the structure of their rocks, followed by a computational experiment using the presented approaches;</p>
      <p>− generalization to a class of stochastic programming problems, in which the predicted price is involved together with predictions of the standard deviations from the average cost.</p>
      <p>At the next stage of the research, we plan to apply a generalization of the classical single-output regression model to two outcomes: the predicted cost and the standard deviation. In this regard, this stage is supposed to be carried out using neural networks.</p>
    </sec>
    <sec id="sec-9">
      <title>5. Conclusion</title>
      <p>This paper offers two hybrid approaches to minimizing the total delivery expenditure for supplying customers with rock batches. They combine Machine Learning techniques with classical Linear Combinatorial Optimization methods. Approach 1 solves an auxiliary classification problem at the Machine Learning stage, while Approach 2 solves an additional regression problem instead. Numerical characteristics for assessing the approaches are presented. Ways to develop the contribution theoretically and experimentally are outlined.</p>
    </sec>
    <sec id="sec-10">
      <title>6. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y. G.</given-names>
            <surname>Stoyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Yakovlev</surname>
          </string-name>
          ,
          <source>Theory and Methods of Euclidian Combinatorial Optimization: Current Status and Prospects, Cybernetics and Systems Analysis</source>
          <volume>56</volume>
          (
          <year>2020</year>
          )
          <fpage>366</fpage>
          -
          <lpage>379</lpage>
          . doi: 10.1007/s10559-020-00253-6.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Matsyi</surname>
          </string-name>
          , Boolean Satisfiability Problem:
          <article-title>Discrete and Continuous Reformulations with Applications</article-title>
          ,
          <source>in: 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics</source>
          , Telecommunications and Computer Engineering (TCSET),
          <year>2020</year>
          , pp.
          <fpage>623</fpage>
          -
          <lpage>627</lpage>
          . doi: 10.1109/TCSET49122.2020.235507.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Muravyova</surname>
          </string-name>
          ,
          <source>Data Batch Processing: Modelling and Applications</source>
          , in: 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&amp;T),
          <year>2020</year>
          , pp.
          <fpage>765</fpage>
          -
          <lpage>770</lpage>
          . doi: 10.1109/PICST51311.2020.9467928.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <article-title>On Modelling of Computer Cluster Optimization Problem With Applications</article-title>
          , in: 2020
          <source>IEEE KhPI Week on Advanced Technology (KhPIWeek)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>429</fpage>
          -
          <lpage>436</lpage>
          . doi: 10.1109/KhPIWeek51551.2020.9250145.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          , Mathematical Modeling Of Transport Logistics Problems, in: 2020
          <source>IEEE 2nd International Conference on System Analysis Intelligent Computing (SAIC)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . doi: 10.1109/SAIC51296.2020.9239155.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          , Diet-Menu Problem: Modelling And Applications, in: 2020
          <source>IEEE 2nd International Conference on System Analysis Intelligent Computing (SAIC)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . doi: 10.1109/SAIC51296.2020.9239149.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          , New Approaches to Modelling Covering Problems in Monitoring Optimization, in: 2019 IEEE International Scientific-Practical Conference Problems of Infocommunications, Science and Technology (PIC S&amp;T),
          <year>2019</year>
          , pp.
          <fpage>300</fpage>
          -
          <lpage>304</lpage>
          . doi:
          <volume>10</volume>
          .1109/PICST47496.
          <year>2019</year>
          .
          <volume>9061386</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yakovlev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kartashov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korobchynskyi</surname>
          </string-name>
          ,
          <article-title>Genetic Algorithms for Solving Combinatorial Mass Balancing Problem</article-title>
          , in: 2019
          <source>IEEE 2nd Ukraine Conference on Electrical and Computer Engineering (UKRCON)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1061</fpage>
          -
          <lpage>1064</lpage>
          . doi:10.1109/UKRCON.2019.8879938.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Yakovlev</surname>
          </string-name>
          ,
          <article-title>On Some Classes of Spatial Configurations of Geometric Objects and their Formalization</article-title>
          ,
          <source>Journal of Automation and Information Sciences</source>
          <volume>50</volume>
          (
          <year>2018</year>
          )
          <fpage>38</fpage>
          -
          <lpage>50</lpage>
          . doi:10.1615/JAutomatInfScien.v50.i9.30.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Alpaydin</surname>
          </string-name>
          ,
          <source>Introduction to Machine Learning</source>
          , Adaptive Computation and Machine Learning series, 2nd ed., MIT Press, Cambridge, MA,
          <year>2010</year>
          . OCLC: ocn317698631.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Bishop</surname>
          </string-name>
          ,
          <source>Pattern Recognition and Machine Learning</source>
          , 1st ed.
          <year>2006</year>
          , corr. 2nd printing 2011 ed., Springer, New York,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kubat</surname>
          </string-name>
          ,
          <source>An Introduction to Machine Learning</source>
          , 2nd ed., Springer International Publishing,
          <year>2017</year>
          . doi:10.1007/978-3-319-63913-0.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Forsyth</surname>
          </string-name>
          ,
          <source>Applied Machine Learning</source>
          , Springer International Publishing,
          <year>2019</year>
          . URL: https://www.springer.com/gp/book/9783030181130. doi:10.1007/978-3-030-18114-7.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Drummond</surname>
          </string-name>
          , Classification, in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Sammut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          (Eds.),
          <source>Encyclopedia of Machine Learning</source>
          , Springer US, Boston, MA,
          <year>2010</year>
          , pp.
          <fpage>168</fpage>
          -
          <lpage>171</lpage>
          . doi:10.1007/978-0-387-30164-8_111.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <source>Data Mining: Practical Machine Learning Tools and Techniques</source>
          , 4th ed., Morgan Kaufmann, Amsterdam,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <article-title>Machine-learning research: Four current directions</article-title>
          ,
          <source>AI Magazine</source>
          <volume>18</volume>
          (
          <year>1997</year>
          )
          <fpage>97</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>G.</given-names>
            <surname>Brown</surname>
          </string-name>
          , Ensemble Learning, in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Sammut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          (Eds.),
          <source>Encyclopedia of Machine Learning</source>
          , Springer US, Boston, MA,
          <year>2010</year>
          , pp.
          <fpage>312</fpage>
          -
          <lpage>320</lpage>
          . doi:10.1007/978-0-387-30164-8.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>W.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <source>Cost-Sensitive, Scalable and Adaptive Learning Using Ensemble Methods: Theory and Application</source>
          , VDM Verlag, Saarbrucken,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>McCullagh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Nelder</surname>
          </string-name>
          ,
          <source>Generalized Linear Models</source>
          , 2nd ed., Chapman and Hall/CRC, Boca Raton,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>Support-vector networks</article-title>
          ,
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <year>1995</year>
          )
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          . doi:10.1007/BF00994018.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pazzani</surname>
          </string-name>
          ,
          <article-title>On the Optimality of the Simple Bayesian Classifier under Zero-One Loss</article-title>
          ,
          <source>Machine Learning</source>
          <volume>29</volume>
          (
          <year>1997</year>
          )
          <fpage>103</fpage>
          -
          <lpage>130</lpage>
          . doi:10.1023/A:1007413511361.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>