<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An analysis of training of a multilayer perceptron for calculating the flow parameters of a conveyor</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oleh Pihnastyi</string-name>
          <email>pihnastyi@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victoriya Usik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georgii Kozhevnikov</string-name>
          <email>Heorhii.Kozhevnikov@khpi.edu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>14, Nauky Ave., Kharkiv, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University "KhPI"</institution>
          ,
          <addr-line>2, Kyrpychova str., Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>2711</volume>
      <fpage>24</fpage>
      <lpage>26</lpage>
      <abstract>
        <p>The paper examines the methodology for modeling multi-section conveyor systems using a neural network architecture based on the multilayer perceptron model. Models of such systems are analyzed, and a rationale for the use of neural networks in the development of systems for controlling their flow parameters is given. Conditions for using such models are determined. A model of a multi-section transport conveyor is developed. The backpropagation algorithm is used to correct the synaptic weights when training the neural network in accordance with the method of minimizing the root mean square error. For the nodes in each hidden layer, the same type of activation function is specified, which characterizes the nonlinearity of the layer. The weights are adjusted during the training period after each training example. The weights of the neural network are initialized using a pseudorandom number generator, which makes it possible to repeat the experiment many times with different activation functions and training hyperparameters. Strategies for initializing the weighting coefficients of neural networks are presented, and recommendations for the initialization process are given. The neural network is trained on a data set generated using an analytical model of an eight-section transport conveyor. An analysis of the main numerical characteristics of the material flow in transport systems used as a training data set for a neural network is presented, and the need for data normalization is substantiated. The speed and accuracy of training are analyzed for multilayer perceptrons of different architectures and with different types of activation functions. The analysis was performed for both the hidden layer nodes and the output layer.</p>
      </abstract>
      <kwd-group>
        <kwd>multi-section conveyor</kwd>
        <kwd>transport delay</kwd>
        <kwd>belt speed control</kwd>
        <kwd>conveyor model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The natural resource extraction industry continues to actively develop to obtain valuable resources.
This industry is characterized by aggressive mining conditions, labor-intensive work and huge
volumes of cargo transportation. Its effectiveness directly depends on the equipment used. One of the
main types of equipment needed in the mining industry is a belt conveyor. Its difference from other
transport systems (TS) lies in its continuous operation and ability to carry impressive loads through</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature review</title>
      <p>
        The transportation of material in the mining industry accounts for about 30% of the cost of production even
when the conveyor is loaded to 50–70% of its capacity [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. With an increase in the length of the transport route and a
decrease in the load factor of a multi-section conveyor, the share of transportation in the unit cost of
products increases nonlinearly. A transport conveyor is a dynamic distributed system in which the
relation between the input and output parameters takes into account the transport delay. When designing
systems for controlling the transportation of material for a conveyor consisting of several sections,
models are used based on the equations of system dynamics [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the aggregated equation of state [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
the Lagrange equations, and the finite element method [7-9]. With an increase in the number of sections to
several dozen, these methods lose their relevance due to a significant increase in the complexity of the
computational algorithm. In this case, the analytical PiKh-model of the transport conveyor can be
applied to develop control systems [10,11]. In recent years, a considerable body of research has appeared on
the use of the multilayer perceptron (MLP) for modeling various processes in conveyor TS [12-14].
      </p>
      <p>
        A common phenomenon in the modern mining industry is the use of multi-section conveyor
systems [15-17]. The use of MLP for calculating the flow parameters of a conveyor-type TS is of
scientific and practical interest even with several tens of conveyor sections [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The tendency to
further increase the number of sections in modern conveyor systems makes the use of MLP in models
of multi-section conveyor-type systems increasingly relevant. Unlike other studies, this research
focuses on the influence of the network architecture on the
accuracy of predicting the flow parameters of the transport system.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem statement</title>
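      <p>To train the MLP, let us use the error back-propagation algorithm [18], based on the correction of
synaptic weights in accordance with the method of minimizing the mean square error. During forward
signal propagation, the synaptic weights that determine the connections between the nodes of the
MLP layers are fixed. The synaptic weights are corrected when the signal propagates backward.
An activation function is specified for each neuron. When constructing the MLP, two types of activation
functions were used: the logistic activation function
<disp-formula><label>(1)</label><tex-math>f(x) = \frac{a}{1 + \exp(-bx)}</tex-math></disp-formula>
and the linear (ReLU) activation function
<disp-formula><label>(2)</label><tex-math>f(x) = \begin{cases} a, &amp; bx \ge a, \\ bx, &amp; 0 &lt; bx &lt; a, \\ 0, &amp; bx \le 0. \end{cases}</tex-math></disp-formula></p>
      <p>Within one layer, the same activation function is set for all nodes, with the coefficients a, b fixed for each
node of the layer. For different layers, the form of the activation function and the coefficients that
determine it may differ. For the identity-type function f(x) = bx, x ∈ ]−∞; +∞[ (without restrictions on
the range of values of the function), a multilayer feed-forward network with a linear activation
function can be reduced to a network without hidden layers. The network, in this case, transforms into a
linear regression model
<disp-formula><label>(3)</label><tex-math>y_\nu = w_{\nu 0} + w_{\nu 1} x_1 + \ldots + w_{\nu m} x_m + \ldots + w_{\nu M} x_M,</tex-math></disp-formula>
in which the values of the output parameters of the TS y<sub>ν</sub> are determined through the values of the
input parameters x<sub>m</sub>. A TS model based on a multilayer MLP with the logistic activation function (1) has a
distributed form of nonlinearity and high connectivity of network nodes, which significantly
complicates the learning process and, as a consequence, the use of such models to describe TS. In this
regard, to substantiate and qualitatively analyze the results of a model with the logistic activation
function at the nodes of the MLP, a model of a multilayer MLP with the linear activation function (2) is
considered. When conducting a comparative analysis of the results for MLP with different
architectures, it should be borne in mind that the back-propagation algorithm used ensures
convergence to one of the minima of the function that determines the value of the mean square error.
This requires a fairly large number of experiments with different parameters of the activation function
and values of the learning rate. The mean square error of the model (MSE) was used
as the criterion for the quality of training when predicting the output flow parameters of the TS y<sub>ν</sub>:
<disp-formula><label>(4)</label><tex-math>\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} \sum_{\nu=1}^{V} \left( y^{t}_{\nu n} - y_{\nu n} \right)^2,</tex-math></disp-formula>
where y<sub>νn</sub>, y<sup>t</sup><sub>νn</sub> are the predicted and test values of the output parameter y<sub>ν</sub>, ν = 1..V, for the n-th
row of input parameters x<sub>mn</sub>, m = 1..M, and N is the total number of elements in the training set
(the cardinality of the set). During training, the weights w<sub>νm</sub> are adjusted within the training epoch
after each training example, so during one learning epoch the weighting coefficients are adjusted N
times. MSE (4) is calculated after each epoch. The correction Δw<sub>νm</sub> for the weight w<sub>νm</sub> connecting the output
neuron ν and the neuron m of the hidden layer (the previous layer adjacent to it) is determined by
the delta rule
<disp-formula><label>(5)</label><tex-math>\Delta w_{\nu m} = -\alpha_{\nu m} \frac{\partial E_\nu}{\partial w_{\nu m}},</tex-math></disp-formula>
where the quality criterion is selected as the sum of squares of errors for each of the nodes of the
output layer
<disp-formula><label>(6)</label><tex-math>E_\nu = \frac{1}{2} \sum_{\nu} e_\nu^2.</tex-math></disp-formula></p>
      <p>
        Each output neuron is characterized by an error e<sub>ν</sub>
<disp-formula><label>(7)</label><tex-math>e_\nu = y^{t}_{\nu} - y_\nu,</tex-math></disp-formula>
where y<sub>νn</sub> is the predicted value of the output parameter and y<sup>t</sup><sub>νn</sub> is the value of the output
parameter that is used to train the MLP. The distributed error for the hidden layer e<sub>ν</sub> is determined by
the magnitude of the errors in the output layer. In expression (7), the index n is omitted to simplify the
notation, but it is assumed that the formula is given for the n-th sample of the training data set [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] used
to train the MLP. To train the MLP, a dataset was generated using the PiKh analytical model to
study the flow parameters of the TS. The methodology for generating the data set is presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
predicted value y<sub>νn</sub> for the ν-th node of the output layer is expressed in terms of the values of the nodes
x<sub>mn</sub> of the hidden layer (previous layer)
<disp-formula><label>(8)</label><tex-math>y_\nu = f\left( \sum_m w_{\nu m} x_m \right),</tex-math></disp-formula>
from where
<disp-formula><label>(9)</label><tex-math>\frac{\partial E_\nu}{\partial w_{\nu m}} = \frac{\partial E_\nu}{\partial e_\nu} \frac{\partial e_\nu}{\partial w_{\nu m}} = \frac{\partial E_\nu}{\partial e_\nu} \frac{\partial ( y^{t}_{\nu} - y_\nu )}{\partial w_{\nu m}} = \frac{\partial E_\nu}{\partial e_\nu} \frac{\partial ( y^{t}_{\nu} - y_\nu )}{\partial y_\nu} \frac{\partial y_\nu}{\partial w_{\nu m}}.</tex-math></disp-formula>
      </p>
      <p>Taking into account that
<disp-formula><label>(10)</label><tex-math>\frac{\partial E_\nu}{\partial e_\nu} = e_\nu, \qquad \frac{\partial ( y^{t}_{\nu} - y_\nu )}{\partial y_\nu} = -1,</tex-math></disp-formula>
<disp-formula><label>(11)</label><tex-math>\frac{\partial y_\nu}{\partial w_{\nu m}} = \frac{\partial f\left( \sum_m w_{\nu m} x_m \right)}{\partial w_{\nu m}} = \frac{\partial f\left( \sum_m w_{\nu m} x_m \right)}{\partial \sum_m w_{\nu m} x_m} \frac{\partial \sum_m w_{\nu m} x_m}{\partial w_{\nu m}} = f'\left( \sum_m w_{\nu m} x_m \right) x_m,</tex-math></disp-formula>
the expression for adjusting the weight w<sub>νm</sub> connecting the output neuron ν and the neuron m of the
hidden layer can be represented as
<disp-formula><label>(12)</label><tex-math>\Delta w_{\nu m} = -\alpha_{\nu m} e_\nu (-1) f'\left( \sum_m w_{\nu m} x_m \right) x_m.</tex-math></disp-formula></p>
      <p>For the logistic activation function (1),
<disp-formula><label>(13)</label><tex-math>\frac{\partial f(\beta)}{\partial \beta} = \frac{a b \exp(-b\beta)}{\left( 1 + \exp(-b\beta) \right)^2} = b f(\beta) \frac{\exp(-b\beta)}{1 + \exp(-b\beta)} = b f(\beta) \left( 1 - \frac{f(\beta)}{a} \right),</tex-math></disp-formula>
so, considering relation (8), for the nodes of the layer with the logistic activation function let us write
<disp-formula><label>(14)</label><tex-math>\Delta w_{\nu m} = \alpha_{\nu m} e_\nu b y_\nu \left( 1 - \frac{y_\nu}{a} \right) x_m.</tex-math></disp-formula>
For the linear activation function (2)
<disp-formula><label>(15)</label><tex-math>\frac{\partial f(\beta)}{\partial \beta} = \begin{cases} b, &amp; 0 &lt; b\beta &lt; a, \\ 0, &amp; \text{otherwise}, \end{cases}</tex-math></disp-formula>
and, correspondingly,
<disp-formula><label>(16)</label><tex-math>\Delta w_{\nu m} = \alpha_{\nu m} e_\nu b x_m, \qquad a > y_\nu > 0.</tex-math></disp-formula></p>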
      <p>For the logistic activation function, the coefficients change with each backward pass. For the linear
activation function, the coefficients change only for nodes for which the value y<sub>ν</sub> is in
the range a &gt; y<sub>ν</sub> &gt; 0. Expressions (14), (16) determine the adjustment of the coefficients
connecting the node x<sub>m</sub> of the hidden layer and the node y<sub>ν</sub> of the output layer.</p>
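      <p>The per-example weight update for an output neuron can be sketched in a few lines of Python. This is a minimal illustration for a single neuron with the logistic activation a/(1 + exp(−bβ)), not the implementation used in the study; the function names, starting weights, inputs, target value and learning rate are all assumptions chosen for the example.</p>
      <preformat>
```python
import math


def logistic(beta, a=1.0, b=1.0):
    """Logistic activation f(beta) = a / (1 + exp(-b * beta))."""
    return a / (1.0 + math.exp(-b * beta))


def logistic_derivative(beta, a=1.0, b=1.0):
    """Derivative written through the output: b * f * (1 - f / a)."""
    y = logistic(beta, a, b)
    return b * y * (1.0 - y / a)


def delta_rule_step(weights, x, y_target, alpha, a=1.0, b=1.0):
    """One per-example update of the weights feeding a single output neuron."""
    beta = sum(w * xm for w, xm in zip(weights, x))
    y = logistic(beta, a, b)
    e = y_target - y                          # error of the output neuron
    scale = alpha * e * b * y * (1.0 - y / a)
    return [w + scale * xm for w, xm in zip(weights, x)], e


weights = [0.2, -0.1, 0.4]   # hypothetical starting weights
x = [1.0, 0.5, 0.8]          # hypothetical inputs; the first one is the constant node
errors = []
for _ in range(200):
    weights, e = delta_rule_step(weights, x, y_target=0.9, alpha=0.5)
    errors.append(abs(e))

print(errors[0], errors[-1])  # the absolute error shrinks over the updates
```
      </preformat>
      <p>Writing the derivative through the neuron output avoids a second evaluation of the exponential, which is why the update needs only the values already computed in the forward pass.</p>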
      <p>To determine the adjustment Δw<sub>km</sub> between the node z<sub>k</sub> of the hidden layer, which is
closer to the nodes y<sub>ν</sub> of the output layer, and the node x<sub>m</sub> of the hidden layer, which is farther from the
nodes y<sub>ν</sub> of the output layer, we will use expressions (14) and (16), replacing for definiteness y<sub>ν</sub> by
z<sub>k</sub> and e<sub>ν</sub> by ξ<sub>k</sub>.</p>
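      <p>Then Δw<sub>km</sub> between the two hidden layers is determined, for the logistic activation function (1), by the dependency
<disp-formula><label>(17)</label><tex-math>\Delta w_{km} = \alpha_{km} \xi_k b z_k \left( 1 - \frac{z_k}{a} \right) x_m</tex-math></disp-formula>
and, for the linear activation function (2), by
<disp-formula><label>(18)</label><tex-math>\Delta w_{km} = \alpha_{km} \xi_k b x_m, \qquad a > z_k > 0,</tex-math></disp-formula>
with an unknown error value ξ<sub>k</sub> for the node z<sub>k</sub> of the hidden layer. To determine the value of the
error, let us use the back-propagation algorithm. The hidden layer error ξ<sub>m</sub> can be written in the form
of the superposition of the errors of the output layer
<disp-formula><label>(19)</label><tex-math>\xi_m = \sum_{\nu} e_\nu \frac{w_{\nu m}}{\sum_m w_{\nu m}}.</tex-math></disp-formula></p>
      <p>The fraction of the error e<sub>ν</sub> for the node of the output layer, transmitted to the hidden layer, is
proportional to the coefficient w<sub>νm</sub>. The error for the nodes of the next hidden layer
is determined by the expression
<disp-formula><label>(21)</label><tex-math>\xi_k = \sum_{m} \xi_m \frac{w_{mk}}{\sum_k w_{mk}}.</tex-math></disp-formula></p>
      <p>Substituting the calculated error value ξ<sub>k</sub> for the hidden layer node z<sub>k</sub> into formulas
(17), (18), we obtain the correction value Δw<sub>km</sub> for the coefficient w<sub>km</sub> between the two hidden layers.
For the particular case when the value ∑<sub>k</sub> w<sub>mk</sub> is of comparable order for the different nodes
x<sub>m</sub>, expression (21) can be simplified to the form
<disp-formula><label>(20)</label><tex-math>\xi_k = \sum_{m} \xi_m w_{mk}.</tex-math></disp-formula></p>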
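      <p>The two ways of passing errors back to the preceding layer can be compared with a short Python sketch. It assumes the weights are stored as one row per downstream node, and it covers both the weight-normalized form and the simplified form without the denominator; the function name, the weights and the error values are hypothetical.</p>
      <preformat>
```python
def hidden_errors(downstream_errors, weights, normalize=False):
    """Distribute the errors of a downstream layer over the preceding layer.

    weights[nu][m] is the weight between downstream node nu and upstream node m.
    normalize=False: xi_m = sum_nu e_nu * w[nu][m]
    normalize=True:  xi_m = sum_nu e_nu * w[nu][m] / sum_m w[nu][m]
    """
    n_upstream = len(weights[0])
    xi = [0.0] * n_upstream
    for e_nu, row in zip(downstream_errors, weights):
        denom = sum(row) if normalize else 1.0
        for m, w in enumerate(row):
            xi[m] += e_nu * w / denom
    return xi


# Hypothetical example: two output nodes, three hidden nodes.
e = [0.4, -0.2]
w = [[1.0, 1.0, 2.0],
     [2.0, 1.0, 1.0]]

xi_simple = hidden_errors(e, w)
xi_normalized = hidden_errors(e, w, normalize=True)
print(xi_simple)
print(xi_normalized)
```
      </preformat>
      <p>In the normalized form the total error is conserved, since each downstream error is split into fractions that sum to one. The simplified form drops this property but avoids a division that becomes unstable when the weights of a row sum to a value close to zero.</p>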
      <p>
        In the present study, expression (20) was used when training the MLP with the back-propagation algorithm.
When determining the adjustment Δw<sub>km</sub>, the learning rate α<sub>km</sub> is local and is determined
for each weight w<sub>km</sub> when calculating the local gradient. A sufficiently small absolute change
ΔMSE = 10<sup>−6</sup> of the MSE (4) between learning epochs was taken as the criterion for the convergence of the
learning algorithm. The weights are initialized using a pseudo-random number
generator [19] (new Random(long seed)) [20] in the range [0.0; 1.0]. The initialization seed = 1000 makes it
possible to repeat the experiment many times. The value of the activation function
coefficient a for each node of the output layer is selected from the relation a ≥ max(y<sup>t</sup><sub>νn</sub>). The
maximum value of the output parameters in the training set [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is max(y<sup>t</sup><sub>νn</sub>) = 4.558.
      </p>
    </sec>
    <sec id="sec-3-method">
      <title>4. Method</title>
      <p>
        As a starting point for the analysis, let us use the structural diagram of a conveyor with eight sections
(see Figure 1). A model of the TS with this structure, based on an MLP with one hidden layer and a 9-3-2
topology, was studied in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To form the data set required for training the MLP, let us use the analytical
model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A detailed analysis of the generated dataset for training the MLP is presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]; the same
dataset is used in this study. The state of the m-th section at time τ is determined
by the parameters g<sub>m</sub>(τ), γ<sub>m</sub>(τ), ξ<sub>m</sub>: the conveyor belt speed, the material flow at the entrance of the
section and the transport route length of the section, respectively.
      </p>
      <p>The output flow of material from a section, θ<sub>1m</sub>(τ, ξ<sub>m</sub>), is a calculated value. The sections are
equipped with accumulating bunkers for combining the input flows of material and for separating the
output flows of material [21]. There is no system to control the flow of material leaving the
accumulating bunkers. The material flow from the sixth accumulating bunker is distributed between
the seventh and eighth sections in the ratio γ<sub>7</sub>(τ)/γ<sub>8</sub>(τ) = 2/3.</p>
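      <p>Input and output nodes of the MLP are numbered in accordance with the designations (Figure 2):
<disp-formula><label>(22)</label><tex-math>x_{3m-2} = \gamma_m(\tau), \quad x_{3m-1} = g_m(\tau), \quad x_{3m} = \xi_m, \quad m = 1 \ldots M,</tex-math></disp-formula>
<disp-formula><label>(23)</label><tex-math>y_1 = \theta_{17}(\tau, \xi_7), \quad y_2 = \theta_{18}(\tau, \xi_8),</tex-math></disp-formula>
ξ<sub>1</sub> = 1.0; ξ<sub>2</sub> = 0.5; ξ<sub>3</sub> = 0.7; ξ<sub>4</sub> = 0.8; ξ<sub>5</sub> = 1.5; ξ<sub>6</sub> = 1.0; ξ<sub>7</sub> = 1.5; ξ<sub>8</sub> = 0.6.</p>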
      <p>The nodes corresponding to the parameters x<sub>7</sub> = γ<sub>3</sub>(τ), x<sub>16</sub> = γ<sub>6</sub>(τ), x<sub>19</sub> = γ<sub>7</sub>(τ), x<sub>22</sub> = γ<sub>8</sub>(τ) are
excluded from the input layer of the MLP. These values can be calculated from the parameters of the
previous sections.</p>
      <p>Also excluded are the nodes x<sub>3m</sub> = ξ<sub>m</sub>, whose values are constant for the considered transport
system. Thus, the input layer consists of thirteen nodes, twelve of which are determined by
the parameters γ<sub>1</sub>(τ), g<sub>1</sub>(τ), γ<sub>2</sub>(τ), g<sub>2</sub>(τ), g<sub>3</sub>(τ), γ<sub>4</sub>(τ), g<sub>4</sub>(τ), γ<sub>5</sub>(τ), g<sub>5</sub>(τ), g<sub>6</sub>(τ), g<sub>7</sub>(τ),
g<sub>8</sub>(τ), and one node whose value is constant and equal to 1. Nodes corresponding to the calculated
parameters γ<sub>3</sub>(τ), γ<sub>6</sub>(τ), γ<sub>7</sub>(τ), γ<sub>8</sub>(τ) are excluded.</p>
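      <p>The selection of the input nodes can be checked with a short Python sketch that applies the numbering x(3m−2) = γ_m, x(3m−1) = g_m, x(3m) = ξ_m to the eight sections. The labels of the form "m.input" and "m.speed" follow the node names used in this paper, while the "bias" label for the constant node and the variable names are assumptions made for the illustration.</p>
      <preformat>
```python
M = 8                           # number of conveyor sections
CALCULATED_FLOWS = {3, 6, 7, 8}  # sections whose input flow follows from upstream sections

kept = {}            # node index -> label of a retained input node
excluded_flows = []  # indices of the input-flow nodes that are dropped
for m in range(1, M + 1):
    if m in CALCULATED_FLOWS:
        excluded_flows.append(3 * m - 2)
    else:
        kept[3 * m - 2] = f"{m}.input"   # gamma_m
    kept[3 * m - 1] = f"{m}.speed"       # g_m
    # x_{3m} = xi_m is constant for this conveyor and is left out

input_layer = [kept[i] for i in sorted(kept)] + ["bias"]

print(excluded_flows)
print(len(input_layer), input_layer)
```
      </preformat>
      <p>The resulting list contains twelve variable nodes plus the constant node, which gives the thirteen nodes of the input layer.</p>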
      <p>
        The values of the output nodes correspond to the parameters of the output material flows
θ<sub>17</sub>(τ, ξ<sub>7</sub>), θ<sub>18</sub>(τ, ξ<sub>8</sub>). The number of neurons in the hidden layers of the multilayer perceptron is
selected as N<sub>h</sub> = 15 from the range N<sub>h</sub> ∈ [5, 30], in accordance with the methods for determining the
number of hidden neurons [22, 23]
      </p>
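      <p><disp-formula><label>(24)</label><tex-math>N_h = \sqrt{N_i N_o} \approx 5, \qquad N_h = \frac{1}{2} \frac{N}{N_i \log_2 N} \approx 30,</tex-math></disp-formula>
where N ≈ 10<sup>4</sup> is the number of samples in the training set, N<sub>i</sub> = 13 is the number of nodes of the input layer and
N<sub>o</sub> = 2 is the number of nodes of the output layer.</p>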
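      <p>The two bounds on the number of hidden neurons are easy to reproduce numerically. The Python sketch below evaluates the geometric-mean estimate and the sample-based estimate N / (2 N_i log2 N) for N_i = 13, N_o = 2 and N = 10^4; the function names are assumptions introduced for the example.</p>
      <preformat>
```python
import math


def hidden_lower_bound(n_in, n_out):
    """Geometric-mean estimate sqrt(N_i * N_o)."""
    return math.sqrt(n_in * n_out)


def hidden_upper_bound(n_samples, n_in):
    """Sample-based estimate N / (2 * N_i * log2(N))."""
    return n_samples / (2.0 * n_in * math.log2(n_samples))


low = hidden_lower_bound(13, 2)
high = hidden_upper_bound(10 ** 4, 13)
print(round(low, 1), round(high, 1))  # roughly 5.1 and 28.9
```
      </preformat>
      <p>Rounded to convenient values, these estimates give the range from 5 to 30 from which N_h = 15 is taken.</p>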
      <p>This paper considers in detail a further improvement of the TS model using MLP, namely,
increasing the precision of predicting the parameters of the output flow of the transport
conveyor by changing the architecture of the MLP while keeping the same number of neurons in the hidden layers.
Let us carry out a comparative analysis of models with a rectangular network architecture for the hidden
layers, 13–N<sub>L</sub>⋅L–2 (thirteen nodes in the input layer, N<sub>L</sub>⋅L nodes in the hidden layers and two nodes
in the output layer), where N<sub>L</sub> is the number of neurons in each hidden layer and L is the number of
hidden layers.</p>
      <p>
        The computational complexity of the algorithm at a constant transport delay is proportional to the
number of sections of the conveyor. With a variable belt speed, the transport delay Δτ(τ) is
determined by solving the equation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
<disp-formula><label>(25)</label><tex-math>\xi_m = \int_{\tau - \Delta\tau(\tau)}^{\tau} g(\alpha)\, d\alpha</tex-math></disp-formula>
numerically, with computational complexity log<sub>2</sub> N<sub>τch</sub>, where N<sub>τch</sub> = τ<sub>ch</sub>/Δα; τ<sub>ch</sub> is the characteristic time
of measuring the input flow parameters of the transport system; Δα is the step of numerical integration
in (25). A further increase in the number of sections leads to an increase in the number of equations,
which makes it difficult to use the PiKh-model for designing efficient control systems due to the
extreme computational complexity of the algorithm. For such a limiting number of sections of the
transport conveyor, it is advisable to use models based on MLP [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] or linear regression equations [24].
This result is explained by the trade-off between the computational complexity of the algorithm and the
forecast error of the output parameters. For a model based on MLP, the computational
complexity of the algorithm is proportional to the value (L+2)N<sub>w</sub><sup>2</sup>, where N<sub>w</sub> is the average number of nodes per MLP layer.
      </p>
      <p>
        For branched multi-section TS, the inequality N<sub>w</sub> ≪ √(N<sub>τch</sub> N<sub>K</sub>) is valid. This makes it attractive to
use MLP models to predict the flow parameters of the transport conveyor. The number of input layer
nodes depends on the number of conveyor sections. Each such section is defined by two parameters,
x<sub>3m−2</sub> = γ<sub>m</sub>(τ) and x<sub>3m−1</sub> = g<sub>m</sub>(τ) (22). The above condition for the applicability of models using MLP
explains their limited use in the design of control systems for the flow parameters of the
transport conveyor. The main application of models using MLP [25, 26] and regression equations
[27-29] is associated, as a rule, with predicting the state of the physical characteristics of the elements of a
one-section conveyor with a significant number of regressors. A TS model based on an MLP with the 13-5-1
architecture is used to diagnose the state of wear of a conveyor belt [25]. To control the process of
extraction and transportation of material, a model of the main conveyor based on MLP with
different architectures, containing 3, 16, 32, 48 and 64 nodes in a hidden layer, is analyzed in [26]. The 3-4-3
architecture is proposed for designing a control system for the start and stop mode of the conveyor
section [30]. In work [31], a regression model of a transport conveyor consisting of 18 sections is
analyzed, and in work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a model using MLP with one hidden layer for a TS consisting of 8 sections.
      </p>
    </sec>
    <sec id="sec-4">
      <title>5. Results</title>
      <p>
        The accuracy of predicting the flow parameters of the TS depends on the data set used for training the MLP.
Satisfactory accuracy of the prediction of the values of the flow parameters of the transport conveyor
can be ensured if there is a sufficiently large data set of N samples containing
parameters of the transport system that are weakly correlated with each other and vary over a wide
range. The number of hidden neurons (24) is directly related to the value N [22, 23]. The preparation
of the set is the key step in the learning process of the MLP. It requires non-standard
operating modes of both a separate section and the TS as a whole, which are characterized, as a rule, by excess
consumption of resources and are therefore economically unacceptable. For a conveyor-type
transport system consisting of several dozen sections, it is economically impossible to provide a set of
possible combinations of non-standard modes, which creates an almost insurmountable obstacle to
the use of MLP and regression equations when designing systems for controlling the flow parameters
of a multi-section transport conveyor. A solution to this problem was proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], in
which the set for training the MLP is formed on the basis of the analytical PiKh-model of a conveyor without
directly using an experimental data set. This makes it possible to develop a model of a multi-section conveyor based
on an MLP with one hidden layer (MLP architecture 9-3-2) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for designing an optimal control
system for the flow parameters of the TS. The introduction of a control system for the flow parameters of
the TS presupposes the presence of non-contact sensors based on machine vision using digital image
processing devices [32] to determine the input material flow and the section belt speed.
Input node labels: 1.input, 1.speed, 2.input, 2.speed, 3.speed, 4.input, 4.speed, 5.input, 5.speed, 6.speed, 7.speed, 8.speed.
Numerical characteristics of the studied factors of the data set for training the MLP are presented in
Table 1 for the nodes of the input layer and in Table 2 for the nodes of the output layer of the MLP 13–N<sub>L</sub>⋅L–2
(thirteen nodes in the input layer, N<sub>L</sub>⋅L nodes in the hidden layers and two nodes in the output
layer, Figure 2) for the TS model in Figure 1. The numerical characteristics of the input parameters x<sub>k</sub> have
approximately the same order of magnitude.
      </p>
      <p>If the orders of magnitude of the numerical characteristics of the test data used for training
differ significantly, the test data set should be normalized beforehand. A normalized dataset
with mean value m<sub>ν</sub> and variance σ<sub>ν</sub><sup>2</sup>
<disp-formula><label>(26)</label><tex-math>m_\nu = E[y_\nu] = 0, \qquad \sigma_\nu^2 = E[(y_\nu - m_\nu)^2] = 1</tex-math></disp-formula>
makes it possible to compare the result with similar studies. In addition, normalization of the
set of test data is of practical importance for accelerating the training of the MLP.</p>
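      <p>Normalization to zero mean and unit variance can be sketched as follows. The Python example uses the population variance, in line with the expectation-based definition, and the sample values are hypothetical rather than taken from the training set.</p>
      <preformat>
```python
import math


def standardize(values):
    """Return the values shifted to zero mean and scaled to unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance
    std = math.sqrt(variance)
    return [(v - mean) / std for v in values]


raw = [0.5, 1.2, 2.7, 4.558, 3.1]   # hypothetical output values
normalized = standardize(raw)

mean_after = sum(normalized) / len(normalized)
variance_after = sum((v - mean_after) ** 2 for v in normalized) / len(normalized)
print(mean_after, variance_after)   # close to 0 and 1
```
      </preformat>
      <p>A column with a constant value has zero variance and cannot be scaled this way, which is one more reason to leave constant nodes out of the normalized set.</p>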
      <p>In the considered model of the transport system, the values of the input variables are positive
(input material flow, belt speed, etc.). In this case, in accordance with formulas (17), (18), the weight
coefficients w<sub>km</sub> associated with the m-th neuron of the hidden layer either all increase
or all decrease simultaneously. This reverses the direction of the vector of weight
coefficients, which leads to a zigzag movement over the error surface and slows
down the learning process. To speed up learning by the back-propagation method, the
training set was normalized. Numerical characteristics of the studied factors of the normalized
dataset for training the MLP are presented in Table 3 and Table 4.</p>
      <p>
        The values of the dimensionless numerical characteristics of the input parameters presented in
Table 3 lie in approximately the same range. This is explained by the fact that, for training the MLP, an
analytical method for forming the data set [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] was used, which provided the values of the input parameters
γ<sub>m</sub>(τ) and g<sub>m</sub>(τ) (Figure 3, Figure 4):
γ m( τ )=γ 0 m+ γ1m sin(m πτ − m4π ), γ 0 m=234+ m
3+ m
      </p>
      <p>8
mπ
gm( τ )= g0 m+ g1m sin(m πτ + 3 ), g0 m=
, γ 0 m=γ 1 m ,
, g1 m=
ψ m( τ )=ψ 0 m+ψ1m sin(m πτ + m4π ), ψ 0 m=234+ m
, ψ 0 m=ψ1 m .</p>
      <p>(29)
The series of distribution of values θ17 ( τ ,1 . 5 ) , θ18( τ ,0 . 6 ) is presented in Figure 7, Figure 8.</p>
      <p>To accelerate the convergence of the MLP learning process, the input parameters in the
training set are chosen under the condition of minimum correlation between them [33].</p>
      <p>The parameters x<sub>k</sub> and x<sub>m</sub> are considered uncorrelated if the absolute value of the correlation
coefficient between them satisfies |r<sub>km</sub>| ≤ 0.2, and strongly correlated if |r<sub>km</sub>| ≥ 0.8. The correlation
coefficients between the input parameters of the TS model are presented in Table 5.</p>
      <p>For the twelve parameters that make up the input layer, 66 correlation coefficients r<sub>km</sub> are
defined in Table 5, among which 62 coefficients satisfy the condition |r<sub>km</sub>| ≤ 0.01.</p>
      <p>This circumstance allows us to assume that the learning process will have satisfactory
convergence, and the dataset itself for training MLP has been generated quite successfully.</p>
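      <p>A check of this kind can be sketched in Python: compute the Pearson coefficient for every pair of input columns and count the pairs that fall below the weak threshold or above the strong one. The twelve synthetic sinusoidal columns below are purely illustrative and are not the data set of the study; the function names and thresholds passed as defaults are assumptions.</p>
      <preformat>
```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def count_pairs(columns, weak=0.2, strong=0.8):
    """Count all column pairs and those with weak or strong correlation."""
    counts = {"total": 0, "weak": 0, "strong": 0}
    for u, v in combinations(columns, 2):
        r = abs(pearson(u, v))
        counts["total"] += 1
        if weak >= r:
            counts["weak"] += 1
        elif r >= strong:
            counts["strong"] += 1
    return counts


# Illustrative inputs: sinusoids with different integer frequencies.
n_points = 400
taus = [2.0 * n / n_points for n in range(n_points)]
columns = [
    [1.0 + math.sin(m * math.pi * tau + m * math.pi / 4) for tau in taus]
    for m in range(1, 13)
]

counts = count_pairs(columns)
print(counts)
```
      </preformat>
      <p>Twelve columns give 12·11/2 = 66 distinct pairs, which matches the number of coefficients in the correlation table.</p>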
      <p>To improve convergence, one of the parameters in each of the pairs [2.speed-2.input] and
[5.speed-5.input] can be removed.</p>
      <p>input nodes, 8 nodes each in two hidden layers and two nodes in the output layer), the optimal
initialization strategy is achieved with the relation from [30],
where M is the number of neurons in the layer nearest to the input of the MLP.
For a uniform distribution law in the range [0.0; 1.0], the variance is σ<sub>w</sub><sup>2</sup> = 1/12, which
determines the distribution law used for the initialization of the weight coefficients. For the input
nodes, whose main numerical characteristics are presented in Table 1, the estimate is σ<sub>k</sub><sup>2</sup> ~ (0.01÷0.2).</p>
      <p>Hence, for M = 10 neurons in the hidden layer (next to the input layer), one obtains the estimate
σ<sub>w</sub><sup>2</sup> = σ<sub>k</sub><sup>2</sup>/M ~ (0.001÷0.02) for the initialization of the coefficients w<sub>km</sub>. This
estimate shows that, in order to accelerate the convergence of
MLP training, the distribution law of the pseudo-random value used during initialization must be
changed so as to reduce the variance of the weight coefficients at initialization.
To conduct the study, the following Python software libraries were used for data processing and analysis: Pandas
(version 2.0.0) and PyTorch (version 2.0.0).</p>
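      <p>The variance estimates can be illustrated with a short Python sketch. It first confirms empirically that a uniform law on [0.0; 1.0] has a variance close to 1/12, and then draws weights from a narrower law with the target variance σ_k²/M. The zero-mean symmetric interval, the use of Python's random module with seed 1000 and the sample sizes are assumptions made for the example; they are not the generator or the settings of the study.</p>
      <preformat>
```python
import math
import random


def sample_variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def half_width_for_variance(target_var):
    """Half-width r of a uniform law on [-r, r], whose variance is r**2 / 3."""
    return math.sqrt(3.0 * target_var)


rng = random.Random(1000)  # fixed seed, so the run is repeatable

# Uniform law on [0.0, 1.0]: the variance is close to 1/12.
unit_samples = [rng.random() for _ in range(100_000)]
unit_var = sample_variance(unit_samples)

# Narrower zero-mean law for sigma_k^2 = 0.2 and M = 10 (upper end of the estimate).
input_var, M = 0.2, 10
target_var = input_var / M
r = half_width_for_variance(target_var)
weights = [rng.uniform(-r, r) for _ in range(100_000)]
weights_var = sample_variance(weights)

print(unit_var, target_var, weights_var)
```
      </preformat>
      <p>For the target variance 0.02 the half-width is about 0.245, so the initial weights occupy a much narrower interval than [0.0; 1.0].</p>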
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <p>In this work, a comparative analysis of transport conveyor models based on multilayer MLP is carried
out. The main focus is on MLP with the logistic activation function and the linear (ReLU) activation
function. MLP with the logistic activation function demonstrated a shorter learning time for equal MSE
values, which is explained by the distributed form of nonlinearity and high connectivity. However, it
should be noted that the learning process of MLP with the linear (ReLU) activation function is more
stable, which makes it possible to achieve lower MSE values in the learning process.</p>
      <p>The work touches upon many issues that are important from the point of view of constructing
models of conveyor-type TS. The estimation of the method of initialization of the weight coefficients
providing the connection between the nodes of MLP is carried out and the rationale for the
preparation of a normalized data set for training MLP is given. The importance of the last question is
emphasized by the fact that the values of the nodes of the input and output layers are positive, which
leads to a simultaneous increase or decrease in the weight coefficients that ensure the connection of
the nodes of the input layer with the nodes of the hidden layer, and, as a consequence, to the
occurrence of oscillatory processes.</p>
      <p>A method is demonstrated for increasing the accuracy of the learning process by removing from
the data set the values that characterize the transient mode of operation of the transport
system.</p>
      <p>This article does not discuss the importance of hidden neurons as detectors of factors that
determine the behavior of the flow parameters of the transport system. Also, the problem of reducing
the order of a multilayer MLP without losing the quality of model prediction is not considered.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p>https://www.sciencedirect.com/science/article/abs/pii/S0263224119313259#preview-sectionabstract
[7] C. A. Wheeler, Development of the rail conveyor technology, International Journal of Mining,
Reclamation and Environment, 33(2) (2019) 118–132.</p>
      <p>https://doi.org/10.1080/17480930.2017.1352058
[8] D. He, Y. Pang, G. Lodewijks, X. Liu, Determination of Acceleration for Belt Conveyor Speed
Control in Transient Operation. International Journal of Engineering and Technology, 8(3) (2016)
206–211. http://dx.doi.org/10.7763/IJET.2016.V8.886
[9] B. Karolewski, P. Ligocki, Modelling of long belt conveyors. Maintenance and reliability.
16(2), (2014) 179–187.
http://yadda.icm.edu.pl/yadda/element/bwmeta1.element.baztechce355084-3e77-4e6b-b4b5-ff6131e77b30
[10] O. Pihnastyi and V. Khodusov, Model of a composite magistral conveyor line, Proceedings of the
IEEE International Conference on System analysis &amp; Intelligent computing (SAIC 2018). Kyiv,
Ukraine, pp. 68-72, 2018; https://doi.org/10.1109/saic.2018.8516739
[11] O. Pihnastyi, V. Khodusov, Calculation of the parameters of the composite conveyor line with a
constant speed of movement of subjects of labour, Scientific bulletin of National Mining
University, 4 (166) (2018) 138–146. https://doi.org/10.29202/nvngu/2018-4/18
[12] E. Hamzeloo, M. Massinaei, N. Mehrshad, Estimation of particle size distribution on an industrial
conveyor belt using image analysis and neural networks, Powder Technology, 261 (2014) 185-190.</p>
      <p>URL: https://doi.org/10.1016/j.powtec.2014.04.038
[13] E. Kalay, M. Boğoçlu, B. Bolat, Mass flow rate prediction of screw conveyor using artificial neural
network method, Powder Technology, 408 (2022) 117-123.</p>
      <p>URL: https://doi.org/10.1016/j.powtec.2022.117757
[14] A. Brusaferri, M. Matteucci, S. Spinelli, A. Vitali, Learning behavioral models by recurrent neural
networks with discrete latent representations with application to a flexible industrial conveyor,
Computers in Industry, 122 (2020) 103-109.</p>
      <p>URL: https://doi.org/10.1016/j.compind.2020.1032633.
[15] H. Tan, K. Lim, Review of second-order optimization techniques in artificial neural networks
backpropagation, Proceedings of the IOP Conference Series: Materials Science and Engineering,
p.8 (2019). http://dx.doi.org/10.1088/1757-899X/495/1/012003
[16] Java™ Platform, Standard Edition 8 API Specification. Class Random. URL:
https://docs.oracle.com/javase/8/docs/api/java/util/Random.html
[17] D. Knuth, Art of Computer Programming: Combinatorial Algorithms. Addison-Wesley Professional Publishing Co., Inc., USA (2022). P.784.</p>
      <p>
[18] M. Koman, Z. Laska, The constructional solution of conveyor system for reverse and bifurcation
of the ore flow, CUPRUM, 3 (72) (2014) 69–82.
http://www.czasopismo.cuprum.wroc.pl/journalarticles/download/113
[19] P. Bardzinski, R. Krol, L. Jurdziak, Empirical model of discretized copper ore flow within the
underground mine transport system. International Journal of Simulation Modelling (IJSIMM).
18(2) (2019) 279–289. http://www.ijsimm.com/Full_Papers/Fulltext2019/text18-2_279-289.pdf
[20] R. Król, W. Kawalec, L. Gładysiewicz, An effective belt conveyor for underground ore
transportation systems. IOP Conference Series: Earth and Environmental Science, 95(4) (2017) 1–9.
https://doi.org/10.1088/1755-1315/95/4/042047
[21] P. Bardzinski, P. Walker, W. Kawalec, Simulation of random tagged ore flow through the bunker
in a belt conveying system, International Journal of Simulation Modelling. 17 (2018), 597-608.
https://doi.org/10.2507/IJSIMM17(4)445
[22] K. Yotov, E. Hadzhikolev, S. Hadzhikoleva, Determining the Number of Neurons in Artificial
Neural Networks for Approximation, Trained with Algorithms Using the Jacobi Matrix. TEM
Journal 9(4) (2020) 1320-1329. https://doi.org/10.18421/TEM94-02
[23] M. G. Abdolrasol, S. M. Hussain, T. S. Ustun, Artificial Neural Networks Based Optimization
Techniques: A Review, Electronics, 10(21) (2021) 26-49.
https://doi.org/10.3390/electronics10212689</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pihnastyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Khodusov</surname>
          </string-name>
          ,
          <article-title>Neural model of conveyor type transport system</article-title>
          ,
          <source>Proceedings of The Third International Workshop on Computer Modeling and Intelligent Systems</source>
          , pp.
          <fpage>804</fpage>
          -
          <lpage>818</lpage>
          (
          <year>2020</year>
          ). http://ceur-ws.org/Vol-2608/paper60.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pihnastyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kozhevnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bondarenko</surname>
          </string-name>
          ,
          <article-title>An Analytical Method for Generating a Data Set for a Neural Model of a Conveyor Line</article-title>
          ,
          <source>Proceedings of 11th International Conference on Dependable Systems, Services and Technologies (DESSERT)</source>
          , pp.
          <fpage>202</fpage>
          -
          <lpage>206</lpage>
          , Kyiv, Ukraine (
          <year>2020</year>
          ). https://doi.org/10.1109/DESSERT50317.2020.9125041
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mathaba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <article-title>A parametric energy model for energy management of long belt conveyors</article-title>
          .
          <source>Energies</source>
          ,
          <volume>8</volume>
          (
          <issue>12</issue>
          ) (
          <year>2015</year>
          )
          <fpage>13590</fpage>
          -
          <lpage>13608</lpage>
          . https://doi.org/10.3390/en81212375
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Miao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <article-title>Energy efficiency optimization of belt conveyors based on finite-time recurrent neural networks</article-title>
          ,
          <source>Proceedings of 11th Asian Control Conference (ASCC)</source>
          , pp.
          <fpage>238</fpage>
          -
          <lpage>244</lpage>
          , Gold Coast, QLD, Australia
          (
          <year>2017</year>
          ). https://ieeexplore.ieee.org/abstract/document/8287404
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yaqot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Menezes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <article-title>Real-time coordination of multiple shuttle-conveyor-belts for inventory control of multi-quality stockpiles</article-title>
          ,
          <source>Computers &amp; Chemical Engineering</source>
          ,
          <volume>178</volume>
          (
          <year>2023</year>
          )
          <fpage>108</fpage>
          -
          <lpage>117</lpage>
          . https://doi.org/10.1016/j.compchemeng.2023.108388
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <article-title>Sustainable belt conveyor operation by active speed control</article-title>
          ,
          <source>Measurement</source>
          ,
          <volume>154</volume>
          (
          <year>2020</year>
          )
          <fpage>107</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>