=Paper=
{{Paper
|id=Vol-2667/paper7
|storemode=property
|title=Some approaches to improving the quality of artificial neural network training
|pdfUrl=https://ceur-ws.org/Vol-2667/paper7.pdf
|volume=Vol-2667
|authors=Yefim Rozenberg,Alexey Olshansky,Ignat Dovgerd,Gleb Dovgerd,Alexander Ignatenkov,Paul Ignatenkov
}}
==Some approaches to improving the quality of artificial neural network training ==
Yefim Rozenberg, JSC Railway Signalling Institute (JSC NIIAS), Moscow, Russia, greatfime@gmail.com
Alexey Olshansky, JSC Railway Signalling Institute (JSC NIIAS), Moscow, Russia, lexolshans@gmail.com
Ignat Dovgerd, JSC Railway Signalling Institute (JSC NIIAS), Moscow, Russia, ignatikus@bk.ru
Gleb Dovgerd, JSC Railway Signalling Institute (JSC NIIAS), Moscow, Russia, christmas1409@yandex.ru
Alexander Ignatenkov, JSC VTB Capital AM, Moscow, Russia, a.ignatenkov@gmail.com
Paul Ignatenkov, Smolensk State University, Smolensk, Russia, beat.pi@gmail.com
Abstract—The paper deals with improving the quality of artificial neural network (ANN) training. The research covers a complex neural network consisting of a 2-dimensional Kohonen network and a Willshaw–von der Malsburg network capable of solving scheduling problems in transport. Existing results of using optimal control theory for ANN training are analysed; the authors suggest a new technique based on direct neural control. Comparative error values during the training process using both the traditional methods and the new approach are presented. The new technique proves to be better than the traditional one for the considered neural networks.

Keywords—artificial neural network, Kohonen network, multilayered neural network, control

I. INTRODUCTION

Issues related to scheduling have always been of great significance for the railway industry. Among the most common scheduling tasks one can mention routing, timetabling, volume planning, combined timetabling and volume planning, etc. Solving these tasks with strict methods, we face certain problems such as combinatorial complexity, exhaustive searches, computer memory deficiency, and time-consuming computations. In such cases a number of heuristic algorithms are used (Monte Carlo methods, evolutionary algorithms, neural networks, etc.). The present paper is aimed at illustrating how neural networks (a special category of them) solve timetabling tasks, and at creating methods to control the quality of ANN training. The ANN under investigation [1] is a 2-dimensional modification of the Kohonen and Willshaw–von der Malsburg networks.

When seeking a neural network solution of any task we should answer the following questions:

1. How to translate the task into a language “understandable” for the neural network; how to find the correspondence between the states of neurons and the values of the optimized parameters?

2. How to construct a network energy function with the given constraints and the given target function?

Immediately we run into two difficulties:

1. How to establish a correspondence between the members of a network energy function and the members of the general form of network energy?

2. How to calculate weighting factors for penalty functions?

One of the first attempts to overcome these shortcomings with regard to railway transport dates back to 2015 [1] and is connected with the development of a multilayer artificial neural network with variable signal conductivity (abbreviated as MANN VSC) to be applied to scheduling. Currently, this network is considered the main subject of research in the field of improving training quality.

The MANN VSC is a hybrid neural network combining the characteristic features of a multilayer perceptron and the Willshaw–von der Malsburg network with the Hopfield network.

II. ABOUT OPTIMAL CONTROL IN NEURAL NETWORK TASKS

Recently, the scope of application of neural networks has expanded considerably. The most popular tasks are synthesis of control systems, identification tasks, data processing, information recovery tasks, scheduling problems and other original activities (e.g. creating new pictures and works of art).

Despite routine modifications of ANN structures, topologies and training methods, an ANN remains a system controllable only by using sets of recommendations based on heuristic approaches [2], numerical experiments, etc. Most authors emphasize that the quality of ANN training and the development of neural network solutions is a complicated scientific problem. Sometimes we may see certain attempts at combined application of ANNs and optimal control theory as a rigorous mathematical method applicable to any task.

Paper [10] contains an attempt to create an algorithm for the development of deep convolutional neural networks using manifold compactification. This approach is suitable for computer vision ANNs but it is inconvenient for the MANN with variable signal conductivity due to the dissimilarity of their structures.

The theses of [9] are more relevant for the ANN under consideration, but it is impossible to apply the general idea of [9] because the MANN follows its own rules of output calculation. Traditionally an artificial neural network implements an epoch as a full sequence of “input-output” pairs, but the MANN under consideration does not work with a set of different examples [8].

We should focus on paper [6], where the author suggests a genetic algorithm to optimize the vector of hyperparameters for convolutional neural networks. The closest result is in [7], where an asynchronous motor is the control object and two neural networks are suggested. The first network creates a control signal; the second one catches the difference between the desired output and the measurable output.
Copyright © 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)
Data Science
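The two questions about network energy functions raised in the Introduction can be made concrete with a toy example. The sketch below is our illustration, not the authors' model: the constraint set, the cost matrix and the weights A and B are all assumptions. It builds a Hopfield-style energy for a small assignment problem in which penalty terms enforce the constraints, a target term carries the cost, and A and B are exactly the "weighting factors for penalty functions" whose selection the second difficulty concerns.

```python
import numpy as np

def energy(x, cost, A=10.0, B=1.0):
    """Hopfield-style energy for a toy assignment problem.

    x    : binary matrix, x[i, t] = 1 if train i is assigned to slot t
    cost : cost[i, t], the target function to be minimised
    A    : penalty weight for constraint violations (illustrative value)
    B    : weight of the target term (illustrative value)
    """
    # Constraint penalty: each train must occupy exactly one slot.
    row_violation = np.sum((x.sum(axis=0 + 1 - 1) - 1.0) ** 2) if False else np.sum((x.sum(axis=1) - 1.0) ** 2)
    # Constraint penalty: each slot holds at most one train
    # (penalise every pair of trains sharing a slot).
    col = x.sum(axis=0)
    slot_violation = np.sum(col * (col - 1.0)) / 2.0
    # Target term: total assignment cost.
    target = np.sum(cost * x)
    return A * (row_violation + slot_violation) + B * target

cost = np.array([[1.0, 2.0], [3.0, 1.0]])
feasible = np.array([[1, 0], [0, 1]])    # one train per slot
infeasible = np.array([[1, 1], [0, 1]])  # train 0 doubled, slot 1 doubled
print(energy(feasible, cost))    # → 2.0 (penalty-free, pure cost)
print(energy(infeasible, cost))  # → 24.0 (penalties dominate)
```

A minimal-energy state of such a function corresponds to a feasible, low-cost schedule; choosing A large enough that any violation costs more than any feasible saving is precisely the difficulty the Introduction points at.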
Paper [3] deals with constructing optimal time sequences consisting of the weights between neurons of a dynamic ANN. In [3] a two-point boundary value nonlinear problem is solved; it yields optimal rules for ANN training. The weight matrix of the ANN at every time step (epoch) is set as an optimal time sequence. The authors note that at best the weight matrix at the final time step relates to the symmetric matrix constructed by J.J. Hopfield for associative memory [4].

Initial conditions are set as an input vector concatenated from several training samples.

The quality functional (the criterion) minimizes a value opposite to the correlation between the output of each neuron and its desired output at the final time step of controlling. During the time interval between the first and the last step of controlling, the functional penalizes the level of miscorrelation between the desired output and the response of the activation function of each neuron. In this case an optimal control strategy is found as a Lagrange problem for the optimal program control of a multilayered perceptron with a sigmoid activation function.

Another way of control is applying PID-controllers as a control technique.

A few more papers concerning ANN application to scheduling tasks should be mentioned. These solutions refer to scheduling, too; nevertheless, they touch upon modification of the ANN activation function or of the ANN structure. Thus, paper [11] analyzes the choice of empirical coefficients for multilayer perceptrons and describes the transition to stochastic methods of weight modification in Hopfield models, etc.

Papers [12-16] address NP-hard problems (timetabling tasks, path searching in graphs) and their neural network solutions with different types of ANN (MLP, LSTM, CNN, etc.) and with various key algorithms (genetic algorithms, adjusting ANN parameters, error back propagation, standard searching).

However, these papers, like the other articles analyzed above, do not consider an artificial neural network as an object controllable by means of optimal control theory.

Paper [16] is a meta-study of various approaches to solving schedule problems with different recommendations – from project management techniques to neural expert systems – but without any neurocontrol and adjustments.

Examination of articles [11-16] leads to the conclusion that the problem of improving the quality of neural network solutions is being analyzed in many countries. However, the problem statement with regard to neurocontrol as a control task with two ANNs has not yet received the attention it deserves. In the field of neurocontrol this problem is rightfully considered novel. It refers both to optimal control theory and to hybrid neurocontrol.

III. ABOUT PID-CONTROL IN NEURAL NETWORKS

PID control of the ANN error signal is defined by the following classical formula [5]:

G(s) = K1 + K2/s + K3*s,

where s is the argument of the transfer function, K1 is the coefficient of proportional regulation, K2 is the coefficient of integral regulation, and K3 is the coefficient of derivative regulation. The PID-controller is implemented in the programming language R in the RStudio environment and is then incorporated into the code of the multilayered ANN. The novelty of this approach lies in the controllable object (the MANN as a kind of ANN) and in the universal algorithm transforming a concrete PID-control curve into a strict indicator which sets the direction of the MANN signal trajectory.

Fig. 1 shows the dynamics of the error signal for the MANN consisting of 27 layers with 1920 neurons in each layer and with 185 schedules as a computational load, without control.

Fig. 1. The MANN error signal (a typical mode with a traditional algorithm [8], no control).

Fig. 2 shows the desired error change signal.

Fig. 2. Setup change (the principal view of the desired signal).

The authors organized and conducted about 1200 starts of the ANN with different parameters of the proportional (ranging from 0.1 to 1), the integral (from 10 to 40) and the differential (from 0.1 to 4.1) error components, and with a disturbance value from 5 to 60 points per time step. It is not a very efficient method of control because it provides only 10% stable trajectories. Stability is understood here in the Lyapunov sense [5]. Computational experiments illustrate that the marginal critical value of the disturbance fed to the ANN is no more than 10-15% of the average error in the stable mode (Table 1). This result cannot be evaluated as practical.

IV. DIRECT NEUROCONTROL FOR MULTILAYERED ARTIFICIAL NEURAL NETWORKS AND ITS ADVANTAGES

Along with the traditional training algorithm the authors suggest a direct neurocontrol mode for training. The object to be controlled is a multilayered ANN with variable signal conductivity [1]; a three-layer perceptron with sigmoid activation functions is taken as a controller.

The main scheme of control is shown in Fig. 3.

The ANN-controller is trained on an aggregation of triples.
VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020) 28
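The PID law of Section III, G(s) = K1 + K2/s + K3*s, corresponds in discrete time to a correction built from the current error, its running sum and its last difference. The authors implemented their controller in R; the fragment below is our hypothetical Python re-sketch, and the "plant" in the loop is a stand-in for the MANN error dynamics, not the network itself.

```python
class PID:
    """Discrete PID controller: u = K1*e + K2*sum(e)*dt + K3*de/dt."""

    def __init__(self, k1, k2, k3, dt=1.0):
        self.k1, self.k2, self.k3, self.dt = k1, k2, k3, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        # Accumulate the integral term and difference the error
        # for the derivative term, then combine the three channels.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.k1 * error
                + self.k2 * self.integral
                + self.k3 * derivative)

# Stand-in for the MANN training error: a signal that decays under
# the control input (purely illustrative dynamics, hypothetical gains).
pid = PID(k1=0.1, k2=0.05, k3=0.01)
error = 100.0
for _ in range(50):
    u = pid.step(error)
    error = 0.9 * error - 0.5 * u
print(abs(error) < 100.0)  # → True: the controlled error shrinks
```

In the authors' experiments the proportional, integral and differential gains were swept over 0.1–1, 10–40 and 0.1–4.1 respectively; the gains above are only placeholders for the sketch.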
Each triple is either “The level of error per epoch” – “The level of error at the previous moment” – “The control signal from the previous time step to the present time step”, or “The previous level of error” – “The current level of error” – “The control signal”.

Fig. 3. The scheme of the direct neurocontrol mode (blocks: input; multilayered artificial neural network with variable signal conductivity and its training technique; output; time delay; ANN-controller; output forecast; discrepancy summation; control signal; executive mechanism – the algorithm of signal transmission).

The current error signal of the ANN and the previous one are gathered and fed into the trained, ready multilayer perceptron. The response signal of the ANN-controller enters the discrepancy summation and the actuating mechanism (an algorithm). Thereafter the value of the summated discrepancy is also fed to the ANN-controller.

The control scheme described above was tested on a concrete scheduling problem (the railway branch Arkhara – Volochaevka, 27 railway stations). The task included 185 trains per 24 hours.

The results of testing are given in Table 1.

TABLE I. A COMPARISON OF DIFFERENT TRAINING METHODS

Training error (points)  | Traditional algorithms | PID-controller (best configuration, K1/K2/K3 = 0.1/40/2.1) | Direct neurocontrol
Min                      | 75     | 362    | 193
Max                      | 134795 | 211585 | 57895
Median                   | 5469   | 471    | 210
Average                  | 16548  | 1830   | 384
SD                       | 6687   | 4485   | 1180
Rate of error overshoot  | 50     | 15     | 0.4

V. CONCLUSIONS

Thus, this work shows the principal possibility of controlling the multilayered artificial neural network with variable signal conductivity. A three-layer perceptron with a sigmoid activation function is used as the controller. The solutions achieved using multilayered artificial neural networks with direct neurocontrol are of much better quality as compared with those obtained with traditional algorithms.

ACKNOWLEDGMENT

This research was supported by the Russian Foundation for Basic Research (project #17-20-01065 “A theory of railway transport system neural network control”).

REFERENCES

[1] A. Olshansky and A. Ignatenkov, “One approach to control of a neural network with variable signal conductivity,” Information Technologies and Nanotechnologies (ITNT), pp. 984-987, 2017.
[2] A.V. Nazarov and A.I. Loskutov, “Neural networks algorithms of forecast and system optimization,” Saint Petersburg: Science and Technic, 2003, 384 p.
[3] O. Farotimi, A. Dembo and T. Kailath, “Neural network weight matrix synthesis using optimal control techniques,” Advances in Neural Information Processing Systems 2 (NIPS-2), Denver, Colorado, USA, 1989.
[4] J.J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci. USA, vol. 79, pp. 2554-2558, 1982.
[5] R.C. Dorf and R.H. Bishop, “Modern Control Systems,” Pearson, 2011.
[6] Y.R. Tsoy, “Neuroevolution algorithm and software for image processing,” Candidate of Technical Sciences dissertation, Tomsk Polytechnic University, 2007, 209 p.
[7] A.M. Sagdatullin, “A neural network controller for the velocity value of an asynchronous motor,” Theses of the Russian Congress for Control Problems, Moscow, Institute of Control Sciences of the Russian Academy of Sciences, pp. 4485-4498, 2014.
[8] A.M. Olshansky and A.V. Ignatenkov, “Development of an artificial neural network for constructing a train schedule,” Bulletin of the Ryazan State Radio Engineering University, vol. 55, pp. 73-80, 2016.
[9] I.M. Kulikovskikh, “Reducing computational costs in deep learning on almost linearly separable training data,” Computer Optics, vol. 44, no. 2, pp. 282-289, 2020. DOI: 10.18287/2412-6179-CO-645.
[10] Yu.V. Vizilter, V.S. Gorbatsevich and S.Y. Zheltov, “Structure-functional analysis and synthesis of deep convolutional neural networks,” Computer Optics, vol. 43, no. 5, pp. 886-900, 2019. DOI: 10.18287/2412-6179-2019-43-5-886-900.
[11] A.S. Jain and S. Meeran, “Job-shop scheduling using neural networks,” International Journal of Production Research, vol. 36, no. 5, pp. 1249-1272, 1998.
[12] Z. Li, Q. Chen and V. Koltun, “Combinatorial optimization with graph convolutional networks and guided tree search,” Advances in Neural Information Processing Systems, pp. 539-548, 2018.
[13] A. Milan, “Data-driven approximations to NP-hard problems,” Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[14] J. Bruck and J.W. Goodman, “On the power of neural networks for solving hard problems,” Neural Information Processing Systems, pp. 137-143, 1988.
[15] A. Chaudhuri and K. De, “Job Scheduling Problem Using Rough Fuzzy Multilayer Perception Neural Networks,” Journal of Artificial Intelligence, vol. 1, no. 1, 2010.
[16] S.J. Noronha and V.V.S. Sarma, “Knowledge-based approaches for scheduling problems: A survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 3, no. 2, pp. 160-171, 1991.