=Paper=
{{Paper
|id=Vol-2803/paper14
|storemode=property
|title=The system of convolution neural networks automated training (short paper)
|pdfUrl=https://ceur-ws.org/Vol-2803/paper14.pdf
|volume=Vol-2803
|authors=Vladislav A. Sobolevskii
}}
==The system of convolution neural networks automated training (short paper)==
The system of convolution neural networks automated training
Vladislav A. Sobolevskii
St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 14th line V.O., 39, St. Petersburg, 199178, Russia
Abstract
This paper presents research related to the creation of a program complex that realizes the automated generation of service programs for artificial intelligence systems based on convolutional neural networks. The presented program complex is intended to accelerate and simplify the generation and training of convolutional neural networks.

Keywords
Machine learning, convolutional neural networks, service-oriented architecture, internet of things

Models and Methods for Researching Information Systems in Transport, Dec. 11-12, St. Petersburg, Russia
EMAIL: arguzd@yandex.ru (V. A. Sobolevskii)
ORCID: 0000-0001-7685-4991 (V. A. Sobolevskii)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
1. Introduction

In the modern world, recognition technologies for photo and video images are being adopted ever more intensively. The development of this sphere became possible due to the appearance of new convolutional neural network (CNN) architectures and the modification of existing ones. This type of architecture has turned out to be successful enough for solving tasks of image analysis, segmentation and semantic recognition. However, the higher the accuracy and capabilities of CNNs become, the more complex they become: some of the most successful and widespread CNN architectures at the moment contain a large number of heterogeneous layers [1-3]. This leads not only to an increase in the quality of their work, but also to complications in creating and training such networks.

At the same time, the number of tasks that can be solved using CNNs is rising. These tasks do not always demand the most complex and advanced CNN architectures, but they are still quite difficult, and regular users without knowledge of deep learning methods and implementation skills would not be able to create and adapt such networks correctly. It can be said that the quantity of such tasks is growing faster than the number of professionals capable of solving them. This leads to the fact that the task of creating systems that automate CNN generation for one or another sphere is becoming very relevant [4-6]. At the same time, the demand for a system suitable for solving typical tasks from different spheres is becoming more acute. There are many tasks of one class (for example, the recognition of certain tree species in satellite images, landscape peculiarities, specific natural objects, etc.) whose solution principle has already been discovered, which are handled by producing an individual CNN [7-9], or which are not being solved at all due to the lack of specialists. Additionally, many CNNs are produced in the form of program prototypes (for instance, in MatLab), and such prototypes require additional work before they can be integrated into existing monitoring systems, which are built on specific stacks of applied programming languages (C++, Java, Python, etc.). In turn, this makes the further development and subsequent deployment of prototypes more complicated.

To solve these tasks, the system of convolutional neural networks automated training described in this article was designed on the basis of the service-oriented approach. The approach of automated generation of artificial neural networks is not new, and there are several works on this topic [10-13]. All of them point to the fact that automating the production of machine learning models speeds up the development of program products for solving a multitude of tasks. The system described in this article elaborates the idea of automation and has a modular, extensible structure which allows trainable architectures, training algorithms, data normalization, validation, etc. to be added and combined. Moreover, due to genetic algorithms, the system is capable of automated CNN generation and training, which allows non-professionals who are not familiar with the details of neural network configuration to use it for solving typical tasks. The result of the system's work is not only a built architecture, but also a generated executable file with additional REST and SOAP wrappings that, without any preliminary preparation, allows the produced CNN to be started as a service and called from other systems and program complexes. This makes the system a tool for quick and effortless solving of simple typical tasks by regular users.

By the present time, the designed system has already been used for generating simple deep neural networks that were introduced into third-party program products for solving specific applied tasks [14-15]. This article describes the capabilities of the program complex that were extended with automated CNN training.
2. The service-oriented approach in neural networks automated generation

The service-oriented architecture (SOA) of applications implies a modular approach to application development [16]. In the considered situation this paradigm is implemented at several levels.

At the level of the program complex itself, SOA provides the modularity and interchangeability of the CNN generation and training algorithms. Thus, the whole process of automated generation and training is divided into several consecutively invoked program modules:
• the input data normalization module;
• the generation module of the chosen CNN, or the module of pre-trained CNN architecture initialization;
• the CNN training module (including verification and validation submodules).

Each of these modules is available in several implementation variants (for various CNN architectures), and particular implementations are chosen depending on the requirements. In addition, these modules are invoked from an external automated training module (currently implemented on the basis of a genetic algorithm), which was developed with changeability in mind: other solution-search algorithms can be used instead of it, and no significant modifications to the other modules are needed in order to use them.
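As an illustration only, the module composition described above can be sketched as a small pipeline of interchangeable stages; the class names (PipelineModule, DataNormalizer, CNNGenerator, CNNTrainer, AutomatedTrainer) are hypothetical and do not reproduce the actual interfaces of the program complex.

```python
# A minimal sketch of the modular pipeline described above.
# All class names are hypothetical illustrations, not the real interfaces.
from abc import ABC, abstractmethod


class PipelineModule(ABC):
    """A single interchangeable stage of the automated training pipeline."""

    @abstractmethod
    def run(self, payload: dict) -> dict:
        ...


class DataNormalizer(PipelineModule):
    def run(self, payload: dict) -> dict:
        # normalize the raw training data (placeholder logic)
        payload["data"] = [x / max(payload["data"]) for x in payload["data"]]
        return payload


class CNNGenerator(PipelineModule):
    def run(self, payload: dict) -> dict:
        # generate (or initialize a pre-trained) CNN architecture description
        payload["model"] = {"layers": payload.get("layers", 3)}
        return payload


class CNNTrainer(PipelineModule):
    def run(self, payload: dict) -> dict:
        # train the generated model; verification/validation would live here
        payload["trained"] = True
        return payload


class AutomatedTrainer:
    """External module that invokes the stages consecutively; its search
    strategy (trivial here) could be replaced, e.g. by a genetic algorithm."""

    def __init__(self, stages: list[PipelineModule]):
        self.stages = stages

    def run(self, payload: dict) -> dict:
        for stage in self.stages:
            payload = stage.run(payload)
        return payload


result = AutomatedTrainer([DataNormalizer(), CNNGenerator(), CNNTrainer()]).run(
    {"data": [1.0, 2.0, 4.0]}
)
```

Replacing the search strategy then amounts to supplying a different external module, which is exactly the interchangeability the SOA decomposition is meant to provide.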
This approach is based on the principles of transparency and scalability, which allows the functionality of the program product to be expanded by adding new modules rather than by modifying existing ones.

It is obvious that this approach does not make it possible to automate the training of all conceivable CNN architectures. However, the generation and training processes of typical architectures follow a precise and consecutive algorithm. Having implemented this algorithm in the program complex, the streaming (conveyor) production of typical neural network solutions can be addressed as the main task.

The service-oriented approach in the developed program complex also manifests itself in the fact that the modules do not all have to be installed on one and the same personal computer (PC). Modules can be distributed between different PCs or placed in cloud storage. Thus, the program complex can be implemented in the form of a distributed system that fits into the SOA paradigm completely.

At the level of the program product's output, SOA is maintained by the implementation of an autonomous service containing a CNN trained to solve a specific task. This service is cross-platform and can be launched without any prior installation or additional software setup on a number of operating systems (which is possible due to the cross-platform nature of the modules' implementation language, Python [17]). Respectively, such a module can be used in systems supporting both the SOA paradigm and the Internet of Things (IoT) via REST and SOAP interfaces [18-20].
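A minimal sketch of such a service wrapping is given below; Flask, the /predict route and the JSON field names are illustrative assumptions, not the actual wrappings generated by the system.

```python
# A sketch of a REST wrapping around a trained CNN, in the spirit of the
# generated service described above. Flask and the route/field names are
# illustrative choices only.
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict(vector):
    # Placeholder for the trained CNN inference call
    # (e.g. model.predict() of a Keras model loaded at startup).
    return sum(vector)


@app.route("/predict", methods=["POST"])
def predict_endpoint():
    vector = request.get_json()["input"]
    return jsonify({"output": predict(vector)})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

A client system or IoT gateway would then call the service with an ordinary HTTP request, for example `requests.post("http://host:8080/predict", json={"input": [...]})` in Python, without needing to know anything about the CNN inside the wrapper.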
3. The algorithm of convolution neural networks automated training

The difficulty of CNN production and training lies in the fact that they can be trained only with a labelled training dataset which describes the class of recognizable objects. The recognition of different object classes requires different CNN architectures and parameter settings, and due to CNN complexity this task becomes very resource-intensive. This is one of the key restrictions of CNNs trained with a teacher (supervised learning). Nowadays an approach is often used that consists in creating multitask CNNs for different scientific fields that can solve a whole class of tasks [21-23]. This approach has some advantages, particularly higher accuracy for the selected objects. However, the development of each such CNN is even more resource-intensive and demands the participation of specialists able to design the architectures of such networks. The alternative solution described in this article is the automated training of models. This kind of solution implies the simultaneous training of several CNNs on a prepared dataset, followed by a situational choice of the most precise model, which leads to the necessity of assessing the quality of the models' parametric adaptation. At the same time, forming a training dataset in the common case does not require special knowledge [24]. The automated system (AS) described in the article is relevant in cases when the development of a full-fledged CNN able to solve the task in the most accurate way is unprofitable. Using this system, it is possible to create a CNN able to solve the assigned task cheaper and faster, with an accuracy specified by the user.

The algorithm of CNN selection was implemented in the following way (a sketch of this loop in Python is given after the list):
1. In the first parent population, a fixed number of CNNs (M) is generated with randomly set parameters.
2. Nd new CNNs are generated whose parameters are selected randomly from two randomly chosen parent CNNs, and also Nr CNNs whose parameters are set completely randomly within the given value ranges for these parameters.
3. Further, the CNN selection is performed using the roulette method (formula 1) [25]:

\( p_i = \frac{f_i}{\sum_{j=1}^{N} f_j} \)  (1)

where p_i is the choice probability of the i-th CNN, f_i is the value of the fitness function for the i-th CNN, and N is the quantity of CNNs in the population. The roulette method was chosen as the most universal one, because the algorithm is supposed to be used for different classes of tasks: although specialized selection schemes would speed up the search for some task classes, they would inevitably slow it down for others. The fitness function is based on the inaccuracy estimation of the CNN target parameter values relative to the real values of a test dataset (formula 2):

\( f_i = \frac{1}{\sqrt{\frac{1}{X}\sum_{j=1}^{X}\left(\varepsilon_{ij} - \omega_j\right)^2}} \)  (2)

where ε_ij is the output value of the target parameter forecast by the i-th network in response to the j-th input test vector, ω_j is the real value of the test dataset for the j-th input test vector, and X is the quantity of test vectors. The result of a calculation according to this formula is a "fitness level" value, which is inversely proportional to the root mean squared error of the i-th CNN on the test dataset. As a result of selection, the M CNNs with the maximum value of p_i (the choice probability of the i-th CNN) are selected into the current generation out of the (M + Nd + Nr) CNNs.
4. For all CNNs, the mean squared error of the target parameter values calculated by them relative to the real test dataset values is computed. If at least one CNN shows a mean squared error lower than the set value, the cycle stops, and the CNN with the lowest mean squared error is treated as the "winner". Otherwise, a return to step 2 takes place. In addition, the population of each iteration is stored separately: if the population of the current iteration coincides completely with the previous one, it means that during the whole iteration a more accurate CNN configuration has not been found, and an unconditional transition to step 5 is carried out.
5. If a CNN with a mean squared error lower than the set value is not found, the cycle is launched again from step 1 with a new parent population, for which new random parameter values are set. If the solution is not found after I iterations, the task is declared unsolvable with the specified settings and the algorithm terminates.
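A compact sketch of this selection loop is shown below under simplifying assumptions: each candidate CNN is reduced to a parameter dictionary, train_and_evaluate() is a hypothetical stand-in for building, training and testing the corresponding network (e.g. with Keras/TensorFlow) and returning its test RMSE, and the parameter ranges are illustrative.

```python
# Sketch of the CNN selection algorithm (steps 1-5) with illustrative
# parameters; train_and_evaluate() is a hypothetical callback that trains a
# candidate CNN and returns its RMSE on the test dataset.
import random

PARAM_RANGES = {"layers": (2, 8), "filters": (8, 64), "lr": (1e-4, 1e-2)}


def random_params():
    return {k: random.uniform(*bounds) for k, bounds in PARAM_RANGES.items()}


def crossover(a, b):
    # each parameter is taken from one of the two parent CNNs at random
    return {k: random.choice((a[k], b[k])) for k in a}


def fitness(rmse):
    # formula (2): fitness is the inverse of the test RMSE
    return 1.0 / rmse if rmse > 0 else float("inf")


def select_cnn(train_and_evaluate, m=10, n_d=6, n_r=4,
               target_rmse=0.1, max_iterations=50):
    population = [random_params() for _ in range(m)]                    # step 1
    for _ in range(max_iterations):
        children = [crossover(*random.sample(population, 2)) for _ in range(n_d)]
        randoms = [random_params() for _ in range(n_r)]                 # step 2
        candidates = population + children + randoms
        rmses = [train_and_evaluate(p) for p in candidates]
        if min(rmses) <= target_rmse:                                   # step 4
            return candidates[rmses.index(min(rmses))]                  # "winner"
        weights = [fitness(r) for r in rmses]
        # step 3, formula (1): roulette selection with p_i = f_i / sum_j f_j
        population = random.choices(candidates, weights=weights, k=m)
    return None  # step 5 (simplified): declared unsolvable after the budget
```

Step 5 of the described algorithm restarts the whole cycle with a fresh parent population before giving up; the sketch simply stops after the iteration budget.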
4. Technologies used in the developed program complex

The program complex is developed in the Python programming language, whose main assets are its cross-platform nature, extensibility and the large number of third-party libraries for solving the specified tasks. This language was chosen because at the moment it is the main solution for developing deep learning systems, and also because it makes it easy to realize the SOA paradigm [26, 27]. The Keras and TensorFlow libraries are used for the implementation of the training algorithms.

Such a technology stack is explained by the fact that the program does not face the task of implementing atypical solutions; on the contrary, the quick realization of already known architectures is required, and the use of already developed, tested and optimized libraries satisfies this task completely. At the same time, the key requirements are extensibility and scalability. Respectively, building the program complex on top of a constantly evolving platform makes it possible to add new CNN architectures and the tools for working with them behind a single program interface. The cross-platform nature of the described stack and the support of the SOA paradigm allow the program complex to be scaled to different hardware.
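To illustrate how typical architectures can be assembled quickly from Keras/TensorFlow building blocks, a generation module can be sketched as a parameterized model factory; the particular parameterization (conv_blocks, filters, num_classes, learning_rate) is an assumption made for the example, not the system's actual interface.

```python
# A sketch of a generation module assembling a typical CNN from
# Keras/TensorFlow building blocks; the parameterization is illustrative.
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(input_shape=(128, 128, 3), conv_blocks=3, filters=32,
              num_classes=2, learning_rate=1e-3):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for i in range(conv_blocks):
        # each block doubles the number of filters and halves the resolution
        x = layers.Conv2D(filters * 2 ** i, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model
```

A model produced this way can be handed to the training module and, after training, saved and wrapped into the generated service.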
It is important to mention separately that the CUDA SDK is also among the libraries used, which allows hardware acceleration to be exploited during artificial neural network training on NVidia video cards [28, 29]. The use of this technology makes the process of CNN training significantly faster [30].
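A quick check of whether this acceleration is actually available can be done from Python; the snippet assumes a TensorFlow build with CUDA support is installed.

```python
# Check whether an NVidia GPU is visible to TensorFlow; if it is, training
# will run with hardware acceleration, otherwise it falls back to the CPU.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"{len(gpus)} GPU(s) available for training" if gpus
      else "No GPU found; training will run on CPU")
```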
5. The approbation of the automated convolution neural network training program complex

For the approbation of the program complex, a prototype performing additional training of the Mask R-CNN (MRCNN) architecture trained on the COCO dataset was developed. This configuration was chosen because of its balance between universality and accuracy [31]. By default, MRCNN is already capable of recognizing fundamentally different object classes, from automobiles to animals. That is why, with proper additional training, it should be able to recognize a wide range of objects that are not included in the COCO dataset.

The program complex was tested on the task of counting the deer in a herd from aerial photography. Besides the fact that deer do not belong to the COCO dataset and MRCNN cannot by default distinguish them from a range of other creatures (sheep, gazelles, cows, horses), the specificity of this task is that the photos are taken from various angles and distances, over different landscapes and during all seasons. As a result, deer can be shot at different angles and various scales and can have diverse colouring. What is more, due to the size of the herds, deer often cover one another in the photos. This makes the described task non-trivial, and applying a CNN trained on a common amount of data is impossible. Figure 1 shows the recognition results for one of two images using MRCNN without additional training.

Figure 1: The deer recognition and calculation using the basic MRCNN trained on the COCO dataset

It can be noted that there are plenty of false negative errors caused by the specificity of the COCO dataset, which contains an insufficient number of images with similar scaling of objects. To get rid of such false results, the network needs to be trained on images labelled for the specified task. That is why MRCNN was additionally trained using the CNN automated training system prototype. The training was conducted in automated mode based on the training dataset specified by the user. The following parameters of the training process were varied in the prototype (a sketch of such a search space follows the list):
• the quantity of training epochs;
• the quantity of training steps in each epoch;
• the speed of training;
• the threshold of detection skipping.
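These varied parameters can be handed to the automated trainer as a search space. The sketch below is only an illustration: the dictionary keys mirror the four parameters listed above, while the key names and the ranges are assumptions, not the identifiers or values used in the prototype.

```python
# Illustrative search space for the four varied training parameters; the
# ranges and key names are assumptions, not the prototype's actual settings.
import random

SEARCH_SPACE = {
    "epochs": (1, 10),                        # quantity of training epochs
    "steps_per_epoch": (10, 100),             # training steps in each epoch
    "learning_rate": (1e-4, 1e-2),            # speed of training
    "detection_min_confidence": (0.5, 0.95),  # threshold of detection skipping
}


def sample_configuration():
    """Draw one random training configuration from the search space."""
    return {
        "epochs": random.randint(*SEARCH_SPACE["epochs"]),
        "steps_per_epoch": random.randint(*SEARCH_SPACE["steps_per_epoch"]),
        "learning_rate": random.uniform(*SEARCH_SPACE["learning_rate"]),
        "detection_min_confidence": random.uniform(
            *SEARCH_SPACE["detection_min_confidence"]),
    }
```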
The CNN declared the winner by the system was trained for 3 epochs with 53 training steps in each, a training speed of 0.0058 and a detection skipping threshold of 0.86. For the same image, this network correctly recognized 58 out of 93 deer and did not make any false negative errors (figure 2).

Figure 2: The deer recognition and calculation using the additionally trained MRCNN
Of course, the trained CNN did not reach the maximum possible accuracy, but it can be improved in the future. What is more, the recognition accuracy may be increased by using other CNN architectures. Nevertheless, the prototype testing can be considered successful, because program and service wrappings were generated for the additionally trained MRCNN, which allows the received CNN to be used for solving the set task right away. Due to the unified interface, it will be possible to deploy more accurate CNNs in the future: even if a different CNN architecture is used in the following versions, the program and service wrapping interface will not change, and no changes will be required in the programs on the client side.

6. Conclusion

Nowadays, the program complex is at the prototype stage, and it is used for the development of several off-site applications. First of all, to start full operation, the user application interface needs to be improved. As at the prototyping stage the product is used by specialists in machine learning, the current interface is not adapted for use by regular users. Because of this, accessibility for a wide user audience, which is one of the key tasks facing the program complex, has not been achieved yet.

In addition, because of the high performance requirements, for commercial use the program complex needs to be moved to high-performance servers. The specificity of the computations during CNN training places a range of requirements on the hardware, and commercial use implies the parallel training of several models, which can load the system significantly. Despite the computational parallelism built into the program complex architecture using SOA, additional research and stress tests are required to outline the specific hardware requirements.

Acknowledgements

This work was supported by the RFBR grant №19-37-90112 and the budgetary theme 0073-2019-0004.

References

[1] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, Communications of the ACM (2017), volume 60, issue 6, pp. 84-90.
[2] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 3rd International Conference on Learning Representations (2015).
[3] M. D. Zeiler, R. Fergus, Visualizing and understanding convolutional networks, 13th European Conference on Computer Vision (2014), volume 8689, issue 1, pp. 818-833.
[4] Z. Geng, Y. Wang, Automated design of a convolutional neural network with multi-scale filters for cost-efficient seismic data classification, Nature Communications, volume 11, issue 1, 2020.
[5] M. Wistuba, A. Rawat, T. Pedapati, Automation of deep learning, Proceedings of the 2020 International Conference on Multimedia Retrieval (2020), pp. 5-6.
[6] B. Baker, O. Gupta, N. Naik, R. Raskar, Designing neural network architectures using reinforcement learning, 5th International Conference on Learning Representations (2017).
[7] Ateeq-ur-Rauf, A. R. Ghumman, S. Ahmad, H. N. Hashmi, Performance assessment of artificial neural networks and support vector regression models for stream flow predictions, Environmental Monitoring and Assessment, volume 190, issue 12, article 704, 2018.
[8] Z. Alizadeh, J. Yazdi, J. H. Kim, A. K. Al-Shamiri, Assessment of machine learning techniques for monthly flow prediction, Water (Switzerland), volume 10, issue 11, article 1676, 2018.
[9] J. Lantrip, M. Griffin, A. Aly, Results of near-term forecasting of surface water supplies, Proceedings of the 2005 World Water and Environmental Resources Congress, Anchorage, Alaska, US, 2005. doi: 10.1061/40792(173)447.
[10] I. Bello, B. Zoph, V. Vasudevan, Q. V. Le, Neural optimizer search with reinforcement learning, 34th International Conference on Machine Learning (2017), volume 1, pp. 712-721.
[11] H. Cai, T. Chen, W. Zhang, Y. Yu, J. Wang, Efficient architecture search by network transformation, 32nd AAAI Conference on Artificial Intelligence (2018), pp. 2787-2794.
[12] J.-D. Dong, A.-C. Cheng, D.-C. Juan, W. Wei, M. Sun, DPP-Net: Device-aware progressive search for Pareto-optimal neural architectures, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 11215, pp. 540-555, 2018.
[13] M. Wistuba, Deep learning architecture search by neuro-cell-based evolution with function-preserving mutations, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 11052, pp. 243-258, 2019.
[14] V. Mikhailov, A. Spesivtsev, V. Sobolevsky, N. Kartashev, Multi-model estimation of the dynamics of plant community phytomass, The 13th IEEE International Conference Application of Information and Communication Technologies, Baku, Azerbaijan, pp. 324-328, 2019.
[15] V. A. Zelentsov, A. M. Alabyan, I. N. Krylenko, I. Yu. Pimanov, M. R. Ponomarenko, S. A. Potryasaev, A. E. Semenov, V. A. Sobolevskii, B. V. Sokolov, R. M. Yusupov, A model-oriented system for operational forecasting of river floods, Herald of the Russian Academy of Sciences, volume 89, issue 4, pp. 405-417, 2019. doi: 10.1134/S1019331619040130.
[16] M. Bell, Introduction to Service-Oriented Modeling, in: Service-Oriented Modeling: Service Analysis, Design and Architecture, Wiley & Sons, New York, NY, 2008.
[17] J. V. Guttag, Introduction to Computation and Programming Using Python: With Application to Understanding Data, 2nd Edition, MIT Press, Cambridge, Massachusetts, 2016.
[18] Y. Mesmoudi, M. Lamnaour, Y. E. L. Khamlichi, A. Tahiri, A. Touhafi, A. Braeken, Design and implementation of a smart gateway for IoT applications using heterogeneous smart objects, 4th International Conference on Cloud Computing Technologies and Applications (Cloudtech), 2018.
[19] D. Hanes, IoT Fundamentals: Networking Technologies, Protocols, and Use Cases for the Internet of Things, Cisco Press, Indianapolis, Indiana, 2017.
[20] T. Erl, Service-Oriented Architecture: Analysis and Design for Services and Microservices, 2nd Edition, Prentice Hall, Upper Saddle River, New Jersey, 2016.
[21] D. Xu, Z. Tian, R. Lai, X. Kong, Z. Tan, W. Shi, Deep learning based emotion analysis of microblog texts, Information Fusion, volume 64, pp. 1-11, 2020.
[22] U. Ozkaya, F. Melgani, M. Belete Bejiga, L. Seyfi, M. Donelli, GPR B scan image analysis with deep learning methods, Measurement: Journal of the International Measurement Confederation, volume 165, 2020.
[23] A. Dutta, T. Batabyal, M. Basu, S. T. Acton, An efficient convolutional neural network for coronary heart disease prediction, Expert Systems with Applications, volume 159, 2020.
[24] M. Sewak, M. R. Karim, P. Pujari, Practical Convolutional Neural Networks: Implement advanced deep learning models using Python, Packt Publishing, Birmingham, UK, 2018.
[25] L. A. Gladkov, V. V. Kureichik, V. M. Kureichik, Genetic Algorithms: A Textbook, 2nd Edition, Fizmatlit, Moscow, Russia, 2006.
[26] T. Ziade, Python Microservices Development, Packt Publishing, Birmingham, UK, 2017.
[27] G. C. Hillar, Internet of Things with Python, Packt Publishing, Birmingham, UK, 2016.
[28] D. B. Tuomanen, Hands-On GPU Programming with Python and CUDA: Explore high-performance parallel computing with CUDA, Packt Publishing, Birmingham, UK, 2018.
[29] J. Han, B. Sharma, Learn CUDA Programming: A beginner's guide to GPU programming and parallel computing with CUDA 10.x and C/C++, Packt Publishing, Birmingham, UK, 2019.
[30] B. Vaidya, Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA: Effective techniques for processing complex image data in real time using GPUs, Packt Publishing, Birmingham, UK, 2019.
[31] K. He, G. Gkioxari, P. Dollar, R. Girshick, Mask R-CNN, Proceedings of the IEEE International Conference on Computer Vision, volume 2017, pp. 2980-2988, 2017.