=Paper= {{Paper |id=Vol-2267/207-212-paper-38 |storemode=property |title=Essential aspects of IT training technology for processing, storage and data mining using the virtual computer lab |pdfUrl=https://ceur-ws.org/Vol-2267/207-212-paper-38.pdf |volume=Vol-2267 |authors=Mikhail A. Belov,Yury A. Krukov,Maksim A. Mikheev,Pavel E. Lupanov,Nadezhda A. Tokareva,Evgeniya N. Cheremisina }} ==Essential aspects of IT training technology for processing, storage and data mining using the virtual computer lab== https://ceur-ws.org/Vol-2267/207-212-paper-38.pdf
Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and
             Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018




  ESSENTIAL ASPECTS OF IT TRAINING TECHNOLOGY
 FOR PROCESSING, STORAGE AND DATA MINING USING
           THE VIRTUAL COMPUTER LAB
           M.A. Belov 1,a, Y.A. Krukov 2,b, M.A. Mikheev 1,d, P.E. Lupanov 1,c,
                          N.A. Tokareva 1,e, E.N. Cheremisina 3,f
 1 System Analysis and Control Department, Dubna State University, 19 Universitetskaya st., Dubna,
   Moscow region, 141980, Russia
 2 Vice-rector for Science and Innovation, Dubna State University, 19 Universitetskaya st., Dubna,
   Moscow region, 141980, Russia
 3 Head of System Analysis and Control Department, Dubna State University, 19 Universitetskaya st.,
   Dubna, Moscow region, 141980, Russia

 E-mail: a belov@uni-dubna.ru, b kua@uni-dubna.ru, c lupanov@uni-dubna.ru,
         d miheevma@uni-dubna.ru, e tokareva@uni-dubna.ru, f chere@uni-dubna.ru


This paper discusses issues in training specialists in data storage, processing and mining using the
virtual computer lab, and describes its main architectural components. The virtual computer lab is a
powerful innovative tool for training IT professionals, created and successfully operated by the
experts of the System Analysis and Control Department at Dubna State University. Improvements to the
control system of the virtual computer lab aim to lower the entry threshold, that is, the basic
knowledge and skills that students need before they can work productively and efficiently with
information and computing resources.

Keywords: virtual computer lab, virtualization, containerization, cloud computing, web services, IT
training, education, innovations in education, e-learning, cognitive technologies, finite automata
theory, expert systems in education, knowledge management, data preparation and processing,
business intelligence, policy management.

     © 2018 Mikhail A. Belov, Yury A. Krukov, Maksim A. Mikheev, Pavel E. Lupanov, Nadezhda A. Tokareva,
                                                                                Evgeniya N. Cheremisina








1. Introduction
         Many new professions that meet the needs of the Russian economy have entered the labor
market: artificial intelligence systems designer, Big Data analyst, information promotion specialist,
machine learning specialist, robotic process analyst, digital information manager, developer of
blockchain-based information systems, and even such rare professions as planner of interaction with
artificial intelligence and cognitive copywriter. These new professions call for a new methodology for
training skilled IT specialists. Such professionals should have enough knowledge and skills to meet
the current realities and needs of business, and should be able to find employment and a social
position in the leading companies of the sector, the market leaders in high-tech goods and
new-generation services. These people will create the economy of the future and drive it forward. Our
primary task today is to supply software tools and technologies that support effective work in the
classroom and at home, inspire genuine amazement and admiration for progress in mathematics and
computer science, make students more self-confident, and motivate their future work. We should pursue
sustainable development not only at the national level but also in the context of globalization and
international partnership. Import substitution is a short-term strategy; in the long term it must be
recognized that education and the development of IT lie outside politics, because innovation requires
a deep understanding of all the achievements of humankind to date.
         Accepting these modern challenges, we began creating not only a technical environment but
also a space for knowledge sharing. We draw an analogy between the laws of physics and thermodynamics
and the laws by which sociotechnical systems function. We create special conditions for cooperation
between students and teachers through our flagship project, the "Virtual Computer Lab" (VCL), which
has been continuously improved for more than 12 years. It lets students create and extend their own
multi-component corporate information systems, both individually and in teams, and develop teaching
aids as a team, involving both freshmen and graduates in the process. This significantly improves the
cognitive perception of the teaching material [1-16].

2. Background
         The education of IT specialists should stop training “consumers” and spare no effort in
training “creative doers”. For this purpose, it is important to study how information systems are
built from scratch, paying attention to the configuration and tuning of the equipment and to the
unaided connection and integration of all the necessary parts of the system, and only then to move on
to issue-oriented tasks. The practical study of modern peer-to-peer protocols is very important
nowadays. The students become acquainted with approaches to improving existing systems without
changing the end-user presentation level. They are taught how to set up modern horizontally scalable
systems for data storage, distributed map-reduce analytics, and OLAP analysis based on materialized
views, which speed up data output in a business intelligence system without the loss of reliability
and the higher cost associated with in-memory solutions. In addition to training programmers to
develop mobile solutions and cloud services, attention must be focused on training programmers in
neural network solutions using open source frameworks such as TensorFlow, Keras, OpenAI. The training
should involve modern processor capabilities, including the AVX, FMA and SSE4.x instruction sets, and
the distributed and parallel computing technologies MPI, CUDA and OpenCL, for effectively solving
problems in cognitive informatics and machine learning.
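
The map-reduce analytics mentioned above can be illustrated on a single machine. The following is a minimal sketch of the programming model only, not of any particular distributed framework used in the lab; the function names and the sample log records are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (key, value) pairs for one input record: here, (word, 1)."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run the three stages in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: each string stands for one record in a data store.
    logs = ["error disk full", "warning disk slow", "error network down"]
    print(map_reduce(logs))
```

In a real cluster the map and reduce functions run on different nodes and the shuffle step moves data over the network; the sketch keeps only the data flow that students need to recognize.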
         The expert of the future not only has fundamental scientific knowledge but is also a
promising engineer with outstanding potential, able to design and build capable computing solutions
suited to the project. Only skilled professionals of this level can create the right conditions for
the accelerating development of science and its practical applications.
         All the above-mentioned problems can be addressed in the virtual computer lab, which has
become not only an innovative tool for training highly skilled IT specialists but also a sought-after
space for technical cooperation between final-year students and potential employers. It gives a
student the opportunity to demonstrate his or her qualifications in real time, and lets an employer
present a problem in virtual form and work on it together with students, attracting young minds and
sometimes people with different ways of thinking, as illustrated, for example, by the history of the
development of neural networks and the idea of computing backpropagated errors with the gradient
descent method.
         That is why the priority of the university is to create the most favorable conditions for
forming professional competence in IT, which will help graduates to solve the wide range of tasks
that arise at all stages of corporate information systems development, including design itself. To
form this professional competence, students evidently need to master a large body of literature,
complete many practical tasks, and carry out research on modern information systems, their
deployment, maintenance and effective application to problem-oriented tasks.
         The following problems had to be solved for effective targeted training of IT specialists:
there are not enough class hours to complete the necessary and sufficient practical tasks for
studying complex information systems; it is impossible to gain experience with complex information
systems on a personal computer of average capacity, since the hardware requirements of such systems
differ from those met by home, office and portable computers; difficulties arise during the setup and
maintenance of information systems that cannot be resolved without prior experience with such
systems; and licenses for some products are extremely expensive for an individual user, although in
most cases the license is needed only for the educational process.
         The main way to solve these problems has been to create a virtual computer lab that is able to
solve the problem of insufficient computing and software resources and to provide an adequate level
of technological and methodological support; to teach how to use modern technologies to work with
distributed information systems; to organize group work with educational materials by involving users
in the process of improving these materials and allowing them to communicate freely with each other
on the basis of self-organizational principles.


3. Brief concept of using virtual computer lab
         The Virtual Computer Lab provides a set of software- and hardware-based virtualization,
containerization and management tools that enable the flexible, on-demand provision and use of
computing resources, a knowledge management system, theoretical materials and practical cookbooks in
the form of cloud services for carrying out research projects, scientific computations and tasks
related to the development of complex corporate and other distributed information systems. The
service also provides dedicated virtual servers for innovative projects carried out by students and
staff of the Institute of System Analysis and Control of Dubna State University [9], [13].
         One main distinguishing trait of the Virtual Computer Lab is its self-organizing principles,
which make it possible to transition students from a rigid system of group security policies to a new
system where each student can develop a sense of personal responsibility, respect for colleagues, and
tolerance, which should provide a solid foundation for strengthening and developing basic
civilizational values in the education environment. Thus, today the need has arisen to incorporate
technologies into the educational process that will contribute to global integration in the foreseeable
future.
         It is no accident that training built around high-availability distributed information
systems is a priority, because software solutions of this type have become an integral part of modern
business. That is why the design and deployment of failover clusters is the topic of several special
courses, which are designed to satisfy modern companies' demand for these skills. When designing
corporate information systems, ensuring the availability of critical applications regardless of the
hardware and software environment is essential to the successful execution of many key business
processes. Downtime, including scheduled maintenance, leads to additional costs and the loss of
customers, and long outages are simply unacceptable for modern high-tech enterprises.
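
The cost of downtime can be made concrete with simple availability arithmetic. The sketch below is illustrative only and is not taken from the courses described here; it assumes that cluster nodes fail independently and that failover is instantaneous, and the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525 600 minutes in a non-leap year


def yearly_downtime_minutes(availability):
    """Expected downtime per year for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def cluster_availability(node_availability, nodes):
    """Availability of a failover cluster that is up while any node is up.

    Assumes independent node failures and instantaneous failover, which
    real clusters only approximate.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    single = 0.99
    pair = cluster_availability(single, 2)
    print(f"one node:  {yearly_downtime_minutes(single):.1f} min/year")
    print(f"two nodes: {yearly_downtime_minutes(pair):.1f} min/year")
```

Under these assumptions, adding a second node to a 99%-available server reduces the expected yearly downtime by two orders of magnitude, which is the motivation for studying failover clusters.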








4. Fundamental components of virtual computer lab
         Modern blade servers are the hardware components that support virtual computer labs. They
are high-performance, high-capacity, but compact and allow the space in the server room to be used
more efficiently.
         The software platform of the Virtual Computer Lab is built on VMware vSphere. It consists
of vSphere ESXi hypervisors, which handle all the computing work of the virtual machines and include
some custom enhancements and optimizations for specific hardware, and of vCenter Server central
management servers [14].
         The vCenter Server consists of the following key components:
         vCenter Single Sign-On. This component is critical to the whole environment, since it
provides secure authentication services for many vSphere components. Single Sign-On creates an
internal secure domain in which the various components and solutions that are included in the vSphere
ecosystem are registered during the installation or upgrade process, and subsequently they will be
assigned basic infrastructural resources. Within the VCL architecture this component is responsible
not only for internal authentication services, but it is also used to authenticate users from the
university's internal domain who have Microsoft Active Directory accounts at the university.
         vCenter Server. The vCenter Server component is a central component that is used to manage
the vSphere environment. This module provides management and monitoring interfaces for several
vSphere nodes, and it also enables the use of such technologies as VMware vSphere vMotion and
VMware vSphere High Availability.
         vCenter Inventory Service. Approximately ninety percent of vSphere Web Client requests to
the server are just requests to read the current configuration of the system and its state. The Inventory
Service is a component that caches most of the information about the current state of the environment
to respond to vSphere Web Client requests to reduce the load on vCenter basic processes.
         vSphere Server for Web Client (vSphere Web Client). vSphere Web Client is the main
interface used to centrally manage the environment. It consists of two parts: a server part, and a
client part that runs in the end user's browser, which must be Adobe Flex compatible and support
NPAPI plugins. The server part serves the requests of the client part. It is worth noting that the
VCL may also be managed using the vCenter Server Desktop Client that is installed on the end user's
computer.
         vCenter Server Database. The database is one of the key modules in the vCenter Server stack
architecture. Almost every request sent to the vCenter Server entails communicating with the database.
This database is the main storage location for vCenter Server parameters, and it is also a repository of
statistical data. Saved statistical data make it possible to optimize system performance during
subsequent analysis.
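
The role described above for the Inventory Service, answering frequent read requests from a cache so that the core processes are not loaded, follows the general read-through caching pattern. The sketch below illustrates that pattern only; it is not VMware's implementation, and the class names `ConfigStore` and `InventoryCache` and the key `vm-01` are hypothetical.

```python
class ConfigStore:
    """Stand-in for the authoritative, slower source of configuration."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts how often the slow path is taken

    def read(self, key):
        self.reads += 1
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value


class InventoryCache:
    """Read-through cache: serves repeated reads without touching the store."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        # Only the first read of a key reaches the underlying store.
        if key not in self._cache:
            self._cache[key] = self._store.read(key)
        return self._cache[key]

    def update(self, key, value):
        # Writes go to the store and invalidate the stale cached entry.
        self._store.write(key, value)
        self._cache.pop(key, None)


if __name__ == "__main__":
    store = ConfigStore()
    cache = InventoryCache(store)
    cache.update("vm-01", "poweredOn")
    for _ in range(10):
        cache.get("vm-01")
    print("store reads for 10 client reads:", store.reads)
```

Because most client requests are reads, nearly all of them are answered from the cache, and the store is consulted again only after a write invalidates an entry.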
         NVIDIA Tesla, Volta, Pascal and Maxwell GPUs can be used for 3D virtualization, and VMware
Horizon Suite is used for remote VDI connections as well as for creating and managing images of
virtual servers and workstations that are separated into layers using VMware ThinApp. GPU support is
very important for machine learning because it significantly increases the training speed of neural
networks.
         A centralized management portal and a knowledge management system were created to improve
the productivity of work in the Virtual Computer Lab. Such a system was needed because students are
able to improve the productivity of remote learning by themselves, so it is important to build a
social network among all participants and to create an environment in which students can
independently engage in the identification, acquisition, presentation, and use (distribution) of
knowledge without the direct involvement of the instructor.
         Methods of use (distribution) are directly related to storage methods. Consequently, the
technological tools that may be used for transmitting formal knowledge include knowledge bases with
various search functions; blogs, wikis, and social networks; “Wiki Textbooks” that allow all
participants to collaboratively create and update educational content and exchange practical
problems (including ones from real companies); as well as user blogs, forums, and group chat systems.
         The new practice with containers differs from the VMware case and effectively complements
it for a wide range of practical tasks. With containers, the kernel of the underlying operating
system is shared by all containers. On the one hand, this restricts the use of other operating
systems; on the other hand, it increases the payload that a server of similar configuration can
carry. This is achieved thanks to the specifics of the containerization architecture, which we
examine using Docker as an example.
         Docker uses a client-server architecture in which the Docker client interacts with the
Docker daemon, which creates and launches containers on the server and provides them to students. In
general, a containerization system can be represented as three key components: images, registries,
and containers. Images are read-only templates that contain an operating system environment relying
on the same kernel as the host system, along with the necessary pre-configured and adapted software.
These images are created, modified if necessary, and then used to deploy individual containers. The
images are stored in the registry, which is a tool for their storage and distribution. The registry
content corresponds to the curriculum and laboratory plans prepared by the teaching staff.
         The containers themselves are, in effect, like directories of an operating system in which
all the changes made by the user and the system software during operation are stored. Each container
created from an image can be quickly created, started, stopped, moved, and deleted. It also works as
a safe sandbox for running applications, allowing the student to carry out any experiments without
compromising the base operating system, while maintaining a high level of performance. The ongoing
evolution of the VCL has led to the development of design templates for both corporate IT deployments
and student learning projects.
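
The relationship between images, registries and containers described above can be modeled in a few lines. The following is a conceptual toy model, not Docker's API or storage driver; the classes `Image`, `Registry` and `Container`, the image name `lab-postgres` and the file paths are all hypothetical.

```python
from types import MappingProxyType


class Image:
    """Read-only template: a name plus an immutable mapping of files."""

    def __init__(self, name, files):
        self.name = name
        self.files = MappingProxyType(dict(files))


class Registry:
    """Stores images by name and hands them out on request."""

    def __init__(self):
        self._images = {}

    def push(self, image):
        self._images[image.name] = image

    def pull(self, name):
        return self._images[name]


class Container:
    """Writable layer over an image; the image itself is never modified."""

    def __init__(self, image):
        self.image = image
        self._layer = {}       # files created or changed in this container
        self._deleted = set()  # image files hidden in this container

    def write(self, path, content):
        self._layer[path] = content
        self._deleted.discard(path)

    def delete(self, path):
        self._layer.pop(path, None)
        self._deleted.add(path)

    def read(self, path):
        if path in self._deleted:
            raise FileNotFoundError(path)
        if path in self._layer:
            return self._layer[path]
        if path in self.image.files:
            return self.image.files[path]
        raise FileNotFoundError(path)


if __name__ == "__main__":
    registry = Registry()
    registry.push(Image("lab-postgres", {"/etc/motd": "welcome"}))
    image = registry.pull("lab-postgres")
    student_a, student_b = Container(image), Container(image)
    student_a.write("/etc/motd", "experiment")
    print(student_a.read("/etc/motd"), "|", student_b.read("/etc/motd"))
```

Each student's changes stay in that student's own container layer, so one experiment cannot affect another container or the shared image, which is the sandbox property the text relies on.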


5. Conclusion
        It should also be emphasized that the virtual computer lab has helped us provide an optimal
and sustainable technological, educational-organizational, scientific-methodological, and regulatory-
administrative environment for supporting innovative approaches to computer education. It promotes
the integration of the scientific and educational potential of Dubna State University and the formation
of industry and academic research partnerships with leading companies that are potential employers of
graduates of the Institute of System Analysis and Control.
        The results that the Institute of System Analysis and Control has achieved in improving the
educational process represent strategic foundations for overcoming perhaps one of the most acute
problems in modern education: the fact that it tends to respond to changes in the external environment
weakly and slowly.


References
[1] Belov M.A., Kryukov Y.A., Lupanov P.E., Miheev M.A., Cheremisina E.N., Koncepciya
kognitivnogo vzaimodeystviya s virtual'noy komp'yuternoy laboratoriey na osnove vizual'nyh modeley
i ehkspertnoy sistemy, Estestvennye i tekhnicheskie nauki, 2018, №10, Pp. 27-36.
[2] Belov M.A., Lupanov P.E., Tokareva N.A., Cheremisina E.N. Kontseptsiya usovershenstvovannoy
arhitektury virtual'noy komp'yuternoy laboratorii dlya effektivnogo obucheniya spetsialistov po
raspredelennym informatsionnym sistemam razlichnogo naznacheniya i instrumental'nym sredstvam
proektirovaniya, Sovremennye informatsionnye tekhnologii i IT-obrazovanie. 2017. T. 13. № 1.
Pp. 182-189.
[3] Belov M.A., Cheremisina E.N., Potemkina S.V., Distance learning through distributed information
systems using a virtual computer lab and knowledge management system, Journal of Emerging
research and solutions in ICT, 2016.
[4] Lishilin M.V., Belov M.A., Tokareva N.A., Sorokin A.V., Kontseptual'naya model' sistemy
upravleniya znaniyami dlya formirovaniya professional'nyh kompetentsiy v oblasti IT v srede
virtual'noy komp'yuternoy laboratorii, Fundamental'nye issledovaniya. 2015. № 11-5. Pp. 886-890.
[5] Belov M.A., Lishilin M.V., Tokareva N.A., Antipov O.E., Ot virtual'noy komp'yuternoy laboratorii
k upravleniyu znaniyami. Itogi i perspektivy, Kachestvo. Innovatsii. Obrazovanie. 2014. № 9 (112).
Pp. 3-14.




[6] Cheremisina E.N., Belov M.A., Lishilin M.V., Integratsiya virtual'noy komp'yuternoy laboratorii i
znanievogo prostranstva - novyy vzglyad na podgotovku vysokokvalifitsirovannyh it-spetsialistov,
Sistemnyy analiz v nauke i obrazovanii. 2014. № 1 (23). Pp. 97-104.
[7] Cheremisina E.N., Belov M.A., Lishilin M.V., Analiz klyuchevyh aktivnostey zhiznennogo tsikla
upravleniya znaniyami v vuze i formirovanie kontseptual'noy modeli arhitektury sistemy upravleniya
znaniyami, Otkrytoe obrazovanie. 2013. № 3 (98). Pp. 34-41.
[8] Cheremisina E.N., Mitroshin P.A., Belov M.A., Kompleksnye sistemy elektronnogo obucheniya
kak instrumentariy otsenki kompetentsiy uchashchihsya, Nauka i biznes: puti razvitiya. 2013. № 5
(23). Pp. 113-122.
[9] Belov M.A., Tokareva N.A., Cheremisina E.N., F1: the cloud-based virtual computer laboratory -
an innovative tool for training, In: 1st International Conference IT for Geosciences 2012.
2012. Pp. F1.
[10] Cheremisina E.N., Antipov O.E., Belov M.A., Rol' virtual'noy komp'yuternoy laboratorii na
osnove tekhnologii oblachnyh vychisleniy v sovremennom komp'yuternom obrazovanii,
Distantsionnoe i virtual'noe obuchenie. 2012. № 1. Pp. 50-64.
[11] Belov M.A., Antipov O.E., Kontrol'no-izmeritel'naya sistema otsenki kachestva obucheniya v
virtual'noy komp'yuternoy laboratorii, Kachestvo. Innovatsii. Obrazovanie. 2012. № 3 (82). Pp. 28-32.
[12] Antipov O.E., Belov M.A., Tekhnologiya primeneniya virtual'noy komp'yuternoy laboratorii v
uchebnyh kursah vuza, Estestvennye i tekhnicheskie nauki. 2012. № 1 (57). Pp. 260-268.
[13] Cheremisina E.N., Belov M.A., Antipov O.E., Sorokin A.V., Innovatsionnaya praktika
komp'yuternogo obrazovaniya v universitete "dubna" s primeneniem virtual'noy komp'yuternoy
laboratorii na osnove tekhnologii oblachnyh vychisleniy, Programmnaya inzheneriya. 2012. № 5.
Pp. 34-41.
[14] Antipov O.E., Belov M.A., Tokareva N.A., Arhitektura virtual'noy komp'yuternoy laboratorii
dlya podgotovki spetsialistov v oblasti informatsionnyh tekhnologiy, Komp'yuternye instrumenty v
obrazovanii. 2011. № 4. Pp. 37-44.
[15] Antipov O.E., Belov M.A., Opyt ispol'zovaniya otkrytogo programmnogo obespecheniya v
virtual'noy komp'yuternoy laboratorii na osnove tekhnologii oblachnyh vychisleniy, Problemy i
perspektivy razvitiya obrazovaniya v Rossii. 2010. № 6. Pp. 112-116.
[16] Antipov O.E., Belov M.A., Razrabotka i vnedrenie programmno-apparatnoy platformy virtual'noy
komp'yuternoy laboratorii v obrazovatel'nyy protsess vysshey shkoly, Nauka i sovremennost'. 2010.
№ 7-2. Pp. 8-11.



