=Paper=
{{Paper
|id=Vol-3094/paper_18
|storemode=property
|title=Intelligent Sensor of Information and Technical Impact (ITI) on the Network Subsystem of a Man-Made Facility
|pdfUrl=https://ceur-ws.org/Vol-3094/paper_18.pdf
|volume=Vol-3094
|authors=Yuriy Sosnovskiy,Victor Milyukov,Veronika Ilyina
}}
==Intelligent Sensor of Information and Technical Impact (ITI) on the Network Subsystem of a Man-Made Facility==
<pdf width="1500px">https://ceur-ws.org/Vol-3094/paper_18.pdf</pdf>
<pre>
Intelligent Sensor of Information and Technical Impact (ITI) on
the Network Subsystem of a Man-Made Facility
Yuriy Sosnovskiy 1, Victor Milyukov 1 and Veronika Ilyina 1
1 V.I. Vernadsky Crimean Federal University, Vernadsky av., 4, Simferopol, 295007, Crimea


                 Abstract
                 The aim of the article is to develop an information model of an intelligent sensor of information
                 and technical impact (ITI) on the communication subsystems of microprocessor-based control
                 systems used in technogenic objects that can be classified as the objects of critical information
                 infrastructure.
                 The article substantiates the importance of identifying abnormal traffic in the communication
                 subsystem of the lower and medium levels of the controlling information systems. Such traffic
                 may indicate both serious errors in the MCS (microprocessor-based control system) itself,
                 which were not detected during the system implementation and testing, and the fact of a
                 computer attack or ITI on these levels of MCS.
                 The kernel of the Modbus request-response packet is considered as the communication
                 subsystem protocol, irrespective of the communication environment type: RTU or TCP.
                 The information model in terms of extended Petri nets (EPN) developed in the paper allows us
                 to describe in a formalized way the place of a smart sensor in the MCS structure and conditions
                 of sensor triggering upon detection of a computer attack.
                 The software implementation of the sensor uses a machine learning method, XGBoost; the
                 algorithm of data preparation for training and cross-validation of the method is given.
                 Testing the method on sets of traffic dumps with signs of a computer attack (CA) on MCS
                 showed satisfactory performance in identifying the attack. The results are presented in
                 the paper.

                 Keywords
                 information and technical impact (ITI), microprocessor-based control systems (MCS),
                 intelligent sensor, abnormal traffic sensor

1. Introduction
   The growing number of systems operated by digital control systems, microprocessor-based control
systems and information control systems (ICS), together with the growing complexity of such systems,
leads to additional vulnerabilities in the software, which raises the likelihood of an intruder
implementing ITI threats against them [1]. Most ICS consist of several subsystems, the main one being
the PLC. The other subsystems are the Human Machine Interface (HMI), the field-level communication
subsystem, the Master Terminal Unit (MTU) and the Remote Terminal Unit (RTU) [2]. Many ICS designers
favor the PLC for its simple programming, changeable control program, ready-made modules, high
reliability and convenient expansion [3].
   Previously, the industrial control system network was considered to be isolated from external
networks, so the PLC was regarded as a safe device. Virus attacks in recent years, the most famous
being Stuxnet, have shown this idea to be wrong. Moreover, an increasing number of ICS have network
and Internet connections today [4].

AISMA-2021: International Workshop on Advanced in Information Security Management and Applications, October 1, 2021, Stavropol,
Krasnoyarsk, Russia
EMAIL: sosnovskiy.yv@cfuv.ru (Yuriy Sosnovskiy); milyukov.vv@cfuv.ru (Victor Milyukov); nika.ilyina@mail.ru (Veronika Ilyina)
ORCID: 0000-0003-3807-5297 (Yuriy Sosnovskiy); 0000-0002-0429-8540 (Victor Milyukov); 0000-0003-4165-5620 (Veronika Ilyina)
              © 2022 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)


    There are many examples of vulnerabilities with a high CVSS score. For example, in one month of
2022 alone, there are more than 100 entries in the search engine vulners.com. The list of affected
manufacturers is extensive: Siemens (CVE-2014-2251), Schneider (CVE-2012-0931), Omron
(CVE-2015-0987), Rockwell (CVE-2012-4690), Mitsubishi (CVE-2015-3938), WAGO and many
others. Most records have a CVSS value above 5, often above 7. The search engine SHODAN [5] shows
that thousands of industrial control systems are directly accessible via the Internet. Yet
information security is still not considered during the building and testing of PLCs, which causes
various vulnerabilities and leaves PLC programs open to tampering attacks [6][7][8].
    For the cyber security of internal PLC programs, Darvas et al. converted programs in the SCL
language into NuSMV models [9] and proposed a formal verification method for complex properties of
PLC programs [10].
    ICS protection is becoming a particularly urgent task. At the top level of the network, standard
methods and tools for protecting information systems are used. Methods to ensure the security of the
upper-level communication subsystem of the ICS are well-known and continue to be actively
developed. However, the field level and the level of programmable logic controllers in control and
monitoring systems require special measures to control information processes and to protect them
against computer attacks and other impacts.
    A deeper protection approach is also being implemented. For example, ViPNet SIES MC is
responsible for managing the components of the ViPNet Security for Industrial and Embedded
Solutions (SIES) solution. It allows deploying the solution in a trusted way, putting its components into
operation and updating both the components themselves and their key information. The ViPNet SIES
solution is an embedded security tool for elements of ACS, M2M and industrial Internet of Things
(IIoT) systems (see the whitepaper "Application of cryptographic protection of information in
intelligent electricity metering systems" in the official InfoTeCS documentation, available at
https://infotecs.ru/product/vipnet-sies-mc.html).
    An in-depth analysis has also been performed on the Siemens PLC environment and its communication
protocol S7CommPlus. This protocol enables communication between the vendor's engineering software
and PLCs such as the S7–1211C [11].
    The key element of the MCS security system is a sensor (an indicator hardware or software module)
that detects a computer incident, i.e. a successful implementation of ITI by an intruder.
    Thus, the task of developing an intelligent sensor of abnormal traffic in the communication
subsystem of MCS, using methods that make the sensor easy to configure and largely independent of
the types and formats of data transmitted in different systems, is relevant and of practical interest.

    Problem statement
    The following assumptions were made in this research:
    ● the creation, training and testing of the smart sensor requires appropriate traffic dumps, for
    generation of which a special simulation model is used, and the developed technique for test
    sequences (dumps) generation is implemented;
    ● the high level of consistency of the MCS communication subsystem simulation model is
    ensured by the fact that the software elements of the model can interact via a computer network with
    the actual components of the MCS, used in the model, without being fundamentally changed;
    ● Modbus TCP was selected as the network communication protocol, but for the intelligent
    sensor the TCP part of the data packet is discarded and only the content relating directly to the
    Modbus protocol is used. It is assumed that the analysis of TCP packets against standard criteria
    (correctness of addresses, integrity, absence of spam, etc.) is performed at a higher level of the
    information system by tools that have become standard.
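
    The third assumption can be illustrated with a minimal sketch that separates the Modbus content
from a Modbus TCP payload. It assumes the standard 7-byte MBAP header (transaction id, protocol id,
length, unit id) that precedes the Modbus PDU; the names ModbusPdu and extract_pdu are hypothetical
and are not taken from the sensor implementation described in this paper.

```python
import struct
from typing import NamedTuple


class ModbusPdu(NamedTuple):
    unit_id: int        # addressed device id, taken from the MBAP header
    function_code: int  # first byte of the PDU
    data: bytes         # remaining PDU bytes (register address, count, values, ...)


MBAP_HEADER_LEN = 7  # transaction id (2) + protocol id (2) + length (2) + unit id (1)


def extract_pdu(adu: bytes) -> ModbusPdu:
    """Strip the MBAP header from a Modbus TCP payload and return the PDU fields."""
    if len(adu) < MBAP_HEADER_LEN + 1:
        raise ValueError("payload too short for a Modbus TCP frame")
    _transaction_id, protocol_id, length, unit_id = struct.unpack(
        ">HHHB", adu[:MBAP_HEADER_LEN]
    )
    if protocol_id != 0:
        raise ValueError("protocol id must be 0 for Modbus")
    # The length field counts the unit id byte plus the PDU bytes.
    pdu = adu[MBAP_HEADER_LEN:MBAP_HEADER_LEN + length - 1]
    if len(pdu) != length - 1 or not pdu:
        raise ValueError("length field does not match the payload size")
    return ModbusPdu(unit_id, pdu[0], pdu[1:])


# Illustrative request: function 0x03 (read holding registers),
# start address 0x0010, quantity 2, addressed to unit 0x11.
request = bytes.fromhex("000100000006" "11" "03" "00100002")
pdu = extract_pdu(request)
print(pdu.unit_id, pdu.function_code, pdu.data.hex())
```

    Only the unit id, function code and data bytes are passed on to the sensor in this sketch; the
transaction id and the TCP/IP layers are left to the higher-level tools mentioned above.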
    Thus, the aim of the work is to develop information support, as well as software and hardware
support, for the intelligent sensor of ITI on MCS communication subsystems of man-made facilities,
on the basis of machine learning techniques.
    Taking into account the above-mentioned assumptions, to achieve the goal the following tasks are
set:
    ● to develop the information support for the intelligent sensor of ITI on communication
    subsystems of MCS of man-made facilities;


   ● to select a method for identifying abnormal MCS traffic, choosing from a variety of machine
   learning techniques, and to write a software program for the sensor model implementation;
   ● to "train" the sensor with the help of special traffic dumps and test the sensor's output
   characteristics on the test dumps with rare "abnormal" events.

2. Information support for intelligent sensor of ITI on communication
   subsystems of man-made facilities MCSs
    The basis for the development of the methodology of MCS protection assessment under ITI
conditions is the model of MCS functioning under ITI conditions, which allows a comprehensive
analysis of the interrelated processes of MCS functioning, implementation of ITI and elimination of
their consequences.
    The scheme of the MCS model functioning under ITI conditions, in terms of the extended Petri
net, is presented in [12][13]. The model of MCS functioning under ITI conditions in terms of
extended Petri nets (EPN) contains three operating circuits: the circuit of regular functioning, the
circuit of simulation of ITI on the MCS, and the circuit of elimination of the consequences of ITI on
the MCS.
    It is assumed that, when implementing ITI, an intruder may exploit undeclared capabilities [14] in
the MCS hardware and software as well as in the programmable data network routers. Moreover, the
intruder can be not only external but also internal: someone who is familiar with the specifics and
time constraints of the technological process (triggering conditions of automatic and automated
actuators) and is able to implement an unknown impact by exploiting a zero-day vulnerability [15].

3. Identification of abnormal MCS traffic based on machine learning methods
   and software sensor implementation
   On the basis of preliminary research using dumps of normal and abnormal traffic, the most effective
solution in terms of the combination of the ROC AUC, Recall and Precision metrics was selected: the
open-source library XGBoost, which provides a high-performance implementation of gradient-boosted
decision trees.
   Jupyter Notebook was used as the software environment and Python was chosen as the programming
language.

    The stage of data preparation for further processing is presented as a series of steps.
    1. Data conversion. For convenience, the data are converted into tabular form (pandas.DataFrame).
The obtained data are merged into a single DataFrame that contains the results of both normal
operation and erroneous operation. Labels are added to mark the records that correspond to erroneous
operation of the MCS.
    2. Checking for additional attributes. One of the most important characteristics, which correlates
strongly with the target variable, is whether the device id, its register or its function code is
"new", i.e. was not present during normal operation of the system. To add these features, the
appropriate code has been implemented, excerpts of which are shown in Figure 1.


Figure 1: Implementation of additional attributes to check for new register values or number of
registers


    3. If the amount of training data is insufficient, abnormal data corresponding to the additional
binary features are generated. Generating examples of "bad" data in the emulator is
resource-intensive, but for the additional features the process can be automated, for example when
only one device or register id value needs changing. This operation also increases the amount of
data, which has a positive effect on the model results because there are more examples to learn
from.
    4. Machine learning: building and training the model. The XGBoost package was chosen as the model.
The resulting dataframe is split into features and the target variable. These sets are used both for
cross-validation and for training with a hold-out sample. The training of the model on the resulting
data is shown in Figure 2.


Figure 2: Training and prediction by means of XGBoost algorithm
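
    Steps 2 and 3 above can be sketched as follows. This is an illustration only, written with plain
Python sets rather than pandas so that it is self-contained; the record layout (device_id,
function_code, register), the new_<field> attribute names and all function names are assumptions
and are not taken from the code excerpted in Figure 1.

```python
import random

# Hypothetical record layout: one dict per Modbus request.
FIELDS = ("device_id", "function_code", "register")


def learn_known_values(normal_records):
    """Collect, per field, the set of values seen during normal operation."""
    return {f: {r[f] for r in normal_records} for f in FIELDS}


def add_new_value_flags(record, known):
    """Step 2: return a copy of the record with binary 'new_<field>' attributes."""
    flagged = dict(record)
    for f in FIELDS:
        flagged[f"new_{f}"] = int(record[f] not in known[f])
    return flagged


def generate_abnormal(normal_records, known, count, rng):
    """Step 3: derive labelled abnormal records by changing one id to an unseen value."""
    samples = []
    for _ in range(count):
        record = dict(rng.choice(normal_records))
        field = rng.choice(("device_id", "register"))
        # Any value above the largest known one is guaranteed to be unseen.
        record[field] = max(known[field]) + rng.randint(1, 100)
        flagged = add_new_value_flags(record, known)
        flagged["label"] = 1  # 1 marks an erroneous (abnormal) request
        samples.append(flagged)
    return samples


# Illustrative data only, not taken from the paper's traffic dumps.
normal = [
    {"device_id": 1, "function_code": 3, "register": 16},
    {"device_id": 1, "function_code": 3, "register": 17},
    {"device_id": 2, "function_code": 6, "register": 32},
]
known = learn_known_values(normal)
train = [dict(add_new_value_flags(r, known), label=0) for r in normal]
train += generate_abnormal(normal, known, count=5, rng=random.Random(0))
print(len(train), "records,", sum(r["label"] for r in train), "abnormal")
```

    With a pandas.DataFrame the same flag could be computed per column with the isin() method, for
example (~df[col].isin(known_values)).astype(int), where df, col and known_values are placeholders.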

   The function fit() trains the algorithm (builds the composition); the training sample and the
labels of the target variable for each object of the training sample are passed to it as parameters.
The function also displays the hyperparameters of the algorithm, the best combination of which can be
chosen based on the current results. The predict() function takes a sample for which we want to make
a prediction and returns an array of "predicted values", the variable y predict. The metric values
and the resulting error matrix are shown in Figure 3.
   The error matrix shows that errors of the first kind occur, that is, some erroneous queries are
missed; such errors are more undesirable than errors of the second kind. However, the relative number
of missed queries is small.


Figure 3: Metric values and Error matrix
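
    For reference, the entries of an error matrix and the derived metrics can be computed as in the
sketch below, with precision = TP / (TP + FP) and recall = TP / (TP + FN). It assumes that label 1
denotes an erroneous query (the positive class); the labels are illustrative and are not the data
behind Figure 3.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes, treating label 1 (erroneous query) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # detected erroneous queries
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed erroneous queries
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal queries passed through
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Return (precision, recall); a metric is 0.0 when its denominator is empty."""
    tp, fp, fn, _tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative labels only: one erroneous query (index 7) is missed.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
print(tp, fp, fn, tn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    In this example a missed erroneous query lowers recall but leaves precision at 1, because no
normal query was flagged.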

   To gain a complete picture it is also necessary to consider the results of cross-validation, which are:
precision 100.00%, recall 99.94%, roc_auc 99.99%.


   Based on the cross-validation results, it can be concluded that the model performed satisfactorily
during validation on the training data.
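
    The splitting scheme behind cross-validation can be sketched as below. The paper does not state
the number of folds or the splitting strategy, so the plain k-fold split and the name k_fold_indices
are assumptions made for illustration only.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(n_samples=10, n_folds=3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

    The model is trained on train_idx and scored on test_idx in each fold, and the metrics are then
averaged. Because abnormal events are rare, shuffling or stratifying the records before splitting
would likely be needed in practice so that every fold contains some of them.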
   Intelligent sensor testing
   Test data are used to validate the developed and trained computer attack sensor model. These are
MCS traffic dumps that correspond, in general, to the normal operation of the system, with some
sporadic abnormal events, for example wrong register numbers or register values beyond the normal
operating ranges.
   The files dumps/dump2_testV1_TRM12_CTW and dumps/dump2_testV2_TRM12_CTW are used for this
purpose. Steps 1 and 2 of the data preparation algorithm are repeated for the test data. After that,
the metric values and the error matrix are calculated; they are shown in Figure 4.


Figure 4: Metric values and Error matrix

   The result of applying the method is positive: all 10 abnormal objects in the test data were
identified as erroneous queries. The error matrix in Figure 4 confirms the correctness of the
results.

4. Conclusion
   The proposed information support and software for the intelligent sensor of ITI on the
communication subsystem of MCS make it possible to solve the important task of identifying abnormal
events both at the field level of sensors and at the programmable logic controller (PLC) level.
   The scientific novelty of the proposed methodology lies in the development of the technique for
forming the training sample, the model of the intelligent sensor and its software implementation
based on the XGBoost machine learning method.
   Testing of the software implementation of the smart sensor model has shown a high level of
reliability in identifying anomalies in MCS traffic. However, further research may be needed into
the development of MCS specifications and requirements, so that they support the effective formation
of training samples for different levels of MCS complexity.

5. References
[1] Smart Environments, Smart Systems, Smart Industries: Series of Reports (Green Book Series)
    within the framework of the Russian Federation Industrial and Technological Foresight Project /
    Author's Team; Centre for Strategic Research North-West Foundation. - St. Petersburg, 2012. -
    Vol. 4. - 62 p.
[2] Sandaruwan, G.P.H., Ranaweera, P.S. and Oleshchuk, V.A. (2013) PLC Security and Critical
    Infrastructure Protection. 2013 IEEE 8th International Conference on Industrial and Information
    Systems, Peradeniya, 17-20 December 2013, 81-85.
    https://doi.org/10.1109/ICIInfS.2013.6731959
[3] Wang, Y., Liu, J.Y., Yang, C., Zhou, L., Li, S.F. and Xu, Z.Y. (2018) Access Control Attacks on
     PLC Vulnerabilities. Journal of Computer and Communications, 6, 311-325.
     https://doi.org/10.4236/jcc.2018.611028
[4] Wei Dong Application Analysis of PLC Technology in Electrical Automatic Control 2020 J. Phys.:
     Conf. Ser. 1533 022012
[5] Tianyou Chang et al Constructing PLC Binary Program Model for Detection Purposes 2018 J.
     Phys.: Conf. Ser. 1087 022022
[6] Mclaughlin S, Mcdaniel P. SABOT:specification-based payload generation for programmable
     logic controllers[C]// ACM Conference on Computer and Communications Security. ACM,
     2012:439-449.
[7] J Klick, S Lau, D Marzin, J Malchow, V Roth. Internet-facing PLCs - A New Back
     Orifice[C]. blackhat, 2015.
[8] R Spenneberg, M Brüggemann, H Schwartke. PLC-Blaster: A Worm Living Solely in the
     PLC[C]. blackhat, 2016.
[9] Darvas D, Adiego B F, Viñuela E B. Transforming PLC programs into formal models for
     verification purposes [J]. 2013.
[10] Darvas D, Adiego B F, Vörös A, et al. Formal verification of complex properties on PLC
     programs[M]// Formal Techniques for Distributed Objects, Components, and Systems. 2016:284-
     299.
[11] Henry Hui, Kieran McLaughlin, Sakir Sezer. Vulnerability analysis of S7 PLCs: Manipulating the
     security mechanism. International Journal of Critical Infrastructure Protection. Volume 35,
     December 2021, 100470. https://doi.org/10.1016/j.ijcip.2021.100470
[12] Sosnovsky Y.V., Klimov S.M. Methodology of microprocessor control systems security
     assessment under information-technical impacts // Reliability No. 4 2018. Pp. 36-44.
[13] Sosnovsky Y.V., Klimov S.M., Milyukov V.V. The method of multiversion analysis of the security
     of the LSG from the effects of network attacks Proceedings of the 4th Central Research Institute
     of the Ministry of Defense of Russia, issue No. 150, volume 1, Part 1, Article No. 4, pp.21-26
     Korolev 2019.
[14] Haolan Wu et al Research on Programmable Logic Controller Security 2019 IOP Conf. Ser.:
     Mater. Sci. Eng. 569 042031
[15] Klimov S.M., Astrakhov A.V., Sychev M.P. Methodological basis for counteraction to computer
     attacks. Electronic educational edition. - Moscow: Bauman Moscow State Technical University,
     110 p., 2013.



</pre>