Implementation of Deep Learning Methods in the Tasks of Ensuring Information and Psychological Safety for Operators of Automated Railway Traffic Control Systems

Konstantin O. Gnidko
MSA named after A.F. Mozhaysky
St. Petersburg, Zhdanovskaya St., 13, 197198

Mikhail A. Eremeev
MSA named after A.F. Mozhaysky
St. Petersburg, Zhdanovskaya St., 13, 197198

Abstract

This paper discusses the application of deep learning methods to highlight complex patterns in the feature space of recorded parameters of the information environment and observed psychophysiological parameters, in order to identify potentially dangerous negative effects on operators of automated railway traffic control systems. The results of applying multivariate analysis to experimental data, in order to highlight the most informative features and subsequently train the neural network, are described. The structure of a convolutional neural network is presented which is potentially capable, within the deep learning paradigm, of generalizing at different levels of the hierarchy the poorly formalized features that distinguish harmful media content that can affect the psyche and physiological state of users.

1 Introduction

Being a powerful tool for understanding and transforming the world and man himself, information technology has at the same time become a serious threat. The ubiquity of computer networks and mass communication media has multiplied the possibilities of remote influence on the human psyche. This has led to the emergence of a new class of security threats: harmful informational and psychological impact on the consciousness and subconscious of the personnel of automated control systems, including those in railway traffic. Being critical for the country's economy, the railway system is a very probable target for terrorist acts and other subversive activities of destructive forces interested in destabilizing the socio-economic and socio-political situation. This makes the task of ensuring the safety of all elements of the transport system, especially its control bodies, extremely urgent.

Traditionally, information security is understood as the security of information systems and of the economic and legal structures of states, and their invulnerability to an opponent's information impact. Behind these diverse objects, however, the human personality is usually lost. In an era of extremely rapid development of telecommunication systems, the person must become a subject of close attention and protection from possible threats, including informational and psychological ones. Thus, not only the problem of information security but also the problem of protection from information has recently acquired an international scale and a strategic nature.

2 Threats of potentially harmful multimedia content

Based on previous research, e.g. [Gni2015] and [Ost2019], we can enumerate the classes of potentially harmful multimedia content as follows.

Texts. The most significant parameters determining the effect of a text on the psychophysiological state of a person include: phonosemantic characteristics that are not reducible to the semantic content of the text; sentiment properties of the text; fractal properties that determine the measure of the text's suggestiveness.

Video streams. The visual analyzer, through which about 90% of all processed information arrives, is the most important one for the activity of a human-machine system operator. Vision makes it possible to perceive the shape, color, brightness and movement of objects. The possibility of visual perception is determined by the energy, spatial, temporal and informational properties of the signals received by the operator. The combination of these properties and the dynamics of their changes over time (the structure of the video stream) determine the information that the visual signal transfers into the conscious and unconscious space of the operator. The incident in Japan that led to the hospitalization of about 700 children after watching the 'Pokemon' cartoon series on December 16, 1997 is well known. It was caused by flashing red and blue color spots lasting about 10 seconds, which resonated with the main frequencies of brain activity. It has been shown experimentally that visual inserts as well as light frequency stimulation can affect the psychophysiological state and subconscious of a person. In this regard, the following types of artefacts in the video stream are subject to filtering: hidden frames; part-frame incuts; brightness fluctuations (flicker) in the range of biologically sensitive frequencies.

Audio streams. Malicious content to be filtered in audio streams includes: audio suggestion (from Latin 'suggestio'), i.e. the process and result of reproducing and perceiving special audio information that significantly lowers the user's threshold of critical perception and changes his emotional and psychophysiological state; harmful binaural rhythms in the range of biologically sensitive frequencies (the so-called 'digital drugs'); hidden subliminal incuts in the audio stream.

The types of potentially dangerous impacts that can be embedded in multimedia content, along with the relevant features and a summary of detection methods, are presented in Table 1.

Table 1: Types of potentially dangerous impacts that can be embedded in multimedia content

Text

  Type of impact: Suggestive
  Indicator: Stable rhythm and self-similarity of separate text fragments at the phonetic and lexicographical levels.
  Method of detection: Conversion of the text to a series of integers with subsequent calculation of the Hurst index by the rescaled range method (R/S analysis).

  Type of impact: Emotional
  Indicator: a) Prevalence in the text of letters (sounds) with a negative phonosemantic coloring. b) The presence in the text of a large number of emotionally strong words and markers ('emoticons', exclamation marks, etc.).
  Method of detection: a) Calculation of the phonosemantic evaluation (according to A. Zhuravlev). b) Assessment of the text tonality by sentiment analysis [The2017].

  Type of impact: Laboriousness of decoding
  Indicator: Significant deviation of the statistical indicators of the text from the evolution model of speech code.
  Method of detection: Calculation of the static and dynamic laboriousness-of-decoding estimates in the paradigm of the evolutionary speech code.

Video

  Type of impact: Flicker in the range of biologically sensitive frequencies
  Indicator: Periodic change in the brightness of the light flux affecting the visual analyzer, with a frequency close to the natural frequencies of brain functioning.
  Method of detection: Calculation of the integral brightness of pixels for each frame of the analyzed video sequence; conversion of the integral brightness of the frame sequence to an amplitude-frequency representation through the fast Fourier transform; comparison of the obtained frequency response with the parameters of the psycho-visual model.

  Type of impact: Hidden frames ('25-th frames')
  Indicator: The presence in the video stream of one or two consecutive frames, shown for a total of no more than 113 ms, that differ significantly in content from both the previous and the subsequent frames.
  Method of detection: a) Calculation of the integral brightness of frame differences and detection of incuts by 'Λ'- and 'ΛΛ'-shaped patterns on the graph. b) Detection of frame incuts based on perceptual hashing.

  Type of impact: Part-frame incuts
  Indicator: The presence in the video sequence of consecutive frames that differ by a short-term (no more than 113 ms) demonstration of separate image fragments (symbols, faces, etc.).
  Method of detection: Calculation of frame differences for all frames of the video sequence; localization of frames suspected of containing a dissonant insert by a burst on the chart of detail wavelet coefficients.

Audio

  Type of impact: Audio suggestion
  Indicator: Stable rhythm and self-similarity of the audio stream at different levels of scaling.
  Method of detection: Low-pass filtering of the signal based on a discrete wavelet transform, with subsequent calculation of the fractal Hurst index from the obtained approximation wavelet coefficients.

  Type of impact: Binaural beats ('digital drugs')
  Indicator: Long (more than 10 seconds) transmission on the right and left stereo channels of audio streams whose main frequencies differ in absolute value by an amount lying in the bio-efficient range of 0-25 Hz.
  Method of detection: Fourier transform of the right and left stereo channels of the audio signal; calculation of the absolute difference in main frequencies; comparison of the obtained value with the range of human biologically sensitive frequencies.

  Type of impact: Hidden subliminal incuts in the audio stream
  Indicator: The presence in the audio stream of imperceptible, artificially inserted audio objects (commands) of short duration that are discordant with the background.
  Method of detection: Pre-filtering of the audio stream by means of a discrete wavelet transform, followed by recognition of anomalies in the detail wavelet coefficients.

Despite the availability of particular methods for recognizing potentially harmful objects in multimedia data streams, analysis showed that there are currently no effective procedures for detecting dangerous informational and psychological states of automated control system operators from the output signals available for monitoring, given the partial observability of the internal states of the complex 'subject – media' system and the heterogeneity and fuzziness of complexly structured psychophysiological data. The presence of an anthropogenic factor moves such systems into the class of poorly formalized ones. Systems of this type function under conditions of uncertainty, characterized by a lack of the information about informative indicators of threatened states that is needed to formalize the processes taking place in them. On the one hand, the uncertainty is caused by the insufficiency or [...] patterns of mental processes due to their complexity and concealment from the observer.

3 Deep learning in the tasks of ensuring information and psychological safety for operators of automated systems

Taking the above into consideration, we have proposed an approach that combines non-invasive monitoring of the psychophysiological parameters of an automated system operator with deep learning technology to automatically identify complex (deep) patterns that correspond to potentially dangerous states of the human psyche.

The experimental part of the study consisted of presenting highly emotionally significant visual stimuli (positive, negative and neutral) to volunteers while recording their oculometry data. The experimental data were obtained with a precise Gazepoint oculograph [Gaz2015]. In particular, it made it possible to record such psychophysiological indicators of the subjects' reactions to the presented stimuli as the directions and lengths of saccades (fast, strictly coordinated eye movements occurring simultaneously and in one direction), the coordinates of the eye fixation points on the monitor, the number and duration of blinks, the diameter of the pupils and a number of others. In total, 25 indicators were included in the a priori feature set. The number of records in the experimental database (after preliminary filtering of invalid values) exceeded 300,000.

To highlight the most informative features and reduce the dimension of the feature space, principal component analysis (PCA) was used. It allowed us to switch from the original correlated features to a strictly orthogonal basis of linearly independent factors (the principal components, which maximize the variance of the source data) and to discard redundant, uninformative features. The result of this transformation is shown in Figure 1.

Figure 1: Pareto diagram (a) and representation of raw oculogram data in the space of the first two principal components (b)

The Pareto diagram in Figure 1(a) shows that 95% of the initial variance is explained by the first two principal components alone, and Figure 1(b) shows the mapping of the raw experimental data into the new basis of the first two principal components. The primary features that contribute most to these components are also shown in Figure 1(b): the saccade magnitudes (SACCADE MAG) and their angular directions (SACCADE DIR). For convenience of further processing, they are presented in the polar coordinate system (Figure 2(a)), as well as in the form of a polar histogram (Figure 2(b)).

Figure 2: Representation of the direction and length of saccades in polar coordinates (a) and in the form of a polar histogram (b)

Although the experimental results agree completely with the expected oculomotor activity of subjects viewing images on a monitor screen, analysis of the data showed that the first two principal components alone did not make it possible to distinguish potentially harmful multimedia from neutral content, which would lead to significant difficulties in training the network classifier. One of the main reasons for this result was a large discrepancy in the scales of the analyzed raw features. For example, the direction of a saccade ranges from 0 to 360 degrees, while the pupil diameter varies from 1 to 2 mm. To avoid these scale effects, each value x was normalized as x' = (x − μ) / σ, where μ is the mean of the corresponding data column and σ is the standard deviation for the same column. After the normalization, the principal component method was applied again (Figure 3).

Figure 3: Pareto diagram (a) and representation of raw oculogram data in the space of the first 3 principal components (b) after normalization

The Pareto diagram shows that the number of principal components (and of the corresponding features) in this case increased to 10 (Figure 3(a)), and the visualization of the data in three-dimensional space shows a significant contribution to the initial variance of such primary parameters as saccades, the coordinates of the viewpoints (BPOGX and BPOGY), and the diameters of the left and right pupils (LPMM and RPMM). The informative features identified in this way form a working dictionary for implementing deep learning methods.

The next task is the choice of the deep learning paradigm and of the model hyperparameters. This problem is also weakly formalizable, has no strict solution, and largely depends on the experience and intuition of the researcher. Given the exceptional potential variety and complexity of visual stimuli that can contain negative content, it seems advisable to use the mathematical apparatus of convolutional neural networks (CNN), which was originally developed as a formal analogue of the visual cortex of the brain and has proven itself in image classification problems [San2019]. A general view and an enlarged fragment of the developed classifying convolutional neural network, which contains 144 layers and 168 connections, are shown in Figure 4.

A distinguishing feature of the developed neural network is the use of the Leaky ReLU activation function in the linear rectification units to avoid the problem of overfitting of individual layers and of the neural network as a whole. The procedure for training of the resulting classifier and the results of evaluating its quality are beyond the scope of this article and will be examined in detail in our subsequent works.
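The effect of feature scale on PCA described in this section can be illustrated with a minimal sketch. It is not the authors' processing pipeline: the data are synthetic stand-ins (independent uniform draws over the ranges quoted above), only two features are used so that the eigenvalues of the covariance matrix have a closed form, the normalization is assumed to be the usual z-score (x − μ) / σ, and all function and variable names are hypothetical.

```python
import math
import random
import statistics


def standardize(column):
    """Return the z-scores (x - mu) / sigma of one data column."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)
    return [(x - mu) / sigma for x in column]


def covariance(xs, ys):
    """Population covariance of two equally long columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def explained_variance_2d(xs, ys):
    """Share of total variance carried by each of the two principal components.

    For two features the covariance matrix [[a, b], [b, c]] has the
    closed-form eigenvalues (a + c) / 2 +/- sqrt(((a - c) / 2)^2 + b^2).
    """
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    half_trace = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [half_trace + radius, half_trace - radius]
    return [ev / (a + c) for ev in eigenvalues]


# Synthetic stand-in data (assumption): independent uniform draws.
rng = random.Random(0)
saccade_dir = [rng.uniform(0.0, 360.0) for _ in range(5000)]  # degrees
pupil_mm = [rng.uniform(1.0, 2.0) for _ in range(5000)]       # millimetres

raw_share = explained_variance_2d(saccade_dir, pupil_mm)
std_share = explained_variance_2d(standardize(saccade_dir),
                                  standardize(pupil_mm))

print("raw:         ", [round(s, 4) for s in raw_share])
print("standardized:", [round(s, 4) for s in std_share])
```

On the raw columns the first component is dominated by the feature with the larger numeric range, whereas after standardization the two features contribute on comparable terms. This is consistent with the growth in the number of relevant components reported after normalization.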

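As a small illustration of the Leaky ReLU activation mentioned for the network in Section 3, the sketch below compares it with the ordinary ReLU. The negative-side slope `alpha = 0.01` is an assumption (the paper does not state the value used), and the function names are hypothetical. Leaky ReLU is commonly motivated by the fact that it keeps a small nonzero gradient for negative inputs, whereas ReLU outputs zero there.

```python
def relu(x):
    """Ordinary ReLU: passes positive inputs, zeroes out the rest."""
    return x if x > 0.0 else 0.0


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: like ReLU, but negative inputs are scaled by alpha."""
    return x if x > 0.0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Derivative of Leaky ReLU: 1 for positive inputs, alpha otherwise."""
    return 1.0 if x > 0.0 else alpha


# Compare both activations on a few sample pre-activation values.
for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, relu(value), leaky_relu(value), leaky_relu_grad(value))
```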
4 Summary

This work presents the results of a study devoted to the problem of protecting operators of automated railway control systems from potentially harmful informational and psychological influences, based on psychophysiological monitoring by means of the Gazepoint software and hardware complex and on deep learning methods. It has been determined that the most informative of all the parameters recorded during the experiment were the coordinates of the gaze fixation points, the pupil diameters of the subject and the characteristics of saccades. A specific model of a convolutional neural network has been proposed which, using the listed features as input values, can be trained to detect graphic content that can harm the psyche and physiological state of automated systems operators. To reduce the probability of the CNN overfitting, the most commonly used ReLU activation function is partially replaced with Leaky ReLU. Testing the quality of the developed convolutional neural network is the goal of future research.

4.1 Acknowledgements

The study was carried out with the financial support of the Russian Foundation for Basic Research, project № 18-29-22064\18.