Determining Key Risks for Modern Distributed Information Systems Dmytro Palko, Hrygorii Hnatienko, Tetiana Babenko and Andrii Bigdan Taras Shevchenko National University of Kyiv 64/13, Volodymyrska Street, Kyiv, 01601, Ukraine Abstract This work aims to study the problem of identifying and assessing information security risks in complex, distributed, and scalable information systems, as well as building a profile of key risk factors that can cause potential information security incidents in the physical and functional allocation of resources. As part of this work, a study was carried out of the main information security risks that can be identified at the time of creating and operating a typical distributed information system designed to support information processes and provide information services. The result of the study is the ranking of major risk factors according to their importance and frequency in practice, as well as highlighting the most significant security controls. The data for analysis was compiled based on the results of interviews and questionnaires of information security specialists with different training levels and different focuses in their activities within this knowledge area. The paper presents summarized information on classical approaches to information security risks modeling based on quantitative, qualitative, and hybrid analysis, as well as the latest methodologies based on solving the problems of intelligent classification and analysis of data on risk factors in the system distribution, and in operation with large data sets. Keywords1 Information security risk, distributed information systems, risk factors, security controls, risk control techniques, risk modeling, risk management, risk assessment, intelligent risk assessment models 1. Introduction Today, information security management plays a key role in the life processes of almost any organization that uses modern technologies for collecting, processing, and storing information. This process is based on the regular assessment of information risks, which allows you to timely identify new threats and vulnerabilities, implement appropriate measures to neutralize them, and continuously monitor the state of information security of the system, considering the previous experience and new factors. To prepare for potential attacks and possible problems of this nature, as well as to prevent disruptions to business processes and operations, damage to reputation, or loss of data, organizations must constantly assess their risk profile, make recommended corrections, and actively improve their security system. Threat analysis and risk management are the cornerstones of any security policy. Cybersecurity risks should be considered as a key factor in the strategic planning of business processes. That is why it is the responsibility of each company to develop a risk assessment methodology that best suits the organization's priorities and business goals. The importance of risk management as a process in modern reality is undeniable. The modeling and forecasting information security risks task has been and remains a significant and priority. This issue is especially relevant in the context of the widespread of complex multi-component information systems II International Scientific Symposium «Intelligent Solutions» IntSol-2021, September 28–30, 2021, Kyiv-Uzhhorod, Ukraine EMAIL: palko.dmytro@gmail.com (A. 1); g.gna5@ukr.net (A. 2); babenkot@ua.fm (A. 3); abigdan@gmail.com (A. 4) ORCID: 0000-0002-2886-1975 (A. 1); 0000-0002-0465-5018 (A. 2); 0000-0003-1184-9483 (A. 3); 0000-0002-2940-6085 (A. 4) ©️ 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) 81 that have a distributed nature and contain a large number of nodes. The total number of computer systems, active network equipment, and peripherals installed in any infrastructure is growing at a phenomenal rate. The relative simplicity of networking is a compelling reason to interconnect computer systems, share functions, and share resources. This approach allows better use of a computing power vast set that is currently available, but on the other hand, raises some issues primarily related to the complexity of security and potential risk management. The transition to complex, large-scale, and structurally complex information systems increases the likelihood of unforeseen and unplanned events affecting the performance and operability of the system as a whole. 2. Literature Review and Problem Statement 2.1. Distributed Information Systems overview The widespread introduction of distributed information systems (DIS) today is representative of almost all areas of human activity, where they are entrusted with solving more and more important tasks. The efficiency of decision-making and the efficiency of the functioning of many economic, social, political, and military structures depend on the quality of DIS functioning. Distributed information systems are complex technical systems consisting of many structural elements, functionally combined to provide one or more types of information processes and the provision of information services. Such systems typically operate under random factors, the presence of negative influences of various natures, active interaction with the external environment, and the high cost of impacts of possible violations or malfunctions. All this causes many problems related primarily to information security. Managing the cybersecurity risk assessment in distributed systems involves solving a set of problems related to functional distribution and hierarchy, a high degree of resources parallelization, and a near-complete lack of centralized management. On the way, there are difficulties to the analysis of heterogeneous data, the need to reconcile information obtained from different sources, the variability of distributed metrics, which requires a wide arsenal of tools for analytical processing and intelligent data processing of different nature, the problem of incomplete information about the components of a distributed system and the complexity of integrated multifactor analysis in general. There is a need, on the one hand, for a set of methods and tools that can eliminate these obstacles, and on the other hand, for a new approach to organizing research on information security risks in a distributed environment and performing comprehensive analytical processing of distributed data of various natures. Therefore, the implementation of a new approach to managing risk assessment in DIS involves the introduction of a comprehensive solution that integrates data obtained from different sources and a wide range of tools for their analysis [1]. Several international organizations and leading universities are engaged in research on this issue. The key standards in this area that need to be relied on are ISO / IEC 27001: 2013, NIST SP 800-30, and BS 7799-3: 2017. However, despite significant achievements, there is currently no single system vision of all aspects of the problem, the nature, and features of research tools, and its place in the process of multifactor risk analysis of a distributed system, considering the entire complex of interrelations and mutual influence of the processes associated with it. The different degree of depth of elaboration of certain aspects of this problem has led to the need for effective models and methods of reconciliation and analytical processing of heterogeneous data for rapid analysis of the current state of information security of a distributed system. 2.2. Risk Management Process in Distributed Information Systems Information security risk assessment is an extremely important part of a company's data protection strategy. It is conducted out to support decision-making and immediate response to identified threats (risk response). Information security risk analysis allows you to determine the necessary and sufficient set of information security tools, regulatory and organizational mechanisms to reduce information security risks, allowing to ensure the process of building the most effective information security management system architecture for a given organization [2]. 82 Risk management is an iterative process of identifying, quantifying, analyzing, and managing the risks faced by an organization [3]. Risk management is designed to ensure a stable operation of the information system and minimize possible losses in the event of information security threats. As an integral part of management practice, risk management should be carried out regularly to support organizational improvements, improve existing security tools and mechanisms, improve efficiency and make management decisions [4]. The main risks are those risks that have a high likelihood of occurrence and, if implemented, provide the possibility of a significant impact on operational performance, achievement of the goals and objectives of the project, or may damage reputation [5]. In terms of systemic distribution, risk management should provide for a complex nature, and consider the risk assessment for each asset or subsystem. The specificity of the architecture of distributed information systems involves the analysis of data, largely differentiated in their structure, and the use of all available tools of assessment methods (both quantitative and qualitative) that characterize various components of the studied environment. The poor structure of the tasks of such research resulted from the lack of formal models and obtaining objective measurements results together with subjective expert assessments. Thus, the risk management process in distributed information systems is a sophisticated and rather integrated task. The study of risk factors in a distributed environment deserves special attention. According to ISACA's annual STATE OF ENTERPRISE RISK MANAGEMENT 2020 survey [6], the biggest challenges in corporate risk are factors related to the emergence of new threats, changes/advances in technology development, as well as weak human resources and lack of necessary skills and experience of specialists and existing cybersecurity teams (Figure 1). Figure 1: Сybersecurity challenges today On the other hand, according to this study, the most frequently used control to prevent/mitigate potential security concerns is to raise awareness and conduct training on cybersecurity among staff (Figure 2). 83 Figure 2: Top mitigation controls Eighty percent of enterprise respondents provide awareness-raising training, 68 percent use disaster/incident recovery strategies, and 67 percent use general information security control and management. Less than half of the responding businesses use insurance as a mitigation control; at the same time, the largest supporters of this approach are companies in North America and Africa [6]. 2.3. Main Approaches to Risk Assessment There are many different methods for analyzing information risks for distributed systems. Their main differences are the approaches and the scales being used for assessing the risk level: quantitative or qualitative. Conventionally, among the methods of risk assessment, the following three groups can be distinguished: 1. Statistical methods 2. Methods of expert assessments 3. Modeling methods 2.3.1. Statistical Risk Assessment Methods To assess the information security risks, a qualitative, quantitative, or combined approach can be used. In quantitative methods, the risk is assessed in the form of numerical values. Accumulated statistical information on incidents and violations, as well as meta-information about the current state and configuration of the node components of the distributed system, are usually used as input data for the assessment. However, the frequent lack of sufficient statistics leads to a decrease in the adequacy of the assessment results. Other limitations are complexity, high labor intensity, and long execution time, especially in the terms of the analysis of distributed systems. The advantages of the quantitative approach include the accuracy of risk assessment, clarity of results, and the ability to compare the risk value, expressed in financial equivalent with the investment amount required to respond to this risk. Qualitative methods are more common, but they use too simplified scales, usually containing three levels of risk assessment (high, medium, low). The assessment is based on expert surveys, and promising intellectual methods are still insufficiently applied. Other disadvantages are the lack of visibility and complexity of using the results of risk analysis for economic justification and assessing 84 the feasibility of investing in risk response measures. The advantage of a qualitative approach is its simplicity and minimization of the time and labor costs for conducting a risk assessment [7]. The combined approach involves a combination of both methods to apply the benefits of each. According to "The Marsh Microsoft 2019 Global Cyber Risk Perception Survey" (September 2019), the popularity of a quantitative approach to assessing information security risks has increased significantly compared to 2017, but it remains low (Figure 3) [8]. Figure 3: Approaches to assessing information security risks following "The Marsh Microsoft 2019 Global Cyber Risk Perception Survey" Thus, today most companies use a qualitative scale to assess information security risks. 2.3.2. Statistical Risk Assessment Methods If it is impossible to use statistical methods for analyzing the risks of a distributed system (lack of information on risk factors, insufficient data sampling size, complexity and sophistication of infrastructure, etc.), you should refer to expert assessment methods. The essence of the method of expert assessments is to conduct an expert analysis of the problem using qualitative and quantitative assessment of hypotheses and further processing of the results. This method is simple and accessible for practical application but requires a significant level of competence and extensive practical experience from the expert. When using expert methods, the risk level is assessed based on the analysis of the probability of an adverse event occurring by studying and assessing the factors affecting it. Thus, the practical application 85 of this method is to establish a list of factors that determine a particular type of risk, as well as to determine the relationship between the nature of the factor and the risk level that this factor causes. For the objectivity and impartiality of the results, the work on identifying and assessing information security risks should be carried out by special experts or relevant expert groups who have the necessary experience and training on this matter issue [9]. 2.3.3. Modeling Methods The most effective methods for analyzing information security risks in distributed systems are modeling methods [10], among which there are neural networks that can identify and adequately assess information security risk relying on data mining tools. The need to extract unknown, non-trivial, practical, and useful knowledge from the "raw" metadata about the operation of a distributed system, which can be interpreted in a certain way and used to make decisions about the risk level, gives this problem a non-trivial interpretation. Artificial Intelligence (AI) is not a new concept, but only in recent years, various companies have begun to explore and understand its full potential. Intelligent systems play an increasingly important role in network management. Most research in intrusion detection and risk assessment systems heavily relies on AI techniques to design, implement, and improve security monitoring systems. Recent malware updates and improvements in cyberattacks are difficult to detect with traditional cybersecurity techniques. An important advantage of neural networks is their ability to "learn" the characteristics of the input data and identify elements that are not similar to those previously observed in the system. New AI algorithms use machine learning to quickly adapt and analyze new data, improve results and identify new vectors of risk implementation. Most modern methods of attack detection and risk assessment leverage some form of rule-based analysis or a statistical approach. The analysis relies on a set of predefined rules created by the administrator or by the security system itself. Unlike expert systems, which can give the user a definite answer about the compliance of the considered characteristics embedded in the knowledge base rules, the neural network analyzes the information and provides an opportunity to assess, reconcile the data with the characteristics it is trained to recognize [11]. Thus, forecasting and modeling the level of risk, coordination, and intelligent processing of various nature data about risk factors and creating based on their analysis a comprehensive approach to risk assessment in distributed information systems is a priority research area today. 3. The Research Methodology The purpose of this study is to highlight the key risk factors inherent in modern distributed information systems, analyze the most significant security controls for development based on their recommendations to eliminate potential threats. 3.1. The Data Collection Process for Analysis A questionnaire method was used to collect a sample of test data for analysis. The respondents were several dozen information security engineers of various training levels, penetration testing and auditing specialists, and leading specialists in the field of information security project management. All interviewees were selected selectively and have experience in providing security for information system infrastructures of various sizes and scales. The study involved two data collection processes for analysis. The first is a pilot survey to test the research instruments and adjust the questions, the second is a mass survey of the target group using the final version of the questionnaires. The pilot study was performed before the main questionnaire and aimed to test whether the proposed model of the questionnaire is suitable for the analysis of the final metrics. 86 23 specialists (Table 1) were involved in the main survey. Age of survey participants from 24 to 47 years, with an average of 34.2. 3.2. Questionnaire Development The development of questionnaires considers the most common risk factors that are common to most modern distributed infrastructures. The questionnaire included 40 questions on the main risk factors and 14 questions on the practice of applying security controls in real projects. These indicators were identified at the stage of analysis of literature sources in this subject area and during the pilot survey. The respondents were asked to answer these questions anonymously and subjectively, relying on their own experience and the real practice of working with distributed information systems. IBM SPSS Statistics software toolkit was used for data analysis and modeling as it is widely used for statistical analysis by market researchers, health researchers, survey companies, government, education researchers, marketing organizations, data miners, and others. The research findings are analyzed and discussed in the following sections. Table 1 The Survey Participants Classification 1 2 3 4 № Categories Number of respondents Percentage (%) 1. Gender 1.1. Female 6 26.08 1.2. Male 17 73.91 Total 23 100.00 2. Position 2.1. Information Security Engineer 6 26.08 2.2. Information Security Auditor 2 8.69 2.3. Penetration Tester 5 21.73 2.4. Malware Analysts 2 8.69 2.5. Infrastructure Engineer2 5 21.73 2.6. Project manager 3 13.04 Total 23 100.00 The share of female respondents among the interviewers is only 26 percent. 3.3. Research Criteria and Analysis of the Test Sample The respondents were asked various questions that used scales from 1 to 5. To increase efficiency and narrow the gradation of possible results for assessing risk factors at work, a 5-point scale was chosen, in which the indicator “does not matter” is equal to one, and “extremely important” is equal to five. Likewise, there are five categories for assessing security controls so that the “never” indicator is one and the “always” indicator is five. Thus, all questions about risk factors in distributed systems were measured on a five-point Likert scale from “nonsignificant” to “most important”, and all security controls – from "never" to “always”. The Likert scale is quite easy to build, it provides relative reliability even with a small number of judgments, and the data obtained is easy to process. The selection of judgments for the scale was carried out based on an analysis of literary sources in a given subject area and during pilot research by the method of selection from an initial list of judgments with the most discriminatory ability to the measured attitude. For this purpose, an initial list of statements was created (Table 2) that were offered to respondents from a group representative of the target audience (participants in the pilot study). 2 With a background in the field of cybersecurity 87 Table 2 Risk Factors and Security Controls Measures Scale Scale Risk factors Security controls 1. Unimportant Never 2. Slightly Important Seldom 3. Important Sometimes 4. Very Important Often 5. Critical Always When working with the scale, the respondents rated the degree of their agreement or disagreement with each of the proposed judgments, from “completely agree” to “completely disagree”. 3.4. Key Risk Factors The study demonstrates 40 main risk factors in modern distributed information systems (Appendix 1, Table A1), labeled from Factor_1 to Factor_40, which are quite common in the relevant literature, are often found in practice, and are widely used by researchers and experts in cybersecurity when studying risk factors and conducting risk management measures. These factors should be identified in the process of assessing and managing information security risks and monitored in the future. Separately, the risks caused by a human factor should be highlighted. They include not only employee mistakes, but also intentional actions that lead to violating information confidentiality. Referring to the NIST SP 800-37 Risk Management Framework, should not forget about such categories shown in Table 3. Table 3 Risk Categories № Category 1. Financial risks 2. Legal risks 3. Business risks 4. Political risks 5. Software risks 6. Risks of non-compliance with legislation 7. Security and confidentiality risks 8. Project risks 9. Reputational risks 10. Risks of life safety 11. Risks of strategic planning They were not considered in this study, however, constitute an important part of any risk management process [12]. 3.5. Key Security Controls and Risk Management Measures As a result of the analysis of the above statistical data, the expert group proposed possible categories of actions to minimize information security risks, including organizational and legal protection of information, engineering, hardware and software protection, cryptographic mechanisms for protecting information [13], as well as institutional arrangements and physical protection measures. As effective controls to ensure the security of distributed systems by trained full-time specialists or with the help of information security outsourcing, the following solutions (both separately and in aggregate) can be implemented, shown in Table 4. The ISO 27001 standard and its Appendix A are 88 important tools for information security management [14] and it was a ground for developing questionnaires on possible control and risk management measures. It contains a list of security measures that must be applied to improve information security and consists of 114 security controls, divided into 14 chapters. Not all of these controls are mandatory for implementation – the company can choose on its own, it considers the controls applicable in the given circumstances and depending on the business direction, infrastructure state, or the existing profile of external threats, and then implement them (usually at least 90 % controls). A more detailed description of each control in Appendix A with an explanation of how it should be applied is presented in the ISO 27002 standard. However, the latter does not provide any explanations and tips on how to choose control in a given situation, which controls to implement, how to measure them and how to distribute duties [15]. Table 4 Security Systems for Implementing Technical Security Controls 1 2 № Protection system 1. Backup and restore systems 2. Protection system against unauthorized access 3. Network shielding systems 4. Protection systems against attacks at the application level (WAF) 5. Incident and event management systems (SIEM) 6. Identity and Access Management systems (IAM) 7. Security and confidentiality risks 8. Management systems for compliance with information security requirements (Compliance Management) 9. Data leak prevention systems (DLP) 10. Information right management systems (IRM) 11. Solutions for network security 12. Anti-virus protection systems 13. E-mail protection systems 14. Content filtering systems for web traffic 15. Access control systems to peripheral devices and applications 16. Systems for monitoring the integrity of software environments 17. Cryptographic protection for stored information Separately, it should be noted the international standard ISO/IEC 27005: 2018 “Information technology – Security techniques – Information security risk management”, which contains recommendations for information security risk management. This document supports the general concepts defined in ISO/IEC 27001 and is intended to guide the implementation of information security measures based on a risk-based approach [16]. In the study, it was proposed to evaluate 14 main groups (Appendix 1, Table 6) of information security controls of modern distributed information systems in terms of frequency and effectiveness of their use, labeled from Control_1 to Control_14. Thus, the proposed options cover the entire range of the most common risk management mechanisms and measures used in modern distributed systems. 4. Results and Discussion 4.1. The Importance of Risk Factors in Lifecycle of Modern Distributed Information Systems Table 7 (Appendix 1) shows that nearly all respondents ranked factors related to lack of cybersecurity policy, lack of protection mechanisms against network attacks, violations of 89 authentication and session management, violations of access control, and use of components with known vulnerabilities as the most important. The uncorrected sample standard deviation S is calculated (1) for four groups of factors, each of which contains 10 of them 1 (1) 𝑆 = √𝑛 ∑𝑛𝑖=1(𝑥𝑖 − 𝑥̃ )2 , where {x1, x2, …, xn} are the mean values of the sample items, n = 10 – the size of the sample (number of factors in each group: Factor_1 – Factor_10, Factor_11 – Factor_20, Factor_21 – Factor_30, Factor_31 – Factor40), and 𝑥̃ – the mean value of this assessment (2) 1 (2) 𝑥̃ = 𝑛 ∑𝑛𝑖=1 𝑥𝑖 . The vast majority of technological factors play a key role in determining the final risk level. Analysis and summarization of the survey responses gave the following rating of the importance of the listed risks (in order of importance): Factor_20, Factor_6, Factor_10, Factor_11, Factor_12, Factor_14, Factor_8, Factor_9, Factor_4, Factor_3, Factor_13, Factor_17, Factor_18, Factor_1, Factor_15, Factor_15 , Factor_5, Factor_7, Factor_16. Among organizational factors, the risks associated with the lack of cybersecurity and anti-virus protection policies are the most important. The ranking of the importance of risks in this category (in order of importance): Factor_21, Factor_27, Factor_30, Factor_28, Factor_29, Factor_25, Factor_23, Factor_24, Factor_22, Factor_26. In addition, all respondents noted that the risk of abuse of privileges is the highest risk factor and very important among the factors associated with the human factor. Risk severity rating for this category (in order of importance): Factor_33, Factor_40, Factor_35, Factor_37, Factor_38, Factor_36, Factor_34, Factor_31, Factor_32, Factor_39. In summary, the categories of risk factors can be ranked in order of importance and criticality as follows: logical, physical, human factors, and organizational factors. Table 5 illustrates a list of the top 10 key risk factors for distributed information systems based on a survey of experienced cybersecurity managers and engineers. Table 5 Top 10 Risk Factors for Distributed Information Systems № N Mean Std. Deviation % percent Factor_21 23 4.391304 0.656376 87.8260 Factor_20 23 4.304348 0.634950 86.0869 Factor_6 23 4.217391 0.735868 84.3478 Factor_10 23 4.173913 0.650327 83.4782 Factor_11 23 4.130435 0.625543 82.6087 Factor_12 23 4.086957 0.668312 81.7391 Factor_33 23 4.000000 0.738549 80 Factor_27 23 3.956522 0.824525 79.1304 Factor_14 23 3.913043 0.792754 78.2608 Factor_8 23 3.869565 0.694416 77.3913 4.2. Frequency of Controls Occurrence Table 6 shows the mean and standard deviation for each group of security controls. The results of this study show that most security controls are used frequently and are important mechanisms to prevent and minimize potential risks. 90 4.3. Construct Validity (Risk Factors Correlation) The next step was to test the hypothesis about relationships between key risk factors using correlation coefficients. The correlation coefficient is a statistical indicator of the probability of a relationship between two variables, measured on a quantitative scale, which allows you to answer the question of the degree and direction of the relationship between the values of these variables. To choose the right method of correlation research, it is necessary to answer the question of whether the studied factors are normally distributed. Frequency histograms for key risk factors are presented in Appendix 2. An example of a histogram for factor Fact_21 is shown in the Figure 4. Table 6 The Mean Score for Each Control Factor 1 2 3 4 5 № N Mean Std. Deviation % percent Control_1 23 4.260870 0.619192 85.2174 Control_2 23 3.391304 0.940944 67.82608 Control_3 23 2.347826 1.070628 46.95652 Control_4 23 3.782609 0.735868 75.65218 Control_5 23 4.304348 0.634950 86.08696 Control_6 23 4.478261 0.593109 89.56522 Control_7 23 4.478261 0.665348 89.56522 Control_8 23 4.260870 0.619192 85.21740 Control_9 23 4.130435 0.625543 82.60870 Control_10 23 3.000000 1.044466 60.00000 Control_11 23 2.086957 0.900154 41.73914 Control_12 23 2.130435 0.757049 42.60870 Control_13 23 2.478261 0.845822 49.56522 Control_14 23 2.782609 0.951388 55.65218 Frequency histogram for Fact_21 Figure 4: Example of frequency histogram for risk factor Fact_21 There are two hypotheses (3) for the test:  Null hypothesis (H0): the data comes from the specified distribution.  Alternate Hypothesis (H1): at least one value does not match the specified distribution. 91 That is, (3) 𝐻0 : 𝑃 = 𝑃0 , 𝐻1 : 𝑃 ≠ 𝑃0 , where P is the distribution of our sample and P0 is a normal distribution. Even though the plotted frequency histograms at first glance are quite symmetric and are well described by the parabolic curve for both tests, significance values less than .05 which means that the data do not have a normal distribution (Figure 5). So, the null hypothesis that the data is normally distributed was rejected. Figure 5: Test of Normality for Key Risk Factors Since the volume of the studied sample is small (n<30), all factors are quantitative and the distribution of their values is not normal, it is decided to choose the rank correlation coefficient r-Spearman (4). Deciding on the type of correlation when interpreting the results, it is important to remember and keep in mind that linear correlations are more accurate than rank correlations. Ranking of values when using r-Spearman naturally reduces the degree of individual variability of the measured indicator. 6 ∑ 𝑑2 (4) 𝑖 𝑟 = 1 − 𝑛(𝑛2−1) , where n = 10 – number of factors, di is the difference between the two ranks of each assessment. To assess the feasibility of using the above-described research tools, the correlation coefficients were calculated for key risk factors of modern distributed information systems. The interpretation of the correlation coefficient is based on the level of the bond strength:  0.70 < r ≤ 1.00 – strong positive connection,  0.30 < r ≤ 0.69 – moderate positive connection,  0.01 < r ≤ 0.29 – weak positive connection,  -0.01 > r ≥-0.29 – weak negative connection,  -0.30 > r ≥-0.69 – moderate negative connection,  -0.70 > r ≥-1.00 – strong negative connection. The interpretation of the significance level (p-value) of the correlation coefficient is carried out in the same way as it was done for parametric and nonparametric criteria:  if the p-value ≤ 0.05, the relationship between variables is statistically significant;  if the p-value > 0.05, the relationship between variables is statistically nonsignificant. Also, when interpreting the p-value of the correlation coefficient, it is important not only the fact of significance but also its level. Traditionally, the p-value of correlation is differentiated into three levels: 92  .01 < p ≤ .05 – low statistical significance (one star – *),  .001 < p ≤ .01 – the average strength of statistical significance (two stars – **),  p ≤ .001 – high statistical significance (three stars – ***). Table 10 (Appendix 1) illustrates the relationship between key factors. The correlation analysis revealed a moderate negative relationship of medium statistical significance between factors Factor_20 and Factor_8 – r-Spearman =-0.528 at p ≤ .01, as well as a moderate negative relationship of low statistical significance between factors Factor_14 and Factor_8 – r-Spearman =-0.415 at p ≤ .05. Analyzing the results of correlation analysis, we can conclude that among the studied risk factors there is a moderate positive relationship of low statistical significance for the correlation of variables Factor_10 and Factor_11 – r-Spearman = 0.423 at p ≤ .05. Thus, the obtained results indicate that risk factors are often interrelated and have complex impacts, and therefore require a comprehensive and multidisciplinary analysis, considering all possible factors and conditions. 5. Conclusions Thus, this paper investigates the problem of identifying and assessing information security risks in complex, distributed, and large-scale information systems, and also builds a profile of key risk factors that can cause potential information security incidents in the physical and functional allocation of resources. The study examines the main risks of information security that can be identified during the construction and operation of a typical distributed information system designed to provide one or more types of information processes and provisioning information services. The result was a ranking of the main risk factors according to their importance and frequency in practice, as well as highlighting the most significant security controls. Thus, the results of the study show that all risk factors in the life cycle of a modern distributed system are very important and require detailed analysis and consideration when building a profile of potential threats and assessing information security risks. The importance rating of the risk factors categories by nature can be given as follows (in order of importance): technological factors (logical and physical), human factors, organizational factors. In particular, the study identified ten main risk factors for distributed information systems, which can be displayed as follows (in order of importance and criticality of potential consequences): Factor_21, Factor_20, Factor_6, Factor_10, Factor_11, Factor_12, Factor_33, Factor_27, Factor_14, Factor_8. Nearly all respondents ranked factors related to lack of cybersecurity policy, lack of protection mechanisms against network attacks, violations of authentication and session management, violations of access control, and use of components with known vulnerabilities as the most important. These factors should be identified in the process of assessing and managing information security risks and monitored in the future. Analysis of the most common categories of risk management mechanisms and measures used in modern distributed systems has shown that most protection controls are used frequently and are important mechanisms for preventing and minimizing potential risks. The generalization of the survey responses, according to the main groups of information security controls of modern distributed information systems in terms of frequency and effectiveness of their use, showed that most of the respondents identified controls responsible for the proper and effective use of cryptography and public key infrastructure, logical and physical access control, operational security and compliance with information security policies as important and most common in practice. The results of the study can be used by managers and information security engineers to assess the importance and probability of potential risks and further prevent and minimize their consequences, as well as build tools for identifying and analyzing the risks of distributed systems based on qualitative, quantitative and intelligent methods. 93 6. References [1] Andrew S. Tanenbaum, Maarten Van Steen Distributed Systems: Principles and Paradigms, Prentice Hall of India; 2nd edition (January 1, 2007) [2] Henry K. Risk management and analysis / Kevin Henry // Information Security Management Handbook / Edited by Harold F. Tipton, Micki Krauze. - 6th edition. - Boca Raton: Auerbach Publications, 2017. - Part 1, Section 1.4, Ch. 28. - P. 321-329. [3] Medvedeva, E. Organizatsyia integrirovannogo riskmenedzhmenta v organizatsyi // Vestnik nauki i obrazovaniya. 2020. № 24-4 (78). P. 23-26. [4] Kanatov, M. Expert systems for information security management and audit. Implementation phase issues / M. Kanatov, L. Atymtayeva, B. Yagaliyeva // 2014 Joint 7th International Conference on Soft Computing and Intelligent Systems (SCIS) and 15th International Symposium on Advanced Intelligent Systems (ISIS). – 2014. doi: 10.1109/scis-isis.2014.7044702 [5] Trofymova N. Sovremennye tendentsyi korporativnogo risk-menedzhmenta v sisteme obespecheniya ekonomicheskoi ustoichivosti promyshlennykh predpriyatiy // UPRAVLENIE T. 8 № 2 / 2020. Mezhotraslevoi menedzhment. P. 30-38. [6] State of Enterprise Risk Management 2020 Survey // ISACA, CMMI Institute. - 2019. - https://www.isaca.org/-/media/info/state-of-enterprise-risk-management-survey/index.html [7] Rot A. IT Risk Assessment: Quantitative and Qualitative Approach // Proceedings of the World Congress on Engineering and Computer Science, 2008. - p. 1073-1078. [8] 2019 Global Cyber Risk Perception Survey // Marsh, Microsoft. - 2019. - https://www.microsoft.com/security/blog/wp-content/uploads/2019/ 09/Marsh-Microsoft-2019- Global-Cyber-Risk-Perception-Survey.pdf. [9] Konev I. Informatsyonnaya bezopasnost predpriyatiya. / I. Konev, A. Beliaev - SPb.: BKhVPeterburg, 2003. [10] Chang, L.-Y. Applying fuzzy expert system to information security risk Assessment - A case study on an attendance system [Text] / L.-Y. Chang, Z.-J. Lee // 2013 International Conference on Fuzzy Theory and Its Applications (iFUZZY). - 2013. doi: 10.1109/ifuzzy.2013.6825462 [11] Xin Y. et al. Machine learning and deep learning methods for cybersecurity //IEEE access. – 2018. – Vol. 6. – P. 35365-35381. [12] NIST Special Publication 800-30 Rev A. Risk Management Guide for Information Technology Systems, Gary Stoneburner, Alice Goguen, and Alexis Feringa, July 2002. [13] Palko D., Myrutenko L., Babenko T., Bigdan A.: Model of Information Security Critical Incident Risk Assessment. 2020 IEEE International Conference on Problems of Infocommunications Science and Technology, PIC S and T 2020, 2021, pp. 157–161, 9468107. [14] ISO/IEC 27001:2013. Information technology - Security techniques - Information security management systems - Requirements. 2013. [15] ISO/IEC 27002:2013. Information technology - Security techniques - Code of practice for information security controls. 2013 [16] ISO/IEC 27005:2011. Information technology - Security techniques - Information security risk management. 2011. 94 7. Appendix Appendix 1. Tables data Table A1 Top Security Risks Factors of Modern Distributed Information Systems Based on Researchers 1 2 3 Category № Risk factors Factor_1 Insecure applications use Factor_2 Inadequate patch management Factor_3 API vulnerabilities and breaches Logical (software) Factor_4 Technical flaws and errors during system design Factor_5 Insufficient logging and monitoring Factor_6 Broken authentication and session management Factor_7 Unapproved third-party software use Factor_8 Use of unlicensed software solutions with undeclared capabilities Factor_9 0-day vulnerabilities and errors associated with the development of information technology Technological factors Factor_10 Broken access control3 Factor_11 Using outdated hardware and components with known vulnerabilities Factor_12 Servers and network appliances security misconfiguration Factor_13 Low reliability of the set of hardware and software components, lack of a recovery plan, and periodic backups Physical (hardware) Factor_14 Weak endpoints and network perimeter protection Factor_15 Unmanaged IoT and mobile devices Factor_16 The imperfection of the organizational structure of the information security, the need for frequent reconfiguration of the information security or its individual parts Factor_17 The possibility of information leakage and sensitive data exposure using technical channels Factor_18 Insufficient physical access control Factor_19 Unauthorized use of the organization's assets Factor_20 Lack of protection mechanisms against external network attacks Factor_21 Lack of a cybersecurity policy Organizational factors Factor_22 Non-compliance with the requirements of standards at the stage of design of the system Factor_23 Non-compliance with information security requirements during system exploitation Factor_24 Lack of control over information security incidents Factor_25 Lack of top management commitment support and involvement Factor_26 Lack of security audits 3 Lack of differentiation of user rights and controlled area access 95 Table A1 (continued) 1 2 3 Factor_27 Lack of antivirus protection policy Factor_28 Weak potential to apply existing protection technologies Factor_29 Inconsistencies between the infrastructure and the adopted security measures Factor_30 The inability to provide the proper level of support and comprehensive development of security systems Factor_31 Actions of unreliable employees Factor_32 Unintentional mistakes of service personnel Factor_33 Privilege abuse Factor_34 The essential list of persons with access to protected Human factors information Factor_35 Lack of personnel awareness (especially about phishing/social engineering) Factor_36 Lack of information security training Factor_37 A severe shortage of cybersecurity professionals Factor_38 Insufficient passwords hygiene Factor_39 Personnel access to potentially dangerous objects in the external network Factor_40 Data loss or theft controls lack Table A2 Key Security Controls of Modern Distributed Information Systems 1 2 3 № Security control Description Control_1 Information controls responsible for implementing and verifying security policiescompliance with information security policies Control_2 Organization of controls responsible for the organizational component information of information security measures and the distribution security of responsibilities; creation of a management system for initiating and monitoring the implementation and operation of information security in the organization Control_3 Personnel and controls designed to regulate the work of personnel human resources and contractors, identifying their responsibilities for security information security both at the stage of the working process and upon dismissal Control_4 Asset management controls related to the inventory of company assets, classification of processed information, and media management Control_5 Logical access controls responsible for restricting access to control information and information processing facilities, access control policy, rights management for authorized users to systems and applications 96 Table A2 (continued) 1 2 3 Control_6 Cryptography controls responsible for the proper and effective use of cryptography and public key infrastructure (PKI) to protect the confidentiality, reliability, and integrity of information Control_7 Physical and controls related to the management and prevention of environmental unauthorized physical access, loss, damage, theft or security compromise of assets and interruption of the organization's activities, as well as the definition of safe zones, entry controls, equipment security, “clear desk” and “clear screen” policies Control_8 Operational a set of controls for ensuring the correct and secure security work of processing information means that combines such activity as change management, backup, monitoring, logging and activity logs management, tracking the installed software and detecting malicious software, monitoring and eliminating identified vulnerabilities Control_9 Communications controls related to network security, network services, security information transmission, and messaging Control_10 System acquisition, controls that define security requirements and development, and protection mechanisms in development and support maintenance processes Control_11 Supplier controls regarding relationships with third parties and relationships contractors, protecting the organization's valuable assets that are available to them and ensuring an agreed level of information security and service delivery under agreements with suppliers Control_12 Information controls related to incident management, events, and security incident information security vulnerabilities, reporting on management identified violations, defining responsibilities, response procedures, and collecting evidence Control_13 Information controls that are necessary to ensure business security aspects of continuity planning, verification and ongoing audit business continuity procedures, the availability of resources and management information processing facilities, the use of resiliency and reliability principles to ensure security Control_14 Compliance controls that require compliance with legal and contractual requirements to avoid breaches of statutory, regulatory, or contractual obligations related to information security, procedures for protecting intellectual property, personal data, and assessing information security at all stages of the life cycle 97 Table A3 Mean Score for Each Risk Factor in Lifecycle of Modern Distributed Information Systems Category № N Mean Std. Deviation % percent Factor_1 23 3.043478 0.824525 60.8695 Factor_2 23 2.826087 0.886883 56.5217 Factor_3 23 3.695652 0.764840 73.9130 Logical (software) Factor_4 23 3.739130 0.540824 74.7826 Factor_5 23 2.782609 0.795243 55.6521 Factor_6 23 4.217391 0.735868 84.3478 Factor_7 23 2.695652 0.764840 53.9130 Factor_8 23 3.869565 0.694416 77.3913 Technological factors Factor_9 23 3.826087 0.777652 76.5217 Factor_10 23 4.173913 0.650327 83.4782 Total 23 3.4869564 0.7435418 69.7391 Factor_11 23 4.130435 0.625543 82.6087 Factor_12 23 4.086957 0.668312 81.7391 Factor_13 23 3.434783 0.895752 68.6956 Physical (hardware) Factor_14 23 3.913043 0.792754 78.2608 Factor_15 23 2.869565 0.868873 57.3913 Factor_16 23 1.913043 0.733178 38.2608 Factor_17 23 3.434783 0.843482 68.6956 Factor_18 23 3.086957 0.733178 61.7391 Factor_19 23 2.869565 0.757049 57.3913 Factor_20 23 4.304348 0.634950 86.0869 Total 23 3.4043479 0.7553071 68.0869 Factor_21 23 4.391304 0.656376 87.8260 Factor_22 23 2.434783 0.787752 48.6956 Organizational factors Factor_23 23 2.608696 0.782718 52.1739 Factor_24 23 2.521739 0.845822 50.4347 Factor_25 23 2.739130 0.810016 54.7826 Factor_26 23 2.391304 0.838783 47.8260 Factor_27 23 3.956522 0.824525 79.1304 Factor_28 23 2.913043 0.596432 58.2608 Factor_29 23 2.782609 0.795243 55.6521 Factor_30 23 3.043478 0.638055 60.8695 Total 23 2.9782608 0.7575722 59.5652 Factor_31 23 2.782609 0.599736 55.6521 Factor_32 23 2.304348 0.764840 46.0869 Factor_33 23 4.000000 0.738549 80 Human factors Factor_34 23 2.869565 0.548083 57.3913 Factor_35 23 3.391304 0.782718 67.8260 Factor_36 23 3.000000 0.603023 60 Factor_37 23 3.347826 0.884652 66.9565 Factor_38 23 3.260870 0.688700 65.2174 Factor_39 23 2.086957 0.668312 41.7391 Factor_40 23 3.782609 0.599736 75.6521 Total 23 3.0826088 0.6878349 61.6521 98 Table A4 Testing the Hypothesis about the Relationship between Variables Using Spearman's Correlation Coefficient Fact_ Fact_2 Fact_ Fact_1 Fact_1 Fact_1 Fact_3 Fact_2 Fact_1 Fact_8 21 0 6 0 1 2 3 7 4 Fact_ Correlation_ 1.000 .065 -0.95 -.056 -.055 .205 .000 .118 .340 -.251 21 Coefficient Sig. (2-tailed) .770 .665 .801 .803 .344 1.00 .592 .112 .248 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .065 1.000 .119 .244 .134 .110 .107 -.164 .066 -.528** 20 Coefficient Sig. (2-tailed) .770 .588 .262 .544 .616 .628 .453 .765 .010 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ -.095 .119 1.000 .005 .262 .071 -.069 .123 -.019 -.132 6 Coefficient Sig. (2-tailed) .665 .588 .983 .227 .749 .755 .577 .932 .548 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .056 .244 .005 1.00 .423* -.234 .388 -.094 -.309 -.157 10 Coefficient Sig. (2-tailed) .801 .262 .983 .044 .283 .067 .668 .152 .476 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ -.055 .134 .262 .423* 1.000 -.145 .193 .089 .040 -.038 11 Coefficient Sig. (2-tailed) .803 .544 .227 .044 .510 .377 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .207 .110 .071 -.234 -.145* 1.000 .086 .183 .354 -.361 12 Coefficient Sig. (2-tailed) .344 .616 .749 .283 .510 .695 .404 .098 .091 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .000 .107 -.069 .388 .193 .086 1.000 -.200 -.401 .270 33 Coefficient Sig. (2-tailed) 1.000 .628 .755 .067 .377 .695 .360 .058 .214 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .118 -.164 .123 -.094 .089 .183 -.200 1.000 .177 -.026 27 Coefficient Sig. (2-tailed) .592 .453 .577 .668 .685 .404 .360 .418 .906 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ .340 .066 -.019 -.309 .040 .354 -.401 .177 1.000 -.415* 14 Coefficient Sig. (2-tailed) .112 .765 .932 .152 .856 .098 .058 -.418 .049 N 23 23 23 23 23 23 23 23 23 23 Fact_ Correlation_ -.251 -.528** -.132 -.157 -.038 -.361 .270 -.026 -.415* 1.000 8 Coefficient Sig. (2-tailed) .248 .010 .548 .476 .863 .091 .214 .906 .049 N 23 23 23 23 23 23 23 23 23 23 99 Appendix 2. Frequency histograms for key risk factors Figure A1: Frequency histograms for key risk factors 100