=Paper=
{{Paper
|id=Vol-2870/paper133
|storemode=property
|title=Evaluating Real Checkability for FPGA-Based Components of Safety-Related Systems
|pdfUrl=https://ceur-ws.org/Vol-2870/paper133.pdf
|volume=Vol-2870
|authors=Oleksandr Drozd,Kostiantyn Zashcholkin,Maciej Dobrowolski,Anatoliy Sachenko,Oleksandr Martynyuk,Olena Ivanova,Julia Drozd
|dblpUrl=https://dblp.org/rec/conf/colins/DrozdZDSMID21
}}
==Evaluating Real Checkability for FPGA-Based Components of Safety-Related Systems==
Oleksandr Drozd a, Kostiantyn Zashcholkin a, Maciej Dobrowolski b, Anatoliy Sachenko b,c, Oleksandr Martynyuk a, Olena Ivanova a and Julia Drozd a

a Odessa National Polytechnic University, 1, Shevchenko Ave., Odesa, 65044, Ukraine
b Kazimierz Pułaski Technology and Humanitarian University, 29, Malczewskiego Str., Radom, 26-600, Poland
c West Ukrainian National University, 11, Lvivska Str., Ternopil, 46009, Ukraine

Abstract

The paper focuses on the study of the checkability of digital circuits in relation to FPGA (Field Programmable Gate Array) components of safety-related systems that serve high-risk facilities, maintaining the functional safety of these facilities in synergy with their own. Functional safety breaches are associated with failures, which stimulates the use of fault-tolerant solutions. However, the possibilities of these solutions are limited by the number of failures that can be countered. As a result, functional safety based only on circuit fault tolerance faces the problem of multiple failures. This problem manifests itself in the example of hidden faults, which can accumulate in significant quantities during extended normal operation of the system. The multiple manifestation of these faults in emergency mode calls into question the fail-safety of fault-tolerant circuits, including FPGA components, which can accumulate faults in the memory of their LUT units. Ensuring the fail-safety of circuits requires taking their checkability into account, which depends on the data arriving at the inputs of the circuit in normal and emergency modes. A method for assessing checkability, which is important for the fail-safety of FPGA components, is proposed. Checkability is assessed on real input data, whose change often extends over only a part of the range of values related to the normal functioning of the system.
The method makes it possible to evaluate the change in the checkability of the circuit depending on the change in its input data.

Keywords: Safety-related system, FPGA component, LUT memory, normal and emergency modes, multiple failures, hidden faults, fault tolerance, fail-safety, checkability

COLINS-2021: 5th International Conference Computational Linguistics and Intelligent Systems, April 22–23, 2021, Kharkiv, Ukraine
EMAIL: drozd@ukr.net (O. Drozd); const-z@te.net.ua (K. Zashcholkin); m.dobrowolski@uthrad.pl (M. Dobrowolski); sachenkoa@yahoo.com (A. Sachenko); anmartynyuk@ukr.net (O. Martynyuk); en.ivanova.ua@gmail.com (O. Ivanova); yuliia.drozd@opu.ua (J. Drozd)
ORCID: 0000-0003-2191-6758 (O. Drozd); 0000-0003-0427-9005 (K. Zashcholkin); 0000-0003-0296-9651 (M. Dobrowolski); 0000-0002-0907-3682 (A. Sachenko); 0000-0003-1461-2000 (O. Martynyuk); 0000-0002-4743-6931 (O. Ivanova); 0000-0001-5880-7526 (J. Drozd)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Mankind's activity in production and consumption is scaling up at an increasing pace and is accompanied by the creation of powerful infrastructures for the generation and distribution of electricity and other resources of mass demand. These manufacturing and transport infrastructures are gradually becoming safety-related and turning into high-risk facilities. This trend can be traced in the fact that such objects are associated with the largest number of accidents with significant negative consequences [1, 2].

The growth of high-risk objects can be attributed to objectively occurring processes that cannot be stopped. The rapid development of the energy sector is accompanied by an increase in the number and capacity of power plants and power grids. Growing cargo flows stimulate the development of transport infrastructures for land, air and water communications. Chemical and biological production creates its own risks associated with toxic and no less hazardous products, whose safe storage, transportation, use and disposal also need to be ensured. The accumulation of various types of weapons and ammunition supplements, but does not exhaust, today's many sources of risk.

Risk assessment takes into account two factors: the probability of an accident and the cost of its possible consequences [3, 4]. The processes described above demonstrate an increase in the total power of high-risk objects, which humanity is unable to give up. This feature of our development manifests itself in a constant increase in the cost factor of the consequences of accidents. Under these circumstances, containing the growth of risks becomes a necessary condition for survival, and this urgent requirement can only be met by reducing the factor associated with the probability of an accident.

This problem is being solved through the development of information technologies implemented in computer systems for managing high-risk facilities. Used in this way, computer systems exhibit features that transform them into a special type of instrumentation and control safety-related systems. These systems are aimed at ensuring functional safety, which is considered jointly for the control object and for the system itself, and serve to prevent accidents as well as to reduce losses when an accident occurs. The purpose of safety-related systems determines the features of their design, which provide for operation in two modes: normal and emergency [5, 6]. Requirements for the functional safety of critical systems are regulated by international standards, which pay considerable attention to the challenges associated with failures [7].
Safety-related systems must resist failures, i.e., continue to properly perform their functions to ensure their own safety and the safety of control objects even when failures occur. Such capabilities are based on the use of fault-tolerant solutions in the design of systems and their components [8].

International standards pay the greatest attention to common cause failures, which arise as a result of copying circuit solutions into the duplicated channels of fault-tolerant structures and can thereby replicate failures, for example, those caused by a design error [9]. The emphasis on common cause failures is due to the limited capabilities of fault-tolerant circuits, which are designed to withstand a specified number of failures and typically provide a response to one or two of them. The significant limitation on the number of failures to be countered is dictated both by the desire for simple solutions and by the expectation of a single failure as more probable than several at once. Therefore, the development of fault-tolerant schemes is mainly driven by restrictions imposed on copying solutions, using multi-version technologies to eliminate the common causes that replicate failures. The main direction in the development of multi-version technologies is the expansion of the types of diversity for schemes that duplicate functions in fault-tolerant structures [10, 11].

Thus, the main challenge for fault-tolerant solutions used in safety-related systems is multiple failures: in the event of multiple faults, these solutions no longer guarantee functional safety, i.e., they do not become fail-safe. For this reason, one of the most important issues in ensuring the functional safety of critical systems is studying the sources of multiple failures.
Such failures can occur due to vulnerabilities in the field of information security exploited by deliberate malicious actions, for example, cyber-attacks by botnets that violate the integrity of a computer system [12–14].

For safety-related systems and control objects, emergency modes are not only the most critical but also the least studied. The main contribution to their study is made by accidents as they occur. Erroneous actions of maintenance personnel lead to new, unforeseen scenarios with unexpected results, including the manifestation of hidden defects which, under proper maintenance, would not have emergency consequences [15, 16].

One of the most significant sources of multiple failures inherent in fault-tolerant solutions used in critical applications [17] manifests itself in the problem of hidden faults. Such faults can accumulate in digital circuits during an extended normal mode under conditions where the input data manifesting them is not used in this mode. The emergency mode can manifest the accumulated faults, with the arrival of the corresponding input data, in an amount exceeding what fault-tolerant circuits can counter [18].

Thus, the source of multiple failures is the accumulation of faults that cannot be detected in normal mode. Indeed, fault detection is performed by methods and means of on-line testing using logical checking. In this case, a fault can only be detected if it causes an error in the monitored result [19, 20]. The problem of hidden faults is outside the scope of international standards related to functional safety. At the same time, this source of multiple failures stems directly from the peculiarity of safety-related systems of operating in two modes.
In conventional computers, hidden faults do not create the problem of multiple failures, since they remain hidden and have no effect on the computational process throughout the entire operating mode.

The problem of hidden faults is better known from the practice of using simulation modes, which recreate emergency conditions in order to check the functioning of safety-related systems and their components. Such modes are used in modern critical systems to detect hidden faults that do not appear as errors on the input data of normal mode. However, simulation modes themselves pose a risk to functional safety, as they are generators of emergency conditions. The practice of their application includes cases of both planned and unauthorized activation, which have repeatedly led to emergency consequences. The planned activation of a simulation mode is a dangerous procedure, since it is accompanied by the shutdown of emergency protections, which was one of the causes of the Chernobyl tragedy. Unauthorized activation of simulation modes, which has repeatedly occurred through human fault or an arising malfunction, is distinguished by the factor of surprise and is likewise fraught with emergency consequences [21, 22].

The use of simulation modes indicates the importance of hidden faults, which are feared even more than the recreation of emergency conditions. In addition, the presence of dangerous simulation modes indicates a distrust of fault-tolerant solutions, which do not always translate into the fail-safe solutions important for ensuring the functional safety of critical systems. The reason why the transformation of fault-tolerant circuits into fail-safe ones cannot be taken for granted is their limited checkability [18].
The checkability of a digital circuit prevents the accumulation of faults; its limitation can be inherited by the fault tolerance of circuit solutions and significantly affect the processes important for ensuring the functional safety of safety-related systems.

Safety-related systems are developed using the most effective technologies proven in practice, following the component approach [23, 24]. The development of digital components increasingly prioritizes FPGA (Field Programmable Gate Array) design, which combines the versatility of a hardware solution with the flexibility of programmable logic [25, 26]. Digital circuit design is performed by programming FPGA chips within their LUT-oriented (Look-Up Table) architecture. During programming, the program code is written into the memory of the LUT units. Therefore, the fail-safety of fault-tolerant FPGA projects depends on their checkability with respect to the memory of the LUT units [27]. In this regard, the assessment of the checkability of FPGA projects is of great importance.

The division of the operating mode of a critical system into normal and emergency is inherited by the input data. Taking this division into account makes it possible to evaluate the checkability of the memory of LUT units [28]. However, such an assessment can differ significantly from the real one and can be misleading with regard to the safety of fault-tolerant schemes.

This paper aims to draw attention to the real checkability of digital circuits in FPGA components of safety-related systems. Section 2 identifies the problem associated with evaluating the real checkability of digital circuits in FPGA components of safety-related systems. Section 3 describes a method for assessing the real checkability of digital circuits in critical applications.
Section 4 presents the results of experiments assessing the real checkability of digital circuits using the example of a 4-bit mantissa multiplier implemented in an FPGA project as a component of a safety-related system.

2. Related works and problem definition

The suitability of both natural and man-made processes and objects for checking is an important condition for survival and development. In computer technology, this trend first developed in the testability of digital circuits, whose increasing complexity began to impede the effective development of tests for detecting faults [29, 30]. The testability of a digital circuit is assessed using its controllability and observability, which are determined over all input data. Thus, testability is completely determined by the circuit structure and can therefore be considered structural checkability.

In the operating mode, the checkability of digital circuits is an important characteristic with respect to on-line testing [31, 32]. Logic checking methods aimed at detecting errors in computed results are limited in their capabilities by logical checkability, which determines the ability of a digital circuit to exhibit faults in the form of errors on the input data of the operating mode. For this reason, the logical checkability of digital circuits gains an additional dependence on the input data of the operating mode. A fault for which there is no input data manifesting it in the form of an error is hidden from any logical checking. However, this limitation of logical checking has no consequences in conventional applications, since such a fault is also hidden for the entire operating mode and is therefore harmless to the functioning of the circuit.

This tolerance for hidden faults changes in safety-related systems due to the diversification of the operating mode, which is divided into normal and emergency. Diversification of the operating mode has a significant impact on many aspects of the functioning of critical systems.
In particular, it adjusts the purpose of on-line testing, which in emergency mode is aimed at checking the trustworthiness of the calculated results, while in normal mode it is used to clear the circuit of faults. In addition, the diversification of the operating mode is inherited by the input data, which becomes different in these modes and makes the checkability of the digital circuit correspondingly different as well.

So far, the checkability of a digital circuit has been viewed in terms of its positive role in developing tests and enhancing on-line testing. However, checkability should also be considered from the standpoint of its features, which, depending on the conditions, can have both positive and negative effects. Indeed, the checkability of a digital circuit consists in the manifestation of faults in the form of errors in the calculated results. But errors reduce the trustworthiness of the results, which are, as a rule, the main purpose of the calculations performed. It should also be noted that results are most often distorted by transient faults, which are much more likely events than permanent faults [33]. Therefore, high checkability of a digital circuit reduces the trustworthiness of the results mainly through transient faults, whose detection does not help to clear the circuit of faults in normal mode. The emergency mode, most of all, needs high trustworthiness of the calculated results, for which the manifestation of faults in the form of errors due to high checkability is undesirable. For this reason, the difference in checkability between normal and emergency mode is also a feature that can play both a positive and a negative role in these modes.
The best combination of circuit checkability and result trustworthiness is achieved when these indicators are, respectively, high and low in normal mode and in the opposite ratio in emergency mode. In the case where the checkability the circuit exhibits in normal mode includes the checkability it exhibits in emergency mode, the problem of hidden faults does not arise, since hidden faults remain hidden in both modes. Otherwise, the differing checkability of a digital circuit contributes to the accumulation of hidden faults, which threaten fault tolerance and prevent the transformation of fault-tolerant circuits into fail-safe ones.

The checkability of FPGA projects has its own characteristics associated with their LUT-oriented architecture. Computations are organized by decomposing them into logical functions performed by LUT units. The description of these logical functions is stored in the form of program code in the configuration file of the FPGA project and, in the process of programming, is written into the memory of the LUT units. The inputs of a LUT unit take the values of the arguments on which its logical function is defined. The arguments form the address at which the function value is read from the LUT memory to the unit's output. The number of inputs of LUT units varies from 4 to 8 across LUT-oriented architectures. The most widespread are LUT units with 4 inputs: A, B, C, D, addressing a 16-bit memory [34, 35].

The checkability of a LUT unit is evaluated taking into account its possible faults. The structure of the LUT unit includes a register memory, into which the program code is written, and a circuit for selecting the bit read from this memory. The register memory of each LUT unit is regularly tested using a checksum that is generated and validated for the entire program code of the FPGA design.
The circuit for selecting the read bit of the LUT memory consists of switches controlled by the address code formed at the inputs of the LUT unit. Regular testing of the register memory makes it checkable with respect to all its possible faults, excluding their accumulation in normal mode. The switches of the selection circuit, however, are not covered by program code checking and can accumulate faults on their information and control inputs. Faults that appear in the form of errors at the information inputs of the switches distort the values of the bits read from the register memory. Faults of the control inputs lead to addressing errors, which manifest themselves when the bit values at the correct and the corrupted address differ. Thus, the checkability of the register memory does not ensure the checkability of the LUT memory as a whole, including the circuit for selecting the bits to be read. The values of the read bits of the LUT memory can be corrupted by faults accumulated in the switches during normal operation.

The bits of the LUT memory are checkable for faults that cause an error in the monitored results. This checkability of the LUT memory plays a positive role in normal mode and a negative role in emergency mode. In critical systems, checkability is valuable, first of all, from the standpoint of withstanding multiple faults so as to transform a fault-tolerant circuit into a fail-safe one. The checkability of an FPGA project with a LUT-oriented architecture will be estimated as the ratio of the number of checkable bits of LUT memory to their total number. The set of checkable bits of a LUT memory should be determined taking into account the following features inherent in the checkability of circuits in critical applications:
1. The significance of circuit checkability for ensuring the fail-safety of a fault-tolerant solution.
2. The dependence of circuit checkability on the real input data of the normal mode.
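The addressing behavior described above is straightforward to model. The sketch below (illustrative Python, not the authors' tooling; all names are assumptions) reads a bit from a 16-bit LUT memory at the address formed by inputs D, C, B, A, and shows how a fault on a control input of the selection circuit manifests only when the bits at the correct and corrupted addresses differ.

```python
# Minimal model of a 4-input LUT unit (an illustrative sketch, not the paper's code).
# The 16-bit program code is a list of bits; inputs D, C, B, A form the read
# address, with D as the most significant bit.

def lut_read(code, d, c, b, a):
    """Read the bit addressed by inputs D, C, B, A from the 16-bit LUT memory."""
    assert len(code) == 16
    addr = (d << 3) | (c << 2) | (b << 1) | a
    return code[addr]

def lut_read_with_stuck_input(code, d, c, b, a, stuck_input, stuck_value):
    """Model a fault on one control input of the selection circuit:
    the input is stuck at a constant value, corrupting the address."""
    bits = {'d': d, 'c': c, 'b': b, 'a': a}
    bits[stuck_input] = stuck_value
    return lut_read(code, bits['d'], bits['c'], bits['b'], bits['a'])

# Example program code: a 2-input AND of inputs D and C (B and A are ignored).
and_dc = [1 if (addr >> 3) & 1 and (addr >> 2) & 1 else 0 for addr in range(16)]

# A fault on input C manifests as an error only where the bits at the correct
# and the corrupted address differ:
ok = lut_read(and_dc, 1, 1, 0, 0)                            # correct: 1
bad = lut_read_with_stuck_input(and_dc, 1, 1, 0, 0, 'c', 0)  # faulty: 0
```

For input combinations where the two addressed bits happen to hold the same value, the same fault produces no error, which is exactly why such faults can stay hidden on stable normal-mode data.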
The first feature characterizes the checkability of the FPGA project from the standpoint of its main role in safety-related systems: ensuring the fail-safety of fault-tolerant solutions. Circuit checkability is important for system safety in those bits of the LUT memory that are used in emergency mode and are therefore addressed in both modes. Hence, in assessing the checkability of a circuit, not all checkable bits of the LUT memory should be considered, but only those used in both normal and emergency modes.

The second feature of checkability in safety-related systems is associated with the nature of the input data entering the circuit inputs in normal mode. For circuits used in critical applications, the separation of the input data into normal and emergency use is usually known. For example, the input data can be separated by a threshold, reaching and exceeding which marks the beginning and continuation of the emergency mode. Raising the threshold increases the amount of input data used in normal mode and reduces the amount of input data for emergency mode. Such changes contribute to an increase in the checkability of the circuit, i.e., an increase in the number of checkable memory bits in normal mode and a decrease in the number of memory bits addressed only after the onset of the emergency mode. The increase in circuit checkability with an increase in the threshold is confirmed by experimental data [36].

At the same time, such an assessment of checkability has a significant drawback associated with the nature of the data arriving at the inputs of circuits in critical applications. Safety-related systems are characterized by functions for monitoring various parameters of the control object and of the system itself. Due to the high technology prioritized in critical applications, these parameters generally exhibit high stability during normal operation.
The input data for the digital components of critical systems is formed from the results of measurements of these parameters and inherits the high stability of their values, which vary only slightly, at the noise level. The change in the input data therefore does not cover the entire range of values of the normal mode and, as a rule, extends over an insignificant part of it. This fact leads to a significant difference between the real checkability of digital circuits and the checkability estimated relative to the threshold separating the input data of normal and emergency modes. The real checkability may turn out to be significantly lower than estimated and create a false impression of information sufficiency [37, 38] in assessing the fail-safety of a fault-tolerant FPGA component of a safety-related system. Thus, the problem of erroneous judgment about the real checkability of circuit solutions can have a significant impact on ensuring the functional safety of safety-related systems and control objects.

3. A method of real checkability assessment for circuits with LUT-oriented architecture

The initial data for assessing the real checkability are the real input data of the circuit, its description, and the threshold S, which distinguishes between normal and emergency modes. Typically, the input data for component circuits in safety-related systems is recorded. The continuous nature of change inherent in many parameters is reflected in a change of the numerical data with a step of 1 at the inputs of digital circuits. In this case, the change in the input data is fully characterized by its range, indicated by its boundaries alone. The range of input data variation can be estimated in the circuit itself and be operatively available in FPGA components, for example, by using JTAG technology. A description of a circuit with a LUT-oriented architecture can be obtained from a CAD database, for example, the Compiler DB of CAD Intel Quartus Prime 20.1 Lite Edition, using an application in the TCL language [39, 40].
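Since the real range is fully characterized by its boundaries, estimating it amounts to keeping a running minimum and maximum over the recorded inputs. A minimal software sketch of this idea follows (names and data are illustrative; in an FPGA component this would be a pair of registers updated in hardware and read out, e.g., via JTAG):

```python
# Illustrative running estimate of the real input-data range in normal mode.
# The data values are hypothetical sensor readings, not measurements from the paper.

def track_range(samples):
    """Return (lo, hi), the boundaries of the range actually covered by the inputs."""
    lo = hi = None
    for x in samples:
        lo = x if lo is None or x < lo else lo
        hi = x if hi is None or x > hi else hi
    return lo, hi

# Stable sensor data typically covers only a small part of the normal range:
readings = [100, 101, 99, 100, 102, 101]
lo, hi = track_range(readings)   # (99, 102)
```

The resulting pair (lo, hi) is exactly the kind of boundary information the method needs to restrict simulation to the real normal-mode data.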
The description of the circuit contains a list of LUT units with their program codes, as well as the circuit inputs and the outputs of the LUT units connected to their inputs. In addition, the circuit inputs and the LUT units connected to the circuit outputs are described. The threshold S can impose constraints directly on the input data or on the result calculated by the FPGA component of the system.

The proposed method for assessing the real checkability of an FPGA component is based on studying the controllability and observability of the LUT memory bits, taking into account the range of input data changes that occurs in normal mode. Checkability is considered in that part of it which provides, or fails to provide, the fail-safety of a fault-tolerant solution on the real input data of the normal mode.

The LUT units in the circuit description are arranged in the order in which their functions are performed, and these functions are executed in this order during the simulation of computations. Simulation is performed on the input data in the range of their real change in normal mode. In addition, the computations are simulated on the input data of emergency mode, which starts at the threshold value S.

The method simulates a circuit with a LUT-oriented architecture in two stages, aimed at assessing the controllability and observability of the LUT memory bits, respectively. At the first stage, the correct operation of the circuit is simulated. For each LUT unit, the simulation results determine two sets PN and PE, which contain the numbers of the LUT memory bits addressed in normal and emergency mode, respectively. These sets are used to define the sets PN&E = PN ∩ PE and PE\N = PE \ PN of LUT memory bits addressed in both modes and only in emergency mode, respectively. The use of LUT memory bits makes them controllable in the corresponding mode. The definition of the PN and PE sets is performed at different simulation costs.
The PN set depends on the normal mode inputs and grows with their actual range. Therefore, this set needs to be adjusted when the range of the input data changes. The set PE is completely determined by the threshold S, from which the emergency mode begins, and remains constant when the input data range changes in normal mode.

At the second stage, the input data on which the LUT memory bits of the PE set are defined are used to analyze the observability of these bits. The output of the LUT unit is inverted and the circuit is simulated with the introduced fault. An error at the output of the circuit determines the analyzed bit of the LUT memory as observable. It should be noted that PN&E ⊂ PE. Therefore, the results of the observability analysis also apply to the LUT memory bits of the PN&E set and define two sets PNE and PEN of LUT memory bits, which are controllable and observable in both modes and only in emergency mode, respectively.

The PNE and PEN sets of LUT memory bits depend on the actual input data of normal mode. Ignoring the actual change in the input data makes it possible to judge these sets based only on the value of the threshold S. In this case, the simulation is performed taking into account all the input data of the normal mode and determines these sets as PNE_S and PEN_S, respectively.

The LUT memory bits of the PNE set are checkable in normal mode and thus eliminate the accumulation of hidden faults, as well as the problem of their manifestation in emergency mode. Therefore, the fault-tolerant circuit of the FPGA component is checkable and fail-safe in these LUT memory bits. The degree of fail-safety of the FPGA component can be evaluated by the checkability C = BNE / BEN_S, where BNE = |PNE| and BEN_S = |PEN_S|. The degree of fail-safety violation of the FPGA component can be estimated by the checkability deficit D = BEN / BEN_S, where BEN = |PEN|.
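The two-stage procedure can be sketched compactly. The following is a minimal, illustrative model (a toy two-LUT circuit and assumed data structures, not the paper's Delphi implementation): stage 1 collects the addressed LUT memory bits on normal-mode and emergency-mode input vectors to form PN, PE, PN&E and PE\N; stage 2 inverts a single memory bit and re-simulates to test observability at the circuit output.

```python
# Illustrative two-stage sketch of the assessment on a toy circuit model.
# The circuit, wire names and data structures are assumptions for illustration.

def simulate(luts, inputs, addressed=None):
    """Evaluate an ordered list of LUT units (name, code, source wires) on a
    dict of circuit-input values; optionally record each LUT's read address."""
    wires = dict(inputs)
    for name, code, srcs in luts:
        addr = 0
        for s in srcs:                      # first source wire = most significant
            addr = (addr << 1) | wires[s]
        if addressed is not None:
            addressed.setdefault(name, set()).add(addr)
        wires[name] = code[addr]
    return wires

def addressed_bits(luts, vectors):
    """Stage 1 (controllability): LUT memory bits addressed over input vectors."""
    acc = {}
    for v in vectors:
        simulate(luts, v, acc)
    return acc

def observable(luts, vectors, lut, bit, outputs):
    """Stage 2 (observability): invert one LUT memory bit and check whether
    any circuit output differs on some input vector."""
    faulty = [(n, [v ^ (1 if n == lut and i == bit else 0)
                   for i, v in enumerate(code)], s) for n, code, s in luts]
    return any(simulate(luts, v)[o] != simulate(faulty, v)[o]
               for v in vectors for o in outputs)

# Toy circuit: h = x2 OR x1, g = h AND x0; circuit output "g"; threshold S = 4.
luts = [("h", [0, 1, 1, 1], ["x2", "x1"]),
        ("g", [0, 0, 0, 1], ["h", "x0"])]
vec = lambda x: {"x2": (x >> 2) & 1, "x1": (x >> 1) & 1, "x0": x & 1}
normal_real = [vec(x) for x in (0, 1)]     # real data: part of the normal range
emergency = [vec(x) for x in range(4, 8)]  # x >= S

PN = addressed_bits(luts, normal_real)
PE = addressed_bits(luts, emergency)
PNandE = {n: PN.get(n, set()) & PE[n] for n in PE}   # P_N&E = P_N ∩ P_E
PEonly = {n: PE[n] - PN.get(n, set()) for n in PE}   # P_E\N = P_E \ P_N
```

On this toy circuit the real normal-mode data addresses only bits {0, 1} of unit g while the emergency data addresses bits {2, 3}, so PN&E is empty and every emergency bit falls into PE\N; counting the observable bits of these sets then yields the quantities behind the C and D formulas above.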
4. Case study and discussion of the method

The method for assessing the real checkability of an FPGA component is represented by its software implementation, executed in the Delphi 10 Seattle demo version [41]. The checkability of a circuit with a LUT-oriented architecture is determined for eight different values of the threshold Sr, which limits the actual change of the input data in normal mode. The Sr threshold imposes a limitation on the calculation result. The largest value of the Sr threshold coincides with the threshold S separating the normal and emergency input data.

The software implementation of the method was tested on FPGA circuits of arithmetic devices from the Library of Parameterized Modules [42]. The studies were carried out using CAD Quartus Prime 20.1 Lite Edition [43]. The device circuits were implemented in the Intel Cyclone 10 LP FPGA chip 10CL025YU256I7G [44]. As an example, the study results are presented for a 4-bit iterative array multiplier that calculates the product of normalized mantissas. The circuit is implemented on 30 LUT units. The binary codes of the normalized mantissas of the factors vary from 8 to 15, and the binary code of the product ranges from Pr = 64 to 225.

The main program window, shown in Fig. 1, contains the control keys, a memory image for the selected LUT unit, the main simulation results and explanations of them. The “START” and “EXIT” keys start the simulation and end the program. Before the start of the simulation, the threshold value S is set, which determines 8 values of the threshold Sr, evenly distributed over the interval from Pr to S. In this case, the threshold is set to S = 136. The keys used to set the Sr threshold values are replaced with information about these values: “Sr: 64 - 73… 136”. The “LUT # 9” key selects LUT unit 9 for viewing its memory at all values of the Sr threshold. Pressing this key cycles to the next LUT unit. The LUT memory is shown as a matrix of squares containing the bit values.
The numbering of the bits from 0 to 15 is shown in binary code using the values of the arguments at the inputs D, C and B, A. These values are specified in binary codes from 00₂ to 11₂. A memory bit located at the intersection of a row and a column indicated by these binary codes has a number equal to the address at which it is fetched from the LUT memory. For example, the bit located in the third row and third column, i.e., at the intersection of codes 10₂ and 10₂, is located at address 1010₂ and has the number 10.

The LUT memory bits take the same values in all eight matrices, but they are colored differently: green, yellow and blue if they are used only in normal mode, only in emergency mode, or in both modes, respectively. Below the matrices, the total number of bits used and the number of bits of each color are indicated. With an increase in the Sr threshold, bits addressed only in emergency mode are replaced by bits used in both modes, and then by bits used only in normal mode. This process improves the checkability of the LUT memory and helps transform fault-tolerant FPGA components into fail-safe circuits. Bits with values highlighted in red are observable only in emergency mode. They are non-checkable bits that can accumulate hidden faults, which reduce the fail-safety of FPGA components. As the real Sr threshold decreases, their number in LUT unit 9 increases from 2 to 11.

The main results show the numbers BN and BN&E of bits addressed only in normal mode and in both modes, as well as the numbers BNE and BEN of checkable and non-checkable bits important to safety. In addition, the checkability C and its deficit D, expressed as percentages, were assessed.

Figure 1: Main results of the method implementation

These results show a significant decrease in checkability with the lowering of the Sr threshold, which reflects the actual change of the input data in normal mode.
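The bit-numbering convention used in the memory matrices of Fig. 1 can be reproduced in a couple of lines (an illustrative check, not part of the described tool):

```python
# Bit numbering of the 16-bit LUT memory: inputs D, C select the row and
# B, A the column; the bit's number is the address D C B A read as binary.

def bit_number(dc, ba):
    """Row code (D, C) and column code (B, A), each 0..3, to the memory address."""
    return (dc << 2) | ba

# The bit at the intersection of row code 10 (binary) and column code 10
# (binary) has address 1010 (binary), i.e., number 10:
n = bit_number(0b10, 0b10)   # 10
```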
A decrease in the real Sr threshold from 136 to 73 (less than 2 times) reduces the checkability from 89.8% to 16.6%, i.e., by 5.4 times.

5. Conclusions

Fault-tolerant circuit solutions play an important role in ensuring the functional safety of critical systems and the high-risk objects they control. However, the fault tolerance of the circuits is ensured only with respect to a limited number of faults and is therefore vulnerable to the multiple failures that may arise from the existing problem of hidden faults. This problem appeared together with safety-related systems, whose distinctive feature is operation in two modes: normal and emergency. In normal mode, faults can accumulate due to the lack of input data that could manifest them. With the onset of emergency mode, the accumulated faults create a problem of multiple failures for fault-tolerant circuits, preventing them from fulfilling their main mission of ensuring fail-safety.

The accumulation of faults occurs due to the low checkability of digital circuits, which in normal mode is, as a rule, limited by insignificant changes in the input data. The primary sources of these data are sensors of various parameters of the controlled object. Thanks to the high technologies used as a priority in critical domains, the measurement results are highly stable and, after digitization, are transformed into a sequence of slightly varying data arriving at the inputs of the digital circuits. As a result, only a small fraction of the normal-mode input range is actually used. The checkability of the circuits, assessed over the entire range of the normal mode, therefore ceases to reflect its real level. The development of critical systems using modern CAD systems makes the analysis of this problem relevant for FPGA components, focusing it on the checkability of the memory of LUT units.
Experiments carried out with FPGA circuits have shown that their real checkability can be many times lower than the values expected when the entire range of input data changes in normal mode is taken into account. Under such circumstances, the capabilities of fault-tolerant circuits in ensuring functional safety are significantly limited, and one of the resulting challenges is a misconception concerning its indicators. Unfortunately, the envisaged accident scenarios are mainly replenished as a result of the analysis of new accidents. Such an expensive path to improving safety is due, in particular, to the limited depth of analysis, in which the objective processes that generate individual details are not traced behind those details. One future research direction is exploring neural networks [45, 46] for assessing the checkability of digital circuits.

6. References

[1] A. J. Masys, Black swans to grey swans: Revealing the uncertainty, Disaster Prevention and Management 21(3) (2012) 320–335.
[2] A. Hopkins, Issues in safety science, Safety Science 67 (2014) 6–4.
[3] T. Aven, Risk assessment and risk management: Review of recent advances on their foundation, European Journal of Operational Research 253(1) (2016) 1–13.
[4] L. Cox, Confronting deep uncertainties in risk analysis, Risk Analysis 32 (2012) 1607–1629.
[5] A. Hale, Foundations of safety science: A postscript, Safety Science 67 (2014) 64–69.
[6] R. Ouache, M. N. Kabir, A. Adham, A reliability model for safety instrumented system, Safety Science 80 (2015) 264–273. URL: https://doi.org/10.1016/j.ssci.2015.08.004.
[7] F. Brissaud, L. F. Oliveira, Average probability of a dangerous failure on demand: Different modelling methods, similar results, in: 11th International Probabilistic Safety Assessment and Management Conference and the Annual European Safety and Reliability Conference, PSAM11 ESREL, 2012, pp. 6073–6082.
[8] A. Romankevich, A. Feseniuk, V. Romankevich, T.
Sapsai, About a fault-tolerant multiprocessor control system in a pre-dangerous state, in: Proceedings of the 9th IEEE International Conference DESSERT, Kyiv, Ukraine, 2018, pp. 215–219.
[9] S. Alizadeh, S. Sriramula, Impact of common cause failure on reliability performance of redundant safety related systems subject to process demand, Reliability Engineering & System Safety 172 (2018) 129–150.
[10] K. Salako, L. Strigini, When does 'Diversity' in development reduce common failures? Insights from probabilistic modelling, IEEE Transactions on Dependable and Secure Computing 11(2) (2014) 193–206. doi: http://doi.ieeecomputersociety.org/10.1109/TDSC.2013.32.
[11] A. A. Gashi, L. Povyakalo, M. Strigini et al., Diversity for safety and security in embedded systems, in: Fast Abstracts of the IEEE International Conference on Dependable Systems and Networks, Atlanta, GA, USA, 2014.
[12] S. Lysenko, K. Bobrovnikova, S. Matiukh et al., Detection of the botnets' low-rate DDoS attacks based on self-similarity, International Journal of Electrical and Computer Engineering 10(4) (2020) 3651–3659.
[13] S. Lysenko, K. Bobrovnikova, O. Savenko, A. Kryshchuk, BotGRABBER: SVM-Based Self-Adaptive System for the Network Resilience Against the Botnets' Cyberattacks, Communications in Computer and Information Science 1039 (2019) 127–143.
[14] W. Young, N. Leveson, Systems thinking for safety and security, in: Proceedings of the 29th Annual Computer Security Applications Conference, New Orleans, LA, USA, 2013, pp. 1–8.
[15] R. Semenas, M. Kaijanen, European Clearinghouse analysis of events related to NPP digital instrumentation and control systems, JRC Science and Policy Report, European Commission, Joint Research Center, 2014.
[16] P. V. Miguel, Operating experience with digital I&C systems at nuclear power plants, JRC Technical Report, European Commission, Joint Research Center, 2018.
[17] A. O. El-Rayis, A.
Melnyk, Localized Payload Management Approach to Payload Control and Data Acquisition Architecture for Space Applications, in: Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), Edinburgh, UK, 2007, pp. 263–272. doi: 10.1109/AHS.2007.70.
[18] O. Drozd, V. Antoniuk, V. Nikul, M. Drozd, Hidden faults in FPGA-built digital components of safety-related systems, in: Proceedings of the IEEE International Conference TCSET, Lviv-Slavsko, Ukraine, 2018, pp. 805–809. doi: 10.1109/TCSET.2018.8336320.
[19] M. Abramovici, C. Stroud, C. Hamilton, S. Wijesuriya, V. Verma, Using roving STARs for on-line testing and diagnosis of FPGAs in fault-tolerant applications, in: Proceedings of the IEEE International Test Conference, 1999, pp. 973–982.
[20] A. Drozd, S. Antoshchuk, New on-line testing methods for approximate data processing in the computing circuits, in: Proceedings of the IEEE International Conference IDAACS, Prague, Czech Republic, 2011, pp. 291–294. doi: 10.1109/IDAACS.2011.6072759.
[21] Y. Hussain, A. Rehalia, A. Dhyan, Case Study: Chernobyl Disaster, International Journal of Advanced Research in Computer Science and Software Engineering 8(2) (2018) 76–78.
[22] D. Gillis, The Apocalypses that Might Have Been, 2007. URL: https://www.damninteresting.com/the-apocalypses-that-might-have-been/.
[23] N. Antunes, F. Brancati, A. Ceccarelli et al., A monitoring and testing framework for critical off-the-shelf applications and services, in: Proceedings of the IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 2013, pp. 371–374.
[24] F. Duchi, N. Antunes, A. Ceccarelli et al., Cost-Effective Testing for Critical Off-the-Shelf Services, in: Proceedings of the International Conference on Computer Safety, Reliability, and Security, Delft, The Netherlands, 2014, pp. 231–242.
[25] S. Verma, P. Srivastava, D. Ramavat, N.
Srivastava, A Review Paper on Comparative Study of FPGA Implementation of Adhoc Security Algorithms, International Journal of Management & Information Technology 7(1) (2013).
[26] J. Jung, I. Ahmed, Development of field programmable gate array-based reactor trip functions using systems engineering approach, Nuclear Engineering and Technology 48(4) (2016) 1047–1057.
[27] O. Drozd, K. Zashcholkin, R. Shaporin, J. Drozd, Y. Sulima, Development of ICT Models in Area of Safety Education, in: Proceedings of the IEEE EWDT Symposium, Varna, Bulgaria, 2020, pp. 212–217. doi: 10.1109/EWDTS50664.2020.9224861.
[28] O. Drozd, I. Perebeinos, O. Martynyuk et al., Hidden fault analysis of FPGA projects for critical applications, in: Proceedings of the IEEE International Conference TCSET, Lviv-Slavsko, Ukraine, 2020. doi: 10.1109/TCSET49122.2020.235591.
[29] IEEE Std 1500-2005 Standard Testability Method for Embedded Core-based IC, 2005. doi: 10.1109/IEEESTD.2005.
[30] V. Hahanov, A. Hahanova, S. Chumachenko, S. Galagan, Diagnosis and repair method of SoC memory, WSEAS Transactions on Circuits and Systems 7(7) (2008) 698–707.
[31] A. Drozd, M. Lobachev, J. Drozd, The problem of on-line testing methods in approximate data processing, in: Proceedings of the 12th IEEE International On-Line Testing Symposium, Como, Italy, 2006, pp. 251–256. doi: 10.1109/IOLTS.2006.61.
[32] J. Drozd, A. Drozd, M. Al-Dhabi, A resource approach to on-line testing of computing circuits, in: Proceedings of the IEEE EWDT Symposium, Batumi, Georgia, 2015, pp. 276–281. doi: 10.1109/EWDTS.2015.7493122.
[33] C. Metra, L. Schiano, M. Favalli, B. Ricco, Self-checking scheme for the on-line testing of power supply noise, in: Proceedings of the Design, Automation and Test in Europe Conference, Paris, France, 2002, pp. 832–836.
[34] Cyclone Architecture, Cyclone Device Handbook, Volume 1, Altera Corporation, 2008. URL: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/cyc/cyc_c51002.pdf.
[35] M. Farias, R. H. S. Martins, P. I. N. Teixeira, P. V. Carvalho, FPGA-based I&C systems in nuclear plants, Chemical Engineering Transactions 53 (2016) 283–288. doi: 10.3303/CET1653048.
[36] O. Drozd, K. Zashcholkin, O. Martynyuk, O. Ivanova, J. Drozd, Development of Checkability in FPGA Components of Safety-Related Systems, CEUR-WS 2762 (2020) 30–42. URL: http://ceur-ws.org/Vol-2762/paper1.pdf.
[37] T. Hovorushchenko, O. Pomorova, Information Technology of Evaluating the Sufficiency of Information on Quality in the Software Requirements Specifications, CEUR-WS 2104 (2018) 555–570.
[38] O. Pomorova, T. Hovorushchenko, The Way to Detection of Software Emergent Properties, in: Proceedings of the IEEE International Conference IDAACS, Warsaw, Poland, 2015, pp. 779–784.
[39] A. P. Nadkarni, The TCL Programming Language: A Comprehensive Guide, CreateSpace Publishing, 2017.
[40] Intel Quartus Prime Standard Edition User Guide: Scripting, 2020. URL: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-qps-scripting.pdf.
[41] Delphi 10 Seattle, Embarcadero. URL: https://www.embarcadero.com/docs/datasheet.pdf.
[42] Intel FPGA Integer Arithmetic IP Cores User Guide, 2020. URL: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_lpm_alt_mfug.pdf.
[43] Intel Quartus Prime Standard Edition User Guide, 2020. URL: https://www.intel.com/content/dam/alterawww/global/en_US/pdfs/literature/ug/ug-qps-getting-started.pdf.
[44] Intel Cyclone 10 LP Core Fabric and General Purpose I/Os Handbook, 2020. URL: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/cyclone-10/c10lp-51003.pdf.
[45] A. Sachenko, V. Kochan, V. Turchenko, Instrumentation for gathering data, IEEE Instrumentation and Measurement Magazine 6(3) (2003) 34–40.
[46] V. Golovko, Y. Savitsky, T. Laopoulos, A. Sachenko, L.
Grandinetti, Technique of learning rate estimation for efficient training of MLP, in: Proceedings of the International Joint Conference on Neural Networks, 2000, pp. 323–328.