Reliability-based Design of Network Structure Systems
                Using the Monte-Carlo Method *

                         Aleksander Moshnikov [0000-0002-3689-2472]

                     ITMO University, Saint-Petersburg, 197101, Russia
                           moshnikov.alex@gmail.com


       Abstract. The article is devoted to the choice of reliability-oriented solutions for
       building an enterprise management system. The Monte Carlo method is used as
       a tool for reliability analysis, which provides an estimate of the reliability index
       with a given confidence probability. A software implementation in the R lan-
       guage is proposed. A quantitative example is provided.

       Keywords: Monte-Carlo Method, Reliability Assessment, ERP-MRP Systems.


1      Introduction

Modern enterprises accumulate a huge amount of information, such as documentation,
graphic and video information from access control systems, and operation data of
technological equipment and machines. Storing this information, protecting it, and
providing access to it are challenging tasks. Enterprise Resource Planning
(ERP) and Material Requirements Planning (MRP) systems are widely used to solve
such problems. They include the appropriate software and the necessary infrastructure.
Designing such systems is a complex task that involves finding a compromise between
usability, cost, and reliability.
   Reliability-based design optimization (RBDO) is an approach to distributing
reliability requirements and selecting architectural solutions that provide a given level
of reliability of the system as a whole.
   It makes it possible to trade off an increase in reliability against a decrease in
cost [1]. The reliability allocation and optimization problem has been treated by many
authors, although most of the attention has been given to the redundancy allocation
problem [2-4]. Aspects of computing reliability are discussed in [12-13].


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).


2      Architecture of the ERP-MRP system

2.1    OSI model
In the Open Systems Interconnection (OSI) model, the information on the logical network
diagram corresponds to L3-level information. The L3 layer is an abstraction layer that
reflects how packets are forwarded through intermediate routers. At the L2 level, data
channels between neighboring nodes are represented, while at the L1 level, only their
physical location is shown.
   The network itself is divided into three logical levels: the core of the network,
built from high-performance devices whose main purpose is fast transport; the
distribution layer, which provides security policies, aggregation, and routing in VLANs
and defines broadcast domains; and the access level, usually L2 switches that connect
end devices.


2.2    Typical architectures
"Bus". A characteristic feature of the "bus" type topology is the presence of a single
data transmission line, to which all subscriber devices are connected, which carry out
alternate data exchange. The transmitted data is available to all subscribers connected
to the trunk. Data parameters are set in such a way that the addressee (recipient)
uniquely identifies them.
   "Ring". In contrast to the "bus" topology, the structure of the ring topology implies
serial connection of subscribers, so the information flow passes from one device to
the next in turn. The message parameters contain markers that the receiving device
uses to determine whether it is the recipient. If it is, the message is considered
delivered; otherwise, the message is transmitted further over the network.
   "Star". In modern architecture, the most common topology is the "star" type.
This topology requires a switching device that provides addressing and distributes
information flows between subscribers over separate communication channels.
   "Cell". A feature of the "cell" topology is that subscriber devices also perform the
role of switching devices. Each subscriber device is connected to other subscriber
devices by at least four communication channels, which gives this structure its high
reliability.
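
The reliability difference between the "star" and "cell" topologies can be illustrated with a small graph sketch. It is written in Python for compactness and is not part of the R implementation described later. The node counts, the function names, the wrap-around grid used to give every "cell" device exactly four neighbours, and the assumption of bidirectional links are all illustrative choices.

```python
from collections import deque

def star(n_subscribers):
    """Star: node 0 is the switching device, nodes 1..n are subscribers."""
    adj = {0: set()}
    for i in range(1, n_subscribers + 1):
        adj[i] = {0}
        adj[0].add(i)
    return adj

def cell(rows, cols):
    """Cell: wrap-around grid (assumption) where each device has four neighbours.

    Needs rows >= 3 and cols >= 3 so that the four neighbours are distinct.
    """
    adj = {}
    for r in range(rows):
        for c in range(cols):
            adj[(r, c)] = {
                ((r - 1) % rows, c), ((r + 1) % rows, c),
                (r, (c - 1) % cols), (r, (c + 1) % cols),
            }
    return adj

def is_connected(adj, removed=frozenset()):
    """Breadth-first search over the devices that have not failed."""
    alive = [v for v in adj if v not in removed]
    if not alive:
        return True
    seen = {alive[0]}
    queue = deque([alive[0]])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(alive)

def survives_any_single_failure(adj):
    """True if the surviving devices stay connected whichever one device fails."""
    return all(is_connected(adj, frozenset([v])) for v in adj)

star_ok = survives_any_single_failure(star(5))     # the switch is a single point of failure
cell_ok = survives_any_single_failure(cell(3, 3))  # every device has alternative routes
```

In this model the star loses connectivity when the switching device fails, while the cell structure tolerates the loss of any single device.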


3      Reliability allocation

3.1    Allocation techniques
Reliability allocation is a crucial step in every product development process. It
assigns failure rate targets to the individual system units so that the desired
reliability goals for the whole system can be reached.
   The optimization task may be to maximize the reliability index under specified limits
on the number of available resources, or to minimize resource consumption when the
required level of reliability is reached.
   The distribution of a given reliability R^{*} over the system elements requires solving
the following inequality:

                          f(R_1, R_2, \ldots, R_n) \geq R^{*},                          (1)

    where R_i is the specified probability of failure-free operation of the i-th element
and f is the functional relationship between the elements and the system.
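
As a simple instance of inequality (1), the sketch below assumes a series structure, so that the function f is the product of the element reliabilities, and distributes the target equally over the elements. Both the series structure and the equal split are assumptions made for illustration; the target value 0.99 is the requirement used in the numerical example, and the element count is hypothetical.

```python
import math

def series_reliability(element_reliabilities):
    """f(R_1, ..., R_n) for a series system: every element must work."""
    return math.prod(element_reliabilities)

def equal_allocation(target, n_elements):
    """Equal apportionment: each element receives target ** (1 / n)."""
    return target ** (1.0 / n_elements)

target = 0.99       # required system reliability
n_elements = 5      # hypothetical number of elements

r_element = equal_allocation(target, n_elements)
allocated = [r_element] * n_elements

# A small tolerance guards against floating-point rounding in the check of (1).
meets_target = series_reliability(allocated) >= target - 1e-12
```

Each element has to be more reliable than the system as a whole, which is why the allocation becomes more demanding as the number of series elements grows.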
    The allocation procedure is iterative. The first step starts from an initial plan,
when few data about the components are available.
    Various reliability allocation methods have been discussed and developed over the
last several decades. One existing approach combines one or several criteria into an
allocation weight and allocates reliability in proportion to that weight [4]. For
example, the Advisory Group on Reliability of Electronic Equipment (AGREE) method
builds complexity into the allocation weight [5], and the Aeronautical Radio Inc.
(ARINC) method uses the failure rate as the allocation weight [6]. Another conventional
approach treats reliability allocation as multi-objective optimization, including cost
minimization [7] and redundancy allocation [8].
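
A weight-based allocation of the ARINC type can be sketched as follows, assuming a series system with constant failure rates: each unit receives a share of the system failure rate target in proportion to its predicted failure rate. The unit names and the numbers are hypothetical and serve only to show the arithmetic.

```python
def arinc_allocate(predicted_rates, system_rate_target):
    """Allocate a system failure rate target in proportion to predicted rates.

    predicted_rates: mapping unit name -> predicted failure rate, 1/h
    system_rate_target: required failure rate of the whole series system, 1/h
    """
    total = sum(predicted_rates.values())
    return {
        unit: rate / total * system_rate_target
        for unit, rate in predicted_rates.items()
    }

# Hypothetical predictions for three units (not data from this paper).
predicted = {"unit_a": 4e-6, "unit_b": 3e-6, "unit_c": 1e-6}
allocated = arinc_allocate(predicted, system_rate_target=4e-6)
```

The allocated rates add up to the system target, and a unit that is predicted to fail more often receives a proportionally larger share of the failure budget.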


3.2    Network reliability modeling
R is a programming language for statistical data processing and graphics, as well as a
free open-source software environment for computing under the GNU project. R contains
tools for running several parallel threads of calculations (by loading several processor
cores simultaneously), which reduces the modeling time several-fold. The igraph library
is used for statistical modeling of the reliability of the automated control system. It
implements a large number of graph algorithms and allows flexible manipulation of graphs
(removing a vertex, adding a vertex, etc.). Paths between given vertices are found with
the breadth-first search algorithm as implemented in igraph. Random numbers with an
exponential distribution are generated with the base functions of the R language [9].
All the functions and the statistical modeling algorithm are written in one script.
Modeling consists in running this script with references to the graph description (a
list of graph edges), the system failure conditions (in terms of graph paths), and the
reliability data of the system elements (represented on the graph by vertices). The
simulation yields a description of the system failure scenario at each iteration and
the values of the random system failure events.
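
The modeling loop described above can be sketched as follows. This is not the author's R script: it uses only the Python standard library instead of R and igraph, and the small graph, the vertex names, the failure rates, and the single source-to-target failure condition are hypothetical. Every vertex is assumed to have a strictly positive failure rate.

```python
import random
from collections import deque

def has_path(edges, alive, source, target):
    """Breadth-first search restricted to vertices that are still working."""
    if source not in alive or target not in alive:
        return False
    adj = {v: [] for v in alive}
    for a, b in edges:
        if a in alive and b in alive:
            adj[a].append(b)
            adj[b].append(a)
    seen, queue = {source}, deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return True
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

def system_failure_time(edges, rates, source, target, rng):
    """One iteration: draw exponential lifetimes, then fail vertices in time order."""
    lifetimes = {v: rng.expovariate(rate) for v, rate in rates.items()}
    alive = set(rates)
    for v in sorted(lifetimes, key=lifetimes.get):
        alive.discard(v)
        if not has_path(edges, alive, source, target):
            return lifetimes[v]  # the failure of v broke the last path
    raise AssertionError("unreachable: removing all vertices breaks every path")

def estimate_reliability(edges, rates, source, target, mission_time, n_iter, seed=0):
    """Share of iterations in which the system outlives the mission time."""
    rng = random.Random(seed)
    times = [system_failure_time(edges, rates, source, target, rng)
             for _ in range(n_iter)]
    return sum(t > mission_time for t in times) / n_iter, times

# Hypothetical example: a workstation reaches a server through two parallel switches.
edges = [("WS", "SW_A"), ("WS", "SW_B"), ("SW_A", "SRV"), ("SW_B", "SRV")]
rates = {"WS": 1e-6, "SRV": 1e-6, "SW_A": 4e-6, "SW_B": 4e-6}  # 1/h, illustrative
reliability, failure_times = estimate_reliability(
    edges, rates, "WS", "SRV", mission_time=5000.0, n_iter=20000)
```

The list `failure_times` plays the role of the sampled system failure events, and a histogram of it gives the kind of distribution shown for the numerical example.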


4      Numerical example

4.1    Description of ERP-MRP system
The ERP-MRP local area network contains the infrastructure needed to interconnect the
systems and their individual functional blocks. In general, the network is based on the
star topology and consists of main and auxiliary nodes. The system consists of the
following units:

1. The hardware of the main computing resources (servers, central switches) is
   allocated to the data center (DC).
2. The server cabinet (SC) is designed for collecting, processing, and storing
   information about the operation of ERP-MRP system equipment, as well as for
   information interaction.
3. The data storage cabinet (DSC) is designed for storing and processing large amounts
   of information and for archive management.
4. The main switching node (MSN) is a part of the MSN cabinet.
5. The auxiliary switching node (ASN) is a part of the ASN cabinet.
6. A workstation (WS) is a set of office computer equipment and system software
   installed at the workplace of the staff.
7. Connecting cabinet for adjacent systems (CCAS).
8. Switching node of the building (BSN).
It is assumed that the DC equipment has been identified and has the reliability
indicators presented in Table 1.

                          Table 1. Reliability data of SC and DSC
           Element model                  Code                  Failure rate, h⁻¹
           Server cabinet                 SC                    121·10⁻⁶
           Data storage cabinet           DSC                   19·10⁻⁶
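
For a single element with a constant failure rate, the values in Table 1 translate directly into a mean time between failures and a probability of failure-free operation. The sketch below assumes the exponential law and a single, non-redundant element; it says nothing about the system as a whole, whose reliability depends on the architecture.

```python
import math

# Failure rates from Table 1, in 1/h.
failure_rates = {"SC": 121e-6, "DSC": 19e-6}

def mtbf(rate):
    """Mean time between failures of an element with a constant failure rate."""
    return 1.0 / rate

def reliability(rate, hours):
    """Probability of failure-free operation over the given time, exp(-rate * t)."""
    return math.exp(-rate * hours)

element_mtbf = {code: mtbf(rate) for code, rate in failure_rates.items()}
element_reliability = {code: reliability(rate, 5000.0)
                       for code, rate in failure_rates.items()}
```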


The architecture of the ERP-MRP system is presented in Fig. 1.


4.2    Simulation parameters
Reliability indicators are calculated for a sample of 100 cycles. To estimate the
probability of failure, a sample of 5143 cycles is used, which provides a level of
accuracy greater than 99%.
   The simulation results are presented in Fig. 2. According to the results of the
Monte Carlo simulation, it can be argued that the probability of the ERP-MRP system
functioning for 5000 hours will be no less than 0.9945 with a confidence probability
of 0.90.
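
A statement of the form "no less than p with confidence 0.90" corresponds to a one-sided lower confidence bound on the estimated probability. The paper does not state which bound was used; the sketch below uses the Wilson score bound as one common choice, and the number of surviving cycles is a hypothetical placeholder, not a result of the study.

```python
import math
from statistics import NormalDist

def wilson_lower_bound(successes, n_trials, confidence):
    """One-sided Wilson score lower bound for a binomial proportion."""
    z = NormalDist().inv_cdf(confidence)
    p_hat = successes / n_trials
    centre = p_hat + z * z / (2.0 * n_trials)
    spread = z * math.sqrt(p_hat * (1.0 - p_hat) / n_trials
                           + z * z / (4.0 * n_trials ** 2))
    return (centre - spread) / (1.0 + z * z / n_trials)

N_CYCLES = 5143                # sample size quoted in the text
HYPOTHETICAL_SURVIVED = 5120   # assumption for illustration, NOT a value from the paper

lower = wilson_lower_bound(HYPOTHETICAL_SURVIVED, N_CYCLES, confidence=0.90)
```

Raising the confidence level or reducing the number of cycles lowers the bound, which is the trade-off behind the choice of the sample size.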


                   Fig. 1. Architecture of ERP-MRP system


[Figure: Histogram of MTBF; horizontal axis: MTBF, h (0 to 60000); vertical axis:
Frequency (0 to 120)]


                Fig. 2. Histogram of the MTBF distribution of the ERP-MRP system

To select the best option for building the system, several iterations of modeling are
performed to determine the reliability of the system for various models of purchased
components. The MSN is built from commercially available hubs. The CASN1-CASN3 units
and the CCAS include switches.
   Data on equipment reliability are presented in Table 2.

                                     Table 2. Initial reliability data
           Element, model            Code            Failure rate, h⁻¹            Cost, c.u.
           Switch A                  S1              4·10⁻⁶                       10000
           Switch B                  S2              3·10⁻⁶                       15000
           Switch C                  S3              2·10⁻⁶                       30000
           Switch D                  S4              1·10⁻⁶                       90000
           Hub A                     H1              6·10⁻⁶                       124000
           Hub B                     H2              2·10⁻⁶                       213000
           Hub C                     H3              1·10⁻⁶                       196000


4.3     Simulation results
Based on the simulation, 3 variants of the system construction were determined that
meet the reliability requirement of 0.99. The simulation results are shown in Fig. 3.

[Figure: reliability (vertical axis, 0.975 to 1) for the combinations of Hub A, Hub B,
Hub C with Switch A, Switch B, Switch C, Switch D]

Fig. 3. The set of values of the probability of failure of the system corresponding to the
quantile 0.80

When the 0.80 quantile of the probability of failure is considered, the following
network hardware models can be selected as system elements: [S1;H1], [S2;H1], [S3;H1],
[S4;H1], [S1;H2], [S1;H3]. The minimum cost is provided by [S1;H1] and amounts to
576,000 c.u.
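
The cost comparison of the admissible configurations can be sketched as follows. The unit costs are taken from Table 2 and the admissible pairs from the list above. The numbers of switches and hubs in the system are not stated in the text; the values 8 and 4 used here are an assumption, chosen only because they reproduce the quoted total of 576,000 c.u. for [S1;H1].

```python
# Unit costs from Table 2, in c.u.
switch_cost = {"S1": 10000, "S2": 15000, "S3": 30000, "S4": 90000}
hub_cost = {"H1": 124000, "H2": 213000, "H3": 196000}

# Configurations listed in the text as meeting the reliability requirement.
admissible = [("S1", "H1"), ("S2", "H1"), ("S3", "H1"),
              ("S4", "H1"), ("S1", "H2"), ("S1", "H3")]

N_SWITCHES = 8  # assumption: not stated in the paper
N_HUBS = 4      # assumption: not stated in the paper

def total_cost(configuration):
    """Equipment cost of one (switch model, hub model) configuration."""
    switch, hub = configuration
    return N_SWITCHES * switch_cost[switch] + N_HUBS * hub_cost[hub]

best = min(admissible, key=total_cost)
costs = {configuration: total_cost(configuration) for configuration in admissible}
```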
   To improve accuracy, methods of reducing the variance of a sample estimate, for
example, the Cross-Entropy Monte-Carlo method [10, 11], can be used.
   If the probability of failure-free operation does not meet the requirements for the
system, the significance of the elements should be evaluated, for example with the
Birnbaum importance measure [9]. Increasing the reliability of the most significant
elements makes it possible to achieve the required MTBF or failure probability.
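
The Birnbaum importance of an element is the difference between the system reliability with that element perfectly reliable and with that element failed. A minimal sketch follows; the three-element structure and the element reliabilities are hypothetical and do not describe the ERP-MRP system.

```python
def birnbaum_importance(system_reliability, p, element):
    """R_sys with the element always working minus R_sys with it always failed."""
    working = dict(p)
    working[element] = 1.0
    failed = dict(p)
    failed[element] = 0.0
    return system_reliability(working) - system_reliability(failed)

def example_system(p):
    """Hypothetical structure: 'a' in series with the parallel pair 'b' and 'c'."""
    return p["a"] * (1.0 - (1.0 - p["b"]) * (1.0 - p["c"]))

p = {"a": 0.95, "b": 0.90, "c": 0.80}  # illustrative element reliabilities
importance = {name: birnbaum_importance(example_system, p, name) for name in p}
ranking = sorted(importance, key=importance.get, reverse=True)
```

In this example the series element ranks first, so improving it raises the system reliability more than improving either of the redundant elements.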


5       Conclusion

An approach to the choice of reliability-oriented solutions for building an enterprise
management system is proposed. As a tool for reliability analysis, the Monte Carlo
method is used, which provides an assessment of the reliability index with a given
confidence probability. A software implementation in the R language was developed.
The performance of the software is demonstrated using a quantitative example. The
selected equipment configuration provides the specified reliability of the ERP-MRP
system at minimum cost.


References
 1. Lee, Tae Won, and Byung Man Kwak. "A reliability-based optimal design using advanced
    first order second moment method." Journal of Structural Mechanics 15, no. 4 (1987): 523-
    542.
 2. Misra K. B., and Sharma, Usha, “Multicriteria optimization for combined reliability and
    redundancy allocations in systems employing mixed redundancies,” Microelectronics and
    Reliability, Vol. 31, No 2, pp. 323-335, 1991.
 3. Kuo, W. and Prasad, V.R. (2000). An annotated overview of system reliability optimization.
    IEEE Transactions on Reliability Engineering. Vol. 49: pp.176–187
 4. O. P. Yadav and X. Zhuang, “A practical reliability allocation method considering modified
    criticality factors,” Reliability Engineering & System Safety, vol. 129, no. 9, pp. 57–65,
    2014
 5. X. F. Liang, L. Y. Chen, H. Yi, and D. Li, “Integrated allocation of warship reliability and
    maintainability based on top-level parameters,” Ocean Engineering, vol. 110, no. 12, pp.
    195–204, 2015
 6. M. Catelani, L. Ciani, G. Patrizi, and M. Venzi, “Reliability allocation procedures in com-
    plex redundant systems,” IEEE Systems Journal, vol. 12, no. 2, pp. 1182–1192, 2018.
 7. A. Heidari, Z. Y. Dong, D. Zhang, P. Siano, and J. Aghaei, “Mixed-integer nonlinear pro-
    gramming formulation for distribution networks reliability optimization,” IEEE Transac-
    tions on Industrial Informatics, vol. 14, no. 5, pp. 1952–1961, 2018.
 8. K. Khalili-Damghani, A.-R. Abtahi, and M. Tavana, “A new multi-objective particle swarm
    optimization method for solving reliability redundancy allocation problems,” Reliability En-
    gineering & System Safety, vol. 111, no. 3, pp. 58–75, 2013.
 9. Birnbaum, Z. W., On the Importance of Different Components in a Multicomponent System,
    Multivariate Analysis - II, Edited by P. R. Krishnaiah, Academic Press, pp. 581–592, 1969.
10. K.-P. Hui, N. Bean, M. Kraetzl, Dirk P. Kroese. The Cross-Entropy Method for Network
    Reliability Estimation. Annals of Operations Research, 2005, Volume 134, Number 1, Page
    101.
11. K-P. Hui, N. Bean, M. Kraetzl, and D. Kroese. The tree cut and merge algorithm for estima-
    tion of network reliability. Probability in the Engineering and Information Sciences,
    17(1):25-45, 2003.
12. Moshnikov, A.; Bogatyrev, V. Risk Reduction Optimization of Process Systems under Cost
    Constraint Applying Instrumented Safety Measures. Computers 2020, 9, 50.
13. Bogatyrev V. A., Bogatyrev S. V., Bogatyrev A. V., Model and Interaction Efficiency of
    Computer Nodes Based on Transfer Reservation at Multipath Routing, 2019 Wave
    Electronics and its Application in Information and Telecommunication Systems (WECONF),
    Saint-Petersburg, Russia, 2019, pp. 1-4. doi: 10.1109/WE-CONF.2019.8840647