=Paper= {{Paper |id=Vol-2899/paper021 |storemode=property |title=Model of the peer-to-peer distributed system for securable information storage and processing without traffic prioritization (TheOoL project) |pdfUrl=https://ceur-ws.org/Vol-2899/paper021.pdf |volume=Vol-2899 |authors=Amanie Alhussain,Vadim L. Stefanuk }} ==Model of the peer-to-peer distributed system for securable information storage and processing without traffic prioritization (TheOoL project)== https://ceur-ws.org/Vol-2899/paper021.pdf
Model of the peer‐to‐peer distributed system for securable
information storage and processing without traffic prioritization
(TheOoL project)
Alexey V. Nenashev 1, Alexandr Yu. Tolstenko 1 and Rostislav S. Oleshko 1
1
    Samara State Technical University, Address, Samara, Index, Russia


                 Abstract
                 This paper describes the mathematical model of a “Peer-to-peer distributed system for
                 securable information storage and processing in enterprise networks”. The system is a
                 versatile distributed operating system designed to protect distributed computing and to
                 isolate private networks without restricting effective interaction. It provides cryptographic
                 security, protection from unauthorized access using biometrics, and an innovative data
                 exchange protocol for topology control based on distributed ledger technology. The
                 modeling was performed to evaluate the performance of the system as a function of the
                 performance of the hardware of its nodes and of the network's telecommunications
                 equipment.

                 Keywords
                 Peer-to-peer, Distributed computing networks, Enterprise networks, Cybersecurity, Queueing
                 network, Simulation of queueing networks, compute node, multimedia data, queues,
                 distributed data storage, Securable Information Storage

1. Introduction

    Mainstream systems for data storage and data protection are built, for the most part, on a centralized
architecture, or they have operation centers whose control may be seized through hardware or
software vulnerabilities or by planting an insider among the technical staff. For computation and data
storage they also normally do not use the advanced laptops and personal computers installed at user
workplaces (workstations), which feature significant computing resources (terabytes of persistent
storage, called ROM below [2], dozens of gigabytes of RAM [3], multicore high-performance processors).
Corporate information systems are built on a multitier architecture that shifts the computing load to
data center resources, while the local resources of workstations remain largely untapped. To use
computing resources efficiently, systems for distributed data storage and distributed computing are being
designed and developed. These systems do not resolve the issues of cybersecurity, leaving them to
specialist software vendors. As a result, speed and reliability decline, while security vulnerabilities
remain, caused by possibly incomplete documentation of the protected information systems (IS) or by
accidental and/or intentional errors in the implementation of security systems.
In our view, a possible solution is a system that integrates distributed data storage
and data processing with subsystems for unauthorized access protection (UA), cryptographic protection
(CP), automatic maintenance and investigation of cybersecurity incidents, and which hides the topology of
the network, inter alia, from internal corporate personnel. Such a system would not only provide
strong protection of user data, but would also reduce corporate expenditure on information technology
infrastructure, as demonstrated in [1]. That said, a significant

III International Workshop on Modeling, Information Processing and Computing (MIP: Computing-2021), May 28, 2021, Krasnoyarsk,
Russia
EMAIL: alexvlnenashev@gmail.com (Alexey Nenashev); tolstenko.ay@samgtu.ru (Alexandr Tolstenko); zonaTostATY@gmail.com
(Rostislav Oleshko)
ORCID: 0000-0003-1348-1766 (Alexey Nenashev); 0000-0001-6620-8793 (Alexandr Tolstenko); 0000-0002-9068-2580 (Rostislav Oleshko)
              © 2021 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)


challenge in the design and implementation of such a system is ensuring its high-speed performance
from the perspective of its end user.

2. Peer‐to‐peer Distributed System for Securable Information Storage and
   Processing

    The peer-to-peer distributed system for securable information storage and processing (“the system”)
is designed first and foremost to eliminate cybersecurity threats to servers, central computation and
subscriber nodes within a network. It is a distributed operating system in which each node contributes
its computing resources (storage subsystem, central and graphics processors, and random-access
memory) to the system and has no independent role. However, nodes differ in their
functional purposes: subscriber node (which can simultaneously serve as a storage node or metadata node),
storage node, and metadata node. The functional purpose is assigned when the system is deployed
and can later be changed automatically. A role is assigned to a node under the
control of an automatic maintenance subsystem, without participation of the owner of the node.
Data is stored in the system in such a manner that any user data block (a file) is divided into
$N$ equal-size packets, which are then encrypted and submitted for storage to $K \cdot N$ network nodes
($K$ is the redundancy coefficient). The list of block storage nodes is placed in a metadata
block, which is, in turn, encrypted and placed in metadata nodes. No one, including the owner, knows
on which nodes parts of the file are stored at a particular time, except for the maintenance subsystem,
which has access to the metadata.
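
    The storage scheme described above can be sketched as follows. This is a minimal illustration under
stated assumptions, not the project's implementation: the names `split_into_packets` and `place_packets`
are hypothetical, encryption is represented by a caller-supplied placeholder (identity by default), zero
padding is an assumed way to make the packets equal in size, and random placement stands in for the node
selection performed by the maintenance subsystem.

```python
import random
from typing import Callable, Dict, List, Tuple


def split_into_packets(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal-size packets, zero-padding the tail (assumed scheme)."""
    if n <= 0:
        raise ValueError("n must be positive")
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def place_packets(
    data: bytes,
    n: int,
    k: int,
    node_ids: List[int],
    rng: random.Random,
    encrypt: Callable[[bytes], bytes] = lambda packet: packet,  # placeholder, not real crypto
) -> Dict[int, Tuple[int, bytes]]:
    """Return a map node_id -> (packet index, encrypted packet) over K*N distinct nodes."""
    if k * n > len(node_ids):
        raise ValueError("need at least K*N nodes to keep every copy on its own node")
    packets = [encrypt(p) for p in split_into_packets(data, n)]
    chosen = rng.sample(node_ids, k * n)  # distinct nodes, so no node gets two packets
    return {node: (i % n, packets[i % n]) for i, node in enumerate(chosen)}
```

In this sketch the returned dictionary plays the role of the metadata block: it is the only place where the
mapping from packets to nodes is recorded.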
    When designing the system, it must be borne in mind that at any instant significant volumes of
information must not only be passed between the nodes of the system at high speed but must also
undergo encryption and decryption. It is also necessary to consider the additional dataflow of service
blocks of the distributed ledger (metadata), which contain data about the nodes that data is written to
and read from, as well as information of the data access control system. If the data processing
speed turns out to be insufficient, the system will not be able to deliver a comfortable user
experience, which would, in turn, call into question the applicability of the system in a real-life
enterprise network. To assess whether the system can be implemented with sufficient data access
speed, let's build its mathematical model. Here, sufficient data access speed means a speed of
reading/writing operations comparable with the average user data access speed in existing networked
data storage systems.

3. System Model

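    The system is a queueing network (QN) that can be represented as a complete graph [3] whose nodes
(workstations and servers) are centers for processing and/or generating remote jobs, while
its edges are duplex communication links characterized by a single parameter $\upsilon_{i,k}$, the speed
of data frame transmission between the $i$-th and $k$-th nodes of the system.
    The matrix $V$ describes the speed of data transfer between nodes of the system:
$$V = \begin{bmatrix} \upsilon_{1,1} & \dots & \upsilon_{1,k} & \dots & \upsilon_{1,\xi} \\ \vdots & & \vdots & & \vdots \\ \upsilon_{i,1} & \dots & \upsilon_{i,k} & \dots & \upsilon_{i,\xi} \\ \vdots & & \vdots & & \vdots \\ \upsilon_{\xi,1} & \dots & \upsilon_{\xi,k} & \dots & \upsilon_{\xi,\xi} \end{bmatrix}, \tag{1}$$
$$i = \overline{1,\xi},\quad k = \overline{1,\xi},\quad \upsilon_{i,k} = \min\left(\upsilon_i^{\max}, \upsilon_k^{\max}\right),\quad \upsilon_i^{\max} \in V^{\max},\ \upsilon_k^{\max} \in V^{\max},$$
where $\upsilon_{i,k}$ is the connection speed, $\xi$ is the number of nodes of the system, $\upsilon_i^{\max}$ and
$\upsilon_k^{\max}$ are the local maximum connection speeds of the $i$-th and $k$-th nodes of the system,
correspondingly, and $V^{\max}$ is the countable set of possible values of the local maximum connection
speeds of the system's nodes.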
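
    The rule that the speed of a link is limited by the slower of its two endpoints can be illustrated with
a short sketch. The function name `speed_matrix` and the use of plain Python lists are assumptions made
for illustration only.

```python
from typing import List


def speed_matrix(local_max_speeds: List[float]) -> List[List[float]]:
    """Build V with V[i][k] = min(local_max_speeds[i], local_max_speeds[k])."""
    return [[min(vi, vk) for vk in local_max_speeds] for vi in local_max_speeds]
```

For example, three nodes with local maximum speeds 100, 1000 and 10 (in any consistent unit) give a
symmetric matrix in which every link that touches the slowest node has speed 10.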
    Nodes are independent queueing systems (QS) with a bounded queue [2, 4]. Let's introduce the
classification of nodes of the system: type 1 node – metadata and routing control node; type 2 node –
data storage and data processing node; type 3 node – subscriber node, which includes a subsystem for
job stream generation (JSG) as part of a virtual environment for execution of user software (VM) and
the subsystem for data encoding and data mixing (SDEM), as well as a full-fledged type 2 node. Thus,
the QN can be subdivided into 2 sub-QNs: a data processing network comprising $\xi_2$ type 2 nodes
and $\xi_3$ type 3 nodes, and a metadata processing network comprising $\xi_1$ type 1 nodes. The total
number of nodes within the QN is $\xi = \xi_1 + \xi_2 + \xi_3$.
    Although the hardware of type 3 nodes is often less powerful than the hardware of type 2 nodes,
where specialized server equipment is normally used, in practice the relation
$\xi_1 + \xi_2 \ll \xi_3$ holds, and the cumulative computing resource of type 3 nodes is substantially
greater than that of type 1 and type 2 nodes. Therefore the use of type 3 nodes as data processing
centers of the QN in distributed computing systems is well justified.
    On top of the QN, on the $\xi_3$ type 3 nodes $G_j \in G$, there is the functional network $G$ of JSG.
Although it consumes the resources of nodes within the QN, it functions in an entirely independent
and isolated manner and acts as an external source of jobs in relation to the QN. The input
source of jobs in the system is the VM of the nodes, each of which generates an ordinary random flow of
initial jobs $\Phi_j$:
$$\Phi_j(t) = \left(T_j, \vec{\chi}_j, \vec{b}_j\right), \tag{2}$$
where $T_j = \{t_{j,1}, t_{j,2}, \dots, t_{j,q}, \dots\}$; $\vec{\chi}_j = \{\chi_{j,1}, \chi_{j,2}, \dots, \chi_{j,q}, \dots\}$;
$\vec{b}_j = \{b_{j,1}, b_{j,2}, \dots, b_{j,q}, \dots\}$; $j = \overline{1,\xi_3}$; $q = \overline{1,\infty}$; $b_{j,q} \in \{0, 1\}$.
    At random times $t_{j,q}$, jobs of random size $\chi_{j,q}$ are generated, each of one of two types: 1. a job
for reading data from the system ($b_{j,q} = 0$); 2. a job for recording data into the system ($b_{j,q} = 1$).
The jobs are generated in sequence, no more than one at any specific time $t_{j,q}$. The values $\chi_{j,q}$,
$b_{j,q}$ and $t_{j,q}$ are mutually independent.
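
    A job stream of this kind can be sketched as follows. The source only requires the stream to be
ordinary, with mutually independent times, sizes and types; the exponential inter-arrival times (a Poisson
stream), the exponential size distribution, and the names `Job` and `generate_stream` are assumptions
made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    time: float  # arrival time t_{j,q}
    size: int    # job size chi_{j,q}, in bits
    write: int   # b_{j,q}: 0 = read, 1 = write


def generate_stream(rate: float, mean_size: float, write_prob: float,
                    horizon: float, rng: random.Random) -> List[Job]:
    """Generate the jobs of one source on [0, horizon], one job per arrival time."""
    jobs: List[Job] = []
    t = 0.0
    while True:
        t += rng.expovariate(rate)  # assumed exponential inter-arrival times
        if t > horizon:
            return jobs
        size = max(1, round(rng.expovariate(1.0 / mean_size)))  # assumed size law
        write = 1 if rng.random() < write_prob else 0
        jobs.append(Job(t, size, write))
```

Time, size and type are drawn independently of each other, which mirrors the independence requirement
stated above.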
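    The stream $\Phi_j(t)$ is generated with the intensity $\lambda_j(t)$ and the statistical expectation $M_j$ of
the value $\chi_{j,q}$ [1, 3]:
$$\lambda_j(t) = \lambda_j^{(0)}(t) + \lambda_j^{(1)}(t);\quad M_j = \lim_{n \to \infty} \frac{\sum_{q=1}^{n} \chi_{j,q}}{n};\quad G_j = \left(\lambda_j(t), M_j\right),\ j \in G;\ j = \overline{1,\xi_3}, \tag{3}$$
where $\lambda_j^{(0)}(t)$ is the intensity of the job stream for reading and $\lambda_j^{(1)}(t)$ is the intensity of the
job stream for recording.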
    Then the stream is transferred to SDEM, where a distributed ledger transaction is opened and the job
$\chi_{j,q}$ is converted into a set of standard data blocks (jobs) of the system: 1. metadata blocks of size
$\alpha_M$ bits, which contain the status of the distributed ledger transaction of the system and are
processed by type 1 nodes; 2. data blocks of size $\alpha_D$ bits, which are processed by type 2 and type 3
nodes. Thus, 3 classes of standard jobs are generated in the system: class D1 – a job for recording the
data block $\alpha_D$; class D0 – a job for reading the data block $\alpha_D$; class M – a job for processing
the metadata block $\alpha_M$.
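    Let's define the average time for transfer of jobs between the node $G_j$ of the system and the node
$H_i$ as the arithmetic mean over the matrix $V$ (1) for the packets $\alpha_M$ and $\alpha_D$, correspondingly:
$$t_j^{M} = \frac{\sum_{k=1}^{\xi} 1/\upsilon_{j,k}}{\xi} \cdot \alpha_M;\quad t_j^{D} = \frac{\sum_{k=1}^{\xi} 1/\upsilon_{j,k}}{\xi} \cdot \alpha_D. \tag{4}$$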
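
    The averaging in (4) can be sketched in a few lines. The function name `mean_transfer_time` is
hypothetical, and the sketch assumes that the mean is taken over one full row of $V$ (all $\xi$ entries,
including the node's own), with the block size and the speeds expressed in consistent units.

```python
from typing import Sequence


def mean_transfer_time(speed_row: Sequence[float], block_size: float) -> float:
    """Mean of 1/speed over one row of V, multiplied by the block size."""
    if not speed_row or any(v <= 0 for v in speed_row):
        raise ValueError("speeds must be positive and the row non-empty")
    return block_size * sum(1.0 / v for v in speed_row) / len(speed_row)
```

Because the reciprocals of the speeds are averaged, a single slow link increases the mean transfer time
much more than a single fast link decreases it.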
    The system provides for simultaneous processing of the class D1 jobs generated by SDEM from one job
$\chi_{j,q}$, in the amount of $p_{j,q} = K \cdot \chi_{j,q}/\alpha_D$, and it strictly prohibits recording two packets
from one $\chi_{j,q}$ into one node of processing classes 1 or 2. That is, the average number of class D1 jobs
generated by the SDEM of the
𝑗th source must satisfy the inequality 𝑝             𝐾 ∙ 𝑀 /𝛼             , where Ϸ – coefficient that
                                                                      Ϸ
characterizes the mean-square deviation of the value $\chi_{j,q}$ from its statistical expectation $M_j$. This
defines the requirement on the size $\alpha_D$ of the system:
$$\alpha_D \ge \max_j \frac{\text{Ϸ}_j \cdot M_j}{\xi_2 + \xi_3};\quad j = \overline{1,\xi_3}. \tag{5}$$
    The size of the block $\alpha_M$ may be chosen arbitrarily.
    The average time spent by SDEM to process one job from the stream $\Phi_j$,
$$t_j^{SDEM} = f\left(X_j^{SDEM}, M_j\right), \tag{6}$$
is a function of the resources $X_j^{SDEM}$ provided by the node to SDEM, the intensity $\lambda_j(t)$
and the statistical expectation $M_j$, which characterizes the average size of a job in bits.
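    The total intensities of generation of the jobs of classes D0, D1 and M in the simulated QN are,
correspondingly:
$$\lambda^{(0)}(t) = \sum_{j=1}^{\xi_3} \lambda_j^{(0)}(t);\quad \lambda_{D0}(t) = \sum_{j=1}^{\xi_3} \lambda_j^{(0)}(t) \cdot \frac{M_j}{\alpha_D}; \tag{7}$$
$$\lambda^{(1)}(t) = \sum_{j=1}^{\xi_3} \lambda_j^{(1)}(t);\quad \lambda_{D1}(t) = K \cdot \sum_{j=1}^{\xi_3} \lambda_j^{(1)}(t) \cdot \frac{M_j}{\alpha_D}; \tag{8}$$
$$\lambda(t) = \sum_{j=1}^{\xi_3} \lambda_j(t);\quad \lambda_M(t) = \left(\xi_1 + 2\right) \cdot \lambda(t). \tag{9}$$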
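
    The arithmetic behind the class intensities can be sketched as follows. The sketch rests on an
assumed reading of (7)–(9): each source contributes its read or write intensity multiplied by the
number of data blocks per job (mean job size divided by the data block size), write jobs are multiplied
by the redundancy coefficient $K$, and every initial job produces as many metadata jobs as there are
metadata nodes plus two. The function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def class_intensities(read_rates: Sequence[float], write_rates: Sequence[float],
                      mean_sizes: Sequence[float], alpha_d: float,
                      k: int, n_metadata_nodes: int) -> Tuple[float, float, float]:
    """Return the intensities of the job classes D0, D1 and M at one instant."""
    lam_d0 = sum(r * m / alpha_d for r, m in zip(read_rates, mean_sizes))
    lam_d1 = k * sum(w * m / alpha_d for w, m in zip(write_rates, mean_sizes))
    lam_total = sum(r + w for r, w in zip(read_rates, write_rates))
    lam_m = (n_metadata_nodes + 2) * lam_total
    return lam_d0, lam_d1, lam_m
```

Under this reading, halving the data block size doubles the D0 and D1 intensities but leaves the
metadata intensity unchanged.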

    The streams $\Phi_j$ are independent, and none of them is comparable in intensity with the
cumulative stream. Therefore, in accordance with the Khinchin theorem [7, 8, 9], it is fair to consider
the streams D0, D1 and M to be asymptotically Poisson (simplest) streams, possibly nonstationary.
If the number of nodes $\xi_3 \to \infty$ in the network $G$, the cumulative stream
$$\Phi(t) = \sum_{j=1}^{\xi_3} \Phi_j(t) \tag{10}$$
tends to the simplest stream [9].
    Let's determine the common deterministic parameters of the node $H_i$ of the QS (job processing
centers), which depend on the hardware parameters of the node, namely: $ml_i$ – the queue size, $sl_i$ –
the storage size, $A_i$ – the bandwidth of the node's QS, $t_{i,\alpha}$ – the job processing time. In
accordance with the classification established, the processing node receives jobs with a fixed length of
$\alpha \in \{\alpha_M, \alpha_D\}$ bits. The vector of hardware and identification parameters of the node that are
significant for the modeling:
$$H_i = \left(h_i^{T}, h_i^{A}(t), X_i\right),\quad i \in H; \tag{11}$$
$$h_i^{A}(t) = \begin{cases} 1, & ml_i - L_i(t) > 0 \\ 0, & ml_i - L_i(t) \le 0 \end{cases};\quad h_i^{T} \in \{1, 2, 3\}; \tag{12}$$
where $h_i^{T}$ – the variable storing the node type number; $h_i^{A}(t)$ – the attribute of availability of
free RAM space (slots in the node's queue); $i$ – the unique identifier of the node in the system network;
$X_i = \left(x_{i,1}, \dots, x_{i,7}, \upsilon_i^{\max}\right)$ – the vector of hardware parameters of the node, in
which $x_{i,1}, \dots, x_{i,7}$ are the ROM processing speed, maximum available ROM capacity, RAM processing
speed, maximum available RAM capacity, number of processors, number of cores per processor, and
processing power of the processor's core, correspondingly; $X_i \in \mathrm{X}$, $i = \overline{1,\xi}$, $x_{i,l} > 0$,
$\upsilon_i^{\max} > 0$, $l = \overline{1,7}$; $\mathrm{X}$ – the bounded countable set whose members are taken from the
reference data of equipment manufacturers; $H$ – the node set of the system; $L_i(t)$ – the instantaneous
number of unserved jobs sent to the node; $\xi$ – the number of nodes in the system.
         Types 1 and 2 nodes function as processing centers. On them, the node’s software consumes a
certain fixed part, the size of which is determined by the value of the parameter ℎ :
                        𝑋 ℎ          𝑥 ℎ ,…,𝑥 ℎ ,               𝜐 ℎ        ;                        (13)

where 𝑥 , ℎ         0; 𝜐 ℎ        0; 𝑙 1,7; 𝑖 1, 𝜉; 𝑋 1           𝑋 2      𝑋 3
   Type 3 nodes combine the function of type 1 processing center which consumes the resources as per
(13) and the function of JSG as part of VM and SDEM which consume: 𝑋        𝑥 ,…,𝑥 ,0 , и 𝑋
 𝑥 , … , 𝑥 , 𝜐 , 𝑖 1, 𝜉, correspondingly. The vector of resource consumption in JSG:
                          0, ℎ       3
       𝑋 ℎ                                 ; 𝑥,  0𝑥 ,    0; 𝜐     0; 𝑙 1,7; 𝑖 1, 𝜉;           (14)
                      𝑋     𝑋 , ℎ        3



   Let’s determine the resources 𝑋 available to the data processing center taking into account (1), (11)-
(14):
                            𝑋     𝑋       𝑋 ℎ           𝑋     ℎ                                   (15)
     𝑋    ℎ           𝑥      ℎ    ,…,𝑥          ℎ           ,𝜐 ℎ             ; 𝑥,      ℎ         0; 𝜐 ℎ         0; 𝑙
                       1,7; 𝑖 1, 𝜉
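    In accordance with (15), let's determine the parameters $ml_i$, $sl_i$, $A_i$ and $t_{i,\alpha}$ of the $i$-th node:
$$ml_i = x_{i,4}\left(h_i^{T}\right)/\alpha; \tag{16}$$
$$sl_i = x_{i,2}\left(h_i^{T}\right)/\alpha; \tag{17}$$
$$A_i = \frac{\min\left(x_{i,1}\left(h_i^{T}\right),\ x_{i,3}\left(h_i^{T}\right),\ \prod_{l=5}^{7} x_{i,l}\left(h_i^{T}\right),\ \upsilon_i\left(h_i^{T}\right)\right)}{\alpha}; \tag{18}$$
$$t_{i,\alpha} = 1/A_i, \tag{19}$$
where $x_{i,l}\left(h_i^{T}\right)$ and $\upsilon_i\left(h_i^{T}\right)$ are the components of the resources (15) available to the
processing center, numbered in the order in which the hardware parameters of the node are listed, and
$\alpha$ is the size of the block processed by the node.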
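
    The derivation of the node parameters from its available resources can be sketched as follows. The
sketch assumes the reading of (16)–(19) in which the queue size is limited by RAM capacity, the storage
size by ROM capacity, and the bandwidth by the slowest of the ROM, RAM, processor and network
subsystems. The names `NodeResources` and `node_parameters` are hypothetical, and all speeds are
assumed to be expressed in block-size units per second.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NodeResources:
    rom_speed: float          # persistent storage throughput
    rom_capacity: float       # persistent storage capacity
    ram_speed: float          # RAM throughput
    ram_capacity: float       # RAM capacity
    processors: int           # number of processors
    cores_per_processor: int  # cores per processor
    core_power: float         # throughput of a single core
    link_speed: float         # local maximum connection speed


def node_parameters(r: NodeResources, alpha: float) -> Tuple[float, float, float, float]:
    """Return (queue size, storage size, bandwidth, processing time) for block size alpha."""
    ml = r.ram_capacity / alpha
    sl = r.rom_capacity / alpha
    cpu = r.processors * r.cores_per_processor * r.core_power
    bandwidth = min(r.rom_speed, r.ram_speed, cpu, r.link_speed) / alpha
    return ml, sl, bandwidth, 1.0 / bandwidth
```

With a disk throughput of 61,440 kB/s as the bottleneck and a block size of 1,024 kB, for instance, the
sketch yields a bandwidth of 60 jobs per second and a processing time of 1/60 s per job.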
    In accordance with the system's operation logic, let's select in the QN $M \subset H$, the subset of $\xi_1$
nodes $H_i$ forming the independent QN that processes jobs of class M, and let's divide the subset
$D \subset H$ of $\xi_{2,3} = \xi_2 + \xi_3$ nodes $H_i$, which form the network for processing jobs of classes D0
and D1, into the subnetworks (subsets) $D_k$, $k = \overline{1,K}$, in such a way that the subnetwork $D_1$
receives the nodes with the highest bandwidth of the node's QS $A_i$, and $D_K$ those with the smallest.
Let's enter in $H_i$ the
additional indices  ,  : 𝐻              𝐻         ℎ ,ℎ ,𝑋                   , 𝑖, 𝛾 |ℎ          1 ∈ 𝑀; 𝛾      1, 𝜉 ; 𝐻     𝐻

 ℎ ,ℎ ,𝑋          , 𝑖, 𝛿 |ℎ      ∈ 2, 3       ∈ 𝐷; 𝛿            1, 𝜉 , and apply to 𝐷 the function of sorting to get the
ordered set: 𝑓: 𝐷 ⟶ 𝐻 ∈ 𝐷              |𝐴           𝐴           . As a resulting set of the nodes 𝐷                let’s determine
the subsets 𝐷 :
                                                                                  ,
                                                                             ∑        𝑠𝑙
          𝐻 ∈𝐷 |            𝑠𝑙     𝜍 ;𝜍             𝜍           ∆; ∆                       ; 𝜍    ∆; 𝑘    1, 𝐾 ;            (20)
                                                                                  𝐾

where $h_i^{T}$ – the attribute of the node belonging to a specific type based on the classification
introduced. After this, we will be analyzing the networks $M$ and $D_k$. The network $M$ is an open QN
with the intensity of the stream from the outer source $\lambda_M(t)$ (9) and one class (M) of jobs. When
processing each job from the stream $\Phi_j$ (2), $\xi_1 + 2$ jobs of class M are generated, of which $\xi_1$
enter the nodes $H_\gamma \in M$ one per node, while the remaining jobs are distributed among the nodes
$H_\gamma$ depending on the capacity and the size of the node's queue. The intensity of the input stream,
without taking into account the stream of resent jobs in the nodes $H_\gamma$:
$$\lambda_\gamma(t) = \left(1 + e_\gamma\right) \cdot \sum_{j=1}^{\xi_3} \lambda_j(t). \tag{21}$$
    Let's define the coefficients $e_\gamma$ taking into account (9) and (21) using the following equation:
$$\left(\xi_1 + 2\right) \cdot \sum_{j=1}^{\xi_3} \lambda_j(t) = \sum_{\gamma=1}^{\xi_1} \left(1 + e_\gamma\right) \cdot \sum_{j=1}^{\xi_3} \lambda_j(t) \implies \sum_{\gamma=1}^{\xi_1} e_\gamma = 2. \tag{22}$$
    Let's define $e_\gamma$ as:




              ⎧
               2∙             𝐴           m𝑖𝑛 𝐴           ∙ 𝑚𝑙                   min 𝑚𝑙              𝐴 ∙ 𝑚𝑙 ,                𝐴              0           𝑚𝑙       0
              ⎪
              ⎪
              ⎪
    𝑒                                         2∙ 𝐴            m𝑖𝑛 𝐴                  𝐴 ,             𝐴             0        𝑚𝑙              0                        ;       (23)
              ⎨
              ⎪                           2 ∙ 𝑚𝑙              min 𝑚𝑙                 𝑚𝑙 ,                 𝐴            0     𝑚𝑙                 0
              ⎪
              ⎪
              ⎩                                               2⁄𝜉 , 𝐴                            0       𝑚𝑙            0
where 𝐴               max 𝐴                m𝑖𝑛 𝐴 ; 𝑚𝑙                            max 𝑚𝑙              min 𝑚𝑙 ; 𝛾                  1, 𝜉 .
    Defining the coefficients $e_\gamma$ in the form (23) not only satisfies the equation (22), but also
redistributes the load to the most productive nodes of the network $M$. Each node of the network $M$ is a
QS of $G|G|1|ml_\gamma$ type as per Kendall's classification with the queueing discipline FCFS [4]. The node
$H_\gamma \in M$ at any moment of time $t$ can be in the state $S_{\gamma,z}$, $z = \overline{0, ml_\gamma + 1}$, where $z = 0$ is the
state in which the number of jobs in the queue $L_{\gamma,z} = 0$; for $0 < z \le ml_\gamma$, $L_{\gamma,z} = z$; and
$z = ml_\gamma + 1$ means that the node is overloaded and servicing of the job is denied. The probability
distribution $P_{\gamma,z}(t)$ of the states $S_{\gamma,z}$ is established by the system of Kolmogorov differential
equations [2, 4]:
$$\frac{dP_{\gamma,0}(t)}{dt} = -\lambda_\gamma(t) \cdot P_{\gamma,0}(t) + \mu_\gamma(t) \cdot P_{\gamma,1}(t);$$
$$\frac{dP_{\gamma,z}(t)}{dt} = \lambda_\gamma(t) \cdot P_{\gamma,z-1}(t) - \left(\lambda_\gamma(t) + \mu_\gamma(t)\right) \cdot P_{\gamma,z}(t) + \mu_\gamma(t) \cdot P_{\gamma,z+1}(t); \tag{24}$$
with the starting condition $P_{\gamma,0}(0) = 1$, the normalization requirement
$\sum_{z=0}^{ml_\gamma + 1} P_{\gamma,z}(t) = 1$ and the intensity of the output stream
$$\mu_\gamma(t) = \begin{cases} A_\gamma, & \lambda_\gamma(t) \ge A_\gamma \\ \lambda_\gamma(t), & \lambda_\gamma(t) < A_\gamma \end{cases}.$$
Solution of the system (24) is, in general, possible using numerical techniques [5, 6], for example, the
computational procedure proposed in [10, 11]. If we know the probability distribution $P_{\gamma,z}(t)$, then,
taking into account the ordinary, homogeneous and asymptotically Poisson nature of the input stream of
requests, we can determine the distribution of the average number of jobs in the queue of the node
$H_\gamma$ as [2]: $L_\gamma(t) = \sum_{z} z \cdot P_{\gamma,z}(t)$. Then distribution of the virtual time for processing of the
job taking into account (4),
                                      ,
(19): 𝑡       ,   𝑡                               𝑡 ,         𝑡         . It should be additionally noted that the node 𝐻 generates a
stream of denials with the intensity: 𝜆                                  ,       𝑡           𝑃,               𝑡 ∙ 𝜆 𝑡 . The jobs denied must be submitted
for processing to the available nodes of the network 𝑀 based on the queue size 𝑚𝑙 . Here the stream of
jobs resent to the nodes 𝐻 , taking into account the function of availability (12), will be as follows:
𝜆    ,    𝑡       𝑒       ,       𝑡 ∙ ∑               𝜆   ,       𝑡 . Let’s define the coefficients 𝑒                               ,       based on the equation:
                                                                   ∑             𝑒       ,       𝑡       1.
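
    The numerical solution of the Kolmogorov system (24) mentioned above can be illustrated with a
minimal sketch. It is not the procedure of [10, 11]: it uses the explicit Euler method, assumes constant
arrival and service intensities, and closes the system with the standard boundary equation for the
overload state $z = ml + 1$, which the text does not print. The function names are hypothetical.

```python
from typing import List


def integrate_queue(lam: float, mu: float, ml: int,
                    t_end: float, dt: float = 1e-3) -> List[float]:
    """Integrate the state probabilities of a single-server queue with states 0..ml+1."""
    n = ml + 2
    p = [1.0] + [0.0] * (n - 1)  # starting condition: the queue is empty at t = 0
    for _ in range(int(round(t_end / dt))):
        d = [0.0] * n
        for z in range(n):
            inflow = 0.0
            if z > 0:
                inflow += lam * p[z - 1]   # arrival moves the node from z-1 to z
            if z < n - 1:
                inflow += mu * p[z + 1]    # service moves the node from z+1 to z
            outflow = (lam if z < n - 1 else 0.0) + (mu if z > 0 else 0.0)
            d[z] = inflow - outflow * p[z]
        p = [pz + dt * dz for pz, dz in zip(p, d)]
    return p


def mean_queue_length(p: List[float]) -> float:
    """Average number of jobs, computed as the sum of z * P_z over all states."""
    return sum(z * pz for z, pz in enumerate(p))
```

The last element of the returned list is the probability of the overload state, so multiplying it by the
arrival intensity gives the intensity of the stream of denials in this simplified setting.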
    Let’s define 𝑒                ,       𝑡 as:
                                                                                 𝐸       𝑚𝑐 ,                                                                                (1)


              ⎧       𝐴       𝐴           ,   𝑡       ∙ 𝑚𝑙              𝑚𝑙           ,       𝑡       𝐴        𝑡 ∙ 𝑚𝑙        𝑡 , 𝐴           𝑡           0 ⋀ 𝑚𝑙   𝑡       0
              ⎪
                                                  𝐴       𝐴         ,        𝑡       𝐴           𝑡 , 𝐴         𝑡       0 ⋀ 𝑚𝑙           𝑡           0                        ,
              ⎨
              ⎪                               𝑚𝑙          𝑚𝑙 , 𝑡 𝑚𝑙 𝑡 , 𝐴 𝑡                                                0 ⋀ 𝑚𝑙           𝑡           0
              ⎩                                            1⁄𝜉 , , 𝐴 𝑡  0 ⋀ 𝑚𝑙                                             𝑡   0
where




                      𝐴        𝑡           𝐴       ,       𝑡       𝐴       ,   𝑡 ; 𝑚𝑙             𝑡         𝑚𝑙                  ,       𝑡         𝑚𝑙             ,       𝑡 ; 𝛾         1, 𝜉 ;
                 𝐴             ,   𝑡           max 𝐴 ∈ 𝐻 : ℎ                         𝑡           1 ;𝐴           ,       𝑡               min 𝐴 ∈ 𝐻 : ℎ                             𝑡          1;
        𝑚𝑙            ,        𝑡           max 𝑚𝑙 ∈ 𝐻 : ℎ                        𝑡           1;         𝑚𝑙              ,       𝑡               min 𝑚𝑙 ∈ 𝐻 : ℎ                           𝑡        1.
    Taking into account the intensity of the stream of denials it would be fair to record 𝜆 𝑡 (21) as:

                           𝐸𝜆 𝑡                        1       𝑒       ∙ 1      𝑒        ,       𝑡 ∙            𝑃,                          𝑡     ∙          𝜆 𝑡 ;                                (25)

   Considering the functional model of the network 𝑀 (21) – (25) let’s define the concluding virtual
distribution of time for execution of the jobs 𝛷 , by the network 𝑀:
                                                ∑    𝑧 ∙𝑃 , 𝑡                                   (26)
                   𝑇 , 𝑡        max 𝑡 , 𝑡                         𝑡 ,  𝑡
                                                    𝜆 𝑡
   The network 𝐷         is an open QN with two classes (D1, D0) of jobs and intensity of the streams
from the external source 𝜆 𝑡 and 𝜆 𝑡 , divided into the subnetworks 𝐷 (20). Taking into account
the nature of the networks 𝐷 arranged by capacity of the nodes 𝐻 let’s divide the stream D1 between
the nodes of the networks 𝐷 with the intensities: 𝜆 , , 𝑡             𝑒 , ∙ 𝜆 𝑡 /𝐾, where 𝑒 ,
𝛿 ∑                                𝛿; ∑                                𝑒       1; 𝑞              ∑          𝑐𝑜𝑟𝑑 𝐷                      ;𝑐            1, 𝑘           1, 𝑞         0. The stream D0
is divided by the nodes: 𝐻 of the network 𝐷      :𝜆 , 𝑡        𝑒 ∙ 𝜆 𝑡 . As subjective estimation of a
system’s processing speed is defined by the speed of reading, and 𝐷         is the multitude arranged by
                                                           ,         ,
capacity of the nodes 𝐻 , let’s define 𝑒 as: 𝑒      𝛿 ⁄∑     𝛿; ∑      𝑒     1. Then the intensity of the
input stream of the node 𝐻 without taking into account the resent jobs:
                                 𝜆 𝑡     𝜆 , 𝑡       𝜆 , , 𝑡 ;                                      (27)
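    The node $H_\delta \in D$ of the network is a QS of $G|G|1|ml_\delta$ type with the intensity of the input stream
$\lambda_\delta(t)$, and at any specific time $t$ it can be in the state $S_{\delta,z}$, $z = \overline{0, ml_\delta + 1}$, where $z = 0$ is
the state in which the number of jobs in the queue $L_{\delta,z} = 0$; for $0 < z \le ml_\delta$, $L_{\delta,z} = z$; and
$z = ml_\delta + 1$ means that the node is overloaded and the servicing of the job is denied. The probability
distribution $P_{\delta,z}(t)$ of the states $S_{\delta,z}$ is defined by the system of Kolmogorov differential
equations [2, 4]:
$$\frac{dP_{\delta,0}(t)}{dt} = -\lambda_\delta(t) \cdot P_{\delta,0}(t) + \mu_\delta(t) \cdot P_{\delta,1}(t);$$
$$\frac{dP_{\delta,z}(t)}{dt} = \lambda_\delta(t) \cdot P_{\delta,z-1}(t) - \left(\lambda_\delta(t) + \mu_\delta(t)\right) \cdot P_{\delta,z}(t) + \mu_\delta(t) \cdot P_{\delta,z+1}(t); \tag{28}$$
with the initial condition $P_{\delta,0}(0) = 1$, the normalization requirement
$\sum_{z=0}^{ml_\delta + 1} P_{\delta,z}(t) = 1$, and the intensity of the output stream
$$\mu_\delta(t) = \begin{cases} A_\delta, & \lambda_\delta(t) \ge A_\delta \\ \lambda_\delta(t), & \lambda_\delta(t) < A_\delta \end{cases}.$$
Let's define the distribution of the average number of jobs in the queue of the node $H_\delta$ as [2]:
$L_\delta(t) = \sum_{z} z \cdot P_{\delta,z}(t)$. Then distribution of the virtual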
                                                                                                                                                             ,
time for execution of the job taking into account (4), (19): 𝑡                                                                      ,   𝑡                                   𝑡 ,        𝑡 . The node
𝐻 generates the streams of denials of classes D0 and D1 with the intensities: 𝜆 , , 𝑡
𝑃 ,       𝑡 ∙ 𝜆 , 𝑡 and 𝜆 , , 𝑡             𝑃 ,      𝑡 ∙ 𝜆 , , 𝑡 . The jobs denied must be submitted
for processing to nodes of the network 𝐷        which are available based on the size of the queue 𝑚𝑙 .
Here the stream of resent jobs sent to the nodes 𝐻 taking into account the function of availability (12)
will be as follows: 𝜆 , , 𝑡                 𝑒 , , 𝑡 ∙ ∑ , 𝜆 , , 𝑡 and d 𝜆 , , , 𝑡
                                       ,                                                                                                                                           ,
𝑒       ,       , ,       𝑡 ∙∑                 𝜆       ,   , ,     𝑡           where                    𝑒           ,       ,       𝑡           𝛿 ℎ                  𝑡       ∑         𝛿 ℎ         𝑡     ,
    ,
∑           𝑒             1,                                                         𝑒       ,    , ,       𝑡               𝛿 ℎ                  𝑡               ∑                     𝛿 ℎ         𝑡     ,



∑            𝑒    ℎ     𝑡      1,     𝑞       ∑       𝑐𝑜𝑟𝑑 𝐷    , 𝑞   0,   𝛿 ℎ      𝑡       𝛿∙ℎ      𝑡 .
Together with the intensity of the stream of denials it will be fair to record 𝜆 𝑡 (27) as:
                  𝜆 𝑡      𝜆 , 𝑡        𝜆 , , 𝑡      𝜆 , , 𝑡        𝜆 , , , 𝑡 ;               (29)
   Taking into account the functional model of the network 𝐷 (20), (27) – (29) and concurrent
processing of the jobs of classes D0 and D1 as generated by SDEM from the job 𝛷 , , let’s define the
final virtual distribution of time for execution of the jobs 𝛷 , by the network 𝐷:
                                                      ,
                                                  ∑  𝑡 , 𝑡
                                    𝑇 ,   𝑡                   ;                                   (30)
                                                    𝜉,
   Then the virtual distribution of time for execution of the jobs 𝛷 , as generated by the node 𝐺 of the
system on the basis of (7), (28), (41):
                             𝑇 𝑡        max 𝑇 , t , 𝑇 , t         𝑡 ;                             (31)


4. Some Modeling Results

    Based on the model (31), simulation software for a corporate computer network was implemented in
the Python programming language [12]. With it, the operation of the network was simulated using the
example previously reviewed in our work [1]. The network comprises
$\xi = 500$ nodes of four hardware types (Table 1), which differ in the capacity of their disk
subsystems and in the amount of RAM (which determines the maximum queue size) and which are
interconnected by a Gigabit Ethernet network. The capacity of the file subsystem was treated as
unlimited. Unlike [1] there are no servers in the network (𝜉   0). Reading/writing operations are
performed in the emulation. The network for data processing comprises 200 nodes 𝜉 , and divided in
accordance with the model into 𝐾 4 subsystems, 50 nodes each. Actually utilized are 50 nodes per
each hardware type. The JSG network comprises 𝜉          500 nodes.

Table 1
Hardware types of the nodes in the network
     Hardware Type     Quantity     Disk Subsystem Capacity (kB/s)     Amount of RAM (kB)
           1              50                   61,440                       8,388,608
           2              50                   61,440                       4,194,304
           3             350                   61,440                       2,097,152
           4              50                  409,600                      16,777,216
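
The node inventory of Table 1 can be written down as a small data structure, which makes it easy to check the totals quoted in the text. This is only an illustrative sketch: the class and function names are hypothetical, and the conversion to GiB assumes that 1 kB in the table means 1024 bytes, which the paper does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareType:
    """One row of Table 1 (field names are illustrative, not from the paper)."""
    type_id: int
    quantity: int        # number of nodes of this hardware type
    disk_kb_per_s: int   # capacity of the disk subsystem, kB/s
    ram_kb: int          # amount of RAM, kB (bounds the maximum queue size)


# Values transcribed from Table 1.
HARDWARE_TYPES = [
    HardwareType(1, 50, 61_440, 8_388_608),
    HardwareType(2, 50, 61_440, 4_194_304),
    HardwareType(3, 350, 61_440, 2_097_152),
    HardwareType(4, 50, 409_600, 16_777_216),
]


def total_nodes(types):
    """Total number of nodes over all hardware types."""
    return sum(t.quantity for t in types)


def ram_gib(hw):
    """RAM in GiB, ASSUMING 1 kB = 1024 bytes (not stated in the paper)."""
    return hw.ram_kb / (1024 * 1024)


if __name__ == "__main__":
    print("total nodes:", total_nodes(HARDWARE_TYPES))
    for hw in HARDWARE_TYPES:
        print(hw.type_id, hw.quantity, hw.disk_kb_per_s, ram_gib(hw))
```

Under that unit assumption, the four types correspond to 8, 4, 2 and 16 GiB of RAM, and the quantities add up to the 500 nodes stated above.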

    The following values of the parameters of the model (31) were used in the computations: the size
of a data packet was within the range 𝛼 ∈ [100 𝑘𝐵, 2000 𝑘𝐵]; the size of a metadata packet was
𝛼 = 2 𝑘𝐵; the parameters of the job stream 𝛷 for the two experiments are given in Table 2. The
parameters of the first experiment simulate peak activity in the test network when real users process
multimedia data (opening, editing, and reporting video, audio, or any other graphic data). The second
experiment simulates an extreme situation in which every node of the network is, at any given time,
automatically reading very large amounts of data from the distributed storage or writing them to it, for
instance when executing a queue of jobs that copy multimedia or graphic data. The aggregate stream of
traffic coming from the JSG network is determined by the sum (10).

Table 2
Parameters of the job stream from one node of the JSG network
          Parameter                      Experiment 1                          Experiment 2
        𝜆(𝑡) (jobs/s)                         1                                    1
           𝑀 (kB)                        [300; 2000]                          [3000; 20000]

   To estimate the outgoing job stream (the results of execution of 𝛷 ), the following statistical
values were used [13, 14]: the mathematical expectation 𝑀 (kB/s) and the variance 𝑆 (kB/s) of the
data processing speed 𝑉 = 𝜒 ⁄ 𝑇(𝑡) (kB/s), where 𝑆 is understood as the mean-square (standard)
deviation of 𝑉. In addition, the number 𝐿 of packets 𝛼 left unserved at the end of the experiment
was recorded (counted in packets).
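
The two statistics can be computed from per-job measurements as in the following minimal sketch. It assumes that each job yields a processed volume (kB) and an execution time (s), and it uses the population form of the standard deviation; the paper does not say whether the population or the sample form was used. The function names are hypothetical.

```python
import statistics


def processing_speeds(volumes_kb, times_s):
    """Per-job processing speed V = volume / time, in kB/s."""
    if len(volumes_kb) != len(times_s):
        raise ValueError("volumes and times must have the same length")
    return [v / t for v, t in zip(volumes_kb, times_s)]


def speed_statistics(speeds):
    """Return (mean, standard deviation) of the speeds, both in kB/s.

    ASSUMPTION: population standard deviation (statistics.pstdev);
    swap in statistics.stdev for the sample form.
    """
    return statistics.fmean(speeds), statistics.pstdev(speeds)


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    speeds = processing_speeds([1000, 2000, 3000], [1.0, 1.0, 1.0])
    mean_v, std_v = speed_statistics(speeds)
    print(mean_v, std_v)
```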
   The results for Experiment 1 mode (Figure 1) show that the expected processing speed 𝑀 grows
strictly linearly with the job stream parameter 𝑀 and is practically independent of the packet size 𝛼 .
At the same time, a certain correlation of 𝑆 with the values of 𝛼 and 𝑀 can be observed. The chart
of 𝐿 is not shown because there were no service denials in Experiment 1 mode. This behavior
indicates that the capacity of the distributed data storage system built according to the proposed model
is sufficient, and even somewhat redundant, for traffic whose input parameters equal the values of
Experiment 1 mode (Table 2).




                                  (a) 𝑀                                (b) 𝑆
Figure 1: Simulation Results in Experiment 1 Mode

    The results for Experiment 2 mode (Figure 2) show that the expected processing speed 𝑀 grows
strictly linearly with the job stream parameter 𝑀 up to 𝑀 = 7000 𝑘𝐵. Beyond this value a clearly
defined productivity dip is observed, which indicates the initial stage of system overload and the
accumulation of queues at the nodes of the data processing network; a significant dependence on the
packet size 𝛼 also appears. Starting from 𝑀 = 7000 𝑘𝐵, the variance 𝑆 begins to grow
substantially. Despite the obvious overload, there are no denials 𝐿 except in the region
𝛼 ∈ [100 𝑘𝐵, 500 𝑘𝐵], 𝑀 ∈ [11000 𝑘𝐵; 20000 𝑘𝐵]. Thus, the distributed data storage system built
according to the proposed model demonstrates acceptable productivity for traffic whose input
parameters equal the values of Experiment 2 mode (Table 2), except for
𝛼 ∈ [100 𝑘𝐵, 500 𝑘𝐵], 𝑀 ∈ [11000 𝑘𝐵; 20000 𝑘𝐵].
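
The sequence described here (linear growth, then queue accumulation, then denials) is what any node with a finite buffer shows once the arriving load exceeds its service rate. The sketch below illustrates this with a deterministic fluid approximation of a single node. It is not the queueing network model of the paper: the function, its parameters and the numbers in the example are hypothetical and chosen only to show the mechanism.

```python
def simulate_node(arrival_kb_per_s, service_kb_per_s, buffer_kb, seconds):
    """Deterministic fluid sketch of one node with a finite queue.

    Each second, `arrival_kb_per_s` of data arrives and at most
    `service_kb_per_s` is served. Data that does not fit into the
    buffer of `buffer_kb` is counted as denied.
    Returns (backlog history per second, total denied volume in kB).
    """
    backlog = 0.0
    denied = 0.0
    history = []
    for _ in range(seconds):
        backlog += arrival_kb_per_s              # new data joins the queue
        backlog -= min(backlog, service_kb_per_s)  # serve what we can
        if backlog > buffer_kb:                  # overflow becomes a denial
            denied += backlog - buffer_kb
            backlog = buffer_kb
        history.append(backlog)
    return history, denied


if __name__ == "__main__":
    # Underloaded node: no queue builds up and nothing is denied.
    print(simulate_node(50, 100, 1000, 5))
    # Overloaded node: the queue grows until the buffer is full,
    # after which the excess is denied every second.
    print(simulate_node(150, 100, 120, 5))
```

In this simplified picture a larger buffer delays the onset of denials but does not prevent them under sustained overload, which is consistent with queues accumulating well before any denials are recorded.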




                     (а) 𝑀                      (b) 𝑆                             (c) 𝐿
Figure 2: Simulation Results in Experiment 2 Mode


5. Conclusion

   The proposed model was tested in two fairly demanding operating modes at peak loads. The test
results suggest that the productivity of distributed data storage systems built according to the proposed
mathematical model is sufficiently high, despite the absence of high-end server hardware in the
network and the rather modest hardware parameters of its nodes, and is comparable with the
productivity of centralized data storage systems designed and built with expensive high-end server
hardware.

6. References

[1] A. Nenashev, V. Khryashchev, "The Economics of Introducing the Peer-to-peer System of Storage
     and Processing of Protected Information at an Enterprise," 2019 XXI International Conference
     Complex Systems: Control and Modeling Problems (CSCMP), Samara, Russia (2019) 769-772.
     doi: 10.1109/CSCMP45713.2019.8976720.
[2] L. Kleinrock, Theory of Queueing Systems: Translation from English. Translated by I.I. Glushko;
     edited by V.I. Neiman, Moscow: Mashinostroenie, p. 432, 1979.
[3] V. B. Marakhovsky, L. Ya. Rosenblum, A. V. Yakovlev, Concurrent Processes Modeling, Petri
     nets, Saint Petersburg: Professional Literature, p. 400, 2014.
[4] V. M. Vishnevsky, Basics of Computer Networks Design, Moscow: Tekhnosfera, p. 512, 2003.
[5] I. S. Berezin, N. P. Zhidkov, Computing Techniques, volume 1, Moscow: State Publishing House
     of Physico-mathematical Literature, p. 464, 1962.
[6] A. A. Samarsky, Introduction to Calculus of Approximations, Moscow: Science, p. 271, 1982.
[7] A. Ya. Khinchin, On Poisson Streams of Accidental Events, Probability Theory and Its
     Applications, volume 1, Issue 3, pp. 320–327, 1956.
[8] A. Ya. Khinchin, Mathematical Methods of Queueing Theory, Proceedings of the V. A. Steklov
     Mathematical Institute of the Academy of Sciences of the USSR, volume 49, pp. 1–123, 1955.
[9] A. B. Markhasin, Asymptotic model of accumulation of unsteady flows of like events with carry
     over effect in multiple access wireless networks, Herald of Siberian State University of
     Telecommunications and Information Science 4 (2011) 19-31.
[10] M. Yu. Livshits, A. V. Nenashev, Yu. E. Pleshivtseva, Computing algorithm for optimal control
     of an object with distributed constants in imperfect areas of end states, Herald of South Ural State
     University, Series: Mathematical Simulation and Programming 12(4) (2019) 41-51.
[11] M. Yu. Livshits, A. V. Nenashev, “Efficient computational procedure of alternance optimization
     method,” Herald of Samara State Technical University, Physico-mathematical Sciences Series,
     23(2) (2019) 361–377.
[12] Official website of the Python programming language. URL: https://www.python.org.
[13] I. K. Tsybriy, Statistical Analysis of Observed Data: Textbook, Rostov-on-Don: DGTU Publishing
     Center, p. 147, 2010.
[14] W. Ledermann, E. Lloyd, Handbook of Applicable Mathematics, Statistics, New York: Wiley,
     p. 511, 1984.



