=Paper= {{Paper |id=Vol-2842/paper_6 |storemode=property |title=About Cloud Storage Systems Survivability |pdfUrl=https://ceur-ws.org/Vol-2842/paper_5.pdf |volume=Vol-2842 |authors=Nikolay Kucherov,Inna Dvoryaninova,Mikhail Babenko,Natalia Sotnikova,Nguyen Viet Hung }} ==About Cloud Storage Systems Survivability== https://ceur-ws.org/Vol-2842/paper_5.pdf
About Cloud Storage Systems Survivability
Nikolay Kucherov a, Inna Dvoryaninova a, Mikhail Babenko a,b, Natalia Sotnikova a and
Nguyen Viet Hung c
a
  North-Caucasus Federal University, 1, Pushkin Street, Stavropol, 355017 Russia
b
  Institute for System Programming of the Russian Academy of Sciences, 25, Alexander Solzhenitsyn st.,
Moscow, 109004, Russia.
c
  LeQuyDon Technical University, 236 Hoang Quoc Viet, Hanoi, Vietnam


                Abstract
                This article proposes an approach to improving reliability and survivability based on modular
                arithmetic. The proposed approach makes it possible to increase the survivability of cloud
                storage systems, as well as reliability and fault tolerance of data storage. To increase fault
                tolerance in the event of a failure, the redistribution of the processed data is applied. The
                proposed model allows restoring the saved data in the event of failure of one or more cloud
                servers.

                Keywords
                Cloud computing, error correction code, survivability

1. Introduction
    As the amount of stored data grows, more and more users are switching to cloud storage
systems, and the requirements for the reliability of data storage in the cloud are constantly
increasing. The reliability of cloud storage [1-4] is understood as its ability to manage and store
data while keeping the established quality indicators within their limits throughout operation. It
mainly reflects the impact of intra-system factors, i.e. random failures of equipment, on the
performance of cloud storage.
    Cloud storage survivability [5-8] is the stability of its control and transmission system against
external causes aimed at disabling the cloud storage, as well as its resistance to cascading failures.
    The concepts of reliability and survivability have much in common and at the same time differ
significantly from each other. They are united by the principle of stability, which takes into account
the whole variety of factors, including the various failures that may occur. The stability index is a
function of the reliability, survivability and fault tolerance indicators.
    The differences between reliability and survivability stem from the different causes that disrupt
the normal functioning of a cloud system: failures differ in how they manifest themselves, in their
nature, scale and duration, in the methods used to eliminate them and in the methods used to increase
fault tolerance. Accordingly, the initial data, calculation methods, accuracy and the very essence of
the reliability and survivability indicators differ significantly. Reliability indicators are well
supported by statistical material, take the main influencing factors into account, are deeply developed
theoretically and can be used for sufficiently accurate forecasting, calculation, design and modeling.
Survivability indicators rather reflect the qualitative behavior of the cloud storage system under
external influences or cascading propagation of failures [9, 11].
    The reliability of cloud storage systems manifests itself in the form of failures. The concept of
failure is closely related to the concept of operability. Operability is the state of the cloud system, in

YRID-2020: International Workshop on Data Mining and Knowledge Engineering, October 15-16, 2020, Stavropol, Russia
EMAIL: nkucherov@ncfu.ru (Nikolay Kucherov); innadv99@mail.ru (Inna Dvoryaninova); mgbabenko@ncfu.ru (Mikhail Babenko);
sotnikova-natali@list.ru (Natalia Sotnikova); hungnv@mta.edu.vn (Nguyen Viet Hung)
ORCID: 0000-0003-0337-0093 (Nikolay Kucherov); 0000-0003-2174-2284 (Inna Dvoryaninova); 0000-0001-7066-0061 (Mikhail
Babenko); 0000-0001-5029-0390 (Natalia Sotnikova); 0000-0002-9818-4455 (Nguyen Viet Hung)
                2020 Copyright for this paper by its authors.
           Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
           CEUR Workshop Proceedings (CEUR-WS.org)



which it is able to perform the specified functions with the required parameters and quality of the
services provided. A failure is a random event that disrupts the operability of the cloud system.
    A variety of approaches can be used to build a distributed system for storing and processing data.
Some of them are based on cloud computing paradigms. These infrastructures have both common
characteristics and fundamental differences. Using clouds for storage requires security, a high degree
of reliability, availability and scalability, and fast access to distributed data despite limited
Internet bandwidth.
    Distributed storage can be based on multiple clouds. Typically, data is divided into several parts,
which are stored in different clouds to ensure availability in the event of a failure. However, failures
in distributed storage can cause inconsistencies between different copies of the same data.
    When large databases are stored, data processing and analysis must be performed using parallel
computing to ensure high performance.
    The second section considers the main problems of cloud storage and the uncertainty of emerging
failures. The third section presents methods for increasing the reliability and survivability of cloud
storage and considers the application of the residue number system to this problem. The final section
discusses information reservation as a method of increasing reliability, together with its advantages
and disadvantages.

2. Problems of cloud storage systems
    When data is stored in the cloud, access to the data may be denied, services may fail or degrade,
and sometimes their operation is interrupted for a long time. For these and other reasons, distributed
data processing inevitably faces a continuous stream of failures, errors and malfunctions. A cloud
storage failure can develop slowly, over a long period of time, or occur in a split second.
    Uncertainty can be seen as the difference between the available knowledge and complete knowledge.
It can be classified in several different ways depending on its nature [4].
    For the components of a cloud storage system, the uncertainty of failure can be classified as
follows:
    The uncertainty of a software failure is a limited uncertainty caused by complete or partial
ignorance of the conditions under which decisions must be made.
    The uncertainty of a hardware failure (hard drives, power systems, etc.) is a technical
uncertainty and results from the inability to predict the exact outcome of decisions.
    Claude Shannon, the founder of information theory, defined information as removed uncertainty.
More precisely, obtaining information is a necessary condition for removing uncertainty. Uncertainty
arises in a situation of choice. Removing uncertainty means reducing the number of options under
consideration (reducing the variety) and, as a result, choosing from the possible options the one that
corresponds to the situation. Removing uncertainty makes it possible to make informed decisions and
take action. This is the guiding role of information.
    The situation of maximum uncertainty presupposes the presence of several equally probable
alternatives (options), i.e. no option is preferred. Moreover, the more equally probable options there
are, the greater the uncertainty, the more difficult it is to make an unambiguous choice and the
more information is required to make it. For $N$ variants, this situation is described by the
following probability distribution: $\{1/N, 1/N, \dots, 1/N\}$.
    The minimum uncertainty is $0$, i.e. this is a situation of complete certainty, meaning that the
choice has been made and all the necessary information has been obtained. The probability distribution
for a completely certain situation looks like this: $\{1, 0, \dots, 0\}$.
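
    The two limiting cases above can be illustrated with Shannon entropy, used here as the measure
of uncertainty. The following is a minimal sketch; the function name `shannon_entropy` and the choice
of eight options are assumptions made for illustration only.

```python
import math

def shannon_entropy(probabilities):
    """Return the Shannon entropy (in bits) of a discrete distribution."""
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

option_count = 8  # hypothetical number of options
uniform = [1 / option_count] * option_count       # maximum uncertainty
certain = [1.0] + [0.0] * (option_count - 1)      # complete certainty

h_uniform = shannon_entropy(uniform)   # equals log2(option_count)
h_certain = shannon_entropy(certain)   # equals zero
print(h_uniform, h_certain)
```

    For equally probable options the entropy grows as the logarithm of their number, which matches
the statement that more equally probable options require more information to make a choice.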




3. Methods to improve reliability
    The reliability of cloud systems can be improved by mathematical methods, standardization, load
balancing, protection from external influences and an appropriate choice of data storage scheme.
    If these methods do not give the desired result, redundancy (reservation) has to be used
[12-16].
    Let us consider the factors affecting the survivability of cloud systems [17]. An important
difference between assessing survivability and related tasks, such as assessing reliability, is that,
as a rule, the probability of occurrence of particular situations cannot be used. To increase
survivability, one can apply a number system in which the loss of some part does not stop the
functioning of the entire system.
    That is why the choice of number system becomes more important for the functioning of a cloud
storage system. The number system largely determines the reliability model of the system, the
redundancy method, the prevention of cascading failures and the growth of arising errors. One natural
indicator of survivability is the value of a quality indicator that the system retains after a fixed
set of impacts.
    For cloud storage systems, the main indicators of reliability are:
    1. Probability of no-failure operation in the time interval from $0$ to $t$
                $P(t) = P(0, t) = P\{T \ge t\} = 1 - F(t)$,
    where $F(t) = P\{T < t\}$ is the distribution of the time $T$ to the first failure.
    2. The probability of failure-free operation of the facility in the time interval from $t_1$ to $t_2$
                $P(t_1, t_2) = P\{T \ge t_2 \mid T \ge t_1\} = \frac{P(0, t_2)}{P(0, t_1)} = \frac{P(t_2)}{P(t_1)}$
    3. Density of failure distribution
                $f(t) = \frac{dF(t)}{dt} = -\frac{dP(t)}{dt}$
    4. The rate of failure of objects at the time $t$
                $\lambda(t) = \frac{f(t)}{P(t)}$
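
    The four indicators can be evaluated numerically once a failure law is chosen. The sketch below
assumes an exponential time to first failure with a hypothetical constant rate `FAILURE_RATE`; the
paper does not prescribe a particular distribution, so this choice and all function names are
illustrative assumptions.

```python
import math

FAILURE_RATE = 0.01  # hypothetical constant failure rate, failures per hour

def failure_cdf(t):
    """F(t): probability that the first failure occurs before time t."""
    return 1.0 - math.exp(-FAILURE_RATE * t)

def survival(t):
    """P(t) = 1 - F(t): probability of no failure in the interval from 0 to t."""
    return 1.0 - failure_cdf(t)

def conditional_survival(t1, t2):
    """P(t1, t2) = P(t2) / P(t1): no failure up to t2 given no failure up to t1."""
    return survival(t2) / survival(t1)

def failure_density(t, step=1e-3):
    """f(t) = dF/dt, approximated by a central difference."""
    return (failure_cdf(t + step) - failure_cdf(t - step)) / (2 * step)

def failure_rate(t):
    """lambda(t) = f(t) / P(t)."""
    return failure_density(t) / survival(t)

print(survival(100), conditional_survival(100, 150), failure_rate(100))
```

    Under the exponential assumption the failure rate computed this way stays close to
`FAILURE_RATE` for every `t`, and `conditional_survival(t1, t2)` depends only on `t2 - t1`.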
    The main practical method for improving the reliability and survivability of cloud storage systems
is redundancy. Redundancy [18-21] is understood as a method of increasing the reliability of a cloud
system by introducing additional clouds in excess of the minimum required for the normal
functioning of the system.
    Basic types of redundancy:
    1. Structural.
    2. Temporal.
    3. Functional.
    4. Load.
    The highest stability, reliability and survivability are achieved by a system in which these
methods are harmoniously combined and complement each other [22, 23]. The Residue Number System (RNS)
has these characteristics. Using the RNS to build cloud storage systems provides both working and
backup clouds (one for each RNS base) and harmoniously combines all the types of redundancy
described above.

4. Information reservation in cloud storage systems
   The Residue Number System is a number system in which numbers are represented as a set of
non-negative residues $a_1, a_2, \dots, a_n$ with respect to pairwise coprime moduli $p_1, p_2, \dots, p_n$.
   Let the numbers be represented by the residues $a_1, a_2, \dots, a_n$ at the bases
$p_1, p_2, \dots, p_n$. We will assume that $k$ residues are sufficient for an unambiguous
representation of the number $A$, and $k < n$, $p_1 < p_2 < \dots < p_n$, while the working bases are
$p_1, p_2, \dots, p_k$.
   The range of unambiguous representation for the selected moduli is equal to the product of these
moduli, $P_k = \prod_{i=1}^{k} p_i$.
   Thus, in the representation of the number $A$ by the residues $a_1, a_2, \dots, a_n$, any $n - k$
residues can be discarded without compromising the uniqueness of the representation of the number $A$.
As a result, the RNS can control errors; it is a nonlinear code, which is called the $R$-code [24-27].
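
   The property that any surplus residues can be discarded can be sketched as follows. The moduli,
the number of working bases `K` and all function names are hypothetical choices for illustration;
recovery uses the Chinese remainder theorem and is a minimal sketch rather than the authors'
implementation.

```python
from itertools import combinations
from math import prod

MODULI = (3, 5, 7, 11, 13)         # hypothetical coprime bases, in ascending order
K = 3                              # hypothetical number of working bases
WORKING_RANGE = prod(MODULI[:K])   # 3 * 5 * 7 = 105

def to_residues(value, moduli=MODULI):
    """Represent a number by its residues, one per base (one per cloud)."""
    return tuple(value % p for p in moduli)

def from_residues(residues, moduli):
    """Recover a number from residues with the Chinese remainder theorem."""
    total = prod(moduli)
    value = 0
    for residue, modulus in zip(residues, moduli):
        partial = total // modulus
        # pow(x, -1, m) is the modular inverse (Python 3.8+).
        value += residue * partial * pow(partial, -1, modulus)
    return value % total

def recover_from_any_k(residues, moduli=MODULI, k=K):
    """Collect the values recovered from every choice of k surviving residues."""
    recovered = set()
    for kept in combinations(range(len(moduli)), k):
        kept_residues = [residues[i] for i in kept]
        kept_moduli = [moduli[i] for i in kept]
        recovered.add(from_residues(kept_residues, kept_moduli))
    return recovered

shares = to_residues(52)
print(shares, recover_from_any_k(shares))
```

   Because the working bases are the smallest ones, the product of any `K` bases is at least
`WORKING_RANGE`, so every subset of `K` surviving residues yields the same stored value.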
   In order to assess the ability of the RNS to control errors, we introduce the concept of the weight
$w_n(|A|_{P_n})$ of the number $A$, where $P_n = \prod_{i=1}^{n} p_i$.
   The weight $w_n(|A|_{P_n})$ of the number $A$ will be considered equal to the number of its nonzero
residues. This definition of the weight of a number corresponds to the definition of the weight of a
code word in the Hamming metric.
   In the symbol $w_n(|A|_{P_n})$ the subscript $n$ at $w$ shows that the number $A$ is represented by
$n$ residues, and the argument $|A|_{P_n}$ means that $A$ is taken modulo $P_n$, i.e.
$0 \le |A|_{P_n} < P_n$.
   With this definition, the weight $w_n(|A|_{P_n})$ can be calculated both directly from $A$ and from
the set of residues $a_1, a_2, \dots, a_n$.
   The concept of the weight of a number [28, 29] represented by residues can be used to define the
distance between two points in space, which correspond to the numbers $A$ and $B$.
   The distance $d(A, B)$ between the points $A$ and $B$ is defined as the weight of the difference
between $A$ and $B$:
                $d(A, B) = w_n(|A - B|_{P_n})$
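   To determine the correcting capabilities of the code, we calculate the average weight
$\bar{w}_n(|A|_{P_n})$ of nonzero complexes. If the number $A$ varies over the range $0 \le A < P_n$,
where $P_n = \prod_{i=1}^{n} p_i$, then zeros in the base $p_i$ recur with period $p_i$, so the number
of zeros in the base $p_i$ is
                $P_n / p_i, \quad i = 1, \dots, n$
   Then, respectively, the number of nonzero elements in the base $p_i$ is equal to
                $P_n - P_n / p_i, \quad i = 1, \dots, n$
   The sum of the weights of the numbers $A = (a_1, a_2, \dots, a_n)$, when $A$ runs from zero to
$P_n - 1$, is determined by the following expression:
                $\sum_{A=0}^{P_n - 1} w_n(|A|_{P_n}) = P_n \left( n - \sum_{i=1}^{n} \frac{1}{p_i} \right)$
   The average weight $\bar{w}_n(|A|_{P_n})$ can be defined as
                $\bar{w}_n(|A|_{P_n}) = \frac{P_n}{P_n - 1} \left( n - \sum_{i=1}^{n} \frac{1}{p_i} \right)$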
    The better the code is adapted to error correction, the more the numbers in the code
representation differ from each other, i.e. the greater the code distance. Moreover, the distance is
different between different pairs of numbers [30-33].
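
    The weight, the distance and the sum of weights discussed in this section can be checked by brute
force on a small example. The moduli below are hypothetical, the closed form is the sum over the full
range of all bases, and averaging over the nonzero numbers is an assumption about how the average is
normalized.

```python
from fractions import Fraction
from math import prod

MODULI = (3, 5, 7, 11)       # hypothetical pairwise coprime bases
FULL_RANGE = prod(MODULI)    # product of all bases

def residues(value):
    return tuple(value % p for p in MODULI)

def weight(value):
    """Number of nonzero residues of the value (Hamming weight of the code word)."""
    return sum(1 for r in residues(value) if r != 0)

def distance(a, b):
    """Weight of the difference of two numbers taken modulo the full range."""
    return weight((a - b) % FULL_RANGE)

# Brute-force sum of weights over the whole range versus the closed form.
total_weight = sum(weight(v) for v in range(FULL_RANGE))
closed_form = FULL_RANGE * (len(MODULI) - sum(Fraction(1, p) for p in MODULI))

# Average over the nonzero numbers (zero itself has weight zero).
average_nonzero_weight = Fraction(total_weight, FULL_RANGE - 1)

print(total_weight, closed_form, float(average_nonzero_weight))
```

    The distance computed this way equals the number of positions in which the residue vectors of
the two numbers differ, which is why it behaves like a Hamming distance.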
    The average weight of the numbers forming the zero space gives an upper bound that the minimum
code distance never exceeds. The minimum weight of the elements of the zero-code space is
                $d_{\min} = \min \{ w_n(|A|_{P_n}) \}$
    With $d_{\min} = 2$ we are guaranteed to detect any single error. At the same time, the use of the
RNS allows detecting more errors. For example, an RNS with one redundant base, the value of which is
greater than any of the working ones, allows detecting 100% of single errors and 95% of double errors.
    It is also possible to detect all errors of multiplicity $t$ if $d_{\min} \ge t + 1$. This means
that the RNS allows detecting errors of a given multiplicity, and the multiplicity of detected errors
is determined by the minimum code distance $d_{\min}$.


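   Let the number $A = (a_1, a_2, \dots, a_n)$ be distorted, and instead of the number $A$ we have
$\tilde{A} = (\tilde{a}_1, \tilde{a}_2, \dots, \tilde{a}_n)$, such that $\tilde{A} \ne A$. Let us
calculate the difference between $\tilde{A}$ and $A$, which we define as the magnitude of the error:
                $E = |\tilde{A} - A|_{P_n}$
   Since errors in different digits are independent of each other, $E$ can be represented
similarly to the number $A$, i.e. by the residues $E = (e_1, e_2, \dots, e_n)$, which are
determined as $e_i = |\tilde{a}_i - a_i|_{p_i}$.
   Using the received residues $\tilde{a}_1, \tilde{a}_2, \dots, \tilde{a}_k$ we calculate the number
$\tilde{A}$ by solving the system of congruences:
                $\tilde{A} \equiv \tilde{a}_i \pmod{p_i}, \quad i = 1, 2, \dots, k$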
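   To solve this system of congruences, we use the method of orthonormal vectors of the form [25]
                $B_i = (0, \dots, 0, 1, 0, \dots, 0)$
where $B_i = m_i P_k / p_i$, $P_k = \prod_{i=1}^{k} p_i$, and the integer $m_i$ is such that
$m_i P_k / p_i \equiv 1 \pmod{p_i}$.
   With the help of orthonormal vectors, the number $\tilde{A}$ can be represented as
                $\tilde{A} = \sum_{i=1}^{k} \tilde{a}_i B_i - r(|\tilde{A}|_{P_k}) P_k$
where $r(|\tilde{A}|_{P_k})$ is the rank function of the number, which takes the value for which
$\tilde{A} = \left| \sum_{i=1}^{k} \tilde{a}_i B_i \right|_{P_k}$.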
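
   The reconstruction with orthonormal vectors and the rank can be sketched as follows. The working
bases are hypothetical and the function names are illustrative assumptions; the rank is obtained
here simply as the quotient of the weighted sum by the range.

```python
from math import prod

WORKING_MODULI = (3, 5, 7)   # hypothetical working bases

def orthogonal_bases(moduli):
    """Return B_i = m_i * P / p_i with B_i congruent to 1 modulo p_i."""
    total = prod(moduli)
    bases = []
    for modulus in moduli:
        partial = total // modulus
        multiplier = pow(partial, -1, modulus)  # modular inverse, Python 3.8+
        bases.append(multiplier * partial)
    return bases

def reconstruct(residues, moduli=WORKING_MODULI):
    """Return (value, rank) such that sum(a_i * B_i) = value + rank * P."""
    weighted_sum = sum(a * b for a, b in zip(residues, orthogonal_bases(moduli)))
    rank, value = divmod(weighted_sum, prod(moduli))
    return value, rank

print(orthogonal_bases(WORKING_MODULI), reconstruct((1, 2, 3)))
```

   Each base equals one modulo its own modulus and zero modulo the others, which is what the
vector form $(0, \dots, 0, 1, 0, \dots, 0)$ expresses.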
   Having calculated the number $\tilde{A}$, we find the values of the residues
$\tilde{a}_{k+1}, \tilde{a}_{k+2}, \dots, \tilde{a}_n$ of this number on the bases
$p_{k+1}, p_{k+2}, \dots, p_n$.
   Syndromic components are defined as
                $\delta_j = |a_j - \tilde{a}_j|_{p_j}, \quad j = k+1, \dots, n$
   Let us show that the fact of an error can be determined by the syndromic components.
   We represent the number $\tilde{A}$ as
                $\tilde{A} = |A + E|_{P_k}$
   Suppose that only the residues on the bases $p_1, p_2, \dots, p_k$, which are considered
informational, are affected by errors. Taking this into account, we have

                $\delta_j = \left| \left| \sum_{i \in I} e_i B_i \right|_{P_k} \right|_{p_j}$

where $e_i$ is the error value in the $i$-th digit and $I$ is the set of indices of the information
bases whose residues are distorted.
   Since the quantity $\left| \sum_{i \in I} e_i B_i \right|_{P_k}$ is not zero and the redundant bases
are pairwise coprime, a nonzero value of at least one syndromic component indicates the presence
of errors.
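
   Syndrome-based detection can be sketched as below: the number is rebuilt from the information
residues, projected onto the redundant bases and compared with the stored redundant residues. The
bases and all names are hypothetical, and the sketch assumes, as in the text, that only information
residues are distorted.

```python
from math import prod

WORKING = (3, 5, 7)      # hypothetical information bases
REDUNDANT = (11, 13)     # hypothetical redundant bases, larger than the working ones

def crt(residues, moduli):
    """Recover a number from its residues with the Chinese remainder theorem."""
    total = prod(moduli)
    value = 0
    for residue, modulus in zip(residues, moduli):
        partial = total // modulus
        value += residue * partial * pow(partial, -1, modulus)
    return value % total

def encode(value):
    return tuple(value % p for p in WORKING + REDUNDANT)

def syndrome(received):
    """Differences between stored and recomputed residues on the redundant bases."""
    k = len(WORKING)
    recomputed = crt(received[:k], WORKING)
    return tuple((stored - recomputed) % p
                 for stored, p in zip(received[k:], REDUNDANT))

clean = encode(52)
corrupted = list(clean)
corrupted[1] = (corrupted[1] + 2) % WORKING[1]   # distort one information residue

print(syndrome(clean), syndrome(corrupted))
```

   A syndrome of all zeros means that the redundant residues agree with the information residues;
any nonzero component signals an error.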
   Let us give an example of error detection. Let the number $A$ be represented in the RNS by the
residues $a_1, a_2, \dots, a_n$. When the number is converted to the positional number system (during
data recovery), errors may occur. Therefore, after the data recovery
$\tilde{A} = (\tilde{a}_1, \tilde{a}_2, \dots, \tilde{a}_n)$ is performed, we compare the number
$\tilde{A}$ with the number $P_k$, and if $\tilde{A} \ge P_k$, then we conclude that an error occurred
during the recovery.
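
   This range check can be sketched as follows, assuming one hypothetical redundant base that is
larger than every working base. The number is rebuilt from all residues and compared with the
working range; the names are illustrative.

```python
from math import prod

MODULI = (3, 5, 7, 11)             # hypothetical bases, the last one is redundant
K = 3                              # hypothetical number of working bases
WORKING_RANGE = prod(MODULI[:K])   # legitimate values lie in [0, WORKING_RANGE)

def crt(residues, moduli):
    """Recover a number from its residues with the Chinese remainder theorem."""
    total = prod(moduli)
    value = 0
    for residue, modulus in zip(residues, moduli):
        partial = total // modulus
        value += residue * partial * pow(partial, -1, modulus)
    return value % total

def error_detected(received):
    """An error is reported when the recovered number leaves the working range."""
    return crt(received, MODULI) >= WORKING_RANGE

clean = [52 % p for p in MODULI]
damaged = list(clean)
damaged[0] = (damaged[0] + 1) % MODULI[0]   # single distorted residue

print(error_detected(clean), error_detected(damaged))
```

   With this choice of bases every single distorted residue pushes the recovered number out of
the working range, which agrees with the statement that one sufficiently large redundant base
detects all single errors.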
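   Let us give another example. In the generalized positional number system (GPNS), the number
$A = |A|_{P_n}$ can be represented in the form

                $A = b_1 + b_2 p_1 + b_3 p_1 p_2 + \dots + b_n \prod_{i=1}^{n-1} p_i$          (1)

   We will sequentially find the digit values $b_i$ $(i = 1, 2, \dots, n)$, solving congruences of the
form $A \equiv a_i \pmod{p_i}$ starting with $i = 1$. Since all terms of (1) except for $b_1$ are
identically zero modulo $p_1$, we have $b_1 = a_1$.
   Solving $A \equiv a_2 \pmod{p_2}$ gives $b_1 + b_2 p_1 \equiv a_2 \pmod{p_2}$, therefore
                $b_2 \equiv \frac{a_2 - b_1}{p_1} \pmod{p_2}$ or else $b_2 = \left| \frac{a_2 - b_1}{p_1} \right|_{p_2}$
where the division by $p_1$ is understood modulo $p_2$.
   Continuing the process of calculating the digits of the GPNS, at the $i$-th step we get
                $b_i = \left| \frac{a_i - \sum_{j=1}^{i-1} b_j \prod_{m=1}^{j-1} p_m}{\prod_{j=1}^{i-1} p_j} \right|_{p_i}, \quad i = 2, 3, \dots, n$
   As a result of converting the number $A = (a_1, a_2, \dots, a_n)$ into the GPNS, we have the number
$A = (b_1, b_2, \dots, b_n)$.
   If $A = |A|_{P_k}$, i.e. $A < P_k$, then $b_{k+1}, b_{k+2}, \dots, b_n$ will be equal to zero. This
property can be used to receive data without error. The advantage of this algorithm is that if the
task is only to establish the fact that errors have occurred, the conversion process stops as soon as
any of the digits $b_{k+1}, \dots, b_n$ is found to be nonzero. This reduces the amount of computation
required to detect errors.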
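
   The digit-by-digit conversion and the early stop can be sketched with a generator that yields
one GPNS digit at a time. The bases, the number of working bases `K` and the function names are
hypothetical; division is carried out as multiplication by a modular inverse.

```python
MODULI = (3, 5, 7, 11, 13)   # hypothetical bases in ascending order
K = 3                        # hypothetical number of working bases

def iter_gpns_digits(residues, moduli=MODULI):
    """Yield the GPNS (mixed-radix) digits of a number given by its residues."""
    digits = []
    for residue, modulus in zip(residues, moduli):
        partial, radix = 0, 1
        # Value of the digits found so far and the product of the bases they use.
        for digit, base in zip(digits, moduli):
            partial += digit * radix
            radix *= base
        digit = (residue - partial) * pow(radix, -1, modulus) % modulus
        digits.append(digit)
        yield digit

def has_error(residues, moduli=MODULI, k=K):
    """Stop at the first nonzero digit above the working bases."""
    for index, digit in enumerate(iter_gpns_digits(residues, moduli)):
        if index >= k and digit != 0:
            return True   # remaining digits are never computed
    return False

clean = tuple(52 % p for p in MODULI)
damaged = (clean[0], (clean[1] + 2) % MODULI[1]) + clean[2:]

print(list(iter_gpns_digits(clean)), has_error(clean), has_error(damaged))
```

   Because the generator is consumed lazily, `has_error` returns as soon as a high-order digit is
nonzero, which is the saving in computation described above.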

5. Conclusions
    This article proposes an approach to improving reliability and survivability based on modular
arithmetic. The proposed approach makes it possible to increase the survivability of cloud storage
systems, as well as to increase the reliability and fault tolerance of the data storage. To increase fault
tolerance in the event of a failure, the redistribution of the processed data is applied. The introduction
of low redundancy allows processing or restoring stored data in the event of a management server
failure. This model allows recovering saved data in the event of a failure of one or more cloud servers.
However, further research is needed to assess its efficiency in real systems. This will be the subject of
our future work on a comprehensive experimental study of multipurpose optimization with real cloud
providers.
    Acknowledgements The reported study was funded by RFBR, project number 20-37-70023

6. References
   [1] M. Babenko, N. Kucherov, A. Tchernykh, N. Chervyakov, E. Nepretimova, I. Vashchenko.
       Development of a Control System for Computations in BOINC with Homomorphic Encryption
       in Residue Number System, International Conference BOINC-Based High Performance
       Computing: Fundamental Research and Development, BOINC: FAST 2017: 77-84.
   [2] R.L. Grossman, Y. Gu, M. Sabala, W. Zhang. Compute and storage clouds using wide area
       high performance networks, Future Generation Computer Systems (2009): 179-183.
   [3] M.O. Rabin. Efficient dispersal of information for security, load balancing, and fault tolerance,
       Journal of the ACM (JACM)(1989): 335-348.
   [4] A. Tchernykh, V. Miranda-Lopez, M. Babenko, F. Armenta-Cano, G. Radchenko, A.Y.
       Drozdov, A. Avetisyan. Performance evaluation of secret sharing schemes with data recovery
       in secured and reliable heterogeneous multi-cloud storage, Cluster Computing (2019): 1173-
       1185.
   [5] A. Celesti, M. Fazio, M. Villari, A. Puliafito. Adding long-term availability, obfuscation, and
       encryption to multi-cloud storage systems, Journal of Network and Computer Applications.
       (2016): 208-218.
   [6] A. Tchernykh, U. Schwiegelsohn, E. Talbi, M. Babenko. Towards understanding uncertainty in
       cloud computing with risks of confidentiality, integrity, and availability, Journal of
       Computational Science (2019): 100581.
   [7] H. Abu-Libdeh, L. Princehouse, H. Weatherspoon. RACS: a case for cloud storage diversity,
       Proceedings of the 1st ACM symposium on Cloud computing (2010): 229-240.
   [8] K.D. Bowers, A. Juels, A. Oprea. HAIL: A high-availability and integrity layer for cloud
       storage, Proceedings of the 16th ACM conference on Computer and communications security
       (2009): 187-198.



[9] A.G. Dimakis, K. Ramchandran, Y. Wu, C. Suh. A survey on network codes for distributed
    storage, Proceedings of the IEEE (2011): 476-489.
[10] Z. Erkin, T. Veugen, T. Toft, R.L. Lagendijk. Generating private recommendations efficiently
    using homomorphic encryption and data packing, IEEE transactions on information forensics
    and security (2012): 1053-1066.
[11] M. Babenko, A. Tchernykh, N. Chervyakov, V. Kuchukov, V. Miranda-Lopez, R. Rivera-
    Rodriguez, Z. Du, E.G. Talbi. Positional Characteristics for Efficient Number Comparison over
    the Homomorphic Encryption, Programming and Computer Software (2019): 532-543.
[12] Z. Kong, S.A. Aly, E. Soljanin. Decentralized coding algorithms for distributed storage in
    wireless sensor networks, IEEE Journal on Selected Areas in Communications (2010): 261-
    267.
[13] M. Li, W. Lou, K. Ren. Data security and privacy in wireless body area networks, IEEE
    Wireless communications (2010): 51-58.
[14] H.Y. Lin, W.G. Tzeng. A secure erasure code-based cloud storage system with secure data
    forwarding, IEEE transactions on parallel and distributed systems (2012): 995-1003.
[15] L.J. Pang, Y.M. Wang. A new (t, n) multi-secret sharing scheme based on Shamir’s secret
    sharing, Applied Mathematics and Computation (2005): 840-848.
[16] A. Parakh, S. Kak. Space efficient secret sharing for implicit data security, Information
    Sciences (2011): 335-341.
[17] A. Parakh, S. Kak. Online data storage using implicit security, Information Sciences (2009):
    3323-3331.
[18] S. Ruj, A. Nayak, I. Stojmenovic. DACC: Distributed access control in clouds, 2011
    International Joint Conference of IEEE TrustCom-11/IEEE ICESS-11/FCST-11 (2011): 91-98.
[19] B.K. Samanthula, Y. Elmehdwi, G. Howser, S. Madria. A secure data sharing and query
    processing framework via federation of cloud computing, Information Systems (2015): 196-
    212.
[20] M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A.G. Dimakis, R. Vadali, S. Chen, D.
    Borthakur. Xoring elephants: Novel erasure codes for big data, Proceedings of the VLDB
    Endowment (2013): 325-336.
[21] N.B. Shah, K.V. Rashmi, P.V. Kumar, K. Ramchandran. Interference alignment in
    regenerating codes for distributed storage: Necessity and code constructions, IEEE
    Transactions on Information Theory (2012): 2134-2158.
[22] C. Wang, Q. Wang, K. Ren, N. Cao, W. Lou. Toward secure and dependable storage services
    in cloud computing, IEEE transactions on Services Computing (2012): 220-232.
[23] J.J. Wylie, M.W. Bigrigg, J.D. Strunk, G.R. Ganger, H. Kiliccote, P.K. Khosla. Survivable
    information storage systems, Computer (2000): 61-68.
[24] C.C. Yang, T.Y. Chang, M.S. Hwang. A (t, n) multi-secret sharing scheme, Applied
    Mathematics and Computation (2004): 483-490.
[25] S.J. Lin, W.H. Chung, Y.S. Han. Novel polynomial basis and its application to reed-solomon
    erasure codes, IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS)
    (2014): 316-325.
[26] A. Tchernykh, M. Babenko, N. Chervyakov, V. Miranda-Lopez, V. Kuchukov, J.M. Cortes-
    Mendoza, M. Deryabin, N. Kucherov, G. Radchenko, A. Avetisyan. AC-RRNS: Anti-collusion
    secured data sharing scheme for cloud storage, International Journal of Approximate Reasoning
    (2018): 60-73.
[27] D.T. Liu, M.J. Franklin. GridDB: a data-centric overlay for scientific grids, Proceedings of
    the Thirtieth international conference on Very large data bases – VLDB Endowment (2004):
    600-611.
[28] A. Tchernykh, M. Babenko, N. Chervyakov, J.M. Cortes-Mendoza, N. Kucherov, V.
    Miranda-Lopez, M. Deryabin, I. Dvoryaninova, G. Radchenko. Towards mitigating uncertainty
    of data security breaches and collusion in cloud computing, 28th International Workshop on
    Database and Expert Systems Applications (DEXA) (2017): 137-141.
[29] D. Amrhein, S. Quint. Cloud computing for the enterprise: Part 1: Capturing the cloud,
    DeveloperWorks, IBM (2009): 121-126.


[30] A. Tchernykh, M. Babenko, N. Chervyakov, V. Miranda-Lopez, A. Avetisyan, A.Yu.
   Drozdov, R. Rivera-Rodriguez, G. Radchenko, Z. Du. Scalable Data Storage Design for Non-
   Stationary IoT Environment with Adaptive Security and Reliability, IEEE Internet of Things
   Journal (2020).
[31] N. Chervyakov, M. Babenko, A. Tchernykh, N. Kucherov, V. Miranda-Lopez, J.M. Cortes-
   Mendoza. AR-RRNS: Configurable Reliable Distributed Data Storage Systems for Internet of
   Things to Ensure Security, Future Generation Computer Systems (2019): 1080-1092.
[32] M. Babenko, N. Chervyakov, A. Tchernykh, N. Kucherov, M. Shabalina, I. Vashchenko, G.
   Radchenko, D. Murga. Unfairness correction in P2P grids based on residue number system of a
   special form, 28th International Workshop on Database and Expert Systems Applications
   (DEXA) (2017): 147-151.
[33] I. Foster, C. Kesselman. The Grid 2: Blueprint for a future computing infrastructure, Elsevier,
   Waltham: Morgan Kaufmann Publishers (2004): 737.



