A Defending Technology Against Co-Resident Attack Considering
Early Warning Mechanism and Disguise Component
Li Deng 1, Ailing Deng 1, Yuxi Peng 2, Yanping Xiang 1, Jianwei Xiang 3,*
1 University of Electronic Science and Technology of China, Chengdu, Sichuan, China
2 Southwest Jiaotong University, Chengdu, Sichuan, China
3 Hunan University of Technology, Hunan, China

                Abstract
                With the development of cloud services, cloud servers must provide a safe and reliable cloud
                environment. To defend against co-resident attacks launched by malicious cloud users who
                co-reside with normal users on the same physical server, we propose a probabilistic model,
                based on first-past-the-post (FFP) voting, for evaluating the failure probability of an
                N-version programming (NVP) service program with disguise components and early warning
                agents. Under defense resource constraints, the predicted failure probability of the NVP
                service program is used to select the optimal NVP deployment strategy, namely the one that
                minimizes the failure probability.

                Keywords
                early warning mechanism; disguise component; cloud environment; N-version programming

1. Introduction

    In complex systems, especially safety-critical systems, any software failure may have
catastrophic consequences. Intensive and thorough software testing is expensive and cannot eliminate
all software failures. Therefore, a more economical method is needed to improve the quality of complex
systems. Software fault tolerance is such a method. The concept of NVP (N-version programming) was
originally proposed by Elmendorf [1]. Specifically, different versions of a service program are developed
to respond to the same request simultaneously, and the final result is determined by applying a specific
voting method to all the outputs.
    As a mature IT paradigm, cloud computing is widely used because of its flexibility, scalability
and high cost-performance ratio. To meet the high-reliability requirements of critical service requests,
cloud service providers use redundant resources in the cloud environment to implement a variety of
fault-tolerant techniques, such as the NVP mentioned above. Recent works have shown that the cloud
platform can help to realize NVP [2-6].
    The foundation of cloud computing is achieving high resource utilization through sharing: providers
co-host multiple VMs on a single hardware platform. However, because virtual resources are mapped to
shared physical resources, co-hosted virtual machines may interfere with one another. Services are
therefore vulnerable: an attacker can co-locate its VMs with a target VM on the same server and carry
out a side-channel attack to steal or destroy the user's sensitive information.
    In this paper, based on first-past-the-post (FFP) voting, we propose a probabilistic model for
evaluating the failure probability of an NVP service with disguise components (DCs) and early warning
agents (EWAs). Furthermore, under defense resource constraints, the predicted failure probability of the
program is used to select the optimal NVP deployment strategy, namely the one that maximizes the
success probability of the NVP service. This is the first work to consider adding early warning
components and disguise components to an NVP system simultaneously. This strategy can effectively
help NVP service components resist co-resident attacks in the cloud environment.


ICCEIC2022@3rd International Conference on Computer Engineering and Intelligent Control
EMAIL: 202021081206@std.uestc.edu.cn (Li Deng), 534615669@qq.com (Ailing Deng), 413820155@qq.com (Yuxi Peng),
yanping_xiang@163.com (Yanping Xiang), *xjwlfylfylpp@126.com (Jianwei Xiang)
             © 2022 Copyright for this paper by its authors.
             Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
             CEUR Workshop Proceedings (CEUR-WS.org)


    The rest of the paper is organized as follows. Section 2 presents related work. Section 3
presents the probabilistic model for assessing the failure probability of the service with DCs and EWAs.
In Section 4, we solve the optimization problem of finding the optimal numbers of service components
(SCs), DCs and EWAs that minimize the failure probability. Section 5 concludes the paper and discusses
future work.

2. Related work

    To protect the cloud environment from co-resident attacks, numerous methods have been proposed
in the literature. One approach is to restrict or eliminate side-channel structures. Most side channels
exploit the last-level cache (LLC) shared by different virtual machines, so a simple way to eliminate
LLC side channels is to prevent this sharing, for example by modifying the hardware to divide the LLC
into separate regions for different VMs [7]. Such methods require extensive hardware or software
modifications, which makes them difficult for cloud service providers to adopt.
    Several researchers have also studied secure virtual machine allocation policies. Reducing
co-residence between users when cloud providers allocate VMs can also mitigate co-resident attacks.
Han et al. [8] proposed PSSF to mitigate co-resident attacks. Under PSSF, a tenant's VMs are placed
preferentially on physical machines that the tenant has used previously, which minimizes the
co-residence probability. This method also considers load balancing and power consumption, and adopts
the technique of restricting users to a limited set of physical machines in fixed groups. Azar et al. [9]
proposed a co-location-resistant algorithm that limits the spread of VMs by maintaining a fixed number
of physical machines. Similarly, Qiu et al. [10] proposed a "diffusion before concentration" virtual
machine distribution strategy to resist co-resident attacks.
    Data partitioning can effectively reduce the risk of unauthorized access: if the information is
useful only when it is complete, it can be stolen only when the attacker gains access to the data blocks
of all partitions [11].
    Replication and cancellation techniques were studied in [12] and [13]. They create multiple task
replicas to shorten the expected completion time of the task and to increase data reliability. An early
warning mechanism comprises the decisions, associated policies, and procedures designed to predict and
mitigate network attacks based on specific network threat intelligence. Several researchers have studied
the detection of co-resident attacks, for example by classifying users through semi-supervised learning
to identify possibly malicious users and to provide other users with early warnings.

3. Failure probability evaluation
3.1. Introduction of DC and EWA

   Reference [14] shows that distributed disguise components (DCs) can reduce the probability of
essential components being attacked and thereby improve the reliability of multi-component systems. To
reduce the probability that the service provided by the redundant NVP components is corrupted, this
paper uses honeypot-like defense processes as disguise components, which distract attackers from the
more valuable service components (SCs) on the network. The attacker cannot distinguish a DC from an
SC and attempts to establish a side channel to any co-resident DC. If the attacker's virtual machine
(AVM) co-resides with both an SC and a DC, the AVM attacks the DC with probability 0.5, which reduces
the probability that the SC is attacked and that the voting result is affected.
   To protect the service from being corrupted, the cloud provider also deploys e EWAs, comprising
attack monitoring systems and detection software, on servers that host SCs. AVMs cannot distinguish
EWAs from SCs and will try to establish side channels to them. A co-resident EWA can detect such an
attempt by monitoring side-channel or cache usage, and it obtains details of the attack with
probability z. When an attack is successfully detected, the EWA contacts the other servers and shares
the attack information, which prevents all AVMs from carrying out their attacks. Consequently, as long
as one attack from an AVM is detected, no SC is damaged by co-resident attacks.


3.2. First-past-the-post voting mechanism

   Critical services often rely on redundancy to achieve high reliability. In particular, in a cloud
system running NVP services, multiple service components (SCs) are placed on different physical servers
and perform the same task simultaneously. The final output is provided to service users by voting on
the outputs of these SCs. There are numerous voting rules, including threshold voting, which selects the
output that receives more votes than a preset threshold. Majority voting is the special case in which
the threshold is 50%.
   The first-past-the-post voting mechanism selects the output that receives the most votes, regardless
of whether its vote count exceeds the 50% threshold. When the number of damaged SCs exceeds the
number of undamaged SCs that provide the correct output, the attacker wins, which means that the
entire NVP service is corrupted.
   Because the outputs compete in time, this voting mechanism is better suited to real-time operation
than majority voting. It also emphasizes the competition between the SCs' task execution process and
the attacker's virtual machines.
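
   To make the difference between the two rules concrete, the following minimal Python sketch contrasts
first-past-the-post voting with majority voting. The function names and the sample outputs are
illustrative assumptions, not part of the original model, and ties are broken arbitrarily here (by
first occurrence).

```python
from collections import Counter


def first_past_the_post(outputs):
    """Return the output with the most votes; no 50% threshold is required."""
    winner, _ = Counter(outputs).most_common(1)[0]
    return winner


def majority_vote(outputs):
    """Return the output backed by more than half of the votes, or None."""
    winner, votes = Counter(outputs).most_common(1)[0]
    return winner if votes > len(outputs) / 2 else None


# Five SCs: two agree on the correct output "ok", while the other three
# each return a different wrong value (hypothetical sample data).
outputs = ["ok", "ok", "err1", "err2", "err3"]
fptp_result = first_past_the_post(outputs)   # plurality winner
majority_result = majority_vote(outputs)     # no output exceeds 50%
```

   In this example the plurality rule still returns an output, whereas the majority rule returns none
because no output is supported by more than half of the SCs.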

3.3. Evaluation of damage probability
3.3.1. Failure probability

    The cloud service provider places n (1 ≤ n ≤ o) service components on n different servers, where o
is the total number of servers. The SCs thus occupy a fixed subset of n of the o servers.
    According to the formulas proposed in [15], there are o^b ways to distribute b AVMs over the o
servers. By the inclusion-exclusion principle, the number of allocations in which the AVMs co-reside
with SCs on exactly a fixed subset of t servers is

$$\sum_{i=0}^{t} (-1)^{t-i} \binom{t}{i} (o-n+i)^{b}$$

Since the subset of t servers can be chosen in $\binom{n}{t}$ ways, the probability that AVMs co-reside
with SCs on exactly t servers is

$$g(o,n,b,t) = \binom{n}{t} \frac{1}{o^{b}} \sum_{i=0}^{t} (-1)^{t-i} \binom{t}{i} (o-n+i)^{b}$$

    Let k denote the probability that an AVM successfully damages a co-resident SC. The total
probability that v SCs are damaged by the attacker is:

$$p(o,n,b,v) = \sum_{t=v}^{\min(n,b)} g(o,n,b,t) \binom{t}{v} k^{v} (1-k)^{t-v} \tag{1}$$

   Let c denote the probability that each undamaged SC provides the correct output. Under the voting
rule of Section 3.2, the service fails when fewer than v of the n − v undamaged SCs provide the correct
output. Given that v SCs are damaged, the conditional failure probability of the whole service is:

$$w(n,v) = \begin{cases} \sum_{j=0}^{v-1} \binom{n-v}{j} c^{j} (1-c)^{n-v-j}, & \text{if } v \le n/2 \\ 1, & \text{if } v > n/2 \end{cases} \tag{2}$$

   Therefore, the failure probability of an NVP service is:

$$f(o,n,b) = \sum_{v=0}^{\min(n,b)} p(o,n,b,v)\, w(n,v) \tag{3}$$
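
   As a hedged illustration, Eqs. (1)-(3) can be evaluated numerically as sketched below. The sketch
assumes the following reading of this section: g(o, n, b, t) is the probability that the b AVMs
co-reside with SCs on exactly t servers (the inclusion-exclusion count for a fixed subset, times the
number of subsets, divided by o^b), and the service fails when fewer than v of the n − v undamaged SCs
return the correct output. The parameters k and c are passed explicitly, and the function signatures
are illustrative rather than taken from [15].

```python
from math import comb


def g(o, n, b, t):
    """P(the b AVMs co-reside with SCs on exactly t of the n SC servers)."""
    # Inclusion-exclusion count for one fixed subset of t SC servers,
    # multiplied by the number of such subsets, over all o**b placements.
    ways = sum((-1) ** (t - i) * comb(t, i) * (o - n + i) ** b
               for i in range(t + 1))
    return comb(n, t) * ways / o ** b


def p(o, n, b, v, k):
    """Eq. (1): P(exactly v SCs are damaged), with damage probability k."""
    return sum(g(o, n, b, t) * comb(t, v) * k ** v * (1 - k) ** (t - v)
               for t in range(v, min(n, b) + 1))


def w(n, v, c):
    """Eq. (2): P(service fails | v damaged SCs) under the assumed voting rule."""
    if 2 * v > n:
        return 1.0
    m = n - v  # number of undamaged SCs
    # Failure: fewer than v undamaged SCs give the correct output.
    return sum(comb(m, j) * c ** j * (1 - c) ** (m - j) for j in range(v))


def f(o, n, b, k, c):
    """Eq. (3): failure probability of the NVP service."""
    return sum(p(o, n, b, v, k) * w(n, v, c) for v in range(min(n, b) + 1))


# Hypothetical parameter values, chosen only for illustration.
example = f(o=20, n=5, b=6, k=0.6, c=0.95)
```

   A quick sanity check on such an implementation is that g sums to one over t = 0, ..., min(n, b)
and that p sums to one over v.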

3.3.2. Probability of NVP service failure

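  According to the formulas proposed in [15], when EWAs are deployed on e of the n servers that host
SCs, the probability that exactly θ of the t servers on which AVMs co-reside with SCs also host an
EWA is

$$\frac{\binom{e}{\theta}\binom{n-e}{t-\theta}}{\binom{n}{t}}$$

   Suppose an EWA detects a co-resident AVM with probability z. The probability that at least one AVM
is detected when SCs co-reside with AVMs on t servers is:

$$h(n,e,t) = \sum_{\theta=0}^{\min(e,t)} \frac{\binom{e}{\theta}\binom{n-e}{t-\theta}}{\binom{n}{t}} \left(1-(1-z)^{\theta}\right) \tag{4}$$

where a binomial coefficient is zero whenever its lower index exceeds its upper index.
    The conditional probability that the AVMs successfully damage v SCs when SCs co-reside with AVMs
on t servers is

$$\bigl(1-h(n,e,t)\bigr) \binom{t}{v} k^{v} (1-k)^{t-v}$$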
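
    The detection probability of Eq. (4) and the conditional damage probability that follows it can be
sketched in Python as below. This is a minimal sketch assuming that the co-residence of EWAs with
attacked servers is hypergeometric, as the formula from [15] suggests, and that e ≤ n and t ≤ n. The
detection probability z and the damage probability k are passed explicitly, and the function names
are illustrative.

```python
from math import comb


def h(n, e, t, z):
    """Eq. (4): P(at least one co-resident EWA detects the attack)."""
    total = 0.0
    for theta in range(min(e, t) + 1):
        # Hypergeometric weight: theta of the t attacked SC servers host an EWA.
        # math.comb returns 0 when the lower index exceeds the upper one.
        weight = comb(e, theta) * comb(n - e, t - theta) / comb(n, t)
        total += weight * (1 - (1 - z) ** theta)
    return total


def damage_given_t(n, e, t, v, k, z):
    """P(v SCs are damaged | co-residence on t servers) when EWAs are deployed.

    Damage is possible only if no EWA detects the attack (factor 1 - h).
    """
    return (1 - h(n, e, t, z)) * comb(t, v) * k ** v * (1 - k) ** (t - v)


# Hypothetical values: 2 SC servers, 1 EWA, 1 attacked server, z = 0.5.
detect = h(n=2, e=1, t=1, z=0.5)
```

    With one EWA on two SC servers and a single attacked server, the attacked server hosts the EWA
half of the time, so the detection probability in this example is half of z.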


Figure 1 Co-residence of EWAs, DCs and AVMs

    As shown in Figure 1, among the b AVMs co-residing with SCs, one portion also co-resides with
EWAs, another with DCs, some with both DCs and EWAs, and the remainder with neither DCs nor EWAs. When
DCs are deployed on d of the n servers that host SCs, the probability that exactly a of the t
co-resident servers also host a DC is:

$$\frac{\binom{d}{a}\binom{n-d}{t-a}}{\binom{n}{t}}$$

    The attacker cannot distinguish between a DC and an SC and hence will attempt to create side
channels to any co-resident DC. If the attacker's virtual machine co-resides with both an SC and a DC,
the AVM attacks the disguise component with probability 0.5. At the same time, an EWA can still detect
the attack.
    Consequently, the conditional probability that the AVMs successfully damage v SCs when SCs
co-reside with AVMs on t servers is:

                      1 − h(n, e, t) ∑                         1−                 k    (1 − k)              (5)

   Furthermore, the probability that v SCs are damaged is:

                             ( , )
   p(o, n, e, b, v, d) = ∑           ∑      1 − h(n, e, t)                   (1 − )           k   (1 − u)   (6)

   Hence, the total failure probability is:

$$f(o,n,e,b,d) = \sum_{v=0}^{\min(n,b)} p(o,n,e,b,v,d)\, w(n,v) \tag{7}$$
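
   The sketch below is one possible numerical reading of the model described in this section, not a
transcription of Eqs. (5)-(7). It rests on several assumptions that should be checked against the
original derivation: EWAs and DCs are placed independently on e and d of the n SC servers; once any EWA
detects an attack, no SC is damaged; an SC on an attacked server that also hosts a DC is damaged with
probability k/2 (the AVM picks the DC with probability 0.5), while an SC on an attacked server without
a DC is damaged with probability k; and the service fails when fewer than v undamaged SCs answer
correctly. All function names are hypothetical.

```python
from math import comb


def hyper(pop, marked, draws, hits):
    """Hypergeometric pmf: `hits` marked servers among `draws` taken from `pop`."""
    return comb(marked, hits) * comb(pop - marked, draws - hits) / comb(pop, draws)


def binom(m, j, q):
    """Binomial pmf: j successes in m trials with success probability q."""
    return comb(m, j) * q ** j * (1 - q) ** (m - j)


def g(o, n, b, t):
    """P(the b AVMs co-reside with SCs on exactly t of the n SC servers)."""
    ways = sum((-1) ** (t - i) * comb(t, i) * (o - n + i) ** b for i in range(t + 1))
    return comb(n, t) * ways / o ** b


def h(n, e, t, z):
    """P(at least one co-resident EWA detects the attack)."""
    return sum(hyper(n, e, t, th) * (1 - (1 - z) ** th) for th in range(min(e, t) + 1))


def w(n, v, c):
    """P(service fails | v damaged SCs): fewer than v correct undamaged outputs."""
    if 2 * v > n:
        return 1.0
    return sum(binom(n - v, j, c) for j in range(v))


def failure_probability(o, n, e, b, d, k, c, z):
    """Failure probability with e EWAs and d DCs, under the stated assumptions."""
    total = 0.0
    for t in range(1, min(n, b) + 1):            # attacked SC servers
        undetected = g(o, n, b, t) * (1 - h(n, e, t, z))
        for a in range(min(d, t) + 1):           # attacked servers that host a DC
            p_a = hyper(n, d, t, a)
            for v in range(1, t + 1):            # damaged SCs
                lo, hi = max(0, v - (t - a)), min(a, v)
                # v1 SCs damaged on DC servers (prob k/2 each),
                # v - v1 damaged on the remaining t - a servers (prob k each).
                p_v = sum(binom(a, v1, k / 2) * binom(t - a, v - v1, k)
                          for v1 in range(lo, hi + 1))
                total += undetected * p_a * p_v * w(n, v, c)
    return total
```

   With e = 0 and d = 0 this sketch reduces to the model without EWAs and DCs, which gives a simple
consistency check.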

3.3.3. Effect of model parameters on PSC

   With the other parameters unchanged, the probability of co-residence rises as the number of AVMs
b rises, which increases the failure probability f. Similarly, as the number of early warning agents e
increases, so does the probability of detecting co-resident attacks, which decreases the failure
probability f. Additionally, the probability that an attacker corrupts a service component falls as the
number of disguise components d increases, which also decreases the failure probability f.


4. Formulation of optimal strategy
4.1. Optimization of the cloud service

   This section determines the optimal numbers of service components, early warning agents and disguise
components under resource constraints so as to minimize the failure probability of NVP services,
following the method in [15]. Let $R_D$ denote the limited defense resource.

4.1.1. The number of AVMs is certain

    $C_D(n, e, d)$ represents the overhead of creating the virtual machines that run the service
components, EWAs and DCs, which are assigned to different users. $C_D(n, e, d)$ is clearly an increasing
function of n, e and d. We can assume that $C_D(n, e, d) = c_{SC}\, n + c_{DC}\, d + c_{EWA}\, e$, where
$c_{SC}$, $c_{DC}$ and $c_{EWA}$ respectively represent the overhead of a single SC, DC and EWA. After
normalization by $c_{SC}$, we can express it as:

$$\frac{C_D(n,e,d)}{c_{SC}} = n + \frac{c_{DC}}{c_{SC}}\, d + \frac{c_{EWA}}{c_{SC}}\, e \tag{8}$$

   To minimize f(o, n, e, b, d), the optimization problem can be formulated as:

$$(n^*, e^*, d^*) = \arg\min_{n,e,d} f(o,n,e,b,d), \quad \text{s.t. } n + \frac{c_{DC}}{c_{SC}}\, d + \frac{c_{EWA}}{c_{SC}}\, e \le \frac{R_D}{c_{SC}} \tag{9}$$

   Since n, e and d are integers, this optimization problem can be solved by brute-force enumeration.
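
   The brute-force enumeration can be sketched as follows. This is a minimal illustration under
stated assumptions: the objective is supplied as a callable f(n, e, d) with the other parameters fixed,
the budget and cost ratios are already normalized by the unit cost of an SC as in Eq. (8), and at most
one EWA and one DC are placed per SC server (so e ≤ n and d ≤ n). The objective `toy_f` is a purely
hypothetical placeholder used only to exercise the search, not a model of the failure probability.

```python
from itertools import product


def best_deployment(f, budget, ratio_dc, ratio_ewa):
    """Enumerate integer (n, e, d) with n + ratio_dc*d + ratio_ewa*e <= budget.

    Returns the feasible triple with the smallest f(n, e, d) and that value.
    """
    best, best_val = None, float("inf")
    for n in range(1, int(budget) + 1):
        # Assumption: EWAs and DCs sit on SC servers, so e <= n and d <= n.
        for e, d in product(range(n + 1), repeat=2):
            if n + ratio_dc * d + ratio_ewa * e > budget:
                continue  # violates the resource constraint
            val = f(n, e, d)
            if val < best_val:
                best, best_val = (n, e, d), val
    return best, best_val


def toy_f(n, e, d):
    """Hypothetical placeholder objective that decreases in n, e and d."""
    return 1.0 / (1 + n + 3 * e + d)


best, best_val = best_deployment(toy_f, budget=4, ratio_dc=0.5, ratio_ewa=1.0)
```

   Because the feasible set is bounded by the budget, the number of candidates grows at most cubically
with the normalized budget, which keeps exhaustive search practical for small deployments.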

4.1.2. The number of AVMs is uncertain

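   However, in most cases, the number of AVMs b is uncertain. Assume that the distribution of b can
be obtained from historical data and expert knowledge: μ(t) = Pr(b = t), where $b_{\min} \le t \le b_{\max}$.
The optimization problem then minimizes the expected failure probability, with the unit overheads and
the defense resource as in Eq. (9):

$$(n^*, e^*, d^*) = \arg\min_{n,e,d} \sum_{t=b_{\min}}^{b_{\max}} \mu(t)\, f(o,n,e,t,d), \quad \text{s.t. } n + \frac{c_{DC}}{c_{SC}}\, d + \frac{c_{EWA}}{c_{SC}}\, e \le \frac{R_D}{c_{SC}} \tag{10}$$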
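
   The objective of Eq. (10) is an expectation over the number of AVMs, which can be sketched as
below. The sketch assumes that the distribution μ is given as a dictionary mapping each possible
number of AVMs t to its probability, and that the failure probability is supplied as a callable
f(n, e, t, d); both the data layout and the sample values are hypothetical.

```python
def expected_failure(f, mu, n, e, d):
    """Objective of Eq. (10): sum over t of mu[t] * f(n, e, t, d)."""
    return sum(prob * f(n, e, t, d) for t, prob in mu.items())


# Hypothetical distribution of the number of AVMs (probabilities sum to 1).
mu = {1: 0.2, 2: 0.5, 3: 0.3}


def toy_f(n, e, t, d):
    """Hypothetical placeholder: failure probability grows with the number of AVMs t."""
    return t / 10


value = expected_failure(toy_f, mu, n=3, e=1, d=1)
```

   The resulting function of (n, e, d) can then be minimized by the same enumeration used for the
case in which the number of AVMs is known.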

4.2. Optimization considering the attacker's behavior

   Let $C_A(b)$ represent the attacker's overhead of creating b virtual machines and launching attacks,
$C_A(b) = b \cdot c_A$, where $c_A$ is the overhead of creating one virtual machine and launching an
attack, and let $R_A$ denote the limited attack resource.
   To maximize the probability of a successful attack, the attacker chooses:

$$b^* = \arg\max_{b} f(o,n,e,b,d), \quad \text{s.t. } b \le \frac{R_A}{c_A} \tag{11}$$

   Taking the attacker's behavior into consideration, the optimization problem can be rephrased as:

$$(n^*, e^*, d^*) = \arg\min_{n,e,d} f(o,n,e,b^*,d), \quad \text{s.t. } n + \frac{c_{DC}}{c_{SC}}\, d + \frac{c_{EWA}}{c_{SC}}\, e \le \frac{R_D}{c_{SC}} \tag{12}$$
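
   The two-level problem of Eqs. (11) and (12) can be sketched as a nested search. This is a minimal
sketch assuming that the attacker can afford between 1 and b_max AVMs (b_max being the integer part of
the attack resource divided by the unit attack cost), that the defender's feasible deployments are
supplied as a list of (n, e, d) triples, and that the failure probability is a callable f(n, e, b, d).
The function names and the placeholder objective are hypothetical.

```python
def attacker_best_response(f, n, e, d, b_max):
    """Eq. (11): the number of AVMs in 1..b_max that maximizes f(n, e, b, d)."""
    return max(range(1, b_max + 1), key=lambda b: f(n, e, b, d))


def defender_minimax(f, candidates, b_max):
    """Eq. (12): the deployment minimizing f at the attacker's best response."""
    def worst_case(ned):
        n, e, d = ned
        b_star = attacker_best_response(f, n, e, d, b_max)
        return f(n, e, b_star, d)
    return min(candidates, key=worst_case)


def toy_f(n, e, b, d):
    """Hypothetical placeholder: increases with b, decreases with n, e and d."""
    return b / (b + n + e + d)


# Hypothetical feasible deployments that satisfy the defender's budget.
candidates = [(1, 0, 0), (2, 1, 1), (3, 0, 0)]
b_star = attacker_best_response(toy_f, 1, 0, 0, b_max=5)
choice = defender_minimax(toy_f, candidates, b_max=5)
```

   Evaluating the attacker's best response inside the defender's search reflects the assumption that
the attacker observes the deployment before choosing the number of AVMs.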

5. Conclusion

    In this paper, we model a fault-tolerant NVP service that uses disguise components and early
warning agents to resist co-resident attacks. We develop a model to assess the failure probability of
the service, considering the effects of EWAs and DCs. Furthermore, we study the relationship between
the service failure probability and the model parameters, including the number of SCs, the number of
DCs, the number of EWAs, the correct execution probability of undamaged SCs, the detection probability
of EWAs, and the probability that SCs are damaged by their co-resident AVMs. Finding the best
allocation strategy for both attacker and defender is crucial. Hence, we formulate and solve the
optimization problem of identifying the numbers of SCs, EWAs, and DCs that minimize the service failure
probability under limited defense resources, considering the attacker's behavior.
    There are several kinds of attacks in the cloud environment, and this paper analyzes the threat
model of only one of them, the co-resident attack. In future work, we intend to study other malicious
attacks with common characteristics, such as DDoS attacks, computer viruses and node attacks.
Additionally, we will consider using machine learning algorithms to classify users and to detect the
virtual machines requested by malicious users who may launch co-resident attacks, so as to provide
early warnings to other users.

6. References
[1] Elmendorf WR. Fault tolerant programming. IBM Tech Disclosure Bull, 1972.
[2] Pramila S, Poonkuzhali S, Mythili S. Improvising reliability through N-version programming in
     cloud environment. Int J Adv Tech Eng Sci, April 2015;3(1):204–8.
[3] Liu J, Yang N. Optimal fault tolerant service provisioning for cloud application. Proc. of 2017
     7th IEEE International Conference on Electronics Information and Emergency Communication
     (ICEIEC). Macau, 2017:189–94.
[4] Khomh F. On improving the dependability of cloud applications with fault-tolerance. Proc
     WICSA, Article No. 2, pp. 1–3, April 2014. https://doi.org/10.1145/2578128.2578228.
[5] Wagner B, Sood A. Economics of Resilient Cloud Services. Proc 2016 IEEE Int Conf Softw Qual
     Reliab Secur Compan (QRS-C), 2016:368–74. https://doi.org/10.1109/QRS-C.2016.56.
[6] White CR. Cloud Computing and SBSE. In: Ruhe G, Zhang Y (eds) Search Based Software
     Engineering. SSBSE 2013. Lecture Notes in Computer Science, vol 8084. Springer, Berlin,
     Heidelberg, 2013.
[7] Liu F, Ge Q, Yarom Y, et al. CATalyst: Defeating last-level cache side channel attacks in cloud
     computing. Proceedings - International Symposium on High-Performance Computer
     Architecture, 2016.
[8] Han Y, Chan J, Alpcan T, et al. Using Virtual Machine Allocation Policies to Defend against
     Co-Resident Attacks in Cloud Computing. IEEE Transactions on Dependable and Secure
     Computing, 2017.
[9] Azar Y, Kamara S, Menache I, et al. Co-location-resistant clouds. Proceedings of the ACM
     Conference on Computer and Communications Security, 2014.
[10] Qiu Y, Shen Q, Luo Y, et al. A secure virtual machine deployment strategy to reduce co-residency
     in cloud. Proceedings - 16th IEEE International Conference on Trust, Security and Privacy in
     Computing and Communications, 11th IEEE International Conference on Big Data Science and
     Engineering and 14th IEEE International Conference on Embedded Software and Systems, 2017.
[11] Shinde Y, Vishwa A. Privacy preserving using data partitioning technique for secure cloud storage.
     Int J Comp Appl (0975 – 8887), April 2015;116(16).
[12] Luo L, Xing L, Levitin G. Optimizing dynamic survivability and security of replicated data in
     cloud systems under co-residence attacks. Reliability Engineering and System Safety, December
     2019;192:106265.
[13] Levitin G, Xing L. Co-residence based data theft game in cloud system with virtual machine
     replication and cancellation. Reliability Engineering and System Safety, 2022;222:108415.
[14] McQueen MA, Boyer WF, Flynn MA, et al. Time-to-compromise model for cyber risk
     reduction estimation. Advances in Information Security, 2006.
[15] Levitin G, Xing L, Xiang YP. Optimal early warning defense of N-version programming service
     against co-resident attacks in cloud system. Reliability Engineering & System Safety, 2020.

