=Paper= {{Paper |id=Vol-2301/paper35 |storemode=property |title=Security-preserving Support Vector Machine with Fully Homomorphic Encryption |pdfUrl=https://ceur-ws.org/Vol-2301/paper_35.pdf |volume=Vol-2301 |authors=Saerom Park,Jaewook Lee,Jung Hee Cheon,Juhee Lee,Jaeyun Kim,Junyoung Byun |dblpUrl=https://dblp.org/rec/conf/aaai/Park0CLKB19 }} ==Security-preserving Support Vector Machine with Fully Homomorphic Encryption== https://ceur-ws.org/Vol-2301/paper_35.pdf
         Security-preserving Support Vector Machine with Fully Homomorphic
                                     Encryption

     Saerom Park*, Jaeyun Kim, Joohee Lee, Junyoung Byun, Jung Hee Cheon, Jaewook Lee**
                                                     Seoul National University
                                         1 Gwanak-ro, Gwanak-gu, Seoul, South Korea 08826
                                         drsaerompark@gmail.com *, jaewook@snu.ac.kr **


Abstract

Recently, security issues have become increasingly important when applying machine learning models to real-world problems. It is necessary to preserve data privacy when using sensitive data and to protect the information of a trained model against intentional attacks. In this paper, we propose a security-preserving learning framework using fully homomorphic encryption for the support vector machine model. Our approach trains the model over the encrypted domain to preserve data and model privacy with reduced communication between the servers. The proposed procedure includes our protocol, data structure and homomorphic evaluation.

As machine learning models have been effectively applied to various real-world problems, collecting data from various sources has become crucial for many service providers (Park, Hah, and Lee 2017). In this situation, privacy issues have become a major concern in the learning community. Therefore, it is necessary to preserve the privacy of the data and the security of the learning process without compromising the performance of the learning algorithms.

Homomorphic encryption (HE) enables computations on encrypted data (ciphertexts) that are equivalent to operations on the decrypted data (plaintexts) (Gentry 2009). In recent years, privacy-preserving machine learning applications have been developed, but they carry high computational cost and a heavy memory burden. Cheon et al. developed a homomorphic encryption scheme for approximate arithmetic (HEAAN) (Cheon et al. 2017).

The support vector machine (SVM) is one of the most effective machine learning algorithms for classification (Cortes and Vapnik 1995). Both the training data and the model parameters should be protected to apply the SVM model in a security-preserving scenario. Therefore, in this paper, we propose an implementation of the training phase of the SVM classifier over the encrypted domain with the HEAAN scheme.

Design Components

Fully Homomorphic Encryption

Fully homomorphic encryption (FHE) is a cryptographic scheme that enables homomorphic operations, such as additions and multiplications, on encrypted data. Recently, the HEAAN scheme (Cheon et al. 2017) was developed to carry out approximate computations efficiently.

For the (leveled) HEAAN of depth L, we set the parameters, such as a power of two M, integers p, q_0, q_L = p^L · q_0, h and P, and a real σ, for λ-bit security. The specifications of the algorithm are as follows:

- (pk, sk, evk) ← KeyGen(1^λ).
- c ← Encrypt(pk, m): Sample r ∈ R^2 and e_0, e_1 ← DG_{q_L}(σ^2). Output c ← r · pk + (m + e_0, e_1) ∈ R^2_{q_L}.
- m ← Decrypt(sk, c = (c_0, c_1)): Output m ← c_0 + c_1 · s mod q_ℓ.
- c_add ← Add(c_1, c_2): Output c_add ← c_1 + c_2 mod q_ℓ.
- c_mult ← Mult(c_1, c_2): Set c_1 = (b_1, a_1) and c_2 = (b_2, a_2). Let (d_0, d_1, d_2) ← (b_1 · b_2, a_1 · b_2 + a_2 · b_1, a_1 · a_2) ∈ R^3_{q_ℓ}. Output c_mult ← (d_0, d_1) + ⌊P^{-1} · (d_2 · evk (mod P · q_ℓ))⌉ ∈ R^2_{q_ℓ}.
- c′ ← Rescaling_{ℓ→ℓ′}(c): For a level-ℓ ciphertext c, output c′ ← ⌊(q_{ℓ′}/q_ℓ) · c⌉ ∈ R^2_{q_{ℓ′}} at level ℓ′.

For more details, we refer the reader to (Cheon et al. 2017).
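The effect of Mult followed by Rescaling can be illustrated on plaintexts: HEAAN-style approximate arithmetic encodes a real number as an integer scaled by a factor p, a multiplication squares that scale, and rescaling divides by p (with rounding) to restore it. The following minimal sketch models only this fixed-point bookkeeping; the scale value and function names are our own, and no rings, encryption, or noise are involved:

```python
# Plaintext-only sketch of the fixed-point bookkeeping behind HEAAN's
# Mult + Rescaling: a real x is encoded as round(x * p); multiplying two
# encodings yields scale p^2, and rescaling divides by p to restore scale p.

P_SCALE = 2 ** 20  # the scale factor p (an illustrative choice)

def encode(x: float) -> int:
    """Encode a real number at scale p."""
    return round(x * P_SCALE)

def decode(m: int) -> float:
    """Decode an integer encoding at scale p."""
    return m / P_SCALE

def mult_and_rescale(m1: int, m2: int) -> int:
    """Multiply two scale-p encodings (scale becomes p^2), then divide
    by p with rounding, mimicking Mult followed by Rescaling."""
    d = m1 * m2                            # encodes x1 * x2 at scale p^2
    return (d + P_SCALE // 2) // P_SCALE   # rounded division back to scale p

x1, x2 = 3.141592, 2.718281
m = mult_and_rescale(encode(x1), encode(x2))
print(abs(decode(m) - x1 * x2) < 1e-4)
```

As in HEAAN itself, the result is approximate: each rounding step contributes a small error controlled by the scale p.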
Least Square Support Vector Machine

The SVM aims to find the maximum-margin hyperplane given the training examples {(x_1, y_1), ..., (x_n, y_n)} ⊂ R^d × {−1, 1}. To train the SVM model over encrypted data, we use a nonlinear least-squares SVM with a basis function φ(x): R^d → R^l that solves the following optimization problem (Suykens and Vandewalle 1999):

    min_{w,b,e}  (1/n) Σ_{i=1}^{n} e_i^2 + λ‖w‖^2                          (1)
    s.t.  y_i [w · φ(x_i) + b] = 1 − e_i,  ∀ i = 1, ..., n,

where w ∈ R^l. To solve problem (1), we construct the Lagrangian function L(w, b, e, α) = λ‖w‖^2 + (1/n) Σ_{i=1}^{n} e_i^2 − Σ_{i=1}^{n} α_i {y_i [w · φ(x_i) + b] + e_i − 1}. From the optimality conditions of the Lagrangian function, the following linear system is obtained by eliminating w and e:

    A b = [0  y^T; y  Ω + λI_n] [b; α] = [0; 1_n] = 1̃                    (2)

where Ω ∈ R^{n×n} with Ω_{i,j} = k(x_i, x_j), and b on the left-hand side denotes, with a slight abuse of notation, the stacked solution vector (b; α). We introduce an additional least-squares problem with a gradient descent method for the system (2), whose convergence can be guaranteed by the well-posedness of the system.

Copyright held by authors
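As a plaintext reference point, the system (2) can be assembled and solved directly with a linear solver. The sketch below is our own illustration: the toy data, the RBF kernel, and the choice Ω_{i,j} = y_i y_j k(x_i, x_j) follow the standard LSSVM formulation of (Suykens and Vandewalle 1999) and are not part of the encrypted protocol:

```python
import numpy as np

# Plaintext reference for the LSSVM linear system (2):
#   [0   y^T          ] [b    ]   [0  ]
#   [y   Omega + lam*I] [alpha] = [1_n]
# Here Omega[i, j] = y_i * y_j * k(x_i, x_j) with an RBF kernel
# (standard LSSVM form; an illustrative choice).
rng = np.random.default_rng(0)
n = 20
X = np.vstack([rng.normal(-2.0, 1.0, (n // 2, 2)),
               rng.normal(2.0, 1.0, (n // 2, 2))])
y = np.array([-1.0] * (n // 2) + [1.0] * (n // 2))

gamma, lam = 0.5, 1e-3
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)                     # kernel matrix k(x_i, x_j)
Omega = np.outer(y, y) * K

A = np.zeros((n + 1, n + 1))
A[0, 1:] = y
A[1:, 0] = y
A[1:, 1:] = Omega + lam * np.eye(n)
rhs = np.concatenate([[0.0], np.ones(n)])   # the vector 1~ in (2)

sol = np.linalg.solve(A, rhs)               # stacked solution (b, alpha)
b, alpha = sol[0], sol[1:]

# Decision values f(x_i) = sum_j alpha_j y_j k(x_j, x_i) + b
f = K @ (alpha * y) + b
print((np.sign(f) == y).mean() >= 0.9)      # high training accuracy
```

The sections that follow replace this direct solve, which is awkward over encrypted data, with an iterative scheme built from additions and multiplications only.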
The matrix A can be decomposed as follows:

    A = ([1; y] [1  y^T]) ⊙ [0  1^T; 1  Ω] + η [0  0^T; 0  I_n],          (3)

where ⊙ represents the Hadamard product. By the Schur product theorem, the first term becomes positive semi-definite with a Mercer kernel. Therefore, the linear system (2) is well-posed for some η > 0, and the convergence of the iterative method can be guaranteed. With these design components, the intermediate decryptions for training the SVM model can be reduced.
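The identity (3) can be checked numerically on plaintext matrices; the sketch below (with toy sizes and a synthetic PSD kernel matrix of our choosing) assembles the right-hand side of (3) entry-wise and compares it with the bordered system matrix:

```python
import numpy as np

# Plaintext check of decomposition (3): the bordered system matrix equals a
# rank-one matrix Hadamard-multiplied with a kernel-bordered matrix, plus a
# scaled identity on the lower-right block.
rng = np.random.default_rng(1)
n, eta = 5, 0.1
y = rng.choice([-1.0, 1.0], size=n)
B = rng.normal(size=(n, n))
Omega = B @ B.T                      # a synthetic PSD (Mercer-like) kernel matrix

u = np.concatenate([[1.0], y])       # the stacked vector (1, y)
rank_one = np.outer(u, u)            # [1; y] [1  y^T]

M = np.zeros((n + 1, n + 1))         # [0  1^T; 1  Omega]
M[0, 1:] = 1.0
M[1:, 0] = 1.0
M[1:, 1:] = Omega

E = np.zeros((n + 1, n + 1))         # [0  0^T; 0  I_n]
E[1:, 1:] = np.eye(n)

A = rank_one * M + eta * E           # right-hand side of (3)

A_direct = np.zeros((n + 1, n + 1))  # direct assembly of the same matrix
A_direct[0, 1:] = y
A_direct[1:, 0] = y
A_direct[1:, 1:] = np.outer(y, y) * Omega + eta * np.eye(n)

print(np.allclose(A, A_direct))
```

Note that the lower-right block produced by (3) is (y y^T) ⊙ Ω + η I_n, so the labels are absorbed into the kernel block of the system.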

Figure 1: Our procedure with protocol

Implementation

In this paper, we propose a protocol that reduces the communication needed to construct the kernel matrix and to learn the model parameters with FHE operations. Figure 1 illustrates the whole procedure of our protocol, which secures both server-side and data-provider-side information because all operations can be performed over the encrypted domain.

Data Structure

As mentioned previously, we first compute the kernel matrix from the encrypted data. For simplicity, we assume that the kernel matrix is encrypted as a single ciphertext. We want to build a parallelizable procedure because of the computational cost and the large memory footprint of ciphertexts. The HEAAN scheme supports slot-wise operations over ciphertexts, so matrix operations should be replaced with additions and Hadamard products. We propose a data structure for calculating the inner-product matrix. Assume that Z ∈ R^{n×d} and ZZ^T, Ẑ_j ∈ R^{n×n}. Then

    ZZ^T = Ẑ_1 ⊙ Ẑ_1^T + · · · + Ẑ_d ⊙ Ẑ_d^T,

where Z = [z_{ij}] and Ẑ_j is the n×n matrix each of whose rows is (z_{1j}, z_{2j}, ..., z_{nj}). The multiplications of ciphertexts in this operation can be performed on d different machines in parallel.
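The decomposition above can be verified on plaintext matrices; since each term Ẑ_j ⊙ Ẑ_j^T uses only slot-wise (Hadamard) multiplication, the d terms are independent. The sketch below is our own plaintext illustration, with arbitrary sizes and names:

```python
import numpy as np

# Plaintext check of the data structure: Z Z^T decomposes into d Hadamard
# products, one per feature, so each term can be computed on its own machine.
rng = np.random.default_rng(2)
n, d = 6, 3
Z = rng.normal(size=(n, d))

def Z_hat(j: int) -> np.ndarray:
    """Z_hat_j: the n x n matrix whose every row is the j-th column of Z."""
    return np.tile(Z[:, j], (n, 1))

# Each term uses only slot-wise multiplication, the operation HEAAN supports.
terms = [Z_hat(j) * Z_hat(j).T for j in range(d)]
print(np.allclose(sum(terms), Z @ Z.T))
```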
Security-Preserving Iterative Training

In this study, to efficiently implement the gradient descent method, we utilize the symmetry of the matrix (3) in the matrix multiplication. To learn b, the update equation is b^{k+1} = b^k − αA^T(Ab^k − 1̃). Matrix-vector multiplication over the encrypted domain is efficiently implemented by rotating the slots. We use a pre-computed A^T A, exploiting the symmetric property, instead of performing two matrix-vector multiplications per iteration. Composing two updates, the resulting update equation is b^{k+2} = M b^k + c, where M = (I − αA^T A)^2 and c = α(2I − αA^T A) A^T 1̃ are pre-computed.
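The two-step composition can be checked on plaintext data: applying the single-step update twice must agree with the pre-computed form b^{k+2} = M b^k + c, where M = (I − αA^TA)^2 and c = α(2I − αA^TA)A^T 1̃. The toy matrix and step size below are our own choices:

```python
import numpy as np

# Check that two gradient steps for min_b ||A b - 1~||^2 compose into the
# pre-computed update b_{k+2} = M b_k + c, with M and c computed once.
rng = np.random.default_rng(3)
n = 5
A = rng.normal(size=(n + 1, n + 1))            # stands in for the system matrix
target = np.concatenate([[0.0], np.ones(n)])   # the vector 1~
alpha = 0.01                                   # step size (illustrative)

def step(b):
    """One gradient step: b - alpha * A^T (A b - 1~)."""
    return b - alpha * A.T @ (A @ b - target)

AtA = A.T @ A                                  # pre-computed once
I = np.eye(n + 1)
M = (I - alpha * AtA) @ (I - alpha * AtA)      # M = (I - alpha A^T A)^2
c = alpha * (2 * I - alpha * AtA) @ A.T @ target

b0 = rng.normal(size=n + 1)
print(np.allclose(step(step(b0)), M @ b0 + c))
```

With M and c pre-computed, each encrypted iteration advances two gradient steps at the cost of a single matrix-vector product.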
Comparison

Our procedure replaces the dual convex optimization with numerical gradient descent to implement the security-preserving LSSVM. To illustrate the effect of this replacement, we compare the classification performances with RBF and polynomial kernels on several real datasets used in (Suykens and Vandewalle 1999). Table 1 shows that the replacement does not severely affect the performance. From these results, we can expect to develop a security-preserving SVM without compromising the performance.

Table 1: Classification accuracy for real datasets

Conclusion

In this paper, we present a new framework to train the LSSVM model using HEAAN. This framework includes our protocol, data structure and security-preserving iterative training procedure. Our method addresses the computational cost and memory burden that are the common obstacles to the application of HE schemes. The proposed solution can help to show the potential for the practical utilization of machine learning models without concerns about security and privacy issues.

Acknowledgement

This work was supported in part by NRF of Korea grants No. 2018R1D1A1A02085851, 2016R1A2B3014030 and 2017R1A5A1015626.

References

[Cheon et al. 2017] Cheon, J. H.; Kim, A.; Kim, M.; and Song, Y. 2017. Homomorphic encryption for arithmetic of approximate numbers. In ASIACRYPT 2017, 409–437. Springer.

[Cortes and Vapnik 1995] Cortes, C., and Vapnik, V. 1995. Support-vector networks. Machine Learning 20(3):273–297.

[Gentry 2009] Gentry, C. 2009. A fully homomorphic encryption scheme. Ph.D. Dissertation, Stanford University. http://crypto.stanford.edu/craig.

[Park, Hah, and Lee 2017] Park, S.; Hah, J.; and Lee, J. 2017. Inductive ensemble clustering using kernel support matching. Electronics Letters 53(25):1625–1626.

[Suykens and Vandewalle 1999] Suykens, J. A., and Vandewalle, J. 1999. Least squares support vector machine classifiers. Neural Processing Letters 9(3):293–300.