<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Security-preserving Support Vector Machine with Fully Homomorphic Encryption</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saerom Park</string-name>
          <email>drsaerompark@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaeyun Kim</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joohee Lee</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Junyoung Byun</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jung Hee Cheon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaewook Lee</string-name>
          <email>jaewook@snu.ac.kr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Seoul National University</institution>
          ,
          <addr-line>1 Gwanak-ro, Gwanak-gu, Seoul 08826</addr-line>
          ,
          <country>South Korea</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recently, security issues have become increasingly important when applying machine learning models to real-world problems. It is necessary to preserve data privacy when using sensitive data and to protect the information of a trained model against intentional attacks. In this paper, we propose a security-preserving learning framework using fully homomorphic encryption for the support vector machine model. Our approach aims to train the model on the encrypted domain to preserve data and model privacy with reduced communication between the servers. The proposed procedure includes our protocol, data structure, and homomorphic evaluation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        As machine learning models have been effectively applied
to various real-world problems, collecting data from
various sources has become crucial for many service providers
        <xref ref-type="bibr" rid="ref6">(Park,
Hah, and Lee 2017)</xref>
        . In this situation, privacy issues have
become a major concern in the machine learning community. Therefore,
it is necessary to preserve the privacy of the data and the
security of the learning process without compromising the
performance of the learning algorithms.
      </p>
      <p>
        Homomorphic encryption (HE) enables computations on
encrypted data (ciphertext) which are equivalent to
operations on decrypted data (plaintext)
        <xref ref-type="bibr" rid="ref5">(Gentry 2009)</xref>
        . In
recent years, privacy-preserving machine learning
applications have been developed, but they suffer from high
computational costs and heavy memory burdens. Cheon et al. developed a
homomorphic encryption scheme for approximate arithmetic
(HEAAN)
        <xref ref-type="bibr" rid="ref1">(Cheon et al. 2017)</xref>
        .
      </p>
      <p>
        Support vector machine (SVM) is one of the most
effective machine learning algorithms for classification
        <xref ref-type="bibr" rid="ref3">(Cortes
and Vapnik 1995)</xref>
        . The training data and the model
parameters should be protected to apply the SVM model to the
security-preserving scenario. Therefore, in this paper, we
propose an implementation of the training phase of the SVM
classifier on the encrypted domain with the HEAAN scheme.
      </p>
      <p>
        An HE scheme supports homomorphic operations such as
additions and multiplications on encrypted data. Recently, the
HEAAN scheme
        <xref ref-type="bibr" rid="ref1">(Cheon et al. 2017)</xref>
        was developed to carry out approximate computations
efficiently.
      </p>
      <p>For the (leveled) HEAAN scheme of depth L, we set the
parameters such as a power of two M, integers p, q_0, q_L = p^L q_0,
h and P, and a real σ for λ-bit security. The specifications
of the algorithm are as follows.
- (pk, sk, evk) ← KeyGen(1^λ).
- c̃ ← Encrypt(pk, m): Sample r̃ ∈ R_2 and e_0, e_1 ←
DG_{q_L}(σ^2). Output c̃ ← r̃ · pk + (m + e_0, e_1) ∈ R_{q_L}^2.</p>
      <p>- m ← Decrypt(sk, c̃ = (c_0, c_1)): Output m ← c_0 + c_1 · s
(mod q_ℓ).
- c̃_add ← Add(c̃_1, c̃_2): Output c̃_add ← c̃_1 + c̃_2 (mod q_ℓ).
- c̃_mult ← Mult(c̃_1, c̃_2): Set c̃_1 = (b_1, a_1) and c̃_2 = (b_2, a_2).
Let (d_0, d_1, d_2) ← (b_1 b_2, a_1 b_2 + a_2 b_1, a_1 a_2) ∈ R_{q_ℓ}^3.
Output c̃_mult ← (d_0, d_1) + ⌊P^{-1} · (d_2 · evk (mod P q_ℓ))⌉ ∈ R_{q_ℓ}^2.
- c̃′ ← Rescaling_{ℓ→ℓ′}(c̃): For a level-ℓ ciphertext c̃, output
c̃′ ← ⌊(q_{ℓ′}/q_ℓ) · c̃⌉ ∈ R_{q_{ℓ′}}^2 at level ℓ′.</p>
      <p>
        For more details, we recommend to see
        <xref ref-type="bibr" rid="ref1">(Cheon et al. 2017)</xref>
        .
      </p>
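<p>As an unencrypted illustration of the scale management behind the approximate arithmetic above, the following sketch mimics encode, Add, Mult, and Rescaling on scaled integers; the scaling factor p is an illustrative choice and no encryption is involved.</p>

```python
# Plaintext mock of HEAAN-style approximate arithmetic (no encryption):
# reals are encoded as scaled integers, a product lives at scale p^2,
# and Rescaling divides by the base scale p to restore it.
p = 2**20  # base scaling factor (illustrative choice)

def encode(x):
    """Encode a real number as a scaled integer m = round(x * p)."""
    return round(x * p)

def decode(m, scale=p):
    """Recover an approximate real value from a scaled integer."""
    return m / scale

def add(m1, m2):
    """Slot-wise addition keeps the scale unchanged."""
    return m1 + m2

def mult(m1, m2):
    """Multiplication doubles the scale to p^2."""
    return m1 * m2

def rescale(m):
    """Rescaling divides by p (with rounding), as in the scheme above."""
    return round(m / p)

a, b = encode(1.5), encode(2.25)
s = add(a, b)                 # still at scale p
prod = rescale(mult(a, b))    # back at scale p after Rescaling
```

<p>Here decode(s) and decode(prod) recover 1.5 + 2.25 and 1.5 × 2.25 up to rounding error, which is the sense in which HEAAN computations are approximate.</p>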
    </sec>
    <sec id="sec-2">
      <title>Least Squares Support Vector Machine</title>
      <p>
        The SVM aims to find the maximum-margin hyperplane,
given the training examples {(x_1, y_1), ..., (x_n, y_n)} ⊂ R^d ×
{-1, 1}. To train the SVM model over encrypted data, we
use a nonlinear least-squares SVM with a basis function
φ(x) : R^d → R^l that minimizes the following optimization
problem
        <xref ref-type="bibr" rid="ref7">(Suykens and Vandewalle 1999)</xref>
        :

min_{w,b,e_i} (1/n) Σ_{i=1}^n e_i^2 + γ ||w||^2
s.t. y_i [w^T φ(x_i) + b] = 1 − e_i, for all i = 1, ..., n,   (1)

where w ∈ R^l. To solve the problem (1), we
construct the Lagrangian function: L(w, b, e, α) = γ ||w||^2 +
(1/n) Σ_{i=1}^n e_i^2 − Σ_{i=1}^n α_i {y_i [w^T φ(x_i) + b] + e_i − 1}. With the
optimality conditions of the Lagrangian function, the following
linear system is obtained by eliminating w and e:
      </p>
      <p>Aβ = [0, y^T; y, Ω + γI_n] [b; α] = [0; 1_n],   (2)
where A ∈ R^{(n+1)×(n+1)}, β = (b; α), and Ω_{i,j} = y_i y_j k(x_i, x_j). We
introduce an additional least-squares problem solved by the gradient
descent method for the system (2), whose convergence can be
guaranteed by the well-posedness of A. The matrix A can be
decomposed as follows:</p>
      <p>1
A = ( y
1
yT )
where represents a Hadmard product. By Schur product
theorem, the first term becomes PSD with Mercer kernel.
Therefore, the linear system (2) is well-posed with some
&gt; 0, and the convergence of iterative method can be
gauranteed. With these design components, the intermediate
dycryptions for training the SVM model can be reduced.</p>
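<p>The Hadamard decomposition of the system matrix can be checked numerically on unencrypted toy data. In the sketch below, the sample sizes, the RBF kernel, and the value of γ (gamma) are illustrative choices.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, gamma = 8, 3, 0.5                  # toy sizes; gamma is illustrative
X = rng.normal(size=(n, d))
y = rng.choice([-1.0, 1.0], size=n)

# RBF (Mercer) kernel matrix K with K_ij = exp(-||x_i - x_j||^2)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq)

# System matrix A = [[0, y^T], [y, Omega + gamma*I]], Omega_ij = y_i y_j K_ij
Omega = np.outer(y, y) * K
A = np.zeros((n + 1, n + 1))
A[0, 1:], A[1:, 0] = y, y
A[1:, 1:] = Omega + gamma * np.eye(n)

# Hadamard decomposition A = ([1;y][1 y^T]) ∘ [[0, 1^T], [1, K + gamma*I]]
u = np.concatenate(([1.0], y))
B = np.zeros((n + 1, n + 1))
B[0, 1:], B[1:, 0] = 1.0, 1.0
B[1:, 1:] = K + gamma * np.eye(n)
ok = np.allclose(A, np.outer(u, u) * B)  # holds because y_i^2 = 1
```

<p>The identity holds entry-wise since the labels satisfy y_i^2 = 1, so the γ-ridge survives the Hadamard product on the diagonal.</p>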
      <sec id="sec-2-1">
        <title>Implementation</title>
        <p>In this paper, we propose a protocol that reduces the
communication needed to construct the kernel matrix and to learn the
model parameters with FHE operations. Figure 1 illustrates
the whole procedure of our protocol, which secures both
server-side and data-provider-side information because all
operations can be performed on the encrypted domain.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Data Structure</title>
      <p>As mentioned previously, we first compute the kernel matrix
from the encrypted data. For simplicity, we assume that the
kernel matrix is encrypted as a ciphertext. We want to build a
parallelizable procedure because of the computational cost and
large memory footprint of ciphertexts. The HEAAN scheme supports
slot-wise operations over ciphertexts, so the operations on
matrices should be replaced with additions and Hadamard
products. We propose a data structure for calculating the inner
product matrix. Assume that Z ∈ R^{n×d} and ZZ^T, Ẑ^j ∈ R^{n×n}.</p>
      <p>ZZ^T = Ẑ^1 ∘ (Ẑ^1)^T + ... + Ẑ^d ∘ (Ẑ^d)^T,
where Ẑ^j is the n×n matrix whose columns all equal the j-th column
of Z, i.e., (Ẑ^j)_{ik} = z_{ij}:

Z = [z_11 ... z_1d; ...; z_n1 ... z_nd],
Ẑ^j = [z_1j ... z_1j; z_2j ... z_2j; ...; z_nj ... z_nj].

The multiplications of ciphertexts in this operation can be
performed on d different machines in parallel.</p>
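<p>The column-replication structure above can be verified in the clear; the sizes below are illustrative and no encryption is involved.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 3                               # toy sizes
Z = rng.normal(size=(n, d))

# \hat{Z}^j repeats the j-th column of Z across all n columns, so each
# term of the sum only involves one feature column at a time.
ZZT = np.zeros((n, n))
for j in range(d):                        # the d terms can run in parallel
    Zj = np.tile(Z[:, j:j + 1], (1, n))   # columns all equal Z[:, j]
    ZZT += Zj * Zj.T                      # Hadamard product, i.e. slot-wise
ok = np.allclose(ZZT, Z @ Z.T)
```

<p>Each summand only needs a Hadamard product of a ciphertext with its transpose, which matches the slot-wise operations HEAAN provides.</p>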
    </sec>
    <sec id="sec-4">
      <title>Secure-Preserving Iterative Training</title>
      <p>In this study, to efficiently implement the gradient descent
method, we utilize the symmetry of the matrix A in (3) for
matrix multiplication. To learn b, the update equation is
b_{k+1} = b_k − η A^T (A b_k − 1̃), where η is a learning rate and 1̃
denotes the right-hand side of (2). Matrix-vector
multiplication on the encrypted domain is efficiently implemented by
rotating the slots. We used the pre-computed A^T A, obtained via the
symmetric property, instead of two matrix-vector
multiplications per iteration. Therefore, the resulting update
equation is b_{k+2} = M b_k + c, where M = (I − η A^T A)^2 and
c = η (2I − η A^T A) A^T 1̃ are pre-computed.</p>
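<p>A short unencrypted sketch can confirm that folding two gradient steps into a single affine update of the form b_{k+2} = M b_k + c is exact; the matrix, right-hand side, and learning rate η (eta) below are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
m, eta = 6, 0.05                          # system size, learning rate (illustrative)
A = rng.normal(size=(m, m))
A = A @ A.T + np.eye(m)                   # symmetric, well-posed toy system matrix
r = np.ones(m)                            # stand-in for the target vector
AtA, Atr = A.T @ A, A.T @ r               # pre-computed once, as in the text

def step(b):
    """One plain gradient step: b <- b - eta * A^T (A b - r)."""
    return b - eta * (AtA @ b - Atr)

# Two steps folded into a single affine update b <- M b + c
I = np.eye(m)
M = (I - eta * AtA) @ (I - eta * AtA)
c = eta * (2 * I - eta * AtA) @ AtA @ np.linalg.solve(AtA, Atr)
c = eta * (2 * I - eta * AtA) @ Atr      # equivalent closed form used directly
b0 = rng.normal(size=m)
ok = np.allclose(M @ b0 + c, step(step(b0)))
```

<p>Halving the number of homomorphic matrix-vector multiplications per effective iteration is what makes this folding attractive on encrypted data, where each multiplication consumes a ciphertext level.</p>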
    </sec>
    <sec id="sec-5">
      <title>Comparison</title>
      <p>
        Our procedure replaces the dual convex optimization with
numerical gradient descent to implement the security-preserving
LSSVM. To illustrate the effect of this replacement, we compare
the classification performances with RBF
and polynomial kernels on some real datasets used in
        <xref ref-type="bibr" rid="ref7">(Suykens and Vandewalle 1999)</xref>
        . Table 1 shows that the
replacement does not severely affect the performance. From
these results, we can expect to develop a security-preserving
SVM without compromising the performance.
      </p>
      <sec id="sec-5-1">
        <title>Conclusion</title>
        <p>In this paper, we present a new framework to train the
LSSVM model using HEAAN. This framework includes our
protocol, data structure, and security-preserving iterative
training procedure. Our method addresses the reduction of
computational cost and memory burden, which are common
problems in the application of HE schemes. The proposed
solution helps show the potential for the practical utilization
of machine learning models without concerns about security and
privacy issues.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Acknowledgement</title>
        <p>This work was supported in part by NRF of Korea grants
No. 2018R1D1A1A02085851, 2016R1A2B3014030 and
2017R1A5A1015626.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[Cheon et al. 2017] Cheon, J. H.; Kim, A.; Kim, M.; and Song, Y. 2017. <article-title>Homomorphic encryption for arithmetic of approximate numbers</article-title>. In <source>ASIACRYPT 2017</source>, <fpage>409</fpage>-<lpage>437</lpage>. Springer.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[Cortes and Vapnik 1995] Cortes, C., and Vapnik, V. 1995. <article-title>Support-vector networks</article-title>. <source>Machine Learning</source> <volume>20</volume>(<issue>3</issue>):<fpage>273</fpage>-<lpage>297</lpage>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[Gentry 2009] Gentry, C. 2009. <article-title>A fully homomorphic encryption scheme</article-title>. <source>Ph.D. Dissertation</source>, Stanford University. http://crypto.stanford.edu/craig.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[Park, Hah, and Lee 2017] Park, S.; Hah, J.; and Lee, J. 2017. <article-title>Inductive ensemble clustering using kernel support matching</article-title>. <source>Electronics Letters</source> <volume>53</volume>(<issue>25</issue>):<fpage>1625</fpage>-<lpage>1626</lpage>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[Suykens and Vandewalle 1999] Suykens, J. A., and Vandewalle, J. 1999. <article-title>Least squares support vector machine classifiers</article-title>. <source>Neural Processing Letters</source> <volume>9</volume>(<issue>3</issue>):<fpage>293</fpage>-<lpage>300</lpage>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>