<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Security-preserving Support Vector Machine with Fully Homomorphic Encryption</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Saerom</forename><surname>Park</surname></persName>
							<email>drsaerompark@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jaeyun</forename><surname>Kim</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Joohee</forename><surname>Lee</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Junyoung</forename><surname>Byun</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jung</forename><forename type="middle">Hee</forename><surname>Cheon</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jaewook</forename><surname>Lee</surname></persName>
							<email>jaewook@snu.ac.kr</email>
							<affiliation key="aff0">
								<orgName type="institution">Seoul National University</orgName>
								<address>
<addrLine>1 Gwanak-ro, Gwanak-gu</addrLine>
									<postCode>08826</postCode>
									<settlement>Seoul</settlement>
									<country key="KR">South Korea</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Security-preserving Support Vector Machine with Fully Homomorphic Encryption</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">BD57BE4F5EF5C4B3447558278A6E0110</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T06:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Security issues have become increasingly important when applying machine learning models to real-world problems. It is necessary to preserve data privacy when using sensitive data and to protect the information of a trained model against intentional attacks. In this paper, we propose a security-preserving learning framework for the support vector machine model using fully homomorphic encryption. Our approach trains the model on the encrypted domain to preserve data and model privacy while reducing communication between the servers. The proposed procedure comprises our protocol, data structure, and homomorphic evaluation.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>As machine learning models have been effectively applied to various real-world problems, collecting data from diverse sources has become crucial for many service providers <ref type="bibr" target="#b1">(Park, Hah, and Lee 2017)</ref>. In this setting, privacy has become a major concern in the machine learning community. Therefore, it is necessary to preserve the privacy of the data and the security of the learning process without compromising the performance of the learning algorithms.</p><p>Homomorphic encryption (HE) enables computations on encrypted data (ciphertexts) that correspond to operations on the underlying plaintexts (Gentry 2009). In recent years, privacy-preserving machine learning applications have been developed, but they suffer from high computational costs and a heavy memory burden. Cheon et al. developed a homomorphic encryption scheme for approximate arithmetic (HEAAN) <ref type="bibr" target="#b0">(Cheon et al. 2017)</ref>.</p><p>The support vector machine (SVM) is one of the most effective machine learning algorithms for classification <ref type="bibr" target="#b0">(Cortes and Vapnik 1995)</ref>. Both the training data and the model parameters must be protected to apply the SVM model in a security-preserving scenario. Therefore, in this paper, we propose an implementation of the training phase of the SVM classifier on the encrypted domain with the HEAAN scheme.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Design Components: Fully Homomorphic Encryption</head><p>Fully homomorphic encryption (FHE) is a cryptographic scheme that enables homomorphic operations, such as additions and multiplications, on encrypted data. Recently, the HEAAN scheme <ref type="bibr" target="#b0">(Cheon et al. 2017</ref>) was developed to carry out approximate computations efficiently.</p><p>For the (leveled) HEAAN scheme of depth L, we set the parameters: a power of two M, integers p, q_0, q_L = p^L · q_0, h, and P, and a real σ for λ-bit security. The algorithms of the scheme are specified as follows.</p><formula xml:id="formula_0">- (pk, sk, evk) ← KeyGen(1^λ).
- c ← Encrypt(pk, m): Sample r ∈ R_2 and e_0, e_1 ← DG_{q_L}(σ^2). Output c ← r · pk + (m + e_0, e_1) ∈ R^2_{q_L}.
- m ← Decrypt(sk, c = (c_0, c_1)): Output m ← c_0 + c_1 · s (mod q_ℓ).
- c_add ← Add(c_1, c_2): Output c_add ← c_1 + c_2 (mod q_ℓ).
- c_mult ← Mult(c_1, c_2): Set c_1 = (b_1, a_1) and c_2 = (b_2, a_2). Let (d_0, d_1, d_2) ← (b_1 · b_2, a_1 · b_2 + a_2 · b_1, a_1 · a_2) ∈ R^3_{q_ℓ}. Output c_mult ← (d_0, d_1) + ⌊P^{-1} · (d_2 · evk (mod P · q_ℓ))⌉ ∈ R^2_{q_ℓ}.
- c' ← Rescaling_{ℓ→ℓ'}(c): For a level-ℓ ciphertext c, output c' ← ⌊(q_{ℓ'}/q_ℓ) · c⌉ ∈ R^2_{q_{ℓ'}} at level ℓ'.</formula><p>For more details, we refer the reader to <ref type="bibr" target="#b0">(Cheon et al. 2017)</ref>.</p></div>
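<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a plain-Python illustration (not actual encryption, and with an illustrative scaling factor), the following sketch mimics how the approximate arithmetic above manages precision: messages are scaled by a factor before multiplication, and Rescaling divides the product by that factor to restore the scale.</p><p>
```python
# Plaintext mock of HEAAN-style approximate arithmetic (no encryption).
# Messages are encoded as integers scaled by P; multiplication doubles
# the scale, and rescaling divides by P to restore it, at the cost of a
# small rounding error. P is an illustrative parameter choice.

P = 2 ** 20

def encode(x):
    return round(x * P)

def decode(m):
    return m / P

def add(m1, m2):       # addition keeps the scale
    return m1 + m2

def mult(m1, m2):      # multiplication doubles the scale ...
    return m1 * m2

def rescale(m):        # ... so divide by P to bring it back
    return round(m / P)

a, b = encode(1.5), encode(2.25)
s = decode(add(a, b))            # approximately 1.5 + 2.25
p = decode(rescale(mult(a, b)))  # approximately 1.5 * 2.25
```
</p><p>With this convention, additions keep the scale fixed while each multiplication consumes one level, which is why the scheme is leveled with depth L.</p></div>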
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Least Square Support Vector Machine</head><p>The SVM aims to find the maximum-margin hyperplane given the training examples <formula xml:id="formula_1">{(x_1, y_1), ..., (x_n, y_n)} ⊂ R^d × {−1, 1}.</formula> To train the SVM model over encrypted data, we use a nonlinear least-squares SVM with a basis function φ(x): R^d → R^l, which solves the following optimization problem <ref type="bibr" target="#b2">(Suykens and Vandewalle 1999)</ref>:</p><formula xml:id="formula_2">min_{w,b,e} (1/n) Σ_{i=1}^n e_i^2 + λ ||w||^2   (1)
s.t. y_i [w · φ(x_i) + b] = 1 − e_i, ∀i = 1, ..., n,</formula><p>where w ∈ R^l. To solve problem (1), we construct the Lagrangian function:</p><formula xml:id="formula_3">L(w, b, e, α) = λ ||w||^2 + (1/n) Σ_{i=1}^n e_i^2 − Σ_{i=1}^n α_i { y_i [w · φ(x_i) + b] + e_i − 1 }.</formula><p>From the optimality conditions of the Lagrangian, the following linear system is obtained by eliminating w and e:</p><formula xml:id="formula_4">A (b; α) = [ 0, y^T ; y, Ω + λI_n ] (b; α) = (0; 1_n)   (2)</formula><p>where Ω ∈ R^{n×n} with Ω_{i,j} = y_i y_j k(x_i, x_j). To solve the system (2), we introduce an additional least-squares problem solved by the gradient descent method, whose convergence can be guaranteed by modifying the coefficient matrix as</p><formula xml:id="formula_5">A = ( (1; y)(1, y^T) ) ⊙ [ 0, 1^T ; 1, K ] + η [ 0, 0^T ; 0, I_n ]   (3)</formula><p>where K_{i,j} = k(x_i, x_j) and ⊙ represents the Hadamard product. By the Schur product theorem, the first term is positive semidefinite for a Mercer kernel. Therefore, the linear system (2) is well-posed for some η &gt; 0, and the convergence of the iterative method can be guaranteed. With these design components, the number of intermediate decryptions for training the SVM model can be reduced.</p></div>
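<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make this concrete, the sketch below (our illustration, not the paper's implementation; the data, kernel, and parameters are made up) builds the bordered system (2) for a small RBF-kernel problem and solves it by gradient descent on the normal equations, with A^T A pre-computed.</p><p>
```python
import numpy as np

# Illustrative sketch: assemble the LSSVM linear system (2) with an RBF
# kernel and solve A x = r by gradient descent on ||Ax - r||^2. All data
# and hyperparameters are toy choices.

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
n = len(y)

def rbf(u, v, gamma=1.0):
    return np.exp(-gamma * np.sum((u - v) ** 2))

# Omega_{ij} = y_i y_j k(x_i, x_j), as in system (2)
Omega = np.array([[y[i] * y[j] * rbf(X[i], X[j]) for j in range(n)]
                  for i in range(n)])
lam = 0.1
A = np.zeros((n + 1, n + 1))
A[0, 1:] = y
A[1:, 0] = y
A[1:, 1:] = Omega + lam * np.eye(n)
r = np.concatenate([[0.0], np.ones(n)])

# gradient descent on the normal equations, with A^T A pre-computed
B, g = A.T @ A, A.T @ r
x = np.zeros(n + 1)
alpha = 1.0 / np.linalg.norm(B, 2)   # step size below 2/lambda_max(B)
for _ in range(200000):
    x = x - alpha * (B @ x - g)

residual = np.linalg.norm(A @ x - r)
```
</p><p>Because the iterate only needs matrix additions and multiplications, the same loop can in principle be evaluated over ciphertexts.</p></div>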
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Implementation</head><p>In this paper, we propose a protocol that reduces the communication required to construct the kernel matrix and to learn the model parameters with FHE operations. Figure <ref type="figure" target="#fig_0">1</ref> illustrates the whole procedure of our protocol, which secures both server-side and data-provider-side information because all operations are performed on the encrypted domain.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Data Structure</head><p>As mentioned previously, we first compute the kernel matrix from the encrypted data. For simplicity, we assume that the kernel matrix is encrypted as a single ciphertext. We aim for a parallelizable procedure because of the computational cost and the large memory footprint of ciphertexts. The HEAAN scheme supports slot-wise operations over ciphertexts, so matrix operations must be expressed with slot-wise additions and Hadamard products. We propose the following data structure for computing the inner-product matrix. Assume that Z ∈ R^{n×d} and ZZ^T, Ẑ_j ∈ R^{n×n}:</p><formula xml:id="formula_6">ZZ^T = Ẑ_1 ⊙ Ẑ_1^T + ··· + Ẑ_d ⊙ Ẑ_d^T,
Z = [ z_11 ··· z_1d ; ... ; z_n1 ··· z_nd ],
Ẑ_j = [ z_1j z_2j ··· z_nj ; ... ; z_1j z_2j ··· z_nj ].</formula><p>The multiplications of ciphertexts in this operation can be performed on d different machines in parallel.</p></div>
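<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal NumPy sketch of this decomposition (illustrative names and sizes): each Ẑ_j is the matrix whose rows all equal the j-th column of Z, and the Gram matrix is recovered as a sum of d Hadamard products.</p><p>
```python
import numpy as np

# Sketch of the proposed layout: Z Z^T is rebuilt from d slot-wise
# (Hadamard) products, one per feature column, so each product can be
# evaluated on a separate machine. Names and sizes are illustrative.

rng = np.random.default_rng(1)
n, d = 5, 3
Z = rng.normal(size=(n, d))

def zhat(j):
    # n x n matrix whose rows all equal the j-th column of Z
    return np.tile(Z[:, j], (n, 1))

# '*' is the element-wise (Hadamard) product in NumPy
gram = sum(zhat(j) * zhat(j).T for j in range(d))
```
</p><p>Entry (i, k) of the j-th product is z_ij · z_kj, so summing over j reproduces the inner-product matrix exactly.</p></div>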
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Security-Preserving Iterative Training</head><p>In this study, to implement the gradient descent method efficiently, we utilize the symmetry of the matrix (3) in the matrix-vector multiplications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Comparison</head><p>Our procedure replaces the dual convex optimization with a numerical gradient descent method to implement the security-preserving LSSVM. To illustrate the effect of this replacement, we compare the classification performance with RBF and polynomial kernels on several real datasets used in <ref type="bibr" target="#b2">(Suykens and Vandewalle 1999)</ref>. Table <ref type="table">1</ref> shows that the replacement does not severely degrade performance. From these results, we expect that a security-preserving SVM can be developed without compromising performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusion</head><p>In this paper, we present a new framework to train the LSSVM model using HEAAN. This framework includes the protocol, the data structure, and the security-preserving iterative training procedure. Our method addresses the reduction of the computational cost and memory burden that are common obstacles to the application of HE schemes. The proposed solution demonstrates the potential for the practical use of machine learning models without concerns about security and privacy.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Our procedure with protocol</figDesc><graphic coords="2,65.51,54.00,215.47,119.06" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Table 1 :</head><label>1</label><figDesc>Table 1: Classification accuracy for real datasets</figDesc></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>To learn b, the update equation is b^{k+1} = b^k − αA^T(Ab^k − r), where r = (0, 1_n^T)^T denotes the right-hand side of (2). Matrix-vector multiplication on the encrypted domain is implemented efficiently by rotating the slots. Instead of performing two matrix-vector multiplications per iteration, we pre-compute A^T A by using the symmetry of A. The resulting update equation is b^{k+2} = Mb^k + c, where M = (I − αA^T A)^2 and c = α(2I − αA^T A)A^T r are pre-computed.</p></div>
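<div xmlns="http://www.tei-c.org/ns/1.0"><p>The collapse of two gradient steps into one pre-computed double step can be checked numerically. The sketch below (illustrative matrix, vector, and step size; r denotes the right-hand side of the linear system) expands b ← b − αA^T(Ab − r) twice and compares it with the closed form.</p><p>
```python
import numpy as np

# Numerical check of the pre-computed two-step update: applying
# b = b - alpha * A^T (A b - r) twice equals b2 = M b0 + c with
# M = (I - alpha*A^T A)^2 and c = alpha*(2I - alpha*A^T A) A^T r.
# Matrix, vector, and step size are illustrative.

rng = np.random.default_rng(2)
m = 6
A = rng.normal(size=(m, m))
r = rng.normal(size=m)
b0 = rng.normal(size=m)
alpha = 0.01
I = np.eye(m)
AtA = A.T @ A

# two explicit gradient steps
b1 = b0 - alpha * (A.T @ (A @ b0 - r))
b2 = b1 - alpha * (A.T @ (A @ b1 - r))

# one pre-computed double step
M = (I - alpha * AtA) @ (I - alpha * AtA)
c = alpha * (2 * I - alpha * AtA) @ (A.T @ r)
b2_pre = M @ b0 + c
```
</p><p>Pre-computing M and c halves the number of encrypted matrix-vector products per effective iteration.</p></div>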
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>This work was supported in part by NRF of Korea grants No. 2018R1D1A1A02085851, 2016R1A2B3014030 and 2017R1A5A1015626.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Homomorphic encryption for arithmetic of approximate numbers</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Cheon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Song</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ASIACRYPT 2017</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>This entry also covers: Cortes and Vapnik. 1995. Support-vector networks. Machine Learning 20:273-297; Gentry. 2009. A fully homomorphic encryption scheme. Ph.D. Dissertation, Stanford University. http://crypto.stanford.edu/craig</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Inductive ensemble clustering using kernel support matching</title>
		<author>
			<persName><forename type="first">S</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Electronics Letters</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="issue">25</biblScope>
			<biblScope unit="page" from="1625" to="1626" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Least squares support vector machine classifiers</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Suykens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vandewalle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural processing letters</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="293" to="300" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
