<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">An Adversarial Attacker for Neural Networks in Regression Problems</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Kavya</forename><surname>Gupta</surname></persName>
							<email>kavya.gupta100@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="department">CentraleSupélec</orgName>
								<orgName type="institution" key="instit1">Université Paris-Saclay</orgName>
								<orgName type="institution" key="instit2">Inria Centre de Vision Numérique</orgName>
								<address>
									<settlement>Gif-sur-Yvette</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="laboratory">Air Mobility Solutions BL</orgName>
								<orgName type="institution">Thales LAS France</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Beatrice</forename><surname>Pesquet-Popescu</surname></persName>
							<email>beatrice.pesquet@thalesgroup.com</email>
							<affiliation key="aff1">
								<orgName type="laboratory">Air Mobility Solutions BL</orgName>
								<orgName type="institution">Thales LAS France</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Fateh</forename><surname>Kaakai</surname></persName>
							<email>fateh.kaakai.e@thalesdigital.io</email>
							<affiliation key="aff1">
								<orgName type="laboratory">Air Mobility Solutions BL</orgName>
								<orgName type="institution">Thales LAS France</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jean-Christophe</forename><surname>Pesquet</surname></persName>
							<email>jean-christophe.pesquet@centralesupelec.fr</email>
							<affiliation key="aff0">
								<orgName type="department">CentraleSupélec</orgName>
								<orgName type="institution" key="instit1">Université Paris-Saclay</orgName>
								<orgName type="institution" key="instit2">Inria Centre de Vision Numérique</orgName>
								<address>
									<settlement>Gif-sur-Yvette</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Fragkiskos</forename><forename type="middle">D</forename><surname>Malliaros</surname></persName>
							<email>fragkiskos.malliaros@centralesupelec.fr</email>
							<affiliation key="aff0">
								<orgName type="department">CentraleSupélec</orgName>
								<orgName type="institution" key="instit1">Université Paris-Saclay</orgName>
								<orgName type="institution" key="instit2">Inria Centre de Vision Numérique</orgName>
								<address>
									<settlement>Gif-sur-Yvette</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">An Adversarial Attacker for Neural Networks in Regression Problems</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">C6EBAF79FE1C608F3888A4D2457EE333</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T23:16+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Adversarial attacks against neural networks and their defenses have been mostly investigated in classification scenarios. However, adversarial attacks in a regression setting remain understudied, although they play a critical role in a large portion of safety-critical applications. In this work, we present an adversarial attacker for regression tasks, derived from the algebraic properties of the Jacobian of the network. We show that our attacker successfully fools the neural network, and we measure its effectiveness in reducing the estimation performance. We present a white-box adversarial attacker to support engineers in designing safety-critical regression machine learning models. We present our results on various open-source and real industrial tabular datasets. In particular, the proposed adversarial attacker outperforms attackers based on random perturbations of the inputs. Our analysis relies on the quantification of the fooling error as well as various error metrics. A noteworthy feature of our attacker is that it allows us to optimally attack a subset of inputs, which may be helpful to analyse the sensitivity of some specific inputs.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Adversarial machine learning has received increased attention in the past decade. For all machine learning models, defense against adversarial attacks is important in terms of safety. Adversarial attacks in classification constitute malicious attempts to trick a classifier. They play a critical role in real-world application domains such as spam/malware detection, autonomous systems <ref type="bibr" target="#b2">[Huang and Wang, 2018]</ref>, <ref type="bibr" target="#b2">[Eykholt et al., 2018]</ref>, <ref type="bibr" target="#b4">[Ren et al., 2019]</ref>, medical systems <ref type="bibr" target="#b2">[Finlayson et al., 2018]</ref>, etc. Adversarial attacks create vulnerabilities in deployed models and must especially be taken into account when deploying security-critical AI applications. Despite the newfound interest of the research community in trustworthy and explainable AI, only a few works investigate adversaries in the case of regression tasks. Current advances in the adversarial machine learning field revolve around designing attacks and defenses, with a focus on the use of neural networks in image analysis and computer vision <ref type="bibr" target="#b2">[Goodfellow et al., 2014]</ref>, <ref type="bibr">[Kurakin et al., 2016]</ref>. Far fewer works concern tabular data. However, many machine learning tasks in industry rely on tabular data, e.g., fraud detection, product failure prediction, anti-money laundering, recommendation systems, click-through rate prediction, or flight arrival time prediction.</p><p>In this paper, we focus on generating adversarial attacks for neural networks in the specific scenario where i) a regression task is performed and ii) tabular data are employed. Our contributions are the following:</p><p>• We propose a simple, novel and flexible method for generating adversarial attacks for regression tasks (a white-box attack). 
• We show that the proposed attacker allows us to optimally attack any given subset of input features. • We explore various error metrics which are useful for analysing these adversarial attacks. • Our proposed adversarial attacker is generalised to an arbitrary ℓp norm on input and output perturbations.</p><p>• We evaluate our results on open-source regression datasets and an industrial dataset (output and input features described in Table <ref type="table" target="#tab_0">1</ref>) which lies in the domain of safety-critical applications.</p><p>In Section 2, we give a brief overview of existing works. In Section 3, we formulate the problem and present our method for generating adversarial examples in regression tasks. In Section 4, we perform numerical experiments on four datasets to demonstrate the effectiveness of the proposed attacker. Some concluding remarks are given in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The concept of adversarial attacks designed to fool DNNs was first proposed in <ref type="bibr" target="#b6">[Szegedy et al., 2013]</ref>. Adding a subtle perturbation to the input of a neural network produces an incorrect output, while the human eye cannot recognize the modification of the input data. Even though different models have different architectures and might use different training data, the same kind of adversarial attack strategies can be used to attack related models. These attacks pose a huge threat to the performance of DNNs. <ref type="bibr" target="#b6">[Szegedy et al., 2013]</ref> proposed L-BFGS to construct adversarial attacks, and since then there has been a plethora of works introducing various adversarial attacks and their defenses for DNNs.</p><p>[ <ref type="bibr" target="#b2">Goodfellow et al., 2014]</ref> proposed a simpler and faster method to construct adversarial attacks (FGSM). The generated images are misclassified by adding perturbations obtained by linearizing the cost function in the gradient direction. This is a non-iterative attack, hence it has a lower computational cost than the previous method. The Fast Gradient Sign Method (FGSM) is an ℓ∞-bounded attack and is often prone to label leaking.</p><p>It may be difficult for FGSM to control the perturbation level when constructing attacks. <ref type="bibr">[Kurakin et al., 2016]</ref> proposed an optimized FGSM, termed Iterative Gradient Sign Method (IGSM), which adds perturbations in multiple smaller steps and clips the results after each iteration, ensuring that the perturbations are restricted to the neighborhood of the example. <ref type="bibr" target="#b2">[Dong et al., 2018]</ref> added momentum to IGSM attacks. <ref type="bibr" target="#b4">[Papernot et al., 2016]</ref> proposed the Jacobian-based Saliency Map Attack (JSMA), which is based on the ℓ0 sparsity measure. 
The basic idea is to construct a saliency map with the gradients and model the gradients based on the impact of each pixel of the image.</p><p>[ <ref type="bibr">Moosavi-Dezfooli et al., 2016]</ref> proposed a non-targeted attack method based on the ℓ2-norm, called DeepFool. It tries to find the decision boundary that is the closest to the sample in the image space, and then uses this boundary to fool the classifier. FGSM, JSMA, and DeepFool are designed to generate adversarial attacks corresponding to a single image to fool the trained classifier model. <ref type="bibr" target="#b4">[Moosavi-Dezfooli et al., 2017]</ref> proposed a universal image-agnostic perturbation attack method which fools the classifier by adding a single perturbation to all images in the dataset. <ref type="bibr" target="#b0">[Carlini and Wagner, 2017]</ref> proposed a powerful attack based on L-BFGS. The attack can be generated according to the ℓ1, ℓ2, and ℓ∞ norms and can be targeted or non-targeted. <ref type="bibr" target="#b2">[Liu et al., 2016]</ref> proposed an ensemble attack method combining multiple models to construct adversarial attacks. <ref type="bibr" target="#b5">[Rony et al., 2020]</ref> proposed a method to generate minimally perturbed adversarial examples based on an Augmented Lagrangian approach for various distance metrics. In <ref type="bibr" target="#b0">[Balda et al., 2018]</ref>, the authors propose a general framework for the generation of adversarial examples in both classification and regression tasks for applications in the image domain. Similar to our proposed approach, the technique is based on the Jacobian of the neural network. Most of the methods in the literature about adversarial example generation belong to the class of white-box attackers, i.e., the attacker has access to the information related to the trained neural network model, including the model architecture and its parameters. A black-box attacker is introduced in <ref type="bibr" target="#b6">[Su et al., 2019]</ref>. 
Such attackers do not know the model but can interact with it. A variant of the black-box attack is the grey-box attack, where the attacker has only limited information about the model. To the best of our knowledge, the only work dealing with adversarial attacks in a white-box setting for tabular data has been proposed in <ref type="bibr" target="#b0">[Ballet et al., 2019]</ref>, and this work handles only classification tasks.</p><p>In regression tasks there are no natural margins as in the case of classification tasks, and adversarial learning in a regression setting is hindered by the difficulty of defining the adversarial attack, its success, and the evaluation metrics. Despite the number of works on adversarial attack generation, there are few articles dealing with regression tasks. <ref type="bibr" target="#b7">[Tong et al., 2018]</ref> looked at adversarial attacks in the setting of an ensemble of multiple learners, investigating the interactions between these linear learners and an attacker in a regression setting, modeled as a Multi-Learner Stackelberg Game (MLSG). However, the investigated linear case is not able to capture the larger class of non-linear models. Focusing only on specific regression applications is common. <ref type="bibr" target="#b2">[Ghafouri et al., 2018]</ref> examined an important problem: selecting an optimal threshold for each sensor against an adversary for regression tasks in cyber-physical systems. <ref type="bibr" target="#b1">[Deng et al., 2020]</ref> introduced the concept of an adversarial threshold, which is related to the deviation between the original prediction and the prediction for an adversarial example, i.e., an acceptable error range in driving models. In a regression context, <ref type="bibr" target="#b4">[Nguyen and Raff, 2018]</ref> introduced a defense that is generically useful to reduce the effectiveness of adversarial attacks. 
They consider adversarial attacks as a potential symptom of numerical instability in the learned function. In the next section, we propose a general white-box adversarial attacker based on the Jacobian of the learned function for regression tasks in the tabular data domain.</p><p>3 Proposed Method</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Objective</head><p>The problem of adversarial attacks is closely related to the robustness issue for a neural network, i.e., its sensitivity to perturbations. Let T : R N0 → R Nm be the considered neural network having N 0 scalar inputs and N m scalar outputs. If x ∈ R N0 is a given vector of inputs for some data for which y is the associated target output, the network has been trained to produce an output T (x) close to y. If the input is now perturbed by an additive vector e ∈ R N0 , the perturbed output is T (x + e). Attacking the network then amounts to finding a perturbation e of preset magnitude which makes the output of the network deviate maximally from a reference output. This reference output may be the model output T (x) or the ground-truth output y. Since our purpose is to develop an approach which remains efficient even if the accuracy of the network is not very high, we choose y as the reference output when available. In this context, the measures of deviation and of magnitude of the perturbation play an important role in the mathematical formulation of the problem. As a standard choice, the measure of perturbation magnitude will here be an ℓp-norm where p ∈ [1, +∞]. For measuring the output deviation, we will similarly consider an ℓq-norm where q ∈ [1, +∞]. It must be emphasized that this choice makes sense when dealing with regression problems. In this context, the ℓ2 or ℓ1 norms are indeed frequently used as loss functions for training. On the other hand, the ℓ∞ norm is also a popular measure when dealing with reliability issues.</p></div>
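As a small numerical illustration of the objects just introduced, the sketch below builds a toy two-layer ReLU network T : R^3 → R^2 with random placeholder weights (an assumption for illustration, not one of the paper's trained models) and computes its Jacobian J(x) analytically, checking the first-order behaviour of T around x:

```python
import numpy as np

# A toy two-layer ReLU network T : R^3 -> R^2 with random placeholder
# weights (NOT one of the paper's trained models), and its Jacobian J(x)
# obtained analytically via the chain rule.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((2, 5)), rng.standard_normal(2)

def T(x):
    """Forward pass of the network."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def jacobian(x):
    """J(x) = W2 diag(1{W1 x + b1 > 0}) W1 (ReLU derivative in the middle)."""
    active = (W1 @ x + b1 > 0.0).astype(float)
    return W2 @ (active[:, None] * W1)

x = rng.standard_normal(3)
J = jacobian(x)

# First-order behaviour: for a small perturbation e that flips no ReLU,
# the deviation of the output is exactly J(x) e.
e = 1e-6 * rng.standard_normal(3)
deviation = T(x + e) - T(x)
```

In a practical implementation the Jacobian would be obtained by automatic differentiation (back-propagation), as noted in Section 3.3.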
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Optimization formulation</head><p>In the described setting, the design of the attacker can be formulated as the problem of finding the "worst perturbation" e such that e ∈ Argmax e∈C p,δ</p><formula xml:id="formula_0">‖T (x + e) − y‖ q ,<label>(1)</label></formula><p>where C p,δ is the closed and convex set defined as</p><formula xml:id="formula_1">C p,δ = {e ∈ R N0 | ‖Σ −1/2 e‖ p ≤ δ}.<label>(2)</label></formula><p>Here, Σ ∈ R N0×N0 is a symmetric positive definite matrix, δ is a parameter which controls the maximum allowed perturbation, and Σ is a weighting matrix typically corresponding to the covariance matrix of the inputs. For instance, if we assume that it is a diagonal matrix, it simply introduces a normalization of the perturbation components with respect to the standard deviations of the associated inputs.</p><p>For standard choices of activation functions, T is a continuous function. By virtue of the Weierstrass theorem, the existence of a solution (not necessarily unique) to Problem (1) is then ensured. Although C p,δ is a relatively simple convex set, this problem appears as a difficult non-convex problem due to the fact that i) T is a complex nonlinear operator, and ii) we maximize an ℓq measure which, in addition, leads to a nonsmooth cost function when q = 1 or q = +∞. A further difficulty is that we usually need to attack a large dataset to evaluate the robustness of a network, and the provided optimization algorithm should therefore be fast.</p></div>
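The constraint set (2) can be illustrated numerically. In this sketch Σ is taken diagonal with hypothetical per-feature variances (chosen purely for illustration), so the change of variables e = δ Σ^(1/2) ẽ maps any unit-norm ẽ onto the boundary of C_{p,δ}:

```python
import numpy as np

# Membership in C_{p,delta} of Eq. (2), with a hypothetical diagonal Sigma
# (per-feature variances chosen for illustration only).
delta = 0.1
variances = np.array([4.0, 1.0, 0.25])
Sigma_sqrt = np.diag(np.sqrt(variances))
Sigma_inv_sqrt = np.diag(1.0 / np.sqrt(variances))

def in_C(e, p=2):
    """Test whether ||Sigma^{-1/2} e||_p <= delta."""
    return np.linalg.norm(Sigma_inv_sqrt @ e, ord=p) <= delta + 1e-12

# The change of variables e = delta * Sigma^{1/2} e_tilde maps a unit-norm
# e_tilde to a point on the boundary of C_{p,delta}.
e_tilde = np.array([0.6, -0.8, 0.0])          # unit l2 norm
e = delta * Sigma_sqrt @ e_tilde
weighted_norm = np.linalg.norm(Sigma_inv_sqrt @ e)
```

This change of variables is the one used in Section 3.3 to reduce the attack design to a problem over the unit ℓp ball.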
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Algorithm</head><p>We propose to implement a two-step approach.</p><p>• Step 1. We first perform a linearization based on the following first-order Taylor expansion:</p><formula xml:id="formula_2">T (x + e) ≈ T (x) + J(x)e,<label>(3)</label></formula><p>where J(x) ∈ R Nm×N0 is the Jacobian of the network at x<ref type="foot" target="#foot_0">1</ref> . Note that J(x) can be computed by classical back-propagation techniques. We will make a second approximation, that is, y ≈ T (x). Based on these two approximations and after the variable change ẽ = δ −1 Σ −1/2 e, Problem (1) simplifies to maximizing over ẽ ∈ B p the cost ‖J(x)Σ 1/2 ẽ‖ q , (4)</p><p>where B p is the closed ℓp ball centered at 0 and with unit radius. Note that the optimal cost value in (4) is the subordinate norm of matrix J(x)Σ 1/2 when the input space is equipped with the ℓp norm and the output space with the ℓq one. We recall that this subordinate norm is defined, for every matrix M ∈ R Nm×N0 , as</p><formula xml:id="formula_3">‖M‖ p,q = sup z∈R N0 \{0} ‖M z‖ q / ‖z‖ p .<label>(5)</label></formula><p>Problem ( <ref type="formula">4</ref>) is thus equivalent to finding a vector ẽ for which the value of the cost function is equal to ‖J(x)Σ 1/2 ‖ p,q . For the values of (p, q) listed below, such a vector has an explicit form.</p><p>-If p = q = 2, ẽ is any unit ℓ2-norm eigenvector of Σ 1/2 J(x) ⊤ J(x)Σ 1/2 associated with the maximum eigenvalue of this matrix. This vector can be computed by performing a singular value decomposition of J(x)Σ 1/2 . -If p = 2 and q = +∞, ẽ is any unit ℓ2-norm vector collinear with a row of J(x)Σ 1/2 having maximum ℓ2 norm. -If p = +∞ and q = +∞, ẽ is a unit-norm vector whose elements are equal to (ε (i) ) 1≤i≤N0 where, for every i ∈ {1, . . 
, N 0 }, ε (i) ∈ {−1, 0, 1} is the sign of the i-th element of a row of J(x)Σ 1/2 with maximum ℓ1 norm.</p><p>-If p = 1 and q = 1, ẽ is a vector which has only one nonzero component, equal to ±1; the index of this component corresponds to the column of J(x)Σ 1/2 with maximum ℓ1 norm. -If p = 1 and q = 2, ẽ is a vector with only one nonzero component, equal to ±1. The index of this component corresponds to a column of J(x)Σ 1/2 with maximum ℓ2 norm. -If p = 1 and q = +∞, ẽ is again a vector with only one nonzero component, equal to ±1. The index of this component corresponds to the column of J(x)Σ 1/2 where an element of maximum absolute value is located.</p><p>•</p><p>Step 2. In the previous optimization step, the optimal solution is not unique. Indeed, if ẽ is a solution to Problem (4), then −ẽ is also a solution. In addition, there may exist other reasons for the multiplicity of the solutions. For example, there may be several maximum-norm rows in matrix J(x)Σ 1/2 . Among all the possible choices, we propose to choose the solution e = δΣ 1/2 ẽ leading to the maximum deviation w.r.t. the ground truth, that is, such that ‖T (x + e) − y‖ q is maximum. This requires performing a search over a small number of possible candidates. Note that no approximation error is involved in this step. If the ground truth for the output is not available, it can be replaced by the model output.</p><p>• Post-optimization. If 1 &lt; q &lt; +∞ and T is assumed to be differentiable, e → ‖T (x + e) − y‖ q q is a differentiable function. A further refinement consists of maximizing this function over C p,δ by using a projected gradient algorithm with an Armijo search for the stepsize. The previous estimates of e can then be used to initialize the algorithm. According to our numerical tests, implementing this strategy when q = 2 only brings a marginal improvement. Moreover, this approach cannot be used when q = 1 or q = +∞.</p></div>
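The closed-form maximizers above can be checked numerically. The sketch below uses an assumed toy Jacobian (random, not one of the paper's models) and verifies the p = q = 2 and p = 1, q = 2 cases, followed by the sign selection of Step 2 with a linear stand-in for the network:

```python
import numpy as np

# Step 1 closed forms on an assumed toy Jacobian (NOT one of the paper's
# models). M = J(x) Sigma^{1/2}; the optimal value of (4) is ||M||_{p,q}.
rng = np.random.default_rng(1)
J = rng.standard_normal((2, 4))                     # hypothetical Jacobian at x
Sigma_sqrt = np.diag([1.0, 2.0, 0.5, 1.5])
M = J @ Sigma_sqrt

# p = q = 2: the top right singular vector of M attains the spectral norm.
U, s, Vt = np.linalg.svd(M)
e2 = Vt[0]
attained_22 = np.linalg.norm(M @ e2)                # equals s[0]

# p = 1, q = 2: a single +-1 at the column of M with maximum l2 norm.
col_norms = np.linalg.norm(M, axis=0)
e1 = np.zeros(4)
e1[np.argmax(col_norms)] = 1.0
attained_12 = np.linalg.norm(M @ e1)                # equals max column norm

# Step 2: the sign of the solution is free; keep the candidate deviating
# most from the reference output y.
delta, x, y = 0.1, rng.standard_normal(4), rng.standard_normal(2)
T = lambda v: J @ v                                 # linear stand-in for the network
candidates = [delta * Sigma_sqrt @ e2, -delta * Sigma_sqrt @ e2]
e_best = max(candidates, key=lambda c: np.linalg.norm(T(x + c) - y))
```

For a linear map, choosing the better of ±e always deviates at least as much from y as the unattacked output, which mirrors the role of Step 2.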
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Attacking a group of inputs</head><p>It can also be interesting to attack only a selected subset of inputs. This may help in identifying the most sensitive inputs of the network. Also, for some inputs, like unsorted categorical ones, attacks are often meaningless since they introduce a major change in the informative content of the dataset, which can be easily detected. Our proposed approach can be adapted to generate such partial attacks. In Problem (4), it is indeed sufficient to replace matrix Σ 1/2 by DΣ 1/2 D, where D is a diagonal masking matrix whose diagonal elements are equal to 1 when the input is attacked and 0 otherwise. The optimal solutions ẽ and e = δDΣ 1/2 Dẽ = δDΣ 1/2 ẽ then have their components equal to 0 for the non-attacked inputs. Note that the naive approach, which would consist of solving (4) and setting to zero the resulting perturbation components for the non-attacked inputs, would be suboptimal.</p></div>
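A minimal sketch of this partial attack (p = q = 2, Σ = I, and a random assumed Jacobian, all chosen for illustration) shows both properties: the optimal masked direction is supported on the attacked inputs only, and it is at least as effective as the naive mask-then-renormalize approach:

```python
import numpy as np

# Partial attack sketch (p = q = 2, Sigma = I, toy random Jacobian assumed
# for illustration): input 2 is excluded from the attack via the diagonal
# masking matrix D.
rng = np.random.default_rng(2)
J = rng.standard_normal((2, 4))
mask = np.array([1.0, 1.0, 0.0, 1.0])
D = np.diag(mask)

M_full = J                                   # J Sigma^{1/2} with Sigma = I
M_masked = J @ D @ D                         # J (D Sigma^{1/2} D) with Sigma = I

e_masked = np.linalg.svd(M_masked)[2][0]     # optimal masked direction
e_full = np.linalg.svd(M_full)[2][0]

# Naive alternative: solve the full problem, zero the non-attacked
# component, and renormalize back to the perturbation budget.
naive = e_full * mask
naive /= np.linalg.norm(naive)

gain_opt = np.linalg.norm(M_masked @ e_masked)
gain_naive = np.linalg.norm(M_masked @ naive)
```

Since the naive candidate is feasible for the masked problem, the optimal masked solution can only do better, which is exactly the suboptimality remark above.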
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Numerical Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Dataset and architecture description</head><p>Open Source Datasets We run our experiments on three open-source regression datasets. The Combined Cycle Power Plant <ref type="bibr" target="#b8">[Tüfekci, 2014]</ref> dataset has 4 features with 9,568 instances. The task is to predict the net hourly electrical energy output using hourly average ambient variables. The Red Wine Quality dataset <ref type="bibr" target="#b1">[Cortez et al., 2009]</ref> contains 1,599 samples in total, and each instance has 11 features. The features are physicochemical and sensory measurements for wine. The output variable is a quality score ranging from 0 to 10, where 10 represents the best quality and 0 the lowest. For the Abalone dataset, the task is to model an abalone's age based purely on its physical measurements, which would allow estimating an abalone's age without cutting its shell. There are in total 4,177 instances with 8 input variables, including one categorical variable. The datasets are divided with a ratio of 4:1 between training and testing data. The categorical attributes are dealt with by using one-hot encoding based on the number of categories. The input attributes are normalised by removing their mean and scaling to unit variance.</p><p>We train fully connected networks for the estimation of variables from the datasets. The network architectures for the datasets are given below. The values represent the number of hidden neurons in the layers. 
The activation function at each layer is ReLU, except for the last layer.</p><p>• Combined Cycle Power Plant dataset -(10, 6, 1)</p><p>Industrial Dataset The description of the input and output variables of the industrial dataset is given in Table <ref type="table" target="#tab_0">1</ref>. The variable to be predicted is the Estimation of Arrival time (ETE) of a flight, given variables including the distance and speed, and also an initial estimate of the ETE. The dataset is related to flight control, an activity area where safety is critical. The input attributes are normalized by removing their mean and scaling to unit variance. For the models, we build fully connected networks with a ReLU activation function on all the hidden layers except the last one. The network architecture is shown in Figure <ref type="figure" target="#fig_0">1</ref>.</p><formula xml:id="formula_4">MAE = 1/K ∑ K k=1 ‖T (x k + e k ) − y k ‖ q , Fooling Error E = 1/K ∑ K k=1 ‖T (x k + e k ) − T (x k )‖ q , Symmetric Mean Accuracy Percentage Error SMAPE = 2/K + ∑ K+ k=1 (‖T (x k + e k ) − y k ‖ q − ‖T (x k ) − y k ‖ q ) / (‖T (x k + e k ) − y k ‖ q + ‖T (x k ) − y k ‖ q )</formula></div>
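The three error metrics above (those of Table 2) can be sketched as follows, taking q = 2; the arrays below are illustrative placeholders, not the paper's datasets:

```python
import numpy as np

# MAE, fooling error E, and SMAPE of Table 2 (q = 2 shown).
def metrics(T_adv, T_clean, y, q=2):
    """Return (MAE, fooling error E, SMAPE) over a batch of K samples."""
    d_adv = np.linalg.norm(T_adv - y, ord=q, axis=1)       # ||T(x_k+e_k) - y_k||_q
    d_clean = np.linalg.norm(T_clean - y, ord=q, axis=1)   # ||T(x_k) - y_k||_q
    mae = d_adv.mean()
    fooling = np.linalg.norm(T_adv - T_clean, ord=q, axis=1).mean()
    ratio = (d_adv - d_clean) / (d_adv + d_clean)
    smape = 2.0 * ratio[ratio > 0].mean()                  # K+ positive terms only
    return mae, fooling, smape

# Placeholder outputs: clean predictions near y, adversarial ones further away.
rng = np.random.default_rng(3)
y = rng.standard_normal((100, 2))
T_clean = y + 0.1 * rng.standard_normal((100, 2))
T_adv = T_clean + 0.3 * rng.standard_normal((100, 2))
mae, fooling, smape = metrics(T_adv, T_clean, y)
```

As in Table 2, the SMAPE mean is restricted to the K+ samples where the attack increases the error, so its value lies in (0, 2].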
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Experimental setup</head><p>We first train our networks without any constraints using the network architecture presented in the previous section, with the aim of reducing the prediction/performance loss on the training dataset. This will be referred to as the standard training procedure.</p><p>To understand and analyze the performance of the proposed adversarial attacker, we calculate the three error metrics described in Table <ref type="table" target="#tab_2">2</ref>. We compare the proposed adversarial attacker with random noise attackers generated by i.i.d. perturbations. We use three additive noise distributions (Gaussian, uniform, and binary) for comparison. The outputs of these attackers have been normalized so as to meet the desired bound on the norm of the perturbation. The metrics are computed on the test samples, where K is the total number of samples in the test set. The results on the 4 datasets for varying noise levels are shown in Table <ref type="table" target="#tab_4">3</ref>. We also show the histograms of (‖T (x k +e k )−y k ‖ q − ‖T (x k )−y k ‖ q ) 1≤k≤K in Figures <ref type="figure">2, 3, 4, and 5</ref>, where (e k ) 1≤k≤K have been generated from various noise distributions and the proposed adversarial attacker.</p><p>For safety-critical tasks, Lipschitz and performance targets can be specified as engineering requirements prior to network training. Such a design approach has proven to make the network more stable and robust to adversarial attacks. Imposing a Lipschitz target can be done either by controlling the Lipschitz constant for each layer or for the whole network, depending on the application at hand. 
One such method for controlling the Lipschitz constant has been presented in <ref type="bibr" target="#b5">[Serrurier et al., 2020]</ref> using Hinge regularization. In the experiments, we train our networks while using a spectral normalisation technique <ref type="bibr" target="#b2">[Miyato et al., 2018]</ref> which has been proven to be very effective in controlling Lipschitz properties in GANs.</p><p>Table <ref type="table">6</ref>: Comparison on industrial dataset for ℓ2, ℓ1 and ℓ∞ attacks with variation in perturbation levels.</p><p>Given an m-layer fully connected architecture and a Lipschitz target L, we can constrain the spectral norm of each layer to be less than L^(1/m), the m-th root of L. This ensures that the upper bound on the global Lipschitz constant is less than L. We keep the network architectures exactly the same for both training procedures. 
The performance of the adversarial attacker on the standard and spectrally normalized trained models, in terms of Fooling Error (E) and Symmetric Mean Accuracy Percentage Error (SMAPE), for various datasets and varying perturbation magnitudes, is given in Table <ref type="table" target="#tab_5">4</ref>.</p><p>All the previous results have been obtained with attack and noise addition on all the input features present in the datasets. As pointed out in Section 3.4, the introduced adversarial attacker is capable of attacking a group of inputs. While generating an adversarial attack, we avoid attacking the categorical input variables <ref type="bibr" target="#b0">[Ballet et al., 2019]</ref>; hence, in the Abalone and industrial datasets, we attack only the continuous variables. For the Combined Cycle Power Plant dataset, which does not contain any categorical variables, we attack 3 out of 4 continuous variables. Similarly, for the Red Wine dataset we attack 8 continuous variables out of 11. The performance of the adversarial attacker, when attacking only a few inputs, is shown in Table <ref type="table" target="#tab_6">5</ref>.</p><p>As emphasized in Section 3.3, our adversarial attacker is applicable to various measures of the input perturbation and output deviation. The previous results have been obtained for the value p = q = 2, termed ℓ2 attacks here. We further show results for p = q = 1, termed ℓ1 attacks, and for p = q = +∞, termed ℓ∞ attacks, in Table <ref type="table">6</ref>.</p></div>
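The spectral-norm constraint described in this section can be sketched as follows. This is a post-hoc rescaling of assumed random weight matrices, shown only to illustrate the bound; the paper enforces the constraint during training via spectral normalisation:

```python
import numpy as np

# With an m-layer network and Lipschitz target L, capping each layer's
# spectral norm at L^(1/m) bounds the product of layer norms (an upper
# bound on the network's global Lipschitz constant) by L. The weights
# below are random placeholders, not a trained model.
def cap_spectral_norms(weights, L):
    m = len(weights)
    cap = L ** (1.0 / m)                      # m-th root of L
    capped = []
    for W in weights:
        s = np.linalg.norm(W, ord=2)          # largest singular value
        capped.append(W if s <= cap else W * (cap / s))
    return capped

rng = np.random.default_rng(4)
Ws = [rng.standard_normal((8, 4)), rng.standard_normal((8, 8)),
      rng.standard_normal((1, 8))]
Ws_capped = cap_spectral_norms(Ws, L=2.0)
lipschitz_bound = np.prod([np.linalg.norm(W, ord=2) for W in Ws_capped])
```

Since 1-Lipschitz activations such as ReLU do not increase the bound, the product of the capped layer norms upper-bounds the Lipschitz constant of the full network.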
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Result analysis</head><p>Some general conclusions can be drawn from the experiments.</p><p>• We observe that the proposed adversarial attacker performs better than all three random noise attackers for the three quantitative measures we have defined. In addition, the histograms in Figures 2, 3, 4, and 5 show that the error may be increased or reduced by random attackers, while this shortcoming does not happen with our adversarial attacker. This observation is verified for the ℓ2, ℓ1 and ℓ∞ norms in Table <ref type="table">6</ref>. • Spectral normalisation has been proven to robustify the trained models. As shown in Table <ref type="table" target="#tab_5">4</ref>, we see that the Fooling Error (E) and SMAPE are reduced in all the cases when compared to the standard trained model.</p><p>• In the considered examples, we observe that categorical data have little effect when attacking the trained model, as shown in Table <ref type="table" target="#tab_6">5</ref>. The E and SMAPE measures do not show major differences.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this article, we have introduced a novel, easily implementable, Jacobian-based adversarial attacker for estimation problems. These regression tasks cover a major portion of safety-critical applications. Yet there is a lack of works studying and analysing adversarial attacks in this context, as opposed to classification tasks. The present study contributes to filling this gap. We have presented error metrics which help in analysing the effectiveness of the attacker. Our attacker is versatile in the sense that it can handle any measure (ℓ1, ℓ2, ℓ∞) on input or output perturbations, according to the target application. Our attacker is also successful in handling attacks focused on subsets of inputs. This feature may be useful when handling specific tabular datasets, and may also be insightful when information is available related to the sensitivity or controllability of some inputs. Our tests concentrated on fully connected networks, but it is worth pointing out that the proposed approach can be applied to any network architecture.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Network Architecture.</figDesc><graphic coords="4,99.24,67.53,420.94,185.54" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Input and output variables description for Industrial dataset -A safety critical application.</figDesc><table><row><cell></cell><cell>0</cell><cell>Speed</cell><cell></cell></row><row><cell></cell><cell>1</cell><cell>Flight Distance</cell><cell></cell></row><row><cell></cell><cell>2</cell><cell>Departure Delay</cell><cell></cell></row><row><cell></cell><cell>3</cell><cell>Initial ETE</cell><cell></cell></row><row><cell></cell><cell>4</cell><cell>Latitude Origin</cell><cell></cell></row><row><cell></cell><cell>5</cell><cell>Longitude Origin</cell><cell>continuous</cell></row><row><cell>Input</cell><cell>6 7</cell><cell>Altitude Origin Latitude Destination</cell><cell></cell></row><row><cell></cell><cell>8</cell><cell>Longitude Destination</cell><cell></cell></row><row><cell></cell><cell>9</cell><cell>Altitude Destination</cell><cell></cell></row><row><cell></cell><cell>10</cell><cell>Arrival Time Slot</cell><cell>7 slots (categorical)</cell></row><row><cell></cell><cell>11</cell><cell>Departure Time Slot</cell><cell>7 slots (categorical)</cell></row><row><cell></cell><cell>12</cell><cell>Aircraft Category</cell><cell>6 classes (categorical)</cell></row><row><cell></cell><cell>13</cell><cell>Airline Company</cell><cell>19 classes (categorical)</cell></row><row><cell>Output</cell><cell>3</cell><cell>Refinement ETE</cell><cell>continuous</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 :</head><label>2</label><figDesc>Error metrics used for evaluation. The mean value computed for SMAPE is restricted to the K+ elements of the summation that are positive. e_k is the error generated by the adversarial attacker on the k-th sample in a dataset of K samples.</figDesc><table><row><cell>Noise</cell><cell>MAEstd</cell><cell>MAEgauss</cell><cell>MAEuni</cell><cell>MAEbin</cell><cell>MAE adv</cell><cell>Egauss</cell><cell>Euni</cell><cell>Ebin</cell><cell>Eadv</cell><cell>SMAPEgauss SMAPEuni SMAPE bin SMAPEadv</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell cols="3">Combined Cycle Power Plant Dataset</cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="2">1 × 10 −1 6.4 × 10 −3</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3 :</head><label>3</label><figDesc>Comparison of evaluation metrics: random attacker vs. proposed adversarial attacker, with varying perturbation level (ℓ2 attack).</figDesc><table><row><cell>Noise</cell><cell>E adv</cell><cell>Espec</cell><cell>SMAPE adv</cell><cell>SMAPEspec</cell></row><row><cell></cell><cell cols="3">Combined Cycle Power Plant Dataset</cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell>4.0 × 10 −3</cell><cell>3.9 × 10 −3</cell><cell>0.62</cell><cell>0.60</cell></row><row><cell>2 × 10 −1</cell><cell>8.0 × 10 −3</cell><cell>7.7 × 10 −3</cell><cell>0.87</cell><cell>0.84</cell></row><row><cell></cell><cell cols="2">Red Wine Quality Dataset</cell><cell></cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell>0.12</cell><cell>0.017</cell><cell>0.41</cell><cell>0.12</cell></row><row><cell>2 × 10 −1</cell><cell>0.21</cell><cell>0.03</cell><cell>0.56</cell><cell>0.17</cell></row><row><cell></cell><cell></cell><cell>Abalone age dataset</cell><cell></cell><cell></cell></row><row><cell>5 × 10 −2</cell><cell>0.36</cell><cell>0.11</cell><cell>0.38</cell><cell>0.12</cell></row><row><cell>1 × 10 −1</cell><cell>0.72</cell><cell>0.23</cell><cell>0.58</cell><cell>0.21</cell></row><row><cell></cell><cell></cell><cell>Industrial Dataset</cell><cell></cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell>11.8 × 10 −3</cell><cell>9.1 × 10 −3</cell><cell>0.91</cell><cell>0.49</cell></row><row><cell>2 × 10 −1</cell><cell>24.0 × 10 −3</cell><cell>17.5 × 10 −3</cell><cell>1.24</cell><cell>0.72</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4 :</head><label>4</label><figDesc>Results of the proposed adversarial techniques: standard training vs. spectral normalisation training, under ℓ2 attacks.</figDesc><table><row><cell>Noise</cell><cell>E adv</cell><cell>E inp</cell><cell>SMAPE adv</cell><cell>SMAPE inp</cell></row><row><cell></cell><cell cols="3">Combined Cycle Power Plant Dataset</cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell>4.0 × 10 −3</cell><cell>3.4 × 10 −3</cell><cell>0.62</cell><cell>0.58</cell></row><row><cell>2 × 10 −1</cell><cell>8.0 × 10 −3</cell><cell>7.0 × 10 −3</cell><cell>0.87</cell><cell>0.82</cell></row><row><cell></cell><cell cols="2">Red Wine Quality Dataset</cell><cell></cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell>0.12</cell><cell>0.13</cell><cell>0.41</cell><cell>0.44</cell></row><row><cell>2 × 10 −1</cell><cell>0.21</cell><cell>0.22</cell><cell>0.56</cell><cell>0.60</cell></row><row><cell></cell><cell></cell><cell>Abalone age dataset</cell><cell></cell><cell></cell></row><row><cell>5 × 10 −2</cell><cell>0.36</cell><cell>0.36</cell><cell>0.38</cell><cell>0.38</cell></row><row><cell>1 × 10 −1</cell><cell>0.72</cell><cell>0.71</cell><cell>0.58</cell><cell>0.59</cell></row><row><cell></cell><cell></cell><cell>Industrial Dataset</cell><cell></cell><cell></cell></row><row><cell>1 × 10 −1</cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 5 :</head><label>5</label><figDesc>Results of the proposed adversarial techniques: standard training attacking all inputs vs. standard training attacking a subset of inputs, under ℓ2 attacks.</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">We assume that J(x) is defined at x; see [Bolte and Pauwels, 2020] for a justification of this assumption in the nonsmooth case.</note>
		</body>
		<back>
		</back>
	</text>
</TEI>
