==Vulnerability of Machine Learning Models to Adversarial Examples==
ITAT 2016 Proceedings, CEUR Workshop Proceedings Vol. 1649, pp. 187–194, http://ceur-ws.org/Vol-1649, Series ISSN 1613-0073, © 2016 P. Vidnerová, R. Neruda

Petra Vidnerová, Roman Neruda
Institute of Computer Science, Academy of Sciences of the Czech Republic
petra@cs.cas.cz

Abstract: We propose a genetic algorithm for generating adversarial examples for machine learning models. Such an approach is able to find adversarial examples without access to the model's parameters. Different models are tested, including both deep and shallow neural network architectures. We show that RBF networks and SVMs with Gaussian kernels tend to be rather robust and not prone to misclassification of adversarial examples.

===1 Introduction===

Deep networks and convolutional neural networks enjoy high interest nowadays. They have become the state-of-the-art methods in many fields of machine learning, and have been applied to various problems, including image recognition, speech recognition, and natural language processing [5].

In [12] a counter-intuitive property of deep networks is described. It relates to the stability of a neural network with respect to small perturbations of its inputs. The paper shows that by applying an imperceptible non-random perturbation to an input image, it is possible to arbitrarily change the network prediction. These perturbations are found by optimizing the input to maximize the prediction error. Such perturbed examples are known as adversarial examples. On some datasets, such as ImageNet, the adversarial examples were so close to the original examples that the differences were indistinguishable to the human eye.

Paper [4] suggests that linear behaviour in high-dimensional spaces is sufficient to cause adversarial examples (for example, a linear classifier exhibits this behaviour too). The authors designed a fast method of generating adversarial examples (adding a small vector in the direction of the sign of the gradient) and showed that adding these examples to the training set further improves the generalization of the model. In addition, the authors of [4] state that adversarial examples are relatively robust, and that they generalize between neural networks with a varied number of layers, different activations, or trained on different subsets of the training data. In other words, if we use one neural network to generate a set of adversarial examples, these examples are also misclassified by another neural network even when it was trained with different hyperparameters, or when it was trained on a different set of examples. Further results on fooling deep and convolutional networks can be found in [10].

This paper examines the vulnerability to adversarial examples across a variety of machine learning methods. We propose a genetic algorithm for generating adversarial examples. Though the evolution is slower than the techniques described in [12, 4], it enables us to obtain adversarial examples even without access to the model's weights. The only thing we need is to be able to query the network to classify a given example. From this point of view, the misclassification of adversarial examples represents a security flaw.

This paper is organized as follows. Section 2 brings a brief overview of the machine learning models considered in this paper. Section 3 describes the proposed genetic algorithm. Section 4 describes the results of our experiments. Finally, Section 5 concludes our paper.

===2 Deep and Shallow Architectures===

====2.1 Deep and Convolutional Networks====

Deep neural networks are feedforward neural networks with multiple hidden layers between the input and output layer. The layers typically have different units depending on the task at hand. Among the units, there are traditional perceptrons, where each unit (neuron) realizes a nonlinear function, such as the sigmoid function: $y(z) = \tanh(z)$ or $y(z) = \frac{1}{1 + e^{-z}}$. An alternative to the perceptron is the rectified linear unit (ReLU): $y(z) = \max(0, z)$. Like sigmoid neurons, networks of rectified linear units can approximate a very broad class of functions, and they can be trained using algorithms such as back-propagation and stochastic gradient descent.

Convolutional layers contain the so-called convolutional units that take advantage of the grid-like structure of the inputs, such as in the case of 2-D bitmap images, time series, etc. Convolutional units perform a simple discrete convolution operation, which, for 2-D data, can be represented by a matrix multiplication. Usually, to deal with large data (such as large images), the convolution is applied multiple times by sliding a small window over the data. The convolutional units are typically used to extract some features from the data, and they are often used together with the so-called max pooling layers that perform an input reduction by selecting one of many inputs, typically the one with maximal value. Thus, the overall architecture of a deep network for image classification tasks resembles a pyramid with a smaller number of units in higher layers of the network.

For the output layer, mainly for classification tasks, the softmax function $y(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$ is often used. It has the advantage that the output values can be interpreted as probabilities of individual classes.

Networks with at least one convolutional layer are called convolutional neural networks (CNN), while networks with all hidden layers consisting of perceptrons are called multi-layer perceptrons (MLP).

====2.2 RBF networks and Kernel Methods====

The history of radial basis function (RBF) networks can be traced back to the 1980s, particularly to the study of interpolation problems in numerical analysis [8]. The RBF network [3] is a feedforward network with one hidden layer realizing the basis functions and a linear output layer. It represents an alternative to classical models, such as multi-layer perceptrons. There is a variety of learning methods for RBF networks [9].

In the 1990s, a family of machine learning algorithms known as kernel methods became very popular. They have been applied to a number of real-world problems, and they are still considered to be state-of-the-art methods in various domains [14].

Based on theoretical results on kernel approximation, the popular support vector machine (SVM) [2, 13] algorithm was developed. Its architecture is similar to RBF: one hidden layer of kernel units and a linear output layer. The learning algorithm is different, based on a search for a separating hyperplane with the highest margin. Common kernel functions used for SVM learning are linear $\langle x, x' \rangle$, polynomial $(\gamma \langle x, x' \rangle + r)^d$, Gaussian $\exp(-\gamma \|x - x'\|^2)$, and sigmoid $\tanh(\gamma \langle x, x' \rangle + r)$.

Recently, due to the popularity of deep architectures, such models with only one hidden layer are often referred to as shallow models.

===3 Genetic Algorithms===

To obtain an adversarial example for a trained machine learning model (such as a neural network), we need to optimize the input image with respect to the network output. For this task we employ genetic algorithms (GA). GA represent a robust optimization method working with a whole population of feasible solutions [7]. The population evolves using operators of selection, mutation, and crossover. Both the machine learning model and the target output are fixed during the optimization.

Each individual represents one possible input vector, i.e. one image encoded as a vector of pixel values:

$I = \{i_1, i_2, \ldots, i_N\}$,

where $i_k \in [0, 1]$ are levels of grey, and $N$ is the size of a flattened image. (For the sake of simplicity, we consider only greyscale images in this paper, but the same principle can be used for RGB images as well.)

The crossover operator performs a classical two-point crossover. The mutation introduces a small change to some pixels. With the probability $p_{\text{mutate\_pixel}}$ each pixel is changed:

$i_k = i_k + r$,

where $r$ is drawn from a Gaussian distribution. As a selection, the tournament selection with tournament size 3 is used.

The fitness function should reflect the following two criteria:

1. the individual should resemble the target image;

2. if we evaluate the individual by our machine learning model, we aim to obtain a prescribed target output (i.e., misclassify it).

Thus, in our case, the fitness function is defined as:

$f(I) = -\left(0.5 \cdot \text{cdist}(I, \text{target\_image}) + 0.5 \cdot \text{cdist}(\text{model}(I), \text{target\_answer})\right)$,

where cdist is the Euclidean distance.

===4 Experimental Results===

The goal of our experiments is to test various machine learning models and their vulnerability to adversarial examples.

====4.1 Overview of models====

As representatives of deep models we use two deep architectures: an MLP network with rectified linear units (ReLU), and a CNN. The MLP used in our experiments consists of three fully connected layers. Two hidden layers have 512 ReLUs each, using dropout; the output layer has 10 softmax units. The CNN has two convolutional layers with 32 filters and ReLUs each, a max pooling layer, a fully connected layer of 128 ReLUs, and a fully connected output softmax layer. In addition to these two models, we also used an ensemble of 10 MLPs. All models were trained using the Keras library [1].

Shallow networks in our experiments are represented by an RBF network with 1000 Gaussian units, and SVM models with Gaussian kernel (SVM-gauss), polynomial kernel of degree 2 and 4 (SVM-poly2 and SVM-poly4), sigmoidal kernel (SVM-sigmoid), and linear kernel (SVM-linear). SVMs were trained using the scikit-learn library [11]. Grid search and cross-validation techniques were used to tune hyper-parameters. For RBF networks, we used our own implementation. An overview of train and test accuracies can be found in Tab. 1.

Model        Train  Test
RBF          0.96   0.96
MLP          1.00   0.98
CNN          1.00   0.99
SVM-gauss    0.99   0.98
SVM-poly2    1.00   0.98
SVM-poly4    0.99   0.98
SVM-sigmoid  0.87   0.88
SVM-linear   0.95   0.94

Table 1: Overview of accuracies on train and test sets.

====4.2 Experimental setup====

The well-known MNIST data set [6] was used. It contains 70 000 images of handwritten digits, 28 × 28 pixels each. 60 000 are used for training, 10 000 for testing. The genetic algorithm was run with 50 individuals, for 10 000 generations, with crossover probability set to 0.6, and mutation probability set to 0.1. The GA was run 9 times for each model to find adversarial examples that resemble 9 different images (training samples from the beginning of the training set). All images were optimized to be classified as zero.

====4.3 Results====

Figures 1 and 2 show two selected cases from our set of experiments. For example, the first set of images shows a particular image of digit five from the training set, and the best evolved individuals from the corresponding runs of the GA for individual models.

In Tables 2–5, the outputs of individual models are listed. In Tab. 2 and 4, we show output vectors for the training sample of digit five and four, respectively. In Tab. 3 and 5, we show output vectors for adversarial examples from Fig. 1 and 2, respectively.

For the digit five, adversarial examples were found for MLP, CNN, ensemble of MLPs, SVM-poly2, and SVM-sigmoid. For RBF network, SVM-gauss, SVM-poly4, and SVM-linear, the GA was not able to find an image that resembles the digit 5 and at the same time is classified as zero.

If we look at the vulnerability of individual models over all 9 GA runs, we can conclude the following:

• MLP, CNN, ensemble of MLPs, and SVM-sigmoid always misclassified the best individuals;

• RBF network, SVM-gauss, and SVM-linear never misclassified, i.e. the genetic algorithm was not able to find an adversarial example for these models;

• SVM-poly2 and SVM-poly4 were resistant to finding adversarial examples in 2 and 5 cases, respectively.

Fig. 3 and 4 deal with the generalization of adversarial examples over different models. For each adversarial example the figure lists the output vectors of all models. In the case of digit 3, the adversarial example evolved for MLP is also misclassified by the ensemble of MLPs, and vice versa. Both examples are misclassified by SVM-sigmoid. However, the adversarial example for SVM-sigmoid is misclassified only by the SVM-linear model. The adversarial example for SVM-poly2 is also misclassified by the other SVMs, except the SVM-gauss model.

In general, it often happens that an adversarial example evolved for one model is misclassified by some of the other models (see Tab. 6 and 7). There are some general trends:

• an adversarial example evolved for CNN was never misclassified by other models, and CNN never misclassified adversarial examples other than those evolved for the CNN;

• adversarial examples evolved for MLP are also misclassified by the ensemble of MLPs (all cases except two), and adversarial examples evolved for the ensemble of MLPs are misclassified by MLP (all cases);

• adversarial examples evolved for the SVM-sigmoid model are misclassified by SVM-linear (all cases except two);

• adversarial examples for the SVM-poly2 model are often (6 cases) misclassified by the other SVMs (SVM-poly4, SVM-sigmoid, SVM-linear), and in 4 cases also by SVM-gauss. In three cases they were also misclassified by MLP and the ensemble of MLPs; in one case, the adversarial example for SVM-poly2 is misclassified by all models but CNN (however, this example is quite noisy);

• the adversarial example for the SVM-poly4 model is in two cases misclassified by all models but CNN, in another case by all but the CNN and RBF models, and in one case by all but the CNN, RBF, and SVM-gauss models;

• RBF network, SVM-gauss, and SVM-linear were resistant to adversarial examples generated by the genetic algorithm; however, they sometimes misclassify adversarial examples of other models. These examples are already quite noisy, but a human would still classify them correctly.

===5 Conclusion===

We proposed a genetic algorithm for generating adversarial examples for machine learning models. Our experiments show that many machine learning models suffer from vulnerability to adversarial examples, i.e. examples designed to be misclassified. Some models are quite resistant to such behaviour, namely models with local units: RBF networks and SVMs with Gaussian kernels. It seems that it is the local behaviour of units that prevents the models from being fooled.

Adversarial examples evolved for one model are often also misclassified by some of the other models, as was elaborated in the experiments.

===Acknowledgements===

This work was partially supported by the Czech Grant Agency grant 15-18108S, and institutional support of the Institute of Computer Science RVO 67985807.

Model             0      1      2      3      4      5      6      7      8      9
RBF            0.04  -0.07   0.08   0.24  -0.04   0.73  -0.04   0.21   0.03  -0.18
MLP            0.00   0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00
SVM-poly2      0.00   0.00   0.00   0.02   0.00   0.98   0.00   0.00   0.00   0.00
SVM-poly4      0.00   0.00   0.00   0.02   0.00   0.98   0.00   0.00   0.00   0.00
SVM-sigmoid    0.01   0.00   0.03   0.32   0.00   0.61   0.00   0.01   0.01   0.01
SVM-linear     0.00   0.00   0.01   0.10   0.00   0.89   0.00   0.00   0.00   0.00

Table 2: Evaluation of the target digit five (see Fig. 1) by individual models.
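As an illustration of the genetic algorithm described in Section 3, the following is a minimal NumPy sketch; it is not the authors' implementation. It follows the stated fitness function, two-point crossover, per-pixel Gaussian mutation, and tournament selection of size 3. The names `toy_model`, `evolve`, and `sigma`, the uniform random initial population, the clipping of pixels to [0, 1], the value of `p_mutate_pixel`, and the per-individual use of the mutation probability are assumptions made for this example, since the paper does not specify them.

```python
import numpy as np


def toy_model(x, weights):
    """Hypothetical stand-in classifier: linear layer followed by softmax."""
    z = x @ weights
    e = np.exp(z - z.max())
    return e / e.sum()


def fitness(individual, target_image, target_answer, model):
    """f(I) = -(0.5 * dist(I, target_image) + 0.5 * dist(model(I), target_answer))."""
    image_term = np.linalg.norm(individual - target_image)
    output_term = np.linalg.norm(model(individual) - target_answer)
    return -(0.5 * image_term + 0.5 * output_term)


def two_point_crossover(a, b, rng):
    """Swap the segment between two distinct cut points."""
    lo, hi = sorted(rng.choice(len(a) + 1, size=2, replace=False))
    child_a, child_b = a.copy(), b.copy()
    child_a[lo:hi], child_b[lo:hi] = b[lo:hi], a[lo:hi]
    return child_a, child_b


def mutate(individual, rng, p_mutate_pixel=0.1, sigma=0.1):
    """Add Gaussian noise r to each pixel with probability p_mutate_pixel."""
    mask = rng.random(individual.shape) < p_mutate_pixel
    noise = rng.normal(0.0, sigma, individual.shape)
    # Clipping keeps grey levels in [0, 1]; this is an assumption of the sketch.
    return np.clip(individual + mask * noise, 0.0, 1.0)


def tournament(population, scores, rng, size=3):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.choice(len(population), size=size, replace=False)
    return population[max(contenders, key=lambda i: scores[i])]


def evolve(model, target_image, target_answer, pop_size=50, generations=100,
           p_crossover=0.6, p_mutation=0.1, seed=0):
    """Evolve an image that resembles target_image but yields target_answer."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: uniformly random grey levels.
    population = [rng.random(target_image.shape) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind, target_image, target_answer, model)
                  for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            if rng.random() < p_crossover:
                a, b = two_point_crossover(a, b, rng)
            for child in (a, b):
                if rng.random() < p_mutation:
                    child = mutate(child, rng)
                offspring.append(child)
        population = offspring[:pop_size]
    scores = [fitness(ind, target_image, target_answer, model)
              for ind in population]
    return population[int(np.argmax(scores))]


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    weights = rng.normal(size=(784, 10))   # random weights, not a trained model
    target_image = rng.random(784)         # placeholder for a flattened 28 x 28 digit
    target_answer = np.eye(10)[0]          # one-hot vector for class zero
    model = lambda x: toy_model(x, weights)
    best = evolve(model, target_image, target_answer)
    print(fitness(best, target_image, target_answer, model))
```

In the setting of the paper, `model` would be one of the trained classifiers queried as a black box, with 50 individuals and 10 000 generations; the random linear softmax model is used here only so that the sketch runs on its own.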
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.21  -0.05   0.09   0.23  -0.04   0.51  -0.05   0.17   0.07  -0.09
MLP            0.98   0.00   0.00   0.00   0.00   0.01   0.00   0.00   0.00   0.00
CNN            0.95   0.00   0.01   0.01   0.00   0.02   0.00   0.00   0.01   0.00
ENS            0.98   0.00   0.01   0.00   0.00   0.01   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.01   0.39   0.00   0.59   0.00   0.00   0.00   0.00
SVM-poly2      0.88   0.00   0.02   0.02   0.00   0.07   0.00   0.00   0.00   0.00
SVM-poly4      0.01   0.01   0.14   0.29   0.01   0.50   0.01   0.01   0.02   0.01
SVM-sigmoid    0.82   0.00   0.03   0.05   0.00   0.08   0.00   0.01   0.01   0.01
SVM-linear     0.02   0.03   0.21   0.21   0.02   0.33   0.03   0.04   0.05   0.05

Table 3: Evaluation of adversarial digit five (from Fig. 1) by individual models.

[Figure 1 panels: Target, RBF, MLP, CNN, ENS, SVM-rbf, SVM-poly, SVM-poly4, SVM-sigmoid, SVM-linear]

Figure 1: Best individuals evolved for individual models and digit five. The first image ('Target') is the digit from the training set, followed by the adversarial examples evolved for individual models.

Model             0      1      2      3      4      5      6      7      8      9
RBF           -0.06   0.07   0.16   0.06   0.67   0.00   0.01   0.15  -0.01  -0.05
MLP            0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.00
SVM-poly2      0.00   0.00   0.00   0.00   0.97   0.00   0.00   0.01   0.00   0.00
SVM-poly4      0.00   0.00   0.00   0.01   0.96   0.00   0.00   0.01   0.00   0.00
SVM-sigmoid    0.00   0.00   0.01   0.02   0.94   0.00   0.00   0.01   0.00   0.01
SVM-linear     0.00   0.01   0.05   0.04   0.84   0.00   0.01   0.04   0.00   0.02

Table 4: Evaluation of the target digit four (see Fig. 2) by individual models.
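The output vectors of Table 3 can be read programmatically. The short sketch below copies the rows of Table 3 and lists the models whose highest output falls on the target class zero, i.e. the models for which the GA succeeded on digit five. The helper name `models_hitting_target` and the argmax reading of "classified as zero" are assumptions of this example.

```python
import numpy as np

TARGET_CLASS = 0  # all images were optimized to be classified as zero

# Output vectors for classes 0..9, copied from Table 3 (adversarial digit five).
table3 = {
    "RBF":         [0.21, -0.05, 0.09, 0.23, -0.04, 0.51, -0.05, 0.17, 0.07, -0.09],
    "MLP":         [0.98, 0.00, 0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00],
    "CNN":         [0.95, 0.00, 0.01, 0.01, 0.00, 0.02, 0.00, 0.00, 0.01, 0.00],
    "ENS":         [0.98, 0.00, 0.01, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00],
    "SVM-gauss":   [0.00, 0.00, 0.01, 0.39, 0.00, 0.59, 0.00, 0.00, 0.00, 0.00],
    "SVM-poly2":   [0.88, 0.00, 0.02, 0.02, 0.00, 0.07, 0.00, 0.00, 0.00, 0.00],  # degree-2 polynomial kernel
    "SVM-poly4":   [0.01, 0.01, 0.14, 0.29, 0.01, 0.50, 0.01, 0.01, 0.02, 0.01],
    "SVM-sigmoid": [0.82, 0.00, 0.03, 0.05, 0.00, 0.08, 0.00, 0.01, 0.01, 0.01],
    "SVM-linear":  [0.02, 0.03, 0.21, 0.21, 0.02, 0.33, 0.03, 0.04, 0.05, 0.05],
}


def models_hitting_target(outputs, target_class):
    """Names of models whose largest output is on the target class."""
    return [name for name, vec in outputs.items()
            if int(np.argmax(vec)) == target_class]


print(models_hitting_target(table3, TARGET_CLASS))
# → ['MLP', 'CNN', 'ENS', 'SVM-poly2', 'SVM-sigmoid']
```

The result agrees with the list given in Section 4.3 for the digit five: MLP, CNN, ensemble of MLPs, SVM-poly2, and SVM-sigmoid.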
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.06   0.08   0.14   0.08   0.49  -0.01   0.03   0.14   0.02   0.02
MLP            0.96   0.00   0.01   0.00   0.02   0.00   0.00   0.01   0.00   0.01
CNN            0.89   0.00   0.02   0.00   0.05   0.00   0.00   0.01   0.02   0.00
ENS            0.96   0.00   0.01   0.00   0.02   0.00   0.00   0.00   0.00   0.01
SVM-gauss      0.01   0.01   0.07   0.12   0.50   0.03   0.01   0.06   0.09   0.08
SVM-poly2      0.75   0.00   0.03   0.01   0.09   0.00   0.01   0.04   0.03   0.04
SVM-poly4      0.71   0.01   0.04   0.01   0.11   0.02   0.02   0.04   0.02   0.02
SVM-sigmoid    0.84   0.00   0.02   0.03   0.03   0.02   0.01   0.01   0.00   0.03
SVM-linear     0.02   0.02   0.18   0.18   0.26   0.03   0.05   0.11   0.03   0.11

Table 5: Evaluation of adversarial digit four (from Fig. 2) by individual models.

[Figure 2 panels: Target, RBF, MLP, CNN, ENS, SVM-rbf, SVM-poly, SVM-poly4, SVM-sigmoid, SVM-linear]

Figure 2: Best individuals evolved for individual models and digit four. The first image ('Target') is the digit from the training set, followed by the adversarial examples evolved for individual models.

[Figure 3 panels: each panel pairs the evolved image with the output vectors of all models, listed below]

Evolved against RBF
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.16   0.06   0.12   0.79   0.01  -0.02  -0.06  -0.00   0.02   0.03
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly2      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly4      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-sigmoid    0.04   0.00   0.02   0.79   0.00   0.06   0.00   0.01   0.05   0.02
SVM-linear     0.00   0.00   0.00   0.96   0.00   0.00   0.00   0.00   0.03   0.00

Evolved against MLP
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.30   0.04   0.17   0.75   0.02  -0.03  -0.04  -0.01  -0.07  -0.00
MLP            0.96   0.00   0.01   0.01   0.00   0.00   0.00   0.00   0.01   0.01
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.86   0.00   0.01   0.04   0.00   0.00   0.00   0.00   0.01   0.08
SVM-gauss      0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.00   0.01
SVM-poly2      0.04   0.00   0.01   0.91   0.00   0.00   0.00   0.00   0.02   0.02
SVM-poly4      0.03   0.00   0.01   0.93   0.00   0.00   0.00   0.00   0.01   0.01
SVM-sigmoid    0.49   0.00   0.03   0.30   0.00   0.04   0.00   0.01   0.10   0.02
SVM-linear     0.25   0.02   0.10   0.30   0.02   0.05   0.02   0.03   0.18   0.06

Evolved against CNN
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.12   0.05   0.15   0.89   0.01  -0.18  -0.02   0.02   0.10  -0.03
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.94   0.00   0.02   0.01   0.00   0.00   0.00   0.00   0.03   0.00
ENS            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly2      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly4      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-sigmoid    0.04   0.00   0.02   0.86   0.00   0.03   0.00   0.00   0.04   0.01
SVM-linear     0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.00   0.00

Evolved against ENS
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.30   0.05   0.18   0.76  -0.01  -0.06  -0.04  -0.03  -0.05  -0.00
MLP            0.83   0.00   0.05   0.06   0.00   0.00   0.00   0.00   0.05   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.96   0.00   0.01   0.02   0.00   0.00   0.00   0.00   0.01   0.00
SVM-gauss      0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.00   0.01
SVM-poly2      0.02   0.00   0.01   0.94   0.00   0.00   0.00   0.00   0.01   0.02
SVM-poly4      0.01   0.00   0.00   0.96   0.00   0.00   0.00   0.00   0.01   0.01
SVM-sigmoid    0.40   0.00   0.03   0.35   0.01   0.06   0.00   0.01   0.11   0.02
SVM-linear     0.19   0.01   0.06   0.50   0.01   0.05   0.01   0.02   0.11   0.04

Figure 3: Model outputs for individual adversarial examples.
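The transfer of an adversarial example between models can be checked directly from the output vectors in Figure 3. The sketch below takes the "Evolved against MLP" panel (digit 3) and lists the other models that misclassify the same image. Treating "misclassified" as "the largest output is not on the true class" and the helper name `misclassifying_models` are assumptions of this example.

```python
import numpy as np

TRUE_LABEL = 3  # the digit used in Figures 3 and 4

# Output vectors for classes 0..9 from the "Evolved against MLP" panel of Figure 3.
evolved_against_mlp = {
    "RBF":         [0.30, 0.04, 0.17, 0.75, 0.02, -0.03, -0.04, -0.01, -0.07, -0.00],
    "MLP":         [0.96, 0.00, 0.01, 0.01, 0.00, 0.00, 0.00, 0.00, 0.01, 0.01],
    "CNN":         [0.00, 0.00, 0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    "ENS":         [0.86, 0.00, 0.01, 0.04, 0.00, 0.00, 0.00, 0.00, 0.01, 0.08],
    "SVM-gauss":   [0.00, 0.00, 0.00, 0.99, 0.00, 0.00, 0.00, 0.00, 0.00, 0.01],
    "SVM-poly2":   [0.04, 0.00, 0.01, 0.91, 0.00, 0.00, 0.00, 0.00, 0.02, 0.02],  # degree-2 polynomial kernel
    "SVM-poly4":   [0.03, 0.00, 0.01, 0.93, 0.00, 0.00, 0.00, 0.00, 0.01, 0.01],
    "SVM-sigmoid": [0.49, 0.00, 0.03, 0.30, 0.00, 0.04, 0.00, 0.01, 0.10, 0.02],
    "SVM-linear":  [0.25, 0.02, 0.10, 0.30, 0.02, 0.05, 0.02, 0.03, 0.18, 0.06],
}


def misclassifying_models(outputs, true_label):
    """Names of models whose largest output is not on the true class."""
    return [name for name, vec in outputs.items()
            if int(np.argmax(vec)) != true_label]


fooled = misclassifying_models(evolved_against_mlp, TRUE_LABEL)
transferred = [name for name in fooled if name != "MLP"]
print(transferred)
# → ['ENS', 'SVM-sigmoid']
```

This matches the statement in Section 4.3 that, for digit 3, the example evolved for MLP is also misclassified by the ensemble of MLPs and by SVM-sigmoid.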
[Figure 4 panels: each panel pairs the evolved image with the output vectors of all models, listed below]

Evolved against SVM-RBF (SVM-gauss)
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.06   0.01   0.15   0.74  -0.00  -0.03  -0.04  -0.01   0.26   0.05
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.01   0.00
SVM-gauss      0.00   0.00   0.00   0.90   0.00   0.00   0.00   0.00   0.10   0.00
SVM-poly2      0.00   0.00   0.00   0.50   0.00   0.00   0.00   0.00   0.50   0.00
SVM-poly4      0.00   0.00   0.00   0.60   0.00   0.00   0.00   0.00   0.40   0.00
SVM-sigmoid    0.03   0.00   0.03   0.63   0.00   0.09   0.00   0.01   0.19   0.02
SVM-linear     0.00   0.00   0.00   0.36   0.00   0.00   0.00   0.00   0.63   0.00

Evolved against SVM-poly (SVM-poly2)
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.32   0.02   0.17   0.86  -0.01  -0.09  -0.09  -0.03  -0.12   0.01
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly2      0.87   0.00   0.02   0.04   0.00   0.00   0.00   0.00   0.04   0.02
SVM-poly4      0.38   0.01   0.11   0.23   0.01   0.02   0.01   0.02   0.15   0.04
SVM-sigmoid    0.55   0.01   0.04   0.19   0.01   0.05   0.01   0.01   0.13   0.02
SVM-linear     0.71   0.01   0.02   0.06   0.01   0.02   0.01   0.01   0.15   0.01

Evolved against SVM-poly4
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.07   0.02   0.12   0.84   0.04  -0.07  -0.07  -0.01   0.10   0.06
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   0.99   0.00   0.00   0.00   0.00   0.01   0.00
SVM-poly2      0.00   0.00   0.00   0.57   0.00   0.00   0.00   0.00   0.41   0.03
SVM-poly4      0.00   0.00   0.00   0.67   0.00   0.00   0.00   0.00   0.31   0.03
SVM-sigmoid    0.04   0.00   0.02   0.73   0.01   0.07   0.00   0.01   0.08   0.02
SVM-linear     0.01   0.00   0.00   0.58   0.00   0.00   0.00   0.00   0.39   0.01

Evolved against SVM-sigmoid
Model             0      1      2      3      4      5      6      7      8      9
RBF            0.30   0.04   0.22   0.94  -0.01  -0.14  -0.10   0.02  -0.21  -0.02
MLP            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
CNN            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
ENS            0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-gauss      0.00   0.00   0.00   1.00   0.00   0.00   0.00   0.00   0.00   0.00
SVM-poly2      0.06   0.00   0.03   0.84   0.00   0.01   0.00   0.00   0.03   0.02
SVM-poly4      0.04   0.00   0.01   0.93   0.00   0.00   0.00   0.00   0.01   0.01
SVM-sigmoid    0.74   0.00   0.02   0.12   0.00   0.02   0.00   0.00   0.08   0.01
SVM-linear     0.41   0.02   0.06   0.18   0.02   0.05   0.02   0.03   0.17   0.05

Figure 4: Model outputs for individual adversarial examples.

Evolved for    Also misclassified by

Example 1: digit 5
MLP            none
ensemble       MLP
CNN            none
SVM-poly2      SVM-poly4, SVM-sigmoid, SVM-linear
SVM-sigmoid    none

Example 3: digit 4
MLP            ensemble
ensemble       MLP
CNN            none
SVM-poly2      ensemble, MLP, SVM-gauss, SVM-poly4, SVM-sigmoid, SVM-linear
SVM-poly4      RBF, ensemble, MLP, SVM-gauss, SVM-poly2, SVM-sigmoid, SVM-linear
SVM-sigmoid    SVM-linear

Example 4: digit 1
MLP            ensemble
ensemble       MLP
CNN            none
SVM-poly2      SVM-gauss, SVM-poly4, SVM-sigmoid, SVM-linear
SVM-sigmoid    SVM-linear

Example 5: digit 9
MLP            ensemble
ensemble       MLP
CNN            none
SVM-poly2      ensemble, MLP, SVM-gauss, SVM-poly4, SVM-sigmoid, SVM-linear
SVM-poly4      all except CNN
SVM-sigmoid    SVM-linear

Example 6: digit 2
MLP            none
ensemble       MLP
CNN            none
SVM-poly4      MLP, ensemble, SVM-poly2, SVM-sigmoid, SVM-linear
SVM-sigmoid    SVM-linear

Example 7: digit 1
MLP            ensemble
ensemble       MLP
CNN            none
SVM-sigmoid    none

Table 6: Generalization of adversarial examples.

Evolved for    Also misclassified by

Example 8: digit 3
MLP            ensemble, SVM-sigmoid
ensemble       MLP, SVM-sigmoid
CNN            none
SVM-poly2      SVM-poly4, SVM-sigmoid, SVM-linear
SVM-sigmoid    SVM-linear

Example 9: digit 1
MLP            ensemble
ensemble       MLP
CNN            none
SVM-poly2      all except CNN
SVM-sigmoid    MLP, ensemble, SVM-gauss, SVM-poly2, SVM-poly4, SVM-linear

Example 10: digit 4
MLP            ensemble
ensemble       MLP
CNN            none
SVM-poly2      SVM-poly4, SVM-linear
SVM-poly4      MLP, ensemble, SVM-gauss, SVM-poly2, SVM-sigmoid, SVM-linear
SVM-sigmoid    SVM-linear

Table 7: Generalization of adversarial examples.

===References===

[1] François Chollet. Keras. https://github.com/fchollet/keras, 2015.

[2] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.

[3] F. Girosi, M. Jones, and T. Poggio. Regularization theory and neural networks architectures. Neural Computation, 7(2):219–269, 1995.

[4] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples, 2014. arXiv:1412.6572.

[5] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. Book in preparation for MIT Press, 2016.

[6] Yann LeCun and Corinna Cortes. The MNIST database of handwritten digits, 2012.

[7] M. Mitchell. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, 1996.

[8] J. Moody and C. Darken. Fast learning in networks of locally-tuned processing units. Neural Computation, 1:289–303, 1989.

[9] R. Neruda and P. Kudová. Learning methods for radial basis functions networks. Future Generation Computer Systems, 21:1131–1142, 2005.

[10] Anh Mai Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. CoRR, abs/1412.1897, 2014.

[11] F. Pedregosa et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.

[12] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks, 2013. arXiv:1312.6199.

[13] V. N. Vapnik. Statistical Learning Theory. Wiley, New York, 1998.

[14] J. P. Vert, K. Tsuda, and B. Schölkopf. A primer on kernel methods. Kernel Methods in Computational Biology, pages 35–70, 2004.