<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Khondoker Murad Hossain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tim Oates</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Maryland Baltimore County</institution>
          ,
          <addr-line>Baltimore, MD, 21250</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The increasing importance of both deep neural networks (DNNs) and the cloud services used to train them means that bad actors have more incentive and opportunity to insert backdoors that alter the behavior of trained models. In this paper, we introduce a novel method for backdoor detection that extracts features from the weights of pre-trained DNNs using independent vector analysis (IVA), followed by a machine learning classifier. In comparison to other detection techniques, this has a number of benefits: it requires no training data, is applicable across domains, operates with a wide range of network architectures, makes no assumptions about the nature of the triggers used to change network behavior, and is highly scalable. We describe the detection pipeline and then demonstrate its results on two computer vision datasets, one for image classification and one for object detection. Our method outperforms the competing algorithms in terms of efficiency and is more accurate, helping to ensure the safe application of deep learning and AI.</p>
      </abstract>
      <kwd-group>
        <kwd>Backdoor detection</kwd>
        <kwd>image classification</kwd>
        <kwd>object detection</kwd>
        <kwd>matrix factorization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Deep neural networks (DNNs) have seen great success in diverse domains, including object detection [<xref ref-type="bibr" rid="ref20">1</xref>], image captioning [2], virtual assistants [3], healthcare [4], fake news detection [5], stock market prediction [6], and self-driving cars [7]. Despite their ubiquitous applications, DNNs are still considered to be black boxes, as their internal representations are opaque and their behavior can be hard to predict. Because of this, DNNs are susceptible to a variety of adversarial attacks.</p>
      <p>Two of the most prominent adversarial attacks are (i) evasion attacks [8, 9], where the adversary modifies data at inference time so that it is misclassified as benign (e.g., spam emails), and (ii) backdoor attacks (aka trojan attacks) [<xref ref-type="bibr" rid="ref3">10</xref>], where the adversary includes poisoned samples in the training data. In the latter case, the adversary has full control over the network's training process and malicious behaviour is deliberately injected into the model. As soon as the backdoored model sees a particular pattern, known as the trigger, at inference time, it misclassifies the sample. These attacks are growing as DNNs need vast amounts of data to train and millions or billions of parameters need to be learned. The computational power needed for this training process is often not available to individuals or even some businesses, leading to outsourcing training to third parties or downloading pre-trained models from open source platforms like GitHub and Hugging Face. As a result, someone with bad intentions can easily introduce a backdoor in the model.</p>
      <p>Backdoor attacks are more stealthy than other attacks, as the backdoored model can have high accuracy on the underlying task, e.g., classification. As DNNs are deployed in critical applications, the consequences of trojaned models can be dire. For example, a model used to detect street signs in a self-driving car may have an embedded trigger (e.g., a yellow sticky note) that causes the model to misclassify stop signs as speed limit signs, leading to accidents. Due to this, the US Defense Advanced Research Projects Agency (DARPA) has introduced the trojans in AI (TrojAI) program, where teams are developing cutting-edge trojan detection pipelines.</p>
      <p>We introduce a novel backdoor detection approach which uses matrix factorization, in the form of independent vector analysis (IVA) [11], together with machine learning (ML) classifiers to detect backdoored models. Though matrix factorization algorithms have been developed to compare the internal representations of neural networks (e.g., Representational Similarity Analysis (RSA) [12], Centered Kernel Alignment (CKA) [<xref ref-type="bibr" rid="ref9">13</xref>], and Singular Vector Canonical Correlation Analysis (SVCCA) [14]), they have mostly been used for pairwise similarity analysis and never applied to the backdoor detection problem. We use IVA to extract features from the weights of each pre-trained DNN model and then feed the features to a ML classifier to decide whether a model is backdoored or clean.</p>
      <p>We can summarize the contributions of our paper as follows:</p>
      <p>• We propose a highly effective backdoor detection pipeline which employs IVA for feature extraction and detects backdoor models from the features using a ML classifier. To the best of our knowledge, no such methods have been published for backdoor detection using IVA. Our approach has better accuracy and efficiency than state of the art (SOTA) backdoor detection methods on both image classification and object detection DNNs.</p>
      <p>• Our method does not need any training samples to detect a backdoored model, whereas other methods use training samples for optimization and then detect backdoors based on the result. In the real world, getting training samples is highly unlikely, as we can obtain only a DNN model, not the data used to train it.</p>
      <p>1 https://pages.nist.gov/trojai/docs/overview.html</p>
      <sec id="sec-1-1">
        <title>2. Related Works</title>
        <p>This section reviews work on both backdoor attacks and defenses against those attacks.</p>
        <sec id="sec-1-1-1">
          <title>2.1. Backdoor Attack</title>
          <p>BadNets was proposed by Gu et al. [<xref ref-type="bibr" rid="ref3">10</xref>], where backdoors are injected into DNNs by poisoning a subset of the training data with triggers (small visual patterns) of arbitrary shapes. The attacker changes the true class label of the triggered samples so that the poisoned source class images are classified as the target class. BadNets performs well (more than 99% attack success rate) on both clean and poisoned data, as the attacker has full control of the training process. Liu et al. proposed another backdoor attack [15] where the attacker does not need access to the training data. Instead, the attacker inserts triggers which instigate maximum response in specific internal neurons of the DNN. This method can achieve a high success rate (&gt; 98%) as the triggers hold a strong relation to those neurons. Backdoor attacks have also been demonstrated in further applications such as reinforcement learning [16] and natural language processing [17].</p>
        </sec>
        <sec id="sec-1-2">
          <title>2.2. Backdoor Defense</title>
          <p>Backdoor detection strategies typically inspect either the model or the data. Neural Cleanse [18] is a model-based detection method that assumes each class label is the backdoor target label and designs an optimization technique to find the smallest trigger that causes the network to misclassify instances as the target label. After that, they use an outlier detection algorithm on the potential triggers and consider the most significant outlier trigger as the real one, where the label associated with that trigger is the backdoored class label. Though this method showed promising results, it is computationally very expensive as the target label is not known at run time.</p>
          <p>Thousands of benign and malicious models are used to train a classifier utilizing Universal Litmus Patterns (ULPs) [19], which have been developed for backdoor detection. Based on the ULP optimization, the classifier makes a prediction about whether a model has a backdoor. The entropy of a perturbed input picture is measured by STRIP [20] to detect backdoors: if the entropy for the anticipated class is lower, the model is deemed to be backdoored, since it violates the input dependence criterion. Sentinet [21] is a data-level inspection method which uses backpropagation to extract the critical regions from the input data.</p>
          <p>ABS [22] is another model-level backdoor detection method that analyzes the behavior of neuron activations. A stimulation method estimates the impact on output activations of changes to hidden neuron activations. The input is likely poisoned if a neuron's activation increases significantly regardless of the model output label. Based on the stimulation results, an optimization method using model reverse engineering is employed to detect backdoor models. ABS shows very promising results in backdoor detection, but it is also computationally heavy when a network has a large number of layers.</p>
          <p>Chen et al. proposed activation clustering (AC) [23] for backdoor detection by analyzing the activations of neural networks. They use a few training samples to obtain the activations of the final fully connected layer of a neural network. Then the activations are segmented by class label and each label is clustered separately. Finally, they implement 2-means clustering followed by ICA for dimensionality reduction. To find the poisoned model they use three distinct post-processing methods.</p>
          <p>All the backdoor detection methods discussed above only deal with CNN models for image classification tasks. Regarding backdoor detection for object detection CNN models, Chan et al. proposed detector cleanse [24], a framework for run-time poisoned image detection for object detectors that relies on the user having just a few clean features (which can come from many datasets).</p>
        </sec>
      </sec>
      <sec id="sec-1-3">
        <title>3.1. Problem statement</title>
        <p>Consider a DNN model f(·) which performs a classification task over c = 1, ..., C classes using training dataset D. If we poison a portion of D, denoted B ⊂ D, by injecting triggers into training images and changing the source class label to the target label, f(·) is a backdoored model after training. During inference, f(·) performs as expected for clean input samples, but for triggered samples x ∈ B it outputs f(x) = t, where t (t ∈ C) is the target but incorrect class, and can be single or multiple depending on the number of classes we poison. The objective of our pipeline is to detect these backdoor models before deployment.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Method and Pipeline</title>
      <p>[Figure 1: Backdoor detection pipeline. The weight tensors of pre-trained DNNs are made uniform using random projection (RP), features are extracted using IVA, and a ML classifier predicts each DNN's label: clean or backdoored.]</p>
      <sec id="sec-2-1">
        <title>3.2. Backdoor detection pipeline</title>
        <p>In this section, we describe how we extract features from
the weights of the pre-trained DNNs and use the features
for backdoor model prediction.</p>
        <sec id="sec-2-1-1">
          <title>3.2.1. DNN weight tensor preparation</title>
          <p>As all the DNNs, k = 1, ..., K, are already trained, we have the weights of each layer of the networks. But the dimensions of the weights are not uniform; they depend on the type of layer and the network architecture. So, we have used random projection (RP) to obtain uniform-size weight tensors for all the layers, as RP can produce features of uniform size [25] for different DNNs and is very memory efficient [26]. As a result, for each DNN we get a weight tensor W[k] ∈ R^(N×R), where R = 2000, meaning we consider the weights of N layers of the DNN and the RP dimension is 2000.</p>
          <sec id="sec-2-1-1-1">
            <title>3.2.2. Feature extraction and classification</title>
            <p>IVA is an extension of independent component analysis (ICA) to multiple datasets [11] which uses the statistical dependence of latent (independent) sources across datasets by exploiting both second order and higher order statistics. Though it is one of the frequently used algorithms for brain connectivity analysis using fMRI and EEG data [27, 28], this is the first backdoor detection pipeline using IVA.</p>
            <p>Before applying IVA for feature extraction, we get our datasets, X[k] ∈ R^(P×R), using PCA on W[k] for dimensionality reduction with model order P, preserving 90% of the variance in our data. Given K datasets for K DNN models, each consisting of R samples, and each dataset being a linear mixture of P independent sources, IVA decomposes it as</p>
            <p>X[k] = A[k] S[k], 1 ≤ k ≤ K, (1)</p>
            <p>where A[k] denotes the mixing matrix and S[k] the dataset-specific sources. IVA estimates K demixing matrices, D[k], k = 1, ..., K, so that the dataset-specific sources can be estimated as S[k] = D[k] X[k]. Hence, each S[k] contains P sources and we use those P features to classify the DNN models. Finally, we train a classifier algorithm f(·) to predict whether a model is backdoored or clean.</p>
            <p>Algorithm 1: Backdoor Detection using DNN weights.
Input: weights of the pre-trained DNNs, k = 1, ..., K.
Output: Backdoor / Clean DNNs.
1: for k = 1, ..., K do
2:   Get a 1 × R weight tensor for each layer using random projection, for N layers
3:   Append them for n = 1, ..., N and construct W[k] ∈ R^(N×R)
4:   Observation: X[k] ∈ R^(P×R) = PCA(W[k])
5:   Demixing matrix: D[k] = IVA(X[k])
6:   Estimated sources: S[k] ∈ R^(P×R) = D[k] · X[k]
7:   Predicted label: ŷ = f(S[k])</p>
          </sec>
          <sec id="sec-2-1-1-2">
            <title>4. Dataset and Experimental Results</title>
            <sec id="sec-2-1-1-2-1">
              <title>4.1. Dataset</title>
              <p>To evaluate our backdoor detection method, we use CNN models trained on MNIST digits and object detection models provided by the TrojAI program.</p>
            </sec>
            <sec id="sec-2-1-1-2-2">
              <title>4.1.1. Image classification dataset</title>
              <p>We have trained 450 CNN models using the same architecture, shown in Table 1 (50% clean, 50% backdoored), to classify the MNIST data. Clean CNNs are trained using the clean MNIST data. For backdoored model training, we poison all '0's (single class poisoning) by imposing a 4 × 4 pixel white patch on the lower right corner and set the target class to '9', as shown in Figure 2. Clean CNNs exhibit an average accuracy of 99.02%, while backdoored CNNs have an accuracy of 98.85% with a 99.92% attack success rate, indicating a highly effective trigger attack. Moreover, out of the 450 models, we use 400 CNNs for training and 50 for testing, with N = 6, meaning we consider all CNN layers' weights.</p>
              <p>[Figure 2: MNIST CNN dataset: a clean sample and a poisoned sample (source label '0', target label '9').]</p>
            </sec>
          </sec>
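<p>The single-class poisoning described above (stamping a 4 × 4 white patch on the lower-right corner of every '0' and relabeling it as '9') can be sketched in a few lines of numpy; array names are illustrative and pixels are assumed to lie in [0, 1]:</p>

```python
import numpy as np

def poison_batch(images, labels, source=0, target=9, patch=4):
    """Stamp a white patch on the lower-right corner of every source-class
    image and flip its label to the target class (single-class poisoning)."""
    images, labels = images.copy(), labels.copy()
    mask = labels == source
    # Pixels assumed in [0, 1]; use 255 instead for raw uint8 images.
    images[mask, -patch:, -patch:] = 1.0
    labels[mask] = target
    return images, labels

# Toy batch: two 28x28 images, a '0' (gets poisoned) and a '5' (untouched).
imgs = np.zeros((2, 28, 28))
lbls = np.array([0, 5])
p_imgs, p_lbls = poison_batch(imgs, lbls)
```

A backdoored training set is then the clean set with a fraction of its source-class samples replaced by such poisoned copies.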
        </sec>
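<p>The weight-tensor preparation and feature-extraction steps above (random projection, PCA, IVA, then a classifier) can be sketched with scikit-learn. IVA itself is not available there, so per-model FastICA stands in for the joint IVA step (true IVA additionally couples the estimated sources across the K datasets); all names and dimensions are illustrative, with R shrunk from 2000 for brevity:</p>

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.decomposition import PCA, FastICA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
K, N, R, P = 20, 6, 200, 4  # models, layers, RP dim (2000 in the paper), sources

def model_features(layer_weights):
    # 3.2.1: project each layer's flattened weights into a common R-dim space.
    W = np.vstack([
        GaussianRandomProjection(n_components=R, random_state=0)
        .fit_transform(w.reshape(1, -1))
        for w in layer_weights
    ])                                             # W[k] : N x R
    X = PCA(n_components=P).fit_transform(W.T).T   # X[k] : P x R
    # Stand-in for IVA: estimate P independent sources for this model.
    S = FastICA(n_components=P, random_state=0).fit_transform(X.T).T
    return S.ravel()                               # flattened P x R features

# Toy "pre-trained models": N layers of random weights each.
feats = np.array([model_features([rng.normal(size=(32, 64)) for _ in range(N)])
                  for _ in range(K)])
labels = rng.integers(0, 2, size=K)                # 0 = clean, 1 = backdoored
clf = RandomForestClassifier(random_state=0).fit(feats, labels)
```

With real models, `layer_weights` would come from the saved checkpoints, and the classifier would be trained on models with known clean/backdoored labels.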
        <sec id="sec-2-1-2">
          <title>4.1.2. Object detection dataset</title>
          <p>We have utilized the object detection CNN models of the TrojAI dataset, which contains backdoored and clean models across two network architectures (Fast R-CNN and SSD) trained on the Common Objects in Context (COCO) dataset. We use 144 'Train' models from the repository as our training models and 144 'Test' models for the evaluation of our pipeline, with N = 30, meaning we consider the final 30 layers' weights of the models. Figure 3 shows that there are two types of trigger attacks on the models: evasion and misclassification. Evasion triggers cause either a single box, or all boxes of a class, to be deleted, and misclassification triggers cause either a single box, or all boxes of a specific class, to shift to the target label.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>4.2. Experimental results</title>
        <p>Several performance metrics are reported using different ML classifiers. We also compare our findings with SOTA backdoor detection methods in terms of both performance and efficiency. Regarding the number of PCA components, we use P = 4 and P = 10 for the image classification and object detection datasets, respectively. Moreover, we use the standard equation for binomial proportions to estimate confidence intervals on the empirical accuracies as a robustness metric for the pipelines, i.e., confidence interval = z × √(p × (1 − p)/n), where p is the empirical accuracy, n is the number of models classified as backdoored or clean, and we use z = 1.96 and thus have 95% confidence intervals [29].</p>
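<p>The interval above is the standard normal approximation for a binomial proportion; written out (z = 1.96 for a 95% interval, values below purely illustrative):</p>

```python
import math

def binomial_ci(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for an empirical proportion p measured over n trials."""
    return z * math.sqrt(p * (1.0 - p) / n)

# e.g., 0.90 empirical accuracy over 50 held-out models -> about +/- 0.083
half = binomial_ci(0.90, 50)
```

The reported accuracy would then be quoted as p ± half; note the half-width shrinks as 1/√n, so doubling the test set narrows the interval by about 30%.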
        <sec id="sec-2-2-1">
          <title>4.2.1. Backdoor model classification</title>
          <p>We show the backdoor model detection results in Table 2. Three different ML classifiers (random forest (RF), decision tree (DT), and k-nearest neighbor (kNN)) have been used in the experiments for both the image classification and object detection datasets. As performance metrics, cross entropy loss (CE-Loss) and area under the ROC curve (ROC-AUC) scores are reported, as CE-Loss is the current standard for classification problems and ROC-AUC helps to understand the false positive rate (FPR), which is crucial for backdoor model detection. On both datasets, RF performs better than DT and kNN in terms of CE-Loss and ROC-AUC scores. Our pipeline using RF shows ROC-AUC scores of 0.91 for the image classification and 0.89 for the object detection datasets.</p>
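<p>The two metrics reported above can be computed directly from a classifier's predicted probabilities; a small scikit-learn sketch with illustrative numbers (not the paper's results):</p>

```python
from sklearn.metrics import log_loss, roc_auc_score

# Ground truth (1 = backdoored) and predicted P(backdoored) for 8 held-out models.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.3, 0.2, 0.6, 0.8, 0.7, 0.9, 0.4]

ce = log_loss(y_true, y_prob)        # cross entropy loss, lower is better
auc = roc_auc_score(y_true, y_prob)  # area under the ROC curve, higher is better
```

ROC-AUC is threshold-free (it is the probability a random backdoored model outranks a random clean one), which is why it is informative about the FPR trade-off, while CE-Loss also penalizes over-confident wrong probabilities.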
        </sec>
        <sec id="sec-2-2-2">
          <title>4.2.2. Comparison with other methods</title>
          <p>Image classification</p>
          <p>Our method is evaluated in comparison to four SOTA backdoor detection techniques: NC [18], Universal Litmus Patterns (ULP) [19], Activation Clustering (AC) [23], and ABS [22]. For a fair comparison, we employ the same batch size for the optimization-based approaches, including NC, ABS, and ULP.</p>
          <p>The results are shown in Table 3, where we report the best results of our pipeline, which uses IVA with a RF classifier (IVA-RF). Our method outperforms all the competing methods by a wide margin in terms of both CE-Loss and ROC-AUC score. IVA-RF obtains a ROC-AUC of 0.91, which is higher than the next-best ULP by a margin of 0.06. AC shows the lowest ROC-AUC, as it works well only for certain types of trigger attacks. Moreover, IVA-RF has the tightest confidence interval and a lower CE-Loss, meaning our pipeline is more robust than the competing algorithms.</p>
          <p>Object detection</p>
          <p>The majority of backdoor attack detection techniques for image classification do not work for object detection. In addition, the object detection model's output (a large number of objects) differs from the image classification model's (a predicted class). The only SOTA method we have found to compare our algorithm with is detector cleanse (DC) [24], and the results are shown in Table 4. Similar to image classification, IVA-RF outperforms DC with a higher ROC-AUC and lower CE-Loss.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>4.2.3. Eficiency of the methods</title>
          <p>It's critical that backdoor detection techniques are effective, because they may end up being a standard component of ML operations. Table 5 shows the time in seconds required to make decisions for backdoor detection. Our method tends to be faster than NC, ABS, ULP (image classification), and DC (object detection) by an order of magnitude, due to the fact that our approach is model agnostic and only extracts features from model weights for detection. Although AC's running duration is close to ours, it is noticeably less accurate, as seen in Table 3. Because of this, our approach can achieve an efficiency-accuracy balance that none of the other algorithms can.</p>
          <p>[Table 5: computation time of the methods in seconds, for the image and object detection datasets.]</p>
          <p>As we have applied PCA for dimensionality reduction before IVA, an ablation study was conducted to see the impact of PCA. Figure 4 shows the ROC-AUC scores when we do not use PCA and with different numbers of PCA components. The classifier performance degrades significantly when we do not use PCA, as IVA has to handle the noisy data to extract features. However, we preserved 90% of the variance of the data by using P = 4 and P = 10 components for the image and object datasets, respectively. When we use lower or higher numbers of components the score drops, as we lose information with fewer components and add noisy components with more.</p>
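<p>The 90%-variance criterion for choosing the model order can be expressed directly in scikit-learn, which, given a fractional n_components, keeps the smallest number of components whose cumulative explained-variance ratio reaches that threshold; a toy sketch with a synthetic low-rank stand-in for a weight matrix:</p>

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Toy stand-in for a weight matrix W[k]: rank-5 signal plus small noise.
W = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50)) \
    + 0.01 * rng.normal(size=(200, 50))

pca = PCA(n_components=0.90)   # keep enough components for 90% of the variance
X = pca.fit_transform(W)       # pca.n_components_ is chosen automatically
```

Fixing the variance ratio rather than the component count is what lets the same pipeline pick P = 4 for one dataset and P = 10 for another.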
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion</title>
      <p>Ours is the first work of which we are aware that uses matrix factorization on the weights to detect backdoors in deep networks. Moreover, this is the first pipeline which can detect backdoor models for both image classification and object detection networks. It has a number of advantages, including the fact that it needs no re-training or optimization and is much faster than other state-of-the-art backdoor detectors. Future work will include applications to sequence models such as those used in natural language processing, which should be straightforward from an engineering perspective given that our method uses only the pre-trained weights of the networks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [9] eaWvu.atoJsiinaoonnmga,otHtua.scLkvise,hSai.gcLal eiinus,s,tXIEd.EeLeEupotrl,eaRan.rsnLaiucnt,gioPanolsigsooornnitivnhegmhasicniund-
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <source>IEEE Transactions on Vehicular Technology 69</source>
          (
          <year>2020</year>
          )
          <fpage>4439</fpage>
          -
          <lpage>4449</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dolan-Gavitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garg</surname>
          </string-name>
          , Badnets: Identify-
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <article-title>ing vulnerabilities in the machine learning model</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <article-title>supply chain</article-title>
          ,
          <source>arXiv preprint arXiv:1708.06733</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          (
          <year>2017</year>
          ). [11]
          <string-name>
            <surname>M. Anderson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Adali</surname>
            ,
            <given-names>X.-L.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
          </string-name>
          , Joint blind source
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>tions on Signal Processing</source>
          <volume>60</volume>
          (
          <year>2011</year>
          ). [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Morcos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Raghu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bengio</surname>
          </string-name>
          , Insights on rep-
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>tion Processing Systems</source>
          <volume>31</volume>
          (
          <year>2018</year>
          ). [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mohri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rostamizadeh</surname>
          </string-name>
          , Algo-
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <volume>13</volume>
          (
          <year>2012</year>
          ). [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Raghu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gilmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yosinski</surname>
          </string-name>
          , J. Sohl-Dickstein,
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          (
          <year>2017</year>
          ). [1]
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pandey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rautaray</surname>
          </string-name>
          , Application [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , S. Ma,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Aafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhai</surname>
          </string-name>
          , W. Wang,
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>puter science 132</source>
          (
          <year>2018</year>
          )
          <fpage>1706</fpage>
          -
          <lpage>1717</lpage>
          . (
          <year>2017</year>
          ). [2]
          <string-name>
            <given-names>Q.</given-names>
            <surname>You</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Luo</surname>
          </string-name>
          , Image [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kiourti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wardega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          , Trojdrl:
          <fpage>eval</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <article-title>of the IEEE conference on computer vision and learning</article-title>
          ,
          <source>in: 2020</source>
          57th ACM/IEEE Design Automa-
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>4651</fpage>
          -
          <lpage>4659</lpage>
          . tion Conference (DAC), IEEE,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>" chitty-chitty-chat bot": Deep learning for</article-title>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Salem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Backes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Badnl: Backdoor attacks against nlp models with semantic-preserving improvements</article-title>
          ,
          <source>in: Annual Computer Security Applications Conference</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>554</fpage>
          -
          <lpage>569</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <article-title>conversational ai</article-title>
          ,
          <source>in: IJCAI</source>
          , volume
          <volume>18</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Esteva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Robicquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ramsundar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kuleshov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          , et al.,
          <article-title>A guide to deep learning in healthcare</article-title>
          ,
          <source>Nature medicine 25</source>
          (
          <year>2019</year>
          )
          <fpage>24</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Viswanath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Neural cleanse: Identifying and mitigating backdoor attacks in neural networks</article-title>
          ,
          <source>in: 2019 IEEE Symposium on Security and Privacy (SP)</source>
          , IEEE,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>F.</given-names>
            <surname>Monti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Frasca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mannion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Bronstein</surname>
          </string-name>
          ,
          <article-title>Fake news detection on social media using geometric deep learning</article-title>
          , arXiv preprint arXiv:1902.06673 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>X.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <article-title>Deep learning for event-driven stock prediction</article-title>
          ,
          <source>in: Twenty-fourth International Joint Conference on Artificial Intelligence</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Kolouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pirsiavash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <article-title>Universal litmus patterns: Revealing backdoor attacks in cnns</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Q.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Frtunikj</surname>
          </string-name>
          ,
          <article-title>Deep learning for self-driving cars: Chances and challenges</article-title>
          ,
          <source>in: Proceedings of the 1st International Workshop on Software Engineering for AI in Autonomous Systems</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nepal</surname>
          </string-name>
          ,
          <article-title>Strip: A defence against trojan attacks on deep neural networks</article-title>
          ,
          <source>in: Proceedings of the 35th Annual Computer Security Applications Conference</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. E.</given-names>
            <surname>Sagduyu</surname>
          </string-name>
          ,
          <article-title>Evasion and causative attacks with adversarial deep learning</article-title>
          ,
          <source>in: MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>243</fpage>
          -
          <lpage>248</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Chou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pellegrino</surname>
          </string-name>
          ,
          <article-title>Sentinet: Detecting localized universal attacks against deep learning systems</article-title>
          ,
          <source>in: 2020 IEEE Security and Privacy Workshops (SPW)</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>48</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Aafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Abs: Scanning neural networks for back-doors by artificial brain stimulation</article-title>
          ,
          <source>in: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1265</fpage>
          -
          <lpage>1282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Baracaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ludwig</surname>
          </string-name>
          , et al.,
          <article-title>Detecting backdoor attacks on deep neural networks by activation clustering</article-title>
          , arXiv preprint arXiv:1811.03728 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Baddet: Backdoor attacks on object detection</article-title>
          ,
          <source>arXiv preprint arXiv:2205.14497</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chazelle</surname>
          </string-name>
          ,
          <article-title>The fast johnson-lindenstrauss transform and approximate nearest neighbors</article-title>
          ,
          <source>SIAM Journal on computing 39</source>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Eftekhari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Babaie-Zadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Moghaddam</surname>
          </string-name>
          ,
          <article-title>Two-dimensional random projection</article-title>
          ,
          <source>Signal Processing 91</source>
          (
          <year>2011</year>
          )
          <fpage>1589</fpage>
          -
          <lpage>1603</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhinge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Long</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. D.</given-names>
            <surname>Calhoun</surname>
          </string-name>
          ,
          <article-title>to sensorimotor task data</article-title>
          ,
          <source>in: 2022 56th Annual Conference on Information Sciences and Systems (CISS)</source>
          , IEEE,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Acar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Roald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. D.</given-names>
            <surname>Calhoun</surname>
          </string-name>
          ,
          <source>Frontiers in neuroscience 16</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <article-title>Data mining: practical machine learning tools and techniques with java implementations</article-title>
          ,
          <source>Acm Sigmod Record</source>
          <volume>31</volume>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>