<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bab: A novel algorithm for training clean model based on poisoned data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chen Chen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haibo Hong</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tao Xiang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mande Xie</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jun Shao</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, Chongqing University (CQU)</institution>
          ,
          <addr-line>400044</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science, Zhejiang Gongshang University (ZJSU)</institution>
          ,
          <addr-line>310018</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Information and Electronic Engineering, Zhejiang Gongshang University (ZJSU)</institution>
          ,
          <addr-line>310018</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Nowadays, machine learning is performing very well in the field of computer vision and natural language processing. However, recent research results indicate that machine learning models are extremely vulnerable to various malicious attacks, among which backdoor attacks are favored by attackers because of their easy deployment and high success rate. In fact, the attacker only needs to put a small amount of malicious data in the training dataset, so that the model triggers abnormal behavior under certain circumstances. In this work, we propose a BAB (backdoor against backdoor) algorithm for training a clean model on poisoned data. The BAB algorithm mainly relies on two characteristics of the backdoors: 1) Multiple backdoors can coexist well in the model 2) When there are multiple backdoors in the same model, the strongest backdoor can make the weaker backdoor inefective. Therefore, we implant a backdoor in the poisoned dataset, and rely on the output performance to refine a training dataset that contains almost no poisoned data, so as to train a clean model with high accuracy. In the experimental part, we test five current mainstream backdoor poisoning attacks. Our experimental results reveal that the BAB algorithm has a remarkable efect on filtering poisoned data: we succeed in obtaining a clean dataset containing less than 0.1% poisoned data, and train a high-precision model with this dataset. Our code is open source in https://gitee.com/dugu1076/bab-algorithm.git.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        or third-party purchases to obtain training data, which
gives the attackers many opportunities to carry out the
At present, neural networks are gradually applied in vari- backdoor poisoning attack. Unfortunately, most of the
ous fields such as image classification[
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ] and natural existing defense methods are based on anomaly checking
language processing[
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Meanwhile, these ubiquitous of the trained model and then repairing the anomalous
deep learning systems indeed induce various security model [
        <xref ref-type="bibr" rid="ref18">18, 19, 20</xref>
        ], or filtering the anomalous output of
problems, such as evasion attacks[
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ], model stealing the trained model [21], which are not applicable to the
attacks[
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], membership inference attacks[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and stage when the model has not yet been trained. In order
backdoor attacks[
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], etc. Malicious attackers can to reduce the losses caused by such attacks, we wonder:
utilize these attacks to steal private information or even Is it possible to isolate a completely clean dataset from the
make the system misjudgment in some cases, resulting poisoned dataset and employ it to train a clean model?
in immeasurable losses. In this article, we focus more Intuitively, this is not a very simple task. One reason
on the backdoor poisoning attack. Compared with or- is the unexplainability of neural networks. The essence
dinary data poisoning attacks[
        <xref ref-type="bibr" rid="ref14 ref15 ref16">14, 15, 16</xref>
        ], the backdoor of the neural network model is the combination of linear
poisoning attack does not afect the accuracy of the orig- transformation and nonlinear transformation between
inal task, but adds backdoors to the model that are only matrices, and these single transformations have no
practitriggered in specific situations. The backdoor poison- cal and specific meaning, which also makes it impossible
ing conditions are very easy to implement: contaminate to detect abnormalities directly from the internal
parampart of the training dataset (such as adding a patch) and eters of the model. In addition, the constant update of the
modifying the contaminated data label, a simple back- backdoor poisoning attack renders the means of manual
door attack is done [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The trained model behaves verification inefective. Early backdoor attacks[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] have
the same as the normal model when it encounters be- obvious disadvantages in that the triggers can be
sepanign input, but when non-benign input (with triggers) rated by human eyes. However, with the deepening of
is provided, the model behaves abnormally. To make research, SIG[22], Refool[23], CBA[24] and other attacks
matters worse, due to the deepening of the model depth, have been introduced. The triggers and target labels of
training a high-precision neural network model often such poisoned data are integrated into the dataset in a
requires a large dataset. Many trainers rely on crawlers very reasonable way, which makes manual verification
impossible.
      </p>
      <p>In this paper, we propose a backdoor against backdoor
(BAB) algorithm that is able to filter a clean dataset and
SafeAI’23: The AAAI’s Workshop on Artificial Intelligence Safety, Feb
13-14, 2023, Washington, D.C.
$ chenchen990404@163.com (C. Chen)</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License train a clean model without any prior knowledge of the
CPWrEooUrckReshdoinpgs IhStpN:/c1e6u1r3-w-0s.o7r3g ACttEribUutRion W4.0oInrtekrnsahtioonpal (PCCroBYce4.0e).dings (CEUR-WS.org) backdoor data distribution in the dataset. We divide the
task of training a clean model into two stages. The first
stage is the filtering of clean dataset. In this stage, we take
advantage of two inherent characteristics of backdoor
attacks to distinguish clean data from poisoned data. The
second stage is the standard model training process using
the filtered clean dataset. Our main contributions are as
follows:</p>
      <sec id="sec-1-1">
        <title>In order to verify the performance of our BAB algo</title>
        <p>
          rithm, we select three representative dirty label attacks:
BadNets[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], Blend[25] and CBA[24], and two
representative clean label attacks: SIG[22] and Refool[23] in this
paper.
        </p>
        <sec id="sec-1-1-1">
          <title>2.2. Defense</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>This section mainly introduces several backdoor attack
methods and defense knowledge that rely on the
backdoor poisoning attack.</p>
      <p>• We put forward a new perspective on the coex- The defense against backdoor attacks mainly consists of
istence of multiple backdoors and exploit the in- two aspects. 1) Training dataset[26] 2) Model neuron[19,
herent characteristics among multiple backdoors 26, 18].
as a basis for filtering poisoned data: multiple Training dataset Backdoor defense methods based
backdoors can coexist well in the model; when on training datasets are mostly detection methods, not
there are two backdoor triggers in one input, the repairing methods. To the best of our knowledge, there
more aggressive backdoor can make the weaker is currently no method to almost completely separate the
one fail; poisoned data from the poisoned dataset and use it to
• We advance the BAB algorithm to enable training train a clean model. This is mainly due to 1) For the
declean models from poisoned data. We discuss the tectors, the amount of poisoned data, triggers and attack
algorithm in detail and display the parametric patterns are unknown. 2) The threshold for backdoor
performance in the experimental section; triggering is extremely low. Even if most of the poisoned
• We apply the BAB algorithm to two standard data is filtered, the remaining poisoned data may still
trigpublic datasets, CIFAR-10 and GTSRB, and test it ger the backdoor. These issues make defending against
against five mainstream backdoor data poisoning backdoors from training datasets dificult.
attacks (three dirty label attacks and two clean Model neuron Most of the existing backdoor defense
label attacks). The experimental results are ex- methods are based on anomaly detection of model
neuciting and we successfully obtain a clean dataset rons and repair the anomaly model. But this kind of
with a poisoning rate of less than 0.1%, and obtain defense is not practical. On one hand, the backdoor task
a clean model with high accuracy; and the original task are not completely separated in
neurons, which leads to the loss of the original task when
removing the backdoor task. On the other hand,
repairing requires the original dataset or a small amount of
clean datasets, which is not realistic in some cases.</p>
      <p>In this paper, we put forward the BAB (backdoor
against backdoor) algorithm. Unlike most existing
defense methods, our method does not detect and repair
model neurons. Instead, we filter the poisoned data
directly from the data source, and employ the filtered data
to train a clean model. Our algorithm bridges the gap in
the field of backdoor defense from the training dataset.</p>
      <p>Moreover, compared with the model neuron-based
inpainting method, our BAB algorithm has less loss for the
original task and only needs the original dataset.</p>
      <sec id="sec-2-1">
        <title>2.1. Backdoor Poisoning Attack</title>
        <p>
          The backdoor poisoning attack mainly relies on
introducing some malicious data into the training dataset,
which is consistent with the normal model training
during the model training phase. Existing backdoor attacks
are mainly divided into two categories: 1) dirty label
attacks 2) clean label attacks. The earliest dirty label
attacks [
          <xref ref-type="bibr" rid="ref17">17, 25, 24</xref>
          ] mainly rely on modifying the label and
adding a trigger, such as a single pixel, a square or a more
complex pattern, but these simple attack methods are
often found by manual inspection. To increase the stealth
of the backdoor attack, the attacker optimizes the trigger
to incorporate it into the clean data in a reasonable form,
such as invisible noise and mixed mode. Unlike dirty
label attacks, clean label attacks aim to optimize labels to
bypass manual verification of labels, that is, to achieve
attack results without modifying labels. Such attacks can
bypass most existing detection schemes due to their weak
aggressiveness.
2.3. NAD
NAD[27] is a proven and efective way to remove
backdoors, and it mainly on a small number of clean dataset
and uses model distillation to fine-tune the attention
mechanism of the teacher model, so that the teacher
model no longer pays attention to the backdoor area, so
as to eliminate the backdoor.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Problem Statement</title>
    </sec>
    <sec id="sec-4">
      <title>4. Method</title>
      <p>3.1. Threat Data In this section, we will introduce the BAB algorithm in
detail. Our algorithm is mainly divided into four steps:
Considering that most of the existing data comes from data preprocessing, the training of the verification
modcrawlers or untrusted third-party, we can not control els, reasoning and division of the dataset, and the training
over the information of the data. Hence, in this paper, of the formal model. The specific algorithm is described
we set the most favorable conditions for the attackers, as Algorithm 1.
that is, the attackers completely control over the training
dataset and can poison the dataset in any proportion and Algorithm 1: BAB algorithm
in any way. While as the defender, only this batch of
data can be obtained, and the information such as the
poisoning rate and poisoning method is unknown.</p>
      <sec id="sec-4-1">
        <title>3.2. Assumption</title>
        <sec id="sec-4-1-1">
          <title>In this section, we will take an example to elicit our</title>
          <p>hypothesis. Here, we take MNIST as the dataset and
BadNets as the poisoning method. Suppose that the
trainer has a poisoned dataset D, where D sufers two
non-conflicting backdoor poisoning attacks ( D = Dclean ∪
Dpoision_1 ∪ Dpoision_2). The trainer draws a random
proportion(such as 50%) of the dataset from D each time
to train a model set M = {0, 1, . . . ,  }. Selects
 ( ∈ D) to input M, when it is a clean data sample
( ∈ Dclean), since training only uses a small amount of
data, the output on model set M should be messy, as
shown in Fig. 1(A). When there is only one backdoor
trigger ( ∈ poision_1 ∪poision_2), the output of the model set
M should all point to the target activated by the trigger, as
shown in Fig. 1(B). When there are two backdoor triggers
( ∈ Dpoision_1 ∩ Dpoision_2), the strength of the backdoor
is not constant due to diferent training data, which also
leads to the situation as shown in Fig. 1(C). The output
of the model set M should be the target activated by the
two triggers.</p>
          <p>In view of the above facts, we speculate that if a certain
proportion of known backdoors are put into a batch of
poisoned data sets and randomly select data to train a
batch of models, when backdoor triggers are added to the
data, the models’ judgment on clean data should all point
to the newly added backdoor class, and for poisoned data,
the output class should not only contain the newly added
backdoor pointing target.</p>
          <p>However, we must consider the following situations.
If the backdoor generated by the poisoner is very weak,
it may cause that even for the poisoned data, models all
point to newly implanted backdoor classes. This results
in the omission of poisoned data. Therefore, we need to
control the strength of the implanted backdoor, insert
a very weak backdoor into the model, but still can be
successfully activated by the trigger. In this way, we can
make the judgment between clean data and poisoned
data.</p>
          <p>Initialization: Target{0, 1, . . . , },
Mver ∈ {Mver_0, Mver_1, . . . , Mver_N},Trigger  ,
Data1{0, 1, . . .} →↦− Original Data
Data2{0, 1, . . .} →↦− Partial Data1 Carrying
Triggers  to attack target  + 1
Data3{0, 1, . . .} →↦− All Data1 Carrying
Triggers  ;
for M in Mver do
for 1...epochs do</p>
          <p>M.forward(2);
loss=ℒ ;
loss.backward();
end
end
for x in Data3 do
for M in Mver do
if M()! =  + 1 then</p>
          <p>Poisioned Data →↦−
else</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Continue;</title>
          <p>end
end
end
Return Clean Dataset
remove</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.1. Data Preprocessing</title>
        <sec id="sec-4-2-1">
          <title>Firstly, we need to preprocess the data, as shown in Fig. 2.</title>
          <p>We extract a small portion (such as 10%) of the data, add
triggers of arbitrary shape, and modify the model labels
to new classes (preventing the same targets as poisoning
attacks) and shufle the data to generate a new dataset.
After that, we randomly select  small parts (such as 50%)
of the dataset as training data. Besides, we need an entire
dataset plus this trigger for inference and partitioning
the data in reasoning and division of the dataset.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>4.2. Verification Model Training</title>
        <sec id="sec-4-3-1">
          <title>Secondly, we need to train a batch of verification mod</title>
          <p>els, as shown in Fig 3. Put the dataset generated in the
previous step into a batch of simple network models for
a small number of iterations. In order to create a
backdoor that is as weak as possible but can be successfully
activated, we reduce the neuron activation degree gap ( (tri), tri) ensures that the backdoor can be triggered
as much as possible when inputting poisoned data and correctly, and 2(,  tri) minimizes the gap between the
clean data. In addition, We choose one or more layers backdoor data and the clean data in the neural network,
of neurons to suppress their activation, and make cer- so that the backdoor we generate is as weak as
possitain improvements on the original loss function, just like ble. After extensive experiments, we find that fitting
Equation 1. the penultimate layer (the previous layer of the softmax)
works best in the same network layer.</p>
          <p>Backdoor 4.3. Inference</p>
          <p>(1)
where  is the trained model;  and tri are the original The most important step is the division of poisoned data
target and the backdoor attack target, respectively;  and and clean data, as shown in Fig. 4. Through simple
train tri are the activation values of clean data and backdoor ing, we get a batch of simple neural network verification
data, respectively;  is a hyper-parameter used to coordi- models. Then, we feed the verification data sequentially
nate the activation of inhibitory neurons. In Equation 1, into the verification model. When the verification data
passes all verification models, we consider the data to be both adopt the open source code of the original paper.
clean; otherwise, the data is illegally poisoned by others. We have not exploited any data augmentation techniques
There will be some accidental injuries due to the accu- to avoid side efects on attack success rate. In subsequent
racy of the implanted backdoor, but this is inevitable. In experiments, we mainly take the CIFAR-10 dataset as the
subsequent experiments, we find that these accidental test dataset because its data distribution is more uniform.
injuries are acceptable in small amounts. Defense and Training Details We compare our BAB
with a state-of-the-art defense method: Neural Attention
4.4. Formal Model Training Distillation (NAD)[27]. For NAD, we follow the
configuration specified in original papers.</p>
          <p>After the above steps, a batch of clean data can be ob- NAD We take open source code 3 as a base for
extentained, and a clean model can be obtained by training in sions. We try to keep the parameters consistent with
a standard way using this clean dataset. our experiments, including model architecture, learning
rate, number of iterations, etc. In addition, following the
5. Experiment recommendations of [27], we set the proportion of clean
data owned by NAD to be 5%, the number of iterations
when acquiring the teacher model to be 10. When using
5.1. Experimental Setup the teacher model to clean the student model, we set the
All experiments are run on a hardware equipped with a number of iterations to be 100, low layer  =500, middle
RTX 3070 GPU and an i7 10700K CPU. layer  =1000, high layer  =1000.</p>
          <p>
            Attack Configurations We consider 5 backdoor at- BAB On the CIFAR-10 dataset, we set N=10,  =0.2,
tacks in our experiments, including three dirty label R=0.5, and use the Adam optimizer to train the
verificaattacks: BadNets[
            <xref ref-type="bibr" rid="ref17">17</xref>
            ], Blend attack[25] and composite tion models for 5 epochs, set the learning rate to be 0.01,
backdoor attack (CBA)[24], two clean label attacks: nat- the number of iterations of each verification model to be
ural reflection (Refool)[ 23]. and sinusoidal signal attack 5, and the target of the model embedded in the
verifica(SIG)[22]. We follow the settings suggested by these pa- tion model to be 10. On the GTSRB dataset, we set N=5,
pers and the open-sourced code corresponding to their  =0.3, R=0.5, and use the Adam optimizer to train the
original papers to configure these attack algorithms. verification models for 5 epochs, setting the learning rate
All attacks are evaluated on two benchmark datasets, to be 0.01, the number of iterations for each verification
CIFAR-10[28] and GTSRB[29], with a classical model model to be 5, and the model embedded in the
verificastructures including ResNet-18[
            <xref ref-type="bibr" rid="ref2">2</xref>
            ]. For the backdoor poi- tion models to set a target of 43. In the training phase of
soned data, we train the backdoor model for 100 epochs the formal model, we set the model with a learning rate
using the Adam optimizer and the learning rate is set of 0.001 and an iteration number of 100 epochs. We have
to be 0.01. Considering the uneven distribution of the not used any data augmentation techniques to avoid side
GTSRB dataset, we set the target label of the SIG and efects on attack success rate.
          </p>
          <p>Refool poisoning attacks to be 1, and the target label of Evaluation Metrics We employ two commonly used
the rest of the poisoning attacks to be 0. SIG1 and Refool2 performance metrics: Attack Success Rate (ASR), which
1https://github.com/bboylyg/NAD
2https://github.com/DreamtaleCore/Refool
3https://github.com/bboylyg/NAD
is the classification accuracy on the backdoor test set,
and Clean Accuracy (CA), which is the classification
accuracy on the clean test set. In addition, we calculated
the residual retention rate of clean data (CDR) and the
residual rate of poisoned data(PDR).</p>
        </sec>
      </sec>
      <sec id="sec-4-4">
        <title>5.2. Comparison to Existing Defenses</title>
        <p>addition, we find that when N&gt;10, the poisoned data has
almost been filtered out, the clean data of Refool and
SIG attacks are gradually lost, while the data is relatively
stable in the other three attacks. We believe that it is
because Refool and SIG are clean label attacks that only mix
the trigger pattern (i.e. superimposed sinusoidal signal or
natural reflection) with the background of the poisoned
image, which makes this type of attack relatively weak,
resulting in mass misjudgments. In fact, the BAB
algorithm with the number of models N=10 is suficient to
withstand these five attacks, even when the backdoor
poisoning rate is extremely high, i.e. 70%, or a variety of
backdoor attacks (see Section. 3).</p>
      </sec>
      <sec id="sec-4-5">
        <title>5.3. Number of Verification Models</title>
      </sec>
      <sec id="sec-4-6">
        <title>5.5. Pressure Test</title>
        <p>
          Here, we investigate the efect of the number N of verifi- Here, we test when BAB encounters some extremes. Now
cation models on filtered clean datasets versus residual we know that the BAB algorithm can filter the poisoned
poisoned data on CIFAR-10. Our goal is to keep the clean data well and train a clean model.
dataset as much as possible while filtering the poisoned Therefore, the challenge for the BAB algorithm is
data, so that a clean and more accurate model can be whether the BAB algorithm can still filter out a clean
trained in the formal training phase. We run the BAB al- dataset at a small cost and train a clean model when it
gorithm on N belonging to [
          <xref ref-type="bibr" rid="ref1">1, 20</xref>
          ] and display the amount encounters a large proportion of poisoning or there are
of clean data and the amount of residual poisoned data multiple poisoning attacks. We experiment on 3 attacks,
in Fig. 5. Obviously, it is found that there is a trade-of BadNets, Blend and CBA on CIFAR-10, with poisoning
between amount of clean data and amount of residual rates up to 50%/70%, and show the results in Table 3. In
poisoned data. Specifically, as the number of models N addition, we also test for mixed attacks, and the total
increases, the clean dataset will be lost along with the poisoning rate is as high as 50%/70%, and the results are
poisoned dataset, we find that the loss of clean data sets shown in Table 4. We find that even at 70% poisoning rate,
is mainly due to the suppression of neurons in the im- our BAB algorithm successfully reduces attack success
planted backdoor, since not every implanted backdoor rate (ASR) from 99.67% to 3.23% for BadNets, 93.18% to
can reach a 100% attack success rate, this causes some 4.89% for CBA, and 100% to 5.15% for Refool, respectively.
data to be mistaken for poisoned data and discarded. In For the mixed attacks, BAB also successfully reduces the
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <p>attack success rate (ASR) from more than 90% to less
than 5% with the mixed attack of BadNets and Refool
and the mixed attack of BadNets and CBA. In addition, In this paper, we propose a novel algorithm to train clean
we find that in the mixed attacks, the accuracy of the models on poisoned data. Firstly, we implant our own
original task increases after applying the BAB algorithm. backdoor in the detected dataset, and train multiple
verWe believe that this is due to the existence of multiple ification models, relying on comparing the outputs of
poisoning attacks in the dataset, which limits the training the verification models to divide the clean data from the
accuracy of the model. After the BAB algorithm filters poisoned data. Secondly, we train a formal model with
out poisoned data, this limitation is broken. Overall, our the partitioned clean dataset. We apply our algorithm
BAB algorithm has good robustness. to two diferent datasets, experimenting with five attack
modalities. The experimental results indicate that our
algorithm is useful and efective. Subsequently, we also
analyze and discuss how to choose the parameters
reasonably and the robustness of the algorithm. Overall,
our work provides a feasible direction for training clean
models on poisoned data.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <sec id="sec-6-1">
        <title>This work is partially supported by the National</title>
        <p>Natural Science Foundation of China (NSFC) (Grant
Nos.61602408,61972352,U1709217) and Zhejiang
Provincial Natural Science Foundation of China under Grant
(Nos.LY19F020005, LY18F020009).
[26] B. Chen, W. Carvalho, N. Baracaldo, H. Ludwig,
B. Edwards, T. Lee, I. Molloy, B. Srivastava,
Detecting backdoor attacks on deep neural networks by
activation clustering, in: SafeAI@ AAAI, 2019.
[27] Y. Li, X. Lyu, N. Koren, L. Lyu, B. Li, X. Ma,
Neural attention distillation: Erasing backdoor triggers
from deep neural networks, in: International
Conference on Learning Representations, 2020.
[28] A. Krizhevsky, G. Hinton, et al., Learning multiple
layers of features from tiny images (2009).
[29] J. Stallkamp, M. Schlipsing, J. Salmen, C. Igel, Man
vs. computer: Benchmarking machine learning
algorithms for trafic sign recognition, Neural
networks 32 (2012) 323–332.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <article-title>Single-label multi-class image classification by deep logistic regression</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>3486</fpage>
          -
          <lpage>3493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Ren,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Deep residual learning for image recognition</article-title>
          ,
          <source>in: Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Brigato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Barz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Iocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Denzler</surname>
          </string-name>
          ,
          <article-title>Image classification with small datasets: Overview and benchmark</article-title>
          , IEEE Access (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mridha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Ohi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hamid</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. M. Monowar</surname>
          </string-name>
          ,
          <article-title>A study on the challenges and opportunities of speech recognition for bengali language</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>55</volume>
          (
          <year>2022</year>
          )
          <fpage>3431</fpage>
          -
          <lpage>3455</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Romanenko</surname>
          </string-name>
          ,
          <article-title>Robust speech recognition for lowresource languages</article-title>
          ,
          <source>Ph.D. thesis</source>
          , Universität Ulm,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xue</surname>
          </string-name>
          ,
          <article-title>Delving into data: Efectively substitute training for black-box attack</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>4761</fpage>
          -
          <lpage>4770</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.-Z. Xu</surname>
          </string-name>
          ,
          <article-title>Lafeat: Piercing through adversarial defenses with latent features</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>5735</fpage>
          -
          <lpage>5745</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          , L. Chen,
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Yong</surname>
          </string-name>
          ,
          <article-title>Simulating unknown target models for query-eficient black-box attacks</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>11835</fpage>
          -
          <lpage>11844</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kariyappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Prakash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Qureshi</surname>
          </string-name>
          ,
          <article-title>Maze: Data-free model stealing attack using zeroth-order gradient estimation</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>13814</fpage>
          -
          <lpage>13823</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          , L. Sun,
          <article-title>Killing two birds with one stone: Stealing model and inferring attribute from bert-based apis</article-title>
          ,
          <source>arXiv preprint arXiv:2105.10909</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Salem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Humbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fritz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Backes</surname>
          </string-name>
          ,
          <article-title>Ml-leaks: Model and data independent membership inference attacks</article-title>
          and defenses on ma- [19]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Viswanath</surname>
          </string-name>
          ,
          <article-title>chine learning models</article-title>
          , in: Network and
          <string-name>
            <surname>Distributed H. Zheng</surname>
            ,
            <given-names>B. Y.</given-names>
          </string-name>
          <string-name>
            <surname>Zhao</surname>
          </string-name>
          ,
          <source>Neural cleanse: Identifying Systems Security Symposium</source>
          <year>2019</year>
          ,
          <string-name>
            <given-names>Internet</given-names>
            <surname>Society</surname>
          </string-name>
          , and
          <article-title>mitigating backdoor attacks in neural networks, 2019</article-title>
          . in: 2019
          <source>IEEE Symposium on Security and Privacy</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , S. Ma,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Aafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <surname>W. Wang,</surname>
          </string-name>
          (SP), IEEE,
          <year>2019</year>
          , pp.
          <fpage>707</fpage>
          -
          <lpage>723</lpage>
          . X. Zhang,
          <source>Trojaning attack on neural networks [20]</source>
          <string-name>
            <surname>LiYige</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <string-name>
            <surname>Lyu</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Koren</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Lyu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <surname>Anti</surname>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>backdoor learning: Training clean models on poi-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Graph backdoor</article-title>
          , in: soned data,
          <source>Advances in Neural Information Pro30th USENIX Security Symposium (USENIX Secu- cessing Systems</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>14900</fpage>
          -
          <lpage>14912</lpage>
          . rity 21),
          <year>2021</year>
          , pp.
          <fpage>1523</fpage>
          -
          <lpage>1540</lpage>
          . [21]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>N. G.</given-names>
            <surname>Marchant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. I.</given-names>
            <surname>Rubinstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alfeld</surname>
          </string-name>
          , Hard S. Nepal,
          <article-title>Strip: A defence against trojan attacks to forget: Poisoning attacks on certified machine on deep neural networks</article-title>
          ,
          <source>in: Proceedings of the unlearning, in: Proceedings of the AAAI Confer- 35th Annual Computer Security Applications Conence on Artificial Intelligence</source>
          , volume
          <volume>36</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>ference</fpage>
          ,
          <year>2019</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>125</lpage>
          .
          <fpage>7691</fpage>
          -
          <lpage>7700</lpage>
          . [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Barni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kallas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tondi</surname>
          </string-name>
          ,
          <article-title>A new backdoor attack</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>M.-H. Van</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Poisoning at- in cnns by training set corruption without label tacks on fair machine learning</article-title>
          , in: International poisoning,
          <source>in: 2019 IEEE International Conference Conference on Database Systems for Advanced Ap- on Image Processing (ICIP)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>105</lpage>
          . plications, Springer,
          <year>2022</year>
          , pp.
          <fpage>370</fpage>
          -
          <lpage>386</lpage>
          . [23]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bailey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lu</surname>
          </string-name>
          , Reflection back-
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lao</surname>
          </string-name>
          , Clpa:
          <article-title>Clean-label poisoning avail- door: A natural backdoor attack on deep neural ability attacks using generative adversarial nets networks</article-title>
          , in: European Conference on Computer (
          <year>2022</year>
          ).
          <source>Vision</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>182</fpage>
          -
          <lpage>199</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dolan-Gavitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garg</surname>
          </string-name>
          , Badnets: Identify- [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <surname>X. Zhang,</surname>
          </string-name>
          <article-title>Composite backdoor ing vulnerabilities in the machine learning model attack for deep neural network by mixing existing supply chain</article-title>
          ,
          <source>arXiv preprint arXiv:1708</source>
          .
          <article-title>06733 benign features</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM</source>
          (
          <year>2017</year>
          ). SIGSAC Conference on Computer and Communi-
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>K.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dolan-Gavitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garg</surname>
          </string-name>
          , Fine-pruning:
          <source>cations Security</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>131</lpage>
          . Defending against backdooring attacks on deep [25]
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>Targeted neural networks, in: International Symposium backdoor attacks on deep learning systems using on Research in Attacks, Intrusions, and Defenses, data poisoning</article-title>
          ,
          <source>arXiv preprint arXiv:1712</source>
          .05526 Springer,
          <year>2018</year>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>294</lpage>
          . (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>