<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Less is More: Data Pruning for Faster Adversarial Training</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yize</forename><surname>Li</surname></persName>
							<email>li.yize@northeastern.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Northeastern University</orgName>
								<address>
									<addrLine>360 Huntington Ave</addrLine>
									<postCode>02115</postCode>
									<settlement>Boston</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pu</forename><surname>Zhao</surname></persName>
							<email>p.zhao@northeastern.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Northeastern University</orgName>
								<address>
									<addrLine>360 Huntington Ave</addrLine>
									<postCode>02115</postCode>
									<settlement>Boston</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Xue</forename><surname>Lin</surname></persName>
							<email>xue.lin@northeastern.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Northeastern University</orgName>
								<address>
									<addrLine>360 Huntington Ave</addrLine>
									<postCode>02115</postCode>
									<settlement>Boston</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Bhavya</forename><surname>Kailkhura</surname></persName>
							<email>kailkhura1@llnl.gov</email>
							<affiliation key="aff1">
								<orgName type="institution">Lawrence Livermore National Laboratory</orgName>
								<address>
									<addrLine>7000 East Ave</addrLine>
									<postCode>94550</postCode>
									<settlement>Livermore</settlement>
									<region>CA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ryan</forename><surname>Goldhahn</surname></persName>
							<email>goldhahn1@llnl.gov</email>
							<affiliation key="aff1">
								<orgName type="institution">Lawrence Livermore National Laboratory</orgName>
								<address>
									<addrLine>7000 East Ave</addrLine>
									<postCode>94550</postCode>
									<settlement>Livermore</settlement>
									<region>CA</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Less is More: Data Pruning for Faster Adversarial Training</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">7A414B5368FE64CEA8723E4669746B5D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-04-29T06:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Adversarial Robustness</term>
					<term>Adversarial Data Pruning</term>
					<term>Efficient Adversarial Training</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Deep neural networks (DNNs) are sensitive to adversarial examples, resulting in fragile and unreliable performance in the real world. Although adversarial training (AT) is currently one of the most effective methodologies to robustify DNNs, it is computationally very expensive (e.g., 5 ∼ 10× costlier than standard training). To address this challenge, existing approaches focus on single-step AT, referred to as Fast AT, to reduce the overhead of adversarial example generation. Unfortunately, these approaches are known to fail against stronger adversaries. To make AT computationally efficient without compromising robustness, this paper takes a different view of the efficient AT problem. Specifically, we propose to minimize redundancies at the data level by leveraging data pruning. Extensive experiments demonstrate that data pruning based AT can achieve similar or superior robust (and clean) accuracy to its unpruned counterparts while being significantly faster. For instance, the proposed strategies accelerate training up to 3.44× on CIFAR-10 and up to 2.02× on CIFAR-100. Additionally, the data pruning methods can readily be reconciled with existing adversarial acceleration tricks to obtain striking speed-ups of 5.66× and 5.12× on CIFAR-10, and 3.67× and 3.07× on CIFAR-100, with TRADES and MART, respectively.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Deep neural networks (DNNs) achieve great success in various machine learning tasks, such as image classification <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, object detection <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>, and language modeling <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>. However, the reliability and security concerns of DNNs limit their wide deployment in real-world applications. For example, imperceptible perturbations added to inputs by adversaries (known as adversarial examples) <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref> can cause incorrect predictions during inference. Therefore, many research efforts are devoted to designing robust DNNs against adversarial examples <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12]</ref>.</p><p>Adversarial Training (AT) <ref type="bibr" target="#b12">[13]</ref> is one of the most effective defense approaches for improving adversarial robustness. AT is formulated as a min-max problem, with the inner maximization aiming to generate adversarial examples, and the outer minimization aiming to train a model on them. However, to achieve a stronger defense, iterative AT must generate stronger adversarial examples with more steps in the inner problem, leading to high computational costs. In response to this difficulty, a number of approaches investigate efficient AT, such as Fast AT <ref type="bibr" target="#b13">[14]</ref> and its variants <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref> based on single-step adversarial attacks. 
Unfortunately, these cheaper training approaches are known to attain poor performance against stronger adversaries and to suffer from 'catastrophic overfitting' <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b16">17]</ref>, where Projected Gradient Descent (PGD) robustness is gained at the beginning of training but the robust accuracy later drops suddenly to 0. In this regard, there does not seem to exist a satisfactory solution that achieves optimal robustness at moderate computational cost.</p><p>In this paper, we propose to overcome the above limitation by exploring a new perspective: leveraging data pruning during AT. Differing from the prior Fast AT-based solutions that focus on the AT algorithm, we attain efficiency by selecting a representative subset of training samples and performing AT on this smaller dataset.</p><p>Although several recent works explore data pruning for efficient standard training (see <ref type="bibr" target="#b17">[18]</ref> for a survey), data pruning for efficient AT is not well investigated. To the best of our knowledge, the most relevant work is <ref type="bibr" target="#b18">[19]</ref>, which speeds up AT via loss-based data pruning. However, random sub-sampling outperforms their data pruning scheme in terms of clean accuracy, robustness, and training efficiency, raising doubts about the feasibility of that approach. In contrast, we propose to perform data pruning in two ways: 1) by maximizing the log-likelihood of the subset on the validation dataset, and 2) by minimizing the gradient disparity between the subset and the full dataset. We implement these approaches with two AT objectives: TRADES <ref type="bibr" target="#b19">[20]</ref> and MART <ref type="bibr" target="#b20">[21]</ref>. Experimental results show that we can achieve training acceleration up to 3.44× on CIFAR-10 and 2.02× on CIFAR-100. 
In addition, incorporating our proposed data pruning with Bullet-Train <ref type="bibr" target="#b21">[22]</ref>, which allocates dynamic computing cost to categorized training data, further improves the speed-ups to 5.66× and 3.67× on CIFAR-10 and CIFAR-100, respectively. Our main contributions are summarized below.</p><p>• We explore efficient AT through the lens of data pruning, where acceleration is achieved by focusing only on a representative subset of the data. • We propose two data pruning algorithms, Adv-GRAD-MATCH and Adv-GLISTER, and perform a comprehensive experimental study. We demonstrate that our data pruning methods yield consistent effectiveness across diverse robustness evaluations, e.g., PGD <ref type="bibr" target="#b12">[13]</ref> and AutoAttack <ref type="bibr" target="#b22">[23]</ref>. • Furthermore, combining our efficient AT framework with the existing Bullet-Train approach <ref type="bibr" target="#b21">[22]</ref> achieves state-of-the-art training cost.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Adversarial attacks and defenses. Adversarial attacks <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26,</ref><ref type="bibr" target="#b26">27]</ref> refer to detrimental techniques that inject imperceptible perturbations into the inputs and mislead the decision-making process of networks. In this paper, we mainly investigate ℓ𝑝 attacks, where 𝑝 ∈ {0, 1, 2, ∞}. The Fast Gradient Sign Method (FGSM) <ref type="bibr" target="#b23">[24]</ref> is the cheapest one-shot adversarial attack. The Basic Iterative Method (BIM) <ref type="bibr" target="#b27">[28]</ref>, Projected Gradient Descent (PGD) <ref type="bibr" target="#b12">[13]</ref> and CW <ref type="bibr" target="#b24">[25]</ref> are stronger attacks that are iterative in nature. Adversarial examples are used for the assessment of model robustness. AutoAttack <ref type="bibr" target="#b22">[23]</ref> ensembles multiple attack strategies to perform a fair and reliable evaluation of adversarial robustness.</p><p>Various defense methods <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b30">31,</ref><ref type="bibr" target="#b31">32]</ref> have been proposed to tackle the vulnerability of DNNs to adversarial examples. Most of these approaches are built on AT, where perturbed inputs are fed to DNNs so that they learn from adversarial examples. PGD-based AT, which uses a multi-step adversary, is one of the most popular defense strategies <ref type="bibr" target="#b12">[13]</ref>. Training only with adversarial samples can lead to a drop in clean accuracy <ref type="bibr" target="#b32">[33]</ref>. 
To improve the trade-off between accuracy and robustness, TRADES <ref type="bibr" target="#b19">[20]</ref> and MART <ref type="bibr" target="#b20">[21]</ref> compose the training loss from both a natural error term and a robustness regularization term. Curriculum Adversarial Training (CAT) <ref type="bibr" target="#b33">[34]</ref> robustifies DNNs by adjusting the PGD steps from weak to strong attack strength, while Friendly Adversarial Training (FAT) <ref type="bibr" target="#b34">[35]</ref> performs early-stopped PGD to generate adversarial examples.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Efficient adversarial training.</head><p>Despite PGD-based training showing empirical robustness against adversarial examples, its learning overhead is usually dramatically larger than that of standard training, e.g., 5 ∼ 10× the computation depending on the number of steps used in generating adversarial examples. Most work on training efficiency focuses on reducing the number of attack steps and maintaining the stability of one-step FGSM-based AT. Free AT <ref type="bibr" target="#b35">[36]</ref> performs FGSM perturbations and updates model weights on the same mini-batch simultaneously. Fast AT <ref type="bibr" target="#b13">[14]</ref> generates FGSM attacks with random initialization but still suffers from 'catastrophic overfitting'. Therefore, gradient alignment regularization <ref type="bibr" target="#b16">[17]</ref>, a suitable inner interval (step size) for the adversarial direction <ref type="bibr" target="#b15">[16]</ref>, and Fast Bi-level AT (FAST-BAT) <ref type="bibr" target="#b36">[37]</ref> have been proposed to prevent such failures.</p><p>Data pruning. Efficient learning through data subset selection economizes on training resources. Proxy functions <ref type="bibr" target="#b37">[38,</ref><ref type="bibr" target="#b38">39]</ref> take advantage of the feature representation from a tiny proxy model to select the most informative subset for training the larger one. Coreset-based algorithms <ref type="bibr" target="#b39">[40]</ref> mine for a small representative subset that approximates the entire dataset following established criteria. CRAIG <ref type="bibr" target="#b40">[41]</ref> selects a training data subset that approximates the full gradient, and GRAD-MATCH <ref type="bibr" target="#b41">[42]</ref> minimizes the gradient matching error. GLISTER <ref type="bibr" target="#b42">[43]</ref> prunes the training data by maximizing the log-likelihood on the validation set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Data Pruning Based Adversarial Training</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Preliminaries</head><p>AT <ref type="bibr" target="#b12">[13]</ref> aims to solve the following min-max optimization problem:</p><formula xml:id="formula_0">min 𝜃 1 |𝒟| ∑︁ (𝑥,𝑦)∈𝒟 [︂ max 𝛿∈△ ℒ(𝜃; 𝑥 + 𝛿, 𝑦) ]︂ ,<label>(1)</label></formula><p>where 𝜃 is the model parameter, 𝑥 and 𝑦 denote the data sample and label from the training dataset 𝒟, 𝛿 denotes an imperceptible adversarial perturbation injected into 𝑥 under the norm constraint with strength 𝜖, i.e., △ := {‖𝛿‖∞ ≤ 𝜖}, and ℒ is the training loss. During the adversarial procedure, the optimization first solves the inner maximization to generate adversarial examples and then minimizes the outer training error over the model parameter 𝜃. A typical adversarial example generation procedure involves multiple steps for a stronger adversary, e.g.,</p><formula xml:id="formula_2">𝑥 𝑡+1 = Proj △ (︀ 𝑥 𝑡 + 𝛼 sign (︀ ∇ 𝑥 𝑡 ℒ (︀ 𝜃; 𝑥 𝑡 , 𝑦 )︀)︀)︀ ,<label>(2)</label></formula><p>where at step 𝑡 the update takes the sign of the gradient with step size 𝛼 and the result is projected back onto the 𝜖-ball.</p></div>
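To make Eqs. (1)–(2) concrete, the inner maximization can be sketched in a few lines of NumPy. This is an illustrative example, not the authors' implementation: the loss is a toy logistic-regression objective chosen so that the input gradient is analytic, and the function names (`pgd_attack`, `logistic_grad_x`) are hypothetical.

```python
import numpy as np

def logistic_loss(theta, x, y):
    """Toy batch loss: binary cross-entropy of a linear model (illustrative only)."""
    p = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return -np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def logistic_grad_x(theta, x, y):
    """Analytic gradient of the loss with respect to the inputs x."""
    p = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return np.outer(p - y, theta)

def pgd_attack(x, y, theta, grad_x, eps=8 / 255, alpha=2 / 255, steps=10):
    """Multi-step PGD (Eq. 2): signed gradient ascent projected onto the l_inf eps-ball."""
    x0, x_adv = x.copy(), x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_x(theta, x_adv, y))  # ascent step
        x_adv = np.clip(x_adv, x0 - eps, x0 + eps)                # projection Proj_triangle
    return x_adv
```

For image inputs one would additionally clip the result to the valid pixel range and start from a random point inside the 𝜖-ball, as in standard PGD-based AT.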
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">General Formulation for Adversarial Data Pruning</head><p>Our adversarial data pruning consists of two steps: adversarial subset selection and AT with the subset of data.</p><p>At specified epochs, adversarial subset selection first finds a representative subset of the entire training dataset. Next, AT is performed on the selected subset. Though the subset size remains the same across rounds, the contents of the subset are updated at each selection round based on the current model weights. We formulate AT with the data subset in Eq. ( <ref type="formula" target="#formula_3">3</ref>) and adversarial subset selection in Eq. ( <ref type="formula" target="#formula_4">4</ref>).</p><formula xml:id="formula_3">min 𝜃 1 𝑘 ∑︁ (𝑥,𝑦)∈𝒮 [︂ max 𝛿∈△ ℒ(𝜃; 𝑥 + 𝛿, 𝑦) ]︂ ,<label>(3)</label></formula><formula xml:id="formula_4">min 𝒮⊆𝒟,|𝒮|=𝑘 𝐺(𝒮)<label>(4)</label></formula><p>where 𝒟 represents the complete training set and 𝛿 represents the perturbation under the 𝑙∞ norm constraint △. The selected subset 𝒮 of size 𝑘 is obtained by optimizing the function 𝐺, which aims to narrow the difference between 𝒟 and 𝒮 under specific criteria with model parameters 𝜃. Note that the data selection step is performed periodically to achieve computational savings.</p><p>Recent data subset selection schemes, GRAD-MATCH <ref type="bibr" target="#b41">[42]</ref> and GLISTER <ref type="bibr" target="#b42">[43]</ref>, have made significant contributions toward efficiently achieving high clean accuracy. We extend these approaches to the context of adversarial robustness. Motivated by GLISTER <ref type="bibr" target="#b42">[43]</ref>, we first consider training a subset that obtains the optimal adversarial log-likelihood on the validation set in Eq.
( <ref type="formula" target="#formula_5">5</ref>), defined as Adv-GLISTER:</p><formula xml:id="formula_5">𝐺(𝒮) = ∑︁ (𝑥 𝑉 ,𝑦 𝑉 )∈𝒱 ℒ𝑉 (𝜃𝑆; 𝑥𝑉 + 𝛿 * 𝑉 , 𝑦𝑉 )<label>(5)</label></formula><p>where ℒ𝑉 is the negative log-likelihood on the validation set and 𝛿 * 𝑉 is the adversarial perturbation obtained by maximizing ℒ𝑉 (𝜃𝑆; 𝑥𝑉 + 𝛿𝑉 , 𝑦𝑉 ).</p><p>Another adversarial data pruning approach is inspired by GRAD-MATCH <ref type="bibr" target="#b41">[42]</ref>, which aims to find the data subset whose gradients closely match those of the full training data. Adv-GRAD-MATCH is formulated as Eq. ( <ref type="formula" target="#formula_6">6</ref>):</p><formula xml:id="formula_6">𝐺(𝒮) = ‖ ∑︁ (𝑥 𝑆 ,𝑦 𝑆 )∈𝒮 𝑤∇ 𝜃 ℒ𝒮 (𝜃; 𝑥𝑆 + 𝛿 * 𝑆 , 𝑦𝑆) − ∑︁ (𝑥 𝐷 ,𝑦 𝐷 )∈𝒟 ∇ 𝜃 ℒ𝒟 (𝜃; 𝑥𝐷 + 𝛿 * 𝐷 , 𝑦𝐷)‖<label>(6)</label></formula><p>where 𝑤 is the weight vector associated with each instance 𝑥𝑆 in the subset 𝒮; ℒ𝒮 and ℒ𝒟 denote the training loss over the subset and the entire dataset; and 𝛿 * 𝑆 and 𝛿 * 𝐷 are the adversarial perturbations obtained by maximizing ℒ𝒮 (𝜃; 𝑥𝑆 + 𝛿𝑆, 𝑦𝑆) and ℒ𝒟 (𝜃; 𝑥𝐷 + 𝛿𝐷, 𝑦𝐷), respectively. During data selection, the adversarial gradient difference between the weighted subset loss and the complete dataset loss is minimized to produce the optimal subset and corresponding weights.</p></div>
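The overall procedure, periodic subset selection alternating with AT on the subset, can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's algorithm: a single uniform weight 𝑛/𝑘 stands in for the learned weight vector 𝑤, a naive greedy search replaces the orthogonal matching pursuit used by GRAD-MATCH, and the callbacks `per_example_grads` and `train_fn` are hypothetical placeholders for computing per-example adversarial gradients and running one epoch of AT.

```python
import numpy as np

def greedy_gradient_match(G, k):
    """Toy version of the gradient-matching criterion in Eq. (6): greedily pick k
    rows of G (an n x d matrix of per-example adversarial gradients) whose
    uniformly reweighted sum best approximates the full-data gradient sum."""
    n = len(G)
    target = G.sum(axis=0)                     # gradient over the full dataset D
    selected, partial = [], np.zeros(G.shape[1])
    remaining = list(range(n))
    for m in range(1, k + 1):
        scale = n / m                          # uniform weight standing in for w
        errs = [np.linalg.norm(target - scale * (partial + G[i])) for i in remaining]
        best = remaining.pop(int(np.argmin(errs)))
        selected.append(best)
        partial += G[best]
    return np.array(selected)

def prune_and_train(per_example_grads, k, epochs, interval, train_fn):
    """Alternate adversarial subset selection (Eq. 4) with AT on the subset (Eq. 3).

    per_example_grads(epoch) -> n x d adversarial gradients at the current weights;
    train_fn(epoch, idx) -> one epoch of AT on the subset indexed by idx.
    The subset size k is fixed, but its contents are refreshed every `interval`
    epochs based on the current model. Returns the number of selection rounds.
    """
    idx = greedy_gradient_match(per_example_grads(0), k)   # initial subset
    n_selections = 1
    for epoch in range(epochs):
        if epoch > 0 and epoch % interval == 0:            # periodic re-selection
            idx = greedy_gradient_match(per_example_grads(epoch), k)
            n_selections += 1
        train_fn(epoch, idx)
    return n_selections
```

When 𝑘 = 𝑛 the greedy selection recovers the full dataset and the matching error in Eq. (6) vanishes, which is a quick sanity check on the criterion.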
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experiment Setup</head><p>To evaluate the efficiency and generality of the proposed method, we apply the adversarial training loss functions of TRADES <ref type="bibr" target="#b19">[20]</ref> and MART <ref type="bibr" target="#b20">[21]</ref> to ResNet-18 <ref type="bibr" target="#b44">[45]</ref> on the standard datasets CIFAR-10 and CIFAR-100 <ref type="bibr" target="#b43">[44]</ref>. Our adversarial data pruning methods include Adv-GRAD-MATCH and Adv-GLISTER with data portions (subset sizes) of 30% and 50%, trained for 100 and 200 epochs with a selection interval of 20 (i.e., adversarial subset selection is performed every 20 epochs of AT). For Adv-GLISTER, the original training dataset is divided into a training set (90%) and a validation set (10%). The optimizer is SGD with momentum 0.9 and weight decay 2e-4 for TRADES and 3.5e-3 for MART. For Adv-GRAD-MATCH and Adv-GLISTER, the initial learning rates are 0.01 and 0.02 on CIFAR-10 and 0.08 and 0.05 on CIFAR-100, respectively.</p><p>Besides the original TRADES <ref type="bibr" target="#b19">[20]</ref> and MART <ref type="bibr" target="#b20">[21]</ref> methods, we also compare our approach with Bullet-Train <ref type="bibr" target="#b21">[22]</ref>. The PGD attack <ref type="bibr" target="#b12">[13]</ref> (PGD-50-10) is adopted for evaluating robust accuracy, ranging from low magnitude (𝜖 = 4/255) to high magnitude (𝜖 = 16/255), with 50 iterations and 10 restarts at step size 𝛼 = 2/255 under the 𝑙∞ norm. Moreover, AutoAttack <ref type="bibr" target="#b22">[23]</ref> is leveraged for a reliable robustness evaluation. Additionally, our methods can also be combined with Bullet-Train <ref type="bibr" target="#b21">[22]</ref>; we term these combinations Adv-GRAD-MATCH&amp;Bullet and Adv-GLISTER&amp;Bullet.</p></div>
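The PGD-50-10 protocol counts a sample as robust only if it is classified correctly under every restart, i.e., accuracy is measured against the best attack over all restarts. A minimal sketch of this evaluation logic follows; the callback names (`predict`, `attack`) are hypothetical, and the actual evaluation would plug in a 50-step PGD with fresh random initialization per restart.

```python
import numpy as np

def robust_accuracy(X, Y, theta, predict, attack, restarts=10):
    """Worst-case accuracy over several attack restarts: a point counts as
    robust only if it survives all of them (as in PGD-50-10)."""
    still_correct = np.ones(len(X), dtype=bool)
    for _ in range(restarts):
        X_adv = attack(X, Y, theta)            # e.g. 50-step PGD, fresh random init
        still_correct &= (predict(theta, X_adv) == Y)
    return still_correct.mean()
```

With the identity "attack" this reduces to clean accuracy, which is a useful sanity check when wiring up the harness.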
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Main Results</head><p>Table <ref type="table" target="#tab_0">1</ref> shows the results of our Adv-GLISTER and Adv-GRAD-MATCH for TRADES compared with the original TRADES and Bullet-Train methods. The comparison is in terms of clean and robust accuracy (under two attack methods, the PGD attack <ref type="bibr" target="#b12">[13]</ref> and AutoAttack <ref type="bibr" target="#b22">[23]</ref>) along with the training speed-up. We observe that, compared to the baselines, the training efficiency of our methods improves significantly on CIFAR-10, at the cost of a decrease in clean accuracy and in robustness under AutoAttack and PGD attacks for most values of 𝜖. Notably, for 𝜖 = 16/255, the robust accuracy improves from 16.05% (Bullet-Train <ref type="bibr" target="#b21">[22]</ref>) to 16.52% and 17.49% with our Adv-GLISTER and Adv-GRAD-MATCH, indicating our defensive capability against powerful attacks. As displayed in Table <ref type="table" target="#tab_0">1</ref>, our Adv-GLISTER and Adv-GRAD-MATCH reduce the training overhead (seconds per epoch) enormously and achieve 3.45× and 3.01× training speed-ups, respectively. After combining our approaches with Bullet-Train <ref type="bibr" target="#b21">[22]</ref>, an even faster acceleration of 5.66× can be reached. On CIFAR-100, the validity of our schemes is consistent as well. The drop in both clean and robust accuracy might stem from our data pruning schemes struggling with the dimensionality and complexity of the dataset. Regardless, our schemes still yield conspicuous computation savings compared with the other baselines.</p><p>To understand the robustness behavior of our schemes, we track the dynamics of the outlier, robust, and boundary sets (similar to <ref type="bibr" target="#b21">[22]</ref>) using the PGD-5-1 attack. Without any attack, the outlier examples are already misclassified by the model, while boundary and robust examples are correctly identified. After adversarial attacks, boundary examples are incorrectly classified while robust examples are still correctly classified. Fig. 1 displays the dynamics of the outlier, boundary, and robust examples on CIFAR-10 for various schemes. During model training and data selection, the number of robust samples gradually increases and eventually dominates, while the number of outliers and boundary data points decreases over epochs, revealing similar behavior for TRADES-based AT and the data pruning based methods. In addition, the final proportions of the three sets explain the clean accuracy and robustness degradation of our approaches: the two baselines obtain more robust samples and fewer boundary and outlier examples.</p><p>We further evaluate the performance of adversarial data pruning based on the loss of MART in Table <ref type="table" target="#tab_1">2</ref>. Results are consistent with our findings on TRADES in Table <ref type="table" target="#tab_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Ablation Studies</head><p>Epoch. We first consider the number of training epochs. Table <ref type="table" target="#tab_2">3</ref> shows that longer training improves both clean and robust accuracy. Due to the shrinking data size, more epochs are required to enhance data-efficient adversarial learning, in alignment with standard data pruning training. However, 100-epoch training appears to be sufficient for the small dataset.</p><p>Subset Size. We experiment with different subset sizes. Moving from an extremely small subset (10% of the full training set) to a larger subset (70%) in Fig. <ref type="figure" target="#fig_5">2</ref>, we observe that robust accuracy gradually increases toward that of the full dataset. (When the subset size is 100%, data pruning is not applied and the speed-up is measured against the baselines, TRADES or MART.) This highlights the benefit of pruning with an appropriate subset size. Taking overall efficiency into account, 30% is an appropriate choice of subset size for CIFAR-10.</p><p>Number of selection rounds. In Sec. 4.2, our experiments perform adversarial data pruning every 20 epochs (with 9 selections). Here we present the results of data pruning every 40 epochs (with 4 selections). As shown in Table <ref type="table" target="#tab_3">4</ref>, 9 selections achieve better clean and robust accuracy with comparable acceleration.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion and Future Work</head><p>In this paper, we investigated efficient adversarial training from a data-pruning perspective. With comprehensive experiments, we demonstrated that the proposed adversarial data pruning approaches eliminate substantial computational overhead while remaining competitive with the existing baselines in clean and robust accuracy. These positive results pave a path for future research on accelerating AT by minimizing redundancy at the data level. Our future work will focus on designing more accurate pruning schemes for large-scale datasets.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Tracking of adversarial robustness during 200 epochs of training. Red, Green and Blue denote outlier, robust and boundary examples, respectively.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: PGD evaluation (𝜖 = 8/255) with the corresponding speed-up under different subset sizes for 100 epoch CIFAR-10 training.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>TRADES results where data pruning methods use only 30% data points on CIFAR-10 and 50% data points on CIFAR-100 for 100 epochs of training.</figDesc><table><row><cell>Dataset</cell><cell>Method</cell><cell>Clean</cell><cell>PGD 4/255</cell><cell>PGD 8/255</cell><cell>PGD 16/255</cell><cell>AutoAttack</cell><cell>Time/epoch (Speed-up)</cell></row><row><cell>CIFAR-10</cell><cell>TRADES [20]</cell><cell>82.73</cell><cell>69.17</cell><cell>51.83</cell><cell>19.43</cell><cell>49.06</cell><cell>416.20 (-)</cell></row><row><cell></cell><cell>Bullet [22]</cell><cell>84.60</cell><cell>70.24</cell><cell>50.82</cell><cell>16.05</cell><cell>47.93</cell><cell>193.06 (2.16×)</cell></row><row><cell></cell><cell>Adv-GLISTER (Ours)</cell><cell>77.62</cell><cell>63.06</cell><cell>46.06</cell><cell>16.52</cell><cell>41.61</cell><cell>120.70 (3.45×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH (Ours)</cell><cell>75.67</cell><cell>61.85</cell><cell>45.96</cell><cell>17.49</cell><cell>42.19</cell><cell>138.19 (3.01×)</cell></row><row><cell></cell><cell>Adv-GLISTER&amp;Bullet (Ours)</cell><cell>79.21</cell><cell>63.02</cell><cell>44.52</cell><cell>13.33</cell><cell>40.77</cell><cell>72.91 (5.66×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH&amp;Bullet (Ours)</cell><cell>77.57</cell><cell>62.00</cell><cell>45.13</cell><cell>14.65</cell><cell>41.94</cell><cell>87.38 (4.76×)</cell></row><row><cell>CIFAR-100</cell><cell>TRADES [20]</cell><cell>55.85</cell><cell>40.31</cell><cell>27.35</cell><cell>10.71</cell><cell>23.39</cell><cell>387.72 (-)</cell></row><row><cell></cell><cell>Bullet [22]</cell><cell>59.43</cell><cell>42.23</cell><cell>28.08</cell><cell>9.40</cell><cell>23.85</cell><cell>173.59 (2.23×)</cell></row><row><cell></cell><cell>Adv-GLISTER (Ours)</cell><cell>51.26</cell><cell>37.16</cell><cell>24.78</cell><cell>9.49</cell><cell>20.57</cell><cell>202.7 (1.91×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH (Ours)</cell><cell>51.03</cell><cell>37.17</cell><cell>24.60</cell><cell>9.70</cell><cell>20.42</cell><cell>206.05 (1.88×)</cell></row><row><cell></cell><cell>Adv-GLISTER&amp;Bullet (Ours)</cell><cell>53.54</cell><cell>37.24</cell><cell>23.91</cell><cell>7.69</cell><cell>20.02</cell><cell>105.66 (3.67×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH&amp;Bullet (Ours)</cell><cell>52.98</cell><cell>36.92</cell><cell>24.24</cell><cell>8.01</cell><cell>20.17</cell><cell>105.61 (3.67×)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>MART results where data pruning methods use only 30% data points on CIFAR-10 and 50% data points on CIFAR-100 for 100 epochs of training.</figDesc><table><row><cell>Dataset</cell><cell>Method</cell><cell>Clean</cell><cell>PGD 4/255</cell><cell>PGD 8/255</cell><cell>PGD 16/255</cell><cell>AutoAttack</cell><cell>Time/epoch (Speed-up)</cell></row><row><cell>CIFAR-10</cell><cell>MART [21]</cell><cell>80.96</cell><cell>68.21</cell><cell>52.59</cell><cell>19.52</cell><cell>46.94</cell><cell>329.54 (-)</cell></row><row><cell></cell><cell>Bullet [22]</cell><cell>85.29</cell><cell>70.92</cell><cell>50.64</cell><cell>13.33</cell><cell>43.77</cell><cell>199.42 (1.65×)</cell></row><row><cell></cell><cell>Adv-GLISTER (Ours)</cell><cell>71.97</cell><cell>60.13</cell><cell>46.25</cell><cell>16.59</cell><cell>39.86</cell><cell>95.68 (3.44×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH (Ours)</cell><cell>73.67</cell><cell>61.35</cell><cell>47.07</cell><cell>18.16</cell><cell>40.98</cell><cell>106.51 (3.09×)</cell></row><row><cell></cell><cell>Adv-GLISTER&amp;Bullet (Ours)</cell><cell>73.87</cell><cell>59.89</cell><cell>44.01</cell><cell>14.20</cell><cell>38.99</cell><cell>64.31 (5.12×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH&amp;Bullet (Ours)</cell><cell>78.78</cell><cell>64.42</cell><cell>46.72</cell><cell>13.50</cell><cell>39.53</cell><cell>77.11 (4.27×)</cell></row><row><cell>CIFAR-100</cell><cell>MART [21]</cell><cell>54.85</cell><cell>39.24</cell><cell>25.08</cell><cell>8.59</cell><cell>22.66</cell><cell>307.43 (-)</cell></row><row><cell></cell><cell>Bullet [22]</cell><cell>57.44</cell><cell>39.22</cell><cell>24.14</cell><cell>6.66</cell><cell>21.55</cell><cell>187.73 (1.64×)</cell></row><row><cell></cell><cell>Adv-GLISTER (Ours)</cell><cell>46.36</cell><cell>34.37</cell><cell>24.01</cell><cell>9.20</cell><cell>19.79</cell><cell>152.11 (2.02×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH (Ours)</cell><cell>48.07</cell><cell>36.19</cell><cell>26.11</cell><cell>10.79</cell><cell>21.24</cell><cell>153.86 (2.00×)</cell></row><row><cell></cell><cell>Adv-GLISTER&amp;Bullet (Ours)</cell><cell>52.13</cell><cell>35.07</cell><cell>20.67</cell><cell>5.64</cell><cell>18.21</cell><cell>100.22 (3.07×)</cell></row><row><cell></cell><cell>Adv-GRAD-MATCH&amp;Bullet (Ours)</cell><cell>52.46</cell><cell>35.81</cell><cell>22.20</cell><cell>6.48</cell><cell>18.68</cell><cell>113.03 (2.72×)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>100 vs. 200 epoch TRADES CIFAR-10 results with ResNet-18 using 30% of data points, with the robustness regularization factor set to 1.</figDesc><table><row><cell>Method</cell><cell>Epoch</cell><cell>Clean</cell><cell>PGD 4/255</cell><cell>PGD 8/255</cell><cell>PGD 16/255</cell><cell>AutoAttack</cell></row><row><cell>Adv-GLISTER</cell><cell>100</cell><cell>77.62</cell><cell>63.06</cell><cell>46.06</cell><cell>16.52</cell><cell>41.61</cell></row><row><cell>Adv-GRAD-MATCH</cell><cell>100</cell><cell>75.61</cell><cell>60.81</cell><cell>45.76</cell><cell>17.49</cell><cell>42.19</cell></row><row><cell>Adv-GLISTER</cell><cell>200</cell><cell>78.76</cell><cell>64.15</cell><cell>46.11</cell><cell>16.92</cell><cell>42.43</cell></row><row><cell>Adv-GRAD-MATCH</cell><cell>200</cell><cell>75.75</cell><cell>61.24</cell><cell>46.49</cell><cell>18.55</cell><cell>43.63</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>TRADES results on CIFAR-10 with ResNet-18, using 30% of the data samples under different selection counts for 200-epoch training.</figDesc><table><row><cell>Method</cell><cell>Number of selections</cell><cell>Clean</cell><cell>PGD 4/255</cell><cell>PGD 8/255</cell><cell>PGD 16/255</cell><cell>AutoAttack</cell><cell>Speed-up</cell></row><row><cell>TRADES</cell><cell>-</cell><cell>83.32</cell><cell>68.91</cell><cell>49.64</cell><cell>17.31</cell><cell>47.53</cell><cell>-</cell></row><row><cell>Adv-GLISTER</cell><cell>4</cell><cell>75.80</cell><cell>60.48</cell><cell>44.62</cell><cell>16.07</cell><cell>40.44</cell><cell>3.15×</cell></row><row><cell>Adv-GRAD-MATCH</cell><cell>4</cell><cell>73.80</cell><cell>60.43</cell><cell>46.06</cell><cell>18.33</cell><cell>43.03</cell><cell>2.83×</cell></row><row><cell>Adv-GLISTER</cell><cell>9</cell><cell>78.76</cell><cell>64.15</cell><cell>46.11</cell><cell>16.92</cell><cell>42.43</cell><cell>2.93×</cell></row><row><cell>Adv-GRAD-MATCH</cell><cell>9</cell><cell>75.75</cell><cell>61.24</cell><cell>46.49</cell><cell>18.55</cell><cell>43.63</cell><cell>2.75×</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgment</head><p>This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 and was supported by LLNL-LDRD Program under Project No. 20-SI-005 (LLNL-CONF-842760).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Self-training with noisy student improves imagenet classification</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Luong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Sharpness-aware minimization for efficiently improving generalization</title>
		<author>
			<persName><forename type="first">P</forename><surname>Foret</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kleiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mobahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Neyshabur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Object detection with deep learning: A review</title>
		<author>
			<persName><forename type="first">Z.-Q</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-T</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.1109/TNNLS.2018.2876865</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks and Learning Systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="3212" to="3232" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A survey of modern deep learning based object detection models</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S A</forename><surname>Zaidi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Ansari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Aslam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kanwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Asghar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Digital Signal Processing</title>
		<imprint>
			<biblScope unit="volume">126</biblScope>
			<biblScope unit="page">103514</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">U</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Improving language models by retrieving from trillions of tokens</title>
		<author>
			<persName><forename type="first">S</forename><surname>Borgeaud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mensch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rutherford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Millican</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">B</forename><surname>Van Den Driessche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-B</forename><surname>Lespiau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Damoc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>De Las Casas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Menick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ring</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hennigan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Maggiore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cassirer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Brock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Paganini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Irving</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Osindero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Rae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Elsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sifre</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 39th International Conference on Machine Learning (ICML)</title>
				<meeting>the 39th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models</title>
		<author>
			<persName><forename type="first">P.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-J</forename><surname>Hsieh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ACM Workshop on Artificial Intelligence and Security</title>
				<meeting>the ACM Workshop on Artificial Intelligence and Security</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Generating adversarial examples with adversarial networks</title>
		<author>
			<persName><forename type="first">C</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-Y</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI)</title>
				<meeting>the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">On adaptive attacks to adversarial example defenses</title>
		<author>
			<persName><forename type="first">F</forename><surname>Tramer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Carlini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Brendel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Madry</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples</title>
		<author>
			<persName><forename type="first">A</forename><surname>Athalye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Carlini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wagner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning (ICML)</title>
				<meeting>the 35th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Provable defenses against adversarial examples via the convex outer adversarial polytope</title>
		<author>
			<persName><forename type="first">E</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Kolter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning (ICML)</title>
				<meeting>the 35th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Provably robust deep learning via adversarially trained smoothed classifiers</title>
		<author>
			<persName><forename type="first">H</forename><surname>Salman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Razenshteyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bubeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Towards deep learning models resistant to adversarial attacks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Madry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Makelov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tsipras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vladu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Fast is better than free: Revisiting adversarial training</title>
		<author>
			<persName><forename type="first">E</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rice</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Z</forename><surname>Kolter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Single-step adversarial training with dropout scheduling</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">S</forename><surname>Vivek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Venkatesh Babu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Understanding catastrophic overfitting in single-step adversarial training</title>
		<author>
			<persName><forename type="first">H</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)</title>
				<meeting>the AAAI Conference on Artificial Intelligence (AAAI)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="8119" to="8127" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Understanding and improving fast adversarial training</title>
		<author>
			<persName><forename type="first">M</forename><surname>Andriushchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Flammarion</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">R</forename><surname>Bartoldson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kailkhura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Blalock</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2210.06640</idno>
		<title level="m">Compute-efficient deep learning: Algorithmic trends and opportunities</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Efficient adversarial training with data pruning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kaufmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Shumailov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mullins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Papernot</surname></persName>
		</author>
		<idno>arXiv</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Theoretically principled trade-off between robustness and accuracy</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Xing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">E</forename><surname>Ghaoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning (ICML)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Improving adversarial robustness requires revisiting misclassified examples</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bailey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Gu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Bullettrain: Accelerating robust neural network training via boundary example mining</title>
		<author>
			<persName><forename type="first">W</forename><surname>Hua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Suh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks</title>
		<author>
			<persName><forename type="first">F</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning (ICML)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">J</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shlens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<idno>arXiv</idno>
		<title level="m">Explaining and harnessing adversarial examples</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Towards evaluating the robustness of neural networks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Carlini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wagner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Symposium on Security and Privacy (S&amp;P)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Sparse and imperceivable adversarial attacks</title>
		<author>
			<persName><forename type="first">F</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)</title>
				<meeting>the IEEE/CVF International Conference on Computer Vision (ICCV)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Beyond imagenet attack: Towards crafting adversarial examples for black-box domains</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xue</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<title level="m" type="main">Adversarial examples in the physical world</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kurakin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.1607.02533</idno>
		<ptr target="https://arxiv.org/abs/1607.02533" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Magnet: A two-pronged defense against adversarial examples</title>
		<author>
			<persName><forename type="first">D</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security</title>
				<meeting>the 2017 ACM SIGSAC Conference on Computer and Communications Security</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Defense against adversarial attacks using high-level representation guided denoiser</title>
		<author>
			<persName><forename type="first">F</forename><surname>Liao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<meeting>the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Adversarial defense by restricting the hidden space of deep neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Mustafa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hayat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Goecke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Shao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)</title>
				<meeting>the IEEE/CVF International Conference on Computer Vision (ICCV)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Reverse engineering of imperceptible adversarial image perturbations</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Is robustness the cost of accuracy? – A comprehensive study on the robustness of 18 deep image classification models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the European Conference on Computer Vision (ECCV)</title>
				<meeting>the European Conference on Computer Vision (ECCV)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Curriculum adversarial training</title>
		<author>
			<persName><forename type="first">Q.-Z</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Song</surname></persName>
		</author>
		<idno type="DOI">10.24963/ijcai.2018/520</idno>
		<ptr target="https://doi.org/10.24963/ijcai.2018/520" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)</title>
				<meeting>the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="3740" to="3747" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Attacks which do not kill training make adversarial learning stronger</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Niu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sugiyama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kankanhalli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 37th International Conference on Machine Learning (ICML)</title>
				<meeting>the 37th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Adversarial training for free!</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shafahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Najibi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Ghiasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dickerson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Studer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">S</forename><surname>Davis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Taylor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Goldstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems (NeurIPS)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Revisiting and advancing fast adversarial training through the lens of bi-level optimization</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Khanduri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning (ICML)</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">Selection via proxy: Efficient data selection for deep learning</title>
		<author>
			<persName><forename type="first">C</forename><surname>Coleman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mussmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mirzasoleiman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bailis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaharia</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=HJg2b0VYDr" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Learning from less data: A unified data subset selection and active learning framework for computer vision</title>
		<author>
			<persName><forename type="first">V</forename><surname>Kaushal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Iyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kothawade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mahadev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Doctor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ramakrishnan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)</title>
				<meeting>the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Feldman</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-29349-9_2</idno>
		<ptr target="https://doi.org/10.1007/978-3-030-29349-9_2" />
		<title level="m">Core-Sets: Updated Survey</title>
				<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="23" to="44" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">Coresets for data-efficient training of machine learning models</title>
		<author>
			<persName><forename type="first">B</forename><surname>Mirzasoleiman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bilmes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 37th International Conference on Machine Learning (ICML)</title>
				<meeting>the 37th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b41">
	<analytic>
		<title level="a" type="main">Grad-match: Gradient matching based data subset selection for efficient deep model training</title>
		<author>
			<persName><forename type="first">K</forename><surname>Killamsetty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sivasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ramakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>De</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Iyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 38th International Conference on Machine Learning (ICML)</title>
				<meeting>the 38th International Conference on Machine Learning (ICML)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<analytic>
		<title level="a" type="main">Glister: Generalization based data subset selection for efficient and robust learning</title>
		<author>
			<persName><forename type="first">K</forename><surname>Killamsetty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sivasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ramakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Iyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)</title>
				<meeting>the AAAI Conference on Artificial Intelligence (AAAI)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="8110" to="8118" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b43">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
		<title level="m">Learning multiple layers of features from tiny images</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, University of Toronto</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s thesis</note>
</biblStruct>

<biblStruct xml:id="b44">
	<analytic>
		<title level="a" type="main">Identity mappings in deep residual networks</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European conference on computer vision (ECCV)</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="630" to="645" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
