<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using WGAN for Improving Imbalanced Classification Performance</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Snehal</forename><surname>Bhatia</surname></persName>
							<email>bhatias@tcd.ie</email>
							<affiliation key="aff0">
<orgName type="department">School of Computer Science and Statistics</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rozenn</forename><surname>Dahyot</surname></persName>
							<email>dahyotr@tcd.ie</email>
							<affiliation key="aff0">
<orgName type="department">School of Computer Science and Statistics</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using WGAN for Improving Imbalanced Classification Performance</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">57CC115F1AB5E3B61AA8616C2E147B4C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T23:13+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper investigates data synthesis with a Generative Adversarial Network (GAN) for augmenting the amount of data used for training classifiers (in supervised learning) to compensate for class imbalance (when the classes are not represented equally by the same number of training samples). Our data synthesis approach with GAN is compared with data augmentation in the context of image classification. Our experiments show encouraging results in comparison to standard data augmentation schemes based on image transforms.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Image classification is a standard process in image processing, and many machine (deep) learning techniques are routinely evaluated <ref type="bibr" target="#b13">[14]</ref> using labelled image datasets (e.g. FMNIST <ref type="bibr" target="#b15">[16]</ref>, CIFAR-10 <ref type="bibr" target="#b5">[6]</ref>). Imbalance in datasets collected from real-life domains is sometimes unavoidable, due to the cost of collecting and labelling data, privacy issues, or simply rare-event scenarios. This imbalance in the number of training examples per class often has a detrimental effect on the performance of classifiers <ref type="bibr" target="#b8">[9]</ref>. In this research we explore the potential of data synthesis with GANs to address this issue; more specifically, we compare data synthesis with Wasserstein Generative Adversarial Networks (WGANs) against the data augmentation approach traditionally used for training deep learning architectures. GANs and WGANs are first introduced in Section 2, and our approach is introduced next (Sec. 3). We show experimentally that WGAN data augmentation outperforms traditional data augmentation with image transforms on both the FMNIST and CIFAR-10 datasets (Sec. 4).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The Generative Adversarial Network (GAN) framework was first introduced for artificially generating realistic images from scratch <ref type="bibr" target="#b2">[3]</ref>. Since then, GANs have been employed for a variety of image processing and computer vision tasks, such as generating high resolution images from low resolution input images <ref type="bibr" target="#b6">[7]</ref>, texture synthesis in images <ref type="bibr" target="#b7">[8]</ref> and human face synthesis <ref type="bibr" target="#b4">[5]</ref>.</p><p>This generative capacity of GANs makes them suitable for data augmentation, and many recent studies have shown how to tackle the problem of imbalanced datasets in classification by using variations of the GAN architecture. Mariani et al <ref type="bibr" target="#b8">[9]</ref> introduced a new architecture called "BAGAN" or Balancing GAN, and achieved significantly better classification performance than ACGAN <ref type="bibr" target="#b10">[11]</ref> and the simple GAN <ref type="bibr" target="#b2">[3]</ref> when tested on various artificially imbalanced distributions of the MNIST, CIFAR-10, Flowers and GTSRB image datasets. We extend the work of Mariani et al <ref type="bibr" target="#b8">[9]</ref> with GANs (introduced in Section 2.1) by evaluating the performance of the Wasserstein GAN (WGAN and WGAN-GP <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b3">4]</ref>, explained in Section 2.2) with the same methodology (Sec. 3 and 4).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Generative Adversarial Networks</head><p>The original GAN architecture introduced by Goodfellow et al <ref type="bibr" target="#b2">[3]</ref> consists of two competing Artificial Neural Networks (ANNs), namely the 'Generator' (G) and the 'Discriminator' (D). The Generator learns to transfer the distribution of input data (e.g. distribution of noise images) into a target distribution (e.g. distribution of images of horses). At the same time, the Discriminator, which is essentially a binary classifier, is trained to distinguish between the real data points (e.g. real images of horses) and the data points artificially generated by the Generator (e.g. synthesised images of horses created by the Generator transfer function).</p><p>A GAN <ref type="bibr" target="#b2">[3]</ref> is often defined as a two-player minimax game in which the Generator G wants to minimize the cost function whereas the Discriminator D aims to maximize it:</p><formula xml:id="formula_0">min G max D E x∼pr [log D(x)] + E x∼pg [log(1 − D(x))]<label>(1)</label></formula><p>where x = G(z) with z ∼ p(z) (the input of the generator, sampled from a simple noise distribution such as Uniform or Normal), and p r is the (real) data distribution <ref type="bibr" target="#b3">[4]</ref>. For an optimal discriminator, the GAN value function (Equation ( <ref type="formula" target="#formula_0">1</ref>)) is, up to constants, the Jensen-Shannon Divergence between the real data and the artificially generated data.</p><p>The training process of a GAN can be viewed as a double feedback loop: the discriminator is in a feedback loop with the real images, and the generator uses the feedback from the discriminator to learn how to produce images that are realistic enough to fool the discriminator. The feedback process is repeated throughout the training phase, until Nash Equilibrium is achieved<ref type="foot" target="#foot_0">1</ref>.
This process is called Adversarial Training.</p><p>Although Vanilla GANs have achieved state-of-the-art results in many domains, they suffer from certain drawbacks:</p><p>-Nash Equilibrium is Hard to Achieve: The training of a GAN is based on gradient descent, with the generator and discriminator trained simultaneously to find a Nash Equilibrium. However, since both models update their loss functions concurrently and independently, there is no guarantee of convergence.</p><p>-Vanishing Gradient Problem: Training a GAN loss function poses a dilemma. If the discriminator is trained perfectly (especially early on in the training process), then D(x real ) = 1 and D(x fake ) = 0. From Equation (1) it can be observed that in this case the value of the loss function becomes 0, and there is no gradient left to update the generator during the training iterations, leading to the vanishing gradient problem. However, if the discriminator is not trained to perfection, the generator does not receive relevant feedback and cannot learn to generate realistic images.</p><p>-Mode Collapse: This is a common failure observed in GANs where the generator reaches a state in which it always produces the same images as outputs. This may in some cases be enough to fool the discriminator, but the low variety of generated images is not representative of the complexity of the real-world data distribution <ref type="bibr" target="#b0">[1]</ref>.</p><p>Therefore, various modifications of the original GAN have been proposed, one of which is introduced below.</p></div>
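The value function of Equation (1) can be illustrated numerically. The following toy NumPy snippet (purely illustrative, not the architecture used in this paper) estimates the minimax objective from discriminator outputs and shows why a near-perfect discriminator leaves the generator almost no gradient signal:

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of the objective in Eq. (1):
    E_{x~p_r}[log D(x)] + E_{x~p_g}[log(1 - D(x))].
    d_real, d_fake: discriminator outputs in (0, 1)."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A near-perfect discriminator (D(real) ~ 1, D(fake) ~ 0) drives the value
# towards 0, its maximum -- the flat region behind the vanishing-gradient
# problem described above.
near_perfect = gan_value(np.array([0.999, 0.999]), np.array([0.001, 0.001]))
# An undecided discriminator (D = 0.5 everywhere) gives -2 log 2.
undecided = gan_value(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The gap between these two values illustrates the training dilemma discussed above: the better the discriminator, the flatter the objective seen by the generator.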
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Wasserstein GAN (WGAN) and WGAN-GP</head><p>The WGAN architecture proposed by Arjovsky et al <ref type="bibr" target="#b1">[2]</ref> replaces the Jensen-Shannon divergence of the original GAN architecture <ref type="bibr" target="#b2">[3]</ref> with the Wasserstein Distance. The Wasserstein Distance is also called the "Earth Mover Distance", as it can be informally interpreted as the minimal cost of moving and transforming some quantity of mass (say, a pile of dirt) from the shape of one probability distribution P to that of another probability distribution Q. The cost of moving is calculated as the product of the amount of mass moved and the distance by which it has been moved; W(P, Q) then measures the distance between the probability distributions P and Q.</p><p>The WGAN value function corresponds to:</p><formula xml:id="formula_1">min G max D∈D E x∼pr [D(x)] − E x∼pg [D(x)]<label>(2)</label></formula><p>In Equation ( <ref type="formula" target="#formula_1">2</ref>), D denotes the set of 1-Lipschitz functions<ref type="foot" target="#foot_1">2</ref>, meaning that the discriminator must satisfy the Lipschitz constraint. Gulrajani et al <ref type="bibr" target="#b3">[4]</ref> extended WGAN to WGAN-GP (gradient penalty) to improve the training of WGANs. The weight-clipping method originally used to enforce the Lipschitz constraint introduces several problems, such as vanishing gradients (when the clipping window is too large) and slow convergence (when the clipping window is too small), and it is not well suited to very complex data <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b3">4]</ref>. One solution is to impose a gradient penalty instead of weight clipping as a means to enforce the Lipschitz constraint <ref type="bibr" target="#b3">[4]</ref>. The loss function for WGAN-GP can be observed in Equation <ref type="formula" target="#formula_2">(3)</ref>.
Here, L represents the critic (discriminator) loss, x̂ represents a sample from the fake or generated data, and the x̂ in the penalty term represents randomly sampled data. Note that a "soft penalty" is imposed (i.e. only on the randomly sampled data) to prevent tractability issues. The last term in the equation is the penalty term, with λ being the penalty coefficient.</p><formula xml:id="formula_2">L = E x̂∼pg [D(x̂)] − E x∼pr [D(x)] + λE x̂∼p x̂ [(∥∇ x̂ D(x̂)∥ 2 − 1) 2 ]<label>(3)</label></formula><p>p x̂ is defined by sampling uniformly along straight lines between pairs of points drawn from the real and generated data distributions (see <ref type="bibr" target="#b3">[4]</ref> for details).</p><p>Generally, for a 1-Lipschitz function, the maximum gradient norm should be 1. Therefore, instead of applying weight clipping, the WGAN-GP loss function imposes a penalty whenever the gradient norm moves away from its target value of 1.</p></div>
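As a toy illustration of Equation (3) (not the authors' implementation; in practice the gradient term is obtained by automatic differentiation), consider a linear critic D(x) = w·x, whose gradient with respect to its input is exactly w everywhere, so the gradient penalty can be evaluated in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.6, 0.8])  # toy linear critic D(x) = w . x, so grad_x D(x) = w

def critic(x):
    return x @ w

def wgan_gp_loss(x_real, x_fake, lam=10.0):
    """Critic loss of Eq. (3): E[D(fake)] - E[D(real)] + lam * penalty,
    with interpolates sampled uniformly on lines between real/fake pairs [4]."""
    eps = rng.uniform(size=(len(x_real), 1))
    x_hat = eps * x_real + (1 - eps) * x_fake  # points on connecting lines
    # For a linear critic the gradient norm is ||w|| at every x_hat, so the
    # penalty does not depend on where x_hat lands on the line.
    grad_norm = np.linalg.norm(w)
    penalty = (grad_norm - 1.0) ** 2
    return critic(x_fake).mean() - critic(x_real).mean() + lam * penalty
```

Since ||w|| = 1 here, the penalty term vanishes and the loss reduces to the Wasserstein estimate; rescaling w away from unit norm makes the λ-weighted penalty kick in, exactly the mechanism that replaces weight clipping.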
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Method</head><p>In this study, we compare the traditional image data augmentation technique of geometric and photometric transformations <ref type="bibr" target="#b12">[13]</ref> with our proposed technique of using Generative Adversarial Networks (specifically, WGAN-GP <ref type="bibr" target="#b3">[4]</ref>) for the same purpose. We first artificially introduce imbalance in the two benchmark balanced datasets FMNIST <ref type="bibr" target="#b15">[16]</ref> and CIFAR-10 <ref type="bibr" target="#b5">[6]</ref>; the resulting class distributions can be observed in Figure <ref type="figure">1</ref>. To effectively study the effects of dataset imbalance on classification performance, we use a hybrid CNN-SVM architecture. The overall workflow of the study can be observed in Figure <ref type="figure">2</ref>. The classifier is trained separately on each of these datasets, and the trained models are used for evaluation. No change is made to the test sets of these datasets.</p><p>Fig. <ref type="figure">2</ref>: Workflow of our study.</p></div>
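A minimal sketch of how imbalance can be introduced artificially into a balanced training set (the minority-class list and retention fraction below are hypothetical; the distributions actually used are those shown in Figure 1, and the test set is left untouched):

```python
import numpy as np

def make_imbalanced(x, y, minority, keep_frac, seed=0):
    """Sub-sample each class listed in `minority` down to `keep_frac` of its
    original size; all other classes are kept whole."""
    rng = np.random.default_rng(seed)
    keep = np.ones(len(y), dtype=bool)
    for cls in minority:
        idx = np.where(y == cls)[0]
        n_drop = int(len(idx) * (1 - keep_frac))
        drop = rng.choice(idx, size=n_drop, replace=False)
        keep[drop] = False
    return x[keep], y[keep]
```

For example, keeping only 10% of one class in an otherwise balanced three-class set yields the kind of skewed per-class frequency plotted in Figure 1.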
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Oversampling using Image Transformations</head><p>Some of the most common geometric and photometric transformations applied to images to generate additional data while preserving the context of the image are rotation, translation, shearing, scaling, flipping, zooming, blurring, whitening, etc. <ref type="bibr" target="#b12">[13]</ref>. This is achieved using the ImageDataGenerator class of the Keras image pre-processing module, and a sample of images generated using this methodology can be observed in Figure <ref type="figure">3</ref>.</p></div>
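The study uses Keras's ImageDataGenerator for these transforms; two of them (flips and small integer translations) can be sketched in plain NumPy, assuming greyscale images stored as 2-D arrays:

```python
import numpy as np

def augment(img, rng):
    """Apply a random horizontal flip and a small translation -- two of the
    geometric transforms listed above (Keras's ImageDataGenerator offers
    the full set, including rotation, shearing, zooming, etc.)."""
    out = img
    if rng.random() < 0.5:
        out = out[:, ::-1]                   # horizontal flip
    dy, dx = rng.integers(-2, 3, size=2)     # shift by up to 2 pixels
    out = np.roll(out, (int(dy), int(dx)), axis=(0, 1))
    return out

def oversample(images, n_new, seed=0):
    """Generate n_new extra minority-class samples by transforming
    randomly chosen originals."""
    rng = np.random.default_rng(seed)
    picks = rng.integers(0, len(images), size=n_new)
    return np.stack([augment(images[i], rng) for i in picks])
```

This is the scheme illustrated on the Dress sample of Figure 3: each new image is an original minority-class image seen through a random transform.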
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Oversampling using WGAN-GP</head><p>WGAN-GP <ref type="bibr" target="#b3">[4]</ref> is a superior GAN architecture which is known to achieve convergence without facing the issues of vanishing or exploding gradients. Its training process is very stable, and it has also proven able to generate highly diverse data samples with low noise. It can achieve high quality results with almost no hyperparameter tuning, making it a suitable choice for a wide variety of applications and datasets. We have therefore adopted the WGAN-GP architecture for our study.</p><p>The general methods for GAN architecture construction and the choices made in our model are described below:</p><p>-Use of Leaky ReLU: The Rectified Linear Unit (ReLU) activation function is a widely adopted, efficient activation function that returns the input directly when it is positive and 0 otherwise. However, the best practice for GANs is to use a variation called LeakyReLU, which returns a small, non-zero output for negative inputs instead of cutting them off at 0. In our architecture, we use LeakyReLU in both the generator and the discriminator, with the negative slope set to the default value of 0.2.</p><p>-Use of Batch Normalization: Batch normalization is a technique used to improve the speed, performance and stability of neural networks by normalizing and scaling the activations of each layer. We employ batch normalization after the convolution layers.
In the case of GANs, it helps avoid vanishing and exploding gradients, as well as mode collapse.</p><p>-Using Gaussian Weight Initialization: Before starting the training process, the weights (parameters) of the neural network must be initialized with small random values to prevent the activation layers from producing vanishing or exploding outputs, which would cause very small or very large gradient updates and give rise to convergence problems. It is considered good practice to initialize all weights using a zero-centred Gaussian distribution, with mean 0 and variance 1/N, where N is the number of input neurons. We have therefore used Xavier initialization in our architecture, which is based on the same principle.</p><p>-Not using Max Pooling: A max-pooling layer is often used in CNNs after each convolution layer to downsample the input and feature maps. However, we do not use this approach: for Generative Adversarial Networks, it has been shown that using only convolutional layers allows the network to learn its own spatial down-sampling, which leads to an increase in performance.</p><p>Qualitative Evaluation Metrics for WGAN-GP Since our goal is to use the WGAN-GP model to generate additional minority class images in order to augment an imbalanced dataset and balance it, we aim to fulfil the following criteria for our generated images through visual inspection <ref type="bibr" target="#b8">[9]</ref>:</p><p>-Generated images should be similar to the other images of the class in question. If this target is not met, it means that the generator is not trained enough to produce quality, realistic images.</p><p>-Generated images must not all be the same, or repetitive. This ensures that the generator does not suffer from the mode collapse problem.</p><p>-Generated images should be different from the images already present in the training set.
If not, it would mean that we have simply trained our generative model to memorise and repeat the training data.</p><p>In this study, the visual inspection has been done by the researchers themselves, by randomly mixing real and generated samples and testing whether the fake samples can be spotted in the mix. The images sampled at various epochs of the WGAN-GP training phase for the class horse can be observed in Figure <ref type="figure" target="#fig_3">5</ref>.</p></div>
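The oversampling step itself can be sketched as follows, where `generate(cls, n)` is a hypothetical stand-in for the trained per-class WGAN-GP generator (the real generator maps noise vectors to images; here only the balancing logic is shown):

```python
import numpy as np
from collections import Counter

def balance_with_generator(x, y, generate, ):
    """Bring every class up to the majority-class count by appending
    synthetic samples.  `generate(cls, n)` is a placeholder for the
    trained WGAN-GP generator of class `cls`."""
    counts = Counter(y.tolist())
    target = max(counts.values())          # majority-class count
    xs, ys = [x], [y]
    for cls, n in counts.items():
        if n < target:                     # minority class: fill the deficit
            xs.append(generate(cls, target - n))
            ys.append(np.full(target - n, cls))
    return np.concatenate(xs), np.concatenate(ys)
```

After this step every class contributes the same number of training samples, which is the balanced dataset fed to the classifier in Figure 2.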
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Classification Architecture</head><p>We adopted the hybrid CNN-SVM architecture proposed by Niu and Suen <ref type="bibr" target="#b9">[10]</ref> as our classifier. In this hybrid architecture, the original CNN (with its output layer) is first trained on the input dataset until convergence is achieved. Then, the output layer is replaced with an SVM using a Radial Basis Function (RBF) kernel. The output of the last CNN hidden layer is taken as a feature vector for training the SVM. Once trained, this SVM performs the classification task on unseen data.</p></div>
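A minimal sketch of this hybrid scheme with scikit-learn, where a hypothetical fixed random projection stands in for the trained CNN's last hidden layer (the actual CNN from the study is not reproduced here):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for the trained CNN feature extractor: any function mapping a
# flattened image to a feature vector (here a ReLU random projection).
W = rng.normal(size=(64, 16))
def cnn_features(batch):            # batch: (n, 64) flattened images
    return np.maximum(batch @ W, 0)

# Two toy, well-separated "classes" playing the role of image data.
x0 = rng.normal(0.0, 0.3, size=(50, 64))
x1 = rng.normal(1.0, 0.3, size=(50, 64))
x = np.vstack([x0, x1])
y = np.array([0] * 50 + [1] * 50)

# The hybrid step: the CNN's softmax output layer is replaced by an
# RBF-kernel SVM trained on the extracted feature vectors.
svm = SVC(kernel="rbf").fit(cnn_features(x), y)
```

At inference time, an unseen image is passed through the same feature extractor and classified by `svm.predict`, mirroring the two-stage pipeline of Niu and Suen.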
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental Results</head><p>New data samples generated by WGAN-GP (Figure <ref type="figure" target="#fig_1">4</ref>) result in a superior augmented (balanced) dataset compared to the dataset obtained by augmentation with images generated using geometric transforms (Figure <ref type="figure">3</ref>). This can be attributed to the following qualities of the WGAN-generated images:</p><p>-They look realistic, and are impossible for a human observer to tell apart from the original dataset images of the same class.</p><p>-They preserve the context or semantic information of the class in question.</p><p>-They are not repetitive, signifying a low likelihood of over-fitting.</p><p>-They are variable and diverse in nature, resulting in an efficiently augmented dataset whose samples represent many possibilities.</p><p>Fig. <ref type="figure">3</ref>: An example of generating new data using image transforms on a Dress sample of the FMNIST dataset (original image and new images generated using image transformations).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Results for FMNIST</head><p>The results of classification with the CNN-SVM architecture when using the FMNIST dataset are reported in Table <ref type="table" target="#tab_0">1</ref>. We note that imbalance causes the classifier's accuracy to drop from 0.932 to 0.883, and that both augmentation schemes recover most of this loss, with WGAN-based augmentation (accuracy 0.928) outperforming image transforms (0.919).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Results for CIFAR10</head><p>The results of classification with the CNN-SVM architecture for the CIFAR-10 dataset are reported in Table <ref type="table" target="#tab_1">2</ref>. While the class imbalance creates an 8% drop in accuracy, both data augmentation approaches manage to restore most of the performance, with an increase of 5% for image transforms and 6% for WGAN. We have thus shown the potential benefit of using WGAN for improving the performance of a classifier when the training dataset suffers from class imbalance.</p><p>For further testing, it would be vital not only to observe the performance of the classifier with different distributions of imbalanced classes, but also to take into account the impact of noise, the dataset shift problem (where the train and test sets follow different distributions), and cases where classes overlap. It would also be useful to introduce quantitative metrics to assess the images generated by GANs, such as SSIM (Structural Similarity Index) <ref type="bibr" target="#b14">[15]</ref>, Inception Score <ref type="bibr" target="#b11">[12]</ref> and GAN Quality Index <ref type="bibr" target="#b16">[17]</ref>, which would reduce or eliminate the need for human visual inspection.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>FMNIST CIFAR- 10 Fig. 1 :</head><label>101</label><figDesc>Fig. 1: Frequency of Classes after introducing imbalance</figDesc><graphic coords="4,136.16,461.07,172.91,114.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 4 :</head><label>4</label><figDesc>Fig. 4: Comparison of Real Images and Synthesised Images generated using WGAN-GP for Dress class of FMNIST</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 5 :</head><label>5</label><figDesc>Fig. 5: Data Generated for the Minority class 'Horse' in the Imbalanced CIFAR-10 using WGAN-GP through the training epochs. Note that at Epoch 406, we first observe realistic-looking generated images</figDesc><graphic coords="11,134.77,473.03,113.39,141.73" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Testing Metrics for Classification of Variations of FMNIST.</figDesc><table><row><cell></cell><cell cols="4">Accuracy Precision Recall F1 Score</cell></row><row><cell>Original (Balanced)</cell><cell>0.932</cell><cell cols="3">0.931 0.932 0.932</cell></row><row><cell>Imbalanced</cell><cell>0.883</cell><cell cols="3">0.897 0.884 0.883</cell></row><row><cell cols="2">Data Augmentation (Image Transforms) 0.919</cell><cell>0.922</cell><cell>0.92</cell><cell>0.921</cell></row><row><cell>Data Augmentation (WGAN)</cell><cell>0.928</cell><cell cols="3">0.925 0.921 0.923</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Testing Metrics for Classification of Variations of CIFAR-10</figDesc><table><row><cell></cell><cell cols="3">Accuracy Precision Recall F1 Score</cell></row><row><cell>Original (Balanced)</cell><cell>0.8342</cell><cell cols="2">0.837 0.833 0.834</cell></row><row><cell>Imbalanced</cell><cell>0.7569</cell><cell>0.787 0.758</cell><cell>0.75</cell></row><row><cell cols="2">Data Augmentation (Image Transforms) 0.8084</cell><cell cols="2">0.812 0.808 0.806</cell></row><row><cell>Data Augmentation (WGAN)</cell><cell>0.8189</cell><cell cols="2">0.824 0.815 0.812</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">In Game Theory, when multiple interacting, non-cooperating participants are involved, Nash Equilibrium is the state of stability achieved when no participant can benefit solely from changing its own strategy or actions if the other players' strategies remain constant.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">A Lipschitz function is a function f such that |f(x) − f(y)| ≤ K|x − y| for all x and y, where K is a constant independent of x and y.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgements. This work has emanated from research conducted as part of the MSc in Computer Science programme (Future Networked Systems) at Trinity College Dublin, Ireland. This research is partly supported by the ADAPT Centre for Digital Content Technology funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Towards principled methods for training generative adversarial networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Arjovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representation</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Wasserstein generative adversarial networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Arjovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chintala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research</title>
				<meeting>the 34th International Conference on Machine Learning. Machine Learning Research</meeting>
		<imprint>
			<date type="published" when="2017-08">Aug 2017</date>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="page" from="6" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Generative adversarial nets</title>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Pouget-Abadie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mirza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Warde-Farley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ozair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Courville</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="2672" to="2680" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Improved training of wasserstein gans</title>
		<author>
			<persName><forename type="first">I</forename><surname>Gulrajani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ahmed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arjovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dumoulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Courville</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5767" to="5777" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Beyond face rotation: Global and local perception gan for photorealistic and identity preserving frontal view synthesis</title>
		<author>
			<persName><forename type="first">R</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2017">2017. 2017</date>
			<biblScope unit="page" from="2458" to="2467" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Learning multiple layers of features from tiny images</title>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<ptr target="http://www.cs.toronto.edu/~kriz/cifar.html" />
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>Canadian Institute for Advanced Research)</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">CIFAR-10</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Photo-realistic single image super-resolution using a generative adversarial network</title>
		<author>
			<persName><forename type="first">C</forename><surname>Ledig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Theis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Huszár</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Caballero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Aitken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tejani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Totz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Shi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR</title>
				<imprint>
			<date type="published" when="2016">2017. 2016</date>
			<biblScope unit="page" from="105" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Precomputed real-time texture synthesis with markovian generative adversarial networks</title>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wand</surname></persName>
		</author>
		<idno>CoRR abs/1604.04382</idno>
		<ptr target="http://arxiv.org/abs/1604.04382" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Bagan: Data augmentation with balancing gan</title>
		<author>
			<persName><forename type="first">G</forename><surname>Mariani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scheidegger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Istrate</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bekas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C I</forename><surname>Malossi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICML Workshop on Theoretical Foundations and Applications of Deep Generative Models</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A novel hybrid CNN-SVM classifier for recognizing handwritten digits</title>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">X</forename><surname>Niu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">Y</forename><surname>Suen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="1318" to="1325" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Conditional image synthesis with auxiliary classifier GANs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Odena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Olah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shlens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 34th International Conference on Machine Learning</title>
				<meeting>the 34th International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2017-08-11">06-11 Aug 2017</date>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="page" from="2642" to="2651" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Improved techniques for training GANs</title>
		<author>
			<persName><forename type="first">T</forename><surname>Salimans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaremba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Cheung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 29</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="2234" to="2242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A survey on image data augmentation for deep learning</title>
		<author>
			<persName><forename type="first">C</forename><surname>Shorten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Khoshgoftaar</surname></persName>
		</author>
		<idno type="DOI">10.1186/s40537-019-0197-0</idno>
		<ptr target="https://doi.org/10.1186/s40537-019-0197-0" />
	</analytic>
	<monogr>
		<title level="j">Journal of Big Data</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">60</biblScope>
			<date type="published" when="2019-07">Jul 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Harmonic networks for image classification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ulicny</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Krylov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dahyot</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">British Machine Vision Conference (BMVC)</title>
				<meeting><address><addrLine>Cardiff, UK</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019-09-12">9-12 September 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Image quality assessment: from error visibility to structural similarity</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Bovik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">R</forename><surname>Sheikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Simoncelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Image Processing</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="600" to="612" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms</title>
		<author>
			<persName><forename type="first">H</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Rasul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vollgraf</surname></persName>
		</author>
		<idno>CoRR abs/1708.07747</idno>
		<ptr target="http://arxiv.org/abs/1708.07747" />
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=S1CIev1vM" />
		<title level="m">GAN Quality Index (GQI) by GAN-induced classifier</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
