Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021

ARCHITECTURE OF A GENERATIVE ADVERSARIAL NETWORK AND PREPARATION OF INPUT DATA FOR MODELING GAMMA EVENT IMAGES FOR THE TAIGA-IACT EXPERIMENT

J.Yu. Dubenskaya a, A.P. Kryukov, A.P. Demichev

Skobeltsyn Institute of Nuclear Physics, Moscow State University, Moscow 119991, Russia

E-mail: a jdubenskaya@gmail.com

Very-high-energy gamma-ray photons interact with the atmosphere to give rise to cascades of secondary particles – extensive air showers (EASs) – which in turn generate very short flashes of Cherenkov radiation. These flashes are detected on the ground with Imaging Air Cherenkov Telescopes (IACTs). In the TAIGA experiment, in addition to images directly detected and recorded by the experimental facilities, images obtained as a result of simulation are used extensively. Earlier we applied a machine learning technique called Generative Adversarial Networks (GANs) to quickly generate images of gamma events for the TAIGA experiment. The initial analysis of the generated images showed the applicability of the method, but revealed some features that require additional refinement of the network. In particular, it was important to teach the network that in our case the images have a specific shape and orientation. In this paper we discuss the possibility of improving the generated images by preprocessing the training dataset. We also present an example of a GAN built and trained with these requirements in mind. Testing the results with third-party software showed that more than 95% of the generated images were found to be correct, while the generation is quite fast: after training, the network creates about 400 event images in 1 second.

Keywords: machine learning, GAN, gamma events, image generation, TAIGA experiment

Julia Dubenskaya, Alexander Kryukov, Andrey Demichev

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Very-high-energy gamma-ray photons interact with the atmosphere to give rise to cascades of secondary particles – extensive air showers (EASs) – which in turn generate very short flashes of Cherenkov radiation. These flashes are detected on the ground with Imaging Air Cherenkov Telescopes (IACTs) [1]. The TAIGA experiment (Tunka Advanced Instrument for cosmic ray physics and Gamma Astronomy) [2] consists of different detector systems and measures air showers initiated by charged cosmic rays or high-energy gamma rays. The TAIGA Cherenkov telescope array (TAIGA-IACT) is used for gamma astronomy. In the TAIGA-IACT experiment, in addition to images directly detected and recorded by the experimental facilities, images obtained as a result of simulation are used extensively [3]. The problem is that direct modeling of the underlying physical processes (such as interactions and decays of a cascade of charged particles in the atmosphere) is a computationally demanding task, since it tracks the type, energy, position, direction and arrival time of all secondary particles born in the EAS. On average, using direct computational models, one can obtain only about 1000 images per hour. This can result in a computational bottleneck for the experiment due to the lack of model data.
To address this challenge, we opted for a machine learning technique called Generative Adversarial Networks (GANs) [4] to quickly generate images of gamma events for the TAIGA-IACT experiment. GANs are an increasingly popular approach to learning a generative model using deep neural networks, and have shown great promise in generating clear samples from natural images [5]. Our previous work [6] outlines the very first results of this study. We checked the quality of the generated images with the third-party software tool that is used for image classification in the TAIGA-IACT experiment [7]. This software tool determines the gamma likelihood – the probability that an image is a gamma image. Initial analysis of the images generated by our GAN showed the applicability of the method, but not all the generated images were considered correct. Further analysis showed that the network was not good enough at capturing the features of the real gamma images. Because of this, the image validation tool rejected some images that appeared to be good, and the percentage of generated images recognized as gamma events was only about 90%. In this paper, we show how we managed to increase the percentage of correctly generated images by preprocessing the training set. We also provide a detailed description of the network architecture used to generate gamma images in the TAIGA-IACT experiment.

2. GAN architecture for gamma events

Each classical GAN [5] is a system of two neural networks that are trained simultaneously in an adversarial game: a generative network (Generator) that captures the data distribution, and a discriminative network (Discriminator) that estimates the probability that a sample came from the training data rather than from the Generator. The training procedure for the Generator is to maximize the probability of the Discriminator making a mistake. The system as a whole corresponds to a minimax two-player game. The following is a description of the features of the network for generating gamma event images for the TAIGA-IACT experiment.

The generator takes as input a point in the latent space – a random vector of 8192 (128x8x8) entries – and outputs a single 32x32 grayscale image. The generator has 4 convolutional layers. All layers except the output layer use 4x4 filters and a leaky ReLU function with alpha=0.2 as the activation function. The output layer has one 6x6 filter and uses a sigmoid for its activation. We also apply batch normalization (BN) [9] in the generator. The main advantage of this technique is that it greatly speeds up the learning process. In our case, BN makes the generator and, as a result, the entire GAN more stable. We adopted BN between convolutional layers, before each activation function. The architecture of the generator for gamma events is shown in Figure 1.

Figure 1. Architecture of the generator

The discriminator takes as input one 32x32 grayscale image and outputs a binary prediction as to whether the image is real or fake. It uses a 2x2 stride to downsample, and the Adam version of stochastic gradient descent with a learning rate of 0.0002 and a momentum of 0.5. In the convolutional layers, the convolution filter size is 4x4; the leaky ReLU function with alpha=0.2 is used for the activation. The output layer uses a sigmoid function for its activation. The architecture of the discriminator for gamma events is shown in Figure 2.

Figure 2. Architecture of the discriminator
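To make the layer-by-layer description above concrete, the following is a minimal sketch of both networks in Keras-style notation. Only the input/output shapes, filter sizes, activations, batch normalization placement and optimizer settings are taken from the text; the framework choice, the number of filters per hidden layer and the use of strided transposed convolutions for upsampling are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of the generator and discriminator described above.
# Filter counts per hidden layer and the upsampling scheme are assumptions.
import tensorflow as tf
from tensorflow.keras import layers


def build_generator(latent_dim=8192):
    # Latent vector of 8192 entries, reshaped to a 128-channel 8x8 feature map
    z = tf.keras.Input(shape=(latent_dim,))
    x = layers.Reshape((8, 8, 128))(z)
    # Three hidden convolutional layers with 4x4 filters and BN before each
    # leaky ReLU (alpha=0.2); two of them upsample 8x8 -> 16x16 -> 32x32
    x = layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Conv2D(64, kernel_size=4, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    # Output layer: a single 6x6 filter with a sigmoid, giving a 32x32x1 image
    img = layers.Conv2D(1, kernel_size=6, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(z, img, name="generator")


def build_discriminator(in_shape=(32, 32, 1)):
    # 4x4 filters, 2x2 strides for downsampling, leaky ReLU (alpha=0.2);
    # the number of convolutional layers here is an assumption
    img = tf.keras.Input(shape=in_shape)
    x = layers.Conv2D(64, kernel_size=4, strides=2, padding="same")(img)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Conv2D(128, kernel_size=4, strides=2, padding="same")(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Flatten()(x)
    # Binary real/fake prediction with a sigmoid output
    out = layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(img, out, name="discriminator")
    # Adam with learning rate 0.0002 and momentum (beta_1) of 0.5
    model.compile(loss="binary_crossentropy",
                  optimizer=tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5))
    return model
```

The usual adversarial training loop then alternates discriminator updates on batches of real and generated images with generator updates through the combined model, using the batch size and number of epochs given below.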
Also worth mentioning are two more hyperparameters of the GAN learning process: the batch size and the number of epochs. The batch size is the number of training images that are processed before the network weights are updated. The number of epochs controls the number of complete passes through the training dataset. During training, we used a batch size of 128 images and 300 epochs.

3. Training set preprocessing

The real images of gamma events are small, and usually we have only a few light pixels (the event track) on a black background. An event track is usually elliptical in shape. When observing gamma events, the telescope is pointed towards the source of gamma quanta, so the recorded ellipses can come from different directions, but all must point towards the center of the image. Our basic GAN learned very well to reproduce the elliptical shape of the image, but some generated images had problems with the position of the ellipse within the image. To address this issue we had to modify our training set to force our network to learn the rotational symmetry of the images.

To account for rotational symmetry, each image of the training set was flipped horizontally, and then both images were flipped vertically. Thus, in addition to the original image, we get three rotated copies of it. An example of the original image and its copies is shown in Figure 3, and a sketch of this augmentation step is given below.

Figure 3. The original image from the training set (the first one) and its three rotated copies
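As an illustration, here is a minimal sketch of the flip-based augmentation described above, assuming the training images are stored as a NumPy array of shape (N, 32, 32); the function name and array layout are ours, not part of the original pipeline.

```python
# Hedged sketch of the flip-based training set augmentation.
import numpy as np


def augment_with_flips(images: np.ndarray) -> np.ndarray:
    """Return the original images plus their horizontally flipped, vertically
    flipped, and doubly flipped copies, i.e. 4x the input sample size."""
    flipped_h = images[:, :, ::-1]       # horizontal flip (mirror columns)
    flipped_v = images[:, ::-1, :]       # vertical flip (mirror rows)
    flipped_hv = images[:, ::-1, ::-1]   # both flips (180-degree rotation)
    return np.concatenate([images, flipped_h, flipped_v, flipped_hv], axis=0)


# Usage: the 25,000 original gamma images become a 100,000-image training set
# train_images = augment_with_flips(train_images)
```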
4. Results

In our previous work [6], we selected 25,000 gamma events as the training sample. The training on a Tesla P100 GPU took about 6 hours. Accordingly, we now took the same images and applied the aforementioned flipping procedure to them. This procedure increased the sample size by 4 times, correspondingly increasing the training time of our GAN: it took about 22 hours to train the network on the resulting dataset using the same server. At the same time, the image generation rate has not changed, and the network creates about 400 event images in 1 second.

For verification, we generated a sample of 4000 gamma images and classified them using the third-party software tool that is used for classification in the TAIGA-IACT experiment [7], which determines the probability that an image is a gamma image.

Figure 4. The gamma likelihood for gamma events

The plot in Figure 4 shows the results of the classification – the distribution of the number of generated gamma events by probability. The X-axis in the plot represents the probability that the image is a gamma event, and the Y-axis is the number of generated gamma events classified as gamma events with a given probability. The plot shows that for more than half of the generated events, the calculated probability is 90-100%. Moreover, for 97% of the generated events the probability exceeds 50%, and thus these events are recognized as gamma events. So, the quality of gamma image generation has improved: about 3% of the generated gamma images, which were previously highly likely to be recognized as non-gamma events, are now highly likely to be recognized as gamma events.

5. Conclusions

Summarizing the above, we can conclude that additional preprocessing of the input image set used for training can further improve the accuracy of modeling event images for the TAIGA-IACT experiment. On the other hand, the training time increases significantly, but the network learns the rotational symmetry better, which is important specifically for gamma images. As a result, the number of correctly generated images increased by approximately 3% and reached 97%. At the same time, the preprocessing of the input set does not affect the image generation speed.

6. Acknowledgements

This work was carried out in the framework of R&D State Assignment No. 115041410196.

References

[1] T. Weekes, M. Cawley, D. Fegan, K. Gibbs, A. Hillas, P. Kowk, R. Lamb, D. Lewis, D. Macomb, N. Porter, P. Reynolds, G. Vacanti. Observation of TeV gamma rays from the Crab Nebula using the atmospheric Cerenkov imaging technique // Astrophysical Journal, vol. 342, p. 379, 1989

[2] N. Budnev et al. The TAIGA experiment: From cosmic-ray to gamma-ray astronomy in the Tunka valley // Nuclear Instruments and Methods in Physics, vol. A845, pp. 330-333, 2017

[3] M.H. Kunnas et al. Simulation of imaging air shower Cherenkov telescopes as part of the TAIGA Project // Proceedings of Magellan Workshop (DESY-PROC-2016-05), 2016

[4] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative Adversarial Networks // ArXiv e-prints, arXiv:1406.2661, 2014

[5] A. Radford, L. Metz, S. Chintala. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks // ArXiv e-prints, arXiv:1511.06434, 2015

[6] J. Dubenskaya, A. Kryukov, A. Demichev. Fast Simulation of Gamma/Proton Event Images for the TAIGA-IACT Experiment using Generative Adversarial Networks // Proceedings of the 37th International Cosmic Ray Conference, PoS (ICRC2021) 874, 2021

[7] E. Postnikov, A. Kryukov, S. Polyakov, D. Zhurov. Deep Learning for Energy Estimation and Particle Identification in Gamma-ray Astronomy // Proceedings of the 3rd International Workshop DLC-2019, CEUR-WS Proceedings, vol. 2406, pp. 90-99, 2019