=Paper=
{{Paper
|id=Vol-3041/363-368-paper-67
|storemode=property
|title=Quantum Machine Learning for HEP Detector Simulations
|pdfUrl=https://ceur-ws.org/Vol-3041/363-368-paper-67.pdf
|volume=Vol-3041
|authors=Florian Rehm,Sofia Vallecorsa,Kerstin Borras,Dirk Krücker
}}
==Quantum Machine Learning for HEP Detector Simulations==
<pdf width="1500px">https://ceur-ws.org/Vol-3041/363-368-paper-67.pdf</pdf>
<pre>
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021


   QUANTUM MACHINE LEARNING FOR HEP DETECTOR
                 SIMULATIONS
           F. Rehm (1,2,a), S. Vallecorsa (1), K. Borras (2,3), D. Krücker (3)
                   (1) CERN, Esplanade des Particules 1, Geneva, Switzerland
                   (2) RWTH Aachen University, Templergraben 55, Aachen, Germany
                   (3) DESY, Notkestraße 85, Hamburg, Germany

                   E-mail: (a) florian.matthias.rehm@cern.ch

Quantum Machine Learning (qML) is one of the most promising and intuitive applications of
near-term quantum devices, which have the potential to tackle computing resource challenges
faster than traditional computers. Classical Machine Learning (ML) is taking on a significant role in
particle physics to speed up detector simulations. Generative Adversarial Networks (GANs) have
proven to achieve a similar level of accuracy compared to Monte Carlo-based simulations while
decreasing the computation time by orders of magnitude. In this research we go a step further and
apply quantum computing to GAN-based detector simulations.
Given the limitations of current quantum hardware in terms of number of qubits, connectivity, and
noise, we perform initial tests with a simplified GAN model running on quantum simulators. The
model is a classical-quantum hybrid ansatz. It consists of a quantum generator, defined as a
parameterised circuit based on single and two qubit gates, combined with a classical discriminator.
Our initial qGAN prototype focuses on a one-dimensional toy-distribution, representing the energy
deposited in a detector by a single particle. It employs three qubits and achieves high physics accuracy
thanks to hyper-parameter optimisation. Furthermore, we study the influence of real hardware noise
on the qML GAN training. A second qGAN is developed to simulate 2D images with a 64-pixel
resolution, representing the energy deposition patterns in the detector. Different quantum ansatzes are
studied. We obtained the best results using a tree-tensor-network architecture with six qubits.
Additionally, we discuss challenges and potential benefits of quantum computing as well as our plans
for future developments.

Keywords: Quantum Machine Learning, Hybrid Classical-Quantum GAN, Noisy Quantum
Simulations


                                              Florian Rehm, Sofia Vallecorsa, Kerstin Borras, Dirk Krücker


                                                                Copyright © 2021 for this paper by its authors.
                       Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).



1. Calorimeter Training Data
        Calorimeters are a key component of High Energy Physics experiments, measuring
the particles' energy [1]. Particles enter the calorimeter and deposit their energy by producing
secondary particles which, in turn, create other particles by the same mechanisms. This chain process
leads to the generation of particle showers. One example shower image is shown in Figure 1 (left) for
a primary particle with 500 GeV energy. The training data is generated using the Geant4 toolkit [2].
The training data, including the classical ML GAN studies for calorimeters, are further explained in
Ref. [3].


        Figure 1. (Left) An example 3D shower image of an electromagnetic calorimeter
         and (right) the probability density function (PDF) generated by the trained 1D qGAN
                         model, in green, together with the 1D training data, in blue.

2. Hybrid Quantum GAN
        Quantum circuits are limited in size and depth, because today's quantum computers are still
very noisy and have only a small number of qubits. However, the resilience of ML algorithms against
noise as demonstrated with classical algorithms makes qML attractive for noisy intermediate-scale
quantum (NISQ) hardware [4]. The quantum simulations are often run with a hybrid quantum-classical
approach. In this research we modified and optimised a hybrid quantum-classical qGAN approach,
presented in Figure 2, initially developed by IBM [5].


                            Figure 2. Hybrid quantum-classical GAN model

         The model utilizes a parameterised quantum generator and a classical discriminator neural
network. The classical data is represented in quantum states using the amplitude encoding approach,
and the quantum generator output state is measured to return classical data. At each forward
generation step one quantum output state is calculated, which corresponds to one classical pixel value.
The complete shower images are generated by running the quantum generator for multiple
repetitions (so-called shots). The generated images are therefore probability distributions
rather than real images. The quantum generator circuit is illustrated in Figure 3. It consists of three
qubits to represent the 8 pixels (2^3 = 8) of the simplified shower image as shown in Figure 1 (right).
The qubits are initialized in the basis state |0⟩, followed by Hadamard gates to create superposition.
The generator consists of two layers as indicated in Figure 3. Parameterised Y-rotation gates provide
the trainable part: each gate has its own trainable parameter 𝜃, which corresponds to the rotation
angle. Entanglement is created by controlled-Z gates in a linear entanglement topology.
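
The circuit described above can be illustrated with a small statevector simulation. This is a minimal
sketch and not the authors' implementation: it assumes that each of the two layers applies one RY
rotation per qubit followed by a chain of controlled-Z gates (the exact gate ordering is only given in
Figure 3), and the function names are hypothetical. Because H, RY and CZ are all real matrices, real
amplitudes are sufficient.

```python
import math
import random

HADAMARD = [[2 ** -0.5, 2 ** -0.5], [2 ** -0.5, -(2 ** -0.5)]]

def ry(theta):
    """Real 2x2 matrix of a Y-rotation by angle theta."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def apply_gate(state, gate, qubit):
    """Apply a real 2x2 gate to one qubit of a real-valued statevector."""
    out = list(state)
    bit = 1 << qubit
    for i in range(len(state)):
        if (i & bit) == 0:
            a0, a1 = state[i], state[i | bit]
            out[i] = gate[0][0] * a0 + gate[0][1] * a1
            out[i | bit] = gate[1][0] * a0 + gate[1][1] * a1
    return out

def apply_cz(state, q1, q2):
    """Controlled-Z: flip the sign of amplitudes where both qubits are 1."""
    mask = (1 << q1) | (1 << q2)
    return [-a if (i & mask) == mask else a for i, a in enumerate(state)]

def generator_probabilities(thetas, n_qubits=3, n_layers=2):
    """Return the 2**n_qubits outcome probabilities of the generator circuit."""
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0                              # all qubits start in |0>
    for q in range(n_qubits):                   # Hadamards create superposition
        state = apply_gate(state, HADAMARD, q)
    params = iter(thetas)
    for _ in range(n_layers):
        for q in range(n_qubits):               # trainable RY rotations
            state = apply_gate(state, ry(next(params)), q)
        for q in range(n_qubits - 1):           # linear entanglement (assumed order)
            state = apply_cz(state, q, q + 1)
    return [a * a for a in state]

def sample_image(probs, shots, rng):
    """Estimate the pixel distribution from a finite number of shots."""
    counts = [0] * len(probs)
    for outcome in rng.choices(range(len(probs)), weights=probs, k=shots):
        counts[outcome] += 1
    return [c / shots for c in counts]
```

With three qubits and two layers this sketch has six trainable angles. Calling
sample_image(generator_probabilities(thetas), shots, random.Random(0)) returns an 8-bin histogram
that approaches the exact probabilities as the number of shots grows.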


                                 Figure 3. 1D quantum generator circuit

         The generated images represent the fake data which are, together with the training data (real
data), input into the classical discriminator neural network. The discriminator consists of one input
feature, two hidden dense layers with 512 nodes (and the LeakyReLU activation function) and one
output layer with dimension 256 (with the Sigmoid activation function). A single output neuron
provides the true-fake probability. Following the standard GAN approach, a binary cross-entropy loss
function is used, the gradients are computed classically, and the parameters of the quantum generator
and the classical discriminator are updated.
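
For illustration, the binary cross-entropy losses of a standard GAN can be written as follows. This is
a generic sketch of the textbook formulation, not the code used in this work. The function names are
hypothetical, and d_real and d_fake stand for the discriminator outputs (probabilities between 0 and
1) for training samples and generated samples, respectively.

```python
import math

def binary_cross_entropy(predictions, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(predictions, targets):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(predictions)

def discriminator_loss(d_real, d_fake):
    """The discriminator should label real data as 1 and generated data as 0."""
    return (binary_cross_entropy(d_real, [1.0] * len(d_real))
            + binary_cross_entropy(d_fake, [0.0] * len(d_fake)))

def generator_loss(d_fake):
    """The generator is rewarded when the discriminator labels its output as real."""
    return binary_cross_entropy(d_fake, [1.0] * len(d_fake))
```

A discriminator that cannot tell the two data sets apart outputs 0.5 everywhere, which gives a
generator loss of ln 2 and a discriminator loss of 2 ln 2.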


3. Hyperparameter Optimization
         Initially, we ran the qGAN training process using the QASM quantum simulator [6] without
including any noise, on an Intel(R) Xeon(R) Gold 6130 CPU (with 2.10 GHz cores). Convergence was
achieved after 4 000 training epochs, requiring extremely long simulations (over one day). Multiple
hyperparameter searches, using the Optuna [7] tool, sped up the qGAN training by a
factor of 10. First, we observed that higher learning rates increase the training speed but
decrease the accuracy. To counter this loss of accuracy, we implemented an
exponential learning rate decay. This allows fast learning with a high learning rate in the initial
training iterations and accurate results with a moderate learning rate in the later ones. Furthermore,
the learning rate decay stabilises the training during later epochs, with fewer oscillations. When
training classical GAN models, we observed that separate generator and discriminator learning rates
improve the training quality. We therefore applied this approach to the hybrid qGAN model and
achieved improved accuracy and faster training. As a last study, we varied the number of discriminator
training iterations relative to generator training iterations within each epoch. We found that when
the discriminator is trained ten times more often than the generator, the training converges much
faster in terms of number of epochs. With all the adaptations mentioned above, and with a generator
learning rate of 0.008, a discriminator learning rate of 0.001 and an exponential learning rate decay of
0.004, we were able to reduce the training from 4 000 to only 300 epochs. The probability
density function (PDF) of the best training is close to the training data, as shown in Figure 1 (right).
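
The training schedule described in this section can be sketched as follows. The exact form of the
decay is not stated in the text, so this sketch assumes lr(epoch) = lr_0 * exp(-decay * epoch) and
assumes that the same decay is applied to both learning rates. The function names are hypothetical,
and the actual parameter updates are left out.

```python
import math

def decayed_learning_rate(initial_rate, decay, epoch):
    """Exponential decay (assumed form): lr_0 * exp(-decay * epoch)."""
    return initial_rate * math.exp(-decay * epoch)

def training_schedule(n_epochs, gen_lr=0.008, disc_lr=0.001,
                      decay=0.004, disc_steps=10):
    """List the update steps of each epoch as (network, epoch, learning_rate)."""
    schedule = []
    for epoch in range(n_epochs):
        g_rate = decayed_learning_rate(gen_lr, decay, epoch)
        d_rate = decayed_learning_rate(disc_lr, decay, epoch)
        # The discriminator is updated ten times for every generator update.
        for _ in range(disc_steps):
            schedule.append(("discriminator", epoch, d_rate))
        schedule.append(("generator", epoch, g_rate))
    return schedule
```

Under these assumptions the generator learning rate has dropped to about 30% of its initial value
after 300 epochs, since exp(-0.004 * 300) = exp(-1.2).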


4. qGAN Noise Studies
         In the following section we study the influence of quantum hardware noise. Initially, we apply
exclusively readout noise to the measurements of the generator output. To make
the study as realistic as possible, we apply the noise model measured for the IBMq Belem quantum
computer [8]. The readout noise levels of the Belem quantum computer at the time we carried out the
tests are documented in Table 1. The IBMq Belem quantum computer has five qubits, but as we
require only three qubits for the circuit, we neglect the remaining two. At the time we performed the
tests the noise level was relatively high, with 3.6%, 4.7% and 9.6% for the three qubits, respectively.
However, the PDF plot in Figure 4 (left) shows that the readout noise does not result in a lower
accuracy. Additionally, the training converged more rapidly to a low relative entropy value and kept
this level stable without oscillations.
    Table 1. Readout noise levels for the single qubits of the IBMq Belem quantum computer [8].

                               Qubit number       0         1         2
                               Readout error     3.6%      4.7%      9.6%
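
To illustrate how readout noise distorts the measured distribution, the sketch below applies an
independent bit flip to each measured qubit with the error rates of Table 1. This is a simplified
model and an assumption of this example: the device noise model may be asymmetric (different
probabilities for reading 0 as 1 and 1 as 0), and the function name is hypothetical.

```python
# Readout error rates of qubits 0, 1 and 2 taken from Table 1.
READOUT_ERROR = [0.036, 0.047, 0.096]

def apply_readout_noise(probs, flip_probs):
    """Mix each outcome with its bit-flipped partner, one qubit at a time."""
    noisy = list(probs)
    for qubit, p_flip in enumerate(flip_probs):
        bit = 1 << qubit
        # A measured bit is reported correctly with probability 1 - p_flip.
        noisy = [(1.0 - p_flip) * noisy[i] + p_flip * noisy[i ^ bit]
                 for i in range(len(noisy))]
    return noisy
```

For an ideal distribution concentrated in a single bin, only about 83% of the shots are recorded in
the correct bin under this model, because all three bits have to be read out correctly
(0.964 * 0.953 * 0.904).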


  Figure 4. (Left) The PDF of the qGAN model trained with only readout noise and (right) the PDF of
           the model trained with the full noise model (readout noise + gate-level noise)

        In the subsequent step we apply the full noise model to the qGAN training. The full noise
model comprises gate-level noise in addition to the readout noise. Gate-level noise is usually
determined by the noise of the cx-gate, because the cx-gate typically has the highest
noise level. At the time we loaded the noise model of the IBMq Belem quantum computer, the
gate-level noise was on average 4.32% for the three considered qubits. The readout noise is
summarised in Table 1.


        Figure 5. Statistics plot for the accuracy of the full noise qGAN model for multiple trials

         The PDF of the model trained with the full noise model is shown in Figure 4 (right).
One can see that the generator output is farther from the Geant4 distribution and performs slightly
worse. We trained the full noise model for 23 trials with the same hyperparameter set and evaluated
the training statistics in Figure 5. The accuracy metric used to determine the best trial is the relative
entropy, which is a measure of the difference between two probability distributions. The relative
entropy of the best trial is shown in green, the mean value of all trials in blue, and the grey band
indicates one standard deviation. On average the
model converges within the first 100 epochs and the relative entropy remains stable for later epochs.
There are some fluctuations between the various trials (indicated by the broad grey band); however,
this effect is expected and similar for the classical ML GAN model due to statistical effects.
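
The relative entropy (Kullback-Leibler divergence) between two discrete distributions p and q is
D(p || q) = sum_i p_i * ln(p_i / q_i). The sketch below computes it together with the mean, standard
deviation and best value over several trials. The direction of the divergence (which distribution
plays the role of p) and the use of the population standard deviation are assumptions of this
example, and the function names are hypothetical.

```python
import math
import statistics

def relative_entropy(p, q, eps=1e-12):
    """D(p || q) in nats; bins with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)

def summarise_trials(final_entropies):
    """Mean, standard deviation and best (lowest) relative entropy of the trials."""
    return {
        "mean": statistics.mean(final_entropies),
        "std": statistics.pstdev(final_entropies),
        "best": min(final_entropies),
    }
```

The relative entropy is zero when the two distributions are identical and grows as they differ, so the
best trial is the one with the lowest value.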
         The results of the statistical evaluation of the noiseless model and the full noise model are
summarized in Table 2 for comparison. One can see that the performance of the model with noise is
slightly worse in terms of the mean relative entropy and the relative entropy of the best trial.
However, the model with noise has a lower standard deviation, indicating that the relative entropy
values of the trained models are closer to each other. This result can be interpreted as an indication
that the qGAN model learns the hardware noise behaviour and is moderately resilient against it.
         Before running the training on a real quantum device, we plan to perform additional tests
using different noise models to better understand the qGAN behaviour. Additionally, a relevant
question we will examine is what impact on the performance can be seen if we adapt the
hyperparameters of the qGAN model to training with noise. For the noise studies presented
in this paper, we ran the training with the hyperparameters which performed best in the noiseless case.

Table 2. Statistical evaluation of the relative entropy for the noiseless qGAN model and the model
                                            with full noise

                                         Noiseless Model    Full Noise Model

                                Mean         0.046               0.054

                                STD          0.064               0.051

                                Best         0.0077             0.0125


5. 2D qGAN
       Because the one-dimensional qGAN model reached a high accuracy, we increased the
complexity of the model to tackle a more realistic detector simulation. We increased the model
dimension to a total of 8 x 8 = 64 pixels (eight times as many as for the 1D qGAN model) in order to
reproduce a two-dimensional image of the energy pattern in the detector. For simplicity we stacked
the pixels of the image into a one-dimensional vector.
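
The stacking step can be illustrated as follows. This is a minimal sketch that assumes row-major
ordering of the pixels (the ordering is not specified in the text) and normalises the pixel energies
so that they form a probability distribution over the 64 basis states. The function names are
hypothetical.

```python
def image_to_distribution(image):
    """Stack the rows of a 2D image into one normalised 1D vector."""
    flat = [pixel for row in image for pixel in row]
    total = sum(flat)
    if total <= 0:
        raise ValueError("image must contain a positive total energy")
    return [pixel / total for pixel in flat]

def qubits_needed(n_pixels):
    """Smallest number of qubits n with 2**n >= n_pixels."""
    return (n_pixels - 1).bit_length()

def pixel_index(row, col, width=8):
    """Position of pixel (row, col) in the stacked vector (row-major, assumed)."""
    return row * width + col
```

For the 8 x 8 image, qubits_needed(64) gives six qubits, compared with three qubits for the eight
pixels of the 1D model.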


 Figure 6. (left) The 2D quantum generator circuit with 6 qubits and (right) the PDF of the best trial

         We tested different quantum generator circuit architectures. The most promising results were
achieved with a Tree-Tensor-Network (TTN) architecture published in Ref. [9]. In this case, the
generator circuit consists of six qubits (2^6 = 64) and is shown in Figure 6 (left). We also
increased the discriminator size by including two additional dense layers with 512 nodes. The generated
PDF for the 2D case is shown in Figure 6 (right). One can see that the qGAN output, in green, is close
to the training data, in blue, except for a few pixels that are slightly off. However, the training
process turned out to be unstable, with convergence reached only in rare cases. Additionally, the
training lasts for over 6 000 epochs, which results in training times of more than five days on the
quantum simulator.


6. Conclusion and Future Work
         In this paper we implemented a one-dimensional qGAN model to correctly generate single
particle energy distributions as measured in a calorimeter. By optimizing the training hyperparameters,
we achieved a high accuracy and were able to accelerate the training process by a factor of 10. We
performed noise studies to understand the influence of quantum hardware noise on the qGAN training
process. We found that a realistic noise model, such as that of the IBMq Belem, does not influence the
accuracy if only readout noise is applied. Adding gate-level noise, we observed a slight decrease in
accuracy. Before running the qGAN training and inference on a real quantum device, we plan to
perform further noise tests to better understand the influence of noise on the final accuracy.
       In addition, we created a more realistic qGAN model which is capable of generating two-
dimensional energy distributions with eight times more pixels than the initial one-dimensional model.
By using a Tree-Tensor-Network architecture for the quantum generator circuit, we were able to
reproduce the 2D shower image correctly. However, further optimization of the training
hyperparameters is needed in order to reach faster, more stable training and the desired level of
accuracy.


7. Acknowledgements
        This work has been sponsored by the Wolfgang Gentner Programme of the German Federal
Ministry of Education and Research.

References

[1] R. M. Brown and D. J. A. Cockerill, "Electromagnetic Calorimetry," Nuclear Instruments and
    Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated
    Equipment, pp. 47-79, 2012.
[2] S. Agostinelli et al., "GEANT4 - a simulation toolkit," Nuclear Instruments and Methods in
    Physics Research Section A, 2003.
[3] F. Rehm, S. Vallecorsa, K. Borras and D. Krücker, "Validation of Deep Convolutional Generative
    Adversarial Networks for High Energy Physics Calorimeter Simulations," in AAAI 2021 -
    Association for the Advancement of Artificial Intelligence, 2021.
[4] P. Braccia, F. Caruso and L. Banchi, "How to enhance quantum generative adversarial learning of
    noisy information," New Journal of Physics, May 2021.
[5] "qGANs for Loading Random Distributions," [Online]. Available:
    https://qiskit.org/documentation/machine-learning/tutorials/04_qgans_for_loading_random_distributions.html.
    [Accessed March 2021].
[6] H. Abraham et al., "Qiskit," 2019. [Online].
[7] T. Akiba, S. Sano, T. Yanase et al., "Optuna: A Next-generation Hyperparameter Optimization
    Framework," 2019. [Online].
[8] "IBM Quantum Services," July 2021. [Online]. Available:
    https://quantum-computing.ibm.com/services?services=systems&system=ibmq_belem.
[9] E. Grant, M. Benedetti, S. Cao and A. Hallam, "Hierarchical quantum classifiers," npj Quantum
    Information, 2018.



</pre>