
Phys. Rev. Accel. Beams

Quantified Uncertainties for Machine-Learning Based Particle Accelerator Diagnostic

Owen Convery

Lewis Smith

Yarin Gal

Adi Hanuka

Contact: adiha@slac.stanford.edu (affiliation 1). Affiliations: (0) Department of Computer Science, University of Oxford, UK; (1) SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA

21 112802

Current diagnostic tools for characterizing a system are often costly, limited and invasive, i.e. they interrupt the system's normal operation. A Virtual Diagnostic (VD) is a deep learning tool that can be used to predict the diagnostic output. For practical usage of VDs, it is necessary to quantify the prediction's reliability, namely the uncertainty in that prediction. In this paper, we apply an ensemble of neural networks to estimate uncertainty and explore various ways of analyzing the prediction uncertainty using experimental data from the Linac Coherent Light Source particle accelerator at SLAC National Accelerator Laboratory. We aim to accurately and confidently predict the longitudinal properties of the electron beam as given by its phase-space images. The ability to make informed decisions under uncertainty is crucial for reliable deployment of deep learning tools on safety-critical systems such as particle accelerators.

Introduction

Particle accelerators are ubiquitous in applications ranging from chemistry and physics to biology experiments. Those experiments require increasingly accurate diagnostic tools to measure the electron beam properties during its acceleration, transport and delivery to users. Current state-of-the-art diagnostics (Marx et al. 2018a) have limited applicability, and their limitations become more pronounced as the complexity of the experiments grows. Virtual diagnostic (VD) tools provide a shot-to-shot non-invasive measurement of the beam in cases where the diagnostic has limited resolution or is unavailable.

Current VDs are predictive models obtained by training a neural network to map non-invasive diagnostic inputs to invasive output measurements (Emma et al. 2018, 2019; Hanuka et al. 2020). This type of mapping is known as supervised regression. Previous work has demonstrated that a VD can predict the electron beam current profile and Longitudinal Phase Space (LPS) distribution (Marx et al. 2018b) along the accelerator using either scalar controls (Emma et al. 2018) or spectral information (Hanuka et al. 2020) as the non-invasive input to the VD. For reliable deployment of the VD in safety-critical systems such as particle accelerators, the uncertainty in the prediction must be estimated.

In this work, we apply deep learning tools to provide a confidence interval for the virtual diagnostic prediction using experimental data from the Linac Coherent Light Source (LCLS) at SLAC. Our results show an accurate prediction of the diagnostic output together with an interval representing the prediction’s uncertainty. A reliable VD would aid in interpreting experimental results and enable the system’s users to make informed decisions.

Figure 1: (a) Input spectra. (b)-(d) LPS images for spectra #1, #2 and #3.

Particle Accelerators

High-brightness beam linear accelerators typically operate in single-pass, multi-stage configurations where a high-density electron beam is accelerated and manipulated prior to delivery to users in an experimental station. An example of such a facility is the X-ray Free Electron Laser facility at SLAC National Accelerator Laboratory. At SLAC’s Linac Coherent Light Source, the electron beam is manipulated to emit coherent X-ray pulses. An important monitored property is the longitudinal phase space (LPS) of the electron beam. LPS images characterize the longitudinal properties of the electron beam and can give insight into the quality of the emitted X-rays. Currently, the LPS is measured by an X-band transverse deflecting cavity (XTCAV) (Marx et al. 2018a). This measurement is invasive, i.e. the beam cannot be diagnosed and used in the experiments at the same time. Therefore, a new set of diagnostic tools capable of predicting the LPS continuously is required.

Methods

In this work, we train a virtual diagnostic to predict the longitudinal phase space (LPS). We use an ensemble method to estimate the prediction uncertainty.

Data set. The input was spectral information, which can be collected non-destructively by an infrared spectrometer. The output was the corresponding LPS image as measured at the XTCAV. Three examples of the inputs and outputs are shown in Figure 1. The data set contains 4000 pairs of spectra and matching LPS images. The data were randomly shuffled and split into 80% and 20% partitions for training and testing, respectively.
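The shuffle-and-split step can be illustrated with a minimal sketch. The function name `shuffle_split`, the fixed seed and the integer placeholders standing in for (spectrum, LPS) pairs are assumptions made for illustration; the text does not describe how the split was implemented.

```python
import random

def shuffle_split(pairs, train_fraction=0.8, seed=0):
    """Shuffle (spectrum, LPS) pairs and split them into train/test lists."""
    shuffled = list(pairs)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder data: integer IDs stand in for the 4000 (spectrum, LPS) pairs.
dataset = list(range(4000))
train, test = shuffle_split(dataset)
print(len(train), len(test))  # 3200 800
```

With 4000 pairs, an 80/20 split gives 3200 training shots and 800 test shots.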

VD architecture. The neural network (NN) is a dense feed-forward network with three hidden layers of sizes 200, 100 and 50 and rectified linear unit (ReLU) activation functions. Training was done in batches of 32 for 500 epochs using the Adam optimizer with a fixed learning rate of 0.001 (Hanuka et al. 2020). These hyperparameters were selected after a tuning study. Training minimized the standard Mean Squared Error (MSE) loss on the training set. We used the Keras and TensorFlow libraries (Chollet et al. 2015; Abadi et al. 2015) to build and train the models.
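A network of this kind could be set up in Keras roughly as follows. This is a sketch under stated assumptions, not the authors' code: the input and output dimensions (`n_inputs`, `n_outputs`) are not given in the text, the linear output layer is assumed, and the function name `build_vd_model` is hypothetical.

```python
# Hyperparameters quoted in the text.
HIDDEN_LAYERS = (200, 100, 50)
BATCH_SIZE = 32
EPOCHS = 500
LEARNING_RATE = 1e-3

def build_vd_model(n_inputs, n_outputs):
    """Dense feed-forward network: spectrum vector -> flattened LPS image.

    n_inputs and n_outputs are assumptions; the text does not state them.
    """
    # Imported here so the constants above can be read without TensorFlow.
    import tensorflow as tf

    layers = [tf.keras.Input(shape=(n_inputs,))]
    for width in HIDDEN_LAYERS:
        layers.append(tf.keras.layers.Dense(width, activation="relu"))
    # Linear output layer (an assumption), one unit per LPS pixel.
    layers.append(tf.keras.layers.Dense(n_outputs))

    model = tf.keras.Sequential(layers)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="mse",
    )
    return model
```

Training would then be a call such as `model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS)`, where `x_train` and `y_train` are arrays of spectra and flattened LPS images.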

Ensemble methods. A deep ensemble is a group of neural networks that are initialized with different random parameters and trained independently. It has been shown that ensemble methods can improve uncertainty estimates when used with large neural networks and nonconvex loss surfaces (Lakshminarayanan, Pritzel, and Blundell 2017). The predicted LPS for a test shot, $\vec{L}_{\mathrm{predicted}} = \frac{1}{M}\sum_{m=1}^{M} \vec{l}_{\mathrm{predicted},m}$, is the mean of the $M$ individual neural network predictions $\vec{l}_{\mathrm{predicted},m}$ in the ensemble. The uncertainty for a VD prediction is taken as the standard deviation of the neural network predictions, $\vec{\sigma} = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left(\vec{l}_{\mathrm{predicted},m} - \vec{L}_{\mathrm{predicted}}\right)^2}$. Here, we used random initializations drawn from the Glorot uniform distribution (Glorot and Bengio 2010) with an ensemble size of $M = 8$. This ensemble size was chosen since it yielded a small MSE while capturing the statistics.
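The ensemble mean and standard deviation described above are computed independently for each pixel. A minimal sketch, assuming each member's prediction is available as a flattened list of pixel values (the function name and the toy numbers are illustrative only):

```python
from statistics import fmean, pstdev

def ensemble_mean_std(member_predictions):
    """Pixel-wise mean and population standard deviation over M members.

    member_predictions: list of M flattened images (equal-length lists).
    """
    pixels = list(zip(*member_predictions))  # regroup values per pixel
    mean = [fmean(p) for p in pixels]
    std = [pstdev(p) for p in pixels]        # divides by M, not M - 1
    return mean, std

# Toy example: M = 3 members, 2 pixels each (illustrative values only).
mean, std = ensemble_mean_std([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
print(mean)  # [2.0, 0.0]
print(std)   # first entry is sqrt(2/3), about 0.816; second is 0.0
```

`pstdev` is used rather than `stdev` because the standard deviation in the text carries a $1/M$ prefactor. Pixels on which all members agree receive zero uncertainty.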

Metrics for model evaluation. To evaluate the mean prediction of the VD, we used the mean squared error (MSE) metric. To evaluate the quality of the mean prediction we plot the difference between the VD prediction and the ground truth (see Figure 2a). To evaluate the uncertainty intervals provided by the predictive standard deviation, we use a custom accuracy metric:

$$\mathrm{Accuracy} = \frac{\sum_{t,e=1}^{T,E} \delta_{t,e}\, L^2_{\mathrm{measured},t,e}}{\sum_{t,e=1}^{T,E} L^2_{\mathrm{measured},t,e}} \qquad (1)$$

where $\delta_{t,e} = 1$ if $L_{\mathrm{lower},t,e} < L_{\mathrm{measured},t,e} < L_{\mathrm{upper},t,e}$ and 0 otherwise. We used bounds of $L_{\mathrm{predicted},t,e} \pm 2\sigma_{t,e}$, where $\sigma_{t,e}$ is the predictive standard deviation at time $t$ and energy level $e$. To visualize the accuracy, we plot the ground truth with red pixels indicating where the ground truth lies within the $2\sigma$ interval (see Figure 2b).
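The accuracy metric can be read as the intensity-weighted fraction of pixels whose measured value falls inside the predicted interval. A minimal sketch, assuming flattened images stored as lists and a hypothetical function name `weighted_accuracy`; the toy values are illustrative only:

```python
def weighted_accuracy(measured, predicted, sigma, k=2.0):
    """Intensity-weighted fraction of pixels inside predicted +/- k*sigma.

    All arguments are flattened images (equal-length lists of floats).
    """
    inside = 0.0
    total = 0.0
    for m, p, s in zip(measured, predicted, sigma):
        weight = m * m                  # squared measured intensity
        total += weight
        if p - k * s < m < p + k * s:   # strict bounds, as in the text
            inside += weight
    return inside / total if total > 0 else 0.0

# Toy example (illustrative values only).
measured  = [1.0, 2.0, 0.0]
predicted = [1.1, 1.0, 0.0]
sigma     = [0.1, 0.2, 0.0]
print(weighted_accuracy(measured, predicted, sigma))  # 0.2
```

In the toy example only the first pixel lies within its interval, and it carries weight 1 out of a total weight of 5, so bright pixels that are missed dominate the score. Pixels with zero measured intensity do not contribute either way.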

Results and Discussion

The average MSE of the VD on the test set is 6.714e-4 with an accuracy of 0.538. In what follows we present two common prediction errors: shape and translational. Figure 2 shows an example of a poor test shot with an MSE of 8.585e-4 and a low accuracy of 0.264. To analyze and visualize the prediction quality, we present plots of the shot’s difference and accuracy metrics as defined in the Methods section. The large MSE can be explained by Figure 2a, which depicts the difference between the prediction and the ground truth. Here, the shapes of the predicted and measured images do not match. We refer to this as shape error.

Another poor shot is shown in Figure 3a, with an MSE of 5.604e-4 and an accuracy of 0.380. However, this lower performance is due to translational error, not shape error as in the previous example. Since the shape tells us the most about the physical properties of the beam, we can translate the prediction to match the ‘center of mass’ of the measured image (see Figure 3b). Applying this translation correction yields an MSE of 2.017e-4 and an accuracy of 0.603, an improvement of roughly 64% and 58.7%, respectively. Additionally, we can better see how the shapes of the measured and predicted images differ. Before the translation correction, these slight differences were masked by the translational error. Both types of errors could potentially be reduced if spatial connectivity were leveraged in a more sophisticated network architecture.
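One way to realize such a translation correction is to shift the predicted image by the whole-pixel offset between the two intensity centroids. The sketch below is an assumption about how this could be done, not the authors' procedure: the helper names, the zero fill for exposed pixels and the rounding to integer shifts are all illustrative choices, and the images are assumed to have non-zero total intensity.

```python
def center_of_mass(image):
    """Intensity-weighted (row, column) centroid of a 2D image (list of rows)."""
    total = sum(sum(row) for row in image)
    r = sum(i * v for i, row in enumerate(image) for v in row) / total
    c = sum(j * v for row in image for j, v in enumerate(row)) / total
    return r, c

def shift(image, dr, dc):
    """Translate an image by whole pixels, filling exposed pixels with zeros."""
    n_rows, n_cols = len(image), len(image[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            ni, nj = i + dr, j + dc
            if 0 <= ni < n_rows and 0 <= nj < n_cols:
                out[ni][nj] = image[i][j]
    return out

def align_to_measured(predicted, measured):
    """Shift the prediction so its centroid matches the measured one."""
    pr, pc = center_of_mass(predicted)
    mr, mc = center_of_mass(measured)
    return shift(predicted, round(mr - pr), round(mc - pc))

def mse(a, b):
    """Mean squared error between two equally sized 2D images."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

# Toy example: the same single-pixel blob, offset by one row and one column.
measured  = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
predicted = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
aligned = align_to_measured(predicted, measured)
print(mse(predicted, measured), mse(aligned, measured))  # second value is 0.0
```

In the toy case the shapes are identical, so the error vanishes after alignment. For real shots, whatever error remains after alignment reflects the shape mismatch.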

Conclusions and Outlook

In this work, we presented methods, metrics, and visualization tools to predict single-shot electron beam longitudinal properties in phase space and to quantify the prediction uncertainty. Although looking at individual shots allows us to pinpoint data set features and analyze problems with our virtual diagnostic (VD), it does not give much insight into how the VD performs on the data set as a whole. Since the ground truth will not be available during real-time operations, such insight is important for evaluating the VD's reliability. In future research, we will investigate methods to assess and visualize the predicted uncertainty over an entire test set. This would allow users to make informed decisions regarding machine operations and data analysis.

Figure 3: (a) Before correction. (b) After correction.

Glorot, X.; and Bengio, Y. 2010. Understanding the difficulty of training deep feedforward neural networks. volume 9 of Proceedings of Machine Learning Research, 249–256. Chia Laguna Resort, Sardinia, Italy: JMLR Workshop and Conference Proceedings. URL http://proceedings.mlr.press/v9/glorot10a.html.

Hanuka, A.; Emma, C.; Maxwell, T.; Fisher, A.; Jacobson, B.; Hogan, M. J.; and Huang, Z. 2020. Accurate and confident prediction of electron beam longitudinal properties using spectral virtual diagnostics.

Lakshminarayanan, B.; Pritzel, A.; and Blundell, C. 2017. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in neural information processing systems, 6402–6413.

Marx, D.; Assmann, R.; Craievich, P.; Dorda, U.; Grudiev, A.; and Marchetti, B. 2018a. Longitudinal phase space reconstruction simulation studies using a novel X-band transverse deflecting structure at the SINBAD facility at DESY. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 909: 374–378. ISSN 0168-9002. doi:https://doi.org/10.1016/j.nima.2018.02.037. URL http://www.sciencedirect.com/science/article/pii/S0168900218301918. 3rd European Advanced Accelerator Concepts workshop (EAAC2017).

Marx, D.; Assmann, R.; Craievich, P.; Dorda, U.; Grudiev, A.; and Marchetti, B. 2018b. Longitudinal phase space reconstruction simulation studies using a novel X-band transverse deflecting structure at the SINBAD facility at DESY. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 909: 374–378. ISSN 0168-9002. doi:https://doi.org/10.1016/j.nima.2018.02.037. URL http://www.sciencedirect.com/science/article/pii/S0168900218301918. 3rd European Advanced Accelerator Concepts workshop (EAAC2017).

Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G. S.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Goodfellow, I.; Harp, A.; Irving, G.; Isard, M.; Jia, Y.; Jozefowicz, R.; Kaiser, L.; Kudlur, M.; Levenberg, J.; Mané, D.; Monga, R.; Moore, S.; Murray, D.; Olah, C.; Schuster, M.; Shlens, J.; Steiner, B.; Sutskever, I.; Talwar, K.; Tucker, P.; Vanhoucke, V.; Vasudevan, V.; Viégas, F.; Vinyals, O.; Warden, P.; Wattenberg, M.; Wicke, M.; Yu, Y.; and Zheng, X. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. URL http://tensorflow.org/. Software available from tensorflow.org.

Chollet, F.; et al. 2015. Keras. https://keras.io.

Emma, C.; et al. 2019. Machine Learning-Based Longitudinal Phase Space Prediction of Two-Bunch Operation at FACET-II. Proceedings of the 8th International Beam Instrumentation Conference IBIC2019, Sweden. doi:10.18429/JACoW-IBIC2019-THBO01. URL http://jacow.org/ibic2019/doi/JACoW-IBIC2019-THBO01.html.

Emma, C.; Edelen, A.; Hogan, M. J.; O'Shea, B.; White, G.; and Yakimenko, V. 2018. Machine learning-based longitudinal phase space prediction of particle accelera-