=Paper=
{{Paper
|id=Vol-3375/paper4
|storemode=property
|title=Forecasting Multivariate Time Series of the Magnetic Field Parameters of the Solar Events
|pdfUrl=https://ceur-ws.org/Vol-3375/paper4.pdf
|volume=Vol-3375
|authors=Khaznah Alshammari,Shah Muhammad Hamdi,Ali Ahsan Muhummad Muzaheed,Soukaina Filali Boubrahimi
|dblpUrl=https://dblp.org/rec/conf/amlts/AlshammariHMB22
}}
==Forecasting Multivariate Time Series of the Magnetic Field Parameters of the Solar Events==
Khaznah Alshammari¹·*, Shah Muhammad Hamdi², Ali Ahsan Muhummad Muzaheed¹ and Soukaina Filali Boubrahimi²
¹ New Mexico State University, Las Cruces, NM, 88003, USA
² Utah State University, Logan, UT, 84322, USA

AMLTS'22: Workshop on Applied Machine Learning Methods for Time Series Forecasting, co-located with the 31st ACM International Conference on Information and Knowledge Management (CIKM), October 17-21, 2022, Atlanta, USA
kalshamm@nmsu.edu (K. Alshammari); s.hamdi@usu.edu (S. M. Hamdi); muzaheed@nmsu.edu (A. A. M. Muzaheed); soukaina.boubrahimi@usu.edu (S. F. Boubrahimi)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)
Abstract
Solar magnetic field parameters are frequently used by solar physicists in analyzing and predicting solar events (e.g., flares, coronal mass ejections, etc.). Temporal observation of the magnetic field parameters, i.e., their multivariate time series (MVTS) representation, facilitates finding relationships of magnetic field states to the occurrence of solar events. Forecasting MVTS of solar magnetic field parameters is the prediction of future magnetic field parameter values based on historical values, regardless of the event labels. In this paper, we propose a deep sequence-to-sequence (seq2seq) learning approach based on batch normalization and a Long Short-Term Memory (LSTM) network for MVTS forecasting of magnetic field parameters of solar events. To the best of our knowledge, this is the first work that addresses the forecasting of magnetic field parameters rather than the classification of events based on MVTS representations of those parameters. The experimental results on a real-life MVTS-based solar event dataset demonstrate that our batch normalization-based model outperforms naive sequence models in forecasting performance.
Keywords
Multivariate Time Series Forecasting, Solar Physics, Solar Magnetic Field Parameters, LSTM, Batch Normalization
1. Introduction

Solar events are characterized by magnetic field parameter values on the solar corona, such as helicity, flux, and Lorentz force. These magnetic field parameter values indicate the occurrence of extreme solar events such as solar flares, coronal mass ejections (CMEs), and eruptions of solar energetic particles (SEPs) [1]. These events are caused by a sudden burst of magnetic flux from the corona. The X-ray radiation of such extreme solar events can have devastating effects on life and infrastructure in space and on the ground, such as disruption of GPS and radio communication, damage to electronic devices, and radiation exposure-based health risks to astronauts. The cost associated with infrastructure damage after extreme solar events can rise up to trillions of dollars [2].

In recent years, the prediction of solar events within a predefined time window has become an important challenge in the heliophysics community. Since the theoretical relationship between magnetic field influx and the occurrence of extreme events in solar active regions (ARs) is not yet established, space weather researchers depend on data science-based approaches for predicting solar events. The primary data source used in these efforts is the images captured by the Helioseismic and Magnetic Imager (HMI) housed in the Solar Dynamics Observatory (SDO). HMI images (captured in near-continuous time) contain spatiotemporal magnetic field data of solar active regions. For performing temporal window-based flare prediction of an AR instance, the spatiotemporal magnetic field data of that region is mapped into a multivariate time series (MVTS) instance [3]. MVTS instances, collected with a uniform sampling rate throughout a preset observation period, are labeled with multiple event classes (e.g., flare classes), and machine learning-based classifiers are trained with labeled MVTS instances to predict the occurrences of the events after a preset prediction window. Although multiple research efforts [4, 5, 6] addressed MVTS-based solar event prediction, forecasting of MVTS-represented magnetic field parameters is yet to be explored.

In this work, we aim to forecast the future values of the magnetic field parameters, given past values in the MVTS representations. In case of a sudden data gap, i.e., an interruption in the communication between the satellite and the ground receiver, MVTS forecasting of magnetic field parameters can play an important role in extrapolation. To the best of our knowledge, this is the first attempt to forecast the solar magnetic field parameters. We used a deep sequence-to-sequence learning model based on batch normalization and a Long Short-Term Memory (LSTM) network that is trained with input-output pairs of examples, where the inputs are formed by sampling the MVTS instances for an observation window, and the outputs are formed by sampling the MVTS instances for a prediction window (which follows the observation window). Our LSTM-based encoder-decoder model is trained with a backpropagation algorithm based on mini-batch gradient descent optimization, minimizing the Mean Squared Error (MSE) between the observed MVTS (input) and the predicted MVTS (output).
2. Related Work

Recent research efforts on solar event prediction are mostly based on data science. Data-driven extreme solar event prediction models stem from linear and nonlinear statistics. Datasets used in these models were collected from line-of-sight magnetogram and vector magnetogram data. A line-of-sight magnetogram contains only the line-of-sight component of the magnetic field, while a vector magnetogram contains the full-disk magnetic field data [7]. NASA launched the Solar Dynamics Observatory (SDO) in 2010. Since then, SDO's instrument, the Helioseismic and Magnetic Imager (HMI), has been mapping the full-disk vector magnetic field every 12 minutes [1]. Most of the recent prediction models use the near-continuous stream of vector magnetogram data obtained from SDO [8]. Magnetic field parameters (e.g., helicity, flux, etc.) were developed with the goal of finding a relationship between the photospheric magnetic field behavior and solar activity, which usually occurs in the solar chromosphere and the transition region of the solar corona.

Deep learning-based sequence-to-sequence models using Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Gated Recurrent Unit (GRU) architectures have been used successfully in multiple Natural Language Processing (NLP) tasks such as machine translation [9, 10] and text summarization [11, 12]. Since multivariate time series are high-dimensional sequence data, MVTS forecasting has previously been addressed by different seq2seq models [13, 14]. In [15], batch normalization has shown promising improvements in the sentiment classification task, where a batch-normalized variant of the LSTM architecture is used and each LSTM cell's input, hidden state, and cell state are normalized during training. Inspired by encoder-decoder-based machine translation models, in this work we consider the MVTS forecasting of solar magnetic field parameters as a sequence-to-sequence learning task and use a batch normalization-based LSTM architecture for capturing long-term dependencies of multi-dimensional sequence data.

3. Methodology

3.1. Notations

Each solar active region that results in different event occurrences after a given prediction window represents an event instance. The event instance $i$ is represented by an MVTS instance $mvts_i$. The MVTS instance $mvts_i \in \mathbb{R}^{T \times N}$ is a collection of individual time series of $N$ magnetic field parameters, where each time series contains periodic observation values of the corresponding parameter for an observation period $T$. In the MVTS instance $mvts_i = \{v_{t_1}, v_{t_2}, \ldots, v_{t_T}\}$, each $v_{t_j} \in \mathbb{R}^N$ represents a timestamp vector. We divide the dataset into $(X_i, Y_i)$ pairs, where $X_i = mvts_i[t_1 : t_{obs}, :] \in \mathbb{R}^{t_{obs} \times N}$, $Y_i = mvts_i[t_{obs}+1 : t_T, :] \in \mathbb{R}^{t_{pred} \times N}$, $t_{obs}$ is the observation time, and $t_{pred}$ is the prediction time.
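To make the notation concrete, below is a minimal NumPy sketch of the $(X_i, Y_i)$ pair extraction (our own illustration; the variable names and the random placeholder data are not from the original code):

    import numpy as np

    T, N = 60, 25                     # observation period and number of parameters
    t_obs, t_pred = 40, 20            # T = t_obs + t_pred
    mvts_i = np.random.randn(T, N)    # placeholder for one real MVTS instance

    X_i = mvts_i[:t_obs, :]           # observation window, shape (40, 25)
    Y_i = mvts_i[t_obs:, :]           # prediction window, shape (20, 25)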
3.2. LSTM and Batch Normalization-based MVTS Forecasting

In this section, we present a batch normalization-based implementation of the encoder-decoder model that uses the LSTM architecture and compare it with baseline sequence models that use a naive stochastic gradient descent implementation (without batch normalization). There are different deep sequence learning models that are frequently applied in machine translation, and they can be adapted for time series forecasting. In this study, we analyze two kinds of seq2seq models, the batch normalization-based seq2seq LSTM model (BN seq2seq LSTM) and the seq2seq models based on LSTM/GRU/RNN, and compare their forecasting results.

Figure 1: LSTM and Batch normalization-based seq2seq model for MVTS forecasting

Fig. 1 depicts our seq2seq-based model that uses batch normalization and the LSTM architecture. First, in the encoder, the value of each time step is used as input to the encoder LSTM cell together with the previous cell state $c$ and hidden state $h$; the process repeats until the last cell state $c$ and hidden state $h$ are generated. Then, the decoder LSTM cell uses the last cell state $c$ and hidden state $h$ from the encoder as its initial states. The last hidden state of the encoder is also copied $t_{pred}$ times using a Repeat Vector layer according to the length of the forecasting window, and each copy is inputted into the decoder LSTM cell together with the previous cell state $c$ and hidden state $h$. The decoder outputs hidden states for all the $t_{pred}$ time steps, and the hidden states are connected to the final time-distributed dense layer in order to produce the final output sequence. The time-distributed dense layer applies a dense layer to every temporal slice of the input. We use this final layer to process the output from the LSTM hidden layer. Every input shape is three-dimensional, and the first dimension of the input is considered to be the temporal dimension. This means that we need to configure the last LSTM layer prior to the time-distributed dense layer to return output sequences. The output shape will be three-dimensional as well, which means that if the time-distributed dense layer is the output layer, then for predicting a sequence we need to reshape the final representation into a three-dimensional shape [16]. In the batch normalization-based seq2seq LSTM model, we use mini-batches to feed the data into the model. Batch normalization is a useful method for making deep neural network training faster and more robust; it normalizes the input activations to avoid gradient explosion caused by the ELU (Exponential Linear Unit) activation function in the encoder [17]. The batch normalization layer applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1. We found batch normalization to be significant in maximizing the performance of MVTS forecasting for the magnetic field parameters of the solar events, which we demonstrate in more detail in the experiments section.
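A minimal Keras sketch of this encoder-decoder is given below. It is our own reconstruction from the description above (the placement of the BatchNormalization layers and the 25-unit state size are assumptions; the authors' exact configuration is in their repository):

    from tensorflow.keras.layers import (LSTM, BatchNormalization, Dense,
                                         Input, RepeatVector, TimeDistributed)
    from tensorflow.keras.models import Model

    t_obs, t_pred, n_params = 40, 20, 25

    inputs = Input(shape=(t_obs, n_params))
    x = BatchNormalization()(inputs)              # normalize encoder inputs
    # Encoder: keep only the final hidden state h and cell state c.
    _, h, c = LSTM(25, activation="elu", return_state=True)(x)
    # Copy the last hidden state t_pred times for the decoder.
    x = RepeatVector(t_pred)(h)
    # Decoder: initialized with the encoder's (h, c); returns all t_pred states.
    x = LSTM(25, activation="elu", return_sequences=True)(x, initial_state=[h, c])
    x = BatchNormalization()(x)
    # Map each decoder hidden state to a 25-dimensional timestamp vector.
    outputs = TimeDistributed(Dense(n_params))(x)

    model = Model(inputs, outputs)

Setting return_sequences=True on the decoder is what lets the time-distributed dense layer receive one hidden state per forecast step, as described above.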
3.3. Evaluation Metrics

We used Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) to report our model results. These evaluation metrics measure the amount of error in statistical models by quantifying the difference between the observed and predicted values.

Mean Absolute Error (MAE) is the average of the absolute values of the differences between the predicted representations and the ground truth representations:

$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$

where $y_i$ is the ground truth value and $\hat{y}_i$ is the predicted value.

Mean Squared Error (MSE) is defined as the mean of the squares of the differences between the actual and predicted values:

$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Root Mean Squared Error (RMSE) measures the difference between forecast and corresponding observed values, where each difference is squared and averaged over the sample space. It is the square root of the MSE:

$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
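For concreteness, a small NumPy sketch of these three metrics (our own illustration) follows:

    import numpy as np

    def mae(y_true, y_pred):
        # Mean of the absolute differences.
        return np.mean(np.abs(y_true - y_pred))

    def mse(y_true, y_pred):
        # Mean of the squared differences.
        return np.mean((y_true - y_pred) ** 2)

    def rmse(y_true, y_pred):
        # Square root of the MSE.
        return np.sqrt(mse(y_true, y_pred))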
Experiments

We compared the batch normalization-based seq2seq LSTM model with the baseline models on multivariate time series forecasting of magnetic field parameters of a solar events dataset. The source code of our model and the experimental dataset are available in our GitHub repository: https://github.com/Kalshammari/BN_Seq2Seq.

3.4. Dataset Description

As the benchmark dataset of our experiments, we used the MVTS-based solar flare prediction dataset published by Angryk et al. [3]. Each MVTS instance in the dataset is made up of 25 time series of active region magnetic field parameters (a full list can be found in [1]). The time series instances are recorded at 12-minute intervals for a total duration of 12 hours (60 time steps). The dataset therefore has the number of observation points $T = 60$ and the number of dimensions in the timestamp vectors $N = 25$, while the event occurrence window is 12 hours. Our experimental dataset consists of 1,540 MVTS instances that are evenly distributed across four flare classes (X, M, BC, and Q). We discarded the class labels to fit the dataset for MVTS forecasting [5, 4], where each MVTS instance is divided into input and output (ground truth) sequences according to the observation window ($t_{obs}$) and prediction window ($t_{pred}$). In our experiments, $t_{obs} = 40$ and $t_{pred} = 20$, while $T = t_{obs} + t_{pred}$.
Table 1
Forecasting Performance of Batch Normalization-based seq2seq (LSTM) Model compared to the baselines

Performance Metric | Gradient Descent LSTM | Gradient Descent GRU | Gradient Descent RNN | BN seq2seq LSTM
Train MAE          | 14.481 ± 0.043        | 14.942 ± 0.052       | 15.578 ± 0.036       | 0.094 ± 0.002
Test MAE           | 14.55 ± 0.103         | 15.042 ± 0.092       | 15.68 ± 0.107        | 0.057 ± 0.010
Train MSE          | 18.238 ± 0.075        | 19.631 ± 0.062       | 21.269 ± 0.031       | 0.070 ± 0.003
Test MSE           | 22.598 ± 0.251        | 24.906 ± 0.821       | 24.589 ± 0.726       | 0.002 ± 0.001
Train RMSE         | 18.434 ± 0.039        | 19.126 ± 0.652       | 19.921 ± 0.821       | 0.265 ± 0.007
Test RMSE          | 18.492 ± 0.348        | 19.245 ± 0.542       | 20.092 ± 0.672       | 0.005 ± 0.001
3.5. Train/test splitting method

We performed random sampling for train/test splitting, where we use the stratified holdout method (80% for training and 20% for testing) with six different random seeds, and report the mean error rates along with standard deviations. The train and test datasets are z-normalized since the magnetic field parameter values appear on different scales. The shapes of the train and test datasets are as follows (a sketch of this splitting step is given after the list):

• X_train shape: (1232, 40, 25) and y_train shape: (1232, 20, 25)
• X_test shape: (308, 40, 25) and y_test shape: (308, 20, 25)
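A hedged sketch of this step (our own illustration; scikit-learn's train_test_split is one way to realize the stratified holdout, and normalizing with training-set statistics is our assumption):

    import numpy as np
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1540, 40, 25))      # placeholder observation windows
    Y = rng.standard_normal((1540, 20, 25))      # placeholder prediction windows
    labels = rng.integers(0, 4, size=1540)       # placeholder flare-class labels
    seed = 0                                     # one of the six random seeds

    # 80/20 holdout, stratified on the (otherwise discarded) flare classes.
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.2, stratify=labels, random_state=seed)

    # z-normalize each magnetic field parameter.
    mean = X_train.mean(axis=(0, 1), keepdims=True)
    std = X_train.std(axis=(0, 1), keepdims=True)
    X_train, X_test = (X_train - mean) / std, (X_test - mean) / std
    y_train, y_test = (y_train - mean) / std, (y_test - mean) / std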
3.6. Baseline Models

We evaluated our model against LSTM, RNN, and GRU-based seq2seq implementations. In the forward pass, we input the first $t_{obs}$ vectors of each MVTS to the encoder cells (LSTM/RNN/GRU) to produce the encoded hidden state. That encoded hidden state is the input to the decoder cells of the same type. The decoder then predicts the next 25-dimensional timestamp vectors for each timestamp in $t_{pred}$ and matches the prediction with the ground truth to perform stochastic gradient descent-based backpropagation. In all three models, the number of dimensions in the cell state and hidden state representations was 25, the number of epochs in training was 5, and the learning rate in stochastic gradient descent was 0.01. A sketch of one such baseline follows.
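The following is a minimal sketch of the GRU baseline under the description above (our own illustration, without batch normalization; the LSTM and RNN baselines differ only in the cell type):

    from tensorflow.keras.layers import (GRU, Dense, Input, RepeatVector,
                                         TimeDistributed)
    from tensorflow.keras.models import Model

    inputs = Input(shape=(40, 25))
    # Encoder GRU: keep only the final hidden state (a GRU has no cell state).
    _, h = GRU(25, return_state=True)(inputs)
    # Decoder GRU: initialized with the encoder state, one step per forecast point.
    x = GRU(25, return_sequences=True)(RepeatVector(20)(h), initial_state=h)
    outputs = TimeDistributed(Dense(25))(x)

    baseline = Model(inputs, outputs)
    baseline.compile(optimizer="sgd", loss="mse")   # naive SGD, as in the text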
3.7. Performance of LSTM and Batch Normalization-based seq2seq model

When we apply the LSTM and batch normalization-based seq2seq model, we perform the following steps. First, we extract $(X, Y)$ pairs from all 1,540 MVTS instances, where the length of each example $X$ is $t_{obs} = 40$, the length of each output $Y$ is $t_{pred} = 20$, and each timestamp vector is 25-dimensional. In the encoder step, the input is of size $(n, 40, 25)$, where $n (= 10)$ is the batch size of the MVTS instances. For each encoder LSTM cell, the vector of each time step is used as the input to the encoder LSTM cell together with the previous cell state $c$ and hidden state $h$, and the process repeats until the last cell state $c$ and hidden state $h$ are generated. The decoder LSTM cell uses the last cell state $c$ and hidden state $h$ from the encoder as its initial states. The last hidden state of the encoder is also copied 20 times using the Repeat Vector layer, and each copy is inputted into the decoder LSTM cell together with the previous cell state $c$ and hidden state $h$. The decoder outputs a hidden state for each of the 20 time steps, and these hidden states are connected to a time-distributed dense layer to generate the final forecasting output, which is of size $(n, 20, 25)$. We used Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) to report our model performance results. We report the mean and standard deviation of the performance measures in Table 1. We found that our approach of deep sequence-to-sequence learning based on batch normalization and a Long Short-Term Memory (LSTM) network significantly outperformed the baseline methods, as Table 1 shows. It is visible that batch normalization makes a difference by a large margin, producing errors near 0, whereas the traditional seq2seq models result in large error values due to the absence of batch normalization.
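Putting the pieces together, a hedged end-to-end training and forecasting sketch (reusing model, X_train, y_train, X_test, and y_test from the earlier sketches; applying the batch size of 10 from this section, and the epoch and learning-rate settings reported for the baselines, to the BN model is our assumption):

    import numpy as np
    from tensorflow.keras.optimizers import SGD

    model.compile(optimizer=SGD(learning_rate=0.01), loss="mse", metrics=["mae"])
    model.fit(X_train, y_train, batch_size=10, epochs=5,
              validation_data=(X_test, y_test))

    # Forecast the next 20 timestamps for each of the 308 test MVTS instances.
    y_hat = model.predict(X_test)                   # shape (308, 20, 25)
    print("Test RMSE:", np.sqrt(np.mean((y_test - y_hat) ** 2)))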
4. Conclusion

We propose a batch normalization-based deep seq2seq model for multivariate time series forecasting of magnetic field parameters of solar events. Unlike previous works on MVTS-based event classification, we perform forecasting of magnetic field parameter values irrespective of MVTS labels. We compare our model with other seq2seq implementations based on LSTM, GRU, and RNN. Our proposed approach significantly improved the MAE, MSE, and RMSE results of MVTS forecasting on a benchmark solar magnetic field parameter dataset.

For future research, we plan to develop machine learning models for MVTS forecasting that leverage MVTS labels. We aim to use the forecasting models for augmenting (creating synthetic examples of) MVTS instances of minority classes (rare events). In addition, to utilize inter-variable dependencies of the MVTS instances for the task of forecasting, we plan to incorporate graph construction (e.g., functional network computation from the correlation matrices of the MVTS instances) and graph neural network (GNN)-based representation learning.

Acknowledgments

This project has been supported in part by funding from the CISE and GEO directorates under NSF awards #2153379 and #2204363.

References

[1] M. G. Bobra, S. Couvidat, Solar flare prediction using SDO/HMI vector magnetic field data with a machine-learning algorithm, The Astrophysical Journal 798 (2015) 135.
[2] J. P. Eastwood, E. Biffis, M. A. Hapgood, L. Green, M. M. Bisi, R. D. Bentley, R. Wicks, L.-A. McKinnell, M. Gibbs, C. Burnett, The economic impact of space weather: Where do we stand?, Risk Analysis 37 (2017) 206–218.
[3] R. A. Angryk, P. C. Martens, B. Aydin, D. Kempton, S. S. Mahajan, S. Basodi, A. Ahmadzadeh, X. Cai, S. Filali Boubrahimi, S. M. Hamdi, et al., Multivariate time series dataset for space weather data analytics, Scientific Data 7 (2020) 1–13.
[4] S. M. Hamdi, D. Kempton, R. Ma, S. F. Boubrahimi, R. A. Angryk, A time series classification-based approach for solar flare prediction, in: 2017 IEEE Intl. Conf. on Big Data (Big Data), IEEE, 2017, pp. 2543–2551.
[5] A. Muzaheed, S. M. Hamdi, S. F. Boubrahimi, Sequence model-based end-to-end solar flare classification from multivariate time series data, in: 20th IEEE Intl. Conf. on Machine Learning and Applications, ICMLA 2021, Pasadena, CA, USA, December 13-16, 2021, IEEE, 2021, pp. 435–440.
[6] R. Ma, S. F. Boubrahimi, S. M. Hamdi, R. A. Angryk, Solar flare prediction using multivariate time series decision trees, in: 2017 IEEE Intl. Conf. on Big Data, BigData 2017, Boston, MA, USA, December 11-14, 2017, IEEE Computer Society, 2017, pp. 2569–2578.
[7] S. F. Boubrahimi, B. Aydin, D. Kempton, R. Angryk, Spatio-temporal interpolation methods for solar events metadata, in: 2016 IEEE Intl. Conf. on Big Data (Big Data), IEEE, 2016, pp. 3149–3157.
[8] J. P. Mason, J. Hoeksema, Testing automated solar flare forecasting with 13 years of Michelson Doppler Imager magnetograms, The Astrophysical Journal 723 (2010) 634.
[9] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint arXiv:1409.0473 (2014).
[10] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[11] M. Yousefi-Azar, L. Hamey, Text summarization using unsupervised deep learning, Expert Systems with Applications 68 (2017) 93–105.
[12] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are unsupervised multitask learners, OpenAI Blog 1 (2019) 9.
[13] L. Liu, A. M. Finch, M. Utiyama, E. Sumita, Agreement on target-bidirectional LSTMs for sequence-to-sequence learning, in: Proc. of the AAAI Conf. on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, AAAI Press, 2016, pp. 2630–2637.
[14] P. H. Scherrer, J. Schou, R. Bush, A. Kosovichev, R. Bogart, J. Hoeksema, Y. Liu, T. Duvall, J. Zhao, C. Schrijver, et al., The Helioseismic and Magnetic Imager (HMI) investigation for the Solar Dynamics Observatory (SDO), Solar Physics 275 (2012) 207–227.
[15] H. Margarit, R. Subramaniam, A batch-normalized recurrent network for sentiment classification, Advances in Neural Information Processing Systems (2016) 2–8.
[16] J. Brownlee, Long Short-Term Memory Networks With Python: Develop Sequence Prediction Models with Deep Learning, Machine Learning Mastery, 2017.
[17] S. Santurkar, D. Tsipras, A. Ilyas, A. Madry, How does batch normalization help optimization?, Advances in Neural Information Processing Systems 31 (2018).