<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Forecasting Multivariate Time Series of the Magnetic Field Parameters of the Solar Events</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Khaznah Alshammari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shah Muhammad Hamdi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ali Ahsan Muhummad Muzaheed</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soukaina Filali Boubrahimi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>New Mexico State University</institution>
          ,
          <addr-line>Las Cruces, NM, 88003</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Utah State University</institution>
          ,
          <addr-line>Logan, UT, 84322</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Solar magnetic field parameters are frequently used by solar physicists in analyzing and predicting solar events (e.g., flares and coronal mass ejections). Temporal observation of the magnetic field parameters, i.e., their multivariate time series (MVTS) representation, facilitates relating magnetic field states to the occurrence of solar events. Forecasting the MVTS of solar magnetic field parameters means predicting future magnetic field parameter values from their historical values, regardless of the event labels. In this paper, we propose a deep sequence-to-sequence (seq2seq) learning approach based on batch normalization and a Long Short-Term Memory (LSTM) network for MVTS forecasting of the magnetic field parameters of solar events. To the best of our knowledge, this is the first work that addresses the forecasting of magnetic field parameters rather than the classification of events based on MVTS representations of those parameters. Experimental results on a real-life MVTS-based solar event dataset demonstrate that our batch normalization-based model outperforms naive sequence models in forecasting performance.</p>
      </abstract>
      <kwd-group>
<kwd>Multivariate Time Series Forecasting</kwd>
        <kwd>Solar Physics</kwd>
        <kwd>Solar Magnetic Field Parameters</kwd>
        <kwd>LSTM</kwd>
        <kwd>Batch Normalization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Solar events are characterized by magnetic field parameter values of the solar corona, such as helicity, flux, and Lorentz force. These magnetic field parameter values indicate the occurrence of extreme solar events such as solar flares, coronal mass ejections (CME), and eruptions of solar energetic particles (SEP) [<xref ref-type="bibr" rid="ref1">1</xref>]. These events are caused by a sudden burst of magnetic flux from the corona. The X-ray radiation of such extreme solar events can have devastating effects on life and infrastructure in space and on the ground, such as disruption in GPS and radio communication, damage to electronic devices, and radiation exposure-based health risks to astronauts. The cost associated with infrastructure damage after extreme solar events can rise up to trillions of dollars [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
      <p>In recent years, the prediction of solar events within a predefined time window has become an important challenge in the heliophysics community. Since the theoretical relationship between magnetic field influx and the occurrence of extreme events in solar active regions (AR) is not yet established, space weather researchers depend on data science-based approaches for predicting solar events. The primary data source used in these efforts is the imagery captured by the Helioseismic and Magnetic Imager (HMI) housed in the Solar Dynamics Observatory (SDO). HMI images (captured in near-continuous time) contain spatiotemporal magnetic field data of solar active regions. For performing temporal window-based flare prediction of an AR instance, the spatiotemporal magnetic field data of that region is mapped into a multivariate time series (MVTS) instance [<xref ref-type="bibr" rid="ref3">3</xref>]. MVTS instances, collected with a uniform sampling rate throughout a preset observation period, are labeled with multiple event classes (e.g., flare classes), and machine learning-based classifiers are trained with the labeled MVTS instances to predict the occurrences of the events after a preset prediction window. Although multiple research efforts [<xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>] addressed MVTS-based solar event prediction, forecasting of the MVTS-represented magnetic field parameters is yet to be explored.</p>
      <p>In this work, we aim to forecast the future values of the magnetic field parameters, given their past values in the MVTS representations. In the case of a sudden data gap, i.e., an interruption in the communication between the satellite and the ground receiver, MVTS forecasting of the magnetic field parameters can play an important role in extrapolation. To the best of our knowledge, this is the first attempt to forecast the solar magnetic field parameters. We use a deep sequence-to-sequence (seq2seq) learning model based on batch normalization and a Long Short-Term Memory (LSTM) network that is trained with input-output pairs of examples, where the inputs are formed by sampling the MVTS instances over an observation window, and the outputs are formed by sampling the MVTS instances over a prediction window that follows the observation window. Our LSTM-based encoder-decoder model is trained with a backpropagation algorithm based on mini-batch gradient descent optimization, minimizing the Mean Squared Error (MSE) between the observed MVTS (input) and the predicted MVTS (output).</p>
    </sec>
    <sec id="sec-related-work">
      <title>2. Related Work</title>
      <p>Recent research efforts on solar event prediction are mostly based on data science. Data-driven extreme solar event prediction models stem from linear and nonlinear statistics. Datasets used in these models were collected from line-of-sight magnetogram and vector magnetogram data. A line-of-sight magnetogram contains only the line-of-sight component of the magnetic field, while a vector magnetogram contains the full-disk magnetic field data [<xref ref-type="bibr" rid="ref7">7</xref>]. NASA launched the Solar Dynamics Observatory (SDO) in 2010. Since then, SDO's instrument, the Helioseismic and Magnetic Imager (HMI), has been mapping the full-disk vector magnetic field every 12 minutes [<xref ref-type="bibr" rid="ref1">1</xref>]. Most of the recent prediction models use the near-continuous stream of vector magnetogram data obtained from SDO [<xref ref-type="bibr" rid="ref8">8</xref>]. Magnetic field parameters (e.g., helicity, flux, etc.) were developed with the goal of finding a relationship between the photospheric magnetic field behavior and solar activity, which usually occurs in the solar chromosphere and the transition region of the solar corona.</p>
      <p>Deep learning-based sequence-to-sequence models using Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Gated Recurrent Unit (GRU) architectures have been used successfully in multiple Natural Language Processing (NLP) tasks such as machine translation [<xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>] and text summarization [<xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>]. Since multivariate time series are high-dimensional sequence data, MVTS forecasting has previously been addressed by different seq2seq models [<xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>]. In [<xref ref-type="bibr" rid="ref15">15</xref>], batch normalization showed promising improvements on a sentiment classification task, where a batch-normalized variant of the LSTM architecture is used and each LSTM cell's input, hidden state, and cell state are normalized during training. Inspired by encoder-decoder-based machine translation models, in this work we treat the MVTS forecasting of solar magnetic field parameters as a sequence-to-sequence learning task and use a batch normalization-based LSTM architecture to capture long-term dependencies of the multi-dimensional sequence data.</p>
    </sec>
    <sec id="sec-methodology">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Notations</title>
        <p>Each solar active region, which results in different event occurrences after a given prediction window, represents an event instance. An event instance is represented by an MVTS instance M. The MVTS instance M ∈ R^(T×N) is a collection of the individual time series of N magnetic field parameters, where each time series contains periodic observation values of the corresponding parameter over an observation period T. In the MVTS instance M = {m_1, m_2, ..., m_T}, each m_t ∈ R^N is a timestamp vector. We divide the dataset into (X, Y) pairs, where X = M[1 : t_ob, :] ∈ R^(t_ob × N), Y = M[t_ob + 1 : T, :] ∈ R^(t_pred × N), t_ob is the observation time, and t_pred is the prediction time.</p>
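        <p>As a minimal sketch of this slicing, the following NumPy snippet cuts stacked MVTS instances into (X, Y) pairs along the time axis; the random array is only a stand-in for the real dataset, and the variable names are ours.</p>
        <preformat>
import numpy as np

# Hypothetical stand-in: 1,540 MVTS instances, T = 60 time steps, N = 25 parameters.
mvts = np.random.randn(1540, 60, 25)

t_ob, t_pred = 40, 20                 # observation and prediction windows (T = t_ob + t_pred)

X = mvts[:, :t_ob, :]                 # X = M[1 : t_ob, :]     shape (1540, 40, 25)
Y = mvts[:, t_ob:t_ob + t_pred, :]    # Y = M[t_ob+1 : T, :]   shape (1540, 20, 25)
        </preformat>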
      <sec id="sec-1-1">
        <title>3.2. LSTM and Batch Normalization-based</title>
      </sec>
      <sec id="sec-1-2">
        <title>MVTS Forecasting</title>
        <p>to 1. We found batch normalization to be significant in
maximizing the performance of MVTS forecasting for the
magnetic field parameters of the solar events, which we
demonstrate in more detail in the experiments section.</p>
      </sec>
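        <p>As a rough illustration of this architecture, the following Keras sketch stacks batch normalization, an encoder LSTM, a Repeat Vector layer, a decoder LSTM, and a time-distributed dense output; with t_ob = 40, t_pred = 20, and N = 25 it maps a batch of shape (b, 40, 25) to a forecast of shape (b, 20, 25). The hidden size of 64, the exact placement of the batch normalization layers, and the use of tf.keras are our assumptions for illustration and are not taken verbatim from the released implementation.</p>
        <preformat>
from tensorflow import keras
from tensorflow.keras import layers

T_OB, T_PRED, N = 40, 20, 25   # observation window, prediction window, parameters

# Illustrative batch-normalized seq2seq LSTM forecaster (layer sizes are assumptions).
model = keras.Sequential([
    keras.Input(shape=(T_OB, N)),
    layers.BatchNormalization(),               # normalize encoder inputs per mini-batch
    layers.LSTM(64, activation="elu"),         # encoder: last hidden state summarizes the window
    layers.BatchNormalization(),
    layers.RepeatVector(T_PRED),               # copy the encoder state once per forecast step
    layers.LSTM(64, return_sequences=True),    # decoder: one hidden state per forecast step
    layers.TimeDistributed(layers.Dense(N)),   # dense layer applied to every temporal slice
])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])
        </preformat>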
      <sec id="sec-1-3">
        <title>3.3. Evaluation Metrics</title>
        <p>In this section, we present a batch normalization-based
implementation of the encoder-decoder model that uses We used Mean Absolute Error (MAE), Mean Squared
LSTM architecture and compare it with other baseline Error (MSE), and Root Mean Squared Error (RMSE) to
sequence models of naive stochastic gradient descent report our model results. The evaluation metrics (MAE,
implementation (without batch normalization). There MSE, and RMSE) measure the amount of error in
statistiare diferent deep sequence learning models, which are cal models. They assess the average squared diference
frequently applied in machine translation, and they can between the observed and predicted values.
be adapted for time series forecasting. In this study, we Mean Absolute Error (MAE) is the average over
analyze two seq2seq models: the batch normalization- the absolute values of the diferences between predicted
based seq2seq LSTM Model (BN seq2seq LSTM), and the representations and ground truth representations.
seq2seq models based on LSTM/GRU/RNN, and compare 
their forecasting results.   = 1 ∑︁ | − ˆ|</p>
        <p>Fig. 1 depicts our seq2seq-based model that uses batch  =1
normalization and LSTM architecture. First, in the
encoder LSTM cells, the value of each time step is used as where  is the ground truth value and ˆ is the predicted
input to the encoder LSTM cell together with the previ- value.
ous cell state  and hidden state ℎ, the process repeats Mean Squared Error (MSE) is defined as the mean
until the last cell state  and hidden state ℎ are generated. or average of the square of the diference between actual
Then, the decoder LSTM cell uses the last cell state  and and predicted values.
hidden state ℎ from the encoder as the initial states for the 
decoder LSTM cell. The last hidden state of the encoder   = 1 ∑︁( − ˆ)2
is also copied  times using a Repeat Vector layer ac-  =1
cording to the length of the forecasting window, and each
copy is inputted into the decoder LSTM cell together with Root Mean Squared Error (RMSE) is the diference
the previous cell state  and hidden state ℎ. The decoder between forecast and corresponding observed values,
outputs hidden states for all the  time steps and the where each diference is squared and averaged over the
hidden states are connected to the final Time-distributed- sample space. It denotes the square root of the MSE.
dense layer in order to produce the final output sequence.</p>
        <p>
          The time-distributed-dense layer allows to apply a dense ⎯ 
layer to every temporal slice of the input. We use this   = ⎷⎸⎸ 1 ∑︁( − ˆ)2
ifnal layer to process the output from the LSTM hidden =1
layer. Every input shape is three-dimensional, and the
ifrst dimension of the input is considered to be the tem- Experiments
poral dimension. This means that we need to configure We compared the batch normalization-based seq2seq
the last LSTM layer prior to the time-distributed-dense LSTM model with the baseline models on multivariate
layer to return output sequences. The output shape will time series forecasting of magnetic field parameters of
be three-dimensional as well, which means that if the a solar events dataset. The source code of our model
time-distributed-dense layer is the output layer, then for and the experimental dataset are available on our GitHub
predicting a sequence we need to reshape the final rep- repository 1.
resentation into a three-dimensional shape [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. In the
batch normalization-based seq2seq LSTM Model, we use 3.4. Dataset Description
mini-batches to feed the data into the model. Batch nor- As the benchmark dataset of our experiments, we used
malization is a useful method for making deep neural the MVTS-based solar flare prediction data set published
network training faster and more robust, and it normal- by Angryk et al [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Each MVTS instance in the dataset
izes the input activations to avoid gradient explosion is made up of 25 time series of active region magnetic
caused by the activation function ELU (Exponential Lin- ifeld parameters (a full list can be found in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]). The time
ear Unit) in the encoder [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. The batch normalization series instances are recorded at 12 minutes intervals for a
layer applies a transformation that maintains the mean
output close to 0 and the output standard deviation close 1https://github.com/Kalshammari/BN_Seq2Seq
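        <p>The three metrics can be computed directly with NumPy; the following sketch applies them over a whole forecast array (the array shapes are illustrative).</p>
        <preformat>
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error over all timestamps and parameters."""
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean Squared Error over all timestamps and parameters."""
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return np.sqrt(mse(y_true, y_pred))

# Illustrative usage on arrays shaped (instances, t_pred, N), e.g. (308, 20, 25).
y_true = np.zeros((308, 20, 25))
y_pred = np.zeros((308, 20, 25))
print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
        </preformat>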
      </sec>
    </sec>
    <sec id="sec-experiments">
      <title>Experiments</title>
      <p>We compared the batch normalization-based seq2seq LSTM model with the baseline models on multivariate time series forecasting of the magnetic field parameters of a solar events dataset. The source code of our model and the experimental dataset are available in our GitHub repository (<ext-link ext-link-type="uri" xlink:href="https://github.com/Kalshammari/BN_Seq2Seq">https://github.com/Kalshammari/BN_Seq2Seq</ext-link>).</p>
      <sec id="sec-3-4">
        <title>3.4. Dataset Description</title>
        <p>As the benchmark dataset for our experiments, we used the MVTS-based solar flare prediction dataset published by Angryk et al. [<xref ref-type="bibr" rid="ref3">3</xref>]. Each MVTS instance in the dataset is made up of 25 time series of active region magnetic field parameters (a full list can be found in [<xref ref-type="bibr" rid="ref1">1</xref>]). The time series are recorded at 12-minute intervals for a total duration of 12 hours (60 time steps). The dataset therefore has T = 60 observation points and N = 25 dimensions in the timestamp vectors, while the event occurrence window is 12 hours. Our experimental dataset consists of 1,540 MVTS instances that are evenly distributed across four flare classes (X, M, BC, and Q). We discarded the class labels to fit the dataset for MVTS forecasting [<xref ref-type="bibr" rid="ref4 ref5">5, 4</xref>], where each MVTS instance is divided into input and output (ground truth) sequences according to the observation window (t_ob) and the prediction window (t_pred). In our experiments, t_ob = 40 and t_pred = 20, while T = t_ob + t_pred.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Train/test splitting method</title>
        <p>We performed random sampling for the train/test splitting, where we use the stratified holdout method (80% for training and 20% for testing) with six different random seeds, and we report the mean error rates along with the standard deviation. The train and test datasets are z-normalized, since the magnetic field parameter values appear on different scales. The shapes of the train and test datasets are as follows.</p>
        <list list-type="bullet">
          <list-item><p>X_train shape: (1232, 40, 25) and y_train shape: (1232, 20, 25)</p></list-item>
          <list-item><p>X_test shape: (308, 40, 25) and y_test shape: (308, 20, 25)</p></list-item>
        </list>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Baseline Models</title>
        <p>We evaluated our model against LSTM-, RNN-, and GRU-based seq2seq implementations. In the forward pass, we input the first t_ob vectors of each MVTS to the encoder cells (LSTM/RNN/GRU) to produce the encoded hidden state. That encoded hidden state is the input to the decoder cells of the same type. The decoder then predicts the next 25-dimensional timestamp vectors for each timestamp in t_pred and matches the predictions with the ground truth to perform stochastic gradient descent-based backpropagation. In all three models, the number of dimensions in the cell state and hidden state representations is 25, the number of training epochs is 5, and the learning rate of stochastic gradient descent is 0.01.</p>
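        <p>A sketch of such a baseline is shown below, with the cell type swappable between LSTM, GRU, and a simple RNN. The state size of 25, the 5 training epochs, and the 0.01 learning rate follow the settings above, while the Keras-based structure and function name are our illustrative assumptions.</p>
        <preformat>
from tensorflow import keras
from tensorflow.keras import layers

def naive_seq2seq(cell="LSTM", t_ob=40, t_pred=20, n=25):
    """Baseline encoder-decoder without batch normalization (illustrative sketch)."""
    rnn = {"LSTM": layers.LSTM, "GRU": layers.GRU, "RNN": layers.SimpleRNN}[cell]
    model = keras.Sequential([
        keras.Input(shape=(t_ob, n)),
        rnn(25),                               # encoder with 25-dimensional state
        layers.RepeatVector(t_pred),
        rnn(25, return_sequences=True),        # decoder with 25-dimensional state
        layers.TimeDistributed(layers.Dense(n)),
    ])
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01), loss="mse")
    return model

# Example usage: naive_seq2seq("GRU").fit(X_tr, Y_tr, epochs=5, batch_size=10)
        </preformat>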
      </sec>
      <sec id="sec-3-7">
        <title>3.7. Performance of LSTM and Batch Normalization-based seq2seq model</title>
        <p>When we apply the LSTM and batch normalization-based seq2seq model, we perform the following steps. First, we extract (X, Y) pairs from all 1,540 MVTS instances, where the length of each input X is t_ob = 40, the length of each output Y is t_pred = 20, and each timestamp vector is 25-dimensional. In the encoder step, the input is of size (b, 40, 25), where b (= 10) is the batch size of the MVTS instances. For each encoder LSTM cell, the vector of each time step is used as the input to the encoder LSTM cell together with the previous cell state c and hidden state h, and the process repeats until the last cell state c and hidden state h are generated. The decoder LSTM cell uses the last cell state c and hidden state h from the encoder as its initial states. The last hidden state of the encoder is also copied 20 times using the Repeat Vector layer, and each copy is fed into the decoder LSTM cell together with the previous cell state c and hidden state h. The decoder outputs a hidden state for each of the 20 time steps, and these hidden states are connected to a time-distributed dense layer to generate the final forecasting output, which is of size (b, 20, 25). We used Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) to report our model performance, and we report the mean and standard deviation of these measures in Table 1. As Table 1 shows, our approach of deep sequence-to-sequence learning based on batch normalization and a Long Short-Term Memory (LSTM) network significantly outperformed the baseline methods. Batch normalization makes a difference by a large margin, producing errors near 0, whereas the traditional seq2seq models yield large error values due to the absence of batch normalization.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Conclusion</title>
      <p>We propose a batch normalization-based deep seq2seq model for multivariate time series forecasting of the magnetic field parameters of solar events. Unlike previous works on MVTS-based event classification, we perform forecasting of the magnetic field parameter values irrespective of the MVTS labels. We compare our model with other seq2seq implementations based on LSTM, GRU, and RNN. Our proposed approach significantly improves the MAE, MSE, and RMSE results of MVTS forecasting on a benchmark solar magnetic field parameter dataset.</p>
      <p>For future research, we plan to develop machine learning models for MVTS forecasting that leverage the MVTS labels. We aim to use the forecasting models for augmenting (creating synthetic examples of) the MVTS instances of minority classes (rare events). In addition, to utilize the inter-variable dependencies of the MVTS instances for the task of forecasting, we plan to incorporate graph construction (e.g., functional network computation from the correlation matrices of the MVTS instances) and graph neural network (GNN)-based representation learning.</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgments</title>
      <p>This project has been supported in part by funding from
CISE and GEO directorates under NSF awards #2153379
and #2204363.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Bobra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Couvidat</surname>
          </string-name>
          ,
          <article-title>Solar flare prediction using sdo/hmi vector magnetic field data with a machine-learning algorithm</article-title>
          ,
          <source>The Astrophysical Journal</source>
          <volume>798</volume>
          (
          <year>2015</year>
          )
          <fpage>135</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Eastwood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Biffis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hapgood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Bisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Bentley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wicks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-A.</given-names>
            <surname>McKinnell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gibbs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Burnett</surname>
          </string-name>
          ,
          <article-title>The economic impact of space weather: Where do we stand?</article-title>
          ,
          <source>Risk Analysis</source>
          <volume>37</volume>
          (
          <year>2017</year>
          )
          <fpage>206</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Angryk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C.</given-names>
            <surname>Martens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Aydin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kempton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Mahajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Basodi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmadzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Filali</given-names>
            <surname>Boubrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          , et al.,
          <article-title>Multivariate time series dataset for space weather data analytics</article-title>
          ,
          <source>Scientific Data</source>
          <volume>7</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kempton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Boubrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Angryk</surname>
          </string-name>
          ,
          <article-title>A time series classification-based approach for solar flare prediction</article-title>
          ,
          <source>in: 2017 IEEE Intl. Conf. on Big Data (Big Data)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>2543</fpage>
          -
          <lpage>2551</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Muzaheed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Boubrahimi</surname>
          </string-name>
          ,
          <article-title>Sequence model-based end-to-end solar flare classification from multivariate time series data</article-title>
          ,
          <source>in: 20th IEEE Intl. Conf. on Machine Learning and Applications, ICMLA</source>
          <year>2021</year>
          , Pasadena, CA, USA, December
          <volume>13</volume>
          -
          <issue>16</issue>
          ,
          <year>2021</year>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>435</fpage>
          -
          <lpage>440</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Boubrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Angryk</surname>
          </string-name>
          ,
          <article-title>Solar flare prediction using multivariate time series decision trees</article-title>
          ,
          <source>in: 2017 IEEE Intl. Conf. on Big Data, BigData</source>
          <year>2017</year>
          , Boston, MA, USA, December
          <volume>11</volume>
          -
          <issue>14</issue>
          ,
          <year>2017</year>
          , IEEE Computer Society,
          <year>2017</year>
          , pp.
          <fpage>2569</fpage>
          -
          <lpage>2578</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Boubrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Aydin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kempton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Angryk</surname>
          </string-name>
          ,
          <article-title>Spatio-temporal interpolation methods for solar events metadata</article-title>
          ,
          <source>in: 2016 IEEE Intl. Conf. on Big Data (Big Data)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>3149</fpage>
          -
          <lpage>3157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Mason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoeksema</surname>
          </string-name>
          ,
          <article-title>Testing automated solar flare forecasting with 13 years of Michelson Doppler Imager magnetograms</article-title>
          ,
          <source>The Astrophysical Journal</source>
          <volume>723</volume>
          (
          <year>2010</year>
          )
          <fpage>634</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bahdanau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Neural machine translation by jointly learning to align and translate</article-title>
          ,
          <source>arXiv preprint arXiv:1409.0473</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yousefi-Azar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hamey</surname>
          </string-name>
          ,
          <article-title>Text summarization using unsupervised deep learning</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>68</volume>
          (
          <year>2017</year>
          )
          <fpage>93</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          , et al.,
          <article-title>Language models are unsupervised multitask learners</article-title>
          ,
          <source>OpenAI blog 1</source>
          (
          <year>2019</year>
          )
          <article-title>9</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Finch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Utiyama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sumita</surname>
          </string-name>
          ,
          <article-title>Agreement on target-bidirectional lstms for sequence-to-sequence learning</article-title>
          ,
          <source>in: Proc. of the AAAI Conf. on Artificial Intelligence, February 12-17</source>
          ,
          <year>2016</year>
          , Phoenix, Arizona, USA, AAAI Press,
          <year>2016</year>
          , pp.
          <fpage>2630</fpage>
          -
          <lpage>2637</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Scherrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bush</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kosovichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bogart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoeksema</surname>
          </string-name>
          , Y. Liu,
          <string-name>
            <given-names>T.</given-names>
            <surname>Duvall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schrijver</surname>
          </string-name>
          , et al.,
          <article-title>The helioseismic and magnetic imager (hmi) investigation for the solar dynamics observatory (sdo</article-title>
          ),
          <source>Solar Physics</source>
          <volume>275</volume>
          (
          <year>2012</year>
          )
          <fpage>207</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>H.</given-names>
            <surname>Margarit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Subramaniam</surname>
          </string-name>
          ,
          <article-title>A batch-normalized recurrent network for sentiment classification</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          (
          <year>2016</year>
          )
          <fpage>2</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Brownlee</surname>
          </string-name>
          ,
          <article-title>Long short-term memory networks with python: develop sequence prediction models with deep learning</article-title>
          ,
          <source>Machine Learning Mastery</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Santurkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsipras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ilyas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Madry</surname>
          </string-name>
          ,
          <article-title>How does batch normalization help optimization?</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>31</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>