Multivariate Time Series-based Solar Flare Prediction by Functional Network Embedding and Sequence Modeling

Shah Muhammad Hamdi 1,*, Abu Fuad Ahmad 2 and Soukaina Filali Boubrahimi 1

1 Utah State University, Logan, UT, 84322, USA
2 New Mexico State University, Las Cruces, NM, 88003, USA

Abstract
Major flaring events on the Sun can have hazardous impacts on both space and ground-based infrastructure. An effective approach to predicting that a solar active region (AR) is likely to flare after a period of time is to leverage multivariate time series (MVTS) of the AR magnetic field parameters. Existing MVTS-based flare prediction models either train traditional classifiers with preset statistical features of the univariate time series instances, or train deep sequence models based on the Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM) network. While the former approach depends on hand-engineered features, the latter uses only the temporal dimension of the MVTS instances. The variables of an MVTS depend not only on their own historical values but also on the other variables. In this work, we use the dynamic functional network representation of the MVTS instances to leverage higher-order relationships of the variables through Graph Convolutional Network (GCN) embedding. In addition to finding spatial (inter-variable) patterns through functional network embedding, our model captures local and global temporal patterns through LSTM networks. Our experiments on a real-life solar flare dataset exhibit better prediction performance than other baseline methods.

Keywords: Solar flare prediction, Multivariate time series, GCN, LSTM

AMLTS'22: Workshop on Applied Machine Learning Methods for Time Series Forecasting, co-located with the 31st ACM International Conference on Information and Knowledge Management (CIKM), October 17-21, 2022, Atlanta, USA.
* Corresponding author: s.hamdi@usu.edu (S. M. Hamdi); fuad@nmsu.edu (A. F. Ahmad); soukaina.boubrahimi@usu.edu (S. F. Boubrahimi).

1. Introduction

Solar flares are characterized by sudden bursts of magnetic flux in the solar corona and heliosphere. Extreme Ultra-Violet (EUV), X-ray, and gamma-ray emissions caused by major flaring events can have disastrous effects on our technology-dependent society. The risks to life and infrastructure in both space and on the ground include radiation exposure-based health risks for astronauts, disruption of GPS and radio communication, and damage to electronic devices. The economic damage of such extreme solar events can rise to trillions of dollars [1]. In 2015, the White House released the National Space Weather Strategy and Space Weather Action Plan [2] as a roadmap for research aimed at predicting and mitigating the effects of solar eruptive activities.

In recent years, multiple research efforts of the heliophysics community have aimed to predict solar flares from the current and historic magnetic field states of solar active regions. Due to the absence of a direct theoretical relationship between magnetic field influx and flare occurrence in active regions (AR), solar physics researchers rely on data science-based approaches for predicting solar flares. The data is collected by the Helioseismic and Magnetic Imager (HMI) housed in the Solar Dynamics Observatory (SDO). Near-continuous-time images captured by the instruments of HMI contain spatiotemporal magnetic field data of the active regions. The prediction of solar flares, i.e., identifying active regions that will potentially flare after a period of time, requires time series modeling of the magnetic field data. For that, the spatiotemporal magnetic field data of active regions are mapped into multiple MVTS instances [3]. The variables of the MVTS instances represent solar magnetic field parameters (e.g., flux, current, helicity, Lorentz force). The time series corresponding to the magnetic field parameters are extracted based on two time windows: the observation window (the time window of data collection) and the prediction window (the time window after the data collection and before the flare occurrence).
Each MVTS instance is labeled as one of six classes β€” Q, A, B, C, M, and X β€” where Q represents flare-quiet active regions, and the other labels represent flaring events of increasing intensity. Among these classes, X- and M-class flares are considered the most intense flaring events.

In comparison to the earlier single timestamp-based magnetic field vector classification models, recent MVTS-based models are more effective for predicting flaring activities [3]. MVTS classification models targeting flare prediction are divided into two categories: (1) statistical feature-based methods [4], and (2) end-to-end deep learning-based methods [5]. The models of the first category work in two steps. First, low-dimensional representations of the MVTS instances are calculated from concatenation/aggregation of summarization statistics (e.g., mean, standard deviation, skewness, kurtosis) of the univariate time series components. Then, traditional classifiers (e.g., kNN, SVM) are trained with the labeled MVTS representations. This two-step process relies heavily on hand-engineered statistical features and on the choice of downstream classifiers, which complicates the application of these models to datasets with varying properties. In the second category, RNN/LSTM-based deep sequence models are trained by sequentially feeding vectors of the magnetic field parameters into sequence model cells and optimizing the cell weights through gradient descent-based backpropagation. While the deep learning models ensure end-to-end learning, bypassing the dependency on hand-engineered features, they utilize only the time dimension of the MVTS instances, and this limited usage of the underlying patterns results in poor classification performance.

In this work, we propose a deep learning-based MVTS classification approach for solar flare prediction that leverages the fact that MVTS data is rich not only in the temporal dimension, but also in the spatial dimension, which encodes inter-variable relationships [6]. For learning higher-order relationships of the MVTS variables, we use functional networks, where nodes represent variables and edges represent positive correlation of the time series of the corresponding variables. Each MVTS instance is divided into equal-length temporal windows, and an edge-weighted functional network is constructed for each window. We train a Graph Convolutional Network (GCN) to learn a representation of each functional network. In addition, we use two LSTM networks for learning representations of the temporal dimension within and between the windows. Our model significantly outperforms existing MVTS-based flare prediction models on a dataset containing MVTS instances of solar events of different flare classes.

The contributions made by this paper are listed below.

1. Leveraging higher-order inter-variable relationships of the MVTS instances by GCN-based dynamic functional network embedding.
2. Utilizing local and global patterns of the temporal dimension of the MVTS instances through LSTM-based within-window and between-window sequence learning.
3. Experimentally demonstrating the better performance of our model in comparison with state-of-the-art baselines on a benchmark solar flare prediction dataset.

2. Related Work

While the current approaches to flare prediction are mostly based on data science, the earliest flare prediction system was an expert system named THEO that required human inputs [7]. The Space Environment Center (SEC) of the National Oceanic and Atmospheric Administration (NOAA) adopted THEO in 1987. To distinguish flare classes, THEO was provided input data on sunspots and magnetic field properties.

Due to the abundance of magnetic field data collected by NASA's recent missions, research efforts in flare prediction over the last two decades are based on data science rather than purely theoretical modeling. Data science-based approaches stemmed from both linear and nonlinear statistics.
Based on the type of dataset used, these approaches are subdivided into two classes: line-of-sight magnetogram-based models and vector magnetogram-based models. Solar active regions are represented by parameters of either photospheric magnetic field data that contain only the line-of-sight component of the magnetic field, or the full-disk photospheric vector magnetic field. Following NASA's launch of SDO in 2010, the HMI instrument has been mapping the full-disk vector magnetic field every 12 minutes [8]. Most of the recent models use the near-continuous stream of vector magnetogram data from SDO, while the earlier models (dated before 2010) mostly used line-of-sight magnetic data.

The objective of the linear statistical models was to find the active region magnetic field features that are highly correlated with flare occurrences. Cui et al. [9] and Jing et al. [10] used line-of-sight magnetogram data to find correlation-based statistical relationships between magnetic field parameters and flare occurrences. Even before the launch of SDO, Leka and Barnes [11] collected and curated vector magnetogram data from the Mees Solar Observatory on the summit of Mount Haleakala, and used linear discriminant analysis (LDA) for classifying flaring events.

Nonlinear statistical models are mostly machine learning classifiers based on tree induction, kernel methods, neural networks, and so on. On line-of-sight magnetogram-based active region datasets, Song et al. [12] used logistic regression, Yu et al. [13] used the C4.5 decision tree, Ahmed et al. [14] used a fully connected neural network, and Al-Ghraibah et al. [15] used the relevance vector machine as classification models. Bobra et al. [16] used a Support Vector Machine (SVM) on SDO-based vector magnetogram data for classifying flaring and non-flaring active regions.
Nishizuka et al. [17] used both line-of-sight and vector magnetograms and compared the performance of three classifiers: kNN, SVM, and Extremely Randomized Trees (ERT). Other examples of solar flare prediction on non-sequential data include various applications of convolutional neural networks (ConvNets) on SDO AIA/HMI images [18, 19, 20, 21].

Angryk et al. [3] introduced temporal window-based flare prediction, which extends the earlier single timestamp-based models. The authors published an MVTS-based active region dataset, where each MVTS instance records magnetic field data for a preset observation time and uniform sampling rate, and is labeled by the flare class that occurred after a given prediction time. Among the MVTS classification approaches, Hamdi et al. [4] used statistical summarization of the component univariate time series for training a kNN classifier, Ma et al. [22] applied MVTS decision trees that approached the problem using clustering as a preprocessing step, and Muzaheed et al. [5] used LSTM-based deep sequence modeling for end-to-end flare classification that automated the feature learning process, avoiding hand-engineered statistical features.

Unlike the previous models based on traditional ML and deep sequence learning, in this work we present a model that leverages temporal as well as spatial relationships of the MVTS instances. Our model learns MVTS representations in an end-to-end fashion, and utilizes higher-order inter-variable relationships as well as local and global temporal changes.

3. MVTS representation learning by functional network and sequence embedding

3.1. Notations and Preliminaries

3.1.1. MVTS and Sub-MVTS

Each solar active region resulting in different flare classes (or staying as a flare-quiet region) after a given prediction window represents a solar event. The solar event $i$ is represented by an MVTS instance $S^{(i)}$ and associated with a class label $y^{(i)}$. The class label $y^{(i)}$ represents the flare-quiet state or flare classes of different intensities. The MVTS instance $S^{(i)} \in \mathbb{R}^{T \times N}$ is a collection of univariate time series of $N$ magnetic field parameters, where each time series contains periodic observation values of the corresponding parameter over an observation period $T$. We denote the vector of the $t$-th timestamp as $x^{<t>} \in \mathbb{R}^N$, and the time series of the $k$-th parameter as $P_k \in \mathbb{R}^T$. After the observation period $T$ and prediction period $\Delta$, the event is labeled by the active region state (flare quiet or one of the flare classes). The active region state of a particular timestamp is found from the NOAA records of flaring events. Fig. 1 shows the MVTS-based data model of a solar event. Each MVTS instance is divided into $\eta$ equal-length windows such that $T = \eta\tau$, where $\tau$ denotes the window length. The sub-MVTS is denoted by $s \in \mathbb{R}^{\tau \times N}$, and $s$ is a subsequence of $S$.

Figure 1: Multivariate time series instance with predefined observation window ($T$) and prediction window ($\Delta$), and the corresponding flare class label [5].
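To make the data model concrete, the following sketch slices one MVTS instance into $\eta$ equal-length sub-MVTS windows. It is a minimal NumPy illustration using the dataset dimensions reported later in the paper ($T = 60$, $N = 25$) and an illustrative $\eta$; the array contents are dummy values.

```python
import numpy as np

T, N = 60, 25        # observation points and parameters, as in the dataset (Sec. 4.1)
eta = 4              # number of windows (a model hyperparameter)
tau = T // eta       # window length, so that T = eta * tau

S = np.random.randn(T, N)  # one MVTS instance S in R^{T x N} (dummy values)

# eta sub-MVTS s in R^{tau x N}, each a contiguous subsequence of S
windows = [S[j * tau:(j + 1) * tau, :] for j in range(eta)]
assert all(w.shape == (tau, N) for w in windows)
```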
3.1.2. Node-attributed functional network

A functional network is an undirected, edge-weighted graph defined as $G = (V, E, W, X)$, where the set of nodes $V = \{P_1, ..., P_N\}$ denotes the magnetic field parameters, $W: E \rightarrow \mathbb{R}$ is a function mapping edges to their weights, and the node attribute matrix $X \in \mathbb{R}^{N \times \tau}$ contains the time series of each node in the sub-MVTS, i.e., $X = s^T$. The functional network is defined on the sub-MVTS, and the weight $w_{ij}$ of edge $e_{ij}$ (between node pair $P_i$ and $P_j$) represents the statistical similarity of the $\tau$-length time series of $P_i$ and $P_j$. Every functional network derived from an MVTS dataset has the same node set $V$.

3.1.3. Graph Convolution

For learning the representations of node-attributed functional networks, we use the Graph Convolutional Network (GCN). GCN is a widely used graph neural network [23] that learns node representations from a graph through layer-wise neighborhood aggregation. Graph convolution of layer $l$ aggregates the representations of $l$-hop neighbors. GCN updates the representation of node $v$ in a graph $G = (V, E, W, X)$ by the following equations.

$$h_v^{[0]} = x_v \tag{1}$$

$$h_v^{[l+1]} = ReLU\Big(W_g^{[l]} \sum_{u \in N(v)} \frac{w_{uv} h_u^{[l]}}{|N(v)|} + B_g^{[l]} h_v^{[l]}\Big), \quad \forall l \in \{0, 1, ..., L-1\} \tag{2}$$

$$z_v = h_v^{[L]} \tag{3}$$

$$z_G = \frac{1}{|V|} \sum_{v \in V} z_v \tag{4}$$

Here, $L$ is the number of GCN layers, $x_v \in \mathbb{R}^\tau$ is the attribute vector of node $v$, $h_v^{[l]} \in \mathbb{R}^{d_g}$ is the representation of node $v$ in layer $l$, $W_g^{[l]} \in \mathbb{R}^{d_g \times d_g}$ is the weight matrix of layer $l$, $B_g^{[l]} \in \mathbb{R}^{d_g}$ is the bias vector of layer $l$, $N(v)$ is the set of neighbor nodes of node $v$, $w_{uv}$ is the weight of the edge between node $v$ and its neighbor $u$, $z_v$ is the final representation of node $v$ after $L$ iterations of neighborhood aggregation, and $z_G$ is the graph representation found by averaging the node representations.
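The propagation rule of Eqs. 1-4 can be sketched in PyTorch as follows. This is a minimal reading of the equations (dense weighted adjacency, degree-normalized neighbor aggregation plus a separate self-term, ReLU, and mean pooling over nodes), not the released implementation; the class and variable names are ours.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One application of Eq. 2: ReLU(W_g * (weighted neighbor mean) + B_g * h_v)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W_g = nn.Linear(in_dim, out_dim, bias=False)  # neighbor transform W_g
        self.B_g = nn.Linear(in_dim, out_dim, bias=False)  # self transform B_g

    def forward(self, h, A):
        # h: (N, in_dim) node representations; A: (N, N) non-negative weighted adjacency.
        deg = (A > 0).sum(dim=1, keepdim=True).clamp(min=1)  # |N(v)| per node
        agg = (A @ h) / deg                                  # sum_u w_uv h_u / |N(v)|
        return torch.relu(self.W_g(agg) + self.B_g(h))

class GCN(nn.Module):
    """Two-layer GCN (L = 2) with mean pooling, following Eqs. 1-4."""
    def __init__(self, tau, d_g_hidden, d_g):
        super().__init__()
        self.layer1 = GCNLayer(tau, d_g_hidden)  # h^{[0]} = x_v has dimension tau (Eq. 1)
        self.layer2 = GCNLayer(d_g_hidden, d_g)

    def forward(self, X, A):
        # X: (N, tau) node attribute matrix; A: (N, N) adjacency of the functional network.
        h = self.layer2(self.layer1(X, A), A)  # z_v = h^{[L]} (Eq. 3)
        return h.mean(dim=0)                   # z_G: mean over nodes (Eq. 4)
```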
3.1.4. Sequence embedding through LSTM

Long short-term memory (LSTM) networks [24] are frequently used for sequence representation learning, which facilitates various tasks such as sequence classification, sequence-to-sequence translation, and so on. We use LSTM networks for learning low-dimensional representations of the MVTS instances. The MVTS (and sub-MVTS) instances are sequences of $N$-dimensional timestamp vectors. The timestamp vector $x^{<t>} \in \mathbb{R}^N$ represents the magnetic field state of the active region ($N$ parameter values) at timestamp $t$. We denote the inputs to the LSTM cells as $[x^{<1>}, x^{<2>}, ..., x^{<\gamma>}]$, the cell state representations as $[c^{<0>}, c^{<1>}, ..., c^{<\gamma>}]$, and the hidden state representations as $[h^{<0>}, h^{<1>}, ..., h^{<\gamma>}]$, where $\gamma$ is the last timestamp of the sequence. After randomly initializing $c^{<0>}$ and $h^{<0>}$, we update the cell state and hidden state of timestamp $t$ by the following LSTM equations [24].

$$\tilde{c}^{<t>} = \tanh(W_c[h^{<t-1>}, x^{<t>}] + b_c) \tag{5}$$
$$\Gamma_u = \sigma(W_u[h^{<t-1>}, x^{<t>}] + b_u) \tag{6}$$
$$\Gamma_f = \sigma(W_f[h^{<t-1>}, x^{<t>}] + b_f) \tag{7}$$
$$\Gamma_o = \sigma(W_o[h^{<t-1>}, x^{<t>}] + b_o) \tag{8}$$
$$c^{<t>} = \Gamma_u \odot \tilde{c}^{<t>} + \Gamma_f \odot c^{<t-1>} \tag{9}$$
$$h^{<t>} = \Gamma_o \odot \tanh(c^{<t>}) \tag{10}$$

We denote the number of dimensions of the cell state representation $c^{<t>}$ and hidden state representation $h^{<t>}$ of the LSTM cell as $d_s$. The concatenation of the hidden state of the previous timestamp and the input of the current timestamp is $[h^{<t-1>}, x^{<t>}] \in \mathbb{R}^{d_s+N}$. The candidate cell state representation is $\tilde{c}^{<t>} \in \mathbb{R}^{d_s}$. The weight matrices are $W_c, W_u, W_f, W_o \in \mathbb{R}^{d_s \times (d_s+N)}$, and the bias terms are $b_c, b_u, b_f, b_o \in \mathbb{R}^{d_s}$. The subscripts $u$, $f$, and $o$ represent the activations of the update gate, forget gate, and output gate respectively, $\odot$ denotes elementwise multiplication, and $\sigma$ represents the sigmoid activation. Finally, we consider $h^{<\gamma>}$ as the final representation of the input MVTS.

3.2. Data Preprocessing

3.2.1. Node-level normalization

Since the magnetic field parameter values are recorded on different scales, we perform z-score normalization. Suppose that $M$ MVTS instances, each with $N$ parameters and $T$ time points, are represented by a third-order tensor $\mathcal{X} \in \mathbb{R}^{M \times N \times T}$, where the three modes represent events, parameters/nodes, and timestamps. For the better performance of the GCN-based graph embedding, we perform node-level z-normalization as a preprocessing step in the following three steps (see the sketch after this list).

1. We perform mode-2 matricization, i.e., we reshape the tensor so that the mode-2 (parameter/node) fibers become the columns of the matrix. The matrix is denoted by $X_{(2)} \in \mathbb{R}^{MT \times N}$, and its columns by $P_1, P_2, ..., P_N$.

2. For each column $P_j$, we perform z-normalization as follows:
$$x_k^{(j)} = \frac{x_k^{(j)} - \mu^{(j)}}{\sigma^{(j)}}$$
Here, $x_k^{(j)}$ is the $k$-th value of the column $P_j$, where $1 \leq k \leq MT$, $\mu^{(j)}$ is the mean of the column $P_j$, and $\sigma^{(j)}$ is the standard deviation of the column $P_j$.

3. We reshape the matrix $X_{(2)} \in \mathbb{R}^{MT \times N}$ back to the third-order tensor $\mathcal{X} \in \mathbb{R}^{M \times N \times T}$.
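A compact NumPy sketch of the three steps above is given below; the function name and the epsilon guard against constant columns are ours.

```python
import numpy as np

def node_level_znorm(X, eps=1e-8):
    """Z-normalize each parameter over all events and timestamps.

    X: third-order tensor of shape (M, N, T) -- events x parameters x timestamps.
    """
    M, N, T = X.shape
    # Step 1: mode-2 matricization -> (M*T, N); column j holds all values of P_j.
    X2 = X.transpose(0, 2, 1).reshape(M * T, N)
    # Step 2: z-normalize each column (eps guards constant columns).
    X2 = (X2 - X2.mean(axis=0)) / (X2.std(axis=0) + eps)
    # Step 3: reshape back to the third-order tensor (M, N, T).
    return X2.reshape(M, T, N).transpose(0, 2, 1)
```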
3.2.2. Functional network construction

We calculate the Pearson correlation matrix $C \in \mathbb{R}^{N \times N}$ for the sub-MVTS $s \in \mathbb{R}^{\tau \times N}$. In the correlation matrix, $C_{ij}$ represents the Pearson correlation coefficient (in the range $[-1, 1]$) between the $\tau$-length time series $P_i$ and $P_j$. The symmetric matrix $C$ can be considered an adjacency matrix of a graph of $N$ nodes. We apply a sparsity threshold of 0 so that only edges with positive weight (node pairs with positive correlation) are considered for the functional network construction. We denote the resulting sparse correlation matrix as the adjacency matrix $A \in \mathbb{R}^{N \times N}$. Although the functional network defined over a sub-MVTS encodes inter-variable interactions within a small temporal window, the adjacency matrix alone is not enough for the completeness of the data, since the negative correlation coefficients are discarded. To avoid this loss of information, in addition to the adjacency matrix (graph structure), we extract the node attribute matrix $X = s^T$. In $X \in \mathbb{R}^{N \times \tau}$, each row represents the node attributes in the form of a $\tau$-length time series (normalized in the previous step).
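A minimal NumPy sketch of this construction for one window follows; the removal of self-loops is our assumption (the self-contribution is handled separately by the $B_g$ term of Eq. 2).

```python
import numpy as np

def functional_network(s):
    """Functional network of one sub-MVTS s of shape (tau, N).

    Returns the thresholded adjacency A (N x N) and node attribute matrix X (N x tau).
    """
    C = np.corrcoef(s.T)           # Pearson correlations of the N column time series
    A = np.where(C > 0, C, 0.0)    # sparsity threshold 0: keep positive correlations
    np.fill_diagonal(A, 0.0)       # our assumption: no self-loops (Eq. 2 has B_g h_v)
    X = s.T                        # node attributes X = s^T
    return A, X
```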
3.3. MVTS representation learning

In Fig. 2, we show the components of MVTS representation learning. First, the window embedding learns the local spatiotemporal changes of the sub-MVTS instances through the models denoted as $GCN$ and $LSTM_s$; then, the whole MVTS embedding learns the global temporal changes of the local (window) representations through the model denoted as $LSTM_f$.

Figure 2: GCN-based node-attributed functional network embedding and LSTM-based local and global sequence embedding. For showing the functional network construction process, the parameter set $\{P_1, P_2, ..., P_N\}$ of the MVTS instance is shown as $\{A, B, C, D, E, F\}$.

3.3.1. Window embedding

Our model learns the representation of each window $s$ (sub-MVTS) of the MVTS instance $S$ through GCN-based node-attributed functional network embedding and LSTM-based local sequence modeling.

β€’ GCN-based functional network embedding: We input the node-attributed functional network $G(V, E, W, X)$ to a two-layer GCN. The initial node attributes are set as $X = s^T$ (Eq. 1). In the first layer, each node is embedded into a $d_g'$-dimensional space through 1-hop neighborhood aggregation, and after the second layer, each node is embedded into a $d_g$-dimensional space through 2-hop neighborhood aggregation (Eqs. 2, 3). Finally, the whole-graph representation $z_G \in \mathbb{R}^{d_g}$ is computed through mean pooling (Eq. 4).

β€’ LSTM-based sub-MVTS embedding: The sub-MVTS $s = [x^{<1>}, ..., x^{<\tau>}]$, where $x^{<t>} \in \mathbb{R}^N$, is sequentially input to $LSTM_s$ (Eqs. 5-10), and we extract the last hidden representation $z_s = h_s^{<\tau>}$, where $z_s \in \mathbb{R}^{d_s}$.

For the window embedding, we concatenate $z_G \in \mathbb{R}^{d_g}$ and $z_s \in \mathbb{R}^{d_s}$. Therefore, the window representation is $z_w \in \mathbb{R}^{d_g+d_s}$.

3.3.2. Whole MVTS embedding

After each of the $\eta$ windows is represented as a $(d_g+d_s)$-dimensional vector, we feed the sequence $[z_w^{<1>}, ..., z_w^{<\eta>}]$ into $LSTM_f$ for global temporal change modeling. Note that $LSTM_f$ and $LSTM_s$ have different learnable parameter sets (e.g., $W_u^s$, $W_u^f$, etc.), although in this work the number of dimensions $d_s$ of the cell state and hidden state is kept the same for both. We extract the final hidden state representation $z_f = h_f^{<\eta>}$, where $z_f \in \mathbb{R}^{d_s}$. We input $z_f$ into a linear (fully connected) layer with parameters $W_F \in \mathbb{R}^{n_c \times d_s}$ and $b_F \in \mathbb{R}^{n_c}$, where $n_c$ is the number of classes. After this layer, we have an $n_c$-dimensional representation of the whole MVTS instance of event $i$:

$$z^{(i)} = ReLU(W_F z_f + b_F) \tag{11}$$

Finally, we input $z^{(i)} \in \mathbb{R}^{n_c}$ into a softmax layer, whose number of units is equal to the number of classes. The softmax layer gives us the normalized class probabilities $\hat{y}^{(i)} \in \mathbb{R}^{n_c}$:

$$\hat{y}_j^{(i)} = \frac{e^{z_j^{(i)}}}{\sum_{j'=1}^{n_c} e^{z_{j'}^{(i)}}} \tag{12}$$

The predicted labels of the training MVTS instances are matched against the true labels, and the Adam optimizer [25] updates the weight and bias parameters of $GCN$, $LSTM_s$, $LSTM_f$, and the fully connected layer through the backpropagation algorithm. Algorithm 1 shows the training procedure of the proposed GCN-LSTM-based MVTS representation learning.

Algorithm 1: Training of GCN-LSTM-based MVTS representation learning

Input: Training set $\mathcal{D}$ consisting of functional network adjacency matrices $X_{adj} \in \mathbb{R}^{n_{train} \times \eta \times N \times N}$ and node attribute matrices $X_{nat} \in \mathbb{R}^{n_{train} \times \eta \times N \times \tau}$, one-hot training labels $y_{train} \in \mathbb{R}^{n_{train} \times n_c}$, number of epochs $n_{epochs}$, learning rate $\alpha$, and weight decay factor $\lambda$ of the Adam optimizer.
Output: Learned parameters of $GCN$, $LSTM_s$, and $LSTM_f$.

1: Randomly initialize the parameter set $\mathcal{W}$, which contains the $GCN$, $LSTM_s$, and $LSTM_f$ parameters
2: for each of the $n_{epochs}$ training epochs do
3:   for MVTS instance $i = 1, 2, ..., n_{train}$ do
4:     Window matrix $Z_w = [0]^{\eta \times (d_g+d_s)}$
5:     for window $j = 1, 2, ..., \eta$ do
6:       $A \leftarrow X_{adj}[i, j, :, :]$
7:       $X \leftarrow X_{nat}[i, j, :, :]$
8:       $z_G \leftarrow GCN(A, X)$  // Eqs. 1-4 ($L = 2$)
9:       $z_s \leftarrow LSTM_s(X^T)$  // Eqs. 5-10
10:      $Z_w[j, :] \leftarrow Concat(z_G, z_s)$
11:    end for
12:    $z_f \leftarrow LSTM_f(Z_w)$  // Eqs. 5-10
13:    $z_f \leftarrow Linear(z_f)$  // Eq. 11
14:    $z^{(i)} \leftarrow Softmax(z_f)$  // Eq. 12
15:    $\mathcal{L} \leftarrow NLLLoss(z^{(i)}, y_{train}^{(i)})$  // negative log-likelihood loss
16:    Update $\mathcal{W}$ minimizing $\mathcal{L}$ by Adam($\alpha$, $\lambda$)
17:  end for
18: end for
19: return $\mathcal{W}$
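A condensed PyTorch sketch of the forward pass of Algorithm 1 for a single MVTS instance is shown below. It reuses the GCN module sketched in Section 3.1.3 and the hyperparameter values listed in Section 4.2; it is an illustration of the procedure, not the released implementation (available at the GitHub repository cited in Section 4).

```python
import torch
import torch.nn as nn

class GCNLSTM(nn.Module):
    """Window embedding (GCN + LSTM_s) followed by whole-MVTS embedding (LSTM_f)."""
    def __init__(self, n_params=25, tau=15, d_g_hidden=64, d_g=4, d_s=128, n_classes=4):
        super().__init__()
        self.gcn = GCN(tau, d_g_hidden, d_g)  # the two-layer GCN sketched in Sec. 3.1.3
        self.lstm_s = nn.LSTM(n_params, d_s, batch_first=True)   # within-window
        self.lstm_f = nn.LSTM(d_g + d_s, d_s, batch_first=True)  # between-window
        self.linear = nn.Linear(d_s, n_classes)

    def forward(self, A_wins, X_wins):
        # A_wins: (eta, N, N) adjacencies; X_wins: (eta, N, tau) node attributes.
        z_w = []
        for A, X in zip(A_wins, X_wins):
            z_G = self.gcn(X, A)                          # graph embedding, (d_g,)
            _, (h_s, _) = self.lstm_s(X.T.unsqueeze(0))   # sub-MVTS as (1, tau, N)
            z_w.append(torch.cat([z_G, h_s[-1, 0]]))      # window embedding z_w
        _, (h_f, _) = self.lstm_f(torch.stack(z_w).unsqueeze(0))
        return torch.relu(self.linear(h_f[-1, 0]))        # z^{(i)} of Eq. 11, (n_classes,)

# Training as in Algorithm 1: negative log-likelihood on softmax outputs (Eq. 12).
model = GCNLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-3)
loss_fn = nn.NLLLoss()
# for each epoch, for each instance i:
#     logits = model(A_wins, X_wins)
#     loss = loss_fn(torch.log_softmax(logits, dim=0).unsqueeze(0), label)  # label: (1,)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```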
4. Experiments

In this section, we demonstrate our experimental findings. We compared the performance of our model with six other MVTS-based flare prediction baselines on a benchmark dataset. We used PyTorch 1.10.0 with CUDA 11.1 for implementing our GCN-LSTM-based MVTS classifier. The source code of our model and the experimental dataset are available at our GitHub repository: https://github.com/FuadAhmad/GCN-LSTM.

4.1. Dataset

As the benchmark dataset of our experiments, we used the solar flare prediction dataset published by Angryk et al. [3]. Each MVTS instance in the dataset is made up of 25 time series of active region magnetic field parameters (for the full list of parameters, see [16]). The time series are recorded at 12-minute intervals for a total duration of 12 hours (60 time steps). The MVTS instances are labeled according to the flaring event that occurred after 12 hours. Therefore, the dataset has $T = 60$ observation points and $N = 25$ dimensions in the timestamp vectors, while the prediction window is $\Delta = 12$ hours. Our experimental dataset consists of 1,540 MVTS instances evenly distributed across four classes (X, M, BC, and Q), where BC represents events from both the B and C classes (less intense flares). We split the dataset into train and test sets using the stratified holdout method (two-thirds for training and one-third for testing).

4.2. Baseline methods

We evaluated our GCN-LSTM-based MVTS classification model against six baselines.

β€’ Flattened vector method (FLT): This is a naive method, where each 60 Γ— 25 MVTS instance is flattened into a 1,500-dimensional vector.

β€’ Vector of last timestamp (LTV): This method was introduced by Bobra et al. [16], where vector magnetogram data (the feature space of all magnetic field parameters) were used for classification. Since the last timestamp of the MVTS is temporally nearest to the flaring event, we sampled the vector of the last timestamp (25-dimensional) to train the classifier.

β€’ Time series summarization-based MVTS representation (TS-SUM): This method, proposed by Hamdi et al. [4], summarizes each individual time series of length $T$ by eight statistical features: the mean, standard deviation, skewness, and kurtosis of the original time series and of its first-order derivative (see the sketch at the end of this subsection). As a result, we get an 8 Γ— 25-dimensional vector space, which is used for training the downstream classifier.

β€’ Long short-term memory (LSTM): This LSTM-based approach was proposed by Muzaheed et al. [5]. Each MVTS instance is considered a $T$-length sequence of timestamp vectors $x^{<t>} \in \mathbb{R}^N$. After sequentially feeding the LSTM model with each timestamp vector, the last hidden representation is considered the MVTS representation. Following the same experimental setting, we use 128 dimensions for both the cell state and hidden state, 500 training epochs, and a learning rate of 0.01 in stochastic gradient descent.

β€’ Recurrent Neural Network (RNN): As the fifth baseline, we replace the LSTM cells of the model of [5] with standard RNN cells. Similar to the experimental setting of [5], we use 128 RNN hidden dimensions, 1,000 training epochs, and a learning rate of 0.01 in stochastic gradient descent.

β€’ Random Convolutional Kernel Transform (ROCKET): We use ROCKET [26] as the sixth baseline for MVTS-based solar event classification. ROCKET was shown to be the best-performing algorithm in the MVTS classification benchmarking study by Ruiz et al. [27], which included 26 MVTS datasets of the UEA archive [28]. ROCKET uses a large number of random convolution kernels in conjunction with a linear classifier (ridge regression or logistic regression), where each kernel is applied to each univariate time series instance. Similar to the experimental setting of [27], we used 10,000 kernels in ROCKET.

The first three baselines are embedding-followed-by-classification methods. After performing the embedding of the MVTS instances with those methods, we use a logistic regression classifier with L2 regularization. In all the experiments, we split the dataset into train and test sets using the stratified holdout method (two-thirds for training and one-third for testing). In the experiments with the proposed GCN-LSTM model, we use the following hyperparameters: number of windows $\eta$: 4; window length $\tau$: 15; hidden dimensions $d_g'$ in the first GCN layer: 64; node embedding dimensions $d_g$ in the second GCN layer: 4; dimensions $d_s$ of the cell state and hidden state representations of both $LSTM_s$ and $LSTM_f$: 128; training epochs: 100; Adam learning rate $\alpha$: $10^{-4}$; and weight decay (regularization factor) $\lambda$: $10^{-3}$.
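For concreteness, the TS-SUM baseline representation described in the list above can be computed as in the following sketch (assuming SciPy statistics; the function name is ours).

```python
import numpy as np
from scipy.stats import skew, kurtosis

def ts_sum_features(S):
    """TS-SUM-style summary of one MVTS instance S of shape (T, N).

    Eight statistics per univariate series: mean, std, skewness, and kurtosis
    of the series and of its first-order derivative -> vector of length 8*N.
    """
    def stats(x):  # statistics taken column-wise (per parameter)
        return np.concatenate([x.mean(axis=0), x.std(axis=0), skew(x), kurtosis(x)])
    return np.concatenate([stats(S), stats(np.diff(S, axis=0))])
```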
4.3. Multiclass classification performance

In Table 1, we show the classification performance of our GCN-LSTM-based MVTS classifier along with that of the baseline methods. For a comprehensive classification report, we show accuracy along with the precision, recall, and F1 of each class. We performed five experiments with different train/test sets sampled by stratified holdout (two-thirds for training and one-third for testing) and report the mean and standard deviation over the experiments.

Table 1: Multiclass classification performance of the proposed method and the baselines

Measure        | FLT           | LTV           | TS-SUM        | RNN           | LSTM          | ROCKET        | GCN-LSTM
Accuracy       | 0.259 Β± 0.012 | 0.323 Β± 0.02  | 0.609 Β± 0.091 | 0.427 Β± 0.025 | 0.628 Β± 0.03  | 0.742 Β± 0.021 | 0.817 Β± 0.014
Precision (X)  | 0.232 Β± 0.024 | 0.342 Β± 0.041 | 0.712 Β± 0.054 | 0.534 Β± 0.031 | 0.757 Β± 0.028 | 0.92 Β± 0.034  | 0.932 Β± 0.022
Recall (X)     | 0.264 Β± 0.053 | 0.392 Β± 0.043 | 0.772 Β± 0.024 | 0.631 Β± 0.028 | 0.947 Β± 0.023 | 0.981 Β± 0.016 | 0.99 Β± 0.023
F1 (X)         | 0.244 Β± 0.032 | 0.362 Β± 0.04  | 0.741 Β± 0.034 | 0.582 Β± 0.019 | 0.841 Β± 0.014 | 0.952 Β± 0.028 | 0.961 Β± 0.013
Precision (M)  | 0.254 Β± 0.012 | 0.324 Β± 0.033 | 0.522 Β± 0.031 | 0.411 Β± 0.014 | 0.594 Β± 0.018 | 0.661 Β± 0.042 | 0.803 Β± 0.054
Recall (M)     | 0.26 Β± 0.023  | 0.331 Β± 0.061 | 0.552 Β± 0.022 | 0.402 Β± 0.03  | 0.544 Β± 0.014 | 0.704 Β± 0.038 | 0.824 Β± 0.063
F1 (M)         | 0.257 Β± 0.026 | 0.327 Β± 0.042 | 0.537 Β± 0.023 | 0.406 Β± 0.029 | 0.568 Β± 0.02  | 0.687 Β± 0.028 | 0.811 Β± 0.033
Precision (BC) | 0.232 Β± 0.044 | 0.263 Β± 0.024 | 0.453 Β± 0.033 | 0.282 Β± 0.031 | 0.495 Β± 0.013 | 0.58 Β± 0.026  | 0.682 Β± 0.03
Recall (BC)    | 0.241 Β± 0.053 | 0.212 Β± 0.02  | 0.472 Β± 0.014 | 0.261 Β± 0.021 | 0.409 Β± 0.023 | 0.573 Β± 0.052 | 0.664 Β± 0.05
F1 (BC)        | 0.236 Β± 0.041 | 0.234 Β± 0.024 | 0.462 Β± 0.041 | 0.271 Β± 0.031 | 0.448 Β± 0.031 | 0.577 Β± 0.031 | 0.673 Β± 0.032
Precision (Q)  | 0.324 Β± 0.034 | 0.343 Β± 0.044 | 0.583 Β± 0.045 | 0.483 Β± 0.024 | 0.603 Β± 0.024 | 0.81 Β± 0.046  | 0.831 Β± 0.018
Recall (Q)     | 0.251 Β± 0.042 | 0.362 Β± 0.071 | 0.663 Β± 0.034 | 0.413 Β± 0.042 | 0.683 Β± 0.023 | 0.724 Β± 0.034 | 0.772 Β± 0.021
F1 (Q)         | 0.282 Β± 0.014 | 0.352 Β± 0.013 | 0.62 Β± 0.043  | 0.445 Β± 0.032 | 0.64 Β± 0.024  | 0.771 Β± 0.036 | 0.798 Β± 0.017

From the results, it is visible that the GCN-LSTM-based MVTS classification model outperforms all other baselines on all performance measures. In the overall evaluation, ROCKET achieves the second-best performance, while the LSTM model comes third. The GCN-LSTM model achieves around 20% higher accuracy than the LSTM model, which demonstrates the importance of learning MVTS representations in both the spatial and temporal domains rather than in the temporal domain alone. Among the shallow ML models, TS-SUM performs better than the FLT and LTV models. In general, the high performance of TS-SUM, RNN, LSTM, ROCKET, and GCN-LSTM demonstrates the importance of time series representations of solar events.
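The per-class measures of Table 1 follow the standard definitions; they can be computed with scikit-learn, e.g. (a sketch; the helper name is ours, and the label names follow the dataset classes):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def per_class_report(y_true, y_pred, classes=("X", "M", "BC", "Q")):
    """Accuracy plus per-class precision/recall/F1, as reported in Table 1."""
    acc = accuracy_score(y_true, y_pred)
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, labels=list(classes))
    return acc, {c: (p[i], r[i], f1[i]) for i, c in enumerate(classes)}
```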
4.4. Classification with varying training set size

To verify the adaptability of our model to bigger training datasets, we experimented with varying training set sizes. We varied the training set size from 10% to 90% of the dataset size, while testing the models on the remaining instances (Fig. 3). We performed stratified train/test sampling for each training set size, and evaluated the classification performance of the classifiers five times with five distinct samples of training and test sets. In Figs. 3a and 3b, we plot the mean accuracy values and mean F1 (X class) values over all runs of the different train/test samples for the different training set sizes. GCN-LSTM consistently outperforms the other baselines in all settings of training set size. ROCKET is the second-best-performing classifier in this experiment, and especially in the F1 measure ROCKET exhibits robustness similar to GCN-LSTM. With only 10% training data, GCN-LSTM achieves 70% classification accuracy, while the third-best-performing LSTM model reaches that level of performance only by using 90% training data. Although all models gain more accuracy with a gradual increase of the training set size, we observe more consistently increasing patterns in the deep learning and kernel-based methods, e.g., GCN-LSTM, ROCKET, LSTM, and RNN. This suggests that with sufficiently large datasets, deep learning models can outperform the traditional classifiers and embedding methods by a larger margin. The time series summarization-based method TS-SUM shows promising performance throughout the experiments, but the generalization capability of this model can be limited on more complex datasets due to its less flexible learning methodology based on hand-engineered features. Compared to the deep learning-based and time series-based methods, the LTV and FLT models perform poorly, which demonstrates the importance of time series in avoiding underfitting.

Figure 3: Multiclass classification with varying train set size: (a) multiclass classification accuracy with increasing training set size; (b) F1 of the X class with increasing training set size.

4.5. Binary classification performance

In addition to classifying the solar active regions into different flare classes, a major use case in data-driven flare prediction is binary classification, i.e., distinguishing major flaring events from minor flaring events or flare-quiet events. In this experiment, we considered X- and M-class MVTS instances as flaring events, while we considered all other instances (Q and BC) as non-flaring events. In Fig. 4, we show the mean binary classification performance of all models over five different train/test samples in terms of accuracy, precision, recall, and F1 of the flaring (XM) and non-flaring (QBC) classes. It is clearly visible that the GCN-LSTM model outperforms all other baselines. We report the performance of the two best-performing models numerically along with their bars. On all performance metrics, GCN-LSTM achieves on average ∼8% better performance than the second-best-performing ROCKET algorithm. In general, we observe performance similar to that of the multiclass classification. Although one deep learning model, the RNN-based model, performed worse than the TS-SUM method, the RNN-based model is an end-to-end classification model, which might outperform TS-SUM with more training data, a more complex model, and more efficient hyperparameter tuning.

Figure 4: Binary classification performance of all baselines (mean accuracy, and precision, recall, and F1 of the XM and QBC classes).

4.6. Embedding performance

Visualization of high-dimensional data in 2D/3D space is a well-known method of demonstrating the effectiveness of learned representations. To investigate the quality of the learned MVTS representations, we provide a visualization of t-SNE [29]-transformed MVTS representations extracted by the final layer of the GCN-LSTM model. Similar to Section 4.3, the stratified holdout strategy is used to pre-train the model, and all instances are projected into the t-SNE-reduced 2D space (Fig. 5). The 2D projection exhibits discernible clustering of the MVTS instances. Some meaningful insights are observed in the t-SNE scatter plot: (1) the patterns of the four classes are easily recognizable, (2) flare-quiet events (Q) and minor flaring events (B and C) are comparatively similar, (3) X- and M-class flares exhibit significant dissimilarity from the other classes, (4) some flare-quiet events are similar to the minor flaring events, (5) a few minor flares show characteristics similar to M-class flares, and (6) the characteristics of the X-class flares are exclusive, and instances of the other classes do not show similarity with the X-class instances.

Figure 5: t-SNE embedding of the GCN-LSTM-generated representations of all MVTS instances in the dataset.
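The projection of Fig. 5 can be reproduced from the learned representations with scikit-learn's t-SNE, e.g. (a sketch; Z and y denote the representation matrix and label array, which the reader must supply):

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Z: (n_instances, d) final-layer MVTS representations; y: array of class labels.
def plot_tsne(Z, y, classes=("X", "M", "BC", "Q")):
    Z2 = TSNE(n_components=2).fit_transform(Z)
    for c in classes:
        mask = (y == c)
        plt.scatter(Z2[mask, 0], Z2[mask, 1], s=10, label=c)
    plt.xlabel("t-SNE dimension 1"); plt.ylabel("t-SNE dimension 2")
    plt.legend(title="Class"); plt.show()
```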
5. Conclusion

In this work, we presented an end-to-end deep learning-based flare prediction model for multivariate time series (MVTS)-represented datasets that leverages inter-variable relationships through graph convolutional network-based functional network embedding, and local and global temporal change modeling through LSTM-based sequence embedding. In contrast to other MVTS classification models applied to flare prediction, our model utilizes both spatial and temporal features of the MVTS instances, and does not depend on predefined statistical features. Our experiments on a real-life solar flare prediction dataset demonstrate the superior performance of our model in both multiclass and binary MVTS classification.

In the future, we look forward to designing more efficient models using techniques such as (1) learning attention coefficients in the spatial and temporal feature spaces, (2) customizing transformer models for MVTS representations, and (3) analyzing the effects of univariate sequence embedding on MVTS representation learning. We will also apply our models to other MVTS-based solar event datasets (e.g., solar energetic particles) [30], and to MVTS datasets generated from other sources such as functional MRI (fMRI)-based time series of brain regions [31].

6. Acknowledgments

This project has been supported in part by funding from the CISE and GEO directorates under NSF awards #2153379 and #2204363.

References

[1] J. Eastwood, E. Biffis, M. Hapgood, L. Green, M. Bisi, R. Bentley, R. Wicks, L.-A. McKinnell, M. Gibbs, C. Burnett, The economic impact of space weather: Where do we stand?, Risk Analysis 37 (2017) 206–218.
[2] National Science and Technology Council, National space weather action plan, https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/final_nationalspaceweatheractionplan_20151028.pdf, 2015. [Accessed: 10-Feb-2022].
[3] R. A. Angryk, P. C. Martens, B. Aydin, D. Kempton, S. S. Mahajan, S. Basodi, A. Ahmadzadeh, X. Cai, S. F. Boubrahimi, S. M. Hamdi, et al., Multivariate time series dataset for space weather data analytics, Scientific Data 7 (2020) 1–13.
[4] S. M. Hamdi, D. Kempton, R. Ma, S. F. Boubrahimi, R. A. Angryk, A time series classification-based approach for solar flare prediction, in: 2017 IEEE Intl. Conf. on Big Data (Big Data), IEEE, 2017, pp. 2543–2551.
[5] A. A. M. Muzaheed, S. M. Hamdi, S. F. Boubrahimi, Sequence model-based end-to-end solar flare classification from multivariate time series data, in: 20th IEEE Intl. Conf. on Machine Learning and Applications, ICMLA 2021, Pasadena, CA, USA, December 13-16, 2021, IEEE, 2021, pp. 435–440.
[6] Z. Wu, S. Pan, G. Long, J. Jiang, X. Chang, C. Zhang, Connecting the dots: Multivariate time series forecasting with graph neural networks, in: KDD '20: The 26th ACM SIGKDD Conf. on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23-27, 2020, ACM, 2020, pp. 753–763.
[7] P. S. McIntosh, The classification of sunspot groups, Solar Physics 125 (1990) 251–267.
[8] J. P. Mason, J. Hoeksema, Testing automated solar flare forecasting with 13 years of Michelson Doppler Imager magnetograms, The Astrophysical Journal 723 (2010) 634.
[9] Y. Cui, R. Li, L. Zhang, Y. He, H. Wang, Correlation between solar flare productivity and photospheric magnetic field properties, Solar Physics 237 (2006) 45–59.
[10] J. Jing, H. Song, V. Abramenko, C. Tan, H. Wang, The statistical relationship between the photospheric magnetic parameters and the flare productivity of active regions, The Astrophysical Journal 644 (2006) 1273.
[11] K. Leka, G. Barnes, Photospheric magnetic field properties of flaring versus flare-quiet active regions. II. Discriminant analysis, The Astrophysical Journal 595 (2003) 1296.
[12] H. Song, C. Tan, J. Jing, H. Wang, V. Yurchyshyn, V. Abramenko, Statistical assessment of photospheric magnetic features in imminent solar flare predictions, Solar Physics 254 (2009) 101–125.
[13] D. Yu, X. Huang, H. Wang, Y. Cui, Short-term solar flare prediction using a sequential supervised learning method, Solar Physics 255 (2009) 91–105.
[14] O. W. Ahmed, R. Qahwaji, T. Colak, P. A. Higgins, P. T. Gallagher, D. S. Bloomfield, Solar flare prediction using advanced feature extraction, machine learning, and feature selection, Solar Physics (2013) 1–19.
[15] A. Al-Ghraibah, L. Boucheron, R. McAteer, An automated classification approach to ranking photospheric proxies of magnetic energy build-up, Astronomy & Astrophysics 579 (2015) A64.
[16] M. G. Bobra, S. Couvidat, Solar flare prediction using SDO/HMI vector magnetic field data with a machine-learning algorithm, The Astrophysical Journal 798 (2015) 135.
[17] N. Nishizuka, K. Sugiura, Y. Kubo, M. Den, S. Watari, M. Ishii, Solar flare prediction model with three machine-learning algorithms using ultraviolet brightening and vector magnetograms, The Astrophysical Journal 835 (2017) 156.
[18] Y. Zheng, X. Li, X. Wang, Solar flare prediction with the hybrid deep convolutional neural network, The Astrophysical Journal 885 (2019) 73.
[19] X. Li, Y. Zheng, X. Wang, L. Wang, Predicting solar flares using a novel deep convolutional neural network, The Astrophysical Journal 891 (2020) 10.
[20] E. Park, Y.-J. Moon, S. Shin, K. Yi, D. Lim, H. Lee, G. Shin, Application of the deep convolutional neural network to the forecast of solar flare occurrence using full-disk solar magnetograms, The Astrophysical Journal 869 (2018) 91.
[21] N. Nishizuka, Y. Kubo, K. Sugiura, M. Den, M. Ishii, Operational solar flare prediction model using Deep Flare Net, Earth, Planets and Space 73 (2021) 1–12.
[22] R. Ma, S. F. Boubrahimi, S. M. Hamdi, R. A. Angryk, Solar flare prediction using multivariate time series decision trees, in: 2017 IEEE Intl. Conf. on Big Data, Big Data 2017, Boston, MA, USA, December 11-14, 2017, IEEE Computer Society, 2017, pp. 2569–2578.
[23] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: 5th Intl. Conf. on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, OpenReview.net, 2017.
[24] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780.
[25] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: 3rd Intl. Conf. on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
[26] A. Dempster, F. Petitjean, G. I. Webb, ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels, Data Mining and Knowledge Discovery 34 (2020) 1454–1495.
[27] A. P. Ruiz, M. Flynn, J. Large, M. Middlehurst, A. Bagnall, The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances, Data Mining and Knowledge Discovery 35 (2021) 401–449.
[28] A. J. Bagnall, H. A. Dau, J. Lines, M. Flynn, J. Large, A. Bostrom, P. Southam, E. J. Keogh, The UEA multivariate time series classification archive, 2018, CoRR abs/1811.00075 (2018). URL: http://arxiv.org/abs/1811.00075.
[29] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research 9 (2008).
[30] S. F. Boubrahimi, S. M. Hamdi, R. Ma, R. A. Angryk, On the mining of the minimal set of time series data shapelets, in: IEEE Intl. Conf. on Big Data, Big Data 2020, Atlanta, GA, USA, December 10-13, 2020, IEEE, 2020, pp. 493–502.
[31] S. M. Hamdi, B. Aydin, S. F. Boubrahimi, R. A. Angryk, L. C. Krishnamurthy, R. D. Morris, Biomarker detection from fMRI-based complete functional connectivity networks, in: IEEE Intl. Conf. on Artificial Intelligence and Knowledge Engineering, AIKE 2018, Laguna Hills, CA, USA, September 26-28, 2018, IEEE, 2018, pp. 17–24.