<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Deep Spatio-Temporal Encoding: Achieving Higher Accuracy by Aligning with External Real-World Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Chen</forename><surname>Jiang</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Auburn University</orgName>
								<address>
									<settlement>Auburn</settlement>
									<region>AL</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Wenlu</forename><surname>Wang</surname></persName>
							<email>wenlu.wang@tamucc.edu</email>
							<affiliation key="aff1">
								<orgName type="institution">Texas A&amp;M University-Corpus Christi</orgName>
								<address>
									<addrLine>Corpus Christi</addrLine>
									<region>TX</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jingjing</forename><surname>Li</surname></persName>
							<email>jingjingli@meta.com</email>
							<affiliation key="aff2">
								<address>
									<addrLine>Meta</addrLine>
									<settlement>Menlo Park</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Naiqing</forename><surname>Pan</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Auburn University</orgName>
								<address>
									<settlement>Auburn</settlement>
									<region>AL</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Wei-Shinn</forename><surname>Ku</surname></persName>
							<email>weishinn@auburn.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Auburn University</orgName>
								<address>
									<settlement>Auburn</settlement>
									<region>AL</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Deep Spatio-Temporal Encoding: Achieving Higher Accuracy by Aligning with External Real-World Data</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">1EFE386AD5DB4EF4E57B8CB63CCF3079</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:17+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Spatial-temporal Analysis</term>
					<term>Deep Learning</term>
					<term>Transformer</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Spatio-temporal deep learning has drawn considerable attention because many downstream real-world applications benefit from accurate predictions. For example, accurate prediction of heavy rainfall events is essential for effective urban water usage, flood warning, and mitigation. In this paper, we propose a strategy that leverages spatially connected real-world features to enhance prediction accuracy. Specifically, in our case study we leverage spatially connected real-world climate data to predict heavy rainfall risks over a broad area. We experimentally ascertain that our Trans-Graph Convolutional Network (TGCN) accurately predicts heavy rainfall risks and real estate trends, demonstrating the advantage of incorporating external spatially connected real-world data to improve model performance. These results indicate that the proposed approach has significant potential to improve spatio-temporal prediction accuracy, aiding efficient urban water usage, flood risk warning, and fair housing in real estate.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Spatio-temporal predictions have been extensively studied due to their impact on real-world applications <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. For example, heavy rainfall events can cause significant damage to infrastructure and pose serious threats to human safety. Predicting these events with greater accuracy allows better preparation and response <ref type="bibr" target="#b5">[6]</ref>, ultimately saving lives and reducing the economic impact of such events.</p><p>Deep learning methods, such as deep spatio-temporal prediction models <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>, have improved the performance of rainfall forecasting over the years. However, the role of external data in enhancing prediction accuracy remains debated: some argue that external data provide additional useful information for the prediction model, while others hold that external data introduce more noise and complexity into the learning process. In this study, we propose to improve spatio-temporal predictions by combining spatially linked external real-world data with a Trans-Graph Convolutional Network (TGCN) that learns the spatio-temporal dependencies from the combined data. Since prior work indicates that utilizing multi-source real-world data is likely to lead to higher accuracy <ref type="bibr" target="#b8">[9]</ref>, our study introduces a fresh perspective on integrating external real-world data into the proposed framework. We use heavy rainfall prediction as a case study for our proposed method. Overall, we aim to provide accurate spatio-temporal predictions by leveraging as much information as possible, enabling better decision-making for a broad range of spatio-temporal applications. At the same time, we offer a novel angle and a comprehensive evaluation that demonstrate the feasibility of integrating additional external real-world data without customizing transformer attention mechanisms. Our approach is experimentally validated by predicting heavy rainfall events and real estate hotspots.</p><p>The traditional method for predicting heavy rainfall involves manually engineering features from weather data, including temperature, pressure, and humidity. Meteorologists rely on their expertise to interpret these data and forecast future weather patterns. This process entails observing and analyzing atmospheric factors to predict weather patterns. However, this traditional approach is time-consuming, labor-intensive, and susceptible to human error, especially when dealing with large datasets. As data volumes grow, it becomes increasingly challenging to analyze large amounts of information by hand.</p><p>Previous research has investigated using deep learning for precipitation prediction <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref> with promising results. However, some limitations can be addressed to further enhance deep model performance; one area with room for improvement is leveraging spatial dependencies. To tackle this challenge, we propose a model that integrates both Graph Convolutional Networks (GCNs) and a Transformer, enabling the combination of external spatially linked data for spatio-temporal predictions.</p><p>Specifically, we employ a GCN to analyze the adjacency matrix at the grid level and generate correlations between grid elements. The GCN captures the spatial relationships and dependencies among neighboring grid points, allowing for a comprehensive understanding of the data's spatial dynamics. We then utilize a Transformer model to encode the temporal precipitation data and combine it with the spatial correlations obtained from the GCN. By combining the GCNs and the Transformer within the proposed TGCN model, we create a framework that harnesses both the spatial and temporal dimensions of the data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Graph Neural Networks</head><p>Graph Convolutional Networks (GCNs) are a type of deep learning model designed to process data represented in a graph structure, such as social or sensor networks <ref type="bibr" target="#b11">[12]</ref>. GCNs have demonstrated their effectiveness in various applications, including node classification, link prediction, and recommendation systems <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>. The concept of Graph Neural Networks (GNNs) was initially introduced in <ref type="bibr" target="#b16">[17]</ref> and further expanded upon in subsequent research by <ref type="bibr" target="#b17">[18]</ref>. Early GNNs, a type of recurrent neural network (RNN), iteratively propagate information from neighboring nodes until reaching a stable fixed point. This iterative process has traditionally been computationally expensive, but recent studies, such as <ref type="bibr" target="#b18">[19]</ref>, have made significant improvements in this area. Inspired by the success of Convolutional Neural Networks (CNNs) in computer vision, which extract high-level features from images using convolution and pooling layers, current models aim to adapt these layers to directly process graph inputs. GCNs can be categorized into two types of graph convolution layers: spectral graph convolution and localized graph convolution, as discussed in <ref type="bibr" target="#b19">[20]</ref>. Early research primarily focused on spectral graph convolutions, pioneered by <ref type="bibr" target="#b20">[21]</ref>. The current state-of-the-art model, GCN, further simplified the graph convolution operation by employing a localized first-order approximation. However, spectral methods require operations on the entire graph Laplacian during training, which can be computationally expensive. Several subsequent works, such as FastGCN <ref type="bibr" target="#b21">[22]</ref>, have aimed to alleviate this issue. Recently, researchers have explored the application of GCNs in time series prediction. For example, spatio-temporal GCN-based approaches have been proposed for traffic flow prediction <ref type="bibr" target="#b22">[23]</ref>, and time-aware topological information has been integrated into GCNs using the mathematical framework of zigzag persistence <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Spatial Temporal Prediction</head><p>In this section, we discuss existing temporal and spatial-temporal forecasting methods. Recurrent Neural Networks (RNNs), especially long short-term memory (LSTM) networks <ref type="bibr" target="#b24">[25]</ref>, have gained popularity in time series forecasting <ref type="bibr" target="#b25">[26]</ref>. Convolutional Neural Networks (CNNs) and their variant, Temporal Convolutional Networks (TCNs), are another option for sequence prediction <ref type="bibr" target="#b26">[27]</ref>, offering parallel computation compared to RNNs <ref type="bibr" target="#b27">[28]</ref>. In recent years, researchers have explored Transformers and their variants in time series forecasting, achieving state-of-the-art performance in tasks such as energy consumption and stock market forecasting <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b30">31]</ref>. Designing a model capable of comprehensively capturing both spatial and temporal patterns represents another emerging trend in spatial-temporal prediction tasks <ref type="bibr" target="#b31">[32,</ref><ref type="bibr" target="#b32">33]</ref>. For example, <ref type="bibr" target="#b32">[33]</ref> introduced a spatial-temporal graph neural network for predicting traffic flow.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>In this section, we detail our model architecture and the benefits of our design.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Overview</head><p>The architecture we propose, illustrated in Figure <ref type="figure" target="#fig_0">2</ref>, incorporates a combination of techniques to enhance the prediction model. We begin by utilizing a transformer encoder to effectively encode the time series precipitation data, and then integrate local climate features into the model, enabling a comprehensive understanding of the factors influencing heavy rainfall.</p><p>To address spatial dependencies and relationships among grid points, a GCN is introduced. This GCN learns the spatial dependencies within the dataset, considering the interconnectedness of grids based on their spatial locations. By leveraging the GCN, the model becomes capable of capturing and integrating spatial information, thereby enhancing prediction accuracy.</p><p>The latent code, which combines the encoded time series precipitation data and the spatially connected local climate features learned through the GCN, is fed into a multi-layer perceptron (MLP) for prediction. This integrated architecture allows the MLP model to leverage the fused information, including temporal precipitation data, other climate features, and spatial factors, to effectively learn and infer future heavy rainfall areas.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Model Architecture</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Preliminaries</head><p>Our proposed TGCN model consists of a Transformer encoder, GCNs, and multi-layer perceptron (MLP) layers. The major component of the Transformer is multi-head self-attention.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><formula xml:id="formula_0">Attention(Q, K, V) = softmax(QK^T / √d_k) V<label>(1)</label></formula><p>where Q is the query matrix that is mapped against a set of keys, K and V are the matrices that store the keys and values, and d_k is the key dimension.</p></div>
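The scaled dot-product attention of Equation 1 can be sketched in a few lines of NumPy. This is a minimal illustration of the operation, not the paper's implementation; the matrix shapes are chosen arbitrarily for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (n_queries, n_keys)
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))    # 4 queries, d_k = 8
K = rng.standard_normal((6, 8))    # 6 keys
V = rng.standard_normal((6, 16))   # 6 values, d_v = 16
out = attention(Q, K, V)
print(out.shape)  # (4, 16)
```

Each output row is a convex combination of the value rows, weighted by how strongly the corresponding query matches each key.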
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Transformer-based Encoder</head><p>We have developed a predictive model using the Transformer architecture, tailored for heavy rainfall forecasting. Unlike traditional methods that only use past rainfall data, our model factors in numerous external variables to boost accuracy. We examine local features, including geography, atmospheric conditions (pressure, temperature, wind), humidity, and topography, all of which influence the likelihood of heavy rainfall in a specific area. We therefore developed a transformer-based prediction model <ref type="bibr" target="#b33">[34]</ref> that incorporates GCNs to process the spatial features. By doing so, our model can capture the spatial relationships among various features in a graph structure, such as the dependencies between grid point locations and their corresponding climate data. The integration of the GCNs enhances our model's ability to capture both temporal and spatial information. Our model design starts with a transformer encoder capturing temporal precipitation patterns, followed by embedding this data and merging it with local climate data such as moisture and humidity; this added context enhances prediction accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Graph Convolutional Networks</head><p>As illustrated in Figure <ref type="figure" target="#fig_1">3</ref>, a GCN learns a linear transformation of the feature vectors of each node in a graph, which is used to update the node features by aggregating information from the node's neighbors. Mathematically, this can be expressed as:</p><formula xml:id="formula_1">h^(l+1)_{v_i} = σ( Σ_{v_j ∈ 𝒩(v_i)} (1/c_ij) W^(l+1) h^(l)_{v_j} )<label>(2)</label></formula><p>In this equation, h^(l+1)_{v_i} represents the feature vector of node v_i at layer l+1, W^(l+1) denotes the learnable weight matrix for layer l+1, 𝒩(v_i) represents the set of neighbors of node v_i, and c_ij is a normalization constant that ensures proper scaling of the aggregated information. The function σ denotes a non-linear activation function, which introduces non-linearity into the model; in our specific case, we utilize the ReLU activation function. The equation computes a weighted sum of the layer-l feature vectors of the neighbors of node v_i, where the weights are determined by the learned weight matrix W^(l+1); a non-linear activation function is then applied to obtain the updated feature vector h^(l+1)_{v_i}. This process is repeated across multiple layers to learn expressive representations of the graph data.</p><p>For the final prediction, we utilize a four-layer MLP model that combines the time series data with the other features. By leveraging the transformer architecture, incorporating GCNs, and utilizing this MLP, our approach effectively integrates temporal and spatial information for improved prediction accuracy.</p></div>
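A single propagation step of Equation 2 can be sketched as follows. The self-loops and the symmetric normalization c_ij = √(d_i d_j) used below are common GCN conventions and are an assumption here, since the text does not pin down its choice of c_ij.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step (cf. Equation 2):
    h_i <- ReLU( sum_{j in N(i)} (1/c_ij) W h_j ),
    with self-loops and symmetric normalization c_ij = sqrt(d_i * d_j)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # normalized adjacency
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate, transform, ReLU

# toy 4-node path graph with 12-dim node features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.default_rng(1).standard_normal((4, 12))  # input features
W = np.random.default_rng(2).standard_normal((12, 8))  # learnable weights
H1 = gcn_layer(A, H, W)
print(H1.shape)  # (4, 8)
```

Stacking several such layers lets each node's representation absorb information from progressively larger graph neighborhoods.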
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4.">Jointly Learning</head><p>As illustrated in Figure <ref type="figure" target="#fig_0">2</ref>, we propose to map temporal data and non-temporal data into the same latent space and merge the latent vectors for the subsequent prediction task.</p><p>To encode the local climate features x_c and capture the spatial dependencies among the grid points, we employ a GCN, applying Equation <ref type="formula" target="#formula_1">2</ref>, to learn the relationships and dependencies within the spatial domain; each layer's hidden features are obtained by applying the ReLU activation to the aggregated, weighted neighbor features. Assuming L_c layers in total, we use the final layer to summarize the climate information:</p><formula xml:id="formula_4">h_c = h^(L_c)<label>(3)</label></formula><p>We encode the temporal precipitation data x_t using a transformer encoder <ref type="bibr" target="#b33">[34]</ref>:</p><formula xml:id="formula_5">h_t = TransformerEncoder(x_t)<label>(4)</label></formula><formula>h_t ∈ ℝ^(d_t)<label>(5)</label></formula><p>Since x_t and x_c are encoded as h_t and h_c, we define the merged hidden state as</p><formula xml:id="formula_6">h_m = CONCAT(h_t, h_c)<label>(6)</label></formula><p>To further process the merged information, we use another multi-layer perceptron specifically trained for the prediction task.
Similarly, we define the l-th layer of this network as (assuming L_n layers in total)</p><formula xml:id="formula_7">h^(l)_n = ReLU(W^(l)_n h^(l-1)_n + b^(l)_n)<label>(7)</label></formula><p>where h^(l-1)_n is the input to the l-th layer, and W^(l)_n and b^(l)_n are model parameters. We use the output from the last layer for prediction:</p><formula xml:id="formula_8">ȳ = sigmoid(h^(L_n)_n)<label>(8)</label></formula><p>The loss is measured with the binary cross-entropy (BCE) loss:</p><formula xml:id="formula_9">loss = BCEloss(ȳ, y)<label>(9)</label></formula><p>The binary cross-entropy loss can be formulated as follows:</p><formula xml:id="formula_10">BCEloss = −(1/N) Σ_{i=1}^{N} [ y_i log(p_i) + (1 − y_i) log(1 − p_i) ]<label>(10)</label></formula><p>where N is the total number of samples, y_i is the true label for sample i, p_i is the predicted probability for sample i, and log denotes the natural logarithm.</p></div>
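The joint pipeline of Equations 6-10 (concatenate the two latent vectors, pass them through a ReLU MLP, and score the sigmoid output with BCE) can be sketched as below. The latent dimensions, layer sizes, and random parameters are illustrative assumptions, not the trained TGCN.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_predict(h_m, weights, biases):
    """Eqs. 7-8: stacked ReLU layers, sigmoid on the final layer."""
    h = h_m
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)              # Eq. 7: ReLU(W h + b)
    return sigmoid(weights[-1] @ h + biases[-1])    # Eq. 8

def bce_loss(p, y, eps=1e-12):
    """Eq. 10: binary cross-entropy averaged over samples."""
    p = np.clip(p, eps, 1 - eps)                    # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
h_t = rng.standard_normal(32)            # transformer latent (Eq. 4), d_t = 32
h_c = rng.standard_normal(16)            # GCN climate latent (Eq. 3)
h_m = np.concatenate([h_t, h_c])         # Eq. 6: CONCAT(h_t, h_c)

dims = [48, 64, 32, 16, 1]               # hypothetical four-layer MLP sizes
weights = [rng.standard_normal((dims[i + 1], dims[i])) * 0.1 for i in range(4)]
biases = [np.zeros(dims[i + 1]) for i in range(4)]

y_pred = mlp_predict(h_m, weights, biases)
loss = bce_loss(y_pred, np.array([1.0]))  # true label 1 for this sample
```

In training, the transformer, GCN, and MLP parameters would all be updated jointly by backpropagating this loss; the sketch shows only the forward pass.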
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experimental Validation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Datasets</head><p>Our data and code are publicly available 1 . In our dataset, the train and test split ratio is 7:3.</p><p>1 https://github.com/jiang28/Deep-Spatio-Temporal-Encoding</p></div>
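The 7:3 split mentioned above can be sketched as follows. Whether the split is random or chronological is not stated, so a chronological split, a common choice for time series data, is assumed here.

```python
import numpy as np

def chronological_split(samples, train_ratio=0.7):
    """Split time-ordered samples into train/test at the given ratio.
    (An assumed chronological split; the paper only states the 7:3 ratio.)"""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

data = np.arange(100)  # placeholder sample indices in time order
train, test = chronological_split(data)
print(len(train), len(test))  # 70 30
```

A chronological split avoids leaking future observations into the training set, which matters for forecasting tasks.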
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">Precipitation Dataset</head><p>Our precipitation dataset is sourced from the NOAA HRRR dataset<ref type="foot" target="#foot_0">2</ref>, offering real-time climate data at a 3 km spatial resolution and 1-hour temporal resolution. This dataset <ref type="bibr" target="#b34">[35]</ref> encompasses total precipitation, precipitation rate, and nine additional climate variables, including humidity (%), moisture availability (%), pressure (Pa), wind speed (m/s), and total cloud cover (%). Simulated brightness temperature data are acquired from the GOES 11 satellite<ref type="foot" target="#foot_1">3</ref>. The precipitation data consist of the following three types:</p><p>• Temporal precipitation data x_t, as shown in Table <ref type="table" target="#tab_1">1</ref> and Figure <ref type="figure" target="#fig_6">5</ref>, capture the historical patterns and fluctuations in precipitation over time. Specifically, we define the temporal precipitation rate and total accumulated precipitation over the past 6 hours as x_t, which consists of N timestamps:</p><formula xml:id="formula_11">x_t = {x^1_t, x^2_t, ..., x^N_t}</formula><p>where x^i_t, i ∈ {1..N}, is the precipitation record at the i-th timestamp. • Local climate data x_c: the dataset comprises twelve local climate variables, including temperature, humidity, wind speed, atmospheric pressure, and various other meteorological factors. • Spatial location data x_s: each grid point in the dataset represents a specific location within the study area, such as a region or a cell. To represent the relationships between these grid points, we use an adjacency matrix, in which a value of 0 indicates that two grid points are not neighbors, while a value of 1 denotes a neighboring relationship.</p></div>
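The binary adjacency matrix described above can be built directly from the grid layout. The sketch below assumes 4-neighborhood (edge-sharing) adjacency, which the text does not spell out.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Binary adjacency matrix for a rows x cols grid: entry (i, j) is 1
    iff cells i and j share an edge (assumed 4-neighborhood)."""
    n = rows * cols
    A = np.zeros((n, n), dtype=np.int8)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c           # row-major cell index
            if c + 1 < cols:           # right neighbor
                A[i, i + 1] = A[i + 1, i] = 1
            if r + 1 < rows:           # bottom neighbor
                A[i, i + cols] = A[i + cols, i] = 1
    return A

A = grid_adjacency(3, 3)
print(A.sum())  # 12 undirected grid edges, each counted twice -> 24
```

For the paper's 100x100 study grid, the same function would be called as `grid_adjacency(100, 100)`, yielding the 10,000-node adjacency matrix consumed by the GCN.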
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">Real-estate Dataset</head><p>The real estate dataset captures the dynamics of the U.S. real estate market by collecting spatially correlated data from multiple sources. It consists of 7,436 neighborhoods, 567 cities, 304 counties, 225 metros, and 50 states across the U.S. The data are connected through spatial locations, forming a multi-level spatial hierarchy. The dataset consists of three main components: census data, pricing history, and school district information. Here are some statistics about the real estate dataset:</p><p>• Spatial Hierarchy Levels: The dataset includes a multi-level spatial hierarchy, with information at the state, metro, county, city, and neighborhood levels. • Census Data: The census data consist of 16 variables related to housing prices, personal income, demographics, and spatial information. • Pricing History: The dataset includes the temporal housing price history for each neighborhood, spanning from 1996 to 2019. • School District Information: The dataset incorporates school district information, providing the number of school districts present in each county within the studied area as well as the top school district(s) within the region.</p><p>To facilitate the task of predicting real estate hotspots, the dataset is classified into two classes based on the house price increase rate for each neighborhood: 1 for hotspots and 0 for non-hotspots. The detailed settings of the Real-estate Dataset can be found in <ref type="bibr" target="#b35">[36]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Evaluation Metrics</head><p>We evaluate the performance of a classification system using several metrics: Accuracy, Recall, Precision, F1-score, and ROC. These metrics are calculated from the number of true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn). Accuracy measures the proportion of observations, both positive and negative, that were correctly classified:</p><formula xml:id="formula_12">acc = (tp + tn) / (tp + fp + tn + fn)</formula><p>Recall measures the proportion of true positives that were correctly identified by the system:</p><formula xml:id="formula_13">recall = tp / (tp + fn)</formula><p>Precision measures the proportion of identified positives that were actually true positives:</p><formula xml:id="formula_14">precision = tp / (tp + fp)</formula><p>F1-score is the harmonic mean of precision and recall, providing a single measure of the system's accuracy on the dataset:</p><formula xml:id="formula_15">F1 = 2 · precision · recall / (precision + recall)</formula><p>The ROC (Receiver Operating Characteristic) curve is a graphical plot that illustrates the performance of a binary classifier. It is created by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR):</p><formula xml:id="formula_16">TPR = tp / (tp + fn),  FPR = fp / (fp + tn)</formula><p>Overall, these metrics provide a comprehensive evaluation of a classification system's performance and can help identify areas for improvement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Heavy Rainfall Prediction</head><p>Our study identifies heavy rainfall risk areas based on the precipitation rate. Following the United States Geological Survey (USGS) standard<ref type="foot" target="#foot_2">4</ref>, we define the heavy rainfall risk as follows:</p><formula xml:id="formula_17">Class = 0 if R &lt; 4 mm/hr;  Class = 1 if R ≥ 4 mm/hr</formula><p>Recognizing the significance of the precipitation rate as a critical factor, our objective is to pinpoint areas that are susceptible to heavy rainfall within the next hour. The classification into two classes simplifies the problem and provides a clear distinction between areas with different levels of heavy rainfall risk. Using the 4 mm/hour threshold, we classify areas as either low-risk (labeled 0) or high-risk (labeled 1). For example, out of 10,000 grid points in the study area, 4,798 have a potential for heavy rain risk, while 5,202 do not. This classification simplifies decision-making and resource allocation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Comparison of model performance on the Precipitation dataset; the proposed model achieves an accuracy of 86.6%.</p></div>
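The labeling rule above and the metrics of Section 4.2 can be combined into a short sketch. The predicted labels in the example are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

USGS_THRESHOLD_MM_PER_HR = 4.0

def label_heavy_rain(rate_mm_per_hr):
    """Class 1 (high risk) if precipitation rate >= 4 mm/hr, else class 0."""
    return (np.asarray(rate_mm_per_hr) >= USGS_THRESHOLD_MM_PER_HR).astype(int)

def metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and FPR from confusion counts."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "acc": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
    }

rates = np.array([0.0, 0.72, 3.9, 4.0, 12.5, 5.1])  # mm/hr
y_true = label_heavy_rain(rates)        # -> [0 0 0 1 1 1]
y_pred = np.array([0, 0, 1, 1, 1, 1])   # hypothetical model output
m = metrics(y_true, y_pred)
```

Note that 3.9 mm/hr falls just below the threshold and is labeled 0, while exactly 4.0 mm/hr is labeled 1, matching the ≥ in the class definition.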
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Baselines</head><p>We use the following baseline methods:</p><p>• Random Forest (RF) <ref type="bibr" target="#b36">[37]</ref> • Support Vector Machine (SVM) <ref type="bibr" target="#b37">[38]</ref> • Decision Tree (DT) <ref type="bibr" target="#b38">[39]</ref> • Linear Regression (LR) <ref type="bibr" target="#b39">[40]</ref> • Multilayer Perceptron (MLP) <ref type="bibr" target="#b40">[41]</ref> • Long Short-Term Memory (LSTM) <ref type="bibr" target="#b24">[25]</ref> • Transformer <ref type="bibr" target="#b33">[34]</ref></p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Performance Analysis</head><p>Based on the results presented in Table <ref type="table">2</ref> and Table <ref type="table">3</ref>, we can analyze the performance of different models on the Real Estate dataset and the Precipitation dataset, respectively. In Table <ref type="table">2</ref>, the proposed model outperforms all the baseline models with an accuracy of 95.6%. The proposed model also exhibits the highest precision for both classes (0 and 1), achieving 0.93 and 0.97, respectively. It demonstrates high recall values for both classes as well. The F1 scores are also higher for the proposed model compared to the baseline models, indicating a better balance between precision and recall. The TGCN model's performance is further reflected in the ROC score of 0.954, which indicates its ability to discriminate between the two classes effectively.</p><p>Table <ref type="table">3</ref> shows that the proposed model again achieves the highest accuracy of 86.6%. Similar to the Real Estate dataset, the TGCN model demonstrates superior precision and recall values for both classes compared to the baseline models. It achieves precision scores of 0.9 and 0.83 for classes 0 and 1, respectively, along with recall scores of 0.82 for class 0 and 0.85 for class 1. The F1 scores also indicate the TGCN model's overall better performance. The ROC score for the TGCN model is 0.867.</p><p>These results demonstrate that the proposed TGCN model consistently outperforms the other models on both datasets in terms of accuracy, precision, recall, F1 score, and ROC score. The TGCN model's ability to capture temporal, nontemporal, and spatial information through its integration of the transformer layer and the graph convolutional network contributes to its good performance in identifying and predicting hotspots and heavy rainfall areas.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In conclusion, the accurate prediction of heavy rainfall events is crucial for effective urban water usage, disaster response, and mitigation efforts. This paper proposed a prediction model that leverages spatially connected features and real-world climate data to predict heavy rainfall risks across a broad range. Through extensive experimentation, it was observed that the TGCN model outperformed the other machine learning methods in forecasting both heavy rainfall events and real estate trends.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Future Work and Limitations</head><p>While this study successfully demonstrated the effectiveness of the proposed TGCN model in predicting heavy rainfall risks, there are several avenues for future research and improvement.</p><p>We plan to incorporate more diverse and comprehensive datasets, including additional meteorological and geographical features. This expansion has the potential to enhance the accuracy and generalizability of the TGCN model. Furthermore, we are considering the integration of real-time data streams and the utilization of advanced data fusion techniques to further enhance the model's forecasting capabilities.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Design Flow of the Trans-Graph Convolutional Prediction Model: The Trans-Graph Convolutional Prediction Model incorporates a transformer layer for time-series precipitation data, a GCN for local climate features and spatial relationships among grid points, and a four-layer MLP model for the final prediction.</figDesc><graphic coords="3,83.50,65.61,428.28,186.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Graph Convolutional Network Architecture: The input data consists of the spatial relation matrix and spatially connected climate data. The nodes in the figure are for illustrative purposes.</figDesc><graphic coords="3,101.02,374.42,155.64,269.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Study Area:</head><label></label><figDesc>Figure 4 presents the location of the study area: 10,000 grids across South Florida in the U.S.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The study area consists of 10,000 grids across South Florida in the United States. The figure shows the observed precipitation values in each county within this area.</figDesc><graphic coords="5,337.67,231.53,157.52,157.22" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Study Area Precipitation Rate Heatmap: 100x100 grid region on September 28, 2022, at 13:00 (mm/s).</figDesc><graphic coords="5,315.32,449.75,202.23,111.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>• Time-series precipitation data 𝑥𝑡 = {𝑥 1 𝑡 , 𝑥 2 𝑡 , ..., 𝑥 𝑁 𝑡 }, where 𝑥 𝑖 𝑡 , 𝑖 ∈ {1..𝑁 }, represents the average precipitation for the 𝑖-th timestamp. • Local climate data 𝑥𝑐: The dataset comprises twelve local climate variables, including temperature, humidity, wind speed, atmospheric pressure, and various other meteorological factors. • Spatial location data 𝑥𝑠: Each grid point in the</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc></figDesc><table><row><cell>GridID</cell><cell>Longitude</cell><cell>Latitude</cell><cell>Grid Points</cell><cell>Grid Spacing</cell><cell>Vertical Level</cell></row><row><cell>1</cell><cell>122.71</cell><cell>21.13</cell><cell>1799 × 1059</cell><cell>3 km</cell><cell>50</cell></row><row><cell>Time Stamps</cell><cell>2022/09/23 00:00</cell><cell>2022/09/23 01:00</cell><cell>2022/09/23 02:00</cell><cell>...</cell><cell>2022/10/02 00:00</cell></row><row><cell>Precipitation rate (mm/hour)</cell><cell>0.0</cell><cell>0.72</cell><cell>0.94</cell><cell>...</cell><cell>0</cell></row><row><cell>Total Precipitation (mm)</cell><cell>0.01</cell><cell>1.88</cell><cell>4.3</cell><cell>...</cell><cell>31.61</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://rapidrefresh.noaa.gov/hrrr/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://www.goes.noaa.gov/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">https://www.usgs.gov/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>This work was partially supported by the National Science Foundation (NSF) under Grant No. 2318641. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Bert-trip: Effective and scalable trip representation using attentive contrast learning</title>
		<author>
			<persName><forename type="first">A.-T</forename><surname>Kuo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 39th International Conference on Data Engineering (ICDE)</title>
				<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="612" to="623" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Freeway travel time prediction using deep hybrid model-taking sun yat-sen freeway as an example</title>
		<author>
			<persName><forename type="first">P.-Y</forename><surname>Ting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-L</forename><surname>Chiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sakai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-K</forename><surname>Jeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-S</forename><surname>Hwu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Vehicular Technology</title>
		<imprint>
			<biblScope unit="volume">69</biblScope>
			<biblScope unit="page" from="8257" to="8266" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Hierarchical nearest-neighbor gaussian process models for large geostatistical datasets</title>
		<author>
			<persName><forename type="first">A</forename><surname>Datta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">O</forename><surname>Finley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">E</forename><surname>Gelfand</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Statistical Association</title>
		<imprint>
			<biblScope unit="volume">111</biblScope>
			<biblScope unit="page" from="800" to="812" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Spatiotemporal interpolation using gstat</title>
		<author>
			<persName><forename type="first">B</forename><surname>Gräler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">J</forename><surname>Pebesma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">B</forename><surname>Heuvelink</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The R Journal</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">204</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Dynamic spatial-temporal graph convolutional neural networks for traffic forecasting</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Diao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="890" to="897" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A deep reinforcement learning system for the allocation of epidemic prevention materials based on ddpg</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kitchat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-H</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sakai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Surasak</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">242</biblScope>
			<biblScope unit="page">122763</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A novel framework for spatio-temporal prediction of environmental data using deep learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Amato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guignard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Robert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kanevski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientific reports</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">22243</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Smart deep learning based wind speed prediction model using wavelet packet decomposition, convolutional neural network and convolutional long short term memory network</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Mi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Energy Conversion and Management</title>
		<imprint>
			<biblScope unit="volume">166</biblScope>
			<biblScope unit="page" from="120" to="131" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Accurate medium-range global weather forecasting with 3d neural networks</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Tian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature</title>
		<imprint>
			<biblScope unit="page" from="1" to="6" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A deep learning multimodal method for precipitation estimation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Moraux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dewitte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Cornelis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Munteanu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Remote Sensing</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page">3278</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Deep learning for precipitation nowcasting: A benchmark and a new model</title>
		<author>
			<persName><forename type="first">X</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lausen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D.-Y</forename><surname>Yeung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-K</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-C</forename><surname>Woo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">N</forename><surname>Kipf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1609.02907</idno>
		<title level="m">Semi-supervised classification with graph convolutional networks</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Transgcn: Coupling transformation assumptions with graph convolutional networks for link prediction</title>
		<author>
			<persName><forename type="first">L</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Janowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th international conference on knowledge capture</title>
				<meeting>the 10th international conference on knowledge capture</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="131" to="138" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Bayesian joint estimation of multiple graphical models</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Narisetty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Liang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Large-scale learnable graph convolutional networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ji</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery &amp; data mining</title>
				<meeting>the 24th ACM SIGKDD international conference on knowledge discovery &amp; data mining</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1416" to="1424" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Knowledge graph convolutional networks for recommender systems</title>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Guo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The world wide web conference</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="3307" to="3313" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A new model for learning in graph domains</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gori</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Monfardini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scarselli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Joint Conference on Neural Networks</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="729" to="734" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">The graph neural network model</title>
		<author>
			<persName><forename type="first">F</forename><surname>Scarselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gori</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Tsoi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hagenbuchner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Monfardini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Neural Networks</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="61" to="80" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tarlow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brockschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Zemel</surname></persName>
		</author>
		<title level="m">Gated graph sequence neural networks</title>
				<imprint>
			<publisher>ICLR</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Graph convolutional neural networks for web-scale recommender systems</title>
		<author>
			<persName><forename type="first">R</forename><surname>Ying</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Eksombatchai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">L</forename><surname>Hamilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGKDD, ACM</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="974" to="983" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Bruna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaremba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Szlam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<title level="m">Spectral networks and locally connected networks on graphs</title>
				<imprint>
			<publisher>ICLR</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Fastgcn: Fast learning with graph convolutional networks via importance sampling</title>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xiao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICLR, OpenReview.net</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1709.04875</idno>
		<title level="m">Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Attention based spatial-temporal graph convolutional networks for traffic flow forecasting</title>
		<author>
			<persName><forename type="first">S</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="922" to="929" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Predicting the price of bitcoin using machine learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Mcnally</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Roche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Caton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="339" to="343" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Traffic accident hotspot prediction using temporal convolutional networks: A spatio-temporal approach</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">D</forename><surname>Yeddula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems</title>
				<meeting>the 31st ACM International Conference on Advances in Geographic Information Systems</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Borovykh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bohte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">W</forename><surname>Oosterlee</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1703.04691</idno>
		<title level="m">Conditional time series forecasting with convolutional neural networks</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Adversarial sparse transformer for time series forecasting</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="17105" to="17115" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Accurate multivariate stock movement prediction via data-axis transformer with multi-level contexts</title>
		<author>
			<persName><forename type="first">J</forename><surname>Yoo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Soun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-C</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Kang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining</title>
				<meeting>the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2037" to="2045" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Rice yield prediction and model interpretation based on satellite and climatic indicators using a transformer method</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Remote Sensing</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">5045</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Selfattention convlstm for spatiotemporal prediction</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yuan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="11531" to="11538" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Traffic flow prediction via spatial temporal graph neural network</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the web conference 2020</title>
				<meeting>the web conference 2020</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1082" to="1092" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1706.03762</idno>
		<title level="m">Attention is all you need</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">A multimodal geo dataset for high-resolution precipitation forecasting</title>
		<author>
			<persName><forename type="first">C</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems</title>
				<meeting>the 31st ACM International Conference on Advances in Geographic Information Systems</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Modeling real estate dynamics using temporal encoding</title>
		<author>
			<persName><forename type="first">C</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-S</forename><surname>Ku</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th International Conference on Advances in Geographic Information Systems</title>
				<meeting>the 29th International Conference on Advances in Geographic Information Systems</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="516" to="525" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Random decision forests</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Ho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 3rd international conference on document analysis and recognition</title>
				<meeting>3rd international conference on document analysis and recognition</meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="1995">1995</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="278" to="282" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<analytic>
		<title level="a" type="main">A training algorithm for optimal margin classifiers</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">E</forename><surname>Boser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">M</forename><surname>Guyon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">N</forename><surname>Vapnik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the fifth annual workshop on Computational learning theory</title>
				<meeting>the fifth annual workshop on Computational learning theory</meeting>
		<imprint>
			<date type="published" when="1992">1992</date>
			<biblScope unit="page" from="144" to="152" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Classification and regression trees</title>
		<author>
			<persName><forename type="first">W.-Y</forename><surname>Loh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="14" to="23" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<analytic>
		<title level="a" type="main">Generalized linear models</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Nelder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">W</forename><surname>Wedderburn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Royal Statistical Society: Series A (General)</title>
		<imprint>
			<biblScope unit="volume">135</biblScope>
			<biblScope unit="page" from="370" to="384" />
			<date type="published" when="1972">1972</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<analytic>
		<title level="a" type="main">C1.2 Multilayer perceptrons</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">B</forename><surname>Almeida</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Handbook of Neural Computation</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
