<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deep Spatio-Temporal Encoding: Achieving Higher Accuracy by Aligning with External Real-World Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chen Jiang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wenlu Wang</string-name>
          <email>wenlu.wang@tamucc.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jingjing Li</string-name>
          <email>jingjingli@meta.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Naiqing Pan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wei-Shinn Ku</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Auburn University</institution>
          ,
          <addr-line>Auburn, AL</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Meta</institution>
          ,
          <addr-line>Menlo Park, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Texas A&amp;M University-Corpus Christi</institution>
          ,
          <addr-line>Corpus Christi, TX</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>Spatio-temporal deep learning has drawn a lot of attention since many downstream real-world applications can benefit from accurate predictions. For example, accurate prediction of heavy rainfall events is essential for effective urban water usage, flood warning, and mitigation. In this paper, we propose a strategy to leverage spatially connected real-world features to enhance prediction accuracy. Specifically, in our case study we leverage spatially connected real-world climate data to predict heavy rainfall risks over a broad area. We experimentally ascertain that our Trans-Graph Convolutional Network (TGCN) accurately predicts heavy rainfall risks and real estate trends, demonstrating the advantage of incorporating external spatially connected real-world data to improve model performance. These results show that the proposed approach has significant potential to enhance spatio-temporal prediction accuracy, aiding efficient urban water usage, flood risk warning, and fair housing in real estate.</p>
      </abstract>
      <kwd-group>
        <kwd>Spatial-temporal Analysis</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Transformer</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Spatio-temporal predictions have been extensively studied
due to their impact on real-world applications [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1, 2, 3, 4, 5</xref>
        ].
For example, heavy rainfall events can cause significant
damage to infrastructure and pose serious threats to human
safety. Predicting these events with greater accuracy allows
better preparation and response [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], ultimately saving lives
and reducing the economic impact of such events.
      </p>
      <p>
        Deep learning methods, such as deep spatio-temporal
prediction models [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], have improved the performance
of rainfall forecasting over the years. However, the role of
external data in enhancing the prediction accuracy is still
controversial. Some argue that external data can provide
more useful information for the prediction model, while
others claim that external data can introduce more noise
and complexity to the learning process. In this study, we
propose to improve spatio-temporal predictions by
combining spatially-linked external real-world data along with a
TGCN to learn the spatio-temporal dependencies from the
combined data. As prior work has shown that utilizing
multi-source real-world data is more likely to lead to higher
accuracy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], our study aims to introduce a fresh perspective
on integrating external real-world data into the proposed
framework. We use heavy rainfall prediction as a case study
for our proposed method, and overall we aim to provide
accurate spatio-temporal predictions by leveraging as much
information as possible, enabling better decision-making
for a broad range of spatio-temporal applications and at the
same time offering a novel angle and a comprehensive
evaluation to demonstrate the feasibility of integrating additional
external real-world data without the necessity of
customizing transformer attention mechanisms. Our approach is
experimentally validated by predicting heavy rainfall events
and real estate hotspots.
      </p>
      <p>The traditional method for predicting heavy rainfall
involves manually engineering features from weather data,
including temperature, pressure, humidity, etc.
Meteorologists rely on their expertise to interpret this data and
forecast future weather patterns. This process entails observing
and analyzing atmospheric factors to predict weather
patterns. However, this traditional approach is time-consuming,
labor-intensive, and susceptible to human error, especially
when dealing with large datasets. As data grows, it becomes
increasingly challenging to analyze large amounts of
information by hand.</p>
      <p>
        Previous research has investigated using deep learning
for precipitation prediction [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ] with promising results.
However, some limitations can be significantly improved to
enhance deep model performance. One area with room for
enhancement is leveraging spatial dependencies. To tackle
this challenge, we propose a model that integrates both
Graph Convolution Networks (GCNs) and a Transformer.
This model enables combining external spatially-linked data
for spatio-temporal predictions.
      </p>
      <p>Specifically, we employ a GCN to analyze the adjacency
matrix on a grid level and generate correlations between
each grid element. The GCN captures the spatial
relationships and dependencies among neighboring grid points,
allowing for a comprehensive understanding of the data’s
spatial dynamics. We then utilize a Transformer model to
encode the temporal precipitation data and combine it with
the spatial correlations obtained from the GCNs. By
combining the GCNs and the Transformer within the proposed
TGCN model, we create a framework that harnesses both
the spatial and temporal dimensions of the data.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Graph Neural Networks</title>
        <p>
          Graph Convolutional Networks (GCNs) are a type of deep
learning model designed to process data represented in a
graph structure, such as social or sensor networks [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
GCNs have demonstrated their effectiveness in various
applications, including node classification, link prediction, and
recommendation systems [
          <xref ref-type="bibr" rid="ref13 ref14 ref15 ref16">13, 14, 15, 16</xref>
          ]. The concept of
Graph Neural Networks (GNNs) was initially introduced
in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and further expanded upon in subsequent research
by [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. GNNs, a type of recurrent neural network (RNN),
iteratively propagate information from neighboring nodes
until reaching a stable fixed point. This iterative process has
traditionally been computationally expensive, but recent
studies, such as [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], have made significant improvements
in this area. Inspired by the success of Convolutional Neural
Networks (CNNs) in computer vision, which extract
high-level features from images using convolution and pooling
layers, current models aim to adapt these layers to directly
process graph inputs. GCNs can be categorized into two
types of graph convolution layers: spectral graph
convolution and localized graph convolution, as discussed in [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
Early research primarily focused on spectral graph
convolutions, pioneered by [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. The current state-of-the-art model,
GCN, further simplified the graph convolution operation by
employing a localized first-order approximation. However,
spectral methods require operations on the entire graph
Laplacian during training, which can be computationally
expensive. Several subsequent works, such as FastGCN [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]
have aimed to alleviate this issue.
        </p>
        <p>
          Recently, researchers have explored the application of GCNs
in time series prediction. For example, spatio-temporal
GCN-based approaches have been proposed for traffic flow
prediction [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], and the integration of time-aware topological
information into GCNs using the mathematical framework
of zigzag persistence [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Spatial Temporal Prediction</title>
        <p>
          In this section, we discuss various existing temporal and
spatial-temporal forecasting methods. For example,
Recurrent Neural Networks (RNNs), especially long-short-term
memory (LSTM) [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], have gained popularity in time series
forecasting [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]. Convolutional Neural Networks (CNN)
and its variant Temporal Convolutional Neural Networks
(TCN) are another option for sequence prediction [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ],
offering parallel computations compared to RNNs [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. In
recent years, researchers have explored Transformers and
its variants in time series forecasting, achieving
state-of-the-art performance in tasks like energy consumption and
stock market prediction [
          <xref ref-type="bibr" rid="ref29 ref30 ref31">29, 30, 31</xref>
          ]. Designing a model capable of
comprehensively capturing both spatial and temporal
patterns represents another emerging trend in spatial-temporal
prediction tasks [
          <xref ref-type="bibr" rid="ref32 ref33">32, 33</xref>
          ]. For example, [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] introduced a
spatial-temporal graph neural network for predicting traffic
flow.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In this section, we detail our model architecture and the
benefits of our design.</p>
      <sec id="sec-3-1">
        <title>3.1. Overview</title>
        <p>The architecture we propose, illustrated in Figure 2,
incorporates a combination of techniques to enhance the prediction
model. We begin by utilizing a transformer encoder to
effectively encode the time series precipitation data, and then
integrate local climate features into the model, enabling
a comprehensive understanding of the factors influencing
heavy rainfall.</p>
        <p>To address spatial dependencies and relationships among
grid points, a GCN is introduced. This GCN learns the
spatial dependencies within the dataset, considering the
interconnectedness of grids based on their spatial locations. By
leveraging the GCN, the model becomes capable of
capturing and integrating spatial information, thereby enhancing
prediction accuracy.</p>
        <p>The latent code, which combines the encoded time series
precipitation data and the spatially connected local climate
features learned through the GCN, is fed into a multi-layer
perceptron (MLP) for prediction. This integrated
architecture allows the MLP model to leverage the fused
information, including temporal precipitation data, other climate
features, and spatial factors, to effectively learn and infer
future heavy rainfall areas.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model Architecture</title>
        <sec id="sec-3-2-1">
          <title>3.2.1. Preliminaries</title>
          <p>Our proposed TGCN model consists of Encoder, GCNs, and
Multi-layer Perceptron (MLP) layers. The major
component in the transformer is the multi-head self-attention,
whose core scaled dot-product attention is</p>
          <p>Attention(Q, K, V) = softmax(QKᵀ / √d_k) V   (1)</p>
          <p>where K and V are matrices that store the keys and
values, Q is the query matrix that is mapped against the set of keys,
and d_k is the key dimension.</p>
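          <p>Equation 1 can be sketched in NumPy as follows. This is a minimal single-head version for illustration; the function and variable names are our own, not from the paper.</p>

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Equation 1)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row of weights sums to 1
    return weights @ V, weights

# Toy example: 3 queries attending over 4 key/value pairs of width 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # one output vector per query: (3, 8)
```

          <p>Each output row is a convex combination of the value vectors, weighted by how well the corresponding query matches each key.</p>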
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Transformer-based Encoder</title>
          <p>
            We have developed a predictive model using the
Transformer architecture, tailored for heavy rainfall forecasts.
Unlike traditional methods that only use past rainfall data,
our model factors in numerous external variables to boost
accuracy. We examine local features, including geography,
atmospheric conditions (pressure, temperature, wind),
humidity, and topography, all of which influence heavy rainfall
likelihood in a specific area. Therefore, we have developed a
transformer-based prediction model [
            <xref ref-type="bibr" rid="ref34">34</xref>
            ] that incorporates
GCNs to process the spatial features. By doing so, our model
can capture the spatial relationships among various features
in a graph structure, such as the dependencies between grid
point locations and their corresponding climate data. The
integration of the GCNs enhances our model’s ability to
capture both temporal and spatial information. Our model
design starts with a transformer encoder capturing
temporal precipitation patterns, followed by embedding this data
and merging it with local climate data like moisture and
humidity. We enhance prediction accuracy with this added
context.
          </p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Graph Convolutional Networks</title>
          <p>GCNs have received considerable attention in recent
years and have shown impressive performance in various
applications. In this study, we aim to improve the performance
of our model by integrating a GCN on top of a Transformer
encoder model. The GCN model is specifically designed to
capture the spatial relationships between each node in the
graph and enhance the overall representation of the input
data.</p>
          <p>As illustrated in Figure 3, GCNs involve learning a linear
transformation of the feature vectors of each node in a graph,
which is then used to update the node features by
aggregating information from the node’s neighbors. Mathematically,
this can be expressed as:</p>
          <p>h_i^(l+1) = σ( Σ_{j ∈ N(i)} (1/c_ij) W^(l+1) h_j^(l) )   (2)</p>
          <p>In the equation, h_i^(l+1) represents the feature vector
of node i at layer l + 1, W^(l+1) denotes the learnable
weight matrix for layer l + 1, N(i) represents the set of
neighbors of node i, and c_ij is a normalization constant
that ensures proper scaling of the aggregated information.
The function σ denotes a non-linear activation function,
which introduces non-linearity into the model; in our
specific case, we utilize the ReLU activation function. This
equation can be interpreted as calculating a weighted sum
of the feature vectors of the neighbors of node i at layer
l, where the weights are determined by the learned
weight matrix W^(l+1). A non-linear activation
function is then applied to obtain the updated feature vector
h_i^(l+1) for node i at layer l + 1. This process is repeated
across multiple layers to learn expressive representations
of the graph data.</p>
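          <p>A single GCN layer update in the spirit of Equation 2 can be sketched as follows. This is a NumPy illustration under our own assumptions: the normalization constant c_ij is taken as the node degree (one common choice), and the weights are fixed rather than learned.</p>

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN layer update (cf. Equation 2): each node aggregates its
    neighbors' features (normalized by the node degree as c_ij), applies
    the weight matrix W, then a ReLU non-linearity.
    H: (num_nodes, in_dim)  node features h^(l)
    A: (num_nodes, num_nodes) binary adjacency matrix
    W: (in_dim, out_dim) weight matrix W^(l+1)
    """
    deg = A.sum(axis=1, keepdims=True)   # number of neighbors per node
    c = np.maximum(deg, 1.0)             # normalization constant, avoid /0
    agg = (A @ H) / c                    # mean over neighbors j in N(i)
    return np.maximum(agg @ W, 0.0)      # ReLU activation

# Toy 4-node path graph 0-1-2-3 with one-hot initial features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.eye(4)
W = np.full((4, 2), 0.5)                 # fixed weights for the sketch
H1 = gcn_layer(H, A, W)
print(H1.shape)  # (4, 2)
```

          <p>Stacking this update L times lets information propagate L hops across the grid graph, which is how the model captures spatial dependencies beyond immediate neighbors.</p>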
          <p>For the final prediction, we utilize a four-layer MLP model
that combines time series data with other features,
effectively leveraging both temporal and spatial information
captured by our model for more accurate predictions.</p>
          <p>By leveraging the transformer architecture, incorporating
GCNs, and utilizing a four-layer MLP model, our approach
enables the effective integration of temporal and spatial
information for improved prediction accuracy.</p>
        </sec>
        <sec id="sec-3-2-4">
          <title>3.2.4. Jointly Learning</title>
          <p>As illustrated in Figure 2, we propose to map temporal data
and non-temporal data into the same latent space and merge
the latent vectors for the subsequent prediction task.</p>
          <p>To encode the local climate features and capture the
spatial dependencies among the grid points for data X_c, we
employ a GCN to learn the relationships and dependencies
within the spatial domain. The output hidden features at
a specific layer l can be denoted as h^(l); Equation 2 is
applied in this context. Assuming we use L layers in total,
we use the final layer to summarize climate information,
which is defined as:</p>
          <p>h_c = h^(L)   (3)</p>
          <p>where h^(0) = X_c   (4)</p>
          <p>In these equations, h^(l) represents the hidden features at layer
l, obtained by applying the ReLU activation function
to the sum of the weighted input features W^(l) h^(l−1)
and the bias term b^(l), following Equation 2.</p>
          <p>
            We encode the temporal precipitation data X_t using a
transformer encoder [
            <xref ref-type="bibr" rid="ref34">34</xref>
            ]:
h_t = TransformerEncoder(X_t)   (5)
          </p>
          <p>with h_t ∈ R^d. Since X_t and X_c are encoded as h_t and h_c, we define the
merged hidden state h_m as follows.</p>
          <p>h_m = Merge(h_t, h_c)   (6)
To further process the merged information, we use another
multi-layer perceptron specifically trained for the
prediction task. Similarly, we define the l-th layer of this network as
(assuming M layers in total)</p>
          <p>h^(l) = σ(W^(l) h^(l−1) + b^(l))   (7)</p>
          <p>where h^(0) = h_m, and h^(l−1) is the output of the (l−1)-th
layer. W^(l) and b^(l) are model parameters.</p>
          <p>We use the output from the last layer for prediction:</p>
          <p>ȳ = σ(h^(M))   (8)</p>
          <p>L = BCE(ȳ, y)   (9)</p>
          <p>The loss is measured with the binary cross-entropy (BCE) loss,
which can be formulated as follows:</p>
          <p>L = −(1/N) Σ_{i=1}^{N} [ y_i log(ȳ_i) + (1 − y_i) log(1 − ȳ_i) ]   (10)</p>
          <p>where N is the total number of samples, y_i is the true label
for sample i, ȳ_i is the predicted probability, and log denotes
the natural logarithm.</p>
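          <p>The joint-learning step (Equations 6 through 10) can be sketched as follows. This is a NumPy illustration with randomly initialized weights and hypothetical layer widths; the paper's actual dimensions and training procedure are not specified here.</p>

```python
import numpy as np

def mlp_forward(h, weights, biases):
    # Equation 7 applied layer by layer: h^(l) = sigma(W^(l) h^(l-1) + b^(l)),
    # with ReLU hidden layers and a sigmoid output (Equation 8).
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)
    logits = h @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-logits))

def bce_loss(y_true, y_pred, eps=1e-12):
    # Equation 10: binary cross-entropy averaged over N samples.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

rng = np.random.default_rng(1)
h_t = rng.normal(size=(5, 16))            # transformer-encoded temporal data
h_c = rng.normal(size=(5, 16))            # GCN-encoded climate features
h_m = np.concatenate([h_t, h_c], axis=1)  # merged latent state (Equation 6)

# Hypothetical four-layer MLP head: 32 -> 16 -> 8 -> 4 -> 1.
dims = [32, 16, 8, 4, 1]
Ws = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]
y_pred = mlp_forward(h_m, Ws, bs).ravel()
loss = bce_loss(np.array([1.0, 0.0, 1.0, 1.0, 0.0]), y_pred)
print(float(loss))
```

          <p>In training, the BCE loss would be minimized by gradient descent over the transformer, GCN, and MLP parameters jointly.</p>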
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Validation</title>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <p>
          Our data and code are publicly available at
https://github.com/jiang28/Deep-Spatio-Temporal-Encoding. In our dataset,
the train/test split ratio is 7:3.
Our precipitation dataset is sourced from the NOAA HRRR
dataset (https://rapidrefresh.noaa.gov/hrrr/), offering real-time climate data at a 3 km spatial
resolution and 1-hour temporal resolution. This dataset [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]
encompasses total precipitation, precipitation rate, and nine
additional climate variables, including humidity (%),
moisture availability (%), pressure (Pa), wind speed (m/s), and
total cloud cover (%). Simulated brightness temperature data
is acquired from the GOES 11 satellite (https://www.goes.noaa.gov/). The precipitation
data consist of the following three types:
• Temporal precipitation data, denoted as X_t, as shown
in Table 1 and Figure 5. It captures the historical
patterns and fluctuations in precipitation over time.
Specifically, we define the temporal precipitation
rate and total accumulated precipitation over the
past 6 hours as X_t, which consists of T timestamps:
X_t = {x_1, x_2, ..., x_T}, where
x_t (t ∈ {1..T}) represents the average precipitation for the t-th
timestamp.
• Local climate data X_c: The dataset comprises twelve
local climate variables, including temperature,
humidity, wind speed, atmospheric pressure, and
various other meteorological factors.
• Spatial location data X_s: Each grid point in the
dataset represents a specific location within the
study area, such as a region or a cell. To represent the
relationships between these grid points, we use an
adjacency matrix, in which a value
of 0 indicates that two grid points are not
neighbors, while a value of 1 denotes a neighboring
relationship.
        </p>
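        <p>The binary adjacency matrix described above can be built as in the following sketch. The 4-neighborhood (edge-sharing cells) is our assumption for illustration; the paper does not state which neighborhood definition it uses.</p>

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Binary adjacency matrix for a rows x cols grid: entry (i, j) is 1
    if grid points i and j share an edge (4-neighborhood), else 0."""
    n = rows * cols
    A = np.zeros((n, n), dtype=int)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Only check right and down; symmetry fills the other direction.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    A[i, j] = A[j, i] = 1
    return A

A = grid_adjacency(3, 3)
print(A.sum(axis=1))  # corners have 2 neighbors, edges 3, the center 4
```

        <p>For the full 10,000-grid study area a 100 x 100 grid would yield a sparse 10,000 x 10,000 matrix, so a sparse representation would be preferable in practice.</p>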
        <sec id="sec-4-1-1">
          <title>4.1.2. Real-estate Dataset</title>
          <p>The real estate dataset captures the dynamics of the U.S. real
estate market by collecting spatially correlated data from
multiple sources. It consists of 7,436 neighborhoods, 567
cities, 304 counties, 225 metros, and 50 states across the U.S.
The data are connected through spatial locations, forming a
multi-level spatial hierarchy. The dataset consists of three
main components: census data, pricing history, and school
district information. Here are some statistics about the real
estate dataset:
• Spatial Hierarchy Levels: The dataset includes a
multi-level spatial hierarchy, including information
at the state, metro, county, city, and neighborhood
levels.
• Census Data: The census data consists of 16
variables related to various aspects of housing prices,
personal income, demographics, and spatial
information.
• Pricing History: The dataset includes temporal
housing price history for each neighborhood, spanning
from 1996 to 2019.
• School District Information: The dataset
incorporates school district information. It provides details
on the number of school districts present in each
county within the studied area. Additionally, the
dataset includes information on the top school
district(s) within the region.</p>
          <p>[Table 1: temporal precipitation data per grid point, with columns GridID, time stamps, precipitation rate (mm/hour), total precipitation (mm), and vertical level.]</p>
          <p>
            To facilitate the task of predicting real estate hotspots, the
dataset is classified into two classes based on the house price
increase rate for each neighborhood: 1 for hotspots and 0
for non-hotspots. The detailed settings of the Real-estate
Dataset can be found in [
            <xref ref-type="bibr" rid="ref36">36</xref>
            ].
          </p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation Metrics</title>
        <p>We evaluate the performance of a classification system
using various metrics, including Accuracy, Recall, Precision,
F1-score, and ROC. These metrics are calculated based on
the number of true positives (TP), false positives (FP), false
negatives (FN), and true negatives (TN). Accuracy measures
the proportion of observations, both positive and negative,
that were correctly classified by the system, and can be
computed using the formula:</p>
        <p>Accuracy = (TP + TN) / (TP + FP + FN + TN)</p>
        <p>Recall measures the proportion of true positives that were
correctly identified by the system, and can be computed
using the formula:</p>
        <p>Recall = TP / (TP + FN)</p>
        <p>Precision measures the proportion of identified positives
that were actually true positives, and can be computed using
the formula:</p>
        <p>Precision = TP / (TP + FP)</p>
        <p>F1-score is the harmonic mean of precision and recall,
providing a single measure of the system’s accuracy on
the dataset, and can be computed using the formula:</p>
        <p>F1 = 2 · (Precision · Recall) / (Precision + Recall)</p>
        <p>The ROC (Receiver Operating Characteristic) curve is a
graphical plot that illustrates the performance of a binary classifier
system. It is created by plotting the True Positive Rate (TPR)
against the False Positive Rate (FPR), which can be computed
using the formulas:</p>
        <p>TPR = TP / (TP + FN),   FPR = FP / (FP + TN)</p>
        <p>Overall, these metrics provide a comprehensive
evaluation of a classification system’s performance and can help
identify areas for improvement.</p>
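        <p>The formulas above translate directly into code; the function below computes all the listed metrics from confusion-matrix counts (the example counts are illustrative, not from the paper).</p>

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the evaluation metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    tpr = recall                    # TPR is identical to recall
    fpr = fp / (fp + tn)
    return {"accuracy": accuracy, "recall": recall, "precision": precision,
            "f1": f1, "tpr": tpr, "fpr": fpr}

# Illustrative counts: 80 true positives, 10 false positives,
# 20 false negatives, 90 true negatives.
m = classification_metrics(tp=80, fp=10, fn=20, tn=90)
print(m["accuracy"])  # 0.85
```
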
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Heavy Rainfall Prediction</title>
        <p>Study Area: Figure 4 presents the location of the study area
in this study. It consists of 10,000 grids across the state of
Florida in the U.S. Based on a precipitation
threshold, we classify areas as either low-risk (labeled as 0)
or high-risk (labeled as 1). For example, out of 10,000 grid
points in the study area, 4,798 have a potential for heavy
rain risk, while 5,202 do not. This classification simplifies
decision-making and resource allocation.</p>
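        <p>The labeling step can be sketched as below. The risk scores and the threshold value are hypothetical placeholders; the paper does not specify the threshold, and this sketch is not expected to reproduce the 4,798 / 5,202 split.</p>

```python
import numpy as np

# Hypothetical per-grid risk scores for the 10,000 grid points.
rng = np.random.default_rng(42)
risk_scores = rng.random(10_000)

# Grid points whose score meets the threshold are labeled high-risk (1),
# the rest low-risk (0). THRESHOLD is illustrative only.
THRESHOLD = 0.48
labels = (risk_scores >= THRESHOLD).astype(int)
print(labels.sum(), (labels == 0).sum())  # high- vs low-risk grid counts
```
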
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Baselines</title>
        <p>
          We use the following baseline methods:
• Random Forest (RF) [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ]
• Support Vector Machine (SVM) [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ]
• Decision Tree (DT) [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ]
• Linear Regression (LR) [
          <xref ref-type="bibr" rid="ref40">40</xref>
          ]
• Multilayer Perceptron (MLP) [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ]
• Long Short Term Memory (LSTM) [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]
• Transformer [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ]
        </p>
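        <p>A comparison of this kind could be run with scikit-learn as sketched below, on synthetic stand-in data. The paper's actual features and hyperparameters are not given here, and LogisticRegression stands in for the "Linear Regression (LR)" baseline since the task is binary classification; the LSTM and Transformer baselines are omitted from this sketch.</p>

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the 12-feature climate data.
X, y = make_classification(n_samples=500, n_features=12, random_state=0)
# 7:3 train/test split, matching the paper's ratio.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

baselines = {
    "RF": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
scores = {name: accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
          for name, model in baselines.items()}
print(scores)
```
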
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Performance Analysis</title>
      <p>Based on the results presented in Table 2 and Table 3, we
can analyze the performance of different models on the Real
Estate dataset and the Precipitation dataset, respectively.</p>
      <p>In Table 2, the proposed model outperforms all the
baseline models with an accuracy of 95.6%. The proposed model
also exhibits the highest precision for both classes (0 and
1), achieving 0.93 and 0.97, respectively. It demonstrates
high recall values for both classes as well. The F1 scores are
also higher for the proposed model compared to the
baseline models, indicating a better balance between precision
and recall. The TGCN model’s performance is further
reflected in the ROC score of 0.954, which indicates its ability
to discriminate between the two classes effectively.</p>
      <p>Table 3 shows that the proposed model again achieves the
highest accuracy of 86.6%. Similar to the Real Estate dataset,
the TGCN model demonstrates superior precision and recall
values for both classes compared to the baseline models. It
achieves precision scores of 0.9 and 0.83 for classes 0 and
1, respectively, along with recall scores of 0.82 for class 0
and 0.85 for class 1. The F1 scores also indicate the TGCN
model’s overall better performance. The ROC score for the
TGCN model is 0.867.</p>
      <p>These results demonstrate that the proposed TGCN model
consistently outperforms the other models on both datasets
in terms of accuracy, precision, recall, F1 score, and ROC
score. The TGCN model’s ability to capture temporal,
non-temporal, and spatial information through its integration
of the transformer layer and the graph convolutional
network contributes to its good performance in identifying and
predicting hotspots and heavy rainfall areas.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In conclusion, the accurate prediction of heavy rainfall
events is crucial for effective urban water usage, disaster
response, and mitigation efforts. This paper proposed a
prediction model that leverages spatially connected features
and real-world climate data to predict heavy rainfall risks
across a broad range. Through extensive experimentation,
it was observed that the TGCN model outperformed the
other machine learning methods in forecasting both heavy
rainfall events and real estate trends.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Future Work and Limitations</title>
      <p>While this study successfully demonstrated the efectiveness
of the proposed TGCN model in predicting heavy rainfall
risks, there are several avenues for future research and
improvement.</p>
      <p>We plan to incorporate more diverse and comprehensive
datasets, including additional meteorological and
geographical features. This expansion has the potential to enhance
the accuracy and generalizability of the TGCN model.
Furthermore, we are considering the integration of real-time
data streams and the utilization of advanced data fusion
techniques to further enhance the model’s forecasting
capabilities.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgement</title>
      <p>This work was partially supported by the National Science
Foundation (NSF) under Grant No. 2318641. Any opinions,
findings, and conclusions or recommendations expressed in
this material are those of the authors and do not necessarily reflect the
views of the National Science Foundation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.-T.</given-names>
            <surname>Kuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <article-title>Bert-trip: Efective and scalable trip representation using attentive contrast learning</article-title>
          ,
          <source>in: 2023 IEEE 39th International Conference on Data Engineering (ICDE)</source>
          ,
          <source>IEEE Computer Society</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>612</fpage>
          -
          <lpage>623</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.-Y.</given-names>
            <surname>Ting</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Chiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sakai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.-K.</given-names>
            <surname>Jeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-S.</given-names>
            <surname>Hwu</surname>
          </string-name>
          ,
          <article-title>Freeway travel time prediction using deep hybrid model-taking sun yat-sen freeway as an example</article-title>
          ,
          <source>IEEE Transactions on Vehicular Technology</source>
          <volume>69</volume>
          (
          <year>2020</year>
          )
          <fpage>8257</fpage>
          -
          <lpage>8266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. O.</given-names>
            <surname>Finley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Gelfand</surname>
          </string-name>
          ,
          <article-title>Hierarchical nearest-neighbor gaussian process models for large geostatistical datasets</article-title>
          ,
          <source>Journal of the American Statistical Association</source>
          <volume>111</volume>
          (
          <year>2016</year>
          )
          <fpage>800</fpage>
          -
          <lpage>812</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Gräler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Pebesma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. B.</given-names>
            <surname>Heuvelink</surname>
          </string-name>
          ,
          <article-title>Spatio-temporal interpolation using gstat</article-title>
          ,
          <source>The R Journal</source>
          <volume>8</volume>
          (
          <year>2016</year>
          )
          <fpage>204</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Diao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Dynamic spatial-temporal graph convolutional neural networks for traffic forecasting</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>890</fpage>
          -
          <lpage>897</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kitchat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-H.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sakai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Surasak</surname>
          </string-name>
          ,
          <article-title>A deep reinforcement learning system for the allocation of epidemic prevention materials based on DDPG</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>242</volume>
          (
          <year>2024</year>
          )
          <fpage>122763</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Amato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guignard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Robert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kanevski</surname>
          </string-name>
          ,
          <article-title>A novel framework for spatio-temporal prediction of environmental data using deep learning</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>22243</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Mi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Smart deep learning based wind speed prediction model using wavelet packet decomposition, convolutional neural network and convolutional long short term memory network</article-title>
          ,
          <source>Energy Conversion and Management</source>
          <volume>166</volume>
          (
          <year>2018</year>
          )
          <fpage>120</fpage>
          -
          <lpage>131</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <article-title>Accurate medium-range global weather forecasting with 3D neural networks</article-title>
          ,
          <source>Nature</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Moraux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dewitte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cornelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Munteanu</surname>
          </string-name>
          ,
          <article-title>A deep learning multimodal method for precipitation estimation</article-title>
          ,
          <source>Remote Sensing</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>3278</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lausen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.-Y.</given-names>
            <surname>Yeung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-k.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-c.</given-names>
            <surname>Woo</surname>
          </string-name>
          ,
          <article-title>Deep learning for precipitation nowcasting: A benchmark and a new model</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Kipf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Semi-supervised classification with graph convolutional networks</article-title>
          ,
          <source>arXiv preprint arXiv:1609.02907</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Janowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>TransGCN: Coupling transformation assumptions with graph convolutional networks for link prediction</article-title>
          ,
          <source>in: Proceedings of the 10th international conference on knowledge capture</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>131</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Narisetty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Bayesian joint estimation of multiple graphical models</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <article-title>Large-scale learnable graph convolutional networks</article-title>
          ,
          <source>in: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery &amp; data mining</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1416</fpage>
          -
          <lpage>1424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <article-title>Knowledge graph convolutional networks for recommender systems</article-title>
          ,
          <source>in: The world wide web conference</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3307</fpage>
          -
          <lpage>3313</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Monfardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scarselli</surname>
          </string-name>
          ,
          <article-title>A new model for learning in graph domains</article-title>
          ,
          <source>in: IEEE International Joint Conference on Neural Networks</source>
          , volume
          <volume>2</volume>
          , IEEE,
          <year>2005</year>
          , pp.
          <fpage>729</fpage>
          -
          <lpage>734</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Scarselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Tsoi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagenbuchner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Monfardini</surname>
          </string-name>
          ,
          <article-title>The graph neural network model</article-title>
          ,
          <source>IEEE Trans. Neural Networks</source>
          <volume>20</volume>
          (
          <year>2009</year>
          )
          <fpage>61</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tarlow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brockschmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <article-title>Gated graph sequence neural networks</article-title>
          ,
          <source>in: ICLR</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Ying</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Eksombatchai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Graph convolutional neural networks for web-scale recommender systems</article-title>
          ,
          <source>in: SIGKDD</source>
          , ACM,
          <year>2018</year>
          , pp.
          <fpage>974</fpage>
          -
          <lpage>983</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bruna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szlam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <article-title>Spectral networks and locally connected networks on graphs</article-title>
          ,
          <source>in: ICLR</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <article-title>FastGCN: Fast learning with graph convolutional networks via importance sampling</article-title>
          ,
          <source>in: ICLR</source>
          , OpenReview.net,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting</article-title>
          ,
          <source>arXiv preprint arXiv:1709.04875</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <article-title>Attention based spatial-temporal graph convolutional networks for traffic flow forecasting</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>922</fpage>
          -
          <lpage>929</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural Computation</source>
          <volume>9</volume>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>McNally</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Roche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Caton</surname>
          </string-name>
          ,
          <article-title>Predicting the price of bitcoin using machine learning</article-title>
          ,
          <source>in: 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>339</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Yeddula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <article-title>Traffic accident hotspot prediction using temporal convolutional networks: A spatio-temporal approach</article-title>
          ,
          <source>in: Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Borovykh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bohte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Oosterlee</surname>
          </string-name>
          ,
          <article-title>Conditional time series forecasting with convolutional neural networks</article-title>
          ,
          <source>arXiv preprint arXiv:1703.04691</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Adversarial sparse transformer for time series forecasting</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>17105</fpage>
          -
          <lpage>17115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Soun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-c.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>Accurate multivariate stock movement prediction via data-axis transformer with multi-level contexts</article-title>
          ,
          <source>in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>2037</fpage>
          -
          <lpage>2045</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Rice yield prediction and model interpretation based on satellite and climatic indicators using a transformer method</article-title>
          ,
          <source>Remote Sensing</source>
          <volume>14</volume>
          (
          <year>2022</year>
          )
          <fpage>5045</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <article-title>Self-attention ConvLSTM for spatiotemporal prediction</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>34</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>11531</fpage>
          -
          <lpage>11538</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Traffic flow prediction via spatial temporal graph neural network</article-title>
          ,
          <source>in: Proceedings of The Web Conference 2020</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1082</fpage>
          -
          <lpage>1092</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>arXiv preprint arXiv:1706.03762</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>C.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <article-title>A multimodal geo dataset for high-resolution precipitation forecasting</article-title>
          ,
          <source>in: Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>C.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <article-title>Modeling real estate dynamics using temporal encoding</article-title>
          ,
          <source>in: Proceedings of the 29th International Conference on Advances in Geographic Information Systems</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>516</fpage>
          -
          <lpage>525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <article-title>Random decision forests</article-title>
          ,
          <source>in: Proceedings of 3rd international conference on document analysis and recognition</source>
          , volume
          <volume>1</volume>
          , IEEE,
          <year>1995</year>
          , pp.
          <fpage>278</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>B. E.</given-names>
            <surname>Boser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Guyon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>A training algorithm for optimal margin classifiers</article-title>
          ,
          <source>in: Proceedings of the fifth annual workshop on Computational learning theory</source>
          ,
          <year>1992</year>
          , pp.
          <fpage>144</fpage>
          -
          <lpage>152</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>W.-Y.</given-names>
            <surname>Loh</surname>
          </string-name>
          ,
          <article-title>Classification and regression trees</article-title>
          ,
          <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1</source>
          (
          <year>2011</year>
          )
          <fpage>14</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Nelder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Wedderburn</surname>
          </string-name>
          ,
          <article-title>Generalized linear models</article-title>
          ,
          <source>Journal of the Royal Statistical Society: Series A (General) 135</source>
          (
          <year>1972</year>
          )
          <fpage>370</fpage>
          -
          <lpage>384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Almeida</surname>
          </string-name>
          ,
          <article-title>C1.2 Multilayer perceptrons</article-title>
          ,
          <source>Handbook of Neural Computation</source>
          (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>