
ConvLSTM Neural Network based on Hexagonal Inputs for Spatio-Temporal Forecasting of Traffic Velocities

Francisco Bahamondes

Billy Peralta

Orietta Nicolis

Andres Bronfman

Alvaro Soto

Pontificia Universidad Católica de Chile, Departamento de Ciencias de Computación, Santiago, 7820436, Chile

Universidad Andres Bello, Facultad de Ingeniería, Santiago, 7500971, Chile

The spatial-temporal prediction of transit speeds is of great importance today as it allows for the anticipation and mitigation of vehicular congestion, thereby improving traffic efficiency. In machine learning, models such as ConvLSTM or Transformers enable reasonable predictions at the spatio-temporal level. However, these models typically assume a square grid configuration, which can limit the use of more convenient configurations in transportation, such as hexagonal grids. We propose a ConvLSTM neural network adapted to hexagonal grid sequences for transit speed prediction, incorporating a transformation of the hexagonal input to allow the use of standard spatial-temporal architectures based on square grids. This work validates the proposed model through experiments comparing our approach with baseline methods using traffic data from freight transportation in the Metropolitan Region of Santiago, Chile. The results indicate that using hexagonal sequences improves the mean absolute error (MAE) in predicting freight traffic speeds by 2.7% compared to the base spatio-temporal ConvLSTM prediction model. For future work, we propose using larger databases and adapted transformers.

Keywords: Spatio-temporal prediction, Hexagonal inputs, ConvLSTM, Traffic velocities

1. Introduction

The prediction of transit speeds emerges as a critical component in addressing road congestion, offering a way to anticipate and mitigate real-time setbacks [1], essential for refining the distribution industry and the last mile. By projecting transit speeds at different times and locations, transportation companies can fine-tune the routes of their fleets, minimizing delays and cutting operational costs [2, 3]. This knowledge also enables drivers to make better decisions regarding their itineraries, avoiding bottlenecks and ensuring more agile and effective deliveries. This has a tangible impact on customer satisfaction and overall supply chain efficiency.

In recent years, there has been a notable increase in the application of machine learning (ML) techniques to address traffic speed prediction. Thanks to the availability of real-time data, such as GPS information from vehicles, sensor data, and online traffic, ML algorithms can analyze complex and dynamic patterns in traffic, allowing for more accurate speed predictions. In this problem, classical techniques such as Multiple Linear Regression [4], ARIMA [5], Random Forests [4], Support Vector Machines (SVM) [6], and MLP neural networks [7] have been applied. However, more modern models often utilize deep learning techniques.

Deep learning (DL) models have also been employed for diverse tasks like crowd mobility prediction [8, 9, 10] or traffic prediction [11, 12, 13, 14]. In the traffic prediction task, commonly used networks are Long Short-Term Memory (LSTM) neural networks and Gated Recurrent Unit (GRU) networks. These models are well suited to modeling sequential data, such as time series, allowing for efficient capture of both short and long-term dependencies. Although current models are increasingly powerful, they naturally assume a square grid, meaning the information is represented by matrices or tensors. However, in the context of transportation, hexagonal grids offer significant advantages over traditional square inputs, particularly in terms of processing efficiency and accuracy in representing spatial patterns. The hexagonal geometry allows for greater connectivity and uniform coverage of the input space with fewer sampling points, reducing information distortion. This is because each hexagonal point has six equidistant neighbors, unlike the four or eight neighbors in a square grid, which facilitates better data interpolation and a more accurate representation of shapes and patterns. While existing spatio-temporal prediction models can approximate hexagonal inputs, this often results in a loss of performance.

In this work, we propose processing a sequence of hexagons where each cell contains the traffic speed of vehicles using a specialized library. The adaptation to a standard ConvLSTM network, designed to work with square data, involves transforming the hexagonal data into a compatible square structure. To achieve this, the operations of upsampling, padding, and shifting are applied in series to preserve the original neighborhood of the hexagons in the square structure. Then, a custom kernel is applied to convolve the data and extract relevant features, which allows maintaining the hexagonal structure. The use of hexagonal inputs allows greater efficiency in terms of computation, according to [15], since they require fewer parameters to achieve comparable coverage of the input domain. This can translate into faster training and lower resource consumption.

The contributions of our article are the following:

• We present a hexagonal grid-based representation for spatial-temporal data corresponding to vehicle speeds;
• We conduct comparative experiments that include standard baseline machine learning models along with the technique proposed in this work;
• We make the source code of this work available to facilitate the replicability of experiments.

Section 2 outlines relevant prior work. Section 3 details the proposed methodology. Section 4 presents and discusses the results of our experiments. Lastly, Section 5 summarizes our main conclusions.

STRL'24: Third International Workshop on Spatio-Temporal Reasoning and Learning, 5 August 2024, Jeju, South Korea
* Corresponding author.
† These authors contributed equally.
f.bahamondesscholtba@uandresbello.edu (F. Bahamondes); billy.peralta@unab.cl (B. Peralta); orietta.nicolis@unab.cl (O. Nicolis); abronfman@unab.cl (A. Bronfman); asoto@ing.puc.cl (A. Soto)
0000-0002-0877-7063 (F. Bahamondes); 0000-0002-5457-2157 (B. Peralta); 0000-0001-8046-6983 (O. Nicolis); 0000-0002-3122-3237 (A. Bronfman); 0000-0001-9378-397X (A. Soto)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

2. Related work

Spatial-temporal prediction often uses a combination of recurrent and convolutional networks such as ConvLSTM (Convolutional Long Short-Term Memory), which merges the spatial analysis capabilities of CNNs with the ability of LSTMs to capture temporal relationships. Recently, Transformer neural models have been applied [16, 17]. A notable feature of these networks is their ability to model long-range dependencies in sequential data.

In the literature, numerous works are focused on the spatial-temporal prediction of transit speeds, congestion, and transportation using deep neural networks. Lai et al. [18] used an improved ConvLSTM model (eConvLSTM), which incorporates advanced linear features. A Traffic Pattern Attention (TPA) block and a Squeeze-and-Excitation (SE) block are introduced to optimize the accuracy in predicting traffic matrices, thus surpassing existing baseline models. Bogaerts et al. [14] presented Graph CNN-LSTM, a hybrid architecture composed of three components: a CNN, an LSTM neural network, and a FFNN. This structure succeeds in predicting traffic over short temporal horizons (5 minutes) as well as long-term horizons (up to 4 hours) through multi-stage predictions using data provided by the DiDi Chuxing Gaia open data initiative, and demonstrated superiority over cutting-edge ITS algorithms, such as k-NN, SVM, or LSTM. The work of DeepSTCL [19] implements a ConvLSTM network within a deep learning framework for travel demand prediction, standing out for its ability to capture spatial-temporal dynamics and surpass traditional methods like AR and ARIMA. Its focus on analyzing proximity, period, and trend patterns results in more accurate predictions and better interpretation of complex travel demand data, proving its superiority with real data from DIDI in Chengdu. Zhang et al. [20] introduced an LSTM-XGBoost model for short-term traffic flow prediction, addressing challenges such as periodicity and overfitting by combining LSTM with dropout layers and XGBoost to enhance accuracy and generalization. Validated with traffic data from Shenzhen, the model shows significant improvements in accuracy and scalability, highlighting its contribution to optimizing traffic prediction and efficient control. Duan et al. [21] introduced an enhanced hybrid CNN-LSTM model through a greedy algorithm for urban traffic flow prediction using GPS data from taxis. This work combines spatial and temporal feature extraction to improve prediction accuracy and efficiency. Validated with data from Xi'an, the model achieves shorter training times and greater accuracy compared to previous methods, offering an effective solution to the complexity of urban traffic data. Xu et al. [22] proposed a spatio-temporal deep learning framework, integrating ConvLSTM and Graph Convolutional Network (GCN), for precise traffic speed prediction. By extracting temporal features with ConvLSTM and spatial features with GCN, the framework significantly improves predictive performance against baseline methods, demonstrating its efficacy in the advanced analysis of large traffic data collected through the Internet of Things (IoT). Hu et al. [23] present the AB-ConvLSTM model, designed to accurately predict large-scale traffic speed in urban road networks. This model combines the ConvLSTM network, an attention mechanism, and Bi-LSTM networks to extract spatial-temporal and periodic features. The results show that AB-ConvLSTM consistently outperforms other models in predicting urban traffic speed, highlighting its ability to capture historical significance and effectively extract daily and weekly periodic functions.

Regarding hexagonal models, they have typically been applied to spatial prediction tasks. Hexagdly [24] facilitates the use of convolutional neural networks (CNNs) in this field without the need for data preprocessing. The main advantage of this approach lies in its adaptation to hexagonal grids through specific convolution and pooling operations, overcoming the limitations of traditional square convolution kernels.

Previous works focus on the combination of different techniques and architectures to improve accuracy and generalization in traffic prediction considering square inputs, while works considering hexagonal inputs propose prediction at a spatial level. In this work, the geometric and topological advantages of hexagonal inputs are exploited [25]. These allow for better coverage and connectivity in capturing the spatial characteristics of traffic, resulting in a more efficient and accurate representation of temporal and spatial dynamics.

3. Proposed method

The general approach to processing grid sequences with spatio-temporal neural networks assumes that the data is represented by square grids. However, it is not clear how to apply these models to data represented by hexagonal grids. In particular, the neighborhood of a cell is different: a hexagonal cell has six neighbors, whereas a square cell has eight. Moreover, the use of hexagonal grids in convolutional networks enhances prediction accuracy [26] due to the reduced anisotropy of hexagonal filters [27]. Despite this, the reviewed spatio-temporal neural models do not consider this type of configuration.

In this work, we propose a ConvLSTM-based method for spatial-temporal prediction utilizing hexagonal grids, applied specifically to cargo vehicle speed data. This method comprises three key steps outlined as follows: (i) Initially, we transform the transit speed data onto hexagonal grids represented in Cartesian coordinates. (ii) Subsequently, we sequence the data in hexagonal patterns while preserving the hexagonal constraint by considering equivalent square grids. (iii) Lastly, we employ a ConvLSTM network with a hexagonal constraint (HexConvLSTM) to train on the preprocessed speed data.

Now we will detail these steps.

3.1. Cartesian Representation

In this work, we first group the traffic speed data into regular hexagons using a methodology that generates a hexagonal grid. The implementation of this method results in the generation of N regular hexagons, where N is determined by a spatial resolution parameter. This generation produces a hexagonal grid where each hexagon contains the measurements that the area encompasses. In Fig. 1 we show an example of a hexagonal mesh considering 21 hexagons within the experimental region. Fortunately, this hexagonal organization is typically facilitated by specialized libraries; in our case, we used the H3 library from Uber [28]. This library generates a georeferenced hexagonal grid from boundary coordinates, where the number of hexagons depends on the H3 resolution parameter.

Each hexagon is identified by a unique index that encodes its position. When mapping these indices to a Cartesian coordinate system $(i, j)$ for visualization or computational purposes, hexagons sharing a common coordinate will form a line that traverses the grid in a diagonal direction. This is due to the nature of hexagonal packing, where each hexagon touches six others in an arrangement that naturally forms diagonals when represented in a 2D coordinate system (see Fig. 2).

[Figure panels: (a) Hexagonal Grid; (b) Cartesian Grid]

While in a square grid a cell typically has eight direct neighbors (up, down, left, right, and the four diagonals), in a hexagonal grid each cell is adjacent to six neighbors. Therefore, the hexagonal neighborhood structure significantly alters the spatial distances between cells.

3.2. Square Preprocessing

Given that the data from hexagonal cells are represented as ordered pairs $(i, j)$, the hexagonal grid can be represented as a square grid, that is, in the form of matrices.

However, in a square grid, a cell has 8 neighbors, while hexagonal cells have 6 neighbors. Therefore, it is necessary to prepare the data so that a convolution operation, provided by ConvLSTM, respects the hexagonal constraint.

This pre-processing is performed through a sequence of matrix operations involving upsampling, padding, and shifting. This approach results in a representation where a convolution can respect the hexagonal arrangement through a kernel constraint of a ConvLSTM.

3.2.1. UpSampling

The first step in data preprocessing is UpSampling. The goal of this operation is to increase the vertical resolution of the matrix by duplicating each row, while keeping the horizontal content unchanged. Assuming that the original matrix is $X$ of size $m \times n$ and that the result of upsampling is $X'$, the relationship between the elements of these matrices can be expressed as:

$$x'_{i,j} = x_{\lceil i/2 \rceil,\, j}, \quad \forall i \in [1, 2m],\ \forall j \in [1, n].$$

Visually, if we consider $X$ as the original matrix, then, after applying the UpSampling process, $X'$ results as follows:

$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,n} \\ x_{2,1} & \cdots & x_{2,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \cdots & x_{m,n} \end{bmatrix}, \qquad X' = \begin{bmatrix} x_{1,1} & \cdots & x_{1,n} \\ x_{1,1} & \cdots & x_{1,n} \\ x_{2,1} & \cdots & x_{2,n} \\ x_{2,1} & \cdots & x_{2,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \cdots & x_{m,n} \\ x_{m,1} & \cdots & x_{m,n} \end{bmatrix}.$$

3.2.2. Padding

The second step, Padding, adds additional rows to the matrix to prepare the data for the Shifting process, which requires a specific number of rows to operate correctly. Specifically, this step adds $n$ rows of zeros at the bottom of $X'$, resulting in a new matrix $X''$ with size $(2m + n) \times n$. The transformation from $X'$ to $X''$ can be described as follows:

$$x''_{i,j} = \begin{cases} x'_{i,j}, & \text{if } 1 \le i \le 2m \\ 0, & \text{if } 2m < i \le 2m + n \end{cases} \quad \forall j \in [1, n].$$

This equation specifies how rows of zeros are added at the bottom of $X'$.

Visually, while $X'$ is a $2m \times n$ matrix resulting from the UpSampling process, the result of the Padding, $X''$, has its last rows composed of zeros:

$$X'' = \begin{bmatrix} x'_{1,1} & x'_{1,2} & \cdots & x'_{1,n} \\ \vdots & \vdots & \ddots & \vdots \\ x'_{2m,1} & x'_{2m,2} & \cdots & x'_{2m,n} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}.$$

In this matrix $X''$, the elements $x'_{i,j}$ represent the values of $X'$, and the last $n$ rows are zeros, creating a final matrix of size $(2m + n) \times n$. This adjustment in the padding process ensures that the extended matrix has the appropriate size for the Shifting operation.

3.2.3. Shifting

The final step in the preprocessing is the Shifting, which shifts each column of the matrix upwards by a number of positions equal to the column index. This procedure introduces a shift that depends on the column position, achieving the necessary configuration to apply the hexagonal constraint kernel. For the matrix $X''$, the resulting matrix $X'''$ is obtained as follows:

$$x'''_{i,j} = x''_{(i + j) \bmod 2m,\, j}, \quad \forall i \in [1, 2m],\ \forall j \in [1, n].$$

This operation shifts the elements of each column upwards, and those exceeding the upper limit of the matrix reappear at the bottom. The outcome of this final step enables the use of a kernel constraint in any ConvLSTM neural network implementation, ensuring strict adherence to the original hexagonal grid neighborhood.
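The three preprocessing operations can be illustrated with a minimal NumPy sketch. The function names are hypothetical and are not taken from the authors' repository. The sketch assumes that column $j$ (1-based) is shifted upwards by $j$ positions and that the circular wrap-around acts over the full padded height, which is one possible reading of the modulo in the Shifting equation.

```python
import numpy as np


def upsample_rows(x: np.ndarray) -> np.ndarray:
    """UpSampling: duplicate every row, (m, n) -> (2m, n)."""
    return np.repeat(x, 2, axis=0)


def pad_bottom(x_up: np.ndarray) -> np.ndarray:
    """Padding: append n rows of zeros, (2m, n) -> (2m + n, n)."""
    n = x_up.shape[1]
    return np.pad(x_up, ((0, n), (0, 0)), mode="constant")


def shift_columns_up(x_pad: np.ndarray) -> np.ndarray:
    """Shifting: circularly move column j (1-based) up by j positions.

    Assumption: the wrap-around uses the full padded column height.
    """
    out = np.empty_like(x_pad)
    for col in range(x_pad.shape[1]):
        # A negative shift in np.roll moves elements towards index 0 (upwards).
        out[:, col] = np.roll(x_pad[:, col], -(col + 1))
    return out


def hex_to_square(x: np.ndarray) -> np.ndarray:
    """Apply upsampling, padding, and shifting in series."""
    return shift_columns_up(pad_bottom(upsample_rows(x)))


# Toy example with m = 2 rows and n = 3 columns.
x = np.arange(1, 7, dtype=float).reshape(2, 3)
y = hex_to_square(x)
print(y.shape)  # (7, 3)
```

Every original value appears exactly twice in the output, so the transformation only rearranges and duplicates data; the padded zeros provide the room that the column-dependent shift needs.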

3.3. HexConvLSTM Architecture

Assuming that the data were preprocessed into a square grid according to Section 3.2, we propose using a ConvLSTM neural network with a kernel constraint. Next, we describe the kernel constraint mask that enables adherence to the hexagonal arrangement in the grid, followed by the neural network used.

3.3.1. Kernel constraint

The kernel constraint is defined by a binary mask given by:

$$\begin{bmatrix} 0 & \cdots \\ 1 & \cdots \\ 0 & \cdots \\ 1 & \cdots \\ 0 & \cdots \end{bmatrix}$$

In this matrix, the positions where there is a 1 indicate the cells that will be active, allowing convolution at those specific positions; otherwise, the cells are not processed. The positions are represented in the left matrix, where P is the target cell and Ne(P) is a neighbor of target P. In this way, the 6-neighborhood of a hexagonal cell is recovered in the square grid when using standard convolution operations.

3.3.2. The HexConvLSTM network

By introducing the kernel constraint described in Section 3.3.1 into a standard ConvLSTM-2D layer, we can recover the hexagonal neighborhood in a matrix tensor. We refer to this network as HexConvLSTM; a diagram of it can be seen in Fig. 3. The variables and parameters of the ConvLSTM network are well known and are detailed in [29]. The difference from a standard ConvLSTM lies in the application of the kernel constraint, which allows the network to consider only the neighbors provided by the original hexagonal configuration.
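The effect of a kernel constraint can be sketched in NumPy as an element-wise projection of the convolution weights onto a binary mask. The 5x3 mask below is an assumption: it is one pattern consistent with a six-neighbor layout after the row duplication and column shift (neighbors at row/column offsets (±2, 0) and (±1, ±1) around the target cell), and it is not copied from the paper. The names `HEX_MASK` and `apply_hex_mask` are hypothetical.

```python
import numpy as np

# Assumed 5x3 binary mask: the center (row 2, col 1) is the target cell P,
# and the six remaining ones mark its hexagonal neighbors Ne(P).
HEX_MASK = np.array(
    [
        [0, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
    ],
    dtype=float,
)


def apply_hex_mask(kernel: np.ndarray) -> np.ndarray:
    """Zero out the kernel weights outside the mask.

    kernel has shape (5, 3, in_channels, out_channels); the mask is
    broadcast over the two channel axes.
    """
    return kernel * HEX_MASK[:, :, None, None]


rng = np.random.default_rng(0)
kernel = rng.normal(size=(5, 3, 1, 4))
masked = apply_hex_mask(kernel)

# Number of active positions per filter: the target cell plus six neighbors.
active = int(HEX_MASK.sum())
```

Frameworks such as Keras expose a `kernel_constraint` argument on `ConvLSTM2D`; re-applying a projection like the one above after each weight update would keep the masked weights at zero during training. That wiring is not shown here.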

In a nutshell, our proposal entails representing a hexagonal grid in a Cartesian representation (see Section 3.1), preprocessing to preserve the hexagonal neighborhood (see Section 3.2), and ultimately applying a ConvLSTM neural network (see Section 3.3). Subsequent experiments aim to evaluate the efficacy of our approach on a real dataset.

4. Experiments

4.1. Data

The database used in this work corresponds to data extracted from the Transportation and Logistics Center of Andrés Bello University, a center dedicated to researching routing problems, last-mile delivery, and logistics optimization, among others. The raw data includes 22 million GPS measurements of last-mile cargo vehicle speeds in Santiago, Chile, Metropolitan Region. This data contains the following information:

[Table: fields of the raw GPS data]

For this work, we have limited our data to a specific subregion (see Fig. 4), considering a particular area with the highest data density in the city of Santiago de Chile, the capital of Chile. A high data density is considered to minimize missing data, since cargo vehicles tend to prefer certain streets. The boundaries of the chosen area are between latitudes -33.4331 and -33.4524, and longitudes -70.6253 and -70.6655, forming a rectangle that includes the Santiago Centro commune and parts of its neighboring communes.

Figure 4: The upper image corresponds to the city of Santiago, while the lower image corresponds to the study area.

The measurements for this subregion span from January 4th to July 25th, 2020. All measurements recording a speed of zero were removed, since they indicate that the vehicle was stopped or out of operation. Additionally, records outside the time range of 8:00 a.m. to 7:00 p.m. were excluded due to their low frequency, as this interval has the highest concentration of measurements. Similarly, measurements from Sundays were discarded as they also showed low frequency. It should be noted that there were no measurements in April during the measurement period.

Regarding temporality, the measurements are treated as hourly time series, which can be divided into 157 days, with each day having 12 hours of measurement (from 8:00 a.m. to 7:00 p.m.), resulting in a total of 1,884 time steps. Each of these intervals is treated as a grid with values imputed according to Section 4.2.

In terms of experimental design, the HoldOut method for time series [30] was followed, where data were sequentially divided into training (70%), validation (15%), and testing (15%) sets, with MinMax scaling applied to each set. All methods were evaluated considering mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination (R2). Furthermore, to ensure replicability, the demo source code for this work is available at: https://github.com/Francisco0178/HexConvLSTM. At this point, we state that our method is generic, and in future work we will test it on public datasets [31].

4.2. Data Imputation

Our dataset consists of a time series with 1,884 temporal steps, each representing one hour between 8:00 a.m. and 7:00 p.m. over 157 days. Using a fixed grid of 110 hexagonal cells, each time step contains average traffic speed information for each hexagonal cell. These 110 cells are derived from the hexagonal preprocessing described in Section 3.1 using the H3 library at a resolution of 9.

However, it is worth noting that the data originates from geolocated sensor data of cargo vehicles. Upon analyzing this data, it becomes apparent that these vehicles tend to favor certain routes and schedules, resulting in some regions being underrepresented in the data. For instance, at 8 am, the few vehicles that do transit may predominantly utilize main roads, leaving certain areas unmeasured. Consequently, in the utilized representation, there are hexagonal cells with missing measurement information, with the percentage of missing data depending on the H3 resolution parameter.

In our implementation using the H3 library, we opted for an H3 resolution of 9, which generates 110 hexagons. When represented in a square format, it yields 15x15 matrices (225 cells) with 54% missing data. Although this is a high percentage of missing values, using the next H3 resolution, 10, results in grids of 5x6, which are too small for the use of convolutional models; however, using an H3 resolution of 8 results in 500 hexagons leading to 90% missing data, which complicates the training of neural models.

In this study, we experimented with various imputation methods and found that the PPCA method performs better than Gaussian-based or MICE imputations. It is worth noting that in [32], PPCA also emerges as a competitive imputation model for traffic prediction tasks.

4.3. Experimental Results

Comparative experiments were conducted among an MLP network, GRU, LSTM, ConvLSTM, and our HexConvLSTM network. The MLP network comprises two layers with 256 and 128 neurons, while the LSTM and GRU networks consider 128 and 50 recurrent units, respectively. For the ConvLSTM and HexConvLSTM networks, 128 ConvLSTM units are employed. In all neural networks, Mean Squared Error (MSE) was utilized as the loss function.
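The evaluation protocol of Section 4.1 (sequential hold-out split, MinMax scaling, and the four reported metrics) can be sketched as follows. The helper names are hypothetical, and the sketch assumes that each split is scaled with its own minimum and maximum, which is one reading of "applied to each set"; the authors' repository may do this differently.

```python
import numpy as np


def sequential_split(series: np.ndarray, train: float = 0.70, val: float = 0.15):
    """Split along time without shuffling: train, then validation, then test."""
    n = len(series)
    i_train = int(n * train)
    i_val = int(n * (train + val))
    return series[:i_train], series[i_train:i_val], series[i_val:]


def minmax_scale(x: np.ndarray) -> np.ndarray:
    """Rescale values to [0, 1] using the minimum and maximum of x itself."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo)


def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute MAE, MSE, RMSE, and the coefficient of determination R2."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Placeholder sequence with the dimensions reported in the paper:
# 1,884 hourly steps of 15x15 grids (random values, not real data).
rng = np.random.default_rng(0)
frames = rng.random((1884, 15, 15))
train_set, val_set, test_set = sequential_split(frames)
train_scaled = minmax_scale(train_set)

# Small worked example for the metrics.
metrics = regression_metrics(np.array([1.0, 2.0, 3.0, 4.0]),
                             np.array([1.0, 2.0, 3.0, 6.0]))
```

Because the split is sequential, the test set always lies after the training data in time, which avoids leaking future observations into training.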

In the first experiment, we trained the networks using data imputed by the three methods described in Section 4.2. The second experiment involved training the models with data imputed using the method that yielded the best results, but with a reshaping of the time series. This reshaping involved grouping the averages of two consecutive hourly periods, which resulted in halving the total dimension of our time series.
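The temporal reshaping used in the second experiment can be sketched as a pairwise average along the time axis. The function name `aggregate_pairs` is hypothetical. The sketch assumes an even number of steps per day (12 in this dataset), so that pairs never straddle two days.

```python
import numpy as np


def aggregate_pairs(frames: np.ndarray) -> np.ndarray:
    """Average consecutive pairs of time steps: (T, H, W) -> (T // 2, H, W)."""
    t = frames.shape[0] - frames.shape[0] % 2  # drop a trailing odd step
    paired = frames[:t].reshape(t // 2, 2, *frames.shape[1:])
    return paired.mean(axis=1)


# Toy sequence with 4 time steps on a 2x1 grid.
toy = np.arange(8, dtype=float).reshape(4, 2, 1)
toy_2h = aggregate_pairs(toy)

# With the dimensions reported in the paper, 1,884 hourly grids become 942.
hourly = np.zeros((1884, 15, 15))
two_hourly = aggregate_pairs(hourly)
```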

4.3.1. One-Hour Granularity Experiment

Table 2 shows that the proposed HexConvLSTM network achieved the best values across all metrics, surpassing ConvLSTM with relative improvements of 1.3%, 1.3%, 0.7%, and 0.9% in MAE, MSE, RMSE, and R2, respectively. This indicates that the hexagonal constraint better captures the dynamics between the cells, leading to improved performance of a ConvLSTM network. When comparing all models, HexConvLSTM yielded the best results, outperforming its closest competitor, the MLP. We believe the MLP performs well due to the low resolution of the 15x15 grid. The competitiveness of MLP on small images, such as on the MNIST dataset, is shown in [33]. However, in the context of transportation in large cities we need to increase the size of the grids to improve the spatial resolution of prediction.

4.3.2. Two-Hour Granularity Experiment

Another experiment involved aggregating our data into the average of 2 consecutive time steps, resulting in sequences that still contain 12 steps, but where each sequence now covers 2 consecutive days (6 steps per day) instead of a single day. This grouping approach effectively reduces the temporal resolution of our data but enriches each time step with a more integrated view of temporal features.

Table 3 presents the results of each tested method. The HexConvLSTM network once again achieved the best values across all metrics, surpassing ConvLSTM with relative improvements of 2.7%, 1.3%, 0.7%, and 2.8% in MAE, MSE, RMSE, and R2, respectively. This reaffirms that the hexagonal constraint effectively captures the dynamics between the cells. Moreover, the results are globally better than those from the one-hour granularity due to less variability, since two-hour averages are considered, which appear to be more predictable for all models in general. In this experiment, HexConvLSTM further increases its advantage over the other models.

5. Conclusions

This work demonstrates that the proposed HexConvLSTM model outperforms ConvLSTM across all metrics, indicating superior capture of transit dynamics. It consistently shows an advantage in all metrics, and this advantage is expected to increase as larger grids and longer temporal intervals are used in the sequence of input grids.

The temporal grouping experiment shed light on another critical aspect: efficiency in data representation can be as crucial as the quality of the data itself. In this context, HexConvLSTM not only handled the imputed data well but also benefited significantly from the grouping, enhancing its predictive capacity. This result underscores how HexConvLSTM can extract value from adjustments in data preparation, a considerable advantage for any practical application.

As future work, we plan to use databases with more records, include larger study regions, and incorporate self-attention layers to improve the model's performance.

Acknowledgments

B. Peralta and A. Soto appreciate the support of the National Center for Artificial Intelligence CENIA FB210017, Basal ANID.

[22] Dai, Huang, Xu, Qi, M. R. Khosravi, Spatiotemporal deep learning framework for traffic speed forecasting in iot, IEEE Internet of Things Magazine 3 (2020) 66-69.

[23] Hu, Liu, Hao, Lin, Attention-based convlstm and bi-lstm networks for large-scale traffic speed prediction, The Journal of Supercomputing 78 (2022) 12686-12709.

[24] Steppa, T. L. Holch, Hexagdly-processing hexagonally sampled data with cnns in pytorch, SoftwareX 9 (2019) 193-198.

[25] Fadaei, Rashno, A framework for hexagonal image processing using hexagonal pixel-perfect approximations in subpixel resolution, IEEE Transactions on image processing 30 (2021) 4555-4570.

[26] Zhao, Ke, Korn, Qi, Zhang, Hexcnn: A framework for native hexagonal convolutional neural networks, in: 2020 IEEE International Conference on Data Mining (ICDM), IEEE, 2020, pp. 1424-1429.

[27] Hoogeboom, J. W. Peters, T. S. Cohen, Welling, Hexaconv, arXiv preprint arXiv:1803.02108 (2018).

[28] I. Brodsky, H3: Uber's hexagonal hierarchical spatial index, https://eng.uber.com/h3/, 2018. Available from Uber Engineering website. Accessed: 22 June 2019.

[29] Shi, Chen, Wang, D.-Y. Yeung, W.-K. Wong, W.-c. Woo, Convolutional lstm network: A machine learning approach for precipitation nowcasting, Advances in neural information processing systems 28 (2015).

[30] Cerqueira, Torgo, I. Mozetič, Evaluating time series forecasting models: An empirical study on performance estimation methods, Machine Learning 109 (2020) 1997-2028.

[31] Jiang, Yin, Wang, Deng, Liu, Cai, Deng, Song, Shibasaki, Dl-traf: Survey and benchmark of deep learning models for urban traffic prediction, in: Proceedings of the 30th ACM international conference on information & knowledge management, 2021, pp. 4515-4525.

[32] Sun, Zhu, Hao, Sun, Xie, Traffic missing data imputation: a selective overview of temporal theories and algorithms, Mathematics 10 (2022) 2544.

[33] Baldominos, Saez, Isasi, A survey of handwritten character recognition with mnist and emnist, Applied Sciences 9 (2019) 3169.