=Paper= {{Paper |id=Vol-3207/paper10 |storemode=property |title=Radar-Based Volumetric Precipitation Nowcasting: A 3D Convolutional Neural Network with U-Net Architecture |pdfUrl=https://ceur-ws.org/Vol-3207/paper10.pdf |volume=Vol-3207 |authors=Peter Pavlík,Viera Rozinajová,Anna Bou Ezzeddine |dblpUrl=https://dblp.org/rec/conf/cdceo/PavlikRE22 }} ==Radar-Based Volumetric Precipitation Nowcasting: A 3D Convolutional Neural Network with U-Net Architecture== https://ceur-ws.org/Vol-3207/paper10.pdf
Radar-Based Volumetric Precipitation Nowcasting: A 3D
Convolutional Neural Network with U-Net Architecture
Peter Pavlík (1,2), Viera Rozinajová (2,3) and Anna Bou Ezzeddine (2)
(1) Faculty of Information Technology, Brno University of Technology, Božetěchova 1/2, Brno-Královo Pole, 612 00, Czechia
(2) Kempelen Institute of Intelligent Technologies, Mlynské Nivy II. 18890/5, Bratislava, 821 09, Slovakia
(3) Slovak Centre for Research of Artificial Intelligence - slovak.AI, Slovakia


Abstract
In recent years, as in many other domains, deep learning models have found their place in the domain of precipitation nowcasting. Many of these models are based on the U-Net architecture, which was originally developed for biomedical segmentation, but is also useful for the generation of short-term forecasts and therefore applicable in the weather nowcasting domain. The existing U-Net-based models use sequential radar data mapped into a 2-dimensional Cartesian grid as input and output. We propose to incorporate a third, vertical, dimension to better predict precipitation phenomena such as convective rainfall and present our results here. We compare the nowcasting performance of two comparable U-Net models trained on two-dimensional and three-dimensional radar observation data. We show that using volumetric data results in a small but significant reduction in prediction error.

Keywords
precipitation nowcasting, radar imaging, U-Net



CDCEO 2022: 2nd Workshop on Complex Data Challenges in Earth Observation, July 25, 2022, Vienna, Austria
Email: peter.pavlik@kinit.sk (P. Pavlík); viera.rozinajova@kinit.sk (V. Rozinajová); anna.bou.ezzeddine@kinit.sk (A. B. Ezzeddine)
ORCID: 0000-0002-7468-5503 (P. Pavlík); 0000-0003-1302-6261 (V. Rozinajová); 0000-0002-3341-6059 (A. B. Ezzeddine)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Accurate precipitation nowcasting is important for planning various human activities and tasks such as agriculture, construction or winter road maintenance. Nowcasting is defined by the World Meteorological Organization as forecasting with local detail, by any method, over a period from the present to six hours ahead, including a detailed description of the present weather [1].

In practice, simpler, and therefore faster, models outperform complex Numerical Weather Prediction (NWP) models at the task of precipitation nowcasting because NWP models cannot consider the latest observations due to their long inference time. The highly sophisticated NWP models usually need hours to produce their forecasts and so they are not able to take into consideration the latest data observations. Even a simple model that can quickly output a prediction will outperform the NWP models at the task of precipitation nowcasting simply by the fact that it can consider the present data. Nowcasting models can work in conjunction with NWP models and use their long-term forecasts as additional inputs to further refine their nowcasts [1].

Precipitation nowcasting is usually performed using temporal extrapolation of past data from weather radar systems because it requires highly accurate and constantly updated data about precipitation fields, i.e. the location of storms, wind, fog, snow etc. Weather radar systems are essential for nowcasting because they directly observe precipitation particles with an update rate of a few minutes [1]. See Figure 1 for an example of a radar precipitation map.

In the last few years, deep learning precipitation nowcasting approaches, such as convolutional neural networks (CNN), have started to gain attention. From the initial ConvLSTM model [2], through encoder-decoder U-Net architectures [3, 4], to the recently introduced GAN-based approaches [5, 6], the CNN models have proved to consistently outperform the operational state-of-the-art methods in the domain [6].

Most precipitation nowcasting models only use the radar data mapped to a 2D Cartesian grid, aggregating the vertical dimension, even though the raw output of weather radar systems consists of multiple measurements at different elevation angles and polar coordinates that capture the precipitation phenomena in 3-dimensional space around the radar.

We propose using volumetric data from multiple altitudes to give the model as much data about the observation as possible. Providing information about the vertical motion of precipitation particles, as well as their vertical extension, could potentially be valuable for the model, as they are an important factor in predicting the behavior of convective storms [7].

We compare two models: a reference U-Net architecture based on existing research [3, 4] and an alternative with 2D convolutional layers replaced by 3D convolution. We evaluate their performance in the task of predicting
a single constant-altitude radar reflectivity observation 30 minutes into the future.

Our experiments show that providing volumetric data from multiple altitude levels results in a small but statistically significant reduction of prediction error.

[Figure 1 image: radar reflectivity map; color scale in dBZ.]
Figure 1: A single radar echo observation. The shown reflectivity values represent reflectivity captured at 2 km above radar (CAPPI). The reflectivity map is overlaid over a satellite image of the appropriate area centered on the Malý Javorník radar station generated using Google Earth Engine [8]. Landsat-8 image courtesy of the U.S. Geological Survey.

2. Related Work

Many automated nowcasting systems that employ various inputs and computation approaches are in use today [9, 10, 11, 12, 13]. These systems are generally based on extrapolating past observed rainfall data forwards in time. They typically estimate the future advection based on motion observed in the most recent radar images using cross-correlation or optical flow techniques [1].

Some nowcasting systems use the cell tracking approach. They first identify storms in the radar scan and then locate the corresponding object in the consecutive scans to track its motion. Cell tracking is useful for following severe storms and for generating early warnings [1].

The shortcoming of these advection nowcasting methods is the assumption that the observed precipitation field will not change, only move elsewhere. Therefore, they lack the capability to predict the beginning of new precipitation phenomena such as convective initiation (start of a storm triggered by rising moist warm air) or the decaying of the storm at the end of its lifecycle [1, 14].

In the past years, data-driven approaches using deep learning to construct precipitation nowcasting models to mitigate these limitations have started to gain attention [2, 3, 6].

The first deep learning approach applied to the task of precipitation nowcasting was a ConvLSTM model presented in [2] that outperformed the operational optical-flow-based ROVER nowcasting system. Experiments with other CNN architectures followed, such as a ConvGRU model from [15] or a U-Net-based architecture introduced in [16]. The U-Net architectures, originally developed for segmentation of medical images [17], proved to be quite popular, with models such as RainNet [3] and SmaAt-U-Net [4] further exploring this approach.

The previously mentioned neural network regression models trying to nowcast the future state of precipitation fields were affected by blurring. When using traditional gridpoint-based verification statistics such as Mean Squared Error (MSE) as the training loss function, we face the so-called "double penalty problem". A forecast of a precipitation feature that is correct in terms of intensity, size, and timing, but incorrect concerning location, results in a very large mean squared error [18]. This causes the model to produce blurry outputs to mitigate the penalisation caused by spatially incorrect precipitation features.

The blurry predictions pose one of the biggest challenges for anyone trying to develop a nowcasting model based on machine learning, as such predictions have difficulties predicting extreme events due to the smoothing. Recently, this problem started to be addressed by training models using the Generative Adversarial Network (GAN) approach, the most prominent being DGMR [6]. Its authors introduced a GAN framework [19] to solve the problem of blurry predictions present in other deep learning precipitation nowcasting models such as RainNet. The model is trained using a combination of two discriminators inspired by existing research in video generation and a regularization term that together comprise the loss function. The first discriminator, spatial, discourages blurry predictions while the second one, temporal, discourages jumpy predictions. The regularization term penalizes deviations between the observed radar sequences and the model prediction. The DGMR model can currently be considered the state of the art in the precipitation nowcasting domain.

2.1. Motivation for Volumetric Nowcasting

The application of deep learning models for precipitation nowcasting is the focus of many research works. However, the vast majority of the models use 2-dimensional aggregate radar products and thus throw away any information which can be gained from processing the vertical structure of precipitation objects captured by the radar.

When reviewing the existing works in the precipitation nowcasting domain, we identified a need to explore the effect of working with 3-dimensional volumetric
radar data. By processing the data into a 2D aggregated map, we lose all information about the vertical structure of the precipitation particles detected by the radar. The model trained in this way cannot consider the vertical movement of particles caused by updraft or downdraft and predict the future precipitation accordingly.

Compared to 2-dimensional precipitation nowcasting, volumetric models are much less prevalent. One such model was presented in [20], where a ConvLSTM model was used to predict future radar reflectivity. The model input shape is 18×18×20 (18×18 km with 1 km resolution, 10 km in the vertical at 500 m resolution) provided at multiple time steps. Each time step is first processed by a 3D-CNN and then passed on to a ConvLSTM sequential network. The output is a classification for the central region of 6 × 6 km predicting whether the reflectivity in the next 30 and 60 minutes will exceed a set threshold. The final result is a binary map with resolution of 6 × 6 km. The problem with this approach is that the model cannot consider any fast moving precipitation particles, since it cannot see more than 6 km past its target region. Also, the target region size of 6 × 6 km can hardly be considered a high spatial resolution, which is one of the defining traits of nowcasting.

One other work worth mentioning is a 3D-CNN+GAN hybrid model from [5]. This model is quite sophisticated. It uses the GAN-based approach to predict plausible data and a weighted MSE loss function to give more importance to high reflectivity values, resulting in better ability to predict extreme precipitation events and reduce output blurring. However, the third data dimension is not actually the altitude above radar we want to consider, but time, i.e. the past observations are not stacked as separate channels, but form a 3D volume. Nevertheless, the model drives the development of 3D-CNN models for precipitation nowcasting.

3. Radar Reflectivity Dataset

To explore the effect of volumetric precipitation nowcasting, we collaborated with the Slovak Meteorological Institute that provided us with a dataset of roughly 3.5 years of reflectivity data from Malý Javorník weather radar station. The data is captured in 5-minute intervals. The dataset consists of 355 761 separate observations in the ODIM HDF5 format.

The radar captures the precipitation particles in the air by measuring returned radar wave power (echo) after hitting precipitation particles. This value is called reflectivity, measured in logarithmic dimensionless units called decibels (dBZ). The data consists of reflectivity values at the so-called reflectivity gates in multiple elevation angles distributed around the radar station and encoded in polar coordinates. See Figure 2 for a vertical slice of a single radar observation.

[Figure 2 image: panel title "45.0 Deg. 2017-08-06T13:00:06Z, Equivalent reflectivity factor"; x-axis: distance from radar (km); y-axis: distance above radar (km); color scale: equivalent reflectivity factor (dBZ).]
Figure 2: Vertical slice of a single radar reflectivity observation at a set azimuth. The separate "rays" at different elevation angles are identifiable.

Since the convolutional neural network models cannot process the data in polar coordinates, we need to convert them into Cartesian maps. We processed the data using the Py-ART Python library [21]. The radar echo observations are typically aggregated into precipitation maps in two forms. The first one is Constant Altitude Plan Position Indicator (CAPPI), which displays reflectivity gate values at a certain altitude slice above radar. The other is CMAX, which aggregates the vertical dimension and displays the maximum value in the vertical column for each data point. If a 3D volume is created from multiple CAPPI maps at different altitude levels, the product is called MCAPPI.

The reflectivity maps can be converted to rainfall rate maps using the Marshall-Palmer Z-R relationship [22]:

    Z = 200 R^{1.6}    (1)

where Z is the reflectivity factor and R is the rainfall rate in mm/h.

3.1. Training data selection

The dataset requires filtering before training since the majority of the observations are of clear skies with nothing to learn from. Most of the observations from the dataset therefore have no value for training the model and could even negatively affect the training by biasing the outputs toward clear sky prediction, while we are mostly interested in non-trivial cases with high precipitation. We filtered the images as follows:

    1. Create a CAPPI radar reflectivity map at 2 km altitude above radar at 1 × 1 km resolution and select a center slice of size 336 × 336 km.
    2. Convert reflectivity to rainfall rate according to Marshall-Palmer Z-R relationship (1).
    3. Compute the ratio of rainy to clear pixels (threshold 0.05 mm/5 min or 0.6 mm/h, which corresponds to slight rain).
    4. If the rainfall map contains at least 20% of rainy pixels and 11 previous observations are available, add it to the target observation set.

Each selected target observation was included in the training dataset, along with a set number of previous observations to serve as inputs and non-target intermediary outputs. For our models, we decided to use 6 observations as input and 6 as output, effectively predicting the precipitation half an hour in advance based on the last half hour of data. This means that for each target observation, we also needed to include 11 leading observations in the dataset. This process returned 9 018 suitable target images which together with the necessary leading images represent 3.18% of the original dataset.

It should be noted that the data converted to rainfall described above was not used for training, only for filtering the target observations based on the ratio of rainy pixels. The actual training data used reflectivity directly for both 2D images and 3D volumes. The 2D dataset was a collection of CAPPI radar reflectivity maps at 2 km altitude above radar. The 3D dataset was a collection of CAPPI radar reflectivity maps at 8 altitude levels above radar, from 500 m to 4000 m above radar. The extent of the data was set to 336 × 336 km centered on the radar station with spatial resolution of 1 × 1 km for both 2D and 3D data, resulting in images of size 336 × 336 pixels and 8 × 336 × 336 voxels respectively for a single observation.

4. Model Architectures

To compare the impact of adding a vertical dimension as fairly as possible, we chose a basic U-Net architecture inspired by models developed in [3, 4] as a reference model. As U-Net is a fully convolutional neural network, converting it to process volumetric data is a trivial task, mostly just a matter of replacing 2D convolutional layers with 3D convolutions. Besides this, the model only required replacing 2D max-pooling layers in the encoder with 3D max-pooling and bilinear upsampling in the decoder with trilinear upsampling. See Figure 3 for the specific number of channels and kernel sizes at each layer of the model. Both were implemented using the PyTorch library [23].

The conversion of the model from 2D to 3D convolutions was mostly straightforward and resulted in increasing the number of trainable parameters 3-fold from roughly 17 to 52 million. The three-fold increase is based on the fact that the model uses convolution kernels of size 3 at every convolutional layer, therefore each kernel has 27 (3 × 3 × 3) instead of 9 (3 × 3) weights (disregarding bias and multiple channels). Other architectural parameters of the model such as number of kernels at each layer were kept the same for the comparison between these models to be fair and dependent solely on the provided data as much as possible. The GPU processing time (disregarding the time to move the data to memory) was not affected, with both models needing around 6 ms of GPU time to generate a single output on our hardware.

4.1. Training and Evaluation

To train and evaluate the models, the training dataset was split into training, validation and test subsets in chronological order. The last 15% of target observations were selected for the test set, the rest was chosen for training. Out of these, the last 15% of target observations were again selected for validation and the rest was used as training samples. See Table 1 for the exact number of observations in each set.

The Adam optimizer was used for training the model. To find the optimal training hyperparameters (starting learning rate, optimizer learning rate scheduler parameters and gradient clipping threshold) we utilized the Bayesian sweep search provided by Weights & Biases [24]. We trained 20 models with 2D CNN architecture and 5 with 3D CNN architecture. The best performing model of each architecture variant was selected for performance evaluation. See Table 2 for all the possible hyperparameter values and the best performing ones for both 2D and 3D models. Early stopping after 15 non-improving epochs was utilized.

Choosing the right metric to evaluate the performance of precipitation nowcasting models is not simple. The correct method depends on a model's use-case and no single composite measure is currently able to objectively evaluate performance of precipitation nowcasting models [1]. While we outlined the shortcomings of using MSE to evaluate precipitation nowcasting models above in Section 2, we are using MSE as the loss function and the primary evaluation metric despite the double penalization effect that occurs, since it is still the most commonly used metric in this domain. Additionally, to provide more insight into model performance, we are also computing mean model accuracy, precision, recall and F1 scores on binarized precipitation maps using a threshold value of

    Set                       No. of obs.    % of original
    Full Dataset                   355761           100
    Target Observations              9018          2.53
    Target + Lead Obs.              11310          3.18
    Training Set Targets             6515          1.83
    Validation Set Targets           1150          0.32
    Test Set Targets                 1353          0.38

Table 1
The observation count of the full dataset, the subset selected for training according to the training data selection described in Section 3.1 and the sizes of train, test, validation splits.
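[Figure 3 image: U-Net encoder-decoder diagram. Channel labels on the encoder side: 6, 64, 128, 256, 512; on the decoder side: 512, 256, 128, 64, 64, 6. Feature map resolutions by level: (8x)336x336, (8x)168x168, (8x)84x84, (8x)42x42, (8x)21x21. Legend: Double Conv, Skip connection, Single Conv, Max Pooling, Upsample.]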
Figure 3: Diagram of the U-Net encoder-decoder architecture and the feed-forward process for both the 2D and
3D variants of the model. Each rectangle represents a multi-channel feature map with the number of channels shown above it (or
below it in the decoder part). The spatial resolution of the feature maps at each level is shown on the left side of the diagram
(the vertical dimension size of the 3D model is in brackets). Each arrow represents an operation on the data; see the legend at
the bottom right. The kernels of the double convolution operation are of size 3 × 3 or 3 × 3 × 3, the kernels of the final single
convolution operation are of size 1 × 1 or 1 × 1 × 1, and the kernels of the max pooling operation are of size 2 × 2 or 1 × 2 × 2 for
the 2D and 3D models respectively. All the convolutional layers used the ReLU activation function.
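
As an illustration of the resolutions listed in Figure 3, the following sketch traces how the feature-map shape
changes through the encoder levels. It assumes that max pooling uses a stride equal to its kernel size (the usual
U-Net choice, not stated explicitly in the caption); the function names are hypothetical and chosen for this
example only.

```python
# Sketch: feature-map shapes per encoder level for the 2D and 3D U-Net variants.
# Assumption: non-overlapping max pooling (stride == kernel size).

def pool_shape(shape, kernel):
    """Return the shape after max pooling with the given kernel and equal stride."""
    return tuple(size // k for size, k in zip(shape, kernel))


def encoder_shapes(input_shape, kernel, levels):
    """List the feature-map shape at each level, starting with the input level."""
    shapes = [input_shape]
    for _ in range(levels - 1):
        shapes.append(pool_shape(shapes[-1], kernel))
    return shapes


# 3D variant: (altitude levels, height, width), pooled with a 1 x 2 x 2 kernel.
shapes_3d = encoder_shapes((8, 336, 336), kernel=(1, 2, 2), levels=5)
# 2D variant: (height, width), pooled with a 2 x 2 kernel.
shapes_2d = encoder_shapes((336, 336), kernel=(2, 2), levels=5)

for level, (s3, s2) in enumerate(zip(shapes_3d, shapes_2d)):
    print(f"level {level}: 3D {s3}   2D {s2}")
```

Because the 3D pooling kernel has size 1 along the vertical axis, the 8 altitude levels are preserved at every
level while the horizontal resolution is halved, which matches the "(8x)" prefix in the diagram.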


  Hyperparameter             2D U-Net                                                  3D U-Net
  Batch size                 32                                                        4
  Learning rate              5 × 10^-5, 7.5 × 10^-5, 1 × 10^-4, 2.5 × 10^-4, 5 × 10^-4   5 × 10^-5, 7.5 × 10^-5, 1 × 10^-4, 2.5 × 10^-4, 5 × 10^-4
  LR scheduler factor        0.5, 0.7, 0.9                                             0.5, 0.7, 0.9
  LR scheduler patience      3, 5, 7                                                   3, 5, 7
  Grad. clipping threshold   0.2, 1, 5                                                 0.2, 1, 5

Table 2
The hyperparameter values searched during the training of the models using the Weights & Biases Bayesian search.
The values used for training the best performing models are in bold. The batch size used was the highest possible given our
GPU memory limit. The optimizer learning rate scheduler parameters had no effect in practice, as both models
achieved their best performance before the learning rate was lowered. Gradient clipping was added to
prevent the exploding gradients that sometimes occurred when a large starting learning rate was selected.
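
The search space in Table 2 can be written down as a Weights & Biases sweep configuration. The sketch below is
only an illustration: the parameter keys and the metric name `val_loss` are hypothetical, and only the value
lists, the batch sizes, and the Bayesian search method are taken from the table and its caption.

```python
# Sketch of the Table 2 search space as a Weights & Biases sweep definition.
# Assumption: key names and the optimized metric name are placeholders.
from math import prod


def make_sweep_config(batch_size):
    """Build a sweep definition with a fixed batch size and the searched value lists."""
    return {
        "method": "bayes",  # Bayesian search, as stated in the caption
        "metric": {"name": "val_loss", "goal": "minimize"},  # hypothetical metric
        "parameters": {
            "batch_size": {"value": batch_size},  # fixed by GPU memory
            "learning_rate": {"values": [5e-5, 7.5e-5, 1e-4, 2.5e-4, 5e-4]},
            "lr_scheduler_factor": {"values": [0.5, 0.7, 0.9]},
            "lr_scheduler_patience": {"values": [3, 5, 7]},
            "grad_clip_threshold": {"values": [0.2, 1, 5]},
        },
    }


def search_space_size(config):
    """Count the distinct combinations of the searched (non-fixed) parameters."""
    return prod(
        len(spec["values"])
        for spec in config["parameters"].values()
        if "values" in spec
    )


sweep_2d = make_sweep_config(batch_size=32)
sweep_3d = make_sweep_config(batch_size=4)
print(search_space_size(sweep_2d))  # 5 * 3 * 3 * 3 = 135
```

A dictionary of this shape is what `wandb.sweep` takes as its sweep definition; the call itself is left out so
that the example runs without the library installed.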



20 dBZ (corresponding to light rain) to differentiate between rain and no-rain areas. This way, we can evaluate
only the shape of precipitation features and disregard the intensity, which can serve as another valuable metric.
Our experiments have shown that higher threshold values corresponding to extreme precipitation events show
larger differences between model metrics during evaluation; however, the informative value would be lower because
such events occur in only a small minority of the test set observations.

5. 2D vs. 3D: A Comparison

The impact of providing a vertical dimension to the model was evaluated by comparing the error rate when
predicting a single reflectivity map at constant altitude above radar. We trained the 2D model to output the next
CAPPI radar reflectivity maps at 2 km above radar 30 minutes into the future based on past radar reflectivity maps
at the same altitude. Subsequently, we trained a 3D model to predict equivalent 3D reflectivity maps at 8 altitude
levels based on recent volumetric observation data. To evaluate which model is better at precipitation nowcasting,
we evaluate the prediction error on a single CAPPI map at 2 km above radar from the target observation (nowcast
30 minutes in the future). This can be done because one slice of the output volume of the 3D model matches the
altitude level the 2D model was trained on (2000 m.a.r.).

A simple Eulerian persistence was used as a benchmark. This benchmark method simply copies the last input
observation as the prediction output. Despite the method being trivial, the precipitation data is highly dependent
on previous observations and so it provides a good performance benchmark. Using this benchmark, we can also
evaluate the rate of change in the data and therefore see how "difficult" it is to make an accurate
  Model         MSE ↓      MAE ↓     Accuracy ↑   Precision ↑   Recall ↑   F1 ↑
  Persistence   55.4110    4.7534    0.8307       0.6529        0.6426     0.6457
  2D U-Net      22.6510    3.2623    0.8969       0.8257*       0.7282     0.7696
  3D U-Net      22.0340*   3.2124*   0.9000*      0.8022        0.7833*    0.7894*

Table 3
Comparison of model results on the test set for each of the chosen metric scores. The ↓ symbol means it is a lower-is-better
score, while the ↑ symbolizes a higher-is-better score. The best result for each score is marked with an asterisk.
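
The scores in Table 3 and the paired t-test mentioned in this section can be sketched in a few lines. This is a
minimal illustration and not the authors' evaluation code: flat lists of dBZ values stand in for reflectivity
maps, all function names are hypothetical, and two details are assumptions, namely that a pixel counts as rain
when its value is greater than or equal to the 20 dBZ threshold, and that undefined ratios are reported as 0.

```python
import math

RAIN_THRESHOLD_DBZ = 20.0  # rain / no-rain threshold quoted in the text


def mse(pred, target):
    """Mean squared error over all pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def mae(pred, target):
    """Mean absolute error over all pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def binary_scores(pred, target, threshold=RAIN_THRESHOLD_DBZ):
    """Accuracy, precision, recall and F1 on maps binarized at `threshold`."""
    pairs = [(p >= threshold, t >= threshold) for p, t in zip(pred, target)]
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def persistence_nowcast(input_frames):
    """Persistence baseline: the last observed frame is used as the forecast."""
    return input_frames[-1]


def paired_t_statistic(errors_a, errors_b):
    """t statistic of a paired t-test on per-sample errors of two models."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)


# Toy data: two past frames and one target frame with four pixels each.
past_frames = [[0.0, 10.0, 25.0, 30.0], [0.0, 22.0, 25.0, 10.0]]
target = [0.0, 25.0, 15.0, 10.0]

baseline = persistence_nowcast(past_frames)
print(mse(baseline, target), mae(baseline, target))
print(binary_scores(baseline, target))

# Made-up per-sample MSE values for two models on the same four test samples.
t_value = paired_t_statistic([4.0, 6.0, 5.0, 7.0], [3.0, 5.5, 4.0, 6.5])
print(t_value)
```

For a large test set, the t statistic can be compared against the two-sided critical value of roughly 2.576 for
the 0.99 confidence level (normal approximation); a larger absolute value rejects the hypothesis of equal means.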


prediction for each sample.

The results in Table 3 show that the best 3D-CNN U-Net model slightly outperformed the best 2D-CNN counterpart.
On average, the 3D model achieved lower prediction error on the test set, in both MSE and MAE metrics. The
improvement is small, but statistically significant (paired t-test at 0.99 confidence level on test set MSE scores
rejected the null hypothesis that the means of 2D and 3D model error scores are the same, p-value is very close
to zero). The area-based metrics also show small improvements, with accuracy and F1 scores being slightly higher.
Based on considerably higher recall and lower precision, we can assume the 3D model predicts larger precipitation
bodies on average. See Figure 4 for a visual comparison of the model outputs.

[Figure 4 panels: Target (t+30 min), Persistence, 2D-CNN model, 3D-CNN model. Color scale ticks from 20.0 to 37.5.]

Figure 4: A visual comparison of nowcasts produced by the models for a random observation from the test set. Upper left
image shows the target observation at 2 km CAPPI. Upper right is the benchmark persistence nowcast. Bottom left is
the reference 2D-CNN U-Net model nowcast. Bottom right is the corresponding slice of our 3D-CNN volumetric U-Net
model nowcast. While both U-Net models show the expected blurring, the volumetric model is affected less, with larger
areas of high reflectivity (shown in dark red). This is desirable, as the model is better at predicting extreme events.

6. Conclusion

Our research shows that providing additional information from multiple altitude levels has the potential to
increase the nowcasting accuracy, as compared to the currently standard approach of using only 2-dimensional
precipitation maps. The improvements in error metrics, while not groundbreaking, were statistically significant
and show that providing more data is worth it, if we can afford the increase in model complexity and training time.
Even a small reduction in prediction error can be beneficial in many applications and our preliminary results
show that volumetric nowcasting can have a positive impact.

Additionally, volumetric nowcasts undoubtedly provide more value to the operators of these nowcasting systems.
Reflectivity at different altitudes affects the true rainfall rate on the ground in different ways, which cannot
be taken into account from simple 2-dimensional precipitation nowcasts. 3-dimensional predictions of future
reflectivity observations can serve as a more valuable input to the subsequent models mapping the observed
reflectivity to the actual rainfall rate on the ground.

While the field of precipitation nowcasting using neural networks is not new, there are still uncertainties
regarding best practices that should be comprehensively explored and compared. There are several open questions
to answer, e.g.: Is it better to train the model directly on the captured reflectivity data or the data converted to
rainfall rate? How many previous observations should be provided to the model? How to convert radar observations
to actual rainfall on the ground as accurately as possible? These are just some of the interesting problems
that need to be explored in the future.

Acknowledgments

This research was partially supported by TAILOR, a project funded by EU Horizon 2020 research and innovation
programme under GA No 952215; by The Ministry of Education, Science, Research and Sport of the Slovak Republic
under the Contract No. 0827/2021; and by Life Defender - Protector of Life, ITMS code: 313010ASQ6, co-financed by
the European Regional Development Fund (Operational Programme Integrated Infrastructure).
References                                                                         abilistic precipitation forecasting scheme which
                                                                                   merges an extrapolation nowcast with downscaled
 [1] F. Schmid, Y. Wang, A. Harou, Nowcasting                                      nwp, Quarterly Journal of the Royal Meteorologi-
     guidelines–a summary, Bulletin nº 68 (2019) 2.                                cal Society: A journal of the atmospheric sciences,
 [2] X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong,                            applied meteorology and physical oceanography
     W.-c. Woo, Convolutional lstm network: A machine                              132 (2006) 2127–2155.
     learning approach for precipitation nowcasting, Ad-                      [14] S. Pulkkinen, D. Nerini, A. A. Pérez Hortal,
     vances in neural information processing systems                               C. Velasco-Forero, A. Seed, U. Germann, L. Foresti,
     28 (2015).                                                                    Pysteps: an open-source python library for proba-
 [3] G. Ayzel, T. Scheffer, M. Heistermann, Rainnet v1.                            bilistic precipitation nowcasting (v1.0), Geoscien-
     0: a convolutional neural network for radar-based                             tific Model Development 12 (2019) 4185–4219. URL:
     precipitation nowcasting, Geoscientific Model De-                             https://gmd.copernicus.org/articles/12/4185/2019/.
     velopment 13 (2020) 2631–2644.                                                doi:1 0 . 5 1 9 4 / g m d - 1 2 - 4 1 8 5 - 2 0 1 9 .
 [4] K. Trebing, T. Stanczyk, S. Mehrkanoon, Smaat-                           [15] X. Shi, Z. Gao, L. Lausen, H. Wang, D.-Y. Yeung,
     unet: Precipitation nowcasting using a small                                  W.-k. Wong, W.-c. Woo, Deep learning for precipi-
     attention-unet architecture,          Pattern Recogni-                        tation nowcasting: A benchmark and a new model,
     tion Letters 145 (2021) 178–186. URL: https :                                 Advances in neural information processing systems
     / / www.sciencedirect.com / science / article / pii /                         30 (2017).
     S0167865521000556. doi:h t t p s : / / d o i . o r g / 1 0 . 1 0 1 6 /   [16] S. Agrawal, L. Barrington, C. Bromberg, J. Burge,
     j.patrec.2021.01.036.                                                         C. Gazen, J. Hickey, Machine learning for pre-
 [5] C. Wang, P. Wang, P. Wang, B. Xue, D. Wang, Us-                               cipitation nowcasting from radar images, CoRR
     ing conditional generative adversarial 3-d convo-                             abs/1912.12132 (2019). URL: http://arxiv.org/abs/
     lutional neural network for precise radar extrapo-                            1912.12132. a r X i v : 1 9 1 2 . 1 2 1 3 2 .
     lation, IEEE Journal of Selected Topics in Applied                       [17] O. Ronneberger, P. Fischer, T. Brox, U-net: Convo-
     Earth Observations and Remote Sensing 14 (2021)                               lutional networks for biomedical image segmen-
     5735–5749.                                                                    tation, CoRR abs/1505.04597 (2015). URL: http:
 [6] S. Ravuri, K. Lenc, M. Willson, D. Kangin, R. Lam,                            //arxiv.org/abs/1505.04597. a r X i v : 1 5 0 5 . 0 4 5 9 7 .
     P. Mirowski, M. Fitzsimons, M. Athanassiadou,                            [18] C. Keil, G. C. Craig, A displacement and ampli-
     S. Kashem, S. Madge, et al., Skilful precipitation                            tude score employing an optical flow technique,
     nowcasting using deep generative models of radar,                             Weather and Forecasting 24 (2009) 1297 – 1308.
     Nature 597 (2021) 672–677.                                                    URL: https://journals.ametsoc.org/view/journals/
 [7] C. A. Doswell, Severe convective storms—an                                    wefo / 24 / 5 / 2009waf2222247_1.xml. doi:1 0 . 1 1 7 5 /
     overview, Severe convective storms (2001) 1–26.                               2009WAF2222247.1.
 [8] USGS/Google,         Usgs      landsat            8        collec-       [19] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu,
     tion 1 tier 1 toa reflectance, 2022. URL:                                     D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio,
     https : / / developers.google.com / earth - engine /                          Generative adversarial nets, Advances in neural
     datasets/catalog/LANDSAT_LC08_C01_T1_TOA.                                     information processing systems 27 (2014).
 [9] M. Dixon, G. Wiener, Titan: Thunderstorm identifi-                       [20] W. Zhang, R. Zhang, H. Chen, G. He, Y. Ge, L. Han,
     cation, tracking, analysis, and nowcasting—a radar-                           A multi-channel 3d convolutional-recurrent neural
     based methodology, Journal of atmospheric and                                 network for convective storm nowcasting, in: 2021
     oceanic technology 10 (1993) 785–797.                                         IEEE International Geoscience and Remote Sensing
[10] A. Hering, C. Morel, G. Galli, S. Sénési, P. Am-                              Symposium IGARSS, IEEE, 2021, pp. 363–366.
     brosetti, M. Boscacci, Nowcasting thunderstorms                          [21] J. J. Helmus, S. M. Collis, The python arm radar
     in the alpine region using a radar based adaptive                             toolkit (py-art), a library for working with weather
     thresholding scheme, in: Proceedings of ERAD,                                 radar data in the python programming language,
     volume 1, 2004.                                                               Journal of Open Research Software 4 (2016).
[11] E. Ruzanski, V. Chandrasekar, Y. Wang, The casa                          [22] J. S. Marshall, W. M. K. Palmer, The distribu-
     nowcasting system, Journal of Atmospheric and                                 tion of raindrops with size, Journal of Atmo-
     Oceanic Technology 28 (2011) 640–655.                                         spheric Sciences 5 (1948) 165 – 166. doi:1 0 . 1 1 7 5 /
[12] T. Haiden, A. Kann, C. Wittmann, G. Pistotnik,                                1520-0469(1948)005<0165:TDORWS>2.0.CO;2.
     B. Bica, C. Gruber, The integrated nowcast-                              [23] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Brad-
     ing through comprehensive analysis (inca) system                              bury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein,
     and its validation over the eastern alpine region,                            L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito,
     Weather and Forecasting 26 (2011) 166–183.                                    M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner,
[13] N. E. Bowler, C. E. Pierce, A. W. Seed, Steps: A prob-                        L. Fang, J. Bai, S. Chintala, Pytorch: An impera-
     tive style, high-performance deep learning library,
     in: H. Wallach, H. Larochelle, A. Beygelzimer,
     F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances
     in Neural Information Processing Systems 32, Cur-
     ran Associates, Inc., 2019, pp. 8024–8035. URL:
     http://papers.neurips.cc/paper/9015-pytorch-an-
     imperative-style-high-performance-deep-learning-
     library.pdf.
[24] L. Biewald, Experiment tracking with weights and
     biases, 2020. URL: https://www.wandb.com/, soft-
     ware available from wandb.com.