<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Convolutional Neural Network with U-Net Architecture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Pavlík</string-name>
          <email>peter.pavlik@kinit.sk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viera Rozinajová</string-name>
          <email>viera.rozinajova@kinit.sk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Bou Ezzeddine</string-name>
          <email>anna.bou.ezzeddine@kinit.sk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Information Technology, Brno University of Technology</institution>
          ,
          <addr-line>Božetěchova 1/2, Brno-Královo Pole, 612 00, Czechia</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kempelen Institute of Intelligent Technologies</institution>
          ,
          <addr-line>Mlynské Nivy II. 18890/5, Bratislava, 821 09</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Slovak Centre for Research of Artificial Intelligence - slovak.AI</institution>
          ,
          <country country="SK">Slovakia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years - as in many other domains - deep learning models have found their place in precipitation nowcasting. Many of these models are based on the U-Net architecture, which was originally developed for biomedical segmentation but is also useful for generating short-term forecasts, and is therefore applicable in the weather nowcasting domain. The existing U-Net-based models use sequential radar data mapped onto a 2-dimensional Cartesian grid as input and output. We propose to incorporate a third - vertical - dimension to better predict precipitation phenomena such as convective rainfall, and present our results here. We compare the nowcasting performance of two comparable U-Net models trained on two-dimensional and three-dimensional radar observation data. We show that using volumetric data results in a small but statistically significant reduction in prediction error.</p>
      </abstract>
      <kwd-group>
        <kwd>precipitation nowcasting</kwd>
        <kwd>radar imaging</kwd>
        <kwd>U-Net</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>1. Introduction
ning various human activities and tasks such as
agriculture, construction building or winter road
maintenance. Nowcasting is defined by the World
Meteorological Agency as forecasting with local detail, by any
method, over a period from the present to six hours
ahead, including a detailed description of the present
weather [1].</p>
    </sec>
    <sec id="sec-2">
      <title>In practice, simpler - and therefore faster - models out</title>
      <p>perform complex Numerical Weather Prediction (NWP)
models at the task of precipitation nowcasting because</p>
    </sec>
    <sec id="sec-3">
      <title>NWP models cannot consider the latest observations due</title>
      <p>to their long inference time. The highly sophisticated
NWP models usually need hours to produce their
forecasts and so they are not able to take into consideration
the latest data observations. Even a simple model that
can quickly output a prediction will outperform the NWP
models at the task of precipitation nowcasting simply by
the fact that it can consider the present data.
Nowcasting models can work in conjunction with NWP models
and use their long-term forecasts as additional inputs to
further refine their nowcasts [ 1].</p>
    </sec>
    <sec id="sec-4">
      <title>Precipitation nowcasting is usually performed using temporal extrapolation of past data from weather radar</title>
      <p>nEvelop-O
of convective storms [7].</p>
    </sec>
    <sec id="sec-5">
      <title>We compare two models - a reference U-Net architec</title>
      <p>ture based on existing research [3, 4] and an alternative
with 2D convolutional layers replaced by 3D convolution.</p>
    </sec>
    <sec id="sec-6">
      <title>We evaluate their performance in the task of predicting</title>
      <p>The first deep learning approach applied to the task of
300 precipitation nowcasting was a ConvLSTM model
pre40 sented in [2] that outperformed the operational
optical250 lfow-based ROVER nowcasting system. Experiments
30 with other CNN architectures started, such as a
Con200 vGRU model from [15] or a U-Net-based architecture
20 dZB introduced in [16]. The U-Net architectures, originally
de150 veloped for segmentation of medical images [17], proved
10 to be quite popular with models such as RainNet[3] and
100 SmaAt-U-Net[4] further exploring this approach.</p>
      <p>0 The previously mentioned neural network regression
50 models trying to nowcast the future state of
precipita10 tion fields were afected by blurring. When using
tra0 0 50 100 150 200 250 300 ditional gridpoint-based verification statistics such as
Mean Squared Error (MSE) as the training loss function,
Figure 1: A single radar echo observation. The shown re- we face the so-called “double penalty problem”. A
forefrlaedcatirvi(tCyAvPaPluIe).sTrheperreesfelnetctrievfilteyctmivaitpy icsaopvtuerreladidato2vekrmaasbaotveel- cast of a precipitation feature that is correct in terms of
lite image of the appropriate area centered on the Malý Ja- intensity, size, and timing, but incorrect concerning
locavorník radar station generated using Google Earth Engine [8]. tion, results in very large mean square error [18]. This
Landsat-8 image courtesy of the U.S. Geological Survey. causes the model to produce blurry outputs to mitigate
the penalisation caused by spatially incorrect
precipitation features.</p>
      <p>The blurry predictions pose one of the biggest challenges for anyone trying to develop a nowcasting model based on machine learning, as such models have difficulties predicting extreme events due to the smoothing. Recently, this problem started to be addressed by training models using the Generative Adversarial Network (GAN) approach, the most prominent example being DGMR [6]. Its authors introduced a GAN framework [19] to solve the problem of blurry predictions present in other deep learning precipitation nowcasting models such as RainNet. The model is trained with a loss function comprising two discriminators, inspired by existing research in video generation, and a regularization term. The first, spatial discriminator discourages blurry predictions, while the second, temporal discriminator discourages jumpy predictions. The regularization term penalizes deviations between the observed radar sequences and the model prediction. The DGMR model can currently be considered the state-of-the-art in the precipitation nowcasting domain.</p>
      <sec id="sec-2-1">
        <title>2.1. Motivation for Volumetric Nowcasting</title>
        <p>The application of deep learning models for precipitation nowcasting is the focus of many research works. However, the vast majority of the models use 2-dimensional aggregate radar products and thus throw away any information which can be gained from processing the vertical structure of precipitation objects captured by the radar. By processing the data into a 2D aggregated map, we lose all information about the vertical structure of the observed precipitation. When reviewing the existing works in the precipitation nowcasting domain, we therefore identified a need to explore the effect of working with 3-dimensional volumetric radar data.</p>
        <p>An early model working with volumetric radar data was presented in [20], where a ConvLSTM model was used to predict future radar reflectivity. The model input shape is 18 × 18 × 20 (18 × 18 km at 1 km resolution, 10 km above the radar at 500 m resolution), provided at multiple time steps; each one is processed by a 3D-CNN first, then passed on to a ConvLSTM sequential network. The output is a classification for the central region of 6 × 6 km, predicting whether the reflectivity in the next 30 and 60 minutes will exceed a set threshold. The final result is a binary map with a resolution of 6 × 6 km. The problem with this approach is that the model cannot consider any fast moving precipitation particles, since it cannot see more than 6 km past its target region. Also, the target region size of 6 × 6 km can hardly be considered a high spatial resolution, which is one of the defining traits of nowcasting.</p>
        <p>One other work worth mentioning is a 3D-CNN+GAN hybrid model from [5]. This model is quite sophisticated: it uses the GAN-based approach to predict plausible data and a weighted MSE loss function to give more importance to high reflectivity values, resulting in a better ability to predict extreme precipitation events and reduced output blurring. However, the third data dimension is not the altitude above radar we want to consider, but time, i.e. the past observations as separate channels form the 3D volume. Nevertheless, the model drives the development of 3D-CNN models for precipitation nowcasting.</p>
      </sec>
    </sec>
      <sec id="sec-6-1">
        <title>3. Radar Reflectivity Dataset</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>To explore the efect of volumetric precipitation now</title>
      <p>casting, we collaborated with the Slovak Meteorological
Institute that provided us a dataset of roughly 3.5 years
of reflectivity data from Malý Javorník weather radar
station. The data is captured in 5 minute intervals. The
dataset consists of 355 761 separate observations in the
ODIM HDF5 format.</p>
      <p>The radar captures the precipitation particles in the air by measuring the radar wave power (echo) returned after hitting precipitation particles. This value is called reflectivity and is measured in logarithmic dimensionless units called decibels (dBZ). The data consists of reflectivity values at the so-called reflectivity gates, at multiple elevation angles distributed around the radar station, encoded in polar coordinates. See Figure 2 for a vertical slice of a single radar observation.</p>
      <p>Figure 2: Vertical slice of a single radar reflectivity observation at a set azimuth. The separate "rays" at different elevation angles are identifiable.</p>
      <p>Since convolutional neural network models cannot process the data in polar coordinates, we need to convert them into Cartesian maps. We processed the data using the Py-ART Python library [21]. The radar echo observations are typically aggregated into precipitation maps in two forms. The first one is the Constant Altitude Plan Position Indicator (CAPPI), which displays the reflectivity gate values at a certain altitude slice above the radar. The other is CMAX, which aggregates the vertical dimension and displays the maximum value in the vertical column for each data point. If a 3D volume is created from multiple CAPPI maps at different altitude levels, the product is called MCAPPI.</p>
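      <p>As an illustration of this conversion step, the following minimal sketch grids a single polar volume into an MCAPPI stack with Py-ART. The input file name is hypothetical, and the gridding parameters are our assumptions chosen to mirror the dataset dimensions described below, not an exact record of the preprocessing used.</p>
      <preformat>
# Minimal sketch: one polar ODIM HDF5 volume to a Cartesian MCAPPI stack.
import pyart

radar = pyart.aux_io.read_odim_h5("radar_volume.hdf")  # hypothetical file name

# Grid onto a 336 x 336 km Cartesian extent at 1 km horizontal resolution,
# with 8 vertical levels from 500 m to 4000 m above the radar.
grid = pyart.map.grid_from_radars(
    radar,
    grid_shape=(8, 336, 336),
    grid_limits=((500.0, 4000.0), (-168_000.0, 168_000.0), (-168_000.0, 168_000.0)),
    fields=["reflectivity"],  # assumed field name per Py-ART's default ODIM mapping
)

mcappi = grid.fields["reflectivity"]["data"]  # shape (8, 336, 336), in dBZ
cappi_2km = mcappi[3]  # the 2000 m level, used for the 2D dataset
</preformat>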
      <p>The reflectivity maps can be converted to rainfall rate maps using the Marshall-Palmer Z-R relationship [22]:</p>
      <p>Z = 200 R^1.6 (1)</p>
      <p>where Z is the reflectivity factor and R is the rainfall rate in mm/h.</p>
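      <p>Since the radar data is stored in dBZ, applying equation (1) first requires undoing the decibel scaling, Z = 10^(dBZ/10). A minimal sketch of the conversion in both directions:</p>
      <preformat>
import numpy as np

def dbz_to_rainfall_rate(dbz):
    """Invert the Marshall-Palmer relationship Z = 200 * R**1.6 (Eq. 1).

    The dBZ values are first converted to the linear reflectivity
    factor Z, then the relationship is solved for R in mm/h.
    """
    z = 10.0 ** (dbz / 10.0)
    return (z / 200.0) ** (1.0 / 1.6)

def rainfall_rate_to_dbz(rate):
    """Forward direction of Eq. 1, returning reflectivity in dBZ."""
    return 10.0 * np.log10(200.0 * rate ** 1.6)
</preformat>
      <p>Under this relationship, a rainfall rate of 0.6 mm/h corresponds to roughly 19.5 dBZ, which is why the 20 dBZ binarization threshold used for evaluation in Section 4.1 matches the slight-rain filtering threshold below.</p>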
      <sec id="sec-3-1">
        <title>3.1. Training data selection</title>
        <p>The dataset requires filtering before training, since the majority of the observations are of clear skies with nothing to learn from. Most of the observations in the dataset therefore have no value for training the model and could even negatively affect the training by biasing the outputs toward clear sky prediction, while we are mostly interested in non-trivial cases with high precipitation. We filtered the images as follows (a sketch of the resulting test follows the list):</p>
    <sec id="sec-8">
      <title>1. Create a CAPPI radar reflectivity map at 2 km</title>
      <p>altitude above radar at 1 × 1 km resolution and
select a center slice of size 336 × 336 km.
2. Convert reflectivity to rainfall rate according to</p>
      <p>Marshall-Palmer Z-R relationship (1).
3. Compute the ratio of rainy to clear pixels
(threshold 0.05 mm/5 min or 0.6 mm/h - corresponds to
slight rain).</p>
      <p>Set
Full Dataset
Target Observations
Target + Lead Obs.</p>
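        <p>A minimal sketch of the per-observation test in steps 2 to 4, reusing the conversion function from above; the check for the availability of the 11 preceding observations is assumed to happen separately when the sequences are assembled.</p>
        <preformat>
import numpy as np

RAIN_THRESHOLD_MM_H = 0.6   # slight rain: 0.05 mm per 5 min interval
MIN_RAINY_FRACTION = 0.20   # keep maps with at least 20% rainy pixels

def is_target_observation(cappi_2km_dbz):
    """Decide whether a single 336 x 336 CAPPI map is a target observation."""
    rate = dbz_to_rainfall_rate(cappi_2km_dbz)          # Eq. 1
    rainy_fraction = np.mean(rate >= RAIN_THRESHOLD_MM_H)
    return rainy_fraction >= MIN_RAINY_FRACTION
</preformat>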
        <p>Each selected target observation was included in the training dataset, along with a set number of previous observations to serve as inputs and non-target intermediary outputs. For our models, we decided to use 6 observations as input and 6 as output, effectively predicting the precipitation half an hour in advance based on the last half hour of data. This means that for each target observation, we also needed to include 11 leading observations in the dataset. This process returned 9 018 suitable target images, which together with the necessary leading images represent 3.18% of the original dataset.</p>
        <p>It should be noted that the data converted to rainfall as described above was not used for training, only for filtering the target observations based on the ratio of rainy pixels. The actual training data used reflectivity directly, for both 2D images and 3D volumes. The 2D dataset was a collection of CAPPI radar reflectivity maps at 2 km altitude above radar. The 3D dataset was a collection of CAPPI radar reflectivity maps at 8 altitude levels above radar, from 500 m.a.r. to 4000 m.a.r. The extent of the data was set to 336 × 336 km centered on the radar station with a spatial resolution of 1 × 1 km for both the 2D and 3D data, resulting in images of size 336 × 336 pixels and volumes of 8 × 336 × 336 voxels respectively for a single observation.</p>
        <p>To train and evaluate the models, the dataset was split into training, validation and test subsets in chronological order. The last 15% of target observations were selected for the test set, the rest was chosen for training. Out of these, the last 15% of target observations were again selected for validation and the rest was used as training samples. See Table 1 for the exact number of observations in each set.</p>
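        <p>A sketch of this chronological split; applied to the 9 018 target observations with rounding, these proportions reproduce the subset sizes listed in Table 1.</p>
        <preformat>
def chronological_split(targets):
    """Split chronologically ordered targets into train/validation/test.

    The last 15% of all targets form the test set; the last 15% of the
    remainder form the validation set.
    """
    n_test = round(0.15 * len(targets))
    train_val, test = targets[:-n_test], targets[-n_test:]
    n_val = round(0.15 * len(train_val))
    train, val = train_val[:-n_val], train_val[-n_val:]
    return train, val, test

# 9018 targets -> 6515 train, 1150 validation, 1353 test (Table 1)
</preformat>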
        <p>Table 1: Number of observations in each subset.
Full Dataset: 355 761
Target Observations: 9 018
Target + Lead Obs.: 11 310
Training Set Targets: 6 515
Validation Set Targets: 1 150
Test Set Targets: 1 353</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Model Architectures</title>
      <p>To compare the impact of adding a vertical dimension as fairly as possible, we chose a basic U-Net architecture inspired by the models developed in [3, 4] as a reference model. As U-Net is a fully convolutional neural network, converting it to process volumetric data is a trivial task - mostly just a matter of replacing 2D convolutional layers with 3D convolutions. Besides this, the model only required replacing the 2D max-pooling layers in the encoder with 3D max-pooling and the bilinear upsampling in the decoder with trilinear upsampling. See Figure 3 for the specific number of channels and kernel sizes at each layer of the model. Both models were implemented using the PyTorch library [23].</p>
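      <p>To illustrate the kind of change involved, the sketch below defines a U-Net "double convolution" block that can be switched between the 2D and 3D variants. It is a generic building block under common U-Net conventions, not the exact layer configuration of our models, which is given in Figure 3.</p>
      <preformat>
import torch.nn as nn

def double_conv(in_channels, out_channels, volumetric=False):
    """One U-Net double-convolution block in either 2D or 3D.

    Converting the network to volumetric data only swaps the layer
    classes; kernel sizes and channel counts stay identical, so each
    kernel grows from 9 (3 x 3) to 27 (3 x 3 x 3) weights.
    """
    Conv = nn.Conv3d if volumetric else nn.Conv2d
    return nn.Sequential(
        Conv(in_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        Conv(out_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

# Pooling and upsampling change the same way:
pool = {False: nn.MaxPool2d(2), True: nn.MaxPool3d(2)}
upsample = {
    False: nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True),
    True: nn.Upsample(scale_factor=2, mode="trilinear", align_corners=True),
}
</preformat>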
      <p>The conversion of the model from 2D to 3D convolutions was mostly straightforward and resulted in a 3-fold increase in the number of trainable parameters, from roughly 17 to 52 million. The three-fold increase follows from the fact that the model uses convolution kernels of size 3 at every convolutional layer: each kernel has 27 (3 × 3 × 3) instead of 9 (3 × 3) weights (disregarding bias and multiple channels). Other architectural parameters of the model, such as the number of kernels at each layer, were kept the same so that the comparison between the models is fair and depends solely on the provided data.</p>
      <p>Figure 3: The U-Net architecture used, showing the double convolution blocks, skip connections, single convolutions, max pooling and upsampling layers.</p>
      <sec id="sec-4-1">
        <title>4.1. Training and Evaluation</title>
        <p>The Adam optimizer was used for training the models. To find the optimal training hyperparameters - the starting learning rate, the learning rate scheduler parameters and the gradient clipping threshold - we utilized the Bayesian sweep search provided by Weights &amp; Biases [24]. We trained 20 models with the 2D CNN architecture and 5 with the 3D CNN architecture. The best performing model of each architecture variant was selected for performance evaluation. See Table 2 for all the possible hyperparameter values and the best performing ones for both the 2D and 3D models. Early stopping after 15 non-improving epochs was utilized.</p>
        <p>Choosing the right metric to evaluate the performance of precipitation nowcasting models is not simple. The correct method depends on a model's use-case, and no single composite measure is currently able to objectively evaluate the performance of precipitation nowcasting models [1]. While we outlined the shortcomings of using MSE to evaluate precipitation nowcasting models in Section 2, we use MSE as the loss function and the primary evaluation metric despite the double penalization effect, since it is still the most commonly used metric in this domain. Additionally, to provide more insight into model performance, we also compute mean model accuracy, precision, recall and F1 scores on binarized precipitation maps, using a threshold value of 20 dBZ (corresponding to light rain) to differentiate between rain and no-rain areas. This way, we can evaluate only the shape of the precipitation features and disregard the intensity, which serves as another valuable metric. Our experiments have shown that higher threshold values corresponding to extreme precipitation events show larger differences between model metrics during evaluation; however, the informative value would be lower due to such events occurring only in a small minority of the test set observations.</p>
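        <p>A sketch of this area-based evaluation for a single prediction-target pair, with both maps binarized at the 20 dBZ threshold:</p>
        <preformat>
import numpy as np

def binarized_scores(pred_dbz, target_dbz, threshold_dbz=20.0):
    """Accuracy, precision, recall and F1 on rain / no-rain masks."""
    pred = pred_dbz >= threshold_dbz
    target = target_dbz >= threshold_dbz
    tp = np.sum(np.logical_and(pred, target))
    fp = np.sum(np.logical_and(pred, np.logical_not(target)))
    fn = np.sum(np.logical_and(np.logical_not(pred), target))
    tn = np.sum(np.logical_and(np.logical_not(pred), np.logical_not(target)))
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
</preformat>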
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. 2D vs. 3D: A Comparison</title>
      <p>The impact of providing a vertical dimension to the model was evaluated by comparing the error rate when predicting a single reflectivity map at a constant altitude above radar. We trained the 2D model to output the next CAPPI radar reflectivity maps at 2 km above radar 30 minutes into the future, based on past radar reflectivity maps at the same altitude. Subsequently, we trained a 3D model to predict equivalent 3D reflectivity maps at 8 altitude levels based on recent volumetric observation data. To evaluate which model is better at precipitation nowcasting, we evaluate the prediction error on a single CAPPI map at 2 km above radar from the target observation (a nowcast 30 minutes into the future). This can be done because one slice of the output volume of the 3D model matches the altitude level the 2D model was trained on (2000 m.a.r.).</p>
      <p>A simple Eulerian persistence model was used as a benchmark. This benchmark method simply copies the last input observation as the prediction output. Despite the method being trivial, the precipitation data is highly dependent on previous observations, so it provides a good performance benchmark. Using this benchmark, we can also evaluate the rate of change in the data and therefore see how "difficult" it is to make an accurate prediction for each sample.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The results in Table 3 show that the best 3D-CNN U-Net model slightly outperformed the best 2D-CNN counterpart. On average, the 3D model achieved a lower prediction error on the test set in both the MSE and MAE metrics. The improvement is small, but statistically significant: a paired t-test at the 0.99 confidence level on the test set MSE scores rejected the null hypothesis that the means of the 2D and 3D model error scores are the same, with a p-value very close to zero. The area-based metrics also show small improvements, with accuracy and F1 scores being slightly higher. Based on the considerably higher recall and lower precision, we can assume the 3D model predicts larger precipitation bodies on average. See Figure 4 for a visual comparison of the model outputs.</p>
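      <p>A sketch of the significance test, assuming per-sample test-set MSE scores were collected for both models (the file names are illustrative). A paired test is appropriate because both models are scored on exactly the same test samples:</p>
      <preformat>
import numpy as np
from scipy import stats

mse_2d = np.load("mse_2d_test.npy")  # one MSE score per test target
mse_3d = np.load("mse_3d_test.npy")

t_statistic, p_value = stats.ttest_rel(mse_2d, mse_3d)
# The null hypothesis of equal mean errors is rejected at the
# 0.99 confidence level when the p-value stays below 0.01.
print(t_statistic, p_value)
</preformat>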
        <sec id="sec-8-1-1">
          <title>Acknowledgments</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>This research was partially supported by TAILOR, a</title>
      <p>project funded by EU Horizon 2020 research and
innovation programme under GA No 952215; by The Ministry
of Education, Science, Research and Sport of the Slovak
Republic under the Contract No. 0827/2021; and by Life
Defender - Protector of Life, ITMS code: 313010ASQ6,
coifnanced by the European Regional Development Fund
(Operational Programme Integrated Infrastructure).
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref-23">
        <mixed-citation>[23] A. Paszke, et al., PyTorch: An imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems 32, Curran Associates, Inc., 2019, pp. 8024–8035. URL: http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf.</mixed-citation>
      </ref>
      <ref id="ref-24">
        <mixed-citation>[24] L. Biewald, Experiment tracking with Weights and Biases, 2020. URL: https://www.wandb.com/. Software available from wandb.com.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>