LSTMs for Inferring Planetary Boundary Layer Height (PBLH)
Zeenat Ali 1, Dorsa Ziaei 1, Jennifer Sleeman 1, Zhifeng Yang 2, Milton Halem 1
1 University of Maryland, Baltimore County, Dept. of Computer Science & Electrical Engineering, Baltimore, MD 21250 USA
2 University of Maryland, Baltimore County, Department of Physics, Baltimore, MD 21250 USA
zali2@umbc.edu, dorsaz1@umbc.edu, jsleem1@umbc.edu, vy57456@umbc.edu, halem@umbc.edu


                           Abstract
    In this paper, we describe new work that is part of a larger study to understand
    how machine learning could be used to augment existing methods for calculating
    and estimating the Planetary Boundary Layer Height (PBLH). We describe how a
    Long Short-Term Memory (LSTM) network could be used to learn PBLH changes over
    time for different geographical locations across the United States, in
    conjunction with the WRF-Chem model. If the machine learning method can achieve
    accuracy levels similar to the model-based calculations, then it is feasible for
    the deep learning model to be used as an embedded method within the WRF-Chem
    model. The paper shows promising results that warrant more exploration. We
    describe results for two experiments in particular. The first experiment used 20
    geographical locations for a two-month period of hourly WRF-Chem calculated
    PBLH. In this experiment, we evaluated how well the LSTM could learn PBLH by
    using limited data across a set of nearby locations. This model achieved an RMSE
    of 0.11 on predicted PBLH. The second experiment used one year of hourly PBLH
    calculations from the WRF-Chem model to evaluate the LSTM prediction for a
    selection of three locations with separate LSTM models, achieving RMSE scores of
    0.04, 0.01, and 0.05, respectively. We describe these results and the future
    plans for this work.

Copyright © 2021 for the individual papers by the papers’ authors.
Copyright © 2021 for the volume as a collection by its editors. This
volume and its papers are published under the Creative Commons
License Attribution 4.0 International (CC BY 4.0).

                           Introduction
The Planetary Boundary Layer (PBL) is the layer above the Earth’s surface in which
aerosols are present (Stull 1988). Accurate calculations of the top of the PBL can
better inform air quality forecasts. Machine learning methods that can improve PBL
calculation accuracy and computational efficiency are of interest to the earth
science community. The work described in this paper outlines how a Long Short-Term
Memory (LSTM) network could be used to learn how planetary boundary layer heights
change over time for various geographical locations. LSTM networks can produce
accurate predictions from complex data representations within a reasonable training
time and address the limitations of conventional RNNs (Greff et al. 2017). Recently,
LSTM-based weather forecasting has gained considerable attention in the weather
prediction domain. The ability to model temporal data sequences along with long-term
dependencies through memory blocks makes the LSTM model a strong choice in weather
forecasting studies (Gayathiri Kathiresan 2019).
   Since WRF-Chem (Peckham 2012) model-based PBLH calculations inherently include
features such as wind, temperature, and humidity, the objective of this study is to
determine whether the LSTM is able to learn to predict PBLH without explicitly using
these additional features. By learning patterns of PBLH temporal changes over
geographical points across the United States, can it accurately predict future PBLH
from these patterns of change alone? If the network is able to learn to predict
future PBLH based on past PBLH, the deep learning model could be called upon to
predict PBLH for a number of time steps in the future as an embedded method within
the WRF-Chem process. Two experiments have been conducted that use WRF-Chem model
output. The first focuses on training an LSTM for 20 geographical locations over
dates in November and December 2016. This first experiment is part of a larger study
on how ceilometer-based backscatter can be used for PBLH estimation to augment model
calculations for improved PBLH calculations (Caicedo et al. 2017; Delgado et al.
2018). The second focuses on training LSTMs for specific locations using WRF-Chem
data for 2018-19.

                           Background
WRF-Chem (Peckham 2012) is a fully coupled “online” chemistry model, in which the
air quality component is consistent with the meteorological component (Grell et al.
2005). In this study, the gas-phase chemistry and aerosol modules are based on the
Carbon Bond Mechanism Z (CBM-Z) (Zaveri and Peters 1999) and the Model for
Simulating Aerosol Interactions and Chemistry (MOSAIC) (Zaveri et al. 2008),
respectively. While there are several other PBL schemes, the YSU scheme (Hong and
Pan 1996; Hong and Lim 2006) is selected for the runs reported in this work. An
extended discussion of the different model PBL parameterization schemes and their
success (or otherwise) in comparison with lidar-observed PBL data in this same study
region is reported elsewhere (López, Archilla, and Quintana 2020). Model radiation
treatment utilizes the Rapid Radiative Transfer Model
for General Circulation Models (RRTMG) short-wave and long-wave radiation schemes
(Iacono et al. 2008), including the aerosol radiation feedback.

                           Related Work
LSTM models have long been used for air quality and weather prediction problems. In
(Karevan and Suykens 2018), the authors used a spatio-temporal stacked LSTM model
for temperature prediction and showed improved performance of their prediction model
with the stacked LSTM. Weather prediction has also been studied in (Fente and Singh
2018) using LSTM models; in that work, multiple LSTM models were trained for
different combinations of weather parameters.
   The problem of weather prediction has also been studied in (Hewage et al. 2019)
and (Zaytar and Amrani 2016), where stacked LSTM architectures were used and
compared to traditional forecasting models. In the former, Hewage et al. compared
LSTM weather predictions with the WRF numerical weather prediction (NWP) model and
demonstrated the accuracy of the LSTM results. In the latter, Zaytar and Amrani
showed results of forecasting temperature, humidity, and wind speed, and showed that
LSTM-based neural networks can be considered an alternative to traditional models
for forecasting weather conditions.
   Rainfall prediction is a category of weather prediction problems and has been the
focus of multiple studies, such as (Poornima and Pushpalatha 2019) and (Samad et al.
2020). Poornima and Pushpalatha presented an Intensified Long Short-Term Memory
(Intensified LSTM) based Recurrent Neural Network (RNN) to predict rainfall. They
compared their results with Holt–Winters, Extreme Learning Machine (ELM),
Autoregressive Integrated Moving Average (ARIMA), Recurrent Neural Network, and Long
Short-Term Memory models to show the improvement in the ability to predict rainfall.
Samad et al. used an LSTM-based RNN for rainfall prediction and showed the accuracy
and performance of the model on a standard rainfall dataset.
   In this study, to learn PBLH changes over time, the problem is formulated as a
time series forecasting problem in which a stacked LSTM network is built and trained
on data from two different WRF-Chem model runs.

                           Approach
The overall approach is a stacked LSTM that learns to predict PBLH for given
geographical locations by training the LSTM on large data sets generated from
WRF-Chem models. These models provide PBLH for various geographical locations and
for different periods of time across different seasons; however, explicitly defined
features such as wind, temperature, and humidity are excluded. The current LSTM uses
a single uni-variate methodology. In this uni-variate approach, we train the network
using two different models to explore two different ideas. In the first method, a
stacked LSTM was trained on multiple geographical locations using two months of
WRF-Chem output (specific to the East Coast). This method used output for the period
of November 29, 2016, to December 30, 2016, in coordination with related work
applying machine learning methods to ceilometer backscatter profiles for an ad hoc
campaign (Sleeman et al. 2020) conducted by the University of Maryland, Baltimore
County Physics Department. The second method also used a stacked LSTM but was
trained on individual geographical locations and was based on data from a year of
WRF-Chem output (for most of North America) for the period of January 2018 to
January 2019.
   The stacked LSTM model used for both approaches is shown in Figure 1. It was
constructed by sequencing three LSTM layers with 50 units each, where each layer
takes three arguments: the number of units, return sequences, and input shape. The
input shape was the shape of the input data set. The return sequences parameter was
set to ’True’ so that the three LSTM layers could be stacked. A dense layer with a
single output unit was added after the three stacked LSTM layers. The optimizer was
’Adam’ and the loss function was ’mean squared error’. The next step is to compare
this model with a multi-variate LSTM methodology.

                     Figure 1: LSTM network

                           Data Set Description
For each data set, the WRF-Chem model data was transformed so that each instance X1
consists of N historical PBLH values, with value N + 1 as its outcome Y1. The window
from 1 to N then shifts by 1, so that the next instance, X2, consists of values 2 to
N + 1, with value N + 2
as its outcome, Y2. This continues until all data is consumed, resulting in a data
set X1, X2, ..., Xk with N features each and respective labels Y1, Y2, ..., Yk. N is
set to 100 for the experiments discussed in this paper.
   After scaling and reshaping, the data was converted into a 3D array of shape
(number of training samples, 100 time steps, 1 feature per step) to be fed into the
network built above.
   For the first experiment, the data set used was based on the numeric values of
PBLH generated from the WRF-Chem model for approximately 15,000 locations across the
latitude-longitude bounds [36.63, -79.24707] to [40.79, -73.92], recorded at various
time stamps from Nov 29, 2016, 20:00 hours to Jan 01, 2017, 00:00 hours, as shown in
Figure 2 and Figure 3. Figure 2 shows a numeric representation of the data with
columns: date, time, latitude, longitude, and the respective recorded planetary
boundary layer height. Figure 3 shows the spread of the data in terms of
geographical location on the United States map. The WRF-Chem model consisted of two
regions: an outer region with 9 km resolution from 25N to 50N and -70W to -90W, and
an inner region from 35N to 45N and -73W to -80W. The prognostic variables of the
outer region are specified every 3 hours from the NCEP reanalysis. For the inner
region, prognostic variables are specified every hour from a reanalysis. The outer
region is integrated with a one-minute time step and the inner region with a
20-second time step.

          Figure 2: Example Data from the Model

    Figure 3: Geographical Representation of Raw Data

   For the second experiment, the data used was for individual sites recorded hourly
from January 2018 to January 2019. The sites highlighted in this paper are: site1
[44.2062, -63.14245], site2 [35.3382, -90.31128], and site3 [35.3382, -90.31128].
The WRF-Chem model covered the year 2018, and the region was most of North America
at 2.5 degree resolution from 0 degrees north to 80 degrees north and -60 west to
-135 west. Prognostic variables were specified every 6 hours at all grid points and
integrated every 6 hours with 3 minute time steps.

                           Experimentation
There were two main experiments conducted in this study. The first used a 2-month
WRF-Chem model focused on the East Coast. The goal of this experiment was to explore
how well the LSTM network could approximate PBLH for locations near each other. The
second experiment used a 1-year WRF-Chem model focused on most of North America. The
goal of this experiment was to explore how well an LSTM network could predict PBLH
for specific locations when trained on data spanning multiple seasons, without
explicitly including features such as temperature.

Multi-Location Experiment
This experiment consisted of 20 locations and data from the model experiment. The 20
nearby locations formed a small patch as shown in Figure 4.

Figure 4: Geographical Representation WRF-Chem Model Locations

   The LSTM model was trained on data for 100 epochs with a batch size of 64 and
tested on train and test data to evaluate the overall performance of the LSTM model.
The training data had 817 examples and the test data included 441 samples. After the
model had been trained, it was evaluated using the test data.

Single Location Experiment
The second experiment consisted of three sites with 8834 rows of data each. The LSTM
model was trained on data for 100 epochs with a batch size of 64. The training data
had 5742 samples and the test data included 3092 samples.
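
As a rough illustration of how the data preparation and the sample counts above might fit together, the following Python sketch splits a series in time order and builds the sliding windows from the Data Set Description (N = 100). The 65% training fraction is an assumption: it is inferred from the reported counts (817/441 and 5742/3092) and is not stated in the text. The helper names `chronological_split` and `make_windows` are hypothetical, and the synthetic series only stands in for hourly WRF-Chem PBLH at one site.

```python
from typing import List, Sequence, Tuple

WINDOW = 100  # N in the Data Set Description


def chronological_split(
    series: Sequence[float], train_fraction: float = 0.65
) -> Tuple[List[float], List[float]]:
    """Split a time series in order (no shuffling); the test part is the most recent data.

    The default fraction is an assumption inferred from the reported sample counts.
    """
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def make_windows(
    series: Sequence[float], n: int = WINDOW
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (X, y) pairs: X_i holds n consecutive values, y_i is the value that follows."""
    X, y = [], []
    for start in range(len(series) - n):
        # Each window has shape (n, 1): n time steps with one feature per step.
        X.append([[value] for value in series[start:start + n]])
        y.append(series[start + n])
    return X, y


# Synthetic stand-in for 8834 hourly PBLH values at a single site.
series = [float(i) for i in range(8834)]

train, test = chronological_split(series)
X_train, y_train = make_windows(train)
X_test, y_test = make_windows(test)

print("rows:   ", len(train), len(test))
print("windows:", len(X_train), len(X_test))
print("window shape:", (len(X_train[0]), len(X_train[0][0])))
```

Windowing the training and test parts separately means the first 100 values of each part serve only as input history and receive no prediction. Whether the reported sample counts refer to rows or to windows is not stated in the text; the sketch treats them as rows.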
                           Results
Both experiments yielded encouraging results. RMSE was
measured for the held-out test set in each experiment. In
the 1-year experiment, a comparison with a linear regression
method was performed. A sensitivity study was also performed.
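
The RMSE metric referred to above can be sketched as follows. This is a minimal Python illustration rather than the authors' evaluation code: the function names and the sample values are hypothetical, and min-max scaling is an assumption (the text mentions scaling but does not name the method). It shows why RMSE values well below 1 are plausible when the error is measured on a scaled series rather than in physical units.

```python
import math
from typing import List, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two non-empty sequences of equal length."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def min_max_scale(values: Sequence[float], low: float, high: float) -> List[float]:
    """Affine rescaling so that `low` maps to 0 and `high` maps to 1."""
    return [(v - low) / (high - low) for v in values]


# Hypothetical PBLH values, for illustration only.
actual = [200.0, 400.0, 800.0, 1200.0]
predicted = [220.0, 380.0, 830.0, 1150.0]

# Fit the scaling bounds on the actual series and apply them to both series.
low, high = min(actual), max(actual)
raw_error = rmse(actual, predicted)
scaled_error = rmse(
    min_max_scale(actual, low, high),
    min_max_scale(predicted, low, high),
)

print(f"RMSE in original units: {raw_error:.2f}")
print(f"RMSE on scaled series:  {scaled_error:.4f}")
```

Because the same affine scaling is applied to both series, the scaled RMSE equals the unscaled RMSE divided by the range (high minus low), so the two numbers describe the same fit at different scales.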

Multi-Location Results
Figure 5: Predicted PBLH for 20 locations - train (orange) and test (green)

   An RMSE of 0.11 was achieved for the test data. The results are plotted as a
comparison of predicted data vs. original data in Figure 5. The blue line shows the
original data. The orange line shows the predicted PBLH for the train data and the
green line shows the predicted PBLH for the test data. Upon close observation, the
bare blue line can be seen just before the orange and green lines; these segments
are the initial 100 instances used to start the prediction for the train and test
data.

Single Location Results
The single location experiment was performed for three (randomly chosen) locations,
(44.2062,-63.1424), (2.41753,-119.82883), and (35.3382,-90.31128), with RMSE values
of 0.04, 0.01, and 0.05 achieved. To properly evaluate the LSTM method for the
single location experiment, the result of the LSTM prediction for
(35.3382,-90.31128) was compared with the prediction from a linear regression model
for the same location. The results of the linear regression model and the LSTM model
are shown in Figure 6b and 6c. In Figure 6c, the blue trend is almost unidentifiable
for both the train prediction and the test prediction, as can be seen from the green
trend (predicted test), which overlaps the blue trend (actual data). This implies
that the prediction closely matches the expected PBLH. The linear regression model
resulted in an RMSE of 0.05, in comparison with the LSTM result of 0.05.

Figure 6: (a) Hourly WRF-Chem PBLH for Single Location (35.3382,-90.31128)
(b) Linear Regression for Single Location (35.3382,-90.31128) Hourly Predicted PBLH
(c) LSTM for Single Location (35.3382,-90.31128) Hourly Predicted PBLH

Figure 7: Correlation Study - Mean-Subtracted Results Comparing the WRF-Chem Model
PBLH with the LSTM Predicted PBLH for Location (35.3382,-90.31128).

   To evaluate the predictions further, a correlation study was performed on the
test data from WRF-Chem and the predicted data from the LSTM. The mean of the
WRF-Chem model-generated test data was subtracted from the test data, and the mean
of the LSTM model predicted data was subtracted from the predicted data. The results
were then plotted in Figure 7. The results from this correlation study show strong
correlation between the true PBLH of the test data
and the predicted PBLH from the LSTM model.

                           Conclusions and Future Work
In this study, we present a stacked LSTM model as a prediction tool for tracking
temporal changes in planetary boundary layer height (PBLH). We trained the LSTM
model on two data sets generated from the WRF-Chem model for a selection of
locations. We showed the model’s inference performance on a held-out test subset of
the data and provided a visualization of the results. In this work we show the
promise of using LSTM networks for spatio-temporal time series PBLH forecasting. In
our future work, we aim to design a multivariate LSTM network to perform
simultaneous PBLH forecasting for multiple locations with a single network.

                           Acknowledgments
This work has been funded by the following grants: NASA grant
NNH16ZDA001-AIST16-0091 and NSF CARTA grant 17747724.

                           References
Caicedo, V.; Rappenglück, B.; Lefer, B.; Morris, G.; Toledo, D.; and Delgado, R.
2017. Comparison of aerosol lidar retrieval methods for boundary layer height
detection using ceilometer aerosol backscatter data. Atmospheric Measurement
Techniques 10(4).
Delgado, R.; Caicedo, V.; Demoz, B.; Szykman, J.; Sakai, R.; Hicks, M.; Posey, J.;
Atkinson, D.; and Kironji, I. 2018. Ad-Hoc Ceilometer Evaluation Study (ACES):
Lidar/Ceilometer Mixing Layer Heights and Network. In AGU Fall Meeting Abstracts.
Fente, D. N.; and Singh, D. K. 2018. Weather Forecasting Using Artificial Neural
Network. 2018 Second International Conference on Inventive Communication and
Computational Technologies (ICICCT) 1757–1761.
Gayathiri Kathiresan, Krishna Mohanta, K. V. A. 2019. FORETELL: Forecasting
Environmental Data Through Enhanced LSTM and L1 Regularization. International
Journal of Recent Technology and Engineering (IJRTE) 7.
Greff, K.; Srivastava, R.; Koutník, J.; Steunebrink, B.; and Schmidhuber, J. 2017.
LSTM: A Search Space Odyssey. IEEE Transactions on Neural Networks and Learning
Systems 28: 2222–2232.
Grell, G. A.; Peckham, S. E.; Schmitz, R.; McKeen, S. A.; Frost, G.; Skamarock,
W. C.; and Eder, B. 2005. Fully coupled “online” chemistry within the WRF model.
Atmospheric Environment 39(37): 6957–6975.
Hewage, P. R. P. G.; Behera, A.; Trovati, M.; and Pereira, E. 2019. Long-Short Term
Memory for an Effective Short-Term Weather Forecasting Model Using Surface Weather
Data. In AIAI.
Hong, S.-Y.; and Lim, J.-O. J. 2006. The WRF single-moment 6-class microphysics
scheme (WSM6). Asia-Pacific Journal of Atmospheric Sciences 42(2): 129–151.
Hong, S.-Y.; and Pan, H.-L. 1996. Nonlocal boundary layer vertical diffusion in a
medium-range forecast model. Monthly Weather Review 124(10): 2322–2339.
Iacono, M. J.; Delamere, J. S.; Mlawer, E. J.; Shephard, M. W.; Clough, S. A.; and
Collins, W. D. 2008. Radiative forcing by long-lived greenhouse gases: Calculations
with the AER radiative transfer models. Journal of Geophysical Research: Atmospheres
113(D13).
Karevan, Z.; and Suykens, J. 2018. Spatio-temporal Stacked LSTM for Temperature
Prediction in Weather Forecasting. ArXiv abs/1811.06341.
López, M. J. T.; Archilla, Y. B.; and Quintana, P. J. V. 2020. Individual assessment
procedure and its tools for PBL teamwork. The International Journal of Engineering
Education 36(1): 352–364.
Peckham, S. E. 2012. WRF/Chem version 3.3 user’s guide.
Poornima, S.; and Pushpalatha, M. 2019. Prediction of Rainfall Using Intensified
LSTM Based Recurrent Neural Network with Weighted Linear Units. Atmosphere 10: 668.
Samad, A.; Bhagyanidhi; Gautam, V.; Jain, P.; Sangeeta; and Sarkar, K. 2020. An
Approach for Rainfall Prediction Using Long Short Term Memory Neural Network. 2020
IEEE 5th International Conference on Computing Communication and Automation (ICCCA)
190–195.
Sleeman, J.; Halem, M.; Caicedo, V.; Demoz, B.; Delgado, R. M.; et al. 2020. A Deep
Machine Learning Approach for LIDAR Based Boundary Layer Height Detection. In IEEE
International Geoscience and Remote Sensing Symposium.
Stull, R. B. 1988. Mean boundary layer characteristics. In An Introduction to
Boundary Layer Meteorology, 1–27. Springer.
Zaveri, R. A.; Easter, R. C.; Fast, J. D.; and Peters, L. K. 2008. Model for
simulating aerosol interactions and chemistry (MOSAIC). Journal of Geophysical
Research: Atmospheres 113(D13).
Zaveri, R. A.; and Peters, L. K. 1999. A new lumped structure photochemical
mechanism for large-scale applications. Journal of Geophysical Research: Atmospheres
104(D23): 30387–30415.
Zaytar, A.; and Amrani, C. E. 2016. Sequence to Sequence Weather Forecasting with
Long Short-Term Memory Recurrent Neural Networks. International Journal of Computer
Applications 143: 7–11.