Deep learning-based fast solver of the shallow water equations Mojtaba Forghani1 , Yizhou Qian2 , Jonghyun Lee3 , Matthew W. Farthing4 , Tyler Hesser4 , Peter K. Kitanidis2,5 , and Eric F. Darve1,2 1 Department of Mechanical Engineering, Stanford University, CA 2 Institute for Computational and Mathematical Engineering, Stanford University, CA 3 Department of Civil and Environmental Engineering, University of Hawaii at Manoa, Honolulu, HI 4 U.S. Army Engineer Research and Development Center, Vicksburg, MS 5 Department of Civil and Environmental Engineering, Stanford University, CA 1 {mojtaba, darve}@stanford.edu, 2 {yzqian, peterk}@stanford.edu, 3 jonghyun.harry.lee@hawaii.edu, 4 {Matthew.W.Farthing, Tyler.Hesser}@erdc.dren.mil Abstract conditions (BCs), such as the discharge and the free-surface elevation, as well as the bathymetry, we require an accurate Fast and reliable prediction of river flow velocities is im- portant in many applications, including flood risk manage- predictor of the flow velocities given the bathymetry and the ment. The shallow water equations (SWEs) are commonly BCs. The shallow water equations (SWEs) are typically used used for this purpose. However, traditional numerical solvers to solve this problem (Landon et al. 2014). However, current of the SWEs are computationally expensive and require high- numerical solvers of the SWEs are computationally expen- resolution riverbed profile measurement (bathymetry). In this sive. This is a major shortcoming of these methods, since work, we propose a two-stage process in which, first, using BCs in rivers can vary widely and thus having a “fast online the principal component geostatistical approach (PCGA) we predictor” of the flow velocities is very important, in particu- estimate the probability density function of the bathymetry lar, in situations when a range of conditions need to be eval- from flow velocity measurements, and then use machine uated quickly to address questions related to navigability or learning (ML) algorithms to obtain a fast solver for the to asses the risk of flooding. Furthermore, these solvers typi- SWEs. The fast solver uses realizations from the posterior bathymetry distribution and takes as input the prescribed cally require a fairly high resolution image of the bathymetry range of BCs. The first stage allows us to predict flow veloc- as simulation input. However, direct high-resolution bathy- ities without direct measurement of the bathymetry. Further- metric surveys (Casas et al. 2006) are time consuming and more, we augment the bathymetry posterior distribution to a costly for long river reaches. more general class of distributions before providing them as In this work, we propose a two-stage process in which, inputs to ML algorithm in the second stage. This allows the first, the river bathymetry at a site of interest is esti- solver to incorporate future direct bathymetry measurements mated using the principal component geostatistical approach into the flow velocity prediction for improved accuracy, even (PCGA) (Lee and Kitanidis 2014; Kitanidis and Lee 2014) if the bathymetry changes over time compared to its original from velocity measurements (thus addressing the issue of indirect estimation. We propose and benchmark three differ- ent solvers, referred to as PCA-DNN (principal component not having access to direct bathymetry measurement), and analysis-deep neural network), SE (supervised encoder), and then the distribution of estimated bathymetry is augmented SVE (supervised variational encoder), and validate them on to a more general distribution and combined with differ- the Savannah river, Augusta, GA. Our results show that the ent BCs to obtain a fast solver of the SWEs (thus address- fast solvers are capable of predicting flow velocities for dif- ing the high computational cost of numerical solvers). Fig- ferent bathymetry and BCs with good accuracy, at a computa- ure 1 shows the steps in the proposed approach schemati- tional cost that is significantly lower than the cost of solving cally. Note that our solver is capable of taking either directly the full boundary value problem with traditional methods. measured bathymetry or its estimated distribution as inputs. Thus, the purpose of the posterior augmentation stage is to Introduction allow our solver to include a more general class of bathyme- Estimation of riverine flow velocities is important in many tries into their prediction capability, for instance, when new practical applications such as the study of river morphody- direct bathymetry measurements become available and they namics, safe and efficient maritime transportation, and flood have changed over time compared to their original indirect risk management (Zolezzi and Seminarao 2001; Lanzoni estimation due to sediment deposition or erosion. et al. 2006; Casas et al. 2006; Westaway, Lane, and Hicks 2000; Lane, Richards, and Chandler 1994). In order to accu- Methods rately estimate flow velocities under user specified boundary In this work, we use three different deep learning meth- Copyright © 2021for this paper by its authors. Use permitted under ods as fast SWE solvers. These methods, shortly, are re- Creative Commons License Attribution 4.0 International (CC BY ferred to as PCA-DNN (principal components analysis-deep 4.0). neural network), SE (supervised encoder), and SVE (super- Figure 1: The schematic of the development of the forward solver. First, we estimate the posterior distribution of the bathymetry via PCGA, then augment this distribution to a more general distribution and use AdH to generate veloci- ties. Finally, the DNNs are trained with these data, which will be used as fast forward solvers. vised variational encoder). The schematic of these meth- Figure 2: Schematic of the PCA-DNN, SE, and SVE. ods are shown in fig. 2. The PCA-DNN method consists of first, a low-rank approximation of data via PCA-based linear projection, and then applying DNN to the reduced- tion (the posterior distribution). dimension data (Ghorbanidehno et al. 2020). SE is sim- While the PCGA posterior distribution provides a reason- ilar to an autoencoder (AE) (Kramer 1991), except, it is able estimate of the uncertainty associated with the currently used for supervised learning. In SE architectures, a high- available dataset, we also consider an additional augmenta- dimensional input (bathymetry) is fed as the input to the tion of the training data in order to broaden the range of network, where its dimension is reduced via a convolutional bathymetries for which the proposed forward solvers are neural network (CNN), then it is combined with the BCs valid. For instance, when new direct bathymetry measure- (with two elements: discharge and the free-surface eleva- ment becomes available and it has changed over time. To tion), passes through a fully connected network, and finally perform the augmentation, the synthetic data that are fed to is augmented to the high dimensional output (the velocity) the DNN architectures are generated by adding a Gaussian via another CNN. SVE is also similar to a variational au- kernel of the following form to the PCGA estimation: toencoder (VAE) (Kingma and Welling 2019), but it is used ∆x2 ∆y 2   in supervised learning. The SVE has a similar structure as 2 SE, except the middle layer which defines a random vari- cov(x, y) = β exp − 2 − 2 (1) lx ly able based on a multivariate normal distribution. Here, β = 1.2 m, lx = 115 m, and ly = 29 m (x is the along- Data preparation river direction while y is the across-river direction). We then add a scaling factor to generated bathymetries that shrinks The first step in the data preparation process is applying the variations near the shore, in order to capture the fact that PCGA to flow velocity observations taken from the river in the variations of the generated bathymetries near the shore order to obtain an estimation of the bathymetry in the area are generally smaller than in the middle of the river. We also of interest. In the following, we refer to this as the PCGA generate BC samples, extracted from the United States Ge- posterior distribution. Here, we have applied PCGA to the ological Survey (USGS) gauge data of Savannah river, and roughly one mile reach of the Savannah river, Augusta, GA. provide them, along with the bathymetries sampled from the The flow velocity measurements in this section are generated augmented distribution, to AdH, in order to obtain the flow synthetically via a numerical solver, referred to as Adaptive velocities. The bathymetry/BC/flow velocity datasets are fed Hydraulics (AdH) SWEs module (Savant et al. 2010), by to the DNNs to obtain forward solvers. first calculating the flow velocities corresponding to the ref- erence bathymetry of the Savannah river and then applying Performance in the presence of full Gaussian noise with a standard deviation equal to 10% of the largest simulated flow velocity, in order to ensure the syn- bathymetry measurement thetically generated flow velocities include the noise com- Table 1 summarizes the root mean square errors (RMSEs) monly observed in the field observations. Once the noisy in estimating flow velocity magnitudes using different meth- synthetic velocity measurements are generated, we can use ods, when full bathymetry measurements are provided as in- PCGA (Lee and Kitanidis 2014) to obtain an estimation of puts. A total of 4,000 river profiles (dataset size) have been the bathymetry. This estimation is in the form of a distribu- used as the training set, 500 profiles for the validation set, and 450 profiles for the test set. In order to have a fair com- associated level of uncertainty. Figure 4 shows the reference parison between different methods, we used the same la- mean and standard deviation of flow velocities in the east- tent space dimension of 50 in all methods (Forghani et al. ing direction obtained from AdH as well as the predicted 2021). The errors in table 1 for SVE and SE are significantly mean and standard deviations obtained from the SE, respec- lower than PCA-DNN, indicating that the non-linear dimen- tively. The BCs for the simulations are zf = 33.9 m and sion reduction contained in SVE and SE is more accurate Q = 651.2 m3 /s. The results are based on first, generating than a linear, PCA-based approach. Table 2 summarizes the 100 bathymetries directly from the PCGA posterior distri- hyperparameters used in different solvers. The table shows bution, and then providing these profiles as inputs to either the different parameter values used in our networks during the AdH or any of the DNNs (with the given BCs); finally, the hyperparameter tuning along with the final chosen value, the mean and standard deviation of their predicted velocities which had the best performance (shown in bold in the table). are calculated and plotted in fig. 4. Error (RMSE [m/s]) Fast forward solver PCA-DNN SE SVE Train set 0.0515 0.0269 0.0286 Validation set 0.0570 0.0374 0.0398 Test set 0.0546 0.0381 0.0398 Table 1: Comparison between the error of different solvers when predicting the magnitude of the flow velocity. Figure 3 compares the performance of different methods when predicting the flow velocity magnitude of one of the Figure 4: Predicted mean and standard deviation of veloc- members of the test dataset with BC values of free-surface ities for different solvers at zf = 33.9 m and Q = 651.2 elevation zf = 29.9 m and discharge Q = 146.1 m3 /s. We m3 /s. The “reference” corresponds to the AdH prediction observe that SE and SVE perform better than PCA-DNN, when bathymetries are generated from the PCGA posterior consistent with the result of table 1. This could be due to distribution. the linear dimension reduction technique being used in this approach, which fails to capture non-linear features present We observe that the solver has been successful in finding in the data with 50 principal components (PCs). the mean and the uncertainty. The great accuracy in fig. 4 im- plies that even when indirect observations are available, we can use the same solvers, which are trained on the estimated bathymetry distribution from PCGA with augmentation, to predict the distribution of flow velocities as the BCs change. Conclusion In this work, we have presented a framework for fast predic- tion of the riverine flow velocities with user specified BCs and bathymetries. The training of all the presented methods can be performed on common personal computers without access to GPU and high-performance computing resources. Figure 3: Examples of the error in the prediction of the ve- More importantly, once the networks are trained, the pre- locity magnitudes for different solvers for zf = 29.9 m and dictions can be done in a few seconds, making online flow Q = 146.1 m3 /s. SE and SVE outperform PCA-DNN. velocity estimations possible. Our results show that the com- putational efficiency is about three orders of magnitude faster than standard SWE solvers such as AdH. The combination of PCGA and our fast solvers provides Performance in the presence of uncertain a valuable tool that can be used even when the riverine bathymetry bathymetry profiles are not a-priori available. That is, we The results presented in the previous section provide in- do not need to measure riverbed profiles when training the formative evaluation metrics of different algorithms as for- network and designing the fast, reduced-order solver (of- ward solvers, that is, flow velocity predictors provided with fline stage). More importantly, even for the online prediction bathymetry and BCs assuming the reference (true) bathyme- stage, we can predict distribution of flow velocities from the tries are known completely. In practice, however, there are posterior distribution of the PCGA, without access to up- many situations in which we do not have access to direct dated bathymetry observations. measurement of bathymetries, and all of our information While all of the presented solvers are capable of provid- must come from the solution of an inverse problem with an ing reasonable prediction of the flow velocities, the better Fast forward solver DNN hyperparameter PCA-DNN SE SVE Type of layers Fully connected Convolutional Convolutional Batch normalization {yes, no} {yes, no} {yes, no} Number of hidden layers {1,2,3,4,5,6} {4,6} {4,6} Data normalization {yes, no} {yes, no} {yes, no} Act. func. (hidden layer) {tanh, ReLU} {tanh, ReLU} {tanh, ReLU} Act. func. (output layer) {linear, Sigmoid} {linear, Sigmoid} {linear, Sigmoid} Batch size {8,32,256,full} {8,32,256,full} {8,32,256,full} Learning rate {0.01,0.001,10−4 } {0.01,0.001,10−4 } {0.01,0.001,10−4 } Reg. coeff. (easting) {0,0.00001,0.0001, {0,0.00001,0.0001, {0,0.00001,0.0001, 0.001,0.01,0.1,1} 0.001,0.01,0.1,1} 0.001,0.01,0.1,1} Reg. coeff. (northing) {0,0.00001,0.0001, {0,0.00001,0.0001, {0,0.00001,0.0001, 0.001,0.01,0.1,1} 0.001,0.01,0.1,1} 0.001,0.01,0.1,1} Table 2: The hyperparameters used in different solvers. The parameters in bold are the final values used in networks with the best performances. Act. func. is the activation function and reg. coeff. is the regularization coefficient. performance of SE and SVE methods implies that there are Kitanidis, P. K.; and Lee, J. 2014. Principal component geo- non-linear features present in the data that linear or partially- statistical approach for large dimensional inverse problems. linear models such as PCA-DNN may not be able to capture Water Resources Research 50: 5428–5443. accurately within the available computational limitations. Kramer, M. A. 1991. Nonlinear principal component anal- Acknowledgments. This research was supported by the ysis using autoassociative neural networks. AIChE Journal U.S. Department of Energy, Office of Advanced Scientific 37: 233–243. Computing Research under the Collaboratory on Mathemat- Landon, C.; Wilson, G. W.; Ozkan-Haller, H. T.; and ics and Physics-Informed Learning Machines for Multiscale MacMahan, J. H. 2014. Bathymetry estimation using drifter- and Multiphysics Problems (PhILMs) project, PhILMS based velocity measurements on the Kootenai river, Idaho. grant DE-SC0019453. This work was also supported by Journal of Atmospheric and Oceanic Technology 31: 503– an appointment to the Faculty and Postdoctoral Fellow Re- 514. search Participation Program at the U.S. Engineer Research Lane, S.; Richards, K.; and Chandler, J. 1994. Developments and Development Center, Coastal and Hydraulics Labora- in monitoring and modelling small-scale river bed topogra- tory administered by the Oak Ridge Institute for Science and phy. Earth Surface Processes and Landforms 19: 349–368. Education through an inter-agency agreement between the U.S. Department of Energy and ERDC. The Chief of Engi- Lanzoni, S.; Siviglia, A.; Frascati, A.; and Seminara, G. neers has granted permission for this publication. 2006. Long waves in erodible channels and morphodynamic influence. Water Resources Research 42: W06D17. References Lee, J.; and Kitanidis, P. K. 2014. Large-scale hydraulic to- mography and joint inversion of head and tracer data using Casas, A.; Benito, G.; Thorndycraft, V.; and Rico, M. 2006. the principal component geostatistical approach (PCGA). The topographic data source of digital terrain models as a Water Resources Research 50: 5410–5427. key element in the accuracy of hydraulic flood modelling. Earth Surface Processes and Landforms: The Journal of the Savant, G.; Berger, C.; McAlpin, T. O.; and Tate, J. N. British Geomorphological Research Group 31: 444–456. 2010. Efficient implicit finite-element hydrodynamic model for dam and levee breach. Journal of Hydraulic Engineering Forghani, M.; Qian, Y.; Lee, J.; Farthing, M. W.; Hesser, 137: 1005–1018. T.; Kitanidis, P. K.; and Darve, E. F. 2021. Application of deep learning to large scale riverine flow velocity estima- Westaway, R.; Lane, S.; and Hicks, D. 2000. The devel- tion. Stochastic Environmental Research and Risk Assess- opment of an automated correction procedure for digital ment doi:https://doi.org/10.1007/s00477-021-01988-0. photogrammetry for the study of wide, shallow, gravel-bed rivers. Earth Surface Processes and Landforms 209–226. Ghorbanidehno, H.; Lee, J.; Farthing, M. W.; Hesser, T. J.; Zolezzi, G.; and Seminarao, G. 2001. Downstream and up- Kitanidis, P. K.; and Darve, E. F. 2020. Deep learning tech- stream influence in river meandering. Part 1. General theory nique for fast inference of large-scale riverine bathymetry. and application to overdeepening. Journal of Fluid Mechan- Advances in Water Resources 103715. ics 438: 183–211. Kingma, D. P.; and Welling, M. 2019. An introduction to variational autoencoders. Foundations and Trends in Ma- chine Learning 12: 307–392.