Reduced-order Model for Fluid Flows via Neural Ordinary Differential Equations

Carlos J. G. Rojas, Andreas Dengel, Mateus Dias Ribeiro
German Research Center for Artificial Intelligence (DFKI)
carlos.gonzalez_rojas@dfki.de, andreas.dengel@dfki.de, mateus.dias_ribeiro@dfki.de

Abstract

Reduced order models play an important role in the design, optimization and control of dynamical systems. In recent years, there has been increasing interest in the application of data-driven techniques for model reduction that can decrease the computational burden of numerical solutions while preserving the most important features of complex physical problems. In this paper, we use the proper orthogonal decomposition to reduce the dimensionality of the model and introduce a novel generative neural ODE (NODE) architecture to forecast the behavior of the temporal coefficients. With this methodology, we replace the classical Galerkin projection with an architecture characterized by the use of a continuous latent space. We exemplify the methodology on the dynamics of the Von Karman vortex street of the flow past a cylinder generated by a Large-eddy Simulation (LES)-based code. We compare the NODE methodology with an LSTM baseline to assess the extrapolation capabilities of the generative model and present some qualitative evaluations of the flow reconstructions.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Introduction

Modeling and simulation of dynamical systems are essential tools in the study of complex phenomena, with applications in chemistry, biology, physics and engineering, among other relevant fields. These tools are particularly useful in the control and design of parametrized systems, in which the dependence on properties, initial conditions and other configurations requires multiple evaluations of the system response. However, there are limitations when performing numerical simulations of systems where nonlinearities and a wide range of spatial and temporal scales lead to unmanageable demands on computational resources. This is the case for engineering fluid flow problems, where the range of scales involved increases with the Reynolds number and the cost of simulating a full-order model (FOM) using techniques such as DNS or LES is very high. One possible way to reduce this expensive computational cost is to introduce an alternative, cheaper and faster representation that retains the characteristics provided by the FOM without sacrificing the accuracy of the general physical behaviour. The idea is to construct a methodology able to generalize the physical behaviour to unseen parameters and to extrapolate forward in time using a minimal amount of full-order simulations (Benner, Gugercin, and Willcox 2015).

Projection-based reduced order modeling is one of the most popular approaches to construct surrogate models of dynamical systems. This framework reduces the degrees of freedom of the numerical simulation using a transformation into a suitable low-dimensional subspace. Then, the state variable in the governing equations is rewritten in terms of the reduced subspace, and finally the PDEs are converted into a system of ODEs that can be solved using classical numerical techniques (Benner, Gugercin, and Willcox 2015). In the field of fluid mechanics, the Proper Orthogonal Decomposition (POD) method is widely applied for the dimensionality reduction of the FOM, and the Galerkin method is used for the projection onto the governing equations. These methodologies are preferred because an orthonormal basis simplifies the complexity of the projected mathematical operators, and the truncated basis of the POD is optimal in the least squares sense, retaining the dominant behaviour through the most energetic modes. The projection onto the governing equations maintains the physical structure of the model, but the truncation of the modes can affect the accuracy of the results in nonlinear systems, and it may also be restricted to stationary and time-periodic problems. Furthermore, the projection is intrusive, requiring different settings for each problem, and it is limited to explicit and closed definitions of the mathematical models (San, Maulik, and Ahmed 2019). Some of these problems have been addressed with the search for closure models that compensate for the information losses produced by the truncated modes (Mou et al. 2020; Mohebujjaman, Rebholz, and Iliescu 2019; San and Maulik 2018b,a) and with the construction of a data-driven reduced "basis" that also provides optimality after the time evolution (Murata, Fukami, and Fukagata 2020; Liu et al. 2019; Wang et al. 2016).

We present an alternative methodology to evolve the dynamics of the system in the reduced space using a data-driven approach. We use the POD to compute the modes and the temporal coefficients of a fluid flow simulation, and then we apply an autoencoder architecture to learn the dynamics of a latent space. The addition of a neural ODE (Chen et al. 2019; Rubanova, Chen, and Duvenaud 2019) block in the middle of the autoencoder model provides a continuous learning block that is encoded using a feed-forward neural network and that can be solved numerically to determine the future states of the input variables. Several works have proposed machine learning models to replace the Galerkin projection step or to improve its capabilities, and different architectures such as feed-forward or recurrent networks have been applied with good performance in academic and practical fluid flow problems (Pawar et al. 2019; Imtiaz and Akhtar 2020; Eivazi et al. 2020; Lui and Wolf 2019; Portwood et al. 2019; Maulik et al. 2020a,b). The main advantage of the neural ODE generative model is that the learning is posed as a self-supervised task using a continuous representation of the physical behavior. In our view, the neural ODE block can be interpreted as an implicit differential operator that is not restricted to a specific differential equation. This setting provides more flexibility than the projection onto the governing equations because it addresses the learning problem with an operator that is informed and corrected by the training data.
Methodology

In this work we use a Large-eddy Simulation (LES) model to approximate the behavior of the fluid flow dynamical system. As is the case in many fluid flow problems, the discrete solution has a spatial dimension larger than the size of the temporal domain. For this reason, we apply the snapshot POD to construct the reduced order model and keep the computation tractable. The POD finds a new basis representation that maximizes the variance in the data and has the minimum reconstruction error in a least squares sense. In addition, the dimensionality reduction is easily performed because the components of the new basis are ordered by their contribution to the recovery of the data.

The main block of the Neural ODE-ROM methodology is concerned with the forecast of the temporal coefficients provided by the snapshot POD. Here we apply the latent ODE (Chen et al. 2019; Rubanova, Chen, and Duvenaud 2019), a generative neural ODE model that takes the temporal coefficients, learns their dynamical evolution and provides an adequate model to extrapolate to the desired time steps. Finally, we can forecast the evolution of the temporal coefficients and reconstruct the behavior of the flow with the spatio-temporal expansion used in the POD. The general methodology is represented in Fig. 1, and more details about each of the building blocks are presented in the following sections.

Figure 1: POD-NeuralODE ROM methodology.

LES Model For The Flow Past a Cylinder

The dynamics of the Von Karman vortex street of the flow past a cylinder were solved with the LES filtered governing equations for the balance of mass (1) and momentum (2), which can be written as follows:

$$\frac{\partial \bar{\rho}}{\partial t} + \frac{\partial (\bar{\rho}\tilde{u}_i)}{\partial x_i} = 0 \qquad (1)$$

$$\frac{\partial (\bar{\rho}\tilde{u}_i)}{\partial t} + \frac{\partial (\bar{\rho}\tilde{u}_i\tilde{u}_j)}{\partial x_j} = \frac{\partial}{\partial x_j}\left[\bar{\rho}\bar{\nu}\left(\frac{\partial \tilde{u}_j}{\partial x_i} + \frac{\partial \tilde{u}_i}{\partial x_j}\right) - \frac{2}{3}\,\bar{\rho}\bar{\nu}\,\frac{\partial \tilde{u}_k}{\partial x_k}\,\delta_{ij} - \bar{\rho}\tau_{ij}^{\mathrm{sgs}}\right] - \frac{\partial \bar{p}}{\partial x_i} + \bar{\rho} g_i \qquad (2)$$

In the previous equations, u represents the velocity, ρ the fluid density, and ν the kinematic viscosity. These equations are solved numerically using the PIMPLE algorithm (Weller et al. 1998), which is a combination of PISO (Pressure Implicit with Splitting of Operators) by Issa (1986) and SIMPLE (Semi-Implicit Method for Pressure-Linked Equations) by Patankar (1980). This approach obtains the transient solution of the coupled velocity and pressure fields by applying the SIMPLE (steady-state) procedure for every single time step. Once converged, a numerical time integration scheme (e.g. backward) and the PISO procedure are used to advance in time until the simulation is complete. Furthermore, the unresolved subgrid stresses τ_ij^sgs are modeled in terms of the subgrid-scale eddy viscosity ν_T using the dynamic k-equation approach by Kim and Menon (1995).

The setup of the problem is described as follows. The computational domain comprises a 2D channel with 760 mm in the stream-wise direction and 260 mm in the direction perpendicular to the flow. The cylinder is located between the upper and bottom walls of the channel, 115 mm away from the inlet (left wall). A constant inlet velocity of 0.6 m/s with random radial/vertical fluctuations, in combination with a zero-gradient outflow condition and no-slip conditions on the top/bottom/cylinder walls, is imposed as boundary conditions. Furthermore, a laminar kinematic viscosity of 1 × 10⁻⁴ m²/s and a cylinder diameter of 40 mm further characterize the flow, with a Reynolds number of 240 (Re = 0.6 × 0.04 / 1 × 10⁻⁴ = 240). The central differencing scheme (CDS) was used for the discretization of both the convective and diffusive terms of the momentum equation, together with an implicit backward scheme for time integration. A snapshot of the velocity components in both radial and axial directions at time t = 100 is shown in Fig. 2.

Figure 2: Snapshot of the flow field at t = 100.
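The Reynolds number quoted for this setup follows directly from the stated inlet velocity, cylinder diameter, and viscosity. As a quick sanity check (a sketch; the variable names are ours):

```python
# Sanity check of the Reynolds number for the cylinder flow setup.
# Values taken from the setup description above; names are illustrative.
U = 0.6     # inlet velocity magnitude [m/s]
D = 0.04    # cylinder diameter [m]
nu = 1e-4   # laminar kinematic viscosity [m^2/s]

Re = U * D / nu
print(f"Re = {Re:.0f}")  # Re = 240
```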
Proper Orthogonal Decomposition

The proper orthogonal decomposition (POD) is known under a variety of names, such as the Karhunen-Loeve expansion, the Hotelling transform and principal component analysis (Liang et al. 2002). In addition, one can perform the POD by defining a linear autoencoder and setting the loss function to the mean squared error. This tool was developed in the field of probability theory to discover interdependencies within vector data and was introduced to the fluid mechanics community by Berkooz, Holmes, and Lumley (1993). Once the interdependencies in the data are discovered, it is possible to reduce its dimensionality.

The formulation of the dimensionality reduction starts with some samples of observations provided by experimental results or obtained through the numerical solution of a full order model that characterizes the physical problem. These samples are rearranged in an ensemble matrix of snapshots Y, where each row holds the state of the dynamical system at a given time step. Then, the correlation matrix of the elements of Y is computed, and its eigenvectors are used as an orthogonal optimal new basis for the reduced space.

In the following list we summarize the main steps used for the construction of the snapshot POD:

• Take snapshots: simulate the dynamical system and sample its state u as it evolves.

• Compute the fluctuating components of the velocity u′ using the Reynolds decomposition of the flow:

$$u = \bar{u} + u', \qquad (3)$$

where ū is the temporal mean of the solutions given by the FOM.

• Assemble the matrix Y with the snapshots in the following form:

$$Y = \begin{bmatrix} u'_x(x_1, y_1, t_1) & \cdots & u'_y(x_{N_x}, y_{N_y}, t_1) \\ u'_x(x_1, y_1, t_2) & \cdots & u'_y(x_{N_x}, y_{N_y}, t_2) \\ \vdots & & \vdots \\ u'_x(x_1, y_1, t_{N_t}) & \cdots & u'_y(x_{N_x}, y_{N_y}, t_{N_t}) \end{bmatrix}$$

where each row contains a flattened array with the fluctuating components of the velocity in the x and y directions for a given time step. If the discretization used for the FOM simulation has dimensions Nx, Ny and Nt, then the flattened representation is a vector of length 2·Nx·Ny and the matrix Y has dimensions Nt × (2·Nx·Ny).

• Build the correlation matrix K and compute its eigenvectors a_j:

$$K = Y Y^\top, \qquad (4)$$

$$K a_j = \lambda_j a_j. \qquad (5)$$

Alternatively, one can directly compute the eigenvalues and eigenvectors using the singular value decomposition (SVD) of the snapshot matrix.

• Choose the reduced dimension of the model: as described in the literature, larger eigenvalues are directly related to the dominant characteristics of the dynamical system, while small eigenvalues are associated with perturbations of the dynamic behavior. The criterion to select the components of the new basis is to maximize the relative information content I(N), using the minimal number of components N necessary to achieve a desired percentage of recovery (Schilders et al. 2008):

$$I(N) = \frac{\sum_{i=1}^{N} \lambda_i}{\sum_{i=1}^{N_t} \lambda_i} \qquad (6)$$

• Finally, compute the spatial modes ψ_i(x) using the temporal coefficients in the reduced-dimensional space and the Ansatz decomposition of the POD:

$$u' \approx \sum_{i=1}^{N} \alpha_i(t)\,\psi_i(x), \qquad (7)$$

$$\psi_i(x) = \frac{1}{\sqrt{\lambda_i}} \sum_{j=1}^{N_t} \alpha_i(t_j)\, u'(t_j). \qquad (8)$$
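The steps above condense into a few lines of linear algebra. The sketch below is ours (synthetic rank-3 data, NumPy only; none of the names come from the paper's repository): it builds a mean-subtracted snapshot matrix, takes the SVD route mentioned as the alternative to the correlation matrix, selects the reduced dimension from the relative information content, and reconstructs with the Ansatz.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic snapshot matrix: Nt rows, each a flattened fluctuating field.
Nt, Ndof = 60, 500                      # Ndof plays the role of 2*Nx*Ny
t = np.linspace(0.0, 4.0 * np.pi, Nt)
base = rng.standard_normal((3, Ndof))   # three "true" spatial structures
Y = (np.sin(t)[:, None] * base[0]
     + 0.5 * np.cos(2 * t)[:, None] * base[1]
     + 0.1 * np.sin(3 * t)[:, None] * base[2])
Y = Y - Y.mean(axis=0)                  # Reynolds decomposition: subtract temporal mean

# SVD route: Y = U S Vt; the eigenvalues of K = Y Y^T are S**2.
U, S, Vt = np.linalg.svd(Y, full_matrices=False)
lam = S**2

# Relative information content I(N): smallest N reaching 99 % recovery.
I_rel = np.cumsum(lam) / lam.sum()
N = int(np.searchsorted(I_rel, 0.99) + 1)

alpha = U[:, :N] * S[:N]                # temporal coefficients alpha_i(t)
psi = Vt[:N]                            # spatial modes psi_i(x), orthonormal rows

# Ansatz: u' ~ sum_i alpha_i(t) psi_i(x)
Y_rec = alpha @ psi
print(N, float(np.linalg.norm(Y - Y_rec) / np.linalg.norm(Y)))
```

By construction, keeping N modes with at least 99 % of the eigenvalue sum bounds the relative Frobenius reconstruction error by about 10 %.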
Neural Ordinary Differential Equations

The neural ordinary differential equation methodology (Chen et al. 2019) can be interpreted as a continuous counterpart of traditional models such as recurrent or residual neural networks. In order to formulate this model, the authors drew a parallel between the classical composition of a sequence in terms of previous states and the discretization methods used to solve differential equations:

$$h_{t+1} = h_t + f(h_t, \theta). \qquad (9)$$

In the limit case of sufficiently small steps (equivalent to an increase in the number of layers), it is possible to write a continuous parametrization of the hidden state derivative:

$$\frac{dh(t)}{dt} = f(h_t, \theta), \qquad (10)$$

$$h_t = \mathrm{ODESolver}(h_0, f(h_t, \theta)). \qquad (11)$$
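Equations (9)-(11) can be made concrete with a toy dynamics function. In the sketch below (ours, not the paper's implementation; a linear map stands in for the neural network f, and an explicit step size dt stands in for the layer index), the same f drives both the residual-style discrete update and a continuous solve, and refining the discretization approaches the ODE solution:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy stand-in for the learned derivative f(h, theta): linear rotation dynamics.
theta = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])

def f(h):
    return theta @ h

h0 = np.array([1.0, 0.0])
T = 2.0

def euler_rollout(n_steps):
    """Residual-style update of Eq. (9), h_{t+1} = h_t + dt * f(h_t)."""
    h, dt = h0.copy(), T / n_steps
    for _ in range(n_steps):
        h = h + dt * f(h)
    return h

# Continuous view of Eq. (11): h_T = ODESolver(h_0, f).
h_ode = solve_ivp(lambda t, h: f(h), (0.0, T), h0,
                  rtol=1e-10, atol=1e-12).y[:, -1]

# More steps ("more layers") -> the discrete iterate converges to the ODE solution.
errs = [float(np.linalg.norm(euler_rollout(n) - h_ode)) for n in (10, 100, 1000)]
print(errs)
```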
The function f defining the parametrization of the derivative can be approximated using a neural network, and the values of the hidden states h_t at different time steps are computed using numerical ODE solvers (Chen et al. 2019).

We apply the latent ODE generative approach (Chen et al. 2019) presented in Fig. 3 to model the evolution of the temporal coefficients provided by the proper orthogonal decomposition. This approach can be interpreted as a variational autoencoder architecture with an additional neural ODE block after the sampling of the codings. This block maps the vector of the initial latent state z_t0 to a sequence of latent trajectories using the ODE numerical solver, while a neural network f(z_t, θ) learns the latent dynamics necessary to obtain a good reconstruction of the input data. The variational part of the autoencoder produces the mean μ and standard deviation σ of the initial latent variable z_t0 and adds noise to the sampling process, improving the quality of the learned features.

Figure 3: Generative VAE with Neural ODE.

After the training process, the latent trajectories are easily extrapolated by redefining the temporal bounds in the ODE solver. Among the advantages of this strategy are that it does not need an explicit formulation of the physical laws to forecast the temporal coefficients and that, in consequence, the method does not resort to projection methodologies. Furthermore, the parametrization using a neural network gives an accurate nonlinear approximation of the derivative without a predefined mathematical structure.
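The extrapolation described above amounts to integrating the learned latent dynamics past the training horizon. A minimal sketch (ours; a hand-written damped oscillator stands in for the trained latent network f(z, θ)):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Stand-in for a trained latent dynamics network f(z, theta): a damped
# oscillator in a 2-D latent space (for illustration only).
def f_latent(t, z):
    return np.array([z[1], -z[0] - 0.05 * z[1]])

z_t0 = np.array([1.0, 0.0])   # initial latent state, as produced by the encoder

# Latent trajectory over the training window t in [0, 10] ...
t_train = np.linspace(0.0, 10.0, 101)
train = solve_ivp(f_latent, (0.0, 10.0), z_t0, t_eval=t_train,
                  rtol=1e-8, atol=1e-10)

# ... and extrapolation: same dynamics, same solver, only the temporal
# bounds are redefined.
t_extra = np.linspace(0.0, 30.0, 301)
extra = solve_ivp(f_latent, (0.0, 30.0), z_t0, t_eval=t_extra,
                  rtol=1e-8, atol=1e-10)

# A decoder would map extra.y back to POD temporal coefficients.
print(train.y.shape, extra.y.shape)
```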
Results

In this section, we evaluate the performance of the generative neural ODE model in forecasting the temporal coefficients. For this assessment, we apply the proper orthogonal decomposition to 300 snapshots of simulated data obtained with the LES code and take the first 8 POD modes, achieving 99 % recovery according to the relative information content. For the deployment of the neural ODE model (NODE), we take the first 75 time steps as the training set, the following 25 time steps for the validation of the model, and the last 200 time steps as the test set. Furthermore, we employ as a baseline model an LSTM sequence-to-vector architecture as proposed by Maulik et al. (2020b), with a window size of 10 time steps and a batch size of 15 sequences. We tuned the hyperparameters of both models using a random search and chose the best configuration given the performance on the validation set. The evolution of the loss for the best model is shown in Fig. 4, and the set of hyperparameters employed is presented in Table 1.

Figure 4: Loss of the generative NODE model.

Table 1: Hyperparameters used in the models.

Model        Hyperparameter     Range           Best
Neural ODE   latent dimension   [2, 5)          2
             layers encoder     [1, 6)          4
             units encoder      [10, 50)        10
             layers node        [1, 3)          1
             units node         [10, 50)        12
             layers decoder     [1, 6)          4
             units decoder      [10, 50)        41
             learning rate      [0.001, 0.1)    0.0015
LSTM         units              [10, 60)        49
             layers             [1, 5)          1
             learning rate      [0.001, 0.1)    0.0081

The time-series prediction for the first four temporal coefficients in the test set is shown in Fig. 5. This plot presents the ground truth values of the POD time coefficients, the baseline produced using an LSTM architecture, and the predictions of the proposed generative NODE model for the first 100 time steps in the test window.

Figure 5: Reconstruction of POD temporal coefficients using NODE vs LSTM, t ∈ [100, 200].

We notice that both the baseline and the NODE model adequately learned the evolution of the two most dominant coefficients, but the performance of the NODE model is significantly better for the third and fourth time coefficients. Additionally, the quality of the LSTM predictions for the last 100 time steps of the test set deteriorates as the time steps evolve, even for α1 and α2, as seen in Fig. 6. One possible reason is that the autoregressive nature of the predictions in the LSTM model makes it prone to the accumulation of errors, as Maulik et al. pointed out in their study (Maulik et al. 2020b).

Figure 6: Reconstruction of POD temporal coefficients using NODE vs LSTM, t ∈ [200, 300].

After the training and validation process, we reconstruct the velocity fluctuating component u′_x using the Ansatz of the proper orthogonal decomposition with the temporal coefficients forecasted for the test set. Observing Fig. 7, it is possible to notice that the contour generated with the reduced order model provides an adequate recovery of the flow features, with only slight differences in some vortexes. In addition, we also present the fluctuation time history for a probe located downstream from the cylinder in Fig. 8. This figure shows in more detail how the physical response of the reduced order model gives a satisfactory approximation of the flow behavior.

Figure 7: Contours of the fluctuating component u′_x, t = 300.

Figure 8: Probe positioned behind the cylinder.

The data and code that support this study are provided at https://github.com/CarlosJose126/NeuralODE-ROM.
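The reconstruction step used in this section is a single matrix product with the POD Ansatz of Eq. (7): forecasted coefficients times spatial modes, plus the temporal mean. A sketch with made-up shapes and random stand-ins for the modes and forecasts (ours, not the repository code):

```python
import numpy as np

rng = np.random.default_rng(1)

N, Ndof, Nt_test = 8, 2000, 200   # 8 modes, flattened 2*Nx*Ny dofs, test steps

# psi: spatial modes from the POD of the training snapshots (orthonormal rows).
Q, _ = np.linalg.qr(rng.standard_normal((Ndof, N)))
psi = Q.T                          # shape (N, Ndof)

# alpha: temporal coefficients forecasted by the NODE model on the test window.
alpha = rng.standard_normal((Nt_test, N))

# Eq. (7): u'(x, t) ~ sum_i alpha_i(t) psi_i(x); adding the temporal
# mean would then recover the full velocity field u.
u_prime = alpha @ psi              # shape (Nt_test, Ndof)
print(u_prime.shape)
```

Because the modes are orthonormal, projecting the reconstructed field back onto them recovers the coefficients exactly, which is a cheap consistency check.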
Conclusions

We presented a methodology to produce reduced order models using a neural ODE generative architecture for the evolution of the temporal coefficients. Within this approach, we employ a linear autoencoder (POD) to produce the dimensionality reduction of the model, and after that, we apply a nonlinear variational autoencoder to learn the evolution of the temporal coefficients. Although the data is compressed in both autoencoders, the motivations for the dimensionality reductions are different.
The POD provides an interpretable reduced space used for the decomposition and reconstruction of the full order model, while the VAE latent space avoids a trivial copy of the time coefficients.

The neural ODE model was able to learn the hidden dynamics of the temporal coefficients appropriately, without the propagation of errors common in autoregressive architectures. Another advantage of this methodology is that learning is posed as a self-supervised task, i.e., the outputs are "self-generated" as equal to the inputs, without splitting the entire sequence into smaller training windows with labels. We also remark that the continuous nature of the neural ODE block is crucial for the good extrapolation capabilities of the methodology. Finally, we expect to test the capabilities of this methodology on other physical problems and to extend the method to parametric dynamical systems.

References

Benner, P.; Gugercin, S.; and Willcox, K. 2015. A Survey of Projection-Based Model Reduction Methods for Parametric Dynamical Systems. SIAM Review 57(4): 483–531.

Berkooz, G.; Holmes, P.; and Lumley, J. L. 1993. The Proper Orthogonal Decomposition in the Analysis of Turbulent Flows. Annual Review of Fluid Mechanics 25(1): 539–575.

Chen, R. T. Q.; Rubanova, Y.; Bettencourt, J.; and Duvenaud, D. 2019. Neural Ordinary Differential Equations. arXiv:1806.07366 [cs, stat].

Eivazi, H.; Veisi, H.; Naderi, M. H.; and Esfahanian, V. 2020. Deep neural networks for nonlinear model order reduction of unsteady flows. Physics of Fluids 32(10): 105104.

Imtiaz, H.; and Akhtar, I. 2020. POD-based Reduced-Order Modeling in Fluid Flows using System Identification Strategy. In 2020 17th International Bhurban Conference on Applied Sciences and Technology (IBCAST), 507–512. Islamabad, Pakistan: IEEE.

Issa, R. 1986. Solution of the implicitly discretised fluid flow equations by operator-splitting. Journal of Computational Physics 62(1): 40–65.

Kim, W.-W.; and Menon, S. 1995. A new dynamic one-equation subgrid-scale model for large eddy simulations.

Liang, Y.; Lee, H.; Lim, S.; Lin, W.; Lee, K.; and Wu, C. 2002. Proper Orthogonal Decomposition and Its Applications, Part I: Theory. Journal of Sound and Vibration 252(3): 527–544.

Liu, Y.; Wang, Y.; Deng, L.; Wang, F.; Liu, F.; Lu, Y.; and Li, S. 2019. A novel in situ compression method for CFD data based on generative adversarial network. Journal of Visualization 22(1): 95–108.

Lui, H. F. S.; and Wolf, W. R. 2019. Construction of reduced-order models for fluid flows using deep feedforward neural networks. Journal of Fluid Mechanics 872: 963–994.

Maulik, R.; Fukami, K.; Ramachandra, N.; Fukagata, K.; and Taira, K. 2020a. Probabilistic neural networks for fluid flow surrogate modeling and data recovery. Physical Review Fluids 5(10): 104401.

Maulik, R.; Mohan, A.; Lusch, B.; Madireddy, S.; Balaprakash, P.; and Livescu, D. 2020b. Time-series learning of latent-space dynamics for reduced-order model closure. Physica D: Nonlinear Phenomena 405: 132368.

Mohebujjaman, M.; Rebholz, L.; and Iliescu, T. 2019. Physically constrained data-driven correction for reduced-order modeling of fluid flows. International Journal for Numerical Methods in Fluids 89(3): 103–122.

Mou, C.; Liu, H.; Wells, D. R.; and Iliescu, T. 2020. Data-driven correction reduced order models for the quasi-geostrophic equations: a numerical investigation. International Journal of Computational Fluid Dynamics 34(2): 147–159.

Murata, T.; Fukami, K.; and Fukagata, K. 2020. Nonlinear mode decomposition with convolutional neural networks for fluid dynamics. Journal of Fluid Mechanics 882: A13.

Patankar, S. V. 1980. Numerical Heat Transfer and Fluid Flow. Washington: Hemisphere Publishing Corporation.

Pawar, S.; Rahman, S. M.; Vaddireddy, H.; San, O.; Rasheed, A.; and Vedula, P. 2019. A deep learning enabler for nonintrusive reduced order modeling of fluid flows. Physics of Fluids 31(8): 085101.

Portwood, G. D.; Mitra, P. P.; Ribeiro, M. D.; Nguyen, T. M.; Nadiga, B. T.; Saenz, J. A.; Chertkov, M.; Garg, A.; Anandkumar, A.; Dengel, A.; et al. 2019. Turbulence forecasting via Neural ODE. arXiv preprint arXiv:1911.05180.

Rubanova, Y.; Chen, R. T.; and Duvenaud, D. 2019. Latent ODEs for Irregularly-Sampled Time Series. arXiv preprint arXiv:1907.03907.

San, O.; and Maulik, R. 2018a. Machine learning closures for model order reduction of thermal fluids. Applied Mathematical Modelling 60: 681–710.

San, O.; and Maulik, R. 2018b. Neural network closures for nonlinear model order reduction. Advances in Computational Mathematics 44(6): 1717–1750.

San, O.; Maulik, R.; and Ahmed, M. 2019. An artificial neural network framework for reduced order modeling of transient flows. Communications in Nonlinear Science and Numerical Simulation 77: 271–287.

Schilders, W. H. A.; van der Vorst, H. A.; and Rommes, J., eds. 2008. Model Order Reduction: Theory, Research Aspects and Applications, volume 13 of Mathematics in Industry. Berlin, Heidelberg: Springer.

Wang, M.; Li, H.-X.; Chen, X.; and Chen, Y. 2016. Deep Learning-Based Model Reduction for Distributed Parameter Systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems 46(12): 1664–1674.

Weller, H. G.; Tabor, G.; Jasak, H.; and Fureby, C. 1998. A tensorial approach to computational continuum mechanics using object-oriented techniques. Computers in Physics 12(6): 620–631.