Physics-informed neural networks for solving coupled flow and transport system

Sanghyun Lee,1 Teeratorn Kadeethum2
1 Department of Mathematics, Florida State University, 1017 Academic Way, Tallahassee, FL 32304
2 Sibley School of Mechanical and Aerospace Engineering, Cornell University, 130 Upson Hall, Ithaca, NY 14853
lee@math.fsu.edu, tk658@cornell.edu

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

In this paper, a direct application of physics-informed neural networks to the coupled flow and transport system as a forward solver is presented. We address the classical challenge of solving a coupled system with multiple variables, including the convection-dominated regime of the transport. Comparisons of the approximated solutions from the neural networks with the exact solutions are presented, together with sensitivity analyses regarding the hyper-parameters and the number of training data points.

Introduction

Since the pioneering work in 1943 on neural networks (NN) based on a brain model (McCulloch and Pitts 1943), the development of NN and deep learning (DL) in the form of deep neural networks (DNNs) has accelerated with the help of big data analytics and the advancement of powerful computational resources. Despite the numerous successes obtained with DL, limitations remain concerning applications to scientific problems. In many scientific and engineering problems, collecting a large amount of data to guarantee model accuracy is expensive and often not possible (Ahmed, Jones, and Marks 2015). In addition, the training of a DL model is based only on the available data, and no physical laws are involved during the training, which may lead to physically unreasonable predictions (Xiao et al. 2016; Wang, Wu, and Xiao 2017; Raissi, Perdikaris, and Karniadakis 2019; Kutz 2017; Wang et al. 2018; Zhang et al. 2019b).
To overcome the above limitations of NN and DL, scientific machine learning (SciML) incorporates scientific knowledge and physics-based partial differential equations (PDEs) into DL architectures. New approaches that solve PDEs with DL include the deep Ritz method (Weinan 2017; Weinan and Yu 2018), PDE-Net (Long, Lu, and Dong 2019), the deep Galerkin method (Sirignano and Spiliopoulos 2018), variational Galerkin methods (Kharazmi, Zhang, and Karniadakis 2019; Khodayi-Mehr and Zavlanos 2019), Bayesian deep convolutional networks (Zhu et al. 2019), deep domain decomposition (Li et al. 2019), theory-guided data science (Karpatne et al. 2017a), physics-guided NNs (Karpatne et al. 2017b), theory-guided NNs (Wang et al. 2020), physics-informed NNs (Raissi, Perdikaris, and Karniadakis 2019; Kadeethum, Jørgensen, and Nick 2020b,a), and others (Owhadi 2015; Jagtap, Kharazmi, and Karniadakis 2020; Lu et al. 2019; Raissi, Yazdani, and Karniadakis 2020; D'Elia et al. 2020; Kadeethum, Jørgensen, and Nick 2020b; Fraces, Papaioannou, and Tchelepi 2020). These new research ideas mitigate the requirement of a large number of training examples, and the physical laws are naturally embedded through systems of PDEs (Lu et al. 2019; Raissi, Yazdani, and Karniadakis 2020; D'Elia et al. 2020; Raissi, Perdikaris, and Karniadakis 2019; Kadeethum, Jørgensen, and Nick 2020a).

In this paper, we utilize the idea of physics-informed NNs (PINN) to solve a coupled flow and transport system in porous media (Fraces, Papaioannou, and Tchelepi 2020; Cai et al. 2020; He et al. 2020). Sensitivity analyses with respect to the hyper-parameters (the numbers of layers and neurons) and the number of training data points are discussed. Moreover, we present the performance of the PINN for solving the advection-dominated transport system when the flux is also computed by the PINN.

Computational Algorithms

In this section, we introduce the governing system and the main ingredients of PINN with a rationale based on previous studies (Raissi, Perdikaris, and Karniadakis 2019).

Governing equations

We consider an initial-boundary value problem for the coupled flow and transport problem in the computational domain Ω ⊂ R², where the time domain is denoted by T = (0, T] with a given final time T > 0. The coupled system, derived from the conservation of mass (volume), is defined as

    −∇ · (κ∇p) = f,
    v := −κ∇p,                                          (1)
    ∂c/∂t + ∇ · (vc) = g,

where the unknown variables are the scalar pressure function (p) and the scalar transport function (c). Here, f and g are the body forces for each equation, and v is the velocity vector defined with the given coefficient κ, which is assumed to be constant in this paper for simplicity.

The boundary ∂Ω for the pressure equation can be decomposed into pressure (Dirichlet) and flux (Neumann) boundaries, ∂Ω_p and ∂Ω_q, respectively. Similarly, the transport equation is supplemented by both Dirichlet and Neumann boundaries, ∂Ω_c and ∂Ω_r, respectively. Moreover, the initial condition for the concentration, c(·, t = 0) = c_0, is given.
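In verification tests, the body forces f and g must be consistent with the chosen κ and the prescribed exact solutions. As a minimal sketch (not part of the paper's code), they can be derived symbolically; here we use sympy with the manufactured solutions that appear in Example 1 below:

```python
import sympy as sp

x, y, t = sp.symbols("x y t")
kappa = 1  # constant coefficient, as assumed in the paper

# Manufactured solutions (the choices used in Example 1 below).
p = sp.cos(t + x + y)
c = sp.sin(t + x + y)

# Velocity v = -kappa * grad(p), as in system (1).
vx = -kappa * sp.diff(p, x)
vy = -kappa * sp.diff(p, y)

# Body forces that make system (1) hold exactly:
# f = -div(kappa * grad(p)) and g = dc/dt + div(v c).
f = -(sp.diff(kappa * sp.diff(p, x), x) + sp.diff(kappa * sp.diff(p, y), y))
g = sp.diff(c, t) + sp.diff(vx * c, x) + sp.diff(vy * c, y)

print(sp.simplify(f))  # 2*cos(t + x + y)
print(sp.simplify(g))  # equivalent to cos(t + x + y) + 2*sin(2*(t + x + y))
```

These closed forms are exactly what the residual evaluation described in the next section needs as source terms.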
Figure 1: An example of a general neural network architecture (Rumelhart, Hinton, and Williams 1986; LeCun, Bengio, and Hinton 2015; Hinton and Salakhutdinov 2006). The input layer contains up to i input nodes, and the output layer is composed of 1, ..., k output nodes. N_hl refers to the number of hidden layers, and each hidden layer is composed of N_n neurons. Each neuron (e.g., H_{1,1}, ..., H_{1,N_n}) is connected to the nodes of the previous layer with adjustable weights and also has an adjustable bias. This figure is adapted from (Kadeethum, Jørgensen, and Nick 2020b,a).

Physics-Informed Neural Networks

The recently developed physics-informed neural networks (PINN) (Raissi, Perdikaris, and Karniadakis 2019) seek solutions satisfying PDEs by utilizing the residuals of each equation in the governing system, together with the boundary/initial conditions, as part of the training. Here, the solution of the PDEs is formulated as the solution to a constrained optimization problem. Two main advantages of this approach are that i) the size of the training set is considerably reduced by utilizing the governing equations as an implicit regularization term in the objective function to be minimized (D'Elia et al. 2020), and ii) it is a mesh-free approach, where the NN is trained on batches of randomly sampled time and space points. In particular, PINNs have been further extended to solve fractional PDEs (Pang, Lu, and Karniadakis 2019), stochastic PDEs (Nabian and Meidani 2018; Yang, Zhang, and Karniadakis 2020; Zhang, Guo, and Karniadakis 2020; Zhang et al. 2019a), and nonlocal models (D'Elia et al. 2020).

Neural network architecture. An example of a neural network architecture is presented in Figure 1 (Rumelhart, Hinton, and Williams 1986; LeCun, Bengio, and Hinton 2015; Hinton and Salakhutdinov 2006). The numbers of input and output nodes of the NN shown in this figure are determined by the given formulation of the problem. For example, if the problem is a time-dependent PDE in two spatial dimensions, we have three input nodes (x, y, and t), where t is time and x and y are the coordinates in the x- and y-directions, respectively (i.e., i = 3 in Figure 1). The output nodes are the solution functions satisfying the given system of PDEs. In the coupled flow and transport problem defined in (1), we have two output nodes, c and p, where c represents the concentration and p is the pressure. Thus, we have k = 2 in Figure 1.

The number of hidden layers (N_hl) and the number of neurons (N_n) act as hyper-parameters, which means they are problem-specific and need to be adjusted according to the nature of each problem (Goodfellow, Bengio, and Courville 2016). Each neuron (e.g., H_{1,1}, ..., H_{1,N_n}) is connected to the nodes of the previous layer with adjustable weights (W) and also has an adjustable bias (b). These variables are learned during a training phase (Hinton and Salakhutdinov 2006; Goodfellow, Bengio, and Courville 2016) by minimizing the loss functions (LOSS), using the hyperbolic tangent (tanh) as the activation function and the second-order limited-memory BFGS method as the optimizer.

Based on the algorithm established in (Raissi, Perdikaris, and Karniadakis 2019; Lu et al. 2019), the PINN utilizes two NNs. One NN is employed to impose the boundary/initial conditions of the system with the adjustable W and b, and the other NN carries the information given by the differential operators as regularizing terms of the loss functions for each PDE; these latter loss functions are defined by the residuals of the PDEs. Thus, the training examples used to evaluate these extra regularizing terms are different from those used to train the network with the boundary/initial conditions (BC/ICs). In other words, the BC/ICs are imposed using training data, while the equation residuals are imposed in a data-free manner by automatic differentiation of the predicted states at randomly chosen interior collocation points. This additional NN with the residuals as loss functions is known as the PINN (Raissi, Perdikaris, and Karniadakis 2019). To minimize the PINN loss, we adjust all the W and b of the first NN. Throughout this paper, the NNs are built on the TensorFlow platform (Abadi et al. 2015), and the numerical code is built on DeepXDE (Lu et al. 2019).

Loss functions. The loss function (LOSS), which we minimize through the machine learning process, is composed of two parts, as discussed in the previous section. One is the error on the training data (MSE_tr), which includes the boundary and initial conditions for the first NN. The other is the mean square value of the regularization terms given by the physics-informed functions (MSE_Π). We encode the underlying physical information into the NNs through the so-called physics-informed functions (Π) as follows:

    Π_p := −∇ · (κ∇p) − f,                              (2)
    Π_c := ∂c/∂t + ∇ · ((−κ∇p)c) − g,                   (3)

which are the residuals of the given PDEs. These residuals (Π_p, Π_c) act as additional regularizing terms in the loss function. Thus, the loss function of our problem is

    LOSS := MSE_tr + MSE_Πp + MSE_Πc,                   (4)

where the boundary and initial conditions are incorporated into MSE_tr. Here, the mean squared error (MSE) is defined as

    MSE := (1/n) Σ_{i=1}^{n} (φ(x_i, t_i) − φ_h(x_i, t_i))²

for a given function φ and an approximated function φ_h.
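To make this concrete, the physics-informed functions (2)-(3) can be written as a single residual callback whose derivatives are obtained by automatic differentiation. The following is a minimal sketch in DeepXDE's style, under our assumptions (inputs ordered as (x, y, t), outputs as (p, c), κ = 1, and the Example 1 body forces derived above); it is not the authors' released code:

```python
import deepxde as dde
from deepxde.backend import tf

kappa = 1.0  # constant coefficient

def f(X):
    # Body force of the pressure equation (manufactured, Example 1).
    return 2.0 * tf.cos(X[:, 0:1] + X[:, 1:2] + X[:, 2:3])

def g(X):
    # Body force of the transport equation (manufactured, Example 1).
    s = X[:, 0:1] + X[:, 1:2] + X[:, 2:3]
    return tf.cos(s) + 2.0 * tf.sin(2.0 * s)

def pde(X, U):
    """Residuals (Pi_p, Pi_c) of Eqs. (2)-(3).
    X columns: (x, y, t); U columns: (p, c)."""
    p_x = dde.grad.jacobian(U, X, i=0, j=0)
    p_y = dde.grad.jacobian(U, X, i=0, j=1)
    p_xx = dde.grad.hessian(U, X, component=0, i=0, j=0)
    p_yy = dde.grad.hessian(U, X, component=0, i=1, j=1)
    c = U[:, 1:2]
    c_x = dde.grad.jacobian(U, X, i=1, j=0)
    c_y = dde.grad.jacobian(U, X, i=1, j=1)
    c_t = dde.grad.jacobian(U, X, i=1, j=2)
    # Pi_p = -div(kappa grad p) - f.
    pi_p = -kappa * (p_xx + p_yy) - f(X)
    # Pi_c = dc/dt + div(v c) with v = -kappa grad p, expanded as
    # v . grad(c) + (div v) c, where div v = -kappa (p_xx + p_yy).
    pi_c = (c_t - kappa * (p_x * c_x + p_y * c_y)
            - kappa * (p_xx + p_yy) * c - g(X))
    return [pi_p, pi_c]
```

Note that Π_c is evaluated with the velocity expressed through the predicted pressure, so the transport residual couples the two output components of the network.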
Numerical Results

In this final section, we present and discuss the capabilities and the performance of the algorithm by solving the coupled flow and transport system shown in (1).

Example 1. First, to test the accuracy of the proposed algorithm, we set the exact solutions as

    c(x, y, t) := sin(t + x + y),                       (5)
    p(x, y, t) := cos(t + x + y),                       (6)

for the transport and the pressure, respectively, in the computational domain Ω × T = (0, 1)² × (0, 1]. Here, the body forces f and g are computed from the given solutions, where κ is chosen to be κ = 1, and Dirichlet boundary conditions are given on ∂Ω for both the pressure and the transport.

Figure 2 illustrates the approximated solutions of the pressure (p) at time t = 0.1 and the transport (c) at times t = 0.1, 0.5, and 1 obtained by the PINN. In addition, Figure 3 presents the comparisons with the exact solution over the line (0, 0.5)-(1, 0.5) for each given time. We observe that the PINN with the given loss functions provides accurate approximations to the multi-variable coupled flow and transport system. We note that the algorithm does not require time marching steps.

Figure 2: (a) the approximated pressure value p at t = 0.1; (b)-(d) the approximated transport values c at t = 0.1, 0.5, and 1, respectively.

Figure 3: The comparison of the approximated pressure and transport solutions, p and c, with the exact solution over the line (0, 0.5)-(1, 0.5): (a) p at t = 0.5, (b) p at t = 1, (c) c at t = 0.5, (d) c at t = 1.

Here, the numbers of training data points for approximating the initial condition (N_Ω(t=0)), the boundary conditions (N_∂Ω), and the residual inside the domain (N_Ω) are all set to 1000. The number of hidden layers is 4, and the number of neurons in each layer is 10. The second-order limited-memory BFGS method is employed for the optimization, the number of epochs (iterations of training) is set to 10,000, and the tanh function is utilized as the activation function.
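For reference, Example 1 can then be assembled on top of the residual callback above. This sketch is our reconstruction of the reported configuration, not the authors' script; the API names follow DeepXDE (Lu et al. 2019), and depending on the DeepXDE version the optimizer key may be "L-BFGS" instead of "L-BFGS-B":

```python
import numpy as np
import deepxde as dde

# Space-time domain (0, 1)^2 x (0, 1].
geom = dde.geometry.Rectangle([0.0, 0.0], [1.0, 1.0])
timedomain = dde.geometry.TimeDomain(0.0, 1.0)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

# Exact solutions (5)-(6), used to impose the Dirichlet and initial data.
def p_exact(X):
    return np.cos(X[:, 2:3] + X[:, 0:1] + X[:, 1:2])

def c_exact(X):
    return np.sin(X[:, 2:3] + X[:, 0:1] + X[:, 1:2])

on_boundary = lambda _, on_b: on_b
bc_p = dde.DirichletBC(geomtime, p_exact, on_boundary, component=0)
bc_c = dde.DirichletBC(geomtime, c_exact, on_boundary, component=1)
ic_c = dde.IC(geomtime, c_exact, lambda _, on_initial: on_initial, component=1)

# N_Omega = N_boundary = N_initial = 1000 training/collocation points.
data = dde.data.TimePDE(
    geomtime, pde, [bc_p, bc_c, ic_c],
    num_domain=1000, num_boundary=1000, num_initial=1000,
)

# 3 inputs (x, y, t), 4 hidden layers of 10 tanh neurons, 2 outputs (p, c).
net = dde.maps.FNN([3] + [10] * 4 + [2], "tanh", "Glorot uniform")

model = dde.Model(data, net)
model.compile("L-BFGS-B")  # the iteration cap is set via the optimizer options
model.train()

# L-infinity errors on random test points, as reported in Tables 1-2 below.
X_test = geomtime.random_points(1000)
U_pred = model.predict(X_test)
print(np.max(np.abs(p_exact(X_test) - U_pred[:, 0:1])),
      np.max(np.abs(c_exact(X_test) - U_pred[:, 1:2])))
```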
Next, Table 1 illustrates the sensitivity test regarding the choice of the hyper-parameters (the number of hidden layers (N_hl) and the number of neurons (N_n)). Here, the error is computed by the following definition,

    ‖φ(x, t) − φ_h(x, t)‖_{L∞} := max_i |φ(x_i, t_i) − φ_h(x_i, t_i)|,

where φ denotes the exact solution and φ_h is the approximated solution, which is either the pressure or the transport in this case. In addition, for this test, the numbers of training points for the initial/boundary conditions (N_Ω(t=0), N_∂Ω) and for the residual inside the domain (N_Ω) are fixed at 1000. We observe that the algorithm does depend on these hyper-parameters, but the results do not vary significantly.

Table 1: Comparison of the error depending on the numbers of hidden layers and neurons. The top table is for the pressure and the bottom one is for the transport. The rows correspond to different numbers of neurons (N_n) and the columns to different numbers of hidden layers (N_hl).

    N_n \ N_hl |    4     |    8     |    16
        10     | 2.65e-03 | 2.19e-03 | 7.79e-04
        20     | 6.23e-04 | 1.86e-03 | 7.61e-04
        40     | 3.32e-03 | 8.89e-04 | 1.75e-03

    N_n \ N_hl |    4     |    8     |    16
        10     | 1.47e-03 | 3.11e-03 | 1.08e-03
        20     | 6.10e-04 | 1.05e-03 | 9.60e-04
        40     | 5.00e-03 | 2.56e-03 | 2.89e-03

Moreover, Table 2 presents the algorithm's sensitivity test regarding the number of training points for the initial/boundary conditions and for the residual inside the domain. Here, the hyper-parameters are fixed at N_hl = 4 and N_n = 20. To provide a general result, the average over 5 different realizations is shown in the table.

Table 2: Comparison of the error depending on the number of training points. The top table is for the pressure and the bottom one is for the transport. The rows correspond to different (N_Ω(t=0), N_∂Ω) and the columns to different N_Ω.

    (N_Ω(t=0), N_∂Ω) \ N_Ω |    10    |   100    |   1000
          (10, 10)         | 1.25e-03 | 1.04e-03 | 1.44e-03
         (100, 100)        | 1.04e-03 | 9.99e-04 | 6.40e-04
        (1000, 1000)       | 1.56e-03 | 8.02e-04 | 1.08e-03

    (N_Ω(t=0), N_∂Ω) \ N_Ω |    10    |   100    |   1000
          (10, 10)         | 4.73e-03 | 2.06e-03 | 2.32e-03
         (100, 100)        | 1.23e-03 | 1.94e-03 | 1.08e-03
        (1000, 1000)       | 1.48e-03 | 1.20e-03 | 1.20e-03

Example 2. In the final example, we solve a simple flow and transport problem in Ω × T = (0, 1)² × (0, 1] by setting f = g = 0. The boundary conditions for the transport and the pressure are given as

    c = 1,        if x = 0,
    ∇c · n = 0,   elsewhere,

and

    p = 1,        if x = 0,
    p = 0,        if x = 1,
    κ∇p · n = 0,  elsewhere.

The initial condition for the transport equation is set to c(·, t = 0) = 0. Thus, we transport c = 1 from the left boundary to the right boundary with the computed velocity. Here, the hyper-parameters are fixed at N_hl = 4 and N_n = 20, and N_Ω(t=0) = N_∂Ω = N_Ω = 1000.

Figure 4 illustrates the solutions of the pressure p and the transport c at the given times. We note that the pressure and the velocity are constant over the time domain, as shown in Figure 4(a), whereas the value of c is transported from left to right as expected (Figure 4(b)-(d)). The solutions of c over the line (0, 0.5)-(1, 0.5) at times t = 0.1, 0.3, 0.5, 0.7, and 0.9 are plotted in the same domain in Figure 5. We do not observe any spurious oscillations or over/under-shooting (violation of the maximum principle) in this advection-dominated case (Wang, Teng, and Perdikaris 2020; Fuks and Tchelepi 2020).

Figure 4: The solutions of the pressure p and the transport c at the given times: (a) p at t = 0.1, (b) c at t = 0.1, (c) c at t = 0.5, (d) c at t = 1.

Figure 5: The solution of c over the line (0, 0.5)-(1, 0.5), plotted for each time t = 0.1, 0.3, 0.5, 0.7, and 0.9. We do observe the moving step functions as expected.

Conclusion

This paper utilizes PINN to solve one of the multi-physics problems, coupled flow and transport systems. The sensitivity tests regarding the parameters for training the NN and the loss functions for the PINN are presented. The numerical experiments illustrate the accuracy and the capabilities of the algorithm. Detailed comparison with existing numerical methods such as the finite element method, and extension to nonlinear problems in heterogeneous media, are ongoing work.
References

Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.; Davis, A.; Dean, J.; Devin, M.; et al. 2015. TensorFlow: Large-scale machine learning on heterogeneous systems.

Ahmed, E.; Jones, M.; and Marks, T. 2015. An improved deep learning architecture for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3908–3916.

Cai, S.; Wang, Z.; Lu, L.; Zaki, T. A.; and Karniadakis, G. E. 2020. DeepM&Mnet: Inferring the electroconvection multiphysics fields based on operator approximation by neural networks. arXiv preprint arXiv:2009.12935.

D'Elia, M.; Parks, M. L.; Pang, G.; and Karniadakis, G. 2020. nPINNs: Nonlocal physics-informed neural networks for a parametrized nonlocal universal Laplacian operator. Algorithms and applications. Technical report, Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States).

Fraces, C. G.; Papaioannou, A.; and Tchelepi, H. 2020. Physics informed deep learning for transport in porous media. Buckley Leverett problem. arXiv preprint arXiv:2001.05172.

Fuks, O.; and Tchelepi, H. A. 2020. Limitations of physics informed machine learning for nonlinear two-phase transport in porous media. Journal of Machine Learning for Modeling and Computing 1(1).

Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep Learning. MIT Press.

He, Q.; Barajas-Solano, D.; Tartakovsky, G.; and Tartakovsky, A. M. 2020. Physics-informed neural networks for multiphysics data assimilation with application to subsurface transport. Advances in Water Resources 141: 103610.

Hinton, G.; and Salakhutdinov, R. 2006. Reducing the dimensionality of data with neural networks. Science 313(5786): 504–507.

Jagtap, A. D.; Kharazmi, E.; and Karniadakis, G. E. 2020. Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Computer Methods in Applied Mechanics and Engineering 365: 113028.

Kadeethum, T.; Jørgensen, T.; and Nick, H. 2020a. Physics-informed neural networks for solving inverse problems of nonlinear Biot's equations: Batch training. In 54th US Rock Mechanics/Geomechanics Symposium. Golden, CO, USA: American Rock Mechanics Association.

Kadeethum, T.; Jørgensen, T.; and Nick, H. 2020b. Physics-informed neural networks for solving nonlinear diffusivity and Biot's equations. PLoS ONE 15(5): e0232683.

Karpatne, A.; Atluri, G.; Faghmous, J. H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; and Kumar, V. 2017a. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Transactions on Knowledge and Data Engineering 29(10): 2318–2331.

Karpatne, A.; Watkins, W.; Read, J.; and Kumar, V. 2017b. Physics-guided neural networks (PGNN): An application in lake temperature modeling. arXiv preprint arXiv:1710.11431.

Kharazmi, E.; Zhang, Z.; and Karniadakis, G. 2019. Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873.

Khodayi-Mehr, R.; and Zavlanos, M. M. 2019. VarNet: Variational neural networks for the solution of partial differential equations. arXiv preprint arXiv:1912.07443.

Kutz, N. 2017. Deep learning in fluid dynamics. Journal of Fluid Mechanics 814: 1–4.

LeCun, Y.; Bengio, Y.; and Hinton, G. 2015. Deep learning. Nature 521(7553): 436.

Li, K.; Tang, K.; Wu, T.; and Liao, Q. 2019. D3M: A deep domain decomposition method for partial differential equations. IEEE Access.

Long, Z.; Lu, Y.; and Dong, B. 2019. PDE-Net 2.0: Learning PDEs from data with a numeric-symbolic hybrid deep network. Journal of Computational Physics 399: 108925.

Lu, L.; Meng, X.; Mao, Z.; and Karniadakis, G. E. 2019. DeepXDE: A deep learning library for solving differential equations. arXiv preprint arXiv:1907.04502.

McCulloch, W.; and Pitts, W. 1943. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5(4): 115–133.

Nabian, M. A.; and Meidani, H. 2018. A deep neural network surrogate for high-dimensional random partial differential equations. arXiv preprint arXiv:1806.02957.

Owhadi, H. 2015. Bayesian numerical homogenization. Multiscale Modeling & Simulation 13(3): 812–828.

Pang, G.; Lu, L.; and Karniadakis, G. E. 2019. fPINNs: Fractional physics-informed neural networks. SIAM Journal on Scientific Computing 41(4): A2603–A2626.

Raissi, M.; Perdikaris, P.; and Karniadakis, G. 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378: 686–707.

Raissi, M.; Yazdani, A.; and Karniadakis, G. E. 2020. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 367(6481): 1026–1030.

Rumelhart, D.; Hinton, G.; and Williams, R. 1986. Learning representations by back-propagating errors. Nature 323(6088): 533–536.

Sirignano, J.; and Spiliopoulos, K. 2018. DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375: 1339–1364.

Wang, J.; Wu, J.; and Xiao, H. 2017. Physics-informed machine learning approach for reconstructing Reynolds stress modeling discrepancies based on DNS data. Physical Review Fluids 2(3).

Wang, N.; Zhang, D.; Chang, H.; and Li, H. 2020. Deep learning of subsurface flow via theory-guided neural network. Journal of Hydrology 584: 124700.

Wang, S.; Teng, Y.; and Perdikaris, P. 2020. Understanding and mitigating gradient pathologies in physics-informed neural networks. arXiv preprint arXiv:2001.04536.

Wang, Z.; Xiao, D.; Fang, F.; Govindan, R.; Pain, C.; and Guo, Y. 2018. Model identification of reduced order fluid dynamics systems using deep learning. International Journal for Numerical Methods in Fluids 86(4): 255–268.

Weinan, E. 2017. A proposal on machine learning via dynamical systems. Communications in Mathematics and Statistics 5(1): 1–11.

Weinan, E.; and Yu, B. 2018. The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics 6(1): 1–12.

Xiao, H.; Wu, J.; Wang, J.; Sun, R.; and Roy, C. 2016. Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier–Stokes simulations: A data-driven, physics-informed Bayesian approach. Journal of Computational Physics 324: 115–136.

Yang, L.; Zhang, D.; and Karniadakis, G. E. 2020. Physics-informed generative adversarial networks for stochastic differential equations. SIAM Journal on Scientific Computing 42(1): A292–A317.

Zhang, D.; Guo, L.; and Karniadakis, G. E. 2020. Learning in modal space: Solving time-dependent stochastic PDEs using physics-informed neural networks. SIAM Journal on Scientific Computing 42(2): A639–A665.

Zhang, D.; Lu, L.; Guo, L.; and Karniadakis, G. 2019a. Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. Journal of Computational Physics 397: 108850.

Zhang, Z.; Song, X.; Ye, S.; Wang, Y.; Huang, C.; An, Y.; and Chen, Y. 2019b. Application of deep learning method to Reynolds stress models of channel flow based on reduced-order modeling of DNS data. Journal of Hydrodynamics 31(1): 58–65.

Zhu, Y.; Zabaras, N.; Koutsourelakis, P.-S.; and Perdikaris, P. 2019. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. Journal of Computational Physics 394: 56–81.