<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Data-driven Inverse Modeling from Sparse Observations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kailai Xu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Darve</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Computational and Mathematical Engineering</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Introduction: Data-driven Inverse Modeling with Neural Networks</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Mechanical Engineering, Stanford University</institution>
          ,
          <addr-line>Stanford, California 94305</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Deep neural networks (DNN) have been used to model nonlinear relations between physical quantities. Those DNNs are embedded in physical systems described by partial differential equations (PDE) and trained by minimizing a loss function that measures the discrepancy between predictions and observations in some chosen norm. When only sparse observations are available, this loss function often includes the PDE constraints as a penalty term, so the solution satisfies the PDE only approximately. Moreover, the penalty term typically slows down the convergence of the optimizer for stiff problems. We present a new approach that trains the embedded DNNs while numerically satisfying the PDE constraints. We develop an algorithm that enables differentiating both explicit and implicit numerical solvers in reverse-mode automatic differentiation, which allows the gradients of the DNNs and the PDE solvers to be computed in a unified framework. We demonstrate that our approach enjoys faster convergence and better stability on relatively stiff problems compared to the penalty method. Our approach has the potential to solve and accelerate a wide range of data-driven inverse modeling problems in which the physical constraints are described by PDEs and need to be satisfied accurately.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction: Data-driven Inverse Modeling with Neural Networks</title>
      <p>Models involving partial differential equations (PDE) are widely used to describe physical phenomena in science and engineering. Unknown parameters in these models can be calibrated using observations, which are typically associated with the outputs of the models.</p>
      <p>
        When the unknown is a function, one approach is to approximate the unknown with a neural network and plug it into the PDE. The neural network is trained by matching the predicted and the observed outputs of the PDE model. In the presence of full-field observations, in many cases we can approximate the derivatives in the PDE and reduce the inverse problem to a standard regression problem (see [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for an example). However, in the context of sparse observations, i.e., when only part of the outputs of the model are observable, we must couple the PDE and the neural network to obtain the prediction.
      </p>
      <p>Specifically, we formulate the inverse problem as a PDE-constrained optimization problem
$$\min_{\theta}\; L(u) = \sum_{i \in I_{\mathrm{obs}}} \bigl(u(x_i) - u_i\bigr)^2 \quad \text{s.t.}\quad F(\theta, u) = 0,$$
where $L$ is called the loss function, which measures the discrepancy between the estimated outputs $u$ and the observed outputs $u_i$ at the locations $\{x_i\}$; $I_{\mathrm{obs}}$ is the set of indices of locations where observations are available; and $F$ is the PDE model from which we can calculate the solution $u$. The unknown $\theta$ lives in $\Theta$, the space of all neural networks with a fixed architecture, so $\theta$ can be viewed as the weights and biases; $\Theta$ can also be a physical parameter space when we solve a parametric inverse problem. One popular way to solve this problem is to minimize the augmented loss function (penalty method) [2]
$$\min_{\theta, u}\; \tilde{L}(\theta, u) = L(u) + \lambda \|F(\theta, u)\|_2^2.$$
However, this approach suffers from ill-conditioning and slow convergence, partially due to the additional independent variable $u$ besides $\theta$.</p>
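      <p>To make the penalty formulation concrete, the following is a minimal sketch on a hypothetical 1D diffusion problem; it is not the paper's implementation, and the grid size, observation data, network architecture, and penalty weight $\lambda$ are all illustrative placeholders.</p>
      <preformat>
# Minimal sketch of the penalty method for a hypothetical 1D problem
# -kappa(x) u'' = f with zero Dirichlet BCs, where kappa is a tiny
# neural network in theta. All sizes, data, and lambda are placeholders.
import jax
import jax.numpy as jnp

n = 50                                   # number of grid cells
h = 1.0 / n
xs = jnp.linspace(h, 1.0 - h, n - 1)     # interior nodes
f = jnp.ones(n - 1)                      # source term
i_obs = jnp.array([10, 25, 40])          # sparse observation indices
u_obs = jnp.array([0.10, 0.12, 0.08])    # synthetic observations
lam = 1.0e3                              # penalty weight

def init_theta(key):
    k1, k2 = jax.random.split(key)
    return {"w1": jax.random.normal(k1, (1, 8)) * 0.1,
            "b1": jnp.zeros(8),
            "w2": jax.random.normal(k2, (8, 1)) * 0.1}

def kappa(theta, x):
    # Stand-in DNN: one hidden layer, positive output.
    hdn = jnp.tanh(x[:, None] @ theta["w1"] + theta["b1"])
    return 1.0 + jax.nn.softplus(hdn @ theta["w2"])[:, 0]

def residual(theta, u):
    # F(theta, u): central-difference discretization of -kappa u'' - f.
    u_pad = jnp.pad(u, 1)                # homogeneous Dirichlet BCs
    lap = (u_pad[:-2] - 2.0 * u_pad[1:-1] + u_pad[2:]) / h**2
    return -kappa(theta, xs) * lap - f

def penalty_loss(params):
    theta, u = params
    mismatch = jnp.sum((u[i_obs] - u_obs) ** 2)
    return mismatch + lam * jnp.sum(residual(theta, u) ** 2)

# Joint gradient descent over (theta, u): u is an extra independent
# variable, the source of the ill-conditioning discussed above.
# (A real run would use a tuned optimizer such as Adam or L-BFGS.)
params = (init_theta(jax.random.PRNGKey(0)), jnp.zeros(n - 1))
grad_fn = jax.jit(jax.grad(penalty_loss))
for step in range(2000):
    g = grad_fn(params)
    params = jax.tree_util.tree_map(lambda p, q: p - 1e-4 * q, params, g)
      </preformat>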
      <p>[Figure: computational graphs of the two approaches. Penalty method: $\theta$ and $u$ both feed the observation mismatch $L(u)$ and the PDE residual $\lambda\|F(\theta,u)\|_2^2$, which sum to $\tilde{L}(\theta,u)$, and gradients are taken with respect to both $\theta$ and $u$. PCL: $\theta$ feeds a PDE solver that produces $u$, which feeds the observation mismatch $L(u)$, and gradients with respect to $\theta$ are back-propagated through the solver.]</p>
      <p>In this work, we propose a new approach, physics constrained learning (PCL), that improves the conditioning and accelerates the convergence of inverse modeling. First, we enforce the physical constraint $F(\theta, u) = 0$ by solving the PDE numerically. Our approach is compatible with common numerical schemes such as finite difference, finite volume, and finite element methods. Second, the gradient $\frac{\partial L(u(\theta))}{\partial \theta}$ needed for optimization is computed with reverse-mode automatic differentiation (AD) [3], and the required Jacobian is computed with forward-mode automatic differentiation. We use ADCME for the AD functionality in this work.</p>
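      <p>A minimal sketch of this idea follows, written with JAX rather than ADCME since the paper's code is not reproduced here: a small network is embedded in a finite-difference solve, the constraint holds exactly by construction, and reverse-mode AD differentiates through the linear solver. The problem setup and all names are hypothetical.</p>
      <preformat>
# Sketch: embedding a DNN inside a PDE solve and differentiating the
# whole pipeline in reverse mode, in the spirit of PCL. The tridiagonal
# "PDE" and the observation data are illustrative placeholders.
import jax
import jax.numpy as jnp

n = 50
h = 1.0 / n
xs = jnp.linspace(h, 1.0 - h, n - 1)
f = jnp.ones(n - 1)
i_obs = jnp.array([10, 25, 40])
u_obs = jnp.array([0.10, 0.12, 0.08])

def kappa(theta, x):
    # Stand-in DNN for the unknown coefficient.
    hdn = jnp.tanh(x[:, None] @ theta["w1"] + theta["b1"])
    return 1.0 + jax.nn.softplus(hdn @ theta["w2"])[:, 0]

def solve_pde(theta):
    # Assemble the finite-difference operator A(theta) and solve A u = f
    # exactly, so F(theta, u) = A(theta) u - f = 0 holds by construction.
    k = kappa(theta, xs)
    main = 2.0 * k / h**2
    off = -k / h**2
    A = jnp.diag(main) + jnp.diag(off[:-1], 1) + jnp.diag(off[1:], -1)
    return jnp.linalg.solve(A, f)

def loss(theta):
    u = solve_pde(theta)                 # PDE constraint enforced exactly
    return jnp.sum((u[i_obs] - u_obs) ** 2)

# Reverse-mode AD propagates through jnp.linalg.solve via its adjoint,
# so no penalty term and no extra unknown u are needed.
theta = {"w1": jnp.ones((1, 8)) * 0.1, "b1": jnp.zeros(8),
         "w2": jnp.ones((8, 1)) * 0.1}
g = jax.grad(loss)(theta)
      </preformat>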
    </sec>
    <sec id="sec-2">
      <title>Methods: Physics Constrained Learning</title>
      <p>The main step in PCL is computing the gradient $\frac{\partial L(u(\theta))}{\partial \theta}$. PCL is based on the formula, obtained by implicitly differentiating the constraint $F(\theta, u(\theta)) = 0$,
$$\frac{\partial L(u(\theta))}{\partial \theta} = -\frac{\partial L}{\partial u}\left(\frac{\partial F}{\partial u}\right)^{-1}\frac{\partial F}{\partial \theta}, \qquad (1)$$
which is evaluated in three steps:
1. Computing the Jacobian $\frac{\partial F}{\partial u}$ with forward-mode Jacobian propagation; it remains sparse as long as the numerical scheme we choose has local basis functions.
2. Solving the linear system
$$w^T = \frac{\partial L}{\partial u}\left(\frac{\partial F}{\partial u}\right)^{-1}. \qquad (2)$$
3. Applying reverse-mode automatic differentiation to compute $-w^T \frac{\partial F}{\partial \theta}$.
Here $\theta$ can be the neural network weights and biases and thus can be high dimensional. The challenge is to compute the Jacobian matrix as well as the solution $w$ of Equation (2) efficiently. The detailed algorithm and analysis are presented in [4].</p>
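      <p>The sketch below walks through the three steps for a toy linear residual $F(\theta, u) = Au - \theta b$ with a scalar $\theta$, and checks the adjoint-based gradient of Equation (1) against differentiating straight through the solver; the matrix, data, and loss are made up for illustration.</p>
      <preformat>
# Three PCL steps for a toy residual F(theta, u) = A u - theta * b,
# with scalar theta so Equation (1) is easy to verify. A, b are made up.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
n = 20
A = jnp.eye(n) * 2.0 + jax.random.normal(key, (n, n)) * 0.1
b = jnp.ones(n)
u_obs = jnp.zeros(n)

def F(theta, u):
    return A @ u - theta * b

def L(u):
    return 0.5 * jnp.sum((u - u_obs) ** 2)

theta0 = 1.3
u_star = jnp.linalg.solve(A, theta0 * b)       # F(theta0, u_star) = 0

# Step 1: Jacobian dF/du (here simply A; in general it is assembled
# sparsely, e.g. via forward-mode AD such as jax.jacfwd).
dF_du = jax.jacfwd(F, argnums=1)(theta0, u_star)

# Step 2: solve the linear system w^T = (dL/du)(dF/du)^{-1} of
# Equation (2), i.e. (dF/du)^T w = (dL/du)^T.
dL_du = jax.grad(L)(u_star)
w = jnp.linalg.solve(dF_du.T, dL_du)

# Step 3: reverse-mode AD yields w^T dF/dtheta; Equation (1) is its
# negative.
_, vjp_theta = jax.vjp(lambda th: F(th, u_star), theta0)
grad_adjoint = -vjp_theta(w)[0]

# Check against differentiating straight through the solver.
grad_direct = jax.grad(lambda th: L(jnp.linalg.solve(A, th * b)))(theta0)
print(grad_adjoint, grad_direct)   # should agree to machine precision
      </preformat>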
    </sec>
    <sec id="sec-3">
      <title>Findings and Discussion: Enabling Faster and More Robust Convergence</title>
      <p>The key finding of our work is that enforcing physical constraints leads to faster and more robust convergence than the penalty method for stiff problems. We conduct multiple numerical examples and show that, in our benchmark problems:
1. PCL enjoys faster convergence with respect to the number of iterations needed to reach a predetermined accuracy. In particular, we observe a $10^4$ times speed-up compared with the penalty method on the Helmholtz problem. We also prove a convergence result showing that, for the chosen model problem, the condition number of the penalty method is much worse than that of PCL.
2. PCL exhibits mesh-independent convergence, while the penalty method does not scale as well as PCL with respect to the number of iterations when we refine the mesh.</p>
      <p>The model problem behind the convergence result is
$$\min_{\theta}\; \|u - u_0\|_2^2 \quad \text{s.t.}\quad Au = \theta y,$$
where $u_0 = A^{-1} y$, so that the optimum is $\theta = 1$; the corresponding penalty method solves the least-squares problem
$$\min_{\theta, u}\; \left\|A_\lambda \begin{pmatrix} u \\ \theta \end{pmatrix} - y_\lambda\right\|_2^2, \qquad A_\lambda = \begin{pmatrix} \sqrt{\lambda}\, A &amp; -\sqrt{\lambda}\, y \\ I &amp; 0 \end{pmatrix}, \qquad y_\lambda = \begin{pmatrix} 0 \\ u_0 \end{pmatrix}.$$
We have proved the following theorem.</p>
      <p>Theorem 0.1. The condition number of $A_\lambda$ satisfies
$$\liminf_{\lambda \to \infty} \kappa(A_\lambda) \geq \kappa(A)^2,$$
and therefore the condition number of the unconstrained optimization problem from the penalty method is asymptotically the square of that from PCL.</p>
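      <p>As a quick numerical illustration of Theorem 0.1 (not taken from the paper), one can form $A_\lambda$ for a small random instance and watch its condition number grow toward $\kappa(A)^2$ as $\lambda$ increases; the matrix $A$ and vector $y$ below are arbitrary placeholders.</p>
      <preformat>
# Numerical check of Theorem 0.1 on a small random instance; A and y
# are arbitrary placeholders.
import numpy as np

rng = np.random.default_rng(0)
n = 10
A = np.eye(n) * 2.0 + rng.standard_normal((n, n)) * 0.3
y = rng.standard_normal(n)

kA = np.linalg.cond(A)
for lam in [1e2, 1e4, 1e6, 1e8]:
    s = np.sqrt(lam)
    # A_lambda = [[sqrt(lam) A, -sqrt(lam) y], [I, 0]], acting on (u, theta)
    A_lam = np.block([[s * A, -s * y[:, None]],
                      [np.eye(n), np.zeros((n, 1))]])
    print(f"lambda={lam:.0e}  cond(A_lam)={np.linalg.cond(A_lam):.3e}  "
          f"cond(A)^2={kA**2:.3e}")
# The theorem predicts liminf cond(A_lam) is at least cond(A)^2.
      </preformat>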
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>We believe that enforcing physical constraints in ill-conditioned inverse problems is essential for developing robust and efficient algorithms. In particular, when the unknowns are represented by neural networks, PCL demonstrates superior robustness and efficiency compared to the penalty method. Technically, the application of automatic differentiation removes the challenging and time-consuming process of deriving and implementing gradients and Jacobians by hand. AD also allows us to leverage computational graph optimizations to improve inverse modeling performance. One limitation of PCL is that the PDE must be solved for each gradient computation, which can be expensive in both memory and computational cost. This computational challenge can be alleviated by acceleration techniques such as reduced-order modeling.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Hayden</given-names>
            <surname>Schaeffer</surname>
          </string-name>
          .
          <article-title>Learning partial differential equations via data discovery and sparse optimization</article-title>
          .
          <source>Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences</source>
          ,
          <volume>473</volume>
          (
          <issue>2197</issue>
          ):
          <fpage>20160446</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>Maziar</given-names> <surname>Raissi</surname></string-name>,
          <string-name><given-names>Paris</given-names> <surname>Perdikaris</surname></string-name>, and
          <string-name><given-names>George E</given-names> <surname>Karniadakis</surname></string-name>.
          <article-title>Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations</article-title>.
          <source>Journal of Computational Physics</source>,
          <volume>378</volume>:<fpage>686</fpage>-<lpage>707</lpage>,
          <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>Martín</given-names> <surname>Abadi</surname></string-name>,
          <string-name><given-names>Ashish</given-names> <surname>Agarwal</surname></string-name>,
          <string-name><given-names>Paul</given-names> <surname>Barham</surname></string-name>,
          <string-name><given-names>Eugene</given-names> <surname>Brevdo</surname></string-name>,
          <string-name><given-names>Zhifeng</given-names> <surname>Chen</surname></string-name>,
          <string-name><given-names>Craig</given-names> <surname>Citro</surname></string-name>,
          <string-name><given-names>Greg S</given-names> <surname>Corrado</surname></string-name>,
          <string-name><given-names>Andy</given-names> <surname>Davis</surname></string-name>,
          <string-name><given-names>Jeffrey</given-names> <surname>Dean</surname></string-name>,
          <string-name><given-names>Matthieu</given-names> <surname>Devin</surname></string-name>, et al.
          <article-title>TensorFlow: Large-scale machine learning on heterogeneous distributed systems</article-title>.
          <source>arXiv preprint arXiv:1603.04467</source>,
          <year>2016</year>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>Kailai</given-names> <surname>Xu</surname></string-name> and
          <string-name><given-names>Eric</given-names> <surname>Darve</surname></string-name>.
          <article-title>Physics constrained learning for data-driven inverse modeling from sparse observations</article-title>.
          <source>arXiv preprint arXiv:2002.10521</source>,
          <year>2020</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>