Nonlocal Physics-Informed Neural Networks – A unified theoretical and computational framework for nonlocal models

Marta D’Elia1∗, George E. Karniadakis2, Guofei Pang2, Michael L. Parks1

1 Center for Computing Research, Sandia National Laboratories, 1450 Innovation Parkway SE, Albuquerque, NM, 87123, {mdelia,mlparks}@sandia.gov
2 Applied Mathematics Department, Brown University, 170 Hope Street, Providence, RI 02912, {george_karniadakis,guofei_pang}@brown.edu

∗ Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government. Report 2019-14015.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Nonlocal models provide an improved predictive capability thanks to their ability to capture effects that classical partial differential equations fail to capture. Among these effects we have multiscale behavior and anomalous behavior such as super- and sub-diffusion. These models have become incredibly popular for a broad range of applications, including mechanics, subsurface flow, turbulence, plasma dynamics, heat conduction and image processing. However, their improved accuracy comes at the price of many modeling and numerical challenges. In this work we focus on the estimation of model parameters, which are often unknown or subject to noise. In particular, we address the problem of model identification in the presence of sparse measurements. Our approach to this inverse problem is based on the combination of 1. Machine Learning and Physical Principles and 2. a Unified Nonlocal Vector Calculus and Versatile Surrogates such as neural networks (NN). The outcome is a flexible tool that allows us to learn existing and new nonlocal operators. We refer to our technique as nPINNs (nonlocal Physics-Informed Neural Networks); here, we model the nonlocal solution with a NN and we solve an optimization problem where we minimize the residual of the nonlocal equation and the misfit with measured data. The results of the optimization are the weights and biases of the NN and the set of unknown model parameters.

Challenges of nonlocal modeling

Nonlocal equations are model descriptions for which the state of a system at any point depends on the state in a neighborhood of points, i.e. every point in a domain interacts with a neighborhood of points. As such, interactions can occur at a distance, without contact. These models can capture effects that traditional PDEs fail to capture; in fact, their solutions can be irregular: non-differentiable, singular, and discontinuous. Among those effects, we mention: 1) multiscale behaviors and discontinuities such as cracks and fractures and 2) anomalous behaviors such as super- and sub-diffusion. In case 1) we refer to nonlocal truncated operators where the neighborhood is a ball of radius $\delta$ (usually much smaller than the domain) surrounding any point. In case 2) we refer to fractional operators where the interactions can be infinite ($\delta = \infty$); a standard representative of this class is the fractional Laplacian operator $(-\Delta)^s$.

As a consequence, nonlocal models provide an improved predictive capability for several scientific and engineering applications including fracture mechanics (Ha and Bobaru 2011; Littlewood 2010; Silling 2000), anomalous subsurface transport (Benson, Wheatcraft, and Meerschaert 2000; Schumer et al. 2003; 2001), phase transitions (Bates and Chmaj 1999; Delgoshaie et al. 2015; Fife 2003), image processing (Buades, Coll, and Morel 2010; Gilboa and Osher 2007; 2008; Lou et al. 2010), multiscale and multiphysics systems (Alali and Lipton 2012; Askari 2008), MHD (Schekochihin, Cowley, and Yousef 2008), and stochastic processes (Burch, D’Elia, and Lehoucq 2014; D’Elia et al. 2017; Meerschaert and Sikorskii 2012; Metzler and Klafter 2000).

In its simplest form, a nonlocal operator can be defined as

$$\mathcal{L}u(x) = \int_{B_\delta(x)} \big(u(y) - u(x)\big)\, k(x, y)\, dy, \tag{1}$$

where $B_\delta(x)$ is the ball of radius $\delta$ centered at $x$ and where $k$ is an application-dependent kernel that determines the regularity properties of the solution. The integral form allows us to capture long-range forces and reduces the regularity requirements of the solution.

We consider nonlocal diffusion problems of the form

$$\begin{cases} -\mathcal{L}u = f, & x \in \Omega, \\ u = g, & x \in \Omega_I, \end{cases} \tag{2}$$

where $\Omega \subset \mathbb{R}^n$ is an open bounded domain and $\Omega_I$ is the interaction domain, a layer of thickness $\delta$ surrounding the domain where nonlocal boundary conditions must be prescribed for the well-posedness of the problem.

Two very important concerns arise when addressing the solution of (2).

Q1 Is (1) general enough? How broad is the class of nonlocal operators that can be described by one single formula and analyzed through one unified calculus?

Q2 What is the “right” kernel for a given phenomenon? How can available data help determine the appropriate nonlocal model and its parameters? Can we design a unified data-driven tool for model identification and simulation of a broad class of nonlocal models?

The first concern arises from the fact that in the literature we have independent definitions, formulations and theory of nonlocal models. Similarities are evident, but they have not been rigorously proved. This is addressed in the next section.

A unified nonlocal calculus

The purpose of a unified nonlocal notation and theory is to

• Connect the nonlocal and fractional communities that would benefit from each other’s research;
• Include as special cases the well-known classical differential calculus at the limit of vanishing interactions and the fractional calculus at the limit of infinite interactions;
• Provide the groundwork for new model discovery thanks to the broad class of operators that it describes;
• Describe intrinsically nonlocal phenomena that have not been analyzed or used due to the lack of theory;
• Guide algorithm/discretization/solver design.

In this work we introduce a generalized nonlocal operator, in the spirit of a unified calculus, that bridges local, truncated nonlocal and fractional diffusion operators:

$$\mathcal{L}_{\delta,s} u(x) = C_{\delta,s} \int_{B_\delta(x)} \frac{u(x) - u(y)}{|x - y|^{n+2s}}\, dy, \tag{3}$$

where $C_{\delta,s}$ is such that the corresponding solutions span a broad range of nonlocal diffusion processes including local and fractional diffusion at the limit of vanishing and increasing nonlocality, i.e.

$$\lim_{\delta \to 0} \mathcal{L}_{\delta,s} u = \Delta u \quad \text{and} \quad \lim_{\delta \to \infty} -\mathcal{L}_{\delta,s} u = (-\Delta)^s u.$$

A unified computational framework

The unified nonlocal vector calculus, and more specifically the operator in (3), provides us with a universal definition of parametrized nonlocal operators that describe well-known nonlocal phenomena and may also describe new intrinsically nonlocal phenomena not yet analyzed and used due to lack of theory. However, the universal nature of these new mathematical models and the abundance of data raise important questions.

Q3 What are the true model parameters $\delta$ and $s$?

Q4 How can we deal with data sparsity and noise (the forcing term $f$ and the nonlocal boundary condition $g$ in (2) may be sparse or subject to noise)?

We propose a new approach to model learning that is in stark contrast with previously developed UQ and PDE-constrained-like optimization techniques. The game changer is the combination of 1) Machine Learning and Physical Principles, and 2) Unified Calculus and Versatile Surrogates, such as neural networks. The outcome is a Data-Driven Physics-Informed tool for learning new complex nonlocal phenomena.

We refer to our strategy as nPINNs (nonlocal Physics-Informed Neural Networks); this is an extension of PINNs (Raissi, Perdikaris, and Karniadakis 2018) and fPINNs (Pang, Lu, and Karniadakis 2018), designed for PDEs and fractional operators respectively. More specifically, nPINNs includes the methods above as special instances. In the next section we describe our strategy and its main properties.

Nonlocal Physics-Informed Neural Networks

The nPINNs algorithm consists of three simple steps.

1 Collect observations of solution and data in training sets: $f_m(x_i)$, $x_i \in T_f$, and $u_m(x_j)$, $x_j \in T_u$;
2 Approximate the solution with a Neural Network: $u(x) = u_{NN}(x)$;
3 Minimize the loss function

$$\min_{u;\,\delta,s} \mathrm{Loss}(u; \delta, s) = \frac{1}{2} \sum_{x_i \in T_f} \big(\mathcal{L}_{\delta,s} u_{NN}(x_i) - f_m(x_i)\big)^2 + \frac{\beta}{2} \sum_{x_j \in T_u} \big(u_{NN}(x_j) - u_m(x_j)\big)^2,$$

where the minimization with respect to $u$ must be regarded as minimization with respect to the weights and biases of the NN. The two distinct training sets in step 1 depend only on data availability and are not necessarily associated with quadrature points. Note that Loss has a physics-driven and a data-driven component: the first term controls the residual of the nonlocal equation, whereas the second controls the mismatch between solution and data. The outcomes of the optimization are the weights and biases of the NN and the model parameters. This strategy

• Is as accurate as any other discretization method for the forward problem. As an example, numerical tests show that it has the same convergence rate, as the number of training points increases, as fPINNs and a standard Finite Difference discretization. However, due to the increased computational cost, nPINNs is not yet recommended for the solution of forward problems.
• Is not tied to any discretization method.
• Requires minimal implementation effort: available solvers can be used as black boxes.
• Easily handles sparsity.

We tested this method on one-dimensional forward and inverse problems (to illustrate our theoretical findings and learn model parameters) and on two- and three-dimensional forward problems (to show applicability in higher dimensions). Also, we applied nPINNs to the solution of turbulent Couette flow for the estimation of the dispersion rate $s$ and the characteristic length $\delta$. Computational results are promising and show that the versatility of NN allows one to describe complex phenomena, to identify model parameters and to handle data sparsity.

One-dimensional example. In Figure 1 we report the outcome of our algorithm (steps 1–3) for the estimation of $\delta$ and $s$. For $\Omega = (0, 1)$ and $\Omega \cup \Omega_I = (-\delta, 1 + \delta)$, we consider the nonlocal diffusion problem (2) with $g = 0$, $f = \sin(2\pi x)$ and $\mathcal{L}$ defined as in (3). The training data $u_m$ are generated via accurate solution of (2) with parameters $(\delta^*, s^*) = (14, 0.8)$; we refer to these values as true values and represent them with a yellow star in the plot. The training points are 100 uniformly spaced points in $\Omega \cup \Omega_I$. We run the algorithm for two initial guesses, represented by the blue dots, and report their trajectories. Both of them, see pink dots in both plots, converge to the true values. The optimal $u_{NN}$ corresponding to the estimated parameters is accurate for both initial guesses; in fact, the relative errors are of the order of $10^{-4}$.

Figure 1: Trajectories of the optimization algorithm for the initial guesses $(\delta_1, s_1) = (1, 0.5)$, left, and $(\delta_2, s_2) = (10, 0.5)$, right. The blue dot indicates the initial guess, the pink dot the optimal value and the yellow star the true value.

References

Alali, B., and Lipton, R. 2012. Multiscale dynamics of heterogeneous media in the peridynamic formulation. Journal of Elasticity 106(1):71–103.

Askari, E. 2008. Peridynamics for multiscale materials modeling. Journal of Physics: Conference Series, IOP Publishing 125(1):649–654.

Bates, P., and Chmaj, A. 1999. An integrodifferential model for phase transitions: Stationary solutions in higher space dimensions. J. Statist. Phys. 95:1119–1139.

Benson, D.; Wheatcraft, S.; and Meerschaert, M. 2000. Application of a fractional advection-dispersion equation. Water Resources Research 36(6):1403–1412.

Buades, A.; Coll, B.; and Morel, J. 2010. Image denoising methods. A new nonlocal principle. SIAM Review 52:113–147.

Burch, N.; D’Elia, M.; and Lehoucq, R. 2014. The exit-time problem for a Markov jump process. The European Physical Journal Special Topics 223:3257–3271.

Delgoshaie, A.; Meyer, D.; Jenny, P.; and Tchelepi, H. 2015. Non-local formulation for multiscale flow in porous media. Journal of Hydrology 531(1):649–654.

D’Elia, M.; Du, Q.; Gunzburger, M.; and Lehoucq, R. 2017. Nonlocal convection-diffusion problems on bounded domains and finite-range jump processes. Computational Methods in Applied Mathematics 29:71–103.

Fife, P. 2003. Some nonclassical trends in parabolic and parabolic-like evolutions. Springer-Verlag, New York. chapter Vehicular Ad Hoc Networks, 153–191.

Gilboa, G., and Osher, S. 2007. Nonlocal linear image regularization and supervised segmentation. Multiscale Model. Simul. 6:595–630.

Gilboa, G., and Osher, S. 2008. Nonlocal operators with applications to image processing. Multiscale Model. Simul. 7:1005–1028.

Ha, Y. D., and Bobaru, F. 2011. Characteristics of dynamic brittle fracture captured with peridynamics. Engineering Fracture Mechanics 78(6):1156–1168.

Littlewood, D. 2010. Simulation of dynamic fracture using peridynamics, finite element modeling, and contact. In Proceedings of the ASME 2010 International Mechanical Engineering Congress and Exposition, Vancouver, British Columbia, Canada.

Lou, Y.; Zhang, X.; Osher, S.; and Bertozzi, A. 2010. Image recovery via nonlocal operators. Journal of Scientific Computing 42:185–197.

Meerschaert, M., and Sikorskii, A. 2012. Stochastic models for fractional calculus. Studies in Mathematics, De Gruyter.

Metzler, R., and Klafter, J. 2000. The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Physics Reports 339(1):1–77.

Pang, G.; Lu, L.; and Karniadakis, G. 2018. Fractional physics-informed neural networks. Technical report. arXiv:1811.08967.

Raissi, M.; Perdikaris, P.; and Karniadakis, G. 2018. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics.

Schekochihin, A.; Cowley, S.; and Yousef, T. 2008. MHD turbulence: Nonlocal, anisotropic, nonuniversal? In IUTAM Symposium on computational physics and new perspectives in turbulence, 347–354. Springer, Dordrecht.

Schumer, R.; Benson, D.; Meerschaert, M.; and Wheatcraft, S. 2001. Eulerian derivation of the fractional advection-dispersion equation. Journal of Contaminant Hydrology 48:69–88.

Schumer, R.; Benson, D.; Meerschaert, M.; and Baeumer, B. 2003. Multiscaling fractional advection-dispersion equations and their solutions. Water Resources Research 39(1):1022–1032.

Silling, S. 2000. Reformulation of elasticity theory for discontinuities and long-range forces. Journal of the Mechanics and Physics of Solids 48:175–209.
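The three nPINNs steps described in the section on Nonlocal Physics-Informed Neural Networks can be illustrated on a scaled-down one-dimensional problem. The sketch below is not the authors' implementation and makes several assumptions that should be read as such: the operator is written in the form (1) with a truncated kernel k(x, y) = C / |x − y|^(1+2s); the normalization C = (2 − 2s) / δ^(2−2s) is a hypothetical choice (not the paper's constant), picked so that the operator returns u'' exactly for quadratics; the neural network is replaced by a known function; and a grid search over s stands in for the gradient-based optimizer. The names `nonlocal_op` and `npinn_loss` are illustrative.

```python
import numpy as np


def nonlocal_op(u, x, delta, s, n_quad=2000):
    """Approximate L u(x) = int_{B_delta(x)} (u(y) - u(x)) k(x, y) dy in 1D.

    Assumed kernel: k(x, y) = C / |x - y|**(1 + 2 s), with the hypothetical
    normalization C = (2 - 2 s) / delta**(2 - 2 s), so that L u = u'' for
    quadratic u. `u` must accept NumPy arrays.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    h = delta / n_quad
    z = (np.arange(n_quad) + 0.5) * h  # midpoints of (0, delta), never zero
    # Pair y = x + z with y = x - z: the odd part of u cancels, which
    # weakens the singularity of the integrand at z = 0.
    diff = u(x[:, None] + z) + u(x[:, None] - z) - 2.0 * u(x[:, None])
    C = (2.0 - 2.0 * s) / delta ** (2.0 - 2.0 * s)
    return C * np.sum(diff / z ** (1.0 + 2.0 * s), axis=1) * h


def npinn_loss(u_nn, delta, s, x_f, f_m, x_u, u_m, beta=1.0):
    """Residual of -L u = f on T_f plus beta-weighted data misfit on T_u."""
    residual = -nonlocal_op(u_nn, x_f, delta, s) - f_m
    misfit = u_nn(x_u) - u_m
    return 0.5 * np.sum(residual ** 2) + 0.5 * beta * np.sum(misfit ** 2)


# Step 1: manufactured training data (all values here are hypothetical).
def u_true(x):
    return np.sin(2.0 * np.pi * x)


delta, s_true = 1.0, 0.8
x_train = np.linspace(0.0, 1.0, 20)
f_m = -nonlocal_op(u_true, x_train, delta, s_true)
u_m = u_true(x_train)

# Step 2: the surrogate is the known function u_true instead of a trained NN.
# Step 3: a grid search over s replaces the gradient-based minimization.
s_grid = np.linspace(0.1, 0.9, 17)
losses = [npinn_loss(u_true, delta, s, x_train, f_m, x_train, u_m)
          for s in s_grid]
s_hat = s_grid[int(np.argmin(losses))]
print("estimated s:", s_hat)
```

Because the same quadrature generates the data and evaluates the model, the loss vanishes at the true exponent, so this only shows how the residual term makes s identifiable from f. A real nPINN would train the weights of u_NN jointly with δ and s, and the midpoint rule used here converges slowly for s close to 1.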