Data-driven learning of nonlocal models:
                            from high-fidelity simulations to constitutive laws
                              Huaiqian You,1 Yue Yu,1 Stewart Silling,2 Marta D’Elia3
                                          1
                                        Mathematics Department, Lehigh University, PA
                        2
                         Center for Computing Research, Sandia National Laboratories, Albuquerque, NM
                      3
                        Computational Science and Analysis, Sandia National Laboratories, Livermore, CA
                       huy316@lehigh.edu, yuy214@lehigh.edu, sasilli@sandia.gov, mdelia@sandia.gov.

                            Abstract                                  a physical system. The problem of learning an appropriate
                                                                      kernel for a specific application is one of the most challeng-
  We show that machine learning can improve the accuracy              ing open problems in nonlocal modeling. The literature on
  of simulations of stress waves in one-dimensional composite
  materials. We propose a data-driven technique to learn non-
                                                                      techniques for learning kernel parameters for a given func-
  local constitutive laws for stress wave propagation models.         tional form is vast, see, e.g., (Burkovska, Glusa, and D’Elia
  The method is an optimization-based technique in which the          2020; D’Elia and Gunzburger 2016) for control-based ap-
  nonlocal kernel function is approximated via Bernstein poly-        proaches, and (Pang et al. 2020; Pang, Lu, and Karniadakis
  nomials. The kernel, including both its functional form and         2019) for machine-learning approaches. However, the use of
  parameters, is derived so that when used in a nonlocal solver,      machine learning to learn the functional form of the kernel
  it generates solutions that closely match high-fidelity data.       is still in its infancy, (Xu and Foster 2020; Xu, D’Elia, and
  The optimal kernel therefore acts as a homogenized nonlocal         Foster 2020; You et al. 2021) being the only relevant works
  continuum model that accurately reproduces wave motion in           that we are aware of.
  a smaller-scale, more detailed model that can include multi-
  ple materials. We apply this technique to wave propagation             In this work we use an approach similar to the one devel-
  within a heterogeneous bar with a periodic microstructure.          oped in (You et al. 2021) to learn nonlocal kernels whose
  Several one-dimensional numerical tests illustrate the accu-        associated nonlocal wave equation is well posed by con-
  racy of our algorithm. The optimal kernel is demonstrated to        struction and can be used as an accurate surrogate for more
  reproduce high-fidelity data for a composite material in ap-        detailed, high-fidelity wave propagation models. In partic-
  plications that are substantially different from the problems       ular, we present an application to wave propagation at the
  used as training data.                                              microscale in a heterogeneous solid. In this context, the
                                                                      machine-learned nonlocal kernel embeds the material con-
                        Introduction                                  stitutive behavior so that the material interfaces do not have
                                                                      to be treated explicitly and, more importantly, the mate-
Nonlocal models use integral operators acting on a length-            rial microstructure can be unknown. Furthermore, the cor-
scale δ, known as horizon. This feature allows nonlocal               responding nonlocal models allow for accurate simulations
models to capture long-range forces at small scales and mul-          at scales that are much larger than the microstructure.
tiscale behavior, and to reduce regularity requirements on
the solutions, which are allowed to be discontinuous or even          Our main contributions are:
singular. In recent decades, nonlocal equations have been             • The design of an optimization technique that bridges mi-
successfully used to model several engineering and scientific         cro and continuum scales by providing accurate and stable
applications, including fracture mechanics (Silling 2000; Ha          model surrogates for the simulation of wave propagation in
and Bobaru 2011; Trask et al. 2019), subsurface transport             heterogeneous materials.
(Benson, Wheatcraft, and Meerschaert 2000; Schumer et al.             • The illustration of this method via one-dimensional experi-
2003), image processing (D’Elia, De los Reyes, and Tru-               ments that confirm the applicability of our technique and the
jillo 2019; Gilboa and Osher 2007), multiscale and multi-             improved accuracy compared with state-of-the-art results.
physics systems (Alali and Lipton 2012; Askari et al. 2008;           • The demonstration of generalization properties of our al-
You, Yu, and Kamensky 2020), finance (Scalas, Gorenflo,               gorithm whose associated model surrogates are effective
and Mainardi 2000), and stochastic processes (D’Elia et al.           even on problem settings that are substantially different from
2017; Meerschaert and Sikorskii 2012).                                the ones used for training in terms of loading and time
                                                                      scales.
    However, it is often the case that nonlocal kernels defin-
ing nonlocal operators are justified a posteriori and it is
not clear how to define such kernels to faithfully describe                         Nonlocal kernel learning
Copyright © 2021 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY         We introduce the high-fidelity (HF) model that faithfully
4.0).                                                                 represents the system: for Ω ∈ Rd , the scalar function
u(x, t) solves, for (x, t) ∈ Ω × [0, T ]
              ∂2u
                  (x, t) − LHF [u](x, t) = f (x, t),        (1)
              ∂t2
provided some boundary conditions on ∂Ω for u(x, t) and
initial conditions at t = 0 for u and ∂u/∂t are satisfied.
Here, LHF is the HF operator, which can either be a differ-
ential or integral operator, and f represents a forcing term.         Figure 1: One-dimensional bar with ordered microstructure
   We assume that solutions to this HF problem may be ap-             of period 2L. Material 1 and 2 have the same density and
proximated by solutions to a nonlocal problem of the form             Young modulus E1 and E2 . The horizon δ, the wave length
                                                                      λ, and the discretization size, h, are reported for comparison.
             ∂2u
                  (x, t) − LK [u](x, t) = f (x, t),       (2)
             ∂t2
for (x, t) ∈ Ω × [0, T ], augmented with nonlocal boundary            approximation of LK by Riemann sum with uniform grid
conditions on Ωδ (a layer of thickness δ that surrounds the           spacing h. The optimal parameters are obtained by solving
domain) and the same initial conditions on the variable u             the following optimization problem.
and its derivative as in (1). The forcing f may coincide with                       N Ttr /dt
                                                                           Ttr X X                                    2
the forcing term in (1) or it could be an appropriate repre-           min 3                    ūn+1
                                                                                                  k   − uk (tn+1 ) `2 + R({Cm }),
sentation of the same.                                                 Cm dt N
                                                                                 n=1
                                                                                   k=1
   We seek LK as a nonlocal operator of the form                                                                                     (7)
                   Z
                                                                        s.t. ūk satisfies (6) and                                   (8)
  LK [u](x, t) =       K(|x − y|) (u(y, t) − u(x, t)) dy (3)
                    Ω                                                        K satisfies physics-based constraints.                  (9)
where K is a radial, sign-changing, kernel function, com-             Here, the `2 norm is taken over the space-discretization
pactly supported on the ball of radius δ centered at x, i.e.,         points xi , R(·) is a regularization term on the coefficients
Bδ (x) and Ω = Ω ∪ Ωδ .                                               that improves the conditioning of the optimization problem,
                                                                      and (9) depends on the physics of the problem (as an exam-
The algorithm                                                         ple, it may correspond to enforcing that the surrogate model
To learn the kernel K, we assume that we are given N pairs            reproduces exactly a certain class of solutions).
of forcing terms and corresponding solutions to (1), nor-
malized with respect to the L2 norm of each solution over                   Dispersion in heterogeneous materials
Ω × [0, Ttr ]. These are denoted by                                   We apply the learning algorithm described above to the
                                                 N                    propagation of waves in a one-dimensional heterogeneous
              Dtr = {(uk (x, t), fk (x, t))}k=1 ,               (4)
                                                                      bar, like the one reported in Figure 1, with an ordered mi-
for x ∈ Ω and t ∈ (0, Ttr ]. Similarly to (You et al. 2021),          crostructure, i.e. two materials with the same length alter-
we represent K as a linear combination of Bernstein basis             nate periodically. Our goal is to learn a nonlocal model able
polynomials:                                                          to reproduce wave propagation on distances that are much
                                                                      larger than the size of the microstructure without resolving
             X       M                 
             |y|            Cm             y                          the microscales. The high-fidelity model we rely on is the
         K        =             B
                             d+2 m,M
                                              ,          (5)          classical wave equation; the corresponding high-fidelity data
              δ       m=0
                           δ               δ
                                                                      used for training and validation are obtained with the solver
where the Bernstein
                    basis functions are defined as                  described below.
                   M
   Bm,M (x) =           xm (1 − x)M −m for 0 ≤ x ≤ 1                  High-fidelity data
                   m
and where Cm ∈ R. Note that, by construction, this kernel             For both training and validation purposes we generate data
guarantees that (2) is well-posed (Du, Tao, and Tian 2018).           using high-fidelity simulations for the propagation of stress
   We machine-learn the nonlocal model by finding optimal             waves within the microstructure of the heterogeneous, linear
parameters {Cm } such that solutions ûk to (2), for f = fk           elastic bar. This method, which will be referred to as Direct
and the kernel function K associated to {Cm }, are as close           Numerical Solution (DNS), constructs an arbitrarily com-
as possible to the training variable uk .                             plex wave diagram (also called an x-t diagram), that treats
   In this work we numerically approximate ûk by ūk using           the mutual interaction and superposition of many wavefronts
a central-differencing scheme in time with time step dt, i.e.         moving in either direction. The bar is discretized into nodes
                                                                      such that it takes a constant amount of time ∆t for a wave to
       ūn+1
         k   (xi ) = 2ūnk (xi ) − ūn−1
                                     k   (xi )                        travel between nodes γ and γ + 1, regardless of the elastic
                                                                (6)
                  + dt2 (LK,h [ūnk ](xi ) + fk (xi , tn )) ,         wave speed in the material between these two nodes. There-
                                                                      fore, in a heterogeneous medium, the spacing between nodes
where ūn+1
        k   (xi ) represents the k-th approximate solution at         is not constant. Each node γ, at each time step n, has veloc-
time step tn+1 and at discretization point xi , and LK,h is an        ity vγn (note that, in this case, the subscript refers to position,
      time                                                         1) Oscillating source. We set
                                                                                               2
                                                                                                  b = 50, v(x, 0) = u(x, 0) = 0,
                                                                                   2x 2 − t−t0
                                                                   f (x, t) = e−( 5kL ) e         cos2 2πx
                                                                                                            
                                                                                                         kL , k = 1, 2, . . . , 20,
                                                                                            tp

                                                                   t0 = tp = 0.8.
                                       right-running
                                       transmitted wave            2) Plane wave. For b = 50, f (x, t) = 0 and u(x, 0) = 0, we
                    left-running                                   prescribe v(−b, t) = sin(ωt) for ω = 0.35, 0.7, · · · , 3.85.
                    transmitted wave
                                                                   3) Wave packet. For b = 133.3, f (x, t) = 0 and u(x, 0) = 0,
                                                                   we prescribe v(−b, t) = sin(ωt) exp(−(t/5 − 3)2 ) for ω =
                     right-running
                     incident wave                                 2, 3.9, 5.

                                        left-running               4) Impact. For b =266.6, f (x, t) = 0 and u(x, 0) = 0, we
                                        incident wave              prescribe v(x, 0) = 1 for all x ∈ [−b, −b + 1.6] and v = 0
                                                                   outside of this interval. This initial condition represents an
                                                                   impactor hitting the bar at time zero, generating a velocity
                                                                   pulse of width roughly 3.2 that propagates into the interior
                                                        position   of the bar. The pulse attenuates and changes shape as it en-
                                                                   counters the many microstructural interfaces.

Figure 2: Interaction of two waves in the DNS method. Each         Training procedure
node may or may not be located at a material interface.            For the optimization problem (7) we choose a Tikhonov
                                                                                                                    PM
                                                                   regularization of the form R({Cm }) = M+1 m=0 Cm         2
                                                                                                                               ,
                                                                   where the regularization weight  is chosen empirically to
as opposed to the previous section where it corresponds to a
                                                                   guarantee accurate predictions, as we explain later on. The
specific sample k). To compute the velocities in the next time
                                                                   physics-based constraints in (9) are defined as follows and
step, it is assumed that two waves moving in opposite direc-
                                                                   also discretized by Riemann sum; they are used to explicitly
tions converge on the node γ at time step n (see Figure 2).
                                                                   prescribe values of CM −1 and CM :
The waves shown in the figure can have unequal slopes on
the x-t diagram because the materials on either side of node              M       Z δ 2         
                                                                         X             y         |y|
γ can have different waves speeds c. The jump conditions for                  Cm          B
                                                                                         3 m,M
                                                                                                      dy = ρc20 ,
the waves are applied that relate the stress change [σ] across           m=0        0  δ          δ
a wave to the velocity change [v]. These jump conditions                                                                   (10)
                                                                          M           δ 4       
                                                                                                 |y|
                                                                                  Z
have the following form:
                                                                         X             y                          3
                                                                              Cm          B
                                                                                         3 m,M
                                                                                                      dy = −4ρc0 R,
                       [σ] = ±ρc[v],                                     m=0        0 δ           δ

where ρ is the mass density, and where the + and − signs           where ρ is the density and c0 is the effective wave speed
apply to right-running waves and left-running waves respec-        for infinitely long wavelengths. For ρ = 1, it is given by
                                                                                              1
tively. From these conditions, the velocity vγn+1 can be com-      c0 = (2/(1/E1 + 1/E2 ) 2 . R is the second derivative of the
puted explicitly from the values at the adjacent nodes in time     wave group velocity with respect to the frequency ω evalu-
step n − 1. Externally applied forces can also be included in      ated at ω = 0. Both parameters are obtained by simulating a
a straightforward way. After vγn+1 is computed, the updated        very low frequency plane wave propagating through the mi-
displacement is approximated by                                    crostructure over a long distance using DNS (Silling 2020).
                  un+1 = unγ + ∆tvγn+1 .                           These parameters primarily affect simulations at large times,
                   γ
                                                                   t > 10. However, due to practical limitations on computer
Details of the method can be found in (Silling 2020).              resources, our training simulations are restricted to t ≤ 2.
   This DNS solver has the important advantage of not us-          Therefore, we incorporate these parameters as constraints
ing an approximate representation of derivatives in space or       obtained from DNS as indicated in (10), rather than attempt-
time for the computation of the velocity, which is, therefore,     ing to learn these through our algorithm. The first constraint
free from truncation error and other sources of discretization     in (10) is also used for similar purposes in (Xu, D’Elia, and
error that are usually encountered with PDE solvers. This          Foster 2020) and prescribed in a weak sense by penalization.
allows us to model the propagation of waves through many              Training is performed with DNS data of type 1) and 2).
thousands of microstructural interfaces without the need to        Parameters for the nonlocal solver and the optimization al-
worry about what features of the velocity are real and what        gorithm are set to h = 0.05, dt = 0.02, Ttr = 2, δ =1.2,
are numerical artifacts.                                           M = 24 and  = 0.01. The optimization problem (7) is
   We consider four types of data and use the first two for        solved with L-BFGS. Note that we empirically choose δ and
training and the last two for validation of our algorithm. In       in such a way that the group velocity, defined below, corre-
all our experiments we set L = 0.2, E1 = 1, E2 = 0.25,             sponding to the optimal kernel is as close as possible to the
ρ = 1, and the symmetric domain Ω = (−b, b). Discretiza-           one computed with DNS.
tion parameters for the DNS solver are set to ∆t = 0.01 and           The optimal kernel, Kopt , is reported in Figure 3; as ex-
max{∆x} = 0.01.                                                    pected from the literature (Xu and Foster 2020; Xu, D’Elia,
            60                                                                                       0.7

            50                                                                                       0.6

                                                                                                     0.5        DNS data
            40
                                                                                                                Optimal kernel


                                                                                    Group velocity
                                                                                                     0.4        Const kernel
            30
   K(|y|)


                                                                                                     0.3
            20
                                                                                                     0.2
            10
                                                                                                     0.1
              0                                                                                         0

            -10                                                                                      -0.1
                  0      0.2           0.4         0.6        0.8     1     1.2                             0         2               4   bs    6    8        10
                                         Bond length (|y|)                                                                       Angular frequency

                                                                                                     0.7
 Figure 3: Optimal kernel Kopt as a function of distance.
                                                                                                     0.6

                                                                                                     0.5
            20


                                                                                    Group velocity
                                                                                                     0.4
                                                                                                                  DNS data
                                                                                                     0.3          Optimal kernel
            15
                                                                                                                   = 0.6,    = 1e-1
                                                                                                     0.2           = 0.9,    = 1e-1

                                                                                                     0.1
   2


            10
                                                                                                        0

                                                                                                     -0.1
              5                                                                                             0         1               2         3    4   bs   5
                                                                                                                                 Angular frequency

              0
                  0     20        40          60         80     100   120   140
                                                                                  Figure 5: Comparison of the group velocity. Upper: group
                                             Wave number                          velocity corresponding to Kopt , Kconst , and DNS data. Bot-
                                                                                  tom: group velocity corresponding to Kopt for different
            Figure 4: Dispersion curve associated with Kopt .                     pairs (δ, ).

and Foster 2020; You et al. 2021; Weckner and Silling                             pass bands with the coarse discretization that is used in
2011), we observe a sign-changing behavior. We also com-                          fitting our nonlocal kernel, we address only the first, low-
pute the corresponding dispersion ω(k) and group velocity                         frequency pass band, i.e. ω ∈ (0, ωbs ), where “bs” stands
vg (ω) = dω/dk. For a given kernel K and different fre-                           for band stop. Hence, the optimal kernel is suitable only for
                   2π
quencies ki = 0, 200h , · · · , 2π
                                 h , the corresponding angular                    wavelengths that are bigger than the microstructure; this is
frequency ω(ki ) and group velocity vg (ω(ki )) are approxi-                      enough to reproduce the physically most important features
mated by                                                                          of wave propagation in layered media for typical applica-
                    1X                                                            tions.
          ω(ki )2 ≈          K(|yq |)(1 − cos(ki yq ))h,                              The profile of the group velocity shows the improved ac-
                    ρ q
                                                                                  curacy of our optimal kernel that not only matches the be-
                                  ω(ki+1 ) − ω(ki−1 )                             havior for low values of ω, but also catches the behavior at
                  vg (ω(ki )) ≈                       ,                           ω = ωbs ≈ 4. This fact has important consequences on the
                                     ki+1 − ki−1
                                                                                  ability to reproduce wave propagation for values of ω big-
where yq belong to a uniform grid of size h in (−δ, δ).                           ger than ωbs . In order to justify the statement above on the
   The dispersion curve is reported in Figure 4, its positivity                   optimality of the parameters δ and , we report the group ve-
indicates that Kopt corresponds to a physically stable mate-                      locity profile in correspondence of different pairs; it is clear
rial model. The group velocity is reported in the upper plot                      from the profiles in Figure 5 that (δ, ) = (1.2, 0.01) pro-
of Figure 5 in comparison with the curve computed with                            vides the best match both in terms of curvature at ω = 0 and
DNS by observing the speed of a wave packet of a given                            identification of the band stop.
frequency as it moves through the microstructure. We also
display the group velocity associated with an alternative ker-                    Numerical validation
nel obtained for the same material by a completely different
                                                                                  We test the performance of the optimal kernel Kopt on data
method (Silling 2020). This alternative kernel is a constant,
                                                                                  sets of type 3) and 4), i.e. the problem setting considered
specifically, we have that Cm = Kconst = 0.7714, for M = 3
                                                                                  for validation has different model parameters, including the
and δ = 0.15.
                                                                                  domain, than the one used for training and, hence, these tests
   It is well known that layered, periodic elastic media have
                                                                                  serve as an indicator of the generalization properties of our
a band structure for wave propagation, see (Bedford and
                                                                                  algorithm.
Drumheller 1994, pages 121-122). In the present study, be-
cause it is not possible to reproduce the higher-frequency                        Wave packet. For data type 3) we numerically compute so-
                      1.5                                                                                              1.5

                                DNS data                                                                                           DNS data
                         1                                                                                                1
                                Optimal kernel                                                                                     Const kernel

                      0.5                                                                                              0.5
           Velocity


                                                                                                            Velocity
                         0                                                                                                0


                      -0.5                                                                                             -0.5


                        -1                                                                                               -1
                        -150     -140         -130    -120       -110       -100        -90    -80                       -150      -140         -130   -120    -110            -100     -90      -80
                                                           Position                                                                                      Position
                      0.8                                                                                                 1
                                DNS data                                                                                          DNS data
                      0.6
                                Optimal kernel                                                                                    Const kernel
                      0.4                                                                                              0.5

                      0.2
           Velocity


                                                                                                            Velocity
                         0                                                                                                0

                      -0.2

                      -0.4                                                                                             -0.5

                      -0.6

                      -0.8                                                                                               -1
                          -80           -60          -40              -20           0           20                       -100                    -50                0                       50
                                                           Position                                                                                      Position
            0.015                                                                                                  0.15

                  0.01                                                                                                           DNS data
                                                                            DNS data                                   0.1
                                                                                                                                 Const kernel
            0.005                                                           Optimal kernel

                                                                                                                   0.05
                         0
Velocity


                                                                                                       Velocity


           -0.005                                                                                                         0

              -0.01
                                                                                                                  -0.05
           -0.015
                                                                                                                       -0.1
              -0.02

           -0.025                                                                                                 -0.15
                 -135            -130         -125    -120       -115       -110        -105   -100                             -130          -120      -110            -100          -90        -80
                                                           Position                                                                                      Position


Figure 6: Velocity computed with Kopt . Plots from top to                                             Figure 7: Velocity computed with Kconst . Plots from top to
bottom correspond to: 1. ω1 = 2 at t = 100; 2. ω2 = 3.9 at                                            bottom correspond to: 1. ω1 = 2 at t = 100; 2. ω2 = 3.9
t = 320; 3. ω3 = 5 at t = 100.                                                                        at t = 320; 3. ω3 = 5 at t = 100. The optimal kernel Kopt
                                                                                                      obtained by machine learning (Figure 6) clearly performs
                                                                                                      better than Kconst for the second and third cases (ω2 and
lutions to (2) using Kopt and DNS data as nonlocal bound-                                             ω3 ).
ary conditions. We consider solutions corresponding to three
values of ω: ω1 = 2 < ωbs , ω2 = 3.9 ≈ ωbs and ω3 = 5 > ωbs .
Note that the latter value is beyond the band stop and, as                                            port in Figure 7 the behavior of the velocity corresponding
such, corresponds to a zero group velocity, i.e. the wave does                                        to Kconst at time t = 100, t = 320 and 100, respectively for
not travel in time. In Figure 6 we report the velocity corre-                                         ω1 , ω2 and ω3 . Comparison with DNS data shows that, for
sponding to the computed displacement ū at time t = 100,                                             ω2 the wave associated with Kconst is traveling faster than
t = 320, and t = 100 for ω1 , ω2 , and ω3 respectively; as a                                          the exact one and, for ω3 , it keeps traveling while the exact
reference, we also report the exact DNS velocity. Our results                                         wave is not propagating.
indicate that our kernel can accurately reproduce solutions
of type 3) at times larger than Ttr and for all values of ω,                                          Impact. We use the optimal kernel to compute solutions cor-
even larger than ωbs . This is possible because the group ve-                                         responding to data type 4). In Figure 8 we report the velocity
locity corresponding to Kopt reproduces the true group ve-                                            profile at different time steps in correspondence of Kopt and
locity very accurately, see Figure 5. In particular, detecting                                        DNS data, displayed for comparison. Figure 9 displays the
the presence of a band stop allows us to accurately predict                                           same results in correspondence of Kconst . These results indi-
the wave propagation for values of ω > ωbs . Due to the poor                                          cate that our optimal kernel can accurately predict the short-
accuracy of the group velocity associated to Kconst , corre-                                          and long-time wave propagation, as opposed to the constant
sponding solutions are not as accurate for ω in the proximity                                         kernel that successfully predicts the long-time behavior only.
of ωbs and beyond. To illustrate this phenomenon, we re-                                              We also point out that for values of (δ, ) for which the group
                                                    T=20                                                                                     T = 20
             0.8                                                                                            0.8

                                                                                                                        DNS data
             0.6                                             DNS data                                       0.6         Constant kernel
                                                             Optimal kernel


             0.4                                                                                            0.4
  Velocity


                                                                                                 Velocity
             0.2                                                                                            0.2


                0                                                                                              0


             -0.2                                                                                           -0.2
                -270          -260          -250            -240           -230         -220                   -270     -265        -260         -255         -250     -245   -240
                                               Position
                                                T=600                                                                                       Position
                                                                                                                                            T = 600
             0.8                                                                                            0.8


                                                                                                                        DNS data
             0.6                                                                                            0.6
                                                                   DNS data                                             Constant kernel
                                                                   Optimal kernel

             0.4                                                                                            0.4
  Velocity


                                                                                                 Velocity
             0.2                                                                                            0.2


                0                                                                                              0


             -0.2                                                                                           -0.2
                    80   90          100      110          120       130          140   150                        80   90       100       110          120      130    140   150
                                               Position                                                                                     Position


Figure 8: Velocity profile for the Impact problem at T = 20                                    Figure 9: Velocity profile for the Impact problem at T = 20
and T = 600 with Kopt .                                                                        and T = 600 with Kconst . The optimal kernel Kopt obtained
                                                                                               by machine learning (Figure 8) provides better agreement
                                                                                               with the DNS data than Kconst at the earlier time (T = 20)
velocity is not accurate, the predicted velocity and displace-                                 by reducing the size of the oscillations that trail the main
ment exhibit non-physical oscillations, which disappear in                                     pulse. For large T , the solution is dominated by the low fre-
correspondence of pairs that guarantee an accurate group ve-                                   quency components of the pulse, for which the two kernels
locity profile.                                                                                behave similarly.

                                           Conclusion                                                                          Acknowledgments
We introduced a new data-driven, optimization-based algo-                                      MD and SS are supported by the Sandia National Labo-
rithm for the identification of nonlocal kernels in the con-                                   ratories (SNL) Laboratory-directed Research and Develop-
text of wave propagation through material featuring het-                                       ment program and by the U.S. Department of Energy, Of-
erogeneities at the microscale. The corresponding nonlocal                                     fice of Advanced Scientific Computing Research under the
model is well-posed by construction and allows for accu-                                       Collaboratory on Mathematics and Physics-Informed Learn-
rate simulations at a larger scale than the microstructure.                                    ing Machines for Multiscale and Multiphysics Problems
We stress the fact that our algorithm does not require a pri-                                  (PhILMs) project. SNL is a multimission laboratory man-
ori knowledge of the microstructure (often unknown and/or                                      aged and operated by National Technology and Engineering
hard to model), but only requires high-fidelity measurements                                   Solutions of Sandia, LLC., a wholly owned subsidiary of
of the displacements or the velocity. We also point out that                                   Honeywell International, Inc., for the U.S. Department of
our algorithm has excellent generalization properties as the                                   Energy’s National Nuclear Security Administration under
optimal kernel performs well at much larger times than the                                     contract DE-NA-0003525. This paper, SAND2020-13633,
time instants used for training and on problem settings that                                   describes objective technical results and analysis. Any sub-
are substantially different from the training data set.                                        jective views or opinions that might be expressed in this pa-
   One of the most important findings in this work is the                                      per do not necessarily represent the views of the U.S. De-
key role of the group velocity in the accuracy of the predic-                                  partment of Energy or the United States Government. HY
tions; in fact, our criterion for the choice of the horizon δ and                              and YY are supported by the National Science Foundation
the regularization weight  is the accurate prediction of the                                  under award DMS 1753031.
group velocity profile. Given the critical role of such quan-
tity, our future work includes the identification of the optimal
horizon by, possibly, embedding constraints on the group ve-
locity to the training procedure. Another natural follow-up
work is the illustration of the efficiency of our algorithm on
two- and three-dimensional test cases.
                       References                              Silling, S. 2020. Propagation of a Stress Pulse in a Heteroge-
Alali, B.; and Lipton, R. 2012. Multiscale dynamics of het-    neous Elastic Bar. Sandia Report SAND2020-8197, Sandia
erogeneous media in the peridynamic formulation. Journal       National Laboratories.
of Elasticity 106(1): 71–103.                                  Silling, S. A. 2000. Reformulation of elasticity theory for
                                                               discontinuities and long-range forces. Journal of the Me-
Askari, E.; Bobaru, F.; Lehoucq, R.; Parks, M. L.; Silling,
                                                               chanics and Physics of Solids 48(1): 175–209.
S. A.; and Weckner, O. 2008. Peridynamics for multiscale
materials modeling. In Journal of Physics: Conference Se-      Trask, N.; You, H.; Yu, Y.; and Parks, M. L. 2019. An
ries, volume 125, 012078. IOP Publishing.                      asymptotically compatible meshfree quadrature rule for
                                                               nonlocal problems with applications to peridynamics. Com-
Bedford, A.; and Drumheller, D. S. 1994. Introduction to       puter Methods in Applied Mechanics and Engineering 343:
Elastic Wave Propagation. Wiley.                               151–165.
Benson, D.; Wheatcraft, S.; and Meerschaert, M. 2000. Ap-      Weckner, O.; and Silling, S. A. 2011. Determination of
plication of a fractional advection-dispersion equation. Wa-   nonlocal constitutive equations from phonon dispersion re-
ter Resources Research 36(6): 1403–1412.                       lations. International Journal for Multiscale Computational
Burkovska, O.; Glusa, C.; and D’Elia, M. 2020. An              Engineering 9(6).
optimization-based approach to parameter learning for frac-    Xu, X.; D’Elia, M.; and Foster, J. 2020. Bond-Based
tional type nonlocal models. ArXiv:2010.03666.                 Peridynamic Kernel Learning with Energy Constraint.
D’Elia, M.; De los Reyes, J.; and Trujillo, A. M. 2019.        ArXiv:2101.01095.
Bilevel parameter optimization for nonlocal image denois-      Xu, X.; and Foster, J. 2020. Deriving peridynamic influ-
ing models. ArXiv1912.02347.                                   ence functions for one-dimensional elastic materials with
D’Elia, M.; Du, Q.; Gunzburger, M.; and Lehoucq, R. 2017.      periodic microstructure. ArXiv:2003.05520.
Nonlocal convection-diffusion problems on bounded do-          You, H.; Yu, Y.; and Kamensky, D. 2020. An asymptoti-
mains and finite-range jump processes. Computational           cally compatible formulation for local-to-nonlocal coupling
Methods in Applied Mathematics 29: 71–103.                     problems without overlapping regions. Computer Methods
Du, Q.; Tao, Y.; and Tian, X. 2018. A peridynamic model of     in Applied Mechanics and Engineering 366: 113038.
fracture mechanics with bond-breaking. Journal of Elastic-     You, H.; Yu, Y.; Trask, N.; Gulian, M.; and D’Elia, M. 2021.
ity 132(2): 197–218.                                           Data-driven learning of robust nonlocal physics from high-
                                                               fidelity synthetic data. Computer Methods in Applied Mech-
D’Elia, M.; and Gunzburger, M. 2016. Identification of the
                                                               nics and Engineering 374: 113553.
diffusion parameter in nonlocal steady diffusion problems.
Applied Mathematics & Optimization 73(2): 227–249.
Gilboa, G.; and Osher, S. 2007. Nonlocal linear image reg-
ularization and supervised segmentation. Multiscale Model.
Simul. 6: 595–630.
Ha, Y. D.; and Bobaru, F. 2011. Characteristics of dynamic
brittle fracture captured with peridynamics. Engineering
Fracture Mechanics 78(6): 1156–1168.
Meerschaert, M.; and Sikorskii, A. 2012. Stochastic models
for fractional calculus. Studies in mathematics, Gruyter.
Pang, G.; D’Elia, M.; Parks, M.; and Karniadakis, G. E.
2020. nPINNs: nonlocal Physics-Informed Neural Networks
for a parametrized nonlocal universal Laplacian operator.
Algorithms and Applications. To appear in Journal of Com-
putational Physics.
Pang, G.; Lu, L.; and Karniadakis, G. E. 2019. fPINNs:
Fractional Physics-Informed Neural Networks. SIAM Jour-
nal on Scientific Computing 41: A2603–A2626.
Scalas, E.; Gorenflo, R.; and Mainardi, F. 2000. Fractional
calculus and continuous time finance. Physica A 284: 376–
384.
Schumer, R.; Benson, D.; Meerschaert, M.; and Baeumer,
B. 2003. Multiscaling fractional advection-dispersion equa-
tions and their solutions. Water Resources Research 39(1):
1022–1032.