=Paper=
{{Paper
|id=Vol-2587/article_11
|storemode=property
|title=Physics-Informed Spatiotemporal Deep Learning for Emulating Coupled Dynamical Systems
|pdfUrl=https://ceur-ws.org/Vol-2587/article_11.pdf
|volume=Vol-2587
|authors=Anishi Mehta,Cory Scott,Diane Oyen,Nishant Panda,Gowri Srinivasan
|dblpUrl=https://dblp.org/rec/conf/aaaiss/MehtaSOPS20
}}
==Physics-Informed Spatiotemporal Deep Learning for Emulating Coupled Dynamical Systems==
<pdf width="1500px">https://ceur-ws.org/Vol-2587/article_11.pdf</pdf>
<pre>
                             Physics-Informed Spatiotemporal Deep Learning
                               for Emulating Coupled Dynamical Systems

            Anishi Mehta (1, 3), Cory Scott (2, 3), Diane Oyen (3), Nishant Panda (3), Gowri Srinivasan (3)
            (1) Georgia Institute of Technology, (2) University of California-Irvine, (3) Los Alamos National Laboratory


                             Abstract

  Accurately predicting the propagation of fractures, or cracks, in brittle materials is an important
  problem in evaluating the reliability of objects such as airplane wings and concrete structures.
  Efficient crack propagation emulators that can run in a fraction of the time of high-fidelity physics
  simulations are needed. A primary challenge of modeling fracture networks and the stress propagation
  in materials is that the cracks themselves introduce discontinuities, making existing partial
  differential equation (PDE) discovery models unusable. Furthermore, existing physics-informed neural
  networks are limited to learning PDEs with either constant initial conditions or changes that do not
  depend on the PDE outputs at the previous time. In fracture propagation, at each time-step, there is
  a damage field and a stress field, where the stress causes further damage in the material. The stress
  field at the next time step is affected by the discontinuities introduced by the propagated damage.
  Thus, the stress and damage fields are heavily dependent on each other, which makes modeling the
  system difficult. Spatiotemporal LSTMs have shown promise in the area of real-world video prediction.
  Building on this success, we approach this physics emulation problem as a video generation problem:
  training the model on simulation data to learn the underlying dynamic behavior. Our novel deep
  learning model is a Physics-Informed Spatiotemporal LSTM that uses modified loss functions and
  partial derivatives from the stress field to build a data-driven coupled dynamics emulator. Our
  approach outperforms other neural net architectures at predicting subsequent frames of a simulation,
  enabling fast and accurate emulation of fracture propagation.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License
Attribution International (CC BY 4.0).

[Figure 1 diagram labels: micro cracks and loading in one cell; emulate dynamics; constitutive model
for summary statistics; continuum model; damage evolution accounts for crack interactions.]
Figure 1: Machine learning emulates dynamics of micro cracks to inform the continuum model.

            Introduction and Motivation

Brittle materials fail suddenly with little warning due to the growth of micro-fractures that quickly
propagate and coalesce. Prediction of fracture propagation in brittle materials is a multi-scale
modeling problem whose time dynamics are well understood at the micro-scale but do not scale well to
the macro-scale necessary for practical evaluation of materials under strain (White 2006; Hyman et al.
2016; Kim et al. 2014). Fracture formation in brittle materials is typically simulated using parallel
implementations of finite discrete element methods (FDEM). Industrial software packages applying these
methods have been developed, many of which are capable of representing the high-fidelity dynamics and
are extremely parallelized (Hyman et al. 2015; Rougier et al. 2014). Yet, these codes are unable to
simulate samples large enough to have real-world scientific applications, due to the large
computational requirements of simulating the behavior at the spatial and temporal resolutions
necessary. Upscaled continuum representations are used as an approximation because they discard
topological features of the simulated material and are therefore faster; however, precisely because
they omit these features, they fail to match experimental observations (Vaughn et al. 2019). Thus, we
develop a spatio-temporal machine learning model to emulate the micro-scale physics model and estimate
the necessary quantities of interest needed to ensure accuracy of the continuum-scale model, as in
Figure 1.
   The goal is to predict summary statistics, or quantities of interest, for both the damage field and
the stress field in a simulated 2-dimensional material from initial conditions until the point of
failure (when a single fracture spans the width of the material). The dynamics of the stress field
cannot be modeled without the damage and vice versa. When damage is static, the evolution of stress
over the material mimics properties of fluid flow. However, the damage caused in the material changes
the behavior of stress to no longer be
governed by a single PDE, e.g. stress accumulates at crack tips and causes cracks (each of which is a
discontinuity in the stress field) to spread further. Thus, instead of using solid state dynamics
equations to predict this stress field, we must extend approaches successfully demonstrated in machine
learning to couple the dynamics of the damage field and stress field.
   It is tempting to treat damage and the stress tensor at each location simply as different channels
in the same time series and apply methods from the extensive prior work on video prediction (Wang et
al. 2018). However, this approach is ineffective because although the damage and stress fields are
highly coupled, they have dramatically different dynamics in time. Therefore, one model cannot easily
predict both quantities simultaneously. The damage data is binary-valued and sparse: most of the
finite elements remain undamaged for the entire simulation, as shown in Figure 2a. The stress data is
real-valued, where values as small as 10^-6 are significant yet magnitudes also range up to 10^8 (see
Figure 2), and the stress field has spatial discontinuities wherever damage has occurred. Furthermore,
unlike video prediction, which is concerned with precise pixel-by-pixel accuracy, we need to emulate
the most important features of the simulation over a long time horizon (hundreds of frames in the
future) with high enough accuracy to predict several quantities of interest needed by the continuum
model.
   In order to capture the long-term frame dependencies, recurrent neural networks (RNNs) (Williams
and Zipser 1995) have been recently applied to video predictive learning. Former state-of-the-art
models applied complex nonlinear transition functions from one frame to the next, constructing a dual
memory structure (Wang et al. 2018) upon Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber
1997b). To emulate the spatio-temporal model, we propose a Physics-Informed Spatiotemporal LSTM model.
First, linear interpolation is used to coarsen the damage data to retain the important fracture
features while discarding the uninformative undamaged regions. Next, a modified recurrent neural
network learns temporal evolution in the latent space representation (Wang et al. 2018). Finally, the
predictions from the recurrent neural network are passed to the decoder sub-network of the
convolutional autoencoder, and decoded into time-advanced simulation states. As input to the
convolutional autoencoder network, we include point estimates of partial derivatives of stress values.
This allows us to predict coupled dynamical PDEs, unlike existing PDE discovery models.
   Results show that this approach makes accurate predictions of fracture propagation. Our method
outperforms other neural net architectures at predicting subsequent frames of a simulation, and
reproduces physical quantities of interest with higher fidelity.

                      Related Work

Machine learning-based prediction of behavior in physical systems in general, and partial differential
equations specifically, is an area of active research. Broadly, machine learning approaches to PDE
emulation fall into one of two categories. In the first category are approaches that accelerate
methods to solve a PDE whose form is known using data; for example, (Han, Jentzen, and E 2018). Long,
She, and Mukhopadhyay (2018) use convolutions in the LSTM cells in a fully convolutional network to
train a PDE solver with varying input perturbations. Our work falls under the second category of
approaches, which emulate the behavior of a system governed by PDEs, such as fluid dynamics (Kim et
al. 2019; Wiewel, Becher, and Thuerey 2019; White, Ushizima, and Farhat 2019; Guo, Li, and Iorio
2016). Unlike these fluid dynamics emulators where the boundary conditions and topology are constant,
in our case the evolution of the damage field changes both the boundary conditions and topology.
Furthermore, our problem has a bidirectional relationship between stress (which is governed by a PDE
when damage is constant) and damage (which is not), and hence we cannot fit a simple PDE and simply
unroll forward in time.

        Physics-Informed Spatiotemporal Model

Formally, the problem to solve is: given initial conditions, predict a time series of damage field and
stress field evolution. The initial conditions given to this generative model are some number of
simulated frames, from which the rest of the time-series is predicted.

Architecture of the deep learning model
Our data-driven approach to predict physical behavior in a complex system leverages advances in deep
neural networks (Hinton et al. 2012). We use a Convolutional Neural Network (CNN) (Krizhevsky,
Sutskever, and Hinton 2012) to learn a nonlinear mapping from the stress and damage values in local
neighborhoods at time t to the stress and damage fields at the next time-step. CNNs are designed for
problems with high spatial correlation and translation invariance, making them an ideal choice for
physical problems.
   In prior work, we found that using a CNN alone to make predictions at the next time step tends to
produce biased predictions of lower stress and damage values than the truth. As we unroll the
predictions over time, these errors compound, resulting in highly inaccurate predictions of stress
values after 10 or so frames in time, and virtually no predictions of damage occurring. We incorporate
an explicit modeling of the time component using a recurrent neural network (RNN) which shares weights
over subsequent time-steps of the input (Pearlmutter 1989). The hidden state of the RNN after
consuming an entire time series thus is a fixed-length encoding of that (varying-length) time series.
Specifically, we use a Long Short-Term Memory network (LSTM) that allows the network to separately
"remember" both long-term global context and short-term recent context (Hochreiter and Schmidhuber
1997a).
   The spatial and temporal elements can be combined with a Convolutional LSTM or ConvLSTM which
maintains the spatial structure of the input as it processes time series. We find that the best model
is a Spatiotemporal LSTM (ST-LSTM) (Lu, Hirsch, and Scholkopf 2017). The main reason for the improved
predictive power is the inclusion of the Spatiotemporal Memory in each LSTM block in addition to
                 (a) Damage                                (b) Stress t = 10                     (c) Coarsened damage field

Figure 2: (a) Ratio of pixels that are damaged versus time. (b) Histogram of absolute value of stress values at one time frame, for
an example simulation run. Note that values range from very small (≈ 10^-8) to very large (≈ 10^8). (c) Example of coarsened
damage field at the initial time frame for an example simulation. Field is coarsened using the Lanczos method by a factor of 8.
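
The coarsening step behind panel (c) can be sketched in a few lines. This is a minimal illustration
under stated assumptions, not the authors' code: a plain block average is swapped in for the Lanczos
filter, and the function names `coarsen` and `binarize` are hypothetical. Only the factor of 8 and the
0.11 damage threshold are taken from the paper's description of the coarsening.

```python
def coarsen(field, factor=8):
    """Block-average a 2D list of floats by `factor` along both axes.

    Assumption: block averaging stands in for the Lanczos filter used in the paper.
    """
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("field dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block_sum = sum(
                field[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            )
            coarse_row.append(block_sum / (factor * factor))
        coarse.append(coarse_row)
    return coarse


def binarize(field, threshold=0.11):
    """Mark a coarse pixel as damaged (1) when its value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in field]


# Toy 16 x 16 damage field: one crack, 8 pixels long and 1 pixel wide.
damage = [[0.0] * 16 for _ in range(16)]
for col in range(8):
    damage[3][col] = 1.0

coarse_damage = binarize(coarsen(damage))
```

In this toy case the crack fills 8 of the 64 fine pixels in its block (mean 0.125), so it survives the
0.11 threshold, while the three undamaged blocks stay at 0.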


the Temporal Memory. While temporal states are only shared horizontally between time-steps, the
spatiotemporal state is shared between the stacked ST-LSTM blocks. This enables efficient flow of
spatial information. We make the memory representations of the ST-LSTM cells common between all input
fields. This feature allows us to model the highly co-dependent nature of the stress and damage
fields. Figure 3 shows our novel physics-informed architecture. We introduce various aspects inspired
by the physical properties of the damage propagation problem, allowing for a closer fitting PDE.

Coarsening of input damage images
The damage field is very sparse, with damaged pixels forming less than 2% of the entire spatial
domain. The increase in damaged pixels from the initial seed damage at t = 0 to the damage at the
final step, when the sample has failed, is less than 0.2% of the total pixels, as shown in Figure 2a.
This makes it difficult for an ML model to capture and predict this information, since (formulated as
a binary classification task) the two prediction classes are extremely imbalanced. Furthermore, the
distance between cracks is quite large relative to the size of a crack, which complicates the use of
convolutional filters. Hence, we coarsen the damage data with a linear Lanczos method (Lanczos 1950)
with a filter of 3x3 (see Figure 2c). We then convert the damage field to a binary 0-1 field by
applying a threshold of 0.11, which is a standard threshold in this domain beyond which damage cannot
be repaired, i.e. any pixel with a value higher than this is considered damaged and all others are
non-damaged. In this manner, we effectively coarsen the fields by a factor of 8. Empirically, we find
that this coarsening method preserves the important features to accurately predict physical quantities
of interest.

Informing the model with partial derivatives
The FDEM model that we are emulating is a Markovian process: the damage and stress fields of the next
time-step are completely determined by the current state. Unlike the FDEM model, the machine learning
model does not have the actual PDE to define the dynamics, and so we predict each time-step from up to
k previous time-steps. We use k = 3 so that dynamic information can be observed.
   The stress field, without damage, follows a 2nd-order PDE. To build a deep learning model that fits
to such PDEs, we include the 1st and 2nd order partial spatial and temporal derivatives as input.
Using k = 3 time-steps as input allows us to capture temporal derivative information accurately. We
use the gradient and Hessian calculating functions of TensorFlow to calculate the gradients (Abadi and
others 2015). At each step, we append the derivatives to the input fields and predict them as part of
the next-step prediction. This enables the spatiotemporal memory blocks of the network to carry
information about spatial and temporal derivatives of stress. Through this, we overcome the issue of
vanishing gradients discussed in (Wang et al. 2018), as well as capture the monotonically increasing
nature of the damage field. As a self-check, we ensure that the mean squared error between the
predicted derivatives and the derivatives calculated from the stress fields lies within a pre-decided
threshold σ. These modeling choices were drawn from the physical principles governing the damage
propagation process, creating a novel "physics-informed" deep learning model. Empirical results show
that this physics-informed approach of training the model significantly improves accuracy.

                         Experiment

We focus on three variants of LSTM models: Stacked LSTM, ConvLSTM, and ST-LSTM. All three models take
in the first k = 3 time-steps of coarsened data as input frames. Next, to encourage the model to fit
to a PDE, we calculate the 1st and 2nd derivatives of the stress field w.r.t. time and append that
information to the inputs. The LSTM block calculates predicted values for the next frame corresponding
to each pixel in the input frames. Each model with derivatives as inputs is called Physics-Informed.
The whole model is unrolled to predict the entire simulation.
   The dataset consists of 61 simulations, each of which has 260 time-steps. We split our dataset into
41 simulations used for training, 10 for validation, and 10 as test cases. We train our models until
saturation, which we reach between 350-400 epochs. To prevent the model from overfitting, we perform
one round of cross-validation at the end of each epoch.
   Our models are designed to allow us to plug in different
Figure 3: Left: a basic SpatioTemporal LSTM (ST-LSTM) cell. Right: The novel Physics-Informed LSTM generative model.
The spatial and temporal memories work in parallel: the red lines denote the deep transition paths of the spatial memory, while
horizontal black arrows indicate the update directions of the temporal memories. The green lines indicate the direction of flow of the
stress field derivatives. Gradient Highway Units (GHU) help overcome the problem of vanishing gradients.
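
The derivative channels carried along the green lines can be illustrated with finite differences. The
sketch below is a hedged stand-in, not the authors' implementation: the paper computes derivatives
with TensorFlow's gradient and Hessian functions, whereas this example uses backward differences in
time over k = 3 frames and a central second difference in space. A unit time step and unit grid
spacing are assumed, and all function names are hypothetical.

```python
def temporal_derivatives(s0, s1, s2, dt=1.0):
    """Backward-difference estimates at the newest frame s2 (frames given oldest first)."""
    rows, cols = len(s2), len(s2[0])
    first = [[(s2[i][j] - s1[i][j]) / dt for j in range(cols)] for i in range(rows)]
    second = [
        [(s2[i][j] - 2.0 * s1[i][j] + s0[i][j]) / dt ** 2 for j in range(cols)]
        for i in range(rows)
    ]
    return first, second


def second_derivative_x(s, dx=1.0):
    """Central second difference along each row; boundary columns are left at 0.0."""
    rows, cols = len(s), len(s[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(1, cols - 1):
            out[i][j] = (s[i][j + 1] - 2.0 * s[i][j] + s[i][j - 1]) / dx ** 2
    return out


# Toy stress field s(t, x) = t^2 + x^2 sampled at t = 0, 1, 2 on a 1 x 4 grid.
frames = [[[float(t * t + x * x) for x in range(4)]] for t in range(3)]
ds_dt, d2s_dt2 = temporal_derivatives(*frames)
d2s_dx2 = second_derivative_x(frames[2])

# Channels that would be appended to the newest stress frame as extra model inputs.
augmented_input = [frames[2], ds_dt, d2s_dt2, d2s_dx2]
```

Three frames are the minimum needed for a second-order temporal difference, which is consistent with
the choice of k = 3 input time-steps.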


ML architectures, choose whether to include partial derivative information, and test various loss
functions. This modular approach makes it easy to train the necessary components as needed. Our
experiments show that we achieve the best performance by using 6 ST-LSTM blocks stacked on top of each
other, each of size 128. We use tanh (Hochreiter and Schmidhuber 1997a) as the activation function for
the LSTM and Leaky-ReLu (Maas, Hannun, and Ng 2013) for the CNN. For our experiments, we test several
combinations of loss functions such as L1 loss, L2 loss, L2 loss weighted by pixel values,
cross-entropy loss, etc. We use the same loss functions for all our models to directly compare
performance. For the best results, we treat the damage fields as a binary classification problem i.e.
deducing whether a given pixel i, j is damaged or not, and use a cross-entropy loss (Goodfellow,
Bengio, and Courville 2016). The cross-entropy loss L_D for the damage field is:

    L_D = -0.5 \sum_{i,j} \sum_{c \in \{1,2\}} y_{D_{ij},c} \log(p_{D_{ij},c}),                     (1)

where y is a binary indicator (0 or 1) if class label c is the correct prediction for damage field
observation D_{ij} and p is the probability of the model predicting class c for damage field
observation D_{ij}.
   For the stress fields, we use L1 and L2 losses and a gradient difference loss (GDL) which sharpens
the image prediction (Mathieu, Couprie, and LeCun 2016). The stress field loss function L_S is given
by:

    L_S = \alpha_3 L_{GDL}
          + \sum_{i,j} \left( \alpha_1 (S_{ij} - \hat{S}_{ij})^2 + \alpha_2 |S_{ij} - \hat{S}_{ij}| \right),

    L_{GDL} = \sum_{i,j} \left( |S_{i,j} - S_{i-1,j}| - |\hat{S}_{i,j} - \hat{S}_{i-1,j}|
              + |S_{i,j-1} - S_{i,j}| - |\hat{S}_{i,j-1} - \hat{S}_{i,j}| \right),                   (2)

where i, j ranges over the pixels, \hat{S}_{ij} are the predicted stress values, S_{ij} are the true
values and \alpha_1, \alpha_2, and \alpha_3 are hyperparameters that weight the relative importance of
each term of the loss function. We use \alpha_1 = 0.3, \alpha_2 = 0.1, and \alpha_3 = 0.1.

Prediction of quantities of interest
The continuum-scale model requires as input several quantities that describe a material behavior under
given conditions. These quantities of interest (QoI) are: (a) number of cracks as a function of time;
(b) distribution of crack lengths as a function of time; and (c) maximum stress over the field as a
function of time. To predict these quantities of interest, we collect stress and damage predictions
from our physics-informed spatiotemporal generative model; then, we calculate the QoI.

Evaluation Metrics
We evaluate model performance with two standard video similarity metrics and by quantifying the
prediction of quantities of interest. MSE: Mean Squared Error compares the squared difference between
prediction and truth, averaged
Table 1: Evaluation of models. Low is good for MSE and QoI. High is good for SSIM. Best score is bold.

Model                                    MSE      SSIM      QoI
Stacked LSTM                             8.14      0.61     0.35
Physics-Informed Stacked LSTM            6.81      0.72     0.31
ConvLSTM                                 3.73      0.86     0.28
Physics-Informed ConvLSTM                2.10      0.87     0.21
ST-LSTM                                  1.55      0.94     0.12
Physics-Informed ST-LSTM                 1.23      0.92     0.09

over all pixels. SSIM: The Structural Similarity Index Metric considers perception-based similarity
between two images (Wang et al. 2004). Note that higher is better for SSIM. QoI: We weigh the
quantities of interest (QoI) defined above equally and measure the mean absolute error, which
indicates how well the continuum model will perform with this model as an emulator.

                           Results

Our Physics-Informed ST-LSTM outperforms other models, particularly on MSE and on predicting the QoI,
as shown in Table 1. ST-LSTM does perform slightly better according to the SSIM metric, but the
difference is small and SSIM measures visual similarity, which is not our main goal. Qualitatively, we
see in Figures 5 and 6 that our Physics-Informed ST-LSTM model can faithfully emulate both the stress
and damage field propagation. The Stacked LSTM model in particular tends to predict overly smooth
stress fields and no change in damage, even with the Physics-Informed model.
   Our model learns an approximation to the physical equations governing the evolution of stress and
damage fields, allowing it to make predictions on previously unseen conditions. The quantities of
interest are then extracted from these predictions. As an example, Figures 4 and 7 show the results
for these quantities of interest for a held-out test simulation. From this example, we can see
generally that our model predicts cracks coalescing with neighbor cracks slightly earlier than when it
actually occurs, causing (a) the total damage to be overestimated, (b) the number of cracks to be
underestimated, and (c) the length of individual cracks to be overestimated during the most dynamic
parts of the simulation. This is likely due to the coarsening of the damage field and is not a major
concern. We predict the entire stress field for all three directions (or channels) of stress and then
extract the maximum value from our prediction to compare against the maximum value in the ground truth
stress field. Figure 7 shows that our model routinely under-estimates the maximum stress value, yet
generally gets the trend and peaks of the time series. This is a typical result from machine learning
prediction, which tends not to predict extreme values. We could improve our prediction of this
quantity by optimizing specifically for the prediction of the maximum stress rather than predicting
the entire stress field, but leave this for future work.

Run-time and speed-up
High-fidelity simulators for material failure are computationally expensive, taking on the order of
1500 CPU-hours to run one simulation of a 2-dimensional material for 260 time-steps, such as in the
dataset we use (Rougier et al. 2014). Physics-Informed ST-LSTM accelerates the entire workflow by
generating approximate QoI in a fraction of the time. We train each model to saturation in 10-12 hours
on four GeForce GTX 1080 Ti 2.20 GHz GPUs, after which emulation of the physical behavior is on the
order of milliseconds, rather than minutes, per timestep. This is a speedup on the order of 50,000
times. Furthermore, once trained, the model can generate QoI for any number of simulations drawn from
the same initial conditions.

Discussion
The complexity of our model architecture and loss functions is necessary for accurately emulating a
complex spatiotemporal process over a long time horizon. The LSTM learns the monotonically increasing
nature of the damage field without any constraints being imposed. This physically-plausible learned
model is an important result that favored the use of LSTMs that can capture time-dependent evolution
better than conventional neural network architectures. Explicitly calculating the partial derivatives
and including them as input improves prediction. This is because the model now fits to a PDE which is
a closer approximation to the original physical problem. The dual memory representation of spatial and
temporal information in our ST-LSTM cell improves performance of our model on this problem
significantly. The failure of Stacked-LSTM (Hermans and Schrauwen 2013) is also evidence of this.
Furthermore, both local and global spatio-temporal information is important to reduce compounding
errors to make predictions at any given time.
   We see that the maximum stress is consistently under-predicted, even after weighting the losses by
actual stress values. We believe this is due to the inherent nature of ML finding an average
representation from training data and the inherently difficult inference problem of estimating a
maximum statistic. However, an important point to take note of is that our model is able to follow the
peaks and trends of the maximum stress quite accurately. Future work in uncertainty quantification
could learn the correction in our maximum stress estimate.
   The damage model tends to predict crack coalescence early. We coarsen the simulation data before
giving it as input to our model, which proportionately reduces the non-damaged regions between cracks.
Due to this, our model tends to predict crack coalescence a few steps earlier than ground truth.
However, the model is able to converge to the correct number of cracks towards the end of the
simulations (see Figure 4). In future work, learning a coarse representation, such as with a
convolutional autoencoder (Masci et al. 2011), could learn to correct this bias.

                          Conclusion

Emulation of complex physical systems has long been a goal of artificial intelligence because although
we can write down
the micro-scale physics equations of such a system, it is computationally intractable to
simulate the physics model to obtain meaningful predictions on a large scale; yet the
macro-scale patterns of these dynamic systems can be quite intuitive to humans (Lerer,
Gross, and Fergus 2016). We present Physics-Informed ST-LSTM, an extension and application
of Spatiotemporal LSTM (ST-LSTM) neural network models to emulate the time dynamics of a
physical simulation of stress and damage in a material. Unlike PDE emulators that assume a
PDE form, our entirely data-driven framework can be used equally well on high-dimensional
experimental studies where binary variables can arise. We demonstrate that ST-LSTMs
outperform two other machine learning models at predicting these time dynamics and
physical quantities of interest, and furthermore that all three models increase in
performance when they are physics-informed, that is, when they have access to the
underlying physics of the simulation. Physics information comes both in the form of
spatiotemporal derivatives and in a loss function which takes into account the QoI. We
furthermore demonstrate that a reduced-order model can gainfully capture the time dynamics
of these physical QoI without needing pixel-perfect accuracy, an important step towards
using machine learning to massively accelerate prediction of complex physics.

Figure 4: Example predicted damage quantities of interest for one held-out simulation.
Panels: (a) Total proportion of damaged elements, (b) Number of cracks, (c) Crack length
distribution prediction.

Figure 5: Example predicted stress field of a test frame. Panels: (a) Truth, (b) Stacked,
(c) ConvLSTM, (d) ST-LSTM. Results shown are all from the Physics-Informed version of the
model. The three directions of the stress field are visualized with false-color by mapping
each channel of stress direction (Sxx, Sxy, Syy) to an image color channel (red, green,
blue).

Figure 6: Example predicted damage field of a test frame. Panels: (a) Truth, (b) Stacked,
(c) ConvLSTM, (d) ST-LSTM. Results shown are all from the Physics-Informed version of the
model. White means damaged and black means not damaged.

Figure 7: Maximum stress for a held-out simulation.

                                     Acknowledgments
Research supported by the Laboratory Directed Research and Development program of Los
Alamos National Laboratory (LANL) under project number 20170103DR. AM supported by the
LANL Applied Machine Learning Summer Research Fellowship. CS supported by the LANL Center
for Non-Linear Studies.

                                        References
Abadi, M., et al. 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep Learning. MIT Press.
Guo, X.; Li, W.; and Iorio, F. 2016. Convolutional neural networks for steady flow approximation. In ACM SIGKDD Int. Conference on Knowledge Discovery and Data Mining.
Han, J.; Jentzen, A.; and E, W. 2018. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences.
Hermans, M., and Schrauwen, B. 2013. Training and analysing deep recurrent neural networks. In Neural Information Processing Systems.
Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.-r.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Kingsbury, B.; et al. 2012. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine.
Hochreiter, S., and Schmidhuber, J. 1997a. Long short-term memory. Neural Computation.
Hochreiter, S., and Schmidhuber, J. 1997b. Long short-term memory. Neural Computation 9(8):1735–1780.
Hyman, J. D.; Karra, S.; Makedonska, N.; Gable, C. W.; Painter, S. L.; and Viswanathan, H. S. 2015. dfnworks: A discrete fracture network framework for modeling subsurface flow and transport. Computers and Geosciences.
Hyman, J.; Jiménez-Martínez, J.; Viswanathan, H. S.; Carey, J. W.; Porter, M. L.; Rougier, E.; Karra, S.; Kang, Q.; Frash, L.; Chen, L.; et al. 2016. Understanding hydraulic fracturing: A multi-scale problem. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
Kim, J.; Um, E. S.; Moridis, G. J.; et al. 2014. Fracture propagation, fluid flow, and geomechanics of water-based hydraulic fracturing in shale gas systems and electromagnetic geophysical monitoring of fluid migration. In SPE Hydraulic Fracturing Technology Conference.
Kim, B.; Azevedo, V. C.; Thuerey, N.; Kim, T.; Gross, M.; and Solenthaler, B. 2019. Deep fluids: A generative network for parameterized fluid simulations. In Computer Graphics Forum.
Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
Lanczos, C. 1950. An Iteration method for the solution of the eigenvalue problem of linear differential and integral operators. United States Governm. Press Office.
Lerer, A.; Gross, S.; and Fergus, R. 2016. Learning physical intuition of block towers by example. In International Conference on Machine Learning.
Long, Y.; She, X.; and Mukhopadhyay, S. 2018. Hybridnet: Integrating model-based and data-driven learning to predict evolution of dynamical systems. In Conference on Robot Learning.
Lu, C.; Hirsch, M.; and Scholkopf, B. 2017. Flexible spatio-temporal networks for video prediction. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Maas, A. L.; Hannun, A. Y.; and Ng, A. Y. 2013. Rectifier nonlinearities improve neural network acoustic models. In Proceedings of International Conference on Machine Learning.
Masci, J.; Meier, U.; Cireşan, D.; and Schmidhuber, J. 2011. Stacked convolutional auto-encoders for hierarchical feature extraction. In Artificial Neural Networks and Machine Learning.
Mathieu, M.; Couprie, C.; and LeCun, Y. 2016. Deep multi-scale video prediction beyond mean square error. In International Conference on Learning Representations.
Pearlmutter, B. A. 1989. Learning state space trajectories in recurrent neural networks. Neural Computation.
Rougier, E.; Knight, E. E.; Broome, S. T.; Sussman, A. J.; and Munjiza, A. 2014. Validation of a three-dimensional finite-discrete element method using experimental results of the split hopkinson pressure bar test. International Journal of Rock Mechanics and Mining Sciences.
Vaughn, N.; Kononov, A.; Moore, B.; Rougier, E.; Viswanathan, H.; and Hunter, A. 2019. Statistically informed upscaling of damage evolution in brittle materials. Theoretical and Applied Fracture Mechanics.
Wang, Z.; Bovik, A. C.; Sheikh, H. R.; and Simoncelli, E. P. 2004. Image quality assessment: from error visibility to structural similarity. In IEEE Transactions on Image Processing.
Wang, Y.; Gao, Z.; Long, M.; Wang, J.; and Yu, P. S. 2018. Predrnn++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. In International Conference on Machine Learning.
White, C.; Ushizima, D.; and Farhat, C. 2019. Neural networks predict fluid dynamics solutions from tiny datasets. arXiv preprint arXiv:1902.00091.
White, P. 2006. Review of methods and approaches for the structural risk assessment of aircraft. Technical report, Australian Government Department of Defence, Defence Science and Technology Organisation, DSTO-TR-1916.
Wiewel, S.; Becher, M.; and Thuerey, N. 2019. Latent space physics: Towards learning the temporal evolution of fluid flow. In Computer Graphics Forum.
Williams, R. J., and Zipser, D. 1995. Gradient-based learning algorithms for recurrent networks and their computational complexity.

</pre>