<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Graph Networks with Physics-aware Knowledge Informed in Latent Space</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sungyong</forename><surname>Seo</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yan</forename><surname>Liu</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Southern California</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Graph Networks with Physics-aware Knowledge Informed in Latent Space</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">58276DDFC1FCB28FD07B49895B71D137</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T20:21+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>While physics conveys knowledge of nature built from an interplay between observations and theory, it has been considered less important for modeling deep neural networks. Despite the usefulness of physical rules, it is particularly challenging to leverage the knowledge for sparse data since most physics equations are well defined on the continuous and dense space. In addition, it is even harder to inform the equations into a model if the observations are not fully governed by the given physical knowledge. In this work, we present a novel architecture to incorporate physics or domain knowledge given as a form of partial differential equations (PDEs) on sparse observations by utilizing graph structure. Moreover, we leverage the representation power of deep learning by informing the knowledge in latent space. We demonstrate that climate prediction tasks are significantly improved and validate the effectiveness and importance of the proposed model.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Introduction</head><p>Modeling natural phenomena in the real-world, such as climate, traffic, molecule, and so on, is extremely challenging but important. Deep learning has achieved significant successes in prediction performance by learning latent representations from data-rich applications such as speech recognition <ref type="bibr">(Hinton et al. 2012</ref>), text understanding <ref type="bibr" target="#b19">(Wu et al. 2016)</ref>, and image recognition <ref type="bibr" target="#b11">(Krizhevsky, Sutskever, and Hinton 2012)</ref>. While the accuracy and efficiency of datadriven deep learning models can be improved with ad-hoc architectural changes for specific tasks, we are confronted with many challenging learning scenarios in modeling natural phenomenon, where a limited number of labeled examples are available, there is much noise in the data, and there could be constant changes in data distributions (e.g. dynamic systems). Furthermore, in many domains, data are only available on scattered collections of points (sensors or point clouds, see Figure <ref type="figure" target="#fig_0">1</ref>) where the majority of existing methods are not applicable. These challenges are not easily addressed under the purely data-driven learning models and therefore, there is a pressing need to develop new generation robust learning models that can address these challenging learning scenarios. Physics is one of the fundamental pillars describing how the real-world behaves. It is imperative that physicsinformed learning models are powerful solutions to modeling natural phenomena. Incorporating domain knowledge has several benefits: first, it helps an optimized solution to be more stable and to prevent overfitting; second, it provides theoretical guidance with which an optimized model is supposed to follow and thus, helps training with fewer data; lastly, since a model is driven by the desired inductive bias, it would be more robust to unseen data, and thus it is easier to enable accurate extrapolation.</p><p>In the meanwhile, there exist a series of challenges when we incorporate physics principles into machine learning models. First, a model needs to properly handle the spatial and temporal constraints. Many physics equations demonstrate how a set of physical quantities behaves on space and time. For example, the wave equation describes how a signal is propagated through a medium over time. Second, the model should capture relations between objects, such as image patches <ref type="bibr" target="#b18">(Santoro et al. 2017)</ref> or rigid bodies <ref type="bibr" target="#b0">(Battaglia et al. 2016;</ref><ref type="bibr">Chang et al. 2017)</ref>. Third, the learning modules should be shared over all objects because physical laws are commonly applicable to all objects. Finally, the model should be flexible to extract unknown patterns instead of be-ing strictly constrained to the physics knowledge. Since it is not always possible to describe all rules governing realworld data, data-driven learning is required to fill the gap between the known physics and real observations.</p><p>In this paper, we address the problem of modeling dynamical systems based on graph neural networks by incorporating useful knowledge described as differentiable physics equations. 
We propose a generic architecture, physics-aware graph networks (PaGN), which can leverage explicitly required physics and learn implicit patterns from data, as illustrated in Figure <ref type="figure" target="#fig_0">1</ref>. The proposed model properly handles spatially distributed objects and their relations as vertices and edges in a graph. Moreover, temporal dependencies are learned by recurrent computations. As <ref type="bibr" target="#b1">Battaglia et al. (2018)</ref> suggest, the inductive bias of a graph-based model is its invariance to node/edge permutations, and thus all trainable functions for the same input types are shared.</p><p>The contributions of this work are summarized as follows: • We develop a novel physics-aware learning architecture, PaGN, which incorporates differentiable physics equations within a graph network framework. • We explore the performance of PaGN on graph signal prediction tasks to demonstrate that physics knowledge provides a significant improvement in prediction and makes the model more robust. • We investigate the effectiveness and importance of PaGN on climate prediction to show how physics knowledge can benefit prediction performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Related Work</head><p>Incorporating physics Among many attempts incorporating physical knowledge into data-driven models, Cressie and Wikle (2015) covered a number of statistical models (e.g., a hierarchical Bayesian framework) handling physical equations. Raissi, Perdikaris, and Karniadakis (2017a) introduced a concept of physics-informed neural networks, which utilize physics equations explicitly to train neural networks. By optimizing the model at initial/boundary and sampled collocation points, the data-driven solutions of nonlinear PDEs can be found. Based on this fundamental idea, a number of works for simulating and discovering PDEs have been published <ref type="bibr" target="#b14">(Raissi and Karniadakis 2018;</ref><ref type="bibr" target="#b13">Raissi 2018;</ref><ref type="bibr" target="#b15">Raissi, Perdikaris, and Karniadakis 2017b)</ref>. Although these works leveraged physical knowledge, they are limited because they require all physics behind given data to be explicitly known. de Bezenac, Pajot, and Gallinari (2018) considered a similar problem as ours. They proposed how transport physics (advection and diffusion) could be incorporated for forecasting sea surface temperature (SST). In other words, they proposed how the motion flow that is helpful for the temperature flow prediction could be extracted in an unsupervised manner from a sequence of SST images. This work is a major milestone since it captures not only the dominant transport physics but also unknown patterns inferred through the neural networks. Despite of its novel architecture, the model is specifically designed for transport physics and it is not straightforward to extend the model to other physics equations. Furthermore, it is restricted in a regular grid to use conventional convolutional neural networks (CNNs) for images.</p><p>Discovering physical dynamics A class of models <ref type="bibr" target="#b8">(Grzeszczuk, Terzopoulos, and Hinton 1998;</ref><ref type="bibr" target="#b0">Battaglia et al. 2016;</ref><ref type="bibr">Chang et al. 2017;</ref><ref type="bibr" target="#b19">Watters et al. 2017;</ref><ref type="bibr" target="#b17">Sanchez-Gonzalez et al. 2018;</ref><ref type="bibr" target="#b10">Kipf et al. 2018</ref>) have been proposed based on the assumption that neural networks can learn complex physical interactions and simulate unseen dynamics based on a current state. The models along this direction are based on common relational inductive biases <ref type="bibr" target="#b18">(Santoro et al. 2017;</ref><ref type="bibr" target="#b1">Battaglia et al. 2018)</ref>, i.e., functions connecting entities and relations are shared and can be learned from a given sequence of simulated dynamics. <ref type="bibr">(Chang et al. 2017;</ref><ref type="bibr" target="#b0">Battaglia et al. 2016;</ref><ref type="bibr" target="#b17">Sanchez-Gonzalez et al. 2018)</ref> commonly assumed that the objects' behaviors were governed by classical kinetic physics equations. Then, object-and relation-centric functions were proposed to learn the transition from the current state to the next state without explicitly injecting the equations into the model. Discovering latent physics by data-driven learning has been actively studied <ref type="bibr" target="#b12">(Long et al. 2018;</ref><ref type="bibr" target="#b3">Brunton, Proctor, and Kutz 2016)</ref>. 
While the properly constrained filters enable us to identify the governing PDEs, this is only applicable when we are aware of the form of the target PDEs. Unlike this line of work, which extracts latent patterns from data only, our proposed model can incorporate known physics and at the same time extract latent patterns that cannot be captured by the existing knowledge.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Background</head><p>In this section, we introduce how differential operators in Euclidean domain are analogously defined on the discrete graph domain and briefly show that the graph networks module is able to efficiently express the differential operators.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Calculus on Graphs</head><p>Preliminary Given a graph G = (V, E) where V and E are a set of vertices V = {1, . . . , n} and edges E ⊆ V 2 , respectively, two types of real functions can be defined on the vertices, f : V → R, and edges, F : E → R, of the graph. It is also possible to define multiple functions on the vertices or edges as multiple feature maps of a pixel in CNNs. Since f and F can be viewed as scalar and vector fields in differential geometry (Figure <ref type="figure">2</ref>), the corresponding discrete operators on graphs can be defined as follow <ref type="bibr" target="#b2">(Bronstein et al. 2017)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Gradient on graphs</head><p>The gradient on a graph is the linear operator defined by</p><formula xml:id="formula_0">∇ : L 2 (V) → L 2 (E) (∇f ) ij = (f j − f i ) if {i, j} ∈ E and 0 otherwise.</formula><p>where L 2 (V) and L 2 (E) denote Hilbert spaces of vertex and edge functions, respectively, thus f ∈ L 2 (V) and F ∈ L 2 (E). As the gradient in Euclidean space measures the rate Divergence on graphs The divergence in Euclidean space maps vector fields to scalar fields. Similarly, the divergence on a graph is the linear operator defined by div :</p><formula xml:id="formula_1">L 2 (E) → L 2 (V) (div F ) i = j:(i,j)∈E w ij F ij ∀i ∈ V</formula><p>where w ij is a weight on the edge (i, j). It denotes a weighted sum of incident edge functions to a vertex i, which is interpreted as the netflow at a vertex i.</p><p>Laplacian on graphs Laplacian (∆ = ∇ 2 ) in Euclidean space measures the difference between the values of the scalar field with its average on infinitesimal balls. Similarly, the graph Laplacian is defined as</p><formula xml:id="formula_2">∆ : L 2 (V) → L 2 (V) (∆f ) i = j:(i,j)∈E w ij (f i − f j ) ∀i ∈ V</formula><p>The graph Laplacian can be represented as a matrix form, L = D − W where D = diag( j:j =i w ij ) is a degree matrix and W denotes a weighted adjacency matrix. Note that L = ∆ = −div∇ and the minus sign is required to make L positive semi-definite.</p><p>Based on the core differential operators on a graph, we can re-write differentiable physics equations (e.g., Diffusion equation or Wave equation) on a graph. Given a set of nodes (v), edges (e), and global (u) attributes, the steps of computation in a graph networks block are as follow:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Graph Networks</head><formula xml:id="formula_3">1. e ij ← φ e (e ij , v i , v j , u) for all {i, j} ∈ E pairs. 2. v i ← φ v (v i , ē i , u) for all i ∈ V.</formula><p>ē i is an aggregated edge attribute related to the node i.</p><formula xml:id="formula_4">3. u ← φ u (u, ē , v )</formula><p>ē and v are aggregated attributes of all edges and all nodes in a graph, respectively. where φ e , φ v , φ u are edge, node, and global update functions, respectively, and they can be implemented by learnable neural networks. Note that the computation order is flexible. The aggregators can be chosen freely once it is invariant to permutations of their inputs.</p><note type="other">Mapping Equation Physics example node</note><formula xml:id="formula_5">→ edge eij = φ e (vi, vj) = (∇v)ij ∇φ = −E (Electric field) edge → node vi = φ v (eij) = (div e)i ∇ • E = ρ/ 0 (Maxwell's eqn.) node → node vi = φ v (vi, {v j:(i,j)∈E }) = (∆v)i ∆φ = 0 (Laplace's eqn.)</formula><p>As φ e is a mapping function from vertices to edges, it can be replaced by the graph gradient operator to describe the known relation explicitly. Similarly, φ v can learn divergence-like mapping (edge to node) functions. For curlinvolved functions, it is required to add another updating function, φ c , which is mapping from nodes/edges/global attributes to a 3-clique attribute and vice versa. In other words, the graph networks have highly flexible modules which are able to imitate the differential operators in a graph explicitly or implicitly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Physics-aware Graph Networks</head><p>As deep learning models are successful to model complex behaviors or extract abstract features in data, it is natural to focus on how the data-driven modeling can solve practical problems in physics or engineering fields. In this section, we provide how domain knowledge described in physics can be incorporated with the graph networks framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Static Physics</head><p>Many fields in physics dealing with static properties, such as Electrostatic, Magnetostatic, or Hydrostatic, describe a number of physics phenomena at rest. Among the various phenomena, it is easy to express differentiable physics rules in discrete forms on a graph with the operators in previous Section . For instances, the Poisson equation (∇ 2 φ = − ρ 0 ) in Electrostatics is realized as a simple matrix multiplication of graph Laplacian with a vertex function. Table <ref type="table" target="#tab_0">1</ref> provides some differential formulas in Electrostatic and how the updating functions are defined in graph networks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dynamic Physics</head><p>More practical equations have been written in the dynamic forms, which describe how a given physical quantity is changing in a given region over time. GN can be regarded as a module that updates a graph state including the attributes of node, edge, and a whole graph.</p><formula xml:id="formula_6">G = GN(G)<label>(1)</label></formula><p>Equation Physics example </p><formula xml:id="formula_7">v i = vi + αφ v (vi, {v j:(i,j)∈E }) = vi + α(∆v)i u = α∆u (Diffusion eqn.) v i = 2v i − vi + c 2 φ v (v i , {v j:(i,j)∈E }) = 2v i − vi + c 2 (∆v )i ü = c 2 ∆u (Wave eqn.)</formula><formula xml:id="formula_8">f ∂u ∂t , • • • , ∂ M u ∂t M , ∂u ∂x , • • • , ∂ N u ∂x N = 0 (2)</formula><p>where u is a physical quantity spatiotemporally varying and x is the direction where u is defined on. M and N denote the highest order of time and spatial derivatives, respectively. Under the state updating view in Equation <ref type="formula" target="#formula_6">1</ref>, any types of PDEs written in Equation 2 can be represented as a form of finite differences. Table <ref type="table" target="#tab_1">2</ref> provides the examples of the dynamic physics. u and ü are the first and second order time derivatives, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Physics in Latent Space</head><p>We provide how the differential operators are implemented in a GN module in a previous section. However, it is hardly practical for modeling complicated real-world problems with the differential operators solely because it is only possible when all physics equations governing the observed phenomena are explicitly known. For example, although we are aware that there are a number of physics equations involved in climate observations, it is almost infeasible to include all required equations for modeling the observations. Thus, it is necessary to utilize the learnable parameters in GN to fill the missing dynamics which is not described by given equations.</p><p>There is another advantage to utilize learnable parameters. There are a number of unknown parameters, which need to be pre-defined to specify the physics equations, and the parameters can be inferred by the learnable parameters. For example, while we have knowledge that input signal has a wave property, the speed of waves (c in Table <ref type="table" target="#tab_1">2</ref>) should be given to fully describe the wave equation. It will be even worse when multiple input signals are involved since each signal is governed by different parameters in the same kind of equation. While both temperature and surface pressure are continuous and diffusive, they should have different diffusion coefficients (α in Table <ref type="table" target="#tab_1">2</ref>) in the same diffusion equation. To address the issue we can transform the input signals to latent space and use one equation in the latent space instead of imposing multiple equations to input signals separately. Then, the parameters in Encoder make the different signals follow the equation differently. We formalize how this idea is implemented as follow.</p><p>Forward/Recurrent computation Figure <ref type="figure">3</ref> provides how the desired physics knowledge is integrated with the graph networks. Given a graph G = {v, e, u}, it is fed into an encoder which transforms a set of attributes of nodes (v), edges (e), and a whole graph (u) into latent spaces. ṽ, ẽ, ũ = Encoder(v, e, u)</p><p>(3)</p><p>After the encoder, the encoded graph H = {ṽ, ẽ, ũ} is repeatedly updated within the core block as many as the required time steps T . For each step, H is updated to H which denotes the next state of the encoded graph.</p><formula xml:id="formula_9">H = GN(H) (4)</formula><p>Finally, the sequentially updated attributes are retransformed to the original spaces by a decoder.</p><p>v , e , u = Decoder(ṽ , ẽ , ũ )</p><p>There are two types of objective function in this architecture, physics knowledge and supervised objective. First, we define physics-informed constraint, which is a form of equations in Table <ref type="table" target="#tab_0">1</ref> and 2 depending on given physics knowledge and even mixed.</p><formula xml:id="formula_11">f s phy (H t ), f d phy (H t , • • • , H t+M ) (6) L phy = t f s phy (H t ) + f d phy (H t , • • • , H t+M ) (7)</formula><p>where f s phy (H t ) and f d phy (H t , • • • , H t+M ) are the static and dynamic physics-informed quantity, respectively. 
For example, we can impose a gradient constraint or the diffusion equation between node/edge latent representations as follows:</p><formula xml:id="formula_12">f^s_{\mathrm{phy}}(H^t) = \|\tilde{e}^t - \nabla \tilde{v}^t\|^2, \qquad f^d_{\mathrm{phy}}(H^t, H^{t+1}) = \|\tilde{v}^{t+1} - \tilde{v}^t - \alpha \nabla^2 \tilde{v}^t\|^2</formula><p>Second, the supervised loss function is defined between the predicted graph, Ĝ′, and the target graph, G′; it is constructed based on the task, e.g., the cross-entropy or the mean squared error (MSE). Finally, the total objective function is the sum of the two terms:</p><formula xml:id="formula_13">L = L_{\mathrm{sup}} + \lambda L_{\mathrm{phy}}<label>(8)</label></formula><p>where λ controls the importance of the physics term.</p></div>
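<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of these constraints on latent states follows, assuming latent node states of shape (n, d) per step and latent edge states stored as an (n, n, d) array that is zero off the edge set; all names and the diffusion coefficient are illustrative assumptions.</p><code>
import numpy as np

def graph_gradient_nd(V, W):
    """Edge-wise differences (V_j - V_i) where W_ij is nonzero; (n, n, d)."""
    G = V[None, :, :] - V[:, None, :]
    return np.where((W > 0)[:, :, None], G, 0.0)

def physics_loss(v_seq, e_seq, W, alpha=0.1):
    """Eq. 7: static gradient constraint plus dynamic diffusion constraint,
    summed over the latent rollout. v_seq[t]: (n, d) latent node states,
    e_seq[t]: (n, n, d) latent edge states (zero off the edge set)."""
    L = np.diag(W.sum(axis=1)) - W
    loss = 0.0
    for t in range(len(v_seq) - 1):
        # f_s(H^t): edge latents should match the graph gradient (Eq. 6).
        loss += np.sum((e_seq[t] - graph_gradient_nd(v_seq[t], W)) ** 2)
        # f_d(H^t, H^{t+1}): one latent diffusion step (Delta = -L here).
        loss += np.sum((v_seq[t + 1] - v_seq[t] + alpha * (L @ v_seq[t])) ** 2)
    return loss

def total_loss(y_hat, y, v_seq, e_seq, W, lam=0.01):
    """Eq. 8: supervised MSE plus the weighted physics term."""
    return np.sum((y_hat - y) ** 2) + lam * physics_loss(v_seq, e_seq, W)
</code></div>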
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiment</head><p>In this section, we evaluate PaGN on a real-world climate dataset on the Southern California region.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Climate Data</head><p>For the evaluation on real-world data, we used the hourly simulated climate observations for 16 days on the Southern California region <ref type="bibr" target="#b20">(Zhang et al. 2018)</ref>. In this dataset, we sampled small regions randomly from two area (Los Angeles and San Diego, Figure <ref type="figure">4</ref>) encompassing urban and rural meteorological features to generate spatially discrete observations. To build a graph, we connected a pair of the sampled regions by using k-nearest neighbors algorithm (k = 3). This data preprocessing is required to verify the proposed</p><formula xml:id="formula_14">𝒢 Encoder ℋ GN ℋ′ 𝒢 $ ′ Decoder x T</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Physics equation</head><p>Supervised Loss</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>𝒢′ ⨀</head><p>Figure <ref type="figure">3</ref>: Recurrent architecture to incorporate physics equation on GN. The blue blocks have learnable parameters and the orange blocks are objective functions. is a concatenation operator and the middle core block can be repeated as many as the required time steps (T ). idea as well as evaluate PaGN on the spatiotemporally sparse setting, which is more common for sensor-based datasets.</p><p>The vertex attributes consist of 10 climate observations, Air temperature, Albedo, Precipitation, Soil moisture, Relative humidity, Specific humidity, Surface pressure, Planetary boundary layer height, and Wind vector (2 directions). While the edge attributes are not given explicitly, we could specify the type of each edge by using the type of connected regions. There are 13 different land-usage types and each type summarizes how the corresponding land is used. Based on the types of connected regions, we assigned different embedding vectors to edges.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>PaGN Architecture</head><p>As explained in Section , PaGN consists of three modules, graph encoder, GN block, and graph decoder (Figure <ref type="figure">3</ref>). The encoder contains two feed forward networks, φ v and φ e , applied to node and edge features, respectively. By passing the encoder, the features are transformed to the latent space (H) where we will impose physics equations.</p><p>In the GN block, the node/edge/graph features are updated by the GN algorithm described in Section . The latent graph states, H and H , indicate the hidden states of the current and next observations. For the physics constraint, we informed the diffusion and wave equation in Table <ref type="table" target="#tab_1">2</ref>, which describe the behavior of the continuous physical quantities. As the most of the climate observations are varying continuously, the diffusion equation, as a part of the continuity equation, is one of the inductive bias that should be considered for modeling. In addition, the wave equation is useful to describe atmospheric phenomena, especially 1 solar day harmonics (e.g., Atmospheric tide). Note that the physics equations are not directly applied to the input observations, but rather to the latent representations. The state-updating process is repeated at least as many as the order of the equations to provide the finite difference equation. For multistep predictions, the recurrent module is repeated as many as the number of the predictions and the physics equation will be also applied multiple times as well. Finally, the decoder takes H as input to return the next predictions. The following objective is the total loss function of PaGN with the diffusion equation.</p><formula xml:id="formula_15">L = T i=1 ŷ i − y i 2 + λ T i=1 ṽ i − ṽi−1 − α∇ 2 ṽi−1 2 (9)</formula><p>where y is a vector of the target observations (i.e. node vectors) and α adjusts the diffusivity of the latent representa-  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experimental Settings</head><p>In our experiments, we used the air temperature as a target observation and other 9 observations were used as input. We first evaluated our model by performing the one-step and multistep prediction tasks on the two different area with a mean square error metric. For both regions, we commonly trained the model with input observations for 10 timesteps (t − 10 : t − 1) and predicted targets from t − 9 to t. First 65% of a total length was used as a training set and remaining series was split into validation (10%) and test sets (25%).</p><p>We explored several baselines: MLP, LSTM, and GNonly ignoring the physics constraint in PaGN. We also compared GN-skip which connects between H and H with the skip-connection <ref type="bibr" target="#b9">(He et al. 2016)</ref> without the physics constraint. spatiotemporally continuous. Among the graph-based models, PaGN(diff) provides the least MSEs. It validates that the diffusive property provides a strong inductive bias with the latent representation learning. Note that the standard deviations from PaGN(diff) are significantly smaller than those of other baselines and it implies that the integrated physics knowledge properly stabilizes optimization process by introducing additional objective.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>One step Prediction</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Multistep Prediction</head><p>To evaluate the effectiveness of the state-wise regularization more carefully, we conducted the multistep prediction task (10 forecast horizon). For the task, the recurrent modules are modified to predict input observations as well and the predicted one is re-fed in the model for future timesteps. While the models having a recurrent module are able to predict a few more steps reasonably, there are a couple of things we should pay attention. First, the results imply that utilizing the neighboring information is important because GN-only model shows similar or better MSEs compared to LSTM for the multistep tasks, even though it has a simple recurrent module that is not as good as that of LSTM. Second, we found that the diffusion equation in PaGN gives the stable state transition and the property provides slowly varying latent states which are desired particularly for the climate forecasting. Note that the skip-connection in GN-skip is also able to restrict the rapid changes of H. However, it is necessary to more carefully optimize the parameters in GN-skip to learn the residual term in H = H + GN(H) properly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Effectiveness of Physics Constraint</head><p>One of the benefits of physics-aware learning is data efficiency. We explore how much the physics constraint is helpful by testing if PaGN can be well-trained when the number of data for the supervised objective is limited for the one-step prediction task. We randomly sampled training data which were used to optimize the total loss function (Equation 9) and the left unsampled data were only used to minimize the physics constraint:</p><formula xml:id="formula_16">L = L i sup + λL i phy , i is a sampled step L = λL i phy ,<label>otherwise</label></formula><p>We found that the diffusion equation can benefit to optimize PaGN even if the target observations are partially available (Figure <ref type="figure" target="#fig_4">5a</ref>). Although the overall performances of PaGN are degraded when less number of sampled data are used, the error are not far deviated from those of GN-only.    </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Importance of Physics Constraint</head><p>To study the importance of the physics term, we trained PaGN with different λ controlling the importance of the physics term. While we found that the physics term is substantially helpful from Table <ref type="table" target="#tab_3">3</ref> and 4, the term is not supposed to be dominant (See Figure <ref type="figure" target="#fig_4">5b</ref>) but tuned properly. This is intuitive since the term only provides partial knowledge (diffusive input signals), which changes loss surface to help parameters more stable to predict next signals, instead of governing the dynamics explicitly. Scaling down the physics term is similar to what <ref type="bibr" target="#b16">Sabour, Frosst, and Hinton (2017)</ref> did for reconstruction error not to dominate margin loss but to help the optimization process. We also present MSEs from PaGN(rand) defined by randomly sampling (α, β) ∈ [−2.5, 2.5] in the constraint ||v + αv + βv − c∆v|| 2 , and PaGN(diff+wave) superposing the two equations. Table <ref type="table" target="#tab_6">5</ref> shows that the random equation significantly degrades the overall prediction quality. Note that the simple superposition of two equations does not always guarantee lower error even if each equation is helpful separately. When the two equations are non-linearly connected in the unknown (fully) governing equation, the superposition cannot provide meaningful inductive bias. The results demonstrate that the physics term is an useful inductive bias when it is properly defined.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusion</head><p>In this work, we introduce a new architecture PaGN based on graph networks to incorporate prior knowledge given as a form of PDEs over time and space. While existing works more focus on how to discover equations in data generated by explicit physics rules, we propose a method to leverage weakly given inductive bias describing data. We empirically analyze the performance of PaGN across a range of prediction experiments on the climate observations.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Concept of the proposed PaGN. Many sensorbased observations are only sparsely available (See circled regions) but there are continuous physical process (e.g., Diffusion) behind the sparse observations. Some of the known physics rules are injected into a model and the remained unknown dynamics will be extracted from data.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>Figure 2: Scalar/vector fields on Euclidean space and vertex/edge functions on a graph. and direction of change in a scalar field, the gradient on a graph computes differences of the values between two adjacent vertices and the differences are defined along the directions of the corresponding edges.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc><ref type="bibr" target="#b1">Battaglia et al. (2018)</ref> proposed a graph networks framework, which generalizes relations among vertices, edges, and a whole graph. Graph Networks (GN) describe how edge, node, and global attributes are updated by propagating information among themselves.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: In (a) MSEs of PaGN are almost as good as GNonly (gray lines) despite the less number of training data. (b) provides how the prediction performance is dependent on the physics term.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Examples of static equations in Graph networks</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Examples of dynamic equations in Graph networkswhere G is the updated graph state. Dynamic physics formulas are written as a function of time and spatial derivatives:</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>One step prediction error (MSE)tions, which is found through cross validation. Note that the equation term can be replaced by other equations properly.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4 :</head><label>4</label><figDesc>Table3shows the prediction error of the baselines and PaGN on different areas. MLP and LSTM are shared over all stations and their performaces are outperformed by other models leveraging a given graph structure. It implies that knowing neighboring information is significantly helpful to infer its own state and it is intuitive since climate behaviors are Multistep prediction error (MSE)</figDesc><table><row><cell>Model</cell><cell>LA area</cell><cell>SD area</cell></row><row><cell>LSTM</cell><cell cols="2">1.9022±0.2078 1.2489±0.2295</cell></row><row><cell>GN-only</cell><cell cols="2">1.6137±0.1128 1.5532±0.2023</cell></row><row><cell>GN-skip</cell><cell cols="2">1.5429±0.0932 1.4423±0.1622</cell></row><row><cell cols="3">PaGN(diff) 1.4656±0.0474 1.0999±0.0435</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head></head><label></label><figDesc>Even the GN-only model is outperformed by PaGN when only 70% training data are used with the state-wise constraint.</figDesc><table><row><cell></cell><cell>1.2</cell><cell></cell><cell></cell><cell>LA area SD area</cell></row><row><cell></cell><cell>1.1</cell><cell></cell><cell></cell></row><row><cell></cell><cell>1.0</cell><cell></cell><cell></cell></row><row><cell>MSE</cell><cell>0.9</cell><cell></cell><cell></cell></row><row><cell></cell><cell>0.8</cell><cell></cell><cell></cell></row><row><cell></cell><cell>0.7</cell><cell></cell><cell></cell></row><row><cell></cell><cell>0.6</cell><cell></cell><cell></cell></row><row><cell></cell><cell>0.5</cell><cell></cell><cell></cell></row><row><cell></cell><cell>0%</cell><cell>25%</cell><cell>50%</cell><cell>75%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 5 :</head><label>5</label><figDesc>One step prediction MSE with different constraints.</figDesc><table /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Interaction networks for learning about objects, relations and physics</title>
		<author>
			<persName><forename type="first">P</forename><surname>Battaglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pascanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Rezende</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="4502" to="4510" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Relational inductive biases, deep learning, and graph networks</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">W</forename><surname>Battaglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Hamrick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bapst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sanchez-Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Zambaldi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Malinowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tacchetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Raposo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Santoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Faulkner</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1806.01261</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Geometric deep learning: going beyond euclidean data</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Bronstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bruna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Szlam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vandergheynst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Signal Processing Magazine</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="18" to="42" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Discovering governing equations from data by sparse identification of nonlinear dynamical systems</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Brunton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Proctor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Kutz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the National Academy of Sciences</title>
		<imprint>
			<biblScope unit="volume">113</biblScope>
			<biblScope unit="issue">15</biblScope>
			<biblScope unit="page" from="3932" to="3937" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">B</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ullman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Torralba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Tenenbaum</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A Compositional Object-Based Approach to Learning Physical Dynamics</title>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Statistics for spatiotemporal data</title>
		<author>
			<persName><forename type="first">N</forename><surname>Cressie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">K</forename><surname>Wikle</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>John Wiley &amp; Sons</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge</title>
		<author>
			<persName><forename type="first">E</forename><surname>De Bezenac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pajot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gallinari</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Neuroanimator: Fast neural network emulation and control of physics-based models</title>
		<author>
			<persName><forename type="first">R</forename><surname>Grzeszczuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Terzopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th annual conference on Computer graphics and interactive techniques</title>
				<meeting>the 25th annual conference on Computer graphics and interactive techniques</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="9" to="20" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">;</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE conference on computer vision and pattern recognition</title>
				<meeting>the IEEE conference on computer vision and pattern recognition</meeting>
		<imprint>
			<date type="published" when="2012">2016. 2012</date>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="82" to="97" />
		</imprint>
	</monogr>
	<note>Deep residual learning for image recognition</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Neural Relational Inference for Interacting Systems</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kipf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fetaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Welling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Imagenet classification with deep convolutional neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1097" to="1105" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">PDE-Net: Learning PDEs from Data</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Dong</surname></persName>
		</author>
		<ptr target="http://proceedings.mlr.press/v80/long18a.html" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning</title>
				<meeting>the 35th International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Raissi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1801.06637</idno>
		<title level="m">Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Hidden physics models: Machine learning of nonlinear partial differential equations</title>
		<author>
			<persName><forename type="first">M</forename><surname>Raissi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Karniadakis</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1711.10561</idno>
	</analytic>
	<monogr>
		<title level="m">Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Raissi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Perdikaris</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Karniadakis</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2017">2018. 2017a</date>
			<biblScope unit="volume">357</biblScope>
			<biblScope unit="page" from="125" to="141" />
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Raissi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perdikaris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Karniadakis</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1711.10566</idno>
		<title level="m">Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations</title>
				<imprint>
			<date type="published" when="2017">2017b</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Dynamic routing between capsules</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sabour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Frosst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="3856" to="3866" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Graph Networks as Learnable Physics Engines for Inference and Control</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sanchez-Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Heess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Springenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Merel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Riedmiller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hadsell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Battaglia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 35th International Conference on Machine Learning</title>
				<meeting>the 35th International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A simple neural network module for relational reasoning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Santoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Raposo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">G</forename><surname>Barrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Malinowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pascanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Battaglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lillicrap</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="4967" to="4976" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Google&apos;s neural machine translation system: Bridging the gap between human and machine translation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Watters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tacchetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Weber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pascanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Battaglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zoran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Nips ; Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schuster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Norouzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Macherey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Krikun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Macherey</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1609.08144</idno>
		<imprint>
			<date type="published" when="2016">2017. 2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
	<note>Visual interaction networks</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Systematic Comparison of the Influence of Cool Wall versus Cool Roof Adoption on Urban Climate in the Los Angeles Basin</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohegh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Levinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ban-Weiss</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Environmental science &amp; technology</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="issue">19</biblScope>
			<biblScope unit="page" from="11188" to="11197" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
