Learning Dynamical Models Using Motifs

Gregory Provan (g.provan@cs.ucc.ie)

Department of Computer Science, University College Cork, Ireland


Automatically creating dynamical system models, M, from data is an active research area for a range of real-world applications, such as systems biology and engineering. However, the overall inference complexity increases exponentially in the number of variables in M. We address this exponential growth by using canonical representations of system motifs (building blocks) to constrain the model search during automated model generation. The motifs provide a good prior set of building blocks from which we can generate system-level models, and the canonical representation provides a theoretically sound framework for modifying the equations to improve the initial models. We present an automated method for learning dynamical models from motifs, such that the models optimize a domain-specific performance metric. We demonstrate our approach on hydraulic systems models.

Introduction

Generating dynamical systems models remains a challenge for most real-world applications. For example, in systems biology, scientists aim to define genetic networks, using data acquired from high-throughput microarray analysis to identify the dynamical behaviour of gene clusters [2]. In automotive applications, engineers aim to develop control algorithms to guarantee driving performance and safety [4].

In these applications, researchers must develop both (1) the underlying structure of the dynamical model (e.g., the equation form representing the structure of the gene cluster) and (2) the model's parameters. Voit [16], among others, notes that structure identification is more difficult than parameter estimation.

Estimating dynamical systems models from data is computationally intensive. As the number of system states increases, the identification of highly coupled non-linear systems becomes increasingly challenging. In computational terms, the overall inference complexity for estimation increases exponentially in terms of the number of variables and parameters in a system [20].

Because computational blowups typically restrict the size of the learned dynamical system, most approaches attempt to restrict the search space. For example, in systems biology researchers use canonical versions of ordinary differential equations (ODEs) to transform learning arbitrary ODEs into learning parameters that characterize the ODEs [16]; in vision, some researchers have used human prior knowledge to learn partial differential equations (PDEs) describing the evolution of visual saliency [9].

Model libraries are widely used in systems design and engineering, and are the standard method for hand-generating large, complex models for several applications, ranging from HVAC [19], fluid-flow [1], power [6] to biological [11,15] systems. Motifs [12,18], the building blocks of libraries for complex systems, are network substructures (i.e., frequently-occurring and unique sub-structures) from which large systems are built. Motifs can reduce design costs and speed up product development. However, a critical drawback is that the motif libraries are typically developed for simulation and design purposes, making them significantly less useful for other tasks such as control or diagnostics. Some tasks, e.g., control, require trading off model fidelity for computational efficiency; other tasks require additional information, e.g., diagnostics requires simulation behaviours for both faulty and nominal conditions.

Because design-focused motif libraries are not optimized for non-design tasks, companies typically create application-specific system-level models that do not leverage existing motif libraries. To avoid this duplication of effort, we propose an approach for learning task-specific system-level models using motif libraries: given data D concerning the selected task T , we search over the space of motif configurations and parameters to generate a system model that optimises a metric µ over D. Our work differs from prior work that learns dynamical equations purely from data, e.g., [13,17], in that we generate models from a pre-defined motif library, measuring the models' performance using statistical model-comparison techniques [10].

We address the exponential growth in inference complexity by constraining the model search during automated model generation, using canonical representations of pre-defined motifs. The canonical representation significantly constrains the structure and parameter space, and the motif libraries provide a good prior set of building blocks from which to generate system-level models.

In this article we show how to reduce the enormous search space of model and parameter configurations by using motif-based learning. We learn task-specific motifs by (1) imposing physical constraints over the models, and (2) using a canonical representation for non-linear dynamical equations [14] that allows us to define a clear notion of relative model fidelity, which we use for guiding search.

Our contributions are as follows.

- We develop a task-specific method for generating system models that can trade off model fidelity for model size, so that we can tailor models for tasks requiring smaller (more efficient) models.
- We show how motifs can improve, by at least one order of magnitude, the computational efficiency of learning task-specific systems that are modelled using non-linear dynamical equations.
- We illustrate our approach on a non-linear tank benchmark.

Notation

Model Representation

We represent a system as an inter-connected set of sub-systems (or components), using notions standard within the computing and engineering literature, e.g., [3]. We assume that for any system, we have the system topology, which specifies the component-type connections. We further assume that we have a library of motifs (or components), where we specify each motif using multiple levels of fidelity.

We characterize a system/motif using a graph G(V, E) of vertices and edges, and a set E_G of equations. The graph specifies the relationships among the variables in E_G and the equations specify the transformations occurring in the system/motif. For example, a bio-system might define converting 2 units of species A into 1 unit of B:

\[ 2A \rightarrow 1B. \]

We can represent this using a graph consisting of vertex A connected to vertex B, with a differential equation denoting the rate of conversion of A to B using parameters θ.

As shown in Figure 2(b), a motif consists of sets of input and output variables, v i and v o respectively, a set θ of parameters, and a set E of equations over the variables v and parameters θ.

A sub-system/component can consist of a hierarchical composition of sub-components. A base-level component includes no embedded sub-components.

In most applications there is uncertainty concerning the structure and equational form of a motif, i.e., the level of fidelity or detail of the structure/equations. We capture that uncertainty by specifying multiple versions of any motif, denoted λ = {λ_1, ..., λ_k}. We can also define a distribution over λ to denote the relative likelihood of each version being correct.

In this article, we adopt the Modelica component-based representation [3]. A motif corresponds to a Modelica component model, which uses the following three entities to define a system:

1. Component library;
2. System topology (represented as a graph G);
3. A connection mechanism (with clear semantics).

Component library: The component library consists of a collection of components of different types, e.g., resistor, capacitor, inductor, etc.

System topology: We assume that a system has an associated topology G(C, E), where each node χ ∈ C in G defines a component χ and each edge e = (χ i , χ j ) ∈ E defines a connection between χ i and χ j .

Connection mechanism: Components are connected via the connection mechanism, according to the system topology G. In Modelica, connectors define the variables for the component communication interface, i.e., connectors are instances of connector classes. Connectors specify external interfaces for interaction between components.

The component framework specifies a set of constraints over components and connections. For example, we constrain a connection to be possible only between connectors of equivalent type. Without loss of generality, we define input (positive) and output (negative) connectors.
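The component library, topology and connection constraint described above can be sketched as a small data structure. This is a minimal illustration only, not Modelica itself: the class names `Connector`, `Motif` and `Topology`, the connector labels, and the equation string are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connector:
    name: str       # e.g. "inlet" (hypothetical name)
    kind: str       # connector type, e.g. "fluid" (hypothetical label)
    direction: str  # "in" (positive) or "out" (negative)


@dataclass
class Motif:
    name: str
    connectors: tuple                               # external interface
    parameters: dict = field(default_factory=dict)  # theta
    equations: tuple = ()                           # kept as plain strings here

    def connector(self, name):
        for c in self.connectors:
            if c.name == name:
                return c
        raise KeyError(f"{self.name} has no connector {name!r}")


@dataclass
class Topology:
    components: dict = field(default_factory=dict)  # name -> Motif
    edges: list = field(default_factory=list)       # (source, target) pairs

    def add(self, motif):
        self.components[motif.name] = motif

    def connect(self, src, out_name, dst, in_name):
        out = self.components[src].connector(out_name)
        inp = self.components[dst].connector(in_name)
        if out.direction != "out" or inp.direction != "in":
            raise ValueError("connect an output connector to an input connector")
        # Constraint from the text: only connectors of equivalent type may be joined.
        if out.kind != inp.kind:
            raise ValueError(f"type mismatch: {out.kind} vs {inp.kind}")
        self.edges.append((src, dst))


def fluid_ports():
    return (Connector("inlet", "fluid", "in"), Connector("outlet", "fluid", "out"))


system = Topology()
system.add(Motif("tank", fluid_ports(), {"A": 1.0}, ("A*dh/dt = q_in - q_out",)))
system.add(Motif("valve", fluid_ports()))
system.connect("tank", "outlet", "valve", "inlet")
print(system.edges)
```

Rejecting ill-typed connections at construction time means that any topology that can be built already satisfies the connector-type constraint.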

Dynamical Systems

We consider dynamical systems that can be described by a set of (noise-free) ODEs:

\[ \dot{x}(t) = \psi(x(t), u(t), \theta), \quad x(t_0) = x_0, \quad y(t) = \phi(x(t), \theta), \tag{1} \]

where $x(t) \in \mathbb{R}^n$ is the state vector, $u(t) \in \mathbb{R}^q$ is the input vector, $y(t) \in \mathbb{R}^m$ is the observation vector, and $\theta \in \mathbb{R}^p$ is a vector of parameters associated with the state transition and control dynamics, and the observation function. $\varphi_{x_0}(\theta, u)$ denotes the input-output mapping of the system (1) started at the initial state $x_0$ with parameter set $\theta$. We use v = {x, y, u} to denote the set of system variables.

Thermo-Hydraulic Example: Tank System

Throughout this article we will use as a running example a thermo-hydraulic system: a system of interconnected tanks. In this domain, the motifs consist of tanks, valves, pipes, pumps, flow sources/sinks, etc. Systems comprising these motifs are used to model several domains, including hydrology (tanks and pipes correspond to lakes and rivers, respectively), chemical process control, and cardiovascular systems.

In this article we consider three components: tanks, valves, and pipes. Figure 1(a) shows the three components. For each component we define a block diagram with a single input and output, denoting inflow and outflow respectively.

Fig. 1. System components: tank, pipe and valve, and their block diagrams.

We now describe in more detail the model of a single tank, which is an example of a basic component model. Figure 2(a) shows the tank component. We model a tank component using a set of equations over variables denoting inputs (fluid pressure $p_{in}$ and fluid flow $q_{in}$) and outputs ($p_{out}$, $q_{out}$). The tank has cross-sectional area A and outlet cross-section a. If the height h of fluid in the tank changes at a rate $\dot{h}$, then we have $A\dot{h} = q_{in} - q_{out}$. We assume that we measure only pressure, using p = gh, where g is a gravity parameter. We could also define equations using parameters like viscosity σ. Taken together, the tank thus comprises a motif/component with variables, parameters and equations as shown in Figure 2(c).

We define a tank sub-system to consist of three connected motifs: a tank, a valve, and a pipe, as shown in Figure 1(b). We can control the flow through this system, and the height of fluid in the tank, by controlling the inflow to the tank and the valve setting. We can extend our example by defining the three-tank system shown in Fig. 3, which consists of three connected tank sub-systems. Tank $T_i$ has area $A_i$ and inflow $q_{i-1}$, for i = 1, 2, 3. We create this system by connecting together three tank components, with a valve $V_i$ regulating the flow $q_i$ out of tank i, for i = 1, 2, 3. Tank $T_1$ gets filled from a pipe, with measured flow $q_0$. Hence, our control input is u = {$q_0$}.

Fig. 2. Model for tank component.

We assume that we don't directly measure any flows other than the inlet flow q 0 . Therefore, we use the tank heights as a proxy for deriving flows through the multi-tank system.

Torricelli's Law defines the flow q i out of tank i, with liquid level h i , into tank j, as:

\[ q_i = \gamma\,\mathrm{sign}(h_i - h_j)\sqrt{2g\,|h_i - h_j|}, \tag{2} \]

where the coefficient γ models the area of the drainage hole and its friction factor through the hole.
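As a minimal sketch, equation (2) can be written as a short function. The name `torricelli_flow` and the default value of g are illustrative assumptions. Taking the absolute value of the level difference under the square root is also an assumption, made so that the expression stays real when tank j is fuller than tank i.

```python
import math


def torricelli_flow(h_i, h_j, gamma, g=9.81):
    """Flow from tank i to tank j; negative when the flow is reversed."""
    diff = h_i - h_j
    sign = (diff > 0) - (diff < 0)  # -1, 0 or +1
    # abs() keeps the square root real for diff < 0 (assumption, see lead-in).
    return gamma * sign * math.sqrt(2.0 * g * abs(diff))


print(torricelli_flow(1.0, 0.0, gamma=0.5, g=2.0))
```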

We can use equation (2) to derive the following equations for the three-tank system shown in Fig. 3:

\[
\begin{aligned}
\dot{h}_1 &= q_0 - c_1\sqrt{h_1 - h_2}, \\
\dot{h}_2 &= c_2\sqrt{h_1 - h_2} - c_3\sqrt{h_2 - h_3}, \\
\dot{h}_3 &= c_4\sqrt{h_2 - h_3} - c_5\sqrt{h_3},
\end{aligned}
\tag{3}
\]

where the constants $c_1, \ldots, c_5$ summarize the system parameters representing cross-sectional areas, friction factors, gravity, etc. We measure tank pressure, whose equations are given by $p_i = h_i g$, for i = 1, 2, 3. Consequently, we define the parameter set using

\[ \Theta = \{c_1, c_2, c_3, c_4, c_5, g\}. \]
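To make the three-tank state equations (3) concrete, the sketch below simulates them with a forward-Euler scheme. Several things here are assumptions for illustration only: the unit values chosen for $c_1, \ldots, c_5$ and $q_0$, the step size, the clipping of negative level differences to zero, and the function names. None of these are the parameters of [7].

```python
import math


def three_tank_rhs(h, q0, c):
    """Right-hand side of the three-tank equations (3)."""
    h1, h2, h3 = h
    c1, c2, c3, c4, c5 = c

    def root(v):
        # Clip at zero so the square root stays real (assumption).
        return math.sqrt(max(v, 0.0))

    f12 = root(h1 - h2)
    f23 = root(h2 - h3)
    return (q0 - c1 * f12, c2 * f12 - c3 * f23, c4 * f23 - c5 * root(h3))


def simulate(h0, q0, c, dt=0.01, steps=20000):
    """Forward-Euler integration; returns the list of visited states."""
    h = tuple(h0)
    trajectory = [h]
    for _ in range(steps):
        dh = three_tank_rhs(h, q0, c)
        h = tuple(hi + dt * dhi for hi, dhi in zip(h, dh))
        trajectory.append(h)
    return trajectory


def pressures(h, g=9.81):
    """Measured outputs p_i = h_i * g."""
    return tuple(g * hi for hi in h)


# Hypothetical parameter values, chosen only to make the example easy to check.
c = (1.0, 1.0, 1.0, 1.0, 1.0)
final = simulate((0.0, 0.0, 0.0), q0=1.0, c=c)[-1]
print(final, pressures(final))
```

With these unit parameters the steady state can be found by hand: setting the three derivatives to zero gives h = (3, 2, 1), which is a convenient check on the simulated trajectory.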

Learning System Models

Objective

This section describes our approach for learning system models. Our objective is to search over the space S of possible models (composed from component models of different fidelity) to identify a model that optimizes a criterion µ:

\[ M^* = \arg\max_{i:\, M_i \in S} \mu(M_i \mid D) \tag{4} \]

We use a statistical approach for this model search, first assigning a prior distribution P (S) over the model space. We then use the structure defined by the canonical models to provide us with a framework for model search. Finally, we must use a stopping criterion to identify when we have achieved an "optimal" model given (µ, D).

Our experiments test the hypothesis that our proposed approach is computationally more efficient than and loses little simulation accuracy relative to the standard (unconstrained) approach.

System Architecture

Figure 4 shows the system architecture for model learning. We adopt a two-step process for learning a target model M. We first learn the motif/component library L; we then use this to constrain the induction of the full system topology G. We represent each component as a set E of dynamical equations.

Motif Learning. We use various system constraints to learn a motif/component library. We map the input equations E into a canonical representation, defined over ⟨z, π⟩, where z is a set of transformed variables and π is the set of canonical parameters.

System-Level Model Learning. We search over π to select the model that optimizes metric µ. We use a statistical model-comparison tool [10] to determine the best model. We ensure a computationally tractable search space by using physical constraints to map ⟨z, π⟩ to a subset of the full model parameter space.

We can search over the model space in two ways:

Unconstrained. Without motifs, our learning task must define the structure and parameters of the component model equations during the system-level learning process. Given the variability of defining ODEs, we use a gradient-descent search that systematically modifies the set W of ODEs to achieve a targeted change in the simulation performance of W.

Motif-Based. Here, we use the motif/component models to constrain system-level search. We search over all component combinations of a set of models of different fidelity to generate a system model optimizing our metric µ. If we have k models for every component, and a system consists of l components, then the space of models to be searched grows exponentially with the number of components ($k^l$), i.e., the search space becomes prohibitively large for large systems (l > 100).
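The size of the motif-based search space can be illustrated by enumerating every assignment of a fidelity level to every component. In this sketch the component names and the choice of k = 3 fidelity levels for l = 6 components are illustrative assumptions.

```python
from itertools import product

FIDELITIES = ("constant", "linear", "non-linear")  # k = 3 versions per component


def enumerate_models(components, fidelities=FIDELITIES):
    """Yield one {component: fidelity} dict per point of the k**l model space."""
    for choice in product(fidelities, repeat=len(components)):
        yield dict(zip(components, choice))


# Hypothetical component names for a system with l = 6 components.
components = ("tank1", "valve1", "tank2", "valve2", "tank3", "valve3")
models = list(enumerate_models(components))
print(len(models))  # 3**6 = 729
```

Exhaustive enumeration is feasible at this scale, but with the same k a system of 100 components would have $3^{100}$ candidate models, which is why the search has to be structured rather than enumerated.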

Tank System Canonical Representation

This section describes how we define a canonical representation for our tank model. We first define the representation, and then use our tank model as an example. We adopt a Power-Law Canonical Representation, also called S-systems [14], for constraining model search. This representation is general, since almost all non-linear models can be exactly recast into power-law models through a transformation using auxiliary variables, as specified in Theorem 1. This mapping generates m − n additional constraints beyond the n original equations: see [14] for details.

Theorem 1 ([14]). Let

\[ \dot{x}_i = f_i(x_1, x_2, \ldots, x_n), \quad x_i(0) = x_{i0}, \quad i = 1, 2, \ldots, n \tag{5} \]

be a set of differential equations where each $f_i$ consists of sums and products of elementary functions, or nested elementary functions of elementary functions. Then there is a smooth change of variables x → z that recasts Equations (5) into a power-law (or S-) system ⟨z = {$z_1, \ldots, z_{m+n}$}, π⟩:

\[ \dot{z}_i = \alpha_i \prod_{j=1}^{n} z_j^{\gamma_{ij}} - \beta_i \prod_{j=1}^{n} z_j^{\zeta_{ij}}, \quad z_i(0) = z_{i0}, \quad i = 1, 2, \ldots, m \tag{6} \]

where $z_i$ are real non-negative variables, and the parameters π = {$\alpha_i, \beta_i, \gamma_{ij}, \zeta_{ij}$} are such that $\alpha_i, \beta_i$ are real non-negative and $\gamma_{ij}, \zeta_{ij}$ are real.
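The power-law form of equation (6) is simple to evaluate numerically. The sketch below computes the right-hand side for given parameter arrays. The function name `s_system_rhs` and the argument layout (one row of exponents per equation, one column per variable) are assumptions made for this example.

```python
def s_system_rhs(z, alpha, beta, gamma, zeta):
    """Evaluate dz_i/dt = alpha_i * prod_j z_j**gamma_ij - beta_i * prod_j z_j**zeta_ij.

    gamma[i][j] and zeta[i][j] are the exponents of variable j in equation i.
    A negative exponent on a variable equal to zero raises ZeroDivisionError.
    """
    dz = []
    for i in range(len(alpha)):
        production = alpha[i]
        degradation = beta[i]
        for j, zj in enumerate(z):
            production *= zj ** gamma[i][j]
            degradation *= zj ** zeta[i][j]
        dz.append(production - degradation)
    return dz


# Two-variable toy system (hypothetical values, not the tank model).
rates = s_system_rhs(
    z=[4.0, 9.0],
    alpha=[1.0, 2.0],
    beta=[1.0, 1.0],
    gamma=[[0.5, 0.0], [0.0, 0.5]],
    zeta=[[0.0, 0.5], [1.0, 0.0]],
)
print(rates)
```

Because every model in this family shares the same functional form, changing a model amounts to changing entries of the exponent and coefficient arrays, which is what makes a structured search over models possible.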

In the following we will show how this representation, together with enforced physical constraints, can significantly reduce the search space for learning dynamical systems, through a principled control of the power-law system variables and parameters.

We now show how we can transform the tank model into a power-law model. We transform the state equations (Eq. 3) for this 3-tank system to a power-law model by making the following substitutions:

\[ x_0 \leftarrow q_0, \quad x_1 \leftarrow h_1, \quad x_2 \leftarrow h_2, \quad x_3 \leftarrow h_3, \quad x_4 \leftarrow h_1 - h_2, \quad x_5 \leftarrow h_2 - h_3 \]

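This can be expressed in the general formula as follows:

\[
\begin{aligned}
\dot{x}_1 &= x_0 - \beta_1 x_4^{\zeta_{11}} \\
\dot{x}_2 &= \alpha_2 x_4^{\gamma_{21}} - \beta_2 x_5^{\zeta_{21}} \\
\dot{x}_3 &= \alpha_3 x_5^{\gamma_{31}} - \beta_3 x_3^{\zeta_{31}} \\
\dot{x}_4 &= (\alpha_{41} x_5^{\gamma_{41}} + x_0) - \beta_4 x_4^{\zeta_{41}} \\
\dot{x}_5 &= \alpha_5 x_6^{\gamma_{51}} - \beta_5 x_5^{\zeta_{51}} \\
\dot{x}_6 &= \alpha_6 x_4^{\gamma_{61}} - \beta_6 x_5^{\zeta_{61}}
\end{aligned}
\tag{7}
\]

where most $\gamma_{ij}$ and $\zeta_{ij}$ are 1/2. We assume that x(0) = (0, 0, 0, 0, 0, 0). In this transformation, the equations for $\dot{x}_1, \dot{x}_2, \dot{x}_3$ comprise the transformed state equations, and the latter 3 equations are constraints. We can see that each state equation (corresponding to a tank equation) specifies an (inflow − outflow) representation.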
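As a worked check of one constraint equation, the derivative of the auxiliary variable $x_4 = h_1 - h_2$ can be obtained by differencing the first two state equations in (3). The identification of $\alpha_{41}$ and $\beta_4$ with the constants $c_i$ in the last line is our reading of the substitution, not a statement taken from the source.

```latex
\begin{aligned}
\dot{x}_4 &= \dot{h}_1 - \dot{h}_2 \\
          &= \left(q_0 - c_1\sqrt{h_1 - h_2}\right)
             - \left(c_2\sqrt{h_1 - h_2} - c_3\sqrt{h_2 - h_3}\right) \\
          &= \left(c_3\, x_5^{1/2} + x_0\right) - (c_1 + c_2)\, x_4^{1/2}
\end{aligned}
```

Under this reading, the result matches the form of the $\dot{x}_4$ equation in (7) with $\alpha_{41} = c_3$, $\beta_4 = c_1 + c_2$ and $\gamma_{41} = \zeta_{41} = 1/2$.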

Constraints on the Model Search Space

We must search over a vast space of models if no constraints are imposed. If we fix the variable specification (x, denoting the x i 's) in equation 1, a search over the space of models must consider the entire parameter space, in the worst case. If we modify the variable specification (x), then we must consider the parameter space for every setting of variables x. In a power-law model, for each component/system, the parameters are θ = {α, β, γ, ζ}, where α i , β i are real non-negative, γ ij , ζ ij are real.

We can significantly reduce the search space by using well-known physical constraints, transforming this space from a multi-dimensional continuous-valued space to a finite discrete-valued space. As an example, consider the tank system: component i, i = 1, 2, ..., has 4 parameters in the canonical model, θ = {$\alpha_i, \beta_i, \gamma, \zeta$}, of which two ($\alpha_i, \beta_i$) are multiplicative parameters, and two (γ, ζ) are exponents for variables.

Given an assignment of (γ, ζ), the estimation of the multiplicative parameters given data D is a well-known, highly-studied process, e.g., [21,8]. It is the exponent-based parameters (γ, ζ) that create a difficult search problem, a problem that has received relatively little attention. Without any constraints, we must search over the real-valued space of γ × ζ. However, we use physical (or model-based) constraints to prune the search space significantly. Consider the tank example: the physics-based model is nonlinear, and corresponds to setting all $\gamma_{ij}$ and $\zeta_{ij}$ to be 1/2 for i = 1, ..., 5 and to −1/2 for i = 6. Other versions of this model that are typically analysed are constant, obtained by setting all $\gamma_{ij}$ and $\zeta_{ij}$ to be 0, or linear, obtained by setting all $\gamma_{ij}$ and $\zeta_{ij}$ to be 1:

\[
\begin{aligned}
\dot{x}_1 &= x_0 - \beta_1 x_4 \\
\dot{x}_2 &= \alpha_2 x_4 - \beta_2 x_5 \\
\dot{x}_3 &= \alpha_3 x_5 - \beta_3 x_3 \\
\dot{x}_4 &= (\alpha_{41} x_5 + x_0) - \beta_4 x_4 \\
\dot{x}_5 &= \alpha_5 x_6 - \beta_5 x_5 \\
\dot{x}_6 &= \alpha_6 x_4 - \beta_6 x_5
\end{aligned}
\]
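Once the exponents (γ, ζ) are fixed, the tank equation for $\dot{x}_2$ is linear in its multiplicative parameters, so they can be estimated by ordinary least squares. The sketch below solves the 2×2 normal equations directly. It is one simple way to do this under noise-free assumptions and is not the estimation method of [21] or [8]; the function name and the sample values are hypothetical.

```python
def fit_alpha_beta(x4, x5, dx2, gamma, zeta):
    """Least-squares fit of dx2 = alpha * x4**gamma - beta * x5**zeta."""
    a = [v ** gamma for v in x4]      # regressor multiplying alpha
    b = [-(v ** zeta) for v in x5]    # regressor multiplying beta
    saa = sum(p * p for p in a)
    sbb = sum(q * q for q in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * y for p, y in zip(a, dx2))
    sby = sum(q * y for q, y in zip(b, dx2))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("alpha and beta are not separately identifiable")
    alpha = (say * sbb - sab * sby) / det
    beta = (saa * sby - sab * say) / det
    return alpha, beta


# Synthetic, noise-free samples generated from hypothetical true values.
x4 = [1.0, 4.0, 9.0, 16.0]
x5 = [4.0, 1.0, 16.0, 9.0]
true_alpha, true_beta = 0.8, 0.5
dx2 = [true_alpha * p ** 0.5 - true_beta * q ** 0.5 for p, q in zip(x4, x5)]
print(fit_alpha_beta(x4, x5, dx2, gamma=0.5, zeta=0.5))
```

For the constant model (γ = ζ = 0) the two regressors are collinear, so only the difference α − β can be determined; the sketch raises an error in that case rather than returning an arbitrary pair.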

If we want to search over a tank component, this is equivalent to searching over the space described by the equation for $\dot{x}_2$: $\alpha_2 x_4^{\gamma_{21}} - \beta_2 x_5^{\zeta_{21}}$. The three most typical classes of model, constant, non-linear and linear, correspond to setting each of (γ, ζ) to a value in {0, 1/2, 1}, i.e., 0 corresponds to constant, 1/2 to non-linear, and 1 to linear. We can create a hierarchy by fixing (α, β) and varying (γ, ζ), as shown in Table 1, which lists the plausible values of these exponents, namely {0, 1/2, 1}. We could, in theory, examine model combinations with all discrete combinations of (γ, ζ), or the infinite combinations of real-valued pairs, if (γ, ζ) are both reals. However, by enforcing a restriction to constant, non-linear and linear equations, we obtain a lattice-structured search space with a relatively small, finite number of (γ, ζ)-combinations to search.

γ21    ζ21    ẋ2                               Original                                        Type
0      0      α2 − β2                          α2 − β2                                         constant
1      1      α2 x4 − β2 x5                    α2 (h1 − h2) − β2 (h2 − h3)                     linear
1/2    1/2    α2 x4^(1/2) − β2 x5^(1/2)        α2 (h1 − h2)^(1/2) − β2 (h2 − h3)^(1/2)         non-linear

The lattice defines a clear notion of relative model fidelity. A directed edge from model A to model B means that model B has one component whose equations are more complex (and probably lead to higher-fidelity inference) than those in model A. Traversing the lattice thus entails traversing the model space, exploring models with well-defined differences in relative model fidelity.
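The lattice over model fidelities can be sketched by encoding a system model as a tuple of per-component fidelity levels, with an edge to every model obtained by raising one component by one level. The ordering constant < linear < non-linear, the six-component system and the function names are assumptions made for this illustration.

```python
from collections import deque

# Assumed fidelity ordering: index 0 is the simplest level.
LEVELS = ("constant", "linear", "non-linear")


def successors(model):
    """Models reachable by making exactly one component one level more complex."""
    for i, level in enumerate(model):
        if level + 1 < len(LEVELS):
            yield model[:i] + (level + 1,) + model[i + 1:]


def breadth_first(root):
    """Visit every model reachable from root, in breadth-first order."""
    seen = {root}
    order = [root]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


root = (0,) * 6  # all-constant model for a hypothetical six-component system
order = breadth_first(root)
print(len(order), [LEVELS[i] for i in order[-1]])
```

Each edge changes a single component by a single level, so two neighbouring models differ by a well-defined step in relative fidelity, which is the property the lattice is meant to provide.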

Empirical Analysis

This section describes our empirical analysis of the proposed power-law framework.

Experiments: Motif-Constrained Search

We ran experiments on the 3-tank system, using a collection of models for tanks and valves, namely constant, linear and non-linear instances of each, giving a search space of 729 possible models, each with a very large parameter space. We used the fully non-linear model as the gold-standard model M*, with parameters specified in [7], to simulate data that we used for learning. We computed the sum-of-squared-error (SSE) difference between M* and the learned model. We tested both breadth-first and depth-first search algorithms in the parameter lattice, starting from the constant model as the root of the search tree. The results indicate that the depth-first search algorithm is more efficient. However, when we penalize a model for its number of parameters in addition to penalizing a relative lack of accuracy, as in the AIC metric [5], the mixed linear/nonlinear model scores best. Figure 5 compares the AIC scores for composed models of the 3-tank system.
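The trade-off between accuracy and parameter count can be illustrated with the common least-squares form of AIC, n ln(SSE/n) + 2k, which assumes independent Gaussian errors. Whether the experiments used exactly this form is not stated in the text, so treat it as an assumption. The SSE values and parameter counts below are made-up numbers for illustration, not the paper's results.

```python
import math


def aic_least_squares(sse, n_samples, n_params):
    """AIC under Gaussian errors, up to an additive constant; lower is better."""
    return n_samples * math.log(sse / n_samples) + 2 * n_params


# name -> (SSE, number of samples, number of parameters); hypothetical values only.
candidates = {
    "constant": (50.0, 200, 6),
    "linear": (5.0, 200, 12),
    "mixed linear/non-linear": (1.0, 200, 12),
    "non-linear": (0.98, 200, 18),
}

scores = {name: aic_least_squares(*spec) for name, spec in candidates.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Since the objective in equation (4) is written as a maximization, AIC fits that formulation by taking µ to be the negative of the AIC score. In this made-up example the fully non-linear model has the smallest SSE, but its extra parameters cost more than the small accuracy gain is worth.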

Experiments: Unconstrained

We ran experiments in which we modified the canonical equation structure in a continuous manner. We can simplify the initial set of equations (7) by setting various γ and ζ parameters to zero, or we can extend the equations by increasing the order of various γ and ζ parameters or by adding multiplicative $x_j$ variables into the equations. The benefit of the canonical equation structure is that it constrains how to modify the equations, and it provides a clear framework to simplify or extend a set of equations.

Figure 6 compares the AIC scores for composed models of the 3-tank system, when we generate models by model extension. Here, we also create higher-order nonlinear models, which are increasingly penalized by the AIC metric due to their increasing number of parameters outweighing the improved model simulation accuracy.

Discussion

Our results show the trade-offs we can study by automatically generating models. Although simulation accuracy is higher for the nonlinear models than for simpler models, their parameter estimation may be too costly. The AIC metric provides a measure that addresses this trade-off.

A second outcome is that the model extension approach is significantly more expensive computationally than the composition approach. In the hydraulic domain studied, model extension did not create significantly more accurate models than those composed from the multi-fidelity library, although this is probably domain-dependent.

Conclusions

This article has illustrated a system that uses a library of motifs to create a computationally efficient system for learning task-specific ODE models from data. In particular, our approach can trade off model fidelity for model size (which typically corresponds to inference complexity) on a task by task basis.

Fig. 3. Diagram of the three-tank system.

Fig. 4. System architecture for learning models, based on (a) learning motifs and (b) using the motifs to constrain learning complete models.

Fig. 5. Comparison of SSE and AIC scores for composed models.

Fig. 6. Comparison of AIC scores for extended models.

Table 1. Hierarchy of equations for tank 2.

Acknowledgement. This research was supported by SFI grants 12/RC/2289 and 13/RC/2094.

References

[1] M. Behbahani-Nejad, A. Bagheri. The accuracy and efficiency of a matlab-simulink library for transient flow simulation of gas pipelines and networks. Journal of Petroleum Science and Engineering 70(3), 2010.
[2] A. G. Busetto, A. Hauser, G. Krummenacher, M. Sunnåker, S. Dimopoulos, C. S. Ong, J. Stelling, J. M. Buhmann. Near-optimal experimental design for model selection in systems biology. Bioinformatics 29(20), 2013.
[3] P. Fritzson. Principles of object-oriented modeling and simulation with Modelica 2. John Wiley & Sons, 2010.
[4] D. W. Gao, C. Mi, A. Emadi. Modeling and simulation of electric and hybrid vehicles. Proceedings of the IEEE 95, 2007.
[5] S. Hu. Akaike information criterion. Center for Research in Scientific Computation, 2007.
[6] J. Jan, B. Šulc. User friendly simulink thermal power plant modelling using object oriented non-linear dynamic model library. 5th IASTED International Conference Power and Energy Systems (PES 2001), Tampa, 2001.
[7] C. Join, H. Sira-Ramírez, M. Fliess. Control of an uncertain three-tank system via on-line parameter identification and fault detection. IFAC World Congress, July 2005.
[8] C. Kravaris, J. Hahn, Y. Chu. Advances and selected recent developments in state and parameter estimation. Computers & Chemical Engineering 51, 2013.
[9] R. Liu, J. Cao, Z. Lin, S. Shan. Adaptive partial differential equation learning for visual saliency detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[10] J. M. Marin, P. Pudlo, C. P. Robert, R. J. Ryder. Approximate bayesian computational methods. Statistics and Computing 22(6), 2012.
[11] M. Mateják, T. Kulhánek, J. Šilar, P. Privitzer, F. Ježek, J. Kofránek. Physiolibrary: Modelica library for Physiology. 10th International Modelica Conference, Lund, Sweden. Linköping University Electronic Press, 2014.
[12] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon. Network motifs: simple building blocks of complex networks. Science 298(5594), 2002.
[13] B. North, A. Blake. Learning dynamical models using expectation-maximisation. Sixth International Conference on Computer Vision, IEEE, 1998.
[14] M. A. Savageau, E. O. Voit. Recasting nonlinear differential equations as S-systems: a canonical nonlinear form. Mathematical Biosciences 87(1), 1987.
[15] S. W. Seo, G. Y. Jung. Synthetic regulatory RNAs as tools for engineering biological systems: Design and applications. Chemical Engineering Science 103, 2013.
[16] E. O. Voit. Biochemical systems theory: a review. ISRN Biomathematics 2013, 2013.
[17] N. Wahlström, T. B. Schön, M. P. Deisenroth. Learning deep dynamical models from image pixels. IFAC-PapersOnLine 48(28), 2015.
[18] K. Weihe. Motifs in networks. Gems of Combinatorial Optimization and Graph Algorithms, Springer, 2015.
[19] M. Wetter, W. Zuo, T. S. Nouidui, X. Pang. Modelica buildings library. Journal of Building Performance Simulation 7(4), 2014.
[20] S. J. Wu, C. T. Wu. Seeding-inspired chemotaxis genetic algorithm for the inference of biological systems. Computational Biology and Chemistry 53, 2014.
[21] P. Young. Parameter estimation for continuous-time models: a survey. Automatica 17(1), 1981.