<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Learning the Structure of Causal Models with Relational and Temporal Dependence</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Katerina</forename><surname>Marazopoulou</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">College of Information and Computer Sciences</orgName>
								<orgName type="institution">University of Massachusetts Amherst</orgName>
								<address>
									<postCode>01003</postCode>
									<settlement>Amherst</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marc</forename><surname>Maier</surname></persName>
							<email>maier@cs.umass.edu</email>
							<affiliation key="aff0">
								<orgName type="department">College of Information and Computer Sciences</orgName>
								<orgName type="institution">University of Massachusetts Amherst</orgName>
								<address>
									<postCode>01003</postCode>
									<settlement>Amherst</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">David</forename><surname>Jensen</surname></persName>
							<email>jensen@cs.umass.edu</email>
							<affiliation key="aff0">
								<orgName type="department">College of Information and Computer Sciences</orgName>
								<orgName type="institution">University of Massachusetts Amherst</orgName>
								<address>
									<postCode>01003</postCode>
									<settlement>Amherst</settlement>
									<region>MA</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Learning the Structure of Causal Models with Relational and Temporal Dependence</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">2110251D43F64D5E0D08AE5333EF3DD3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T17:00+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many real-world domains are inherently relational and temporal: they consist of heterogeneous entities that interact with each other over time. Effective reasoning about causality in such domains requires representations that explicitly model relational and temporal dependence. In this work, we provide a formalization of temporal relational models. We define temporal extensions to abstract ground graphs, a lifted representation that abstracts paths of dependence over all possible ground graphs. Temporal abstract ground graphs enable a sound and complete method for answering d-separation queries on temporal relational models. These methods provide the foundation for a constraint-based algorithm, TRCD, that learns causal models from temporal relational data. We provide experimental evidence that demonstrates the need to explicitly represent time when inferring causal dependence. We also demonstrate the expressive gain of TRCD compared to earlier algorithms that do not explicitly represent time.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>Recent work in artificial intelligence has devoted increasing attention to learning and reasoning with causal knowledge. Causality is central to understanding the behavior of complex systems and to selecting actions that will achieve particular outcomes. Thus, causality is implicitly or explicitly central to mainstream AI areas such as planning, cognitive modeling, computational sustainability, game playing, multiagent systems, and robotics. Causal inference is also central to many areas beyond AI, including medicine, public policy, and nearly all areas of science. Substantial research advances have been made over the past several decades that allow the structure and parameters of causal graphical models to be learned from observational data. As early as the 1990s, researchers developed constraint-based algorithms, such as the IC [Pearl and <ref type="bibr">Verma, 1991]</ref> and PC <ref type="bibr">[Spirtes et al., 2000]</ref> algorithms, that leverage the connection between the graphical criterion of d-separation and the conditional independencies that are inferred to hold in the underlying distribution, in order to learn the structure of causal graphical models from data.</p><p>In this work, we significantly extend the expressiveness of the models learnable with constraint-based algorithms. Specifically, we provide a formalization of temporal relational models, an expressive class of models that can capture probabilistic dependencies between variables on different types of entities within and across time points. We extend the notion of abstract ground graphs <ref type="bibr">[Maier et al., 2013b</ref>], a lifted representation that allows reasoning about the conditional independencies implied by a relational model, to temporal relational models, and we show that temporal abstract ground graphs are a sound and complete abstraction for ground graphs of temporal relational models. 
Temporal abstract ground graphs can be used to answer d-separation queries for temporal relational models. We also extend an existing constraint-based algorithm for inferring causal dependence in relational data, the relational causal discovery (RCD) algorithm, to incorporate time, thus providing a constraint-based method that learns causal models from temporal relational data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">RELATIONAL MODELS</head><p>Propositional representations, such as Bayesian networks, describe domains containing a single entity class. Many real-world systems comprise instances from multiple entity classes whose variables are interdependent. Such domains are often referred to as relational. In this section, we introduce basic concepts of relational representations that exclude time.</p><p>A relational schema S = (E, R, A, card) specifies the types of entities, relationships, and attributes that exist in a domain. It includes a cardinality function that constrains the number of times an entity instance can participate in a relationship. Given a relational schema, we can specify relational paths, which intuitively correspond to ways of traversing the schema.</p><p>For the schema shown in Figure <ref type="figure">1</ref>, possible paths include [Student, Takes, Course] (the courses a student takes), as well as [Student, Takes, Course, Takes, Student] (other students that take the same courses). Relational variables consist of a relational path and an attribute that can be reached through that path. For example, the relational variable [Student, Takes, Course].difficulty corresponds to the difficulty of the courses that a student takes. Probabilistic dependencies can be defined between relational variables. Dependencies are said to be in canonical form when the path of the effect (or outcome) relational variable is a single item. For canonical dependencies, the path of the cause (or treatment) relational variable describes how dependence is induced. As an example, consider the dependency [Student, Takes, Course].difficulty → [Student].gpa: the difficulty of the courses a student takes affects that student's GPA. A relational model M = (S, D, Θ) is a collection of relational dependencies defined over a single relational schema along with their parameterizations (a conditional probability distribution for each attribute given its parents). 
The structure of a relational model can be depicted by superimposing the dependencies on the ER diagram of the relational schema, as shown in Figure <ref type="figure">1</ref>, and labeling each arrow with the corresponding relational dependency.</p><p>Given a relational model M and a relational skeleton σ, we can construct a ground graph GG_{Mσ} by applying the relational dependencies as specified in the model to the specific instances of the relational skeleton. Figure <ref type="figure" target="#fig_1">3</ref> shows the ground graph for the model of Figure <ref type="figure">1</ref> applied to the relational skeleton of Figure <ref type="figure">2</ref>. In this work, we restrict our attention to relational models that do not contain the kind of relational autocorrelation that gives rise to cycles in the ground graph.</p></div>
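To make relational paths and terminal sets concrete, here is a small illustrative Python sketch over a toy skeleton in the spirit of the running example (the instance names mirror the paper's university domain, but this particular skeleton and the helper functions are our own, not the paper's):

```python
# Toy relational skeleton: the Takes relationship as a set of
# (student, course) instance pairs, loosely mirroring Figure 2.
takes = {("Alice", "CS101"), ("Bob", "CS101"), ("Bob", "CS201")}

def terminal_courses(student):
    """Terminal set of [Student, Takes, Course] for one student:
    the courses that the student takes."""
    return {c for s, c in takes if s == student}

def terminal_classmates(student):
    """Terminal set of [Student, Takes, Course, Takes, Student]:
    other students that take the same courses."""
    return {s2 for c in terminal_courses(student)
            for s2, c2 in takes if c2 == c and s2 != student}

print(terminal_courses("Alice"))     # {'CS101'}
print(terminal_classmates("Alice"))  # {'Bob'}
```

Attaching an attribute to a path, as in [Student, Takes, Course].difficulty, then simply means reading that attribute off every instance in the terminal set.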
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">TEMPORAL RELATIONAL MODELS</head><p>Relational models can be extended with a temporal dimension to model probabilistic dependencies over time. Such an extension is similar to the way in which dynamic Bayesian networks (DBNs) extend Bayesian networks <ref type="bibr" target="#b16">[Murphy, 2002]</ref>. A temporal relational model can be thought of as a sequence of time points, each of which is associated with a (non-temporal) relational model, and a set of dependencies that cross time points. Because dependencies in this model have a causal interpretation, dependencies across time points are only directed from the past to the future.</p><p>In this section, we extend the relational notions presented in Section 2 to include time. We assume that (1) time is discrete; (2) the schema is static; (3) relational dependencies do not change over time; (4) the temporal relational skeleton is given a priori; (5) the first-order Markov assumption holds. Under these assumptions, the structure of a temporal relational model can be represented by using only two time points, as shown in Figure <ref type="figure" target="#fig_4">4</ref>. Every time point has the same relational schema, shown in gray. A temporal relational skeleton provides a partial instantiation of a temporal relational schema. It specifies the entity and relationship instances that exist in each time point. Note that different sets of entity and relationship instances may be present in each time step. An example temporal relational skeleton is shown in Figure <ref type="figure">5</ref>. In this case, both time points have the same set of entity instances (Alice and Bob are instances of the Student entity, CS101 and CS201 are instances of the Course entity). However, the instances of the relationship Takes differ. In t = 0, Alice takes CS101 and Bob takes CS201. 
In t = 1, Alice takes CS201 and Bob takes no classes.</p><p>Temporal relational paths capture possible ways to traverse the temporal schema; therefore, they can cross time points. For example, the temporal relational path [Student^{t+1}, Student^t, Takes^t, Course^t] describes the classes that a student took in the previous semester. More formally, a temporal relational path is a sequence of non-temporal relational paths (relational paths within a time point) and "jumps" between neighboring time points. These jumps can happen at both entities and relationships because each choice encodes a distinct semantics. For example, the relational path [Student^{t+1}, Student^t, Takes^t, Course^t] describes the courses that a student took in the previous semester, while the path [Student^{t+1}, Takes^{t+1}, Course^{t+1}, Course^t] first finds the courses that a student is taking this semester, and then finds those courses in the previous semester. In the example skeleton of Figure <ref type="figure">5</ref>, for Alice, the first path would reach CS101 at t = 0, while the second path would reach CS201 at t = 0.</p><p>Definition 1. A temporal relational path P is a sequence of non-temporal relational paths P_0^{t_0}, ..., P_k^{t_k} (k ≥ 0) such that for any two consecutive paths P_i^{t_i}, P_j^{t_j} in P the following hold:</p><formula xml:id="formula_0">1. |t_i − t_j| = 1</formula><p>2. The last item class of P_i^{t_i} is the same as the first item class of P_j^{t_j}.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">No subpath of P is of the form</head><formula xml:id="formula_1">[I^{t_k}, ..., I^{t_k}]</formula><p>, where all relations in the subpath are one-to-one.</p><p>We use the notation time(P) to denote the set of all time points that appear in path P.</p><p>Temporal relational variables consist of a temporal relational path and an attribute that can be reached through that path. Temporal relational dependencies define probabilistic dependencies between two temporal relational variables. Temporal probabilistic dependencies are never directed backwards in time; therefore, at the model level, there are no dependencies going back in time. However, the temporal constraints associated with a dependency are also implicitly encoded by the temporal relational path that annotates the dependency. To account for this, we forbid the temporal relational path of the treatment to go through any time points later than the time point of the outcome.</p><p>Definition 2. A temporal relational dependency consists of two temporal relational variables with a common base item,</p><formula xml:id="formula_2">[I_1^t, ..., I_k^{t'}].V_k → [I_1^t].V_1 such that max(time([I_1^t, ..., I_k^{t'}])) ≤ t</formula><p>The first-order Markov assumption for temporal causal models implies that, for every probabilistic dependency, if the treatment is in time point t, then the outcome is either in t or in t + 1. In the case of relational domains, the relational path of the treatment carries some temporal information, since it can contain multiple time points. For the first-order Markov condition to hold, we require the relational path of the treatment to go through only the time point of the outcome and the one immediately before it. More formally, a temporal relational dependency</p><formula xml:id="formula_3">[I_1^t, ..., I_k^{t'}].V_k → [I_1^t].V_1</formula><p>follows the first-order Markov assumption if the following two conditions hold:</p><formula xml:id="formula_4">t' = t or t' = t − 1 (1) time([I_1^t, ..., I_k^{t'}]) ⊆ {t, t − 1} (2)</formula><p>The structure of a 2-slice temporal relational model is defined as a set of temporal relational dependencies over a relational schema.</p><p>Definition 3. The structure of a 2-slice temporal relational model is a pair M = ⟨S, D_T⟩, where S is a relational schema and D_T is a set of temporal relational dependencies that adhere to the first-order Markov assumption.</p><p>Figure <ref type="figure" target="#fig_4">4</ref> shows the structure of a 2-slice temporal relational model for the university domain. The temporal dependency shows that the difficulty of a course in the fall semester affects the spring GPA of the students that took this class in the fall. If, for example, a student takes many difficult classes, it is more likely that the student's GPA will drop in the next semester. Finally, given a temporal relational skeleton σ_T and a temporal relational model M, we can construct a temporal ground graph GG_{Mσ_T} in the same way as in the non-temporal case.</p></div>
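The temporal constraints of Definition 2 and the first-order Markov conditions are simple enough to check mechanically. The following Python sketch is illustrative; in particular, encoding a temporal relational path as a list of per-item time indices is our own simplification, not the paper's representation:

```python
def valid_jumps(times):
    """Definition 1, condition 1: consecutive items on a temporal
    relational path lie in the same or an adjacent time point."""
    return all(abs(a - b) <= 1 for a, b in zip(times, times[1:]))

def valid_dependency(treatment_times, outcome_time):
    """Definition 2: the treatment path never visits a time point
    later than the outcome's time point."""
    return (valid_jumps(treatment_times)
            and max(treatment_times) <= outcome_time)

def first_order_markov(treatment_times, outcome_time):
    """Conditions (1) and (2): the treatment path stays within the
    outcome's time point and the one immediately before it."""
    return (valid_dependency(treatment_times, outcome_time)
            and set(treatment_times) <= {outcome_time, outcome_time - 1})

# [Student^{t+1}, Student^t, Takes^t, Course^t].difficulty -> [Student^{t+1}].gpa
# with t = 0: treatment path times [1, 0, 0, 0], outcome at time 1.
print(first_order_markov([1, 0, 0, 0], 1))  # True
print(first_order_markov([2, 1, 1, 1], 1))  # False: visits a later time point
```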
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">EXPRESSIVENESS OF TEMPORAL RELATIONAL MODELS</head><p>The temporal relational model described so far subsumes propositional directed networks (Bayesian networks), propositional temporal directed networks (dynamic Bayesian networks), and relational non-temporal models. This added expressivity comes at the cost of increased complexity, which raises an obvious and important question: what is the value of this added expressivity?</p><p>As an example of the practical utility of this added expressivity, we consider a real dataset and show how a path with temporal jumps leads to different terminal sets. Specifically, we used the Reality Mining dataset [Eagle and <ref type="bibr">Pentland, 2006</ref>]. The dataset contains two entities, Tower and Cellphone, and one relationship, Connects. The relational schema for this domain is shown in Figure <ref type="figure" target="#fig_3">6</ref>. For this dataset, entity instances (i.e., the set of towers and cellphones) do not change over time. However, the set of relationship instances (the connections) is different at every time point. In total, there are 95 cellphones, 32,579 towers, and 3,308,709 connections. Every connection is timestamped with a precision of 1 second. For our work, the time granularity was coarsened to a day.</p><p>We computed the terminal sets for three paths, the first of which is</p><formula xml:id="formula_5">P_1 : [Tower^{t+1}, …]</formula><p>and corresponds to the cellphones that were connected to the tower. We randomly selected 100 dates from the dataset. For each of these dates and for each tower that was used on these dates, we computed the terminal sets for the above paths. For a given tower and date, we computed the Jaccard distance between the different terminal sets. The Jaccard distance between two sets A and B is defined as</p><formula xml:id="formula_6">J(A, B) = 1 − |A ∩ B| / |A ∪ B|.</formula><p>Intuitively, this quantifies the overlap of two sets, while accounting for the size of both. Table <ref type="table" target="#tab_2">1</ref> shows the average Jaccard distance between the terminal sets, averaged across the dates and the towers. The results indicate that, on average, the terminal sets reached through the three paths will be different; therefore, this more expressive representation could be of use in real data.</p></div>
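The Jaccard-distance computation above is straightforward to reproduce; a minimal Python sketch (function name and the convention for two empty sets are ours):

```python
def jaccard_distance(a, b):
    """J(A, B) = 1 - |A intersect B| / |A union B|: 0 for identical
    non-empty sets, 1 for disjoint non-empty sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention: two empty terminal sets are identical
    return 1.0 - len(a & b) / len(a | b)

# Terminal sets reached through two different paths from the same tower
print(jaccard_distance({"phone1", "phone2"}, {"phone2", "phone3"}))
```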
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">TEMPORAL ABSTRACT GROUND GRAPHS</head><p>An abstract ground graph is a lifted representation that abstracts paths of dependence over all possible ground graphs for a given relational model <ref type="bibr">[Maier et al., 2013b]</ref>. Abstract ground graphs are shown to be sound and complete in the sense that every edge in the abstract ground graph corresponds to an edge in some ground graph, and every edge in an arbitrary ground graph is represented by an edge in an abstract ground graph. In this section, we adapt the definition of abstract ground graphs to the case of a 2-slice temporal relational model.</p><p>Definition 4. A temporal abstract ground graph tAGG_{MBh} for a 2-slice temporal relational model M, perspective B, and hop threshold h is a directed graph with nodes V and edges E such that:</p><p>1. V = RV ∪ IV, where (a) RV is the set of temporal relational variables with a path of length at most h + 1:</p><formula xml:id="formula_7">RV = {[B^t, ..., I_j^{t'}].V | length([B^t, ..., I_j^{t'}]) ≤ h + 1}</formula><p>(b) IV is the set of intersection variables between pairs of temporal relational variables that could intersect<ref type="foot" target="#foot_0">1</ref>:</p><formula xml:id="formula_8">IV = {X ∩ Y | X, Y ∈ RV and X, Y could intersect}</formula><p>2. E = RVE ∪ IVE, where (a) RVE is the set of relational variable edges:</p><formula xml:id="formula_9">RVE = {[B^t, ..., I_k^{t''}].V_k → [B^t, ..., I_j^{t'}].V_j | [I_j^{t'}, ..., I_k^{t''}].V_k → [I_j^{t'}].V_j ∈ D_T and [B^t, ..., I_k^{t''}] ∈ extend([B^t, ..., I_j^{t'}], [I_j^{t'}, ..., I_k^{t''}])}</formula><p>(b) IVE ⊂ (IV × RV) ∪ (RV × IV) is the set of intersection variable edges. This is the set of edges that intersection variables "inherit" from the relational variables from which they were created.</p><p>The extend method converts dependencies of the model, specified in the canonical form, into dependencies from the perspective of the abstract ground graph.</p><p>Temporal abstract ground graphs can be shown to be a correct abstraction over all possible temporal ground graphs. The proof follows the one provided for the non-temporal abstract ground graphs, as presented by <ref type="bibr">Maier et al. [2013b]</ref>.</p><p>The temporal abstract ground graph for a model on the student-courses domain with the dependency</p><formula xml:id="formula_10">[Student^{t+1}, Student^t, Takes^t, Course^t].difficulty → [Student^{t+1}].gpa</formula><p>is shown in Figure <ref type="figure" target="#fig_5">7</ref>. This abstract ground graph is from the perspective of Student and for hop threshold h = 4. Disconnected nodes are omitted.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">d-SEPARATION IN TEMPORAL ABSTRACT GROUND GRAPHS</head><p>The rules of d-separation provide a graphical criterion that specifies whether two sets of variables in a directed acyclic graph are conditionally independent given a third set of variables. Specifically, let X, Y, Z be disjoint sets of variables in a directed acyclic graph. X is d-separated from Y given Z if every path between variables in X and Y is not a d-connecting path given Z. A path is d-connecting given Z if for every collider W on the path, either W ∈ Z or a descendant of W is in Z, and every non-collider on the path is not in Z. <ref type="bibr" target="#b7">Geiger and Pearl [1988]</ref> and <ref type="bibr" target="#b21">Verma and Pearl [1988]</ref> showed that d-separation is sound and complete.</p><p>Ground graphs (and temporal ground graphs) are directed acyclic graphs; therefore, the rules of d-separation can be applied to them. In the case of domains where the first-order Markov assumption holds, a d-separating set for variables in the same time point can be found by examining only one time point in the past.</p><p>Proposition 5. Let G be a temporal directed acyclic graph that follows the first-order Markov assumption. (i) If X^{t+1} and Y^{t+1} are conditionally independent given some set Z, then there exists a separating set W such that time(W) ⊆ {t, t + 1}. (ii) Similarly, if X^t and Y^{t+1} are conditionally independent given some set Z, then there exists a separating set W such that time(W) ⊆ {t, t + 1}.</p><p>Proof. For each case, we will construct an appropriate separating set:</p><p>(i) Let W = parents(X^{t+1}) ∪ parents(Y^{t+1}). Because of the first-order Markov assumption, time(W) ⊆ {t, t + 1}. Moreover, conditioning on W renders X^{t+1} and Y^{t+1} independent because of the local Markov property (either X^{t+1} is a descendant of Y^{t+1}, or vice versa, or neither is a descendant of the other).</p><p>(ii) In this case, let W = parents(Y^{t+1}). 
Because of the temporal semantics, X^t is a non-descendant of Y^{t+1}; therefore, Y^{t+1} is conditionally independent of its non-descendants (which include X^t) given W.</p><p>The significance of the above proposition is that, given a model where the first-order Markov assumption holds, we can infer conditional independencies (and use them to learn the structure of the model) by considering only consecutive time points. <ref type="bibr">Maier et al. [2013b]</ref> showed that d-separation cannot be applied directly to a relational model. To correct for that, they introduced relational d-separation, a graphical criterion that can be applied to abstract ground graphs and used to infer conditional independencies that hold across all possible ground graphs of the model. Here, we show that the notion of relational d-separation can be generalized to temporal abstract ground graphs as well. In the following definition, X|_b denotes the terminal set of X starting at b, i.e., the set of instances of X that can be reached if we start from instance b and follow the relational path of X on the relational skeleton.</p><p>Definition 6 (Temporal relational d-separation). Let X, Y, and Z be three disjoint sets of temporal relational variables from perspective B for a 2-slice temporal relational model M such that not both X and Y contain variables in t. Then X and Y are d-separated by Z if and only if, for every temporal relational skeleton σ_T and for all b ∈ σ_T(B), X|_b and Y|_b are d-separated by Z|_b in the ground graph GG_{Mσ_T}.</p><p>The following theorem shows that temporal relational d-separation is sound and complete up to a specified hop threshold. We use the notation X̄ to denote the set of relational variables X augmented with the set of intersection variables they participate in.</p><p>Theorem 7. Let X, Y, and Z be three disjoint sets of temporal relational variables for perspective B such that not both X and Y contain variables in t. Then, for any temporal skeleton σ_T and for all b ∈ σ_T(B), X|_b and Y|_b are d-separated by Z|_b up to h in ground graph GG_{Mσ_T} if and only if X̄ and Ȳ are d-separated by Z̄ on the abstract ground graph tAGG_{MBh}.</p><p>Proof. 
We prove this by defining a non-temporal relational model that is equivalent to the given temporal model. Since relational d-separation is sound and complete for non-temporal relational models, the result extends to the equivalent model and, therefore, to the temporal relational model. Given a 2-slice temporal relational model M_T = ⟨S, D_T⟩, construct a relational model M = ⟨S', D'⟩ as follows:</p><p>• For every item in S, add an item in S' with superscript t and one with superscript t + 1: S' = {I^t, I^{t+1} | I ∈ S}.</p><p>• For every entity E ∈ S, add in S' a one-to-one relationship between E^t and E^{t+1}.</p><p>• The set of relational dependencies is the set of temporal relational dependencies: D' = D_T.</p><p>It can be shown that these two models are equivalent in the sense that there is a one-to-one correspondence between valid paths in M_T and M. Leveraging the temporal constraints of d-separation described by Proposition 5 and the soundness and completeness of d-separation for abstract ground graphs, we conclude that d-separation is sound and complete (up to a hop threshold) in temporal abstract ground graphs.</p></div>
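Since ground graphs are ordinary DAGs, the d-separation queries discussed in this section can be checked with the standard moralization criterion (X and Y are d-separated by Z exactly when Z separates them in the moralized ancestral graph). The following self-contained Python sketch uses that equivalent criterion rather than the paper's abstract-ground-graph machinery, and the toy two-slice DAG at the bottom is our own illustration:

```python
from itertools import combinations

def ancestors(parents, nodes):
    """All ancestors of `nodes` (inclusive) in a DAG given as a
    node -> set-of-parents mapping."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, X, Y, Z):
    """True iff X and Y are d-separated given Z, via moralization:
    restrict to the ancestral set, marry co-parents, drop edge
    directions, and test whether Z blocks every path from X to Y."""
    A = ancestors(parents, set(X) | set(Y) | set(Z))
    adj = {n: set() for n in A}
    for n in A:
        ps = [p for p in parents.get(n, set()) if p in A]
        for p in ps:                      # undirected parent edges
            adj[n].add(p); adj[p].add(n)
        for p, q in combinations(ps, 2):  # marry co-parents
            adj[p].add(q); adj[q].add(p)
    frontier = [x for x in X if x not in Z]
    seen = set(frontier)
    while frontier:                       # search that never enters Z
        n = frontier.pop()
        if n in Y:
            return False                  # reached Y: d-connected
        for m in adj[n] - seen:
            if m not in Z:
                seen.add(m)
                frontier.append(m)
    return True

# Toy 2-slice ground graph: X_t -> X_t1 -> Y_t1 <- Y_t
G = {"X_t1": {"X_t"}, "Y_t1": {"X_t1", "Y_t"}}
# Proposition 5(ii): parents(Y_t1) separate X_t from Y_t1
print(d_separated(G, {"X_t"}, {"Y_t1"}, {"X_t1", "Y_t"}))  # True
print(d_separated(G, {"X_t"}, {"Y_t1"}, set()))            # False
```

Note that the constructed separating set {X_t1, Y_t} lies entirely within the two slices {t, t+1}, exactly as Proposition 5 guarantees.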
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">TEMPORAL RCD</head><p>The theory of temporal relational d-separation allows us to derive all conditional independence facts that are consistent with the structure of a temporal relational model, in the same way that d-separation connects the structure of a Bayesian network to the conditional independencies of the underlying distribution. This is precisely the connection that constraint-based algorithms leverage in order to learn the structure of models. Thus, temporal relational d-separation (and temporal abstract ground graphs) enable a constraint-based algorithm, TRCD 2 , that learns the structure of temporal and relational causal models from data.</p><p>TRCD extends RCD <ref type="bibr">[Maier et al., 2013a]</ref> to operate over a 2-slice temporal relational model. More specifically, it constructs a set of temporal abstract ground graphs, one from the perspective of each entity, and uses the theory of temporal relational d-separation on temporal abstract ground graphs to decide conditional independence facts. TRCD uses the temporal abstract ground graphs to determine which conditional independence facts should be checked in the data and which dependencies (edges) in the temporal relational model are implied by those facts. As in the case of RCD, for practical reasons, the space of potential dependencies is limited by a domain-specific hop threshold.</p><p>TRCD operates in two phases. Phase I learns a set of undirected dependencies, and Phase II employs a set of orientation rules to orient those dependencies. Phase II of TRCD uses the same orientation rules as RCD: collider detection, known non-colliders (KNC), cycle avoidance (CA), Meek rule 3 (MR3), and relational bivariate orientation (RBO). 
Additionally, TRCD orients dependencies that cross time points from the past to the future.</p><p>RCD was shown to be sound and complete in the sample limit (i.e., with perfect tests of conditional independence) and for infinite hop threshold, under the standard assumptions of the causal Markov condition, faithfulness, and causal sufficiency for relational domains. By leveraging the soundness and completeness of temporal relational d-separation, TRCD can be shown to be sound and complete (under the same assumptions).</p><p>2 Code available at kdl.cs.umass.edu/trcd. </p></div>
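The temporal-precedence rule added by TRCD is the simplest of the orientation rules to illustrate: any undirected edge whose endpoints lie in different time slices is oriented from past to future. The sketch below is a hypothetical illustration with our own data structures, not TRCD's actual code:

```python
def orient_temporal(undirected_edges, slice_of):
    """Orient edges that cross time slices from past to future;
    same-slice edges are left for the other orientation rules
    (collider detection, KNC, CA, MR3, RBO)."""
    directed, remaining = [], []
    for u, v in undirected_edges:
        if slice_of[u] < slice_of[v]:
            directed.append((u, v))
        elif slice_of[u] > slice_of[v]:
            directed.append((v, u))
        else:
            remaining.append((u, v))
    return directed, remaining

slices = {"difficulty_t": 0, "gpa_t1": 1, "difficulty_t1": 1}
edges = [("difficulty_t", "gpa_t1"), ("difficulty_t1", "gpa_t1")]
print(orient_temporal(edges, slices))
# ([('difficulty_t', 'gpa_t1')], [('difficulty_t1', 'gpa_t1')])
```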
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">EXPERIMENTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.1">ORACLE EXPERIMENTS</head><p>The goal of this set of experiments is twofold. First, we evaluate the theoretical performance of TRCD in the large sample limit. Towards that end, we employ an oracle for answering d-separation queries in place of statistical tests of independence. Second, these experiments provide experimental evidence for the correctness of temporal relational d-separation.</p><p>We generated synthetic schemas with the number of entities varying from 1 to 3, the number of relationships fixed to one less than the number of entities, and the number of attributes for each item drawn from Poisson(λ = 1) + 1. The cardinalities were selected uniformly at random. For each number of entities specified, we generated 500 different schemas. This generating process yielded 1,500 different schemas. For a given schema, we generated 10 different models with the number of dependencies ranging from 1 to 10. Specifically, we generated all possible dependencies, up to hop threshold h = 3. Then, we chose the desired number of dependencies from that space at random, subject to the following constraints: each relational variable has at most 3 parents, and the generated model contains no cycles. Moreover, every model must contain at least one dependency with a temporal relational path that contains more than one time point. This procedure resulted in a total of 15,000 models.</p><p>We ran TRCD with a d-separation oracle for each model. Figure <ref type="figure" target="#fig_7">8</ref> shows average precision and recall after Phase I (unoriented dependencies) and after Phase II (partially oriented dependencies). As expected, given that these experiments use an oracle, the algorithm always learns the correct set of unoriented dependencies (unoriented precision and recall are always 1). 
The algorithm makes no mistakes in orientation (oriented precision is always 1); however, it is not possible to orient all dependencies (oriented recall is lower than 1). Note that comparing to an oracle version of RCD is not straightforward: the oracle requires a true model that is fully directed, and converting the true temporal model to a non-temporal one would often result in cycles or undirected edges, in which case the relational d-separation oracle cannot be used.</p><p>Table <ref type="table" target="#tab_4">2</ref> shows how often each orientation rule was used in Phase II. Collider detection, known non-colliders (KNC), and relational bivariate orientation (RBO) orient the majority of the edges. As expected, in the case of propositional models (one entity), the relational bivariate orientation rule, a rule for edge orientation that is unique to relational data, is never used. We observe that the other two rules, cycle avoidance and Meek's rule 3, do not fire often. That can be explained by the fact that those rules would never fire in the presence of a temporal edge. Consider cycle avoidance in the propositional case: if the pattern X → Y → Z and X − Z is encountered, then cycle avoidance orients X → Z. If any of the dependencies in X → Y → Z crosses time points, then X and Z would be in different time points and the edge between them would be oriented based on temporal precedence. A similar argument can be made for the case of Meek's rule 3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.2">EXPERIMENTS ON SYNTHETIC DATA</head><p>This experiment showcases the use of TRCD on data, i.e., without the use of an oracle to decide conditional independence. Towards that end, we generated synthetic models, generated data from these models, and applied TRCD to the resulting datasets. The use of synthetic data gives us access to the ground truth, so we can measure the accuracy of the learned models. The data-generating process is described in detail below.</p><p>Using the same process as described in Section 8.1, we generated 5 synthetic schemas with 2 entities and 5 synthetic schemas with 3 entities. For each schema, we generated 10 models with the number of dependencies ranging from 1 to 10. This resulted in 100 different models. For each model, we created 3 different relational skeletons over 300 time points. The number of entity instances at each time point was drawn from Poisson(λ) + 1, where λ ∼ U(5, 10). The degree distribution for the relationship instances was drawn from Poisson(λ) + 1, where λ ∼ U(1, 5). Regarding the parameters of the graphical model, the marginals were parameterized as normal distributions N(µ, σ), where µ ∼ U(0, 5) and σ ∼ U(0, 1). The conditional distribution for a relational variable X was Σ_{Y ∈ parents(X)} avg(Y) + 0.1 · N(0, 1). This resulted in 300 datasets, each over 300 time points.</p><p>In order to assess statistical independence, we fitted a standard linear least-squares regression equation to the outcome variable using the treatment and the variables in the conditioning set as covariates. For relational variables in the conditioning set, we used the average as the aggregation function. Then, we directly used the t-test of the coefficient of the treatment variable to assess independence (p &gt; 0.05 or effect size &lt; 0.01). Figure <ref type="figure" target="#fig_10">9</ref> shows average precision and recall of TRCD after Phase I and Phase II, when applied to the synthetic datasets. 
While precision after Phase I is above 0.75 in most cases, recall after Phase I is relatively low. This implies that we concluded independence (and therefore removed an edge) more often than we should have. These are Type II errors, and they can be attributed to the lack of good conditional independence tests for relational data.</p><p>Finally, to demonstrate the difference in expressiveness between TRCD and RCD (the only prior constraint-based algorithm for relational data), we ran RCD on a "temporally flattened" 3 version of the synthetic data. The true model and the model that TRCD learned are shown in Figure <ref type="figure">10</ref>. The model learned by RCD is shown in Figure <ref type="figure">11</ref>. TRCD correctly learns and orients two of the edges, with the correct path specification. Those dependencies cannot even be expressed in the space of dependencies considered by RCD.</p><p>3 RCD ignores temporal information; an instance is uniquely identified by its instance id and time point.  </p><formula xml:id="formula_11">[B t+1 ].Y 2 → [B t+1 ].Y 1 [A t+1 , A t , AB t ].XY 2 → [A t+1 ].X 2 [A t+1 , A t ].X 2 → [A t+1 ].X 3 [AB t+1 , A t+1 , AB t+1 , AB t ].XY 1 → [AB t+1 ].XY 3 [B t+1 , AB t+1 , A t+1 , AB t+1 ].XY 3 → [B t+1 ].Y 1</formula></div>
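The regression-based test of conditional independence described above can be sketched as follows. This is a simplified, propositional illustration (it assumes relational variables have already been aggregated with avg); the normal approximation to the t distribution and all names are our own choices, not details of TRCD:

```python
import math
import numpy as np

def independent(y, treatment, conditioning, alpha=0.05, min_effect=0.01):
    """Declare y independent of treatment given the conditioning set when
    the t-test on the treatment coefficient of a linear least-squares fit
    is not significant (p > alpha) or its effect size is below min_effect.
    Uses a normal approximation to the t distribution (fine for large n)."""
    n = len(y)
    X = np.column_stack([np.ones(n), treatment] + list(conditioning))
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # coefficient covariance
    t = beta[1] / math.sqrt(cov[1, 1])            # t-statistic of treatment
    p = math.erfc(abs(t) / math.sqrt(2))          # two-sided p (normal approx.)
    return bool(p > alpha or abs(beta[1]) < min_effect)

rng = np.random.default_rng(0)
z = rng.normal(size=2000)            # common cause
x = z + 0.1 * rng.normal(size=2000)  # treatment
y = z + 0.1 * rng.normal(size=2000)  # outcome
print(independent(y, x, []))         # False: marginally dependent
print(independent(y, x, [z]))        # typically True once z is conditioned on
```

As discussed above, the test is imperfect: sampling noise in the fitted coefficient can make it conclude independence when a weak dependence exists, which is the source of the Type II errors.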
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9">RELATED WORK</head><p>For the propositional case, there are several approaches for learning the structure of temporal probabilistic models. Most of them are not constraint-based methods; they follow the search-and-score paradigm. <ref type="bibr" target="#b6">Friedman et al. [1998]</ref> present an algorithm to learn the structure of DBNs from complete data using search-and-score, and from incomplete data using structural EM. <ref type="bibr" target="#b11">Lähdesmäki and Shmulevich [2008]</ref> use Bayesian methods to learn the structure of a DBN from steady-state measurements, or from time-series data combined with steady-state measurements. <ref type="bibr" target="#b20">Robinson and Hartemink [2010]</ref> present an MCMC algorithm to learn the structure of a DBN that changes over time. A different approach that learns a causal temporal model from time-series data is the difference-based causality learner <ref type="bibr" target="#b22">[Voortman et al., 2010]</ref>, a framework based on dynamic structural equation models.</p><p>A constraint-based method for temporal propositional domains is presented by <ref type="bibr" target="#b5">Entner and Hoyer [2010]</ref>, who extend the FCI algorithm <ref type="bibr">[Spirtes et al., 2000]</ref> to temporal domains. FCI relaxes the causal sufficiency assumption, i.e., it allows the presence of latent common causes. Our approach is the first to use a constraint-based algorithm for data that are both temporal and relational (although under the assumption of causal sufficiency).</p><p>Another widely used method for inferring causal relationships from temporal data is Granger causality <ref type="bibr" target="#b8">[Granger, 1969]</ref>. The main idea underlying Granger causality is that a cause should improve the predictive accuracy of its effect, compared to predictions based solely on the effect's past values. 
There has been work on extending Granger causality to multivariate settings, as well as on combining Granger causality with graphical models <ref type="bibr" target="#b4">[Eichler, 2012]</ref>. <ref type="bibr" target="#b12">Liu et al. [2010]</ref> propose a regularized hidden Markov random field regression to learn the structure of a temporal causal graph from multivariate time-series data. However, these methods for learning the structure of the causal model are not constraint-based.</p><p>Another line of work in learning temporal and relational models stems from combining first-order logic with probabilistic frameworks. Logical hidden Markov models (LHMMs) extend hidden Markov models to handle relational (non-flat) data <ref type="bibr" target="#b10">[Kersting et al., 2006]</ref>. <ref type="bibr" target="#b9">Kersting and Raiko [2005]</ref> provide an EM-based algorithm to learn the structure of LHMMs.</p><p>In terms of representation, <ref type="bibr" target="#b15">Manfredotti [2009]</ref> introduces relational dynamic Bayesian networks (RDBNs), a first-order logic-based extension of dynamic Bayesian networks. RDBNs are similar to the relational model we define; however, we provide an explicit characterization of the space of relational paths and, subsequently, the space of relational dependencies. More specifically, temporal relational paths in our framework are restricted to conjunctions of predicates that correspond to possible traversals of the relational schema. This restriction, together with a domain-specific hop threshold, allows us to enumerate the space of potential dependencies to learn.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="10">CONCLUSIONS AND FUTURE WORK</head><p>In this paper, we presented a formalization of temporal relational models and extended the theory of relational d-separation to the temporal domain. We presented a constraint-based algorithm, TRCD, that leverages the notion of temporal relational d-separation to learn the causal structure of temporal relational models from data. We showed that the algorithm is sound and complete, and we provided experimental evidence of the correctness of TRCD. Finally, we showed the improvement that TRCD achieves compared to RCD when applied to domains with a temporal component.</p><p>TRCD makes certain simplifying assumptions. Future work could focus on relaxing some of those assumptions, specifically, allowing the structure of the causal model to change over time (change-point detection for relational data). Another avenue for future research is relaxing the causal sufficiency assumption by employing techniques such as blocking <ref type="bibr" target="#b18">[Rattigan et al., 2011]</ref> for temporal relational domains. Finally, an important issue that arises when modeling time as a discrete quantity is how to choose the appropriate granularity of time points. <ref type="bibr" target="#b19">Ribeiro et al. [2012]</ref> provide an analysis of how aggregating at a given time granularity affects the characterization of the underlying temporal process. Continuous-time Bayesian networks <ref type="bibr" target="#b16">[Nodelman et al., 2002</ref><ref type="bibr" target="#b17">, 2003]</ref> provide a way around this issue by allowing each variable to be modeled at a different time granularity.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :Figure 2 :</head><label>12</label><figDesc>Figure 1: Relational model for the example domain. The underlying relational schema (ER diagram) is shown in gray.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Ground graph for the model of Figure 1 applied on the relational skeleton of Figure 2.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :Figure 5 :</head><label>45</label><figDesc>Figure 4: Example structure of a temporal relational model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Relational schema for the reality mining dataset.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Definition 4 .</head><label>4</label><figDesc>A temporal abstract ground graph tAGG M,B,h = (V, E) for a 2-slice temporal relational model M = (S, D T ), perspective B ∈ E ∪ R, and hop threshold h ∈ N 0 is an abstraction of the dependencies D T for all possible ground graphs GG M,σ T of M on arbitrary temporal skeletons σ T . The temporal abstract ground graph is a directed graph with the following nodes and edges:</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Temporal abstract ground graph from the perspective of Student and hop threshold h = 4 for a model with one dependency: [Student t+1 , Student t , Takes t , Course t ].difficulty → [Student t+1 ].gpa. INT denotes intersection variables between pairs of temporal relational variables.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head></head><label></label><figDesc>X and Y are d-separated by Z if and only if, for any temporal skeleton σ T , X| b and Y| b are d-separated by Z| b in ground graph GG Mσ T for all b ∈ σ T (B).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Precision and recall for TRCD after Phase I (unoriented) and after Phase II (oriented) when using an oracle for answering d-separation queries. The y-axis values start at 0.8.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 10 :Figure 11 :</head><label>1011</label><figDesc>Figure 10: True temporal relational model. The dotted edges were not learned by TRCD, while the dashed edge was left unoriented.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: TRCD on synthetic temporal data.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>[Course t , Takes t , Student t ].gpa → [Course t ].difficulty [Course t+1 , Takes t+1 , Student t+1 ].gpa → [Course t+1 ].difficulty [Student t+1 , Student t , Takes t , Course t ].difficulty → [Student t+1 ].gpa</figDesc><table><row><cell>gpa</cell><cell>Takes</cell><cell>difficulty</cell><cell>t</cell></row><row><cell>Student</cell><cell></cell><cell>Course</cell><cell></cell></row><row><cell>gpa</cell><cell>Takes</cell><cell>difficulty</cell><cell>t+1</cell></row><row><cell>Student</cell><cell></cell><cell>Course</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 1 :</head><label>1</label><figDesc>Jaccard distance between the terminal sets of the different paths for the reality mining dataset (100 sample dates; distance is averaged across dates and across towers).</figDesc><table><row><cell>Jaccard distance</cell><cell>P1 vs. P3</cell><cell>P1 vs. P2</cell><cell>P2 vs. P3</cell></row><row><cell>mean</cell><cell>0.47</cell><cell>0.31</cell><cell>0.31</cell></row><row><cell>min</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>max</cell><cell>1</cell><cell>1</cell><cell>1</cell></row><row><cell>median</cell><cell>0.5</cell><cell>0</cell><cell>0</cell></row></table><note>nected to a tower in the previous timestep. The second path corresponds to the cellphones that connected to a tower both in the current and in the previous timestep. The third path corresponds to the cellphones that connected to a tower in the current time step, and gets the state of those cellphones in the previous timestep.</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head></head><label></label><figDesc>RVE ⊂ RV × RV are the relational variable edges:</figDesc><table><row><cell>∈ RV and X = [B t , ... , I t k , ... , I t j ].V</cell></row><row><cell>and Y = [B t , ... , I t l , ... , I t j ].V and I t k = I t l }</cell></row><row><cell>2. E = RVE ∪ IVE , where</cell></row><row><cell>(a)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 2 :</head><label>2</label><figDesc>Frequency of the most-used orientation rules during Phase II of TRCD for the oracle experiments. For the rest of the rules, the frequency was less than 1%. Temporal dependencies are not oriented by the orientation rules.</figDesc><table><row><cell>Number</cell><cell>Collider</cell><cell>KNC</cell><cell>RBO</cell><cell>Percent of</cell></row><row><cell>of</cell><cell>detection</cell><cell></cell><cell></cell><cell>temporal</cell></row><row><cell>entities</cell><cell></cell><cell></cell><cell></cell><cell>dependencies</cell></row><row><cell>1</cell><cell>71%</cell><cell>28%</cell><cell>0%</cell><cell>66%</cell></row><row><cell>2</cell><cell>66%</cell><cell>11%</cell><cell>23%</cell><cell>68%</cell></row><row><cell>3</cell><cell>53%</cell><cell>11%</cell><cell>36%</cell><cell>65%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Only relational variables in the same time point can intersect.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>Funding was provided by the U.S. Army Research Office (ARO) and Defense Advanced Research Projects Agency (DARPA) under Contract Number W911NF-11-C-0088. The content of the information in this document does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation here on.</p></div>
			</div>

			<div type="references">

				<listBibl>


<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Reality mining: Sensing complex social systems</title>
		<author>
			<persName><forename type="first">Nathan</forename><surname>Eagle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alex</forename><forename type="middle">(Sandy)</forename><surname>Pentland</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Personal and Ubiquitous Computing</title>
		<imprint>
			<date type="published" when="2006-03">March 2006</date>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="255" to="268" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Graphical modelling of multivariate time series</title>
		<author>
			<persName><forename type="first">Michael</forename><surname>Eichler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Probability Theory and Related Fields</title>
		<imprint>
			<biblScope unit="volume">153</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="233" to="268" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">On causal discovery from time series data using FCI</title>
		<author>
			<persName><forename type="first">Doris</forename><surname>Entner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Patrik</forename><surname>Hoyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th European Workshop on Probabilistic Graphical Models (PGM-2010)</title>
				<meeting>the 5th European Workshop on Probabilistic Graphical Models (PGM-2010)</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="121" to="128" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Learning the structure of dynamic probabilistic networks</title>
		<author>
			<persName><forename type="first">Nir</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kevin</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stuart</forename><surname>Russell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Fourteenth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="139" to="147" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">On the logic of causal models</title>
		<author>
			<persName><forename type="first">Dan</forename><surname>Geiger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Judea</forename><surname>Pearl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Fourth Annual Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="1988">1988</date>
			<biblScope unit="page" from="136" to="147" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Investigating causal relations by econometric models and cross-spectral methods</title>
		<author>
			<persName><forename type="first">Clive</forename><forename type="middle">W J</forename><surname>Granger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Econometrica</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="424" to="438" />
			<date type="published" when="1969-07">July 1969</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Say EM&quot; for selecting probabilistic models for logical sequences</title>
		<author>
			<persName><forename type="first">Kristian</forename><surname>Kersting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tapani</forename><surname>Raiko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-First Conference in Uncertainty in Artificial Intelligence</title>
				<meeting>the Twenty-First Conference in Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="300" to="307" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Logical hidden Markov models</title>
		<author>
			<persName><forename type="first">Kristian</forename><surname>Kersting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luc</forename><surname>De Raedt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tapani</forename><surname>Raiko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence Research</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="425" to="456" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Learning the structure of dynamic Bayesian networks from time series and steady state measurements</title>
		<author>
			<persName><forename type="first">Harri</forename><surname>Lähdesmäki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilya</forename><surname>Shmulevich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
				<imprint>
			<date type="published" when="2008-06">June 2008</date>
			<biblScope unit="volume">71</biblScope>
			<biblScope unit="page" from="185" to="217" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Learning temporal causal graphs for relational time-series analysis</title>
		<author>
			<persName><forename type="first">Yan</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexandru</forename><surname>Niculescu-Mizil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aurelie</forename><forename type="middle">C</forename><surname>Lozano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yong</forename><surname>Lu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Seventh International Conference on Machine Learning</title>
				<meeting>the Twenty-Seventh International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="687" to="694" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A sound and complete algorithm for learning causal models from relational data</title>
		<author>
			<persName><forename type="first">Marc</forename><surname>Maier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Katerina</forename><surname>Marazopoulou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Arbour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Jensen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="371" to="380" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Reasoning about Independence in Probabilistic Models of Relational Data</title>
		<author>
			<persName><forename type="first">Marc</forename><surname>Maier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Katerina</forename><surname>Marazopoulou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Jensen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1302.4381</idno>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Modeling and Inference with Relational Dynamic Bayesian Networks</title>
		<author>
			<persName><forename type="first">Cristina</forename><forename type="middle">E</forename><surname>Manfredotti</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>University of Milano</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Continuous time Bayesian networks</title>
		<author>
			<persName><forename type="first">Uri</forename><surname>Nodelman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><forename type="middle">R</forename><surname>Shelton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daphne</forename><surname>Koller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Eighteenth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="378" to="387" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Learning continuous time Bayesian networks</title>
		<author>
			<persName><forename type="first">Uri</forename><surname>Nodelman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><forename type="middle">R</forename><surname>Shelton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daphne</forename><surname>Koller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Nineteenth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Relational blocking for causal discovery</title>
		<author>
			<persName><forename type="first">Matthew</forename><forename type="middle">J H</forename><surname>Rattigan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marc</forename><surname>Maier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Jensen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Fifth AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="145" to="151" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Quantifying the effect of temporal resolution on time-varying networks</title>
		<author>
			<persName><forename type="first">Bruno</forename><forename type="middle">F</forename><surname>Ribeiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicola</forename><surname>Perra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>Baronchelli</surname></persName>
		</author>
		<idno>CoRR, abs/1211.7052</idno>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Learning non-stationary dynamic Bayesian networks</title>
		<author>
			<persName><forename type="first">Joshua</forename><forename type="middle">W</forename><surname>Robinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><forename type="middle">J</forename><surname>Hartemink</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<date type="published" when="2010-12">December 2010</date>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="3647" to="3680" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Causation, Prediction, and Search</title>
		<author>
			<persName><forename type="first">Peter</forename><surname>Spirtes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Clark</forename><surname>Glymour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Scheines</surname></persName>
		</author>
		<imprint>
			<publisher>MIT Press</publisher>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
	<note>2nd edition</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Causal networks: Semantics and expressiveness</title>
		<author>
			<persName><forename type="first">Thomas</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Judea</forename><surname>Pearl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Fourth Annual Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="1988">1988</date>
			<biblScope unit="page" from="352" to="359" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Learning why things change: The difference-based causality learner</title>
		<author>
			<persName><forename type="first">Mark</forename><surname>Voortman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Denver</forename><surname>Dash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marek</forename><forename type="middle">J</forename><surname>Druzdzel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence</title>
				<meeting>the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="641" to="650" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
