<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How Occam's Razor Provides a Neat Definition of Direct Causation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander Gebharter</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerhard Schurz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Duesseldorf Center for Logic and Philosophy of Science University of Duesseldorf Universitaetsstrasse 1 40225 Duesseldorf</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we show that the application of Occam's razor to the theory of causal Bayes nets gives us a neat definition of direct causation. In particular we show that Occam's razor implies Woodward's (2003) definition of direct causation, provided suitable intervention variables exist and the causal Markov condition (CMC) is satisfied. We also show how Occam's razor can account for direct causal relationships Woodward style when only stochastic intervention variables are available.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Occam’s razor is typically seen as a methodological
principle. There are many possible ways to apply the razor to
the theory of causal Bayes nets. It could, for example,
simply be interpreted to suggest preferring the simplest causal
structure compatible with the given data among all
compatible causal structures. The simplest causal structure could,
for instance, be the one (or one of the ones) featuring the
fewest causal arrows.</p>
      <p>In this paper, however, we are interested in a slightly
different application of Occam’s razor: Our interpretation of
Occam’s razor asserts that, given that a causal structure is
compatible with the data, it should only be chosen if it
satisfies the causal minimality condition (Min) in the sense of
Spirtes et al. (2000, p. 31), which requires that no causal
arrow in the structure can be omitted in such a way that the
resulting substructure would still be compatible with the
data. When speaking of a causal structure being
compatible with the data, we have in mind a causal structure and a
probability distribution that together satisfy the causal Markov
condition (CMC). (For details, see sec. 5.) In the following,
applying Occam’s razor always means assuming that the
causal minimality condition is satisfied.</p>
      <p>In this paper we give a motivation for Occam’s razor that
goes beyond its merits as a methodological principle
dictating that one should always decide in favor of minimal
causal models. In particular, we show that Occam’s
razor provides a neat definition of direct causal relatedness
in the sense of Woodward (2003), provided suitable
intervention variables exist and CMC is satisfied. Note the
connection of this enterprise to Zhang and Spirtes’ (2011)
project. Zhang and Spirtes prove that CMC and an
interventionist definition of direct causation à la Woodward
(2003) together imply minimality. So Occam’s razor is
well-motivated within a manipulationist framework such as
Woodward’s. We show, vice versa, that CMC and
minimality together imply Woodward’s definition of direct
causation. So if one wants a neat definition of direct causation,
it is reasonable to apply Occam’s razor in the sense of
assuming minimality.</p>
      <p>
        The paper is structured as follows: In sec. 2 we introduce
the notation we use in subsequent sections. In sec. 3 we
present Woodward’s (2003) definition of direct causation
and his definition of an intervention variable. In sec. 4 we
give precise reconstructions of both definitions in terms of
causal Bayes nets. We also provide a definition of the
notion of an intervention expansion, which is needed to
account for direct causal relations in terms of the existence of
certain intervention variables. In sec. 5 we show that
Occam’s razor gives us Woodward’s definition of direct
causation if CMC is assumed and the existence of suitable
intervention variables is granted (theorem 2). In sec. 6 we
go a step further and show how Occam’s razor allows us
to account for direct causation Woodward style when only
stochastic intervention variables (
        <xref ref-type="bibr" rid="ref2">cf. Korb et al., 2004</xref>
        , sec.
5) are available (theorem 3). We conclude in sec. 7.</p>
      <p>
Note that although the main results of the present paper
(i.e., theorems 2 and 3) can be used for causal
discovery, the goal of this paper is not to provide a method for
uncovering direct causal connections among variables in
a set of variables V of interest. The goal of this paper
is to establish a connection between Woodward’s (2003)
intervention-based notion of direct causation and the
presence of a causal arrow in a minimal causal Bayes net, which
can be interpreted as support for Occam’s razor. Because of
this, the present paper does not discuss the relation of
theorems 2 and 3 to results about causal discovery by means of
interventions such as, e.g.,
        <xref ref-type="bibr" rid="ref1">(Eberhardt and Scheines, 2007)</xref>
        or
        <xref ref-type="bibr" rid="ref1">(Nyberg and Korb, 2007)</xref>
        .
      </p>
    </sec>
    <sec id="sec-2">
      <title>NOTATION</title>
      <p>We represent causal structures by graphs, i.e., by ordered
pairs ⟨V, E⟩, where V is a set of variables and E is a binary
relation on V (E ⊆ V × V). V’s elements are called the
graph’s “vertices” and E’s elements are called its “edges”.
“X → Y” is short for “⟨X, Y⟩ ∈ E” and is interpreted
as “X is a direct cause of Y in ⟨V, E⟩” or as “Y is a direct
effect of X in ⟨V, E⟩”. Par(Y) is the set of all X ∈ V
with X → Y in ⟨V, E⟩. The elements of Par(Y) are
called Y’s parents. We write “X — Y” for “X → Y or
X ← Y”. A path π: X — … — Y is called a (causal)
path connecting X and Y in ⟨V, E⟩. A causal path π is
called a directed causal path from X to Y if and only if
(“iff” for short) it has the form X → … → Y. X is called
a cause of Y and Y an effect of X in that case. A causal
path π is called a common cause path iff it has the form
X ← … ← Z → … → Y and no variable appears more
than once on π. Z is called a common cause of X
and Y lying on path π in that case. A variable Z lying on a
path π: X — … → Z ← … — Y is called a collider lying
on this path. A variable X is called exogenous iff no arrow
points at X; it is called endogenous otherwise.</p>
      <p>A graph ⟨V, E⟩ is called a directed graph in case all edges
in E are one-headed arrows “→”. It is called cyclic iff
it features a causal path of the form X → … → X and
acyclic otherwise. A causal structure ⟨V, E⟩ together with
a probability distribution P over V is called a causal model
⟨V, E, P⟩. P is intended to provide information about the
strengths of causal influences represented by the arrows in
⟨V, E⟩. A causal model ⟨V, E, P⟩ is called cyclic iff its
graph ⟨V, E⟩ is cyclic; it is called acyclic otherwise. In
the following, we will only be interested in acyclic causal
models.</p>
      <p>We use the standard notions of (conditional) probabilistic
dependence and independence:</p>
      <sec id="sec-2-1">
        <title>Definition 1 (conditional probabilistic (in)dependence)</title>
        <p>X and Y are probabilistically dependent conditional on Z
iff there are X-, Y-, and Z-values x, y, and z, respectively,
such that P(x | y, z) ≠ P(x | z) ∧ P(y, z) &gt; 0.</p>
        <p>X and Y are probabilistically independent conditional on
Z iff X and Y are not probabilistically dependent
conditional on Z.</p>
        <p>Probabilistic independence between X and Y conditional
on Z is abbreviated as “Indep(X, Y | Z)”, probabilistic
dependence is abbreviated as “Dep(X, Y | Z)”.
Unconditional probabilistic (in)dependence between X and Y,
(In)Dep(X, Y), is defined as (In)Dep(X, Y | ∅). X, Y,
and Z in definition 1 can be variables or sequences of
variables. When X, Y, Z, … are sequences of variables,
we write them in bold letters. We also write the values
x, y, z, … of sequences X, Y, Z, … in bold letters. The
set of values x of a sequence X of variables X<sub>1</sub>, …, X<sub>n</sub>
is val(X<sub>1</sub>) × … × val(X<sub>n</sub>), where val(X<sub>i</sub>) is the set of
X<sub>i</sub>’s possible values.</p>
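        <p>To make definition 1 concrete, the following minimal Python sketch checks conditional dependence by brute force on a finite joint distribution. The table layout (a dictionary from value tuples to probabilities), the function names prob and dep, the numerical tolerance tol, and the example chain X → Z → Y with its numbers are illustrative assumptions and not part of the original text.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import product

def prob(joint, names, assignment):
    """Probability that the variables in `assignment` (name -> value) take the given values."""
    index = {name: i for i, name in enumerate(names)}
    return sum(p for outcome, p in joint.items()
               if all(outcome[index[n]] == v for n, v in assignment.items()))

def dep(joint, names, xs, ys, zs, tol=1e-9):
    """Definition 1: Dep(X, Y | Z) iff some x, y, z have P(x | y, z) != P(x | z) and P(y, z) > 0."""
    index = {name: i for i, name in enumerate(names)}

    def instantiations(variables):
        domains = [sorted({outcome[index[v]] for outcome in joint}) for v in variables]
        return [dict(zip(variables, values)) for values in product(*domains)]

    for z in instantiations(zs):
        p_z = prob(joint, names, z)
        for y in instantiations(ys):
            p_yz = prob(joint, names, {**y, **z})
            if p_yz <= 0:
                continue  # P(x | y, z) is undefined for this y, z
            for x in instantiations(xs):
                p_x_given_yz = prob(joint, names, {**x, **y, **z}) / p_yz
                p_x_given_z = prob(joint, names, {**x, **z}) / p_z
                if abs(p_x_given_yz - p_x_given_z) > tol:
                    return True
    return False

# Hypothetical chain X -> Z -> Y over binary variables (numbers chosen for illustration).
names = ["X", "Z", "Y"]
joint = {}
for x, z, y in product((0, 1), repeat=3):
    p_x = 0.5
    p_z = (0.9 if x else 0.2) if z else (0.1 if x else 0.8)
    p_y = (0.8 if z else 0.3) if y else (0.2 if z else 0.7)
    joint[(x, z, y)] = p_x * p_z * p_y

print(dep(joint, names, ["X"], ["Y"], []))     # unconditional (in)dependence of X and Y
print(dep(joint, names, ["X"], ["Y"], ["Z"]))  # (in)dependence of X and Y conditional on Z
```
]]></preformat>
        <p>In this toy chain, X and Y come out dependent unconditionally and independent conditional on Z, as one would expect from a structure in which Z screens X off from Y.</p>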
      </sec>
    </sec>
    <sec id="sec-4">
      <title>WOODWARD’S DEFINITION OF DIRECT CAUSATION</title>
      <p>Woodward’s (2003) interventionist theory of causation
aims to explicate direct causation w.r.t. a set of variables
V in terms of possible interventions. Woodward (2003,
p. 98) provides the following definition of an intervention
variable:</p>
      <p>Definition 2 (IV<sub>W</sub>) I is an intervention variable for X
with respect to Y if and only if I meets the following
conditions:</p>
      <p>I1. I causes X.</p>
      <p>I2. I acts as a switch for all the other variables that cause
X. That is, certain values of I are such that when I attains
those values, X ceases to depend on the values of other
variables that cause X and instead depends only on the
value taken by I.</p>
      <p>I3. Any directed path from I to Y [if there exists one] goes
through X [...].</p>
      <p>
        I4. I is (statistically) independent of any variable Z that
causes Y and that is on a directed path that does not go
through X.</p>
      <p>
(IV<sub>W</sub>) is intended to single out those variables as
intervention variables for X w.r.t. Y that allow for correct causal
inference according to Woodward’s (2003) definition of
direct causation. For I to be an intervention variable for X
w.r.t. Y it is required that I is causally relevant to X
(condition I1), that X is only under I’s influence when I = on
(condition I2), and that a correlation between I and Y can
only be due to a directed causal path from I to Y going
through X (conditions I3 and I4). For a detailed motivation
of I1-I4, see
        <xref ref-type="bibr" rid="ref11">(Woodward, 2003, sec. 3.1.4)</xref>
        . For problems
with Woodward’s definitions, see (Gebharter and Schurz,
ms).
      </p>
      <p>
        An intervention on X w.r.t. Y (from now on we refer to X
as the intervention’s “target variable” and to Y as the “test
variable”) is then straightforwardly defined as an
intervention variable I for X w.r.t. Y taking one of its on-values,
which forces X to take a certain value x. We will call
interventions whose on-values force X to take certain values
x “deterministic interventions” (
        <xref ref-type="bibr" rid="ref2">cf. Korb et al., 2004</xref>
        , sec.
5).
      </p>
      <p>
        Note that Woodward’s (2003) notion of an intervention is,
on the one hand, strong because it requires interventions
to be deterministic interventions. It is, on the other hand,
weak in another respect: In contrast to structural or
surgical interventions
        <xref ref-type="bibr" rid="ref1 ref8">(cf. Eberhardt and Scheines, 2007, p. 984;
Pearl, 2009)</xref>
        , Woodward’s interventions are allowed to be
direct causes of more than one variable as long as the
intervention’s direct effects which are non-target variables do
not cause the test variable over a path not going through the
intervention’s target variable (intervention condition I3).</p>
      <p>Based on his notion of an intervention, Woodward (2003, p.
59) gives the following definition of direct causation w.r.t.
a variable set V:</p>
      <p>Definition 3 (DC<sub>W</sub>) A necessary and sufficient condition
for X to be a (type-level) direct cause of Y with respect to
a variable set V is that there be a possible intervention on
X that will change Y or the probability distribution of Y
when one holds fixed at some value all other variables Z<sub>i</sub>
in V.</p>
      <p>(DC<sub>W</sub>) neatly explicates direct causation w.r.t. a variable
set V in terms of possible interventions: X is a direct cause
of Y w.r.t. V if Y can be wiggled by wiggling X; and if
X is a direct cause of Y w.r.t. V, then there are possible
interventions by whose means one can influence Y by
manipulating X.<sup>1</sup></p>
      <p>Note that (DC<sub>W</sub>) may be too strong because many domains
involve variables one cannot control by deterministic
interventions. Scenarios of this kind include, for example, the
decay of uranium or states of entangled systems in quantum
mechanics. The decay of uranium can only be
probabilistically influenced, and any attempt to manipulate the state
of one of two entangled photons, for example, would
destroy the entangled system. Glymour (2004) also considers
variables for sex and race as not manipulable by means of
intervention variables in the sense of (IV<sub>W</sub>).
      </p>
      <p>To avoid all problems that might arise for Woodward’s
(2003) account due to variables that are not manipulable
by deterministic interventions, we will reconstruct
Woodward’s (DC<sub>W</sub>) as a partial definition in sec. 4. In particular,
we will define direct causation only for sets of variables V
for which suitable intervention variables exist.</p>
    </sec>
    <sec id="sec-6">
      <title>RECONSTRUCTING WOODWARD’S DEFINITION</title>
      <p>In this section we reconstruct Woodward’s (2003)
definition of direct causation in terms of causal Bayes nets. The
reconstruction of (IV<sub>W</sub>) is straightforward:</p>
      <p><sup>1</sup> Note that Woodward (2003) does not require the intervention
variables I to be elements of the set of variables V containing the
target variable X and the test variable Y.</p>
      <p>
        Definition 4 (IV) I<sub>X</sub> ∈ V is an intervention variable for
X ∈ V w.r.t. Y ∈ V in a causal model ⟨V, E, P⟩ iff</p>
      <p>(a) I<sub>X</sub> is exogenous and there is a path π: I<sub>X</sub> → X in
⟨V, E⟩,</p>
      <p>(b) for every on-value of I<sub>X</sub> there is an X-value x such
that P(x | I<sub>X</sub> = on) = 1 and Dep(x, I<sub>X</sub> = on | z) holds for
every instantiation z of every Z ⊆ V∖{I<sub>X</sub>, X},</p>
      <p>(c) all paths I<sub>X</sub> → … → Y in ⟨V, E⟩ have the form I<sub>X</sub> →
… → X → … → Y,</p>
      <p>(d) I<sub>X</sub> is independent from every variable C (in V or not
in V) which causes Y over a path not going through X.</p>
      <p>Note that (IV) still allows for intervention variables I<sub>X</sub> that
are common causes of their target variable X and other
variables in V. Condition (a) requires I<sub>X</sub> to be exogenous.
Although this is a typical assumption made for
intervention variables, it is not explicit in Woodward’s (2003) original
definition (IV<sub>W</sub>). One problem that might arise for
Woodward’s account when not making this assumption is that I<sub>X</sub>
in a causal structure Y → I<sub>X</sub> → X may turn out to be an
intervention variable for X w.r.t. Y. If Y then depends on
I<sub>X</sub> = on, (DC<sub>W</sub>) would falsely determine X to be a cause
of Y (cf. Gebharter and Schurz, ms). I<sub>X</sub> → X in
condition (a) is a harmless simplification of I1. Condition (b)
captures Woodward’s requirement that interventions have
to be deterministic, from which I2 follows. X is assumed
to be under full control of I<sub>X</sub> when I<sub>X</sub> is on. This does
not only require that for every on-value of I<sub>X</sub> there is an
X-value x such that P(x | I<sub>X</sub> = on) = 1, but also that
I<sub>X</sub> = on actually has an influence on x in every possible
context, i.e., under conditionalization on arbitrary
instantiations z of all kinds of subsets Z of V∖{I<sub>X</sub>, X}. Condition
(c) directly mirrors I3. Condition (d) mirrors Woodward’s
I4. Note that condition (d) requires reference to variables C
possibly not contained in V
        <xref ref-type="bibr" rid="ref12">(cf. Woodward, 2008, p. 202)</xref>
        .</p>
      <p>If we want to account for direct causal connection in a
causal model ⟨V, E, P⟩ by means of interventions, we
have to add intervention variables to V. In other words:
We have to expand ⟨V, E, P⟩ in a certain way. But how
do we have to expand ⟨V, E, P⟩? To answer this question,
let us assume that we want to know whether X is a direct
cause of Y in the unmanipulated model ⟨V, E, P⟩. Then
the manipulated model ⟨V′, E′, P′⟩ will have to contain an
intervention variable I<sub>X</sub> for X w.r.t. Y and also
intervention variables I<sub>Z</sub> for all Z ∈ V different from X and Y by
whose means these Z can be controlled. X is a direct cause
of Y if I<sub>X</sub> has some on-values such that we can influence Y
by manipulating X with I<sub>X</sub> = on when all I<sub>Z</sub> have taken
certain on-values. On the other hand, to guarantee that X
is not a direct cause of Y, we have to demonstrate that
none of Y’s values can be influenced by manipulating some
X-value by some intervention. For establishing such a
negative causal claim, we require an intervention variable I<sub>X</sub>
by whose means we can control every X-value x.
(Otherwise it could be that Y depends only on X-values that
are not correlated with I<sub>X</sub>-values; then I<sub>X</sub> = on would
have no probabilistic influence on Y, though X may be
a causal parent of Y.) In addition, we require for every
Z ≠ X, Y an intervention variable I<sub>Z</sub> by whose means Z
can be forced to take every value z. (Otherwise it could
be that we can bring about only such Z-value
instantiations which screen X and Y off from each other; then I<sub>X</sub> = on
would have no probabilistic influence on Y when Z’s value
is fixed by interventions, though X may be a causal parent
of Y.)</p>
      <p>In the unmanipulated model ⟨V, E, P⟩, all
intervention variables I are off. In the manipulated model
⟨V′, E′, P′⟩, all intervention variables’ values are realized
for some but not for all individuals in the domain. This
move allows us to compute probabilities for variables in V
when I = off as well as probabilities for variables in V
for all combinations of on-value realizations of
intervention variables I, while the causal structure of the
unmanipulated model will be preserved in the manipulated model.
(Note that we deviate here from the typical “arrow
breaking” representation of interventions in the literature which
assumes that in the manipulated model all individuals get
manipulated.) This amounts to the following notion of an
intervention expansion (“i-expansion” for short):</p>
      <p>Definition 5 (intervention expansion) ⟨V′, E′, P′⟩ is an
intervention expansion of ⟨V, E, P⟩ w.r.t. Y ∈ V iff</p>
      <p>(a) V′ = V ∪̇ V<sub>I</sub>, where V<sub>I</sub> contains for every X ∈ V
different from Y an intervention variable I<sub>X</sub> w.r.t. Y (and
nothing else),</p>
      <p>(b) for all Z<sub>i</sub>, Z<sub>j</sub> ∈ V: Z<sub>i</sub> → Z<sub>j</sub> in E′ iff Z<sub>i</sub> → Z<sub>j</sub> in E,</p>
      <p>(c) for every X-value x of every X ∈ V different from
Y there is an on-value of the corresponding
intervention variable I<sub>X</sub> such that P′(x | I<sub>X</sub> = on) = 1 and
Dep(x, I<sub>X</sub> = on | z) holds for every instantiation z of every
Z ⊆ V∖{I<sub>X</sub>, X},</p>
      <p>(d) P′<sub>I=off</sub> ↑ V = P,</p>
      <p>(e) P′(I = on), P′(I = off) &gt; 0.
      </p>
      <p>I in conditions (d) and (e) is the set of all newly added
intervention variables I. P′<sub>I=off</sub> ↑ V in (d) is P′<sub>I=off</sub> :=
P′(· | I = off) restricted to V. Hence, “P′<sub>I=off</sub> ↑ V = P”
means that P′<sub>I=off</sub> coincides with P on the value space
of variables in V. Condition (a) guarantees that the
i-expansion contains all the intervention variables required
for testing for direct causal relationships in the sense of
Woodward’s (2003) definition of direct causation. The
assumption that V<sub>I</sub> contains only intervention variables for
X w.r.t. Y is a harmless simplification. Thanks to
condition (b), the manipulated model’s causal structure fits the
unmanipulated model’s causal structure. In particular, the
i-expansion is only allowed to introduce new causal arrows
going from intervention variables to variables in V. Due
to condition (c), every X ∈ V different from Y can be
fully controlled by means of an intervention variable I<sub>X</sub>
for X w.r.t. Y. Condition (d) explains how the
manipulated model’s associated probability distribution P′ fits
the unmanipulated model’s distribution P. Condition (e)
says that all values of intervention variables have to be
realized by some individuals in the domain.</p>
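      <p>As an illustration of conditions (c) to (e), the following Python sketch builds a toy expansion for a two-variable model with X → Y and checks the conditions numerically. The concrete numbers, the three values off, on0, and on1 of the intervention variable, and all function names are assumptions made for this example only. The sketch checks only the first half of condition (c), i.e., that every X-value can be forced with probability 1, and leaves out the dependence clause.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from itertools import product

# Unmanipulated model over V = {X, Y} with X -> Y (illustrative numbers).
P_X1 = 0.4
P_Y1_GIVEN_X = {0: 0.1, 1: 0.7}

def p_x(x):
    return P_X1 if x == 1 else 1 - P_X1

def p_y_given_x(y, x):
    return P_Y1_GIVEN_X[x] if y == 1 else 1 - P_Y1_GIVEN_X[x]

def p(x, y):
    """Distribution P of the unmanipulated model."""
    return p_x(x) * p_y_given_x(y, x)

# Expanded model: exogenous I_X with one off-value and one on-value per X-value.
P_I = {"off": 0.5, "on0": 0.25, "on1": 0.25}
FORCED = {"on0": 0, "on1": 1}

def p_prime(i, x, y):
    """Distribution P' of the expanded model, factorized along I_X -> X -> Y."""
    if i == "off":
        p_x_given_i = p_x(x)
    else:
        p_x_given_i = 1.0 if x == FORCED[i] else 0.0
    return P_I[i] * p_x_given_i * p_y_given_x(y, x)

def given_i(i):
    """P'(x, y | I_X = i) as a dictionary over the values of X and Y."""
    cells = list(product((0, 1), repeat=2))
    norm = sum(p_prime(i, x, y) for x, y in cells)
    return {(x, y): p_prime(i, x, y) / norm for x, y in cells}

# (d): P' conditional on I_X = off, restricted to V, coincides with P.
off = given_i("off")
cond_d = all(abs(off[(x, y)] - p(x, y)) < 1e-12 for (x, y) in off)

# (e): every value of the intervention variable has positive probability.
cond_e = all(value > 0 for value in P_I.values())

# First half of (c): each X-value x is forced with probability 1 by some on-value.
cond_c_first_half = all(
    any(abs(sum(given_i(i)[(x, y)] for y in (0, 1)) - 1.0) < 1e-12 for i in FORCED)
    for x in (0, 1)
)

print(cond_c_first_half, cond_d, cond_e)
```
]]></preformat>
      <p>Because the intervention variable is off for some individuals and on for others, the same expanded distribution carries both the unmanipulated probabilities and the manipulated ones, which is the point of deviating from the arrow-breaking representation.</p>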
      <p>With the help of this notion of an i-expansion we can now
reconstruct Woodward’s (2003) definition of direct causation.
As already mentioned, Woodward’s definition requires the
existence of suitable intervention variables. Thus, we
reconstruct (DC<sub>W</sub>) as a partial definition whose if-condition
presupposes the required intervention variables:</p>
      <p>Definition 6 (DC) If there exist i-expansions ⟨V′, E′, P′⟩
of ⟨V, E, P⟩ w.r.t. Y ∈ V, then: X ∈ V is a direct
cause of Y w.r.t. V iff Dep(Y, I<sub>X</sub> = on | I<sub>Z</sub> = on) holds
in some i-expansions ⟨V′, E′, P′⟩ of ⟨V, E, P⟩ w.r.t. Y,
where I<sub>X</sub> is an intervention variable for X w.r.t. Y in
⟨V′, E′, P′⟩ and I<sub>Z</sub> is the set of all intervention variables
in ⟨V′, E′, P′⟩ different from I<sub>X</sub>.</p>
      <p>(DC) mirrors Woodward’s definition restricted to cases in
which the required intervention variables (more precisely:
the required i-expansions) exist: In case Y can be
probabilistically influenced by manipulating X by means of an
intervention variable I<sub>X</sub> for X w.r.t. Y in one of these
i-expansions, X is a direct cause of Y in the unmanipulated
model. And vice versa: In case X is a direct cause of Y
in the unmanipulated model, there will be an intervention
variable I<sub>X</sub> for X w.r.t. Y in one of these i-expansions such
that Y is probabilistically sensitive to I<sub>X</sub> = on.</p>
      <p>In the next section we show that (DC) can account for all
direct causal dependencies in a causal model if suitable
i-expansions exist and CMC and Min are assumed to be
satisfied.</p>
    </sec>
    <sec id="sec-9">
      <title>OCCAM’S RAZOR, DETERMINISTIC INTERVENTIONS, AND DIRECT CAUSATION</title>
      <p>
        The theory of causal Bayes nets’ core axiom is the causal
Markov condition (CMC)
        <xref ref-type="bibr" rid="ref9">(cf. Spirtes et al., 2000, p. 29)</xref>
        :
      </p>
      <sec id="sec-9-1">
        <title>Definition 7 (causal Markov condition)</title>
        <p>A causal model ⟨V, E, P⟩ satisfies the causal Markov condition iff every
X ∈ V is probabilistically independent of all its
non-effects conditional on its causal parents.</p>
        <p>
          CMC is assumed to hold for causal models whose variable
sets are causally sufficient. A variable set V is causally
sufficient iff every common cause C of variables X and Y in
V is also in V or takes the same value c for all individuals
in the domain
          <xref ref-type="bibr" rid="ref9">(cf. Spirtes et al., 2000, p. 22)</xref>
          . From now on
we implicitly assume causal sufficiency, i.e., we only
consider causal models whose variable sets are causally
sufficient.
        </p>
        <p>
          A finite causal model ⟨V, E, P⟩ satisfies the Markov
condition iff P admits the following Markov factorization
relative to ⟨V, E⟩
          <xref ref-type="bibr" rid="ref8">(cf. Pearl, 2009, p. 16)</xref>
          :
        </p>
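        <disp-formula>
          <label>(1)</label>
          <tex-math>P(X_1, \ldots, X_n) = \prod_{i} P(X_i \mid \mathrm{Par}(X_i))</tex-math>
        </disp-formula>
        <p>The conditional probabilities P(X<sub>i</sub> | Par(X<sub>i</sub>)) are called
X<sub>i</sub>’s parameters.</p>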
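        <p>The factorization in equation (1) can be illustrated by a short Python sketch that computes joint probabilities as products of parameters. The chain X → Z → Y, the binary value spaces, the parameter values, and the names parents, cpt, and joint_probability are hypothetical choices for this example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import product

# Hypothetical acyclic structure X -> Z -> Y; maps each variable to its parents.
parents = {"X": [], "Z": ["X"], "Y": ["Z"]}

# Parameters P(variable = 1 | parent values); the numbers are illustrative only.
cpt = {
    "X": {(): 0.5},
    "Z": {(0,): 0.2, (1,): 0.9},
    "Y": {(0,): 0.3, (1,): 0.8},
}

def joint_probability(assignment):
    """Markov factorization: multiply P(X_i | Par(X_i)) over all variables."""
    result = 1.0
    for variable, pars in parents.items():
        p_one = cpt[variable][tuple(assignment[p] for p in pars)]
        result *= p_one if assignment[variable] == 1 else 1.0 - p_one
    return result

variables = list(parents)
# The factorized probabilities of all value combinations should add up to 1.
total = sum(joint_probability(dict(zip(variables, values)))
            for values in product((0, 1), repeat=len(variables)))
print(total)
print(joint_probability({"X": 1, "Z": 1, "Y": 1}))
```
]]></preformat>
        <p>Each factor in the product is one of the parameters just introduced, so specifying the graph and the parameters fixes the whole distribution.</p>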
        <p>
          For acyclic causal models, CMC is equivalent to the
d-separation criterion
          <xref ref-type="bibr" rid="ref10 ref7">(Verma, 1986; Pearl, 1988, pp. 119f)</xref>
          :
        </p>
        <sec id="sec-9-1-1">
          <title>Definition 8 (d-separation criterion)</title>
          <p>⟨V, E, P⟩ satisfies the d-separation criterion iff the following holds for all
X, Y ∈ V and Z ⊆ V∖{X, Y}: If X and Y are
d-separated by Z in ⟨V, E⟩, then Indep(X, Y | Z).</p>
        </sec>
        <sec id="sec-9-1-2">
          <title>Definition 9 (d-separation, d-connection)</title>
          <p>X ∈ V and Y ∈ V are d-separated by Z ⊆ V∖{X, Y} in ⟨V, E⟩ iff
X and Y are not d-connected given Z in ⟨V, E⟩.</p>
          <p>X ∈ V and Y ∈ V are d-connected given Z ⊆ V∖{X, Y}
in ⟨V, E⟩ iff X and Y are connected by a path π in ⟨V, E⟩
such that no non-collider on π is in Z, while all colliders
on π are in Z or have an effect in Z.</p>
          <p>The equivalence between CMC and the d-separation
criterion reveals the full content of CMC: If a causal model
satisfies CMC, then every (conditional) probabilistic
independence can be explained by missing (conditional) causal
connections, and every (conditional) probabilistic
dependence can be explained by some existing (conditional)
causal connection.</p>
          <p>In case there is a path π between X and Y in ⟨V, E⟩ such
that no non-collider on π is in Z ⊆ V∖{X, Y} and all
colliders on π are in Z or have an effect in Z, π is said to be
activated by Z. We also say that X and Y are d-connected
given Z over path π in that case. If π is not activated by Z,
π is said to be blocked by Z. We also say that X and Y are
d-separated by Z over path π in that case.</p>
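          <p>Definition 9 can be turned into a small brute-force check: enumerate all paths between X and Y and test whether at least one of them is activated by Z. The following Python sketch does this for small acyclic graphs. The edge-set representation, the function names, and the example graph are assumptions made for illustration, and path enumeration is exponential in general, so this is not meant as an efficient algorithm.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def descendants(edges, node):
    """All effects of `node`, i.e., variables reachable over a directed path."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for cause, effect in edges:
            if cause == current and effect not in found:
                found.add(effect)
                stack.append(effect)
    return found

def simple_paths(edges, start, goal):
    """All paths from start to goal that visit no variable twice, ignoring arrow direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def extend(path):
        if path[-1] == goal:
            yield path
            return
        for nxt in neighbours.get(path[-1], ()):
            if nxt not in path:
                yield from extend(path + [nxt])

    yield from extend([start])

def is_activated(edges, path, z):
    """A path is activated by z iff no non-collider is in z and every
    collider is in z or has an effect in z."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edges and (nxt, mid) in edges
        if collider:
            if mid not in z and not (descendants(edges, mid) & z):
                return False
        elif mid in z:
            return False
    return True

def d_connected(edges, x, y, z):
    z = set(z)
    return any(is_activated(edges, path, z) for path in simple_paths(edges, x, y))

# Hypothetical graph: chain X -> Z -> Y plus collider X -> C <- Y.
edges = {("X", "Z"), ("Z", "Y"), ("X", "C"), ("Y", "C")}
print(d_connected(edges, "X", "Y", set()))
print(d_connected(edges, "X", "Y", {"Z"}))
print(d_connected(edges, "X", "Y", {"Z", "C"}))
```
]]></preformat>
          <p>In the example graph, conditioning on Z blocks the chain, while additionally conditioning on the collider C activates the path over C again.</p>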
          <p>Occam’s razor (as we understand it in this paper) dictates
that, among all causal structures ⟨V, E⟩ which
together with a given probability distribution P over V
satisfy CMC, one should prefer those which also satisfy the causal
minimality condition (Min):</p>
        </sec>
      </sec>
      <sec id="sec-9-2">
        <title>Definition 10 (causal minimality condition)</title>
        <p>
          A causal model ⟨V, E, P⟩ satisfying CMC satisfies the causal
minimality condition iff no model ⟨V, E′, P⟩ with E′ ⊂ E
also satisfies CMC
          <xref ref-type="bibr" rid="ref9">(cf. Spirtes et al., 2000, p. 31)</xref>
          .
        </p>
      </sec>
      <sec id="sec-9-3">
        <title>Definition 11 (causal productivity condition)</title>
        <p>A causal model ⟨V, E, P⟩ satisfies the causal productivity condition (Prod)
iff Dep(X, Y | Par(Y)∖{X}) holds for all X, Y ∈ V with
X → Y in ⟨V, E⟩.</p>
        <p>Theorem 1 For every acyclic causal model ⟨V, E, P⟩
satisfying CMC, the causal minimality condition and the
causal productivity condition are equivalent.</p>
        <p>The equivalence of Min and Prod reveals the full content of
Min: In minimal causal models, no causal arrow is
superfluous, i.e., every causal arrow from X to Y is productive,
meaning that it is responsible for some probabilistic
dependence between X and Y (when the values of all other
parents of Y are fixed).</p>
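        <p>The productivity condition can be checked arrow by arrow on a finite model. The Python sketch below does so for a toy model in which Y has two parents, X and W, but its parameters ignore W. The model, its numbers, and the helper names prob, dep, and unproductive_arrows are assumptions for illustration; dep implements definition 1 by brute force.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import product

def prob(joint, names, assignment):
    """Probability that the variables in `assignment` (name -> value) take the given values."""
    index = {name: i for i, name in enumerate(names)}
    return sum(p for outcome, p in joint.items()
               if all(outcome[index[n]] == v for n, v in assignment.items()))

def dep(joint, names, xs, ys, zs, tol=1e-9):
    """Brute-force test of Dep(X, Y | Z) in the sense of definition 1."""
    index = {name: i for i, name in enumerate(names)}

    def instantiations(variables):
        domains = [sorted({outcome[index[v]] for outcome in joint}) for v in variables]
        return [dict(zip(variables, values)) for values in product(*domains)]

    for z in instantiations(zs):
        p_z = prob(joint, names, z)
        for y in instantiations(ys):
            p_yz = prob(joint, names, {**y, **z})
            if p_yz <= 0:
                continue  # conditional probability undefined
            for x in instantiations(xs):
                lhs = prob(joint, names, {**x, **y, **z}) / p_yz
                rhs = prob(joint, names, {**x, **z}) / p_z
                if abs(lhs - rhs) > tol:
                    return True
    return False

def unproductive_arrows(joint, names, parents):
    """Arrows x -> y for which Dep(x, y | Par(y) minus {x}) fails."""
    bad = []
    for y, pars in parents.items():
        for x in pars:
            others = [p for p in pars if p != x]
            if not dep(joint, names, [x], [y], others):
                bad.append((x, y))
    return bad

# Hypothetical model: X -> Y <- W, where Y's parameters do not depend on W.
names = ["X", "W", "Y"]
parents = {"X": [], "W": [], "Y": ["X", "W"]}
joint = {}
for x, w, y in product((0, 1), repeat=3):
    p_y1 = 0.9 if x else 0.2  # same value for w = 0 and w = 1
    joint[(x, w, y)] = 0.5 * 0.5 * (p_y1 if y else 1 - p_y1)

print(unproductive_arrows(joint, names, parents))
```
]]></preformat>
        <p>The arrow from W to Y is reported as unproductive. Since this toy model satisfies CMC, theorem 1 implies that it also violates Min: dropping W → Y yields a substructure that is still compatible with the same distribution.</p>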
        <p>We can now prove the following theorem:</p>
        <p>Theorem 2 If ⟨V, E, P⟩ is an acyclic causal model and
for every Y ∈ V there is an i-expansion ⟨V′, E′, P′⟩ of
⟨V, E, P⟩ w.r.t. Y satisfying CMC and Min, then for all
X, Y ∈ V (with X ≠ Y) the following two statements are
equivalent:</p>
        <p>(i) X → Y in ⟨V, E⟩.</p>
        <p>(ii) Dep(Y, I<sub>X</sub> = on | I<sub>Z</sub> = on) holds in some i-expansions
⟨V′, E′, P′⟩ of ⟨V, E, P⟩ w.r.t. Y, where I<sub>X</sub> is an
intervention variable for X w.r.t. Y in ⟨V′, E′, P′⟩ and I<sub>Z</sub> is the set
of all intervention variables in ⟨V′, E′, P′⟩ different from
I<sub>X</sub>.</p>
        <p>Theorem 2 shows that direct causation à la Woodward
(2003) coincides with the graph-theoretical notion of direct
causation in systems ⟨V, E, P⟩ with i-expansions w.r.t.
every variable Y ∈ V satisfying CMC and Min. In particular,
theorem 2 says the following: Assume we are interested in
a causal model ⟨V, E, P⟩. Assume further that for every
Y in V there is an i-expansion ⟨V′, E′, P′⟩ of ⟨V, E, P⟩
w.r.t. Y satisfying CMC and Min. This means (among
other things) that for every pair of variables ⟨X, Y⟩ there is
at least one i-expansion with an intervention variable I<sub>X</sub> for
X w.r.t. Y and intervention variables I<sub>Z</sub> for every Z ∈ V
(different from X and Y) w.r.t. Y by whose means one can
force the variables in V∖{Y} to take any combination of
value realizations. Given this setup, theorem 2 tells us for
every X and Y (with X ≠ Y) in V that X is a causal
parent of Y in ⟨V, E⟩ iff Dep(Y, I<sub>X</sub> = on | I<sub>Z</sub> = on) holds in
one of the presupposed i-expansions w.r.t. Y.</p>
      </sec>
    </sec>
    <sec id="sec-12">
      <title>OCCAM’S RAZOR, STOCHASTIC INTERVENTIONS, AND DIRECT CAUSATION</title>
      <p>For acyclic causal models satisfying CMC, the
causal productivity condition (Prod) of definition 11 (cf. Schurz and
Gebharter, forthcoming) can be seen as a reformulation of the
causal minimality condition.</p>
      <p>In this section we generalize the main finding of sec. 5 to
cases in which only stochastic interventions are available.
To account for direct causal relations X → Y by means
of stochastic intervention variables, two intervention
variables are needed, one for X and one for Y. (For details,
see below.) We define a stochastic intervention variable as
follows:</p>
      <p>Definition 12 (IV<sub>S</sub>) I<sub>X</sub> ∈ V is a stochastic intervention
variable for X ∈ V w.r.t. Y ∈ V in ⟨V, E, P⟩ iff</p>
      <p>(a) I<sub>X</sub> is exogenous and there is a path π: I<sub>X</sub> → X in
⟨V, E⟩,</p>
      <p>(b) for every on-value of I<sub>X</sub> there is an X-value x such
that Dep(x, I<sub>X</sub> = on | z) holds for every instantiation z of
every Z ⊆ V∖{I<sub>X</sub>, X},</p>
      <p>(c) all paths I<sub>X</sub> → … → Y in ⟨V, E⟩ have the form I<sub>X</sub> →
… → X → … → Y,</p>
      <p>(d) I<sub>X</sub> is independent from every variable C (in V or not
in V) which causes Y over a path not going through X.</p>
      <p>The only difference between (IV<sub>S</sub>) and (IV) is condition
(b). For stochastic interventions it is not required that
I<sub>X</sub> = on determines X’s value to be x with probability
1. It suffices that I<sub>X</sub> = on and x are correlated conditional
on every value z of every Z ⊆ V∖{I<sub>X</sub>, X}. This specific
constraint guarantees that X can be influenced by I<sub>X</sub> = on
under all circumstances, i.e., under all kinds of
conditionalization on instantiations of the remaining variables in V.</p>
      <p>We also have to modify our notion of an intervention
expansion in case we allow for stochastic interventions. We
define the following notion of a stochastic intervention
expansion:</p>
      <p>Definition 13 (stochastic intervention expansion)
⟨V′, E′, P′⟩ is a stochastic intervention expansion of
⟨V, E, P⟩ for X ∈ V w.r.t. Y ∈ V iff</p>
      <p>(a) V′ = V ∪̇ V<sub>I</sub>, where V<sub>I</sub> contains one stochastic
intervention variable I<sub>X</sub> for X w.r.t. Y and one stochastic
intervention variable I<sub>Y</sub> for Y w.r.t. Y which is a parent
only of Y (and nothing else),</p>
      <p>(b) for all Z<sub>i</sub>, Z<sub>j</sub> ∈ V: Z<sub>i</sub> → Z<sub>j</sub> in E′ iff Z<sub>i</sub> → Z<sub>j</sub> in E,</p>
      <p>(c.1) for every X-value x there is an on-value of I<sub>X</sub> such
that Dep(x, I<sub>X</sub> = on | z) holds for every instantiation z of
every Z ⊆ V′∖{I<sub>X</sub>, X},</p>
      <p>(c.2) for every Y-value y, every instantiation r of Par(Y),
and every on-value on of I<sub>Y</sub> there is an on-value on* of
I<sub>Y</sub> such that P′(y | I<sub>Y</sub> = on*, r) ≠ P′(y | I<sub>Y</sub> = on, r),
P′(y | I<sub>Y</sub> = on*, r) &gt; 0, and P′(y | I<sub>Y</sub> = on*, r*) =
P′(y | I<sub>Y</sub> = on, r*) holds for all r* ∈ val(Par(Y))
different from r,</p>
      <p>(d) P′<sub>I=off</sub> ↑ V = P,</p>
      <p>(e) P′(I = on), P′(I = off) &gt; 0.</p>
      <p>
        This definition differs from the definition of a
(non-stochastic) i-expansion with respect to conditions (a) and
(c): A stochastic i-expansion for X w.r.t. Y contains
exactly two intervention variables, viz. one stochastic
intervention variable I<sub>X</sub> for X w.r.t. Y and one stochastic
intervention variable I<sub>Y</sub> for Y w.r.t. Y (which trivially satisfies
conditions (c) and (d) in (IV<sub>S</sub>)). While I<sub>X</sub> may have more
than one direct effect, the second intervention variable I<sub>Y</sub>
is assumed to be a causal parent only of Y. (This is required
for accounting for direct causal connections; for details see
(i) ⇒ (ii) in the proof of theorem 3 in the appendix.)</p>
      <p>
The second intervention variable I<sub>Y</sub> is required to exclude
independence between I<sub>X</sub> and Y due to a fine-tuning of
Y’s parameters. Such an independence can arise even if
CMC and Min are satisfied, X is a causal parent of Y,
and I<sub>X</sub> and Y are each correlated with the same X-values
x. For examples of this kind of non-faithfulness, see, e.g.,
        <xref ref-type="bibr" rid="ref5">(Neapolitan, 2004, p. 96)</xref>
        or (Naeger, forthcoming). In
condition (c.2) we assume that every one of Y’s parameters can
be changed independently of all other Y-parameters (to a
value r ∈ ]0, 1]) by changing I<sub>Y</sub>’s on-value. This suffices
to exclude non-faithful independencies between I<sub>X</sub> and Y
of the kind described above.
      </p>
      <p>When deterministic interventions are not presupposed, it
can no longer be guaranteed that the value of every
variable in our model of interest different from the test variable
$Y$ can be fixed by interventions. The values of a causal
model’s variables can, however, also be fixed by
conditionalization. To account for direct causation between $X$ and
$Y$ when only stochastic interventions are available, one has
to conditionalize on a suitably chosen set $Z \subseteq V \setminus \{X, Y\}$
that (i) blocks all indirect causal paths between $X$ and $Y$,
and that (ii) fixes all $X$-alternative parents of $Y$. That $Z$
blocks all indirect paths between $X$ and $Y$ is required to
ensure that dependence between $I_X = on$ and $Y$ cannot be
due to an indirect path, and fixing the values of all parents
of $Y$ different from $X$ is required to exclude independence
of $I_X = on$ and $Y$ due to a fine-tuning of $Y$’s $X$-alternative
parents that may cancel the influence of $I_X = on$ on $Y$ over
a path $I_X \to X \to Y$.<sup>2</sup> Fortunately, every directed acyclic
graph $\langle V, E \rangle$ features a set $Z$ satisfying requirement (i),
viz. $Par(Y) \setminus \{X\}$ (cf. Schurz and Gebharter,
forthcoming). Trivially, $Par(Y) \setminus \{X\}$ also satisfies requirement
(ii).</p>
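      <p>A small numerical sketch may help to see why the $X$-alternative parents of $Y$ have to be held fixed. It is an illustration under stated assumptions, not part of the paper: a hypothetical binary model in which $I_X$ is a parent of $X$, $X$ is a parent of both $Z$ and $Y$, and $Z$ is a parent of $Y$, with made-up probabilities chosen so that the direct influence of $X$ on $Y$ and the influence via $Z$ cancel. Marginally, $I_X$ and $Y$ are then independent; conditional on $Par(Y) \setminus \{X\} = \{Z\}$ the dependence shows up.</p>
      <preformat>
```python
from fractions import Fraction as F
from itertools import product

# Hypothetical binary model; the value 1 of I_X plays the role of "on".
p_ix = {1: F(1, 2), 0: F(1, 2)}              # P(I_X = i)
p_x1_given_ix = {1: F(9, 10), 0: F(1, 2)}    # P(X = 1 | I_X = i)
p_z1_given_x = {1: F(3, 4), 0: F(1, 4)}      # P(Z = 1 | X = x)
p_y1_given_xz = {(1, 1): F(1, 4), (1, 0): F(3, 4),
                 (0, 1): F(3, 4), (0, 0): F(1, 4)}  # P(Y = 1 | X = x, Z = z)

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one

def joint(ix, x, z, y):
    """Markov factorization of the joint distribution over (I_X, X, Z, Y)."""
    return (p_ix[ix] * bern(p_x1_given_ix[ix], x)
            * bern(p_z1_given_x[x], z) * bern(p_y1_given_xz[(x, z)], y))

def p_y1(ix, z=None):
    """P(Y = 1 | I_X = ix) or, if z is given, P(Y = 1 | I_X = ix, Z = z)."""
    zs = [0, 1] if z is None else [z]
    num = sum(joint(ix, x, zv, 1) for x, zv in product([0, 1], zs))
    den = sum(joint(ix, x, zv, y) for x, zv, y in product([0, 1], zs, [0, 1]))
    return num / den

marginal = (p_y1(1), p_y1(0))
given_z1 = (p_y1(1, z=1), p_y1(0, z=1))
print("P(Y=1 | I_X):", marginal)        # equal: the two paths cancel
print("P(Y=1 | I_X, Z=1):", given_z1)   # unequal once Z is held fixed
```
      </preformat>
      <p>The conditional probabilities are obtained by brute-force summation over the joint distribution, which is feasible only for very small models but keeps the sketch free of any library-specific assumptions.</p>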
      <p>With the help of (IV$_S$) and definition 13, we can now
define direct causation in terms of stochastic interventions for
models for which suitable stochastic i-expansions exist:</p>
      <p>Definition 14 (DC$_S$) If there exist stochastic i-expansions
$\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$ w.r.t. $Y$, then: $X$
is a direct cause of $Y$ w.r.t. $V$ iff
$Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$ holds in some i-expansions
$\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$ w.r.t. $Y$, where $I_X$
is a stochastic intervention variable for $X$ w.r.t. $Y$ in
$\langle V', E', P' \rangle$ and $I_Y$ is a stochastic intervention variable
for $Y$ w.r.t. $Y$ in $\langle V', E', P' \rangle$.</p>
      <p>Now the following theorem can be proven:</p>
      <p>
        <sup>2</sup> For details on such cases of non-faithfulness due to
compensating parents see
        <xref ref-type="bibr" rid="ref7">(Schurz and Gebharter, forthcoming; Pearl,
1988, p. 256)</xref>
        .
      </p>
      <p>Theorem 3 If $\langle V, E, P \rangle$ is an acyclic causal model and
for every $X, Y \in V$ (with $X \neq Y$) there is a stochastic
i-expansion $\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$ w.r.t. $Y$
satisfying CMC and Min, then for all $X, Y \in V$ (with $X \neq Y$)
the following two statements are equivalent:
(i) $X \to Y$ in $\langle V, E \rangle$.
(ii) $Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$ holds in
some i-expansions $\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$ w.r.t.
$Y$, where $I_X$ is a stochastic intervention variable for $X$
w.r.t. $Y$ in $\langle V', E', P' \rangle$ and $I_Y$ is a stochastic intervention
variable for $Y$ w.r.t. $Y$ in $\langle V', E', P' \rangle$.</p>
      <p>Theorem 3 shows that direct causation à la Woodward
(2003) coincides with the graph theoretical notion of
direct causation in systems $\langle V, E, P \rangle$ with stochastic
i-expansions for every $X \in V$ w.r.t. every $Y \in V$ (with
$X \neq Y$) satisfying CMC and Min. In particular,
theorem 3 says the following: Assume we are interested in
a causal model $\langle V, E, P \rangle$. Assume further that for every
$X, Y$ in $V$ (with $X \neq Y$) there is a stochastic i-expansion
$\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$ w.r.t. $Y$ satisfying CMC
and Min. This means (among other things) that for every
pair of variables $\langle X, Y \rangle$ there is at least one stochastic
i-expansion featuring a stochastic intervention variable $I_X$
for $X$ w.r.t. $Y$ and a stochastic intervention variable $I_Y$ for
$Y$ w.r.t. $Y$. Given this setup, theorem 3 can account for
every causal arrow between every $X$ and $Y$ (with $X \neq Y$)
in $V$: It says that $X$ is a causal parent of $Y$ in $\langle V, E \rangle$ iff
$Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$ holds in some
of the presupposed stochastic i-expansions for $X$ w.r.t. $Y$.</p>
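      <p>The test described by statement (ii) can be tried out on a toy model. The sketch below is only an illustration under assumptions that are not in the paper: a single hypothetical stochastic i-expansion with binary variables in which $I_X$ is a parent of $X$, $X$ is a parent of $Z$, $Z$ and $I_Y$ are parents of $Y$, and $X$ may or may not be a further parent of $Y$. The function name and all numbers are made up. With $I_Y = on$ held fixed, the function checks whether $Y$ depends on $I_X$ for some value of $Z$, which here is the only candidate element of $Par(Y) \setminus \{X\}$ in $V$.</p>
      <preformat>
```python
from fractions import Fraction as F

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one

def depends_on_ix(p_y1):
    """Check whether Y depends on I_X given Z = z and I_Y = on for some z.

    `p_y1[(x, z)]` is the assumed P(Y = 1 | X = x, Z = z, I_Y = on).
    """
    p_x1 = {1: F(4, 5), 0: F(1, 5)}   # P(X = 1 | I_X = i); 1 stands for "on"
    p_z1 = {1: F(2, 3), 0: F(1, 3)}   # P(Z = 1 | X = x)
    for z in (0, 1):
        cond = {}
        for ix in (0, 1):
            # Weights proportional to P(X = x | I_X = ix, Z = z). I_Y is
            # exogenous and a parent only of Y, so fixing it changes nothing here.
            w = {x: bern(p_x1[ix], x) * bern(p_z1[x], z) for x in (0, 1)}
            cond[ix] = sum(w[x] * p_y1[(x, z)] for x in (0, 1)) / sum(w.values())
        if cond[1] != cond[0]:
            return True
    return False

# Y's parameters ignore X: no arrow from X to Y, only the path via Z.
no_arrow = {(0, 0): F(1, 10), (1, 0): F(1, 10), (0, 1): F(7, 10), (1, 1): F(7, 10)}
# Y's parameters vary with X for fixed Z: X is a direct cause of Y.
with_arrow = {(0, 0): F(1, 10), (1, 0): F(1, 2), (0, 1): F(7, 10), (1, 1): F(9, 10)}

no_arrow_dep = depends_on_ix(no_arrow)
with_arrow_dep = depends_on_ix(with_arrow)
print(no_arrow_dep, with_arrow_dep)
```
      </preformat>
      <p>Comparing the two values of $I_X$ with each other is equivalent to comparing $I_X = on$ with the unconditioned case as long as both values of $I_X$ have positive probability, which condition (e) of definition 13 requires.</p>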
    </sec>
    <sec id="sec-13">
      <title>CONCLUSION</title>
      <p>In this paper we investigated the consequences of assuming
a certain version of Occam’s razor. If one applies the razor
to the theory of causal Bayes nets in such a way that it
dictates preferring only minimal causal models, one can show
that Occam’s razor provides a neat definition of direct
causation. In particular, we demonstrated that one gets
Woodward’s (2003) definition of direct causation translated into
causal Bayes nets terminology and restricted to contexts in
which suitable i-expansions satisfying the causal Markov
condition (CMC) exist. In the last section we showed how
Occam’s razor can be used to account for direct causal
connections Woodward style even if no deterministic
interventions are available. These results can be seen as a
motivation for Occam’s razor going beyond its merits as a
methodological principle: If one wants a nice and simple
interventionist definition of direct causation in the sense of
Woodward (or its stochastic counterpart developed in sec.
6), then it is reasonable to apply a version of Occam’s razor
that suggests eliminating non-minimal causal models.</p>
      <sec id="sec-13-1">
        <title>Acknowledgements</title>
        <p>This work was supported by DFG, research unit
“Causation, Laws, Dispositions, Explanation” (FOR 1063). Our
thanks go to Frederick Eberhardt and Paul Naeger for
important discussions, to two anonymous referees for helpful
comments on an earlier version of the paper, and to
Sebastian Maaß for proofreading.</p>
      </sec>
      <sec id="sec-13-2">
        <title>Appendix</title>
        <p>The following proof of theorem 1 rests on the equivalence
of CMC and the Markov factorization (1). It is, thus,
restricted to finite causal structures.</p>
        <p>Proof of theorem 1 Suppose $\langle V, E, P \rangle$ with
$V = \{X_1, \ldots, X_n\}$ to be a finite acyclic causal model satisfying
CMC.</p>
        <p>Prod $\Rightarrow$ Min: Assume that $\langle V, E, P \rangle$ does not satisfy Min,
meaning that there are $X, Y \in V$ with $X \to Y$ in $\langle V, E \rangle$
such that $\langle V, E', P \rangle$, which results from deleting $X \to Y$
from $\langle V, E \rangle$, still satisfies CMC. But then $Par(Y) \setminus \{X\}$
d-separates $X$ and $Y$ in $\langle V, E' \rangle$, and thus, the d-separation
criterion implies $Indep(X, Y \mid Par(Y) \setminus \{X\})$, which
violates Prod.</p>
        <p>Min $\Rightarrow$ Prod: Assume that $\langle V, E, P \rangle$ satisfies Min,
meaning that there are no $X, Y \in V$ with $X \to Y$ in $\langle V, E \rangle$
such that $\langle V, E', P \rangle$, which results from deleting $X \to Y$
from $\langle V, E \rangle$, still satisfies CMC. The latter is the case
iff (*) the parent set $Par(Y)$ of every $Y \in V$ (with
$Par(Y) \neq \emptyset$) is minimal in the sense that removing one
of $Y$’s parents $X$ from $Par(Y)$ would make a
difference for $Y$, meaning that
$P(y \mid x, Par(Y) \setminus \{X\} = r) \neq P(y \mid Par(Y) \setminus \{X\} = r)$
holds for some $X$-values $x$, some
$Y$-values $y$, and some instantiations $r$ of $Par(Y) \setminus \{X\}$.
Otherwise $P$ would admit the Markov factorization
relative to $\langle V, E \rangle$ and relative to $\langle V, E' \rangle$, meaning that also
$\langle V, E', P \rangle$, which results from deleting $X \to Y$ from
$\langle V, E \rangle$, would satisfy CMC. But then $\langle V, E, P \rangle$ would
not be minimal, which would contradict the assumption.
Now (*) entails that $Dep(X, Y \mid Par(Y) \setminus \{X\})$ holds for
all $X, Y \in V$ with $X \to Y$, i.e., that $\langle V, E, P \rangle$ satisfies
Prod.</p>
        <p>Proof of theorem 2 Assume $\langle V, E, P \rangle$ is an acyclic
causal model and for every $Y \in V$ there is an i-expansion
$\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ w.r.t. $Y$ satisfying CMC and
Min. Let $X$ and $Y$ be arbitrarily chosen elements of $V$
such that $X \neq Y$.</p>
        <p>(i) $\Rightarrow$ (ii): Suppose $X \to Y$ in $\langle V, E \rangle$. We assumed that
there exists an i-expansion $\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ w.r.t.
$Y$ satisfying CMC and Min. From condition (b) of
definition 5 it follows that $X \to Y$ in $\langle V', E' \rangle$. Since Min
is equivalent to Prod, $X$ and $Y$ are dependent when the
values of all parents of $Y$ different from $X$ are fixed to
certain values, meaning that there will be an $X$-value $x$
and a $Y$-value $y$ such that $Dep(x, y \mid Par(Y) \setminus \{X\} = r)$
holds for an instantiation $r$ of $Par(Y) \setminus \{X\}$. Now there
will also be a value of $I_Z$ that fixes the set of all parents of
$Y$ different from $X$ to $r$. Let $on$ be this $I_Z$-value. Thus,
also $Dep(x, y \mid I_Z = on)$ and also $Dep(x, y \mid I_Z = on, r)$
will hold. Now let us assume that $on$ is one of the $I_X$-values
which are correlated with $x$ and which force $X$ to
take value $x$. (The existence of such an $I_X$-value is
guaranteed by condition (c) in definition 5.) Then we have
$Dep(I_X = on, x \mid I_Z = on, r) \wedge Dep(x, y \mid I_Z = on, r)$.</p>
        <p>From the axiom of weak union (2) (cf. Pearl, 2009, p. 11),
which is probabilistically valid, we get (3) and (4) (in which
$s = \langle x, r \rangle$ is a value realization of $Par(Y)$):
$$Indep(X, YW \mid Z) \Rightarrow Indep(X, Y \mid ZW) \tag{2}$$
$$Indep(I_X = on, s = \langle x, r \rangle \mid I_Z = on) \Rightarrow Indep(I_X = on, x \mid I_Z = on, r) \tag{3}$$
$$Indep(s = \langle x, r \rangle, y \mid I_Z = on) \Rightarrow Indep(x, y \mid I_Z = on, r) \tag{4}$$
With the contrapositions of (3) and (4) it now follows
that $Dep(I_X = on, s = \langle x, r \rangle \mid I_Z = on) \wedge
Dep(s = \langle x, r \rangle, y \mid I_Z = on)$.</p>
        <p>We now show that $Dep(I_X = on, s \mid I_Z = on) \wedge
Dep(s, y \mid I_Z = on)$ and the d-separation criterion imply
$Dep(I_X = on, y \mid I_Z = on)$. We define $P^*(-)$ as
$P'(- \mid I_Z = on)$ and proceed as follows:
$$P^*(y \mid I_X = on) = \sum_i P^*(y \mid s_i, I_X = on) \cdot P^*(s_i \mid I_X = on) \tag{5}$$
Equation (5) is probabilistically valid. Because $Par(Y)$
blocks all paths between $I_X$ and $Y$, we get (6) from (5):
$$P^*(y \mid I_X = on) = \sum_i P^*(y \mid s_i) \cdot P^*(s_i \mid I_X = on) \tag{6}$$
Since $I_X = on$ forces $Par(Y)$ to take value $s$ when
$I_Z = on$, $P^*(s_i \mid I_X = on) = 1$ in case $s_i = s$, and
$P^*(s_i \mid I_X = on) = 0$ otherwise. Thus, we get (7) from
(6):
$$P^*(y \mid I_X = on) = P^*(y \mid s) \cdot 1 \tag{7}$$
For reductio, let us assume that $Indep(I_X = on, y \mid I_Z = on)$,
meaning that $P^*(y \mid I_X = on) = P^*(y)$.
But then we get (8) from (7):
$$P^*(y) = P^*(y \mid s) \cdot 1 \tag{8}$$
Equation (8) contradicts $Dep(s, y \mid I_Z = on)$ above.
Hence, $Dep(I_X = on, y \mid I_Z = on)$ has to hold when
$Dep(I_X = on, s \mid I_Z = on) \wedge Dep(s, y \mid I_Z = on)$ holds.
Therefore, $Dep(Y, I_X = on \mid I_Z = on)$.</p>
        <p>(ii) $\Rightarrow$ (i): Suppose $\langle V', E', P' \rangle$ is one of the presupposed
i-expansions such that $Dep(Y, I_X = on \mid I_Z = on)$ holds,
where $I_X$ is an intervention variable for $X$ w.r.t. $Y$ in
$\langle V', E', P' \rangle$ and $I_Z$ is the set of all intervention variables
in $\langle V', E', P' \rangle$ different from $I_X$. Then the d-separation
criterion implies that there must be a causal path $\pi$
d-connecting $I_X$ and $Y$. $\pi$ cannot be a path featuring
colliders, because $I_X$ and $Y$ would be d-separated over such
a path. $\pi$ also cannot have the form $I_X \leftarrow \ldots \text{--} Y$. This
is excluded by condition (a) in (IV). So $\pi$ must have the
form $I_X \to \ldots \text{--} Y$. Since $\pi$ cannot feature colliders,
$\pi$ must be a directed path $I_X \to \ldots \to Y$. Now either
(A) $\pi$ goes through $X$, or (B) $\pi$ does not go through $X$.
(B) is excluded by condition (c) in (IV). Hence, (A) must
be the case. If (A) is the case, then $\pi$ is a directed path
$I_X \to \ldots \to X \to \ldots \to Y$ going through $X$. Now there
are two possible cases: Either (i) at least one of the paths
d-connecting $I_X$ and $Y$ has the form $I_X \to \ldots \to X \to Y$,
or (ii) all paths d-connecting $I_X$ and $Y$ have the form
$I_X \to \ldots \to X \to \ldots \to C \to \ldots \to Y$.</p>
        <p>Assume (ii) is the case, i.e., all paths d-connecting $I_X$
and $Y$ have the form $I_X \to \ldots \to X \to \ldots \to C \to
\ldots \to Y$. Let $r_i$ be an individual variable ranging over
$val(Par(Y))$. We define $P^*(-)$ as $P'(- \mid I_Z = on)$ and
proceed as follows:
$$P^*(y \mid I_X = on) = \sum_i P^*(y \mid r_i, I_X = on) \cdot P^*(r_i \mid I_X = on) \tag{9}$$
$$P^*(y) = \sum_i P^*(y \mid r_i) \cdot P^*(r_i) \tag{10}$$
Equations (9) and (10) are probabilistically valid. Since
$I_Z = on$ forces every non-intervention variable in $V'$
different from $X$ and $Y$ to take a certain value, $I_Z = on$ will
also force $Par(Y)$ to take a certain value $r$, meaning that
$P^*(r_i) = 1$ in case $r_i = r$, and that $P^*(r_i) = 0$ otherwise.
Since probabilities of 1 do not change after
conditionalization, we get $P^*(r_i \mid I_X = on) = 1$ in case $r_i = r$, and
$P^*(r_i \mid I_X = on) = 0$ otherwise. Thus, we get (11) from
(9) and (12) from (10):
$$P^*(y \mid I_X = on) = P^*(y \mid r, I_X = on) \cdot 1 \tag{11}$$
$$P^*(y) = P^*(y \mid r) \cdot 1 \tag{12}$$
Since $Par(Y)$ blocks all paths between $I_X$ and $Y$, we get
$P^*(y \mid r, I_X = on) = P^*(y \mid r)$ with the d-separation
criterion, and thus, we get $P^*(y \mid I_X = on) = P^*(y)$ with
(11) and (12). Thus, $Indep(Y, I_X = on \mid I_Z = on)$ holds,
which contradicts the initial assumption that
$Dep(Y, I_X = on \mid I_Z = on)$ holds. Therefore, (i) must be the case, i.e.,
there must be a path d-connecting $I_X$ and $Y$ that has the
form $I_X \to \ldots \to X \to Y$. From $\langle V', E', P' \rangle$ being an
i-expansion of $\langle V, E, P \rangle$ it now follows that $X \to Y$ in
$\langle V, E \rangle$.</p>
        <p>Proof of theorem 3 Assume $\langle V, E, P \rangle$ is an acyclic
causal model and for every $X, Y \in V$ (with $X \neq Y$) there
is a stochastic i-expansion $\langle V', E', P' \rangle$ of $\langle V, E, P \rangle$ for $X$
w.r.t. $Y$ satisfying CMC and Min. Let $X$ and $Y$ be
arbitrarily chosen elements of $V$ such that $X \neq Y$.</p>
        <p>(i) $\Rightarrow$ (ii): Suppose $X \to Y$ in $\langle V, E \rangle$. We assumed
that there exists a stochastic i-expansion $\langle V', E', P' \rangle$
of $\langle V, E, P \rangle$ for $X$ w.r.t. $Y$ satisfying CMC and Min.
From condition (b) of definition 13 it follows that $X \to Y$
in $\langle V', E' \rangle$. Since Min is equivalent to Prod,
$Dep(x, y \mid Par(Y) \setminus \{X\} = r, I_Y = on)$ holds for some
$X$-values $x$, for some $Y$-values $y$, for some of $I_Y$’s $on$-values
$on$, and for some instantiations $r$ of $Par(Y) \setminus \{X\}$. Now let
us assume that $on$ is one of the $I_X$-values which are
correlated with $x$ conditional on $Par(Y) \setminus \{X\} = r, I_Y = on$.
(The existence of such an $I_X$-value $on$ is guaranteed by
condition (c.1) in definition 13.) Then we have
$Dep(I_X = on, x \mid r, I_Y = on) \wedge Dep(x, y \mid r, I_Y = on)$.</p>
        <p>We now show that $Dep(I_X = on, x \mid r, I_Y = on) \wedge
Dep(x, y \mid r, I_Y = on)$ together with $I_X \to X \to Y$ and
the d-separation criterion implies $Dep(I_X = on, y \mid r, I_Y = on)$.
We define $P^*(-)$ as $P'(- \mid r)$ and proceed as follows:
$$P^*(y \mid I_X = on, I_Y = on) = \sum_i P^*(y \mid x_i, I_X = on, I_Y = on) \cdot P^*(x_i \mid I_X = on, I_Y = on) \tag{13}$$
$$P^*(y \mid I_Y = on) = \sum_i P^*(y \mid x_i, I_Y = on) \cdot P^*(x_i \mid I_Y = on) \tag{14}$$
Equations (13) and (14) are probabilistically valid. From
$I_X \to X \to Y$ and (13) we get with the d-separation
criterion:
$$P^*(y \mid I_X = on, I_Y = on) = \sum_i P^*(y \mid x_i, I_Y = on) \cdot P^*(x_i \mid I_X = on, I_Y = on) \tag{15}$$
Since $I_Y$ is exogenous and a causal parent only of $Y$, $X$
and $I_Y$ are d-separated by $I_X$, and thus, we get (16) from
(15) with the d-separation criterion. Since $I_Y$ and $X$ are
d-separated (by the empty set), we get (17) from (14) with
the d-separation criterion:
$$P^*(y \mid I_X = on, I_Y = on) = \sum_i P^*(y \mid x_i, I_Y = on) \cdot P^*(x_i \mid I_X = on) \tag{16}$$
$$P^*(y \mid I_Y = on) = \sum_i P^*(y \mid x_i, I_Y = on) \cdot P^*(x_i) \tag{17}$$
Now either (A) $P^*(y \mid I_X = on, I_Y = on) \neq
P^*(y \mid I_Y = on)$, or (B) $P^*(y \mid I_X = on, I_Y = on) =
P^*(y \mid I_Y = on)$. If (A) is the case, then
$Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$.</p>
        <p>If (B) is the case, then $P^*(y \mid I_X = on, I_Y = on)$
can only equal $P^*(y \mid I_Y = on)$ due to a fine-tuning of
$P^*(x_i \mid I_X = on)$ and $P^*(x_i)$ in equations (16) and (17),
respectively. We already know that $X$’s value $x$ and
$I_X = on$ are dependent conditional on $Par(Y) \setminus \{X\} = r, I_Y = on$,
meaning that $P^*(x \mid I_X = on, I_Y = on) \neq
P^*(x \mid I_Y = on)$ holds. Since $X$ and $I_Y$ are d-separated
by $I_X$, $P^*(x \mid I_X = on, I_Y = on) = P^*(x \mid I_X = on)$
holds. Since $X$ and $I_Y$ are d-separated (by the empty
set), $P^*(x \mid I_Y = on) = P^*(x)$ holds. It follows that
$P^*(x \mid I_X = on) \neq P^*(x)$ holds. So (i) $P^*(x \mid I_X = on) &gt; 0$
or (ii) $P^*(x) &gt; 0$. Thanks to condition (c.2)
in definition 13, every one of the conditional
probabilities $P^*(y \mid x_i, I_Y = on)$ can be changed independently
by replacing “$on$” in “$P^*(y \mid x_i, I_Y = on)$” by some
$I_Y$-value “$on^*$” (with $on^* \neq on$) such that
$P^*(y \mid x_i, I_Y = on^*) &gt; 0$. Thus, in both cases ((i) and (ii)) it holds that
$P^*(y \mid x, I_Y = on^*) \cdot P^*(x \mid I_X = on) \neq
P^*(y \mid x, I_Y = on^*) \cdot P^*(x)$, while
$P^*(y \mid x_i, I_Y = on^*) \cdot P^*(x_i \mid I_X = on) =
P^*(y \mid x_i, I_Y = on^*) \cdot P^*(x_i)$ holds for all $x_i \neq x$.
It follows that $P^*(y \mid I_X = on, I_Y = on^*) \neq
P^*(y \mid I_Y = on^*)$.</p>
        <p>(ii) $\Rightarrow$ (i): Suppose $\langle V', E', P' \rangle$ is one of the above
assumed stochastic i-expansions for $X$ w.r.t. $Y$ and that
$Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$ holds in
this stochastic i-expansion. The d-separation criterion and
$Dep(Y, I_X = on \mid Par(Y) \setminus \{X\}, I_Y = on)$ imply that $I_X$
and $Y$ are d-connected given $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ by
a causal path $\pi$: $I_X \text{--} \ldots \text{--} Y$. $\pi$ cannot have the form
$I_X \leftarrow \ldots \text{--} Y$. This is excluded by condition (a) in (IV$_S$).
Thus, $\pi$ must have the form $I_X \to \ldots \text{--} Y$. Now either (A)
$\pi$ goes through $X$, or (B) $\pi$ does not go through $X$.</p>
        <p>Suppose (B) is the case. Then, because of condition (c) in
(IV$_S$), $\pi$ cannot be a directed path $I_X \to \ldots \to Y$. Thus,
$\pi$ must either (i) have the form $I_X \to \ldots \text{--} C \to Y$ (with a
collider on $\pi$), or it (ii) must have the form
$I_X \to \ldots \text{--} C \leftarrow Y$. If (i) is the case, then $C$ must be in
$(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ (since $C$ cannot be $X$). Hence, $\pi$ would be blocked
by $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ and, thus, would not d-connect
$I_X$ and $Y$ given $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$. Thus, (ii) must
be the case. If (ii) is the case, then there has to be a
collider $C^*$ on $\pi$ that either is $C$ or that is an effect of $C$,
and thus, also an effect of $Y$. But then $I_X$ and $Y$ can
only be d-connected given $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ over
$\pi$ if $C^*$ is in $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ or has an effect in
$(Par(Y) \setminus \{X\}) \cup \{I_Y\}$. But this would mean that $Y$ is a
cause of $Y$, which is excluded by the initial assumption of
acyclicity. Thus, (A) has to be the case.</p>
        <p>If (A) is the case, then $\pi$ must have the form
$I_X \to \ldots \text{--} X \text{--} \ldots \text{--} Y$. If $\pi$ had the form
$I_X \to \ldots \text{--} X \text{--} \ldots \text{--} C \leftarrow Y$ (where $C$ and $X$ are
possibly identical), then there would be at least one collider $C^*$
lying on $\pi$ that is an effect of $Y$. For $I_X$ and $Y$ to be
d-connected given $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ over path $\pi$,
$(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ must activate $\pi$, meaning that $C^*$
has to be in $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ or has to have an
effect in $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$. But then we would end up
with a causal cycle $Y \to \ldots \to Y$, which would
contradict the assumption of acyclicity. Hence, $\pi$ must have the
form $I_X \to \ldots \text{--} X \text{--} \ldots \text{--} C \to Y$ (where $C$ and $X$ are
possibly identical). Now either (i) $C = X$ or (ii) $C \neq X$.
If (ii) is the case, then $C \in (Par(Y) \setminus \{X\}) \cup \{I_Y\}$, and
thus, $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$ blocks $\pi$. But then $I_X$ and
$Y$ cannot be d-connected given $(Par(Y) \setminus \{X\}) \cup \{I_Y\}$
over path $\pi$. Hence, (i) must be the case. Then $\pi$ has the
form $I_X \to \ldots \text{--} X \to Y$ and from $\langle V', E', P' \rangle$ being a
stochastic i-expansion of $\langle V, E, P \rangle$ it follows that $X \to Y$
in $\langle V, E \rangle$.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation><string-name><given-names>F.</given-names> <surname>Eberhardt</surname></string-name>, and <string-name><given-names>R.</given-names> <surname>Scheines</surname></string-name> (<year>2007</year>). <article-title>Interventions and causal inference</article-title>. <source>Philosophy of Science</source> <volume>74</volume>(<issue>5</issue>): <fpage>981</fpage>-<lpage>995</lpage>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation><string-name><given-names>C.</given-names> <surname>Glymour</surname></string-name> (<year>2004</year>). <article-title>Critical notice</article-title>. <source>British Journal for the Philosophy of Science</source> <volume>55</volume>(<issue>4</issue>): <fpage>779</fpage>-<lpage>790</lpage>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation><string-name><given-names>K. B.</given-names> <surname>Korb</surname></string-name>, <string-name><given-names>L. R.</given-names> <surname>Hope</surname></string-name>, <string-name><given-names>A. E.</given-names> <surname>Nicholson</surname></string-name>, and <string-name><given-names>K.</given-names> <surname>Axnick</surname></string-name> (<year>2004</year>). <article-title>Varieties of causal intervention</article-title>. In C. Zhang, H. W. Guesgen, W.-K. Yeap (eds.), <source>Proceedings of the 8th Pacific Rim International Conference on AI 2004: Trends in Artificial Intelligence</source>, <fpage>322</fpage>-<lpage>331</lpage>. Berlin: Springer.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation><string-name><given-names>R.</given-names> <surname>Neapolitan</surname></string-name> (<year>2004</year>). <source>Learning Bayesian Networks</source>. Upper Saddle River, NJ: Prentice Hall.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation><string-name><given-names>E. P.</given-names> <surname>Nyberg</surname></string-name>, and <string-name><given-names>K. B.</given-names> <surname>Korb</surname></string-name> (<year>2006</year>). <article-title>Informative interventions</article-title>. <source>Technical report 2006/204</source>, Clayton School of Information Technology, Monash University, Melbourne.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation><string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name> (<year>1988</year>). <source>Probabilistic Reasoning in Intelligent Systems</source>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation><string-name><given-names>J.</given-names> <surname>Pearl</surname></string-name> (<year>2009</year>). <source>Causality</source>. Cambridge: Cambridge University Press.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation><string-name><given-names>P.</given-names> <surname>Spirtes</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Glymour</surname></string-name>, and <string-name><given-names>R.</given-names> <surname>Scheines</surname></string-name> (<year>2000</year>). <source>Causation, Prediction, and Search</source>. Cambridge, MA: MIT Press.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation><string-name><given-names>T. S.</given-names> <surname>Verma</surname></string-name> (<year>1986</year>). <article-title>Causal networks: Semantics and expressiveness</article-title>. <source>Technical report R-65</source>, Cognitive Systems Laboratory, University of California, Los Angeles.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation><string-name><given-names>J.</given-names> <surname>Woodward</surname></string-name> (<year>2003</year>). <source>Making Things Happen</source>. Oxford: Oxford University Press.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation><string-name><given-names>J.</given-names> <surname>Woodward</surname></string-name> (<year>2008</year>). <article-title>Response to Strevens</article-title>. <source>Philosophy and Phenomenological Research</source> <volume>77</volume>(<issue>1</issue>): <fpage>193</fpage>-<lpage>212</lpage>.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation><string-name><given-names>J.</given-names> <surname>Zhang</surname></string-name>, and <string-name><given-names>P.</given-names> <surname>Spirtes</surname></string-name> (<year>2011</year>). <article-title>Intervention, determinism, and the causal minimality condition</article-title>. <source>Synthese</source> <volume>182</volume>(<issue>3</issue>): <fpage>335</fpage>-<lpage>347</lpage>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>