Belief State Estimation for Planning via Approximate Logical Filtering and Smoothing

Brent Mombourquette†, Christian Muise∗, and Sheila A. McIlraith†
† Department of Computer Science, University of Toronto
∗ CSAIL, Massachusetts Institute of Technology
† {bgmomb,sheila}@cs.toronto.edu, ∗ {cjmuise}@mit.edu

Abstract

State estimation is the task of estimating the state of a partially observable dynamical system given a sequence of executed actions and observations. In logical settings, state estimation can be realized via logical filtering. Unfortunately such filtering, though exact, can be intractable. To this end, we propose logical smoothing, a form of backwards reasoning that works in concert with logical filtering to refine past beliefs in light of new observations. We characterize the notion of logical smoothing together with an algorithm for backwards-forwards state estimation. We prove properties of our algorithms, and experimentally demonstrate their behaviour. Smoothing together with backwards-forwards reasoning are important techniques for reasoning about partially observable dynamical systems, introducing the logical analogue of effective techniques from control theory and dynamic programming.

1 Introduction

Many applications of artificial intelligence, from automated planning and diagnosis to activity recognition, require reasoning about dynamical systems that are only partially observable. A necessary component of such systems is state estimation – the task of estimating the state of the system given a sequence of executed actions and observations. State estimation is well studied in control systems, where transition systems are stochastic and the belief state is typically represented as a probability distribution. State estimation is commonly realized via filtering, of which Kalman filtering [Kalman, 1960] is a well-known example.

In logical settings, an analogous form of logical filtering was proposed by Amir and Russell [2003] in which an agent's belief state – the set of possible world states – can be compactly represented as a formula, and filtering is a form of belief update. While logical filtering is intractable in the general case [Eiter and Gottlob, 1992], there are tractable subclasses, often involving restricted transition systems or compact encodings of the belief state (e.g., [Shahaf and Amir, 2007; Shirazi and Amir, 2011]). Unfortunately, typical belief state representations often require further inference to ascertain beliefs about individual fluents – a frequent and time-critical component of many decision-making systems.

Our concern is with logical state estimation in service of tasks such as planning, execution monitoring, diagnosis, and activity recognition. We are particularly concerned with systems that include a rich characterization of how the actions of an agent indirectly affect their environment. These are typically captured by causal or ramification constraints (e.g., a causal constraint might say that if the battery and radio are ok and the radio is on, then sound is emitted). We assume that such constraints are compiled into the transition system as additional effects of actions following, e.g., [Pinto, 1999; Strass and Thielscher, 2013; Baier et al., 2014]. In planning, such constraints tend to create problems with large conformant width [Palacios and Geffner, 2009].

We exploit the observation that for planning, only a subset of the state is necessary to track. Planning systems need to know (1) when actions are applicable, and (2) when the goal is reached [Bonet and Geffner, 2014]. Execution monitoring systems need only track the conditions under which a plan remains valid (e.g., [Fikes et al., 1972]). Diagnosis systems track the confirmation and refutation of candidate diagnoses.

These observations motivate the development of state estimation techniques tailored to the task of tracking the truth of (conjunctions of) fluent literals. In Section 2 we formalize state estimation as semantic logical filtering and propose a sound under-approximation that is computationally appealing. Motivated by the technique of smoothing for stochastic systems (e.g., [Einicke, 2012]), in Section 3 we introduce the notion of logical smoothing, which allows for the updating of beliefs about the past in light of observations about the present. In Section 4, we propose an algorithm for backwards-forwards reasoning that combines smoothing and filtering in order to perform state estimation. The application of (approximate) logical smoothing mitigates the incompleteness of approximate logical filtering, while preserving many of its computational advantages. We evaluate the behaviour of our approach. This is followed by a discussion of related work and concluding remarks.

2 The Problem: State Estimation

State estimation is a core task in reasoning about dynamical systems with partial observability. Consider the simplified action-observation sequence in support of diagnosing a car. You turn the key in the ignition, turn_ignition, resulting in ignition_turned. If ignition_turned, battery_ok and gas_ok hold, then so will car_started. You observe that the car did not start (car_started = False), and so, under the assumption that the characterization of the vehicle functioning is complete, you can infer ¬battery_ok ∨ ¬gas_ok. You turn on the radio, turn_on_radio, causing radio_on as well as sound if battery_ok ∧ radio_ok. You observe sound (sound = True). Under completeness and frame assumptions, you are now able to infer radio_ok, battery_ok and ¬gas_ok. So following the action-observation sequence (turn_ignition, ¬car_started, turn_on_radio, sound), your estimated belief state comprises just one state, here represented by the set of fluents {ignition_turned, radio_on, ¬car_started, battery_ok, radio_ok, ¬gas_ok, sound}.

Informally, the state estimation task we address is: Given a dynamical system, a belief state, and a sequence of executed actions and observations, infer the resulting belief state of the system. For logical theories, state estimation is captured by logical filtering [Amir and Russell, 2003].

To relate our work to planning, execution monitoring and diagnosis, we appeal to standard finite-domain planning language syntax. A dynamical system is a tuple Σ = ⟨F, S, A, R, I⟩, where F is a finite set of propositional fluent symbols such that if p ∈ F, then p and ¬p are fluent literals, S = Pow(F) is the set of possible world states, A is a set of actions including sensing actions, R ⊆ S × A × S is the transition relation (equivalently we use the notation (s, a, s′) ∈ R or s′ = R(s, a)), and I is a set of clauses over F that defines a set of possible initial states – collectively, the initial belief state. As noted in the introduction, we assume that causal constraints are compiled into our transition system as extra effects of actions (e.g., following [Pinto, 1999]). For the purposes of this paper, non-sensing actions a ∈ A are assumed to be deterministic and are defined by a precondition prec(a), which is a conjunction of fluent literals, and eff(a), a set of conditional effects of the form C → L, where C is a conjunction of fluent literals and L is a fluent literal. We write the unconditional effect true → L as simply L, and use true to denote an empty precondition. Each sensing action, on the other hand, is defined by its precondition prec(a), which is a conjunction of fluent literals, and obs(a), which is the fluent literal that is observed by the sensing action. We assume no exogenous actions and that the transition relation is complete, characterizing all and only the conditions under which the truth value of a fluent changes.

Throughout this paper we take the viewpoint that the state of Σ represents the belief state of the agent. The semantics of logical filtering is defined by considering the belief state to be a set of possible world states ρ ⊆ S. For our purposes, we will often represent the belief state (henceforth also the state of the system) as a state formula φ. Later we will be restricting φ to a conjunction of fluent literals, sometimes denoted as a set of fluent literals and referred to as a conjunctive state formula. Following Amir and Russell (2003):

Definition 1 (Logical Filtering Semantics)
Given belief state ρ ⊆ S of dynamical system Σ, the filtering of a sequence of actions and observations ⟨a1, o1, ..., at, ot⟩ with respect to ρ is defined as:
1. Filter[⟨⟩](ρ) = ρ
2. Filter[a](ρ) = {s′ | s′ = R(s, a), s ∈ ρ}
3. Filter[o](ρ) = {s | s ∈ ρ and o is true in s}
4. Filter[⟨ai, oi, ..., at, ot⟩](ρ) = Filter[⟨ai+1, oi+1, ..., at, ot⟩](Filter[oi](Filter[ai](ρ)))

We call step 2 progression with a and step 3 filtering with o. When action a is filtered, every state s ∈ ρ is updated with respect to the transition system R(s, a). When an observation o is filtered, every state inconsistent with the observation is eliminated. Logical filtering results in an interleaving of action progression and observation filtering (e.g., [Vassos and Levesque, 2013]). Logical filtering naively is hard: there are 2^(2^|F|) belief states. As such, algorithms for logical filtering typically represent the belief state ρ compactly as a logical formula φ, called a belief-state formula or state formula.

While logical filtering realizes state estimation, the resulting belief state representation can cause fundamental inference operations to be intractable, such as inferring beliefs about individual fluents [Shahaf and Amir, 2007]. A core component of automated planning is determining the applicability of actions during the search for a plan, as well as determining whether the goal condition has been achieved. Similarly, dynamical diagnosis requires determination of the refutation or confirmation of candidate diagnoses as the result of (treatment) actions and sensing (e.g., [McIlraith and Reiter, 1992]). Diagnoses, action preconditions, and planning goals are typically represented as conjunctions of fluents, motivating the following definition of approximate logical filtering:

Definition 2 (Approximate Logical Filtering)
Given belief-state formula φ of dynamical system Σ, the approximate filtering of a sequence of actions and observations ⟨a1, o1, ..., at, ot⟩ with respect to φ is defined as:
1. Filter_a[⟨⟩](φ) = φ;
2. Filter_a[a](φ) = ⋀{L | (C → L) ∈ eff(a) ∧ φ |= C} ∧ ⋀{L | φ |= L ∧ ∀(C → ¬L) ∈ eff(a), φ |= ¬C};
3. Filter_a[o](φ) = φ ∧ o;
4. Filter_a[⟨ai, oi, ..., at, ot⟩](φ) = Filter_a[⟨ai+1, oi+1, ..., at, ot⟩](Filter_a[oi](Filter_a[ai](φ)))

Approximate logical filtering describes the belief state only in terms of a subset of the fluent literals entailed by the belief state. One can see from line 2 that progressing a state φ through an action a produces a state consisting of what was known to be caused (directly or indirectly) by a and what was known to persist through a.

Returning to our car example, we can see how this is a weak approximation. After the observations car_started = False and sound = True, no further inferences could be made, yielding the belief state {ignition_turned, radio_on, ¬car_started, sound}.

Theorem 1 (Sound Under-Approximation) Given dynamical system Σ, belief-state formula φ representing possible belief state ρ ⊆ S, and action-observation sequence ⟨a1, o1, ..., at, ot⟩, Filter[⟨a1, o1, ..., at, ot⟩](ρ) |= Filter_a[⟨a1, o1, ..., at, ot⟩](φ).

This follows from Definitions 1 and 2.
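To make the set-based reading of Definition 2 concrete, here is a minimal Python sketch of approximate filtering over conjunctive state formulae represented as sets of fluent literals. The encoding of a literal as a string with a "-" prefix for negation, and the helper names (neg, progress, filter_obs), are our own illustrative choices, not notation from the paper.

```python
def neg(lit):
    """Complement of a fluent literal: 'f' <-> '-f'."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def progress(phi, effects):
    """Filter_a[a](phi), line 2 of Definition 2.

    phi: conjunctive state formula, as a set of fluent literals.
    effects: conditional effects C -> L of the action, as (C, L) pairs,
             with C a set of fluent literals (empty set = unconditional).
    """
    # Literals known to be caused by the action: phi |= C.
    caused = {L for C, L in effects if C <= phi}
    # Literals known to persist: every effect that could flip L has a
    # condition already known to be false (phi |= ¬C, i.e. some literal
    # of C is contradicted by phi).
    persists = {L for L in phi
                if all(any(neg(c) in phi for c in C)
                       for C, L2 in effects if L2 == neg(L))}
    return caused | persists

def filter_obs(phi, obs):
    """Filter_a[o](phi) = phi ∧ o, for a conjunctive observation."""
    return phi | obs

# The car example, reduced to set operations (ramifications compiled
# into the actions as conditional effects):
turn_ignition = [(set(), "ignition_turned"),
                 ({"battery_ok", "gas_ok"}, "car_started")]
turn_on_radio = [(set(), "radio_on"),
                 ({"battery_ok", "radio_ok"}, "sound")]

phi = set()                                 # nothing known initially
phi = progress(phi, turn_ignition)
phi = filter_obs(phi, {"-car_started"})     # car did not start
phi = progress(phi, turn_on_radio)
phi = filter_obs(phi, {"sound"})
# phi is now exactly the weak belief state from the text:
# {ignition_turned, radio_on, -car_started, sound}
```

As the final value of phi shows, the approximation never recovers ¬battery_ok ∨ ¬gas_ok or radio_ok: it tracks only fluent literals, which is precisely the weakness that logical smoothing addresses in the next section.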
Approximate Logical Filtering is not complete with respect to the logical filtering semantics.

Proposition 1 (Conjunctive State Formula Preservation) Given dynamical system Σ, conjunctive state formula φ, and action-observation sequence ⟨a1, o1, ..., at, ot⟩, where each oi is a conjunctive formula, Filter_a[⟨a1, o1, ..., at, ot⟩](φ) is a conjunctive formula.

The above proposition follows naturally from lines 2 and 3 of Definition 2 and is key to our complexity results and tractable approximate representations.

Theorem 2 (Complexity) Given dynamical system Σ, conjunctive state formula φ, and action-observation sequence ⟨a1, o1, ..., at, ot⟩, where each oi is a conjunctive formula, Filter_a[⟨a1, o1, ..., at, ot⟩](φ) is computable in time O(t · c · |F|), where c is the maximum number of conditional effects over actions in the sequence.

This follows from fluent entailment of a conjunctive formula and the limit of |F| fluents per consistent conditional effect.

For automated planning and similar problems where the general assumption is that observations are conjunctive formulae, approximate logical filtering enables state estimation to reduce to set operations. While computationally appealing, the approximation is weak and thus of limited use on its own for AI planning, dynamical diagnosis, and similar tasks. In the next section, we show how to combine approximate logical filtering with a new approach for reasoning over the past – logical smoothing.

3 Logical Smoothing

Filtering with a stochastic transition system involves estimating the marginal posterior probability distribution of the system conditioned on some data that was received prior to the current system state. There is an analogous concept for estimating not the current state but instead a previous state. This is called smoothing – refining a previous state estimation (e.g., [Einicke, 2012]). For stochastic models, this amounts to a re-computation of the marginal posterior probability distribution. We can carry this idea into the logical setting and show how observations can be used to refine estimates of past states. The notion of logical smoothing is only of interest if the belief state is approximated in some way. For example, in later sections, we show how logical smoothing can be used to improve weak approximations of logical filtering and thus produce a refined estimate of the current state.

We begin our treatment of logical smoothing by defining the semantics of logical smoothing with respect to a belief state. Given a representation of the sequence of actions, observations, and intermediate state estimates, logical smoothing recursively refines previous state estimates through a backwards update procedure. To this end, we store previous state estimates, coupled with the actions executed to construct successor states, in a so-called history. Note that while we use a set of possible worlds to represent a belief state, histories can be defined with any sort of state representation (logical formulae, sets of fluents understood as a logical conjunction, etc.), which we exploit later in this section.

Definition 3 (Belief State History) Given dynamical system Σ, a belief state history over Σ is a sequence of tuples (ρ0, a0), (ρ1, a1), ..., (ρn, an) such that each ρi is a belief state, a set of possible world states ρi ⊆ S, and each ai is an action of Σ.

The intent of a history is to capture the evolution of the system starting from some designated initial belief state ρ0. The observations are not modeled as separate entities. Rather, they reside in the intermediate belief states, presumably as part of an original filtering process.

Definition 4 (Logical Smoothing Semantics)
Given belief state ρ ⊆ S of dynamical system Σ, the smoothing of a belief state history (ρ0, a0), (ρ1, a1), ..., (ρn, an) with respect to ρ is defined as:
1. Smooth[⟨⟩](ρ) = ρ
2. Smooth[o](ρ) = Filter[o](ρ) = {s | s ∈ ρ and o is true in s}
3. Smooth[(ρ′, a′)](ρ) = ρ′ ∩ {s | s ∈ PreImage(s′, a′), s′ ∈ ρ}
4. Smooth[⟨(ρ0, a0), (ρ1, a1), ..., (ρn, an)⟩](ρ) = Smooth[⟨(ρ0, a0), ..., (ρn−1, an−1)⟩](Smooth[(ρn, an)](ρ))
where PreImage(s′, a) = {s | s′ = R(s, a)}.

Logical smoothing works by propagating acquired information (typically an observation or the resultant filtered state-action pair) back through a given history and updating its constituent state estimates. Note that this is more akin to belief updating than to belief revision. Each smoothing step refines previous state estimates in that the smoothed state's set of possible world states is always a (non-strict) subset of the state's original set of possible world states.

With the semantic account of logical smoothing in hand, we now define the notion of logical smoothing with respect to a more compact representation of the belief state in terms of a belief-state formula.

Definition 5 (History) Given dynamical system Σ, a history H over Σ is a sequence of tuples (σ0, a0), (σ1, a1), ..., (σn, an) such that each σi is a belief-state formula over F and each ai is an action of Σ.

Definition 6 provides a formal characterization of logical smoothing with respect to a belief-state formula representation of the history. It utilizes regression rewriting [Reiter, 2001] to propagate updates back through the history. Regression, denoted as R[φ, a], takes a logical formula φ and action a and rewrites the formula in terms of the weakest conditions necessary for φ to hold following the execution of a. We appeal to Reiter's definition of regression in the situation calculus [Reiter, 2001] with syntactic notation suitably modified. When the history comprises a single state-action pair, formula φ is simply conjoined to the state. Otherwise, the formula is regressed step by step through the history, and any new information garnered from the regression, φNEW, is conjoined to the associated state. PI(φ) refers to the prime implicates of formula φ.

Definition 6 (Logical Smoothing) Given dynamical system Σ, history H = (σ0, a0), ..., (σn, an) and formula φ, the logical smoothing of H with respect to φ is defined as:
1. Smooth[(σ, a)](φ) = (σ ∧ φ, a)
2. Smooth[(σ0, a0), ..., (σn, an)](φ) = Smooth[(σ0, a0), ..., (σn−1, an−1)](φNEW), Smooth[(σn, an)](φ)
where φNEW = ⋀{ϕ ∈ PI(R[φ, an−1]) | σn−1 ⊭ ϕ}

The soundness and completeness of Logical Smoothing (Definition 6) relative to the semantic account in Definition 4 follow straightforwardly from the correspondence between PreImage and regression.

Returning to our car example, consider logical smoothing with respect to the observation car_started = False. For ease of presentation, assume approximate logical filtering is used for action progression. Progressing the initial state, σ0, through the action turn_ignition results in state σ1, which is equivalent to σ0 since the effect of turn_ignition is predicated on fluents whose truth value is unknown. When the observation action obs(car_started) is subsequently performed and car_started = False observed, the smoothed history is given by Smooth[h0, h1](¬car_started), where h0 = (σ0, turn_ignition) and h1 = (σ1, obs(car_started)). Following point 2 of Definition 6, this is equivalent to Smooth[(h0)](φNEW), Smooth[(h1)](¬car_started). Rule 1 of the logical smoothing semantics smooths σ1 of h1 as σ1 ∧ ¬car_started. The action obs(car_started) has no effects and therefore the observation car_started = False must hold in the prior state. Returning to φNEW, this formula hinges on R[¬car_started, turn_ignition]. By the regression rewriting and frame axioms, this gives φ = ¬car_started ∧ (¬battery_ok ∨ ¬gas_ok). As σ0 |= ¬car_started, we are left with φNEW = ¬battery_ok ∨ ¬gas_ok after restricting to prime implicates, giving the intuitive refinement of the initial state given the observation.

Algorithm 1: LSmooth(H, φ, i) – Perform logical smoothing on history H given that φ holds at σi in H. Returns updated H and the index of termination.
1  σi = σi ∧ φ
2  if i > 0 then
3    ψ = R[φ, ai−1]
4    ψNEW = ⋀{ϕ ∈ PI(ψ) | σi−1 ⊭ ϕ}
5    if ψNEW is empty then
6      return (H, i)
7    return LSmooth(H, ψNEW, i − 1)
8  return (H, i)

Algorithm 1 realizes Logical Smoothing together with an optimization to support early termination of the regression. Included in its input is a state index parameter, identifying the location within history H where φ is to be integrated. Logical smoothing can thus acquire information about a past state and smooth the preceding state estimates in light of it, as well as smoothing from the most recent state. The heart of the smoothing procedure lies within R[φ, ai−1] (line 3), as explained above. Line 4 identifies those aspects of ψ that are new to state σi−1 by identifying prime implicates of ψ not entailed by σi−1. This is an optimization, as it allows for early termination (line 5) when further smoothing would be unnecessary, and it minimizes the subsequent formula to be regressed in the next iteration. The process then repeats on the newly computed ψNEW formula at index i − 1. Finally, LSmooth returns a tuple of the refined history and the index of termination, term-idx. The purpose of returning term-idx will become apparent in Section 4 when we leverage logical smoothing in a state estimation algorithm.

Proposition 2 (Early Termination Completeness) Given a dynamical system Σ, a history H over Σ with at least n state-action tuples, and a formula φ over F, the updated history returned by LSmooth(H, φ, n) is the updated history returned by LSmooth(H, φ, n) with the early termination condition of line 5 removed.

It follows straightforwardly from the close correspondence between Definition 6 and Algorithm 1, and Proposition 2, that the history computed in Algorithm 1 is equivalent to the specification in Definition 6.

We say that H′ = (σ′0, a′0), ..., (σ′n, a′n) is a sound and complete refinement of H = (σ0, a0), ..., (σn, an) with respect to formula φ that holds at index i of H if for all 0 ≤ j ≤ n, (1) a′j = aj; (2) σ′j |= σj; and (3) if j < i, Filter[⟨a′j, σ′j+1, ..., a′i, σ′i⟩](σ′j) |= φ. In other words, a sound and complete refinement captures all of what must be known based on the existing history and necessary conditions for φ. With this definition in hand we have:

Theorem 3 (Soundness and Completeness) Given a dynamical system Σ, history H over Σ, and formula φ over F that must be true at σi of H, LSmooth(H, φ, i) produces a sound and complete refinement of H.

Logical smoothing provides a sound, complete, and principled approach to smoothing previous state estimations in a logical system in light of additional observations or information which must hold in the associated state. As this is merely the general algorithmic structure of logical smoothing, further optimizations are possible but omitted for clarity of exposition of the core concepts.

3.1 Approximate Logical Smoothing

Unfortunately, like logical filtering, the querying of belief states resulting from logical smoothing may not be tractable. To make matters worse, while a single regression step results in a linear increase in formula size with respect to the input, recursive applications of regression result in an exponential blow-up [Fritz, 2009]. To remedy these issues, we define, as we did with logical filtering, an analogous procedure of approximate logical smoothing. We do so by one minor adjustment to the logical smoothing outlined in Algorithm 1:

Line 4: ψNEW = ⋀{f | ψ ∧ σi−1 |= f and σi−1 ⊭ f}   (1)

where f is restricted to fluent literals. This approximation limits the updating of the history to fluent literals entailed by these regressed additions. Again we consider only the entailments not already captured by a state in the history. We denote the resulting algorithm obtained by modifying LSmooth as per (1) as LSmootha. It should be stressed that this is but one way to deal with the formula size increase due to repeated applications of regression. Alternatives which do not sacrifice completeness include adding new fluents in place of certain sub-formulae of the regressed formula [van Ditmarsch et al., 2007] or representing the formula as a circuit [Rintanen, 2008].

Following Definition 6, logical smoothing produced the correct refinement of the initial state with ¬battery_ok ∨ ¬gas_ok. However, with approximate logical smoothing, ψNEW is limited to fluent literals, resulting in an empty formula and therefore no refinement. Some information is lost, but other inferences can still be made. Continuing with the sequence of actions in our example, after obs(car_started) the actions turn_on_radio and obs(sound) are executed, obtaining sound = True. By smoothing with respect to this observation, the regression R[sound, turn_on_radio] gives ψ = sound ∨ (battery_ok ∧ radio_ok). Following the approximation rule (1), we get ψNEW = battery_ok ∧ radio_ok due to the previous state in the history entailing ¬sound. Even with the approximation, we refine the estimate of the prior state with the knowledge battery_ok ∧ radio_ok. Continuing this example, approximate logical smoothing would smooth each state in the history, including the initial state, to include battery_ok ∧ radio_ok.

Proposition 3 (Soundness) Let Σ be a dynamical system, H be a history over Σ of n state-action tuples, and φ be a formula over the language of F that must be true at the state at index i of H. If LSmooth(H, φ, i) = (H′, j) and LSmootha(H, φ, i) = (H′′, k), then for all 0 ≤ m ≤ n, if σ′m is the m-th state of H′ and σ′′m is the m-th state of H′′, then σ′m |= σ′′m.

By Proposition 3, LSmootha is an under-approximation of LSmooth. It amounts to a version that smooths only with respect to conjunctive formulae. Moreover, analogously to approximate logical filtering (Proposition 1), approximate logical smoothing is conjunctive state formula preserving. We also sidestep the exponential formula size blow-up from repeated regression operations by always regressing conjunctive formulae except for, potentially, in the first step.

Theorem 4 (Complexity) Given a history H over a dynamical system Σ of n state-action tuples such that all states are conjunctive formulae, a conjunctive formula φ over the language of F, and an index i of H, LSmootha(H, φ, i) can be computed in time O(n · 2^|F|) with propositional entailment or O(n · |F|^2) with unit propagation entailment.

4 Backwards-Forwards Algorithm

Logical smoothing refines previous state estimates by reasoning backwards (regressing) over the history of actions and states with respect to some acquired information. In many cases, this process removes some uncertainty about the past, particularly in dynamical systems where actions have causal ramifications. This information about the past can then be propagated forwards (progressed) through the state-action history to potentially produce further refinements.

As we last left the car example in Section 3.1, each state in the history, including the initial state, was smoothed with battery_ok ∧ radio_ok after observing sound = True. While it is obvious that this should be propagated forward to the current state, as it stands it is not so obvious how and why, in general, this should be done. Consider the case where the action turn_ignition actually had an additional effect battery_charging if battery_ok. Propagating battery_ok forward from the initial state given the actions and intermediate states in the history further refines the post-turn_ignition state estimates (including the current state) with battery_charging.

We can realize the forward phase outlined above via filtering. Note that when operating on a history, simulating the forwards reasoning phase by filtering works on the corresponding sequence of actions with the observation formulae being the state formulae. For notational convenience, we define subseq(H, i) of a history H = (σ0, a0), (σ1, a1), ..., (σn, an) and index 0 ≤ i ≤ n as the sub-sequence ⟨ai, σi+1, ai+1, ..., σn, an⟩. Algorithm 2 outlines our backwards-forwards state estimation algorithm.

Algorithm 2: BF(H, φ, i) – Perform backwards-forwards state estimation on history H given that formula φ holds at σi in H.
1  (H′, term-idx) = LSmootha(H, φ, i)
2  return Filter_a[subseq(H′, term-idx)](σterm-idx)

The BF algorithm uses approximate logical smoothing and approximate logical filtering. BF maintains conjunctive state formulae while computing sound state estimates. Logical smoothing allows us to encode complex information about any state of the system into the history, keeping each individual state estimate in a compact and computationally manageable form that is maintained through a simple logical filtering procedure. Towards this goal of tractability, BF leverages approximate logical smoothing with unit propagation entailment. When combined, the result is an intuitive reasoning mechanism for dynamical states with partial observability.
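The backwards phase (Definition 4) and the forwards re-filtering that BF performs can be sketched at the level of explicit sets of possible worlds. This is the semantic account only, not the formula-based LSmooth/BF the paper develops; the state encoding (frozensets of true fluents), the effect representation, and the helper names are our own illustrative choices.

```python
def holds(state, cond):
    """cond = (pos, neg): fluents required true / required false in state."""
    pos, neg = cond
    return pos <= state and not (neg & state)

def apply_action(state, effects):
    """Deterministic transition R(s, a): fire every conditional effect
    (cond, fluent, value) whose condition holds in s."""
    add = {f for cond, f, v in effects if v and holds(state, cond)}
    delete = {f for cond, f, v in effects if not v and holds(state, cond)}
    return frozenset((state - delete) | add)

def smooth(history, rho):
    """Definition 4: refine the belief states of
    history = [(rho_0, a_0), ..., (rho_n, a_n)] given that the system ended
    in some state of rho. PreImage is computed by forward simulation from
    the candidate states of each rho_i. Returns refined states, oldest first."""
    refined = []
    for rho_i, effects_i in reversed(history):
        rho = {s for s in rho_i if apply_action(s, effects_i) in rho}
        refined.append(rho)
    return refined[::-1]

def forward(rho, actions):
    """The forwards phase: progress a refined belief state through actions."""
    for effects in actions:
        rho = {apply_action(s, effects) for s in rho}
    return rho

# Car example: battery_ok and gas_ok are initially unknown (4 worlds).
turn_ignition = [((set(), set()), "ignition_turned", True),
                 (({"battery_ok", "gas_ok"}, set()), "car_started", True)]
rho0 = {frozenset(s) for s in [set(), {"battery_ok"}, {"gas_ok"},
                               {"battery_ok", "gas_ok"}]}
rho1 = {apply_action(s, turn_ignition) for s in rho0}
rho1 = {s for s in rho1 if "car_started" not in s}   # observe ¬car_started
refined0 = smooth([(rho0, turn_ignition)], rho1)[0]
# The world where both battery_ok and gas_ok hold is ruled out of rho0:
# the semantic counterpart of the ¬battery_ok ∨ ¬gas_ok refinement above.
```

Because belief states here are explicit state sets, each smoothing step is a (non-strict) subset of the original estimate, and progressing the refined initial state forward reproduces the filtered current belief, mirroring the backwards-forwards loop that BF realizes over conjunctive formulae.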
While the worst-case complexity is exponential, in practice this is not the case. Actions typically trigger few indirect system effects (ramifications) in comparison to the size of the propositional domain. This results in a compact regressed formula where all unit entailments are computable through unit propagation in most cases. Note that since we place no restrictions on the syntactic form of regressed formulae, such as restricting to Horn clauses, unit propagation may not produce all entailments, resulting in a sound under-approximation.

Although it may appear excessive to filter with respect to state formulae, the states are always conjunctive formulae, and approximate filtering operates by conjoining the observation formula with the state formula resulting from progressing the previous action. Since all refinements are sound, this reduces to simple set operations over the fluents.

Proposition 4 (Complexity) Given a history H over a dynamical system Σ of n state-action tuples such that all states are conjunctive formulae, a conjunctive formula φ over the language of F, and an index i of H, BF(H, φ, i) can be computed in time O(n · c · 2^|F|) with propositional entailment, or O(n · c · |F|^2) with unit-propagation entailment, where c is the maximum number of conditional effects over the actions of Σ.

This follows from the previous complexity results.

4.1 Space Optimization
Here we outline two optimizations to alleviate the space requirements of logical smoothing.

State Relevance Minimization. The smoothing process relies on only a subset of the state. Given a history H with tuple (σ, a), it is sufficient for σ to include only the fluents f that are involved in the conditional effects of a. This is due to the regression formula being purely over these fluents, plus the fluents of φ that have no positive or negative effects with respect to a. Such an optimization of the state fluents would greatly reduce the memory overhead, as actions typically involve and affect a small fraction of the domain. The downside is that ψNEW may contain old inferences, and thus the early termination becomes less robust.

Sliding Window History. A second optimization is to smooth over a fixed window size, called fixed-lag smoothing in stochastic systems [Einicke, 2012]. This greatly improves the memory footprint, at the potential expense of the quality of the estimation.

5 Experimental Evaluation
We investigate three key questions: (1) how effective is our approach in capturing the necessary fluent literals to determine action preconditions and the goal; (2) how does our state relevance minimization impact the system; and (3) how does our approach perform in light of different manifestations of a domain's dynamics. Ideally, we would compare our approach empirically with logical filtering. However, the original authors confirmed in private correspondence that the code is unavailable. Instead, we look to the field of automated planning that deals with partial observability and sensing. We use a set of standard benchmark problems and two newly created domains.

We ran the BF algorithm on valid plans, and here we present statistics on the proportion of action traces (plan branches) for which all action preconditions and the goal condition are known to hold. For comparison, these are also reported for Approximate Logical Filtering (ALF). Average history state sizes over the action traces are reported for the BF algorithm and with the state relevance minimization (BF+SRM) as outlined in Section 4.1. Problem statistics and running times are also reported.

The plans were produced by the planner POPRP [Muise et al., 2014], which leverages a compilation of partially observable planning problems that exhibit particular properties making them easier to solve (i.e., "simple" contingent problems [Bonet and Geffner, 2011]). For a description of the benchmark domains Wumpus, Doors, Colored Balls, and CTP (Canadian Traveler's Problem), see [Muise et al., 2014]. Full details and source code used for evaluation will be made available online.

Table 1 shows the results of evaluating the BF algorithm on plans for each of the benchmarks.

Problem     |F|   #-BR  max-BR  avg-BR   Cov-BF  Cov-ALF   Size-BF  Size-BF+SRM   Time-BF  Time-BF+SRM
Wumpus05    802     35      43    32.9      100      0        17.4     0.02          0.91       1.69
Doors05     100     25      26    16.0      100     64.0      43.5     0.3           0.01       0.17
Doors07     196    343      60    33.2      100     62.9      43.2     0.1           9.02      16.52
CTP05        76     32      10    10.0      100      3.1      43.4     0.6           0.06       0.10
CTP07       134    128      13    14.0      100      0.7      34.3     0.3           0.57       1.01
CTP09       205    512      18    18.0      100      0.2      28.3     0.2           4.34       7.73
Balls4-1    374     48      46    26.1      100     81.2      21.3     0.1           0.83       1.54
Balls4-2    396   2304      90    51.4      100     66.0      25.1     0.09        166.85     324.69

Table 1: BF Algorithm - Planning Benchmarks Performance. (#-BR) number of branches; (max-BR) max branch depth; (avg-BR) average branch depth; (Cov) Precondition / Goal Coverage, the percentage of runs where all preconditions and goal fluents are captured; (Size) average History State Size, the percentage of all fluents that must be tracked; (Time) Time to Solve, in seconds, for each technique to process and verify each branch of the plan.

First, consider the Precondition / Goal Coverage section. Even with a judicious under-approximation, the BF algorithm is capable of tracking every relevant fluent for every action trace of all plans for the above problems. This is due in part to the fact that these problems belong to the class of width-1 simple contingent problems, which has the property that once a fluent becomes known it stays known [Bonet and Geffner, 2011]. For problems like Colored Balls, ALF is capable of solving a significant portion of the action traces. This is to be expected; the domain has very little dynamics. In the case of four colors with one ball, the only unsolvable traces result from situations where the ball must be in the last location not observed and the plan is able to infer this without directly observing it. Other domains, like Wumpus and CTP, require heavy reasoning, causing ALF to fail in most, if not all, cases.

Table 1 also reports the average history state size as a percentage of the total number of fluents in the problem. The large reduction in state size from SRM is due to the fact that only sensing actions have relevant information: no action affects observable fluents. This is a byproduct of the problem structures that the planning community has chosen to focus on, and it highlights the orthogonality of this work to what is currently being researched.

Lastly, Table 1 reports how much time it takes to solve every action trace of the plan for the associated problem. There are two main take-aways here. First, the SRM optimization creates a space-vs-time trade-off. It significantly reduces the memory footprint of the history but has an impact on computation time. Second, as one would expect, the time to solve a given instance correlates with the number of branches (action traces) in the plan as well as their average length in terms of the number of actions. This is why large problems like Wumpus05 can be solved much more quickly than a smaller problem like Balls4-2.

To further expand the evaluation, we introduce two new domains in the class of width-n non-simple contingent problems that specifically involve system dynamics and hidden state information that must be inferred.

Password. n switches may be flipped by the agent. Each has an unobservable correct position. There are n + 1 lights such that the i-th light signifies that exactly i switches are in the wrong position. The goal is for the 0-th light to be on and for the agent to know the correct position of each switch.

Lights. n lights are connected in series and each may potentially be broken. A light may be off if it is broken or a light downstream is broken. The agent must fix lights that are known to be broken and reach a state where all lights are no longer broken. This domain has cascading effects, as fixing a single light may change the "lit" status of all lights upstream. Note that the contingent width (as defined by [Bonet and Geffner, 2014]) for these problems is precisely n.

These problems represent two orthogonal classes of dynamic domains, with their differences best summarized by Figure 1. First, consider the Password domain. As the problem size grows, the average number of conditional effects grows dramatically: any time a switch is flipped, all possible cases for the n + 1 lights changing must be covered. With the Lights domain, the actions of fixing a single light have more repercussions as lights are added, but the chain is only increased by one per new light. If we look at how the average number of conditions (fluents) of the conditional effects grows with respect to problem size, we see the reverse: the Lights domain grows much faster than Password. By the parametrization of the Password domain, each conditional effect of flipping a switch depends only on the correctness of the switch and the light that is currently on. For the Lights domain, as more lights are added, fixing any single light has a longer chain of potential ramifications given the status of the lights both upstream and downstream. Therefore, these problems allow us to compare how BF scales with respect to these two important domain characteristics.

[Figure 1: Differences in Password and Lights domains. Average number of conditional effects (solid lines) and average number of fluents per conditional effect (dotted lines) vs. problem size (# of lights/switches).]

We evaluate via a simulation that creates a randomized true state and produces a sequence of actions that should result in a state where the goal holds. Figure 2 shows how the BF algorithm performs as the problem size increases, averaged over 100 random action traces. As with the standard benchmarks, the BF algorithm correctly deduced all action preconditions and goal states over all generated action traces. The main point of comparison here is not pure performance but instead performance across problem types, as per the preceding discussion of Figure 1. The Password domain scales slightly better, partially due to its general domain growth with respect to problem size being lower than that of the Lights domain, as per Figure 3. As Figures 1 and 3 show, an increase in a simple notion of problem size can have a large impact on multiple facets of the problem representation. Regardless of how the dynamics of the system manifest when compiled into conditional effects of actions, Figure 2 shows that the BF algorithm scales equivalently.

[Figure 2: Performance of BF on Password and Lights domains. Average run-time (seconds) per simulation vs. problem size (# of lights/switches).]

[Figure 3: Password and Lights domain growth. Total number of fluents in the domain (solid lines) and average simulation length in actions (dotted lines) vs. problem size (# of lights/switches).]

6 Related Works
In this paper we propose an approach to logical state estimation for dynamical systems that is tailored to address the tracking of individual fluents in a computationally efficient manner. Similar to Amir and Russell's original work on logical filtering, we elected not to perform our analysis in the situation calculus, but rather to use a dynamical system model that is employed by those who develop efficient algorithms for AI automated planning and diagnosis. Nevertheless, there is a large body of work in the knowledge representation literature, much of it in the situation calculus, that is related to logical filtering, particularly work on belief update and on progression. Among these, Lin and Reiter (1997) provided a broad study of progression, spawning a number of other advances and culminating in Vassos and Levesque's (2013) treatment of first-order progression, which appears to subsume Shirazi and Amir's (2011) work on first-order logical filtering. Also relevant to approximate filtering, and in small degree smoothing, is the work by Liu and Levesque (2005) that studies progression in dynamical systems with respect to so-called Proper Knowledge Bases. This work shares some motivation with our conjunctive formulae restriction in attempting to avoid disjunction in favor of tractability. Further, the authors discuss a limited integration of sensing via regression to determine the context of actions to be performed, building on a similar idea for projection by De Giacomo and Levesque (1999). Finally, Ewin et al. (2014) study the problem of query projection for long-living agents by combining regression and progression in a manner that is similar in spirit to the work presented here.

Our work is inspired by work on smoothing and backwards-forwards reasoning in stochastic systems, building on previous work on logical filtering [Amir and Russell, 2003], the adaptation of classical filtering to a logical setting. Various works on (database) progression (e.g., [Vassos and Levesque, 2013]) are also closely related to this exact logical state estimation task. In 2007, Shahaf and Amir developed a version of logical filtering that represents the underlying belief state as a circuit. By solving the update problem completely, logical circuit filtering is able to compute and represent the belief state, exactly, in polynomial time. Unfortunately, querying the underlying data structure used in logical circuit filtering requires the use of a SAT solver, making it untenable for tasks such as AI planning, which could require excessive SAT calls to evaluate action executability during plan construction.

One of the first works on the approximation of logical belief states in dynamical systems was the 0-approximation semantics proposed by Baral and Son (1997). Like our approximation techniques, they represent beliefs in terms of conjunctions of fluents, but they do not exploit a notion of backwards-forwards reasoning to do state estimation.

The field of automated planning has spawned numerous systems which necessarily have a sub-component to perform state estimation. For example, the systems CNFct, DNFct, and PIct ([To et al., 2011b; 2011a]) explicitly maintain sets of the possible belief states, but in the worst case they have exponential space complexity. Palacios and Geffner (2009) realize a form of approximate state estimation via a compilation process that introduces additional fluents and actions corresponding to possible initial worlds. While this translation-based approach provides efficient state querying, the number of additional fluents required for completeness grows exponentially in the problem's contingent width.

The SDR Planner [Brafman and Shani, 2012] maintains a history of actions similar to our history of state-action pairs, but for a slightly different purpose. The planner samples a possible complete initial state, then assumes it is correct and plans appropriately. When information gained through sensing actions disproves the correctness of the initial state sample, re-planning is performed and a new state is sampled. To ensure action preconditions are correctly followed, precondition fluents are regressed through the history to the initial state to ensure satisfiability. This portion of the algorithm is similar, at a high level, to the fundamental idea of logical smoothing: that an understanding of the evolution of the past can produce new information about the present. More recently, Brafman and Shani (2014) also exploit regression for effective state estimation. A key difference, however, is that we use newly discovered information about past states to reduce the uncertainty of more recent states. In contrast, they only perform a backwards pass on the history of actions and observations, using full regression, while we approximate for efficiency.

Most recently, Bonet and Geffner (2014) developed algorithms for belief tracking for so-called simple planning problems, as noted previously. We were unable to perform an experimental comparison with their work because their implementation is domain-specific. Nevertheless, it is interesting to consider when one approach works well and the other does not. As width-n problems, both the Password and Lights domains would cause an exponential blowup for their technique. Conversely, there are width-1 problems where our approximation does not capture simple entailments, such as conformant-like conditions where case-based reasoning plays a role. The complementary nature of the two approaches makes their combination an obvious step for future research.

7 Concluding Remarks
Logical smoothing and backwards-forwards reasoning are important techniques in service of reasoning about partially observable dynamical systems. They characterize the logical analogue of commonly used techniques in control theory and other dynamic programming settings. This paper provides the formal foundations for a number of interesting theoretical and algorithmic tools that are of practical significance for automated planning, execution monitoring, diagnosis, and beyond. We demonstrated the effectiveness of our approach, which has flawless recall on state estimation for the preconditions and goal conditions in the plans for existing partially observable planning problems. Further, we witnessed a dramatic reduction in the number of fluent literals that must be monitored. In future work we plan to integrate our algorithms with a contingent planning system. While our work was presented in the context of deterministic actions, the account and algorithms extend trivially to non-deterministic actions, and can be extended to exogenous actions. We plan to elaborate on such cases and to further explore variants of logical smoothing and our BF algorithm as they relate to non-deterministic and probabilistic transition systems. Finally, we wish to further explore the theoretical relationship between our work and progression over Proper KBs, and the relationship to the various approximations defined by Palacios and Geffner [2009] and by Bonet and Geffner [2014].

Acknowledgements: The authors gratefully acknowledge funding from the Natural Sciences and Engineering Research Council of Canada (NSERC).

References
[Amir and Russell, 2003] Eyal Amir and Stuart J. Russell. Logical filtering.
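A minimal simulator makes the Password semantics concrete; the class, the greedy agent, and all names below are our own illustrative constructions, not the benchmark's encoding:

```python
import random

class Password:
    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        self.correct = [rng.choice([True, False]) for _ in range(n)]  # hidden from the agent
        self.position = [False] * n                                   # current switch settings

    def flip(self, i):
        self.position[i] = not self.position[i]

    def lights(self):
        """Observable: light i is on iff exactly i switches are in the wrong position."""
        wrong = sum(p != c for p, c in zip(self.position, self.correct))
        return [i == wrong for i in range(len(self.position) + 1)]

# A simple agent: keep any flip that lowers the index of the lit light.
p = Password(4, seed=7)
while not p.lights()[0]:
    before = p.lights().index(True)
    for i in range(4):
        p.flip(i)
        if p.lights().index(True) < before:
            break
        p.flip(i)  # undo an unhelpful flip
print(p.lights()[0])  # True
```

Since flipping any one switch can move the lit light to any of the n + 1 positions depending on the hidden state, a compiled flip action needs a conditional effect for each such case, which is the rapid growth in conditional effects discussed below.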
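The cascading ramifications of the Lights domain fit in a few lines; this is an illustrative model of the dynamics (we index lights so that higher indices are downstream), not the benchmark's encoding:

```python
def lit_status(broken):
    """Light i is lit iff neither it nor any light downstream of it is broken."""
    lit, ok_downstream = [False] * len(broken), True
    for i in range(len(broken) - 1, -1, -1):
        ok_downstream = ok_downstream and not broken[i]
        lit[i] = ok_downstream
    return lit

broken = [False, True, False, True]
print(lit_status(broken))  # [False, False, False, False]
broken[3] = False          # fix the most-downstream break...
print(lit_status(broken))  # [False, False, True, True]
```

Fixing one light re-lit every light upstream of the remaining break, so a single fix action must condition on the status of the rest of the chain: that is the growth in fluents per conditional effect noted for this domain.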
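The shape of this simulation loop can be sketched generically. The toy conjunctive estimator below stands in for the belief tracker purely to show the harness; it is our own illustrative stub, not the BF algorithm itself:

```python
class ConjunctiveEstimator:
    """Toy belief tracker: a set of literals known to be true (illustration only)."""
    def __init__(self, known):
        self.known = set(known)

    def entails(self, literals):
        return set(literals) <= self.known

    def update(self, effects, observation):
        # Forget literals over changed fluents, then add effects and observations.
        changed = {abs(l) for l in effects}
        self.known = {l for l in self.known if abs(l) not in changed}
        self.known |= set(effects) | set(observation)

def run_trace(trace, init_known):
    """Step through one action trace, counting preconditions certified
    by the estimator before each action executes."""
    est = ConjunctiveEstimator(init_known)
    certified = 0
    for pre, effects, obs in trace:
        if est.entails(pre):
            certified += 1
        est.update(effects, obs)
    return certified, len(trace)

# Trace: a sensing step that observes fluent 2, then an action requiring 1 and 2.
trace = [([1], [], [2]), ([1, 2], [3], [])]
print(run_trace(trace, init_known=[1]))  # (2, 2)
```

A full harness would additionally sample the hidden true state and generate the trace from a plan, as described above; the coverage statistic is then the fraction of traces where every precondition and the goal are certified.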
In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), pages 75–82, 2003.
[Baier et al., 2014] Jorge A. Baier, Brent Mombourquette, and Sheila A. McIlraith. Diagnostic problem solving via planning with ontic and epistemic goals. In Principles of Knowledge Representation and Reasoning: Proceedings of the Fourteenth International Conference (KR 2014), 2014.
[Baral and Son, 1997] Chitta Baral and Tran Cao Son. Approximate reasoning about actions in presence of sensing and incomplete information. In Proceedings of the 1997 International Symposium on Logic Programming, pages 387–401, 1997.
[Bonet and Geffner, 2011] Blai Bonet and Hector Geffner. Planning under partial observability by classical replanning: Theory and experiments. In IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011, pages 1936–1941, 2011.
[Bonet and Geffner, 2014] Blai Bonet and Hector Geffner. Belief tracking for planning with sensing: Width, complexity and approximations. Journal of Artificial Intelligence Research, pages 923–970, 2014.
[Brafman and Shani, 2012] Ronen I. Brafman and Guy Shani. Replanning in domains with partial information and sensing actions. J. Artif. Intell. Res. (JAIR), 45:565–600, 2012.
[Brafman and Shani, 2014] Ronen I. Brafman and Guy Shani. On the properties of belief tracking for online contingent planning using regression. In Proc. of the 21st European Conference on Artificial Intelligence (ECAI), pages 147–152, 2014.
[De Giacomo and Levesque, 1999] Giuseppe De Giacomo and Hector J. Levesque. Projection using regression and sensors. In Proc. of the 16th Int'l Joint Conference on Artificial Intelligence (IJCAI), pages 160–165, 1999.
[Einicke, 2012] Garry A. Einicke. Smoothing, Filtering and Prediction - Estimating The Past, Present and Future. InTech, 2012.
[Eiter and Gottlob, 1992] Thomas Eiter and Georg Gottlob. On the complexity of propositional knowledge base revision, updates, and counterfactuals. Artificial Intelligence, 57(2-3):227–270, 1992.
[Ewin et al., 2014] Christopher James Ewin, Adrian R. Pearce, and Stavros Vassos. Transforming situation calculus action theories for optimised reasoning. In Proc. of the 14th Int'l Conference on the Principles of Knowledge Representation and Reasoning (KR), 2014.
[Fikes et al., 1972] Richard Fikes, Peter Hart, and Nils Nilsson. Learning and executing generalized robot plans. Artificial Intelligence, 3:251–288, 1972.
[Fritz, 2009] Christian Fritz. Monitoring the Generation and Execution of Optimal Plans. PhD thesis, University of Toronto, April 2009.
[Kalman, 1960] R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering, 82(Ser. D):35–45, 1960.
[Lin and Reiter, 1997] Fangzhen Lin and Raymond Reiter. How to progress a database. Artificial Intelligence, 92:131–167, 1997.
[Liu and Levesque, 2005] Yongmei Liu and Hector J. Levesque. Tractable reasoning with incomplete first-order knowledge in dynamic systems with context-dependent actions. In Proc. of the 19th Int'l Joint Conference on Artificial Intelligence (IJCAI), pages 522–527, 2005.
[McIlraith and Reiter, 1992] S. McIlraith and R. Reiter. On tests for hypothetical reasoning. In W. Hamscher, L. Console, and J. de Kleer, editors, Readings in Model-Based Diagnosis, pages 89–95. Morgan Kaufmann, 1992.
[Muise et al., 2014] Christian Muise, Vaishak Belle, and Sheila A. McIlraith. Computing contingent plans via fully observable non-deterministic planning. In The 28th AAAI Conference on Artificial Intelligence, 2014.
[Palacios and Geffner, 2009] Héctor Palacios and Hector Geffner. Compiling uncertainty away in conformant planning problems with bounded width. Journal of Artificial Intelligence Research, 35:623–675, 2009.
[Pinto, 1999] Javier Pinto. Compiling ramification constraints into effect axioms. Computational Intelligence, 15:280–307, 1999.
[Reiter, 2001] Raymond Reiter. Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. MIT Press, Cambridge, MA, 2001.
[Rintanen, 2008] Jussi Rintanen. Regression for classical and nondeterministic planning. In Proceedings of the 18th European Conference on Artificial Intelligence, pages 568–572, 2008.
[Shahaf and Amir, 2007] Dafna Shahaf and Eyal Amir. Logical circuit filtering. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-07), pages 2611–2618, 2007.
[Shirazi and Amir, 2011] Afsaneh Shirazi and Eyal Amir. First-order logical filtering. Artificial Intelligence, 175(1):193–219, 2011.
[Strass and Thielscher, 2013] Hannes Strass and Michael Thielscher. A general first-order solution to the ramification problem with cycles. Journal of Applied Logic, 11(3):289–308, 2013.
[To et al., 2011a] Son Thanh To, Enrico Pontelli, and Tran Cao Son. On the effectiveness of CNF and DNF representations in contingent planning. In Proc. of the 22nd Int'l Joint Conference on Artificial Intelligence (IJCAI), pages 2033–2038, 2011.
[To et al., 2011b] Son Thanh To, Tran Cao Son, and Enrico Pontelli. Contingent planning as AND/OR forward search with disjunctive representation. In Fahiem Bacchus, Carmel Domshlak, Stefan Edelkamp, and Malte Helmert, editors, Proceedings of the 21st International Conference on Automated Planning and Scheduling, ICAPS 2011, Freiburg, Germany, June 11-16, 2011. AAAI, 2011.
[van Ditmarsch et al., 2007] Hans P. van Ditmarsch, Andreas Herzig, and Tiago De Lima. Optimal regression for reasoning about knowledge and actions. In Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, pages 1070–1076, 2007.
[Vassos and Levesque, 2013] Stavros Vassos and Hector J. Levesque. How to progress a database III. Artificial Intelligence, 195:203–221, 2013.