=Paper= {{Paper |id=Vol-1433/tc_22 |storemode=property |title=Structural Resolution for Logic Programming |pdfUrl=https://ceur-ws.org/Vol-1433/tc_22.pdf |volume=Vol-1433 |dblpUrl=https://dblp.org/rec/conf/iclp/JohannKK15 }} ==Structural Resolution for Logic Programming == https://ceur-ws.org/Vol-1433/tc_22.pdf
Technical Communications of ICLP 2015. Copyright with the Authors.                               1




      Structural Resolution for Logic Programming
                                       PATRICIA JOHANN
                    Department of Computer Science, Appalachian State University, USA
                                   (e-mail: johannp@appstate.edu)

                              EKATERINA KOMENDANTSKAYA
                              School of Computing, University of Dundee, UK
                               (e-mail: katya@computing.dundee.ac.uk)

                                VLADIMIR KOMENDANTSKIY
                                               Moixa, UK
                                 (e-mail: vladimir@moixaenergy.com)

                              submitted 29 April 2015; accepted 5 June 2015



                                               Abstract
We introduce a Three Tier Tree Calculus (T 3C) that defines in a systematic way three tiers of tree
structures underlying proof search in logic programming. We use T 3C to define a new – structural –
version of resolution for logic programming.
KEYWORDS: Structural resolution, term trees, rewriting trees, derivation trees.




                                           1 Introduction
As ICLP is celebrating the 200th anniversary of George Boole, we are reflecting on the
fundamental “laws” underlying derivations in logic programming (LP), and making an
attempt to formulate some fundamental principles for first-order proof search, analogous
in generality to Boole’s “laws of thought” for propositional logic (Boole 1854).
   Any such principles must be able to reflect two important features of first-order proof
search in LP: its recursive and non-deterministic nature. For this they must satisfy two
criteria: to be able to (a) model infinite structures and (b) reflect the non-determinism of
proof search, relating “laws of infinity” with “laws of non-determinism” in LP.
Example 1.1 The program P1 inductively defines the set of natural numbers:
                                   0.   nat(0) ←
                                   1. nat(s(X)) ←            nat(X)

To answer the question “Does P1 ` nat(s(X)) hold?”, we first represent it as the LP query
? ← nat(s(X)) and then use SLD-resolution to resolve this query with P1 . The topmost
clause selection strategy first resolves nat(s(X)) with P1 ’s second clause (Clause 1), and
then resolves the resulting term with P1 ’s first clause (Clause 0). This gives the derivation
nat(s(X)) → nat(X) → true, which computes the solution {X 7→ 0} in its last step. So
one answer to our question is “Yes, provided X is 0.”
2                  P. Johann, E. Komendantskaya and V. Komendantskiy

Even for this simple inductive program, there will be clause selection strategies (or clause
orderings) that will result in infinite SLD-derivations. If Clause 1 is repeatedly resolved
against, the infinite computation will compute the first limit ordinal.
   The least and greatest Herbrand model semantics (van Emden and Kowalski 1976; Lloyd
1988; van Emden and Abdallah 1985) captured very well the recursive (and corecursive!)
nature of LP (thus satisfying our criterion (a)). For example, the least Herbrand model for
P1 is an infinite set of finite terms nat(0), nat(s(0)), nat(s(s(0))), . . .. The greatest com-
plete Herbrand model for program P1 is the set containing all of the finite terms in the least
Herbrand model for P1 together with the first limit ordinal nat(s(s(...))). However, due
to its declarative nature, the semantics does not reflect the operational non-deterministic
nature of LP, and thus fails our criterion (b).
   The operational semantics of LP has seen the introduction of a variety of tree structures
reflecting the non-deterministic nature of proof search: proof trees, SLD-derivation trees,
and and-or-trees, just to name a few. However, these do not adequately capture the infi-
nite structures arising in LP proof search. It is well-known that SLD-derivations for any
program P are sound and complete with respect to the least Herbrand model for P (Lloyd
1988), but this soundness and completeness depends crucially on termination of SLD-
derivations, and termination is not always available in LP proof search. As a result, logical
entailment is only semi-decidable in LP.
   In one attempt to match the greatest complete Herbrand semantics for potentially non-
terminating programs, an operational counterpart — called computations at infinity — was
introduced in (Lloyd 1988; van Emden and Abdallah 1985). The operational semantics of
a potentially nonterminating logic program P was then taken to be the set of all infinite
ground terms computable by P at infinity. Computations at infinity better capture the com-
putational behaviour of non-terminating logic programs, but infinite computations do not
result in implementations. This observation suggests one more criterion: (c) our operational
semantics must be able to provide an observational (constructive) approach to potential in-
finity and non-determinism of LP proof search, thus incorporating “laws of observability”.
   Coinductive logic programming (CoLP) (Gupta et al. 2007; Simon et al. 2007) pro-
vides a method for terminating certain infinite SLD-derivations (thus satisfying our crite-
ria (a) and (c)). This is based on the principle of coinduction, which is in turn based on
the ability to finitely observe coinductive hypotheses and succeed when coinductive con-
clusions are reached. CoLP’s search for coinductive hypotheses and conclusions uses a
fairly straightforward loop detection mechanism. It requires the programmer to supply an-
notations classifying every predicate as either inductive or coinductive. Then, for queries
marked as coinductive, it observes finite fragments of SLD-derivations, checks them for
unifying subgoals, and terminates when loops determined by such subgoals are found.
   The loop detection mechanism of CoLP has three major limitations, all arising from the
fact that it has relatively week support for analysis of various proof-search strategies and
term structures arising in LP proof search (and thus for our criterion (b)).
  (1) It does not work well for cases of mixed induction-coinduction. For example, to
coinductively define an infinite stream of Fibonacci numbers, we would need to include in-
ductive clauses defining addition on natural numbers. Coinductive goals will be mixed with
inductive subgoals. Closing such computations by simple loop detection is problematic.
                        Structural Resolution for Logic Programming                             3

  (2) There are programs for which computations at infinity produces an infinite term,
whereas CoLP fails to find unifiable loops.
Consider the following (coinductive) program P2 that has the single clause
0. from(X, scons(X, Y)) ← from(s(X), Y)
Given the query ? ← from(0, X), and writing [ , ] as an abbreviation for the stream con-
structor scons, we have that the infinite term t 0 = from(0, [0, [s(0), [s(s(0)), . . .]]]) is
computable at infinity by P2 and is also contained in the greatest Herbrand model for
P2 . However, P2 ` from(0, X) cannot be proven using the unification-based loop detection
technique of CoLP. Since the terms from(0, scons(0, X0 )), from(s(0), scons(s(0), X00 )),
from(s(s(0)), scons(s(s(s(0))), X000 ), ... arising in the derivation for P2 and ? ← from(0, X)
will never unify, CoLP will never terminate.
   (3) CoLP fails to reflect the fact that some infinite computations are not productive, i.e.,
do not produce an infinite term at infinity. The notion of productivity of corecursion is
well studied in the semantics of other programming languages (Endrullis et al. 2010; Agda
2015; Coq 2015). For example, no matter how long an SLD-derivation for the following
program P3 runs, it does not produce an infinite term, and the resulting computation is thus
coinductively meaningless:
0. bad(X) ← bad(X)
Somewhat misleadingly, CoLP’s loop detection terminates with success for such programs,
thus failing to guarantee coinductive construction of infinite terms (failing criterion (a)).
   Is our quest for a theory of LP satisfying criteria (a), (b), and (c) hopeless? We take a
step back and recollect that the semantics of first-order logic and recursive schemes offers
one classical approach to formulating structural properties of potentially infinite first-order
terms. Best summarised in “Fundamental Properties of Infinite Trees” (Courcelle 1983),
the approach comes down to formulating some structural laws underlying first-order syn-
tax. It starts with definition of a tree language as a (possibly infinite) set of sequences of
natural numbers satisfying conditions of prefix-closedness and finite branching. Given a
first-order signature Σ together with a countable set of variables Var, a first-order term tree
is defined as a map from a tree language L to the set Σ ∪ Var. Size of the domain of the
map determines the size of the term tree. The “laws” are then given by imposing several
structural properties: (i) in a given term tree, arities imposed by Σ must be reflected by
the branching in the underlying tree language; (ii) variables have arity 0 and thus can only
occur at leaves of the trees; and (iii) the operation of substitution is given by replacing
leaf variables with term trees. A calculus for the operation can be formulated in terms of a
suitable unification algorithm. We give formal definitions in Sections 2 and 3.
   We extend this elegant theory of infinite trees to give an operational semantics of LP
that satisfies criteria (a), (b), and (c). We borrow a few general principles from this theory.
Structural properties of trees (given by arity and variable constraints) and operations on
trees (substitutions) are defined by means of “structural laws” that hold for finite and infi-
nite trees. This gives us constructive approach to infinity (cf. criteria (a) and (c)). It remains
to find the right kind of structures to reflect the non-determinism of proof search in LP.
   Given a logic program P and a term (tree) t, the first question we may ask is whether
t matches any of P’s clauses. First-order term matching is a restricted form of unifica-
4                       P. Johann, E. Komendantskaya and V. Komendantskiy
                                                         X
                                                         →2
                 ? ← nat(s(X))                                             ? ← nat(s(0))

                   nat(s(X))                                                 nat(s(0))

                   X1      nat(s(X)) ← nat(X)                                X1         nat(s(0)) ← nat(0)

                                  nat(X)                                                      nat(0)

                                  X2     X3                                               nat(0) ←      X3

Fig. 1. The rewriting trees for P1 and ? ← nat(s(X)) and ? ← nat(s(0)). The trees form a transition relative to
                                            2 X
          the Tier 2 variable X2 (shown by →). The second tree is a successful proof for ? ← nat(s(X))).

tion employed in (first-order) term rewriting systems (TRS) (Terese 2003) and — via
pattern-matching — in functional programming. For our P and t, we may proceed with
term matching steps recursively, mimicking an SLD-derivation in which unification is re-
stricted to term matching. Consider the matching sequences for four different terms and
the coinductive program P2 from above:

    from(0, X)       from(0, [0, X0 ])        from(0, [0, [s(0), X00 ]])       from(0, [0, [s(0), [s(s(0)), X000 ]]])

                     from(s(0), X0 )          from(s(0), [s(0), X00 ])            from(s(0), [s(0), [s(s(0)), X000 ]])

                                                  from(s(s(0)), X00 )               from(s(s(0)), [s(s(0)), X000 ])

                                                                                       from(s(s(s(0))), X000 )

Let us call term matching sequences as above rewriting trees, to highlight their relation
to TRS. The above sequences can already reveal some of the structural properties of the
given logic program. If Σ2 is the signature of the program P2 , and if we denote all finite
term trees that can be formed from this signature as Term(Σ2 ), then a rewriting tree for P2
can be defined as a map from a given tree language L to Term(Σ2 ). Since rewriting trees
are built upon term trees, we may say that term trees give a first tier of tree structures, while
the rewriting trees give a second tier of tree structures. To formulate suitable laws for the
second tier, we need to refine our notion of rewriting trees.
   Given a program P and a term t, we may additionally reflect how many clauses from P
can be unified with t, and how many terms those clauses contain in their bodies. We thus
introduce a new kind of “or-nodes” to track the matching clauses. If P has n clauses, t may
potentially have up to n alternative matching sequences. When a clause i does not match
a given term tree t, we may use a Tier 2 variable to denote the fact that, although t does
not match clause i at the moment, a match may be found for some instantiation of t. Thus,
for the program P1 above and the queries ? ← nat(s(X)) and ? ← nat(s(0)), we will have
the two rewriting trees of Figure 1. We note the alternating or-nodes (given by clauses) and
and-nodes (given by terms from clause bodies) and Tier 2 variables.
   Two kinds of laws are imposed on structure of rewriting trees:
      – arity constraints: the arity of an and-node is the number of clauses in the program,
        and arity of and or-node is the number of terms in its clause body.
      – variable constraints: variable leaves have arity 0, and run over the objects being de-
        fined (rewriting trees). Variables are the leaves in which substitution can take place.
  In Figure 1, Tier 2 variable X2 is substituted by a one-node rewriting tree nat(0) ←.
Such substitutions constitute the fundamental operation on Tier 2 trees, and give rise to a
                        Structural Resolution for Logic Programming                             5

calculus for Tier 2 given in terms of so-called rewriting tree transitions. Figure 1 shows a
transition from a rewriting tree for ? ← nat(s(X)) to a rewriting tree for ? ← nat(s(0))
which corresponds to the SLD-derivation outlined in Example 1.1. Thus, a derivation is
a sequence of tree transitions (given by the Tier 2 operation of substitution). We call this
method structural resolution, or S-resolution for short. Its formal relation to TRS and type
theory is given in (Fu and Komendantskaya 2015). Section 4 will introduce Tier 2 formally.
   We note the remarkably precise analogy between structures and operations of Tier 1
and Tier 2. Rewriting trees can be finite or infinite. For programs P1 and P2 , any rewriting
tree will be finite, but program P3 will give rise to infinite rewriting trees. Once again, our
structural analysis is fully generic for finite and infinite tree structures at Tier 2, which fits
our criterion (a). Rewriting trees perfectly reflect the “non-determinism laws” (criterion
(b)), thanks to and-nodes and or- nodes keeping a structural account of all the search op-
tions. Finally, our structural analysis perfectly fits criterion (c). For productive programs
like P1 and P2 , the length of a derivation may be infinite, however, each rewriting tree will
necessarily be finite. This ensures observational approach to corecursion and productivity.
   We complete the picture by introducing the third tier of trees reflecting different search
strategies arising from substitution into different variables of Tier 2. Given the set Rew(P)
of all finite rewriting trees defined for program P, a derivation tree is given by a map from
a tree language L to Rew(P). The arity of a given node in a derivation tree (itself given by
a rewriting tree) is the number of Tier 2 variables in that rewriting tree. The construction
of derivation trees is similar to the construction of SLD-derivation trees (as it accounts for
all possible derivation strategies). The trees of Tier 3 are formally defined in Section 5.
   The resulting Three Tier Tree Calculus (T 3C) developed in this paper formalises the
fundamental properties of trees arising in LP proof search. Apart from being theoretically
pleasing, this new theory can actually deliver very practical results. The finiteness of rewrit-
ing trees comprising a possibly infinite derivation gives an important observational prop-
erty for defining and semi-deciding (observational) productivity for corecursion in LP. This
puts LP on par with other languages in terms of observational productivity and coinductive
semantics (Endrullis et al. 2010; Agda 2015; Coq 2015). With a notion of productivity in
hand for LP, we can ask for results showing inductive and coinductive soundness of deriva-
tions given by transitions among rewriting trees. The two pictures above give, respectively,
a sound coinductive observation of a proof for t 0 = from(0, [0, [s(0), [s(s(0)), . . .]]]) with
respect to P2 , and a sound inductive derivation for nat(s(X)) with respect to P1 . Our ongo-
ing and future research based on T 3C will be further explained in Section 6.

                              2 Background: Tree Languages

Our notation for trees is a variant of that in, e.g., (Lloyd 1988; Courcelle 1983). Let N∗
denote the set of all finite words (i.e., sequences) over the set N of natural numbers. The
length of a word w ∈ N∗ is denoted by |w|. The empty word ε has length 0. We identify
the natural number i and the word i of length 1. If w is a word of length l, then for each
i ∈ {1, ..., l}, wi is the ith element of w. We may write w = w1 ...wl to indicate that w is a
word of length l. We use letters from the end of the alphabet, such u, v, and w, to denote
words in N∗ of any length, and letters from the middle of the alphabet, such as i, j, and k,
to denote words in N∗ of length 1 (i.e., individual natural numbers). The concatenation of
  6                       P. Johann, E. Komendantskaya and V. Komendantskiy
                 ε                     ε                                   stream               scons
                 0                0        1                               scons            0        scons
           00        01               10         .                     0            Y            0           .
                                                 .                                                           .
                                                 .                                                           .

  Fig. 2. The two figures on the left depict the finite and infinite tree languages {ε, 0, 00, 01} and
  {ε, 0, 1, 10, 11, . . .}. The two figures on the right depict the finite term tree stream(scons(X, Y)) and the infi-
  nite term tree scons(0, scons(0, ...)), both over Σ1 .

  words w and u is denoted wu. The word v is a prefix of w if there exists a word u such that
  w = vu, and a proper prefix of w if u 6= ε.

  Definition 2.1 A set L ⊆ N∗ is a (finitely branching) tree language if the following condi-
  tions are satisfied:
• For all w ∈ N∗ and all i, j ∈ N, if w j ∈ L then w ∈ L and, for all i < j, wi ∈ L.
• For all w ∈ L, the set of all i ∈ N such that wi ∈ L is finite.

     A tree language L is finite if it is a finite subset of N∗ , and infinite otherwise. Examples
  of finite and infinite tree languages are given in Figure 2. We may call a word w ∈ L a node
  of L. If w = w1 w2 ...wl , then a node w1 w2 ...wk for k < l is an ancestor of w. The node w is
  the parent of wi, and nodes wi for i ∈ N are children of w. A branch of a tree language L is
  a subset L0 of L such that, for all w, v ∈ L0 , w is an ancestor of v or v is an ancestor of w. If
  L is a tree language and w is a node of L, the subtree of L at w is L\w = {v | wv ∈ L}.
     We can now define our three-tier calculus T 3C.

                                               3 Tier 1: Term Trees
  In this section, we introduce Tier 1 of T 3C, highlighting the structural properties of its ob-
  jects (arity, branching, variables), the operation of first-order substitution, and the relevant
  calculus given by unification.

  3.1 Tier 1 structural properties: Signature as codomain, arity, and variables
  The trees of T 3C’s first tier are term trees over a (first-order) signature. A signature Σ
  is a non-empty set of function symbols, each with an associated arity. The arity of f ∈
  Σ is denoted arity( f ). For example, Σ1 = {stream, scons, 0}, with arity(scons) = 2,
  arity(stream) = 1, and arity(0) = 0, is a signature. To define term trees over Σ, we also
  need a countably infinite set Var of variables disjoint from Σ, each with arity 0. We use
  capital letters from the end of the alphabet, such as X, Y, and Z, to denote variables in Var.

  Definition 3.1 Let L be a non-empty tree language and let Σ be a signature. A term tree
  over Σ is a function t : L → Σ ∪ Var such that, for all w ∈ L, arity(t(w)) = | {i | wi ∈ L} |.

      Structural properties of tree languages extend to term trees. For example, a term tree
  t : L → Σ ∪ Var has depth depth(t) = max{|w| | w ∈ L}. The subtree of t at node w is given
  by t 0 : (L\w) → Σ ∪V , where t 0 (v) = t(wv) for each v ∈ L\w.
      Term trees are finite or infinite according as their domains are finite or infinite. Term
  trees over Σ may be infinite even if Σ is finite. Figure 2 shows the finite and infinite term
  trees stream(scons(X, Y)) and scons(0, scons(0, ...)) over Σ1 . The set of finite (infinite)
  term trees over a signature Σ is denoted Term(Σ) (Term∞ (Σ)). The set of all (i.e., finite
                       Structural Resolution for Logic Programming                             7

and infinite) term trees over Σ is denoted by Termω (Σ). Term trees with no occurrences
of variables are ground. We write GTerm(Σ) (GTerm∞ (Σ), GTermω (Σ)) for the set of
finite (infinite, all) ground term trees over Σ. GTerm(Σ) is also known as the Herbrand
base for Σ, and GTermω (Σ) is known as the complete Herbrand base for Σ, in the litera-
ture (Lloyd 1988). Both GTerm(Σ) and GTermω (Σ) are used to define the Herbrand model
and complete Herbrand model (declarative) semantics of LP (Kowalski 1974; Lloyd 1988).
Additionally, GTermω (Σ) is used to give an operational semantics to SLD-computations
at infinity in (Lloyd 1988; van Emden and Abdallah 1985).

3.2 Tier 1 operation: First-order substitution
A substitution of term trees over Σ is a total function σ : Var → Term(Σ). We write id
for the identity substitution. If σ has finite support — i.e., if |{X ∈ Var | σ (X) 6= X}| ∈ N
— and if σ maps the variables Xi to term trees ti , respectively, and is the identity on all
other variables, then we may write σ as {X1 7→ t1 , ..., Xn 7→ tn }. The set of all substitutions
over a signature Σ is Subst(Σ). Substitutions are extended from variables to term trees
homomorphically: if t ∈ Term(Σ) and σ ∈ Subst(Σ), then the application σ (t) is defined
by (σ (t))(w) = t(w) if t(w) 6∈ Var, and (σ (t))(w) = (σ (X))(v) if w = uv, t(u) = X, and
X ∈ Var. Composition of substitutions is denoted by juxtaposition, so σ2 σ1 (t) is σ2 (σ1 (t)).
Since composition is associative, we write σ3 σ2 σ1 rather than (σ3 σ2 )σ1 or σ3 (σ2 σ1 ).

3.3 Tier 1 calculus: Unification
A substitution σ over Σ is a unifier for term trees t and u over Σ if σ (t) = σ (u), and a
matcher for t against u if σ (t) = u. A substitution σ1 is more general than a substitution
σ2 , denoted σ1 ≤ σ2 , if there exists a substitution σ such that σ σ1 (X) = σ2 (X) for every
X ∈ Var. A substitution σ is a most general unifier (mgu) for t and u if it is a unifier for t
and u, and is more general than any (other) such unifier. A most general matcher (mgm) is
defined analogously. Both mgms and mgus are unique up to variable renaming.
   We write t ∼σ u if σ is a mgu for t and u, and t ≺σ u if σ is a mgm for t against u. Our
notation is reasonable: unification is reflexive, symmetric, and transitive, but matching is
reflexive and transitive only. Mgms and mgus can be computed using Robinson’s seminal
unification algorithm (see, e.g., (Lloyd 1988; Pfenning 2007)). Any standard unification
algorithm (possibly represented by system of sequent-like rules (Pfenning 2007; Fu and
Komendantskaya 2015)) can be seen as the calculus of Tier 1. Additional details about
unification and matching can be found in, e.g., (Baader and Snyder 2001).

                                 4 Tier 2: Rewriting Trees
In this section, we introduce Tier 2 of T 3C, highlighting the structural properties of rewrit-
ing trees: codomains comprising term trees and clauses, suitable notions of arity, the oper-
ation of Tier 2 substitution, and the relevant calculus given by rewriting tree transitions.

4.1 Tier 2 structural properties: Terms and clauses as codomain, arity, and variables
In LP, a clause C over a signature Σ is a pair (A, [B0 , ..., Bn ]), where A ∈ Term(Σ) and
[B0 , . . . Bn ] is a list of term trees in Term(Σ). Such a clause C is usually written as A ←
B0 , . . . , Bn . The head A of C is denoted head(C) and the body B0 , . . . , Bn of C is denoted
body(C). In T 3C, a clause over Σ is naturally represented as a total function (also called
   8                   P. Johann, E. Komendantskaya and V. Komendantskiy

   C) from a finite tree language L of depth 1 to Term(Σ) such that C(ε) = head(C), and if
   body(C) is B0 , . . . , Bn then, for each i ∈ L, C(i) = Bi . The set of all clauses over Σ is denoted
   by Clause(Σ). A goal clause G over Σ is a clause ? ← B0 , . . . , Bn over Σ ∪ {?}. Here, ? is
   a specified symbol not occurring in Σ ∪ Var, and B0 , . . . , Bn are term trees in Term(Σ). The
   goal clause ? ← is called the empty goal clause over Σ. We consider every goal clause
   over Σ to be a clause over Σ. The arity of a clause A ← B0 , . . . , Bn is n + 1. The symbol
   head(C)(ε) is the predicate of C.
      A logic program over Σ is a total function from a set {0, 1, . . . , n} ⊆ N to the set of non-
   goal clauses over Σ. The set of all logic programs over Σ is denoted LP(Σ). The arity of
   P ∈ LP(Σ) is the number |dom(P)| of clauses in P.
      We extend substitutions from variables to clauses and programs homomorphically. The
   variables of a clause C can be renamed with “fresh” variables — i.e., with variables that do
   not appear elsewhere in the current context — to get a new α-equivalent clause that can be
   used interchangeably with C. We assume variables have been thus renamed apart when-
   ever convenient. Renaming apart avoids circular (non-terminating) cases of unification and
   matching in LP. Under renaming, we can always assume that a mgm or mgu of a clause
   and a term is idempotent, i.e., that σ σ = σ .
      We now define the trees of Tier 2. Rewriting trees allow us to simultaneously track all
   matching sequences appearing in an LP derivation, and thus to see relationships between
   them. Since rewriting trees use only matching in their computation steps, they capture
   theorem proving (i.e., computations holding for all compatible term trees). By contrast, the
   Tier 3 derivation trees defined in Section 5 use full unification, and thus capture problem
   solving (i.e., computations holding only for certain compatible term trees).
      We distinguish two kinds of nodes in rewriting trees: and-nodes capturing terms coming
   from clause bodies, and or-nodes capturing the idea that every term tree can in principle
   match several clause heads. We also introduce or-node variables to signify the possibility
   of unification when matching of a term tree against a program clause fails.

   Definition 4.1 Let VR be a countably infinite set of variables disjoint from Var. If P ∈
   LP(Σ), C ∈ Clause(Σ), and σ ∈ Subst(Σ) is idempotent, then rew(P,C, σ ) is the function
   T : dom(T ) → Term(Σ) ∪ Clause(Σ) ∪ VR , where dom(T ) is a non-empty tree language,
   satisfying the following conditions:

1. T (ε) = σ (C) ∈ Clause(Σ) and, for all i ∈ dom(C) \ {ε}, T (i) = σ (C(i)).
2. For w ∈ dom(T ) with |w| even and |w| > 0, T (w) ∈ Clause(Σ) ∪VR . Moreover,
   – if T (w) ∈ VR , then { j | w j ∈ dom(T )} = 0,
                                                  / and
   – if T (w) = B ∈ Clause(Σ), then there exists a clause P(i) and an mgm θ for P(i) against
   head(B). Moreover, for every j ∈ dom(P(i))\{ε}, w j ∈ dom(T ) and T (w j) = σ (θ (P(i)( j))).
3. For w ∈ dom(T ) with |w| odd, T (w) ∈ Term(Σ). Moreover, for every i ∈ dom(P), we have
   – wi ∈ dom(T( ), and
                 σ (θ (P(i)))        if head(P(i)) ≺θ T (w) and
   – T (wi) =
                 a fresh X ∈ VR otherwise
4. No other words are in dom(T ).

   A node T (w) of rew(P,C, σ ) is an or-node if |w| is even and an and-node if |w| is odd. The
                           Structural Resolution for Logic Programming                                 9

  node T (ε) is the root of rew(P,C, σ ). If P ∈ LP(Σ), then T is a rewriting tree for P if it is
  either the empty tree or rew(P,C, σ ) for some C ∈ Clause(Σ) and σ ∈ Subst(Σ).

     The arity of a node T (w) in T = rew(P,C, σ ) is arity(P) if T (w) ∈ Term(Σ), arity(C)
  if T (w) ∈ Clause(Σ), and 0 if T (w) ∈ VR . The role of the parameter σ in the definition of
  rew will become clear when we discuss the notion of substitution for Tier 2. For now, we
  may think of σ as the identity substitution.

  Example 4.1 The rewriting trees rew(P1 , ? ← nat(s(X)), id) and rew(P1 , ? ← nat(s(0)), id)
  are shown in Figure 1.

     A rewriting tree for a program P is finite or infinite according as its domain is finite or
  infinite. We write Rew(P) for the set of finite rewriting trees for P, Rew∞ (P) for the set of
  infinite rewriting trees for P, and Rewω (P) for the set of all (finite and infinite) rewriting
  trees for P. In (Komendantskaya et al. 2014), a logic program P is called (observationally)
  productive, if each rewriting tree constructed for it is in Rew(P). Programs P1 and P2 are
  productive in this sense, whereas program P3 is not. In future work, we will introduce
  methods that semi-decide observational productivity.

  4.2 Tier 2 operation: Substitution of rewriting trees for Tier 2 variables
     With rewriting trees as the objects of Tier 2 and a suitable notion of a Tier 2 variable, we
  can replay Tier 1 substitution at Tier 2 by defining Tier 2 substitution to be the replacement
  of Tier 2 variables by rewriting trees. However, in light of the structural dependency of
  rewriting trees on term trees in Definition 4.1, we must also incorporate first-order substi-
  tution into Tier 2 substitution. Exactly how this is done is reflected in the next definition.

  Definition 4.2 Let P ∈ LP(Σ), C ∈ Clause(Σ), σ , σ 0 ∈ Subst(Σ) idempotent, and T =
  rew(P,C, σ ). Then the rewriting tree σ 0 (T ) is defined as follows:
• for every w ∈ dom(T ) such that T (w) is an and-node or non-variable or-node, (σ 0 (T ))(w) =
  σ 0 (T (w)).
• for every wi ∈ dom(T ) such that T (wi) ∈ VR , if θ is an mgm of head(P(i)) against σ 0 (T )(w),
  then (σ 0 (T ))(wiv) = rew(P, θ (P(i)), σ 0 σ )(v). (Note v = ε is possible.) If no mgm of head(P(i))
  against σ 0 (T )(w) exists, then (σ 0 (T ))(wi) = T (wi).

  Both items in the above definition are important in order to make sure that, given a rewriting
  tree T and a first-order substitution σ , σ (T ) satisfies Definition 4.1.
  Example 4.2 Consider the first rewriting tree T of Figure 1. Given first-order substitution
  σ = {X 7→ 0}, the second tree of that Figure gives σ (T ). Note that Tier 2 variable X2 is
  substituted by the one-node rewriting tree nat(0) ← as a result. In addition, all occur-
  rences of the first-order variable X in T are substituted by 0 in σ (T ).

      Drawing from Examples 4.1 and 4.2, we would ideally like to formally connect the
  definition of a rewriting tree and Tier 2 substitution, and say that, given T = rew(P,C, id)
  and a first-order substitution σ , σ (T ) = rew(P, σ (C), id). However, this does not hold in
  general, as was also noticed in (Komendantskaya et al. 2014). Given a clause C = (t ←
  t1 , . . . ,tn ), we say a variable X is existential if it occurs in some ti but not in t. The presence
  of existential variables shows why the third parameter in definition of rew is crucial:
10                        P. Johann, E. Komendantskaya and V. Komendantskiy


                                   ? ← conn(a, c)                                             ? ← conn(a, c)
                                     conn(a, c)                                                 conn(a, c)
                              0           0
X1     conn(a, c) ← edge(a, Z ), conn(Z , c)        X2 X3 X1          conn(a, c) ← edge(a, b), conn(b, c)      X2   X3
      edge(a, Z0 )                  conn(Z0 , c)               edge(a, b)                             conn(b, c)
 X4    X5    X6      X7       X8      .       X9   X10    X4    X5 edge(a, b) ←    X7            X8     . X9 conn(b, c) ←
                                      .                                                                 .
                                      .                                                                 .

Fig. 3. The infinite rewriting trees T and T 0 for the program P4 of Example 4.3, the clause conn(a, c), and
the substitutions id and {Z0 7→ b}, respectively. T offers no proof that P4 logically entails conn(a, c), but the
underlined steps in T 0 comprise precisely such a proof. The figure also illustrates a transition from T to T 0
relative to variable X6 .


Example 4.3 The graph connectivity program P4 is given by
0. conn(X, X) ←
1. conn(X, Y) ← edge(X, Z), conn(Z, Y)
2. edge(a, b) ←
3. conn(b, c) ←

Figure 3 shows rewriting trees T = rew(P4 ,C, id) and T 0 = rew(P4 ,C, θ ), where C =? ←
conn(a, c), and θ = {Z0 7→ b}. Note that θ (T ) = T 0 but rew(P4 , θ (C), id) 6= T 0 . This hap-
pens because Clause 1 contains an existential variable Z in its body, and construction of
rew(P4 , θ (C), id) fails to apply the substitution θ down the tree.

   Given T = rew(P,C, σ ), for T 0 = θ (T ) = rew(P,C, θ σ ) to hold, we must make sure
that the procedure of renaming variables apart used implicitly when computing mgms
during the rewriting tree construction is tuned in such a way that existential variables
contained in the domain of θ are still in correspondence with the existential variables in
rew(P,C, θ σ ). We achieve this by introducing a new renaming apart convention to supple-
ment Definition 4.1. Given a program P and a clause P(i) with distinct existential variables
Z1 , . . . , Zn ∈ Var, we impose an additional condition on the standard renaming apart proce-
dure. During the construction of T = rew(P,C, σ ), when an and-node T (w) is matched with
head(P(i)) via θ in order to form T (wi) = θ (P(i)), P(i)’s existential variables Z1 , . . . , Zn
must be renamed apart as follows:
   – We partition Var into two disjoint sets called VU and VE . The set VE is used to rename
existential variables apart, while VU is used to (re)name all other variables.
   – Moreover, when computing an mgm θ for T (w) and (P(i)), every existential vari-
able Zk from Z1 , . . . , Zn is renamed apart from variables of T using the following indexing
convention: Zk 7→ Ewi    k , with E k ∈ V .
                                   wi    E
When writing T (wi) = θ (P(i)) we assume that the above renaming convention is already
accounted for by θ . This ensures that the existential variables will be uniquely deter-
mined and synchronized for every two nodes T (w) and T 0 (w) in T = rew(P,C, σ ) and
T 0 = rew(P,C, θ σ ). Subject to this renaming convention, the following theorem holds.

Theorem 4.1 Let P ∈ LP(Σ), C ∈ Clause(Σ), and θ , σ ∈ Subst(Σ). Then θ (rew(P,C, σ )) =
rew(P,C, θ σ ).
Proof. Let T = rew(P,C, σ ), and let T 0 = rew(P,C, θ σ ). We need to prove that θ (T ) = T 0 .
                        Structural Resolution for Logic Programming                              11

The proof proceeds by induction on the length of the tree T and by cases on the types of
nodes in T and θ (T ).
    – If T (w) and θ (T (w)) are non-variable or-nodes (including the case T (ε)), then, by
Definition 4.2, θ (T )(w) = θ (T (w)) = θ σ (C∗ ), where C∗ is either C (i.e., it is a root node)
or some P(i) ∈ P. But, by Definition 4.1, T 0 (w) = θ σ (C∗ ). (Here, the synchronisation of
renamed existential variables is essential, as described.)
    – If T (w) and θ (T (w)) are and-nodes, then the argument is similar.
    – If T (wi) is a variable or-node, then, by Definition 4.2, two cases are possible:
    (1) If no mgm for θ (T (w)) and head(P(i)) exists, then θ (T )(w) = θ (T (w)). But then
no mgm for T 0 (w) and head(P(i)) exists either, so T 0 (w) = θ (T (w)).
    (2) If the mgm for θ (T (w)) and head(P(i)) exists, then by Definition 4.2, θ (T )(wi) =
rew(P, θ 0 (P(i)), θ σ )(ε), where θ 0 is the mgm of θ (T (w)) and head(P(i)). The rest of the
proof proceeds by induction on the depth of rew(P, θ 0 (P(i)), θ σ ).
    Base case. For the root θ (T )(wi) = rew(P, θ 0 (P(i)), θ σ )(ε), by Definition 4.1 we have
that rew(P, θ 0 (P(i)), θ σ )(ε) = (θ σ )(θ 0 (P(i))). On the other hand, Definition 4.1 also
gives that T 0 (wi) = (θ σ )(θ 00 (P(i))), where θ 00 is the mgm of T 0 (w) and head(P(i)). Since
T 0 (w) = (θ (T ))(w) by the earlier argument for and-nodes, θ 0 and θ 00 are mgus of equal
term trees and head(P(i)), so θ 0 = θ 00 . Then T 0 (wi) = (θ (T ))(wi), as desired.
    Inductive case. We need only consider the situation when T (wiv j) is undefined, but
(θ (T ))(wiv j) is defined. By Definition 4.2, θ (T )(wiv j) = rew(P, θ 0 (P(i)), θ σ )(v j). This
node can be either an and-node, a variable or-node, or a non-variable or-node. The first two
cases are simple; we spell out the latter, more complex case only.
    If θ (T )(wiv j) is a non-variable or-node then, by Definition 4.1, it must be (θ σ )(θ ∗ P( j)),
where θ ∗ is the mgm of θ (T )(wiv) and head(P( j)). On the other hand, Definition 4.1 also
gives that T 0 (wiv j) = (θ σ )(θ ∗∗ (P( j))), where θ ∗∗ is the mgm of T 0 (wiv) and head(P( j)).
Since T 0 (wiv) = (θ (T ))(wiv) by the induction hypothesis, θ ∗ and θ ∗∗ are mgms of equal
term trees and head(P( j)), so θ ∗ = θ ∗∗ . Thus T 0 (wiv j) = (θ (T ))(wiv j), as desired.       2

4.3 Tier 2 calculus: Rewriting tree transitions
   The operation of Tier 2 substitution is all we need to define transitions among rewriting
trees. Let P ∈ LP(Σ) and t ∈ Term(Σ). If head(P(i)) ∼σ t, then σ is the resolvent of P(i)
and t. If no such σ exists then P(i) and t have null resolvent. A non-null resolvent is an
internal resolvent if it is an mgm of P(i) against t, and it is an external resolvent otherwise.
Definition 4.3 Let P ∈ LP(Σ) and T = rew(P,C, σ 0 ) ∈ Rewω (P). If X = T (wi) ∈ VR , then
the rewriting tree TX is defined as follows. If the external resolvent σ for P(i) and T (w) is
null, then TX is the empty tree. If σ is non-null, then TX = rew(P,C, σ σ 0 ).
If T ∈ Rewω (Σ) and X ∈ VR , then the computation of TX from T is denoted Trans(P, T, X) =
TX . If the other parameters are clear we simply write T → TX . The operation T → TX is a
tree transition for P and C. A tree transition for P ∈ LP(Σ) is a tree transition for P and
some C ∈ Clause(Σ). A (finite or infinite) sequence T = rew(P,C, id) → T1 → T2 → . . . of
tree transitions for P is a derivation for P and C. Each rewriting tree Ti in the derivation
is given by rew(P, C, σi . . . σ2 σ1 ), where σ1 , σ2 . . . is the sequence of external resolvents
associated with the derivation. When we want to contrast the above derivations with SLD-
derivations, we call them S-derivations, or derivations by structural resolution.
12                 P. Johann, E. Komendantskaya and V. Komendantskiy

Example 4.4 Tree transitions for P1 and P4 are shown in Figures 1 and 3, respectively.

   It is our current work to prove that S-derivations are sound and complete relative to
declarative semantics of LP; see also (Fu and Komendantskaya 2015) for a comparative
study of the operational properties of S-derivations and SLD-derivations.

                                 5 Tier 3: Derivation Trees
While the rewriting trees of Tier 2 capture transitions between Tier 1 term trees that depend
on matching, the derivation trees of Tier 3 capture transitions between Tier 2 rewriting
trees that depend on unification. Derivation trees thus allow us to simultaneously track
all unification sequences appearing in an LP derivation. The arity of a rewriting tree T ,
denoted arity(T ), is the cardinality of the set indices(T ) of indices of variables from VR in
T . There is always a bijection pos from indices(T ) to the (possibly infinite) set arity(T ).


Definition 5.1 If P ∈ LP(Σ) and C ∈ Clause(Σ), the derivation tree der(P,C) is the func-
tion D : dom(D) → Rewω (P) such that D(ε) = rew(P,C, id), and if w ∈ dom(D), i ∈
arity(D(w)), and i = pos(k), then wi ∈ dom(D) and D(wi) is Trans(P, D(w), Xk ).

   For P ∈ LP(Σ) and C ∈ Clause(Σ), the derivation tree der(P,C) is unique up to renam-
ing. If P ∈ LP(Σ), then D is a derivation tree for P if it is der(P,C) for some C ∈ Clause(Σ).
A derivation tree is finite or infinite according as its domain is finite or infinite. Inductive
programs like P1 and coinductive programs like P2 will have infinite derivation trees, so
construction of the full derivation trees for such programs is infeasible. Nevertheless, finite
initial fragments of derivation trees may be used to make coinductive observations about
various routes for proof search. We are currently exploring this research direction.
                             6 Conclusions and Future Work
This paper gives the first fully formal exposition of the Three Tier Tree Calculus T 3C for
S-resolution, relating “laws of infinity”, “laws of non-determinism”, and “laws of observ-
ability” of proof search in LP in a uniform, conceptual way. Implementation of derivations
by S-resolution is available (Komendantskaya et al. 2015).
   The structural approach to LP put forth in this paper relies on the syntactic structure of
programs rather than on their (operational, declarative, or other) semantics. In essence, it
presents an LP analogue of the kinds of reasoning that types and pattern matching support
in interactive theorem proving (ITP) (Agda 2015; Coq 2015). Further study of this analogy
is an interesting direction for future research.
   Our next steps will be to formulate a theory of universal and observational productivity
of (co)recursion in LP, and to supply T 3C with semi-decidable algorithms for ensuring pro-
gram productivity (akin to guardedness checks in ITP). Formally proving that S-resolution
is both inductively and coinductively sound is another of our current goals.
   Since LP and similar automated proof search methods underlie type inference in ITP
and other programming languages, S-resolution also has the potential to impact the design
and implementation of typeful programming languages. This is another research direction
we are currently pursuing.
                       Structural Resolution for Logic Programming                           13

                                         References
AGDA.         2015.              Agda     Development      Team.      agda    reference manual.
   http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php.
BAADER , F. AND S NYDER , W. 2001. Unification theory. In Handbook of Automated Reasoning,
   A. Robinson and A. Voronkov, Eds. Vol. I. Elsevier Science, Chapter 8, 446–531.
B OOLE , G. 1854. An investigation of the Laws of Thought on Which are Founded the Mathematical
   Theories of Logic and Probabilities. Macmillan.
C OQ. 2015. Coq Development Team. coq reference manual. https://coq.inria.fr/.
C OURCELLE , B. 1983. Fundamental properties of infinite trees. Theoretical Computer Science 25,
   95–169.
E NDRULLIS , J., G RABMAYER , C., H ENDRIKS , D., I SIHARA , A., AND K LOP, J. W. 2010. Produc-
   tivity of stream definitions. Theoretical Compututer Science 411, 4-5, 765–782.
F U , P. AND KOMENDANTSKAYA , E. 2015. A type-theoretic approach to structural resolution. In
   Proceedings, LOPSTR.
G UPTA , G., BANSAL , A., M IN , R., AND L. S IMON , A. M. 2007. Coinductive logic programming
   and its applications. In Proceedings, ICLP. 27–44.
KOMENDANTSKAYA ,            E.    ET    AL .   2015.          Implementation of S-resolution.
   http://staff.computing.dundee.ac.uk/katya/CoALP/.
KOMENDANTSKAYA , E., P OWER , J., AND S CHMIDT, M. 2014. Coalgebraic logic programming:
   from semantics to implementation. Journal of Logic and Computation.
KOWALSKI , R. A. 1974. Predicate logic as a programming language. In Information Processing 74.
   Stockholm, North Holland, 569–574.
L LOYD , J. 1988. Foundations of Logic Programming, 2nd ed. Springer-Verlag.
P FENNING , F. 2007. Logic programming. Carnegie Mellon University.
S IMON , L., BANSAL , A., M ALLYA , A., AND G UPTA , G. 2007. Co-logic programming: Extending
   logic programming with coinduction. In Proceedings, ICALP. 472–483.
T ERESE. 2003. Term Rewriting Systems. Cambridge University Press.
VAN E MDEN , M. AND KOWALSKI , R. 1976. The semantics of predicate logic as a programming
   language. Journal of the Assoc. for Comp. Mach. 23, 733–742.
VAN E MDEN , M. H. AND A BDALLAH , M. A. N. 1985. Top-down semantics of fair computations
   of logic programs. Journal of Logic Programming 2, 1, 67–75.