<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Inductive Learning in Shared Neural Multi-Spaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Edjard de Souza Mota</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacob M. Howe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Artur S. d'Avila Garcez</string-name>
          <email>a.garcezg@city.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>City, University of London</institution>
          ,
          <addr-line>London, EC1V 0HB</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidade Federal do Amazonas, Instituto de Computação, Campus Setor Norte Coroado - Manaus - AM - Brasil CEP: 69080-900</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The learning of rules from examples is of continuing interest to machine learning, since it allows generalization from fewer training examples. Inductive Logic Programming (ILP) generates hypothetical rules (clauses) from a knowledge base augmented with (positive and negative) examples. A successful hypothesis entails all positive examples and does not entail any negative example. The Shared Neural Multi-Space (Shared NeMuS) structure encodes first-order expressions in a graph suitable for ILP-style learning. This paper explores the NeMuS structure and its relationship with the Herbrand Base of a knowledge base to generate hypotheses inductively. It is demonstrated that inductive learning driven by the knowledge-base structure can be implemented successfully in the Amao cognitive agent framework, including the learning of recursive hypotheses.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>There is renewed interest in inductive logic programming as a framework for machine learning, owing to its human-like ability to infer new knowledge from background knowledge together with a small number of examples. It also comes with strong mathematical and logical underpinnings. This paper builds on [9], where a data structure (Shared NeMuS) for the representation of first-order logic was introduced. It revisits inductive logic programming and demonstrates that Shared NeMuS provides a structure that can be used to build an inductive logic programming system.</p>
      <p>A part of the Shared NeMuS structure is weightings on individual elements (atom, predicate, function). The purpose of these weightings is to provide a guide to the use of these elements, for example in theorem proving. One of the key challenges in inductive logic programming is to find good heuristics to search the hypothesis space; the long-term goal of this work is to learn weights in a training phase that can in turn be used to guide the search of the hypothesis space and improve the efficiency and success of inductive learning.</p>
      <p>The main purpose of this work is to present a new approach for Clausal Inductive Learning (CIL) using Shared NeMuS, in which the search mechanism uses the Herbrand Base (HB) to build up hypothesis candidates using inverse unification. This generalization of ground expressions to universally quantified ones is supported by the idea of regions of concepts that was explored in [9] to find refutation patterns. Here, weights are not explicitly used, but the intuition behind them is explored to define linkage patterns between predicates of the HB and the occurrences of ground terms in positive examples. Meaningless hypotheses are pruned away as a result of inductive momentum between predicates connected to positive and negative examples. This paper makes the following contributions: it is demonstrated that the Shared NeMuS data structure is a suitable structure for inductive logic programming; the Herbrand Base is used to build candidate hypotheses; chaining and abstraction for rules, including recursive rules, are given; these rules help keep the number of hypotheses small. NeMuS is designed for extension with machine learning tactics.</p>
      <p>Copyright © 2017 for this paper by its authors. Copying permitted for private and academic purposes.</p>
      <p>The remainder of this paper is structured as follows: section 2 gives some brief background on inductive logic programming and the Shared NeMuS data structure, sections 3 and 4 describe the implementation of inductive learning in Amao using the Shared NeMuS data structure, section 5 describes some related work, and section 6 discusses the work presented.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <sec id="sec-2-1">
        <title>Inductive Logic Programming (ILP)</title>
        <p>The goal of inductive logic programming (introduced in [10]) is to learn logical formulae that describe a target concept, based on a set of examples. In a typical set-up there is a knowledge base of predicates, called background knowledge (BK), along with a set of examples that the target concept should prove (positive examples, e+) and a set of examples that the target concept should not prove (negative examples, e−). The inductive logic programming problem is to search for a logical description (a hypothesis, H) of the target concept, based on the knowledge and the examples, so that the knowledge base plus the hypothesis entails the positive examples whilst not entailing the negative examples.</p>
        <p>Inductive logic programming systems implement search strategies over the space of possible hypotheses. A good heuristic is one that will a) arrive at a successful hypothesis, b) find this hypothesis quickly, and c) find a succinct hypothesis. To achieve this, hypothesis search mechanisms depend for their efficiency on the partial order of θ-subsumption [13], or on a total ordering over the Herbrand Base to constrain deductive, abductive and inductive operations [12].</p>
        <p>The approach presented in this paper for CIL is quite different. Search considers separate spaces for constant terms, predicates and clauses, and elements are indexed by their unique identification codes. The spaces are interconnected through weighted bindings pointing to the target space in which an occurrence of an element appears. This creates a network of shared spaces, because the elements are shared via bindings. As the weights are not used here, the networks shall be referred to as multi-spaces.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Shared NeMuS</title>
        <p>Shared Neural Multi-Space (Shared NeMuS) [9] takes inspiration from [2] to give a shared multi-space representation for a portion of first-order logic designed for use with machine learning and neural network methods. The structure incorporates a relative degree of importance for each element according to the element's attributes, and uses an architecture that leads to a fast implementation.</p>
        <p>Shared NeMuS uses a Smarandache multi-space [8], a union of n spaces A1, ..., An in which each Ai is the space of a distinct observed characteristic of the overall space. For each Ai there is a different metric describing a different aspect of the "major" space. There will be a space for each component of a first-order language: atomic constants (of the Herbrand Universe, space 0), functions (space 1), predicates with literal instances (space 2), and clauses (space 3). Variables are used to refer to sets of atomic terms via quantification, and they belong to the same space as atoms. In this work the function space is suppressed, since it is not dealt with in relational learning. In what follows vectors are written v, and v[i] or vi is used to refer to the element of a vector at position i.</p>
        <p>Each logical element is described by a T-Node, and in particular each element is described by a unique integer code within its space. In addition, a T-Node identifies the lexicographic occurrence of the element and (when appropriate) an attribute position.</p>
        <p>Definition 1 (T-Node and Binding). Let c, a, i ∈ ℤ and h ∈ {0, 1, 2, 3}. A T-Node (target node) is a quadruple (h, c, i, a) that identifies an object at space h, with code c and occurrence i, at attribute position a (when it applies, otherwise −1). If p is a T-Node, then nh(p) = h, nc(p) = c, na(p) = a and ni(p) = i. A NeMuS Binding is a pair (p, w)k, which represents the influence w of object k over occurrence ni(p) of object nc(p) at space nh(p) in position na(p).</p>
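        <p>Definition 1 translates directly into a small record type. The following Python sketch is purely illustrative: the type, field and accessor names are assumptions for exposition, not Amao's actual code.</p>

```python
# Illustrative sketch of Definition 1 (T-Node and Binding).
# Field names are assumptions; the -1 sentinel stands for "no attribute position".
from dataclasses import dataclass

@dataclass(frozen=True)
class TNode:
    h: int  # space: 0 = constants/variables, 1 = functions, 2 = predicates, 3 = clauses
    c: int  # unique code of the object within its space
    i: int  # lexicographic occurrence of the object
    a: int  # attribute position (-1 when it does not apply)

@dataclass(frozen=True)
class Binding:
    target: TNode  # T-Node locating the occurrence influenced by object k
    w: float       # influence weight of object k over that occurrence

# Accessor functions nh, nc, ni, na from Definition 1.
def nh(p: TNode) -> int: return p.h
def nc(p: TNode) -> int: return p.c
def ni(p: TNode) -> int: return p.i
def na(p: TNode) -> int: return p.a
```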
        <p>The elements of the subject space represent all of the occurrences of atoms
and variables in an expression.</p>
        <p>Definition 2 (Subject Space). Let C = [x1, ..., xm] and V = [y1, ..., yn], where each xi, yi is a vector of bindings. The Subject Space is the pair (C, V) of constants and variables, respectively.</p>
        <p>The function β maps a constant i to the vector of its bindings xi, as above.</p>
        <p>Higher spaces are made of structured elements, such as predicates and clauses. Such objects have attributes uniquely identified by T-Nodes referring to objects in the spaces below, together with the bindings of the objects on which they exert influence.</p>
        <p>Definition 3 (Compound). Let xia = [c1, ..., cm] be a vector of T-Nodes, one for each attribute instance of compound i, and let wi be a vector of NeMuS bindings. Then a NeMuS Compound is the pair (xia, wi). A NeMuS Compound Space (C-Space) is a vector of NeMuS Compounds.</p>
        <p>For each literal there is a C-Space to represent it. Since a literal is an instance of a predicate, the predicate space is a vector of C-Spaces. As predicates have positive and negative instances, there are two vector regions in a predicate space. A clause's attributes are the literals it is made of; as clauses exert no influence upon spaces above, for simplicity the bindings of a clause shall be empty.</p>
        <p>Definition 4 (Predicate and Clause Spaces). Let Cp+ and Cp− be two vectors of C-Spaces. The pair (Cp+, Cp−) is called the NeMuS Predicate Space. A NeMuS Clause Space is a vector of C-Spaces such that every pair in the vector shall be (xia, []).</p>
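        <p>Definitions 2-4 can be gathered into one container type. The sketch below is a hypothetical Python rendering of the spaces (all names are assumptions), useful for seeing how the triple of spaces fits together.</p>

```python
# Hypothetical sketch of Definitions 2-4: subject, predicate and clause spaces.
from dataclasses import dataclass, field
from typing import List, Tuple

TNode = Tuple[int, int, int, int]   # (h, c, i, a) as in Definition 1
Binding = Tuple[TNode, float]       # (target T-Node, influence weight)

@dataclass
class Compound:                     # Definition 3
    attrs: List[TNode]              # one T-Node per attribute instance
    bindings: List[Binding]         # influence exerted on the spaces above

CSpace = List[Compound]             # a C-Space is a vector of compounds

@dataclass
class SharedNeMuS:
    constants: List[List[Binding]] = field(default_factory=list)  # C of (C, V)
    variables: List[List[Binding]] = field(default_factory=list)  # V of (C, V)
    preds_pos: List[CSpace] = field(default_factory=list)         # Cp+
    preds_neg: List[CSpace] = field(default_factory=list)         # Cp-
    clauses: CSpace = field(default_factory=list)                 # bindings always []
```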
        <p>A Shared NeMuS for a coded first-order expression is a triple ⟨S, P, C⟩, in which S is the subject space, P is the predicate space and C is the clause space.
Example 1. Assume the following symbolic BK (adapted from [3]):
mother(pam,ann).
wife(ann,bob).
wife(eve,wallie).
brother(wallie, ann).</p>
        <p>This translates into the Shared NeMuS below (with weights at the default value of 0). The constant region of the subject space, the predicate space and the clause space are each given, with some commentary. The labels and the numbers before the colon are just to emphasise structure.</p>
        <sec id="sec-2-1-1">
          <title>Subject Space</title>
          <p>Constant region: {
1:[((2,1,1,1),0)]
2:[((2,1,1,2),0),((2,2,1,1),0),((2,3,1,2),0)]
3:[((2,2,1,2),0)]
4:[((2,2,2,1),0)]
5:[((2,2,2,2),0),((2,3,1,1),0)] }
The subject space encodes the occurrences of the constants. For example, the first entry gives the binding for pam, with code 1, stating that this atom occurs in the first instance of the first predicate (mother) as the first attribute. The second entry gives the list of bindings for the three occurrences of ann.
Predicate Space:</p>
          <p>// pam ann
{+1:[([(0,1,1,1),(0,2,1,2)],[((3,1,1,1),0)])], -1: []}
// ann bob
{+2:[([(0,2,2,1),(0,3,1,2)],[((3,2,1,1),0)]),
([(0,4,1,1),(0,5,1,2)],[((3,3,1,1),0)])], -2: []}
{+3:[([(0,5,2,1),(0,2,3,2)],[((3,4,1,1),0)])], -3: []}</p>
          <p>The predicate space encodes each predicate. For example, the first entry links to the bindings of the two constants occurring in the only clause in which mother occurs. The second entry details the two clauses for wife.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Clause Space</title>
          <p>{([(2,1,2,1)],[()]), ([(2,2,2,1)],[()]), ([(2,2,3,1)],[()]), ([(2,3,2,1)],[()])}
Here, the clauses link back to the predicates that define them. For example, the first entry says that the first clause is built from the first predicate.</p>
          <p>This simple example shows how easy it is to navigate across a Shared NeMuS structure to induce new rules from relational data or sets of literals.</p>
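          <p>This navigation can be sketched concretely. The snippet below encodes Example 1's subject space as plain tuples (a simplification of the structure above; the helper name occurrences is an assumption) and asks in which predicates, and at which argument positions, a constant occurs.</p>

```python
# Minimal sketch of navigating Example 1's subject space.
# Constant codes: pam=1, ann=2, bob=3, eve=4, wallie=5.
# Each binding is (space, predicate_code, instance, attribute_position).

subject = {
    1: [(2, 1, 1, 1)],                              # pam
    2: [(2, 1, 1, 2), (2, 2, 1, 1), (2, 3, 1, 2)],  # ann
    3: [(2, 2, 1, 2)],                              # bob
    4: [(2, 2, 2, 1)],                              # eve
    5: [(2, 2, 2, 2), (2, 3, 1, 1)],                # wallie
}
predicates = {1: "mother", 2: "wife", 3: "brother"}

def occurrences(const_code):
    """Predicates (with attribute position) in which a constant occurs."""
    return [(predicates[b[1]], b[3]) for b in subject[const_code]]

print(occurrences(2))  # [('mother', 2), ('wife', 1), ('brother', 2)]
```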
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Inductive Learning with Shared NeMuS</title>
      <p>This section describes, with the aid of running examples, how inductive learning is performed in Amao, using the Shared NeMuS structure. Amao³ is a cognitive artificial agent which originally performed just symbolic reasoning on structured clauses via Linear Resolution [16]. It was extended in [9] to generate a Shared NeMuS representation, during the compilation of the symbolic representation, for neural-symbolic reasoning purposes.</p>
      <p>Inverse unification from the HB is plausible since all ground expressions of a given concept p will be in the same region as far as weighted grounds are concerned. For instance, p(a, b), p(c, d) and p(b, e) all belong to the region of p, so p(X, Y) is a sound generalization of such instances. However, if there are other concepts involving the elements of the Herbrand Universe, say q(a, d) and r(d), then p(X, Y) is not a straightforward generalization without taking into account the combination of the regions for r and q, since their ground atoms have occurrences of constants appearing in all three concepts.</p>
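      <p>The basic generalization step, before any interaction with other concept regions is considered, can be sketched as follows. This is a naive illustration under stated assumptions (the helper name anti_unify and the atom encoding are inventions for exposition, not Amao's implementation).</p>

```python
# Naive sketch of inverse unification over one predicate region:
# ground atoms p(t1,...,tn) generalize to p(X1,...,Xn).
# An atom is encoded as (predicate_name, tuple_of_constant_terms).

def anti_unify(atoms):
    """Generalize ground atoms of a single predicate region to a quantified atom."""
    preds = {p for p, _ in atoms}
    assert len(preds) == 1, "all atoms must come from the same predicate region"
    arity = len(atoms[0][1])
    head_vars = tuple(f"X{i}" for i in range(1, arity + 1))
    return (preds.pop(), head_vars)

general = anti_unify([("p", ("a", "b")), ("p", ("c", "d")), ("p", ("b", "e"))])
print(general)  # ('p', ('X1', 'X2'))
```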
      <p>The induction algorithm proposed is guided by kinds of linkage patterns (sections 3.1 and 3.2), using only those in which there is an inductive momentum (section 3.3). The intuition is that positive examples bring the ground expressions "close" to the hypothesis to be generated, while the negative ones pull apart those which are likely to generate inconsistent hypotheses.</p>
      <p>3.1 Linear Linkage Patterns</p>
      <p>One form of the Amao operation for performing inductive learning with target predicate p/n is:
consider induction on p(X1,...,Xn) knowing p(t1,...,tn).
That is, p/n is not in the BK, and Amao will attempt to find a hypothesis H such that the positive example(s) p(t1,...,tn) can be deduced from BK ∪ H. In what follows, each logical element is referred to by its Shared NeMuS code, and recall that β maps code symbols to their bindings.</p>
      <p>³ Amao is the name of a deity that taught the people of the Camanaos tribe, who lived on the margins of the Negro River in the Brazilian part of the Amazon rainforest, the process of making mandioca powder and beiju biscuit for their diet.</p>
      <p>The BK is that used in Example 1. Suppose that the target predicate is motherInLaw/2, with positive knowledge motherInLaw(pam,bob). Then, when Amao is asked to generate hypotheses with
consider induction on motherInLaw(X,Y) knowing motherInLaw(pam,bob).
amongst the successful hypotheses should be:
motherInLaw(X, Y) ← mother(X, Z) ∧ wife(Z, Y)</p>
      <p>Parsing the positive example against the Shared NeMuS representation of the BK gives that in the constant region pam has code 1 and bob has code 3. From the constant region of the subject space the vectors of bindings β(1) and β(3) are found:
β(1) = [(1, 1, 1)] : in the first predicate, its first instance, as first attribute
β(3) = [(2, 1, 2)] : in the second predicate, its first instance, as second attribute</p>
      <p>For each element of the vector of bindings β(i), the vector of attributes of the predicate in which it occurs is found, written xa(β(i)j), where j is an index. Here,
xa(β(1)1) = [(0, 1, 1, 1), (0, 2, 1, 2)] and xa(β(3)1) = [(0, 2, 2, 1), (0, 3, 1, 2)]
that is, mother(pam, ann) and wife(ann, bob).</p>
      <p>The intersection between xa(β(pam)1) and xa(β(bob)1) is non-empty, since ann (code 2) occurs in both. Hence predicate codes 1 (mother) and 2 (wife) are used in the hypothesis. These are ground atoms from the HB of Example 1, and from them a new clause is built with the anti-unified target as its head (positive) literal, and with the negative literals being those found above. Inverse unification incrementally builds the anti-substitution σ⁻¹ at each step by adding literals to the body with constants substituted by the variables X and Y from the targeted head. As X appears as the first attribute of the first predicate (mother), and Y as the second attribute of the second predicate (wife), the linkage term shall be Z0. Call Z0 the hook between both literals; the final σ⁻¹ is {pam/X, bob/Y, ann/Z0}. The terms which are not the hook term are called attribute-mates. Thus the hypothesis is the following clause in Amao notation:</p>
      <p>motherInLaw(X,Y); ~mother(X,Z0); ~wife(Z0,Y)
This rule is added to the KB and its Shared NeMuS is updated accordingly.</p>
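      <p>The hook-finding step just described can be sketched in a few lines: intersect the constants of the two candidate body literals, then variabilize via the anti-substitution. This is an illustrative assumption-level sketch (the helper name find_hook is invented), not Amao's code.</p>

```python
# Sketch of the hook step for motherInLaw: the shared constant of the two
# ground body candidates becomes the linkage variable Z0.

mother = ("mother", ("pam", "ann"))  # from xa(beta(pam)1)
wife = ("wife", ("ann", "bob"))      # from xa(beta(bob)1)

def find_hook(lit1, lit2):
    """Return the constants shared by two ground literals (candidate hooks)."""
    return set(lit1[1]) & set(lit2[1])

hook = find_hook(mother, wife).pop()           # 'ann'
sigma_inv = {"pam": "X", "bob": "Y", hook: "Z0"}  # the anti-substitution
body = [(p, tuple(sigma_inv.get(t, t) for t in ts)) for p, ts in (mother, wife)]
print(body)  # [('mother', ('X', 'Z0')), ('wife', ('Z0', 'Y'))]
```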
      <p>In general this hook chain can be longer and involve more linkage predicates. Depending on the position where the linkage is formed from one literal to another, there may be different sorts of linkage pattern. Besides, there can be intermediate predicates that should not be part of the hypothesis, since they may deduce negative examples. Consider the following example; this time, for the sake of readability, the space information will be suppressed from the bindings, and the attribute position from xa.</p>
      <p>Example 2. Consider the following BK.</p>
      <p>1. parent(pam,bob).  2. parent(tom,bob).  3. parent(tom,liz).  4. parent(bob,ann).  5. parent(bob,pat).  6. parent(pat,jim).  7. parent(ann,eve).
1. male(tom).  2. male(bob).  3. male(jim).
1. female(pam).  2. female(liz).  3. female(ann).  4. female(pat).  5. female(eve).</p>
      <sec id="sec-3-6">
        <p>Codes for the logical elements, in the order they are read or scanned, are: predicates parent = 1, male = 2, female = 3; constants pam = 1, bob = 2, tom = 3, liz = 4, ann = 5, pat = 6, jim = 7, eve = 8.</p>
        <p>The induction Amao is requested to perform is
consider induction on hasDaughter(X) knowing hasDaughter(ann) ~hasDaughter(pat).</p>
        <p>First find the bindings associated with the positive and negative examples:
β+(ann) = [(1, 4, 2), (1, 7, 1), (3, 3, 1)] and β−(pat) = [(1, 5, 2), (1, 6, 1), (3, 4, 1)]
nc(β(ann)1) = nc(β(pat)1) = 1, and their positions are the same in the predicate attributes, given by na(β(ann)1) = na(β(pat)1) = 2. Both appear along with the same constant bob (code 2):</p>
        <p>xa(β(ann)1) = [(0, 2, 3), (0, 5, 1)] and xa(β(pat)1) = [(0, 2, 4), (0, 6, 1)]
This path would give hypotheses which make hasDaughter(pat) deducible, which is not desirable since hasDaughter(pat) ∈ e−. Call this an inconsistent path; the instance is dropped and another selected.</p>
        <p>nc(β(ann)2) = nc(β(pat)2) = 1 and na(β(ann)2) = na(β(pat)2) = 1. This time their attribute-mates are different: eve (code 8) for ann and jim (code 7) for pat.
xa(β(ann)2) = [(0, 5, 2), (0, 8, 1)] and xa(β(pat)2) = [(0, 6, 2), (0, 7, 1)]
This splits the path into two branches, and it cannot be said at this stage whether both will lead to atomic sentences belonging to the Herbrand Base, and thus deducible. Call this a plausible path. From this point the first literal for the body of the hypothesis can be considered as a generalization of parent(ann, eve), i.e. parent(X, Z0) (where {ann/X, eve/Z0} is the inverse or anti-unification of terms); it has to be confirmed by pruning away any possible predicate found in the bindings of jim onwards.</p>
        <p>β+(eve) = [(1, 7, 2), (3, 5, 1)] and β−(jim) = [(1, 6, 2), (2, 3, 1)]
Their first bindings fall into the same inconsistent path as in the case of ann and pat, and so they must be dropped. However, their second occurrences happen at different predicates, as shown by</p>
        <p>nc(β(eve)2) = 3 (for female) and nc(β(jim)2) = 2 (for male)
This means that the path has reached a state of positive path only, meaning that predicate 2 (male) can be dropped and female(eve) ends the search for this branch. Add to the body the general formula female(Z0). The search for a hypothesis might stop here and Amao would learn
hasDaughter(X) ← parent(X, Z0) ∧ female(Z0)
This hypothesis meets the desired definition. If search is continued, further hypotheses might be found, such as
hasDaughter(X) ← parent(X, Z0) ∧ female(Z0) ∧ female(X)
which is also consistent with BK, e+ and e−.</p>
        <p>Algorithm 1 LinkagePattern(pk, pk+1 are T-Nodes)
1: if nc(pk) = nc(pk+1) then  ▷ possible recursive pattern
2:   if xa(pk) ∩ xa(pk+1) ≠ ∅ then
3:     if na(pk) &lt; na(pk+1) then  ▷ position of ak is less than position of ak+1
4:       return linear and recursive
5:     else if na(pk) = na(pk+1) then
6:       return sink hook
7:     else
8:       return side hook pattern
9:   else
10:     if na(pk) &lt; na(pk+1) then  ▷ position of ak is less than position of ak+1
11:       return linear or recursive
12:     else if na(pk) = na(pk+1) then
13:       return deep sink hook
14:     else
15:       return long side hook
16: else
17:   if xa(pk) ∩ xa(pk+1) = ∅ then
18:     return unknown hook
19:   else
20:     return short linear hook</p>
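        <p>Algorithm 1 can be transcribed almost line-for-line into Python. The sketch below assumes the accessors nc and na and the attribute-set function xa are supplied by the caller (their concrete form here is an assumption for illustration).</p>

```python
# Transcription of Algorithm 1 (LinkagePattern) as a Python function.
# nc, na and xa are passed in as callables over T-Nodes.

def linkage_pattern(pk, pk1, nc, na, xa):
    """Classify the linkage between two consecutive T-Nodes pk and pk+1."""
    if nc(pk) == nc(pk1):             # same predicate code: possible recursion
        if xa(pk) & xa(pk1):          # shared attributes (non-empty intersection)
            if na(pk1) > na(pk):      # ak's position precedes ak+1's
                return "linear and recursive"
            if na(pk) == na(pk1):
                return "sink hook"
            return "side hook pattern"
        if na(pk1) > na(pk):
            return "linear or recursive"
        if na(pk) == na(pk1):
            return "deep sink hook"
        return "long side hook"
    if not (xa(pk) & xa(pk1)):        # different predicates, no shared attributes
        return "unknown hook"
    return "short linear hook"
```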
        <p>3.2 Recursive Linkage Pattern</p>
        <p>Suppose Amao is asked to generate hypotheses for ancestor(X,Y) with positive background knowledge ancestor(pam,jim):
consider induction on ancestor(X,Y) knowing ancestor(pam,jim).
Although there is a long linear linkage from pam to jim, the successful expected hypotheses should be recursive, having parent(X, Y) as base. Amao generates these as learned hypotheses by following the same steps as for linear linkage, except that the first pair of pk and pk+1 from β(ak) and β(ak+1) is checked for equality. In other words, if nc(pk) ≠ nc(pk+1), then the linkage pattern is linear and proceeds as above. Otherwise it is possible that there is a recursive pattern. Algorithm 1 diagnoses which linkage pattern to apply (including some not discussed here, as they are intended to be used within a neural network learning extension of the current method for large background knowledge). In Algorithm 1, na(pk) and na(pk+1) are the attribute positions for ak and ak+1, respectively.</p>
        <sec id="sec-3-6-1">
          <title>3.3 Inductive Momentum</title>
          <p>The definition of ILP and all implementations use negative examples e− mostly as a test case to check whether a generated hypothesis H plus BK will entail e−. If so, the cause of the undesired deduction is detected and fixed, and a new hypothesis is generated. The approach in this work is different: negative examples can be used to prune away candidates for generalized body literals if they have been reached from the bindings of terms from e− (call them negative terms).</p>
          <p>The inductive momentum between two T-Nodes, a+ of xa+ and a− of xa−, given a partial σ⁻¹, is given by Algorithm 2.</p>
          <p>Algorithm 2 InductiveMomentum(xa+, a+, xa−, a−)
1: if no attribute of xa+ has an anti-substitution in σ⁻¹ then
2:   return useless path
3: if nc(a+) = nc(a−) then
4:   if na(a+) = na(a−) then
5:     if the attribute-mates of a+ in xa+ and of a− in xa− are equal then
6:       return inconsistent path
7:     else
8:       return plausible path
9:   else
10:     return plausible path
11: return positive path only</p>
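          <p>Algorithm 2 can be rendered as a Python function. All callables below (nc, na, mates) are assumptions injected by the caller, and the reconstruction of the final case (differing predicate codes yield a positive-only path, as in the eve/jim step above) follows the worked example rather than verified source code.</p>

```python
# Hedged rendering of Algorithm 2 (InductiveMomentum).
# sigma_inv is the partial anti-substitution built so far.

def inductive_momentum(xa_pos, a_pos, xa_neg, a_neg, sigma_inv, nc, na, mates):
    """Classify the path between a positive-side and negative-side T-Node."""
    if not any(t in sigma_inv for t in xa_pos):
        return "useless path"              # no attribute is anti-substituted yet
    if nc(a_pos) == nc(a_neg):             # same predicate code
        if na(a_pos) == na(a_neg):         # same attribute position
            if mates(xa_pos, a_pos) == mates(xa_neg, a_neg):
                return "inconsistent path" # equal attribute-mates: would entail e-
            return "plausible path"
        return "plausible path"
    return "positive path only"            # predicates differ: drop negative side
```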
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The Inductive Learning Algorithm</title>
      <p>The induction algorithm presented in Algorithm 3 works as a search guided by the linkage pattern (Algorithm 1) and inductive momentum (Algorithm 2). Algorithm 3 is given a Shared NeMuS N, a target P with code p/n, and coded positive and negative examples e+ and e−, with their respective coded terms being ak+ (possibly ak+1+) and ak− (possibly ak+1−).</p>
      <p>Algorithm 3 InductiveLearn(N, pk, pk+1 are T-Nodes)</p>
      <p>Note that step 8 guarantees that ground atoms common to the e+ and e− derivation chains are left out of the partial clause hypothesis building process. This avoids inconsistency by not allowing the generation of hypotheses that would satisfy e− along with BK.</p>
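      <p>Since the body of Algorithm 3 is not reproduced above, the outer loop can only be sketched speculatively from the surrounding description: pair positive and negative bindings, classify each pair with the inductive momentum, drop inconsistent paths, and generalize the survivors into body literals. Every name below is an assumption.</p>

```python
# Speculative driver sketch of Algorithm 3's outer loop, based on the prose:
# classify is Algorithm 2 (inductive momentum); generalize variabilizes a
# ground atom via the current anti-substitution.

def inductive_learn(pos_bindings, neg_bindings, classify, generalize):
    """Grow a clause body from binding pairs that survive inductive momentum."""
    hypothesis_body = []
    for a_pos, a_neg in zip(pos_bindings, neg_bindings):
        verdict = classify(a_pos, a_neg)
        if verdict == "inconsistent path":
            continue  # step 8: leave out ground atoms shared by e+ and e- chains
        if verdict in ("plausible path", "positive path only"):
            hypothesis_body.append(generalize(a_pos))
    return hypothesis_body
```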
      <p>Consider again the example with ancestor/2 for an intuitive idea of how negative examples are used as an inductive momentum. Everything else is left out of the hypothesis, since it will not be part of the HB that satisfies the positive examples e+ plus BK and the hypothesis. The process builds the hypothesis by selecting, from the intersection, those literals that meet one of the linkage patterns and are not eliminated as a result of an inductive momentum.
&gt;&gt; consider induction on ancestor(X,Y) knowing ancestor(pam,jim).
--&gt; Consider using these hypotheses...
ancestor(X,Y); ~parent(X,Y).
ancestor(X,Y); ~parent(X,Z0); ~ancestor(Z0,Y).</p>
    </sec>
    <sec id="sec-5">
      <title>Related Work</title>
      <p>Inductive logic programming has a large literature, from its antecedents in inductive generalization [14], through work on search strategies and the logical framework that the learning sits in, to the building of systems and their application to specific problems. Inductive learning continues to be of interest in a wide range of contexts and applications, as recently surveyed in [5].</p>
      <p>Of particular relevance is the work in [4], which investigates path-based algorithms to generate relationships, and [15], which uses inductive logic programming concepts in an early instance of theory repair (that is, revising a theory that is incorrect so that the counter-examples are no longer such). Additionally [6], which investigates variations on the standard anti-unification algorithm and how these impact the efficiency of hypothesis search, is of interest in the current context. More recently, in [1] Boolean constraints are used to describe undesirable areas of the hypothesis search space, and solving these to prune the search space achieves significant speed-ups over older inductive logic programming systems on a range of examples, whilst retaining accuracy.</p>
      <p>The higher-order approach taken in [11, 12] uses a meta-interpreter with iterative deepening to build Metagol. Metagol has had success on a range of examples, including learning a subclass of context-free grammars from examples and inferring relationships in sports data (whilst also uncovering an error in the data representation). This includes predicate invention, a topic that these papers suggest has not been paid due attention in inductive learning.</p>
    </sec>
    <sec id="sec-6">
      <title>Discussion and Future Work</title>
      <p>This paper has shown how the Amao Shared NeMuS data structure can be used to build an inductive logic programming system, which has been successfully applied to some small trial examples.</p>
      <p>The results on inductive learning in Amao show that using its shared structure leads to reliable hypothesis generation, in the sense that the minimally correct hypotheses are generated. However, it still generates additional hypotheses, logically sound and correct with respect to the Herbrand Base derivation. Most important is the size of the set of hypotheses generated, which is small in comparison with the literature, e.g. [3].</p>
      <p>Future work will focus on two areas. First, the power of the shared structure to allow a fast implementation of inductive inference. Second, the weights incorporated in the Shared NeMuS structure (not used in the current paper) will play an important role in providing heuristics. In [9] it is shown how these weights can be updated in a manner inspired by self-organising maps [7]. The propagation across the network of nodes in the structure allows the weights to capture patterns of refutation. It should be possible to capture negative examples in inductive logic programming in the network in this way, guiding search away from these unfruitful regions to fine-tune to a small set of generated hypotheses. Alongside improved hypothesis search, the use of the weighted structure to drive predicate invention (adding to the knowledge base additional inferred predicates contained in neither it nor the target predicate) will be investigated.</p>
      <sec id="sec-6-1">
        <title>Acknowledgement</title>
        <p>The authors would like to thank Gerson Zaverucha for fruitful discussion and
guidance through emails about Inductive Logic Programming.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Ahlgren, J., Yuen, S.Y.: Efficient Program Synthesis Using Constraint Satisfaction in Inductive Logic Programming. Journal of Machine Learning Research 14, 3649-3681 (2013)</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. <string-name><surname>Boyer</surname>, <given-names>R.S.</given-names></string-name>, <string-name><surname>Moore</surname>, <given-names>J.S.</given-names></string-name>: <article-title>The Sharing of Structure in Theorem-Proving Programs</article-title>. In: <source>Machine Intelligence</source> <volume>7</volume>, pp. <fpage>101</fpage>–<lpage>116</lpage>. Edinburgh University Press (<year>1972</year>)</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. <string-name><surname>Bratko</surname>, <given-names>I.</given-names></string-name>: <source>Prolog Programming for Artificial Intelligence, 4th edition</source>. Addison Wesley (<year>2011</year>)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. <string-name><surname>Bunescu</surname>, <given-names>R.C.</given-names></string-name>, <string-name><surname>Mooney</surname>, <given-names>R.J.</given-names></string-name>: <article-title>A Shortest Path Dependency Kernel for Relation Extraction</article-title>. In: <source>Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing</source>, pp. <fpage>724</fpage>–<lpage>731</lpage>. Association for Computational Linguistics (<year>2005</year>)</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. <string-name><surname>Gulwani</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hernandez-Orallo</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Kitzelmann</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Muggleton</surname>, <given-names>S.H.</given-names></string-name>, <string-name><surname>Schmid</surname>, <given-names>U.</given-names></string-name>, <string-name><surname>Zorn</surname>, <given-names>B.G.</given-names></string-name>: <article-title>Inductive Programming Meets the Real World</article-title>. <source>Communications of the ACM</source> <volume>58</volume>(<issue>11</issue>), <fpage>90</fpage>–<lpage>99</lpage> (<year>2015</year>)</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. <string-name><surname>Idestam-Almquist</surname>, <given-names>P.</given-names></string-name>: <article-title>Generalization under Implication by Recursive Antiunification</article-title>. In: <source>International Conference on Machine Learning</source>, pp. <fpage>151</fpage>–<lpage>158</lpage>. Morgan-Kaufmann (<year>1993</year>)</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. <string-name><surname>Kohonen</surname>, <given-names>T.</given-names></string-name>: <source>Self-Organizing Maps</source>. Springer, 3rd edn. (<year>2001</year>)</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. <string-name><surname>Mao</surname>, <given-names>L.</given-names></string-name>: <article-title>An introduction to Smarandache multi-spaces and mathematical combinatorics</article-title>. <source>Scientia Magna</source> <volume>3</volume>(<issue>1</issue>), <fpage>54</fpage>–<lpage>80</lpage> (<year>2007</year>)</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>9. <string-name><surname>Mota</surname>, <given-names>E.d.S.</given-names></string-name>, <string-name><surname>Diniz</surname>, <given-names>Y.B.</given-names></string-name>: <article-title>Shared Multi-Space Representation for Neural-Symbolic Reasoning</article-title>. In: <string-name><surname>Besold</surname>, <given-names>T.R.</given-names></string-name>, <string-name><surname>Lamb</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Serafini</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Tabor</surname>, <given-names>W.</given-names></string-name> (eds.) NeSy 2016, vol. <volume>1768</volume>. CEUR Workshop Proceedings (July <year>2016</year>)</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. <string-name><surname>Muggleton</surname>, <given-names>S.H.</given-names></string-name>: <article-title>Inductive Logic Programming</article-title>. <source>New Generation Computing</source> <volume>8</volume>(<issue>4</issue>), <fpage>295</fpage>–<lpage>318</lpage> (<year>1991</year>)</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>11. <string-name><surname>Muggleton</surname>, <given-names>S.H.</given-names></string-name>, <string-name><surname>Lin</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Pahlavi</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Tamaddoni-Nezhad</surname>, <given-names>A.</given-names></string-name>: <article-title>Meta-interpretive learning: application to grammatical inference</article-title>. <source>Machine Learning</source> <volume>94</volume>(<issue>1</issue>), <fpage>25</fpage>–<lpage>49</lpage> (<year>2014</year>)</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>12. <string-name><surname>Muggleton</surname>, <given-names>S.H.</given-names></string-name>, <string-name><surname>Lin</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Tamaddoni-Nezhad</surname>, <given-names>A.</given-names></string-name>: <article-title>Meta-interpretive learning of higher-order dyadic datalog: predicate invention revisited</article-title>. <source>Machine Learning</source> <volume>100</volume>(<issue>1</issue>), <fpage>49</fpage>–<lpage>73</lpage> (<year>2015</year>)</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>13. <string-name><surname>Nienhuys-Cheng</surname>, <given-names>S.H.</given-names></string-name>, <string-name><surname>De Wolf</surname>, <given-names>R.</given-names></string-name>: <source>Foundations of Inductive Logic Programming, Lecture Notes in Artificial Intelligence</source>, vol. <volume>1228</volume>. Springer (<year>1997</year>)</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>14. <string-name><surname>Plotkin</surname>, <given-names>G.D.</given-names></string-name>: <article-title>A Note on Inductive Generalization</article-title>. In: <source>Machine Intelligence</source> <volume>5</volume>, pp. <fpage>153</fpage>–<lpage>164</lpage>. Edinburgh University Press (<year>1969</year>)</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>15. <string-name><surname>Richards</surname>, <given-names>B.L.</given-names></string-name>, <string-name><surname>Mooney</surname>, <given-names>R.J.</given-names></string-name>: <article-title>Automated Refinement of First-Order Horn-Clause Domain Theories</article-title>. <source>Machine Learning</source> <volume>19</volume>(<issue>2</issue>), <fpage>95</fpage>–<lpage>131</lpage> (<year>1995</year>)</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>16. <string-name><surname>Robinson</surname>, <given-names>J.A.</given-names></string-name>: <article-title>A machine-oriented logic based on the resolution principle</article-title>. <source>Journal of the ACM</source> <volume>12</volume>(<issue>1</issue>), <fpage>23</fpage>–<lpage>42</lpage> (<year>1965</year>)</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>