<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Propositional Rule Extraction from Neural Networks under Background Knowledge</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maryam Labaf</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pascal Hitzler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anthony B. Evans</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data Semantics (DaSe) Laboratory, Wright State University</institution>
          ,
          <addr-line>Dayton, OH</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Math. and Stat., Wright State University</institution>
          ,
          <addr-line>Dayton, OH</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>It is well-known that the input-output behaviour of a neural network can be recast in terms of a set of propositional rules, and under certain weak preconditions this is also always possible with positive (or definite) rules. Furthermore, in this case there is in fact a unique minimal (technically, reduced) set of such rules which perfectly captures the input-output mapping. In this paper, we investigate to what extent these results and corresponding rule extraction algorithms can be lifted to take additional background knowledge into account. It turns out that uniqueness of the solution can then no longer be guaranteed. However, the background knowledge often makes it possible to extract simpler, and thus more easily understandable, rulesets which still perfectly capture the input-output mapping.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The study of rule extraction from trained artificial neural networks [
        <xref ref-type="bibr" rid="ref15 ref2 ref8">2,8,15</xref>
        ] addresses the desire to make the learned knowledge accessible to human interpretation and formal assessment. Essentially, in the propositional case, activations of input and output nodes are discretized by introducing an arbitrary threshold. Each node is interpreted as a propositional variable; activations above the threshold are interpreted as this variable being "true", while activations below the threshold are interpreted as this variable being "false". If ℐ denotes the power set (i.e., the set of all subsets) of the (finite) set B of all propositional variables corresponding to the nodes, then the input-output function of the network can be understood as a function f : ℐ → ℐ. For I ∈ ℐ, we interpret each p ∈ I as being "true" and all p ∉ I as being "false". The set f(I) then contains exactly those propositional variables which are "true" (or activated) in the output layer.
      </p>
      <p>In propositional rule extraction, one now seeks sets Pf of propositional rules
(i.e., propositional Horn clauses) which capture or approximate the input-output
mapping f . In order to obtain such sets, there exist two main lines of approaches.</p>
      <p>
        The first is introspective and seeks to construct rules out of the weights associated with the connections between nodes in the network, usually proceeding in a layer-by-layer fashion [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The second is to regard the network as a black box and to consider only the input-output function f. This was, e.g., done in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], where it was shown, amongst other things, that a positive (or definite) ruleset can always be extracted if the mapping f is monotonic, and that there is indeed a unique reduced such ruleset; we will provide sufficient technical details about these preliminary results in the next section.
      </p>
      <p>However, rulesets extracted with either method are prone to be large and complex, i.e., from inspection of these rulesets it is often difficult to obtain real insights into what the network has learned. In this paper, we thus investigate rule extraction under the assumption that there is additional background knowledge which can be connected to network node activations, with the expectation that such background knowledge will make it possible to formulate simpler rulesets which still explain the input-output functions of the networks, if the background knowledge is also taken into account.</p>
      <p>
        The motivation for this line of work is the fact that in recent years there has been a very significant increase in the availability of structured data on the World Wide Web, i.e., it becomes easier and easier to actually find such structured knowledge for all different kinds of application domains. That this is the case is, among other things, a result of recent developments in the field of the Semantic Web [
        <xref ref-type="bibr" rid="ref12 ref4">4,12</xref>
        ], which is concerned with data sharing, discovery, integration and reuse, and where corresponding standards, methods and tools are being developed.
      </p>
      <p>
        E.g., structured data in the form of knowledge graphs, usually encoded using the W3C standards RDF [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and OWL [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], has been made available in ever increasing quantities for over 10 years [
        <xref ref-type="bibr" rid="ref17 ref5">5,17</xref>
        ]. Other large-scale datasets include Wikidata [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and data coming from the schema.org [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] effort, which is driven by major Web search engine providers.
      </p>
      <p>In order to motivate the rest of the paper, consider the following very simple example. Assume that the input-output mapping P of the neural network without background knowledge is</p>
      <p>p1 ∧ q → r</p>
      <p>p2 ∧ q → r</p>
      <p>and that we also have background knowledge K in the form of the rules</p>
      <p>p1 → p</p>
      <p>p2 → p.</p>
      <p>We then obtain the simplified input-output mapping PK, taking background knowledge into account, as</p>
      <p>p ∧ q → r.</p>
      <p>The example already displays a key insight into why background knowledge can lead to simpler extracted rulesets: in the example just given, p serves as a "more general" proposition. E.g., p1 could stand for "is an apple" and p2 for "is a banana", while p could stand for "is a fruit". If we now also take q to stand for "is ripe" and r to stand for "can be harvested", then we obtain a not-so-abstract toy example, where the background knowledge facilitates a simplification because it captures both apples and bananas using the more general concept "fruit".</p>
      <p>
        In this paper, we will formally define the setting for which we just gave an initial example. We will furthermore investigate to what extent we can carry over the results regarding positive rulesets from [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] to this new scenario with background knowledge. We will see that pleasing theoretical results such as uniqueness of a solution no longer hold. However, existence of solutions can still be guaranteed under the same mild conditions as in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and we will still be able to obtain algorithms for extracting corresponding rulesets.
      </p>
      <p>
        The rest of the paper will be structured as follows. In Section 2 we will
introduce notation as needed and recall preliminary results from [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In Section 3
we present the results of our investigation into adding background knowledge.
      </p>
      <p>In Section 4, we briefly discuss related work, and in Section 5 we conclude and discuss avenues for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        We recall notation and some results from [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] which will be central for the rest of
the paper. For further background on notions concerning logic programs, cf. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>As laid out in the introduction, let B be a finite set of propositional variables, let ℐ be the power set of B, and consider functions f : ℐ → ℐ as discretizations of input-output functions of trained neural networks. In this paper, we consider only positive (or definite) propositional rules, which are of the form p1 ∧ ... ∧ pn → q, where q and all pi are propositional variables. A set P of such rules is called a (propositional) logic program. For such a rule, we call q the head of the rule, and p1 ∧ ... ∧ pn the body of the rule.</p>
      <p>A logic program P is called reduced if all of the following hold.
1. For every rule p1 ∧ ... ∧ pn → q in P, all pi are mutually distinct.
2. There are no two distinct rules p1 ∧ ... ∧ pn → q and r1 ∧ ... ∧ rm → q in P with {p1, ..., pn} ⊆ {r1, ..., rm}.</p>
      <p>To every propositional logic program P over B we can associate a semantic operator T_P, called the immediate consequence operator, which is the function T_P : ℐ → ℐ defined by</p>
      <p>T_P(I) = {q | there exists a rule p1 ∧ ... ∧ pn → q in P with {p1, ..., pn} ⊆ I}.</p>
      <p>This operator is well-known to be monotonic in the sense that whenever I ⊆ J, then T_P(I) ⊆ T_P(J).</p>
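For concreteness, the operator can be written out in a few lines. This is a direct transcription of the definition; representing rules as (body, head) pairs is our choice:

```python
def t_p(rules, I):
    """Immediate consequence operator T_P: the heads of all rules
    whose body is contained in the interpretation I."""
    I = set(I)
    return {head for body, head in rules if body <= I}

# Monotonicity on a small example: I ⊆ J implies T_P(I) ⊆ T_P(J).
P = [(frozenset({"p1", "p2"}), "q"), (frozenset({"p1"}), "r")]
assert t_p(P, {"p1"}) <= t_p(P, {"p1", "p2"})
```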
      <p>We make some additional mild assumptions: We assume that the propositional variables used to represent input and output nodes are distinct, i.e., each propositional variable is used either to represent an input node or an output node, but not both. Technically, this means that B can be partitioned into two sets B1 and B2, i.e., B = B1 ∪ B2 with B1 ∩ B2 = ∅, and we obtain the corresponding power sets ℐ1 and ℐ2 such that T_P : ℐ1 → ℐ2.</p>
      <p>While the definition of the immediate consequence operator just presented is very common in the literature, we will now give a different but equivalent formalization, which will help us in this paper. For any I = {p1, ..., pn} ⊆ B, let c(I) = p1 ∧ ... ∧ pn. In fact, whenever I ⊆ B, in the following we will often simply write I although we may mean c(I), and the context will make clear which is meant.</p>
      <sec id="sec-2-1">
        <title>Algorithm 1: Reduced Definite Program Extraction</title>
        <p>where ⊨ denotes entailment in propositional logic. Please note that we use another common notational simplification: I ∧ P is used to denote I ∧ ⋀_{R ∈ P} R.</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], the following was shown.
        </p>
        <p>Theorem 1. Let f : ℐ1 → ℐ2 be monotonic. Then there exists a unique reduced logic program P with T_P = f. Furthermore, this logic program can be obtained using Algorithm 1.</p>
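A sketch of such an extraction, reconstructed from the theorem statement rather than from the paper's pseudocode: for a monotonic f, the reduced program consists of the rules I → q for which I is ⊆-minimal with q ∈ f(I). Enumerating input sets by increasing size makes the minimality check a simple subsumption test:

```python
from itertools import combinations

def powerset_by_size(s):
    s = sorted(s)
    return [frozenset(c) for k in range(len(s) + 1) for c in combinations(s, k)]

def extract_reduced(f, b1):
    """Rules (body, head) whose body is a subset-minimal input set
    for which f produces the head; assumes f is monotonic."""
    rules = []
    for I in powerset_by_size(b1):   # enumerated by increasing size
        for q in sorted(f(I)):
            # keep I -> q only if no kept rule for q already subsumes it
            if not any(body <= I and head == q for body, head in rules):
                rules.append((I, q))
    return rules

# Example: the mapping induced by {p1 ∧ p2 -> q, p1 -> q} reduces to {p1 -> q}.
f = lambda I: {"q"} if "p1" in I else set()
assert extract_reduced(f, {"p1", "p2"}) == [(frozenset({"p1"}), "q")]
```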
        <p>If we drop the precondition that f be monotonic, then Theorem 1 no longer holds, because, as mentioned above, immediate consequence operators are always monotonic.</p>
        <p>We will now investigate Theorem 1 when considering additional background
knowledge. It will be helpful to have the following corollary from Theorem 1 at
hand.</p>
        <p>Theorem 2. Given a logic program P, there is always a unique reduced logic program Q with T_P = T_Q.</p>
        <p>Proof. Given P , we know that TP is monotonic. Now apply Theorem 1.</p>
        <p>Let us give an example for reducing a given program. Let B1 = {p1, p2, p3} and B2 = {q1, q2} be input and output sets, respectively, and consider the logic program P given as</p>
      </sec>
      <sec id="sec-2-2">
        <title>Applying Algorithm 1 then yields the reduced program</title>
        <p>We consider the following setting. Assume P is a logic program which captures the input-output function of a trained neural network according to Theorem 1. Let furthermore K be a logic program which constitutes our background knowledge, and which may use additional propositional variables, i.e., propositional variables not occurring in P. We then seek a logic program PK such that, for all I ∈ ℐ1, we have</p>
        <p>{q ∈ B2 | I ∧ P ⊨ q} = {q ∈ B2 | I ∧ K ∧ PK ⊨ q}.   (1)</p>
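Condition (1) can be tested directly for concrete programs. A small helper (names ours), using forward chaining to the least model as the entailment check for definite programs:

```python
from itertools import combinations

def closure(facts, rules):
    """Least model of a definite program; rules are (body, head) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def is_solution(pk, p, k, b1, b2):
    """Check equation (1) for every input I ⊆ B1."""
    subsets = [set(c) for n in range(len(b1) + 1)
               for c in combinations(sorted(b1), n)]
    return all(closure(I, p) & b2 == closure(I, k + pk) & b2 for I in subsets)

# P itself is a solution for (P, K) when K only maps inputs
# to fresh variables (cf. the standing assumptions).
P = [(frozenset({"p1"}), "q1")]
K = [(frozenset({"p1"}), "r1")]
assert is_solution(P, P, K, {"p1"}, {"q1"})
```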
      </sec>
      <sec id="sec-2-3">
        <title>In this case, we call PK a solution for (P, K).</title>
        <p>3.1 Existence of Solutions</p>
        <p>We next make two more mild assumptions, namely (1) that no propositional variable from B2 appears in K, and (2) that propositional variables from B1 appear only in bodies of rules in K. The first is easily justified by the use case: since we want to explain the network behaviour, the occurrence of variables from B2 in K would bypass the network. The second is also easily justified by the use case, which indicates that network input activations should be our starting point, i.e., the activations should not be altered by the background knowledge.</p>
        <p>If we drop assumption (2) just stated, then existence of a solution cannot be guaranteed: Let B1 = {p1, p2} and B2 = {q1, q2}. Then, for the given programs P = {p1 → q1, p2 → q2} and K = {p1 → p2}, there is no solution for (P, K). To see this, assume that PK is a solution for (P, K). Then, because p2 ∧ P ⊨ q2, we obtain that p2 ∧ K ∧ PK ⊨ q2. But then p1 ∧ K ∧ PK ⊨ q2 although p1 ∧ P ⊭ q2, i.e., PK cannot be a solution for (P, K).</p>
        <p>If condition (2) from above is assumed, though, a solution always exists.</p>
        <p>Proposition 1. Under our standing assumptions on given logic programs P and K, there always exists a solution for (P, K) which is reduced.</p>
        <p>Proof. Because rule heads from K never appear in P, we obtain</p>
        <p>{q ∈ B2 | I ∧ P ⊨ q} = {q ∈ B2 | I ∧ K ∧ P ⊨ q}</p>
        <p>for all I ∈ ℐ1, i.e., P is always a solution for (P, K). Existence of a reduced solution then follows from Theorem 2.</p>
        <p>Our interest of course lies in determining other solutions which are simpler than P.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Algorithm 2: Construct all reduced solutions for (P, K)</title>
        <p>Proposition 2. There exist logic programs P and K which satisfy our standing assumptions, such that there are two distinct reduced solutions for (P, K).</p>
        <p>Proof. Let B1 = {p1, p2, p3} and B2 = {q}. Then consider the programs P as and K as</p>
      </sec>
      <sec id="sec-2-5">
        <title>The two logic programs</title>
        <p>p2 ∧ p3 → q</p>
        <p>We first present a naive algorithm for computing all reduced solutions for a given (P, K). It is given as Algorithm 2 and uses a brute-force approach: it checks, for every logic program that can be constructed over the given propositional variables, whether it constitutes a solution for (P, K). For each such solution, it then invokes Algorithm 1 to obtain a corresponding reduced program, which is added to the solution set. The algorithm is quite obviously correct and always terminating, and we skip a formal proof of this.</p>
        <p>The given algorithm is of course too naive to be practically useful for anything other than toy examples. Still, it is worst-case optimal, as the following theorem shows; note that Algorithm 2 has exponential runtime because of line 4.</p>
        <p>Theorem 3. The problem of finding all solutions to (P, K) is worst-case exponential in the combined size of P and K.</p>
        <p>Proof. Let n be any positive integer. Define the logic program Pn to consist of the single rule p1 ∧ ... ∧ pn → q, and let</p>
        <p>Kn = {pi → r_{i,1}, pi → r_{i,2} | i = 1, ..., n}.</p>
        <p>Then, for any function f : {1, ..., n} → {1, 2}, the logic program</p>
        <p>Pf = {r_{1,f(1)} ∧ ... ∧ r_{n,f(n)} → q}</p>
        <p>is a reduced solution for (Pn, Kn). Since there exist 2^n distinct such functions f, the number of reduced solutions in this case is 2^n, so their production is exponential in n, while the combined size of Pn and Kn grows only linearly in n.</p>
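The construction can be replayed in code. The sketch below builds Pn and Kn as in the proof and verifies, by forward chaining over all inputs, that every choice function f yields a solution Pf (the representation and helper names are ours):

```python
from itertools import combinations, product

def closure(facts, rules):
    """Least model of a definite program; rules are (body, head) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def solutions_for(n):
    """All programs P_f from the proof that really solve (P_n, K_n)."""
    b1 = {f"p{i}" for i in range(1, n + 1)}
    Pn = [(frozenset(b1), "q")]
    Kn = [(frozenset({f"p{i}"}), f"r{i},{j}")
          for i in range(1, n + 1) for j in (1, 2)]
    subsets = [set(c) for k in range(n + 1)
               for c in combinations(sorted(b1), k)]
    sols = []
    for f in product((1, 2), repeat=n):   # one choice function per tuple
        Pf = [(frozenset(f"r{i},{fi}" for i, fi in enumerate(f, 1)), "q")]
        if all(closure(I, Pn) & {"q"} == closure(I, Kn + Pf) & {"q"}
               for I in subsets):
            sols.append(Pf)
    return sols

assert len(solutions_for(3)) == 2 ** 3   # 8 distinct reduced solutions for n = 3
```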
        <p>A more efficient algorithm for obtaining only one reduced solution is given as Algorithm 3. It is essentially a combination of Algorithms 1 and 2.</p>
        <p>Proposition 3. Algorithm 3 is correct and always terminating.</p>
        <p>Proof. Like Algorithm 1, Algorithm 3 checks all combinations of I ∈ ℐ1 and q ∈ T_P(I) and makes sure that there are rules in the output program such that I ∧ K ∧ S ⊨ q. The candidate rules for the output program are checked one by one in order of increasing length until a suitable one is found. Note that the rule I → q is going to be checked at some stage, i.e., the algorithm will either choose this rule or a shorter one, but in any case we will eventually have I ∧ K ∧ S ⊨ q. This shows that the algorithm always terminates and that we obtain I ∧ K ∧ S ⊨ q for all q ∈ T_P(I).</p>
        <p>In order to demonstrate that the algorithm output S is indeed a solution for (P, K), we also need to show that for all q ∈ B2 and H ∈ ℐ1 we have that H ∧ K ∧ S ⊨ q implies q ∈ T_P(H). This is in fact guaranteed by line 11 of Algorithm 3, i.e., the algorithm output S is indeed a solution for (P, K).</p>
        <p>We finally show that the output of the algorithm is reduced. Assume otherwise. Then there are rules I1 → q and J → q in S with I1 ⊊ J. By our condition on the ordering of candidate rules, I1 is considered before J, and so we know that I1 → q was added to S earlier in the algorithm than J → q. Now let us look at the instance of line 12 in Algorithm 3 when the rule J → q was added to S. In this case (using notation from the algorithm description, and S denoting the current S at that</p>
      </sec>
      <sec id="sec-2-6">
        <title>Algorithm 3: Reduced solution for (P, K)</title>
        <p>moment) we know that I ∧ K ∧ S ∧ (J → q) ⊨ q and I ∧ K ∧ S ⊭ q. This implies I ∧ K ∧ S ⊨ J, and because I1 ⊆ J we obtain I ∧ K ∧ S ⊨ I1. But we have also already observed that I1 → q is contained in S at this stage, and thus we obtain I ∧ K ∧ S ⊨ q, which contradicts the earlier statement that I ∧ K ∧ S ⊭ q. We thus have to reject the assumption that S is not reduced; hence S is indeed reduced. This completes the proof.</p>
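The proof above can be mirrored by a sketch of the procedure. This is our reconstruction from the correctness argument: the candidate enumeration order, the soundness check standing in for line 11, and all helper names are assumptions, not the paper's pseudocode.

```python
from itertools import combinations

def closure(facts, rules):
    """Least model of a definite program; rules are (body, head) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def sound(S, P, K, b2, subsets):
    # line-11-style check: no input H may entail, via K and S,
    # an output atom beyond what P entails for H
    return all(closure(H, K + S) & b2 <= closure(H, P) & b2 for H in subsets)

def reduced_solution(P, K, b1, b2):
    body_vars = sorted(b1 | {h for _, h in K})   # bodies may also use K's heads
    subsets = [set(c) for k in range(len(b1) + 1)
               for c in combinations(sorted(b1), k)]
    candidates = [frozenset(c) for k in range(len(body_vars) + 1)
                  for c in combinations(body_vars, k)]
    S = []
    for I in subsets:                            # inputs by increasing size
        for q in sorted(closure(I, P) & b2):
            if q in closure(I, K + S):
                continue                         # already covered by S
            for J in candidates:                 # bodies by increasing length
                trial = S + [(J, q)]
                if q in closure(I, K + trial) and sound(trial, P, K, b2, subsets):
                    S.append((J, q))
                    break
    return S

# On the introductory example, the single rule p ∧ q -> r is recovered.
P = [(frozenset({"p1", "q"}), "r"), (frozenset({"p2", "q"}), "r")]
K = [(frozenset({"p1"}), "p"), (frozenset({"p2"}), "p")]
S = reduced_solution(P, K, {"p1", "p2", "q"}, {"r"})
assert S == [(frozenset({"p", "q"}), "r")]
```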
        <p>
          To close, we give a somewhat more complex example. Let B1 = {p1, p2, p3} and B2 = {q1, q2, q3, q4}. Consider the program P as
p1 → q1
p2 → r3
r2 → q1
r2 → q3
        </p>
        <p>
          It would be out of place to have a lengthy discussion of related work in neural-symbolic integration, or even just on the topic of rule extraction, in this brief paper. We hence limit ourselves to some key pointers, including overview texts. We already discussed the rule-extraction work [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] on which our work is based, and [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] which pursues a different approach based on inspecting weights. For more extensive entry points to the literature on neural-symbolic integration we refer to [
          <xref ref-type="bibr" rid="ref10 ref2 ref6 ref7">2,6,7,10</xref>
          ] and to the proceedings of the workshop series on Neural-Symbolic Learning and Reasoning.3
        </p>
        <p>
          Regarding the novel aspect of this work, namely the utilization of background
knowledge for rule extraction, we are not aware of any prior work which pursues
this. However, concurrently the second author has worked on lifting the idea to
the application level in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], by utilizing description logics and Semantic Web
background knowledge in the form of ontologies and knowledge graphs [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]
together with the DL-Learner system [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] for rule extraction. The results herein,
which are constrained to the propositional case, can be considered foundational
for the more application-oriented work currently pursued along the lines of [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
        <p>
          We are also grateful that a reviewer pointed out a possible relationship of our work with work laid out in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] in the context of abduction in logic programming. Looked at on a very generic level, the general abduction task is very similar to our formulation in equation (1), which means that the field of abduction may indeed provide additional insights or even algorithms for our setting. On the detail level, however, [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] differs significantly. Most importantly, [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] considers literals or atoms as abducibles, i.e., an explanation consists of a set of literals, while in our setting explanations are actually rule sets. Another difference is that [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] considers logic programs under the non-monotonic answer set semantics, i.e., logic programs with default negation, while we consider only logic programs without negation in our work; it was laid out in much detail in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] that propositional rule extraction under negation has significantly different dynamics. Nevertheless, the general field of abduction in propositional logic programming may provide inspiration for further developing our approach, but working out the exact relationships appears to require a more substantial investigation.
3 http://neural-symbolic.org/
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and Further Work</title>
      <p>We have investigated the issue of propositional rule extraction from trained neural networks under background knowledge, for the case of definite rules. We have shown that a mild assumption on the background knowledge, together with monotonicity of the input-output function of the network, suffices to guarantee that a reduced logic program can be extracted such that the input-output function is exactly reproduced. We have also shown that the solution is not unique. Furthermore, we have provided algorithms for obtaining corresponding reduced programs.</p>
      <p>We consider our results to be foundational for further work, rather than
directly applicable in practice. Our observation that background knowledge can
yield simpler extracted rulesets of course carries over to more expressive logics
which extend propositional logic.</p>
      <p>
        It is such extensions which we intend to pursue, and which hold significant promise for practical applicability: structured information on the World Wide Web, as discussed in the Introduction, is provided in logical forms which are usually non-propositional fragments of first-order predicate logic, or closely related formalisms. In particular, description logics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], i.e., decidable fragments of first-order predicate logic, form the foundation of the Web Ontology Language OWL. First-order rules are also commonly used [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This raises the question of how to extract meaningful non-propositional rules from trained neural networks while taking (non-propositional) background knowledge, in a form commonly used on the World Wide Web, into account.
      </p>
      <p>Acknowledgements. The first two authors acknowledge support by the Ohio Federal Research Network project Human-Centered Big Data.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F</given-names>
          </string-name>
          . (eds.):
          <article-title>The Description Logic Handbook: Theory, Implementation, and Applications</article-title>
          . Cambridge University Press, 2nd edn. (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bader</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Dimensions of neural-symbolic integration – A structured survey</article-title>
          . In: Artemov,
          <string-name>
            <given-names>S.N.</given-names>
            ,
            <surname>Barringer</surname>
          </string-name>
          , H.,
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woods</surname>
            ,
            <given-names>J</given-names>
          </string-name>
          . (eds.)
          <article-title>We Will Show Them! Essays in Honour of Dov Gabbay</article-title>
          , Volume One. pp.
          <volume>167</volume>
          –
          <fpage>194</fpage>
          .
          <string-name>
            <surname>College Publications</surname>
          </string-name>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Beckett</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carothers</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>RDF 1.1 Turtle – Terse RDF Triple Language</article-title>
          .
          <source>W3C Recommendation (25 February</source>
          <year>2014</year>
          ), available at http://www.w3.org/TR/turtle/
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hendler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>The Semantic Web</article-title>
          .
          <source>Scientific American</source>
          <volume>284</volume>
          (
          <issue>5</issue>
          ),
          <volume>34</volume>
          –43 (May
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked Data – The Story So Far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <volume>1</volume>
          –
          <fpage>22</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Besold</surname>
            , T.R., de Raedt,
            <given-names>L.</given-names>
          </string-name>
          , Foldiak,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Icard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            , Kuhnberger, K.U.,
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.C.</given-names>
            ,
            <surname>Miikkulainen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.L.</surname>
          </string-name>
          :
          <article-title>Neural-symbolic learning and reasoning: Contributions and challenges</article-title>
          . In:
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gabrilovich</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>K</given-names>
          </string-name>
          . (eds.)
          <source>Proceedings of the AAAI 2015 Spring Symposium on Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches. AAAI Press Technical Report</source>
          , vol. SS-
          <volume>15</volume>
          -
          <fpage>03</fpage>
          . AAAI Press, Palo Alto (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gabbay</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          :
          <article-title>Neural-Symbolic Cognitive Reasoning</article-title>
          .
          <source>Cognitive Technologies</source>
          , Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaverucha</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The connectionist inductive learning and logic programming system</article-title>
          .
          <source>Applied Intelligence</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>59</fpage>
          –
          <lpage>77</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brickley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macbeth</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Schema.org: Evolution of structured data on the web</article-title>
          .
          <source>Commun. ACM</source>
          <volume>59</volume>
          (
          <issue>2</issue>
          ),
          <fpage>44</fpage>
          –
          <lpage>51</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hammer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (eds.):
          <source>Perspectives of Neural-Symbolic Integration, Studies in Computational Intelligence</source>
          , vol.
          <volume>77</volume>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Krotzsch,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Patel-Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.F.</given-names>
            ,
            <surname>Rudolph</surname>
          </string-name>
          , S. (eds.):
          <article-title>OWL 2 Web Ontology Language Primer (Second Edition)</article-title>
          .
          <source>W3C Recommendation</source>
          (11 December
          <year>2012</year>
          ), http://www.w3.org/TR/owl2-primer/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Krotzsch,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Rudolph</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <article-title>Foundations of Semantic Web Technologies</article-title>
          . CRC Press/Chapman &amp; Hall (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seda</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Mathematical Aspects of Logic Programming Semantics</article-title>
          . CRC Press/Chapman and Hall (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Krisnadhi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maier</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>OWL and rules</article-title>
          . In:
          <string-name>
            <surname>Polleres</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handschuh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kroner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ossowski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          (eds.)
          <source>Reasoning Web. Semantic Technologies for the Web of Data – 7th International Summer School 2011, Galway, Ireland, August 23-27, 2011, Tutorial Lectures</source>
          .
          <source>Lecture Notes in Computer Science</source>
          , vol.
          <volume>6848</volume>
          , pp.
          <fpage>382</fpage>
          –
          <lpage>415</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bader</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Extracting reduced logic programs from artificial neural networks</article-title>
          .
          <source>Appl. Intell</source>
          .
          <volume>32</volume>
          (
          <issue>3</issue>
          ),
          <fpage>249</fpage>
          –
          <lpage>266</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Concept learning in description logics using refinement operators</article-title>
          .
          <source>Machine Learning</source>
          <volume>78</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>203</fpage>
          –
          <lpage>250</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isele</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jentzsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morsey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Kleef</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>DBpedia – A large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          .
          <source>Semantic Web</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>167</fpage>
          –
          <lpage>195</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>You</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Abduction in logic programming: A new definition and an abductive procedure based on rewriting</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>140</volume>
          (
          <issue>1/2</issue>
          ),
          <fpage>175</fpage>
          –
          <lpage>205</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Sarker</surname>
            ,
            <given-names>M.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doran</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raymer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Explaining trained neural networks with semantic web technologies: First steps</article-title>
          .
          In:
          <source>Proceedings of the Twelfth International Workshop on Neural-Symbolic Learning and Reasoning, NeSy'17</source>
          , London, UK, July
          <year>2017</year>
          , to appear
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Vrandečić</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Krotzsch, M.:
          <article-title>Wikidata: a free collaborative knowledgebase</article-title>
          .
          <source>Commun. ACM</source>
          <volume>57</volume>
          (
          <issue>10</issue>
          ),
          <fpage>78</fpage>
          –
          <lpage>85</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>