<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Inductive Programming as Approach to Comprehensible Machine Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ute Schmid</string-name>
          <email>ute.schmid@uni-bamberg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty Information Systems and Applied Computer Science University of Bamberg 96045 Bamberg</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>4</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>In the early days of machine learning, Donald Michie introduced two orthogonal dimensions for evaluating the performance of machine learning approaches – predictive accuracy and comprehensibility of the learned hypotheses. Later definitions narrowed the focus to measures of accuracy. As a consequence, statistical/neural approaches have been favoured over symbolic approaches to machine learning, such as inductive logic programming (ILP). Recently, the importance of comprehensibility has been rediscovered under the slogan 'explainable AI'. This is due to the growing interest in black-box deep learning approaches in many application domains where it is crucial that system decisions are transparent and comprehensible and, in consequence, trustworthy. I will give a short history of machine learning research, followed by a presentation of two specific approaches to symbolic machine learning – inductive logic programming and end-user programming. Furthermore, I will present current work on explanation generation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Machine learning research began in the 1950s with roots in two different scientific
communities – artificial intelligence (AI) and signal processing. In signal
processing research, mainly statistical methods for generalisation learning from data
were developed under the label of pattern recognition. In AI, machine learning
was investigated as a crucial principle underlying general
intelligent behaviour, along with other approaches such as knowledge representation,
automated reasoning, planning, game playing, and natural language processing
        <xref ref-type="bibr" rid="ref16">(Minsky, 1968)</xref>
        . Over the following decades, research between these two
communities began to overlap, mainly with respect to neural information processing
research, especially artificial neural networks. An early important domain of
interest outside the machine learning community itself was database research,
which recognised the usefulness of machine learning methods for the discovery
of patterns in large databases (data mining).
      </p>
      <p>
        In the early days of machine learning, Donald Michie introduced two
orthogonal dimensions to evaluate performance of machine learning approaches
– predictive accuracy and comprehensibility of the learned hypotheses
        <xref ref-type="bibr" rid="ref15">(Michie,
1988)</xref>
        . Michie proposed to characterise comprehensibility as operational
effectiveness – meaning that the machine learning system can communicate the learned
hypothesis (model) to a human whose performance is consequently increased to
a level beyond that of the human studying the training data alone
        <xref ref-type="bibr" rid="ref19 ref25">(Muggleton,
Schmid, Zeller, Tamaddoni-Nezhad, &amp; Besold, 2018)</xref>
        . Later definitions
        <xref ref-type="bibr" rid="ref17">(Mitchell,
1997)</xref>
        narrowed the focus to measures of accuracy. As a consequence,
statistical/neural approaches have been favoured over symbolic approaches to
machine learning, such as inductive logic programming (ILP)
        <xref ref-type="bibr" rid="ref18">(Muggleton &amp; De
Raedt, 1994)</xref>
        .
      </p>
      <p>
        However, in recent years there has been growing recognition that the holy grail
of performance is not sufficient for successful applications of machine learning in
complex real-world domains. Under the label of ‘explainable AI’
        <xref ref-type="bibr" rid="ref21">(Ribeiro, Singh,
&amp; Guestrin, 2016)</xref>
        , different approaches have been proposed for making the classification
decisions of black-box classifiers such as (deep) neural networks more transparent
and comprehensible for the user. Such explanations are mostly given in the form of
visualisations. However, alternatively or additionally, verbal explanations might
be helpful, similar to the approaches to explanation generation developed in early
AI
        <xref ref-type="bibr" rid="ref2">(Clancey, 1983)</xref>
        . There, explanations were generated from the execution
traces of rules during inference. While most current machine learning methods
rely on implicit black-box representations of the induced hypotheses, symbolic
machine learning approaches allow rules to be induced from training examples.
      </p>
      <p>In the following, I will argue that symbol-level approaches to machine
learning, such as approaches to inductive programming, offer an interesting
complement to standard machine learning. I will present ILP and end-user programming
as two specific approaches of interest. Furthermore, I will present current work
on explanation generation.</p>
    </sec>
    <sec id="sec-2">
      <title>Approaches to Inductive Programming</title>
      <p>
        Classic white-box approaches which allow the generation of hypotheses in the form of
symbolic representations are decision trees and variants such as decision rules or
random forests. There is empirical evidence that such machine-learned
hypotheses can be understood and applied by humans
        <xref ref-type="bibr" rid="ref11 ref6">(Lakkaraju, Bach, &amp; Leskovec,
2016; Fürnkranz, Kliegr, &amp; Paulheim, 2018)</xref>
        . Other approaches to symbol-level
learning are grammar inference
        <xref ref-type="bibr" rid="ref1 ref26">(Angluin, 1980; Siebers, Schmid, Seuß, Kunz, &amp;
Lautenbacher, 2016)</xref>
        , inductive functional programming
        <xref ref-type="bibr" rid="ref10 ref23 ref8">(Gulwani et al., 2015;
Schmid &amp; Kitzelmann, 2011; Kitzelmann &amp; Schmid, 2006)</xref>
        , and the already
mentioned inductive logic programming. An example of a learned functional
program is given in Figure 1; an example for ILP is given in Figure 2.
      </p>
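      <p>
        Figure 1 itself is not reproduced in this extract. As a purely illustrative stand-in (my own example, written in Python rather than Haskell), an inductive functional programming system receives a few input/output pairs and induces a recursive program that generalises them:
```python
# Illustrative only - not the paper's Figure 1. Given a few
# input/output examples of an unknown list function ...
examples = [([1], 1), ([2, 3], 3), ([4, 5, 6], 6)]

# ... an inductive functional programming system would induce a
# recursive hypothesis such as "last":
def last(xs):
    if len(xs) == 1:      # base case: singleton list
        return xs[0]
    return last(xs[1:])   # recursive case: recurse on the tail

print(all(last(inp) == out for inp, out in examples))  # True
```
      </p>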
      <p>
        In contrast to standard machine learning, these approaches are strongly
related to formalisms of computer science. The expressiveness of learned
hypotheses goes beyond conjunctions over feature values. In principle, arbitrary
computer programs – for instance in the declarative languages Haskell (for inductive
functional programming) or Prolog (for inductive logic programming) – can be induced from
training examples
        <xref ref-type="bibr" rid="ref19 ref8">(Gulwani et al., 2015; Muggleton et al., 2018)</xref>
        . The learned
programs or rules are naturally incorporable in rule-based systems. Because the
learned hypotheses are represented in symbolic form they are inspectable by
humans and therefore provide transparency and comprehensibility of the machine
learned classifiers. In contrast to many standard machine learning approaches,
learning is possible from few examples. This corresponds to the way humans
learn in many high-level domains
        <xref ref-type="bibr" rid="ref13">(Marcus, 2018)</xref>
        . However, these symbolic approaches to machine learning also have some
drawbacks: they are inherently brittle, they are not designed to deal well with noise, and it is in general
not easy to express complex non-linear decision surfaces in logic. While (deep)
neural networks and other black box approaches often reach high predictive
accuracy, symbolic approaches such as inductive functional or logic programming
are superior with respect to comprehensibility. Consequently, research on
hybrid approaches combining both methods seems a promising domain of research
        <xref ref-type="bibr" rid="ref19 ref20 ref25">(Rabold, Siebers, &amp; Schmid, 2018)</xref>
        .
      </p>
      <p>Inductive logic programming has been established in the 1990s as a
symbol-level approach to machine learning where logic (Prolog) programs are used as
a uniform representation for examples, background knowledge, and hypotheses.
In Figure 2 the well-known family tree example is shown.</p>
      <p>Background Knowledge (Observations):
father(jake,alice).  mother(matilda,alice).
father(jake,john).   mother(matilda,john).
father(bill,ted).    mother(alice,ted).
father(bill,megan).  mother(alice,megan).
father(john,harry).  mother(mary,harry).
father(john,susan).  mother(mary,susan).
                     mother(mary,andy).
father(ted,bob).     mother(jill,bob).
father(ted,jane).    mother(jill,jane).
father(harry,sam).   mother(liz,sam).
father(harry,jo).    mother(liz,jo).</p>
      <p>Target Concepts (Rules):
% grandparent without invented predicate
p(X,Y) :- father(X,Z), father(Z,Y).
p(X,Y) :- father(X,Z), mother(Z,Y).
p(X,Y) :- mother(X,Z), mother(Z,Y).
p(X,Y) :- mother(X,Z), father(Z,Y).
% grandparent with invented predicate
p(X,Y) :- p1(X,Z), p1(Z,Y).
p1(X,Y) :- father(X,Y).
p1(X,Y) :- mother(X,Y).
% ancestor without invented predicate
p(X,Y) :- father(X,Y).
p(X,Y) :- mother(X,Y).
p(X,Y) :- father(X,Z), p(Z,Y).
p(X,Y) :- mother(X,Z), p(Z,Y).
% ancestor with invented predicate
p(X,Y) :- p1(X,Y).
p(X,Y) :- p1(X,Z), p(Z,Y).
p1(X,Y) :- father(X,Y).
p1(X,Y) :- mother(X,Y).</p>
      <p>
        Here the father and mother relations are given as background knowledge –
observed/known facts concerning a specific family. Target predicates can be as
simple as grandfather, which can be described as the father of a parent of a
person, a bit more general, such as grandparent, or more complex, such as the
ancestor relation, which involves recursion. For learning, some positive and
negative examples for the target predicate are presented. For example,
ancestor(jake,bob) is a positive example and ancestor(harry,mary) is a negative
example for ancestor. Given this information, an ILP system can derive a
hypothetical logic program which entails all the positive and none of the
negative examples. Some relaxation of this brittle criterion is possible
        <xref ref-type="bibr" rid="ref19 ref20 ref25">(Siebers &amp; Schmid, 2018)</xref>
        .
      </p>
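      <p>
        The learning criterion above – the hypothesis must entail all positive and none of the negative examples – can be made concrete with a small Python sketch. This is only an illustration of the coverage check, not an ILP learner; the facts follow Figure 2, and the hypothesis is the ancestor program with the invented parent predicate p1:
```python
# Coverage check for the ILP success criterion: the hypothesis must
# entail all positive and none of the negative examples.
# Facts follow the family tree of Figure 2.
father = {("jake", "alice"), ("jake", "john"), ("bill", "ted"),
          ("bill", "megan"), ("john", "harry"), ("john", "susan"),
          ("ted", "bob"), ("ted", "jane"), ("harry", "sam"), ("harry", "jo")}
mother = {("matilda", "alice"), ("matilda", "john"), ("alice", "ted"),
          ("alice", "megan"), ("mary", "harry"), ("mary", "susan"),
          ("mary", "andy"), ("jill", "bob"), ("jill", "jane"),
          ("liz", "sam"), ("liz", "jo")}
p1 = father | mother  # invented predicate: p1(X,Y) :- father(X,Y). / mother(X,Y).

def ancestor(x, y):
    # Hypothesis: p(X,Y) :- p1(X,Y).  p(X,Y) :- p1(X,Z), p(Z,Y).
    if (x, y) in p1:
        return True
    return any(ancestor(z, y) for (a, z) in p1 if a == x)

positives = [("jake", "bob")]
negatives = [("harry", "mary")]
covers_all = all(ancestor(x, y) for x, y in positives)
covers_none = not any(ancestor(x, y) for x, y in negatives)
print(covers_all and covers_none)  # True: the hypothesis is accepted
```
        A relaxed criterion, as mentioned above, would replace the strict all/none check by coverage thresholds.
      </p>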
      <p>
        One of the main advantages of ILP over other symbolic approaches to
machine learning is that it allows n-ary relations to be considered. Although it is possible
to transform relations into features, such a transformation can be tedious, result
in sparse feature vectors, and cannot be performed without loss of information.
The advantage of ILP in structural domains such as chemistry has for example
been demonstrated for the identification of mutagenic chemical structures
        <xref ref-type="bibr" rid="ref9">(King,
Muggleton, Srinivasan, &amp; Sternberg, 1996)</xref>
        . That humans can comprehend and
successfully apply ILP-learned rules has been demonstrated in two experiments
        <xref ref-type="bibr" rid="ref19">(Muggleton et al., 2018)</xref>
        . Inductive programming approaches are typically defined in a generic way. In the
context of end-user programming, Gulwani and colleagues demonstrated that
such techniques can be highly efficient in complex practical applications if they
are restricted to a specific domain
        <xref ref-type="bibr" rid="ref8">(Gulwani et al., 2015)</xref>
        . For example, since
2013 the system Flashfill has been included in Excel. It can learn table manipulations
from observing user actions.
      </p>
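      <p>
        The programming-by-example idea behind Flashfill can be sketched as follows. This toy version (with made-up data and a single hard-coded candidate pattern; the real system searches a rich string-transformation language) induces a transformation from one observed edit and applies it to the remaining rows:
```python
# Toy programming-by-example sketch in the spirit of Flashfill.
# Data and the candidate pattern are invented for illustration.
def induce_initials_program(example_in, example_out):
    """Return the 'First Last -> F. Last' transformation if it
    explains the observed example, otherwise None."""
    def program(s):
        first, rest = s.split(" ", 1)
        return f"{first[0]}. {rest}"
    return program if program(example_in) == example_out else None

# One observed user edit ...
prog = induce_initials_program("Jane Doe", "J. Doe")
# ... is generalised to the remaining rows:
print([prog(s) for s in ["John Smith", "Ada Lovelace"]])
# ['J. Smith', 'A. Lovelace']
```
      </p>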
      <p>
        Flashfill is a convincing example of end-user programming, which gives users
without a background in computer programming the possibility to work more
efficiently with their software applications
        <xref ref-type="bibr" rid="ref24 ref3">(Cypher, 1995; Schmid &amp; Waltermann,
2004)</xref>
        . Users with a background in computer programming have the
possibility to inspect the programs induced by Flashfill. For end-users, transparency is
gained by the possibility to directly observe the effect of giving an example on
the actions of the program.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Explanation Generation</title>
      <p>
        Over the last years there is a growing interest in artificial intelligence from
industry and it became a topic of discussion in politics and society in general.
Mostly AI is used a label for machine learning – ignoring all other subject
domains which constitute AI, among them knowledge representation and learning.
Furthermore, machine learning is mostly exclusively referring to artificial neural
networks, especially variants of deep learning. Following the big bang of deep
learning
        <xref ref-type="bibr" rid="ref12">(LeCun, Bengio, &amp; Hinton, 2015)</xref>
        , there is growing recognition among
practitioners, for example in medicine, in connected industry, and in automotive,
that black-box classifiers have the disadvantage of being intransparent.
Evaluating the safety of software involving such classifiers is problematic, and naive
as well as expert users will ultimately hesitate to trust such systems.
      </p>
      <p>
        It is widely recognized that explanations are crucial for comprehensibility and
trust
        <xref ref-type="bibr" rid="ref21 ref27">(Tintarev &amp; Masthoff, 2015; Ribeiro et al., 2016)</xref>
        . For image classification,
explanations are typically given in a visual form. For example, the system LIME
provides an explanation interface in which those parts of an image that are
relevant for the classification decision are highlighted
        <xref ref-type="bibr" rid="ref21">(Ribeiro et al., 2016)</xref>
        . However,
visual explanations are restricted to conjunctions of isolated information. Verbal
explanations additionally allow for the use of relational information, recursion,
and even negation:
– This is a grave of the iron age because there is one stone circle within another
one.
– This is a correct Tower of Hanoi because, starting with the largest disk at
the bottom, each disk is smaller than the previous one.
– This is a peaceful person because he/she is not holding a weapon (but a
flower).
      </p>
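      <p>
        Trace-based verbal explanation of the kind sketched above can be illustrated in a few lines of Python (the facts, rule, and wording are invented; an actual system would verbalise the instantiated body of whichever rule fired):
```python
# Sketch: verbalising the instantiated body of a fired rule
# grandparent(X,Y) :- parent(X,Z), parent(Z,Y).
# Facts and phrasing are illustrative only.
facts = {("father", "jake", "alice"), ("father", "jake", "john"),
         ("mother", "alice", "ted")}

def explain_grandparent(x, y):
    for rel1, a, z in facts:          # find parent(x, z)
        if a == x:
            for rel2, b, c in facts:  # find parent(z, y)
                if b == z and c == y:
                    return (f"{x} is a grandparent of {y} because "
                            f"{x} is the {rel1} of {z} and "
                            f"{z} is the {rel2} of {y}.")
    return None

print(explain_grandparent("jake", "ted"))
# jake is a grandparent of ted because jake is the father of alice
# and alice is the mother of ted.
```
      </p>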
      <p>
        Often, it might be especially helpful to combine visual and verbal
explanations. An example is given in Figure 4. Here the explanation involves distortions
of specific regions of the face which are typically characterized by so-called
action units used in the facial action coding system (FACS)
        <xref ref-type="bibr" rid="ref26 ref4">(Ekman &amp; Friesen,
1971; Siebers et al., 2016)</xref>
        . The explanation can be given either in generally
understandable terms or in terms for an expert knowledgeable in FACS coding.
      </p>
      <p>Explanations are important for the transparency of a machine-learned classifier.
Experts might demand an explanation if their own decision deviates from that
of the classifier. The explanation might or might not convince the expert;
consequently, the expert might revise his or her decision or override the classification.</p>
      <p>
        Explanations can also be helpful in education and training. Here, contrasting
examples might be helpful to make explanations more comprehensible. In many
realistic contexts, it is not so easy to relate one's perception to a specific feature
value. For example, to classify a mushroom it might be necessary to decide
whether the cap has the shape of a bell or whether it is conical
        <xref ref-type="bibr" rid="ref22">(Schlimmer,
1987)</xref>
        . To understand the difference between these feature values, contrasting
examples might be helpful. Similarly, facial expressions for pain and disgust share
a number of action units. Consequently, sometimes an observer might confuse
these states and might profit from an explicit contrast (see Fig. 5).
      </p>
      <p>
        It is an open research question which type of contrast is helpful to support
understanding. In cognitive science research on similarity and analogy it is
recognized that alignment is an important aspect for concept acquisition
        <xref ref-type="bibr" rid="ref14 ref7">(Markman
&amp; Gentner, 1996; Goldwater &amp; Gentner, 2015)</xref>
        . Exemplars must be sufficiently
similar that differences between them are helpful: To understand the concept
of a bottle, it might be helpful to contrast it with some other receptacle such
as a mug, but it will not be helpful to contrast it with an arbitrary object, say
a chair. To understand the concept of a grandparent (see Fig. 2) it might be
helpful to replace one attribute, such as male – contrasting with grandmother –
or to invert relations, contrasting with grandchild.
      </p>
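      <p>
        The alignment idea can be sketched as picking a 'near miss': among candidate contrasting concepts, choose the one whose attribute vector (invented here for illustration) differs least from the target:
```python
# Sketch: choosing a helpful contrast as the most similar alternative.
# The attribute vectors are invented for illustration.
def hamming(a, b):
    """Number of attribute positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

# hypothetical attributes: (generation, sex, direction of relation)
grandparent = ("grand", "any", "up")
candidates = {
    "grandmother": ("grand", "female", "up"),   # one attribute changed
    "grandchild":  ("grand", "any", "down"),    # one relation inverted
    "chair":       ("none", "none", "none"),    # arbitrary object
}

best = min(candidates, key=lambda c: hamming(grandparent, candidates[c]))
print(best)  # grandmother (grandchild is equally close; chair is not)
```
      </p>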
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Further Work</title>
      <p>
        It is my strong conviction that future applications of machine learning should
not aim at replacing human decision makers but at supporting them in
complex domains. It is still desirable, in my opinion, to develop machine learning
approaches which fulfill the ultra-strong machine learning criterion proposed by
Michie. Thereby, rather than creating autonomous systems which replace
humans, we can create intelligent companion systems
        <xref ref-type="bibr" rid="ref5">(Forbus &amp; Hinrichs, 2006)</xref>
        .
Such companions might make it possible for humans to perform better when
supported by the system than when they are on their own. ‘Better’ might refer to
efficiency, correctness, or simply feeling more secure or more relaxed. I hope that
approaches to inductive programming will be a useful building block in such an
endeavour.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Angluin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1980</year>
          ).
          <article-title>Inductive inference of formal languages from positive data</article-title>
          .
          <source>Information and control</source>
          ,
          <volume>45</volume>
          (
          <issue>2</issue>
          ),
          <fpage>117</fpage>
          -
          <lpage>135</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Clancey</surname>
            ,
            <given-names>W. J.</given-names>
          </string-name>
          (
          <year>1983</year>
          ).
          <article-title>The epistemology of a rule-based expert system - A framework for explanation</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ),
          <fpage>215</fpage>
          -
          <lpage>251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Cypher</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>1995</year>
          ).
          <article-title>Eager: Programming repetitive tasks by example. In Readings in human-computer interaction</article-title>
          (pp.
          <fpage>804</fpage>
          -
          <lpage>810</lpage>
          ). Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Ekman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Friesen</surname>
            ,
            <given-names>W. V.</given-names>
          </string-name>
          (
          <year>1971</year>
          ).
          <article-title>Constants across cultures in the face and emotion</article-title>
          .
          <source>Journal of Personality and Social Psychology</source>
          ,
          <volume>17</volume>
          (
          <issue>2</issue>
          ),
          <fpage>124</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Forbus</surname>
            ,
            <given-names>K. D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hinrichs</surname>
            ,
            <given-names>T. R.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Companion cognitive systems - a step toward human-level AI</article-title>
          .
          <source>AI Magazine</source>
          , special issue on Achieving
          <source>HumanLevel Intelligence through Integrated Systems and Research</source>
          ,
          <volume>27</volume>
          (
          <issue>2</issue>
          ),
          <fpage>83</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Fürnkranz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kliegr</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>On cognitive preferences and the interpretability of rule-based models</article-title>
          .
          <source>CoRR</source>
          , abs/1803.01316. Retrieved from http://arxiv.org/abs/1803.01316
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Goldwater</surname>
            ,
            <given-names>M. B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gentner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>On the acquisition of abstract knowledge: Structural alignment and explication in learning causal system categories</article-title>
          .
          <source>Cognition</source>
          ,
          <volume>137</volume>
          ,
          <fpage>137</fpage>
          -
          <lpage>153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Gulwani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hernandez-Orallo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kitzelmann</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muggleton</surname>
            ,
            <given-names>S. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zorn</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Inductive programming meets the real world</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>58</volume>
          (
          <issue>11</issue>
          ),
          <fpage>90</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>King</surname>
            ,
            <given-names>R. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muggleton</surname>
            ,
            <given-names>S. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivasan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sternberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>1996</year>
          ).
          <article-title>Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          ,
          <volume>93</volume>
          (
          <issue>1</issue>
          ),
          <fpage>438</fpage>
          -
          <lpage>442</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Kitzelmann</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Inductive synthesis of functional programs: An explanation based generalization approach</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>7</volume>
          ,
          <fpage>429</fpage>
          -
          <lpage>454</lpage>
          . Retrieved from http://www.jmlr.org/papers/v7/kitzelmann06a.html
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Lakkaraju</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>S. H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Interpretable decision sets: A joint framework for description and prediction</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          (pp.
          <fpage>1675</fpage>
          -
          <lpage>1684</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Deep learning</article-title>
          .
          <source>Nature</source>
          ,
          <volume>521</volume>
          (
          <issue>7553</issue>
          ),
          <fpage>436</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Marcus</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Deep learning: A critical appraisal</article-title>
          .
          <source>CoRR</source>
          , abs/1801.00631. Retrieved from http://arxiv.org/abs/1801.00631
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Markman</surname>
            ,
            <given-names>A. B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gentner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1996</year>
          ).
          <article-title>Commonalities and differences in similarity comparisons</article-title>
          .
          <source>Memory &amp; Cognition</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ),
          <fpage>235</fpage>
          -
          <lpage>249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Michie</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1988</year>
          ).
          <article-title>Machine learning in the next five years</article-title>
          .
          <source>In Proceedings of the Third European Working Session on Learning</source>
          (pp.
          <fpage>107</fpage>
          -
          <lpage>122</lpage>
          ). Pitman.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Minsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (Ed.). (
          <year>1968</year>
          ).
          <article-title>Semantic information processing</article-title>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Mitchell</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>1997</year>
          ).
          <article-title>Machine learning</article-title>
          .
          <source>McGraw Hill.</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Muggleton</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>1994</year>
          ).
          <article-title>Inductive logic programming: Theory and methods</article-title>
          .
          <source>Journal of Logic Programming, Special Issue on 10 Years of Logic Programming</source>
          ,
          <volume>19-20</volume>
          ,
          <fpage>629</fpage>
          -
          <lpage>679</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Muggleton</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeller</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tamaddoni-Nezhad</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Besold</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Ultra-strong machine learning: comprehensibility of programs learned with ILP</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>107</volume>
          (
          <issue>7</issue>
          ),
          <fpage>1119</fpage>
          -
          <lpage>1140</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Rabold</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siebers</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Explaining black-box classifiers with ILP - empowering LIME with Aleph to approximate non-linear decisions with relational rules</article-title>
          . In F. Riguzzi, E. Bellodi, &amp; R. Zese (Eds.),
          <source>Proceedings of the 28th International Conference on Inductive Logic Programming (ILP 2018), Ferrara, Italy, September 2-4</source>
          (Vol.
          <volume>11105</volume>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>117</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>M. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Why should I trust you?: Explaining the predictions of any classifier</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          (pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Schlimmer</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          (
          <year>1987</year>
          ).
          <article-title>Concept acquisition through representational adjustment</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kitzelmann</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Inductive rule learning on the knowledge level</article-title>
          .
          <source>Cognitive Systems Research</source>
          ,
          <volume>12</volume>
          (
          <issue>3</issue>
          ),
          <fpage>237</fpage>
          -
          <lpage>248</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Waltermann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Automatic synthesis of XSL transformations from example documents</article-title>
          . In M. Hamza (Ed.),
          <source>Artificial Intelligence and Applications, Proceedings (AIA 2004)</source>
          (pp.
          <fpage>252</fpage>
          -
          <lpage>257</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Siebers</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Was the year 2000 a leap year? Stepwise narrowing theories with Metagol</article-title>
          . In F. Riguzzi, E. Bellodi, &amp; R. Zese (Eds.),
          <source>Proceedings of the 28th International Conference on Inductive Logic Programming (ILP 2018), Ferrara, Italy, September 2-4</source>
          (Vol.
          <volume>11105</volume>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>156</lpage>
          ). doi: 10.1007/978-3-319-99960-9_9
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Siebers</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seuß</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kunz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lautenbacher</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Characterizing facial expressions by grammars of action unit sequences-A first investigation using ABL</article-title>
          .
          <source>Information Sciences</source>
          ,
          <volume>329</volume>
          ,
          <fpage>866</fpage>
          -
          <lpage>875</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Tintarev</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Masthoff</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Explaining recommendations: Design and evaluation</article-title>
          .
          <source>In Recommender systems handbook</source>
          (pp.
          <fpage>353</fpage>
          -
          <lpage>382</lpage>
          ). Springer.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>