<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Unsupervised Ontology Learning from Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Szymon Klarman</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katarina Britz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for AI Research, CSIR and Stellenbosch University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Brunel University London</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Data-driven elicitation of ontologies from structured data is a well-recognized instance of the knowledge acquisition bottleneck. The development of efficient techniques for (semi-)automating this task is therefore practically vital, yet it is hindered by the lack of robust theoretical foundations. In this paper, we study the problem of learning Description Logic TBoxes from interpretations, which naturally translates to the task of ontology learning from data. In the presented framework, the learner is provided with a set of positive interpretations (i.e., logical models) of the TBox adopted by the teacher. The goal is to correctly identify the TBox given this input. We characterize the key constraints on the models that warrant finite learnability of TBoxes expressed in selected fragments of the Description Logic EL, and we define corresponding learning algorithms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>With the advent of the Web of Data and various “e-” initiatives,
such as e-science, e-health, e-governance, etc., the focus of
the classical knowledge acquisition bottleneck has become ever
more concentrated on the problem of constructing rich
and accurate ontologies enabling efficient management of
the existing abundance of data [Maedche and Staab, 2004].
Whereas the traditional understanding of this bottleneck has
been associated with the necessity of developing ontologies
ex ante, in a top-down, data-agnostic manner, this view seems to
be currently evolving into a new position, recently dubbed the
knowledge reengineering bottleneck [Hoekstra, 2010]. In this
view, the contemporary challenge is, conversely, to enable
data-driven approaches to ontology design: methods that
can make use of, and make sense of, the existing data, be it
readily available on the web or crowdsourced, leading to the
elicitation of the ontological commitments implicitly present at the
data level. Even though the development of such techniques
and tools, which could help (semi-)automate ontology learning
processes so characterized, is becoming vital in practice,
the robust theoretical foundations for the problem are still
rather limited. This gap is addressed in the present work.</p>
      <p>In this paper, we study the problem of learning
Description Logic (DL) TBoxes from interpretations, which
naturally translates to the task of ontology learning from data.
DLs are a popular family of knowledge representation
formalisms [Baader et al., 2003], which have risen to
prominence as, among others, the logics underpinning the different
profiles of the Web Ontology Language OWL. In this
paper, we focus on the lightweight DL EL [Baader et al., 2005]
and some of its more specific fragments. This choice is
motivated, on the one hand, by the interesting applications
of EL, especially as the logic behind the OWL 2 EL profile,
and on the other, by its relative complexity, which enables
us to make interesting observations from the learning
perspective. Our learning model is a variant of learning from
positive interpretations (i.e., from models of the target
theory), a generally established framework in the field of
inductive logic programming [De Raedt and Lavrač, 1993;
De Raedt, 1994]. In our scenario, the goal of the learner is
to correctly identify the target TBox T given a finite set of
its finite models. Our overarching interest lies in algorithms
warranting effective learnability in such a setting with no or
minimal supervision. Our key research questions and
contributions are therefore concerned with the identification of
specific languages and conditions on the learning input under
which such algorithms can in principle be defined.</p>
      <p>In the following two sections, we introduce DL
preliminaries and discuss the adopted learning model. In Section 4,
we identify two interesting fragments of EL, called EL_rhs
and EL_lhs, which satisfy some basic necessary conditions
enabling finite learnability, and at the same time, we show
that full EL does not meet that same requirement. In
Section 5, we devise a generic algorithm which correctly
identifies EL_rhs and EL_lhs TBoxes from finite data, employing a
basic equivalence oracle. Further, in the case of EL_rhs, we
significantly strengthen this result by defining an algorithm which
makes no such calls to an oracle, and thus supports fully
unsupervised learning. In Section 6, we compare our work to
related contributions, in particular to the framework of
learning TBoxes from entailment queries by Konev et al. [Konev
et al., 2014]. We conclude in Section 7 with an overview of
interesting open problems.</p>
      <p>See http://www.w3.org/TR/owl2-profiles/.</p>
      <p>This work was funded in part by the National Research
Foundation under Grant no. 85482.</p>
    </sec>
    <sec id="sec-2">
      <title>Description Logic Preliminaries</title>
      <p>The language of the Description Logic (DL) EL [Baader et
al., 2005] is given by (1) a vocabulary Σ = (NC, NR), where
NC is a set of concept names and NR a set of role names,
and (2) the following set of constructors for defining complex
concepts, which we divide into two groups:</p>
      <p>EL: C, D ::= ⊤ | A | C ⊓ D | ∃r.C</p>
      <p>L⊓: C, D ::= ⊤ | A | C ⊓ D</p>
      <p>where A ∈ NC and r ∈ NR. The set of L⊓ concepts
naturally captures the propositional part of EL. The depth of a
subconcept D in C is the number of existential restrictions
within whose scope D occurs. The depth of a concept
C is the greatest depth of any of its subconcepts.
Every L⊓ concept is trivially of depth 0.</p>
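      <p>To make the grammar concrete, the following minimal Python sketch (ours, purely illustrative and not part of the paper) renders EL concepts as an algebraic data type, together with the depth measure just defined; all names are our own.</p>
      <preformat>
# EL concepts as a small algebraic data type; L⊓ concepts are exactly
# those terms built without the Exists constructor.
from dataclasses import dataclass

class Concept: ...

@dataclass(frozen=True)
class Top(Concept): ...          # ⊤

@dataclass(frozen=True)
class Atom(Concept):             # a concept name A ∈ NC
    name: str

@dataclass(frozen=True)
class And(Concept):              # C ⊓ D
    left: Concept
    right: Concept

@dataclass(frozen=True)
class Exists(Concept):           # ∃r.C
    role: str
    filler: Concept

def depth(c):
    """Nesting of existential restrictions; every L⊓ concept has depth 0."""
    if isinstance(c, And):
        return max(depth(c.left), depth(c.right))
    if isinstance(c, Exists):
        return 1 + depth(c.filler)
    return 0                     # Top or Atom

# Example: ∃r.(A ⊓ ∃r.B) has depth 2.
assert depth(Exists("r", And(Atom("A"), Exists("r", Atom("B"))))) == 2
      </preformat>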
      <p>The semantics is defined through interpretations of the
form I = (Δ^I, ·^I), where Δ^I is a non-empty domain of
individuals and ·^I is an interpretation function mapping each
A ∈ NC to a subset A^I ⊆ Δ^I and each r ∈ NR to a
binary relation r^I ⊆ Δ^I × Δ^I. The interpretation function is
inductively extended over complex expressions according to
the fixed semantics of the constructors:</p>
      <p>⊤^I = Δ^I
(C ⊓ D)^I = {x ∈ Δ^I | x ∈ C^I ∩ D^I}
(∃r.C)^I = {x ∈ Δ^I | ∃y : (x, y) ∈ r^I ∧ y ∈ C^I}</p>
      <p>A concept inclusion is an expression of the form C ⊑ D,
stating that all individuals of type C are D, as in, e.g.:
Father_of_son ⊑ Man ⊓ ∃hasChild.Man. The language
fragments considered in this paper are categorized w.r.t.
restrictions imposed on the syntax of concepts C and D in
permitted concept inclusions C ⊑ D:</p>
      <p>EL: C and D are both EL concepts;
EL_rhs: C is an L⊓ concept and D an EL concept;
EL_lhs: C is an EL concept and D an L⊓ concept;
L⊓: C and D are both L⊓ concepts.</p>
      <p>A TBox (or ontology) is a finite set of concept inclusions,
also called the TBox axioms, in a given language fragment.</p>
      <p>An interpretation I satisfies a concept inclusion C ⊑ D
(I ⊨ C ⊑ D) iff C^I ⊆ D^I. Whenever I satisfies all axioms
in a TBox T (I ⊨ T), we say that I is a model of T. For
a set of interpretations S, we write S ⊨ C ⊑ D to denote
that every interpretation in S satisfies C ⊑ D. We say that T
entails C ⊑ D (T ⊨ C ⊑ D) iff every model of T satisfies
C ⊑ D. Two TBoxes T and H are (logically) equivalent
(T ≡ H) iff they have the same sets of models.</p>
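      <p>As an executable rendering of these definitions (again ours, continuing the concept classes sketched above), a finite interpretation can store the extensions of concept and role names explicitly; C^I is then computed by structural recursion, and I ⊨ C ⊑ D is a set-inclusion check.</p>
      <preformat>
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    domain: set                                  # Δ^I
    atoms: dict = field(default_factory=dict)    # maps each A to its extension
    roles: dict = field(default_factory=dict)    # maps each r to a set of pairs

def ext(c, i):
    """The extension C^I, by structural recursion on C."""
    if isinstance(c, Top):
        return set(i.domain)
    if isinstance(c, Atom):
        return set(i.atoms.get(c.name, set()))
    if isinstance(c, And):
        return ext(c.left, i).intersection(ext(c.right, i))
    if isinstance(c, Exists):
        filler = ext(c.filler, i)
        return {x for x in i.domain
                if any((x, y) in i.roles.get(c.role, set()) for y in filler)}
    raise TypeError(c)

def satisfies(i, c, d):
    """I ⊨ C ⊑ D iff C^I ⊆ D^I."""
    return ext(c, i).issubset(ext(d, i))
      </preformat>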
      <p>A pointed interpretation (I, d) is a pair consisting of a
DL interpretation I = (Δ^I, ·^I) and an individual d ∈ Δ^I,
such that every e ∈ Δ^I different from d is reachable from
d through some role composition in I. By a slight abuse of
notation, given an arbitrary DL interpretation I and an
individual d ∈ Δ^I, we write (I, d) to denote the largest subset
I′ of I such that (I′, d) is a pointed interpretation. If it is
clear from the context, we refer to pointed interpretations and
pointed models simply as interpretations and models. We say
that (I, d) is a model of a concept C iff d ∈ C^I; it is a model
of C w.r.t. T whenever also I ⊨ T.</p>
      <p>An interpretation (I, d) can be homomorphically
embedded in an interpretation (J, e), denoted as (I, d) ↦ (J, e),
iff there exists a mapping h : Δ^I → Δ^J satisfying the
following conditions:</p>
      <p>h(d) = e;
if (a, b) ∈ r^I then (h(a), h(b)) ∈ r^J, for every a, b ∈ Δ^I
and r ∈ NR;
if a ∈ A^I then h(a) ∈ A^J, for every a ∈ Δ^I and
A ∈ NC.</p>
      <p>A model (I, d) of C (w.r.t. T) is called minimal iff it can
be homomorphically embedded in every other model of C
(w.r.t. T). It is well-known that EL concepts and TBoxes
always have such minimal models (unique up to
homomorphic embeddings) [Lutz et al., 2010]. As in most modal
logics, arbitrary EL models can be unravelled into
equivalent tree-shaped models. Finally, we observe that due to a
tight relationship between the syntax and semantics of EL,
every tree-shaped interpretation (I, d) can be viewed as an
EL concept C_I, such that (I, d) is a minimal model of C_I.
Formally, we set C_I = C(d), where for every e ∈ Δ^I we let
C(e) = ⊤ ⊓ A(e) ⊓ ∃(e), with A(e) = ⨅{A ∈ NC | e ∈ A^I}
and ∃(e) = ⨅{∃r.C(f) | (r, f) ∈ NR × Δ^I s.t. (e, f) ∈ r^I}. In
that case we call C_I the covering concept for (I, d).</p>
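      <p>The covering concept admits a direct recursive implementation; the sketch below (ours, building on the classes above) assumes the input pointed interpretation is tree-shaped, which is what guarantees termination.</p>
      <preformat>
from functools import reduce

def covering_concept(i, e):
    """C(e) = ⊤ ⊓ ⨅{A | e ∈ A^I} ⊓ ⨅{∃r.C(f) | (e, f) ∈ r^I}."""
    conjuncts = [Top()]
    for a in sorted(i.atoms):
        if e in i.atoms[a]:
            conjuncts.append(Atom(a))
    for r in sorted(i.roles):
        for (x, f) in i.roles[r]:
            if x == e:
                # recursion terminates because (I, d) is tree-shaped
                conjuncts.append(Exists(r, covering_concept(i, f)))
    return reduce(And, conjuncts)
      </preformat>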
    </sec>
    <sec id="sec-3">
      <title>Learning Model</title>
      <p>The learning model studied in this paper is a variant of
learning from positive interpretations [De Raedt and Lavrač, 1993;
De Raedt, 1994]. In our setting, the teacher fixes a target
TBox T, whose set of all models is denoted by M(T).
Further, the teacher presents a set of examples from M(T) to the
learner, whose goal is to correctly identify T based on this
input. The learning process is conducted relative to a mutually
known DL language L and a finite signature Σ_T used in T.
Obviously, M(T) contains in principle sufficient information
to enable correct identification of T, as the following
correspondence implies:</p>
      <p>M(T) ⊨ C ⊑ D iff T ⊨ C ⊑ D, for every C ⊑ D in L.</p>
      <p>However, as M(T) might consist of infinitely many
models of possibly infinite size, the teacher cannot effectively
present them all to the learner. Instead, the teacher must
confine him- or herself to a certain finitely presentable subset of
M(T), called the learning set. For the sake of clarity, we
focus here on the simplest case, when learning sets consist
of finitely many finite models. Formally, we summarize the
learning model with the following definitions.</p>
      <p>Definition 1 (TIP) A TBox Identification Problem (TIP) is a
pair (T, S), where T is a TBox in a DL language L and S,
called the learning set, is a finite set of finite models of T.</p>
      <p>Definition 2 (Learner, identification) For a DL language
L, a learner is a computable function G, which for every set S
over Σ_T returns a TBox in L over Σ_T. Learner G correctly
identifies T on S whenever G(S) ≡ T.</p>
      <p>An alternative, more general approach can be defined in terms
of specific fragments of models. Such a generalization, which lies
beyond the scope of this paper, is essential when the learning problem
concerns languages without the finite model property.</p>
      <sec id="sec-3-1">
        <title>Definition 3 (Learnability)</title>
        <p>For a DL language L, the class
of TBoxes expressible in L is learnable iff there exists a
learner G such that for every TBox T in L there exists a
learning set S on which G correctly identifies T. It is said
to be finitely learnable whenever it is learnable from finite
learning sets only.</p>
        <p>We are primarily interested here in the notion of finite
learnability, as it provides a natural formal foundation for the
task of ontology learning from data. Intuitively, any finite
collection of data, structured with respect to some
implicitly adopted ontology, can be seen as a potentially instructive
learning set, as presented in the example in Figure 1. The key
question is then what formal criteria this set must satisfy to
warrant correct identification of the ontology constraining it.
To this end we employ the basic admissibility condition,
characteristic also of other learning frameworks [Shapiro, 1981],
which ensures that the learning set is sufficiently rich to
enable precise discrimination between the correct hypothesis
and all the incorrect ones.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 4 (Admissibility)</title>
        <p>A TIP (T, S) is admissible iff
for every C ⊑ D in L such that T ⊭ C ⊑ D there exists
I ∈ S such that I ⊭ C ⊑ D.</p>
        <p>For the target TBox T, let T^⊭ be the set of all concept
inclusions in L that are not entailed by T, i.e., T^⊭ = {C ⊑
D in L | T ⊭ C ⊑ D}. The admissibility condition requires
that for every C ⊑ D ∈ T^⊭, the learning set S must
contain a “counterexample” for it, i.e., an individual d ∈ Δ^I, for
some I ∈ S, such that d ∈ C^I and d ∉ D^I. Consequently,
any learning set must contain such counterexamples to all
elements of T^⊭, or else the learner might never be justified
in excluding some of these concept inclusions from the
hypothesis. If it were possible to represent them finitely, we could
expect that ultimately the learner can observe all of them and
correctly identify the TBox. In the next section, we
investigate this prospect formally in different fragments of EL.</p>
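        <p>Operationally (a sketch under our Python conventions above), checking whether a given learning set supplies the counterexample required for one non-entailed inclusion amounts to searching for an individual in C^I that lies outside D^I:</p>
        <preformat>
def counterexample(learning_set, c, d):
    """Return (I, individual) witnessing I ⊭ C ⊑ D, or None if S ⊨ C ⊑ D."""
    for i in learning_set:
        witnesses = ext(c, i) - ext(d, i)
        if witnesses:
            return i, next(iter(witnesses))
    return None
        </preformat>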
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Finite Learning Sets</title>
      <p>As argued in the previous section, to enable finite
learnability of T in a given language L, the relevant counterexamples
to all the concept inclusions not entailed by T must be
presentable within a finite learning set S. Firstly, we can
immediately observe that this requirement is trivially satisfied
for L⊓. Clearly, L⊓ can only induce finitely many different
concept inclusions (up to logical equivalence) on a finite
signature such as Σ_T. Hence, the set T^⊭ can always be finitely
represented (up to logical equivalence), and it is
straightforward to finitely present counterexamples to all its members.
For more expressive fragments of EL, however, this cannot
be assumed in general, as the ∃r.C constructor induces
infinitely many concepts. One negative result comes with the
case of EL itself, as demonstrated in the next theorem.</p>
      <sec id="sec-4-1">
        <title>Theorem 1 (Finite learning sets in EL)</title>
        <p>Let T be a TBox in
EL. There exists no finite set S such that (T, S) is admissible.</p>
        <p>The full proof of this and subsequent results is included in
the online technical report [Klarman and Britz, 2015]. The
argument rests on the following lemma. Let (T, S) be an
admissible TIP and C a concept. By S(C) we denote the set
of all models (I, d) of C w.r.t. T such that I ∈ S. By ⋂S(C)
we denote the intersection of all these models, i.e., the model
(J, d) such that (J, d) ↦ (I, d) for every (I, d) ∈ S(C),
and for every other model (J′, d) such that (J′, d) ↦ (I, d)
for every (I, d) ∈ S(C) and (J, d) ↦ (J′, d), it is the case
that (J′, d) ↦ (J, d).</p>
      </sec>
      <sec id="sec-4-2">
        <title>Lemma 1 (Minimal model lemma)</title>
        <p>Let (T, S) be an
admissible TIP for T in EL (resp. in EL_rhs), and C be an
EL (resp. L⊓) concept. Whenever S(C) is non-empty, then
⋂S(C) is a minimal model of C w.r.t. T.</p>
        <p>Given the lemma, we consider concept inclusions of the type:</p>
        <p>δ_n := ∃r. … ∃r.⊤ ⊑ ∃r. … ∃r.⊤ (with n existential
restrictions on the left-hand side and n + 1 on the
right-hand side)</p>
        <p>Suppose δ_n ∈ T^⊭ for some n ∈ N. Since by the
admissibility condition a counterexample to δ_n must be present in S, it
must be the case that S(C) ≠ ∅, where C is the left-hand-side
concept in δ_n. By the lemma and the definition of a minimal
model, it is easy to see that S must contain a finite r-chain of
individuals of length exactly n + 1. Finally, since there can
always exist some n ∈ N such that δ_m ∈ T^⊭ for every m ≥ n,
we see that the joint size of all necessary counterexamples in
such cases must inevitably be infinite. Consequently, for some
EL TBoxes admissible TIPs based on finite learning sets might
not exist, and so finite learnability cannot be achieved in
general.</p>
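        <p>The chain argument can be replayed concretely (an illustration using our sketches above, taking T = ∅): the r-chain of n + 1 individuals 0 → 1 → … → n is a counterexample to δ_n, since its root satisfies the n-fold nesting of ∃r but not the (n+1)-fold one.</p>
        <preformat>
def nested_exists(n):
    """The concept ∃r. … ∃r.⊤ with n existential restrictions."""
    c = Top()
    for _ in range(n):
        c = Exists("r", c)
    return c

n = 3
chain = Interpretation(
    domain=set(range(n + 1)),                      # individuals 0, …, n
    roles={"r": {(k, k + 1) for k in range(n)}},   # 0 → 1 → … → n
)
assert 0 in ext(nested_exists(n), chain)           # LHS of δ_n holds at 0
assert 0 not in ext(nested_exists(n + 1), chain)   # RHS of δ_n fails at 0
        </preformat>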
        <p>One trivial way to tame this behavior is to “finitize” T^⊭
by delimiting the entire space of possible TBox axioms to a
pre-defined, finite set. This can be achieved, for instance, by
restricting the permitted depth of complex concepts or, more
generally, by setting some a priori bound on the size of axioms.
Such ad hoc solutions, though likely efficient in practice, are
not very elegant. As a more interesting alternative, we are able
to show that there exist at least two languages between L⊓
and EL, namely EL_lhs and EL_rhs, for which finite learning
sets are always guaranteed to exist, regardless of the fact that
they permit infinitely many concept inclusions. In fact, we
demonstrate that in both cases such learning sets might well
consist of exactly one exemplary finite model.</p>
        <p>[Figure: a finite interpretation whose individuals are labeled
with the concepts they satisfy, e.g., A, B, A⊓B, ∃r.(A⊓B),
∃r.∃r.A; A, ∃r.(A⊓B); ∃r.A.]</p>
        <p>We adopt the technique of so-called types, known from the
area of modal logics [Pratt, 1979]. Types are finite
abstractions of possible individuals in the interpretation domain, out
of which arbitrary models can be constructed. Let con(T) be
the set of all concepts (and all their subconcepts) occurring in
T. A type over Σ_T is a set t ⊆ con(T) such that C ⊓ D ∈ t
iff C ∈ t and D ∈ t, for every C ⊓ D ∈ con(T). A type t
is saturated for T iff for every C ⊑ D ∈ T, if C ∈ t then
D ∈ t. For any S ⊆ con(T), we write t_S to denote the
smallest saturated type containing S. It is easy to see that t_S
must be unique for EL.</p>
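        <p>The smallest saturated type t_S can be computed as a simple fixpoint; the sketch below (ours, with TBox axioms given as (lhs, rhs) pairs of the Concept values above) closes S under the conjunction condition and under the axioms of T.</p>
        <preformat>
def subconcepts(c):
    """All subconcepts of c, including c itself."""
    if isinstance(c, And):
        return {c} | subconcepts(c.left) | subconcepts(c.right)
    if isinstance(c, Exists):
        return {c} | subconcepts(c.filler)
    return {c}

def smallest_saturated_type(s, tbox):
    """The smallest saturated type t_S containing the concepts in s."""
    con = set()                              # con(T)
    for lhs, rhs in tbox:
        con |= subconcepts(lhs) | subconcepts(rhs)
    t = set(s)
    changed = True
    while changed:
        changed = False
        for c in con:
            if isinstance(c, And):
                # type condition: C ⊓ D ∈ t iff C ∈ t and D ∈ t
                if c.left in t and c.right in t and c not in t:
                    t.add(c)
                    changed = True
                if c in t and not (c.left in t and c.right in t):
                    t.update({c.left, c.right})
                    changed = True
        for lhs, rhs in tbox:
            # saturation: if C ⊑ D ∈ T and C ∈ t then D ∈ t
            if lhs in t and rhs not in t:
                t.add(rhs)
                changed = True
    return frozenset(t)
        </preformat>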
        <p>The next theorem addresses the case of EL_rhs. Figure 2
illustrates a finite learning set for a sample EL_rhs TBox,
following the construction in the proof.</p>
        <sec id="sec-4-2-1">
          <title>Theorem 2 (Finite learning sets in EL_rhs)</title>
          <p>Let T be a TBox
in EL_rhs. There exists a finite set S such that (T, S) is
admissible.</p>
          <p>Proof sketch. Let Θ be the smallest set of types satisfying
the following conditions:</p>
          <p>t_S ∈ Θ, for every S ⊆ NC and for S = {⊤};
if t ∈ Θ then t_{C} ∈ Θ, for every ∃r.C ∈ t.</p>
          <p>We define the interpretation I = (Δ^I, ·^I) as follows:</p>
          <p>Δ^I := Θ;
t ∈ A^I iff A ∈ t, for every t ∈ Δ^I and A ∈ NC;
(t, t_{C}) ∈ r^I, for every t ∈ Δ^I, whenever ∃r.C ∈ t.</p>
          <p>Then S = {I} is a finite learning set such that (T, S) is
admissible. □</p>
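          <p>In code, this construction can be sketched as follows (our rendering, reusing smallest_saturated_type and the classes above; Θ is represented as a set of frozensets). The single interpretation returned constitutes the whole learning set.</p>
          <preformat>
from itertools import combinations

def theorem2_interpretation(nc, tbox):
    """Single-model learning set for an EL_rhs TBox, per Theorem 2."""
    # seed types: t_S for every S ⊆ NC, and for S = {⊤}
    seeds = [set(map(Atom, s))
             for k in range(len(nc) + 1)
             for s in combinations(sorted(nc), k)]
    seeds.append({Top()})
    domain = set()
    todo = [smallest_saturated_type(s, tbox) for s in seeds]
    while todo:                      # close Θ under: t ∈ Θ implies t_{C} ∈ Θ
        t = todo.pop()
        if t in domain:
            continue
        domain.add(t)
        for c in t:
            if isinstance(c, Exists):
                todo.append(smallest_saturated_type({c.filler}, tbox))
    atoms = {a: {t for t in domain if Atom(a) in t} for a in nc}
    roles = {}
    for t in domain:                 # (t, t_{C}) ∈ r^I whenever ∃r.C ∈ t
        for c in t:
            if isinstance(c, Exists):
                u = smallest_saturated_type({c.filler}, tbox)
                roles.setdefault(c.role, set()).add((t, u))
    return Interpretation(domain=domain, atoms=atoms, roles=roles)
          </preformat>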
          <p>A similar, though somewhat more complex, construction
demonstrates the existence of finite learning sets in EL_lhs.
Again, we illustrate the approach with an example in
Figure 3.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>Theorem 3 (Finite learning sets in EL_lhs)</title>
          <p>Let T be a TBox
in EL_lhs. There exists a finite set S such that (T, S) is
admissible.</p>
          <p>Proof sketch. Let Θ be the set of all saturated types over
Σ_T, and Θ′ be its subset obtained by iteratively eliminating all
those types t that violate the following condition: for every
r ∈ NR and every existential restriction ∃r.C ∈ t there is
u ∈ Θ′ such that:</p>
          <p>C ∈ u;
for every ∃r.D ∈ con(T), if D ∈ u then ∃r.D ∈ t.</p>
          <p>Further, we define the interpretation I = (Δ^I, ·^I) as follows:</p>
          <p>Δ^I := Θ′;
t ∈ A^I iff A ∈ t, for every t ∈ Δ^I and A ∈ NC;
(t, u) ∈ r^I iff for every ∃r.C ∈ con(T), if C ∈ u then
∃r.C ∈ t.</p>
          <p>Then S = {I} is a finite learning set such that (T, S) is
admissible. □</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Learning Algorithms</title>
      <p>In this section we devise basic learning algorithms that
correctly identify EL_lhs and EL_rhs TBoxes in admissible TIPs
based on finite learning sets. Since T^⊭ can in general still be
infinite, our starting observation is that a learner cannot
effectively eliminate concept inclusions from T^⊭ using a
straightforward enumeration, thus arriving at the target TBox T. The
only feasible strategy is to try to identify the “good”
candidate axioms to be included in T, and possibly apply the
elimination strategy only to finitely many incorrect guesses. One
generic procedure employing this heuristic, which we define
as Algorithm 1, attempts to construct the hypothesis by
extending it with consecutive axioms of systematically growing
size that are satisfied by the learning set. There, by ℓ(C ⊑ D)
we denote the size of the axiom C ⊑ D, measured in the total
number of symbols used for expressing this axiom. At each
step the algorithm makes use of a simple equivalence oracle,
which informs whether the currently considered hypothesis is
already equivalent to the learning target (in that case the
identification succeeds) or whether some axioms are still missing.
Theorem 4 demonstrates the correctness of this approach.</p>
      <p>Theorem 4 (Correct identification in EL_rhs/EL_lhs) Let
(T, S) be an admissible TIP for T in EL_rhs/EL_lhs. Then the
hypothesis TBox H generated by Algorithm 1 is equivalent to
T.</p>
      <p>Obviously, the use of the oracle is essential to warrant
termination of the algorithm. It is not difficult to see that, without
it, the algorithm must still converge on the correct TBox for
some n ∈ N, and consequently settle on it, i.e., H_m ≡ H_n
for every m ≥ n. However, at no point in time can it
guarantee that the convergence has already been achieved, and
so it can only warrant learnability in the limit. This result is
therefore not entirely satisfactory, considering we aim at finite
learnability from data in the unsupervised setting.</p>
      <p>Algorithm 1 Learning EL_rhs/EL_lhs TBoxes on finite inputs.
Input: a finite learning set S
Output: a hypothesis TBox H
1: H := ∅; n := 0
2: while the answer to “H ≡ T?” is ‘NO’ do (equivalence
oracle querying)
3: n := n + 1
4: for every axiom C ⊑ D over Σ_T with ℓ(C ⊑ D) = n do
5: if S ⊨ C ⊑ D then H := H ∪ {C ⊑ D}
6: end for
7: end while
8: return H</p>
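      <p>For concreteness, here is a sketch of Algorithm 1 in the Python conventions above; enumerate_axioms and equivalent_to_target are hypothetical stand-ins for the size-bounded axiom enumeration and the equivalence oracle, respectively.</p>
      <preformat>
def algorithm1(learning_set, enumerate_axioms, equivalent_to_target):
    """Grow the hypothesis with ever larger axioms satisfied by S,
    stopping when the equivalence oracle accepts it."""
    hypothesis = []
    n = 0
    while not equivalent_to_target(hypothesis):      # oracle call: H ≡ T?
        n += 1
        for c, d in enumerate_axioms(n):             # axioms with ℓ(C ⊑ D) = n
            if all(satisfies(i, c, d) for i in learning_set):
                hypothesis.append((c, d))            # keep axioms with S ⊨ C ⊑ D
    return hypothesis
      </preformat>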
      <p>A major positive result, by contrast, can be delivered
for the case of EL_rhs, for which we devise an effective
learning algorithm making no reference to any oracle. It turns out
that in EL_rhs the “good” candidate axioms can be directly
extracted from the learning set, thus granting a proper
unsupervised learning method. The essential insight is provided by
Lemma 1, presented in the previous section. Given any L⊓
concept C such that S(C) ≠ ∅, we are able to identify a
tree-shaped minimal model of C w.r.t. T. Effectively, it suffices
to retrieve only the initial part of this model, discarding its
infinitely recurrent (cyclic) subtrees. Such an initial model
I_init is constructed by Algorithm 2. The algorithm performs
a simultaneous unravelling of all models in S(C), while on the
way computing intersections of the visited combinations of
individuals, which are subsequently added to the model under
construction. Whenever the same combination of individuals
is about to be visited for the second time on the same branch,
it is skipped, as a cycle has evidently been detected. The covering
concept C_{I_init} for the resulting interpretation I_init is then
included in the hypothesis within the axiom C ⊑ C_{I_init}.
Meanwhile, all L⊓ concepts C such that S(C) = ∅ are ensured
to entail every EL concept, as implied by the admissibility
condition. The contents of the hypothesis TBox are formally
specified in Definition 5. Theorem 5 demonstrates the
correctness of the whole learning procedure.</p>
      <sec id="sec-5-1">
        <title>Definition 5 (EL_rhs hypothesis TBox)</title>
        <p>Let (T, S) be an
admissible TIP for T in EL_rhs over the signature Σ_T. The
hypothesis TBox H is the set consisting of all the following
axioms:</p>
        <p>C ⊑ C_{I_init} for every L⊓ concept C such that S(C) ≠ ∅,
where C_{I_init} is the covering concept for the interpretation
(I_init, d) generated by Algorithm 2 on S(C);
C ⊑ ⨅_{r ∈ NR} ∃r.⨅NC for every L⊓ concept C such
that S(C) = ∅, where ⨅NC denotes the conjunction of all
concept names in NC.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Theorem 5 (Correct identification in EL_rhs)</title>
        <p>Let (T, S) be
an admissible TIP for T in EL_rhs. Then the hypothesis TBox
H for S is equivalent to T.</p>
        <p>Algorithm 2 Computing the initial part of the minimal model
⋂S(C)
Input: the set S(C) = {(I_i, d_i)}_{0 ≤ i ≤ n}, for some n ∈ N
Output: a finite tree-shaped interpretation (J, d), where
J = (Δ^J, ·^J)
1: Δ^J := {f(d_0, …, d_n)}, for a “fresh” function symbol f
2: A^J := ∅, for every A ∈ NC
3: r^J := ∅, for every r ∈ NR
4: for every f(d_0, …, d_n) ∈ Δ^J, (e_0, …, e_n) ∈ Δ^{I_0} ×
… × Δ^{I_n}, r ∈ NR do
5: if (d_i, e_i) ∈ r^{I_i} for every 0 ≤ i ≤ n and there
exists no function symbol g such that g(e_0, …, e_n) is an
ancestor of f(d_0, …, d_n) in J then
6: Δ^J := Δ^J ∪ {g(e_0, …, e_n)}, for a “fresh” function
symbol g
7: r^J := r^J ∪ {(f(d_0, …, d_n), g(e_0, …, e_n))}
8: end if
9: end for
10: for every f(d_0, …, d_n) ∈ Δ^J, A ∈ NC do
11: if d_i ∈ A^{I_i} for every 0 ≤ i ≤ n then
12: A^J := A^J ∪ {f(d_0, …, d_n)}
13: end if
14: end for
15: return (J, f(d_0, …, d_n)), where f(d_0, …, d_n) is the
root of J, created at step 1.</p>
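        <p>A compact executable rendering of Algorithm 2 (ours, reusing the Interpretation class above) represents each tree node by an integer identifier carrying its tuple of source individuals and the set of tuples already on its branch, so that a repeated combination on the same branch is detected and skipped:</p>
        <preformat>
from itertools import product

def initial_model(pointed_models):
    """Tree-shaped initial part of ⋂S(C), for S(C) = [(I_0, d_0), …];
    node 0 is the root d of the returned interpretation."""
    interps = [i for i, _ in pointed_models]
    root = tuple(d for _, d in pointed_models)
    domain, atoms, roles = {0}, {}, {}
    info = {0: (root, frozenset())}   # node id: (its tuple, ancestor tuples)
    fresh, todo = 1, [0]
    role_names = {r for i in interps for r in i.roles}
    atom_names = {a for i in interps for a in i.atoms}
    while todo:
        v = todo.pop()
        tup, anc = info[v]
        path = anc | {tup}
        for r in role_names:
            # componentwise r-successors of tup across all models in S(C)
            succs = [[y for (x, y) in interp.roles.get(r, set()) if x == tup[k]]
                     for k, interp in enumerate(interps)]
            for new in product(*succs):
                if new in path:       # same combination already on this branch
                    continue
                w = fresh
                fresh += 1
                domain.add(w)
                info[w] = (new, path)
                roles.setdefault(r, set()).add((v, w))
                todo.append(w)
    for v, (tup, _) in info.items():  # intersect concept memberships
        for a in atom_names:
            if all(tup[k] in interp.atoms.get(a, set())
                   for k, interp in enumerate(interps)):
                atoms.setdefault(a, set()).add(v)
    return Interpretation(domain=domain, atoms=atoms, roles=roles)
        </preformat>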
        <p>The learning algorithm runs in double exponential time in the
worst case and generates TBoxes of double exponential size
in the size of S. This follows from the fact that the
tree-shaped interpretations generated by Algorithm 2 might be of
depth exponential in the number of individuals occurring in S
and have an exponential branching factor. Importantly, however,
there might exist solutions far closer to optimal, which
we have not so far investigated.</p>
        <p>It is our strong conjecture, which we leave as an open
problem, that a related learning strategy should also be applicable
in the context of EL_lhs.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>An alternative approach to learning DL TBoxes, based on
Angluin’s model of learning from entailment [Angluin, 1988],
was recently introduced by Konev et al. [Konev et al., 2014].
There, the learner identifies the TBox by posing two types
of queries: entailment (“T ⊨ C ⊑ D?”) and equivalence
(“H ≡ T? If no, then return a positive or a negative
counterexample”). The authors study polynomial learnability and
define corresponding algorithms for EL_lhs and EL_rhs, while
for EL they show that no such polynomial algorithm
exists. Apart from the obvious differences in the motivation
underlying the two learning models (unsupervised learning from
data vs. learning by queries from an expert), there are also
some strong formal connections. Essentially, given a finite
learning set in an admissible TIP, a learner from
interpretations can autonomously answer arbitrary entailment queries,
thus effectively simulating the entailment oracle. However,
the learner does not by default have access to the equivalence
oracle. Once such an oracle is included, as done in our
Algorithm 1, the learning power of the two learners becomes
comparable (note that with some smart heuristic our learner can
find a positive or negative counterexample whenever the
oracle gives a negative answer). In this sense, our Theorem 4
should also be indirectly derivable from the results by Konev
et al. However, our stronger result for EL_rhs in Theorem 5
demonstrates that, at least in some cases, the learner from
interpretations is able to succeed without employing the
equivalence oracle, which is essential to the other approach.</p>
      <p>Less directly, our work is also related to various
contributions on the learnability of different types of formal
structures from data, e.g., first-order theories from facts [Shapiro,
1981], finite automata descriptions from observations [Pitt,
1989], and logic programs from interpretations [De Raedt and
Lavrač, 1993; De Raedt, 1994]. In the area of DLs, a few
learning scenarios have been formally addressed, concerned
largely with learning concept descriptions via different
learning operators [Straccia and Mucci, 2015; Lehmann and
Hitzler, 2008; Fanizzi et al., 2008; Cohen and Hirsh, 1994] and
applications of formal concept analysis techniques to the
automated generation of DL axioms from data [Baader et al.,
2007; Distel, 2011].</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions and Outlook</title>
      <p>In this paper, we have delivered initial results on the finite
learnability of DL TBoxes from interpretations. We believe that
this direction shows a lot of promise for establishing formal
foundations for the task of ontology learning from data. Some
immediate problems left open by this work concern the
finite learnability of EL_lhs TBoxes in an unsupervised setting,
and possibly of other lightweight fragments of DLs. Another
set of very interesting research questions should deal, in our
view, with the possibility of formulating alternative
conditions on the learning sets and the corresponding learnability
guarantees they would imply in different DL languages. In
particular, some limited use of a closed-world operator over the
learning sets might make it possible to relax the practically
restrictive admissibility condition. Finally, the development of
practical learning algorithms, possibly building on existing
inductive logic programming methods, is an obvious area to
welcome further research efforts.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Angluin,
          <year>1988</year>
          ]
          <string-name>
            <given-names>Dana</given-names>
            <surname>Angluin</surname>
          </string-name>
          .
          <article-title>Queries and concept learning</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>2</volume>
          (
          <issue>4</issue>
          ):
          <fpage>319</fpage>
          -
          <lpage>342</lpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Baader et al.,
          <year>2003</year>
          ]
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Diego</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Deborah L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Peter F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          .
          <source>The Description Logic Handbook: Theory, Implementation, and Applications</source>
          . Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Baader et al.,
          <year>2005</year>
          ]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In Proc. of IJCAI-05</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Baader et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          , Bernhard Ganter, Baris Sertkaya, and
          <string-name>
            <given-names>Ulrike</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Completing description logic knowledge bases using formal concept analysis</article-title>
          .
          <source>In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI-07)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Cohen and Hirsh,
          <year>1994</year>
          ] William W. Cohen and Haym Hirsh.
          <article-title>Learning the classic description logic: Theoretical and experimental results</article-title>
          .
          <source>In Proc. of Principles of Knowledge Representation and Reasoning (KR-94)</source>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [De Raedt and Lavrač,
          <year>1993</year>
          ] Luc De Raedt and Nada Lavrač.
          <article-title>The many faces of inductive logic programming</article-title>
          .
          <source>In Methodologies for Intelligent Systems</source>
          , pages
          <fpage>435</fpage>
          -
          <lpage>449</lpage>
          .
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [De Raedt,
          <year>1994</year>
          ] Luc De Raedt.
          <article-title>First order jk-clausal theories are PAC-learnable</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>70</volume>
          :
          <fpage>375</fpage>
          -
          <lpage>392</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Fanizzi et al.,
          <year>2008</year>
          ]
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , Claudia d'Amato, and Floriana Esposito.
          <article-title>DL-FOIL concept learning in description logics</article-title>
          .
          <source>Inductive Logic Programming</source>
          , pages
          <fpage>107</fpage>
          -
          <lpage>121</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Hoekstra,
          <year>2010</year>
          ]
          <string-name>
            <given-names>Rinke</given-names>
            <surname>Hoekstra</surname>
          </string-name>
          .
          <article-title>The knowledge reengineering bottleneck</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>1</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>111</fpage>
          -
          <lpage>115</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Klarman and Britz,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Szymon</given-names>
            <surname>Klarman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Katarina</given-names>
            <surname>Britz</surname>
          </string-name>
          .
          <article-title>Towards unsupervised ontology learning from data</article-title>
          .
          <source>Technical report</source>
          , CAIR, UKZN/CSIR Meraka. http://klarman.synthasite.com/resources/KlaBri-DARe15.pdf,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Konev et al.,
          <year>2014</year>
          ]
          <string-name>
            <given-names>Boris</given-names>
            <surname>Konev</surname>
          </string-name>
          , Carsten Lutz, Ana Ozaki, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Exact learning of lightweight description logic ontologies</article-title>
          .
          <source>In Proc. of Principles of Knowledge Representation and Reasoning (KR-14)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Lehmann and Hitzler,
          <year>2008</year>
          ]
          <string-name>
            <given-names>Jens</given-names>
            <surname>Lehmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Hitzler</surname>
          </string-name>
          .
          <article-title>A refinement operator based learning algorithm for the ALC description logic</article-title>
          . In Hendrik Blockeel, Jan Ramon,
          <string-name>
            <given-names>Jude</given-names>
            <surname>Shavlik</surname>
          </string-name>
          , and Prasad Tadepalli, editors,
          <source>Inductive Logic Programming</source>
          , pages
          <fpage>147</fpage>
          -
          <lpage>160</lpage>
          . Springer Berlin Heidelberg,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Lutz et al.,
          <year>2010</year>
          ]
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          , Robert Piro, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Enriching EL concepts with greatest fixpoints</article-title>
          .
          .
          <source>In Proceedings of the 2010 Conference on ECAI 2010: 19th European Conference on Artificial Intelligence</source>
          , pages
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          . IOS Press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Maedche and Staab,
          <year>2004</year>
          ]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Maedche</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steffen</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>Ontology learning</article-title>
          .
          <source>In Steffen Staab and Rudi Studer</source>
          , editors,
          <source>Handbook on Ontologies</source>
          , pages
          <fpage>173</fpage>
          -
          <lpage>189</lpage>
          . Springer,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Pitt,
          <year>1989</year>
          ]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Pitt</surname>
          </string-name>
          .
          <article-title>Inductive inference, DFAs, and computational complexity</article-title>
          . In Klaus P. Jantke, editor,
          <source>Analogical and Inductive Inference</source>
          , volume
          <volume>397</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>18</fpage>
          -
          <lpage>44</lpage>
          . Springer Berlin Heidelberg,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Pratt,
          <year>1979</year>
          ]
          <string-name>
            <given-names>V.R.</given-names>
            <surname>Pratt</surname>
          </string-name>
          .
          <article-title>Models of program logics</article-title>
          .
          <source>In Proc. of Foundations of Computer Science (FOCS-79)</source>
          ,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Shapiro,
          <year>1981</year>
          ]
          <string-name>
            <given-names>Ehud Y.</given-names>
            <surname>Shapiro</surname>
          </string-name>
          .
          <article-title>Inductive inference of theories from facts</article-title>
          .
          <source>In Computational Logic: Essays in Honor of Alan Robinson</source>
          (
          <year>1991</year>
          ). MIT Press,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Straccia and Mucci,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Umberto</given-names>
            <surname>Straccia</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matteo</given-names>
            <surname>Mucci</surname>
          </string-name>
          .
          <article-title>pFOIL-DL: Learning (fuzzy) EL concept descriptions from crisp OWL data using a probabilistic ensemble estimation</article-title>
          .
          <source>In Proc. of the 30th Annual ACM Symposium on Applied Computing (SAC-15)</source>
          . ACM,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>