<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Unsupervised Ontology Learning from Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Szymon Klarman</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katarina Britz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for AI Research, CSIR and Stellenbosch University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Brunel University London</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Data-driven elicitation of ontologies from structured data is a well-recognized instance of the knowledge acquisition bottleneck. The development of efficient techniques for (semi-)automating this task is therefore practically vital, yet it is hindered by the lack of robust theoretical foundations. In this paper, we study the problem of learning Description Logic TBoxes from interpretations, which naturally translates to the task of ontology learning from data. In the presented framework, the learner is provided with a set of positive interpretations (i.e., logical models) of the TBox adopted by the teacher. The goal is to correctly identify the TBox given this input. We characterize the key constraints on the models that warrant finite learnability of TBoxes expressed in selected fragments of the Description Logic EL, and we define corresponding learning algorithms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>With the advent of the Web of Data and various “e-” initiatives,
such as e-science, e-health, e-governance, etc., the focus of
the classical knowledge acquisition bottleneck has become ever
more concentrated on the problem of constructing rich
and accurate ontologies enabling efficient management of
the existing abundance of data [Maedche and Staab, 2004].
Whereas the traditional understanding of this bottleneck has
been associated with the necessity of developing ontologies
ex ante, in a top-down, data-agnostic manner, this view seems to
be currently evolving into a new position, recently dubbed the
knowledge reengineering bottleneck [Hoekstra, 2010]. In this
view, the contemporary challenge is, conversely, to enable
data-driven approaches to ontology design: methods that
can make use of, and make sense of, the existing data, be it
readily available on the web or crowdsourced, leading to the
elicitation of the ontological commitments implicitly present at the
data level. Even though the development of such techniques
and tools, which could help (semi-)automate ontology learning
processes so characterized, is becoming vital in practice,
the robust theoretical foundations for the problem are still
rather limited. This gap is addressed in the present work.</p>
      <p>In this paper, we study the problem of learning
Description Logic (DL) TBoxes from interpretations, which
naturally translates to the task of ontology learning from data.
DLs are a popular family of knowledge representation
formalisms [Baader et al., 2003], which have risen to
prominence as, among others, the logics underpinning the different
profiles of the Web Ontology Language OWL. In this
paper, we focus on the lightweight DL EL [Baader et al., 2005]
and some of its more specific fragments. This choice is
motivated, on the one hand, by the interesting applications
of EL, especially as the logic behind the OWL 2 EL profile,
and on the other, by its relative complexity, which enables
us to make interesting observations from the learning
perspective. Our learning model is a variant of learning from
positive interpretations (i.e., from models of the target
theory), a generally established framework in the field of
inductive logic programming [De Raedt and Lavrač, 1993;
De Raedt, 1994]. In our scenario, the goal of the learner is
to correctly identify the target TBox T given a finite set of
its finite models. Our overarching interest lies in algorithms
warranting effective learnability in such a setting with no or
minimal supervision. Our key research questions and
contributions are therefore concerned with the identification of
specific languages and conditions on the learning input under
which such algorithms can in principle be defined.</p>
      <p>In the following two sections, we introduce DL
preliminaries and discuss the adopted learning model. In Section 4,
we identify two interesting fragments of EL, called EL_rhs
and EL_lhs, which satisfy some basic necessary conditions
enabling finite learnability, and at the same time, we show
that full EL does not meet that same requirement. In
Section 5, we devise a generic algorithm which correctly
identifies EL_rhs and EL_lhs TBoxes from finite data, employing a
basic equivalence oracle. Further, in the case of EL_rhs, we
significantly strengthen this result by defining an algorithm which
makes no such calls to an oracle, and thus supports fully
unsupervised learning. In Section 6, we compare our work to
related contributions, in particular to the framework of
learning TBoxes from entailment queries by Konev et al. [Konev
et al., 2014]. We conclude in Section 7 with an overview of
interesting open problems.</p>
      <p>See http://www.w3.org/TR/owl2-profiles/.</p>
      <p>This work was funded in part by the National Research
Foundation under Grant no. 85482.</p>
    </sec>
    <sec id="sec-2">
      <title>Description Logic Preliminaries</title>
      <p>The language of the Description Logic (DL) EL [Baader et
al., 2005] is given by (1) a vocabulary Σ = (NC, NR), where
NC is a set of concept names and NR a set of role names,
and (2) the following set of constructors for defining complex
concepts, which we divide into two groups:</p>
      <p>EL: C, D ::= ⊤ | A | C ⊓ D | ∃r.C</p>
      <p>L⊓: C, D ::= ⊤ | A | C ⊓ D</p>
      <p>where A ∈ NC and r ∈ NR. The set of L⊓ concepts
naturally captures the propositional part of EL. The depth of a
subconcept D in C is the number of existential restrictions
within whose scope D occurs. The depth of a concept
C is the greatest depth of any of its subconcepts.
Every L⊓ concept is trivially of depth 0.</p>
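      <p>To make the grammar concrete, the following minimal Python sketch (ours, purely illustrative and not part of the paper) renders EL concepts as an algebraic data type, together with the depth measure just defined; all names are our own.</p>
      <preformat>
# EL concepts as a small algebraic data type; L⊓ concepts are exactly
# those terms built without the Exists constructor.
from dataclasses import dataclass

class Concept: ...

@dataclass(frozen=True)
class Top(Concept): ...          # ⊤

@dataclass(frozen=True)
class Atom(Concept):             # a concept name A ∈ NC
    name: str

@dataclass(frozen=True)
class And(Concept):              # C ⊓ D
    left: Concept
    right: Concept

@dataclass(frozen=True)
class Exists(Concept):           # ∃r.C
    role: str
    filler: Concept

def depth(c):
    """Nesting of existential restrictions; every L⊓ concept has depth 0."""
    if isinstance(c, And):
        return max(depth(c.left), depth(c.right))
    if isinstance(c, Exists):
        return 1 + depth(c.filler)
    return 0                     # Top or Atom

# Example: ∃r.(A ⊓ ∃r.B) has depth 2.
assert depth(Exists("r", And(Atom("A"), Exists("r", Atom("B"))))) == 2
      </preformat>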
      <p>The semantics is defined through interpretations of the
form I = (Δ^I, ·^I), where Δ^I is a non-empty domain of
individuals and ·^I is an interpretation function mapping each
A ∈ NC to a subset A^I ⊆ Δ^I and each r ∈ NR to a
binary relation r^I ⊆ Δ^I × Δ^I. The interpretation function is
inductively extended over complex expressions according to
the fixed semantics of the constructors:</p>
      <p>⊤^I = Δ^I
(C ⊓ D)^I = {x ∈ Δ^I | x ∈ C^I ∩ D^I}
(∃r.C)^I = {x ∈ Δ^I | ∃y : (x, y) ∈ r^I ∧ y ∈ C^I}</p>
      <p>A concept inclusion is an expression of the form C ⊑ D,
stating that all individuals of type C are D, as in, e.g.:
Father_of_son ⊑ Man ⊓ ∃hasChild.Man. The language
fragments considered in this paper are categorized w.r.t.
restrictions imposed on the syntax of concepts C and D in
permitted concept inclusions C ⊑ D:</p>
      <p>EL: C and D are both EL concepts;
EL_rhs: C is an L⊓ concept and D an EL concept;
EL_lhs: C is an EL concept and D an L⊓ concept;
L⊓: C and D are both L⊓ concepts.</p>
      <p>A TBox (or ontology) is a finite set of concept inclusions,
also called the TBox axioms, in a given language fragment.</p>
      <p>An interpretation I satisfies a concept inclusion C ⊑ D
(I ⊨ C ⊑ D) iff C^I ⊆ D^I. Whenever I satisfies all axioms
in a TBox T (I ⊨ T), we say that I is a model of T. For
a set of interpretations S, we write S ⊨ C ⊑ D to denote
that every interpretation in S satisfies C ⊑ D. We say that T
entails C ⊑ D (T ⊨ C ⊑ D) iff every model of T satisfies
C ⊑ D. Two TBoxes T and H are (logically) equivalent
(T ≡ H) iff they have the same sets of models.</p>
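      <p>As an executable rendering of these definitions (again ours, continuing the concept classes sketched above), a finite interpretation can store the extensions of concept and role names explicitly; C^I is then computed by structural recursion, and I ⊨ C ⊑ D is a set-inclusion check.</p>
      <preformat>
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    domain: set                                  # Δ^I
    atoms: dict = field(default_factory=dict)    # maps each A to its extension
    roles: dict = field(default_factory=dict)    # maps each r to a set of pairs

def ext(c, i):
    """The extension C^I, by structural recursion on C."""
    if isinstance(c, Top):
        return set(i.domain)
    if isinstance(c, Atom):
        return set(i.atoms.get(c.name, set()))
    if isinstance(c, And):
        return ext(c.left, i).intersection(ext(c.right, i))
    if isinstance(c, Exists):
        filler = ext(c.filler, i)
        return {x for x in i.domain
                if any((x, y) in i.roles.get(c.role, set()) for y in filler)}
    raise TypeError(c)

def satisfies(i, c, d):
    """I ⊨ C ⊑ D iff C^I ⊆ D^I."""
    return ext(c, i).issubset(ext(d, i))
      </preformat>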
      <p>A pointed interpretation (I, d) is a pair consisting of a
DL interpretation I = (Δ^I, ·^I) and an individual d ∈ Δ^I,
such that every e ∈ Δ^I different from d is reachable from
d through some role composition in I. By a slight abuse of
notation, given an arbitrary DL interpretation I and an
individual d ∈ Δ^I, we write (I, d) to denote the largest subset
I′ of I such that (I′, d) is a pointed interpretation. If it is
clear from the context, we refer to pointed interpretations and
pointed models simply as interpretations and models. We say
that (I, d) is a model of a concept C iff d ∈ C^I; it is a model
of C w.r.t. T whenever also I ⊨ T.</p>
      <p>An interpretation (I, d) can be homomorphically
embedded in an interpretation (J, e), denoted as (I, d) ↦ (J, e),
iff there exists a mapping h : Δ^I → Δ^J satisfying the
following conditions:</p>
      <p>h(d) = e;
if (a, b) ∈ r^I then (h(a), h(b)) ∈ r^J, for every a, b ∈ Δ^I
and r ∈ NR;
if a ∈ A^I then h(a) ∈ A^J, for every a ∈ Δ^I and
A ∈ NC.</p>
      <p>A model (I, d) of C (w.r.t. T) is called minimal iff it can
be homomorphically embedded in every other model of C
(w.r.t. T). It is well-known that EL concepts and TBoxes
always have such minimal models (unique up to
homomorphic embeddings) [Lutz et al., 2010]. As in most modal
logics, arbitrary EL models can be unravelled into
equivalent tree-shaped models. Finally, we observe that due to a
tight relationship between the syntax and semantics of EL,
every tree-shaped interpretation (I, d) can be viewed as an
EL concept C_I, such that (I, d) is a minimal model of C_I.
Formally, we set C_I = C(d), where for every e ∈ Δ^I we let
C(e) = ⊤ ⊓ A(e) ⊓ ∃(e), with A(e) = ⨅{A ∈ NC | e ∈ A^I}
and ∃(e) = ⨅{∃r.C(f) | (r, f) ∈ NR × Δ^I s.t. (e, f) ∈ r^I}. In
that case we call C_I the covering concept for (I, d).</p>
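      <p>The covering concept admits a direct recursive implementation; the sketch below (ours, building on the classes above) assumes the input pointed interpretation is tree-shaped, which is what guarantees termination.</p>
      <preformat>
from functools import reduce

def covering_concept(i, e):
    """C(e) = ⊤ ⊓ ⨅{A | e ∈ A^I} ⊓ ⨅{∃r.C(f) | (e, f) ∈ r^I}."""
    conjuncts = [Top()]
    for a in sorted(i.atoms):
        if e in i.atoms[a]:
            conjuncts.append(Atom(a))
    for r in sorted(i.roles):
        for (x, f) in i.roles[r]:
            if x == e:
                # recursion terminates because (I, d) is tree-shaped
                conjuncts.append(Exists(r, covering_concept(i, f)))
    return reduce(And, conjuncts)
      </preformat>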
    </sec>
    <sec id="sec-3">
      <title>Learning Model</title>
      <p>The learning model studied in this paper is a variant of
learning from positive interpretations [De Raedt and Lavrač, 1993;
De Raedt, 1994]. In our setting, the teacher fixes a target
TBox T, whose set of all models is denoted by M(T).
Further, the teacher presents a set of examples from M(T) to the
learner, whose goal is to correctly identify T based on this
input. The learning process is conducted relative to a mutually
known DL language L and a finite signature Σ_T used in T.
Obviously, M(T) contains in principle sufficient information
to enable correct identification of T, as the following
correspondence implies:</p>
      <p>M(T) ⊨ C ⊑ D iff T ⊨ C ⊑ D, for every C ⊑ D in L.</p>
      <p>However, as M(T) might consist of infinitely many
models of possibly infinite size, the teacher cannot effectively
present them all to the learner. Instead, the teacher must
confine him- or herself to a certain finitely presentable subset of
M(T), called the learning set. For the sake of clarity, we
focus here on the simplest case, when learning sets consist
of finitely many finite models. Formally, we summarize the
learning model with the following definitions.</p>
      <p>Definition 1 (TIP) A TBox Identification Problem (TIP) is a
pair (T, S), where T is a TBox in a DL language L and S,
called the learning set, is a finite set of finite models of T.</p>
      <p>Definition 2 (Learner, identification) For a DL language
L, a learner is a computable function G, which for every set S
over Σ_T returns a TBox in L over Σ_T. Learner G correctly
identifies T on S whenever G(S) ≡ T.</p>
      <p>An alternative, more general approach can be defined in terms
of specific fragments of models. Such a generalization, which lies
beyond the scope of this paper, is essential when the learning problem
concerns languages without the finite model property.</p>
      <sec id="sec-3-1">
        <title>Definition 3 (Learnability)</title>
        <p>For a DL language L, the class
of TBoxes expressible in L is learnable iff there exists a
learner G such that for every TBox T in L there exists a
learning set S on which G correctly identifies T. It is said
to be finitely learnable whenever it is learnable from finite
learning sets only.</p>
        <p>We are primarily interested here in the notion of finite
learnability, as it provides a natural formal foundation for the
task of ontology learning from data. Intuitively, any finite
collection of data, structured with respect to some
implicitly adopted ontology, can be seen as a potentially instructive
learning set, as presented in the example in Figure 1. The key
question is then what formal criteria this set must satisfy to
warrant correct identification of the ontology constraining it.
To this end we employ the basic admissibility condition,
characteristic also of other learning frameworks [Shapiro, 1981],
which ensures that the learning set is sufficiently rich to
enable precise discrimination between the correct hypothesis
and all the incorrect ones.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 4 (Admissibility)</title>
        <p>A TIP (T, S) is admissible iff
for every C ⊑ D in L such that T ⊭ C ⊑ D there exists
I ∈ S such that I ⊭ C ⊑ D.</p>
        <p>For the target TBox T, let T^⊭ be the set of all concept
inclusions in L that are not entailed by T, i.e., T^⊭ = {C ⊑
D in L | T ⊭ C ⊑ D}. The admissibility condition requires
that for every C ⊑ D ∈ T^⊭, the learning set S must
contain a “counterexample” for it, i.e., an individual d ∈ Δ^I, for
some I ∈ S, such that d ∈ C^I and d ∉ D^I. Consequently,
any learning set must contain such counterexamples to all
elements of T^⊭, or else the learner might never be justified
in excluding some of these concept inclusions from the
hypothesis. If it were possible to represent them finitely, we could
expect that ultimately the learner can observe all of them and
correctly identify the TBox. In the next section, we
investigate this prospect formally in different fragments of EL.</p>
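        <p>Operationally (a sketch under our Python conventions above), checking whether a given learning set supplies the counterexample required for one non-entailed inclusion amounts to searching for an individual in C^I that lies outside D^I:</p>
        <preformat>
def counterexample(learning_set, c, d):
    """Return (I, individual) witnessing I ⊭ C ⊑ D, or None if S ⊨ C ⊑ D."""
    for i in learning_set:
        witnesses = ext(c, i) - ext(d, i)
        if witnesses:
            return i, next(iter(witnesses))
    return None
        </preformat>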
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Finite Learning Sets</title>
      <p>As argued in the previous section, to enable finite
learnability of T in a given language L, the relevant counterexamples
to all the concept inclusions not entailed by T must be
presentable within a finite learning set S. Firstly, we can
immediately observe that this requirement is trivially satisfied
for L⊓. Clearly, L⊓ can only induce finitely many different
concept inclusions (up to logical equivalence) on a finite
signature such as Σ_T. Hence, the set T^⊭ can always be finitely
represented (up to logical equivalence), and it is
straightforward to finitely present counterexamples to all its members.
For more expressive fragments of EL, however, this cannot
be assumed in general, as the ∃r.C constructor induces
infinitely many concepts. One negative result comes with the
case of EL itself, as demonstrated in the next theorem.</p>
      <sec id="sec-4-1">
        <title>Theorem 1 (Finite learning sets in EL)</title>
        <p>Let T be a TBox in
EL. There exists no finite set S such that (T, S) is admissible.</p>
        <p>The full proof of this and subsequent results is included in
the online technical report [Klarman and Britz, 2015]. The
argument rests on the following lemma. Let (T, S) be an
admissible TIP and C a concept. By S(C) we denote the set
of all models (I, d) of C w.r.t. T such that I ∈ S. By ⋂S(C)
we denote the intersection of all these models, i.e., the model
(J, d) such that (J, d) ↦ (I, d) for every (I, d) ∈ S(C),
and for every other model (J′, d) such that (J′, d) ↦ (I, d)
for every (I, d) ∈ S(C) and (J, d) ↦ (J′, d), it is the case
that (J′, d) ↦ (J, d).</p>
      </sec>
      <sec id="sec-4-2">
        <title>Lemma 1 (Minimal model lemma)</title>
        <p>Let (T, S) be an
admissible TIP for T in EL (resp. in EL_rhs), and C be an
EL (resp. L⊓) concept. Whenever S(C) is non-empty, then
⋂S(C) is a minimal model of C w.r.t. T.</p>
        <p>Given the lemma, we consider concept inclusions of the type:</p>
        <p>δ_n := ∃r. … ∃r.⊤ ⊑ ∃r. … ∃r.⊤ (with n existential
restrictions on the left-hand side and n + 1 on the
right-hand side)</p>
        <p>Suppose δ_n ∈ T^⊭ for some n ∈ N. Since by the
admissibility condition a counterexample to δ_n must be present in S, it
must be the case that S(C) ≠ ∅, where C is the left-hand-side
concept in δ_n. By the lemma and the definition of a minimal
model, it is easy to see that S must contain a finite r-chain of
individuals of length exactly n + 1. Finally, since there can
always exist some n ∈ N such that δ_m ∈ T^⊭ for every m ≥ n,
we see that the joint size of all necessary counterexamples in
such cases must inevitably be infinite. Consequently, for some
EL TBoxes admissible TIPs based on finite learning sets might
not exist, and so finite learnability cannot be achieved in
general.</p>
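        <p>The chain argument can be replayed concretely (an illustration using our sketches above, taking T = ∅): the r-chain of n + 1 individuals 0 → 1 → … → n is a counterexample to δ_n, since its root satisfies the n-fold nesting of ∃r but not the (n+1)-fold one.</p>
        <preformat>
def nested_exists(n):
    """The concept ∃r. … ∃r.⊤ with n existential restrictions."""
    c = Top()
    for _ in range(n):
        c = Exists("r", c)
    return c

n = 3
chain = Interpretation(
    domain=set(range(n + 1)),                      # individuals 0, …, n
    roles={"r": {(k, k + 1) for k in range(n)}},   # 0 → 1 → … → n
)
assert 0 in ext(nested_exists(n), chain)           # LHS of δ_n holds at 0
assert 0 not in ext(nested_exists(n + 1), chain)   # RHS of δ_n fails at 0
        </preformat>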
        <p>One trivial way to tame this behavior is to “finitize” T^⊭
by delimiting the entire space of possible TBox axioms to a
pre-defined, finite set. This can be achieved, for instance, by
restricting the permitted depth of complex concepts or, more
generally, by setting some a priori bound on the size of axioms.
Such ad hoc solutions, though likely efficient in practice, are
not very elegant. As a more interesting alternative, we are able
to show that there exist at least two languages between L⊓
and EL, namely EL_lhs and EL_rhs, for which finite learning
sets are always guaranteed to exist, regardless of the fact that
they permit infinitely many concept inclusions. In fact, we
demonstrate that in both cases such learning sets might well
consist of exactly one exemplary finite model.</p>
        <p>[Figure: a finite interpretation whose individuals are labeled
with the concepts they satisfy, e.g., A, B, A⊓B, ∃r.(A⊓B),
∃r.∃r.A; A, ∃r.(A⊓B); ∃r.A.]</p>
        <p>We adopt the technique of so-called types, known from the
area of modal logics [Pratt, 1979]. Types are finite
abstractions of possible individuals in the interpretation domain, out
of which arbitrary models can be constructed. Let con(T) be
the set of all concepts (and all their subconcepts) occurring in
T. A type over Σ_T is a set t ⊆ con(T) such that C ⊓ D ∈ t
iff C ∈ t and D ∈ t, for every C ⊓ D ∈ con(T). A type t
is saturated for T iff for every C ⊑ D ∈ T, if C ∈ t then
D ∈ t. For any S ⊆ con(T), we write t_S to denote the
smallest saturated type containing S. It is easy to see that t_S
must be unique for EL.</p>
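        <p>The smallest saturated type t_S can be computed as a simple fixpoint; the sketch below (ours, with TBox axioms given as (lhs, rhs) pairs of the Concept values above) closes S under the conjunction condition and under the axioms of T.</p>
        <preformat>
def subconcepts(c):
    """All subconcepts of c, including c itself."""
    if isinstance(c, And):
        return {c} | subconcepts(c.left) | subconcepts(c.right)
    if isinstance(c, Exists):
        return {c} | subconcepts(c.filler)
    return {c}

def smallest_saturated_type(s, tbox):
    """The smallest saturated type t_S containing the concepts in s."""
    con = set()                              # con(T)
    for lhs, rhs in tbox:
        con |= subconcepts(lhs) | subconcepts(rhs)
    t = set(s)
    changed = True
    while changed:
        changed = False
        for c in con:
            if isinstance(c, And):
                # type condition: C ⊓ D ∈ t iff C ∈ t and D ∈ t
                if c.left in t and c.right in t and c not in t:
                    t.add(c)
                    changed = True
                if c in t and not (c.left in t and c.right in t):
                    t.update({c.left, c.right})
                    changed = True
        for lhs, rhs in tbox:
            # saturation: if C ⊑ D ∈ T and C ∈ t then D ∈ t
            if lhs in t and rhs not in t:
                t.add(rhs)
                changed = True
    return frozenset(t)
        </preformat>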
        <p>The next theorem addresses the case of EL_rhs. Figure 2
illustrates a finite learning set for a sample EL_rhs TBox,
following the construction in the proof.</p>
        <sec id="sec-4-2-1">
          <title>Theorem 2 (Finite learning sets in EL_rhs)</title>
          <p>Let T be a TBox
in EL_rhs. There exists a finite set S such that (T, S) is
admissible.</p>
          <p>Proof sketch. Let Θ be the smallest set of types satisfying
the following conditions:</p>
          <p>t_S ∈ Θ, for every S ⊆ NC and for S = {⊤};
if t ∈ Θ then t_{C} ∈ Θ, for every ∃r.C ∈ t.</p>
          <p>We define the interpretation I = (Δ^I, ·^I) as follows:</p>
          <p>Δ^I := Θ;
t ∈ A^I iff A ∈ t, for every t ∈ Δ^I and A ∈ NC;
(t, t_{C}) ∈ r^I, for every t ∈ Δ^I, whenever ∃r.C ∈ t.</p>
          <p>Then S = {I} is a finite learning set such that (T, S) is
admissible. □</p>
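          <p>In code, this construction can be sketched as follows (our rendering, reusing smallest_saturated_type and the classes above; Θ is represented as a set of frozensets). The single interpretation returned constitutes the whole learning set.</p>
          <preformat>
from itertools import combinations

def theorem2_interpretation(nc, tbox):
    """Single-model learning set for an EL_rhs TBox, per Theorem 2."""
    # seed types: t_S for every S ⊆ NC, and for S = {⊤}
    seeds = [set(map(Atom, s))
             for k in range(len(nc) + 1)
             for s in combinations(sorted(nc), k)]
    seeds.append({Top()})
    domain = set()
    todo = [smallest_saturated_type(s, tbox) for s in seeds]
    while todo:                      # close Θ under: t ∈ Θ implies t_{C} ∈ Θ
        t = todo.pop()
        if t in domain:
            continue
        domain.add(t)
        for c in t:
            if isinstance(c, Exists):
                todo.append(smallest_saturated_type({c.filler}, tbox))
    atoms = {a: {t for t in domain if Atom(a) in t} for a in nc}
    roles = {}
    for t in domain:                 # (t, t_{C}) ∈ r^I whenever ∃r.C ∈ t
        for c in t:
            if isinstance(c, Exists):
                u = smallest_saturated_type({c.filler}, tbox)
                roles.setdefault(c.role, set()).add((t, u))
    return Interpretation(domain=domain, atoms=atoms, roles=roles)
          </preformat>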
          <p>A similar, though somewhat more complex, construction
demonstrates the existence of finite learning sets in EL_lhs.
Again, we illustrate the approach with an example in
Figure 3.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>Theorem 3 (Finite learning sets in EL_lhs)</title>
          <p>Let T be a TBox
in EL_lhs. There exists a finite set S such that (T, S) is
admissible.</p>
          <p>Proof sketch. Let Θ be the set of all saturated types over
Σ_T, and Θ′ be its subset obtained by iteratively eliminating all
those types t that violate the following condition: for every
r ∈ NR and every existential restriction ∃r.C ∈ t there is
u ∈ Θ′ such that:</p>
          <p>C ∈ u;
for every ∃r.D ∈ con(T), if D ∈ u then ∃r.D ∈ t.</p>
          <p>Further, we define the interpretation I = (Δ^I, ·^I) as follows:</p>
          <p>Δ^I := Θ′;
t ∈ A^I iff A ∈ t, for every t ∈ Δ^I and A ∈ NC;
(t, u) ∈ r^I iff for every ∃r.C ∈ con(T), if C ∈ u then
∃r.C ∈ t.</p>
          <p>Then S = {I} is a finite learning set such that (T, S) is
admissible. □</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Learning Algorithms</title>
      <p>In this section we devise basic learning algorithms that
correctly identify EL_lhs and EL_rhs TBoxes in admissible TIPs
based on finite learning sets. Since T^⊭ can in general still be
infinite, our starting observation is that a learner cannot
effectively eliminate concept inclusions from T^⊭ using a
straightforward enumeration, thus arriving at the target TBox T. The
only feasible strategy is to try to identify the “good”
candidate axioms to be included in T, and possibly apply the
elimination strategy only to finitely many incorrect guesses. One
generic procedure employing this heuristic, which we define
as Algorithm 1, attempts to construct the hypothesis by
extending it with consecutive axioms of systematically growing
size that are satisfied by the learning set. There, by ℓ(C ⊑ D)
we denote the size of the axiom C ⊑ D, measured in the total
number of symbols used for expressing this axiom. At each
step the algorithm makes use of a simple equivalence oracle,
which informs whether the currently considered hypothesis is
already equivalent to the learning target (in that case the
identification succeeds) or whether some axioms are still missing.
Theorem 4 demonstrates the correctness of this approach.</p>
      <p>Theorem 4 (Correct identification in EL_rhs/EL_lhs) Let
(T, S) be an admissible TIP for T in EL_rhs/EL_lhs. Then the
hypothesis TBox H generated by Algorithm 1 is equivalent to
T.</p>
      <p>Obviously, the use of the oracle is essential to warrant
termination of the algorithm. It is not difficult to see that, without
it, the algorithm must still converge on the correct TBox for
some n ∈ N, and consequently settle on it, i.e., H_m ≡ H_n
for every m ≥ n. However, at no point in time can it
guarantee that the convergence has already been achieved, and
so it can only warrant learnability in the limit. This result is
therefore not entirely satisfactory, considering we aim at finite
learnability from data in the unsupervised setting.</p>
      <p>Algorithm 1 Learning EL_rhs/EL_lhs TBoxes on finite inputs.
Input: a finite learning set S
Output: a hypothesis TBox H
1: H := ∅; n := 0
2: while the answer to “H ≡ T?” is ‘NO’ do (equivalence
oracle querying)
3: n := n + 1
4: for every axiom C ⊑ D over Σ_T with ℓ(C ⊑ D) = n do
5: if S ⊨ C ⊑ D then H := H ∪ {C ⊑ D}
6: end for
7: end while
8: return H</p>
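      <p>For concreteness, here is a sketch of Algorithm 1 in the Python conventions above; enumerate_axioms and equivalent_to_target are hypothetical stand-ins for the size-bounded axiom enumeration and the equivalence oracle, respectively.</p>
      <preformat>
def algorithm1(learning_set, enumerate_axioms, equivalent_to_target):
    """Grow the hypothesis with ever larger axioms satisfied by S,
    stopping when the equivalence oracle accepts it."""
    hypothesis = []
    n = 0
    while not equivalent_to_target(hypothesis):      # oracle call: H ≡ T?
        n += 1
        for c, d in enumerate_axioms(n):             # axioms with ℓ(C ⊑ D) = n
            if all(satisfies(i, c, d) for i in learning_set):
                hypothesis.append((c, d))            # keep axioms with S ⊨ C ⊑ D
    return hypothesis
      </preformat>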
      <p>A major positive result, by contrast, can be delivered
for the case of EL_rhs, for which we devise an effective
learning algorithm making no reference to any oracle. It turns out
that in EL_rhs the “good” candidate axioms can be directly
extracted from the learning set, thus granting a proper
unsupervised learning method. The essential insight is provided by
Lemma 1, presented in the previous section. Given any L⊓
concept C such that S(C) ≠ ∅, we are able to identify a
tree-shaped minimal model of C w.r.t. T. Effectively, it suffices
to retrieve only the initial part of this model, discarding its
infinitely recurrent (cyclic) subtrees. Such an initial model
I_init is constructed by Algorithm 2. The algorithm performs
a simultaneous unravelling of all models in S(C), while on the
way computing intersections of the visited combinations of
individuals, which are subsequently added to the model under
construction. Whenever the same combination of individuals
is about to be visited for the second time on the same branch,
it is skipped, as a cycle has evidently been detected. The covering
concept C_{I_init} for the resulting interpretation I_init is then
included in the hypothesis within the axiom C ⊑ C_{I_init}.
Meanwhile, all L⊓ concepts C such that S(C) = ∅ are ensured
to entail every EL concept, as implied by the admissibility
condition. The contents of the hypothesis TBox are formally
specified in Definition 5. Theorem 5 demonstrates the
correctness of the whole learning procedure.</p>
      <sec id="sec-5-1">
        <title>Definition 5 (EL_rhs hypothesis TBox)</title>
        <p>Let (T, S) be an
admissible TIP for T in EL_rhs over the signature Σ_T. The
hypothesis TBox H is the set consisting of all the following
axioms:</p>
        <p>C ⊑ C_{I_init} for every L⊓ concept C such that S(C) ≠ ∅,
where C_{I_init} is the covering concept for the interpretation
(I_init, d) generated by Algorithm 2 on S(C);
C ⊑ ⨅_{r ∈ NR} ∃r.⨅NC for every L⊓ concept C such
that S(C) = ∅, where ⨅NC denotes the conjunction of all
concept names in NC.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Theorem 5 (Correct identification in EL_rhs)</title>
        <p>Let (T, S) be
an admissible TIP for T in EL_rhs. Then the hypothesis TBox
H for S is equivalent to T.</p>
        <p>Algorithm 2 Computing the initial part of the minimal model
⋂S(C)
Input: the set S(C) = {(I_i, d_i)}_{0 ≤ i ≤ n}, for some n ∈ N
Output: a finite tree-shaped interpretation (J, d), where
J = (Δ^J, ·^J)
1: Δ^J := {f(d_0, …, d_n)}, for a “fresh” function symbol f
2: A^J := ∅, for every A ∈ NC
3: r^J := ∅, for every r ∈ NR
4: for every f(d_0, …, d_n) ∈ Δ^J, (e_0, …, e_n) ∈ Δ^{I_0} ×
… × Δ^{I_n}, r ∈ NR do
5: if (d_i, e_i) ∈ r^{I_i} for every 0 ≤ i ≤ n and there
exists no function symbol g such that g(e_0, …, e_n) is an
ancestor of f(d_0, …, d_n) in J then
6: Δ^J := Δ^J ∪ {g(e_0, …, e_n)}, for a “fresh” function
symbol g
7: r^J := r^J ∪ {(f(d_0, …, d_n), g(e_0, …, e_n))}
8: end if
9: end for
10: for every f(d_0, …, d_n) ∈ Δ^J, A ∈ NC do
11: if d_i ∈ A^{I_i} for every 0 ≤ i ≤ n then
12: A^J := A^J ∪ {f(d_0, …, d_n)}
13: end if
14: end for
15: return (J, f(d_0, …, d_n)), where f(d_0, …, d_n) is the
root of J, created at step 1.</p>
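        <p>A compact executable rendering of Algorithm 2 (ours, reusing the Interpretation class above) represents each tree node by an integer identifier carrying its tuple of source individuals and the set of tuples already on its branch, so that a repeated combination on the same branch is detected and skipped:</p>
        <preformat>
from itertools import product

def initial_model(pointed_models):
    """Tree-shaped initial part of ⋂S(C), for S(C) = [(I_0, d_0), …];
    node 0 is the root d of the returned interpretation."""
    interps = [i for i, _ in pointed_models]
    root = tuple(d for _, d in pointed_models)
    domain, atoms, roles = {0}, {}, {}
    info = {0: (root, frozenset())}   # node id: (its tuple, ancestor tuples)
    fresh, todo = 1, [0]
    role_names = {r for i in interps for r in i.roles}
    atom_names = {a for i in interps for a in i.atoms}
    while todo:
        v = todo.pop()
        tup, anc = info[v]
        path = anc | {tup}
        for r in role_names:
            # componentwise r-successors of tup across all models in S(C)
            succs = [[y for (x, y) in interp.roles.get(r, set()) if x == tup[k]]
                     for k, interp in enumerate(interps)]
            for new in product(*succs):
                if new in path:       # same combination already on this branch
                    continue
                w = fresh
                fresh += 1
                domain.add(w)
                info[w] = (new, path)
                roles.setdefault(r, set()).add((v, w))
                todo.append(w)
    for v, (tup, _) in info.items():  # intersect concept memberships
        for a in atom_names:
            if all(tup[k] in interp.atoms.get(a, set())
                   for k, interp in enumerate(interps)):
                atoms.setdefault(a, set()).add(v)
    return Interpretation(domain=domain, atoms=atoms, roles=roles)
        </preformat>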
        <p>The learning algorithm runs in double exponential time in the
worst case and generates TBoxes of double exponential size
in the size of S. This follows from the fact that the
tree-shaped interpretations generated by Algorithm 2 might be of
depth exponential in the number of individuals occurring in S
and have an exponential branching factor. Importantly, however,
there might exist solutions far closer to optimal, which
we have not so far investigated.</p>
        <p>It is our strong conjecture, which we leave as an open
problem, that a related learning strategy should also be applicable
in the context of EL_lhs.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>An alternative approach to learning DL TBoxes, based on
Angluin’s model of learning from entailment [Angluin, 1988],
was recently introduced by Konev et al. [Konev et al., 2014].
There, the learner identifies the TBox by posing two types
of queries: entailment (“T ⊨ C ⊑ D?”) and equivalence
(“H ≡ T? If no, then return a positive or a negative
counterexample”). The authors study polynomial learnability and
define corresponding algorithms for EL_lhs and EL_rhs, while
for EL they show that no such polynomial algorithm
exists. Apart from the obvious differences in the motivation
underlying the two learning models (unsupervised learning from
data vs. learning by queries from an expert), there are also
some strong formal connections. Essentially, given a finite
learning set in an admissible TIP, a learner from
interpretations can autonomously answer arbitrary entailment queries,
thus effectively simulating the entailment oracle. However,
the learner does not by default have access to the equivalence
oracle. Once such an oracle is included, as done in our
Algorithm 1, the learning power of the two learners becomes
comparable (note that with some smart heuristic our learner can
find a positive or negative counterexample whenever the
oracle gives a negative answer). In this sense, our Theorem 4
should also be indirectly derivable from the results by Konev
et al. However, our stronger result for EL_rhs in Theorem 5
demonstrates that, at least in some cases, the learner from
interpretations is able to succeed without employing the
equivalence oracle, which is essential to the other approach.</p>
      <p>Less directly, our work is also related to various
contributions on the learnability of different types of formal
structures from data, e.g., first-order theories from facts [Shapiro,
1981], finite automata descriptions from observations [Pitt,
1989], and logic programs from interpretations [De Raedt and
Lavrač, 1993; De Raedt, 1994]. In the area of DLs, a few
learning scenarios have been formally addressed, concerned
largely with learning concept descriptions via different
learning operators [Straccia and Mucci, 2015; Lehmann and
Hitzler, 2008; Fanizzi et al., 2008; Cohen and Hirsh, 1994] and
applications of formal concept analysis techniques to the
automated generation of DL axioms from data [Baader et al.,
2007; Distel, 2011].</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions and Outlook</title>
      <p>In this paper, we have delivered initial results on the finite
learnability of DL TBoxes from interpretations. We believe that
this direction shows a lot of promise for establishing formal
foundations for the task of ontology learning from data. Some
immediate problems left open by this work concern the
finite learnability of EL_lhs TBoxes in an unsupervised setting,
and possibly of other lightweight fragments of DLs. Another
set of very interesting research questions should deal, in our
view, with the possibility of formulating alternative
conditions on the learning sets and the corresponding learnability
guarantees they would imply in different DL languages. In
particular, some limited use of a closed-world operator over the
learning sets might make it possible to relax the practically
restrictive admissibility condition. Finally, the development of
practical learning algorithms, possibly building on existing
inductive logic programming methods, is an obvious area to
welcome further research efforts.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Angluin,
          <year>1988</year>
          ]
          <string-name>
            <given-names>Dana</given-names>
            <surname>Angluin</surname>
          </string-name>
          .
          <article-title>Queries and concept learning</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>2</volume>
          (
          <issue>4</issue>
          ):
          <fpage>319</fpage>
          -
          <lpage>342</lpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Baader et al.,
          <year>2003</year>
          ]
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Diego</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Deborah L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Peter F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          .
          <source>The Description Logic Handbook: Theory, Implementation, and Applications</source>
          . Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Baader et al.,
          <year>2005</year>
          ]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In Proc. of IJCAI-05</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Baader et al.,
          <year>2007</year>
          ]
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          , Bernhard Ganter, Baris Sertkaya, and
          <string-name>
            <given-names>Ulrike</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Completing description logic knowledge bases using formal concept analysis</article-title>
          .
          <source>In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI-07)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Cohen and Hirsh,
          <year>1994</year>
          ] William W. Cohen and Haym Hirsh.
          <article-title>Learning the classic description logic: Theoretical and experimental results</article-title>
          .
          <source>In Proc. of Principles of Knowledge Representation and Reasoning (KR-94)</source>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [De Raedt and Lavrač,
          <year>1993</year>
          ] Luc De Raedt and Nada Lavrač.
          <article-title>The many faces of inductive logic programming</article-title>
          .
          <source>In Methodologies for Intelligent Systems</source>
          , pages
          <fpage>435</fpage>
          -
          <lpage>449</lpage>
          .
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [De Raedt,
          <year>1994</year>
          ] Luc De Raedt.
          <article-title>First order jk-clausal theories are PAC-learnable</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>70</volume>
          :
          <fpage>375</fpage>
          -
          <lpage>392</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Fanizzi et al.,
          <year>2008</year>
          ]
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , Claudia d'Amato, and Floriana Esposito.
          <article-title>DL-FOIL concept learning in description logics</article-title>
          .
          <source>Inductive Logic Programming</source>
          , pages
          <fpage>107</fpage>
          -
          <lpage>121</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Hoekstra,
          <year>2010</year>
          ]
          <string-name>
            <given-names>Rinke</given-names>
            <surname>Hoekstra</surname>
          </string-name>
          .
          <article-title>The knowledge reengineering bottleneck</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>1</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>111</fpage>
          -
          <lpage>115</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Klarman and Britz,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Szymon</given-names>
            <surname>Klarman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Katarina</given-names>
            <surname>Britz</surname>
          </string-name>
          .
          <article-title>Towards unsupervised ontology learning from data</article-title>
          .
          <source>Technical report</source>
          , CAIR, UKZN/CSIR Meraka. http://klarman.synthasite.com/resources/KlaBri-DARe15.pdf,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Konev et al.,
          <year>2014</year>
          ]
          <string-name>
            <given-names>Boris</given-names>
            <surname>Konev</surname>
          </string-name>
          , Carsten Lutz, Ana Ozaki, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Exact learning of lightweight description logic ontologies</article-title>
          .
          <source>In Proc. of Principles of Knowledge Representation and Reasoning (KR-14)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Lehmann and Hitzler,
          <year>2008</year>
          ]
          <string-name>
            <given-names>Jens</given-names>
            <surname>Lehmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Hitzler</surname>
          </string-name>
          .
          <article-title>A refinement operator based learning algorithm for the ALC description logic</article-title>
          . In Hendrik Blockeel, Jan Ramon,
          <string-name>
            <given-names>Jude</given-names>
            <surname>Shavlik</surname>
          </string-name>
          , and Prasad Tadepalli, editors,
          <source>Inductive Logic Programming</source>
          , pages
          <fpage>147</fpage>
          -
          <lpage>160</lpage>
          . Springer Berlin Heidelberg,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Lutz et al.,
          <year>2010</year>
          ]
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          , Robert Piro, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Enriching EL concepts with greatest fixpoints</article-title>
          .
          .
          <source>In Proceedings of the 2010 Conference on ECAI 2010: 19th European Conference on Artificial Intelligence</source>
          , pages
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          . IOS Press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Maedche and Staab,
          <year>2004</year>
          ]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Maedche</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steffen</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>Ontology learning</article-title>
          .
          <source>In Steffen Staab and Rudi Studer</source>
          , editors,
          <source>Handbook on Ontologies</source>
          , pages
          <fpage>173</fpage>
          -
          <lpage>189</lpage>
          . Springer,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Pitt,
          <year>1989</year>
          ]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Pitt</surname>
          </string-name>
          .
          <article-title>Inductive inference, DFAs, and computational complexity</article-title>
          . In Klaus P. Jantke, editor,
          <source>Analogical and Inductive Inference</source>
          , volume
          <volume>397</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>18</fpage>
          -
          <lpage>44</lpage>
          . Springer Berlin Heidelberg,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Pratt,
          <year>1979</year>
          ]
          <string-name>
            <given-names>V.R.</given-names>
            <surname>Pratt</surname>
          </string-name>
          .
          <article-title>Models of program logics</article-title>
          .
          <source>In Proc. of Foundations of Computer Science (FOCS-79)</source>
          ,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Shapiro,
          <year>1981</year>
          ]
          <string-name>
            <given-names>Ehud Y.</given-names>
            <surname>Shapiro</surname>
          </string-name>
          .
          <article-title>Inductive inference of theories from facts</article-title>
          .
          <source>In Computational Logic: Essays in Honor of Alan Robinson</source>
          (
          <year>1991</year>
          ). MIT Press,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Straccia and Mucci,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Umberto</given-names>
            <surname>Straccia</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matteo</given-names>
            <surname>Mucci</surname>
          </string-name>
          .
          <article-title>pFOIL-DL: Learning (fuzzy) EL concept descriptions from crisp OWL data using a probabilistic ensemble estimation</article-title>
          .
          <source>In Proc. of the 30th Annual ACM Symposium on Applied Computing (SAC-15)</source>
          . ACM,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>