<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Formal Characterization of Concept Learning in Description Logics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesca A. Lisi</string-name>
          <email>lisi@di.uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Dipartimento di Informatica, Università degli Studi di Bari "Aldo Moro"</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Among the inferences studied in Description Logics (DLs), induction has received increasing attention over the last decade. Indeed, useful non-standard reasoning tasks can be based on the inductive inference. Among them, Concept Learning is about the automated induction of a description for a given concept starting from classified instances of the concept. In this paper we present a formal characterization of Concept Learning in DLs which relies on recent results in Knowledge Representation and Machine Learning.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Building and maintaining large ontologies pose several challenges to Knowledge
Representation (KR) because of their size. In DL ontologies, although
standard inferences help structuring the knowledge base (KB), e.g., by
automatically building a concept hierarchy, they are, for example, not sufficient when it
comes to (automatically) generating new concept descriptions from given ones.
They also fail if the concepts are specified using different vocabularies (i.e. sets
of concept names and role names) or if they are described on different levels of
abstraction. Altogether it has turned out that for building and maintaining large
DL KBs, besides the standard inferences, additional so-called non-standard
inferences are required [
        <xref ref-type="bibr" rid="ref19 ref27">27,19</xref>
]. Among them, the first ones to be studied were
the Least Common Subsumer (LCS) of a set of concepts [
        <xref ref-type="bibr" rid="ref3">3</xref>
] and the Most Specific
Concept (MSC) of an individual [
        <xref ref-type="bibr" rid="ref1 ref10 ref20 ref32">32,10,20,1</xref>
]. Very recently, a unified framework
for non-standard reasoning services in DLs has been proposed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It is based on
the use of second-order sentences in DLs [
        <xref ref-type="bibr" rid="ref7">7</xref>
] as the unifying definition model for
all those constructive reasoning tasks which rely on specific optimality criteria
to build up the target concept. E.g., the LCS is one of the cases considered for
such a reformulation in terms of optimal-solution problems.
      </p>
      <p>
        Since [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], much work has been done in DL reasoning to support the
construction and maintenance of DL KBs. This work has been more or less explicitly
related to induction. E.g., the notion of LCS has subsequently been used for the
bottom-up induction of Classic concept descriptions from examples [
        <xref ref-type="bibr" rid="ref5 ref6">5,6</xref>
        ].
Induction has been widely studied in Machine Learning (ML). Therefore it does
not come as a surprise that the problem of finding an appropriate concept
description for given concept instances, reformulated as a problem of inductive
learning from examples, has been addressed in ML, initially attacked by heuristic
means [
        <xref ref-type="bibr" rid="ref14 ref18 ref6">6,18,14</xref>
        ] and more recently in a formal manner [
        <xref ref-type="bibr" rid="ref12 ref13 ref2 ref22">2,12,13,22</xref>
] by adopting
the methods and techniques of the ML approach known as Concept Learning.
      </p>
      <p>In this paper, we present a formal characterization of Concept Learning in DLs
which relies on recent results in KR and ML. Notably, the proposed formulation
can be justified by observing that the inductive inference deals with finding,
or constructing, a concept. Therefore, non-standard reasoning services based
on induction can be considered as constructive reasoning tasks. Starting from
this assumption, and inspired by Colucci et al.'s framework, Concept Learning is
modeled as a second-order concept expression in DLs and reformulated in terms
that allow for a construction possibly subject to some optimality criteria.</p>
      <p>The paper is structured as follows. Section 2 is devoted to preliminaries on
Concept Learning according to the ML tradition. Section 3 defines the Concept
Learning problem statement in the KR context. Section 4 proposes a
reformulation of Concept Learning as a constructive reasoning task in DLs. Section 5
concludes the paper with final remarks and directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <sec id="sec-2-1">
        <title>Machine Learning</title>
        <p>
          The goal of ML is the design and development of algorithms that allow computers
to evolve behaviors based on empirical data [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. The automation of the inductive
inference plays a key role in ML algorithms, though other inferences such as
abduction and analogy are also considered. The effect of applying inductive
ML algorithms depends on whether the scope of induction is discrimination or
characterization [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. Discriminant induction aims at inducing hypotheses with
discriminant power as required in tasks such as classification. In classification,
observations to learn from are labeled as positive or negative instances of a given
class. Characteristic induction is more suitable for finding regularities in a data
set. This corresponds to learning from positive examples only.
        </p>
        <p>
Ideally, the ML task is to discover an operational description of a target
function f : X → Y which maps elements in the instance space X to the values of a
set Y. The target function is unknown, meaning that only a set D (the training
data) of points of the form (x, f(x)) is provided. However, it may be very difficult
in general to learn such a description of f perfectly. In fact, ML algorithms are
often expected to acquire only some approximation f̂ to f by searching a very
large space H of possible hypotheses (the hypothesis space) which depends on the
representation chosen for f (the language of hypotheses). The output
approximation is the one that best fits D according to a scoring function score(f, D). It
is assumed that any hypothesis h ∈ H that approximates f well w.r.t. a large set
of training cases will also approximate it well for new unobserved cases. These
notions have been mathematically formalized in computational learning theory
within the Probably Approximately Correct (PAC) learning framework [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ].
        </p>
        <p>Summing up, given a hypothesis space H and a training data set D, ML
algorithms are designed to find an approximation f̂ of a target function f s.t.:
1. f̂ ∈ H;
2. f̂(D) ≈ f(D); and/or
3. f̂ = argmax_{f∈H} score(f, D).</p>
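        <p>As an illustration (ours, not drawn from the paper), the following Python sketch makes the three requirements concrete on a toy hypothesis space of intervals: requirements 1 and 2 act as constraints that filter H, while requirement 3 is the optimization step. The interval encoding and the scoring function are illustrative assumptions.</p>
        <preformat><![CDATA[
# A minimal sketch of the three requirements above, over a toy hypothesis
# space: intervals [lo, hi] classifying real-valued instances.
def fits(h, D):
    """Requirements 1-2 (the CSP side): h agrees with every point in D."""
    lo, hi = h
    return all((lo <= x <= hi) == y for x, y in D)

def score(h, D):
    """A toy scoring function: training accuracy."""
    lo, hi = h
    return sum((lo <= x <= hi) == y for x, y in D) / len(D)

D = [(1.0, True), (2.0, True), (5.0, False)]   # training points (x, f(x))
H = [(0, 3), (0, 6), (1, 2)]                   # a tiny hypothesis space
consistent = [h for h in H if fits(h, D)]      # CSP: all constraints hold
best = max(H, key=lambda h: score(h, D))       # OP: argmax of the score
print(consistent, best)
]]></preformat>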
        <p>
It has been recently stressed that the first two requirements impose constraints
on the possible hypotheses, thus defining a Constraint Satisfaction Problem
(CSP), whereas the third requirement involves the optimization step, thus
turning the CSP into an Optimization Problem (OP) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. We shall refer to the
ensemble of constraints and optimization criterion as the model of the learning task.
Models are almost by definition declarative, and it is useful to distinguish the
CSP, which is concerned with finding a solution that satisfies all the constraints
in the model, from the OP, where one must also guarantee that the found
solution is optimal w.r.t. the optimization function. Examples of typical CSPs in
the ML context include Concept Learning, for reasons that will become clearer in
the following subsection.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Concept Learning</title>
        <p>Concept Learning deals with inferring the general definition of a category based
on members (positive examples) and nonmembers (negative examples) of this
category. Here, the target is a boolean-valued function f : X → {0, 1}, i.e. a
concept. When examples of the target concept are available, the resulting ML
task is said to be supervised, otherwise it is called unsupervised. The positive examples
are those instances with f(x) = 1, and the negative ones are those with f(x) = 0.</p>
        <p>
          In Concept Learning, the key inferential mechanism for induction is
generalization as search through a partially ordered space of inductive hypotheses [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
Hypotheses may be ordered from the most general ones to the most specific
ones. We say that an instance x ∈ X satisfies a hypothesis h ∈ H if and only if
h(x) = 1. Given two hypotheses hi and hj, hi is more general than or equal to
hj (written hi ≥g hj, where ≥g denotes a generality relation) if and only if any
instance satisfying hj also satisfies hi. Note that it may not always be possible
to compare two hypotheses with a generality relation: the sets of instances they
satisfy may intersect without one being included in the other.
The relation ≥g defines a partial order (i.e., it is reflexive, antisymmetric, and
transitive) over the space of hypotheses.
        </p>
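        <p>As a sketch of the generality relation (our illustration, over a finite instance space so that extensions can be enumerated), hi ≥g hj can be tested by checking that the extension of hj is included in that of hi:</p>
        <preformat><![CDATA[
# The generality relation over a finite instance space: hi >=g hj iff every
# instance satisfying hj also satisfies hi (extension inclusion).
X = range(10)                                  # a tiny instance space

def extension(h, X):
    """The set of instances x with h(x) = 1."""
    return {x for x in X if h(x)}

def more_general_or_equal(hi, hj, X):
    return extension(hj, X).issubset(extension(hi, X))

h_even = lambda x: x % 2 == 0
h_div4 = lambda x: x % 4 == 0
print(more_general_or_equal(h_even, h_div4, X))   # True:  h_even >=g h_div4
print(more_general_or_equal(h_div4, h_even, X))   # False: h_div4 is strictly less general
]]></preformat>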
        <p>A hypothesis h that correctly classifies all training examples is called
consistent with these examples. For a consistent hypothesis h it holds that h(x) = f(x)
for each instance x. The set of all hypotheses consistent with the training
examples is called the version space with respect to H and D. Concept Learning
algorithms may use the hypothesis space structure to efficiently search for
relevant hypotheses. E.g., they may perform a specific-to-general search through
the hypothesis space along one branch of the partial ordering, to find the most
specific hypothesis consistent with the training examples. Another well-known
approach, candidate elimination, consists of computing the version space by an
incremental computation of the sets of maximally specific and maximally
general hypotheses. An important issue in Concept Learning is associated with the
so-called inductive bias, i.e. the set of assumptions that the learning algorithm
uses to predict outputs for previously unseen inputs. These assumptions
represent the nature of the target function, so the learning approach implicitly
makes assumptions about the correct output for unseen examples.</p>
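        <p>The following sketch (again ours, reusing the toy interval hypotheses from the earlier sketch) enumerates the version space explicitly; candidate elimination would instead maintain only its maximally specific and maximally general boundaries, which we read off at the end:</p>
        <preformat><![CDATA[
# Version space over toy interval hypotheses: the subset of H consistent with
# all training examples. With a small finite H we simply enumerate; candidate
# elimination computes the boundary sets S and G incrementally instead.
def consistent(h, D):
    lo, hi = h
    return all((lo <= x <= hi) == y for x, y in D)

H = [(lo, hi) for lo in range(6) for hi in range(lo, 6)]
D = [(1, True), (2, True), (4, False)]
VS = [h for h in H if consistent(h, D)]
S = min(VS, key=lambda h: h[1] - h[0])   # a maximally specific hypothesis
G = max(VS, key=lambda h: h[1] - h[0])   # a maximally general hypothesis
print(VS, S, G)
]]></preformat>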
        <p>
          Inductive Logic Programming (ILP) was born at the intersection between
Concept Learning and the field of Logic Programming [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. From Concept
Learning it has inherited the inferential mechanisms for induction [
          <xref ref-type="bibr" rid="ref33">33</xref>
]. However, a
distinguishing feature of ILP with respect to other forms of Concept Learning is
the use of prior knowledge of the domain of interest, called background knowledge
(BK), during the search for hypotheses. Due to its roots in Logic Programming,
ILP was originally concerned with Concept Learning problems where
hypotheses, observations and BK are all expressed as first-order Horn rules
(usually Datalog for computational reasons). E.g., Foil is a popular ILP algorithm
for learning sets of Datalog rules for classification purposes [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ]. It performs
a greedy search in order to maximize an information gain function. Therefore,
Foil implements an OP version of Concept Learning.
        </p>
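        <p>For concreteness, a Foil-style information gain can be sketched as follows (our paraphrase of Quinlan's gain [34]; the exact bookkeeping of the factor t in Foil counts positives covered by both the old and the refined rule, which we approximate here by the positives covered after refinement):</p>
        <preformat><![CDATA[
import math

# Foil-style information gain of a refinement step: t * (I_before - I_after),
# where I = -log2(p / (p + n)) and p/n count covered positive/negative examples.
def foil_gain(p0, n0, p1, n1):
    t = p1  # approximation: positives still covered after the refinement
    return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

# Refining a rule so that coverage drops from (10+, 10-) to (8+, 2-) pays off:
print(foil_gain(p0=10, n0=10, p1=8, n1=2))   # positive gain
]]></preformat>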
        <p>
Over the last decade, ILP has widened its scope significantly by
considering, e.g., learning in DLs (see the next section) as well as within hybrid KR
frameworks integrating DLs and first-order clausal languages [
          <xref ref-type="bibr" rid="ref17 ref25 ref26 ref35">35,17,25,26</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Learning Concepts in Description Logics</title>
      <p>
Early work on the application of ML to DLs essentially focused on demonstrating
the PAC-learnability of various terminological languages derived from Classic.
In particular, Cohen and Hirsh investigated the CoreClassic DL, proving that it
is not PAC-learnable [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] as well as demonstrating the PAC-learnability of its
sublanguages, such as C-Classic [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], through the bottom-up LcsLearn algorithm.
It is also worth mentioning unsupervised learning methodologies for DL concept
descriptions, whose prototypical example is Kluster [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], a polynomial-time
algorithm for the induction of BACK terminologies. More recently, algorithms have
been proposed that follow the generalization as search approach by extending
the methodological apparatus of ILP to DL languages [
        <xref ref-type="bibr" rid="ref11 ref12 ref2 ref21 ref22">2,11,12,21,22</xref>
        ]. Supervised
(resp., unsupervised) learning systems, such as YinYang [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and DL-Learner
[
        <xref ref-type="bibr" rid="ref23">23</xref>
], have been implemented. Based on a set of refinement operators borrowed
from YinYang and DL-Learner, a new version of the Foil algorithm, named
DL-Foil, has been proposed [
        <xref ref-type="bibr" rid="ref13">13</xref>
]. In DL-Foil, the information gain function
takes into account the Open World Assumption (OWA) holding in DLs. Indeed,
many instances may be available which can be ascribed neither to the target concept
nor to its negation. This requires a different setting to ensure a special treatment
of the unlabeled individuals.
      </p>
      <sec id="sec-3-1">
        <title>The Problem Statement</title>
        <p>In this section, the supervised Concept Learning problem in the DL setting is
formally defined. For the purpose, we denote by:
– T and A the TBox and the ABox, respectively, of a DL KB K;
– Ind(A) the set of all individuals occurring in A;
– Retr_K(C) the set of all individuals occurring in A that are instances of
a given concept C w.r.t. T;
– Ind_C^+(A) = {a ∈ Ind(A) | C(a) ∈ A} ⊆ Retr_K(C);
– Ind_C^−(A) = {b ∈ Ind(A) | ¬C(b) ∈ A} ⊆ Retr_K(¬C).
These sets can be easily computed by resorting to the retrieval inference services
usually available in DL systems.</p>
        <p>Definition 1 (Concept Learning). Let K = (T, A) be a DL KB. Given:
– a (new) target concept name C;
– a set of positive and negative examples Ind_C^+(A) ∪ Ind_C^−(A) ⊆ Ind(A) for C;
– a concept description language DL_H;
the Concept Learning problem is to find a concept definition C ≡ D such that
D ∈ DL_H satisfies the following conditions:
Completeness: K ⊨ D(a) ∀a ∈ Ind_C^+(A), and
Consistency: K ⊨ ¬D(b) ∀b ∈ Ind_C^−(A).</p>
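        <p>A small sketch of these two conditions (our illustration; the entails interface is a hypothetical stand-in for a DL reasoner's instance checking, and the toy KB below answers it by plain set membership, which glosses over the OWA):</p>
        <preformat><![CDATA[
# Checking Completeness (K |= D(a) for all positives) and Consistency
# (K |= not D(b) for all negatives) for a candidate definition D.
class ToyKB:
    def __init__(self, assertions):
        self.assertions = set(assertions)    # e.g. ("Father", "john")
    def entails(self, assertion):            # hypothetical reasoner call
        return assertion in self.assertions

def is_correct_definition(kb, D, pos, neg):
    complete = all(kb.entails((D, a)) for a in pos)              # K |= D(a)
    consistent = all(kb.entails(("not " + D, b)) for b in neg)   # K |= ¬D(b)
    return complete and consistent

kb = ToyKB({("Father", "john"), ("not Father", "mary")})
print(is_correct_definition(kb, "Father", pos=["john"], neg=["mary"]))  # True
]]></preformat>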
        <p>Note that the definition given above provides the CSP version of the
supervised Concept Learning problem. However, as already mentioned, Concept
Learning can also be regarded as an OP. Algorithms such as DL-Foil testify to
the existence of optimality criteria to be fulfilled in Concept Learning besides the
conditions of completeness and consistency.</p>
      </sec>
      <sec id="sec-3-2">
        <title>The Solution Strategy</title>
        <p>
In Def. 1, we have considered a language of hypotheses DL_H that allows for
the generation of concept definitions in any DL. These definitions can be
organized according to the concept subsumption relationship ⊑. Since ⊑ induces a
quasi-order (i.e., a reflexive and transitive relation) on DL_H [
          <xref ref-type="bibr" rid="ref11 ref2">2,11</xref>
], the
problem stated in Def. 1 can be cast as the search for a correct (i.e., complete and
consistent) concept definition in (DL_H, ⊑) according to the generalization as
search approach in Mitchell's vision. In such a setting, one can define suitable
techniques (called refinement operators) to traverse (DL_H, ⊑) either top-down
or bottom-up.
        </p>
        <p>Definition 2 (Refinement operator in DLs). Given a quasi-ordered search
space (DL_H, ⊑):
– a downward refinement operator is a mapping ρ : DL_H → 2^{DL_H} such that,
∀C ∈ DL_H, ρ(C) ⊆ {D ∈ DL_H | D ⊑ C};
– an upward refinement operator is a mapping δ : DL_H → 2^{DL_H} such that,
∀C ∈ DL_H, δ(C) ⊆ {D ∈ DL_H | C ⊑ D}.</p>
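        <p>A toy downward refinement operator can be sketched as follows (our illustration: hypotheses are conjunctions of atomic concept names, so adding a conjunct always yields a subsumed concept, i.e. ρ(C) ⊆ {D | D ⊑ C}):</p>
        <preformat><![CDATA[
# A downward refinement operator rho over a toy hypothesis language:
# a concept is a frozenset of atomic concept names read as a conjunction.
ATOMS = {"Person", "Male", "Parent"}

def rho(C):
    """Downward refinements of C: add one conjunct not yet in C."""
    return {frozenset(C | {a}) for a in ATOMS - C}

top = frozenset()              # the empty conjunction plays the role of ⊤
for D in rho(frozenset({"Person"})):
    print(set(D))              # each D is subsumed by {'Person'}
]]></preformat>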
        <p>Definition 3 (Refinement chain in DLs). Given a downward (resp., upward)
refinement operator ρ (resp., δ) for a quasi-ordered search space (DL_H, ⊑),
a refinement chain from C ∈ DL_H to D ∈ DL_H is a sequence
C = C0, C1, ..., Cn = D
such that Ci ∈ ρ(Ci−1) (resp., Ci ∈ δ(Ci−1)) for every 1 ≤ i ≤ n.</p>
        <p>Note that, given (DL_H, ⊑), there is an infinite number of generalizations and
specializations. Usually one tries to define refinement operators that can
traverse the hypothesis space efficiently in pursuit of one of the correct
definitions (w.r.t. the examples that have been provided).</p>
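        <p>As a sketch of how refinement chains (Def. 3) drive the search for a correct definition, the following top-down loop starts a chain at ⊤ and applies the toy operator ρ from the sketch above until completeness and consistency hold; the coverage test and the examples are illustrative assumptions:</p>
        <preformat><![CDATA[
# Top-down search along refinement chains C0 = T, C1, ..., Cn until a
# hypothesis covers all positives and no negatives (breadth-first).
from collections import deque

ATOMS = {"Person", "Male", "Parent"}
pos = [{"Person", "Male", "Parent"}]             # individuals as atom sets
neg = [{"Person", "Male"}, {"Person", "Parent"}]

def covers(C, ind):
    return C.issubset(ind)                       # every conjunct holds for ind

def rho(C):
    return [frozenset(C | {a}) for a in ATOMS - C]

def search():
    queue = deque([frozenset()])                 # start the chain at top
    while queue:
        C = queue.popleft()
        if all(covers(C, i) for i in pos) and not any(covers(C, i) for i in neg):
            return C
        queue.extend(rho(C))
    return None

print(set(search()))                             # {'Male', 'Parent'}
]]></preformat>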
        <p>Definition 4 (Properties of refinement operators in DLs). A downward
refinement operator ρ for a quasi-ordered search space (DL_H, ⊑) is
– (locally) finite iff ρ(C) is finite for all concepts C ∈ DL_H.
– redundant iff there exists a refinement chain from a concept C ∈ DL_H to a
concept D ∈ DL_H which does not go through some concept E ∈ DL_H, and a
refinement chain from C to a concept equal to D which does go through E.
– proper iff, for all concepts C, D ∈ DL_H, D ∈ ρ(C) implies C ≢ D.
– complete iff, for all concepts C, D ∈ DL_H with C ⊏ D, a concept E ∈ DL_H
with E ≡ C can be reached from D by ρ.
– weakly complete iff, for all concepts C ∈ DL_H with C ⊏ ⊤, a concept
E ∈ DL_H with E ≡ C can be reached from ⊤ by ρ.</p>
        <p>The corresponding notions for upward refinement operators are defined dually.</p>
        <p>
Designing a refinement operator requires deciding which properties
are most useful in practice for the underlying learning algorithm.
Considering the properties reported in Def. 4, it has been shown that the most feasible
property combination for Concept Learning in expressive DLs such as ALC is
{weakly complete, complete, proper} [
          <xref ref-type="bibr" rid="ref21">21</xref>
]. Only for less expressive DLs like EL do
ideal, i.e. complete, proper and finite, operators exist [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Concept Learning as Constructive Reasoning in DLs</title>
      <p>In this section, we formally characterize Concept Learning in DLs by emphasizing
the constructive nature of the inductive inference.
</p>
      <sec id="sec-4-1">
        <title>Second-order Concept Expressions</title>
        <p>We start from the syntax of any Description Logic DL, where N_C, N_R,
and N_O are the alphabets of concept names, role names and individual names,
respectively. In order to write second-order formulas, we introduce a set N_X =
{X0, X1, X2, ...} of concept variables, which we can quantify over. We denote by
DL_X the language of concept terms obtained from DL by adding N_X.</p>
        <sec id="sec-4-1-1">
          <title>De nition 5 (Concept term). A concept term in DLX is a concept formed</title>
          <p>according to the speci c syntax rules of DL augmented with the additional rule
C ! X for X 2 Nx.</p>
        <p>Since we are not interested in second-order DLs in themselves, we restrict our
language to particular existential second-order formulas of interest to this paper.
In particular, we allow formulas involving an ABox. By doing so, we can easily
model the computation of, e.g., the MSC, which was left out as future work
in Colucci et al.'s framework. This paves the way to the modeling of Concept
Learning as shown in the next subsection.</p>
        <p>Definition 6 (Concept expression). Let a1, ..., am ∈ DL be individuals, and
C1, ..., Cm, D1, ..., Dm ∈ DL_X be concept terms containing concept variables
X0, X1, ..., Xn. A concept expression Γ in DL_X is a conjunction
(C1 ⊑ D1) ∧ ... ∧ (Cl ⊑ Dl) ∧ (Cl+1 ⋢ Dl+1) ∧ ... ∧ (Cm ⋢ Dm) ∧
(a1 : D1) ∧ ... ∧ (al : Dl) ∧ (al+1 : ¬Dl+1) ∧ ... ∧ (am : ¬Dm)    (1)
of (negated or not) concept subsumptions and concept assertions, with 1 ≤ l ≤ m.</p>
          <p>
            We use General Semantics, also called Henkin semantics, for interpreting
concept variables [
            <xref ref-type="bibr" rid="ref15">15</xref>
]. In such a semantics, variables denoting unary predicates
can be interpreted only by some subsets among all the ones in the powerset of
the domain 2^(Δ^I); instead, in Standard Semantics a concept variable could be
interpreted as any subset of Δ^I. Adapting General Semantics to our problem,
the structure we consider is exactly the sets interpreting concepts in DL. That
is, the interpretation X^I of a concept variable X ∈ DL_X must coincide with the
interpretation E^I of some concept E ∈ DL. The interpretations we refer to in
the following definition are of this kind.
          </p>
        <p>Definition 7 (Satisfiability). A concept expression Γ of the form (1) is
satisfiable in DL iff there exist n + 1 concepts E0, ..., En ∈ DL such that,
extending the semantics of DL for each interpretation I with (Xi)^I = (Ei)^I for
i = 0, ..., n, it holds that
1. for each j = 1, ..., l, and every interpretation I, (Cj)^I ⊆ (Dj)^I and (aj)^I ∈
(Dj)^I, and
2. for each j = l + 1, ..., m, there exists an interpretation I s.t. (Cj)^I ⊈ (Dj)^I
and (aj)^I ∉ (Dj)^I.
Otherwise, Γ is said to be unsatisfiable in DL.</p>
        <p>Definition 8 (Solution). If a concept expression Γ of the form (1) is satisfiable
in DL, then ⟨E0, ..., En⟩ is a solution for Γ. Moreover, we say that the formula
∃X0 ... ∃Xn.Γ    (2)
is true in DL if there exists at least one solution for Γ; otherwise it is false.</p>
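        <p>For illustration (our example, not from the paper), consider a toy KB asserting Father(john) and ¬Father(mary), with Father ⊑ Person in the TBox. The concept expression below is then satisfiable, with the concept Father witnessing the existential quantifier:</p>
        <preformat><![CDATA[
% An illustrative instance of formulas (1)-(2), written in LaTeX:
\[
  \Gamma \;=\; (X \sqsubseteq \mathsf{Person}) \wedge
               (\mathsf{john} : X) \wedge (\mathsf{mary} : \neg X)
\]
% Taking E_0 = Father extends every interpretation so that X^I = Father^I
% and all conjuncts of Gamma hold; hence \exists X . \Gamma is true in DL,
% and <Father> is a solution in the sense of Def. 8.
]]></preformat>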
      </sec>
      <sec id="sec-4-2">
        <title>Modeling Concept Learning with Second-Order DLs</title>
        <p>
          It has been pointed out that the constructive reasoning tasks can be divided
into two main categories: tasks for which we just need to compute a concept
(or a set of concepts) and those for which we need to find a concept (or a set
of concepts) according to some minimality/maximality criteria [
          <xref ref-type="bibr" rid="ref8">8</xref>
]. In the first
case, we have a set of solutions, while in the second one we also have a set of
sub-optimal solutions to the main problem. E.g., the set of sub-optimal solutions in
LCS is represented by the common subsumers. Both MSC and Concept Learning
belong to this second category of constructive reasoning tasks. We remind the
reader that MSC can be easily reduced to LCS for DLs that admit the one-of
concept constructor. However, this reduction is not trivial for the general case.
Hereafter, first, we show how to model MSC in terms of formula (2). This step
is to be considered as functional to the modeling of Concept Learning.
Most Specific Concept. Intuitively, the MSC of individuals described in an
ABox is a concept description that represents all the properties of the individuals,
including the concept assertions they occur in and their relationships to other
individuals. Similarly to the LCS, the MSC is uniquely determined up to equivalence.
More precisely, the set of most specific concepts of individuals a1, ..., ak ∈ DL
forms an equivalence class, and if S is defined to be the set of all concept
descriptions that have a1, ..., ak as their instance, then this class is the least element
in [S] w.r.t. the partial ordering on equivalence classes induced by the quasi-ordering
⊑. Analogously to the LCS, we refer to one of its representatives by
MSC(a1, ..., ak). The MSC need not exist. Three different phenomena may cause
the non-existence of a least element in [S], and thus of a MSC:
1. [S] might be empty, or
2. [S] might contain different minimal elements, or
3. [S] might contain an infinite decreasing chain [D1] ≻ [D2] ≻ ...
        <p>A concept E is not the MSC of a1, ..., ak iff the following formula is true in DL:
∃X.(a1 : X) ∧ ... ∧ (ak : X) ∧ (X ⊑ E) ∧ (E ⋢ X)    (3)
that is, E is not the MSC if there exists a concept X having a1, ..., ak as
instances which is strictly more specific than E.</p>
        <p>Concept Learning. Following Def. 1, we assume that Ind_C^+(A) = {a1, ..., am}
and Ind_C^−(A) = {b1, ..., bn}. A concept D ∈ DL_H is a correct concept definition
for the target concept name C w.r.t. Ind_C^+(A) and Ind_C^−(A) iff it is a solution for
the following second-order concept expression:
(C ⊑ X) ∧ (X ⊑ C) ∧ (a1 : X) ∧ ... ∧ (am : X) ∧ (b1 : ¬X) ∧ ... ∧ (bn : ¬X)    (4)
The CSP version of the task is therefore modeled with the following formula:
∃X.(C ⊑ X) ∧ (X ⊑ C) ∧ (a1 : X) ∧ ... ∧ (am : X) ∧ (b1 : ¬X) ∧ ... ∧ (bn : ¬X)    (5)
A simple OP version of the task could be modeled with the formula:
∃X.(C ⊑ X) ∧ (X ⊑ C) ∧ (X ⊑ E) ∧ (E ⋢ X) ∧
(a1 : X) ∧ ... ∧ (am : X) ∧ (b1 : ¬X) ∧ ... ∧ (bn : ¬X)    (6)
which asks for solutions that comply with a minimality criterion involving
concept subsumption checks. Therefore, a concept E ∈ DL_H is not a correct and
most specific concept definition for C w.r.t. Ind_C^+(A) and Ind_C^−(A) if there
exists a correct concept definition X that is strictly more specific than E.</p>
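        <p>The following sketch (ours; concepts, subsumption and instance checking are toy stand-ins for reasoner calls over a finite DL_H) contrasts the CSP version (5), which collects every correct definition, with the OP version (6), which discards any solution beaten by a strictly more specific one:</p>
        <preformat><![CDATA[
# Solving the CSP (5) and the OP (6) by enumeration over a finite toy DL_H:
# concepts are conjunctions of atoms; X is subsumed by Y iff Y's conjuncts
# are a subset of X's.
from itertools import combinations

ATOMS = ("Person", "Male", "Parent")
H = [frozenset(c) for r in range(len(ATOMS) + 1)
     for c in combinations(ATOMS, r)]            # the finite DL_H

pos = [{"Person", "Male", "Parent"}]             # Ind_C^+(A), as atom sets
neg = [{"Person", "Male"}]                       # Ind_C^-(A)

def instance_of(ind, X):                         # toy K |= X(ind)
    return X.issubset(ind)

def subsumed_by(X, Y):                           # toy X subsumed by Y
    return Y.issubset(X)

# CSP, formula (5): any X covering all positives and no negatives.
solutions = [X for X in H
             if all(instance_of(a, X) for a in pos)
             and not any(instance_of(b, X) for b in neg)]

# OP, formula (6): keep E only if no solution X is strictly more specific.
optimal = [E for E in solutions
           if not any(subsumed_by(X, E) and not subsumed_by(E, X)
                      for X in solutions)]
print([set(E) for E in optimal])
]]></preformat>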
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, we have provided a formal characterization of Concept Learning
in DLs according to a declarative modeling language which abstracts from the
specific algorithms used to solve the task. To this purpose, we have defined a
fragment of second-order logic under the general semantics which makes it possible to
express formulas involving concept assertions from an ABox. Such a fragment
enables us to cover the general case of MSC as well. Also, as a minor
contribution, we have suggested that the generalization as search approach to Concept
Learning in Mitchell's vision is precisely the unifying framework needed to
accompany the declarative modeling language proposed in this paper with a way of
computing solutions to the problems declaratively modeled with this language.
More precisely, the computational method we refer to in this paper is based on
the iterative application of suitable refinement operators. Since many refinement
operators for DLs are already available in the literature, the method can be
designed so that it can be instantiated with a refinement operator specifically
defined for the DL at hand.</p>
      <p>
        The preliminary results reported in this paper open a promising direction
of research at the intersection of KR and ML. For this research we have taken
inspiration from recent results in both areas. On one hand, Colucci et al.'s work
provides a procedure which combines Tableaux calculi for DLs with rules for
the substitution of concept variables in second-order concept expressions [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. On
the other hand, De Raedt et al.'s work shows that off-the-shelf constraint
programming techniques can be applied to various ML problems, once reformulated
as CSPs and OPs [
        <xref ref-type="bibr" rid="ref9">9</xref>
]. Interestingly, both works pursue a unified view of the
inferential problems of interest to their respective fields of research. This match
of research efforts in the two fields has motivated the work presented in this
paper which, therefore, moves a step towards bridging the gap between KR and
ML in areas such as the maintenance of KBs, where the two fields have already
produced interesting results, though mostly independently from each other. New
questions and challenges are raised by the cross-fertilization of these results. In
the future, we intend to investigate how to express optimality criteria such as
the information gain function within the second-order concept expressions and
how the generalization as search approach can be effectively integrated with
the second-order calculus.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
<article-title>Least common subsumers and most specific concepts in a description logic with existential restrictions and terminological cycles</article-title>
          . In: Gottlob,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Walsh</surname>
          </string-name>
          , T. (eds.)
<source>IJCAI'03: Proceedings of the 18th International Joint Conference on Artificial Intelligence</source>
          . pp.
          <volume>319</volume>
–
          <fpage>324</fpage>
          . Morgan Kaufmann Publishers (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Badea</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , Nienhuys-Cheng, S.:
<article-title>A refinement operator for description logics</article-title>
          . In: Cussens,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Frisch</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
<source>Inductive Logic Programming, Lecture Notes in Artificial Intelligence</source>
          , vol.
          <year>1866</year>
          , pp.
          <volume>40</volume>
–
          <fpage>59</fpage>
          . Springer-Verlag (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Borgida</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirsh</surname>
          </string-name>
          , H.:
          <article-title>Computing least common subsumers in description logics</article-title>
          .
<source>In: Proc. of the 10th National Conf. on Artificial Intelligence</source>
          . pp.
          <volume>754</volume>
–
          <fpage>760</fpage>
          . The AAAI Press / The MIT Press (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          , Hirsh, H.:
          <article-title>Learnability of description logics</article-title>
          . In: Haussler,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (ed.)
          <source>Proc. of the 5th Annual ACM Conf. on Computational Learning Theory</source>
          . pp.
          <volume>116</volume>
–
          <fpage>127</fpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          , Hirsh, H.:
          <article-title>Learning the CLASSIC description logic: Theoretical and experimental results</article-title>
          .
          <source>In: Proc. of the 4th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR'94)</source>
          . pp.
          <volume>121</volume>
–
          <fpage>133</fpage>
          . Morgan Kaufmann (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          , Hirsh, H.:
          <article-title>The learnability of description logics with equality constraints</article-title>
          .
          <source>Machine Learning</source>
          <volume>17</volume>
          (
          <issue>2-3</issue>
          ),
          <volume>169</volume>
–
          <fpage>199</fpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Colucci</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Di Noia,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Di Sciascio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Donini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.M.</given-names>
            ,
            <surname>Ragone</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Second-order description logics: Semantics, motivation, and a calculus</article-title>
          . In: Haarslev,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Toman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Weddell</surname>
          </string-name>
          , G.E. (eds.)
          <source>Proc. of the 23rd Int. Workshop on Description Logics (DL</source>
          <year>2010</year>
          ).
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>573</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Colucci</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Di Noia,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Di Sciascio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Donini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.M.</given-names>
            ,
            <surname>Ragone</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
<article-title>A unified framework for non-standard reasoning services in description logics</article-title>
          . In: Coelho,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Studer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Wooldridge</surname>
          </string-name>
          , M. (eds.)
<source>ECAI 2010 - 19th European Conference on Artificial Intelligence. Frontiers in Artificial Intelligence and Applications</source>
          , vol.
          <volume>215</volume>
          , pp.
          <volume>479</volume>
–
          <fpage>484</fpage>
          . IOS Press (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guns</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nijssen</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Constraint programming for data mining and machine learning</article-title>
          . In: Fox,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Poole</surname>
          </string-name>
          ,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (eds.)
<source>Proc. of the 24th AAAI Conference on Artificial Intelligence</source>
          . AAAI Press (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Donini</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
<article-title>An efficient method for hybrid deduction</article-title>
          .
          <source>In: ECAI</source>
          . pp.
          <volume>246</volume>
–
          <issue>252</issue>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fanizzi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iannone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmisano</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
          </string-name>
          , G.:
<article-title>Knowledge-intensive induction of terminologies from metadata</article-title>
          . In: McIlraith,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Plexousakis</surname>
          </string-name>
          , D., van
          <string-name>
            <surname>Harmelen</surname>
            ,
            <given-names>F</given-names>
          </string-name>
          . (eds.)
          <source>The Semantic Web - ISWC 2004: Third International Semantic Web Conference. Lecture Notes in Computer Science</source>
          , vol.
          <volume>3298</volume>
          , pp.
          <volume>441</volume>
          {
          <fpage>455</fpage>
          . Springer (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Fanizzi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iannone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmisano</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
          </string-name>
          , G.:
          <article-title>Concept formation in expressive description logics</article-title>
          . In: Boulicaut,
          <string-name>
            <given-names>J.F.</given-names>
            ,
            <surname>Esposito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (eds.)
          <source>Machine Learning: ECML 2004. Lecture Notes in Computer Science</source>
          , vol.
          <volume>3201</volume>
          , pp.
          <volume>99</volume>
          {
          <fpage>110</fpage>
          . Springer (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>DL-FOIL concept learning in description logics</article-title>
          . In: Zelezny,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Lavrac</surname>
          </string-name>
          , N. (eds.)
          <source>Inductive Logic Programming. Lecture Notes in Computer Science</source>
          , vol.
          <volume>5194</volume>
          , pp.
          <volume>107</volume>
–
          <fpage>121</fpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Frazier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>CLASSIC learning</article-title>
          .
          <source>Machine Learning</source>
          <volume>25</volume>
          (
          <issue>2-3</issue>
          ),
          <volume>151</volume>
–
          <fpage>193</fpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Henkin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Completeness in the theory of types</article-title>
          .
          <source>Journal of Symbolic Logic</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ),
          <volume>81</volume>
–
          <fpage>91</fpage>
          (
          <year>1950</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Iannone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmisano</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fanizzi</surname>
            ,
            <given-names>N.:</given-names>
          </string-name>
          <article-title>An algorithm based on counterfactuals for concept learning in the semantic web</article-title>
          .
          <source>Applied Intelligence</source>
          <volume>26</volume>
          (
          <issue>2</issue>
          ),
          <volume>139</volume>
–
          <fpage>159</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kietz</surname>
          </string-name>
          , J.:
          <article-title>Learnability of description logic programs</article-title>
          . In: Matwin,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Sammut</surname>
          </string-name>
          , C. (eds.)
<source>Inductive Logic Programming. Lecture Notes in Artificial Intelligence</source>
          , vol.
          <volume>2583</volume>
          , pp.
          <volume>117</volume>
–
          <fpage>132</fpage>
          . Springer (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kietz</surname>
            ,
            <given-names>J.U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A polynomial approach to the constructive induction of structural knowledge</article-title>
          .
          <source>Machine Learning</source>
          <volume>14</volume>
          (
          <issue>1</issue>
          ),
          <volume>193</volume>
–
          <fpage>217</fpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. Kusters, R.:
          <source>Non-Standard Inferences in Description Logics, Lecture Notes in Computer Science</source>
          , vol.
          <volume>2100</volume>
          . Springer (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20. Kusters, R.,
          <string-name>
            <surname>Molitor</surname>
          </string-name>
          , R.:
<article-title>Approximating most specific concepts in description logics with existential restrictions</article-title>
          .
          <source>AI</source>
          Communications
          <volume>15</volume>
          (
          <issue>1</issue>
          ),
          <volume>47</volume>
–
          <fpage>59</fpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
<article-title>Foundations of refinement operators for description logics</article-title>
          . In: Blockeel,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Ramon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Shavlik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.W.</given-names>
            ,
            <surname>Tadepalli</surname>
          </string-name>
          , P. (eds.)
<source>Inductive Logic Programming. Lecture Notes in Artificial Intelligence</source>
          , vol.
          <volume>4894</volume>
          , pp.
          <volume>161</volume>
–
          <fpage>174</fpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
<article-title>Concept learning in description logics using refinement operators</article-title>
          .
          <source>Machine Learning</source>
          <volume>78</volume>
          (
          <issue>1-2</issue>
          ),
          <volume>203</volume>
–
          <fpage>250</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>DL-Learner: Learning concepts in description logics</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>10</volume>
          ,
          <volume>2639</volume>
–
          <fpage>2642</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haase</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
<article-title>Ideal downward refinement in the EL description logic</article-title>
          . In: De Raedt, L. (ed.)
          <source>Inductive Logic Programming. Lecture Notes in Computer Science</source>
          , vol.
          <volume>5989</volume>
          , pp.
          <volume>73</volume>
–
          <issue>87</issue>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Lisi</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          :
          <article-title>Building rules on top of ontologies for the semantic web with inductive logic programming</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          <volume>8</volume>
          (
          <issue>03</issue>
          ),
          <volume>271</volume>
–
          <fpage>300</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Lisi</surname>
            ,
            <given-names>F.A.:</given-names>
          </string-name>
          <article-title>Inductive logic programming in databases: From Datalog to DL+log</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          <volume>10</volume>
          (
          <issue>3</issue>
          ),
          <volume>331</volume>
–
          <fpage>359</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Usability issues in knowledge representation systems</article-title>
          . In: Mostow,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , Rich, C. (eds.)
<source>Proc. of the 15th National Conf. on Artificial Intelligence and 10th Innovative Applications of Artificial Intelligence Conference</source>
          . pp.
          <volume>608</volume>
–
          <fpage>614</fpage>
          . AAAI Press / The MIT Press (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28. Michalski, R.S.:
          <article-title>A theory and methodology of inductive learning</article-title>
          . In: Michalski, R.S., Carbonell, J.G., Mitchell, T.M. (eds.)
<article-title>Machine Learning: an artificial intelligence approach</article-title>
          , vol. I. Morgan Kaufmann (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29. Mitchell, T.:
          <article-title>Generalization as search</article-title>
          .
<source>Artificial Intelligence</source>
          <volume>18</volume>
          ,
          <fpage>203</fpage>
–
          <fpage>226</fpage>
          (
          <year>1982</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30. Mitchell, T.:
          <article-title>Machine Learning</article-title>
          .
          <source>McGraw Hill</source>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Muggleton</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Inductive logic programming</article-title>
          . In: Arikawa,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Goto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Ohsuga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Yokomori</surname>
          </string-name>
          , T. (eds.)
          <source>Proc. of the 1st Conf. on Algorithmic Learning Theory</source>
          . Springer/Ohmsha (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Nebel</surname>
          </string-name>
          , B.:
          <article-title>Reasoning and revision in hybrid representation systems</article-title>
          ,
          <source>Lecture Notes in Computer Science</source>
          , vol.
          <volume>422</volume>
          . Springer (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33. Nienhuys-Cheng, S., de Wolf, R.:
          <article-title>Foundations of inductive logic programming</article-title>
          ,
<source>Lecture Notes in Artificial Intelligence</source>
          , vol.
          <volume>1228</volume>
          . Springer (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Quinlan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
<article-title>Learning logical definitions from relations</article-title>
          .
          <source>Machine Learning</source>
          <volume>5</volume>
          ,
          <volume>239</volume>
–
          <fpage>266</fpage>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Rouveirol</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ventos</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Towards learning in CARIN-ALN</article-title>
          . In: Cussens,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Frisch</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
<source>Inductive Logic Programming. Lecture Notes in Artificial Intelligence</source>
          , vol.
          <year>1866</year>
          , pp.
          <volume>191</volume>
–
          <fpage>208</fpage>
          . Springer (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Valiant</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A theory of the learnable</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>27</volume>
          (
          <issue>11</issue>
          ),
          <volume>1134</volume>
–
          <fpage>1142</fpage>
          (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>