-

Analogical Reasoning in Description Logics

Claudia d'Amato

claudia.damato@di.uniba.it 0

Nicola Fanizzi

fanizzi@di.uniba.it 0

Floriana Esposito

esposito@di.uniba.it 0 0 Dipartimento di Informatica, Universita` degli Studi di Bari , Campus Universitario, Via Orabona 4, 70125 Bari , Italy

This work presents a framework, founded on multi-relational instancebased learning, for inductive (memory-based) reasoning on knowledge bases expressed in Description Logics. The procedure, which exploits a relational dissimilarity measure based on the notion of Information Content, can be employed both to answer to class-membership queries and to predict assertions, that may not be logically entailed by the knowledge base. These tasks may be the baseline for other inductive methods for ontology construction and evolution. In a preliminary experimentation, we show that the method is sound. Besides it is actually able to induce new knowledge that might be acquired in the knowledge base.

Most of the research on formal ontologies has been focussing on methods based on deductive reasoning. However, important tasks that are likely to be provided by new generation knowledge-based systems, such as classification, construction, revision, population and evolution are likely supported also by inductive methods.

In order to support these tasks and overcome the inherent complexity of classic logic-based inference other forms of reasoning are being investigated, both deductive, such as non-monotonic, paraconsistent [ 10 ], approximate reasoning (see the discussion in [ 11 ]), case-based reasoning [ 7 ] and inductive-analogical forms such as inductive generalization [ 5 ] and specialization [ 9 ].

All these approaches require scalable and efficient reasoning. From this viewpoint, instance-based inductive methods [ 8 ] are particularly well suited. Indeed, they are known to be both very efficient and fault-tolerant compared to the classic logic-based methods. Particularly being this methods fault-tolerant, they can be suitably applied to shared knowledge bases coming from distributed sources and for this reason often characterized by the presence of noise.

Instance-based algorithms have been mainly applied to attribute-value representations to solve tasks such as classification, clustering, and case-based reasoning. Upgrading the algorithms to work on multirelational representations [ 8 ], namely on the concept languages used in the Semantic Web, founded in Description Logics (DLs) [ 1 ] (see Sect. 2), requires novel similarity measures that are suitable for such First Order Logic fragments. As pointed out in [ 4 ], most of these measures focus on the similarity of atomic concepts within hierarchies or simple ontologies. Yet, recently, dissimilarity measures for composite concept descriptions in DL have been proposed [ 6 ]. These measures (see Sect. 3) elicit the underlying semantics by querying the knowledge base (as hinted in [ 3 ]) for determining the extension of concept descriptions (estimated by their retrieval [ 1 ]). Such measures can be applied to assess the dissimilarity between concepts and/or between individuals.

An instance-based framework (Sect. 4) applicable to ontological knowledge has been devised. Exploiting a dissimilarity measure, it can derive inductively (by analogy) both consistent consequences from the knowledge base and also new assertions which were not previously logically derivable. Such a framework can be effectively used to semi-automatize the task of populating ontologies that are partially defined (in terms of assertions). For instance, classification can be performed even in absence of a definition for the target concept in the knowledge base by analogy with a set of training assertions on such a concept provided by an expert. In turn, this enables also other related (bottom-up) services such as classification, clustering, ontology construction and evolution. Specifically, we have worked on a classification procedure based on a lazy learning approach, namely a relational form of the k-Nearest Neighbor (k-NN) procedure (see [ 12 ]). The main idea is that similar individuals, by analogy, should likely belong to similar concepts. The adaptation to the context of DLs could not be straightforward. Indeed a theoretical problem has been posed by the Open World Assumption (OWA) that is generally made in the target context, differently from data mining settings where the Closed World Assumption (CWA) is the standard. Besides, in the standard kNN multi-class setting, different classes are often assumed to be disjoint, which is not typical in a Semantic Web context.

The measure and the modified classification method have been implemented and some preliminary experimental results with real ontologies have been presented (Sect. 5). Moreover, the decision procedure could be further enriched with a non-parametric statistic test for controlling the degree of significance of the classification of new instances. 2

Representation and Logic Inference

The basics of ALC are briefly recalled. This logic adopts constructors supported by the standard Web ontology languages (see the DL handbook [ 1 ] for a thorough reference).

In DLs, concept descriptions are defined in terms of a set NC of primitive concept names and a set NR of primitive roles. The semantics of the concept descriptions is defined by an interpretation I = (ΔI , ·I ), where ΔI is a non-empty set, the domain of the interpretation, and ·I is the interpretation function that maps each A ∈ NC to a set AI ⊆ ΔI and each R ∈ NR to RI ⊆ ΔI × ΔI . The top concept > is interpreted as the whole domain ΔI , while the bottom concept ⊥ corresponds to ∅. Complex descriptions can be built in ALC using the following constructors. The language supports full negation: given any concept description C, denoted ¬C, it amounts to ΔI \ CI . The conjunction of concepts, denoted with C1 u C2, yields an extension C1I ∩ C2I and, dually, concept disjunction, denoted with C1 t C2, yields C1I ∪ C2I . Finally, there are two restrictions on roles: the existential restriction, denoted with ∃R.C, and interpreted as the set {x ∈ ΔI | ∃y ∈ ΔI : (x, y) ∈ RI ∧ y ∈ CI } and the value restriction, denoted with ∀R.C, whose extension is {x ∈ ΔI | ∀y ∈ ΔI : (x, y) ∈ RI → y ∈ CI }.

A knowledge base K = hT , Ai contains a TBox T and an ABox A. T is a set of concept definitions1 C ≡ D, meaning CI = DI , where C is atomic (the concept name) and D is an arbitrarily complex description defined as above. A contains assertions on the world state, e.g. C(a) and R(a, b), meaning that aI ∈ CI and (aI , bI ) ∈ RI . Moreover, normally the unique names assumption is made on the ABox individuals. These are denoted with Ind(A). In this context the most common inference is the semantic notion of subsumption between concepts: Definition 2.1 (subsumption). given two concept descriptions C and D, C subsumes D, denoted by D v C, iff for every interpretation I it holds that DI ⊆ CI . When D v C and C v D then they are equivalent, denoted with C ≡ D.

Semantically equivalent (yet syntactically different) descriptions can be given for the same concept. Nevertheless, equivalent concepts can be reduced to a normal form by means of rewriting rules that preserve their equivalence [ 1 ]: Definition 2.2 (normal form). A concept description D is in ALC normal form iff D = ⊥ or D = > or if D = D1 t · · · t Dn (∀i = 1, . . . , n, Di 6≡ ⊥) and Di =

A∈prim(Di)

R∈NR

 A u l ∀R.valR(Di) u

l E∈exR(Di)

 ∃R.E – prim(Di) is the set of (negated) primitive concepts occurring at the top level of Di; – valR(Di) is the conjunction C1i u · · · u Cni in the value restriction of role R, if any (otherwise valR(Di) = >); – exR(Di) is the set of concepts in the existential restrictions on R. and ∀R ∈ NR ∀E ∈ exR(Di) ∪ {valR(Di)} E is in normal form.

L will denote the set of concepts in normal form (L = ALC/≡).

Another inference for reasoning with individuals requires finding the concepts which an individual belongs to, especially the most specific one: Definition 2.3 (most specific concept). Given an ABox A and an individual a, the most specific concept of a w.r.t. A is the concept C, denoted MSCA(a), such that A |= C(a) and for any other concept D such that A |= D(a), it holds that C v D, where |= stands for the logical entailment.

In a language endowed with existential (or numeric) restrictions, such as ALC, the exact MSC may not be always expressed with a finite description [ 1 ], yet it may be approximated [ 5, 2 ]. 1 The cases of general axioms or cyclic definitions will not considered here.

A Dissimilarity Measure for ALC A measure of concept similarity can be derived from the notion of Information Content (IC) that, in turn, depends on the probability of an individual to belong to a certain concept. Now, differently from other works which assume that a probability distribution for the concepts in an ontology is known, here it is derived from the knowledge base, that is from the distribution that can be estimated therein [ 3 ].

In order to approximate this probability for a certain concept C, its extension is used with respect to the considered ABox. Namely, we chose the canonical interpretation IA, i.e. the one adopting the set of individuals mentioned in the ABox as its domain and the identity as its interpretation function [ 1 ]. Now, given a concept C its probability is estimated by: pr(C) = |CIA |/|ΔIA |. Finally, the information content of a concept computed, employing this probability: IC(C) = − log pr(C).

A measure of the concept dissimilarity is now formally defined [ 6 ]: Cm,eDasu∈reLf, wisitha Cfun=ctFioinn=1fC:i aLnd×DL=7→Fm

j=1 Dj Definition 3.1. Let L be the set of all concepts in ALC normal form. The dissimilarity

R+ defined recursively as follows. For all  0  f (C, D) := ft(C, D) =  ∞  maxi ∈ [1, n] fu(Ci, Dj )  j ∈ [1, m]

if C ≡ D if C u D = ⊥ o.w. fu(Ci, Dj ) := fP (prim(Ci), prim(Dj )) + f∀(Ci, Dj ) + f∃(Ci, Dj )

∞ if prim(Ci) u prim(Dj ) ≡ ⊥ fP (prim(Ci), prim(Dj )) :=

IC(prim(Ci)uprim(Dj))+1

IC(LCS(prim(Ci),prim(Dj)))+1 o.w. where LCS computes the least common subsumer of the input concepts [ 1 ]. f∀(Ci, Dj ) :=

X ft(valR(Ci), valR(Dj )) f∃(Ci, Dj ) := p p=m1,a..x.,M ft(Cik, Dj ) R∈NR

N X X

R∈NR k=1 where Cik ∈ exR(Ci) and Djp ∈ exR(Dj ) and we suppose w.l.o.g. that N = |exR(Ci)| ≥ |exR(Dj )| = M , otherwise the indices N and M are to be exchanged.

Now f has values in [0, ∞]. It is possible to derive a normalized dissimilarity measure as shown in the following.

Definition 3.2 (dissimilarity measure). The dissimilarity measure d is a function d :

L × L 7→ [ 0, 1 ], such that given the concept descriptions in normal form C = Fin=1 Ci and D = Fm j=1 Dj , let

 0 d(C, D) :=  1  1 − 1/f (C, D) if f (C, D) = 0 if f (C, D) = ∞ otherwise

A measure has been defined whose baseline (counts on the extensions of primitive concepts) depends on the semantics of the knowledge base, as conveyed by the ABox assertions. This is in line with the ideas in [ 3, 4 ], where semantics is elicited as a probability distribution over the domain Δ. By comparing concept descriptions reduced to the normal form, we have a structural definition of dissimilarity. However, since MSCs are computed from the same ABox assertions, reflecting the current knowledge state, this guarantees that structurally similar representations will be obtained for semantically similar concepts.

Let us recall that it is possible to calculate the most specific concept of an individual a w.r.t. the ABox, MSC(a) (see Def. 2.3) or at least its approximation MSCk(a) up to a certain description depth k. In the following we suppose to have fixed this k to the depth of the ABox. Given two individuals a and b in the ABox, we consider MSCk(a) and MSCk(b) (supposed in normal form). Now, in order to assess the dissimilarity between the individuals, the f measure can be applied:

d(a, b) := d(MSCk(a), MSCk(b))

The computational complexity of the measure is strictly related to some reasoning services, namely retrieval and instance-checking (see [ 1 ], Ch. 3). Nevertheless, it is limited to primitive concepts only and it is linear in the number of individuals in the ABox (data complexity). In practical applications, these computations may be efficiently carried out exploiting the statistics that are maintained by the DBMSs. Besides, the counts that are necessary for computing the concept extensions could be estimated by means of the probability distribution over the domain. 4

Nearest Neighbor for Description Logics

Given an ontology, a classification method can be employed for assigning an individual with the concepts it is likely to belong to. These individuals are supposed to be partially described by assertions in the ABox, thus classification may induce new assertions by analogy, which cannot be inferred by deduction.

We briefly review the basics of the k-Nearest Neighbor method (k-NN) and propose how to exploit this classification procedure for inductive reasoning. It is considered a lazy approach to learning, since the learning phase is reduced to memorizing instances of the target concepts pre-classified by an expert. Then, during the classification phase, a notion of dissimilarity for the instance space is employed to classify a new instance in analogy with its neighbor.

Let xq be the instance that must be classified. Using a dissimilarity measure (or any other distance function), the set of the k nearest pre-classified instances w.r.t. xq is selected. The objective is to learn a discrete-valued target function h : IS 7→ V from a space of instances IS to a set of values V = {v1, . . . , vs} standing for the classes to assign. This should be adapted for the more complex context of DLs descriptions.

In its simplest setting, the k-NN algorithm approximates h for new instances xq on the ground of the value that h assumes for the training instances in the neighborhood of xq, i.e. the k closest instances to the new instance in terms of a dissimilarity measure. Precisely, it is assigned according to the value which is voted by the majority of instances in the neighborhood.

The problem with this formulation is that it does not take into account the (dis)similarity among instances, except when selecting the instances to be included in the neighborhood. Therefore a modified setting is generally adopted, based on weighting the vote according to the distance of the query instance from the training instances: k hˆ(xq) := argmax X wiδ(v, h(xi)) v∈V i=1 (1) where δ is the Kronecker symbol, a function that returns 1 in case of matching arguments and 0 otherwise, and, given a distance measure wi = 1/d(xi, xq) or wi = 1/d(xi, xq)2.

Note that the hypothesis function hˆ is defined only extensionally, therefore the kNN method does not return an intensional classification model (a function or a concept definition), it merely gives an answer for new query instances to be classified, employing the procedure above. It should be observed that a strong assumption of this setting is that it can be employed to assign the query instance to the class from a set of values which can be regarded as a set of pairwise disjoint concepts. This is a simplifying assumption that cannot be always valid. In our setting, indeed, an individual could be an instance of more than one concept.

Let us consider a value set V = {C1, . . . , Cs}, of possibly overlapping concepts Cj (1 ≤ j ≤ s) that may be assigned to a query instance xq. If the classes were disjoint as in the standard setting, the decision procedure defining the hypothesis function is the same as in Eq. (1), with the query instance assigned the single class of the majority of instances in the neighborhood. In the general case, when the pairwise disjointness of the concepts cannot be assumed, one can adopt another classification procedure, decomposing the multi-class problem into smaller binary classification problems (one per target concept). Therefore, a simple binary value set (V = {−1, +1}) may be employed. Then, for each concept, a hypothesis hˆj is computed, iteratively: v∈V

k hˆj (xq) := argmax X δ(v, hj (xi))

d(xq, xi)2 i=1 ∀j ∈ {1, . . . , s} (2) where each function hj (1 ≤ j ≤ s) simply indicates the occurrence (+1) or absence (−1) of the corresponding assertion in the ABox: Cj (xi) ∈ A. Alternately2, hj may return +1 when Cj (xi) can be inferred from the knowledge base K, and −1 otherwise.

The problem with non-explicitly disjoint concepts is also related to the CWA usually made in the knowledge discovery context. That is the reason for adapting the standard setting to cope both with the case of generally non-disjoint classes and with the OWA which is commonly made in the Semantic Web context.

To deal with the OWA, the absence of information on whether a certain training instance x belongs to the extension of concept Cj should not be interpreted negatively, as shown before. Rather, it should count as neutral information. Thus, another value set has to be adopted for the hj ’s, namely V = {−1, 0, +1}, where the three values denote, 2 For the sake of simplicity and efficiency, this case will not be considered in the following, since instance checking may be computationally expensive (see [ 1 ], Ch. 3). respectively, occurrence, absence and occurrence of the opposite assertion:  +1  hj (x) =

−1  0

Cj (x) ∈ A ¬Cj (x) ∈ A o.w.

Occurrence can be easily computed with a lookup in the ABox, therefore the overall complexity of the procedure depends on the number k |Ind(A)|, that is the number of times the distance measure is needed.

Note that, being based on a majority vote of the individuals in the neighborhood, this procedure is less error-prone in case of noise in the data (i.e. incorrect assertions in the ABox), therefore it may be able to give a correct classification even in case of (partially) inconsistent knowledge bases.

Again, a more complex procedure may be devised by substituting the notion of occurrence (absence) of assertions in (from) the ABox with the one of derivability (nonderivability) from the whole knowledge base, i.e. K ` Cj (x), K 6` Cj (x)∧K 6` ¬Cj (x) and K ` ¬Cj (x), respectively. Although this may exploit more information and turn out to be more accurate, it is also much more computationally expensive, since the simple lookup in the ABox must be replaced with a logical inference (instance checking). 5

Experiments

We present the outcomes of preliminary experiments carried out for testing the feasibility of the method illustrated in the previous section. In order to assess the appropriateness and validity of the method with the proposed similarity measure, we have applied it to a classification problem, using three datasets. Namely, we have chosen three different ontologies: the FSM ontology and the SURFACE-WATER-MODEL ontology from the Prote´ge´ library3, and a handcrafted FAMILY ontology.

FSM is an ontology describing finite state machines. It is made up of 20 concepts (both primitives and defined), some of them are declared to be disjoint, 10 object properties, 7 datatype properties, 37 distinct individual names. About half of the individuals are instances of only a single class and are not involved in any property, while the other half are involved in properties.

SURFACE-WATER-MODEL is an ontology describing water quality models. It is based on the Surface-water Models Information Clearinghouse (SMIC) of the US Geological Survey (USGS). Namely, it is an ontology of numerical models for surface water flow and water quality simulation, which are applicable to lakes, oceans, estuaries, etc. These models are classified according to their availability, domain, dimensions, and characteristic types. It is made up of 19 concepts (both primitives and defined) without any specification about disjointness, 9 object properties, 115 distinct individual names; each of them is an instance of a single class and only some of them are involved in object properties.

FAMILY is an handcrafted ontology describing family relationships. It is made up of 14 concepts (both primitives and defined), some of them are declared to be disjoint, 3 http://protege.stanford.edu/download/ontologies.html 5 object properties, 39 distinct individual names. Most of the individuals are instances of more than one concept, besides most of them are involved in more than one role. The ontology was intended to provide a more complex instance graph than in the other cases. Indeed, in the other ontologies there are assertions about instances only for some concepts (the others being defined intensionally), while in the FAMILY ontology every concept has at least one instance asserted in the ABox. The same happens for the role assertions; particularly there are some situations where roles assertions constitute a chain from an individual to another one, by means of other intermediate roles. So the FAMILY ontology likely describes a real-world state of affairs (the ABox having been modeled after a real family tree).

The proposed method was applied to each ontology, with the parameter k set to √|Ind(A)|, as recommended in the literature based on specific experiments; namely, for every ontology, all individuals are classified to be instances of one or more concepts of the considered ontology. Specifically, we consider all the individuals in the ontology and for each of them the MSC is computed, thus the MSC list represents the set of training examples. Each example is classified applying the k-NN method for DLs, adopting the leave-one-out cross validation procedure. It is straightforward to point out that two elements are fundamental for getting good results with the method: – the similarity measure has to be able to select really similar instances with respect to the example to be classified; – the examples in the training set have to be meaningful for the one to be classified. We intended to assess whether the method is able to classify instances correctly, i.e. assign the same concepts computed with instance checking. Additionally, it should also be able to induce by analogy new (previously unknown) class-membership assertions that cannot be logically inferred. Particularly, for each ontology and for each concept, three rates have been computed: omission error rate, commission error rate, induction rate. The omission error is related to completeness. It measures the amount of unlabelled individuals (hˆj (xq ) = 0) with respect to a certain concept (Cj ) while it was to be classified as an instance of that class (hj (xq ) = 1). The commission error is related to soundness. It measures the amount of individuals labelled as instances of the negation of the target concept (hˆj (xq ) = −1), while they belong to that concept (hj (xq ) = 1) or vice-versa. The induction rate measures the amount of individuals labelled that were found to belong to a concept or its negation (hˆj (xq ) = ±1), while this information is not logically derivable from the knowledge base (hj (xq ) = 0). Thus commission error (erroneous labelling) may be more harmful than omission error (no label assigned, also because of the OWA) for further inductive methods to be applied. A high induction rate means that the procedure was actually able to induce new assertions that are likely to be valid and can then be incorporated into the knowledge base.

By looking at Tab. 1 reporting the experimental outcomes, it is important to note that, for every ontology, the commission error was null. This means that the classifier has never made critical mistakes because no individual has been deemed as an instance of a concept while really it is an instance an disjoint class.

In particular, for the SURFACE-WATER-MODEL ontology, the predictive accuracy is 100% i.e. the omission error and induction rate are null. This means that for the SURFACE-WATER-MODEL ontology our classifier always assigns individuals to the correct concepts but it is also never able to induce new knowledge (indeed the induction rate is null). This is due to the fact that individuals in this ontology are all instances of the same concepts and roles, so computing their MSC, these are all very similar and so the amount of information they convey is very low.

For the same reasons, also for the FSM ontology, we have a maximal accuracy. However, differently from the previous ontology, the induction rate for the FSM ontology is not null. Since the induction rate represents assertions that are not logically deducible from the ontology and was induced by the classifier, these figures would be positive if this knowledge is correct. Particularly, in this case the increase of the induction rate has been due to the presence of some concepts that are declared to be mutually disjoint.

Results are different for the case of the FAMILY ontology, where the predictive accuracy is lower and there have been some omission errors. This is due to the way individuals are asserted as instances of the concepts. First of all, instances are more irregularly spread over the classes, that is they are instances of different concepts, which are sometimes disjoint. Specifically, there is a concentration of instances of some concepts. Hence the MSC approximations that were computed are very different, which reduces the possibility of matching significantly similar MSCs. Nevertheless our algorithm does not make any commission error and it is able to infer new knowledge.

Concluding, we have observed that the proposed method is able to induce new assertions in addition those that were already logically derivable from the knowledge base. Particularly, an increase in prediction accuracy was observed when the instances are homogeneously spread. Besides, the method confirmed its tolerance to noise as no commission error was observed. 6

Conclusions and Future Work

This paper explored the application of an instance-based learning method to relational representations such as DLs. A dissimilarity measure has been employed in the method that may be applied to predicting/suggesting missing information about a knowledge base individuals. The first experiments made showed that the method is effective, although its performance depends on the number (and distribution) of the available training instances. Besides, as expected, the procedure is robust to noise and never made commission errors in the experiments carried out.

Currently, we are investigating the application of statistical tests which are likely to augment the significance of the inductive conclusions drawn by the mere instance-based method. Besides, an instance-based method may be also suitable for the induction of missing values for (scalar or numeric) datatype properties of an individual as an estimate derived from the values of the datatypes for the surrounding individuals. The employed measure can be refined by introducing a weighting factor, useful for decreasing the impact of the similarity between nested sub-concepts in the descriptions on the determination of the overall value. Another natural extension may concern the definition of measures for more expressive DLs languages so to scale up the applicability of the method.

[1]

Baader ,

Calvanese ,

McGuinness ,

Nardi , and P. Patel-Schneider, editors. The Description Logic Handbook . Cambridge University Press, 2003 .

[2]

Baader and

Ku ¨ sters. Non-standard inferences in description logics: The story so far . In D. Gabbay,

S. S.

Goncharov , and M. Zakharyaschev, editors, Mathematical Problems from Applied Logic . New Logics for the XXIst Century , volume 4 of International Mathematical Series. Kluwer/Plenum Publishers, 2005 .

[3]

Bacchus . Lp, a logic for representing and reasoning with statistical knowledge . Computational Intelligence , 6 : 209 - 231 , 1990 .

[4]

Borgida ,

Walsh , and

Hirsh . Towards measuring similarity in description logics . In Working Notes of the International Description Logics Workshop , CEUR Workshop Proceedings, Edinburgh, UK, 2005 .

[5]

Cohen and

Hirsh . Learning the CLASSIC description logic . In P. Torasso,

Doyle , and E. Sandewall, editors, Proceedings of the 4th International Conference on the Principles of Knowledge Representation and Reasoning , pages 121 - 133 . Morgan Kaufmann, 1994 .

[6] C. d'Amato , N.

Fanizzi , and F.

Esposito . A dissimilarity measure for ALC concept descriptions . In Proceedings of the 21st Annual ACM Symposium of Applied Computing, SAC2006 , volume 2 , pages 1695 - 1699 , Dijon, France, 2006 . ACM.

[7] M. d'Aquin , J. Lieber , and

Napoli . Decentralized case-based reasoning for the Semantic Web . In Y. Gil,

Motta , E. Benjamins, and M. A. Musen, editors, Proceedings of the 4th International Semantic Web Conference, ISWC2005, number 3279 in LNCS , pages 142 - 155 . Springer, 2005 .

[8]

Emde and

Wettschereck . Relational instance-based learning . In L. Saitta, editor, Proceedings of the Thirteenth International Conference, ICML96 , pages 122 - 130 . Morgan Kaufmann, 1996 .

[9]

Esposito ,

Fanizzi ,

Iannone , I. Palmisano , and

Semeraro . Knowledge-intensive induction of terminologies from metadata . In F. van Harmelen , S. McIlraith , and D. Plexousakis, editors, ISWC2004, Proceedings of the 3rd International Semantic Web Conference , volume 3298 of LNCS , pages 441 - 455 . Springer, 2004 .

[10]

Haase , F. van Harmelen , Z.

Huang , H.

Stuckenschmidt , and Y.

Sure . A framework for handling inconsistency in changing ontologies . In Y. Gil,

Motta , E. Benjamins, and M. A. Musen, editors, Proceedings of the 4th International Semantic Web Conference, ISWC2005, number 3279 in LNCS , pages 353 - 367 , Galway, Ireland, November 2005 . Springer.

[11]

Hitzler and D. Vrandec˘ic´. Resolution-based approximate reasoning for OWL DL . In Y. Gil,

Motta , E. Benjamins, and M. A. Musen, editors, Proceedings of the 4th International Semantic Web Conference, ISWC2005, number 3279 in LNCS , pages 383 - 397 , Galway, Ireland, November 2005 . Springer.

[12]

Mitchell . Machine Learning. McGraw-Hill , 1997 .