A System for Learning GCI Axioms in Fuzzy Description Logics Francesca A. Lisi1 and Umberto Straccia2 1 Dipartimento di Informatica, Università degli Studi di Bari “Aldo Moro”, Italy francesca.lisi@uniba.it 2 ISTI - CNR, Pisa, Italy umberto.straccia@isti.cnr.it Abstract. Vagueness is inherent to several real world domains and is particularly pervading in those domains where entities could be better described in natural language. In order to deal with vague knowledge, several fuzzy extensions of DLs have been proposed. In this paper, we face the problem of supporting the evolution of DL ontologies under vague- ness. Here, we present a system for learning fuzzy GCI axioms from crisp assertions and discuss preliminary experimental results obtained in the tourism application domain. 1 Introduction Ontologies provide a major source of structured knowledge. However, it is well known that “classical” ontology languages are not appropriate to deal with vague knowledge, which is inherent to several real world domains and is particularly pervading in those domains where objects could be better described in natural language [21]. We recall for the inexpert reader that there has been a long- lasting misunderstanding in the literature of artificial intelligence and uncertainty modelling, regarding the role of probability/possibility theory and vague/fuzzy theory. A clarifying paper is [5]. Specifically, under uncertainty theory fall all those approaches in which statements are true or false to some probability or possibility (for example, “it will rain tomorrow”). That is, a statement is true or false in any world/interpretation, but we are “uncertain” about which world to consider as the right one, and thus we speak about, e.g., a probability distribution or a possibility distribution over the worlds. On the other hand, under fuzzy theory fall all those approaches in which statements (for example, “the hotel is cheap”) are true to some degree, which is taken from a truth space (usually [0, 1]). That is, an interpretation maps a statement to a truth degree, since we are unable to establish whether a statement is entirely true or false due to the involvement of vague concepts, such as “cheap” (we cannot always say whether a hotel is cheap or not). Here, we shall focus on fuzzy logic only. So far, several fuzzy extensions of DLs can be found in the literature (see the survey in [15]). Although a relatively important amount of work has been carried out in the last years concerning the use of fuzzy DLs as ontology languages, the problem of automatically managing the evolution of fuzzy ontologies still remains rela- tively unaddressed. In this paper, we describe a method, named Foil-DL, for the automated induction of fuzzy General Concept Inclusion (GCI) axioms. The method follows the machine learning approach known as Inductive Logic Pro- gramming (ILP). ILP was born at the intersection between Logic Programming and Concept Learning [17]. From Logic Programming (LP) it has borrowed the representation formalism for both data and hypotheses. From Concept Learning it has inherited the inferential mechanisms for induction. A distinguishing feature of ILP, also with respect to other forms of Concept Learning, is the use of prior domain knowledge available in the background during the induction process. ILP has been traditionally concerned with rule induction for classification purposes. Foil-DL adapts known results in ILP concerning crisp rules to the novel case of fuzzy DL GCIs. More precisely, it adapts the popular rule induction method Foil [18]. The paper is structured as follows. Section 2 is devoted to preliminaries on fuzzy DLs. Section 3 describes the learning problem, the solution strategy and the implementation of Foil-DL. Section 4 introduces a case study in the tourism application domain. Section 5 concludes the paper by discussing limits of the current work, related work and possible directions of future work. 1 1 1 1 0 0 0 0 a b c d x a b c x a b x a b x (a) (b) (c) (d) Fig. 1. (a) Trapezoidal function trz (a, b, c, d), (b) triangular function tri(a, b, c), (c) left shoulder function ls(a, b), and (d) right shoulder function rs(a, b). 2 Basics of Fuzzy DLs We recap here some basic definitions we rely on. We refer the reader to e.g. [15], for a more in depth presentation. Mathematical Fuzzy Logic. Fuzzy Logic is the logic of fuzzy sets. A fuzzy set R over a countable crisp set X is a function R : X → [0, 1]. The standard fuzzy set operations conform to (A ∩ B)(x) = min(A(x), B(x)), (A ∪ B)(x) = max(A(x), B(x)) and Ā(x) = 1 − A(x), P while the inclusion degree between A and x∈X (A∩B)(x) B is defined typically as deg(A, B) = P . The trapezoidal (Fig. 1 (a)), x∈X A(x) the triangular (Fig. 1 (b)), the L-function (left-shoulder function, Fig. 1 (c)), and the R-function (right-shoulder function, Fig. 1 (d)) are frequently used to specify membership functions of fuzzy sets. Although fuzzy sets have a greater expressive power than classical crisp sets, its usefulness depends critically on the capability to construct appropriate membership functions for various given concepts in dif- ferent contexts. The problem of constructing meaningful membership functions is a difficult one and we refer the interested reader to, e.g. [10, Chapter 10]. However, one easy and typically satisfactory method to define the membership functions is to uniformly partition the range of, e.g. salary values (bounded by a minimum and maximum value), into 5 or 7 fuzzy sets using either trapezoidal functions (e.g. as illustrated on the left in Figure 2), or using triangular functions (as illustrated on the right in Figure 2). The latter is the more used one, as it has less parameters and is also the approach we adopt. Fig. 2. Fuzzy sets over salaries using trapezoidal or triangular functions. In Mathematical Fuzzy Logic [7], the convention prescribing that a statement is either true or false is changed and is a matter of degree measured on an ordered scale that is no longer {0, 1}, but e.g. [0, 1]. This degree is called degree of truth of the logical statement φ in the interpretation I. For us, fuzzy statements have the form hφ, αi, where α ∈ (0, 1] and φ is a statement, encoding that the degree of truth of φ is greater or equal α. A fuzzy interpretation I maps each atomic statement pi into [0, 1] and is then extended inductively to all statements: I(φ ∧ ψ) = I(φ) ⊗ I(ψ), I(φ ∨ ψ) = I(φ) ⊕ I(ψ), I(φ → ψ) = I(φ) ⇒ I(ψ), I(¬φ) = I(φ), I(∃x.φ(x)) = supy∈∆I I(φ(y)), I(∀x.φ(x)) = inf y∈∆I I(φ(y)), where ∆I is the domain of I, and ⊗, ⊕, ⇒, and are so-called t-norms, t-conorms, implication functions, and negation functions, respectively, which extend the Boolean conjunction, disjunc- tion, implication, and negation, respectively, to the fuzzy case. One usually distinguishes three different logics, namely Lukasiewicz, Gödel, and Product logics [7]3 , whose combination functions are reported in Table 1. Table 1. Combination functions of various fuzzy logics. Lukasiewicz logic Gödel logic Product logic Zadeh logic α ⊗ β max(α + β − 1, 0) min(α, β) α·β min(α, β) α⊕β min(α + β, 1) ( max(α, β) α+β−α·β max(α, β) 1 if α ≤ β α ⇒ β min(1 − α + β, 1) min(1, β/α) max(1 − α, β) β otherwise ( ( 1 if α = 0 1 if α = 0 α 1−α 1−α 0 otherwise 0 otherwise Note that the operators for Zadeh logic, namely α ⊗ β = min(α, β), α ⊕ β = max(α, β), α = 1 − α and α ⇒ β = max(1 − α, β), can be expressed in Lukasiewicz logic. More precisely, min(α, β) = α ⊗l (α ⇒l β), max(α, β) = 1 − min(1 − α, 1 − β). Furthermore, the implication α ⇒kd β = max(1 − α, b) is called Kleene-Dienes implication (denoted ⇒kd ), while Zadeh implication (denoted ⇒z ) is the implication α ⇒z β = 1 if α ≤ β; 0 otherwise. 3 Any other continuos t-norm can be obtained as a combination of them. An r-implication is an implication function obtained as the residuum of a continuous t-norm ⊗4 , i.e.α ⇒ β = max{γ | α ⊗ γ ≤ β}. Note also, that given an r-implication ⇒r , we may also define its related negation r α by means of α ⇒r 0 for every α ∈ [0, 1]. The notions of satisfiability and logical consequence are defined in the stan- dard way, where a fuzzy interpretation I satisfies a fuzzy statement hφ, αi or I is a model of hφ, αi, denoted as I |= hφ, αi, iff I(φ) ≥ α. Fuzzy ALC(D) basics. We recap here the fuzzy variant of the DL ALC(D) [22]. A fuzzy concrete domain or fuzzy datatype theory D = h∆D , · D i consists of a datatype domain ∆D and a mapping · D that assigns to each data value an element of ∆D , and to every n-ary datatype predicate d an n-ary fuzzy relation over ∆D . We will restrict to unary datatypes as usual in fuzzy DLs. Therefore, · D maps indeed each datatype predicate into a function from ∆D to [0, 1]. Typical examples of datatype predicates d are the well known membership functions d := ls(a, b) | rs(a, b) | tri(a, b, c) | trz(a, b, c, d) | ≥v | ≤v | =v , where e.g. ls(a, b) is the left-shoulder membership function and ≥v corresponds to the crisp set of data values that are greater or equal than the value v. Now, let A be a set of concept names (also called atomic concepts), R be a set of role names. Each role is either an object property or a datatype property. The set of concepts are built from concept names A using connectives and quantification constructs over object properties R and datatype properties T , as described by the following syntactic rules: C → > | ⊥ | A | C1 u C2 | C1 t C2 | ¬C | C1 → C2 | ∃R.C | ∀R.C | ∃T.d | ∀T.d . An ABox A consists of a finite set of assertion axioms. An assertion axiom is an expression of the form ha:C, αi (concept assertion, a is an instance of concept C to degree at least α) or of the form h(a1 , a2 ):R, αi (role assertion, (a1 , a2 ) is an instance of role R to degree at least α), where a, a1 , a2 are individual names, C is a concept, R is a role name and α ∈ (0, 1] is a truth value. A Terminological Box or TBox T is a finite set of General Concept Inclusion (GCI) axioms, where a GCI is of the form hC1 v C2 , αi (C1 is a sub-concept of C2 to degree at least α), where Ci is a concept and α ∈ (0, 1]. We may omit the truth degree α of an axiom; in this case α = 1 is assumed. A Knowledge Base (KB) is a pair K = hT , Ai. Concerning the semantics, let us fix a fuzzy logic. Unlike classical DLs in which an interpretation I maps e.g. a concept C into a set of individuals C I ⊆ ∆I , i.e. I maps C into a function C I : ∆I → {0, 1} (either an individual belongs to the extension of C or does not belong to it), in fuzzy DLs, I maps C into a function C I : ∆I → [0, 1] and, thus, an individual belongs to the extension of C to some degree in [0, 1], i.e. C I is a fuzzy set. Specifically, a fuzzy interpretation is a pair I = (∆I , ·I ) consisting of a nonempty (crisp) set ∆I (the domain) and of a fuzzy interpretation function ·I that assigns: (i) to each atomic concept A a function AI : ∆I → [0, 1]; (ii) to each object property R a function RI : ∆I × ∆I → [0, 1]; (iii) to each data type property T a function T I : ∆I × ∆D → [0, 1]; (iv) to each individual a an element aI ∈ ∆I ; and (v) to each concrete value v an 4 Note that Lukasiewicz, Gödel and Product implications are r-implications, while Kleene-Dienes implication is not. element v I ∈ ∆D . Now, a fuzzy interpretation function is extended to concepts as specified below (where x ∈ ∆I ): ⊥I (x) = 0, >I (x) = 1, (C u D)I (x) = C I (x) ⊗ DI (x), (C t D)I (x) = C I (x) ⊕ DI (x), (¬C)I (x) = C I (x), (C → D)I (x) = C I (x) ⇒ DI (x), (∀R.C)I (x) = inf y∈∆I {RI (x, y) ⇒ C I (y)}, (∃R.C)I (x) = supy∈∆I {RI (x, y) ⊗ C I (y)}, (∀T.d)I (x) = inf y∈∆D {T I (x, y) ⇒ dD (y)}, (∃T.d)I (x) = supy∈∆D {T I (x, y) ⊗ dD (y)} . Hence, for every concept C we get a function C I : ∆I → [0, 1]. The satisfiability of axioms is then defined by the following conditions: (i) I satisfies an axiom ha:C, αi if C I (aI ) ≥ α; (ii) I satisfies an axiom h(a, b):R, αi I if RI (aI , bI ) ≥ α; (iii) I satisfies an axiom hC v D, αi if (C v D) ≥ α where5 I (C v D) = inf x∈∆I {C I (x) ⇒ DI (x)}. I is a model of K = hA, T i iff I satisfies each axiom in K. We say that K entails axiom τ , denoted K |= τ , if any model of K satisfies τ . The best entailment degree of τ of the form C v D, a:C or (a, b):R, denoted bed(K, τ ), is defined as bed(K, τ ) = sup{α | K |= hτ, αi}. Fig. 3. Fuzzy concepts derived from the datatype property hasPrice. Example 1. Let us consider the following axiom h∃hasP rice.High v GoodHotel, 0.569i , where hasP rice is a datatype property whose values are measured in euros and the price concrete domain has been automatically fuzzified as illustrated in Figure 3. Now, it can be verified that for hotel verdi, whose room price is 105 euro, i.e. we have the assertion verdi:∃hasP rice. =105 in the KB, we infer under Product logic that6 K |= hverdi:GoodHotel, 0.18i . 3 Learning fuzzy DL axioms with Foil-DL The problem statement. The problem considered in this paper concerns the au- tomated induction of fuzzy DL GCI axioms providing a sufficient condition for a given atomic concept H. It can be cast as a rule learning problem, provided that positive and negative examples of H are available. This problem can be formalized as follows. Given: – a consistent DL KB K = hT , Ai (the background theory); 5 However, note that under Zadeh logic v is interpreted as ⇒z and not as ⇒kd . 6 0.18 = 0.318 · 0.569, where 0.318 = tri(90, 112, 136)(105). – an atomic concept H (the target concept); – a set E = E + ∪ E − of crisp concept assertions labelled as either positive or negative examples for H (the training set); – a set LH of fuzzy GCIs (the language of hypotheses) the goal is to find a set H ⊂ LH (a hypothesis) such that: Completeness. ∀e ∈ E + , K ∪ H |= e, and Consistency. ∀e ∈ E − , K ∪ H 6|= e. Here we assume that K ∩ E = ∅. Also, the language LH is given implicitly by means of syntactic restrictions over a given alphabet. In particular, the alphabet underlying LH is a subset of the alphabet for the language LK of the background theory. However, LH differs from LK as for the form of axioms. Two further restrictions hold naturally. One is that K 6|= E + since, in such a case, H would not be necessary to explain E + . The other is that K ∪ H 6|= ⊥, which means that K ∪ H is a consistent theory, i.e. has a model. An axiom φ ∈ LH covers an example e ∈ E iff K ∪ {φ} |= e. The training examples. Given the target concept H, the training set E consists of concept assertions of the form H(a) (1) where a is an individual occurring in K. In this paper, the training examples are crisp. Also, E is split into E + and E − . Note that, under OWA, E − consists of all those individuals which can be proved to be instance of ¬H. However, E − can be deduced under CWA by collecting the individuals which cannot be proved to be instance of H. The language of hypotheses. Given the target concept H, the hypotheses to be induced are fuzzy GCIs of the form BvH , (2) where the left-hand side is defined according to the following syntax B −→ > | A | ∃R.B | ∃T.d | B1 u B2 . (3) Note that the language LH generated by this EL(D) syntax is potentially infinite due, e.g., to the nesting of existential restrictions yielding to complex concept expressions such as ∃R1 .(∃R2 . . . .(∃Rn .(C)) . . .). The language can be made finite by imposing further restrictions on the generation process such as the maximal number of conjuncts and the depth of existential nestings allowed in the left-hand side. Also, note that the learnable GCIs do not have an explicit truth degree. However, even if K is a crisp DL KB, the possible occurrence of fuzzy concrete domains in expressions of the form ∃T.d in the left-hand side of a fuzzy GCI of the form (2) may imply both that bed(K, B v H) 6∈ {0, 1} and bed(K, a:B) 6∈ {0, 1}. Furthermore, as we shall see later on, once we have learned a fuzzy GCI B v H, we attach to it a truth degree that is obtained by computing the confidence degree by means of the cf function (see Eq (5)), which may be seen as the fuzzy set inclusion degree (see Section 2) between the fuzzy set represented by concept function Learn-Sets-of-Axioms(K, H, E + , E − , LH ): H begin 1. H := ∅; 2. while E + 6= ∅ do 3. φ := Learn-One-Axiom(K, H, E + , E − , LH ); 4. H := H ∪ {φ}; 5. Eφ+ := {e ∈ E + |K ∪ φ |= e}; 6. E + := E + \ Eφ+ ; 7. endwhile 8. return H end Fig. 4. Foil-DL: Learning a set of GCI axioms. B and the (crisp) set represented by concept H. Finally, note that the syntactic restrictions of LH allow for a straightforward translation of the inducible axioms into rules of the kind “if x is a C1 and . . . and x is a Cn then x is an H”, which corresponds to the usual pattern in fuzzy rule induction (in our case, B v H is seen as a rule “if B then H”) . The solution strategy. The solution proposed for the learning problem defined in Section 3 is inspired by Foil. Foil is a popular ILP algorithm for learning sets of rules which performs a greedy search in order to maximise a gain function [18]. In Foil-DL, the learning strategy of Foil (i.e. the so-called sequential cover- ing approach) is kept. The function Learn-Sets-of-Axioms (reported in Figure 4) carries on inducing axioms until all positive examples are covered. When an axiom is induced, the positive examples covered by the axiom are removed from E. In order to induce an axiom, the function Learn-One-Axiom (reported in Figure 5) starts with the most general axiom (i.e. > v H) and specializes it by applying the refinement rules implemented in the function Refine (step 7.). The iterated specialization of the axiom continues until the axiom does not cover any negative example and its confidence degree is greater than a fixed threshold (θ). The confidence degree of axioms being generated with Refine allows for evaluating the information gain obtained on each refinement step by calling the function Gain (step 9.). Of course, we need to adapt the functions Refine and Gain developed for Foil to Foil-DL, which we address in the next two subsections. The refinement operator. The function Refine implements a specialization op- erator, i.e. an operator for traversing the hypotheses space top down, with the following refinement rules: AddA adds an atomic concept A Add∃R.> adds a complex concept ∃R.> by existential role restriction Add∃T.d adds a complex concept ∃T.d by existential role restriction SubstA replaces an atomic concept A with another atomic concept A0 s.t. A0 v A function Learn-One-Axiom(K, H, E + , E − , LH ): φ begin 1. B := >; 2. φ := B v H; 3. Eφ− := E − ; 4. while cf (φ) < θ or Eφ− 6= ∅ do 5. Bbest := B; 6. maxgain := 0; 7. Φ := Refine(φ, LH ) 8. foreach φ0 ∈ Φ do 9. gain := Gain(φ0 , φ); 10. if gain ≥ maxgain then 11. maxgain := gain; 12. Bbest := B 0 ; 13. endif 14. endforeach 15. φ := Bbest v H; 16. Eφ− := {e ∈ E − |K ∪ φ |= e}; 17. endwhile 18. return φ end Fig. 5. Foil-DL: Learning one GCI axiom. At each refinement step (i.e. at each call of Refine), the rules are applied first to the left-hand side of the axiom being specialized and then recursively to the range of all the conjuncts defined with existential role restriction. For example, let us consider that H is the target concept, A, A0 , B, R, R0 , T are concepts and roles occurring in K, and A0 v A holds in K. Under these assumptions, the axiom ∃R.B v H is specialized into the following axioms: – A u ∃R.B v H, B u ∃R.B v H, A0 u ∃R.B v H; – ∃R0 .> u ∃R.B v H, ∃T.d u ∃R.B v H; – ∃R.(B u A) v H, ∃R.(B u A0 ) v H; – ∃R.(B u ∃R.>) v H, ∃R.(B u ∃R0 .>) v H, ∃R.(B u ∃T.d) v H. Note that a specialization operator reduces the number of examples covered by a GCI. The aim of a refinement is to reduce the number of covered negative examples, while still keeping some covered positive examples. The heuristic. The function Gain implements the criterion for selecting the best candidate at each refinement step according to the following formula: Gain(φ0 , φ) = p ∗ (log2 (cf (φ0 )) − log2 (cf (φ))) , (4) where p is the number of positive examples covered by the axiom φ that are still covered by φ0 . Thus, the gain is positive iff φ0 is more informative in the sense of Shannon’s information theory (i.e. iff the confidence degree increases). If there are some refinements, which increase the confidence degree, the function Gain tends to favour those that offer the best compromise between the confidence degree and the number of examples covered. The confidence degree of an axiom φ is computed as a sort of fuzzy set inclu- sion degree (see Section 2) between the fuzzy set represented by concept B and the (crisp) set represented by concept H and is as follows: P a∈P bed(K, a:B) cf (φ) = cf (B v H) = (5) |D| where – D is the set of individuals occurring in Eφ+ ∪ Eφ− such that bed(K, a:B) > 0. – P is the set of individuals occurring in Eφ+ such that bed(K, a:B) > 0 7 . We remind the reader that bed(K, a:B) denotes the best entailment degree of the concept assertion a:B w.r.t. K (see Section 2). Also, note that, as pointed out in Section 3, the possible occurrence of expressions of the form ∃T.d in B may imply that bed(K, a:B) 6∈ {0, 1}. The implementation. A variant of Foil-DL has been implemented in the fuzzyDL- Learner 8 system. Notably, fuzzy GCIs in LH are interpreted under Gödel se- mantics, while the background theory and the training set are represented in crisp DLs. As K is crisp, we have used a classical DL reasoner, together with a specialised code, to compute the confidence degree of fuzzy GCI. For instance, bed(K, a:∃T.d), can be computed from the derived T -fillers v of a, and applying v to the fuzzy membership function of d. The examples covered by a GCI, and, thus, the entailment tests in Learn-Sets-of-Axioms and Learn-One-Axiom, have been determined in a similar way. The implementation of Foil-DL features several optimizations wrt the solu- tion strategy presented in Section 3. Notably, the search in the hypothesis space can be optimized by enabling a backtracking mode. This option allows to over- come one of the main limits of Foil, i.e. the sequential covering strategy. Because it performs a greedy search, formulating a sequence of rules without backtrack- ing, Foil does not guarantee to find the smallest or best set of rules that explain the training examples. Also, learning rules one by one could lead to less and less interesting rules. To reduce the risk of a suboptimal choice at any search step, the greedy search can be replaced in Foil-DL by a beam search which maintains a list of k best candidates at each step instead of a single best candidate. Additionally, to guarantee termination, we provide two parameters to limit the search space: namely, the maximal number of conjuncts and the maximal depth of existential nestings allowed in a fuzzy GCI. In fact, the computation may end without covering all positive examples. Two graphical user interfaces are available for Foil-DL: One is a stand-alone Java application, the other is a tab widget plug-in for the ontology editor Protégé9 (release 4.2). The latter is shown in Figure 6. 7 Note that for individuals a ∈ P , K |= a:H holds and, thus, bed(K, a:B u H) = bed(K, a:B). 8 http://straccia.info/software/FuzzyDL-Learner 9 http://protege.stanford.edu/ Fig. 6. User interface of Foil-DL as tab widget plug-in for Protégé. 4 A Case Study in the Tourism Domain In order to test the validity and the usefulness of Foil-DL on a real-world appli- cation, we have considered a case study in the tourism domain. More precisely, we have focused on the task of hotel finding because it can be reformulated as a classification problem solvable with Foil-like algorithms. To the purpose, we have built an ontology, named Hotel.owl10 , which models the meaningful en- tities of the domain in hand (see Figure 6, further details can be found in the appendix). It has the DL expressivity of ALCHOF(D) and consists of 8000 ax- ioms, 74 classes, 4 object properties, and 2 data properties. The ontology has been populated with 1504 individuals concerning tourism in the city of Pisa. In particular, data about hotels as well as graded hotel judgements from users have been automatically extracted from the web site of Trip Advisor11 whereas the distances of hotels from sites of interest have been computed by means of Google Maps12 API. Then, one may set a learning problem with the class Good Hotel as target concept and ask Foil-DL to induce axioms - such as the graded GCI reported in Example 1 - from positive and negative examples of Good Hotel, which allow for discriminating good hotels from bad ones. For more details on this case study we refer the reader to the Appendix. 10 http://gaia.isti.cnr.it/~straccia/software/FOIL-DL/download/FOIL-DL/ examples/Hotel/Hotel.owl 11 http://www.tripadvisor.com 12 http://maps.google.com/ 5 Conclusions and future work In this paper, we have described a method, named Foil-DL, which solves the problem of automatically inducing fuzzy EL(D) GCI axioms from crisp DL as- sertions. The method extends Foil, a popular ILP algorithm for learning sets of crisp rules, in a twofold direction: from crisp to fuzzy and from rules to GCIs. Notably, fuzziness is captured by the definition of confidence degree reported in (5). Here, the different truth degrees of the variable bindings with which an axiom covers a positive example are taken into account. Also, thanks to the variable-free syntax of DLs, the learnable GCIs are highly understandable by humans, e.g. the axiom reported in Example 1 translates into the natural language sentence “a hotel having a high price is a good hotel.” Related Foil-like algorithms for the fuzzy case are reported in the literature [4,19,20] but they are not conceived for DL ontologies. In the context of DL ontologies, DL-Foil adapts Foil to learn crisp OWL DL equivalence axioms under OWA [6]. DL-Learner supports the same learning problem as in DL-Foil but implements algorithms that are not based on Foil [8]. Likewise, Lehmann and Haase [12] propose a refinement operator for concept learning in EL (implemented in the ELTL algorithm within DL-Learner) whereas Chitsaz et al. [3] combine a refinement operator for EL++ with a reinforcement learning algorithm. However, both works deal only with crip ontologies. Very recently, an extension of DL- Learner with some of the most up-to-date fuzzy ontology tools has been proposed [9]. Notably, it can learn fuzzy OWL DL equivalence axioms from FuzzyOWL 2 ontologies13 by interfacing the fuzzyDL reasoner [1]. However, it has been tested only on a toy ontology14 with crisp training examples. Last, the work reported in [11] is based on an ad-hoc translation of fuzzy Lukasiewicz ALC DL constructs into LP and then uses a conventional ILP method to lean rules. The method is not sound as it has been recently shown that the mapping from fuzzy DLs to LP is incomplete [16] and entailment in Lukasiewicz ALC is undecidable [2]. A previous investigation of the problem considered in the present paper can be found in [13,14]. However, the resulting method (named SoftFoil) provides a different solution from Foil-DL as for the knowledge representation language, the confidence degree computation, the refinement operator and the heuristic. Finally, unlike Foil-DL, SoftFoil has not been implemented. For the future, we intend to conduct a more extensive empirical validation of the system which could suggest directions of improvement of the method towards more effective formulations of, e.g., the information gain function and the refine- ment operator as well as of the search strategy and the halt conditions employed in Learn-One-Axiom, the choice of the t-norm, so as to investigate other fuzzy GCI learning algorithms based on e.g. fuzzy Decision Trees [23]. Eventually, we will investigate about learning fuzzy GCI axioms from FuzzyOWL 2 ontologies, by coupling the learning algorithm to the fuzzyDL reasoner, instead of learning from crisp OWL 2 data by using a classical DL reasoner. Only then, a direct comparative evaluation with DL-Learner will be possible. 13 http://www.straccia.info/software/FuzzyOWL 14 http://wiki.aksw.org/Projects/DLLearner/fuzzyTrains A Evaluation Data A.1 The ontology Hotel.owl According to Protégé, the ontology metrics for Hotel.owl are the following: – metrics: • axioms: 8000; • logical axiom count: 6309; • class count: 74 • object property count: 4; • data property count: 2; • individual count: 1504; • DL expressivity: ALCHOF(D). – class axioms: • subClassOf axioms count: 71; • equivalentClasses axioms count: 6; • disjointClasses axioms count: 1; • hidden GCI count:6. – object property axioms: • objectPropertyDomain axioms count: 4; • objectPropertyRange axioms count:4. – data property axioms: • functionalDataProperty axioms count: 2; • dataPropertyDomain axioms count: 2; • dataPropertyRange axioms count: 2. – individual axioms: • classAssertion axioms count: 1866; • objectPropertyAssertion axioms count: 2876; • dataPropertyAssertion axioms count: 1475. Classes. The classes forming the terminology of Hotel.owl are shown in Fig- ure 7, 8 and 9. The main concepts model the sites of interest (class Site), the places where sites are located (class Place), the services offered by hotels (class Amenity), the ranks assigned to hotels in the star classification (class Rank), and the distances between sites (class Distance). The classes Good Hotel and Not Good Hotel represent the target concept and its complement. Sites of in- terest in the tourism application domain include accommodations such as hotels (class Accomodation), attractions such as parks (class Attraction), stations of transportation means such as airports (class Station), and civic facilities such as hospitals (class Civic). Fig. 7. Classes modeling the main entities in Hotel.owl. Object and data properties. The object properties in Hotel.owl are: – hasAmenity (with domain Hotel and range Amenity) models the relationship between hotels and the services offered; – hasRank (with domain Hotel and range Rank) models the rank assigned to a hotel; – hasDistance (with domain Site and range Distance) models the relation- ship between a site and a distance; – isDistanceFor (with domain Distance and range Site) models the rela- tionship between a distance and the two sites. The data properties are: – hasPrice (with domain Accommodation and range integer) is the average price of a room; – hasValue (with domain Distance and range double) is the numerical value of the distance between two sites. Note that the numerical value of the distance between two sites would be better modeled as attribute of a ternary relation. However, only binary relations can be represented in OWL. The concept Distance and the properties hasDistance, isDistanceFor and hasValue are necessary to simulate a ternary relation by means of binary relations. Individuals. The individuals occurring in Hotel.owl refer to the case of Pisa. In particular, 59 instances of Hotel have been extracted from TripAdvisor. In- formation about the rank, the amenities and the average room price has been added in the ontology for each of these instances. Out of the 59 hotels, 12 with a higher percentage of positive feedback have been classified as instances of Good Hotel whereas 11 with a lower percentage have been classified as instances of Not Good Hotel. Also, further 24 instances of Site have been created and dis- tributed among the classes under Attraction, Civic and Station. Finally, 1416 distances (instances of Distance) between the accommodations and the sites of interest have been measured in km and computed by means of Google Maps API. Fig. 8. Classes modeling hotel amenities in Hotel.owl. Fig. 9. Classes modeling sites in Hotel.owl. Two further remarks concern the modeling of the instance level of Hotel.owl. First, no instance of the class Amenity has been created because it is not nec- essary for our purposes. However, hotel amenities can be implicitely declared. For instance, the fact that a hotel, say hotel 10, offers a laundry service is modeled by means of an axiom stating that hotel 10 is instance of the class ∃hasAmenity.Laundry. Second, ranks are modeled as individuals (1 Star, . . . , 5 Stars). Further classes have been created in order to classify hotels according to the star ranking system. For instance, 3-star hotels are modeled with the class Hotel 3 Stars ≡ ∃hasRank.{3 stars} defined by means of an equivalence ax- iom. This modeling choice allows to overcome the limit of Foil-DL in dealing directly with expressions like ∃hasRank.{3 stars}. A.2 The experiments on Hotel.owl First trial. The first experiment conducted by running Foil-DL on Hotel.owl is summarized in the log reported in Figure 10. Here, the target concept is Good Hotel and the search for hypotheses is conducted under OWA by using Pellet as a DL reasoner. All the classes, object properties and data properties oc- curring in Hotel.owl are considered to be part of the alphabet of the language of hypotheses. However, syntactic restrictions are imposed on the form of the learn- able GCI axioms. More precisely, conjunctions can have at most 5 conjuncts and at most 2 levels of nesting are allowed in existential role restrictions. The mem- bership functions for fuzzy concepts derived from the data properties hasPrice and hasValue in this trial are shown in Figure 3 and Figure 11(a), respectively. The results obtained for this configuration of Foil-DL suggest the existence of user profiles, e.g. families and disabled people. Note that no axiom has been in- duced which encompasses knowledge about the distance of the accommodation from sites of interest. Second trial. In the second experiment, the configuration of Foil-DL remains unchanged except for the alphabet underlying the language of hypotheses and the definition of membership functions for the fuzzy concepts (see log reported in Figure 12). More precisely, the use of the object property hasAmenity is forbid- den. Also, the fuzzification of the data property hasValue is more reasonable for a foot distance. Here, a very low distance does not exceed 900 meters, an average distance is about 1500 meters, and so on, as illustrated in Figure 11(b). The axioms learned by Foil-DL suggest that closeness to stations of transportation means is a desirable feature when choosing a hotel. References 1. Bobillo, F., Straccia, U.: fuzzyDL: An expressive fuzzy description logic reasoner. In: 2008 International Conference on Fuzzy Systems (FUZZ-08). pp. 923–930. IEEE Computer Society (2008) 2. Cerami, M., Straccia, U.: On the (un)decidability of fuzzy description logics under Lukasiewicz t-norm. Information Sciences 227, 1–21 (2013) ---------------------------------------------------- http://www.semanticweb.org/ontologies/Hotel.owl Parameters: - ontology: .\examples\Hotel\Hotel.owl - concept: Good_Hotel - debug_mode: true - cwa_mode: false - direct_subclasses: false - backtrack_mode: false - max_conjuncts: 5 - max_depth: 2 - theta: 0.0 - reasoner: PELLET - skip_classes: [] - skip_properties: [] - skip_data_properties: [] FuzzyConcepts: - hasPrice_veryhigh: hasPrice, rightShoulder(112,136) - hasPrice_low: hasPrice, triangular(45,68,90) - hasPrice_fair: hasPrice, triangular(68,90,112) - hasPrice_high: hasPrice, triangular(90,112,136) - hasPrice_verylow: hasPrice, leftShoulder(45,68) - hasValue_low: hasValue, triangular(0.035,4.545,9.055) - hasValue_veryhigh: hasValue, rightShoulder(13.565,18.075) - hasValue_fair: hasValue, triangular(4.545,9.055,13.565) - hasValue_high: hasValue, triangular(9.055,13.565,18.075) - hasValue_verylow: hasValue, leftShoulder(0.035,4.545) Confidence Axiom 1,000 Hostel subclass of Good_Hotel 1,000 hasPrice_veryhigh subclass of Good_Hotel 0,569 hasPrice_high subclass of Good_Hotel 0,286 hasAmenity some (24h_Reception) and hasAmenity some (Disabled_Facilities) and hasPrice_low subclass of Good_Hotel 0,282 hasAmenity some (Babysitting) and hasRank some (Rank) and hasPrice_fair subclass of Good_Hotel 0,200 Hotel_1_Star subclass of Good_Hotel Running time 76 sec and 786 msec of which: 287 msec for subclasses retrieval 61 sec for individuals retrieval 14 sec for instance checking 1 sec for the algorithm running ---------------------------------------------------- Fig. 10. Log of the first trial of Foil-DL on Hotel.owl (target concept: Good Hotel). (a) (b) Fig. 11. Membership functions for fuzzy concepts derived from the data property hasValue in (a) the first trial and (b) the second trial of Foil-DL on Hotel.owl. ---------------------------------------------------- http://www.semanticweb.org/ontologies/Hotel.owl Parameters: - ontology: .\examples\Hotel\Hotel.owl - concept: Good_Hotel - debug_mode: true - cwa_mode: false - direct_subclasses: false - backtrack_mode: false - max_conjuncts: 5 - max_depth: 2 - theta: 0.0 - reasoner: PELLET - skip_classes: [] - skip_properties: [hasAmenity] - skip_data_properties: [] FuzzyConcepts: - hasPrice_veryhigh: hasPrice, rightShoulder(112,136) - hasPrice_low: hasPrice, triangular(45,68,90) - hasPrice_fair: hasPrice, triangular(68,90,112) - hasPrice_high: hasPrice, triangular(90,112,136) - hasPrice_verylow: hasPrice, leftShoulder(45,68) - hasValue_low: hasValue, triangular(0.5,0.9,1.5) - hasValue_fair: hasValue, triangular(0.9,1.5,3.0) - hasValue_high: hasValue, triangular(1.5,3.0,4.0) - hasValue_veryhigh: hasValue, rightShoulder(3.0,4.0) - hasValue_verylow: hasValue, leftShoulder(0.5,0.9) Confidence Axiom 1,000 Hostel subclass of Good_Hotel 1,000 hasPrice_veryhigh subclass of Good_Hotel 0,739 hasDistance some (isDistanceFor some (Bus_Station) and hasValue_low) and hasDistance some (isDistanceFor some (Town_Hall) and hasValue_fair) and hasRank some (Rank) and hasPrice_verylow subclass of Good_Hotel 0,569 hasPrice_high subclass of Good_Hotel 0,289 Hotel_3_Stars and hasDistance some (isDistanceFor some (Train_Station) and hasValue_verylow) and hasPrice_fair subclass of Good_Hotel 0,198 Hotel_4_Stars and hasDistance some (isDistanceFor some (Square) and hasValue_high) and hasRank some (Rank) and hasPrice_fair subclass of Good_Hotel Running time 655 sec and 905 msec of which: 573 msec for subclasses retrieval 226 sec for individuals retrieval 424 sec for instance checking 4 sec for the algorithm running ---------------------------------------------------- Fig. 12. Log of the second trial of Foil-DL on Hotel.owl (target concept: Good Hotel). 3. Chitsaz, M., Wang, K., Blumenstein, M., Qi, G.: Concept learning for EL++ by refinement and reinforcement. In: Anthony, P., Ishizuka, M., Lukose, D. (eds.) PRI- CAI 2012: Trends in Artificial Intelligence - 12th Pacific Rim International Confer- ence on Artificial Intelligence, Kuching, Malaysia, September 3-7, 2012. Proceed- ings. pp. 15–26. Lecture Notes in Artificial Intelligence, Springer-Verlag (2012) 4. Drobics, M., Bodenhofer, U., Klement, E.P.: FS-FOIL: an inductive learning method for extracting interpretable fuzzy descriptions. Int. J. Approximate Rea- soning 32(2-3), 131–152 (2003) 5. Dubois, D., Prade, H.: Possibility theory, probability theory and multiple-valued logics: A clarification. Annals of Mathematics and Artificial Intelligence 32(1-4), 35–66 (2001) 6. Fanizzi, N., d’Amato, C., Esposito, F.: DL-FOIL concept learning in description logics. In: Zelezný, F., Lavrač, N. (eds.) Inductive Logic Programming, 18th Inter- national Conference, ILP 2008, Prague, Czech Republic, September 10-12, 2008, Proceedings. Lecture Notes in Computer Science, vol. 5194, pp. 107–121. Springer (2008) 7. Hájek, P.: Metamathematics of Fuzzy Logic. Kluwer (1998) 8. Hellmann, S., Lehmann, J., Auer, S.: Learning of OWL Class Descriptions on Very Large Knowledge Bases. International Journal on Semantic Web and Information Systems 5(2), 25–48 (2009) 9. Iglesias, J., Lehmann, J.: Towards integrating fuzzy logic capabilities into an ontology-based inductive logic programming framework. In: Proc. of the 11th Int. Conf. on Intelligent Systems Design and Applications. IEEE Press (2011) 10. Klir, G.J., Yuan, B.: Fuzzy sets and fuzzy logic: theory and applications. Prentice- Hall, Inc., Upper Saddle River, NJ, USA (1995) 11. Konstantopoulos, S., Charalambidis, A.: Formulating description logic learning as an inductive logic programming task. In: Proc. of the 19th IEEE Int. Conf. on Fuzzy Systems. pp. 1–7. IEEE Press (2010) 12. Lehmann, J., Haase, C.: Ideal Downward Refinement in the EL Description Logic. In: De Raedt, L. (ed.) Inductive Logic Programming, 19th International Conference, ILP 2009, Leuven, Belgium, July 02-04, 2009. Revised Papers. Lecture Notes in Computer Science, vol. 5989, pp. 73–87 (2010) 13. Lisi, F.A., Straccia, U.: Towards learning fuzzy DL inclusion axioms. In: Fanelli, A.M., Pedrycz, W., Petrosino, A. (eds.) Fuzzy Logic and Applications - 9th In- ternational Workshop, WILF 2011, Trani, Italy, August 29-31,2011. Proceedings. Lecture Notes in Computer Science, vol. 6857, pp. 58–66. Springer (2011) 14. Lisi, F.A., Straccia, U.: A logic-based computational method for the automated induction of fuzzy ontology axioms. Fundamenta Informaticae 124, 1–17 (2013) 15. Lukasiewicz, T., Straccia, U.: Managing Uncertainty and Vagueness in Description Logics for the Semantic Web. Journal of Web Semantics 6, 291–308 (2008) 16. Motik, B., Rosati, R.: A faithful integration of description logics with logic pro- gramming. In: Veloso, M. (ed.) IJCAI 2007, Proc. of the 20th Int. Joint Conf. on Artificial Intelligence. pp. 477–482 (2007) 17. Nienhuys-Cheng, S.H., de Wolf, R.: Foundations of Inductive Logic Programming, Lecture Notes in Artificial Intelligence, vol. 1228. Springer (1997) 18. Quinlan, J.R.: Learning logical definitions from relations. Machine Learning 5, 239– 266 (1990) 19. Serrurier, M., Prade, H.: Improving expressivity of inductive logic programming by learning different kinds of fuzzy rules. Soft Computing 11(5), 459–466 (2007) 20. Shibata, D., Inuzuka, N., Kato, S., Matsui, T., Itoh, H.: An induction algorithm based on fuzzy logic programming. In: Zhong, N., Zhou, L. (eds.) Methodologies for Knowledge Discovery and Data Mining, Third Pacific-Asia Conference, PAKDD-99, Beijing, China, April 26-28, 1999, Proceedings. Lecture Notes in Computer Science, vol. 1574, pp. 268–273. Springer (1999) 21. Straccia, U.: Reasoning within fuzzy description logics. Journal of Artificial Intelli- gence Research 14, 137–166 (2001) 22. Straccia, U.: Description logics with fuzzy concrete domains. In: Bachus, F., Jaakkola, T. (eds.) 21st Conference on Uncertainty in Artificial Intelligence (UAI- 05). pp. 559–567. AUAI Press, Edinburgh, Scotland (2005) 23. Wang, T., Li, Z., Yan, Y., Chen, H.: A survey of fuzzy decision tree classifier method- ology. In: Proceedings of the Second International Conference of Fuzzy Information and Engineering, (ICFIE-07). pp. 959–968. No. 40 in Advances in Soft Computing, Springer Verlag (2007)