Exact Learning of EL Ontologies Ricardo Duarte, Boris Konev, Ana Ozaki Center for Advancing Electronics Dresden (cfaed), TU Dresden, Germany University of Liverpool, United Kingdom Free University of Bozen-Bolzano, Italy Abstract. We present the learning algorithm underpinning ExactLearner, a tool for exactly learning and teaching EL terminologies. The algorithm is superpolynomial in the size of the concept expressions and vocabulary of the terminology, but not in the size of the whole terminology (which is optimal). We evaluate ExactLearner’s performance on EL ontologies from the Oxford ontology repository and demonstrate that despite the algorithm being exponential, it successfully terminates for small and medium size ontologies. We investigate the impact of various learner and teacher features and identify those most useful for learning. 1 Introduction Authoring ontologies is a laborious task that requires a combined expertise of domain experts, who know the vocabulary of terms used in a particular subject area and have an understanding of the conceptual relationships between them, and of knowledge engineers, who can formalise these relations in an appropriate ontology definition language. In [6] the dialogue between an expert and a knowl- edge engineer is formalised as an instance of Angluin’s exact learning framework in which a learner tries to exactly identify an ontology by communicating with a teacher, seen as an oracle. It is assumed that the vocabulary of terms is known and is communicated directly to the learner; in contrast, the exact ontology composition has to be found through a ‘trial and error’ learning process. The learner poses queries of two kinds: membership queries, which ask the oracle to determine whether a given inclusion is entailed by the ontology, and equivalence queries, whether the ontology constructed is complete. If the answer to an equivalence query is negative, the oracle returns a statement which follows from the expert’s knowledge but not from the ontology constructed, or the other way around. We are interested in learning algorithms that can identify any target ontology independently of the behaviour of the teacher. For complexity bounds, we consider the worst possible, adversarial teacher, which chooses to reveal as little information as possible. In this paper we build on results of [6] and give an algorithm for learning EL terminologies, which is exponential in the size of concept expressions and its vocabulary but not in the size of the whole terminology, thus providing a formal underpinning to the ExactLearner system [5]. This result complements previous results showing that there is no polynomial time algorithm which can exactly learn (even acyclic) EL terminologies [6]. We then introduce ExactLearner, a tool for exactly learning and teaching EL terminologies, which contains an implementation of our learning algorithm as well as a teacher. We evaluate Ex- actLearner’s performance on EL ontologies from the Oxford ontology repository [8] and demonstrate that despite the algorithm being exponential, it successfully terminates for small and medium size ontologies. We investigate the impact of various learner and teacher features and identify those most useful for learning. Related work. Most relevant to our work are: the DL-Learner [7], which learns concept expressions (but not ontologies) in various fragments of description logic, using refinement operators; and systems based on the exact learning model, such as: Logan-H [2] for learning function-free first order Horn sentences from interpretations; and EIRENE [1], for learning schema mappings. For a more detailed discussion of related work, see [6]. 2 Preliminaries Description logic. Let NC and NR be countably infinite sets of concept and role names. An EL concept expression C is formed according to the rule: C, D := A | > | C u D | ∃r.C, where A ranges over NC and r ranges over NR . A (general) EL concept inclusion has the form C v D, where C and D are EL concept expressions. An EL ontology is a finite set of EL concept inclusions [4]. We call an EL ontology O a terminology if for all C v D ∈ O either C or D is a concept name and O has at most one1 inclusion of the form A v C for every A ∈ NC . ELlhs is the class of EL terminologies consisting only of inclusions of the form C v A, while ELrhs only of inclusions of the form A v C. The size |C| of a concept expression C is the length of the string that represents it, where concept names and role names are considered to be of length one. An ontology vocabulary is the set of concept and role names occurring in the ontology. The size of a concept inclusion C v D, P denoted |C v D|, is |C| + |D| and the size of an ontology O, denoted |O|, is CvD∈O |C v D|. The semantics of EL is defined as usual [3]. We write I |= α to say that a concept inclusion α is true in I. An interpretation I is a model of an ontology O if I |= α for all α ∈ O. O |= α means that I |= α for all models I of O; and O ≡ O0 means that O |= α if and only if O0 |= α for all concept inclusions α. We identify each EL concept expression C with a finite tree TC where nodes are labelled with sets of concept names and edges are labelled with roles: if A is a concept name, then TA has a single node d with label l(d) = {A}; if C = ∃r.D, then TC is obtained from TD by adding a new root d0 and an edge from d0 to the root d of TD with label l(d0 , d) = r (we then call d an r-successor of d0 ); if C = D1 u D2 , then TC is obtained by identifying the roots of TD1 and 1 In the literature, the term terminology commonly refers to sets of concept inclusions A v C and concept definitions A ≡ C, with no concept name occurring more than once on the left. As A ≡ C can be equivalently rewritten as A v C and C v A, our definition is a natural extension of this one. TD2 . A mapping h from a tree TC to an interpretation I is a homomorphism if A ∈ l(d) implies h(d) ∈ AI for every concept name A and r = l(d, d0 ) implies (h(d), h(d0 )) ∈ rI for all role names r. Then d ∈ C I if, and only if, there is a homomorphism from TC to I mapping the root of TC to d. In our proofs, we use notion of a canonical model, defined for every EL ontology O and EL concept C. IC,∅ , also denoted IC , is defined as TC viewed as a tree- shaped interpretation. For O 6= ∅, IC,O is the limit of the sequence I0 , I1 , . . . of interpretations, where I0 = IC and In+1 is obtained from In by applying, in a fair way, the following rule: if C v D ∈ O and d ∈ C In but d 6∈ DIn , then take the interpretation ID and add it to In by identifying its root ρD with d. IC,O |= O and, for any EL concept D, we have ρC ∈ DIC,O iff O |= C v D, where ρC is the root of IC,O . Subsumption learning framework. Given a class of ontologies L (for example all ontologies in a particular DL, EL terminologies etc), we are interested in the exact identification of a target ontology O ∈ L by posing queries to an oracle. We assume that the vocabulary of the target terminology ΣO is known to the learner. A membership query is a call to the oracle to test for an inclusion C v D, where C, D are ΣO -concept expressions of the DL under consideration, if O |= C v D. An inclusion C v D is a positive example w.r.t. a target O if O |= C v D and a negative example else. An equivalence query is a call to the oracle to check if a hypothesis ontology H is equivalent to the target O. If it is the case, the oracle responds ‘yes’, otherwise the oracle returns a positive example C v D with H 6|= C v D or a negative example E v F with H |= E v F . Such a positive example C v D (negative example E v F ) is called a positive counterexample (a negative counterexample, resp.) to H being equivalent to O. For a formal definition of the subsumption learning framework and a discussion of how this definition relates to Angluin’s exact learning model see [6]. We say that a class of ontologies L is exactly learnable if there is an algorithm, which halts for any target O ∈ L and computes, using membership and equivalence queries, H ∈ L with H ≡ O as long as the oracle replies to the queries truthfully. An ontology class is exactly learnable in polynomial time if it is exactly learnable by an algorithm A such that at every step2 of computation the time used by A up to that step is bounded by a polynomial p(|O|, |C v D|), where O is the target and C v D is the largest counterexample seen so far. ELlhs and ELrhs are known to be exactly learnable in polynomial time, while the class of all EL ontologies is not learnable in polynomial time [6]. 3 Learning EL ontologies In this section we present Algorithm 1, which can exactly learn EL terminologies in time exponential in |CO |, the size of the largest concept expression in O, and |ΣO |, the size of the ontology vocabulary, but not in the size of the whole ontology. 2 We count each call to an oracle as one step. Algorithm 1 The learning algorithm for EL Input: An EL terminology O given to the oracle; ΣO given to the learner Output: An EL terminology H computed by the learner such that O ≡ H 1: Set H = {A v B | O |= A v B, A, B ∈ ΣO } 2: while H 6≡ O do 3: Let C v D be the returned positive counterexample for O relative to H 4: Compute C 0 v D0 with C 0 or D0 in ΣO ∩ NC 5: if C 0 ∈ ΣO ∩ NC then Compute a right O-essential α from C 0 v D0 u F0 d 6: C 0 vF 0 ∈H 7: else 8: Compute a left O-essential α from C 0 v D0 9: end if 10: Add α to H 11: end while 12: return H In the main loop of the algorithm the learner poses an equivalence query to the oracle. If the oracle answers “yes” then the algorithm returns H equivalent to O. Otherwise, it receives a counterexample C v D. It is easy to see that at all times O |= H so the counterexample is always positive. As O is a terminology, complex C and D in the counterexample can only “connect” via a concept name, which can be identified by asking membership queries. Lemma 1. Given a counterexample C v D, one can construct, by posing mem- bership queries, a counterexample C 0 v D0 such that |C 0 v D0 | ≤ |C v D| and either C 0 or D0 is a concept name in time polynomial in |H|, |C| and |ΣO |. Proof. (Sketch) This lemma relies on the properties of the rooted canonical model IC,O of an EL concept C and an EL terminology O. It can be proved by induction on the structure of the right-hand side of the counterexample C v D using the following property proved using the construction of IC,O : For any d ∈ ∆IC and EL concept expression of the form ∃r.F : if there exists a homomorphism h from the labeled tree corresponding to ∃r.F into IC,O such that h(a∃r.F ) = d and all nodes in the labeled tree corresponding to F are mapped to ∆IC,O \ ∆IC , then there exists a concept name E with d ∈ E IC such that O |= E v ∃r.F . Having transformed the counterexample to the case of a concept name on the left or on the right, the algorithm minimises the size of the counterexample and makes it more informative by applying rules. If C 0 is a concept name then Algorithm 1 merges D0 with the right-hand sides of all inclusions in H with C 0 on the left (if they exist) and computes a so called right O-essential counterexample. Otherwise D0 is a concept name, and the algorithm computes a left O-essential counterexample. It then adds the resulting O-essential concept inclusion α to H. Right O-essential concept inclusions We say that an inclusion A v C is right O-essential if none of the following three rules applies to A v C. Concept saturation for O: If O |= A v C 0 and C 0 results from C by adding a concept name A0 to the label of some node, then replace A v C by A v C 0 : Sibling merging for O: If O |= A v C 0 and C 0 is the result of identifying in C two r-successors of the same node then replace A v C by A v C 0 . Decomposition on the right for O: If d0 is an r-successor of d in C, A0 is in the node label of d, and O |= A0 v ∃r.Cd0 plus A0 6≡O A if d is the root of C, then replace A v C by (a) A0 v ∃r.Cd0 if H 6|= A0 v ∃r.Cd0 ; or (b) A v C|− d0 ↓ , otherwise, where Cd is the concept corresponding to the subtree rooted in d and C|− d↓ is the concept corresponding to the result of removing the subtree rooted in d from C. It follows from [6], Lemma 29 and its proof, that given a counterexample A v C, one can construct a right O-essential counterexample A0 v C 0 using only polynomially many polynomial size membership queries in |H| + |C| + |ΣO |. Concept saturation, sibling merging and decomposition on the right are all essential—hence the name—steps of the polynomial learning algorithm for DL-Lite∃R , which extends ELrhs with inverse roles and role hierarchies [6]. Example 1. (a) For H = ∅ and O = {Human v ∃hasParent.Human} the oracle can return arbitrary many arbitrary long hasParent chains starting at Human as a counterexamples leading to non-termination. For instance, Human v ∃hasParent.∃hasParent.> is a chain of length two. With concept saturation, this counterexample can be strengthened to Human v ∃hasParent.(Human u ∃hasParent.Human), which is equivalent to O. (b) For O = {Human v ∃hasParent.(Human u Male)} and H = {Human v ∃hasParent.Human}, upon receiving a counterexample Human v ∃hasParent. Male, the learner merges its right hand side with the right hand side of the inclusion in H to form Human v ∃hasParent.Male u ∃hasParent.Human and then strengthens it by sibling merging to form the inclusion in O. (c) For H = ∅ and O = {Woman v Human, Human v ∃hasParent.Human}, even with concept saturation, there exist infinitely many chain counterexamples; Woman v Human u ∃hasParent.(Human u ∃hasParent.Human) is one of them. This inclusion can be decomposed at the root into (a) Human v Woman and (b) Human v ∃hasParent.(Human u ∃hasParent.Human). Picking either of them allows the learner to make progress. In fact, one can prove that Algorithm 1 polynomially learns ELrhs . Theorem 1. The class of ELrhs ontologies is exactly learnable in polynomial time by Algorithm 1. We show that even in the presence of inclusions of the form D v B, right O-essential concept inclusions are bounded in size by the size of the largest concept expression in O. Lemma 2. Suppose that A v C is right O-essential. Then |C| ≤ |CO | · |ΣO |, where CO is the largest concept expression in O. Proof. (Sketch) We follow the lines of the proof of Lemma 32 in [6]. First we define AO = {A} ∪ {D | O |= A v B, B v D ∈ O} d and construct the canonical model ID0 ,O of D0 = D∈AO D and O. The inter- pretation ID0 ,O has the following properties: 1. for every EL concept expression F : ρD0 ∈ F ID0 ,O iff O |= A v F ; 2. for any d ∈ ∆ID0 and EL concept expression of the form F = ∃r.F 0 : if there exists a homomorphism h from the labeled tree corresponding to F into ID0 ,O such that h(aF ) = d and all nodes in the labeled tree corresponding to F 0 are mapped to ∆ID0 ,O \ ∆ID0 , then there exists a concept name E with d ∈ E ID0 such that O |= E v F . Then, one can show that there is an injective homomorphism h from the labeled tree TC of C into the restriction J of ID0 ,O to ∆ID0 which maps aC to ρD0 . By definition of AO , there is B v D ∈ O such that |∆IC | ≤ |∆ID |, which means that |∆IC | ≤ |∆ICO |, where CO is the largest concept expression in O. The main observation here is that the argument holds for EL terminologies, that is, in the presence of concept inclusions of the form D v B. Such inclusions do not interfere in the argument since concept saturation ensures that nodes have all concept names implied by O in their labels. Left O-essential concept inclusions We say that an inclusion C v A is left O-essential if neither of the following rules applies to C v A. Concept saturation for H: If H |= C v C 0 and C 0 results from C by adding a concept name A0 to the label of some node, then replace C v A by C 0 v A. Decomposition on the left for O: If d is a non-root node such that O |= C|− d↓ v A 0 − 0 0 − 0 and H 6|= C|d↓ v A , for some A ∈ ΣO , then replace C v A by C|d↓ v A ; if O |= Cd v A0 and H 6|= Cd v A0 , then replace C v A by Cd v A0 . The applicability of the second rule may depend on the application of the first. For example, for H = {∃hasParent.> v Human} and O = H ∪ {∃hasChild.Human v Human} a counterexample could be ∃hasChild.∃hasParent.> v Human, which can only be decomposed on the left for O if we apply concept saturation for H first. Similarly to right O-essential counterexamples, given a counterexample C v A, one can construct a left O-essential counterexample C 0 v A0 using only polynomially many polynomial size membership queries in |H| + |C| + |ΣO |. We show in Lemma 3 that the size of left O-essential counterexamples is polynomially bounded by |ΣO | and |CO |. The intuition behind this lemma is that if neither concept saturation for H nor decomposition on the left for O apply to an inclusion C v A then for some D v E ∈ O, with E ∈ NC , not only there is a homomorphism h from ID to IC,O mapping their roots but also all elements of IC are in the image of this homomorphism. However, in contrast to Lemma 58 by Konev et. al [6], due to inclusions of the form B v F with F a complex expression, some nodes of D may be mapped by h outside of IC . For example, for H = ∅, O = {∃r.∃s.B v E, E v A, B v ∃s.B} and a left O-essential counterexample ∃r.B v A, the homomorphism from I∃r.∃s.B to I∃r.B,O covers I∃r.B but there is only a partial homomorphism (that is, a partial function satisfying properties of a homomorphism) from I∃r.∃s.B to I∃r.B . Lemma 3. Suppose that C v A is left O-essential. Then |C| ≤ |CO | · |ΣO |, where CO is the largest concept expression in O. Proof. To show this lemma, we use the canonical model IC,O of C and O. If there is E ∈ ΣO such that O |= C v E (and C v E is not a tautology) then there is D v E ∈ O such that ρC ∈ DIC,O . Let (C, O)IC = {D | ∃E ∈ ΣO : ρC ∈ DIC,O , ρC 6∈ E IC , D v E ∈ O}. For all D ∈ (C, O)IC there is a homomorphism h : ID → IC,O mapping the root of ID to the root of IC,O . Also, by construction of IC,O , there is D0 ∈ (C, O)IC such that: 1. there is a partial homomorphism h : ID0 → IC mapping the root of ID0 to the root of IC , for e ∈ ∆ID0 with h(e) ∈ ∆IC ; 2. if D0 has a conjunct ∃r.F , with d ∈ F ID0 as the corresponding r-successor of ρD , and h(d) 6∈ ∆IC then there is A ∈ NC such that ρC ∈ AIC and O |= A v ∃r.F . The fact that |∆IC | ≤ |∆ID0 | for such D0 v E ∈ O is a result of the Claim. Claim. For all d ∈ ∆IC , there is e ∈ ∆ID0 such that d = h(e). Suppose to the contrary that there is d0 ∈ ∆IC such that d0 is outside the image of h : ID0 → IC,O . Let G = C|− d0 ↓ be the concept corresponding to the result of removing the subtree rooted in d0 . Since C is concept saturated for H, if ρG ∈ D0IG,O and ρG 6∈ E IG then this contradicts the fact that decomposition on the left for O was exhaustively applied. To show the contradiction, we argue that h is a homomorphism from ID0 into the restriction IG,O of IC,O . Our construction has the following properties: 3. for e ∈ ∆ID0 with h(e) ∈ ∆IC , we have that h(e) ∈ ∆IG ; 4. for all d ∈ ∆IC \ {ρC }, d ∈ AIC iff d ∈ AIC,O , for all A ∈ NC . We obtain Point (3) by the fact that since d0 is outside the image of h : ID0 → IC,O , all descendants of d0 are outside the image of h : ID0 → IC,O as well. Point (4) follows from decomposition of the left for O and concept saturation for H. For e ∈ ∆ID0 with h(e) ∈ ∆IC , by Points (1) and (3), there is a partial homomorphism from ID0 to IG . For e ∈ ∆ID0 with h(e) 6∈ ∆IC , there is e1 ∈ ∆ID0 such that: (i) h(e1 ) 6∈ ∆IC ; (ii) e is in the subtree rooted in e1 ; and (iii) e1 has minimal distance from ρD0 , where ‘distance’ here is the length of a chain of elements connected via roles from ρD0 to e1 . This means that the parent e2 of e1 is such that h(e2 ) ∈ ∆IC . Let F be the concept corresponding to the subtree rooted in e1 and let s be the role between e2 and e1 . That is, (e2 , e1 ) ∈ sID0 . Since h(e1 ) 6∈ ∆IC we have that all descendants of e1 are not mapped to ∆IC . Then, by construction of the canonical model IC,O of C and the EL terminology O, one can show that there is a B ∈ NC such that h(e2 ) ∈ B IC,O and O |= B v ∃s.F . So it remains to show that h(e2 ) ∈ B IG,O , for some B ∈ NC such that O |= B v ∃s.F . We make a case distinction: – for h(e2 ) 6= ρC : by Point (4), h(e2 ) ∈ B IC,O implies h(e2 ) ∈ B IC . As h(e2 ) ∈ ∆IC , by Point (3), we have that h(e2 ) ∈ ∆IG . So h(e2 ) ∈ B IG , and thus, h(e2 ) ∈ B IG,O . – for h(e2 ) = ρC : by Point (2), if e1 ∈ F ID0 and h(e1 ) 6∈ ∆IC then there is a concept name B such that ρC ∈ B IC and O |= B v ∃s.F . So ρG ∈ B IG , which means that ρG = h(e2 ) ∈ B IG,O . Since we showed for arbitrary e ∈ ∆ID0 with h(e) ∈ ∆IC and h(e) 6∈ ∆IC that there is d ∈ ∆IG,O such that d = h(e), we have that h : ID0 → IG,O is a homomorphism. Thus, there is D0 ∈ (C, O)IG such that ρG ∈ D0IG,O and ρG 6∈ E IG for some D0 v E ∈ O, which contradicts the fact that decomposition on the left for O was exhaustively applied, so our claim follows. This bounds the size of the domain of IC to the size of the domain of ID0 . Since |∆ID0 | ≤ |∆ICO |, we have |∆IC | ≤ |∆ICO |. By combining lemmas 2 and 3 we obtain the following result. Lemma 4. Given a counterexample C v D for O relative to H, one can con- struct a counterexample C 0 v D0 such that |C 0 v D0 | ≤ |CO | · |ΣO | + 1 in polynomial time in |C v D|, |ΣO | and |H|. Since there are at most |ΣO ||CO |·|ΣO |+1 many inclusions over ΣO of size |CO | · |ΣO | + 1, at most |ΣO ||CO |·|ΣO |+1 counterexamples get added to H over the run of the algorithm. Thus we obtain the following theorem. Theorem 2. The class of EL terminologies is exactly learnable by Algorithm 1 in O(|ΣO |2|CO |·|ΣO |+2 · |C v D|2 ) steps, where ΣO is the vocabulary of the target terminology O, CO is the largest concept expression in O and C v D is the largest counterexample seen by Algorithm 1. 4 Evaluation The ExactLearner system [5] has two main components: a learner and a teacher. The learner supports (1) “Concept Saturation”, (2) “Sibling Merging”, (3) “De- composition”, applied on the right side of inclusions, and (4) “Concept Desatu- ration”, (5) “Sibling Branching” and (6) “Decomposition”, applied on the left. Operations (1), (2), (3) and (6) have already been described. In addition, we have also implemented (4) and (5), which act as heuristics to construct smaller, more informative counterexamples. Concept desaturation tries to remove concept names from nodes in the left of counterexamples to make them logically stronger. Sibling branching tries to strengthen a counterexample by splitting paths on Test 2 Test 3 p # timeouts avg CE avg max C # timeouts avg CE avg max C 0.01 3 17.2 27.7 2 5.6 31.7 0.5 25 107.8 26.6 3 6.1 31.6 1.0 26 190.4 19.5 3 6.3 31.9 Table 1. Learner against the adversarial teacher. the left. For example, for O = {∃hasDegree.BSc u ∃hasDegree.MSc v PG} and H = ∅, the inclusion ∃hasDegree.(BSc u MSc u PhD) v PG is a counterexample, from which desaturation removes the irrelevant PhD and then sibling branching strengthens it to the one in O. Concept saturation for H was implemented as part of Rule (6), “Decomposition” applied to the left, as the applicability of Rule (6) may depend on the exhaustive application of this rule. We have evaluated ExactLearner’s performance on EL ontologies from the Oxford ontology repository [8]. As a first experiment we ran the learner against a naïve teacher, which presents the target ontology inclusions one by one without modification. This experiment aims at estimating the overheads of the learning process under the best possible conditions. In this first experiment, for 50 out of 174 EL ontologies computations concluded within 1 hour. We selected these ontologies for further experiments. The selected ontologies range in size from 9 to 11 177 inclusions with signature sizes ranging from 23 to 9334 concept names and from 2 to 25 role names. The average size of counterexamples produced by the teacher was 5.48 while the average size of the largest concept in O was 2.7. The average size of the largest concept in H was 31.3, an increase caused by concept saturation on the right side of inclusions. The performance bottlenecks in our system are checking if the presented inclusion is a counterexample w.r.t. the current hypothesis ontology at the server side and entailment checks in the learner. We have also introduced an adversarial teacher, which forces the learner to apply particular operations from (1)–(6) above by manipulating the counterexam- ples. For instance, to force the learner to perform concept saturation on the right of A v C, the teacher exhaustively tries to remove concept names from every node in the tree representation of C, while ensuring that the modified inclusion is still a counterexample. The adversarial teacher can apply: (7) “Concept Desaturation” on the right, which we have just described; (8) “Sibling Branching” on the right, which weakens counterexamples of the form A v ∃r.(C u D) into A v ∃r.C u ∃r.D (provided the latter is still a counterexample); (9) “Concept Saturation” on the left; and (10) “Sibling Merging” on the left, which are the opposite of learner’s concept desaturation and sibling branching. We also substitute concept defini- tions into counterexamples, for instance, if A v ∃r.B is a counterexample and B v C ∈ O we test A v ∃r.C for being a counterexample as well. We call this operation (11) “Composition on the right”. (12) “Composition on the left” is its Fig. 1. Oracle and learner skills usage in Test 2. counterpart. Operations (7)–(12) are applied at random with set probabilities so that the level of difficulty could be controlled (whenever the oracle applies an operation, the result is a ‘less informative’ counterexample, so learning from those counterexamples may require more queries, and thus, increase the difficulty). Table 1 presents statistics of running the learner against the adversarial teacher. In Test 2 the teacher was applying transformations (7)–(12) with probability p of 0.01, 0.5 and 1.0. The learner can cope with a small distortion of examples (p = 0.01) but a significant distortion leads to a big increase in the number of time-outs. Figure 1 shows the percentage of the rules applied by the learner and the oracle in Test 2. As the chart indicates, the most frequently applied rule (42%) for the learner with probability p = 0.01 is desaturation in the left. This number grows to 94% when p = 0.5 and to 96% when p = 1.0. For the oracle, the most frequently applied rule with p = 0.01 is saturation on the left. Then, we see an increase on the application of composition on the left, which grows to 90% when p = 0.5 and to 92% when p = 1.0. As the oracle applies saturation on the left, there are more concept names on the left which can be used for composition on the left. The discrepancy between the number of compositions on the left by the oracle and decompositions on the left by the learner is due to the fact that the oracle applies the rule repeatedly while the learner finds a minimal subtree in one rule application. In Test 3 we have disabled rule (9) at the oracle side as well as rules (8) and (10) as they almost never applied, see Figure 1, yet took up a significant time in our tests. This led to a significant drop in the failure rate even though other adversarial teacher operations were applied with a high probability. This suggests that the main cause of time-outs is the exponential explosion in the size of the signature rather than the size of concepts in the target ontology. We also Fig. 2. Membership and equivalence queries. measured the increase on the number of queries in Test 2 when the probability for the oracle to apply a certain rule increases. In Test 2, 22 ontology computations concluded within 1 hour with the probability of the oracle to apply a certain rule set to 1.0 (the oracle always apply all rules exhaustively). Figure 2 shows the delta between averages of the number of queries when the probability for the oracle to apply its rules are set to 0.0 and the other values: 0.01, 0.5 and 1.0. The number of membership queries visibly increases, while the number of equivalence queries remains nearly the same. This is expected, since computing O-essential inclusions from less informative counterexamples need more membership queries. Though, since the learner indeed computes O-essential counterexamples, there is only a small impact on the number of equivalence queries. 5 Conclusion We presented the learning algorithm underpinning ExactLearner. As future work, we plan to extend our algorithm in an ontology-based data access setting and adopt the Probably Approximately Correct learning model extended with membership queries, so that our algorithm can also run without the teacher. Acknowledgements. Konev was supported by the EPSRC project EP/H043594/1. We would like to thank Frank Wolter for fruitful discussions and Liyi Zhao for her contribution to an earlier version of ExactLearner. References 1. Bogdan Alexe, Balder ten Cate, Phokion G. Kolaitis, and Wang Chiew Tan. EIRENE: interactive design and refinement of schema mappings via data examples. PVLDB, 4(12):1414–1417, 2011. 2. Marta Arias, Roni Khardon, and Jérôme Maloberti. Learning horn expressions with LOGAN-H. Journal of Machine Learning Research, 8:549–587, 2007. 3. F. Baader, D. Calvanese, D. McGuiness, D. Nardi, and P. Patel-Schneider. The Description Logic Handbook: Theory, implementation and applications. Cambridge University Press, 2003. 4. Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL envelope. In IJCAI, pages 364–369. Professional Book Center, 2005. 5. Ricardo Duarte, Boris Konev, and Ana Ozaki. Exact learning of EL ontologies. In Principles of Knowledge Representation and Reasoning: Proceedings of the 16th International Conference, KR 2018, Tempe, USA, October 30-November 2, 2018, 2018. 6. Boris Konev, Carsten Lutz, Ana Ozaki, and Frank Wolter. Exact learning of lightweight description logic ontologies. Journal of Machine Learning Research, 18(201):1–63, 2018. 7. Jens Lehmann. DL-Learner: Learning concepts in description logics. Journal of Machine Learning Research, 10:2639–2642, 2009. 8. Oxford. Information systems group ontologies. Retrieved from https://www.cs.ox. ac.uk/isg/ontologies/. Accessed: 18 May 2018.