Lethe: Saturation-Based Reasoning for Non-Standard Reasoning Tasks Patrick Koopmann, Renate A. Schmidt The University of Manchester, UK {koopmanp, schmidt}@cs.man.ac.uk Abstract. We present the saturation-based reasoning system Lethe. Lethe is a tool that can be used for uniform interpolation, forgetting, TBox abduction and logical dierence. To solve these problems, Lethe uses saturation-based reasoning to eliminate certain symbols from an on- tology, such that entailments in the remaining vocabulary are preserved. This is known as forgetting or uniform interpolation. Lethe is an imple- mentation of our forgetting methods for various expressive description logics, and can be used as a Java library and as a standalone tool for the mentioned reasoning tasks. We give a high level description of the calculi used by Lethe, describe the reasoning algorithm implemented in Lethe, and give an evaluation of the system on realistic ontologies. 1 Introduction Lethe is a tool for non-standard reasoning tasks that provides functionality for forgetting, TBox abduction and logical dierence, which have applications in ontology analysis, debugging and ontology change. These functionalities are implemented by saturation-based reasoning methods for forgetting. The name Lethe is taken from Greek mythology, and denotes the river of forgetfulness. For- getting, also known as uniform interpolation, is the core functionality of Lethe. It reduces the vocabulary of an ontology in such a way that entailments in the re- stricted vocabulary are preserved. We give a formal denition. Let sig(E) denote the concept and role symbols occurring in an ontology or axioms E . Denition 1. The result of forgetting a set of symbols Σ from an ontology O in a logic L is an ontology O−Σ such that sig(O−Σ ) = sig(O) \ Σ and for all L axioms α with sig(α) ∩ Σ = ∅ we have O−Σ |= α i O |= α. Apart from TBox abduction and computing logical dierences between on- tology versions, forgetting has a lot of direct applications, and can be used to hide condential information from an ontology, to analyse hidden relations be- tween concepts and roles, to generate ontology summaries, and has many more applications, examples of which can found in [2, 13, 14]. Saturation-based reasoning has received increased interest in the last years by the description logic community due to its good performance for classication. Examples include the consequence-based reasoners used by ELK [3] and Con- DOR [19, 20]. Saturation procedures work by adding new inferences performed 2 by a set of rules to the knowledge until no new inferences can be performed. This technique is especially useful if the aim is to compute all entailments of a speci- ed form, such as atomic concept inclusions in the case of classication. For the same reason, saturation-based reasoning is well-suited for forgetting. Forgetting a concept or role symbol x is performed in Lethe by computing all entailments on that symbol, until axioms containing x can be removed. This can be done using goal-oriented saturation, provided that the underlying calculus enjoys the necessary completeness properties. Common consequence-based reasoning methods do not enjoy these proper- ties, since they are optimised towards classication. As example, take the fol- lowing two axioms: A1 v ∀r.B A2 v ∀r.¬B Forgetting B from these axioms should result in the single axiom A1 uA2 v ∀r.⊥. However, this inference is not necessary if we are only interested in entailments of the form A v B . Inferences between universal restrictions are usually not required for classication. We developed a new family of saturation-based calculi for ALC , ALCH, SHQ and SIF ontologies [6, 5, 8, 10], as well as for ALC and SHI knowledge bases [9, 12, 11], that are interpolation complete. Interpolation completeness ensures that all inferences needed for forgetting are performed. Lethe implements the calculi for ALCH and SHQ ontologies, as well as for ALC knowledge bases. Furthermore, Lethe implements optimised variations of the forgetting pro- cedure to solve the related problems logical dierence and TBox abduction. The aim of logical dierence is to compute diering entailments between ontology versions with respect to some user-dened signature [4]. Lethe computes logi- cal dierences between two ontologies O1 and O2 by forgetting symbols from O2 that are not in the desired signature, and returning all resulting axioms that are not entailed by O1 (the same approach followed in [13]), and has some dedicated optimisations to make this feasible for large ontologies. The form of abduction that is implemented in Lethe takes as input an ontology O, a set of axioms Obs and a signature of abducibles Σ , and computes a logically weakest set of axioms H such that sig(H) ⊆ Σ and O ∪ H |= Obs. TBox abduction can support ontology engineers in nding missing axioms of an ontology if expected entailments are not satised. This problem can be reduced to a form of forgetting by observing that O ∪ H |= Obs i O ∪ {¬Obs} |= ¬H . The negation of a TBox axiom C v D can be encoded in the TBox axiom > v ∃r∗ .(Ci u ¬Di ), where r∗ is fresh. Using the technique described in [7], we forget all symbols from ¬Obs that are not in Σ , where O is used as background knowledge. If we negate the result again, we obtain the weakest set of axioms in Σ such that O ∪ H |= Obs. 2 The Calculi Even though Lethe implements dierent calculi, these use similar techniques, which allows to implement a unied reasoning strategy. To give an idea, we 3 Resolution Rule: C1 t A C2 t ¬A C1 t C2 ∃-Elimination Rule: C t ∃r.D ¬D C ∀Q-Combination Rule: C1 t ∀R.D1 C2 t QS.D2 C1 t C2 t QS.D12 where Q ∈ {∀, ∃} and D12 is a possibly new dener representing D1 u D2 . Fig. 1. Rules of the calculus ResALC . describe a simple calculus for ALC , introduced in [6], that shares main ideas with these calculi. The method requires the input to be normalised. Let Nc and Nr be the sets of concept symbols and role symbols that may be used in an ontology, and Nd ⊂ Nc be a set of designated concept symbols called deners. Denition 2. An ALC literal is a concept of the form A, ¬A, ∃r.D, ∀r.D, where D ∈ Nd . An ALC literal of the form ¬D, D ∈ Nd , is called negative dener literal. An ALC clause is an axiom of the form > v L1 t . . . t Ln , where every Li is an ALC literal. We usually omit the leading `> v' and assume that clauses are represented as sets, that is, they do not have duplicate literals and the order of the literals is not important. An ontology is in ALC normal form i every axiom is an ALC clause. Every ALC ontology can be transformed into ALC normal form using stan- dard structural transformation and CNF transformation techniques. The result- ing ontology shares all entailments modulo deners with the original ontology. If a normalised ALC ontology contains only clauses with at most one negative dener literal, similar transformations can be applied in the other direction in order to obtain an ontology without deners, which may use greatest xpoint operators (see [6] for details). Example 1. For the ontology given in the introduction, a normalised represen- tation is N = {¬A1 t ∀r.D1 , ¬D1 t B, ¬A2 t ∀r.D2 , ¬D2 t ¬B}. The calculus ResALC is shown in Figure 1. A major dierence between our calculi and calculi for consequence-based reasoning [3, 19], or methods for direct resolution in modal logics [1, 1517], is that symbols are not only introduced as part of the normalisation, but are also introduced dynamically during the saturation process. This allows to preserve the normal form without omitting required inferences. Specically, the conclusions of the ∀Q-combination rule use a dener D12 representing D1 u D2 . Such a dener is introduced by adding the 4 two clauses ¬D12 t D1 and ¬D12 t D2 to the current clause set. New deners are only introduced if no corresponding dener already exists. This way, the number of introduced deners can be limited by 2n , where n is the number of deners in the input ontology. As a result, we obtain that any saturated set of clauses is n nite and contains at most 22 clauses. The calculi for ALCH ontologies, SHQ ontologies and ALC knowledge bases, as implemented in Lethe, use dierent rule sets, but the structure of these calculi is similar. They always have a resolution rule that allows to infer inferences on a specic symbol, and they have a set of combination rules, which lead to the introduction of new deners, and subsequently make further inferences possible. In order to forget a concept symbol from the ontology, all inferences of the resolution rule for which the conclusion contains maximally one negative dener literal have to be performed. This may involve several applications of the com- bination rules, as these lead to the introduction of new deners and clauses. In the resulting clause set, all deners can be eliminated by applying the normali- sation backwards, resulting in an ontology without dener symbols, which may use xpoint operators. As described in [6], we can obtain a representation with- out xpoints by leaving some dener symbols in the ontology or use the dener symbols as a guide to approximate the result. Both methods are implemented in Lethe. 3 Central Reasoning Algorithm Used in Lethe The main challenge of implementing the calculi is to determine which applica- tions of the combination rules are required in order to compute the forgetting result. If combination rules are applied unrestricted, we infer more clauses than necessary, and the implementation becomes impractical already for small on- tologies. On the other hand, applications of the combination rules are necessary to make further inferences possible. Lethe uses an easy technique to ensure that all required inferences of combination rules are applied, while keeping the number of unnecessary inferences low. The idea is to use the resolution rule as guide to determine which deners should be introduced if possible. We can then determine how this dener can be introduced by using the combination rules. Symbols to be forgotten are processed one after another, starting with the symbols with less occurrences in the ontology. For each symbol, we process only the portion of the ontology that uses this symbol. The main inference loop for forgetting a concept symbol A is then controlled by the following methods: MAIN_LOOP. Process each negative occurrence of A one after another and perform all possible resolution steps. Pass the results to the method PRO- CESS_CLAUSE. PROCESS_CLAUSE(C). Determine whether the clause C contains more than one negative dener literal. If not, pass it to ADD_CLAUSE. Otherwise, we have C = ¬D1 t ¬D2 t C 0 . If there is a dener D12 that represents D1 t D2 , replace C by ¬D12 t C 0 and pass it again to PROCESS_CLAUSE. (This step basically performs resolution on the clauses ¬D12 t D1 and ¬D12 t D2 , which 5 are then also in the current clause set). If there is no such dener yet, look for occurrences of D1 and D2 under a role restriction and apply the corresponding role combination rules if possible, which may then introduce the dener D12 . Every clause that is inferred in this process, as well as the clause ¬D12 t C 0 if it could be generated, is sent again to PROCESS_CLAUSE, since it may again contain more than one negative dener symbol. ADD_CLAUSE(C). We apply simplication rules on the clause and check whether it is redundant. If the clause does not contain the symbol to be elimi- nated, we add it to the nal output clause set. Example 2. Assume we want to forget A from the clauses in the last exam- ple. MAIN_LOOP resolves on the clauses ¬D1 t A and ¬D2 t ¬A, and passes ¬D1 t ¬D2 to PROCESS_CLAUSE. As this clause contains more than one neg- ative dener literal, we do not have to add it to the result. Instead, we check whether we can apply combination rules on clauses in which D1 and D2 occur un- der a role restriction. This is possible on the clauses ¬A1 t∀r.D1 , ¬A2 t∀r.D2 , and we pass the resulting clauses ¬A1 t¬A2 t∀r.D12 , ¬D12 tD1 , ¬D12 tD2 and ¬D12 to the method ADD_CLAUSE. The nal output contains only clauses with at most one negative dener literal, and no occurrences of B . The clause ¬D12 corresponds to the axiom D12 v ⊥. We can therefore replace D12 by ⊥. After some further syntactic adjustments we obtain the result A1 u A2 v ∀r.⊥. Several hash maps are used in the implementation to allow fast access of all occurrences of deners and the symbols that are currently forgotten. Through- out the computation, we ensure the invariant property that every stored clause contains maximally one negative dener literal. This is necessary to be able to eliminate introduced dener symbols in the end. 4 Evaluation We evaluated the current version of Lethe on a set of 339 ontologies taken from the NCBO BioPortal repository [18], restricted to the dierent fragments supported by the dierent calculi. The ontologies correspond to the consistent ontologies of that repository that have less than 100,000 axioms. The average number of axioms of these ontologies is 4,760, and the 90th percentile is 13,045 axioms. For each ontology and each supported description logic, we generated 360 sets of symbols of size 50 and of size 100 and eliminated the symbols using Lethe. A timeout was set to 30 minutes. Table 1 shows the results of this evaluation. The tables show the success rate for each experiment, the fraction of results that could be represented without xpoints, and the mean, median and 90th percentile of the duration of the computation. Note that for SHQ ontologies, Lethe only supports elimination of concept symbols. In around 9095% percent of cases, forgetting could be performed within 30 minutes. In most cases, the computation took just a few seconds, with the median of the duration being below 7 seconds in all cases. Except for the ALC forgetter with ABox support, the 90th percentile of the duration was always below 22 seconds. 6 ALCH, forget 50 symbols ALCH, forget 100 symbols Success Rate: 91.10% Success Rate: 88.10% Without Fixpoints: 95.29% Without Fixpoints: 93.27% Duration Mean: 7.68 sec. Duration Mean: 18.03 sec. Duration Median: 2.74 sec. Duration Median: 3.81 sec. Duration 90th percentile: 12.45 sec. Duration 90th percentile: 21.17 sec. ALC w. ABoxes, forget 50 symbols ALC w. ABoxes, forget 100 symbols Success Rate: 94.79% Success Rate: 91.37% Without Fixpoints: 92.91% Fixpoints: 92.48% Duration Mean: 23.94 sec. Duration Mean: 57.87 sec. Duration Median: 3.01 sec. Duration Median: 6.43 sec. Duration 90th percentile: 29.00 sec. Duration 90th percentile: 99.26 sec. SHQ, forget 50 concept symbols SHQ, forget 100 concept symbols Success Rate: 95.83% Success Rate: 90.77% Without Fixpoints: 93.40% Fixpoints: 91.99% Duration Mean: 7.62 sec. Duration Mean: 13.51 sec. Duration Median: 1.04 sec. Duration Median: 1.60 sec. Duration 90th percentile: 4.89 sec. Duration 90th percentile: 11.65 sec. Table 1. Forgetting 50 and 100 symbols using Lethe. 5 Outlook Lethe currently supports forgetting of concept and role symbols from ALCH ontologies and ALC knowledge bases with ABoxes, as well as forgetting concept symbols from SHQ ontologies. Apart from providing functionality for forgetting, Lethe uses these forgetting procedures together with dedicated optimisations for TBox abduction and computing logical dierences. The current version can be downloaded at http://www.cs.man.ac.uk/~koopmanp/lethe. It can be used as command line tool and as Java library. Furthermore, a simple GUI for computing and showing forgetting results on smaller ontologies is provided. We currently have a prototypical implementation for SHI knowledge bases with ABoxes, which still has to be thoroughly tested and debugged. Even though our implementation of TBox abduction and logical dierences has been thor- oughly optimised, a proper evaluation on realistic use cases is still open. A natural next step is to develop and implement a method for SHIQ knowl- edge bases, which would generalise all three currently implemented methods. Though currently only used for forgetting, the calculi can in theory also be used for classical reasoning tasks such as satisability checking, classication or realisation. An interesting open question is how Lethe would perform as a reasoner if it is implemented towards these reasoning tasks. Another open ques- tion is whether our saturation-based reasoning approach can be used for other non-classical problems, such as for example approximation or ABox abduction. 7 References 1. Enjalbert, P., Fariñas del Cerro, L.: Modal resolution in clausal form. Theoretical Computer Science 65(1), 133 (1989) 2. Gabbay, D.M., Schmidt, R.A., Szaªas, A.: Second Order Quantier Elimination: Foundations, Computational Aspects and Applications, Studies in Logic, vol. 12. College Publications (2008) 3. Kazakov, Y.: Consequence-Driven Reasoning for Horn SHIQ Ontologies. In: Boutilier, C. (ed.) Proceedings of the International Joint Conference on Articial Intelligence (IJCAI-09). pp. 20402045. AAAI Press (2009) 4. Konev, B., Walther, D., Wolter, F.: The logical dierence problem for description logic terminologies. In: Automated Reasoning, Lecture Notes of Computer Science, vol. 5195, pp. 259274. Springer (2008) 5. Koopmann, P., Schmidt, R.A.: Forgetting concept and role symbols in ALCH- ontologies. In: Logic for Programming, Articial Intelligence and Reasoning. Lec- ture Notes in Computer Science, vol. 8312, pp. 552567. Springer (2013) 6. Koopmann, P., Schmidt, R.A.: Uniform interpolation of ALC -ontologies using x- points. In: Frontiers of Combining Systems. Lecture Notes in Computer Science, vol. 8152, pp. 87102. Springer (2013) 7. Koopmann, P., Schmidt, R.A.: Computing uniform interpolants of ALCH- ontologies with background knowledge. In: Proc. ARW-DT'14 (2014) 8. Koopmann, P., Schmidt, R.A.: Count and forget: Uniform interpolation of SHQ- ontologies. In: Automated Reasoning. Lecture Notes in Computer Science, vol. 8562, pp. 434448. Springer (2014) 9. Koopmann, P., Schmidt, R.A.: Forgetting and uniform interpolation for ALC - ontologies with ABoxes. In: Proceedings of the 27th International Workshop of Description Logics (DL 2014). CEUR Workshop Proceedings, vol. 1193, pp. 245 257. CEUR-WS.org (2014) 10. Koopmann, P., Schmidt, R.A.: Saturation-based forgetting in the description logic SIF . In: Proceedings of DL-15. CEUR Workshop Proceedings, vol. 1350, pp. 439 451. CEUR-WS.org (2015) 11. Koopmann, P., Schmidt, R.A.: Saturation-based reasoning for SHI -knowledge bases with applications to forgetting and uniform interpolation. In: Proceedings of ARW-15. University of Birmingham (2015) 12. Koopmann, P., Schmidt, R.A.: Uniform interpolation and forgetting for ALC on- tologies with ABoxes. In: Proceedings of AAAI-15. AAAI-Press (2015), to appear 13. Ludwig, M., Konev, B.: Practical uniform interpolation and forgetting for ALC TBoxes with applications to logical dierence. In: Proc. KR'14. AAAI Press (2014) 14. Lutz, C., Wolter, F.: Foundations for uniform interpolation and forgetting in ex- pressive description logics. In: Proceedings of the International Joint Conference on Articial Intelligence (IJCAI-11). pp. 989995. AAAI Press (2011) 15. Mints, G.: Gentzen-type systems and resolution rules part I: propositional logic. In: COLOG-88, Lecture Notes in Computer Science, vol. 417, pp. 198231. Springer Berlin Heidelberg (1990) 16. Nalon, C., Dixon, C.: Clausal resolution for normal modal logics. Journal of Algo- rithms 62(34), 117  134 (2007) 17. Nalon, C., Marcos, J., Dixon, C.: Clausal resolution for modal logics of conuence. In: Automated Reasoning, Lecture Notes of Computer Science, vol. 8562, pp. 322 336. Springer (2014) 8 18. Noy, N.F., Shah, N.H., Whetzel, P.L., Dai, B., Dorf, M., Grith, N., Jonquet, C., Rubin, D.L., Storey, M.A., Chute, C.G., Musen, M.A.: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research 37, 170173 (2009) 19. Simancík, F., Kazakov, Y., Horrocks, I.: Consequence-based reasoning beyond Horn ontologies. In: Proceedings of the Twenty-Second International Joint Conference on Articial Intelligence (IJCAI-11). vol. 22, pp. 10931098. AAAI Press (2011) 20. Simancík, F., Motik, B., Krötzsch, M.: Fixed parameter tractable reasoning in DLs via decomposition. In: Proceedings of the 24th International Workshop on Description Logics (DL 2011). CEUR Workshop Proceedings, vol. 745, pp. 400 410. CEUR-WS.org (2011)