
Just: a Tool for Computing Justifications w.r.t. ELH Ontologies

Michel Ludwig

Theoretical Computer Science, TU Dresden, Germany
michel@tcs.inf.tu-dresden.de

We introduce the tool Just for computing justifications for general concept inclusions w.r.t. ontologies formulated in the description logic EL extended with role inclusions. The computation of justifications in Just is based on saturating the input axioms under all possible inferences w.r.t. a consequence-based calculus. We give an overview of the implemented techniques and we conclude with an experimental evaluation of the performance of Just when applied to several practical ontologies.

Introduction

We first give a brief overview of the implemented techniques, and we conclude the paper with an experimental evaluation of the performance of Just on several practical ontologies. The tool and proofs establishing its correctness and completeness are available from http://lat.inf.tu-dresden.de/~michel/software/just/

Preliminaries

We assume the reader to be familiar with basic notions of description logics [1]. An ELH-ontology, or ELH-TBox $\mathcal{T}$, is a finite set of axioms of the form $C \sqsubseteq D$ or $r \sqsubseteq s$, where $C$, $D$ are EL-concepts and $r$, $s$ are role names. An EL-concept $C$ is either a concept name, or a concept that makes use of the concept constructors $\top$, $\sqcap$, $\exists r.D$ only, where $r$ is a role name and $D$ an EL-concept. We assume standard set-theoretic semantics.

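We now formally introduce the main notion relevant for our tool.

Definition 1. Let $\mathcal{T}$ be an ELH-TBox and let $C$, $D$ be EL-concepts. A justification¹ for $\mathcal{T} \models C \sqsubseteq D$ is a subset $\mathcal{M} \subseteq \mathcal{T}$ such that $\mathcal{M} \models C \sqsubseteq D$, and for every $\mathcal{M}' \subsetneq \mathcal{M}$ it holds that $\mathcal{M}' \not\models C \sqsubseteq D$. The set of all the justifications for $\mathcal{T} \models C \sqsubseteq D$ will be denoted by $\mathrm{Just}_{\mathcal{T}}(C \sqsubseteq D)$.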

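Example 1. Let $\mathcal{T} = \{A \sqsubseteq X,\ A \sqsubseteq Y,\ X \sqsubseteq \exists r.Y,\ \exists r.Y \sqsubseteq B,\ Y \sqsubseteq B,\ Y \sqsubseteq Y',\ Y' \sqsubseteq Y\}$. Then the set $\mathrm{Just}_{\mathcal{T}}(A \sqsubseteq B)$ consists of the two sets $\{A \sqsubseteq X,\ X \sqsubseteq \exists r.Y,\ \exists r.Y \sqsubseteq B\}$ and $\{A \sqsubseteq Y,\ Y \sqsubseteq B\}$.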
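Definition 1 can be checked directly on Example 1 by brute force. The following is only a minimal sketch and not the procedure used in Just: it enumerates subsets by increasing size and keeps those that entail the inclusion and contain no smaller justification. The tuple encoding of axioms and the function names `entails` and `all_justifications` are assumptions made for this illustration, and role inclusions are not covered.

```python
from itertools import combinations

# Hypothetical axiom encoding (an assumption, not Just's input format):
#   ("sub", A, B)      stands for A ⊑ B
#   ("ex_r", A, r, X)  stands for A ⊑ ∃r.X
#   ("ex_l", r, X, B)  stands for ∃r.X ⊑ B

def concept_names(axioms):
    """Collect the concept names occurring in the given axioms."""
    names = set()
    for ax in axioms:
        if ax[0] == "sub":
            names.update((ax[1], ax[2]))
        elif ax[0] == "ex_r":
            names.update((ax[1], ax[3]))
        else:  # "ex_l"
            names.update((ax[2], ax[3]))
    return names

def entails(axioms, a, b):
    """Decide a ⊑ b for concept names a, b by a simple completion procedure."""
    names = concept_names(axioms) | {a, b}
    subs = {c: {c} for c in names}   # subs[c]: derived subsumers of c
    succ = set()                     # (c, r, d): derived c ⊑ ∃r.d
    changed = True
    while changed:
        changed = False
        for c in names:
            for ax in axioms:
                if ax[0] == "sub" and ax[1] in subs[c] and ax[2] not in subs[c]:
                    subs[c].add(ax[2])
                    changed = True
                elif (ax[0] == "ex_r" and ax[1] in subs[c]
                      and (c, ax[2], ax[3]) not in succ):
                    succ.add((c, ax[2], ax[3]))
                    changed = True
        for (c, r, d) in list(succ):
            for ax in axioms:
                if (ax[0] == "ex_l" and ax[1] == r and ax[2] in subs[d]
                        and ax[3] not in subs[c]):
                    subs[c].add(ax[3])
                    changed = True
    return b in subs[a]

def all_justifications(tbox, a, b):
    """Enumerate subsets by size; an entailing set without a smaller
    justification inside it is minimal."""
    found = []
    axioms = list(tbox)
    for k in range(len(axioms) + 1):
        for subset in combinations(axioms, k):
            m = frozenset(subset)
            if any(j <= m for j in found):
                continue
            if entails(m, a, b):
                found.append(m)
    return found

# The TBox of Example 1.
T = [
    ("sub", "A", "X"), ("sub", "A", "Y"), ("ex_r", "X", "r", "Y"),
    ("ex_l", "r", "Y", "B"), ("sub", "Y", "B"),
    ("sub", "Y", "Y'"), ("sub", "Y'", "Y"),
]
for just in all_justifications(T, "A", "B"):
    print(sorted(just))
```

The enumeration visits up to $2^{|\mathcal{T}|}$ subsets, so it is only usable on toy inputs such as this one.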

Note that there can exist exponentially many justifications for a given TBox $\mathcal{T}$ and a concept inclusion $C \sqsubseteq D$. Hence, unlike with standard reasoning in ELH, it is not possible to compute all justifications in polynomial time w.r.t. the size of the TBox in every case [2]. It is however still (theoretically) possible to compute one justification in polynomial time.
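One standard way to obtain a single justification with polynomially many entailment tests is deletion-based contraction: each axiom is tentatively removed once and stays removed if the entailment survives. The sketch below illustrates this under strong simplifying assumptions: axioms are restricted to inclusions between concept names, encoded as pairs `(x, y)` for $x \sqsubseteq y$, so that entailment reduces to graph reachability. The function names are hypothetical.

```python
def entails(axioms, a, b):
    """Reachability test: do the atomic inclusions (x, y) entail a ⊑ b?"""
    reached, frontier = {a}, [a]
    while frontier:
        x = frontier.pop()
        for (lhs, rhs) in axioms:
            if lhs == x and rhs not in reached:
                reached.add(rhs)
                frontier.append(rhs)
    return b in reached

def one_justification(tbox, a, b):
    """Return one minimal subset of tbox entailing a ⊑ b, or None."""
    if not entails(tbox, a, b):
        return None
    current = list(tbox)
    for ax in list(tbox):
        candidate = [x for x in current if x != ax]
        # Keep the axiom out only if the inclusion still follows without it.
        if entails(candidate, a, b):
            current = candidate
    return set(current)

tbox = [("A", "X"), ("X", "B"), ("A", "Y"), ("Y", "B"), ("Y", "Y2"), ("Y2", "Y")]
print(one_justification(tbox, "A", "B"))
```

Since entailment is monotone, one pass over the axioms suffices: an axiom that was needed when it was tested remains needed after later removals. Which justification is returned depends on the order in which the axioms are tested.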

Tool Overview & Computation Techniques

Just v 0.1 is implemented in Java and it takes an ELH-TBox $\mathcal{T}$ and two EL-concepts $C$, $D$ as input. The default operation mode of Just is then to compute and output the set $\mathrm{Just}_{\mathcal{T}}(C \sqsubseteq D)$. Alternatively, the tool can be instructed to search for one justification for $\mathcal{T} \models C \sqsubseteq D$ only.

We first focus on describing how the computation of justifications for inclusions between concept names w.r.t. normalised TBoxes is performed in Just. A TBox $\mathcal{T}$ is said to be in normal form if every axiom of $\mathcal{T}$ is of one of the following forms: $A_1 \sqcap \dots \sqcap A_n \sqsubseteq B$, $A \sqsubseteq \exists r.\overline{X}$, $\exists r.\overline{X} \sqsubseteq B$, or $r \sqsubseteq s$, where $n \geq 0$, $A$, $A_i$, $B$ are concept names, and an expression of the form $\overline{X}$ stands either for the concept name $X$ or for the $\top$-concept.

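The computation of justifications for inclusions between concept names w.r.t. a normalised TBox $\mathcal{T}$ is performed by a saturation procedure that computes all the (minimal) derivations for inclusions of the form $A \sqsubseteq B$, $\top \sqsubseteq A$, or $\top \sqsubseteq \top$, for concept names $A$ and $B$, using the calculus $\mathbf{T}$ depicted in Fig. 1. The calculus $\mathbf{T}$ is related to calculi used for consequence-based reasoning [6]. Note that expressions of the form $\overline{A}$ must be consistently instantiated (either with concept names or $\top$) in the premise and the conclusion of inference rules.

¹ Justifications are also known as MinAs in the literature.

\[
\begin{array}{lcl}
(\mathsf{Ax}) & \dfrac{}{\overline{A} \sqsubseteq \overline{A}} & \\[3ex]
(\mathsf{AxTop}) & \dfrac{}{\overline{A} \sqsubseteq \top} & \\[3ex]
(\mathsf{RoleAx}) & \dfrac{}{r \sqsubseteq r} & \\[3ex]
(\mathsf{RoleSucc}) & \dfrac{r \sqsubseteq s}{r \sqsubseteq t} & \text{if } s \sqsubseteq t \in \mathcal{T} \\[3ex]
(\mathsf{Ex}) & \dfrac{\overline{X} \sqsubseteq \overline{Y} \qquad r \sqsubseteq s}{A \sqsubseteq B} & \text{if } A \sqsubseteq \exists r.\overline{X} \in \mathcal{T} \text{ and } \exists s.\overline{Y} \sqsubseteq B \in \mathcal{T} \\[3ex]
(\mathsf{Conj}) & \dfrac{\overline{A} \sqsubseteq X_1 \quad \dots \quad \overline{A} \sqsubseteq X_n}{\overline{A} \sqsubseteq B} & \text{if } X_1 \sqcap \dots \sqcap X_n \sqsubseteq B \in \mathcal{T} \text{ with } n \geq 0 \\[3ex]
(\mathsf{Merge}) & \dfrac{\overline{A} \sqsubseteq X \qquad X \sqsubseteq B}{\overline{A} \sqsubseteq B} &
\end{array}
\]

Fig. 1. The calculus $\mathbf{T}$.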

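The axioms used in the application of the inference rules for generating a derivation for $X \sqsubseteq Y$ yield a subset of $\mathcal{T}$ from which the consequence $X \sqsubseteq Y$ logically follows. The minimal such axiom sets resulting from all the (minimal) derivations for $X \sqsubseteq Y$ are therefore the justifications for $X \sqsubseteq Y$ w.r.t. $\mathcal{T}$.

Example 2. Let $\mathcal{T}$ be defined as in Example 1. Then, for instance, the following two derivations $\delta_1$ and $\delta_2$ for the inclusion $A \sqsubseteq B$ can be generated w.r.t. $\mathcal{T}$ using the calculus $\mathbf{T}$. We associate with each derivation $\delta_i$ a set $\mathrm{Axioms}(\delta_i)$ which consists of the axioms of $\mathcal{T}$ that were used in the inference rule applications occurring in $\delta_i$. First, $\delta_1$ is given as follows:

\[
\dfrac{
  \dfrac{\dfrac{}{A \sqsubseteq A}\;(\mathsf{Ax})}{A \sqsubseteq Y}\;(\mathsf{Conj})
  \qquad
  \dfrac{\dfrac{}{Y \sqsubseteq Y}\;(\mathsf{Ax})}{Y \sqsubseteq B}\;(\mathsf{Conj})
}{A \sqsubseteq B}\;(\mathsf{Merge})
\]

We have $\mathrm{Axioms}(\delta_1) = \{A \sqsubseteq Y,\ Y \sqsubseteq B\}$. The derivation $\delta_2$ can be depicted as

\[
\dfrac{
  \dfrac{\dfrac{}{A \sqsubseteq A}\;(\mathsf{Ax})}{A \sqsubseteq Y}\;(\mathsf{Conj})
  \qquad
  \dfrac{
    \dfrac{
      \dfrac{\dfrac{}{Y \sqsubseteq Y}\;(\mathsf{Ax})}{Y \sqsubseteq Y'}\;(\mathsf{Conj})
      \qquad
      \dfrac{\dfrac{}{Y' \sqsubseteq Y'}\;(\mathsf{Ax})}{Y' \sqsubseteq Y}\;(\mathsf{Conj})
    }{Y \sqsubseteq Y}\;(\mathsf{Merge})
  }{Y \sqsubseteq B}\;(\mathsf{Conj})
}{A \sqsubseteq B}\;(\mathsf{Merge})
\]

with $\mathrm{Axioms}(\delta_2) = \{A \sqsubseteq Y,\ Y \sqsubseteq B,\ Y \sqsubseteq Y',\ Y' \sqsubseteq Y\} \supsetneq \mathrm{Axioms}(\delta_1)$.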

Following the example of derivation $\delta_2$, it is possible to construct infinitely many derivations for the inclusion $A \sqsubseteq B$ from the TBox $\mathcal{T}$ (as defined in Example 1). However, one can prove that for finding all justifications it is sufficient to construct so-called admissible derivations only, in which every subderivation $\delta'$ of an inclusion $\alpha$ does not contain an occurrence of $\alpha$ as a premise in $\delta'$. It is easy to see that there can only exist finitely many admissible derivations w.r.t. a normalised TBox $\mathcal{T}$ for a given inclusion $\alpha$.
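The idea of saturating while recording the axioms used can be sketched on a small fragment of the calculus. The code below is an illustration only, not the implementation in Just: instead of building derivations explicitly, it labels every derived inclusion with the minimal axiom sets found so far, which also guarantees termination. It covers (Ax), (Conj) for $n = 1$, (Ex) without role inclusions (so the same role is required on both sides), and (Merge). The axiom encoding and all function names are assumptions.

```python
from itertools import product

# Hypothetical encoding: ("sub", A, B) for A ⊑ B, ("ex_r", A, r, X) for
# A ⊑ ∃r.X, and ("ex_l", r, X, B) for ∃r.X ⊑ B.

def add_label(labels, key, new):
    """Store axiom set `new` for inclusion `key` unless a subset is stored."""
    old = labels.setdefault(key, set())
    if any(s <= new for s in old):
        return False  # `new` is not minimal, discard it
    # Drop stored supersets of `new`, then record `new`.
    labels[key] = {s for s in old if not new <= s} | {new}
    return True

def justifications(tbox, a, b):
    """Minimal axiom sets deriving a ⊑ b in the supported fragment."""
    names = {a, b}
    for ax in tbox:
        if ax[0] == "sub":
            names.update((ax[1], ax[2]))
        elif ax[0] == "ex_r":
            names.update((ax[1], ax[3]))
        else:
            names.update((ax[2], ax[3]))
    labels = {}
    for c in names:
        add_label(labels, (c, c), frozenset())                # (Ax)
    changed = True
    while changed:
        changed = False
        snapshot = {k: set(v) for k, v in labels.items()}
        for ax in tbox:
            if ax[0] == "sub":                                # (Conj), n = 1
                for (c, d), sets in snapshot.items():
                    if d == ax[1]:
                        for s in sets:
                            changed |= add_label(labels, (c, ax[2]), s | {ax})
            elif ax[0] == "ex_r":                             # (Ex)
                for ax2 in tbox:
                    if ax2[0] == "ex_l" and ax2[1] == ax[2]:
                        for s in snapshot.get((ax[3], ax2[2]), ()):
                            changed |= add_label(
                                labels, (ax[1], ax2[3]), s | {ax, ax2})
        for (c, x), left in snapshot.items():                 # (Merge)
            for (x2, d), right in snapshot.items():
                if x == x2:
                    for s1, s2 in product(left, right):
                        changed |= add_label(labels, (c, d), s1 | s2)
    return labels.get((a, b), set())

# The TBox of Example 1.
T = [
    ("sub", "A", "X"), ("sub", "A", "Y"), ("ex_r", "X", "r", "Y"),
    ("ex_l", "r", "Y", "B"), ("sub", "Y", "B"),
    ("sub", "Y", "Y'"), ("sub", "Y'", "Y"),
]
for just in justifications(T, "A", "B"):
    print(sorted(just))
```

On the TBox of Example 1 the axiom set of $\delta_2$ is generated as well, but it is discarded because it strictly contains the axiom set of $\delta_1$.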

To compute justifications w.r.t. general TBoxes, Just keeps track of which original axiom corresponds to which normalised axiom. The justifications for a consequence $C \sqsubseteq D$ w.r.t. a general TBox $\mathcal{T}$ are then generated from the justifications for $C \sqsubseteq D$ w.r.t. the normalisation of $\mathcal{T}$ by successively replacing every normalised axiom with all the axioms from which it originates. The minimal sets obtained in that way are the justifications w.r.t. the general TBox.

Example 3. Let $\mathcal{T} = \{A \sqsubseteq B \sqcap \exists r.\top,\ A \sqsubseteq B \sqcap \exists s.\top\}$ and let $\mathcal{T}_N = \{A \sqsubseteq B,\ A \sqsubseteq \exists r.\top,\ A \sqsubseteq \exists s.\top\}$ be the normalisation of $\mathcal{T}$. Then $\mathrm{Just}_{\mathcal{T}_N}(A \sqsubseteq B) = \{\{A \sqsubseteq B\}\}$. As the axiom $A \sqsubseteq B$ in $\mathcal{T}_N$ originates from both axioms in $\mathcal{T}$, we obtain that $\mathrm{Just}_{\mathcal{T}}(A \sqsubseteq B)$ consists of the two sets $\{A \sqsubseteq B \sqcap \exists r.\top\}$ and $\{A \sqsubseteq B \sqcap \exists s.\top\}$.
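The mapping back to original axioms can be sketched as follows. This is an illustration under assumptions rather than the code of Just: axioms are plain strings, `origins` is a hypothetical dictionary from each normalised axiom to the set of original axioms it stems from, and every way of choosing one origin per normalised axiom yields a candidate set, of which only the minimal ones are kept.

```python
from itertools import product

def lift_justifications(normalised_justs, origins):
    """Map justifications over the normalised TBox back to original axioms."""
    candidates = set()
    for just in normalised_justs:
        # One original axiom is picked for every normalised axiom.
        choices = [list(origins[ax]) for ax in just]
        for pick in product(*choices):
            candidates.add(frozenset(pick))
    # Keep only the candidates that have no strictly smaller candidate.
    return {c for c in candidates if not any(o < c for o in candidates)}

# Example 3: the normalised axiom A ⊑ B stems from both original axioms.
origins = {
    "A ⊑ B": {"A ⊑ B ⊓ ∃r.⊤", "A ⊑ B ⊓ ∃s.⊤"},
    "A ⊑ ∃r.⊤": {"A ⊑ B ⊓ ∃r.⊤"},
    "A ⊑ ∃s.⊤": {"A ⊑ B ⊓ ∃s.⊤"},
}
print(lift_justifications([frozenset({"A ⊑ B"})], origins))
```

The minimality filter matters when several normalised axioms of one justification share an origin, since different choices can then produce sets that contain one another.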

The computation of justifications for inclusions of the form $C \sqsubseteq D$ w.r.t. $\mathcal{T}$, where $C$ and $D$ are not necessarily concept names, can be done analogously by computing justifications for $A_C \sqsubseteq A_D$ w.r.t. $\mathcal{T} \cup \mathcal{X}$, where $\mathcal{X} = \{A_C \sqsubseteq C,\ D \sqsubseteq A_D\}$ for fresh concept names $A_C$ and $A_D$. The justifications for $C \sqsubseteq D$ are then the minimal sets obtained from the justifications for $A_C \sqsubseteq A_D$ after removing the axioms contained in $\mathcal{X}$.

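Example 4. Let $\mathcal{T} = \{A \sqsubseteq B\}$ and let $\mathcal{X} = \{A_C \sqsubseteq A \sqcap Y,\ B \sqsubseteq A_D\}$. Then $\mathrm{Just}_{\mathcal{T} \cup \mathcal{X}}(A_C \sqsubseteq A_D) = \{\{A_C \sqsubseteq A \sqcap Y,\ B \sqsubseteq A_D,\ A \sqsubseteq B\}\}$ and $\mathrm{Just}_{\mathcal{T}}(A \sqcap Y \sqsubseteq B) = \{\{A \sqsubseteq B\}\}$.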
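The post-processing step of this reduction is small enough to sketch directly. The snippet assumes that the justifications for $A_C \sqsubseteq A_D$ w.r.t. $\mathcal{T} \cup \mathcal{X}$ have already been computed by some other means; axioms are represented as plain strings and the function name `strip_auxiliary` is hypothetical.

```python
def strip_auxiliary(justs, auxiliary):
    """Remove the auxiliary axioms and keep the minimal remaining sets."""
    aux = frozenset(auxiliary)
    stripped = {frozenset(j) - aux for j in justs}
    return {s for s in stripped if not any(o < s for o in stripped)}

# Example 4: X contains the two axioms over the fresh names AC and AD.
X = {"AC ⊑ A ⊓ Y", "B ⊑ AD"}
justs = [{"AC ⊑ A ⊓ Y", "B ⊑ AD", "A ⊑ B"}]
print(strip_auxiliary(justs, X))
```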

Similarly to [9], Just first extracts a reachability-based module for $X \sqsubseteq Y$ before starting to compute all the (minimal) derivations for $X \sqsubseteq Y$. Moreover, we used techniques from resolution-based theorem provers to obtain an acceptable performance of our tool in practice. In particular, whenever a derivation $\delta'$ for an inclusion $\alpha$ was generated after another derivation $\delta$ for $\alpha$ had already been obtained, and all the axioms associated with $\delta$ were contained in the set of axioms associated with $\delta'$, then $\delta'$ was discarded. Such a deletion strategy corresponds to forward subsumption deletion in resolution theorem provers. We also employed a technique equivalent to backward subsumption deletion.
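A reachability-based module can be sketched as a fixpoint over signatures. The code below follows one common formulation, in which an axiom becomes reachable once every symbol on its left-hand side is reachable; whether Just uses exactly this definition is an assumption. Each axiom is abstracted to a hypothetical pair `(lhs_symbols, rhs_symbols)` of concept and role names.

```python
def reachability_module(axioms, seed):
    """Collect the axioms reachable from the symbols in `seed`."""
    reachable = set(seed)
    module = []
    remaining = list(axioms)
    changed = True
    while changed:
        changed = False
        for ax in list(remaining):
            lhs, rhs = ax
            # An axiom is reachable once all its left-hand symbols are.
            if set(lhs) <= reachable:
                module.append(ax)
                remaining.remove(ax)
                reachable |= set(rhs)
                changed = True
    return module

axioms = [
    (frozenset({"A"}), frozenset({"X"})),        # A ⊑ X
    (frozenset({"X"}), frozenset({"r", "Y"})),   # X ⊑ ∃r.Y
    (frozenset({"r", "Y"}), frozenset({"B"})),   # ∃r.Y ⊑ B
    (frozenset({"C"}), frozenset({"D"})),        # C ⊑ D, unrelated to A
]
print(reachability_module(axioms, {"A"}))
```

Axioms whose left-hand side mentions a symbol that never becomes reachable, such as the last one above, are left out of the module, so the saturation starts from a smaller axiom set.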

The computation of a single justification is currently implemented by stopping the saturation process after a derivation for the target inclusion has been found.² As the axioms occurring in this derivation might not represent a justification yet, superfluous axioms are identified by checking for each axiom whether the target inclusion still follows (using the reasoner ELK [7]) after the considered axiom has been removed.

² Note that in Just v 0.1 the saturation step is not guaranteed to finish in polynomial time w.r.t. the size of the input TBox, even when only one justification is computed.

Evaluation

For our experiments we picked three ontologies that are typically considered to pose different challenges to DL reasoners and that are expressed mainly in ELH: version 13.12e of the NCI thesaurus,³ the January 2010 international release of SNOMED CT, and the GALEN-OWL ontology.⁴ In the case of NCI, all the 152 axioms that fell outside the considered ELH fragment were first removed from the ontology. The number of axioms, concept names, and role names in the resulting ontologies is shown in Tbl. 1. For each of the three ontologies $\mathcal{T}$ we then randomly selected 1000 inclusions between concept names, $A \sqsubseteq B$, such that $\mathcal{T} \models A \sqsubseteq B$ holds.

In a first set of experiments we used Just and the algorithm for computing all justifications implemented in the OWL-API [3, 4] (using the reasoner FaCT++⁵ [10]) to compute all the justifications for each selected inclusion w.r.t. the respective ontology. In a second series of experiments we instructed Just to only search for one justification for each considered inclusion. All experiments were conducted on a PC equipped with an Intel i5-2500K CPU running at 3.30 GHz and with 16 GiB of main memory. An execution timeout of 10 CPU minutes was imposed on each problem.

The results obtained for computing all justifications are shown in Tbl. 2. The first column indicates which ontology was used. The next five columns then show the results obtained with Just, whereas the last two columns refer to the OWL-API implementation. More precisely, the second and seventh columns indicate the number of successful computations within the given time limit by the respective implementations. The average and the maximal number, as well as the maximal size of the justifications as computed by Just are shown in the next three columns. The sixth and eighth columns contain the average CPU time required by the respective implementations for the successful computations over each considered set of inclusions.

³ http://evs.nci.nih.gov/ftp1/NCI_Thesaurus
⁴ http://owl.cs.manchester.ac.uk/research/co-ode/
⁵ Note that at the time of writing it was not possible to use the reasoner ELK in combination with the OWL-API implementation for computing justifications.

Regarding the computation of all justifications with Just, the lowest number of successful computations was observed for GALEN-OWL, despite it being the smallest ontology of the corpus. The largest size, the average and maximal number of justifications, as well as the longest average computation times involving Just, were found for SNOMED CT. No computation succeeded within the allocated time for experiments involving GALEN-OWL using the OWL-API implementation. In general, we observed fewer timeouts and shorter average computation times with Just than with the OWL-API.

The results obtained for computing only one justification using Just are given in Tbl. 3. Analogously to Tbl. 2, the number of successful computations and the maximal size of the computed justifications are given in the second and third columns. The last column shows the average CPU time required to compute one justification.

One can see that fewer timeouts occurred when only one justification, instead of all justifications, was computed with Just. The lowest number of successful computations was now observed for SNOMED CT, whereas only 22 computations involving GALEN-OWL did not succeed within the given time limit. The longest average computation times were reported for GALEN-OWL, and the largest justification was again computed for SNOMED CT.

In all the successful computations Just required at most 14.13 GiB of main memory.

Conclusion

We presented the tool Just for computing either one, or all the justifications for general concept inclusions w.r.t. ontologies formulated in ELH. Our experimental evaluation showed that Just is capable of finding all the justifications for consequences in many practical cases within a reasonable time, even when a large number of justifications exist.

As future work we aim to implement an extended calculus which would allow Just to compute justifications for more expressive description logics. Our tool could also benefit from a more goal-oriented construction of derivations and from an improved module extraction procedure that yields smaller modules. It would also be interesting to implement an incremental computation of justifications, which would permit the tool to generate as many justifications as requested by the user. Finally, we also plan to perform a more extensive comparison of the performance of our tool against other approaches for computing justifications.

References

1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2nd edn. (2007)

2. Baader, F., Peñaloza, R., Suntisrivaraporn, B.: Pinpointing in the description logic EL+. In: Proceedings of the 30th Annual German Conference on AI (KI 2007). Lecture Notes in Computer Science, vol. 4667, pp. 52–67. Springer (2007)

3. Horridge, M., Bechhofer, S.: The OWL API: A Java API for OWL ontologies. Semantic Web 2(1), 11–21 (2011)

4. Kalyanpur, A., Parsia, B., Horridge, M., Sirin, E.: Finding all justifications of OWL DL entailments. In: Proceedings of the 6th International Semantic Web Conference & 2nd Asian Semantic Web Conference (ISWC 2007 & ASWC 2007). Lecture Notes in Computer Science, vol. 4825, pp. 267–280. Springer (2007)

5. Kalyanpur, A., Parsia, B., Sirin, E., Hendler, J.A.: Debugging unsatisfiable classes in OWL ontologies. Journal of Web Semantics 3(4), 268–293 (2005)

6. Kazakov, Y.: Consequence-driven reasoning for Horn SHIQ ontologies. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI'09). pp. 2040–2045 (2009)

7. Kazakov, Y., Krötzsch, M., Simančík, F.: The incredible ELK: From polynomial procedures to efficient reasoning with EL ontologies. Journal of Automated Reasoning 53(1), 1–61 (2014)

8. Suntisrivaraporn, B.: Finding all justifications in SNOMED CT. ScienceAsia 39(1), 79–90 (2013)

9. Suntisrivaraporn, B., Qi, G., Ji, Q., Haase, P.: A modularization-based approach to finding all justifications for OWL DL entailments. In: Proceedings of the 3rd Asian Semantic Web Conference (ASWC 2008). Lecture Notes in Computer Science, vol. 5367, pp. 1–15. Springer (2008)

10. Tsarkov, D., Horrocks, I.: FaCT++ description logic reasoner: System description. In: Proceedings of the Third International Joint Conference on Automated Reasoning (IJCAR 2006). Lecture Notes in Computer Science, vol. 4130, pp. 292–297. Springer (2006)

11. Zhou, Z., Qi, G., Suntisrivaraporn, B.: A new method of finding all justifications in OWL 2 EL. In: Proceedings of the 2013 IEEE/WIC/ACM International Conferences on Web Intelligence (WI 2013). pp. 213–220. IEEE Computer Society (2013)