=Paper= {{Paper |id=None |storemode=property |title=MORe: a Modular OWL Reasoner for Ontology Classification |pdfUrl=https://ceur-ws.org/Vol-1015/paper_8.pdf |volume=Vol-1015 |dblpUrl=https://dblp.org/rec/conf/ore/RomeroGHJ13 }} ==MORe: a Modular OWL Reasoner for Ontology Classification== https://ceur-ws.org/Vol-1015/paper_8.pdf
             MORe: a Modular OWL Reasoner
               for Ontology Classification

                  Ana Armas Romero, Bernardo Cuenca Grau,
                      Ian Horrocks, Ernesto Jiménez-Ruiz

               Department of Computer Science, University of Oxford



      Abstract. MORe exploits module extraction techniques to divide the
      workload of ontology classification between two reasoners: a reasoner for
      the lightweight profile EL of OWL 2, and a fully fledged OWL 2 reasoner.
      This division is carried out in such a way that the bulk of the workload
      is assigned, as much as possible, to the OWL 2 EL reasoner, in order to
      exploit the more efficient classification techniques specific to this profile.


1   Introduction
MORe [1] is an OWL 2 reasoning system dedicated to ontology classification
that integrates a general purpose OWL 2 reasoner (OWL reasoner for short)
and a reasoner specific for the OWL 2 EL1 profile (EL reasoner for short). The
current implementation of MORe uses ELK [8] as its EL reasoner, and offers the
possibility to choose between two OWL reasoners: HermiT 1.3.7 [4] and Pellet
2.3.0 [12]. The EL and OWL reasoners are, however, integrated in a “black-box”
way: our implementation of MORe provides the required infrastructure to bundle
any other OWL and/or EL reasoner.
    Given an input ontology O, MORe identifies a part of the classification of
O that can be computed using the EL reasoner and limits the use of the OWL
reasoner to a fragment of O as restricted as possible. The main advantage of
MORe lies in its “pay-as-you-go” behaviour when an OWL 2 EL ontology is
extended with axioms in a more expressive logic: the use of an efficient EL
reasoner is not necessarily precluded by the extension; in fact, it is to be expected
that the EL reasoner will still perform most of the computational work.
    MORe performs only terminological reasoning and ignores any assertional
axioms that the input ontology might contain. Therefore, completeness is only
guaranteed for ontologies that contain no ABox assertions.
    MORe2 is implemented in Java using the OWL API3 [6]. It can therefore
process ontologies in any format handled by the OWL API, such as RDF/XML,
OWL Functional Syntax, or OBO. It is available both as a Java library and a
Protégé4 plugin, and it can also be used via a commmand line interface.
1
  http://www.w3.org/TR/owl2-profiles/#OWL 2 EL
2
  https://code.google.com/p/more-reasoner/
3
  http://owlapi.sourceforge.net/
4
  http://protege.stanford.edu/
2      The Technique
The main idea behind the technique implemented in MORe is to identify, given
an ontology O with signature Sig(O), two subsets M1 , M2 of O such that
    • M1 is as small as possible;
    • the output of classifying M2 with the EL reasoner is complete for Sig(M2 )
      w.r.t. O (i.e. it contains all subsumption relations A v B entailed by O such
      that A ∈ Sig(M2 ));
    • the output of classifying M1 with the OWL reasoner is complete for Sig(M1 )
      w.r.t. O; and
    • Sig(M1 ) ∪ Sig(M2 ) = Sig(O).
    Our implementation of MORe relies on ELK, which does not yet implement
the whole of OWL 2 EL. The unsupported constructs are documented and hence
we can identify the fragment LELK of OWL 2 EL implemented by ELK.
    They key to identifying M1 and M2 is in computing an LELK -signature: a
signature Σ ELK ⊆ Sig(O) such that the ⊥-module for Σ ELK in O is an ontology
in the language LELK for which ELK is complete.
    The ⊥-module for O and Σ, M[O,Σ] , is the smallest subset of O such that all
axioms in O \ M[O,Σ] are ⊥-local w.r.t. Σ ∪ Sig(M[O,Σ] ). Intuitively, an axiom
α is ⊥-local w.r.t. Σ if replacing by ⊥ all occurrences in α of symbols not in Σ
would turn α into a syntactically recognisable tautology; e.g., the axiom A v B
is ⊥-local w.r.t. Σ = {B}. Cuenca Grau et al. [2] offer a deeper insight into the
notions of ⊥-module, ⊥-locality, and modularity in a more general sense. For
the scope of this system description, we only remark the following properties:
 1. For any class A in Sig(M[O,Σ] ):
    (a) if A is unsatisfiable in O then it is also unsatisfiable in M[O,Σ]
    (b) if another class B in Sig(O) is a superclass of A in O, then it is so in
         M[O,Σ] as well —and so B is in Sig(M[O,Σ] ) too.
 2. If Σ1 ⊆ Σ2 , then if some axiom α is ⊥-local w.r.t. Σ2 , it is also ⊥-local w.r.t.
    Σ1 , and therefore M[O,Σ1 ] ⊆ M[O,Σ2 ] .
 3. both checking ⊥-locality and extracting a ⊥-module can be done in polyno-
    mial time.
Property 1(b), in particular, is not shared by other kinds of modules, and makes
⊥-modules especially well suited for classification purposes.

2.1     Modular Combination of Reasoners
The integration of the two reasoners is performed as follows. Given an OWL 2
ontology O, MORe first tries to compute a nonempty LELK -signature Σ ELK for O
(details of how this is done are given in Section 2.2). If it suceeds, then ELK is
used to classify M[O,Σ ELK ] , and HermiT or Pellet to classify M[O,Sig(O)\Σ ELK ] ;
finally, both partial hierarchies are unified into a single one. If MORe fails to find
a nonempty LELK -signature, then it delegates the whole classification to either
HermiT or Pellet. Details about the correctness (soundness and completeness)
of this technique can be found in Armas Romero et al. [1].
2.2   Computing an LELK -signature
To find a suitable LELK -signature Σ ELK for a given ontology O, MORe first identi-
fies the set S of axioms that ELK cannot process, and —if possible— a subset Σ
of Sig(O) such that all the axioms in S are ⊥-local w.r.t. Σ. This alone, however,
does not guarantee that M[O,Σ] ∩ S = ∅.
Example 1. Consider the ontology Oex consisting of the following axioms:
                     A≡BtC          B ≡ D u ∃R.E       F v ∃R.G
All the axioms in Oex are in LELK except for α = A ≡ B t C. Now, we have that
α is ⊥-local w.r.t. a signature Σ iff Σ ∩ {A, B, C} = ∅, therefore, α is ⊥-local
w.r.t. Σ = Sig(Oex ) \ {A, B, C}. However, β = B ≡ D u ∃R.E is not ⊥-local w.r.t.
Σ, so β ∈ M[O,Σ] and B ∈ Sig(M[O,Σ] ), and therefore α is not ⊥-local w.r.t.
Σ ∪ Sig(M[O,Σ] ) and needs to be in M[O,Σ] .                                   ♦
   All we need to do is progressively reduce Σ until Sig(M[O,Σ] ) ⊆ Σ. This can
be done as follows:
 1. Let S0 be the set of axioms in O that are not in LELK and let Σ0 = Sig(O).
 2. Reduce Σ0 to some Σ1 ⊂ Σ0 such that S0 is ⊥-local w.r.t. Σ1 . If this is not
    possible, then make Σ1 = ∅.
 3. Compute the set S1 of axioms in M[O,Σ1 ] containing symbols outside Σ1 .
 4. Repeat Steps 2–3 until Si = ∅ (i.e. until Sig(M[O,Σi ] ) ⊆ Σi ) or Σi = ∅.
It is important to note that, in some cases, there may be several different ways
of obtaining Σi+1 from Σi .
Example 2. As shown in the previous example, taking Σ = Sig(Oex )\{A, B, C} is
not enough to keep A ≡ B t C outside M[O,Σ] . We need to remove more symbols
from Σ to keep B ≡ D u ∃R.E outside M[O,Σ] too. One possibility would be to
remove D from Σ, but we could also choose to remove R or E instead. It turns
out that choosing one option over another can change things substantially.
   If we chose to take Σ1 = Sig(Oex ) \ {A, B, C, R} then, because F v ∃R.G is not
⊥-local w.r.t. Σ1 and contains the symbol R, we would need to further reduce
Σ1 to some Σ2 ⊂ Σ1 .
   However, if we took Σ1 = Sig(Oex )\{A, B, C, D} or Σ1 = Sig(Oex )\{A, B, C, E},
then we would already have Sig(M[Oex ,Σ1 ] ) = Σ1 , and we would be done.       ♦
    The ⊥-module M[O,Sig(O)\Σ ELK ] that the OWL reasoner needs to classify is
likely to be smaller the larger Σ ELK is. Therefore, it is desirable to find heuristics
to choose each Σi in a way that leads to an LELK -signature as large as possible.
Below we describe the main heuristics that we have implemented in MORe.

Keeping Properties As far as possible, we try not to remove properties from
Σi . The reason for this is that most ontologies contain fewer properties than
classes, and each property usually appears in more axioms than any class. Thus,
removing a property from Σi is more likely to bring more axioms into the next
Si+1 and lead to a smaller LELK -signature.
Global Symbols We perform a preprocessing stage to identify a (possibly
empty) set Γ of global symbols in Sig(O), such that either Γ ⊆ Σ ELK or Σ ELK = ∅.
For this, we first find the set G of all global axioms in O, i.e. those that cannot
possibly be made ⊥-local, (e.g. axioms of the form > v C) and take Γ = Sig(G).
We then keep adding to Γ the signatures of all the axioms in O that would only
be ⊥-local w.r.t. Σ ELK if some symbol in Γ was left outside Σ ELK , until no more
symbols need to be added to Γ .
   Then, we will only ever consider sets Σi such that Γ ⊆ Σi . As the following
example shows, this can sometimes mark the difference between finding a non-
empty LELK -signature or not.

                                       0
Example 3. Consider the ontology Oex      = Oex ∪ {> v ∃R.E}. In the previous
example we saw how both Sig(Oex ) \ {A, B, C, D} and Sig(Oex ) \ {A, B, C, E} were
equally good choices when choosing a suitable Σ1 for Oex . This is not the case
                 0                             0
any more with Oex  , as after choosing Sig(Oex   ) \ {A, B, C, E} we would need to
try to keep > v ∃R.E outside M[Oex0 ,Σ ELK ] too; but this is not possible, so in the
end we would have Σ ELK = ∅.                                                       ♦


Reducing Nondeterminism In each iteration of the algorithm, instead of
considering the set Si as a whole, we split it into two subsets: Sinondet , containing
those axioms in S for which there are several ways in which Σi can be reduced
to make them ⊥-local, and Sidet , those for which there is only one way.
    Whenever Sidet 6= ∅, we obtain Σi+1 by removing from Σi the symbols re-
quired by each axiom in Sidet , and ignore Sinondet . When Sidet = ∅, we deal with
the axioms in Sinondet taking a greedy approach —finding the optimal solution
is often too expensive.
    The intuition behind this heuristic is that, by postponing making any nonde-
terministic decisions as much as possible, we might eliminate the need to make
them altogether.
    Note that, using this heuristic, we are not guaranteed to handle all the non
LELK -axioms in the first iteration any more, therefore we also have to consider
in each Si those non-LELK axioms that are still not ⊥-local w.r.t. Σi .

                                  00
Example 4. Consider the ontology Oex consisting of all the axioms in Oex , plus
the following additional axioms:

                   EvC       H v ∃R.E       I ≡ (E u F) t (G u H)

We first get S0nondet = {I ≡ (E u F) t (G u H)} and S0det = {A ≡ B t C}. The
new non-LELK axiom, I ≡ (E u F) t (G u H), goes into S0nondet because it could
be handled by removing any of the following sets of symbols: {I, E, G}, {I, F, G},
{I, E, H} or {I, F, H}. For now we only deal with S0det = {A ≡ B t C}, and we do
                            00
so by taking Σ0 = Sig(Oex      ) \ {A, B, C}.
    Then we obtain the sets S1nondet = {B ≡ D u ∃R.E, I ≡ (E u F) t (G u H)} and
                                                                00
S1det = {E v C} and we deal with E v C by taking Σ1 = Sig(Oex      ) \ {A, B, C, E}.
                           Table 1. Ontology metrics

              Metrics
                         Expressivity   |Sig(O)|     |O|     |O \ OLELK | |MOWL2 |
Ontology
Gazetteer                  ALE+         517,039 652,361           0         0%
Cardiac Electrophys.      SHF(D)         81,020 124,248          22         1%
Protein                      S           35,205 46,114           15        22%
Biomodels                  SRIF         187,577 439,248        22,104      45%
Cell Cycle v0.95            SRI         144,619 511,354           1       <0.1%
Cell Cycle v2.01            SRI         106,517 624,702           9        98%
NCI v09.12d                SH(D)        77,571     109,013      4,682      58%
NCI v13.03d                SH(D)        97,652     136,902       158       57%
SNOMED15t                 ALCR    291,216 291,185                15         3%
SNOMED+LUCADA           ALCRIQ(D) 309,405 550,453               122        0.1%



   In the next iteration, we get S2nondet = {I ≡ (E u F) t (G u H)} —note that
axiom B ≡ D u ∃R.E has been taken care of indirectly— and S2det = {H v ∃R.E},
                                                   00
and we handle H v ∃R.E by taking Σ2 = Sig(Oex         ) \ {A, B, C, E, H}.
                          nondet            det
   After that, we find S2        = ∅ and S2 = {I ≡ (E u F) t (G u H)}, and take
            00
Σ3 = Sig(Oex   ) \ {A, B, C, E, H, I}, which finally gives S3 = ∅. Thus, we have
            ELK
computed Σ       = Σ3 without making any nondeterministic decisions.


3   Evaluation

We have tested MORe using an Ubuntu 12.04 64-bit machine with 7.8 GiB of
RAM (fully assigned to the JVM) and an Intel Core i7-3770 CPU @ 3.40GHz x
8 processor. Our test ontology suite includes six BioPortal ontologies5 [3] —for
Biomodels we consider only its TBox—, two different versions of NCI6 [5], and
two extensions of SNOMED7 [9]: SNOMED15t was built from a 2012 version
of SNOMED, following the suggestions of domain experts, by adding 15 axioms
containing disjunctions; SNOMED+LUCADA was obtained by mapping a 2011
version of SNOMED to the terminological part of the LUCADA ontology8 [10,
11] using the ontology matching system LogMap [7]. Table 1 gives an overview
of the general features of these ontologies, including the number of non-LELK
axioms they contain and the size of the module extacted by MORe for the OWL
reasoner, M[O,Sig(O)\Σ ELK ] , referred to as MOWL2 in Table 1.
5
  http://bioportal.bioontology.org/ontologies/1397
6
  http://evs.nci.nih.gov/ftp1/NCI Thesaurus/archive
7
  http://www.ihtsdo.org/snomed-ct/
8
  The LUCADA ontology (ALCHI(D)) contains 476 entities and can be classified by
  both HermiT and Pellet in less than 2 seconds
                     Table 2. Classification times in seconds

                   Reasoner     MOReHermiT         MORePellet
                                            HermiT              Pellet
       Ontology                HermiT total        Pellet total
       Gazetteer                  0     20.6    651       0     20.3 1,414
       Cardiac Electrophys.      0.3    6.3     22.7     0.3    5.5     11.0
       Protein                   2.0    4.8     10.0     2.0    4.7    2,920
       Biomodels                 377    487     582      373    483    1,915
       Cell Cycle v0.95         <0.1    9.9    mem      <0.1 10.4 3,433
       Cell Cycle v2.01         mem     mem    mem      mem mem 3,435
       NCI v09.12d              244     252     261     256     266    93.6
       NCI v13.03d              45.1    62.7    68.4    45.7    62.9   191
       SNOMED15t                 4.5    25.4   1,395     4.4    22.9 4,314
       SNOMED+LUCADA             1.1    28.8   1,302     1.2    29.2 mem



    We analyse our results by comparing the performance of MORe using HermiT
vs. HermiT alone, and of MORe using Pellet vs. Pellet alone. A summary of all
results can be found in Table 2. mem indicates an out of memory error.
    When integrating HermiT, MORe is always able to improve, or at least main-
tain its performance. We remark the case of Cell Cycle v0.95, where the perfor-
mance is improved from an out of memory error to termination in under 10s.
    Integrating Pellet, however, sometimes has an unexpected effect. In the cases
of NCI v09.12d and Cell Cycle v2.01, Pellet takes longer to classify MOWL2 when
integrated in MORe than to classify the whole ontology on its own. This however,
does not happen when Pellet is used to classify MOWL2 independently of MORe.
We are still unsure about the causes of this phenomenon. Apart from these two
cases, MORe is still often able to improve on the performance of Pellet.
    It is worth mentioning that the reason why, in the case of Cell Cycle v2.01,
the portion of the ontology that the OWL reasoner has to process is so close
to the whole ontology (98%) is because the 9 non LELK axioms are symmetric
property axioms, which force their 9 respective properties out of Σ ELK , reducing
it to a very small set.


4   Conclusions and Future Directions

We are continuing to develop MORe, exploring new ways of further reducing
the workload assigned to a general purpose OWL 2 classification algorithm.
We are looking into the possibility of alternative modularity notions specific
for this application, and also into exploiting computational properties of other
lightweight ontology languages to combine our modular approach with one based
on finding lower and upper bounds for the classification.
Acknowledgements

This work was supported by the Royal Society, the Seventh Framework Program
(FP7) of the European Commission under Grant Agreement 318338, ”Optique”,
and the EPSRC projects Score!, ExODA and MaSI3 .


References
 1. Armas Romero, A., Cuenca Grau, B., Horrocks, I.: MORe: Modular Combination
    of OWL Reasoners for Ontology Classification. In: International Semantic Web
    Conference (ISWC). pp. 1–16 (2012)
 2. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontolo-
    gies: Theory and practice. J. Artificial Intelligence Research 31, 273–318 (2008)
 3. Fridman Noy, N., Shah, N.H., Whetzel, P.L., Dai, B., Dorf, M., Griffith, N., Jon-
    quet, C., Rubin, D.L., Storey, M.A.D., Chute, C.G., Musen, M.A.: BioPortal:
    ontologies and integrated data resources at the click of a mouse. Nucleic Acids
    Research 37(Web-Server-Issue), 170–173 (2009)
 4. Glimm, B., Horrocks, I., Motik, B., Shearer, R., Stoilos, G.: A novel approach to
    ontology classification. J. of Web Semantics 10(1) (2011)
 5. Golbeck, J., Fragoso, G., Hartel, F.W., Hendler, J.A., Oberthaler, J., Parsia, B.:
    The National Cancer Institute’s Thésaurus and Ontology. J. Web Semantics 1(1),
    75–80 (2003)
 6. Horridge, M., Bechhofer, S.: The OWL API: A Java API for OWL ontologies.
    Semantic Web 2(1), 11–21 (2011)
 7. Jiménez-Ruiz, E., Cuenca Grau, B., Zhou, Y., Horrocks, I.: Large-scale interactive
    ontology matching: Algorithms and implementation. In: European Conference on
    Artificial Intelligence (ECAI). pp. 444–449 (2012)
 8. Kazakov, Y., Krötzsch, M., Simancik, F.: Concurrent classification of EL ontolo-
    gies. In: International Semantic Web Conference (ISWC). pp. 305–320 (2011)
 9. Schulz, S., Cornet, R., Spackman, K.A.: Consolidating SNOMED CT’s ontological
    commitment. Applied Ontology 6(1), 1–11 (2011)
10. Sesen, M.B., Bañares-Alcántara, R., Fox, J., Kadir, T., Brady, J.M.: Lung Cancer
    Assistant: An Ontology-Driven, Online Decision Support Prototype for Lung Can-
    cer Treatment Selection. In: OWL: Experiences and Directions Workshop (2012)
11. Sesen, M.B., Jiménez-Ruiz, E., Bañares-Alcántara, R., Brady, J.M.: Evaluating
    OWL 2 Reasoners in the context of Clinical Decision Support in Lung Cancer
    Treatment Selection. In: OWL Reasoner Evaluation (ORE) Workshop (2013)
12. Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A., Katz, Y.: Pellet: A practical
    OWL DL reasoner. J. of Web Semantics 5(2), 51–53 (2007)