<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ELK Reasoner: Architecture and Evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yevgeny Kazakov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Markus Krötzsch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>František Simančík</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Artificial Intelligence, Ulm University</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>ELK is a specialized reasoner for the lightweight ontology language OWL EL. The practical utility of ELK is in its combination of high performance and comprehensive support for language features. At its core, ELK employs a consequence-based reasoning engine that can take advantage of multi-core and multi-processor systems. A modular architecture allows ELK to be used as a stand-alone application, Protégé plug-in, or programming library (either with or without the OWL API). This system description presents the current state of ELK and experimental results with some difficult OWL EL ontologies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The logic-based ontology language OWL is becoming increasingly popular in
application areas, such as Biology and Medicine, which require dealing with a
large number of technical terms. For example, medical ontology SNOMED CT
provides formal description of over 300,000 medical terms covering various topics
such as diseases, anatomy, and clinical procedures. Terminological reasoning,
such as automatic classification of terms according to subclass (a.k.a. ‘is-a’)
relations, plays one of the central roles in applications of biomedical ontologies.
To effectively deal with large ontologies, several profiles of the W3C standard
OWL 2 have been defined [11]. Among them, the OWL EL profile aims to provide
tractable terminological reasoning. Specialized OWL EL reasoners, such as CEL
[1], Snorocket [9], and jCEL [10], can offer a significant performance improvement
over general-purpose OWL reasoners.</p>
      <p>This paper describes the ELK system.3 ELK is developed to provide high
performance reasoning support for OWL EL ontologies. The main focus of the
system is (i) extensive coverage of the OWL EL features, (ii) high performance of
reasoning, and (iii) easy extensibility and use. In these regards, ELK can already
offer advantages over other OWL EL reasoning systems mentioned above. For
example, as of today, ELK is the only system that can utilize multiple
processors/cores to speed up the reasoning process, which makes it possible to classify
SNOMED CT in less than 10 seconds on a commodity hardware [4]. This paper
presents an overview of the implementation techniques used in ELK to achieve
high performance of classification, and provides an experimental evaluation of
classification using ELK and related reasoners on some of the largest available
OWL EL ontologies.</p>
    </sec>
    <sec id="sec-2">
      <title>3 http://elk-reasoner.googlecode.com/</title>
      <p>Command-line Client
Protégé Plugin</p>
      <p>OWL API Bindings
Saturation
Taxonomy</p>
      <p>Reasoner</p>
      <p>Job
Manager</p>
      <p>ELK is a flexible system that can be used in a variety of configurations. This
is supported by a modular program structure that is organized using the Apache
Maven build manager for Java. Maven can be used to automatically download,
configure, and build ELK and its dependencies, but there are also pre-built
packages for the most common configurations. The modular structure also separates
the consequence-based reasoning engine from the remaining components, which
facilitates extension of the system with new language features. The latest
stable release ELK 0.2.0 supports conjunction (ObjectIntersectionOf), existential
restriction (ObjectSomeValuesFrom), the top class (owl:Thing), complex role
inclusions (property chains), and syntactic datatype matching. Support for
disjointness axioms, ABox facts (assertions), and datatypes is under development.</p>
      <p>The main software modules of ELK are shown in Fig. 1. The arrows illustrate
the information flow during classification. The two independent entry points are
the command-line client and the Protégé plug-in to the left. The former extracts
OWL ontologies from files in OWL Functional-Style Syntax (FSS), while the
latter uses ELK’s bindings to the OWL API4 to get this data from Protégé.5 All
further processing is based on ELK’s own representation of OWL objects (axioms
and expressions) that does not depend on the (more heavyweight) OWL API.
The core of ELK is its reasoning module, which will be discussed in detail.</p>
      <p>Useful combinations of ELK’s modules are distributed in three pre-built
packages, each of which includes the ELK reasoner. The standalone client includes
the command-line client and the FSS parser for reading OWL ontologies. The
Protégé plugin allows ELK to be used as a reasoner in Protégé and compatible
tools such as Snow Owl.6 The OWL API bindings package allows ELK to be
used as a software library that is controlled via the OWL API interfaces.
2</p>
      <sec id="sec-2-1">
        <title>The Reasoning Algorithm of ELK</title>
        <p>The ELK reasoning component works by deriving consequences of ontological
axioms under inference rules. The improvement and extension of these rules is
an important part of the ongoing development of ELK [4, 7, 5]. To simplify the
presentation, in this paper we focus on inference rules for a small yet non-trivial
fragment of OWL EL, which is sufficient to illustrate the work of the main
reasoning component of the ELK system.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 http://owlapi.sourceforge.net/</title>
    </sec>
    <sec id="sec-4">
      <title>5 http://protege.stanford.edu/</title>
    </sec>
    <sec id="sec-5">
      <title>6 http://www.b2international.com/portal/snow-owl</title>
      <p>init(C)
C v C</p>
      <p>R+ init(C) : &gt; occurs negatively in O
&gt; C v &gt;</p>
      <p>C v D
v C v E
: D v E 2 O</p>
      <p>We use the more concise description logic (DL) syntax to represent OWL
axioms (see, e.g., [2] for details on the relationship of OWL and DL, and [8] for DL
syntax and semantics). The DL used here is E L, which supports concept inclusion
axioms (TBoxes) but no assertions (ABoxes). E L concepts are either atomic
concepts or of the form &gt; (top), C u D (conjunction), and 9R:C (existential
restriction), where C and D are concepts and R is a role. An E L ontology O is a
set of axioms of the form C v D (subsumption) where C and D are E L concepts.
We say that a concept C occurs negatively (resp. positively ) in an ontology O if
C is a syntactic subexpression of D (resp. E) for some axiom D v E 2 O.</p>
      <p>Inference rules for E L are shown in Fig. 2. They can be seen as a restriction
of the rules for E LH [4]. The rules operate with expressions of the form init(C)
and subsumptions of the form C v D and D v 9R:C. The bars in C and D
have no effect on the logical meaning of the axioms; they are used to control the
application of rules, which will be explained in detail in Section 4. The expression
init(C) is used to initialize the derivation of superconcepts for C. The rules are
sound, i.e., the conclusion subsumptions follow from the premise subsumptions
and O. The rules are complete for classification in the sense that, for each E L
concept C and each atomic concept A occurring in O, if O entails C v A, then
C v A is derivable from init(C). Note that the axioms in O are never used as
premises of the rules, but only as side-conditions of the rule Rv.
Example 1. Consider the ontology O consisting of the following axioms:</p>
      <p>A v 9R:(B u C);</p>
      <p>A u 9R:B v C:
To compute all atomic superconcepts of A, we start with the goal init(A) and
compute all conclusions under the inference rules in Fig. 2.</p>
      <p>
        init(A)
A v A
by R0 to (
        <xref ref-type="bibr" rid="ref6">6</xref>
        )
by R
by R
u
to (
        <xref ref-type="bibr" rid="ref8">8</xref>
        )
to (
        <xref ref-type="bibr" rid="ref8">8</xref>
        )
u
by R9+ to (
        <xref ref-type="bibr" rid="ref7">7</xref>
        ) and (
        <xref ref-type="bibr" rid="ref9">9</xref>
        )
by R+ to (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) and (
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
      </p>
      <p>
        u
by R9 to (
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
by R9 to (
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
by Rv to (
        <xref ref-type="bibr" rid="ref12">12</xref>
        ) using (
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
by R0 to (
        <xref ref-type="bibr" rid="ref13">13</xref>
        )
(
        <xref ref-type="bibr" rid="ref7">7</xref>
        )
(
        <xref ref-type="bibr" rid="ref8">8</xref>
        )
(
        <xref ref-type="bibr" rid="ref9">9</xref>
        )
(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )
(
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
(
        <xref ref-type="bibr" rid="ref12">12</xref>
        )
(
        <xref ref-type="bibr" rid="ref13">13</xref>
        )
(
        <xref ref-type="bibr" rid="ref14">14</xref>
        )
(
        <xref ref-type="bibr" rid="ref15">15</xref>
        )
(
        <xref ref-type="bibr" rid="ref16">16</xref>
        )
Since A v C has been derived but not, say, A v B, we conclude that C is a
superconcept of A but B is not. The application of rules R9+ and Ru+ in lines (
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
and (
        <xref ref-type="bibr" rid="ref12">12</xref>
        ) uses the fact that the concepts 9R:B and A u 9R:B occur negatively in
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ). Intuitively, these rules are used to “build up” the subsumption A v Au9R:B,
so that rule Rv with side condition (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) can be applied to derive A v C.
      </p>
      <p>In order to classify an ontology O, it is sufficient to compute the deductive
closure of init(A) for every atomic concept A occurring in O using the rules in
Fig. 2. Note that in this case the rules can derive only subsumptions of the form
C v D and D v 9R:C where C and D occur in O. Therefore, the deductive
closure can be computed in polynomial time.</p>
      <p>In the following three sections we give details of the indexing, saturation,
and taxonomy construction phases, which are the main components of the core
reasoning algorithm implemented in ELK (see Fig. 1).
3</p>
      <sec id="sec-5-1">
        <title>Indexing</title>
        <p>The indexing phase is used to build datastructures that can be used to effectively
check the side conditions of the rules in Fig. 2. Specifically, given an ontology
O, the index assigns to every (potentially complex) concept C and every role R
occurring in O the following attributes.</p>
        <p>C:toldSups = fD j C v D 2 Og
C:negConj = fhD; C u Di j C u D occurs negatively in Og [</p>
        <p>fhD; D u Ci j D u C occurs negatively in Og
C:negExis = fhR; 9R:Ci j 9R:C occurs negatively in Og</p>
        <p>R:negExis = fhC; 9R:Ci j 9R:C occurs negatively in Og
The sets C:negConj, C:negExis, and R:negExis consisting of pairs of elements are
represented as key-value (multi-) maps from the first element to the second.
Example 2. Consider the ontology O from Example 1. The following attributes
in the index of O are nonempty.</p>
        <p>A:toldSups = f9R:(B u C)g
A:negConj = fh9R:B; A u 9R:Big
B:negExis = fhR; 9R:Big
(A u 9R:B):toldSups = fCg
(9R:B):negConj = fhA; A u 9R:Big</p>
        <p>R:negExis = fhB; 9R:Big</p>
        <p>Indexing is a lightweight task that can be performed by a single recursive
traversal through the structure of each axiom in the ontology. Since it can
consider one axiom at a time, it can be started even before the whole ontology is
known to the reasoner. In ELK, indexing is executed in a second thread in
parallel to loading of axioms. In addition, ELK keeps track of the exact counts of
negative and positive occurrences of concepts in order to enable fast incremental
updates of the index structure without having to reload the whole ontology.
4</p>
      </sec>
      <sec id="sec-5-2">
        <title>Saturation</title>
        <p>The saturation phase computes the deductive closure of the input axioms under
the inference rules in Fig. 2. This is where most time is spent in typical cases,
and the optimization of this phase is key to overall efficiency.</p>
        <p>
          The saturation algorithm is closely related to the “given clause” algorithm
for saturation-based theorem proving and semi-naive (bottom-up) evaluation of
logic programs. The algorithm maintains two collections of axioms: the set of
processed axioms between which the rules have been already applied (initially
empty) and the to-do queue of the remaining axioms (initially containing the
input axioms). The algorithm repeatedly polls an axiom from the to-do queue; if
the axiom is not yet in the processed set, it is moved there and the conclusions
of all inferences involving this axiom and the processed axioms are added at the
end of the to-do queue (regardless of whether they have been already derived).
Example 3. The derivation in Example 1 already presents the axioms in the
order they are processed by the saturation algorithm. For example, after processing
axiom (
          <xref ref-type="bibr" rid="ref8">8</xref>
          ), the processed set contains axioms (
          <xref ref-type="bibr" rid="ref3">3</xref>
          )–(
          <xref ref-type="bibr" rid="ref8">8</xref>
          ), and the to-do queue
contains axioms (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) and (
          <xref ref-type="bibr" rid="ref10">10</xref>
          ). The algorithm then polls axiom (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) from the queue,
adds it to the processed set, and applies all inferences involving (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) and the
previously processed axioms (
          <xref ref-type="bibr" rid="ref3">3</xref>
          )–(
          <xref ref-type="bibr" rid="ref8">8</xref>
          ). In particular, to apply rule R+ with (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) as the
9
second premise, the algorithm iterates over B:negExis to find possible ways of
satisfying the side condition. Since B:negExis contains hR; 9R:Bi, the algorithm
looks for processed axioms of the form D v 9R:(B u C) for some D, which can
be used as the first premise of R+. Axiom (
          <xref ref-type="bibr" rid="ref7">7</xref>
          ) is of this form, so conclusion (
          <xref ref-type="bibr" rid="ref11">11</xref>
          )
9
is added to the to-do queue. Note that (a pointer to) the concept 9R:B used in
the conclusion (
          <xref ref-type="bibr" rid="ref11">11</xref>
          ) can be taken directly from the pair hR; 9R:Bi in B:negExis,
so the concept does not have to be reconstructed (and reindexed) during the
saturation phase. This illustrates that conclusions of the inference rules can be
constructed by simply following the pointers in the index.
        </p>
        <p>Algorithm 1: Processing of to-do axioms
process(D v 9R:C):
if C.predecessors.add(hR; Di) then // the axiom was not processed
// the axiom can only be used as the first premise of R+
9
for E 2 (R:negExis:keySet() \ C:superConcepts) do</p>
        <p>F R:negExis:get(E );
todo.add(D v F );
process(C v E):
if C.superConcepts.add(E) then // the axiom was not processed
// use the axiom as the second premise of R+
9
for R 2 (E:negExis:keySet() \ C:predecessors:keySet()) do</p>
        <p>F E:negExis:get(R);
for D 2 C:predecessors:get(R) do</p>
        <p>todo.add(D v F );
// use the axiom as premises of other rules</p>
        <p>To speed up the search for matching premises of binary rules, there is not
just one global set of processed axioms in ELK. Instead, axioms are assigned
to different contexts, one context per each initialized concept C (one for which
init(C) has been derived). The bar over C in the presentation of inference rules
in Fig. 2 indicates that the axiom is assigned to the context of C. For example,
C v 9R:D is assigned to the context of C and C v 9R:D is assigned to the
context of D, even though the two axioms have the same logical meaning. Our
assignment of contexts ensures that the two premises of each binary rule belong
to the same context. Thus, when processing an axiom in some context, it is
possible to restrict the search for relevant premises to this context.</p>
        <p>
          In Example 3, the premises of the form D v 9R:(B u C) can only occur in the
context for B u C. Thus, one only needs to inspect axioms (
          <xref ref-type="bibr" rid="ref7">7</xref>
          )–(
          <xref ref-type="bibr" rid="ref10">10</xref>
          ). Yet iterating
over all processed axioms of a context may still be inefficient. To optimize the
search even further, we save information about the (two types of) processed
axioms within each context C in the following sets:
        </p>
        <p>C:superConcepts = fD j C v D is processedg;</p>
        <p>C:predecessors = fhR; Di j D v 9R:C is processedg:
The latter set is implemented as a key-value multimap from R to D. Thus, to find
all axioms of the form D v 9R:(B u C) with the given R in the context (B u C),
it is sufficient to retrieve all values D for the key R in (B u C):predecessors.
Algorithm 1 demonstrates how these sets are used for processing of axioms.</p>
        <p>The separation of axioms into contexts also helps in parallelizing the
saturation phase because multiple workers can independently process axioms in
different contexts at the same time [4]. To ensure that no two workers are
concurrently processing axioms in the same context, the to-do queue in ELK is split
into a two-level hierarchy of queues: each context of C maintains a local queue
C:todo of to-do axioms that are assigned to the context of C, and there is a
global queue of active contexts whose to-do queues are nonempty. Using
concurrency techniques, such as Boolean flags with atomic compare-and-set operations,
the queue of active contexts is kept duplicate free. Each worker then repeatedly
polls an active context C from the queue and processes all axioms in C:todo.
5</p>
      </sec>
      <sec id="sec-5-3">
        <title>Taxonomy Construction</title>
        <p>The saturation phase computes the full transitively closed subsumption relation.
However, the expected output of classification is a taxonomy which only contains
direct subsumptions between nodes representing equivalence classes of atomic
concepts (if the taxonomy contains A v B and B v C then it should not
contain A v C, unless some of these concepts are equivalent). Therefore, the
computed subsumptions between atomic concepts must be transitively reduced.</p>
        <p>In the first step, we discard all subsumptions derived by the saturation
algorithm that involve non-atomic concepts. Thus, in the remainder of this section,
we can assume that all concepts are atomic.</p>
        <p>A naive solution for computing the direct superconcepts of A is shown in
Algorithm 2. The algorithm iterates over all superconcepts C of A, and for each
of them checks if another superconcept B of A exists with A v B v C. If no
such B exists, then C is a direct superconcept of A. This algorithm is inefficient
because it performs two nested iterations over the superconcepts of A (it also
does not work correctly in the presence of equivalent concepts). In realistic
ontologies, the number of all superconcepts of A can be sizeable, while the number
of direct superconcepts is usually much smaller, often just one. A more efficient
algorithm would take advantage of this and perform the inner iteration only
over the set of direct superconcepts of A that have been found so far, as shown
in Algorithm 3. Given A, the algorithm computes two sets A:equivalentConcepts
and A:directSuperConcepts. The first set contains all concepts that are equivalent
to A, including A itself. The second set contains exactly one element from each
equivalence class of direct superconcepts of A. Note that it is safe to execute
Algorithm 3 in parallel for multiple concepts A.</p>
        <p>Having computed A:equivalentConcepts and A:directSuperConcepts for each
A, the construction of the taxonomy is straightforward. We introduce one
taxAlgorithm 2: Naive Transitive Reduction
for C 2 A:superConcepts do
isDirect true;
for B 2 A:superConcepts do
if B 6= A and B 6= C and C 2 B:superConcepts then</p>
        <p>isDirect false;
if isDirect and C 6= A then</p>
        <p>A.directSuperConcepts.add(C)
Algorithm 3: Better Transitive Reduction
for C 2 A:superConcepts do
if A 2 C:superConcepts then</p>
        <p>A.equivalentConcepts.add(C);
else
isDirect true ; // so far C is a direct superconcept of A
for B 2 A:directSuperConcepts do
if C 2 B:superConcepts then
isDirect false ; // C is not a direct superconcept of A
break;
if B 2 C:superConcepts then
// B is not a direct superconcept of A</p>
        <p>A.directSuperConcepts.remove(B);
if isDirect then</p>
        <p>A.directSuperConcepts.add(C);
onomy node for each distinct class of equivalent concepts, and connect the nodes
according to the direct superconcepts relation. Finally, we put the top and the
bottom node in the proper positions, even if &gt; or ? do not occur in the ontology.
6</p>
      </sec>
      <sec id="sec-5-4">
        <title>Evaluation</title>
        <p>In this section we evaluate the performance of ELK for classification of large
existing ontologies, and compare it to other commonly used DL reasoners.</p>
        <p>Our test ontology suite contains the SNOMED CT ontology obtained from
the official January 2012 international release by converting from the native
syntax (RF2) to OWL functional syntax using the supplied converter. We also
include the E L version of GALEN which is obtained from the version 7 of
OpenGALEN7 by removing all inverse role, functional role, and role chain axioms.
Both ontologies have been used extensively in the past for evaluating E L
reasoners. To obtain additional test data, we selected some of the largest ontologies
listed at the OBO Foundry [14] and the Ontobee [16] websites that were in OWL
EL but were not just plain taxonomies, i.e., included some non-atomic concepts.
This gave us the Foundational Model of Anatomy (FMA), the e-Mouse Atlas
Project (EMAP), Chemical Entities of Biological Interest (ChEBI), the Molecule
Role ontology, and the Fly Anatomy. We also used two versions of the Gene
Ontology which we call GO1 and GO2. The older GO1, published in 2006, has
been used in many performance experiments. GO2 is the version of Mar 23 2012
and uses significantly more features than GO1, including negative occurrences of
conjunctions and existential restrictions, and even disjointness axioms. Table 1</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7 http://www.opengalen.org/sources/sources.html</title>
      <sec id="sec-6-1">
        <title>Ontology</title>
      </sec>
      <sec id="sec-6-2">
        <title>SNOMED CT</title>
      </sec>
      <sec id="sec-6-3">
        <title>GALEN GO1 GO2 FMA</title>
      </sec>
      <sec id="sec-6-4">
        <title>ChEBI</title>
      </sec>
      <sec id="sec-6-5">
        <title>EMAP</title>
      </sec>
      <sec id="sec-6-6">
        <title>Molecule Role</title>
      </sec>
      <sec id="sec-6-7">
        <title>Fly Anatomy</title>
        <p>shows the number of concepts, roles, and axioms in each of these ontologies.
Links to their sources can be found on the ELK website.8 We plan to maintain
and extend this list with further interesting E L ontologies in the future.</p>
        <p>We compared the performance of the public development version of ELK
(r577) to the specialized E L reasoners jcel 0.17.0 [10] and Snorocket 1.3.4.alpha4
[9], and to general OWL 2 reasoners FaCT++ 1.5.3 [15], HermiT 1.3.6 [12], and
Pellet 2.3.0 [13]. We accessed all reasoners uniformly through the OWL API
3.2.4 [3] in their default settings. All experiments were executed on a laptop
(Intel Core i7-2630QM 2GHz quad core CPU; 6GB RAM; Java 1.6; Microsoft
Windows 7). On this architecture, ELK defaults to using 8 concurrent workers in
the saturation phase; the other reasoners run in a single thread. We set time-out
to 30 minutes and allowed Java to use 4GB of heap space. All figures reported
here were obtained as the average over 5 runs of the respective experiments.</p>
        <p>We loaded the ontologies using the OWL API and, in our first experiment,
we measured the wall-clock time each reasoner spent executing the
classification method precomputeInferences(InferenceType.CLASS_HIERARCHY). The
results are shown in Table 2. In all the 5 runs of the experiment, Pellet threw
a ConcurrentModificationException on ChEBI. The measured classification
times for Snorocket were 0 in all the test cases: the reasoner appears to trigger
classification automatically after loading the ontology without waiting for the
above method call. Therefore, for a more meaningful comparison of Snorocket
with the remaining reasoners, in our second experiment we measured the overall
time for loading and classification. These results are shown in Table 3.</p>
        <p>The results show that, on all tested ontologies, ELK and Snorocket far
outperform all the remaining reasoners. On the smaller ontologies (GO1, ChEBI,
EMAP, Molecule Role, and Fly Anatomy), ELK and Snorocket show similar
performance, while on the larger ontologies (SNOMED CT, GALEN, GO2, and
FMA) ELK is 2–3 times faster than Snorocket. In particular, ELK can load
and classify SNOMED CT in under 15 seconds. Since ELK can update its index
structure incrementally without having to reload the whole ontology, subsequent</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>8 http://code.google.com/p/elk-reasoner/wiki/Test_Ontologies</title>
      <sec id="sec-7-1">
        <title>SNOMED CT</title>
      </sec>
      <sec id="sec-7-2">
        <title>GALEN GO1 GO2 FMA</title>
      </sec>
      <sec id="sec-7-3">
        <title>ChEBI</title>
      </sec>
      <sec id="sec-7-4">
        <title>EMAP</title>
      </sec>
      <sec id="sec-7-5">
        <title>Molecule Role</title>
      </sec>
      <sec id="sec-7-6">
        <title>Fly Anatomy</title>
        <p>10000
1000
100
10
1
reclassification of SNOMED CT due to small changes in the ontology is likely
to take only about 6 seconds as reported in Table 2.</p>
        <p>To judge the correctness of reasoning, we compared the taxonomies computed
by different reasoners. We found that, whenever a reasoner succeeded in
computing a taxonomy at all, the result agreed with the results of all other successful
reasoners. Although this does not exclude the possibility that all reasoners made
the same errors, such a situation seems unlikely since each ontology was
successfully classified by at least three (and often more) reasoners. Therefore, we
conclude that in all test cases all reasoners computed the correct taxonomy.</p>
        <p>Finally, we wanted to find out if the test ontologies entail any subsumptions
that are not already “told”, i.e., which do not follow by a simple transitive closure
of the subsumptions explicitly present in the ontology. For this experiment we
ran ELK disabling all inference rules except the initialization rules and rule Rv,
and we compared the taxonomies obtained in this way to the correct taxonomies.
It turned out that the two taxonomies differed only for SNOMED CT, GALEN,
and GO2. The remaining ontologies entail only told subsumptions.</p>
        <p>Note that the fact that all entailed subsumptions are told does not
immediately imply that the ontologies are trivial for classification because a reasoner
still needs to prove that no other subsumptions hold, which usually requires full
reasoning with all axioms in the ontology. This, however, is not the case here. A
closer inspection revealed that, apart from SNOMED CT, GALEN, and GO2, all
our test ontologies contain only axioms of the form A v B and A v 9R:B, where
A and B are atomic concepts. In such case, the axioms of the form A v 9R:B
cannot possibly lead to new subsumptions between atomic concepts, and
therefore can be discarded for classification. This can be shown easily, for example,
using our calculus in Fig. 2: a positive occurrence of an existential restriction
can lead to a new subsumption only through the interaction with a negative
occurrence of some other existential restriction in rule R9+; since there are no
negative occurrences of existential restriction in these ontologies, axioms of the
form A v 9R:B cannot lead to new subsumptions.</p>
        <p>Since our experiments suggest that ontologies without negative existentials
are relatively common, it might be worthwhile to further optimize classification
by disregarding positive existentials in such cases. Looking at the classification
times, we believe that no reasoner currently takes advantage of this optimization.
In its current implementation, ELK will also blindly apply rule R9 even when
there are no negative existentials in the ontology.
7</p>
        <sec id="sec-7-6-1">
          <title>Conclusions</title>
          <p>This paper outlines some major implementation techniques that contribute to
the overall efficiency of ELK, and evaluates the classification performance of
several reasoners on large OWL EL ontologies. As can be seen from the evaluation
results, despite their relatively large size, most ontologies were not difficult for
ELK and can be classified in less than 1 second. Furthermore, we have observed
that for many ontologies the classification problem is trivial due to their very
limited use of the language features. It is worth noting, however, that although
some axioms do not have any impact on classification, they can be used in some
other reasoning tasks, such as finding subclasses of complex class expressions.</p>
          <p>Since we had to present evaluation results, we were not able to discuss some
further interesting optimization details in ELK, including, concurrent processing,
efficient implementation of set intersections, such as those in Algorithm 1, and
pruning of redundant inferences. These details can be found in the extended
technical report [6]. Most of these methods are not specific to OWL EL, or even
to description logics, and can thus benefit other (reasoning) tools that compute
a deductive closure by exhaustive application of inference rules.
Acknowledgments This work was supported by the EU FP7 project SEALS and
by the EPSRC projects ConDOR, ExODA and LogMap. The first author is
supported by the German Research Council (DFG).</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suntisrivaraporn</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>CEL-a polynomial-time reasoner for life science ontologies</article-title>
          .
          <source>In: Proc. 3rd Int. Joint Conf. on Automated Reasoning (IJCAR'06)</source>
          . LNCS, vol.
          <volume>4130</volume>
          , pp.
          <fpage>287</fpage>
          -
          <lpage>291</lpage>
          . Springer (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Foundations of Semantic Web Technologies</article-title>
          . Chapman &amp; Hall/CRC (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Horridge</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The OWL API: A Java API for working with OWL 2 ontologies</article-title>
          .
          <source>In: Proc. OWLED 2009 Workshop on OWL: Experiences and Directions. CEUR Workshop Proceedings</source>
          , vol.
          <volume>529</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simančík</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Concurrent classification of EL ontologies</article-title>
          .
          <source>In: Proc. 10th Int. Semantic Web Conf. (ISWC'11)</source>
          . LNCS, vol.
          <volume>7032</volume>
          , pp.
          <fpage>305</fpage>
          -
          <lpage>320</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simančík</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Unchain my EL reasoner</article-title>
          .
          <source>In: Proc. 24th Int. Workshop on Description Logics (DL'11)</source>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>745</volume>
          , pp.
          <fpage>202</fpage>
          -
          <lpage>212</lpage>
          . CEUR-WS.org (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simančík</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>ELK: a reasoner for OWL EL ontologies</article-title>
          .
          <source>Tech. rep. (</source>
          <year>2012</year>
          ), available from http://code.google.com/p/elk-reasoner/ wiki/Publications
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simančík</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Practical reasoning with nominals in the EL family of description logics</article-title>
          .
          <source>In: Proc. 13th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR'12)</source>
          (
          <year>2012</year>
          ), to appear, available from http://code.google.com/p/elk-reasoner/wiki/Publications
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simančík</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.:</given-names>
          </string-name>
          <article-title>A description logic primer</article-title>
          .
          <source>CoRR abs/1201</source>
          .4089 (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lawley</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bousquet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Fast classification in Protégé: Snorocket as an OWL 2 EL reasoner</article-title>
          .
          <source>In: Proc. 6th Australasian Ontology Workshop (IAOA'10)</source>
          .
          <source>Conferences in Research and Practice in Information Technology</source>
          , vol.
          <volume>122</volume>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>49</lpage>
          . Australian Computer Society Inc. (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mendez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ecke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turhan</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          :
          <article-title>Implementing completion-based inferences for the EL-family</article-title>
          .
          <source>In: Proc. 24th Int. Workshop on Description Logics (DL'11)</source>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>745</volume>
          , pp.
          <fpage>334</fpage>
          -
          <lpage>344</lpage>
          . CEUR-WS.org (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
          </string-name>
          , C. (eds.)
          <source>: OWL 2 Web Ontology Language: Profiles. W3C Recommendation (27 October</source>
          <year>2009</year>
          ), available at http://www.w3.org/TR/owl2-profiles/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shearer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Hypertableau reasoning for description logics</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          <volume>36</volume>
          ,
          <fpage>165</fpage>
          -
          <lpage>228</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sirin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grau</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalyanpur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Katz</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Pellet: A practical OWL-DL reasoner</article-title>
          .
          <source>J. of Web Semantics</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>51</fpage>
          -
          <lpage>53</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bug</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>L.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eilbeck</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ireland</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Consortium</surname>
            ,
            <given-names>T.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leontis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>RoccaSerra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sansone</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheuermann</surname>
            ,
            <given-names>R.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whetzeland</surname>
            ,
            <given-names>P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration</article-title>
          .
          <source>Nature Biotechnology</source>
          <volume>25</volume>
          ,
          <fpage>1251</fpage>
          -
          <lpage>1255</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Tsarkov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
          </string-name>
          , I.:
          <article-title>FaCT++ description logic reasoner: System description</article-title>
          .
          <source>In: Proc. 3rd Int. Joint Conf. on Automated Reasoning (IJCAR'06)</source>
          . LNCS, vol.
          <volume>4130</volume>
          , pp.
          <fpage>292</fpage>
          -
          <lpage>297</lpage>
          . Springer (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Ontobee: A linked data server and browser for ontology terms</article-title>
          .
          <source>In: International Conference on Biomedical Ontologies (ICBO)</source>
          . pp.
          <fpage>279</fpage>
          -
          <lpage>281</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>