<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Completeness Guaranteed Approximation for OWL DL Query Answering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jeff Z. Pan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edward Thomas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuting Zhao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computing Science, University of Aberdeen King's College</institution>
          ,
          <addr-line>Aberdeen AB24 3FX</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>How to provide scalable and quality guaranteed approximation for query answering over expressive description logics (DLs) is an important problem in knowledge representation (KR). This is a pressing issue, in particular due to the fact that, for the widely used standard Web Ontology Language OWL, whether conjunctive query answering is decidable is still an open problem. Pan and Thomas propose a soundness guaranteed approximation, which transforms an ontology in a more expressive DL to its least upper bound approximation in a tractable DL. In this paper, we investigate a completeness guaranteed approximation, based on transformations of both the source ontology and input queries. We have implemented both soundness guaranteed and completeness guaranteed approximations in our TrOWL ontology reasoning infrastructure.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The growing availability of semantic annotated data requires scalable query
answering in description logics. How to provide efficient querying answering service for
expressive DLs has been an important open problem in KR. In fact, whether query
answering in OWL DL is decidable is still an open problem. The closest available
complexity results are about the SHIQ DL, which is OWL DL without nominals and
datatypes, but allowing qualified number restrictions. Ortiz et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] have shown the
coNP-complete data complexity result for query answering in SHIQ, with a restriction
that transitive properties and properties with transitive sub-properties are disallowed in
queries. Glimm et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have further provided the co-NP-complete data complexity
and 2EXPTIME combined complexity results for general query answering in SHIQ.
      </p>
      <p>
        Approximation has been identified as a potential way to reduce the complexity
of reasoning over OWL DL ontologies. Many existing approaches [
        <xref ref-type="bibr" rid="ref10 ref12 ref2 ref3 ref5">10, 12, 3, 2, 5</xref>
        ] are
mainly based on syntactic approximation of ontological axioms and queries. All these
approaches could introduce unsound answers. Pan and Thomas [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] propose a soundness
guaranteed approximation, which transforms an ontology in a more expressive DL to
its least upper bound approximation in a tractable DL. To the best of our knowledge,
there is no published scalable completeness preserving approximation approaches for
query answering in OWL DL.
      </p>
      <p>
        In this paper, we investigate a completeness guaranteed approximation for
nonboolean query answering, based on transformations of both the source ontology and
input queries It turns out that the semantic approximation constructed for soundness
guaranteed query service can also be exploited to provide completeness guaranteed
query service. We then provide a more fine grained approximation, which can provide
potentially much smaller but still completeness guaranteed, which can be used to
provide anytime querying answering for OWL DL. We have implemented both soundness
guaranteed and completeness guaranteed approximations in our TrOWL ontology
reasoning infrastructure. Accordingly, for each query, we provide two answer sets: one is
soundness guaranteed and the other is completeness guaranteed. To evaluate our
approach, we use the ontologies that Motik et al [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] used for evaluating ABox reasoning,
and extend the tested queries by allowing non-distinguished variables. Our evaluation
shows: (1) Both the soundness guaranteed and completeness guaranteed answer sets
can be computed efficiently. (2) Our anytime algorithm effectively produces smaller
completeness guaranteed answer set. (3) For every query in our evaluation, at least one
of our completeness guaranteed answer sets is both sound and complete with respect to
the reference answer for that query.
2
2.1
      </p>
    </sec>
    <sec id="sec-2">
      <title>Preliminary</title>
      <sec id="sec-2-1">
        <title>Conjunctive Query Answering</title>
        <p>A conjunctive query is of the form q(X) 9U:'(X; U ), or simply q(X) '(X; U ),
where X and U are vectors of distinguished variables (DVs) and non-distinguished
variables (NDVs) resp., and ' is a conjunction of atoms of the form A(v), R(v1; v2),
where A; R are named concepts and named roles resp., v; v1 and v2 are variables in
X and U , or individual names in the given ontology. If v; v1 and v2 are not in U , then
A(v) and R(v1; v2) are non-distinguished variable free atoms. If X is an empty set, we
say q is a boolean query; otherwise, we say q is a non-boolean query. Theoretically,
allowing only named concepts and roles in atoms is not a restriction, as we can always
define such named concepts and roles in ontologies. Practically, this should not be an
issue as querying against named relations is a usual practice when people query over
relational databases. In this paper, although we consider input queries with only named
concepts and roles in atoms, it is still possible to have concept descriptions in atoms due
to query rewriting (see Section 3). As usual, an interpretation I satisfies an ontology O
if it satisfies all the axioms in O; in this case, we say I is a model of O. Given an
evaluation [X 7! S], where S is a vector of individual names, if every model I of O
satisfies q[X7!S], we say O entails q[X7!S]; in this case, S is called a solution of q.</p>
        <p>
          We could consider a conjunctive query as a directed graph [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], where the nodes
are variable or individual names. In addition, concept and role terms provide labels for
nodes and edges respectively. Note that the direction of the edge is related to role label;
namely, R(i; j) (i connects to j via R) is equivalent to R (j; i) (j connects to i via
R ). Therefore, it is enough to consider weak connectivity of query graphs. Without
loss of generality, we assume input queries corresponding to weakly connected graphs;1
a graph is a weakly connected if replacing all its directed edges with undirected edges
produces a connected (undirected) graph. The weak connectivity degree of a node is
1 Unconnected components do not share variables, therefore they can be considered
independently to each other.
the number of nodes that weakly connect to it. A path in a query graph is a sequence
of nodes such that each of its nodes weakly connects to the next node in the sequence.
The length of the path is n 1, where n is the number of nodes in the path. For the
readers’ convenience the non-distinguished variables are represented by filled circles
( ), distinguished variables by unfilled circles ( ) and individuals by diamonds ( ).
For example, the query q1(x) C(x) ^ R1(x; u) ^ R2(x; Rome) corresponds to the
following graph.
        </p>
        <p>A non-distinguished variable sub-graph (or NDV sub-graph) of a query q is a sub-graph
of q with all its nodes being non-distinguished variables. For example, q1 has one NDV
sub-graph (containing the node u).
2.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Knowledge Compilation</title>
        <p>
          Selman and Kautz illustrated the idea of knowledge compilation in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] by showing
how a propositional theory can be compiled into a tractable form consisting of a set of
Horn clauses. As a logically equivalent set of Horn clauses does not always exist, they
proposed to use Horn lower-bound and Horn upper-bound to approximate the original
theory.
        </p>
        <p>Let be a set of clauses (the original theory), the sets lb and ub of Horn clauses
are respectively a Horn lower-bound and a Horn upper-bound of iff M( lb)
M( ) M( ub), or, equivalently, lb j= j= ub. This says the Horn
lowerbound is logically stronger than the original theory and the Horn upper-bound is
logically weaker than it. A Horn lower-bound glb is a greatest Horn lower-bound iff there
is no set 0 of Horn clauses such that M( lb) M( 0) M( ). A Horn
upperbound lub is a least Horn upper-bound iff there is no set 0 of Horn clauses such that
M( ) M( 0) M( lub).
2.3</p>
      </sec>
      <sec id="sec-2-3">
        <title>Semantic Approximation</title>
        <p>
          Pan and Thomas [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] apply the idea of knowledge compilation on semantically
approximating a source ontology Os in a more expressive DL Ls (source language) with its
(least) upper-bound Ot in a less expressive DL Lt (target language). This sub-section
summaries the results from [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>The following definition provides the notion of least upper-bound in their setting;
i.e., they consider all Lt axioms that are entailed by Os.</p>
        <p>Definition 1. (Entailment Set) Let NC , NP and NI be the finite set of named
concepts, named roles and named individuals, respectively, used in Os. The entailment set
of Os w.r.t. Lt, denoted as ES(Os; Lt), is the set which contains all Lt axioms
(constructed by using only vocabulary in NC , NP and NI ) that are entailed by Os.
Lemma 1. ES(Os; Lt) is the least upper bound compilation of Os in Lt.
In order to use ES(Os; Lt) as the target approximation ontology Ot, we need to find
some lightweight language Lt such that ES(Os; Lt) is finite.</p>
        <p>
          Lemma 2. Given an OWL DL ontology Os, ES(Os; LDL-LiteR) is a finite set.
Accordingly, we can use DL-LiteR as a lightweight language Lt for approximating
OWL DL ontologies in order to provide scalable query answering service. See [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] for
how to compute ES(Os; LDL-LiteR) from Os.
        </p>
        <p>Given an OWL DL ontology Os and an arbitrary query q, we denote the set of
solutions of q over Os as Sq;Os and the set of solutions of q over ES(Os; DL-LiteR) as
Sq;ES(Os;DL-LiteR). The following theorem shows that query answering based on the
DL-LiteR entailment set is soundness guaranteed.
In this section, we will first show how to make use of entailment sets (introduced in
Section 2.3) to provide a completeness guaranteed approximation for OWL DL query
answering. Secondly, we will show how to provide smaller completeness guaranteed
approximations by using some enriched entailment sets.
In order to provide a completeness guaranteed approximation based on entailment sets,
we first introduce the approximate function FC0, and then show that, given an OWL DL
ontology Os and a query q, querying FC0(q) over ES(Os; DL-LiteR) is completeness
guaranteed, i.e. Sq;Os SFC0(q);ES(Os;DL-LiteR).</p>
        <p>Definition 2. (Approximation Function FC0) Let q be an input non-boolean
conjunctive query of the form q(X) '(X; U ), the approximation function FC0 returns
qC0(X) '0(X; ;), where '0(X; ;) is a non-distinguished variable free conjunction
that only contains all the non-distinguished variable free atoms in '(X; U ).
From the query graph point of view, this amounts to removing all NDV sub-graphs of
an input query q and all edges connecting to these sub-graphs from q.</p>
        <p>Example 1. Let us revisit the query q1(x) C(x) ^ R1(x; u) ^ R2(x; Rome), FC0(q1)
is q1C0(x) C(x) ^ R2(x; Rome), which corresponds to the following graph.
Theorem 3. Given an OWL DL ontology Os and an arbitrary non-boolean query q,
Sq;Os SFC0(q);ES(Os;DL-LiteR).</p>
        <sec id="sec-2-3-1">
          <title>Proof: (Sketch) Immediate consequence of (i) Sq;Os</title>
        </sec>
        <sec id="sec-2-3-2">
          <title>SFC0(q);Os , due to the definition</title>
          <p>of FC0, and (ii) SFC0(q);Os = SFC0(q);ES(Os;DL-LiteR), due to Theorem 2.</p>
          <p>Theorems 1 and 3 suggest that, given an OWL DL ontology Os and a non-boolean
query q, we could now provide both a soundness guaranteed answer set Sq;ES(Os;DL-LiteR)
and a completeness guaranteed answer set SFC0(q);ES(Os;DL-LiteR).
3.2</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Towards Fine Grained Approximations</title>
        <p>The approximation function FC0 removes all the non-distinguished atoms from the
input query q, which could potentially introduce many unsound answers in the
completeness guaranteed answer set.</p>
        <p>Example 2. Consider the cyclic query q2 of the form q2(x1; x2) C(x1)^R1(x1; u1)^
R2(x1; Rome) ^ R3(u1; u2) ^ R4(Rome; u2) ^ D(u2) ^ R5(u2; x2), which corresponds
to the following graph:
FC0(q2) is q2C0(x1; x2)
graph:</p>
        <sec id="sec-2-4-1">
          <title>R2(x1; Rome)^&gt;(x2), which corresponds to the following</title>
          <p>FC0(q2) has two disconnected components that do not share any variables: R2(x1; Rome)
and &gt;(x2). The latter one binds all named individuals to x2, thus the answer set
potentially could be large and contain many unsound answers.</p>
          <p>
            In this section, we investigate how to improve FC0 by keeping more information
from the non-distinguished atoms. We develop a query rewriting technique based on
the rolling up technique that has been used [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] to help reduce the problem of answering
boolean queries with acyclic query graphs to the problem of knowledge base
satisfiability checking.
          </p>
          <p>Let us revisit the query q1 (with an acyclic query graph) before providing formal
analysis.</p>
          <p>Example 3. The query q1 is to retrieve all named individuals (x), which are instances
of C, related by role R1 to an (possibly unnamed) individual (u) and related by role R2
to the individual Rome. The query can be paraphrased as the query q10(x) C(x) ^
9R1:&gt;(x) ^ R2(x; Rome), which corresponds to the following graph.</p>
          <p>It should be noted that the intuition from the above example is substantiated by the fact
q1 corresponds to the first order logic formula 8x(9u(C(x)^R1(x; u)^R2(x; Rome))),
which can be translated to the first order logic formula 8x(C(x)^9R1:&gt;(x)^R2(x; Rome)).
Accordingly, we have the the following lemma for rolling-up a path of non-distinguished
variables.</p>
          <p>Lemma 3. Let Os be an OWL DL ontology, C1; :::Cm concepts, R1; :::Rm named roles
and o a named individual in Os, x; u1; :::; um variables. Given the following input
query q and corresponding rolled up query q0:
– q(x) R(x; u1)^S(o; u1)^C1(u1)^R2(u1; u2)^C2(u2)^:::^Rm(um 1; um)^</p>
          <p>Cm(um)
– q0(x) 9R:(C1u9R2:(:::9Rm 1:(Cm 1u9Rm:Cm)))(x)^9S:(C1u9R2:(:::9Rm 1:(Cm 1u
9Rm:Cm)))(o) ^ 9R:9S :fog(x) ^ 9S:9R :&gt;(o)</p>
          <p>A few remarks for the above lemma: (i) If we have Ai;1(ui) ^ ::: ^ Ai;n(ui) in the
query q, we could first combine them into one atom Ai;1 u:::uAi;n(ui), before applying
the lemma. Hence, Ci can be either a named concept or a conjunction of named
concepts Ai;1 u ::: u Ai;n. (ii) The lemma shows that rolling up non-distinguished variables
to distinguished variables (x) is the similar to rolling up to individuals (o). For example,
we could roll up q3(x) R1(x; u1) ^ R3(u1; u2) into q30(x) 9R1:(9R3:&gt;)(x).</p>
          <p>In order to apply Lemma 3 for completeness guaranteed approximations of
potentially cyclic non-boolean queries, we introduce the notion of proxy nodes. Intuitively
speaking, proxy nodes are nodes to which we roll up paths of non-distinguished
variable.</p>
          <p>Definition 3. (Proxy Node) Given a non-boolean query q, a proxy node of q is a node
in the query graph of q that (i) corresponds to a distinguished variable or an individual
and (ii) directly connects via a role to a non-distinguished variable.</p>
          <p>For example, in the query graph of q2, proxy nodes include x1; x2 and Rome.</p>
          <p>Given a non-boolean query q, a proxy node p of q, we call the acyclic path p; n1; :::nh
(where n1; :::; nh are nodes in q) a proxy path w.r.t. p. Note that the above
definition does not take into account the direction of edges, since R(ui; uj ) is equivalent to
R (uj ; ui). We use Path(p; n) to denote all proxy paths from the proxy node p to
the node n. For example, in the query graph of q2, Path(x1; Rome) contains one path:
x1; u1; u2; Rome w.r.t. x1.</p>
          <p>The following lemma deals with enriching the labels of proxy nodes based on their
non-distinguished variable paths.</p>
          <p>Lemma 4. Let Os be an OWL DL ontology, C1; :::Cm concepts and R1; :::Rm named
roles in Os, X the set of distinguished variables, p a proxy node in q and n1; :::; nm
nodes in q. Given the following input query q and corresponding enriched query q0
based on the rolling up of the proxy path p; n1; :::; nm:
– q(X) D(p)^R1(p; n1)^C1(n1)^R2(n1; n2)^C2(n2)^:::^Rm(nm 1; nm)^</p>
          <p>Cm(nm)
– q0(X) D(p)^R1(p; n1)^C1(n1)^R2(n1; n2)^C2(n2)^:::^Rm(nm 1; nm)^</p>
          <p>Cm(nm) ^ 9R1:(C1 u 9R2:(:::9Rm 1:(Cm 1 u 9Rm:Cm)))(p)</p>
          <p>It should be noted that q0 contains all atoms of q, while adding an atom for p based on
the rolling up.</p>
          <p>Definition 4. (Proxy Node Based Rolling Up Function FR) Let q be an input
nonboolean query of the form q(X) '(X; U ), p1; :::; pn proxy nodes in q, the proxy
node based rolling up function FR rewrite q as follows:
1. Normalise q into q0: transform concept atoms of the form A1(i) ^ ::: ^ Ak(i) into</p>
          <p>A1 u ::: u Ak(i).
2. Enrich q0 into q00: For each NDV-subgraph g of q0,
– if g weakly connects to at least two proxy nodes p1; :::pk(k 2), then for each
pair hpi; pj i (1 i &lt; j k) of the weakly connected proxy nodes, enrich q0
based on all paths in Path(pi; pj ) according to Lemma 4;
– if there is only one proxy node p connected to g, let n1; :::ns be the set of nodes
in g that has the lowest weak connectivity degree. For each nh (1 h s),
enrich q0 based on all paths in Path(p; nh) according to Lemma 4.
3. Returns FC0(q00).</p>
          <p>Example 4. q2 contains only one NDV sub-graph, which connects to three proxy nodes.
FR(q2) is q2R(x1; x2) C(x1) ^ R2(x1; Rome) ^ 9R1:(9R3:(D u 9R5:&gt;))(x1) ^
9R5 :(D u 9R3 :(9R1 :C))(x2) ^ 9R1:(9R3:(D u 9R4 :&gt;))(x1) ^ 9R4:(D u
9R3 :(9R1 :C))(Rome) ^ 9R5 :(D u 9R4 :&gt;)(x2) ^ 9R4:(D u 9R5 :&gt;)(Rome),
which corresponds to the following graph.</p>
          <p>Theorem 4. Given an OWL DL ontology Os and an arbitrary non-boolean query q,
Sq;Os SFR(q);Os SFC0(q);ES(Os;DL-LiteR).</p>
          <p>Theorem 4 shows that SFR(q);Os is a more fine grained completeness guaranteed
answer set.
Given an OWL DL ontology Os and a non-boolean query q, this sub-section
investigates how to to answer FR(q) over Os. In particular, we will show this can be done
based on enriched entailment sets of Os.</p>
          <p>According to Def 4, these descriptions are of the form
9R1:(C1 u 9R2:(:::9Rm 1:(Cm 1 u 9Rm:Cm)))
(1)
where C1; :::Cm are conjunctions of named concepts, and R1; :::Rm are either named
roles or their inverse. Intuitively, we first introduce fresh named concepts to represent
concept descriptions in the labels of proxy nodes, then query against the extended
ontology. Formally, given an ontology Os and a non-boolean query q, we can extend Os
to q-enriched ontology Osq as follows: for each proxy node concept description P of the
form (1) in FR(q), add an axiom AP P into Os, where AP is a fresh named
concept. We use FN (FR(q)) to denote the resulted query of rewriting FR(q) by replacing
all concept descriptions P of the form (1) with the corresponding named concepts AP .
Lemma 5. Let Os be an ontology, q a non-boolean query. SFR(q);Os = SFN (FR(q));Osq =
SFN (FR(q));ES(Osq;DL-LiteR).</p>
          <p>Proof: (Sketch) The first equivalence is trivial. The second equivalence is due to
Theorem 2 and the fact that FR(q) does not contain any non-distinguished variables.</p>
          <p>We call ES(Osq; DL-LiteR) an enriched entailment set of Os w.r.t. the query q. In
the next section, we will investigate query independent enriched entailment sets.
4</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Anytime Reasoning Approximation</title>
      <p>A9R4:&gt;(Rome).</p>
      <p>In this section, we introduce a strategy to incrementally construct enriched entailment
sets. For a concept description P of the form (1), we use depth(P ) to denote the
maximal number of serial existential quantifiers in it; e.g., depth(9R1:(C1 u 9R2:&gt;)) = 2.
We define the depth of a query q as the maximal length of proxy paths in it.</p>
      <p>For an OWL DL ontology Os, let Ei = fei1; ; eikg (i 1) be the set of
DLLiteR axioms that a) contain some of the representative named concepts for the
concept descriptions of the form (1) that are with depth i, and b) are entailed by Os.
Accordingly, a serial of i-entailment set i (i 1), are defined as followings: 0 =
ES(Os; DL-LiteR), 1 = 0 [ E1, ..., i = i 1 [ Ei. Let 0 i m be an
integer, we define the function Fi to rewrite FR(q) by a) replacing all concept
descriptions 9R1:(C1 u 9R2:(:::9Rm 1:(Cm 1 u 9Rm:Cm))) with the representative named
concept (of depth i) for 9R1:(C1 u 9R2:(:::9Ri 1:(Ci 1 u 9Ri:&gt;))). For example,
F1(FR(q2)) is q2R(x1; x2) C(x1) ^ R2(x1; Rome) ^ A9R1:&gt;(x1) ^ A9R5 :&gt;(x2) ^
Theorem 5. Given an OWL DL ontology Os and a query q over Os. Let j
have:
i
0, we</p>
      <p>It should be noted that the above theorems are for arbitrary queries. For OWL DL
ontology Os and a query q, an anytime reasoning for query answering is a set of
computing jobs fSF0(FR(q)); 0 ; ; SFi(FR(q)); i ; g. Fig. 1 shows the intuitive hierarchy
of the set of solutions in anytime reasoning for query q on Os. We note when i increases,
the curve is approaching to the middle red line, which is Sq;Os .</p>
    </sec>
    <sec id="sec-4">
      <title>5 Implementation and Evaluation</title>
      <p>We have implemented both soundness guaranteed and completeness guaranteed
approximations in our TrOWL ontology reasoning infrastructure, which is based on the
infrastructure used in the ONTOSEARCH2 system. As Theorem 5 indicates, soundness
guaranteed and completeness guaranteed semantic approximations can be implemented
in a similar manner. The main difference is that, the form one is based on 0, while the
latter one is based on i( 1). All these (i-)entailment sets are stored in the database.
As it could be quite time consuming to compute i( 1), we apply some optimisations
and produce some relaxed completeness guaranteed approximations Ci (instead of i).
All the tests were done on an Apple Macbook, with 2.0 Ghz Dual Core and 2Gb ram.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Motik et al used four ontologies to evaluate the performance of ABox
answering across several different reasoning systems. To evaluate the performance of our
query tool we will use the same ontologies, but we change the queries to include
nondistinguished variables. Table 1 lists the four ontologies, together with the time spent
on computing C1 C3, which only needs to be computed once. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], each ontology
is queried twice, a simple instance retrieval query, and a more complex query
containing three distinguished variables. We have modified these complex queries to produce
queries with zero, one, or two non-distinguished variables, so that each combination of
distinguished and non-distinguished variable can be tested.
      </p>
      <p>Table 3 presents the average completeness degrees (cd, with cd = 1:0 indicating
being the exact answer set) and average time (ms) spent on querying each set of queries
(one with zero NDVs, three with one, and three with two) of over 0; C1; :::; C3.
Completeness degree is defined as jjetaajj , where ta is the total answers returned, while ea is the
exact answer set. Since all the queries were tree-shape queries, the exact result set was
calculated semi-automatically. For comparison, we also list the average completeness
degrees and average time (querying over the original ontology) spent for Pellet, which
treats all variables in the query as distinguished when query answering over ontologies.</p>
      <p>Our evaluation shows (see also Table 3): (1) Query with Pellet and over 0 are both
soundness preserving; moveover, our soundness preserving approximation based on 0
is more complete than Pellet. (2) Querying over 0 is much more efficient than Pellet.
(3) Querying over C1; C2; C3 is also more efficient than over Pellet, while slightly less
efficient than over 0. (4) The refinement from of the completeness preserving
approximations is effective. The completeness preserving answer sets over C3 are much
smaller than those over C2, which in turn are much smaller than those over C1. In fact,
for all the tested queries, answer sets over C3 are both sound and complete. Figure 2
shows the convergence of each complete entailment set, as the complexity of the set
increases. This shows the result for each query using two non-distinguished variables.
It can clearly be seen how using anytime reasoning over increasingly large entailment
sets can improve the precision of a query, as the additional entailment sets are calculated
from the source ontology.
6</p>
    </sec>
    <sec id="sec-5">
      <title>Discussion and Outlook</title>
      <p>
        In this paper, we investigate a completeness guaranteed approximation, based on
transformations of both the source ontology and input queries. Based on this approach and
the results from [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we have implemented both soundness preserving and
completeness preserving approximations in our TrOWL ontology infrastructure. To the best of
our knowledge, this is the first scalable OWL DL query engine supporting both
soundness preserving and completeness preserving approximations.
      </p>
      <p>
        From the literature, the closest approach is SCREECH-ALL [
        <xref ref-type="bibr" rid="ref11 ref3">3, 11</xref>
        ], which
provides complete approximations for ABox reasoning (rather than conjunctive query
answering) on SHIQ ontologies, based on the KAON2 reasoner. While SCREECH-ALL
could produce many unsound answers for atomic and defined concepts, our approach is
very precise on atomic concepts. In Fig. 1 the position of SCREECH-ALL is at point
A or B or C, it depends on the ontology used. It should be noted that Fig. 1 illustrates
an interesting advantage of our approach: when i is increasing our approach produces
smaller completeness preserving answer sets; i.e., our anytime approach is monotonic.
SCREECH-NONE [
        <xref ref-type="bibr" rid="ref11 ref3">3, 11</xref>
        ], on the other hand, simply removes all disjunctive rules and
it guarantees sound by not complete reasoning. Unfortunately the result is that some
instance of named concepts might be lost. The semantic approximation approach,
however, is based on least upper bound approximation, i.e. the strongest weaker
approximation w.r.t. DL-LiteR. Thus, it will never lose any instances of DL-LiteR basic concepts,
let alone named concepts.
      </p>
      <p>One immediate future work is to further test this approach on variant application
ontologies. Study more optimization techniques in order to build more efficient systems
will be one of the important future works.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>B.</given-names>
            <surname>Glimm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Horrocks</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Answering conjunctive queries in the SHIQ description logic</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          ,
          <volume>31</volume>
          :
          <fpage>150</fpage>
          -
          <lpage>197</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>P.</given-names>
            <surname>Groot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Wache</surname>
          </string-name>
          .
          <article-title>Approximating Description Logic Classification for Semantic Web Reasoning</article-title>
          .
          <source>In Proc. of ESWC2005</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          .
          <article-title>Resolution-Based Approximate Reasoning for OWL DL</article-title>
          .
          <source>In Proc. of the 4th International Semantic Web Conference (ISWC2005)</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Tessaris</surname>
          </string-name>
          .
          <article-title>Querying the Semantic Web: a Formal Approach</article-title>
          .
          <source>In Proc. of the 1st International Semantic Web Conference (ISWC</source>
          <year>2002</year>
          ), pages
          <fpage>177</fpage>
          -
          <lpage>191</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hurtado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poulovassilis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Wood</surname>
          </string-name>
          .
          <article-title>A Relaxed Approach to RDF Querying</article-title>
          .
          <source>In Proc. of the 5th International Semantic Web Conference (ISWC-2006)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          and
          <string-name>
            <surname>Ul. Sattler.</surname>
          </string-name>
          <article-title>A Comparison of Reasoning Techniques for Querying Large Description Logic ABoxes</article-title>
          .
          <source>In Proc. of LPAR</source>
          <year>2006</year>
          , pages
          <fpage>227</fpage>
          -
          <lpage>241</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          .
          <article-title>Characterizing data complexity for conjunctive query answering in expressive description logics</article-title>
          .
          <source>In In Proc. of AAAI</source>
          <year>2006</year>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. Thomas. Approximating</given-names>
            <surname>OWL-DL Ontologies</surname>
          </string-name>
          . In
          <source>the Proc. of the 22nd National Conference on Artificial Intelligence (AAAI-07)</source>
          , pages
          <fpage>1434</fpage>
          -
          <lpage>1439</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Bart</given-names>
            <surname>Selman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Henry</given-names>
            <surname>Kautz</surname>
          </string-name>
          .
          <article-title>Knowledge Compilation and Theory Approximation</article-title>
          .
          <source>Journal of the ACM (JACM)</source>
          ,
          <volume>43</volume>
          (
          <issue>2</issue>
          ):
          <fpage>193</fpage>
          -
          <lpage>224</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>H.</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          and
          <string-name>
            <given-names>F. van Harmelen. Approximating</given-names>
            <surname>Terminological</surname>
          </string-name>
          <article-title>Queries</article-title>
          .
          <source>In Proc. of FQAS2002)</source>
          , pages
          <fpage>329</fpage>
          -
          <lpage>343</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Tuvshintur</surname>
            <given-names>Tserendorj</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          , Markus Kro¨tzsch, and Pascal Hitzler.
          <article-title>Approximate owl-reasoning with screech</article-title>
          . In Diego Calvanese and Georg Lausen, editors,
          <source>RR</source>
          , volume
          <volume>5341</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>165</fpage>
          -
          <lpage>180</lpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. H.
          <string-name>
            <surname>Wache</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Groot</surname>
            , and
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Stuckenschmidt</surname>
          </string-name>
          .
          <article-title>Scalable Instance Retrieval for the Semantic Web by Approximation</article-title>
          .
          <source>In Proc. of WISE-2005 Workshop on Scalable Semantic Web Knowledge Base Systems</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>