<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Compilation Technique for Interactive Ontology-mediated Data Exploration∗</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Medina Andresel</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Magdalena Ortiz de la Fuente</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mantas Šimkus</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>TU Wien</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Austria</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>We make a step towards the interactive exploration of data in the ontology-mediated setting, presenting a technique to construct an ofline compilation that allows us to answer eficiently diferent related queries in online phase, without the need to access again the original data. We also propose algorithms to construct relevant variations of a given query that, for example, reduce or increase the number of answers, while modifying the query in a minimal way, or make explicit the common properties of all objects that are in the answers to a given query.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>structure such that, without accessing the data again, we can eficiently get the
answers to any query that asks for objects of the desired type, with diferent
combinations of properties. We also use this compilation to obtain (minimal)
modifications of a query that retrieve strictly more or strictly less answers, as well
as to obtain the most specialized query that retrieves exactly the same answers
but makes explicit all their shared properties. A prototype implementation of our
algorithms shows promising results, pointing to adequate functionality to support
users in their attempts to understand datasets in the presence of ontologies.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <sec id="sec-2-1">
        <title>We briefly recall the syntax and semantics of DL-LiteR. As usual, NC , NR and</title>
        <sec id="sec-2-1-1">
          <title>NI are countably infinite sets of concept names, role names, and individuals. The</title>
          <p>set of roles is NR = NR ∪ {P − | P ∈ NR}, and R− denotes the inverse of the role
R. Basic concepts are concept names A, and expressions of the form ∃R with
R a role. A TBox is a set of axioms of the forms R1 ⊑ R2, R1 ⊑ ¬R2, C1 ⊑ C2,
and C1 ⊑ ¬C2, where R1, R2 are roles and the C1, C2 basic concepts. An ABox
is a finite set of assertions of the form A(b), ¬A(b) and P (b, c), ¬P (b, c), where
A ∈ NC , P ∈ NR and b, c ∈ NI . We denote by Ind(A) the set of individuals
appearing in A. A knowledge base (KB) has the form K = ⟨A, T ⟩ for A an ABox
and T a TBox. The semantics of DL KBs is defined in terms of interpretations
I = (∆ I , ·I ) with ∆ I the domain and ·I the usual interpretation function. We
use the usual definitions and notations for satisfaction of axioms, assertions,
TBoxes and ABoxes, models of a KB, and entailment.</p>
          <p>
            We also recall the syntax and semantics of conjunctive queries (CQs) over
DL KBs. A CQ has the form q(x) :- ∃y.ϕ, where ϕ is a conjunction of atoms
of the form A(t) and R(t, t′), where each term t, t′ is either an individual or a
variable from x ∪ y. The variables in x are called answer variables, and term(q)
denotes the set of terms occurring in q. Given a CQ q and an interpretation I,
we call a partial match any function π mapping terms of q to objects in ∆ I , and
we write I π A(t) for a concept atom A(t) if π(t) ∈ AI , and I π R(t1, t2) for
role atom R(t1, t2) ∈ q if (π(t1), π(t2)) ∈ RI . A match for q in I is a partial
match with domain term(q) such that I π α for every atom α in q. A tuple of
individuals t is a certain answer to q(x) over KB K if for each I model of K
there exists a match π for q in I s.t. (t)I = π(x). We use cert(q, K) to denote
the set of certain answers of q over K. It is well known that if a DL-LiteR KB
K = ⟨T , A⟩ is consistent, it possesses a canonical model IT ,A such that for any
CQ q(x), cert(q, K) = {t | IT ,A π q, π(x) = t} [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. Its domain ∆ IT ,A consists of
all words aR1 . . . Rn (n ≥ 0), where a ∈ Ind(A) and each Ri is a role such that
∃Ri−−1 ⊑T ∃Ri. We let AnonObj = ∆ IT ,A \ Ind(A) be the set of anonymous
objects. The interpretation function ·IT ,A is as follows:
aIT ,A =a
          </p>
          <p>AIT ,A = {a | T , A</p>
          <p>A(a)} ∪ {aR1 . . . Rn | n ≥ 1, ∃Rn− ⊑T A}
P IT ,A ={(a, b) | R(a, b) ∈ A and R ⊑T P } ∪ {(b, a) | R(a, b) ∈ A and R ⊑T P −}
∪ {(w1, w2) | w2 = w1S, S ⊑T P } ∪ {(w2, w1) | w2 = w1S, S ⊑T P −}
We can rely on IT ,A for query answering, since we can assume that KB consistency
has been tested in advance, and queries trivialize over inconsistent KBs.
3</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Answering Related Queries</title>
      <p>In this paper we only consider CQs of a restricted form, which we call 1treeCQs.
A 1treeCQs is tree-shaped (i.e., its primal graph is a tree), and has exactly one
answer variable, which is its root. We denote this variable x. We use a special
subquery relation between 1treeCQs:
Definition 1 (Subquery). Given a DL-LiteR TBox T and two 1treeCQs q1 and
q2, we call q1 subquery of q2 (w.r.t. T ), written q1 FT q2, if term(q1) ⊆ term(q2)
and (i) for each atom R1(t1, t2) ∈ q1 there exists R2(t1, t2) or R2−(t2, t1) ∈ q2
such that R2 ⊑T R1; and (ii) for each atom C1(t) ∈ q1 there exists C2(t) ∈ q2
such that C2 ⊑T C1. In this case, we also call q2 a superquery of q1 (w.r.t. T ).</p>
      <p>The following claim is straightforward:
Proposition 1. Let K be a DL-LiteR KB. For any two given 1treeCQs q1, q2
such that q1 FT q2, we have that cert(q2, K) ⊆ cert(q1, K).
3.1</p>
      <p>Compiling the Answers of Related Queries
For a given DL-LiteR KB K := ⟨T , A⟩, let qL, qU be two 1treeCQs such that
qL FT qU holds. We call them the lower bound query and the the upper bound
query, respectively. Any 1treeCQ q such that qL FT q FT qU is called in-between
query. We use the term 1treeCQs family to refer to the set of 1treeCQs containing
a pair of qL and qU, together with all the in-between queries. Intuitively, the lower
bound is a general query that retrieves all objects the user could be interested
in, while the upper bound is a query (which may have no answers) that gathers
properties that those objects could potentially have; the family of 1treeCQs
contains queries that ask for combinations of those objects.</p>
      <p>
        Example 1. Our examples are based on the well-known LUBM ontology and
describe the domain of universities [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. If we are interested in exploring the
properties of the employees in the database, we can use as lower bound qL(x) :
−Employee(x), and as upper bound a complex query with many potential
properties of employees, i.e., qU(x) : −Dean(x), Prof(x), FullProf(x), teacherOf(x, y1),
Course(y1), headOf(x, y2), Dep(y2), autPub(x, y3), Pub(y3). Some of the
inbetween queries in the induced family are discussed in Section 5, and depicted in
Figures 5 and 6.
      </p>
      <p>In what follows, we assume a given pair of qL and qU, and a KB K = ⟨T , A⟩.
The goal of this section is to build a structure that stores all information needed
for answering any query in the 1treeCQ family, without having to look up A.</p>
      <p>To this aim, we introduce the notion of matching witness. Intuitively, it
stores all objects in the canonical model that may be relevant to a match of an
in-between query. We start from the answers to qL (since any answer to an
inbetween query must also be an answer to qL), and from there, we consider partial
matches for pairs of terms t1, t2 that occur together in an atom R(t1, t2) ∈ qU.
For each object in the range of a partial match, we store some labels that store
the concept and roles the object satisfies that may be relevant to the matches.</p>
      <p>We introduce some notation. For a ∈ ∆ IT ,A , MSconc(a, K) denotes the set of
most specialized concepts satisfied by a in IT ,A, containing each concept name A
such that IT ,A A(a) and there is no A′ ̸= A with A′ ⊑T A and IT ,A A′(a).
Similarly, for a pair a, b ∈ ∆ IT ,A , the set of all most specialized roles satisfied
by (a, b) in IT ,A is denoted MSrole(a, b, K) and contains each role R such that
IT ,A R(a, b) and there is no R′ ̸= R with R′ ⊑T R and IT ,A R′(a, b). Given
a partial match π := [t1 7→ o1, t2 7→ o2], a role label R is relevant w.r.t. π if there
exists R′(t1, t2) ∈ qU with R ⊑T R′ or R′ ⊑T R. Similarly, a concept label C
is relevant (w.r.t π) if there exists C′(t2) ∈ qU with C ⊑T C′ or C′ ⊑T C. For
each pair t1, t2 of query terms that occur together in some atom R(t1, t2) ∈ qU,
and each o1 ∈ ∆ IT ,A , we define a set of matching candidates that contains each
o2 such that if o1 is a match for t1 in IT ,A, then o2 may be a match for t2.
Definition 2 (Matching candidates). Let {t1, t2} ⊆ term(qU) be such that
there is some R(t1, t2) in qU, and let o1 ∈ Ind(A) ∪ AnonObj. The set of
matching candidates for t2 relative to t1 7→ o1, denoted MCt17→o1 (t2), is:
I. if t2 ∈ Ind(A), then
1. if there is some R ∈ MSrole(o1, t2, K) that is relevant w.r.t. [t1 7→ o1, t2 7→
t2], then MCt17→o1 (t2) := {t2},
2. otherwise, MCt17→o1 (t2) := ∅.</p>
      <p>II. if t2 ∈ vars(qU), then
1. if o1 ∈ Ind(A), then MCt17→o1 (t2) := candA ∪ cando1R where:
candA ={o2 | T , A
cando1R ={o1R | T , A</p>
      <p>R(o1, o2), R′ ⊑T R, R′(t1, t2) ∈ qU}</p>
      <p>∃R(o1), R is relevant w.r.t [t1 7→ o1, t2 7→ o1R]}
2. if o1 := wR ∈ AnonObj, then MCt17→o1 (t2) := candwRS ∪ candw where:
candwRS ={wRS | T</p>
      <p>∃R− ⊑ ∃S, S relevant w.r.t [t1 7→ wR, t2 7→ wRS]}
candw ={w | T</p>
      <p>R− ⊑ S, R′ ⊑T S, R′(t1, t2) ∈ qU}
We also define label strings that store the concepts and roles for partial matches:
Definition 3 (Label String). For a partial match π, we define labels(π) as:
I. if π is of the form [t 7→ o], where t ∈ term(qU) and o ∈ Ind(A) ∪ AnonObj,
then labels(π) = ⟨{C | C relevant w.r.t. π}⟩
II. if π is of the form [t1 7→ o1, t2 7→ o2] with o2 ∈ MCt17→o1 (t2), then labels(π) =
⟨Rlbl, Clbl⟩, where Clbl = {C | C ∈ MSconc(o2, K) relevant w.r.t. π}, and Rlbl
is defined as follows:
A. if o1, o2 ∈ Ind(A) or o1 is of the form o2R, then Rlbl = {R | R ∈
MSrole(o1, o2, K) and R is relevant w.r.t π}
B. if o2 = o1R, then Rlbl = {R}</p>
      <p>Now we are ready to build matching witnesses by iteratively constructing
relevant partial matches and storing them together with their label strings:
Definition 4 (Matching Witness). Let a1, . . . , an be an arbitrary
enumeration of the individuals that are a certain answer to qL. We define the root
matching witness as wroot = ⟨[labels(x 7→ a1)a1, . . . labels(x 7→ an)an]⟩.</p>
      <p>For a given term t ∈ term(qU) and an object o ∈ Ind(A) ∪ AnonObj, the
matching witness for t and o is defined as follows: wot = ⟨valuest1 , . . . , valuestk ⟩,
where {t1, . . . , tk} = {t′ | there exists R(t, t′) ∈ qU}, and valuesti , 1 ≤ i ≤ k, is
constructed as follows:
if MCt7→o(ti) = {o1 . . . om}, then valuesti = [φ1o1, . . . , φmom], where φj =
labels(t 7→ o, ti 7→ oj ), 1 ≤ j ≤ m.</p>
      <p>In particular, if MCt7→o(ti) = ∅, then valuesti = [ε] (ε is the empty string).
We denote by W the set obtained by starting from x and wroot, and then recursively,
for each child t′ of the current term t and each object o in IT ,A that occurs in
the range of a witness, adding the witness wot′ to W.
3.2</p>
      <p>Compilation-based Query Answering</p>
      <sec id="sec-3-1">
        <title>W gathers all information that we need to answer any query in the 1treeCQ family.</title>
        <p>In fact, to decide whether an individual is an answer to an arbitrary in-between
query, it sufices to recursively check the following satisfaction conditions. Below,
for t ∈ term(q), we denote by q t the query obtained by restricting q to atoms
that involve only terms that are descendants of t in q.</p>
        <p>Definition 5 (Satisfaction). Let t ∈ term(qU) and q FT qU t. An object
o ∈ Ind(A) ∪ AnonObj satisfies q (w.r.t. W) if:
if t = x, then o is an answer to qL, and for each C′(x) ∈ q there exists
C ∈ labels(x 7→ o) s. t. C ⊑T C′
if t is a leaf in (the tree-structure of) q, then wot = ⟨⟩ ∈ W
Additionally, the following condition must hold: for each t′ child of t in q there
exists o′ ∈ MCt7→o(t′) such that labels(t 7→ o, t′ 7→ o′) = ⟨Rlbl , Clbl ⟩ satisfies: for
each R′(t, t′) ∈ q there exists R ∈ Rlbl such that R ⊑T R′ and for each C′(t′) ∈ q
there exists C ∈ Clbl such that C ⊑T C′; and o′ satisfies q t′ (w.r.t. W).</p>
        <p>This correctly captures the answers of any query in the 1treeCQ family:
Theorem 1 (Correctness of compilation-based QA). For each in-between
1treeCQ q, we have: cert(q, K) = {a ∈ Ind(A) | a satisfies q (w.r.t.W)}.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Query Modifications</title>
      <p>Given a query, we identify interesting variations of it that could be relevant to a
user. For example, we identify the minimal ways to modify the current query so
that it provides more or less answers. We are also interested in identifying all the
common properties that our set of answers has, that is, the possible additions to
our query that, while making it syntactically more restrictive, would still retrieve
the same set of answers. In the experiments in the next section, we will illustrate
the usefulness of these query modifications on several examples.</p>
      <p>Algorithm 1.1: neutralSpecialization</p>
      <p>Input: W the set of matching witnesses, q in-between query</p>
      <p>Output: Q the set of neutral specializations of q
1 S := {};
2 foreach a ∈ ans(q, W) do
3 S := S ∪ {q | q x-maximal for a};
4 Q := ∅;</p>
      <p>/* compute intersection of each tuple in the n-fold Cartesian product over S */
5 intersect(S,Q,q,∅,0);
6 return Q;
Algorithm 1.2: intersect</p>
      <p>Input: S = {Q1, . . . , Qn} set of sets of queries, Qneutral collects the neutral
specializations, q in-between query, qinter intersection query, k the index
of current set in S
1 if k == n then
2 Qneutral := Qneutral ∪ {qinter} ;
3 else
4 foreach query q′ ∈ Qk do
5 if q FT q′ then
6 if qinter == ∅ then
7 qinter := q′;
8 else
9 qinter := qinter ∩T q′ ;
10 intersect(S,Qneutral,q,qinter,k + 1)</p>
      <p>Algorithm 1.3: strictSpecialization</p>
      <p>Input: W the set of matching witnesses, q in-between query</p>
      <p>Output: Q set of strict specializations of q
1 Q := ∅ ;
2 foreach a ∈ ans(q, W) do
3 foreach q1 x-maximal for a do
4 foreach q2 max. neutral spec. of q do
5 if q2 FT q1 then
6 qdiff := q1 \ q2;
7 foreach query atom A ∈ qdiff do
8 if q2 ∪ {A} is 1treeCQ then
9 Q := Q ∪ {q2 ∪ {A}} ;
10 return Q;
Algorithm 1.4: minGeneralization
Intuitively, the set of constructed matching witnesses contains information
regarding how much of the upper-bound query we have matched in the canonical model,
for each a ∈ Ind(A) - possible answer. Hence we can read from the witnesses,
for each possible answer a, the set of maximal queries in the family for which a
is a match. We will see that these maximal queries provide the key information
to suggest the user the desired query modifications. We remark that there can
be multiple such maximal queries for the same individual a.</p>
      <p>Definition 6 (t-Maximal Query). Let q t be a 1treeCQ rooted in t, where
t ∈ term(qU), s.t. q tFT qU t. A query q is said to be t-maximal for a ∈
Ind(A) ∪ AnonObj if: (a) a satisfies q t w.r.t. W, and (b) for each q′ t, q t
̸= q′ t, s.t. q t FT q′ t FT qU t, a does not satisfy q′ t w.r.t. W.</p>
      <sec id="sec-4-1">
        <title>In what follows, we use ans(q, W) to denote the answers for q over W, which are the individuals a such that a satisfies q w.r.t. W (see Theorem 1).</title>
        <p>4.2</p>
        <p>Specializations and Generalizations
We use the term query specialization to refer to any superquery of a given q in a
1treeCQ family. Indeed, superqueries ‘specialize’ the query by requiring additional
or more specific properties to hold. We further distinguish between two kinds of
specializations: the ones that retrieve the same answers as the given q, and the
ones that retrieve strictly less answers.</p>
        <p>Definition 7 (Neutral Specialization). We call q2 a neutral specialization
of q1 if q1 FT q2 and ans(q2, W) = ans(q1, W).</p>
        <p>Trivially, any in-between 1treeCQ is a neutral specialization of itself.
Definition 8 (Strict Specialization). We call q2 a strict specialization of q1
if q1 FT q2 and ans(q2, W) ( ans(q1, W), with ans(q2, W) ̸= ∅.</p>
        <p>Conversely, any subquery of a given 1treeCQ q is less constrained than q, and
retrieves at least the same answers. We are interested in those subqueries that
retrieve strictly more answers, which we call query generalizations.</p>
        <p>The first query modification problem we consider is to construct the maximal
neutral specializations of a query. Intuitively, the maximal neutral specialization
includes all most specialized additional constraints that we can add to the query
without modifying its answers, and hence it makes explicit all the properties that
are common to all the answer individuals.</p>
        <p>Definition 9 (Maximal Neutral Specialization). A maximal neutral
specialization for q w.r.t. ans(q, W) is a neutral specialization q′ such that, for each
q′′ s.t. q′′ ̸= q′ and q′ FT q′′, we have ans(q′′, W) ̸= ans(q, W).</p>
        <p>Algorithm 1.1 computes neutral specializations for a given in-between query
q, by intersecting maximal queries for each a ∈ ans(q, W). Binary operation ∩T
on 1treeCQs keeps only those query atoms that are subsumed in both 1treeCQs.
One can show that by dropping all subqueries from the neutral specializations
we obtain the maximal neutral specializations.</p>
        <p>The second query modification problem we consider is to find the minimal
strict specializations of a given query, that is, which are the minimal modifications
to the query that will result in a smaller set of answers.</p>
        <p>Definition 10 (Minimal Strict Specialization). A minimal strict
specialization for q is a strict specialization q′ such that, for each q′′ with q FT q′′ FT q′,
q′′ is a neutral specialization of q.</p>
        <p>It can be the case that q cannot be strictly specialized without loosing all
answers. In this case we say that q’s only strict specialization is qU. One can obtain
the minimal strict specializations for a given in-between query by discarding all
superqueries from the set of strict specializations, constructed by Algorithm 1.3.</p>
        <p>Conversely a query generalization refers to any subquery of a given q in a
1treeCQ family that has strictly more answers. Finally, we consider the problem
of finding the minimal generalizations of given query, that is, how to minimally
relax the query to retrieve more answers.</p>
        <p>Definition 11 (Minimal Generalization). Let q be an in-between 1treeCQ. A
minimal generalization or relaxation for q is a generalization q′ such that for
each q′′, that is q′ FT q′′ FT q, it holds that: ans(q′′, W) = ans(q, W)</p>
        <p>It might be the case that an in-between 1treeCQ q cannot be generalized
if ans(q, W) = ans(qL, W). Hence, whenever q is a neutral specialization of qL
it doesn’t have any generalizations. Algorithm 1.4 constructs all the minimal
generalizations for a given in-between query.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Implementation and Experiments</title>
      <p>The matching witness solution is implemented in Java (jdk1.8), using additional
libraries. For ontology loading we used the OWLAPI java library1. For ABox
1 http://owlapi.sourceforge.net/
lirsssseeonbwA#112,,,500000000
P
5000.5
0.5 1#Matching W1.i5tnesses 2 ·104</p>
      <p>(b) Batch2
Figure 1: Performance of QA
1.5 Max. Neutral
1.25 MMinin..SGtreinct
] 1
[scond
se 0.75
e
iTm 0.5
0.25
0</p>
      <p>
        Seconds
reasoning we used the Ontop [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] since it eficiently provides complete answers
for DL-LiteR. However, Ontop was specially tailored for OBDA and does not
provide full support when the data is not stored using relational databases. So
teacherOf
autPub
teacherOf
      </p>
      <p>autPub
Employee</p>
      <p>x
y1
y3</p>
      <p>Employee
x</p>
      <p>Employee</p>
      <p>x
y1
y3
we used HermiT2 for TBox reasoning, which captures more expressive DLs but is
not as eficient over DL-LiteR. To boost the performance we used multi-threading
for computing the matching witnesses as well as for query answering.
Specifications. The entire evaluation was done on a server using 7 CPUs (each
core running at 2.3GHz), 10 GB RAM, operating system Ubuntu Server amd64,
java version Java SE Runtime Environment 7.</p>
      <p>
        Ontology. As ontology we used LUBM [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] - a well known ontology suitable
for testing, especially since it provides a data generator, and its domain is easy
to comprehend and relatively simple. To match our DL fragment, we used a
      </p>
      <sec id="sec-5-1">
        <title>DL-LiteR version previously used in [6].</title>
        <sec id="sec-5-1-1">
          <title>Datasets. We used the LUBM generator3 to compute the data sets. The datasets</title>
          <p>size vary between ≈ 3.5 MB and ≈ 19 MB.</p>
          <p>Evaluation measurements. We evaluated 3 batches of 1treeCQs families
deifned by 3 pairs of bound queries that can be accessed online 4. With this simple
prototype implementation, the time for constructing the compilation varies from
minutes (for small-sized data sets) up to few hours (for largest data set). We
hope to decrease this time dramatically in an improved version that, among
others optimizations, does not rely on expensive calls to external reasoners. We
measured the number of possible answers for the 1treeCQs family, the number of
matching witnesses, the average time for answering an in-between query, and for
each query modification, the average time per in-between query.
2 http://www.hermit-reasoner.com/
3 http://swat.cse.lehigh.edu/projects/lubm/
4 https://github.com/medinaAnd/MatchingWitnesses
ResearchGroup ⊑T Institute</p>
          <p>headOf ⊑T worksFor
worksFor ⊑T memberOf</p>
          <p>Faculty ⊑T Employee</p>
          <p>Prof ⊑T Faculty</p>
          <p>AssProf ⊑T Prof
FullProf ⊑T Prof</p>
          <p>Note that query modifications depend also on the data, not only on the
ontology. E.g., for the maximal neutral specialization 5c, no axiom in the ontology
states that each employee that is a head of something is also a full professor.
6</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>
        Works that aim at assisting users formulate queries include Quelo [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the Faceted
Search Interface SemFacet [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and the Optique virtual query formulation system
(VQS) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Quelo and SemFacet both provide interfaces in which users can
build tree-shaped queries like the ones considered here, which can be specialized
by adding additional atoms, or generalized by dropping them. Moreover, the
interfaces have an ontological reasoning layer that is used to find, for example,
relevant specializations to suggest to the user, similarly to our work. However,
unlike our work, those interfaces do not aim at compiling data to support online
answering of diferent query variations. Instead, they use of-the-shelf reasoners,
and query answering is then done from scratch for each query variation. Another
closely related work is presented in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], where the authors focus on information
overload and study ways to specialize a query. They consider EL and arbitrary
CQs. We focus only on tree-shaped CQs and DL-Lite, but consider also other
reasoning problems.
7
      </p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>In this paper we contributed to the theoretical foundations for
compilationbased query answering and construction of useful query modifications in support
of data exploration. Our prototype shows overall good performance, however
implementation can be further improved. Many questions remain open for future
work. A natural next step is to explore how one can extend the solution to other
DLs. It seems that this can be done at least for Horn-DLs, since a canonical
model can be constructed. Another interesting but challenging question is how to
extend the solution in such way that it considers queries that are not tree-shaped.
This would be indeed very valuable, and we believe it may be possible.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahmet</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evgeny</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dmitriy</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ernesto</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          , Martin,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Ian</surname>
          </string-name>
          , H.:
          <article-title>Ontologybased visual query formulation: An industry experience</article-title>
          .
          <source>In: Proceedings of the 11th International Symposium on Visual Computing (ISVC</source>
          <year>2015</year>
          ). Springer, Las Vegas, Nevada, USA (
          <year>2015</year>
          ), http://www.ahmetsoylu.com/wp-content/uploads/ 2015/10/Soylu_et_al_ISVC2015.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Marciuska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Zheleznyakov</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          :
          <article-title>Faceted search over ontology-enhanced rdf data</article-title>
          .
          <source>In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management</source>
          . pp.
          <fpage>939</fpage>
          -
          <lpage>948</lpage>
          . CIKM '
          <volume>14</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giacomo</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Tractable reasoning and eficient query answering in description logics: The dl-lite family</article-title>
          .
          <source>J. OF AUTOMATED REASONING</source>
          p.
          <year>2007</year>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guagliardo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tessaris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trevisan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Quelo: an ontology-driven query interface (</article-title>
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heflin</surname>
          </string-name>
          , J.:
          <article-title>Lubm: A benchmark for owl knowledge base systems</article-title>
          .
          <source>Web Semant</source>
          .
          <volume>3</volume>
          (
          <issue>2-3</issue>
          ),
          <fpage>158</fpage>
          -
          <lpage>182</lpage>
          (
          <year>Oct 2005</year>
          ), http://dx.doi.org/10.1016/j.websem.
          <year>2005</year>
          .
          <volume>06</volume>
          .005
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rezk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Answering SPARQL queries over databases under OWL 2 QL entailment regime</article-title>
          .
          <source>In: Proc. of International Semantic Web Conference (ISWC 2014). Lecture Notes in Computer Science</source>
          , Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grau</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Owl 2 web ontology language: Profiles</article-title>
          . World Wide Web Consortium, Working Draft WDowl2-profiles- 20081202
          <source>(December</source>
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Nutakki</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Specializing conjunctive queries in the EL-family for better comprehension of result sets</article-title>
          .
          <source>Master's thesis</source>
          , TU Dresden (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
          </string-name>
          , R.:
          <article-title>Linking data to ontologies</article-title>
          . In: Spaccapietra,
          <string-name>
            <surname>S</surname>
          </string-name>
          . (ed.)
          <source>Journal on Data Semantics X, Lecture Notes in Computer Science</source>
          , vol.
          <volume>4900</volume>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>173</lpage>
          . Springer Berlin Heidelberg (
          <year>2008</year>
          ), http://dx.doi.org/10.1007/978-3-
          <fpage>540</fpage>
          -77688-
          <issue>8</issue>
          _
          <fpage>5</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Rodriguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based data access: Ontop of databases</article-title>
          .
          <source>In: International Semantic Web Conference (1). Lecture Notes in Computer Science</source>
          , vol.
          <volume>8218</volume>
          , pp.
          <fpage>558</fpage>
          -
          <lpage>573</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Soylu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kharlamov</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheleznyakov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giese</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Optiquevqs: Visual query formulation for obda</article-title>
          .
          <source>In: Proceedings of the 27th International Workshop on Description Logics (DL</source>
          <year>2014</year>
          ). vol.
          <volume>1193</volume>
          , pp.
          <fpage>725</fpage>
          -
          <lpage>728</lpage>
          . CEUR-WS.org, Vienna, Austria (
          <year>2014</year>
          ), http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>1193</volume>
          /paper_88. pdf
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>