Controlled Query Evaluation
                  over Lightweight Ontologies?

      B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, and D. Zheleznyakov

                Department of Computer Science, University of Oxford


        Abstract. We study confidentiality enforcement in ontologies under the
        Controlled Query Evaluation (CQE) framework. In a CQE system, a pol-
        icy specifies the sensitive information, and a censor ensures that answers
        to user’s queries that could violate the policy are not returned. Our goal
        is the design of optimal CQE algorithms, which ensure confidentiality
        while maximising access to information. We study two natural classes
        of censors that can be realised using existing infrastructure for query
        answering and propose optimal CQE algorithms for the standardised
        profiles of the ontology language OWL 2.


1     Introduction
As ontology-based information systems are becoming increasingly mature, there
is a pressing need to devise mechanisms for ensuring that data is only made
accessible to authorised users [1, 7–12, 20, 21].
    Controlled Query Evaluation (CQE) is a prominent formal framework for
confidentiality enforcement. In CQE, sensitive information is declaratively spec-
ified by means of a policy and confidentiality is enforced by a censor : when given
a user query, a censor checks whether returning the answer might lead to a pol-
icy violation, in which case it returns a distorted answer. CQE was introduced
in [18], and studied in [3, 4, 6] for propositional databases with complete infor-
mation. It was extended to (propositional) incomplete databases in [5]. Beyond
propositional logic, CQE remains largely unexplored [2].
    In this paper we study CQE in the context of ontologies. Our basic framework
is described in Section 3. We assume data to be hidden and that users interact
with the system by means of a query interface. An ontology, which we assume to
be known to all users, provides the vocabulary and background knowledge needed
for users to formulate accurate queries, as well as to enrich query answers with
implicit information. Policies are given as a set of ground atoms that follow from
the ontology and data. When given a (conjunctive) query, the system returns
the subset of certain answers determined by the censor; in this way, the role
of the censor is to preserve confidentiality by filtering out those answers that
could lead to a violation of the policy. Formally, we model the information that
users could gather by querying the system as an (infinite) first-order theory;
?
    This paper recapitulates and extends our previous work [11]. It comes with a tech-
    nical report available at http://tinyurl.com/DL14paper55.
confidentiality preservation then amounts to ensuring that such theory together
with the ontology does not entail any atom in the policy. In this setting, there is
a danger that confidentiality enforcement may over-restrict users’ access. Thus,
we focus on optimal censors, which maximise answers to queries while ensuring
confidentiality of the policy. Furthermore, we are interested in censors that can
be implemented by reusing off-the-shelf query answering infrastructure. To fulfil
this requirement, we introduce in Section 4 two classes of censors, which we call
view and obstruction censors, respectively.
     View censors return only those answers that follow from the ontology and
a materialised dataset (a view ). In this way, a view encodes the information
that users are authorised to access: the censor answers faithfully all queries
against the view, and any information not captured by the view is inaccessible
by default. View censors require the ability to materialise implicit data, and
hence are especially well-suited for RDF-based applications in which reasoning
is performed by a triple store. Obstruction censors are dual to view censors in
the sense that they explicitly specify information which users are denied access
to (with all other information being accessible by default). Obstruction censors
are specified by a finite set of “forbidden query patterns” (obstructions), and all
query answers that instantiate such patterns are filtered out. In contrast to view
censors, obstruction censors do not require modification of the data and hence
are well-suited for OBDA applications, where data is typically managed by an
RDBMS. We finally characterise the duality of views and obstructions and argue
that it is not always possible to “simulate” one with another.
     In Section 5 we focus on view censors. First, we investigate their intrinsic lim-
itations, and then show how these limitations can be circumvented. We propose
algorithms for computing optimal view censors for knowledge bases with OWL
2 RL, EL and QL ontologies under relatively minor restrictions. Our algorithms,
however, rely on views that can be of exponential size in the worst case. So,
we identify natural conditions on ontologies that guarantee polynomial size of
optimal views. In particular, all OWL 2 QL ontologies satisfy these conditions.
     In Section 6 we turn our attention to obstruction censors and provide suf-
ficient and necessary conditions for an optimal such censor to exist. Then, we
propose algorithms for computing optimal obstruction censors for knowledge
bases with OWL 2 QL as well as restricted OWL 2 RL ontologies, which are
based on obstructions of polynomial size.


2    Preliminaries

We adopt standard notions in first-order logic over finite function-free signatures.
We treat equality ≈ as an ordinary predicate, but assume that any set of formulae
Σ contains all the axioms of equality for Σ. A fact is a ground, equality-free
atom, and a dataset is a finite set of facts. A structure I is a pair (∆I , ·I ) of a
domain and interpretation function for the symbols in the signature. We define
homomorphisms between structures I and J in the standard way and I ,→ J
denotes the fact that such a homomorphism from I to J exists.
          (1 ) A(x) ∧ R(x, y1 ) ∧ B(y1 ) ∧ R(x, y2 ) ∧ B(y2 ) → y1 ≈ y2 ,
          (2 ) R(x, y) → S(x, y),            (3 ) A(x) → ∃y.[R(x, y) ∧ B(y)],
          (4 ) A(x) → x ≈ a,                 (5 ) R(x, y) ∧ S(y, z) → T (x, z),
          (6 ) A(x) → R(x, x),               (7 ) A(x) ∧ B(x) → C(x),
          (8 ) R(x, x) → A(x),               (9 ) A(x) ∧ R(x, y) → B(y),
          (10 ) R(x, y) → S(y, x),           (11 ) R(x, a) → B(x),
          (12 ) R(x, y) → A(y),              (13 ) A(x) → R(x, a),
          (14 ) A(x) → B(x),                 (15 ) R(x, y) ∧ B(y) → A(x).

             Table 1. Horn-SROIF rules; unary predicates can be >.


    A rule is a sentence of the form ∀x.∀z.[ϕ(x, z) → ∃y.ψ(x, y)], where x, z,
and y are pairwise disjoint vectors of variables, the body ϕ(x, z) is an equality-
free conjunction of atoms with variables x ∪ z, and the head ψ(x, y) is a con-
junction of atoms with variables x ∪ y. For simplicity universal quantifiers in
rules are usually omitted. A rule is (i) Datalog if its head consists of a single
atom and y is empty; (ii) guarded if it has a body atom (a guard ) mentioning
all universally quantified variables; (iii) linear if it has a single body atom; and
(iv) multi-linear if the body contains only guards. An ontology is a finite set of
rules. We assume that both rule heads and bodies are non-empty and they do
not contain the nullary atoms > and ⊥. Thus, O ∪ D is satisfiable for each on-
tology O and dataset D, and O 6|= α for each fact α, which ensures a separation
between schema and data.
    To capture all OWL 2 profiles, we focus on Horn-SROIF. Table 1 provides
the normalised axioms of this DL in the form of rules. To capture the semantics
of >, usually allowed in DLs, we treat it as a unary predicate and assume that
each ontology O contains the rule P (x1 , . . . , xn ) → >(xi ) for each predicate P
and 1 ≤ i ≤ n. A Horn-SROIF ontology is in (i) RL if it has no rules of
Type (3); (ii) QL if it contains only rules of Types (2), (3), (10), (12), and (14);
(iii) EL if it does not contain rules of Types (1), (9), and (10).
    A conjunctive query (CQ) is a formula Q(x) of the form ∃y.ϕ(x,Wy), with
ϕ(x, y) a conjunction of atoms. A union of CQs (UCQ) is a formula i Qi (x),
with each Qi (x) a CQ. A CQ is Boolean (BCQ) if x is empty. A tuple of constants
t is a (certain) answer to Q(x) over ontology O and dataset D if O ∪ D |= Q(t).
Then, cert(Q, O, D) is the set of answers to Q(x) over O and D. Given a BCQ Q,
A[Q] denotes the structure interpreting each relation R with (f (u1 ), . . . f (un ))
for every atom R(u1 , . . . , un ) in Q, where f maps each constant in Q to itself
and each variable y to a fresh element dy in the structure.


3   Basic Framework

Given an ontology O and dataset D, we assume that D is hidden while O is known
to users, who can formulate arbitrary CQs via a query interface. A policy, which
is unknown to users, is given as a set of facts entailed by O∪D. It is assumed that
system administrators are in charge of specifying policies, and that each policy
is assigned to a specific (group of) users by means of standard mechanisms such
as role-based access control techniques [17].
Definition 1. A policy P for O and D is a dataset such that O ∪ D |= P. A
CQE-instance I is a triple (O, D, P), with P a policy for O and D.
Example 1. Consider the following ontology and dataset that describe an excerpt
of a social network including information about movies:
         Oex = {FOf (x, y) →FOf (y, x), Susp(x)∧Cr (x) →Thr (x),
                 Likes(x, y) ∧ Thr (y) → ThrFan(x)},
         Dex = {FOf (John, Bob), FOf (Bob, Mary), Likes(John, Seven),
                 Likes(Bob, Seven), Susp(Seven), Cr (Seven)}.
Here, Oex states that friendship is symmetric; movies that are both suspense and
crime are thrillers; and everyone who likes a thriller is a thriller fan. Assume that
John wants to hide his friend list. Then, Pex = {αex } with αex = FOf (John, Bob),
and Iex = (Oex , Dex , Pex ).
A key component of a CQE system is the censor, whose goal is to decide accord-
ing to the policy which query answers can be safely returned to users.
Definition 2. A censor for a CQE-instance I = (O, D, P) is a function cens
that maps each CQ Q to a subset of cert(Q, O, D). The characteristic theory
Thcens of cens is the (possibly infinite) set of sentences
                      {Q(t) | t ∈ cens(Q) and Q(x) is a CQ}.
Censor cens is confidentiality preserving if O ∪ Thcens 6|= α for each α ∈ P. It is
optimal if (i) it is confidentiality preserving and (ii) no confidentiality preserving
censor cens0 6= cens exists such that cens(Q) ⊆ cens0 (Q) for every Q.
Intuitively, Thcens represents the information that a user can potentially gather
by asking an unbounded number of queries to the system. If the censor is confi-
dentiality preserving, then no information can be obtained about P, regardless
of the CQs asked. Finally, optimal censors maximise information accessibility
without compromisig the policy.


4     View and Obstruction Censors
The idea behind view censors is to associate to a CQE instance I a new dataset,
called a view. Intuitively, a view encodes the information that a user is allowed
to see. The user gets only those query answers that follow from O and this view.
In this way, the main workload of the censor boils down to the computation of
certain answers, which can be fully delegated to the query answering engine.1
1
    We assume that all the definitions in this section are parameterised by a (fixed)
    instance I = (O, D, P).
Definition 3. The view censor vcensV
                                   I for I based on a dataset V (a view), is
the function mapping each CQ Q(x) to the set cert(Q, O, D) ∩ cert(Q, O, V).
   Obviously, if we want the censor to enjoy the properties we are after, the view
V must be constructed with care. For the censor to be confidentiality preserving,
O ∪ V must not entail any atom from the policy P, and to be optimal V must
encode as much information from the hidden dataset as possible.
Example 2. Consider a view Vex obtained from the dataset Dex by replacing Bob
with a fresh constant an b . Intuitively, Vex is the result of “anonymising” the
constant Bob, while keeping the structure of the data intact. Since Vex contains
no information about Bob, we have Oex ∪ Vex 6|= αex , that is the censor based
on Vex is confidentiality preserving. View Vex , however, is not optimal: Oex ∪ Vex
does not entail the fact Likes(Bob, Seven), which is harmless for confidentiality.
                0                                  0
Indeed, Oex ∪Vex  6|= αex holds for the extension Vex of Vex with Likes(Bob, Seven).
   View censors require the ability to materialise implicit data, and hence are es-
pecially well-suited for RDF-based applications, in which reasoning is performed
by a triple store. In OBDA scenarios, however, data is typically managed by an
RDBMS and materialisation is not possible. To fulfill the requirement of OBDA
applications, we need a different kind of censors.
   The idea behind obstruction censors is to associate to I an obstruction in
the form of a Boolean UCQ U , such that given a query Q(x) and an answer t
over O and D, the censor returns t only if no CQ in U follows from Q(t). Thus,
the obstruction can be seen as a collection of “forbidden query patterns”, which
should not be disclosed.
Definition 4. The obstruction censor ocensU I for I based on a Boolean UCQ U
(called an obstruction) is the function mapping each CQ Q(x) to the set

                    {t | t ∈ cert(Q, O, D) and A[Q(t)] 6|= U }.
Similarly to view censors, obstruction censors do not require dedicated algo-
rithms: checking A[Q(t)] |= U can be delegated to an RDBMS. Also, obstruc-
tions can be maintained virtually without the need of data materialisation.
Example 3. The censor based on Vex from Example 2 can also be realised with
following obstruction Uex :
     ∃x.FOf (x, Bob) ∨ ∃x.FOf (Bob, x) ∨ ∃x.Likes(Bob, x) ∨ ThrFan(Bob).

Intuitively, Uex “blocks” query answers involving Bob; and all other answers are
the same as over Oex ∪ Dex .
    As seen in Examples 2 and 3, the same censor may be based on both a view
and an obstruction. View and obstruction censors, however, behave rather dif-
ferently: a view explicitly encodes the information accessible to users, whereas
obstructions specify information which users are denied access to. Thus, obstruc-
tions are dual to views. Unsurprisingly, even in simple cases it is not obvious
whether (and how) a view can be realised by an obstruction, or vice-versa.
     We next focus on Datalog ontologies and characterise when a given view V
and obstruction U yield the same censor. Each Datalog ontology O and dataset
D have a unique least Herbrand model HO,D , that is a finite structure that sat-
isfies t ∈ cert(Q, O, D) iff A[Q(t)] ,→ HO,D and hence captures the information
relevant to query answering. We can then formalise the duality between views
and obstructions in a natural way: U and V implement the same censor iff U cap-
tures the structures not homomorphically embeddable into HO,V . To formalise
this statement, we recall the notion of (non-uniform) constraint satisfaction [13].
Definition 5. Let J be a finite structure and C a class of finite structures. The
CSP of J relative to C (denoted CSP[C] (J )) is the set {I ∈ C | I ,→ J }.
   A central problem is to determine whether a class of finite structures can be
captured by a single formula.
Definition 6. Let C be a class of finite structures and let C 0 ⊆ C. First-order
sentence ψ defines C 0 if I ∈ C 0 is equivalent to I |= ψ for every structure I ∈ C.
     The correspondence between view and obstruction censors is then as follows.
Theorem 1. Let I = (O, D, P) be a CQE-instance with O Datalog ontology,
and C the class of finite structures I with I ,→ HO,D . Then, vcensV        U
                                                                   I = ocensI
iff U defines ¬CSP[C] (HO,V ), for any view V and obstruction U .
   Using Theorem 1 together with definability results in Finite Model Theory,
we can show that views and obstructions cannot simulate one another in general.
Theorem 2. There is a CQE-instance for which there exists a view censor,
but no obstruction censor. There is a CQE-instance for which there exists an
obstruction censor, but no view censor.


5     Optimal View Censors
Our discussion in Section 4 shows that view and obstruction censors should be
studied independently. In this section, we focus on view censors.
    Before investigating the design of view-based CQE algorithms, we first es-
tablish the theoretical limitations of our approach. We show that an optimal
view-based censor for an instance I is not guaranteed to exist since the optimal-
ity requirement may lead to infinite “views”, even for EL and RL ontologies.
Theorem 3. There are CQE-instances I1 and I2 such that
    - the ontology of I1 uses rules of Types (1) and (9), and
    - the ontology of I2 uses rules of Types (5), (8) and (15),
for which no optimal view censors exist.
    The construction of I1 shows that equality rules lead to non-existence of
optimal views; in turn, the construction of I2 shows that equality is not needed
to preclude optimality in the presence of recursion, transitivity axioms, and Self
restrictions.
                    MovFan anj                          MovFan
                    ThrFan                       anb2   ThrFan
                                                 anb1   MovFan
                               John              Bob

 Fig. 1. Essential part of an exhaustive view (solid arrows represent FOf relation).

5.1   View Censors for Guarded RL
We next describe how to compute an optimal view for guarded RL ontologies.
The idea is to create anonymised copies of constants in the data to encode the
information required for optimality. Such a view may use exponentially many
anonymised copies of constants.

Definition 7. Let I = (O, D, P) be a CQE-instance with O a guarded RL on-
tology. A view V consisting of unary atoms Vc1 on constants from I, unary atoms
V∃1 on new constants (anonymised copies), and binary atoms V 2 , is exhaustive
on I if it satisfies all of the following.
 1. The part Vc1 is a maximal set of unary atoms from HO,D such that O∪Vc1 6|= α
    for each α ∈ P.
 2. For each constant a from I, and each set A of unary predicates, such that
    - A(a) ∈ HO,D for every A ∈ A,
    - if A, B ∈ A, and O |= A(x) ∧ B(x) → C(x), then C ∈ A,
    - if A ∈ A, then O 6|= A(x) → x ≈ a for each a,
    - O ∪ Vc1 ∪ {A(a0 )|A ∈ A} 6|= α for each α ∈ P and a fresh constant a0 ,
    the view V uses a fresh constant aA , and the part V∃1 contains all unary
    atoms A(aA ) such that A ∈ A.
 3. For each constant a from I, let σa be the set of constants consisting of a
    itself and all the constants aA . The binary part V 2 of the view V contains
    the atom R(a∗1 , a∗2 ) for constants a∗1 , a∗2 iff
    - R(a1 , a2 ) ∈ HO,D , where a∗1 ∈ σa1 and a∗2 ∈ σa2 ,
    - O ∪ {R(a∗1 , a∗2 )} 6|= α for any α ∈ P, and
    - O ∪ Vc1 ∪ V∃1 ∪ {R(a∗1 , a∗2 )} |= A(a∗ ) implies that A(a∗ ) ∈ Vc1 ∪ V∃1 .

    This definition is constructive and it is routine to devise an algorithm, which
for any instance (non-deterministically) constructs an exhaustive view.

Example 4. Consider the following CQE-instance (O, D, P):

 O = {ThrFan(x) → MovieFan(x), ThrFan(y) ∧ FOf (x, y) → MovieFan(x)},
 D = {FOf (John, Bob), ThrFan(John), ThrFan(Bob)},
 P = {MovieFan(Bob), MovieFan(John)}.

The essential part of the exhaustive view on this CQE-instance is given in Fig-
ure 1, where Vc1 = ∅, V∃1 contains unary atoms over the anonymised copies an1b ,
an2b of Bob, and anj of John, and V 2 contains the depicted binary atoms. Two
anonymised copies of Bob are necessary in any optimal view for I to answer
correctly “harmless” queries like
         ∃x, y, z.ThrFan(z) ∧ MovieFan(z) ∧ FOf (y, z) ∧ ThrFan(y) ∧
                 MovieFan(y) ∧ FOf (y, x) ∧ MovieFan(x) ∧ FOf (John, x).
      The following theorem formulates the desired properties of exhaustive views.
Theorem 4. Let I = (O, D, P) be a CQE-instance with O a guarded RL ontol-
ogy, and V an exhaustive view on I. Then V is optimal. Furthermore, if O is
linear, then vcensV
                  I is the only optimal censor for I.

    The proof relies on the following facts. First, the construction ensures that
HO,V = V for any exhaustive view V, that is, no rules are applicable to V.
Also, properties of Vc1 , V∃1 , and V 2 guarantee that V does not entail any policy
atom. Optimality follows from the fact that for any a in I, each combination
of its unary atoms that satisfies the relevant axioms in O is “witnessed” by a
new constant from σa , and all possible binary atoms which are compatible with
those combinations are added to the view. Then, no essentially new atom can be
“added” to the view V without disclosing a policy. The uniqueness of the optimal
censor for linear ontologies follows from the lack of choices in the construction
of V. An exhaustive view may use exponentially many constants. However, for
multi-linear ontologies, optimal views are of polynomial size.
Proposition 1. Let I = (O, D, P) be a CQE-instance with O a multi-linear RL
ontology. There is an optimal censor for I based on a view of polynomial size.

5.2     View Censors for EL and QL
In contrast to OWL 2 RL, the QL and EL profiles can capture existentially
quantified knowledge. To bridge this gap, we show that, under some mild con-
ditions, we can transform an ontology O into a Datalog ontology O0 such that
an optimal view for (O, D, P) can be directly obtained from such a view for
(O0 , D, P). Thus, devising an optimal view censor for an instance is reduced to
devising one for an instance with a Datalog ontology.
Definition 8. Let σ be a set of constants. A Datalog ontology O0 is a ( Datalog)
σ-rewriting of an ontology O if for each fact β and dataset D over constants from
σ we have that O ∪ D |= β iff O0 ∪ D |= β.

Proposition 2. Let I = (O, D, P) be a CQE-instance with D using set of con-
stants σ, O0 a σ-rewriting of O such that O0 |= O, and V 0 an optimal view for
(O0 , D, P). Then HO0 ,V 0 is an optimal view for I.
    Now, we just need to transform a QL (or guarded EL) ontology into a stronger
guarded RL ontology, which, however, entails the same facts for any dataset. We
exploit techniques developed for the combined approach to query answering [14–
16, 19]. The idea is to transform rules of Type (3) into Datalog by Skolemising
existentially quantified variables into globally fresh constants. Such transfor-
mation strengthens the ontology; however, if applied to a QL or guarded EL
ontology, it preserves entailment of facts for any dataset over σ [19].
Definition 9. Let O be an ontology and σ a set of constants. The ontology
Ξσ (O) is obtained from O by replacing each rule of the form A(x) → ∃y.[R(x, y)∧
B(y)] with A(x) → P (x, a), P (x, y) → R(x, y), P (x, y) → B(y), where P is a
fresh predicate and a is a globally fresh constant not from σ, unique to A and R.2

Proposition 3. If O is a Horn-SROIF ontology, then Ξσ (O) |= O. If also O
is either a QL or guarded EL ontology, then Ξσ (O) is a σ-rewriting of O.

    Propositions 2 and 3 ensure that HΞσ (O),V is optimal for I = (O, D, P) with
O a QL or guarded EL ontology, whenever V is such a view for (Ξσ (O), D, P).
The transformation of O to Ξσ (O) preserves linearity and guardedness, so Ξσ (O)
is a guarded RL ontology, and the results of Section 5.1 are applicable.

Theorem 5. A CQE-instance with a QL or guarded EL ontology has an optimal
view censor. For QL, it is unique and can be based on a polynomial size view.


6     Obstruction Censors
We start our study of obstruction censors by focusing on Datalog ontologies and
characterising optimality in terms of resolution proofs of the policy. To this end,
we first recapitulate the standard notions on (clause) SLD resolution.

Definition 10. A goal is a conjunction ofVatoms. The SLD resolution step takes
                                                 k
a goal β1 ∧ . . . ∧ βm and a Datalog rule i=0 γi → δ and produces a new goal
 Vk
( i=0 γi θ) ∧ β2 θ ∧ . . . ∧ βm θ, where θ is a most general unifier (MGU) of β1 and
δ. A proof of a goal G0 in a Datalog ontology O and dataset D is a sequence
    r1 ,θ1     r2 ,θ2       rn ,θn
G0 −−  −→ G1 −−−→ . . . −−−−→ Gn , where Gn = > and the goal Gi is obtained from
the goal Gi−1 and sentence ri ∈ O ∪D by an SLD resolution step with MGU θi .

    SLD resolution is sound and complete: for each satisfiable O ∪ D and goal G,
a proof of G exists in O ∪ D iff O ∪ D |= ∃∗ G, with ∃∗ G the existential closure of
G. We next provide a characterisation of optimality based on proofs. Consider a
policy atom α ∈ P and some proof π of α in O ∪D. If a censor answers positively
sufficiently many BCQs ∃∗ G for goals G in π, then a user could “reconstruct”
(a part of) π and compromise the policy. Also, there can be many proofs of α,
and a user can compromise the policy by reconstructing any of them. Thus, to
ensure that a censor is confidentiality preserving, we must guarantee that the
obstruction contains enough CQs to prevent reconstruction of any π. If we want
the censor to be optimal, the obstruction should not “block” too many queries.
As we will see later on, these requirements may be in conflict and lead to an
infinite “obstruction”. To formalise this intuition we need an auxiliary notion.
A core of a set of Boolean CQs Q is a minimal subset C of Q such that for each
Q ∈ Q there exists Q0 ∈ C with Q |= Q0 .
2
    To correctly deal with Self restrictions (rules (6) and (8)) a slightly more complex
    transformation is required. These changes are straightforward but require introduc-
    ing further notation, so we present here only the basic transformation for simplicity.
Definition 11. Given a CQE-instance I = (O, D, P) with O a Datalog ontology,
let Q(I) be the set of all Boolean CQs ∃∗ G with G 6= > a goal in a proof of a
fact α ∈ P in O ∪ D, and let S be a maximal subset of Q(I) such that O ∪ S 6|= α
for any α ∈ P. Then, a pseudo-obstruction Υ of I is a core of Q(I) \ S.
   We now relate pseudo-obstructions and optimality.
Theorem 6. Let I = (O, D, P) be a CQE-instance with O a Datalog ontology.
                                                            W
1. If Υ is a finite pseudo-obstruction for I, then U = Q∈Υ Q is an optimal
   obstruction for I.
2. If each pseudo-obstruction for I is infinite, then no optimal obstruction censor
   for I exists.
   This theorem has consequences on the expressive power of obstructions. Us-
ing the results from Section 5.1 we can see that optimal view and obstruction
censors are incomparable. This complements Theorem 2, which talks about not
necessarily optimal censors.
Theorem 7. There is a CQE-instance with ontology in both RL and EL (respec-
tively, RL) for which an optimal view (respectively, obstruction) censor exists,
but no optimal obstruction (respectively, view) censor exists.
    Next, we show how to apply resolution-based techniques to compute opti-
mal obstructions for instances with linear RL ontologies. These results are then
adapted to the case of QL. The algorithm for linear RL is based on the compu-
tation of the set Q(I). To do this computation efficient, we need the following
auxiliary structure.
Definition 12. Let O be a linear RL ontology, D a dataset, x and y fresh vari-
ables, and A the set of all equality-free atoms over the signature of O∪D extended
with x and y. The proof graph of O ∪ D is the directed graph with the set of
nodes A ∪ {>}, and edges (β, γ) such that γ can be derived from β by means of
a single SLD resolution step with a rule from O ∪ D.
   The following example illustrates proof graphs.
                                                        1
Example 5. Consider a CQE-instance I1ex with ontology Oex = {Likes(x, y) →
                                                 1
Movie(y), Likes(x, y) → MovieFan(x)}, dataset Dex = {Likes(John, Seven)},
                               1
and the policy of single atom αex = MovieFan(John). A fragment of the proof
graph is given in Figure 2.
   Using proof graphs we can compute optimal censors.
Theorem 8. Let I = (O, D, P) be a CQE-instance with O a linear RL ontology.
For each α ∈ P, let Sα be the set of nodes in the proof graph of O ∪ D in a path
from α to >. Finally, let U be the Boolean UCQ
                            _       _
                                                ∃∗ G.
                               α∈P     G∈Sα \{>}

Then, ocensU
           I  is the unique optimal censor for I, and U can be computed in
polynomial time in the size of I.
                1                         Movie(Seven)      MovieFan(x)
               ↵ex = MovieFan(John)

                                                               Movie(y)
                    Likes(John, y)        Likes(x, Seven)


                     Likes(John, Seven)         >            Likes(x, y)


                                                         1     1
                    Fig. 2. Fragment of proof graph for Oex ∪ Dex


Example 6. For I1ex from Example 5, there is only one path in the proof graph
      1
from αex to > and Sα1ex\ {>} = {MovieFan(John), likes(John, y)}. Thus, U =
MovieFan(John) ∨ ∃y.Likes(John, y) is optimal.

    Finally, note that the transformation of a QL ontology O to an RL ontology
Ξσ (O) given in Definition 9, preserves linearity of rules. Hence, Proposition 3
and Theorem 8 yield the following result.

Theorem 9. For every CQE-instance with a QL ontology there exists a unique
optimal obstruction censor.


7     Discussion and Conclusions
We have studied CQE in the context of ontologies. Our results yield a flexible way
for system designers to ensure selective access to data and provide insights on the
fundamental tradeoff between accessibility and confidentiality of information.
We have proposed algorithms applicable to the profiles of OWL 2, which can
be implemented using off-the-shelf query answering infrastructure. Thus, our
algorithms provide a starting point to the development of CQE systems.
    The problems studied here remain rather unexplored and we see many open
questions. From a theoretic point of view, we plan to consider policies beyond sets
of facts (e.g., given as CQs). We also plan to study weaker notions of optimality
that can ensure polynomiality of views and obstructions for more expressive
languages. From a practical perspective, we will implement our algorithms and
test their scalability using state-of-the art Datalog engines such as RDFox.3
    The approach closest to ours is the view-based access authorisation frame-
work in [9]. In this setting, policies are represented as authorisation views: CQs
that define the only information accessible to the user; since queries are answered
faithfully against the views, there is no explicit notion of policy violation. In con-
trast, in our setting policies express inaccessible information, and our goal is to
maximally answer queries without violating the policy.
Acknowledgements. Work supported by the Royal Society, the EPSRC projects
Score!, Exoda, and MaSI3 , and the FP7 project OPTIQUE.

3
    http://www.cs.ox.ac.uk/isg/tools/RDFox/
References
 1. Bao, J., Slutzki, G., Honavar, V.: Privacy-Preserving Reasoning on the Semantic
    Web. In: WI. pp. 791–797. IEEE Computer Society (2007)
 2. Biskup, J., Bonatti, P.: Controlled Query Evaluation with Open Queries for a
    Decidable Relational Submodel. Ann. Math. and Artif. Intell. 50(1-2), 39–77 (2007)
 3. Biskup, J., Bonatti, P.A.: Lying Versus Refusal for Known Potential Secrets. Data
    Knowl. Eng. 38(2), 199–222 (2001)
 4. Biskup, J., Bonatti, P.A.: Controlled Query Evaluation for Enforcing Confidential-
    ity in Complete Information Systems. Int. J. Inf. Sec. 3(1), 14–27 (2004)
 5. Biskup, J., Weibert, T.: Keeping Secrets in Incomplete Databases. Int. J. Inf. Sec.
    7(3), 199–217 (2008)
 6. Bonatti, P.A., Kraus, S., Subrahmanian, V.S.: Foundations of Secure Deductive
    Databases. IEEE Trans. Knowl. Data Eng. 7(3), 406–422 (1995)
 7. Bonatti, P.A., Sauro, L.: A Confidentiality Model for Ontologies. In: ISWC. pp.
    17–32 (2013)
 8. Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: View-based Query An-
    swering over Description Logic Ontologies. In: KR. AAAI Press (2008)
 9. Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: View-based Query An-
    swering in Description Logics: Semantics and Complexity. J. Comput. Syst. Sci.
    78(1), 26–46 (2012)
10. Cuenca Grau, B.: Privacy in ontology-based information systems: A pending mat-
    ter. Semantic Web 1(1-2), 137–141 (2010)
11. Cuenca Grau, B., Kharlamov, E., Kostylev, E.V., Zheleznyakov, D.: Controlled
    Query Evaluation over OWL 2 RL Ontologies. In: ISWC. pp. 49–65 (2013)
12. Cuenca Grau, B., Motik, B.: Reasoning over Ontologies with Hidden Content: The
    Import-by-Query Approach. J. Artif. Intell. Res. 45, 197–255 (2012)
13. Kolaitis, P.G., Vardi, M.Y.: A Logical Approach to Constraint Satisfaction. In:
    Complexity of Constraints. pp. 125–155 (2008)
14. Kontchakov, R., Lutz, C., Toman, D., Wolter, F., Zakharyaschev, M.: The Com-
    bined Approach to Ontology-Based Data Access. In: IJCAI. pp. 2656–2661 (2011)
15. Lutz, C., Seylan, I., Toman, D., Wolter, F.: The Combined Approach to OBDA:
    Taming Role Hierarchies Using Filters. In: ISWC. pp. 314–330 (2013)
16. Lutz, C., Toman, D., Wolter, F.: Conjunctive Query Answering in the Description
    Logic EL Using a Relational Database System. In: IJCAI. pp. 2070–2075 (2009)
17. Sandhu, R.S., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role-Based Access Con-
    trol Models. IEEE Computer 29(2), 38–47 (1996)
18. Sicherman, G.L., de Jonge, W., van de Riet, R.P.: Answering Queries Without
    Revealing Secrets. ACM Trans. Database Syst. 8(1), 41–59 (1983)
19. Stefanoni, G., Motik, B., Horrocks, I.: Introducing Nominals to the Combined
    Query Answering Approaches for EL. In: AAAI (2013)
20. Stouppa, P., Studer, T.: A Formal Model of Data Privacy. In: PSI (2007)
21. Tao, J., Slutzki, G., Honavar, V.: Secrecy-Preserving Query Answering for Instance
    Checking in EL. In: RR. pp. 195–203 (2010)