=Paper=
{{Paper
|id=Vol-2994/paper56
|storemode=property
|title=Privacy Preserving Query Answering in Description Logics Through Instance Indistinguishability (Discussion Paper)
|pdfUrl=https://ceur-ws.org/Vol-2994/paper56.pdf
|volume=Vol-2994
|authors=Gianluca Cima,Domenico Lembo,Riccardo Rosati,Domenico Fabio Savo
|dblpUrl=https://dblp.org/rec/conf/sebd/CimaLRS21
}}
==Privacy Preserving Query Answering in Description Logics Through Instance Indistinguishability (Discussion Paper)==
Privacy Preserving Query Answering in
Description Logics Through Instance
Indistinguishability
(Discussion Paper)
Gianluca Cima1 , Domenico Lembo2 , Riccardo Rosati2 and Domenico Fabio Savo3
1
University of Bordeaux, CNRS, Bordeaux INP, LaBRI
2
Sapienza Università di Roma
3
Università degli Studi di Bergamo
Abstract
We study privacy-preserving query answering in Description Logics (DLs). Specifically, we consider the
approach of controlled query evaluation (CQE) based on the notion of instance indistinguishability. We
derive data complexity results for query answering over DL-Liteℛ ontologies, through a comparison with
an alternative, existing confidentiality-preserving approach to CQE. Finally, we identify a semantically
well-founded notion of approximated query answering for CQE, and prove that, for DL-Liteℛ ontologies,
this form of CQE is tractable with respect to data complexity and is first-order rewritable, i.e., it is always
reducible to the evaluation of a first-order query over the data instance.
1. Introduction
We consider controlled query evaluation (CQE), a declarative framework for privacy-preserving
query answering investigated in the literature on knowledge representation and database theory [1,
2, 3]. The basic idea of CQE is defining a data protection policy through logical statements.
Consider for instance an organization that wants to keep confidential the fact that it has suppliers
involved in both Project A and Project B. This can be expressed over the information schema of the
organization through a denial assertion of the form ∀𝑥. Supplier(𝑥) ∧ ProjA(𝑥) ∧ ProjB(𝑥) → ⊥
In CQE, two different main approaches can be identified. The first one [4, 5, 6] models privacy
preservation through the notion of indistinguishable data instances. In this approach, a system
for CQE enforces data privacy if, for every data instance 𝐼, there exists a data instance 𝐼 ′ that
does not violate the data protection policy and is indistinguishable from 𝐼 for the user, i.e., for
every user query 𝑞, the system provides the same answers to 𝑞 over 𝐼 and over 𝐼 ′ . We call this
approach (instance) indistinguishability-based (IB). In continuation of the previous example,
SEBD 2021: The 29th Italian Symposium on Advanced Database Systems, September 5-9, 2021, Pizzo Calabro (VV),
Italy
" gianluca.cima@u-bordeaux.fr (G. Cima); lembo@diag.uniroma1.it (D. Lembo); rosati@diag.uniroma1.it
(R. Rosati); domenicofabio.savo@unibg.it (D. F. Savo)
0000-0003-1783-5605 (G. Cima); 0000-0002-0628-242X (D. Lembo); 0000-0002-7697-4958 (R. Rosati);
0000-0002-8391-8049 (D. F. Savo)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR
Workshop
Proceedings
http://ceur-ws.org
ISSN 1613-0073
CEUR Workshop Proceedings (CEUR-WS.org)
in the presence of an instance {Supplier(𝑐), ProjA(𝑐), ProjB(𝑐)}, an IB system should answer
user queries as if the instance were, e.g., {Supplier(𝑐), ProjA(𝑐)} (note that other instances not
violating the policy can be considered as indistinguishable, e.g., {Supplier(𝑐), ProjB(𝑐)}).
The second approach [7, 8] models privacy preservation by considering the whole (possibly
infinite) set of answers to queries that the system provides to the user. In this approach, a CQE
system protects the data if, for every data instance 𝐼, the logical theory corresponding to the set
of answers provided by the system to all queries over 𝐼 does not entail any violation of the data
protection policy. According to [8], we call this approach confidentiality-preserving (CP). In
our ongoing example, a CP system would entail, e.g., the queries Supplier(𝑐) ∧ ProjA(𝑐) and
∃𝑥.Supplier(𝑥) ∧ ProjB(𝑥), but not also the query Supplier(𝑐) ∧ ProjB(𝑐).
In both approaches, the ultimate goal is to realize optimal CQE systems, i.e., systems maximiz-
ing the answers returned to user queries, still respecting the data protection policy. Traditionally,
this aim has been pursued through the construction of a single optimal censor, i.e., a specific
implementation of the adopted notion of privacy-preservation, either IB or CP. Since, however, in
both approaches several optimal censors typically exist, this way of proceeding requires to make
a choice on how to obfuscate data, which, in the absence of additional (preference) criteria, may
result discretionary. To avoid this, query answering over all optimal censors has been recently
studied, limited to the CP approach [7, 9], whereas query answering over all optimal IB censors
has not been investigated so far. Moreover, among the complexity results obtained and the
techniques defined so far for CQE, we still miss the identification of cases that are promising
towards its practical usage.
In this paper, we aim at filling some of the above mentioned gaps in the context of Description
Logic (DL) ontologies. We focus on the approach to CQE based on instance indistinguishability
(Section 3), and study its relationship with the CP approach (Section 4). Specifically, we prove
that the IB approach to CQE in DLs corresponds to a particular instance of the CP approach to
CQE [9]. Based on such a correspondence, we show that, even in the lightweight DL DL-Liteℛ ,
query answering in the IB approach is intractable with respect to data complexity, unless one relies
on a single optimal censor chosen non-deterministically in the lack of further meta-information
about the domain of the dataset.
To overcome the above problems and provide a practical, semantically well-founded solution,
we define a quasi-optimal notion of IB censor, which corresponds to the best sound approximation
of all the optimal IB censors (Section 5). We then prove that, in the case of DL-Liteℛ ontologies,
query answering based on the quasi-optimal IB censor is reducible to the evaluation of a first-order
query over the data instance, i.e., it is first-order rewritable. We believe that this result has an
important practical impact. Indeed, we have identified a setting in which privacy-preserving query
answering formalized in a declarative logic-based framework as CQE, for a DL (i.e., DL-Liteℛ )
specifically designed for data management, has the same data complexity as evaluating queries
over a database (i.e., in AC0 ). This opens the possibility of defining algorithms for CQE of
practical usage, amenable to implementation on top of traditional (relational) data management
systems, as done in [10] exploiting Ontology-based Data Access engines.
This paper summarizes the results of our IJCAI-20 conference publication [11].
2. Preliminaries
We use standard notions of function-free first-order (FO) logic, and in particular we consider
Description Logics (DLs), which are fragments of FO using only unary and binary predicates,
called concepts and roles, respectively.We assume to have the pairwise disjoint countably infinite
sets Σ𝐶 , Σ𝑅 , Σ𝐼 and Σ𝒱 for atomic concepts, atomic roles, constants (a.k.a. individuals), and
variables, respectively. A DL ontology 𝒪 = 𝒯 ∪ 𝒜 is constituted by a TBox 𝒯 and an ABox 𝒜,
specifying intensional and extensional knowledge, respectively. The set of atomic concepts and
roles occurring in 𝒪 is the signature of 𝒪. The semantics of 𝒪 is given in terms of FO models
over the signature of 𝒪, in the standard way.In particular, we say that 𝒪 is consistent if it has at
least one model, inconsistent otherwise. 𝒪 entails an FO sentence 𝜑 specified over the signature
of 𝒪, denoted 𝒪 |= 𝜑, if 𝜑 is true in every model of 𝒪. In this paper, we consider ontologies
expressed in DL-Liteℛ , the member of the DL-Lite family [12] which underpins OWL 2 QL, i.e.,
the OWL 2 profile specifically designed for efficient query answering. A TBox 𝒯 in DL-Liteℛ
is a finite set of axioms of the form 𝐵1 ⊑ 𝐵2 (resp., 𝑅1 ⊑ 𝑅2 ), denoting concept (resp., role)
inclusion, and 𝐵1 ⊑ ¬𝐵2 (resp., 𝑅1 ⊑ ¬𝑅2 ), denoting concept (resp., role) disjointness, where:
𝑅1 , 𝑅2 are of the form 𝑃 , with 𝑃 ∈ Σ𝑅 , or its inverse 𝑃 − , and 𝐵1 , 𝐵2 are of the form 𝐴, with
𝐴 ∈ Σ𝐶 , ∃𝑃 , or ∃𝑃 − , i.e., unqualified existential restrictions, which denote the set of objects
occurring as first or second argument of 𝑃 , respectively. An ABox 𝒜 is a finite set of ground
atoms, i.e., assertions of the form 𝐴(𝑎), 𝑃 (𝑎, 𝑏), where 𝐴 ∈ Σ𝐶 , 𝑃 ∈ Σ𝑅 , and 𝑎, 𝑏 ∈ Σ𝐼 . As
usual in query answering over DL ontologies, we focus on conjunctive queries. A Boolean
conjunctive query (BCQ) 𝑞 is an FO sentence of the form ∃𝑥 ⃗ .𝜑(𝑥
⃗ ), where ⃗𝑥 are variables in Σ𝒱 ,
and 𝜑(𝑥 ⃗ ) is a finite, non-empty conjunction of atoms of the form 𝛼(𝑡 ⃗), where 𝛼 ∈ Σ𝐶 ∪ Σ𝑅 , and
⃗
each term in 𝑡 is either a constant in Σ𝐼 or a variable in ⃗𝑥. We denote by Eval(𝑞, 𝒜) the result of
the evaluation of a query 𝑞 over (the model isomorphic to) an ABox 𝒜.
A denial assertion (or simply a denial) is an FO sentence of the form ∀𝑥 ⃗ .𝜑(𝑥⃗ ) → ⊥, such that
∃𝑥 ⃗ ) is a BCQ. Given one such denial 𝛿 and an ontology 𝒪, we say that 𝒪 ∪ {𝛿} is consistent
⃗ .𝜑(𝑥
if 𝒪 ̸|= ∃𝑥⃗ .𝜑(𝑥⃗ ), and is inconsistent otherwise.
In the following, with FO, CQ, and GA, we denote the languages of function-free FO
sentences, BCQs, and ground atoms, respectively, all specified over the alphabets Σ𝐶 , Σ𝑅 , Σ𝐼 ,
and Σ𝒱 . Given an ontology 𝒪 and a language ℒ, with ℒ(𝒪) we refer to the subset of ℒ whose
sentences are built over the signature of 𝒪 and the variables in Σ𝒱 . For a TBox 𝒯 and a language
ℒ, we denote by cl𝒯ℒ (·) the function that, for an ABox 𝒜, returns all the sentences 𝜑 ∈ ℒ(𝒯 ∪ 𝒜)
such that 𝒯 ∪ 𝒜 |= 𝜑.
For the sake of presentation, we will limit our technical treatment to languages containing
only closed formulas, but our results hold also for open formulas. In particular, the results on
entailment of BCQs can be extended to arbitrary (i.e., non-Boolean) CQs in the standard way1 .
Our complexity results are for data complexity, i.e., are w.r.t. the size of the ABox only.
1
We also notice that, since DL-Liteℛ is insensitive to the adoption of the unique name assumption (UNA) for CQ
answering,our results hold both with and without UNA.
3. CQE through Instance Indistinguishability
A CQE framework consists of a TBox 𝒯 and a policy 𝒫 over 𝒯 , i.e., a finite set of denial
assertions over the signature of 𝒯 . An ABox 𝒜 for 𝒯 is such that 𝒜 and 𝒯 have the same
signature. In the following, when a TBox 𝒯 is given, we always assume that the coupled policy is
specified over 𝒯 , that each considered ABox 𝒜 is for 𝒯 , and that 𝒯 ∪ 𝒜 and 𝒯 ∪ 𝒫 are consistent.
A censor is a function that taken an ABox for 𝒯 as input alters standard query answering over
𝒯 ∪ 𝒜 so that on the basis of the answers (even a possibly infinite set thereof) and the TBox, a
user can never infer a BCQ ∃𝑥 ⃗ ) such that ∀𝑥
⃗ .𝜑(𝑥 ⃗ ) → ⊥ belongs to 𝒫.
⃗ .𝜑(𝑥
We here propose a notion of censor which is the natural application to our framework of the
analogous definitions given in [4, 5, 6, 13, 14]. The basic idea of this approach is that for every
underlying instance (an ABox in our framework) and every query, a censor returns to the user
the same answers it would return on another (possibly identical) instance that does not contain
confidential data, so that she cannot understand which of the two instances she is querying. This
is formalized as follows.
Definition 1. [Indistinguishability-based censor] Let 𝒯 be a DL TBox and 𝒫 be a policy. An
indistinguishability-based (IB) censor for 𝒯 and 𝒫 is a function cens(·) that, for each ABox 𝒜,
returns a set cens(𝒜) ⊆ cl𝒯CQ (𝒜) such that there exists an ABox 𝒜′ for which (𝑖) cens(𝒜) =
cl𝒯CQ (𝒜′ ) (in this case we say that 𝒜 and 𝒜′ are indistinguishable w.r.t. cens) and (𝑖𝑖) 𝒯 ∪ 𝒫 ∪ 𝒜′
is a consistent FO theory.
Example 1. The TBox signature consists of the atomic concepts Supplier, ProjA, and ProjB,
denoting the set of suppliers of the company, suppliers involved in Project A and those involved in
Project B, respectively, and contains the axioms ProjA ⊑ Supplier and ProjB ⊑ Supplier, stating
that each individual instance of ProjA or ProjB is also instance of Supplier. Data protection is
specified through the policy 𝒫 = {∀𝑥.ProjA(𝑥) ∧ ProjB(𝑥) → ⊥}. The following functions are
IB censors for 𝒯 and 𝒫:
• cens1 : given an ABox 𝒜, cens1 (𝒜) returns the set cl𝒯CQ (𝒜PA ) of BCQs, where 𝒜PA
is obtained from 𝒜 by removing the assertion ProjA(𝑐), for each individual 𝑐 such that
both ProjA(𝑐) and ProjB(𝑐) are in 𝒜 (note that for every ABox 𝒜, 𝒜 and 𝒜PA are
indistinguishable w.r.t. cens1 . Similarly in the following censors).
• cens2 : given an ABox 𝒜, cens2 (𝒜) returns the set cl𝒯CQ (𝒜PB ) of BCQs, where 𝒜PB is
obtained from 𝒜 by removing the assertion ProjB(𝑐), for each individual 𝑐 such that both
ProjA(𝑐) and ProjB(𝑐) are in 𝒜.
• cens3 : given an ABox 𝒜, cens3 (𝒜) returns the set cl𝒯CQ (𝒜sup ) of BCQs, where 𝒜𝑠𝑢𝑝 is
obtained from 𝒜 by adding the assertion Supplier(𝑐) and removing ProjA(𝑐) and ProjB(𝑐),
for each individual 𝑐 such that both ProjA(𝑐) and ProjB(𝑐) are in 𝒜.
It is easy to see that an IB censor always exists, but, as Example 1 shows, there may be many IB
censors for 𝒯 and 𝒫, and so it is reasonable to look for censors preserving as much information
as possible. Formally, given two IB censors cens and cens′ for 𝒯 and 𝒫, we say that cens′ is
more informative than cens if: (i) for every ABox 𝒜, cens(𝒜) ⊆ cens′ (𝒜), and (ii) there exists
an ABox 𝒜′ such that cens(𝒜′ ) ⊂ cens′ (𝒜′ ). Optimal censors are then defined as follows.
Definition 2. Let 𝒯 be a DL TBox and 𝒫 be a policy. An IB censor cens for 𝒯 and 𝒫 is optimal
if there does not exist any other IB censor for 𝒯 and 𝒫 that is more informative than cens. We
denote by OptIBCens𝒯 ,𝒫 the set of the optimal IB censors for 𝒯 and 𝒫.
Example 2. Among the censors of Example 1, cens3 ̸∈ OptIBCens𝒯 ,𝒫 , since both cens1 and
cens2 are more informative than cens3 . It can be then verified that cens1 and cens2 are the only
optimal IB censors for 𝒯 and 𝒫.
4. IB Censors vs. CP Censors
In [8], a different notion of censor, named confidentiality-preserving (CP) censor, has been
proposed. Intuitively, a CP censor establishes which are the BCQs entailed by a TBox and a given
ABox that can be disclosed without violating the policy. We report below the definition given
in [9], which generalizes CP censors to any language ℒ ⊆ FO, called the censor language.
Definition 3. [Confidentiality-preserving censor] Let 𝒯 be a DL TBox, 𝒫 be a policy, and
ℒ ⊆ FO be a language. A confidentiality-preserving (CP) censor in ℒ for 𝒯 and 𝒫 is a function
cens(·) that, for each ABox 𝒜, returns a set cens(𝒜) ⊆ cl𝒯ℒ (𝒜) such that 𝒯 ∪ 𝒫 ∪ cens(𝒜) is a
consistent FO theory.
The notion of more informative censor previously given for IB censors can be naturally
extended to CP censors, and so we can define optimal censors also in this case. We denote by
ℒ-OptCPCens𝒯 ,𝒫 the set of the optimal CP censors in ℒ for 𝒯 and 𝒫.
Example 3. Consider 𝒯 and 𝒫 as defined in Example 1. An optimal CP censor cens4 in CQ for
𝒯 and 𝒫 is defined as follows: given an ABox 𝒜, cens4 (𝒜) returns the set of BCQs obtained by
removing from cl𝒯CQ (𝒜) every query containing the atom ProjA(𝑐), for each individual 𝑐 such
that both ProjA(𝑐) and ProjB(𝑐) are in 𝒜.
We notice that the CP censor cens4 is not an IB censor. Indeed, consider the ABox 𝒜 =
{ProjA(𝑐), ProjB(𝑐)}. We have that cens4 (𝒜) = {𝜑 | 𝜑 ∈ CQ and 𝒯 ∪ 𝒮 |= 𝜑}, where
𝒮 = {∃𝑥.ProjA(𝑥), ProjB(𝑐)}. It is not hard to see that there exists no ABox 𝒜′ such that 𝒜′
and 𝒜 are indistinguishable w.r.t. cens4 and 𝒯 ∪ 𝒫 ∪ 𝒜′ is consistent.
Let 𝒜 be an ABox and cens be either an IB or a CP censor, the set cens(𝒜) is called theory of
the censor cens for 𝒜.
The following theorems explain the relation between IB censors and CP censors.
Theorem 1. Let 𝒯 be a DL TBox and 𝒫 be a policy. If cens is an IB censor for 𝒯 and 𝒫, then it
is a CP censor in CQ for 𝒯 and 𝒫. The converse does not necessarily hold.
Theorem 2. Let 𝒯 be a DL TBox and 𝒫 be a policy. Then, ib_cens ∈ OptIBCens𝒯 ,𝒫
iff there exists a CP censor cp_cens ∈ GA-OptCPCens𝒯 ,𝒫 such that, for each ABox 𝒜,
cl𝒯CQ (cp_cens(𝒜)) = ib_cens(𝒜).
From Theorem 2 and the results given in [9], it follows that, given a DL-Liteℛ TBox 𝒯 , a policy
𝒫, an ABox 𝒜, and a BCQ 𝑞, deciding whether 𝑞 ∈ cens(𝒜) for every cens ∈ OptIBCens𝒯 ,𝒫 is
coNP-complete in data complexity.
5. Approximating Optimal IB Censors
Towards a practical approach to CQE, in this section we consider a different entailment problem
that approximates entailment under each optimal censor, and we show that its data complexity is
in AC0 (i.e., the same complexity of evaluating FO queries over a database). The approximation
we propose consists in considering a non-necessarily optimal IB censor whose theory, for every
ABox, is as close as possible to the theories of all the optimal IB censors. We call such censor
quasi-optimal IB (QIB) censor.
Definition 4. [QIB censor] Let 𝒯 be a DL TBox, let 𝒫 be a policy, and let cens be an IB
censor for 𝒯 and 𝒫. We say that cens is a QIB censor if: (i) cens(𝒜) ⊆ cens′ (𝒜), for each
cens′ ∈ OptIBCens𝒯 ,𝒫 , and (ii) there exists no IB censor cens′ for 𝒯 and 𝒫 which is more
informative than cens and that satisfies condition.
Example 4. The IB censor cens3 of Example 1 is a QIB censor for 𝒯 and 𝒫 (but cens3 ̸∈
OptIBCens𝒯 ,𝒫 ).
It is not hard to see that a QIB censor for a DL TBox 𝒯 and a policy 𝒫 always exists and it is
unique. Hereinafter, we denote with qib_cens𝒯 ,𝒫 the QIB censor for 𝒯 and 𝒫.
Definition 5. Let 𝒯 be a DL TBox, 𝒫 be a policy, 𝒜 be an ABox, and 𝑞 be a BCQ.
QIB-Entailment(𝒯 , 𝒫, 𝒜, 𝑞) is the problem of deciding whether 𝑞 ∈ qib_cens𝒯 ,𝒫 (𝒜).
We now focus on the case of DL-Liteℛ TBoxes and prove that, in this case, entailment of
BCQs under QIB censors is FO-rewritable. Formally, we say that QIB-entailment in a DL ℒ is
FO-rewritable, if for every TBox 𝒯 expressed in ℒ, every policy 𝒫 and every BCQ 𝑞, one can
effectively compute an FO query 𝑞𝑟 such that for every ABox 𝒜, QIB-Entailment(𝒯 , 𝒫, 𝒜, 𝑞)
is true iff 𝒜 |= 𝑞𝑟 . We call 𝑞𝑟 the QIB-perfect reformulation of 𝑞 w.r.t. 𝒯 and 𝒫.
We prove FO-rewritability of entailment of BCQs under QIB censors in DL-Liteℛ by exploiting
a correspondence between this problem and entailment of BCQs under IAR-semantics for DL
ontologies, which is indeed FO-rewritable for DL-Liteℛ,𝑑𝑒𝑛 , i.e., DL-Liteℛ enriched with denial
assertions [15]. We recall that the IAR-semantics is an inconsistency-tolerant semantics that
allows for meaningful entailment also when the ABox contradicts the TBox of an ontology. The
entailment under IAR-semantics is defined as follows: give a DL 𝒯 , an ABox 𝒜, and BCQ 𝑞,
IAR-Entailment(𝒯 , 𝒜, 𝑞) is the problem of verifying whether 𝒯 ∪ ℛ𝑖𝑎𝑟 |= 𝑞, where ℛ𝑖𝑎𝑟 is the
intersection of all the maximal subsets 𝒜𝑟 of 𝒜 such that 𝒜𝑟 ∪ 𝒯 is consistent.
Theorem 3. Let 𝒯 be a DL-Liteℛ TBox, 𝒫 be a policy, 𝒜 be an ABox, and 𝑞 be a BCQ.
QIB-Entailment(𝒯 , 𝒫, 𝒜, 𝑞) is true iff IAR-Entailment(𝒯 ∪ 𝒫, cl𝒯GA (𝒜), 𝑞) is true.
Theorem 3 actually states that, to solve QIB-entailment, we can resort to the query rewriting
techniques used to establish IAR-entailment given in [15], provided that we compute cl𝒯GA (𝒜).
We recall that query entailment under IAR-semantics in a DL ℒ is FO-rewritable if for every
TBox 𝒯 expressed in ℒ and every BCQ 𝑞, one can effectively compute an FO query 𝑞𝑟 such
that for every ABox 𝒜, IAR-Entailment(𝒯 , 𝒜, 𝑞) is true iff 𝒜 |= 𝑞𝑟 . The query 𝑞𝑟 is called the
IAR-perfect reformulation of 𝑞 w.r.t. 𝒯 .
To establish FO-rewritability of QIB-entailment in DL-Liteℛ , however, we still need to address
the above mentioned computation of cl𝒯GA (𝒜), and turn it into an additional query reformulation
step. To this aim, we can exploit the fact that, for a DL-Liteℛ,𝑑𝑒𝑛 ontology 𝒯 ∪ 𝒜, an FO query 𝑞
evaluates to true over cl𝒯GA (𝒜) iff 𝑞 ′ evaluates to true over 𝒜, where 𝑞 ′ is obtained by suitably
rewriting each atom of 𝑞 according to the positive inclusions of 𝒯 . Intuitively, in this way we cast
into the query all the possible causes of the facts that are contained in the closure of the ABox
w.r.t. the TBox.
To compute such a query 𝑞 ′ , we use the function atomRewr(𝑞, 𝒯 ), which substitutes each
atom 𝛼 of 𝑞 with the formula 𝜑(𝛼) defined as follows (where 𝐴, 𝐵 are atomic concepts and 𝑅, 𝑆
are atomic roles):
⋁︀ ⋁︀ ⋁︀
𝜑(𝐴(𝑡)) = 𝒯 |=𝐵⊑𝐴 𝐵(𝑡) ∨ 𝒯 |=∃𝑅⊑𝐴 (∃𝑥.𝑅(𝑡, 𝑥)) ∨ 𝒯 |=∃𝑅− ⊑𝐴 (∃𝑥.𝑅(𝑥, 𝑡))
⋁︀ ⋁︀
𝜑(𝑅(𝑡1 , 𝑡2 )) = 𝒯 |=𝑆⊑𝑅 𝑆(𝑡1 , 𝑡2 ) ∨ 𝒯 |=𝑆 − ⊑𝑅 𝑆(𝑡2 , 𝑡1 ).
The following lemma, whose proof can be immediately obtained from the definitions of cl𝒯GA (·)
and atomRewr(·, ·), states the property we are looking for.
Lemma 1. Let 𝒯 be a DL-Liteℛ,𝑑𝑒𝑛 TBox, 𝒜 be an ABox, and 𝑞 be an FO sentence. Then
Eval(𝑞, cl𝒯GA (𝒜)) = Eval(atomRewr(𝑞, 𝒯 ), 𝒜).
We are now able to extablish FO-rewritability of QIB-entailment.
Theorem 4. Let 𝒯 be a DL-Liteℛ TBox, 𝒫 be a policy, 𝑞 be a BCQ, and 𝑞𝑟 be an FO sentence
that is a IAR-perfect reformulation of 𝑞 w.r.t. the DL-Liteℛ,𝑑𝑒𝑛 TBox 𝒯 ∪ 𝒫. Then, the FO
sentence atomRewr(𝑞𝑟 , 𝒯 ) is a QIB-perfect reformulation of 𝑞 w.r.t. 𝒯 and 𝒫.
Since IAR-entailment is actually FO rewritable, as shown in [15], the above theorem proves the
FO rewritability of QIB-entailment for DL-Liteℛ TBoxes. Moreover, the above theorem identifies
a technique for obtaining the QIB-perfect reformulation of a CQ, based on a simple combination
of the IAR-perfect reformulation algorithm of [15] and the atomRewr reformulation defined
above. Therefore:
Corollary 1. Let 𝒯 be a DL-Liteℛ TBox, 𝒫 be a policy, 𝒜 be an ABox, and 𝑞 be a BCQ. The
problem QIB-Entailment(𝒯 , 𝒫, 𝒜, 𝑞) is in AC0 in data complexity.
6. Conclusions
In this paper, we have studied the approach to CQE based on instance indistinguishability and
identified a semantically well-founded notion of CQE that enjoys first-order rewritability in the
case of DL-Liteℛ ontologies.
An important future research direction is a deeper study of the user model. Our framework
inherits from its predecessors a relatively simple model, which assumes that the user knows
(at most) the TBox and all the query answers returned by the system, and considers only the
deductive abilities of the user over such knowledge. This user model might need to be enriched
to capture more realistic data protection scenarios.
Acknowledgments
This work was partly supported by the ANR AI Chair INTENDED (ANR-19-CHIA-0014), by
MIUR under the PRIN 2017 project “HOPE” (prot. 2017MMJJRE), by the EU within the H2020
Programme under the grant agreement 834228 (ERC Advanced Grant WhiteMec) and the grant
agreement 825333 (MOSAICrOWN), by Regione Lombardia within the Call Hub Ricerca e
Innovazione under the grant agreement 1175328 (WATCHMAN), and by Sapienza Università di
Roma (2019 project CQEinOBDM).
References
[1] G. L. Sicherman, W. de Jonge, R. P. van de Riet, Answering queries without revealing
secrets, ACM Trans. Database Syst. 8 (1983) 41–59.
[2] P. A. Bonatti, S. Kraus, V. S. Subrahmanian, Foundations of secure deductive databases,
IEEE Trans. Knowl. Data Eng. 7 (1995) 406–422.
[3] J. Biskup, For unknown secrecies refusal is better than lying, Data and Knowl. Eng. 33
(2000) 1–23.
[4] J. Biskup, P. A. Bonatti, Controlled query evaluation for known policies by combining lying
and refusal, Ann. Math. Artif. Intell. 40 (2004) 37–62.
[5] J. Biskup, T. Weibert, Keeping secrets in incomplete databases, Int. J. of Information
Security 7 (2008) 199–217.
[6] P. A. Bonatti, L. Sauro, A confidentiality model for ontologies, in: Proc. of ISWC, volume
8218 of LNCS, 2013, pp. 17–32.
[7] B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, D. Zheleznyakov, Controlled query
evaluation over OWL 2 RL ontologies, in: Proc. of ISWC, volume 8218, 2013, pp. 49–65.
[8] B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, D. Zheleznyakov, Controlled query
evaluation for datalog and OWL 2 profile ontologies, in: Proc. of IJCAI, 2015, pp. 2883–
2889.
[9] D. Lembo, R. Rosati, D. F. Savo, Revisiting controlled query evaluation in description
logics, in: Proc. of IJCAI, 2019, pp. 1786–1792.
[10] G. Cima, D. Lembo, L. Marconi, R. Rosati, D. F. Savo, Controlled query evaluation in
ontology-based data access, in: Proc. of ISWC, volume 12506 of LNCS, 2020, pp. 128–146.
[11] G. Cima, D. Lembo, R. Rosati, D. F. Savo, Controlled query evaluation in description logics
through instance indistinguishability, in: Proc. of IJCAI, 2020, pp. 1791–1797.
[12] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, R. Rosati, Tractable reasoning
and efficient query answering in description logics: The DL-Lite family, J. of Automated
Reasoning 39 (2007) 385–429.
[13] M. Benedikt, B. Cuenca Grau, E. V. Kostylev, Logical foundations of information disclosure
in ontology-based data integration, Artif. Intell. J. 262 (2018) 52–95.
[14] M. Benedikt, P. Bourhis, L. Jachiet, M. Thomazo, Reasoning about disclosure in data
integration in the presence of source constraints, in: Proc. of IJCAI, 2019, pp. 1551–1557.
[15] D. Lembo, M. Lenzerini, R. Rosati, M. Ruzzi, D. F. Savo, Inconsistency-tolerant query
answering in ontology-based data access, J. of Web Semantics 33 (2015) 3–29.