-

Controlled Query Evaluation with Epistemic Dependencies: Algorithms and Experiments (Extended Abstract)

Lorenzo Marconi

Flavia Ricci

Riccardo Rosati

0 0 Sapienza University of Rome , Italy

2025

38 3 6

This work summarizes our paper accepted to the 24th International Semantic Web Conference focusing on Controlled Query Evaluation over Description Logics ontologies. We express the data protection policy using epistemic dependencies (EDs), and use optimal ground atom (GA) censors as tools for exposing the facts entailed by the ontology in a maximal, policy-compliant way. We study the complexity of answering Boolean unions of conjunctive queries with respect to the intersection of all optimal GA censors. We identify a class of EDs for which the examined entailment problem over DL-Liteℛ ontologies is first-order rewritable, and we empirically validate the eficiency of our method.

eol>Description Logics Confidentiality Preservation Query Answering First-Order Rewritability

Definition 1.

An epistemic dependency (ED) is a sentence of the form

∀x1, x2 ((x1, x2) → ℎ(x2)) where (x1, x2) is a CQ with free variables x1 ∪ x2, ℎ(x2) is a CQ with free variables x2, and is an epistemic operator.

In the same spirit as in the aforementioned work, we use EDs as disclosure rules to govern the publication of data. Intuitively, if is any substitution assigning the universal variables of an ED to constants, the fact that the ontology entails () may only be disclosed if (ℎ) can also be made public. More formally, we say that an FO theory Φ satisfies an ED (in symbols Φ |=EQL ) if, for every assignment of the free variables of with constants, if Φ |= () then Φ |= (ℎ). If Φ satisfies all EDs of a policy , then we say that Φ satisfies (in symbols Φ |=EQL ). Note that EDs can also be used to express denials, as one can have ℎ = ⊥.

To describe our ontology, we rely on Description Logics (DLs) [9].1 A DL ontology is an FO theory ∪ , where (the TBox) is a set of intensional axioms and (the ABox) is a set of facts. Hereinafter, we call CQE instance the triple ℰ = ⟨ , , ⟩, where is a TBox, is a policy and is an ABox. Moreover, we define an optimal GA censor for ℰ as any maximal (w.r.t. set inclusion) set of ground atoms such that ∪ |= and ∪ |=EQL . The next example shows the case of a policy consisting of EDs coupled with a DL ontology.

Example 1. A company establishes that all the salaries of its employees must be kept undisclosed, except for those who hold a managerial position; moreover, it requires that consensual personal relationships between managers and their team members not be publicly revealed. This policy can be formally represented through the following set of EDs: = { ∀, (salary(, ) → manager()),

∃, (managerOf(, ) ∧ consRel(, )) → ⊥ } where manager is a unary predicate indicating that an individual is a manager, and salary, consRel and managerOf are binary predicates modelling, respectively, the salary level of a person, the consensual relationship between two individuals and the relationship where one individual manages another. By employing the existential quantifier, the second ED asserts that, for any manager (or employee), the fact that a consensual relationship exists with one of their employees (or managers) must itself be concealed—not just the two parties’ identities.

In addition, our knowledge about the company is defined by the following ontology: • A TBox = {∃managerOf ⊑ manager, manager ⊑ ∃respDept}, meaning that () if one person manages another, then he or she is a manager; and () every manager is responsible for at least one department. • An ABox = {managerOf(lucy, tom), consRel(lucy, tom), salary(lucy, 150k), salary(tom, 50k)}, which describes a situation in which Lucy manages Tom, they have a consensual relationship, and they receive a salary of $150,000 and $50,000, respectively.

Intuitively, a GA censor certainly does not contain either managerOf(lucy, tom) or consRel(lucy, tom) (in order not to violate the denial) and, in any case, it does not contain the fact salary(tom, 50k) (as Tom is not a manager). Moreover, every optimal GA censor includes both manager(lucy) and salary(lucy, 150k), because the fact that Lucy is a manager can be logically deduced from the ontology and, by revealing this information, knowing also her salary does not violate the policy.

In this setting, our objective is to answer Boolean unions of conjunctive queries (BUCQs) based on a formal entailment semantics that balances information disclosure with policy compliance. In particular, we investigate the problem of checking whether a BUCQ is entailed by the TBox and the intersection of all the optimal GA censors. This task, known as IGA-entailment, consists in checking whether ∪ IGA |= , where IGA is the intersection of all the optimal GA censors of a CQE instance ℰ (in this case, we write ℰ |=IGA ).

The work [12] showed that IGA-entailment is FO-rewritable in the case of DL-Liteℛ ontologies and policy consisting of denials. Formally, this means that the IGA-entailment of a BUCQ can be checked through an algorithm that first rewrites into an FO query that does not depend on the ABox and, in a second moment, evaluates over the ABox. From a theoretical perspective, such a property ensures a nice computational behavior, as its direct implication is that the problem of determining the IGA-entailment of a BUCQ with DL-Liteℛ ontologies and denials is in AC0 in data complexity [13].

1An up-to-date overview of CQE in the context of DLs can be found in [10, 11].

On the practical side, experiments under these conditions were carried out in [14], within the OBDA framework.

We aim to extend this scenario to accommodate policies defined using EDs while preserving the FO-rewritability property. In particular, we focus on the class of full EDs, i.e. EDs whose head contains no existential variable. This class enjoys a desirable property related to security: given a CQE instance ℰ whose policy is made of full EDs, the intersection of all the optimal GA censors for ℰ is still a GA censor for ℰ . We also show that, in general, this property does not hold. We exclude, however, the FO-rewritability of IGA-entailment for this class of dependencies, by providing the following complexity result (which holds even in the case the TBox is empty): Theorem 1. IGA-entailment is coNP-hard in data complexity in the case of full EDs.

We thus identify a suficient condition for full EDs for which IGA-entailment in the case of DL-Liteℛ ontologies remains FO-rewritable. Specifically, we require the policy to be such that the set Σ of TGDs derived (in the natural way) from and from (the inclusion assertions of) is UCQ-rewritable i.e., given any CQ (x), there exists a UCQ ′ such that, for every set ℱ of facts and for every ground substitution of the free variables of , Σ ∪ ℱ |= () if ℱ |= () for some (x) ∈ ′. In this case, we say that is expandable w.r.t. .

Theorem 2. IGA-entailment is FO-rewritable, and thus in AC0 in data complexity, for DL-Liteℛ TBoxes and policies that are full and expandable w.r.t. the coupled TBox.

As a final step, focusing on two categories of EDs satisfying this requirement—namely the acyclic full (where acyclicity condition is defined as in [ 6]) and the linear full EDs—we carried out experiments to assess the practical viability of our rewriting procedure. We developed a tool that transforms a SPARQL BUCQ into an FO query using only the provided TBox and policy, and then executes over an SQL database storing the ABox. As the above theorem pertains to DL-Liteℛ, we employed the OWL 2 QL ontology and the 10 queries for OWL 2 QL provided by the OWL2Bench benchmark [15]. The outcome of our experiments, conducted on a standard laptop with an Intel i7 @1.8 GHz processor and 16GB of RAM, is summarized in Table 1. For every test case and every query, we report the rewriting time () and the evaluation time () expressed in milliseconds, other than the number of returned tuples (#). In the table, ∅, a, b, a-, and b- refer to the empty policy, a full acyclic policy, a full linear policy, and, respectively, their “reduced” versions. Moreover, o2b with ∈ {5, 10} refers to the ontology included in the benchmark containing axioms and ground data about fictitious universities.

We observe that () in most cases the evaluation time is acceptable (of the order of seconds), although the seventh query takes several minutes; () the rewriting time —which is nearly identical for o2b5 and o2b10 as it does not depend on the ABox—is often negligible and never exceeds three seconds; () for both acyclic and binary policies, values for smaller and larger policies are of comparable magnitude; and () binary policies tend to remove more tuples than acyclic ones, likely because EDs with fewer atoms in their body are more easily “activated”.

All the results with full proofs are reported in the extended version of the paper [16].

Acknowledgments

This work was partially supported by: projects FAIR (PE0000013) and SERICS (PE00000014) under the MUR National Recovery and Resilience Plan funded by the EU - NextGenerationEU; the MUR PRIN 2022LA8XBH project Polar (POLicy specificAtion and enfoRcement for privacy-enhanced data management); by the EU under the HORIZON.2.1.5 project dAIbetes (grant id. 101136305); and by projects SEED PNR 2021 and SEED PNR 2022 funded by Sapienza Università di Roma.

Declaration on Generative AI

During the preparation of this work, the authors used GPT-4 in order to: Grammar and spelling check. After using this service, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content. [1] J. Biskup, For unknown secrecies refusal is better than lying, Data and Knowledge Engineering 33 (2000) 1–23. [2] J. Biskup, P. A. Bonatti, Controlled query evaluation for enforcing confidentiality in complete information systems, Int. J. Inf. Sec. 3 (2004) 14–27. [3] P. A. Bonatti, L. Sauro, A confidentiality model for ontologies, in: Proc. of the 12th Int. Semantic

Web Conf. (ISWC), volume 8218 of Lecture Notes in Computer Science, Springer, 2013, pp. 17–32. [4] B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, D. Zheleznyakov, Controlled query evaluation over

OWL 2 RL ontologies, in: Proc. of the 12th Int. Semantic Web Conf. (ISWC), 2013, pp. 49–65. [5] G. Cima, D. Lembo, R. Rosati, D. F. Savo, Controlled query evaluation in description logics through consistent query answering, Artificial Intelligence 334 (2024) 104176. [6] G. Cima, D. Lembo, L. Marconi, R. Rosati, D. F. Savo, Enhancing controlled query evaluation through epistemic policies, in: Proc. of the 33th Int. Joint Conf. on Artificial Intelligence (IJCAI), Int. Joint Conf. on Artificial Intelligence Organization, 2024, pp. 3307–3314. [7] M. Console, M. Lenzerini, Epistemic integrity constraints for ontology-based data management, in:

Proc. of the 34th AAAI Conf. on Artificial Intelligence (AAAI), AAAI Press, 2020, pp. 2790–2797. [8] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, R. Rosati, EQL-Lite: Efective first-order query processing in description logics, in: Proc. of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI), Morgan Kaufmann Publishers Inc., 2007, pp. 274–279. [9] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The Description Logic Handbook: Theory, Implementation and Applications, 2nd ed., Cambridge University Press, 2007. [10] P. A. Bonatti, A false sense of security, Artificial Intelligence 310 (2022). [11] G. Cima, D. Lembo, L. Marconi, R. Rosati, D. F. Savo, A gentle introduction to controlled query evaluation in DL-Lite ontologies, Springer Nature Computer Science 5 (2024) 335. [12] G. Cima, D. Lembo, L. Marconi, R. Rosati, D. F. Savo, Indistinguishability in controlled query evaluation over prioritized description logic ontologies, J. of Web Semantics 84 (2025) 100841. [13] S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison Wesley Publ. Co., 1995. [14] G. Cima, D. Lembo, L. Marconi, R. Rosati, D. F. Savo, Controlled query evaluation in ontology-based data access, in: The Semantic Web - ISWC 2020 - 19th International Semantic Web Conference, Athens, Greece, November 2-6, 2020, Proceedings, Part I, volume 12506 of Lecture Notes in Computer Science, Springer, 2020, pp. 128–146. [15] G. Singh, S. Bhatia, R. Mutharaju, OWL2Bench: A benchmark for OWL 2 reasoners, in: Proc. of the 19th Int. Semantic Web Conf. (ISWC), volume 12507 of Lecture Notes in Computer Science, Springer, 2020, pp. 81–96. [16] L. Marconi, F. Ricci, R. Rosati, CQE under epistemic dependencies: Algorithms and experiments (extended version), 2025. URL: https://arxiv.org/abs/2507.17487.