=Paper=
{{Paper
|id=Vol-3739/abstract-10
|storemode=property
|title=Controlled Query Evaluation in DL-Lite through Epistemic Protection Policies (Extended Abstract)
|pdfUrl=https://ceur-ws.org/Vol-3739/abstract-10.pdf
|volume=Vol-3739
|authors=Gianluca Cima,Domenico Lembo,Lorenzo Marconi,Riccardo Rosati,Domenico Fabio Savo
|dblpUrl=https://dblp.org/rec/conf/dlog/CimaL00S24
}}
==Controlled Query Evaluation in DL-Lite through Epistemic Protection Policies (Extended Abstract)==
Controlled Query Evaluation in DL-Lite through Epistemic Protection Policies (Extended Abstract) Gianluca Cima1 , Domenico Lembo1 , Lorenzo Marconi1 , Riccardo Rosati1 and Domenico Fabio Savo2 1 Sapienza UniversitΓ di Roma 2 UniversitΓ degli Studi di Bergamo Abstract This extended abstract summarizes our work recently accepted to the 33rd International Joint Conference on Artificial Intelligence, in which we present a new language for data protection policies in the Controlled Query Evaluation framework, based on the use of an epistemic operator. Keywords Description Logics, Data Protection, First-Order Rewritability, Epistemic Modal Logic Controlled Query Evaluation (CQE) is an approach to safeguard sensitive information by filtering query responses to prevent users from inferring data declared confidential by a data protection policy [1, 2, 3, 4]. A critical aspect of CQE revolves around the expressiveness of the policy language, which allows to determine the information that must remain undisclosed. Previous literature has mainly considered only policies consisting of sentences, i.e. closed logical formulas, as in [5, 2, 3, 6]. Through this approach, it is only possible to impose that the truth value of a sentence in the policy cannot be inferred by asking queries to the system. For example, in [6] policy state- ments take the form βπ₯ β ) β β₯), referred to as denial. Enforcing one such denial over the data means β (ππ(π₯ refraining from disclosing the inference of the Boolean conjunctive query (BCQ) βπ₯ β ) by the system, β ππ(π₯ even if the inference has occurred. For instance, the rule πΏ1 = βπ₯, π¦(Patient(π₯) β§ admitted(π₯, π¦) β β₯) aims to keep undisclosed the fact that a patient is admitted to a hospital department. One may argue that the above dependency imposes an excessively stringent level of protection, since it denies the presence of any patient in the hospital to protect their identity, which is implausible in practice. A more effective approach could require the system not to answer the open query π(π₯) : βπ¦(Patient(π₯)β§admitted(π₯, π¦)). Intuitively, a rule of this kind expresses that the set of patients that the system knows to be admitted to the hospital must not be disclosed. To properly capture this behaviour, we propose to use an epistemic operator πΎ in policy formulas, in the spirit of [7, 8]. This allows us to formalize the epistemic state of the user, i.e., which information can be safely undisclosed for her. In our example, by expressing a rule like (οΈ )οΈ πΏ2 = βπ₯ πΎ βπ¦(Patient(π₯) β§ admitted(π₯, π¦)) β πΎβ₯ one could still allow the end user to infer that some patient has been admitted, while protecting the identities of all such patients. In fact, our proposal enables us to accomplish more than this. In a more advanced scenario, concealing the relationship between a patient and a hospital department should apply only if the patient has not signed a consensus form. This can be expressed by the following formula: (οΈ )οΈ πΏ3 = βπ₯ πΎ βπ¦(Patient(π₯) β§ admitted(π₯, π¦)) β πΎConsent(π₯) DL 2024: 37th International Workshop on Description Logics, June 18β21, 2024, Bergen, Norway $ cima@diag.uniroma1.it (G. Cima); lembo@diag.uniroma1.it (D. Lembo); marconi@diag.uniroma1.it (L. Marconi); rosati@diag.uniroma1.it (R. Rosati); domenicofabio.savo@unibg.it (D. F. Savo) 0000-0003-1783-5605 (G. Cima); 0000-0002-0628-242X (D. Lembo); 0000-0001-9633-8476 (L. Marconi); 0000-0002-7697-4958 (R. Rosati); 0000-0002-8391-8049 (D. F. Savo) Β© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings In essence, this formula stipulates that if a user knows a patient has been admitted, she must also know the patient has signed a consent form. Consequently, if a patient has not signed a consent form, the system cannot disclose her admission status. Hereafter, we employ standard notions of function-free first-order (FO) logics with unary and binary predicates and, besides the alphabets of predicate names, constants and variables, we assume the existence of an infinite set of symbols called labeled nulls. As customary when dealing with epistemic operators, we adopt the standard name assumption [9]. Moreover, we write eval(π, β) to indicate the evaluation of an FO sentence π over an FO interpretation β. A Description Logic ontology is a FO theory π― βͺ π, where π― is a TBox and π is a quantified ABox [10] (i.e. a set of atomic FO formulas predicating over constants and labeled nulls), and all the predicates occurring in π also occur in π― . As for user queries, we focus on BCQs and on Boolean Union of Conjunctive Queries (BUCQs). Formulas as πΏ2 and πΏ3 belong to the language of epistemic dependencies (EDs) which, in general, take the following form: βπ₯ β 1 , βπ₯2 ) β πΎπβ (π₯ β 1 , βπ₯2 (πΎππ (π₯ β 2 )) (1) where ππ (π₯ β 1 , βπ₯2 ) is a CQ with free variables βπ₯1 βͺ βπ₯2 , πβ (π₯ β 2 ) is a CQ with free variables βπ₯2 , and πΎ is a modal operator. EDs have been originally introduced in [8] to express integrity constraints in ontology-based data management and are indeed a special case of domain-independent EQL-Lite(CQ) sentences [9]. A policy consisting of EDs is called epistemic policy. In the present paper, we use EDs as policy rules that must be satisfied to preserve data confidentiality over Descprition Logic ontologies, similarly as integrity constraints must be satisfied to guarantee data consistency. However, our aim is totally different from that of [8], whose focus is on consistency checking. Intuitively, an ED of form (1) should be read as follows: if the sentence ππ (π β 1 , βπ2 ) is known to hold, then the sentence πβ (π β 2 ) is known to hold, for any sequences of constants βπ1 and βπ2 of the same length of βπ₯1 and βπ₯2 , respectively. More formally, we say that a FO theory Ξ¦ satisfies an ED πΏ (denoted Ξ¦ |=EQL πΏ) if, for every sequences of constants βπ1 and βπ2 of the same length of βπ₯1 and βπ₯2 , respectively, the fact that eval(ππ (πβ 1 , βπ2 ), β) is true for every β β β³ implies that eval(πβ (π β 2 ), β) is true for every β β β³, where β³ is the set of all FO models of Ξ¦. We say that Ξ¦ satisfies a policy π« (denoted Ξ¦ |=EQL π«) if Ξ¦ satisfies every πΏ β π«. Now, to completely characterize our CQE framework, we need to specify its semantics. This issue is addressed in CQE through censors. Here, we employ CQ-censors, introduced in [6]. To this aim, we first define the notion of CQE instance as a triple β° = β¨π― , π, π«β©, where π― is a TBox, π is a quantified ABox such that π― βͺ π has at least one model, and π« is an epistemic policy such that π― |=EQL π«. The notion of (optimal) CQ-censors is then as follows. Definition 1 (CQ-censor). A CQ-censor of a CQE instance β° = β¨π― , π, π«β© is a set π of BCQs such that (π) π― βͺ π |=EQL π« and (ππ) π― βͺ π |= π for every π β π. Moreover, π is an optimal CQ-censor of β° if there does not exist any CQ-censor π β² of β° such that π β² β π. As in [6], we define CQE as the problem of checking whether an FO sentence is entailed by all optimal censors. We call this problem SC-entailment (Skeptical entailment under CQ-censors). This form of CQE does not suffer from the problem of having to arbitrarily select an optimal censor among several incomparable ones [3, 11]. We also consider a sound approximation of SC-entailment, that we call IC-entailment, consisting in checking whether an FO sentence is entailed by the intersection of all optimal censors. We use the notation β° |=SC π (resp., β° |=IC π) for indicating that an FO sentence π is SC-entailed (resp., IC-entailed) by a CQE instance β°. Example 1. Suppose that company Cπ΄ wants to share certain user-profiling data with a company Cπ΅ for targeted advertising. This is not allowed in general, but only in some countries with a special regulation that enables sharing based on the usersβ consent. Cπ΄ may use the following ED in the policy to enable Cπ΅ to access only data compliant with the above requirements: (οΈ )οΈ πΏ4 = βπ₯, π¦ πΎprofileActivity(π₯, π¦) β πΎ βπ§(citOf(π₯, π§) β§ SR(π§) β§ Consent(π₯)) . In the rule, profileActivity links a user with her profiling-data, citOf relates a user to her country of citizenship, SR denotes countries with special regulation, and Consent denotes users who have given their consent. Suppose that Cπ΄ also wants Cπ΅ not to be able to associate to a user with her real identity, and that this is possible by collecting the personβs name and her date of birth at the same time. To this aim, Cπ΄ also specifies the ED: (οΈ )οΈ πΏ5 = βπ₯, π¦, π§ πΎ(name(π₯, π¦) β§ dateB(π₯, π§)) β πΎβ₯ . Now, let β° = β¨π― , π, π«β© be a CQE instance, where π― = β , π« = {πΏ4 , πΏ5 }, and π is as follows: π = { profileActivity(u1 , a1 ), Consent(u1 ), citOf(u1 , π1 ), SR(π1 ), name(u1 , ann), dateB(u1 , date1 ), profileActivity(u2 , a2 ), citOf(u2 , π1 ) }, where π1 is a labeled null and all the other terms used in π are constants. Let us also consider the following four BCQs: π1 = βπ¦ (profileActivity(u1 , a1 ) β§ citOf(u1 , π¦) β§ SR(π¦)), π2 = profileActivity(u2 , a2 ), π3 = βπ¦ profileActivity(π¦, a2 ), π4 = profileActivity(u1 , a1 ) β§ name(u1 , ann). For π β {SC, IC}, one can verify that β° |=π π1 since user u1 gave the consent and she is a citizen of some country (π1 ) with special regulation. Conversely, β° ΜΈ|=π π2 since u2 did not give the consent. Nevertheless, one can verify that π3 belongs to each optimal CQ-censor of β°, and so β° |=π π3 . Finally, we have that β° ΜΈ|=π π4 since there is an optimal CQ-censor π of β° such that dateB(u1 , date1 ) β π, implying that name(u1 , ann) ΜΈβ π (otherwise πΏ5 would be violated). We focus on TBoxes specified in DL-Liteβ [7], for which SC-entailment of BCQs is known to be tractable in data complexity when policies are expressed as denials [6]. Unfortunately, when the policy consists of epistemic dependencies, such a problem turns out to be intractable in general, and the same holds for IC-entailment. Theorem 1. In the case of CQE instances with DL-Liteβ TBoxes, both SC-entailment and IC-entailment of BUCQs are coNP-complete in data complexity. We remark that, in the above findings, the hardness for coNP holds already when the TBox is empty and the query is a ground atom. To regain tractability, we then impose an acyclicity condition on the policy π« w.r.t. a DL-Liteβ TBox π― , defined as follows. Definition 2 (Acyclicity). We call dependency graph of (π― , π«) the direct graph whose nodes are the predicates occurring in π― βͺ π«, and whose edges (π1 , π2 ) correspond to the presence of π1 in the left-hand side and π2 in right-hand side of either an ED in π« (we call such edges P-edge) or a positive inclusion axiom in π― . We say that π« is acyclic for π― if there exists no cycle involving a P-edge in the dependency graph of (π― , π«). For this class of epistemic policies, we are able to provide first-order rewriting algorithms for deciding if a BUCQ is SC- or IC-entailed by a CQE instance, thus proving the following result. Theorem 2. In the case of CQE instances with DL-Liteβ TBoxes π― and epistemic policies π« acyclic for π― , both SC-entailment and IC-entailment of BUCQs are in AC0 in data complexity. Besides the computational complexity study, we also carry out an analysis of the robustness of the above-defined entailment semantics w.r.t. confidentiality-preservation. In [12, 3, 13], it is shown that censoring mechanisms based on an indistinguishability criterion are indeed more secure than others. In this regard, we adopt a similar approach to the one described in [12] for relational databases. Intuitively, under such an approach an entailment semantics preserves confidentiality if, for every CQE instance β¨π― , π, π«β© and every finite set π¬ of queries, the answers to such queries are the same as if they were obtained from a (possibly different) CQE instance β¨π― , πβ² , π«β© such that π― βͺ πβ² |=EQL π«. We here describe this property formally. First, for a TBox π― , a policy π«, two ABoxes π and πβ² , a set π¬ of BUCQs, and π β {SC, IC}, we say that π and πβ² are π¬-indistinguishable for π-entailment w.r.t. (π― , π«) if, for every π β π¬, β¨π― , π, π«β© |=π π iff β¨π― , πβ² , π«β© |=π π. Definition 3 (Confidentiality preservation). For π β {SC, IC}, we say that π-entailment preserves confidentiality for BCQs (resp., BUCQs) if, for every CQE instance β¨π― , π, π«β©, and for every finite set π¬ of BCQs (resp., BUCQs), there exists an ABox πβ² such that the following holds: (π) π― βͺ πβ² |=EQL π«; (ππ) π and πβ² are π¬-indistinguishable for X-entailment with respect to (π― , π«). In the case of DL-Liteβ ontologies, we can show that SC-entailment preserves confidentiality for BCQs, but this result does not carry over to BUCQs. We however prove that the property is enjoyed even for BUCQs in the case of IC-entailment. Acknowledgments This work was partially supported by: projects FAIR (PE0000013) and SERICS (PE00000014) under the MUR National Recovery and Resilience Plan funded by the EU - NextGenerationEU; GLACIATION project funded by the EU (N. 101070141); ANTHEM (AdvaNced Technologies for Human-centrEd Medicine) project (CUP B53C22006700001) funded by the National Plan for NRRP Complementary Investments; the MUR PRIN 2022LA8XBH project Polar (POLicy specificAtion and enfoRcement for privacy-enhanced data management); and by the EU under the H2020-EU.2.1.1 project TAILOR (grant id. 952215). References [1] J. Biskup, For unknown secrecies refusal is better than lying, Data and Knowledge Engineering 33 (2000) 1β23. [2] J. Biskup, P. A. Bonatti, Controlled query evaluation for enforcing confidentiality in complete information systems, Int. J. Inf. Sec. 3 (2004) 14β27. [3] P. A. Bonatti, L. Sauro, A confidentiality model for ontologies, in: Proc. of the 12th Int. Semantic Web Conf. (ISWC), 2013, pp. 17β32. [4] B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, D. Zheleznyakov, Controlled query evaluation over OWL 2 RL ontologies, in: Proc. of the 12th Int. Semantic Web Conf. (ISWC), 2013, pp. 49β65. [5] G. L. Sicherman, W. de Jonge, R. P. van de Riet, Answering queries without revealing secrets, ACM Trans. on Database Systems 8 (1983) 41β59. [6] D. Lembo, R. Rosati, D. F. Savo, Revisiting controlled query evaluation in description logics, in: Proc. of the 28th Int. Joint Conf. on Artificial Intelligence (IJCAI), 2019, pp. 1786β1792. [7] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, R. Rosati, Tractable reasoning and efficient query answering in description logics: The DL-Lite family, J. of Automated Reasoning 39 (2007) 385β429. [8] M. Console, M. Lenzerini, Epistemic integrity constraints for ontology-based data management, in: Proc. of the 37th AAAI Conf. on Artificial Intelligence (AAAI), AAAI Press, 2020, pp. 2790β2797. URL: https://doi.org/10.1609/aaai.v34i03.5667. doi:10.1609/AAAI.V34I03.5667. [9] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, R. Rosati, EQL-Lite: Effective first-order query processing in description logics, in: Proc. of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI), 2007, pp. 274β279. [10] F. Baader, F. Kriegel, A. Nuradiansyah, R. PeΓ±aloza, Computing compliant anonymisations of quantified ABoxes w.r.t. β°β policies, in: Proc. of the 19th Int. Semantic Web Conf. (ISWC), volume 12506 of Lecture Notes in Computer Science, Springer, 2020, pp. 3β20. [11] B. Cuenca Grau, E. Kharlamov, E. V. Kostylev, D. Zheleznyakov, Controlled query evaluation for datalog and OWL 2 profile ontologies, in: Proc. of the 24th Int. Joint Conf. on Artificial Intelligence (IJCAI), 2015, pp. 2883β2889. [12] J. Biskup, T. Weibert, Keeping secrets in incomplete databases, Int. J. Inf. Sec. 7 (2008) 199β217. [13] P. A. Bonatti, A false sense of security, Artificial Intelligence 310 (2022).