<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Information Disclosure in Ontology-based Data Access (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lorenzo Marconi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Rosati</string-name>
          <email>rosati@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza Università di Roma</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Bergamo</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Bordeaux</institution>
          ,
          <addr-line>CNRS, Bordeaux INP, LaBRI</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This extended abstract summarizes our recent work [4] on Controlled Query Evaluation over Ontology-based Data Access systems. Controlled Query Evaluation (CQE) is an approach to privacy-preserving query answering that has recently gained attention in the context of ontologies [2,6,8,9,12]. In our work, we consider the more general Ontology-based Data Access (OBDA) framework, where an ontology is coupled to external data sources via a declarative mapping [14,15], and extend OBDA with CQE features. In this new framework, which we call Policy-Protected Ontology-based Data Access (PPOBDA), a data protection policy is specified over the ontology of an OBDA specification in terms of logical statements declaring confidential information that must not be revealed to the users. As an example, consider the following formula (expressed as a denial assertion): ∀x,y. OilComp(x) ∧ IssuesLic(x, y) ∧ Comp(y) → ⊥, which says that the existence of an oil company issuing a license to another company (to operate over its properties) is confidential information. More formally, we define a PPOBDA specification E as a quadruple ⟨T, S, M, P⟩, where: – T is a Description Logic (DL) TBox [1], formalizing intensional domain knowledge;</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology-based Data Access</kwd>
        <kwd>Data Protection</kwd>
        <kwd>First-Order Rewritability</kwd>
        <kwd>Information Disclosure</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>– S is the relational schema at the sources;
– M is the mapping between T and S, i.e., a set of logical assertions defining
the semantic correspondence between the TBox and the source schema;
– P is the data protection policy (i.e., a set of formulas) expressed over T.</p>
      <p>The components T, S, and M are exactly as in OBDA specifications, and, as
in standard OBDA, a user can only ask queries over the TBox T. Query
answering is then filtered through a censor, i.e., a function that alters the answers
to queries in such a way that no data are returned that may lead a malicious
user to infer knowledge declared confidential by the policy, even if the user
accumulates the answers obtained over time. Among all possible censors,
optimal ones are preferred, i.e., those altering query answers in a minimal way.</p>
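      <p>To make the role of the policy concrete, the following minimal sketch checks whether a set of ground atoms violates the denial assertion of the introductory example. It is only an illustration under strong assumptions: the TBox is taken to be empty (so no ontological reasoning is performed), atoms are encoded as Python tuples with variables written as strings starting with "?", and the names <monospace>DENIAL</monospace>, <monospace>FACTS</monospace>, and <monospace>violates</monospace> are hypothetical and do not come from [4].</p>
      <preformat>
```python
from itertools import product

# Body of the denial assertion from the introductory example.
# Terms starting with "?" are variables (an encoding assumed for this sketch).
DENIAL = [("OilComp", "?x"), ("IssuesLic", "?x", "?y"), ("Comp", "?y")]

# A hypothetical set of ground atoms over the ontology signature.
FACTS = {("OilComp", "c1"), ("IssuesLic", "c1", "c2"), ("Comp", "c2")}


def violates(atoms, denial):
    """Return True if the denial body can be matched inside `atoms`.

    Assumption: the TBox is empty, so entailment reduces to finding a
    variable assignment that maps every body atom onto a given ground atom.
    """
    atoms = set(atoms)
    constants = sorted({term for atom in atoms for term in atom[1:]})
    variables = sorted({t for atom in denial for t in atom[1:] if t.startswith("?")})
    # Brute force: try every assignment of constants to variables.
    for values in product(constants, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        grounded = [
            (atom[0],) + tuple(assignment.get(t, t) for t in atom[1:])
            for atom in denial
        ]
        if all(g in atoms for g in grounded):
            return True
    return False


print(violates(FACTS, DENIAL))                     # True
print(violates(FACTS - {("Comp", "c2")}, DENIAL))  # False
```
      </preformat>
      <p>In this toy instance, dropping any one of the three atoms is enough to stop the violation, which hints at why several incomparable ways of censoring the same data may exist.</p>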
      <p>
        Within this framework, we initially consider two different notions of censor,
called censor in CQ and censor in GA, previously defined for CQE over DL
ontologies [
        <xref ref-type="bibr" rid="ref12 ref9">9,12</xref>
        ], which can be naturally extended to PPOBDA. More
precisely, given a PPOBDA specification E = ⟨T, S, M, P⟩, an optimal censor in
CQ (resp., GA) for E is a function that, given as input a database instance
D for the source schema S, returns a maximal subset C of the set of Boolean
Conjunctive Queries (resp., Ground Atoms) inferred by ⟨T, S, M⟩ and D, such
that C ∪ T does not entail information protected by the policy. Since in
general several such maximal sets (incomparable to each other) exist, in both
cases we define query answering under optimal censors in PPOBDA as a form
of skeptical reasoning over all such sets (an answer is returned only if it is
obtained under every optimal censor), in the same spirit as [
        <xref ref-type="bibr" rid="ref12 ref6">12,6</xref>
        ].
      </p>
      <p>
        Our basic idea for solving query answering under censors is to transform a
PPOBDA specification E into a classical OBDA specification J (i.e., without
policies), in such a way that, whatever database D instantiates the source schema
S, query answering under censors in E over D is equivalent to standard query
answering in J over D. In this transformation, we require that J has the same
TBox as E, so that the reduction is transparent to the user, and the same source
schema as E, since, as is typical in OBDA, the data sources to be accessed are
autonomous. We aim at a transformation that is independent of the underlying data,
so that it can be computed at design time. This enables us to use off-the-shelf
OBDA engines, like Mastro<sup>4</sup> [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] or Ontop<sup>5</sup> [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p><sup>4</sup> http://obdasystems.com/mastro
<sup>5</sup> https://ontop-vkg.org/</p>
      <p>
        The problem we study can thus be summarized as follows: Given a
PPOBDA specification E = ⟨T, S, M, P⟩, construct an OBDA specification
J = ⟨T, S, M′⟩ such that, for any database D for S, conjunctive query
answering under optimal censors in E over D is equivalent to standard conjunctive
query answering in J over D. We investigate this problem for the relevant case
in which the TBox is expressed in DL-Lite<sub>R</sub>, the DL underpinning OWL 2 QL [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ],
and the policy is a set of denial assertions, i.e., conjunctive queries for which an
empty answer is imposed for confidentiality reasons (as in our initial example).
Our contributions are as follows:
      </p>
      <p>
        (i) We show that the above problem has in general no solution when censors in
either CQ or GA are considered, whatever DL is adopted for expressing
the TBox.
(ii) To overcome this issue, we propose a further, semantically well-founded
approximate notion of censor, named IGA (Intersection GA) censor, which
intuitively, for a PPOBDA specification E and any database D, returns the
intersection of the sets of ground atoms computed by the optimal censors in
GA for E applied to D.
(iii) We provide an algorithm that solves our problem for every DL-Lite<sub>R</sub>
PPOBDA specification under IGA censors.
(iv) We carried out an experimental evaluation of our approach on (the
approximation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] in DL-Lite<sub>R</sub> of) the OBDA NPD benchmark [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The tests show
that the cost of the transformation performed by our tool is negligible, and that
answering queries in the presence of a policy with our approach does not cause
a significant overhead with respect to the case without a policy.
      </p>
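      <p>The IGA censor of contribution (ii) can be illustrated with a brute-force sketch. It assumes an empty TBox and takes the policy violations as already grounded "conflict" sets, i.e., sets of ground atoms that jointly violate a denial assertion. The names <monospace>optimal_ga_censors</monospace>, <monospace>iga_censor</monospace>, <monospace>ATOMS</monospace>, and <monospace>CONFLICTS</monospace> are hypothetical. This is not the algorithm of contribution (iii), which produces a new OBDA specification at design time; the code below enumerates exponentially many subsets and is meant only to show the definition at work on a toy instance.</p>
      <preformat>
```python
from itertools import combinations


def optimal_ga_censors(atoms, conflicts):
    """Enumerate the maximal subsets of `atoms` containing no conflict set.

    Assumptions of this sketch: the TBox is empty, and each element of
    `conflicts` is a non-empty frozenset of ground atoms that jointly
    violate the policy.
    """
    ordered = sorted(atoms)
    maximal = []
    # Visit candidates from largest to smallest, so that every safe set
    # reached later is either maximal or contained in one already found.
    for size in range(len(ordered), -1, -1):
        for subset in combinations(ordered, size):
            candidate = frozenset(subset)
            if any(conflict.issubset(candidate) for conflict in conflicts):
                continue  # this subset would disclose protected information
            if not any(candidate.issubset(found) for found in maximal):
                maximal.append(candidate)
    return maximal


def iga_censor(atoms, conflicts):
    """Intersect all optimal GA censors found by the enumeration above."""
    return frozenset.intersection(*optimal_ga_censors(atoms, conflicts))


ATOMS = {"OilComp(c1)", "IssuesLic(c1,c2)", "Comp(c2)", "Comp(c3)"}
CONFLICTS = [frozenset({"OilComp(c1)", "IssuesLic(c1,c2)", "Comp(c2)"})]

censors = optimal_ga_censors(ATOMS, CONFLICTS)
print(len(censors))                          # 3
print(sorted(iga_censor(ATOMS, CONFLICTS)))  # ['Comp(c3)']
```
      </preformat>
      <p>Each of the three maximal sets drops a different atom of the conflict, so only the atom outside the conflict survives the intersection. This shows the sense in which the IGA censor is an approximation: it reveals no more than what all optimal censors in GA agree on.</p>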
      <p>
        We are currently working on enriching our CQE framework to improve its
ability to enforce confidentiality. In particular, we are investigating
more expressive forms of policy, which go beyond denial assertions, and the
possibility of expressing preferences that affect the way in which secret information
is obfuscated, as in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Acknowledgements. This work was partly supported by the ANR AI Chair
INTENDED (ANR-19-CHIA-0014), by the EU within the H2020 Programme
under the grant agreement 834228 (ERC Advanced Grant WhiteMec) and the grant
agreement 825333 (MOSAICrOWN), by Regione Lombardia within the Call Hub
Ricerca e Innovazione under the grant agreement 1175328 (WATCHMAN),
by the Italian MUR (Ministero dell'Università e della Ricerca) through the PRIN
project HOPE (prot. 2017MMJJRE), and by Sapienza (project CQEinOBDM).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          , editors.
          <source>The Description Logic Handbook: Theory, Implementation and Applications</source>
          . Cambridge University Press, 2nd edition,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bonatti</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Sauro</surname>
          </string-name>
          .
          <article-title>A confidentiality model for ontologies</article-title>
          .
          <source>In Proc. of the 12th Int. Semantic Web Conf. (ISWC)</source>
          , volume
          <volume>8218</volume>
          of Lecture Notes in Computer Science, pages
          <volume>17</volume>
          –
          <fpage>32</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cogrel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Komla-Ebri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodriguez-Muro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          .
          <article-title>Ontop: Answering SPARQL queries over relational databases</article-title>
          .
          <source>Semantic Web J.</source>
          ,
          <volume>8</volume>
          (
          <issue>3</issue>
          ):
          <volume>471</volume>
          –
          <fpage>487</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>G.</given-names>
            <surname>Cima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marconi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation in Ontology-Based Data Access</article-title>
          .
          <source>In Proc. of the 19th Int. Semantic Web Conf. (ISWC)</source>
          , volume
          <volume>12506</volume>
          , pages
          <fpage>128</fpage>
          –
          <fpage>146</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>G.</given-names>
            <surname>Cima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marconi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation over prioritized ontologies with expressive data protection policies</article-title>
          .
          <source>In Proc. of the 20th Int. Semantic Web Conf. (ISWC)</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>G.</given-names>
            <surname>Cima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation in description logics through instance indistinguishability</article-title>
          .
          <source>In Proc. of the 29th Int. Joint Conf. on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>1791</fpage>
          –
          <fpage>1797</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M.</given-names>
            <surname>Console</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Santarelli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Effective computation of maximal sound approximations of description logic ontologies</article-title>
          .
          <source>In Proc. of the 13th Int. Semantic Web Conf. (ISWC)</source>
          , pages
          <fpage>164</fpage>
          –
          <fpage>179</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Zheleznyakov</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation over OWL 2 RL ontologies</article-title>
          .
          <source>In Proc. of the 12th Int. Semantic Web Conf. (ISWC)</source>
          , pages
          <fpage>49</fpage>
          –
          <fpage>65</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Zheleznyakov</surname>
          </string-name>
          .
          <article-title>Controlled query evaluation for datalog and OWL 2 profile ontologies</article-title>
          .
          <source>In Proc. of the 24th Int. Joint Conf. on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>2883</fpage>
          –
          <fpage>2889</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>G.</given-names>
            <surname>De Giacomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruzzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>MASTRO: A reasoner for effective Ontology-Based Data Access</article-title>
          .
          <source>In Proc. of the 1st Int. Workshop on OWL Reasoner Evaluation (ORE)</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          .
          <article-title>The NPD benchmark: Reality check for OBDA systems</article-title>
          .
          <source>In Proc. of the 18th Int. Conf. on Extending Database Technology (EDBT)</source>
          , pages
          <fpage>617</fpage>
          –
          <fpage>628</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Revisiting controlled query evaluation in description logics</article-title>
          .
          <source>In Proc. of the 28th Int. Joint Conf. on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>1786</fpage>
          –
          <fpage>1792</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fokoue</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>OWL 2 Web Ontology Language profiles (second edition)</article-title>
          .
          <source>W3C Recommendation</source>
          , World Wide Web Consortium, Dec.
          <year>2012</year>
          . Available at http://www.w3.org/TR/owl2-profiles/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Giacomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Linking data to ontologies</article-title>
          .
          <source>J. on Data Semantics</source>
          , X:
          <volume>133</volume>
          –
          <fpage>173</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          .
          <article-title>Ontology-based data access: A survey</article-title>
          .
          <source>In Proc. of the 27th Int. Joint Conf. on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>5511</fpage>
          –
          <fpage>5519</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>