=Paper= {{Paper |id=Vol-1936/paper-10 |storemode=property |title=Mapping Repair in Ontology-based Data Access Evolving Systems (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-1936/paper-10.pdf |volume=Vol-1936 |authors=Domenico Lembo,Riccardo Rosati,Valerio Santarelli,Domenico Fabio Savo,Evgenij Thorstensen |dblpUrl=https://dblp.org/rec/conf/semweb/LemboRSST17 }} ==Mapping Repair in Ontology-based Data Access Evolving Systems (Extended Abstract)== https://ceur-ws.org/Vol-1936/paper-10.pdf
      Mapping Repair in Ontology-based Data Access
         Evolving Systems (Extended Abstract)

               Domenico Lembo1 , Riccardo Rosati1 , Valerio Santarelli1 ,
                   Domenico Fabio Savo1 , Evgenij Thorstensen2
       1                                                2
         Sapienza Università di Roma                     University of Oslo
     lastname@dis.uniroma1.it                         evgenit@ifi.uio.no

Introduction. Ontology-based data access (OBDA) is the problem of accessing source
databases through the mediation of a conceptual domain view, given in terms of an
ontology [9]. An OBDA specification is constituted by an ontology, usually a Description
Logic (DL) TBox [2], a schema of the source databases, and a mapping specifying the
relationship between the source data and the elements of the ontology, which is com-
monly given as a set of assertions, each one associating a query over the source schema
with a query over the ontology. In the following, we denote an OBDA specification as
J = hT , S, Mi, where T is the TBox, S is the source schema, and M is the mapping.
    A major issue in OBDA concerns the design and management of a specification, and
in particular of a mapping [6]. Mapping design is indeed a time-consuming and complex
operation, which in general cannot be totally automatized. Of course, modifying the
mapping due to changes in the other components of the specification may result in a
time-consuming process as well. We experienced this in various industrial and academic
projects. Among them, we mention a collaboration with the Italian Ministry of Economy
and Finances, where we had to map a domain ontology with a source database completely
independent from it, which caused mapping definition to be particularly complex [1],
and the use cases of the EU project Optique, focused on OBDA for Big Data [4], which
were particularly challenging with respect to mapping design and analysis.
    In this paper, we study the evolution of OBDA specifications, and focus on the typical
case in which the ontology and/or the source schema are updated and the mapping needs
to be in turn modified to restore system consistency. Our approach is based on the
idea of repairing the mapping according to the usual principle of minimal change and
on a recent, mapping-based notion of consistency of the specification. We define and
analyze two notions of mapping repair, called deletion-based and entailment-based
repair, respectively. Whereas the former notion adopts the simple idea of looking only
at consistent subsets of the original mapping, the latter aims at preserving as much as
possible of the mapping assertions that are entailed by the initial specification and do
not contradict the update. We then present a set of initial results on the complexity of
query answering in OBDA under ontology update in both the repair semantics.
    We observe that many approaches exist for both ontology evolution [11] and database
schema evolution [10]. However, to the best of our knowledge, no previous study has
analyzed evolution in the presence of a mapping connecting an ontology to a database
schema. In this sense, a problem that is close to OBDA is ontology matching and
alignment, which is based on the use of a notion of mapping to integrate different
ontologies. Several works have studied the problem of repairing inconsistent mappings
in this context (e.g., [5, 8]), but the framework considered is very different from OBDA.
    The present paper is an extended abstract of [7].

OBDA specifications. A source schema S is a relational schema, possibly equipped
with integrity constraints (ICs). A legal instance for S is a database that satisfies the ICs
in S. We say that a source schema is simple if it has no ICs.
    A mapping assertion m from a source schema S to a TBox T has the form φ(x) ;
ψ(x), where φ(x) and ψ(x) are queries over S and T , respectively, both with free
variables x. A mapping M from S to T is a possibly infinite set of mapping assertions
from S to T . We consider the notable cases in which φ(x) is a CQ, and ψ(x) is either
a CQ without constants (GLAV mapping language), or a single-atom query without
constants and existential variables (GAV mapping language).
    An OBDA specification is a triple J = hT , S, Mi, where T is a TBox, S is a source
schema, and M is a mapping between the two. The semantics of J is given with respect
to a legal instance D for S: a model for J w.r.t. D is a first-order interpretation I that
satisfies T and satisfies M w.r.t. D, i.e., if for each assertion φ(x) ; ψ(x) in M and
each tuple of constants t such that t is in the evaluation of φ(x) over D, we have that
I |= ψ(t). If J has no models w.r.t. D, we say that (J , D) is inconsistent. Also, we
denote denote with (J , D) |= α the entailment of a sentence α by (J , D).

DMR and EMR repairs. We adopt a mapping-centered notion of OBDA evolution:
given an OBDA specification J = hT , S, Mi, we want to repair the mapping M
given a modification of the TBox T and/or of the source schema S. This is a natural
assumption: indeed, the mapping is an information that depends on both the TBox and
the source schema, while the TBox and the schema are (at least in principle) semantically
independent entities (since the data sources are autonomous systems).
     Following the classical approaches to belief revision, we want to find a notion
of repair of a mapping that is based on two general principles: (i) it should preserve
consistency of the OBDA specification; (ii) it should express minimal change with
respect to the initial OBDA specification. With respect to consistency preservation,
we adopt a non-classical notion of inconsistency for an OBDA specification, called
global mapping inconsistency, recently introduced in [6]. According to this notion, a
mapping M is globally inconsistent for hT , Si, with T a TBox and S a source schema,
if there exists no instance D for S that activates all the mapping assertions in M and
such that (hT , S, Mi, D) is consistent. We say that D activates a mapping assertion
φ(x) ; ψ(x) if φ(x) has a non-empty answer on D. In the context of OBDA, global
mapping inconsistency provides a more meaningful notion of inconsistency than the
classical one (which considers all possible source instances): for example, in all the cases
when S is a relational schema with standard ICs, the OBDA specification is inconsistent
according to the classical semantics iff its TBox is inconsistent.
     With respect to minimal change, we propose two different notions of repair. The first
one, called deletion-based mapping repair (DMR), reflects the simple idea of repairing a
mapping through a (subset-)minimal deletion of assertions from the initial mapping.
     Let J = hT , S, Mi be an OBDA specification such that M is globally consistent
for hT , Si, T 0 a consistent TBox, S 0 a consistent source schema, and M0 a mapping
such that M0 ⊆ M. We say that M0 is a DMR for J under update hT 0 , S 0 i if: (i) M0
is globally consistent for hT 0 , S 0 i; and (ii) there exists no mapping M00 ⊆ M such that
M00 is globally consistent for hT 0 , S 0 i, and M00 ⊃ M0 . In other words, a DMR is a
maximal subset of M that is globally consistent for the new TBox and source schema.
     It is easy to see that a DMR for J under update hT 0 , S 0 i always exists, and that if
M is finite, also M0 is finite. Furthermore, if M is globally consistent for hT 0 , S 0 i, then
M is the only DMR for J . In general, however, several repairs exist.
     We are now able to introduce our DMR-based update operator, denoted with •, and
define the OBDA specifications computed by such operator. Given J under update
hT 0 , S 0 i, the set of such specifications, denoted by J • hT 0 , S 0 i, is
              {hT 0 , S 0 , M0 i | M0 is a DMR for J under update hT 0 , S 0 i}.
    Note that the above notions are actually independent of the initial TBox T and
schema S. Therefore, in the following we will simply call M0 a DMR for M under
update hT 0 , S 0 i, and will denote the set J • hT 0 , S 0 i as M • hT 0 , S 0 i.
    The notion of query entailment in the DMR-based update framework is as follows.
Let M be a mapping, T 0 a consistent TBox, S 0 a consistent source schema, D a legal
instance for S 0 , and q a Boolean conjunctive query (BCQ). We say that q is entailed
under DMR by M , T 0 , S 0 , and D, denoted as (M • hT 0 , S 0 i, D) |= q, if (J 0 , D) |= q
for every J 0 ∈ M • hT 0 , S 0 i.
    The second notion of repair, called entailment-based mapping repair (EMR), relies
on the mapping entailment set (MES). The MES of an OBDA specification J for a
mapping language L, denoted MES L (J ), is a set of mapping assertions in L that are
logical consequences of J , and such that their body coincides with the body of an
assertion in the original mapping. Then, the repairs are globally consistent mappings that
allow for preserving as much as possible of the initial MES, according to a minimality
criterion that formalizes the intuitive principle of preferring insertions over deletions.
Formally, M0 is an EMR for J under update (T 0 , S 0 ) if: (i) M0 is globally consistent
for hT 0 , S 0 i; and (ii) there exists no L-mapping M00 such that M00 is globally consistent
for hT 0 , S 0 i and MES L (hT 0 , S 0 , M00 i) has fewer changes than MES L (hT 0 , S 0 , M0 i)
with respect to MES L (J ). We say that a mapping M1 has fewer changes than a
mapping M2 with respect to a mapping M if M \ M1 ⊂ M \ M2 (fewer deletions) or
M \ M1 = M \ M2 and M1 \ M ⊂ M2 \ M (same deletions and fewer insertions).
    Similarly to DMR, we introduce an operator, denoted ◦L , to update an OBDA
specification through EMR, and a notion of query entailment for L-EMRs.
    We finally note that, differently from DMRs, a finite L-EMR does not always exist,
even if M is finite, thus the study of this case is particularly challenging.
Complexity of CQ entailment under DMR and EMR semantics. We focus on
TBoxes specified in DL-LiteR [3], the logical counterpart of the W3C standard OWL 2
QL, which is a prominent DL in OBDA, and consider both GAV and GLAV mappings.
In these settings we provide the following complexity results.
Theorem 1. Let T 0 be a DL-LiteR TBox, S 0 a simple source schema, M a finite GLAV
mapping, q a BCQ over T 0 , and D a legal instance for S 0 . Deciding (M•hT 0 , S 0 i, D) |=
q is Π2p -complete in combined complexity and in AC0 in data complexity.
    As for the EMR case, we first study the setting with GAV mappings and obtain a
result analogous to Theorem 1. Roughly speaking, for this setting we are able to reduce
CQ entailment under EMRs to CQ entailment under DMRs through a construction that
is polynomial in the size of T and M and independent of D.
Theorem 2. Let J = hT , S, Mi be an OBDA specification, where T is a DL-LiteR
TBox, S is a simple source schema, and M is a finite GAV mapping that is globally
consistent for hT , Si. Let T 0 be a DL-LiteR TBox, S 0 a simple source schema, D a
legal instance for S 0 , and q a BCQ over T 0 . Deciding (J ◦GAV hT 0 , S 0 i, D) |= q is
Π2p -complete in combined complexity and in AC0 in data complexity.
    For GLAV, we give a different reduction of CQ entailment under EMR to CQ
entailment under DMR, which allows us to establish the data complexity of the problem,
under a condition on the interaction between the negative role inclusions (i.e., disjointness
role axioms) of the updated TBox T 0 (denoted NR(T 0 )) and the initial specification.
Theorem 3. Let J = hT , S, Mi be an OBDA specification, where T is a DL-LiteR
TBox, S is a simple source schema, and M is a finite GLAV mapping. Let T 0 be a
DL-LiteR TBox such that M is globally consistent for hT ∪ NR(T 0 ), Si, S 0 a simple
source schema, D a legal instance for S 0 , and q a BCQ over T 0 . Deciding (J ◦GLAV
hT 0 , S 0 i, D) |= q is in AC0 in data complexity.

References
 1. N. Antonioli, F. Castanò, S. Coletta, S. Grossi, D. Lembo, M. Lenzerini, A. Poggi, E. Virardi,
    and P. Castracane. Ontology-based data management for the italian public debt. In Proc. of
    FOIS, pages 372–385, 2014.
 2. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The De-
    scription Logic Handbook: Theory, Implementation and Applications. Cambridge University
    Press, 2nd edition, 2007.
 3. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning
    and efficient query answering in description logics: The DL-Lite family. JAR, 39(3):385–429,
    2007.
 4. M. Giese, A. Soylu, G. Vega-Gorgojo, A. Waaler, P. Haase, E. Jiménez-Ruiz, D. Lanti,
    M. Rezk, G. Xiao, Ö. L. Özçep, and R. Rosati. Optique: Zooming in on big data. IEEE
    Computer, 48(3):60–67, 2015.
 5. E. Jiménez-Ruiz, C. Meilicke, B. C. Grau, and I. Horrocks. Evaluating mapping repair systems
    with large biomedical ontologies. In Proc. of DL, volume 1014 of CEUR, ceur-ws.org,
    pages 246–257, 2013.
 6. D. Lembo, J. Mora, R. Rosati, D. F. Savo, and E. Thorstensen. Mapping analysis in ontology-
    based data access: Algorithms and complexity. In Proc. of ISWC, pages 217–234, 2015.
 7. D. Lembo, R. Rosati, V. Santarelli, D. Fabio Savo, and E. Thorstensen. Mapping repair in
    ontology-based data access evolving systems. In Proc. of IJCAI, pages 1160–1166, 2017.
 8. C. Meilicke, H. Stuckenschmidt, and A. Tamilin. Reasoning support for mapping revision.
    JLC, 19(5):807–829, 2009.
 9. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. Linking
    data to ontologies. J. on Data Semantics, X:133–173, 2008.
10. E. Rahm and P. A. Bernstein. An online bibliography on schema evolution. SIGMOD Record,
    35(4):30–31, 2006.
11. F. Zablith, G. Antoniou, M. d’Aquin, G. Flouris, H. Kondylakis, E. Motta, D. Plexousakis,
    and M. Sabou. Ontology evolution: a process-centric survey. Knowledge Eng. Review,
    30(1):45–75, 2015.