Approaching OBDA Evolution through Mapping Repairs? Domenico Lembo1 , Riccardo Rosati1 , Valerio Santarelli1 , Domenico Fabio Savo1 , Evgenij Thorstensen2 1 2 Sapienza Università di Roma University of Oslo lastname@dis.uniroma1.it evgenit@ifi.uio.no Ontology-based data access (OBDA) is a new paradigm for accessing source databases through mediation of a conceptual domain view, given in terms of an ontol- ogy [9]. A major issue in OBDA is the design of an OBDA specification and the man- agement of its evolution. An OBDA specification is constituted by an ontology, usually a Description Logic (DL) TBox, a schema of the source databases, and a declarative mapping specifying the semantic relationship between the data at the sources and the elements of the ontology. In the following we denote it by J = hT , S, Mi, where T is the TBox (a set of DL axioms), S the source schema, i.e., a relational signature possibly with integrity constraints (ICs), and the mapping M is a set of mapping assertions of the form φ(x) ; ψ(x), where φ(x) and ψ(x) are queries over S and T , respectively, both with free variables x (such an assertion is called a GLAV mapping assertion if both φ(x) and ψ(x) are conjunctive queries (CQs), while it is a GAV mapping assertion if it is GLAV and ψ(x) is an atom without occurrences of non-free variables). The design of the specification is normally conducted in an iterative fashion, and changes are continuously implemented to its various components. Also, the entire spec- ification is often a lively artifact, continuously modified due to, e.g., changes in the re- quirements. Due to these characteristics, OBDA design and maintenance must be sup- ported by adequate tools and methodologies. The mapping is certainly the component of the specification which has received so far less attention, and thus consolidated tools supporting its design are currently not available. Mapping design is a time-consuming and complex operation, which typically (and especially in complex scenarios) has to be conducted manually [1]. Of course, modifying the mapping due to changes in the other components of the specification is tedious and time-consuming as well. In this paper we study the evolution of OBDA specifications. We start our investi- gation by observing that many approaches exist for both ontology evolution [12] and database schema evolution [11]. However, to the best of our knowledge, no previous study has analyzed evolution in the presence of mappings connecting an ontology to a database schema. In this sense, a problem that is close to OBDA is ontology matching and alignment, which is based on the use of a notion of mapping to integrate different ontologies. Several works have studied the problem of repairing inconsistent mappings in this context (e.g., [3, 7, 8, 10]). However, the framework of ontology matching, and in particular the notion of mapping, is very different from OBDA. We adopt a mapping-centered notion of OBDA evolution: given an OBDA specifi- cation J = hT , S, Mi, we want to repair the mapping M given a modification of the ? The present paper is an extended abstract of [6]. TBox T and/or of the source schema S. We think that, at least for a first analysis of evolution in OBDA, this is a natural assumption: indeed, the mapping is an information that depends on both the TBox and the source schema, while the TBox and the schema are (at least in principle) semantically independent entities. Following the classical approaches to belief revision, we look for a notion of repair of a mapping that is based on two general principles: (i) preserving consistency of the OBDA specification; (ii) expressing minimal change with respect to the initial OBDA specification. With respect to consistency preservation, we adopt a non-classical notion of inconsistency for an OBDA specification, called global mapping inconsistency, re- cently introduced in [4, 5]. According to this notion, a mapping M is inconsistent with respect to a TBox T and a source schema S if there exists no instance D for S such that D is consistent with J and M is active on D, i.e., every query over the source schema appearing in M has a non-empty answer in D. Definition 1 (Global mapping inconsistency [5]). Let J = hT , S, Mi be an OBDA specification. We say that M is globally inconsistent for hT , Si if there does not exist a source instance D satisfying the ICs in S such that (i) M is active on D; and (ii) there exists an interpretation that agrees with D on the predicates of S and satisfies J . Global mapping inconsistency provides a more meaningful notion of inconsistency than the classical one in the context of OBDA: for instance, in all the cases when the source schema is a relational database schema with standard integrity constraints, the OBDA specification is inconsistent according to the classical semantics if and only if its TBox is inconsistent (which in turn implies that this notion is trivial for many DLs). With respect to minimal change, we propose two different notions of repair. The first one is called deletion-based mapping repair and reflects the simple idea of repairing a mapping through a (subset-)minimal deletion of assertions from the initial mapping. Definition 2 (Deletion-based mapping repair). Let J = hT , S, Mi be an OBDA specification such that M is globally consistent for hT , Si, T 0 a consistent TBox, S 0 a consistent source schema, and M0 a mapping such that M0 ⊆ M. We say that M0 is a deletion-based mapping repair (DMR) for J under update hT 0 , S 0 i if: (i) M0 is globally consistent for hT 0 , S 0 i, and (ii) there exists no mapping M00 such that M00 is globally consistent for hT 0 , S 0 i, and M0 ⊂ M00 ⊆ M. The second notion of repair, called entailment-based mapping repair, relies on the notion of mapping entailment set (MES): the mapping entailment set of an OBDA spec- ification J for a mapping language L is the set of mapping assertions in L that are logical consequences of J . Then, the repairs are the globally consistent subsets of the MES that are selected according to a preference criterion (the notion of fewer changes [2]) that formalizes the intuitive principle of preferring insertions over deletions. In practice, a mapping M1 has fewer deletions (resp., fewer insertions) than a mapping M2 with respect to a mapping M if M\M1 ⊂ M\M2 (resp., M1 \M ⊂ M2 \M), and M1 has fewer changes than M2 with respect to M if either M1 has fewer dele- tions than M2 with respect to M, or M1 and M2 have the same deletions with respect to M, and M1 has fewer insertions than M2 with respect to M. Definition 3 (Entailment-based mapping repair). Let J = hT , S, Mi be an OBDA specification such that M is globally consistent for hT , Si, T 0 a consistent TBox, S 0 a consistent source schema, L a mapping language, and M0 an L-mapping. We say that M0 is an entailment-based L-mapping repair (L-EMR) for J under update hT 0 , S 0 i if: (i) M0 is globally consistent for hT 0 , S 0 i; and (ii) there exists no L-mapping M00 such that M00 is globally consistent for hT 0 , S 0 i and MES L (hT 0 , S 0 , M00 i) has fewer changes than MES L (hT 0 , S 0 , M0 i) with respect to MES L (J ). Example 1. Consider the OBDA specification J = hT , S, Mi where: T = {C v F, C v A, ∃R v B, A v B}, S = {T1 /2}, M = {T1 (x, y) ; R(x, y), T1 (x, y) ; C(x)}, and consider the TBox T 0 = T ∪ {A v ¬∃R}. It is easy to see that M is not globally consistent for hT 0 , Si, since for every database that activates M the mapping produces two facts of the form R(x, y) and C(x), which vio- late the TBox assertion C v ¬∃R inferred by T 0 . Then, the DMRs of J under update hT 0 , Si are M01 = {T1 (x, y) ; R(x, y)} and M02 = {T1 (x, y) ; C(x)}. On the other hand, one can easily verify that the following are GAV-EMRs of J under update hT 0 , Si: M001 = {T1 (x, y) ; R(x, y), T1 (x, y) ; F (x)} and M002 = {T1 (x, y) ; C(x)}. Such repairs are contained in all GAV-EMRs of J under update hT 0 , Si. t u Then, we define query entailment under mapping repairs, which corresponds to a form of skeptical reasoning over all the mapping repairs. Definition 4 (Query Entailment). Let J = hT , S, Mi be an OBDA specification such that M is globally consistent for hT , Si, T 0 a consistent TBox, S 0 a consistent source schema, D a source instance satisfying the ICs in S 0 , and q a Boolean CQ over over the signature of T 0 . We say that q is entailed under DMR (resp. L-EMR where L is a mapping language) by J , T 0 , S 0 , and D if (hT 0 , S 0 , M0 i, D) |= q for every DMR (respectively, for every L-EMR) for J under update hT 0 , S 0 i. Let J = hT , S, Mi and T 0 be as in Example 1, and let D = {T1 (a, b)}. Given the DMRs M01 , M02 and the GAV-EMRs M001 , M002 above described, it follows that the query B(a) is entailed under DMR by J , T 0 , S 0 , and D; moreover, the query ∃x.F (x) is not entailed under DMR by J , T 0 , S 0 , and D, while it is entailed under GAV-EMR. We finally provide initial results on the complexity of query entailment, focusing on DL-LiteR TBoxes, simple source schemas (i.e., without ICs), and on GAV and GLAV mappings. We first establish an exact bound for query entailment under deletion-based mapping repairs, which holds for both GAV and GLAV mappings. Theorem 1. Let J = hT , S, Mi be an OBDA specification, with T a DL-LiteR TBox, S a simple source schema, and M a GLAV mapping. Let T 0 be DL-LiteR TBox, S 0 a simple source schema, D an instance for S, and q a Boolean CQ over the signature of T 0 . Deciding whether q is entailed under DMR by J , T 0 , S 0 , and D is Π2p -complete. Then, we study the same problem under entailment-based mapping repairs and GAV mappings, and prove that the Π2p exact bound holds also in this case. Theorem 2. Let J = hT , S, Mi be an OBDA specification, where T is a DL-LiteR TBox, S is a simple source schema, and M is a GAV mapping. Let T 0 be DL-LiteR TBox, S 0 a simple source schema, D an instance for S 0 , and q a Boolean CQ over the signature of T 0 . Deciding whether q is entailed under GAV-EMR by J , T 0 , S 0 , and D is Π2p -complete. References 1. N. Antonioli, F. Castanò, S. Coletta, S. Grossi, D. Lembo, M. Lenzerini, A. Poggi, E. Virardi, and P. Castracane. Ontology-based data management for the italian public debt. In Proc. of the 8th Int. Conf. on Formal Ontology in Information Systems (FOIS), pages 372–385, 2014. 2. R. Fagin, G. M. Kuper, J. D. Ullman, and M. Y. Vardi. Updating logical databases. Advances in Computing Research, 3:1–18, 1986. 3. E. Jiménez-Ruiz, C. Meilicke, B. C. Grau, and I. Horrocks. Evaluating mapping repair systems with large biomedical ontologies. In Proc. of the 26th Int. Workshop on Description Logic (DL), CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, pages 246–257, 2013. 4. D. Lembo, J. Mora, R. Rosati, D. F. Savo, and E. Thorstensen. Towards mapping analysis in ontology-based data access. In Proc. of the 8th Int. Conf. on Web Reasoning and Rule Systems (RR), volume 8741 of Lecture Notes in Computer Science, pages 108–123. Springer, 2014. 5. D. Lembo, J. Mora, R. Rosati, D. F. Savo, and E. Thorstensen. Mapping analysis in ontology- based data access: Algorithms and complexity. In Proc. of the 14th Int. Semantic Web Conf. (ISWC), pages 217–234, 2015. 6. D. Lembo, R. Rosati, V. Santarelli, D. F. Savo, and E. Thorstensen. Approaching OBDA evolution through mapping repairs. Manuscript, 2016. 7. C. Meilicke, H. Stuckenschmidt, and A. Tamilin. Reasoning support for mapping revision. J. Log. Comput., 19(5):807–829, 2009. 8. Ö. L. Özçep. Towards principles for ontology integration. In Proc. of the 5th Int. Conf. on Formal Ontology in Information Systems (FOIS), pages 137–150, 2008. 9. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. Linking data to ontologies. J. on Data Semantics, X:133–173, 2008. 10. G. Qi, Q. Ji, and P. Haase. A conflict-based operator for mapping revision. In Proc. of the 22nd Int. Workshop on Description Logic (DL), CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, 2009. 11. E. Rahm and P. A. Bernstein. An online bibliography on schema evolution. SIGMOD Record, 35(4):30–31, 2006. 12. F. Zablith, G. Antoniou, M. d’Aquin, G. Flouris, H. Kondylakis, E. Motta, D. Plexousakis, and M. Sabou. Ontology evolution: a process-centric survey. Knowledge Eng. Review, 30(1):45–75, 2015.