Introduction

Querying Inconsistent Description Logic Knowledge Bases under Preferred Repair Semantics (Extended Abstract)

Meghyn Bienvenu

Camille Bourgaux

Francois Goasdoue

0 0 IRISA, Universite de Rennes 1 , Lannion , France 1 LRI, CNRS & Universite Paris-Sud , Orsay , France

Many inconsistency-tolerant semantics for DLs rely on the notion of a repair, i.e., a -maximal subset of the ABox that is consistent with the TBox. This extended abstract presents our investigation [4] of variants of two popular inconsistency-tolerant semantics obtained by replacing classical repairs by various types of preferred repair.

Introduction

Recent years have seen an increasing interest in performing database-style query answering over description logic (DL) knowledge bases (KBs). Since scalability is crucial in data-rich applications, there has been a trend to using lightweight DLs for which query answering is tractable w.r.t. the size of the ABox. Particular attention has been paid to DLs of the DL-Lite family [ 6 ] for which query answering can be reduced to evaluation of standard database queries.

An important issue that arises in the context of DL query answering is how to handle the case in which the ABox is inconsistent with the TBox. Indeed, while it may be reasonable to assume that the TBox has been properly debugged, the ABox will typically be very large and subject to frequent modi cations, both of which make errors likely. Since it may be too di cult or costly to identify and x these errors, it is essential to be able to provide meaningful answers to queries in the presence of such data inconsistencies. To address this issue, several di erent inconsistency-tolerant semantics have been proposed for querying inconsistent DL knowledge bases. Among them, the AR and IAR semantics [ 10 ] are the most well-known and well-studied. Both semantics are based upon the notion of a repair, de ned as an inclusion-maximal subset of the ABox which is consistent with the TBox. The AR semantics, which was inspired by consistent query answering in relational databases (cf. [ 2 ] for a survey), amounts to computing those answers that hold no matter which repair is chosen, whereas the more cautious IAR semantics queries the intersection of the repairs.

When additional information on the reliability of ABox assertions is available, it is natural to use this information to identify preferred repairs, and to use the latter as the basis of inconsistency-tolerant query answering. This motivated us to study variants of the AR and IAR semantics obtained by replacing the classical notion of repair by one of four di erent types of preferred repairs. Cardinalitymaximal repairs, denoted -repairs, are intended for settings in which all ABox assertions are believed to have the same likelihood of being correct. The other three types of preferred repairs target the scenario in which some assertions are considered to be more reliable than others, which can be captured qualitatively by partitioning the ABox into priority levels (and then applying either the set inclusion ( P -repairs) or cardinality ( P -repairs) criterion to each level), or quantitatively by assigning weights to the ABox assertions and selecting those repairs having the greatest weight ( w-repairs). 2

Complexity Results

There have been several studies of the complexity of querying DL knowledge bases under the AR and IAR semantics for the standard notion of repairs [ 10, 14, 3, 5 ]. The complexity of AR query answering with - or w-repairs was rst studied in the database setting for denial constraints [ 11 ], and more recently in the DL setting for the highly expressive SHIQ [ 9 ]. However, there had been no systematic study of the computational properties of preferred repairs for important lightweight DLs. We addressed this gap in the literature by studying the complexity of answering conjunctive and atomic queries under the eight preferred repair semantics (cf. Figure 1). We focused on the lightweight logic DL-LiteR (which underlies the OWL 2 QL pro le [ 13 ]), though many of our results hold also for other data-tractable ontology languages.

Unsurprisingly, the results are mainly negative, showing that adding preferences increases complexity. For the IAR semantics, we move from polynomial data complexity in the case of (plain) IAR semantics to coNP-hard data complexity (or worse) for IAR semantics based on preferred repairs. For the AR semantics, query answering is known to be coNP-complete in data complexity already for the standard notion of repairs, but adding preferences often results in even higher complexities. The sole exception is P -repairs (which combine priority levels and the set inclusion criterion), for which both AR and IAR query answering are \only" coNP-complete in data complexity.

The lower combined complexity of the IAR semantics shows that they still retain some computational advantage over the AR semantics. Intuitively, this is because one can precompute the intersection of repairs in an o ine phase and then utilize standard querying algorithms at query time, whereas no such pre-computation is possible for the AR semantics. 3

System and Experiments

We are aware of two systems for inconsistency-tolerant query answering over DL KBs: the system of [ 9 ] for querying SHIQ KBs under w-AR semantics, and the QuID system [ 15 ] that handles IAR semantics (without preferences) and DLLiteA KBs. The latter work left open whether, for the standard notion of repairs,

P P & w

AR coNP 2p[O(log n)] coNP py 2

Data complexity IAR in P

2p[O(log n)] coNP py 2

P P & w

AR p 2 p 2 p 2 p 2 IAR

NP 2p[O(log n)] 2p[O(log n)] py 2

Combined complexity

the IAR semantics constitutes a good approximation of the AR semantics, and whether the AR semantics could be feasibly computed in practice.

Encouraged by the performance of modern-day SAT solvers, we proposed a practical SAT-based approach for query answering in DL-LiteR under the (plain) AR, P -IAR, and P -AR semantics. We showed how to encode query answering under these three semantics in terms of propositional unsatis ability, using a reachability analysis on the con ict graph (inspired by [ 7 ]) to reduce the size of the encodings. We implemented the approach in our CQAPri system. The ABox is stored in a PostgreSQL server, while the TBox is kept in-memory together with the pre-computed set of con icts for the KB. When a query arrives, CQAPri evaluates it over the ABox using its SQLized rewriting (obtained with the Rapid query rewriting engine [ 8 ]), to get its possible answers and their causes. Among the possible answers, CQAPri identi es the IAR ones, or an approximation of the

P -IAR ones. For those possible answers that are not found to be IAR answers (resp. in the approximation of the P -IAR answers), deciding whether they are entailed under the AR semantics (resp. the P -IAR and P -AR semantics), is done using our encodings and the SAT4J SAT solver [ 1 ].

An experimental evaluation demonstrates the scalability of the approach in settings we presume realistic. We considered the DL-LiteR LUBM290 TBox from [ 12 ] and the associated Extended University Data Generator (EUDG). We extended LUBM290 with negative inclusions and introduced inconsistencies in the ABox. Prioritizations of ABox were made to capture a variety of scenarios. The experiments show that our system scales up to large ABoxes (up to about 2 million assertions) for the IAR/AR and P -IAR/ P -AR semantics, when the number of con icting assertions varies from a few percent (for all of these semantics) to a few tens of percent (only for IAR/AR). This positive empirical result is due in large part to the e cacy of the incomplete methods, which leave only few cases to be handled by the SAT solver. In a similar vein, our simple approximation of the P -IAR semantics often produced a large share of the P -IAR answers, which themselves constituted a large portion of the P -AR answers. Acknowledgements This work was supported by contract ANR-12-JS02-007-01. We acknowledge Despoina Trivela for her work on an early version of the system.

1. Berre , D.L. , Parrain , A. : The sat4j library, release 2.2. JSAT 7 ( 2-3 ), 59 { 64 ( 2010 )

2. Bertossi , L.E. : Database Repairing and Consistent Query Answering . Synthesis Lectures on Data Management , Morgan & Claypool Publishers ( 2011 )

3. Bienvenu , M. : On the complexity of consistent query answering in the presence of simple ontologies . In: Proc. of AAAI ( 2012 )

4. Bienvenu , M. , Bourgaux , C. , Goasdoue , F. : Querying inconsistent description logic knowledge bases under preferred repair semantics . In: Proc. of AAAI ( 2014 )

5. Bienvenu , M. , Rosati , R. : Tractable approximations of consistent query answering for robust ontology-based data access . In: Proc. of IJCAI ( 2013 )

6. Calvanese , D. , De Giacomo , G. , Lembo , D. , Lenzerini , M. , Rosati , R. : Tractable reasoning and e cient query answering in description logics: The DL-Lite family . Journal of Automated Reasoning 39 ( 3 ), 385 { 429 ( 2007 )

7. Chomicki , J. , Marcinkowski , J. , Staworko , S. : Computing consistent query answers using con ict hypergraphs . In: Proc. of CIKM . pp. 417 { 426 ( 2004 )

8. Chortaras , A. , Trivela , D. , Stamou , G.: Optimized query rewriting for OWL 2 QL . In : Proc. of CADE ( 2011 )

9. Du , J. , Qi , G. , Shen , Y.D. : Weight-based consistent query answering over inconsistent SHIQ knowledge bases . Knowledge and Information Systems 34 ( 2 ), 335 { 371 ( 2013 )

10. Lembo , D. , Lenzerini , M. , Rosati , R. , Ruzzi , M. , Savo , D.F. : Inconsistency-tolerant semantics for description logics . In: Proc. of RR . pp. 103 { 117 ( 2010 )

11. Lopatenko , A. , Bertossi , L.E. : Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics . In: Proc. of ICDT . pp. 179 { 193 ( 2007 )

12. Lutz , C. , Seylan , I. , Toman , D. , Wolter , F. : The combined approach to OBDA: Taming role hierarchies using lters . In: Proc. of ISWC . pp. 314 { 330 ( 2013 )

13. Motik , B. , Cuenca Grau , B. , Horrocks , I. , Wu , Z. , Fokoue , A. , Lutz , C. : OWL 2 Web Ontology Language pro les . W3C Recommendation (11 December 2012 ), http://www.w3.org/TR/owl2-profiles/, available at http://www.w3.org/ TR/owl2-profiles/

14. Rosati , R.: On the complexity of dealing with inconsistency in description logic ontologies . In: Proc. of IJCAI . pp. 1057 { 1062 ( 2011 )

15. Rosati , R. , Ruzzi , M. , Graziosi , M. , Masotti , G. : Evaluation of techniques for inconsistency handling in OWL 2 QL ontologies . In: Proc. of ISWC . pp. 337 { 349 ( 2012 )