Introduction

Closed Predicates in Description Logics: Results on Combined Complexity (Extended Abstract)

Nhung Ngo

ngo@inf.unibz.it 0

Magdalena Ortiz

ortiz@kr.tuwien.ac.at 1

Mantas Sˇimkus

1 0 Free University of Bozen-Bolzano 1 Vienna University of Technology

Introduction

Description Logics (DLs) are a family of languages for knowledge representation, popular for writing ontologies that describe domain-specific knowledge. As fragments of classical first-order logic, DLs make the open-world assumption. That is, the semantics of a knowledge base (KB) is given by the set of all its models. This view naturally supports modeling of incomplete knowledge. However, it is not always fully adequate, as in many applications of DLs, incomplete knowledge interacts with complete information. For example, in the popular paradigm of ontology-based data access, DL ontologies are advocated as a means to leverage domain knowledge when querying data sources. These sources often stem from traditional relational databases, which have a closedworld view and are assumed to fully describe the instances included in a relation. E.g., a municipality-provided table of bus routes can be assumed complete. Query answering algorithms should exploit this to exclude irrelevant models and infer more answers.

Combining open and closed world reasoning is not a new topic in DLs [ 3 ], but it has received renewed attention in recent years [ 13, 12, 7, 6, 18 ]. A prominent proposal for achieving partial closed world reasoning is to enrich KBs with a set of predicates that are to be interpreted as closed [ 12 ]. In this way, some relations are interpreted under the closed semantics, while others are considered open.

Most work considering closed predicates has focused on the problem of evaluating a query (usually a conjunctive query) under the standard certain answer semantics, and on data complexity, i.e. the complexity measured in the size of the data instance to be queried, while both the ontology and the query are assumed to be fixed. E.g., Franconi et at. ([ 6 ]) show that closed predicates lead to the loss of tractability of query answering even in the core fragments of DL-Lite, and provided the first matching upper bounds. An in-depth analysis of the reasons for intractability, and a fine-grained analysis of tractable classes can be found in [ 12, 13 ].

The data complexity is an important indicator of whether algorithms can scale to large amounts of data. However, it is also fundamental to understand the combined complexity, which takes the size of the ontology and the query along with the data into account, as the size of these two components often has a big impact on the performance of query answering algorithms. Indeed, ontologies can be very large [ 15 ], and disregarding their size easily leads to infeasible algorithms. The query size can also have a major impact on the query answering techniques, and avoiding the exponential blow-up in the size of the query was in fact one of the hardest obstacles to overcome for achieving scalable query answering even for DLs of very low data complexity like dialects of DL-Lite [ 1 ]. It is thus surprising that the combined complexity of query answering in the presence of closed predicates has received so little attention. In this extended abstract, we summarize several results on the combined complexity of reasoning in the presence of closed predicates in a wide range of DLs, which are presented in detail in [ 14 ]. 2

DLs with closed predicates and query answering

We use NC, NR and NI for concept names, role names and individual names, respectively. DL knowledge bases (KB) take the form K = (T ; ; A). The TBox T is a set of axioms , which usually include concept inclusions C v D and role inclusions r v s. The specific syntax of concepts C; D, roles r; s and other possible axioms depends on the DL in question. is a set of concept and role names, called the closed predicates of K. Finally, the ABox A is a finite set of assertions of the forms A(b) and r(a; b) with A 2 NC, r 2 NR and a; b 2 NI.

The notion of an interpretation I = ( I ; I ) is as usual. We make the standard name assumption (SNA), i.e., aI = a for all I and individuals a. Concepts C are interpreted as sets of objects CI I and roles r as binary relations rI I I in the standard way. The notions of TBox and ABox satisfaction I j= T and I j= A are also as usual. We say that I respects a given NC [ NR if: (a) for all A 2 \ NC, if d 2 AI , then A(d) 2 A, and (b) for all r 2 \ NR, if (d; d0) 2 rI , then r(d; d0) 2 A.

We call I a model of a KB K = (T ; ; A), and write I j= K, if I j= T , I j= A, and I respects the closed predicates in .

The main problem we study here is query answering, which is a generalization of standard reasoning tasks such as consistency testing and instance checking. Let NV be a countably infinite set of variables. Boolean conjunctive queries (BCQs), take the form q = 1 ^ : : : ^ n, where each j is an atom of the form A(t) or r(t; t0) with A 2 NC, r 2 NR, and ft; t0g NI [ NV. Queries are given the standard first order semantics, where all variables are existentially quantified: we write I j= q0 if there is an assignment from the variables in q to objects in I that makes all atoms true. A Boolean union of conjunctive queries (BUCQ) is of the form q0 = q1 _ : : : _ qn, where each qi is a BCQ. Given an interpretation I, we write I j= q0 if there is i 2 f1; : : : ; ng such that I j= qi, and given a KB K we write K j= q if I j= q holds for every model I of K. Given K and a B(U)CQ q as above, the query entailment problem is to decide whether K j= q.All results below apply also to query answering (as a decision problem) for queries with answer variables using the standard reduction to query entailment. 3

Results

In the presence of complete data, already consistency checking and instance query entailment in the basic DL-Litecore are NP-complete. For the hardness, it is not hard to reduce 3-colorability to deciding consistency of a DL-Litecore KB with closed concepts.

Without closed predicates

With closed predicates Instance queries B(U)CQs

Instance queries

B(U)CQs DL-Litecore DL-LiteR EL ALCO SHOQ, SHOI

NL [ 1 ] NL [ 1 ]

For the membership, we provide an algorithm that shows an NP upper bound even for rich DL-Lite dialects with Boolean combinations of concepts, role inclusions, and nominals, and for UCQs without existentially quantified variables.

In E L we can use DBoxes to express disjunction, atomic negation, and nominals, making E L as expressive as ALCO, and causing consistency testing to require exponential time in the worst case. Tight upper bounds for instance queries and quantifier free UCQs for any DL containing E L can then be inferred by a reduction to standard reasoning in well-known expressive DLs with nominals.

For more expensive queries, we show that entailment of UCQs is 2EXPTIME-hard for DL-LiteR and E L based on a reduction from the word problem for Alternating Turing machines (ATMs) with exponential work space [ 5 ]. Intuitively, using closed concepts can use a simple KB to enforce candidate computations of a given ATM, and then UCQs can test whether the represented computation is correct and accepting. The result is in sharp contrast to the NP upper bound in the setting with no DBox predicate, and coincides with known upper bounds for much richer DLs. A minor variation of the reduction allows shows coNEXPTIME-hardness for DL-Litecore, but we still don’t know if this bound is tight.

A similar reduction allows us to prove 2EXPTIME-hardness of CQ entailment in the standard DL ALCO (with no DBox). This closes an open problem and singles out nominals as a previously unidentified source of complexity when answering queries over expressive DLs. A summary of our result and its comparison to the case without DBox is provided in Table 1.

In the light of these negative results, we also looked for classes of queries for which the query answering problem has lower complexity. In particular, we considered a restriction on variables called K-safety which guarantees that it is sufficient to consider assignments of the variable to individuals in the KB, and there is no need to consider other objects in the interpretation domain . A query is called K-safe if all its variables are K-safe. We also generalize K-safe queries to K-acyclic by allowing for variables that are not K-safe, but requiring that they induce only acyclic subqueries in the original query. 4

Conclusion

Closed predicates seem to have a great potential in applications of DLs, but as our results show, they are computationally costly. It remains a challenge to identify useful restricted settings with better complexity, and to explore other ways for effectively supporting partial completeness at a lower computational cost.

We note that all hardness proofs in the paper are given for KBs (T ; ; A) where contains only a constant number of concept names (which can be further reduced to a single one in DLs that can express disjointness). Moreover, they also apply to KBs (T ; ;; A) with no closed predicates, but where axioms A v fa; bg with the so-called one-of or enum are allowed. Hence our results also shed light on the effect of nominal enumerations in lightweight DLs.

Some interesting questions remain open for the DL-Lite family, like the precise complexity of (U)CQs in DL-Litecore. We remark that inverses play a very limited role in the hardness results here. They all apply to 1-way DL-Lite KBs, where only role names r occur in the right-hand side of axioms, and inverse roles r on the left-hand side; these KBs have directed models in the style of E L and ALC.

Acknowledgments

This work was partially supported by the Austrian Science Fund (FWF) projects T515, P25207, and the project ORMiE from the Free University of Bozen-Bolzano.

Alessandro

Artale , Diego Calvanese, Roman Kontchakov, and

Michael

Zakharyaschev . The dl-lite family and relations . J. Artif. Int. Res. , 36 ( 1 ): 1 - 69 , September 2009 .

Franz

Baader ,

Sebastian

Brand , and

Carsten

Lutz . Pushing the EL envelope . In Proc. of IJCAI 2005 , pages 364 - 369 . Morgan-Kaufmann Publishers, 2005 .

Marco

Cadoli , Francesco M. Donini , and Marco Schaerf . Closed world reasoning in hybrid systems . In Proc. of ISMIS 1990 , pages 474 - 481 . Elsevier Science Publishing, 1990 .

4. Diego Calvanese, Thomas Eiter, and

Magdalena

Ortiz . Regular path queries in expressive description logics with nominals . In Proc. of IJCAI 2009 , pages 714 - 720 , 2009 .

5. Ashok

Chandra , Dexter C. Kozen , and Larry J. Stockmeyer . Alternation. Journal of the ACM , 28 ( 1 ): 114 - 133 , 1981 .

Enrico

Franconi , Yazmin Ange´lica Iba´n˜ez-Garc´ıa, and Inanc¸ Seylan. Query answering with DBoxes is hard . Electr. Notes Theor. Comput. Sci. , 278 : 71 - 84 , 2011 .

Enrico

Franconi , Volha Kerhet, and

Nhung

Ngo . Exact query reformulation over databases with first-order and description logics ontologies . J. Artif. Intell. Res. (JAIR) , 48 : 885 - 922 , 2013 .

Birte

Glimm , Ian Horrocks, and

Ulrike

Sattler . Deciding shoqˆcap knowledge base consistency using alternating automata . In Franz Baader, Carsten Lutz, and Boris Motik, editors, Proceedings of the 21st International Workshop on Description Logics (DL2008) , Dresden, Germany, May 13-16, 2008 , volume 353 of CEUR Workshop Proceedings. CEUR-WS.org , 2008 .

Birte

Glimm , Ian Horrocks, and

Ulrike

Sattler . Unions of conjunctive queries in SHOQ . In Proc. of the 11th International Conference on the Principles of Knowledge Representation and Reasoning (KR-08) , pages 252 - 262 . AAAI Press/The MIT Press, 2008 .

10.

Jan

Hladik . A tableau system for the description logic SHIO . In IJCAR Doctoral Programme , 2004 .

11. Markus Kro¨ tzsch and Sebastian Rudolph . Conjunctive queries for E L with composition of roles . volume 250 of CEUR Electronic Workshop Proceedings , http://ceur-ws. org/ Vol- 250 /, 2007 .

12. Carsten

Lutz

, Inanc¸ Seylan, and

Frank

Wolter . Ontology-based data access with closed predicates is inherently intractable (sometimes) . In Proc. of IJCAI 2013. IJCAI/AAAI , 2013 .

13. Carsten

Lutz

, Inanc¸ Seylan, and

Frank

Wolter . Ontology-mediated queries with closed predicates . In Proc. of IJCAI 2015 , 2015 .

14. Nhung

Ngo

, Magdalena Ortiz, and

Mantas

Simkus . Closed predicates in description logics: Results on combined complexity . In KR . AAAI Press, April 2016 .

15.

Cathy

Price and

Kent

Spackman . Snomed clinical terms . British Journal of Healthcare Computing and Information Management , 17 ( 3 ): 27 - 31 , 2000 .

16.

Riccardo

Rosati . On conjunctive query answering in E L. volume 250 of CEUR Electronic Workshop Proceedings , http://ceur-ws. org/ Vol- 250 /, 2007 .

17.

Klaus

Schild . A correspondence theory for terminological logics: Preliminary report . In Proc. of IJCAI 1991 , pages 466 - 471 . Morgan Kaufmann, 1991 .

18. Inanc¸ Seylan,

Enrico

Franconi , and Jos de Bruijn. Effective query rewriting with ontologies over DBoxes . In Proc. of IJCAI 2009 , pages 923 - 925 , 2009 .

19.

Stephan

Tobies . Complexity Results and Practical Algorithms for Logics in Knowledge Representation . PhD thesis , LuFG Theoretical Computer Science, RWTH-Aachen, Germany, 2001 .