Introduction

Clopen Knowledge Bases: Combining Description Logics and Answer Set Programming?

Labinot Bajraktari

Magdalena Ortiz

Mantas Sˇ imkus

0 0 Institute of Information Systems , TU Wien , Austria

Answer Set Programming (ASP) and ontology languages like Description Logics (DLs) play leading roles in Knowledge Representation and Reasoning (KR&R). ASP and DLs have largely orthogonal features because they make very different assumptions regarding the completeness of information, and thus reasoning techniques and algorithms that are deployed in ASP are significantly different from the ones used in DLs. Combining ASP, which makes the closed-world assumption (CWA), with DLs, which make the open-world assumption (OWA), into expressive hybrid languages that would enjoy the positive features of both has received significant attention in the last decade (see, e.g., [23, 24, 7, 20, 19]). However, progress on understanding the relationship between different hybrid languages, and their relationship with more standard languages like plain ASP, has been limited, as has the development of efficient reasoning algorithms and implementations. These and related problems are investigated in this paper for a new hybrid language called Clopen Knowledge Bases (CKBs), which generalizes and improves the prominent r-hybrid language [23] and DL+LOG [24]. Each CKB is a triple H = (P ; '; ), where P is a disjunctive Datalog program with “not ” literals in rule bodies, ' is theory (e.g., in first-order logic), and is a set of predicate symbols. Intuitively, specifies the predicates that should be interpreted under the OWA; the remaining predicates should be interpreted under the CWA (see Section 3 for details). The contributions of this paper can be summarized as follows: We introduce CKBs, and define for them a stable model semantics, inspired the semantics given by Rosati to r-hybrid and DL+LOG KBs. In a nutshell, the major difference between the latter formalisms and CKBs is that CKBs allow to use CWA predicates in the theory. This allows for more convenient knowledge representation, but also causes technical challenges. We study automated reasoning in CKBs. To this end, in Section 4 we provide a general decidability result for checking entailment of ground atoms and consistency testing in CKBs H = (P ; '; ), where ' is expressed in the guarded negation fragment of FO (GNFO) [4]. This is a very expressive fragment that subsumes the more prominent guarded fragment of FO, as well as many expressive DLs. We give a NEXPTIME2EXPTIME upper bound for inference from GNFO-based CKBs (we note that satisfiability of GNFO formulas is 2EXPTIME-hard). In Section 5 we study reasoning in CKBs H = (P ; '; ), where ' is expressed in the very expressive DL ALCHIO. We show that the (combined) complexity of ? The work was supported by the Austrian Science Fund's projects P25207, P25518, and W1255.

Introduction

reasoning in such CKBs is not higher than in standard (non-ground) ASP. If we assume bounded predicate arities in rules, the basic reasoning problems are EXPTIMEcomplete, which coincides with the complexity of standard problems in ALCHIO.

We explore ways to implement reasoning in CKBs. To this end, we define a restricted class of separable CKBs, and present a translation from separable CKBs into standard ASP programs, thus enabling the reuse of existing ASP solvers. Roughly, the idea is to compile the necessary knowledge about the ontology into a set of disjunctive Datalog rules. Together with the original rules of the CKB, they form a plain ASP program whose stable models are in close correspondence with the stable models of the input CKB (see Section 5.2). The translation actively exploits the structure of the data in the input CKB in order to minimize non-deterministic choices.

We have implemented our translation from separable CKBs with ALCH ontologies into plain ASP, and present here some promising empirical results. We pit our approach against an alternative implementation based on a polynomial time translation (given in the appendix of the extended version of this paper [ 3 ]) from separable CKBs into the so-called dl-programs [ 7 ]. Intuitively, a dl-program for a CKB effectively implements a naive algorithm for reasoning in CKBs. In particular, such a dl-program non-deterministically guesses a (relatively large) set of ground atoms, and then uses an external query (a dl-atom) to update the ontology that is checked for consistency by an external DL reasoner. Our experiments show that the translation into plain ASP provides a dramatic performance improvement over the implementation based on dl-programs. 2

Preliminaries

In this paper we talk about logics which are, in general, sets of theories, and our results are for specific logics that are fragments of standard FO. We start by introducing the notions of (relational) interpretations, as usual in FO, and Herbrand interpretations, as usual in rule languages.

Interpretations and models. We assume a countably infinite set Sconst of constants, and a countably infinite set Spred of predicate symbols. Each r 2 Spred is associated with a non-negative integer, called the arity of r. An interpretation is a pair I = ( I ; I ) that consists of a non-empty set I (called domain), and a valuation function I that maps (i) each constant c 2 Sconst to an element cI 2 I , and (ii) each predicate symbol r to a set rI ( I )n, where n is the arity of r.

We assume a countably infinite set T of theories. Each theory ' 2 T is associated with a set mods (') of interpretations. Each I 2 mods (') is called a model of '. We assume that > 2 T, and we let mods (>) be the set of all interpretations. A logic is simply a set of theories L T. As concrete logics we will consider various fragments of FO; the notion of a model for a theory ' in FO is the standard one. (ii) r I~ = fu j r(u) 2 I g for all r 2 Spred.

Atoms and Herbrand interpretations. We assume a countably infinite set Svar of variables. The elements of Sconst [ Svar are called terms. An atom is an expression of the form r(t1; : : : ; tn), where r 2 Spred, n is the arity of r, and t1; : : : ; tn are terms. An atom is called ground if no variables occur in it. An Herbrand interpretation I is any set of ground atoms. An Herbrand interpretation I can be seen as an ordinary interpretation: we let I~ = ( I~; I~) be the interpretation such that (i) I~ = Sconst, and

Clopen Knowledge Bases

We next define the syntax and the semantics of our hybrid language.

Syntax. A rule is an expression of the form p1 _ : : : _ pk pk+1; : : : ; pl; not pl+1; : : : ; not pm (1) such that p1; : : : ; pm are atoms. We define head ( ) = fp1; : : : ; pkg, body +( ) = fpk+1; : : : ; plg, and body ( ) = fpl+1; : : : ; pmg. We call the expression not p, where p is an atom, a negated atom. If head ( ) = ;, then is a constraint.

A program P is a set of rules. A Clopen Knowledge base (CKB) is a triple H = (P ; '; ), where P is a program, ' 2 T is a theory, and Spred. The predicate symbols in (resp., in Spred n ) are called the open predicates (resp., closed predicates) w.r.t. H . The CKB H is called safe if the following is satisfied for every rule 2 P : every variable that appears in also appears in some atom r(u) 2 body +( ) with r 62 . Unless stated otherwise, all considered CKBs are safe.

A rule or program is called ground (resp., positive) if no variables (resp., negated atoms) occur in it. A ground rule r(u) is called a fact. We write r(u) 2 P in case the fact r(u) is present in a program P .

As usual, dom(f ) denotes the domain of a function f , and ran(f ) its range. A substitution is a partial function from Svar to Sconst. For a rule and a substitution , we use ( ) to denote the rule that is obtained from by replacing every variable X 2 dom( ) with (X). The grounding of a program P (in symbols, ground (P )) is the ground program that consists of all ground rules 0 such that 2 P and 0 = ( ) for some substitution . Note that ground (P ) is infinite when P has at least one variable. Semantics. An Herbrand interpretation I is called a model of a ground positive program P if body +( ) I implies head ( ) \ I 6= ; for all 2 P . Moreover, I is a minimal model of P if, in addition, there is no J ( I such that J is a model of P .

Given a program P , an Herbrand interpretation I, and a set Spred, the reduct P I; of P w.r.t. I and is the ground positive program that is obtained from the program ground (P ) in two steps: (1) Delete every rule that contains (a) r(u) 2 body +(r) with r 2 and r(u) 62 I, (b) r(u) 2 head (r) with r 2 and r(u) 2 I, or (c) r(u) 2 body (r) with r(u) 2 I. (2) In remaining rules, delete all negated atoms, and all ordinary atoms r(u) with r 2 .

An Herbrand interpretation I is a stable model of a CKB H = (P ; '; ) if the following hold: (i) fr(u) j r(u) 2 I; r 62 (ii) I~ is model of '.

g is a minimal model of P I; , and Relationship to ASP. Assume a program P and an Herbrand interpretation I. We call I a stable model of P if I is a stable model of the CKB H = (P ; >; ;). It is not difficult to see that this definition yields precisely the stable models that can alternatively be computed using the standard definition of stable model semantics in ASP. Indeed, the program P I;; boils down to the standard Gelfond-Lifschitz reduct P I of P w.r.t. I [ 13 ]. Observe that in a CKB H = (P ; '; ;), the theory ' plays the role of integrity constraints on the stable models of the plain program P , i.e. I is a stable model of H iff I is a stable model of P such that I~ 2 mods(').

Relationship to r-hybrid KBs. Our CKBs are a close relative of the r-hybrid KBs of Rosati [23]. The safety restriction here is inspired by the safety condition in r-hybrid KBs, and so is our definition of the semantics via a generalization of the GelfondLifschitz reduct that additionally reduces the program according to the truth value of atoms over open predicates. Intuitively, r-hybrid KBs are a special kind of CKBs in which the rule component can refer to both open and closed predicates, but the theory component can use open predicates only. More formally, an r-hybrid KB H = (T ; P), where T is a theory in FO and P is a Datalog:;_ program as defined in [23], corresponds to the CKB H 0 = (P; T ; ), where is the set of predicates symbols appearing in T . One can verify that the stable models of H 0 are precisely the NM-models of H.

In generic CKBs H = (P ; '; ), where ' is an FO theory, the set need not contain all the predicate symbols that appear in ', i.e., closed predicates may occur in '. Consequently, the extensions of these predicates in (the relevant) models of ' must be justified by program rules. This feature causes technical challenges, but is very useful for declarative specification of problems: in our approach, predicates under the OWA and the CWA can be used both in the program and in the theory of a hybrid KB (see Example 1 for an illustration).

The DL+LOG language is obtained from the r-hybrid language by allowing only DLs for specifying theories, and relaxing the safeness condition to weak safeness [24]. In the appendix of the extended version of this paper we show that, when sufficiently rich DLs are considered, CKBs also generalize DL+LOG [ 3 ].

Reasoning problems. As usual in hybrid languages (see, e.g., [23]), the basic reasoning task for CKBs is entailment of ground atoms. That is, given a CKB H = (P ; '; ) and a ground atom R(u), the problem is to decide whether R(u) 2 I holds for all stable models I of H . This problem can be reduced to checking the non-existence of a stable model for the CKB H 0 = (P [ f R(u)g; '; ). Thus in the rest of the paper we focus on checking the stable model existence for a given CKB. Note that in general a CKB may have infinitely many stable models.

Example 1. The CKB H = (P ; '; ) contains information on the local transport network (provided by the city’s transport authority and assumed to be complete) and on hotels and relevant locations (extracted form the web and not necessarily complete). We have P = P1 [ P2 [ P3, where P1 and P2 contain facts. The network, which is depicted by solid lines at the top of Figure 1, is described in P1. Facts of the form RouteTable(`; s; s0) store that on the line `, station s is followed by station s0. The constants t1 and t2 represent tram lines, while `1 represents a subway line; we have corresponding facts SubwayLine(`1), TramLine(t1), TramLine(t2). P2 contains facts related to locations, including the following (for convenience, CloseTo is depicted with dotted lines).

CloseTo(c1; s1) ; Hotel(h1) ; TramConn(h1) ; Hotel(h2) ; CloseTo(h2; s4) The (self-explanatory) rules in P1 and the theory ' are in Figure 1 (URailConn stands for urban rail connection). If h is a hotel with direct connection to the point of interest

Definition 1 (Program Choose(H )). Assume a CKB H = (P ; '; ). For every n-ary relation symbol r 2 , let r be a fresh n-ary relation symbol that does not appear in H . We let Choose(H ) be the set that contains

r(c1; : : : ; cn) _ r(c1; : : : ; cn) for each n-ary r 2

occurring in P , and each tuple (c1; : : : ; cn) of constants from P .

A stable model I of P [ Choose(H ) can be seen as (partially complete) candidate for a stable model of a CKB H = (P ; '; ). The following proposition, whose proof relies on the imposed CKB safety requirement, tells us when such an I witnesses the existence of a stable model of H .

Proposition 1. A CKB H = (P ; '; ) has a stable model iff P [Choose(H ) has some stable model I for which there exists some I 2 mods(') with the following properties: (C1) (c1I ; : : : ; cIn) 2 rI for all r(c1; : : : ; cn) 2 I, (C2) (c1I ; : : : ; cIn) 62 rI for all r(c1; : : : ; cn) 2 I, and (C3) if (e1; : : : ; en) 2 rI and r 62 e1; : : : ; cIn = en.

, then there exists r(c1; : : : ; cn) 2 I with c1I =

From Proposition 1, we obtain decidability of stable model existence for H = (P ; '; ) whenever we can list the stable models of P [ Choose(H ) and test, for each of them, the existence of a model I of the theory ' satisfying conditions (C1–C3). Moreover, if the logic L in question is strong enough to express, for a fixed candidate I, conditions (C1–C3) as part of a theory in L, then decidability of the underlying satisfiability problem suffices. This applies, in particular, to the guarded negation fragment (GNFO), which is among the most expressive FO fragments for which satisfiability has been established [ 4 ].

We use '[x] to indicate that an FO formula ' has x as free variables. The fragment GNFO contains all formulas that can be built using the following grammar: ' ::= r(u1; : : : ; un) j u1 = u2 j 9x ' j ' ^ ' j ' _ ' j ^ :'[x]; where u1; : : : ; un are terms, and is an atom or an equality statement such that all variables of x also occur in . Intuitively, in GNFO a subformula can be negated only if its free variables are “guarded” by an atom or an equality statement. Observe also that a subformula with a single free variable x can always be guarded by an equality statement x = x. GNFO is flexible and natural for domain modelling; for instance, the theory ' in Example 1 is in GNFO.

The following upper bound can be shown by employing Proposition 1, and an encoding of conditions (C1–C3) in GNFO (see the appendix in [ 3 ], as well as [ 5 ] for a similar trick).

Theorem 1. Checking the stable model existence in CKBs H = (P ; '; ), where ' is in GNFO, is decidable. The problem belongs to the class NEXPTIME2EXPTIME, and is 2EXPTIME-hard.

CKBs and Description Logics

GNFO is very expressive and thus computationally very expensive. We study next CKBs based on DLs, and show that they are (to a large extent) computationally not more expensive than plain ASP. We concentrate here on the expressive DL ALCHIO. To this end, we assume a countably infinite set Scn Spred of unary relation symbols, called concept names, and a countably infinite set Srn Spred of binary relation symbols, called role names. We use the elements of Sconst as individuals. The syntax for ALCHIO concepts, roles, concept and role inclusions, and TBoxes is the usual. We also reuse interpretations I = ( I ; I ) as defined in Section 2, and note that the semantics to all complex concepts and roles is given by extending I in the standard way (see [ 2 ]). The notions of models and satisfiability of TBoxes are also standard. Example 2. The theory ' in Example 1 can be written in the syntax of ALCHIO as follows (we use the axiom C D as a shortcut for the two inclusions C v D, D v C):

SubwayStation t TramStation Station TramConn 9CloseTo:TramStation SubwayConn 9CloseTo:SubwayStation URailConn 9CloseTo:Station

Theorem 2. Deciding stable model existence in CKBs H = (P ; T ; ), where T is an ALCHIO TBox, is NEXPTIMENP-complete. If P is not disjunctive, the problem is NEXPTIME-complete. The problem is EXPTIME-complete, if (i) P is both positive and non-disjunctive, or (ii) the arity of predicate symbols in P is bounded by a constant. The above theorem can be proven by employing an encoding of condition (C3) of Proposition 1 by means of nominals, similarly to the encoding of DBoxes in [ 11 ]. 5.1

Separability We provide next a translation from DL-based CKBs to plain ASP. The translation is given for a large fragment of CKBs, which we call separable CKBs, and which in fact generalizes r-hybrid KBs. To define the fragment we need the notion of a positive occurrence and a negative occurrence of a concept or role name in a (complex) concept C. These notions are defined inductively as follows: - Every concept name A occurs positively in A. - Every role name R occurs positively in 9R:C, for any concept C. - Every role name R occurs negatively in 8R:C, for any concept C. - If a concept name A occurs positively (resp., negatively) in C, then A occurs positively (resp., negatively) in C u D, C t D, 8R:C, and 9R:C, for any concept D and role R. - If a concept or role name occurs positively (resp., negatively) in C, then occurs negatively (resp., positively) in :C.

Definition 2 (Separability). A CKB H = (P ; T ; ) is separable if dCvD2T (:C t D) does not have a positive occurrence of concept or role name with 62 . Example 3. Take the CKB H = (P ; T ; ) with T = f9R:(9P:A) v Bg, P = fQ(X; Y; Z) T (X; Y ); P (Y; Z)g, and = fR; A; Bg. Then H is separable because P occurs only negatively in :(9R:(9P:A)) t B.

Intuitively, in a separable CKB H = (P ; T ; ) the inclusions in T can be used to infer the extensions of open predicates from the extensions of closed predicates and other predicates, but these axioms simply cannot assert membership of a domain element (resp., pair of elements) in a closed concept name (resp., role name). More concretely, for separable CKBs one can show a version of Proposition 1 where the condition (C3) is omitted (the rest of the proposition remains the same). The omission of condition (C3) is a major change: recall that we relied heavily on the equality predicate in GNFO, and on nominals supported in ALCHIO in order to cope with (C3). We note that separable CKBs capture r-hybrid KBs H = (T ; P) with T an ALCHOI TBox. Such KBs, as mentioned in Section 3, correspond to CKBs H = (P; T ; ), where is the set of predicates symbols that appear in T , and which trivially satisfy the separability condition. We remark that the pair (T ; P ) with T ; P from Example 3 is not a safe r-hybrid KB (neither is it weakly safe in the spirit of DL+LOG), because the variable Z does not appear in a rule atom with a predicate symbol that does not occur in T . 5.2

Translation into Plain ASP We describe here our translation from separable CKBs H = (P ; T ; ) into standard ASP. Intuitively, we perform reasoning about T during the translation so that afterwards T can be effectively forgotten. This translation is not polynomial and may take single exponential time in the size of the input. However, our experiments show that in practice this translation performs much better than the translation into dl-programs (which is provided in [ 3 ]). The translation is inspired by existing translations from expressive DLs into disjunctive Datalog [ 17, 9, 6 ], however it actively exploits the structure of the data (i.e., the facts) and is not data-independent. We limit this approach to ALCH TBoxes (i.e., we do not support inverses and nominals).

We assume here TBoxes in normal form, that is, each axiom has the form A1 u : : : u An v B 9R:A v B

A v B1 t : : : t Bm A v 8R:B

A v 9R:B R v S (2) (3) where A; B; Ai; Bi are concept names, > or ?, and R; S are role names. It is well known that any TBox T can be normalized into a TBox T 0 in polynomial time so that T and T 0 have the same models up to the original signature of T (see, e.g., [25]). Definition 3 (The communication rules Comm(H )). For a separable CKB H = (P ; T ; ), let Comm(H ) denote the set of the following rules:

S(X; Y )

R(X; Y ) B(X) B(Y ) r(X; Y ); A(Y ) A(X); r(X; Y ) for each for each for each

R v S 2 T 9R:A v B 2 T A v 8R:B 2 T The program Comm(H ) contains the direct translation of inclusions listed in (3). To deal with the remaining inclusions (the ones listed in (2)), we need the notion of types. Definition 4 (Types). A type is any set Scn [ f:A j A 2 Scng. A type is consistent w.r.t. a TBox T if there exists a model I of T and an element e 2 I such that e 2 (dC2 C)I . We use types(T ) to denote the set of types over the signature of a TBox T that are consistent w.r.t. T .

Definition of the translation. Assume a separable CKB H = (P ; T ; ). For a TBox T , we use vT for the transitive closure of the relation f(R; S) j R v S 2 T g. For every constant c that appears in H , let t(c; H ) be the set of types returned by the non-failing runs of the following non-deterministic procedure: (1) Let = fA j P has the fact A(c) g. (2) Close under the following inference rules: (a) If A1 u u An v B 2 T and fA1; : : : ; Ang , then add B to . (b) If 9S:> v B 2 T , R vT S, and P has a fact R(c; d) for some d, add B to . (c) If > v 8S:B 2 T , R vT S, and P has a fact R(d; c) for some d, add B to .

If in not consistent w.r.t. T , then return failure. (3) Pick a concept name B such that fB; :Bg \ = ;, and B appears in one of the following: (a) in a non-fact rule of P , (b) in some 9R:A v B 2 T or A v 8R:B 2 T such that R appears in a non-fact rule of P , (c) in some 9S:A v B 2 T such that P has the fact R(c; d) for some d, and

R vT S, or (d) in some A v 8R:B 2 T such that P has the fact R(d; c) for some d, and

R vT S.

If the above B does not exist, then return . Otherwise, non-deterministically add to either B or :B, and go to step (2).

Take a fresh unary predicate symbol Type for each 2 t(c; H ) such that c occurs in H . We let ASPdd(H ) be the extension of P [ Comm(H ) with the following rules: (i) for all roles R 2 that appear in a non-fact rule in P , and all constants c; d of P , the disjunctive fact R(c; d) _ R(c; d) , where R is a fresh relation symbol (ii) for each constant c of H , the disjunctive fact W 2t(c;H ) Type (c) (iii) for each constant c of H and type

2 t(c; H ), the following constraints A(c)

Type (c)

Type (c); A(c) for each A 2 for each :A 2 \ Scn

The above translation yields a tool to decide the existence of a stable model for H . Theorem 3. The CKB H has a stable model iff ASPdd(H ) has a stable model.

6 Implementation and Experiments

We next present some experiments that demonstrate the advantages of translating a separable CKB H into a plain program ASPdd(H ). We have implemented our approach in a prototype reasoner. In particular, to build the function t described in Section 5.2, instead of relying on an external DL reasoner for testing consistency of types w.r.t. a TBox, we have implemented our own algorithm for this purpose. It is designed in such a way that the consistency of several types can be tested simultaneously, using caching to avoid recomputation. Consistent types are stored in a database and can be reused for other hybrid knowledge bases over the same ontology. The ASP program resulting from the translation is evaluated with Clingo 4.2.1 [ 12 ].

Our implementation is written in Java and PostgreSQL 9.5.5 database, and uses OWLAPI [ 16 ] to manage ontologies. The experiments were run on a PC with Intel Core i7 CPU and 16GB RAM running 64bit Linux-Mint 17. We compared the performance of our system with an implementation based on a direct encoding into dlprograms (given in the appendix of [ 3 ]), and implemented in dlvhex (an implementation of dl-programs, see [21]). We note that both encodings use Clingo. For benchmarking data, we used as source real-world OpenStreetMap data1, and followed the approach described in [ 10 ] to transform it into Datalog facts. The data describes the city of Vienna and are available from BBBike as database dumps2. The extracted data contains facts about 19517 geographical points in the map treated as constants. Concept assertions were extracted from tags in the mapping data, for points of interest like

Hotel; Restaurant; Shop; Hospital; SubwayStation, etc. There are also facts about re

lations between these points and other constants representing objects of interest such as subway lines, types of cuisine, dishes, etc. Among the plain Datalog relations, we extracted next, relating pairs of points whose distance is below a certain threshold set in meters. By considering different thresholds, ranging from 50 to 250 meters, we obtained sets of facts of different sizes. Other Datalog relations extracted to describe the Vienna subway network include locatedAlong and nextStation. The former relates a subway station to the corresponding subway line, and the latter relates pairs of consecutive stations on the same line. The extracted relations that also occur in T include roles like hasCuisine and serves, which relate a Restaurant to a Cuisine or a Dish, respectively. As the TBox of our separable CKBs, we used an ontology based on the geospatial ontology found in MyITS Project [ 8 ] (currently discontinued), which was a system for smart, semantically enriched route planning over real work data sources, including OpenStreetMap (OSM) data. More specifically, we increased the expressivness of the original ontology in DL-LiteR, by adding ALCH inclusion axioms that were relevant for our use cases. We considered different separable CKBs with the same TBox, but different programs. Four of them are provided in the appendix of [ 3 ]. Each example captures the potential information need of a tourist planning to settle for a hotel. Programs P1–P4 ask for a reachable Hotel from the main station “Hauptbahnhof ”. Additionally P1–P3 ask for Hotels that are next to some LocRestaurant (a concept inferred from the ontology). P4 asks for Hotels that are in a quiet neighbourhood, achieved by using the negation of the computed relation LoudNeighbourhood. Note that P1 requires that the station close to the Hotel should be reachable with no line changes starting from “Hauptbahnhof ”, while P2 allows for at most one line change, whereas P3–P4 allow for any number of changes as long as a station is reachable (achieved via recursion). 1 https://www.openstreetmap.org 2 http://download.bbbike.org/osm/bbbike/Wien/ Fact count 145014 263075 479283 743935 1053335

P1 P2 P3 P4

For each of the mentioned programs, we included the datasets of different sizes shown in Table 1, which have up to roughly a million facts. Our approach behaved well, as can be seen from the running times shown in Table 1. The dl-program encoding for dlvhex did not scale for any of the example programs provided, and failed to return answers because of memory exhaustion even for the smallest dataset shown in Table 1. We tried to test it against a smaller yet useful set of facts with approx 13000 Datalog facts, and it still reached the time out of 600s that was set. 7

Discussion

In this paper, we have presented CKBs which is a powerful generalization of r-hybrid KBs due to Rosati. In addition to decidability and complexity results for CKBs, we have provided an implementation for a rich fragment of CKBs. The implementation is based on a reduction to reasoning in plain ASP. Our experiments show that this is a promising approach that provides a dramatic improvement over a naive implementation based on a translation into dl-programs.

Related Work. There are few other works on implementing reasoning over combinations of DL ontologies and rules. For expressive (non-Horn) DLs that go beyond the lightweight DLs of the DL-Lite and E L families, dl-programs is the richest formalism that has been implemented, in particular in the dlvhex suite. The HermiT system supports reasoning in expressive DLs enriched with positive rules under DL-safety [ 14 ]. The work in [ 17 ] enables query answering services over expressive DLs using a dataindependent translation into disjunctive Datalog. For Horn DLs, Heymans et al. show how dl-programs with external queries over Datalog-rewritable DLs can be translated into Datalog with stable negation [ 15 ]. Redl recently presented a generalization of this rewriting approach to external atoms in general HEX-programs [22], still its applicability for reasoning with DL ontologies was demonstrated only using the lightweight logic DL-Lite. An implementation of reasoning in hybrid MKNF KBs (with lightweight ontologies) under the Well-Founded Semantics is also available [ 1, 18 ]. The work in [26] shows how reasoning about DL concepts, but not general TBoxes, can be implemented in ASP.

Future work. The main task for future work is to generalize our ASP translations from separable ALCH-based CKBs to non-separable CKBs based on more expressive DLs like ALCHIO. Another challenge is to understand the data complexity of CKBs. 21. Christoph Redl. The dlvhex system for knowledge representation: recent advances (system description). TPLP, 16(5-6):866–883, 2016. 22. Christoph Redl. Efficient evaluation of answer set programs with external sources based on external source inlining. In Proc. of AAAI 2017. AAAI Press, February 2017. 23. Riccardo Rosati. On the decidability and complexity of integrating ontologies and rules. Web Semantics: Science, Services and Agents on the World Wide Web, 3(1):61 – 73, 2005. Rules Systems. 24. Riccardo Rosati. DL+log: Tight integration of description logics and disjunctive datalog. In

Proc. of KR 2006. AAAI Press, 2006. 25. Frantisˇek Simancˇ´ık, Yevgeny Kazakov, and Ian Horrocks. Consequence-based reasoning beyond horn ontologies. In Proc. of IJCAI 2011, pages 1093–1098. AAAI Press, 2011. 26. Terrance Swift. Deduction in ontologies via ASP. In Proc. of LPNMR 2004. Springer, 2004.

1. Jose´ Ju´lio Alferes, Matthias Knorr, and

Terrance

Swift . Query-driven procedures for hybrid mknf knowledge bases . ACM Trans. Comput. Logic , 14 ( 2 ): 16 : 1 - 16 : 43 , June 2013 .

Franz

Baader . The description logic handbook: theory, implementation, and applications . Cambridge university press, 2003 .

Labinot

Bajraktari , Magdalena Ortiz, and Mantas Sˇ imkus . Clopen knowledge bases: Combining description logics and answer set programming (with appendix ), 2017 . Available at http://dbai.tuwien.ac.at/staff/simkus/papers/ clopen-kbs-extended.pdf.

Vince

Ba ´ra´ny, Balder Ten Cate, and Luc Segoufin. Guarded negation . J. ACM , 62 ( 3 ): 22 : 1 - 22 : 26 , June 2015 .

Michael

Benedikt , Pierre Bourhis, Balder ten Cate, and

Gabriele

Puppis . Querying visible and invisible information . In Proc. of LICS 2016 , pages 297 - 306 . ACM, 2016 .

Meghyn

Bienvenu , Balder ten Cate, Carsten Lutz, and

Frank

Wolter . Ontology-based data access: A study through disjunctive datalog, csp, and MMSNP . ACM Trans. Database Syst ., 39 ( 4 ): 33 : 1 - 33 : 44 , 2014 .

Thomas

Eiter , Giovambattista Ianni, Thomas Lukasiewicz, Roman Schindlauer, and

Hans

Tompits . Combining answer set programming with description logics for the semantic web . Artif . Intell., 172 ( 12 -13):p. 1495 , 2008 .

Thomas

Eiter , Thomas Krennwallner, and

Patrik

Schneider . Lightweight spatial conjunctive query answering using keywords . In Proc. of ESWC 2013 . Springer, 2013 .

Thomas

Eiter , Magdalena Ortiz, and Mantas Sˇ imkus . Conjunctive query answering in the description logic SH using knots . J. Comput. Syst. Sci. , 78 ( 1 ): 47 - 85 , 2012 .

10. Thomas

Eiter

, Jeff

Pan , Patrik Schneider, Mantas

Sˇimkus

, and

Guohui

Xiao . A rule-based framework for creating instance data from OpenStreetMap . In Proc. of RR 2015 . Springer, 2015 .

11. Enrico

Franconi

, Yazmin Ange´lica Iba´n˜ez-Garc´ıa, and Inanc¸ Seylan. Query answering with DBoxes is hard . Electr. Notes Theor. Comput. Sci. , 278 : 71 - 84 , 2011 .

12. Martin

Gebser

, Benjamin Kaufmann, Roland Kaminski, Max Ostrowski, Torsten Schaub, and Marius Thomas Schneider. Potassco: The potsdam answer set solving collection . AI Commun ., 24 ( 2 ): 107 - 124 , 2011 .

13.

Michael

Gelfond and

Vladimir

Lifschitz . The stable model semantics for logic programming . In Proc. of ICLP/SLP 1988 , pages 1070 - 1080 . MIT Press, 1988 .

14. Birte

Glimm

, Ian Horrocks, Boris Motik, Giorgos Stoilos, and

Zhe

Wang . Hermit: An OWL 2 reasoner. J. Autom. Reasoning , 53 ( 3 ): 245 - 269 , 2014 .

15. Stijn

Heymans

, Thomas Eiter, and

Guohui

Xiao . Tractable reasoning with dl-programs over datalog-rewritable description logics . In Proc. of ECAI 2010 . IOS Press, 2010 .

16.

Matthew

Horridge and

Sean

Bechhofer . The OWL API: A java API for OWL ontologies . Semantic Web , 2 ( 1 ): 11 - 21 , 2011 .

17. Ullrich

Hustadt

, Boris Motik, and

Ulrike

Sattler . Reasoning in description logics by a reduction to disjunctive datalog . J. Autom. Reasoning , 39 ( 3 ): 351 - 384 , 2007 .

18. Vadim

Ivanov

, Matthias Knorr, and

Joao

Leite . A query tool for EL with non-monotonic rules . In Proc. of ISWC 2013 . Springer, 2013 .

19. Matthias

Knorr

, Jose´ Ju´lio Alferes, and

Pascal

Hitzler . Local closed world reasoning with description logics under the well-founded semantics . Artif . Intell., 175 ( 9 -10): 1528 - 1554 , 2011 .

20.

Boris

Motik and

Riccardo

Rosati . Reconciling description logics and rules . J. ACM , 57 ( 5 ), 2010 .