Introduction

Advances in Accessing Big Data with Expressive Ontologies

Ralf Moller

Christian Neuenstadt

O zgur L. O zcep

Sebastian Wandelt

1 0 Hamburg University of Technology , 21073 Hamburg , Germany 1 Humboldt-Universitat zu Berlin , 10099 Berlin , Germany

Ontology-based query answering has to be supported w.r.t. secondary memory and very expressive ontologies to meet practical requirements in some applications. Recently, advances for the expressive DL SHI have been made in the dissertation of S. Wandelt for conceptbased instance retrieval on Big Data descriptions stored in secondary memory. In this paper we extend this approach by investigating optimization algorithms for answering grounded conjunctive queries.3

Introduction

Triplestores, originally designed to store Big Data in RDF format on secondary memory with SPARQL as a query language, are currently more and more used in settings where query answering (QA) w.r.t. ontologies is bene cial. However, reasoning w.r.t. ontologies in secondary memory is provided for weakly expressive languages only (e.g., RDFS), if at all, and in some cases, query answering algorithms are known to be incomplete. For weakly expressive DL languages, such as DL-Lite, good results for sound and complete query answering w.r.t. large (virtual) Aboxes have already been achieved with OBDA based query rewriting techniques and schema speci c mapping rules [ 1 ]. However, for expressive, more powerful DLs such as ALC and beyond only rst steps have been made. Solving the problem of Accessing Big Data with Expressive Ontologies (ABDEO) is an important research goal.

A strategy to solve the ABDEO problem is to \summarize" Big Aboxes by melting individuals such that Aboxes t into main memory [ 2 ]. In some situations inconsistencies occur, and summarization individuals must be \re ned" (or unfolded) at query answering time in order to guarantee soundness and completeness, a rather expensive operation [ 3 ]. Other approaches make use of Abox modularization techniques and try to extract independent modules such that query answering is sound and complete. A rst investigation of Abox modularization for answering instance queries w.r.t. the DL SHIF is presented in [ 5 ]. 3 This work has been partially supported by the European Commission as part of the

FP7 project Optique (http://www.optique-project.eu/). However, modularization with iterative instance checks over all individuals and modules of an Abox is not su cient to ensure fast performance [ 5 ].4

The ABDEO approach presented here is based on a modularization approach developed by Wandelt [ 15,12,14 ] for really large Aboxes containing data descriptions for > 1000 universities in terms of LUBM scale measures [ 6 ], i.e., datasets in the range of billions of triples. Modules (islands) derived by Wandelt's techniques are usually small in practical applications and can be loaded into main memory such that a standard tableau prover can be used for instance checks. Iteration over all individuals gives sound and complete answers, in principle. Compared to [ 5 ], Wandelt (i) proposed extended modularization rules, (ii) implemented incremental ways of computing Abox modularizations, and (iii) investigated new ways to optimize sound and complete concept-based query answering (instance queries) with tableau-based reasoning systems for the logic SHI. In particular, \similarities" between modules are detected such that a single instance query on a representative data structure (a so-called one-step node) yields multiple results at a time, and thus, instance checks are saved (the approach is reminiscent of but di erent from [ 3 ], see below or cf. [ 12 ] for details). Due to modularization rules, one-step node query answering is sound [ 12 ] and in many (well-de ned) cases complete for eliminating candidates for a successive iterative instance checking process. In addition, to eliminate candidates, Wandelt and colleagues also investigate complete approximation techniques (see [ 15 ] for details).

In this paper we extend Wandelt's modularization based approach for query answering by investigating optimization techniques for answering grounded conjunctive queries w.r.t. SHI ontologies. Grounded conjunctive queries are more expressive from a user's point of view than instance queries. We argue that grounded conjunctive queries substantially narrow down the set of instance checking candidates if selective role atoms mentioned in queries are exploited for generating candidates for concept atoms, such that approximation techniques (to, e.g., DL-Lite) are not required in many cases. We demonstrate our ndings using the LUBM benchmark as done, e.g., in [ 9 ] and [ 6 ]. As an additional extension to Wandelt's work, which uses speci c storage layouts for storing Abox data descriptions and internal information in SQL databases, we investigate ontologybased access to existing data stores, namely triplestores, while providing query answering w.r.t. expressive ontologies. 2

Preliminaries

We assume the reader is familiar with description logic languages, ontologies (knowledge bases), inference problems, and optimized tableau-based reasoning algorithms (see, e.g., [ 11,7 ]). For the reader's convenience we de ne conjunctive queries in general, and grounded conjunctive queries in particular (adapted from [ 10 ]). In the following we use AtCon; Con; and Rol for the sets of atomic 4 Note that Abox modularization is di erent from Tbox modularization, as for instance investigated in [ 4 ]. concept descriptions, concept descriptions, and role descriptions, respectively, in the ontology.

A conjunctive query (CQ) is a rst-order query q of the form 9u: (u; v) where is a conjunction of concept atoms A(t) and role atoms R(t; t0), with A and R being concept and role names, respectively. The parameters t; t0 are variables from u or v or constants (individual names). The variables in u are the existentially quanti ed variables of q and v are the free variables, also called distinguished variables or answer variables of q. The query q is called a k-ary query i jvj = k. In a grounded conjunctive query (GCQ), u is empty. We only consider grounded conjunctive queries in this paper. We de ne an operator skel that can be applied to a CQ to compute a new CQ in which all concept atoms are dropped.

The query answering problem is de ned w.r.t. an ontology O = (T ; R; A). Let Inds(A) denote the individuals in A. For I an interpretation, q = (v) a k-ary grounded conjunctive query, and a1; : : : ; ak 2 Inds(A), we write I j= q[a1; : : : ; ak] if I satis es q (i.e., all atoms of q) with variables vi replaced by ai; 1 i k. A certain answer for a k-ary conjunctive query q and a ontology O is a tuple (a1; : : : ; ak) such that I j= q[a1; : : : ; ak] for each model I of O. We use cert(q; O) to denote the set of all certain answers for q and O. This de nes the query answering problem. Given a SHI ontology O and a GCQ q, compute cert(q; O). It should be noted that \tree-shaped" conjunctive queries can be transformed into grounded conjunctive queries, possibly with additional axioms in the Tbox [ 8 ]. The restriction to grounded conjunctive queries is not too severe in many practical applications.

Grounded conjunctive query answering can be implemented in a naive way by computing the certain answers for each atom and doing a join afterwards. Certain answers for concept atoms can be computed by iterating over Inds(A) with separate instance checks for each individual. Rather than performing an instance check on the whole Abox, which is too large to t in main memory in many application scenarios, the goal is to do an instance check on a module such that results are sound and complete. More formally, given an input individual a, the proposal is to compute a set of Abox assertions Aisl (a subset of the source Abox A), such that for all atomic (!) concept descriptions A, we have hT; R; Ai A(a) i hT; R; Aisli A(a). 3

Speeding Up Instance Retrieval

In order to de ne subsets of an Abox relevant for reasoning over an individual a, we de ne an operation which splits up role assertions in such a way that we can apply graph component-based modularization techniques over the outcome of the split.

De nition 1 (Abox Split). Given { a role description R, { two distinct named individuals a and b, { Else { two distinct fresh individuals c and d, and, { an Abox A,

R(a;b): SA ! SA, de ned as follows ( SA is the an Abox split is a function #c;d set of Aboxes and A 2 SA): { If R(a; b) 2 A and fc; dg * Ind(A), then

R(a;b) (A) =A n fR(a; b)g [ fR(a; d); R(c; b)g [ #c;d fC(c) j C(a) 2 Ag [ fC(d) j C(b) 2 Ag

R(a;b) (A) = A: #c;d

In the following we assume that the Tbox is transformed into a normal form such that all axioms are \internalized" (i.e., on the lefthand side of a GCI there is only > mentioned. For a formal de nition of the normal form of a Tbox, see [ 12 ]. Here we use an example to illustrate the idea.

Example 1 (Example for an Extended 8-info Structure). Let then the TBox in normal form is

TEx1 = fChair v 8headOf:Department; 9memberOf:> v P erson;

GraduateStudent v Studentg;

REx1 = fheadOf v memberOfg; TEx1norm = f> v :Chair t 8headOf:Department; > v 8memberOf:? t P erson; > v :GraduateStudent t Studentg and the extended 8-info structure for TEx1norm and REx1 is: extinfo 8T;R (R) =

f?g > :; 8>fDepartment; ?g if R = headOf; < if R = memberOf; otherwise:

De nition 2 (Extended 8-info Structure). Given a TBox T in normal form

and an Rbox R, an extended 8-info structure for T and R is a function extinfo 8T;R : Rol ! }(Con), such that we have C 2 extinfo 8T;R (R) if and only if there exists a role R2 2 Rol, such that R R v R2 and 8R2:C 2 clos (T), where clos (T) denotes the set of all concept descriptions mentioned in T.

Now we are ready to de ne a data structure that allows us to check which concept descriptions are (worst-case) \propagated" over role assertions in SHIontologies. If nothing is \propagated" that is not already stated in corresponding Abox assertions, a role assertion is called splittable. This is formalized in the following de nition.

De nition 3 (SHI-splittability of Role Assertions). Given a SHI-ontology

O = hT; R; Ai and a role assertion R(a; b), we say that R(a; b) is SHI-splittable with respect to O i 1. there exists no transitive role R2 with respect to R, such that R 2. for each C 2 extinfo 8T;R (R) { C = ? or { there is a C2(b) 2 A and T C2 v C or { there is a C2(b) 2 A and T C u C2 v ? and 3. for each C 2 extinfo 8T;R (R ) { C = ? or { there is a C2(a) 2 A and T C2 v C or { there is a C2(a) 2 A and T C u C2 v ?.

R v R2,

So far, we have introduced approaches to modularization of the assertional part of an ontology. In the following, we use these modularization techniques to de ne structures for e cient reasoning over ontologies.

We formally de ne a subset of assertions, called an individual island, which is worst-case necessary, i.e. possibly contains more assertions than really necessary for sound and complete instance checking. Informally speaking, we take the graph view of an Abox and, starting from a given individual, follow all role assertions in the graph until we reach a SHI-splittable role assertion. We show that this strategy is su cient for entailment of atomic concepts. The formal foundations for these subsets of assertions have been set up before, where we show that, under some conditions, role assertions can be broken up while preserving soundness and completeness of instance checking algorithms. First, in De nition 4, we formally de ne an individual island candidate with an arbitrary subset of the original Abox. The concrete computation of the subset is then further de ned below.

De nition 4 (Individual Island Candidate).

Given an ontology O = hT; R; Ai and a named individual a 2 Ind(A), an individual island candidate, is a tuple ISLa = hT; R; Aisl; ai, such that Aisl A. Given an individual island candidate ISLa = hT; R; Aisl; ai and an interpretation I, we say that I is a model of ISLa , denoted I ISLa , if I hT; R; Aisli. Given an individual island candidate ISLa = hT; R; Aisl; ai, we say that ISLa entails a concept assertion C(a), denoted hT; R; Aisl; ai C(a), if for all interpretations I, we have I ISLa =) I C(a). We say that ISLa entails a role assertion R(a1; a2), denoted hT; R; Aisl; ai R(a1; a2), if for all interpretations I, we have I ISLa =) I R(a1; a2).

Please note that entailment of concept and role assertions can be directly reformulated as a decision problem over ontologies, i.e., we have hT; R; Aisl; ai C(a) () hT; R; Aisli C(a). In order to evaluate the quality of an individual island candidate, we de ne soundness and completeness criteria for individual island candidates.

De nition 5 (Soundness and Completeness for Island Candidates).

Given an ontology O = hT; R; Ai and an individual island candidate ISLa = hT; R; Aisl; ai, we say that ISLa is sound for instance checking in ontology O if for all atomic concept descriptions C 2 AtCon, ISLa C(a) =) hT; R; Ai C(a). ISLa is complete for instance checking in ontology O if for all atomic concept descriptions C 2 AtCon, hT; R; Ai C(a) =) ISLa C(a).

We say that ISLa is sound for relation checking in ontology O if for all role descriptions R 2 Rol and all individuals a2 2 Inds(A) { ISLa { ISLa

R(a; a2) =) hT; R; Ai R(a2; a) =) hT; R; Ai

R(a; a2) and

R(a2; a).

ISLa is complete for relation checking in ontology O if for all role descriptions R 2 Rol and all individuals a2 2 Inds(A) { hT; R; Ai { hT; R; Ai

R(a; a2) =) ISLa R(a2; a) =) ISLa

R(a; a2) and

R(a2; a).

We say that ISLa is sound for reasoning in ontology O if ISLa is sound for instance and relation checking in O. We say that ISLa is complete for reasoning in O if ISLa is complete for instance and relation checking in O.

De nition 6 (Individual Island).

Given an individual island candidate ISLa = hT; R; Aisl; ai for an ontology O = hT; R; Ai, ISLa is called individual island for O if ISLa is sound and complete for reasoning in O.

An individual island candidate becomes an individual island if it can be used for sound and complete reasoning. It is easy to see that each individual island candidate is sound for reasoning since it contains a subset of the original Abox assertions.

In Fig. 1, we de ne an algorithm which computes an individual island starting from a given named individual a. The set agenda manages the individuals which have to be visited. The set seen collects already visited individuals. Individuals are visited if they are connected by a chain of SHI-unsplittable role assertions to a. We add the role assertions of all visited individuals and all concept assertions for visited individuals and their direct neighbors.

Theorem 1 shows that the computed set of assertions is indeed su cient for complete reasoning.

Theorem 1 (Island Computation yields Individual Islands for Ontolo

gies). Given an ontology O = hT; R; Ai and an individual a 2 Inds(A), the algorithm in Fig. 1 computes an individual island ISLa = hT; R; Aisl; ai for a. Input: Ontology O = hT; R; Ai, individual a 2 Inds(A) Output: Individual island ISLa = hT; R; Aisl; ai Algorithm:

Let agenda = fag Let seen = ; Let Aisl = ; While agenda 6= ; do

Remove a1 from agenda Add a1 to seen Let Aisl = Aisl [ fC(a1) j C(a1) 2 Ag For each R(a1; a2) 2 A

Aisl = Aisl [ fR(a1; a2) 2 Ag If R(a1; a2) 2 A is SHI-splittable with respect to O then

Aisl = Aisl [ fC(a2) j C(a2) 2 Ag else agenda = agenda [ (fa2g n seen) For each R(a2; a1) 2 A

Aisl = Aisl [ fR(a2; a1) 2 Ag If R(a2; a1) 2 A is SHI-splittable with respect to O then

Aisl = Aisl [ fC(a2) j C(a2) 2 Ag else agenda = agenda [ (fa2g n seen)

The proof is given in [ 12 ].

For each individual there is an associated individual island, and Abox consistency can be checked by considering each island in turn (islands can be loaded into main memory on the y). Individual islands can be used for sound and complete instance checks, and iterating over all individuals gives a sound and complete (albeit still ine cient) instance retrieval procedure for very large Aboxes.

De nition 7 (Pseudo Node Successor). Given an Abox A, a pseudo node

successor of an individual a 2 Inds(A) is a pair pnsa;A = hrs; csi, such that there is an a2 2 Ind(A) with 1. 8R 2 rs:(R(a; a2) 2 A _ R (a2; a) 2 A), 2. 8C 2 cs:C(a2) 2 A, and 3. rs and cs are maximal.

De nition 8 (One-Step Node).

Given O = hT; R; Ai and an individual a 2 Inds(A), the one-step node of a for A, denoted osna;A , is a tuple osna;A = hrootconset; re set; pnsseti, such that rootconset = fCjC(a) 2 Ag, re set = fRjR(a; a) 2 A _ R (a; a) 2 Ag, and pnsset is the set of all pseudo node successors of individual a.

It should be obvious that for realistic datasets, multiple individuals in an Abox will be mapped to a single one-step node data structure. We associate the corresponding individuals with their one-step node. In addition, it is clear that one-step nodes can be mapped back to Aboxes. The obvious mapping function is called Abox. If Abox(osna;A ) j= C(a) for a query concept C (a named concept), all associated individuals of osna;A are instances of C. It is also clear that not every one-step node is complete for determining whether a is not an instance of C. This is the case only if one-step nodes \coincide" with the islands derived for the associated individuals (splittable one-step nodes). Wandelt found that for LUBM in many cases islands are very small, and one-step nodes are indeed complete in the sense that if Abox(osna;A ) 6j= C(a) then A 6j= C(a) (for details see [ 12 ]). In the following we assume that for instance retrieval, it is possible to specify a subset of Abox individuals as a set of possible candidates. If the set of candidates is small, with some candidates possibly eliminated by one-step nodes, then iterative instance checks give us a feasible instance retrieval algorithm in practice. 4

Answering Grounded Conjunctive Queries

In this section we will shortly describe an implementation of the introduced techniques with a triplestore database. As other groups we use the Lehigh University Benchmark or LUBM [ 6 ] for evaluating algorithms and data structures. This benchmark is an ontology system designed to test large ontologies with respect to OWL applications. With the LUBM generator, the user can generate n universities each consisting of a random number of departments and individuals. As the number of individuals and the number of assertions increases nearly linear with the number of universities, LUBM is an instrument to test the performance for query answering machines, especially for grounded conjunctive queries in a scalable Abox environment. If a system cannot handle LUBM with, say, a billion triples, it cannot deal with more complex scenarios occurring in future applications, either. The code we used in this paper for evaluating the optimization techiques is written in Java and can be downloaded at http://www.sts.tu-harburg.de/people/c.neuenstadt/. We store data in the triplestore AllegroGraph, which provides access to role instances (triples) w.r.t. RDFS plus transitive roles, i.e., role hierarchies and transitive roles are handled by AllegroGraph. Alternatively one could use materialization or query expansion in the OBDA style for role hierarchies. SPARQL is used as a query language for specifying queries to be executed on a particular triplestore database. 4.1

Setting up an AllegroGraph Triplestore

AllegroGraph is run as a server program. In our setting, data is loaded directly into the server, whereas islands as well as one-step nodes are computed by a remote program run on a client computer (we cannot extend the server program easily). In a rst step the whole data system has to be set up before we can start query answering. During the setup process, the communication between client and server system consists basically of sending SPARQL queries for data access required in the algorithm shown in Fig. 1 as well as sending AllegroGraph statements for adding additional triples (or changing existing ones) for storing islands and one-step nodes. Islands are indicated using the subgraph components of AllegroGraph triples (actually, quintuples).

The \similarity" of one-step nodes is deifned using hashvalues with a su cient bitlength. We rst compute a set from each island, compute a hashvalue for it, and store it together with the speci c island in the triplestore. Identical hash values allow one to refer to \similar" one-step nodes (with additional checks applied to the collision list as usual for hashing).

Given the concept description C from the query and the named individual a from the tuple, we load the speci c one-step node for a from the database and determine whether osna entails C(a). Depending on the outcome, three states are possible: { Osna entails C(a), then a is actually an instance of C. { Osna entails :C(a) or does not entail C(a) and is splittable, then a is actually not an instance of C. { Osna is not splittable, then the client has to load and check the entire island associated with a to nd out whether a actually is an instance of C.

Candidates for concept atoms are determined in our experiments by rst doing a retrieval for a query q by executing skel(q) (see above for a de nition). Bindings for variables in skel(q) de ne the candidates for retrieval with concept atoms. By eliminating all skeleton query result tuples that include individuals which do not belong to corresponding concept assertions used in the query, nally all remaining tuples are correct answers to the original conjunctive query.

Wandelt has already investigated the e ciency of Abox modularization techniques for an SQL database server. Here, instead, we work directly on an existing AllegroGraph triplesotore, convert the large ontology step by step into small chunks and compare the generated modules with the local modularization of Sebastian Wandelt on the SQL server [ 12 ]. The processing time for one university is about 5000 seconds on AllegroGraph server, where it is like one minute for Wandelt's approach. The modularization of one university takes nearly 200,000 queries. The decrease in performance is based on the huge number of SPARQL mini queries between server and the remote modularization client in the prototype implementation. Thus, only 5 universities are investigated for query answering experiments. 4.2

Evaluating Conjunctive Queries

For evaluating grounded conjunctive queries, LUBM provides 14 prede ned test queries, which check several database criteria. We run the tests with an ontology, which uses the description logic language SHI. LUBM queries di er in the amount of input, selectivity and reasoning behaviour for example by relying on inverse roles, role hierarchy, or transitivity inferences.5 Selectivity basically 5 In the LUBM dataset, explict assertions about subrelations of degreeFrom are made (e.g., doctoralDegreeFrom). The relation degreeFrom is declared as an inverse to means that the grounded conjunctive queries being considered have role atoms with large result sets to be reduced by atom queries, which we call less selective (proportion of the instances involved in the skeleton are high compared to those that actually satisfy the query criteria), or automatically excludes a lot of individuals, what we call a highly selective query. Rather than doing a join on the result sets of all atoms in a grounded conjunctive query, role atoms de ne candidates for concept atoms. Thus for selective queries, candidate sets for concept atoms are smaller. This reduces the number of instance checks that remain if, e.g., one-step node optimizations are not applicable (see above).

The result indicates that the less selective a query is w.r.t. role atoms, the more instance checks we need afterwards, and the more time consuming retrieval is (see Figure 2). Nevertheless, most of the LUBM queries are handled fast, even with the simple implementation for concept atoms with repetitive instance checks. Performance for Query 8 will be increased with an implementation of full one-step node retrieval (with multiple individuals returned at a time, see above). Queries 6 and 14 contain only concept atoms and are not tested here.

To demonstrate that our skeleton query candidate generator is able to significantly improve the results for queries with even low selectivity, we compare the hasAlumnus. Thus, although, e.g., Query 13 contains a reference to University0 (asking for llers of hasAlumnus), adding new universities with degreeFrom tuples with University0 on the righthand side causes the cardinality of the set of llers for hasAlumnus w.r.t. University0 to increase, i.e., having a constant in the query does not mean the result set to be independent of the number of universities. approach of skeleton queries with the naive approach without skeleton queries in Figure 3. One can directly see the huge performance gain of the skeleton query even for less selective queries. We avoid lots of instance checks and can therefore decrease the answering time by orders of magnitude in many cases. 5

Conclusion

In this work we extended the Abox modularization strategies of Wandelt and colleagues to the e cient use of grounded conjunctive queries on triplestore servers. Results obtained with the techniques discussed in this paper are sound and complete. Note that query engines investigated in [ 6 ] are incomplete.

Our prototype needs linear time to add information to the triplestore in a setup phase. Therefore we were not able to run queries on billions of triples. We conclude that island computation needs to be built into the triplestore software iteself and cannot be done from a remote client.

In the average case, the size of the individual island (with respect to the number of assertion in its Abox) is considerably smaller than the original Abox. In our experiments the size is usually orders of magnitudes smaller. Please note that these modularization techniques allow traditional description logic reasoning systems to deal with ontologies which they cannot handle without modularizations (because the data or the computed model abstraction does not t into main memory).

In addition, the evaluation of the prototype showed how grounded conjuctive queries on triplestore servers w.r.t. expressive ontologies (SHI) can be implemented using only a small size of main memory. The main strategy is to use a skeleton query and try to keep the necessary amount of instance checks in the second step as small as possible. If the number of results for less selective skeleton queries gets larger, the number of instance checks increases rapidly. In some cases it would obviously have been better to reduce the set of possible tuples by considering concept atoms rst. This observation has also been made in [ 9 ] and, much earlier, in [ 16 ] where more elaborate query plan generation techniques are investigated, albeit for main memory systems.

We would like to emphasize that the proposed optimizations can be used for parallel reasoning over ontologies [ 13 ]. This will be further investigated in future work such that ABDEO will become possible for practically relevant datasets and ontologies that are more demanding than LUBM.

Calvanese , G. De Giacomo,

Lembo ,

Lenzerini ,

Poggi , M.

RodriguezMuro, and

Rosati . Ontologies and databases: The DL-Lite approach . Reasoning Web. Semantic Technologies for Information Systems , pages 255 { 356 , 2009 .

Dolby ,

Fokoue ,

Kalyanpur ,

Kershenbaum ,

Schonberg ,

Srinivas , and

Ma . Scalable semantic retrieval through summarization and re nement . In Proceedings of the National Conference on Arti cial Intelligence , page 299. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999 , 2007 .

Julian

Dolby , Achille Fokoue, Aditya Kalyanpur, Edith Schonberg, and

Kavitha

Srinivas . Scalable highly expressive reasoner (SHER) . J. Web Sem ., 7 ( 4 ): 357 { 361 , 2009 .

B.C.

Grau ,

Parsia , E. Sirin, and

Kalyanpur . Modularity and web ontologies . proc. KR , 2006 , 2006 .

Guo and J. He in. A scalable approach for partitioning owl knowledge bases . In Proc. of the 2nd International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS2006) , 2006 .

Guo ,

Pan , and J. He in. LUBM: A benchmark for OWL knowledge base systems . Web Semantics: Science, Services and Agents on the World Wide Web , 3 ( 2 ): 158 { 182 , 2005 .

Haarslev and

Mo ller. On the scalability of description logic instance retrieval . Journal of Automated Reasoning , 41 ( 2 ): 99 { 142 , 2008 .

Ian

Horrocks and

Sergio

Tessaris . A conjunctive query language for description logic ABoxes . In Proc. of the 17th Nat. Conf. on Arti cial Intelligence (AAAI 2000 ), pages 399 { 404 , 2000 .

Ilianna

Kollia , Birte Glimm, and

Ian

Horrocks . SPARQL query answering over OWL ontologies . In Proc. of the 8th European Semantic Web Conf. (ESWC 2011 ), Lecture Notes in Computer Science, pages 382 { 396 . Springer, 2011 .

10. Carsten

Lutz

, David Toman,

and Frank

Wolter . Conjunctive query answering in the description logic EL using a relational database system . In Craig Boutilier, editor, IJCAI , pages 2070 { 2075 , 2009 .

11. R. Moller and

Haarslev . Tableaux-based reasoning . In S. Staab and R. Studer, editors, Handbook of Ontologies , pages 509 { 528 . Springer, 2009 .

12.

Wandelt . E cient instance retrieval over semi-expressive ontologies . PhD thesis , Hamburg University of Technology, 2011 .

13.

Sebastian

Wandelt and Ralf Moller. Distributed island-based query answering for expressive ontologies . In Paolo Bellavista, Ruay-Shiung Chang , Han-Chieh

Chao

, Shin-Feng Lin , and Peter M. A . Sloot, editors, Advances in Grid and Pervasive Computing , 5th International Conference, GPC 2010, Hualien, Taiwan, May 10 - 13, 2010 . Proceedings, volume 6104 of Lecture Notes in Computer Science, pages 461 { 470 . Springer, 2010 .

14.

Sebastian

Wandelt and Ralf Moller. Towards Abox modularization of semiexpressive description logics . Journal of Applied Ontology , 7 ( 2 ): 133 { 167 , 2012 .

15. Sebastian

Wandelt

, Ralf Moller, and

Michael

Wessel . Towards scalable instance retrieval over ontologies . Journal of Software and Informatics , 2010 .

16.

Michael

Wessel . Flexible und kon gurierbare Software-Architekturen fr datenintensive ontologiebasierte Informationssysteme . PhD thesis , Technische Universitat Hamburg-Harburg, Hamburg, Germany, 2009 . ISBN 978-3- 8325 -2312-1.