Introduction

Mariano Rodríguez-Muro¹, Roman Kontchakov² and Michael Zakharyaschev²

¹ Faculty of Computer Science, Free University of Bozen-Bolzano, Italy

² Department of Computer Science and Information Systems, Birkbeck, University of London, U.K.

Abstract. We describe the architecture of the OBDA system Ontop and analyse its performance in a series of experiments. We demonstrate that, for standard ontologies, queries and data stored in relational databases, Ontop is fast, efficient and produces SQL rewritings of high quality.

In this paper, we report on a series of experiments designed to test the performance of the ontology-based data access (OBDA) system Ontop¹ implemented at the Free University of Bozen-Bolzano. Our main concern was the quality of the query rewritings produced automatically by Ontop when given some standard queries, ontologies, databases and mappings from the database schemas to the ontologies.

Recall [4] that, in the OBDA paradigm, an ontology defines a high-level global schema of (already existing) data sources and provides a vocabulary for user queries. An OBDA system rewrites such queries into the vocabulary of the data sources and then delegates query evaluation to a relational database management system (RDBMS). The existing query rewriting systems include QuOnto [19], Nyaya [9], Rapid [7], Requiem [17]/Blackout [18], Clipper [8], Prexto [22] and the system of [14] (some of which use datalog engines rather than RDBMSs).

To illustrate how an OBDA system works, we take a simplified IMDb database (www.imdb.com/interfaces), whose schema contains relations $\textit{title}[m, t, y]$ with information about movies (ID, title, production year), and $\textit{castinfo}[p, m, r]$ with information about movie casts (person ID, movie ID, person role). The users are not supposed to know the structure of the database. Instead, they are given an ontology, say MO (www.movieontology.org), describing the application domain in terms of concepts (classes), such as mo:Movie and mo:Person, and roles and attributes (object and datatype properties), such as mo:cast and mo:year:


$$\textit{mo:Movie} \equiv \exists\textit{mo:title}, \qquad \textit{mo:Movie} \sqsubseteq \exists\textit{mo:year},$$
$$\textit{mo:Movie} \equiv \exists\textit{mo:cast}, \qquad \exists\textit{mo:cast}^- \sqsubseteq \textit{mo:Person}$$

(we use the description logic parlance of OWL 2 QL). The user can query the data in terms of concepts and roles of the ontology; for example,

$$q(t, y) \leftarrow \textit{mo:Movie}(m),\ \textit{mo:title}(m, t),\ \textit{mo:year}(m, y),\ (y > 2010)$$

is a query asking for the titles of recent movies with their production year. To rewrite it to an SQL query over the data source, the OBDA system requires a mapping that relates the ontology terms to the database schema; for example:

$$\textit{mo:Movie}(m),\ \textit{mo:title}(m, t),\ \textit{mo:year}(m, y) \leftarrow \textit{title}(m, t, y),$$
$$\textit{mo:cast}(m, p),\ \textit{mo:Person}(p) \leftarrow \textit{castinfo}(p, m, r).$$

By evaluating this mapping over a data instance with, say,

    title:     m    t                   y
               728  'Django Unchained'  2012

    castinfo:  p    m
               n37  728
               n38  728

we obtain the ground atoms mo:Movie(728), mo:title(728, 'Django Unchained'), mo:year(728, 2012), mo:cast(728, n37), mo:Person(n37), mo:cast(728, n38) and mo:Person(n38), which can be thought of as the ABox over which we can execute the query $q(t, y)$, taking account of the consequences implied by the MO ontology. Such an ABox is not materialised and is called virtual [21].

¹ http://ontop.inf.unibz.it
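The construction of the virtual ABox from the mapping can be illustrated by a small Python sketch. It is not how Ontop works internally (the ABox is never materialised); the function name `virtual_abox`, the tuple encoding of atoms and the role value `1` in the castinfo rows are assumptions made only for this illustration.

```python
# Toy data instance for the simplified IMDb schema of the running example.
title = [(728, "Django Unchained", 2012)]       # title[m, t, y]
castinfo = [("n37", 728, 1), ("n38", 728, 1)]   # castinfo[p, m, r]; r = 1 is an assumed value


def virtual_abox(title_rows, castinfo_rows):
    """Evaluate the two mapping rules of the running example.

    Each ground atom is encoded as a tuple (predicate, arg1[, arg2]).
    """
    abox = set()
    # mo:Movie(m), mo:title(m, t), mo:year(m, y) <- title(m, t, y)
    for m, t, y in title_rows:
        abox.add(("mo:Movie", m))
        abox.add(("mo:title", m, t))
        abox.add(("mo:year", m, y))
    # mo:cast(m, p), mo:Person(p) <- castinfo(p, m, r)
    for p, m, _r in castinfo_rows:
        abox.add(("mo:cast", m, p))
        abox.add(("mo:Person", p))
    return abox


abox = virtual_abox(title, castinfo)
for atom in sorted(abox, key=str):
    print(atom)
```

An OBDA system obtains the same answers without ever building this set: the mapping is used to unfold the rewritten query into SQL instead.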

Thus, the OBDA system is facing three tasks: it has to (i) rewrite the original query to a query over the virtual ABox, (ii) unfold the rewriting, using the mapping, into an SQL query, and then (iii) evaluate it over the data instance using an RDBMS. The idea of OBDA stems from the empirical fact that answering conjunctive queries (CQs) in RDBMSs is very efficient in practice. So one can expect task (iii) to be smooth provided that the rewriting (i) and unfolding (ii) are reasonably small and standard.

However, the available experimental data (see, e.g., [20, 3]) as well as the recent complexity-theoretic analysis of rewritings show that they can be prohibitively large or complex. First, there exist CQs and ontologies for which any (first-order or datalog) rewriting results in an exponential blowup [12]; the polynomial datalog rewriting of [10] hides this blowup behind the existential quantification over special constants. Second, even for simple and natural ontologies and CQs, rewritings (i) become exponential when presented as unions of CQs (UCQs), the form most suitable for RDBMSs, because they must include all sub-concepts/roles of each atom in the query induced by the ontology.
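The second source of blowup can be made concrete with a short Python sketch. It only counts the CQs obtained by replacing every atom with each of its sub-concepts/roles and ignores everything else a real rewriter does; the function name `ucq_expansion` and the predicates `A1`, `B1`, `C1`, ... in the second call are hypothetical. The first call assumes, as in the running example, that mo:Movie is implied by mo:Movie itself and by the domains of mo:title and mo:cast.

```python
from itertools import product


def ucq_expansion(query_predicates, subsumees):
    """List the CQs (as tuples of predicates) obtained by replacing each
    query predicate with every predicate it subsumes, itself included."""
    choices = [subsumees.get(p, [p]) for p in query_predicates]
    return list(product(*choices))


# Running example: only mo:Movie has proper sub-concepts.
running = ucq_expansion(
    ["mo:Movie", "mo:title", "mo:year"],
    {"mo:Movie": ["mo:Movie", "exists mo:title", "exists mo:cast"]},
)

# Hypothetical query with four atoms, each having three subsumees.
hypothetical = ucq_expansion(
    ["A1", "A2", "A3", "A4"],
    {f"A{i}": [f"A{i}", f"B{i}", f"C{i}"] for i in range(1, 5)},
)

print(len(running), len(hypothetical))
```

With k subsumees per atom and n atoms the union has k^n members, which is why a few more atoms quickly lead to thousands of CQs.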

In Ontop, this bottleneck is tackled by making use of

– the tree-witness rewriting [13] that separates the topology of the CQ from the taxonomy defined by the ontology;

– an extended mapping (called a $\mathcal{T}$-mapping [21]) that takes account of the taxonomy and can be optimised using database integrity constraints and SQL features;

– an unfolding algorithm that employs the semantic query optimisation technique with database integrity constraints to produce small and efficient SQL queries.

For example, a rewriting of the query $q(t, y)$ above can be split into the CQ

$$q'(t, y) \leftarrow \textit{ext:Movie}(m),\ \textit{mo:title}(m, t),\ \textit{mo:year}(m, y),\ (y > 2010)$$

and the datalog rules for the ext:Movie predicate:

$$\textit{ext:Movie}(m) \leftarrow \textit{mo:Movie}(m), \tag{1}$$
$$\textit{ext:Movie}(m) \leftarrow \textit{mo:cast}(m, p), \tag{2}$$
$$\textit{ext:Movie}(m) \leftarrow \textit{mo:title}(m, t). \tag{3}$$

The former inherits the topology of the original CQ, while the latter represents the taxonomy defined by the ontology. In theory, the topological part can contain exponentially many rules (reflecting possible matches in the canonical models) [12], but this never happens in practice, and usually there are very few of them (see the experiments below). The taxonomical component is independent from the CQ and combines with the mapping into a $\mathcal{T}$-mapping [21], which can then be drastically simplified using the database integrity constraints. For example, since castinfo has a foreign key (its movie ID attribute references ID in title), every virtual ABox of IMDb will satisfy the axiom $\exists\textit{mo:cast} \sqsubseteq \textit{mo:Movie}$, making (2) redundant; moreover, (3) and (1) give rise to the same rule, resulting in a $\mathcal{T}$-mapping with a single rule for mo:Movie. Thus, a rewriting over IMDb ABoxes will be a single CQ. In contrast, any UCQ rewriting over arbitrary ABoxes contains three CQs, which simply duplicate the answers because the data respects the integrity constraints (a query with a few more atoms may give rise to a UCQ rewriting with thousands of CQs).

By straightforwardly applying the unfolding algorithm to $q'$ and the $\mathcal{T}$-mapping $\mathcal{M}$ above, we obtain the query

$$q''_0(t, y) \leftarrow \textit{title}(m, t_0, y_0),\ \textit{title}(m, t, y_1),\ \textit{title}(m, t_2, y),\ (y > 2010),$$

which requires two (potentially) expensive Join operations. However, if we use the fact that the ID attribute is a primary key of title (uniquely defining the title and production year), then $q'$ can be unfolded into a much simpler

$$q''(t, y) \leftarrow \textit{title}(m, t, y),\ (y > 2010).$$

In fact, such multiple Joins are very typical in OBDA because n-ary relations of data sources are reified by ontologies into binary roles and attributes.

The aim of this paper is to (i) present the rewriting and optimisation techniques that allow Ontop to produce optimised queries as discussed above, and (ii) evaluate the performance of Ontop using three use cases. We demonstrate that, at least in these cases, Ontop produces query rewritings of reasonably high quality and its performance is comparable to that of traditional RDBMSs.

2 OWL 2 QL and Databases

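The language of OWL 2 QL contains individual names $a_i$, concept names $A_i$, and role names $P_i$ ($i \ge 1$). Roles $R$ and basic concepts $B$ are defined by

$$R ::= P_i \mid P_i^-, \qquad B ::= \bot \mid A_i \mid \exists R.$$

A TBox (or an ontology), $\mathcal{T}$, is a finite set of inclusions of the form

$$B_1 \sqsubseteq B_2, \qquad B_1 \sqsubseteq \exists R.B_2, \qquad B_1 \sqcap B_2 \sqsubseteq \bot, \qquad R_1 \sqsubseteq R_2, \qquad R_1 \sqcap R_2 \sqsubseteq \bot.$$

An ABox, $\mathcal{A}$, is a set of atoms of the form $A_k(a_i)$ or $P_k(a_i, a_j)$. The semantics for OWL 2 QL is defined in the usual way based on interpretations $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ [2]. The set of individual names in $\mathcal{A}$ is denoted by $\mathrm{ind}(\mathcal{A})$.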

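A conjunctive query $q(\mathbf{x})$ is a first-order formula $\exists \mathbf{y}\, \varphi(\mathbf{x}, \mathbf{y})$, where $\varphi$ is a conjunction of atoms of the form $A_k(t_1)$ or $P_k(t_1, t_2)$, and each $t_i$ is a term (an individual or a variable in $\mathbf{x}$ or $\mathbf{y}$). We use the datalog notation for CQs, writing $q(\mathbf{x}) \leftarrow \varphi(\mathbf{x}, \mathbf{y})$ (without existential quantifiers), and call $q$ the head and $\varphi$ the body of the rule. A tuple $\mathbf{a} \subseteq \mathrm{ind}(\mathcal{A})$ is a certain answer to $q(\mathbf{x})$ over $(\mathcal{T}, \mathcal{A})$ if $\mathcal{I} \models q(\mathbf{a})$ for all models $\mathcal{I}$ of $(\mathcal{T}, \mathcal{A})$; in this case we write $(\mathcal{T}, \mathcal{A}) \models q(\mathbf{a})$.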

We assume that the data comes from a relational database rather than an ABox. We view databases [1] as triples $(\mathcal{R}, \Sigma, I)$, where $\mathcal{R}$ is a database schema, containing predicate symbols for both stored database relations and views (together with their definitions in terms of stored relations), $\Sigma$ is a set of integrity constraints over $\mathcal{R}$ (in the form of inclusion and functional dependencies), and $I$ is a data instance over $\mathcal{R}$ (satisfying $\Sigma$). The vocabularies of $\mathcal{R}$ and $\mathcal{T}$ are linked together by means of mappings. A mapping, $\mathcal{M}$, from $\mathcal{R}$ to $\mathcal{T}$ is a set of (GAV) rules of the form

$$S(\mathbf{x}) \leftarrow \varphi(\mathbf{x}, \mathbf{z}),$$

where $S$ is a concept or role name in $\mathcal{T}$ and $\varphi(\mathbf{x}, \mathbf{z})$ is a conjunction of atoms with stored relations and views from $\mathcal{R}$ and a filter, that is, a Boolean combination of built-in predicates such as $=$ and $<$. (Note that, by including views in the schema, we can express any SQL query in mappings.) Given a mapping $\mathcal{M}$, the atoms $S(\mathbf{a})$, for $S(\mathbf{x}) \leftarrow \varphi(\mathbf{x}, \mathbf{z})$ in $\mathcal{M}$ and $I \models \exists \mathbf{z}\, \varphi(\mathbf{a}, \mathbf{z})$, comprise the ABox, $\mathcal{A}_{I,\mathcal{M}}$, which is called the virtual ABox for $\mathcal{M}$ over $I$. We can now define certain answers to a CQ $q$ over a TBox $\mathcal{T}$ linked by a mapping $\mathcal{M}$ to a database $(\mathcal{R}, \Sigma, I)$ as certain answers to $q$ over $(\mathcal{T}, \mathcal{A}_{I,\mathcal{M}})$.

3 The Architecture of Ontop

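We now briefly describe the main ingredients of Ontop: the tree-witness rewriting over complete ABoxes, $\mathcal{T}$-mappings and the unfolding algorithm. Suppose we are given a CQ $q$ over an ontology $\mathcal{T}$ and a mapping $\mathcal{M}$ from a database schema $\mathcal{R}$ to $\mathcal{T}$. The tree-witness rewriting of $q$ and $\mathcal{T}$, denoted $q_{tw}$, presupposes that the underlying ABox $\mathcal{A}$ is H-complete with respect to $\mathcal{T}$ in the sense that

$$S(\mathbf{a}) \in \mathcal{A} \quad \text{whenever} \quad S'(\mathbf{a}) \in \mathcal{A} \ \text{and}\ \mathcal{T} \models S' \sqsubseteq S,$$

for all concept names $S$ and basic concepts $S'$ and for all role names $S$ and roles $S'$ (we identify $P^-(b, a)$ and $P(a, b)$ in ABoxes and assume $\exists R(a) \in \mathcal{A}$ if $R(a, b) \in \mathcal{A}$, for some $b$). An obvious way to define H-complete ABoxes is to take the composition $\mathcal{M}^{\mathcal{T}}$ of $\mathcal{M}$ and the inclusions in $\mathcal{T}$ given by

$$A(x) \leftarrow \varphi(x, \mathbf{z}) \quad \text{if } A'(x) \leftarrow \varphi(x, \mathbf{z}) \in \mathcal{M} \text{ and } \mathcal{T} \models A' \sqsubseteq A,$$
$$A(x) \leftarrow \varphi(x, y, \mathbf{z}) \quad \text{if } R(x, y) \leftarrow \varphi(x, y, \mathbf{z}) \in \mathcal{M} \text{ and } \mathcal{T} \models \exists R \sqsubseteq A,$$
$$P(x, y) \leftarrow \varphi(x, y, \mathbf{z}) \quad \text{if } R(x, y) \leftarrow \varphi(x, y, \mathbf{z}) \in \mathcal{M} \text{ and } \mathcal{T} \models R \sqsubseteq P.$$

(We identify $P^-(y, x)$ with $P(x, y)$ in the heads of the mapping rules.) Thus, to compute answers to $q$ over $\mathcal{T}$ with $\mathcal{M}$ and a database instance $I$, it suffices to evaluate the rewriting $q_{tw}$ over $\mathcal{A}_{I,\mathcal{M}^{\mathcal{T}}}$:

$$(\mathcal{T}, \mathcal{A}_{I,\mathcal{M}}) \models q(\mathbf{a}) \quad \text{iff} \quad \mathcal{A}_{I,\mathcal{M}^{\mathcal{T}}} \models q_{tw}(\mathbf{a}), \qquad \text{for any } I \text{ and } \mathbf{a} \subseteq \mathrm{ind}(\mathcal{A}_{I,\mathcal{M}}).$$

[Figure: diagram relating the database instance, the virtual ABox and the H-complete ABox (via the mapping, completion and the $\mathcal{T}$-mapping) to standard UCQ rewritings and the tw-rewriting.]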
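The composition of a mapping with the inclusions entailed by the ontology can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Ontop's implementation: rule bodies are opaque strings, the caller supplies the complete set of entailed inclusions (so no reasoning is performed here), and the names `compose`, `concept_incl` and `role_incl`, as well as the tuple encoding of $\exists R$ and $\exists R^-$, are hypothetical.

```python
def compose(mapping, concept_incl, role_incl):
    """Add to `mapping` every rule induced by an entailed inclusion.

    mapping:      set of rules (predicate, args, body), args of length 1 or 2
    concept_incl: pairs (sub, A) with T |= sub <= A, where sub is a concept
                  name or ("exists", role, inverse_flag)
    role_incl:    pairs ((role, inverse_flag), P) with T |= role(-) <= P
    """
    result = set(mapping)
    for pred, args, body in mapping:
        if len(args) == 1:                      # concept rule A'(x) <- body
            for sub, sup in concept_incl:
                if sub == pred:
                    result.add((sup, args, body))
        else:                                   # role rule R(x, y) <- body
            x, y = args
            for sub, sup in concept_incl:
                if sub == ("exists", pred, False):    # domain of R
                    result.add((sup, (x,), body))
                elif sub == ("exists", pred, True):   # range of R
                    result.add((sup, (y,), body))
            for (sub, inverse), sup in role_incl:
                if sub == pred:
                    result.add((sup, (y, x) if inverse else (x, y), body))
    return result


# Running example: the IMDb mapping and the inclusions entailed by MO.
imdb_mapping = {
    ("mo:Movie", ("m",), "title(m, t, y)"),
    ("mo:title", ("m", "t"), "title(m, t, y)"),
    ("mo:year", ("m", "y"), "title(m, t, y)"),
    ("mo:cast", ("m", "p"), "castinfo(p, m, r)"),
    ("mo:Person", ("p",), "castinfo(p, m, r)"),
}
mo_concept_incl = {
    (("exists", "mo:title", False), "mo:Movie"),
    (("exists", "mo:cast", False), "mo:Movie"),
    (("exists", "mo:cast", True), "mo:Person"),
}

composition = compose(imdb_mapping, mo_concept_incl, set())
movie_bodies = {body for pred, _, body in composition if pred == "mo:Movie"}
print(sorted(movie_bodies))
```

For the running example the composition adds a single new rule, the one deriving mo:Movie from castinfo, which is precisely the rule that the foreign key later makes redundant.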

OBDA systems such as QuOnto [19] and Prexto [22] first construct rewritings over arbitrary ABoxes and only then unfold them, using mappings, into UCQs which are evaluated by an RDBMS (dashed lines above). The same result can be obtained by unfolding rewritings over H-complete ABoxes with the help of the composition $\mathcal{M}^{\mathcal{T}}$ (solid lines above). However, in practice the resulting UCQs very often turn out to be too large [20].

In Ontop, we also start with $\mathcal{M}^{\mathcal{T}}$. But before applying it to unfold $q_{tw}$, we first simplify and reduce the size of the mapping by exploiting the database integrity constraints. Following [21], a mapping $\mathcal{M}$ from $\mathcal{R}$ to $\mathcal{T}$ is called a $\mathcal{T}$-mapping over integrity constraints $\Sigma$ if the virtual ABox $\mathcal{A}_{I,\mathcal{M}}$ is H-complete w.r.t. $\mathcal{T}$, for any data instance $I$ satisfying $\Sigma$. (The composition $\mathcal{M}^{\mathcal{T}}$ is a $\mathcal{T}$-mapping over any $\Sigma$.) Ontop transforms $\mathcal{M}^{\mathcal{T}}$ to a much simpler $\mathcal{T}$-mapping by taking account of database integrity constraints (dependencies) and SQL features such as disjunctions in filter conditions.

3.1 Tree-Witness Rewriting

We explain the essence of the tree-witness rewriting using an example. Consider an ontology $\mathcal{T}$ with the axioms

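$$\textit{RA} \sqsubseteq \exists\textit{worksOn}.\textit{Project}, \qquad \textit{worksOn}^- \sqsubseteq \textit{involves},$$
$$\textit{Project} \sqsubseteq \exists\textit{isManagedBy}.\textit{Prof}, \qquad \textit{isManagedBy} \sqsubseteq \textit{involves} \tag{4}$$

and the CQ asking to find those who work with professors:

$$q(x) \leftarrow \textit{worksOn}(x, y),\ \textit{involves}(y, z),\ \textit{Prof}(z). \tag{5}$$

Observe that if a model $\mathcal{I}$ of $(\mathcal{T}, \mathcal{A})$, for some $\mathcal{A}$, contains individuals $a \in \textit{RA}^{\mathcal{I}}$ and $b \in \textit{Project}^{\mathcal{I}}$, then $\mathcal{I}$ must also contain the following fragments:

[Figure: the fragment generated by $a$ (in RA), with an edge labelled worksOn, involves$^-$ to a point in Project, followed by an edge labelled isManagedBy, involves to a point $v$ in Prof; and the fragment generated by $b$ (in Project), with an edge labelled isManagedBy, involves to a point $w$ in Prof.]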

Here the points are not necessarily named individuals from the ABox, but can be generated by the axioms (4) as (anonymous) witnesses for the existential quantifiers. It follows then that $a$ is an answer to $q(x)$ if $a \in \textit{RA}^{\mathcal{I}}$, in which case the atoms of $q$ are mapped to the fragment generated by $a$ as follows:

[Figure: $x$ is mapped to $a$, while $y$ and $z$ are mapped to the anonymous Project and Prof points of the fragment generated by $a$.]

Alternatively, if $a$ is in both $\textit{RA}^{\mathcal{I}}$ and $\textit{Prof}^{\mathcal{I}}$, then we obtain the following match:

[Figure: both $x$ and $z$ are mapped to $a$, and $y$ is mapped to the anonymous Project point $u$, using the labels worksOn and involves$^-$ on the edge from $a$ to $u$.]

Another option is to map $x$ and $y$ to ABox individuals, $a$ and $b$, and if $b$ is in $\textit{Project}^{\mathcal{I}}$, then the last two atoms of $q$ can be mapped to the anonymous part:

[Figure: $x$ and $y$ are mapped to $a$ and $b$, and $z$ is mapped to the anonymous Prof point generated by $b$.]

Finally, all the atoms of $q$ can be mapped to ABox individuals. The possible ways of mapping parts of the CQ to the anonymous part of the model are called tree witnesses. The tree witnesses for $q$ found above give the following tree-witness rewriting $q_{tw}$ of $q$ and $\mathcal{T}$ over H-complete ABoxes:

$$q_{tw}(x) \leftarrow \textit{worksOn}(x, y),\ \textit{involves}(y, z),\ \textit{Prof}(z), \tag{6}$$
$$q_{tw}(x) \leftarrow \textit{RA}(x), \tag{7}$$
$$q_{tw}(x) \leftarrow \textit{RA}(x),\ \textit{Prof}(x), \tag{8}$$
$$q_{tw}(x) \leftarrow \textit{worksOn}(x, y),\ \textit{Project}(y). \tag{9}$$

(Note that $q_{tw}$ is not a rewriting over arbitrary ABoxes.)
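To see the four rules (6)–(9) at work, the following Python sketch evaluates them over a small hand-made ABox. The individuals `ann`, `bob`, `carl`, `eve`, `dan` and `p1` are invented for the illustration, and the ABox is H-complete only under the assumption that the ontology entails that the inverse of worksOn is included in involves, so every worksOn(a, b) is accompanied by involves(b, a).

```python
# A hypothetical H-complete ABox: concept and role extensions.
concepts = {
    "RA": {"ann"},
    "Project": {"p1"},
    "Prof": {"dan"},
}
roles = {
    "worksOn": {("bob", "p1"), ("carl", "eve")},
    # involves contains the inverses of the worksOn pairs (H-completeness)
    "involves": {("p1", "bob"), ("eve", "carl"), ("eve", "dan")},
    "isManagedBy": set(),
}


def q_tw(concepts, roles):
    """Evaluate the union of the rules (6)-(9) and return the answers for x."""
    answers = set()
    # (6) worksOn(x, y), involves(y, z), Prof(z)
    for x, y in roles["worksOn"]:
        for y2, z in roles["involves"]:
            if y == y2 and z in concepts["Prof"]:
                answers.add(x)
    # (7) RA(x)
    answers |= concepts["RA"]
    # (8) RA(x), Prof(x): contributes nothing beyond (7)
    answers |= concepts["RA"] & concepts["Prof"]
    # (9) worksOn(x, y), Project(y)
    answers |= {x for x, y in roles["worksOn"] if y in concepts["Project"]}
    return answers


print(sorted(q_tw(concepts, roles)))
```

Here `carl` is returned by (6), `ann` by (7) and `bob` by (9); rule (8) can never add an answer that (7) does not already give.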

In theory, the size of the rewriting $q_{tw}$ can be large [12]: there exists a sequence of $q_n$ and $\mathcal{T}_n$ generating exponentially many (in $|q_n|$) tree witnesses, and any rewriting of $q_n$ and $\mathcal{T}_n$ is of exponential size (unless it employs $|q_n|$-many additional existentially quantified variables [10]). Our experiments (see Section 4) demonstrate, however, that in practice, real-world ontologies and CQs generate small and simple tree-witness rewritings.

There are two ways to simplify tree-witness rewritings further. First, we can use a subsumption algorithm to remove redundant CQs from the union: for example, (7) subsumes (8), which can be safely removed. Second, we can reduce the size of the individual CQs in the union using the following observation: for any CQ q (viewed as a set of atoms),

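$$q \cup \{A(x), A'(x)\} \equiv_c q \cup \{A(x)\}, \quad \text{if } \mathcal{T} \models A \sqsubseteq A',$$
$$q \cup \{A(x), R(x, y)\} \equiv_c q \cup \{R(x, y)\}, \quad \text{if } \mathcal{T} \models \exists R \sqsubseteq A,$$
$$q \cup \{P(x, y), R(x, y)\} \equiv_c q \cup \{R(x, y)\}, \quad \text{if } \mathcal{T} \models R \sqsubseteq P,$$

where $\equiv_c$ reads 'has the same certain answers over H-complete ABoxes' (we again identify $P^-(y, x)$ with $P(x, y)$). Surprisingly, such a simple optimisation, especially for the domain/range constraints, makes rewritings substantially shorter [23, 9].

3.2 Optimising $\mathcal{T}$-mappings

Suppose $\mathcal{M} \cup \{S(\mathbf{x}) \leftarrow \psi_1(\mathbf{x}, \mathbf{z})\}$ is a $\mathcal{T}$-mapping over $\Sigma$. If there is a more specific rule than $S(\mathbf{x}) \leftarrow \psi_1(\mathbf{x}, \mathbf{z})$ in $\mathcal{M}$, then $\mathcal{M}$ itself is also a $\mathcal{T}$-mapping. To discover such 'more specific' rules, we run the standard query containment check (see, e.g., [1]), but taking account of the inclusion dependencies. For example, since $\mathcal{T} \models \exists\textit{mo:cast} \sqsubseteq \textit{mo:Movie}$, the composition $\mathcal{M}^{\textit{MO}}$ of the mapping in the introduction and MO contains the following rules for mo:Movie:

$$\textit{mo:Movie}(m) \leftarrow \textit{title}(m, t, y),$$
$$\textit{mo:Movie}(m) \leftarrow \textit{castinfo}(p, m, r).$$

The latter rule is redundant since IMDb contains the foreign key

$$\forall m\, \big(\exists p, r\ \textit{castinfo}(p, m, r) \to \exists t, y\ \textit{title}(m, t, y)\big).$$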

Another way to reduce the size of a $\mathcal{T}$-mapping is to identify pairs of rules whose bodies are equivalent up to filters w.r.t. constant values. For example, the mapping $\mathcal{M}$ for IMDb and MO contains 6 rules for sub-concepts of mo:Person:

$$\textit{mo:Actor}(p) \leftarrow \textit{castinfo}(c, p, m, r),\ (r = 1), \qquad \dots, \qquad \textit{mo:Editor}(p) \leftarrow \textit{castinfo}(c, p, m, r),\ (r = 6).$$

So, the composition $\mathcal{M}^{\textit{MO}}$ contains six rules for mo:Person that differ only in the last condition $(r = k)$, for $1 \le k \le 6$. These can be reduced to a single rule:

$$\textit{mo:Person}(p) \leftarrow \textit{castinfo}(c, p, m, r),\ (r = 1) \lor \dots \lor (r = 6).$$

Note that such disjunctions lend themselves to efficient evaluation by RDBMSs.

3.3 Unfolding with Semantic Query Optimisation (SQO)

The unfolding procedure [19] applies SLD-resolution to $q_{tw}$ and the $\mathcal{T}$-mapping, and returns those rules whose bodies contain only database atoms (cf. partial evaluation in [15]). Ontop applies SQO [6] to rules obtained at the intermediate steps of unfolding. In particular, this eliminates redundant Join operations caused by reification of database relations by means of concepts and roles. We saw in the introduction that the primary key $m$ of title, i.e., the following two functional dependencies with determinant $m$:

$$\forall m\, \big(\exists y\ \textit{title}(m, t_1, y) \land \exists y\ \textit{title}(m, t_2, y) \to (t_1 = t_2)\big),$$
$$\forall m\, \big(\exists t\ \textit{title}(m, t, y_1) \land \exists t\ \textit{title}(m, t, y_2) \to (y_1 = y_2)\big),$$

removes the two Join operations in $\textit{title}(m, t_0, y_0)$, $\textit{title}(m, t, y_1)$, $\textit{title}(m, t_2, y)$, resulting in a single atom $\textit{title}(m, t, y)$. Note that these two Join operations were introduced to reconstruct the ternary relation from its reification by means of the roles mo:title and mo:year.
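The effect of the primary key on the three title atoms can be reproduced with a short Python sketch. It is an illustration of the idea only, not Ontop's SQO procedure: it assumes that all terms are variables, that no two distinct answer variables ever need to be equated, and the names `eliminate_self_joins` and `primary_keys` are hypothetical.

```python
def eliminate_self_joins(atoms, head_vars, primary_keys):
    """Merge atoms over the same relation that agree on its primary key.

    atoms:        list of (relation, tuple_of_variables)
    head_vars:    answer variables, kept as representatives when merging
    primary_keys: dict mapping a relation to the positions of its key
    """
    subst = {}

    def resolve(term):
        # Follow the substitution chain to the current representative.
        while term in subst:
            term = subst[term]
        return term

    changed = True
    while changed:
        changed = False
        # Apply the substitution and drop duplicate atoms, keeping order.
        atoms = [(rel, tuple(resolve(t) for t in args)) for rel, args in atoms]
        atoms = list(dict.fromkeys(atoms))
        for i, (rel1, args1) in enumerate(atoms):
            key = primary_keys.get(rel1)
            if key is None:
                continue
            for rel2, args2 in atoms[i + 1:]:
                if rel1 != rel2 or any(args1[k] != args2[k] for k in key):
                    continue
                # Same key values: the functional dependencies force all
                # remaining positions to coincide.
                for s, t in zip(args1, args2):
                    s, t = resolve(s), resolve(t)
                    if s != t:
                        if t in head_vars:
                            subst[s] = t
                        else:
                            subst[t] = s
                        changed = True
                if changed:
                    break
            if changed:
                break
    return atoms


unfolded = [
    ("title", ("m", "t0", "y0")),
    ("title", ("m", "t", "y1")),
    ("title", ("m", "t2", "y")),
]
optimised = eliminate_self_joins(unfolded, {"t", "y"}, {"title": (0,)})
print(optimised)
```

The filter $(y > 2010)$ is left out of the atom list because it is unaffected by the merge; it would simply be carried over to the resulting single-atom query.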

The role of SQO in OBDA systems appears to be much more prominent than in conventional RDBMSs, where it was initially proposed to optimise SQL queries. While some SQO techniques reached industrial RDBMSs, SQO never had a strong impact on the database community because it is costly compared to statistics- and heuristics-based methods, and because most SQL queries are written by highly skilled experts (and so are nearly optimal anyway). In OBDA scenarios, in contrast, SQL queries are generated automatically, and so SQO becomes the only tool to avoid redundancy.

4 Experiments

We illustrate the performance of Ontop by three use cases. All experiments were run on Ubuntu 12.04 64-bit with an Intel Core i5 650, 4 cores@3.20GHz, 16 GB RAM and 1 TB@7200 rpm HD. We used a Java 7 virtual machine for Ontop with MySQL 5.5 for Cases 1 and 3, and with PostgreSQL 9.1 for Case 2. Full details of the experiments are available at obda.inf.unibz.it/data/owled13.

Case 1 is a simulation of a railway network for cargo delivery developed by the University of Genoa with the industrial partner Intermodal Logistics [5]. The ILog ontology, mapping and queries are used to monitor the status of the network. The case includes an ontology with 70 concepts and roles, a mapping with 43 rules and 11 queries (www.mind-lab.it/~gcicala/isf2012). For our experiments, we generated data for 30 days.

Case 2 uses the Movie Ontology (MO) over the real data from the Internet Movie Database (IMDb) with a mapping created by the Ontop development team. We use nine complex, yet natural queries, e.g.,

    SELECT DISTINCT ?x ?title ?actor_name ?prod_year ?rating
    WHERE {
      ?m a mo:Movie;
         mo:title ?title;
         mo:imdbrating ?rating;
         dbpedia:productionStartYear ?prod_year;
         mo:hasActor ?x;
         mo:hasDirector ?x .
      ?x dbpedia:birthName ?actor_name .
      FILTER ( ?rating > '7.0' && ?prod_year >= 2000 && ?prod_year <= 2010 )
    }
    ORDER BY desc(?rating) ?prod_year
    LIMIT 25

(full details are available at the URL above). Most queries are of high selectivity and go beyond CQs, using inequalities, ORDER BY/LIMIT and DISTINCT operators. Both the SQL database and the ontology were developed independently by third parties (IMDb and the University of Zurich) for purposes different from benchmarking.

Case 3 is based on the Lehigh University Benchmark (LUBM, swat.cse.lehigh.edu/projects/lubm), which comes with an OWL ontology, 14 simple CQs of varying degree of selectivity and a data generator. We approximated the ontology in OWL 2 QL and created a database schema to store the data for 200 universities (1 university ≈ 130K assertions). Note that although the data has some degree of randomness, it is not arbitrary and follows what can be regarded as a natural pattern: each university has 15–25 departments, each department has 7–10 full professors, every person has a name, etc. These considerations were taken into account to produce a normalised database schema with relations of appropriate arity together with primary and foreign keys (instead of the standard universal tables storing RDF triples).

[Table: results for the cases ILog, IMDb-MO and LUBM.]

First, the queries in ILog and IMDb-MO have no tree witnesses, and so the rewriting returns the original query (note that Q7, Q9 of ILog have a union in the original query). The only exception is Q5 in IMDb-MO with one tree witness, which generates two CQs in the rewriting. Second, the ratio of the number of rules in a mapping per concept/role in both scenarios is very low when our optimisations are applied: most have at most one rule (even in the case of large hierarchies with many domain/range axioms). So, the unfolding with such a mapping produces a small number of Select-Project-Join queries in the union. These observations support our claim that, in practice, there are few tree witnesses and that our $\mathcal{T}$-mapping optimisations can efficiently handle concept and role hierarchies as well as domain and range constraints.

The time required for query rewriting and optimisation is negligible and stays within 4ms. In contrast, the time required to generate queries without optimisations is higher, especially for queries involving large hierarchies (over 25ms): in particular, Q5 and Q8 in IMDb-MO, where our optimisations reduce the time of unfolding from 37/26ms to 3/6ms. Similarly to other systems, Ontop applies CQ containment (CQC) checks to reduce the number of Select-Project-Join queries, and these checks prove to be costly on large unfoldings without optimisations. Although a few milliseconds might seem negligible, the performance of such systems as RDF triple stores and DBs is measured in queries per second and is usually expected to be in the thousands. With such requirements, an overhead of 20–30ms per query is not acceptable.

The execution time for SQL queries produced by Ontop, in MySQL or Postgres, is within 100ms for simple, high selectivity queries (with few results). Although Q2, Q4 and Q10 in ILog and Q1, Q2, Q3, Q5 and Q8 in IMDb-MO take up to 4s to execute, their SQL rewritings are optimal (in the sense that they coincide with hand-crafted queries), and their relatively long execution time is due to DISTINCT/ORDER BY over large relations.

[Table: LUBM results per query, Q1–Q14.]

In the LUBM case, the queries have no tree witnesses, which results in tree-witness rewritings that coincide with the original queries. LUBM is, however, the only case where Ontop generated unions with hundreds of Select-Project-Join queries, which is due to a higher ratio of mappings per concept/role. This is a consequence of the database structure and the way in which mappings construct object URIs from integers and strings in the database (known as impedance mismatch [19]). The generated SQL queries are still optimal in the sense that they correspond to human-generated queries for the given database schema.

Query execution appears to be optimal for all queries (but Q6, Q9 and Q14), with response times under 12ms even for queries with Join and filter operations over large tables. This corresponds to the expected performance of an optimised RDBMS, in which most operations can be performed using in-memory indexes (provided that SQL queries have the right structure for the query planner). Q6, Q9 and Q14 have low selectivity (large number of results) and the execution time is dominated by disk access.

It is to be noted that although we used an OWL 2 QL approximation of LUBM, most queries return the same results as for the original LUBM ontology. The only exceptions are Q11 and Q12: all answers to Q11 are recovered with extra mappings simulating transitivity (up to a predefined depth) by means of self-Joins on the transitive property; similarly, for all answers to Q12, we include an extra mapping rule expressing $\exists R.B \sqsubseteq A$ on the elements of the virtual ABox. The execution times in the table are given for the extensions described above, which ensure completeness (w.r.t. the original LUBM) of the returned answers.

Finally, by comparing the performance of Ontop (see ontop.inf.unibz.it) with that of other open-source or commercial systems [11, 16], we see that Ontop is much faster than Sesame or Jena (open-source), and similar to OWLIM (commercial), but does not pay the heavy price for inference materialisation, which can take hours or days.

5 Conclusions

To conclude, we believe this paper shows that, despite the negative theoretical results on the worst-case OWL 2 QL query rewriting and sometimes disappointing experiences of the first OBDA systems, high-performance OBDA is achievable in practice when applied to standard ontologies, queries and data stored in relational databases. In such cases, query rewriting, together with SQO and SQL optimisations, is fast and efficient and produces SQL queries of high quality.

Acknowledgements. We thank G. Cicala and A. Tacchella for their help on the ILog experiments and the Ontop development team (J. Hardi, T. Bagosi and M. Slusnys) for their help with the experiments. This work was supported by the EU FP7 project Optique (grant 318338) and UK EPSRC project ExODA (EP/H05099X).

References

1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.

2. F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.

3. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodriguez-Muro, R. Rosati, M. Ruzzi, and D. F. Savo. The MASTRO system for ontology-based data access. Semantic Web, 2(1):43–53, 2011.

4. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning, 39(3):385–429, 2007.

5. M. Casu, G. Cicala, and A. Tacchella. Ontology-based data access: An application to intermodal logistics. Information Systems Frontiers, pages 1–23, 2012.

6. U. S. Chakravarthy, D. H. Fishman, and J. Minker. Semantic query optimization in expert systems and database systems. Benjamin-Cummings Publishing Co., Inc., 1986.

7. A. Chortaras, D. Trivela, and G. Stamou. Optimized query rewriting for OWL 2 QL. In Proc. of CADE-23, volume 6803 of LNCS, pages 192–206. Springer, 2011.

8. T. Eiter, M. Ortiz, M. Šimkus, T.-K. Tran, and G. Xiao. Query rewriting for Horn-SHIQ plus rules. In Proc. of AAAI 2012. AAAI Press, 2012.

9. G. Gottlob, G. Orsi, and A. Pieris. Ontological queries: Rewriting and optimization. In Proc. of ICDE 2011, pages 2–13. IEEE Computer Society, 2011.

10. G. Gottlob and T. Schwentick. Rewriting ontological queries into small nonrecursive datalog programs. In Proc. of KR 2012. AAAI Press, 2012.

11. V. Khadilkar, M. Kantarcioglu, B. M. Thuraisingham, and P. Castagna. Jena-HBase: A distributed, scalable and efficient RDF triple store. In Proc. of ISWC, volume 914 of CEUR-WS, 2012.

12. S. Kikot, R. Kontchakov, V. Podolskii, and M. Zakharyaschev. Exponential lower bounds and separation for query rewriting. In Proc. of ICALP 2012, Part II, volume 7392 of LNCS, pages 263–274. Springer, 2012.

13. S. Kikot, R. Kontchakov, and M. Zakharyaschev. Conjunctive query answering with OWL 2 QL. In Proc. of KR 2012. AAAI Press, 2012.

14. M. König, M. Leclère, M.-L. Mugnier, and M. Thomazo. A sound and complete backward chaining algorithm for existential rules. In Proc. of RR 2012, volume 7497 of LNCS, pages 122–138. Springer, 2012.

15. J. W. Lloyd and J. C. Shepherdson. Partial Evaluation in Logic Programming. The Journal of Logic Programming, 11(3-4):217–242, October 1991.

16. Ontotext. OWLIM performance with Jena, 2011. http://www.ontotext.com/owlim/benchmark-results/owlim-jena-performance.

17. H. Pérez-Urbina, B. Motik, and I. Horrocks. A comparison of query rewriting techniques for DL-lite. In Proc. of DL 2009, volume 477 of CEUR-WS, 2009.

18. H. Pérez-Urbina, E. Rodríguez-Díaz, M. Grove, G. Konstantinidis, and E. Sirin. Evaluation of query rewriting approaches for OWL 2. In Proc. of SSWS+HPCSW 2012, volume 943 of CEUR-WS, 2012.

19. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. Linking data to ontologies. J. Data Semantics, 10:133–173, 2008.

20. M. Rodríguez-Muro. Tools and Techniques for Ontology Based Data Access in Lightweight Description Logics. PhD thesis, KRDB Research Centre for Knowledge and Data, Free Univ. of Bozen-Bolzano, 2010.

21. M. Rodríguez-Muro and D. Calvanese. Dependencies: Making ontology based data access work. In Proc. of AMW 2011, volume 749. CEUR-WS.org, 2011.

22. R. Rosati. Prexto: Query rewriting under extensional constraints in DL-Lite. In Proc. of ESWC 2012, volume 7295 of LNCS, pages 360–374. Springer, 2012.

23. R. Rosati and A. Almatelli. Improving query answering over DL-Lite ontologies. In Proc. of KR 2010. AAAI Press, 2010.