1 Introduction

Knowledge Base Exchange

M. Arenas

marenas@ing.puc.cl 0

E. Botoeva

D. Calvanese

V. Ryzhikov

E. Sherkhonov

e.sherkhonov@uva.nl 1 0 Dept. of Computer Science , PUC Chile 1 ISLA, University of Amsterdam , Netherlands 2 KRDB Research Centre, Free Univ. of Bozen-Bolzano , Italy

Knowledge base exchange can be considered as a generalization of data exchange in which the aim is to exchange between a source and a target connected through mappings, not only explicit knowledge, i.e., data, but also implicit knowledge in the form of axioms. Such problem has been investigated recently using Description Logics (DLs) as representation formalism, thus assuming that the source and target KBs are given as a DL TBox+ABox, while the mappings have the form of DL TBox assertions. In this paper we are interested in the problem of representing a given source TBox by means of a target TBox that captures at best the intensional information in the source. In previous work, results on representability have been obtained for DL-LiteRDFS , a DL corresponding to the FOL fragment of RDFS. We extend these results to the positive fragment of DLLiteR, in which, differently from DL-LiteRDFS , the assertions in the TBox and the mappings may introduce existentially implied individuals. For this we need to overcome the challenge that the chase, a key notion in data and knowledge base exchange, is not guaranteed anymore to be finite.

1 Introduction

Knowledge base exchange is an extension of the data exchange setting, where the source may contain implicit knowledge by which new data may be inferred. The first framework for data exchange with incompletely specified data in the source was proposed in [ 3 ]. This framework is based on the general notion of representation system, as a mechanism to represent multiple instances of a data schema, and considers the problem of incomplete data exchanges under mappings constituted by a set of tuple generating dependencies (tgds), i.e., mappings between pairs of conjunctive queries. Given that the source data may be incompletely specified, (possibly infinitely) many source instances are implicitly represented. This framework was refined in [ 1, 2 ] to the case where as a representation system Description Logics (DL) knowledge bases (KBs) were used: the TBox and the ABox of a DL KB represent implicit and explicit information respectively, and mappings are sets of DL inclusions. While in the traditional data exchange setting, given a source instance and a mapping specification, (universal) solutions are target instances derived from the source instance and the mapping, in this case solutions are target DL KBs, derived from the source KB and the mapping.

In such a setting, in order to minimize the exchange (and hence transfer and materialization) of explicit (i.e., ABox) information, one is interested in computing universal solutions that contain as much implicit knowledge as possible. Therefore, the notion of representability was defined, which helps us in understanding the capacity of (universal) solutions to transfer implicit knowledge: we say that a source TBox is representable under a mapping if there exists a target TBox that leads to a universal solution when it is combined with a suitable ABox computed from the source ABox, independently of the actual source ABox. Weak representability is concerned with representability under a mapping extended with assertions that are implied by the given mapping and the source TBox. (Weak) representability of a source TBox under a mapping implies that the only knowledge that remains to be transferred explicitly via the (extended) mapping is the one in the source ABox. Therefore, checking (weak) representability and computing a representation of a source TBox under a(n extended) mapping turn out to be crucial problems in the context of KB exchange.

In [ 1, 2 ] the problems of deciding representability and weak-representability, and of computing a representation for a given mapping and a TBox was tackled for DLLiteRDFS , the DL counterpart of RDFS [ 5 ] and a member of the DL-Lite family of DLs [ 6 ]. It has been shown that these problems can be solved in polynomial time for DL-LiteRDFS mappings and TBoxes. Moreover, due to the simplicity of the logic, the characterization of representations is concise and simple.

In this paper we extend those results to the case of DL-LiteR without disjointness assertions, a DL that we call DL-Lite pos. The presence of existential quantifiers on the

R right-hand side of concept inclusions makes the problem considerably more complicated than for DL-LiteRDFS . However, we show that also in the presence of existentials on the right-hand side we are able to decide representability and weak representability of a DL-Lite pos TBox under a DL-Lite pos mapping in polynomial time and to construct

R R a polynomial size representation. 2 2.1

Preliminaries

DL-Lite Rpos Knowledge Bases The DLs of the DL-Lite family [ 6 ] are characterized by the fact that reasoning can be done in polynomial time, and that data complexity of reasoning and conjunctive query answering is in AC0. We present now the syntax and semantics of DL-Lite pos, which is R the DL that we adopt here, together with a sub-language of it.

In the following, we use A and P to denote concept and role names, respectively, and B and R to denote generic concepts and roles, respectively. The latter are defined by the following grammar:

R ::= P j P

B ::= A j 9R For a role R, we use R to denote P when R = P , and P when R = P .

A DL-Lite pos TBox is a finite set of concept inclusions B v B0 and role inclusions

R R v R0. A DL-Lite pos ABox is a finite set of membership assertions of the form A(u)

R and P (u; v), where u and v are individuals or labeled nulls. We distinguish between the two, since individuals are interpreted under the unique name assumption, while labeled nulls obtain their meaning through assignments (see below). Notice that we include labeled nulls in ABoxes as they are needed in the exchange of KBs. A DL-Lite pos KB R K isNaopteaitrhhaTt ;DALi-,Lwitehpeorse iTs tihsea fDraLg-mLietnetpoos fTtBheoxDaLndDAL-iLsiateDRLs-tLuidteiepdos iAnB[6o]x.without

R R

R disjointness assertions on concepts and roles. In DL-LiteR, B and R are called basic concepts and basic roles, respectively, and for coherence with previous work on the DLLite family, we adopt here this terminology as well. We call DL-LiteRDFS the fragment of DL-Lite pos (and hence of DL-LiteR) in which there are only atomic concepts and

R atomic roles on the right-hand side of inclusions.

The semantics of DL-Lite pos is, as usual in DLs, based on the notion of first-order interpretation I = h I ; I i, wRhere I is a non-empty domain and I is an interpretation function such that: (1) AI I , for every concept name A; (2) P I I I , for every role name P ; (3) aI 2 I , for every individual name a; and (4) such that: (9R)I (P )I = = fx 2 I j there exists y 2 f(y; x) 2 I

I j (x; y) 2 P I g

I s.t. (x; y) 2 RI g

Moreover, satisfaction of concept and role inclusions is defined as follows: I j= B v B0 if BI B0I , and I j= R v R0 if RI R0I . Finally, satisfaction of membership assertions is defined as follows. A substitution over an interpretation I is a function h from individuals and labeled nulls to I such that h(a) = aI for each individual a. Then (I; h) j= A(u) if h(u) 2 AI , and (I; h) j= P (u; v) if (h(u); h(v)) 2 P I .

An interpretation I is a model of a DL-Lite pos TBox T if for every 2 T , it holds tohvaetrII j=such, tahnadt fiotrisevaemryodel2oAfa, iDt Lh-oLl ditse RtphosatRA(BIo;xh)Aj=if t h.eFrienaelxliys,tsIaissuabmstoitduetiloonf ha DL-LiteR KB K = hT ; Ai if I is a model of both T and A. The set of all models of K is denoted MOD(K), and K is consistent if MOD(K) 6= ;. We observe that in DLLite pos one cannot express any form of negative information, and hence a DL-Lite pos

R R KB is always consistent.

We assume that interpretations satisfy the standard names assumption, that is, we assume given a fixed infinite set U of individual names, and we assume that for every interpretation I, it holds that I U and aI = a for every individual name a. This implies that interpretations satisfy the unique name assumption over individual names.

A signature is a set of concept and role names. An interpretation I is said to be an interpretation of if it is defined exactly on the concept and role names in . Given a KB K, the signature (K) of K is the alphabet of concept and role names occurring in K, and K is said to be defined over (or simply, over) a signature if (K) (and likewise for a TBox T , an ABox A, inclusions B v C and R v Q, and membership assertions A(u) and P (u; v)). 2.2

Queries, Certain Answers, and Chase

A k-ary query q over a signature , with k 0, is a function that maps every interpretation h I ; I i of into a k-ary relation qI k. In particular, if k = 0, then q is called a Boolean query, and qI is either a relation containing the empty tuple () (representing the value true) or the empty relation (representing the value false). A query q is said to be a query over a KB K if q is a query over a signature and (K). Moreover, the answer to q over K, denoted by cert (q; K), is defined as cert (q; K) = TI2MOD(K) qI : Each tuple in cert (q; K) is called a certain answer for q over K. Notice that if q is a Boolean query, then cert (q; K) evaluates to true if qI evaluates to true for every I 2 MOD(K), and it evaluates to false otherwise.

In this paper, we adopt the class of unions of conjunctive queries as our main query formalism. A conjunctive query (CQ) over a signature is a first-order formula of the form q(x) = 9y:conj (x; y), where x, y are tuples of variables and conj (x; y) is a conjunction of atoms of the form: (1) A(t), with A a concept name in and t either an individual from U or a variable from x or y, or (2) P (t1; t2), with P a role name in and ti (i = 1; 2) either an individual from U or a variable from x or y. In a CQ q(x) = 9y:conj (x; y) over a signature , x is the tuple of free variables of q(x). Moreover, given an interpretation I = h I ; I i of , the answer of q over I, denoted by qI , is defined as the set of tuples a of elements from I for which there exist a tuple b of elements from I such that I satisfies every conjunct in conj (a; b). Finally, a union of conjunctive queries (UCQ) over a signature is a finite set of CQs over that have the same free variables. A UCQ q(x) is interpreted as the first-order formula Wqi2q qi(x), and its semantics is defined as qI = Sqi2q qiI .

Certain answers in DL-Lite pos can be characterized through the notion of chase. We

R call a chase a (possibly infinite) set of assertions of the form A(t), P (t; t0), where t, t0 are either individuals from U, or labeled nulls interpreted as not necessarily distinct domain elements (see the definition of the semantics of DL-Lite pos in Section 2.1). For R DL-Lite pos KBs, we employ the notion of restricted chase as defined in [ 6 ]. For such a

R KB hT ; Ai, the chase of A w.r.t. T , denoted chaseT (A), is a chase obtained from A by adding facts implied by inclusions in T , and introducing fresh labeled nulls whenever required by an inclusion with 9R in the right-hand side (see [ 6 ] for details). 2.3

Knowledge Base Exchange Framework

Assume that 1, 2 are signatures with no concepts or roles in common. Then we say that an inclusion N1 v N2 is an inclusion from 1 to 2, if N1 is a concept or a role over 1 and N2 is a concept or a role over 2. For a DL L (e.g., DL-Lite pos), R we define an L-mapping (or just mapping, when L is clear from the context) as a tuple M = ( 1; 2; T12), where T12 is a TBox in L consisting of inclusions from 1 to 2: (1) C1 v C2, where C1, C2 are concepts in L over 1 and (2) Q1 v Q2, where Q1 and Q2 are roles in L over 1 and

2, respectively, and 2, respectively.

If T12 is an L-TBox, for a DL L (e.g., DL-Lite pos), then M is called an L-mapping.

The semantics of a mapping is defined in terms of the notion of satisfaction. More specifically, given a mapping M = ( 1; 2; T12), an interpretation I of 1 and an interpretation J of 2, pair (I; J ) satisfies TBox T12, denoted by (I; J ) j= T12, if for each concept inclusion C1 v C2 2 T12, it holds that C1I C2J , and for each role inclusion Q1 v Q2 2 T12, it holds that Q1I Q2J . Moreover, given an interpretation I of 1, SATM(I) is defined as the set of interpretations J of 2 such that (I; J ) j= T12, and given a set X of interpretations of 1, SATM(X ) is defined as: SATLMet(XM) == (SI12;X S2;ATT1M2)(bIe).a mapping, K1 a KB over 1, and K2 a KB over 2. – K2 is called a solution for K1 under M if MOD(K2) SATM(MOD(K1)), and – K2 is called a universal solution for K1 under M if MOD(K2) SATM(MOD(K1)). =

Universal solutions present several limitations, as argued in [ 1, 2 ]. First, a universal solution (in DL-Lite pos and DL-LiteR) does not always exist. Second, if it exists, then

R its TBox is trivial (that is, equivalent to the empty TBox). Finally, in the worst case the smallest universal solution is exponential in the size of the mapping and the source KB. A notion of solution parametrized w.r.t. a query language was proposed in [ 1, 2 ] in order to overcome these limitations. Such a notion, though weaker, is in line with the objective of (data and) KB exchange of providing in the target sufficient information to answer queries that could also be posed over the source.

Let Q be a class of queries, M = ( 1; 2; T12) a mapping, K1 = hT1; A1i a KB over 1, and K2 a KB over 2. Then – K2 is called a Q-solution for K1 under M if for every query q 2 Q over 2, cert (q; hT1 [ T12; A1i) cert (q; K2), and – K2 is called a universal Q-solution for K1 under M if for every query q 2 Q over 2, cert (q; hT1 [ T12; A1i) = cert (q; K2).

The definitions of solutions are illustrated in the following example.

Example 1. Assume 1 = fPainting ( ), PaintedBy ( ; ), ArtMovement ( ; )g and 2 = fArtPiece( ), ArtAuthor ( ; ), HasStyle( ; ), HasGenre( ; )g. Consider mapping M = ( 1; 2; T12), where T12 is the following TBox:

Painting v ArtPiece Painting v 9HasGenre

PaintedBy v ArtAuthor ArtMovement v HasStyle Further, assume T1 = fPainting 9PaintedBy , Painting v 9ArtMovement g and A1 = fPainting (blacksquare)g. Then, a universal solution for the KB K1 = hT1; A1i under M is the KB K2 = hT2; A2i, where T2 = ; and A2 is the following ABox, where n01, n02, and m01 are labelled nulls:

ArtPiece(blacksquare) HasGenre(blacksquare; m01)

ArtAuthor (blacksquare; n01) HasStyle(blacksquare; n02)

Now, consider KB K20 = hT20; A02i with non-empty TBox, where T20 = fArtPiece 9ArtAuthor ; ArtPiece v 9HasStyle; ArtPiece v 9HasGenreg and A02 = fArtPiece(blacksquare)g. Then we have that K20 is a solution for K1 under M. However, we also have that K20 is not a universal solution for K1 under M. Notably, both KB K2 and KB K20 are universal UCQ-solutions for KB K1 under mapping M.

In order to understand the capacity of universal solutions, and also of the querylanguages based notions of solutions to transfer implicit knowledge, the notion of representability has been introduced in [ 1, 2 ]. Here we adapt that definition to the case where the KB is always satisfiable, as in DL-Lite pos. In the definition below we use R chaseT ; (A) to denote the projection of chaseT (A) on the signature .

Let L be a DL, Q a class of queries, M = ( 1; 2; T12) an L-mapping, and T1 an L-TBox over 1. Then, – T1 is (Q-)representable under M if there exists an L-TBox T2 over 2, called a (Q-)representation of T1 under M, such that for every ABox A1 over 1, hT2; chaseT12; 2 (A1)i is a (Q-)universal solution for hT1; A1i under M. – T1 is weakly (Q-)representable under M if there exists a mapping M? = ( 1; 2; T1?2) such that T12 T1?2, T1 [ T12 j= T1?2, and T1 is (Q-)representable ? under M .

Example 2. Let M = ( 1; 2; T12) and T1 be as in Example 1. Then we have that T2 = fArtPiece 9ArtAuthor ; ArtPiece v 9HasStyle; ArtPiece v 9HasGenreg is a UCQ-representation of T1 under M.

On the other hand, if M0 = ( 1; 2; T102) with T102 = fPaintedBy v ArtPieceg, then we have that T1 is not UCQ-representable under M0: take ABox A1 = fPainting (blacksquare)g, then chaseT102; 2 (A1) = ; and for no TBox T20, hT20; chaseT102; 2 (A1)i is a universal UCQ-solution for hT1; A1i under M0. However, if we consider T1?2 = T102 [ fPainting v 9ArtAuthor g, we conclude that T1 is weakly UCQ-representable under M0 since T102 T1?2, T1 [ T102 j= T1?2 and T1 is UCQrepresentable under M? = ( 1; 2; T1?2) (in fact, ; is a UCQ-representation of T1 under M?). 3

Solving UCQ-Representability for DL-Lite pos R In this section, we show that the UCQ-representability problem can be solved in polynomial time for the case where TBoxes and mappings are expressed in DL-Lite pos. R More specifically, we give a polynomial time algorithm UCQREP pos that, given a DLLite Rpos-mapping M = ( 1; 2; T12) and a DL-Lite Rpos-TBox T1, verifies whether T1 is UCQ-representable under M, and if this is the case computes a UCQ-representation of T1 under M. Moreover, we also show that this algorithm can be used to solve the UCQ-representability problem for the case of DL-LiteRDFS . It is important to notice that the algorithm we present can be used to compute universal UCQ-solutions of polynomial size, which make good use of the source implicit knowledge. Thus, this algorithm computes solutions with good properties to be used in practice.

A related problem is that of query inseparability [ 7 ], which can be formulated as follows: given TBoxes T1 and T2, and a signature , decide whether for each ABox A over and for each query q over , cert (q; hT1; Ai) = cert (q; hT2; Ai). In contrast to our polynomial result for UCQ-representability, query inseparability has been proved to be PSPACE-hard for DL-LiteR TBoxes and CQs [ 7 ], and an analysis of the proof shows that the same lower bound holds already for DL-Lite pos.

R 3.1

Checking Whether a Given Target TBox is a UCQ-Representation

We start by considering the decision problem associated with UCQ-representability: Given a DL-LiteR-mapping M = ( 1; 2; T12), a DL-LiteR-TBox T1 over 1, and a DL-LiteR-TBox T2 over 2, check whether T2 is a UCQ-representation of T1 under M, i.e., for each ABox A1 over 1, hT2; chaseT12; 2 (A1)i is a universal UCQ-solution for hT1; A1i under M. This problem can be solved in two steps: (C1) Check whether for each ABox A1 over 1, hT2; chaseT12; 2 (A1)i is a UCQsolution for hT1; A1i under M. (C2) Check whether for each ABox A1 over 1 and for each UCQ q over 2, we have that cert (q; hT2; chaseT12; 2 (A1)i) cert (q; hT1 [ T12; A1i).

If both checks succeed, then T2 is a UCQ-representation of T1 under M, otherwise not. We develop now techniques to perform these two checks in polynomial time. Checking Condition (C1). For a DL-Lite pos TBox T and a concept or role N , we define the upward closure of N w.r.tR. T as the set UT (N ) = fN 0 j N 0 is concept or role and T j= N v N 0g; and the strict upward closure ST (N ) aSsNU2TN(UNT)(nNf)N,agn.dTihtsens,trfiocrt avesresitoNn SoTf (cNon).ceNpotsticaendthraotlebsotwheSdTe12fi(nUeTU1(TN(N)) )an=d UT2 (SM(N )) are sets over 2, for each concept or role N over 1.

With these notions in place, we can provide a necessary and sufficient condition for the satisfaction of condition (C1).

TPBroopxoosviteiron 11,. aLnedt TM2 a=DL(-Li1t;e Rpo2s-;TTB12o)x boevear D2L.-LTihteeRnpoTs-1m, aTp2p,ianngd, TM1 asaDtisLf-yLcitoenRpdosi-tion (C1) iff the following conditions are satisfied: (A) SM(UT1 (B)) UT2 (SM(B)), for each basic concept B over 1; (B) SM(UT1 (R)) UT2 (SM(R)), for each basic role R over 1; (C) for each basic concept B and each basic role R over 1 such that 9R 2 UT1 (B) and SM(UT1 (9R )) 6= ;, we have that if SM(UT1 (R)) 6= ;, then there exists a role QR;B over 2 such that (CA) 9QR;B 2 UT2 (SM(B)), (CB) SM(UT1 (R)) UT2 (QR;B ), and (CC) SM(UT1 (9R )) UT2 (9QR;B ), and if SM(UT1 (R)) = ;, either

(CD) SM(UT1 (9R )) UT2 (SM(B)), or there exist roles Q1R;B ; : : : ; QnR;B over 2 such that (CE) 9Q1R;B 2 UT2 (SM(B)), (CF) T2 j= 9(Q1R;B ) v 9Q2R;B ; : : : ; 9(QnR;B1) v 9QnR;B, and (CG) SM(UT1 (9R ))

UT2 (9(QnR;B) ).

It is important to notice that the necessary and sufficient condition in Proposition 1 can be checked in polynomial time, as the implication problem for DL-Lite pos can R be solved in polynomial time. In particular, for a basic concept B and a basic role R over 1 such that 9R 2 UT1 (B), SM(UT1 (9R )) 6= ;, SM(UT1 (R)) = ;, and SM(UT1 (9R )) * UT2 (SM(B)), checking the existence of roles Q1R;B ; : : : ; QnR;B over 2 satisfying conditions (CE), (CF), and (CG) can be reduced to checking reachability in a directed graph. Indeed, for each pair of basic concepts B2; B20 over 2 T2 j= B2 v 9Q1R;B; T2 j= 9(Q1R;B) v 9Q2R;B; : : : ; T2 j= 9(QnR;B) G = (V; E) be the directed graph defined as: Then we test for the existence of the roles Q1R;B; : : : ; QnR;B by verifying whether B20 is reachable from B2 in G. If for some pair B2, B20 the aforementioned two-step test succeed, then we have that there exist roles Q1R;B; : : : ; QnR;B that satisfy conditions (CE), (CF), and (CG). Otherwise, we know that such roles do not exist.

Checking Condition (C2). We rely on the following result:

TPBroopxoosviteiron 21,. aLnedt TM2 a=DL(-Li1t;e Rpo2s-;TTB12o)x boevear D2L.-LTihteeRnpoTs-1m, aTp2p,ianngd, TM1 asaDtisLf-yLcitoenRpdosi-tion (C2) iff the following conditions are satisfied: (A) UT2 (SM(B)) SM(UT1 (B)) for each basic concept B over 1; (B) UT2 (SM(R)) SM(UT1 (R)) for each role R 2 1; (C) for each basic role Q over 2 and each basic concept B over 1 such that 9Q 2 UT2 (SM(B)) and UT2 (9Q ) 6= f9Q g, there exists a role RQ;B over 1 s.t. (CA) 9RQ;B 2 UT1 (B) and (CB) Q 2 SM(RQ;B).

The necessary and sufficient condition in Proposition 2 can be checked in polynomial time, as the implication problem can be solved in polynomial time for DL-Lite pos. R

Thus, given that, by Propositions 1 and 2, both conditions (C1) and (C2) can be tested in polynomial time, we obtain the following result.

Theorem 1. The problem of verifying, given a DL-Lite pos-mapping M = ( 1; 2; T12), a DL-Lite Rpos-TBox T1 over 1, and a DL-LiteRRpos-TBox T2 over 2, whether T2 is a UCQ-representation of T1 under M can be solved in polynomial time. 3.2

The Algorithm for Computing a UCQ-Representation

In what follows, we present the algorithm UCQREP pos, which verifies whether a source TBox T1 is UCQ-representable under a mapping M, and if this is the case returns a UCQ-representation of T1 under M.

Intuitively, given a source TBox T1 and a mapping M, algorithm UCQREP pos constructs the best possible candidate T2 for a UCQ-representation of T1 under M (given the conditions in Propositions 1 and 2), and then checks whether T2 effectively satisfies the properties required for a UCQ-representation (note, that T2 is indeed a DL-Lite pos R TBox). To prove the correctness of this algorithm, we need to show that T1 is UCQrepresentable under M if and only if T2 is a UCQ-representation of T1 under M. This Algorithm: UCQREP pos(T1; M) IOnuptuptu:t: AA DDLL--LLiitteeRRppooss--mTBapopxinTg2 Move=r ( 2 1th; at 2is; Ta12U)CaQnd-reapDreLs-eLnittaetRpioosn-ToBf oTx1 Tu1nodvererM1,.if

T1 is UCQ-representable under M. The keyword false otherwise.

1. Let T2 be a TBox over

2 defined as: T2 = fN2 v M2 j N1 a basic concept or role over 1;

N2 2 SM(N1); M2 2 SM(UT1 (N1))g 2. Remove from T2 every inclusion N2 v M2 such that (i) N2 2 SM(N1) for some N1 over 1, and (ii) for every M1 over 1 such that M2 2 SM(M1), it holds that T1 6j= N1 v M1. Moreover, if N2 = 9R2 and M2 = 9R20, then also remove inclusions R2 v R20 and R2 v R20 from T2. 3. Remove from T2 every inclusion of the form either 9R2 v B2 or R2 v R20 or R2 v R20 for roles R2, R20 and a concept B2 over 2, if there exists a concept B1 over 1 such that (i) 9R2 2 SM(B1), and (ii) for every role R1 over 1 such that 9R1 2 UT1 (B1) and R2 2 SM(R1), it holds that T1 6j= B1 v 9R1. 4. Verify whether T2 is a UCQ-representation of T1 under M. If the test succeeds, return

T2, otherwise return false.

Fig. 1. Algorithm to compute the UCQ-representation of a DL-Lite pos TBox T1 under a DL-Lite pos mapping M. R R is done in the following theorem, where it is also proved that the algorithm works in polynomial time. The latter is a consequence of the fact that T2 is of polynomial size in the sizes of T1 and M, and that, by Theorem 1, it is possible to check in polynomial time whether T2 is a UCQ-representation of T1 under M.

Theorem 2. Algorithm UCQREP pos is correct and runs in polynomial time.

The following examples illustrate how algorithm UCQREP pos works. Example 3. Let M = ( 1; 2; T12), where 1 = fA1( ); B1( ); C1( )g, 2 = fA2( ), B2( )g, and T12 = fA1 v A2, B1 v B2; C1 v B2g. Furthermore, assume that T1 = fB1 v A1g. Then, in step 1, the algorithm constructs the TBox T2 = fB2 v A2g. In step 2, it removes the only axiom from T2 as B2 2 SM(C1) and T1 6j= C1 v A1. In step 3, it does nothing, and finally, at the last step it checks whether the empty TBox T2 is a UCQ-representation of T1 under M. Since A2 2 SM(UT1 (B1)) and A2 2= UT2 (SM(B1)), the algorithm returns false.

Example 4. Let M = ( 1; 2; T12), where 1 = fB1( ); P1( ; ); R1( ; )g, 2 = fA2( ); B2( ); R2( ; )g, and T12 = f9P1 v A2; B1 v B2; R1 v R2g. Furthermore, assume that T1 = fB1 v 9P1; B1 v 9R1; 9R1 v 9P1 g. Then, in step 1, the algorithm constructs the TBox T2 = fB2 v 9R2; 9R2 v A2g. It does not remove anything in steps 2 and 3. Finally, at the last step it successfully checks that T2 is a UCQ-representation of T1 under M and returns T2. It is not difficult to see that if the input of algorithm UCQREP pos is a DL-LiteRDFS mapping M = ( 1; 2; T12) and a DL-LiteRDFS -TBox T1 over 1, then the set T2 computed by this algorithm is a DL-Lite Rpos-TBox over 2 that can be easily transformed into an equivalent DL-LiteRDFS -TBox. Indeed, T2 might contain inclusions between basic concepts of the form 9R2 v 9R20, but this occurs only if T2 implies also the role inclusion R2 v R20. Hence, all concept inclusions that would fall outside DLLiteRDFS are implied by the DL-LiteRDFS fragment of T2 and can be removed from T2 without affecting its semantics. Thus, we conclude that algorithm UCQREP pos can also be used to solve in polynomial time the UCQ-representability problem for DL-LiteRDFS mappings and TBoxes. In this section, we show that also the weak UCQ-representability problem can be solved in polynomial time when TBoxes and mappings are expressed DL-Lite pos. We first need R to introduce some terminology. Given a DL-LiteR-TBox T over a signature and a UCQ q over , a UCQ qr over is said to be a perfect reformulation of q w.r.t. T if for every ABox A over , it holds that [ 6 ]: cert (q; hT ; Ai) = cert (qr; h;; Ai). That is, the certain answers to the UCQ q over a KB hT ; Ai can be computed by posing the UCQ qr over the ABox A. It is well-known that every UCQ q admits a perfect reformulation w.r.t. a DL-LiteR-TBox T , which can be computed in polynomial time [ 6 ].

Interestingly, the fundamental notion of perfect reformulation can be used to solve the UCQ-representability problem for DL-Lite Rpos. More precisely, let M = ( 1; 2; T12) be a DL-LiteR-mapping and T1 a DL-LiteR-TBox over 1. Then define a mapping COMP(M; T1) = ( 1; 2; T1?2) that extends M by compiling the knowledge from T1 into T12. Formally, for a basic concept or role N over 1, let bq N be the CQ defined as follows: bq A(x) = A(x), bq 9P (x) = 9y:P (x; y), bq 9P (x) = 9y:P (y; x), bq P (x; y) = P (x; y), and bq P (x; y) = P (y; x). Then, for every concept inclusion B v C 2 T12 and for every CQ q in the perfect reformulation of bq B w.r.t. T1, include Cq v C into T1?2, where Cq is the (unique) basic concept such that bq Cq = q. Also, for every role inclusion R v Q 2 T12 and for every CQ q in the perfect reformulation of bq R w.r.t. T1, include Rq v Q into T1?2, where Rq is the basic role such that bq Rq = q. T1 aItDisL-iLmitpeoRprotsa-nTtBtooxnoovtiecre th1a,tthifenMCO=M(P( M1; ; T21;)T=12)(is1a; D2L;-TL1i?t2e)Rpioss-amDapLp-iLnigteRaponsdmapping that can be computed in polynomial time in the sizes of M and T1. Therefore, given that the set of inclusions defining COMP(M; T1) contains the set of inclusions defining M and T1 [ T12 j= T1?2, we conclude that COMP(M; T1) can be used to check in polynomial time whether T1 is weakly UCQ-representable under M. Theorem 3. Let M = ( 1; 2; T12) be a DL-Lite Rpos-mapping and T1 a DL-Lite RposTBox over 1. Then T1 is weakly UCQ-representable under M if and only if T1 is UCQ-representable under COMP(M; T1).

From this result and Theorem 2 we obtain a polynomial time algorithm for solving the weak UCQ-representability problem for DL-Lite pos mappings and TBoxes. R The example below shows a DL-Lite pos TBox T1 and a DL-Lite pos mapping M such

R R that T1 is not weakly UCQ-representable under M.

Example 5. Let M = ( 1; 2; T12), where 1 = fP1( ; ); B1( )g, 2 = fA2( ), B2( )g, and T12 = fB1 v B2, 9P1 v A2g. Furthermore, assume that T1 = fB1 v 9P1g. Then T1 is not weakly UCQ-representable under M. In fact, given that the perfect reformulation of 9P1 w.r.t. TBox T1 is 9P1 itself, and likewise for concept B1, we have that M = COMP(M; T1) and, thus, T1 is not weakly UCQ-representable under M, as T1 is not UCQ-representable under M.

Instead, as shown in [ 1, 2 ], for each DL-LiteRDFS TBox T1 and DL-LiteRDFS mapping M, T1 is weakly UCQ-representable under M. 5 In this paper, we have extended previous results on representability in the knowledge exchange framework to DL-Lite pos, a DL of the DL-Lite family that allows for existen

R tials in the right-hand side of inclusion assertions, both in the source TBox and in the mapping. We are currently working on extending our results to DL-LiteR, which includes disjointness assertions, and to the other DLs in the extended DL-Lite family [ 4 ]. A further interesting problem that we are investigating is that of checking the existence of universal solutions for DL-Lite pos and other more expressive DLs.

R Acknowledgements. The authors were supported by the EU Marie Curie action FP7PEOPLE-2009-IRSES (grant 24761), by Fondecyt (grant 1090565), by the EU ICT projects ACSI (grant FP7-257593) and FOX (grant FP7-233599), and by the NWO project DEX (612.001.012).

1. Arenas , M. , Botoeva , E. , Calvanese , D. : Knowledge base exchange . In: Proc. of DL 2011 . CEUR, ceur-ws.org , vol. 745 ( 2011 )

2. Arenas , M. , Botoeva , E. , Calvanese , D. , Ryzhikov , V. , Sherkhonov , E.: Exchanging description logic knowledge bases . In: Proc. of KR 2012 ( 2012 )

3. Arenas , M. , Pe´rez, J., Reutter , J.L. : Data exchange beyond complete data . In: Proc. of PODS 2011 . pp. 83 - 94 ( 2011 )

4. Artale , A. , Calvanese , D. , Kontchakov , R. , Zakharyaschev , M.: The DL-Lite family and relations . J. of Artificial Intelligence Research 36 , 1 - 69 ( 2009 )

5. Brickley , D. , Guha , R.V. : RDF vocabulary description language 1 .0:

RDF

Schema. W3C Recommendation , World Wide Web Consortium (Feb 2004 ), available at http://www.w3.org/TR/rdf-schema/

6. Calvanese , D. , De Giacomo , G. , Lembo , D. , Lenzerini , M. , Rosati , R. : Tractable reasoning and efficient query answering in description logics: The DL-Lite family . J. of Automated Reasoning 39 ( 3 ), 385 - 429 ( 2007 )

7. Konev , B. , Kontchakov , R. , Ludwig , M. , Schneider , T. , Wolter , F. , Zakharyaschev , M. : Conjunctive query inseparability of OWL 2 QL TBoxes . In: Proc. of AAAI 2011 . pp. 221 - 226 ( 2011 )