=Paper= {{Paper |id=None |storemode=property |title=Representability in DL-Lite_R Knowledge Base Exchange |pdfUrl=https://ceur-ws.org/Vol-846/paper_60.pdf |volume=Vol-846 |dblpUrl=https://dblp.org/rec/conf/dlog/ArenasBCRS12 }} ==Representability in DL-Lite_R Knowledge Base Exchange== https://ceur-ws.org/Vol-846/paper_60.pdf
Representability in DL-LiteR Knowledge Base Exchange

     M. Arenas1 , E. Botoeva2 , D. Calvanese2 , V. Ryzhikov2 , and E. Sherkhonov3
                          1
                          Dept. of Computer Science, PUC Chile
                              marenas@ing.puc.cl
                2
                  KRDB Research Centre, Free Univ. of Bozen-Bolzano, Italy
                             lastname@inf.unibz.it
                     3
                       ISLA, University of Amsterdam, Netherlands
                             e.sherkhonov@uva.nl



       Abstract. Knowledge base exchange can be considered as a generalization of
       data exchange in which the aim is to exchange between a source and a target con-
       nected through mappings, not only explicit knowledge, i.e., data, but also implicit
       knowledge in the form of axioms. Such problem has been investigated recently
       using Description Logics (DLs) as representation formalism, thus assuming that
       the source and target KBs are given as a DL TBox+ABox, while the mappings
       have the form of DL TBox assertions. In this paper we are interested in the prob-
       lem of representing a given source TBox by means of a target TBox that captures
       at best the intensional information in the source. In previous work, results on
       representability have been obtained for DL-LiteRDFS , a DL corresponding to the
       FOL fragment of RDFS. We extend these results to the positive fragment of DL-
       LiteR , in which, differently from DL-LiteRDFS , the assertions in the TBox and the
       mappings may introduce existentially implied individuals. For this we need to
       overcome the challenge that the chase, a key notion in data and knowledge base
       exchange, is not guaranteed anymore to be finite.


1   Introduction

Knowledge base exchange is an extension of the data exchange setting, where the source
may contain implicit knowledge by which new data may be inferred. The first frame-
work for data exchange with incompletely specified data in the source was proposed
in [3]. This framework is based on the general notion of representation system, as a
mechanism to represent multiple instances of a data schema, and considers the problem
of incomplete data exchanges under mappings constituted by a set of tuple generating
dependencies (tgds), i.e., mappings between pairs of conjunctive queries. Given that the
source data may be incompletely specified, (possibly infinitely) many source instances
are implicitly represented. This framework was refined in [1, 2] to the case where as a
representation system Description Logics (DL) knowledge bases (KBs) were used: the
TBox and the ABox of a DL KB represent implicit and explicit information respec-
tively, and mappings are sets of DL inclusions. While in the traditional data exchange
setting, given a source instance and a mapping specification, (universal) solutions are
target instances derived from the source instance and the mapping, in this case solutions
are target DL KBs, derived from the source KB and the mapping.
    In such a setting, in order to minimize the exchange (and hence transfer and materi-
alization) of explicit (i.e., ABox) information, one is interested in computing universal
solutions that contain as much implicit knowledge as possible. Therefore, the notion of
representability was defined, which helps us in understanding the capacity of (univer-
sal) solutions to transfer implicit knowledge: we say that a source TBox is representable
under a mapping if there exists a target TBox that leads to a universal solution when it is
combined with a suitable ABox computed from the source ABox, independently of the
actual source ABox. Weak representability is concerned with representability under a
mapping extended with assertions that are implied by the given mapping and the source
TBox. (Weak) representability of a source TBox under a mapping implies that the only
knowledge that remains to be transferred explicitly via the (extended) mapping is the
one in the source ABox. Therefore, checking (weak) representability and computing
a representation of a source TBox under a(n extended) mapping turn out to be crucial
problems in the context of KB exchange.
    In [1, 2] the problems of deciding representability and weak-representability, and
of computing a representation for a given mapping and a TBox was tackled for DL-
LiteRDFS , the DL counterpart of RDFS [5] and a member of the DL-Lite family of
DLs [6]. It has been shown that these problems can be solved in polynomial time for
DL-LiteRDFS mappings and TBoxes. Moreover, due to the simplicity of the logic, the
characterization of representations is concise and simple.
    In this paper we extend those results to the case of DL-LiteR without disjointness
                                        pos
assertions, a DL that we call DL-LiteR      . The presence of existential quantifiers on the
right-hand side of concept inclusions makes the problem considerably more compli-
cated than for DL-LiteRDFS . However, we show that also in the presence of existentials
on the right-hand side we are able to decide representability and weak representability
              pos                        pos
of a DL-LiteR     TBox under a DL-LiteR       mapping in polynomial time and to construct
a polynomial size representation.


2     Preliminaries
             pos
2.1   DL-LiteR   Knowledge Bases
The DLs of the DL-Lite family [6] are characterized by the fact that reasoning can be
done in polynomial time, and that data complexity of reasoning and conjunctive query
                                                                          pos
answering is in AC0 . We present now the syntax and semantics of DL-LiteR     , which is
the DL that we adopt here, together with a sub-language of it.
    In the following, we use A and P to denote concept and role names, respectively,
and B and R to denote generic concepts and roles, respectively. The latter are defined
by the following grammar:

                    R ::= P | P −                    B ::= A | ∃R

For a role R, we use R− to denote P − when R = P , and P when R = P − .
               pos
    A DL-LiteR     TBox is a finite set of concept inclusions B v B 0 and role inclusions
        0            pos
R v R . A DL-LiteR       ABox is a finite set of membership assertions of the form A(u)
and P (u, v), where u and v are individuals or labeled nulls. We distinguish between the
two, since individuals are interpreted under the unique name assumption, while labeled
nulls obtain their meaning through assignments (see below). Notice that we include
                                                                                    pos
labeled nulls in ABoxes as they are needed in the exchange of KBs. A DL-LiteR           KB
                                             pos                        pos
K is a pair hT , Ai, where T is a DL-LiteR TBox and A is a DL-LiteR ABox.
                       pos
    Note that DL-LiteR     is the fragment of the DL DL-LiteR studied in [6] without
disjointness assertions on concepts and roles. In DL-LiteR , B and R are called basic
concepts and basic roles, respectively, and for coherence with previous work on the DL-
Lite family, we adopt here this terminology as well. We call DL-LiteRDFS the fragment
            pos
of DL-LiteR     (and hence of DL-LiteR ) in which there are only atomic concepts and
atomic roles on the right-hand side of inclusions.
                                pos
    The semantics of DL-LiteR       is, as usual in DLs, based on the notion of first-order
interpretation I = h∆ , · i, where ∆I is a non-empty domain and ·I is an interpreta-
                       I I

tion function such that: (1) AI ⊆ ∆I , for every concept name A; (2) P I ⊆ ∆I × ∆I ,
for every role name P ; (3) aI ∈ ∆I , for every individual name a; and (4) such that:
                   I
             (∃R)      =    {x ∈ ∆I | there exists y ∈ ∆I s.t. (x, y) ∈ RI }
                   I
             (P − )    =    {(y, x) ∈ ∆I × ∆I | (x, y) ∈ P I }

    Moreover, satisfaction of concept and role inclusions is defined as follows: I |=
                      I                                 I
B v B 0 if B I ⊆ B 0 , and I |= R v R0 if RI ⊆ R0 . Finally, satisfaction of member-
ship assertions is defined as follows. A substitution over an interpretation I is a function
h from individuals and labeled nulls to ∆I such that h(a) = aI for each individual a.
Then (I, h) |= A(u) if h(u) ∈ AI , and (I, h) |= P (u, v) if (h(u), h(v)) ∈ P I .
                                                   pos
    An interpretation I is a model of a DL-LiteR       TBox T if for every α ∈ T , it holds
                                               pos
that I |= α, and it is a model of a DL-LiteR ABox A if there exists a substitution h
over I such that for every α ∈ A, it holds that (I, h) |= α. Finally, I is a model of a
DL-LiteR KB K = hT , Ai if I is a model of both T and A. The set of all models of
K is denoted M OD(K), and K is consistent if M OD(K) 6= ∅. We observe that in DL-
     pos                                                                                  pos
LiteR    one cannot express any form of negative information, and hence a DL-LiteR
KB is always consistent.
    We assume that interpretations satisfy the standard names assumption, that is, we
assume given a fixed infinite set U of individual names, and we assume that for every
interpretation I, it holds that ∆I ⊆ U and aI = a for every individual name a. This
implies that interpretations satisfy the unique name assumption over individual names.
    A signature Σ is a set of concept and role names. An interpretation I is said to be
an interpretation of Σ if it is defined exactly on the concept and role names in Σ. Given
a KB K, the signature Σ(K) of K is the alphabet of concept and role names occurring
in K, and K is said to be defined over (or simply, over) a signature Σ if Σ(K) ⊆ Σ (and
likewise for a TBox T , an ABox A, inclusions B v C and R v Q, and membership
assertions A(u) and P (u, v)).

2.2   Queries, Certain Answers, and Chase
A k-ary query q over a signature Σ, with k ≥ 0, is a function that maps every interpreta-
tion h∆I , ·I i of Σ into a k-ary relation q I ⊆ ∆k . In particular, if k = 0, then q is called
a Boolean query, and q I is either a relation containing the empty tuple () (representing
the value true) or the empty relation (representing the value false). A query q is said to
be a query over a KB K if q is a query over a signature Σ and Σ ⊆ Σ(K).TMoreover, the
answer to q over K, denoted by cert(q, K), is defined as cert(q, K) = I∈M OD(K) q I .
Each tuple in cert(q, K) is called a certain answer for q over K. Notice that if q is
a Boolean query, then cert(q, K) evaluates to true if q I evaluates to true for every
I ∈ M OD(K), and it evaluates to false otherwise.
      In this paper, we adopt the class of unions of conjunctive queries as our main query
formalism. A conjunctive query (CQ) over a signature Σ is a first-order formula of the
form q(x) = ∃y.conj (x, y), where x, y are tuples of variables and conj (x, y) is a
conjunction of atoms of the form: (1) A(t), with A a concept name in Σ and t either
an individual from U or a variable from x or y, or (2) P (t1 , t2 ), with P a role name
in Σ and ti (i = 1, 2) either an individual from U or a variable from x or y. In a CQ
q(x) = ∃y.conj (x, y) over a signature Σ, x is the tuple of free variables of q(x).
Moreover, given an interpretation I = h∆I , ·I i of Σ, the answer of q over I, denoted
by q I , is defined as the set of tuples a of elements from ∆I for which there exist a
tuple b of elements from ∆I such that I satisfies every conjunct in conj (a, b). Finally,
a union of conjunctive queries (UCQ) over a signature Σ is a finite set of CQs over Σ
W have the same free variables. A UCQ q(x)
that
                                                    I
                                                      is S
                                                         interpreted as the first-order formula
                                                                   I
   qi ∈q q i (x), and its semantics is defined as q   =    qi ∈q qi .
                                    pos
      Certain answers in DL-LiteR can be characterized through the notion of chase. We
call a chase a (possibly infinite) set of assertions of the form A(t), P (t, t0 ), where t,
t0 are either individuals from U, or labeled nulls interpreted as not necessarily distinct
                                                                        pos
domain elements (see the definition of the semantics of DL-LiteR            in Section 2.1). For
            pos
DL-LiteR KBs, we employ the notion of restricted chase as defined in [6]. For such a
KB hT , Ai, the chase of A w.r.t. T , denoted chaseT (A), is a chase obtained from A by
adding facts implied by inclusions in T , and introducing fresh labeled nulls whenever
required by an inclusion with ∃R in the right-hand side (see [6] for details).


2.3   Knowledge Base Exchange Framework

Assume that Σ1 , Σ2 are signatures with no concepts or roles in common. Then we
say that an inclusion N1 v N2 is an inclusion from Σ1 to Σ2 , if N1 is a concept or
                                                                                   pos
a role over Σ1 and N2 is a concept or a role over Σ2 . For a DL L (e.g., DL-LiteR      ),
we define an L-mapping (or just mapping, when L is clear from the context) as a tuple
M = (Σ1 , Σ2 , T12 ), where T12 is a TBox in L consisting of inclusions from Σ1 to Σ2 :

(1) C1 v C2 , where C1 , C2 are concepts in L over Σ1 and Σ2 , respectively, and
(2) Q1 v Q2 , where Q1 and Q2 are roles in L over Σ1 and Σ2 , respectively.
                                                pos
If T12 is an L-TBox, for a DL L (e.g., DL-LiteR     ), then M is called an L-mapping.
    The semantics of a mapping is defined in terms of the notion of satisfaction. More
specifically, given a mapping M = (Σ1 , Σ2 , T12 ), an interpretation I of Σ1 and an
interpretation J of Σ2 , pair (I, J ) satisfies TBox T12 , denoted by (I, J ) |= T12 ,
if for each concept inclusion C1 v C2 ∈ T12 , it holds that C1 I ⊆ C2 J , and for
each role inclusion Q1 v Q2 ∈ T12 , it holds that Q1 I ⊆ Q2 J . Moreover, given an
interpretation I of Σ1 , S ATM (I) is defined as the set of interpretations J of Σ2 such
that (I, J ) |=S
               T12 , and given a set X of interpretations of Σ1 , S ATM (X ) is defined as:
S ATM (X ) = I∈X S ATM (I).
    Let M = (Σ1 , Σ2 , T12 ) be a mapping, K1 a KB over Σ1 , and K2 a KB over Σ2 .
 – K2 is called a solution for K1 under M if M OD(K2 ) ⊆ S ATM (M OD(K1 )), and
 – K2 is called a universal solution for K1 under M if M OD(K2 ) =
   S ATM (M OD(K1 )).
    Universal solutions present several limitations, as argued in [1, 2]. First, a universal
                      pos
solution (in DL-LiteR     and DL-LiteR ) does not always exist. Second, if it exists, then
its TBox is trivial (that is, equivalent to the empty TBox). Finally, in the worst case
the smallest universal solution is exponential in the size of the mapping and the source
KB. A notion of solution parametrized w.r.t. a query language was proposed in [1, 2] in
order to overcome these limitations. Such a notion, though weaker, is in line with the
objective of (data and) KB exchange of providing in the target sufficient information to
answer queries that could also be posed over the source.
    Let Q be a class of queries, M = (Σ1 , Σ2 , T12 ) a mapping, K1 = hT1 , A1 i a KB
over Σ1 , and K2 a KB over Σ2 . Then
 – K2 is called a Q-solution for K1 under M if for every query q ∈ Q over Σ2 ,
   cert(q, hT1 ∪ T12 , A1 i) ⊆ cert(q, K2 ), and
 – K2 is called a universal Q-solution for K1 under M if for every query q ∈ Q over
   Σ2 , cert(q, hT1 ∪ T12 , A1 i) = cert(q, K2 ).
The definitions of solutions are illustrated in the following example.
Example 1. Assume Σ1 = {Painting(·), PaintedBy(·, ·), ArtMovement(·, ·)} and
Σ2 = {ArtPiece(·), ArtAuthor (·, ·), HasStyle(·, ·), HasGenre(·, ·)}. Consider map-
ping M = (Σ1 , Σ2 , T12 ), where T12 is the following TBox:

           Painting v ArtPiece                      PaintedBy v ArtAuthor
           Painting v ∃HasGenre                  ArtMovement v HasStyle

Further, assume T1 = {Painting ≡ ∃PaintedBy, Painting v ∃ArtMovement} and
A1 = {Painting(blacksquare)}. Then, a universal solution for the KB K1 = hT1 , A1 i
under M is the KB K2 = hT2 , A2 i, where T2 = ∅ and A2 is the following ABox,
where n01, n02, and m01 are labelled nulls:
      ArtPiece(blacksquare)                        ArtAuthor (blacksquare, n01)
      HasGenre(blacksquare, m01)                   HasStyle(blacksquare, n02)

    Now, consider KB K20 = hT20 , A02 i with non-empty TBox, where T20 =
{ArtPiece ≡ ∃ArtAuthor , ArtPiece v ∃HasStyle, ArtPiece v ∃HasGenre} and
A02 = {ArtPiece(blacksquare)}. Then we have that K20 is a solution for K1 under M.
However, we also have that K20 is not a universal solution for K1 under M. Notably,
both KB K2 and KB K20 are universal UCQ-solutions for KB K1 under mapping M.
    In order to understand the capacity of universal solutions, and also of the query-
languages based notions of solutions to transfer implicit knowledge, the notion of rep-
resentability has been introduced in [1, 2]. Here we adapt that definition to the case
                                                   pos
where the KB is always satisfiable, as in DL-LiteR     . In the definition below we use
chaseT ,Σ (A) to denote the projection of chaseT (A) on the signature Σ.
   Let L be a DL, Q a class of queries, M = (Σ1 , Σ2 , T12 ) an L-mapping, and T1 an
L-TBox over Σ1 . Then,
    – T1 is (Q-)representable under M if there exists an L-TBox T2 over Σ2 , called
      a (Q-)representation of T1 under M, such that for every ABox A1 over Σ1 ,
      hT2 , chaseT12 ,Σ2 (A1 )i is a (Q-)universal solution for hT1 , A1 i under M.
    – T1 is weakly (Q-)representable under M if there exists a mapping M? =
                   ?                       ?                  ?
      (Σ1 , Σ2 , T12 ) such that T12 ⊆ T12   , T1 ∪ T12 |= T12  , and T1 is (Q-)representable
                 ?
      under M .
Example 2. Let M = (Σ1 , Σ2 , T12 ) and T1 be as in Example 1. Then we have that
T2 = {ArtPiece ≡ ∃ArtAuthor , ArtPiece v ∃HasStyle, ArtPiece v ∃HasGenre}
is a UCQ-representation of T1 under M.
     On the other hand, if M0 = (Σ1 , Σ2 , T12  0          0
                                                  ) with T12 = {PaintedBy v ArtPiece},
then we have that T1 is not UCQ-representable under M0 : take ABox A1 =
{Painting(blacksquare)}, then chaseT120 ,Σ2 (A1 ) = ∅ and for no TBox T20 ,
hT20 , chaseT120 ,Σ2 (A1 )i is a universal UCQ-solution for hT1 , A1 i under M0 . However,
                     ?      0
if we consider T12     = T12   ∪ {Painting v ∃ArtAuthor }, we conclude that T1 is weakly
UCQ-representable under M0 since T12        0
                                                ⊆ T12?          0
                                                       , T1 ∪ T12       ?
                                                                   |= T12 and T1 is UCQ-
                             ?                ?
representable under M = (Σ1 , Σ2 , T12 ) (in fact, ∅ is a UCQ-representation of T1
under M? ).

                                              pos
3     Solving UCQ-Representability for DL-LiteR
In this section, we show that the UCQ-representability problem can be solved in poly-
                                                                                      pos
nomial time for the case where TBoxes and mappings are expressed in DL-LiteR              .
                                                                     pos
More specifically, we give a polynomial time algorithm UCQR EP that, given a DL-
     pos                                              pos
LiteR    -mapping M = (Σ1 , Σ2 , T12 ) and a DL-LiteR      -TBox T1 , verifies whether T1
is UCQ-representable under M, and if this is the case computes a UCQ-representation
of T1 under M. Moreover, we also show that this algorithm can be used to solve the
UCQ-representability problem for the case of DL-LiteRDFS . It is important to notice that
the algorithm we present can be used to compute universal UCQ-solutions of polyno-
mial size, which make good use of the source implicit knowledge. Thus, this algorithm
computes solutions with good properties to be used in practice.
    A related problem is that of query inseparability [7], which can be formulated as
follows: given TBoxes T1 and T2 , and a signature Σ, decide whether for each ABox A
over Σ and for each query q over Σ, cert(q, hT1 , Ai) = cert(q, hT2 , Ai). In contrast to
our polynomial result for UCQ-representability, query inseparability has been proved to
be PS PACE-hard for DL-LiteR TBoxes and CQs [7], and an analysis of the proof shows
                                                     pos
that the same lower bound holds already for DL-LiteR     .

3.1     Checking Whether a Given Target TBox is a UCQ-Representation
We start by considering the decision problem associated with UCQ-representability:
Given a DL-LiteR -mapping M = (Σ1 , Σ2 , T12 ), a DL-LiteR -TBox T1 over Σ1 , and
a DL-LiteR -TBox T2 over Σ2 , check whether T2 is a UCQ-representation of T1 under
M, i.e., for each ABox A1 over Σ1 , hT2 , chaseT12 ,Σ2 (A1 )i is a universal UCQ-solution
for hT1 , A1 i under M. This problem can be solved in two steps:
(C1) Check whether for each ABox A1 over Σ1 , hT2 , chaseT12 ,Σ2 (A1 )i is a UCQ-
     solution for hT1 , A1 i under M.
(C2) Check whether for each ABox A1 over Σ1 and for each UCQ q over Σ2 , we have
     that cert(q, hT2 , chaseT12 ,Σ2 (A1 )i) ⊆ cert(q, hT1 ∪ T12 , A1 i).
If both checks succeed, then T2 is a UCQ-representation of T1 under M, otherwise not.
We develop now techniques to perform these two checks in polynomial time.

                                              pos
Checking Condition (C1). For a DL-LiteR           TBox T and a concept or role N ,
we define the upward closure of N w.r.t. T as the set UT (N ) = {N 0 |
N 0 is concept or role and T |= N v N 0 }, and the strict upward closure ST (N )
S UT (N ) \ {N }. Then, for a set N of concepts and roles we define UT (N) =
as
   N ∈N UT (N ), and its strict version ST (N). Notice that both ST12 (UT1 (N )) and
UT2 (SM (N )) are sets over Σ2 , for each concept or role N over Σ1 .
    With these notions in place, we can provide a necessary and sufficient condition for
the satisfaction of condition (C1).
                                                      pos                        pos
Proposition 1. Let M = (Σ1 , Σ2 , T12 ) be a DL-LiteR     -mapping, T1 a DL-LiteR    -
                                   pos
TBox over Σ1 , and T2 a DL-LiteR -TBox over Σ2 . Then T1 , T2 , and M satisfy condi-
tion (C1) iff the following conditions are satisfied:
(A) SM (UT1 (B)) ⊆ UT2 (SM (B)), for each basic concept B over Σ1 ;
(B) SM (UT1 (R)) ⊆ UT2 (SM (R)), for each basic role R over Σ1 ;
(C) for each basic concept B and each basic role R over Σ1 such that ∃R ∈ UT1 (B)
    and SM (UT1 (∃R− )) 6= ∅, we have that
    if SM (UT1 (R)) 6= ∅, then there exists a role QR,B over Σ2 such that
       (CA) ∃QR,B ∈ UT2 (SM (B)),
       (CB) SM (UT1 (R)) ⊆ UT2 (QR,B ), and
       (CC) SM (UT1 (∃R− )) ⊆ UT2 (∃Q−         R,B ),
    and if SM (UT1 (R)) = ∅, either
       (CD) SM (UT1 (∃R− )) ⊆ UT2 (SM (B)),
    or there exist roles Q1R,B , . . . , QnR,B over Σ2 such that
       (CE) ∃Q1R,B ∈ UT2 (SM (B)),
       (CF) T2 |= ∃(Q1R,B )− v ∃Q2R,B , . . . , ∃(Qn−1       −    n
                                                        R,B ) v ∃QR,B , and
                                                      −
       (CG) SM (UT1 (∃R− )) ⊆ UT2 (∃(QnR,B ) ).
It is important to notice that the necessary and sufficient condition in Proposition 1
                                                                               pos
can be checked in polynomial time, as the implication problem for DL-LiteR         can
be solved in polynomial time. In particular, for a basic concept B and a basic role
R over Σ1 such that ∃R ∈ UT1 (B), SM (UT1 (∃R− )) 6= ∅, SM (UT1 (R)) = ∅, and
SM (UT1 (∃R− )) * UT2 (SM (B)), checking the existence of roles Q1R,B , . . . , QnR,B
over Σ2 satisfying conditions (CE), (CF), and (CG) can be reduced to checking reach-
ability in a directed graph. Indeed, for each pair of basic concepts B2 , B20 over Σ2
such that B2 ∈ SM (B) and SM (UT1 (∃R− )) ⊆ UT2 (B20 ), we use the following ap-
proach to check for the existence of the roles Q1R,B , . . . , QnR,B over Σ2 such that
T2 |= B2 v ∃Q1R,B , T2 |= ∃(Q1R,B )− v ∃Q2R,B , . . . , T2 |= ∃(QnR,B )− v B20 . Let
G = (V, E) be the directed graph defined as:

      V = {B2 , B20 } ∪ {Q2 | Q2 is a role in Σ2 }
      E = {(B2 , B20 ) | T2 |= B2 v B20 } ∪ {(B2 , Q2 ) | T2 |= B2 v ∃Q2 } ∪
          {(Q2 , B20 ) | T2 |= ∃Q−       0             0            −      0
                                  2 v B2 } ∪ {(Q2 , Q2 ) | T2 |= ∃Q2 v ∃Q2 }

Then we test for the existence of the roles Q1R,B , . . . , QnR,B by verifying whether B20 is
reachable from B2 in G. If for some pair B2 , B20 the aforementioned two-step test suc-
ceed, then we have that there exist roles Q1R,B , . . . , QnR,B that satisfy conditions (CE),
(CF), and (CG). Otherwise, we know that such roles do not exist.

Checking Condition (C2). We rely on the following result:
                                                      pos                        pos
Proposition 2. Let M = (Σ1 , Σ2 , T12 ) be a DL-LiteR     -mapping, T1 a DL-LiteR    -
                                   pos
TBox over Σ1 , and T2 a DL-LiteR -TBox over Σ2 . Then T1 , T2 , and M satisfy condi-
tion (C2) iff the following conditions are satisfied:
(A) UT2 (SM (B)) ⊆ SM (UT1 (B)) for each basic concept B over Σ1 ;
(B) UT2 (SM (R)) ⊆ SM (UT1 (R)) for each role R ∈ Σ1 ;
(C) for each basic role Q over Σ2 and each basic concept B over Σ1 such that ∃Q ∈
    UT2 (SM (B)) and UT2 (∃Q− ) 6= {∃Q− }, there exists a role RQ,B over Σ1 s.t.
      (CA) ∃RQ,B ∈ UT1 (B) and
      (CB) Q ∈ SM (RQ,B ).
The necessary and sufficient condition in Proposition 2 can be checked in polynomial
                                                                             pos
time, as the implication problem can be solved in polynomial time for DL-LiteR   .
    Thus, given that, by Propositions 1 and 2, both conditions (C1) and (C2) can be
tested in polynomial time, we obtain the following result.
                                                             pos
Theorem 1. The problem of verifying, given a DL-LiteR            -mapping M =
                           pos                               pos
(Σ1 , Σ2 , T12 ), a DL-LiteR -TBox T1 over Σ1 , and a DL-LiteR   -TBox T2 over Σ2 ,
whether T2 is a UCQ-representation of T1 under M can be solved in polynomial time.

3.2   The Algorithm for Computing a UCQ-Representation
In what follows, we present the algorithm UCQR EP pos , which verifies whether a source
TBox T1 is UCQ-representable under a mapping M, and if this is the case returns a
UCQ-representation of T1 under M.
    Intuitively, given a source TBox T1 and a mapping M, algorithm UCQR EP pos con-
structs the best possible candidate T2 for a UCQ-representation of T1 under M (given
the conditions in Propositions 1 and 2), and then checks whether T2 effectively satisfies
                                                                                      pos
the properties required for a UCQ-representation (note, that T2 is indeed a DL-LiteR
TBox). To prove the correctness of this algorithm, we need to show that T1 is UCQ-
representable under M if and only if T2 is a UCQ-representation of T1 under M. This
  Algorithm: UCQR EP pos (T1 , M)
                   pos                                            pos
  Input: A DL-LiteR    -mapping M = (Σ1 , Σ2 , T12 ) and a DL-LiteR   -TBox T1 over Σ1 .
                   pos
  Output: A DL-LiteR   -TBox T2 over Σ2 that is a UCQ-representation of T1 under M, if
          T1 is UCQ-representable under M. The keyword false otherwise.

    1. Let T2 be a TBox over Σ2 defined as:

                    T2 = {N2 v M2 | N1 a basic concept or role over Σ1 ,
                                    N2 ∈ SM (N1 ), M2 ∈ SM (UT1 (N1 ))}

    2. Remove from T2 every inclusion N2 v M2 such that (i) N2 ∈ SM (N1 ) for some
       N1 over Σ1 , and (ii) for every M1 over Σ1 such that M2 ∈ SM (M1 ), it holds that
       T1 6|= N1 v M1 . Moreover, if N2 = ∃R2 and M2 = ∃R20 , then also remove inclusions
                                  −
       R2 v R20 and R2− v R20 from T2 .
    3. Remove from T2 every inclusion of the form either ∃R2− v B2 or R2 v R20 or R2− v
       R20 for roles R2 , R20 and a concept B2 over Σ2 , if there exists a concept B1 over Σ1 such
       that (i) ∃R2 ∈ SM (B1 ), and (ii) for every role R1 over Σ1 such that ∃R1 ∈ UT1 (B1 )
       and R2 ∈ SM (R1 ), it holds that T1 6|= B1 v ∃R1 .
    4. Verify whether T2 is a UCQ-representation of T1 under M. If the test succeeds, return
       T2 , otherwise return false.

                                                                pos                        pos
Fig. 1. Algorithm to compute the UCQ-representation of a DL-LiteR   TBox T1 under a DL-LiteR
mapping M.



is done in the following theorem, where it is also proved that the algorithm works in
polynomial time. The latter is a consequence of the fact that T2 is of polynomial size
in the sizes of T1 and M, and that, by Theorem 1, it is possible to check in polynomial
time whether T2 is a UCQ-representation of T1 under M.

Theorem 2. Algorithm UCQR EP pos is correct and runs in polynomial time.

    The following examples illustrate how algorithm UCQR EP pos works.

Example 3. Let M = (Σ1 , Σ2 , T12 ), where Σ1 = {A1 (·), B1 (·), C1 (·)}, Σ2 =
{A2 (·), B2 (·)}, and T12 = {A1 v A2 , B1 v B2 , C1 v B2 }. Furthermore, assume that
T1 = {B1 v A1 }. Then, in step 1, the algorithm constructs the TBox T2 = {B2 v A2 }.
In step 2, it removes the only axiom from T2 as B2 ∈ SM (C1 ) and T1 6|= C1 v A1 .
In step 3, it does nothing, and finally, at the last step it checks whether the empty
TBox T2 is a UCQ-representation of T1 under M. Since A2 ∈ SM (UT1 (B1 )) and
A2 ∈/ UT2 (SM (B1 )), the algorithm returns false.

Example 4. Let M = (Σ1 , Σ2 , T12 ), where Σ1 = {B1 (·), P1 (·, ·), R1 (·, ·)}, Σ2 =
{A2 (·), B2 (·), R2 (·, ·)}, and T12 = {∃P1− v A2 , B1 v B2 , R1 v R2 }. Furthermore,
assume that T1 = {B1 v ∃P1 , B1 v ∃R1 , ∃R1− v ∃P1− }. Then, in step 1, the al-
gorithm constructs the TBox T2 = {B2 v ∃R2 , ∃R2− v A2 }. It does not remove
anything in steps 2 and 3. Finally, at the last step it successfully checks that T2 is a
UCQ-representation of T1 under M and returns T2 .
3.3   Solving UCQ-Representability for DL-LiteRDFS
It is not difficult to see that if the input of algorithm UCQR EP pos is a DL-LiteRDFS -
mapping M = (Σ1 , Σ2 , T12 ) and a DL-LiteRDFS -TBox T1 over Σ1 , then the set T2
                                             pos
computed by this algorithm is a DL-LiteR         -TBox over Σ2 that can be easily trans-
formed into an equivalent DL-LiteRDFS -TBox. Indeed, T2 might contain inclusions be-
tween basic concepts of the form ∃R2 v ∃R20 , but this occurs only if T2 implies also
the role inclusion R2 v R20 . Hence, all concept inclusions that would fall outside DL-
LiteRDFS are implied by the DL-LiteRDFS fragment of T2 and can be removed from T2
without affecting its semantics. Thus, we conclude that algorithm UCQR EP pos can also
be used to solve in polynomial time the UCQ-representability problem for DL-LiteRDFS
mappings and TBoxes.

                                                   pos
4     Solving Weak UCQ-Representability for DL-LiteR
In this section, we show that also the weak UCQ-representability problem can be solved
                                                                             pos
in polynomial time when TBoxes and mappings are expressed DL-LiteR               . We first need
to introduce some terminology. Given a DL-LiteR -TBox T over a signature Σ and a
UCQ q over Σ, a UCQ qr over Σ is said to be a perfect reformulation of q w.r.t. T if for
every ABox A over Σ, it holds that [6]: cert(q, hT , Ai) = cert(qr , h∅, Ai). That is, the
certain answers to the UCQ q over a KB hT , Ai can be computed by posing the UCQ
qr over the ABox A. It is well-known that every UCQ q admits a perfect reformulation
w.r.t. a DL-LiteR -TBox T , which can be computed in polynomial time [6].
    Interestingly, the fundamental notion of perfect reformulation can be used to
                                                              pos
solve the UCQ-representability problem for DL-LiteR               . More precisely, let M =
(Σ1 , Σ2 , T12 ) be a DL-LiteR -mapping and T1 a DL-LiteR -TBox over Σ1 . Then define a
                                             ?
mapping C OMP(M, T1 ) = (Σ1 , Σ2 , T12         ) that extends M by compiling the knowledge
from T1 into T12 . Formally, for a basic concept or role N over Σ1 , let bq N be the CQ
defined as follows: bq A (x) = A(x), bq ∃P (x) = ∃y.P (x, y), bq ∃P − (x) = ∃y.P (y, x),
bq P (x, y) = P (x, y), and bq P − (x, y) = P (y, x). Then, for every concept inclusion
B v C ∈ T12 and for every CQ q in the perfect reformulation of bq B w.r.t. T1 , include
                   ?
Cq v C into T12      , where Cq is the (unique) basic concept such that bq Cq = q. Also, for
every role inclusion R v Q ∈ T12 and for every CQ q in the perfect reformulation of
                                           ?
bq R w.r.t. T1 , include Rq v Q into T12     , where Rq is the basic role such that bq Rq = q.
                                                                             pos
    It is important to notice that if M = (Σ1 , Σ2 , T12 ) is a DL-LiteR         -mapping and
                pos                                                        ?                 pos
T1 a DL-LiteR -TBox over Σ1 , then C OMP(M, T1 ) = (Σ1 , Σ2 , T12 ) is a DL-LiteR                -
mapping that can be computed in polynomial time in the sizes of M and T1 . Therefore,
given that the set of inclusions defining C OMP(M, T1 ) contains the set of inclusions
                                   ?
defining M and T1 ∪ T12 |= T12       , we conclude that C OMP(M, T1 ) can be used to check
in polynomial time whether T1 is weakly UCQ-representable under M.
                                                pos                          pos
Theorem 3. Let M = (Σ1 , Σ2 , T12 ) be a DL-LiteR   -mapping and T1 a DL-LiteR   -
TBox over Σ1 . Then T1 is weakly UCQ-representable under M if and only if T1 is
UCQ-representable under C OMP(M, T1 ).
    From this result and Theorem 2 we obtain a polynomial time algorithm for solving
                                                   pos
the weak UCQ-representability problem for DL-LiteR     mappings and TBoxes.
                                      pos                      pos
    The example below shows a DL-LiteR    TBox T1 and a DL-LiteR   mapping M such
that T1 is not weakly UCQ-representable under M.
Example 5. Let M = (Σ1 , Σ2 , T12 ), where Σ1 = {P1 (·, ·), B1 (·)}, Σ2 = {A2 (·),
B2 (·)}, and T12 = {B1 v B2 , ∃P1− v A2 }. Furthermore, assume that T1 = {B1 v
∃P1 }. Then T1 is not weakly UCQ-representable under M. In fact, given that the perfect
reformulation of ∃P1− w.r.t. TBox T1 is ∃P1− itself, and likewise for concept B1 , we
have that M = C OMP(M, T1 ) and, thus, T1 is not weakly UCQ-representable under
M, as T1 is not UCQ-representable under M.
Instead, as shown in [1, 2], for each DL-LiteRDFS TBox T1 and DL-LiteRDFS mapping
M, T1 is weakly UCQ-representable under M.

5   Conclusions
In this paper, we have extended previous results on representability in the knowledge
                                  pos
exchange framework to DL-LiteR        , a DL of the DL-Lite family that allows for existen-
tials in the right-hand side of inclusion assertions, both in the source TBox and in the
mapping. We are currently working on extending our results to DL-LiteR , which in-
cludes disjointness assertions, and to the other DLs in the extended DL-Lite family [4].
A further interesting problem that we are investigating is that of checking the existence
                                    pos
of universal solutions for DL-LiteR      and other more expressive DLs.

Acknowledgements. The authors were supported by the EU Marie Curie action FP7-
PEOPLE-2009-IRSES (grant 24761), by Fondecyt (grant 1090565), by the EU ICT
projects ACSI (grant FP7-257593) and FOX (grant FP7-233599), and by the NWO
project DEX (612.001.012).

References
1. Arenas, M., Botoeva, E., Calvanese, D.: Knowledge base exchange. In: Proc. of DL 2011.
   CEUR, ceur-ws.org, vol. 745 (2011)
2. Arenas, M., Botoeva, E., Calvanese, D., Ryzhikov, V., Sherkhonov, E.: Exchanging descrip-
   tion logic knowledge bases. In: Proc. of KR 2012 (2012)
3. Arenas, M., Pérez, J., Reutter, J.L.: Data exchange beyond complete data. In: Proc. of
   PODS 2011. pp. 83–94 (2011)
4. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite family and rela-
   tions. J. of Artificial Intelligence Research 36, 1–69 (2009)
5. Brickley, D., Guha, R.V.: RDF vocabulary description language 1.0: RDF Schema.
   W3C Recommendation, World Wide Web Consortium (Feb 2004), available at
   http://www.w3.org/TR/rdf-schema/
6. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning
   and efficient query answering in description logics: The DL-Lite family. J. of Automated Rea-
   soning 39(3), 385–429 (2007)
7. Konev, B., Kontchakov, R., Ludwig, M., Schneider, T., Wolter, F., Zakharyaschev, M.: Con-
   junctive query inseparability of OWL 2 QL TBoxes. In: Proc. of AAAI 2011. pp. 221–226
   (2011)