=Paper= {{Paper |id=Vol-1193/paper_57 |storemode=property |title=Pushing the CFDnc Envelope |pdfUrl=https://ceur-ws.org/Vol-1193/paper_57.pdf |volume=Vol-1193 |dblpUrl=https://dblp.org/rec/conf/dlog/TomanW14 }} ==Pushing the CFDnc Envelope== https://ceur-ws.org/Vol-1193/paper_57.pdf
                Pushing the CF D nc Envelope

                        David Toman and Grant Weddell

                       Cheriton School of Computer Science
                         University of Waterloo, Canada
                       {david,gweddell}@cs.uwaterloo.ca



      Abstract. We consider the consequences on basic reasoning problems
      of allowing negated primitive concepts on left-hand-sides of inclusion
      dependencies in the description logic dialect CFDnc . Although earlier
      work has shown that this makes CQ answering coNP-complete, we show
      that TBox consistency and concept satisfiability remain in PTIME. We
      also show that knowledge base consistency and instance retrieval remain
      in PTIME if a CFDnc knowledge base satisfies a number of additional
      conditions, and that failing any one of these conditions will alone lead
      to intractability for these problems.


1   Introduction

Dialects of description logic (DLs) with PTIME complexity of reasoning services
have become an important tool in addressing scalability issues in the semantic
web, in particular with regard to SPARQL query evaluation in the context of the
OWL 2 direct semantics entailment regime. Indeed, the W3C has adopted two
DL fragments of OWL 2 that are designed to ensure PTIME data complexity
for the key services of checking for knowledge base (KB) consistency and for
conjunctive query (CQ) answering. Called profiles, the DLs are EL++ [2] and
DL-Lite [1, 4]. Medical ontologies are an important factor in the design of EL++
while DL-Lite is designed to address access to legacy relational data sources in
cases where the underlying relational schema was derived via ER modeling.
    Toman and Weddell have proposed an alternative CF D family of DLs with
PTIME complexity of various reasoning services [8, 14, 15] that were also de-
signed to address the problem of access to legacy relational data sources. The
CF D family is, however, incomparable to the other logics in terms of expressive
power: its focus was to address at least the common varieties of data depen-
dencies in legacy relational data sources, in particular, arbitrary collections of
primary keys, unary foreign keys, and functional dependencies.
    One member of this family, CF Dnc , has PTIME complexity with respect
to the size of a knowledge base for checking KB consistency, and PTIME data
complexity for CQ answering [15]. Notably, CF Dnc retains the ability to capture
data dependencies that are common to the CF D family: support for termino-
logical cycles with universal restrictions over functional roles, and the ability to
capture a rich variety of functional constraints over functional role paths.
    In this paper, we consider the consequences of relaxing a number of syntactic
restrictions in CF Dnc on KB consistency checking and other reasoning tasks. In
particular, we consider the DL CF D∀,¬
                                     nc which now allows value restrictions and
negated primitive concepts to occur on the left-hand-sides of inclusion depen-
dencies that comprise the “metadata”, called a TBox, of a given CF D∀,¬ nc KB.
Our results, in the order established, are as follows.
 – We show that TBox consistency for CF D∀,¬  nc is no longer trivial, unlike
   the case for CF Dnc and many other lightweight logics, but remains in
   NLOGSPACE.
 – We show that CF D∀,¬
                      nc KB consistency is NP-complete and identify precise
   fragments of CF D∀,¬
                     nc where KB consistency remains tractable. Interestingly,
   when negated concepts are allowed on the left-hand side of inclusion depen-
   dencies, this boundary is tied to the structure of keys and to the unique
   name assumption (UNA).
We begin in the next section by introducing the syntax and semantics of CF D∀,¬
                                                                              nc ,
and talk about some of its key features and limitations. In Section 3, we consider
the problem of TBox consistency, and then more general KB consistency in
Section 4. A review of related work and summary comments then follows in
Section 5.

                               ∀,¬
2   The Description Logic CF D nc
                              ∀,¬
A formal definition of CF Dnc     knowledge bases and the above reasoning prob-
lems now follows. Observe that the logic is based on attributes or features denot-
ing unary functions instead of the more common case of roles denoting arbitrary
binary relations. However, this is not really an issue; CF D∀,¬
                                                            nc is ideal for express-
ing reification for predicates of any arity (see comments following).
                     ∀,¬
Definition 1 (CF Dnc     Knowledge Bases) Let F, PC and IN be disjoint sets
of (names of) attributes, primitive concepts and individuals, respectively. A path
function Pf is a word in F∗ with the usual convention that the empty word
is denoted by id and concatenation by “.”. Concept descriptions C and D are
defined by the grammars on the left-hand-side of Figure 1 in which occurrences of
“A” denote primitive concepts. A concept “C : Pf 1 , . . . , Pf k → Pf” produced by
the last production of the grammar for D is called a path functional dependency
(PFD). To avoid undecidability [13], any occurrence of a PFD must also satisfy
a regularity condition by adhering to one of the following two forms:

                     1. C : Pf 1 , . . . , Pf . Pf i , . . . , Pf k → Pf or
                                                                                 (1)
                     2. C : Pf 1 , . . . , Pf . Pf i , . . . , Pf k → Pf .f

A PFD is a key if it adheres to the first of these forms.
Metadata and data in a CF D∀,¬  nc knowledge base K are respectively defined by
a TBox T and an ABox A. Assume A ∈ PC, C and D are arbitrary concepts
given by the grammars in Figure 1, {Pf 1 , Pf 2 } ⊆ F∗ and that {a, b} ⊆ IN.
Then T consists of a finite set of inclusion dependencies of the form C v D,
and A consists of a finite set of facts in the form of concept assertions A(a),
basic function assertions f (a) = b and path function assertions Pf 1 (a) = Pf 2 (b).
  Syntax                               Semantics: “(·)I ”
C ::= A                              AI ⊆ 4
    | ¬C                             4 \ CI
    | ∀ Pf .C                        {x : Pf I (x) ∈ CI }
D ::= A                              AI ⊆ 4
    | ¬A                             4 \ AI
    | ∀ Pf .D                        {x : Pf I (x) ∈ DI }
                                                     Vk
    | C : Pf 1 , . . . , Pf k → Pf   {x : ∀ y ∈ CI . i=1 Pf Ii (x) = Pf Ii (y)) ⇒ Pf I (x) = Pf I (y)}


                                     Fig. 1. CFD∀,¬
                                                nc Concepts.



A is called a primitive ABox if it consists only of concept and basic function
assertions.
Semantics is defined in the standard way with respect to an interpretation
I = (4, (·)I ), where 4 is a domain of “objects” and (·)I an interpretation
function that fixes the interpretation of primitive concepts A to be subsets of
4, attributes f to be total functions on 4, and individuals a to be elements of
4. The interpretation function is extended to path expressions by interpreting
id , the empty word, as the identity function λx.x, concatenation as function
composition, and to derived concept descriptions C or D as defined in Figure 1.
An interpretation I satisfies an inclusion dependency C v D if CI ⊆ DI , a
concept assertion A(a) if aI ∈ AI , a basic function assertion f (a) = b if f I (aI ) =
bI and a path function assertion Pf 1 (a) = Pf 2 (b) if Pf I1 (aI ) = Pf I2 (bI ). I satisfies
a knowledge base K if it satisfies each inclusion dependency and assertion in K,
and also satisfies UNA if, for any individuals a and b occurring in K, aI 6= bI . 2

As usual, allowing conjunction (resp. disjunction) on the right-hand (resp. left-
hand) sides of inclusion dependencies is a simple syntactic sugar.
    The conditions imposed on PFDs in (1) distinguish, e.g., PFDs of the form
C : f → id and C : f → g from PFDs of the form C : f → g.h, and are
necessary to retain PTIME complexity for reasoning problems [9, 13]. However,
this does not impact the modeling utility of CF D∀,¬   nc for formatted legacy data
sources. In particular, it remains possible to capture arbitrary keys or functional
dependencies in a relational schema.
                             ∀,¬
    The grammar for CF Dnc       , unlike CF Dnc , allows an unrestricted use of con-
cept constructors for value restrictions and concept negation on left-hand-sides
of inclusion dependencies. To see that this constitutes a genuine enhancement
of modeling utility, consider an enrollment relation for a hypothetical relational
data source about students and courses. In a CF Dnc TBox, the relation would
be reified as the ENROLLMENT primitive concept with Student and Course
features “typed” as follows:

         ENROLLMENT v (∀Student.STUDENT) u (∀Course.COURSE).
With the added flexibility of value restrictions in CF D∀,¬
                                                        nc , it is now straightfor-
ward to further restrict who may enroll in graduate courses:

           ∀Course.GRADCOURSE v ∀Student.GRADSTUDENT.

The added flexibility relating to negation can be used, e.g., to introduce a “top”
concept, say THING, by asserting the following:

               STUDENT v THING              ¬STUDENT v THING.

Such a concept can then be used, e.g., to enforce general range restrictions for
attributes:
                      THING v ∀Student.STUDENT.
   For reasoning tasks, such as TBox and more general KB consistency, it is
convenient to assume by default, and without loss of generality, that CF D∀,¬
                                                                          nc
knowledge bases are given in a normal form.
                  ∀,¬
Lemma 2 (CF Dnc        KB Normal Form) For every KB (T , A), there is an
equi-satisfiable KB (T 0 , A0 ) in which T 0 adheres to the following (more limited)
                   ∀,¬
grammar for CF Dnc     concept descriptions:

                  C ::= A | ¬A | ∀f.A
                  D ::= A | ¬A | ∀f.A | A : Pf 1 , . . . , Pf k → Pf,

and also where ABox A0 contains only assertions of the form f (a) = b and a = b.
2
Obtaining T 0 and A0 from an arbitrary knowledge base K that are linear in the
size of K is easily achieved by a straightforward conservative extension using
auxiliary names for intermediate concept descriptions and individuals (e.g., see
defn. of simple concepts in [11, 13]). In fact, one can go further and also disallow
value restrictions on left-hand-sides of inclusion dependencies in a normalized
T 0 , e.g., by replacing subsets {∀f.A v B} of T 0 with the following (where A0 is
a fresh primitive concept):

                       {A0 v ∀f.A, ¬A0 v ∀f.¬A, A0 v B}.

    Finally note that, when the UNA is assumed, it becomes necessary to dis-
tinguish the individuals that appear in the original ABox from those introduced
during the normalization and for which, technically, the UNA does not apply.
In particular, occurrences of path function assertions in a CF D∀,¬
                                                                 nc KB K can
indeed influence the computational properties of reasoning problems for K, see
Section 4.


3   TBox Consistency & Concept Satisfiability
Unlike the case for CF Dnc and many other lightweight logics, it is possible for a
     ∀,¬
CF Dnc   TBox T to be inconsistent. In this section, we introduce a NLOGSPACE
procedure for deciding TBox consistency for CF D∀,¬ nc .
Definition 3 Let PC and F be finite sets of primitive concepts and features,
respectively. We define an implication graph over PC and F to be a node and
edge labeled graph G = (N, E) whose nodes are

N = PC ∪ {¬A | A ∈ PC} ∪ {∀f.A | A ∈ PC, f ∈ F} ∪ {¬∀f.A | A ∈ PC, f ∈ F}

                                                                          f
and whose edges are labeled by F ∪ {} and such that L → L, ∀f.A → A, and
       f
¬∀f.A → ¬A are among the edges occurring in E.                                     2
In the following, the nodes of implication graphs are identified with their labels.
We call the pairs (A, ¬A) and (∀f.A, ¬∀f.A) complementary, and, if L is a node
in an implication graph G, write ¬L to denote the node of G for the concept
label that is complementary to L. We call a node L in the implication graph
                                     
G = (N, E) an empty node if L → ¬L ∈ E. We consider paths in G to be
synonymous with words in a finite automaton based on G in which  denotes
the empty transition. Thus we identify, e.g., f  with f with f , etc.

Definition 4 Let G = (N, E) be an implication graph. We say that G is closed
                         
under consequences if L1 → ¬L2 ∈ E whenever for nodes L1 , L2 , and L there is
                           Pf         Pf
a path Pf ⊆ E such that L1 → L and L2 → ¬L.                                 2
It is easy to see from the above that it is sufficient to consider paths of the form
 and f  to obtain the same result.

Definition 5 (Implication Graph for T ) Let T be a CF D∀,¬      nc TBox in nor-
mal form. We define an implication graph for T to be the least graph G = (N, E)
over the primitive concepts and features occurring in T such that
                            
 – if C v D ∈ T then C → D ∈ E, and
 – G is closed under consequences.                                                 2
The implication graph for T makes all implications between literals implied by
                                                               
T explicit: for T |= L1 v L2 it cannot be the case that L1 → L2 6∈ E since
L1 u ¬L2 must be empty, and this can only be enforced in the case where two
paths traversing the same features and originating in L1 and ¬L2 , respectively,
end in a complementary pair of concepts. This yields the following theorem:

Theorem 6 Let T be a CF D∀,¬nc TBox in normal form. Then T is consistent if
and only if no two complementary nodes are empty in the implication graph G
for T .
Proof (sketch): If two complementary nodes, e.g., A and ¬A, are empty then
T |= A v ¬A and T |= ¬A v A; a contradiction.
If no such pair exists then there must be at least one non-empty node. We
define an interpretation I for T to be a complete F-tree whose nodes form the
domain 4 and whose edges provide the interpretation for features in F. We label
the nodes of the tree by sets of literals as follows. First, initialize all labels to
be empty, and then repeat the following: let n be a node that has not been
   Data: A CFD∀,¬    nc TBox T
   initialize N and E as in Definition 3;
                    
   E := E ∪ {C → D | C v D ∈ T };
   repeat
        for L1 , L2 , L ∈ N and Pf ∈ {, f } do
                 Pf       Pf                  
          if L1 → L, L2 → ¬L ∈ E and L1 → ¬L2 6∈ E then
                           
              E := E ∪ {L1 → ¬L2 };
          end
      end
   until E is unchanged ;

           Algorithm 1: Construction of Implication Graph for T


labeled by neither A nor ¬A for some primitive concept A, but whose ancestors
have already been labeled by all primitive concepts or their negations. Then
either A or ¬A must be non-empty (since T is consistent) and neither of the
two literals is implied by any of the existing labels of n (since, otherwise, the
label n would already contain such a literal). Assume A is non-empty. We add
a literal L to the label of the node m if Pf(n) = m (i.e., m is a Pf-descendant
of n) and there is a path Pf from A to L in G (note that the  edges behave
as empty transitions in the construction of the path; hence, e.g., the node n
is labeled by A). The construction of G for T guarantees that no node will be
labeled by complementary literals. Hence, since every node is labeled either by
a primitive concept or by its negation and the labeling respects all constraints
implied by T , it can serve as the interpretation of primitive concepts, completing
the construction.                                                                 2

Observe that, since the above interpretation is a tree, all PFDs in T are satisfied
vacuously and are therefore not taken into account in the above construction
(nor in the definition of an implication graph). It is easy to see that Algorithm 1
constructs an implication graph for TBox T and runs in PTIME since there are
only |T |2 edges that can be added to E. A straightforward but tedious modifi-
cation of Algorithm 1 that (repeatedly) recomputes consequences as additional
edges in the graph on the fly, instead of materializing all consequences as ad-
ditional edges in the graph, can be shown to be in NLOGSPACE; the need to
explore paths in G then yields the following:

Corollary 7 Consistency of CF D∀,¬
                               nc TBoxes is NLOGSPACE-complete.

Whenever T is consistent, the implication graph for T also records which primi-
tive concepts (and/or their negations) are empty in every model, those for which
   
A → ¬A ∈ E.

Corollary 8 Concept satisfiability with respect to CF D∀,¬
                                                       nc TBoxes is NLOG-
SPACE-complete.
4     ABox Reasoning and K Satisfiability

The complexity landscape for ABox reasoning, in particular for knowledge base
consistency, depends crucially on the occurrence of non-key PFDs in T , on path
function assertions in A that are not equivalent to basic function assertions, and
on whether UNA is assumed.


4.1   The Tractable Cases

We first consider KB consistency for primitive ABoxes and TBoxes without
PFDs.

Definition 9 (2-CNF for an ABox) Let T be a consistent TBox without
any mention of the PFD concept constructor and A a primitive ABox. We define
2-CNF(T , A) to be the union of each of the following sets of clauses over the
propositions A(a) and ∀f.A(a), where a is an ABox individual, A a primitive
concept, and f a primitive feature, and where Li stand for propositions or their
negations.

 – {A(a) : A(a) ∈ A}
 – {L1 (a) → L2 (a) : T |= L1 v L2 }
 – {L1 (a) → L2 (b) : f (a) = b ∈ A, T |= L1 v ∀f.L2 }
 – {L1 (b) → L2 (a) : f (a) = b ∈ A, T |= ∀f.L1 v L2 }

Recall that we have shown in Section 3 that any implication questions of the
form T |= L1 v L2 can be read directly from the implication graph G for T
since all implications of this form are explicit in G. Thus, we have the following.

Theorem 10 Let T be a CF D∀,¬     nc TBox without any mention of the PFD con-
cept constructor and A a primitive ABox. Then the knowledge base K = (T , A)
is consistent if and only if T is consistent and 2-CNF(T , A) is satisfiable.
Proof (sketch): The satisfying assignment for 2-CNF(T , A) yields an interpreta-
tion for the ABox of K: aI ∈ AI if the proposition A(a) is true in the assignment.
To complete this interpretation, since ABox individuals may be missing some
features, we follow the construction from Theorem 6.
Conversely, if K is consistent, the class membership of ABox objects yields a
satisfying assignment for 2-CNF(T , A).                                         2

Note that the theorem can be slightly strengthened by allowing general ABoxes
since, without PFD constraints, there is never a need to equate ABox objects
and thus reasoning under UNA is the same as reasoning without UNA.

                 ∀,¬
Corollary 11 CF Dnc  knowledge base consistency is NLOGSPACE-complete.
Adding Keys under UNA. We say that a PFD key constraint L1 v L2 :
Pf 1 , . . . , Pf k → Pf is potentially violated by an ABox A with respect to two
distinct individuals a and b if for all 0 < i ≤ k the path functions Pf 0i (a) and
Pf 0i (b) reach the same object in A, where Pf 0i are the shortest prefixes of Pf i with
that property but where Pf 0i is not a prefix of Pf. Under UNA, such potentially
violated PFDs can only be satisfied if the objects a and b do not simultaneously
belong to descriptions L1 and L2 , respectively, since UNA prevents us from
repairing the violation by equating the objects Pf(a) and Pf(b). Consequently,
one can simply add the following 2-CNF clauses,

 – L1 (a) → ¬L2 (b) and L2 (b) → ¬L1 (a), for all pairs of individuals a, b in A
   and key PFDs L1 v L2 : Pf 1 , . . . , Pf k → Pf in T for which the latter is
   potentially violated by the former w.r.t. a and b,
called KEY-2-CNF(T , A), to the set of clauses 2-CNF(T , A) to account for this
situation (recall Definition 9 above for the latter):

Theorem 12 Let T be a CF D∀,¬ nc TBox with arbitrary occurrences of key PFDs
and let A be a primitive ABox. Then, assuming UNA, the knowledge base
K = (T , A) is consistent if and only if T is consistent and 2-CNF(T , A) ∪
KEY-2-CNF(T , A) is satisfiable.
Proof (sketch): The proof is essentially the same as for Theorem 10 since the in-
terpretation of the ABox makes all PFDs in T satisfied (the KEY-2-CNF(T , A)
clauses guarantee that) and since the additional anonymous objects cannot vio-
late PFDs since corresponding parts of the interpretation are tree shaped.     2

The theorem assumes that the ABox is primitive. However, this condition can
be relaxed without harm to allow path assertions of the form “f (a) = g(b)” for
which the same construction and proof argument would apply.
                   ∀,¬
Corollary 13 CF Dnc    knowledge base consistency is NLOGSPACE-complete
for knowledge bases with primitive ABoxes, key-PFDs, and assuming unique
names.


4.2   The Intractable Cases

Unfortunately, relaxing any of the above conditions leads to intractability. For
each of the cases, we show a reduction of 3-SAT to knowledge base consistency.
Figure 2 illustrates the ABox widgets used to capture the behavior of 3-clauses
in the three respective cases.


The Keys without UNA Case. We reduce 3-CNF satisfiability to KB con-
sistency as follows: let ϕ be a propositional formula in 3-CNF with clauses
C1 , . . . , Ck and propositional variables x1 , . . . , xn . We define a CF D∀,¬
                                                                              nc knowl-
edge base Kϕ = (T , A) as follows:

 1. For each propositional variable xi in ϕ, introduce an A individual xi .
 xi1               xi2                   xi3   xi1                 xi2                  xi3      xi1                    xi2                      xi3

  lj,1             lj,2                 lj,3   lj,1                lj,2                 lj,3         lj,1               lj,2                 lj,3
                                                                                                                                
         aj                    bj       :B               aj                    bj                           aj                     bj
                                                     g                                  g                   g                           g
               f           f                                   f           f
                                                                                                                             
                    cj                                              cj                          :B                                          :B

                                                                                                                    f          f
                                                                                                                          
                                                                                                                         cj

 The Key w/o UNA Case                                The General FD Case                             The Non-unit Paths Case

                           Fig. 2. 3-SAT Reduction Widgets for Clause Cj .


2. For each clause Cj introduce individuals aj , bj , and cj .
3. For the variables xi1 , xi2 , and xi3 that appear in a clause Cj , include the
   following basic function assertions in A: lj,1 (xi1 ) = aj , lj,2 (xi2 ) = aj , and
   lj,3 (xi3 ) = bj .
4. Add assertions f (aj ) = cj and f (bj ) = cj to A.
5. Add concept assertion B(bj ) to A.
6. Add the following inclusion dependencies to T :
     – T v ∀lj,1 .¬A and ¬T v ∀lj,1 .A when xi1 appears positively in Cj , and
         T v ∀lj,1 .A and ¬T v ∀lj,1 .¬A when xi1 appears negatively in Cj ;
     – T v ∀lj,2 .B and ¬T v ∀lj,2 .¬B when xi2 appears positively in Cj , and
         T v ∀lj,2 .¬B and ¬T v ∀lj,2 .B when xi2 appears negatively in Cj ; and
     – T v ∀lj,3 .¬A and ¬T v ∀lj,3 .A when xi3 appears positively in Cj , and
         T v ∀lj,3 .A and ¬T v ∀lj,3 .¬A when xi3 appears negatively in Cj .
7. Finally, add the key PFD A v A : f → id to T .

A truth assignment to the propositions xi occurring in ϕ is then encoded by
an interpretation I of Kϕ : xIi ∈ T I is where xi is true, and xIi ∈ ¬T I is
where xi is false. With this in mind, it is straightforward to confirm that any
interpretation of Kϕ will only encode a satisfying truth assignment and that, in
turn, an interpretation of Kϕ can always be constructed by a satisfying truth
assignment (see the proof sketch to Lemma 14 below).


The General Functional Dependencies Case. Allowing functional depen-
dencies (i.e., non-key PFDs) even under UNA also leads to intractability. The
reduction shares the first four steps with the previous case:

5. Add concept assertion ∀g.B(bj ) to A.
6. Add the following to T :
    – T v ∀lj,1 .¬A and ¬T v ∀lj,1 .A when xi1 appears positively in Cj , and
      T v ∀lj,1 .A and ¬T v ∀lj,1 .¬A when xi1 appears negatively in Cj .
     – T v ∀lj,2 .∀g.B and ¬T v ∀lj,2 .∀g.¬B when xi2 appears positively in Cj ,
       and T v ∀lj,2 .∀g.¬B and ¬T v ∀lj,2 .∀g.B when xi2 appears negatively
       in Cj ; and
     – T v ∀lj,3 .¬A and ¬T v ∀lj,3 .A when xi3 appears positively in Cj , and
       T v ∀lj,3 .A and ¬T v ∀lj,3 .¬A when xi3 appears negatively in Cj .
 7. A v A : f → g ∈ T .

The Non-unit Path Agreements Case. Similarly, allowing non-primitive
ABoxes with path function assertions even under UNA leads to intractability.
The reduction again shares the first three steps with the previous cases:
 4. Add g.f (aj ) = cj , g.f (bj ) = cj .
 5. Add ∀g.B(bj ) to A.
 6. Add the following to T :
     – T v ∀lj,1 .∀g.¬A and ¬T v ∀lj,1 .∀g.A when xi1 appears positively in Cj ,
       and T v ∀lj,1 .∀g.A and ¬T v ∀lj,1 .∀g.¬A when xi1 appears negatively
       in Cj .
     – T v ∀lj,2 .∀g.B and ¬T v ∀lj,2 .∀g.¬B when xi2 appears positively in Cj ,
       and T v ∀lj,2 .∀g.¬B and ¬T v ∀lj,2 .∀g.B when xi2 appears negatively
       in Cj ; and
     – T v ∀lj,3 .∀g.¬A and ¬T v ∀lj,3 .∀g.A when xi3 appears positively in Cj ,
       and T v ∀lj,3 .∀g.A and ¬T v ∀lj,3 .∀g.¬A when xi3 appears negatively
       in Cj .
 7. Finally, add the key PFD A v A : f → id to T .
Note the use of paths of length 2 to in step 4 to construct anonymous objects
that evade UNA. In all three cases it is easy to confirm that:

Lemma 14 Let ϕ be a propositional formula in 3-CNF. Then ϕ is satisfiable if
and only if Kϕ is consistent.
Proof (sketch): It is easy to verify that a satisfying assignment for ϕ can be
used to create a model for K, essentially by assigning the class membership of
the individuals xij to the primitive concepts T and ¬T and then extending this
assignment to the aj , bj and cj individuals as dictated by the constraints in T .
On the other hand, a model for K (when restricted to each widget representing
a clause) guarantees that at least one of the xi1 , xi2 , and xi3 will belong to the
concept T : the concept B is used to make the widget corresponding to a 3-CNF
clause unsatisfied exactly when the clause itself is unsatisfied: an interpretation
assigning all of xi1 , xi2 , and xi3 to ¬T forces the two individuals connected by
the dashed lines in the widgets in Figure 2 to be equal as a consequence of the
PFD and to belong to B and ¬B at the same time, thus disqualifying such
interpretations as models of K.                                                   2
Hence, knowledge base consistency without assuming UNA or when allowing
functional dependencies in T or non-primitive path equalities in A is NP-complete:
membership in NP is established by guessing primitive classes and equalities for
ABox individuals.
Instance Retrieval. Since CF D∀,¬ nc is closed under negation, instance retrieval,
the question K |= C(a) reduces naturally to knowledge base (in)consistency, i.e.,
K |= C(a) if and only if (T , A ∪ {¬C(a)}) is inconsistent (for K = (T , A)) and
therefore inherits the computational properties of testing for consistency:
 – CF D∀,¬
        nc instance retrieval is NLOGSPACE-complete for primitive ABoxes
   and key PFDs under UNA, and
 – coNP-complete in all other cases.
The instance retrieval result cannot, however, be extended to conjunctive queries.
In particular, it is a straightforward consequence of results of Schaerf that al-
lowing negated primitive concepts on left-hand-sides of inclusion dependencies
in CF Dnc alone makes CQ answering coNP-complete [10].


5   Related Work and Summary Comments

PFD-based constraints were first introduced and studied in the context of graph-
oriented data models (similar to RDF) and its refinements [7, 16]. Subsequently,
an FD concept constructor was proposed and incorporated in Classic [3], an early
DL with PTIME reasoning capabilities, without changing the complexity of its
implication problem. On the other hand, removing the restrictions imposed on
PFDs (see Section 2 (1)), makes logical implication EXPTIME-complete [9] and
general reasoning, in particular knowledge base consistency, undecidable [13].
    Logics in the CF D family that allow conjunctions on left of subsumptions
[14] cannot support tractability in the presence of, e.g., disjointness. The most
notable cases are as follows.
 – Allowing conjunction “u” yields the logic CF D⊥ and therefore makes rea-
   soning PSPACE-complete [14].
 – Allowing conjunction and value restriction “∀” makes reasoning EXPTIME-
   complete even in the absence of negation (or the empty concept ⊥) [9].
Adding various forms of functional dependencies and keys to other DLs—usually
as additional separate varieties of constraints, often called a key box—have been
considered, e.g., by [6]. They show that their dialect is undecidable for description
logics with inverse roles, but becomes decidable when unary functional depen-
dencies are disallowed. This line of investigation is continued in the context of
PFDs and inverse features, with analogous results [12]. Subsequently, Calvanese
et al. have shown how DL-Lite can be extended with a path-based variety of
identification constraints analogous to PFDs without affecting the complexity of
reasoning problems [5].
    In this paper, we have considered extensions to the logic CF Dnc and their
impact on the complexity of reasoning problems. CF D∀,¬      nc and its various re-
stricted fragments introduced in this paper have served as vehicles for this study.
Most significantly, we have shown that instance retrieval remains in PTIME for
CF D∀,¬
      nc under UNA, provided that ABoxes are primitive and that PFDs oc-
curring in TBoxes are keys. Thus, basic graph pattern (BGP) evaluation over
CF D∀,¬
      nc knowledge bases satisfying these conditions is also in PTIME.
References
 1. Alessandro Artale, Diego Calvanese, Roman Kontchakov, and Michael Za-
    kharyaschev. The DL-Lite family and relations. J. Artificial Intelligence Research,
    36:1–69, 2009.
 2. Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL Envelope. In
    Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), pages 364–369, 2005.
 3. Alexander Borgida and Grant Weddell. Adding Uniqueness Constraints to De-
    scription Logics (Preliminary Report). In International Conference on Deductive
    and Object-Oriented Databases, pages 85–102, 1997.
 4. Diego Calvanese, Giuseppe de Giacomo, Domenico Lembo, Maurizio Lenzerini,
    and Riccardo Rosati. Tractable Reasoning and Efficient Query Answering in De-
    scription Logics: The DL-Lite Family. J. of Automated Reasoning, 39(3):385–429,
    2007.
 5. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini,
    and Riccardo Rosati. Path-Based Identification Constraints in Description Logics.
    In Principles of Knowledge Representation and Reasoning, pages 231–241, 2008.
 6. Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini. Identification
    Constraints and Functional Dependencies in Description Logics. In Proc. Int.
    Joint Conf. on Artificial Intelligence (IJCAI), pages 155–160, 2001.
 7. Minoru Ito and Grant Weddell. Implication Problems for Functional Constraints
    on Databases Supporting Complex Objects. Journal of Computer and System
    Sciences, 49(3):726–768, 1994.
 8. Vitaliy L. Khizder, David Toman, and Grant Weddell. Reasoning about Duplicate
    Elimination with Description Logic. In Rules and Objects in Databases (DOOD,
    part of CL’00), pages 1017–1032, 2000.
 9. Vitaliy L. Khizder, David Toman, and Grant Weddell. On Decidability and
    Complexity of Description Logics with Uniqueness Constraints. In Int. Conf. on
    Database Theory ICDT’01, pages 54–67, 2001.
10. Andrea Schaerf. On the complexity of the instance checking problem in concept
    languages with existential quantification. J. Intell. Inf. Syst., 2(3):265–278, 1993.
11. David Toman and Grant Weddell. On Keys and Functional Dependencies as First-
    Class Citizens in Description Logics. In Proc. of Int. Joint Conf. on Automated
    Reasoning (IJCAR), pages 647–661, 2006.
12. David Toman and Grant E. Weddell. On the interaction between inverse features
    and path-functional dependencies in description logics. In Proc. Int. Joint Conf.
    on Artificial Intelligence (IJCAI), pages 603–608, 2005.
13. David Toman and Grant E. Weddell. On keys and functional dependencies as
    first-class citizens in description logics. J. Autom. Reasoning, 40(2-3):117–132,
    2008.
14. David Toman and Grant E. Weddell. Applications and extensions of PTIME
    description logics with functional constraints. In Proc. Int. Joint Conf. on Artificial
    Intelligence (IJCAI), pages 948–954, 2009.
15. David Toman and Grant E. Weddell. Conjunctive Query Answering in CFDnc :
    A PTIME Description Logic with Functional Constraints and Disjointness. In
    Australasian Conference on Artificial Intelligence, pages 350–361, 2013.
16. Grant Weddell. A Theory of Functional Dependencies for Object Oriented Data
    Models. In International Conference on Deductive and Object-Oriented Databases,
    pages 165–184, 1989.