Conjunctive Query Answering in EL using a Database System Carsten Lutz1 , David Toman2 , and Frank Wolter3 1 Fachbereich Informatik Universität Bremen, Germany clu@informatik.uni-bremen.de 2 D.R. Cheriton School of Computer Science University of Waterloo, Canada david@uwaterloo.ca 3 Department of Computer Science University of Liverpool, UK frank@csc.liv.ac.uk Abstract. We study conjunctive query answering in the description logic EL, the core of the designated OWL2-EL profile of OWL2. In particular, we present an approach that allows the use of conventional relational database technology for conjunctive query answering in EL. This approach is inspired by the OWL support in Oracle Database 11g, semantic technologies. 1 Introduction The OWL ontology language is currently being revisited by a W3C working group with the goal of designing a revised and extended version of OWL, called OWL2 [1]. The new recommendation aims to include a number of profiles, frag- ments of OWL2 that trade expressive power for various other desirable properties not enjoyed by the full language. At the time of writing, there were three pro- files designated to become part of the OWL2 recommendation: OWL2-EL based on the description logic EL++ [4, 5], OWL2-QL based on the description logic DL-LiteA [7], and OWL2-RL inspired by the pD∗ semantics for OWL [17]. Each of these fragments aims at achieving a different benefit while simultaneously maximizing expressive power. In the case of OWL2-EL, the design goal is clas- sification of ontologies in polynomial time. This profile has already been widely adopted by large-scale ontologies in which efficiency of reasoning plays a central role. In particular, many life-science ontologies such as SNOMED CT and NCI are formulated in OWL2-EL [16, 15]. In contrast to OWL2-EL, the design goal of both OWL2-QL and OWL2-RL is to enable the use of database technology for query answering over OWL in- stance data: many applications access considerable amounts of such data, often in the range of millions of objects. This makes it necessary to store the data in secondary storage and to develop highly efficient querying mechanisms. Re- lational database systems often satisfy these needs and benefit from decades of research in academia and industry. Thus they are natural candidates for im- plementing a query processing solution for OWL. However, employing such an approach is rather challenging due to significant differences between relational database systems and OWL: first, relational systems adopt a closed-world se- mantics, i.e., all facts that are not explicitly stated to be true are assumed to be false. In contrast, OWL is based on an open world semantics which does not re- quire one to fix the truth value of every fact and is more similar to an incomplete database [10]. Second, database systems are unaware of the intensional part of an OWL ontology (called henceforth the TBox). In OWL2-QL and OWL2-RL these difficulties are addressed in different ways. In a nutshell, the OWL2-QL approach suggests employing a backward chaining approach while OWL2-RL suggests forward chaining. In both cases the instance data (called the ABox) is stored as a relational database instance. In OWL2-QL, a query q is rewritten into a new query q ∗ that takes into account the TBox. This rewriting is independent of the data. The resulting query q ∗ is passed to the database system to retrieve answers. In this way, OWL2-QL allows the use of an off-the-shelf relational database system with the help of an external query rewriting component. In contrast, query answering in OWL2-RL relies on a data preprocessing phase. This phase is initiated after each modification of the instance data and adds data that is entailed by the explicitly given data and the TBox. Data preprocessing is independent of the queries to be answered. To actually answer a query, it then suffices to pass it to the relational query engine without rewriting. The OWL2-RL approach thus suggests that the data management facilities of the database system should be modified to incorporate a data preprocessing phase, but the query engine is left untouched. Notably, OWL2-RL and the described approach to query answering is used in the Oracle Database 11g semantic technologies [18]. In this paper, we study conjunctive query answering in the description logic EL using relational database systems. Since EL can be viewed as the core of the OWL2-EL profile and is successfully used in the design of large-scale ontologies, adding efficient querying capabilities is of immediate practical use, e.g., in ap- plications of OWL in medical informatics in which an ABox is used to describe medical records and where a medical EL ontology, such as SNOMED CT, assigns a meaning to the medical terms used in these records [12, 13]. Unsurprisingly, very large ABoxes are frequently encountered in such applications and efficient querying is a crucial requirement. Conjunctive query answering in EL has also been studied in [9, 14, 8], but not in the context of database systems. The pure backward chaining approach of OWL2-QL cannot be applied to EL since that approach is limited to DLs for which the data complexity of con- junctive query answering is in LogSpace, while it has been shown that EL is PTime-hard in this respect [6]. Pure forward chaining is not applicable either because it may cause the introduction of infinitely many new database objects (due to the use of existential restrictions on the right-hand side of concept inclu- sions in a TBox). For these reasons, we present a novel approach to conjunctive query answering that has some similarity to both of the approaches above. It can be summarized as follows similarly to Oracle 11g semantic technologies, we use a data preprocessing phase that adds new data to the database. This phase depends on the ABox and the TBox, but not on the queries to be answered. The data added during our preprocessing phase is auxiliary: it is only used in- ternally by the ontology-aware database system for query answering and hidden from the user. To answer a query, we first rewrite it and then pass it on to the database query engine. Hence, our approach shares the main virtues of the two approaches for OWL2-QL and OWL2-RL above; in particular it leaves the query engine of the relational database system unchanged. Technically, our approach relies on the fact that, in EL, conjunctive queries can be answered by considering a single canonical model. During the preprocessing phase, we complete the data in the database to such a model. The main challenge of pursuing this approach is to work with a finite canonical model while still giving correct answers to all conjunctive queries. It is worth mentioning that the query rewriting in our approach is of a rather different nature than the rewriting done for OWL2-QL: it does not correspond to backward chaining and is independent of the data and of the TBox. An- other significant difference from OWL2-QL is that data preprocessing needs only quadratic time and query rewriting needs only linear time. In contrast, the query rewriting step of OWL2-QL may induce an exponential blowup. The remainder of this paper is organized as follows. In Section 2, we intro- duce the description logic EL and conjunctive query answering. Section 3 then provides an overview of our approach, introduces canonical models, and provides a first example. The data preprocessing phase is described in Section 4 and query rewriting is discussed in Section 5. In Section 6, we conclude with pointing out future research issues. 2 Preliminaries We briefly introduce the description logic EL and conjunctive queries. In EL, concepts are built according to the syntax rule C ::= A | > | C u D | ∃r.C where, here and in the remaining paper, A ranges over concept names taken from a countably infinite set NC , r ranges over role names taken from a countably infinite set NR , and C, D range over concepts. A TBox is a finite set of concept inclusions C v D, and an ABox is a finite set of concept assertions A(a) and role assertions r(a, b), where a, b range over a countably infinite set NI of individual names. A knowledge base is a pair (T , A) with T a TBox and A an ABox. As usual, an interpretation is a pair (∆I , ·I ) with ∆I a non-empty domain and ·I an interpretation function that maps each concept name A to a subset AI ⊆ ∆I , each role name r to a binary relation rI ⊆ ∆I × ∆I , and each individual name a to an element aI ∈ ∆I The interpretation function is extended to composite concepts by setting >I = ∆ I (C u D)I = C I ∩ DI (∃r.C)I = {d ∈ ∆I | ∃e ∈ ∆I : (d, e) ∈ rI ∧ e ∈ C I }. An interpretation I satisfies a concept inclusion C v D if C I ⊆ DI , a concept assertion A(a) if aI ∈ AI , and a role assertion r(a, b) if (aI , bI ) ∈ rI . I is a model of a TBox T (ABox A) if it satisfies all concept inclusions in T (assertions in A). It is a model of a knowledge base K = (T , A) if it is a model of T and A. For a concept inclusion or assertion α, we write K |= α if α is satisfied in all models of K. If empty, A is simply omitted. Let NV be a countably infinite set of variables. Together, the set of NV of vari- ables and the set NI of individual names form the set NT of terms. A conjunctive query is an expression of the form ∃u.ϕ(v, u), where – u = u1 , . . . , un and v = v1 , . . . , vm are vectors of variables and – ϕ is a conjunction of concept atoms A(t) and role atoms r(t, t0 ), where A ranges over concept names, r over role names, and t, t0 are terms from the set v ∪ u ∪ NI . The variables in u are called quantified variables, and the variables in v are answer variables. We call the query k-ary if there are k answer variables, use var(q) to denote the set of all variables in q, qvar(q) for the set of quantified variables, avar(q) for the set of answer variables, and term(q) for the terms in q. Slightly abusing notation, we write α ∈ q if the concept or role atom α occurs in q. Let I be an interpretation and q = ∃u.ϕ(v, u) a conjunctive query. A match for I and q is a mapping π : term(q) → ∆I such that π(a) = aI for all a ∈ term(q) ∩ NI and all atoms in q are satisfied, i.e., – π(t) ∈ AI for all A(t) ∈ q and – (π(t), π(t0 )) ∈ rI for all r(t, t0 ) ∈ q. If v = v1 , . . . , vk with π(vi ) = aIi for 1 ≤ i ≤ k, then π is called an (a1 , . . . , ak )- match for I and q. If such a match exists, we write I |= q[a1 , . . . , ak ]. A certain answer for a k-ary conjunctive query q and a knowledge base K is a tuple (a1 , . . . , ak ) such that a1 , . . . , ak occur in K and, for each model I of K, there is an (a1 , . . . , ak )-match for I and q. We use cert(q, K) to denote the set of all certain answers for q and K. This defines the querying problem studied in this paper: given an EL-knowledge base K and a conjunctive query q, we want to compute cert(q, K). Observe that the definition of certain answers reflects the open world semantics of ABoxes: we quantify over all possible models of K instead of treating the ABox itself as a model. Without loss of generality, we admit only concept names in a conjunctive query, instead of composite concepts. Indeed, it is easily possible to unfold an EL-concept into the query. For example, the query ∃u.r(u, v) ∧ (A u ∃s.B)(u) can be unfolded into ∃u, u0 .r(u, v) ∧ A(u) ∧ s(u, u0 ) ∧ B(u0 ). In the remainder of this paper, we make the unique name assumption, i.e., we assume that aI 6= bI for all interpretations I and all a, b ∈ NI with a 6= b. The following (easy to prove) lemma shows that assuming UNA does not impact the certain answers. Lemma 1. Let K be a knowledge base, q a k-ary conjunctive query, and let a1 , . . . , ak ∈ NI . If there is a model I of K and an (a1 , . . . , ak )-match for I and q, then there is a model I 0 of K that respects the UNA and an (a1 , . . . , ak )- match for I 0 and q. 3 Canonical Models Every EL knowledge base K = (T , A) has a canonical model IK that enjoys many pleasant properties [2, 11]. To define IK , let sub(T ) denote the set of all subconcepts of concepts used in T , Ind(A) the set of individual names that occur in A, and set NIaux := {xC | C ∈ sub(T )}. Then ∆IK := Ind(A) ∪ NIaux AIK := {a ∈ ind(A) | K |= A(a)} ∪ {xC ∈ NIaux | K |= C v A} rIK := {(a, b) ∈ ind(A) × ind(A) | r(a, b) ∈ A}∪ {(a, xC ) ∈ ind(A) × NIaux | K |= ∃r.C(a)}∪ {(xC , xD ) ∈ NIaux × NIaux | K |= C v ∃r.D} aIK := a for all a ∈ ind(K) Among the pleasant properties of IK is that it can be used to answer instance queries. Such a query has the form C(a) and is entailed by a knowledge base K if aI ∈ C I for all models I of K. It is well-known that C(a) is entailed by K iff a ∈ C IK , and thus we can answer instance queries w.r.t. K by looking only at the single model IK [4]. Since A can be viewed as being a fragment of IK , this observation suggests that we can use a database system to answer instance queries as follows: we store A as a database, complete A to IK in a preprocessing phase, and then just pass instance queries to the database system. In principle, it is the same approach that we want to use for conjunctive queries. Unfortunately, for conjunctive queries q it is not true that (a1 , . . . , ak ) ∈ cert(q, K) iff IK |= q[a1 , . . . , qk ] for all a1 , . . . , ak ∈ NI with k the arity of q. To see this, let us consider four examples. First, take the knowledge base K1 = (T1 , A1 ) with T1 = {A v ∃r.B} A1 = {A(a), A(b), r(a0 , c0 ), r(b0 , c0 )} q1 = ∃u.r(v, u) ∧ r(v 0 , u) We have (a, xB ) ∈ rIK1 and (b, xB ) ∈ rIK1 , and thus IK1 |= q[a, b]. However, (a, b) ∈ / cert(q1 , K1 ) because it is easy to find a model I of K1 with I 6|= q1 [a, b]. Second, take T2 = {A v ∃r.B u ∃s.B} A2 = {A(a)} q2 = ∃u.r(v, u) ∧ s(v, u) Similarly to our first example, it is not hard to see that IK2 |= q2 [a], but a ∈ / cert(q2 , K2 ). Our third example is K3 = (T3 , A3 ) with T3 = {A v ∃r.B, B v ∃s.B} A3 = {A(a)} q3 = ∃u.r(v, u) ∧ s(u, u) Again, IK3 |= q3 [a], but a ∈/ cert(q3 , K3 ). All the three examples above show situations in which matches can be found in the canonical model, matches that do not exist in the unraveling of this model. Our last example is K4 = (T4 , A4 ) with T4 = {A v A} A4 = {B(a)} q4 = ∃u.B(v) ∧ A(u) illustrates a different difficulty: while xA ∈ AIK4 , in the unraveled model the interpretation of A is empty: hence, IK4 |= q4 [a], but a ∈ / cert(q4 , K4 ). This situation can be detected by observing that the value x4 is not reachable from a by a role chain in IK4 . In principle, all of these problems can be overcome by replacing IK with its unraveling into a less constrained, tree-like model. In the following, we introduce unraveling as a general operation on models of a knowledge base. Let K be a knowledge base and I a model of K. We use Ind(A)I to denote the set {aI | a ∈ Ind(A)}. A path in I is a finite sequence d0 r1 d1 · · · rn dn , n ≥ 0, where d0 ∈ Ind(A)I and, for all i < n, (di , di+1 ) ∈ ri+1 I . We use paths(I) to denote the set of all paths in I. If p ∈ paths(I), then tail(p) denotes the last element dn in p. Now the unraveling J of I is defined as follows: ∆J := paths(I) aJ := aI AJ := {p | tail(p) ∈ AI } rJ := {(d, e) | d, e ∈ Ind(A)I ∧ (d, e) ∈ rI } ∪ {(p, p · re) | p, p · re ∈ ∆I } where “·” denotes concatenation. The following result is proved in [8] for the extension of EL with inverse and functional roles and for the case of 0-ary queries. An extension to k-ary queries is straightforward. Lemma 2. Let K be a knowledge base and UK the unraveling of IK . For all k- ary conjunctive queries q and individual names a1 , . . . , ak , we have (a1 , . . . , ak ) ∈ cert(q, K) iff UK |= q[a1 , . . . , ak ]. Alas, UK may be infinite and thus we cannot use it as a database. The solution is to continue working with IK , but to rewrite q into q ∗ such that each match of q ∗ in IK can be reproduced as a match of q in UK and vice versa, thus IK |= q ∗ [a1 , . . . , ak ] iff UK |= q[a1 , . . . , ak ] for all a1 , . . . , ak ∈ NI . We defer a formal definition and concrete examples of this rewriting to Section 5. 4 Data Preprocessing We describe the data preprocessing phase of our approach to conjunctive query answering in EL, where we insert additional, auxiliary data. The purpose of this phase is to complete the ABox A initially stored in the database to (a fragment of) the canonical model IK . In principle, the auxiliary data has to be updated each time that A and T are modified (for the purposes of query answering, however, it makes sense to assume that T is essentially immutable). To keep the presentation simple, we concentrate on the case where the auxiliary data is computed from scratch. In an actual implementation, it would be preferable to modify the auxiliary data in an incremental way when data is inserted to or deleted from A. Let K = (T , A) be the knowledge base over which queries are to be an- swered. We store the ABox A as data in a database system in the obvious way: individual names become database objects, concept names become unary tables, and role names become binary tables. When we understand A as a database, we denote it with Adb and use ans(q, Adb ) to denote the set of answers that a relational database system returns for q over the database Adb . As explained in the previous section, cert(q, K) and ans(q, Adb ) are not identical in general. The precompletion phase extends Adb to a database A∗db that represents (a fragment of) the canonical model IK . In the following, we present details of how this completion can be achieved using the query mechanism of a relational system. In A∗db , we use the elements of NIaux as additional database objects, assuming that NIaux ∩ NI = ∅. We also introduce an additional unary database table Aux that is used to identify elements of NIaux and distinguish them from individual names in A.4 First, we use a reasoner such as CEL [3] to compute the canonical model of the TBox T defined as follows: ∆IT := NIaux AIT := {xC ∈ NIaux | T |= C v A} rIT := {(xC , xD ) ∈ NIaux × NIaux | T |= C v ∃r.D} This model should not to be confused with the canonical model IK of the knowl- edge base K; in fact, IT is obviously a fragment of IK . It can be computed in polynomial time and its size is at most quadratic in the size of T . To properly insert IT into the database, we need a notion of reachability between domain elements of IT . For xC , xD ∈ ∆IT , we say that xD is reachable from xC if there are xC0 , . . . , xCn ∈ ∆IT , n ≥ 0, such that xC0 = xC , xCn = xD , and for all i < n, 4 Thus, Aux denotes the complement of Ind(A)IK , as introduced in Section 3. for all C, A ∈ sub(T ) such that for some C v D ∈ T , A is a conjunct of D do (i) extend the table A with ans(qC , Adb ) for all C, ∃r.D ∈ sub(T ) such that for some C v E ∈ T , ∃r.D is a conjunct of E do (i) if ans(qC , Adb ) 6= ∅ then (i) extend the table r with {(a, xD ) | a ∈ ans(qC , Adb )} extend the table Aux with Reach(xD ) extend the table A with AIT ∩ Reach(xD ), for each concept name A extend the table s with sIT ∩ (Reach(xD ) × Reach(xD )), for each role name s endif Fig. 1. The central procedure for data preprocessing. there is an r ∈ NR with (xCi , xCi+1 ) ∈ rIT . We use Reach(xC ) to denote the set of those xD ∈ ∆IT that are reachable from xC (including xC itself). Now, the (1) (k) data preprocessing phase constructs a sequence of ABoxes Adb , . . . Adb starting (0) with Adb = Adb and then repeatedly applying the procedure from Figure 1 to (i+1) (i) produce Adb from Adb . In the figure, qC denotes the result of converting the concept C into a tree-shaped conjunctive query with one answer variable that denotes the root (c.f. the remark at the end of Section 2) and “C is a conjunct of D” includes the case where C = D. The repeated application of this procedure stops when no more changes to the data occur. It is not hard to see that A∗db represents IK restricted to those domain el- ements that are reachable (along roles) from some individual name in A. This restriction to reachable elements is not (only) an optimization: it is needed in order to eliminate the problem illustrated in the fourth example (K4 , q4 ) in pre- vious section. We now analyze the complexity of the preprocessing phase. Lemma 3. Let n be the number of individual names in A and m the number of subconcepts in T . Then the procedure in Figure 1 is applied at most n · m times. We note that n and m are linear in the size of A and T , and that the bound given in Lemma 3 is only an upper bound that is unlikely to occur in practice. Let us use A∗db to denote the final database that is constructed in the preprocessing phase. It is not hard to see that the size of A∗db is bounded by O(n · m), where n and m are as in Lemma 3. 5 Query Rewriting This section shows how to rewrite a given conjunctive query q into a (domain independent) first-order query q ∗ such that cert(q, K) = ans(q ∗ , A∗db ) for every knowledge base K = (T , A), i.e., the answers that a relational system returns for q ∗ are precisely the certain answers for q and K. We assume that A∗db (defined in the previous section) is stored in a relational database system. We remind the reader that domain independent first-order queries are nothing else but SQL queries. In particular, every domain independent FO query can be rewritten into an SQL query in linear time. To proceed, we use several auxiliary definitions. Let ∼q denote the smallest relation on term(q) that includes the identity relation, is transitive, and satisfies the following closure condition: (∗) if there are r(s, t), r(s0 , t0 ) ∈ q with t ∼q t0 , then s ∼q s0 . The relation ∼q is central to our rewriting procedure. Recall from Section 3 that the goal of query rewriting is to produce a query q ∗ such that matches of q ∗ in IK can be reproduced as matches of q in the unraveling UK of IK and vice versa. Intuitively, UK is produced from IK by keeping the Ind(A)-part of IK intact and relaxing the NIaux -part into a collection of trees. The importance of (∗) can be seen when assuming t = t0 : then (∗) describes a non-tree situation in the query since t = t0 has two predecessors s and s0 . Therefore, we should avoid matches π of q to the NIaux -part of IK where π(s) 6= π(s0 ) as we will not be able to reproduce them in UK . The case where t ∼q t0 instead of t = t0 can be understood similarly. It is not hard to verify that ∼q is an equivalence relation and can be computed in time polynomial in the size of q. For each equivalence class ζ of ∼q , choose a representative tζ ∈ ζ. Let – Fork= be the set of pairs (T, ζ) with T ⊆ term(q) and ζ an equivalence class of ∼q such that for some r ∈ NR , T = {t ∈ term(q) | ∃t0 ∈ ζ : r(t, t0 ) ∈ q}, and T is of cardinality at least two; – Fork6= ⊆ qvar(q) be the set of quantified variables v such that for some s, s0 , t ∈ term(q) and r, r0 ∈ NR with r 6= r0 , we have r(s, v), r0 (s0 , t) ∈ q and v ∼q t; – Cyc ⊆ qvar(q) be the set of quantified variables v such that there are r0 (t0 , t00 ), . . . , rn−1 (tn , t0n ) ∈ q, n ≥ 0, with v = ti for some i ≤ n and t0i ∼q ti+1 mod n for all i < n. It is not hard to see that Fork= , Fork6= , and Cyc can be computed in time poly- nomial in the size of q. The rewritten query q ∗ is defined as q ∧ q1 ∧ q2 , where q1 and q2 are as follows: ^ q1 := ¬Aux(v) v∈avar(q)∪Fork6= ∪Cyc ^ q2 := ( Aux(tζ ) → (t1 = t2 ∧ · · · ∧ tk−1 = tk ) ) ({t1 ,...,tk },ζ)∈Fork= Note that the construction of q ∗ does not depend on K and can be carried out in polynomial time. Moreover, q ∗ is at most linear in the size of q. To see this, first note that the number of conjuncts in q1 is bounded by the number of variables in q. Regarding q2 , let Fork= = {(T1 , ζ1 ), . . . , (T` , ζ` )}. It is not hard to see that v0 v1 v2 r r r    v3 A v4 A v5 AA r } AA r } AA }} AA }} }} }} r r AA } AA } }~ }~ v6 v7 Fig. 2. Example Query. |T1 |+· · ·+|T` | is bounded by the number of role atoms in q, and thus the number of conjuncts in q2 is also bounded linearly in the size of q. Since q is a conjunct of q ∗ , it is readily checked that q ∗ is domain independent. The following theorem states the correctness of our approach to conjunctive query answering in EL using a database system. Its proof is given in the full version of this paper available at http://lat.inf.tu-dresden.de/∼clu. Theorem 1. cert(q, K) = ans(q ∗ , A∗db ). We now give three examples for query rewriting: – For tree-shaped queries and for queries without quantified variables almost no query rewriting is needed: in both cases q ∗ = q ∧ v∈avar(q) ¬Aux(v). V – The queries q1 to q3 from Section 3 are rewritten as follows. In the case of q1 = ∃u.r(v, u) ∧ r(v 0 , u), ∼q consists of the equivalence classes {v, v 0 } and {u}. Assume that the chosen representative for {v, v 0 } is v. In this example, q2 and Fork= are the most important ingredients of the rewriting, and we have q1∗ = ∃u.¬Aux(v) ∧ ¬Aux(v 0 ) ∧ r(v, u) ∧ r(v 0 , u) ∧ (Aux(u) → v = v 0 ). In the case of q2 = ∃u.r(v, u) ∧ s(v, u), the Fork6= part of q1 is the most important ingredient of the rewriting and we have q2∗ = ∃u.¬Aux(v) ∧ ¬Aux(u) ∧ r(v, u) ∧ s(v, u). Finally, reconsider q3 = ∃u.r(v, u) ∧ s(u, u). Here, the Cyc part of the rewrit- ing plays the crucial role, and we have q3∗ = ∃u.¬Aux(v) ∧ ¬Aux(u) ∧ r(v, u) ∧ s(u, u) – Let q be the query shown in Figure 2, where all variables are quantified. Then ∼q consists of the equivalence classes {v0 , v1 , v2 }, {v3 , v4 , v5 }, {v6 }, and {v7 }. Assume that the chosen representative for {v3 , v4 , v5 } is v3 . Thus, we have q∗ = q ∧ Aux(v6 ) → (v3 = v4 ) ∧ Aux(v7 ) → (v4 = v5 ) ∧ Aux(v3 ) → ((v0 = v1 ) ∧ (v1 = v2 )) – Let qnc be the query that has no answer variables and whose body is an n-clique, i.e., ^ qnc = ∃v0 , . . . , vn−1 . r(vi , vj ) i,j