Introduction

On Higher-Order Description Logics

Giuseppe De Giacomo

Maurizio Lenzerini

Riccardo Rosati

We investigate an extension of Description Logics with higher-order capabilities, based on Henkin-style semantics. Our study starts from the observation that the various possibilities of adding higher-order constructs to a DL form a spectrum of increasing expressive power, including domain metamodeling, i.e., using concepts and roles as predicate arguments, and full metamodeling, providing the ability of using the language constructors and operators as predicate arguments, in the style of RDF. We argue that higher-order features of the former type are sufficiently rich and powerful for the modeling requirements arising in many relevant situations, and therefore we carry out an investigation of the computational complexity of reasoning in DLs extended with such features. In particular, we show that adding domain metamodeling capabilities to expressive DLs has no impact on the complexity of the various reasoning tasks.

Introduction

Metamodeling allows one to treat concepts and properties as first-order citizens, and to see them as individuals whose properties can be asserted and reasoned upon. This feature is important in all applications where the need arises of modeling and reasoning about meta-concepts, i.e., concepts whose instances are themselves concepts, and metaproperties, i.e., relationships between meta-concepts.

It is well-known that in logic, and in Description Logics (DLs) in particular, higherorder constructs are needed for a correct representation of concepts and properties at the meta-level. However, the issue of devising suitable extensions to DLs for representing and reasoning about meta-level elements is largely unexplored. Recent research on this subject shows that there is a spectrum in the modeling capabilities of DLs. Four points in this spectrum represent specific notable cases, which we call domain modeling, metaquerying, domain metamodeling, and full metamodeling, respectively. Domain modeling. In domain modeling, the language only focuses on the ability of specifying the domain of interest in terms of individuals, concepts and roles, and is therefore “first-order”. This is the simplest case of the spectrum, with no higher-order feature, and is actually the one addressed in most of the research on DLs. Metaquerying. This is the case where the knowledge base does not contain any axiom regarding meta-concepts or meta-roles, but the query language allows for using metaconcepts, so that concepts and roles in the knowledge base can match the variables in the query, and may thus be returned as answers to the query [ 1 ]. This mechanism allows to express queries that are beyond first-order logic. For instance, asking for the least common subsumers of two concepts, or for the most specific concept for an individual, can be done by means of suitable meta queries.

Domain metamodeling. This is the case where the language allows for using concepts and roles as predicate arguments, so that one can assert properties of concepts and roles, as if they were individuals. Domain metamodeling includes metaquerying as a special case. It is our opinion that higher-order features of this kind are sufficiently rich and powerful for the modeling requirements arising in many relevant situations. One of the most popular approaches to domain metamodeling, which is closely related to DLs, is HiLog [ 7 ]. HiLog is a logic with a higher-order syntax, thus allowing predicates to appear as arguments in atomic formulae, but with a Henkin-style semantics, which implies that the expressive power of the language actually remains within first-order. Full metamodeling. This is the most general case, where the modeling language allows not only for using concepts and roles as predicate arguments, but also for refining and extending the properties of the language operators, and to reason upon such properties. RDF and RDFS, which are again based on a Henkin semantics, are popular languages enabling this kind of metamodeling. Both languages allow for stating axioms not only on domain (meta-)elements, but also on the so-called built-in vocabulary (i.e., operators such as rdf:type) of the language.

In this paper1, we investigate an extension of DLs with higher-order capabilities. We are especially interested in those features that allow us both to model and to query individuals, concepts, roles, meta-concepts and meta-roles with no limitations. Therefore, the extension that we study is geared towards domain metamodeling (thus including metaquerying), for which we provide the following specific contributions.

First, we present syntax and semantics of an extension of DLs with domain metamodeling features (see Section 2). In particular, we show how, starting from any DL L, one can define its higher-order version, called Hi(L). From the syntax point of view, our approach stems from two ideas. On one hand, every modeling element can be seen simultaneously as an individual, as a concept, and as a role. On the other hand, since concepts in DLs are denoted not only by names, but also by complex expressions, every complex expression is a modeling element in our language. From the semantic point of view, we adopt a Henkin semantics, as in HiLog and RDF(S).

Second, we carry out an investigation of the computational complexity of reasoning in DLs extended with higher-order features. By reasoning we mean not only logical implication, but also answering unions of conjunctive queries with metaquerying abilities. We show that adding domain metamodeling capabilities to expressive DLs, in particular to SHIQ [ 3 ], has no impact on the complexity of the various reasoning tasks, including conjunctive query answering (see Section 4).

The idea of representing concepts and properties at the meta-level is an old one in Knowledge Representation and Computer Science. Semantic networks and early Frame-based systems incorporated specific mechanisms for representing concepts whose instances are themselves concepts [ 10, 2 ]. Conceptual modeling languages proposed in the 70’s, such as TAXIS [ 12 ], provided both the notion of meta-class, and suitable facilities for describing properties of operators on meta-classes. The notion of meta-class is also present in virtually all object-oriented languages, including modern programming languages (see, e.g., the Java class Class).

As we said before, the issue of extending DLs with higher-order constructs has been addressed only by few research papers. In [ 4 ], probably the first paper on this subject, the notion of “reification of concepts” is proposed as a means to express meta-level classes, but the paper does not address neither the issue of meta-roles, nor the issue

1 For the sake of brevity, proofs are omitted in this version.

of query answering. A more recent paper is [ 8 ], where metamodeling capabilities are added to a DL of the DL-Lite family.

Our work has connections with recent investigations on full metamodeling, in particular on RDF, RDFS, and OWL Full. In [ 11 ], the author addresses the issue of decidability of reasoning on meta-properties in different fragments of OWL Full. It is shown that, although going from domain to full metamodeling easily leads to undecidability, reasoning in some fragments of OWL Full is decidable. Differently from the present paper, the focus of [ 11 ] is neither on the tractability frontier, nor on conjunctive query answering. Finally, reasoning (but not query answering) with metamodeling is also studied in [ 14 ], where the language OWL FA is proposed, which introduces a stratum number in class constructors and axioms to indicate the strata they belong to, and suitable contraints impose that TBox axioms are stated on classes of the same stratum, while ABox axioms can only involve elements of two consecutive strata.

The rest of the paper is organized as follows. In Section 2, we describe syntax and semantics of Hi(L), by referring, in particular, to the higher-order DL Hi (SHIQ). As we said above, Hi(L) denotes the higher-order version of the Description Logic L. In Section 3 we present our technique for satisfiability in Hi (SHIQ), and in Section 4 we address query answering in the same DL. Both in Section 3 and in Section 4 we also characterize the computational complexity of the presented algorithms. We end the paper in Section 5, by pointing out future directions of our reseearch. 2

Higher-order Description Logics

In this section, we present our approach to higher-order DLs, by showing how, starting from a DL L, one can define its higher-order version, called Hi(L). During the presentation, we will also refer to a specific DL, namely SHIQ. Therefore, we will describe in detail the higher-order DL Hi (SHIQ).

Before delving into Hi(L), we present some preliminary definitions. Every traditional DL L is characterized by a set OP (L) of operators, used to form concept and role expressions, and a set of MP (L) of meta-predicates, used to form assertions. Each operator and each meta-predicate have an associated arity. If symbol S has arity n, then we often write S=n to denote such symbol and its arity. For SHIQ, we have OP (SHIQ) = fInv =1; And =2; Not =1g [ fAtLeastQ n=2 j n 2 Ng MP (SHIQ) = fInst C =2; Inst R=3; IsaC =2; IsaR=2; Tran=1g:

We assume that the reader is familiar with SHIQ. Therefore, the intuitive meaning of all the above symbols should be clear. The formal specification of their semantics will be given shortly.

Syntax. We assume the existence of two disjoint, countably infinite alphabets: S, the set of names, and V, the set of variables. The buidling blocks of a Hi(L) knowledge base are assertions, which in turn are based on expressions. We define the set of expressions, denoted by EL(S), over the alphabet S for Hi(L) inductively as follows: – if E 2 S then E 2 EL(S); – if C=n 2 OP (L) and E1; : : : ; En 2 EL(S) then C(E1; : : : ; En) 2 EL(S). Example 1. If the names Course; Teaches; Full belong to the alphabet S, then the following is a Hi (SHIQ) expression:

And (Course; Not (AtLeastQ 2(Inv (Teaches)); Full )))

which intuitively denotes the concept representing the set of courses that are taught by at most one full professor.

A Hi(L) assertion over EL(S) is a statement of the form M (E1; : : : ; En) where M 2 MP (L), n 0 is the arity of M , and for every 1 i n, Ei 2 EL(S). A Hi (L) knowledge base (KB) is a set of assertions over EL(S).

Thus, an assertion is simply an application of a meta-predicate to a set of expressions. Intuitively, an assertion is an axiom that predicate over a set of individuals, concepts or roles.

Example 2. Suppose that the alphabet S contains all names mentioned in Example 1, plus GradCourse; UniversityConcept ; ObsoleteConcept ; John, and De nedBy . Then the following are Hi (SHIQ) assertions:

IsaC (GradCourse; And (Course; Not (AtLeastQ 2(Inv (Teaches)); Full )))) Inst C (And (Course; Not (AtLeastQ 2(Inv (Teaches)); Full ))); UniversityConcept )

Inst R(UniversityConcept ; John; De nedBy )

Inst R(Not (ObsoleteConcept ); John; De nedBy ) The first assertion states that every graduate course is taught by at most one full professor. The intended meaning of the second assertion is that the concept

And (Course; Not (AtLeastQ 2(Inv (Teaches)); Full ))) is an instance of the concept

UniversityConcept (which is therefore a meta-concept). Finally, the intended meaning of the third and the fourth assertions is that the concepts UniversityConcept and Not (ObsoleteConcept ) have been introduced in the knowledge base by John.

Next, we introduce the notion of query, which in turn relies on the notion of “atom”. Intuitively, an atom is constituted by a meta-predicate applied to a set of arguments, where each argument is either an expression or a variable. More formally, we define the set (S; V) of terms over S and V to be EL(S) [ V. Terms of the form EL(S) are called ground. We define an atom to be constituted by the application of a meta-predicate in MP (L) to a set of terms, and we call an atom ground if no variable occurs in it. Note that a ground atom has the same form of an assertion. An atom whose meta-predicate is IsaC or IsaR is called an ISA-atom, while we call instance-atom an atom whose meta-predicate is Inst C or Inst R.

A higher-order conjunctive query (HCQ) of arity n is an expression of the form q(x1; : : : ; xn) a1; : : : ; am where q, called the query predicate, is a symbol that does not belong to S [ V, every xi belongs to V, every ai is a (possibly non-ground) atom, and all variables x1; : : : ; xn occur in some aj . The variables x1; : : : ; xn are called the free variables (or distinguished variables) of the query, while the other variables occurring in a1; : : : ; am are called existential variables. A higher-order union of conjunctive queries (HUCQ) of arity n is a set of HCQs of arity n with the same query predicate. A HCQ/HUCQ is called Boolean if it has no free variable.

1. for each d1 2 , if d = Inv Io (d1) then dIr = (d1Ir ) 1; 2. for each d1; d2 2 , if d = And Io (d1; d2) then dIc = d1Ic \ d2Ic ; 3. for each d1 2 , if d = Not Io (d1) then dIc = d1Ic ; 4. for each d1; d2 2 and for each n 2 N, if d = AtLeastQ n(d1; d2) then dIc = fe j 9e1; : : : ; en s.t. ei 6= ej for i 6= j; and

(8i s.t. 1 i n; he; eii 2 d1Ir and ei 2 d2Ic ).

Intuitively, the query asks for the instances of all the concepts in the knowledge base defined by John. In our case, the answer will be simply fGradCourse; Not (AtLeastQ 2(Inv (Teaches)); Full )))g.

Example 4. Consider now the case where we want to ask for the instances of all the concepts y such that the expression Not (y) is a concept in the knowledge base defined by John. The natural formulation of this query would be: q(x)

Inst C (x; y); Inst R(Not (y); John; De nedBy ) However, according to our syntax for queries, variables cannot appear as arguments within terms, and therefore the above is not a query in Hi (SHIQ). We discuss this kind of queries further in the conclusions.

Semantics. The semantics of Hi (L) is based on the notion of interpretation structure. An interpretation structure is a triple = h ; Ic; Iri where: (i) is a non-empty (possibly countably infinite) set; (ii) Ic is a function that maps each d 2 into a subset of ; and (iii) Ir is a function that maps each d 2 into a subset of . In other words, treats every element of simultaneously as: (i) an individual; (ii) a unary relation, i.e., a concept, through Ic; and (iii) a binary relation, i.e., a role, through Ir.

An interpretation over is a pair I = h ; Ioi, where = h ; Ic; Iri is an interpretation structure, and Io is a function that maps: (i) each element of S to a tsiionnglCeIdoo m:ainn o!bject othfat s;aatinsdfie(siit)heeaccohnedlietimonenstcCha=rnact2eriOzinPg( Lth)e toopaenratno-rarCy=fnu.nIcnparticular, the conditions for the operators in OP (SHIQ) are described in Figure 1. We extend Io to expressions in EL(S) inductively as follows: if C=n 2 OP (L), then (C(E1; : : : ; En))Io = CIo (E1Io ; : : : ; EnIo ).

To interpret non-ground terms, we need assignments over interpretations. An assignment over h ; Ioi is a function : V ! .

We are now ready to describe how to interpret terms in Hi (L). Given an interpretation I = h ; Ioi and an assignment over I, we define the function ( )Io; : (S; V) ! as follows: – if t 2 S then tIo; = tIo ; – if t 2 V then tIo; = (t); – if t is of the form C(t1; : : : ; tn), then tIo; = CIo (t1Io; ; : : : ; tIno; ).

Satisfaction of an assertion with respect to an interpretation I and an assignment over I is defined based on the semantics of the meta-predicates in MP (L). For the meta-predicates used in SHIQ, satisfaction in I; is defined as follows: – I; j= Inst C (E1; E2) if E1Io; 2 (E2Io; )Ic ; – I; j= Inst R(E1; E2; E3) if hE1Io; ; E2Io; i 2 (E3Io; )Ir ; – I; j= IsaC (E1; E2) if (E1Io; )Ic (E2Io; )Ic ; – I; j= IsaR(E1; E2) if (E1Io; )Ir (E2Io; )Ir ; – I; j= Tran(E) if (EIo; )Ir is a transitive relation.

A Hi (L) KB H is satisfied by I if all the assertions in H are satisfied by I.2 As usual, the interpretations I satisfying H are called the models of H. A Hi (L) KB H is satisfiable if it has at least one model.

Let I be an interpretation and an assignment over I. A Boolean HCQ q of the form q a1; : : : ; an is satisfied in I; if every assertion ai is satisfied in I; . A Boolean HUCQ Q is satisfied in I; if there exists a Boolean HCQ q 2 Q that is satisfied in I; . A Boolean HUCQ Q is satisfied in an interpretation I, written I j= Q, if there exists an assignment over I such that Q is satisifed in I; . Given a Boolean HUCQ Q and a Hi (L) KB H, we say that Q is logically implied by H (denoted by H j= Q) if for each model I of H there exists an assignment such that Q is satisfied by I; .

Given a non-Boolean HUCQ q of the form q(t1; : : : ; tn) a1; : : : ; am, a grounding substitution of q is a substitution such that t1 ; : : : ; tn are ground terms. We call t1 ; : : : ; tn a grounding tuple. The set of certain answers to q in H is the set of grounding tuples t1 ; : : : ; tn that make the Boolean query q a1 ; : : : ; an logically implied by H. Notice that, in general, the set of certain answers may be infinite even if the KB is finite. Therefore, it is of interest to define suitable notions of safeness, which guarantee that the set of answers is bounded. This issue, however, is beyond the scope of the present paper.

Indeed, in this paper, we focus on Boolean queries only, so as to address the computation of certain answers as a decision problem. Also, in our analysis, we measure the computational complexity in three different ways: with respect to the size of the whole KB (KB complexity), with respect to the size of the part of the KB formed by the assertions involving only the meta-predicates Inst C =2; Inst R=3 (instance complexity), and with respect to the size of the KB and the query together (combined complexity). 3

Satisfiability in Hi (S HI Q) In this section we study the computational characterization of KB satisfiability in the higher-order DL Hi (SHIQ). Query answering in the same DL is addressed in the next section. 2 We do not need to mention assignments here, since all assertions in H are ground.

We start by defining a translation three injective functions

from Hi (SHIQ) to SHIQ. First, we define

O : ESHIQ(S) ! So; C : ESHIQ(S) ! Sc; R : ESHIQ(S) ! Sr where So, Sc and Sr are three mutually disjoint alphabets of names, each one disjoint from S. Then, we inductively define two functions C and R as follows: – if S 2 S, then C (S) = C (S) and R(S) = R(S); – C (Not (E)) = Not ( C (E)); – C (And (E1; E2)) = And ( C (E1); C (E2)); – C (AtLeastQ n(E1; E2)) = AtLeastQ n( R(E1); C (E2)); – R(Inv (E)) = Inv ( R(E)).

Now, let Expr (H) denote the set of ground expressions occurring in H (notice that every subexpression of an expression occurring in H also belongs to Expr (H)). Then, given a Hi (SHIQ) KB H, we inductively define the SHIQ KB (H) as follows: 1. if Not (E) 2 Expr (H), then C (Not (E)) C (Not (E)) 2 2. if Inv (E) 2 Expr (H), then R(Inv (E)) R(Inv (E)) 2 3. if And (E1; E2) 2 Expr (H), then C (And (E1; E2))

(H); 4. if AtLeastQ n(E1; E2) 2 Expr (H), then

C (AtLeastQ n(E1; E2)) 2 (H); 5. if Inst C (E1; E2) 2 H, then C (E1)( O(E2)) 2 (H); 6. if Inst R(E1; E2; E3) 2 H, then R(E1)( O(E2); O(E3)) 2 7. if IsaC (E1; E2) 2 H, then C (E1) v C (E2) 2 (H); 8. if IsaR(E1; E2) 2 H, then R(E1) v R(E2) 2 (H); 9. if Tran(E) 2 H, then Tran( R(E)) 2 (H).

(H); (H);

C (And (E1; E2)) 2

C (AtLeastQ n(E1; E2))

(H);

Informally, the above translation, when applied to a Hi (SHIQ) DL H, provides a SHIQ KB (H) in which for every ground term E occurring in H (notice that E may be a subterm of another term occurring in H) there exists a concept name C (E) (and a role name R(E)) that is defined, through the use of the function C (respectively, R) as equivalent to the term E seen as a concept (respectively, role) expression.

Based on the above translation, we get the first of our main results, namely a reduction of KB satisfiability in Hi (SHIQ) to KB satisfiability in SHIQ. Theorem 1. A Hi (SHIQ) KB H is satisfiable iff the SHIQ KB (H) is satisfiable.

Proof (sketch). One direction of the proof is trivial: if there exists a model I for H, then based on I it is immediate to define a model I0 for (H) which interprets objects according to Io, atomic concepts (i.e., concepts denoted by names) according to Ic, and atomic roles according to Ir. As for the other direction, given a model I for (H) over a domain , it is possible to define a model I0 for H by considering the disjoint union of a countably infinite number of copies of I (over a countably infinite number of copies of ), and defining Io0 so that it coincides with Io on the expressions occurring in H, while every expression that does not occur in H is interpreted by Io0 to an element of the extra copies of . Then, it is easy to define Ic0 and Ir0 in order to satisfy the semantic conditions of Figure 1.

From the above theorem, and the computational characterization of KB satisfiability in SHIQ [ 3 ], we are able to provide the computational characterization of KB satisfiability in Hi (SHIQ).

Theorem 2. KB satisfiability in Hi (SHIQ) is EXPTIME-complete w.r.t. KB complexity, and coNP-complete w.r.t. instance complexity. 4

Query answering in Hi (S HI Q) In this section we study query answering in Hi (SHIQ). In particular, we restrict our attention to a specfic class of HUCQs, which we call guarded. For the definition of this class of queries, we need the notions of object position, concept position, and role position, whose goal is to characterize the various argument positions in both atoms and terms. If we use symbol O to mark object positions, symbol C to mark concept positions, and symobl R to mark role positions, then we have:

Inst C (O; C ) Inst R(O; O; R ) IsaC (C; C ) IsaR(R; R ) Tran(R )

Not (C ) And (C; C ) Inv (R ) AtLeastQ n(R; C )

Now, a HCQ q is called guarded if, for every variable x occurring in an ISA-atom of q, x also occurs in a concept or role position of an instance-atom of q. A HUCQ is called guarded is every HCQ q in Q is guarded.

We start our analysis of query answering by showing that answering guarded HUCQs is coNP-hard w.r.t. KB complexity (actually, w.r.t. instance complexity only) and 2p-complete w.r.t combined complexity, as soon as the DL admits the Inst C , Inst R and IsaC meta-predicates (and even if the DL does not allow for any logical operator).

Theorem 3. Let L be a DL such that MP (L) contains the meta-predicates Inst C , Inst R and IsaC . Answering guarded HUCQs over Hi (L) KBs is coNP-hard w.r.t. instance complexity, and 2p-hard w.r.t. combined complexity, even if OP (L) = ;. The previous theorem implies that answering guarded HUCQs is intractable w.r.t. instance (and KB) complexity not only in Hi (SHIQ), but in all the DLs currently studied, since all DLs comprise the meta-predicates Inst C , Inst R and IsaC .

We now provide a technique for query answering over Hi (SHIQ) KBs, which is based on the reduction to SHIQ provided by the function () defined for KB satisfiability. For query answering, however, the function () must be extended to account for expressions occurring in the query; moreover, we also need to define a translation of HUCQs. Such functions are defined below.

Let Q be a HUCQ. We say that Q is a metaground HUCQ if it does not contain any variable in concept or role position. Moreover, we say that Q is an instance HUCQ if it only contains instance-atoms.

In the following, given a HUCQ Q, we denote by Expr (Q) the set of ground expressions occurring in Q. Now let q be a HCQ and let e1; : : : ; ek be the ground expressions occurring as arguments of ISA-atoms in q. We define inductively the set of expressions conj ISA(q) as follows:

C1 = fe1; : : : ; ekg Ci+1 = fAnd (e; e0)je 2 Ci and eo 2 C1g conj ISA(q) = Ck Informally, conj ISA(q) denotes the set of all the possible conjunctions of ground expressions occurring as arguments of ISA-atoms in q. We are now ready to define the set of ground expressions Expr (H; Q) as follows:

Expr (H; Q) = Expr (H) [ Expr (Q) [ [ conj ISA(q) q2Q

As we will show in the following, Expr (H; Q) constitutes the set of ground expressions that we use for grounding metavariables. Notice that Expr (H; Q) has size polynomial in the size of H.

Let q be a HCQ. An H-metaground instantiation of q is a HCQ obtained from q by replacing every variable occurring in at least one concept or role position with an expression of Expr (H; q). Given a HCQ q, we define metaground (q; H) as the HUCQ corresponding to the union of all the H-metaground instantiations of q. If there are no ground terms occurring in H (i.e., H is empty), we define metaground (q; H) to be the HCQ obtained from q by replacing all variables occurring in concept and role positions with any name in S. Given a HUCQ Q, we define metaground (Q; H) = Sq2Q metaground (q; H).

Given a metaground HUCQ Q, we denote by (Q; H) the standard UCQ obtained from metaground (Q; H) by: (i) replacing every ground term E occurring as an argument in object position of an atom in Q with O(E); (ii) replacing every ground term E occurring as an argument in concept position of an atom in Q with C (E); (iii) replacing every ground term E occurring as an argument in role position of an atom in Q with R(E). Finally, given a Hi (SHIQ) KB H and a HUCQ Q, we denote by (H; Q) the SHIQ KB obtained starting from (H) and adding, for every ground term E that occurs in metaground (Q; H) and does not occur in H, the inclusion assertions generated by the first 5 items in the definition of (H) above.

Now we restrict our attention to both metaground and instance queries. For this class of HUCQs, the following property can be easily proved.

Theorem 4. Let H be a Hi (SHIQ) KB, and let Q be a metaground instance HUCQ. Then, H j= Q iff (H; Q) j= (Q; H).

Based on the known computational characterization of answering “standard” UCQs, i.e., both metaground and instance UCQs, in SHIQ [ 9, 6, 13 ], we immediately get the following result.

Theorem 5. Answering metaground instance HUCQs over Hi (SHIQ) KBs is coNPcomplete w.r.t. instance complexity, EXPTIME-complete w.r.t. KB complexity, and 2EXPTIME-complete w.r.t. combined complexity.

We can actually extend the previous theorem to the whole class of instance HUCQs. First, we show the following crucial property, which holds for the whole class of guarded HUCQs.

Theorem 6. Let H be a Hi (SHIQ) KB, and let Q be a guarded HUCQ. H j= Q iff H j= metaground (Q; H).

Proof (sketch). One direction (if H j= metaground (Q; H) then H j= Q) is trivial. The proof of the other direction is quite involved. First, the following property (*) can be shown: if H j= Q then H j= metaground (Q), where metaground (Q) is the query obtained from Q through the meta-grounding of the meta-variables over the set of all expressions of the language (not only those terms occurring in Expr (H; Q)). Now suppose H j= Q. If H 6j= metaground (Q; H), then there exists a model I for H such that I j= metaground (Q) and I 6j= metaground (Q; H). It is now possible to define a model I0 for H which is essentially the disjoint union of a countably infinite number of copies of I, in which the function Io0 is defined in such a way that I0 6j= metaground (Q), which contradicts the above property (*). Consequently, H j= metaground (Q; H).

Theorem 6, Theorem 4, and Theorem 5 allow us to immediately derive the computational characterization of query answering in Hi (SHIQ) for the whole class of instance HUCQs.

Theorem 7. Answering instance HUCQs over Hi (SHIQ) KBs is coNP-complete w.r.t. instance complexity, EXPTIME-complete w.r.t. KB complexity, and 2-EXPTIMEcomplete w.r.t. combined complexity.

In order to go beyond instance HUCQs, and answer guarded HUCQs in Hi (SHIQ), we now define a technique which reduces this problem to answering standard UCQs in SHIQ.

In the following, we call intensional (or, TBox) assertion every assertion using one of the meta-predicates IsaC , IsaR, and Tran. Moreover, given a KB H and a HUCQ Q, we define T AH;Q to be the set of all the intensional assertions in SHIQ that can be obtained from the set of ground terms occurring in Expr (H; Q).

Let T 0 be a subset of T AH;Q. We say that T 0 is coherent with H iff T T 0, where T is the set of TBox assertions occurring in H, and T 0 [ H 6j= for every 2 TA(Expr (H; Q)) T 0. Then, we denote by IntEval (Q; H; T 0) the metaground instance HUCQ Q0 obtained starting from Q0 = metaground (Q; H) and then evaluating every intensional assertion over T 0 as follows: – if is an intensional assertion occurring in a HCQ q 2 Q0 and eliminate from q; – if is an intensional assertion occurring in a HCQ q 2 Q0 and eliminate q from Q0. 2 T 0, then 62 T 0, then

Finally, we define KBSHIQ(H; T 0; Q) as the SHIQ KB3 obtained starting from K0 = (H0; Q) (where H0 = T 0 [ H) and then adding to K0 the following assertions for every TBox assertion 2 TA(Expr (H; Q)) T 0: – if = IsaC (E1; E2) then add to K0 the ABox assertions C (E1)(n) and

C (Not (E2))(n), where n is a new individual name in K0; – if = IsaR(E1; E2) then add to K0 the TBox assertion (role disjointness) R(E2) v :Aux i, where i is such that Aux i is a new role name in K0, and the ABox assertions R(E1)(n1; n2) and Aux i(n1; n2), where n1; n2 are new individual names in K0; – if = Tran(E) then add to K0 the TBox assertion (role disjointness) R(E) v :Aux i, where i is such that Aux i is a new role name in K0, and the ABox assertions R(E)(n1; n2), R(E)(n2; n3) and Aux i(n1; n3), where n1; n2; n3 are new individual names in K0. 3 Actually, KBSHIQ(H; T 0; Q) is a SHIQ KB with role disjointness assertions, however adding this kind of axioms to SHIQ does not change the complexity of query answering. Intuitively, KBSHIQ(H; T 0; Q) is such that, if 2 TA(Expr (H; Q)) T 0, then is forced to be false in every model of KBSHIQ(H; T 0; Q). The following theorem (whose proof relies on Theorem 6) reduces answering guarded HUCQs in Hi (SHIQ) to answering standard UCQs in SHIQ.

Theorem 8. Let H be a Hi (SHIQ) KB, and let Q be a guarded HUCQ. Then, H 6j= Q iff there exists a subset T 0 of T AH;Q such that T 0 is coherent with H, and KBSHIQ(H; T 0; Q) 6j= (IntEval (Q; H; T 0); H).

Based on Theorem 8, Theorem 4 and Theorem 5, we get the computational characterization of answering guarded HUCQs in Hi (SHIQ).

Theorem 9. Answering guarded HUCQs over Hi (SHIQ) KBs is coNP-complete w.r.t. instance complexity, EXPTIME-complete w.r.t. KB complexity, and 2-EXPTIMEcomplete w.r.t. combined complexity. 5

Conclusions

In this paper we have presented a general mechanism for defining a family of Description Logics for domain metamodeling. We have shown how, starting from any DL L, one can define a higher-order logic, called Hi(L), that adds to L metamodeling features. Also, we have presented algorithms for both satisfiability and query answering in a specific expressive higher-order Description Logic, namely Hi (SHIQ). This research can be seen as an extension of the work in [ 8 ]: in particular, on the one hand we have extended the results presented there to SHIQ, and on the other hand we have characterized the computational complexity for a larger class of meta-queries.

The research presented here can be continued along different lines. First, while the query answering algorithm presented in this paper is suited for the class of guarded HUCQs, it is our goal to address query answering for the whole class of HUCQs. Second, it would be interesting to see whether more metamodeling features can be added to the query language. In particular, one might wonder whether the query answering method described in this paper can be extended to deal with the case where variables can appear freely within the terms in the query atoms (see Example 4 in Section 2). Unfortunately, our first investigation on this subject shows that allowing for a more flexible use of variables in the queries easily leads to undecidability of query answering. Another interesting direction is to add both domain and full metamodeling capabilities to tractable DLs, in particular, the DLs of the DL-Lite family [ 5 ], so as to check whether reasoning remains tractable in the resulting logics.

Finally, we remark that punning, i.e., using the same name for different elements of the ontology (for example, an individual and a concept), has been introduced in OWL 2 4. While punning can be treated trivially in classical reasoning tasks over the DL ontology, it poses interesting problems in the context of query processing. In particular, if variables are not typed a priori, punning introduces the kind of meta-querying discussed in this paper. Indeed, the DL query language presented in this paper is the first one (to our knowledge) that exploits punning in queries, since it allows for expressing joins involving variables which simultaneously denote both individuals and predicates. Acknowledgments This research has been partially supported by MIUR FIRB project “Tecnologie Orientate alla Conoscenza per Aggregazioni di Imprese in Internet” (TOCAI.IT).

4 http://www.w3.org/2007/OWL/wiki/OWL Working Group

Angiulli ,

Ben-Eliyahu-Zohary ,

Ianni , and

Palopoli . Computational properties of metaquerying problems . ACM Trans. on Computational Logic , 4 ( 2 ): 149 - 180 , 2003 .

Attardi and

Simi . Consistency and completeness of OMEGA, a logic for knowledge representation . In Proc. of the 7th Int. Joint Conf. on Artificial Intelligence (IJCAI'81) , pages 504 - 510 , 1981 .

Baader ,

Calvanese ,

McGuinness ,

Nardi , and

P. F.

Patel- Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications . Cambridge University Press, 2003 .

Badea . Reifying concepts in description logics . In Proc. of the 15th Int. Joint Conf. on Artificial Intelligence (IJCAI'97) , pages 142 - 147 , 1997 .

Calvanese , G. De Giacomo,

Lembo ,

Lenzerini , and

Rosati . Tractable reasoning and efficient query answering in description logics: The DL-Lite family . J. of Automated Reasoning , 39 ( 3 ): 385 - 429 , 2007 .

Calvanese , G. De Giacomo, and

Lenzerini . Conjunctive query containment and answering under description logics constraints . ACM Trans. on Computational Logic , 9 ( 3 ): 22 . 1 - 22 .31, 2008 .

Chen ,

Kifer , and

D. S.

Warren. HILOG : A foundation for higher-order logic programming . J. of Logic Programming , 15 ( 3 ): 187 - 230 , 1993 .

8. G. De Giacomo , M.

Lenzerini , and R.

Rosati . Towards higher-order DL-Lite . In Proc. of the 2008 Description Logic Workshop (DL 2008 ), volume 353 of CEUR Electronic Workshop Proceedings , http://ceur-ws. org/ , 2008 .

Glimm , I. Horrocks,

Lutz , and

Sattler . Conjunctive query answering for the description logic SHIQ . In Proc. of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007 ), pages 399 - 404 , 2007 .

10. F. Lehmann, editor. Semantic Networks in Artificial Intelligence. Pergamon Press, Oxford (United Kingdom) , 1992 .

11.

Motik . On the properties of metamodeling in OWL . J. of Logic and Computation , 17 ( 4 ): 617 - 637 , 2007 .

12. J. Mylopoulos , P. A.

Bernstein , and H. K. T.

Wong . A language facility for designing database-intensive applications . ACM Trans. on Database Systems , 5 ( 2 ): 185 - 207 , 1980 .

13. M. Ortiz , D.

Calvanese , and T.

Eiter . Data complexity of query answering in expressive description logics via tableaux . J. of Automated Reasoning , 41 ( 1 ): 61 - 98 , 2008 .

14.

J. Z.

Pan and I. Horrocks. OWL FA: a metamodeling extension of OWL DL . In Proc. of the 15th Int. World Wide Web Conf. (WWW 2006 ), pages 1065 - 1066 , 2006 .