1. Introduction

On the Expressive Power of Ontology-Mediated Queries: Capturing coNP

Sanja Lukumbuzya

Magdalena Ortiz

Mantas Šimkus

1 0 TU Wien , Austria 1 Umeå University , Sweden

2023

36 2 4

The complexity and relative expressiveness of Ontology-mediated Queries (OMQs) is quite well understood by now. In this paper, we study the expressive power of OMQs from a descriptive complexity perspective, where the central question is to understand whether a given OMQ language is powerful enough to express all queries that can be computed within some bound on time or space. We show that the OMQ language that pairs instance queries with ontologies in the very expressive DL ℒℋℐ with closed predicates cannot express all coNP-computable Boolean queries, despite being coNP-complete in data complexity. We, then, propose an extension of this OMQ language that is expressive enough to precisely capture the class of all Boolean queries computable in coNP. This involves adding functionality as well as path expressions and nominal schemata, which are restricted in a way that allows us to carefully incorporate them into the existing mosaic technique for the DL ℒℋℐℱ with closed predicates without afecting the coNP upper bound in data complexity.

eol>Description Logics Ontology-mediated Query Answering Expressive Power Descriptive Complexity

1. Introduction

Ontology-mediated Queries (OMQs) have received significant attention in the Description Logic (DL) community as a powerful tool to answer database-like queries, while taking into account domain knowledge captured in ontologies. The computational complexity of this problem is now well-understood both in terms of combined complexity and data complexity. Specifically, for the standard OMQs based on expressive DLs of the ℒ family, we have coNP-completeness in data complexity: this is the complexity of checking if a given tuple of individuals belongs to the answer to an OMQ over an ABox , assuming that is fixed and only varies (see, e.g., [ 1 ]). Relative expressiveness of OMQs has also been investigated, e.g., via the established translations of OMQs into variants of Datalog (see, e.g., [ 2, 3, 4 ]).

The expressive power of OMQs from a descriptive complexity [ 5 ] perspective has thus far received limited attention. In this context, we ask whether a given OMQ language is powerful enough to express all queries computable within some time or space resources, i.e., belonging to a certain complexity class. It was shown in [ 2 ] that there are OMQ languages that capture certain subclasses of coNP related to constraint satisfaction problems (CSPs). More recently, a close connection between OMQ languages with the so-called closed predicates and surjective CSPs was shown in [ 6 ]. In this paper, we continue this line of work and we focus on finding an OMQ language that precisely captures the complexity class coNP. It is easy to see that OMQs based on standard expressive DLs and conjunctive queries—while being coNP-complete in data complexity—are not powerful enough to express all queries computable in coNP. This follows directly from the monotonicity of standard OMQ languages (e.g., in all cases where the OMQ language is based on first-order logic). Specifically, if is a Boolean OMQ in such a language, then for all ABox pairs 1 ⊆ 2 we have that (1) = 1 implies (2) = 1. These OMQ languages cannot capture coNP since there exist (rather trivial) non-monotonic queries computable in coNP (or even polynomial time). Here is a simple example of such a (non-monotonic) query: “Does the input ABox have an odd number of individuals?”

We, therefore, pose the following question: “Which features make an OMQ language suficiently expressive to capture the class of all queries computable in coNP, while still maintaining coNP-completeness in data complexity?” The summary of our contributions is as follows: ∘ We first present an inexpressibility result for non-monotonic OMQs based on ℒℋℐ with closed predicates. For such expressive OMQs the aforementioned monotonicity-based argument does not apply, and we need a more sophisticated approach. Specifically, by analyzing an existing algorithm in [ 3 ] and invoking the Non-deterministic Time Hierarchy Theorem, we show that instance queries mediated by ℒℋℐ TBoxes with closed predicates are not expressive enough to capture coNP. ∘ We present an OMQ language that is powerful enough to express all coNP computable queries. As our base DL we choose ℒℋℐℱ (with closed predicates) and we add to it nominal schemas [ 7 ] of very restricted shapes. We argue that these additions do not cause an increase in combined complexity and data complexity, i.e. answering ontology-mediated instance queries remains complete for NExpTime and coNP, respectively. This is done by suitably modifying the mosaic-based algorithm in [ 4 ]. ∘ We prove that our enriched OMQ language is powerful enough to express all (so-called generic) Boolean queries computable in coNP. Each such generic query is associated to a signature Σ as well as to a set of ABoxes over Σ (also called Σ-ABoxes) in which the answer to the query is “true”. By saying that “ is computable in coNP” we mean that there is a Nondeterministic Turing Machine (NTM) that recognizes the language of strings representing Σ-ABoxes in which the answer to the query is “false” and runs in polynomial time in the size (of the string representation) of the input ABox. We show that the enriched OMQ language can properly express the computations of . As a consequence of this and the previous point, we obtain a language that captures precisely coNP.

2. Preliminaries

Ontology-Mediated Queries (OMQs) We recall here the necessary notions related to OMQs based on ℒℋℐℱ with closed predicates.

We use N , N, and N to denote countably infinite, mutually disjoints sets of concept names, role names, and individuals, respectively. A signature is any finite set Σ ⊆ N ∪ N. An assertion (or, fact) is an expression of the form (, ) or (), where ∈ N, ∈ N , and , ∈ N . An ABox is any finite set of assertions. If all concept and role names that appear in an ABox belong to a signature Σ, then is a Σ-ABox.

We define the set of roles N+ as follows: N+ = {, − | ∈ N}. Furthermore, we define the set N+ of basic concepts as N+ = N ∪ {{} | ∈ N } ∪ {⊤, ⊥}. (Complex) concepts are defined according to the following syntax := | ⊓ | ⊔ | ¬ | {} | ∃. | ∀., where ∈ N , ∈ N , and ∈ N+. We call the concepts of the form {}, where ∈ N , nominals. Axioms are expressions of the form ⊑ (concept inclusions), ⊑ (role inclusions), and (func ) (functionality assertion), where , are concepts and , are roles. A TBox is a ifnite set of axioms. A knowledge base (with closed predicates) (KB) is a tuple ( , Σ, ), where is a TBox, Σ ⊆ N ∪ N is a set of closed predicates and is an ABox. We use Nom() to denote the set {{} | ∈ N , and occurs in }. The semantics of TBoxes and ABoxes as given above is defined in the standard way using interpretations of the form ℐ = (Δℐ , · ℐ ). Additionally, we make the Standard Name Assumption (SNA), which is common when dealing with closed predicates and which forces us to interpret every constant occurring in the KB as itself. We say that ℐ satisfies a KB = ( , Σ, ), in symbols ℐ ⊨ , if ℐ satisfies and and (i) for each ∈ Σ, ℐ = { | () ∈ , ∈ Σ ∩ N }, and (ii) for each ∈ Σ, ℐ = {(, ) | (, ) ∈ , ∈ Σ ∩ N}.

In the literature, an OMQ is often given as a pair ( , ), where is a TBox and is a Conjunctive Query (CQs). In this paper, we do not deal with general CQs: here is just an atomic query or an inconsistency query (corresponding to a Boolean CQ ∃.⊥()). On the other hand, our OMQs include closed predicates. Thus, an ontology-mediated atomic query is a triple = ( , Σ, ), where is a TBox and Σ ∪ { } ⊆ N ∪ N. If is a concept name, then the answer to over an ABox is the set of all individuals such that ℐ ∈ ℐ for all models ℐ of ( , Σ, ). The definition of an answer in case is a role name is analogous. An inconsistency query given as a pair = ( , Σ). For such and any ABox , we let () = 1 if ( , Σ, ) has no model, and () = 0 if ( , Σ, ) has a model.

Turing Machines A Nondeterministic Turing Machine (NTM) is a tuple = (Γ, , , 0, acc, rej ), where Γ is an alphabet, is a set of states, ⊆ (Γ ∪ {}) × × (Γ ∪ {}) × × {− 1, +1} is a transition relation, and 0, acc, rej ∈ are the initial state, the accepting state, and the rejecting state, respectively. The symbol is the blank symbol. An NTM takes as input a finite word over Γ, which is written on an infinite tape: the blank symbol is written in every cell that is not occupied by . Initially, the read-write head of is over the first symbol of and the machine is in state 0. The computation of is defined in the usual way. If there is a run of on that reaches acc, then accepts . The language of is defined as the set of all words over Γ that accepts.

3. Inexpressbility Result

We present here our inexpressibility results. For this, we first need to formalise generic queries over ABoxes, and clarify the notion of membership of such queries in a complexity class. Definition 1 (Generic Boolean Queries). For an ABox , we use Adom() to denote the set of individuals that appear in . We say ABoxes 1, 2 are isomorphic, if they are equal up to renaming of individuals, i.e. there is a bijection : Adom(1) → Adom(2) such that 2 = {( () | () ∈ 1)} ∪ {( (), ()) | (, ) ∈ 1)}.

A Generic Boolean Query (GBQ) over a signature Σ is a function that maps each Σ-ABox to a value () ∈ {0, 1}, and is such that (1) = (2) holds for any pair 1, 2 of isomorphic Σ-ABoxes.

The assumption that answers GBQs are invariant under isomorphic ABoxes is natural: we are interested in queries about the structure of ABoxes, and they should not depend on the concrete names of individuals. Dropping this assumption would render the expressivness analysis virtually meaningless: because an OMQ (or any standard database query) can only use a finite number of constants in the query expression, many computationaly trivial queries could not be expressed even in very powerful query languages.

Turing Machines operate on strings. This means that in order to compute an answer to a query over an ABox , we need to suitably encode as a string. We choose a simple encoding, where we first enumerate all pairs , of individuals in . Then, for each such pair, we store in a single symbol all the concept names asserted for those individuals along with the roles that link them. Note that diferent types of encodings are possible (cf. [ 5 ], Chapter 2.2). Definition 2 (Encoding ABoxes as words). Consider a fixed signature Σ. A 2-type over Σ is a tuple (, , ′), where , ′ ⊆ Σ ∩ N and ⊆ Σ ∩ N+. We let ΓΣ denote the set of all 2-types over Σ. We next define an encoding function Σ that maps Σ-ABoxes to words over ΓΣ. Assume a Σ-ABox with ℓ individuals and take an arbitrary enumeration 1, . . . , ℓ of the individuals in . Then we define Σ() = 1,1 . . . 1,ℓ 2,1 . . . 2,ℓ . . . ℓ,1 . . . ℓ,ℓ, where each , := ({ | () ∈ }, { | (, ) ∈ }, { | ( ) ∈ }).

Based on the encoding above, we can now define membership of a GBQ in a complexity class. Definition 3. Let Σ be a signature and a GBQ over Σ. We say belongs to a complexity class , if the language {Σ() | is a Σ-ABox with () = 1} over the alphabet ΓΣ belongs to . Proposition 1. Assume a GBQ over Σ. The following are equivalent:

1. belongs to coNP.

2. There is an integer and a NTM with alphabet ΓΣ such that, for any Σ-ABox we have: a) () = 0 if accepts Σ(); b) terminates within |Σ()| computation steps.

As we have already seen in the introduction, OMQs based on classical first-order logic do not capture coNP due to their monotonicity. We can also show that the same applies to inconsistency queries based on ℒℋℐ with closed predicates, which are non-monotonic in general.

Theorem 1. There exists a GBQ 1 over Σ such that:

(a) 1 belongs to coNP, and (b) there is no inconsistency query 2 = ( , Σ′), with in ℒℋℐ, such that 1() = 2() holds for all Σ-ABoxes .

Proof. To see this, we can analyze the running time of the algorithm in [ 3 ] for checking satisfiability of an ℒℋℐ knowledge base = ( , Σ′, ), where Σ′ is a set of closed predicates. We assume that , Σ′ are fixed, and we want an algorithm that takes an ABox as input and checks if = ( , Σ′, ) is consistent. Specifically, based on [ 3 ], we have an algorithm that runs in time bounded by || × ℓ + , where ℓ and are constants that depend on and Σ′, while is a constant that does not depend on or Σ′. In other words, there is a constant , such that for any and Σ′, we can build a non-deterministic algorithm that checks the consistency of an input ABox in the KB = ( , Σ′, ), and that runs in time (||).

The key here is that the constant does not depend on or Σ′. Using an ℒℋℐ TBox with closed predicates one can capture decision problems that can be solved via a non-deterministic TM in time that is polynomial with degree . However using the Non-deterministic Time Hierarchy Theorem, we know that there are problems that can be solved in non-deterministic polynomial time, but not in polynomial time with polynomial degree . Specifically, there a problems solvable in time (+1) but not ().

We note that the above result can be formulated also for OMQs with atomic queries and a certain class of conjunctive queries, but it is unclear if it generalizes to OMQs with CQs. This is because the proof of Theorem 1 relies on an upper bound on the running time of a known algorithm for ℒℋℐ with closed predicates. To the best of our knowledge, no suitable upper bounds on data complexity are known in the case of CQs over ℒℋℐ KBs. Furthermore, at this point, it is unfortunately unclear whether one can prove the same inexpressibility result for ℒℋℐℱ with closed predicates. This is left as future work.

4. Language Extension

As stated above, it is unclear whether OMQs based on plain ℒℋℐℱ with closed predicates are capable of capturing coNP. However, we present an extension of this OMQ language that we prove is powerful enough to do so. To this end, we assume a countably infinite set N of variables. We refer to the expressions of the form {}, where ∈ N , as nominal variables.

Complex roles are expressions of the form ? ∘ s.t. ∈ N+ and is of the form 1 ∘ 1? ∘ . . . ∘ , where ≥ 1 and ∈ N+, ∈ N+ , for all 1 ≤ ≤ . We call an expression of type ?, where ∈ N+ a test role.

Now, in addition to standard ℒℋℐℱ axioms, we allow axioms of the following shape: (trans ), for a restricted role ∈ N (transitivity axiom) ∃.({} ⊓ ∃.{}) ⊓ ∃.{} ⊑ , where ∈ N , is a restricted role, and, are complex roles consisting only of tests and functional roles and , ∈ N ∃.({} ⊓ ¬∃.{}) ⊓ ∃.{} ⊑ , where ∈ N , is a restricted role, and , are complex roles consisting only of tests and functional roles and , ∈ N {} ⊑ ∀.{}, where is a complex role {} ⊓ ⊑ ∀.¬{}, where ∈ N , is a restricted role, and ∈ N (6) (7) (8) (9) Note that in the previous definition, we refer to functional and restricted roles. We say that a role is functional if (func ) ∈ . Intuitively, a role is restricted if we can guarantee that, in any model ℐ of the KB, (, ′) ∈ ℐ implies that and ′ are constants occurring in the KB. We next give a syntactic definition that, albeit incompletely, characterizes such roles. Definition 4. A concept ∈ N+ is restricted w.r.t. a TBox and a set Σ of closed predicates if one of the following conditions hold: (i) ∈ Σ ∪ Nom() ∪ {⊥}, or (ii) ⊑ 1 ⊔ · · · ⊔ ∈ , where is a restricted concept or an expression of the form ∃ or ∃− , for a restricted role .

A role ∈ N+ is called restricted w.r.t. a TBox and a set Σ of closed predicates if one of the following holds: (i) ∈ Σ, (ii) {∃ ⊑ , ∃− ⊑ } ⊆ , where and are restricted concepts, (iii) − is a restricted role, or (iv) ⊑ , where is a restricted role.

Extended semantics Let ℐ = (Δℐ , · ℐ ) be an interpretation and be a KB. The extension of the interpretation function · ℐ to complex roles of the form ? ∘ 1 ∘ 1? ∘ · · · ∘ ∘ ? is defined as ℐ := {0 ∈ Δℐ ∩ ℐ | ∃1, . . . , ∈ Δℐ s.t. (− 1, ) ∈ ℐ , ∈ ℐ , for all 1 ≤ ≤ }. The semantics of transitivity axioms is standard: ℐ satisfies (trans ) if (1, 2) ∈ ℐ and (2, 3) ∈ ℐ implies (1, 3) ∈ ℐ . The axioms of the form (7)-(10) are reminiscent of nominal schemas introduced in [ 8, 7 ]. In these works, the semantics of such nominal schemas is given by grounding the knowledge base with respect to the set of all individuals N , where N is assumed to be finite. For our purposes, we ground with respect to Nom() by uniformly replacing all nominal variables with nominals in Nom() in all possible ways. We use ground() to denote such grounding of and we say that ℐ satisfies if it satisfies ground().

Theorem 2. The KB satisfiability problem in ℒℋℐℱ with closed predicates extended with axioms of the form (6)-(10) is NP-complete in data, and NExpTime-complete in combined complexity. Proof sketch. Due to space restrictions, we only ofer a brief sketch of the decision procedure that runs in nondeterministic exponential time in the size of the given knowledge base, and in nondeterministic polynomial time, in if the TBox and the set of closed predicates are considered ifxed. The first step in this procedure is to guess the extensions of restricted concepts and roles over the individuals occurring in the given knowledge base. These concepts and role names are now considered closed. Since all transitivity axioms only involve restricted roles, we can right away check whether they are satisfied and eliminate them. Thus, what is left to do is devise a procedure that can decide satisfiability of KBs with closed predicates whose TBox contains no transitivity axioms, and where all restricted concepts and role names are now considered closed. We do this by modifying the mosaic approach introduced in [ 4 ] for ℒℋℐℱ with closed predicates to support complex roles and nominal schemas.

5. The Encoding

Theorem 3 (Main result). Assume a signature Σ and a GBQ over Σ that belongs to coNP. Then there is a TBox in extended ℒℋℐℱ such that the Boolean inconsistency query OMQ = ( , Σ) has the following property: for all Σ-ABoxes , () = ().

)*+#$ n2

ABox ( c3 %&' $

The rest of this section serves as a proof sketch for the theorem above. Let be a GBQ over some signature Σ that is in coNP. According to Proposition 1, there is a nondeterministic Turing machine that decides the language {encΣ() | () = 0, is a Σ-ABox} and its running time is bounded by , where is a constant and is the size of the input word. We show that we can come up with a TBox such that for the ontology-mediated inconsistency query = ( , Σ), () = (), for all Σ-ABoxes . The basic idea is to craft in a way that ensures that all models of ( , Σ, ) contain a grid structure of size × . We then use this grid to simulate the given Turing machine as follows. The first row of the grid stores the initial configuration of while each subsequent row stores the next configuration in some computation of . Finally, we eliminate those computations that do not end in acceptance of the word. As a result, we have that each model of ( , Σ, ) corresponds to a computation of that accepts encΣ(), and vice versa: every computation of that ends in acceptance of encΣ() corresponds to some model of ( , Σ, ), for all Σ-ABoxes . Thus, checking whether accepts encΣ() boils down to checking whether ( , Σ, ) is unsatisfiable, which is equivalent answering the inconsistency query .

We now begin with our construction. In the rest of this section we assume we are given a GBQ in the form of a nondeterministic Turing machine = (ΓΣ, , , 0, acc, rej ) and an integer constant . 5.1. Constructing the × Grid Consider an arbitrary ABox over some signature Σ. We next show how to build a KB = ( , Σ, ) s.t. that each model ℐ of contains a × grid formed by the domain elements, where is the number of known individuals (i.e., the number of individuals occurring in plus two special constants firstand last).

We begin by generating diferent domain elements that serve as grid nodes. Each such grid node is associated two words of length over the known individuals that serve as its and -coordinate in the grid. This is accomplished using roles two sets of functional roles: 1, . . . and 1, . . . . We say that a domain element has an -coordinate 12 · · · , if (, ) participates in , for each , 1 ≤ ≤ . The -coordinate of is defined analogously. This is illustrated in Figure 1, left. In the first step of the construction, we let the special individual firstbe the origin of the grid and set its - and - coordinates to first· · · first. To generate the remainder of the grid nodes, we add axioms that create a binary tree rooted in firstusing two roles ℎ and , denoting horizontal and vertical successors, respectively. We next assign the and -coordinates to each grid node in the tree making sure that they respect a certain pattern. To do this, we use a linear order over the known individuals that we can can easily generate by using firstand last as the designated first and last elements and guessing the remaining part of the successor relation, encoded using the role . We then lift this linear order to words of length over the available individuals and add axioms that require that for each grid node with the horizontal successor ′, the -coordinate of ′ is the successor of the -coordinate of with respect to the generalized linear order, while the -coordinate remains unchanged. We then do a similar thing with the vertical successor ′′ of . Namely, the -coordinate of ′′ is the successor of the -coordinate of with respect to the generalized linear order, while the -coordinate stays the same. Figure 1, on the right illustrates this. It is not hard to see that all possible pairs of - and -coordinates occur within this tree. Now, the only thing that is left to do is to merge nodes with same coordinates. This is easy: we simply let the special individual last be the only grid node with last · · · last as its - and -coordinate. Propagating backwards from last while relying on the fact that each grid node has at most one ℎ- and at most one -predecessor, we can easily see that each diferent combination of the coordinates occurs exactly one time – thus we have 2 diferent grid nodes. Moreover, the way we assigned their coordinates ensures that they form a proper grid. We next detail the construction by providing all the relevant axioms.

Collect constants from .

We first collect in Adom all the individuals occurring in : Adom ≡

⨆︁ ∈N+∩Σ ⊔

⨆︁ ∈N+∩Σ (∃ ⊔ ∃− ) Guess a linear order. We next add the axioms that guess a linear order over the known individuals, stored using the concept name Node. We use two individuals First and Last as designated first and last elements in this linear order. The role stores the successor relation, and lessThan is a role that stores the induced "less than" relation.

Node ≡ Adom ⊔ {First} ⊔ {Last} Node ⊑ ∃next.Node ⊔ {Last} Node ⊑ ∃next− .Node ⊔ {First} (func next) (func next− ) {} ⊓ Node ⊑ ∀lessThan.¬{} next ⊑ lessThan (trans lessThan) ∃lessThan ⊑ Node ∃lessThan− ⊑ Node

Axioms on the left-hand side are responsible for guessing the successor relation of the linear order that is being generated. They ensure that all individuals except for the last one have a successor, and all individuals except for the first one have a predecessor. Moreover, successors and predecessors must be unique. Axioms on the right-hand side says that the transitive closure of contains no cycles, meaning that we have a proper linear order. The last two axioms serve as guards to ensure that a transitivity assertion is made over a restricted role. Creating the grid structure. To create a × grid, we take the approach above and add, for all 1 ≤ ≤ , the following axioms that create a binary tree routed in firstusing ℎ and : GridNode ⊑ d1≤ ≤ (∃.Node ⊓ ∃.Node) {First} ≡ GridNode ⊓ d1≤ ≤ (∃.{First} ⊓ ∃.{First}) GridNode ⊑ ∃ℎ.GridNode ⊔ (d1≤ ≤ ∃.{Last}) GridNode ⊑ ∃.GridNode ⊔ (d1≤ ≤ ∃.{Last}) d1≤ ≤ ∃.{Last} ⊑ ¬∃ℎ.⊤ d1≤ ≤ ∃.{Last} ⊑ ¬∃.⊤ (func ℎ) (func ℎ− ) (func ) (func − ) (func ) (func )

The first axiom on the left-hand side states that every grid node has 2k pointers to the known individuals using functional roles and , 1 ≤ ≤ that encode its - and -coordinates. The second axiom on the left-hand side sets firstas a designated origin point with - and -coordinates first· first. The rest of the axioms simply create the tree.

We next make sure that the coordinates align, i.e., if ′ is an ℎ-successor of , then the coordinates of and ′ coincide, while the -coordinate of ′ is a successor of the -coordinate of w.r.t. to the linear order in extended to words of length . For example, if the -coordinate of is · · · · last · · · last, where ̸= 1, then the -coordinate of is · · · ′ · first· · · first, where ′ is the successor of according to the given linear order. We now define the axioms that do this and add for all , 1 ≤ ≤ : {First} ⊑ ∀()− .∀ℎ− .∀.{Last}) GridNode ⊓ ¬∃.{Last} ⊓ d1≤ ≤ ∃.{Last} ⊑ IncrX IncrX ⊑ ∀ℎ.(d1≤ ≤ ∃.{First}) {} ⊑ ∀Node? ∘ ()− ∘ IncrX? ∘ ℎ ∘ ∘ next− .{}) {} ⊑ ∀Node? ∘ ()− ∘ IncrX? ∘ ℎ ∘ .{}

{} ⊑ ∀Node? ∘ ()− ∘ ℎ ∘ .{} We only show how to handle the -coordinate, since the -coordinate is treated analogously. Finally, we add the axiom that triggers the merging of the nodes with same coordinates: {Last} ≡ GridNode ⊓ l (∃.{Last} ⊓ ∃.{Last})

1≤ ≤ 5.2. Encoding the Turing Machine We next simulate the computation of using the grid we just created. We assume we have the following concept names available: (i) 1, ¯1, 2, ¯2, , ¯, for all ∈ Σ ∩ N and all ∈ Σ ∩ N, (ii) , for all symbols ∈ ΓΣ′ ∪ {}, (iii) , for all ∈ , and (iv) < and >. Copying onto the input tape. The first row of the grid, referred to as the input tape, represents the initial configuration of . Recall that we encode Σ-ABoxes over the signature as words of length 2 where each position in the word represents a pair of individuals in and each pair of individuals occurring in is represented by one position in the word. We now add axioms that make sure that each one of the first 2 cells on the input tape corresponds to a single pair of individuals occurring in the KB, while the remainder of the cells on the input tape are filled out with the blank symbol B. This is done by assuring that every input cell, i.e., a node in the first row, has two pointers to known individuals: hasFst and hasSnd. If for some input cell there are two known individuals , s.t. (, ) participates in hasFst and (, ) participates in hasSnd, then represents the pair (, ). To ensure that all pairs are represented on the input tape, we follow the same approach as for the grid construction. Namely, the available linear order is lifted to pairs of known individuals, and we require that a horizontal successor of some input cell also represents the next pair w.r.t. to this linear order. Once all pairs are represented, the remaining input cells are set to blank, i.e., they participate in the concept . We defer the exact axioms to the appendix.

We next need put the correct symbols in each cell on the input tape. Recall that, if position in the encoding of the ABox represents the pair (, ), we have the following symbol at position : = ({ | () ∈ }, { | (, ) ∈ }, { | () ∈ }) ∈ ΓΣ′ . We next add axioms that ensure exactly that. Namely, if a cell on the input tape represents the pair (, ), then it participates in the concept . We first copy the information about which concept and role names and participate in. To this end, for ∈ Σ ∩ N and every ∈ Σ ∩ N we add: ¯ 1 ⊓ 1 ⊑ ⊥

¯ 2 ⊓ 2 ⊑ ⊥

¯ ⊓ ⊑ ⊥ ∃hasFst. ⊑ 1 ∃¬hasFst. ⊑ ¯1 ∃hasSnd. ⊑ 2 ∃¬hasSnd. ⊑ ¯2 ∃hasFst.({} ⊓ ∃.{}) ⊓ ∃hasSnd.{} ⊑ ∃hasFst.({} ⊓ ¬∃.{}) ⊓ ∃hasSnd.{} ⊑ ¯ Finally, for each = (, , ′) ∈ ΓΣ, we add: l ∈, ∈(Σ∩N)∖ (1 ⊓ ¯1) ⊓

l ∈ ′, ∈(Σ∩N)∖ ′ (2 ⊓ ¯2) ⊓

( ⊓ ¯) ⊑ , l ∈, (∈Σ∩N)∖

We now use the rest of the grid to simulate the computation of the TM . Recall that a row in the grid stores a configuration that is currently in, while corresponds to time. To this end, we need to ensure that for each row in the grid satisfies two conditions. Firstly, there is exactly one element in where holds for some and at most one ∈ . For other elements ′ ̸= in , does not hold for any . Secondly, for all elements of in , holds for exactly one ∈ Γ ∪ {}. It is then clear that each row is indeed a valid encoding of some configuration of . If for a node in , then is in state and the read-write head is in the position .

We next add the axioms that ensure that at the beginning, is in the state 0 and the read-write head is above the first symbol:

InputCell ⊓ l .{First} ⊑ 0 1≤ ≤ ⊑

l ′∈(∖{}) ′ , for all ∈

Further, for all (, ) ∈ × Γ∪{}, we add the following axiom that selects one configuration among possible next configurations, and overwrites the current symbol, changes the state and moves the read-write head accordingly: ⊓ ⊑ ︀(

⨆︁ (′, ′,+1)∈ (, ) ∀.( ′ ⊓ ∀ℎ.′ )︀) ⊔ ︀(

⨆︁ (′, ′,− 1)∈ (, ) ∀.( ′ ⊓ ∀ℎ− .′ )︀) For all states ∈ , we mark the positions that are not under the read-write head: ⊑ (∀ℎ.<) ⊓ (∀ℎ− .>) < ⊑ l ′ ⊓ ∀ℎ.< > ⊑

l ′ ⊓ ∀ℎ− .< ∈ ∈ The intuition of the above is as follows. If in some position we have , then all the position to the right from are marked with <. Intuitively, < (resp. >) says that the read-write head is behind (resp. ahead) and thus these positions do not participate in , for any .

As one of the last steps, we need to add an axiom that copies the content of the tape that is not overwritten. For all ∈ Γ ∪ {} we add: ⊓ (> ⊔ <) ⊑ ∀.

We are now almost done with our construction of : for any Σ-ABox , each model of ( , Σ, ), where is the TBox we have constructed so far, corresponds to one possible computation of on the encoding of . By assumption, always terminates, which means that in each model of the theory we will either have some object for which acc holds or some object for which rej holds. Finally, to obtain , we add the axiom rej ⊑ ⊥ to . Now, every model of ( , Σ, ) corresponds to computation of accepting the encoding of . Thus, for = ( , Σ), () = 1 if and only if there are no accepting computations of ran on encΣ, that is, () = 1.

6. Discussion

In this paper, we have discussed some of the expressiveness limitation of very expressive OMQ languages, and then proposed an extension of ℒℋℐℱ equipped with closed predicates as OMQ language that captures precisely the class of generic Boolean queries over ABoxes that are computable in coNP.

The arguments presented in the paper can also be applied to standard Horn-DLs (with no closed predicates). For instance, the OMQ language that couples inconsistency and instance queries with ℰ ℒℋℐℱ ontologies is PTime-hard, but it cannot express all queries computable in PTime. It is not dificult see that extending

ℰ ℒℋℐℱ with a built-in linear order is not suficient to capture PTime, but the further addition of the features described in Section 4 leads to a DL that allows to precisely capture PTime.

Acknowledgments

This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation. It was also partially supported by the Austrian Science Fund (FWF) project P30873.

[1]

Ortiz ,

Calvanese ,

Eiter , Data complexity of query answering in expressive description logics via tableaux , J. Autom. Reason . 41 ( 2008 ) 61 - 98 . URL: https://doi.org/10.1007/ s10817-008-9102-9. doi: 10 .1007/s10817-008-9102-9.

[2]

Bienvenu , B. ten Cate ,

Lutz ,

Wolter , Ontology-based data access: A study through disjunctive datalog, CSP, and MMSNP , ACM Trans. on Database Systems 39 ( 2014 ) 33 : 1 - 33 : 44 . URL: http://doi.acm. org/10 .1145/2661643. doi: 10 .1145/2661643.

[3]

Ahmetaj ,

Ortiz ,

Simkus , Polynomial rewritings from expressive description logics with closed predicates to variants of datalog , Artificial Intelligence 280 ( 2020 ) 103220 . URL: https://doi.org/10.1016/j.artint. 2019 . 103220 . doi: 10 .1016/j.artint. 2019 . 103220 .

[4]

Gogacz ,

Lukumbuzya ,

Ortiz ,

Simkus , Datalog rewritability and data complexity of ALCHOIF with closed predicates , in: D. Calvanese , E. Erdem, M. Thielscher (Eds.), Proc. of KR 2020 , 2020 , pp. 434 - 444 . URL: https://doi.org/10.24963/kr.2020/44. doi: 10 .24963/kr. 2020/44.

[5]

Immerman , Descriptive complexity, Graduate texts in computer science, Springer, 1999 . URL: https://doi.org/10.1007/978-1- 4612 -0539-5. doi: 10 .1007/978-1- 4612 -0539-5.

[6]

Lutz ,

Seylan ,

Wolter , The data complexity of ontology-mediated queries with closed predicates , Logical Methods in Computer Science 15 ( 2019 ). URL: https://doi.org/10.23638/ LMCS- 15 ( 3 :23) 2019 . doi: 10 .23638/LMCS- 15 ( 3 :23) 2019 .

[7]

Krötzsch ,

Maier ,

Krisnadhi ,

Hitzler , A better uncle for OWL: nominal schemas for integrating rules and ontologies , in: S. Srinivasan,

Ramamritham ,

Kumar ,

M. P.

Ravindra , E. Bertino, R. Kumar (Eds.), Proceedings of the 20th International Conference on World Wide Web, WWW 2011 , Hyderabad, India, March 28 - April 1, 2011 , ACM, 2011 , pp. 645 - 654 . URL: https://doi.org/10.1145/1963405.1963496. doi: 10 .1145/1963405.1963496.

[8]

Krötzsch ,

Rudolph , Nominal schemas in description logics: Complexities clarified , in: C. Baral,

G. D.

Giacomo , T. Eiter (Eds.), Proc. of KR 2014 , AAAI Press, 2014 . URL: http://www.aaai.org/ocs/index.php/KR/KR14/paper/view/8027.