Introduction

Fixed Parameter Tractable Reasoning in DLs via Decomposition

František Simancˇík

Boris Motik

Markus Krötzsch

0 0 Department of Computer Science, University of Oxford , UK

DL reasoning is of high computational complexity even for basic DLs such as ALCI [3, Chapter 3]. Intuitively, due to disjunctions (or-branching) and/or existential quantifiers (and-branching), a DL reasoner may need to investigate (at least) exponentially many combinations of concepts. A range of highly-tuned optimizations, such as absorption, dependency-directed backtracking, blocking, and caching [3, Chapter 9], can be used to tame these sources of complexity. None of these techniques, however, provide formal tractability guarantees. Such guarantees can be obtained by restricting the language expressivity, as done in the EL [2], DL-Lite [4,1], and DLP [8] families of DLs. Tractable DLs typically do not support disjunctions, which eliminates or-branching, and they either significantly restrict universal quantification (as in EL and DL-Lite) or disallow existential quantification (as in DLP), which eliminates or reduces and-branching. Obtaining tractability guarantees for hard computational problems has been extensively studied in parameterized complexity [5]. The general idea is to measure the “hardness” of a problem instance of size n using a nonnegative integer parameter k, and the goal is to solve the problem in time that becomes polynomial in n whenever k is fixed. A particular goal is to identify fixed parameter tractable (FPT) problems, which can be solved in time f (k) nc, where c is a constant and f is an arbitrary computable function that depends only on k. Note that not every problem that becomes tractable if k is fixed is in FPT. For example, checking whether a graph of size n contains a clique of size k can clearly be performed in time O(nk), which is polynomial if k is a constant; however, since k is in the exponent of n, this does not prove membership in FPT. Note that every problem is FPT if the parameter is the problem's size, so a useful parameterization should allow increasing the size arbitrarily while keeping the parameter bounded. Various problems in AI were successfully parameterized by exploiting the graph-theoretic notions of tree decompositions and treewidth [6,7,10], which we recapitulate next. A hypergraph is a pair G = hV; Hi where V is a set of vertices and H 2V is a set of hyperedges. A tree decomposition of G is a pair hT; Li where T is an undirected tree whose sets of vertices (also called bags) and edges are denoted with B(T ) and E(T ), and L : B(T ) ! 2V is a labeling of B(T ) by subsets of V such that (T1) for each v 2 V, the set fb 2 B(T ) j v 2 L(b)g induces a connected subtree of T , and (T2) for each e 2 H, there exists a bag b 2 B(T ) such that e L(b). The width of hT; Li is defined as maxb2B(T) L(b) 1. Finally, the treewidth of G is the minimum width among all possible tree decompositions of G. Consider now an instance N of the SAT problem, where N is a finite set of clauses (i.e., disjunctions

Introduction

of possibly negated propositional variables). The notions of tree decompositions and treewidth of N are defined w.r.t. the hypergraph GN = hVN ; HN i where VN is the set of propositional variables occurring in N, and HN contains the hyperedge fp1; : : : ; pkg for each clause (:)p1 _ : : : _ (:)pk 2 N. When parameterized by treewidth, SAT is FPT [ 10 ]. Intuitively, the treewidth of N shows how many propositional variables must be considered simultaneously in order to check the satisfiability of N; thus, bounding the treewidth has the e ect of bounding or-branching.

Inspired by these results, we present a novel DL reasoning algorithm that ensures fixed parameter tractability. To this end, in Section 3 we introduce a notion of a decomposition D of a signature . Intuitively, D is a graph that restricts the propagation information between the atomic concepts in . A decomposition of can be seen as one or more tree decompositions, each reflecting the propagation of information due to or-branching, interconnected to reflect the propagation of information due to andbranching. We identify a parameter of D called width; intuitively, this parameter determines an upper bound on the number of concepts that must be considered simultaneously to solve a reasoning problem. Let O be an ALCI ontology normalized to contain only axioms of the form i Ai v F j B j, A v 9R:B, and A v 8R:B, where A(i) and B( j) are atomic concepts, and R is a (possibly inverse) role. We present a resolution-based reasoning calculus that runs in time O( f (d) jDj jOj), where d is the width of D, jDj is the size of D, and jOj is the number of axioms in O. Our calculus is not complete for all D: it is not guaranteed to derive all consequences that might be of interest. To remedy that, we introduce a notion of D being admissible for O and the relevant consequences, and we show that admissibility guarantees completeness.

Ideally, given O and the relevant consequences, one would identify an admissible decomposition D of smallest width and then run our calculus in order to obtain an FPT algorithm. In Section 4, however, we show that, for certain O, all admissible decompositions of smallest width have exponentially many vertices. This is in contrast to tree decompositions (e.g., for each instance of SAT, a tree decomposition of minimal width exists in which the number of vertices is linear in the size of the instance) and is due to the fact that, in addition to or-branching, our decompositions analyze information flow due to and-branching as well. We therefore further restrict the notion of admissible decompositions in several ways. For each of the resulting notions, one can compute a decomposition of width at most d (if one exists) in time f (d) jOjc with f a computable function and c an integer constant; together with our resolution-based calculus, we thus obtain an FPT calculus for reasoning with normalized ALCI ontologies.

In Section 5 we show that the minimum decomposition width of several commonly used ontologies is much smaller than the respective ontology’s size. This suggests that decomposition width provides a “reasonable” measure of ontology complexity, and that our approach might even provide practical tractability guarantees.

Our results can be applied to SH I ontologies by transforming away role hierarchies and transitivity and normalizing the ontology in a preprocessing step. Such transformations, however, are don’t-care nondeterministic, and the minimum decomposition width of the normalization result might depend on the nondeterministic choices. In this paper we thus restrict our attention to normalized ALCI ontologies, and we leave an investigation of how normalization a ects the minimum width for future work. R3 K v M : K v M 2 O R4

K1 v M1 t A A u K2 v M2

K1 u K2 v M1 t M2 or

B u i Di v F j E j

i Di v F j E j A u i Ci v F j F j

A v 9R:B 2 O : Ci v 8R:Di 2 O

E j v 8R :F j 2 O In order to motivate the results presented in the following sections, in this section we present a very simple calculus that is not FPT, and we discuss the rough idea for making the calculus FPT. The calculus is based on resolution, and is similar to the calculus presented in [ 9 ]. Resolution can often provide worst-case optimal calculi whose best case complexity is significantly lower than the worst case complexity; indeed, the calculus from [ 9 ] has demonstrated excellent practical performance.

The calculus manipulates clauses—expressions of the form K v M, where K is a finite conjunction of atomic concepts, and M is a finite disjunction of atomic concepts. With sig(K), sig(M), and sig(K v M) we denote the sets of atomic concepts occurring in K, M, and K v M, respectively. We consider two disjunctions (resp. conjunctions) to be the same whenever they mention the same atoms; that is, we disregard the order and the multiplicity of atoms. We write empty K and M as > and ?, respectively. Furthermore, we say that a clause K0 v M0 is a strengthening of a clause K v M if sig(K0) sig(K) and sig(M0) sig(M). We write K v M 2ˆ N if the set of clauses N contains at least one strengthening of the clause K v M.

Given a normalized ontology O, our calculus constructs a derivation—a sequence S0; S1; : : : of sets of clauses such that S0 = ;, and for each i > 0, set Si is obtained from Si 1 by applying a rule from Fig. 1. Rules R1 and R2 implement propositional resolution, and rule R3 ensures that each clause in O is taken into account. Rule R4 handles role restrictions; letter R stands for a role (i.e., R need not be atomic), and inv(R) is the inverse role of R; finally, note that the atom B in the premise of the rule is optional. Intuitively, the rule says that, if B, Di, and :E j jointly imply a contradiction, but A v 9R:B, Ci v 8R:Di, and :F j v 8R::E j hold, then A, Ci, and :F j jointly imply a contradiction too. Reasoning with the second premise is analogous.

A saturation is defined as S B Si Si. The calculus infers a clause K v M, written O ` K v M, if K v M 2ˆ S. It is straightforward to see that the calculus is sound: if O ` K v M, then O j= K v M. Typically, resolution is used as a refutation-complete calculus; however, it is possible to show that the variant of resolution presented here is complete in the following stronger sense: if O j= K v M, then O ` K v M; note that this means that the calculus infers at least one strengthening of each clause entailed by O. This stronger notion of completeness can be useful in practice; for example, O can be classified using a single run of the calculus, which is not the case for calculi (such as tableau) that are only refutationally complete.

Let d be the number of atomic concepts in O. Since each clause is uniquely identified by the atomic concepts that occur in K and/or M, the calculus can derive at most 4d clauses, which is exponential in jOj. The high complexity of DL reasoning arises because one may have to consider exponentially many combinations of concepts, and this fact fundamentally underpins all DL reasoning algorithms. Clearly, a tractable algorithm should consider only polynomially many combinations. For example, reasoning algorithms for EL exploit the fact that only polynomially many combinations are “relevant” and that all of them can be constructed deterministically. In the following sections, we ensure tractability of reasoning in a radically di erent way. Instead of restricting the ontology language, we show that by restricting the structure of the ontology with a suitable parameter one can limit the number of concepts that must be simultaneously considered, which e ectively limits the exponent in the above calculation. Since the base of the exponent not depend on jOj, we will thus obtain an FPT reasoning calculus. 3

Reasoning with Decompositions

In this section we develop the notions of decomposition, decomposition admissibility, and the resolution calculus. We start by introducing the notion of decomposition. Definition 1. Let = h A; Ri be a DL signature, where A is a finite set of atomic concepts and R is a finite set of atomic roles; let R = fR j R 2 Rg be the set of inverse roles of R; and let be a symbol not contained in A [ R [ R .

A decomposition of is a labeled graph D = hV; E; sigi, where V is a finite set of vertices, E V V ( R [ R [ f g) is a set of directed edges labeled by a role or by , and sig : V ! 2 A is a labeling of each vertex with a set of atomic concepts. The width of D is defined as wd(D) B maxv2V jsig(v)j.

Note that D is not defined w.r.t. an ontology, but w.r.t. a signature , and we will establish a link between D and O shortly in our notion of admissibility. This is mainly so as to gather all conditions that guarantee completeness in one place. We discuss the intuition behind this definition after presenting the resolution-based calculus. Definition 2. Let be a DL signature, let D = hV; E; sigi be a decomposition of , and let O be a normalized ALCI ontology over . The resolution calculus for D and O is defined as follows.

A clause system for D is a function S that assigns to each vertex v 2 V a set of clauses S(v). A derivation of the calculus is a sequence of clause systems S0; S1; S2; : : : such that S0(v) = ; for each v 2 V and, for each i > 0, Si is obtained from Si 1 by an application of a derivation rule from Fig. 2; we assume that each derivation is fair in the usual sense. The saturation is the clause system S defined by S(v) B Si Si(v) for each v 2 V. The calculus infers a clause K v M at vertex v, written O; v `D K v M, if K v M 2ˆ S(v); furthermore, the calculus infers a clause K v M, written O `D K v M, if a vertex v 2 V exists such that O; v `D K v M.

The calculus is complete (sound) if O j= K v M implies (is implied by) O `D K v M for each clause K v M over . Given a set of clauses C over , the calculus is Ccomplete if O j= K v M implies O `D K v M for each K v M 2 C.

R1 add A v A to S(v)

: A 2 sig(v) R2

K1 v M1 t A 2 S(v) A u K2 v M2 2 S(v)

add K1 u K2 v M1 t M2 to S(v) R3 add K v M to S(v) : sKigv(KMv2MO)

sig(v)

B u or R4 add A u i Di v F j E j 2 S(u) i Di v F j E j 2 S(u) i Ci v F j F j to S(v)

A v 9R:B 2 O Ci v 8R:Di 2 O : E j v 8inv(R):F j 2 O hu; v; Ri 2 E sig(A u i Ci v F j F j) sig(v)

K v M 2 S(u) : hu; v; i 2 E R5 add K v M to S(v) sig(K v M)

sig(v)

While the simple calculus from Section 2 saturates a single set of clauses, the resolution calculus for D and O saturates one set of clauses per decomposition vertex. In particular, for a vertex v 2 V, set S(v) contains only clauses whose propositional atoms are all contained in sig(v), so v identifies a propositional subproblem of O. Rules R1–R3 implement propositional resolution “within” each vertex v. Rule R5 propagates propositional consequences from vertex u to vertex v connected by an -labeled edge; thus, the -labeled edges of D “connect” the subproblems of O in accordance with or-branching. Finally, rule R4 propagates modal consequences from a vertex u to a vertex v connected by an R-labeled edge; thus, the R-labeled edges of D “connect” the subproblems of O in accordance with and-branching. A clause is inferred if at least one saturated set S(v) contains a strengthening of the clause.

Note that rules R1–R3 consider only one vertex at a time, whereas rules R4 and R5 involve two vertices. Thus, although this was not our initial motivation, the calculus seems to exhibit significant parallelization potential. We leave a thorough investigation of the reasoning problem in terms of parallel complexity classes for future work.

The notion of C-completeness takes into account that one might be interested not only in refutational completeness, but in the derivation of all clauses from some set C. For example, if one is interested in the classification of O, then C would contain all clauses of the form A v B with A and B atomic concepts occurring in O.

The following proposition determines the complexity of the calculus in terms of the sizes of D and the number jOj of axioms in O. It essentially observes two key facts: first, since the clauses in each S(v) are restricted to atomic concepts in sig(v), the maximum number of clauses in S(v) is determined solely by wd(D); and second, given a node or a pair of nodes, all rules can be applied in time that also depends solely on wd(D). Once we limit the size of D, this proposition will provide us with an FPT algorithm. Proposition 1. Let D = hV; E; sigi and O be as in Definition 2. The saturation of the resolution calculus for D and O can be computed in time O( f (wd(D)) (jVj + jEj) jOj), where f is some computable function.

The rules of our calculus are clearly sound for arbitrary decompositions D and ontologies O; however, the converse is not true. As a trivial example, note that the decomposition with the empty vertex and edge sets satisfies Definition 1, and that our calculus does not infer any clause using such D. Therefore, we next introduce the notion of admissibility, which we later show to be su cient for completeness. Definition 3. Let D = hV; E; sigi be a decomposition of a DL signature = h A; Ri.

Let W V be an arbitrary set of vertices. The signature of W is defined as sig(W) B Sw2W sig(w). The -projection of D w.r.t. W is the undirected graph DW that contains the undirected edge fu; vg for each hu; v; i 2 E with u; v 2 W. Set W is -connected if, for all u; v 2 W, vertices fw0; w1; : : : ; wng W exist such that w0 = u, wn = v, and hwi 1; wi; i 2 E for each 1 i n; furthermore, W is an -component of D if W is -connected, and each W0 such that W ( W0 V is not -connected.

Decomposition D is admissible for an ontology O if hu; v; i 2 E implies hv; u; i 2 E for all u; v 2 V, and if each -component W of D satisfies the following properties: (i) DW is an undirected tree; (ii) for each atomic concept A 2 sig(W), the set fw 2 W j A 2 sig(w)g is -connected; (iii) for each clause K v M 2 O such that sig(K) sig(W), a vertex w 2 W exists such that sig(K v M) sig(w); (iv) for each axiom A v 9R:B 2 O such that A 2 sig(W), an -component U of D and vertices w 2 W and u 2 U exist such that – hu; w; Ri 2 E, – A 2 sig(w), – B 2 sig(u), – for each C v 8R:D 2 O, if C 2 sig(W) then C 2 sig(w) and D 2 sig(u), and – for each E v 8inv(R):F 2 O, if E 2 sig(U) then E 2 sig(u) and F 2 sig(w).

A clause K v M is covered by D if an -component W of D and a vertex w 2 W exist such that sig(K) [ [sig(M) \ sig(W)] sig(w). Decomposition D is admissible for C if each clause in C is covered by D.

Definition 3 incorporates two largely orthogonal ideas. First, each -component W of D reflects the propositional constraints on domain elements of a particular type in a model of O. To deal with or-branching, each W is a tree decomposition formed by undirected -labeled edges. Conditions (i)–(iii) are analogous to (T1) and (T2) in Section 1, but (iii) is more general: instead of requiring sig(K v M) sig(w) for each K v M 2 O and some w 2 W, Condition (iii) takes into account that, if sig(K) * sig(W), then K v M can be satisfied by making the atomic concepts in sig(K) n sig(W) false on the appropriate domain element; thus, sig(K v M) sig(w) must hold for some w 2 W only if sig(K) sig(W). Admissibility for C uses an analogous idea.

Second, to deal with and-branching, the -components of D are interconnected via role-labeled edges. If a concept A occurs in an -component W and in an axiom of O of the form A v 9R:B, then a domain element corresponding to W might need to have an R-successor; to reflect that, D must contain an -component U, and vertices w 2 W and u 2 U connected by an R-labeled edge must exist such that A 2 sig(w) and B 2 sig(u). Furthermore, in order to address the universal quantifiers over R, if C v 8R:D 2 O and C 2 sig(W), then C 2 sig(w) and D 2 sig(u) must hold, and analogously for universals over inv(R). These conditions ensure that w and u contain all atomic concepts that might be relevant for modal reasoning, which in turn allows our calculus to infer all relevant constrains on atomic concepts.

The following theorem shows that admissibility indeed ensures completeness. Theorem 1. Let O be an ontology, let C be a set of clauses, and let D = hV; E; sigi be a decomposition that is admissible for O and C. Then, the resolution calculus for D and O is C-complete.

Ideally, given an ontology O and a set of clauses C, one would identify a decomposition D of smallest width and then apply the resolution calculus for D and O to obtain an FPT algorithm. The following theorem shows, however, that this idea does not work, since it is not the case that, for each ontology O, there exists a decomposition of minimal width that is admissible for O and whose size is polynomial in jOj. In order to address this problem, in Section 4 we further restrict the notion of admissibility. Theorem 2. A family of ALCI ontologies fOng exists such that each decomposition admissible for On and C = fC v ?g of minimal width has size exponential in jOnj. 4

Constructing Decompositions of Polynomial Size

In Section 4.3 we present a general method for computing admissible decompositions of polynomial size, for which we obtain the desired FPT result. This method embodies two largely orthogonal ideas, each of which we present separately for didactic purposes. In particular, in Section 4.1 we present an approach for analyzing and-branching, and in Section 4.2 we present an approach for analyzing or-branching. 4.1

Analyzing And-Branching via Deductive Overestimation In this section we present an approach for analyzing and-branching, which is inspired by the reasoning algorithm for EL [ 2 ]. The approach uses an overestimation of the subsumption relation to construct the decomposition. It manipulates expressions of the form K A, where K is a conjunction of atomic concepts, and A is an atomic concept. Given an ALCI ontology O and a set of clauses C, the deductive overestimation for O and C is the relation obtained by exhaustive application of the rules shown in Fig. 3.

Intuitively, K A states that an object whose existence is required to satisfy K can become an instance of A. On EL ontologies coincides with the subsumption relation, but on more expressive ontologies overestimates the subsumption relation. In order to check whether a clause K v M 2 C is entailed by O, rule E1 introduces an instance of all atomic concepts in K. Rule E2 addresses the fact that, if some object is an instance of A1; : : : ; An and O contains a clause A1 u : : : u An v B1 t : : : t Bm, then the

K E2 K

K E3 B

K E4

A1 : : : K A1 : : : K B1 : : : K A B : A v 9R:B 2 O An : A1 u : : : u An v B1 t : : : t Bm 2 O

Bm A K B D

C : A v 9R:B 2 O C v 8R:D 2 O

A B K F

E : A v 9R:B 2 O

E v 8R :F 2 O object must be an instance of some Bi. Since a polynomial overestimation method that reasons by case is unlikely to exist, rule E2 overestimates the subsumption relation by saying that can be an instance of all B1; : : : ; Bm. Rule E3 takes into account that, given A v 9R:B 2 O, each instance of A needs an R-successor that is an instance of B. Analogously to the EL reasoning calculus, in order to obtain a polynomial overestimation method, rule E3 “reuses” the same successor to satisfy multiple existential restrictions to the same concept B. Finally, rules E4 and E5 implement modal reasoning.

Having computed , we construct the decomposition DE = hV; E; sigi of the symbols occurring in O and C as shown below. Note that DE contains no -labeled edges, as this decomposition method does not analyze or-branching. By Theorems 1 and 3, the resolution calculus for DE and O is C-complete.

V B fvK j K A for some Ag

E B fhvB; vK ; Ri j K A and A v 9R:B 2 Og sig(vK ) B fA j K

Ag Theorem 3. Decomposition DE is admissible for O and C. 4.2

Analyzing Or-Branching via Tree Decomposition We now present an approach for computing admissible decompositions that analyzes or-branching. The approach handles the clauses in O as explained in Section 1 for SAT, and it imposes additional constraints in order to satisfy condition (iv) of Definition 3.

Given a normalized ontology O and a set of clauses C, we define the hypergraph O C = hV; Hi such that V and H are the smallest sets satisfying the following properG ; ties. For each atomic concept A occurring in O or C, we have A 2 V. For each clause K v M 2 O, we have sig(K v M) 2 H. For each A v 9R:B 2 O, set H contains hyperedges domAv9R:B and ranAv9R:B defined as shown below, where Ci v 8R:Di, 1 i n and E j v 8inv(R):F j, 1 j m are all axioms in O of the respective forms: domAv9R:B B fA; C1; : : : ; Cn; F1; : : : ; Fmg; ranAv9R:B B fB; D1; : : : ; Dn; E1; : : : ; Emg:

Finally, sig(K v M) 2 H for each K v M 2 C.

Given a tree decomposition hT; Li of GO;C, we construct (don’t-care nondeterministically) a decomposition DT = hV; E; sigi as follows. The vertices of DT are the bags of T —that is, V B B(T ). The signatures of DT are the labels of T —that is, sig B L. The -edges of DT are the edges of T —that is, for each fu; vg 2 E(T ), we have hu; v; i 2 E. Finally, for each A v 9R:B 2 O, choose vertices u; v 2 V such that ranAv9R:B L(u) and domAv9R:B L(v) and set hu; v; Ri 2 E; such u and v exist due to property (T2) of the definition of tree decompositions in Section 1.

Theorem 4. Every decomposition DT is admissible for O and C. 4.3

Analyzing And- and Or-Branching Simultaneously We now show how to combine the approaches for analyzing and- and or-branching to obtain a C-decomposition of a normalized ALCI ontology O and a set of clauses C.

The procedure consists of three steps. First, we compute the relation as described in Section 4.1. This step analyzes the and-branching inherent in O and C.

Second, for all K such that K A for some A, we simultaneously define hypergraphs GK = hVK ; HK i where VK B fA j K Ag, and HK are the smallest sets satisfying the following conditions. For each clause K0 v M0 2 O with sig(K0 v M0) VK , we have sig(K0 v M0) 2 HK . For each axiom A v 9R:B 2 O such that A 2 VK , set HK contains hyperedge domK;Av9R:B and set HB contains hyperedge ranK;Av9R:B defined below, where Ci v 8R:Di, 1 i n and E j v 8inv(R):F j, 1 j m are all axioms in O of the respective forms such that Ci 2 VK and E j 2 VB: domK;Av9R:B B fA; C1; : : : ; Cn; F1; : : : ; Fmg; ranK;Av9R:B B fB; D1; : : : ; Dn; E1; : : : ; Emg:

Finally, [sig(K v M) \ VK ] 2 HK for each K v M 2 C.

Third, we compute a tree decomposition hTK ; LK i for each hypergraph GK ; without loss of generality we assume that all sets B(TK ) are disjoint. We then construct the decomposition DC = hV; E; sigi as follows. The vertices of DC are the bags of the tree decompositions—that is, V B SK B(TK ). The signatures of DC are the labels of the tree decompositions—that is, sig B SK LK . The -edges of DC are the edges of the tree decompositions—that is, hu; v; i 2 E for each fu; vg 2 E(TK ). Finally, for each axiom A v 9R:B 2 O and each K such that A 2 VK , choose u 2 B(VB) and v 2 B(VK ) such that ranK;Av9R:B L(u) and domK;Av9R:B L(v) and set hu; v; Ri 2 E; such u and v exist due to property (T2) of the definition of tree decompositions in Section 1.

The class of all C-decompositions of O and C consists of all decompositions obtained in the way specified above. Note that the first step (computation of ) is deterministic, but the second step is not as each GK may admit several tree decompositions. The C-width of O and C is the minimal width of any C-decomposition of O and C. Theorem 5. Every decomposition DC is admissible for O and C.

To show that DL reasoning is FPT if the C-width is bounded, we next estimate the e ort required for computing a C-decomposition of O and C. With kOk and kCk we denote the sizes of (i.e. the numbers of symbols required to encode) O and C, respectively.

Ontology j Aj j Anormj

SNOMED CT (http://ihtsdo.org/snomed-ct/) 315,489 516,703 SNOMED CT-SEP (see [ 9 ] for reference) 54,973 149,839 FMA (http://fma.biostr.washington.edu/) 41,700 81,685 GALEN (http://opengalen.org/) 23,136 49,245 OBI (http://obi-ontology.org/) 2,955 4,296 Proposition 2. An algorithm exists that takes as input a positive integer d, a normalized ALCI ontology O, and a set of clauses C, that runs in time O(g(d) (kOk + kCk)5) for g a computable function, and that computes a C-decomposition of O and C of width at most d whenever at least one such decomposition exists.

We can now formulate the main FPT result for C-decompositions.

Theorem 6. Let d be a positive integer, let O be a normalized ALCI ontology, and let K v M be a clause. The problem of deciding whether a C-decomposition of O and C = fK v Mg of width at most d exists, and if so, whether O j= K v M, is FPT. 5

Experimental Results

It can be argued that FPT is interesting only if the parameter can be substantially smaller than the input size. In order to judge the “usefulness” of C-width as a complexity measure, we measured the C-width of several ontologies (listed in Table 1) that are often used for evaluating DL reasoners. We weakened all ontologies to ALCH I by discarding all unsupported features, we applied the structural transformation from [ 9 ], and we eliminated role inclusion axioms by unfolding the role hierarchy into universal restrictions to obtain normalized ALCI ontologies. Note that there are several di erent ways of formulating and optimizing structural transformation, and each could produce an ontology of a di erent C-width, so our results are not necessarily optimal.

After normalization, we next computed the deductive overestimation and the decomposition DE as described in Section 4.1, we constructed the hypergraphs GK as described in Section 4.3, and we fed all of them into TreeD1—a library for computing tree decompositions—to construct a C-decomposition DC. For each ontology we considered two sets of goal clauses: C1 = fA v ? j A 2 Ag, which corresponds to checking satisfiability of all atomic concepts, and C2 = fA v B j A; B 2 Ag, which corresponds to classification. In theory, the C-width of O and C1 can be smaller than the C-width of O and C2; however, we have not observed a di erence between the two in practice, so we present here only the results for classification. Also, please note that TreeD was able only to produce approximate, rather than exact tree decompositions; hence, our results provide only an upper bound on the C-width.

The results of our experiments are shown in Table 1. For each ontology we list the number of atomic concepts in the original ontology (j Aj), the number of atomic 1 http://www.itu.dk/people/sathi/treed/ concepts after normalization (j Anormj), and the widths of the two decompositions that we constructed. Notice that although some of the tested ontologies contain tens or even hundreds of thousands of concepts, the width of DC rarely exceeds one hundred, and it is always by several orders of magnitude smaller than the total number of concepts in the ontology. This suggests that our notion of a decomposition might even prove to be useful in practice, provided that our resolution algorithm is suitably optimized. 6

Conclusion

We presented a DL reasoning algorithm that is fixed parameter tractable for a suitable notion of the input width. We see two main challenges for our future work. On the theoretical side, our approach should be extended to more complex ontology languages; handling counting seems particularly challenging. On the practical side, our algorithm should be optimized for practical use. A particular challenge is to combine the construction of a decomposition with actual reasoning and thus save preprocessing time.

1. Artale , A. , Calvanese , D. , Kontchakov , R. , Zakharyaschev , M.: The DL-Lite Family and Relations . Journal of Artificial Intelligence Research 36 , 1 - 69 ( 2009 )

2. Baader , F. , Brandt , S. , Lutz , C. : Pushing the EL Envelope . In: Kaelbling, L.P. , Sa otti , A. (eds.) Proc. of the 19th Int. Joint Conference on Artificial Intelligence (IJCAI 2005 ). pp. 364 - 369 . Morgan Kaufmann Publishers, Edinburgh, UK (July 30-August 5 2005 )

3. Baader , F. , Calvanese , D. , McGuinness , D. , Nardi , D. , Patel-Schneider , P.F . (eds.): The Description Logic Handbook: Theory, Implementation and Applications . Cambridge University Press, 2nd edn. (August 2007 )

4. Calvanese , D. , De Giacomo , G. , Lembo , D. , Lenzerini , M. , Rosati , R. : Tractable Reasoning and E cient Query Answering in Description Logics: The DL-Lite Family . Journal of Automated Reasoning 9 , 385 - 429 ( 2007 )

5. Downey , R.G. , Fellows , M.R. : Parameterized Complexity . Springer ( 1999 )

6. Gottlob , G. , Pichler , R. , Wei , F. : Bounded Treewidth as a Key to Tractability of Knowledge Representation and Reasoning . In: Proc. of the 21st Nat. Conf. on Artificial Intelligence (AAAI 2006 ). pp. 250 - 256 . AAAI Press, Boston, MA, USA ( 2006 )

7. Gottlob , G. , Scarcello , F. , Sideri , M. : Fixed-parameter complexity in AI and nonmonotonic reasoning . Artificial Intelligence 138 ( 1-2 ), 55 - 86 ( 2002 )

8. Grosof , B.N. , Horrocks , I. , Volz , R. , Decker , S.: Description Logic Programs: Combining Logic Programs with Description Logic . In: Proc. of the 12th Int. World Wide Web Conference (WWW 2003 ). pp. 48 - 57 . ACM Press, Budapest, Hungary (May 20-24 2003 )

9. Simancˇík , F. , Kazakov , Y. , Horrocks , I. : Consequence-Based Reasoning beyond Horn Ontologies . In: Proc. of the 22nd Int. Joint Conf. on Artificial Intelligence (IJCAI 2011 ) (July 16-22 2011 ), to appear

10. Szeider , S. : On Fixed-Parameter Tractable Parameterizations of SAT . In: Giunchiglia, E. , Tacchella , A . (eds.) Proc. of the 6th Int. Conf. on Theory and Applications of Satisfiability Testing (SAT 2003 ), Selected Revised Papers. LNCS , vol. 2919 , pp. 188 - 202 . Springer, Santa Margherita Ligure, Italy (May 5-8 2003 )