Satisfiability in the Triguarded Fragment of First-Order Logic Sebastian Rudolph1 and Mantas Šimkus2 1 Computational Logic Group, TU Dresden, Germany 2 Institute of Logic and Computation, TU Wien, Austria Abstract. Most Description Logics (DLs) can be translated into well- known decidable fragments of first-order logic FO, including the guarded fragment GF and the two-variable fragment FO2 . Given their prominence in DL research, we take closer look at GF and FO2 , and present a new fragment that subsumes both. This fragment, called the triguarded frag- ment (denoted TGF), is obtained by relaxing the standard definition of GF: quantification is required to be guarded only for subformulae with three or more free variables. We show that satisfiability of equality-free TGF is N2ExpTime-complete, but becomes NExpTime-complete if we bound the arity of predicates by a constant (a natural assumption in the context of DLs). Finally, we observe that many natural extensions of TGF, including the addition of equality, lead to undecidability. 1 Introduction Description Logics (DLs) are a family of logic-based knowledge representation languages, usually suitably limited to ensure the decidability of basic reasoning problems [2, 3]. Many properties of DLs can be explained by seeing them as fragments of (function-free) first-order logic (denoted FO in this paper). In fact, most DLs fall into well-known decidable fragments of FO, implying not only decidability, but also complexity results, model-theoretic properties, and limits of expressiveness. For instance, many standard DLs are subsumed by FO2 , the fragment of FO with at most two variables [8]. For FO2 without equality, the satisfiability problem has been known to be decidable for over five decades due to Scott [20]. The decidability of satisfiability in FO2 in the presence of equality is known since 1975 due to Mortimer [16], with the worst-case optimal NExpTime upper bound known since over two decades [13]. An alternative explanation for the decidability of DLs is the fact that they can often be translated into the guarded fragment GF of FO [1] (see also [11] for a discussion). Satisfiability checking in GF is 2ExpTime-complete in general, but it is ExpTime-complete under the assumption that the arities of predicates are bounded by a constant [12]. The latter is particularly important because many standard DLs are ExpTime-complete for consistency checking, while their FO translations use predicate symbols of arity at most two. We note that the connection between DLs and GF is somewhat more robust than that between DLs and FO2 , which can be observed if we look beyond consistency checking in DLs. Most notably, conjunctive query answering, which is decidable for most DLs, remains decidable for GF, but becomes undecidable for FO2 [5, 18]. Given the importance of GF and FO2 to research in DLs, in this paper we take a deeper look at them, and study a new fragment of FO that subsumes both GF and FO2 . The fragment is called the triguarded fragment (denoted TGF), and it is obtained by relaxing the standard definition of GF. In GF, existential and universal quantification can only be used in (sub)formulae of the form ∃x.(R(t)∧ψ) or ∀x.(R(t) → ψ), where R(t) is an atomic formula such that t contains all free variables of ψ (the atom R(t) “guards” the formula ψ). In TGF, guardedness of quantification is required only in case ψ has three or more free variables (hence the name “triguarded”). This entails that quantification can be used in an unrestricted way for formulae with at most two free variables, and hence FO2 gets included in TGF seamlessly. After providing a simple definition of TGF, we study its satisfiability problem. To this end, we first consider a slightly different problem: we study satisfiability of formulae of GF in the presence of a built-in binary predicate U that contains all pairs of domain elements. In DL parlance, we consider the extension of GF with the universal role, and thus this fragment is denoted GFU. Since the predicate U can be used to provide “spurious” guards to formulae with up to two free variables, GFU adds to GF precisely the expressivity needed to capture TGF, and thus in the paper we mainly focus on GFU instead of TGF. We show that in the equality-free case, satisfiability of formulae in GFU (and in TGF) is N2ExpTime-complete. We establish the upper bound by character- izing the satisfiability of a formula in GFU via mosaics, where a mosaic is a special (finite) collection of types that can be used to build a model for the input formula. The matching lower bound can be obtained by a reduction from the tiling problem of a doubly exponential grid. We then consider the assumption that predicate arities are bounded by a constant. In this case, the mosaic con- struction gives rise to a NExpTime upper bound for satisfiability of formulae without equality. We note that FO2 is already NExpTime-hard (even without equality), which means that in the bounded-arity setting TGF and GFU do not have higher complexity than FO2 . Finally, we show that satisfiability of TGF and GFU formulae with equality is undecidable (interestingly, the complexity of satisfiability in GF and FO2 is insensitive to the presence of equality). The fragment GFU is similar to the fragment GF×2 of [10], which extends GF with cross products (allowing to capture statements like “all elephants are bigger than all mice” as in [19]). The difference is that GF×2 , inspired by the database view, imposes a separation into a set of ground facts (the data) and a constant- free theory (the schema) [9]. Under this restriction on expressiveness (which is only implicit in [10]), GF×2 is in fact subsumed by the fragment GF |FO2 from [14]. Using a resolution-based procedure, satisfiability in GF |FO2 was shown to be in 2ExpTime, and in NExpTime in case of bounded predicate arities [14]. Instead of resolution, the proof of the 2ExpTime upper bound for GF×2 in [10] uses a reduction to satisfiability in plain GF. As we shall see, the unrestricted availability of constants is key in the N2ExpTime-hardness of full GFU and TGF, and thus is the main distinguishing feature of the fragments introduced in this paper. We note that the undecidability of GFU and TGF in the presence of equality can be inferred from [14] (Section 4.2.3), where a reduction from satisfiability in the Goldfarb class is presented, and it can be applied to our fragments. Instead, in this paper we provide a more direct undecidability proof by a reduction from the tiling problem for an infinite grid. 2 Preliminaries We assume the reader is familiar with the syntax and semantics of FO, and thus here we only present some notation. We use NP , NC and NV to denote the countably infinite, mutually disjoint sets of predicate symbols, constants and variables, respectively. We will mostly use (possibly subscripted) P , R, B and H as predicate symbols. Given a formula ϕ, we use NC (ϕ) and NP (ϕ) to denote the set of constants and the set of predicate symbols that appear in ϕ, respectively. Elements of NC ∪ NV are called terms. An atom (or, atomic formula) is an expression of the form R(t), where t is an n-tuple of terms, where n is the arity of the predicate symbol R ∈ NP . For convenience, given a tuple t = ht1 , . . . , tn i of terms, we sometimes view t as the set {t1 , . . . , tn }. Given a tuple x of variables, an x-assignment is any function f : NC ∪ NV → NC ∪ NV such that (i) f (y) ∈ NC for all y ∈ x, and (ii) f (t) = t for all t 6∈ x. Given a tuple t = ht1 , . . . , tn i of terms and an x-assignment f , we let f (t) = hf (t1 ), . . . , f (tn )i. The semantics to formulae is given using interpretations. An interpretation is a pair I = (∆I , ·I ), where ∆I is a non-empty set (called domain), and ·I is a function that maps (i) every constant c ∈ NC to an element cI ∈ ∆I , and (ii) every predicate symbol R ∈ NP to an n-ary relation over ∆I , where n is the arity of R. We assume that 0-ary predicate symbols > and ⊥ belong to NP , and they have the usual (built-in) meaning. The equality predicate ≈ also belongs to NP , and has the fixed meaning ≈I = {(e, e) | e ∈ ∆I } for all interpretations I. We write I |= ϕ, if an interpretation I is a model of a closed formula (or, a sentence) ϕ. We use free(ϕ) to denote the set of free variables in a formula ϕ. 3 The Triguarded Fragment We are now ready to introduce the triguarded fragment of FO. Essentially, it is a relaxed variant of GF where guards are only required when quantifying over formulae with three or more free variables. Definition 1. The triguarded fragment TGF of first-order logic with equality is defined as the smallest set of formulae closed under the following rules: (1) Every atomic formula belongs to TGF. (2) TGF is closed under the propositional connectives ¬, ∧, ∨ and →. (3) If x is a variable, and ϕ is a formula in TGF with |free(ϕ)| ≤ 2, then ∃x.ϕ and ∀x.ϕ also belong to TGF. (4) If x is a non-empty tuple of variables, ϕ is a formula in TGF, α is an atom, and free(ϕ) ⊆ free(α), then ∃x.(α ∧ ϕ) and ∀x.(α → ϕ) also belong to TGF. Observe that if we consider only the items (1), (2) and (3) in Definition 1 as legal rules to build formulae, we can build all formulae of FO that use at most 2 variables, and thus FO2 ⊆ TGF. If we consider the items (1), (2) and (4) in Definition 1, we can build all guarded formulae, and thus GF ⊆ TGF. The syntax of TGF also allows us to build formulae that are neither in GF nor in FO2 , witnessed by formulae like ∀x∀y.((R1 (x, a) ∧ R2 (y, b)) → ∃z.R3 (x, y, z)). Our main goal in this paper is to understand the computational complexity of satisfiability in TGF. To this end, we concentrate on a slightly different logic, which is effectively equivalent to TGF, but which makes presentation significantly easier. In particular, there is a simple extension of GF that allows us to capture TGF. Intuitively, TGF 6⊆ GF because TGF allows “unguarded” quantification in front of formulae ϕ, but only in case ϕ has no more than 2 free variables. If we have the availability of a binary predicate whose extension always contains all pairs of domain elements, we can use it to guard ϕ. In particular, we consider next the binary universal role predicate U ∈ NP , whose extension is fixed to be UI = ∆I × ∆I for all interpretations I. Note that in FO, FO2 and TGF, the built-in predicate U does not add expressiveness, because it can be axiomatized using an ordinary binary predicate U and the sentence φ = ∀x∀y.U (x, y); thus we can safely allow U to be used as a predicate symbol in formulae of FO, FO2 and TGF. Since φ is not in GF, the addition of the built-in U to GF makes a big difference (as we shall see from complexity results). We now formally define GFU, which extends GF with U, and in fact adds to GF the necessary expressivity to capture TGF. Definition 2. Let GFU be the set of formulae of TGF that can be built using the items (1), (2) and (4) of Definition 1 only, possibly using the predicate U in atomic formulae. By using the U predicate as a guard for formulae with at most 2 free variables, we can convert any TGF formula into an equivalent formula in GFU. For instance, the above example formula can be transformed into the equivalent GFU formula ∀x∀y.(U(x, y) → ((R1 (x, a) ∧ R2 (y, b)) → ∃z.R3 (x, y, z))). Proposition 1. For any ϕ ∈ TGF, we can build in polynomial time an equiva- lent formula ϕ0 ∈ GFU. Moreover, NP (ϕ0 ) ⊆ NP (ϕ) ∪ {U}. Due to Proposition 1, in order to check satisfiability in TGF, it suffices to focus on the satisfiability problem for GFU, and thus in the rest of the paper we focus on GFU. 4 Characterizing Satisfiability via Mosaics In this section, we study GFU in the equality-free setting, and provide a finite representation of models of satisfiable GFU formulae, which will be the basis of the satisfiability checking algorithm. In particular, we show that an equality-free GFU formula ϕ has a model iff there exists a mosaic for ϕ, which is a relatively small set of building blocks that can be used to build a model for ϕ. In this way, checking satisfiability of ϕ reduces to checking the existence of a mosaic for ϕ. To simplify the structure of GFU formulae we use a suitable (Scott-like) normal form, which is not much different from the ones used, e.g., in [13, 12]. Definition 3V(NormalVForm). A sentence ϕ ∈ GFU is in normal form if it has the form ψ∈A ψ ∧ ψ∈E ψ, where A contain sentences of the form ∀x.(R(t) → (¬H1 (v 1 ) ∨ . . . ∨ ¬Hn (v n ) ∨ Hn+1 (v n+1 ) ∨ . . . ∨ Hm (v m ))), (1) and E contain sentences of the form ∀x.(R(u) → ∃y.H(v)). (2) We use A(ϕ) and E(ϕ) to denote the sets A and E of a formula ϕ as above. For a sentence ψ = ∀x.(R(u) → ∃y.H(v)), we let width(ψ) denote the number of variables that appear in v. For a formula ϕ as above, width(ϕ) is the maximal width(ψ) over all ψ ∈ E(ϕ). As usual, in case m = 0, the empty disjunction in (1) stands for ⊥. Note that since (1) and (2) are in GFU, each variable that appears in v 1 , . . . , v m also appears in t, and each variable that appears in v also appears in u. Observe that the sentence in (1) can be equivalently written as ∀x.(R(t) ∧ H(v 1 ) ∧ . . . ∧ Hn (v n ) → Hn+1 (v n+1 ) ∨ . . . ∨ Hm (v m )). (3) For presentation reasons, in what follows we will mostly use the form (3) instead of (1) when speaking about sentences in A. Note that (3) closely resembles a (guarded) disjunctive Datalog rule with R(t) a guard atom. The following statement shows that we can focus on formulae in normal form. Proposition 2. For any formula ϕ ∈ GFU, we can construct in polynomial time a formula ϕ0 ∈ GFU in normal form such that (a) ϕ is satisfiable iff ϕ0 is satisfiable, and (b) the translation does not increase the arity of predicate symbols, i.e., there is no predicate symbol in ϕ0 whose arity is strictly greater than the arity of every predicate symbol in ϕ. To define mosaics, we need the notion of a type for a formula ϕ. Types will form mosaics, and they can be seen as patterns (interpretations of restricted size) for building models of ϕ. Definition 4 (Types). A type τ for a formula ϕ is any set of ground atoms with predicate symbols from NP (ϕ). We let dom(τ ) denote the set of constants that appear in a type τ , and let I(τ ) denote the interpretation such that (i) ∆I(τ ) = dom(τ ), and (ii) P I(τ ) = {t | P (t) ∈ τ } for all predicate symbols P . For a sentence ϕ, we write τ |= ϕ if I(τ ) |= ϕ. Given a set of constants F , we let τ |F = {P (t) ∈ τ | t ⊆ F }, i.e., τ |F is the restriction of τ to atoms whose all arguments are included in F . Of particular interest in our treatment is how a distinguished element of some type “looks like” in terms of the predicates it satisfies and its relationship to constants. This information is captured using unary types, in which we abstract from the concrete target constant by replacing it with a special variable. Definition 5 (unary types). Assume a formula ϕ ∈ GFU and let xϕ be a special variable associated with ϕ. We let base(ϕ) denote the set of all atoms P (t) such that t ⊆ NC (ϕ) ∪ {xϕ } and P ∈ NP (ϕ). Any subset σ ⊆ base(ϕ) is called a unary type for ϕ. Assume a constant c, and let f be the function such that (i) f (xϕ ) = c, and (ii) f (d) = d for all d ∈ NC . For a type τ , we define the unary type τ |ϕc = {R(t) ∈ base(ϕ) | R(f (t)) ∈ τ }. We are now ready to define mosaics, which will act as witnesses to satisfi- ability of GFU formulae (without equality). Roughly, a mosaic for a formula ϕ is a pair (M, X ), where X is a collection of “placeholder” constants, and M is a set of types for ϕ. In order to be a proper witness to satisfiability, a mo- saic must satisfy a collection of conditions. In particular, they ensure that in case ϕ is satisfiable, we will be able to construct a model by arranging together (possibly multiple) instances of types from M. Intuitively, by an instance of a type τ ∈ M we mean a concrete structure that is obtained by replacing the placeholder constants from X with concrete domain elements. Definition 6 (Mosaic). A mosaic for a sentence ϕ ∈ GFU in normal form is a pair (M, X ), where M is a set of types for ϕ and X ⊆ NC \ NC (ϕ), satisfying the following: (A) |X | ≤ width(ϕ); (B) For all τ ∈ M, dom(τ ) ⊆ NC (ϕ) ∪ X ; (C) For all τ, τ 0 ∈ M, τ |NC (ϕ) = τ 0 |NC (ϕ) ; (D) U(t, v) ∈ τ for all τ ∈ M and each pair t, v ∈ dom(τ ); (E) τ |= ψ for all τ ∈ M and all ψ ∈ A(ϕ); (F) If τ ∈ M, ∀x.(R(t) → ∃y.H(v)) ∈ E(ϕ), and R(g(t)) ∈ τ for some x- assignment g, then there is some τ 0 ∈ M such that: (a) H(h(g(v))) ∈ τ 0 for some y-assignment h; (b) τ |F = τ 0 |F , where F = NC (ϕ) ∪ {g(x) | x ∈ x ∩ v}. (G) If t1 ∈ dom(τ1 ) ∩ X and t2 ∈ dom(τ2 ) ∩ X for some τ1 , τ2 ∈ M, then there exists a type τ ∈ M and a pair v1 , v2 with dom(τ ) ∩ X = {v1 , v2 } such that (i) v1 6= v2 , (ii) τ1 |ϕ ϕ ϕ ϕ t1 = τ |v1 , (iii) τ2 |t2 = τ |v2 . Intuitively, the conditions (A-G) ensure the following. (A) requires that only a small number of placeholder constants is used. Due to (B), types in mosaics only refer to original constants of the formula and the small number of place holder constants. The conditions (A) and (B) are important to ensure the rela- tively small size of mosaics. The condition (C) forces the types to agree on the participation of constants in predicates. (D) requires U to be correctly inter- preted locally (i.e., within the individual types), and (E) requires each type to (locally) satisfy all sentences from A(ϕ). The condition (F) ensures that for each type locally satisfying the body of some sentence from E(ϕ), we find a matching type where also the head of that sentence is satisfied. Using (G) we make sure that any two representatives of unnamed domain elements (in terms of unary types) found across the types also occur together in one type. The following soundness and completeness theorems show that mosaics prop- erly characterize satisfiability of equality-free GFU formulae (and, due to Propo- sition 1, of equality-free TGF formulae). Theorem 1 (Completeness). Let ϕ ∈ GFU be a formula in normal form. If ϕ is satisfiable, then there exists a mosaic (M, X ) for ϕ. Proof (Sketch). Assume that ϕ has some model J . Since ϕ is equality-free, we can make the standard name assumption (SNA): NC (ϕ) ⊆ ∆J and cI = c for all c ∈ NC (ϕ). Now, let I be obtained from J by duplicating all anonymous individuals. Formally, let ∆anon = ∆J \NC (ϕ) and ∆I = NC (ϕ)∪{1, 2}×∆anon . Let π : ∆I → ∆J such that π(c) = c for c ∈ NC (ϕ) and π((i, e)) = e otherwise. Now we let t ∈ P I if π(t) ∈ P J . As ϕ does not contain equality, J |= ϕ implies I |= ϕ. This duplication of anonymous individuals makes sure that for every non-constant domain element e, I contains a twin element ẽ different from e but with the same unary type. This property turns out to be crucial to show part (G) of the mosaic definition. We show how to extract from I a mosaic (M, X ) for ϕ. We can assume, w.l.o.g., that ∆I ⊆ NC and that cI = c for all c ∈ NC (ϕ). Let X be any set with X ⊆ NC , X ∩ ∆I = ∅, and |X | = width(ϕ). We say a type τ can be extracted from I if τ can be obtained from I in 4 steps: (a) Take any S ⊆ ∆I such that NC (ϕ) ⊆ S and |S| − |NC (ϕ)| ≤ width(ϕ). (b) Let τ ∗ = {P (t) | t ⊆ S ∧ t ∈ P I }. (c) Let f be any injective function from dom(τ ∗ ) \ NC (ϕ) to X . (d) Let τ be the type obtained from τ ∗ by replacing every occurrence of c ∈ dom(τ ∗ ) \ NC (ϕ) by f (c). The set M contains all types τ that can be extracted from I. It is not difficult to see that the constructed (M, X ) is a mosaic for ϕ. t u Theorem 2 (Soundness). Let ϕ ∈ GFU be a formula in normal form. If there exists a mosaic (M, X ) for ϕ, then ϕ is satisfiable. Proof (Sketch). Assume a mosaic (M, X ) for ϕ. An instantiation for a type τ ∈ M is any injective function δ from dom(τ ) ∩ X to NC \ X . Given such τ and δ, we use δ(τ ) to denote the type that is obtained from τ by replacing every occurrence of a constant c ∈ dom(τ ) ∩ X by δ(c). Our goal is to show how to inductively construct a possibly infinite sequence S = (τ0 , δ0 ), (τ1S, δ1 ), . . . of pairs (τj , δj ), where τj ∈ M and δj is an instantiation for τj , such that i≥0 δi (τi ) |= ϕ. In the base case, we let τ0 be an arbitrary type from M, and let δ0 be any instantiation for τ0 . For the inductive case, suppose (τ0 , δ0 ), . . . , (τi−1 , δi−1 ) have been defined, where i > 0. We show how define the next segment (τi , δi ), . . . , (τm , δm ) of S, where m ≥ i (we indeed may attach to S multiple new elements in one step). To this end, choose the smallest index 0 ≤ j ≤ i − 1 satisfying the following condition: there is ∀x.(R(t) → ∃y.H(v)) ∈ E(ϕ), and R(g(t)) ∈ δj (τj ) for some x-assignment g. If such j does not exist, the construction S of S is complete, and we can proceed to (?) below, where we argue that 0≤k