=Paper=
{{Paper
|id=None
|storemode=property
|title=On
Checking Domain Independence
|pdfUrl=https://ceur-ws.org/Vol-857/paper_s03.pdf
|volume=Vol-857
|dblpUrl=https://dblp.org/rec/conf/cilc/KerhetF12
}}
==On
Checking Domain Independence==
On Checking Domain Independence Volha Kerhet and Enrico Franconi Faculty of Computer Science, Free University of Bozen-Bolzano, Italy {kerhet,franconi}@inf.unibz.it Abstract Domain independence is an important property of a formula that guarantees that the truth value of the formula in an interpreta- tion remains the same regardless of the underlying domain of the inter- pretation. Unfortunately, checking domain independence of a first-order formula is in general undecidable. There have been several attempts to define syntactic fragments of first-order logic characterising domain in- dependent formulas that easily can be checked, but all of them are in- complete. We reduce the problem of checking domain independence of a general first-order logic formula to checking a standard first-order logic entail- ment. This method can be applied in any decidable fragment where the formulas participating in the entailment can be expressed. 1 Introduction Domain independence [1] is an important property of a formula that guaran- tees that the truth value of the formula in an interpretation remains the same regardless of the underlying domain of the interpretation. The importance of checking domain independence stems from the use of first-order logic as a query language for databases. Since a query can be an arbitrary first-order formula, its answer can be infinite (since the domain is not restricted to be finite) or it may depend on the domain. For example, the query Q(x) = ¬Student(x) over the database Student(a), Student(b), with domain {a, b, c} has the answer {x 7→ c}, while with domain {a, b, c, d} has the answer {x 7→ c, x 7→ d}, and if we change the domain to an infinite one, the answer will be infinite even in presence of such a finite database. Therefore, the notion of domain independent queries has been introduced in relational databases: indeed, the above open formula ¬Student(x) turns out not to be domain independent. An example of a domain independent formula would be ∃x. P (x), since if it turns out to be true in some interpretation of the unary predicate symbol P with a specific domain, it is also true in all the structures sharing the same interpretation for the predicate P and any other compatible domain. On the other hand, the formula ∀x. P (x) is not a domain independent formula, since if it is true in an interpretation with some domain, it is definitely not true in any structure sharing the same interpretation for the predicate P but with a larger domain. The problem of checking whether a FOL formula is domain independent is undecidable [2]. The well known safe-range syntactic fragment of FOL intro- duced by Codd is an equally expressive language; indeed any safe-range for- mula is domain independent, and any domain independent formula can be easily transformed into a logically equivalent safe-range formula. Intuitively, a formula is safe-range if and only if its variables are bounded by positive predicates or equalities – for the exact syntactical definition see, e.g., [2]. For example, the formula ∀x. G(x) → P (x) is safe-range, while formula ∃x. (A(x) ∨ B(a)) is not. To check whether a formula is safe-range, the formula is transformed into a lo- gically equivalent safe-range normal form and its range restriction is computed according to a set of syntax based rules; the range restriction of a formula is a subset of its free variables, and if it coincides with the free variables then the formula is said to be safe-range. The problem arises since the safe-range property is an incomplete charac- terisation of domain independence: indeed, the above non-safe-range formula ∃x. (A(x) ∨ B(a)) is logically equivalent to the formula (∃x. A(x)) ∨ B(a) which is safe-range, and therefore we can conclude that also the former formula is do- main independent. In this paper, we show for the first time a sound and complete procedure to check domain independence of a formula, by reducing this problem to an entailment problem in first-order logic. All the proofs of the theorems presented in this paper can be found in the technical report [3]. 2 Preliminaries Let FOL(C, P) be a classical function-free first-order language with equality over a signature Σ = (C, P), where C is a set of constants and P is a set of predicates with associated arities.The arity of a predicate P we denote as ar(P ). We denote as σ(φ) the signature of the formula φ, that is all the predicates and constants occurring in φ. We denote the set of all variables appearing in φ as var(φ), and the set of the free variables appearing in φ as free(φ); we may use for φ the notation φ(x̄), where x̄ = free(φ) is the (possibly empty) set of free variables of the formula. A formula in FOL(C, P) is in prenex normal form, if it is written as a string of quantifiers followed by a quantifier-free part. Every formula is equivalent to a formula in prenex normal form and can be converted into it in polynomial time [4]. Let x̄ and S be two countable sets, first of which is some set of variables. We define a substitution Θx̄S to be a total function x̄ 7→ S assigning an element of the set S to each variable in x̄, including the empty substitution when x̄ = ∅. We can see substitution as (at most) a countable set of assignments of elements from S to elements from x̄. That is, if x̄ = {x1 , x2 , . . .}, then Θx̄S := {x1 → s1 , x2 → s2 , . . .}, where s1 , s2 , . . . are elements from S assigned to corresponding variables from x̄ by Θx̄S . As usual, an interpretation M = h∆M , ·M i includes a non-empty set – the domain ∆M – and an interpretation function ·M defined over constants and pre- dicates of the signature. We use standard definitions of validity, satisfiability and entailment of a formula. An extension of φ(x̄) in interpretation M = h∆M , ·M i, denoted (φ(x̄))M , is the set of substitutions which satisfy φ in M. That is, (φ(x̄))M = {Θx̄∆ | M, Θx̄∆ |= φ(x̄)}. If φ is closed, then the extension depends on whether φ holds in M or not. Thus, for a closed formula φ, (φ)M = , if M |= φ, and (φ)M = ∅, if M 6|= φ. Hereafter let σ 0 be a subset of σ = P ∪ C. Two interpretations M1 = h∆1 , ·M1 i and M2 = h∆2 , ·M2 i are σ 0 -compatible iff they agree on the interpretations of all the predicates and constants in σ 0 . That is, for any predicate P and constant c in σ 0 the following holds: P M2 = P M1 cM2 = cM1 Definition 1. (Domain Independence) Let M1 = h∆1 , ·M1 i and M2 = h∆2 , ·M2 i be any two σ-compatible interpretations, and φ be a formula in FOL(C, P). φ is domain independent, if and only if (φ)M1 = (φ)M2 The semantic active domain of sub-signature σ 0 in an interpretation M = h∆ , ·M i, denoted adom(σ 0 , M), is the set of all elements of the domain ∆M M occurring in interpretations of predicates and constants from σ 0 in M: [ [ [ adom(σ 0 , M) := {a1 , . . . , an } ∪ {cM }. P ∈σ 0 ∩P (a1 ,...,an )∈P M c∈σ 0 ∩C 3 Checking Domain Independence In this section we show how to prove domain independence of a formula φ by proving a validity of a first-order logic entailment. We introduce a new unary predicate Adom σ0 . According to the idea, we consider a theory encoding that interpretation of Adom σ0 in any model of the theory is never empty and equal to the semantic active domain of σ 0 in this model, if it is not empty. We call such a theory σ 0 -active domain theory and denote it as Aσ0 . Formally, Aσ0 is the smallest theory satisfying the following four requirements. 1. For every n-ary predicate P ∈ σ 0 ∩ P, ^ ∀x̄ . P (x̄) → Adom σ0 (x) ∈ Aσ0 . x∈x̄ In other words, Adom σ0 contains any domain element that occurs in the interpretation of some predicate from σ 0 . 2. Adom σ0 contains all the constants from σ 0 : ^ Adom σ0 (c) ∈ Aσ0 . c∈σ 0 ∩C 3. Let n be a maximal arity among the predicates in σ 0 , and let z̄ = {z1 , . . . , zn−1 }. Then _ _ ∃x̄. P (x̄) ∨ ∃x. (x = c) → (∀x.Adom σ0 (x) → P ∈σ 0 ∩P_ c∈σ 0 ∩C _ ∃z̄. (P (x, z1 , . . . , zar(P )−1 ) ∨ . . . ∨ P (z1 , . . . , zar(P )−1 , x)) ∨ (x = c)) ∈ Aσ0 . P ∈σ 0 ∩P c∈σ 0 ∩C It means, that every element from Adom σ0 necessarily appears in at least one predicate from σ 0 or equal to some constant from σ 0 , if not all the predicates from σ 0 are empty or σ 0 contains constants. 4. Adom σ0 is not empty: ∃x. Adom σ0 (x) ∈ Aσ0 . The next proposition follows from the definition of Aσ0 . Proposition 1. An interpretation M = h∆M , ·M i is a model of Aσ0 iff – Adom M 0 0 σ 0 = adom(σ , M), if adom(σ , M) 6= ∅; – Adom M 0 σ 0 6= ∅, if adom(σ , M) = ∅. Let φ(x̄) be a formula in FOL(C, P) with free variables x̄, x ∈ x̄, and let P be some predicate. For the sake of readability we write ∃x ∈ P. φ(x̄) to denote the formula ∃x. P (x) ∧ φ(x̄); and we write ∀x ∈ P. φ(x̄) to denote the formula ∀x. P (x) → φ(x̄). Let now φ(x̄) be a query in FOL(C, P) in prenex normal form, that is φ(x̄) = Q1 y1 . . . Qk yk . ψ(ȳ, x̄), where Qi stands for either ∀ or ∃, x̄ = {x1 , . . . , xn } is the set of all free variables of φ(x̄), ȳ = {y1 , . . . , yk } is the set of all bounded (quantified) variables of φ(x̄), and ψ(ȳ, x̄) is a quantifier free formula. Let P be an unary predicate. We will use the following denotation: n ^ φ(x̄)|P := (Q1 y1 ∈ P . . . Qk yk ∈ P. ψ(ȳ, x̄)) ∧ P (xi ). i=1 We extend this denotation also for the formulas, that are not in prenex normal form, meaning that the aforementioned operation (·)|(·) is applied to the prenex normal form of the formula. Theorem 1 (Checking Domain Independence). Let φ(x̄) be a formula in FOL(C, P). Then φ(x̄) is domain independent iff Aσ(φ) |= ∀x̄ . φ(x̄) ↔ φ(x̄)|Adom σ(φ) . (1) Example 1. Let us consider the formula φ = ∃x. (A(x)∨B(a)) from [5] mentioned in Section 1 and apply the theorem for it. This formula is not safe-range but it is domain independent, because it is logically equivalent to (∃x. A(x)) ∨ B(a), which is safe-range and, hence, domain independent. Such kind of formulas are of particular interest for us. Since adom(σ(φ), M) = AM ∪ B M ∪ {aM }, because of the proposition ?? we have: Aσ(φ) |= ∀x. Adom σ(φ) (x) ↔ A(x) ∨ B(x) ∨ (x = a). Then Aσ(φ) |= φ|Adom σ(φ) ↔ ∃x. (A(x) ∨ B(x) ∨ (x = a)) ∧ (A(x) ∨ B(a)). One can see, that |= ∃x. (A(x) ∨ B(a)) ↔ ∃x. (A(x) ∨ B(x) ∨ (x = a)) ∧ (A(x) ∨ B(a)). That is, Aσ(φ) |= φ ↔ φ|Adom σ(φ) . Therefore, by applying the theorem we proved, that formula φ is domain independent. 4 Conclusion We proposed a method that allows to reduce the problem of checking domain independence of a formula to checking standard first-order logic entailment: a formula φ(x̄) from FOL(C, P) is domain independent whenever the entailment (1) holds. Using this technique in particular makes sense when the formula to be checked for domain independence is not safe-range. Thus, in practice this method is useful and can be applied in any not safe-range decidable fragment of FOL(C, P), where Aσ(φ) and ∀x̄ . φ(x̄) ↔ φ(x̄)|Adom σ(φ) can be expressed. Discovering of such interesting fragments and application of the technique there is our future work. We would like to thank Martín Rezk and Nhung Ngo for ideas and fruitful discussions. References 1. Avron, A.: Constructibility and decidability versus domain independence and abso- luteness. Theor. Comput. Sci. 394 (April 2008) 144–158 2. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995) 3. Franconi, E., Kerhet, V.: On checking domain independence (extended version). Technical Report KRDB12-3, KRDB Research Centre, Free University of Bozen- Bolzano (2012) http://www.inf.unibz.it/krdb/pub/TR/KRDB12-3.pdf. 4. Kleene, S.C.: Mathematical Logic. New York: Dover (2002) 5. Topor, R.W.: Domain-independent formulas and databases. Theor. Comput. Sci. 52 (1987) 281–306