<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Checking Domain Independence</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Volha Kerhet</string-name>
          <email>kerhet@inf.unibz.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Franconi</string-name>
          <email>franconi@inf.unibz.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science, Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Domain independence is an important property of a formula that guarantees that the truth value of the formula in an interpretation remains the same regardless of the underlying domain of the interpretation. Unfortunately, checking domain independence of a first-order formula is in general undecidable. There have been several attempts to define syntactic fragments of first-order logic characterising domain independent formulas that easily can be checked, but all of them are incomplete. We reduce the problem of checking domain independence of a general first-order logic formula to checking a standard first-order logic entailment. This method can be applied in any decidable fragment where the formulas participating in the entailment can be expressed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Domain independence [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is an important property of a formula that
guarantees that the truth value of the formula in an interpretation remains the same
regardless of the underlying domain of the interpretation. The importance of
checking domain independence stems from the use of first-order logic as a query
language for databases. Since a query can be an arbitrary first-order formula, its
answer can be infinite (since the domain is not restricted to be finite) or it may
depend on the domain. For example, the query Q(x) = :Student(x) over the
database Student(a); Student(b), with domain fa; b; cg has the answer fx 7! cg,
while with domain fa; b; c; dg has the answer fx 7! c; x 7! dg, and if we change
the domain to an infinite one, the answer will be infinite even in presence of such
a finite database. Therefore, the notion of domain independent queries has been
introduced in relational databases: indeed, the above open formula :Student(x)
turns out not to be domain independent. An example of a domain independent
formula would be 9x: P (x), since if it turns out to be true in some interpretation
of the unary predicate symbol P with a specific domain, it is also true in all the
structures sharing the same interpretation for the predicate P and any other
compatible domain. On the other hand, the formula 8x: P (x) is not a domain
independent formula, since if it is true in an interpretation with some domain,
it is definitely not true in any structure sharing the same interpretation for the
predicate P but with a larger domain.
      </p>
      <p>
        The problem of checking whether a FOL formula is domain independent is
undecidable [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The well known safe-range syntactic fragment of FOL
introduced by Codd is an equally expressive language; indeed any safe-range
formula is domain independent, and any domain independent formula can be easily
transformed into a logically equivalent safe-range formula. Intuitively, a formula
is safe-range if and only if its variables are bounded by positive predicates or
equalities – for the exact syntactical definition see, e.g., [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For example, the
formula 8x: G(x) ! P (x) is safe-range, while formula 9x: (A(x) _ B(a)) is not.
To check whether a formula is safe-range, the formula is transformed into a
logically equivalent safe-range normal form and its range restriction is computed
according to a set of syntax based rules; the range restriction of a formula is a
subset of its free variables, and if it coincides with the free variables then the
formula is said to be safe-range.
      </p>
      <p>The problem arises since the safe-range property is an incomplete
characterisation of domain independence: indeed, the above non-safe-range formula
9x: (A(x) _ B(a)) is logically equivalent to the formula (9x: A(x)) _ B(a) which
is safe-range, and therefore we can conclude that also the former formula is
domain independent. In this paper, we show for the first time a sound and complete
procedure to check domain independence of a formula, by reducing this problem
to an entailment problem in first-order logic.</p>
      <p>
        All the proofs of the theorems presented in this paper can be found in the
technical report [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        Let F OL(C; P) be a classical function-free first-order language with equality over
a signature = (C; P), where C is a set of constants and P is a set of predicates
with associated arities.The arity of a predicate P we denote as ar(P ). We denote
as ( ) the signature of the formula , that is all the predicates and constants
occurring in . We denote the set of all variables appearing in as var( ), and
the set of the free variables appearing in as free( ); we may use for the
notation (x), where x = free( ) is the (possibly empty) set of free variables of
the formula. A formula in F OL(C; P) is in prenex normal form, if it is written
as a string of quantifiers followed by a quantifier-free part. Every formula is
equivalent to a formula in prenex normal form and can be converted into it in
polynomial time [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Let x and S be two countable sets, first of which is some set of variables. We
define a substitution xS to be a total function x 7! S assigning an element of the
set S to each variable in x, including the empty substitution when x = ;. We can
see substitution as (at most) a countable set of assignments of elements from S to
elements from x. That is, if x = fx1; x2; : : :g, then xS := fx1 ! s1; x2 ! s2; : : :g,
where s1; s2; : : : are elements from S assigned to corresponding variables from x
by xS.</p>
      <p>As usual, an interpretation M = h M; Mi includes a non-empty set – the
domain M – and an interpretation function M defined over constants and
predicates of the signature. We use standard definitions of validity, satisfiability and
entailment of a formula. An extension of (x) in interpretation M = h M; Mi,
denoted ( (x))M, is the set of substitutions which satisfy
( (x))M = f x j M; x j= (x)g:
If is closed, then the extension depends on whether holds in M or not. Thus,
for a closed formula , ( )M = , if M j= , and ( )M = ;, if M 6j= .</p>
      <p>Hereafter let 0 be a subset of = P [ C.</p>
      <p>Two interpretations M1 = h 1; M1 i and M2 = h 2; M2 i are 0-compatible
iff they agree on the interpretations of all the predicates and constants in 0.
That is, for any predicate P and constant c in 0 the following holds:
P M2 = P M1
cM2 = cM1
( )M1 = ( )M2
Definition 1. (Domain Independence) Let M1 = h 1; M1 i and M2 =
h 2; M2 i be any two -compatible interpretations, and be a formula in F OL(C; P).
is domain independent, if and only if</p>
      <p>The semantic active domain of sub-signature 0 in an interpretation M =
h M; Mi, denoted adom( 0; M), is the set of all elements of the domain M
occurring in interpretations of predicates and constants from 0 in M:
adom( 0; M) :=
[</p>
      <p>[
P 2 0\P (a1;:::;an)2P M
fa1; : : : ; ang [</p>
      <p>[
In this section we show how to prove domain independence of a formula by
proving a validity of a first-order logic entailment.</p>
      <p>We introduce a new unary predicate Adom 0 . According to the idea, we
consider a theory encoding that interpretation of Adom 0 in any model of the
theory is never empty and equal to the semantic active domain of 0 in this
model, if it is not empty. We call such a theory 0-active domain theory and
denote it as A 0 . Formally, A 0 is the smallest theory satisfying the following
four requirements.
1. For every n-ary predicate P 2</p>
      <p>0 \ P,
8x : P (x) !
^ Adom 0 (x) 2 A 0 :
x2x
In other words, Adom 0 contains any domain element that occurs in the
interpretation of some predicate from 0.
2. Adom 0 contains all the constants from 0:</p>
      <p>^
3. Let n be a maximal arity among the predicates in 0, and let z = fz1; : : : ; zn 1g.</p>
      <p>Then
_</p>
      <p>_ 9x: (x = c) ! (8x:Adom 0 (x) !
P 2 0\P c2 0\C
9z: _ (P (x; z1; : : : ; zar(P ) 1) _ : : : _ P (z1; : : : ; zar(P ) 1; x)) _
_ (x = c)) 2 A 0 .</p>
      <p>P 2 0\P c2 0\C
It means, that every element from Adom 0 necessarily appears in at least one
predicate from 0 or equal to some constant from 0, if not all the predicates
from 0 are empty or 0 contains constants.
4. Adom 0 is not empty:</p>
      <p>9x: Adom 0 (x) 2 A 0 :</p>
      <p>The next proposition follows from the definition of A 0 .</p>
      <p>Proposition 1. An interpretation M = h
M; Mi is a model of A 0 iff
– AdomM0 = adom( 0; M), if adom( 0; M) 6= ;;
– AdomM0 6= ;, if adom( 0; M) = ;.</p>
      <p>Let (x) be a formula in F OL(C; P) with free variables x, x 2 x, and let P
be some predicate. For the sake of readability we write 9x 2 P: (x) to denote
the formula 9x: P (x) ^ (x); and we write 8x 2 P: (x) to denote the formula
8x: P (x) ! (x).</p>
      <p>Let now (x) be a query in F OL(C; P) in prenex normal form, that is
(x) = Q1y1 : : : Qkyk: (y; x);
where Qi stands for either 8 or 9, x = fx1; : : : ; xng is the set of all free variables
of (x), y = fy1; : : : ; ykg is the set of all bounded (quantified) variables of (x),
and (y; x) is a quantifier free formula. Let P be an unary predicate. We will
use the following denotation:
n
(x)jP := (Q1y1 2 P : : : Qkyk 2 P: (y; x)) ^ ^ P (xi):
i=1
We extend this denotation also for the formulas, that are not in prenex normal
form, meaning that the aforementioned operation ( )j( ) is applied to the prenex
normal form of the formula.</p>
      <p>Theorem 1 (Checking Domain Independence).</p>
      <p>
        Let (x) be a formula in F OL(C; P). Then (x) is domain independent iff
A ( ) j= 8x : (x) $
(x)jAdom ( ) :
(1)
Example 1. Let us consider the formula = 9x: (A(x)_B(a)) from [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] mentioned
in Section 1 and apply the theorem for it. This formula is not safe-range but it
is domain independent, because it is logically equivalent to (9x: A(x)) _ B(a),
which is safe-range and, hence, domain independent. Such kind of formulas are
of particular interest for us.
      </p>
      <p>Since adom( ( ); M) = AM [ BM [ faMg, because of the proposition ??
we have:</p>
      <p>A ( ) j= 8x: Adom ( )(x) $ A(x) _ B(x) _ (x = a):
Then A ( ) j= jAdom ( ) $ 9x: (A(x) _ B(x) _ (x = a)) ^ (A(x) _ B(a)). One
can see, that</p>
      <p>j= 9x: (A(x) _ B(a)) $ 9x: (A(x) _ B(x) _ (x = a)) ^ (A(x) _ B(a)):
That is, A ( ) j= $ jAdom ( ) . Therefore, by applying the theorem we proved,
that formula is domain independent.
4</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>We proposed a method that allows to reduce the problem of checking domain
independence of a formula to checking standard first-order logic entailment: a
formula (x) from F OL(C; P) is domain independent whenever the entailment
(1) holds. Using this technique in particular makes sense when the formula to
be checked for domain independence is not safe-range. Thus, in practice this
method is useful and can be applied in any not safe-range decidable fragment
of F OL(C; P), where A ( ) and 8x : (x) $ (x)jAdom ( ) can be expressed.
Discovering of such interesting fragments and application of the technique there
is our future work.</p>
      <p>We would like to thank Martín Rezk and Nhung Ngo for ideas and fruitful
discussions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Avron</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Constructibility and decidability versus domain independence and absoluteness</article-title>
          .
          <source>Theor. Comput. Sci</source>
          .
          <volume>394</volume>
          (
          <year>April 2008</year>
          )
          <fpage>144</fpage>
          -
          <lpage>158</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Abiteboul</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vianu</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          : Foundations of Databases. Addison-Wesley (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kerhet</surname>
          </string-name>
          , V.:
          <article-title>On checking domain independence (extended version)</article-title>
          .
          <source>Technical Report KRDB12-3</source>
          , KRDB Research Centre, Free University of BozenBolzano (
          <year>2012</year>
          ) http://www.inf.unibz.it/krdb/pub/TR/KRDB12-3.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kleene</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          : Mathematical Logic. New York: Dover (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Topor</surname>
          </string-name>
          , R.W.:
          <article-title>Domain-independent formulas and databases</article-title>
          .
          <source>Theor. Comput. Sci</source>
          .
          <volume>52</volume>
          (
          <year>1987</year>
          )
          <fpage>281</fpage>
          -
          <lpage>306</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>