=Paper= {{Paper |id=Vol-2100/paper18 |storemode=property |title=Universal Nulls (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-2100/paper18.pdf |volume=Vol-2100 |authors=Gosta Grahne,Ali Moallemi |dblpUrl=https://dblp.org/rec/conf/amw/GrahneM18 }} ==Universal Nulls (Extended Abstract)== https://ceur-ws.org/Vol-2100/paper18.pdf
                Universal Nulls (Extended Abstract)

                         Gösta Grahne and Ali Moallemi

                     Concordia University, Montreal, Canada
              grahne@cs.concordia.ca,moa_ali@encs.concordia.ca




1    Introduction

In this paper we revisit the foundations of the relational model and unearth
universal nulls, showing that they can be treated on par with the usual existential
nulls [10,4,5,2] Recall that an existential null in a tuple in a relation R represents
an existentially quantified variable in an atomic sentence R(..). This corresponds
to the intuition “value exists, but is unknown.” A universal null, on the other hand,
does not represent anything unknown, but stands for all values of the domain. In
other words, a universal null represents a universally quantified variable. Universal
nulls have an obvious application in databases, as the following example shows.
The symbol “∗” denotes a universal null.

Example 1. Consider binary relations F (ollows) and H(obbies), where F (x, y)
means that user x follows user y on a social media site, and H(x, z) means that
z is a hobby of user x. Let the database be the following.


                          F                 H
                          Alice Chris       Alice Movies
                          ∗     Alice       Alice Music
                          Bob ∗             Bob Basketball
                          Chris Bob

This is to be interpreted as expressing the facts that Alice follows Chris and
Chris follows Bob. Alice is a journalist who would like to give access to everyone
to articles she shares on the social media site. Therefore, everyone can follow
Alice. Bob is the site administrator, and is granted the access to all files anyone
shares on the site. Consequently, Bob follows everyone. “Everyone” in this context
means all current and possible future users. The query below, in domain relational
calculus, asks for the interests of people who are followed by everyone:

               x4 . ∃x2 ∃x3 ∀x1 (F (x1 , x2 ) ∧ H(x3 , x4 ) ∧ (x2 ≈ x3 )).        (1)

The answer to our example query is {(Movies), (Music)}. Note that star-nulls
also can be part of an answer. For instance, the query x1 , x2 . F (x1 , x2 ) would
return all the tuples in F . ◂
    Another area of applications of “*”-nulls relates to intuitionistic, or con-
structive database logic. In the constructive four-valued approach of [7] and the
three-valued approach of [5,12] the proposition A ∨ ¬A is not a tautology. In
order for A ∨ ¬A to be true, we need a constructive proof of A or a constructive
proof of ¬A. Therefore both [7] and [12] assume that the database I has a theory
of the negative information, i.e. that I = (I + , I − ), where I + contains the positive
information and I − the negative information. The papers [7] and [12] then show
how to transform an FO-query ϕ(x̄) to a pair of queries (ϕ+ (x̄), ϕ− (x̄)) such
that ϕ+ (x̄) returns the tuples ā for which ϕ(ā) is true in I, and ϕ− (x̄) returns
the tuples ā for which ϕ(ā) is true in I (i.e. ϕ(ā) is proven to be false in I). It
turns out that databases containing “*”-nulls are suitable for storing I − .

Example 2. Suppose that the instance in Example 1 represents I + , and that all
negative information we have deduced about the H(obbies) relation, is that we
know Alice doesn’t play Volleyball that Bob only has Basketball as hobby, and
that Chris has no hobby at all. This negative information about the relation H
is represented by the table H − below. Note that H − is part of I − .
                            H−
                            Alice Volleyball
                            Bob ∗ (except Basketball)
                            Chris ∗
Suppose the query ϕ asks for people who have a hobby, that is ϕ = x1 . ∃x2 H(x1 , x2 ).
Then ϕ+ = x1 . ∃x2 H + (x1 , x2 ), and ϕ− = x1 . ∀x2 H − (x1 , x2 ). Evaluating ϕ+ on
I + returns {(Alice), (Bob)}, and evaluating ϕ− on I − returns {(Chris)}. Note
that there is no closed-world assumption as the negative facts are explicit. Thus
it is unknown whether someone called David has a hobby or not. ◂

    This example highlights our main motivation for adding universal nulls to
databases. When intuitionistic logic is used in answering queries with negation
against four-valued database instances, universal quantifiers are necessary. In
fact, even the simplest queries, such as Conjunctive Queries will be translated
to a universally quantified disjunction of the atoms in the query. The cost of
evaluating FO-queries ϕ intuitionistically on four-valued databases remains the
same as evaluating ϕ with standard semantics on two-valued databases. Note
however that ϕ+ and ϕ− are always negation free.
    The Cylindric Set Algebra [8,9] —as an algebraization of first order logic— is
an algebra on sets of valuations of variables in an FO-formula. A valuation ν of
variables {x1 , x2 , . . .} can be represented as a tuple ν, where ν(i) = ν(xi ). Any
subset of all valuations can then be represented by a relation C of such tuples.
In particular, if the FO-formula only involves a finite number n of variables,
then the representing relation C has arity n. Note however that C can have
an infinite number of tuples, since the domain, named D from now on, of the
variables (such as the users of a social media site) should be assumed unbounded.
One of the basic connections [8,9] between FO and Cylindric Set Algebra is
that, given any interpretation I and FO-formula ϕ, the set of valuations ν under
which ϕ is true in I can be represented as such a relation C. Moreover, each
logical connective and quantifier corresponds to an operator in the Cylindric Set
Algebra. Naturally disjunction corresponds to union, conjunction to intersection,
and negation to complement. More interestingly, existential quantification on
variable xi corresponds to cylindrification ci on column i, defined as

                    ci (C) = {ν ∶ ν(i/a) ∈ C, for some a ∈ D},

where ν(i/a) denotes the valuation (tuple) ν ′ , and ν ′ (i) = a and ν ′ (j) = ν(j) for
i ≠ j. The algebraic counterpart of universal quantification can be derived from
cylindrification and complement, or be defined directly as inner cylindrification

                       i (C)   = {ν ∶ ν(i/a) ∈ C, for all a ∈ D}.
                      c

In addition, in order to represent equality, the Cylindric Set Algebra also contains
constant relations dij = {t ∈ Dn ∶ t(i) = t(j)}, representing the equality xi ≈ xj .
That is, dij is the set of all valuations ν, such that ν(i) = ν(j).
    The objects C and dij of [8,9] are of course infinitary. In this paper we
therefore develop a finitary representation mechanism, namely relations containing
universal nulls “∗” and certain equality literals. These objects are called Star
Tables when they represent the records stored in the database. When used as run-
time constructs in algebraic query evaluation, they will be called Star Cylinders.
Example 1 showed star-tables in a database. The run-time variable binding
pattern of the query (1), as well as its algebraic evaluation is shown in the
star-cylinders in Example 3 below.

Example 3. Continuing Example 1, in that database the atoms F (x1 , x2 ) and
H(x3 , x4 ) of query (1) are represented by star-tables CF and CH , and the equality
atom is represented by the star-cylinder C23 . Note that these are positional
relations, the “attributes” x1 , x2 , x3 , x4 are added for illustrative purposes only.

       CF                       CH                               C23
       x1    x2    x3 x4        x1 x2 x3  x4                     x1 x2 x3 x4
       Alice Chris ∗ ∗          ∗ ∗ Alice Movies                 ∗ ∗ ∗ ∗ 2=3
       ∗     Alice ∗ ∗          ∗ ∗ Alice Music
       Bob ∗       ∗ ∗          ∗ ∗ Bob Basketball
       Chris Bob ∗ ∗

The algebraic translation of query (1) is the SCA-expression

                           ċ2 (ċ3 (˙ 1 ((CF ⩀ CH ) ⩀ C23 )))
                                   c                                               (2)

The intersection of CF and CH is carried out as star-intersection ⩀, where for
instance {(a, ∗, ∗)} ⩀ {(∗, b, ∗)} = {(a, b, ∗)}. The result will contain 12 tuples,
and when these are star-intersected with C23 , the star-cylinder C23 will act as a
selection by columns 2 and 3 being equal. The result is the left-most star-cylinder
C ′ = (CF ⩀ CH ) ⩀ C23 below.
    C′                                  C ′′                         C ′′′
    x1    x2    x3    x4                x1 x2   x3    x4             x1 x2 x3 x4
    ∗     Alice Alice Movies            ∗ Alice Alice Movies         ∗ ∗ ∗ Movies
    ∗     Alice Alice Music             ∗ Alice Alice Music          ∗ ∗ ∗ Music
    Bob Alice Alice Movies
    Bob Alice Alice Music
    Bob Bob Bob Basketball
    Chris Bob Bob Basketball
Applying the inner star-cylindrification on column 1 results in C ′′ in the middle
above. Finally, applying outer star-cylindrifications on columns 2 and 3 of star-
cylinder C ′′ yields the final result C ′′′ = ċ2 (ċ3 (˙ 1 ((CF ⩀ CH ) ⩀ C23 ))) right-most
                                                     c
above. The system can now return the answer, i.e. the values of column 4 in
cylinder C ′′′ . Note that columns where all rows are “∗” do not actually have to
be materialized at any stage. Negation requires some additional details that can
be found in the full version [6]. ◂
    The aim of this paper is to develop a clean and sound modeling of universal
nulls, and furthermore show that the model can be seamlessly extended to
incorporate the existential nulls of Imielinski and Lipski [10]. We show that FO
and our Cylindric Star Algebra are equivalent in expressive power when it comes
to querying databases containing universal nulls, and that our algebraic queries
can be evaluated naively. This will be done in three steps: In Section 2 we show
the equivalence between FO and Cylindric Set Algebra (CA) [8,9] over infinitary
databases. This was of course only the starting point of [8,9], and we recast the
result here in terms of database theory.1 In Section 3 we introduce our finitary
Cylindric Star Algebra (SCA) 2 , and show the equivalence between CA and SCA,
thus denomstrating that certain infinitary cylinders can be finitely represented as
star-cylinders, and that our finitary Cylindric Star Algebra on finite star-cylinders
mirrors the Cylindric Set Algebra on the infinite cylinders they represent.
    At the end of Section 3 we we take the third step show how to tie these
two results together, delivering the promised SCA evaluation of FO queries on
databases containing universal nulls. In the full paper [6] we seamlessly extend
our framework to also handle existential nulls, and show that naive evaluation
can still be used for positive queries (allowing universal quantification, but not
negation) on databases containing both universal and existential nulls. The full
paper [6] shows that all SCA expressions can be evaluated in time polynomial in
the size of the database when only universal nulls are present. In [6] we also show
that when both universal and existential nulls are present, the certain answer to
any negation-free (allowing inner cylindrification, i.e. universal quantification)
SCA query can be evaluated naively in polynomial time. The full paper [6]
also includes complexity results related to database and view containment for
databases with universal nulls.
1
    Van Den Bussche [3] has recently referred to [8,9] in similar terms.
2
    In this extended abstract we only describe the positive case, where there is no negation
    in the query or database. The extension to include negation is developed in the full
    paper [6].
2     Relational calculus and cylindric set algebra

Throughout this paper we assume a fixed schema R = {R1 , . . . , Rm , ≈}, where
each Rp , p ∈ {1, . . . , m}, is a relational symbol with an associated positive integer
ar(Rp ), called the arity of Rp . The symbol ≈ represents equality.
Logic. Our calculus is the standard domain relational calculus. Let {x1 , x2 , . . .}
be a countably infinite set of variables. We define the set of FO-formulas ϕ
(over R) in the usual way: Rp (xi1 , . . . , xiar(Rp ) ) and xi ≈ xj are atomic formulas,
and these are closed under ∧, ∨, ¬, ∃xi , and ∀xi , in a well-formed manner possibly
using parenthesis’s for disambiguation. If an FO-formula uses n variables, it will
be called an FOn -formula.
Instances. Let D = {a1 , a2 , . . .} be a countably infinite domain. An instance I
(over R) is a mapping that assigns a possibly infinite subset RpI of Dar(Rp ) to
each relation symbol Rp , and ≈I = {(a, a) ∶ a ∈ D}. Note that our instances are
infinite model-theoretic ones. The set of tuples actually recorded in the database
will be called the stored database (to be defined in Section 3).
    In order to define the (standard) notion of truth of an FOn -formula ϕ in an
instance I we first define a valuation to be a mapping ν ∶ {x1 , . . . , xn } → D. If ν
is a valuation, xi a variable and a ∈ D, then ν(i/a) denotes the valuation which is
the same as ν, except ν(i/a) (xi ) = a. Then we use the usual recursive definition
of I ⊧ν ϕ, meaning instance I satisfying ϕ under valuation ν, i.e. I ⊧ν (xi ≈ xj )
if (ν(xi ), ν(xj )) ∈ ≈I , I ⊧ν Rp (xi1 , . . . , xiar(Rp ) ) if (ν(xi1 ), . . . , ν(xiar(Rp ) )) ∈ RpI ,
and I ⊧ν ∃xi ϕ if I ⊧ν(i/a) ϕ for some a ∈ D, and so on. Our stored databases
will be finite representations of infinite instances, so the semantics of answers to
FO-queries will be defined in terms of the infinite instances:

Definition 1. Let I be an instance, and ϕ an FOn -formula with f ree(ϕ) =
{xi1 , . . . , xik }, k ≤ n. Then the answer to ϕ on I is defined as

                            ϕI = {(ν(xi1 ), . . . , ν(xik )) ∶ I ⊧ν ϕ}.


Algebra. As noted in [11] the relational algebra is really a disguised version of
the Cylindric Set Algebra of Henkin, Monk, and Tarski [8,9]. We shall therefore
work directly with the Cylindric Set Algebra instead of Codd’s Relational Algebra.
Apart from the conceptual clarity, the Cylindric Set Algebra will also allow us to
smoothly introduce the promised universal nulls.
    Let n be a fixed positive integer. The basic building block of the Cylindric Set
Algebra is an n-dimensional cylinder C ⊆ Dn . Note that a cylinder is essentially
an infinite n-ary relation. They will however be called cylinders, in order to
distinguish them from instances. The rows in a cylinder will represent run-time
variable valuations, whereas tuples in instances represent facts about the real
world. We also have special cylinders called diagonals, of the form dij = {t ∈ Dn ∶
t(i) = t(j)} representing the equality xi ≈ xj . We can now define the Cylindric
Set Algebra.
Definition 2. Let C and C ′ be infinite n-dimensional cylinders. The Cylindric
Set Algebra consists of set theoretic union C ⋃ C ′ , complement C = Dn ∖ C,
diagonals dij = {t ∈ Dn ∶ t(i) = t(j)}, and outer cylindrification:

                     ci (C) = {t ∈ Dn ∶ t(i/a) ∈ C, for some a ∈ D}.

The operation ci corresponds to existential quantification of variable xi . For the
geometric intuition behind the name cylindrification, see [8,11]. Intersection is
considered a derived operator, and we also introduce the inner cylindrification as
a derived operator i (C) = ci (C), corresponding to universal quantification. Note
                       c
that
                    i (C) = {t ∈ D ∶ t(i/a) ∈ C, for all a ∈ D}.
                       c          n



Equivalence of FO and CA. In the next two theorems we will restate, in the
context of the relational model, the correspondence between domain relational
calculus and cylindric set algebra as query languages on instances [8,9]. An
expression E in cylindric set algebra of dimension n will be called a CAn -
expression. When translating an FOn -formula to a CAn -expression we first need
to extend all k-ary relations in I to n-ary by filling the n − k last columns in all
possible ways. The result is denoted hn (I).
    Once an instance is expanded it becomes a sequence C = (C1 , . . . , Cm , dij )i,j
of n-dimensional cylinders and diagonals, on which Cylindric Set Algebra Expres-
sions can be applied. The main technical difficulty in the translation from FOn to
CAn is the correlation of the variables in the FOn -sentence ϕ with the columns
in the expanded relations in the instance. This can be achieved using a derived
“swapping” operator zij11,...,i    ,...,jk that interchanges the columns il and jl , where l ∈
                                         k


{1, . . . , k}. Every atom Rp in ϕ will correspond to a CAn -expression Cp = hn (RpI ).
               3

However, for every occurrence of an atom Rp (xi1 , . . . , xik ) in ϕ we need to in-
terchange the columns 1, . . . , k with columns i1 , . . . , ik . This is achieved by the
                 i1 ,...,ik (Cp ). The entire FOn -formula ϕ with f ree(ϕ) = {xi1 , . . . , xik }
expression z1,...,k
will then correspond to the CAn -expression Eϕ = zi1,...,k     1 ,...,ik
                                                                         (Fϕ ), where Fϕ is defined
recursively as follows:

  – If ϕ = Rp (xi1 , . . . , xik ) where k = ar(Rp ), then Fϕ = z1,...,k
                                                                 i1 ,...,ik (Cp ).
  – If ϕ = xi ≈ xj , then Fϕ = dij .
  – If ϕ = ψ ∨ χ, then Fϕ = Fψ ⋃ Fχ , if ϕ = ψ ∧ χ, then Fϕ = Fψ ⋂ Fχ , and if
    ϕ = ¬ ψ, then Fϕ = Fψ .
  – If ϕ = ∃xi ψ, then Fϕ = ci (Fψ ).
  – If ϕ = ∀xi ψ, then Fϕ = i (Fψ ).
                                 c

For an example, let us reformulate the F O4 -query ϕ from (1) as

                x4 . ∃x2 ∃x3 ∀x1 (R1 (x1 , x2 ) ∧ R2 (x3 , x4 ) ∧ (x2 ≈ x3 )).

 3
     For a definition of swapping using primitive operators, see Definition 1.5.12 in [8].
When translating ϕ, the relation R1I is first expanded to C1 = R1I × D × D, and R2I
is expanded to C2 = R2I × D × D. In order to correlate the variables in ϕ with the
columns in the expanded relations, we do the shifts z1,2 1,2 (C1 ) and z3,4 (C2 ). The
                                                                        1,2

equality (x2 ≈ x3 ) was expanded to the diagonal d23 = {t ∈ D ∶ t(2) = t(3)} so
                                                                   n

here the variables are already correlated. After this the conjunctions are replaced
with intersections and the quantifiers with cylindrifications. Finally, the column
corresponding to the free variable x4 in ϕ (whose bindings will constitute the
answer) is shifted to column 1. The final CAn -expression will then be evaluated
against I as Eϕ (h4 (I)) =

                 z41 (c23 ( 1 (z1,2
                         c
                                1,2 (R1 × D ) ⋂ z3,4 (R2 × D ) ⋂ d23 ))).
                                      I    2     1,2   I    2



We now have Eϕ (h4 (I)) = h4 (ϕI ). The following fundamental result follows from
[8,9]. For the benefit of the readers who don’t want to consult [8,9], an explicit
proof (in the current notation) can be found in [6].
Theorem 1. For all FOn -formulas ϕ, there is a CAn expression Eϕ , such that
Eϕ (hn (I)) = hn (ϕI ), for all instances I. ◂
   The translation from CAn -expressions to FOn -formulas is fairly straightfor-
ward an can be found in the full version [6].


3    Cylindric Set Algebra and Cylindric Star Algebra
Since cylinders can be infinite, we want a finite mechanism to represent (at
least some) infinite cylinders, and the mechanism to be closed under queries.
Our representation mechanism comes in two variations, depending on whether
negation is allowed or not. Here we only consider the positive (no negation) case.
The full machinery is described in [6].
Star Cylinders. We define an n-dimensional (positive) star-cylinder Ċ to be a
finite set of n-ary star-tuples, the latter being elements of (D ∪ {∗})n × ℘(Θn ),
where Θn denotes the set of all equalities of the form i = j, with i, j ∈ {1, . . . , n}.
Star-tuples will be denoted ṫ, u̇, . . .. A star-tuple such as ṫ = (a, ∗, c, ∗, ∗, {(4 = 5)})
is meant to represent the set of all “ordinary” tuples (a, x, c, y, y) where x, y ∈ D.
It will be convenient to assume that all our star-cylinders are in the following
normal form.
Definition 3. An n-dimensional star-cylinder Ċ is said to be in normal form if
ṫ(n + 1) ⊧ (i = j) entails (i = j) ∈ ṫ(n + 1) and ṫ(i) = ṫ(j), for all star-tuples ṫ ∈ Ċ.
    The symbol ⊧ above stands for standard logical implication. It is easily
seen that maintaining star-cylinders in normal form can be done efficiently in
polynomial time. We shall therefore assume without loss of generality that all
star-cylinders and star-tuples are in normal form. We next define the notion of
dominance, where a dominating star-tuple represents a superset of the ordinary
tuples represented by the dominated star-tuple. First we define a relation ⪯ ⊆
(D ∪ {∗})2 by a ⪯ a, ∗ ⪯ ∗, and a ⪯ ∗, for all a ∈ D.
Definition 4. Let ṫ and u̇ be n-dimensional star-tuples. We say that u̇ domi-
nates ṫ, denoted ṫ ⪯ u̇, if ṫ(i) ⪯ u̇(i) for all i ∈ {1, . . . , n}, and (i = j) ∈ u̇(n + 1)
entails (i = j) ∈ ṫ(n + 1) when ṫ(i) = ṫ(j) = ∗, and entails ṫ(i) = ṫ(j) otherwise.
    We complete the definition by stipulating that ṫ ⪯ u̇ whenever ṫ(n + 1) =
{false}4 . We can now define the meet ṫ ⋏ u̇ of star-tuples ṫ and u̇ :

Definition 5. Let ṫ and u̇ be n-ary star-tuples. If ṫ(j), u̇(j) ∈ D for some j
and ṫ(j) ≠ u̇(j) then ṫ ⋏ u̇ (i) = a for i ∈ {1, . . . , n}, and ṫ ⋏ u̇(n + 1) = {false}. 5
Otherwise, for i ∈ {1, . . . , n}
                                         ⎧
                                         ⎪ ṫ(i) if ṫ(i) ∈ D
                                         ⎪
                                         ⎪
                           ṫ ⋏ u̇ (i) = ⎨ u̇(i) if u̇(i) ∈ D
                                         ⎪
                                         ⎪
                                         ⎪
                                         ⎩∗      if ṫ(i) = u̇(i) = ∗

and ṫ ⋏ u̇ (n + 1) = ṫ(n + 1) ∪ u̇(n + 1).

      For an example, let ṫ = (a, ∗, ∗, ∗, ∗, {(3 = 4)}) and u̇ = (∗, b, ∗, ∗, ∗, {(4 = 5)}).
Then we have ṫ ⋏ u̇ = (a, b, ∗, ∗, ∗, {(3 = 4), (4 = 5), (3 = 5)}). Note that ṫ ⋏ u̇ ⪯ ṫ,
and ṫ ⋏ u̇ ⪯ u̇. Note also that for n-ary star-tuples ṫ∅ = (a, a, . . . , a, {false}) and
ṫDn = (∗, ∗, . . . , ∗, {true}), and for any n-ary star-tuple ṫ, it holds that ṫ ⋏ ṫ∅ = t˙∅ ,
ṫ ⋏ ṫDn = ṫ, and ṫ∅ ⪯ ṫ ⪯ ṫDn .
      We extend the order ⪯ to include “ordinary” n-ary tuples t ∈ Dn by identifying
(a1 , . . . , an ) with star-tuple (a1 , . . . , an , {true}). Let Ċ be an n-dimensional star-
cylinder. We can now define the meaning of Ċ to be the set [[Ċ ]] of all ordinary
tuples it represents, where [[Ċ ]] = {t ∈ Dn ∶ t ⪯ u̇ for some u̇ ∈ Ċ}. We lift the
order to n-dimensional star-cylinders Ċ and Ḋ, by stipulating that Ċ ⪯ Ḋ, if for
all star-tuples ṫ ∈ Ċ there is a star-tuple u̇ ∈ Ḋ, such that ṫ ⪯ u̇.

Lemma 1. Let Ċ and Ḋ be n-dimensional (positive) star-cylinders. Then it
holds that [[Ċ]] ⊆ [[Ḋ]] iff Ċ ⪯ Ḋ.

Positive Cylindric Star Algebra. Next we redefine the positive cylindric set
operators so that [[Ċ ○˙ Ḋ]] = [[Ċ]] ○ [[Ḋ]] or ○([[Ċ]]) = [[˙○(Ċ)]], for each
positive cylindric operator ○, its redefinition ○,
                                                ˙ and star-cylinders Ċ and Ḋ.
Definition 6. The positive cylindric star-algebra consists of the operators
 1. Star-diagonal: d˙ij = {(∗, . . . , ∗, (i = j))}
 2. Star-union: Ċ ⊍ Ḋ = {ṫ ∶ ṫ ∈ Ċ or ṫ ∈ Ḋ}
 3. Star-intersection: Ċ ⩀ Ḋ = {ṫ ⋏ u̇ ∶ ṫ ∈ Ċ and u̇ ∈ Ḋ}
 4. Outer cylindrification: Let i ∈ {1, . . . , n}, let Ċ be an n-dimensional star-
    cylinder, and ṫ ∈ Ċ. Then

                                                      ṫ(j) if j ≠ i
                                    ċi (ṫ)(j) = {
                                                      ∗ if j = i
 4
     Note that only in this case ṫ is not in normal form.
 5
     Here a is an arbitrary constant in D.
     for j ∈ {1, . . . , n}, and

                          ċi (ṫ)(n + 1) = {(j = k) ∈ ṫ(n + 1) ∶ j, k ≠ i}.

    We then let ċi (Ċ) = {ċi (ṫ ) ∶ ṫ ∈ Ċ}.
 5. Inner cylindrification: Let Ċ be an n-dimensional cylinder and i ∈ {1, . . . , n}.
    Then ˙ i (Ċ) = {ṫ ∈ Ċ ∶ ṫ(i) = ∗ and (i = j) ∉ ṫ(n + 1) for any j}.
            c

   The class of positive SCAn - and CAn -expressions will be denoted SCA+n and
CA+n . We illustrate the positive cylindric star-algebra with the following small
example.
Example 4. Let Ċ1 = {(a, ∗, ∗, ∗, ∗, {(3 = 4)})}, Ċ2 = {(∗, b, ∗, ∗, ∗, {(4 = 5)})},
Ċ3 = {(a, b, ∗, ∗, ∗, {(4 = 5)})}, and consider ˙ 3 ((ċ1,4 (Ċ1 ⩀ Ċ2 )) ⊍ Ċ3 ). Then we
                                                           c
have the following.
                               Ċ1 ⩀ Ċ2 = {(a, b, ∗, ∗, ∗, {(3 = 4), (4 = 5), (3 = 5)})}
                          ċ1,4 (Ċ1 ⩀ Ċ2 ) = {(∗, b, ∗, ∗, ∗, {(3 = 5)})}
                (ċ1,4 (Ċ1 ⩀ C˙2 )) ⊍ C3 = {(∗, b, ∗, ∗, ∗, {(3 = 5)})
                                              (a, b, ∗, ∗, ∗, {(4 = 5)})}
           ˙ 3 ((c1,4 (Ċ1 ⩀ C˙2 )) ⊍ C3 ) = {(a, b, ∗, ∗, ∗, {(4 = 5)})}
           c

     Next we show that the cylindric star-algebra has the promised property.

Theorem 2. For every SCA+n -expression Ė and the corresponding CA+n expres-
sion E, it holds that [[Ė(Ċ)]] = E([[Ċ]]) for every sequence of n-dimensional
star-cylinders and star-diagonals Ċ.

Stored databases with universal nulls. We now show how to use star cylin-
dric algebra to evaluate FO-queries on stored databases containing universal
nulls. Let k be a positive integer. Then a k-ary star-relation Ṙ is a finite set
of star-tuples of arity k. In other words, a k-ary star-relation is a star-cylinder
of dimension k. A sequence Ṙ of star-relations (over schema R) is called a
stored database. Examples 1 and 3 show stored databases. A stored database Ṙ
represents the infinite instance [[Ṙ]] = ([[Ṙ1 ]], . . . , [[Ṙm ]], {(a, a)}a∈D ).
    Everything that is defined for star-cylinders applies to k-ary star-relations.
The major exception is that no operators from the cylindric star-algebra are
applied to star-relations. To do that, we first need to expand the stored database
Ṙ, by filling columns n − k . . . n with ∗ in all k-ary relations, and similarly for the
diagonals. The result is denoted ḣn (Ṙ). We are now ready for our main result.6

Theorem 3. For every FOn -formula ϕ there is an SCAn expression Ėϕ , such
that for every stored database Ṙ, we have hn (ϕ[[Ṙ]] ) = [[Ėϕ (ḣn (Ṙ))]].
6
    It might be argued for some applications that the “*”-value should range over the
    finite active domain only. This can be achieved in the positive framework by requiring
    for each star-cylinder Ċ (expanded star-relation), and each star-tuple ṫ ∈ Ċ, that if
    ci ([[ṫ]]) ⊆ [[Ċ]], then Ċ also contains a star-tuple u̇, where u̇ = ṫ(i/∗).
4    Related and future work
Universal nulls were first studied in the early days of database theory by Biskup
in [2]. This was a follow-up on his earlier paper on existential nulls [1]. The
problem with Biskup’s approach, as noted by himself, was that the semantics for
his algebra worked only for individual operators, not for compound expressions
(i.e. queries). This was remedied in the foundational paper [10] by Imielinski and
Lipski, as far as existential nulls were concerned. Universal nulls next came up in
[11], where Imielinski and Lipski showed that Codd’s Relational Algebra could
be embedded in CA, the Cylindric Set Algebra of Henkin, Monk, and Tarski
[8,9]. As a side remark, Imielinski and Lipski suggested that the semantics of
their “∗” symbol could be seen as modeling the universal null of Biskup, and
proposed a simplified version of the star-cylinders corresponding to the structures
in Diagonal-free Cylindric Set Algebras [8,9]. The exact FO-expressive power of
the finite diagonal-free star-cylinders is an open question. Nevertheless, using the
techniques of [6], it can be shown that naive existential nulls can be seamlessly
incorporated in diagonal-free star-cylinders.

References
 1. Joachim Biskup. A foundation of codd’s relational maybe-operations. ACM Trans.
    Database Syst., 8(4):608–636, 1983.
 2. Joachim Biskup. Extending the relational algebra for relations with maybe tuples
    and existential and universal null values. Fundam. Inform., 7(1):129–150, 1984.
 3. Jan Van den Bussche. Applications of Alfred Tarski’s ideas in database theory. In
    Computer Science Logic, 15th International Workshop, Paris, France, September
    10-13, 2001, pages 20–37, 2001.
 4. Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang Chiew Tan. Reverse
    data exchange: coping with nulls. In PODS, pages 23–32, 2009.
 5. Amélie Gheerbrant, Leonid Libkin, and Cristina Sirangelo. Naı̈ve evaluation of
    queries over incomplete databases. ACM Trans. Database Syst., 39(4):31:1–31:42,
    December 2014.
 6. G. Grahne and A. Moallemi. Universal (and Existential) Nulls. ArXiv e-prints,
    March 2018.
 7. Gösta Grahne, Ali Moallemi, and Adrian Onet. Intuitionistic data exchange. In
    Proceedings of the 9th Alberto Mendelzon International Workshop on Foundations
    of Data Management, Lima, Peru, May 6 - 8, 2015., 2015.
 8. Leon Henkin, J Donald Monk, and Alfred Tarski. Cylindric Algebras–Part I,volume
    64 of Studies in Logic and the Foundations of Mathematics. North-Holland Pub-
    lishing Company, 1971.
 9. Leon Henkin, J Donald Monk, and Alfred Tarski. Cylindric Algebras–Part II,
    volume 115 of Studies in Logic and the Foundations of Mathematics. North-Holland
    Publishing Company, 1985.
10. Tomasz Imielinski and Witold Lipski Jr. Incomplete information in relational
    databases. J. ACM, 31(4):761–791, 1984.
11. Tomasz Imielinski and Witold Lipski Jr. The relational model of data and cylindric
    algebras. J. Comput. Syst. Sci., 28(1):80–102, 1984.
12. Leonid Libkin. Negative knowledge for certain query answers. In Web Reasoning
    and Rule Systems - 10th International Conference, 2016, pages 111–127, 2016.