Extension of Tuple Calculus to Multisets
                         Iryna Lysenko1*, Olha Moroz2

                         1 Nizhyn Mykola Gogol State University, Grafska str., 2, Nizhyn, 16600, Ukraine
                         2 International Research and Training Centre for Information Technologies and Systems of the NASU and MESU,
                           Academician Hlushkov Ave., 40, Kyiv, 03680, Ukraine


                                           Abstract
                                           This paper continues a line of research devoted to the topical problem of developing the theoretical
                                           foundations of table (relational) databases. The use of multisets in table databases is an important and
                                           relevant issue. Many database-oriented languages require a relational model with multiset semantics. Many
                                           applied problems are characterized by multiplicity and repetition of data, for example, sociological
                                           polls of different population groups, calculations on DNA, and others. In this context, we consider the
                                           construction of a tuple calculus for a multiset table algebra, in which the concept of a table is refined
                                           using the concept of a multiset. The article formalizes the tuple calculus for multiset table algebra.
                                           The alphabet and the syntax of terms, atoms, and formulas are defined. The set of legal formulas is
                                           introduced through the concepts of free and bound variables. The scheme of a tuple variable and the set
                                           of attributes with which a tuple variable occurs in a formula are also introduced. An expression of the
                                           tuple calculus for multiset table algebra is defined as a multiset of tuples that satisfy the condition
                                           given by a legal formula. The article provides rules for determining the number of duplicates of a tuple
                                           in the resulting multiset. Another important result is the proof that the constructed tuple calculus is
                                           no less expressive than multiset table algebra. This research opens up new possibilities for the
                                           development of database theory and may be useful for information technology and database professionals.
                                           It contributes to a deeper understanding of the principles of query construction, an important aspect of
                                           modern computer science and industry.

                                           Keywords
                                           Relational databases, multiset, multiset table algebra, tuple calculus


                         1. Introduction
                         Relational calculus underlies most relational query languages, specifying only the expected result,
                         while relational algebra involves constructing a relational expression and performing operations.
                         There are two main approaches to relational calculus:

                                •   Tuple calculus (E. Codd) which operates on table tuples [1];
                                •   Domain calculus (M. Lacroix, A. Pirotte) which focuses on table domains [2].

   The clarification of the concept of a relation in terms of nominal sets was carried out by V.N.
Redko, Yu.Y. Bron, D.B. Buy, and S.A. Polyakov [3]. The monograph [4] introduces table algebra
for infinite (finite) tables, which significantly generalizes and extends Codd's classical
relational algebra. The generalization is that a relation is understood as an arbitrary set of
single-scheme tuples, in particular an infinite one, while each table is assigned a certain scheme.
Tuple (domain) calculi have also been constructed for these algebras. Tuple calculus and domain
calculus are supplemented with functional and predicate signatures on the universal domain. Two
main results are presented. The first result demonstrates the equivalence between the table algebra
for infinite tables and the corresponding generalized relational calculi. The second result concerns
the equivalence between the subalgebra of finite tables of this algebra and the corresponding
restricted generalized relational calculi. For the second result, consideration is limited to
"safe" expressions. The methodological basis of these formalisms is the compositional approach to
programming.

14th International Scientific and Practical Conference from Programming UkrPROG’2024, May 14-15, 2024, Kyiv, Ukraine
* Corresponding author.
iryna.glushko@ndu.edu.ua (I. Lysenko); mog_91@ukr.net (O. Moroz)
ORCID: 0000-0003-2549-5356 (I. Lysenko); 0000-0002-0356-8780 (O. Moroz)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

   Additionally, multiset table algebra, which extends database capabilities through multisets, is
constructed in [4]. The signature of multiset table algebra is supplemented with new operations:
inner and outer joins, semijoins, and aggregate operations.
   In [5, 6], set-theoretic and logical-algebraic methods were used to show that the table algebra of
infinite tables does not form a subalgebra of multiset table algebra, since it is not closed with
respect to union, projection, and active complement. Thus, multiset table algebra is not a wider
formalism than the table algebra of infinite tables.
   Many database-oriented languages, such as SQL, require a relational model with multiset
semantics. However, the traditional tuple (domain) calculi do not support queries that produce
tables with duplicates. Therefore, there is a need to develop a calculus for multiset table algebra.
   The research aims to construct a tuple calculus for multiset table algebra and to show that it is
no less expressive than multiset table algebra.


2. Basics of Multisets
Many languages oriented towards working with databases require a relational data model with
multiset semantics because, firstly, relations that allow duplicates are useful in many applied fields
where duplicate objects may exist; secondly, in the relational data model, removing duplicates after
performing projection and union operations implies merging identical elements or performing other
labor-intensive actions. Various researchers have studied the use of multisets in databases,
including Paul Grefen and Rolf A. de By [7, 8], G. Lamperti, M. Melchiori, and M. Zanella
[9], H. Garcia-Molina, J. Ullman, and J. Widom [10, 11], A. Silberschatz, H. Korth, and S. Sudarshan
[12], and D.B. Buy, S.A. Polyakov, Yu.Y. Bron, and V.N. Redko [3]. However, this issue still requires
clarification and expansion.
    Let's start by considering the key terms of multisets based on sources [3, 13].
    Let 𝑈 be an arbitrary set. A multiset 𝛼 with basis 𝑈 is a function

                                              𝛼: 𝑈 → 𝑁,

where 𝑁 = {1, 2, … } is the set of natural numbers.
Let $\mathbf{D}$ be the universe of elements of multiset bases. The characteristic function of a
multiset $\alpha$ is the function $\chi_{\alpha} : \mathbf{D} \to Z_{+}$ whose values are given by
the following piecewise scheme:

$$\chi_{\alpha}(d) = \begin{cases} \alpha(d), & \text{if } d \in dom\,\alpha, \\ 0, & \text{otherwise,} \end{cases}$$

for all $d \in \mathbf{D}$. Here $dom\,\alpha$ is the domain of the multiset as a function
($dom\,\alpha = U_{\alpha}$ is the basis of the multiset $\alpha$).
   The empty multiset $\varnothing_{m}$ is defined as the multiset whose basis is the empty set.
   The 1-multisets are multisets whose range of values is the empty set or a single-element set {1}.
These multisets are the analogues of ordinary sets.
   The operations over multisets are defined in terms of characteristic functions in the monograph [3].
The authors define the operations of multiset union $\cup_{1}$, intersection $\cap_{1}$, and
difference $\setminus_{1}$, which build 1-multisets, and the operations of multiset union
$\cup_{All}$, intersection $\cap_{All}$, and difference $\setminus_{All}$, which build multisets of
general form. The Cartesian product of multisets $\otimes$, the operation $Dist(\alpha)$, which builds
a 1-multiset, and an analogue of the full image for multisets are defined as well.
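
   The notions of this section can be illustrated with a minimal Python sketch. It models a multiset as a `collections.Counter` and assumes that the "All" operations add multiplicities (union), take their minimum (intersection), and subtract them with truncation at zero (difference). These semantics are an assumption consistent with the duplicate counts used later in the paper, not a quotation from [3]; the function names are hypothetical.

```python
from collections import Counter


def chi(alpha: Counter, d) -> int:
    """Characteristic function: multiplicity of d, or 0 outside the basis."""
    return alpha[d] if d in alpha else 0


def dist(alpha: Counter) -> Counter:
    """Dist: the 1-multiset that has the same basis as alpha."""
    return Counter({d: 1 for d in alpha})


def is_one_multiset(alpha: Counter) -> bool:
    """True if every element of the basis occurs exactly once."""
    return all(n == 1 for n in alpha.values())


# Assumed semantics of the "All" operations (hypothetical helper names).
def union_all(a: Counter, b: Counter) -> Counter:
    return a + b  # multiplicities are added


def intersection_all(a: Counter, b: Counter) -> Counter:
    return a & b  # minimum of the multiplicities


def difference_all(a: Counter, b: Counter) -> Counter:
    return a - b  # truncated subtraction, zero counts are dropped


alpha = Counter({"a": 2, "b": 1})
beta = Counter({"a": 1, "c": 3})
```

   Using `Counter` keeps the basis implicit: the keys with a positive count are exactly the elements of the basis.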

3. Multiset Table Algebra
Let's examine the main concepts and statements of multiset table algebra, based on the monograph
[4]. Here a relation is understood as a multiset, which may in particular be infinite.
    Let's consider two sets: 𝑨 is the set of attributes, and 𝑫 is the universal domain.
    A scheme is an arbitrary finite set of attributes $R \subseteq \mathbf{A}$. A tuple of the scheme $R$
is a nominal set on the pair $R$, $\mathbf{D}$ whose projection onto the first component is equal to
$R$. We use the following notation: $S(R)$ is the set of all tuples of the scheme $R$, and $S$ is the
set of all tuples.
    A table of the scheme $R$ ($R \subseteq \mathbf{A}$) is a pair

$$\langle \psi, R \rangle,$$

where the first component $\psi$ is an arbitrary multiset whose basis $\Theta(\psi)$ is a set of tuples
of the same scheme, and the second component $R$ is the scheme of the table.
   The set of all tables of the scheme $R$ is denoted by $\Psi(R)$, and the set of all tables is denoted by
$\Psi = \bigcup_{R \subseteq \mathbf{A}} \Psi(R)$.
   The notation $Occ(s, \psi)$ denotes the number of duplicates of the tuple $s$ in the multiset $\psi$.
A multiset $\psi$ can also be written as $\{s_1^{n_1}, \dots, s_k^{n_k}\}$, where
$n_i = Occ(s_i, \psi)$, $i = 1, \dots, k$, and $\Theta(\psi) = \{s_1, \dots, s_k\}$ is the basis of the
multiset $\psi$.
   Multiset table algebra is the algebra

$$\langle \Psi, \Omega_{P,\Xi} \rangle,$$

where $\Psi$ is the set of all tables,
$\Omega_{P,\Xi} = \{\cup^{R}_{All}, \cap^{R}_{All}, \setminus^{R}_{All}, \sigma_{p,R}, \pi_{X,R}, \otimes_{R_1,R_2}, Rt_{\xi,R}, \sim_{R}\}^{p \in P,\ \xi \in \Xi}_{X, R, R_1, R_2 \subseteq \mathbf{A}}$
is the signature, and $P$, $\Xi$ are the sets of parameters. The signature $\Omega_{P,\Xi}$ contains the
multiset operations (union $\cup^{R}_{All}$, intersection $\cap^{R}_{All}$, difference
$\setminus^{R}_{All}$) and the special operations (selection $\sigma_{p,R}$, projection $\pi_{X,R}$,
join $\otimes_{R_1,R_2}$, renaming $Rt_{\xi,R}$, and active complement $\sim_{R}$).
   Multiset table algebra is also supplemented with additional operations: inner joins (Cartesian
join, natural join, join using attributes, and join on a predicate), outer joins (left outer join,
right outer join, full outer join, and union join), semijoin, and aggregate operations (Sum, Avg, Max,
Min, Count). The special element NULL is added to the universal domain in order to define the outer
joins and outer set operations. The operations of the signature $\Omega_{P,\Xi}$ and the formal
mathematical semantics of the additional operations are defined in [4].
   The following statement holds.
   Theorem 1. Any expression of the multiset table algebra can be replaced with an equivalent
expression that uses only the operations of selection, join, projection, union, difference, and
renaming.
   Proof. To prove the statement, we show that the operations of intersection and active complement
can be expressed through the other operations of multiset table algebra. The operation of multiset
intersection can be replaced by the difference [13]:

$$\langle \psi_1, R \rangle \cap^{R}_{All} \langle \psi_2, R \rangle = \langle \psi_1, R \rangle \setminus^{R}_{All} \bigl( \langle \psi_1, R \rangle \setminus^{R}_{All} \langle \psi_2, R \rangle \bigr).$$

    The operation of active complement is expressed through the operations of projection,
difference, and join (by definition):

$$\sim_{R}(\langle \psi, R \rangle) = C(\langle \psi, R \rangle) \setminus^{R}_{All} \langle \psi, R \rangle,$$

where
$C(\langle \psi, R \rangle) = \pi_{\{A_1\},R}(\langle \psi, R \rangle) \otimes_{\{A_1\},\{A_2\}} \dots \otimes_{\{A_1, \dots, A_{n-1}\},\{A_n\}} \pi_{\{A_n\},R}(\langle \psi, R \rangle)$,
and $R = \{A_1, \dots, A_n\}$ is the scheme of the table $\langle \psi, R \rangle$.
   Thus, the operations of intersection and active complement are derived from the other operations in
the signature of multiset table algebra and can henceforth be excluded from consideration.
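
   The two reductions used in the proof can be checked on small examples. The sketch below assumes that the multiset difference subtracts multiplicities with truncation at zero, that projection adds the multiplicities of merged tuples, and that join multiplies multiplicities; these assumptions match the duplicate counts stated in Section 4. Tuples are modelled as Python tuples of (attribute, value) pairs listed in scheme order, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product


def difference_all(a: Counter, b: Counter) -> Counter:
    """Multiset difference with truncated subtraction (assumed semantics)."""
    return a - b


def intersection_via_difference(a: Counter, b: Counter) -> Counter:
    """Intersection expressed as a \\ (a \\ b), as in the proof."""
    return difference_all(a, difference_all(a, b))


def project(psi: Counter, attrs) -> Counter:
    """Projection: multiplicities of tuples that coincide are added."""
    out = Counter()
    for s, n in psi.items():
        out[tuple((a, v) for a, v in s if a in attrs)] += n
    return out


def saturation(psi: Counter, scheme) -> Counter:
    """C(psi): join of the one-attribute projections; multiplicities multiply."""
    columns = [project(psi, {a}) for a in scheme]
    out = Counter()
    for combo in product(*(col.items() for col in columns)):
        s = tuple(pair for (t, _) in combo for pair in t)
        n = 1
        for _, k in combo:
            n *= k
        out[s] += n
    return out


def active_complement(psi: Counter, scheme) -> Counter:
    """Active complement: saturation minus the table itself."""
    return difference_all(saturation(psi, scheme), psi)


psi1 = Counter({("x",): 3, ("y",): 1})
psi2 = Counter({("x",): 2, ("z",): 4})
table = Counter({(("A", 1), ("B", 1)): 1, (("A", 2), ("B", 2)): 1})
```

   For `psi1` and `psi2` the expression `intersection_via_difference(psi1, psi2)` coincides with the element-wise minimum `psi1 & psi2`, and `active_complement(table, ["A", "B"])` contains exactly the two value combinations that are absent from `table`.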


4. Tuple Calculus for Multiset Table Algebra
   The basis of relational calculus is the calculus of first-order predicates. We will start the
construction of tuple calculus for multiset table algebra with the definition of the alphabet.
   The alphabet of tuple calculus for multiset table algebra consists of:
   •    a set of attributes $\mathbf{A}$ and a universal domain $\mathbf{D}$;
   •    a set of object variables (tuple variables) $x_1, x_2, \dots$;
   •    a set of object constants $d_1, d_2, \dots$;
   •    a set of function symbols $f_1^{n_1}, f_2^{n_2}, \dots$, $n_i \ge 1$;
   •    a set of predicate symbols $p_1^{m_1}, p_2^{m_2}, \dots$, $m_i \ge 1$;
   •    a set of symbols of constant tables $\langle \alpha, R \rangle$;
   •    a set of symbols of variable tables $\langle X, R \rangle$;
   •    the signs of logical operations $\neg$, $\wedge$, $\vee$ and the quantifiers $\forall$, $\exists$;
   •    punctuation marks: parentheses ( ) and commas.
   The universal domain 𝑫 is the domain of interpretation of object constants, and the set of all
tuples over 𝑫 is the domain of interpretation of object variables.
   We will use

   •    𝒙 as a syntactic variable ranging over the set of tuple variables;
   •    𝒇 as a syntactic variable ranging over the set of function symbols;
   •    𝒑 as a syntactic variable ranging over the set of predicate symbols;
   •    𝒅 as a syntactic variable ranging over the set of object constants;
   •    𝓐 as a syntactic variable ranging over the set of attributes.
   Terms and formulas are distinguished among the words written using alphabet symbols. Let us
define these syntactic objects by induction on their length.
   The following expressions are terms:
   a) $\mathbf{d}$ is a term;
   b) $\mathbf{x}(\mathcal{A})$ is a term;
   c) if $\mathbf{u}_1, \dots, \mathbf{u}_n$ are terms and $\mathbf{f}$ is a function symbol of arity $n$, then $\mathbf{f}(\mathbf{u}_1, \dots, \mathbf{u}_n)$ is a term;
   d) there are no terms other than those specified in points a)-c).
   Further in the text, $\mathbf{u}$ is a syntactic variable ranging over the set of terms.
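   Let's define the formulas, starting with atomic formulas (atoms), which come in three types:
   a1. For any constant table $\langle \alpha, R \rangle$ and any tuple variable $\mathbf{x}$, $\alpha_R(\mathbf{x})$ is an atom.
        $\alpha_R(\mathbf{x})$ stands for $\mathbf{x} \in \langle \alpha, R \rangle$.
   a2. For any variable table $\langle X, R \rangle$ and any tuple variable $\mathbf{x}$, $X_R(\mathbf{x})$ is an atom.
        $X_R(\mathbf{x})$ stands for $\mathbf{x} \in \langle X, R \rangle$.
   a3. For any terms $\mathbf{u}_1, \dots, \mathbf{u}_m$ and any predicate $\mathbf{p}$ of arity $m$ on the universal domain $\mathbf{D}$,
        $\mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m)$ is an atom.
   Formulas are constructed from atoms using the logical connectives $\neg$, $\wedge$, $\vee$, the quantifiers
$\forall$, $\exists$, and parentheses:
       f1. Every atom is a formula.
       f2. If $\mathbf{P}$ and $\mathbf{Q}$ are formulas, then $\neg\mathbf{P}$, $\mathbf{P} \wedge \mathbf{Q}$, $\mathbf{P} \vee \mathbf{Q}$ are formulas.
       f3. If $\mathbf{x}$ is a tuple variable, $\mathbf{P}$ is a formula, and $R \subseteq \mathbf{A}$ is a scheme, then
           $\forall\mathbf{x}(R)\mathbf{P}$ and $\exists\mathbf{x}(R)\mathbf{P}$ are formulas.
       f4. If $\mathbf{P}$ is a formula, then $(\mathbf{P})$ is a formula.
       f5. There are no other formulas besides those specified in points f1-f4.
   We will use $\mathbf{P}$, $\mathbf{Q}$, and $\mathbf{G}$ as syntactic variables ranging over the set of
formulas.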
   Let's define the class of legal formulas, using the concepts of free and bound variables, the
scheme $scheme(\mathbf{x}, \mathbf{P})$ of a tuple variable $\mathbf{x}$, and the set
$attr(\mathbf{x}, \mathbf{P})$ of attributes with which the tuple variable $\mathbf{x}$ occurs in a
formula $\mathbf{P}$. The expressions $scheme(\mathbf{x}, \mathbf{P})$ and $attr(\mathbf{x}, \mathbf{P})$
can be defined only if the tuple variable $\mathbf{x}$ has at least one free occurrence in the formula
$\mathbf{P}$. Additionally, the inclusion $attr(\mathbf{x}, \mathbf{P}) \subseteq scheme(\mathbf{x}, \mathbf{P})$
holds whenever both expressions are defined.
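   We first define the expression $attr$ for terms:
       1. if $\mathbf{u} = \mathbf{d}$, then $attr(\mathbf{x}, \mathbf{u}) = \varnothing$;
       2. if $\mathbf{u} = \mathbf{x}(\mathcal{A})$, then $attr(\mathbf{x}, \mathbf{u}) = \{\mathcal{A}\}$, and $attr(\mathbf{x}, \mathbf{y}(\mathcal{A})) = \varnothing$, where $\mathbf{x} \ne \mathbf{y}$;
       3. if $\mathbf{u} = \mathbf{f}(\mathbf{u}_1, \dots, \mathbf{u}_n)$, where $\mathbf{u}_i$ are terms, then $attr(\mathbf{x}, \mathbf{u}) = \bigcup_{i=1}^{n} attr(\mathbf{x}, \mathbf{u}_i)$.
   In other words, $attr(\mathbf{x}, \mathbf{u})$ is the set of attributes that the scheme of the tuple $\mathbf{x}$ must have.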
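
   The inductive definition of terms and of $attr$ for terms maps directly onto a small recursive data type. The following Python sketch is only an illustration; the class names `Const`, `AttrRef`, and `Func` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    """Term of kind a): an object constant d."""
    value: object


@dataclass(frozen=True)
class AttrRef:
    """Term of kind b): x(A), attribute A of the tuple variable x."""
    var: str
    attribute: str


@dataclass(frozen=True)
class Func:
    """Term of kind c): f(u_1, ..., u_n)."""
    symbol: str
    args: Tuple["Term", ...]


Term = Union[Const, AttrRef, Func]


def attr(x: str, u: Term) -> frozenset:
    """Attributes with which the tuple variable x occurs in the term u."""
    if isinstance(u, Const):
        return frozenset()
    if isinstance(u, AttrRef):
        # x(A) contributes {A}; y(A) with y != x contributes nothing.
        return frozenset({u.attribute}) if u.var == x else frozenset()
    if isinstance(u, Func):
        result = frozenset()
        for arg in u.args:
            result |= attr(x, arg)
        return result
    raise TypeError(f"not a term: {u!r}")


# Hypothetical term plus(x(Quiz), y(Total), 1).
term = Func("plus", (AttrRef("x", "Quiz"), AttrRef("y", "Total"), Const(1)))
```

   For this term, `attr("x", term)` is `{"Quiz"}`, `attr("y", term)` is `{"Total"}`, and a variable that does not occur in the term yields the empty set.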
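   If $\mathbf{P}$ is an atomic formula, the following cases are possible:
       a1. if $\mathbf{P} = \alpha_R(\mathbf{x})$, then $\mathbf{x}$ is free in $\mathbf{P}$ and $scheme(\mathbf{x}, \mathbf{P}) = attr(\mathbf{x}, \mathbf{P}) = R$;
       a2. similarly, if $\mathbf{P} = X_R(\mathbf{x})$, then $\mathbf{x}$ is free in $\mathbf{P}$ and $scheme(\mathbf{x}, \mathbf{P}) = attr(\mathbf{x}, \mathbf{P}) = R$;
       a3. if $\mathbf{P} = \mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m)$, where $\mathbf{u}_j$ are terms and $\mathbf{x}_1, \dots, \mathbf{x}_k$ are all the variables of these terms,
           then these tuple variables are free in the formula $\mathbf{P}$, $scheme(\mathbf{x}_i, \mathbf{P})$ is undefined, and
           $attr(\mathbf{x}_i, \mathbf{P}) = \bigcup_{j=1}^{m} attr(\mathbf{x}_i, \mathbf{u}_j)$, $i = 1, \dots, k$.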
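   All atomic formulas are legal. The construction of all legal formulas proceeds by induction on
the length of formulas. Assume $\mathbf{Q}$ and $\mathbf{G}$ are both legal formulas.
       f1. If $\mathbf{P} = \neg\mathbf{G}$, then $\mathbf{P}$ is legal, and all occurrences of variables in $\mathbf{P}$ are free or bound
           as they are in $\mathbf{G}$. For every tuple variable $\mathbf{x}$ that occurs free in $\mathbf{G}$,
           $scheme(\mathbf{x}, \mathbf{P}) \simeq scheme(\mathbf{x}, \mathbf{G})$ and $attr(\mathbf{x}, \mathbf{P}) = attr(\mathbf{x}, \mathbf{G})$,
           where $\simeq$ is generalized equality, meaning that both sides are either both undefined or both
           defined and equal [14].
       f2. If $\mathbf{P} = \mathbf{G} \wedge \mathbf{Q}$ or $\mathbf{P} = \mathbf{G} \vee \mathbf{Q}$, then all occurrences of variables in $\mathbf{P}$ are free or
           bound as their corresponding occurrences are in $\mathbf{G}$ and $\mathbf{Q}$. Assume the variable $\mathbf{x}$ occurs
           free in the subformulas $\mathbf{G}$ and/or $\mathbf{Q}$. The scheme of $\mathbf{x}$ and the set of attributes with
           which $\mathbf{x}$ occurs in $\mathbf{P}$ are defined according to the following cases.
             a. Both $scheme(\mathbf{x}, \mathbf{G})$ and $scheme(\mathbf{x}, \mathbf{Q})$ are defined. The formula $\mathbf{P}$ is legal if
                  the equality $scheme(\mathbf{x}, \mathbf{G}) = scheme(\mathbf{x}, \mathbf{Q})$ holds. By definition,
                  $scheme(\mathbf{x}, \mathbf{P}) = scheme(\mathbf{x}, \mathbf{G})$.
             b. The scheme is defined for only one subformula. Assume $scheme(\mathbf{x}, \mathbf{G})$ is defined
                  and $scheme(\mathbf{x}, \mathbf{Q})$ is undefined. The formula $\mathbf{P}$ is legal if the inclusion
                  $attr(\mathbf{x}, \mathbf{Q}) \subseteq scheme(\mathbf{x}, \mathbf{G})$ holds. By definition,
                  $scheme(\mathbf{x}, \mathbf{P}) = scheme(\mathbf{x}, \mathbf{G})$.
             c. The scheme is undefined for both subformulas. Then the formula $\mathbf{P}$ is legal, but
                  $scheme(\mathbf{x}, \mathbf{P})$ is undefined.
              In all three cases $attr(\mathbf{x}, \mathbf{P}) = attr(\mathbf{x}, \mathbf{G}) \cup attr(\mathbf{x}, \mathbf{Q})$.
       f3. If $\mathbf{P} = \exists\mathbf{x}(R)\mathbf{G}$, then $\mathbf{x}$ must occur free in $\mathbf{G}$ for $\mathbf{P}$ to be legal. Furthermore, if
           $scheme(\mathbf{x}, \mathbf{G})$ is defined, then the equality $scheme(\mathbf{x}, \mathbf{G}) = R$ must hold; otherwise
           the inclusion $attr(\mathbf{x}, \mathbf{G}) \subseteq R$ must hold. $scheme(\mathbf{x}, \mathbf{P})$ and $attr(\mathbf{x}, \mathbf{P})$
           are not defined, since $\mathbf{x}$ does not occur free in $\mathbf{P}$. Any occurrence of a variable
           $\mathbf{y} \ne \mathbf{x}$ is free or bound in $\mathbf{P}$ as it was in $\mathbf{G}$. If $\mathbf{y}$ occurs free in $\mathbf{P}$, then
           $scheme(\mathbf{y}, \mathbf{P}) \simeq scheme(\mathbf{y}, \mathbf{G})$ and $attr(\mathbf{y}, \mathbf{P}) = attr(\mathbf{y}, \mathbf{G})$.
       f4. If $\mathbf{P} = \forall\mathbf{x}(R)\mathbf{G}$, then all restrictions and definitions are the same as in f3.
       f5. If $\mathbf{P} = (\mathbf{G})$, then $\mathbf{P}$ is legal, and the free and bound variables, $scheme$, and $attr$ are
           the same as for $\mathbf{G}$.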
   In other words, the equality 𝑎𝑡𝑡𝑟 𝒙, 𝑷 = 𝑅 means that for a specific interpretation of the
formula 𝑷, when the variable 𝒙 takes on a value in the form of a tuple 𝑠 of the scheme 𝑅′, the
inclusion 𝑅 ⊆ 𝑅′ must hold.
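
   The legality rules can be read as a recursive procedure that, for every free tuple variable, returns its scheme (or "undefined") and its attribute set. The sketch below is one possible reading of the rules, with formulas encoded as nested Python tuples; the encoding, the names `analyze` and `IllegalFormula`, and the attribute name `No` (standing for №) are assumptions made for illustration only.

```python
class IllegalFormula(ValueError):
    """Raised when a formula violates one of the legality rules."""


def analyze(P):
    """Map each free variable of P to (scheme or None, attr).

    Encoding (hypothetical):
      ("table", R, x)         atom alpha_R(x) or X_R(x)
      ("pred", {x: attrs})    predicate atom with attr already computed per variable
      ("not", G), ("and", G, Q), ("or", G, Q)
      ("exists", x, R, G), ("forall", x, R, G)
    """
    kind = P[0]
    if kind == "table":
        _, R, x = P
        R = frozenset(R)
        return {x: (R, R)}
    if kind == "pred":
        return {x: (None, frozenset(a)) for x, a in P[1].items()}
    if kind == "not":
        return analyze(P[1])
    if kind in ("and", "or"):
        g, q = analyze(P[1]), analyze(P[2])
        out = {}
        for x in set(g) | set(q):
            sg, ag = g.get(x, (None, frozenset()))
            sq, aq = q.get(x, (None, frozenset()))
            if sg is not None and sq is not None:      # case a
                if sg != sq:
                    raise IllegalFormula(f"schemes of {x} differ")
                scheme = sg
            elif sg is not None or sq is not None:     # case b
                scheme = sg if sg is not None else sq
                other = aq if sg is not None else ag
                if not other <= scheme:
                    raise IllegalFormula(f"attr of {x} not within its scheme")
            else:                                      # case c
                scheme = None
            out[x] = (scheme, ag | aq)
        return out
    if kind in ("exists", "forall"):
        _, x, R, G = P
        R = frozenset(R)
        g = analyze(G)
        if x not in g:
            raise IllegalFormula(f"{x} must occur free in the quantified formula")
        scheme, attrs = g[x]
        if scheme is not None and scheme != R:
            raise IllegalFormula(f"scheme of {x} must equal the quantifier scheme")
        if scheme is None and not attrs <= R:
            raise IllegalFormula(f"attr of {x} must be within the quantifier scheme")
        return {y: info for y, info in g.items() if y != x}
    raise ValueError(f"unknown formula kind: {kind!r}")


R = frozenset({"No", "Name", "Quiz"})
body = ("and",
        ("and", ("table", R, "y"), ("pred", {"y": {"No"}})),
        ("pred", {"x": {"Quiz"}, "y": {"Quiz"}}))
formula = ("exists", "y", R, body)
```

   Here `analyze(formula)` reports that `x` is the only free variable, that its scheme is undefined, and that its attribute set is `{"Quiz"}`, so the formula may be used in an expression whose scheme contains `Quiz`.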
   Expressions of tuple calculus for multiset table algebra have the form

$$\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\},$$

    where:
        1. The formula $\mathbf{P}$ is legal.
        2. The variable $\mathbf{x}$ is the only variable that occurs free in the formula $\mathbf{P}$.
        3. If $scheme(\mathbf{x}, \mathbf{P})$ is defined, then $scheme(\mathbf{x}, \mathbf{P}) = R$; otherwise $attr(\mathbf{x}, \mathbf{P}) \subseteq R$.
        4. $n$ is the number of duplicates of the tuple $\mathbf{x}$.
    It is worth emphasizing that the result of executing the query defined by the expression
$\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ is a multiset of tuples described by the formula $\mathbf{P}(\mathbf{x})$.
    Let $\mathbf{P}(\mathbf{x})$ be a legal formula, $R \subseteq \mathbf{A}$, and let $s$ be a tuple of the scheme $R$. The
result of substituting $s$ for $\mathbf{x}$ in the formula $\mathbf{P}$ is the formula $\mathbf{P}(s/\mathbf{x})$. The value of
the formula $\mathbf{P}(s/\mathbf{x})$ is determined by modifying each atom of $\mathbf{P}$ according to the following
rules:
         a1. Let the tuple variable $\mathbf{x}$ in $\alpha_{R'}(\mathbf{x})$ be free in the formula $\mathbf{P}$. By the definition of a
              legal formula, we have the inclusion $R' \subseteq R$. The atom $\alpha_{R'}(\mathbf{x})$ takes the value true
              when $s$ is substituted for $\mathbf{x}$ if $s|_{R'} \in \Theta(\alpha)$; otherwise it takes the value false.
         a2. Let the tuple variable $\mathbf{x}$ in $X_{R'}(\mathbf{x})$ be free in the formula $\mathbf{P}$. By the definition of a
              legal formula, we have the inclusion $R' \subseteq R$. The atom $X_{R'}(\mathbf{x})$ takes the value true
              when $s$ is substituted for $\mathbf{x}$ if $s|_{R'} \in \Theta(X)$; otherwise it takes the value false.
         a3. Let the tuple variable $\mathbf{x}$ in the atom $\mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m)$ be free in the formula $\mathbf{P}$. When $s$
              is substituted for $\mathbf{x}$, each $\mathbf{x}(A_i)$ is replaced by $d_i \in \mathbf{D}$, where $\langle A_i, d_i \rangle \in s$
              ($d_i$ is the value of the attribute $A_i$ in the tuple $s$). The atom $\mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m)$ takes
              the value true if the predicate $\mathbf{p}$ is true on the resulting values; otherwise it takes the
              value false.
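    The set of truth values of all atoms of a formula is called an interpretation. Let $\mathbf{P}$ be a legal
formula with no free tuple variables. The interpretation of the formula $\mathbf{P}$ is defined as follows.
         f1. If $\mathbf{P} = \neg\mathbf{G}$, then $\mathbf{G}$ must have no free variables. The formula $\mathbf{P}$ is true if $\mathbf{G}$ is false,
              and false if $\mathbf{G}$ is true.
         f2. If $\mathbf{P} = \mathbf{G} \wedge \mathbf{Q}$ or $\mathbf{P} = \mathbf{G} \vee \mathbf{Q}$, then neither $\mathbf{G}$ nor $\mathbf{Q}$ has free variables. If
              $\mathbf{P} = \mathbf{G} \wedge \mathbf{Q}$, then $\mathbf{P}$ is true exactly when $\mathbf{G}$ and $\mathbf{Q}$ are both true; otherwise $\mathbf{P}$ is
              false. If $\mathbf{P} = \mathbf{G} \vee \mathbf{Q}$, then $\mathbf{P}$ is false exactly when $\mathbf{G}$ and $\mathbf{Q}$ are both false;
              otherwise $\mathbf{P}$ is true.
         f3. If $\mathbf{P} = \exists\mathbf{x}(R)\mathbf{G}$, then $\mathbf{x}$ is the only variable that occurs free in $\mathbf{G}$. The formula $\mathbf{P}$
              is true if there is at least one tuple $s \in S(R)$ such that the formula $\mathbf{G}(s/\mathbf{x})$ is true;
              otherwise $\mathbf{P}$ is false.
         f4. If $\mathbf{P} = \forall\mathbf{x}(R)\mathbf{G}$, then $\mathbf{x}$ is the only variable that occurs free in $\mathbf{G}$. The formula $\mathbf{P}$
              is true if the formula $\mathbf{G}(s/\mathbf{x})$ is true for every tuple $s \in S(R)$; otherwise $\mathbf{P}$ is false.
         f5. If $\mathbf{P} = (\mathbf{G})$, then $\mathbf{P}$ is true if $\mathbf{G}$ is true, and false if $\mathbf{G}$ is false.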
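    Let $E = \{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ be a tuple calculus expression. The value of the expression $E$ is
the table $\langle \varphi, R \rangle$ of multiset table algebra containing every tuple $s \in S(R)$ such that
the formula $\mathbf{P}(s/\mathbf{x})$ is true. The number of duplicates of the tuple $s$ in the table
$\langle \varphi, R \rangle$ is determined as follows:
         1) if $\mathbf{P} = \alpha_R(\mathbf{x})$, then $n = Occ(s, \varphi) = Occ(s, \alpha)$;
         2) if $\mathbf{P} = X_R(\mathbf{x})$, then $n = Occ(s, \varphi) = Occ(s, X)$;
         3) if $\mathbf{P} = \mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m)$, where $\mathbf{u}_j$ are terms, $j = 1, \dots, m$, and
              $\mathbf{x}_1, \dots, \mathbf{x}_k$ are all the variables of these terms, then

$$n = Occ(s, \varphi) = \sum_{\substack{s' \in \Theta(\psi),\ s'|_{R'} = s \\ \mathbf{p}(\mathbf{u}_1, \dots, \mathbf{u}_m) = true}} Occ(s', \psi),$$

              where $\langle \psi, R \rangle$ is the table on which the query is built and
              $R' = \bigcup_{j=1}^{m} attr(\mathbf{x}_i, \mathbf{u}_j)$, $i = 1, \dots, k$;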
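         4) if $\mathbf{P} = \neg\mathbf{G}$ and the formula $\mathbf{G}$ generates $m$ duplicates of the tuple $s$, then

$$n = Occ(s, \varphi) = Occ(s, C(\psi)) \mathbin{\dot{-}} m,$$

              where $C(\psi)$ is the multiset of the table $C(\langle \psi, R \rangle)$, the saturation of the table
              $\langle \psi, R \rangle$ that is the value of the expression $\{\mathbf{y}^{m}(R) \mid \mathbf{G}(\mathbf{y})\}$, and

$$x \mathbin{\dot{-}} y = \begin{cases} x - y, & \text{if } x \ge y, \\ 0, & \text{if } x < y; \end{cases}$$

         5) if $\mathbf{P} = \mathbf{G} \wedge \mathbf{Q}$, the formula $\mathbf{G}$ generates $k$ duplicates of the tuple $s$, and the formula $\mathbf{Q}$
              generates $m$ duplicates of the tuple $s$, then $n = Occ(s, \varphi) = \min(k, m)$;
         6) if $\mathbf{P} = \mathbf{G} \vee \mathbf{Q}$, the formula $\mathbf{G}$ generates $k$ duplicates of the tuple $s$, and the formula $\mathbf{Q}$
              generates $m$ duplicates of the tuple $s$, then $n = Occ(s, \varphi) = k + m$;
         7) if $\mathbf{P} = \exists\mathbf{x}(R)\mathbf{G}$ and the formula $\mathbf{G}$ generates $k$ duplicates of the tuple $s$, then
              $n = Occ(s, \varphi) = k$;
         8) if $\mathbf{P} = \forall\mathbf{x}(R)\mathbf{G}$ and the formula $\mathbf{G}$ generates $k$ duplicates of the tuple $s$, then
              $n = Occ(s, \varphi) = k$;
         9) if $\mathbf{P} = (\mathbf{G})$ and the formula $\mathbf{G}$ generates $k$ duplicates of the tuple $s$, then
              $n = Occ(s, \varphi) = k$.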
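
    The rules for connectives, quantifiers, and parentheses combine the duplicate counts of subformulas in a purely arithmetic way, which the following Python sketch makes explicit. It assumes that the counts generated by the atoms and the count of the tuple in the saturation are already known and passed in; the tuple encoding of formulas and the names `monus` and `duplicates` are hypothetical.

```python
def monus(x: int, y: int) -> int:
    """Truncated subtraction: x - y if x >= y, otherwise 0."""
    return x - y if x >= y else 0


def duplicates(P, atom_count, saturation_count: int) -> int:
    """Number of duplicates of a fixed tuple s generated by the formula P.

    atom_count maps atom names to the counts they generate for s;
    saturation_count stands for Occ(s, C(psi)). Both are given as inputs
    here, which is a simplification made for this illustration.
    """
    kind = P[0]
    if kind == "atom":
        return atom_count[P[1]]
    if kind == "not":
        return monus(saturation_count, duplicates(P[1], atom_count, saturation_count))
    if kind == "and":
        return min(duplicates(P[1], atom_count, saturation_count),
                   duplicates(P[2], atom_count, saturation_count))
    if kind == "or":
        return (duplicates(P[1], atom_count, saturation_count)
                + duplicates(P[2], atom_count, saturation_count))
    if kind in ("exists", "forall", "paren"):
        # Quantifiers and parentheses leave the count unchanged.
        return duplicates(P[1], atom_count, saturation_count)
    raise ValueError(f"unknown formula kind: {kind!r}")


counts = {"P": 3, "Q": 1}
```

    With these counts, a disjunction of the two atoms yields 3 + 1 duplicates, a conjunction yields min(3, 1), and a negation of `Q` with a saturation count of 5 yields 5 - 1.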
    Theorem 2. If 𝐹 is a multiset table algebra expression, then an equivalent tuple calculus
expression 𝐸 can be effectively constructed.
    Proof. According to Theorem 1, in the proof, it is sufficient to consider expressions of multiset
table algebra that contain only the operations of union, difference, selection, projection, join, and
renaming. We will prove this theorem by mathematical induction on the number of operations in
the expression 𝐹.
    Basis. In this case, the expression $F$ does not contain any operations. There are two cases.
First case. If $F = \langle \alpha, R \rangle$ is a constant table, where $\alpha$ is a multiset of tuples of the scheme $R$, then
$E = \{\mathbf{x}^{n}(R) \mid \alpha_R(\mathbf{x})\}$.
Second case. If $F = \langle X, R \rangle$ is a variable table, then $E = \{\mathbf{x}^{n}(R) \mid X_R(\mathbf{x})\}$.
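    Induction. Assume the theorem holds for every multiset table algebra expression with fewer than $i$
operators. Let $F$ have $i$ operators. There are six cases.
    Case 1 (union). $F = F_1 \cup^{R}_{All} F_2$,
where the expressions $F_1$ and $F_2$ each have fewer than $i$ operators. By the inductive hypothesis, we
can find tuple calculus expressions $\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ and $\{\mathbf{x}^{m}(R) \mid \mathbf{Q}(\mathbf{x})\}$ equivalent to
$F_1$ and $F_2$, respectively. The values of these expressions are tables in which the tuple $\mathbf{x}$ appears
$n$ and $m$ times, respectively. Then $E = \{\mathbf{x}^{k}(R) \mid \mathbf{P}(\mathbf{x}) \vee \mathbf{Q}(\mathbf{x})\}$, where $k = n + m$.
    Case 2 (difference). $F = F_1 \setminus^{R}_{All} F_2$,
where the expressions $F_1$ and $F_2$ each have fewer than $i$ operators. As in Case 1, the tuple calculus
expressions $\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ and $\{\mathbf{x}^{m}(R) \mid \mathbf{Q}(\mathbf{x})\}$ are equivalent to $F_1$ and $F_2$,
respectively. Then $E = \{\mathbf{x}^{k}(R) \mid \mathbf{P}(\mathbf{x}) \wedge \neg\mathbf{Q}(\mathbf{x})\}$, where $k = n \mathbin{\dot{-}} m$.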
    Case 3 (selection). $F = \sigma_{p,R}(F_1)$,
where the expression $F_1$ has fewer than $i$ operators. Let $\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ be a tuple calculus
expression equivalent to $F_1$. Then
$E = \{\mathbf{x}^{k}(R) \mid \mathbf{P}(\mathbf{x}) \wedge \mathbf{p}(\mathbf{x}(A_1), \dots, \mathbf{x}(A_m))\}$, where $R = \{A_1, \dots, A_m\}$ is the
scheme of the table that is the value of the expression $F_1$. The number of duplicates of the tuple
$\mathbf{x}$ in the output table does not change, therefore $k = n$. Here it is assumed that the predicate
parameter of the selection is defined as
$p(s) = true \Leftrightarrow \mathbf{p}(s(A_1), \dots, s(A_m)) = true$, $s \in S(R)$, where $\mathbf{p}$ is a predicate
symbol of arity $m$. □
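    Case 4 (projection). $F = \pi_{X,R}(F_1)$.
Let $\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ be a tuple calculus expression equivalent to $F_1$. Then

$$E = \{\mathbf{y}^{k}(X \cap R) \mid \exists\mathbf{x}(R)\,(\mathbf{P}(\mathbf{x}) \wedge \bigwedge_{A \in X \cap R} \mathbf{y}(A) = \mathbf{x}(A))\},$$

where $k = \sum_{A \in X \cap R,\ \mathbf{y}(A) = \mathbf{x}(A)} n$.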
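    Case 5 (join). $F = F_1 \otimes_{R_1,R_2} F_2$.
Let the tuple calculus expressions $\{\mathbf{x}^{n}(R_1) \mid \mathbf{P}(\mathbf{x})\}$ and $\{\mathbf{y}^{m}(R_2) \mid \mathbf{Q}(\mathbf{y})\}$ be equivalent to
$F_1$ and $F_2$, respectively. Then

$$E = \{\mathbf{z}^{k}(R_1 \cup R_2) \mid \exists\mathbf{x}(R_1)\,\exists\mathbf{y}(R_2)\,(\mathbf{P}(\mathbf{x}) \wedge \mathbf{Q}(\mathbf{y}) \wedge \bigwedge_{A \in R_1} \mathbf{z}(A) = \mathbf{x}(A) \wedge \bigwedge_{A \in R_2} \mathbf{z}(A) = \mathbf{y}(A))\},$$

where $k = n \times m$. □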
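    Case 6 (renaming). $F = Rt_{\xi,R}(F_1)$,
where $\xi : \mathbf{A} \to \mathbf{A}$ is an injective function that renames attributes. Let $\{\mathbf{x}^{n}(R) \mid \mathbf{P}(\mathbf{x})\}$ be a
tuple calculus expression equivalent to $F_1$. Then

$$E = \{\mathbf{y}^{k}(R') \mid \exists\mathbf{x}(R)\,(\mathbf{P}(\mathbf{x}) \wedge \bigwedge_{C \in R \setminus dom\,\xi} \mathbf{y}(C) = \mathbf{x}(C) \wedge \bigwedge_{A \in R \cap dom\,\xi} \mathbf{x}(A) = \mathbf{y}(\xi(A)))\},$$

where $R' = (R \setminus dom\,\xi) \cup \xi[R]$, and $k = n$.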
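
   The duplicate counts attached to the six cases (k = n + m for union, truncated n - m for difference, k = n for selection and renaming, a sum of counts for projection, and k = n × m for join) can be tried out on small tables. The sketch below models a table as a `Counter` whose keys are frozensets of (attribute, value) pairs; this representation and the helper names are assumptions for illustration, not the formal semantics given in [4].

```python
from collections import Counter


def row(**values) -> frozenset:
    """A tuple of a scheme, as a hashable set of (attribute, value) pairs."""
    return frozenset(values.items())


def union(t1: Counter, t2: Counter) -> Counter:
    return t1 + t2                       # k = n + m


def difference(t1: Counter, t2: Counter) -> Counter:
    return t1 - t2                       # k = n - m, truncated at zero


def select(t: Counter, pred) -> Counter:
    return Counter({s: n for s, n in t.items() if pred(dict(s))})  # k = n


def project(t: Counter, attrs) -> Counter:
    out = Counter()
    for s, n in t.items():
        out[frozenset((a, v) for a, v in s if a in attrs)] += n    # k = sum of n
    return out


def join(t1: Counter, t2: Counter) -> Counter:
    out = Counter()
    for s1, n in t1.items():
        for s2, m in t2.items():
            d1, d2 = dict(s1), dict(s2)
            # Tuples are joined when they agree on all common attributes.
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out[frozenset({**d1, **d2}.items())] += n * m      # k = n * m
    return out


def rename(t: Counter, xi: dict) -> Counter:
    out = Counter()
    for s, n in t.items():
        out[frozenset((xi.get(a, a), v) for a, v in s)] += n       # k = n
    return out


t1 = Counter({row(A=1, B=1): 2, row(A=2, B=1): 1})
t2 = Counter({row(B=1, C=9): 3})
```

   For instance, projecting `t1` onto `B` merges both tuples into one with 2 + 1 duplicates, and joining `t1` with `t2` produces 2 × 3 and 1 × 3 duplicates of the two joined tuples.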

5. Relation to the SQL query language
The fundamental data object in the SQL language is not the classical relation of E. Codd but rather a
table. Moreover, SQL tables do not contain sets of tuples but multisets of tuples, meaning duplicates
are allowed. The basic SQL operators are not relational operators in the strict sense but are analogs
of relational operators designed to work with multisets. When creating a new table using a query,
an SQL system typically does not remove duplicate tuples but returns a result in which the same
tuple can appear multiple times. To exclude duplicates, the keyword DISTINCT must be placed after
the SELECT operator.
   Let's demonstrate the appropriateness of constructing tuple calculus for multiset table algebra
with the following example.
   Example 1. Consider the table $\langle Scores, R \rangle$, where the scheme $R$ = {№, Name, Topic 1, Topic 2,
Topic 3, Quiz} (see Table 1).

Table 1
Table $\langle Scores, R \rangle$
        №                Name             Topic 1           Topic 2        Topic 3          Quiz
        1.             Student 1            5                 15             14              16
        2.             Student 2            6                 14             15              16
        3.             Student 3            7                 17             20              20
        4.             Student 4            9                 20             19              20
        5.             Student 5            5                 18             15              18
        6.             Student 6            6                 19             13              20
        7.             Student 7            4                 8              9               16

The answer to the question "What scores did the first five students get for the quiz?" in SQL would
look like this:

   SELECT Quiz
   FROM Scores
   ORDER BY "№"  -- fix the row order so that "the first five" is well defined
   LIMIT 5;

   The result is the table $\langle Quiz, R_1 \rangle$, where the scheme $R_1 = \{Quiz\}$, which contains duplicate
values (see Table 2).

Table 2
Table $\langle Quiz, R_1 \rangle$
                                                 Quiz
                                                  16
                                                  16
                                                  20
                                                  20
                                                  18

   It is impossible to implement this query in terms of classical tuple calculus, since the result will
be a set of tuples, not a multiset (as expected), meaning duplicates will not be considered in the
result.
   In the tuple calculus for multiset table algebra, the expression equivalent to this query will have
the form:

$$\{x^{n}(\{Quiz\}) \mid \exists y(R)\,(Scores(y) \wedge (y(\text{№}) = 1 \vee y(\text{№}) = 2 \vee y(\text{№}) = 3 \vee y(\text{№}) = 4 \vee y(\text{№}) = 5) \wedge x(Quiz) = y(Quiz))\}.$$

   The result described by this tuple calculus expression is similar to the result obtained when
executing the corresponding query in SQL.
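
   The difference between the set result of classical tuple calculus and the multiset result of Example 1 can be reproduced with a few lines of Python. The sketch below hard-codes the rows of Table 1 and uses a `Counter` as the multiset; the variable names are hypothetical.

```python
from collections import Counter

# Rows of Table 1: (No, Name, Topic 1, Topic 2, Topic 3, Quiz).
scores = [
    (1, "Student 1", 5, 15, 14, 16),
    (2, "Student 2", 6, 14, 15, 16),
    (3, "Student 3", 7, 17, 20, 20),
    (4, "Student 4", 9, 20, 19, 20),
    (5, "Student 5", 5, 18, 15, 18),
    (6, "Student 6", 6, 19, 13, 20),
    (7, "Student 7", 4, 8, 9, 16),
]

# Multiset semantics: every qualifying row contributes one occurrence.
quiz_multiset = Counter(quiz for (no, _, _, _, _, quiz) in scores if no in {1, 2, 3, 4, 5})

# Set semantics: duplicates collapse, as in classical tuple calculus.
quiz_set = set(quiz_multiset)
```

   Here `quiz_multiset` holds the five values of Table 2 with their multiplicities (16 twice, 20 twice, 18 once), while `quiz_set` keeps only the three distinct values.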
   Example 2. Consider the table $\langle Results, R' \rangle$, where the scheme $R'$ = {№, Name, Total, ECTS}
(see Table 3).
   Consider the query "How many points for the quiz did the students who scored more than 83 points
in total receive?".
   We write the corresponding expression of multiset table algebra, assuming that only the quiz
scores are needed:

$$F = \pi_{Quiz}\bigl(\langle Scores, R \rangle \otimes_{R,R'} \sigma_{Total>83}(\langle Results, R' \rangle)\bigr).$$

   The result is the table $\langle Query, \{Quiz\} \rangle$, which contains duplicate values (see Table 4).

Table 3
Table $\langle Results, R' \rangle$
           №                        Name                           Total                  ECTS
           1.                     Student 1                         77                     C
           2.                     Student 2                         80                     C
           3.                     Student 3                         93                     A
           4.                     Student 4                         90                     A
           5.                     Student 5                         81                     C
           6.                     Student 6                         79                     C
           7.                     Student 7                         51                     Fx

   Table 4 shows that only two students scored more than 83 points in total. Let us write the
tuple calculus expression equivalent to this algebraic expression.

Table 4
Table $\langle \text{Query}, \{\text{Quiz}\} \rangle$
                                                Quiz
                                                 20
                                                 20

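   The tuple calculus expressions $\{x^{1}(R) \mid \text{Scores}(x)\}$ and $\{y^{1}(R') \mid \text{Results}(y)\}$ are
equivalent to $\langle \text{Scores}, R \rangle$ and $\langle \text{Results}, R' \rangle$, respectively, where
$R = \{\text{№}, \text{Name}, \text{Topic 1}, \text{Topic 2}, \text{Topic 3}, \text{Quiz}\}$ and $R' = \{\text{№}, \text{Name}, \text{Total}, \text{ECTS}\}$.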

   The algebraic expression

$$\sigma_{\text{Total}>83}(\langle \text{Results}, R' \rangle)$$

   corresponds to the tuple calculus expression

$$\{y^{1}(R') \mid \text{Results}(y) \wedge y(\text{Total}) > 83\}.$$

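   Its value is a table without duplicates because the table $\langle \text{Results}, R' \rangle$ has no duplicates.
The tuple calculus expression equivalent to the algebraic expression

$$\langle \text{Scores}, R \rangle \underset{R,R'}{\otimes} \sigma_{\text{Total}>83}(\langle \text{Results}, R' \rangle)$$

is

$$\{z^{k = 1 \times 1}(R \cup R') \mid \exists x(R)\, \exists y(R')\,(\text{Scores}(x) \wedge \text{Results}(y) \wedge y(\text{Total}) > 83 \wedge z(\text{№}) = x(\text{№}) \wedge z(\text{Name}) = x(\text{Name}) \wedge z(\text{№}) = y(\text{№}) \wedge z(\text{Name}) = y(\text{Name}))\}.$$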

   Since there are no duplicate tuples in the tables $\langle \text{Scores}, R \rangle$ and $\langle \text{Results}, R' \rangle$, there will
also be no duplicate tuples in the resulting table after the join.
   Finally, the tuple calculus expression equivalent to the algebraic expression 𝐹 has the form:

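$$\{w^{q}(\{\text{Quiz}\} \cap (R \cup R')) \mid \exists z(R \cup R')\,(\exists x(R)\, \exists y(R')\,(\text{Scores}(x) \wedge \text{Results}(y) \wedge y(\text{Total}) > 83 \wedge z(\text{№}) = x(\text{№}) \wedge z(\text{Name}) = x(\text{Name}) \wedge z(\text{№}) = y(\text{№}) \wedge z(\text{Name}) = y(\text{Name})))\}.$$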

   In the table obtained after the join, the tuple $z$ appears once. Therefore, in the output table, the
number of duplicates of the tuple $w$ is

$$q = \sum_{A \in \{\text{Quiz}\} \cap (R \cup R'),\; w(A) = z(A)} k.$$


The result is a table $\langle \text{Query}, \{\text{Quiz}\} \rangle$, where $Occ(s, \text{Query}) = 1 + 1 = 2$, $s = \{\langle \text{Quiz}, 20 \rangle\}$.
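
The count above can be checked with a small Python sketch. It is only an illustration under stated
assumptions: tables are modelled as `collections.Counter` objects over rows, the join multiplies
multiplicities and the projection sums them (the usual multiset conventions), the helper names
`make_table`, `select`, `join`, and `project` are hypothetical, and the Quiz values in Scores are
assumed (the Topic columns are omitted). The rows of Results are taken from Table 3.

```python
from collections import Counter

def make_table(rows):
    """Build a multiset table: each row (a dict) is frozen and counted."""
    return Counter(frozenset(r.items()) for r in rows)

def select(table, pred):
    """Selection keeps the multiplicity of every row that satisfies pred."""
    return Counter({t: k for t, k in table.items() if pred(dict(t))})

def join(left, right):
    """Natural join: multiplicities of matching rows are multiplied."""
    out = Counter()
    for t1, k1 in left.items():
        for t2, k2 in right.items():
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                out[frozenset({**d1, **d2}.items())] += k1 * k2
    return out

def project(table, attrs):
    """Projection: multiplicities of rows that coincide on attrs are summed."""
    out = Counter()
    for t, k in table.items():
        out[frozenset((a, v) for a, v in t if a in attrs)] += k
    return out

# Assumed Quiz values for the first five students of Scores.
scores = make_table(
    {"No": n, "Name": f"Student {n}", "Quiz": q}
    for n, q in [(1, 16), (2, 16), (3, 20), (4, 20), (5, 18)]
)
# Rows of Results as given in Table 3.
results = make_table(
    {"No": n, "Name": f"Student {n}", "Total": t, "ECTS": e}
    for n, t, e in [(1, 77, "C"), (2, 80, "C"), (3, 93, "A"), (4, 90, "A"),
                    (5, 81, "C"), (6, 79, "C"), (7, 51, "Fx")]
)

joined = join(scores, select(results, lambda r: r["Total"] > 83))
query = project(joined, {"Quiz"})

s = frozenset({("Quiz", 20)})
print(sorted(joined.values()))  # [1, 1]: each joined tuple appears once
print(query[s])                 # 2
```

Each of the two joined tuples has multiplicity 1, and both project onto the same tuple, so the
multiplicities add up to 2.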
The corresponding SQL query has the form:

     SELECT Quiz
     FROM scores
     INNER JOIN result ON scores.`№`=result.`№` AND scores.Name=result.Name
     WHERE result.Total>83
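
The query can be tried out with a short Python sketch that uses the standard `sqlite3` module. This
is an illustration only: the choice of SQLite, the column types, and the Quiz values in `scores` are
assumptions, while the rows of `result` follow Table 3. A `DISTINCT` variant is added for contrast
with set semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scores (`№` INTEGER, Name TEXT, Quiz INTEGER);
    CREATE TABLE result (`№` INTEGER, Name TEXT, Total INTEGER, ECTS TEXT);
    """
)

# Assumed Quiz values for the first five students.
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(n, f"Student {n}", q) for n, q in [(1, 16), (2, 16), (3, 20), (4, 20), (5, 18)]],
)
# Rows of Table 3.
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?, ?)",
    [(n, f"Student {n}", t, e)
     for n, t, e in [(1, 77, "C"), (2, 80, "C"), (3, 93, "A"), (4, 90, "A"),
                     (5, 81, "C"), (6, 79, "C"), (7, 51, "Fx")]],
)

query = """
SELECT Quiz
FROM scores
INNER JOIN result ON scores.`№` = result.`№` AND scores.Name = result.Name
WHERE result.Total > 83
"""

rows = conn.execute(query).fetchall()
distinct_rows = conn.execute(
    query.replace("SELECT Quiz", "SELECT DISTINCT Quiz")
).fetchall()

print(rows)           # [(20,), (20,)]
print(distinct_rows)  # [(20,)]
```

Without `DISTINCT` the value 20 is returned twice, which matches the multiset result in Table 4.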

Thus, as can be seen from the examples, the constructed tuple calculus for multiset table algebra
allows for the adequate formalization of query languages, particularly SQL, considering the multiset
semantics embedded in them.


Conclusions
The article proposes a tuple calculus for multiset table algebra. The alphabet and the syntax of
terms, atoms, and formulas of the tuple calculus are defined. A class of legal formulas is introduced
using the concepts of free and bound tuple variables, the scheme of a tuple variable, and the set of
attributes with which a tuple variable occurs in a formula. It is shown that the proposed tuple
calculus is no less expressive than multiset table algebra. The examples demonstrate the feasibility
of constructing a tuple calculus for multiset table algebra, given that database query languages
allow repeated elements in a table.
    The next research challenge is to establish the dual result, namely that multiset table algebra
is no less expressive than the proposed tuple calculus.

References
[1] E.F. Codd, Relational Сompleteness of Data Base Sublanguages, in: Data Base Systems (1972)
    65-98.
[2] M. Lacroix, A. Pirotte, Domain-oriented Relational Languages, in: Proceedings of 3rd Int. Conf.
    on Very Large Data Bases., 1977, pp. 370-378.
[3] V.N. Redko, et al., Relational Databases: Table Algebras and SQL-like Language. Kyiv:
    Publishing house Academperiodica, 2001. [in Ukrainian]
[4] D.B Buy, I.M. Glushko, Calculi and extensions of table algebras signature. Nizhyn: NDU im. M.
     Gogol, 2016. [in Ukrainian]
[5] I.Glushko, About relationship between table algebra of infinite tables and multiset table
     algebra, in CEUR Workshop Proceedings, 2139, 2018, pp. 159–163.
[6] I. Lysenko, Extended table algebra, extended multiset table algebra, and their relationship, in:
     Proceedings of International Conference on Software Engineering “SoftEngine 2020”, 2020, pp
     28-32.
[7] Paul W.P.J. Grefen, Rolf A. de By, A Multi-Set Extended Relational Algebra. A Formal
     Approach to a Practical Issue, in: Proceedings of 10th International Conference on Data
     Engineering, ICDE, 1994, pp. 80-88.
[8] Paul W.P.J. Grefen, J. Flokstra, Extending a Multi-Set Relational Algebra to a Parallel
     Environment, in: Distributed and Parallel Databases, Vol 4. (1996) 81-99.
[9] G. Lamperti, M. Melchiori, M. Zanella, On Multisets in Database Systems, in Multiset
     Processing: Mathematical, Computer Science, and Molecular Computing Points of View,
     number 2235 in Lecture Notes in Computing Since (2001)147-215.
[10] H. Garcia-Molina, J.D. Ullman, J. Widom, Database Systems: The Complete Book, 2nd. ed.,
     Prentice Hall, 2008.
[11] J.D. Ullman, J. Widom.Ullman, A First Course in Database Systems, 3rd ed., Prentice Hall, 2007.
[12] A. Silbeschatz, H. Korth, S. Sudarshan, Database System Concepts, 6th ed. McGraw-Hill, 2011.
[13] J.A. Bogatyreva, Multisets theory and its applications. Ph.D. thesis, Kyiv National Taras
     Shevchenko University, 2011. [in Ukrainian]
[14] N.Cutland, Computability. An introduction to recursive function theory, Cambridge University
     Press, 1980.