Extension of Tuple Calculus to Multisets Iryna Lysenko1*, Olha Moroz2 1 Nizhyn Mykola Gogol State University, Grafska str., 2, Nizhyn, 16600, Ukraine 2 International Research and Training Centre for Information Technologies and Systems of the NASU and MESU, Academician Hlushkov Ave., 40, Kyiv, 03680, Ukraine Abstract This paper is a logical continuation of research devoted to the actual problem of developing the theoretical foundations of table (relational) databases. The issue of using multisets in table databases is important and relevant. Many database-oriented languages require a relational model with multiset semantics. There are many applied problems, the feature of which is multiplicity and repeatability of data. For example, these are sociological polls of different population groups, calculations on DNA, and others. In this context, the question of constructing a tuple calculus for a multiset table algebra is considered, in which the concept of a table is refined using the concept of a multiset. In the article, the formalization of tuple calculus for multiset table algebra is carried out. The alphabet, and the syntax of terms, atoms, and formulas are defined. A set of legal formulas is introduced through the concept of the free and bound variable. The concept of a scheme and set of attributes with which a tuple variable occurs in a formula are also introduced. The definition of tuple calculus expression for multiset table algebra is given, according to which it is a multiset of tuples that satisfy the condition defined by the legal formula. The article provides rules for determining the number of tuple duplicates in the resulting multiset. Another important result consists in proving that constructed tuple calculus is as expressive as multiset table algebra. This research opens up new possibilities for database theory development and may be useful for information technology and database professionals. It contributes to a deeper understanding of construction query principles, an important aspect of modern computer science and industry. Keywords 1 Relation Databases, multiset, multiset table algebra, tuple calculus 1. Introduction Relational calculus underlies most relational query languages, specifying only the expected result, while relational algebra involves constructing a relational expression and performing operations. There are two main approaches to relational calculus: โ€ข Tuple calculus (E. Codd) which operates on table tuples [1]; โ€ข Domain calculus (M. Lacroix, A. Pirotte) which focuses on table domains [2]. The clarification of the concept of relation in terms of naminal sets was carried out by V.N. Redko, Yu.Y. Bron, D.B. Buy, and S.A. Polyakov [3]. The monograph [4] introduces the consideration of table algebra for infinite (finite) tables, which significantly generalizes and extends classical Codd's relational algebra. The generalization is that a relation is understood as an arbitrary set of single-scheme tuples, in particular, an infinite one, while each table is assigned a certain scheme. Tuple (domain) calculi have been also constructed for these algebras. Tuple calculus and domain calculus are supplemented with functional and predicate signatures on the universal domain. Two main results are presented. The first result demonstrates the equivalence between the 14th International Scientific and Practical Conference from Programming UkrPROGโ€™2024, May 14-15, 2024, Kyiv, Ukraine * Corresponding author. iryna.glushko@ndu.edu.ua (I. Lysenko); mog_91@ukr.net (O. Moroz) 0000-0003-2549-5356 (I. Lysenko); 0000-0002-0356-8780 (O. Moroz) ยฉ 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings table algebra for infinite tables and the corresponding generalized relational calculi. The second result focuses on the equivalence between the subalgebra for finite tables of this algebra and the corresponding restricted generalized relational calculi. For the second result, the consideration is limited to only "safe" expressions. The methodological basis of these formalisms is the compositional approach to programming. Additionally, in the work [4] multiset table algebra is constructed, which extends database capabilities through multisets. The signature of multiset table algebra is supplemented with new operations: inner and outer join operations, semijoin operations, and aggregate operations. In [5, 6] the theorem-plural and logical-algebraic methods found that table algebra of infinite tables does not form subalgebra of multiset table algebra since it is not closed with respect to the union, projection and active complement. Thus, multiset table algebra is not a wider formalism then table algebra of infinite tables Many database-oriented languages, such as SQL, require a relational model with multiset semantics. However, the traditional tuple (domain) calculi do not support queries that produce tables with duplicates. Therefore, there is a need to develop a calculus for multiset table algebra. The research aims to construct a tuple calculus for multiset table algebra and to show that it is no less expressive than multiset table algebra. 2. Basics of Multisets Many languages oriented towards working with databases require a relational data model with multiset semantics because, firstly, relations that allow duplicates are useful in many applied fields where duplicate objects may exist; secondly, in the relational data model, removing duplicates after performing projection and union operations implies merging identical elements or performing other labor-intensive actions. Various researchers have focused on the utilization of multisets in databases, including Paul Grefen and Rolf A. de By [7, 8], G. Lamperti, M. Melchiori, and M. Zanella [9], G. Garcia-Molina, J. Ullman, and J. Widom [10, 11], A. Silbeschatz, H. Korth, and S. Sudarshan [12], D.B. Buy, S.A. Polyakov, Yu.Y. Brona, and V.N. Redko [3]. However, this issue still requires clarification and expansion. Let's start by considering the key terms of multisets based on sources [3, 13]. Let ๐‘ˆ be an arbitrary set. A multiset ๐›ผ with basis ๐‘ˆ is a function ๐›ผ: ๐‘ˆ โ†’ ๐‘, where ๐‘ = {1, 2, โ€ฆ } is the set of natural numbers. Let ๐‘ซ be the universe of elements of multiset bases. A characteristic function of multiset ๐›ผ is a function ๐œ’! : ๐‘ซ โ†’ ๐‘! , the values of which are specified by the following piecewise schema: ๐›ผ ๐‘‘ if ๐‘‘ โˆˆ ๐‘‘๐‘œ๐‘š ๐›ผ ๐œ’! ๐‘‘ = 0, else for all d โˆˆ D . Here ๐‘‘๐‘œ๐‘š ๐›ผ is the range of definition of multiset as a function (๐‘‘๐‘œ๐‘š ๐›ผ = ๐‘ˆ! โ€“ basis of multiset ๐›ผ). The empty multiset โˆ…! is defined as a multiset which basis is an empty set. The 1-multisets are multisets whose range of values is the empty set or a single-element set {1}. These multisets are the analogues of ordinary sets. The operations over multisets are defined in terms of characteristic functions in monograph [3]. Authors define operations of multiset โ‹ƒ! , intersection โ‹‚! , difference \! , which build 1-multisets, ! ! and operations of multiset union โ‹ƒ!"" , intersection โ‹‚!"" , difference \!!"" , which build multisets of general view. The Cartesian product of multiset โŠ— , the operation Dist ฮฑ, which build 1-multiset, and analog of a full image for multisets are defined too. 3. Multiset Table Algebra Let's examine the main concepts and statements of multiset table algebra, based on the monograph [4]. In this case, under relation, understand a multiset, in particular, infinite. Let's consider two sets: ๐‘จ is the set of attributes, and ๐‘ซ is the universal domain. A scheme will be defined as any arbitrary finite set of attributes ๐‘… โІ ๐‘จ. A tuple of the scheme ๐‘… is a nominal set on pair ๐‘…, ๐‘ซ. The projection of this nominal set for the first component is equal to ๐‘…. We will use the following notations: ๐‘†(๐‘…) is the set of all tuples of the schema ๐‘…, and ๐‘† is the set of all tuples. A table of the scheme ๐‘… (๐‘… โІ ๐‘จ) is a pair ๐œ“, ๐‘… , where the first component ๐œ“ is an arbitrary multiset, the basis of which ฮ˜(๐œ“) is the set of tuples of the same scheme and the second component ๐‘… is a scheme of the table. The set of all table on scheme ๐‘… is designated as ฮจ(๐‘…) and the set of all table is designated as ฮจ = ! ฮจ( ). The notation ๐‘‚๐‘๐‘(๐‘ , ๐œ“) represents the number of duplicate tuple ๐‘  in the multiset ๐œ“. A multiset ! ! ๐œ“ can also be written as {๐‘ ! ! , โ€ฆ , ๐‘ ! ! }, where ๐‘›! = ๐‘‚๐‘๐‘(๐‘ ! , ๐œ“), ๐‘– = 1, โ€ฆ , ๐‘˜, and ฮ˜ ๐œ“ = {๐‘ ! , โ€ฆ , ๐‘ ! } is a of the multiset ๐œ“. Under multiset table algebra is understood an algebra ๐›น, ฮฉ!,! , ! ! !โˆˆ!,!โˆˆ! where ๐›น is the set of all tables, ฮฉ!,! = {โ‹ƒ!"" , โ‹‚!"" ,\!!"" , ๐œŽ!,! , ๐œ‹!,! , โจ‚!! ,!! , ๐‘…๐‘ก!,! , ~! }!,!,!! ,!! โІ๐‘จ is the signature, ๐‘ƒ, ฮž are the sets of parameters. The signature ฮฉ!,! contains operations of multiset ! ! (union โ‹ƒ!"" , intersection โ‹‚!"" , difference \!!"" ) and special operations (selection ๐œŽ!,! , projection ๐œ‹!,! , join โจ‚ , renaming ๐‘…๐‘ก!,! , and active complement ~! ). !! ,!! Multiset table algebra is also filled up with additional operations such as inner join (Cartesian join, natural join, join using attributes and join on predicate), outer join (outer left join, outer right join, outer full join and union join), semijoin and aggregate operations (Sum, Avg, Max, Min, Count). The special element NULL is inserted in the universal domain for to define of outer join and outer set operations. The operations of signature ฮฉ!,! and a formal mathematical semantics of additional operations are defined in [4]. The following statement holds. Theorem 1. Any expression of the multiset table algebra can be replaced with an equivalent expression that uses only the operations of selection, join, projection, union, difference, and renaming. Proof. To prove the first statement, we will show that the operations of intersection and active complement can be expressed through other operations of multiset table algebra. The operation of multiset intersection can be replaced by the difference [13]: ! ๐œ“! , ๐‘… โ‹‚!"" ๐œ“! , ๐‘… = ๐œ“! , ๐‘… \!!"" ( ๐œ“! , ๐‘… \!!"" ๐œ“! , ๐‘… ). The operation of active complement is expressed through the operations of projection, difference, and join (by definition): โˆผ! ๐œ“, ๐‘… = ๐ถ ๐œ“, ๐‘… \!!"" ๐œ“, ๐‘… , where ๐ถ ๐œ“, ๐‘… = ๐œ‹ !! ,! ๐œ“, ๐‘… โจ‚ โ€ฆ โจ‚ ๐œ‹ !! ,! ๐œ“, ๐‘… , and ๐‘… = ๐ด! , โ€ฆ , ๐ด! is {!! },{ ! } {!! ,โ€ฆ,!!!! },{!! } a scheme of the table ๐œ“, ๐‘… . According to the proof, the operations of intersection and active complement are derived from other operations in the signature of multiset table algebra and can henceforth be excluded from consideration. 4. Tuple Calculus for Multiset Table Algebra The basis of relational calculus is the calculus of first-order predicates. We will start the construction of tuple calculus for multiset table algebra with the definition of the alphabet. The alphabet of tuple calculus for multiset table algebra consists of: โ€ข a set of attributes ๐‘จ and universal domain ๐‘ซ; โ€ข a set of object variables (tuple variables) ๐‘ฅ! , ๐‘ฅ! , โ€ฆ; โ€ข a set of object constants ๐‘‘! , ๐‘‘! , โ€ฆ; ! ! โ€ข a set of function symbols ๐‘“! ! , ๐‘“! ! , โ€ฆ , ๐‘›! โ‰ฅ 1; ! ! โ€ข a set of predicate symbols ๐‘! ! , ๐‘! ! , โ€ฆ , ๐‘š! โ‰ฅ 1; โ€ข a set of symbols of constant tables ๐›ผ, ๐‘… ; โ€ข a set of symbols of variable tables ๐‘‹, ๐‘… ; โ€ข the signs of logical operations ยฌ, โ‹€, โ‹, and quantifiers โˆ€, โˆƒ; โ€ข punctuation marks โ€“ parentheses () and commas. The universal domain ๐‘ซ is the domain of interpretation of object constants, and the set of all tuples over ๐‘ซ is the domain of interpretation of object variables. We will use โ€ข ๐’™ as syntactic variable, the domain of change of which is the set of variables; โ€ข ๐’‡ as syntactic variable, the domain of change of which is the set of function symbols; โ€ข ๐’‘ as syntactic variable, the domain of change of which is the set of predicate symbols; โ€ข ๐’… as syntactic variable; โ€ข ๐“ as syntactic variable, the domain of change of which is the set of attributes. Terms and formulas are distinguished among the words written using alphabet symbols. Let us define these syntactic objects by induction on their length. The following expressions are terms: a) ๐’… is a term; b) ๐’™(๐“) is a term; c) if ๐’–! , โ€ฆ , ๐’–! are terms and ๐’‡ is a function symbol of arity ๐‘› then ๐’‡ ๐’–! , โ€ฆ , ๐’–! is a term; d) there are no terms other than those specified in points a)-c). Further in the text, ๐’– is a syntactic variable, the domain of change of which is the set of terms. Let's define the formulas, starting with atomic formulas (atoms), which come in three types: ะฐ1. For any constant table ๐›ผ, ๐‘… and for any tuple variable ๐’™, ๐›ผ! (๐’™) is an atom. ๐›ผ! (๐’™) stands for ๐’™ โˆˆ ๐›ผ, ๐‘… . ะฐ2. For any variable table ๐‘‹, ๐‘… and for any tuple variable ๐’™, ๐‘‹! (๐’™) is an atom. ๐‘‹! (๐’™) stands for ๐’™ โˆˆ ๐›ผ, ๐‘… . ะฐ3. For any terms ๐’–! , โ€ฆ , ๐’–! , and for any predicate ๐’‘ of arity ๐‘š on the universal domain ๐‘ซ, ๐’‘ ๐’–! , โ€ฆ , ๐’–! is an atom. f1. Let's construct formulas from atoms using the logical connectives ยฌ, โ‹€, โ‹, quantifiers โˆ€, โˆƒ, and parentheses. f2. Every atom is a formula. f3. If ๐‘ท and ๐‘ธ are formulas, then ยฌ๐‘ท, ๐‘ทโ‹€๐‘ธ, ๐‘ทโ‹๐‘ธ are formulas. f4. If ๐’™ is a tuple variable, ๐‘ท is a formula, ๐‘… โІ ๐‘จ is a scheme, then โˆ€๐’™(๐‘…)๐‘ท, โˆƒ๐’™(๐‘…)๐‘ท are formulas. f5. If ๐‘ท is a formula, then (๐‘ท) is a formula. f6. There are no other formulas besides those specified in points f1-f4. We will use ๐‘ท, ๐‘ธ and ๐‘ฎ as syntactic variables, the domain of change of which is the set of formulas. Let's define the class of legal formulas, using the concepts of free and bound variables, the scheme ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’(๐’™, ๐‘ท) for tuple variable ๐’™, and the set of attributes with which tuple variable ๐’™ occurs in a formula ๐‘ท, ๐‘Ž๐‘ก๐‘ก๐‘Ÿ(๐’™, ๐‘ท). The expressions ๐‘โ„Ž๐‘’๐‘š๐‘’(๐’™, ๐‘ท) and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ(๐’™, ๐‘ท) are defined if the tuple ๐’™ has at least one free occurrence in the formula ๐‘ท. Additionally, the including ๐‘Ž๐‘ก๐‘ก๐‘Ÿ(๐’™, ๐‘ท) โІ ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’(๐’™, ๐‘ท) holds if these expressions are defined. We will define expression ๐‘Ž๐‘ก๐‘ก๐‘Ÿ for terms first: 1. if ๐’– = ๐’…, then ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’– = โˆ…; 2. if ๐’– = ๐’™(๐“), then ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’– = {๐“}, ะฐnd ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’š(๐“) = โˆ…, where ๐’™ โ‰  ๐’š; 3. if ๐’– = ๐’‡ ๐’–! , โ€ฆ , ๐’–! , where ๐’–! are terms then ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’– = !!!! ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’–๐’Š . In other words, ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐’– is the set of attributes that the scheme of the tuple ๐’™ must have. Consider the cases where ๐‘ท is an atomic formula, then ะฐ1. if ๐‘ท = ๐›ผ! (๐’™), then ๐’™ is free in ๐‘ท and ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท = ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท = ๐‘…; ะฐ2. similarly if ๐‘ท = ๐‘‹! (๐’™), then ๐’™ is free in ๐‘ท and ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท = ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท = ๐‘…; ะฐ3. if ๐‘ท = ๐’‘ ๐’–! , โ€ฆ , ๐’–! , where ๐’–! are terms, and ๐’™! , โ€ฆ , ๐’™! are all variables of these terms, then this tuple variables are free in formula ๐‘ท, ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™! , ๐‘ท is undefined, and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™! , ๐‘ท = ! !!! ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™! , ๐’–! , ๐‘– = 1, โ€ฆ , ๐‘˜. Atomic formulas are all legal. The construction of all legal formulas proceeds by induction on the length of formulas. Assume ๐‘ธ and ๐‘ฎ are both legal formulas. f1. If ๐‘ท = ยฌ๐‘ฎ, then ๐‘ท is legal, and all occurrences of variables in ๐‘ท free or bound as they are in ๐‘ฎ. For every tuple ๐’™ that occurs free in ๐‘ฎ, ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท โ‰ƒ ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท = ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ฎ , where โ‰ƒ is a generalized equality, meaning that both sides of the equality are either undefined or defined and have equal values [14]. f2. If ๐‘ท = ๐‘ฎโ‹€ or ๐‘ท = ๐‘ฎโ‹๐‘ธ, then all occurrences of variables in ๐‘ท are free or bound as their corresponding occurrences are in ๐‘ฎ and ๐‘ธ. Assume variable ๐’™ occurs free in subformulas ๐‘ฎ and/or ๐‘ธ. Define the scheme, and the set of attributes with which tuple variable ๐’™ occurs in a formula for formula ๐‘ท. Next cases take place. a. The schemes of formulas ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ and ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ธ are both defined. Formula ๐‘ท is legal if equality ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ = ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ธ holds. According to the definition ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท = ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ . b. The scheme is defined for only one subformula. Assume ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ is defined, and ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ธ is undefined. Formula ๐‘ท is legal if including ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ธ โІ ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ holds. According to the definition ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท = ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ . c. The scheme is undefined for both subformulas, then formula ๐‘ท is legal but ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท is undefined. In all these three cases ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท = ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ฎ โˆช ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ธ . f3. If ๐‘ท = โˆƒ๐’™(๐‘…)๐‘ฎ then ๐’™ must occur free in ๐‘ฎ for ๐‘ท to be legal. Furthermore, if ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ is defined, then equality ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ฎ = ๐‘… must hold if including ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ฎ โІ ๐‘… is held. ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท are not defined, since ๐’™ does not occur free in ๐‘ท. Any occurrence of a variable ๐’š โ‰  ๐’™ is free or bound in ๐‘ท as it was in ๐‘ฎ. If ๐’š occur free in ๐‘ท, then ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’š, ๐‘ท โ‰ƒ ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’š, ๐‘ฎ and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’š, ๐‘ท = ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’š, ๐‘ฎ . f4. If ๐‘ท = โˆ€๐’™(๐‘…)๐‘ฎ, then all restrictions and definitions are the same as in f3. f5. If ๐‘ท = (๐‘ฎ), then ๐‘ท is legal, and free and bound variables, ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ and ๐‘Ž๐‘ก๐‘ก๐‘Ÿ are the same as for ๐‘ฎ. In other words, the equality ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท = ๐‘… means that for a specific interpretation of the formula ๐‘ท, when the variable ๐’™ takes on a value in the form of a tuple ๐‘  of the scheme ๐‘…โ€ฒ, the inclusion ๐‘… โІ ๐‘…โ€ฒ must hold. Expressions of tuple calculus for multiset table algebra have the form {๐’™! ๐‘… |๐‘ท(๐’™)}, where: 1. The formula ๐‘ท is legal. 2. The variable ๐’™ is the only variable that occurs free in the formula ๐‘ท. 3. If ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท is defined, then ๐‘ ๐‘โ„Ž๐‘’๐‘š๐‘’ ๐’™, ๐‘ท = ๐‘…, otherwise ๐‘Ž๐‘ก๐‘ก๐‘Ÿ ๐’™, ๐‘ท โІ ๐‘…. 4. ๐‘› is the number of duplicates of the tuple ๐’™. It is worth emphasizing that the result of executing the query defined by the expression {๐’™! ๐‘… |๐‘ท(๐’™)} is a multiset of tuple described by the expression ๐‘ท(๐’™). Let ๐‘ท ๐’™ be a legal formula, ๐‘… โІ ๐‘จ. Then ๐‘  substituted for ๐’™ in formula ๐‘ท is the formula ๐‘ท(๐‘ /๐’™). The value of the formula ๐‘ท(๐‘ /๐’™) is determined by modifying each atom from ๐‘ท according to the following rules: ะฐ1. Let the tuple variable ๐’™ in ๐›ผ!! (๐’™) be free in formula ๐‘ท. By the definition of an legal formula, we have the inclusion ๐‘…โ€ฒ โІ ๐‘…. Atom ๐›ผ!! (๐’™) acquires the value of true at ๐‘  substituted for ๐’™, if ๐‘ |๐‘…โ€ฒ โˆˆ ๐œƒ(๐›ผ), otherwise, atom ๐›ผ!! (๐’™) acquires the value of false. ะฐ2. Let the tuple variable ๐’™ in ๐‘‹!! (๐’™) be free in formula ๐‘ท. By the definition of an legal formula, we have the inclusion ๐‘…โ€ฒ โІ ๐‘…. Atom ๐‘‹!! (๐’™) acquires the value of true at ๐‘  substituted for ๐’™, if ๐‘ |๐‘…โ€ฒ โˆˆ ๐œƒ(๐‘‹), otherwise, atom ๐‘‹!! (๐’™) acquires the value of false. ะฐ3. Let the tuple variable ๐’™ in ๐‘ท = ๐’‘ ๐’–! , โ€ฆ , ๐’–! be free in formula ๐‘ท, then at ๐‘  substituted for ๐’™, replace ๐’™(๐ด! ) by ๐‘‘! โˆˆ ๐‘ซ, where ๐ด! , ๐‘‘! โˆˆ ๐‘  (๐‘‘! is value of attribute ๐ด! in tuple ๐‘ ). Atom ๐’‘ ๐’–! , โ€ฆ , ๐’–! acquires the value of true, if predicate ๐’‘ is true on the proper values, otherwise atom acquires the value of false. The set of values of truth of all atoms of formula is called interpretation. Let ๐‘ท be a legal formula with no free tuple variables. The interpretation of formula ๐‘ท is defined as follows. f1. If ๐‘ท = ยฌ๐‘ฎ, then ๐‘ฎ must have no free variables. The formula ๐‘ท is true, if ๐‘ฎ is false, and it is false, if formula ๐‘ฎ is true. f2. If ๐‘ท = ๐‘ฎโ‹€๐‘ธ or ๐‘ท = ๐‘ฎโ‹๐‘ธ, then neither ๐‘ฎ or Q have free variables. If ๐‘ท = ๐‘ฎโ‹€๐‘ธ, then ๐‘ท is true exactly when ๐‘ฎ and ๐‘ธ are both true, otherwise, ๐‘ท is false. If ๐‘ท = ๐‘ฎโ‹๐‘ธ, then ๐‘ท is false exactly when ๐‘ฎ and ๐‘ธ are both false, otherwise, ๐‘ท is true. f3. If ๐‘ท = โˆƒ๐’™(๐‘…)๐‘ฎ, then ๐’™ is the only variable that occurs free in ๐‘ฎ. The formula ๐‘ท is true, if there is at least one tuple ๐‘  โˆˆ ๐‘†(๐‘…) such that formula ๐‘ฎ(๐‘ /๐’™) is true, otherwise, formula ๐‘ท is false. f4. If ๐‘ท = โˆ€๐’™(๐‘…)๐‘ฎ, then ๐’™ is the only variable that occurs free in ๐‘ฎ. The formula ๐‘ท is true, if for every tuple ๐‘  โˆˆ ๐‘†(๐‘…) formula ๐‘ฎ(๐‘ /๐’™) is true, otherwise, formula ๐‘ท is false. If ๐‘ท = (๐‘ฎ), then ๐‘ท is true, if formula ๐‘ฎ is true and ๐‘ท is false, if ๐‘ฎ is false. Let ๐ธ = {๐’™! ๐‘… |๐‘ท(๐’™)} be a tuple calculus expression. The value of expression ๐ธ is the table ๐œ‘, ๐‘… of multiset table algebra containing every tuple ๐‘  โˆˆ ๐‘†(๐‘…) such that formula ๐‘ท(๐‘ /๐’™) is true. The number of duplicates of the tuple ๐‘  in the table ๐œ‘, ๐‘… is determined as follows: 1) if ๐‘ท = ๐›ผ! (๐’™), then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘‚๐‘๐‘(๐‘ , ๐›ผ); 2) if ๐‘ท = ๐‘‹! (๐’™), then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘‚๐‘๐‘(๐‘ , ๐‘‹); 3) if ๐‘ท = ๐’‘ ๐’–! , โ€ฆ , ๐’–! , where ๐’–! are terms, ๐‘— = 1, ๐‘š, and ๐’™! , โ€ฆ , ๐’™! are all variables of these terms, then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ! ! โˆˆ! ! ,! ! |! ! !! ๐‘‚๐‘๐‘ ๐‘ โ€ฒ, ๐œ“ , ๐’‘ ๐’–! ,โ€ฆ,๐’–! !!"#$ where ๐œ“, ๐‘… is the table to which the query is constructed, ๐‘… ! = ! !!! ๐‘Ž๐‘ก๐‘ก๐‘Ÿ(๐’™! , ๐’–! ),๐‘– = 1, ๐‘˜; 4) if ๐‘ท = ยฌ๐‘ฎ and the formula ๐‘ฎ generates ๐‘š duplicates of the tuple ๐‘ , then 5) ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘‚๐‘๐‘(๐‘ , ๐ถ(๐œ“))โˆ’๐‘š, where ๐ถ(๐œ“) is a multiset of the table ๐ถ( ๐œ“, ๐‘… ) which is the saturation of the table ๐‘ฅ โˆ’ ๐‘ฆ ๐‘–๐‘“ ๐‘ฅ โ‰ฅ ๐‘ฆ ๐œ“, ๐‘… , which is the value of the expression {๐’š! ๐‘… |๐‘ท(๐’š)}, and ๐‘ฅโˆ’๐‘ฆ = ; 0 ๐‘–๐‘“ ๐‘ฅ < ๐‘ฆ 6) if ๐‘ท = ๐‘ฎโ‹€๐‘ธ, the formula ๐‘ฎ generates ๐‘˜ duplicates of the tuple ๐‘ , and the formula ๐‘ธ generates ๐‘š duplicates of the tuple ๐‘ , then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = min (๐‘˜, ๐‘š); 7) if ๐‘ท = ๐‘ฎโ‹๐‘ธ , the formula ๐‘ฎ generates ๐‘˜ duplicates of the tuple ๐‘ , and the formula ๐‘ธ generates ๐‘š duplicates of the tuple ๐‘ , then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘˜ + ๐‘š; 8) if ๐‘ท = โˆƒ๐’™(๐‘…)๐‘ฎ and the formula ๐‘ฎ generates ๐‘˜ duplicates of the tuple ๐‘ , then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘˜. 9) if ๐‘ท = โˆ€๐’™(๐‘…)๐‘ฎ and the formula ๐‘ฎ generates ๐‘˜ duplicates of the tuple ๐‘ , then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘˜. 10) if ๐‘ท = (๐‘ฎ) and the formula ๐‘ฎ generates ๐‘˜ duplicates of the tuple ๐‘ , then ๐‘› = ๐‘‚๐‘๐‘ ๐‘ , ๐œ‘ = ๐‘˜. Theorem 2. If ๐น is a multiset table algebra expression, then it is possible effectively to build equivalent to it expression ๐ธ of tuple calculation. Proof. According to Theorem 1, in the proof, it is sufficient to consider expressions of multiset table algebra that contain only the operations of union, difference, selection, projection, join, and renaming. We will prove this theorem by mathematical induction on the number of operations in the expression ๐น. Basis. In this case, the expression ๐น does not contain any operations. There are two cases. First case. If ๐น = ๐›ผ, ๐‘… is constant table, where ๐›ผ is a multiset of the scheme ๐‘…, then ๐ธ = {๐’™! ๐‘… |๐›ผ! (๐’™)}. Second case. If ๐น = ๐‘‹, ๐‘… is variable table, then ๐ธ = {๐’™! ๐‘… |๐‘‹! (๐’™)}. Induction. Assume the theorem holds for any multiset table algebra expression with fewer than ๐‘– operators. Let ๐น have ๐‘– operators. There are six cases. Case 1 (union). ๐น = ๐น! !!"" ๐น! , where expressions ๐น! and ๐น! each have less than ๐‘– operators. By the inductive hypothesis we can find tuple calculus expressions {๐’™! ๐‘… |๐‘ท(๐’™)} and {๐’™! ๐‘… |๐‘ธ(๐’™)} equivalent to ๐น! and ๐น! respectively. The values of these expressions are tables in which the tuple ๐’™ appears ๐‘› and ๐‘š times, respectively. Then ๐ธ is {๐’™! ๐‘… |๐‘ท(๐’™)โ‹๐‘ธ(๐’™)}, where ๐‘˜ = ๐‘› + ๐‘š. Case 2 (difference). ๐น = ๐น! \!!"" ๐น! , where expressions ๐น! and ๐น! each have less than ๐‘– operators. Tuple calculus expressions {๐’™! ๐‘… |๐‘ท(๐’™)} and {๐’™! ๐‘… |๐‘ธ(๐’™)} are equivalent to ๐น! and ๐น! respectively as in Case 1. Then ๐ธ = {๐’™! ๐‘… |๐‘ท(๐’™)โ‹€ยฌ๐‘ธ(๐’™)}, where ๐‘˜ = ๐‘›โˆ’๐‘š. Case 3 (selection). ๐น = ๐œŽ!,! (๐น! ), where expression ๐น! has less than ๐‘– operators. Let {๐’™! ๐‘… |๐‘ท(๐’™)} be tuple calculus expression equivalent to ๐น! . Then ๐ธ = {๐’™! ๐‘… |๐‘ท(๐’™)โ‹€๐’‘(๐’™ ๐ด! , โ€ฆ , ๐’™ ๐ด! )}, where ๐‘… = {๐ด! , โ€ฆ , ๐ด! } is the scheme of table that is the value of expression ๐น! . The number of duplicates of the tuple ๐’™ in the output table does not change, therefore ๐‘˜ = ๐‘›. Assume that predicate-parameter of select is defined as ๐‘ ๐‘  = ๐‘ก๐‘Ÿ๐‘ข๐‘’ โ‡” ๐’‘ ๐‘  ๐ด! , โ€ฆ , ๐‘  ๐ด! = ๐‘ก๐‘Ÿ๐‘ข๐‘’, ๐‘  โˆˆ ๐‘†(๐‘…), where ๐’‘ is a predicate symbol of arity ๐‘š. โ–ก Case 4 (projection). ๐น = ๐œ‹!,! (๐น! ). Let {๐’™! ๐‘… |๐‘ท(๐’™)} be tuple calculus expression equivalent to ๐น! . Then ! ๐ธ = {๐’š ๐‘‹โ‹‚๐‘… |โˆƒ๐’™(๐‘…)(๐‘ท(๐’™)โ‹€ !โˆˆ!โ‹‚! ๐’š ๐ด = ๐’™ ๐ด )}, where ๐‘˜ = !โˆˆ!โ‹‚! ๐’š ! !๐’™ ! ๐‘›. Case 5 (join). ๐น = ๐น! โจ‚ ๐น! . !! ,!! Tuple calculus expressions {๐’™! ๐‘… |๐‘ท(๐’™)} and {๐’™! ๐‘… |๐‘ธ(๐’™)} are equivalent to ๐น! and ๐น! respectively Then ๐ธ is ! {๐’› ๐‘…! โ‹ƒ๐‘…! |โˆƒ๐’™(๐‘…! )โˆƒ๐’š(๐‘…! )(๐‘ท ๐’™ ๐‘ธ (๐’™)โ‹€ !โˆˆ!! ๐’› ๐ด = ๐’™ ๐ด โ‹€ !โˆˆ!! ๐’› ๐ด = ๐’š ๐ด )}, where ๐‘˜ = ๐‘›ร—๐‘š. โ–ก Case 6 (renaming). ๐น = ๐‘…๐‘ก!,! (๐น! ), where ๐œ‰: ๐‘จ โ†’ ๐‘จ is injective function that renames attributes. Let {๐’™! ๐‘… |๐‘ท(๐’™)} be tuple calculus expression equivalent to ๐น! . Then ! ๐ธ = {๐’š ๐‘…! |โˆƒ๐’™(๐‘…)(๐‘ท(๐’™)โ‹€ !โˆˆ!\!"#$ ๐’š ๐ถ = ๐’™ ๐ถ โ‹€ !โˆˆ!โ‹‚!"#$ ๐’™ ๐ด = ๐’š ๐œ‰(๐ด) )}, where ๐‘…! = ๐‘…\๐‘‘๐‘œ๐‘š๐œ‰โ‹ƒ๐œ‰[๐‘…], and ๐‘˜ = ๐‘›. 5. Relation to the SQL query language The fundamental data object in the SQL language is not the classical relation of E. Codd but rather a table. Moreover, SQL tables do not contain sets of tuples but multisets of tuples, meaning duplicates are allowed. The basic SQL operators are not relational operators in the strict sense but are analogs of relational operators designed to work with multisets. When creating a new table using a query, an SQL system typically does not remove duplicate tuples but returns a result in which the same tuple can appear multiple times. To exclude duplicates, the keyword DISTINCT must be placed after the SELECT operator. Let's demonstrate the appropriateness of constructing tuple calculus for multiset table algebra with the following example. Example 1. Consider the table Scores, ๐‘… , where the scheme ๐‘… = {โ„–, Name, Topic 1, Topic 2, Topic 3, Quiz} (see Table 1). Table 1 Table Scores, ๐‘… โ„– Name Topic 1 Topic 2 Topic 3 Quiz 1. Student 1 5 15 14 16 2. Student 2 6 14 15 16 3. Student 3 7 17 20 20 4. Student 4 9 20 19 20 5. Student 5 5 18 15 18 6. Student 6 6 19 13 20 7. Student 7 4 8 9 16 The answer to the question "What scores did the first five students get for the quiz?" in SQL would look like this: SELECT Quiz FROM Scores LIMIT 5; The result is a table Quiz, ๐‘…! , where the schema ๐‘…! = Quiz , which contains duplicate values (see Table 2). Table 2 Table Quiz, ๐‘…! Quiz 16 16 20 20 18 It is impossible to implement this query in terms of classical tuple calculus, since the result will be a set of tuples, not a multiset (as expected), meaning duplicates will not be considered in the result. In the tuple calculus for multiset table algebra, the expression equivalent to this query will have the form: {๐‘ฅ ! (Quiz) | ๐‘ฆ(๐‘…)(Scores ๐‘ฆ โ‹€ (โ„–) = 1โ‹ (โ„–) = 2โ‹(โ„–) = 3โ‹(โ„–) = 4 โ‹(โ„–) = 5)โ‹€ โ‹€๐‘ฅ(Quiz) = ๐‘ฆ(Quiz))}. The result described by this tuple calculus expression is similar to the result obtained when executing the corresponding query in SQL. Example 2. Consider the table Results, ๐‘…โ€ฒ , where the scheme ๐‘…โ€ฒ = { โ„–, Name, Total, ECTS} (see Table 3). Let's consider the query "How many points for the quiz have students who scored more than 83 points in total?". We will write the corresponding expression of multiset table algebra, assuming that we only need to know the quiz scores: ๐น = ๐œ‹!"#$ Scores, ๐‘… โจ‚ ๐œŽ!"#$%!!" Results, ๐‘…โ€ฒ . !,!! The result is a table Query, {Quiz} , which contains duplicate values (see Table 4). Table 3 Table Results, ๐‘…โ€ฒ โ„– Name Total ECTS 1. Student 1 77 C 2. Student 2 80 C 3. Student 3 93 ะ 4. Student 4 90 ะ 5. Student 5 81 C 6. Student 6 79 C 7. Student 7 51 Fx From Table 4, it is clear that only two students in total have more than 83 points. Letโ€™s write the tuple calculus expression equivalent to this algebraic expression. Table 4 Table Query, {Quiz} Quiz 20 20 Tuple calculus expressions {๐’™! ๐‘… |Scores(๐’™)} and {๐’š! ๐‘…โ€ฒ |๐‘น๐’†๐’”๐’–๐’๐’•๐’”(๐’š)} are equivalent to Scores, ๐‘… and Results, ๐‘…โ€ฒ respectively, where ๐‘… = {โ„–, Name, Topic 1, Topic 2, Topic 3, ๐‘„๐‘ข๐‘–๐‘ง} and ๐‘…โ€ฒ = { โ„–, ๐‘๐‘Ž๐‘š๐‘’, ๐‘‡๐‘œ๐‘ก๐‘Ž๐‘™, ๐ธ๐ถ๐‘‡๐‘†}. We have tuple calculus expression {๐’š! ๐‘… ! |Results ๐’š โ‹€๐’š Total > 83} for algebraic expression ๐œŽ!"#$%!!" Results, ๐‘…โ€ฒ . Its value is a table without duplicates because table Results, ๐‘…โ€ฒ does not have duplicates. Tuple calculus expressions which equivalent to algebraic expression Scores, ๐‘… โจ‚ ๐œŽ!"#$%!!" Results, ๐‘…โ€ฒ !,!! is {๐’›!!!ร—! ๐‘…โ‹ƒ๐‘… ! |โˆƒ๐’™(๐‘…)โˆƒ๐’š(๐‘…โ€ฒ)(Scores ๐’™ โ‹€Results ๐’š โ‹€๐’š Total > 83โ‹€๐’› โ„– = ๐’™ โ„– โ‹€๐’› Name = ๐’™ Name โ‹€๐’› โ„– = ๐’š โ„– โ‹€๐’› Name = ๐’š Name )}. Since there are no duplicate tuples in the tables Scores, ๐‘… and Results, ๐‘…โ€ฒ , there will also be no duplicate tuples in the resulting table after the join. Finally, the tuple calculus expression equivalent to the algebraic expression ๐น has the form: {๐’˜๐’’ ({๐‘ธ๐’–๐’Š๐’›}โ‹‚(๐‘นโ‹ƒ๐‘น! ))|โˆƒ๐’› ๐‘นโ‹ƒ๐‘น! (โˆƒ๐’™ ๐‘น โˆƒ๐’š ๐‘น! ๐‘บ๐’„๐’๐’“๐’†๐’” ๐’™ โ‹€๐‘น๐’†๐’”๐’–๐’๐’•๐’” ๐’š โ‹€๐’š ๐‘ป๐’๐’•๐’‚๐’ > ๐Ÿ–๐Ÿ‘โ‹€๐’› โ„– = ๐’™ โ„– โ‹€๐’› ๐‘ต๐’‚๐’Ž๐’† = ๐’™ ๐‘ต๐’‚๐’Ž๐’† โ‹€๐’› โ„– = ๐’š โ„– โ‹€๐’› ๐‘ต๐’‚๐’Ž๐’† = ๐’š ๐‘ต๐’‚๐’Ž๐’† )}. In the table obtained after the join, the tuple ๐’› appears once. Therefore, in the output table, the number of duplicates of the tuple ๐‘ค is ๐’’= ๐‘จโˆˆ{๐‘ธ๐’–๐’Š๐’›}โ‹‚(๐‘นโ‹ƒ๐‘น! )๐’˜ ๐‘จ !๐’› ๐‘จ ๐’Œ. The result is a table Query, {Quiz} , where ๐‘‚๐‘๐‘ ๐‘ , Query = 1 + 1 = 2, ๐‘  = { ๐‘„๐‘ข๐‘–๐‘ง , 20 }. The corresponding SQL query has the form: SELECT Quiz FROM scores INNER JOIN result ON scores.`โ„–`=result.`โ„–` AND scores.Name=result.Name WHERE result.Total>83 Thus, as can be seen from the examples, the constructed tuple calculus for multiset table algebra allows for the adequate formalization of query languages, particularly SQL, considering the multiset semantics embedded in them. Conclusions The article proposes a tuple calculus for multiset table algebra. The alphabet, syntax of terms, atoms, and formulas of the tuple calculus are defined. Using the concept of free and bound tuple, the notion of scheme, and the set of attributes with which a tuple appears in formulas, a class of legal formulas is introduced. It is shown that the proposed tuple calculus is no less expressive than multiset table algebra. An example demonstrates the feasibility of constructing tuple calculus for multiset table algebra, considering that query languages oriented towards database work imply the repeatability of elements in a table. The next research challenge is to establish the corresponding dual result. References [1] E.F. Codd, Relational ะกompleteness of Data Base Sublanguages, in: Data Base Systems (1972) 65-98. [2] M. Lacroix, A. Pirotte, Domain-oriented Relational Languages, in: Proceedings of 3rd Int. Conf. on Very Large Data Bases., 1977, pp. 370-378. [3] V.N. Redko, et al., Relational Databases: Table Algebras and SQL-like Language. Kyiv: Publishing house Academperiodica, 2001. [in Ukrainian] [4] D.B Buy, I.M. Glushko, Calculi and extensions of table algebras signature. Nizhyn: NDU im. M. Gogol, 2016. [in Ukrainian] [5] I.Glushko, About relationship between table algebra of infinite tables and multiset table algebra, in CEUR Workshop Proceedings, 2139, 2018, pp. 159โ€“163. [6] I. Lysenko, Extended table algebra, extended multiset table algebra, and their relationship, in: Proceedings of International Conference on Software Engineering โ€œSoftEngine 2020โ€, 2020, pp 28-32. [7] Paul W.P.J. Grefen, Rolf A. de By, A Multi-Set Extended Relational Algebra. A Formal Approach to a Practical Issue, in: Proceedings of 10th International Conference on Data Engineering, ICDE, 1994, pp. 80-88. [8] Paul W.P.J. Grefen, J. Flokstra, Extending a Multi-Set Relational Algebra to a Parallel Environment, in: Distributed and Parallel Databases, Vol 4. (1996) 81-99. [9] G. Lamperti, M. Melchiori, M. Zanella, On Multisets in Database Systems, in Multiset Processing: Mathematical, Computer Science, and Molecular Computing Points of View, number 2235 in Lecture Notes in Computing Since (2001)147-215. [10] H. Garcia-Molina, J.D. Ullman, J. Widom, Database Systems: The Complete Book, 2nd. ed., Prentice Hall, 2008. [11] J.D. Ullman, J. Widom.Ullman, A First Course in Database Systems, 3rd ed., Prentice Hall, 2007. [12] A. Silbeschatz, H. Korth, S. Sudarshan, Database System Concepts, 6th ed. McGraw-Hill, 2011. [13] J.A. Bogatyreva, Multisets theory and its applications. Ph.D. thesis, Kyiv National Taras Shevchenko University, 2011. [in Ukrainian] [14] N.Cutland, Computability. An introduction to recursive function theory, Cambridge University Press, 1980.