<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Axiomatic System for Order Dependencies</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jaroslaw</forename><surname>Szlichta</surname></persName>
							<email>jszlicht@cse.yorku.ca</email>
							<affiliation key="aff0">
								<orgName type="institution">York University</orgName>
								<address>
									<settlement>Toronto</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">IBM Centre for Advanced Studies</orgName>
								<address>
									<settlement>Toronto</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Parke</forename><surname>Godfrey</surname></persName>
							<email>godfrey@cse.yorku.ca</email>
							<affiliation key="aff0">
								<orgName type="institution">York University</orgName>
								<address>
									<settlement>Toronto</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jarek</forename><surname>Gryz</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">York University</orgName>
								<address>
									<settlement>Toronto</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Calisto</forename><surname>Zuzarte</surname></persName>
							<email>calisto@ca.ibm.com</email>
							<affiliation key="aff1">
								<orgName type="department">IBM Centre for Advanced Studies</orgName>
								<address>
									<settlement>Toronto</settlement>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Axiomatic System for Order Dependencies</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">413F0F8FF9675DC4CBCCF070D6632B1D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:41+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Dependencies play an important role in database theory. We study order dependencies (ODs) and unidirectional order dependencies (UODs), which describe the relationships among lexicographical orderings of sets of tuples. We investigate the axiomatization problem for order dependencies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Fundamentals</head><p>Understanding the semantics of data is important, both for data quality analysis and knowledge discovery. While the relational data model is set based and does not concede the concept of order, ordered streams nonetheless play important roles in relational systems. SQL allows one to specify by its order-by clause that the answer "set" be returned in the specified order. Ordered streams are prevalent in query plans between operators to provide efficient evaluation. A query optimizer must reason extensively over interesting orders while building efficient query plans <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b6">7]</ref>.</p><p>We are interested in lexicographical ordering, or nested sort, as is provided by SQL's order-by directive. Let X = [A | T] be a list of attributes, the attribute A is the head of the list, and the list T the tail. For two tuples s and t, s X t iff s A &lt; t A or (s A = t A and (T = [ ] or s T t)).</p><p>Given two lists of attributes X and Y, X → Y denotes a unidirectional order dependency (UOD) [5], read as X orders Y. Table <ref type="table">r</ref> satisfies X → Y iff, for all s, t ∈ r, s X t implies s Y t. That is, given table r, any list of the table's tuples that satisfies order by X also satisfies order by Y. Also, X ∼ Y denotes that X ↔ Y. The default direction of the order for SQL's order-by is ascending. We also consider order-by's that mix asc and desc directives; e.g., order by A asc, B desc. This generalization we simply call order dependency (OD) <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b6">7]</ref>.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Example 1. Let r be a table with attributes A, B, C, D, E, F (Table <ref type="table" target="#tab_0">1</ref>). Note that</p><formula xml:id="formula_0">[A, B, C] → [F, E, D] is satisfied by r but [A, B, C] → [F, D, E] is falsified by r. Also note r |= [ − → C , ← − A ] → [ − → B , ← − D , ← − E ], but r |= [ − → C , ← − A ] → [ ← − E , − → B , ← − D ].</formula><p>Functional dependencies (FDs) are to group-by as ODs are to order-by. Given X → Y, if one has an SQL query with order by Y, one can rewrite the query with order by X instead, and meet the intent of the original query. Assume we have a tree index for X. This index can help in a query plan with the semantic information that X orders Y. In <ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref>, we showed the details how current query optimizers could be extended with ODs to simplify queries with order by in similar ways to how FDs have been shown to be useful in simplifying queries with group by.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Axiomatization and Challenges</head><p>A key concern in dependency theory is developing the algorithms for testing logical implication. Developing inference rules (axioms) is an approach to show logical implication between dependencies. We investigate the axiomatization for ODs. In <ref type="bibr" target="#b4">[5]</ref>, we studied UODs. We provided a sound and complete axiomatization for UODs. The inference rules of the axiomatization are shown in Figure <ref type="figure" target="#fig_0">1</ref>. The inference rules remain sound over ODs. To prove the axiomatization is sound, we show that each of the axiom is sound, which is simple. Proving completeness is much more involved. To prove the axiomatization is complete for UODs, we demonstrate in <ref type="bibr" target="#b4">[5]</ref> that for any set of UODs M, a table t can be constructed that satisfies M and that every OD that is not derivable over M with axioms presented in Figure <ref type="figure" target="#fig_0">1</ref> is falsified by table t. In <ref type="bibr" target="#b4">[5]</ref>, we additionally demonstrate that Armstrongs axiomatization for functional dependencies (FDs) is subsumed within our axiomatization for ODs. Working with ODs is more involved than with FDs because the order of the attributes matters. Thus, we must work with lists of attributes instead of with sets. This necessarily complicates our axioms compared with Armstrongs axioms for FDs. The axioms provide insight into how ODs behave and patterns for how ODs logically follow from others that are not easily evident reasoning from first principles. The work about ODs, we feel, opens exciting venues for future work to develop a powerful new family of query optimization techniques in database systems.</p><formula xml:id="formula_1">1. Reflexivity. XY → X 2. Prefix. X → Y ZX → ZY 3. Normalization. WXYXV ↔ WXYV. 4. Transitivity. X → Y Y → Z X → Z 5. Suffix. X → Y X ↔ YX 6. Chain. X ∼ Y1 ∀ i∈[1,n−1] Yi ∼ Yi+1 Yn ∼ Z ∀ i∈[1,n] YiX ∼ YiZ X ∼ Z</formula><p>There is more that can be done, and that we plan to do. We plan to work on extending our work axiomatization for UODs <ref type="bibr" target="#b4">[5]</ref> into an complete axiomatization for ODs, which allow the mix of ascending and descending orders. Such an axiomatization might provide insight into how ODs behave, and provide input for useful query rewrites.</p><p>If ABC → D holds but not AB → D, is ordering by AB useful if we need a stream sorted by D? If the stream is sorted by AB, it may be nearly sorted on D. If it were known that every partition of AB is small, each AB-partition could be sorted on-the-fly in main memory, removing the need for an external sort operator. We would like to formalize the concept of nearly sorted.</p><p>We are working on introducing a framework for discovering conditional order dependencies. (Conditional sequential dependencies were proposed in <ref type="bibr" target="#b1">[2]</ref>.) A conditional order dependency can be represented as a pair (X → Y, T r ), where X → Y, referred to as the embedded OD, and T r is a range pattern tableau defining over which rows the dependency applies. It would provide a novel integrity constraint allowing one to express, that an OD date → salary holds for a given employee id.</p><p>Furthermore, in the process of merging data from various sources, it is often the case that small variations occur. For example, one movie site might report the movie Gone with the Wind as having a running time of 222, while site two reports 238 minutes for it. The FD that movie → length would be violated. In <ref type="bibr" target="#b2">[3]</ref>, they define a metric over FDs to allow for such small variations. Likewise, we would like to define metric ODs to generalize both ODs as in this paper and metric FDs. We would like to devise algorithms for determining whether a given metric OD holds for a given relation, and to investigate the use of these as data cleaning techniques as in <ref type="bibr" target="#b0">[1]</ref> for matching dependencies.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Axioms for UODs.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Relation instance r.</figDesc><table><row><cell># A B C D E F</cell></row><row><cell>s 3 2 0 4 7 9</cell></row><row><cell>t 3 2 1 3 8 9</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Data cleaning and query answering with matching dependencies and matching functions</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bertossi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kolahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lakshmanan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICDT</title>
		<imprint>
			<biblScope unit="page" from="268" to="279" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Sequential Dependencies</title>
		<author>
			<persName><forename type="first">L</forename><surname>Golab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Karloff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Korn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Srivastava</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PVLDB</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="574" to="585" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Metric Functional Dependencies</title>
		<author>
			<persName><forename type="first">N</forename><surname>Koudas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Saha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDE</title>
				<imprint>
			<date type="published" when="1291">1291-1294. 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Chasing Polarized Order Dependencies</title>
		<author>
			<persName><forename type="first">J</forename><surname>Szlichta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Godfrey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gryz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AMW</title>
		<imprint>
			<biblScope unit="page" from="168" to="179" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Fundamentals of Order Dependencies</title>
		<author>
			<persName><forename type="first">J</forename><surname>Szlichta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Godfrey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gryz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PVLDB</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="1220" to="1231" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Queries on Dates: Fast Yet not Blind</title>
		<author>
			<persName><forename type="first">J</forename><surname>Szlichta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Godfrey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gryz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pawluk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zuzarte</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EDBT</title>
				<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="497" to="502" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Expressiveness and Complexity of Order Dependencies</title>
		<author>
			<persName><forename type="first">J</forename><surname>Szlichta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Godfrey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gryz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zuzarte</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PVLDB</title>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
