<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Have Your Cake and Eat it, Too: Data Provenance for Turing-Complete SQL Queries</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Tobias</forename><surname>Müller</surname></persName>
							<email>to.mueller@uni-tuebingen.de</email>
							<affiliation key="aff0">
<orgName type="institution">University of Tübingen</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Have Your Cake and Eat it, Too: Data Provenance for Turing-Complete SQL Queries</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F5A954F5AD26B6CADC27A26FEC83E0CC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:57+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We report on our work on the computation of data provenance for feature-rich SQL. Among other constructs, our prototype supports correlated subqueries, aggregations, recursive queries, and window functions. Our analysis approach completely sidesteps relational algebra and instead requires a translation of the input query into an imperative-style program. Provided that the target language is Turing-complete, any SQL query can be covered. We employ a new variant of program analysis which consists of a dynamic and a static part. This two-step approach enables us to dodge the limitations that a Turing-complete computation model otherwise entails for program analyses. The derived data provenance directly reflects that of the original SQL query.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Data provenance <ref type="bibr" target="#b3">[3,</ref><ref type="bibr" target="#b4">4]</ref> is metadata, primarily about the origin of a certain piece of data. Everyday examples of desirable provenance information are the From: header field in an email or citations in academic papers. In these two cases, the provenance is trivial and does not need any clever algorithms for its computation (at least: it should not).</p><p>However, contemporary implementations of SQL in real-world relational database systems are deficient when it comes to provenance computation. SQL, the standard relational query language, supports advanced language constructs like recursive queries or window functions. Further, queries can be nested, for example, through (correlated) subqueries. These features make writing queries convenient but also make the data provenance of query results non-trivial in the general case. Concrete scenarios in which data provenance for SQL has proved relevant are the view update/maintenance problem <ref type="bibr" target="#b4">[4]</ref>, data warehouses <ref type="bibr" target="#b4">[4]</ref>, and debugging <ref type="bibr">[5]</ref>. The analysis approach we describe is capable of computing the data provenance of any non-updating SQL query.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Provenance Model</head><p>We adopt the basic distinction between Where- and Why-provenance as originally introduced by Buneman et al. <ref type="bibr" target="#b1">[1]</ref>:</p><p>• Where-provenance: where did a certain piece of data originate? Exactly which table cells were copied or transformed to yield an output cell? • Why-provenance: why is a certain piece of data in the result? Which input table cells were inspected to decide about the existence or contents of an output cell?</p></div>
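To make the distinction concrete, here is an illustrative Python sketch (not the paper's implementation) that annotates the output of a simple filter-and-project query with both provenance kinds; the cell addresses like ("t2", "formula") are hypothetical identifiers invented for this example.

```python
# Where- vs. Why-provenance for SELECT output_col FROM rows
# WHERE predicate_col = wanted, tracked per output value.
compounds = [
    {"id": "t1", "compound": "citrate",   "formula": "C6H5O7 3-"},
    {"id": "t2", "compound": "glucose",   "formula": "C6H12O6"},
    {"id": "t3", "compound": "hydronium", "formula": "H3O+"},
]

def filter_with_provenance(rows, predicate_col, wanted, output_col):
    result = []
    for row in rows:
        if row[predicate_col] == wanted:
            result.append({
                "value": row[output_col],
                # Where: the cell that was copied into the output.
                "where": {(row["id"], output_col)},
                # Why: the cell inspected to decide the row's fate.
                "why":   {(row["id"], predicate_col)},
            })
    return result

out = filter_with_provenance(compounds, "compound", "glucose", "formula")
```

Only the glucose row qualifies, so its formula cell is the Where-provenance of the single output value, while the compound cell that was tested is its Why-provenance.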
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Basic Example</head><p>Figure <ref type="figure">1</ref> shows an intentionally simple SQL query and corresponding example tables. Mouse pointer 1 represents an inquiry into the data provenance of C 6 H 12 O 6 . In the following sections, we revisit this example and illustrate how this provenance is actually computed using our program analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3">Advanced Example</head><p>The provenance analysis of the query found in Figure <ref type="figure" target="#fig_3">2(b</ref>) is a unique feature of our approach: to the best of our knowledge, only our analysis approach can deal with recursive SQL queries.</p><p>The query syntax-checks molecular formulae. Technically, it implements the finite state machine depicted in Figure <ref type="figure">3</ref>. More interesting markers can be found within table fsm. The highlighted cells inside t 8 and t 9 indicate which state changes were triggered while parsing the first letters of the formula.</p></div>
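The idea behind the recursive query can be sketched imperatively: walk a transition table (the paper's table fsm) character by character. The states, character classes, and transitions below are invented for illustration; the paper's actual FSM (Figure 3) is not reproduced here.

```python
# Hypothetical sketch of syntax-checking a molecular formula with a
# finite state machine stored as a transition table.
def char_class(ch):
    if ch.isupper(): return "upper"
    if ch.islower(): return "lower"
    if ch.isdigit(): return "digit"
    return "other"

# (state, character class) -> next state; "start" is the initial state,
# and any state other than "start" counts as accepting in this sketch.
FSM = {
    ("start",   "upper"): "element",
    ("element", "upper"): "element",
    ("element", "lower"): "element",
    ("element", "digit"): "count",
    ("count",   "digit"): "count",
    ("count",   "upper"): "element",
}

def syntax_check(formula):
    state = "start"
    for ch in formula:
        key = (state, char_class(ch))
        if key not in FSM:       # no transition: reject
            return False
        state = FSM[key]
    return state != "start"
```

In the paper, this walk is expressed as a recursive SQL query over table fsm, and the highlighted fsm cells are exactly the transitions taken.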
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.4">Analysis Overview</head><p>Figure <ref type="figure" target="#fig_1">4</ref> provides a graphical overview of our analysis approach. The actual provenance analysis happens within the dotted box. It requires the SQL query to be translated into imperative program code. For our prototype, we use a handcrafted SQL compiler. Contemporary database systems like HyPer <ref type="bibr" target="#b9">[9]</ref> perform such a translation internally.</p><p>The provenance analysis itself consists of two steps. First, a dynamic analysis takes place which includes code instrumentation and execution. This step actually computes the same query result as a regular query processor would.<ref type="foot" target="#foot_0">1</ref> As a side effect, two light-weight execution logs are written. They describe the execution flow at runtime and are a key element of this approach.</p><p>In our second step, a static analysis is carried out, exploiting the runtime knowledge encoded in the logs. Our static analysis does no data processing at all: the data provenance is derived from program code and logs only. It is inspired by Program Slicing <ref type="bibr" target="#b2">[2,</ref><ref type="bibr" target="#b10">10]</ref>.</p><p>In Section 3, all elements of our provenance analysis are explained in greater detail.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">SQL COMPILATION</head><p>Figure <ref type="figure" target="#fig_5">5</ref> shows a simplified yet executable translation of the basic SQL query in Figure <ref type="figure">1(b)</ref>. Ignore the logging statements until the next section.</p><p>The target language is kept minimal to just fit our needs: it can compute query results but has no support for I/O operations, for example. Due to space limitations, and as the presented code fragment consists of well-known language elements, we do not give a formal definition. The table compounds of Figure <ref type="figure">1</ref>(a) is represented as a data structure (a list of dictionaries). The algorithm iterates over the input table (line 3) and, if a tuple qualifies (line 5), appends its formula to the result (line 7).</p><p>Note that we have combined the input data (i.e., the database instance) and the computation algorithm into one program. In the regular case, both are kept separate (refer to Figure <ref type="figure" target="#fig_1">4</ref>).</p></div>
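A Python rendering of the translated query sketched above (logging statements omitted) could look as follows; the variable names and the exact filter predicate are assumptions based on the basic example, not the paper's verbatim code, and the formulae are simplified to plain ASCII.

```python
# Imperative translation of: SELECT formula FROM compounds
#                            WHERE compound = 'glucose'
data = [                                   # table compounds as a
    {"compound": "citrate",   "formula": "C6H5O7 3-"},  # list of
    {"compound": "glucose",   "formula": "C6H12O6"},    # dictionaries
    {"compound": "hydronium", "formula": "H3O+"},
]

res = []
for row in data:                           # iterate over the input table
    c = row["compound"]
    if c == "glucose":                     # tuple qualifies?
        t = {"formula": row["formula"]}    # project the formula column
        res.append(t)                      # append to the result
```

The result list res plays the role of the query's output table.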
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">PROVENANCE ANALYSIS</head><p>Before we get to the details of our approach, we shed some light on the theoretical limits of program analysis and the dilemma that arises. Rice's theorem is a result from computability theory. Cast informally, the theorem states that in the Turing-complete computation model only trivial questions about the behavior of a program can be answered. A sample trivial question would be: how many lines does the program have? Non-trivial properties of a program (such as data provenance), however, can only be addressed if the program is actually executed.</p><p>This gives rise to the following dilemma: to embrace a rich SQL dialect, we want to be Turing-complete (i.e., able to compute anything). Regarding program analysis, however, we want to avoid Turing completeness and its implications as formulated in Rice's theorem. The approach illustrated next allows us to have our cake and eat it, too: we stay in the Turing-complete computation model during runtime and switch into a weaker computation model for provenance analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Two-Step Program Analysis</head><p>To make this switch possible, we run consecutive dynamic and static analyses (compare Figure <ref type="figure" target="#fig_1">4</ref>).</p><p>During dynamic analysis, the behavior (not: the result) of certain program statements is recorded in logs. For example, an if-statement can branch into the then- or the else-block. We record this (binary) decision. During static analysis, this makes the behavior of the if-statement predetermined: the if no longer actively contributes to the computation and can be replaced by the corresponding then- or else-branch.</p><p>When applying this record &amp; replace discipline to a relevant subset of a program's statements, we get an equivalent form of the original program computing the same result. But now the computation model has been simplified and is open to an exhaustive program analysis. In the remainder of this section we explain the two analysis steps in detail.</p></div>
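The record &amp; replace discipline for a single if-statement can be sketched as follows (illustrative only; the paper instruments its dedicated target language, not Python):

```python
# Record & replace for one if-statement.
log_cf = []

def classify_recorded(x):
    """Dynamic analysis: execute normally, recording the branch taken."""
    taken = x > 0
    log_cf.append(taken)          # record the (binary) decision
    return "pos" if taken else "non-pos"

def classify_replayed():
    """Static analysis: the if no longer decides -- the log does."""
    taken = log_cf.pop(0)         # replace the test by the recorded value
    return "pos" if taken else "non-pos"

classify_recorded(7)              # records True
classify_recorded(-3)             # records False
```

Replaying now yields the same branch sequence without ever re-evaluating the condition, which is exactly what makes the static analysis independent of actual data values.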
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Dynamic Analysis</head><p>As motivated above, we aim to record the behavior of program statements at runtime. The following two logs are appended to:</p><p>• log cf (control flow): which code branch gets executed, and how often, by if and foreach? • log ix (indices): at which locations are elements inside lists/dictionaries accessed? At runtime, these properties are available and can easily be recorded. We use the technique of code instrumentation to create the two logs.</p><p>For an instrumented example, see Figure <ref type="figure" target="#fig_5">5</ref>. The instrumentation instructions are placed on the right-hand side of the listing. The first argument of the put()-function is the type of log we want to append to. Its second argument is the actual value being logged. Figure <ref type="figure" target="#fig_6">6</ref> lists the corresponding logs. These are written (and read) sequentially and do not need any further metadata, keeping the logs small.</p><p>The logged data items are to be interpreted in the context of the (uninstrumented) source code. For example, the first entry of log cf corresponds to the first control flow decision in the program at line 3. The foreach loop opened there can either execute its body (another time) or terminate and continue at the statement after line 11. We encode these decisions using Boolean values. The first true found in the log indicates that the body has been executed; the last false indicates that the loop has exited. List/dictionary element accesses are logged in log ix . Note that foreach and append implicitly use numeric indices to read/write from/into lists and need to be included. The idxOf() function retrieves the ordinal position of a list element.</p></div>
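An instrumented version of the translated basic query can be modeled in Python as below. The put() calls are placed where Figure 5 places them (loop entry/exit, branch decisions, and implicit indices of foreach and append); executing this sketch reproduces the log contents shown in Figure 6.

```python
# Instrumented translation of the basic query, writing log_cf and log_ix.
log_cf, log_ix = [], []

def put(log, value):
    log.append(value)            # sequential, metadata-free logging

data = [
    {"compound": "citrate",   "formula": "C6H5O7 3-"},
    {"compound": "glucose",   "formula": "C6H12O6"},
    {"compound": "hydronium", "formula": "H3O+"},
]

res = []
i = 0
while True:                      # models the foreach loop
    if i >= len(data):
        put(log_cf, False)       # foreach terminates
        break
    put(log_cf, True)            # foreach executes its body (again)
    put(log_ix, i)               # implicit list index used by foreach
    row = data[i]
    put(log_ix, "compound")      # dictionary key being read
    if row["compound"] == "glucose":
        put(log_cf, True)        # then-branch taken
        put(log_ix, "formula")
        res.append({"formula": row["formula"]})
        put(log_ix, len(res) - 1)  # implicit index used by append
    else:
        put(log_cf, False)       # else-branch taken
    i += 1
```

Note how log_cf interleaves loop and branch decisions in program order, and how log_ix records both explicit key lookups and the implicit numeric indices.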
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Static Analysis</head><p>Our static analysis performs an abstract (value-less) interpretation of the uninstrumented source code. Instead of computing values, all input values are replaced by unique numeric identifiers. These pids are propagated during program interpretation and successively create a variable environment containing the data provenance information. Based on the basic query example of Figure <ref type="figure">1</ref>, we present a simplified subset of our provenance derivation algorithm.</p><p>Figure <ref type="figure" target="#fig_9">8</ref> shows provenance inference rules denoted in operational semantics. The top Statements rule is the entry point of the interpretation. It takes the first statement s out of all statements ss to be interpreted. In general, interpretation of statements is triggered by the − − ⤇ symbol and leads to an update of the current variable environment Γ. The CF symbol represents the current data provenance of the control flow. The idea behind this is that reaching a certain code section depends on a number of branching decisions carried out by if/else statements. The dependencies of these decisions are collected in CF and propagated during program interpretation.</p><p>The numeric ids which represent a data provenance relationship are defined in Figure <ref type="figure" target="#fig_7">7</ref>. There are two kinds, pid e and pid y , which stand for Where- and Why-provenance, respectively. During analysis, these ids are created by the new ()-function (for an example, see the Lit-Str rule). Initially, all pids are of the Where-type because any pid e represents a certain value and a location of origin. During interpretation, they may be converted into the Why-type using the function Υ().</p><p>The main data structure is P . It can represent any value of any type of our programming language. Its second component e is used for container types (i.e., lists/dictionaries) to store the contained elements. The first component c is used for both containers and atomic values (e.g., strings); it represents the provenance of the value itself. The logs log cf and log ix are read by the inference rules; see rules If-True and If-False, for example. The popf ()-function reads and removes the first element of the corresponding log.</p><p>The inference rules presented in Figure <ref type="figure" target="#fig_9">8</ref> are suitable to compute the data provenance of the basic query and finally yield the environment shown in Figure <ref type="figure" target="#fig_10">9</ref>. As the main result, we find four provenance relationships located in res[0]["formula"]. The highlighted pids 5 e (relating to t 2 : C 6 H 12 O 6 ) and 4 y (relating to t 2 : glucose) constitute the data provenance visualized in Figure 1; 6 y and 10 y do not correspond to table cells and may be ignored. We already presented a visualization prototype in a recent demo paper <ref type="bibr" target="#b8">[8]</ref>.</p><formula xml:id="formula_0">Statements CF ; Γ ⊢ s − − ⤇ Γ 1 CF ; Γ 1 ⊢ ss − − ⤇ Γ 2 CF ; Γ ⊢ s ; ss − − ⤇ Γ 2 PutVar CF ; Γ ⊢ e ⤇ P Γ res = Γ + {v ↦ P } CF ; Γ ⊢ v = e − − ⤇ Γ res Skip CF ; Γ ⊢ skip − − ⤇ Γ If-True popf (log cf ) CF ; Γ ⊢ e ⤇ P e CF if = Υ(CF ∪ γ(P e )) CF if ; Γ ⊢ ss 1 − − ⤇ Γ res CF ; Γ ⊢ if e then ss 1 else ss 2 fi − − ⤇ Γ res If-False ¬popf (log cf ) ... 2 CF ; Γ ⊢ if ... fi − − ⤇ Γ res</formula></div>
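A toy Python model of the pid bookkeeping makes the rules more tangible. The names new and Υ (here why) follow Figures 7 and 8; modeling P as a (c, e) pair of a pid set and an element dictionary follows Figure 7, while the concrete encoding of pids as strings like "1e"/"1y" is an assumption of this sketch.

```python
# Simplified model of Lit-Str, BinOp, and the CF update of If-True.
from itertools import count

_fresh = count(1)

def new():
    """Create a fresh Where-pid such as '1e'."""
    return f"{next(_fresh)}e"

def why(pids):
    """Υ: convert any pid (Where or Why) into its Why-variant."""
    return {p[:-1] + "y" for p in pids}

# P is modeled as (c, e): c a set of pids, e the contained elements.
def lit_str(cf):
    return ({new()} | cf, {})           # rule Lit-Str

def binop(p1, p2):
    return (p1[0] | p2[0], {})          # rule BinOp

p_a = lit_str(set())      # first string literal
p_b = lit_str(set())      # second string literal
p_cmp = binop(p_a, p_b)   # a comparison depends on both operands
cf_then = why(p_cmp[0])   # If-True: CF_if = Υ(CF ∪ γ(P_e))
```

The then-branch is thus interpreted under a CF that carries Why-pids for every cell the branching condition inspected, which is how filter predicates end up as Why-provenance of the output.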
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Related Work</head><p>The most closely related work builds upon provenance propagation through query transformation on the algebraic layer. For example, there is the Provenance Semirings approach <ref type="bibr" target="#b7">[7]</ref> as well as the PERM system <ref type="bibr" target="#b6">[6]</ref>. In more recent work, both of them were extended to support aggregations and subqueries, respectively. These algebraic approaches are all limited in their expressiveness, and extending the set of supported algebraic operators is non-trivial.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">CONCLUSIONS</head><p>The approach presented in this article pushes the boundaries of provenance analysis for SQL queries. Our prototype can analyse queries with advanced, contemporary SQL language features. Due to Turing-completeness, this approach can deal with any (non-updating) query translated into imperative code.</p><p>It is part of our future work to run this approach in the environment of a full-fledged DBMS. In parallel, we pursue the derivation of How-provenance <ref type="bibr" target="#b3">[3]</ref>, i.e., associating each computed provenance relation with the SQL clauses accountable for its existence.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: FSM.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Overview of the two-step analysis.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>2 (</head><label>2</label><figDesc>c) Parsing trace.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Advanced query example and provenance markers.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: The translated and instrumented SQL query.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Log contents.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Data structures used in provenance computation.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head></head><label></label><figDesc>Foreach-False ¬popf (log cf ) CF ; Γ ⊢ foreach ... od − − ⤇ Γ Foreach-True popf (log cf ) CF ; Γ ⊢ e ⤇ P e P el = (P e )[popf (log ix )] Γ for = Γ + {v ↦ ⟨γ(P e ) ∪ γ(P el ), (P el )⟩} CF ; Γ for ⊢ ss ; foreach v in e do ss od − − ⤇ Γ res CF ; Γ ⊢ foreach v in e do ss od − − ⤇ Γ res Append CF ; Γ ⊢ e ⤇ P e P v = Γ[v] P = ⟨γ(P v ), (P v ) + {popf (log ix ) ↦ P e }⟩ Γ res = Γ + {v ↦ P } CF ; Γ ⊢ append(v, e) − − ⤇ Γ res GetVar P = Γ[v] P res = ⟨γ(P ) ∪ CF , (P )⟩ CF ; Γ ⊢ v ⤇ P res Lit-Str P res = ⟨{new ()} ∪ CF, ∅⟩ CF ; Γ ⊢ c ⤇ P res GetVar-Idx P = (Γ[v])[popf (log ix )] CF ; Γ ⊢ e ⤇ P e P res = ⟨γ(P ) ∪ CF ∪ γ(Γ[v]) ∪ Υ(γ(P e )), (P )⟩ CF ; Γ ⊢ v[e] ⤇ P res Lit-Dict |CF ; Γ ⊢ e i ⤇ P i | i=0...n P res = ⟨{new ()} ∪ CF , {| i ↦ P i | i=0...n }⟩ CF ; Γ ⊢ { 0 :e 0 , . . . , n :e n } ⤇ P res Lit-List |CF ; Γ ⊢ e i ⤇ P i | i=0...n P res = ⟨{new ()} ∪ CF , {|i ↦ P i | i=0...n }⟩ CF ; Γ ⊢ [e 0 , . . . , e n ] ⤇ P res BinOp CF ; Γ ⊢ e 1 ⤇ P 1 CF ; Γ ⊢ e 2 ⤇ P 2 P res = ⟨γ(P 1 ) ∪ γ(P 2 ), ∅⟩ CF ; Γ ⊢ e 1 ⊛ e 2 ⤇ P res</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Inference rules for data provenance.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: Resulting environment Γ after static analysis. Pids of non-input values (&gt; 10 e|y ) were dropped.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>table cell t 4 : C 6 H 12 O 6 . According to the SQL query, two input columns are accessed: compound is used to decide if a tuple gets filtered or not. If a tuple qualifies, the value sitting in formula is copied over into the result table. Our provenance analysis accordingly finds the result being why-dependent on tuple t 2 : glucose and being where-dependent on t 2 : C 6 H 12 O 6 .</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head></head><label></label><figDesc>last false indicates that the foreach loop has exited. Similarly, an if-statement can decide between then (yields true) or else (yields false).</figDesc><table><row><cell>log cf</cell><cell>log ix</cell></row><row><cell>⟨ true,</cell><cell>⟨ 0,</cell></row><row><cell>false,</cell><cell>"compound",</cell></row><row><cell>true,</cell><cell>1,</cell></row><row><cell>true,</cell><cell>"compound",</cell></row><row><cell>true,</cell><cell>"formula",</cell></row><row><cell>false,</cell><cell>0,</cell></row><row><cell>false ⟩</cell><cell>2,</cell></row><row><cell></cell><cell>"compound" ⟩</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>P</head><label></label><figDesc>∶= ⟨c, e⟩ c ∶= {pid 1 , ..., pid n } pid ∈ {1 e , 1 y , 2 e , 2 y , 3 e , 3 y , ...} e ∶= {l 1 ↦ P 1 , ..., l n ↦ P n }</figDesc><table><row><cell>l ∶= any identifier</cell></row><row><cell>γ(P ) ∶= c</cell></row><row><cell>(P ) ∶= e</cell></row></table><note>Υ(pids) ∶= {pid y ∶ pid e|y ∈ pids}</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>2</head><label></label><figDesc>Analogous to If-True: ss 2 is interpreted. data ∶ ⟨{ 10 e }, {0 ↦ ⟨{3 e }, "compound" ↦ ⟨{1 e }, ∅⟩, "formula" ↦ ⟨{2 e }, ∅⟩⟩, 1 ↦ ⟨{ 6 e }, "compound" ↦ ⟨{ 4 e }, ∅⟩, "formula" ↦ ⟨{ 5 e }, ∅⟩⟩, 2 ↦ ⟨{9 e }, "compound" ↦ ⟨{7 e }, ∅⟩, "formula" ↦ ⟨{8 e }, ∅⟩⟩}⟩ row ∶ ⟨{9 e , 10 e }, {"compound" ↦ ⟨{7 e }, ∅⟩, "formula" ↦ ⟨{8 e }, ∅⟩}⟩ c ∶ ⟨{7 e , 9 e , 10 y }, ∅⟩ t ∶ ⟨{4 y , 6 y , 10 y }, {"formula" ↦ ⟨{5 e , 4 y , 6 y , 10 y }, ∅⟩}⟩ res ∶ ⟨∅, 0 ↦ ⟨{4 y , 6 y , 10 y }, {"formula" ↦ ⟨{ 5 e , 4 y , 6 y , 10 y }, ∅⟩}⟩⟩</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">As part of our future work, we seek to modify an existing database system and let it run the dynamic analysis simultaneously with query execution.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>


<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Why and Where: A Characterization of Data Provenance</title>
		<author>
			<persName><forename type="first">P</forename><surname>Buneman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khanna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-C</forename><surname>Tan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICDT</title>
				<meeting>ICDT</meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Program Slicing and Data Provenance</title>
		<author>
			<persName><forename type="first">J</forename><surname>Cheney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Data Engineering Bulletin</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="22" to="28" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Provenance in Databases: Why, How, and Where</title>
		<author>
			<persName><forename type="first">J</forename><surname>Cheney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chiticariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-C</forename><surname>Tan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Foundations and Trends in Databases</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Tracing the Lineage of View Data in a Warehousing Environment</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Widom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wiener</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM TODS</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="issue">2</biblScope>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The best bang for your bu(ck)g</title>
		<author>
			<persName><forename type="first">B</forename><surname>Dietrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Grust</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. EDBT</title>
				<meeting>EDBT</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Perm: Processing Provenance and Data on the Same Data Model Through Query Rewriting</title>
		<author>
			<persName><forename type="first">B</forename><surname>Glavic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Alonso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICDE</title>
				<meeting>ICDE</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Provenance Semirings</title>
		<author>
			<persName><forename type="first">T</forename><surname>Green</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karvounarakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tannen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. PODS</title>
				<meeting>PODS</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Provenance for SQL Based on Abstract Interpretation: Value-less, but Worthwhile</title>
		<author>
			<persName><forename type="first">T</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Grust</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. VLDB</title>
				<meeting>VLDB<address><addrLine>Hawaii, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Efficiently Compiling Efficient Query Plans for Modern Hardware</title>
		<author>
			<persName><forename type="first">T</forename><surname>Neumann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. VLDB</title>
				<meeting>VLDB</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Program Slicing</title>
		<author>
			<persName><forename type="first">M</forename><surname>Weiser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Software Engineering</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
