<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Schema-based Query Optimization for XQuery Queries</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sven</forename><surname>Groppe</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Paderborn</orgName>
								<address>
									<addrLine>Faculty 5, Fürstenallee 11</addrLine>
									<postCode>D-33102</postCode>
									<settlement>Paderborn</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Böttcher</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Paderborn</orgName>
								<address>
									<addrLine>Faculty 5, Fürstenallee 11</addrLine>
									<postCode>D-33102</postCode>
									<settlement>Paderborn</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Schema-based Query Optimization for XQuery Queries</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">884FBFC4D9D32C161C80FB15A7E84FD1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:14+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>XQuery is widely used for querying XML documents. Within this paper, we examine optimization rules for XQuery queries that exploit type information of the input XML document given in XML Schema. These optimization rules are applicable for all XQuery expressions and are very useful e.g. in the scenario of XQuery queries on XQuery views. The basic idea is to transform the XML Schema definition into a graph, which is extended to a graph representing the XQuery expression. The latter graph is used to delete subexpressions of the XQuery expression that are not used to retrieve the final result of the given XQuery expression. We further include experimental results that demonstrate the improvement of our optimization.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The W3C has developed XQuery <ref type="bibr" target="#b26">[25]</ref> for querying XML documents in recent years. XQuery evaluators have not only been implemented for single XML documents, but XQuery has also been integrated in many XML databases and XML-enabled databases. However, query optimization for XQuery is still a major challenge.</p><p>In this paper, we introduce a new optimization technique for the following class of XQuery expressions. We call the class of XQuery expressions C XQuery , which consist of all XQuery expressions, which contain at least one sub-expression, which generates either no output at all or an intermediate result, which is not used to compute the final result of the query. Our optimization technique identifies and eliminates those sub-expressions of queries of C XQuery which do not generate any output or generate superfluous intermediate results.</p><p>Even if queries are well designed, queries of C XQuery often occur in the following important scenario of queries on views. A view describes how a certain section of the stored data in the database is transformed. Now, a user can query the view to retrieve all or a subset of the transformed data of the view. Inside database management systems, the query on the view and the view itself are composed and then evaluated as one composed query. Whenever the user restricts the view by the query and does not query for all the data of the view, the composed query can in general belong to the class C XQuery . In these cases, applying our new optimization technique saves time of processing the composed query. We show by experimental results that the execution of the optimized query is much faster than computing the complete query, whenever the size of all data is relatively large and we can avoid unnecessary computation of unneeded intermediate results.</p><p>The rest of the paper is organized as follows. Section 2 outlines an optimization example. Section 3 describes our general approach for optimizing XQuery queries. Section 4 presents a performance analysis. Section 5 refers to the related work, and we end up with the summary and conclusions in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Optimization Example</head><p>In order to illustrate our optimization, we start with an example. Let us consider the following XQuery query in Fig. <ref type="figure">1</ref>, which is composed of a view definition in line <ref type="bibr" target="#b2">(1)</ref> to line <ref type="bibr" target="#b20">(19)</ref> and of a user defined query on this view in line <ref type="bibr" target="#b22">(21)</ref>. The XQuery query in Fig. <ref type="figure">1</ref> shall be applied to the input XML document of Fig. <ref type="figure">2</ref> that fulfills the XML schema definition of Fig. <ref type="figure" target="#fig_0">3</ref>.</p><p>(1)let $view := (2) &lt;root&gt; (3) &lt;result&gt; (4) { <ref type="bibr" target="#b6">(5)</ref> for $a in <ref type="bibr" target="#b7">(6)</ref> document('p.xml')/conference/paper <ref type="bibr" target="#b8">(7)</ref> let $b := <ref type="bibr" target="#b9">(8)</ref> &lt;single_result&gt;{$a}&lt;/single_result&gt; <ref type="bibr" target="#b10">(9)</ref> return $b (10) } ( <ref type="formula">11</ref>) { <ref type="bibr" target="#b13">(12)</ref> for $a in <ref type="bibr" target="#b14">(13)</ref> document('p.xml')/conference/tutorial <ref type="bibr" target="#b15">(14)</ref>   When the XQuery query in Fig. <ref type="figure">1</ref> is evaluated on the XML document 'p.xml' outlined in Fig. <ref type="figure">2</ref> and its XML Schema definition outlined in Fig. <ref type="figure" target="#fig_0">3</ref>, at first the variable $view is computed. Its content is presented in Fig. <ref type="figure" target="#fig_1">4</ref>.  Thereafter, the result of the entire query of Fig. <ref type="figure">1</ref> is computed, the result of which is presented in Fig. <ref type="figure">5</ref>. &lt;tutorial name="Web Services"&gt; &lt;author&gt;Expert&lt;/author&gt; &lt;/tutorial&gt; Fig. <ref type="figure">5</ref>. Result of the query in Fig. <ref type="figure">1</ref> Note that the result in Fig. <ref type="figure">5</ref> is only a part of the fragment computed by the variable $view. Although XQuery evaluators process the complete query in Fig. <ref type="figure">1</ref>, and therefore can optimize the complete query, current implementations ( <ref type="bibr" target="#b7">[6,</ref><ref type="bibr" target="#b16">15,</ref><ref type="bibr" target="#b19">18,</ref><ref type="bibr" target="#b23">22]</ref>) always compute the entire content of the variable $view either at once or one XML node after the next XML node by using an iterator. After that, current implementations project to the required part of $view.</p><formula xml:id="formula_0">&lt;root&gt;</formula><p>In comparison, our approach computes a variable assignment of $view which only contains those computations for the required part, which is used to compute the final result. For example, the query in Fig. <ref type="figure" target="#fig_2">6</ref> is an optimized query of the query in Fig. <ref type="figure">1</ref>. Within Section 4, we present experimental results, which show the speed-up factor of such optimized queries. In the following subsections, we describe how to eliminate sub-expressions in variable assignments like $view so that only the required part is computed. This optimization technique can be used in the important scenario of XQuery queries on XQuery views, because one way to reformulate an XQuery query Q according to an XQuery view V is to use the pattern presented in Fig. <ref type="figure" target="#fig_3">7</ref>. Whenever there is a reference in the XQuery query Q to the view V, the variable $view is used to access the view. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The General Optimization Approach</head><p>We design the optimization step of reducing the XQuery query independently of the current instance of the XML document. The advantage is that the approach can optimize the XQuery query in advance without connecting to any database. As the output of XQuery expressions often contains whole sub-trees of the input XML document, we can use schema information of the input XML document in order to optimize queries. For this purpose, we introduce the ordered schema graph, which is based on the schema of the input XML document. We use this ordered schema graph for search algorithms, which are outlined in Section 3.2 and Section 3.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Ordered Schema Graph</head><p>The ordered schema graph is generated from an XML Schema definition. As an example, see the XML Schema definition in Fig. <ref type="figure" target="#fig_0">3</ref> and the ordered schema graph in Fig. <ref type="figure" target="#fig_4">8</ref>.</p><p>Each node of an ordered schema graph represents an element node of the schema, the document node or a dummy node (called :EMPTY node). The :EMPTY node represents a whole choice expression or a whole sequence expression. There are three kinds of edges: parent-child edges represent a parent child relationship between element nodes, the document node and/or :EMPTY nodes. A sibling edge represents a directed sibling relationship between element nodes, the document node and/or :EMPTY nodes. Finally, an expression edge represents a relationship to a whole choice expression or a whole sequence expression between element nodes, the document node and :EMPTY nodes. We present here the general rules for generating the ordered schema graph from an XML Schema definition:</p><p>• We create a start node of type :DocumentNode, the child of which is a node representing that element, which is the root node of the XML document. In the example of Fig. <ref type="figure" target="#fig_0">3</ref> and Fig. <ref type="figure" target="#fig_4">8</ref>, N1 represents the document node and N2 represents the root element node.</p><p>• We add all (required, implied and fixed) attributes of any type of the corresponding XML element E to the nodes representing E. In the example of Fig. <ref type="figure" target="#fig_0">3</ref> and Fig. <ref type="figure" target="#fig_4">8</ref>, we add the attribute name to the node N2.</p><p>• We transform the right-hand side of an element declaration according to the following rules:</p><p>o Nodes of the ordered schema graph representing elements are the parents of the representation of the right-hand sides of their element declarations. In the example of Fig. <ref type="figure" target="#fig_0">3</ref> and Fig. <ref type="figure" target="#fig_4">8</ref>, N1 is the parent of N2. o Whenever an element E1 can be a following sibling node of another element E2, we insert a sibling edge from E2 to E1. This is the case for repetitions of elements (see right-hand sides of element declarations of conference, tutorial, paper and references in Fig. <ref type="figure" target="#fig_0">3</ref> and Fig. <ref type="figure" target="#fig_4">8</ref>) and for sequences of elements (see right-hand side of paper). Whenever the XML Schema defines an element to be a complexType defined by a choice (see right-hand side of conference) or a sequence (see right-hand side of paper), then we create an extra :EMPTY node for easy access to the whole choice expression or the whole sequence expression, respectively. As an example, see node N3 in Fig. <ref type="figure" target="#fig_4">8</ref> representing the xsd:choice element in Fig. <ref type="figure" target="#fig_0">3</ref> and node N7 representing the xsd:sequence element in Fig. <ref type="figure" target="#fig_0">3</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Satisfiability of an XPath Expression According to a Schema</head><p>Definition: An XPath expression XP is satisfiable according to a schema, if and only if there exists at least one document, which is valid according to the schema, where XP is evaluated to a non-empty result.</p><p>The problem of satisfiability of certain subclasses of XPath expressions (without respect to a schema) is in NP <ref type="bibr" target="#b15">[14]</ref>. We present here a fast (but incomplete) satisfiability test for XPath expressions according to a schema. The test is incomplete in the following way. The test returns not satisfiable if we are sure that the XPath expression is not satisfiable according to a schema. Otherwise the test returns may be satisfiable. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>2.</head><p>3., 8.</p><p>4., 9.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5.</head><p>6. We use a modified XPath evaluator to test whether or not the XPath expression XP is satisfiable according to a schema. The input of the modified XPath evaluator is the schema and XP. First, the ordered schema graph is created according to the schema by using the rules summarized in Section 3.1. After that, we execute the modified XPath evaluator, which performs the task to check whether or not all elements and attributes of paths within XP are available at the required hierarchical position. The modified XPath evaluator starts at the node of type :DocumentNode representing the document node. Within the ordered schema graph, there is all necessary information in order to execute the modified XPath evaluator, as the parent-child-axis and the next-sibling-axis are available in the ordered schema graph. The evaluator passes empty nodes without consuming an element of the XPath expression XP. In comparison to an XML document, the ordered schema graph can contain loops. Therefore, the modified XPath evaluator must consider loops when it revisits a node but did not process the next location step within XP. For this purpose, the modified XPath evaluator marks all the nodes that contribute to a successful evaluation of XP.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>7.</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Successful</head><p>As an example, see the steps 1, …, 9 performed on the ordered schema graph of Fig. <ref type="figure">9</ref> for the successful evaluation of the XPath query /conference/paper/references/conference/paper. As in step 9 the whole XPath expression is processed, this XPath query is considered to be satisfiable.</p><p>In general, with this technique, we can test whether or not a schema definition allows only XML documents for which a given XPath expression XP can never be successfully evaluated, i.e. for which the evaluation of the XPath expression returns an empty set. This subset of XQuery contains nested for-let-return clauses, if expressions, element and attribute constructors, declarations of functions and function calls.</p><p>We present here the general rules for generating the XQuery graph from an XQuery expression: We generate an own XQuery graph for each variable assignment $view of the XQuery expression. Every expression within the variable assignment of $view, which generates output, gets its own new node N representing the output. Variables (and also nested variables) are replaced with their content. We set a new node N as parent node of every node in the XQuery graph representing output, which could be generated as child node to the output of node N by the XQuery evaluator. Furthermore, we relate the new node N with every node in the XQuery graph representing output, which could be generated as sibling node to the output of node N by the XQuery evaluator, by a directed sibling relation. If the next generated output of the XQuery evaluator is a sub-tree of the input XML document specified by an XPath expression XP, we search within the ordered schema graph by the modified XPath evaluator as before, and we retrieve a node set SN of nodes of the ordered schema graph. We first copy the nodes of SN and copy all descendant nodes and all sibling nodes, which can be reached from the nodes of SN by parent-child relationships and sibling relationships. We copy also all parent-child relationships and sibling relationships for the copied nodes. Finally, we set the current node as parent node of the copies of SN. In the example of Fig. <ref type="figure">10</ref>, which represents the XQuery graph of the XQuery query of Fig. <ref type="figure">1</ref>, these copies consists of the node N2 and its descendant nodes N3 to N10, and the node N14 and its descendant node N15. We label the association with the XPath expression of the XQuery expression (or we use a reference; here, line number ( <ref type="formula">5</ref>) and ( <ref type="formula">12</ref>)).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Optimization</head><p>This section outlines how the XQuery graph is used to optimize queries. In the following paragraphs, we describe the general rules for optimizing the query.</p><p>The first step of the optimization approach is as follows: Whenever the content of a variable $view is queried by an XPath expression XP by $view/XP in the XQuery query, we process the following optimization steps. We execute the modified XPath evaluator on the XQuery graph of the variable assignment of $view with the input XPath query XP. In the case of the XQuery expression in Fig. <ref type="figure">1</ref>, the XPath query XP is /result/single_result/tutorial for $view. The modified XPath evaluator marks all nodes within the XQuery graph that contribute to a successful evaluation of the query XP in the same way as an XPath query is evaluated on the ordered schema graph.  In the second experiment, the size of the input XML document is constant (here approximately 2 Megabytes), but we vary the amount of paper and tutorial elements (see Fig. <ref type="figure" target="#fig_8">15</ref>). The speed-up factor of the optimized query varies from 14.6, when there are no tutorial elements, to 1, when there are only tutorial elements (see Fig. <ref type="figure" target="#fig_9">16</ref>).  sary parts. We follow this idea; however, we optimize even more, as we also consider the restrictions of copied sub-trees of the input XML document within the userdefined query and view.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Amount of tutorial elements in %</head><p>Our work was inspired by contributions in <ref type="bibr" target="#b12">[11,</ref><ref type="bibr" target="#b13">12,</ref><ref type="bibr" target="#b14">13]</ref>, which deal with XPath query reformulation according to an XSLT view and its optimization. In comparison, in this paper we describe general optimization rules, which are especially applicable for query reformulation on an XQuery view.</p><p>In comparison to all other approaches, we focus on the optimization of XQuery queries based on the schema of the input XML document in order to eliminate unnecessary query code, which computes not used intermediate results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Summary and Conclusions</head><p>In this paper we have examined general optimization rules, which are applicable for all XQuery queries, and which are very useful for example in the scenario of XQuery queries on XQuery views. First, we have introduced a tester, which checks whether or not a given schema allows only XML documents, where the evaluation of a given XPath expression returns an empty set for all valid XML documents. In the second step, we have extended the tester. The extended tester optimizes variable assignments of XQuery queries so that computations of unnecessary intermediate results are eliminated.</p><p>We have shown by experimental results that the evaluation of the optimized queries saves processing costs depending on the amount of saved unnecessary intermediate results.</p><p>Future work will include further optimization rules, which, for example, also optimize the order of operations within the XQuery expression.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Schema of input XML document p.xml</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Content of $view when evaluating the query of Fig. 1</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 6 .</head><label>6</label><figDesc>Fig.6. Optimized query of the query in Fig.1</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 7 .</head><label>7</label><figDesc>Fig. 7. Pattern of reformulating an XQuery query Q according to an XQuery view V</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 8 .</head><label>8</label><figDesc>Fig.8. Ordered schema graph of the schema defined in Fig.3</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Fig. 13 .</head><label>13</label><figDesc>Fig. 13. Experiment 1: Evaluation time depending on filesize</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Fig. 14 .</head><label>14</label><figDesc>Fig. 14. Speed-up factor in Experiment 1</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 15 .</head><label>15</label><figDesc>Fig. 15. Experiment 2: Evaluation time depending on the selectivity of the query for a constant filesize of 2 Megabytes</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Fig. 16 .</head><label>16</label><figDesc>Fig. 16. Speed-up factor of Experiment 2</figDesc></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">XQuery Graph</head><p>For the optimization rules, we consider that subset of XQuery, where the XQuery expression must conform to following rule Start in EBNF notation. Within a second step, we do not delete all the sub-expressions within the XQuery expression from which the marked nodes in the XQuery graph are generated and which do not assign variables, which are used in sub-expressions of marked nodes. The for-statement defines how often and in which order the result-statement is executed. Therefore, we do not delete for-statements, except if their returnstatement is reduced to an empty statement.</p><p>We delete all other unmarked sub-expressions and finally the optimized query remains.</p><p>The pseudo code of the entire algorithm for optimizing XQuery queries is given in Fig. <ref type="figure">12</ref>, which expects an XQuery query Q as input and which returns the optimized XQuery query Q' as output.</p><p>In the case of the XQuery query in Fig. <ref type="figure">1</ref>, we first generate the XQuery graph shown in Fig. <ref type="figure">10</ref> for the variable assignment of $view. Then we mark the nodes shown in Fig. <ref type="figure">11</ref>, and finally we optimize to the XQuery query in Fig. <ref type="figure">6</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm OptimizeQuery</head><p>Input: XQuery query Q conforming to rule Start in Section 3.3</p><p>Output:Optimized XQuery query Q'</p><p>(1) Generate abstract syntax tree T of Q (2) Compute XQuery graph XG with marked nodes of T <ref type="bibr" target="#b4">(3)</ref> all nodes in T, which correspond to marked nodes in XG Mark ( <ref type="formula">4</ref>) while(all children of a symbol ExprSingle of a LetClause expression are unmarked) do <ref type="bibr" target="#b6">(5)</ref> delete the whole LetClause expression (6) For all nodes n in T do <ref type="bibr" target="#b8">(7)</ref> If(n and its child nodes are unmarked and (n is a symbol ExprSingle and not(n is a parameter of a function call or n is a condition of an if-statement)) ) then <ref type="bibr" target="#b9">(8)</ref> delete n (and its children) (9) Compute Q' from the remaining nodes of T Fig. <ref type="figure">12</ref>. Algorithm for optimizing XQuery queries</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Performance Analysis</head><p>The test system for all experiments is a 1.7 GHz Intel Pentium 4 processor with 128 Megabyte RAM, Windows 2000 as the operating system and Java VM build version 1.4.2. We use the XQuery evaluator of Saxon version 7.9 <ref type="bibr" target="#b16">[15]</ref>.</p><p>We have generated test input XML documents of different size valid according to the schema of Fig. <ref type="figure">3</ref> for all experiments, where every paper and tutorial element contains exactly one empty author element. Furthermore, we have used the XQuery query of Fig. <ref type="figure">1</ref> and the optimized query of Fig. <ref type="figure">6</ref>. In the figures, we present the average of 10 experiments.</p><p>In the first experiment, the test input XML documents consist of the same amount of paper and tutorial elements. We increase the file size of the input XML documents from approximately 8 Kilobytes to approximately 8 Megabytes. Fig. <ref type="figure">13</ref> shows the evaluation time depending on the filesize in Kilobytes. The evaluation of the optimized query is approximately 1.8 times faster than the evaluation of the original query for file sizes larger than 1.5 Megabytes. Fig. <ref type="figure">14</ref> presents how much faster the evaluation of the optimized query is than the evaluation of the non-optimized query.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Related Work</head><p>The contributions <ref type="bibr" target="#b9">[8,</ref><ref type="bibr" target="#b11">10,</ref><ref type="bibr" target="#b21">20]</ref> introduce an algebra for XQuery. Additionally, they list transformation and optimization rules based on the introduced algebra, but they do not contain the optimization approach presented in this paper.</p><p>[2] describes how the language XQuery can be extended to support views. It describes the language extensions but does not describe how to optimize. <ref type="bibr" target="#b18">[17]</ref> projects XML documents to a sufficient XML fragment before processing XQuery queries. It contains a static path analysis of XQuery queries, which computes a set of projection paths formulated in XPath. Our approach optimizes the XQuery expression itself and does not project XML documents.</p><p>Whereas the complexity of XPath query evaluation on XML documents is examined in <ref type="bibr" target="#b10">[9]</ref>, we consider the complexity of our XPath query evaluation algorithm on an XQuery graph, which is a part of the proposed optimization steps for XQuery queries.</p><p>[7] uses graph schemas to optimize regular path expressions within queries for semistructured data. In comparison, we do not optimize a path expression according to a schema, but avoid unnecessary transformation steps by eliminating query code for the generation of output, which is not used further. Furthermore, we introduce an ordered schema graph (and an XQuery graph as extended version for XQuery expressions), which contain additional information in comparison to graph schemas like a sibling relationship, and we deal with XPath as path language.</p><p>[14] deals with the problem of satisfiability of XPath expressions without respect to schema information as e.g. by an XML schema definition. In comparison, we introduce a fast (but incomplete) test that checks whether or not a given XPath expression is valid according to a given schema and according to a given XQuery expression.</p><p>[16] deals with the test of satisfiability of tree pattern queries, which cover a fragment of XPath, without respect to a schema and with respect to an acyclic schema. <ref type="bibr" target="#b17">[16]</ref> discusses, when the test of satisfiability is NP-complete and when there exist polynomial time algorithm for the test of satisfiability. In comparison, we introduce a fast (but incomplete) satisfiability test according to a given schema, which can be also a cyclic schema, and according to a given XQuery expression.</p><p>Papakonstantinou et al. <ref type="bibr" target="#b20">[19]</ref> studies the inference of DTDs for views of XML data, but uses the language loto-ql for the definition of XML views, which is less powerful than XQuery. Furthermore, our approach optimizes all XQuery queries and cannot be only used in the scenario of XQuery queries on XQuery views. <ref type="bibr" target="#b5">[4]</ref> investigates XML document specifications with schemas and integrity constraints. It deals with the consistency problem, i.e. whether or not there exists an XML document that both conforms to the schema and satisfies the constraints. In comparison, we investigate XPath and XQuery expressions and present a fast (but incomplete) test that checks whether or not a given XPath expression is valid according to a given schema. <ref type="bibr" target="#b6">[5,</ref><ref type="bibr" target="#b22">21]</ref> describe frameworks for publishing relational data in XML. Both frameworks allow specifying view definitions formulated in XQuery, and optimize the reformulated query of a user-defined query according to a view by deleting unneces-</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<ptr target=":=" />
		<title level="m">RelativePathExpr) | RelativePathExpr. RelativePathExpr ::= (Step | PrimaryExpr)</title>
				<imprint/>
	</monogr>
	<note>FLRExpr ::= (ForClause | LetClause)+ &quot;return&quot; ExprSingle. ForClause ::= &quot;for. //&quot;) (Step | PrimaryExpr))*. Step ::= (&quot;child&quot; | &quot;descendant&quot; | &quot;attribute&quot; | &quot;self&quot; | &quot;descendant-or-self&quot; | &quot;following-sibling&quot; | &quot;following&quot; | &quot;parent&quot; |&quot;ancestor&quot; | &quot;preceding-sibling&quot; | &quot;preceding&quot; | &quot;ancestor-or-self. QName | &quot;node()</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><surname>Primaryexpr</surname></persName>
		</author>
		<title level="m">$&quot; QName | Constructor | FunctionCall. Constructor ::=</title>
				<editor>
			<persName><surname>Exprsingle</surname></persName>
		</editor>
		<imprint/>
	</monogr>
	<note>ExprSingle)*)?</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Efficient Filtering of XML documents for Selective Dissemination of Information</title>
		<author>
			<persName><forename type="first">M</forename><surname>Altinel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Franklin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 26th International Conference on Very Large Databases</title>
				<meeting>26th International Conference on Very Large Databases<address><addrLine>Cairo, Egypt</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Designing Valid XML Views</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">B</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">W</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LNCS</title>
		<imprint>
			<biblScope unit="volume">2503</biblScope>
			<biblScope unit="page" from="463" to="477" />
			<date type="published" when="2002">2002. 2002</date>
		</imprint>
	</monogr>
	<note>ER</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Reformulation of XML Queries and Constraints</title>
		<author>
			<persName><forename type="first">A</forename><surname>Deutsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tannen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDT 2003</title>
				<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="volume">2572</biblScope>
			<biblScope unit="page" from="225" to="241" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">On XML Integrity Constraints in the Presence of DTDs</title>
		<author>
			<persName><forename type="first">W</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Libkin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the ACM</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="368" to="406" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">SilkRoute: A Framework for Publishing Relational Data in XML</title>
		<author>
			<persName><forename type="first">M</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Kadiyska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Suciu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Database Systems</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="438" to="493" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Siméon</surname></persName>
		</author>
		<author>
			<persName><forename type="middle">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Choi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dinoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Gapeyev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Marian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Michiels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Onose</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Petkanics</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Radhakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Re</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Resnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vyas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wadler</surname></persName>
		</author>
		<ptr target="http://www.galaxquery.org" />
		<title level="m">Galax pre-release 0</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page">0</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Optimizing Regular Path Expressions Using Graph Schemas</title>
		<author>
			<persName><forename type="first">M</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Suciu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourteenth International Conference on Data Engineering (ICDE)</title>
				<meeting>the Fourteenth International Conference on Data Engineering (ICDE)<address><addrLine>Orlando, Florida, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Algebraic Transformation and Optimization for XQuery</title>
		<author>
			<persName><forename type="first">D</forename><surname>Fisher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Lam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Wong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">APWeb 2004</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">3007</biblScope>
			<biblScope unit="page" from="201" to="210" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">The Complexity of XPath Query Evaluation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Koch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pichler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22th ACM SIGMOD-SIGACT-SIGART symposium of Principles of database systems (PODS 2003)</title>
				<meeting>the 22th ACM SIGMOD-SIGACT-SIGART symposium of Principles of database systems (PODS 2003)<address><addrLine>San Diego, California, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Towards an Exhaustive Set of Rewriting Rules for XQuery Optimisation: BizQuery Experience</title>
		<author>
			<persName><forename type="first">M</forename><surname>Grinev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kuznetsov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ADBIS 2002</title>
				<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="volume">2435</biblScope>
			<biblScope unit="page" from="340" to="345" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">XPath Query Transformation based on XSLT stylesheets</title>
		<author>
			<persName><forename type="first">S</forename><surname>Groppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Böttcher</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fifth International Workshop on Web Information and Data Management (WIDM&apos;03)</title>
				<meeting><address><addrLine>New Orleans, Louisiana, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Efficient Querying of transformed XML documents</title>
		<author>
			<persName><forename type="first">S</forename><surname>Groppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Böttcher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Birkenheuer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">6th International Conference of Enterprise Information Systems (ICEIS 2004)</title>
				<meeting><address><addrLine>Porto, Portugal</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Using XSLT Stylesheets to Transform XPath Queries</title>
		<author>
			<persName><forename type="first">S</forename><surname>Groppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Böttcher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Heckel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Birkenheuer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Eighth East-European Conference on Advances in Databases and Information Systems (ADBIS 2004)</title>
				<meeting><address><addrLine>Budapest, Hungary</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Satisfiability of XPath Expressions</title>
		<author>
			<persName><forename type="first">J</forename><surname>Hidders</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DBPL</title>
		<imprint>
			<biblScope unit="volume">2921</biblScope>
			<biblScope unit="page" from="21" to="36" />
			<date type="published" when="2003">2003. 2004</date>
		</imprint>
	</monogr>
	<note>LNCS</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Saxon -The XSLT and XQuery Processor</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Kay</surname></persName>
		</author>
		<ptr target="http://saxon.sourceforge.net" />
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">On Testing Satisfiability of Tree Pattern Queries</title>
		<author>
			<persName><forename type="first">L</forename><surname>Lakshmanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 30 th VLDB Conference (VLDB 2004)</title>
				<meeting>the 30 th VLDB Conference (VLDB 2004)<address><addrLine>Toronto, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Projecting XML Documents</title>
		<author>
			<persName><forename type="first">A</forename><surname>Marian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Siméon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29 th VLDB Conference</title>
				<meeting>the 29 th VLDB Conference<address><addrLine>Berlin, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Oracle XQuery Technology -Preview</title>
		<ptr target="http://www.oracle.com/technology/tech/xml/xquery/index.html" />
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">DTD Inference for Views of XML Data</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Papakonstantinou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vianu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Nineteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2000)</title>
				<meeting>the Nineteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2000)<address><addrLine>Dallas, Texas, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Tree Logical Classes for Efficient Evaluation of XQuery</title>
		<author>
			<persName><forename type="first">S</forename><surname>Paparizos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">V S</forename><surname>Lakshmanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">V</forename><surname>Jagadish</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGMOD 2004</title>
				<meeting><address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Querying XML Views of Relational Data</title>
		<author>
			<persName><forename type="first">J</forename><surname>Shanmugasundaram</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kiernan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Shekita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Funderburk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27 th VLDB Conference</title>
				<meeting>the 27 th VLDB Conference<address><addrLine>Roma, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Tamino XML Server</title>
		<ptr target="http://www.softwareag.com/tamino/News/tamino_41.htm" />
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
		<respStmt>
			<orgName>Software AG</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Updating XQuery Views Published over Relational Data: A Round-Trip Case Study</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mulchandani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Rundensteiner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Xsym</title>
		<imprint>
			<biblScope unit="volume">2824</biblScope>
			<biblScope unit="page" from="223" to="237" />
			<date type="published" when="2003">2003. 2003</date>
		</imprint>
	</monogr>
	<note>LNCS</note>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">XML Path Language (XPath) Version 1</title>
		<author>
			<persName><surname>W3c</surname></persName>
		</author>
		<ptr target="http://www.w3.org/TR/xpath" />
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page">0</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">XQuery 1.0: An XML Query Language</title>
		<author>
			<persName><surname>W3c</surname></persName>
		</author>
		<ptr target="http://www.w3.org/TR/xquery" />
	</analytic>
	<monogr>
		<title level="m">W3C Working Draft</title>
				<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
