<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Determining the Output Schema of an XSLT Stylesheet</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sven Groppe</string-name>
          <email>Groppe@uibk.ac.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jinghua Groppe</string-name>
          <email>Jinghua@uibk.ac.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sven.Groppe</institution>
          ,
          <addr-line>Jinghua</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Innsbruck</institution>
          ,
          <addr-line>Technikerstrasse 21a, A-6020 Innsbruck</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The XSLT language is used to describe transformations of XML documents into other formats. The transformed XML documents conform to output schemas of the used XSLT stylesheet. Output schemas of XSLT stylesheets can be used for a static analysis of the used XSLT stylesheet, to automatically detect the XSLT stylesheet, which has been used for the transformation, of target XML documents or to reason on the output schema without access to the target XML documents. In this paper, we describe how to automatically determine such an output schema of a given XSLT stylesheet, where we only consider XML to XML transformations. The input of our proposed output schema generator is the XSLT stylesheet and the schema of the input XML documents. The experimental evaluation shows that our prototype can determine the output schemas of nearly all typical XSLT stylesheets.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Among other usages of XML, XML is the most widely used data model for
exchanging data on the web and elsewhere. For the exchange of data, we have to transform the
data from one format into another format whenever the two exchange partners use
different formats. The exchange partners can use different formats, which might be a
proprietary company standard, a proprietary application format or other standard
formats, for historical, political or other reasons. We focus on XSLT [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] as
transformation language for the XML data. In this paper, we describe how to determine output
schemas of XSLT stylesheets, where we use XML Schema [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] to express the output
schema. We define the term output schema as follows:
Definition 1 (output schema): The output schema OS of an XSLT stylesheet V must
be valid for all results of the XSLT stylesheet V, i.e. we require the following to hold:
∀ XML documents D: V(D) is valid according to OS.
      </p>
      <sec id="sec-1-1">
        <title>Output schemas of XSLT stylesheets can be used in several application scenarios:</title>
      </sec>
      <sec id="sec-1-2">
        <title>In the first application scenario, we have an already transformed XML document</title>
        <p>and a collection of XSLT stylesheets. In this application scenario, we want to exclude
those XSLT stylesheets, which were not used to retrieve the given transformed XML
document. This application scenario is useful, whenever it is unknown, which XSLT
stylesheet from a set of XSLT stylesheets has been used for transformation. However,
the user wants to execute this unknown XSLT stylesheet (which is available in a set of
XSLT stylesheets) again, because the source data has been changed. The output
schema of the XSLT stylesheets can be used for this test in the following way: If the
given transformed XML document is not valid according to the output schema of the
XSLT stylesheet, then we are sure that this XML document was not transformed by
the considered XSLT stylesheet.</p>
      </sec>
      <sec id="sec-1-3">
        <title>Another application scenario is the optimization of the execution times of XSLT</title>
        <p>
          stylesheets. The contributions [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] describe a search on the output nodes of
XSLT stylesheets as part of a static analysis of the XSLT stylesheets. This search
could be analogously done in the output schemas of the XSLT stylesheets.
        </p>
        <p>Within further application scenarios (see Figure 1), we use a given XSLT stylesheet
as XML view. Then we can use the output schema of the given XSLT stylesheet to
reason on it by using a satisfiability tester, an intersection tester, a containment tester
(which is sometimes called subsumption tester) or an equivalence tester. The input of
these testers is the output schema and queries. Figure 1 summarize some application
scenarios, where these testers can be used to reason on the output schema and given
queries for a special purpose, which is also described in the table of Figure 1 .</p>
        <sec id="sec-1-3-1">
          <title>Application</title>
        </sec>
        <sec id="sec-1-3-2">
          <title>Scenario</title>
          <p>Querying a
cache of
original data
or of
transformed data
Update of the
original data
of a cache
Querying
distributed
sources
Optimization
of disjunctive
queries
Access
control
Optimization
of answering
queries
Check of
user query
?
Used Tester
∩ ⊆ ≡
x x x
x
x
x
x
x
x</p>
        </sec>
        <sec id="sec-1-3-3">
          <title>Description</title>
          <p>Caches are designed to return answers faster compared to querying the
original source (and optionally transforming its data) so that the answer
time is reduced whenever results can be retrieved from the cache.</p>
          <p>Given a query Q and the content of the cache described by a query C. Q
can be partially answered by the cache if proving Q∩C≠{} is
successful. Q can be fully answered by the cache if Q⊆C can be proved. The
content of the cache can be returned as answer of Q if proving Q≡C is
successful.</p>
          <p>Given an update query U and the content of the cache described by a
query C. U describes the part of the original data that has been updated.</p>
          <p>We can check whether the content of the cache is valid by proving
U∩C={}.</p>
          <p>Given a query Q and the content of a source described by a query C.</p>
          <p>We can avoid querying the source if Q∩C={} can be proved. Avoiding
querying the source saves transportation costs and reduces the load of
the processor where the source is located.</p>
          <p>We can optimize a disjunctive query Q1|Q2 to Q2 if Q1⊆Q2. We can
save processing costs by applying this optimization.</p>
          <p>
            We can check access rights of users by using the approach of [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ].
          </p>
          <p>Given a query Q. We can avoid querying a source (or a view
respectively) if we can prove Q={}. Particularly in the case of views with
cost intensive operations, we can save processing costs.</p>
          <p>We can check a user query Q by testing Q={}, which is a hint that the
user query Q is not meaningful.</p>
          <p>Fig. 1: Application scenarios of the testers, where ? represents the satisfiability tester, ∩
represents the intersection tester, ⊆ represents the containment tester and ≡ represents the
equivalence tester</p>
        </sec>
      </sec>
      <sec id="sec-1-4">
        <title>We have implemented the output schema generator for XSLT stylesheets in a prototype. The experimental evaluation shows that our prototype can determine 95% of the output schemas of typical XSLT stylesheets.</title>
      </sec>
      <sec id="sec-1-5">
        <title>The rest of the paper is organized as follows. Section 2 describes the algorithm of the output schema generator. Section 3 presents the experimental evaluation. Section 4 deals with the related work. We end up with the summary in Section 5.</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2 Determining the Output Schema</title>
      <p>The goals of output schema generators are to compute the output schema definition as
restrictive as possible, but to guarantee the requirement of Definition 1. We describe
the output schema generator used in our prototype in the following sections in detail.
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"&gt;
&lt;xsl:template match="table"&gt;</p>
      <p>&lt;table&gt;&lt;xsl:apply-templates/&gt;&lt;/table&gt;
&lt;/xsl:template&gt;
&lt;xsl:template match="row"&gt;
&lt;address id="{id}" firstname="{firstname}" lastname="{lastname}"</p>
      <p>
        street="{street}" city="{city}" state="{state}" zip="{zip}"/&gt;
&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
Example 1 (Output schema of XSLT stylesheet): Figure 3 presents the output schema
of the XSLT stylesheet of Figure 2 computed with the output schema generator. The
output schema generator expects name attributes in the &lt;xsl:template&gt;
instructions, a start template matching “/” (in Figure 2 hidden as built-in template as
specified in [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]) and long forms when generating element or attribute nodes, i.e.
&lt;xsl:element name=”address”&gt;
      </p>
      <p>&lt;xsl:attribute name=”id”&gt;&lt;xsl:value-of select=”id”/&gt;&lt;/xsl:attribute&gt;
&lt;/xsl:element&gt;
instead of &lt;address id=”{id}”/&gt;. Note that XSLT stylesheets in general and in
particular the XSLT stylesheet of Figure 2 can be easily transformed into such a
representation.
2.1</p>
      <p>Processes of the determination of output schemas of XSLT stylesheets
The processes of the determination of the output schemas of XSLT stylesheets is
presented in Figure 4. First, we have to pre-process the schema of the input XML
documents of the XSLT stylesheets (see Section 2.2). Second, we parse an expression,
which is here an XSLT stylesheet. The result of the parsing is the abstract syntax tree
of the expression. For example, Figure 5 presents the abstract syntax tree of the XSLT
stylesheet of Figure 2. Third, we evaluate the attribute grammar described in Section
2.3 on this abstract syntax tree, which computes the attributes of the nodes in the
abstract syntax tree in the way we describe in Section 2.3. A special attribute of the root
node of the abstract syntax tree contains the result after the overall evaluation of the
attribute grammar. Fourth and at last, we have to post-process the determined output
schema, which we describe in Section 2.4.
2.2</p>
      <p>Preprocessing the XML Schema definition of the input XML document
In order to integrate parts of the XML Schema definition of the input XML document,
we perform the following tasks in the XML Schema definition of the input XML
document. We first embed all element declarations &lt;xsd:element name=N1&gt; on
the top level of the XML Schema definition of the input XML document in
&lt;xsd:group name=N2&gt; elements so that we can refer to these element
declarations. In order to correct the references to the embedded element declarations, we
replace all &lt;xsd:element ref=N1&gt; elements with &lt;xsd:group ref=N2&gt;
elements. Furthermore, we rename all names, which are also used in the generated
output schema so that they are not in conflict any more with the names of the
generated output schema. We will embed all referred groups of the XML Schema definition
of the input XML document in the output schema.</p>
      <p>Example 2: Figure 7 presents the pre-processed XML Schema definition of Figure 6.
2.3</p>
      <p>Attribute Grammar
For the description of the algorithm of the output schema generator, we use attribute
grammars.</p>
      <p>XSLT
stylesheet
(2) Generating the
abstract syntax
tree of the XSLT
stylesheet
abstract syntax
tree of the</p>
      <p>XSLT
stylesheet
(3) applying the
attribute grammar
attribute
grammar
XML Schema
of input XML
documents
(1) Pre-processing
of the schema of
the input XML
documents
pre-processed
XML Schema
of input XML
documents
intermediate
output schema</p>
      <p>(4)
postprocessing the
intermediate
output schema</p>
      <p>final
output schema
included in the figure, but only the first and the last attribute XSLT instructions of the
address element XSLT instruction are represented due to space limitations
“&lt;xsl:template match=‘“
“&lt;xsl:element name=‘“ String</p>
      <p>“‘&gt;“
XPath
“‘row‘“ “‘&gt;“
“address“
attribute
“&lt;xsl:attribute name=‘“</p>
      <p>String “‘&gt;“ Instructions
“id“
valueOf</p>
      <p>“&lt;/xsl:attribute&gt;“
“&lt;xsl:value-of select=‘“ String “‘/&gt;“</p>
      <p>“id“
“&lt;xsl:attribute name=‘“
template
Instructions
element
Instructions
…
“&lt;/xsl:template&gt;“
“&lt;/xsl:element&gt;“
attribute
valueOf</p>
      <p>“zip“
String “‘&gt;“ Instructions “&lt;/xsl:attribute&gt;“
“zip“
&lt;xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;
&lt;xsd:element name="table" minOccurs="1" maxOccurs="1"&gt;
&lt;xsd:complexType&gt;
&lt;xsd:element name="row" minOccurs="0" maxOccurs="unbounded"&gt;
&lt;xsd:complexType&gt;
&lt;xsd:sequence minOccurs="1" maxOccurs="1"&gt;
&lt;xsd:element name="id" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="firstname" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="lastname" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="street" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="city" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="state" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name="zip" minOccurs="1" maxOccurs="1"&gt;
&lt;complexType mixed='true'/&gt;&lt;/xsd:element&gt;
&lt;/xsd:sequence&gt;
&lt;/xsd:complexType&gt;
&lt;/xsd:element&gt;
&lt;/xsd:complexType&gt;
&lt;/xsd:element&gt;
&lt;/xsd:schema&gt;
Definition 2 (attribute grammar): An attribute grammar consists of a grammar G in
EBNF notation, attributes of symbols of G, and computation rules for the attributes
added to a production rule of G.</p>
      <p>
        In the following, we use the notation P { C }, where P is a production rule of G
in the EBNF notation and C contains the computation rules for attributes of symbols,
which occur in P. We use a slightly different variant of the Java notation in C. We
refer to an attribute a of the m-th symbol n in P as n[m].a. If there is only one
symbol n in P, we use n.a instead of n[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].a. If there exists an arbitrary number of a
symbol n in P, then i represents the concrete number of occurrences of the symbol n
in P. i1 and i2 respectively represent the concrete numbers of occurrences if two
symbols n1 and n2 respectively can both occur an arbitrary number of times in P.
      </p>
      <sec id="sec-2-1">
        <title>An attribute grammar G is evaluated on an expression E as follows: First the syntax</title>
        <p>tree of E is generated. Then we compute the attributes until all attributes are computed
or no new attributes can be computed. We compute an attribute a of a node N of the
syntax tree according to a rule r of G, if r represents the recognized situation, in
which N is embedded, and if all necessary attributes for the computation of a in r
have already been computed.</p>
      </sec>
      <sec id="sec-2-2">
        <title>In this section, we present the algorithm of the output schema generator for XSLT</title>
        <p>stylesheets. We first present the function getSchemaOfXPath, which is used in the
output schema generator. Afterwards, we present the algorithm of the output schema
generator in form of an attribute grammar.</p>
      </sec>
      <sec id="sec-2-3">
        <title>We define the following term for use in the function getSchemaOfXPath.</title>
        <p>Definition 3 (relative part and absolute part of an XPath expression): An XPath
expression I can be divided into a relative part rp(I) and an absolute part ap(I)
(both of which may be empty) in such a way that rp(I) contains a relative path
expression, ap(I) contains an absolute path expression, and the union of ap(I) and
rp(I), denoted as ap(I)|rp(I), is equivalent to I. This means that the
evaluation of I returns the same node set as the evaluation of ap(I)|rp(I) for all XML
documents and for all context nodes in the current XML document.
(1) Algorithm getSchemaOfXPath
(2) Input: XPath expression QXPath
(3) Output: XML fragment containing the XML Schema definition of the result of
QXPath
(4) XSD = XML Schema definition of the data in which context QXPath is applied
(5) Q = ap(QXPath)|/descendant-or-self::node(/self::node()|</p>
        <p>/attribute::node()|/namespace::node())/rp(QXPath)
(6) sp ← evaluateXPath(Q, &lt;(Q, start node of XSD, null, null)&gt;)
(7) if(sp.size()=1) {s1=;s2=;}
(8) else s1="&lt;xsd:choice&gt;";s2="&lt;/xsd:choice&gt;";
(9) for each spe ∈ sp do
(10) s3 = s3 + copy (last entry in spe).getNode() and its subtree;
(11)return createXMLFragment(s1 + s3 + s2);</p>
        <p>
          Whenever an XPath expression occurs in the XSLT stylesheet, the output schema
generator uses the function getSchemaOfXPath of Figure 8 for generating the
schema of the result of the XPath expression. getSchemaOfXPath executes a
search in the XML Schema definition of the data in which context the given XPath
expression QXPath is applied (see line (4), the search is described in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] in more
detail). The context of QXPath can be XML nodes of the input XML document or of a
variable. The schema of the input XML document or of the variable in which context
QXPath is applied has been already determined in our static analysis of the output
schema generators. In line (5), we replace the relative part of QXPath with an XPath
expression that describes a superset of the XML nodes described by the relative part
of QXPath, as we are not aware of the concrete context of QXPath. For determining the
XML nodes of the relative part of QXPath more exactly, we refer to [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ],
which contains static analysises that can be used for this purpose. We start the search
in the XML Schema definition in line (6). The result of the search in the XML Schema
definition and its subtree is the schema of the given XPath expression QXPath. Thus,
we copy these nodes of the XML Schema definition and return them as the resultant
schema of the given XPath expression QXPath in line (7) to line (11).
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>In this section, we describe the algorithm, which generates the output schema in</title>
        <p>
          form of an XML Schema definition from an XSLT stylesheet VXSLT. If VXSLT contains
an XSLT instruction &lt;xsl:copy-of&gt;, which copies XML nodes of the input XML
document, then our output schema generator requires the XML Schema definition of
the input XML document. We present the algorithm of the output schema generator in
the following attribute grammar. Note that the chosen attribute grammar covers many
XSLT stylesheets. If we consider the XSLT stylesheets of the XSLTMark benchmark
[
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], then we can determine the output schemas of 37 from 39 XSLT stylesheets of the
XSLTMark benchmark. One of the remaining XSLT stylesheets, the output schema of
which we cannot determine with the proposed attribute grammar for the determination
of the output schema, contains element declarations the names of which depend on the
input XML document. Another remaining XSLT stylesheet does not start with
&lt;xsl:stylesheet&gt;.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>The Start symbol is applied to the root node of the abstract syntax tree of the</title>
        <p>considered XSLT stylesheet. After evaluating the attribute grammar to the abstract
syntax tree, the schema attribute of the Start symbol contains the determined
output schema of the considered XSLT stylesheet.</p>
        <p>
          Start ::= "&lt;?xml version='1.0'?&gt; &lt;xsl:stylesheet&gt;" (attributeSet|decimalFormat)*
startTemplate (template)* "&lt;/xsl:xsl:stylesheet&gt;".
{ Start.schema = createXMLDocument(
"&lt;xsd:schema xmlns:xsd=’http://www.w3.org/2001/XMLSchema’&gt;" +
startTemplate.schema +
template[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].schema + … + template[i].schema+"&lt;/xsd:schema&gt;");}
        </p>
      </sec>
      <sec id="sec-2-6">
        <title>The name of the template is used as name of a group construct that contains the</title>
        <p>schema information of the currently considered template. We will refer to this group
construct whenever this template could be called. As the startTemplate symbol
represents the initial template with which the processing of the XSLT stylesheet starts,
we use the schema of this template as root nodes of the whole output schema and
register the group containing the schema of the initial template by a function
rootNodesGroup. Note that if no template matches the document node “/”, we have to
consider the built-in template of XSLT that matches the document node “/”. The
function generateNewUniqueName() returns a new unique name in order to be able
to refer to the schema of the template.
startTemplate ::= "&lt;xsl:template match='/' (name='" String "'")? "&gt;" Instructions
"&lt;/xsl:template&gt;".
{if(String occurs) nameString = String.getString();
else nameString = generateNewUniqueName();
startTemplate.schema = createXMLFragment(
"&lt;xsd:group name=’" + nameString + "’&gt;"
+ Instructions.schema + "&lt;/xsd:group&gt;");
template.attributes = Instructions.attributes;
template.name = String.getString();
rootNodesGroup(String.getString());}
template ::= "&lt;xsl:template match='" XPath "'" (name='" String "'")? "&gt;"
Instructions "&lt;/xsl:template&gt;".
{if(String occurs) nameString = String.getString();
else nameString = generateNewUniqueName();
template.schema = createXMLFragment("&lt;xsd:group name=’" +
nameString +
"’&gt;"+ Instructions.schema + "&lt;/xsd:group&gt;");
template.name = String.getString();
template.attributes = Instructions.attributes;}</p>
      </sec>
      <sec id="sec-2-7">
        <title>We concatenate the schemas of several instructions in a sequence. Furthermore, we</title>
        <p>check whether the content of the instructions is fixed and store the fixed value in
Instructions.fixed in this case.</p>
        <p>Instructions ::= (applyTemplates | forEach | copyOf | copy | variable | choose |
if | text | element | attribute | valueOf | callTemplate | attributeSet |
decimalFormat)*.
{if(only one text symbol is recognized)</p>
        <p>{Instructions.fixedFlag=true; Instructions.fixed=text.fixed;}
else Instructions.fixedFlag=false;
String attributes="";
attributes=attributes + (Recognized Symbol 1).attributes;
…
attributes=attributes + (Recognized Symbol i).attributes;
Instructions.attributes=attributes;
Instructions.schema = createXMLFragment("&lt;xsd:sequence&gt;" +
(Recognized Symbol 1).schema + … +
(Recognized Symbol i).schema + "&lt;/xsd:sequence&gt;");}</p>
      </sec>
      <sec id="sec-2-8">
        <title>We refer to the schema of a called function in case of a call-template XSLT</title>
        <p>instruction.
callTemplate ::= "&lt;xsl:call-template name=’" String "’/&gt;".
{callTemplate.schema = createXMLFragment("&lt;xsd:group ref=’" +</p>
        <p>
          String.getString() + "’ minOccurs='1' maxOccurs='1'/&gt;");
callTemplate.attributes = (Symbol with name String.getString()).attributes;}
A template
&lt;xsl:template
match=M&gt;
can
be called from
an
&lt;xsl:apply-templates select=S&gt; XSLT instruction, if M and S intersect.
For simplicity of presentation, we do not describe the determination of the concrete
context of M that could be determined by the methods described in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], but
we replace the relative part of M, i.e. rp(M), with a superset of the XML nodes
described by rp(M). This superset of the XML nodes described by rp(M) is
/descendant-or-self::node()(/self::node()|/attribute::node()| /namespace::node())/rp(M). We can use an
intersection tester, which consists of a reduction of the the intersection test to the
satisfiability test of XPath expressions [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and the satisfiability tester proposed in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] or [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
applyTemplates ::= "&lt;xsl:apply-templates select=’" XPath "’/&gt;".
{T = { sy | Symbol sy representing &lt;xsl:template match=M name=n&gt; of the XSLT
stylesheet, where maybe
(ap(XPath.getString())|/descendant-orself::node()(/self::node() | /attribute::node()|/namespace::node())/
rp(XPath.getString())) ∩ (ap(M)|/descendant-or-self::node()
(/self::node()|/attribute:: node()|/namespace::node())/rp(M))};
applyTemplates.schema = createXMLFragment("&lt;xsd:choice&gt;"
+ "&lt;xsd:group ref=" + T[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].name +
        </p>
        <p>" minOccurs='0' maxOccurs='1'/&gt;" + …
+ "&lt;xsd:group ref=" + T[i].name +</p>
        <p>
          " minOccurs='0' maxOccurs='1'/&gt;"
+ "&lt;/xsd:choice&gt;");
applyTemplates.attributes= createXMLFragment(T[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].attributes
        </p>
        <p>+ … + T[i].attributes);}</p>
      </sec>
      <sec id="sec-2-9">
        <title>The returned schema stored in Instructions.schema already contains an &lt;xsd:sequence&gt; specification.</title>
        <p>forEach ::= "&lt;xsl:for-each select=’" XPath "’&gt;" Instructions "&lt;/xsl:for-each&gt;".
{forEach.schema =
"&lt;xsd:sequence minOccurs=’0’ maxOccurs=’unbounded’&gt;" +
Instructions.schema + "&lt;/xsd:sequence&gt;";
forEach.attributes=Instructions.attributes;}</p>
        <p>The function storeVariableSchema(name,schema,attributes) is
used in order to store the schema schema and the attributes attributes of a
variable name. We access the schema of a variable if e.g. we analyze the schema of an</p>
      </sec>
      <sec id="sec-2-10">
        <title>XSLT instruction that copies the content of the variable with &lt;xsl:copy-of&gt;.</title>
        <p>variable ::= "&lt;xsl:variable name=’" String "’&gt;" Instructions "&lt;/xsl:variable&gt;".
{storeVariableSchema( String.getString(), Instructions.schema,</p>
        <p>Instructions.attributes);}</p>
        <p>The function getSchemaOfXPath(XP) returns the schema to which the result
of the XPath expression XP is valid. The function uses the schema information of the
input XML document if XP specifies XML nodes in the input XML document, or uses
the schema information of that variable the XML nodes of which are addressed by XP.
copyOf ::= "&lt;xsl:copy-of select=’" XPath "’/&gt;".
{copyOf.schema = getSchemaOfXPath(XPath.getString());
copyOf.attributes=;}</p>
        <p>The function getElementName(j) returns the element name of the j-th
element
declaration
at
top
level
of
the
considered
schema,
i.e.
schema.getElementName(2) returns “name2” if schema is
&lt;xsd:element name=”name1”/&gt;
&lt;xsd:element name=”name2”&gt;
&lt;xsd:complexType&gt;&lt;xsd:element name=”name3”/&gt;&lt;/xsd:complexType&gt;
&lt;/xsd:element&gt;.</p>
        <p>Let n contain the number of top level element declarations of the considered
schema.
copy ::= "&lt;xsl:copy" useAttributeSet? "&gt;" Instructions "&lt;/xsl:copy&gt;".
{SchemaInst = Instructions.schema;
currentElementNodes = getSchemaOfXPath("/descendant::node()");
if(useAttributeSet occurs) attributes = useAttributeSet.attributes;
else attributes="";
copy.schema = "&lt;xsd:choice&gt;" + "&lt;xsd:element name=’"+
currentElementNodes.schema.getElementName(1)+"’&gt;&lt;xsd:complexType mixed=’true’&gt;"+
schemaInst+"&lt;/xsd:complexType&gt;"+attributes+"&lt;/xsd:element&gt;"+ … +
"&lt;xsd:element name=’"+currentElementNodes.schema.getElementName(n)+
"’&gt;&lt;xsd:complexType mixed=’true’&gt;"+schemaInst+"&lt;/xsd:complexType&gt;"+
attributes+"&lt;/xsd:element&gt;"+"&lt;/xsd:choice&gt;";}</p>
      </sec>
      <sec id="sec-2-11">
        <title>Let name contain the names of the attribute sets and getSchemaOfAttrib</title>
        <p>
          uteSet(name) be a function, which searches for the attribute set with name name
in the correct scope and returns the declared attributes of its inner instructions.
useAttributeSet ::= "use-attribute-sets=’" name* "’".
{useAttributeSet.attributes=getSchemaOfAttributeSet(name[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ])+…+
        </p>
        <p>getSchemaOfAttributeSet(name[i]);}</p>
      </sec>
      <sec id="sec-2-12">
        <title>The schemas definitions of the attribute-set XSLT instructions are only con</title>
        <p>sidered if they are referred.
attributeSet ::= "&lt;xsl:attribute-set&gt;" Instructions "&lt;/xsl:attribute-set&gt;".
{attributeSet.schema="";}</p>
        <p>
          Let name contain an arbitrary name of an attribute.
decimalFormat ::= "&lt;xsl:decimal-format" + (name "=’" + String + "’")* + "/&gt;".
{decimalFormat.schema="";}
choose ::= "&lt;xsl:choose&gt;" when+ otherwise? "&lt;/xsl:choose&gt;".
{if(otherwise occurs) s = otherwise.schema;
else s="";
choose.schema = createXMLFragment("&lt;xsd:choice&gt;" +
when[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].schema + … + when[i].schema + s + "&lt;/xsd:choice&gt;");
choose.attributes = createXMLFragment(when[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].attributes + … +
        </p>
        <p>when[i].attributes);}
when ::= "&lt;xsl:when test=’" XPath "’&gt;" Instructions "&lt;/xsl:when&gt;".
{when.schema = Instructions.schema;
when.attributes = Instructions.attributes;}
otherwise ::= "&lt;xsl:otherwise&gt;" Instructions "&lt;/xsl:otherwise&gt;".
{otherwise.schema = Instructions.schema;
otherwise.attributes = Instructions.attributes;}
if ::= "&lt;xsl:if test=’" XPath "’&gt;" Instructions "&lt;/xsl:if&gt;".
{if.schema = createXMLFragment(
"&lt;xsd:sequence minOccurs=’0’ maxOccurs=’1’&gt;" + Instructions.schema +
"&lt;/xsd:sequence");
if.attributes = Instructions.attributes;}</p>
        <p>The content of an &lt;xsl:text&gt; XSLT instruction is used to specify fixed
elements or fixed attributes.
text ::= "&lt;xsl:text&gt;" String "&lt;/xsl:text&gt;".
{text.schema =; text.fixed = String.getString(); text.attributes=;}
element ::= "&lt;xsl:element name=’" String "’" useAttributeSet? "&gt;" Instructions
"&lt;/xsl:element&gt;".
{if(Instructions.fixedFlag=true) s = "fixed=’" + Instructions.fixed + "’";
else s=;
if(useAttributeSet occurs) attributes=useAttributeSet.attributes;
else attributes="";
element.schema = createXMLFragment("&lt;xsd:element name=’" +
String.getString() + "’" + s + "&gt;&lt;complexType mixed=’true’&gt;" +
Instructions.schema + "&lt;/xsd:complexType&gt;" + Instructions.attributes +
attributes + "&lt;/xsd:element&gt;");
element.attributes=;}
attribute ::= "&lt;xsl:attribute name=’" String "’&gt;" Instructions "&lt;/xsl:attribute&gt;".
{if(Instructions.fixedFlag=true) s = "fixed=’" + Instructions.fixed + "’";
else s=;
attribute.attributes = createXMLFragment(
"&lt;xsd:attribute name=’" + String.getString() + "’ type=’xsd:string’ " + s +
"/&gt;");
attribute.schema=;}
valueOf ::= "&lt;xsl:value-of select=’" String "’/&gt;".
{valueOf.schema=;valueOf.attributes=;}
2.4</p>
        <p>Post-processing the XML Schema definition generated by the output
schema generator
The XML Schema definition generated by the output schema generator must be
postprocessed for the following reasons:
1.</p>
        <p>The output schema generator for XSLT stylesheets embeds the schema definition
of the start template in a group, which does not declare the root element of the
schema. The root element of the schema can be declared in an &lt;xsd:element&gt;
declaration as child of the &lt;xsd:schema&gt; element of the XML Schema
defini</p>
        <p>tion.</p>
      </sec>
      <sec id="sec-2-13">
        <title>The output schema generators may generate not allowed circular definitions of</title>
        <p>groups for recursive functions.</p>
        <p>Different types of elements with the same name are not allowed in the same
scope. However, the output schema generators generate different types of
elements with the same name in the same scope whenever elements with the same
name
occur
in
a
sequence
of
element
constructors
(e.g.
&lt;a&gt;&lt;b/&gt;&lt;/a&gt;&lt;a&gt;&lt;c/&gt;&lt;/a&gt;) in the XSLT stylesheet respectively. We can
solve this problem by merging the schema of the types of these elements into one
unique type of elements with the same name in the same scope. The disadvantage
of this technique is a less precise output schema.
been post-processed so far.</p>
      </sec>
      <sec id="sec-2-14">
        <title>We post-process the XML Schema definition generated by the output schema gen</title>
        <p>erator for XSLT stylesheets (Case 1.) by copying the declaration of the top level
elements generated in the start template (or its called templates) to the child nodes of the
&lt;xsd:schema&gt; instruction of the XML Schema definition.</p>
      </sec>
      <sec id="sec-2-15">
        <title>In the case of circular definitions of groups (Case 2.), the group reference contain</title>
        <p>ing the circular definition of groups is not embedded in a complexType construct.
Then we transform the circular definition by “rolling out” into an iterated sequence.</p>
      </sec>
      <sec id="sec-2-16">
        <title>Thus, we proceed as follows:</title>
        <p>At each group reference R that contains a circular definition of groups, we insert an
&lt;xsd:sequence minOccurs=’1’ maxOccurs=’unbounded’&gt; instruction
(embedded in another &lt;xsd:sequence&gt; instruction as minOccurs and
maxOccurs attributes are not allowed in an &lt;xsd:sequence&gt; instruction directly after a
named group) at the beginning of the group in which R is contained. If the referred
group is the group in which the group reference is contained, we delete the group
reference. Otherwise we replace the group reference with copies of the content of the
referred group in an &lt;xsd:sequence&gt; instruction.</p>
        <p>Example 3 (Post-processed XML Schema definition): Figure 3 contains the
postprocessed XML Schema definition of Figure 9.</p>
      </sec>
      <sec id="sec-2-17">
        <title>In the case of different types Ti of elements Ei with the same name (Case 3.), we merge the schemas of the different types by declaring a new named complexType. In the new named complexType, we declare the different schemas in an &lt;xsd:choice&gt; instruction and we set the types in Ei to the named complex</title>
        <p>Type.</p>
        <p>Example 4 (Post-processed XML Schema definition): For example, the following
schema definition
&lt;xsd:sequence&gt;
&lt;xsd:element name=’a’&gt;
&lt;complexType mixed=’true’&gt;
&lt;xsd:element name=’b’&gt;&lt;complexType mixed=’true’/&gt;&lt;/xsd:element&gt;
&lt;/complexType&gt;
&lt;/xsd:element&gt;
&lt;xsd:element name=’a’&gt;
&lt;complexType mixed=’true’&gt;
&lt;xsd:element name=’c’&gt;&lt;complexType mixed=’true’/&gt;&lt;/xsd:element&gt;
&lt;/complexType&gt;
&lt;/xsd:element&gt;
&lt;/xsd:sequence&gt;</p>
        <p>will be transformed into
&lt;xsd:sequence&gt;
&lt;xsd:element name=’a’ type=’newName’/&gt;
&lt;xsd:element name=’a’ type=’newName’/&gt;
&lt;/xsd:sequence&gt;
&lt;xsd:complexType name=’newName’&gt;
&lt;xsd:choice&gt;
&lt;xsd:element name=’b’&gt;&lt;complexType mixed=’true’/&gt;&lt;/xsd:element&gt;
&lt;xsd:element name=’c’&gt;&lt;complexType mixed=’true’/&gt;&lt;/xsd:element&gt;
&lt;/xsd:choice&gt;
&lt;/xsd:complexType&gt;
We have implemented the output schema described above by coding the attribute
grammar in Java. We have used Java 1.5 and Windows XP as operation system.</p>
      </sec>
      <sec id="sec-2-18">
        <title>We have tested our prototype with the XSLT stylesheets of the XSLTMark bench</title>
        <p>
          mark [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which consists of 39 stylesheets. The designers of XSLTMark chose the
XSLT stylesheets in order to cover as many aspects of the XSLT language as possible.
        </p>
        <p>Our prototype first loads the XSLT stylesheet, the output schema of which we
want to determine, in the main memory by the DOM parser of Java 1.5. Small
problems occur in this phase, which have resulted in little changes of the original XSLT
stylesheets, which are listed in the following enumeration:
1. Whenever attributes of the XSLT stylesheets, which are enclosed by “ characters
in the XSLTMark benchmark and contain the character ‘, the DOM parser of Java</p>
      </sec>
      <sec id="sec-2-19">
        <title>1.5 raises errors. Thus, we have replaced the outer “ character by the ‘ characters</title>
        <p>and the inner ‘characters by the “ characters. For example, we have replaced the
line &lt;xsl:value-of select="concat ($bottles, ' bottles')"/&gt; of bottles.xsl
with the line &lt;xsl:value-of select='concat ($bottles, " bottles")'/&gt;.
2. Whenever &amp;lt; occurs in the attributes of the original XSLT stylesheets, the
DOM parser reports an error that the DOM parser does not expect the &lt; character
(to which &amp;lt; is expanded) here. The DOM parser interprets &amp;lt; as a start
sequence of a new tag and not as comparison operator within a string representing
the value of an attribute. Thus, we have replaced comparison expressions P1
&amp;lt; P2 with P2 &gt; P1 in the original XSLT stylesheets of the XSLTMark
benchmark. For example, we have replaced the line &lt;xsl:template
match="foo[following-sibling::*[position()&amp;lt;=2][self::barg] and
followingsibling::*[position()&amp;lt;=2][self::nar]]"&gt; of the xpath.xsl XSLT stylesheet
of the XSLTMark benchmark with the line &lt;xsl:template
match="foo[followingsibling::*[2&gt;=position()][self::barg] and following-sibling::*[2&gt;=position()]
[self::nar]]"&gt;.</p>
      </sec>
      <sec id="sec-2-20">
        <title>After modifying the original XSLT stylesheets of the XSLTMark benchmark in</title>
        <p>the proposed way, our prototype can successfully determine the output schemas of 37
out of 39 XSLT stylesheets of the XSLTMark benchmark. One of the remaining
XSLT stylesheets, the output schema of which we cannot determine with our
prototype of the output schema generator, contains element declarations the names of which
depend on the input XML document. Another remaining XSLT stylesheet does not
start with &lt;xsl:stylesheet&gt;.</p>
        <p>Overall, if we consider the XSLT stylesheets of the XSLTMark benchmark as
typical XSLT stylesheets, then our prototype can determine 95% of the output
schemas of the typical XSLTstylesheets.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 Further Related work</title>
      <p>
        [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] describe how to determine an XML Schema
according to a set of XML documents, but they do not describe how to determine the
output schema of XSLT stylesheets.
      </p>
      <sec id="sec-3-1">
        <title>To the best of our knowledge, there are no contributions to the determination of</title>
        <p>output schemas of XSLT stylesheets formulated in XML Schema.</p>
        <p>
          There are contributions [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], which deal with a search on the output nodes
of XSLT stylesheets as part of a static analysis for the optimization of the execution
time of XSLT stylesheets. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] transforms XQuery expressions into an graph, which
represents the output of XQuery expressions, which is also further used for the
optimization of the execution time of XQuery expressions.
        </p>
        <p>
          [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] focus on the satisifability problem of XPath
queries. There is a great amount of work dealing with query containment for XPath
([
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] and [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]). [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] describes how to reduce the containment and
intersection test of XPath expressions to the satisfiability test.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5 Summary and Conclusions</title>
      <p>We have proposed an approach for the determination of the output schema of an
XSLT stylesheet. An output schema of an XSLT stylesheet is the schema to which all
possible results of the XSLT stylesheet conform to. Possible application scenarios for
using the output schemas of XSLT stylesheets are the test, whether a given XSLT
stylesheet was not used to transform to a given XSLT stylesheet, for optimizations for
XSLT stylesheets and other application scenarios (see Figure 1), where logical testers
are used to reason on the output schemas.</p>
      <sec id="sec-4-1">
        <title>We have described the algorithm for determining the output schema of XSLT</title>
        <p>stylesheets by an attribute grammar. The experimental evaluation shows that our
prototype can determine 95% of the output schemas of typical XSLT stylesheets.</p>
      </sec>
      <sec id="sec-4-2">
        <title>The future work will cover to extend the prototype to handle also the rare cases, which we did not consider until now. Furthermore, we will consider determining also other output schemas such as the output schemas of XQuery expressions and the output schemas of other query and transformation languages outside the XML world.</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements Reference</title>
      <p>The work is funded by the European Commission under the projects ASG and
TripCom; by Science Foundation Ireland under the DERI-Lion Grant No.SFI/02/CE1/I13.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Amer-Uahis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. K. S.</given-names>
            <surname>Laksmanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          .
          <article-title>Mininization of tree pattern queries</article-title>
          .
          <source>SIGMOD</source>
          <year>2001</year>
          ,
          <string-name>
            <given-names>Santa</given-names>
            <surname>Barbara</surname>
          </string-name>
          , California, USA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Michael</given-names>
            <surname>Benedikt</surname>
          </string-name>
          ,
          <article-title>Wenfei Fan and Floris Geerts. XPath Satisfiability in the presence of DTDs</article-title>
          , PODS 2005, Baltimore, Maryland,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Böttcher</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Steinmetz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <source>Adaptive XML Access Control Based on Query Nesting, Modification and Simplification. BTW</source>
          <year>2005</year>
          , Karlsruhe, Germany,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>B.</given-names>
            <surname>Chidlovskii</surname>
          </string-name>
          .
          <article-title>Schema extraction from xml: A grammatical inference approach</article-title>
          .
          <source>In Proceedings of the International Workshop KRDB</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Deutsch</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tannen</surname>
          </string-name>
          .
          <article-title>Containment and Integrity Constraints for XPath</article-title>
          ,
          <source>In KRDB 2001, CEUR Workshop proceedings 45</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Developer</surname>
          </string-name>
          ,
          <source>XSLT Mark version 2.1</source>
          .0, http://www.datapower.com/xmldev/xsltmark.html,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Garofalakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gionis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rastogi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Seshadri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Shim. XTRACT</surname>
          </string-name>
          :
          <article-title>Learning document type descriptors from xml document collections</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <fpage>23</fpage>
          -
          <lpage>56</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Jinghua</given-names>
            <surname>Groppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sven</given-names>
            <surname>Groppe</surname>
          </string-name>
          .
          <article-title>A Prototype of a Schema-based XPath Satisfiability tester</article-title>
          .
          <source>17th International Conference on Database and Expert Systems Applications (DEXA</source>
          <year>2006</year>
          ), Krakow, Poland,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Jinghua</given-names>
            <surname>Groppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sven</given-names>
            <surname>Groppe</surname>
          </string-name>
          .
          <source>Filtering Unsatisfiable XPath Queries, 8th International Conference on Enterprise Information Systems (ICEIS</source>
          <year>2006</year>
          ), Paphos, Cyprus,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Sven</surname>
            <given-names>Groppe</given-names>
          </string-name>
          ,
          <article-title>XML Query Reformulation for XPath, XSLT and XQuery</article-title>
          . Sierke-Verlag, Göttingen, Germany,
          <year>2005</year>
          . ISBN 3-933893-24-0.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Sven</surname>
            <given-names>Groppe</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Böttcher</surname>
          </string-name>
          .
          <article-title>Schema-based query optimization for XQuery queries</article-title>
          .
          <source>ADBIS</source>
          <year>2005</year>
          , Tallinn, Estonia,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>S.</given-names>
            <surname>Groppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Böttcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Birkenheuer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Höing</surname>
          </string-name>
          .
          <article-title>Reformulating XPath Queries and XSLT queries on XSLT Views</article-title>
          .
          <source>Data &amp; Knowledge Engineering Journal</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sven</surname>
            <given-names>Groppe</given-names>
          </string-name>
          , Stefan Böttcher, Jinghua Groppe,
          <article-title>XPath Query Simplification with regard to the Elimination of Intersect and Except Operators, XSDM 2006 in conjunction with IEEE ICDE 2006, Atlanta</article-title>
          , USA,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>J. Hegewald</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Naumann</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Weis</surname>
          </string-name>
          ,
          <article-title>XStruct: Efficient Schema Extraction from Multiple and Large XML Documents</article-title>
          ,
          <article-title>XSDM 2006 in conjunction with IEEE ICDE 2006, Atlanta</article-title>
          , USA,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Hidders</surname>
            , Jan, Satisfiability of XPath Expressions,
            <given-names>DBPL</given-names>
          </string-name>
          <year>2003</year>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lakshmanan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <article-title>On Testing Satisfiability of Tree Pattern Queries</article-title>
          ,
          <source>In VLDB</source>
          <year>2004</year>
          , Toronto, Canada,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>G.</given-names>
            <surname>Miklau</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Suciu</surname>
          </string-name>
          .
          <article-title>Containment and equivalence for an XPath fragment</article-title>
          ,
          <source>In Proc. PODS'02</source>
          , pages
          <fpage>65</fpage>
          -
          <lpage>76</lpage>
          , Madison, Wisconsin,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>J.-K. Min</surname>
            ,
            <given-names>J.-Y.</given-names>
          </string-name>
          <string-name>
            <surname>Ahn</surname>
            , and
            <given-names>C.-W.</given-names>
          </string-name>
          <string-name>
            <surname>Chung</surname>
          </string-name>
          .
          <article-title>Efficient extraction of schemas for XML documents</article-title>
          .
          <source>Information Processing Letters</source>
          ,
          <volume>85</volume>
          :
          <fpage>7</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>S.</given-names>
            <surname>Nestorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Motwani</surname>
          </string-name>
          .
          <article-title>Extracting schema from semi-structured data</article-title>
          .
          <source>In Proceedings of the ACM International Conference on Management of Data (SIGMOD)</source>
          , pages
          <fpage>295</fpage>
          -
          <lpage>306</lpage>
          , Seattle, WA,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>F.</given-names>
            <surname>Neven</surname>
          </string-name>
          , T. Schewentick.
          <article-title>XPath containment in the presence of disjunction, DTDs, and variables</article-title>
          .
          <source>Proc. Of the 9th Int. Conf on Database Theory (IDCT)</source>
          ,
          <source>Siena, Italy. January 8-10</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>Q. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. X.</given-names>
            <surname>Yu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.-F.</given-names>
            <surname>Wong</surname>
          </string-name>
          .
          <article-title>Approximate graph schema extraction for semistructured data</article-title>
          .
          <source>In Proceedings of the International Conference on Extending Database Technology (EDBT)</source>
          , pages
          <fpage>302</fpage>
          {
          <fpage>316</fpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.P.T. Wood.
          <article-title>Containment for XPath Fragments under DTD constraints</article-title>
          .
          <source>Proc. of the 9th Int. Conf. on Database Theory (ICDT)</source>
          .
          <source>Siena, Italy, January</source>
          <volume>8</volume>
          -
          <issue>10</issue>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.World Wide Web Consortium (
          <issue>W3C</issue>
          ),
          <source>XSL Transformations (XSLT) Version</source>
          <volume>1</volume>
          .0,
          <string-name>
            <given-names>W3C</given-names>
            <surname>Recommendation</surname>
          </string-name>
          , http://www.w3.org/TR/1999/REC-xslt-
          <volume>19991116</volume>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <article-title>W3C: XML Schema Part 1: Structures Second Edition</article-title>
          .
          <article-title>W3C Recommendation, www</article-title>
          .w3.org/TR/xmlschema-1,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>