<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AA CCoommppaarriissoonn ooff EElleemmeenntt--bbaasseedd aanndd PPaatthh--bbaasseedd AApppprrooaacchheess ttoo IInnddeexxiinngg XXMMLL DDaattaa?∗</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michal Kra´tky´</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radim Baˇca Michal Kr´atky´</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radim Baˇca</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science</institution>
          ,
          <addr-line>VS</addr-line>
        </aff>
      </contrib-group>
      <fpage>103</fpage>
      <lpage>115</lpage>
      <abstract>
        <p>The mark-up language XML (Extensible Mark-up Language) is recently understood as a new approach to data modeling. A wellformed XML document or a set of documents is an XML database and the associated DTD or schema specified in the language XML Schema is its database schema. Implementation of a system enabling us to store and query XML documents efficiently (so called native XML databases) requires a development of new techniques that make it possible to index an XML document in a way that provides an efficient evaluation of a user query. Most of XML query languages are based on the language XPath and use a form of path expressions for composing more general queries. In the paper we compare element-based and path-based approaches to indexing XML data. In the case of element-based approaches query is evaluated step by step. Each step produces a lot of elements which may be refused in the next evaluation step. In the paper we show that the previously published multi-dimensional path-based approach overcomes conventional element-based approaches.</p>
      </abstract>
      <kwd-group>
        <kwd>indexing XML data</kwd>
        <kwd>XPath</kwd>
        <kwd>element-based approach</kwd>
        <kwd>pathbased approach</kwd>
        <kwd>multi-dimensional data structures</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The mark-up language XML (Extensible Mark-up Language) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is recently
understood as a new approach to data modelling [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. A well-formed XML
document or a set of documents is an XML database and the associated DTD or
schema specified in the language XML Schema [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] is its database schema.
Implementation of a system enabling us to store and query XML documents efficiently
(so called native XML databases) requires a development of new techniques [
        <xref ref-type="bibr" rid="ref16 ref5">16,
5</xref>
        ].
      </p>
      <p>
        An XML document is usually modelled as a graph the nodes of which
correspond to XML elements and attributes. The graph is mostly a tree (we consider
no attribute IDREFS now). To obtain specified data from an XML database a
? Work is partially supported by Grant of GACR No. 201/06/P113.
number of special query languages have been developed, e.g. XPath [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], and
XQuery [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. A common feature of these languages is a possibility to formulate
paths in the XML graph. Such a path is a sequence of element or attribute names
from the root element to a leaf. Regular expressions provide a valuable method
for paths specifications. In fact, most of XML query languages are based on the
XPath language that uses a form of path expressions for composing more general
queries. The XPath defines a family of 13 axes, i.e. relationship types in that an
actual element can be associated to other elements represented in the XML tree.
The family of axes defined in the XPath is designed to allow the set of graph
traversal operations that are seen to be atomic in XML document trees.
      </p>
      <p>In the past, there were many considerations about use of existing relational
or object-relational DBMSs for storing and querying XML data. Since a tree
is accessed during evaluation of a query, conventional approaches through the
conventional database languages SQL or OQL fail or they are not too efficient.
Consequently, a form of indexing is necessary.</p>
      <p>
        Recently there are several approaches to indexing XML or, more general,
semistructured data. Some of them are based on a traditional relational
technology (e.g. Lore [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and XISS [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]), the others use special data structures for
representation of XML data like trie (e.g. Index Fabric [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and DataGuide [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ])
or multi-dimensional data structures (e.g. XPath Accelerator [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]). The latter
approach uses R-trees but also B-trees as database indices in environment of a
relational DBMS. As it was expected, R-trees outperforms B-trees in this
proposal. The work [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] presents an index over the prefix-encoding of the paths in an
XML document tree, in which each leaf uL of the document tree is prefixed by
the sequence of element tags that one encounters during a path traversal from
the document root to uL. A more complete summary of various approaching to
indexing XML data is e.g. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        In the course of the development of XML databases the need for a benchmark
framework has become more and more evident: many different ways to store and
query XML data have been suggested in the past, e.g. XMark [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and XML Data
Repository [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        From the other point of view we distinguish element-based (e.g. XPath
Accelerator [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or XISS [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]) and path-based (e.g. [
        <xref ref-type="bibr" rid="ref11 ref13">13, 11</xref>
        ]) approaches to indexing
XML data. In the case of an element-based approach a query is evaluated step
by step. Each step produces a lot of elements which may be refused in the next
evaluation step.
      </p>
      <p>
        In the paper we compare XPath accelerator (XPA) element-based approach
and the multi-dimensional (MDA) path-based approach. We show that the
previously published multi-dimensional path-based approach (e.g. [
        <xref ref-type="bibr" rid="ref11 ref13">13, 11</xref>
        ]) overcomes
conventional element-based approaches. In Section 2 we describe XPA as a
typical representative of element-based approaches. In Section 3 we describe MDA
to indexing XML data. Section 4 reports on results of experiments for selected
XPath queries. In conclusion we summarize the paper content and outline
possibilities of a future work.
      </p>
    </sec>
    <sec id="sec-2">
      <title>XPath Accelerator (XPA)</title>
      <p>
        XPA [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is an indexing method for efficient evaluation of XPath queries. XPA
is an element-based approach to indexing semi-structured data applying
multidimensional data structures like R-tree or UB-tree but it can be realized in a
relational database as well.
2.1
      </p>
      <sec id="sec-2-1">
        <title>Model of XML documents</title>
        <p>Every XML document can be modelled as a tree where one tree node corresponds
to exactly one element or attribute of XML document. For support all XPath
axes XPA assigns 5-dimensional descriptor desc(v) to every node v
desc(v) = hpre(v), post(v), par(v), att(v), tag(v)i .</p>
        <p>The first attribute in the descriptor is preorder rank pre(v) of node v. P re(v)
corresponds to a document order of node v. Document order is defined as an
order in which nodes appear in XML serialization of a document. Otherwise
said, document order corresponds to the order in which nodes come in sequential
reading of text representation of XML document. P re(v) is assigned to node v
before any of its children is visited in sequential reading. The second attribute
is postorder rank post(v) of node v. This number is assigned to node v after all
its children are visited. Let v and v0 be evaluated nodes of XML tree. Then
– v0 is descendant of v
– v0 is following of v</p>
        <p>⇔
⇔</p>
        <p>pre(v) &lt; pre(v0) ∧ post(v0) &lt; post(v),
pre(v) &gt; pre(v0) ∧ post(v0) &lt; post(v).</p>
        <p>Similarly, relations ancestor and preceding can be evaluated between any
two nodes. If every node v of the tree has assigned pair (pre(v), post(v)) then
we are able to determine four major axes descendant, ancestor, f ollowing and
preceding for the node v with a single query. We call this node a context node.
The determination of axe for a context node means finding all nodes which are
in the appropriate relation with the context node.</p>
        <p>Example 1 (Evaluation of four major axis for context node).</p>
        <p>In Figure 1(a) we can see an XML document modelled as a tree. All nodes are
evaluated with the preorder and postorder rank. In Figure 1(b) we see pre/post
plane divided into four regions where every region corresponds to one axis. The
division is made for a context node h. Major axes for the context node h contain
the following nodes:
– a, f in axis ancestor :: ∗,
– k in axis f ollowing :: ∗,
– i, j in axis descendant :: ∗,
– b, c, d, e, g in axis preceding :: ∗.</p>
        <p>Descriptor desc(v) for a node v includes an attribute par(v) for support of
axes parent, child, f ollowing − sibling and preceding − sibling. The attribute
stores the parent’s preorder rank of a node v. Boolean attribute att(v) is true
if the node v is attribute. The attribute att is included to support of attribute
axis. Finally, attribute tag stores element (or attribute) name id. There is an
algorithm which can compute the descriptor desc for every node of an XML tree
during single sequential reading of an XML document.</p>
        <p>
          We use a term index for mapping term to its id. Every term which occurs
during parsing XML document is faced with term index and mapped to
appropriate id. Attribute and element names are stored in XPA index as a part of
descriptor dest, but also terms in an element content or attribute value should
be stored somehow. We decided to use the inverted list [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Value of every
element or attribute v is parsed into single words which are mapped to ids. We
store pairs (id, pre(v)) for every word in the inverted list. Consequently, we can
retrieve pre(v) of all nodes to be contain searched word.
2.2
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Querying in the XPA index</title>
        <p>XPA index is done after we map the whole XML document into 5-dimensional
space. We resolve location steps of XPath query step by step. The location step
consists of name of the axis, name of the node (nodeN ame) and predicate. The
predicate is optional but in the case there is some predicate we have to solve
axis::nodeName part and predicate separately and then we have to union the
results. We designed implementation of XPA so that it is possible to handle
nested predicates. Solving axis::nodeName part of one location step is realized
using query upon 5-dimensional space. We find all nodes inside 5-dimensional
cube as it is shown in Table 1 in more detail.</p>
        <p>Wildcard ’*’ in Table 1 means that this attribute can have any value to match
corresponding axis, but value of attribute tag depends on value of nodeN ame of
axe pre post par att tag
child (pre(v), ∞) [0, post(v)) pre(v) false *
descendant (pre(v), ∞) [0, post(v)) * false *
descendant-or-self [pre(v), ∞) [0, post(v)) * false *
parent [par(v), par(v)] [post(v), ∞) * false *
ancestor [0, pre(v)) (post(v), ∞) * false *
ancestor-or-self [0, pre(v)] [post(v), ∞) * false *
following (pre(v), ∞) (post(v), ∞) * false *
preceding [0, pre(v)) [0, post(v)) * false *
following-sibling (pre(v), ∞) (post(v), ∞) par(v) false *
preceding-sibling [0, pre(v)) [0, post(v)) par(v) false *
attribute (pre(v), ∞) [0, post(v)) pre(v) true *
location step. When the index applies R-trees or other multi-dimensional data
structure retrieving of all nodes inside 5-dimensional cube can be performed by
a single range query.</p>
        <p>XPath query is evaluated from one context node vc. XPath query consists of
a sequence of location steps. Query processing is done in these phases:
1. We obtain a set of nodes S1 as a result of evaluation of the first location step
from context node vc. We set i = 1.
2. The set Si is established as a set of context nodes for the following step.
3. We evaluate (i + 1)th location step for every context node from the set Si
and the result is a set of nodes Si+1. We increment i by one.
4. Phases 2 and 3 are repeated until the last location step of XPath query is
evaluated.
5. Set of nodes Si is the result of the XPath query.</p>
        <p>That means running many range queries during every phase 3. With
increasing number of location steps the execution time of the query increases as well.
Size of the set Si which is created during each location step may be much larger
then the size of the XPath query result. Such inefficiency leads to unnecessary
execution time overhead.
3</p>
        <p>
          Multi-dimensional Approach to Indexing XML Data
In [
          <xref ref-type="bibr" rid="ref11 ref12 ref13">13, 11, 12</xref>
          ] MDA was introduced. This path-based approach applies to
indexing XML data paged and balanced multi-dimensional data structures like
UB-trees [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], R-trees [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], R∗-trees [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], and BUB-trees [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Model of XML documents</title>
        <p>
          As mentioned above an XML document may be modelled by a tree, whose nodes
correspond to elements and attributes. String values of elements or attributes or
empty values occur in leafs. An attribute is modelled as a child of the related
element. Consequently, an XML document may be modelled as a set of paths
from the root node to all leaf nodes. Note, unique number idU (ui) of a node ui
(element or attribute) is obtained by counter increments according to the
document order [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Unique numbers may be obtained using an arbitrary numbering
schema. Of course, document order must be preserved.
        </p>
        <p>Let P be a set of all paths in a XML tree. The path p ∈ P in an XML tree is
sequence idU (u0), idU (u1), . . . , idU (uτP (p)−1), s, where τP (p) is the length of the
path p, s is PCDATA or CDATA string, idU (ui) ∈ D = {0, 1, . . . , 2τD − 1}, τD is the
chosen length of binary representation of a number from domain D. Node u0
is always the root node of the XML tree. Since each attribute is modelled as a
super-leaf node with CDATA value, nodes u0, u1, . . . , uτP (p)−2 represent elements
always.</p>
        <p>&lt;!DOCTYPE books [
&lt;!ELEMENT books(book)&gt;
&lt;!ELEMENT book(title,author)&gt;
&lt;!ATTLIST book id CDATA #REQUIRED&gt;
&lt;!ELEMENT title(#PCDATA)&gt;
&lt;!ELEMENT author(#PCDATA)&gt;
]&gt;
&lt;?xml version="1.0" ?&gt;
&lt;books&gt;
&lt;book id="003-04312"&gt;
&lt;title&gt;The Two Towers&lt;/title&gt;
&lt;author&gt;J.R.R. Tolkien&lt;/author&gt;
&lt;/book&gt;
&lt;book id="001-00863"&gt;
&lt;title&gt;The Return of the King&lt;/title&gt;
&lt;author&gt;J.R.R. Tolkien&lt;/author&gt;
&lt;/book&gt;
&lt;book id="045-00012"&gt;
&lt;title&gt;Catch 22&lt;/title&gt;
&lt;author&gt;Joseph Heller&lt;/author&gt;
&lt;/book&gt;
&lt;/books&gt;</p>
        <p>A labelled path lp for a path p is a sequence s0, s1, . . . , sτLP (lp) of names of
elements or attributes, where τLP (lp) is the length of the labelled path lp, and si
is the name of the element or attribute belonging to the node ui. Let us denote
the set of all labelled paths by LP. A single labelled path belongs to a path,
one or more paths belong to a single labelled path. If the element or attribute
is empty, then τP (p) = τLP (lp), else τP (p) = τLP (lp) + 1.</p>
        <p>Example 2 (Decomposition of XML tree to paths and labelled paths).
In Figure 2 we see an example of an XML document. In Figure 3 we see an XML
tree modelling the XML document. We see that this XML document contains
paths:
0
books (0)
book (15)
– 0,1,2,’003-04312’; 0,5,6,’001-00863’ ; and 0,9,10,’045-00012’ belong
to the labelled path books,book,id,
– 0,1,3,’The Two Towers’; 0,5,7,’The Return of the King’; and 0,9,11,
’Catch 22’ belong to the labelled path books,book,title,
– 0,1,4,’J.R.R. Tolkien’; 0,5,8,’J.R.R. Tolkien’; and 0,9,12,’Joseph
Heller’ belong to the labelled path books,book,author.</p>
        <sec id="sec-2-3-1">
          <title>Definition 1 (point of n-dimensional space representing a labelled path).</title>
          <p>Let ΩLP = Dn be an n-dimensional space of labelled paths, |D| = 2τD , and
lp ∈ LP be a labelled path s0, s1, . . . , sτLP (lp), where n = max(τLP (lp), lp ∈
LP) + 1. Point of n-dimensional space representing a labelled path is
defined tlp = (idT (s0), idT (s1), . . . , idT (sτLP (lp))) ∈ ΩLP , where idT (si) is a
unique number of term si, idT (si) ∈ D. A unique number idLP (lpi) is assigned
to lpi.</p>
        </sec>
        <sec id="sec-2-3-2">
          <title>Definition 2 (point of n-dimensional space representing a path).</title>
          <p>Let ΩP = Dn be an n-dimensional space of paths, |D| = 2τD , p ∈ P be a
path idU (u0), idU (u1), . . . , idU (uτLP (lp)), s and lp a relevant labelled path with
the unique number idLP (lp), where n = max(τP (p), p ∈ P) + 2. Point of
ndimensional space representing path is defined tp = (idLP (lp), idU (u0), . . . ,
idU (uτLP (lp)), idT (s)) ∈ ΩP .</p>
          <p>We define three indexes:
1. Term index. This index contains a unique number idT (si) for each term
si (names and text values of elements and attributes). The unique numbers
can be generated by counter increments according to the document order.
We want to get a unique number for a term and a term for a unique number
too. This index can be implemented by the B-tree.</p>
          <p>In Figure 3 we see the XML tree with unique numbers of terms in parenthesis.
2. Labelled path index. Points representing labelled paths together with
labelled paths’ unique numbers (also generated by counter increments) are
stored in the labelled path index.</p>
          <p>In Figure 3 we see that the document contains three unique labelled paths
books,book,id; books,book,title; and books,book,author. We create
points (0,1,2); (0,1,4); and (0,1,6) using idT of element’s and attribute’s
names. These points are inserted into a multi-dimensional data structure
with idLP 0, 1, and 2.
3. Path index. Points representing paths are stored in the path index.</p>
          <p>In Figure 3 we see unique numbers of elements. Let us take the path to the
value The Two Towers. Relevant labelled path book,book,title has got
idLP 1 (see labelled path index). We get point (1,0,1,3,5) after
inserting unique numbers of labelled path idLP , unique numbers of elements idU
and term The Two Towers. This point is stored in a multi-dimensional data
structure.</p>
          <p>
            An XML document is transformed to points of vector spaces and XML
queries are implemented using a multi-dimensional data structure queries. The
multi-dimensional data structures provide a nature processing of point or range
queries [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. The point query probes if the vector is or is not present in the data
structure. The range query searches all points in a query box T1 : T2 defined by
two points T1, T2.
3.2
          </p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Queries for values of elements and attributes</title>
        <p>Now, implementation of a query for values of elements and attributes and query
defined by a simple path based on an ancestor-descendent relation will be
described. Query processing is performed in three phases which are connected:</p>
        <sec id="sec-2-4-1">
          <title>1. Finding unique numbers idT of query’s term in the term index.</title>
        </sec>
        <sec id="sec-2-4-2">
          <title>2. Finding labelled paths’ idLP of query in the labelled path index.</title>
          <p>We search the unique numbers in a multi-dimensional data structure using
point or range queries.
3. Finding points in the path index. We find points representing paths in
this index using range queries. Now, we often want to retrieve (using labelled
paths and term index) names or values of elements and attributes.
In Figure 4(a)
we see that we
can model the
query (a) as a
tree. Consequently,
we can create the
range queries in
the same way as
the XML tree
is decomposed
to vectors of
multi-dimensional
spaces.</p>
          <p>Fig. 4. Trees modelling XPath queries for values of
elements or attributes: (a) /books/books[author=’Joseph</p>
          <p>Heller’] (b) /books//author.
Example 3 (Evaluation plan of the XPath query /books/book[author="Joseph
Heller"]).</p>
          <p>By the query we want to retrieve all books written by Joseph Heller:
1. Find idT of terms books, book, author, and Joseph Heller in the term
index.
2. Find a unique number idLP of the labelled path books,book,author in the
labelled path index, which was transformed to the point representing the
labelled path. We retrieve idLP = 2 of labelled path by the point query
(0,1,6).
3. Create two points defining a query box, which searches points relevant to
this query. The query box is defined by points (2,0,0,0,12) and (2,maxD,
maxD,maxD,12), where maxD is the maximal value of domain D of space
ΩP . idLP of the labelled path retrieved during the last phase is located in the
first points’ coordinates. idT of term Joseph Heller is located in the last
points’ coordinates. Since, we search points with arbitrary values of 2nd–4th
coordinates, the first point contains the minimal values of multi-dimensional
space’s domain and the second point contains the maximal values of the
domain.</p>
          <p>We need to distinguish labelled paths and paths belonging to element or
attribute. We solve it using flags added to points. Similarly, we can solve indexing
of more XML documents, which can be valid w.r.t different schema. Hence,
the multi-dimensional approach is hopeful for implementation of a native XML
database.
4</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental results</title>
      <p>
        In our experiments we compare XPA element-based approach with MDA
pathbased approach. We show the element-based XPA is less effective than MDA.
Both approaches are based on multi-dimensional data structures (R-tree [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
and Signature R-tree [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], respectively). The framework ATOM [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is applied
in our implementation of data structures. Although compared approaches are
very different we can compare some same parameters (e.g. DAC). Consequently,
we show the main disadvantage of element-based approaches. Single steps are
evaluated step by step during a query evaluation. Each step produces a lot of
elements which may be refused in the next evaluation step. We show the number
of the refused elements is rather large in the case of XPA.
      </p>
      <p>
        In our experiments1 we use the XMARK collection [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The collection
contains one file of the size 111MB. It includes 2,082,854 elements. Table 2 shows
statistics of XPA indices. In Table 3 tested queries are put forward. These queries
were selected wilfully. The first one includes 180 elements in the result, whereas
the second one includes 7.5× elements more than the first query, that is 1,350.
      </p>
      <p>Tables 5 and 6 show results of evaluation of queries Q1 and Q2, respectively,
in XPA. Tables includes surveyed parameters for each location step. In Table 4
such parameters are described.</p>
      <p>Inefficiency of element-based approach is obvious in the difference between
Nodes and Useful values. In the case of Q1 55,383 elements are retrieved but
1 The experiments were executed on an Intel Pentium r4 2.4Ghz, 1GB DDR400,
under Windows XP.</p>
      <p>Number of nodes in the result set after evaluation of one location step
Number of nodes which leads to at least one node in the next location step
Time for processing the step</p>
      <p>Number of access in indices
the result contains only 180 elements. In the case of Q2 27,493 elements are
retrieved but the result contains only 1,350 elements.
In the paper we compare XPA element-based approach and MDA path-based
approach. In the case of an element-based approach a query is evaluated step
by step. Each step produces a lot of elements which may be refused in the next
evaluation step. Results of our experiments prove the previously published MDA
path-based approach overcomes conventional element-based approaches. In our
future work, we would like further to improve the abilities and the efficiency
of MDA. In particular, we are going to develop an implementation of another
complex XML querying such XPath and XQuery query languages defined it. We
would like to use data types described by XML Schema for querying and develop
an efficient implementation of approximate querying of XML documents.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Amphora Research Group (ARG).
          <source>Amphora Tree Object Model (ATOM)</source>
          , http://arg.vsb.cz/,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Ribiero-Neto</surname>
          </string-name>
          .
          <article-title>Modern Information Retrieval</article-title>
          . Addison Wesley, New York,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>R.</given-names>
            <surname>Bayer</surname>
          </string-name>
          .
          <article-title>The Universal B-Tree for multidimensional indexing: General Concepts</article-title>
          .
          <source>In Proceedings of World-Wide Computing and Its Applications'97 (WWCA'97)</source>
          , Tsukuba, Japan, Lecture Notes in Computer Science. Springer-Verlag,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>N.</given-names>
            <surname>Beckmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schneider</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Seeger. The R</surname>
          </string-name>
          ∗
          <article-title>-tree: An efficient and robust access method for points and rectangles</article-title>
          .
          <source>In Proceedings of the 1990 ACM SIGMOD</source>
          , pages
          <fpage>322</fpage>
          -
          <lpage>331</lpage>
          . ACM Press,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Bourret</surname>
          </string-name>
          . XML and
          <string-name>
            <surname>Databases</surname>
          </string-name>
          ,
          <year>2001</year>
          , http://www.rpbourret.com/xml/XMLAndDatabases.htm.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>A. B. Chaudhri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Rashid</surname>
            , and
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Zicari. XML Data</surname>
          </string-name>
          <article-title>Management: Native XML and XML-Enabled Database Systems</article-title>
          . Addison Wesley Professional,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>B.</given-names>
            <surname>Cooper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. R.</given-names>
            <surname>Hjaltason</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Shadmon</surname>
          </string-name>
          .
          <article-title>A Fast Index for Semistructured Data</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB'01)</source>
          , pages
          <fpage>341</fpage>
          -
          <lpage>350</lpage>
          . Morgan Kaufmann,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fenk</surname>
          </string-name>
          .
          <article-title>The BUB-Tree</article-title>
          .
          <source>In Proceedings of 28rd VLDB International Conference on Very Large Data Bases (VLDB'02)</source>
          , Hongkong, China. Morgan Kaufmann,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>T.</given-names>
            <surname>Grust</surname>
          </string-name>
          .
          <article-title>Accelerating XPath Location Steps</article-title>
          .
          <source>In Proceedings of the 2002 ACM SIGMOD</source>
          , Madison, USA. ACM Press, June 4-6,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Guttman. R-Trees</surname>
          </string-name>
          :
          <article-title>A Dynamic Index Structure for Spatial Searching</article-title>
          .
          <source>In Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, Annual Meeting</source>
          , Boston, USA, pages
          <fpage>47</fpage>
          -
          <lpage>57</lpage>
          . ACM Press,
          <year>June 1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. M. Kr´atky´, J. Pokorny´,
          <string-name>
            <given-names>T.</given-names>
            <surname>Skopal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Sn</surname>
          </string-name>
          <article-title>´aˇsel. The Geometric Framework for Exact and Similarity Querying XML Data</article-title>
          .
          <source>In Proceedings of First EurAsian Conference</source>
          ,
          <source>EurAsia-ICT</source>
          <year>2002</year>
          , Shiraz, Iran, volume
          <volume>2510</volume>
          of Lecture Notes in Computer Science. Springer-Verlag,
          <source>October 27-31</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. M. Kr´atky´, J. Pokorny´, and
          <string-name>
            <given-names>V.</given-names>
            <surname>Sn</surname>
          </string-name>
          <article-title>´aˇsel. Implementation of XPath Axes in the Multi-dimensional Approach to Indexing XML Data</article-title>
          . In Current Trends in Database Technology,
          <source>International Workshop on Database Technologies for Handling XML information on the Web, DataX, Int'l Conference on Extending Database Technology (EDBT</source>
          <year>2004</year>
          ), volume
          <volume>3268</volume>
          of Lecture Notes in Computer Science. Springer-Verlag,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. M. Kr´atky´, J. Pokorny´, and
          <string-name>
            <given-names>V.</given-names>
            <surname>Sna</surname>
          </string-name>
          <article-title>´ˇsel. Indexing XML data with UB-trees</article-title>
          .
          <source>In Proceedings of Advances in Databases and Information Systems, ADBIS</source>
          <year>2002</year>
          , 6th East European Conference, Bratislava, Slovakia, volume Research Commmunications, pages
          <fpage>155</fpage>
          -
          <lpage>164</lpage>
          , September 8-
          <issue>11</issue>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Moon</surname>
          </string-name>
          .
          <article-title>Indexing and Querying XML Data for Regular Path Expressions</article-title>
          .
          <source>In Proceedings of 27th International Conference on Very Large Data Bases (VLDB'01)</source>
          . Morgan Kaufmann,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>J. McHugh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Abiteboul</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Goldman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Quass</surname>
            , and
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>Lore: A Database Management System for Semistructured Data</article-title>
          .
          <source>ACM SIGMOD Record</source>
          ,
          <volume>26</volume>
          (
          <issue>3</issue>
          ):
          <fpage>54</fpage>
          -
          <lpage>66</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>J. Pokorny</surname>
          </string-name>
          <article-title>´. XML: a challenge for databases?</article-title>
          , pages
          <fpage>147</fpage>
          -
          <lpage>164</lpage>
          . Kluwer Academic Publishers, Boston,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>J. W. R.</given-names>
            <surname>Goldman</surname>
          </string-name>
          .
          <article-title>DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases</article-title>
          .
          <source>In Proceedings of 23rd International Conference on Very Large Data Bases (VLDB'97)</source>
          , pages
          <fpage>436</fpage>
          -
          <lpage>445</lpage>
          . Morgan Kaufmann,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>A. R. Schmidt</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Waas</surname>
            ,
            <given-names>M. L.</given-names>
          </string-name>
          <string-name>
            <surname>Kersten</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Florescu</surname>
            , I. Manolescu,
            <given-names>M. J.</given-names>
          </string-name>
          <string-name>
            <surname>Carey</surname>
            , and
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Busse</surname>
          </string-name>
          .
          <source>The XML Benchmark. Technical Report INS-R0103</source>
          ,
          <string-name>
            <surname>CWI</surname>
          </string-name>
          , Amsterdam, The Netherlands, April,
          <year>2001</year>
          , http://monetdb.cwi.nl/xml/.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. University of Washington's database group.
          <source>The XML Data Repository</source>
          ,
          <year>2002</year>
          , http://www.cs.washington.edu/research/xmldatasets/.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>W3</given-names>
            <surname>Consortium. Extensible Markup</surname>
          </string-name>
          <article-title>Language (XML) 1.0</article-title>
          ,
          <string-name>
            <given-names>W3C</given-names>
            <surname>Recommendation</surname>
          </string-name>
          , 10
          <source>February</source>
          <year>1998</year>
          , http://www.w3.org /TR/REC-xml.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <article-title>W3 Consortium. XQuery 1.0: An XML Query Language</article-title>
          ,
          <source>W3C Working Draft, 12 November</source>
          <year>2003</year>
          , http://www.w3.org/TR/xquery/.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>W3</given-names>
            <surname>Consortium. XML Path</surname>
          </string-name>
          <article-title>Language (XPath) Version 2</article-title>
          .0,
          <issue>W3C</issue>
          Working Draft, 15
          <source>November</source>
          <year>2002</year>
          , http://www.w3.org/TR/xpath20/.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>W3</given-names>
            <surname>Consortium. XML Schema</surname>
          </string-name>
          <article-title>Part 1: Structure, W3C Recommendation, 2 May 2001</article-title>
          , http://www.w3.org/TR/xmlschema-1/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>