<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Application-Tailored XML Storage</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maxim Grinev</string-name>
          <email>maxim@grinev.net</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Shcheklein</string-name>
          <email>shcheklein@ispras.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for System Programming of the Russian Academy of Sciences</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Proceedings of the Spring Young Researcher's Colloquium on Database and Information Systems SYRCoDIS</institution>
          ,
          <addr-line>St.-Petersburg, Russia, 2008</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Several native approaches to storing large XML data sets exist. In all of these approaches the internal data representation is designed to support any ad-hoc XQuery query. In this paper we argue that XQuery and its data model are too universal and any one-size-fits-all XML representation leads to significant overheads in terms of representation size and complexity. Based on the consideration that in many applications queries/updates workload is known in advance and does not change often, we propose an application-tailored XML storage. Elimination of the superfluous XQuery data model features and utilization of the various physical data representations improve performance on the specified workload, while ad-hoc queries support can be limited.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        XML/XQuery [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provides great flexibility and
extensibility. It is a universal model to represent data
ranging from relational-like to content-oriented
(including mixed content). XML applications are quite
extensible as XML/XQuery can handle irregularity in
data. However, this flexibility/extensibility leads to
inefficiency. There are a number of general approaches
[
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
        ] that tackle the issue. However, each approach
has its own advantages and disadvantages that
make it applicable only to a particular type of
application (see the comparison with our approach in
the related work section below).
      </p>
      <p>
        Moreover, every one of these approaches supports many
features that are redundant for any particular
application. For example, suppose we have
relational-like data stored in XML. When we query such data we
do not usually use features such as sibling/parent axes or
document order, which all general
approaches support. Another example is a set of queries to
content-oriented XML (such as an encyclopedia article
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] or Microsoft Word XML format [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). Such queries
do not usually address rendering elements (such as para,
bold, emphasize, etc., which constitute the majority of elements in
content-oriented XML) but address meaningful
semantic elements such as author, date, bibliography,
etc., so rendering elements can be stored in a
compressed form that is not directly queryable to improve the efficiency
of updates and serialization.
      </p>
      <p>We believe that efficient storage and processing of
XML data cannot be achieved using any general
approach. The only way to achieve high efficiency
for such a universal and flexible model as
XML/XQuery is to choose appropriate data structures
and processing techniques for a given application (i.e.
a given XML schema and set of queries and updates). We
need to go beyond compiling query execution plans and
compile an XML storage tailored to a given workload
of queries/updates. This approach allows us to support
the flexible XQuery model at the logical level but eliminate the
XQuery data model overhead at the physical level. For
example, querying/updating relational-like data (and
even nested-table data) can be as efficient
as in relational databases.
Content-oriented data can be processed with efficiency
comparable to that of pure text-oriented systems.</p>
      <p>In this paper we summarize the preliminary results
of in-progress research on building application-tailored
XML storage.</p>
      <sec id="sec-1-1">
        <title>1.2 Paper Outline</title>
        <p>The rest of the paper is organized as follows. In the
next section we consider an example application
which motivates this work. In Section 3 we give a brief
overview of the proposed approach. In Section 4 we
propose a physical data representation and illustrate it with
an example. In Section 5 we survey related work
and consider existing approaches. Finally, Section 6
concludes and points out directions for our future work.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2 Motivating Example</title>
      <p>
        To illustrate the advantages and various aspects of
application-tailored XML storage we use a simplified
version of the Great Russian Encyclopedia (GRE)
application [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Figures 1-2 show a fragment of the
encyclopedia XML and its descriptive schema (by
definition, every path of the document has exactly one
path in the descriptive schema, and every path of the
descriptive schema is a path of the document). In the
example there is one volume that contains at least three
articles. Each article in turn consists of a title, a list of
authors and a body element that holds the content of the
article.
      </p>
      <p>For data processing, the encyclopedia application uses a
number of predefined XQuery queries which are not
likely to change often in the production system. Let us
consider some of them (Q1-Q5):</p>
      <p>(Q1): List all articles' titles.</p>
      <p>declare ordering unordered;
volume/article/title</p>
      <p>(Q2): Get article by id.</p>
      <p>declare ordering unordered;
volume/article[@id eq "..."]</p>
      <p>(Q3): Get article by title.</p>
      <p>declare ordering unordered;
volume/article[title eq "..."]</p>
      <p>(Q4): List articles referenced from the article "1".</p>
      <p>declare ordering unordered;
for $i in volume/article[@id eq "1"]//link
return volume/article[@id eq $i/@idref]/title</p>
      <p>(Q5): List articles which have references to the article
"atom".</p>
      <p>declare ordering unordered;
let $j := volume/article[title eq "atom"]/@id
for $i in volume/article
where $i//link[@idref eq $j]
return $i/title</p>
      <p>From this simplified example we can make
several observations about the workload
and the XML data which we believe are more or less
common to every XML processing application:</p>
      <p>1. Rendering Markup Content - the content part of
the XML data usually contains a lot of rendering
elements (e.g. HTML in Figures 1-2) whose only purpose
is to be used in front-end applications (like browsers
or word processors) to display the content properly on the
screen; they are not used directly in queries. A
rendering markup language often uses XML syntax and
puts additional stress on the XML database, since the database
cannot distinguish rendering elements from elements
which reflect application-level entities (articles,
persons, etc.) and are used extensively in application-defined
queries.</p>
      <p>2. Relational-Like Content - besides rendering
elements we can single out attributes and elements with
simple content (e.g. the id and idref attributes and the title
element in the example) which are intended to be used
in queries just as properties of the application-level
entities. For example, the id of an article is not of interest
to us in itself; rather, we are interested in the article with
a given id.</p>
      <p>3. Document Ordering Avoidance - quite often
document order of the result doesn’t make sense for the
application. Therefore in a lot of cases parent-child
relationships are the only relationship we actually need
between certain entities. In queries Q1 - Q5 we use
standard XQuery prolog declaration to turn off results
ordering.</p>
      <p>4. Known Workload - we can derive a number of
quite simple path expressions which serve as a basis
for all queries in the application. In our example we have
volume/article, volume/article/link,
volume/article/title and some modifications
with predicates.</p>
      <p>5. Fixed Workload - finally, just as in the GRE
application, we suppose that the basic queries
used in production systems are very rarely
changed. We do not have ad-hoc queries but a number
of well-defined, possibly parameterized expressions.</p>
      <p>As the paper progresses we will illustrate how these
observations can be used to adjust XML storage for the
specified queries.</p>
    </sec>
    <sec id="sec-3">
      <title>3 Approach Overview</title>
      <p>The approach we employ can be split into two distinct
phases: storage compilation for the initial workload and
a recompilation phase in which the storage is reorganized to
restore good performance after the workload has changed.
Below is a brief description of each phase from the
logical point of view. A sketch of the physical data
representation is given in the next section.</p>
      <sec id="sec-3-1">
        <title>3.1 Compiling Application-Tailored XML Storage</title>
        <p>Given a specific query workload (that can include
update queries) we compile optimized query plans for
the workload and also produce a proper storage plan. In
this plan features of XQuery data model that are not
required to execute the workload are eliminated.</p>
        <p>To build such a storage plan the following main
techniques can be used:</p>
        <p>
          1. Combining structural and textual data
representations (and using appropriate techniques to
process each type of representation). As we mentioned
above, the majority of elements (such as rendering and
grouping elements) in content-oriented XML are not
addressed by queries/updates at all or are addressed in such
a way that they can be parsed and processed efficiently
on the fly (using XML stream processing
techniques). We have designed a method in which
queries are analyzed to find the nodes of XML data that
should be stored in a structural way (preserving parent,
child and/or other relationships between nodes as in
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], for example) to make evaluation of the specified
queries and updates as efficient as possible. The
rest of the XML data (e.g. rendering elements) is stored as
compressed text in the same way as the text content of the
nodes. This method is quite flexible and is not restricted
to storing whole XML sub-trees as text: structured
elements may have textual children which in turn may
have structured elements inside (see Section 4 for an
illustration of the method). Compressing nodes into the text
representation does not eliminate the ability to
query them: they can still be processed effectively
using XML streaming techniques as in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
        <p>
          2. Using various schemes of clustering nodes in
blocks. Sedna structured representation [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] can be
combined with the Natix/DB2 [
          <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
          ] approach on the
basis of queries/updates analysis providing significant
performance improvements in comparison with both
approaches.
        </p>
        <p>3. Eliminating redundant structures (such as
redundant pointers, numbering labels, etc.) and
flattening the structure when possible (i.e. removing
grouping elements, etc.). By analyzing queries/updates we
can identify which structures are not needed to support
them. Redundant pointers and grouping
elements can be eliminated and the data can be represented
using more compact data structures. For example,
“relational-like” XML data can be stored in records
similar to those used in relational storage. Such records
are still less rigid than relational records, in order to support
possible irregularity in the data, but they are much more efficient
than any of the general approaches. We can also
flatten the structure of XML in many cases. For
example, if the person element contains the address
element, which in turn contains street, house and city as
sub-elements, the address structure can be flattened.</p>
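        <p>The flattening of the address structure can be sketched as follows. This is a minimal illustration in Python rather than the storage implementation: the function name flatten, the set of grouping tags and the sample person data are assumptions made only for this example.</p>
        <preformat>
```python
import xml.etree.ElementTree as ET


def flatten(element, grouping_tags):
    """Return a flat record (dict) for element, dropping grouping elements.

    Children whose tag is in grouping_tags are not stored themselves;
    their own children are lifted into the record of the parent.
    """
    record = {}
    for child in element:
        if child.tag in grouping_tags:
            record.update(flatten(child, grouping_tags))
        else:
            record[child.tag] = (child.text or "").strip()
    return record


def build_person():
    # Hypothetical sample data: person/address/{street, house, city}.
    person = ET.Element("person")
    ET.SubElement(person, "name").text = "Ann"
    address = ET.SubElement(person, "address")
    ET.SubElement(address, "street").text = "Main"
    ET.SubElement(address, "house").text = "5"
    ET.SubElement(address, "city").text = "Moscow"
    return person


# The grouping element "address" disappears from the stored record.
record = flatten(build_person(), {"address"})
print(record)
```
        </preformat>
        <p>A dictionary is used here because a missing sub-element simply leaves a key absent, which mirrors the remark that such records tolerate irregular data.</p>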
        <p>Note also that these techniques eliminate only
unnecessary XML-specific features of the XML/XQuery data
model. They lead neither to loss of data nor
to redundant data. However, our approach
can be naturally extended with such powerful
techniques as data projection on the basis of static query
analysis (e.g. identifying constant-based predicates) or
materialized views (which might lead to data redundancy).</p>
        <p>
          Last but not least, as a result of building
storage structures for a given application we get
simpler data structures than those used in general
approaches: elements are less interconnected. This
not only improves query/update performance but also
opens the door to finer locking granularity and to
building a distributed system. For example, we can
implement data parallelism on a shared-nothing
architecture. The main criticism of the shared-nothing
approach [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] is that it works well only for particular
query/update workloads. But in our approach we
optimize for a particular application, so shared-nothing
fits our approach well.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Recompiling Application-Tailored XML Storage</title>
        <p>When the application is modified we might need to
recompile the queries and the storage. We can employ a
flexible policy for recompiling the database. First, the
frequency at which we need to recompile the database
depends on the level of optimization that we choose when
customizing the storage for the given application (it might
be an optimization parameter, as in programming
language compilers: O1-O5). Second, recompilation might be
required less often than it might seem. Indeed, it is very
unlikely that new queries which address relational-like
XML will start using sibling pointers that were removed
at the storage compilation phase.</p>
        <p>But in the general case we might really need to
recompile the whole database into the new structures.</p>
        <p>
          The solution is as follows. The whole database can
be reconstructed using massively parallel distributed
processing. For small and medium-sized
databases this can be done quite fast even on
commodity hardware. If the reconstructed database is
distributed (see the discussion above of the possibility of building a distributed
system) it can be done really fast. In the case of a
simple partitioning scheme (when the database is
not optimized for things like collocated joins [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]) the
reconstruction can be done in parallel by simply transforming
each document independently. In a more complex case
(e.g. data are partitioned to use collocated joins) we can
employ techniques like map-reduce [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] to repartition
the data using the keys for different joins.
        </p>
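        <p>The two reconstruction cases can be sketched as follows. This is a single-process illustration in Python under stated assumptions: recompile_document is a hypothetical stand-in for the real per-document transformation, a thread pool stands in for a cluster, and repartition only mimics the grouping-by-key step of map-reduce.</p>
        <preformat>
```python
from concurrent.futures import ThreadPoolExecutor


def recompile_document(document):
    """Hypothetical per-document transformation into the new layout."""
    doc_id, fields = document
    # Stand-in for the real work: keep only what the new plan stores.
    kept = {k: v for k, v in fields.items() if k in ("id", "title")}
    return doc_id, kept


def recompile_all(documents, workers=4):
    # Simple case: documents are independent, so map over them in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recompile_document, documents))


def repartition(records, key, partitions):
    """Complex case: regroup recompiled records by a join key."""
    buckets = [[] for _ in range(partitions)]
    for doc_id, fields in records:
        buckets[hash(fields[key]) % partitions].append(doc_id)
    return buckets


# Hypothetical sample documents.
documents = [
    (1, {"id": "1", "title": "atom", "body": "..."}),
    (2, {"id": "2", "title": "molecule", "body": "..."}),
    (3, {"id": "3", "title": "atom", "body": "..."}),
]
records = recompile_all(documents)
buckets = repartition(records, "title", 2)
```
        </preformat>
        <p>Records with equal key values always land in the same bucket, which is the property a collocated join relies on.</p>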
        <p>
          There are two main options for reconstructing the
database. The simple solution is to stop the database and
reconstruct it. As this presumably does not take a long
time for a small or medium-sized database, the downtime
should not be long. The advanced solution is to use a
snapshot isolation (shadow/versioning) transaction
mechanism [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] to reconstruct the database without
stopping it.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Data Representation</title>
      <p>In this section we give a sketch of the physical data
representations used in application-tailored XML
storage to ensure the best performance for the given
workload.</p>
      <p>Let us return to the example described in Section 2
(Figures 1-2). The descriptive schema defines a decomposition
of XML nodes into groups according to their paths in the document.
For each group the best storage method is determined
according to the workload (Figure 3, a-c):</p>
      <p>
        Node descriptor - each node in a group can be
stored as a ‘node descriptor’ structure, which
reflects child/sibling/parent relationships between
nodes through direct pointers. This gives efficient
navigation and is well suited to evaluating
structural path
expressions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Besides (or along with) direct
pointers, a numbering scheme
can be employed in this approach [
        <xref ref-type="bibr" rid="ref11 ref9">11, 9</xref>
        ]. Every
node descriptor can have a label ‘nid’. The
main goal of using numbering scheme is to
quickly determine ancestor-descendant
relationship between any pair of nodes in the
hierarchy of XML data. It can also be used for
determining document order relationship.
      </p>
      <p>
        Value packed in node descriptor – like in
relational databases for some nodes we employ
structures similar to the relational records [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <sec id="sec-4-1">
        <title>Record packed in the node descriptor of the ancestor</title>
        <p>The record is packed in the node descriptor of the
ancestor (like the id and title values shown in
Figure 3, b). In this option we cluster an
application-level entity (like article or
person) with its “relational-like” flat
properties (e.g. id, name, etc.). This gives several
advantages. First, the representation is as
compact as possible since we do not have to
store irrelevant pointers and numbering
scheme labels. Second, it speeds up a large
number of path expressions, particularly those with
predicates on packed nodes
(like //article[@id eq "1"]). Finally, it
also speeds up serialization.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Node packed in text</title>
        <p>Nodes which are not
expected to be queried (e.g. rendering
elements) can be stored in textual form.
This saves space (since we do not
have to allocate data blocks for each type of
node) and speeds up serialization. As
mentioned in Section 3, this method is quite
flexible and is not restricted to storing
whole XML sub-trees as text: a textual node can
have placeholders inside for descendants
stored using the first two options.</p>
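        <p>Serialization of a textual node with placeholders can be sketched as follows. This is a minimal illustration in Python, not the storage format itself: the class PlaceholderText, the sample article and the use of '#' as a placeholder prefix in a string template are assumptions made for the example, and compression of the stored text is omitted.</p>
        <preformat>
```python
import string
import xml.etree.ElementTree as ET


class PlaceholderText(string.Template):
    """Stored text whose placeholders start with '#' instead of '$'."""
    delimiter = "#"


def build_stored_text():
    # Hypothetical article: the markup is built programmatically and then
    # kept only as a flat string, with placeholders for packed values.
    article = ET.Element("article", {"id": "#id"})
    ET.SubElement(article, "title").text = "#title"
    ET.SubElement(article, "author").text = "J. Smith"
    body = ET.SubElement(article, "body")
    para = ET.SubElement(body, "p")
    para.text = "See also "
    link = ET.SubElement(para, "link", {"idref": "#link"})
    link.text = "molecule"
    return ET.tostring(article, encoding="unicode")


def serialize(stored_text, packed_values):
    """Fill the placeholders with values kept in structured form."""
    return PlaceholderText(stored_text).substitute(packed_values)


stored = build_stored_text()
xml_text = serialize(stored, {"id": "1", "title": "Atom", "link": "2"})
print(xml_text)
```
        </preformat>
        <p>Serializing the article then costs one pass over the stored text plus a lookup per placeholder, with no traversal of rendering elements.</p>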
        <p>The database executor uses the descriptive schema to
determine the way a node is stored and as an entry
point to the data blocks in which the node is located.</p>
        <p>Figure 4 shows the storage plan for the example defined in
Section 2. According to this plan:</p>
        <p>• ‘article’ and ‘link’ nodes are stored in a
structural way using node descriptors, since
we directly query and serialize them in queries Q1-Q5.
Numbering scheme labels and some
pointers are eliminated because document
order and sibling axes are not used in the
workload;</p>
        <p>• the values of the ‘id’ and ‘idref’ attributes are packed in
their parents' node descriptors: they have
simple content and are used in predicates to
filter application-level entities like article
or link;</p>
        <p>• ‘title’ nodes are queried and serialized in
path expressions in Q1-Q5. They are
nevertheless flattened into their parents' node descriptors
since they have simple content;</p>
        <p>• ‘author’ nodes along with the content markup
nodes encircled in Figure 2 are packed in a
textual representation whose parent is the article
node. It is used only to serialize the article and has
#id, #title and #link placeholders for the
id, title and link values respectively.</p>
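        <p>The way such a plan follows from the workload can be sketched as follows. This is a simplified illustration in Python and not the optimizer itself: the SchemaNode type, the three-way rule and the listed paths (including the hypothetical rendering path volume/article/body/p) are assumptions chosen to reproduce the plan described above.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaNode:
    path: str             # path in the descriptive schema
    simple_content: bool  # attribute or element with text-only content


def storage_plan(schema_nodes, queried_paths):
    """Assign one of the three storage methods to each schema node."""
    plan = {}
    for node in schema_nodes:
        if node.path not in queried_paths:
            plan[node.path] = "text"
        elif node.simple_content:
            plan[node.path] = "packed in parent descriptor"
        else:
            plan[node.path] = "node descriptor"
    return plan


# Paths assumed for the encyclopedia example.
schema = [
    SchemaNode("volume/article", False),
    SchemaNode("volume/article/@id", True),
    SchemaNode("volume/article/title", True),
    SchemaNode("volume/article/author", True),
    SchemaNode("volume/article/body/p", False),
    SchemaNode("volume/article//link", False),
    SchemaNode("volume/article//link/@idref", True),
]
# Paths addressed by Q1-Q5.
queried = {
    "volume/article",
    "volume/article/@id",
    "volume/article/title",
    "volume/article//link",
    "volume/article//link/@idref",
}
plan = storage_plan(schema, queried)
```
        </preformat>
        <p>A real optimizer would also weigh update cost and access frequency; the rule above only shows how the three storage methods partition the schema for a fixed set of queried paths.</p>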
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Related work</title>
      <p>
        There are a few works on building customizable XML
storage. In the OrientStore [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] system the authors
propose an approach based on a combination of the Natix [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
and Sedna [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] storage strategies, but the choice of
strategy is data-driven (schema-driven) and the physical
representation contains all the features needed to execute any
ad-hoc query. There are a number of approaches, such as
LegoDB [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ] or XCacheDB [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], which are very
close to our work, but these systems are built on
relational storage, which brings its own limitations and
overhead. The ideas proposed in all these works might be
useful, but they do not provide a complete solution for
building application-tailored XML storage.
      </p>
      <p>
        What we propose should not be confused with
component databases [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ]. We think that component
databases are too general an approach to be
implemented efficiently in practice. We do not propose
to generate database systems for various kinds of
database applications such as OLTP, OLAP, etc. and for
various hardware and software platforms such as PDA,
desktop, or server. We just extend the idea of query
optimization from building a query execution plan to
also choosing storage structures. The aim is primarily
to reduce the XML/XQuery-specific overhead caused
by the extra flexibility and extensibility of XML/XQuery.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6 Conclusion and Future Work</title>
      <p>In this paper we propose a method of compiling
query-driven XML storage designed to reduce the overhead
caused by the universality of the XQuery data model.
According to our preliminary studies and experiments,
the proposed method allows us to reduce the size of the internal
representation by a factor ranging from several times to orders of
magnitude (and consequently to optimize buffer memory
usage) and to store data in a way that minimizes the
number of blocks addressed by the queries/updates.
The overall effect of such optimization should make
an XML database significantly more efficient.</p>
      <p>This paper reports the preliminary results of
in-progress research. Future work includes
prototyping the system and conducting performance
experiments. We are also going to design an XQuery
optimizer that constructs the storage plan automatically (or
semi-automatically using a small number of hints) and a
method of reconstructing the internal XML representation
which does not require database shutdown.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>XQuery 1.0: An XML Query Language</article-title>
          .
          <source>W3C Recommendation</source>
          , 23 January
          <year>2007</year>
          , w3.org/TR/2007/REC-xquery-20070123
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Fiebig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Helmer</surname>
          </string-name>
          et al.
          <article-title>Anatomy of a Native XML Base Management System</article-title>
          ,
          <source>The VLDB Journal</source>
          11(4),
          <year>2002</year>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nicola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>van der Linden</surname>
          </string-name>
          .
          <article-title>Native XML support in DB2 universal database</article-title>
          .
          <source>In Proceedings of the VLDB</source>
          , Trondheim, Norway,
          <year>2005</year>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grinev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fomichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Antipin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Boldakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lizorkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Novak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rekouts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pleshachkov</surname>
          </string-name>
          .
          <article-title>Sedna: A Native XML DBMS</article-title>
          , www.modis.ispras.ru/sedna
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Haustein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Härder</surname>
          </string-name>
          .
          <article-title>An efficient infrastructure for native transactional XML processing</article-title>
          .
          <source>Data Knowledge Eng.</source>
          , June
          <year>2007</year>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ehrli</surname>
          </string-name>
          .
          <article-title>Walkthrough: Word 2007 XML Format</article-title>
          . Microsoft Corporation, June 2006
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          "Great Russian Encyclopedia" Publishing Company, http://www.greatbook.ru/ (in Russian)
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chandrasekaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bamford</surname>
          </string-name>
          .
          <article-title>Shared Cache - The Future of Parallel Databases</article-title>
          .
          <source>In Proceedings of the ICDE</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fomichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grinev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          .
          <article-title>Descriptive Schema Driven XML Storage</article-title>
          .
          <source>Technical Report</source>
          , MODIS, Institute for System Programming of the Russian Academy of Sciences,
          <year>2004</year>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Join methods in partitioned database environments</article-title>
          , IBM DB2 Database Information Center, http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          .
          <article-title>MapReduce: Simplified Data Processing on Large Clusters</article-title>
          .
          <source>OSDI</source>
          , December
          <year>2004</year>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Berenson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Melton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>O'Neil</surname>
          </string-name>
          .
          <article-title>A Critique of ANSI SQL Isolation Levels</article-title>
          .
          <source>SIGMOD International Conference on Management of Data</source>
          , San Jose, May
          <year>1995</year>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>An</surname>
          </string-name>
          .
          <article-title>OrientStore: A Schema Based Native XML Storage System</article-title>
          .
          <source>In Proceedings of the VLDB</source>
          ,
          <year>2003</year>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bohannon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Haritsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramanath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Siméon</surname>
          </string-name>
          .
          <article-title>Bridging the XML Relational Divide with LegoDB</article-title>
          .
          <source>In Proceedings of the ICDE</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramanath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Haritsa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Roy</surname>
          </string-name>
          .
          <article-title>Searching for Efficient XML to Relational Mappings</article-title>
          .
          <source>Technical Report</source>
          , DSL/SERC, Indian Institute of Science,
          <year>2003</year>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Seltzer</surname>
          </string-name>
          .
          <article-title>Beyond Relational Databases: There is More to Data Access than SQL</article-title>
          ,
          <source>ACM Queue 3/3</source>
          , April
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>Rethinking Database System Architecture: Towards a Self-Tuning RISC-Style Database System</article-title>
          .
          <source>The VLDB Journal</source>
          ,
          <year>2000</year>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Florescu</surname>
          </string-name>
          et al.
          <article-title>The BEA Streaming XQuery Processor</article-title>
          .
          <source>The VLDB Journal 13/3</source>
          , September 2004
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Moon</surname>
          </string-name>
          .
          <article-title>Indexing and Querying XML Data for Regular Path Expressions</article-title>
          .
          <source>Proceedings of the VLDB Conference</source>
          , Roma, Italy,
          <year>2001</year>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ullman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>Database Systems: The Complete Book</article-title>
          . Prentice Hall,
          October
          <year>2001</year>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Balmin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Papakonstantinou</surname>
          </string-name>
          .
          <article-title>Storing and Querying XML Data Using Denormalized Relational Databases</article-title>
          .
          <source>The VLDB Journal</source>
          ,
          <volume>14</volume>
          (
          <issue>1</issue>
          ):
          <fpage>30</fpage>
          -
          <lpage>49</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>