<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Growing Triples on Trees: an XML-RDF Hybrid Model for Annotated Documents</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>François Goasdoué</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julien Leblay</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantinos Karanasos</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioana Manolescu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Katsis</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stamatis Zampetakis</string-name>
        </contrib>
        <aff>Leo team</aff>
        <aff>INRIA Saclay</aff>
        <aff>Univ. Paris-Sud</aff>
        <aff>INRIA Saclay</aff>
        <aff>ENS Cachan</aff>
        <aff>Univ. of Crete</aff>
        <email>firstname.lastname@inria.fr</email>
      </contrib-group>
      <abstract>
        <p>Content on today's Web is typically document-structured and richly connected; XML is by now widely adopted to represent Web data. Moreover, the vision of a computer-understandable Web relies on Web (and real world) resources described by simple properties having names or values; URIs are the normative method of identifying resources and RDF (the Resource Description Framework) enjoys important traction as a way to encode such statements. We present XR, a carefully designed hybrid model between XML and RDF, for describing RDF-annotated XML documents. XR follows and combines the W3C's XML, URI and RDF standards by assigning URIs to all XML nodes and enabling these URIs to appear in RDF statements. The XR management platform thus provides the capabilities to create and handle interconnected XML and RDF content. We define the XR data model and its query language, and present preliminary results with a prototype implementation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>This article was presented at the workshop Very Large Data Search (VLDS) 2011.</p>
      <p>Copyright 2011.</p>
      <p>[…] administrator adds annotations to the articles, classifying them
according to their topics selected from a publicly available
ontology on news articles. By reasoning on the ontology, a
user searching for an article topic (e.g., biology) also obtains
articles on more specific ones (e.g., bioengineering, biofuel
energy, etc.). Readers can also annotate the documents,
stating their personal opinion on an article or on specific
parts of it. Moreover, articles can be linked with each other.
For instance, when a specific article is also published in a
blog, this version can be linked to the newspaper version.
Hence, when searching for articles, it becomes feasible to
use semantic information that is not attached directly to a
given article, but to another linked version of it.</p>
      <p>Many more use cases of semantic annotations can be found.
Among them, and closely related to Web search, search engines
such as Google Search and Microsoft Bing have lately started to
let users reward search results they like, or to use information
available through social networks (e.g., friends lists) to return
results better suited to each user.</p>
      <p>Such annotations can be naturally expressed in RDF, which
is becoming the de facto standard for describing
semantically rich data. First, by assigning URIs to any part of
the XML articles, we can refer to these parts in RDF
statements and, thus, add semantic information on the articles at
any granularity, all the while leaving the original documents
intact. Second, in the spirit of the Linked Data initiative
(http://linkeddata.org), we can create semantic links
between documents. Finally, RDF, especially when combined
with RDF Schema (RDFS in short), enables the discovery
of new implicit knowledge through reasoning.</p>
      <p>Previous works have enabled the simultaneous
processing of XML and RDF data, typically by converting RDF to
XML (or vice-versa) and then using a language and
platform dedicated to the target format. Such translation
results are generally awkward and bury content under syntax.
Fundamentally, the data models are quite different: XML
emphasizes structural relationships, whereas RDF consists
of triples connected in arbitrarily complex graphs. As a
consequence, the challenges and opportunities for optimization
are very different for the two models. Thus, existing
optimizers and techniques for one model are unlikely to apply
well to data converted from the other model.</p>
      <p>At a more conceptual level, these approaches juxtapose
XML and RDF data, and do not consider the main idea of
this work, namely, turning XML nodes (or words) into RDF
resources, and connecting these data sources.</p>
      <p>
        In this work, we propose a unified model allowing the
combination of XML with RDF data into a single instance. Our
main contributions are: (i) a data model for annotated
documents which can express instances where XML and RDF
are interconnected (described in Section 2), (ii) a query
language for querying the unified data model (Section 3) and
(iii) an implemented system with experiments
encompassing the previous ideas (Section 4). The full technical report
describing our work is available at [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. THE XR DATA MODEL</title>
      <p>To represent annotated documents, we introduce the XR
data model, extending and combining the standard XML
structured data model and RDF semantic data model. An
instance of the XR data model comprises two sub-instances:
an XML sub-instance, consisting of a set of XML trees,
and an RDF one, consisting of a set of RDF triples. The
connection between the two is achieved by assigning to each
XML node a unique URI, which can then be referred to from
an RDF triple, as we will explain below.</p>
      <p>The XML sub-instance relies on a set 𝒩 of possible XML
element and attribute names, a set U of URIs, and a set L
of literals. An XML tree is defined as usual:</p>
      <p>Definition 2.1 (XML Tree). An XML tree is a
finite, unranked, ordered, labeled tree T = (N, E) with nodes
N and edges E, where each node n ∈ N is assigned a label
λ(n) ∈ 𝒩 and a type τ(n) ∈ {attribute, element, text}. An
attribute node must be the child of an element node; it has
a value belonging to L and it does not have any children. A
text node can only appear as a leaf.</p>
      <p>A set of XML trees forms an XML instance:</p>
      <p>Definition 2.2 (XML Instance). An XML instance
I<sub>X</sub> is a finite set of XML trees together with a function
assigning to each node of these trees a unique URI from U.</p>
      <p>The URI assignment function is crucial for
interconnecting the XML and RDF sub-instances. The unique identifiers
assigned to the nodes allow the RDF sub-instance to refer
to nodes of the XML sub-instance.</p>
      <p>In addition to the aforementioned sets U and L, the RDF
sub-instance also relies on a set of blank nodes B, whose role
is discussed below.</p>
      <p>Definition 2.3 (RDF Instance). An RDF instance
I<sub>R</sub> is a set of triples of the form (s, p, o), where s ∈ (U ∪ B),
p ∈ U, and o ∈ (L ∪ U ∪ B).</p>
      <p>As customary, the components of a triple (s, p, o) are
referred to as its subject, property and object, respectively.</p>
      <p>
        As defined above, the subject or the object of a triple can
be bound to a blank node. Blank nodes are used in RDF
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] to denote unknown URIs or literals, similarly to labeled
nulls in the database literature [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For instance, one can
use a blank node b1 in the triples (b1, country, “France”)
and (b1, city, “Paris”) to state that the country and city of
b1 are France and Paris, respectively, without using a concrete URI.
      </p>
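      <p>The definitions above can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions, not the XR implementation: the names XMLNode, XRInstance, add_node, add_triple and annotations_of are hypothetical, URIs are generated as "#1", "#2", ... in creation order, blank nodes are written with a "_:" prefix, and URIs and literals are both represented as plain strings.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


@dataclass
class XMLNode:
    uri: str                      # unique URI assigned to this node
    label: str                    # element/attribute name ("#text" for text nodes)
    kind: str                     # "element", "attribute" or "text"
    value: Optional[str] = None   # attribute value or text content
    children: List["XMLNode"] = field(default_factory=list)


class XRInstance:
    """Hypothetical container: XML trees plus RDF triples over the same URIs."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.nodes: Dict[str, XMLNode] = {}   # URI assignment function
        self.roots: List[XMLNode] = []        # one root per XML tree
        self.triples: Set[Triple] = set()     # RDF sub-instance

    def add_node(self, label, kind="element", value=None, parent=None):
        # Attribute and text nodes are leaves, so only elements accept children.
        if parent is not None and parent.kind != "element":
            raise ValueError("only element nodes may have children")
        node = XMLNode(f"#{next(self._ids)}", label, kind, value)
        self.nodes[node.uri] = node
        siblings = self.roots if parent is None else parent.children
        siblings.append(node)
        return node

    def add_triple(self, s: str, p: str, o: str) -> None:
        self.triples.add((s, p, o))

    def annotations_of(self, node: XMLNode) -> Set[Triple]:
        """Triples whose subject or object is the URI of the given XML node."""
        return {t for t in self.triples if node.uri in (t[0], t[2])}


xr = XRInstance()
feed = xr.add_node("NewsFeed")
article = xr.add_node("Article", parent=feed)
entity = xr.add_node("NamedEntity", parent=article)
text = xr.add_node("#text", kind="text", value="ACME", parent=entity)

xr.add_triple(article.uri, "reportOn", entity.uri)   # annotates two XML nodes
xr.add_triple("_:b1", "country", "France")           # blank-node subject
print(xr.annotations_of(entity))
```
]]></preformat>
      <p>In this sketch the XML documents are never modified when a triple is added: annotations live in a separate set and reach the trees only through node URIs.</p>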
      <p>
        Another advantage of adopting RDF is that XR inherits
its reasoning capabilities. As described in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], an RDF data
set contains not only explicit triples but also implicit triples,
derived from the explicit ones through a set of entailment
rules, the most important of which are due to knowledge
encoded in RDF Schemas. In the example discussed in the
introduction, an RDF Schema specifies that bioengineering
is a subClassOf biology. Assuming the paragraph p1 is
related to bioengineering, modeled by the explicit RDF triple
(p1, relatedTo, bioengineering), the implicit triple (p1,
relatedTo, biology) is also conceptually part of the data set.
Implicit triples must be reflected in RDF query answers.
Accordingly, XR queries (presented in Section 3) also consider
that implicit triples are part of the RDF sub-instance. The
management of implicit RDF (and thus, XR) triples is an
orthogonal issue, which we do not discuss further here.
      </p>
      <p>We can now define an XR instance as follows:</p>
      <p>Definition 2.4 (XR Instance). An XR instance is a
pair (I<sub>X</sub>, I<sub>R</sub>), where I<sub>X</sub> and I<sub>R</sub> are an XML and an RDF
data instance, respectively, built upon the same set of URIs.</p>
      <p>Note that the XML and the RDF sub-instances are
defined over the same set of URIs U, thus allowing RDF triples
to annotate XML nodes, as shown in the following example.</p>
      <p>[Figure 1: An XR instance. RDF sub-instance (top): (#3, ⟨sameAs⟩, #13), (#3, ⟨reportOn⟩, #8), (_:B, ⟨workAt⟩, #3), (#8, rdf:type, ⟨Organization⟩), (#8, ⟨sameAs⟩, dbpedia:ACME), (#8, rdf:type, ⟨Company⟩), (_:B, ⟨email⟩, “bob@example.com”), (⟨Company⟩, rdfs:subClassOf, ⟨Organization⟩). XML sub-instance (bottom): a NewsFeed #1 tree with Article #2 (…) and Article #3, the latter containing Title #4 (“Rise and Fall of Phone Carrier” #5), Story #6 (“Once upon a time...” #7, NamedEntity #8 with text “ACME” #9, “... happily ever after.” #10) and Date #11 (“03-04-2011” #12); and a NewsArchive tree (…) containing Report #13.]</p>
      <p>Example. Figure 1 shows an XR instance. The XML
sub-instance (at the bottom of the figure) consists of two trees.
The first represents a news feed composed of articles. Each
article has a story (consisting of text nodes and a
named-entity identifying a company name), a title and a date. The
second tree refers to a news archive. Each XML node has a
subscript corresponding to its URI. The RDF sub-instance
comprises RDF triples which refer to (annotate) the XML
sub-instance by using the URIs of the XML nodes. For
instance, we state that the article with URI “#3” is a report
on the named-entity with URI “#8”, which is of type
Organization. Moreover, in the context of Open Linked Data, we
use the sameAs property to show that two resources are the
same, e.g., the nodes with URIs “#3” and “#13” in the
figure. RDF triples may also contain blank nodes (e.g., _:B).
Finally, some triples may be implicitly derived, such as the
one stating that the named-entity is an Organization, which
is inferred from the facts that Company is a subclass
of Organization and the named-entity is a Company.</p>
    </sec>
    <sec id="sec-3">
      <title>3. THE XRQ QUERY LANGUAGE</title>
      <p>Given an XR data instance, users should be able to query
the data based on both its structure, described in the XML
sub-instance, and its semantic annotations, stored in the
RDF sub-instance. To this end, we design the XRQ query
language. In XRQ, the XML sub-instance is queried using
tree patterns, while the RDF sub-instance is queried through
triple patterns. Both are defined below. Most importantly,
reusing variables across tree patterns and triple patterns
makes it possible to relate nodes and resources of the two sub-instances.</p>
      <p>[Figure 2: An XRQ query. Triple patterns (a): ($B, rdf:type, ⟨Organization⟩), ($A, ⟨workAt⟩, $B), ($A, ⟨email⟩, $C). Tree patterns (b) and (c).]</p>
      <p>Definition 3.1 (Tree Pattern). A tree pattern is a
finite, ordered, unranked tree whose nodes are labeled with names from 𝒩
(the set of XML element and attribute names), with two types of
edges, namely child and descendant edges. We may attach
to each node at most one uri variable, one val variable and
one cont variable. We may also attach to a node an equality
predicate of the form [val=c] for some c ∈ L.</p>
      <p>
        Our tree patterns are a variant of the tree patterns presented in
the literature [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] with the additional capability of attaching
one or more variables of different types to the nodes.
Variables serve two purposes: (i) to denote data items that are
returned by the query (in the style of distinguished variables
in conjunctive queries) and (ii) to express joins between tree
or triple patterns. The variable type specifies the exact
information item of an XML node to which the variable
will be bound. When a node n<sub>t</sub> of a tree pattern is matched
against a node n<sub>d</sub> of an XML tree, the variables attached to
n<sub>t</sub> are bound as follows. A uri variable is
bound to the URI of n<sub>d</sub>. If n<sub>d</sub> is an element, a val variable
is bound to the concatenation of all text descendants of n<sub>d</sub>;
if n<sub>d</sub> is an attribute, a val variable is bound to the attribute
value. Finally, a cont variable is bound to the serialization
of the subtree rooted at n<sub>d</sub>. The semantics of val variables
follow from the XPath and XQuery specifications [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
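      <p>The three kinds of bindings can be illustrated on a small document. The following Python sketch uses the standard xml.etree.ElementTree module; the helper names val_of and cont_of and the dictionary uri_of are hypothetical, the "#n" URIs are an arbitrary numbering chosen here, and only element nodes receive a URI because ElementTree does not model text nodes as separate objects.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import copy
import xml.etree.ElementTree as ET


def val_of(element):
    """val binding of an element: concatenation of all its text descendants."""
    return "".join(element.itertext())


def cont_of(element):
    """cont binding: serialization of the subtree rooted at the element."""
    # ElementTree also serializes the text that follows an element (its tail),
    # which is not part of the subtree, so it is dropped on a shallow copy.
    clone = copy.copy(element)
    clone.tail = None
    return ET.tostring(clone, encoding="unicode")


story = ET.fromstring(
    "<Story>Once upon a time... "
    "<NamedEntity>ACME</NamedEntity>"
    " ... happily ever after.</Story>"
)
entity = story.find("NamedEntity")

# Hypothetical URI assignment for element nodes, in document order.
uri_of = {el: f"#{i}" for i, el in enumerate(story.iter(), start=1)}

print(uri_of[entity])   # uri binding
print(val_of(story))    # val binding of the Story element
print(cont_of(entity))  # cont binding of the NamedEntity element
```
]]></preformat>
      <p>For an attribute node, the val binding would simply be the attribute value, for instance element.get("name") in ElementTree.</p>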
      <p>Example. The lower part of Figure 2 depicts two tree
patterns. Pattern (b) finds articles containing a named-entity
with a value equal to “ACME”. Variables $B and $VD are
bound to the URI of this entity and the publication date
of the article, respectively. Observe that $VD also appears
in the second tree pattern, thus expressing a join between
them. Pattern (c) retrieves the titles of articles whose
publication date matches that of pattern (b).</p>
      <p>Definition 3.2 (Triple Pattern). A triple pattern is
a triple (s, p, o), where s and p are URIs or variables, whereas o
is a URI, a literal, or a variable.</p>
      <p>Example. The top part of Figure 2 depicts three triple
patterns, looking for entities of type Organization and finding
emails of resources known to be working at that
organization. Joins between triple patterns are expressed through
shared variables, such as $A and $B in the Figure.</p>
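      <p>Matching a set of triple patterns that share variables can be sketched as follows. This is a naive nested-loop evaluation written for illustration only, not the XRP algorithm; the function names is_var, match_pattern and evaluate are hypothetical, variables are strings starting with "$", and the triples are toy data chosen here so that the three patterns above have exactly one answer.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def is_var(term):
    return term.startswith("$")


def match_pattern(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, or None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # A shared variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns, triples):
    """All variable bindings under which every pattern matches some triple."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Toy triples (illustrative data, not the exact content of Figure 1).
triples = {
    ("#8", "rdf:type", "Organization"),
    ("#3", "reportOn", "#8"),
    ("_:B", "workAt", "#8"),
    ("_:B", "email", "bob@example.com"),
}
patterns = [
    ("$B", "rdf:type", "Organization"),
    ("$A", "workAt", "$B"),
    ("$A", "email", "$C"),
]
answers = evaluate(patterns, triples)
print(answers)
```
]]></preformat>
      <p>A join between a tree pattern and a triple pattern works the same way in this sketch: the binding produced by the tree pattern for a uri variable is passed in as the initial binding.</p>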
      <p>By combining tree and triple patterns and endowing them
with a set of head variables, we obtain an XRQ query:</p>
      <p>Definition 3.3 (XRQ Query). An XRQ query
consists of a head and a body. The body is a set of tree and/or
triple patterns built over the same set of variables, whereas
the head is a list of variables appearing also in the body.</p>
      <p>Note that by using variables in multiple places within the
query, one can express joins. In general, three types of joins
are possible: between two tree patterns, between two triple
patterns or between a tree pattern and a triple pattern. The
latter facilitates queries that cross the boundaries between
XML documents and their RDF annotations. The following
example illustrates the expressivity of XRQ.</p>
      <p>Example. Figure 2 shows an XRQ query, whose body
(at the right-hand side) consists of the previously described
tree and triple patterns. Observe that variable $B appears
in both parts of the body, forming a join between the XML
and RDF sub-instances. Two variables appear in the head of
the query: $VT from tree pattern (c) and $C from the first
triple pattern. The answer set for this query will contain
the titles of articles published on the same date as articles
referring to the organization called ACME, and emails of
people working in this organization.</p>
      <p>
        Query Semantics. For each match of the tree patterns,
triple patterns and the corresponding variables against the
XR instance (including the XML data, the explicit as well
as the implicit triples, as explained in Section 2), we create
a copy of the head tuple with the variables replaced by their
bound values. The query answer is the set of all such tuples.
For more details see the extended version [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
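      <p>Since answers must also reflect implicit triples, one possible way to obtain them is to saturate the RDF sub-instance before matching patterns. The Python sketch below is an assumption made for illustration (the text above does not say how XRP manages implicit triples): the function name saturate is hypothetical, only the two standard RDFS subclass rules are applied (type propagation and transitivity of rdfs:subClassOf), and the class Agent is invented for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def saturate(triples):
    """Add the triples entailed by rdfs:subClassOf until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_edges = [(s, o) for s, p, o in closed if p == SUBCLASS]
        derived = set()
        for c, d in subclass_edges:
            for s, p, o in closed:
                if p == TYPE and o == c:
                    derived.add((s, TYPE, d))        # x type c, c sub d => x type d
                if p == SUBCLASS and s == d:
                    derived.add((c, SUBCLASS, o))    # c sub d, d sub e => c sub e
        if not derived <= closed:
            closed |= derived
            changed = True
    return closed


explicit = {
    ("#8", TYPE, "Company"),
    ("Company", SUBCLASS, "Organization"),
    ("Organization", SUBCLASS, "Agent"),   # hypothetical extra class
}
saturated = saturate(explicit)
print(sorted(saturated - explicit))
```
]]></preformat>
      <p>Saturation trades storage space for simpler query evaluation; reformulating the query against the schema instead is a common alternative.</p>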
      <p>
        In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] we also provide an extended version of XRQ,
supporting more complex construction clauses, such that the
result of an XRQ query is an XR data instance.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. EXPERIMENTS</title>
      <p>To experiment with XR, we implemented XRP, a
prototype platform that supports the storage of XR instances and
the evaluation of XRQ queries, using Java 1.6.</p>
      <p>
        Architecture. The data store is built on top of
BerkeleyDB. RDF triples are mapped to a simple three-attribute
relation. XML documents are also stored in a tabular
format, enriched with XML-specific features such as
structural XML identifiers and full serialized XML fragments.
For the experiments described here, we manually specified
the XML data that each relation will store, by means of a
tree pattern. XRP makes it easy to change the data
organization by specifying other tree patterns, thanks to its
reliance on the ViP2P (http://vip2p.saclay.inria.fr) access path
selection infrastructure [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Furthermore, we materialized
an index of all XML node URIs appearing in some RDF
triples. This index facilitates the evaluation of the XRQ
queries that combine data from the two sub-instances.
      </p>
      <p>
        The query engine supports the typical operators: scan,
selection, projection, joins on values and XML structural IDs,
etc. To take advantage of indexes, we also implemented the
sideways information passing BindJoin operator [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which
uses attribute values from one join input as keys to access
the other input (an XML/RDF index is used to retrieve for
each XML node URI, all the triples containing this URI).
      </p>
      <p>Experimental Setup. For the purpose of the
experiments, we created a synthetic XR instance with soccer league
information. The XML sub-instance describes teams and
players, while the RDF sub-instance annotates players with
properties. To assess the system's scalability, we varied the
number of player nodes (10K, 50K and 100K), while keeping
the size of the XML document constant (95MB) by
modulating the length of some attributes that each player contains.
The number of annotations also increased proportionally by
keeping an average of one annotation per player.</p>
      <p>We devised three queries, each joining a tree pattern with
a triple pattern on URIs of player nodes. Q1 retrieves teams
which have players annotated with a given property. Q2 is
a relaxed version of Q1 returning teams regardless of the
property used. Finally, Q3 returns properties annotating
players of a particular team. Observe that Q1 and Q3 are
highly selective; Q1 applies a selection on the RDF instance,
whereas Q3 applies a selection on the XML instance.</p>
      <p>For each query, we compared execution times both with
and without indexes on the joined attributes. For Q1 and
Q2, we used an index on the XML node URIs, while for Q3,
we used an index on the RDF subjects.</p>
      <p>The experiments were conducted on a 2.40GHz Intel X3430
with 4GB RAM running Mandriva Linux (kernel 2.6.31.14).</p>
      <p>Experimental Results. Figure 3 shows the total query
evaluation times for each query over the three instances both
with indexes (BindJoin) and without (HashJoin).</p>
      <p>The results indicate that XRP scales well with the number
of nodes in the XML sub-instance (and the number of triples
in the RDF sub-instance). For instance, evaluating the least
selective query (Q2) over the largest data instance (100K)
requires less than 6.5 seconds.</p>
      <p>As expected, the BindJoin plans (using the index) are
generally faster (up to a factor of 4 in some cases). The only
case when BindJoin is slightly worse than HashJoin is for
non-selective queries (such as Q2). This is due to the
overhead incurred from accessing all items of an index. Neither
side of the join operator is more selective than the other and
thus the join operator has to match all RDF triples with all
tuples from the XML instance. BindJoin will need random
access to the whole index, while HashJoin will simply apply
a sequential scan to the same relation.</p>
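      <p>The difference between the two join strategies can be sketched in a few lines of Python. This is a simplified in-memory illustration, not the XRP operators: the function names hash_join and bind_join are hypothetical, XML tuples are dictionaries with a "uri" field, the data is invented, and the subject index stands in for an index that a store would have materialized ahead of time.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def hash_join(xml_tuples, triples):
    """Scan the whole triple relation into a hash table, then probe it."""
    table = defaultdict(list)
    for triple in triples:                      # sequential scan of all triples
        table[triple[0]].append(triple)
    return [(row, t) for row in xml_tuples for t in table.get(row["uri"], [])]


def bind_join(xml_tuples, subject_index):
    """Pass each left-hand URI sideways as a key into an existing index."""
    result = []
    for row in xml_tuples:
        for t in subject_index.get(row["uri"], []):   # one index access per tuple
            result.append((row, t))
    return result


# Invented data: players of one team (a selective XML side) and annotations.
xml_tuples = [{"uri": "#p1", "team": "Reds"}, {"uri": "#p2", "team": "Reds"}]
triples = [
    ("#p1", "position", "goalkeeper"),
    ("#p3", "position", "striker"),
    ("#p4", "position", "defender"),
]

# Stand-in for a persistent index on RDF subjects.
subject_index = defaultdict(list)
for triple in triples:
    subject_index[triple[0]].append(triple)

print(hash_join(xml_tuples, triples))
print(bind_join(xml_tuples, subject_index))
```
]]></preformat>
      <p>Both functions return the same pairs. The bind join only touches the index entries for the URIs it is given, which pays off when one input is selective; when every tuple must be matched, it ends up accessing the whole index one key at a time, whereas the hash join reads the relation once sequentially.</p>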
      <p>Our experiments demonstrate the applicability of many
query and storage optimization techniques to the problem
of XRP data management, in a single-platform setting. To
better exploit existing systems, we are also investigating the
deployment of an XR platform as an integrator,
complementing separate XML and RDF databases while
preserving the single-model abstraction.</p>
      <p>
        More experiments and a detailed description of the data
sets can be found in our extended report [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. RELATED WORK &amp; CONCLUSION</title>
      <p>We briefly discuss the main classes of related works.</p>
      <p>
        Tools for annotating web pages. Several works propose
frameworks that let users semantically annotate web-pages
either manually [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or (semi-)automatically [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] (see [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] for
a survey of such systems). However, they focus solely on
storing and querying annotations and they do not consider
the problem of simultaneously querying structured data and
the annotations on them.
      </p>
      <p>
        Embedding RDF annotations in XML documents.
Other works look at the particular problem of embedding
RDF annotations in XHTML [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. However, this requires
modifying the original XML document, which is not always
possible, due to privacy reasons or access constraints. In
contrast, in XRP the RDF annotations can be kept
separately from the XML documents in their own physical store.
      </p>
      <p>
        From RDF to XML and back. Another line of work
studies the connection between RDF and XML. This
includes works on (i) transforming XML data to RDF and vice
versa [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], (ii) using the query language of one model to query
the other (e.g., use XQuery to query XML-ized RDF) [
        <xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>
        ]
and (iii) extending the query language of one model with
primitives for the other (e.g., adding XPath constructs to
SPARQL) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Finally, XML and RDF are modeled within
a single rule-based framework in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>As outlined in Section 1, transforming XML to RDF or
vice-versa is feasible but brings a significant conversion
overhead, while the optimization techniques appropriate for one
are likely not very efficient for the other. The main difference
between our work and the previous ones integrating
XML and RDF, though, is our treatment of any XML node
as a resource, and the resulting focus on connecting (in the
RDF/Semantic Web sense) data to and from documents.</p>
      <p>Conclusion. XRP is a first step towards enabling the
modeling and querying of annotated XML documents. Our
current and future work focuses on (i) checking the model's
expressiveness against large-scale real-life annotated
document applications built on previous content management
systems such as Confolio (www.confolio.org), and (ii) devising
an architecture for XR management based on off-the-shelf
efficient XML and RDF data management platforms.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          .
          <source>Foundations of Databases</source>. <publisher-name>Addison-Wesley</publisher-name>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kopecky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Krennwallner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          .
          <article-title>XSPARQL: Traveling between the XML and RDF worlds - and avoiding the XSLT pilgrimage</article-title>
          .
          <source>In ESWC</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Amer-Yahia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          .
          <article-title>Minimization of Tree Pattern Queries</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Eiron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gibson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gruhl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jhingran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kanungo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajagopalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tomkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Tomlin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Zien</surname>
          </string-name>
          .
          <article-title>SemTag and seeker: bootstrapping the semantic web via automated semantic annotation</article-title>
          .
          <source>In WWW</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Droop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Flarer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Groppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Groppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Linnemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pinggera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Santner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Staffler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Zugal</surname>
          </string-name>
          .
          <article-title>Bringing the XML and Semantic Web Worlds Closer: Transforming XML into RDF and Embedding XPath into SPARQL</article-title>
          .
          <source>In ICEIS</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Florescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Manolescu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Suciu</surname>
          </string-name>
          .
          <article-title>Query optimization in the presence of binding patterns</article-title>
          .
          <source>In SIGMOD</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Furche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bry</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Bolzer</surname>
          </string-name>
          .
          <article-title>Marriages of Convenience: Triples and Graphs, RDF and XML in Web Querying</article-title>
          .
          <source>In Principles and Practice of Semantic Web Reasoning</source>
          . Springer Berlin / Heidelberg,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Extended version of this work, available at http://xr.saclay.inria.fr/papers/xr-vlds-extended.pdf,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>Authoring and annotation of web pages in CREAM</article-title>
          .
          <source>In WWW</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Karanasos</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Zoupanos</surname>
          </string-name>
          .
          <article-title>Viewing a world of annotations through AnnoVIP (demo)</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Manolescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karanasos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vassalos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Zoupanos</surname>
          </string-name>
          .
          <article-title>Efficient XQuery rewriting using multiple views</article-title>
          .
          <source>In ICDE</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Reeve</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Han</surname>
          </string-name>
          .
          <article-title>Survey of semantic annotation platforms</article-title>
          .
          <source>In ACM SAC</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] RDF. http://www.w3.org/standards/techs/rdf.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] RDFa. http://www.w3.org/TR/xhtml-rdfa-primer/,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] SPAT - SPARQL Annotations. http://www.w3.org/2007/01/SPAT/,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] XQuery 1.0 and XPath 2.0 Data Model (XDM). http://www.w3.org/TR/xpath-datamodel/,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>