<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Provenance and Linked Data in Biological Data Webs</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Jun Zhao, Graham Klyne and David Shotton, Image Bioinformatics Research Group, Department of Zoology, University of Oxford, Oxford OX1 3PS</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>To create a linked data web of heterogeneous biological data resources, we need not only to define and create the alignment between related data resources but also to express the knowledge about why data items from different sources are linked with each other and how each data link has evolved, so that scientists can trust the data links provided by the data web. This paper highlights the importance of keeping provenance information about the links between data items from different sources, and proposes the use of named graphs to make a provenance statement about each pair of linked data items and each release of a data web.</p>
      </abstract>
      <kwd-group>
        <kwd>Data Web</kwd>
        <kwd>Named Graphs</kwd>
        <kwd>Provenance</kwd>
        <kwd>RDF</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Trust</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The number of biological databases available has increased
rapidly in recent years [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To obtain knowledge about a
gene or protein from this sea of data, biologists often need to
go through an information gathering process, navigating
between the public genomic and publication databases. These
resources are scattered around the world and present data
in heterogeneous formats. Scientists have to rely on their
domain knowledge in order to identify how data resources
are linked with each other.
      </p>
      <p>Copyright is held by the author/owner(s). LDOW2008, April 22, 2008,
Beijing, China.</p>
      <p>
        To simplify this process, the Image Bioinformatics
Research Group (IBRG)<sup>1</sup> of the University of Oxford proposes
the use of subject-specific data webs, which use the Web
as the native platform upon which to integrate access to
datasets relating to particular subjects [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Within each
data web, data resources are integrated using loosely
coupled software tools that permit both information discovery
and links back to the original data. With this approach,
the data linked into the data webs are neither required to
be semantically coordinated nor constrained to conform to
a single imposed model. Furthermore, copyright and
access control issues remain the concern of the data sources,
not of the data web that unites them. These data sources
maintain their unique characters and continue independent
publication of their holdings.
      </p>
      <p>The first demonstrator data web being developed by IBRG
is FlyWeb<sup>2</sup>, which will integrate the heterogeneous data
resources concerning research on the fruit fly Drosophila
melanogaster. These data resources include FlyTED<sup>3</sup> (our local
research image repository concerning gene expression in the
testis of fruit flies), BDGP<sup>4</sup> (the Berkeley Drosophila Genome
Project database concerning gene expression in Drosophila
embryos), FlyBase<sup>5</sup> (the global database of genomic
information concerning Drosophila), and online research
publications on Drosophila gene expression. The goal of
FlyWeb is to allow biologists to obtain information about a
Drosophila gene, including images of its expression in the
testis and in embryos, without having to hop between the
Drosophila data islands on the Web.</p>
      <p>To build FlyWeb, we need not only to define and
implement the alignment between Drosophila data resources,
but also to maintain the data links between related data
items from different sources. This position paper focuses on
the second issue: it analyzes the motivation for
keeping provenance of the links between related data items and
presents our proposed solution.</p>
    </sec>
    <sec id="sec-2">
      <title>2. SEMANTIC WEB AND FLYWEB</title>
      <p>The initial development goals of the FlyWeb Project include understanding the distributed Drosophila data resources, creating the alignment between them, and creating a query service to access the integrated data resources. At the time of writing, concentrating first on linking the FlyTED and BDGP databases, we have achieved the following:</p>
      <list list-type="bullet">
        <list-item>
          <p>Describing the Drosophila testis images in FlyTED using an extension to the Fly Anatomy Ontology [<xref ref-type="bibr" rid="ref1">1</xref>], which is also used by BDGP to describe its Drosophila embryo gene expression images.</p>
        </list-item>
        <list-item>
          <p>Publishing FlyTED, the Drosophila testis gene expression image database, through a SPARQL endpoint [<xref ref-type="bibr" rid="ref9">9</xref>], the same interface used by BDGP for publishing its gene expression images and annotations [<xref ref-type="bibr" rid="ref5">5</xref>].</p>
        </list-item>
        <list-item>
          <p>Identifying the relationships between FlyTED and BDGP using the genomic knowledge, particularly the gene names, captured in FlyBase.</p>
        </list-item>
      </list>
      <p>Footnotes: <sup>1</sup> http://ibrg.zoo.ox.ac.uk/; <sup>2</sup> http://imageweb.zoo.ox.ac.uk/wiki/index.php/FlyWeb_project; <sup>3</sup> http://www.fly-ted.org/; <sup>4</sup> http://www.fruitfly.org/; <sup>5</sup> http://flybase.org/</p>
      <p>This initial work provides the foundation that permits us
to align the two data resources and build a lightweight data
web. However, the evolving nature of biological databases
has motivated us to consider further how to manage the links
in FlyWeb once they have been established.</p>
    </sec>
    <sec id="sec-3">
      <title>3. MOTIVATION</title>
      <p>Data items from different Drosophila data resources are
integrated into FlyWeb using references to the original data.
Related data items are linked together in FlyWeb using
biological knowledge from public genomic databases.
Biological knowledge is growing rapidly, and genomic databases are
frequently updated. By referencing back to these evolving
databases, FlyWeb can synchronize with advances in
biological knowledge. However, with each update of such an
external resource, some of the links between data items recorded
locally within FlyWeb may become obsolete or need to be
updated with more links to related data items. We need to
provide additional metadata about these data links in order
to maintain consistency between FlyWeb and the advancing
biological knowledge. This will allow scientists to:</p>
      <list list-type="bullet">
        <list-item>
          <p>Trust that the data links established in FlyWeb are valid;</p>
        </list-item>
        <list-item>
          <p>Trust that the data referenced in FlyWeb are consistent with the latest release of the public databases;</p>
        </list-item>
        <list-item>
          <p>Trace back the data links established by FlyWeb using previous releases of the public databases, which may previously have been used by the scientists to annotate their own local data.</p>
        </list-item>
      </list>
      <p>Thus, for each data link between a pair of data items, we need
to record the following provenance metadata:</p>
      <list list-type="bullet">
        <list-item>
          <p>The evidence for the link;</p>
        </list-item>
        <list-item>
          <p>When this link was created, by whom, and using which version of which database;</p>
        </list-item>
        <list-item>
          <p>When this link was updated or deprecated;</p>
        </list-item>
        <list-item>
          <p>Whether there were any previous links between this pair of data items;</p>
        </list-item>
        <list-item>
          <p>Which previous links between data items became obsolete, and why.</p>
        </list-item>
      </list>
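      <p>As an illustration only, the metadata listed above could be held in a record such as the following minimal Python sketch. The class and field names are hypothetical and do not come from an actual FlyWeb schema; the sample values are taken from the examples given later in this paper, except for the deprecation date, which is an assumption.</p>
      <preformat>
```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class LinkProvenance:
    """Provenance record for one link between a pair of data items.

    Field names are hypothetical; they mirror the list above and are
    not taken from an actual FlyWeb schema.
    """
    source_item: str
    target_item: str
    relation: str                  # "same" or "different"
    evidence: str                  # identifier of the evidence for the link
    created_on: date
    created_by: str
    derived_from: List[str]        # database versions used to build the link
    deprecated_on: Optional[date] = None
    previous: Optional["LinkProvenance"] = None  # earlier link for the same pair

    def history(self) -> List["LinkProvenance"]:
        """Return this link followed by every earlier link, newest first."""
        chain = []
        current = self
        while current is not None:
            chain.append(current)
            current = current.previous
        return chain


# Sample values follow the paper's examples; the deprecation date is an
# assumption (the date on which the replacing link was created).
old_link = LinkProvenance(
    "flyted:gene_g1", "bdgp:gene_g2", "same", "dwi:evidence_e1",
    date(2007, 12, 19), "ibrg", ["flybase/v4.3"],
    deprecated_on=date(2008, 1, 25),
)
new_link = LinkProvenance(
    "flyted:gene_g1", "bdgp:gene_g2", "different", "dwi:evidence_e2",
    date(2008, 1, 25), "ibrg", ["flybase/v5.3"],
    previous=old_link,
)

for link in new_link.history():
    print(link.created_on, link.relation, link.evidence)
```
      </preformat>
      <p>Chaining each record to its predecessor is one simple way to answer the last two questions in the list; it is a design choice of this sketch, not a statement about how FlyWeb stores its links.</p>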
      <p>To express this provenance of data links, we propose to
use named graphs.</p>
    </sec>
    <sec id="sec-4">
      <title>4. NAMED GRAPHS</title>
      <p>
        An RDF graph contains a collection of RDF triples. A
named graph is an RDF graph which is assigned a name
in the form of a URI [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It provides a way to group RDF
statements into sub-graphs that may be asserted separately,
and it also provides names for such graphs. By grouping and
naming RDF statements as a named graph, applications can
state access control rights, copyright, or provenance
information about these RDF statements as a whole. Thus, named
graphs provide a mechanism for establishing trust within
the Semantic Web. More generally, this mechanism allows
us to make statements about the content of the graph
without asserting that the statements contained in the graph are
true.
      </p>
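      <p>The distinction between statements contained in a graph and statements made about a graph can be pictured with the following minimal Python sketch. It is a toy in-memory structure written for illustration, not an RDF library; the class name NamedGraphStore and its methods are hypothetical, and URIs are abbreviated to plain strings.</p>
      <preformat>
```python
from collections import defaultdict


class NamedGraphStore:
    """Toy store that maps a graph name (a URI string) to a set of triples."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def add(self, graph, subject, predicate, obj):
        """Assert one triple inside the named graph."""
        self.graphs[graph].add((subject, predicate, obj))

    def triples(self, graph):
        """All triples grouped under the given graph name."""
        return set(self.graphs.get(graph, set()))

    def describe(self, graph):
        """All triples, in any graph, whose subject is the graph name itself."""
        return {
            (s, p, o)
            for triples in self.graphs.values()
            for (s, p, o) in triples
            if s == graph
        }


store = NamedGraphStore()
# A statement contained in the graph.
store.add(":flyweb_r1", "flyted:gene_g1", "owl:sameAs", "bdgp:gene_g2")
# Statements about the graph as a whole (its provenance).
store.add(":flyweb_r1", ":flyweb_r1", "dc:created", "2007-12-19")
store.add(":flyweb_r1", ":flyweb_r1", "dw:derivedFrom", "flybase/v4.3")

print(sorted(store.describe(":flyweb_r1")))
```
      </preformat>
      <p>Because the graph name is an ordinary subject, provenance can be attached to the whole group of statements without repeating it for each triple.</p>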
      <p>In order to provide information about why a pair of related
data items are linked together in FlyWeb, or why and when they
cease to be linked, we create a named graph for each
pair of linked data items. In this position paper, we consider
only two types of links between data items: they are either
the same as or different from each other. There may
be other types of data links in FlyWeb, but the provenance
model introduced in this paper is not yet designed to
describe all of them.</p>
      <p>FlyWeb will be updated whenever a major release of the
linked-in Drosophila databases is announced. To provide
information about each FlyWeb release and the versions of the
public databases upon which each release is based, we will
also create a named graph for each release of FlyWeb.</p>
      <p>
        In this position paper we use TriG as the notation to
define named graphs. "TriG is a variation of Turtle [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] which
extends that notation by using '{' and '}' to group triples
into multiple graphs, and to precede each by the name of
that graph" [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>4.1 A Named Graph for Each FlyWeb Release</title>
      <p>FlyWeb integrates several Drosophila data sources, denoted
a, b, c, etc. Each data source is associated with version
information; thus a<sub>x</sub> indicates version x of data source a.</p>
      <p>Each release of FlyWeb (fw<sub>g</sub>, fw<sub>k</sub>, etc.) will contain a
collection of data items, i<sub>m</sub>, i<sub>n</sub>, etc., from different Drosophila
data sources. A data item from version x of data source a
should be uniquely identified as i<sub>m</sub>(a<sub>x</sub>). In fw<sub>g</sub>, i<sub>m</sub>(a<sub>x</sub>)
will be described by all the metadata from its original data
source, as well as by FlyWeb statements about whether it is
related to another data item i<sub>n</sub>(b<sub>x</sub>) from data source b<sub>x</sub>, or
whether i<sub>m</sub>(a<sub>x</sub>) had been linked with i<sub>n</sub>(b<sub>x</sub>) in a
previous release of FlyWeb.</p>
      <p>Each release of FlyWeb is itself a named graph, which is
associated with information about when it was released, by
whom, and using which versions of which databases. An example
of two such named graphs is given below.</p>
      <p>The following two examples show two such graphs (see
Figure 1). Example 1 states that FlyWeb version
1.0 (&lt;dwi:flyweb_r1&gt;) was released on
"2007-12-19" and was built using FlyTED version 1.0, the BDGP
database version "2007-03-09" and FlyBase version 4.3. It
also contains a single statement about data items, namely that the
gene &lt;flyted:gene_g1&gt; from FlyTED is the same as the
gene &lt;bdgp:gene_g2&gt;<sup>6</sup> from BDGP. Example 2 defines a
named graph for FlyWeb version 1.1, which was built using
the same versions of FlyTED and BDGP as FlyWeb
version 1.0, but a different version of FlyBase. Because of this
update, gene &lt;flyted:gene_g1&gt; is no longer the same as
&lt;bdgp:gene_g2&gt;.</p>
      <p><sup>6</sup> The bdgp namespace might not be the actual namespace used
by the BDGP SPARQL endpoint. Due to technical maintenance,
its server was unreachable when the paper was written.</p>
      <p>Example 1. Named graph for FlyWeb release 1.0</p>
      <preformat>:flyweb_r1 {
    flyted:gene_g1 owl:sameAs bdgp:gene_g2 .
    :flyweb_r1 dc:created "2007-12-19"^^xsd:date ;
        dc:hasVersion "1.0" ;
        dc:creator &lt;http://www.datawebs.net/foaf.rdf#ibrg&gt; ;
        dw:derivedFrom flyted:v1.0 ;
        dw:derivedFrom &lt;bdgp/2007-03-09&gt; ;
        dw:derivedFrom &lt;flybase/v4.3&gt; .
}</preformat>
      <p>Example 2. Named graph for FlyWeb release 1.1</p>
      <preformat>:flyweb_r2 {
    flyted:gene_g1 owl:differentFrom bdgp:gene_g2 .
    :flyweb_r2 dc:created "2008-01-25"^^xsd:date ;
        dc:hasVersion "1.1" ;
        dc:creator &lt;http://www.datawebs.net/foaf.rdf#ibrg&gt; ;
        dw:derivedFrom flyted:v1.0 ;
        dw:derivedFrom &lt;bdgp/2007-03-09&gt; ;
        dw:derivedFrom &lt;flybase/v5.3&gt; .
}</preformat>
    </sec>
    <sec id="sec-6">
      <title>4.2 A Named Graph for Each Data Link</title>
      <p>In each FlyWeb named graph, a collection of named graphs
is also created for the data links between pairs of related
data items. Each such named graph states:</p>
      <list list-type="bullet">
        <list-item>
          <p>Why a pair of data items should be or should no longer be linked;</p>
        </list-item>
        <list-item>
          <p>When the link was made or released, and by whom;</p>
        </list-item>
        <list-item>
          <p>Which previous links had been created between this pair of data items;</p>
        </list-item>
        <list-item>
          <p>What type the link is: a MappingRelation, either a SameRelation or a DifferentRelation. The last two concepts will be defined in a data web ontology using the owl:sameAs and owl:differentFrom properties.</p>
        </list-item>
      </list>
      <p>Example 3 (see Figure 2) shows a named graph &lt;dwi:mapping_m1&gt;
that defines an abstract relationship between
the gene from FlyTED (&lt;flyted:gene_g1&gt;) and the gene
from BDGP (&lt;bdgp:gene_g2&gt;) and traces this relationship
through its two children, both of which are themselves named
graphs and define the actual relationships between these two
genes built in different releases of FlyWeb.</p>
      <p>The first child &lt;dwi:mapping_m11&gt; states that the two
genes are synonyms given the evidence of &lt;dwi:evidence_e1&gt;
and that this link was created on "2007-12-19" within
the release of &lt;dwi:flyweb_r1&gt;. The second child
&lt;dwi:mapping_m12&gt; states that the two genes are not the same
given the evidence of &lt;dwi:evidence_e2&gt;, and that this
link was created on "2008-01-25" within the release of
&lt;dwi:flyweb_r2&gt;.</p>
      <p>The dw:childOf property links &lt;dwi:mapping_m11&gt; and
&lt;dwi:mapping_m12&gt; with the graph &lt;dwi:mapping_m1&gt;, and
they are linked together by the property dw:siblingOf. These
properties enable us to trace a lineage of the data links
between a pair of data items.</p>
      <p>Example 3. Named graph for a data link</p>
      <preformat>@prefix dw: &lt;http://www.datawebs.net/&gt; .
@prefix flyted: &lt;http://id.fly-ted.org/&gt; .
@prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix owl: &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix bdgp: &lt;http://www.fruitfly.org/&gt; .
@prefix dwi: &lt;http://id.datawebs.net/&gt; .
@prefix : &lt;http://id.datawebs.net/&gt; .
@prefix rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .

:mapping_m1 {
    :mapping_m1 rdf:type dw:MappingRelation .
    flyted:gene_g1 dw:maps bdgp:gene_g2 .

    # the first child
    :mapping_m11 dw:childOf :mapping_m1 ;
        dw:evidencedBy :evidence_e1 ;
        dw:createdIn :flyweb_r1 ;
        rdf:type dw:SameRelation ;
        dc:creation "2007-12-19"^^xsd:date .

    :mapping_m11 {
        flyted:gene_g1 owl:sameAs bdgp:gene_g2 .</preformat>
    </sec>
    <sec id="sec-7">
      <title>5. SCENARIOS</title>
      <p>This section uses the above example datasets to walk
through three scenarios that show how named graphs could
help us to manage the data links in FlyWeb in a manner that
promotes trust.</p>
    </sec>
    <sec id="sec-8">
      <title>5.1 Links in a Previous Release</title>
      <p>The first scenario shows how FlyWeb can help users to
find out which data items in FlyWeb are linked to their
gene, which is annotated using information from FlyBase
release 4.3.</p>
      <p>Many biology data compilations are maintained locally
by research groups and might not always be kept up-to-date
with successive releases of the genomic database FlyBase
due to the ending of the projects that funded them. Such
local legacy data will have been annotated using
information from a now out-of-date version of the public database.
Subsequent releases of the public database might have
annotated its gene records using different gene names.
Occasionally, new biological evidence shows that a particular DNA
sequence, formerly thought to be a single gene and given a
single gene name, in fact encodes two distinct genes that are
then given different names.</p>
      <p>
        Without provenance data, users would not be able to find
in FlyWeb any data relating to their locally recorded former
gene names, because the genes are now annotated with new
names. In order to prevent this situation in the future, we
provide provenance information for each release of FlyWeb,
to state which versions of the public databases it links to.
This provides the flexibility for the scientists to trace data
links for their legacy data. A SPARQL query [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for this
scenario is shown below, which will search for all the data
items that are linked to the gene &lt;flyted:gene_g1&gt; in the
release of FlyWeb that was built using FlyBase version 4.3.
      </p>
      <preformat>SELECT *
WHERE {
  ?g dw:derivedFrom &lt;flybase/v4.3&gt; .
  GRAPH ?g {
    { flyted:gene_g1 ?p ?data }
    UNION
    { ?data1 ?p1 flyted:gene_g1 }
  }
}</preformat>
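      <p>To make the logic of this query concrete, the following minimal Python sketch evaluates the same two conditions over a small list of quads. It is not a SPARQL engine; the function name links_in_release and the quad layout are hypothetical, and the release metadata is looked up wherever it appears in the data, which is an assumption about how the provenance statements are stored.</p>
      <preformat>
```python
# Each quad is (graph, subject, predicate, object); values follow
# Example 1 and Example 2, with URIs abbreviated to plain strings.
QUADS = [
    (":flyweb_r1", "flyted:gene_g1", "owl:sameAs", "bdgp:gene_g2"),
    (":flyweb_r1", ":flyweb_r1", "dw:derivedFrom", "flybase/v4.3"),
    (":flyweb_r2", "flyted:gene_g1", "owl:differentFrom", "bdgp:gene_g2"),
    (":flyweb_r2", ":flyweb_r2", "dw:derivedFrom", "flybase/v5.3"),
]


def links_in_release(quads, item, database_version):
    """Find statements that mention `item` in releases built on `database_version`."""
    # Releases that declare the requested database version as a source.
    releases = {
        s for (_g, s, p, o) in quads
        if p == "dw:derivedFrom" and o == database_version
    }
    # Statements inside those releases with `item` as subject or object,
    # which corresponds to the UNION of the two triple patterns.
    return [
        (g, s, p, o) for (g, s, p, o) in quads
        if g in releases and item in (s, o)
    ]


result = links_in_release(QUADS, "flyted:gene_g1", "flybase/v4.3")
print(result)
```
      </preformat>
      <p>With the sample data, only the owl:sameAs statement from the release built on FlyBase version 4.3 is returned, which is the link a user with legacy annotations would expect to find.</p>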
    </sec>
    <sec id="sec-9">
      <title>5.2 All Links in the Latest Release</title>
      <p>This scenario shows how users can navigate information
about a Drosophila gene in the latest release of FlyWeb
using the version information and the creation date associated
with the named graph of each release of FlyWeb. The
following SPARQL query will retrieve all the data links from
the v1.1 release of FlyWeb.</p>
      <preformat>SELECT *
WHERE {
  ?g dc:hasVersion "1.1" .
  GRAPH ?g { ?gene1 ?p ?gene2 }
}</preformat>
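      <p>The query above names the version explicitly. Where the latest version number is not known in advance, the creation date of each release graph could be used instead, as in the following minimal Python sketch. The dictionary layout and the function names latest_release and links_in_latest are hypothetical; the dates and links are those of Example 1 and Example 2.</p>
      <preformat>
```python
from datetime import date

# Release metadata and links, keyed by the name of the release graph.
RELEASES = {
    ":flyweb_r1": {
        "version": "1.0",
        "created": "2007-12-19",
        "links": [("flyted:gene_g1", "owl:sameAs", "bdgp:gene_g2")],
    },
    ":flyweb_r2": {
        "version": "1.1",
        "created": "2008-01-25",
        "links": [("flyted:gene_g1", "owl:differentFrom", "bdgp:gene_g2")],
    },
}


def latest_release(releases):
    """Return the name of the release graph with the most recent creation date."""
    return max(
        releases,
        key=lambda name: date.fromisoformat(releases[name]["created"]),
    )


def links_in_latest(releases):
    """Return all data links recorded in the most recent release."""
    return releases[latest_release(releases)]["links"]


print(latest_release(RELEASES), links_in_latest(RELEASES))
```
      </preformat>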
    </sec>
    <sec id="sec-10">
      <title>5.3 Explaining Conflicts</title>
      <p>One way of allowing users to trace the data links between
a pair of related data items is to keep a history of all the
data links that have ever existed between them. This means
that conflicting statements about the relationship between
the same pair of data items might exist in different releases
of FlyWeb. In order to explain these conflicts, we provide
the evidence information for the data links.</p>
      <p>Example 1 and Example 2, describing releases 1.0 and 1.1 of
FlyWeb, contain conflicting statements about the
relationship between &lt;flyted:gene_g1&gt; and &lt;bdgp:gene_g2&gt;. In
order to explain this conflict, we need to take the following
steps:</p>
      <list list-type="bullet">
        <list-item>
          <p>Retrieve all the statements about the data link between &lt;flyted:gene_g1&gt; and &lt;bdgp:gene_g2&gt; from different releases of FlyWeb. This will return all the statements about the graphs &lt;dwi:mapping_m11&gt; and &lt;dwi:mapping_m12&gt; that define the relationships between the two gene names;</p>
        </list-item>
        <list-item>
          <p>Compare the statements about these two graphs in order to find the differences between the two versions of the relationship between &lt;flyted:gene_g1&gt; and &lt;bdgp:gene_g2&gt;;</p>
        </list-item>
        <list-item>
          <p>Present the differences resulting from the comparison step to the users, including the creation date of each link, the release of FlyWeb in which it was created, and the evidence explaining why each different relationship existed between &lt;flyted:gene_g1&gt; and &lt;bdgp:gene_g2&gt;.</p>
        </list-item>
      </list>
      <p>A SPARQL query for the first step would be:</p>
      <preformat>CONSTRUCT { ?cg ?p ?o }
WHERE {
  GRAPH ?g {
    flyted:gene_g1 ?p1 bdgp:gene_g2 .
    ?g rdf:type dw:MappingRelation .
    ?cg dw:childOf ?g .
    ?cg ?p ?o
  }
}</preformat>
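      <p>The three steps (retrieve, compare, present) can be sketched in a few lines of Python. This is an illustrative sketch only: the mapping statements are held as plain dictionaries rather than RDF, the function names retrieve, compare and present are hypothetical, and the values are those described for &lt;dwi:mapping_m11&gt; and &lt;dwi:mapping_m12&gt; in Section 4.2.</p>
      <preformat>
```python
# Statements about the two child mapping graphs, flattened into records.
MAPPINGS = [
    {"graph": "dwi:mapping_m11", "parent": "dwi:mapping_m1",
     "type": "dw:SameRelation", "evidence": "dwi:evidence_e1",
     "release": "dwi:flyweb_r1", "created": "2007-12-19"},
    {"graph": "dwi:mapping_m12", "parent": "dwi:mapping_m1",
     "type": "dw:DifferentRelation", "evidence": "dwi:evidence_e2",
     "release": "dwi:flyweb_r2", "created": "2008-01-25"},
]

# Properties that are compared between the children.
COMPARED = ("type", "evidence", "release", "created")


def retrieve(mappings, parent):
    """Step 1: collect the children of one abstract relationship, oldest first."""
    children = (m for m in mappings if m["parent"] == parent)
    return sorted(children, key=lambda m: m["created"])


def compare(children):
    """Step 2: list the properties whose values differ between the children."""
    return [
        key for key in COMPARED
        if len({child[key] for child in children}) > 1
    ]


def present(children, differing):
    """Step 3: build one readable line per child from the differing properties."""
    return [
        child["graph"] + ": "
        + ", ".join(key + "=" + child[key] for key in differing)
        for child in children
    ]


children = retrieve(MAPPINGS, "dwi:mapping_m1")
differing = compare(children)
report = present(children, differing)
for line in report:
    print(line)
```
      </preformat>
      <p>Each printed line pairs a mapping graph with its relation type, evidence, release and creation date, which is the information a user needs in order to see why the two releases disagree.</p>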
    </sec>
    <sec id="sec-11">
      <title>CONCLUSIONS</title>
      <p>In this position paper we have analyzed how recording
the provenance of data links can help us both to maintain
the links between related data items and to bring trust to
the data web, by providing evidence for links and by tracing
how the data links have been updated and maintained. We
have shown the potential of named graphs for expressing
this provenance information. The flexibility of RDF named
graphs and the RDF query language SPARQL allows
us to query and filter the data links on behalf
of the data web users, e.g. by presenting only those links
newly created since the previous release of FlyWeb, or those
present in a particular earlier release of FlyWeb.</p>
      <p>When defining this conceptual provenance model, we have
adopted existing vocabulary as much as possible, such as the
properties dc:creation and dc:creator from the Dublin
Core Metadata Element Set<sup>7</sup>. We have also used the dw
namespace (http://www.datawebs.net/) to specify the
following properties of our own:</p>
      <list list-type="bullet">
        <list-item>
          <p>dw:derivedFrom</p>
        </list-item>
        <list-item>
          <p>dw:evidencedBy</p>
        </list-item>
        <list-item>
          <p>dw:childOf</p>
        </list-item>
        <list-item>
          <p>dw:siblingOf</p>
        </list-item>
        <list-item>
          <p>dw:createdIn</p>
        </list-item>
        <list-item>
          <p>dw:maps</p>
        </list-item>
      </list>
      <p>We are planning to include these conceptual properties in a
data web provenance ontology that will include other
existing vocabularies about provenance and trust.</p>
      <p>In this conceptual model we associated with each data
link a dw:evidencedBy property to provide the information
about why particular statements were asserted. This will
bring trust to the linked data for the scientists, so that they
can verify that the links are consistent with scientific
knowledge. However, we are still investigating how much
information should be provided as evidence for each data link:
whether it should contain the actual heuristic used for
building the links or a textual description of this heuristic; and
how we can make this evidence information more
comprehensible for biological researchers.</p>
      <p>
        There is a separate provenance issue that is not discussed
in this position paper, namely the provenance of the data
items themselves. We discussed neither the provenance
information for telling where each data item came from nor
the provenance information that might be associated with a
data item from the individual data resource. These are key
research topics for Semantic Web and provenance for life
sciences [
        <xref ref-type="bibr" rid="ref3 ref8">3, 8</xref>
        ]. The datasets published by BDGP through their
SPARQL endpoint have been annotated with some
provenance and evidence information [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Those data provenance
statements will be integrated into FlyWeb along with all
other descriptions concerning the data. We need to research
how this provenance of data can best be incorporated into
FlyWeb, together with the provenance of the data links.
      </p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGEMENT</title>
      <p>This work is supported by funding from the JISC
(FlyWeb Project to Dr David Shotton;
http://imageweb.zoo.ox.ac.uk/wiki/index.php/FlyWeb_project) and from
BBSRC (Grant BB/E018068/1, The FlyData Project:
Decision Support and Semantic Organisation of Laboratory Data
in Drosophila Gene Expression Experiments, to Drs David
Shotton and Helen White-Cooper). The FlyTED Database
was developed with funding from the UK's BBSRC (Grant
BB/C503903/1, Gene Expression in the Drosophila testis, to
Drs Helen White-Cooper and David Shotton). Preliminary
data web requirements analysis was supported by a JISC
grant to Dr David Shotton (Defining Image Access Project;
http://imageweb.zoo.ox.ac.uk/wiki/index.php/DefiningImageAccess).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ashburner</surname>
          </string-name>
          et al.
          <article-title>A structured controlled vocabulary of the anatomy of Drosophila melanogaster</article-title>
          . http://obofoundry.org/cgi-bin/detail.cgi?id=fly_anatomy.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Beckett</surname>
          </string-name>
          .
          <article-title>Turtle - Terse RDF Triple Language</article-title>
          ,
          <year>2007</year>
          . http://www.dajobe.org/2004/01/turtle/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hayes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Stickler</surname>
          </string-name>
          .
          <article-title>Named graphs, provenance and trust</article-title>
          .
          <source>In Proc. of the 14th International World Wide Web Conference</source>
          , pages
          <fpage>613</fpage>
          –
          <lpage>622</lpage>
          , Chiba, Japan,
          <year>2005</year>
          . http://doi.acm.org/10.1145/1060745.1060835.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Galperin</surname>
          </string-name>
          .
          <article-title>The molecular biology database collection: 2008 update</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>36</volume>
          (Database issue):
          <fpage>2</fpage>
          –
          <lpage>4</lpage>
          ,
          <year>2008</year>
          . doi:10.1093/nar/gkm1037.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mungall</surname>
          </string-name>
          .
          <article-title>A SPARQL endpoint for a database of annotated gene expression</article-title>
          . http://www.bioontology.org/wiki/index.php/OBD:SPARQL-InSitu.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Prud'hommeaux</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne</surname>
          </string-name>
          .
          <article-title>SPARQL query language for RDF</article-title>
          , January
          <year>2008</year>
          . W3C Recommendation. http://www.w3.org/TR/rdf-sparql-query/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Shotton</surname>
          </string-name>
          .
          <article-title>Data webs for image repositories</article-title>
          . In
          <source>World Wide Science: Promises, Threats and Realities</source>
          . Oxford University Press,
          <year>2008</year>
          . In press.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Goble</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stevens</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Turi</surname>
          </string-name>
          .
          <article-title>Mining Taverna's Semantic Web of Provenance</article-title>
          .
          <source>Journal of Concurrency and Computation: Practice and Experience</source>
          ,
          <year>2007</year>
          . doi:10.1002/cpe.1231.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Shotton</surname>
          </string-name>
          .
          <article-title>Building a semantic web image repository for biological research images</article-title>
          .
          <source>In Proc. of the 5th European Semantic Web Conference</source>
          , Tenerife, Spain,
          <year>2008</year>
          . accepted.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>