<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Making Linked Data SPARQL with the InterMine Biological Data Warehouse</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maxime Deraspe</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gail Binkley</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniela Butano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthew Chadwick</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. Michael Cherry</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Justin Clark-Casey</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Contrino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacques Corbeil</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Josh Heimbach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kalpana Karra</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rachel Lyne</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julie Sullivan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yo Yehudi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gos Micklem</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michel Dumontier</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cambridge Systems Biology Centre, University of Cambridge</institution>
          ,
          <addr-line>Cambridge</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Genetics, Stanford University</institution>
          ,
          <addr-line>Stanford</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Genetics, University of Cambridge</institution>
          ,
          <addr-line>Cambridge</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Molecular Medicine, Université Laval</institution>
          ,
          <addr-line>Quebec</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Stanford Center for Biomedical Informatics Research, Stanford University</institution>
          ,
          <addr-line>Stanford</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>InterMine is a system for integrating, analysing, and republishing biological data from multiple sources. It provides access to these data via a web user interface and programmatic web services. However, the precise invocation of services and subsequent exploration of returned data require substantial expertise on the structure of the underlying database. Here, we describe an approach that uses Semantic Web technologies to make InterMine data more broadly accessible and reusable, in accordance with the FAIR principles. We describe a pipeline to extract, transform, and load a Linked Data representation of the InterMine store. We use Docker to bring together SPARQL-aware applications to search, browse, explore, and query the InterMine-based data. Our work therefore extends interoperability of the InterMine platform, and supports new query functionality across InterMine installations and the network of open Linked Data.</p>
      </abstract>
      <kwd-group>
        <kwd>linked data</kwd>
        <kwd>SPARQL</kwd>
        <kwd>RDF</kwd>
        <kwd>biological data warehouse</kwd>
        <kwd>integrative bioinformatics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        InterMine is a Java-based open-source data warehouse created specifically for
the integration and analysis of biological information [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It can load data from
a wide range of heterogeneous data sources into a data model that is mutable
and extensible, and expose this loaded data in a manner that is easy to explore
and mine. Many MODs (model organism databases) use the InterMine platform
to make their data available to users [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], such as the MODs for fly [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], mouse
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], nematode [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], rat [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], budding yeast [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and zebrafish [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It is also in use in
many other projects such as modENCODE [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and for drug discovery [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        To implement its flexible data model, InterMine stores data using
a custom object-relational mapping (ORM) in a PostgreSQL database. Data
objects are presented to the user via a web interface and via RESTful web
services and clients that implement a bespoke API [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. These access mechanisms
are comparable with other primary and secondary biological databases [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>Integration of an arbitrary number of data sources into a single system is one
of InterMine's primary features. However, users may still want to perform further
integration with sources that remain outside the data warehouse. For instance,
they may have additional unpublished or private datasets; a data source may
be integrated with InterMine but not to the level of detail that they require; or
they may require extensive ad-hoc cross-domain data integration in the course
of their research that is difficult to anticipate.</p>
      <p>
        In these cases, the integration benefits of InterMine are reduced. Users have
to fall back on further manual integration, which is difficult and
time-consuming because of differences between the file formats and data-access services
provided by InterMine and other data sources [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Manual integration also incurs
maintenance costs over time as data formats and access services evolve [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>Over recent years, various data providers, notably the European
Bioinformatics Institute (EBI) [14] and PubChem [15], have started to provide their
data as RDF Linked Data in addition to their existing data-access facilities.
Providing information in a common structured form allows a user to download
datasets from one or more sources and perform queries across them using
standard SPARQL query mechanisms. When organizations such as the EMBL-EBI
provide a public SPARQL endpoint [14], these queries can also be performed
directly over the Internet, potentially across many different data providers at
once.</p>
      <p>Providing Linked Data also advances the FAIR (Findable, Accessible,
Interoperable, and Re-usable) principles [16], a vision that lies at the heart of the InterMine
project. We therefore want to make it easy for an InterMine operator to provide
RDF Linked Data and a public SPARQL endpoint as an extension of the InterMine system.</p>
      <p>In this paper, we describe a key component of this process, namely
a mechanism created recently by the Dumontier Lab at Stanford University to
generate Linked Data from InterMine-loaded data. We also describe the same
lab's Model Organism Linked Database (MOLD), a Linked Open Data cloud
generated from the RDF output of six MOD InterMine installations [17].</p>
      <p>Following on from this, we discuss future work to adapt
this RDFization mechanism so that any InterMine operator can easily generate
Linked Data and make it downloadable and queryable. We outline the
process and the challenges involved, in terms of both data and technology.</p>
    </sec>
    <sec id="sec-2">
      <title>Converting InterMine data: RDFization</title>
      <p>The InterMine-RDFizer [18] is an open-source software tool that allows a user
to generate RDF Linked Data from data loaded into InterMine. The tool works
by extracting data from InterMine using its standard web services. This is in
contrast to projects such as D2RQ [19] that directly map relational tables to
RDF graphs. In experimental work we have found it difficult to adapt such
projects to InterMine's custom ORM database structure, where data objects are
split over multiple tables generated from a mutable data model. By contrast, the
InterMine-RDFizer receives logical unified views of the data objects, which are
much easier to convert to the RDF data model.</p>
      <p>Figure 1 shows the implementation view of the InterMine-RDFization
process, where data is downloaded into tab-separated value (TSV) files and then
converted into RDF triples.</p>
      <p>InterMine stores data as representations of biological objects (Genes,
Organisms, Proteins, etc.) in a class-based model. The InterMine-RDFizer maps
each biological object to an RDF resource. The resource type is based on the
class name (e.g. Gene, Organism) and the resource URI is built using the unique
sequential ID assigned by InterMine's ORM system to each object when it is
loaded. Listing 1.1, for example, shows the triples generated for the gene
with ID 1007664:</p>
      <preformat>&lt;http://mo-ld.org/flymine:1007664&gt; rdf:type &lt;http://mo-ld.org/resource/flymine_SequenceFeature&gt;
&lt;http://mo-ld.org/flymine:1007664&gt; rdf:type &lt;http://mo-ld.org/resource/flymine_Gene&gt;</preformat>
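      <p>The mapping from an InterMine object to its typed RDF resource can be sketched in Python as follows. This is a minimal illustration, not the InterMine-RDFizer code: the function names and the tuple representation of a triple are assumptions, and only the two IRI patterns are taken from Listing 1.1.</p>
      <preformat>
```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://mo-ld.org/"  # base IRI as it appears in Listing 1.1


def resource_iri(mine, object_id):
    """IRI of one InterMine object, built from the mine name and ORM ID."""
    return f"{BASE}{mine}:{object_id}"


def class_iri(mine, class_name):
    """IRI of the RDF type derived from an InterMine class name."""
    return f"{BASE}resource/{mine}_{class_name}"


def type_triples(mine, object_id, class_names):
    """Return one (subject, predicate, object) triple per class of the object."""
    subject = resource_iri(mine, object_id)
    return [(subject, RDF_TYPE, class_iri(mine, name)) for name in class_names]


if __name__ == "__main__":
    # The gene from Listing 1.1 is typed both as SequenceFeature and as Gene.
    for triple in type_triples("flymine", 1007664, ["SequenceFeature", "Gene"]):
        print(" ".join(triple))
```
      </preformat>
      <p>Passing the full list of class names lets one object carry several rdf:type statements, as the gene in Listing 1.1 does.</p>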
      <sec id="sec-2-1">
        <title>Listing 1.1. Resource types created in the RDFization process</title>
        <p>The InterMine-RDFizer generates predicates using a generic approach from
the properties of each InterMine data class. Figure 2, for instance, partially
shows the resources generated for the gene with symbol "zen" in the organism
Drosophila melanogaster. Query 1.2
shows how one can fetch genes from the organism Drosophila melanogaster
annotated with a specified GO term, using the triples generated by the
InterMine-RDFizer from the FlyMine MOD InterMine installation.</p>
        <preformat>PREFIX rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX mold: &lt;http://mo-ld.org/resource/&gt;
PREFIX mold_voc: &lt;http://mo-ld.org/mine_vocabulary:&gt;

SELECT DISTINCT ?primaryIdentifier ?symbol ?termIdentifier ?termName
WHERE {
  ?gene a mold:flymine_Gene ;
        mold_voc:hasOrganism/rdfs:label ?organism .
  FILTER (?organism = "Drosophila melanogaster") .
  ?gene mold_voc:hasPrimaryIdentifier/rdf:value ?primaryIdentifier ;
        mold_voc:hasSymbol/rdf:value ?symbol ;
        mold_voc:hasGOAnnotation/mold_voc:hasOntologyTerm ?term .
  ?term rdfs:label ?termName .
  FILTER (?termName = "nucleoplasm") .
  ?term mold_voc:hasIdentifier/rdf:value ?termIdentifier
}</preformat>
        <p>Query 1.2. SPARQL query for genes annotated with a specified GO term</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Creating Linked Data</title>
      <p>As part of its data integration process, InterMine merges data from multiple
sources into common data objects. For instance, a protein object may contain
data from UniProt merged with records from other protein data sources like
IntAct or InterPro. For any merged source, InterMine stores cross-references
to other databases (e.g. PubMed cross-references in InterPro data) in a
cross-references table.</p>
      <p>The InterMine-RDFizer uses these stored identifiers to generate Linked Data.
The script has to be provided with a file containing the mapping between
the data source name, as stored in InterMine (e.g. UniProt), and the URI of
the external RDF repository (e.g. http://purl.uniprot.org/uniprot/).
For example, the protein "Breast cancer type 1 susceptibility protein", in the
organism "Homo sapiens", could be linked to the resource
&lt;http://purl.uniprot.org/uniprot/BRCA1_HUMAN&gt;. In addition to cross-references, the
RDFizer can also link entries in InterMine's ontology tables to external
ontology term (class) URLs, using the same configuration file. For instance, the
Gene Ontology term with identifier GO:0005654 could be linked to the resource
http://amigo.geneontology.org/amigo/term/GO:0005654.</p>
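      <p>The lookup that such a mapping file enables might look like the Python sketch below. The format of the real mapping file is not described here, so the dictionary form, the function name and the use of "GO" as a source name are assumptions made for illustration; the two URI prefixes are the ones quoted above.</p>
      <preformat>
```python
# Hypothetical in-memory form of the mapping file: data source name as stored
# in InterMine, mapped to the URI prefix of the external repository.
SOURCE_PREFIXES = {
    "UniProt": "http://purl.uniprot.org/uniprot/",
    "GO": "http://amigo.geneontology.org/amigo/term/",  # assumed source name
}


def external_uri(source_name, identifier, prefixes=SOURCE_PREFIXES):
    """Return the external URI for a cross-reference, or None if unmapped."""
    prefix = prefixes.get(source_name)
    if prefix is None:
        return None  # no mapping configured, so no link is generated
    return prefix + identifier


if __name__ == "__main__":
    print(external_uri("UniProt", "BRCA1_HUMAN"))
    print(external_uri("GO", "GO:0005654"))
    print(external_uri("NoSuchSource", "X1"))  # unmapped source gives None
```
      </preformat>
      <p>Returning None for an unmapped source keeps the generation step from emitting links to repositories the operator has not configured.</p>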
    </sec>
    <sec id="sec-4">
      <title>MOLD project</title>
      <p>The InterMine-RDFizer was developed as part of the MOLD (Model Organism
Linked Data) project. This is a Semantic Web platform, recently developed by
the Dumontier Lab at Stanford University, for publishing model organism data
under the FAIR [16] principles. It currently includes six MODs: FlyMine, HumanMine,
MouseMine, YeastMine, RatMine and ZebrafishMine.</p>
      <p>MOLD uses the RDFizer to generate RDF from the InterMine installations of
these MODs. The RDFizer also links this generated data to Bio2RDF [20][21][22],
one of the largest networks of Linked Data for the life sciences. The data is also
linked to external ontologies such as GO [23] and the Sequence Ontology [24].</p>
      <p>The MOLD platform provides a web interface to query, browse and explore
the RDF data it contains. This includes a SPARQL editor with a result viewer
supporting several result set formats, a search widget providing full-text search,
and the RelFinder tool [25], which interactively explores relations between two RDF
resources and for which some examples are already provided.</p>
      <p>In addition, the MOLD platform provides a REST-based web services API
that currently supports the following commands: describe, links, search and
sparql. Listing 1.3 shows how to retrieve the triples that describe a
resource, given the resource URI http://mo-ld.org/flymine:1007664:</p>
      <preformat>curl -X GET --header 'Accept: text/html' 'http://api.mo-ld.org:80/v1/describe?uri=http%3A%2F%2Fmo-ld.org%2Fflymine%3A1007664'</preformat>
      <sec id="sec-4-1">
        <title>Listing 1.3. HTTP request to describe endpoint</title>
        <p>The user can access the same API via the web interface, editing the input
parameters and browsing the returned results.</p>
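        <p>For programmatic clients, the request URL of Listing 1.3 can be assembled with the Python standard library, as in the sketch below. The helper name is an assumption; the host, path and query parameter are copied from the listing, and sending the request is deliberately left out.</p>
        <preformat>
```python
from urllib.parse import urlencode

API_BASE = "http://api.mo-ld.org:80/v1"  # host and version as in Listing 1.3


def describe_url(resource_uri):
    """Build the describe request URL with the resource URI percent-encoded."""
    query = urlencode({"uri": resource_uri})
    return f"{API_BASE}/describe?{query}"


if __name__ == "__main__":
    print(describe_url("http://mo-ld.org/flymine:1007664"))
```
        </preformat>
        <p>Letting urlencode do the escaping avoids hand-writing sequences such as %3A and %2F for each resource URI.</p>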
        <p>The Docker container system is used to deploy the MOLD project. Docker
packages a software application together with its dependencies in a single image,
eliminating the need to separately install other software libraries and
frameworks. The MOLD project provides three images: one for the triple store, one
for the MOLD web application and one for the REST API.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Future work</title>
      <p>As we have seen, the InterMine-RDFizer can generate RDF from an
InterMine installation, and the MOLD project has used it to create a Linked Open
Data network from six MOD InterMine instances. Our interest now is to extend
this work so that we can ship RDF generation and SPARQL query facilities
as a native component of the InterMine system. We want to do this in such a
way that any operator can activate and maintain these facilities without major
operational overhead, no matter what type of data their mine integrates. This
will make more RDF and SPARQL endpoints available for InterMine-integrated
datasets, and give a reasonable expectation that generated RDF data will remain
in sync with InterMine-loaded data.</p>
      <p>To achieve these goals, we need to tackle a number of challenges. On the
data side, we need to make sure that any InterMine resource in the generated
RDF has an IRI that is unique and stable over time, one of the core Linked Data
requirements [26]. This is not straightforward because, unlike primary biological
databases, InterMine does not have prior knowledge of the structure of loaded
biological data, as its core data model is mutable and extensible.</p>
      <p>Currently, the InterMine-RDFizer script generates resource IRIs that use the
sequential IDs that InterMine generates as part of its ORM system (e.g.
http://mo-ld.org/flymine:1007664). These IRIs are not stable over time because
the IDs change when the data in the warehouse is updated. Instead, for
each externally referenceable data class we may need to identify which properties
form a unique and temporally stable key. One possibility is to concatenate the
data class name (e.g. "Protein") with a primary ID property that comes from
one of the loaded external data sources (e.g. P38398 from UniProt).</p>
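      <p>One way this concatenation could work is sketched below in Python. Nothing here is implemented in InterMine or the RDFizer: the function name, the path-style IRI layout and the choice to percent-encode the identifier are all assumptions chosen for illustration.</p>
      <preformat>
```python
from urllib.parse import quote

BASE = "http://mo-ld.org/"  # assumed base IRI, matching the MOLD examples


def stable_iri(mine, class_name, primary_id):
    """Build an IRI from the class name and a source-provided primary ID.

    Unlike the ORM's sequential ID, both parts survive a database rebuild.
    """
    if not primary_id:
        raise ValueError("a stable IRI needs a non-empty primary identifier")
    # Percent-encode the ID so characters such as ':' or '/' stay unambiguous.
    return f"{BASE}{mine}/{class_name}/{quote(primary_id, safe='')}"


if __name__ == "__main__":
    print(stable_iri("humanmine", "Protein", "P38398"))
```
      </preformat>
      <p>Rejecting empty identifiers makes objects that lack a source-provided key visible early, which is where a per-class key definition would still be needed.</p>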
      <p>Regarding Linked Data, we also want to ensure that the RDF generated
from any particular InterMine object links back to the data sources that were
integrated into that object. As described in the section on creating Linked Data, the
InterMine-RDFizer uses InterMine's cross-reference and ontology term data to generate
links. However, these capture cross-references provided by the source rather than
the source itself (e.g. we are not generating triples that link an InterPro IRI to
an InterMine ProteinDomain resource). Capturing this data for RDF generation
may require some additional data source recording by InterMine itself.</p>
      <p>Another data-related challenge concerns ontologies. The data sources that
are loaded into InterMine often use ontology terms as property values, such
as Gene Ontology terms to identify the functions of a gene. However, except
in an automated fashion for sequence properties, InterMine does not attach
ontological terms to the properties themselves. For instance, properties such as
"abstractText" and "title" in the Publication class are not attached to terms in
the Dublin Core ontology.</p>
      <p>The InterMine-RDFizer handles this by automatically generating RDF
predicates from InterMine property names. For example, it generates the IRI
http://mo-ld.org/mine_vocabulary:hasAuthor for the "authors" property
of the Publication class. However, we would also like to make it possible to use
predicates from existing ontologies, such as the Dublin Core terms mentioned
above. We would need either to extend InterMine itself to associate ontology
terms with data model properties, or to provide a further configuration mechanism
in the InterMine-RDFizer that can do this at RDF generation time.</p>
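      <p>The second option, a configuration lookup with a fallback to generated predicates, could be sketched in Python as follows. The override table, the function names and the naive singularisation rule are assumptions; the actual rule used by the InterMine-RDFizer is not specified here beyond the "authors" example, and the two Dublin Core IRIs are standard DCMI terms chosen for illustration.</p>
      <preformat>
```python
VOCAB = "http://mo-ld.org/mine_vocabulary:"  # prefix quoted in the text

# Hypothetical overrides: (class name, property name) to an existing predicate.
OVERRIDES = {
    ("Publication", "title"): "http://purl.org/dc/terms/title",
    ("Publication", "abstractText"): "http://purl.org/dc/terms/abstract",
}


def generated_predicate(property_name, is_collection=False):
    """Derive a 'has...' predicate IRI from an InterMine property name."""
    name = property_name
    # Naive singularisation for collections ("authors" to "Author"); this
    # rule is an assumption and would fail on irregular plurals.
    if is_collection and name.endswith("s"):
        name = name[:-1]
    return f"{VOCAB}has{name[0].upper()}{name[1:]}"


def predicate_for(class_name, property_name, is_collection=False):
    """Prefer a configured ontology predicate, else fall back to generation."""
    override = OVERRIDES.get((class_name, property_name))
    if override is not None:
        return override
    return generated_predicate(property_name, is_collection)


if __name__ == "__main__":
    print(predicate_for("Publication", "authors", is_collection=True))
    print(predicate_for("Publication", "title"))
```
      </preformat>
      <p>Keeping the overrides outside the generation rule means an operator could adopt existing ontology terms property by property without changing the default behaviour.</p>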
      <p>On the technological side, a major issue concerns how InterMine data
converted to RDF will be stored and made available to users. The Docker images
created by MOLD, which provide a triplestore, web application and REST API,
will serve as a useful base. We will need to assess the performance, ease
of use and maintainability of the systems used, in the context of making this a
generic facility for any InterMine installation.</p>
      <p>We will also want to integrate RDF downloads with the InterMine web
interface proper. InterMine has existing facilities for exporting data in different
formats (CSV, JSON, etc.), so adding a further option to link to the URI that
serves RDF for a particular biological object would be desirable. This
architectural layout is shown in Figure 3.</p>
      <p>Fig. 3. InterMine RDF Provision Process</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work was supported by NIH/NHGRI U41HG001315 (M. Cherry, K. Karra,
G. Binkley, J. Sullivan) and supplement 3U41HG001315-21S1 (M.
Dumontier, M. Deraspe), NIH/NHGRI U41HG002659 (supplement subcontract
to G. Micklem), and the Wellcome Trust grant 099133 (G. Micklem). The content
is solely the responsibility of the authors and does not necessarily represent the
official views of any of the funding bodies.</p>
      <p>14. Jupp S, Malone J, Bolleman J, Brandizi M, Davies M, Garcia L, et al. The EBI RDF platform: linked open data for the life sciences. Bioinformatics. 30: 1338-1339 (2014)</p>
      <p>15. Fu G, Batchelor C, Dumontier M, Hastings J, Willighagen E, Bolton E. PubChemRDF: towards the semantic annotation of PubChem compound and substance databases. J Cheminform. 7: 34 (2015)</p>
      <p>16. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 3: 160018 (2016)</p>
      <p>17. http://mo-ld.org/</p>
      <p>18. https://github.com/mo-ld/intermine-rdfizer</p>
      <p>19. http://d2rq.org/</p>
      <p>20. Belleau F, Nolin MA, Tourigny N, Rigault P, Morissette J. Bio2RDF: towards a mashup to build bioinformatics knowledge systems. Journal of Biomedical Informatics. 41: 706-716 (2008)</p>
      <p>21. Nolin MA, Ansell P, Belleau F, Idehen K, Rigault P, Tourigny N, Roe P, Hogan JM, Dumontier M. Bio2RDF network of linked data. Semantic Web Challenge; International Semantic Web Conference (ISWC 2008). Citeseer (2008)</p>
      <p>22. Callahan A, Cruz-Toledo J, Ansell P, Dumontier M. Bio2RDF Release 2: Improved Coverage, Interoperability and Provenance of Life Science Linked Data. In: The Semantic Web: Semantics and Big Data (Cimiano P, Corcho O, Presutti V, Hollink L, Rudolph S, eds.). Lecture Notes in Computer Science, vol. 7882, pp. 200-212. Springer Berlin Heidelberg (2013)</p>
      <p>23. The Gene Ontology Consortium. Gene Ontology Consortium: going forward. Nucleic Acids Res. 43 (Database issue): D1049-D1056 (2015)</p>
      <p>24. Eilbeck K, Lewis S, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biology. 6: R44 (2005)</p>
      <p>25. http://www.visualdataweb.org/relfinder.php</p>
      <p>26. Hausenblas M. 5 * Open Data [Internet]. [cited 12 Sep 2016]. Available: http://5stardata.info/en/ (2016)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Smith</surname>
            <given-names>RN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aleksic</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butano</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carr</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contrino</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalderimis</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rutherford</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sullivan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakeling</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watkins</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Micklem</surname>
            <given-names>G.</given-names>
          </string-name>
          <article-title>InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data</article-title>
          .
          <source>Bioinformatics</source>
          .
          <volume>28</volume>
          (
          <issue>23</issue>
          ):
          <fpage>3163</fpage>
          -
          <lpage>5</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lyne</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sullivan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butano</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contrino</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heimbach</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalderimis</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>RN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balakrishnan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Binkley</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harris</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karra</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moxon</surname>
            <given-names>SA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motenko</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhauser</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruzicka</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherry</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richardson</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westerfield</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Worthey</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Micklem</surname>
            <given-names>G</given-names>
          </string-name>
          .
          <article-title>Cross-organism analysis using InterMine</article-title>
          .
          <source>Genesis</source>
          .
          <volume>53</volume>
          (
          <issue>8</issue>
          ):
          <fpage>547</fpage>
          -
          <lpage>60</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lyne</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rutherford</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakeling</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varley</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guillier</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janssens</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ji</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mclaren</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>North</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riley</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sullivan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watkins</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woodbridge</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lilley</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Russell</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mizuguchi</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Micklem</surname>
            <given-names>G.</given-names>
          </string-name>
          <article-title>FlyMine: an integrated database for Drosophila and Anopheles genomics</article-title>
          .
          <source>Genome Biol</source>
          .
          <volume>8</volume>
          (
          <issue>7</issue>
          ):R129 (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Motenko</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhauser</surname>
            <given-names>SB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Keefe</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richardson</surname>
            <given-names>JE</given-names>
          </string-name>
          .
          <article-title>MouseMine: a new data warehouse for MGI</article-title>
          .
          <source>Mamm Genome</source>
          .
          <volume>26</volume>
          (
          <issue>7-8</issue>
          ):
          <fpage>325</fpage>
          -
          <lpage>30</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Howe</surname>
            <given-names>KL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bolt</surname>
            <given-names>BJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cain</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>WJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Done</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Down</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grove</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harris</surname>
            <given-names>TW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kishore</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lomax</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muller</surname>
            <given-names>HM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakamura</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuin</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulini</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raciti</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schindelman</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stanley</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuli</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Auken</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wright</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yook</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berriman</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kersey</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schedl</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sternberg</surname>
            <given-names>PW</given-names>
          </string-name>
          .
          <article-title>WormBase 2016: expanding to enable helminth genomic research</article-title>
          .
          <source>Nucleic Acids Res</source>
          .
          <volume>44</volume>
          (
          <issue>D1</issue>
          ):
          <fpage>D774</fpage>
          -
          <lpage>80</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. http://ratmine.org/</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Balakrishnan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karra</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitz</surname>
            <given-names>BC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Binkley</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hong</surname>
            <given-names>EL</given-names>
          </string-name>
          , et al.
          <article-title>YeastMine--an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit</article-title>
          .
          <source>Database</source>
          .
          <year>2012</year>
          : bar062 (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. http://zebrafishmine.org/</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Contrino</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>RN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butano</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carr</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>R</given-names>
          </string-name>
          , et al.
          <article-title>modMine: flexible access to modENCODE data</article-title>
          .
          <source>Nucleic Acids Res</source>
          .
          <volume>40</volume>
          :
          <fpage>D1082</fpage>
          -
          <lpage>8</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Chen</surname>
            <given-names>Y-A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tripathi</surname>
            <given-names>LP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mizuguchi</surname>
            <given-names>K.</given-names>
          </string-name>
          <article-title>An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework</article-title>
          .
          <source>Database</source>
          .
          <year>2016</year>
          : baw009 (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kalderimis</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butano</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contrino</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyne</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heimbach</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sullivan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Micklem</surname>
            <given-names>G.</given-names>
          </string-name>
          <article-title>InterMine: extensive web services for modern biology</article-title>
          .
          <source>Nucleic Acids Res</source>
          .
          <volume>42</volume>
          (
          <issue>Web Server issue</issue>
          ):
          <fpage>W468</fpage>
          -
          <lpage>72</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Stein</surname>
            <given-names>LD</given-names>
          </string-name>
          .
          <article-title>Integrating biological databases</article-title>
          .
          <source>Nat Rev Genet</source>
          .
          <volume>4</volume>
          :
          <fpage>337</fpage>
          -
          <lpage>345</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Goble</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            <given-names>R.</given-names>
          </string-name>
          <article-title>State of the nation in data integration for bioinformatics</article-title>
          .
          <source>J Biomed Inform</source>
          .
          <volume>41</volume>
          :
          <fpage>687</fpage>
          -
          <lpage>693</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>