<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving discovery in Life Sciences Linked Open Data Cloud</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ali Hasnain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Insight Center for Data Analytics, National University of Ireland</institution>
          ,
          <addr-line>Galway</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multiple datasets that add high value to biomedical research have been exposed on the web as part of the Life Sciences Linked Open Data (LSLOD) Cloud. The ability to easily navigate through these datasets is crucial for personalized medicine and the improvement of the drug discovery process. However, navigating these multiple datasets is not trivial, as most of them are only available as isolated SPARQL endpoints with very little vocabulary reuse. The content that is indexed through these endpoints is scarce, making the indexed datasets opaque to users. We propose an approach to create an active Linked Life Sciences Data Compendium, a set of configurable rules which can be used to discover links between biological entities in the LSLOD cloud. We have catalogued and linked concepts and properties from 137 public SPARQL endpoints. Our Compendium is primarily used to dynamically assemble queries retrieving data from multiple SPARQL endpoints simultaneously.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        A considerable portion of the Linked Open Data cloud is composed of datasets from the Life Sciences Linked Open Data (LSLOD) domain. The significant contributors include the Bio2RDF project (http://bio2rdf.org/, l.a.: 2015-03-31), Linked Life Data (http://linkedlifedata.com/, l.a.: 2015-05-16) and the W3C HCLSIG Linking Open Drug Data (LODD) effort (http://www.w3.org/wiki/HCLSIG/LODD, l.a.: 2014-07-16). The deluge of biomedical data in the last few years, partially due to the advent of high-throughput gene sequencing technologies, has been a primary motivation for these efforts. There has been a critical requirement for a single interface, programmatic or otherwise, to access Life Sciences (LS) data. Although publishing datasets as RDF is a necessary step towards unified querying of biological datasets, it is not sufficient to retrieve meaningful information, because the data is heterogeneously available at different endpoints [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Moreover, in the LS domain, Linked Data (LD) is extremely heterogeneous and dynamic [
        <xref ref-type="bibr" rid="ref14 ref6">14,6</xref>
        ]; there is also a recurrent need for ad hoc integration of novel experimental datasets, owing to the speed at which data-capture technologies in this domain are evolving. As such, integrative solutions increasingly rely on the federation of queries [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. With the standardization of SPARQL 1.1, it is now possible to assemble federated queries using the "SERVICE" keyword, which is already supported by multiple tool-sets (e.g., SWObjects and Fuseki). To assemble queries encompassing multiple graphs distributed across different locations, all datasets must be query-able using the same global schema [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. This can be achieved either by ensuring that the multiple datasets use the same vocabularies and ontologies, an approach previously described as "a priori integration", or conversely by using "a posteriori integration", which applies mapping rules that change the topology of remote graphs to match the global schema [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The methodology facilitating the latter approach is the focus of our research. Moreover, for LD to become a core technology in the LS domain, three challenges need to be addressed: i) dynamically discovering datasets containing data about biological entities (e.g., drugs, molecules), ii) retrieving information about the same entities from multiple sources using different schemas, and iii) identifying, for a given query, the highest-quality data.
      </p>
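      <p>As an illustration of the query federation described above, the following Python sketch assembles a SPARQL 1.1 query that wraps per-endpoint triple patterns in SERVICE blocks. The endpoint names and triple patterns are invented placeholders; real queries would use full endpoint IRIs in angle brackets.</p>
      <p>
```python
# Minimal sketch: build a SPARQL 1.1 federated query from a mapping of
# endpoints to the triple patterns each should answer. Endpoint names and
# patterns are illustrative placeholders, not the paper's actual data.
def federated_query(patterns_by_endpoint):
    """Wrap each endpoint's triple patterns in a SERVICE block."""
    blocks = []
    for endpoint, patterns in sorted(patterns_by_endpoint.items()):
        body = " . ".join(patterns)
        blocks.append("SERVICE %s { %s }" % (endpoint, body))
    return "SELECT * WHERE { %s }" % " ".join(blocks)

q = federated_query({
    "ep:drugbank": ["?drug rdfs:label ?name"],
    "ep:chebi": ["?drug chebi:charge ?charge"],
})
print(q)
```
      </p>
      <p>Engines such as SWObjects or Fuseki, mentioned above, would then evaluate each SERVICE block against the corresponding remote endpoint.</p>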
      <p>
        To address the aforementioned challenges, we introduce the notion of an active Compendium for LS data: a representation of entities and the links connecting them. Our methodology consists of two steps: i) catalogue development, in which metadata is collected and analyzed, and ii) link creation, which ensures that concepts and properties are properly mapped to a set of Query Elements (Qe) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. For evaluation purposes, the Qe are defined in the context of Drug Discovery and can be replaced by other Qe(s). We assume that the proposed Compendium holds the potential to be used in a number of practical applications, including assembling federated queries in a particular context. We have previously presented the link creation mechanism, approaches and linking statistics [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] as well as the cataloguing mechanism [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and in this article we briefly report the methodology, initial results for Compendium development (cataloguing and linking), and an architecture for implementing a Domain-Specific Query Engine (work in progress) as one of the practical applications that federate SPARQL queries based on the set of mapping rules defined in the Compendium.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 State of the Art</title>
      <p>
        One approach to facilitating "a posteriori integration" is through the use of available schemas: semantic information systems have used ontologies to represent domain-specific knowledge and enable users to select ontology terms during query assembly [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. BLOOMS, for example, finds schema-level links between LOD datasets using ontology alignment [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], but it relies mainly on Wikipedia. Ontology alignment typically starts from a single ontology, which is not available for most SPARQL endpoints in the LOD cloud, and hence could not be applied in our case. Furthermore, ontology alignment makes use of neither domain rules (e.g., two identical sequences qualify as the same gene) nor URI pattern matching for alignment [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Approaches such as VoID [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and the SILK framework [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] enable the identification of rules for link creation, but require extensive knowledge of the data prior to link creation. Query federation approaches have developed techniques to meet the requirements of efficient query computation in a distributed environment. FedX [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], a project which extends the Sesame framework [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] with a federation layer, enables efficient query processing over distributed LOD sources by relying on an assembled catalogue of SPARQL endpoints, but does not use domain rules for link creation. Our approach to link creation for Compendium development combines several linking approaches, as explained by Hasnain et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]: i) similarly to ontology alignment, we use label matching to discover concepts in LOD that should be mapped to a set of Qe; ii) we create "bags of words" for the discovery of schema-level links, similar to the approach taken by BLOOMS; and iii) as in SILK, we create domain rules that enable the discovery of links.
      </p>
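      <p>The label-matching step above can be sketched as follows: class labels from different endpoints are normalised and linked to a Qe when the normalised forms coincide. The labels and Qe names below are made-up examples, not the paper's actual data.</p>
      <p>
```python
# Sketch of label matching for link discovery: normalise labels and pair
# each endpoint class with any Query Element (Qe) it matches.
def normalise(label):
    """Lower-case the label and drop every non-alphanumeric character."""
    return "".join(ch for ch in label.lower() if ch.isalnum())

def label_links(endpoint_classes, query_elements):
    """Return (endpoint class, Qe) pairs whose normalised labels match."""
    links = []
    for cls in endpoint_classes:
        for qe in query_elements:
            if normalise(cls) == normalise(qe):
                links.append((cls, qe))
    return links

links = label_links(["Drug", "small_molecule", "Enzyme"],
                    ["drug", "Small Molecule"])
print(links)
```
      </p>
      <p>The other two approaches (bags of words, domain rules) would plug in alternative comparison functions in place of the normalised string equality used here.</p>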
    </sec>
    <sec id="sec-3">
      <title>3 Proposed Approach/Methodology</title>
      <p>We propose a Compendium for navigating the LSLOD cloud. Our methodology consists of two stages, namely catalogue generation and link generation. Data was retrieved from 137 public SPARQL endpoints4 and organized in an RDF document, the LSLOD Catalogue. The list of SPARQL endpoints was compiled from publicly available Bio2RDF datasets and Datahub5.</p>
      <sec id="sec-3-1">
        <title>3.1 Methodology for Catalogue Development</title>
        <p>For cataloguing, a preliminary analysis of multiple public SPARQL endpoints was undertaken and a semi-automated method was devised to retrieve all classes (concepts) and associated properties (attributes) available, by probing data instances. The workflow definition is given below. The RDFS, Dublin Core6 and VoID7 vocabularies were used to represent the data in the LSLOD catalogue; a slice of the catalogue for the PubChem SPARQL endpoint is presented (Fig. 1).</p>
        <sec id="sec-3-1-1">
          <title>Workflow</title>
          <p>4 http://goo.gl/ZLbLzq; 5 http://datahub.io/ (l.a.: 2015-05-05); 6 http://dublincore.org/documents/dcmi-terms/ (l.a.: 2014-07-12); 7 http://vocab.deri.ie/void (l.a.: 2014-07-12)</p>
          <p>1. For every SPARQL endpoint Si, find the distinct classes C(Si):
C(Si) = Distinct (Project (?class (toList (BGP (triple [ ] a ?class)))))
(1)
2. Collect instances for each class Cj(Si):</p>
          <p>Ii : Cj(Si) = Slice (Project (?I (toList (BGP (triple ?I a &lt; Cj(Si) &gt; )))); rand())
(2)
3. Retrieve the predicate/object pairs for each Ii : Cj(Si):</p>
          <p>Ii(P, O) = Distinct (Project (?p, ?o (toList (BGP (triple &lt; Ii : Cj(Si) &gt; ?p ?o )))))
(3)
4. Assign class Cj(Si) as the domain of property Pk:</p>
          <p>
            Domain(Pk) = Cj(Si)
(4)
5. Retrieve the object type (OT) and assign it as the range of property Pk:
Range(Pk) = OT, where OT = rdf:Literal if Ok is a String; dc:Image if Ok is an Image; dc:InteractiveResource if Ok is a URL; and Project (?R (toList (BGP (triple &lt; Ok &gt; rdf:type ?R)))) if Ok is an IRI
(5)
During this phase, subClassOf and subPropertyOf links were created amongst different concepts and properties to facilitate "a posteriori integration". The creation of links between identified entities (both chemical and biological) is not only useful for entity identification, but also for the discovery of new associations, such as protein/drug, drug/drug or protein/protein interactions, that may not be obvious when analyzing datasets individually. Figure 1 shows the subClassOf and subPropertyOf links with the defined Qes. Links were created (as discussed previously
in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ]) using several approaches: i) Naïve Matching/Syntactic Matching/Label Matching, ii) Named Entity Matching, iii) Domain-dependent/unique identifier Matching, and iv) Regex Matching.
          </p>
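          <p>Workflow steps 1 to 3 above can be sketched in runnable form as follows, with the SPARQL endpoint replaced by an in-memory list of triples; in practice each function corresponds to a SPARQL query against the live endpoint, and the class and property names below are invented stand-ins.</p>
          <p>
```python
# Runnable sketch of workflow steps 1-3: discover classes, sample a random
# instance per class, then collect its predicate/object pairs.
import random

triples = [  # (subject, predicate, object) stand-ins for endpoint data
    ("mol1", "a", "Molecule"), ("mol1", "label", "aspirin"),
    ("mol2", "a", "Molecule"), ("mol2", "mass", "180.16"),
    ("d1", "a", "Drug"), ("d1", "label", "Aspirin"),
]

def distinct_classes(store):                  # step 1: SELECT DISTINCT ?class
    return sorted({o for s, p, o in store if p == "a"})

def sample_instance(store, cls, rng):         # step 2: random slice of instances
    instances = [s for s, p, o in store if p == "a" and o == cls]
    return rng.choice(instances)

def predicate_object_pairs(store, instance):  # step 3: ?p ?o for the instance
    return sorted({(p, o) for s, p, o in store if s == instance and p != "a"})

rng = random.Random(0)
catalogue = {}
for cls in distinct_classes(triples):
    inst = sample_instance(triples, cls, rng)
    catalogue[cls] = predicate_object_pairs(triples, inst)
print(catalogue)
```
          </p>
          <p>Steps 4 and 5 then record each property's domain (its class) and range (the object type) in the catalogue.</p>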
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Applications/Current Implementation</title>
      <p>As of 31 May 2015, the Compendium consists of 280,064 triples, representing 1,861 distinct classes and 3,299 distinct properties catalogued from 137 endpoints.</p>
      <sec id="sec-4-1">
        <title>4.1 DSQE: Domain-specific Query Engine</title>
        <p>The general architecture of the DSQE (Fig 2) shows that, given a SPARQL query, the first step is to parse the query and obtain its individual triple patterns. The Compendium is then used for triple pattern-wise source selection (TPWSS), identifying the relevant sources for each individual triple pattern of the query. The Compendium enumerates the known endpoints, relates each endpoint with one or more graphs, and maps the local vocabulary to the vocabulary of the graph. The resulting query is executed on top of the Apache Jena query engine.</p>
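        <p>The TPWSS step described above can be sketched as follows: a catalogue maps each endpoint to the predicates of its local vocabulary, and each triple pattern is matched against it. The catalogue contents and patterns are illustrative only.</p>
        <p>
```python
# Sketch of triple pattern-wise source selection (TPWSS) over a toy catalogue.
catalogue = {  # predicate vocabulary per endpoint; contents are illustrative
    "drugbank": {"label", "indication"},
    "chebi": {"label", "charge"},
    "kegg": {"pathway"},
}

def select_sources(triple_patterns, catalogue):
    """Map each (s, p, o) pattern to the endpoints whose vocabulary contains p."""
    selection = {}
    for s, p, o in triple_patterns:
        selection[(s, p, o)] = sorted(
            ep for ep, preds in catalogue.items() if p in preds)
    return selection

query = [("?d", "label", "?name"), ("?d", "charge", "?c")]
selection = select_sources(query, catalogue)
print(selection)
```
        </p>
        <p>The real Compendium additionally maps local vocabulary terms to the Qe schema before this matching takes place.</p>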
        <p>
          An instance8 of the DSQE is deployed in the context of drug discovery
[
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Using the `Standard' query builder, the user can select a topic of interest (e.g.
        </p>
        <sec id="sec-4-1-1">
          <title>8 http://srvgal86.deri.ie:8000/graph/Granatum</title>
          <p>Molecule) along with a list of associated properties. We plot the catalogued subclasses of a few Qe, and the total number of distinct instances retrieved per Qe when querying using the DSQE (Fig 3).</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Evaluation</title>
      <p>
        So far, we have evaluated the performance of our catalogue generation methodology and recorded the times taken to probe instances through endpoint analysis of 12 endpoints whose underlying data sources were considered relevant for drug discovery: Medicare, Dailymed, Diseasome, DrugBank, LinkedCT, Sider, National Drug Code Directory (NDC), SABIO-RK, Saccharomyces Genome Database (SGD), KEGG, ChEBI and Affymetrix probesets. The cataloguing experiments were carried out on a standard machine with a 1.60 GHz processor and 8 GB RAM, using a 10 Mbps internet connection. We recorded the total available concepts and properties at each SPARQL endpoint, as well as those actually catalogued in our Compendium [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The total number of triples exposed at each of these SPARQL endpoints and the time taken for cataloguing were also recorded. We selected those SPARQL endpoints with better latency for this evaluation, as the availability and uptime of a SPARQL endpoint is an important factor for cataloguing. Best-fit regression models were then calculated. As shown in Fig. 4, our methodology took less than 1,000,000 milliseconds (under 17 minutes) to catalogue seven of the SPARQL endpoints, with a gradual rise as the number of available concepts and properties increases. We obtained two power regression models (T = 29206 * Cn^1.113 and T = 7930 * Pn^1.027) to help extrapolate the time taken to catalogue any SPARQL endpoint with a fixed set of available concepts (Cn) and properties (Pn), with R² values of 0.641 and 0.547 respectively. Using these models and knowing the total number of available concepts/properties, a developer could estimate the approximate time (in ms) as a vector combination. The KEGG and SGD endpoints took abnormally longer to catalogue than the trendline predicts. The reasons for this may include endpoint timeouts or network
      </p>
      <p>
        delays (Fig. 4: time taken to catalogue 12 SPARQL endpoints). We also evaluated the performance of our link generation methodology by comparing it against popular linking approaches. Using the WordNet thesaurus, we attempted to automate the creation of bags of related words using six algorithms [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]: Jiang-Conrath, Lin, Path, Resnik, Vector and Wu-Palmer, with unsatisfactory results (Figure 5(c)). Our linking approaches resulted in a better linking rate, as shown in Figure 5(a,b).
      </p>
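      <p>The two regression models above can be applied as follows to estimate cataloguing time; the concept and property counts below are hypothetical inputs, and the modest R² values mean the outputs are rough estimates only.</p>
      <p>
```python
# Applying the two power-law models reported above (T in milliseconds):
# T = 29206 * Cn**1.113 for concepts, T = 7930 * Pn**1.027 for properties.
# The counts below are hypothetical inputs, not measurements from the paper.
def estimate_ms(concepts, properties):
    t_concepts = 29206 * concepts ** 1.113
    t_properties = 7930 * properties ** 1.027
    return t_concepts, t_properties

tc, tp = estimate_ms(concepts=20, properties=150)
print(round(tc), round(tp))  # rough estimates only, given R^2 of 0.641 / 0.547
```
      </p>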
    </sec>
    <sec id="sec-6">
      <title>6 Discussion</title>
      <p>
        There is great potential in using Semantic Web and LD technologies for accessing and querying Life Sciences data to find meaningful biological correlations. However, in most cases, it is not possible to predict a priori where the relevant data is available or how it is represented. Our current research provides the concept and methodology for devising an active Linked Life Sciences Data Compendium that relies on systematically issuing queries against various Life Sciences SPARQL endpoints and collecting the results, an approach that would otherwise have to be encoded manually by domain experts. The current experiments and evaluation use a set of Qe that were defined in the context of drug discovery. The number of classes per endpoint varied from a single class to a few thousand. Our initial exploration of the LSLOD revealed that only 15% of classes are reused. However, this was not the case for properties, of which 48.5% are reused. Most of the properties found were domain-independent (e.g., type, seeAlso); however, these are not relevant for the Compendium as they cannot increase the richness of the information content. Although only a very low percentage of linking becomes possible through naïve matching or manual/domain matching, the quality of the links created is highly trusted [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. It is also worth noting that 23% of the identified classes and 56.2% of the properties remained unlinked, either because they are out of scope or because they cannot match any Qe. This means that the quality, as well as the quantity, of the links created is highly dependent on the set of Qe used.
      </p>
    </sec>
    <sec id="sec-7">
      <title>7 Open Issues and Future Directions</title>
      <p>We faced multiple challenges which can hinder the applicability of our approach:
- Some endpoints return timeout errors even when a simple query (SELECT DISTINCT ?Concept WHERE {[ ] a ?Concept}) is issued.
- Some endpoints have high downtime and cannot generally be relied upon.
- Many endpoints provide non-dereferenceable URIs, and some dereferenceable URIs do not provide a "type" for the instance.</p>
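      <p>A simple guard against the timeout and downtime problems listed above is to retry the probing query with backoff and skip endpoints that keep failing. The sketch below stands in for the real SPARQL call with a deliberately flaky stub; the endpoint name is illustrative.</p>
      <p>
```python
# Sketch: retry the class-probing call on timeouts, then give up gracefully.
import time

def probe_with_retries(probe, endpoint, retries=3, backoff_s=0.01):
    """Run probe(endpoint); on a timeout, retry with backoff, then give up."""
    for attempt in range(retries):
        try:
            return probe(endpoint)
        except TimeoutError:
            time.sleep(backoff_s * (2 ** attempt))
    return None  # endpoint skipped, as done for high-downtime endpoints

calls = []
def flaky_probe(endpoint):
    """Stand-in for the real probing query: fails twice, then succeeds."""
    calls.append(endpoint)
    if len(calls) in (1, 2):
        raise TimeoutError
    return ["Molecule", "Drug"]

classes = probe_with_retries(flaky_probe, "example-endpoint")
print(classes)
```
      </p>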
      <p>
        In future, an extension under consideration for the Compendium is to enrich it with statistical and provenance information, with appropriate changes to the DSQE, and to evaluate the overall performance. This includes void:triples, void:entities, void:classes, void:properties, void:distinctSubjects and void:distinctObjects in the case of statistical cataloguing, and dcterms:title, dcterms:description, dcterms:date, dcterms:publisher, dcterms:contributor, dcterms:source, dcterms:creator, dcterms:created, dcterms:issued and dcterms:modified in the case of provenance. Currently, we are extending the DSQE to convert any SPARQL 1.0 query into a corresponding SPARQL 1.1 query by using the TPWSS information and the SPARQL "SERVICE" clause. With this in place, the DSQE will be able to answer any federated SPARQL query, provided the desired endpoints are catalogued in the Compendium. We aim to compare the performance of this extended DSQE with the state-of-the-art query engine FedX [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] using extensive evaluation criteria, including source selection in terms of the number of ASK queries, the total triple pattern-wise sources selected, the source selection time, and the total number of results retrieved per query. For this evaluation, we aim to select queries from an available query federation benchmark, e.g. FedBench [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and also plan to define some complex biological queries applicable to 10 real-world publicly available datasets. Issues related to identity resolution are also considered future work.
      </p>
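      <p>The planned SPARQL 1.0-to-1.1 rewriting can be sketched as follows, assuming TPWSS has already selected one source per triple pattern; the endpoint names are illustrative prefixes rather than the real angle-bracketed IRIs.</p>
      <p>
```python
# Sketch: rewrite a plain query into a SPARQL 1.1 query where each triple
# pattern is wrapped in a SERVICE block for its TPWSS-selected endpoint.
def rewrite(triple_patterns, source_for_pattern):
    """Wrap each pattern in a SERVICE block for its selected endpoint."""
    blocks = []
    for pattern in triple_patterns:
        endpoint = source_for_pattern[pattern]
        blocks.append("SERVICE %s { %s %s %s }" % ((endpoint,) + pattern))
    return "SELECT * WHERE { %s }" % " . ".join(blocks)

patterns = [("?d", "rdfs:label", "?name"), ("?d", "db:indication", "?i")]
sources = {patterns[0]: "ep:chebi", patterns[1]: "ep:drugbank"}
federated = rewrite(patterns, sources)
print(federated)
```
      </p>
      <p>A production rewriter would additionally group consecutive patterns sharing the same source into a single SERVICE block to reduce remote calls, as FedX does with its exclusive groups.</p>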
    </sec>
    <sec id="sec-8">
      <title>Acknowledgement</title>
      <p>This research has been supported in part by Science Foundation Ireland under grant numbers SFI/12/RC/2289 and SFI/08/CE/I1380 (Lion 2). The author would also like to acknowledge Stefan Decker as PhD supervisor.</p>
    </sec>
    <sec id="sec-9">
      <title>References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hausenblas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Describing linked datasets: on the design and usage of VoID, the 'vocabulary of interlinked datasets'</article-title>
          .
          <source>In: In Linked Data on the Web Workshop (LDOW 09)</source>
          ,
          <article-title>in conjunction with WWW09</article-title>
          .
          <source>Citeseer</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buchan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Roure</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>Why linked data is not enough for scientists</article-title>
          .
          <source>Future Generation Computer Systems</source>
          <volume>29</volume>
          (
          <issue>2</issue>
          ),
          <volume>599</volume>
          –
          <fpage>611</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Broekstra</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kampman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Sesame: A generic architecture for storing and querying RDF and RDF schema</article-title>
          .
          <source>In: The Semantic Web – ISWC</source>
          <year>2002</year>
          , pp.
          <volume>54</volume>
          –
          <fpage>68</fpage>
          . Springer (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cheung</surname>
            ,
            <given-names>K.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frost</surname>
            ,
            <given-names>H.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marshall</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          , et al.:
          <article-title>A journey to semantic web query federation in the life sciences</article-title>
          .
          <source>BMC bioinformatics 10(Suppl</source>
          <volume>10</volume>
          ),
          <source>S10</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adamusiak</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , et al.:
          <article-title>Translating standards into practice – one semantic web API for gene expression</article-title>
          .
          <source>Journal of biomedical informatics 45(4)</source>
          ,
          <volume>782</volume>
          –
          <fpage>794</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Goble</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>Data curation+ process curation= data integration+ science</article-title>
          .
          <source>Briefings in Bioinformatics 9(6)</source>
          ,
          <volume>506</volume>
          –
          <fpage>517</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fox</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          :
          <article-title>Cataloguing and linking life sciences LOD Cloud</article-title>
          .
          <source>In: 1st International Workshop on Ontology Engineering in a Datadriven World collocated with EKAW12</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamdar</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasapis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeginis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warren Jr</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          , et al.:
          <article-title>Linked Biomedical Dataspace: Lessons Learned integrating Data for Drug Discovery</article-title>
          . In: International Semantic Web Conference (In-Use Track),
          <year>October 2014</year>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>e Zainab</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamdar</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehmood</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warren Jr</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fatimah</surname>
            ,
            <given-names>Q.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehdi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A roadmap for navigating the life sciences linked open data cloud</article-title>
          .
          <source>In: Semantic Technology</source>
          , pp.
          <volume>97</volume>
          –
          <fpage>112</fpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verma</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yeh</surname>
            ,
            <given-names>P.Z.</given-names>
          </string-name>
          :
          <article-title>Ontology alignment for linked open data</article-title>
          .
          <source>In: The Semantic Web – ISWC</source>
          <year>2010</year>
          , pp.
          <volume>402</volume>
          –
          <fpage>417</fpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Petrovic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burcea</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacobsen</surname>
            ,
            <given-names>H.A.:</given-names>
          </string-name>
          <article-title>S-ToPSS: semantic toronto publish/subscribe system</article-title>
          .
          <source>In: Proceedings of the 29th international conference on Very large data bases-Volume</source>
          <volume>29</volume>
          . pp.
          <volume>1101</volume>
          –
          <fpage>1104</fpage>
          .
          VLDB Endowment
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Görlitz</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haase</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ladwig</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwarte</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>FedBench: a benchmark suite for federated semantic data query processing</article-title>
          .
          <source>In: The Semantic Web – ISWC</source>
          <year>2011</year>
          , pp.
          <fpage>585</fpage>
          –
          <lpage>600</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Schwarte</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haase</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hose</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schenkel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>FedX: a federation layer for distributed query processing on linked open data</article-title>
          .
          <source>In: The Semantic Web: Research and Applications</source>
          , pp.
          <fpage>481</fpage>
          –
          <lpage>486</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          :
          <article-title>Integrating biological databases</article-title>
          .
          <source>Nature Reviews Genetics</source>
          <volume>4</volume>
          (
          <issue>5</issue>
          ),
          <fpage>337</fpage>
          –
          <lpage>345</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Studer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grimm</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abecker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Semantic web services: concepts, technologies, and applications</article-title>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Volz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaedke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Discovering and maintaining links on the web of data</article-title>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Zeginis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>A collaborative methodology for developing a semantic model for interlinking Cancer Chemoprevention linked-data sources</article-title>
          .
          <source>Semantic Web</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>