<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Posteriori Integration for Life Sciences Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ali Hasnain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Insight Center for Data Analytics, National University of Ireland</institution>
          ,
          <addr-line>Galway</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Multiple datasets that add high value to biomedical research have been exposed on the web as part of the Life Sciences Linked Open Data (LS-LOD) Cloud. The ability to easily navigate through these datasets is crucial in order to draw meaningful biological correlations. However, navigating these multiple datasets is not trivial, as most of them are only available as isolated SPARQL endpoints with very little vocabulary reuse. We propose an approach for Autonomous Resource Discovery and Indexing (ARDI), a set of configurable rules which can be used to discover links between biological entities in the LS-LOD cloud. We have catalogued and linked concepts and properties from 137 public SPARQL endpoints. The ARDI is used to dynamically assemble queries retrieving data from multiple SPARQL endpoints simultaneously.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        that vocabularies and ontologies are reused [21]. This can be achieved either
by ensuring that the multiple datasets make use of the same vocabularies and
ontologies, known as "a priori integration" [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], or by using "a posteriori integration",
which makes use of mapping rules that change the topology of graphs such that
integrated queries become possible. "A posteriori" solutions are favoured by
Semantic Web technologies, as these include mechanisms to declare that two classes,
for example two classes describing experiments, are "the same" [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Our work
focuses on a methodology to facilitate "a posteriori integration".
      </p>
    </sec>
    <sec id="sec-2">
      <title>Problem Statement</title>
      <p>
        In the Life Sciences domain, Linked Data is extremely heterogeneous and
dynamic [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This heterogeneity is both syntactic and semantic. There is also
a recurrent need for ad hoc integration of novel experimental datasets,
due to the speed at which technologies for data capture in this domain are
evolving. As such, integrative solutions increasingly rely on federation of queries
[
        <xref ref-type="bibr" rid="ref1 ref10">24,10,1</xref>
        ]. The standardisation of SPARQL 1.1 has made it possible to assemble
federated queries using the "SERVICE" keyword. To assemble queries encompassing
multiple graphs distributed over different places, all datasets
must be queryable using the same global schema [
        <xref ref-type="bibr" rid="ref11 ref13">11,13</xref>
        ]. This can be achieved
either by ensuring that the multiple datasets make use of the same vocabularies
and ontologies, an approach known as "a priori integration", or by using "a
posteriori integration", which makes use of mapping rules that change the topology of
remote graphs to match the global schema [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The methodology to facilitate
the latter approach is the focus of our research.
      </p>
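      <p>The role of the "SERVICE" keyword described above can be illustrated with a minimal sketch that assembles a federated SPARQL 1.1 query string from per-endpoint triple patterns. The function name, the endpoint URLs and the triple patterns are hypothetical placeholders chosen for illustration; they are not taken from ARDI.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def build_federated_query(variables, service_patterns):
    """Assemble a SPARQL 1.1 query with one SERVICE block per endpoint.

    variables: names to project, without the leading question mark.
    service_patterns: dict mapping an endpoint URL to the list of triple
    patterns (plain strings) to evaluate at that endpoint.
    """
    blocks = []
    for endpoint, patterns in service_patterns.items():
        body = " .\n    ".join(patterns)
        blocks.append("  SERVICE <%s> {\n    %s .\n  }" % (endpoint, body))
    head = "SELECT " + " ".join("?" + name for name in variables)
    return head + "\nWHERE {\n" + "\n".join(blocks) + "\n}"


# Hypothetical endpoints: the shared variable ?protein joins both sources.
example = build_federated_query(
    ["protein", "label"],
    {
        "http://example.org/proteins/sparql": ["?protein a ?type"],
        "http://example.org/labels/sparql": [
            "?protein <http://www.w3.org/2000/01/rdf-schema#label> ?label"
        ],
    },
)
print(example)
```
]]></preformat>
      <p>Such a query only returns joined results if both endpoints use compatible identifiers and terms for the shared variable, which is the gap that a priori or a posteriori integration has to close.</p>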
    </sec>
    <sec id="sec-3">
      <title>Relevancy</title>
      <p>This problem is important for researchers using Linked Open Data in general,
and for biomedical and bioinformatics researchers in particular.</p>
    </sec>
    <sec id="sec-4">
      <title>Research question(s)</title>
      <p>For Linked Data (LD) to become a core technology in the Life Sciences (LS) domain, three issues need to be
addressed, which also provide the baseline for the research questions of our work: i) how to
dynamically discover datasets containing data on biological entities (e.g.
Proteins, Genes), ii) how to retrieve information about the same entities from
multiple sources using different schemas, and iii) how to identify, for a given query, the
data with the highest quality.</p>
    </sec>
    <sec id="sec-5">
      <title>Hypothesis</title>
      <p>Our hypothesis can be summarised as follows:
"Given heterogeneous data from a publicly available Life Sciences Linked Open
Data corpus over distributed infrastructure, can we demonstrate improvements
to SPARQL Query Federation for Knowledge Discovery by the generation of
ARDI, an approach for indexing concepts and properties from distinct endpoints
(partially) achieving a posteriori integration of data".</p>
    </sec>
    <sec id="sec-6">
      <title>Approach</title>
      <p>To address the aforementioned research questions, we introduce the notion of
Autonomous Resource Discovery and Indexing (ARDI): a representation of
concepts and the links connecting these concepts. ARDI would not only help
to understand which data exists in each LS SPARQL endpoint but, more
importantly, enable the assembly of multiple source-specific federated SPARQL queries.
Since our work is based on data exposed as public SPARQL endpoints, it is
important to analyse the content of each endpoint before creating ARDI. Hence
our overall approach comprises four distinct stages (Figure 1).</p>
      <p>The public SPARQL endpoints are to be analysed with respect to two
questions: i) what is the content of a public SPARQL endpoint, and ii) how self-descriptive
is the endpoint? Analysing the content, e.g. in terms of a) the number of classes,
b) the number of properties and c) the list of classes, is necessary to gauge the
size of each endpoint and to find similar data available at multiple data sources. Knowing
how self-descriptive an endpoint is reveals the structure of the
data stored there in terms of class partitions, property partitions and
nested partitions. By self-descriptive, we mean the ability of an
endpoint to describe itself based on the data it stores. In other words, a user
can find information about the endpoint and the data stored by
simply querying the data itself. This includes the type of data (e.g. list of classes
and properties), the amount of data (e.g. a statistical snapshot of the entities,
triples, classes and properties), the structure of data (class partitions, property
partitions and nested class/property partitions) and a further classification of data
(e.g. literals, blank nodes and IRIs). This analysis is presented by Hasnain et al.
[14]. It provides baseline information about public SPARQL
endpoints as we catalogue and link the content of these endpoints (ARDI) to
support "a posteriori" integration.</p>
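      <p>The kind of content analysis described above can be sketched with a handful of standard SPARQL 1.1 aggregate queries. The query set and the helper below are illustrative assumptions, not the profiling queries published in [14]; in particular, run_query is a hypothetical caller-supplied function that sends one query string to an endpoint and returns its rows.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# SPARQL 1.1 aggregate queries for a basic content profile of one endpoint.
PROFILE_QUERIES = {
    "class_count": "SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }",
    "property_count": "SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }",
    "class_list": "SELECT DISTINCT ?c WHERE { ?s a ?c }",
    "class_partition": "SELECT ?c (COUNT(?s) AS ?n) WHERE { ?s a ?c } GROUP BY ?c",
    "property_partition": "SELECT ?p (COUNT(*) AS ?n) WHERE { ?s ?p ?o } GROUP BY ?p",
}


def profile_endpoint(run_query):
    """Run each profiling query and separate answers from failures.

    run_query is supplied by the caller (an assumption of this sketch): it
    sends one SPARQL string to an endpoint and returns the result rows.
    """
    answers, failures = {}, {}
    for name, query in PROFILE_QUERIES.items():
        try:
            answers[name] = run_query(query)
        except Exception as error:  # timeouts, unsupported aggregates, ...
            failures[name] = str(error)
    return answers, failures
```
]]></preformat>
      <p>Recording the failures separately gives a rough indication of how self-descriptive an endpoint is: the more of these queries it can answer, the more of its own structure it can report.</p>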
      <sec id="sec-6-1">
        <title>Autonomous Resource Discovery and Indexing (ARDI)</title>
        <p>
          The ARDI comprises a catalogue of LS-LOD and a set of functions to perform
standard queries against it. The methodology for developing the ARDI consists
of two stages, namely catalogue generation [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and link generation [15]. The
methodology for catalogue generation relies on retrieving all "types" (distinct
concepts) from each SPARQL endpoint and all associated properties with
corresponding instances. Data was retrieved from more than 130 public SPARQL
endpoints<sup>4</sup>, where the list was compiled from publicly available Bio2RDF
datasets and by searching Datahub<sup>5</sup> for datasets tagged "life science" or
"healthcare". Hasnain et al. presented the methodology for catalogue generation [15] and
link generation [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] using naïve, named-entity and domain matching approaches
for weaving the "types" together for a set of query elements (Qe).
        </p>
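        <p>As a rough illustration of the catalogue generation step, the sketch below folds (endpoint, concept, property) rows into a nested catalogue. The query text, the row layout and the function name are assumptions made for this example and do not reproduce the published implementation.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# Retrieves every distinct concept ("type") with the properties its instances use.
TYPE_PROPERTY_QUERY = """
SELECT DISTINCT ?type ?property
WHERE { ?instance a ?type ; ?property ?value }
"""


def build_catalogue(rows):
    """Group (endpoint, concept, property) rows into a nested catalogue.

    Returns {endpoint: {concept: sorted list of properties}}; duplicate rows
    are collapsed.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for endpoint, concept, prop in rows:
        grouped[endpoint][concept].add(prop)
    return {
        endpoint: {concept: sorted(props) for concept, props in concepts.items()}
        for endpoint, concepts in grouped.items()
    }
```
]]></preformat>
        <p>In this layout, each row would come from running TYPE_PROPERTY_QUERY against one endpoint and tagging the result with that endpoint's URL.</p>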
      </sec>
      <sec id="sec-6-2">
        <title>Query Engine</title>
        <p>As the practical application of ARDI, a Domain Specific Query Engine is in the
design phase. It would offer a single point of access to distributed life
science data from reliable sources without requiring extensive expertise in SPARQL query
formulation. The ARDI identifies relevant triple patterns and matches types
according to their labels as a basic semantic normalisation approach. New public
endpoints are added through a cataloguing mechanism defined by ARDI. The Query
Engine would also provide provenance information covering the sources queried,
the number of triples returned and the retrieval time.</p>
      </sec>
      <sec id="sec-6-3">
        <title>Linked Biomedical Dataspace (LBDS)</title>
        <p>
          The combination of the different components and technologies (ARDI (Cataloguing
and Linking), Query Federation and Visual Query Explorer/Aggregator)
constitutes a dataspace, which we call the Linked Biomedical Dataspace [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The Linked
Biomedical Dataspace (LBDS) enables the semantically enriched representation,
exposure, interconnection, querying and browsing of biomedical data and
knowledge in a standardised and homogenised way.
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Related Work</title>
      <p>The relevant areas of related work are: i) Linked Data access methods, ii)
discovering SPARQL endpoints, iii) cataloguing and linking, and iv) SPARQL federation systems.</p>
      <sec id="sec-7-1">
        <title>Linked Data access methods</title>
        <p>
          There have been three methods provided to access content from knowledge bases
published as Linked Data: dereferencing, where IRIs of interest are looked
up via HTTP; dumps, where the entire content of a dataset is made available for
download; and SPARQL endpoints, where a query interface is provided over
the local content. A more recent proposal, Linked Data Fragments [26], has
begun to gain attention. SPARQL endpoints push the burden from data
consumers to producers: hosting such a public query service is expensive and, as a
result, endpoints may not be able to answer all queries for all consumer agents [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
As an alternative to SPARQL endpoints, Verborgh et al. [26] propose methods for
providing and organising multiple access methods to a Linked Dataset, including
a lightweight "triple pattern fragment", which allows clients to request all triples
matching a single pattern.
          <fn id="fn4"><label>4</label><p>http://goo.gl/ZLbLzq</p></fn>
          <fn id="fn5"><label>5</label><p>https://datahub.io/ (last accessed: 2016-05-05)</p></fn>
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>Discovering SPARQL endpoints</title>
        <p>
          There are two high-level options for discovering SPARQL endpoints with relevant
data: (1) flood the endpoints with queries, or (2) build a central search index.
For example, federated SPARQL engines employ one or both of these
strategies [
          <xref ref-type="bibr" rid="ref1 ref2 ref4">22,24,2,1,4</xref>
          ]. Paulheim et al. [20] looked at how to find a SPARQL endpoint
containing content about a given Linked Data URI using VoID descriptions and
the DataHub catalogue. Buil-Aranda et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] propose SPARQLES as a
catalogue of SPARQL endpoints, but focus on performance and stability metrics
rather than on cataloguing content. Likewise, the analysis of public
endpoints by Lorey [19] focused on characterising the performance offered by these services
rather than on the problem of discovery.
        </p>
      </sec>
      <sec id="sec-7-3">
        <title>Cataloguing and Linking</title>
        <p>
          Ontology alignment approaches cannot be used for cataloguing, as these neither
make use of domain rules (e.g. two identical sequences qualify as the same gene)
nor use URI pattern matching for alignment [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Approaches such as
VoID [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and the SILK Framework [27] enable the identification of rules for link
creation, but require extensive knowledge of the data prior to link creation. Our
approach to link creation combines several linking approaches, as
already explained by Hasnain et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]: i) similarly to ontology alignment, we
make use of label matching to discover concepts in LOD that should be mapped
to a set of Qe, ii) we create "bags of words" for the discovery of schema-level links,
similar to the approach taken by BLOOMS, and iii) as in SILK, we create
domain rules that enable the discovery of links.
        </p>
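        <p>The first two linking strategies listed above, label matching and bags of words, can be sketched as follows. This is a simplified illustration under stated assumptions: the tokenisation rule, the use of Jaccard overlap as the bag-of-words score and the 0.5 threshold are choices made for this example, and domain rules are not covered.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re


def words(label):
    """Split a label or IRI local name into lower-case word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)  # split camelCase
    return re.findall(r"[a-z0-9]+", spaced.lower())


def naive_match(label_a, label_b):
    """Labels match naively if they normalise to the same token sequence."""
    return words(label_a) == words(label_b)


def bag_overlap(label_a, label_b):
    """Jaccard overlap between the two bags (sets) of words."""
    bag_a, bag_b = set(words(label_a)), set(words(label_b))
    if not bag_a or not bag_b:
        return 0.0
    return len(bag_a.intersection(bag_b)) / len(bag_a.union(bag_b))


def link_candidates(query_elements, catalogue_labels, threshold=0.5):
    """Propose (Qe, label, method) links; the threshold is an assumed value."""
    links = []
    for element in query_elements:
        for label in catalogue_labels:
            if naive_match(element, label):
                links.append((element, label, "naive"))
            elif bag_overlap(element, label) >= threshold:
                links.append((element, label, "bag-of-words"))
    return links
```
]]></preformat>
        <p>Here a query element such as "Protein Name" would be linked naively to a catalogue label "proteinName", and through word overlap to a label such as "name of protein".</p>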
      </sec>
      <sec id="sec-7-4">
        <title>SPARQL Federation Systems</title>
        <p>
          Advances in federated query processing methods over the Web of Data have
enabled the development of federated query engines (QE). Each of these QEs
has slightly different goals and thus makes different compromises between speed,
completeness and flexibility. Quilitz et al. [23] proposed DARQ, which makes use
of service descriptions for relevant data source selection. Langegger et al. [18]
propose a solution using a mediator approach, which continuously monitors the
SPARQL endpoints for any dataset changes for automatic updates. Schwarte
et al. [24] propose FedX, an index-free query federation for the Web of Data.
SPLENDID [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] makes use of Vocabulary of Interlinked Datasets (VoID)
descriptions along with SPARQL ASK queries to select the list of relevant sources
for each triple pattern. Kaoudi et al. [17] propose a federated query technique on
top of distributed hash tables (DHT) to minimise the query execution time and
the bandwidth consumption. Acosta et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] present ANAPSID, an adaptive
query engine that adapts query execution schedulers to endpoints' data
availability and run-time conditions. Avalanche [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] gathers endpoint dataset statistics
and bandwidth availability on the fly before the query federation.
        </p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Preliminary Results</title>
      <p>
        Results for our ARDI approach have been published [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [15]. We evaluated the
performance of our catalogue generation methodology and recorded the times
taken to probe instances through endpoint analysis of 12 endpoints whose
underlying data sources were considered relevant for drug discovery. The cataloguing
experiments were carried out on a standard machine with a 1.60 GHz processor and
8 GB RAM, using a 10 Mbps internet connection. Best-fit regression models were
then calculated (Fig. 2). It took less than 1,000,000 milliseconds (under 17 minutes) to
catalogue seven of the SPARQL endpoints, with a gradual rise as
the number of available concepts and properties increased. We obtained two power
regression models (T = 29206 × Cn<sup>1.113</sup> and T = 7930 × Pn<sup>1.027</sup>) to help extrapolate the time
taken to catalogue any SPARQL endpoint with a fixed set of available concepts
(Cn) and properties (Pn), with R<sup>2</sup> values of 0.641 and 0.547 respectively. Using
these models and knowing the total number of available concepts/properties,
a developer could determine the approximate time (ms) as a vector
combination. The KEGG and SGD endpoints took an abnormally large amount of time to
catalogue compared with the trendline. We also evaluated the performance of our Link
Generation methodology by comparing it against popular linking approaches.
Using WordNet thesauri, we attempted to automate the creation of bags of
related words using 6 algorithms [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]: Jiang &amp; Conrath, Lin, Path, Resnik, Vector
and Wu &amp; Palmer, with unsatisfactory results (Figure 3(c)). Our linking approaches
resulted in a better linking rate, as shown in Figure 3(a,b).
      </p>
      <p>Our evaluation plan covers i) SPARQL endpoint analysis, ii) ARDI (cataloguing and linking) and iii) the Query
Federation System. Each of these stages has different evaluation criteria.</p>
      <p>SPARQL Endpoint Analysis: The criteria are twofold: (i) using VoID as a bar, to
empirically investigate the extent to which public endpoints can describe their
own content, and (ii) to build and analyse the capabilities of a best-effort online
catalogue of current endpoints based on the (partial) results collected.</p>
      <p>
        ARDI (cataloguing and linking): Cataloguing and linking results, along with
their evaluation in terms of i) time taken, ii) number of concepts and properties
catalogued, and iii) correct vs incorrect links, have been published [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [15].
      </p>
      <p>
        Query Federation System: To evaluate the query federation system, we
define source selection efficiency in terms of (a) the total number of triple pattern-wise sources
selected (#TP), (b) the number of SPARQL ASK requests used to obtain (a) (#AR), and (c)
the source selection time (SST). Based on these criteria, we aim to compare our
system against FedX, a state-of-the-art query engine, using a test bed of ten real-time
datasets with 20 real-time queries (a publication is under review).
      </p>
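      <p>The two power regression models reported above (T = 29206 × Cn<sup>1.113</sup> and T = 7930 × Pn<sup>1.027</sup>) can be evaluated directly, as in the sketch below. The function names and the example counts are hypothetical, and because the text does not specify how the two estimates are combined, the sketch simply reports both. Given the reported R<sup>2</sup> values of 0.641 and 0.547, the outputs should be read as rough estimates only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def time_from_concepts(concept_count):
    """Estimated cataloguing time in ms from the number of concepts (Cn)."""
    return 29206 * concept_count ** 1.113


def time_from_properties(property_count):
    """Estimated cataloguing time in ms from the number of properties (Pn)."""
    return 7930 * property_count ** 1.027


def to_minutes(milliseconds):
    """Convert milliseconds to minutes."""
    return milliseconds / 60000.0


# Hypothetical endpoint with 20 concepts and 150 properties.
print(round(to_minutes(time_from_concepts(20)), 1), "minutes (concept model)")
print(round(to_minutes(time_from_properties(150)), 1), "minutes (property model)")
```
]]></preformat>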
    </sec>
    <sec id="sec-9">
      <title>Reflections</title>
      <p>By focusing on the problem of finding and analysing relevant SPARQL endpoints,
we may miss relevant Linked Datasets that do not offer a SPARQL endpoint.
According to statistics by Jentzsch et al. [16], only 68% of the Linked Datasets
surveyed provided a SPARQL endpoint. However, our focus is specifically on the
problem of relevant SPARQL endpoints, which we argue is a sufficiently
noteworthy problem in and of itself. Current experiments and evaluation use a set of
Qe, which were defined in the context of drug discovery. The number of classes per
endpoint varied from a single class to a few thousand. Our initial exploration
of the LS-LOD revealed that only 15% of classes are reused. However, this was
not the case for properties, of which 48.5% are reused. We faced multiple challenges
that can hinder the applicability of our approach:
        <list list-type="bullet">
          <list-item><p>Some endpoints return timeout errors when a simple query (SELECT DISTINCT
?Concept WHERE {[ ] a ?Concept}) is issued.</p></list-item>
          <list-item><p>Some endpoints have high downtime and cannot generally be relied upon.</p></list-item>
          <list-item><p>Many endpoints provide non-dereferenceable URIs, and some dereferenceable URIs
do not provide a "type" for the instance.</p></list-item>
        </list>
      </p>
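      <p>The timeout problem noted above can be handled defensively when probing endpoints. The sketch below issues the simple type query over the standard SPARQL protocol (HTTP GET with a query parameter) using only the Python standard library. The function names, the 30-second timeout and the shape of the returned dictionary are assumptions made for this illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import json
import socket
import urllib.error
import urllib.parse
import urllib.request

TYPE_QUERY = "SELECT DISTINCT ?Concept WHERE {[ ] a ?Concept}"


def build_request(endpoint_url, query=TYPE_QUERY):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def probe(endpoint_url, timeout_seconds=30):
    """Return the concepts of an endpoint, or the reason the probe failed."""
    request = build_request(endpoint_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            payload = json.load(response)
        bindings = payload["results"]["bindings"]
        concepts = [row["Concept"]["value"] for row in bindings]
    except (urllib.error.URLError, socket.timeout, ValueError, KeyError) as error:
        # Covers unreachable hosts, HTTP errors, timeouts and malformed results.
        return {"ok": False, "error": str(error)}
    return {"ok": True, "concepts": concepts}
```
]]></preformat>
      <p>Recording the failure reason instead of raising lets a cataloguing run continue past unreliable endpoints and revisit them later.</p>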
    </sec>
    <sec id="sec-10">
      <title>Acknowledgement</title>
      <p>This research has been supported in part by Science Foundation Ireland under
Grant Number SFI/12/RC/2289. The author would also like to acknowledge
Dietrich Rebholz-Schuhmann for his role as PhD supervisor.</p>
      <ref-list>
        <ref id="ref14">
          <mixed-citation>14. Hasnain, A., Mehmood, Q., e Zainab, S.S., Hogan, A.: SPORTAL: Profiling the content of public SPARQL endpoints. International Journal on Semantic Web and Information Systems (IJSWIS) 12(3), 134–163 (2016), http://www.igi-global.com/article/sportal/160175</mixed-citation>
        </ref>
        <ref id="ref15">
          <mixed-citation>15. Hasnain, A., e Zainab, S.S., Kamdar, M.R., Mehmood, Q., Warren Jr, C.N., Fatimah, Q.A., Deus, H.F., Mehdi, M., Decker, S.: A roadmap for navigating the life sciences linked open data cloud. In: Semantic Technology, pp. 97–112. Springer (2014)</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>16. Jentzsch, A., Cyganiak, R., Bizer, C.: State of the LOD cloud. Online Report (September 2011), http://lod-cloud.net/state/</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>17. Kaoudi, Z., Kyzirakos, K., Koubarakis, M.: SPARQL query optimization on top of DHTs. In: Proceedings of the 9th International Semantic Web Conference on The Semantic Web - Volume Part I. pp. 418–435. ISWC'10 (2010)</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>18. Langegger, A., Wöß, W., Blöchl, M.: A semantic web middleware for virtual data integration on the web. In: Proceedings of the 5th European Semantic Web Conference on The Semantic Web: Research and Applications. pp. 493–507. ESWC'08 (2008)</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>19. Lorey, J.: Identifying and determining SPARQL endpoint characteristics. IJWIS 10(3), 226–244 (2014), http://dx.doi.org/10.1108/IJWIS-03-2014-0007</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>20. Paulheim, H., Hertling, S.: Discoverability of SPARQL Endpoints in Linked Open Data. In: International Semantic Web Conference (ISWC) Posters &amp; Demos. pp. 245–248. Springer (2013)</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>21. Polleres, A.: Semantic web technologies: From theory to standards. In: 21st National Conference on Artificial Intelligence and Cognitive Science, NUI Galway (2010)</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>22. Quilitz, B., Leser, U.: Querying distributed RDF data sources with SPARQL. In: European Semantic Web Conference (ESWC). pp. 524–538. Springer (2008)</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>23. Quilitz, B., Leser, U.: Querying distributed RDF data sources with SPARQL. In: Proceedings of the 5th European Semantic Web Conference on The Semantic Web: Research and Applications. pp. 524–538. ESWC'08 (2008)</mixed-citation>
        </ref>
        <ref id="ref24">
          <mixed-citation>24. Schwarte, A., Haase, P., Hose, K., Schenkel, R., Schmidt, M.: FedX: A federation layer for distributed query processing on Linked Open Data. In: Extended Semantic Web Conference (ESWC). pp. 481–486. Springer (2011)</mixed-citation>
        </ref>
        <ref id="ref25">
          <mixed-citation>25. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L.J., Eilbeck, K., Ireland, A., Mungall, C.J., et al.: The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology 25(11), 1251–1255 (2007)</mixed-citation>
        </ref>
        <ref id="ref26">
          <mixed-citation>26. Verborgh, R., Hartig, O., Meester, B.D., Haesendonck, G., Vocht, L.D., Sande, M.V., Cyganiak, R., Colpaert, P., Mannens, E., de Walle, R.V.: Querying datasets on the Web with high availability. In: International Semantic Web Conference (ISWC). pp. 180–196. Springer (2014), http://dx.doi.org/10.1007/978-3-319-11964-9_12</mixed-citation>
        </ref>
        <ref id="ref27">
          <mixed-citation>27. Volz, J., Bizer, C., Gaedke, M., Kobilarov, G.: Discovering and maintaining links on the web of data. Springer (2009)</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Acosta</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vidal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lampo</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castillo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruckhaus</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>ANAPSID: an adaptive query processing engine for SPARQL endpoints</article-title>
          .
          <source>In: International Semantic Web Conference (ISWC)</source>
          . pp.
          <fpage>18</fpage>
          –
          <lpage>34</lpage>
          . Springer (
          <year>2011</year>
          ), http://dx.doi.org/10.1007/978-3-642-25073-6_2
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Akar</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halac</surname>
            ,
            <given-names>T.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ekinci</surname>
            ,
            <given-names>E.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dikenelli</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Querying the Web of Interlinked Datasets using VOID Descriptions</article-title>
          . In:
          <source>Linked Data On the Web (LDOW)</source>
          . CEUR (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hausenblas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Describing linked datasets: on the design and usage of voiD, the 'Vocabulary of Interlinked Datasets'</article-title>
          .
          <source>In: Linked Data on the Web Workshop (LDOW 09), in conjunction with WWW09</source>
          . Citeseer
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Basca</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bernstein</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Querying a messy web of data with Avalanche</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>26</volume>
          ,
          <fpage>1</fpage>
          –
          <lpage>28</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buchan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Roure</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>Why linked data is not enough for scientists</article-title>
          .
          <source>Future Generation Computer Systems</source>
          <volume>29</volume>
          (
          <issue>2</issue>
          ),
          <fpage>599</fpage>
          –
          <lpage>611</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fischetti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dertouzos</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          (foreword)
          :
          <article-title>Weaving the Web: The original design and ultimate destiny of the World Wide Web by its inventor</article-title>
          .
          <source>HarperInformation</source>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Buil-Aranda</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umbrich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vandenbussche</surname>
            ,
            <given-names>P.Y.</given-names>
          </string-name>
          :
          <article-title>SPARQL web-querying infrastructure: Ready for action?</article-title>
          <source>In: The Semantic Web – ISWC 2013</source>
          , pp.
          <fpage>277</fpage>
          –
          <lpage>293</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adamusiak</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , et al.:
          <article-title>Translating standards into practice – one semantic web API for gene expression</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>45</volume>
          (
          <issue>4</issue>
          ),
          <fpage>782</fpage>
          –
          <lpage>794</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Goble</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>Data curation + process curation = data integration + science</article-title>
          .
          <source>Briefings in Bioinformatics</source>
          <volume>9</volume>
          (
          <issue>6</issue>
          ),
          <fpage>506</fpage>
          –
          <lpage>517</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Gorlitz,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          : Splendid:
          <article-title>Sparql endpoint federation exploiting void descriptions</article-title>
          .
          <source>In: Proceedings of the 2nd International Workshop on Consuming Linked Data</source>
          , Bonn, Germany (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fox</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          :
          <article-title>Cataloguing and linking life sciences LOD Cloud</article-title>
          .
          <source>In: 1st International Workshop on Ontology Engineering in a Datadriven World collocated with EKAW12</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamdar</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasapis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeginis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warren Jr</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          , et al.:
          <article-title>Linked Biomedical Dataspace: Lessons Learned integrating Data for Drug Discovery</article-title>
          . In: International Semantic Web Conference (In-Use Track),
          <year>October 2014</year>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehmood</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>e Zainab</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A provenance assisted roadmap for life sciences linked open data cloud</article-title>
          .
          <source>In: Knowledge Engineering and Semantic Web</source>
          , pp.
          <fpage>72</fpage>
          –
          <lpage>86</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>