<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SCRY: extending SPARQL with custom data processing methods for the life sciences</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bas Stringer</string-name>
          <email>b.stringer@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Albert Meroño-Peñuela</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanne Abeln</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank van Harmelen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaap Heringa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for Integrative Bioinformatics (IBIVU), Vrije Universiteit Amsterdam</institution>
          ,
          <addr-line>NL</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Knowledge Representation and Reasoning Group, Vrije Universiteit Amsterdam</institution>
          ,
          <addr-line>NL</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>An ever-growing number of life science databases is (partially) exposed as RDF graphs (e.g. UniProt, TCGA, DisGeNET, Human Protein Atlas), complementing traditional methods of disseminating biodata. The SPARQL query language provides a powerful tool to rapidly retrieve and integrate these data. However, the inability to incorporate custom data processing methods in SPARQL queries inhibits its application in many life science use cases. It should take far less effort to integrate data processing methods, such as BLAST, with SPARQL. We propose that an effective framework for extending SPARQL with custom methods should fulfill four key requirements: generality, reusability, interoperability and scalability. We present SCRY, the SPARQL compatible service layer, which provides custom data processing within SPARQL queries. SCRY is a lightweight SPARQL endpoint that interprets parts of basic graph patterns as input for user-defined procedures, generating an RDF graph against which the query is resolved on demand. SCRY's federation-oriented design allows for easy integration with existing endpoints, extending SPARQL's functionality to include custom data processing methods in a decoupled, standards-compliant, tool-independent manner. We demonstrate the power of this approach by performing statistical analysis of a benchmark, and by incorporating BLAST in a query which simultaneously finds the tissues expressing hemoglobin and its homologs.</p>
      </abstract>
      <kwd-group>
        <kwd>SPARQL</kwd>
        <kwd>customization</kwd>
        <kwd>extension</kwd>
        <kwd>RDF generation</kwd>
        <kwd>data processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        information graphs in the RDF data based on the templates and filters (constraints
on nodes and edges) expressed in the query. Such patterns match relations that are
explicitly present in the RDF graph, or that are derivable from the graph under a
given entailment regime such as RDFS or OWL-DL [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. However, there are many
cases where information needs to be retrieved from an RDF graph where it is (a)
impractical to store all relevant relation-instances explicitly in the graph, and (b) the
relevant relation-instances cannot be derived from the explicit graph under any of the
supported entailment regimes.
      </p>
      <p>A straightforward example of such a problem is selecting outliers from a set of query
results. Which results are outliers depends on a scoring function and on the rest of the set.
It is infeasible to precompute which results are outliers for every putative result set and
scoring function, and the math for typical scoring functions is not natively supported
by SPARQL or RDF/OWL entailment.</p>
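      <p>To make the outlier example concrete, the sketch below scores each result by its distance from the mean in standard deviations (a z-score). The scoring function and the threshold of 2 are illustrative choices, not prescribed by the paper; the point is that the score of each value depends on the whole result set, and that the underlying math (square roots) is not natively available in SPARQL.</p>

```python
# Illustrative only: the scoring function (z-score) and threshold are
# example choices, not taken from the paper. Whether a value is an
# outlier depends on the entire result set, so these scores cannot be
# precomputed per result.
from statistics import mean, pstdev

def z_score_outliers(values, threshold=2.0):
    mu = mean(values)
    sigma = pstdev(values)  # standard deviation: needs a square root
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

print(z_score_outliers([6, 7, 7, 8, 6, 7, 30]))  # → [30]
```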
      <p>Another example, common in the life sciences, involves finding proteins which are
evolutionarily related to a given protein of interest. There is a multitude of methods and
parameters through which this can be derived, but which of these should be used
depends strongly on the exact context of the question. Precomputing every possible
relation between all proteins is not only impractical, but combinatorially impossible.
Again, RDF/OWL entailment cannot derive such complex relations ad hoc.</p>
      <p>These examples illustrate that, even though RDF has shown itself to be a very
versatile representation language, there are important relations between entities which can
in principle be expressed in RDF, but that are impractical or impossible to materialize
persistently in a triplestore. Hence, they are not accessible through SPARQL queries
over such a triplestore. At the same time, these relations are critical to resolve queries
common to the life sciences or other domains.</p>
      <p>We present the SPARQL compatible service layer (SCRY) as a solution to this
problem. SCRY is a lightweight SPARQL endpoint that allows users to create their
own procedures, assign them to a URI, and embed them in standard SPARQL queries.
Through these procedures, analytic methods can be executed at query time, effectively
extending SPARQL with API-like functionality. SCRY provides a framework for custom
data processing within the context of a Semantic Web query, facilitating on-demand
computation of domain-specific relations in a generalized, reusable, interoperable and
scalable manner.
</p>
    </sec>
    <sec id="sec-2">
      <title>Requirements</title>
      <p>
        As introduced above, RDF/OWL entailment is not sufficiently expressive to
incorporate complex domain-specific data processing methods in SPARQL queries. We observe
an unmet need for a platform which enables computational biologists to
efficiently implement and integrate such methods. A platform enabling this would ideally
be 1) easy to deploy, 2) easy to extend with new methods, and 3) easy to write queries
for. Concretely, we argue this can be achieved by meeting the following requirements:
      </p>
      <p>
        Generality A generic platform, which enables the implementation of any method
from any domain, avoids Semantic Web services devolving into a multitude of
mutually incompatible "SPARQL-like" extensions (e.g. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]). Consequently, the interface
through which implemented methods are exposed should support arbitrarily
complex algorithms, maintaining SPARQL’s expressivity without restricting functional
extensibility.
      </p>
      <p>Reusability Implemented methods should be standardized to a degree that allows
them to be reused between queries. Done well, this facilitates sharing services and
reusing the work of others, effectively extending the shareable nature of Semantic
Web data and schemata to apply to methods as well.</p>
      <p>Interoperability Queries incorporating methods which extend SPARQL’s
functionality should maintain full compatibility with the SPARQL standard, such that any
SPARQL-aware server, service or tool can parse and resolve them. Towards this
end, methods must be implemented in an endpoint-independent manner.</p>
      <p>Scalability Many analytic methods are computationally expensive, making them
prohibitively slow or outright impossible to run through remotely hosted services.
Simpler methods incur bandwidth restrictions if they are given high numbers of
inputs, or produce large-volume outputs. Such methods benefit from using local
resources for computation, which requires users to control both the software and
the hardware resolving their queries.</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        The simplest and most prevalent way of processing RDF data accessed via SPARQL
endpoints is post hoc scripting. This typically involves SPARQL-compatible libraries
in some general purpose programming language [
        <xref ref-type="bibr" rid="ref6 ref9">6,9</xref>
        ]. Despite its generality, this approach comes
with two pitfalls. First, such libraries are tightly coupled to the SPARQL standard
and to the programming languages they are implemented in, making their maintenance
costly. Second, users have to implement purpose-specific data processing pipelines, which is
time consuming and typically results in code that is not reusable [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In fact, post hoc
scripting does not functionally extend SPARQL as a query language, and thus lacks
interoperability and reusability.
      </p>
      <p>
        The SPARQL 1.1 specification [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] defines Extensible Value Testing (EVT) as a
mechanism to extend SPARQL’s standard functionality. However, custom EVT
procedures are restricted to a limited set of query environments, such as BIND() and FILTER(),
thus obstructing the customization of aggregates. Moreover, queries “using extension
functions are likely to have limited interoperability” [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], making them triplestore
dependent and forcing redundant implementations across products. Another take on
extending SPARQL is OGC GeoSPARQL, which “defines a vocabulary for representing
geospatial data in RDF, and defines an extension to the SPARQL query language for
processing geospatial data” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, its reliance on endpoint customization impairs
interoperability.
      </p>
      <p>
        Web Services are a common way of exposing data processing functionality online.
Significant efforts have been made to describe Web Services semantically to enhance
their interoperability [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For example, in [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] authors propose to expose REST APIs
as Linked Data. In the same line of thought, the Semantic Automated Discovery and
Integration (SADI) registry [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] uses an elegant OWL description of the inputs and
outputs of semantic services, which enables non-expert users to automatically
incorporate relevant methods within their queries. However, SADI’s dependence on compatible
query engines (e.g. SHARE [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]) limits interoperability, and its scalability suffers from
registered services commonly being exposed exclusively through online,
single-point-of-failure endpoints.
      </p>
      <p>Table 1. Requirement fulfillment (generality, reusability, interoperability, scalability) of the listed approaches: post hoc scripting, SPARQL extension standards (EVT, GeoSPARQL), Semantic Web Services (SADI) and Linked Data APIs (Open PHACTS).</p>
      <p>
        Linked Data APIs are very similar to Web Services in terms of their goals and
challenges [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. They expose Linked Data in Web standard formats without
requiring extensive knowledge of RDF or SPARQL [
        <xref ref-type="bibr" rid="ref14 ref5">5,14</xref>
        ]. Projects like Open PHACTS [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]
maintain runtime performance by restricting query expressivity, trading generality
for scalability. Given sufficient documentation, Linked Data APIs can be very reusable,
but since they abstract SPARQL away from end users, they are not directly
interoperable.
      </p>
      <p>Table 1 summarizes which of our requirements from the previous section are fulfilled
by the listed approaches.
</p>
    </sec>
    <sec id="sec-4">
      <title>SCRY</title>
      <p>Infrastructure: Our SPARQL compatible service layer (SCRY) acts as a lightweight
SPARQL endpoint, granting users access to a personalized set of services during query
execution. Through these services, SCRY allows users to incorporate algorithms of
arbitrary complexity within standards-compliant SPARQL queries, and to use the
generated outputs directly within those same queries. Unlike traditional SPARQL endpoints,
the RDF graph against which SCRY resolves its queries is generated at query time, by
executing services encoded in the query’s graph patterns.</p>
      <p>Every service exposed by SCRY comprises one or more procedures. Registered
procedures are associated with a URI, which SCRY will recognize when it receives a query.
These procedure-associated URIs, as well as the desired inputs and outputs, can be
embedded within the graph patterns of standards-compliant, syntactically pure SPARQL
queries (see figure 2). Prior to resolving the SPARQL query, SCRY executes embedded
procedures and adds their output to the queried RDF graph. Procedures can comprise
simple instructions, like rounding off a decimal number. They can also be arbitrarily
complex, for example running an external program, parsing its output, performing some
form of analysis and exposing specifically requested results as RDF. The output of one
procedure can be used as input for the next, allowing simple (non-circular) workflows
to be encoded.</p>
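      <p>The dataflow just described can be sketched with plain Python data structures. This is a toy illustration of the idea, not SCRY's actual implementation; the math:median URI appears later in the Statistics use case, while the scry:input and scry:output predicate names are shorthand assumptions.</p>

```python
# Toy illustration of SCRY's dataflow, not its actual implementation;
# the scry:input and scry:output predicate names are assumptions.
# A registered procedure URI maps to a Python function; executing it
# materializes triples that the rest of the query can then match.
from statistics import median

PROCEDURES = {"math:median": median}  # registry of URI -> function

def execute(procedure_uri, inputs):
    """Run a registered procedure; encode its call as RDF-like triples."""
    output = PROCEDURES[procedure_uri](inputs)
    call = "_:call1"  # blank node standing for this invocation
    triples = [(call, "rdf:type", procedure_uri)]
    triples += [(call, "scry:input", value) for value in inputs]
    triples.append((call, "scry:output", output))
    return triples

graph = execute("math:median", [2, 4, 9])
print([t for t in graph if t[1] == "scry:output"])  # → [('_:call1', 'scry:output', 4)]
```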
      <p>Figure 1 illustrates the two typical dataflows of SPARQL queries resolved through
SCRY, accessing it through (A) a federated query from a primary endpoint or (B) a
direct query to a live SCRY instance.</p>
      <p>Engineering effort: To ensure SCRY can be easily utilized by life scientists, we
aimed to minimize the effort needed to 1) deploy SCRY, 2) extend it with novel data
processing methods and 3) write queries invoking them.</p>
      <p>SCRY can be deployed on any system running Python 2.7. The framework itself,
as well as the services required for the use cases we describe below, can be easily
installed using the package manager pip.4 It is equally straightforward to extend SCRY
with services of your own. Python scripts added to SCRY’s working directory are
automatically checked for defined procedures, which can take as little as three lines
to define.5 SCRY can be further configured through a command-line interface, which
includes options to load services from elsewhere.</p>
      <p>Defining a procedure requires 1) a URI and 2) a Python function to execute when
SCRY finds this URI in a graph pattern. Note these functions can rely on local or remote
resources, including other Python functions, shell commands (e.g. through os.system()),
and web requests.</p>
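      <p>The registration API itself is not reproduced in this text, so the decorator below is a hypothetical sketch of what associating a URI with a Python function might look like; the PROCEDURES registry, the procedure decorator and the example URI are all assumptions made for illustration.</p>

```python
# Hypothetical sketch: SCRY's real registration API may differ. The
# PROCEDURES registry, the `procedure` decorator and the example URI
# are assumptions made for illustration only.
PROCEDURES = {}

def procedure(uri):
    """Associate a URI with the decorated function."""
    def register(func):
        PROCEDURES[uri] = func
        return func
    return register

@procedure("http://scry.example/math/sqrt")  # assumed URI
def sqrt_all(inputs):
    # Such functions may equally shell out (e.g. os.system) or make
    # web requests, as noted above.
    return [x ** 0.5 for x in inputs]

# When SCRY finds the URI in a graph pattern, it executes the function:
print(PROCEDURES["http://scry.example/math/sqrt"]([9.0, 16.0]))  # → [3.0, 4.0]
```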
      <p>Figure 2 shows how easily procedure calls can be embedded in SPARQL queries
using SCRY. We explain the query itself, and how SCRY interprets the embedded
procedure calls, in the Statistics use case of the Evaluation section.
</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <p>We evaluate whether or not SCRY meets the requirements described in section 2 based
on two use cases, from the domains of statistics and bioinformatics respectively.</p>
      <sec id="sec-5-1">
        <title>4See http://tinyurl.com/scry4ls#file-deployment 5See http://tinyurl.com/scry4ls#file-hello-world</title>
        <p>
          Use case 1: Statistics. We use the Berlin SPARQL Benchmark (BSBM) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to
evaluate our extension of SPARQL with generic statistical functions. This synthetic
data set covers a set of products offered by different vendors, each given a rating by
different reviewers. We want to know if any vendor’s products are consistently rated
higher than others, which we determine by calculating the mean, median and standard
deviation of every product’s rating, across all products sold by a specific vendor.
        </p>
        <p>Note that calculating standard deviations requires computing square roots, which
SPARQL does not natively support. Using SCRY, however, the query shown in figure
2 yields the answers by invoking two procedures defined in the scry-math package.</p>
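        <p>The scry-math implementations themselves are not shown here, but the two invoked procedures can be sketched with Python's standard library, assuming (illustratively) that each maps the list of ratings bound to ?ratings to a single output value.</p>

```python
# Sketch of what the two invoked scry-math procedures must compute;
# the package's real code is not shown in the paper. Each function is
# assumed to map a list of input ratings to one output value.
from statistics import median, pstdev

def math_median(ratings):
    return median(ratings)

def math_stdev(ratings):
    # pstdev takes a square root, which SPARQL does not natively support
    return pstdev(ratings)

ratings = [5.0, 7.0, 9.0]
print(math_median(ratings))           # → 7.0
print(round(math_stdev(ratings), 3))  # → 1.633
```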
        <p>The graph pattern {GRAPH ?g1 {math:median in: ?ratings; out: ?median.}} instructs SCRY
to 1) execute the procedure associated with the URI math:median, using whatever values
are bound to ?ratings as input; and 2) create an RDF (sub)graph matching the entire
pattern, with the procedure’s output inserted in the position of ?median.</p>
        <p>Use case 2: Bioinformatics. In this use case, we evaluate a very typical query from
the life sciences. We use a protein sequence similarity search to identify evolutionarily
related (i.e. homologous) proteins. The derived relations are then used to determine
which of a query protein’s homologs are coexpressed in the same tissues.</p>
        <p>
          Our scry-blast package defines two procedures: one which runs the most commonly
used sequence similarity search method (BLAST [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]), given a protein sequence as input;
and another which downloads such sequences from UniProt, given an identifier. Using
hemoglobin (P68871) as our protein of interest, we combine our procedures with
tissue-specific protein expression data from the Human Protein Atlas (HPA) [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. As
shown in figure 3, we can now write a query6 which downloads hemoglobin’s sequence
from UniProt, runs BLAST with that sequence to identify homologous proteins, and
checks which tissues coexpress our protein of interest and its homologs.
        </p>
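        <p>The scry-blast package's code is not reproduced here, but its two procedures can be sketched as follows (in modern Python for brevity). The UniProt URL pattern, the blastp flags and the database name are assumptions based on common usage; the download and the BLAST call require network access and a local NCBI BLAST+ installation respectively.</p>

```python
# Sketch of scry-blast's two procedures; the package's real code is not
# reproduced in the paper. The UniProt URL pattern, blastp flags and
# database name are assumptions based on common usage.
import subprocess
import urllib.request

def parse_fasta(fasta):
    # Drop '>' header lines and join the remaining sequence lines.
    lines = fasta.strip().splitlines()
    return "".join(line for line in lines if not line.startswith(">"))

def fetch_uniprot_sequence(accession):
    """Download a protein sequence from UniProt, given an identifier."""
    url = "https://www.uniprot.org/uniprot/%s.fasta" % accession  # assumed URL scheme
    return parse_fasta(urllib.request.urlopen(url).read().decode())

def run_blast(sequence, database="swissprot"):
    """Run a local blastp search; assumes NCBI BLAST+ is on the PATH."""
    result = subprocess.run(
        ["blastp", "-db", database, "-outfmt", "6 sseqid evalue"],
        input=">query\n" + sequence, capture_output=True, text=True, check=True)
    return [line.split("\t") for line in result.stdout.splitlines()]

print(parse_fasta(">sp|P68871|HBB_HUMAN\nMVHLT\nPEEKS"))  # → MVHLTPEEKS
```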
        <p>Of course, SCRY can be extended with similar procedures to expose different
similarity search methods, and queries can combine their results with other types of data.</p>
        <p>Generality: SCRY enables users to integrate data processing procedures of arbitrary
complexity in SPARQL queries, by embedding procedure calls in a query’s graph
patterns. This design is not restricted to procedures from or applications within any specific
domain. Our use cases involved procedures which 1) integrate a service exposing
simple mathematics and statistics functions to analyse BSBM data as shown in figure 2;
and which 2) integrate a popular bioinformatics tool (BLAST) to derive evolutionary
relationships between proteins at query time, as shown in figure 3.</p>
        <p>These examples illustrate the simplicity and flexibility of using custom data
processing methods within a SPARQL query, and that SCRY can extend queries with
custom data processing methods of arbitrary complexity through intuitive,
standards-compliant SPARQL syntax.</p>
        <p>Reusability: The modular nature of services exposed through SCRY makes them easy
to share, modify and reuse. For example, the standard deviation procedure used in the
statistics use case can just as easily be applied to the e-values or molecular masses of
proteins homologous to hemoglobin identified in the bioinformatics use case. Once
implemented, a SPARQL extension exposed through SCRY can be used in any query.</p>
      </sec>
      <sec id="sec-5-2">
        <title>6See http://tinyurl.com/scry4ls#file-coexpression</title>
        <p>
          Interoperability: The manner in which SCRY embeds procedure calls in query graph
patterns is completely compatible with the SPARQL standard. Consequently, any
standards-compliant SPARQL engine will accept queries that federate parts of the
query to SCRY, e.g. with the SERVICE statement. To demonstrate, we loaded the
benchmark data from the statistics use case into four popular triplestores: Apache Jena [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ],
OpenLink Virtuoso [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], OpenRDF Sesame [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and Stardog [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. We then successfully
resolved the query shown in figure 2 against each of these endpoints. SCRY is also
compatible with query editors and similar tools, such as YASGUI [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>Being able to resolve a single query against different endpoints, calling the same
(reusable) procedure, sharply contrasts with the alternative: extending every triplestore
individually, or using their built-in SPARQL extensions, and writing triplestore-specific
queries for each. Making use of triplestore-specific built-in functions requires manual
search through heterogeneous and cryptic documentation, if the desired built-in exists
at all. Extending triplestores with custom functionality is similarly difficult: extending
Sesame7 with a simple function to calculate square roots requires 55 lines of Java8 and
deploying a properly documented JAR in a specific directory.</p>
        <p>Note that in many cases, SPARQL queries can combine simple functions to express
a more complex one. A standard deviation, for example, can be calculated by taking
the square root of an array’s variance, i.e. of its mean squared deviation from the mean. As complexity grows,
though, such queries quickly become difficult to write and harder still to read9. In this
context, SCRY provides an essential layer of abstraction, without which incorporating
programs like BLAST would be difficult, tedious or outright impossible.</p>
        <p>Scalability: As a SPARQL endpoint, SCRY has the same issues as every other: the
time it takes to execute a query scales rather poorly with its complexity. Where service
execution is concerned, however, SCRY offers two unique advantages over typical
(Semantic) Web service platforms. First and foremost, SCRY is designed to be deployed
with a personalized set of services and procedures, granting users full control over both
the software and the hardware resolving their services. Secondly, it decouples the
execution of those services from the resolution of the query. This inherently makes services
exposed by SCRY more scalable than equivalent web service implementations.</p>
        <p>Running local implementations instead of addressing an equivalent web service can
significantly improve the execution time of computationally expensive procedures. For
example, running BLAST locally instead of through the NCBI’s web service is several
orders of magnitude faster (data not shown), in addition to giving users more
control over critical parameters. Computationally expensive data processing methods are
prominent in the life sciences, and using SCRY instead of online services to include
their results in your queries can significantly speed up runtimes.</p>
        <p>Note that resolving queries entirely locally has another significant advantage:
privacy. Neither your query nor its (partial) results have to be exposed to external
endpoints or (web) services, which is especially valuable when working with
sensitive (e.g. clinical) data, from behind a firewall, or in an otherwise restricted
network.</p>
        <p>7See http://tinyurl.com/sesame-cookbook
8See http://tinyurl.com/sqrt-extensions#file-sesame-sqrt-java
9See http://tinyurl.com/sqrt-extensions for triplestore-specific queries equivalent to that in
figure 2</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>What does SCRY cost?</title>
      <p>SCRY leverages SPARQL’s query federation mechanism to make services general,
reusable, interoperable and scalable, which comes at a cost. The SPARQL protocol
requires communication between endpoints to occur through HTTP, which requires
inputs and outputs for procedures exposed by SCRY to be serialized and deserialized
to flat text. Additionally, SCRY needs time to parse embedded procedure calls,
execute them, populate an RDF graph, and resolve a SPARQL query against it. For our
statistics use case, which deals with a large number of inputs and outputs, these steps
take up approximately 10% of the total query time.</p>
      <p>In terms of reusability and standardization, SCRY provides optional description of
the inputs and outputs of a procedure, but does not enforce their use. Consequently, it is
not compatible with automated discovery of relevant services, unlike SADI for example,
which registers services using formal semantics. This is a natural consequence of SCRY’s
focus on local computation, to enhance scalability, and our aim to be interoperable with
extant triplestores and SPARQL tools.
</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>We argue SCRY meets each of our requirements: generality, reusability,
interoperability and scalability. Thus, computational biologists familiar with just the basics
of SPARQL and Python can easily deploy SCRY, extend it with procedures of
arbitrary complexity, and incorporate them in their SPARQL queries. SCRY overcomes
issues common to the application of semantic web technology in the life sciences, such
as the inability to incorporate computationally expensive data processing methods, or
working with privacy sensitive data. It provides an essential layer of abstraction,
without which incorporating programs like BLAST in SPARQL queries would be difficult,
tedious or outright impossible.</p>
      <p>Utilizing SCRY incurs some computational overhead, and the services it exposes
are not standardized with sufficient rigidity to automate their discovery. Even so, we
argue this is a price worth paying for a framework that offers such significant extension
to SPARQL’s functionality.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Altschul</surname>
            ,
            <given-names>S.F.</given-names>
          </string-name>
          , et al.:
          <article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
          .
          <source>Nucleic Acids Research</source>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Battle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolas</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>GeoSPARQL: Enabling a Geospatial Semantic Web</article-title>
          . Semantic Web - Interoperability, Usability, Applicability (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schultz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Berlin SPARQL Benchmark</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fielding</surname>
          </string-name>
          , R.T.:
          <article-title>Architectural styles and the design of network-based software architectures</article-title>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Groth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>API-centric Linked Data integration: The Open PHACTS Discovery Platform case study</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>29</volume>
          (
          <issue>0</issue>
          ),
          <fpage>12</fpage>
          -
          <lpage>18</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>van Hage</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          , Kauppinen, T., Graeler, B., Davis, C., Hoeksema, J., Ruttenberg, A., Bahls, D.: SPARQL: SPARQL client (
          <year>2013</year>
          ), http://CRAN.R-project.org/package=SPARQL, R package version 1.15
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hoekstra</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.:
          <article-title>An Ecosystem for Linked Humanities Data</article-title>
          .
          <source>In: Proceedings of the Workshop on Humanities in the Semantic Web (WHiSe</source>
          <year>2016</year>
          ),
          <string-name>
            <surname>ESWC</surname>
          </string-name>
          <year>2016</year>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Complexible Inc.:
          <article-title>Stardog 3</article-title>
          .
          <source>Tech. rep., Complexible Inc</source>
          . (
          <year>2015</year>
          ), http://docs.stardog.com/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Krech</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>RDFLib Python Library</article-title>
          .
          <source>Tech. rep., RDFLib Team</source>
          (
          <year>2002</year>
          ), https:// github.com/RDFLib/rdflib
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. OpenRDF:
          <article-title>OpenRDF Sesame</article-title>
          . http://rdf4j.org/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Pedrinaci</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Domingue</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Toward the next wave of services: Linked Services for the Web of data</article-title>
          .
          <source>Journal of Universal Computer Science</source>
          <volume>16</volume>
          (
          <issue>13</issue>
          ),
          <fpage>1694</fpage>
          --
          <lpage>1719</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Queralt-Rosinach</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furlong</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>DisGeNET RDF: a gene-disease association Linked Open Data resource</article-title>
          .
          <source>SWAT4LS</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Redaschi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , the UniProt Consortium:
          <article-title>UniProt in RDF: Tackling Data Integration and Distributed Annotation with the Semantic Web</article-title>
          .
          <source>Nature Precedings</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Reynolds</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Linked Data API</article-title>
          .
          <source>Tech. rep., UK Government Linked Data</source>
          (
          <year>2009</year>
          ), https://github.com/UKGovLD/linked-data-api
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rietveld</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoekstra</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>The YASGUI Family of SPARQL Clients</article-title>
          .
          <source>Semantic Web - Interoperability, Usability, Applicability</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Saleem</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>Linked Cancer Genome Atlas Database</article-title>
          .
          <source>Proceedings of the 9th International Conference on Semantic Systems</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Schmachtenberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>Adoption of the Linked Data Best Practices in Different Topical Domains</article-title>
          .
          <source>In: The Semantic Web - ISWC 2014</source>
          , pp.
          <fpage>245</fpage>
          --
          <lpage>260</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. OpenLink Software:
          <article-title>OpenLink Virtuoso</article-title>
          . http://virtuoso.openlinksw.com/
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Speiser</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Integrating linked data and services with linked data services</article-title>
          .
          <source>In: The Semantic Web: Research and Applications</source>
          . pp.
          <fpage>170</fpage>
          --
          <lpage>184</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20. The Apache Software Foundation:
          <article-title>Apache Jena</article-title>
          . https://jena.apache.org/
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. The World Wide Web Consortium (W3C):
          <article-title>Extending SPARQL Basic Graph Matching</article-title>
          . https://www.w3.org/TR/rdf-sparql-query/#sparqlBGPExtend
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22. The World Wide Web Consortium (W3C):
          <article-title>SPARQL Query Language for RDF</article-title>
          . http://www.w3.org/TR/rdf-sparql-query/
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Uhlén</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>Tissue-based map of the human proteome</article-title>
          .
          <source>Science</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Vandervalk</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCarthy</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilkinson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>SHARE &amp; the Semantic Web - This Time it's Personal!</article-title>
          .
          <source>Lecture Notes in Computer Science, Proceedings of the ASWC</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25. W3C:
          <article-title>SPARQL 1.1 Overview</article-title>
          . https://www.w3.org/TR/sparql11-overview/
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Wilkinson</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          , et al.:
          <article-title>The FAIR Guiding Principles for scientific data management and stewardship</article-title>
          .
          <source>Sci Data</source>
          <volume>3</volume>
          ,
          <issue>160018</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Wilkinson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Open PHACTS: semantic interoperability for drug discovery</article-title>
          .
          <source>Drug Discovery Today</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>