<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>R2RML-based access and querying to relational clinical data with morph-RDB</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Freddy Priyatna</string-name>
          <email>fpriyatna@fi.upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raul Alonso Calvo</string-name>
          <email>ralonso@infomed.dia.fi.upm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Paraiso-Medina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gueton Padron-Sanchez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oscar Corcho</string-name>
          <email>ocorcho@fi.upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Biomedical Informatics Group, Universidad Politecnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ontology Engineering Group, Universidad Politecnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Semantic interoperability is essential when carrying out postgenomic clinical trials where several institutions collaborate, since researchers and developers need an integrated view of, and access to, heterogeneous data sources. In this paper we present a solution that uses an ontology based on the HL7 v3 Reference Information Model and a set of R2RML mappings that relate this ontology to an underlying relational database implementation, and in which morph-RDB is used to expose a virtual SPARQL endpoint over the data. In previous efforts with other existing RDB2RDF systems we had not been able to work with live databases. Now we can issue SPARQL queries to the underlying relational data with generally acceptable performance.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In recent years, clinical trials have started introducing genomic variables [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
which requires performing patient stratification when selecting the patient
population to which the clinical trials are applied. This involves the use of biomarkers to
create subsets within a patient population that provide more detailed
information about how the patient will respond to a given drug. Several datasets,
commonly produced by different institutions and hence rather heterogeneous in
general, need to be used for patient stratification [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Interoperability among
those datasets is made easier by the usage of biomedical standards and
vocabularies [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, achieving such interoperability poses relevant technological
challenges. This is the basis of the work presented in this paper, which is intended
to be applied in several healthcare institutions, such as the Institut Jules
Bordet<sup>3</sup>, the MAASTRO Clinic<sup>4</sup>, and the German Breast Group<sup>5</sup>.
      </p>
      <p>
        In previous work [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] we presented a relational database
implementation that is based on the HL7 version 3 Reference Information Model
(RIM) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This database aims to facilitate the interconnection with other data
sources where medical ontologies are also being used, and has already been used
for providing some form of interoperability among real data sources [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] from
the aforementioned institutions. Currently we are working on providing
ontology-based support for data access, so as to facilitate such integration and allow
other datasets to be incorporated more easily. This is why we are looking
into a Relational Database to RDF (RDB2RDF) solution. We also
provide a SPARQL endpoint so that users do not need to know the underlying
schema of the implemented database.
      </p>
      <sec id="sec-1-1">
        <title>3 http://www.bordet.be/ 4 http://www.maastro.nl/ 5 http://www.germanbreastgroup.de/</title>
        <p>RDB2RDF mappings are used to expose data from relational databases as
RDF datasets. Two major types of data access mechanisms are normally
provided by RDB2RDF tools: i) data translation (a specific case of ETL: Extract,
Transform, Load), where data are materialized into RDF datasets and stored in
a triple store (e.g., Virtuoso), which provides a SPARQL endpoint; and ii) query
translation, where SPARQL queries are translated directly into SQL
according to the specified RDB2RDF mappings and evaluated against the relational
database, and where the results are translated back using the mappings to conform
to the SPARQL query. In our case, we are interested in using RDB2RDF
mappings to make the data stored in our SQL implementation available according to
an ontology that reflects the HL7 version 3 RIM. Furthermore, we have a strong
requirement to use a query translation approach, given the importance of having
fresh results, which cannot always be ensured with the data translation approach.</p>
        <p>
          Our first attempt [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] at applying RDB2RDF-based query translation was
with D2R Server and its mappings [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. This approach was not applicable since the
evaluation of the SQL queries resulting from query translation was not efficient
enough. Moreover, in some cases, queries could not be executed by the database
management system (e.g., their length was excessive). This had already been
hinted at in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], which describes the experience of using RDB2RDF tools in the
domain of astronomy. The conclusion there was that it was not
feasible to use RDB2RDF tools in such a context, and this conclusion was consistent with
our first attempt.
        </p>
        <p>
          Later, we started using morph-RDB [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] with R2RML mappings [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] for this
purpose. We have obtained better results that make this approach applicable in
our context. In this paper we describe our experience, which shows that it is
possible to use efficient RDB2RDF tools in the medical domain.
        </p>
        <p>This paper is organized as follows. Section 2 discusses the necessary
background, such as our current model for storing medical data, the HL7 RIM
ontology, the R2RML mapping language, and our query translation engine
morph-RDB. We discuss some optimization techniques in Section 3. In Section 4 we
discuss the set of queries that we use for the evaluation. Finally, in Section 5
we provide some conclusions and describe some of our future work in this area,
including our deployment plans in the aforementioned healthcare institutions.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Background: HL7, R2RML, and morph-RDB</title>
      <p>In this section we review the main foundations of the work presented
in this paper, namely HL7 and the HL7 RIM, the R2RML language, and
morph-RDB.</p>
      <p><bold>2.1 HL7 RIM</bold></p>
      <p>
        Recent years have witnessed a huge increase in biomedical databases [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This
larger availability opens up new opportunities, while posing important new
challenges, especially with respect to their integration, which is crucial to
obtain a proportional increase in knowledge in the biomedical area.
      </p>
      <p>
        Among the many Detailed Clinical Models that have been reviewed for the
integration of biomedical datasets [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], HL7 v3 is one of the most relevant. The
HL7 v3 standard defines the RIM at its core. This definition consists of a UML
class diagram; it does not define a data structure or a database model. In addition,
issues such as the management of data types are not trivially translatable into
a database model. As a consequence, we previously defined a relational
model for it, as described in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>The HL7 RIM backbone contains three main classes (Act, Role, and Entity),
which are linked together using three association classes (ActRelationship,
Participation, and RoleLink). The core of the HL7 RIM is the Act class.
An Act is defined as "a record of an event that has happened or may happen".
Any healthcare situation and all information concerning it should be describable
using the RIM by including the type of act (what happens), the actor who
performs the deed, and the objects or subjects (Entity) that the act affects
(Role). Some additional information may be provided to indicate location (where),
time (when), and manner (how), together with reasons (why) or motives (what for).
The Act and Entity classes have some specializations that add attributes, such
as Observation (a subclass of Act) or Person (a subclass of Entity).</p>
      <p>This standard is able to represent all kinds of healthcare situations and any
kind of information associated with them. Based on this idea, we have defined a
subset of the HL7 RIM schema, where we implement the classes and attributes
that are necessary to represent the scenario for sharing clinical data of breast
cancer clinical trials:</p>
      <list list-type="bullet">
        <list-item><p>Act, with the subclasses Observation, Procedure, SubstanceAdministration, and Exposure.</p></list-item>
        <list-item><p>Role.</p></list-item>
        <list-item><p>Entity, with the subclasses LivingSubject, Person, and Device.</p></list-item>
        <list-item><p>The classes (i) ActProcedureApproachSiteCode, (ii) ActMethodCode, (iii) ActTargetSiteCode, (iv) ActObservationInterpretationCode, and (v) ActObservationValues, related to Act.</p></list-item>
      </list>
      <sec id="sec-2-1">
        <p>In addition, attribute data types are rather complex in the RIM, so they were
adapted to the mentioned scenario, following HL7 recommendations.
Therefore some attributes were simplified in the relational model compared to
those defined by the HL7 v3 standard. To improve performance and make the
HL7 RIM schema easier to understand, a set of views has been defined. These views cover the
data retrieval requirements of the clinical scenario. A different view
is therefore implemented for obtaining patient data according to its provenance (Observation,
Procedure, SubstanceAdministration, and Exposure).</p>
      </sec>
      <sec id="sec-2-2">
        <p>Therefore, the HL7 RIM-based common data model (CDM) defined above fulfills the requirements
of the breast cancer clinical trial scenario. Furthermore, we have created
an ontology that reflects the HL7 RIM model<sup>6</sup>, available for others to reuse.</p>
        <p><bold>2.2 R2RML</bold></p>
        <p>
          R2RML [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] is a W3C recommendation that defines a mapping language
from relational databases to RDF. An R2RML mapping document consists of a
set of triples maps (rr:TriplesMap), used to specify the rules to generate RDF
triples from database rows/values. A triples map consists of:
          <list list-type="bullet">
            <list-item><p>A logical table (rr:LogicalTable) that is either a base table or an SQL view, used to provide the rows to be mapped as RDF triples.</p></list-item>
            <list-item><p>A subject map (rr:SubjectMap) that is used to specify the rules to generate the subject component of RDF triples.</p></list-item>
            <list-item><p>A set of predicate-object maps (rr:PredicateObjectMap), each composed of a set of predicate maps (rr:PredicateMap) and object maps (rr:ObjectMap), which generate the predicate and object components of RDF triples, respectively. If a join with another triples map is needed, a reference object map (rr:RefObjectMap) can be used. The other triples map to be joined is specified in rr:parentTriplesMap and the join condition is specified via rr:Join.</p></list-item>
          </list>
Subject maps, predicate maps, and object maps are term maps, which are
used to specify rules to generate the corresponding RDF triple elements.
Those rules can be specified as a constant (rr:constant), a database column
(rr:column), or a template (rr:template).
        </p>
        <p><bold>2.3 morph-RDB</bold></p>
        <p>
morph-RDB, which belongs to the morph suite<sup>7</sup>, receives as input the
connection details of a relational database, an R2RML mapping document, and
a SPARQL query. It translates the query into SQL according to the R2RML
mapping, evaluates it against the underlying relational database, and translates
the results back into the format required by the SPARQL query.
The query translator component in morph-RDB implements the
algorithm described in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], which extends previous work in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. That work defines a set
of mappings and functions to translate SPARQL queries posed against
RDB-backed triple stores into SQL queries, and proves the correctness of
the query translation using the notion of semantics preservation; in other words, the
SPARQL query and its SQL translation return the same answers. We extend that work
by relating those mappings and functions to the R2RML mapping elements.
        </p>
        <sec id="sec-2-2-1">
          <title>6 http://www.gib.fi.upm.es/hl7rim-common-data-model/ 7 http://www.oeg-upm.net/index.php/en/technologies/334-morph</title>
          <p>Example 1. Consider the table v_person(patientId, patientName,
gender, actId), which stores information about patients. This table is mapped
to the class Patient, with the attribute patientId as the identifier (together with
the base URI for class Patient) of the instances. The attributes patientId and patientName
are mapped to the ontology properties hasID and hasName, respectively. Now let us
add another table, v_observation(actId, title, code), which describes
observations. This table is mapped to the class Observation with actId as the
identifier of the instances, and the attribute title is mapped to the property hasTitle.
patientId and actId are the primary keys of the tables v_person and v_observation,
respectively. Furthermore, the actId column of table v_person is a foreign key that
refers to the column actId of table v_observation, and this relation is mapped
to the property hasObservation.</p>
          <p>The instances of the tables can be seen in Figure 1.</p>
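          <p>To make the mapping of Example 1 more concrete, the following is a minimal Python sketch of how a triples map turns rows into RDF triples. It is not morph-RDB code and not R2RML syntax: the base URI, the dictionary layout, the function name rows_to_triples, and the sample rows are all assumptions introduced for illustration only.</p>
          <preformat>
```python
# Hypothetical, simplified stand-in for the R2RML mapping of Example 1.
# BASE and the dictionary layout are assumptions made for illustration only.
BASE = "http://example.org/"

PERSON_MAP = {
    "class": BASE + "Patient",
    # Plays the role of an rr:template subject map.
    "subject_template": BASE + "Patient/{patientId}",
    # Plays the role of rr:column object maps, keyed by column name.
    "columns": {
        "patientId": BASE + "hasID",
        "patientName": BASE + "hasName",
    },
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_triples(rows, triples_map):
    """Generate (subject, predicate, object) triples from a list of row dicts."""
    triples = []
    for row in rows:
        subject = triples_map["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, triples_map["class"]))
        for column, predicate in triples_map["columns"].items():
            value = row.get(column)
            if value is not None:  # a NULL column value yields no triple
                triples.append((subject, predicate, value))
    return triples


# Made-up rows; the actual instances are those of Figure 1.
rows = [
    {"patientId": 1, "patientName": "Alice", "gender": "F", "actId": 10},
    {"patientId": 2, "patientName": None, "gender": "M", "actId": 11},
]
triples = rows_to_triples(rows, PERSON_MAP)
for triple in triples:
    print(triple)
```
          </preformat>
          <p>A data translation tool would materialize such triples in a triple store, whereas a query translation engine such as morph-RDB uses the same mapping information to rewrite SPARQL into SQL without generating the triples up front.</p>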
        </sec>
      </sec>
      <sec id="sec-2-3">
        <preformat>SELECT T1.patientId, T2.patientName
FROM T1 INNER JOIN T2 ON T1.patientId = T2.patientId;</preformat>
      </sec>
      <sec id="sec-2-4">
        <p>where T1 = (SELECT patientId FROM v_person WHERE patientId IS NOT NULL)
and T2 = (SELECT patientId, patientName FROM v_person WHERE patientId
IS NOT NULL AND patientName IS NOT NULL) are the results of translating
the first and second triple patterns, respectively.</p>
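        <p>The naive translation can be tried out on a toy database. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the MySQL server, inlines T1 and T2 as subqueries, and populates v_person with made-up rows.</p>
        <preformat>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE v_person (patientId INTEGER PRIMARY KEY, patientName TEXT, "
    "gender TEXT, actId INTEGER)"
)
# Made-up rows; patient 2 has no name recorded.
conn.executemany(
    "INSERT INTO v_person VALUES (?, ?, ?, ?)",
    [(1, "Alice", "F", 10), (2, None, "M", 11), (3, "Carol", "F", 12)],
)

# One subquery per triple pattern, joined on the shared subject column.
NAIVE_SQL = """
SELECT T1.patientId, T2.patientName
FROM (SELECT patientId FROM v_person WHERE patientId IS NOT NULL) T1
INNER JOIN (SELECT patientId, patientName FROM v_person
            WHERE patientId IS NOT NULL AND patientName IS NOT NULL) T2
  ON T1.patientId = T2.patientId
ORDER BY T1.patientId
"""
naive_rows = conn.execute(NAIVE_SQL).fetchall()
print(naive_rows)
```
        </preformat>
        <p>Only patients that have both an identifier and a name appear in the answer, which is what a conjunction of two triple patterns on the same subject is expected to return.</p>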
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Query Optimizations</title>
      <p>The query translation technique presented above does not necessarily generate
optimal SQL queries. Based on the set of SPARQL queries we have evaluated,
we have observed that several patterns occur frequently. Hence we describe
optimization techniques that can be applied in order to generate more efficient
queries.</p>
      <p>- Self-join elimination. A set of triple patterns connected by the AND
operator and sharing the same subject occurs frequently. We call this pattern a
Subject Triple Group (STG). Using a naive query translation, each triple
pattern corresponds to an INNER join in the generated SQL query. We have
extended our query translation technique so that, in addition to handling
triple patterns, each of the mappings/functions is also able to handle STGs.</p>
      <p>Example 2. Consider again the SPARQL query in Example 1. With self-join
elimination, the result of translating that query is:</p>
      <sec id="sec-3-1">
        <preformat>SELECT T1.patientId, T1.patientName FROM v_person T1
WHERE T1.patientId IS NOT NULL AND T1.patientName IS NOT NULL;</preformat>
      </sec>
      <sec id="sec-3-2">
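        <p>That self-join elimination preserves the answers can be checked on a small example. The following sketch assumes SQLite (via Python's sqlite3 module) instead of MySQL and uses made-up rows; it runs the naive self-join and the single-scan query and compares their results.</p>
        <preformat>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v_person (patientId INTEGER PRIMARY KEY, patientName TEXT)")
conn.executemany(
    "INSERT INTO v_person VALUES (?, ?)",
    [(1, "Alice"), (2, None), (3, "Carol")],  # made-up rows
)

# Naive translation: one subquery per triple pattern, self-joined on patientId.
NAIVE_SQL = """
SELECT T1.patientId, T2.patientName
FROM (SELECT patientId FROM v_person WHERE patientId IS NOT NULL) T1
INNER JOIN (SELECT patientId, patientName FROM v_person
            WHERE patientId IS NOT NULL AND patientName IS NOT NULL) T2
  ON T1.patientId = T2.patientId
ORDER BY T1.patientId
"""

# After self-join elimination: a single scan of v_person.
OPTIMIZED_SQL = """
SELECT T1.patientId, T1.patientName FROM v_person T1
WHERE T1.patientId IS NOT NULL AND T1.patientName IS NOT NULL
ORDER BY T1.patientId
"""

naive_rows = conn.execute(NAIVE_SQL).fetchall()
optimized_rows = conn.execute(OPTIMIZED_SQL).fetchall()
same_answers = naive_rows == optimized_rows
print(same_answers)
```
        </preformat>
        <p>The equivalence relies on both triple patterns being mapped to the same table and joined on its key, which is the situation that an STG over a single triples map describes.</p>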
        <p>- Left-outer join to inner join. Another pattern is a Subject Triple Group
with Optional (OSTG), that is, an OPTIONAL pattern that consists of only one
triple pattern, preceded by an STG pattern or a triple pattern. Because the
OPTIONAL keyword corresponds to a left outer join, naively translating this
pattern produces one left outer join for each OPTIONAL pattern. We extend
our query translation technique so that the optimized query translation
generates an inner join, which is cheaper to evaluate than a left outer join, by
removing the conditional expression IS NOT NULL corresponding to the function
genCondSQL of the triple pattern in the OPTIONAL pattern.</p>
        <p>Example 3. Consider the following pattern:</p>
        <preformat>{ ?p :hasID ?pid . OPTIONAL { ?p :hasName ?pname . } }</preformat>
        <p>Without left-outer join elimination, the result of translating this query is:</p>
      </sec>
      <sec id="sec-3-3">
        <preformat>SELECT T1.patientId, T2.patientName FROM
(SELECT patientId FROM v_person
 WHERE patientId IS NOT NULL) T1 LEFT OUTER JOIN
(SELECT patientId, patientName FROM v_person
 WHERE patientId IS NOT NULL AND patientName IS NOT NULL) T2
ON T1.patientId = T2.patientId;</preformat>
      </sec>
      <sec id="sec-3-5">
        <p>By changing the type of join from left outer to inner, removing the
conditional expression patientName IS NOT NULL, and applying self-join
elimination, the optimized query becomes:</p>
      </sec>
      <sec id="sec-3-6">
        <preformat>SELECT T2.patientId, T2.patientName FROM v_person T2;</preformat>
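        <p>As a sanity check of this rewriting, the sketch below compares the left outer join query with the optimized single-table query. It assumes SQLite (Python's sqlite3 module) rather than MySQL, made-up rows, and that patientId is the primary key of v_person, so it is unique and never NULL.</p>
        <preformat>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v_person (patientId INTEGER PRIMARY KEY, patientName TEXT)")
conn.executemany(
    "INSERT INTO v_person VALUES (?, ?)",
    [(1, "Alice"), (2, None), (3, "Carol")],  # made-up rows; patient 2 has no name
)

# Naive translation of the OPTIONAL pattern: a left outer join of two subqueries.
LEFT_OUTER_SQL = """
SELECT T1.patientId, T2.patientName FROM
(SELECT patientId FROM v_person WHERE patientId IS NOT NULL) T1
LEFT OUTER JOIN
(SELECT patientId, patientName FROM v_person
 WHERE patientId IS NOT NULL AND patientName IS NOT NULL) T2
ON T1.patientId = T2.patientId
ORDER BY T1.patientId
"""

# Optimized translation: no join and no IS NOT NULL test on patientName.
OPTIMIZED_SQL = """
SELECT T2.patientId, T2.patientName FROM v_person T2
ORDER BY T2.patientId
"""

outer_rows = conn.execute(LEFT_OUTER_SQL).fetchall()
optimized_rows = conn.execute(OPTIMIZED_SQL).fetchall()
same_answers = outer_rows == optimized_rows
print(same_answers)
```
        </preformat>
        <p>In both queries the patient without a name is kept, with a NULL in the patientName column, which corresponds to an unbound ?pname in the SPARQL answer.</p>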
        <p>- Phantom triple pattern introduction.</p>
        <p>Example 4. Consider the following pattern, which is neither an STG pattern
nor an OSTG pattern, so none of the aforementioned optimizations can be applied:</p>
        <preformat>{ ?p :hasObservation ?o . OPTIONAL { ?s :hasTitle ?t . } }</preformat>
        <p>In order to exploit the optimizations presented so far, this query
has to be transformed into another query whose translation
can be optimized. To do that, we use the fact that for every IRI x, the triple
(x a rdfs:Resource) holds, so that we can safely add this triple pattern to the
query without changing its semantics. We call such a triple pattern a phantom
triple pattern. Once the phantom triple pattern has been added, a new pattern emerges
to which the optimization for the OSTG pattern can be applied.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <p>We have collected a total of 45 SPARQL queries that are used in the patient
recruitment and cohort selection scenario for breast cancer clinical trials. The
complete list of queries and their natural language descriptions is available at
http://bit.ly/1LecA7L. From this query list, we asked our domain experts to
group the queries into a set of five groups that are representative of the whole
set; they are shown in Table 1.</p>
      <p>Table 1. Representative queries and the queries similar to each of them. The representative queries are Q01, Q10, Q14, Q34, and Q45.</p>
      <p>- Observation query (Q45). This query retrieves the information of patients
in whom a category T2 breast tumor has been detected. It consists of 14 triple
patterns and 5 unique subjects. There are 4 OPTIONAL patterns, one of
them nested, and 2 FILTER patterns.</p>
      <p>The machine used in our evaluation has the following specifications:
Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz, 8 GB of RAM, and a 750 GB HDD, running
Ubuntu Server 12.04 and MySQL Server 5.5. The dataset contains
3 months of historical clinical data, with 4674 patients and 65056 acts, among
many other tables. The total size of the database is 105 MB. This database will
grow in the future, as more data are added as a result of the data integration
processes carried out in the context of the projects where the database is being
generated.</p>
      <p>We were interested in comparing morph-RDB with another well-established
RDB2RDF engine, such as D2R Server, considering the total time required for the
execution of the SPARQL queries. That is, we include in our calculation the
time required to initialize the engine, the time needed for SPARQL-to-SQL query
translation, the time needed to evaluate the SQL queries, and the time needed
to translate the results from the database back, using the mappings, into the results
expected by the SPARQL queries. Figure 2 provides details for the five selected
queries, which are similar to the results obtained for the other queries in
our query set. In most cases our total execution time is
much lower than the one required by D2R Server. In some cases (queries Q14
and Q45) D2R Server was not able to produce results in less than five minutes.
The results for the rest of the queries are available at http://bit.ly/1LecA7L.</p>
      <p>We were also interested in how the SQL queries that result from the query
rewriting approach perform in comparison to the SQL queries that would have
been natively created by a SQL expert. For this reason, we asked a domain
expert with good knowledge of the HL7 RIM relational database to construct
SQL queries that were semantically equivalent to the corresponding SPARQL
queries. In other words, without taking into account the mapping elements, such
as template or URI generation, the SPARQL and SQL queries should return the
same answer.</p>
      <p>We evaluated each query 5 times in cold and warm modes. In the cold mode,
we restart the server and empty the cache before evaluating the next query. In
the warm mode, we skip these steps and execute the queries directly one after
another. We measure the average query execution time and normalize it to the
evaluation time of the native query. As an additional note,
we can only do this type of evaluation using morph-RDB and native queries, as
D2R Server produces multiple SQL queries in many cases and performs joins in
memory, which makes it not comparable with the native or morph-RDB queries.</p>
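      <p>The normalization step amounts to a simple ratio of averages. The following Python sketch shows one way to compute it; the function name normalized_time and the timings are hypothetical and are not measurements from our evaluation.</p>
      <preformat>
```python
from statistics import mean


def normalized_time(translated_runs, native_runs):
    """Mean time of the translated query divided by mean time of the native query."""
    return mean(translated_runs) / mean(native_runs)


# Hypothetical timings in seconds (5 runs each); not measurements from this paper.
native_runs = [0.10, 0.12, 0.11, 0.10, 0.12]
translated_runs = [2.2, 2.1, 2.3, 2.2, 2.2]

ratio = normalized_time(translated_runs, native_runs)
print(round(ratio, 1))
```
      </preformat>
      <p>A ratio of 1 means that the translated query is as fast as the native one, while larger values indicate how many times slower the translated query is.</p>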
      <p>The results from both evaluation modes show a similar trend. Furthermore,
we observed that in the warm mode the database server retains its
ability to reuse previous results from the query cache. This is reflected by the fact that
only the first run of the query takes more time to complete, while subsequent
runs can be evaluated in only a fraction of that time. Some of the queries
produced by morph-RDB can be evaluated in a reasonable time. For example,
the translation of query Q01 can be evaluated in a time similar to that of
the native query Q01. Furthermore, the translation of Q34 can
be evaluated in less time than its corresponding native query, which may
indicate that there is still room for improving the native
query. Some other queries, such as Q10 and Q45, need more time to be
evaluated, being 20-35x slower than the corresponding native queries,
which we still consider acceptable. Query Q14, however, needs
more investigation, as it takes a long time to be evaluated: 380-500x slower than
the native query in terms of normalized time. We suspect this is caused by the
arithmetic operation that is performed in the translated query.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper we have shown that SPARQL queries can be used as a means
to query relational clinical data that is integrated into an HL7 version 3 RIM
database implementation. We collected a set of 45 real SPARQL queries that are required
by our application domain and will be deployed in a set of medical institutions,
chose five of them as the most representative ones, and evaluated those queries
using D2R Server and morph-RDB as our RDB2RDF tools. We have shown
that, in general, we obtained better results with morph-RDB than with D2R Server, which
now allows us to use this approach for accessing relational data using SPARQL.</p>
      <p>However, there are still some important challenges to be
considered. We still have queries that require too much time to be evaluated (e.g., query
Q14), because of the arithmetic operation that is included in the SPARQL query
and in its resulting translation into SQL. Investigating and designing
optimizations for dealing with this type of query will be part of our future work.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work has been funded by Ministerio de Economía y Competitividad (Spain)
under the project "4V: Volumen, Velocidad, Variedad y Validez en la Gestión
Innovadora de Datos" (TIN2013-46238-C4-2-R), by the European Commission
through the EURECA (FP7-ICT-2011-7-288048) project and also by the
Ministry of Health of the Spanish Government under Grant PI13/02020.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>G.</given-names>
            <surname>Beeler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Case</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Curry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hueber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Mckenzie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Schadow</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Shakir</surname>
          </string-name>
          .
          <article-title>HL7 reference information model</article-title>
          .
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>D2R server-publishing relational databases on the semantic web</article-title>
          .
          <source>In Poster at the 5th International Semantic Web Conference</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>A.</given-names>
            <surname>Chebotko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Fotouhi</surname>
          </string-name>
          .
          <article-title>Semantics preserving SPARQL-to-SQL translation</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          ,
          <volume>68</volume>
          (
          <issue>10</issue>
          ):
          <fpage>973</fpage>
          -
          <lpage>1000</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Coyle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Mori</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hu</surname>
          </string-name>
          .
          <article-title>Standards for detailed clinical models as the basis for medical data exchange and decision support</article-title>
          .
          <source>International journal of medical informatics</source>
          ,
          <volume>69</volume>
          (
          <issue>2</issue>
          ):
          <fpage>157</fpage>
          -
          <lpage>174</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundara</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>R2RML: RDB to RDF mapping language</article-title>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Galperin</surname>
          </string-name>
          and
          <string-name>
            <given-names>X. M.</given-names>
            <surname>Fernandez-Suarez</surname>
          </string-name>
          .
          <article-title>The 2012 nucleic acids research database issue and the online molecular biology database collection</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>40</volume>
          (
          <issue>D1</issue>
          ):
          <fpage>D1</fpage>-<lpage>D8</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          .
          <article-title>Can RDB2RDF tools feasibily expose large science archives for data integration?</article-title>
          <source>In Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications</source>
          , pages
          <fpage>491</fpage>
          -
          <lpage>505</lpage>
          . Springer-Verlag,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>W. R.</given-names>
            <surname>Hersh</surname>
          </string-name>
          .
          <article-title>Adding value to the electronic health record through secondary use of data for quality assurance, research, and surveillance</article-title>
          .
          <source>Clin Pharmacol Ther</source>
          ,
          <volume>81</volume>
          :
          <fpage>126</fpage>
          –
          <lpage>128</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaufman</surname>
          </string-name>
          .
          <article-title>Healthcare and life sciences standards overview-technology for life: NC symposium on biotechnology and bioinformatics</article-title>
          .
          <source>In Biotechnology and Bioinformatics, 2004. Proceedings. Technology for Life: North Carolina Symposium on</source>
          , pages
          <fpage>31</fpage>
          –
          <lpage>41</lpage>
          . IEEE,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>L. M.</given-names>
            <surname>McShane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Cavenagh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Lively</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Eberhard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Bigbee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Mesirov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-Y. C.</given-names>
            <surname>Polley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Tricoli</surname>
          </string-name>
          , et al.
          <article-title>Criteria for the use of omics-based predictors in clinical trials</article-title>
          .
          <source>Nature</source>
          ,
          <volume>502</volume>
          (
          <issue>7471</issue>
          ):
          <fpage>317</fpage>
          –
          <lpage>320</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Moratilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Alonso-Calvo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Molina-Vaquero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Paraiso-Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Perez-Rey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Maojo</surname>
          </string-name>
          .
          <article-title>A data model based on semantically enhanced HL7 RIM for sharing patient data of breast cancer clinical trials</article-title>
          .
          <source>Studies in health technology and informatics</source>
          ,
          <volume>192</volume>
          :
          <fpage>971</fpage>
          –
          <lpage>971</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>S.</given-names>
            <surname>Paraiso-Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Perez-Rey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bucur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Claerhout</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Alonso-Calvo</surname>
          </string-name>
          .
          <article-title>Semantic normalization and query abstraction based on SNOMED-CT and HL7: Supporting multi-centric clinical trials</article-title>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>F.</given-names>
            <surname>Priyatna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          .
          <article-title>Formalisation and experiences of R2RML-based SPARQL to SQL query translation using morph</article-title>
          .
          <source>In Proceedings of the 23rd International World Wide Web Conference</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>