<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Medical and Transmission Vector Vocabulary Alignment with Schema.org</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>William Smith</string-name>
          <email>william.smith@pnnl.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alan Chappell</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Pacific Northwest National Laboratory</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Available biomedical ontologies and knowledge bases currently lack formal and standards-based interconnections between disease, disease vector, and drug treatment vocabularies. The PNNL Medical Linked Dataset (PNNL-MLD) addresses this gap. This paper describes the PNNL-MLD, which provides a unified vocabulary and dataset of drug, disease, side effect, and vector transmission background information. Currently, the PNNL-MLD combines and curates data from the following research projects: DrugBank, DailyMed, Diseasome, DisGeNet, Wikipedia Infobox, Sider, and PharmGKB. The main outcomes of this effort are a dataset aligned to Schema.org, including a parsing framework, and extensible hooks ready for integration with selected medical ontologies. The PNNL-MLD enables researchers to query distinct datasets more quickly and easily. Future extensions to the PNNL-MLD may include Traditional Chinese Medicine, broader interlinks across genetic structures, a larger thesaurus of synonyms and hypernyms, explicit coding of diseases and drugs across research systems, and vector-borne transmission vocabularies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Medical vocabularies and ontologies have been developed
over the last two decades and represent a large cross-section
of Linked Open Datasets. Several research initiatives are
now de facto authoritative data stores used by thousands of
medical researchers daily, including DrugBank
        <xref ref-type="bibr" rid="ref10">(Law, et al. 2014)</xref>
        , PharmGKB
        <xref ref-type="bibr" rid="ref12">(Stanford University 2014)</xref>
        , VectorBase
        <xref ref-type="bibr" rid="ref11">(National Institute of Allergy and Infectious Diseases;
National Institutes of Health; Department of Health and
Human Services 2014)</xref>
        , UniProt
        <xref ref-type="bibr" rid="ref3">(The UniProt Consortium 2014)</xref>
        , the Allen
Institute for Brain Science (AIBS) Brain Map
        <xref ref-type="bibr" rid="ref1">(Allen
Institute for Brain Science 2014)</xref>
        , and the Kyoto Encyclopedia
of Genes and Genomes (KEGG)
        <xref ref-type="bibr" rid="ref8">(Kanehisa, et al. 2014)</xref>
        .
However, as these advanced medical
vocabularies and description logic rules accumulated, their data
classifications diverged.
      </p>
      <p>Medical research groups rarely attempted to standardize
vocabularies and ontologies with other research teams. This
created data resources that are not natively interconnected
with knowledge bases outside of a specific research
objective. Furthermore, specific medical coding may exist at the
entity level (OMIM, MeSH, eMedicine, etc.), but there is no
inherent guarantee across data sources that these codes are
available or properly represented in a standard format.
Entity matching between datasets is complicated by the fact
that most medical classes operate on a complex set of synonyms,
hypernyms, or taxonomical naming schemas that typically
are not standardized across research projects and
communities.</p>
      <p>This effort addresses the tracking of a disease and
treatment regimen across vector-borne transmission variables,
including geography and species. The issues
described above make any single available source of research data
insufficient to answer realistic research questions across the
breadth of this domain. Table 1 lists the common
diseases and transmission vectors for tracking vector-borne
infections that were used as the starting point.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Common diseases and transmission vectors for tracking vector-borne infections.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Disease</th><th>Transmission Vector</th></tr>
          </thead>
          <tbody>
            <tr><td>Eastern Equine Encephalitis Virus</td><td>Culiseta melanura / Cs. morsitans</td></tr>
            <tr><td>Western Equine Encephalitis Virus</td><td>Culex / Culiseta</td></tr>
            <tr><td>Highlands J Virus</td><td>Culiseta melanura</td></tr>
            <tr><td>St. Louis Encephalitis Virus</td><td>Culex</td></tr>
            <tr><td>West Nile Virus</td><td>Many</td></tr>
            <tr><td>La Crosse Encephalitis</td><td>Ochlerotatus triseriatus (synonym Aedes triseriatus)</td></tr>
            <tr><td>Chikungunya</td><td>A. albopictus and A. aegypti</td></tr>
            <tr><td>Dengue Fever</td><td>Genus Aedes, principally A. aegypti</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-2">
      <title>INITIAL VOCABULARIES AND ONTOLOGIES</title>
      <p>One way of making use of the extensive previous work in
disease descriptions by different research efforts, and of
enabling associations across these vocabularies, is assembling a
knowledge base targeting the research area of interest. The
more overlapping sets of information present in the resulting
knowledge base, the better chance a system has of making
associations across vocabularies, simply because more
information is available on which to base the
associations. For tracking vector-borne infections, disease datasets
are a primary focus. Therefore, the team initially
collected authoritative resources with a large number of disease
entities and extensive properties attached to each entity.</p>
      <p>
        The chosen datasets and entity count estimates include:
Diseasome
        <xref ref-type="bibr" rid="ref5">(Goh, et al. 2007)</xref>
        , PharmGKB, DisGeNet
        <xref ref-type="bibr" rid="ref4">(DisGeNet 2014)</xref>
        , and Wikipedia Infobox
        <xref ref-type="bibr" rid="ref17">(Wikimedia
Foundation 2014)</xref>
        . Table 2 lists the datasets
incorporated and the scale of the associated relevant vocabularies.
      </p>
      <p>These datasets describe diseases at different levels of
detail. For example, PharmGKB has a
small number of diseases with many properties, whereas
DisGeNet has several times more entities, each expressed
with a single name property and medical code.</p>
      <p>
        Drug datasets, while initially not appearing to be part
of the use case of tracking vector-borne infections, are
useful as a direct path for aligning diseases across naming
conventions. The selected drug datasets and estimated
entity counts include DailyMed
        <xref ref-type="bibr" rid="ref13">(United States National
Library of Medicine 2014)</xref>
        and DrugBank. In practice,
drug datasets contain extensive listings of medical
codes, collected from prior research across databases, that are
often missing from disease datasets. While these codes
can be imprecise, they provide a starting point for entity
interlinks and additional data enrichment through NLP
and Linked Data techniques. When we focus on the
disease medical codes affected by a specific treatment, the
medical codes in the drug datasets enable us to
programmatically create owl:sameAs relations across diseases in
the disease datasets that are missing explicit matching
medical codes or proper names. As a result, when drugs
listing extensive medical codes are used as a reference
point, diseases can often be described more fully, because
missing medical codes are combined across datasets for more
complete Linked Open Data.
      </p>
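      <p>As a rough illustration of how shared medical codes can drive owl:sameAs links, the following Python sketch groups disease entities by code and proposes a link for every pair that shares one. The identifiers and codes are hypothetical placeholders, and the pairing rule is an assumption made for illustration, not a description of the project's parsing framework.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def same_as_candidates(disease_codes):
    """Return sorted pairs of disease IRIs that share at least one medical code.

    disease_codes maps a disease IRI to the set of medical codes gathered for
    it, including codes recovered from the drug datasets.
    """
    by_code = defaultdict(set)
    for disease, codes in disease_codes.items():
        for code in codes:
            by_code[code].add(disease)

    pairs = set()
    for diseases in by_code.values():
        # Every two diseases that carry the same code become a candidate link.
        for left, right in combinations(sorted(diseases), 2):
            pairs.add((left, right))
    return sorted(pairs)


# Hypothetical identifiers and codes, for illustration only.
example = {
    "beo-disease:diseasome-1": {"MeSH:X1"},
    "beo-disease:disgenet-7": {"MeSH:X1", "OMIM:Y2"},
    "beo-disease:pharmgkb-3": {"OMIM:Y2"},
    "beo-disease:wikipedia-9": {"MeSH:Z3"},
}
links = same_as_candidates(example)
for left, right in links:
    print(f"{left} owl:sameAs {right} .")
```
]]></preformat>
      <p>Because the codes can be imprecise, links produced this way are best treated as candidates to be reviewed or filtered before they are asserted.</p>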
      <p>
        Side effects were also included in the initial
PNNL-MLD. This additional information enables detecting
symptoms and matching the symptom to a disease or drug
combination. The Sider
        <xref ref-type="bibr" rid="ref9">(Kuhn, et al. 2010)</xref>
        dataset was
selected as the lone source due to limited availability, but
Sider contained dozens of different connections per entity
across drugs, further helping to align the combined
dataset.
      </p>
      <p><bold>TARGET VOCABULARY: SCHEMA.ORG</bold></p>
      <p>To make queries easier to describe through a
consistent vocabulary, the project chose one primary
vocabulary to encompass the collected data. Two primary
considerations drove the selection of this vocabulary: 1)
adequate expressiveness for the queries, and 2) not being so
prescriptive that it creates conflicts with the
individual dataset semantics. The selection of this primary
vocabulary is important, as it is an opportunity to promote
wider use of the assembled dataset through adoption of an
impactful or widely used vocabulary.</p>
      <p>
        Schema.org
        <xref ref-type="bibr" rid="ref6">(Google Inc; Microsoft Inc; Yahoo Inc
2014)</xref>
        was released in June 2011 and has become the
search industry's preferred standard for publishing search
engine readable data. After the release of Schema.org, an
RDFS
        <xref ref-type="bibr" rid="ref15">(W3C RDF Working Group 2004)</xref>
        mapping was
created and hosted on http://schema.rdfs.org, and this
mapping is now a standard for Linked Data research
utilizing Schema.org. Finally, at the end of June 2011,
Schema.org released an official OWL
        <xref ref-type="bibr" rid="ref14">(W3C OWL
Working Group 2012)</xref>
        version of the Schema.org
ontology, bridging the gap between vocabulary and description
logic.
      </p>
      <p>Schema.org provides a base ontology class for medical
entities available as a subclass of Thing entitled
MedicalEntity. The subclasses of the MedicalEntity class were
selected to represent the disease, drug, and side effect
entities available within the PNNL-MLD. Table 3 lists
the selected subclasses.</p>
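      <table-wrap id="tab-3">
        <label>Table 3</label>
        <caption>
          <p>Schema.org classes selected to represent use case entities.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Schema.org Class</th><th>Entity</th></tr>
          </thead>
          <tbody>
            <tr><td>MedicalCondition</td><td>Disease</td></tr>
            <tr><td>MedicalCause</td><td>Disease Cause</td></tr>
            <tr><td>MedicalSignOrSymptom</td><td>Disease Symptom</td></tr>
            <tr><td>MedicalTherapy, Drug</td><td>Drug</td></tr>
            <tr><td>MedicalCode</td><td>Entity Code</td></tr>
            <tr><td>MedicalEntity</td><td>Side Effect</td></tr>
          </tbody>
        </table>
      </table-wrap>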
      <p>Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.</p>
    </sec>
    <sec id="sec-3">
      <title>VOCABULARY ALIGNMENT</title>
      <p>Simply adding a primary vocabulary to the datasets is not
adequate to simplify querying. The source datasets must
be aligned with the primary vocabulary so that queries
will return results that span and integrate all the available
information. The central goal in this alignment is to
provide a mapping of the source vocabularies to the new
primary vocabulary that preserves the semantics of the
source but bridges the divergence between the different
knowledge representations.</p>
      <p><bold>Base dataset alignment</bold></p>
      <p>
        The project selected the URI
http://beowulf.pnnl.gov/2014/ to serve as the RDF
        <xref ref-type="bibr" rid="ref15">(W3C
RDF Working Group 2004)</xref>
        prefix base for all aligned
data. We used this new base URI to simplify software
development later in the alignment process. Furthermore,
every property was immediately given a prefix identifying the
dataset it was imported from, in the form
beo-&lt;dataset-name&gt;:propertyName.
      </p>
      <p>By first associating property and class values with a
prefix denoting the source dataset, we could track
properties that were not explicitly aligned to Schema.org. The
rdfs:label and owl:sameAs properties were left
unmodified throughout the entire process, and
schema:alternateName is used to track synonyms of
rdfs:label.</p>
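      <p>The prefixing rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example source IRIs are hypothetical, and how each beo-&lt;dataset-name&gt; prefix expands under the base URI is not specified here.</p>
      <preformat><![CDATA[
```python
# Properties that the text says were left unmodified during alignment.
UNMODIFIED = {"rdfs:label", "owl:sameAs"}


def aligned_property(dataset_name, property_iri):
    """Rewrite a source property IRI as beo-<dataset-name>:propertyName."""
    # Use the local name after the last '#' or '/' of the source IRI.
    local_name = property_iri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]
    return f"beo-{dataset_name.lower()}:{local_name}"


def align(dataset_name, prop):
    """Keep rdfs:label and owl:sameAs as they are; prefix everything else."""
    if prop in UNMODIFIED:
        return prop
    return aligned_property(dataset_name, prop)


# Hypothetical source IRIs, for illustration only.
print(align("DrugBank", "http://example.org/drugbank/vocab/possibleDiseaseTarget"))
print(align("Sider", "http://example.org/sider/vocab#sideEffect"))
print(align("Sider", "rdfs:label"))
```
]]></preformat>
      <p>Keeping the dataset name in the prefix is what makes it possible to list, after alignment, every property that has not yet been mapped to Schema.org.</p>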
      <p><bold>Disease dataset alignment</bold></p>
      <p>Four large datasets of varying entity counts and properties
were the first targets after the base import of the
PNNL-MLD. The first substitution converted all
unique entity IRIs to a common format:</p>
      <preformat>beo-disease:&lt;disease-id&gt;</preformat>
      <p>We then added the Schema.org declaration of class:</p>
      <preformat>a schema:MedicalCondition</preformat>
      <p>Primary preventions were added to diseases as drug IRIs
were detected:</p>
      <preformat>schema:primaryPrevention
    beo-drug:&lt;original-drugid&gt;, …</preformat>
      <p>Finally, we use Table 4 to ensure we can match back to
online medical resources and unify datasets:</p>
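      <table-wrap id="tab-4">
        <label>Table 4</label>
        <caption>
          <p>Alignment of Schema.org classes to medical resources.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Schema.org Class</th><th>Entity</th></tr>
          </thead>
          <tbody>
            <tr><td>MedicalCode</td><td>IRI</td></tr>
            <tr><td>MedicalPage</td><td>URI</td></tr>
            <tr><td>code</td><td>Unknown Code Type</td></tr>
          </tbody>
        </table>
      </table-wrap>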
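      <p>To make the disease alignment steps concrete, the Python sketch below serializes one aligned disease entity as Turtle. The function name, the identifiers, and the layout of the output are assumptions made for illustration. Only the beo-disease and beo-drug prefixes, the schema:MedicalCondition class, and the schema:primaryPrevention property come from the steps described above.</p>
      <preformat><![CDATA[
```python
def disease_turtle(disease_id, label, drug_ids=()):
    """Serialize one aligned disease entity as a Turtle snippet.

    Labels containing double quotes would need escaping; that is omitted here.
    """
    lines = [
        f"beo-disease:{disease_id}",
        "    a schema:MedicalCondition ;",
        f'    rdfs:label "{label}"',
    ]
    if drug_ids:
        # Each detected drug IRI becomes a primary prevention of the disease.
        drugs = ", ".join(f"beo-drug:{drug_id}" for drug_id in drug_ids)
        lines[-1] += " ;"
        lines.append(f"    schema:primaryPrevention {drugs}")
    lines[-1] += " ."
    return "\n".join(lines)


# Hypothetical identifiers, for illustration only.
turtle = disease_turtle("d0001", "Dengue Fever", ["db0001", "db0002"])
print(turtle)
```
]]></preformat>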
      <p><bold>Drug dataset alignment</bold></p>
      <p>Two datasets contained drug metadata and provided
interlinks to side effect metadata. These entities were
referenced in both the disease and side effect datasets, as potential
treatments (disease) and as causes (side effect). The first
substitution converted all unique entity
IRIs to a common format:</p>
      <preformat>beo-drug:&lt;drug-id&gt;</preformat>
      <p>We then added the Schema.org declaration of class:</p>
      <preformat>a schema:Drug</preformat>
      <p>Drug, a subclass of MedicalTherapy, was selected due to
the semantics of the original data. Drugs have the same
medical coding standards as Table 4, but the attributes
linking the drugs are more abstract, including two
descriptions of the drug:</p>
      <preformat>schema:potentialAction
schema:description</preformat>
      <p>To link the drug entity to a disease we replace:</p>
      <preformat>beo-drugbank:possibleDiseaseTarget</preformat>
      <p>with:</p>
      <preformat>schema:possibleTreatment</preformat>
      <p>Finally, drugs can interact with each other, creating
adverse reactions. The DrugBank dataset provides the
interconnections for this possibility. We aligned these
reactions by creating the entity type:</p>
      <preformat>a beo-drugbank:drug_interactions</preformat>
      <p>and ensuring the new entity has at least two of the
following relations:</p>
      <preformat>schema:interactingDrug beo-drug:id</preformat>
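      <p>The requirement that every interaction entity carries at least two schema:interactingDrug relations lends itself to a simple consistency check. The Python sketch below assumes the aligned data is available as (subject, predicate, object) tuples. The subject identifiers are hypothetical, and the check itself is an illustration, not part of the described pipeline.</p>
      <preformat><![CDATA[
```python
from collections import Counter

INTERACTION_TYPE = "beo-drugbank:drug_interactions"


def incomplete_interactions(triples):
    """Return interaction entities with fewer than two distinct interacting drugs."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == INTERACTION_TYPE}
    # A set removes repeated statements of the same drug link.
    links = {(s, o) for s, p, o in triples if p == "schema:interactingDrug"}
    counts = Counter(subject for subject, _ in links)
    return sorted(s for s in typed if counts[s] < 2)


# Hypothetical entity identifiers, for illustration only.
triples = [
    ("ex:interaction-1", "rdf:type", INTERACTION_TYPE),
    ("ex:interaction-1", "schema:interactingDrug", "beo-drug:a"),
    ("ex:interaction-1", "schema:interactingDrug", "beo-drug:b"),
    ("ex:interaction-2", "rdf:type", INTERACTION_TYPE),
    ("ex:interaction-2", "schema:interactingDrug", "beo-drug:a"),
]
flagged = incomplete_interactions(triples)
print(flagged)
```
]]></preformat>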
      <p><bold>Side effect dataset alignment</bold></p>
      <p>The single Sider dataset provides the final links to drug
entities, with each side effect’s unique IRI converted to:</p>
      <preformat>beo-interaction:&lt;effect-id&gt;</preformat>
      <p>We then added the Schema.org declaration of class:</p>
      <preformat>a schema:MedicalEntity</preformat>
      <p>Completing the ontology requires one last step, linking
drugs to side effects with the drug entity property:</p>
      <preformat>schema:seriousAdverseOutcome beo-interaction:&lt;id&gt;</preformat>
    </sec>
    <sec id="sec-4">
      <title>QUERY A DISEASE</title>
      <p>
        Using Dengue Fever as an example disease we can now
use schema:MedicalCondition to query across all of the
disease datasets. The SPARQL
        <xref ref-type="bibr" rid="ref16">(W3C SPARQL Working
Group 2013)</xref>
        query below locates the available
information in the combined dataset about any medical
condition with “dengue” in its name and collects the comments
that describe the source of that information.
      </p>
      <preformat>PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
# The schema: namespace below is an assumption; the text does not state the IRI used.
PREFIX schema: &lt;http://schema.org/&gt;

SELECT ?label ?comment
WHERE {
  ?item a schema:MedicalCondition .
  ?item rdfs:label ?label .
  FILTER (regex(?label, 'dengue', 'i')) .
  OPTIONAL { ?item rdfs:comment ?comment }
}</preformat>
      <p>Running this query on the PNNL-MLD returns Table 5.
The results in Table 5 expose a current limitation of the
system that stems from regex matching on the label property.
Because the query can now reach across several different
datasets with conflicting naming schemes, an additional
step is needed during the data import to
normalize labels for all of the entities linked with
owl:sameAs.</p>
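      <p>The normalization rules themselves are not specified above. As one possible starting point, the Python sketch below reduces each label to a comparison key by removing accents, case, punctuation, and repeated whitespace. These particular rules are assumptions chosen for illustration.</p>
      <preformat><![CDATA[
```python
import re
import unicodedata


def normalize_label(label):
    """Return a comparison key: lowercase, unaccented, single-spaced."""
    decomposed = unicodedata.normalize("NFKD", label)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace every run of characters other than digits and letters with a space.
    words = re.sub(r"[^0-9a-z]+", " ", unaccented.lower())
    return " ".join(words.split())


labels = ["Dengue Fever", "dengue  fever", "DENGUE-FEVER", "Dengue hemorrhagic fever"]
keys = sorted({normalize_label(label) for label in labels})
print(keys)
```
]]></preformat>
      <p>Entities whose labels reduce to the same key can then share a single normalized label, so that a name filter matches them consistently.</p>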
      <p>The results in Table 5 show that one simple query now
identifies data from three different sources. This begins to
show the value of the combined dataset. However, to
explore the full impact of the alignment a more complex
query is needed that requires the integration of
information from multiple sources. Expanding on our previous
query we can search across all originally returned
“dengue” conditions and append the drug links and treatments
added with Schema.org.</p>
      <p>Another limitation of the current PNNL-MLD is exposed
when reviewing the results of Table 6. When creating interlinks
across diseases, only the Diseasome entities were
referenced in the corresponding drug datasets as possible
targets for treatment. To correct this oversight, we also need
to include owl:sameAs associations within our queries, or
select a logical reasoner capable of associating and
returning all related entities upon a single link between a
disease and drug.</p>
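      <p>One way to picture what such a reasoner has to do is to group entities into clusters connected by owl:sameAs links, so that a drug linked to any member of a cluster can be returned for all of them. The Python sketch below uses a union-find structure for this. The entity identifiers are hypothetical, and this is an illustration of the idea, not the behavior of any particular reasoner.</p>
      <preformat><![CDATA[
```python
def same_as_clusters(pairs):
    """Group entities into clusters connected by owl:sameAs links (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


def equivalents(entity, pairs):
    """Return every entity reachable from `entity` through owl:sameAs links."""
    for cluster in same_as_clusters(pairs):
        if entity in cluster:
            return cluster
    return {entity}


# Hypothetical identifiers, for illustration only.
pairs = [
    ("beo-disease:a", "beo-disease:b"),
    ("beo-disease:b", "beo-disease:c"),
    ("beo-disease:x", "beo-disease:y"),
]
print(sorted(equivalents("beo-disease:c", pairs)))
```
]]></preformat>
      <p>Within SPARQL 1.1 itself, a property path such as (owl:sameAs|^owl:sameAs)* expresses the same traversal directly in a query.</p>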
      <p>Most importantly, Table 6 depicts the value of the
combined and aligned PNNL-MLD dataset. Queries like
the one given here that require information linking
diseases to treatments or symptoms or side effects are now
greatly simplified and can focus on a single vocabulary.
Schema.org provided classes and properties appropriate
for drafting queries that can provide views of the data not
visible using only a single source of data.</p>
      <p>No technical limitation exists that would restrict a user
from loading all of the datasets into separate graphs of an
available triplestore and querying the different
vocabularies across graphs. However, when we align these datasets
into the PNNL-MLD we achieve four major benefits:</p>
      <list list-type="order">
        <list-item>
          <p>Queries are now simplified. Early drafts for
querying across all of the graphs required queries that
were dozens of lines in length, and portions of the
queries varied drastically in format and language.</p>
        </list-item>
        <list-item>
          <p>A standardized, industry-recognized vocabulary
is now in place for application development.</p>
        </list-item>
        <list-item>
          <p>All of the graphs, when aligned into the
PNNL-MLD, are now equally extensible. Adding new
vocabularies and ontologies to the original data
would require special updates to each dataset, and
updates to each specific portion of a query
using that dataset.</p>
        </list-item>
        <list-item>
          <p>As shown in Table 2, when a dataset is converted
using RDF, and not generated from a different file
type (unaligned entities = 0), the Schema.org
entities now have a much higher ratio of Schema.org
predicates to object triple mappings. By flattening
the ontology, a simplified query now has access to a
much greater range of values and entities.</p>
        </list-item>
      </list>
      <p>Additionally, because all modifications and additions made
while aligning are programmatically defined rather than
mediated by human experts, new versions of the PNNL-MLD
can be easily created as source datasets produce new
versions.</p>
    </sec>
    <sec id="sec-5">
      <title>CURRENT LIMITATIONS</title>
      <p>The complete PNNL-MLD can now be
queried through SPARQL using only Schema.org
associations. However, there are still shortcomings in searching
for drugs and diseases by name, including the
corresponding regex filters. To resolve this, a primary label
for each group of entities related by owl:sameAs should be
selected when the entities are interlinked, with the previous labels
turned into schema:alternateName properties. Queries
should then be composed to search for a primary
name, an alternate synonym, or both. To remove duplicates
imported from different datasets, a reasoner capable of
merging owl:sameAs relations should be used when
querying the complete PNNL-MLD.</p>
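      <p>The proposed label handling can be sketched as follows in Python. The selection rule used here (most frequent label, then shortest, then alphabetical) is purely an assumption, since no rule is prescribed above, and the example labels are illustrative.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def merge_labels(labels):
    """Pick one primary label for an owl:sameAs group; the rest become alternates.

    Assumed rule: most frequent label first, then shortest, then alphabetical.
    """
    counts = Counter(labels)
    primary = min(counts, key=lambda label: (-counts[label], len(label), label))
    alternates = sorted(label for label in counts if label != primary)
    return {"rdfs:label": primary, "schema:alternateName": alternates}


def matches(entity, term):
    """Search the primary label and every alternate name, ignoring case."""
    names = [entity["rdfs:label"], *entity["schema:alternateName"]]
    return any(term.lower() in name.lower() for name in names)


# Illustrative labels gathered from one group of linked entities.
merged = merge_labels(["Dengue Fever", "Dengue", "Dengue Fever", "Breakbone fever"])
print(merged)
print(matches(merged, "breakbone"))
```
]]></preformat>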
      <p>
        Medical coding was not at first considered a feature of
the application, and early versions of the PNNL-MLD did
not prioritize accurately creating the properties in Table 3.
As it became apparent that diseases and drugs were not
consistently labeled across datasets, whereas outside database
entities generally were consistent across datasets, more
focus was placed on ensuring medical codes were applied to
drug and disease entities. However, this process was
never finalized through Linked Data authentication to ensure
the medical codes supplied were accurate for the attached
entity.
      </p>
      <p><bold>Future work</bold></p>
      <p>
        To address current limitations we need to focus on best
practices utilizing linked data
        <xref ref-type="bibr" rid="ref7">(Heath and Bizer 2011)</xref>
        and on expanding vector transmission geo-properties:
      </p>
      <list list-type="order">
        <list-item>
          <p>Authenticate medical coding. Confirm the entity is
correctly aligned to outside sources.</p>
        </list-item>
        <list-item>
          <p>Add Gazetteer to provide formal geographic
naming entities while also mapping a list of local
colloquialisms for geographic regions.</p>
        </list-item>
        <list-item>
          <p>Add VectorBase
          <xref ref-type="bibr" rid="ref11">(National Institute of Allergy and
Infectious Diseases; National Institutes of Health;
Department of Health and Human Services 2014)</xref>.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-6">
      <title>CONCLUSIONS</title>
      <p>The broader implication of aligning datasets under a
common vocabulary, and making them available using
Linked Open Data best practices, is to standardize and
expand the original research objectives. When we
augment the unique vocabulary and ontology mappings of
individual research programs with the broader
Schema.org vocabulary, we create data interlinks that make it
possible to pose new questions that bridge the earlier
work, without having to replicate that research with a
broader focus. This combination of separate datasets, with
common data points aligned to nonexclusive properties
and ontology rules, simplifies queries and creates a new
superset built for application development and public
discovery.</p>
    </sec>
    <sec id="sec-7">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work was funded by a contract with the Defense
Threat Reduction Agency (DTRA), Joint Science and
Technology Office for Chemical and Biological Defense
under project number CB10082. Pacific Northwest
National Laboratory is operated for the U.S. Department of
Energy by Battelle under Contract DE-AC05-76RL01830.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <collab>Allen Institute for Brain Science</collab>.
          <source>Allen Human Brain Atlas</source>.
          <year>2014</year>. http://human.brain-map.org/ (accessed 2014).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><surname>Ashburner</surname>, <given-names>Michael</given-names></string-name>.
          <source>BioPortal</source>.
          <year>2014</year>. http://bioportal.bioontology.org/ontologies/GAZ.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <collab>The UniProt Consortium</collab>.
          "<article-title>UniProt: a hub for protein information</article-title>."
          <source>Oxford Journals</source>
          <volume>43</volume>, no. <issue>D1</issue>
          (<year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <collab>DisGeNet</collab>. 10
          <year>2014</year>. http://www.disgenet.org/web/DisGeNET/v2.1/dbinfo.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name><surname>Goh</surname>, <given-names>Kwang-Il</given-names></string-name>,
          <string-name><given-names>Michael</given-names> <surname>Cusick</surname></string-name>,
          <string-name><given-names>David</given-names> <surname>Valle</surname></string-name>,
          <string-name><given-names>Barton</given-names> <surname>Childs</surname></string-name>,
          <string-name><given-names>Marc</given-names> <surname>Vidal</surname></string-name>,
          and <string-name><given-names>Albert-László</given-names> <surname>Barabási</surname></string-name>.
          "<article-title>The Human Disease Network</article-title>."
          <source>Proc Natl Acad Sci USA</source>, 4
          <year>2007</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <collab>Google Inc; Microsoft Inc; Yahoo Inc</collab>.
          <year>2014</year>. http://schema.org/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name><surname>Heath</surname>, <given-names>Tom</given-names></string-name>,
          and <string-name><given-names>Christian</given-names> <surname>Bizer</surname></string-name>.
          <source>Linked Data: Evolving the Web into a Global Data Space</source>. 1.
          Berlin: <publisher-name>Morgan &amp; Claypool</publisher-name>,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name><surname>Kanehisa</surname>, <given-names>M</given-names></string-name>,
          <string-name><given-names>S</given-names> <surname>Goto</surname></string-name>,
          <string-name><given-names>Y</given-names> <surname>Sato</surname></string-name>,
          <string-name><given-names>M</given-names> <surname>Kawashima</surname></string-name>,
          <string-name><given-names>M</given-names> <surname>Furumichi</surname></string-name>,
          and <string-name><given-names>M</given-names> <surname>Tanabe</surname></string-name>.
          "<article-title>Data, information, knowledge and principle: back to metabolism in KEGG</article-title>."
          <source>Nucleic Acids Res</source>, Jan
          <year>2014</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name><surname>Kuhn</surname>, <given-names>M</given-names></string-name>,
          <string-name><given-names>M</given-names> <surname>Campillos</surname></string-name>,
          <string-name><given-names>I</given-names> <surname>Letunic</surname></string-name>,
          <string-name><given-names>LJ</given-names> <surname>Jensen</surname></string-name>,
          and <string-name><given-names>P</given-names> <surname>Bork</surname></string-name>.
          "<article-title>A side effect resource to capture phenotypic effects of drugs</article-title>."
          <source>Epub (NCBI)</source>, 1
          <year>2010</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name><surname>Law</surname>, <given-names>V</given-names></string-name>, et al.
          "<article-title>DrugBank 4.0: Shedding new light on drug metabolism</article-title>."
          <source>PubMed</source>, no. 24203711
          (<year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <collab>National Institute of Allergy and Infectious Diseases; National Institutes of Health; Department of Health and Human Services</collab>.
          <year>2014</year>. https://www.vectorbase.org.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <collab>Stanford University</collab>.
          <year>2014</year>. https://www.pharmgkb.org/.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <collab>United States National Library of Medicine</collab>. 10 1,
          <year>2014</year>. http://dailymed.nlm.nih.gov/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <collab>W3C OWL Working Group</collab>.
          <year>2012</year>. http://www.w3.org/TR/2012/REC-owl2-overview-20121211/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <collab>W3C RDF Working Group</collab>.
          <year>2004</year>. http://www.w3.org/TR/2004/REC-rdf-mt-20040210/.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <collab>W3C SPARQL Working Group</collab>.
          "<article-title>SPARQL 1.1 Query Language</article-title>."
          <source>W3C Recommendation</source>. March
          <year>2013</year>. http://www.w3.org/TR/sparql11-query/ (accessed October 2014).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <collab>Wikimedia Foundation</collab>.
          <year>2014</year>. http://www.wikidata.org
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>