<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>April</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Enabling Tailored Therapeutics with Linked Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anja Jentzsch</string-name>
          <email>mail@anjajentzsch.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bo Andersson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AstraZeneca R&amp;D Lund 221 87 Lund</institution>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Eli Lilly and Company Lilly Corporate Center Indianapolis</institution>
          ,
          <addr-line>Indiana 46285</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Freie Universität Berlin Web-based Systems Group Garystr.</institution>
          <addr-line>21 14195 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Freie Universität Berlin Web-based Systems Group Garystr.</institution>
          <addr-line>21 14195 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Toronto Database Group 10 King's College Rd</institution>
          ,
          <addr-line>Toronto</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <volume>20</volume>
      <issue>2009</issue>
      <abstract>
        <p>Advances in the biological sciences are allowing pharmaceutical companies to meet the health care crisis with drugs that are more suitable for preventive and tailored treatment, thereby holding the promise of enabling more cost-effective care with greater efficacy and reduced side effects. However, this shift in business model increases the need for companies to integrate data across drug discovery, drug development, and clinical practice. This is a fundamental shift from the approach of limiting integration activities to functional areas. The Linked Data approach holds much potential for enabling such connectivity between data silos, thereby enabling pharmaceutical companies to meet the urgent needs in society for more tailored health care. This paper examines the applicability and potential benefits of using Linked Data to connect data sources related to drugs and clinical trials, and gives an overview of ongoing work within the W3C's Semantic Web for Health Care and Life Sciences Interest Group on publishing drug-related data sets on the Web and interlinking them with existing Linked Data sources. A use case is provided that demonstrates the immediate benefit of this work in enabling data to be browsed from disease, to clinical trials, drugs, targets and companies.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Tailored Therapeutics</kwd>
        <kwd>Drugs</kwd>
        <kwd>Clinical Trials</kwd>
        <kwd>Competitive Intelligence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>The Linking Open Drug Data (LODD) task within the W3C's
Semantic Web for Health Care and Life Sciences Interest Group<sup>1</sup>
gathered a list of data sets that include information about drugs,
and then determined how the publicly available data sets could be
linked together. The review showed that this domain is promising
for Linked Data, as there are many publicly available data sets
and they frequently share identifiers for key entities. The
complete evaluation results are posted on the W3C ESW Wiki<sup>2</sup>.
Participants of the LODD task have undertaken to demonstrate
the value of Linked Data to the health care and life sciences
domain. This has been achieved by publishing and linking several
drug-related data sets on the Web, and by investigating use cases that
demonstrate how researchers in the life sciences, as well as physicians
and patients, can take advantage of the connected data sets.</p>
      <p>This paper is structured as follows: Section 2 describes the
published data sets, their linkage with other published data
sources, and the methods that were used to create the links.
Section 3 exemplifies how navigating Linked Data can be utilized
within a competitive intelligence use case. Section 4
summarizes our findings and experiences from publishing and
navigating the data sets.</p>
    </sec>
    <sec id="sec-2">
      <p><sup>1</sup> http://esw.w3.org/topic/HCLSIG/LODD</p>
      <p><sup>2</sup> http://esw.w3.org/topic/HCLSIG/LODD/Data/DataSetEvaluation</p>
      <sec id="sec-2-1">
        <title>2. LINKED DATA SETS</title>
        <p>
          In this project, data about pharmaceutical companies, drugs in
clinical trials, mechanisms of action of drugs, safety information,
and data about disease gene correlations were added to the Linked
Data cloud. This selection of data sets enabled strong connections
to existing Linked Data resources, while providing novel data of
interest to the pharmaceutical industry. The existing Linked Data
of primary interest to this work includes the many bioinformatics
and cheminformatics data sources published by Bio2RDF [6], and
the information on diseases and marketed drugs in DBpedia [
          <xref ref-type="bibr" rid="ref2">7</xref>
          ].
The linkage of the newly published data sets to each other and
relevant existing Linked Data is shown in Figure 1.
        </p>
        <p>The Linked Clinical Trials (LinkedCT) data source<sup>3</sup> is derived
from a service provided by the U.S. National Institutes of Health,
ClinicalTrials.gov, a registry of more than 60,000 clinical trials
conducted in 158 countries. Each trial is associated with a brief
description, related disorders<sup>4</sup> and interventions, eligibility
criteria, sponsors, locations (investigators), and several other
pieces of information. The data on LinkedCT is obtained by first
transforming the XML data provided by ClinicalTrials.gov to
relational data using the capabilities of a hybrid relational-XML
database management system such as IBM DB2. This
transformation requires identifying the entities and facts in
the XML data and storing them in reasonably normalized
relational tables that are appropriate for transformation into RDF.
The RDF data is then published using D2R server [8]. The RDF
version of the dataset contains 7,011,000 triples and 290,000
links.</p>
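        <p>The transformation steps described above (XML records, then normalized relational tables, then RDF statements) can be illustrated with a minimal sketch. It is not the LinkedCT implementation: Python's standard library stands in for IBM DB2 and D2R server, and the sample record, the identifier NCT00000001, the table layout, and the http://example.org/linkedct/ namespace are assumptions made purely for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified trial record; the real ClinicalTrials.gov
# schema has many more elements and is not reproduced here.
SAMPLE_XML = """
<clinical_study>
  <nct_id>NCT00000001</nct_id>
  <brief_title>Example trial</brief_title>
  <condition>Alzheimer's disease</condition>
  <intervention>Varenicline</intervention>
</clinical_study>
"""

BASE = "http://example.org/linkedct/"  # placeholder namespace (assumption)


def xml_to_rows(xml_text):
    """Identify entities and facts in the XML and return normalized rows."""
    root = ET.fromstring(xml_text)
    trial_id = root.findtext("nct_id")
    trial = (trial_id, root.findtext("brief_title"))
    conditions = [(trial_id, c.text) for c in root.findall("condition")]
    interventions = [(trial_id, i.text) for i in root.findall("intervention")]
    return trial, conditions, interventions


def load(conn, trial, conditions, interventions):
    """Store the rows in simple relational tables (illustrative layout)."""
    conn.execute("CREATE TABLE trial (id TEXT PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE trial_condition (trial_id TEXT, name TEXT)")
    conn.execute("CREATE TABLE trial_intervention (trial_id TEXT, name TEXT)")
    conn.execute("INSERT INTO trial VALUES (?, ?)", trial)
    conn.executemany("INSERT INTO trial_condition VALUES (?, ?)", conditions)
    conn.executemany("INSERT INTO trial_intervention VALUES (?, ?)", interventions)


def rows_to_triples(conn):
    """Map each relational row to one (subject, predicate, object) triple."""
    triples = []
    for trial_id, title in conn.execute("SELECT id, title FROM trial"):
        triples.append((BASE + "trial/" + trial_id, BASE + "vocab/title", title))
    for trial_id, name in conn.execute("SELECT trial_id, name FROM trial_condition"):
        triples.append((BASE + "trial/" + trial_id, BASE + "vocab/condition", name))
    for trial_id, name in conn.execute("SELECT trial_id, name FROM trial_intervention"):
        triples.append((BASE + "trial/" + trial_id, BASE + "vocab/intervention", name))
    return triples


conn = sqlite3.connect(":memory:")
load(conn, *xml_to_rows(SAMPLE_XML))
triples = rows_to_triples(conn)
for s, p, o in triples:
    print(s, p, o)
```
]]></preformat>
        <p>Keeping repeated elements such as conditions and interventions in their own tables reflects the reasonably normalized layout mentioned above, so that each row corresponds to a single statement when the data is exposed as RDF.</p>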
        <p>
          DrugBank [
          <xref ref-type="bibr" rid="ref3">9</xref>
          ] is a large repository of almost 5,000 FDA-approved
small molecule and biotech drugs. It contains detailed information
about drugs, including chemical, pharmacological, and
pharmaceutical data, along with comprehensive drug target data
such as sequence, structure, and pathway information. The data
was originally published as DrugBank DrugCards<sup>5</sup> and was
republished as Linked Data using D2R server. The Linked Data
version of DrugBank contains 1,153,000 triples and 60,300 links<sup>6</sup>.
        </p>
        <p>
          Diseasome [
          <xref ref-type="bibr" rid="ref4">10</xref>
          ] contains information about 4,300 disorders and
disease genes linked by known disorder–gene associations, for
exploring known phenotype and disease gene associations and
indicating the common genetic origin of many diseases. The list
of disorders, disease genes, and associations between them was
obtained from the Online Mendelian Inheritance in Man
(OMIM)<sup>7</sup>, a compilation of human disease genes and phenotypes.
The data set is published by Diseasome in a flat file
representation. The flat files were read into a relational database
and made accessible as Linked Data using D2R server. The
Linked Data version of Diseasome contains 88,000 triples and
23,000 links<sup>8</sup>.
        </p>
        <p><sup>3</sup> http://linkedct.org</p>
        <p><sup>4</sup> Disorder is used as a synonym for disease and indication, http://en.wikipedia.org/wiki/Disease#Disorder</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <p><sup>5</sup> http://www.drugbank.ca/fields</p>
    </sec>
    <sec id="sec-4">
      <p><sup>6</sup> http://www4.wiwiss.fu-berlin.de/drugbank/</p>
      <p>DailyMed<sup>9</sup> is published by the National Library of Medicine, and
provides high quality information about marketed drugs.
DailyMed provides much information, including general
background on the chemical structure of the compound and its
mechanism of action, details on the clinical pharmacology of the
compound, indication (disorder) and usage, contraindications,
warnings, precautions, adverse reactions, overdosage, and patient
counseling. The data was originally published in Structured
Product Labeling<sup>10</sup>, an XML-based standard for exchanging
medication information that has been recently introduced by the
Food and Drug Administration in the United States. It was
republished as Linked Data using D2R server. The Linked Data version of
DailyMed contains 124,000 triples and 29,600 links<sup>11</sup>.
</p>
    </sec>
    <sec id="sec-5">
      <p><sup>7</sup> www.ncbi.nlm.nih.gov/omim</p>
    </sec>
    <sec id="sec-6">
      <p><sup>8</sup> http://www4.wiwiss.fu-berlin.de/diseasome/</p>
    </sec>
    <sec id="sec-7">
      <p><sup>9</sup> http://dailymed.nlm.nih.gov/</p>
      <p><sup>10</sup> http://www.fda.gov/oc/datacouncil/SPL.html</p>
      <p><sup>11</sup> http://www4.wiwiss.fu-berlin.de/dailymed/</p>
      <p>
        There are many commonly used identifiers in the life sciences
that can be utilized for making links between data sets explicit.
Links that were generated based on shared identifiers include the
connections from LinkedCT to Bio2RDF's PubMed, and from
DrugBank to DBpedia. The connections between bioinformatics
and cheminformatics data sources are already provided by
Bio2RDF, allowing us to interlink our drug-related data sets to
their work. In cases where no shared identifiers exist, string and
semantic matching techniques were applied for link discovery
[
        <xref ref-type="bibr" rid="ref5">11</xref>
        ]. Approximate string matching was employed to interlink
LinkedCT and Diseasome, where for instance "Alzheimer's
disease" in LinkedCT was matched with "Alzheimer_disease" in
Diseasome. Semantic matching is especially useful in matching
clinical terms, as many drugs and diseases have multiple names.
Drugs tend to have generic names and brand names; for example,
"Varenicline" has the synonym "Varenicline Tartrate" and the
brand names "Champix" and "Chantix".
      </p>
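      <p>The approximate string matching step can be sketched as follows. This is a minimal illustration only, not the link discovery framework of [11]: the similarity measure (Python's difflib.SequenceMatcher), the normalization rules, the 0.9 threshold, and the helper names normalize, similarity, and find_links are all assumptions chosen for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from difflib import SequenceMatcher


def normalize(label):
    """Lower-case a label and reduce common punctuation variants."""
    cleaned = label.lower().replace("_", " ").replace("'s", "")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_links(source_labels, target_labels, threshold=0.9):
    """Pair each source label with its best target at or above the threshold."""
    links = []
    for src in source_labels:
        best = max(target_labels, key=lambda tgt: similarity(src, tgt))
        if similarity(src, best) >= threshold:
            links.append((src, best))
    return links


# Illustrative labels; only the Alzheimer pair is taken from the text.
linkedct = ["Alzheimer's disease", "Nicotine dependence"]
diseasome = ["Alzheimer_disease", "Asthma"]
print(find_links(linkedct, diseasome))
```
]]></preformat>
      <p>A purely string-based measure of this kind would not connect "Varenicline" with "Chantix"; such cases require a synonym source, which is why semantic matching is used alongside string matching.</p>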
      <sec id="sec-7-1">
      </sec>
      <sec id="sec-7-2">
        <title>3. COMPETITIVE INTELLIGENCE CASE STUDY</title>
        <p>A use case has been developed that demonstrates the value of
Linked Data about drugs to the pharmaceutical industry.
Departments within pharmaceutical companies have typically
decided independently which data sets need to be brought into
their organization for integration and interrogation. Access to the
data is provided to employees based upon their roles. The use
case describes the value that can be gained by allowing
employees to gain access to a more diverse and linked body of
data. This approach enables novel questions to be
explored. The following use case describes a scenario in
competitive intelligence.</p>
        <p>A neuroscience focused business manager is interested in seeing
an update on new clinical trials that competitors are starting in
Alzheimer’s Disease (AD). These updates influence future sales
forecasts across geographies, and impact portfolio decisions as
new drugs need to demonstrate improved safety and efficacy
compared to the existing pharmacopeia.</p>
        <p>Using a Semantic Web browser of choice, for instance
Tabulator<sup>12</sup> or the Marbles data browser<sup>13</sup>, the manager is able to
see all drugs in trials for AD in LinkedCT, including a new phase
III trial planned by Pfizer for a drug called Varenicline. The
business manager can see that more information is available about
the drug, which is unusual because not much data is typically
available for drugs that are under investigation. Following the
data link, the manager sees data from DailyMed that shows that
the drug is already on the market for nicotine addiction.
As side effects are better understood for drugs that are already on
the market, they tend to be more successful in trials. Out of
curiosity, the manager scrolls down the page to see that side
effects are listed as constipation, sleeping problems, vomiting,
nausea, and gas, and that the typical dose is 1 mg twice daily. The
dose stated on LinkedCT for the trial was no higher than that, so it
is unlikely that this drug will have new safety problems.</p>
        <p><sup>12</sup> http://www.w3.org/2005/ajar/tab</p>
        <p><sup>13</sup> http://beckr.org/marbles</p>
        <p>Link Type: owl:sameAs, owl:sameAs, rdfs:seeAlso, owl:sameAs, owl:sameAs, owl:sameAs, foaf:page, drugbank:possibleDiseaseTarget, drugbank:brandedDrug, owl:sameAs, drugbank:pfamDomainFunction, drugbank:enzymeSwissprotId, drugbank:iupacId, drugbank:pdbId</p>
        <p>Count: 27,685; 12,127; 8,848; 444; 301; 42,219; 61,920; 8,201; 1,593; 1,522; 19,028; 4,660; 4,592</p>
        <p>
Given the promising safety profile, the manager is curious to
discover why a nicotine addiction drug might work for AD.
Linking to DrugBank highlights to the manager that Varenicline
is an alpha-4 beta-2 neuronal nicotinic acetylcholine receptor
agonist. However, Diseasome indicates that the corresponding
genes are only important in nicotine addiction, rather than AD.
This suggests that there is a more complex relationship between
the diseases than just sharing a drug target. Extending the
browsing to the SWAN Knowledgebase<sup>14</sup> [
          <xref ref-type="bibr" rid="ref6">12</xref>
          ] shows that there
are hypotheses relating AD to nicotinic receptors through amyloid
beta [
          <xref ref-type="bibr" rid="ref7">13</xref>
          ].
        </p>
        <p>Using the Linked Data approach, a business manager was able to
browse data relating to companies, clinical trials, drugs, diseases
and genetic variation. More specifically, the manager was able to
determine when extra data was available, gain access to data
without needing to map different identifiers and synonyms, and
gain additional insights as to interesting questions to ask.</p>
      </sec>
      <sec id="sec-7-3">
        <title>4. OUTLOOK</title>
        <p>This paper describes the mapping of four drug related data
sources into the Linked Data cloud, and the ensuing insights that
can be gained in the area of competitive intelligence. However,
this is just the beginning, because more interesting and novel
questions can be addressed as additional data sets are
added. As a next step, it would be interesting to incorporate data
relating to epidemiology, as that could provide information
relating to geographical areas in which diseases are prevalent, and
where there is a strong need for the development of a drug that
meets the needs of a specific population. It would also be valuable
to create links to the AD hypotheses data that is in RDF within the
SWAN Knowledgebase.</p>
        <p>Pharmaceutical companies need to make decisions based upon
both internal and external data; it is therefore important that
companies begin to make internal data available in a linked
representation, both to break down the internal silos and to easily
connect with external data. Such an approach would require
organizations to understand where the linkage points occur across
internal data sets, but this is ongoing work as it is a critical
prerequisite for all data integration efforts relating to the effective
tailoring of drugs.</p>
        <p>
          Currently, when pharmaceutical companies bring copies of data
within their organizations for integration, they each need to have
experts who understand the connectivity across data sets.
However, with the Linked Data approach, this responsibility is
shifted to the data providers. This is a much more efficient
approach, as the data providers are the individuals who
understand the data best. It also means that the integration only
has to happen one time. In addition, it becomes possible for data
providers to incrementally add links to new data sets as they
become aware of their existence, rather than needing to design a
model to do everything in one go. As stated in [
          <xref ref-type="bibr" rid="ref8">14</xref>
          ], reasoning and
querying limitations can often be compensated for by integrating
additional data resources.
        </p>
        <p>As the Linked Data cloud grows, the focus in pharmaceutical
companies will shift to approaches for interpretation. One
project with potential to utilize the value from Linked Data is the
Large Knowledge Collider (LarKC), a platform for massive
distributed incomplete reasoning that aims at removing the
scalability barriers of currently existing reasoning systems for the
Semantic Web<sup>15</sup>.</p>
        <p>The Linked Data approach is very promising for the
pharmaceutical industry, and its value will increase as more data
sources become available. However, our technical work as well as
use case experiments revealed various challenges that need to be
mitigated to make this approach robust enough to be deployed
within an enterprise environment:</p>
        <p>
          1. Progress needs to be made in finding links between data
items across data sets where no commonly used identifiers
exist. Discovering such links requires using specific record
linkage [
          <xref ref-type="bibr" rid="ref9">15</xref>
          ] and duplicate detection [
          <xref ref-type="bibr" rid="ref10">16</xref>
          ] techniques
developed within the database community as well as
ontology matching [
          <xref ref-type="bibr" rid="ref11">17</xref>
          ] methods from the knowledge
representation literature. Recent work has proposed
frameworks for simplifying this task for RDF data sets [
          <xref ref-type="bibr" rid="ref12">18</xref>
          ]
and relational data [
          <xref ref-type="bibr" rid="ref5">11</xref>
          ]. In order to benefit from these
frameworks for setting links within the LODD data sets,
domain experts need to identify linkage points and specific
rules required for finding the links.
        </p>
        <p>
          2. Work needs to be undertaken to make data browsers more
robust and performant. In addition, the user interface of data
browsers needs to be improved. Life sciences data
frequently consists of long lists of entities (e.g. genes, trials,
diseases, patients) that need to be browsed, filtered, and
queried. Benefits would be gained if hybrid interfaces that
combine querying and browsing were available and able
to process the large amounts of data that are typically
relevant within this domain. For such interfaces, it could be
promising to combine live data retrieval with local caching
and in-advance crawling of relevant data sets, as is
currently done by Semantic Web search engines such as
Sindice [
          <xref ref-type="bibr" rid="ref13">19</xref>
          ] and Falcons [
          <xref ref-type="bibr" rid="ref14">20</xref>
          ].
        </p>
        <p>3. A significant challenge within the life sciences and health
care is the strong prevalence of terminology conflicts,
synonyms, and homonyms. These problems are not
addressed by simply making data sets available on the Web
using RDF as common syntax but require deeper semantic
integration. For applications that focus on discovery and data
navigation, having explicit links between data sources is
often already a huge benefit even without semantic
integration. For other applications that rely on expressive
querying or automated reasoning, deeper integration is
essential. In order to also provide for such applications and
lay the foundation for fusing data from several Linked Data
sources, it would be beneficial if more community practices
on publishing term and schema mappings were
established.</p>
        <p><sup>14</sup> http://hypothesis.alzforum.org/swan/</p>
        <p><sup>15</sup> http://www.larkc.eu/</p>
      </sec>
      <sec id="sec-7-4">
        <title>5. ACKNOWLEDGEMENTS</title>
        <p>This work was undertaken within the LODD task of the W3C's
Semantic Web for Health Care and Life Sciences Interest Group.
Significant contributions to the LODD task have also been made
by Kei Cheung, Don Doherty, Matthias Samwald, and Jun Zhao.
Anja Jentzsch and Chris Bizer received funding for this work
from Eli Lilly.</p>
      </sec>
      <sec id="sec-7-5">
        <title>6. REFERENCES</title>
        <p>[1] Healthcare 2015: Win-win or lose-lose? www.Ibm.com/healthcare/hc2015.</p>
        <p>[2] Gerhardsson de Verdier, M.: The Big Three Concept - A
Way to Tackle the Health Care Crisis? Proc. Am. Thorac. Soc. 5: 800–805, 2008.</p>
        <p>[3] Andersson B., Momtchev V.: D7a.1.1 LarKC Requirements
summary and data repository,
http://wiki.larkc.eu/LarkcProject/WP7a.</p>
        <p>[4] Sharp, M., Bodenreider, O., and Wacholder, N.: A
framework for characterizing drug information sources.
AMIA Annu. Symp. Proc. 2008 Nov 6:662-666.
http://www.ncbi.nlm.nih.gov/pubmed/18999182.</p>
        <p>[5] Goble, C., Stevens, R.: State of the Nation in Data
Integration for Bioinformatics. J. Biomed. Infor. 41:
687–693, 2008.</p>
        <p>[6] Belleau F., Nolin, M.-A., Tourigny N., Rigault, P., and
Morissette, J. Bio2RDF: Towards a mashup to build</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <article-title>bioinformatics knowledge systems</article-title>
          .
          <source>J. Biomed. Infor</source>
          .
          <volume>41</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ives</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>In proceedings of the 6th International Semantic Web Conference. Lecture Notes in Computer Science 4825 Springer, ISBN 978-3-540-76297-3</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Wishart</surname>
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knox</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shrivastava</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassanali</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stothard</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woolsey</surname>
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>DrugBank: a comprehensive resource for in silico drug discovery and exploration</article-title>
          .
          <source>Nuc. Acids Res</source>
          .
          <volume>1</volume>
          (
          <issue>34</issue>
          ):
          <fpage>D668</fpage>
          -
          <lpage>72</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Goh</surname>
            <given-names>K.-I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cusick</surname>
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valle</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Childs</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vidal</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barabási</surname>
            <given-names>A.L.</given-names>
          </string-name>
          :
          <article-title>The human disease network</article-title>
          .
          <source>Proc. Natl. Acad. Sci. USA</source>
          <volume>104</volume>
          :
          <fpage>8685</fpage>
          -
          <lpage>8690</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Hassanzadeh</surname>
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lim</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kementsietsidis</surname>
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wang</surname>
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A Declarative Framework for Semantic Link Discovery over Relational Data</article-title>
          .
          <source>Poster at the 18th World Wide Web Conference</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Gao</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kinoshita</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaborne</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cayzer</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>SWAN: A Distributed Knowledge Infrastructure for Alzheimer Disease Research</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>222</fpage>
          -
          <lpage>228</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Dineley</surname>
            ,
            <given-names>K.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westerman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bui</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bell</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashe</surname>
            <given-names>K.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sweatt</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>β-Amyloid Activates the Mitogen-Activated Protein Kinase Cascade via Hippocampal α7 Nicotinic Acetylcholine Receptors: In Vivo Mechanisms Related to Alzheimer's Disease</article-title>
          .
          <source>J. Neurosci</source>
          .
          <volume>21</volume>
          (
          <issue>12</issue>
          ):
          <fpage>4125</fpage>
          -
          <lpage>4133</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Sahoo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bodenreider</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rutter</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skinner</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>An ontology-driven semantic mashup of gene and biological pathway information: Application to the domain of nicotine dependence</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          ,
          <volume>41</volume>
          :
          <fpage>752</fpage>
          -
          <lpage>765</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Elmagarmid</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ipeirotis</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verykios</surname>
            ,
            <given-names>V.S.</given-names>
          </string-name>
          :
          <article-title>Duplicate record detection: A survey</article-title>
          .
          <source>IEEE Trans. Knowledge and Data Engineering</source>
          ,
          <volume>19</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Winkler</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Overview of Record Linkage and Current Research Directions</article-title>
          . Bureau of the Census,
          <source>Technical Report</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <source>Ontology Matching</source>
          .
          <publisher-name>Springer</publisher-name>
          ,
          <publisher-loc>Heidelberg</publisher-loc>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Volz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaedke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Silk - A Link Discovery Framework for the Web of Data</article-title>
          .
          <source>In: Linked Data on the Web workshop at WWW2009</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Tummarello</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          et al.:
          <article-title>Sindice.com: Weaving the Open Linked Data</article-title>
          .
          <source>In: 6th International Semantic Web Conference</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
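          [20]
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ge</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :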
          <article-title>Searching Semantic Web Objects Based on Class Hierarchies</article-title>
          .
          <source>In: Linked Data on the Web workshop at WWW2008</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>