<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Med2RDF: Semantic Biomedical Knowledge-base and APIs for the Clinical Genome Medicine</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mayumi Kamada</string-name>
          <email>mkamada@kuhp.kyoto-u.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Toshiaki Katayama</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shuichi Kawashima</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ryosuke Kojima</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Masahiko Nakatsui</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yasushi Okuno</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Database Center for Life Science</institution>
          ,
          <addr-line>178-4-4 Wakashiba, Kashiwa-shi, 277-0871 Chiba</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kyoto University</institution>
          ,
          <addr-line>54 Shogoin, Sakyo-ku, 606-8397, Kyoto</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Clinical interpretation of genomic variants requires aggregating knowledge from public databases and the literature. To construct an integrated knowledge-base for interpretation, we have developed RDF versions of major biomedical databases in the Med2RDF project. This resource uses the med2rdf ontology, which we developed to cover the core concepts common to the supported databases: genomes, genes, transcripts, variations, diseases, and evidence. We currently provide converters for 19 public databases that are required to interpret disease relevance. We have stored most of the resulting RDF data in our SPARQL endpoint and are currently developing APIs that use the RDF data to accelerate application development for genomic medicine.</p>
      </abstract>
      <kwd-group>
        <kwd>Med2RDF</kwd>
        <kwd>Database integration</kwd>
        <kwd>APIs</kwd>
        <kwd>clinical genome medicine</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Genomic medicine aims to select appropriate medical treatment based
on an individual's genetic background. However, for many of the genomic variants identified by
genome sequence analysis, the relationship to disease mechanisms is unclear, so they often do
not lead to a clinical decision. These variants are called variants of uncertain
significance (VUS), and their interpretation is a bottleneck in genomic
medicine. Clarifying the disease relevance of a VUS requires, in addition to specialized knowledge of
each disease domain, comprehensive interpretation of the enormous amount of
information in the literature and public databases. In the Med2RDF project, we
have therefore worked to integrate the knowledge required for clinical interpretation using the
Resource Description Framework (RDF).</p>
      <p>To date, major life science databases have been developed and provided as RDF data
thanks to community efforts [1]. Med2RDF adds biomedical
databases to this collaboration. We provide converters for MedGen, HGNC, ClinVar,
dbSNP, dbVar, ExAC, gnomAD, dbNSFP, dbscSNV, HiNT, INstruct, ICGC, TCGA,
CIViC, COSMIC, CCLE, GDSC, OpenTG-Gates and DGIdb at the Med2RDF GitHub
repository<sup>1</sup> and have stored the resulting RDF datasets at our SPARQL endpoint<sup>2</sup>.</p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
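      <p>As a rough illustration of what a database-to-RDF converter does, the following minimal sketch turns one tabular record into N-Triples lines. It is not the Med2RDF converter code: the namespaces, the field names, the example record, and the function names <monospace>escape_literal</monospace> and <monospace>record_to_ntriples</monospace> are all assumptions made for this example.</p>
      <preformat><![CDATA[
```python
"""Sketch: turn one tabular record into N-Triples lines (hypothetical vocabulary)."""

# Hypothetical namespaces, chosen only for illustration; they are NOT the
# URIs used by the actual Med2RDF converters.
BASE = "http://example.org/variant/"
VOCAB = "http://example.org/vocab#"


def escape_literal(value: str) -> str:
    """Escape backslash, double quote, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record: dict) -> list:
    """Emit one triple per field of the record, using the 'id' field as subject."""
    subject = "<" + BASE + record["id"] + ">"
    triples = []
    for key, value in record.items():
        if key == "id":
            continue  # the identifier becomes the subject, not a property
        predicate = "<" + VOCAB + key + ">"
        obj = '"' + escape_literal(str(value)) + '"'
        triples.append(subject + " " + predicate + " " + obj + " .")
    return triples


if __name__ == "__main__":
    # Made-up record with placeholder values, not taken from any database.
    example = {"id": "v1", "gene": "GENE1", "significance": "uncertain"}
    for line in record_to_ntriples(example):
        print(line)
```
]]></preformat>
      <p>A real converter would map each source field to a term of a shared ontology rather than to an ad hoc predicate, and would emit typed literals and links between resources; the sketch only shows the record-to-triples step.</p>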
    </sec>
    <sec id="sec-2">
      <title>Med2RDF ontology and API development</title>
      <p>Along with the development of RDF data, we have developed the med2rdf ontology,
which covers the core concepts common to the supported databases: genomes, genes,
transcripts, variations, diseases, and evidence (Fig 1).</p>
      <p>Fig 1. A schematic representation of the Med2RDF ontology commonly used in
the Med2RDF datasets to improve interoperability.</p>
      <p>This ontology enables us to integrate heterogeneous datasets by improving their
interoperability, so that any combination of the data can be used in a standardized manner.
Moreover, we are currently developing APIs that encapsulate SPARQL queries with
the help of SPARQList<sup>3</sup>, with which users can easily develop applications for clinical
genome medicine. These APIs will also help researchers apply machine learning
methods to Med2RDF data for the clinical interpretation of VUS obtained from clinical
sequencing.</p>
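      <p>The idea of encapsulating a SPARQL query behind an API can be sketched as follows. This is a plain Python illustration of the pattern, not SPARQList itself and not the Med2RDF API: the endpoint URL, the vocabulary in the query, and the function names <monospace>build_request</monospace>, <monospace>parse_bindings</monospace> and <monospace>variants_for_gene</monospace> are hypothetical. The request follows the SPARQL 1.1 Protocol (a GET request with a <monospace>query</monospace> parameter) and asks for the standard SPARQL JSON results format.</p>
      <preformat><![CDATA[
```python
"""Sketch: wrap a parameterized SPARQL query behind a plain function."""
import json
import re
import urllib.parse
import urllib.request

# Hypothetical endpoint and vocabulary, for illustration only.
ENDPOINT = "http://example.org/sparql"

QUERY_TEMPLATE = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?variant ?significance
WHERE {
  ?variant ex:gene "%s" ;
           ex:significance ?significance .
}
LIMIT 10
"""


def build_request(gene_symbol: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request for one gene symbol."""
    # Accept only simple symbols so that user input cannot alter the query.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", gene_symbol):
        raise ValueError("unexpected characters in gene symbol")
    query = QUERY_TEMPLATE % gene_symbol
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload: str) -> list:
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    document = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in document["results"]["bindings"]
    ]


def variants_for_gene(gene_symbol: str) -> list:
    """Call the endpoint and return simplified rows (performs network I/O)."""
    with urllib.request.urlopen(build_request(gene_symbol)) as response:
        return parse_bindings(response.read().decode("utf-8"))
```
]]></preformat>
      <p>An application developer would then call something like <monospace>variants_for_gene("GENE1")</monospace> and receive plain rows, without writing SPARQL or knowing the underlying RDF schema, which is the convenience the API layer is meant to provide.</p>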
      <p>Acknowledgements. This research is supported by the Program for an Integrated
Database of Clinical and Genomic Information from the Japan Agency for Medical Research
and Development (AMED).</p>
      <p>Reference. 1. Katayama T, Kawashima S, Micklem G, et al.: BioHackathon series in
2013 and 2014. F1000Research 2019, 8:1677.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>