<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An update on Genomic CDS, a complex ontology for pharmacogenomics and clinical decision support</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>José Antonio Miñarro-Giménez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Samwald</string-name>
          <email>matthias.samwald@meduniwien.ac.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Section for Medical Expert and Knowledge-Based Systems, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna</institution>
          ,
          <addr-line>Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Genetic data can be used to optimize drug treatment based on the genetic profiles of individual patients, thereby reducing adverse drug events and improving the efficacy of pharmacotherapy. The Genomic Clinical Decision Support (Genomic CDS) ontology utilizes Web Ontology Language 2 (OWL 2) reasoning for this task. The ontology serves a clear-cut medical use case that requires challenging OWL 2 DL reasoning. We present an update of the Genomic CDS ontology that covers a significantly larger number of clinical decision support rules and from which inconsistencies present in previous versions of the ontology have been removed.</p>
      </abstract>
      <kwd-group>
        <kwd>OWL</kwd>
        <kwd>pharmacogenomics</kwd>
        <kwd>clinical decision support</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Different patients can react drastically differently to the same type of medication. The
goal of personalized medicine and pharmacogenomics is to predict an individual
patient’s response by analyzing genetic markers that influence how medications are
metabolized or able to bind to their targets.</p>
      <p>To produce clinically valid and trustworthy predictions, no errors or ambiguities
should arise in the process of inferring a patient’s likely response from raw genetic
data. Current formalisms, data infrastructures and software applications leave many
opportunities for introducing such errors and ambiguities. Ontologies formalized with
the Web Ontology Language 2 (OWL 2) could be an excellent choice for tackling this
problem, but the complexity and potentially large scale of ontologies in this domain
also pose formidable challenges to currently available OWL 2 reasoners.</p>
    </sec>
    <sec id="sec-2">
      <title>The Genomic CDS ontology</title>
      <p>
        The Genomic Clinical Decision Support (Genomic CDS) ontology is an OWL 2
ontology aimed at representing pharmacogenomic knowledge and providing clinical
decision support based on genetic patient data. The Genomic CDS ontology has been
integrated into the Medicine Safety Code (MSC) system [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] in order to provide
pharmacogenomic decision support at the point of care. The different versions of the
Genomic CDS ontology can be downloaded from
<ext-link ext-link-type="uri" xlink:href="http://www.genomiccds.org/ont/snapshot-april-2014">http://www.genomiccds.org/ont/snapshot-april-2014</ext-link>.
The goals of developing the ontology are:
        <list list-type="bullet">
          <list-item><p>Providing a simple and concise formalism for representing pharmacogenomic knowledge</p></list-item>
          <list-item><p>Finding errors and missing definitions in pharmacogenomic knowledge bases</p></list-item>
          <list-item><p>Automatically assigning alleles and phenotypes to patients</p></list-item>
          <list-item><p>Matching patients to clinically appropriate pharmacogenomic guidelines and clinical decision support messages</p></list-item>
          <list-item><p>Detecting inconsistencies between pharmacogenomic treatment guidelines from different sources</p></list-item>
        </list>
      </p>
      <p>In the most common scenario, genetic patient data in OWL format are combined with
the axioms of the Genomic CDS ontology, and an OWL reasoner is used to infer
matching pharmacogenomic treatment recommendations. Several inference steps are
needed to derive matching treatment recommendations from raw data about genetic
markers (Fig. 1). The raw data consist of small variants in the genetic code, which in
most cases are single nucleotide polymorphisms (SNPs), such as an ‘A’ instead of a
‘G’, or a deletion or insertion of a nucleotide. Alleles are variants of a gene that are
defined by the sets of small variants they contain. Phenotypes refer to the
specific effects that certain small variants and alleles can have on the organism, e.g., how
quickly a patient metabolizes a specific drug. Clinical guidelines can use small
variants, alleles and/or phenotypes to match patients to treatment recommendations. The
Genomic CDS ontology classifies patients according to four types of inference rules
that are represented as subclasses of the human class. The class
human_with_genotype_marker represents the first inference step, which gathers the raw
genetic data and recognizes particular SNP variants. The class
human_with_genetic_polimorphism corresponds to the second inference step, in which the
recognized SNP variants are matched to the patient’s alleles. The
third inference step is associated with the class
human_triggering_phenotype_inference_rule, which represents rules that infer the patient’s
phenotype from the SNP variants and alleles obtained in the previous steps. Finally, the class
human_triggering_CDS_rule represents the rules that conceptualize the clinical
guidelines based on the combination of the inferred SNP variants, alleles and
phenotypes.</p>
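      <p>The four inference steps can be sketched procedurally in Python. This is only an illustration under strong assumptions, not the ontology's actual axioms: the gene, marker and allele names, the phenotype rules and the messages are hypothetical placeholders, the patient data are assumed to be phased (one dictionary per gene copy), and the final step keys on the phenotype alone, although CDS rules may also use variants and alleles directly.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Hypothetical toy knowledge base; none of these definitions come from the ontology.
# An allele is defined by the variants that one gene copy must carry.
ALLELE_DEFINITIONS = {
    "GENE1*1": {"rs_demo_1": "G", "rs_demo_2": "C"},
    "GENE1*2": {"rs_demo_1": "A", "rs_demo_2": "C"},
}
# Phenotype rules map a sorted allele pair to a phenotype label.
PHENOTYPE_RULES = {
    ("GENE1*1", "GENE1*1"): "normal metabolizer",
    ("GENE1*1", "GENE1*2"): "intermediate metabolizer",
    ("GENE1*2", "GENE1*2"): "poor metabolizer",
}
# CDS rules map a phenotype to a (placeholder) decision support message.
CDS_RULES = {
    "intermediate metabolizer": "placeholder message: consider dose adjustment",
    "poor metabolizer": "placeholder message: consider alternative drug",
}


def genotype_markers(copies):
    """Step 1: count how many gene copies carry each (marker, base) variant."""
    return Counter((marker, base) for copy in copies for marker, base in copy.items())


def assign_alleles(copies):
    """Step 2: match each gene copy against the allele definitions."""
    alleles = []
    for copy in copies:
        matches = [
            name
            for name, definition in ALLELE_DEFINITIONS.items()
            if all(copy.get(marker) == base for marker, base in definition.items())
        ]
        alleles.append(matches[0] if matches else "unknown")
    return tuple(sorted(alleles))


def infer_phenotype(alleles):
    """Step 3: look up the phenotype for the allele pair (None if no rule matches)."""
    return PHENOTYPE_RULES.get(alleles)


def cds_message(phenotype):
    """Step 4: look up the decision support message (None if no rule matches)."""
    return CDS_RULES.get(phenotype)


def classify(copies):
    """Run all four steps and return the intermediate and final results."""
    alleles = assign_alleles(copies)
    phenotype = infer_phenotype(alleles)
    return {
        "markers": genotype_markers(copies),
        "alleles": alleles,
        "phenotype": phenotype,
        "message": cds_message(phenotype),
    }


# Hypothetical patient: one dictionary per gene copy.
patient = [
    {"rs_demo_1": "G", "rs_demo_2": "C"},
    {"rs_demo_1": "A", "rs_demo_2": "C"},
]
result = classify(patient)
print(result["alleles"], result["phenotype"])
```
]]></preformat>
      <p>In the ontology these steps are not separate function calls: each corresponds to class definitions, and a reasoner derives all class memberships of a patient in a single classification run.</p>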
      <p>The human genome usually contains two copies of each gene (one from the father,
one from the mother), with each copy potentially bearing multiple genetic variants.
Because of this, the ontologies rely heavily on qualified cardinality restrictions with
cardinalities of two, which seems to cause performance issues with most current
OWL reasoners.</p>
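      <p>To illustrate what such restrictions express, the following Python sketch evaluates the restriction forms "has some X" and "has exactly 2 X" by simple counting, using the class names of the warfarin rule (human triggering CDS rule 7) quoted later in this section. The patients and the name rs9923231_T are hypothetical, and the counting is a closed-world approximation: an OWL reasoner works under the open-world assumption and has to establish that exactly two distinct matching individuals exist, which is a harder task than counting entries in a table.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def has_some(patient, cls):
    """'has some cls': the patient has at least one copy of class cls."""
    return patient[cls] >= 1


def has_exactly(patient, n, cls):
    """'has exactly n cls': the patient has exactly n copies of class cls."""
    return patient[cls] == n


def triggers_rule_7(patient):
    """Conditions of 'human triggering CDS rule 7' as quoted in the text."""
    return (
        has_some(patient, "CYP2C9*1")
        and has_some(patient, "CYP2C9*3")
        and has_exactly(patient, 2, "rs9923231_C")
    )


# Hypothetical patients: each Counter records how many of the two gene copies
# carry a given allele or variant. Missing keys count as zero.
patient_a = Counter({"CYP2C9*1": 1, "CYP2C9*3": 1, "rs9923231_C": 2})
patient_b = Counter({"CYP2C9*1": 1, "CYP2C9*3": 1, "rs9923231_C": 1, "rs9923231_T": 1})
patient_c = Counter({"CYP2C9*1": 2, "rs9923231_C": 2})

print(triggers_rule_7(patient_a))  # True: heterozygous *1/*3, two copies of rs9923231_C
print(triggers_rule_7(patient_b))  # False: only one copy of rs9923231_C
print(triggers_rule_7(patient_c))  # False: no CYP2C9*3 allele
```
]]></preformat>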
      <p>There are two versions of the ontology, each with a corresponding ‘demo’ version that
includes example genetic data of a patient. The full version of the ontology
(genomic-cds_rules_full.owl and genomic-cds_rules_full_demo.owl) contains the
axioms that can be used to link SNP variants, such as “rs267607275(G;G)”, and
allele variants, such as “TPMT *1/*2”, to a patient, whereas the light version of the
ontology (genomic-cds_rules.owl and genomic-cds_rules_demo.owl) only provides
axioms related to allele variants. The light version reduces the
complexity of the model without losing expressiveness, which is useful when running
reasoners with limited computing resources. Both versions of the ontology have
ALCQ expressivity and are characterized by extensive use of qualified cardinality
restrictions.</p>
      <p>
        Compared to the 2013 version of the ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the number of decision support rules
encoded in the ontology increased from 49 drug dosing recommendation rules to
298 drug dosing recommendation rules and 18 phenotype inference rules. This
increase in the number of rules demands more computational resources to obtain the
inferred model. In order to reduce the complexity of the ontology and facilitate the
reasoning process, we optimized the ontology by removing axioms from the
previous version that had no actual effect on the decision support
rules. Therefore, the number of genes and SNP variants that we currently cover is
lower than in the 2013 version of the ontology. We removed classes of genes,
such as “ABCG2” or “NAT1”, that were not reflected in any rule and, consequently,
the total number of genes represented in the ontology decreased from 72 to 38.
In addition, the number of SNP variants decreased from 822 to 674. Despite these
removals, the number of defined alleles and drugs included in the 2014 version of the
Genomic CDS ontology is larger. The statistics about the ontology are summarized in
Table 1.
      </p>
      <p>A simplified example of a rule for inferring an allele (CYP2C9 *3) based on its single
nucleotide polymorphisms, which also include a SNP insertion
(rs72558188_AGAAATGGAA), looks like this in Manchester syntax:</p>
      <p>An example of an axiom for inferring an adequate clinical decision support message
for the anticoagulant drug warfarin (based on a combination of alleles and SNPs
according to an official recommendation in the drug label):</p>
      <preformat>Class: 'human triggering CDS rule 7'
    EquivalentTo:
        (has some 'CYP2C9*1') and (has some 'CYP2C9*3')
        and (has exactly 2 rs9923231_C)
    Annotations:
        label "human triggering CDS rule 7",
        CDS_message "3-4 mg warfarin per day should be considered as a starting dose range for a patient with this genotype according to the Warfarin drug label (Bristol-Myers Squibb)."</preformat>
      <p>
From the previous version of the ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] we found that TrOWL (http://trowl.eu) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] performs
significantly better than the HermiT reasoner (http://www.hermit-reasoner.com/) and other OWL 2 DL reasoners
when classifying and realizing the Genomic CDS ontology. Consequently, in this
paper we evaluated versions 1.3 and 1.4 of the TrOWL reasoner for classifying the full
demo (MSC_classes_demo.owl) and the light demo (genomic-cds_demo.owl) of the
Genomic CDS ontology. We compared the performance of the two versions of the
reasoner on a 64-bit Windows 7 machine with 4 GB of memory and an Intel i5-2430 at
2.4 GHz. The reasoners used version 3.0 of the OWL API and JRE 6 update 29 to run each
demo file. The results of this evaluation are shown in Table 2. As expected, the
reasoners take more time to classify the full version of our demo ontology than the
simplified one. Surprisingly, the latest version of TrOWL (1.4) takes slightly longer
than the previous version (1.3) to classify the demo ontologies. Version 1.4 of
TrOWL seems to be a minor revision of version 1.3, since the
developers only highlight some bug fixes concerning ontological patterns. Our hypothesis is
that these changes in the updated reasoner have increased the complexity of the
inference process and, consequently, reduced its performance.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and outlook</title>
      <p>The Genomic CDS ontology is an example of an OWL 2 ontology for clinical
genetics and decision support. The updated version of this ontology covers an
increased number of drug treatment recommendation rules and improves the
representation of some pharmacogenetic markers. As a consequence, the total number of axioms has
increased, further increasing the demand for OWL reasoners that can handle this
type of ontology in reasonable time and with limited resource use.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>The research leading to these results has received funding from the Austrian Science
Fund (FWF): [PP 25608-N15].</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Miñarro-Giménez</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blagec</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyce</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adlassnig</surname>
            <given-names>K-P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samwald</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>An Ontology-Based, Mobile-Optimized System for Pharmacogenomic Decision Support at the Point-of-Care</article-title>
          .
          <source>PLoS ONE</source>
          .
          <volume>9</volume>
          (
          <issue>5</issue>
          ):
          <fpage>e93769</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Samwald</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>Genomic CDS: an Example of a Complex Ontology for Pharmacogenetics and Clinical Decision Support</article-title>
          . Ulm, Germany: CEUR Workshop Proceedings;
          <year>2014</year>
          . p.
          <fpage>128</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Thomas</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            <given-names>JZ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            <given-names>Y.</given-names>
          </string-name>
          <article-title>TrOWL: Tractable OWL 2 Reasoning Infrastructure</article-title>
          .
          <source>The Semantic Web: Research and Applications. 7th Extended Semantic Web Conference, ESWC 2010, Heraklion, Crete, Greece, May 30 - June 3, 2010, Proceedings, Part II</source>
          .
          <year>2010</year>
          . p.
          <fpage>431</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>