<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cell, chemical and anatomical views of the Gene Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Osumi-Sutherland</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Ponta</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Melanie Courtot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Helen Parkinson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Badi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus</institution>
          ,
          <addr-line>Hinxton, Cambridge CB10 1SD</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd</institution>
          ,
          <addr-line>Grenzacherstrasse 124, 4070 Basel</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Gene Ontology (GO) consists of around 40,000 terms referring to classes of biological process, cell component and gene product activity. It has been used to annotate the functions and locations of several million gene products. Much pharmacological research focuses on understanding how disease conditions differ from physiological conditions in molecular terms, with the aim of finding new drug targets for therapy. Gene set enrichment analysis (GSEA) using the GO and its annotations provides a powerful way to assess those differences. Roche has developed a bespoke controlled vocabulary (RCV) to support enrichment analysis. Each term is manually mapped to a list of GO terms. The groupings are tailored to the research aims of Roche and, as a result, many groupings are out-of-scope for GO classes. For example, many RCV terms group processes and cell parts according to the cell type they occur in. The manual mapping strategy is labour intensive and hard to sustain as the GO evolves. We have automated mappings between RCV and the GO via OWL-EL queries. This is made possible by extensive axiomatisation linking the GO to ontologies of cells, anatomical entities and chemicals. We can fully automate mapping for approximately one third of the terms in the RCV, with another 40% having 10 or fewer GO terms requiring manual mapping. Automated mapping uncovers many missing mappings. GSEA using the resulting, semi-automated mapping of RCV to GO detects enrichment to gene sets missed with the manual-only mapping. The OWL query approach we describe can be used as the basis of new ways to query the GO, group annotations and carry out GSEA. Importantly, it allows the classifications used in enrichment analysis to be much more closely tailored to the needs of researchers and industry than was previously possible.</p>
      </abstract>
      <kwd-group>
        <kwd>OWL-EL</kwd>
        <kwd>gene set enrichment analysis</kwd>
        <kwd>gene expression</kwd>
        <kwd>gene ontology</kwd>
        <kwd>GO</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The Gene Ontology (GO) consists of almost 40,000 terms and has been used
to annotate millions of gene products to record their subcellular location (e.g.,
lysosome), their molecular function (e.g., kinase activity) and their wider role in
cellular, developmental and physiological processes (e.g., signal transduction) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
The classification and part hierarchies in the GO are used to group genes
annotated with related terms in user-facing tools such as QuickGO [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and AmiGO [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
and to generate gene sets for gene set enrichment analysis (GSEA) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>Much pharmacological research focuses on understanding the molecular
differences between disease conditions and physiological conditions, with the aim
of finding new drug targets for therapy. Differential expression experiments
analysing disease models or pathological tissue samples are an important source
of data contributing to this understanding. GSEA using GO-derived gene
sets is an efficient way to find functionally coherent gene sets that are statistically
over- or under-represented in gene lists derived from these experiments. GSEA
results using the full GO and large numbers of genes can be difficult and slow to
interpret due to high levels of overlap between gene sets. There are several
sources of overlap: grouping via class and part hierarchies means that the gene set
derived from annotation to a class subsumes the gene sets of its subclasses and
subparts; one GO class can sit in multiple branches of the hierarchy; and a single
gene product may be annotated to terms in multiple branches.</p>
      <p>One way to reduce overlap is to use a flat list of high or intermediate level
GO terms, commonly referred to as a slim. But for this to provide useful results,
the terms in the slim need to be sufficiently descriptive to fit the experimental
use cases. Rather than use a slim of GO terms, F. Hoffmann-La Roche Ltd.
("Roche") maintains an internal controlled vocabulary (referred to hereafter as
RCV) for use in GSEA. The RCV consists of around 360 terms, each of which
is mapped to a set of terms from GO, just as a term in a GO slim maps to a set
of subclasses and subparts. It is tailored to the research interests of Roche, and
its terms were chosen with the aim of achieving gene set composition descriptive
and broad enough to allow robust and statistically significant results, though
not so broad and redundant in composition that it prevents easy interpretation
of results. Detecting enrichment to gene products involved in anatomy, organ or
cell-specific processes or components can be critical for pharmacological research,
especially when working with complex tissues where there is a need to tease apart
events occurring in specific tissue compartments or cell types. To support this,
many RCV terms group GO terms in ways that are out of scope for classes in
the GO, such as grouping processes solely by where they occur.</p>
      <p>
        To date, Roche has manually maintained mappings between RCV and GO.
Keeping this mapping up-to-date and complete has become impractical given
the evolution of the GO. Recent developments in the GO make it possible to
automate mappings between the RCV and the GO. The GO has switched its
underlying formalization to Web Ontology Language (OWL2) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and has
dramatically increased the number of logical axioms [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The chemical participants
in over 12,000 processes or functions are specified in GO via axioms referencing
chemical entities defined by Chemical Entities of Biological Interest (ChEBI) [
        <xref ref-type="bibr" rid="ref8 ref9">9,
8</xref>
        ]. Over 8000 GO classes have some direct or indirect logical link to a term from
the Cell Ontology (CL) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or the Uber anatomy ontology (Uberon) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These
record, for example, the location of cellular components (e.g., the acrosome and
its parts are present only in sperm), cell types that are the sole location of some
process ('natural killer cell degranulation' occurs only in natural killer cells),
and the products of developmental processes (bone is a product of 'bone
morphogenesis'). There are also over 2500 logical axioms recording the functions of
cellular components via links to molecular function and biological process terms.
This axiomatisation makes it possible to construct bespoke classifications of GO
classes that would be out-of-scope for named GO classes. For example, we can
use OWL queries to group processes occurring in T-cells or in the pancreas, or
processes involving nitric oxide or collagen fibers. Here we describe the
development and testing of an automated mapping between GO and RCV, making use
of OWL reasoning.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>
        As the RCV is a flat list and includes classifications that are orthogonal to the
classification schemes used by the GO, it is not amenable to mapping via ontology
alignment techniques that use ontology structure [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Given the small size of
RCV, it is viable to manually map each RCV term to an OWL class expression,
which can then be used in conjunction with an OWL reasoner to generate lists
of GO terms. The RCV does not include textual definitions to clarify
meaning, so for each RCV term we attempted to find a class expression (a
mapping query) that reflected the intended meaning of the RCV term, as judged
by the RCV term name, manual mappings and discussion with RCV developers.
      </p>
      <sec id="sec-2-1">
        <title>Query strategy</title>
        <p>
          To ensure speed and scalability, we chose to restrict mapping queries to the
EL profile of OWL2, allowing us to use ELK, a fast, scalable EL reasoner,
to run queries [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. To keep the mapping process simple, only a single
mapping class was specified for each mapping. To compensate partially for the
lack of disjunction (OR) in OWL-EL, we developed a hierarchy of high-level
object properties for use in queries. For example, we define occurs in OR
has participant as a grouping relation allowing queries for processes that occur
in a specified cell, or have that cell as a participant. Many RCV terms group
i) processes in which a specified chemical or cell participates with ii) processes
regulating those in which it participates (see table 1 for an example). To support
such groupings, we used an OWL property chain axiom [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to define a relation,
regulates o has participant, which can be used to query for processes that
regulate a process in which some specified entity is a participant. We then defined
a super-property, participant OR reg participant, for this new relation and
has participant.
        </p>
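        <p>The grouping relations described above can be illustrated with a toy emulation in Python. This is only a sketch under stated assumptions: it does not use OWL, ELK or the OWL-API, it ignores the subclass hierarchy that a reasoner would also traverse, and every process name, entity name and function name in it is hypothetical.</p>
        <preformat><![CDATA[
```python
# Toy emulation of the grouping relations; not OWL reasoning and not the ELK API.
# All process and entity names below are illustrative assumptions.

# Asserted edges: process -> participants, and process -> regulated processes.
HAS_PARTICIPANT = {
    "cannabinoid biosynthesis": {"cannabinoid"},
    "collagen fibril organization": {"collagen"},
}
REGULATES = {
    "regulation of cannabinoid biosynthesis": {"cannabinoid biosynthesis"},
}


def regulates_o_has_participant(process):
    """Entities reached by the chain: process regulates X, X has participant E."""
    found = set()
    for target in REGULATES.get(process, set()):
        found |= HAS_PARTICIPANT.get(target, set())
    return found


def participant_or_reg_participant(process):
    """Super-property: holds whenever either sub-property holds."""
    return HAS_PARTICIPANT.get(process, set()) | regulates_o_has_participant(process)


def query(entity):
    """Processes p such that p participant_OR_reg_participant some entity."""
    processes = set(HAS_PARTICIPANT) | set(REGULATES)
    return sorted(p for p in processes if entity in participant_or_reg_participant(p))


print(query("cannabinoid"))
```
]]></preformat>
        <p>In this sketch a single query over the super-property returns both the process itself and the process regulating it, which is the effect the property hierarchy is meant to achieve without disjunction.</p>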
        <p>These new, high-level object properties are difficult to name in a way that
communicates the meanings of mapping queries clearly. To compensate
for this, we used scripting to generate human-readable descriptions for each
mapping query. Compare, for example, the mapping query for the RCV term
cannabinoid with its description:</p>
      </sec>
      <sec id="sec-2-2">
        <title>Mapping query: participant OR reg participant some cannabinoid</title>
        <p>
          Description: "A process in which a cannabinoid participates, or that
regulates a process in which a cannabinoid participates."
Mapping queries were run using the ELK OWL reasoner [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] via calls to the
OWL-API [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The query and results processing pipeline was written in Jython [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
All code, mapping tables and results were maintained in a GitHub repository [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
The mapping was specified using a single tab-separated values (TSV) file in which
each line maps an RCV term to an OWL-EL mapping query that includes a term
from GO, ChEBI, CL, Uberon or NCBI taxonomy [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Query results were used
to generate a TSV file, allowing direct comparison of manual and automated
mappings (see table 1 for an example). We used the GitHub API to generate
tickets for each mapping, linked to the relevant TSV results file, which GitHub
renders as a table. This allowed easy manual review and editing by RCV
curators at Roche, who used the linked tickets to discuss mapping issues and record
the approval status of all mappings.
        </p>
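        <p>The description step and the comparison of manual and automated mappings described above can be sketched as follows. This is a minimal illustration, not the Jython pipeline from the project repository: the template table, the three-column TSV layout, the function names and the placeholder term IDs are all assumptions made for the example.</p>
        <preformat><![CDATA[
```python
import csv
import io

# Hypothetical templates keyed by relation name; the first mirrors the
# cannabinoid example in the text, but the table itself is an assumption.
TEMPLATES = {
    "participant_OR_reg_participant": (
        "A process in which a {x} participates, or that regulates "
        "a process in which a {x} participates."
    ),
    "occurs_in_OR_has_participant": (
        "A process that occurs in a {x}, or that has a {x} as a participant."
    ),
}


def describe(relation, filler):
    """Render a human-readable description for a 'relation some filler' query."""
    return TEMPLATES[relation].format(x=filler)


def compare(manual, automated):
    """Split term IDs into shared, manual-only and automated-only lists."""
    manual, automated = set(manual), set(automated)
    return {
        "both": sorted(manual & automated),
        "manual_only": sorted(manual - automated),
        "auto_only": sorted(automated - manual),
    }


# Hypothetical mapping table: RCV term, relation, filler label.
MAPPING_TSV = "cannabinoid\tparticipant_OR_reg_participant\tcannabinoid\n"

rows = list(csv.reader(io.StringIO(MAPPING_TSV), delimiter="\t"))
rcv_term, relation, filler = rows[0]
print(rcv_term, "->", describe(relation, filler))

# Placeholder IDs, not real GO identifiers.
report = compare(manual=["GO:A", "GO:B"], automated=["GO:B", "GO:C"])
print(report)
```
]]></preformat>
        <p>Terms in the automated-only list correspond to candidate missing mappings, while terms in the manual-only list are the ones that would still need manual maintenance or review.</p>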
        <p>Mapping queries were selected and tested, and the results were reviewed against
manual mappings to decide which patterns were most appropriate. Once a mapping
query was chosen, corrections and/or additions to the GO were made where
results were wrong or incomplete. At this point, any clear errors in the
manual mapping were blacklisted. Review of automated mappings was then passed
to Roche, who approved or blacklisted individual classes (see table 1 for an
example). When Roche was satisfied with the results, the corresponding GitHub ticket was
closed, thereby marking the mapping as approved. Results approved by Roche
were combined to produce a new RCV mapping table<sup>1</sup>.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Gene set enrichment analyses</title>
        <p>
          GSEA was performed using an open dataset comparing gene expression in adult
liver and embryonic cells of mice [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Genes were ranked according to how
much more highly they were expressed in liver vs embryonic cells and vice versa.
GSEA enrichment scores were computed using GSEA software from the Broad
Institute [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] with an up-to-date set of GO annotations to mouse genes<sup>2</sup>. The
results were analysed using the Enrichment Map plugin for Cytoscape [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], which
provides a graphical representation of enrichment results.
        </p>
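        <p>For readers unfamiliar with how a GSEA enrichment score behaves on a ranked gene list, the following is a simplified sketch of an unweighted running-sum statistic. It is not the Broad Institute software used in this work, it omits the expression-weighted step sizes and the permutation-based significance testing that GSEA applies, and the function name and gene labels are hypothetical.</p>
        <preformat><![CDATA[
```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: the largest deviation from zero.

    Walk down the ranked list, stepping up for genes in the set and
    down for genes outside it, and keep the most extreme value reached.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking: g1 is most liver-biased, g6 most embryo-biased.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
liver_like_set = {"g1", "g2"}
embryo_like_set = {"g5", "g6"}

print(enrichment_score(ranked, liver_like_set))
print(enrichment_score(ranked, embryo_like_set))
```
]]></preformat>
        <p>A gene set concentrated at the top of the ranking gets a positive score and one concentrated at the bottom gets a negative score, which corresponds to enrichment in one sample or the other.</p>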
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <sec id="sec-3-1">
        <title>Mapping results</title>
        <p>We developed mapping queries for 308/364 RCV terms. Over a third (104) of
the mapping queries were sufficient, meaning that no manual maintenance is
required. A further 40% of the mappings (148) had 10 or fewer additional manual
mappings (figure 1A) and most of these (114) had fewer than 5.</p>
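        <p>The proportions quoted above can be checked from the reported counts. The short calculation below uses only numbers given in the text; the variable names are ours. It suggests that "over a third" is relative to the 308 mapped terms, that "a further 40%" is closer to a share of all 364 RCV terms, and that the same 148 terms give the 48% of mapped terms quoted in the Discussion.</p>
        <preformat><![CDATA[
```python
total_terms = 364      # all RCV terms
mapped = 308           # terms with a mapping query
fully_automated = 104  # queries needing no manual maintenance
few_manual = 148       # queries with 10 or fewer additional manual mappings

unmapped = total_terms - mapped
share_auto_of_mapped = round(100 * fully_automated / mapped, 1)
share_few_of_mapped = round(100 * few_manual / mapped, 1)
share_few_of_total = round(100 * few_manual / total_terms, 1)

print(unmapped)              # terms without a mapping query
print(share_auto_of_mapped)  # percent of mapped terms fully automated
print(share_few_of_mapped)   # percent of mapped terms with few manual mappings
print(share_few_of_total)    # the same count as a percent of all terms
```
]]></preformat>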
        <p>Mapping queries identified many GO terms that were not in the manual
mapping (figure 1B). In some cases (e.g., leukocyte activation), over 1000 new
mappings were found. Only 8 automated mappings had blacklisted terms,
reflecting minor differences between the meaning of the mapping query and the
intended meaning of the RCV term. 56 terms were not mapped. Some were
judged to be semantically equivalent to other RCV terms. The rest were rejected
as currently not mappable due to the lack of suitable terms or axiomatisation
within the GO. For example, RCV has terms for aerobic and anaerobic metabolic
processes, but GO currently has no formal way to group these and no
sustainable mechanisms for grouping them manually. Further formalisation of the GO
is likely to increase the number of RCV terms that can be mapped.</p>
        <p><sup>1</sup> Available from: https://github.com/GO-ROCHE-COLLAB/Roche_CV_mapping/blob/master/mapping_tables/results/combined_results.tsv</p>
        <p><sup>2</sup> Available from: http://geneontology.org/page/download-annotations</p>
      </sec>
      <sec id="sec-3-2">
        <title>Testing the implication of automated mapping for gene set enrichment analyses</title>
        <p>We tested the revised, semi-automated RCV mapping by performing GSEA
comparing the transcriptome of adult mouse liver against embryonic mouse cells
using a standard GO slim (Figure 2A), the original, manually mapped RCV
(RCV man; Figure 2B) and the new partially automated RCV GO mapping
(RCV auto; Figure 2C). Profiles of gene enrichment in liver compared to
undifferentiated cells are potentially useful benchmarks for the development and
testing of stem cell derived liver in vitro systems increasingly employed in
toxicology testing. Results were loosely grouped by experts at Roche to provide a
preliminary, biologically plausible interpretation. All approaches detected
enrichment to gene sets involved in cell division and gene expression in the embryonic
sample, but there were dramatic differences in detection of enrichment in the
liver sample.</p>
        <p>GSEA with the standard GO slim detects enrichment in the liver to a few sets
of genes involved in metabolic processes that are known to be up-regulated in the
liver. GSEA with RCV man detects enrichment to many more metabolic
processes that are specific to or up-regulated in the liver, and to a much finer
level of detail. It also detects enrichment of genes involved in immune cell related
processes, consistent with detection of the resident immune system in the liver.
The results of enrichment with RCV auto are similar, but provide much more
detail. For example, GSEA with RCV auto detects enrichment to gene sets
involved in detoxification (important for toxicology use cases) and a wider range
of immune cell processes. There is also increased overlap between enriched gene
sets compared to RCV man, but at a level that is potentially informative (see
edges between nodes in Figure 2C). For example, there is overlap between sets of genes
involved in both chemotaxis and processes involving types of immune cell that
are known to be capable of chemotaxis.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Improvements to the GO</title>
        <p>While GO has extensive axiomatisation linking processes to cells, anatomical
structures and chemicals, this is not always complete. In mapping from the
RCV to GO we found and corrected over 200 omissions in the axiomatisation.
This included missing links from processes to participant cell types, anatomical
structures, chemicals, cell components and transcript types. We also found and
corrected a number of errors, including errors in axiomatisation of developmental
processes that led to incorrect inferences for RCV anatomy terms.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and future directions</title>
      <p>
        This work demonstrates how the logical structure of the GO can be used to
achieve biologically meaningful mappings between GO terms and terms from
external controlled vocabularies defined with reference to types of cells,
chemicals or anatomical structures. Mappings are straightforward to specify and the
reasoning system used is fast and scalable [
        <xref ref-type="bibr" rid="ref13 ref17">13, 17</xref>
        ]. All mappings that are fully
automated can be automatically updated as the GO changes, simply by running
the mapping pipeline. Errors found during manual review were sufficiently rare
that this step will not be used in future updates.
      </p>
      <sec id="sec-4-1">
        <title>Improving the RCV mapping to GO</title>
        <p>48% of mapped RCV terms have 10 or fewer manual mappings. We are
reviewing all of these cases to decide whether to drop manual mappings or whether
complete automation might be achieved by a different query strategy. In some
cases, a more complete mapping could be achieved by combining the results of
multiple mapping queries. For example, RCV terms for chemical metabolism are
all manually mapped to GO terms for both metabolism and transport. A more
complete mapping could be achieved by combining the results of separate OWL
mapping queries for GO transport and GO metabolic process<sup>3</sup>.</p>
        <p>Fig. 2. GSEA comparing expression between embryo and adult liver using gene sets
derived from the generic GO slim (panel A), manually mapped RCV (panel B), and
semi-automated RCV (panel C). Red nodes indicate gene sets enriched in liver
compared to embryos. Blue nodes indicate gene sets enriched in embryos compared to liver. The
size of the node is proportional to the size of the gene set. Connecting edge thickness
is a measure of the number of enriched genes in common between two gene sets.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Alternative views of the GO and its annotations</title>
        <p>The OWL axioms used to automate RCV mapping to GO can also be used
to provide alternative views of the GO and its annotations. This is already
reflected in some of the newer functionalities of the GO browsing tool AmiGO,
which now displays inferred annotations to cell types based on axioms in GO
recording where processes occur<sup>4</sup>.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Improving mechanisms for extending RCV</title>
        <p>The system described here was designed to be lightweight and flexible, allowing
maximum interaction between the designers of RCV at Roche and GO editors
with minimal development overhead. Where new terms follow mapping query
patterns already in use, they can be added via the same mechanism.</p>
        <p>
          The system described bears some relationship to TermGenie [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] which is
already used to generate 80% of new GO terms. One possible approach to fulfilling
the needs of external groups for types of classification not included in the GO
would be to offer a TermGenie-like system to create bespoke terms.
        </p>
        <p>Funding: This work was supported by direct funding from F. Hoffmann-La
Roche Ltd and by European Molecular Biology Laboratory (EMBL) core
funding. The Gene Ontology Consortium is supported by a P41 grant from the
National Human Genome Research Institute (NHGRI) [grant 5U41HG002273-14].</p>
        <p>
<sup>3</sup> This is not formally equivalent to running OWL queries with disjunction (OR), but
we expect few, if any, differences in results given the current GO axiomatisation.
<sup>4</sup> http://amigo.geneontology.org/amigo/term/CL:0000084
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Siham</given-names>
            <surname>Amrouch</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sihem</given-names>
            <surname>Mostefai</surname>
          </string-name>
          .
          <article-title>Survey on the literature of ontology mapping, alignment and merging</article-title>
          .
          <source>In Information Technology and e-Services (ICITeS)</source>
          ,
          <source>2012 International Conference on, pages 1–5</source>
          ,
          <string-name>
            <surname>March</surname>
          </string-name>
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>David</given-names>
            <surname>Binns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Emily</given-names>
            <surname>Dimmer</surname>
          </string-name>
          , Rachael Huntley, Daniel Barrell,
          <string-name>
            <given-names>Claire</given-names>
            <surname>O'Donovan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Rolf</given-names>
            <surname>Apweiler</surname>
          </string-name>
          .
          <article-title>QuickGO: a web-based tool for Gene Ontology searching</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>25</volume>
          (
          <issue>22</issue>
          ):
          <fpage>3045</fpage>
          –
          <lpage>3046</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Seth</given-names>
            <surname>Carbon</surname>
          </string-name>
          , Amelia Ireland,
          <string-name>
            <given-names>Christopher J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          , ShengQiang Shu, Brad Marshall, Suzanna Lewis, the AmiGO Hub, and the Web Presence Working Group.
          <article-title>AmiGO: online access to ontology and annotation data</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>25</volume>
          (
          <issue>2</issue>
          ):
          <fpage>288</fpage>
          –
          <lpage>289</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <collab>The Gene Ontology Consortium</collab>
          .
          <article-title>Gene Ontology Consortium: going forward</article-title>
          .
          <source>Nucleic Acids Res</source>
          .,
          <volume>43</volume>
          (Database issue):
          <fpage>D1049</fpage>–<lpage>1056</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>David</given-names>
            <surname>Osumi-Sutherland</surname>
          </string-name>
          .
          <article-title>The GO-Roche project</article-title>
          . Available at https://github.com/GO-ROCHE-COLLAB/Roche_CV_mapping, September
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>H.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Z.</given-names>
            <surname>Berardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Foulger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lomax</surname>
          </string-name>
          , D. Osumi-Sutherland, P. Roncaglia, and
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          .
          <article-title>TermGenie - a web-application for pattern-based ontology class generation</article-title>
          .
          <source>J Biomed Semantics</source>
          ,
          <volume>5</volume>
          :
          <fpage>48</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Haendel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Balhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. B.</given-names>
            <surname>Bastian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Blake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bradford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Comte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Dahdul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Dececchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Druzinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. F.</given-names>
            <surname>Hayamizu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ibrahim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Mabee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Niknejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Robinson-Rechavi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C.</given-names>
            <surname>Sereno</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          .
          <article-title>Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon</article-title>
          .
          <source>J Biomed Semantics</source>
          ,
          <volume>5</volume>
          :
          <fpage>21</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          , P. de Matos,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dekker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ennis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Harsha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Muthukrishnan</surname>
          </string-name>
          , G. Owen,
          <string-name>
            <given-names>S.</given-names>
            <surname>Turner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Williams</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Steinbeck</surname>
          </string-name>
          .
          <article-title>The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013</article-title>
          .
          <source>Nucleic Acids Res</source>
          ,
          <volume>41</volume>
          (Database issue):
          <fpage>D456</fpage>
          –
          <lpage>463</lpage>
          ,
          <month>Jan</month>
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. David P Hill, Nico Adams, Mike Bada, Colin Batchelor, Tanya Z Berardini, Heiko Dietze, Harold J Drabkin, Marcus Ennis, Rebecca E Foulger, Midori A Harris, Janna Hastings, Namrata S Kale, Paula de Matos, Christopher Mungall, Gareth Owen, Paola Roncaglia, Christoph Steinbeck, Steve Turner, and
          <string-name>
            <given-names>Jane</given-names>
            <surname>Lomax</surname>
          </string-name>
          .
          <article-title>Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology</article-title>
          .
          <source>BMC Genomics</source>
          ,
          <volume>14</volume>
          (
          <issue>1</issue>
          ):
          <fpage>513</fpage>
          ,
          <month>January</month>
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Hitzler</surname>
          </string-name>
          , Markus Krötzsch, Bijan Parsia,
          <string-name>
            <given-names>Peter F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          , and Sebastian Rudolph, editors.
          <source>OWL 2 Web Ontology Language: Primer</source>
          . W3C Recommendation, 27 October
          <year>2009</year>
          . Available at http://www.w3.org/TR/owl2-primer/.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Horridge</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sean</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          .
          <article-title>The OWL API: A Java API for OWL ontologies</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>11</fpage>
          –
          <lpage>21</lpage>
          ,
          <month>January</month>
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <collab>Jython maintainers</collab>
          .
          <source>The Jython project</source>
          . Available at http://www.jython.org, September
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Yevgeny</given-names>
            <surname>Kazakov</surname>
          </string-name>
          , Markus Krötzsch, and František Simančík.
          <article-title>ELK reasoner: Architecture and evaluation</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          ,
          <volume>858</volume>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>R.</given-names>
            <surname>Lowe</surname>
          </string-name>
          .
          <article-title>Sexually dimorphic gene expression emerges with embryonic genome activation and is dynamic throughout development (RNA-seq)</article-title>
          . Available at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE58733
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>T. F.</given-names>
            <surname>Meehan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Masci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdulla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Cowell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Blake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Diehl</surname>
          </string-name>
          .
          <article-title>Logical development of the cell ontology</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <volume>12</volume>
          :
          <fpage>6</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>D.</given-names>
            <surname>Merico</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Isserlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Stueker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Emili</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Bader</surname>
          </string-name>
          .
          <article-title>Enrichment map: a network-based method for gene-set enrichment visualization and interpretation</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>5</volume>
          (
          <issue>11</issue>
          ):
          <fpage>e13984</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>C.</given-names>
            <surname>Mungall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dietze</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Osumi-Sutherland</surname>
          </string-name>
          .
          <article-title>Use of OWL within the Gene Ontology</article-title>
          . In C. Maria Keet and V. Tamma, editors,
          <source>Proceedings of the 11th International Workshop on OWL: Experiences and Directions (OWLED 2014)</source>
          , volume
          <volume>1265</volume>
          of
          <series>CEUR Workshop Proceedings</series>
          , pages
          <fpage>25</fpage>
          –
          <lpage>36</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <collab>National Institutes of Health National Center for Biotechnology Information (NCBI), National Library of Medicine</collab>
          .
          <source>The NCBI Entrez Taxonomy homepage</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
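          <string-name>
            <given-names>Aravind</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pablo</given-names>
            <surname>Tamayo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Vamsi K.</given-names>
            <surname>Mootha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sayan</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Benjamin L.</given-names>
            <surname>Ebert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael A.</given-names>
            <surname>Gillette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Amanda</given-names>
            <surname>Paulovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Scott L.</given-names>
            <surname>Pomeroy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Todd R.</given-names>
            <surname>Golub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Eric S.</given-names>
            <surname>Lander</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jill P.</given-names>
            <surname>Mesirov</surname>
          </string-name>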
          .
          <article-title>Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles</article-title>
          .
          <source>Proceedings of the National Academy of Sciences</source>
          ,
          <volume>102</volume>
          (
          <issue>43</issue>
          ):
          <fpage>15545</fpage>
          –
          <lpage>15550</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>