<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Literature Based Approach to Define the Scope of Biomedical Ontologies: A Case Study on a Rehabilitation Therapy Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammad K. Halawani</string-name>
          <email>M.K.H.Halawani2@newcastle.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rob Forsyth</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Phillip Lord</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Systems, Umm Al-Qura University</institution>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Neuroscience, Newcastle University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Computing Science, Newcastle University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this article, we investigate our early attempts at building an ontology describing rehabilitation therapies following brain injury. These therapies are wide-ranging, involving interventions of many different kinds. As a result, these therapies are hard to describe. As well as restricting actual practice, this is also a major impediment to evidence-based medicine as it is hard to meaningfully compare two treatment plans. Ontology development requires significant effort from both ontologists and domain experts. Knowledge elicited from domain experts forms the scope of the ontology. The process of knowledge elicitation is expensive, consumes experts' time and might have biases depending on the selection of the experts. Various methodologies and techniques exist for enabling this knowledge elicitation, including community groups and open development practices. A related problem is that of defining scope. By defining the scope, we can decide whether a concept (i.e. term) should be represented in the ontology. This is the opposite of knowledge elicitation, in the sense that it defines what should not be in the ontology. This can be addressed by pre-defining a set of competency questions. These approaches are, however, expensive and time-consuming. Here, we describe our work toward an alternative approach, bootstrapping the ontology from an initially small corpus of literature that will define the scope of the ontology, expanding this to a set covering the domain, then using information extraction to define an initial terminology to provide the basis and the competencies for the ontology. Here, we discuss four approaches to building a suitable corpus that is both sufficiently covering and precise.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        Rehabilitation therapies, unlike pharmacologic therapies, are
difficult to define precisely both qualitatively and quantitatively
        <xref ref-type="bibr" rid="ref9">(van
Heugten et al., 2012)</xref>
        and many approaches have been taken to
try to parse them. It is recognised that traditional approaches
to designation (e.g. “dressing practice”) are flawed, as two
professionals’ rehabilitation sessions both targeting difficulties in dressing
could differ in pertinent active ingredients (e.g. actions, chemicals,
devices, or forms of energy) as experienced by the patient.
Assumptions that rehabilitation content can be inferred from the targeted
impairment (e.g. “balance training”) are also flawed: no one would
consider it appropriate to consider bariatric surgery, calorie-restricted
diets and exercise programmes together as equivalent forms of
“obesity therapy”
        <xref ref-type="bibr" rid="ref13">(Whyte et al., 2014)</xref>
        . This lack of a shared terminology
makes it difficult to describe, measure and meaningfully compare
rehabilitation therapies and treatments.
      </p>
      <p>
        Building a taxonomy for rehabilitation treatments could lead to a
better shared understanding of rehabilitation interventions
        <xref ref-type="bibr" rid="ref13 ref5">(Dijkers,
2014)</xref>
        . Hence, a rehabilitation treatment ontology (RTO) of
rehabilitation terms, as the terms represent the concepts and knowledge
of the domain
        <xref ref-type="bibr" rid="ref7">(Sowa, 2000)</xref>
        , should ease the dissemination of
treatments and allow them to be communicated clearly and effectively through
a shared understanding.
      </p>
      <p>
        To enable building the RTO, we need to define both the terms
that we wish to be in the ontology and those that should not.
Some ontologies have extremely well-defined scopes, such as the
Karyotype ontology
        <xref ref-type="bibr" rid="ref11">(Warrender and Lord, 2013)</xref>
        , which is an
ontological representation of a previously defined informal specification.
Others, such as the mitochondrial disease ontology
        <xref ref-type="bibr" rid="ref10 ref12">(Warrender,
2015)</xref>
        relate to a specific area of knowledge, or like the Gene
Ontology (GO)
        <xref ref-type="bibr" rid="ref2">(Ashburner et al., 2000)</xref>
        to a broad area, but at a specific
granularity. For the RTO, unfortunately, the breadth of the area
means that we lack this clear statement of scope.
      </p>
      <p>
        Of course, there has been significant research on ontology
learning, enabling either automation or semi-automation of the ontology
construction process
        <xref ref-type="bibr" rid="ref3">(Buitelaar et al., 2005)</xref>
        . For the RTO, we aim
to use a semi-automated approach, combined with a highly
programmatic, pattern-driven ontology construction methodology that
we have pioneered previously with the mitochondrial disease
ontology
        <xref ref-type="bibr" rid="ref10 ref12">(Warrender and Lord, 2015)</xref>
        : this separates terms out into a
scaffold generated automatically, often from a pre-existing
structured source such as a database. This is followed by manual refinement
using the vocabulary provided by this scaffold.
      </p>
      <p>
        With the RTO, we plan to extend this ontology construction
methodology: first, we will build a corpus of appropriate literature that
will define the scope of the ontology; then we can use this to extract
a set of representative terms and phrases; finally, we will use these
terms and phrases as the basis for our ontological scaffold
        <xref ref-type="bibr" rid="ref10 ref12">(Warrender and Lord, 2015)</xref>
        . This should provide both coverage and scope
for our ontology, which we can then refine and build further either
manually or through the addition of further scaffolded terms,
identified during the first phase of development. We have previously used
a similar methodology to ensure good coverage and define the scope
of MITAP, a minimum information model
        <xref ref-type="bibr" rid="ref6">(Lord et al., 2016)</xref>
        .
      </p>
      <p>This leaves us with the problem of defining an appropriate
corpus of literature for the RTO. This corpus needs to cover the domain
adequately; at the same time, we would like this corpus to be
reflective of the opinions of a wider community than the experts involved in
its construction. This is a common problem with ontology
development: if the scope is too narrow, the ontology will fulfil the needs
of only a few; if it is too broad, the ontology will either get large or
only have general terms.</p>
      <p>The aim of this article is to investigate different semi-automated
methods and search strategies to retrieve a corpus with a high level
of accuracy and coverage with respect to the community’s needs for
the RTO. The accuracy and coverage of a corpus are its precision
and recall, respectively, in relation to the scope of rehabilitation. We
describe four different techniques that we have used, all based around
the use of PubMed, and describe their advantages and disadvantages.</p>
    </sec>
    <sec id="sec-2">
      <title>2 METHODS</title>
      <p>For this work, we have used PubMed exclusively to define our
corpus. As a corpus, PubMed is far from ideal. While it contains many
papers about rehabilitation, they are mostly written from an
academic perspective and may use vocabulary differently from
clinicians. A significant percentage of the papers in PubMed
have only abstracts accessible (although, under UK law, we may
be able to access full text by other means (Exceptions to copyright, 2014)).
However, it has other significant advantages: it is freely available; there
are no patient confidentiality restrictions as there would be with
medical records; finally, it has a good API and is easy to access
computationally.</p>
      <p>We use two additional features of PubMed in this paper. First,
papers are annotated with Medical Subject Headings (MeSH).
MeSH is a thesaurus organised into a hierarchy; a search with a
single term also searches the transitive closure of that term. Curators
can also define a MeSH annotation as the “major term” or MAJR.
Secondly, PubMed provides a similar articles functionality (PMSA),
based on text similarity (U.S. National Library of Medicine (NLM),
2017). Currently, this functionality only allows retrieving
MEDLINE records (i.e. PubMed citations) similar to a single user-selected
record. We discuss this limitation later.</p>
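      <p>As a minimal sketch of how the two PubMed features just described could be reached programmatically, the following builds request URLs for NCBI's E-utilities using only the Python standard library. It does not contact the network and it is not the code used for this work. The function names, the default field tag and the <monospace>retmax</monospace> default are assumptions made for illustration.</p>
      <preformat>
```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities web service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, field="MeSH Terms", retmax=20):
    """Build an ESearch URL for a PubMed query restricted to one field.

    Passing field="MeSH Major Topic" restricts the search to records in
    which the heading is flagged as a major topic (MAJR).
    """
    query = '"{}"[{}]'.format(term, field)
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return EUTILS + "/esearch.fcgi?" + urlencode(params)


def similar_articles_url(pmid):
    """Build an ELink URL requesting the scored similar articles of one PMID."""
    params = {
        "dbfrom": "pubmed",
        "db": "pubmed",
        "id": pmid,
        "linkname": "pubmed_pubmed",
        "cmd": "neighbor_score",
    }
    return EUTILS + "/elink.fcgi?" + urlencode(params)


if __name__ == "__main__":
    print(esearch_url("Rehabilitation"))
    print(esearch_url("Rehabilitation", field="MeSH Major Topic"))
    print(similar_articles_url(12345))  # 12345 is a placeholder PMID
```
</preformat>
      <p>The similar-articles request takes a single record identifier, which mirrors the single-record limitation noted above.</p>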
      <p>
        Additional search functionality described in this paper was
implemented using Python, exploiting the Entrez module of
BioPython
        <xref ref-type="bibr" rid="ref4">(Cock et al., 2009)</xref>
        .
      </p>
      <p>2.1 Forming a Corpus</p>
      <p>The simplest approach to generating a suitable corpus is a keyword
search. We tried this for the RTO, searching with the term
“rehabilitation”. This naive approach does not work well, as it misses
many papers which contain the same stem but with a different
ending (such as “rehabilitate” or “rehabilitator”). Moreover, it
retrieves many less relevant results (for example, those relating to drug
rehabilitation).</p>
      <p>Our next approach is to use MeSH or MAJR terms. PubMed’s
search engine automatically searches the transitive closure of any
MeSH term given; a search with “Rehabilitation” will therefore
also search “Physical Therapy Modalities”, as can be seen in
Figure 1.</p>
      <p>Clearly, searching for “Rehabilitation” as the MAJR term will
produce a result which is a subset of that produced by searching for the
equivalent MeSH term. In fact, the simple search approach automatically
incorporates MeSH search, as PubMed’s search engine translates
a search term to its equivalent MeSH term if one exists. For
example, the term “Physiotherapy” is translated to the “Physical Therapy
Modalities” MeSH term.</p>
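      <p>The two properties described above, that a MeSH search covers the transitive closure of a term and that a MAJR search returns a subset of the corresponding MeSH search, can be illustrated with a small sketch. The hierarchy fragment and the three annotated records below are toy data invented for illustration; they are not taken from MeSH or PubMed as used in this work, and the function names are hypothetical.</p>
      <preformat>
```python
def descendants(term, narrower):
    """Return the term together with every term reachable via narrower links."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in narrower.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def search(term, narrower, records, major_only=False):
    """Return ids of records annotated with the term or any narrower term.

    With major_only=True, only annotations flagged as major topics count.
    """
    wanted = descendants(term, narrower)
    hits = set()
    for pmid, (mesh, major) in records.items():
        annotations = major if major_only else mesh
        if annotations.intersection(wanted):
            hits.add(pmid)
    return hits


# Toy hierarchy fragment (assumption, for illustration only).
NARROWER = {
    "Rehabilitation": ["Physical Therapy Modalities"],
    "Physical Therapy Modalities": ["Exercise Therapy"],
}

# Hypothetical records: id -> (all MeSH annotations, those flagged as major).
RECORDS = {
    "pmid-1": ({"Exercise Therapy", "Stroke"}, {"Stroke"}),
    "pmid-2": ({"Physical Therapy Modalities"}, {"Physical Therapy Modalities"}),
    "pmid-3": ({"Stroke"}, {"Stroke"}),
}

MESH_HITS = search("Rehabilitation", NARROWER, RECORDS)
MAJR_HITS = search("Rehabilitation", NARROWER, RECORDS, major_only=True)
```
</preformat>
      <p>In this toy data the first record is found only by the MeSH search, because its rehabilitation-related heading is not flagged as a major topic.</p>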
      <p>The MeSH search approach also runs the risk of missing papers
which have not been annotated at all, or have been annotated with
alternative terms from MeSH.</p>
      <p>To address this latter problem, we have tried query expansion.
Here, we expand the transitive closure of the MeSH term, then add
alternative endings manually. Sub-terms, more specifically narrower
terms, of “rehabilitation” were extracted using the “MeSH SPARQL”
tool<sup>1</sup>. The following SPARQL query was used:</p>
      <preformat>PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX meshv: &lt;http://id.nlm.nih.gov/mesh/vocab#&gt;
PREFIX mesh: &lt;http://id.nlm.nih.gov/mesh/&gt;
SELECT ?label
FROM &lt;http://id.nlm.nih.gov/mesh&gt;
WHERE {
  ?term meshv:broaderDescriptor+ mesh:D012046 .
  ?term rdfs:label ?label .
}</preformat>
      <p>We collected general terms that are also used in other domains,
such as “Yoga” and “Massage”, and filtered them by inspection. These are mostly
the ones without medical words such as rehabilitation or therapy.
Synonyms of the term “rehabilitation” were defined by consultation
with a domain expert: specifically, “restoration” and “recovery”.
Multiple variations of these words were determined manually using
a dictionary. Variations of the words “therapy” and “rehabilitation”
include “therapies”, “therapist”, “rehabilitant” and “rehabilitated”;
these were injected into the query. Finally, the collected general terms
were combined into a MeSH approach query, and the rest of the terms
were combined into a query that is disjunctive between noun
phrases and their variants. For example, the term “physical therapy” was
converted to:</p>
      <preformat>Physical therapy OR Physical AND
(therapy OR therapies OR therapist OR therapists OR therapeutic OR ...
OR rehabilitation OR rehabilitate OR rehabilitator OR ...
OR restoration OR restore OR ...
OR recovery OR ...)</preformat>
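      <p>The conversion of a noun phrase into a disjunctive query, as in the “physical therapy” example above, could be automated along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the variant list is supplied by the caller (in this work it was compiled manually), and the output simply reproduces the textual pattern of the example without claiming how PubMed evaluates it.</p>
      <preformat>
```python
def expand_phrase(phrase, variants):
    """Expand the final word of a noun phrase into a disjunction of variants.

    The result follows the pattern of the worked example:
    "PHRASE OR HEAD AND (variant OR variant OR ...)".
    """
    words = phrase.split()
    head = " ".join(words[:-1])
    alternatives = " OR ".join(variants)
    if not head:
        # A single-word term has no head to keep, only the alternatives.
        return "({})".format(alternatives)
    return "{} OR {} AND ({})".format(phrase, head, alternatives)


if __name__ == "__main__":
    # Short illustrative variant list; the real list was longer.
    print(expand_phrase("Physical therapy",
                        ["therapy", "therapies", "rehabilitation", "recovery"]))
```
</preformat>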
      <p>The two queries were combined to form the expanded query. The
result of this approach subsumes the results of the two previous
approaches. Thus, this approach provides the most coverage. In
fact, we retrieved around 2.9 million MEDLINE records using the
query expansion approach. Table 1 shows the search terms for each
approach along with the number of retrieved records.</p>
      <p>The query expansion search approach provides a significant
increase in the number of records. We tested the accuracy of the approach
by random selection of papers, followed by expert analysis to
determine whether the papers were in scope. Unfortunately, the accuracy
of this approach appears fairly low, with around 5% of the papers
considered in scope.</p>
      <p><sup>1</sup> MeSH SPARQL is available at https://id.nlm.nih.gov/mesh/query</p>
      <p>[Table 1. Columns: Search Strategy; Query Search Term(s).]</p>
      <p>Finally, we have pioneered a relative similarity measure. This
builds on PubMed’s existing article similarity score, and allows us
to define similarity to a set of articles. Retrieved records are ranked
with a relativity score, which is calculated as follows:
<disp-formula><tex-math>\mathrm{relativity\ score}(a) = \frac{\#\,\text{similar articles}(a)\ \text{that are in}\ s}{\max\left(\#s,\ \#\,\text{similar articles}(a)\right)}</tex-math></disp-formula>
where s is the seed set and a is an article (i.e. a MEDLINE record).
From this equation, for a record to have a relativity score of 1.0,
its similar records must all be in the seed set and must cover every
record in the seed set. In other words, a record can only have a relativity score of 1.0
if its set of similar records is equivalent to the seed set. If it has a
similar record that is not in the seed set or if there is a record in the
seed set that is not similar to it, the relativity score will be less than
1.0. Thus, for higher scores, a record not only must be similar to
more records in the seed set, but also needs to have fewer similar
records outside the seed set.</p>
      <p>Figure 2 shows an example of this approach. There are 3 seed
MEDLINE records. The relativity score for the node
D is 1.0, as all of its similar records are in the seed set. Below are
some of the other records’ scores:
<disp-formula><tex-math>\mathrm{relativity\ score}(E) = \frac{1}{3}, \quad \mathrm{relativity\ score}(G) = \frac{2}{3}, \quad \mathrm{relativity\ score}(K) = \frac{3}{8}</tex-math></disp-formula></p>
      <p>Although K, like D, is similar to all the records in the seed set,
unlike D, its score is lower than that of G as it is also similar
to more records outside the seed set. Records with higher
scores can be considered as more relatively similar to the seed set.
A significant advantage of this approach is that the result is
continuous and can be thresholded to include more or fewer papers
as required.</p>
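      <p>The relativity score defined above can be computed directly from sets of record identifiers. The sketch below is illustrative only: the seed labels A, B and C and the neighbour sets are assumptions chosen so that the resulting scores match those quoted for D, E, G and K, since the underlying figure data are not reproduced here. Exact fractions are used so that the values can be compared without rounding.</p>
      <preformat>
```python
from fractions import Fraction


def relativity_score(similar, seed):
    """Share of overlap between a record's similar set and the seed set.

    numerator:   number of similar records that are in the seed set
    denominator: the larger of the seed set size and the similar set size
    """
    similar, seed = set(similar), set(seed)
    denominator = max(len(seed), len(similar))
    if denominator == 0:
        return Fraction(0)
    return Fraction(len(similar.intersection(seed)), denominator)


# Hypothetical seed set of three records.
SEED = {"A", "B", "C"}

# Hypothetical similar-record sets, chosen to reproduce the quoted scores.
SIMILAR = {
    "D": {"A", "B", "C"},
    "E": {"A"},
    "G": {"A", "B"},
    "K": {"A", "B", "C", "n1", "n2", "n3", "n4", "n5"},
}

SCORES = {record: relativity_score(neighbours, SEED)
          for record, neighbours in SIMILAR.items()}

# Records ordered from most to least relatively similar to the seed set.
RANKED = sorted(SCORES, key=SCORES.get, reverse=True)
```
</preformat>
      <p>With these inputs K ranks below G even though K is similar to every seed record, because five of its eight similar records lie outside the seed set.</p>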
      <p>After achieving a maximal set of citations covering the topic, a
minimal accurate set was provided by a domain expert. The expert
set of articles was provided as an EndNote library file. We converted
the articles in the library file to PMIDs. We can test the coverage of
the maximal set by checking whether it subsumes the minimal set.
In fact, all of the articles provided by the expert were included in the
maximal set.</p>
      <p>Now, we can use this approach to retrieve relatively similar
articles from the expert’s seed set, i.e. the minimal set. Retrieved
articles that are not included in the maximal set are filtered out, which excludes
similar articles that are outside the maximal set’s scope. The expert
can then set a threshold score to select the most related articles.
The articles above the threshold, or ones chosen by the expert, can
then be added to the seed set to perform the process again. This
process can be repeated iteratively with the help of the expert until
the results are satisfactory or until they converge. The choice of the
threshold might partly depend on the required number of retrieved
articles, especially in the final stages. This process is depicted in
Figure 3.</p>
      <p>In this article, we described four complementary search strategies to
retrieve an accurate and covering corpus of PubMed records for the
topic of rehabilitation. We use this approach to ensure that we have
a covering and unbiased corpus. Of the approaches tried, the simple
search and MeSH-based strategies were too restrictive, and the
expanded query too broad. To address these issues, we have developed a
new measure for paper similarity which enables us to select papers
similar to a group of papers. This approach enables us to threshold
arbitrarily and define for ourselves the “Goldilocks” zone.</p>
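      <p>The iterative seed-expansion process depicted in Figure 3 could be sketched as follows. This is a simplified illustration rather than the procedure as carried out: the expert's judgement is replaced by a fixed threshold, <monospace>neighbours_of</monospace> is a hypothetical callable standing in for a similar-articles lookup, and the round limit is an added safeguard that is not part of the description.</p>
      <preformat>
```python
def relativity_score(similar, seed):
    """Overlap with the seed set over the larger of the two set sizes."""
    similar, seed = set(similar), set(seed)
    denominator = max(len(seed), len(similar))
    if denominator == 0:
        return 0.0
    return len(similar.intersection(seed)) / denominator


def expand_seed(seed, neighbours_of, maximal_set, threshold, max_rounds=10):
    """Grow the seed set until no new record reaches the threshold.

    neighbours_of: callable mapping a record id to the set of similar ids
    maximal_set:   ids from the broad query, used as a scope filter
    """
    seed = set(seed)
    for _ in range(max_rounds):
        # Gather the similar records of every current seed record.
        candidates = set()
        for record in seed:
            candidates.update(neighbours_of(record))
        # Keep only in-scope records that are not already seeds.
        candidates = candidates.intersection(maximal_set) - seed
        accepted = {c for c in candidates
                    if relativity_score(neighbours_of(c), seed) >= threshold}
        if not accepted:
            break  # converged: nothing new passes the threshold
        seed.update(accepted)
    return seed


# Toy similarity data (assumption, for illustration only).
SIM = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D", "X"},
    "D": {"A", "B", "C"},
    "X": {"C", "Y", "Z", "W"},
}


def toy_neighbours(record):
    return SIM.get(record, set())


RESULT = expand_seed({"A", "B", "C"}, toy_neighbours,
                     {"A", "B", "C", "D", "X"}, threshold=0.5)
```
</preformat>
      <p>In the toy data, D is added in the first round and X never reaches the threshold, so the loop stops after the second round.</p>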
      <p>The key advantage of this technique is that it requires relatively
little from the domain expert, beyond a set of references to
appropriate papers, something that most researchers will have through
their normal bibliography management facilities. Operationally, this
technique is also straightforward as it works on PubMed similarity
(although it generalises to any similarity measure), and can
operate directly over PubMed’s normal search facilities. This avoids the
necessity of performing bespoke analysis over the whole of PubMed
locally.</p>
      <p>
        We can envisage
richer techniques that generalise PubMed’s current
similar articles approach. However, until and unless these are
directly supported by PubMed, they would require warehousing
PubMed locally. For the next step, we plan to use this corpus
to define a covering set of terms for the Rehabilitation Therapy
Ontology, using inverse document frequency statistics that we have
previously used to define the scope of a minimum information
model
        <xref ref-type="bibr" rid="ref6">(Lord et al., 2016)</xref>
        .
      </p>
      <p>We note that this approach is largely independent of domain.
We do not require a suitable MeSH term, or a pre-existing set of
keywords that can be used for querying. It raises the possibility of
moving the initial knowledge capture stage of ontology development
away from expert user groups and competency questions, toward an
approach which is more data-driven, embedding ontology
development in the explosion of interest in big data analytics that has
characterised the last few years.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          (
          <year>2014</year>
          ).
          <article-title>Exceptions to copyright - gov.uk</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ball</surname>
            ,
            <given-names>C. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blake</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Botstein</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butler</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherry</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>A. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolinski</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dwight</surname>
            ,
            <given-names>S. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eppig</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          , et al. (
          <year>2000</year>
          ).
          <article-title>Gene ontology: tool for the unification of biology</article-title>
          .
          <source>Nature genetics</source>
          ,
          <volume>25</volume>
          (
          <issue>1</issue>
          ),
          <fpage>25</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Magnini</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Ontology learning from text: methods, evaluation and applications</article-title>
          , volume
          <volume>123</volume>
          . IOS press.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Cock</surname>
            ,
            <given-names>P. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antao</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chapman</surname>
            ,
            <given-names>B. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cox</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dalke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedberg</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamelryck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kauff</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilczynski</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , et al. (
          <year>2009</year>
          ).
          <article-title>Biopython: freely available python tools for computational molecular biology and bioinformatics</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>25</volume>
          (
          <issue>11</issue>
          ),
          <fpage>1422</fpage>
          -
          <lpage>1423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Dijkers</surname>
            ,
            <given-names>M. P.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Rehabilitation treatment taxonomy: establishing common ground</article-title>
          .
          <source>Archives of physical medicine and rehabilitation</source>
          ,
          <volume>95</volume>
          (
          <issue>1</issue>
          ),
          <fpage>S1</fpage>
          -
          <lpage>S5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spiering</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aguillon</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>A. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Appel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benitez-Ribas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , ten
          <string-name>
            <surname>Brinke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Broere</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cools</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuturi</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          , et al. (
          <year>2016</year>
          ).
          <article-title>Minimum information about tolerogenic antigen-presenting cells (mitap): a first step towards reproducibility and standardisation of cellular therapies</article-title>
          .
          <source>PeerJ</source>
          ,
          <volume>4</volume>
          ,
          <fpage>e2300</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Sowa</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>Ontology, metadata, and semiotics</article-title>
          .
          <source>In International Conference on Conceptual Structures</source>
          , pages
          <fpage>55</fpage>
          -
          <lpage>81</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          U.S.
          <source>National Library of Medicine (NLM)</source>
          (
          <year>2017</year>
          ).
          <article-title>Pubmed tutorial - similar articles</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>van Heugten</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Wolters Gregório, G., and
          <string-name>
            <surname>Wade</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Evidence-based cognitive rehabilitation after acquired brain injury: a systematic review of content of treatment</article-title>
          .
          <source>Neuropsychological rehabilitation</source>
          ,
          <volume>22</volume>
          (
          <issue>5</issue>
          ),
          <fpage>653</fpage>
          -
          <lpage>673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Warrender</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The consistent representation of scientific knowledge: investigations into the ontology of karyotypes and mitochondria</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Warrender</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>The karyotype ontology: a computational representation for human cytogenetic patterns</article-title>
          .
          <source>arXiv preprint arXiv:1305.3758</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Warrender</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Scaffolding the mitochondrial disease ontology from extant knowledge sources</article-title>
          .
          <source>arXiv preprint arXiv:1505.04114</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Whyte</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dijkers</surname>
            ,
            <given-names>M. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanca</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Packel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferraro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Tsaousides</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Development of a theory-driven rehabilitation treatment taxonomy: conceptual issues</article-title>
          .
          <source>Archives of physical medicine and rehabilitation</source>
          ,
          <volume>95</volume>
          (
          <issue>1</issue>
          ),
          <fpage>S24</fpage>
          -
          <lpage>S32</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>