<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the statistical sensitivity of semantic similarity metrics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Prashanti Manda</string-name>
          <email>manda@uncg.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Todd J. Vision</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Biology, University of North Carolina at Chapel Hill</institution>
          ,
          <addr-line>NC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of North Carolina at Greensboro</institution>
          ,
          <addr-line>NC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>7</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Measuring the semantic similarity between objects that have been annotated with ontological terms is fundamental to an increasing number of biomedical applications, and several different ontologically aware semantic similarity metrics are in common use. In some of these applications, only weak semantic similarity is expected for biologically meaningful matches. In such cases, it is important to understand the limits of sensitivity for these metrics, beyond which biologically meaningful matches cannot be reliably distinguished from noise. Here, we present a statistical sensitivity comparison of five common semantic similarity metrics (Jaccard, Resnik, Lin, Jiang &amp; Conrath, and Hybrid Relative Specificity Similarity) representing three different kinds of metrics (Edge based, Node based, and Hybrid) and four different methods of aggregating individual annotation similarities to estimate similarity between two biological objects - All Pairs, Best Pairs, Best Pairs Symmetric, and Groupwise. We explore key parameter choices that can impact sensitivity. To evaluate sensitivity in a controlled fashion, we explore two different models for simulating data with varying levels of similarity and compare to the noise distribution using resampling. Source data are derived from the Phenoscape Knowledgebase of evolutionary phenotypes. Our results indicate that the choice of similarity metric, along with different parameter choices, can substantially affect sensitivity. Among the five metrics evaluated, we find that Resnik similarity shows the greatest sensitivity to weak semantic similarity. Among the ways to combine pairwise statistics, the Groupwise approach provides the greatest discrimination among values above the sensitivity threshold, while the Best Pairs statistic can be parametrically tuned to provide the highest sensitivity. 
Our findings serve as a guideline for an appropriate choice and parameterization of semantic similarity metrics, and point to the need for improved reporting of the statistical significance of semantic similarity matches in cases where weak similarity is of interest.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        Semantic similarity metrics quantify the similarity between objects
based on the similarity of their ontology annotations
        <xref ref-type="bibr" rid="ref14">(Pesquita
et al., 2009)</xref>
        . In biology, semantic similarity metrics are used
to compare proteins, phenotypes, diseases, and other biological
objects
        <xref ref-type="bibr" rid="ref10 ref15 ref18">(Robinson et al., 2014; Washington et al., 2009; Manda
et al., 2015)</xref>
        . For example, semantic similarity is the cornerstone
for hypothesizing candidate genes for evolutionary transitions in
the Phenoscape project
        <xref ref-type="bibr" rid="ref10">(Manda et al., 2015)</xref>
        and for connecting
model organism phenotypes to human disease phenotypes for
rare disease diagnosis within the Monarch Initiative
        <xref ref-type="bibr" rid="ref13">(Mungall
et al., 2016)</xref>
        . In some applications, such as in Phenoscape, the
biological objects being compared are expected to have only weak
similarity, and so require semantic similarity metrics that are
effective at distinguishing weak biologically meaningful matches
from noise
        <xref ref-type="bibr" rid="ref10">(Manda et al., 2015)</xref>
        .
      </p>
      <p>Here, we conduct a statistical sensitivity analysis of commonly
used semantic similarity metrics combined with different choices
of parameters and aggregation strategies to assess their robustness
at identifying weak similarities. The results serve as a guide for
choosing and implementing a semantic similarity metric appropriate
for when weak matches are of biological interest.</p>
      <p>
        First, we introduce an application within the Phenoscape project
that requires the comparison of distant biological objects
        <xref ref-type="bibr" rid="ref10 ref11">(Manda
et al., 2015, 2016)</xref>
        and use it for evaluating the sensitivity of
semantic similarity metrics and associated parameters. Phenotypic
diversity among species is typically described in phylogenetic
studies using characters consisting of two or more states that vary
in some attribute of that character, such as size or shape
        <xref ref-type="bibr" rid="ref3">(Dahdul
et al., 2010)</xref>
        . These evolutionary character states are associated with
taxa at different levels in a phylogenetic tree. Species closely related
to each other will generally have more character states in common
        <xref ref-type="bibr" rid="ref3">(Dahdul et al., 2010)</xref>
        . Model organism communities also study
phenotypes, but ones that result from perturbations to specific genes
        <xref ref-type="bibr" rid="ref16 ref17 ref8">(Sprague et al., 2007; Smith et al., 2005; Karpinka et al., 2015)</xref>
        .
Connecting evolutionary phenotypes to model organism phenotypes
can expand our understanding of both evolutionary variation and
developmental genetics
        <xref ref-type="bibr" rid="ref10 ref3 ref4">(Manda et al., 2015; Dahdul et al., 2010;
Edmunds et al., 2015)</xref>
        .
      </p>
      <p>
        The Phenoscape Knowledgebase (KB), which was built to
support the comparison of evolutionary and model organism
phenotypes, currently contains 415,819 phenotype annotations
that correspond to 21,570 evolutionary character states in 5,211
vertebrate taxa, 4,161 of which are terminal. These are integrated
with phenotypes for 3,526 human, 7,758 mouse, 5,883 zebrafish,
and 12 Xenopus genes from the Human Phenotype Ontology
        <xref ref-type="bibr" rid="ref9">(Köhler et al., 2013)</xref>
        , Mouse Genome Informatics
        <xref ref-type="bibr" rid="ref5">(Eppig et al.,
2015)</xref>
        , Zebrafish Information Network
        <xref ref-type="bibr" rid="ref1">(Bradford et al., 2011)</xref>
        ,
and Xenbase
        <xref ref-type="bibr" rid="ref8">(Karpinka et al., 2015)</xref>
        , respectively, and aims to
support semantic similarity matching among phenotypes within and
between these different sources
        <xref ref-type="bibr" rid="ref10">(Manda et al., 2015)</xref>
        .
      </p>
      <p>
        There are several challenges in comparing evolutionary phenotypes
to gene phenotypes using semantic similarity. First, evolutionary
and model organism gene phenotypes are studied by different
research communities who describe them using different ontologies
and different annotation formats. For example, evolutionary
phenotypes are often annotated in the Entity Quality (EQ)
annotation format
        <xref ref-type="bibr" rid="ref12">(Mungall et al., 2010)</xref>
        . The Entity (e.g. ‘anal
fin’) is drawn from one ontology, such as the animal anatomy
ontology UBERON
        <xref ref-type="bibr" rid="ref7">(Haendel et al., 2014)</xref>
        , and the Quality (e.g.
‘elongated’), that describes how the Entity is affected, is drawn
from a trait ontology, such as PATO
        <xref ref-type="bibr" rid="ref12">(Mungall et al., 2010)</xref>
        . In
contrast, model organism gene phenotypes such as ones from mouse
are described using the Mammalian Phenotype ontology
        <xref ref-type="bibr" rid="ref16">(Smith
et al., 2005)</xref>
        . Second, the species and their anatomical structures
being described in evolutionary and model organism phenotypes
can be vastly different. Even when changes in the same genetic
pathways affect the same anatomical structures, the phenotypes that
have changed over evolution will generally be different from those
induced in the laboratory. Given these various considerations, exact
matches between phenotypes from these different data sources will
be vanishingly rare. It is essential that semantic similarity measures
have the ability to detect very weak matches, and that one can
recognize when the best match available is sufficiently weak that
it cannot be distinguished from noise.
      </p>
      <p>
        A variety of semantic similarity metrics have been developed
and applied to compare biological entities
        <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
        .
These metrics can be broadly classified into edge-based,
node-based, and hybrid measures. Edge-based metrics primarily use
distance between terms in the ontology as a measure of similarity.
Node-based measures use Information Content of the terms being
compared and/or their least common subsumer (LCS). Hybrid
measures incorporate both edge distance and Information Content
to estimate similarity between ontology terms. Jaccard (edge-based)
and Resnik (node-based) similarity are two of the most widely used
similarity metrics for biological applications
        <xref ref-type="bibr" rid="ref18">(Washington et al.,
2009)</xref>
        . We selected Jaccard from the edge-based category and
Resnik, Lin, and Jiang and Conrath from the node-based category.
From the set of hybrid measures, Hybrid Relative Specificity
Similarity (HRSS) was selected because this metric was shown
to outperform other metrics in tests such as distinguishing true
protein-protein interactions from randomized ones and obtaining
the highest functional similarity among orthologous gene pairs
        <xref ref-type="bibr" rid="ref19">(Wu
et al., 2013)</xref>
        .
      </p>
      <p>
        The set of ontology annotations used to describe an object is said
to be the object’s profile. There are several approaches to aggregate
the pairwise similarities between two profiles. The All Pairs approach uses
the distribution of all pairwise similarities between annotations in
the two profiles of objects being compared
        <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
        ,
while Best Pairs considers only the distribution of the best match for
each annotation between the two sets
        <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
        . In
both approaches, different representations of the above similarity
distribution such as the mean, median, or a different quantile can
be used to quantify similarity between the objects being compared.
In contrast to All Pairs and Best Pairs, Groupwise aggregation
determines similarity between two objects by computing set-based
operations over the two profiles
        <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
        .
      </p>
      <p>
        A number of authors have reviewed the features and performance
of various semantic similarity metrics
        <xref ref-type="bibr" rid="ref14 ref6">(Pesquita et al., 2009;
Guzzi et al., 2012)</xref>
        . Notably, Clark and Radivojac
        <xref ref-type="bibr" rid="ref2">(Clark and
Radivojac, 2013)</xref>
        presented an information-theoretic framework for
comparing the performance of tools for automatic prediction of
ontology annotations and developed analogs to precision and recall by
quantifying the uncertainty and misinformation between a predicted
and true annotation.
      </p>
      <p>In this work, we focus on the ability of these metrics to distinguish
weak semantic similarities from noise. We introduce two models
for increasing dissimilarity between initially identical phenotypes,
and simulate annotations under these models using source data
from the Phenoscape KB. We assess the ability to distinguish
semantic similarity under progressive similarity decay from a noise
distribution obtained by resampling source annotations.</p>
    </sec>
    <sec id="sec-2">
      <title>2 METHODS</title>
      <sec id="sec-2-1">
        <title>2.1 Simulating decay of semantic similarity</title>
        <p>A database of 659 simulated evolutionary profiles with the same
size distribution as the 659 true profiles in the Phenoscape KB
was created by selecting annotations with uniform probability and
without replacement from the pool of evolutionary annotations. Five
query profiles each of sizes 10, 20, and 40 were randomly selected
from the simulated database. Three different profile sizes were
examined.</p>
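        <p>The construction of the simulated database can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name simulate_profiles, the integer annotation identifiers, and the toy sizes are assumptions made for the example. Shuffling the pool and dealing annotations out in order is one way to draw uniformly and without replacement while preserving a given profile size distribution.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def simulate_profiles(sizes, pool, rng):
    """Deal annotations from a shuffled pool into profiles of the given sizes.

    Assumes sum(sizes) does not exceed len(pool), so that no annotation
    is used twice (sampling without replacement).
    """
    shuffled = list(pool)
    rng.shuffle(shuffled)
    profiles, start = [], 0
    for size in sizes:
        profiles.append(shuffled[start:start + size])
        start += size
    return profiles


# Toy example: three profiles whose sizes mimic a "true" size distribution.
sizes = [3, 2, 5]
database = simulate_profiles(sizes, range(10), random.Random(42))
```
]]></preformat>
        <p>In this sketch, query profiles of a chosen size would then be picked at random from the resulting list of simulated profiles.</p>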
        <p>Next, “decayed” profiles were created for each simulated profile
using one of the two models described below, in order to compare
the query profile to progressively dissimilar profiles. Initially,
the query profile is a perfect match to itself, but similarity
eventually decays until it is no longer distinguishable from noise. To
characterize the noise distributions, we also generated 5,000 profiles
of the same size as the query by drawing annotations randomly from
among the 57,051 available. These profiles would not be expected to
have more than nominal similarity with any of the simulated subject
profiles.</p>
        <p>These two decay models reflect two different ways in which
we might simulate semantic similarity progressively decreasing
between two profiles.</p>
        <p>2.1.1 Decay by Random Replacement. In the Decay by Random
Replacement (RR) approach (Figure 1), annotations in the query
profile are replaced, one per iteration, by an annotation selected
randomly, with replacement, from the pool of 57,051 annotations.
The process terminates when all annotations in the profile have
been replaced. Thus, there is a 1-step decayed profile in which one
original annotation has been replaced, a 2-step decayed profile in
which two have been replaced, and so on.</p>
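        <p>The Random Replacement procedure described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function name decay_by_random_replacement, the integer annotation identifiers, and the choice to visit positions in a random order are all hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def decay_by_random_replacement(query, pool, rng):
    """Return the k-step decayed profiles for k = 1 .. len(query).

    Assumption: the position replaced at each step is visited in a random
    order. The replacement is drawn from the pool with replacement, so the
    same pool annotation may be drawn more than once.
    """
    profile = list(query)
    order = list(range(len(profile)))
    rng.shuffle(order)  # hypothetical choice: random replacement order
    decayed = []
    for position in order:
        profile[position] = rng.choice(pool)
        decayed.append(list(profile))  # snapshot of the k-step profile
    return decayed


rng = random.Random(0)
pool = list(range(1000))        # stand-in for the annotation pool
query = rng.sample(pool, 10)    # a query profile of size 10
steps = decay_by_random_replacement(query, pool, rng)
```
]]></preformat>
        <p>Here steps[k - 1] is the k-step decayed profile, so a query of size 10 yields ten decayed profiles, the last of which has had every original position replaced.</p>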
        <p>[Figure 1, illustrating Decay by Random Replacement. Diagram labels: Query Profile; Decayed Profiles (1 random annotation, 2 random annotations); Annotation Pool; Sampling without replacement.]</p>
        <sec id="sec-2-1-6">
          <title>2.1.2 Decay by Ancestral Replacement</title>
          <p>In the Decay by
Ancestral Replacement (RA) approach (Figure 2), annotations in
the query profile are replaced, one per iteration, by progressively
more semantically distant sibling, cousin, or parent terms. If an
annotation has no siblings, it is replaced by an immediate parent.
If an annotation has multiple immediate parents, a parent is selected
randomly from the set of parents for replacement. Unlike the RR
approach, which only goes through N replacements for a profile
of size N, the RA approach continues the decay process after
each annotation in the profile has been replaced once. Subsequent
iterations further decay the modified profile from the previous
iteration by replacing each annotation with a more distantly related
term. The process of decaying the query profile can be terminated
when all annotations in the profile converge at the ontology root or
when there is no further decay in similarity.</p>
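          <p>A single replacement step of this model can be sketched as follows. The toy ontology, the class names, and the function names siblings and replace_once are hypothetical, and the sketch covers only the rule stated above (prefer a sibling, otherwise move to a randomly chosen immediate parent). Tracking the degree of decay so that later iterations move to progressively more distant classes is not shown.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import random

# Toy ontology (hypothetical): each class maps to its immediate parents.
PARENTS = {
    "root": [],
    "fin": ["root"],
    "bone": ["root"],
    "anal fin": ["fin"],
    "dorsal fin": ["fin"],
    "femur": ["bone"],
}


def siblings(term):
    """Classes other than `term` that share at least one immediate parent."""
    shared = set(PARENTS[term])
    return sorted(
        t for t, parents in PARENTS.items()
        if t != term and shared.intersection(parents)
    )


def replace_once(term, rng):
    """One decay step: prefer a sibling, otherwise move to a parent."""
    candidates = siblings(term)
    if candidates:
        return rng.choice(candidates)
    if PARENTS[term]:
        return rng.choice(PARENTS[term])
    return term  # the root has nothing more general to decay to


rng = random.Random(1)
print(replace_once("anal fin", rng))  # dorsal fin, its only sibling
print(replace_once("femur", rng))     # bone, because femur has no siblings
```
]]></preformat>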
          <p>Fig. 2. Profile decay via ancestral replacement. Profiles are decayed
iteratively with one annotation per iteration being replaced with a related
ontology class (sibling, cousin, parent). The decay process ends when the process
converges at the root or when there is no further decay in similarity.
Diagram labels: Query Profile; Decayed Profiles (Degree 1, Degree 2, Degree 3, Degree 4); Replace with related classes.</p>
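          <p>2.2 Semantic similarity metrics</p>
          <p>
            2.2.1 Jaccard similarity. The Jaccard similarity
            <inline-formula><tex-math>s_J</tex-math></inline-formula>
            of two classes (A, B) in an ontology is defined as the ratio of the number
of classes in the intersection of their subsumers to the number of
classes in the union of their subsumers
            <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
            .
            <disp-formula><tex-math>s_J(A, B) = \frac{|S(A) \cap S(B)|}{|S(A) \cup S(B)|}</tex-math></disp-formula>
            where S(A) is the set of classes that subsume A.
          </p>
          <p>
            2.2.2 Resnik similarity. The Information Content of ontology
class A, denoted I(A), is defined as the negative logarithm of the
proportion of profiles annotated to that class (f(A)) out of T profiles.
            <disp-formula><tex-math>I(A) = -\log \frac{f(A)}{T}</tex-math></disp-formula>
            The normalized Information Content of A is written
            <inline-formula><tex-math>I_n(A)</tex-math></inline-formula>
            in the formulas that follow.
          </p>
          <p>
            2.2.3 Jiang and Conrath. Jiang and Conrath similarity
            <inline-formula><tex-math>s_C</tex-math></inline-formula>
            takes into account the IC of two ontology classes as well as the IC of their
Least Common Subsumer (LCS)
            <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
            . The metric is defined as
            <disp-formula><tex-math>s_C(A, B) = I_n(A) + I_n(B) - 2\, I_n(\mathrm{LCS}(A, B))</tex-math></disp-formula>
          </p>
          <p>
            2.2.4 Lin. Lin similarity
            <inline-formula><tex-math>s_L</tex-math></inline-formula>
            also takes into account the IC of
the two ontology classes and the IC of their least common subsumer
(LCS), but in a different way
            <xref ref-type="bibr" rid="ref14">(Pesquita et al., 2009)</xref>
            .
            <disp-formula><tex-math>s_L(A, B) = \frac{2\, I_n(\mathrm{LCS}(A, B))}{I_n(A) + I_n(B)}</tex-math></disp-formula>
          </p>
          <p>
            2.2.5 Hybrid Relative Specificity Similarity. Hybrid Relative
Specificity Similarity (HRSS,
            <inline-formula><tex-math>s_H(A, B)</tex-math></inline-formula>
            ) combines edge based and
IC based measures. HRSS takes into account the specificity of the
classes being compared along with their generality by using the LCS
and the Most Informative Leaves (MIL) of the classes
            <xref ref-type="bibr" rid="ref19">(Wu et al., 2013)</xref>
            .
            <disp-formula><tex-math>s_H(A, B) = \frac{1}{1 + \gamma} \cdot \frac{\alpha}{\alpha + \beta}</tex-math></disp-formula>
            where
            <disp-formula><tex-math>\gamma = D(\mathrm{LCS}(A, B), A) + D(\mathrm{LCS}(A, B), B),</tex-math></disp-formula>
            <disp-formula><tex-math>\alpha = I_n(\mathrm{LCS}(A, B)),</tex-math></disp-formula>
            <disp-formula><tex-math>\beta = \frac{D(A, \mathrm{MIL}_A) + D(B, \mathrm{MIL}_B)}{2},</tex-math></disp-formula>
            and
            <disp-formula><tex-math>D(u, v) = I_n(v) - I_n(u),</tex-math></disp-formula>
            where
            <inline-formula><tex-math>\mathrm{MIL}_i</tex-math></inline-formula>
            is the MIL of class i, u and v are ontology terms, and
u is an ancestor of v.
          </p>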
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.3 Profile similarity</title>
        <p>An annotation profile may consist of several annotations to a
single object, such as a taxon or gene. In order to provide a
single measure of the similarity of two objects when there are
multiple pairwise similarity measures available between individual
annotations, several methods are commonly used.</p>
        <p>2.3.1 All Pairs. The All Pairs score
          <inline-formula><tex-math>a_z(X, Y)</tex-math></inline-formula>
          between two entities X and Y is calculated by computing all pairwise
annotation similarities between the annotation sets of X and Y.
These |X| × |Y| pairwise annotation similarities can be summarized
by taking the median or another quantile (or summary measure).
          <disp-formula><tex-math>a_z(X, Y) = \underset{i \in \{1, \ldots, |X|\},\; j \in \{1, \ldots, |Y|\}}{\mathrm{median}} \left\{ s_z(X_i, Y_j) \right\}</tex-math></disp-formula>
          Here
          <inline-formula><tex-math>s_z</tex-math></inline-formula>
          is the semantic similarity score, where the index z
specifies the semantic similarity metric used in the computation.</p>
        <p>2.3.2 Best Pairs. To compute the Best Pairs score
          <inline-formula><tex-math>b_z(X, Y)</tex-math></inline-formula>
          between X and Y, for each annotation in X, the best scoring match in Y
is determined, and the median of the |X| resultant values is taken.
          <disp-formula><tex-math>b_z(X, Y) = \underset{i \in \{1, \ldots, |X|\}}{\mathrm{median}} \left\{ \max_{j \in \{1, \ldots, |Y|\}} s_z(X_i, Y_j) \right\}</tex-math></disp-formula>
        </p>
        <p>Unlike All Pairs, Best Pairs is not a commutative measure. To
address this, a symmetric version
          <inline-formula><tex-math>p_z(X, Y)</tex-math></inline-formula>
          of Best Pairs can be used.
          <disp-formula><tex-math>p_z(X, Y) = \frac{1}{2} \left[ b_z(X, Y) + b_z(Y, X) \right]</tex-math></disp-formula>
        </p>
        <p>2.3.3 Groupwise. Groupwise approaches
          (<inline-formula><tex-math>g_z</tex-math></inline-formula>)
          compare profiles
directly based on set operations or graph overlap. The Groupwise
Jaccard similarity of profiles X and Y,
          <inline-formula><tex-math>g_J(X, Y)</tex-math></inline-formula>
          , is defined as the
ratio of the number of classes in the intersection to the number of
classes in the union of the two profiles
          <disp-formula><tex-math>g_J(X, Y) = \frac{|T(X) \cap T(Y)|}{|T(X) \cup T(Y)|}</tex-math></disp-formula>
          where T(X) is the set of classes in profile X plus all their
subsumers.</p>
        <p>Similarly, the Groupwise Resnik similarity of profiles X and Y,
          <inline-formula><tex-math>g_R(X, Y)</tex-math></inline-formula>
          , is defined as the ratio of the normalized Information
Content summed over all nodes in the intersection of X, Y to the
Information Content summed over all nodes in the union.
          <disp-formula><tex-math>g_R(X, Y) = \frac{\sum_{t \in T(X) \cap T(Y)} I_n(t)}{\sum_{t \in T(X) \cup T(Y)} I_n(t)}</tex-math></disp-formula>
          where T(X) is defined as above.</p>
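        <p>The four aggregation strategies can be written compactly once a class-level similarity function is available. The following is a minimal sketch, not the implementation used in the study: the function names, the exact-match similarity, and the two-level toy subsumer function are assumptions chosen so that the results are easy to verify by hand.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from statistics import median


def all_pairs(X, Y, sim):
    """Median of all |X| * |Y| pairwise similarities."""
    return median(sim(x, y) for x in X for y in Y)


def best_pairs(X, Y, sim):
    """Median over X of the best match each annotation finds in Y."""
    return median(max(sim(x, y) for y in Y) for x in X)


def best_pairs_symmetric(X, Y, sim):
    """Average of the two directed Best Pairs scores."""
    return 0.5 * (best_pairs(X, Y, sim) + best_pairs(Y, X, sim))


def closure(profile, subsumers):
    """T(profile): the annotated classes plus all of their subsumers."""
    return set().union(*(subsumers(t) for t in profile))


def groupwise_jaccard(X, Y, subsumers):
    tx, ty = closure(X, subsumers), closure(Y, subsumers)
    return len(tx.intersection(ty)) / len(tx.union(ty))


def groupwise_resnik(X, Y, subsumers, ic_norm):
    tx, ty = closure(X, subsumers), closure(Y, subsumers)
    shared = sum(ic_norm(t) for t in tx.intersection(ty))
    total = sum(ic_norm(t) for t in tx.union(ty))
    return shared / total


# Hypothetical toy inputs: exact-match similarity and a two-level ontology.
def exact(a, b):
    return 1.0 if a == b else 0.0


def toy_subsumers(t):
    return {t, "root"}


def toy_ic(t):
    return 0.0 if t == "root" else 1.0


X, Y = ["a", "b", "c"], ["a", "d"]
```
]]></preformat>
        <p>For these toy profiles the directed Best Pairs scores differ (0.0 from X to Y and 0.5 from Y to X), which illustrates why the symmetric version, 0.25 here, is needed. All Pairs gives 0.0, Groupwise Jaccard gives 0.4, and Groupwise Resnik gives 0.25.</p>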
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 RESULTS</title>
      <p>
        The Phenoscape Knowledgebase contains a dataset of 661 taxa with
57,051 evolutionary phenotypes, which are phenotypes that have
been inferred to vary among the taxon’s immediate descendants
        <xref ref-type="bibr" rid="ref10">(Manda et al., 2015)</xref>
        . A simulated dataset of subject profiles having
the same size distribution of annotations per taxon was created by
permutation of annotations.
      </p>
      <p>3.1 Effect of query profile size on similarity decay</p>
      <p>First, we wanted to determine how the pattern of similarity decline
varies between different query profile sizes. We randomly selected
five simulated query profiles each of size 10, 20, and 40 and created
decayed profiles using the Decay by Random Replacement (RR)
approach. The query profiles along with their decayed profiles were
compared to the simulated database and the similarity score of
the best match was plotted. Similarity was computed using five
semantic similarity metrics (Jaccard, Resnik, Lin, Jiang, and HRSS)
along four profile similarity settings (All Pairs, Best Pairs, Best Pairs
Symmetric, and Groupwise (Jaccard and Resnik only)).</p>
      <p>The results (Figure 3) indicate that the pattern of similarity decay
is very similar across the three query profile sizes. This trend was
observed consistently across different similarity metrics and profile
similarity choices. For the Best Pairs methods, a sharp decline in
similarity is observed at the 50% decay mark for all profile sizes.
Groupwise metrics show a pattern of gradual decline across the three
profile sizes. Given that the pattern of decay was very similar across
query profile sizes for all metrics, we only used query profiles of
size 10 for the rest of the experiments in this study.</p>
      <p>3.2 Effect of profile similarity method on sensitivity of similarity metrics</p>
      <p>Next, we conducted two comparisons: the four profile similarity
approaches against each other, and the five semantic similarity
metrics against each other. For ease of interpretation, we take
the 99.9th percentile of the similarity distribution for random profile
matches (noise) as an arbitrary threshold for comparing the
sensitivity of the different series. We use discrimination of similarity
from the noise threshold as an indicator of the sensitivity of
metrics. We assessed discrimination from noise on two factors: the
magnitude of discrimination (the higher, the better), and the point in
the decay process at which similarity was no longer distinguishable
from noise (the later, the better).</p>
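      <p>The thresholding step can be sketched as follows. This is an illustration only: the function names, the nearest-rank definition of the percentile, the uniform stand-in for noise scores, and the decay curve are all assumptions. In the study, the noise scores come from comparing randomly drawn profiles against the simulated database.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def noise_threshold(noise_scores, q=0.999):
    """Nearest-rank q-quantile of the noise similarity scores."""
    ranked = sorted(noise_scores)
    rank = max(1, math.ceil(q * len(ranked)))
    return ranked[rank - 1]


def steps_above_noise(decay_scores, threshold):
    """Number of leading decay steps whose score stays above the threshold."""
    count = 0
    for score in decay_scores:
        if score > threshold:
            count += 1
        else:
            break
    return count


rng = random.Random(0)
noise = [rng.random() * 0.2 for _ in range(5000)]  # stand-in noise scores
threshold = noise_threshold(noise)
decay = [1.0, 0.9, 0.8, 0.6, 0.3, 0.1, 0.05]       # made-up decay curve
detectable = steps_above_noise(decay, threshold)
```
]]></preformat>
      <p>In this sketch the two factors named above correspond to how far the decay scores sit above the threshold and to the value of detectable, the number of decay steps that remain distinguishable from noise.</p>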
      <p>The All Pairs profile similarity method (Figure 4, Col 1) fails to
distinguish match similarity from noise across all five metrics. Both
the Best Pairs variants (Figure 4, Cols 2, 3) demonstrate an initial
discrimination from the noise threshold followed by a sharp decline
around the 50% decay mark after which similarity falls below the
noise threshold. The Symmetric measure (Figure 4, Col 3) shows
substantially greater discrimination from noise as compared to the
Asymmetric measure. Thus, both Best Pairs variants show greater
sensitivity as compared to All Pairs, with Best Pairs Symmetric
performing the better of the two Best Pairs methods.</p>
      <p>Groupwise measures (for Jaccard and Resnik) show a gradual
decline in similarity unlike the sharp decline exhibited by Best Pairs
methods. The magnitude of discrimination from noise is greater
than for Best Pairs, and discrimination is still possible beyond 50%
decay.</p>
      <p>These results from the comparison of the four profile similarity
approaches show that Best Pairs Symmetric (among pairwise
statistics) and Groupwise result in the greatest sensitivity across the
tested similarity metrics. Accordingly, these two profile similarity
methods were selected for subsequent experiments.</p>
      <p>Since Groupwise statistics are available only for two of the
five similarity metrics, we focus only on Best Pairs Symmetric to
compare the five semantic similarity metrics (Figure 4, Rows
1-5, Col 3). We observe that among the IC-based measures (Resnik,
Lin, and Jiang), Lin shows the greatest discrimination from noise.
All three metrics decline into noise similarity at the 50% decay
mark. Lin and Jiang show a flat performance until 50% decay before
suddenly dipping below the noise threshold. This indicates that
these metrics fail to distinguish perfect similarity between
identical profiles (no decay) from imperfect similarity between
decayed profiles before 50% decay. In contrast, Resnik displays
a gradual decline in similarity, indicating greater discrimination
between true matches of varying quality. Comparing the distance-based
metric (Jaccard) to the hybrid metric (HRSS), HRSS declines
below the noise threshold at the 30% decay mark, unlike Jaccard,
which discriminates from noise until 50% decay.</p>
      <p>Based on these results, we conclude that Resnik (among the
IC-based metrics), and Jaccard (between distance-based and hybrid
metrics) demonstrate the greatest sensitivity. These two metrics in
conjunction with the two selected profile similarity methods were
used for all further experiments.</p>
      <p>3.3 Improving the sensitivity of Best Pairs metrics</p>
      <p>
        The sharp decline in similarity under the Best Pairs statistics
at approximately 50% decay can be understood as a result of
summarizing pairwise similarity scores with the median of the
distribution
        <xref ref-type="bibr" rid="ref11">(Manda et al., 2016)</xref>
        . To understand if the sensitivity
of pairwise statistics such as Best Pairs can be tuned using a
different percentile of the pairwise score distribution, we compared
the results using the 80th percentile rather than the median.
      </p>
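      <p>A small worked example shows why the summary percentile controls where the decline occurs. The sketch assumes an idealised situation in which each intact annotation finds a best match of 1.0 and each replaced annotation finds only a low background match of 0.1; the function names, these two values, and the nearest-rank percentile are assumptions made for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def percentile(values, p):
    """Nearest-rank p-th percentile (p between 0 and 100) of a list of scores."""
    ranked = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ranked)))
    return ranked[rank - 1]


def best_pair_scores(n_total, n_replaced, match=1.0, background=0.1):
    """Idealised best-match scores: intact annotations score `match`,
    replaced annotations score a low `background` value."""
    return [background] * n_replaced + [match] * (n_total - n_replaced)


# Profile of size 10 with 4, 6, and 9 annotations replaced.
for replaced in (4, 6, 9):
    scores = best_pair_scores(10, replaced)
    print(replaced, percentile(scores, 50), percentile(scores, 80))
```
]]></preformat>
      <p>Under these assumptions the median stays at 1.0 with four replacements but falls to 0.1 with six, whereas the 80th percentile is still 1.0 with six replacements and falls to 0.1 only once eight or more annotations have been replaced.</p>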
      <p>
        Best Pairs Jaccard and Resnik distinguish similarity from noise
for greater levels of decay when the 80th percentile is used (Figure
5) than for the median. This illustrates that sensitivity for pairwise
metrics can be tuned by how the pairwise similarities are aggregated.
Jaccard and Resnik perform similarly with respect to how long
similarity can be distinguished from noise, with Groupwise showing
marginally less sensitivity. We again see that Groupwise has a
more gradual decline in similarity. The implication of this is that
Groupwise statistics will provide finer discrimination among
matches of varying quality and thus be better for rank ordering the
strength of matches
        <xref ref-type="bibr" rid="ref11">(Manda et al., 2016)</xref>
        , while sensitivity may be
slightly greater for Best Pairs when using a high percentile.
      </p>
      <p>3.4 Similarity Decay under the Ancestral Replacement Model</p>
      <p>Next, we explored whether the metrics exhibit the same relative
performance when using the Ancestral Replacement decay model
rather than Random Replacement (Section 2.1.2).</p>
      <p>The results for Ancestral Replacement are in general agreement
with those reported above for Random Replacement. Changing
the percentile at which pairwise scores are aggregated again shifts
the percentage decay at which similarity for the Best Pairs statistics
can no longer be discriminated from noise to the percentile
used (either 50% or 80%). Groupwise metrics again show a more
gradual decline and fail to discriminate signal from noise at a less
extreme level of decay than the Best Pairs statistic using the 80th
percentile.</p>
    </sec>
    <sec id="sec-4">
      <title>4 DISCUSSION</title>
      <p>Our findings reveal that sensitivity can vary dramatically among
semantic similarity metrics and among different parameter choices.
The majority of studies that use semantic similarity employ the
Best Pairs or All Pairs approaches to aggregate similarity between
two profiles, employing a variety of semantic similarity metrics.
Here we see pronounced performance differences among these
metrics. The way in which pairwise statistics are aggregated has
easily predictable effects upon sensitivity. In some cases, Groupwise
approaches may be more sensitive, and generally show greater
discrimination among levels of similarity above the sensitivity
threshold. These results suggest specific ways to improve the
sensitivity and interpretability of semantic similarity applications,
particularly for profile comparisons.</p>
      <p>We compared two models for decay of similarity between
profiles, and found similar results for both. We also saw no
substantive effect of profile size on the results. This increases our
confidence in the generality of the results, although our evaluation
is limited to one context: comparison among profiles sampled from
among the Entity-Quality phenotype annotations in the Phenoscape
KB.</p>
      <p>Our results also illustrate how difficult it can be to statistically
discriminate weakly matching profiles from noise, something which
has received relatively little consideration in many applications of
semantic similarity search to date. This suggests a need for more
statistically informed reporting of results from semantic similarity
matches, so that results which may be statistically meaningless are
not interpreted as having biological significance.</p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGEMENTS</title>
      <p>We thank W. Dahdul, T.A. Dececchi, N. Ibrahim and L. Jackson for
curation of the original dataset, the larger community of ontology
contributors and data providers, and useful feedback from members
of the Phenoscape team. This work was funded by the NSF
(DBI1062542) and start-up funding to PM from UNC Greensboro.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Bradford</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conlin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fashena</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frazer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>D. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knight</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moxon</surname>
            ,
            <given-names>S. A. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paddock</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pich</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramachandran</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruef</surname>
            ,
            <given-names>B. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruzicka</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaper</surname>
            ,
            <given-names>H. B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaper</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sprague</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sprunger</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Slyke</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Westerfield</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>ZFIN: enhancements and updates to the zebrafish model organism database</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>39</volume>
          (
          <issue>suppl 1</issue>
          ),
          <fpage>D822</fpage>
          -
          <lpage>D829</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>W. T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Radivojac</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Information-theoretic evaluation of predicted ontological annotations</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>29</volume>
          (
          <issue>13</issue>
          ),
          <fpage>i53</fpage>
          -
          <lpage>i61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Dahdul</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Engeman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grande</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hilton</surname>
            ,
            <given-names>E. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kothari</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapp</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lundberg</surname>
            ,
            <given-names>J. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Midford</surname>
            ,
            <given-names>P. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vision</surname>
            ,
            <given-names>T. J.</given-names>
          </string-name>
          , et al. (
          <year>2010</year>
          ).
          <article-title>Evolutionary characters, phenotypes and ontologies: curating data from the systematic biology literature</article-title>
          .
          <source>PLoS One</source>
          ,
          <volume>5</volume>
          (
          <issue>5</issue>
          ),
          <elocation-id>e10708</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Edmunds</surname>
            ,
            <given-names>R. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eames</surname>
            ,
            <given-names>B. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahdul</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapp</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lundberg</surname>
            ,
            <given-names>J. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vision</surname>
            ,
            <given-names>T. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunham</surname>
            ,
            <given-names>R. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mabee</surname>
            ,
            <given-names>P. M.</given-names>
          </string-name>
          , et al. (
          <year>2015</year>
          ).
          <article-title>Phenoscape: identifying candidate genes for evolutionary phenotypes</article-title>
          .
          <source>Molecular Biology and Evolution</source>
          ,
          <volume>33</volume>
          (
          <issue>1</issue>
          ),
          <fpage>13</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Eppig</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blake</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bult</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kadin</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richardson</surname>
            ,
            <given-names>J. E.</given-names>
          </string-name>
          , and
          <collab>Mouse Genome Database Group</collab>
          (
          <year>2015</year>
          ).
          <article-title>The mouse genome database (MGD): facilitating mouse as a model for human biology and disease</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>43</volume>
          (
          <issue>D1</issue>
          ),
          <fpage>D726</fpage>
          -
          <lpage>D736</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Guzzi</surname>
            ,
            <given-names>P. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mina</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guerra</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Cannataro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Semantic similarity analysis of protein data: assessment with biological features and issues</article-title>
          .
          <source>Briefings in Bioinformatics</source>
          ,
          <volume>13</volume>
          (
          <issue>5</issue>
          ),
          <fpage>569</fpage>
          -
          <lpage>585</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastian</surname>
            ,
            <given-names>F. B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blackburn</surname>
            ,
            <given-names>D. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blake</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradford</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Comte</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahdul</surname>
            ,
            <given-names>W. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dececchi</surname>
            ,
            <given-names>T. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Druzinsky</surname>
            ,
            <given-names>R. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hayamizu</surname>
            ,
            <given-names>T. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ibrahim</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mabee</surname>
            ,
            <given-names>P. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Niknejad</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robinson-Rechavi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sereno</surname>
            ,
            <given-names>P. C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <fpage>21</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Karpinka</surname>
            ,
            <given-names>J. B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fortriede</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burns</surname>
            ,
            <given-names>K. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>James-Zorn</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponferrada</surname>
            ,
            <given-names>V. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karimi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zorn</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vize</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Xenbase, the Xenopus model organism database; new virtualized system, data types and genomes</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>43</volume>
          (
          <issue>D1</issue>
          ),
          <fpage>D756</fpage>
          -
          <lpage>D763</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Köhler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doelken</surname>
            ,
            <given-names>S. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Firth</surname>
            ,
            <given-names>H. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bailleul-Forestier</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Black</surname>
            ,
            <given-names>G. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brudno</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campbell</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al. (
          <year>2013</year>
          ).
          <article-title>The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>42</volume>
          (
          <issue>D1</issue>
          ),
          <fpage>D966</fpage>
          -
          <lpage>D974</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Manda</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lapp</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mabee</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vision</surname>
            ,
            <given-names>T. J.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Using the Phenoscape Knowledgebase to relate genetic perturbations to phenotypic evolution</article-title>
          .
          <source>genesis</source>
          ,
          <volume>53</volume>
          (
          <issue>8</issue>
          ),
          <fpage>561</fpage>
          -
          <lpage>571</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Manda</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vision</surname>
            ,
            <given-names>T. J.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Measuring the importance of annotation granularity to the detection of semantic similarity between phenotype profiles</article-title>
          .
          <source>In Proceedings of the Joint International Conference on Biological Ontology and BioCreative</source>
          , Aachen. CEUR Workshop Proceedings.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Integrating phenotype ontologies across multiple species</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>R2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McMurry</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Köhler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Borromeo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brush</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carbon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conlin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Engelstad</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al. (
          <year>2016</year>
          ).
          <article-title>The Monarch Initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>45</volume>
          (
          <issue>D1</issue>
          ),
          <fpage>D712</fpage>
          -
          <lpage>D722</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faria</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falcao</surname>
            ,
            <given-names>A. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Couto</surname>
            ,
            <given-names>F. M.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Semantic similarity in biomedical ontologies</article-title>
          .
          <source>PLoS Computational Biology</source>
          ,
          <volume>5</volume>
          (
          <issue>7</issue>
          ),
          <elocation-id>e1000443</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>P. N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Köhler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oellrich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Washington</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seelow</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krawitz</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al. (
          <year>2014</year>
          ).
          <article-title>Improved exome prioritization of disease genes through cross-species phenotype comparison</article-title>
          .
          <source>Genome Research</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ),
          <fpage>340</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldsmith</surname>
            ,
            <given-names>C.-A. W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Eppig</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>The mammalian phenotype ontology as a tool for annotating, analyzing and comparing phenotypic information</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ),
          <fpage>R7</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Sprague</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bayraktaroglu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradford</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conlin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fashena</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frazer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>D. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knight</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al. (
          <year>2007</year>
          ).
          <article-title>The zebrafish information network: the zebrafish model organism database provides expanded support for genotypes and phenotypes</article-title>
          .
          <source>Nucleic Acids Research</source>
          ,
          <volume>36</volume>
          (
          <issue>suppl 1</issue>
          ),
          <fpage>D768</fpage>
          -
          <lpage>D772</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Washington</surname>
            ,
            <given-names>N. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westerfield</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Linking human diseases to animal models using ontology-based phenotype annotation</article-title>
          .
          <source>PLoS Biology</source>
          ,
          <volume>7</volume>
          (
          <issue>11</issue>
          ),
          <elocation-id>e1000247</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>Z.-M.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Improving the measurement of semantic similarity between gene ontology terms and gene products: insights from an edge- and IC-based hybrid method</article-title>
          .
          <source>PLoS One</source>
          ,
          <volume>8</volume>
          (
          <issue>5</issue>
          ),
          <elocation-id>e66745</elocation-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>