<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Comparative Study of Similarity Measures for Content-Based Medical Image Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>John Collins</string-name>
          <email>johncoll@mail.sfsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kazunori Okada</string-name>
          <email>kazokada@sfsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>San Francisco State University</institution>
          ,
          <addr-line>1600 Holloway Avenue, San Francisco, CA 94132</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This note summarizes the methodologies employed in our submissions for the medical retrieval subtask of the 2012 ImageCLEF competition. Our work aims to provide a systematic comparison of various similarity measures in the medical CBIR application context. Our system consists of standard bag-of-words features with SIFT. Computed features are then compared using various plug-in similarity measures, including diffusion distance and information-theoretic metric learning. This note provides the results of our experimental validation using the 2011 ImageCLEF dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>ImageCLEF</kwd>
        <kwd>CBIR</kwd>
        <kwd>M-CBIR</kwd>
        <kwd>Content-Based</kwd>
        <kwd>Image Retrieval</kwd>
        <kwd>Medical</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
ImageCLEF [1–3] is a public standardized competition which focuses attention
on, among other things, medical CBIR (hereafter M-CBIR): CBIR [4–9] in which
all images are taken from figures in medical publications. This note focuses
on a subtask of M-CBIR 2012, the medical image retrieval task with image
data alone, without other text-based data. Previous work on M-CBIR has led
to the development of an array of specific/general and local/global features. For
examples, see SIFT [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ], SURF [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] and Gabor Wavelets [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Despite the
relative maturity of feature design studies, similarity measures in CBIR have not
been investigated thoroughly. Previous studies in this regard [15–17] are still few,
and the lack is especially evident in the M-CBIR subfield.
      </p>
      <p>
        Addressing this shortcoming, this paper presents a comparative study of
M-CBIR with a comprehensive list of similarity measures of many types. Our study
shows that well-known measures tend to outperform more complex measures,
with the notable exception of the Diffusion Distance [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Further, we show
that learning a metric from a set of training data is worthwhile; our best result
comes from a combination of a metric learning transformation with
the Diffusion Distance.
This paper is organized as follows. Sections 2 and 3 will outline, respectively, our
methods of feature extraction and representation, and our comparative study
of similarity measures. Sections 4 and 5 will summarize our results and their
interpretation.
      </p>
      <p>Feature Extraction and Image Representation.
In this section we describe the three steps involved in
transforming an image into a feature vector.
First, we identify and extract SIFT features from all of the dataset images.
Second, we create a codebook of K representative features using K-means
clustering. Third, we generate a single vector per image as a normalized histogram
of these representative features. Beyond this basic three-step procedure, we
experiment with a number of standard transformations on the feature codebooks
to improve retrieval performance.</p>
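As a rough illustration, the second and third steps above can be sketched with a minimal numpy-only K-means (in practice a library implementation would be used); the function names and toy dimensions are ours, not from the paper's software.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Step two: Lloyd's K-means over SIFT descriptors pooled from the whole
    dataset (assumed already Z-score normalized); the k centers are the codebook."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)              # nearest-center label per feature
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):                      # keep old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(image_descriptors, centers):
    """Step three: represent one image as a normalized label histogram of length K."""
    dist = np.linalg.norm(image_descriptors[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()                      # a probability distribution
```

In the paper K = 1000; the toy usage below uses a much smaller k only for illustration.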
      <sec id="sec-1-1">
        <title>Image Representation</title>
        <p>From each image, we extract a variable number of features, which we classify
into K types using the codebook resulting from the bag-of-words model
described below. An image is then represented by the frequency distribution of
feature types in the image and is, by construction, a vector of length K.
Before calculating similarities, each vector is normalized so that it is a probability
distribution.</p>
      </sec>
      <sec id="sec-1-2">
        <title>SIFT: Scale Invariant Feature Transform</title>
        <p>
          SIFT [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ] is a proprietary algorithm that describes regions of interest within
an image as features which are both scale and rotation invariant. The positions of
these features, called keypoints, are determined by finding extrema of
difference-of-Gaussian images which are robust across multiple scales. Such regions are then
turned into 128-element SIFT feature vectors using local directional gradients
around the keypoint. We include 4 extra parameters, consisting of the 2
spatial coordinates of the keypoint's position within the image, the scale parameter
and the dominant-orientation parameter, for a total of 132 dimensions.
        </p>
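The 132-dimensional representation described above can be sketched as follows, assuming the 128-d descriptor and the keypoint geometry come from any SIFT implementation (the helper name is ours):

```python
import numpy as np

def augment_descriptor(desc128, x, y, scale, orientation):
    """Append the keypoint's spatial position, scale and dominant orientation
    to a 128-d SIFT descriptor, giving the 132-d vector used in this study."""
    return np.concatenate([desc128, [x, y, scale, orientation]])
```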
      </sec>
      <sec id="sec-1-3">
        <title>Bag-of-Words</title>
        <p>
          In order to generate a fixed-length vector for each image, we cluster all features
together in space using K-means clustering with a predefined vector length K.
Before clustering, each SIFT feature vector is centered and scaled using Z-score
normalization. In our case we chose K to be 1000, where this number was taken
from an earlier report in the same competition [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Each SIFT feature can
then be matched with one of the 1000 labels, 0–999, corresponding to the cluster
centers. We refer to this set of centers and the corresponding labels as a codebook.
This bag-of-words method yields the frequency distribution of these labels, 0–999,
which describes an image. The notion of a bag-of-words comes from textual data
mining and was originally proposed as a way of representing a text document
by its word frequency distribution, ignoring order. In the analogy here, SIFT
vectors are word instances and the K centers returned from K-means clustering
are the true words. Instead of instances being exact copies of a word, as in the
text mining case, in the image context a word instance is taken to represent
the center to which it is closest in distance.
        </p>
      </sec>
      <sec id="sec-1-4">
        <title>Data Transformations</title>
        <p>The following standard transformations were examined with the goal of improving
retrieval performance.</p>
        <p>
          PCA: Principal Components Analysis. PCA [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] is a technique used mainly
for dimension reduction. For a data matrix X, it seeks linear
combinations Y = Σ_{i=1}^n α_i x^{(i)} of the column vectors x^{(i)} of X such that the dimensions
of Y are uncorrelated. Moreover, dimensions in Y
are ordered from most to least important, where importance is defined in
terms of variance. In practice, the transformed data Y is often used for
dimension reduction, since one gets a variance-maximal m-dimensional
representation of X by taking the first m dimensions of Y. How small to make
m is data dependent; m is typically chosen to cover at least 95% or 99% of
the data's variance.
        </p>
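The choice of m can be sketched as follows; this is a numpy-only illustration under our own naming (the route through the SVD of the centered data is our assumption, not a detail from the paper):

```python
import numpy as np

def pca_components_for_variance(X, target=0.95):
    """Smallest number of principal components whose cumulative variance
    fraction reaches `target`, for a data matrix X (rows = observations)."""
    Xc = X - X.mean(axis=0)                     # center the data
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values, descending
    var = s**2 / (s**2).sum()                   # per-component variance fraction
    return int(np.searchsorted(np.cumsum(var), target) + 1)
```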
        <p>We experimented by varying the number of dimensions in PCA with both the
2011 and 2012 ImageCLEF competition datasets; the results are shown
in Figure 1. We found the variance spread of these two datasets to be quite
large. Overall, using our image representation, the 2012 codebook captured
more variance in fewer components than did the 2011 codebook. However,
in both cases we found that it took most of the components to cover an
adequate amount of variance.</p>
      </sec>
      <sec id="sec-1-5">
        <title>Tf-Idf: Term Frequency - Inverse Document Frequency</title>
        <p>This idea, like bag-of-words, comes from textual data mining. The goal is to penalize a
vector for words (features) whenever they are common across the entire
data set. The term frequency (Tf) of term i for an observation x is just the value at
term i's position, i.e. x_i. The inverse document frequency (Idf) is calculated
by Idf_t = log(|D| / |{d ∈ D : t ∈ d}|), where D is the dataset of observations and |{d ∈
D : t ∈ d}| is the number of observations which are non-zero in the t-th
position. For Tf-Idf, we transform each d ∈ D by the elementwise product d · Idf. In our case, we do not
explicitly measure the presence or non-presence of a feature but rather the
[Fig. 1: percentage of variance captured vs. number of principal components, with curves for the 2011 and 2012 principal components]
count of each feature. Thus, Tf-Idf provides a weighting of our images
which penalizes features that are very common in the data set and rewards
features otherwise.</p>
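A minimal sketch of this weighting on a count matrix (the function name is ours; the guard against empty columns is a numerical assumption, not discussed in the text):

```python
import numpy as np

def tfidf(X):
    """Apply Tf-Idf weighting to a matrix X (rows = images, cols = feature counts)."""
    df = np.count_nonzero(X, axis=0)            # images containing each feature
    idf = np.log(len(X) / np.maximum(df, 1))    # Idf_t = log(|D| / df_t)
    return X * idf                              # elementwise per-column weighting
```

Features present in every image get Idf = log(1) = 0 and are thus fully suppressed.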
        <p>In the course of our study we experimented not just with PCA and Tf-Idf, but
also with nestings of these operations. In short, for our dataset X, we compute
the following data transformations.
1. PCA(X)
2. Tf-Idf(X)
3. PCA(Tf-Idf(X))
4. Tf-Idf(PCA(X))</p>
        <p>Database Ranking by Similarity Comparison.
Given a query image, the goal here is to calculate the similarities or distances
between it and each of the images in the database. The first image returned
is then the most similar, the second the second most similar, and
so on. In some cases a query may consist of multiple images; in this case, we
calculate the average similarity of the query parts to each database image as
the representative score. The subjectivity inherent in the idea of similarity is
reflected in the varying types of similarity measures which can be defined. In
some cases below, e.g. cosine similarity, a measure has its natural expression
as a similarity rather than a dissimilarity measure. However, in most cases the
natural definition is as a dissimilarity measure. We shall use d when referring to
a dissimilarity measure and s when referring to a similarity measure. The idea
of calculating similarity as an additive inverse of distance comes from the idea of
a metric. A metric on a set X is a mapping d : X × X → R such that, for all x, y, z ∈ X,
the following conditions hold: d(x, y) ≥ 0; d(x, y) = 0 if and only if x = y;
d(x, y) = d(y, x); and d(x, z) ≤ d(x, y) + d(y, z).</p>
        <p>We use the broader term measure because, in some cases, what we use will fail
one or more of the conditions above. For example, the Kullback–Leibler
divergence is not symmetric, since in general d(x, y) ≠ d(y, x). Finally, when a
dissimilarity measure is being considered, it should be understood that we are
using 1 − d(x, y) to calculate the similarity, where x and y are appropriately
scaled so that d(x, y) ∈ [0, 1].</p>
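The ranking procedure for a (possibly multi-part) query can be sketched as follows, assuming the dissimilarity has already been scaled into [0, 1] as described; the helper names are ours:

```python
import numpy as np

def rank_database(query_vecs, db_vecs, dissim):
    """Rank database images by average similarity (1 - d) over all query parts.
    Returns (indices best-first, scores in that order)."""
    scores = []
    for v in db_vecs:
        scores.append(np.mean([1.0 - dissim(q, v) for q in query_vecs]))
    order = np.argsort(scores)[::-1]            # most similar first
    return order, np.asarray(scores)[order]
```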
      </sec>
      <sec id="sec-1-6">
        <title>Various Similarity Measures</title>
        <p>The following lists the similarity and dissimilarity measures we considered in our study.
Let x denote the vector (x_1, x_2, ..., x_n) representing the query image and y the
vector (y_1, y_2, ..., y_n) representing another image. Further, let x̄ represent the
mean of the values in the x vector and ȳ the mean of y. Let X and
Y represent, respectively, the cumulative distributions of x and y when they
are considered as probability distributions (Σ_{i=1}^n x_i = Σ_{i=1}^n y_i = 1); that is,
X = (X_1, X_2, ..., X_n) where X_j = Σ_{i=1}^j x_i, and similarly for Y and y. Finally,
μ = (μ_1, ..., μ_n) is the mean vector μ = (x + y)/2.</p>
        <p>– Minkowski and Standard Measures
Euclidean Distance (L2): d(x, y) = sqrt(Σ_{i=1}^n (x_i − y_i)²)
Cityblock Distance (L1): d(x, y) = Σ_{i=1}^n |x_i − y_i|
Infinity Distance (L∞): d(x, y) = max_{i=1..n} |x_i − y_i|
Cosine Similarity (CO): s(x, y) = (x · y) / (‖x‖ ‖y‖)</p>
        <p>– Statistical Measures
Pearson Correlation Coefficient (CC): d(x, y) = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / sqrt(Σ_{i=1}^n (x_i − x̄)² (y_i − ȳ)²)</p>
        <p>– Divergence Measures
Chi-Square Dissimilarity (CS): d(x, y) = Σ_{i=1}^n (x_i − μ_i)² / μ_i [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]
Kullback–Leibler Divergence (KL): d(x, y) = Σ_{i=1}^n x_i log(x_i / y_i) [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]
Jeffrey Divergence (JF): d(x, y) = Σ_{i=1}^n (x_i log(x_i / μ_i) + y_i log(y_i / μ_i)) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]
Kolmogorov–Smirnov Divergence (KS): d(x, y) = max_{i=1..n} |X_i − Y_i| [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]
Cramér–von Mises Divergence (CvM): d(x, y) = Σ_{i=1}^n (X_i − Y_i)² [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]</p>
        <p>– Other Measures
Earth Mover's Distance (EMD-L1): d(x, y) = Σ_{i=1}^n |X_i − Y_i| [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]¹
Diffusion Distance (DD): d(x, y) = Σ_{l=1}^{log₂ n} Σ_{j=1}^{n/2^l} |z_j^{(l)}|, where z^{(0)} = x − y
and z^{(l)} is the l-times iteratively Gaussian-smoothed, then 2-downsampled
version of z^{(0)} [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
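A few of the measures above in numpy form, as a sketch under our own naming; the small epsilon in KL and the zero-bin mask in chi-square are numerical-stability assumptions not discussed in the text:

```python
import numpy as np

def l1(x, y):   return np.abs(x - y).sum()                 # Cityblock (L1)
def l2(x, y):   return np.sqrt(((x - y) ** 2).sum())       # Euclidean (L2)
def linf(x, y): return np.abs(x - y).max()                 # Infinity distance

def cosine_sim(x, y):
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

def chi_square(x, y):
    m = (x + y) / 2.0                                      # mean vector mu
    mask = m > 0                                           # skip empty bins
    return (((x - m) ** 2)[mask] / m[mask]).sum()

def kl(x, y, eps=1e-12):
    return (x * np.log((x + eps) / (y + eps))).sum()       # not symmetric

def emd_1d(x, y):
    # for 1-D histograms, EMD reduces to the L1 distance between the CDFs
    return np.abs(np.cumsum(x) - np.cumsum(y)).sum()
```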
      </sec>
      <sec id="sec-1-7">
        <title>Metric Learning</title>
        <p>
          Metric Learning [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] is the process of using information about the similarity
and/or dissimilarity of some dataset X to learn a mapping to a new space
Y = A^{1/2}X, in which similar data will be closer together and dissimilar data will
be farther apart. Let ω denote an n-dimensional vector in which ω_i determines
the weight given to the i-th variable. With such an ω we can define a weighted L2
metric on X such that, for each x and y in X, we capture the distance between
them by d_ω(x, y) = sqrt(Σ_{i=1}^n ω_i (x_i − y_i)²). The idea of metric learning is to learn
the appropriate weights from a training dataset. A less strict formulation of
metric learning allows the weights to be described by a non-diagonal symmetric
positive semi-definite matrix A such that ω = diag(A), leading to a more general
Mahalanobis-type metric formulation:
d_A(x, y) = ‖x − y‖_A = sqrt((x − y)^T A (x − y))    (1)
Many algorithms [26–28] have been used to learn such a metric, with Yang [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]
giving a nice summary. We employ a widely used algorithm called Information-Theoretic
Metric Learning (hereafter ITML). ITML uses an
information-theoretic cost model which iteratively enforces similarity/dissimilarity constraints;
the input is a list of such pairwise constraints and the output is a
learned matrix A. An equivalent and more computationally efficient formulation
to the one above is to use the L2 metric on the data after applying the data
transformation X ↦ A^{1/2}X. In this study, we employ the diagonal form of A
for simplicity, with information about similarity/dissimilarity obtained from the
2011 ImageCLEF dataset as our training data.
        </p>
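The diagonal case used in this study, and its equivalence to the plain L2 metric after the transformation X ↦ A^{1/2}X, can be sketched as follows (function names ours):

```python
import numpy as np

def mahalanobis_diag(x, y, w):
    """Weighted L2 (diagonal-A Mahalanobis) distance, with w = diag(A), w_i >= 0."""
    d = x - y
    return np.sqrt((w * d * d).sum())

def transform(X, w):
    """Equivalent view: map data by A^(1/2), then use the plain L2 metric."""
    return X * np.sqrt(w)
```

The equivalence means the weighted distance between x and y equals the ordinary Euclidean distance between their transformed versions.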
      </sec>
      <sec id="sec-1-8">
        <title>Query Filtering</title>
        <p>
          We used the Modality Classification results made available by ImageCLEF to
filter out certain image types which are likely to be irrelevant to all queries.
Table 1 indicates the filtering performed. In short, we included all and only
diagnostic images.
¹ EMD for 1-D features is equivalent to the Mallows Distance [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]
Using the relevance judgments from 2011 ImageCLEF, we validate our proposed
system. Table 2 shows the Mean Average Precision (hereafter MAP) scores for
various permutations of our system components, computed using the relevance
judgment file from the 2011 results.
        </p>
        <p>We used this table to select our best potential measure/transformation
combinations for the 2012 ImageCLEF competition. In the end we submitted the following
seven runs to the 2012 ImageCLEF medical retrieval competition.
1. L1 on the untransformed data (reg cityblock)
2. DD on the untransformed data (reg diffusion)
3. L2 on the Tf-Idf(PCA) transformed data (tfidf of pca euclidean)
4. CO on the Tf-Idf(PCA) transformed data (tfidf of pca cosine)
5. CC on the Tf-Idf(PCA) transformed data (tfidf of pca correlation)</p>
        <sec id="sec-1-8-1">
          <title>6. L1 on the ITML data (itml cityblock)</title>
        </sec>
        <sec id="sec-1-8-2">
          <title>7. DD on the ITML data (itml diffusion)</title>
          <p>
            These selected runs are identified in Table 2 as highlighted items. Submissions to ImageCLEF medical retrieval [30, 31] are text files containing a ranked list of at most 1000 images for each of the competition queries, along with information such as the rank, query number and score. These submission files are constructed
in the TREC-style submission format [
            <xref ref-type="bibr" rid="ref32">32</xref>
            ].
          </p>
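A hypothetical writer for such a run file, assuming the standard TREC columns (query id, a literal "Q0", image id, rank, score, run tag); the exact column conventions for an ImageCLEF submission are an assumption here and should be checked against the task guidelines:

```python
def write_trec_run(path, run_tag, results):
    """Write a TREC-style run file.
    results: {query_id: [(image_id, score), ...]} already sorted best-first;
    at most 1000 rows are emitted per query, as the task requires."""
    with open(path, "w") as f:
        for qid, ranked in results.items():
            for rank, (img, score) in enumerate(ranked[:1000], start=1):
                f.write(f"{qid} Q0 {img} {rank} {score:.6f} {run_tag}\n")
```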
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Discussion</title>
      <p>We have presented a systematic comparison of various plug-in (dis-)similarity
measures for M-CBIR with a standard bag-of-words feature method. Our
validation results with last year's 2011 dataset indicate that both ITML and the diffusion
distance are promising choices for the ad-hoc image-based retrieval task for
medical images. Based on this result, we entered seven runs (combinations
of the three top-performing measures with different feature transformations). The
results were disappointing: all of our runs placed last in this
category, with very low MAP scores, in this year's competition. The reasons for this
performance may include a potentially suboptimal choice of feature
extraction/representation and of the query filtering employed. Investigating this, and
rerunning our study with a better base CBIR system, is important future work.
Among our 2012 results, we observe a consistent trend of the diffusion and
cityblock distances performing best among our submitted runs. This indicates
the virtue of distance measures based on the L1 metric. The run with metric
learning (ITML) placed last in our list. This may indicate a significant change
in data characteristics between the 2011 and 2012 data, which would naturally
cause the reduced performance. Investigating the true advantage of the metric
learning approach in M-CBIR remains another item of future work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Clough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deselaers</surname>
          </string-name>
          , and B. Caputo, eds.,
          <source>ImageCLEF: Experimental Evaluation in Visual Information Retrieval (The Information Retrieval Series)</source>
          . Springer, 1st ed., Aug.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Clough</surname>
          </string-name>
          , H. Muller, and M. Sanderson, \
          <article-title>Seven Years of Image Retrieval Evaluation," in ImageCLEF (H</article-title>
          . Muller, P. Clough,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deselaers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Caputo</surname>
          </string-name>
          , and W. B. Croft, eds.), vol.
          <volume>32</volume>
          of The Information Retrieval Series, Springer Berlin Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Muller</surname>
          </string-name>
          and J. Kalpathy-Cramer, \
          <article-title>The Medical Image Retrieval Task," in ImageCLEF (H</article-title>
          . Muller, P. Clough,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deselaers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Caputo</surname>
          </string-name>
          , and W. B. Croft, eds.), vol.
          <volume>32</volume>
          of The Information Retrieval Series, Springer Berlin Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. W. M.</given-names>
            <surname>Smeulders</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Worring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Santini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Jain</surname>
          </string-name>
          , \
          <article-title>Content-based image retrieval at the end of the early years,"</article-title>
          <source>IEEE Trans. Pattern Anal. and Machine Intell</source>
          ., vol.
          <volume>22</volume>
          , no.
          <issue>12</issue>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Michoux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bandon</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Geissbuhler</surname>
          </string-name>
          , \
          <article-title>A review of content-based image retrieval systems in medical applications|clinical bene ts and future directions,"</article-title>
          <source>Intl. J. Medical Informatics</source>
          , vol.
          <volume>73</volume>
          , no.
          <issue>1</issue>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Desai</surname>
          </string-name>
          , \
          <article-title>Medical image retrieval and registration: towards computer assisted diagnostic approach,"</article-title>
          <source>in Proc. IDEAS Workshop on Medical Information Systems: The Digital Hospital</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Deserno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Antani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Long</surname>
          </string-name>
          , \
          <article-title>Ontology of Gaps in Content-Based Image Retrieval,"</article-title>
          <source>Journal of Digital Imaging</source>
          , vol.
          <volume>22</volume>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Wein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dahmen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bredno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vogelsang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kohnen</surname>
          </string-name>
          , \
          <article-title>Content-based image retrieval in medical applications: a novel multistep approach," in</article-title>
          <string-name>
            <surname>SPIE (M. M. Yeung</surname>
            ,
            <given-names>B.-L.</given-names>
          </string-name>
          <string-name>
            <surname>Yeo</surname>
          </string-name>
          , and
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>A</article-title>
          . Bouman, eds.), vol.
          <volume>3972</volume>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Marchiori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Brodley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pavlopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Broderick</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Aisen</surname>
          </string-name>
          , \
          <article-title>CBIR for medical images - an evaluation trial,"</article-title>
          <source>in IEEE Workshop on Content-based Access of Image and Video Libraries</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Lowe</surname>
          </string-name>
          , \
          <article-title>Object recognition from local scale-invariant features,"</article-title>
          <source>in Proc. IEEE Int. Conf. Computer Vision</source>
          , vol.
          <volume>2</volume>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Lowe</surname>
          </string-name>
          , \
          <article-title>Distinctive Image Features from Scale-invariant Keypoints,"</article-title>
          <source>Int. J. Computer Vision</source>
          , vol.
          <volume>60</volume>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tuytelaars</surname>
          </string-name>
          , and
          <string-name>
            <surname>L. Van Gool</surname>
          </string-name>
          , \SURF:
          <article-title>Speeded Up Robust Features,"</article-title>
          <source>in Proc. European Conf</source>
          . Computer
          <string-name>
            <surname>Vision</surname>
          </string-name>
          (A.
          <string-name>
            <surname>Leonardis</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Bischof</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <surname>A</surname>
          </string-name>
          . Pinz, eds.), vol.
          <volume>3951</volume>
          of Lecture Notes in Computer Science, Springer Berlin / Heidelberg,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ess</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tuytelaars</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Gool</surname>
          </string-name>
          , \
          <article-title>Speeded-up robust features (SURF)," Computer Vision and Image Understanding</article-title>
          , vol.
          <volume>110</volume>
          , no.
          <issue>3</issue>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Lee</surname>
          </string-name>
          , \
          <article-title>Image representation using 2D gabor wavelets,"</article-title>
          <source>IEEE Trans. Pattern Anal. and Machine Intell</source>
          ., vol.
          <volume>18</volume>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pele</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Werman</surname>
          </string-name>
          , "
          <article-title>The Quadratic-Chi Histogram Distance Family,"</article-title>
          <source>in Proc. European Conf. Computer Vision</source>
          (K. Daniilidis, P. Maragos, and N. Paragios, eds.), vol.
          <volume>6312</volume>
          of Lecture Notes in Computer Science, Springer Berlin / Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rubner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tomasi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Guibas</surname>
          </string-name>
          , "
          <article-title>A metric for distributions with applications to image databases,"</article-title>
          <source>in Proc. IEEE Int. Conf. Computer Vision</source>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Puzicha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Buhmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rubner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Tomasi</surname>
          </string-name>
          , "
          <article-title>Empirical evaluation of dissimilarity measures for color and texture,"</article-title>
          <source>in Proc. IEEE Int. Conf. Computer Vision</source>
          , vol.
          <volume>2</volume>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ling</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Okada</surname>
          </string-name>
          , "
          <article-title>Diffusion Distance for Histogram Comparison,"</article-title>
          <source>in Proc. IEEE Conf. Computer Vision and Pattern Recognition</source>
          , vol.
          <volume>1</volume>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>U.</given-names>
            <surname>Avni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goldberger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Greenspan</surname>
          </string-name>
          , "
          <article-title>Medical image classification at Tel Aviv and Bar Ilan Universities,"</article-title>
          <source>in ImageCLEF</source>
          (H. Müller, P. Clough,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deselaers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Caputo</surname>
          </string-name>
          , and W. B. Croft, eds.), vol.
          <volume>32</volume>
          of The Information Retrieval Series, Springer Berlin Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R. O.</given-names>
            <surname>Duda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Hart</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Stork</surname>
          </string-name>
          ,
          <source>Pattern Classification</source>
          . Wiley, 2 ed., Nov.
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Puzicha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hofmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Buhmann</surname>
          </string-name>
          , "
          <article-title>Non-parametric similarity measures for unsupervised texture segmentation and image retrieval,"</article-title>
          <source>in Proc. IEEE Conf. Computer Vision and Pattern Recognition</source>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ojala</surname>
          </string-name>
          , M. Pietikainen, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Harwood</surname>
          </string-name>
          , "
          <article-title>A comparative study of texture measures with classification based on featured distributions,"</article-title>
          <source>Pattern Recognition</source>
          , vol.
          <volume>29</volume>
          , no.
          <issue>1</issue>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D.</given-names>
            <surname>Geman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Geman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Graffigne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Dong</surname>
          </string-name>
          , "
          <article-title>Boundary detection by constrained optimization,"</article-title>
          <source>IEEE Trans. Pattern Anal. and Machine Intell</source>
          ., vol.
          <volume>12</volume>
          , no.
          <issue>7</issue>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ling</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Okada</surname>
          </string-name>
          , "
          <article-title>An Efficient Earth Mover's Distance Algorithm for Robust Histogram Comparison,"</article-title>
          <source>IEEE Trans. Pattern Anal. and Machine Intell</source>
          ., vol.
          <volume>29</volume>
          , no.
          <issue>5</issue>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>E.</given-names>
            <surname>Levina</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bickel</surname>
          </string-name>
          , "
          <article-title>The Earth Mover's distance is the Mallows distance: some insights from statistics,"</article-title>
          <source>in Proc. IEEE Int. Conf. Computer Vision</source>
          , vol.
          <volume>2</volume>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>E. P.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Russell</surname>
          </string-name>
          , "
          <article-title>Distance metric learning with application to clustering with side-information,"</article-title>
          <source>in Advances in Neural Information Processing Systems</source>
          , vol.
          <volume>15</volume>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Dhillon</surname>
          </string-name>
          , "
          <article-title>Information-theoretic metric learning,"</article-title>
          <source>in Proc. Intl. Conf. Machine learning</source>
          , (New York, NY, USA), ACM,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>B.</given-names>
            <surname>McFee</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Lanckriet</surname>
          </string-name>
          , "
          <article-title>Metric Learning to Rank,"</article-title>
          <source>in Proc. Intl. Conf. Machine learning</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Jin</surname>
          </string-name>
          , "
          <article-title>Distance Metric Learning: A Comprehensive Survey,"</article-title>
          tech. rep., Department of Computer Science and Engineering, Michigan State University,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kalpathy-Cramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Antani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Eggel</surname>
          </string-name>
          , "
          <article-title>Overview of the ImageCLEF 2012 medical image retrieval and classification tasks,"</article-title>
          <source>CLEF 2012 working notes</source>
          , Sept.
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kalpathy-Cramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bedrick</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Hersh</surname>
          </string-name>
          , "
          <article-title>Relevance Judgments for Image Retrieval Evaluation,"</article-title>
          <source>in ImageCLEF</source>
          (H. Müller, P. Clough,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deselaers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Caputo</surname>
          </string-name>
          , and W. B. Croft, eds.), vol.
          <volume>32</volume>
          of The Information Retrieval Series, Springer Berlin Heidelberg,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>N.</given-names>
            <surname>Stokes</surname>
          </string-name>
          , "
          <article-title>TREC: Experiment and Evaluation in Information Retrieval,"</article-title>
          <source>Computational Linguistics</source>
          , vol.
          <volume>32</volume>
          , pp.
          <fpage>563</fpage>
          -
          <lpage>567</lpage>
          , Nov.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>