<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FCSE at Medical Tasks of ImageCLEF 2013</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ivan Kitanovski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivica Dimitrovski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Suzana Loskovska</string-name>
          <email>suzana.loshkovska@finki.ukim.mk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science and Engineering, University of Ss Cyril and Methodius Rugjer Boshkovikj 16</institution>
          ,
          <addr-line>1000 Skopje</addr-line>
          ,
          <country country="MK">Macedonia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the details of the participation of the FCSE (Faculty of Computer Science and Engineering) research team in the ImageCLEF 2013 medical tasks (modality classification, ad-hoc image retrieval and case-based retrieval). For the modality classification task we used SIFT descriptors and tf-idf weights of the surrounding text (image caption and paper title) as features. SVMs with a χ<sup>2</sup> kernel and a one-vs-all strategy were used as classifiers. For the ad-hoc image retrieval task and the case-based retrieval task we adopted a strategy that combines word-space and concept-space approaches. The word-space approach uses the Terrier IR search engine to index and retrieve the text associated with the images/cases. The concept-space approach uses Metamap to map the text data into a set of UMLS (Unified Medical Language System) concepts, which are then indexed and retrieved by the Terrier IR search engine. The results from the word-space and concept-space retrieval are fused using linear combination. For the compound figure separation task, we used an unsupervised algorithm based on a breadth-first search strategy that uses only visual information from the medical images. The selected algorithms were tuned and tested on the data from the ImageCLEF 2012 medical task, and based on the selected parameters we submitted the new experiments for the ImageCLEF 2013 medical task. We achieved very good overall performance: the best run for the modality classification task ranked 2nd in the overall score, and the best run for the ad-hoc image retrieval task ranked 3rd.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In this paper we present the experiments performed by the Faculty of Computer
Science and Engineering (FCSE) team for the medical tasks at ImageCLEF
2013. Our group participated in all medical subtasks. To find suitable
parameters, we evaluated our approaches on the ImageCLEF 2012 dataset and
then, based on those parameters, submitted the runs for ImageCLEF 2013.</p>
      <p>The paper is organized as follows: Section 2 describes our approach for the
modality classification task, Section 3 presents the algorithm for the compound
figure separation task, Section 4 presents the ad-hoc image retrieval task, and Section 5
contains the details of the case-based retrieval task.</p>
    </sec>
    <sec id="sec-2">
      <title>Modality classification task</title>
      <sec id="sec-2-1">
        <title>Introduction</title>
        <p>
          Imaging modality is an important piece of information about an image for medical retrieval.
In user studies, clinicians have indicated that modality is one of the most
important filters by which they would like to limit their search. Using the
modality information, the retrieval results can often be improved significantly.
The ImageCLEF 2013 medical modality classification task is a standardized
benchmark for systems that automatically classify the modality of medical images from
PubMed journal articles [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The 2013 dataset has 31 classes (the same number
of classes and the same classification hierarchy as in 2012), but a larger number
of compound figures is present, making the task significantly harder but
much closer to the reality of biomedical journals [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>Our approach combines visual features with textual features
extracted from the text surrounding the images. SVMs with a χ<sup>2</sup> kernel
were used as classifiers. The algorithms are explained in detail in the
remainder of this section.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Visual features</title>
      <p>
        Collections of medical images can contain various images obtained using
different imaging techniques. Different feature extraction techniques are able to
capture different aspects of an image (e.g., texture, shapes, color distribution)
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Texture is especially important, because it is difficult to classify medical
images using shape or gray level information. Effective representation of texture
is needed to distinguish between images with equal modality and layout.
Local image characteristics are fundamental for image interpretation: while global
features retain information on the whole image, local features capture the
details. They are thus more discriminative with respect to inter- and
intra-class variability [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        The bag-of-visual-words approach is commonly used in many state-of-the-art
algorithms for image classification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The basic idea of this approach is to
sample a set of local image patches using some method (densely, randomly or
using a key-point detector) and calculate a visual descriptor on each patch (e.g., a SIFT
descriptor or normalized pixel values). The resulting distribution of descriptors
is then quantized against a pre-specified visual codebook, which converts it to
a histogram. The main issues that need to be considered when applying this
approach are: sampling of the patches, selection of the visual patch descriptor
and building of the visual codebook.
      </p>
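      <p>The quantization step described above can be illustrated with a minimal sketch. It assumes that a codebook is already available and that every descriptor is simply assigned to its nearest codeword by squared Euclidean distance; the function names and the toy two-dimensional descriptors are illustrative and do not come from the original system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bovw_histogram(descriptors, codebook):
    """Count, for each codeword, how many descriptors are closest to it."""
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(d, codebook[i]))
        hist[nearest] += 1
    return hist


# Toy data: three codewords and four patch descriptors (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, 0.2), (0.9, 1.2), (3.5, 0.1), (1.1, 0.8)]
histogram = bovw_histogram(descriptors, codebook)
print(histogram)
```
]]></preformat>
      <p>In this toy case two of the four descriptors fall on the second codeword, so the histogram has its largest count there.</p>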
      <p>
        We use dense sampling of the patches, which samples an image grid in a
uniform fashion using a fixed pixel interval between patches. We use an interval
distance of 6 pixels and sample at multiple scales (σ = 1.2 and σ = 2.0). Due
to the low contrast of some of the medical images (for example, radiographs),
it would be difficult to use any detector for points of interest. Also, it has been
pointed out by Zhang et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that dense sampling is always superior to any
strategy based on detectors for points of interest. We calculate an OpponentSIFT
descriptor for each image patch [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. OpponentSIFT describes all the channels
in the opponent color space using SIFT descriptors. The information in the O3
channel is equal to the intensity information, while the other channels describe
the color information in the image. These other channels do contain some
intensity information, but due to the normalization of the SIFT descriptor they are
invariant to changes in light intensity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
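      <p>A minimal sketch of the dense sampling grid is given below. It only enumerates patch centres every 6 pixels and attaches each of the two scales to every centre; how the patch extent is derived from the scale is not stated in the text, so it is left out. The function names are illustrative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def dense_grid(width, height, step=6):
    """Return (x, y) patch centres on a regular grid with `step` pixel spacing."""
    xs = range(0, width, step)
    ys = range(0, height, step)
    return [(x, y) for y in ys for x in xs]


def dense_samples(width, height, step=6, scales=(1.2, 2.0)):
    """Pair every grid position with every sampling scale."""
    return [(x, y, s) for s in scales for (x, y) in dense_grid(width, height, step)]


grid = dense_grid(18, 12)          # x in {0, 6, 12}, y in {0, 6}
samples = dense_samples(18, 12)    # the same positions at both scales
print(len(grid), len(samples))
```
]]></preformat>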
      <p>
        The crucial aspect of the bag-of-visual-words approach is the codebook
construction. An extensive comparison of codebook construction variables is given
by van Gemert et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We employ k-means clustering on 250K randomly
chosen descriptors from the set of images available for training. k-means partitions
the visual feature space into a predefined number of clusters k by minimizing
the within-cluster variance. Here, we set k to 500 and thus define a codebook with 500
codewords [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
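      <p>For illustration, the codebook construction can be sketched with a plain Lloyd-style k-means. This is a minimal pure-Python sketch under simplifying assumptions (random initialization from the data, a fixed number of iterations, toy two-dimensional points); it is not the implementation used for the 250K descriptors and 500 codewords reported above.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: returns k centroids (the visual codewords)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial codewords drawn from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point goes to its nearest centroid.
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
codewords = sorted(kmeans(points, k=2))
print(codewords)
```
]]></preformat>
      <p>On this toy data the two centroids settle on the means of the two well-separated groups.</p>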
      <p>
        Dense sampling gives an equal weight to all key-points, irrespective of their
spatial location in the image. To overcome this limitation, we follow the spatial
pyramid approach [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. We used a spatial pyramid of 1x1, 2x2, and 1x3 regions.
Since every region is an image in itself, the spatial pyramid can easily be used
in combination with dense sampling. The resulting vector with 4000 bins ((1x1
+ 2x2 + 1x3) x 500) was obtained by concatenating the eight histograms (each
histogram is L1 normalized). Fig. 1 shows an example of the histograms extracted
from an image for the spatial pyramids of 1x1, 2x2 and 1x3.
      </p>
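      <p>The following sketch shows how the eight region histograms can be built and concatenated. It assumes each sampled patch is given as (x, y, codeword index) and that "1x3" means one column and three rows; that orientation, like the function and variable names, is an assumption made for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# (columns, rows) per pyramid level; the orientation of 1x3 is an assumption.
PYRAMID = [(1, 1), (2, 2), (1, 3)]


def spatial_pyramid(assignments, width, height, vocab_size, pyramid=PYRAMID):
    """Concatenate one L1-normalized codeword histogram per pyramid region.

    assignments: list of (x, y, codeword_index), one entry per sampled patch.
    """
    feature = []
    for cols, rows in pyramid:
        hists = [[0.0] * vocab_size for _ in range(cols * rows)]
        for x, y, word in assignments:
            col = min(int(x * cols / width), cols - 1)
            row = min(int(y * rows / height), rows - 1)
            hists[row * cols + col][word] += 1.0
        for hist in hists:
            total = sum(hist)
            # Empty regions stay all-zero instead of dividing by zero.
            feature.extend(v / total if total else 0.0 for v in hist)
    return feature


toy = [(10, 10, 0), (90, 90, 1), (90, 10, 1)]
feature = spatial_pyramid(toy, width=100, height=100, vocab_size=2)
print(len(feature))
```
]]></preformat>
      <p>With a 500-word codebook the same function yields (1 + 4 + 3) x 500 = 4000 bins, matching the vector length given above.</p>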
    </sec>
    <sec id="sec-4">
      <title>Textual features</title>
      <p>
        Images in the collection belong to medical articles, so they can be indexed
using the surrounding text content. The text representation adopted in this
work includes information from the title of the paper and the image caption,
which can be found in the XML file corresponding to each image in the data
set. From these, a text corpus for the image collection was built, and standard
text processing operations were applied, including tokenization, stemming, and
stop-word removal using Terrier IR [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. We calculate the weight for each term in
each document using the TF-IDF weighting model. The calculated weights were
adopted as textual features.
      </p>
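      <p>As an illustration of the textual features, the sketch below computes textbook TF-IDF weights (relative term frequency times the logarithm of the inverse document frequency). Terrier's TF-IDF weighting model uses its own variant of the formula, so this is an assumption-laden approximation; the toy token lists stand in for the preprocessed title and caption of each image.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


docs = [["chest", "xray", "lung"], ["brain", "mri"], ["lung", "ct", "lung"]]
weights = tfidf(docs)
print(weights[1])
```
]]></preformat>
      <p>Terms that occur in fewer documents (such as "chest") receive a higher weight than terms shared across documents (such as "lung").</p>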
    </sec>
    <sec id="sec-5">
      <title>Feature fusion schemes</title>
      <p>
        Different features (in our case visual and textual) bring different information
about the content of the images, and their combination clearly outperforms single-feature approaches
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Following these findings, we combine the two different features described
above using a high level feature fusion scheme. The fusion scheme is depicted in
Fig. 2.
      </p>
      <p>The high level fusion scheme averages the predictions from the individual
classifiers trained on the separate descriptors.</p>
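      <p>The averaging step can be sketched as follows, assuming each classifier returns one probability per class. The class labels and probability values are hypothetical and serve only as an example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def late_fusion(visual_probs, textual_probs):
    """Average per-class probabilities of two classifiers; return the best class."""
    fused = {
        label: (visual_probs[label] + textual_probs[label]) / 2.0
        for label in visual_probs
    }
    return max(fused, key=fused.get), fused


# Hypothetical outputs of the visual and the textual classifier for one image.
visual = {"xray": 0.5, "ct": 0.3, "mri": 0.2}
textual = {"xray": 0.2, "ct": 0.6, "mri": 0.2}
predicted, fused = late_fusion(visual, textual)
print(predicted, fused)
```
]]></preformat>
      <p>Here the visual classifier alone would choose "xray", but the averaged prediction favours "ct" because the textual classifier is more confident.</p>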
    </sec>
    <sec id="sec-6">
      <title>Classifier setup</title>
      <p>
        We used the libSVM implementation of SVMs (Support Vector Machines) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
with probabilistic output [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] as classifiers. Each of the SVMs was trained
with a χ<sup>2</sup> kernel. To solve the multi-class classification
problem, we employ the one-vs-all approach. Namely, we build a binary classifier for each modality/class:
the examples associated with that class are labeled positive and the remaining
examples are labeled negative. This results in an imbalanced ratio of positive
versus negative training examples. We resolve this issue by adjusting the weights
of the positive and negative class [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In particular, we set the weight of the
positive class to (#pos + #neg) / #pos and the weight of the negative class to (#pos + #neg) / #neg,
with #pos the number of positive instances in the train set and #neg the number
of negative instances. We also optimize the cost parameter C of the SVMs using
an automated parameter search procedure [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. For the parameter optimization,
we used the dataset from 2012. After finding the optimal C value, the SVM is
trained on the 2013 set of training images.
      </p>
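      <p>The class weights and the kernel can be illustrated with a short sketch. The weight formula follows the text above. The kernel form exp(-gamma * sum((x - y)^2 / (x + y))) is one common definition of the χ<sup>2</sup> kernel and is an assumption here, since the exact variant and the value of gamma are not given.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def class_weights(n_pos, n_neg):
    """Weights for the positive and negative class of one one-vs-all problem."""
    total = n_pos + n_neg
    return total / n_pos, total / n_neg


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel for non-negative histograms (assumed form)."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)


w_pos, w_neg = class_weights(n_pos=100, n_neg=900)
print(w_pos, w_neg)
print(chi2_kernel([0.5, 0.5], [0.5, 0.5]), chi2_kernel([1.0, 0.0], [0.0, 1.0]))
```
]]></preformat>
      <p>The rarer positive class receives the larger weight, and identical histograms give a kernel value of 1.</p>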
    </sec>
    <sec id="sec-7">
      <title>Results and discussion</title>
      <p>In this section, we present and discuss the results obtained from the
experimental evaluation of the proposed method. First, we evaluate the
performance of the proposed method on the ImageCLEF 2012 dataset. Next,
we present the results obtained on this year's ImageCLEF 2013 dataset.</p>
      <p>
        The first three rows in Table 1 show the results of our method applied to the
ImageCLEF 2012 dataset. These results include visual, textual and mixed runs.
From the presented results, we can note the better predictive performance of
the visual run compared to the textual run. The high level feature fusion scheme
helps to increase the predictive performance. Furthermore, the presented
results show that our method achieves high accuracy.
Compared with the results of the groups that participated in the ImageCLEF
2012 medical task [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], our visual run is second best, and the textual and mixed
runs are ranked first. The mixed run, with an accuracy of 77.0, would have ranked first
in the overall ranking had we submitted it to last year's modality
classification task.
      </p>
      <p>The second three rows in Table 1 show the results of our method applied to
this year's modality classification task. These results also include visual, textual
and mixed runs. The accuracy of 78.04 obtained with the mixed run is second
best in the overall ranking. The high level feature fusion scheme increases the
predictive performance for this year's dataset as well.</p>
      <sec id="sec-7-1">
        <title>Compound figure separation</title>
        <p>Compound figures contain figures of several types; they cannot be classified into
unique classes and need to be separated before a detailed classification into
the figure types can be performed. In this work, an unsupervised technique for
compound figure separation is proposed and implemented, based on a breadth-first
search strategy that uses only visual information from the medical figures. All
pixel values in the figure are traversed, searching for enclosed regions
separated by white borders/pixels. The sensitivity of the border detection is controlled by
a threshold parameter. Regions smaller than a predefined size are discarded.</p>
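        <p>A minimal sketch of such a breadth-first traversal is shown below. It assumes a grey-level image stored as a list of rows, treats pixels at or above a threshold as white border, uses 4-connectivity, and discards regions by bounding-box area. The threshold value, the connectivity and the size criterion are illustrative assumptions, since the text does not specify them.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import deque


def separate_subfigures(image, white_threshold=250, min_area=4):
    """Return bounding boxes (top, left, bottom, right) of non-white regions.

    image: list of rows of grey values (0-255); pixels with a value at or
    above white_threshold are treated as separating border.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] >= white_threshold:
                continue
            # Breadth-first search over the connected non-white pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and image[ny][nx] < white_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Discard regions whose bounding box is too small.
            if (bottom - top + 1) * (right - left + 1) >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


# Two dark 3x3 subfigures separated by one white column.
figure = [[0, 0, 0, 255, 0, 0, 0] for _ in range(3)]
boxes = separate_subfigures(figure)
print(boxes)
```
]]></preformat>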
        <p>In some of the figures the borders separating the contained subfigures
are black; therefore, before applying our algorithm we invert such
figures. For the given test dataset our algorithm correctly classified 68.59% of the
figures.</p>
      </sec>
      <sec id="sec-7-2">
        <title>Ad-hoc image retrieval</title>
        <p>In this section, we give an overview of the application of our methods to
ad-hoc medical image retrieval and present the results of our submitted runs. We
participated only in the textual retrieval.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Proposed approach</title>
      <p>The approach uses the surrounding text of an image, i.e., the image caption and
the title of the medical article in which the image is referenced. It combines word-space
and concept-space approaches with the goal of achieving better overall retrieval
performance.</p>
      <p>The word-space component indexes and retrieves the surrounding text of the
medical images in a traditional way. The surrounding text of the medical images
is first preprocessed by removing stop words and applying stemming, and a
standard inverted index is created. In the retrieval phase, the system preprocesses
the query in the same way, applying stop-word removal and stemming.
Weighting models are applied to calculate the relevance score of every
medical article with respect to the given query. Once the scores are calculated, the
documents are sorted and returned.</p>
      <p>
        The concept-space component analyzes the text in terms of the
medical concepts it contains. The first step is to map the surrounding text of the medical
images to medical concepts. The mapping can be done using a variety of toolkits,
services or libraries such as Metamap [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Meshup [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], etc. The problem with this approach
lies in the way documents are indexed and then evaluated in the retrieval
phase with respect to queries. Classical information retrieval models, directly or
indirectly, depend on the number of terms that the document and query share
to compute the relevance score [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, the number of terms that a query and
a document share in the word-space can be very different from the number they share in the concept-space.
For example, if a query and a document share one term "x-ray" in word-space,
they can share up to six terms in concept-space [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. On the other hand, if they
share a phrase of two terms "lung x-ray" in word-space, then they will share
only one term in concept-space.
      </p>
      <p>
        The results from both components are then normalized and passed to a fusion
component (the diagram is shown in Figure 3). The fusion component can use any of the known
strategies for late fusion [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In this study, we used a simple linear combination
of the normalized results.
      </p>
      <p>[Figure 3: diagram of the proposed approach. The text data (image captions / medical articles) and the text query are processed by two branches: a word-space branch (preprocessing with stemming and stop-word removal, indexing and retrieval, normalization of the word-space results) and a concept-space branch (mapping text to concepts, indexing and retrieval, normalization of the concept-space results). The normalized results are passed to the fusion component, which produces the final results.]</p>
      <sec id="sec-8-10">
        <p>
For the word-space approach, Terrier IR [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] is used as the search engine. In the
preprocessing stage, the Porter stemmer [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] and stop-word removal are applied. In the
retrieval phase, several weighting models were evaluated: PL2 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], BM25 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ],
BB2 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], DFR-BM25 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], TF-IDF [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], and DirichletLM [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. An additional experiment
was performed with query expansion on the best performing model to test its
maximum performance.
        </p>
        <p>
          The concept-space approach requires a mapping mechanism to match the
text data to medical concepts. In this approach, Metamap is used as the mapping
tool and the extracted medical concepts are UMLS [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] concepts. The mapping
is performed only on the surrounding text of the medical images. After the
concepts are extracted, a new artificial text is generated containing only the UMLS
concepts. The same process is repeated for the queries. Once the artificial text is
constructed, it is passed to the search engine for indexing. Terrier IR indexes the
artificial text with no additional preprocessing (no stemming and no stop-word
removal). The retrieval is performed by passing the artificial queries to the search
engine. In this phase, the same weighting models are applied as in the word-space
approach. Essentially, the concept-space approach can be viewed as a word-space
approach with more complex preprocessing.
        </p>
        <p>
          Before the fusion phase, the results from the word-space and concept-space
retrieval are normalized using min-max normalization [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. The normalized results are
then passed to the fusion component, which applies linear combination. This
kind of fusion provides modularity and control over the extent to which each component
influences the final result.
        </p>
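        <p>The normalization and fusion steps can be sketched as follows. The sketch assumes each component returns a non-empty mapping from document identifier to retrieval score, that a document missing from one result list contributes a score of 0 for that component, and that a single weight alpha balances the two components. These choices, the value of alpha and the document identifiers are illustrative assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def min_max(scores):
    """Rescale the scores of one (non-empty) result list to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_fusion(word_scores, concept_scores, alpha=0.5):
    """Rank documents by alpha * word-space + (1 - alpha) * concept-space score."""
    word, concept = min_max(word_scores), min_max(concept_scores)
    docs = set(word) | set(concept)
    fused = {
        d: alpha * word.get(d, 0.0) + (1 - alpha) * concept.get(d, 0.0)
        for d in docs
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores returned by the two retrieval components.
word_space = {"img1": 12.0, "img2": 8.0, "img3": 4.0}
concept_space = {"img2": 3.0, "img3": 1.0, "img4": 2.0}
ranking = linear_fusion(word_space, concept_space, alpha=0.7)
print(ranking)
```
]]></preformat>
        <p>Changing alpha shifts the influence between the two components without touching either retrieval pipeline, which is the modularity referred to above.</p>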
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Evaluating on ImageCLEF 2012</title>
      <p>The proposed framework was first evaluated on the ImageCLEF 2012 dataset.
This phase is used to find the optimal weighting models and appropriate
parameters. The results of the word-space assessment are shown in Table 2. The
results show that the BM25 model provides the best performance for the
word-space retrieval. An additional experiment was performed with the best model by
assigning weights to key words in the queries using the Terrier query language (for
example, words such as "MRI" and "CT" are given a weight of 1.5). The results for the
experiment with the word weights (BM25-ww) show an increase in performance.</p>
      <p>The results of the concept-space assessment are shown in Table 3. In this
case the best results are obtained with the DirichletLM model.</p>
      <p>The results of the mixed assessment are shown in Table 4. The mixed
assessment consists of two experiments. The first one combines the
best word-space and concept-space approaches. The second one
combines the word-space approach with word weights and the concept-space approach.</p>
      <p>Based on the results obtained from the experiments over the ImageCLEF 2012
dataset, the runs for the ImageCLEF 2013 ad-hoc retrieval task were
submitted. The experiments were then run on the ImageCLEF 2013 data, and we
submitted only the results of the best performing techniques. For word-space
text-based retrieval we submitted the run using the BM25 weighting model with word
weights, and for concept-space text-based retrieval we submitted the run
using the DirichletLM weighting model. Finally, for the mixed retrieval we submitted
the linear combination of the two previous runs. The results of our runs on
ImageCLEF 2013 are presented in Table 5.</p>
      <p>In this section, we give an overview of the application of our methods to
case-based retrieval and present the results of our submitted runs. We participated
only in the textual retrieval of the cases.</p>
      <p>The proposed approach for this task is similar to that for the ad-hoc retrieval task, with
the difference that in this case the retrieval unit is a medical article, not an image.</p>
      <p>The approach combines the word-space and concept-space components, just as with the
ad-hoc retrieval. For the word-space component, we index the entire text of the
medical articles, which includes the title, abstract, article text and captions of the
images in the article (we refer to this as "fulltext"). The indexing and retrieval
are done using Terrier IR, and several weighting models are applied to analyze their
performance for this type of task. For the concept-space component, only the title
and abstract of the medical article are used for extraction of medical concepts.
The tool for medical concept extraction is Metamap and the extracted results
are UMLS concepts. The rest of the process for the concept-space approach is
identical to the concept-space ad-hoc retrieval. The final result is obtained by
late fusion of both components using linear combination.</p>
    </sec>
    <sec id="sec-10">
      <title>Evaluating on ImageCLEF 2012</title>
      <p>The proposed framework was again evaluated on the ImageCLEF 2012 dataset.
The results of the word-space assessment are shown in Table 6. The results
show that the BM25 model provides the best performance for the word-space
case-based retrieval. An additional experiment was performed with the best
model by adding query expansion. The results for the experiment with
query expansion (BM25-qe) show that query expansion increases retrieval
performance by roughly 4%.</p>
      <p>The results of the concept-space assessment are shown in Table 7. In this
case the best results are obtained with the DirichletLM model. An additional
experiment was performed using query expansion on the best performing model,
which provides an improvement of roughly 2%.</p>
      <p>The results of the mixed assessment are shown in Table 8. The mixed
assessment consists of two experiments. The first one combines the
best word-space and concept-space approaches. The second one
combines the word-space and concept-space approaches, both with added
query expansion.</p>
      <p>Using the models and optimal parameters learned from the experiments over the
ImageCLEF 2012 dataset, the experiments over the ImageCLEF 2013 dataset
were performed. The best results were obtained with the mixed
experiment using query expansion.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fushman</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of the imageclef 2013 medical tasks</article-title>
          .
          <source>In: Working notes of CLEF</source>
          <year>2013</year>
          .
          <article-title>(</article-title>
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dimitrovski</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loskovska</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Content-based retrieval system for X-ray images</article-title>
          .
          <source>In: International Congress on Image and Signal Processing</source>
          . (
          <year>2009</year>
          )
          <fpage>2236</fpage>
          –
          <lpage>2240</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Tommasi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orabona</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caputo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Discriminative cue integration for medical image annotation</article-title>
          .
          <source>Pattern Recognition Letters</source>
          <volume>29</volume>
          (
          <issue>15</issue>
          ) (
          <year>2008</year>
          )
          <fpage>1996</fpage>
          –
          <lpage>2002</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marszalek</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lazebnik</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Local features and kernels for classification of texture and object categories: A comprehensive study</article-title>
          .
          <source>International Journal of Computer Vision</source>
          <volume>73</volume>
          (
          <issue>2</issue>
          ) (
          <year>2007</year>
          )
          <fpage>213</fpage>
          –
          <lpage>238</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          :
          <article-title>Distinctive image features from scale-invariant keypoints</article-title>
          .
          <source>International Journal of Computer Vision</source>
          <volume>60</volume>
          (
          <issue>2</issue>
          ) (
          <year>2004</year>
          )
          <fpage>91</fpage>
          –
          <lpage>110</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>van de Sande</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gevers</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snoek</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Evaluating color descriptors for object and scene recognition</article-title>
          .
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>32</volume>
          (
          <issue>9</issue>
          ) (
          <year>2010</year>
          )
          <fpage>1582</fpage>
          –
          <lpage>1596</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>van Gemert</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veenman</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smeulders</surname>
            ,
            <given-names>A.W.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geusebroek</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          :
          <article-title>Visual word ambiguity</article-title>
          .
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>99</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lazebnik</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmid</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponce</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories</article-title>
          .
          <source>In: IEEE conference on Computer Vision and Pattern Recognition</source>
          . (
          <year>2006</year>
          )
          <fpage>2169</fpage>
          –
          <lpage>2178</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plachouras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Terrier information retrieval platform</article-title>
          .
          <source>In: Advances in Information Retrieval</source>
          , Springer (
          <year>2005</year>
          )
          <fpage>517</fpage>
          –
          <lpage>519</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Tommasi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caputo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Welter</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Guld,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Deserno</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>Overview of the CLEF 2009 medical image annotation track</article-title>
          .
          <source>In: Multilingual Information Access Evaluation II. Multimedia Experiments – LNCS 6242</source>
          , Springer Berlin/Heidelberg (
          <year>2010</year>
          )
          <fpage>85</fpage>
          –
          <lpage>93</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>LIBSVM: a library for support vector machines</article-title>
          . (
          <year>2001</year>
          ) Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>H.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weng</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          :
          <article-title>A note on Platt's probabilistic outputs for support vector machines</article-title>
          .
          <source>Machine Learning</source>
          <volume>68</volume>
          (
          <year>2007</year>
          )
          <fpage>267</fpage>
          –
          <lpage>276</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. Muller, H.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eggel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Overview of the ImageCLEF 2012 medical image retrieval and classification tasks</article-title>
          . In: CLEF (Online Working Notes/Labs/Workshop). (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          :
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</article-title>
          .
          <source>In: Proceedings of the AMIA Symposium</source>
          , American Medical Informatics Association (
          <year>2001</year>
          )
          <fpage>17</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Trieschnigg</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pezik</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Jong</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kraaij</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rebholz-Schuhmann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>MeSH Up: effective MeSH text classification for improved document retrieval</article-title>
          .
          <source>Bioinformatics</source>
          <volume>25</volume>
          (
          <issue>11</issue>
          ) (
          <year>2009</year>
          )
          <fpage>1412</fpage>
          –
          <lpage>1418</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Abdulahhad</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chevallet</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berrut</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , et al.:
          <article-title>MRIM at ImageCLEF2012. From words to concepts: A new counting approach</article-title>
          .
          <source>In: Notebook Papers of Labs and Workshop</source>
          (CLEF).
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. Muller, H.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fushman</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eggel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Overview of the ImageCLEF 2012 medical image retrieval and classification tasks</article-title>
          .
          <source>Working Notes of CLEF</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plachouras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lioma</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>University of Glasgow at WebCLEF 2005: Experiments in per-field normalisation and language specific stemming</article-title>
          .
          <source>In: Accessing Multilingual Information Repositories</source>
          . Springer (
          <year>2006</year>
          )
          <fpage>898</fpage>
          –
          <lpage>907</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Rijsbergen</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>Probabilistic models of information retrieval based on measuring the divergence from randomness</article-title>
          .
          <source>ACM Transactions on Information Systems (TOIS)</source>
          <volume>20</volume>
          (
          <issue>4</issue>
          ) (
          <year>2002</year>
          )
          <fpage>357</fpage>
          –
          <lpage>389</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Hiemstra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>A probabilistic justification for using tf×idf term weighting in information retrieval</article-title>
          .
          <source>International Journal on Digital Libraries</source>
          <volume>3</volume>
          (
          <issue>2</issue>
          ) (
          <year>2000</year>
          )
          <fpage>131</fpage>
          –
          <lpage>139</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A study of smoothing methods for language models applied to information retrieval</article-title>
          .
          <source>ACM Transactions on Information Systems (TOIS)</source>
          <volume>22</volume>
          (
          <issue>2</issue>
          ) (
          <year>2004</year>
          )
          <fpage>179</fpage>
          –
          <lpage>214</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nandakumar</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ross</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Score normalization in multimodal biometric systems</article-title>
          .
          <source>Pattern Recognition</source>
          <volume>38</volume>
          (
          <issue>12</issue>
          ) (
          <year>2005</year>
          )
          <fpage>2270</fpage>
          –
          <lpage>2285</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>