<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Fast filtering techniques in medical image classification and retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xin Zhou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miaofei Han</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yanli Song</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qiang Li</string-name>
          <email>liqiang@sari.ac.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Medical Image Information Laboratory, Advanced Medical Equipment Research Center, Shanghai Advanced Research Institute, Chinese Academy of Sciences</institution>
          ,
          <addr-line>99 Haike Road, Building No. 3, Pudong, Shanghai 201210</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This article presents the participation of the MIILab (Medical Image Information Laboratory) group in ImageCLEFmed 2013. ImageCLEFmed 2013 comprises three tasks: modality classification, image retrieval, and compound-figure separation. In line with MIILab's research interests, we targeted image modality classification and medical image retrieval. The main goal was to perform a feasibility test of applying existing techniques to new applications, such as applying image denoising techniques to image retrieval and classification. Both global and local features were employed. Fast filtering techniques were used to obtain global features for color, shape, and texture; these global features serve to pre-classify images. Both low-level and high-level local features were extracted, and a bag-of-features model was used to build the final feature vector. Both kNN and SVM classifiers were tried in the modality classification task, and reciprocal kNN was used for result fusion in the image retrieval task. The modality classification task was decomposed into a compound image classification and a non-compound image classification. Our approaches achieved 89.9% classification accuracy on training data and 85.1% on test data for compound image classification. For non-compound image classification, accuracy was around 68.3% on training data and 67.7% on test data. The overall classification accuracy was around 65% over 31 classes. False alarms came mainly from large classes such as compound images (312), X-ray images (101), and organ photos (63), but the per-class accuracy shows that the performance bottleneck also stems from small and medium classes with large content diversity, such as statistic figures, chemical structures, and 3D images. The best result was around 80% (IBM research lab). For the image retrieval task, one baseline using SURFContext+BoF was submitted, with a corresponding MAP (mean average precision) of 0.0086. Even the best visual retrieval run obtained a MAP of only 0.018, which is still not comparable with textual approaches (average score 0.2).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The Cross Language Evaluation Forum (CLEF) organizes campaigns for the
evaluation of information retrieval systems each year. The image retrieval track of
CLEF is called ImageCLEF (http://www.imageclef.org/). ImageCLEFmed has been the part of ImageCLEF
focusing on medical images [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] since 2004. More about the ImageCLEF tasks and
results in 2013 can be found in [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>MIILab (Medical Image Information Laboratory) is a medical imaging group
whose research focuses on medical imaging technology, particularly technology tightly
related to advanced medical imaging equipment. This is the first year that MIILab
participates in the medical imaging tasks of ImageCLEF (ImageCLEFmed). The
main objective is to perform a feasibility test of applying existing techniques to
new applications, such as applying image denoising techniques to image retrieval
and classification.</p>
      <p>
        Two tasks of ImageCLEFmed 2013 fit MIILab’s research interests: image
modality classification and medical image retrieval. The fundamental difficulty
of the ImageCLEFmed tasks is the diversity of image content: the
collection contains not only biomedical images, but also figures,
mathematical formulas, tables, non–clinical photos, etc. Such diversity of content
generates numerous types of similarity that are difficult to cover in feature
space. In this paper, we propose to use fast filtering techniques (monochrome
filter, Laplace filter, and line/dot filter [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) and BoBB (bags of bounding boxes) to
address this problem.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>This section describes the basic techniques and collections used by MIILab in
ImageCLEFmed 2013. All runs submitted to ImageCLEFmed 2013 are purely
visual–based. The techniques used can be divided into: 1) pre–processing, 2)
features, 3) classifiers, and 4) fusion strategies.</p>
      <sec id="sec-2-1">
        <title>Techniques used</title>
        <p>Pre–processing Images in the ImageCLEFmed 2013 collections come from journal
papers and were mainly converted into JPEG (Joint Photographic Experts
Group) format. Certain monochrome images became color images after the
conversion, while others changed their dynamic range. Hence, images are no longer
directly comparable with each other, which introduces additional complications to the task.
Normalization is therefore required before any other operation takes place, and
all images are converted into a unified range.</p>
        <p>
          Another issue is the color space. By default, JPEG uses the RGB (red, green,
blue) color space. However, the RGB color space is well known for not corresponding
closely to human color perception and is thus less used in image
retrieval and classification [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. On the other hand, it has also been shown that
the HSV (Hue, Saturation, Value) color space can give promising results in image
retrieval [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ].
        </p>
        <p>Therefore, the pre–processing step consists of two parts: HSV (Hue,
Saturation, Value) color space decomposition and a 0–255 normalization of each
channel.</p>
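<p>The pre–processing above can be sketched as follows. This is a minimal illustration, assuming NumPy arrays and the standard-library colorsys conversion; the function names are ours, not part of the paper's implementation.</p>

```python
import colorsys

import numpy as np


def normalize_channel(channel: np.ndarray) -> np.ndarray:
    """Min-max normalize one channel to the unified 0-255 range."""
    lo, hi = channel.min(), channel.max()
    if hi == lo:  # flat channel: map everything to 0
        return np.zeros_like(channel, dtype=np.uint8)
    return ((channel - lo) * 255.0 / (hi - lo)).astype(np.uint8)


def preprocess(rgb: np.ndarray) -> np.ndarray:
    """Decompose an RGB image (H x W x 3, values in 0-255) into HSV
    and normalize each channel to 0-255 independently."""
    # colorsys works on scalars in [0, 1], so convert pixel by pixel
    flat = rgb.reshape(-1, 3) / 255.0
    hsv = np.array([colorsys.rgb_to_hsv(*px) for px in flat]).reshape(rgb.shape)
    return np.dstack([normalize_channel(hsv[..., c]) for c in range(3)])
```

<p>After this step all channels share the same dynamic range, so images converted differently to JPEG become comparable again.</p>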
        <p>
          Features In this paper, both global and local features were employed.
Fast filtering techniques, such as the monochrome filter and the line/dot filter [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], were
applied to obtain global features for color, shape and texture. First to third
statistical moments were also extracted in certain subregions as low–level local features.
SURFContext [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] (an early fusion of SURF [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and Shape Context [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ])
was implemented to obtain high–level local features. The number of visual
keywords was set to 3000. The classical BoF (bags of features) approach was used
to form the final feature vector.
        </p>
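<p>The BoF step above reduces to assigning each local descriptor to its nearest visual word and histogramming the assignments. A minimal sketch with NumPy, using a tiny codebook for illustration (the paper uses 3000 visual words); the function name is ours.</p>

```python
import numpy as np


def bof_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Assign each local descriptor to its nearest visual word and
    return the L1-normalized word histogram (the BoF vector)."""
    # pairwise squared distances: |d|^2 - 2 d.c + |c|^2
    d2 = (
        (descriptors ** 2).sum(1)[:, None]
        - 2.0 * descriptors @ codebook.T
        + (codebook ** 2).sum(1)[None, :]
    )
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

<p>The resulting fixed-length vector (length 3000 in the paper) is what the kNN and SVM classifiers operate on, regardless of how many local features each image produced.</p>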
        <p>Classifiers Classifiers were only used in the modality classification task. Both kNN
(k–Nearest Neighbors) and SVM (Support Vector Machine) classifiers were
employed, but the latter significantly outperformed the former. Therefore, only
SVM classifier–based runs were submitted for evaluation.</p>
        <p>
          Two–level classification is performed: compound image classification and
modality classification. The former uses only global features and low–level
local features (first to third statistical moments, surface, ratio) of all detectable
bounding boxes. The bounding box detection is based on an open source toolbox and
the final feature space is built with BoBB. The latter is based on high–level
local features with the classical BoF approach. Open source machine learning libraries
(Weka, LIBSVM) were used, with a grid search for parameter selection.
Fusion Techniques Fusion techniques were used in the image retrieval task to
combine result lists of different techniques. Classical fusion strategies, such as
combSUM, combMNZ and combSUM(2)MAX, were used with a score normalization
based on a rank–based logarithmic weighting function [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Recently, a
reciprocal kNN based fusion technique was reported to obtain promising results
in natural image classification and retrieval [12]. One of the advantages of
this technique is that no score normalization is required. This technique is
implemented and labeled as combRKNN in this paper.
        </p>
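<p>The two fusion families above can be sketched side by side. This is an illustrative assumption of their general shape, not the paper's implementation: combSUM with a logarithmic rank weight standing in for the score normalization, and a reciprocal-rank-style variant (in the spirit of combRKNN) where only ranks are needed; the constant k is an arbitrary smoothing choice.</p>

```python
import math
from collections import defaultdict


def comb_sum(ranked_lists):
    """combSUM-style fusion: sum a logarithmic rank-based weight for
    each document over all result lists, then re-rank."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / math.log2(rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


def comb_rknn(ranked_lists, k=60):
    """Reciprocal-rank-style fusion: no score normalization needed,
    only the rank of each document in each list."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

<p>The reciprocal variant's advantage is visible in the code: it never touches the raw similarity scores, so heterogeneous runs can be fused directly.</p>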
      </sec>
      <sec id="sec-2-2">
        <title>Image Collections</title>
        <p>
          A total of 306,538 medical images were available for ImageCLEFmed 2013. Among them,
2,905 images labeled with modality information were provided as
training data, and 2,582 images without modality information were provided as test
data for the modality classification task. Details about the setup and collections of
the ImageCLEFmed tasks can be found in the overview paper [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>This section describes our results for the two medical tasks.</p>
      <sec id="sec-3-1">
        <title>Modality Classification</title>
        <p>For the modality classification task, evaluation is based on the percentage of
correctly classified images. For runs of the various technical approaches (textual,
visual, mixed), the best and average accuracies, together with our results,
are shown in Table 1.</p>
        <p>The average accuracy per class is shown in Table 2. This year we treated
compound image classification as an independent step: once an image was not
classified as a compound image, it was passed to a non–compound classification
over the remaining 30 classes.</p>
        <p>The overall classification accuracy is around 65% over 31 classes. False alarms
come mainly from large classes, such as compound images (COMP) and X–ray images
(DRXR). In fact, the accuracies of COMP and DRXR are around 68%,
which limits the overall performance. Certain other classes obtained scores
below the overall accuracy and also became performance bottlenecks. These classes
usually contain images of large diversity, such as organ photos (DVOR), non–
clinical photos (GNCP), statistic figures (GFIG), tables and forms (GTAB), and
3D images (D3DR). Previous experience shows that compound image
classification is of key importance for modality classification. Two characteristics make
compound image classification difficult: 1) compound images share the same image
content with the 30 other image classes; 2) about 40% of the whole collection are
compound images. Hence, a large percentage of incorrectly classified images come
from the class of compound images.</p>
        <p>Through experiments, we found that high–level semantic features can hardly
provide useful information for compound image classification. However, visual
perception can easily make the distinction: as a compound image consists of
several unconnected figures, invisible bounding boxes are produced in the human
brain for each figure. Based on the connected region detection toolbox in Matlab,
the 10 biggest non–overlapping bounding boxes were extracted from the
value channel of the image, and the 5 biggest non–overlapping bounding boxes were
extracted from each of the hue and saturation channels. The whole image was
always considered as a bounding box. For each bounding box, the height, width,
ratio, surface, and first to third statistical moments (mean, variance, skewness) were
used to build a descriptor. Hence, 7 features per bounding box were extracted
from 21 bounding boxes in total, so the final dimension of the feature space is 147.
We call this the BoBB–based approach. As compound image classification
is a one–to–one classification problem, only the SVM classifier was applied on the
BoBB feature space. A grid search was used for parameter
selection. Figure 3.1 shows the classification performance obtained by various
BoBB strategies. Even when color information was discarded, applying BoBB to the gray–
level image (the value channel) already obtained around 85% accuracy.
Adding color features (approach 2), removing background (approach 3), and
enhancing connectivity (approach 4) each improved the performance. Adding color or
tensor features makes the classifier more robust. Removing background eliminated
false alarms and achieved the best improvement. Combining all these features,
the best BoBB–based approach achieved 89.0% classification accuracy on
training data and 85.1% on testing data using an SVM classifier.</p>
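<p>The 7-dimensional per-box descriptor above (which, stacked over 21 boxes, gives the 147-dimensional BoBB space) can be sketched directly; the function name and NumPy usage are our illustrative assumptions, not the Matlab implementation.</p>

```python
import numpy as np


def box_descriptor(patch: np.ndarray) -> list:
    """7-dimensional descriptor for one bounding box: height, width,
    aspect ratio, surface, and the first three statistical moments
    (mean, variance, skewness) of the pixel values."""
    h, w = patch.shape
    vals = patch.astype(float).ravel()
    mean = vals.mean()
    var = vals.var()
    std = vals.std()
    skew = ((vals - mean) ** 3).mean() / std ** 3 if std > 0 else 0.0
    return [h, w, h / w, h * w, mean, var, skew]
```

<p>Concatenating this descriptor over the 21 detected boxes (10 from the value channel, 5 each from hue and saturation, plus the whole image) yields the 147-dimensional BoBB feature vector fed to the SVM.</p>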
        <p>For non–compound classification, the SURFContext + BoF approach was used.
Both SVM classifiers and classical kNN classifiers were tested on the BoF feature
space. However, experiments on training data showed that SVM classifiers
significantly outperformed kNN classifiers (69.1% vs 48.1%). Therefore, only two
runs using SVM classifiers were submitted to the modality classification task.
Accuracy by class on training data shows that most diagnostic image classes obtained
over 80% accuracy; on test data, however, the scores are around 70%, which shows the
performance of this approach is not stable. Classes with large content diversity
consistently obtained poor performance; further improvement is required for
these classes.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Image–based Retrieval</title>
        <p>For the image–based retrieval task, mean average precision (MAP), binary
preference (Bpref), and early precision (P10, P30) are shown as measures. Only one
baseline (sari SURFContext HI baseline) was submitted, and the
corresponding MAP is 0.0086. Further tests on fusion strategies failed to improve the
performance, because the runs to be fused themselves contain too few relevant
results. The textual runs still largely outperform the visual runs in both best
score and average score. Results of our run and the best runs of various natures are
shown in Table 3.</p>
        <p>Applying fast filtering techniques to reduce the dimension and diversity of the data
was also tried out. As most query images are diagnostic images, the goal is to reject
those images that are completely unrelated to the query image.
Online querying requires quick indexing and search; the line/dot filter, Laplace filter,
and monochrome filter can each process one image within 0.1 second. We used them to
reject unwanted images from the huge database in the order shown in
Table 4. In our experiments, we define background regions as those where the standard deviation is
smaller than a threshold t; with all images normalized to 0–255, t was set
to 5. Using the line filter, the area of line–like structures can be obtained. If the
area of the whole image equals the sum of the area of line–like structures and the
area of background, the image is labeled as "all line". All–line rejection aims to reject
images containing only curves and lines. The risk of this filter is that it also rejects
diagnostic printed signals, such as electrocardiography. However, this is
in general a safe solution, as query images are mainly diagnostic images.</p>
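<p>The background definition above (standard deviation below t = 5 on 0–255 images) can be sketched as a block-wise test; the block size of 8 is an illustrative assumption, as the paper does not specify how regions are delimited.</p>

```python
import numpy as np


def flat_region_mask(img: np.ndarray, block: int = 8, t: float = 5.0) -> np.ndarray:
    """Mark block-sized regions whose standard deviation falls below t,
    i.e. candidate background regions on a 0-255 normalized image."""
    h, w = img.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            patch = img[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mask[i, j] = patch.std() < t
    return mask
```

<p>Comparing the background area from this mask with the line-filter output gives the "all line" decision described above.</p>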
        <p>Figure rejection aims to reject man–made images, such as flowcharts,
system overviews, and screenshots. The Laplace filter was applied, and regions where
the Laplacian equals 0 were labeled as "even" regions. If the area of the whole image
equals the sum of the areas of line–like structures, even regions, and
background, the image is labeled as a "figure".</p>
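<p>The "even" region test can be illustrated with a discrete Laplacian; the 4-neighbor stencil below is our assumption of the filter's form, and the function measures what fraction of the image it would label even.</p>

```python
import numpy as np


def even_region_fraction(img: np.ndarray) -> float:
    """Fraction of interior pixels where a 4-neighbor discrete Laplacian
    is exactly zero, i.e. locally 'even' (flat or linear) regions."""
    f = img.astype(float)
    lap = (
        f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
        - 4.0 * f[1:-1, 1:-1]
    )
    return float((lap == 0).mean())
```

<p>Man-made figures are dominated by such flat regions plus line-like structures, which is what the figure-rejection rule above exploits.</p>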
        <p>In the third round, images containing only text, such as program listings, DNA
sequences, tables and formulas, are rejected. This step is called "all text
rejection". An orientation histogram was calculated on the line–like structures, and
images with high repeatability in the orientation histogram were rejected.</p>
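<p>A minimal sketch of the orientation-histogram test: line segments are binned by angle, and a strongly peaked histogram (text is dominated by one stroke direction) triggers rejection. The bin count and peak ratio are illustrative assumptions, not the paper's values.</p>

```python
import math
from collections import Counter


def orientation_histogram(segments, bins: int = 18):
    """Histogram of line-segment orientations in [0, 180) degrees."""
    hist = Counter()
    for (x0, y0), (x1, y1) in segments:
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
        hist[int(angle // (180.0 / bins))] += 1
    return hist


def is_text_like(segments, peak_ratio: float = 0.7) -> bool:
    """Reject when one orientation bin dominates (high repeatability),
    as with the parallel horizontal strokes of rendered text."""
    hist = orientation_histogram(segments)
    total = sum(hist.values())
    return total > 0 and max(hist.values()) / total >= peak_ratio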
        <p>Finally, only 92,785 images were left for retrieval; 66% of the data were quickly
rejected. A comparison was made between these 92,785 images and the qrel file.
Of the 18,961 distinct image ids in the qrel file, 17,415 (about
92%) are found among the 92,785 images, and no image labeled as "relevant" was lost.
In other words, the fast filtering techniques largely reduced the diversity of content
but had little negative impact on image retrieval performance. This approach
can help reduce the quantity of data for all groups.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Conclusions</title>
        <p>This paper describes the techniques used and the results obtained by the MIILab
group in ImageCLEFmed 2013. Two tasks were targeted: modality classification
and medical image retrieval. The techniques used by MIILab are purely visual–
based. The classical BoF approach based on SURFContext was used as a baseline,
and several variants were tried to improve the performance. The bags of bounding
boxes–based approach proved useful for compound image classification.
Fast filtering techniques were applied to reduce the scale of the data and proved
to be a safe strategy. For both the modality classification and image retrieval tasks,
MIILab ranked in the middle among all groups. As this is the first year that
MIILab participates in ImageCLEFmed, there is still room for performance
improvement.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Clough</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Müller, H.,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The CLEF cross-language image retrieval track (ImageCLEF) 2004</article-title>
          . In Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Clough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Kluck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Magnini</surname>
          </string-name>
          , B., eds.:
          <article-title>Multilingual Information Access for Text, Speech and Images: Result of the fifth CLEF evaluation campaign</article-title>
          . Volume
          <volume>3491</volume>
          of Lecture Notes in Computer Science (LNCS).,
          <string-name>
            <surname>Bath</surname>
          </string-name>
          , UK, Springer (
          <year>2005</year>
          )
          <fpage>597</fpage>
          -
          <lpage>613</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Garcia Seco de Herrera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Müller, H.:
          <article-title>Overview of the imageclef 2013 medical tasks</article-title>
          .
          <source>In: Working notes of CLEF</source>
          <year>2013</year>
          . (
          <year>September 2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Caputo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Müller, H.,
          <string-name>
            <surname>Thomee</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paredes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zellhofer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeau</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Martinez</surname>
          </string-name>
          <string-name>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Garcia</surname>
          </string-name>
          <string-name>
            <surname>Varea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Cazorla</surname>
          </string-name>
          , M., eds.:
          <source>ImageCLEF</source>
          <year>2013</year>
          :
          <article-title>the vision, the data and the open challenges</article-title>
          . In Caputo,
          <string-name>
            <given-names>B.</given-names>
            , Müller, H.,
            <surname>Thomee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Villegas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Paredes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Zellhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Goeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Joly</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Martinez</surname>
          </string-name>
          <string-name>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Garcia</surname>
          </string-name>
          <string-name>
            <surname>Varea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Cazorla</surname>
          </string-name>
          , M., eds.
          <source>: Proceedings of CLEF 2013. Lecture Notes in Computer Science (LNCS)</source>
          , Valencia, Spain, Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sone</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doi</surname>
          </string-name>
          , K.:
          <article-title>Selective enhancement filters for nodules, vessels, and airway walls in two- and three-dimensional ct scans</article-title>
          .
          <source>Medical Physics</source>
          <volume>30</volume>
          (
          <issue>8</issue>
          ) (
          <year>2003</year>
          )
          <fpage>2040</fpage>
          -
          <lpage>2051</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Müller, H.,
          <string-name>
            <surname>Michoux</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bandon</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geissbuhler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A review of content-based image retrieval systems in medicine-clinical benefits and future directions</article-title>
          .
          <source>International Journal of Medical Informatics</source>
          <volume>73</volume>
          (
          <issue>1</issue>
          ) (
          <year>2004</year>
          )
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.F.</given-names>
          </string-name>
          :
          <article-title>VisualSEEk: a fully automated content-based image query system</article-title>
          .
          <source>In: The Fourth ACM International Multimedia Conference and Exhibition</source>
          , Boston, MA, USA (November
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Squire</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          , Müller, W., Müller, H.,
          <string-name>
            <surname>Pun</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Content-based query of image databases: inspirations from text retrieval</article-title>
          .
          <source>Pattern Recognition Letters (Selected Papers from The 11th Scandinavian Conference on Image Analysis SCIA '99)</source>
          <volume>21</volume>
          (
          <fpage>13</fpage>
          -
          <lpage>14</lpage>
          ) (
          <year>2000</year>
          )
          <fpage>1193</fpage>
          -1198
          <string-name>
            <given-names>B.K.</given-names>
            <surname>Ersboll</surname>
          </string-name>
          , P. Johansen, Eds.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Grid-based Medical Image Retrieval Using Local Features</article-title>
          .
          <source>PhD thesis</source>
          , University of Geneva (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Bay</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ess</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuytelaars</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gool</surname>
            ,
            <given-names>L.V.</given-names>
          </string-name>
          :
          <article-title>Speeded-up robust features (surf)</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          <volume>110</volume>
          (
          <issue>3</issue>
          ) (
          <year>2008</year>
          )
          <fpage>346</fpage>
          -
          <lpage>359</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Belongie</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenspan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malik</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Puzicha</surname>
          </string-name>
          , J.:
          <article-title>Shape matching and object recognition using shape contexts</article-title>
          .
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>24</volume>
          (
          <issue>4</issue>
          ) (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Depeursinge</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Müller, H.:
          <article-title>Information fusion for combining visual and textual image retrieval</article-title>
          .
          <source>In: 20th IEEE International Conference on Pattern Recognition (ICPR)</source>
          .
          <source>(August</source>
          <year>2010</year>
          )
          <fpage>1590</fpage>
          -
          <lpage>1593</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>12. Zhang, S., Yang, M., Cour, T., Yu, K., Metaxas, D.: Query specific fusion for image retrieval. In Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C., eds.: European Conference on Computer Vision. Lecture Notes in Computer Science. Springer Berlin Heidelberg (2012) 660-673</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>