<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>NovaSearch on medical ImageCLEF 2013</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andre Mourão</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flavio Martins</string-name>
          <email>flaviomartins@acm.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>João Magalhães</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidade Nova de Lisboa</institution>
          ,
          <addr-line>Faculdade de Ciências e Tecnologia</addr-line>
          ,
          <addr-line>Caparica</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This article presents the participation of the Center of Informatics and Information Technology (CITI) group in medical ImageCLEF 2013. This is our first participation, and we submitted runs to the modality classification task, the ad-hoc image retrieval task and the case retrieval task. We are developing a system that integrates textual and visual retrieval into a framework for multimodal retrieval. Our approach for multimodal case retrieval achieved the best global performance using Segmented (6×6 grid) Local Binary Pattern (LBP) histograms and Segmented HSV histograms for images, and Lucene with query expansion (using the top 3 results) for text. In modality classification, we achieved one of the largest MAP gains in the multimodal classification task, resulting in the third best team result.</p>
      </abstract>
      <kwd-group>
        <kwd>medical retrieval</kwd>
        <kwd>case-based retrieval</kwd>
        <kwd>multimodal fusion</kwd>
        <kwd>medical modality classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The ImageCLEF 2013 AMIA Medical task is the CLEF benchmark focused on the
retrieval and classification of medical images and articles from PubMed. This is
our first participation in ImageCLEF, and in the Medical track in particular, so
we tried to design a system able to participate in every task (excluding compound
figure separation). However, we concentrated on the case-based retrieval task and
on the fusion of result lists.</p>
      <p>
        Our system is divided into image retrieval, textual retrieval and results
fusion. For image retrieval, we focused on the top-performing features from previous
editions [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] (CEDD [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], FCTH [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Local Binary Pattern histograms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], color
histograms).
      </p>
      <p>
        For text retrieval, we used Lucene (http://lucene.apache.org/core/) from the Apache project, paired with our
implementation of the BM25L [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] ranking function for Lucene. For modality
classification, we combined text and image descriptors (early fusion) and
performed classification using Vowpal Wabbit (https://github.com/JohnLangford/vowpal_wabbit/wiki). Our training dataset included only
the images provided in the 2013 edition [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We did not perform any type of
dataset augmentation. On the case-based retrieval task, we also tested a variety
of late fusion approaches and devised a variant of Reciprocal Rank (RR)
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] that we named Inverse Square Rank (ISR). Our fusion method provided the best
performance for our case-based image and textual runs.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Techniques</title>
      <sec id="sec-2-1">
        <title>Text retrieval</title>
        <p>
          In text retrieval, article text is indexed using Lucene and retrieved using the
BM25L [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] retrieval function. We expanded the initial query with preferred and
alternative terms sourced from a SKOS formatted version of MeSH using
LuceneSKOS. This improves precision metrics at the top ranks, so we then exploit these
top documents to perform pseudo-relevance feedback.
        </p>
        <p>The indexed fields depend on the task. For image retrieval, we achieved good
results indexing and searching only the image captions, title and abstract. For
case retrieval, we indexed and searched the full text (all sections, including
image captions), abstract and title. We ran pseudo-relevance feedback using the
top 3 results retrieved with the initial query, adding a maximum of 25 new
query terms to the initial query. We discarded any candidate term that did not
appear in at least 5 documents.</p>
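        <p>The feedback step described above can be sketched as follows. This is a minimal illustration, not the Lucene implementation used in our runs: the function name, the token-list representation of documents and the ranking of candidate terms by their frequency in the feedback documents are assumptions made for the example. Only the three parameters (top 3 documents, at most 25 new terms, minimum document frequency of 5) come from the text.</p>
        <preformat>
```python
from collections import Counter


def expand_query(query_terms, ranked_docs, doc_freq,
                 top_docs=3, max_new_terms=25, min_doc_freq=5):
    """Return the query terms plus feedback terms from the top-ranked documents.

    ranked_docs: token lists of the documents retrieved by the initial query, best first.
    doc_freq: mapping term -> number of corpus documents containing the term.
    """
    # Count term occurrences in the feedback documents only.
    feedback = Counter()
    for doc in ranked_docs[:top_docs]:
        feedback.update(doc)

    seen = set(query_terms)
    candidates = [
        (term, count) for term, count in feedback.items()
        if term not in seen and doc_freq.get(term, 0) >= min_doc_freq
    ]
    # Assumption: prefer terms that are frequent in the feedback documents,
    # breaking ties alphabetically so the output is deterministic.
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))
    return list(query_terms) + [term for term, _ in candidates[:max_new_terms]]
```
        </preformat>
        <p>With a term such as "scan" occurring in only 3 corpus documents, the document-frequency filter drops it even if it appears in the feedback documents, which is the purpose of the purge described above.</p>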
      </sec>
      <sec id="sec-2-2">
        <title>Visual retrieval</title>
        <p>For visual retrieval, we extracted a set of descriptors that are effective in medical image
retrieval (CEDD, FCTH, Local Binary Pattern (LBP) histograms and color
histograms). We used these image features in pairs. For LBP and color histograms,
we extracted both descriptors on a 6×6 image grid and concatenated the results.
For CEDD and FCTH, we extracted and concatenated both feature descriptors.</p>
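        <p>As an illustration of the segmented histograms, the sketch below splits an image into a 6×6 grid and concatenates one histogram per cell and per HSV channel. It assumes an HSV image stored as a NumPy array with values scaled to [0, 1]; the function names and the choice of 16 bins are hypothetical and are not taken from our runs.</p>
        <preformat>
```python
import numpy as np


def grid_histogram(channel, grid=6, bins=16, value_range=(0.0, 1.0)):
    """Concatenate per-cell histograms of a 2-D channel split into grid x grid cells."""
    rows = np.array_split(np.arange(channel.shape[0]), grid)
    cols = np.array_split(np.arange(channel.shape[1]), grid)
    parts = []
    for r in rows:
        for c in cols:
            cell = channel[np.ix_(r, c)]
            hist, _ = np.histogram(cell, bins=bins, range=value_range)
            # Normalise by cell size so cells of different sizes are comparable.
            parts.append(hist / max(cell.size, 1))
    return np.concatenate(parts)


def segmented_hsv_descriptor(hsv_image, grid=6, bins=16):
    """Concatenate the grid histograms of the H, S and V channels."""
    return np.concatenate(
        [grid_histogram(hsv_image[..., ch], grid, bins) for ch in range(3)]
    )
```
        </preformat>
        <p>The resulting vector has 3 × 36 × bins dimensions; a segmented LBP histogram could be concatenated to it in the same way to form the paired feature.</p>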
        <p>
          CEDD (Color and Edge Directivity Descriptor) encodes color and edge
directivity information from the image in a compact descriptor (54 bytes per
image). It combines "real color" histograms from the HSV color space (e.g. bins
Light Magenta, Dark Green) with the MPEG-7 Edge Histogram Descriptor. FCTH
(Fuzzy Color and Texture Histogram) combines the same color histograms as
CEDD with texture information extracted with multiple energy wavelets using
fuzzy logic techniques. Color histograms are histograms of the individual
components of the HSV color space. Local Binary Patterns are texture descriptors
based on thresholding the neighbors of a pixel against its value to detect texture variations.
The thresholded values are combined into binary codes, whose histogram represents image
texture. The features of all images in the corpus are stored in a FLANN [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] L2
index.
        </p>
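        <p>The thresholding step behind Local Binary Patterns can be sketched as below. This is a minimal version of the basic 8-neighbour operator, assuming a grayscale image as a 2-D NumPy array of at least 3×3 pixels; the function name, the neighbour ordering and the normalised 256-bin histogram are illustrative choices and not necessarily the configuration used in the submitted runs.</p>
        <preformat>
```python
import numpy as np


def lbp_histogram(gray):
    """Normalised histogram of basic 8-neighbour LBP codes of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Each neighbour contributes one bit: 1 if it is at least as bright as the center.
        codes += (neighbour >= center).astype(np.int64) * (2 ** bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```
        </preformat>
        <p>On a perfectly flat region every neighbour passes the threshold, so all pixels receive code 255; textured regions spread their mass over many codes, which is what makes the histogram useful as a texture descriptor.</p>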
        <p>The image retrieval results are sorted by their similarity, with the score
being the normalized inverse of the L2 distance between the query image and
each indexed result image. For case-based retrieval, an additional step must be
performed: the image id (IRI) must be converted into a document id (DOI)
(Figure 1 (a)) and the duplicate results must be merged to obtain a unique document
list (Figure 1 (b)). More details are given in Section 2.3.</p>
        <p>Result fusion aims at combining ranked lists from multiple sources into a single
combined ranked list. Consider these two use cases: combining the results from
queries with multiple images, and combining the results from text and image
queries.</p>
        <p>
          There are two main approaches for late fusion: score-based and rank-based.
Score-based approaches (CombSUM, CombMAX and CombMNZ) combine the
normalized scores given by the individual searches (e.g. visual and textual) as a
basis to create the new ranked list. The studied variant that achieves the best
performance [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] is CombMNZ, but rank-based fusion is gaining momentum
and can outperform score-based fusion under most conditions [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ]. For each
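document i, the score after fusion can be computed as:
          <disp-formula><label>(1)</label><tex-math>\mathrm{combSUM}(i) = \sum_{k=1}^{N(i)} S_k(i)</tex-math></disp-formula>
          <disp-formula><label>(2)</label><tex-math>\mathrm{combMAX}(i) = \max(S), \quad \forall S \in D_i</tex-math></disp-formula>
          <disp-formula><label>(3)</label><tex-math>\mathrm{combMNZ}(i) = N(i) \times \mathrm{combSUM}(i)</tex-math></disp-formula>
          where S_k(i) is the score of document i on the k-th result list.
        </p>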
        <p>N(i) is the number of result lists in which document i appears. A
result list k does not contain all documents: documents with a zero score or a
very low position in the ranking can be safely ignored. Thus, N(i) varies between 0 (document
i does not appear on any list) and the total number of result lists (document
i appears on all lists). For example, in our experiments there are two result
lists, one for visual search and another for textual search, limited to 1000 results
each.</p>
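        <p>The three score-based combinations can be sketched in a few lines. The example assumes that each result list is a mapping from document id to an already normalized score and that a document absent from a list simply contributes nothing; the function name and data layout are illustrative.</p>
        <preformat>
```python
def score_fusion(result_lists):
    """Compute combSUM, combMAX and combMNZ from a list of {doc_id: score} mappings."""
    # Collect, for each document, the scores it received on the lists where it appears.
    scores_by_doc = {}
    for scores in result_lists:
        for doc, score in scores.items():
            scores_by_doc.setdefault(doc, []).append(score)

    comb_sum = {doc: sum(s) for doc, s in scores_by_doc.items()}
    comb_max = {doc: max(s) for doc, s in scores_by_doc.items()}
    # N(i) is the number of lists containing the document.
    comb_mnz = {doc: len(s) * sum(s) for doc, s in scores_by_doc.items()}
    return comb_sum, comb_max, comb_mnz
```
        </preformat>
        <p>For a textual list {a: 0.9, b: 0.5} and a visual list {b: 0.6, c: 0.8}, document b obtains combSUM 1.1 and combMNZ 2.2, overtaking document a because it appears on both lists.</p>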
        <p>Rank-based fusion methods use the inverse of the rank of each document
in each of the individual lists as the score. Reciprocal Rank and Reciprocal
Rank Fusion are the two methods we evaluated:</p>
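        <p>
          <disp-formula><label>(4)</label><tex-math>\mathrm{RR}(i) = \sum_{k=1}^{N(i)} \frac{1}{R_k(i)}</tex-math></disp-formula>
          where R_k(i) is the rank of document i on the k-th result list.
        </p>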
        <p>After analyzing both score-based and rank-based approaches, we combined elements
from both to improve precision. Inverse Square Rank (ISR) combines the
inverse rank approach of RR and RRF (using the squared rank to improve
precision at the top results) with the frequency component of combMNZ (results
that appear on multiple lists are boosted):
          <disp-formula><label>(6)</label><tex-math>\mathrm{ISR}(i) = N(i) \times \sum_{k=1}^{N(i)} \frac{1}{R_k(i)^2}</tex-math></disp-formula>
        </p>
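        <p>The rank-based methods can be compared side by side with the sketch below. RR and ISR follow the definitions given in this section. RRF is written following the formulation of Cormack et al. [5], with the constant 60 suggested there; the constant used in our experiments is not stated here, so that value is an assumption. Each list is assumed to contain a document at most once, and the function names are illustrative.</p>
        <preformat>
```python
def rank_fusion(ranked_lists, rrf_constant=60):
    """Return RR, RRF and ISR scores for lists of document ids ordered best first."""
    # Collect, for each document, its 1-based rank on every list where it appears.
    ranks = {}
    for docs in ranked_lists:
        for position, doc in enumerate(docs, start=1):
            ranks.setdefault(doc, []).append(position)

    rr = {d: sum(1.0 / r for r in rs) for d, rs in ranks.items()}
    # Assumption: RRF as in Cormack et al., with a smoothing constant added to the rank.
    rrf = {d: sum(1.0 / (rrf_constant + r) for r in rs) for d, rs in ranks.items()}
    # ISR: squared inverse ranks, boosted by the number of lists N(i).
    isr = {d: len(rs) * sum(1.0 / r ** 2 for r in rs) for d, rs in ranks.items()}
    return rr, rrf, isr


def top(scores, n=10):
    """Document ids sorted by decreasing fused score (ties broken by id)."""
    return sorted(scores, key=lambda d: (-scores[d], d))[:n]
```
        </preformat>
        <p>Because the rank is squared, a document ranked first on one list contributes 1 to the ISR sum while a document ranked third contributes only 1/9, so documents near the top of any list dominate the fused ranking.</p>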
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <sec id="sec-3-1">
        <title>Modality classification</title>
        <p>In the modality classification task, we used the CEDD and FCTH descriptors
and (stemmed) text from the corresponding caption and article title. In the
image and textual runs, we used the corresponding descriptors from all images
in the provided training dataset to create a model based on stochastic gradient
descent with Vowpal Wabbit. These runs correspond to the CEDD FCTH for
images and words from text in Table 1.</p>
        <p>The multimodal run concatenates the descriptors described above and
performs stochastic gradient descent with Vowpal Wabbit on the combined
descriptors (CEDD, FCTH, caption and title words). This run corresponds to
All in Table 1.</p>
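        <p>One way to express this early fusion is to write each training example as a single line in Vowpal Wabbit's text input format, with the visual descriptor and the words in separate namespaces. The sketch below is an assumption about how such a file could be produced, not a description of our pipeline: the namespace names, the integer class label and the helper name are hypothetical.</p>
        <preformat>
```python
def to_vw_line(label, visual, words):
    """Format one example as 'label |visual idx:value ... |text word ...'.

    label: integer class id; visual: sequence of floats (e.g. CEDD and FCTH
    concatenated); words: stemmed caption and title tokens without whitespace.
    """
    # Zero-valued dimensions are skipped because the format is sparse.
    visual_part = " ".join(f"{i}:{v:g}" for i, v in enumerate(visual) if v != 0)
    # ':' and '|' are reserved characters in the input format, so replace them.
    text_part = " ".join(w.replace(":", "_").replace("|", "_") for w in words)
    return f"{label} |visual {visual_part} |text {text_part}"
```
        </preformat>
        <p>For example, to_vw_line(3, [0.5, 0.0, 0.25], ["chest", "ct"]) yields the line "3 |visual 0:0.5 2:0.25 |text chest ct". A file of such lines could then be passed to a multiclass learner, for instance through Vowpal Wabbit's one-against-all option (--oaa), although the exact options used in our runs are not given here.</p>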
        <p>The dataset provided as training data for modality classification was
unbalanced in the number of images per class. For example, the "Compound or
multipane images" (COMP) category contains over 1,105 training images, while
the "PET" (DRPE) category contains 16 training images. Also, images in the
COMP category may not be visually distinct, since they consist of combinations of images
from other categories. Thus, we decided to submit runs in which we ignored the
COMP category in training and classification. These runs correspond to the runs ending
in noComb in Table 1.</p>
        <p>Regarding the performance of our algorithm, we were the 3rd
best team overall with our multimodal All run, classifying 72.92% of the images
correctly. This is only 8.76 percentage points below the best run. Our visual run classified 57.62%
of the images correctly (23.17 points behind the best run) and our best textual run
classified 62.35% of the images correctly (1.82 points behind the best run). The most
interesting fact is the large improvement from the single-modality runs
to the multimodal run: we improved on our best single-modality run by over 10 percentage points,
while the best team improved on their best run by only 0.89 points with the multimodal
approach. Our noComb approaches did not perform well, as the testing dataset
contained many COMP images.</p>
        <p>For case retrieval, we submitted one visual run (with segmented LBP histograms
and HSV color histograms) and two textual runs (one with MeSH expansion
and one without). The run with MeSH expansion (with MSH in the
run name) outperformed the run without expansion in all metrics, most notably
in MAP (12% increase) and P@10 (11% increase).</p>
        <p>We achieved our best result with the expanded textual run, with a MAP of 22%,
GM-MAP of 11.8% and P@10 of 26%, very close to the best textual run (MAP: 24%,
GM-MAP: 11.6% and P@10: 27%). The visual run achieved the best result in
its class, outperforming the second best result by a factor of 10, with a
MAP of 3% and P@10 of 4%. Our best multimodal run (using ISR) also achieved
the best result in its class, with a MAP of 16% and P@10 of 18%. The combMNZ-based
run achieved much worse results (MAP: 8% and P@10: 14%), supporting
our idea that rank-based fusion is better than score-based fusion in multimodal
retrieval. Although these results are worse than the text-only results, our
rank-based fusion algorithm improved on existing algorithms by a small margin.</p>
        <p>Fusion. In addition to the submitted runs, we compared the performance of the
fusion algorithms using the best textual and visual runs for case-based retrieval.
Performance was evaluated using trec_eval and the provided relevance judgments
(Table 3).</p>
        <p>With our data, rank-based approaches outperformed score-based approaches
by a factor of 2. One of the reasons is the scoring differences between text and
images. Even though both visual and text scores have the same normalization
(the [0...1] interval), the distribution of the results in the score space is different.
Rank-based approaches can handle multi-modality better, because the scores are
ignored.</p>
        <p>Regarding the differences between RR, RRF and ISR: ISR performed as well
or better in our experiments (Tables 3 and 4) in most measures, with a significant
performance boost on P@10 for case-based retrieval. The polynomial (squared rank) component
promotes top-ranking results to the top of the list, offering better precision at
the top results (e.g. P@10).</p>
      </sec>
      <sec id="sec-3-2">
        <title>Image retrieval</title>
        <p>In ad-hoc image retrieval, we submitted one visual run (with segmented LBP
histograms and HSV color histograms) and two textual runs (one with MeSH
expansion and one without). As with case retrieval, the run with MeSH
expansion (FCT SOLR BM25L MSH) outperformed the run without expansion
(FCT SOLR BM25L) in all metrics, although to a lesser degree: MAP increased
5% and P@10 increased 10%.</p>
        <p>We achieved our best result with the expanded textual run, with a MAP of
23% and P@10 of 30%, well behind the best textual (and best overall) run, which had
a MAP of 32% and P@10 of 39%. As expected, our visual run performed worst,
with a MAP of 0.7% and P@10 of 3%, about half the performance of the best
visual run (MAP: 1.9% and P@10: 6%).</p>
        <p>After the competition, we tested the same fusion algorithms as with
case-based retrieval for multimodal retrieval. Our conclusions are similar to the ones
discussed in the case-based retrieval section: rank-based approaches are much
better for our data, with ISR and RR being the best for fusion.</p>
        <p>We would like to emphasize our results in the visual and multimodal case
retrieval tracks, where we achieved the best MAP and bpref. Our fusion algorithm
(ISR) achieved slightly better performance than existing fusion algorithms and
helped us achieve the best result on the multimodal case retrieval track.</p>
        <p>Our results in modality classification are also noteworthy. In the
multimodal run, we improved on the best single-modality result by 17% in relative terms
(from 62.35% to 72.92%), ending with the third best team result.</p>
        <p>We were not able to submit all the desired combinations of features and
fusion algorithms due to limited time. We hope that next year we can submit all
the desired runs. We will also focus on improving multimodal fusion using other
fusion approaches (e.g. early fusion in case-based and image-based retrieval) and
algorithms. Another technique we will study is the integration of modality as a feature
in the retrieval tasks.</p>
        <p>Overall, we are satisfied with our first participation in ImageCLEF and hope
to improve our performance in the following editions.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><given-names>T.</given-names> <surname>Ahonen</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Hadid</surname></string-name>, and
          <string-name><given-names>M.</given-names> <surname>Pietikäinen</surname></string-name>.
          <article-title>Face description with local binary patterns: application to face recognition</article-title>.
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>,
          <volume>28</volume>(<issue>12</issue>):<fpage>2037</fpage>–<lpage>41</lpage>, Dec. <year>2006</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><given-names>N. J.</given-names> <surname>Belkin</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Kantor</surname></string-name>,
          <string-name><given-names>E. A.</given-names> <surname>Fox</surname></string-name>, and
          <string-name><given-names>J. A.</given-names> <surname>Shaw</surname></string-name>.
          <article-title>Combining the evidence of multiple query representations for information retrieval</article-title>.
          <source>Inf. Process. Manage.</source>,
          <volume>31</volume>(<issue>3</issue>):<fpage>431</fpage>–<lpage>448</lpage>, May <year>1995</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><given-names>S. A.</given-names> <surname>Chatzichristofis</surname></string-name> and
          <string-name><given-names>Y. S.</given-names> <surname>Boutalis</surname></string-name>.
          <article-title>Cedd: color and edge directivity descriptor: a compact descriptor for image indexing and retrieval</article-title>.
          <source>In Proceedings of the 6th international conference on Computer vision systems, ICVS'08</source>,
          pages <fpage>312</fpage>–<lpage>322</lpage>, Berlin, Heidelberg, <year>2008</year>. Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><given-names>S. A.</given-names> <surname>Chatzichristofis</surname></string-name>,
          <string-name><given-names>K.</given-names> <surname>Zagoris</surname></string-name>,
          <string-name><given-names>Y. S.</given-names> <surname>Boutalis</surname></string-name>, and
          <string-name><given-names>N.</given-names> <surname>Papamarkos</surname></string-name>.
          <article-title>Accurate Image Retrieval Based on Compact Composite Descriptors and Relevance Feedback Information</article-title>.
          <source>International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI)</source>,
          <volume>24</volume>(<issue>2</issue>):<fpage>207</fpage>–<lpage>244</lpage>, <year>2010</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Cormack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Buettcher</surname>
          </string-name>
          .
          <article-title>Reciprocal rank fusion outperforms condorcet and individual rank learning methods</article-title>
          .
          <source>In SIGIR '09</source>
          , pages
          <fpage>758</fpage>–<lpage>759</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><given-names>D. Frank</given-names> <surname>Hsu</surname></string-name> and
          <string-name><given-names>I.</given-names> <surname>Taksa</surname></string-name>.
          <article-title>Comparing rank and score combination methods for data fusion in information retrieval</article-title>.
          <source>Inf. Retr.</source>,
          <volume>8</volume>(<issue>3</issue>):<fpage>449</fpage>–<lpage>480</lpage>, May <year>2005</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><given-names>A.</given-names> <surname>García Seco de Herrera</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Kalpathy-Cramer</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Demner-Fushman</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Antani</surname></string-name>, and
          <string-name><given-names>H.</given-names> <surname>Müller</surname></string-name>.
          <article-title>Overview of the imageclef 2013 medical tasks</article-title>.
          <source>In Working notes of CLEF 2013</source>,
          <year>2013</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lv</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          .
          <article-title>When documents are very long, bm25 fails!</article-title>
          <source>In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval</source>
          , pages
          <fpage>1103</fpage>–<lpage>1104</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><given-names>M.</given-names> <surname>Muja</surname></string-name> and
          <string-name><given-names>D.</given-names> <surname>Lowe</surname></string-name>.
          <article-title>Fast approximate nearest neighbors with automatic algorithm configuration</article-title>.
          <source>In International Conference on Computer Vision Theory and Application (VISSAPP'09)</source>,
          volume <volume>340</volume>, pages <fpage>331</fpage>–<lpage>340</lpage>. INSTICC Press, <year>2009</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. H. Muller, A. Garc a Seco de Herrera, J.
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>and I. Eggel.</given-names>
          </string-name>
          <article-title>Overview of the imageclef 2012 medical image retrieval and classi cation tasks</article-title>
          .
          <source>In CLEF 2012 working notes</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Expansion-based technologies in finding relevant and new information: Thu trec2002 novelty track experiments</article-title>
          .
          <source>In the Proceedings of the Eleventh Text Retrieval Conference (TREC)</source>
          , pages
          <fpage>586</fpage>–<lpage>590</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>