<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recod @ MediaEval 2015: Diverse Social Images Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rodrigo T. Calumby</string-name>
          <email>rtcalumby@ecomp.uefs.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iago B. A. do C. Araujo</string-name>
          <email>ibacaraujo@ecomp.uefs.br</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vinícius P. Santana</string-name>
          <email>vpsantana@ecomp.uefs.br</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Javier A. V. Munoz</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Otávio A. B. Penatti</string-name>
          <email>o.penatti@samsung.com</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lin T. Li</string-name>
          <email>lintzyli@ic.unicamp.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jurandy Almeida</string-name>
          <email>jurandy.almeida@unifesp.br</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovani Chiachia</string-name>
          <email>chiachia@ic.unicamp.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcos A. Gonçalves</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ricardo da S. Torres</string-name>
          <email>rtorres@ic.unicamp.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal University of Minas Gerais</institution>
          ,
          <addr-line>Belo Horizonte, MG</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>GIBIS Lab, Federal University of São Paulo</institution>
          ,
          <addr-line>São José dos Campos, SP</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>RECOD Lab, University of Campinas</institution>
          ,
          <addr-line>Campinas, SP</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>SAMSUNG Research Institute Brazil</institution>
          ,
          <addr-line>Campinas, SP</addr-line>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Feira de Santana</institution>
          ,
          <addr-line>Feira de Santana, BA</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>This paper presents the RECOD team experience in the Retrieving Diverse Social Images Task at MediaEval 2015. The teams were required to develop a diversification approach for social photo retrieval. Our proposal is based on irrelevant image filtering, reranking, rank aggregation, and diversity promotion. We proposed a multimodal approach and exploited image metadata and user credibility information.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        The relevance-diversity trade-off is an important problem
associated with several search scenarios. Promoting
diversity in retrieval results has been shown to positively impact
the user search experience, especially for ambiguous,
underspecified, and visual summarization queries [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
Retrieving Diverse Social Images Task [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] integrates this problem into
a tourism-related representative image retrieval challenge.
This paper describes the RECOD group's contributions via
diversity promotion boosted by multimodal rank fusion.
      </p>
    </sec>
    <sec id="sec-2">
      <title>PROPOSED APPROACH</title>
      <p>
        As a relevance enhancement step, our approach includes
irrelevant image filtering, multimodal image reranking, and
rank aggregation. Image filtering was conducted according
to face detection data and the geographic location of the
images. We also evaluated several different visual features and
text similarity measures. For reranking the original retrieval
list, we exploited textual, visual, geographic, and
credibility information. As an additional step, the reranked lists
were also aggregated with a Genetic Programming (GP) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
approach. Our proposal follows the general workflow
presented in Figure 1. Each of these steps is described next.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Filtering</title>
      <p>
        For irrelevant image removal, we used three filtering
strategies: geographic-based, face-based, and blur-based. The
GeoFilter eliminated all images located farther than a given
range from the reference lat/long. Based on the
results on the development set, the filtering radius was set to
10 km. In turn, the face-based procedure was introduced to
filter out images containing people as the main subject. We
used a face detection module of Face++<sup>1</sup> and all images with
more than one face detected were removed. Finally,
out-of-focus images were eliminated using the method from [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
with the blur threshold set to 0.8.
      </p>
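      <p>The geographic filter described above can be sketched as follows. This is a minimal illustration and not the implementation used in the submitted runs: the Haversine formula is assumed as the distance measure for filtering (the paper names it only for geographic reranking), and the function names, the dictionary layout of an image, the Earth radius constant, and the choice to keep images without coordinates are all assumptions.</p>
      <preformat><![CDATA[
```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_filter(images, ref_lat, ref_lon, radius_km=10.0):
    """Keep images located within radius_km of the reference point.

    `images` is a hypothetical list of dicts with 'lat' and 'lon' keys.
    Images without coordinates are kept (an assumption, not from the paper).
    """
    kept = []
    for img in images:
        if img.get("lat") is None or img.get("lon") is None:
            kept.append(img)
            continue
        if haversine_km(img["lat"], img["lon"], ref_lat, ref_lon) <= radius_km:
            kept.append(img)
    return kept
```
]]></preformat>
      <p>The default radius of 10 km mirrors the value reported above; any other threshold can be passed through the hypothetical radius_km parameter.</p>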
    </sec>
    <sec id="sec-4">
      <title>Visual Features and Text Similarity</title>
      <p>
        For visual similarity, besides the provided features, we
also extracted: (i) two general-purpose global descriptors
(BIC [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and GIST [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]); (ii) a bag of visual words (BoVW)
descriptor, based on sparse (Harris-Laplace detector) SIFT,
with 512 visual words (randomly selected), soft assignment
(σ = 150), and max pooling or Word Spatial
Arrangement (WSA) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for encoding the spatial arrangement of
visual words; and (iii) fifteen features available in the Lire
package [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].<sup>2</sup>
      </p>
      <p>For text-only and multimodal runs, we used the Cosine,
BM25, Dice, Jaccard, and TF-IDF measures, which were
computed using the provided TF, DF, and TF-IDF vectors.</p>
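      <p>Three of the text similarity measures named above can be sketched over sparse term-weight vectors. This is an illustration only: the dictionary representation of a vector is hypothetical, and Dice and Jaccard are shown in their set-based form over the terms present in each vector, which is an assumption since the weighted variants may have been used. BM25 is omitted because it needs collection statistics not described here.</p>
      <preformat><![CDATA[
```python
import math


def cosine(q, d):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * d[t] for t, w in q.items() if t in d)
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def jaccard(q, d):
    """Set-based Jaccard coefficient over the terms of each vector."""
    a, b = set(q), set(d)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(q, d):
    """Set-based Dice coefficient over the terms of each vector."""
    a, b = set(q), set(d)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0
```
]]></preformat>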
    </sec>
    <sec id="sec-5">
      <title>Reranking and Aggregation</title>
      <p>For improving the original list ranking, we explored visual,
textual, credibility, and geographic ranking. For text-only
and multimodal reranking, the text-based scores were
computed as the similarity between the text vectors associated
with the images and the localities' text vectors. The visual
method reranked the original list according to the similarity
to the location's representative Wikipedia images.
The visual distance from each image to the representative
set was computed as the minimum of its distances to the
representative images. All credibility scores were individually used
for reranking. Additionally, lat/long data were used to rank
images according to the Haversine distance to the reference
point.
<fn id="fn1"><label>1</label><p>http://www.faceplusplus.com (as of June/2015).</p></fn>
<fn id="fn2"><label>2</label><p>CEDD, FCTH, OpponentHistogram, JointHistogram,
AutoColorCorrelogram, ColorLayout, EdgeHistogram,
Gabor, JCD, JpegCoefficientHistogram, ScalableColor,
SimpleColorHistogram, Tamura, LuminanceLayout, and PHOG.
Available at: http://www.lire-project.net/</p></fn></p>
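      <p>The visual reranking rule, in which each image is scored by its minimum distance to the set of representative images, can be sketched as below. The Euclidean distance, the function names, and the mapping from image identifiers to feature vectors are assumptions for illustration; each descriptor used in the paper would come with its own distance function.</p>
      <preformat><![CDATA[
```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rerank_by_representatives(features, representatives, dist=euclidean):
    """Sort image ids by their minimum distance to any representative.

    `features` maps image id -> feature vector and `representatives`
    is a list of feature vectors (both layouts are hypothetical).
    """
    def min_dist(image_id):
        return min(dist(features[image_id], r) for r in representatives)

    return sorted(features, key=min_dist)
```
]]></preformat>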
      <p>
        For feature fusion, the reranked lists were combined using
the GP approach from [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] which uses several rank
aggregation methods. This method was trained using the
development data and combined order-based (MRA [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], RRF [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
and BordaCount [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]) and score-based (CombMIN,
CombMAX, CombSUM, CombMED, CombANZ [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and
RLSim [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]) rank fusion methods.
      </p>
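      <p>To make the base fusion methods concrete, the sketch below shows two order-based methods (RRF and Borda count) and one score-based method (CombSUM). It does not reproduce the GP approach that combines them. The function names, the tie-breaking by item identifier, and the constant k = 60 for RRF are assumptions made for this illustration.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest score first; ties broken by item id (an assumption).
    return sorted(scores, key=lambda d: (-scores[d], d))


def borda_count(rankings):
    """Borda count: position p (0-based) in a list of n items earns n - p points."""
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for pos, doc in enumerate(ranking):
            scores[doc] += n - pos
    return sorted(scores, key=lambda d: (-scores[d], d))


def comb_sum(score_lists):
    """CombSUM: sum each item's scores across the input score dicts."""
    totals = defaultdict(float)
    for scores in score_lists:
        for doc, s in scores.items():
            totals[doc] += s
    return sorted(totals, key=lambda d: (-totals[d], d))
```
]]></preformat>
      <p>Score-based methods such as CombSUM usually assume that the input scores were normalized to a common range beforehand; that step is left out of the sketch.</p>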
      <p>As a relevance-based filtering step, up to the 150 top-ranked
images from the final aggregated list were selected as the input
list for the summarization procedure.</p>
    </sec>
    <sec id="sec-6">
      <title>Diversification Method</title>
      <p>
        After the filtering, reranking, and aggregation steps, the
improved relevance-based lists were submitted to explicit
diversification. We evaluated four methods: clustering-based
(k-Medoids and agglomerative) and reranking-based (MMR [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
MSD [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). In all cases, the k-Medoids method achieved
significantly superior results on the development set and was
used in the submitted runs.
      </p>
      <p>In the clustering step, the initial medoids were selected in
an offset fashion, sampling evenly from the top to the
bottom of the ranked list. The medoid update procedure
selected the best-connected image of the cluster, i.e., the one with
the lowest average distance to all other images in the cluster. The
process iterates until no image changes cluster
or 50 iterations are reached.</p>
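      <p>The clustering step can be illustrated with the following sketch of k-Medoids with evenly spaced initial medoids. It is written under stated assumptions and is not the code used for the runs: the function names, the exact spacing rule for the seeds, the tie-breaking, and the dist callable over image identifiers are hypothetical.</p>
      <preformat><![CDATA[
```python
def offset_seeds(n_items, k):
    """Pick k seed positions evenly spaced from the top to the bottom of the list."""
    step = n_items / k
    return [int(i * step) for i in range(k)]


def k_medoids(ranked_ids, dist, k, max_iter=50):
    """Cluster `ranked_ids` (best first) into k groups around medoids.

    `dist(a, b)` is a hypothetical symmetric distance between two image ids.
    Returns a list of (medoid, members) pairs, one per cluster.
    """
    if not ranked_ids:
        return []
    k = min(k, len(ranked_ids))
    medoids = [ranked_ids[p] for p in offset_seeds(len(ranked_ids), k)]
    assignment = {}
    for _ in range(max_iter):
        # Assign each image to its nearest medoid (ties go to the earlier medoid).
        new_assignment = {
            x: min(range(k), key=lambda c: dist(x, medoids[c])) for x in ranked_ids
        }
        if new_assignment == assignment:
            break  # no image changed cluster
        assignment = new_assignment
        # Update: the new medoid has the lowest total distance to its cluster
        # mates, which gives the same ordering as the lowest average distance.
        for c in range(k):
            members = [x for x in ranked_ids if assignment[x] == c]
            if members:
                medoids[c] = min(
                    members, key=lambda m: sum(dist(m, o) for o in members)
                )
    return [
        (medoids[c], [x for x in ranked_ids if assignment[x] == c])
        for c in range(k)
    ]
```
]]></preformat>
      <p>Seeding from evenly spaced rank positions keeps the procedure deterministic, unlike random initialization, so repeated runs over the same ranked list give the same clusters.</p>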
      <p>
        For runs 1, 2, 3, and 5, the clusters were ranked according
to their sizes in descending order and intra-cluster sorting
was applied using average connectivity. For the
credibility-based submission (run 4), the images were clustered
according to their owner (user) and the clusters were ranked
according to the users' credibility, computed as a linear
combination of tagSpecificity, uploadFrequency, meanTagRank,
faceProportion, meanImageTagClarity, photoCount,
visualScore, and locationSimilarity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>The representative images were selected in a round-robin
fashion from the final clusters. The number of clusters was
defined as 30 for runs 1 and 3, and 40 for runs 2 and 5.</p>
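      <p>The final selection for the size-ranked runs can be sketched as a round-robin pass over clusters ordered by descending size. The function name, the list-of-lists input, and the optional limit parameter are assumptions; each inner list is assumed to be already sorted by the intra-cluster criterion.</p>
      <preformat><![CDATA[
```python
def round_robin(clusters, limit=None):
    """Interleave cluster items: first of each cluster, then second, and so on.

    `clusters` is a list of lists, each already sorted internally.
    Clusters are visited largest first; the stable sort keeps ties
    in their input order.
    """
    ordered = sorted(clusters, key=len, reverse=True)
    result = []
    depth = 0
    while any(depth < len(c) for c in ordered):
        for c in ordered:
            if depth < len(c):
                result.append(c[depth])
        depth += 1
    return result if limit is None else result[:limit]
```
]]></preformat>
      <p>For the credibility-based run, the same interleaving would apply with the clusters ordered by user credibility instead of size.</p>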
    </sec>
    <sec id="sec-7">
      <title>RUNS SETUP</title>
      <p>We submitted five runs (Table 1). The features used for
diversification were selected according to the best results on
the development set.</p>
      <p>In all runs, the geographic filtering and reranking were
only applied for one-topic queries since multi-topic queries
do not have a reference geo-location. Additionally, in run 5,
the face-based filters were also only applied to one-topic
queries since multi-topic queries have a different relevance
constraint in relation to people in the foreground. Finally,
in runs 1, 3, and 5 no visual reranking was applied for
multi-topic queries.</p>
    </sec>
    <sec id="sec-8">
      <title>RESULTS AND DISCUSSION</title>
      <p>Table 2 presents the effectiveness results for the five runs
on the development and test sets. The best results (F1@20)
on the development set were achieved by run 5, followed
by runs 2 and 3, in which textual information was used.
However, these were the runs with the greatest effectiveness
difference when comparing development and test queries,
especially considering the multi-topic queries.</p>
      <p>Table 3 presents the effectiveness results for one-topic and
multi-topic test queries. As we can observe, even with no
visual reranking, the visual-only run achieved slightly
superior results for multi-topic queries considering all the run
types and also compared to one-topic queries. All other run
types achieved superior effectiveness on one-topic queries,
especially when the credibility information was used (runs 4
and 5). Our results suggest that visual features are
important when considering multi-topic queries, while the textual
information seems more suitable for one-topic queries.</p>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSIONS</title>
      <p>For relevance and diversity maximization, we proposed
filtering strategies and the combination of multiple features
with a rank fusion method. These improved ranked lists
were used as input for a clustering-based summarization
method. Our experiments suggest that different
summarization alternatives may result in different effectiveness for
one-topic and multi-topic queries.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>We thank UEFS/PROBIC and FAPESP for their support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Calumby</surname>
          </string-name>
          , R. da
          <string-name>
            <given-names>S.</given-names>
            <surname>Torres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          .
          <article-title>Diversity-driven learning for multimodal image retrieval with relevance feedback</article-title>
          .
          <source>In Proceedings of the 21st IEEE International Conference on Image Processing</source>
          , pages
          <volume>2197</volume>
          –
          <fpage>2201</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carbonell</surname>
          </string-name>
          and J.
          <string-name>
            <surname>Goldstein</surname>
          </string-name>
          .
          <article-title>The use of MMR, diversity-based reranking for reordering documents and producing summaries</article-title>
          .
          <source>In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , pages
          <volume>335</volume>
          –
          <fpage>336</fpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Cormack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Buettcher</surname>
          </string-name>
          .
          <article-title>Reciprocal rank fusion outperforms Condorcet and individual rank learning methods</article-title>
          .
          <source>In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , pages
          <volume>758</volume>
          –
          <fpage>759</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sivakumar</surname>
          </string-name>
          .
          <article-title>Efficient similarity search and classification via rank aggregation</article-title>
          .
          <source>In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data</source>
          , pages
          <volume>301</volume>
          –
          <fpage>312</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gollapudi</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          .
          <article-title>An axiomatic approach for result diversification</article-title>
          .
          <source>In Proceedings of the 18th International Conference on World Wide Web</source>
          , pages
          <volume>381</volume>
          –
          <fpage>390</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          , A. Gînscă,
          <string-name>
            <given-names>B.</given-names>
            <surname>Boteanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Popescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          .
          <article-title>Retrieving diverse social images at MediaEval 2015: Challenge, dataset and evaluation</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2015 Workshop</source>
          , Wurzen, September
          <volume>14</volume>
          -15
          <year>2015</year>
          .
          <article-title>CEUR-WS.org</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Koza</surname>
          </string-name>
          .
          <article-title>Genetic Programming: On the Programming of Computers by Means of Natural Selection</article-title>
          . MIT Press, Cambridge, MA, USA,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lux</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Chatzichristofis</surname>
          </string-name>
          .
          <article-title>LIRE: Lucene image retrieval: an extensible Java CBIR library</article-title>
          .
          <source>In Proceedings of the 16th ACM International Conference on Multimedia</source>
          , pages
          <volume>1085</volume>
          –
          <fpage>1088</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oliva</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Torralba</surname>
          </string-name>
          .
          <article-title>Modeling the shape of the scene: A holistic representation of the spatial envelope</article-title>
          .
          <source>International Journal of Computer Vision</source>
          ,
          <volume>42</volume>
          (
          <issue>3</issue>
          ):
          <volume>145</volume>
          –
          <fpage>175</fpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. C. G.</given-names>
            <surname>Pedronette</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. da S.</given-names>
            <surname>Torres</surname>
          </string-name>
          .
          <article-title>Image re-ranking and rank aggregation based on similarity of ranked lists</article-title>
          .
          <source>In Proceedings of the 14th International Conference on Computer Analysis of Images and Patterns - Volume Part I</source>
          , pages
          <volume>369</volume>
          –
          <fpage>376</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O. A. B.</given-names>
            <surname>Penatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. B.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gouet-Brunet</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>R. da S.</given-names>
            <surname>Torres</surname>
          </string-name>
          .
          <article-title>Visual word spatial arrangement for image retrieval and classification</article-title>
          .
          <source>Pattern Recognition</source>
          ,
          <volume>47</volume>
          (
          <issue>2</issue>
          ):
          <volume>705</volume>
          –
          <fpage>720</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Shaw</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Fox</surname>
          </string-name>
          .
          <article-title>Combination of multiple searches</article-title>
          .
          <source>In The Second Text REtrieval Conference (TREC-2)</source>
          , pages
          <fpage>243</fpage>
          –
          <fpage>252</fpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Stehling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nascimento</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Falcão</surname>
          </string-name>
          .
          <article-title>A compact and efficient image retrieval approach based on border/interior pixel classification</article-title>
          .
          <source>In Proceedings of the 11th International Conference on Information and Knowledge Management</source>
          , pages
          <volume>102</volume>
          –
          <fpage>109</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Tong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>Blur detection for digital images using wavelet transform</article-title>
          .
          <source>In Proceedings of the IEEE International Conference on Multimedia and Expo</source>
          , pages
          <fpage>17</fpage>
          –
          <lpage>20</lpage>
          , Vol.
          <volume>1</volume>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Vargas</surname>
          </string-name>
          , R. da
          <string-name>
            <given-names>S.</given-names>
            <surname>Torres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          .
          <article-title>A soft computing approach for learning to aggregate rankings</article-title>
          .
          <source>In Proceedings of the 24th ACM International Conference on Information and Knowledge Management</source>
          ,
          <year>2015</year>
          . (in press).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Young</surname>
          </string-name>
          .
          <article-title>An axiomatization of Borda's rule</article-title>
          .
          <source>Journal of Economic Theory</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <volume>43</volume>
          –
          <fpage>52</fpage>
          ,
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>