<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UNED-UV @ Retrieving Diverse Social Images Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Castellanos</string-name>
          <email>acastellanos@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>X. Benavent</string-name>
          <email>xaro.benavent@uv.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. García-Serrano</string-name>
          <email>agarcia@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>E. de Ves</string-name>
          <email>esther.deves@uv.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. Cigarrán</string-name>
          <email>juanci@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Informatics, University of Valencia</institution>
          ,
          <addr-line>Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>NLP &amp; IR Group, UNED</institution>
          ,
          <addr-line>C/ Juan del Rosal 16, Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>This paper details the participation of the UNED-UV group in the 2015 Retrieving Diverse Social Images Task. This year, our proposal is based on a multimodal approach. First, a textual algorithm based on Formal Concept Analysis (FCA) and Hierarchical Agglomerative Clustering (HAC) detects the latent topics addressed by the images, so that the results can be diversified according to these topics. Second, a Local Logistic Regression model, which uses the low-level features and some relevant and non-relevant samples, is adjusted and estimates the relevance probability of every image in the database.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Information retrieval systems have commonly been based
on maximizing the relevance of the result list (i.e., in terms
of accuracy-based metrics). However, retrieval systems, and
especially those focused on image diversification, should be
able to offer relevant but also diverse results. Users are not
only interested in accurate results but also in results covering
different topics or situations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        To address this task we propose an image representation
based on the concepts covered by the textual information
related to the images. This conceptual representation is
built by means of Formal Concept Analysis, a data
organization technique. In our participation in the 2014
edition of this task we showed that this approach was able to
identify the different topics addressed in the images,
allowing the diversification of the result list according to them
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        This year we go a step further by presenting
a multimedia approach that also covers the visual information
of the images, which was missing from our previous
approach, since it has been extensively shown that visual
information has a great impact on information retrieval
systems [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The visual approach uses a relevance
feedback algorithm developed by the UV group and used in
previous works [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This method estimates the similarity
probability of all the images in the database from their visual
low-level features by means of a Local Logistic Regression
model [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. SYSTEM DESCRIPTION</title>
      <p>Our multimedia system has two sub-systems: a
Textual-Based Information Retrieval (TBIR) sub-system that works with
textual information and generates clusters for diversity, and a
Content-Based Information Retrieval sub-system that
estimates the relevance of each of the images in the generated
clusters.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Textual-Based Information Retrieval</title>
      <p>
        Our proposal is based on discovering the latent
topics addressed by the images by applying Formal Concept
Analysis. A Hierarchical Agglomerative Clustering (HAC)
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is then applied to group together similar images according
to the detected formal concepts (those belonging to the same
topic). Each HAC-based cluster may be considered as an
image set covering a similar topic. Then, for each cluster,
the visual features of the images are used to rank
the cluster images according to their visual diversity.
      </p>
      <sec id="sec-3-1">
        <title>2.1.1 FCA-based Modelling</title>
        <p>
          Formal Concept Analysis (FCA) is a theory of concept
formation [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] used to organize formal contexts. A formal context
is a structure K := (G, M, I), where G is a set of objects,
M a set of attributes related to these objects and I a binary
relation between G and M, denoted by gIm: the object
g has the attribute m. From the formal context, a set of
formal concepts can be inferred (i.e., a formal concept is a
pair (A, B) of images A and the features B shared by those
images) and organized in a lattice from the most generic
to the most specific one.
        </p>
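        <p>As an illustration of the definitions above, the following minimal Python sketch enumerates the formal concepts of a toy context. The image identifiers, the tags and the function name formal_concepts are hypothetical, and the brute-force enumeration over attribute subsets is an assumption made for brevity: it is not the algorithm used in our system and is only practical for very small contexts.</p>
        <preformat><![CDATA[
```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small formal context.

    `context` maps each object g to the set of attributes m such that g I m.
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def extent(attrs):
        # Objects that have every attribute in `attrs`.
        return frozenset(g for g in objects if attrs <= context[g])

    def intent(objs):
        # Attributes shared by every object in `objs`.
        shared = set(attributes)
        for g in objs:
            shared &= context[g]
        return frozenset(shared)

    concepts = set()
    # Closing every attribute subset is exponential: toy-sized contexts only.
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            a = extent(set(attrs))
            concepts.add((a, intent(a)))
    return concepts


# Hypothetical context: three images described by a few textual features.
toy_context = {
    "img1": {"beach", "sunset"},
    "img2": {"beach", "people"},
    "img3": {"beach", "sunset", "boat"},
}

concepts = formal_concepts(toy_context)
# Print from the most generic concept (largest extent) to the most specific.
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```
]]></preformat>
        <p>In this toy context the most generic concept contains all three images and the single shared feature, while the most specific one has an empty extent and every feature as its intent.</p>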
        <p>
          By applying FCA, the images in the test set are modelled
in terms of formal concepts, which group together the images
sharing the same set of features. In order to select only the
most representative features, we applied Kullback-Leibler
Divergence (KLD) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] to the textual contents related to the
images. This KLD-based selection represents each image by
the textual contents that best differentiate it from
the other ones.
        </p>
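        <p>A KLD-based feature selection of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not our exact implementation: each term is scored by its pointwise contribution to the divergence between the term distribution of one image and that of the whole collection, and the top_k cut-off, the function name kld_top_terms and the toy data are hypothetical.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def kld_top_terms(docs, top_k=2):
    """Select, for each image, the terms that best separate it from the rest.

    `docs` maps an image id to the list of tokens of its textual content.
    Each term is scored by p(t|image) * log(p(t|image) / p(t|collection)).
    """
    collection = Counter()
    for tokens in docs.values():
        collection.update(tokens)
    total = sum(collection.values())

    selected = {}
    for image_id, tokens in docs.items():
        counts = Counter(tokens)
        length = len(tokens)
        scores = {}
        for term, count in counts.items():
            p_doc = count / length
            p_col = collection[term] / total
            scores[term] = p_doc * math.log(p_doc / p_col)
        # Highest contribution first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[image_id] = ranked[:top_k]
    return selected


# Hypothetical textual contents of three images.
toy_docs = {
    "img1": ["beach", "sunset", "sunset"],
    "img2": ["beach", "people"],
    "img3": ["beach", "boat"],
}
print(kld_top_terms(toy_docs, top_k=1))
```
]]></preformat>
        <p>Terms that appear in every image, such as "beach" in the toy data, receive a low score and are therefore unlikely to be kept as attributes of the formal context.</p>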
      </sec>
      <sec id="sec-3-2">
        <title>2.1.2 HAC-based grouping</title>
        <p>
          From the FCA formal concepts, a set of diverse image
groups is created by applying a HAC algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
Specifically, we propose a single-linkage hierarchical clustering
that groups together similar formal concepts, using the
Zeros-Induced index to measure the cluster similarity [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
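        <p>The grouping step can be sketched as a single-linkage agglomeration over the extents of the formal concepts. Two assumptions should be noted: the Jaccard coefficient is used here only as a simple stand-in for the similarity index of [2], and the stopping threshold, the function names and the toy extents are hypothetical.</p>
        <preformat><![CDATA[
```python
def jaccard(a, b):
    """Jaccard similarity between two sets (stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def single_link_clusters(extents, threshold):
    """Single-linkage agglomerative clustering of formal concepts.

    `extents` is a list of sets of image ids, one per formal concept.
    Merging stops when the best linkage falls below `threshold`.
    Returns clusters as lists of indices into `extents`.
    """
    clusters = [[i] for i in range(len(extents))]

    def linkage(c1, c2):
        # Single link: similarity of the two closest members.
        return max(jaccard(extents[i], extents[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical extents of four formal concepts (sets of image ids).
toy_extents = [{1, 2}, {2, 3}, {7, 8}, {8, 9}]
print(single_link_clusters(toy_extents, threshold=0.3))
```
]]></preformat>
        <p>Each resulting cluster gathers concepts that share images, so the union of their extents can be read as one group of images covering a similar topic.</p>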
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2 Content-Based Information Retrieval</title>
      <p>
        This sub-system performs Content-Based Image
Retrieval, modelling the user preferences by using a
relevance feedback algorithm. The general methodology
involves five steps:
        <list list-type="order">
          <list-item>
            <p>Reduction of the data dimensionality: the provided
low-level visual features [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] are used to generate a
feature vector associated with each image, generically
denoted as x, in a space of dimension N = 945.
These features are reduced using Principal
Component Analysis (PCA). We retain only the first
components that account for 80% of the data variability. In this way
we reduce the original dimension of
our feature space to a new feature vector
of dimension M &lt; N. One of the advantages of this
reduction is that the new transformed components are in
decreasing order with respect to the variance explained
by the corresponding principal component.</p>
          </list-item>
          <list-item>
            <p>Selecting the relevant and non-relevant sets: in the manual run (run5), the user
looks at a few screens, each showing some images, and
marks some of them as relevant or non-relevant.
For the automatic runs (run1 and run3), the
relevant and non-relevant images are selected
automatically. For the one-topic subset, the relevant images
are the images given by Wikipedia for the given topic,
and for the multi-topic subset, we have generated the
relevant images by selecting the first five results. A set
of non-relevant images has been manually generated
taking into account the non-relevance guidelines given in
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (photos with regular people as the main subject,
photos of riots and protests). For each query, the
non-relevant images required are randomly selected from
the generated non-relevant set.</p>
          </list-item>
          <list-item>
            <p>Parameter estimation of the Local Logistic Regression
models [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]: the reduced feature vectors (PCA) and
the relevant and non-relevant sets are the inputs of
several Local Logistic Regression models whose
outputs are the probabilities of the user assessment, i.e., the
probabilities he/she would assign to the fact that the
image belongs to the relevant set. The feature vector
is split dynamically into m groups of non-fixed size.
Each group is used to adjust a model of higher
order, given the input sets and the PCA components.</p>
          </list-item>
          <list-item>
            <p>Ranking of the database: the models are evaluated on all
the images in the database, and each estimated model returns a
probability of being relevant; as a
result, we have a probability vector (p) of dimension
m for each individual image. We combine these
probabilities into a single one by using a weighted average. The
weight (w) of a given probability is the
amount of variance accounted for by the group of
components used to adjust the corresponding model. This
procedure gives us a score/probability for each image in the
database.</p>
          </list-item>
          <list-item>
            <p>Final diversity ranking: the final diversity similarity
ranking is generated by selecting the highest-probability
image in each of the clusters generated by the TBIR
sub-system (run3). If there are fewer than 50 clusters,
the second-highest-probability image of each cluster is then selected.
For the automatic run using only visual information
(run1), the clusters are produced by a k-means (k = 50)
procedure over the PCA components of the visual
feature vector.</p>
          </list-item>
        </list>
      </p>
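      <p>Three of the steps above can be sketched in a few lines of Python: choosing how many principal components to retain, fusing the per-model probabilities with variance-based weights, and building the final list by visiting the clusters. This is a minimal sketch under stated assumptions: the fitting of the Local Logistic Regression models is omitted and their probabilities are taken as given, the ordering of the picks within each pass and the continuation beyond the second pass are our assumptions for the illustration, and all function names and toy values are hypothetical.</p>
      <preformat><![CDATA[
```python
def components_to_keep(explained_variance, target=0.80):
    """Smallest M such that the first M components reach `target` variance.

    `explained_variance` lists the variance of each principal component in
    decreasing order, as produced by PCA.
    """
    total = sum(explained_variance)
    running = 0.0
    for m, variance in enumerate(explained_variance, start=1):
        running += variance
        if running / total >= target:
            return m
    return len(explained_variance)


def fuse_probabilities(probabilities, group_variances):
    """Weighted average of the m model outputs for one image.

    Each weight is the variance accounted for by the group of components
    used to adjust the corresponding model.
    """
    total = sum(group_variances)
    return sum(p * v for p, v in zip(probabilities, group_variances)) / total


def diversity_ranking(clusters, scores, size=50):
    """Pick the best image of every cluster, then the second best, and so on.

    Assumption: within one pass the picks are ordered by decreasing score.
    """
    ordered = [sorted(c, key=lambda image: -scores[image]) for c in clusters]
    ranking = []
    depth = 0
    while len(ranking) < size and any(depth < len(c) for c in ordered):
        picks = [c[depth] for c in ordered if depth < len(c)]
        picks.sort(key=lambda image: -scores[image])
        ranking.extend(picks)
        depth += 1
    return ranking[:size]


# Hypothetical values for illustration only.
print(components_to_keep([6.0, 3.0, 1.0]))           # components retained
print(fuse_probabilities([0.9, 0.5], [3.0, 1.0]))    # fused score of one image
toy_scores = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.4, "e": 0.7}
print(diversity_ranking([["a", "b"], ["c"], ["d", "e"]], toy_scores, size=4))
```
]]></preformat>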
    </sec>
    <sec id="sec-5">
      <title>3. RESULTS</title>
      <p>We submitted four runs, computed as follows: run1 -
automated using visual information only (uses step 2
presented in Section 2.2), run2 - automated using text
information only (uses step 1 presented in Section 2.1), run3 -
automated multimedia (uses steps 1-2 presented in Sections
2.1 and 2.2) and run5 - everything allowed: textual clusters
with FCA and a manual relevance feedback algorithm using
visual features (uses steps 1-2 presented in Sections 2.1 and
2.2). Results are presented in Table 1.</p>
      <p>It is interesting to observe that our best results for both
precision and diversification are obtained with the
multimedia human-based approach, run5, F1@20 = 0.5380, for
both subsets: one-topic, F1@20 = 0.5240, and multi-topic,
F1@20 = 0.5519. For the automatic runs, the best result is
achieved by run2 on the one-topic subset, F1@20 = 0.5068,
whereas for the multi-topic subset, run1 gets the highest
precision, P@20 = 0.7300, and the best performance, F1@20 =
0.5130, but run2 gets better diversification, CR@20 = 0.4407.</p>
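      <p>For reference, the three metrics quoted above can be sketched as follows, assuming the usual definitions of the benchmark: P@20 is the fraction of relevant images among the first 20 results, CR@20 is the fraction of ground-truth clusters represented among them, and F1@20 is their harmonic mean. The function names and the toy data are hypothetical, and the official evaluation tool may differ in details such as how values are averaged over queries.</p>
      <preformat><![CDATA[
```python
def precision_at_k(ranking, image_cluster, k=20):
    """Fraction of the first k results that are relevant.

    `image_cluster` maps every relevant image to its ground-truth cluster;
    images missing from it are treated as non-relevant.
    """
    return sum(1 for image in ranking[:k] if image in image_cluster) / k


def cluster_recall_at_k(ranking, image_cluster, k=20):
    """Fraction of ground-truth clusters represented in the first k results."""
    total = len(set(image_cluster.values()))
    if total == 0:
        return 0.0
    found = {image_cluster[i] for i in ranking[:k] if i in image_cluster}
    return len(found) / total


def f1_at_k(precision, cluster_recall):
    """Harmonic mean of precision and cluster recall."""
    if precision + cluster_recall == 0:
        return 0.0
    return 2 * precision * cluster_recall / (precision + cluster_recall)


# Hypothetical ranking and ground truth for one query, evaluated at k = 4.
toy_ranking = ["a", "x", "b", "c"]
toy_truth = {"a": 0, "b": 0, "c": 1, "d": 2}
p = precision_at_k(toy_ranking, toy_truth, k=4)
cr = cluster_recall_at_k(toy_ranking, toy_truth, k=4)
print(p, cr, f1_at_k(p, cr))
```
]]></preformat>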
    </sec>
    <sec id="sec-6">
      <title>4. CONCLUSIONS</title>
      <p>We presented a multimodal approach for image
diversification that applies conceptual modelling (based on
FCA and HAC) to cluster the images according to the
latent topics addressed by their textual content, combined with
a relevance feedback algorithm that uses the visual features to
determine the similarity. Results show that the manual
version of the multimedia approach works better than the
automatic one. This is due to the way the relevant and
non-relevant images are chosen to estimate the model: a human
understands the meaning of the topic better and therefore
selects the most significant images for the model. Our
challenge is to enable the automatic approach to select
the relevant and non-relevant images as a human being would.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gollapudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Halverson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ieong</surname>
          </string-name>
          .
          <article-title>Diversifying search results</article-title>
          .
          <source>In Proceedings of the Second ACM International Conference on Web Search and Data Mining</source>
          , pages
          <fpage>5</fpage>
          –
          <lpage>14</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alqadah</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bhatnagar</surname>
          </string-name>
          .
          <article-title>Similarity measures in formal concept analysis</article-title>
          .
          <source>Annals of Mathematics and Artificial Intelligence</source>
          ,
          <volume>61</volume>
          (
          <issue>3</issue>
          ):
          <fpage>245</fpage>
          –
          <lpage>256</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>X.</given-names>
            <surname>Benavent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garcia-Serrano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Granados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Benavent</surname>
          </string-name>
          , and E. de Ves.
          <article-title>Multimedia information retrieval based on late semantic fusion approaches: Experiments on a Wikipedia image collection</article-title>
          .
          <source>IEEE Transactions on Multimedia</source>
          , PP(99):1–1
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Castellanos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cigarran</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Serrano</surname>
          </string-name>
          .
          <article-title>UNED @ retrieving diverse social images task</article-title>
          .
          <source>In MediaEval Multimedia Benchmark Workshop, CEUR-WS.org, 1263, ISSN 1613-0073</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] E. de Ves, G. Ayala,
          <string-name>
            <given-names>X.</given-names>
            <surname>Benavent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Dura</surname>
          </string-name>
          .
          <article-title>Modeling user preferences in content-based image retrieval: a novel attempt to bridge the semantic gap</article-title>
          .
          <source>Neurocomputing</source>
          , (
          <volume>0</volume>
          ):–,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Gînsca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Boteanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Popescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          .
          <article-title>Retrieving diverse social images at MediaEval 2015: Challenge, dataset and evaluation</article-title>
          .
          <source>In Retrieving Diverse Social Images at MediaEval</source>
          <year>2015</year>
          :
          <article-title>Challenge, Dataset and Evaluation</article-title>
          .
          <source>Working Notes Proceedings of the MediaEval 2015 Workshop</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kullback</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Leibler</surname>
          </string-name>
          .
          <article-title>On information and sufficiency</article-title>
          .
          <source>The Annals of Mathematical Statistics</source>
          ,
          <volume>22</volume>
          (
          <issue>1</issue>
          ):
          <fpage>79</fpage>
          –
          <lpage>86</lpage>
          ,
          <year>1951</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Loader</surname>
          </string-name>
          .
          <article-title>Local regression and likelihood</article-title>
          . New York: Springer-Verlag,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          , and H. Schutze.
          <source>Hierarchical clustering. pages</source>
          <volume>377</volume>
          {
          <fpage>403</fpage>
          .
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudinac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanjalic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Larson</surname>
          </string-name>
          .
          <article-title>Generating visual summaries of geographic areas using community-contributed images</article-title>
          .
          <source>IEEE Transactions on Multimedia</source>
          ,
          <volume>15</volume>
          (
          <issue>4</issue>
          ):
          <fpage>921</fpage>
          –
          <lpage>932</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          .
          <article-title>Concept lattices and conceptual knowledge systems</article-title>
          .
          <source>Computers &amp; mathematics with applications</source>
          ,
          <volume>23</volume>
          (
          <issue>6</issue>
          ):
          <fpage>493</fpage>
          –
          <lpage>515</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>