=Paper=
{{Paper
|id=Vol-1178/CLEF2012wn-ImageCLEF-MantziouEt2012
|storemode=property
|title=CERTH's Participation at the Photo Annotation Task of ImageCLEF 2012
|pdfUrl=https://ceur-ws.org/Vol-1178/CLEF2012wn-ImageCLEF-MantziouEt2012.pdf
|volume=Vol-1178
|dblpUrl=https://dblp.org/rec/conf/clef/MantziouPPSK12
}}
==CERTH's Participation at the Photo Annotation Task of ImageCLEF 2012==
Eleni Mantziou, Georgios Petkos, Symeon Papadopoulos, Christos Sagonas, and Yiannis Kompatsiaris

Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece

{lmantziou, gpetkos, papadop, sagonas, ikom}@iti.gr

Abstract. This paper describes the approaches and experimental settings of the five runs submitted by CERTH at the photo annotation task of ImageCLEF 2012. Two different approaches were used: the first uses the Laplacian Eigenmaps of an image similarity graph for learning, and the second uses a "same class" learning model. Four runs were submitted using the first approach and one using the second. A multitude of textual and visual features were employed, making use of different aggregation (BoW, VLAD) and post-processing schemes (WordNet, pLSA). The best performance on the test set was achieved by Run 3 (first approach using all features), which scored 0.321 in terms of MiAP and 0.2547 in terms of GMiAP (7th out of 18 competing teams), and by Run 5, which led to an F-ex score of 0.495 (6th out of 18 teams).

1 Introduction

This document describes the participation of CERTH at the photo annotation task of the 2012 ImageCLEF competition [1]. CERTH submitted five runs using two different approaches. The first approach, described in subsection 2.1, computes the similarity between test images and training images, constructs an image similarity graph, and trains concept detectors using the graph Laplacian Eigenmaps (LE) [7] as features. This is done for each modality, and the final result is obtained by performing late fusion with a linear classifier. The second approach, detailed in subsection 2.2, utilizes the concept of a "same class" model: it takes as input the set of distances (as many as the number of features used) between the image to be annotated and a reference item that represents a target concept, and predicts whether the image belongs to that concept. Section 3 outlines each of the submitted runs and presents the obtained test results. Section 4 presents some general remarks and conclusions.

2 Overview of methods

2.1 Concept detection using image similarity graphs

The first approach used by CERTH is based on the construction of a similarity graph between the images. This graph is used to obtain a low-dimensional feature representation: we use the first eigenvectors of the graph Laplacian as features. These features correspond well to semantically coherent groups of images and are thus used to train concept classifiers.

The idea of utilizing the implicit relational structure that can be derived by computing similarities between the images of a collection has been proposed before. In [8], an extended similarity measure is proposed that takes into account the local neighbourhood structure of images, i.e., the content and label information (if available) of images that are similar to the input image. This measure is used in combination with two well-known semi-supervised learning methods [16] and is shown to improve their performance both in synthetic experiments and in a benchmark video annotation task. Our work is most closely related to [6], which introduces the concept of "social dimensions", i.e., the top-k eigenvectors of a graph Laplacian, as an alternative way of tackling the relational classification problem [10], i.e., the classification of a graph node that takes into account information from neighbouring nodes. Here, we adopt a similar representation for graph structure features.
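To make the pipeline concrete, the following is a minimal sketch in Python, not the paper's actual implementation: it builds a k-nearest-neighbour similarity graph over image descriptors, takes the first eigenvectors of the normalized graph Laplacian as features, and trains a linear classifier per concept. The descriptor matrix X, the labels y_train, and the parameter values n_dims and k are hypothetical placeholders.

```python
# Minimal sketch of the graph-based approach described above (NOT CERTH's
# implementation): k-NN similarity graph -> normalized Laplacian -> first
# eigenvectors as features -> linear concept classifier.
# X, y_train, n_dims, and k are hypothetical placeholders.
import numpy as np
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

def laplacian_eigenmap_features(X, n_dims=50, k=20):
    """First n_dims Laplacian eigenvectors of a k-NN similarity graph over X."""
    S = cosine_similarity(X)              # image-to-image similarities
    np.fill_diagonal(S, 0.0)
    # Keep only each image's k strongest neighbours, then symmetrize.
    weakest = np.argsort(S, axis=1)[:, :-k]
    W = S.copy()
    np.put_along_axis(W, weakest, 0.0, axis=1)
    W = np.maximum(W, W.T)
    L = laplacian(W, normed=True)         # normalized graph Laplacian
    # Eigenvectors with the smallest eigenvalues vary most smoothly over
    # the graph, so they reflect coherent groups of images.
    _, vecs = eigsh(L, k=n_dims, which='SA')
    return vecs                           # one n_dims-dim row per image

# Hypothetical usage: X stacks one visual descriptor per image (train and
# test together); y_train holds binary labels for one target concept.
# E = laplacian_eigenmap_features(X)
# clf = LogisticRegression().fit(E[:n_train], y_train)
# scores = clf.predict_proba(E[n_train:])[:, 1]
```

In this sketch a single descriptor is used; in the approach described above, such features would be computed per modality and the per-modality classifier outputs combined by late fusion.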
Method overview: Given a set of K target concepts Y = {Y_1, ..., Y_K} and an annotated training set L = {(x_i, y_i)}_{i=1}^{l}, where x_i ∈