Experiments for the ImageCLEF 2007 Photographic Retrieval Task

Thomas Wilhelm, Jens Kürsten, Maximilian Eibl
Chemnitz University of Technology
Faculty of Computer Science, Chair Media Informatics
Straße der Nationen 62
09111 Chemnitz, Germany
[ thomas.wilhelm | jens.kuersten | eibl ] at informatik.tu-chemnitz.de

Abstract

This article describes the configuration of the experiments that we submitted for the ImageCLEF Photographic Retrieval Task. We used a redesigned version of last year's retrieval system prototype (see [1] for details). The topics for our cross-lingual experiments were translated with a plug-in that accesses the Google Translate [2] service. We used thesauri from OpenOffice [3] to expand the queries for better retrieval performance. This year, we submitted 11 runs, of which only one was completely automatic. All our experiments were mixed-modality runs, i.e. we used text retrieval for the initial ranking and content-based image retrieval for re-ranking. The evaluation results show that most of our experiments achieved strong retrieval performance, with the best run reaching a mean average precision (MAP) of 0.3175.

Categories and Subject Descriptors

H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval

Keywords

Evaluation, Cross-Language Information Retrieval, Content-based Image Retrieval, Query Expansion, Experimentation

1 Introduction and outline

This year, we used a redesigned version of our retrieval prototype from 2006 to participate in the ImageCLEF Photographic Retrieval Task. The general description of the task is given in [4]. To address this challenging task, we used thesauri for query expansion. With this approach we hoped to compensate for the reduced amount of textual annotations. Our experiments were based on text retrieval and were optimized with content-based image retrieval.

The outline of the paper is as follows. Section 2 describes the general setup of our system.
The individual configurations of our submitted experiments are shown in section 3. In sections 4 and 5 we summarize the results and sum up our observations.

2 Experimental setup

The approach we used for the ImageCLEF Photographic Retrieval Task is as follows. We decided to use automatic query expansion to compensate for the reduced amount of textual annotations in the data collection. We used thesauri from OpenOffice [3] and applied a threshold technique to obtain a number of expansion terms for each query. The baseline of all our experiments was a classic text retrieval run. In a second step, the results of the text retrieval were re-ranked based on image content descriptors. We applied the MPEG-7 descriptors EdgeHistogram and ScalableColor from the Caliph and Emir project [5], which were calculated from the example image of each topic. Finally, we used a manual feedback strategy to enhance retrieval performance in all our setups except the baseline run. The feedback strategy was to assess a certain number of the top-ranked documents and to apply a feedback algorithm that uses the annotations of the relevant documents.

3 Configuration of submitted runs

The detailed setup of our experiments is presented in the following subsections.

3.1 Monolingual

We submitted 5 monolingual experiments in total, one of which was the completely automatic baseline run (first row in table 1).

Table 1: Configuration of monolingual experiments

identifier      language   # images for FB
cut-EN2EN       EN          0
cut-EN2EN-F20   EN         20
cut-EN2EN-F50   EN         50
cut-ES2ES       ES         20
cut-DE2DE       DE         20

3.2 Cross-lingual

We also submitted cross-language experiments for all target collections. The topics were translated with a plug-in that accesses the Google Translate [2] service. We also used the thesaurus-based query expansion approach described in section 2. Table 2 shows the setup of the individual cross-language runs.
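
The thesaurus-based query expansion used in these runs can be illustrated with a minimal sketch. It is not the implementation of our system: the toy thesaurus, the function name expand_query, and the reading of the threshold as a cap on the number of synonyms added per query term are assumptions made only for this example.

```python
from typing import Dict, List

# Toy thesaurus standing in for an OpenOffice thesaurus (hypothetical data).
TOY_THESAURUS: Dict[str, List[str]] = {
    "church": ["chapel", "cathedral", "temple"],
    "mountain": ["peak", "summit", "hill", "mount"],
}


def expand_query(
    terms: List[str],
    thesaurus: Dict[str, List[str]],
    max_synonyms: int = 2,
) -> List[str]:
    """Append up to max_synonyms thesaurus entries per query term.

    The cap max_synonyms is an assumed interpretation of the threshold;
    terms already present in the query are not added a second time.
    """
    expanded = list(terms)
    seen = {term.lower() for term in terms}
    for term in terms:
        # Only the first max_synonyms entries of each thesaurus list are considered.
        for synonym in thesaurus.get(term.lower(), [])[:max_synonyms]:
            if synonym not in seen:
                expanded.append(synonym)
                seen.add(synonym)
    return expanded


print(expand_query(["church", "mountain", "snow"], TOY_THESAURUS))
```

Terms without a thesaurus entry, such as "snow" here, pass through unchanged, so the expanded query always contains the original topic terms.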

Table 2: Configuration of cross-lingual experiments

identifier       query language        target language   # images for FB
cut-EN2ES-F20    English               Spanish           20
cut-ZHS2EN-F20   Chinese, simplified   English           20
cut-DE2EN-F20    German                English           20
cut-IT2EN-F20    Italian               English           20
cut-FR2EN-F20    French                English           20
cut-FR2DE-F20    French                German            20

4 Results

The results of our submitted runs are summarized in table 3. Our monolingual English experiment with feedback on 50 images performed best. Furthermore, monolingual retrieval performance on the English and Spanish annotations is very good, while monolingual retrieval on the German annotations is considerably worse in comparison. Another interesting observation is the result of the cross-lingual experiment with English topics on Spanish annotations, which performs better than all cross-lingual runs on the English annotations.

Table 3: Results for submitted experiments

identifier       MAP      P20      Rank
cut-EN2EN-F50    0.3175   0.4592     1
cut-EN2EN-F20    0.2846   0.4025     5
cut-ES2ES        0.2772   0.3708    12
cut-EN2ES-F20    0.2770   0.3767    13
cut-ZHS2EN-F20   0.2690   0.4042    19
cut-DE2EN-F20    0.2565   0.3650    22
cut-IT2EN-F20    0.2495   0.3633    28
cut-FR2EN-F20    0.2432   0.3583    31
cut-DE2DE        0.1991   0.2992    40
cut-FR2DE-F20    0.1640   0.2367   100
cut-EN2EN        0.1515   0.2383   142

5 Conclusion

Our experiments showed that the manual feedback strategy is a promising approach for this year's ImageCLEF Photographic Retrieval Task: for the monolingual English runs, assessing the top 20 documents raised the MAP from 0.1515 to 0.2846, and assessing 50 documents raised it to 0.3175. The combination of text retrieval with well-known content-based image descriptors and the application of thesaurus-based query expansion were also important for good retrieval performance in this domain, which offers only a small amount of textual metadata.

References

[1] Wilhelm, T. & Eibl, M. (2006). ImageCLEF 2006 Experiments at the Chemnitz Technical University. In Working Notes for the CLEF 2006 Workshop, 20-22 September, Alicante, Spain. Retrieved August 17, 2007, from CLEF Web site: http://www.clef-campaign.org/2006/working_notes/workingnotes2006/wilhelmCLEF2006l.pdf

[2] Google (2007).
Google Translate BETA. Retrieved August 17, 2007, from Google Web site: http://www.google.com/translate_t

[3] OpenOffice (2007). OpenOffice. Retrieved August 17, 2007, from OpenOffice Web site: http://www.openoffice.org/

[4] Grubinger, M. & Clough, P. & Hanbury, A. & Müller, H. (2007). Overview of the ImageCLEFphoto 2007 Photographic Retrieval Task. In Working Notes for the CLEF 2007 Workshop, 19-21 September, Budapest, Hungary. To appear.

[5] Lux, M. (2004-2007). Caliph & Emir. Retrieved August 17, 2007, from SemanticMetadata Web site: http://www.semanticmetadata.net/