=Paper=
{{Paper
|id=None
|storemode=property
|title=TALP at MediaEval 2011 Placing Task: Georeferencing Flickr videos with geographical knowledge and information retrieval
|pdfUrl=https://ceur-ws.org/Vol-807/Ferres_UPC_Placing_me11wn.pdf
|volume=Vol-807
|dblpUrl=https://dblp.org/rec/conf/mediaeval/FerresR11
}}
==TALP at MediaEval 2011 Placing Task: Georeferencing Flickr videos with geographical knowledge and information retrieval==
Daniel Ferrés and Horacio Rodríguez
TALP Research Center, Software Department, Universitat Politècnica de Catalunya
C. Jordi Girona Salgado, 1-3, 08034 Barcelona, Spain
dferres@lsi.upc.edu, horacio@lsi.upc.edu

Copyright is held by the author/owner(s). MediaEval 2011 Workshop, September 1-2, 2011, Pisa, Italy.

ABSTRACT

This paper describes our georeferencing approaches, experiments, and results at the MediaEval 2011 Placing Task evaluation. The task consists of predicting the most probable geographical coordinates of Flickr videos. Our approaches used only the textual annotations and tagsets provided by Flickr users to make predictions. We used three approaches for this task: 1) a Geographical Knowledge approach, 2) an Information Retrieval based approach with Re-Ranking, and 3) a combination of both (GeoFusion). The GeoFusion approach achieved the best results within the margins of error from 10 km to 10000 km.

Categories and Subject Descriptors: H.3 [Information Search and Retrieval]

General Terms: Design, Performance, Experimentation, Measurement

Keywords: Georeferencing, Toponym Disambiguation, Geographical Knowledge Bases, Information Retrieval

1. INTRODUCTION

The MediaEval 2011 Placing Task requires that participants automatically assign geographical coordinates (latitude and longitude) to Flickr videos using one or more of: Flickr metadata, visual content, audio content, and social information (see [1] for more details about this evaluation). Results are evaluated by calculating the distance from the actual point (assigned by a Flickr user) to the predicted point (assigned by a participant), and runs are scored by counting how many videos were placed within several threshold distances.

2. SYSTEM DESCRIPTION

We used three approaches for the MediaEval 2011 Placing Task (see more details about these approaches in [4]):

1) Geographical Knowledge (GeoKB). This approach was used for the MediaEval 2010 Placing Task [3] and was then improved (see [4]). The GeoKB approach uses the Geonames gazetteer (http://www.geonames.org) for detecting place names, stopword lists, and an English dictionary. The system uses the following Toponym Disambiguation rules [4] to get the geographical focus of the video: 1) select the most populated place that is not a state, country, or continent and whose state appears in the text; 2) otherwise, select the most populated place that is not a state, country, or continent and whose country appears in the text; 3) otherwise, select the most populated state whose country appears in the text; 4) otherwise, apply population heuristics. A sketch of this rule cascade is given below.
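To make the priority order of the rules concrete, here is a minimal Python sketch of the cascade. It is illustrative only, not the authors' implementation: the candidate records (with hypothetical name, lat, lon, population, feature, state, and country fields) stand in for the output of a Geonames gazetteer lookup, and rule 4's population heuristics are reduced to simply picking the most populated candidate.

```python
# Illustrative sketch of the GeoKB disambiguation cascade (not the
# authors' code). `candidates` stands in for the places a Geonames
# lookup finds in the video's annotations; the field names and the
# state/country containment tests are hypothetical simplifications.

def geokb_focus(candidates, text_toponyms):
    """Return (lat, lon, rule_id) for the video's geographical focus.

    candidates: list of dicts with keys name, lat, lon, population,
        feature ('city' | 'state' | 'country' | 'continent'),
        state, country (names of the enclosing admin regions)
    text_toponyms: set of lowercased place names detected in the text
    """
    def most_populated(places):
        return max(places, key=lambda p: p["population"], default=None)

    non_admin = [c for c in candidates
                 if c["feature"] not in ("state", "country", "continent")]

    # Rule 1: most populated non-admin place whose state is in the text.
    best = most_populated([c for c in non_admin
                           if c["state"].lower() in text_toponyms])
    if best:
        return best["lat"], best["lon"], 1

    # Rule 2: most populated non-admin place whose country is in the text.
    best = most_populated([c for c in non_admin
                           if c["country"].lower() in text_toponyms])
    if best:
        return best["lat"], best["lon"], 2

    # Rule 3: most populated state whose country is in the text.
    best = most_populated([c for c in candidates
                           if c["feature"] == "state"
                           and c["country"].lower() in text_toponyms])
    if best:
        return best["lat"], best["lon"], 3

    # Rule 4: population heuristics, reduced here to the most
    # populated candidate overall.
    best = most_populated(candidates)
    return (best["lat"], best["lon"], 4) if best else None
```

The rule identifier returned with the coordinates matters later: the GeoFusion combination described below only keeps GeoKB predictions produced by rules 1 to 3.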
2) Information Retrieval with Re-Ranking. This approach is similar to the one presented by [6]. It uses the Terrier IR software (http://terrier.org, version 3.0) with the Hiemstra Language Modelling (HLM) weighting model [5], keeping Terrier's default value of 0.15 for the lambda (λ) parameter. Equation 1 shows the Terrier implementation of the HLM weighting model (version 1 [5]) score of a term t in a document d, where tf_{t,d} is the term frequency in the document, cf_t is the collection frequency of the term, \sum_i cf_i is the number of tokens in the collection, and \sum_i tf_{i,d} is the document length:

$$\mathrm{Score}(t,d) = \log\!\left(1 + \frac{\lambda \cdot tf_{t,d} \cdot \sum_i cf_i}{(1-\lambda) \cdot cf_t \cdot \sum_i tf_{i,d}}\right) \qquad (1)$$

The indexing of the metadata subsets was done with the coordinates as the document number and their associated tagsets as the document text. We indexed with filtering, using the multilingual stopword lists, and without stemming. The following metadata fields (lowercased) from the videos were used for the query: Keywords (tags), Title, and Description.

A Re-Ranking (RR) process is applied after the IR process. For each topic, the first 1000 retrieved coordinate pairs from the IR software are used. From them, we selected the subset of coordinate pairs with a weight equal to or greater than two-thirds (66.66%) of the weight of the coordinate pair ranked in first position. Then, for each geographical coordinate pair of the subset, we summed its associated weight (provided by the IR software) and the weights of its neighbours within a threshold distance (e.g. 100 km), and selected the pair with the maximum weighted sum.

3) Combination of GeoKB and Information Retrieval with Re-Ranking (GeoFusion). The GeoFusion approach combines the results of the GeoKB approach and the IR approach with Re-Ranking. From the GeoKB system, only the predicted coordinates that come from the Geographical Knowledge heuristics 1, 2, and 3 are selected (avoiding predictions from the population heuristics rules). When the GeoKB rules (applied in priority order: 1, 2, and 3) do not match, the predictions are taken from the IR approach with Re-Ranking. Illustrative sketches of the HLM scoring function, the re-ranking step, and the fusion logic follow.
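First, a minimal sketch of the HLM term score from equation (1). The function name and parameter names are ours, but the arithmetic follows Terrier's version 1 formula with the default λ = 0.15.

```python
import math

def hlm_score(tf_td, cf_t, doc_len, coll_len, lam=0.15):
    """Equation (1): Hiemstra LM (version 1) score of a term in a document.

    tf_td    -- term frequency in the document (tf_{t,d})
    cf_t     -- collection frequency of the term (cf_t)
    doc_len  -- document length (sum_i tf_{i,d})
    coll_len -- number of tokens in the collection (sum_i cf_i)
    lam      -- lambda smoothing parameter (Terrier's default is 0.15)
    """
    return math.log(1 + (lam * tf_td * coll_len) /
                        ((1 - lam) * cf_t * doc_len))
```

Next, the re-ranking step. This sketch assumes the IR engine returns a best-first list of ((lat, lon), weight) pairs; the haversine distance is one reasonable choice for the neighbourhood test, though the paper does not say which distance function was used.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def rerank(results, threshold_km=100.0):
    """Pick a coordinate pair from a best-first [((lat, lon), weight), ...]
    list (at most 1000 entries per topic) by neighbourhood-weighted sum."""
    if not results:
        return None
    # Keep only pairs weighing at least 2/3 of the top-ranked weight.
    cutoff = results[0][1] * 2.0 / 3.0
    subset = [(coord, w) for coord, w in results if w >= cutoff]
    # Sum each surviving pair's weight with the weights of its neighbours
    # within threshold_km (its own weight is included, since its distance
    # to itself is 0).
    def weighted_sum(coord):
        return sum(w for other, w in subset
                   if haversine_km(coord, other) <= threshold_km)
    # Return the coordinates with the maximum weighted sum.
    return max(subset, key=lambda cw: weighted_sum(cw[0]))[0]
```

Finally, the fusion logic, reusing the hypothetical geokb_focus and rerank functions from the sketches above:

```python
def geofusion(geokb_result, ir_candidates, threshold_km=100.0):
    """Combine GeoKB and IR+Re-Ranking predictions.

    geokb_result  -- (lat, lon, rule_id) from geokb_focus(), or None
    ir_candidates -- the IR engine's best-first ((lat, lon), weight) list
    """
    # Trust GeoKB only when one of the high-precision rules 1-3 fired;
    # rule 4 (population heuristics) predictions are discarded.
    if geokb_result is not None and geokb_result[2] in (1, 2, 3):
        return geokb_result[0], geokb_result[1]
    # Otherwise fall back to the data-driven IR + Re-Ranking prediction.
    return rerank(ir_candidates, threshold_km)
```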
We used two corpora for training the IR system for MediaEval 2011: 1) the MediaEval 2011 Flickr corpus (3,185,258 photos) and 2) the union of the MediaEval corpus with the CoPhIR image collection (http://cophir.isti.cnr.it) [2] (106 million processed images). From the MediaEval corpus we filtered and extracted 1,026,993 coordinates (accuracies between 6 and 16 zoom levels) with their associated tagsets. From CoPhIR we selected the photos with geographical referencing with accuracies between 6 and 16 zoom levels (8,428,065 photos) and then filtered out repeated and null content (7,601,117 photos remaining). The union of the extracted data from CoPhIR and MediaEval gives a total of 2,488,965 unique coordinates with associated tagsets.

3. EXPERIMENTS AND RESULTS

We designed a set of four experiments (see Table 1) for the MediaEval 2011 Placing Task test set of 5347 Flickr videos. The experiment TALP1 used the IR approach with Re-Ranking up to 100 km and the MediaEval 2011 photo corpus as training data. The experiment TALP2 used the GeoKB approach. The experiment TALP3 used the GeoFusion approach with the MediaEval training corpus. The experiment TALP5 used the GeoFusion approach with the MediaEval and CoPhIR photo corpora for training. The results are shown in Figure 1 and Table 2.

Table 1: MediaEval 2011 Placing Task experiments.

run    Approach                      Training Corpus
TALP1  IR Re-Rank (100km)            MediaEval
TALP2  GeoKB                         -
TALP3  GeoKB + IR Re-Rank (100km)    MediaEval
TALP5  GeoKB + IR Re-Rank (100km)    MediaEval + CoPhIR

Figure 1: Accuracy against margin of error in kms for runs TALP1, TALP2, TALP3, and TALP5 (plot omitted; Table 2 gives the underlying counts).

Table 2: Results at the Placing Task (5347 videos).

Margin    TALP1  TALP2  TALP3  TALP5
1km        916    611    781    890
10km      1834   2306   2281   2403
20km      2070   2549   2553   2690
50km      2415   2723   2840   2971
100km     2670   2823   3029   3171
200km     2821   2995   3253   3382
500km     3022   3119   3450   3587
1000km    3278   3247   3670   3799
2000km    3594   3374   3906   4017
5000km    4119   3706   4301   4465
10000km   4975   4688   5076   5151

4. CONCLUSIONS

We used three approaches at the MediaEval 2011 Placing Task. The GeoFusion approach achieved the best results in the experiments, clearly outperforming the other approaches. It achieves the best results because it combines high-precision rules based on Toponym Disambiguation heuristics with predictions that come from a data-driven IR Re-Ranking approach. The GeoKB rules used in the GeoFusion approach achieved 80.18% accuracy (1789 of 2231 videos) predicting up to 100 km. As further work, we plan to improve the accuracy of the GeoKB rules.

Acknowledgments

This work has been supported by the Spanish Research Dept. (KNOW 2, TIN2009-14715-C04-04). Daniel Ferrés is supported by the EBW II Project, which is financed by the European Commission within the framework of the Erasmus Mundus Programme. TALP Research Center is recognized as a Quality Research Group (2001 SGR 00254) by DURSI, the Research Department of the Catalan Government.

5. REFERENCES

[1] Adam Rae, Vanessa Murdock, Pavel Serdyukov, and Pascal Kelm. Working Notes for the Placing Task at MediaEval 2011. In Working Notes of the MediaEval 2011 Workshop, Pisa, Italy, September 2011.
[2] Paolo Bolettieri, Andrea Esuli, Fabrizio Falchi, Claudio Lucchese, Raffaele Perego, Tommaso Piccioli, and Fausto Rabitti. CoPhIR: a Test Collection for Content-Based Image Retrieval. CoRR, abs/0905.4627v2, 2009.
[3] Daniel Ferrés and Horacio Rodríguez. TALP at MediaEval 2010 Placing Task: Geographical Focus Detection of Flickr Textual Annotations. In Working Notes of the MediaEval 2010 Workshop, Pisa, Italy, October 2010.
[4] Daniel Ferrés and Horacio Rodríguez. Georeferencing Textual Annotations and Tagsets with Geographical Knowledge and Language Models. In Actas de la SEPLN 2011, Huelva, Spain, September 2011.
[5] Djoerd Hiemstra. Using Language Models for Information Retrieval. PhD thesis, Enschede, Netherlands, January 2001.
[6] Pavel Serdyukov, Vanessa Murdock, and Roelof van Zwol. Placing Flickr Photos on a Map. In James Allan, Javed A. Aslam, Mark Sanderson, ChengXiang Zhai, and Justin Zobel, editors, SIGIR, pages 484–491, 2009.