=Paper=
{{Paper
|id=Vol-1263/paper77
|storemode=property
|title=TALP-UPC at MediaEval 2014 Placing Task: Combining Geographical Knowledge Bases and Language Models for Large-Scale Textual Georeferencing
|pdfUrl=https://ceur-ws.org/Vol-1263/mediaeval2014_submission_77.pdf
|volume=Vol-1263
|dblpUrl=https://dblp.org/rec/conf/mediaeval/FerresR14
}}
==TALP-UPC at MediaEval 2014 Placing Task: Combining Geographical Knowledge Bases and Language Models for Large-Scale Textual Georeferencing==
TALP-UPC at MediaEval 2014 Placing Task: Combining Geographical Knowledge Bases and Language Models for Large-Scale Textual Georeferencing

Daniel Ferrés, Horacio Rodríguez
TALP Research Center, Computer Science Department, Universitat Politècnica de Catalunya
{dferres,horacio}@cs.upc.edu

ABSTRACT

This paper describes our georeferencing approaches, experiments, and results at the MediaEval 2014 Placing Task evaluation. The task consists of predicting the most probable geographical coordinates of Flickr images and videos using their visual, audio, and associated metadata features. Our approaches used only the Flickr users' textual metadata annotations and tagsets. We used four approaches for this task: 1) an approach based on Geographical Knowledge Bases (GeoKB), 2) the Hiemstra Language Model (HLM) approach with Re-Ranking, 3) a combination of the GeoKB and the HLM (GeoFusion), and 4) a combination of the GeoFusion with an HLM model derived from the English Wikipedia georeferenced pages. The HLM approach with Re-Ranking showed the best performance within the 10m to 1km distances. The GeoFusion approaches achieved the best results within the margins of error from 10km to 5000km.

1. INTRODUCTION

The MediaEval 2014 Placing Task requires that participants use systems that automatically assign geographical coordinates (latitude and longitude) to Flickr photos and videos using one or more of the following data: Flickr metadata, visual content, audio content, and social information (see [1] for more details about this evaluation). The Placing Task training data consists of 5,000,000 geotagged photos and 25,000 geotagged videos, and the test data consists of 500,000 photos and 10,000 videos. Evaluation of results is done by calculating the distance from the actual point (assigned by a Flickr user) to the predicted point (assigned by a participant). Runs are evaluated by finding how many videos were placed within certain threshold distances.

2. SYSTEM DESCRIPTION

We used four approaches for the MediaEval 2014 Placing Task (see more details about the approaches in [3]):

1) Geographical Knowledge Bases (GeoKB). We used this approach in the MediaEval 2010 and 2011 Placing Tasks [2] [4] after some improvements (see [3]). The GeoKB approach uses the Geonames Gazetteer (downloaded in 2011; http://www.geonames.org) for detecting place names, stopwords lists, and an English dictionary. The system uses the following rules from Toponym Disambiguation techniques [3] to get the geographical focus of the photo/video: 1) select the most populated place that is not a state, country, or continent and has its state appearing in the text; 2) otherwise, select the most populated place that is not a state, country, or continent and has its country appearing in the text; 3) otherwise, select the most populated state that has its country appearing in the text; 4) otherwise, apply population heuristics.

2) Hiemstra Language Model (HLM) with Re-Ranking. This approach uses the Terrier Information Retrieval (IR) software (version 3.0; http://terrier.org) with the HLM weighting model [5]. The HLM default lambda (λ) parameter value in Terrier (0.15) was used. See [3] for more details about the Terrier implementation of the HLM weighting model (version 1 [5]). The indexing of the metadata subsets was done with the coordinates as a document number and some metadata fields (Title, Description, and User Tags) as the document text. For each unique coordinate, a document was created with the textual metadata field contents of all the photos/videos that pertain to that coordinate. The indexing process uses a multilingual stopwords list to filter the tokens that are indexed. The following metadata fields (lowercased) from the photos/videos were used for the query: User Tags, Title, and Description. A Re-Ranking process is applied after the IR process. For each topic, the first 1000 coordinate pairs retrieved by the IR software are used. From them we select the subset of coordinate pairs with a score equal to or greater than two thirds (66.66%) of the score of the top-ranked coordinate pair. Then, for each geographical coordinate pair in the subset, we sum its associated score (provided by the IR software) and the scores of its neighbours within a threshold distance (e.g. 100km). Finally, we select the pair with the maximum weighted sum.

3) GeoFusion: Hiemstra Language Model with Re-Ranking and GeoKB. This approach combines the results of the GeoKB approach and the IR approach with Re-Ranking. From the GeoKB system, only the predicted coordinates that come from heuristics 1, 2, and 3 are selected (avoiding predictions from the population heuristics rules). When the GeoKB rules (applied in priority order: 1, 2, and 3) do not match, the predictions are selected from the HLM approach with Re-Ranking.

4) GeoFusion+GeoWiki: GeoFusion combined with an HLM model of georeferenced Wikipedia pages. This is the only improvement with respect to the system used at MediaEval 2011. This approach uses a set of 857,574 georeferenced Wikipedia pages that were indexed with Terrier. The coordinates of the top-ranked georeferenced Wikipedia page are used as a prediction. The predictions from the georeferenced-Wikipedia-based HLM model are used only when the HLM model with Re-Ranking based on the training data gives a score lower than 7.0. This threshold was found empirically by training with the MediaEval 2011 test set.

Copyright is held by the author/owner(s). MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain.

[Figure 1: Accuracy against margin of error in kms — accuracy (%) of runs run1, run3, run4, and run5 plotted against margins of error from 0.01 to 5000 km.]
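The GeoKB rule cascade described in Section 2 can be sketched as follows. This is a minimal illustration only: the simplified `Place` data model, the substring matching, and the "most populated candidate" fallback for rule 4 are assumptions made for the sake of a runnable example, not the authors' Geonames-based implementation.

```python
# Sketch of the four-rule toponym-disambiguation cascade from Section 2.
# The data model and helpers are illustrative assumptions, not the paper's code.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Place:
    name: str
    feature: str          # e.g. "city", "state", "country", "continent"
    population: int
    state: Optional[str]  # containing state name, if any
    country: Optional[str]

def geo_focus(candidates: list[Place], text: str) -> Optional[Place]:
    """Apply rules 1-4 in priority order to pick the geographical focus."""
    text = text.lower()
    cities = [p for p in candidates
              if p.feature not in ("state", "country", "continent")]
    # Rule 1: most populated non-admin place whose state appears in the text.
    r1 = [p for p in cities if p.state and p.state.lower() in text]
    if r1:
        return max(r1, key=lambda p: p.population)
    # Rule 2: most populated non-admin place whose country appears in the text.
    r2 = [p for p in cities if p.country and p.country.lower() in text]
    if r2:
        return max(r2, key=lambda p: p.population)
    # Rule 3: most populated state whose country appears in the text.
    r3 = [p for p in candidates if p.feature == "state"
          and p.country and p.country.lower() in text]
    if r3:
        return max(r3, key=lambda p: p.population)
    # Rule 4: population heuristics (here simplified to "most populated").
    return max(candidates, key=lambda p: p.population) if candidates else None
```

For example, the ambiguous toponym "Paris" would resolve to Paris, Texas when "texas" appears among the tags (rule 1), and to Paris, France when only "france" does (rule 2).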
The system uses the coordinates of one of the most photographed places in the world as a prediction when the approaches cannot give a prediction.

3. EXPERIMENTS AND RESULTS

We designed a set of four experiments for the MediaEval 2014 Placing Task (Main Task) test set of 510,000 Flickr photos and videos (see results in Figure 1 and Table 1):

1. The experiment run1 used the HLM approach with Re-Ranking up to 100 km and the MediaEval 2014 training set metadata as training data. From a set of 5,050,000 photos and videos of the MediaEval 2014 training set, a set of 3,057,718 coordinate pairs with related metadata info were created as textual documents and then indexed with Terrier.

2. The experiment run3 used the GeoKB approach.

3. The experiment run4 used the GeoFusion approach with the MediaEval training corpora.

4. The experiment run5 used the GeoFusion approach with the MediaEval training corpora in combination with the HLM model of the English Wikipedia georeferenced pages (http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Hauptseite/Wikipedia-World/en).

Table 1: Percentage of correctly georeferenced photos/videos within certain amounts of kilometers, and median error for each run.

 Margin             | run1  | run3   | run4  | run5
 10m                | 0.29  | 0.08   | 0.23  | 0.23
 100m               | 4.12  | 0.80   | 3.00  | 3.00
 1km                | 16.54 | 10.71  | 15.90 | 15.90
 10km               | 34.34 | 33.89  | 38.52 | 38.53
 100km              | 51.06 | 42.35  | 52.47 | 52.47
 1000km             | 64.67 | 52.54  | 65.87 | 65.86
 5000km             | 78.63 | 69.84  | 79.29 | 79.28
 Median Error (kms) | 83.98 | 602.21 | 64.36 | 64.41

4. CONCLUSIONS

We used four approaches at the MediaEval 2014 Placing Task. The GeoFusion approaches achieved the best results in the experiments, clearly outperforming the other approaches. They achieve the best results because they combine high-precision rules based on Toponym Disambiguation heuristics with predictions that come from HLM models. The GeoKB rules used in the GeoFusion approach achieved 81.17% accuracy (131,207 of 161,628 photos/videos) when predicting up to 100km. The most difficult cases for prediction with our text-based approach are the ones with little textual information and few tags. In this evaluation we tried an approach that sometimes uses the English Wikipedia georeferenced pages to handle these cases. The GeoFusion+GeoWiki approach (which uses an HLM model of English Wikipedia georeferenced pages) does not generally offer better performance than the original GeoFusion approach; it only improved the results very slightly for estimations at 10km. The HLM approach with Re-Ranking obtained the best results in the 10m to 1km range because the model benefits from relating non-geographical descriptive keywords and place names appearing in the metadata associated with the geographical coordinates.

Acknowledgments

This work has been supported by the Spanish Research Department (SKATER Project: TIN2012-38584-C06-01). The TALP Research Center is recognized as a Quality Research Group (2014 SGR 1338) by AGAUR, the Research Department of the Catalan Government.

5. REFERENCES

[1] J. Choi, B. Thomee, G. Friedland, L. Cao, K. Ni, D. Borth, B. Elizalde, L. Gottlieb, C. Carrano, R. Pearce, and D. Poland. The Placing Task: A Large-Scale Geo-Estimation Challenge for Social-Media Videos and Images. In Proceedings of the 3rd ACM International Workshop on Geotagging and Its Applications in Multimedia, 2014.

[2] D. Ferrés and H. Rodríguez. TALP at MediaEval 2010 Placing Task: Geographical Focus Detection of Flickr Textual Annotations. In Working Notes of the MediaEval 2010 Workshop, Pisa, Italy, October 2010.

[3] D. Ferrés and H. Rodríguez. Georeferencing Textual Annotations and Tagsets with Geographical Knowledge and Language Models. In Actas de la SEPLN 2011, Huelva, Spain, September 2011.

[4] D. Ferrés and H. Rodríguez. TALP at MediaEval 2011 Placing Task: Georeferencing Flickr Videos with Geographical Knowledge and Information Retrieval. In Working Notes Proceedings of the MediaEval 2011 Workshop, Santa Croce in Fossabanda, Pisa, Italy, September 1-2, 2011.

[5] D. Hiemstra. Using Language Models for Information Retrieval. PhD thesis, Enschede, The Netherlands, January 2001.
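As a supplementary illustration, the Re-Ranking step of the HLM approach (Section 2) can be sketched as follows: keep the retrieved coordinate pairs scoring at least two thirds of the top score, then select the pair whose own score plus the scores of its neighbours within a threshold distance is maximal. The haversine helper and the input layout are assumptions made to keep the example self-contained; the cutoff and selection rule follow the paper's description.

```python
# Sketch of the Re-Ranking step from Section 2. The haversine distance and
# the ((lat, lon), score) input layout are illustrative assumptions.
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def rerank(retrieved, threshold_km=100.0):
    """retrieved: list of ((lat, lon), score) pairs, ranked by the IR engine.
    Returns the coordinate pair with the maximum neighbour-weighted score."""
    if not retrieved:
        return None
    top_score = retrieved[0][1]
    # Keep pairs scoring >= 2/3 (66.66%) of the top-ranked pair's score.
    subset = [(c, s) for c, s in retrieved if s >= (2.0 / 3.0) * top_score]
    # Weight each pair by the sum of scores of all subset members (itself
    # included) lying within the threshold distance, then take the maximum.
    def weighted(pair):
        coord, _ = pair
        return sum(s for c, s in subset
                   if haversine_km(coord, c) <= threshold_km)
    best_coord, _ = max(subset, key=weighted)
    return best_coord
```

This neighbour-sum weighting is what lets a cluster of nearby mid-scoring coordinates outrank a single isolated high-scoring one.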