<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">USEMP at MediaEval Placing Task 2014</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Adrian</forename><surname>Popescu</surname></persName>
							<email>adrian.popescu@cea.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">CEA</orgName>
								<orgName type="institution" key="instit2">LIST</orgName>
								<address>
									<postCode>91190</postCode>
									<settlement>Gif-sur-Yvette</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Symeon</forename><surname>Papadopoulos</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">CERTH-ITI</orgName>
								<address>
									<settlement>Thermi-Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ioannis</forename><surname>Kompatsiaris</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">CERTH-ITI</orgName>
								<address>
									<settlement>Thermi-Thessaloniki</settlement>
									<country key="GR">Greece</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">USEMP at MediaEval Placing Task 2014</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">77DA0908D6E8402203FE5AFCD3F89677</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We describe the participation of the USEMP team in the Placing Task at MediaEval 2014. We submitted four textual runs inspired by CEA LIST's 2013 participation. Our entries are based on probabilistic place modeling but also exploit machine tag and/or user modeling. The best results were obtained when all these types of information were combined. The accuracy of automatic geotagging at 1 km reaches 0.235 when using only the training data provided by the organizers and 0.441 with the use of external data.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>The goal of the task is to produce location estimates for a test set of 500,000 images and videos using a set of approximately five million geotagged images and videos and their metadata for training. A full description of the challenge and of the associated dataset is provided in <ref type="bibr" target="#b0">[1]</ref>. Our runs largely rely on methods described in CEA LIST's participation in the Placing Task 2013 <ref type="bibr" target="#b1">[2]</ref>. For this reason, after a short presentation of the methods, runs and obtained results, we focus on failure analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">METHOD DESCRIPTION</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Probabilistic location models</head><p>Language models were successfully introduced in <ref type="bibr" target="#b2">[3]</ref> as an alternative to gazetteer-based geolocation and were progressively improved in the following years. Test photos can be placed anywhere in the physical world, and the training data provided by the organizers is insufficient to build robust probabilistic models. To verify the assumption that better results are obtained with the use of more data, we exploited:</p><p>(1) all geotagged metadata from the YFCC dataset<ref type="foot" target="#foot_0">1</ref>, after removing all test items and (2) an additional set of ∼90 million geotagged metadata from Flickr.</p><p>Similar to last year <ref type="bibr" target="#b1">[2]</ref>, the surface of the earth was split into (nearly) rectangular cells of 0.01 degrees of latitude and longitude (approximately 1 km² in size). User counts were used instead of tag counts in order to mitigate the influence of bulk tagging. Both titles and tags were taken into account and are referred to as tags hereafter. Put simply, we computed the probability of a tag in a cell by dividing its user count in that cell by its total user count over all cells. Given a test item, we simply summed the contributions of its individual tags to find the most probable cell for that item. Finally, the photo was placed at the center of the cell.</p><p>Copyright is held by the author/owner(s).</p><p>MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain</p></div>
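<div xmlns="http://www.tei-c.org/ns/1.0"><p>The cell-based model described above can be illustrated with a minimal Python sketch. It is not the code used for the submitted runs: the function names (cell_of, build_model, place), the input layout of (user, latitude, longitude, tags) tuples and the absence of any smoothing or tag filtering are assumptions made for illustration only.</p>
<p>
```python
import math
from collections import defaultdict

CELL_SIZE = 0.01  # cell side in degrees, as stated in the text


def cell_of(lat, lon):
    """Integer index of the 0.01-degree cell that contains the coordinates."""
    return (math.floor(lat / CELL_SIZE), math.floor(lon / CELL_SIZE))


def cell_center(cell):
    """Latitude and longitude of the center of a cell."""
    return ((cell[0] + 0.5) * CELL_SIZE, (cell[1] + 0.5) * CELL_SIZE)


def build_model(items):
    """items: iterable of (user_id, lat, lon, tags) tuples (hypothetical layout).

    Returns {tag: {cell: probability}}, where the probability is the number of
    distinct users of the tag in the cell divided by the tag's user counts
    summed over all cells.
    """
    users = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, tags in items:
        cell = cell_of(lat, lon)
        for tag in set(tags):
            users[tag][cell].add(user)
    model = {}
    for tag, cells in users.items():
        total = sum(len(u) for u in cells.values())
        model[tag] = {cell: len(u) / total for cell, u in cells.items()}
    return model


def place(tags, model):
    """Sum per-tag contributions and return (cell center, score), or None."""
    scores = defaultdict(float)
    for tag in set(tags):
        for cell, probability in model.get(tag, {}).items():
            scores[cell] += probability
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return cell_center(best), scores[best]
```
</p>
<p>Counting distinct users rather than tag occurrences means that a single user who bulk-tags hundreds of photos contributes only once per cell, which is the mitigation mentioned in the text.</p></div>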
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Machine tag modeling</head><p>The authors of <ref type="bibr" target="#b3">[4]</ref> show that machine tags can improve automatic geotagging quality. In <ref type="bibr" target="#b1">[2]</ref> we proposed a machine tag processing method which models only machine tags that are strongly associated with locations (i.e. Foursquare, Lastfm and Upcoming entries), and we exploited it again this year.</p></div>
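<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the selection step only, the sketch below keeps machine tags whose namespace is one of the three sources named above. It assumes the usual namespace:predicate=value machine tag syntax and lowercase namespace spellings; both are assumptions, and the way the retained tags are mapped to coordinates in [2] is not reproduced here.</p>
<p>
```python
# Namespaces named in the text; the lowercase spellings are an assumption.
LOCATION_NAMESPACES = {"foursquare", "lastfm", "upcoming"}


def parse_machine_tag(tag):
    """Split 'namespace:predicate=value' into its parts, or return None."""
    namespace, colon, rest = tag.partition(":")
    predicate, equals, value = rest.partition("=")
    if not (colon and equals and namespace and predicate and value):
        return None
    return namespace.lower(), predicate.lower(), value


def location_machine_tags(tags):
    """Keep only well-formed machine tags from location-related namespaces."""
    parsed = (parse_machine_tag(tag) for tag in tags)
    return [p for p in parsed if p is not None and p[0] in LOCATION_NAMESPACES]
```
</p>
<p>Plain tags and machine tags from other namespaces are discarded by this filter and would be handled by the location models instead.</p></div>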
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">User modeling</head><p>If images do not have associated tags or if these tags are not geographically discriminative, placing photos with probabilistic models is likely to fail. To overcome this problem, we exploited a simple user modeling technique <ref type="bibr" target="#b1">[2]</ref>, which computes the most probable cell of a user. To reduce the risk of learning from test data, only photos which are at least 24 hours away from any of the user's test set images were exploited. We downloaded up to 500 geotagged images per user in order to determine her most probable cell.</p></div>
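<div xmlns="http://www.tei-c.org/ns/1.0"><p>A possible reading of this user model is sketched below in Python. The function name, the (timestamp, latitude, longitude) input layout and the choice of the most frequent cell as the "most probable" one are assumptions; the text does not say which timestamp is compared, so the sketch simply takes one datetime per photo.</p>
<p>
```python
import math
from collections import Counter
from datetime import timedelta

MIN_GAP = timedelta(hours=24)  # temporal separation stated in the text


def most_probable_user_cell(photos, test_times, cell_size=0.01):
    """Most frequent cell among a user's photos that are far from test items.

    photos: list of (timestamp, lat, lon) tuples (hypothetical layout).
    test_times: timestamps of the same user's test set items.
    """
    counts = Counter()
    for timestamp, lat, lon in photos:
        # Keep a photo only if it is at least 24 hours away from every test item.
        if all(abs(timestamp - t) >= MIN_GAP for t in test_times):
            cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
            counts[cell] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```
</p>
<p>When every photo of a user falls inside the 24-hour window, the function returns None and no user-based placement is available for that user.</p></div>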
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Fusion</head><p>We propose a late fusion scheme which was empirically derived from tests on a validation dataset. Since they are associated with precise locations or geolocated events, processed machine tags are very reliable and were given priority. If there were no machine tags, location models were exploited to predict the most probable location of a set of tags. Finally, if there were no tags available or if the prediction score was below a threshold, the photo was placed in the most probable cell of the user who uploaded it. The threshold for replacing location models with user models was empirically determined on the validation dataset. We exploited user models for the 30% of test images which had the lowest placing scores.</p></div>
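<div xmlns="http://www.tei-c.org/ns/1.0"><p>The priority order of this fusion scheme can be written as a short cascade. The Python sketch below is illustrative only: the function names, the argument layout and the behavior when a low-scoring tag prediction exists but no user model is available (the tag prediction is kept) are assumptions, as is the percentile helper used to pick a threshold from validation scores.</p>
<p>
```python
def fuse(machine_tag_location, tag_prediction, user_cell_center, threshold):
    """Return (location, source) following the priority order in the text.

    machine_tag_location: (lat, lon) or None
    tag_prediction: ((lat, lon), score) or None
    user_cell_center: (lat, lon) or None
    """
    # 1. Machine tags are considered the most reliable cue.
    if machine_tag_location is not None:
        return machine_tag_location, "machine_tag"
    # 2. Location models, when the score reaches the threshold.
    #    Assumption: a low-scoring prediction is kept if no user model exists.
    if tag_prediction is not None:
        location, score = tag_prediction
        if score >= threshold or user_cell_center is None:
            return location, "location_model"
    # 3. Fall back to the most probable cell of the uploader.
    if user_cell_center is not None:
        return user_cell_center, "user_model"
    return None, "none"


def threshold_for_fraction(scores, fraction=0.3):
    """Score below which roughly `fraction` of the given scores fall."""
    ranked = sorted(scores)
    index = min(round(len(ranked) * fraction), len(ranked) - 1)
    return ranked[index]
```
</p>
<p>With fraction set to 0.3, the helper yields a threshold under which about 30% of the validation scores fall, which mirrors the share of test images handled by user models in our runs.</p></div>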
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">RUNS</head><p>We submitted the following runs: RUN1 - exploited location models and machine tags from the training data provided by the organizers; RUN3 - combined location models and machine tags from the entire geotagged YFCC dataset, after excluding test items; RUN4 - exploited tags and user models; RUN5 - exploited YFCC location models, machine tags and user models. We present the performance of the submitted runs in Table <ref type="table" target="#tab_0">1</ref>. The best results were obtained when combining all types of available information. As expected, the largest contribution was due to location models. The large gap between RUN1 and the others confirms that the use of supplementary training data is very beneficial. The difference in precision at close range (P@0.1) between RUN3 and RUN4 confirms that machine tags are very useful for precise geolocation. Conversely, if larger errors are tolerated, user models become more useful than machine tags. The combination of these types of cues in RUN5 gives the best performance for all precision ranges. The results obtained this year are in the same range as those we reported in 2013 <ref type="bibr" target="#b1">[2]</ref>, thus confirming that our geolocation pipeline behaves consistently over different datasets.</p></div>
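<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, P@X is the fraction of test items whose predicted location lies within X km of the ground truth. The Python sketch below shows one way to compute it; the haversine great-circle distance and the mean Earth radius of 6371 km are assumptions and may differ from the distance computation used by the task organizers.</p>
<p>
```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def precision_at(predicted, truth, radii_km=(0.01, 0.1, 1, 10, 100, 1000)):
    """Fraction of items placed within each radius of their true location."""
    distances = [haversine_km(p, t) for p, t in zip(predicted, truth)]
    return {r: sum(r >= d for d in distances) / len(distances) for r in radii_km}
```
</p>
<p>The default radii are the six ranges reported in Table 1; the function expects non-empty, equally long lists of predicted and true coordinates.</p></div>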
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">FAILURE ANALYSIS</head><p>In addition to the submitted runs, we tested other configurations which gave lower results and briefly describe them here. We notably tried a combination of location models and gazetteer information in order to give a privileged role to toponyms such as administrative division names (i.e. countries, regions, cities). The addition of the gazetteer gave lower results than the use of location models alone. This negative result could be explained by the strong ambiguity which characterizes the geographic domain. As mentioned above, we also tried to add a dataset of ∼90 million geotagged metadata to the full YFCC training data. Contrary to existing literature <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b1">2]</ref>, the use of this supplementary dataset actually degraded the overall quality of results. This negative result might indicate that probabilistic models reach saturation when too much metadata is available.</p><p>In Figures <ref type="figure" target="#fig_1">1 and 2</ref>, we present a visualization of geotagging performance for RUN1 and RUN5; the performance difference between the two runs is clearly reflected. Geotagging is precise in most European regions and worse in other regions. Low performance can be easily explained by sparse data for Africa, Asia or South America. However, the imprecision is also high for the United States, the region of the world which concentrates the largest number of geotagged images. In this case, poor geotagging could be due to a very high ambiguity of place names. For instance, there are dozens of places called London or Paris in the US. If there is not enough disambiguation information associated with them in annotations, photos tagged with these toponyms will be placed in Europe.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">FUTURE WORK</head><p>Due to lack of time, we did not submit a visual run this year. While visual geotagging still lags well behind textual geotagging, it would be interesting to explore whether it is possible to predict coordinates accurately at least for visually distinctive objects such as Points of Interest. Regarding text models, we would like to investigate in more depth why adding more data from outside YFCC degrades performance. It would be equally interesting to investigate ways to select reliable annotations before computing location models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">ACKNOWLEDGMENT</head></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Average error plot for RUN1. Red/blue dots correspond to large errors/precise geotagging respectively.</figDesc><graphic coords="2,59.69,151.27,229.42,177.79" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Average error plot for RUN5. Red/blue dots correspond to large errors/precise geotagging respectively.</figDesc><graphic coords="2,322.70,53.80,229.42,177.79" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>P@X: precision at X km.</figDesc><table><row><cell>Run</cell><cell>0.01</cell><cell>0.1</cell><cell>1</cell><cell>10</cell><cell>100</cell><cell>1000</cell></row><row><cell>#1</cell><cell>0.007</cell><cell>0.016</cell><cell>0.235</cell><cell>0.408</cell><cell>0.481</cell><cell>0.618</cell></row><row><cell>#3</cell><cell>0.026</cell><cell>0.043</cell><cell>0.428</cell><cell>0.582</cell><cell>0.644</cell><cell>0.753</cell></row><row><cell>#4</cell><cell>0</cell><cell>0.012</cell><cell>0.418</cell><cell>0.597</cell><cell>0.679</cell><cell>0.779</cell></row><row><cell>#5</cell><cell>0.026</cell><cell>0.043</cell><cell>0.441</cell><cell>0.613</cell><cell>0.691</cell><cell>0.787</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&amp;did=67</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1">This work is supported by the USEMP FP7 project, partly funded by the EC under contract number 611596.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The placing task: A large-scale geo-estimation challenge for social-media videos and images</title>
		<author>
			<persName><forename type="first">J</forename><surname>Choi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of GeoMM&apos;14</title>
				<meeting>GeoMM&apos;14</meeting>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">CEA LIST&apos;s participation at MediaEval 2013 Placing Task</title>
		<author>
			<persName><forename type="first">A</forename><surname>Popescu</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>MediaEval</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Placing Flickr photos on a map</title>
		<author>
			<persName><forename type="first">P</forename><surname>Serdyukov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of SIGIR</title>
				<meeting>SIGIR</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Retrieving geo-location of videos with a divide &amp; conquer hierarchical multimodal approach</title>
		<author>
			<persName><forename type="first">M</forename><surname>Trevisiol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICMR</title>
		<imprint>
			<biblScope unit="page" from="1" to="8" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
