<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Do geographic features impact pictures location shared on the Web? Modeling photographic suitability in the Swiss Alps.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Produit Timothée</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tuia Devis</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>de Morsier Frank</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Golay François</string-name>
        </contrib>
        <aff>LaSIG laboratory; Signal Processing Laboratory (LTS); Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland</aff>
      </contrib-group>
      <fpage>22</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>Nowadays, millions of landscape images are uploaded to photo-sharing platforms such as Flickr or Panoramio. More and more of these images are also accurately geotagged via GPS devices mounted on personal cameras. Each image results from a twofold spatial choice: the choice of the location and the choice of the picture subject. In this study, our focus is on landscape images in large touristic areas. First, our goal is to learn which geographic features play a role in the choice of the location of shared images. For our analysis, we extract a series of geographic features from a Digital Elevation Model (DEM) and a Topographic Landscape Model (TLM) and model the photographic suitability as a density estimation problem in the space of the geographic features. Second, each combination of geographic features of a region is associated with a probability of being a location suitable for a photograph. The resulting map is useful to promote tourism, to evaluate landscape attractiveness or, with a more technical objective, as a prior in close-range photogrammetry. This study shows that databases of geotagged pictures can also be used to understand tourists' behaviour in rural areas, even though most current research addresses cities. The application to a touristic region in the Swiss Alps shows that the proposed method suits this Geographic One-Class Data problem well and is more accurate than both standard KPCA and One-Class SVM at modeling the suitability for touristic photography at locations unseen during training. As expected, picture locations are mostly correlated with geographic features extracted from the digital elevation model, as well as with those related to accessibility (distance to roads, paths). However, the strength of this study is the combination of the geographic features via a kernel method to model suitable picture locations more accurately.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>H.2.8 [Database management]: Database Applications: Spatial databases and GIS</p>
      <p>Keywords: Geographic One-Class data (GOCD), image geotags, landscape images, density mapping</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>With the advent of "Web 2.0", the number of
collaborative databases has increased dramatically. Very often, the
database objects are geotagged, meaning that each carries an attribute
describing its geographic location.</p>
      <p>
        In this paper, we focus on collective picture databases.
Pictures uploaded to the Web via a photo-sharing platform
(Flickr, Panoramio, Instagram...) have a time stamp, some
textual tags describing their content and sometimes a world
coordinate representing the geographic position. We are
especially interested in the geotag, which can be attributed
in several ways. Most images are located with a click
on a map; the accuracy of this localization depends on the
zoom level used and on the ability to recognize an area in
an aerial view. In contrast, more and more cameras
have a GPS device able to record the latitude
and longitude coordinates very accurately. As stated in [<xref ref-type="bibr" rid="ref7">7</xref>], the location
of a picture is the result of two spatial choices: the choice of
the location (the place where the photographer stands) and
the choice of the subject (the object at which the camera
points). This study focuses on the first choice: it benefits
from the GPS accuracy to learn which geographic features
describe the locations chosen.
      </p>
      <p>
        Such a study aims at providing a map of the suitability of
each location for taking pictures. This map can be useful in several ways.
First, it could be used to promote tourism in areas sharing
similar geographic features with famous regions. Second, it
can be used to assess landscape attractiveness, a measure
needed in environmental planning [<xref ref-type="bibr" rid="ref13">13</xref>]. Finally, photographs
shared in landscape areas could be a valuable source of
information about natural phenomena (displacement
of glaciers [<xref ref-type="bibr" rid="ref18">18</xref>], landscape change [<xref ref-type="bibr" rid="ref3">3</xref>], local meteorology...);
to meet this goal they need to be located and oriented
accurately. Recent research shows that pose estimation for
landscape images (in computer vision vocabulary, the computation
of the location and orientation of a camera) requires priors
which can be provided by GIS data: horizon and 3D models
[<xref ref-type="bibr" rid="ref1 ref2 ref4">1, 2, 4</xref>], aerial views [<xref ref-type="bibr" rid="ref17">17</xref>], sun position [<xref ref-type="bibr" rid="ref11">11</xref>]. The proposed
map can thus be used to extract the most probable picture
locations in a neighborhood.
      </p>
      <p>
        However, collaborative databases are more widely used
to analyse people's behaviour and general trends in tourist
movements in urban areas [<xref ref-type="bibr" rid="ref16 ref23 ref7">23, 16, 7</xref>]. Picture locations
drawn on a map are difficult to read and require a more
appropriate geovisualization. In [<xref ref-type="bibr" rid="ref16 ref7">16, 7</xref>], spatial density maps
(also called heat maps) are used to extract the most
attractive areas easily (see for instance the site sightsmap.com).
The textual tags associated with a world coordinate also
offer many opportunities. For instance, the authors of [<xref ref-type="bibr" rid="ref8">8</xref>] use
the tags to draw the geographic boundaries of fuzzy regions.
They compare the accuracy of Kernel Density Estimation (KDE) and
Support Vector Machines (SVM) in the
extraction of areas such as the Alps. Heat and density maps are
well suited for large scale mapping. However, once zooming
in, the contours of the high density regions become
inaccurate, mainly because of the inaccuracy of the clicked geotags
and the use of a smoothing radius. For instance, locations
such as cliffs can be considered highly attractive for
photographers just because they are within the influence radius of
popular places while, in reality, they are inaccessible.
      </p>
      <p>In order to interpolate probability values at each location
of the map, and not only in areas where pictures are found,
we propose to compute the density in a space generated from
geographic features rather than in the space of latitude and
longitude. To this end, we require precise image locations,
which are provided by the GPS embedded in recent cameras,
and appropriate geographic features. Such geographic
features are extracted with GIS software either from a Digital
Elevation Model (DEM) or from a Topographic Landscape
Model (TLM: roads, lakes, forests).</p>
      <p>
        By modeling the density of pictures in the space of
geographic features, we estimate the probability of being a
picture location over the whole territory considered. This
setting is equivalent to a One-Class (OC) problem, where there
is a set of positive data but no negative data: the locations
of landscape photographs compose the positive set, the set
of "attractive places". However, for the map locations with
no pictures, we do not know whether the absence of photographs
is due to the inappropriateness of the location (a "bad" or
negative place) or simply to the lack of pictures in an
"attractive" location. This type of problem is also common in
geographic classification, for example change detection
from satellite images [<xref ref-type="bibr" rid="ref14 ref5">14, 5</xref>]. Guo [<xref ref-type="bibr" rid="ref9">9</xref>] calls the data of such
OC classification problems in geographical
information science Geographic
One-Class Data (GOCD).
      </p>
      <p>
        In recent years, kernel methods have been widely used
for OC problems. In this study, we consider the following
OC models: the One-Class SVM (OCSVM) [<xref ref-type="bibr" rid="ref21">21</xref>], the Support
Vector Data Description (SVDD) [<xref ref-type="bibr" rid="ref24">24</xref>] and the Kernel PCA
(KPCA) [<xref ref-type="bibr" rid="ref10">10</xref>]. All of them use the kernel trick to project
the original data on a hypersphere in <inline-formula><tex-math>\mathbb{R}^d</tex-math></inline-formula>. In the high
dimensional space, the data are more likely to be separated by a
linear model [<xref ref-type="bibr" rid="ref20">20</xref>]. Guo [<xref ref-type="bibr" rid="ref9">9</xref>] compares OCSVM, Maximum
Entropy (MAXENT) [<xref ref-type="bibr" rid="ref12">12</xref>] and Positive and Unlabeled
Learning (PUL) [<xref ref-type="bibr" rid="ref6">6</xref>] for GOCD problems. The last two methods
provided the most accurate results. However, in our case
study, the hypothesis which states that "the probability of
a negative data being labelled is null" is not valid, which
excludes the use of PUL models. Indeed, our set of image
locations contains some outliers, for instance pictures taken
from a cable car or pictures not related to landscapes. Since
such outliers are present, some of the labeled data belong
to the negative class.
      </p>
      <p>
        We will therefore focus on non-parametric methods for
distribution support estimation. Such methods do not
require knowledge about the distribution of the data and fit
our problem well. Besides support vector methods,
another straightforward method is Kernel Density
Estimation (KDE, also known as the Parzen window) [<xref ref-type="bibr" rid="ref15 ref19">15, 19</xref>]: KDE
estimates the density of the data by applying a local smoothing
filter [<xref ref-type="bibr" rid="ref26">26</xref>].
      </p>
      <p>In this paper, we propose a KDE-based strategy to
estimate the probability distribution function (PDF) of the
data. We estimate the support of the data both in the
original space of geographic features and in the feature space
spanned by KPCA. The PDFs of the labeled and the
unlabeled data are used to define a Bayesian criterion, which
measures the probability for a map location to be an image
location (given its geographic features). To fit the free
parameters of KDE and KPCA, we define a performance
measure to ensure that most of the locations with photographs
are classified as positive and most of the unlabeled
locations are classified as negative. We compare the proposed
approaches with OCSVM and KPCA on a real dataset of
touristic pictures taken in the Swiss Alps.</p>
    </sec>
    <sec id="sec-3">
      <title>METHODOLOGY</title>
    </sec>
    <sec id="sec-4">
      <title>Geographic features extraction</title>
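      <p>The geographic features are extracted using GIS
software for an ensemble of <inline-formula><tex-math>N</tex-math></inline-formula> cells on a grid with 100 m
resolution. The considered features are summarized in table 1
and are computed for each cell <inline-formula><tex-math>z_j</tex-math></inline-formula> of the grid, forming the
unlabeled set <inline-formula><tex-math>z_u = \{z_j\}_{j=1}^{N}</tex-math></inline-formula>, and for the <inline-formula><tex-math>n</tex-math></inline-formula> image locations
<inline-formula><tex-math>z_i</tex-math></inline-formula>, forming the labeled set <inline-formula><tex-math>z_l = \{z_i\}_{i=1}^{n}</tex-math></inline-formula>. Each <inline-formula><tex-math>z_i</tex-math></inline-formula> and <inline-formula><tex-math>z_j</tex-math></inline-formula> is
thus represented by a <inline-formula><tex-math>d</tex-math></inline-formula>-dimensional vector formed by the <inline-formula><tex-math>d</tex-math></inline-formula>
geographic features.</p>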
      <p>The first set of features is extracted from the DEM
(altitude, slope, curvature and visible sky). Then, for the TLM
based features, the distances from the cell to the nearest
forest, lake, road and cable car are computed. All the
geographic features are mean centered and scaled to unit variance.</p>
    </sec>
    <sec id="sec-5">
      <title>GOCD with Kernel density Estimation</title>
      <p>The KDE function in equation (1) is used to evaluate the
density of a data set from the observations of the positive
class zl. Once the density has been estimated, we can
evaluate the density for an unknown location zj.</p>
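      <p>
        <disp-formula id="eq-1"><label>(1)</label><tex-math><![CDATA[\hat{f}(z_j; z_l) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{z_j - z_i}{h}\right)]]></tex-math></disp-formula>
where <inline-formula><tex-math>K(x)</tex-math></inline-formula> is a local smoothing operator, or kernel
function, and <inline-formula><tex-math>h</tex-math></inline-formula> is the bandwidth of the kernel function.
Among the different kernel functions, we used the
Gaussian function:
        <disp-formula id="eq-2"><label>(2)</label><tex-math><![CDATA[K(x) = (2\pi)^{-d/2} \exp\left(-\frac{1}{2} x^{2}\right)]]></tex-math></disp-formula>
Scott's rule is used to compute the bandwidth [<xref ref-type="bibr" rid="ref22">22</xref>]:
        <disp-formula id="eq-3"><label>(3)</label><tex-math><![CDATA[h = n^{-\frac{1}{d+4}} \sigma_{z_l}]]></tex-math></disp-formula>
where <inline-formula><tex-math>\sigma_{z_l}</tex-math></inline-formula> is the covariance of the positive dataset. This
rule uses the dimension <inline-formula><tex-math>d</tex-math></inline-formula> of the dataset and the number
of positive data <inline-formula><tex-math>n</tex-math></inline-formula> to estimate a reasonable <inline-formula><tex-math>h</tex-math></inline-formula>. The choice
of the appropriate kernel function has less influence on the
results than the choice of the proper bandwidth. Indeed, if
the bandwidth is too small, the density is over-fitted to the
positive data set and its generalization power is weak. On
the contrary, if the bandwidth is too large, the density will
be oversmoothed and its small peaks will disappear.
      </p>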
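      <p>Equations (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the features are assumed to be already scaled to unit variance (so that Scott's rule reduces to a scalar factor), and the density is normalised by h to the power d, the usual multivariate form, where equation (1) writes 1/(nh).</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def scott_bandwidth(n, d):
    """Scott's factor n**(-1/(d+4)), assuming unit-variance features."""
    return n ** (-1.0 / (d + 4))


def gaussian_kde(query, samples, h):
    """Gaussian kernel density at `query` from d-dimensional `samples`.

    Assumption: a single scalar bandwidth h is shared by all features.
    """
    n = len(samples)
    d = len(query)
    norm = (2.0 * math.pi) ** (-d / 2.0)
    total = 0.0
    for s in samples:
        # Squared distance between query and sample, scaled by the bandwidth.
        sq = sum(((q - v) / h) ** 2 for q, v in zip(query, s))
        total += norm * math.exp(-0.5 * sq)
    # h**d normalisation is an assumption of this sketch (see lead-in).
    return total / (n * h ** d)


# Toy labeled set with two standardised features (made-up values).
labeled = [[0.1, -0.3], [0.4, 0.2], [-0.2, 0.1]]
h = scott_bandwidth(len(labeled), d=2)
density = gaussian_kde([0.0, 0.0], labeled, h)
```
]]></preformat>
      <p>A smaller h makes the estimate follow the individual samples more closely, while a larger h smooths out small peaks, which is the trade-off described above.</p>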
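      <p>We consider the following scheme to describe our GOCD
problem. Let <inline-formula><tex-math>Y \in \{0, 1\}</tex-math></inline-formula> be the event (or class) "is a picture
location": <inline-formula><tex-math>Y = 1</tex-math></inline-formula> for a cell being a picture location and
<inline-formula><tex-math>Y = 0</tex-math></inline-formula> otherwise. The probability of a cell being a "picture
location", given its geographic features <inline-formula><tex-math>z_j</tex-math></inline-formula>, is
        <disp-formula id="eq-4"><label>(4)</label><tex-math><![CDATA[P(Y = 1 \mid z_j) = \frac{p(z_j \mid Y = 1)\, P(Y = 1)}{p(z_j)}]]></tex-math></disp-formula>
<inline-formula><tex-math>P(Y = 1 \mid z_j)</tex-math></inline-formula> is the value we want to compute for each cell of
the map. The data density <inline-formula><tex-math>p(z_j)</tex-math></inline-formula> is estimated with a KDE
on a random set of unlabeled cells, <inline-formula><tex-math>\hat{f}(z_j; z_u)</tex-math></inline-formula>; the conditional
probability <inline-formula><tex-math>p(z_j \mid Y = 1)</tex-math></inline-formula> is estimated with a KDE on the
labeled data only, <inline-formula><tex-math>\hat{f}(z_j; z_l)</tex-math></inline-formula>; and <inline-formula><tex-math>P(Y = 1)</tex-math></inline-formula> is an unknown
constant <inline-formula><tex-math>c</tex-math></inline-formula>. We therefore observe:
        <disp-formula id="eq-5"><label>(5)</label><tex-math><![CDATA[\frac{P(Y = 1 \mid z_j)}{c} = \frac{p(z_j \mid Y = 1)}{p(z_j)}]]></tex-math></disp-formula>
      </p>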
      <p>A threshold can be set on the ratio of
equation 5: below it, a cell is assumed to be drawn from the
generic distribution <inline-formula><tex-math>p(z_j)</tex-math></inline-formula>, while above it the cell is assumed to be drawn
from the distribution of the labeled data <inline-formula><tex-math>p(z_j \mid Y = 1)</tex-math></inline-formula>. In
practice, we set the threshold to one.</p>
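      <p>The decision rule built on equation 5 can be illustrated as follows. This is a minimal sketch under stated assumptions: the two densities are taken as already estimated, the function names and the toy values are hypothetical, and the small floor on the denominator is an addition of this sketch to avoid a division by zero.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def suitability_ratio(f_labeled, f_unlabeled, floor=1e-12):
    """Ratio p(z | Y=1) / p(z), i.e. P(Y=1 | z) / c in the text's notation.

    `floor` is a guard added in this sketch, not part of the text.
    """
    return f_labeled / max(f_unlabeled, floor)


def is_picture_location(f_labeled, f_unlabeled, threshold=1.0):
    """Apply the threshold (set to one in the text) to the density ratio."""
    return suitability_ratio(f_labeled, f_unlabeled) > threshold


# Made-up density pairs (labeled KDE, unlabeled KDE) for two map cells.
cells = {"ridge": (0.5, 0.25), "valley floor": (0.125, 0.5)}
decisions = {name: is_picture_location(fl, fu) for name, (fl, fu) in cells.items()}
```
]]></preformat>
      <p>With the threshold at one, a cell is kept when its combination of geographic features is denser among the picture locations than among the random cells.</p>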
      <p>
        To choose the best set of parameters for the method, a
performance measure needs to be defined. To this end, our
labeled set is divided into three subsets: the first one to train
the KDE, the second to select the parameters and the third for validation. We want
the estimated densities of the training and testing sets to be
similar; nevertheless, both should differ from the random set
density. Indeed, if the density of the random set and the one
issued from the labeled set are similar, the ratio in equation
5 will tend to one. This means that the geographic features
were badly chosen and are not able to distinguish properly
the random data from the labeled ones. In [<xref ref-type="bibr" rid="ref9">9</xref>] the positive
and unlabeled score presented in equation 6 is maximized:
        <disp-formula id="eq-6"><label>(6)</label><tex-math><![CDATA[F_{pu} = \frac{r^{2}}{r_{pos}}]]></tex-math></disp-formula>
where <inline-formula><tex-math>r</tex-math></inline-formula> is the recall (the proportion of correctly predicted
data in the testing set) and <inline-formula><tex-math>r_{pos}</tex-math></inline-formula> is the ratio of locations
predicted positive in the random set.
      </p>
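      <p>In this study, we propose another criterion, better adapted
to the density estimation process: we want to ensure that
the bandwidth <inline-formula><tex-math>h</tex-math></inline-formula> fits both the training and testing
sets well, so that their densities are similar. We want
to maximize <inline-formula><tex-math>C</tex-math></inline-formula> in equation 7:
        <disp-formula id="eq-7"><label>(7)</label><tex-math><![CDATA[C = \frac{s_R^{2}}{s_T^{2} - s_t^{2}} = \frac{s_R^{2}}{(s_T - s_t)(s_T + s_t)}]]></tex-math></disp-formula>
where <inline-formula><tex-math>s_R = 1 - r_{pos}</tex-math></inline-formula>; <inline-formula><tex-math>s_T = 1 - r_T</tex-math></inline-formula>, with <inline-formula><tex-math>r_T</tex-math></inline-formula> being the
recall for the training set; and <inline-formula><tex-math>s_t = 1 - r_t</tex-math></inline-formula>, with <inline-formula><tex-math>r_t</tex-math></inline-formula> being the
recall for the testing set. Thus, <inline-formula><tex-math>C</tex-math></inline-formula> is very similar to <inline-formula><tex-math>F_{pu}</tex-math></inline-formula>, but
the component <inline-formula><tex-math>(s_T - s_t)</tex-math></inline-formula> ensures that the PDF estimations
of the training and testing sets are similar.</p>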
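      <p>The two selection scores can be computed directly from the recalls. The sketch below follows the definitions given in the text (the positive and unlabeled score is the squared recall divided by the ratio of positive predictions in the random set; C divides the squared complement of that ratio by the difference of the squared complements of the training and testing recalls). The function names are hypothetical, and returning infinity when the two recalls coincide is a choice of this sketch; the text does not say how that case, or a negative denominator, is handled.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def f_pu(recall, r_pos):
    """Positive and unlabeled score: recall squared over the positive ratio."""
    return recall ** 2 / r_pos


def criterion_c(r_train, r_test, r_pos):
    """Criterion C from the recalls on the training, testing and random sets."""
    s_r = 1.0 - r_pos
    s_train = 1.0 - r_train
    s_test = 1.0 - r_test
    denom = (s_train - s_test) * (s_train + s_test)
    if denom == 0.0:
        # Identical recalls: the ratio is unbounded (choice of this sketch).
        return math.inf
    return s_r ** 2 / denom


# Made-up recalls for illustration only.
score_fpu = f_pu(recall=0.8, r_pos=0.2)
score_c = criterion_c(r_train=0.7, r_test=0.8, r_pos=0.2)
```
]]></preformat>
      <p>The closer the training and testing recalls, the smaller the denominator and the larger the magnitude of C, which is how the criterion rewards similar density estimates on the two sets.</p>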
    </sec>
    <sec id="sec-6">
      <title>KDE and KDE(KPCA)</title>
      <p>
        In order to take into account the possible correlation
between the geographic features and the non-linearity in the
data distribution, we propose to estimate the density of the
data either in the original space or in a feature space spanned
by the mapping <inline-formula><tex-math>\phi(\cdot)</tex-math></inline-formula> induced by the KPCA. Hoffmann [<xref ref-type="bibr" rid="ref10">10</xref>]
states that the density function is proportional to the
spherical potential in the feature space. The spherical potential
measures the distance between <inline-formula><tex-math>\phi(z)</tex-math></inline-formula>, the projection of the
point <inline-formula><tex-math>z</tex-math></inline-formula> in the feature space, and the center of the data <inline-formula><tex-math>\phi_0(z)</tex-math></inline-formula>.
However, the spherical potential cannot be used to estimate
the density of the unlabeled data: if the kernel
projection works well to separate the labeled set <inline-formula><tex-math>z_l</tex-math></inline-formula> from the
unlabeled set <inline-formula><tex-math>z_u</tex-math></inline-formula>, their centers of gravity <inline-formula><tex-math>\phi_0(z_l)</tex-math></inline-formula> and <inline-formula><tex-math>\phi_0(z_u)</tex-math></inline-formula> do
not coincide. Consequently, we run the KDE on the
nonlinear features extracted from KPCA.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Comparing approaches: One-Class SVM and KPCA</title>
      <p>
        The kernel functions project the data on a hypersphere
whose dimension equals the number of
training data. OCSVM searches for a hyperplane under two
constraints. First, the intersection between the plane and the
sphere encloses a ratio of the points equal to <inline-formula><tex-math>1 - \mu</tex-math></inline-formula>, where <inline-formula><tex-math>\mu</tex-math></inline-formula>
is the ratio of outliers. Second, the plane has to be as
far as possible from the origin. SVDD [<xref ref-type="bibr" rid="ref24">24</xref>] searches for the
minimal sphere which encloses the points. For "RBF"
functions, OCSVM and SVDD can produce similar results [<xref ref-type="bibr" rid="ref10">10</xref>].
KPCA applies a PCA to the projection of the data on the
sphere. The reconstruction error in the feature space is used
to separate positive and negative data. It appears that this
boundary encloses the data more tightly, resulting in better
results than OCSVM and SVDD.
      </p>
      <p>
        As in [<xref ref-type="bibr" rid="ref9">9</xref>], we use the positive and unlabeled F-score
presented in equation 6 to evaluate the performance and select
the best set of parameters for these two methods. Indeed,
for One-Class problems, the commonly used performance
measures based on the ratio of false positives cannot be
applied, since no negative data are available.
      </p>
    </sec>
    <sec id="sec-8">
      <title>RESULTS</title>
      <p>First, we present the data and geographic features
considered and how the image locations differ from the
distribution of the random locations. Then, we report
experiments using different combinations and different
numbers of geographic features. Finally, we present the
resulting probability maps.</p>
    </sec>
    <sec id="sec-9">
      <title>Data</title>
      <p>The experiments consider one of the political regions of
Switzerland located in the Swiss Alps: "Valais - Wallis". The
area is bounded by Lake Geneva to the West and by the
highest summits of Switzerland. It encloses some of the most
popular tourist spots in Switzerland: Zermatt, the Matterhorn and
the Aletsch glacier. The altitude range is large, from
the lowest area on the shore of Lake Geneva (450 m) to
the highest summit, the "Dufourspitze" (4634 m). The area
is a valley, whose bottom hosts small to mid-sized villages.
Climbing the flanks, the low altitudes are generally covered
with vineyards (500-700 m); then forest, mountain villages
and resorts are found in the range from 700 to 1400 m; above
1400 m, pastures and ski slopes give access to the
highest peaks, a playground for alpinists.</p>
      <p>
        The Swiss Topographic Agency (Swisstopo) provides a
DEM; among the available resolutions of 25 m and 200 m,
we retained the first. Swisstopo also provides a TLM
containing vector layers of several territorial objects. For this
study, we selected the roads, other transportation facilities,
forest and lake layers. The image locations are extracted
from the Flickr database and were provided by [<xref ref-type="bibr" rid="ref25">25</xref>]. Those
images are filtered to keep only those located outside built
areas and geotagged with a GPS device. As stated above,
we want to learn which places are good in terms of
landscape features: images taken in built areas tend to capture
the presence of a village or a touristic attraction rather than
a natural landscape. The set of image locations retained
contains 2683 points, which are then separated into the three
subsets represented in figure 1. The first set, in blue, is used
to train the methods; the second one, in green, is used to fit
the free parameters of the methods; the third one, in yellow,
is used for validation, to compute independent statistics
of the results. To generate those three sets, a grid with a
5 km side is generated over the area. Then, each grid cell
is randomly attributed to one of the subsets, in order to
obtain spatially uncorrelated sets of approximately equal
size. Finally, we also select 10 sets of random locations with
the same number of locations as in the training set. These
random sets will be used in the next sections to select
significant geographic features and to compare the distribution
of the labeled and unlabeled data.
      </p>
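      <p>The grid-based split into training, testing and validation subsets can be sketched as follows. This is an illustration only: the function name, the subset labels, the fixed random seed and the assumption that coordinates are given in metres in a projected system are all choices of this sketch. A purely random draw per cell gives subsets of only roughly equal size.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def block_split(points, cell_size=5000.0,
                subsets=("train", "test", "validation"), seed=0):
    """Assign each (x, y) point to a subset through its grid cell.

    All points falling in the same cell share the same subset, which
    limits spatial correlation between the subsets.
    """
    rng = random.Random(seed)
    cell_to_subset = {}
    assignment = []
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in cell_to_subset:
            # First point seen in this cell: draw the subset for the cell.
            cell_to_subset[cell] = rng.choice(subsets)
        assignment.append(cell_to_subset[cell])
    return assignment


# Made-up coordinates: the first two points share a 5 km cell.
points = [(100.0, 200.0), (4900.0, 4999.0), (5100.0, 200.0)]
labels = block_split(points)
```
]]></preformat>
      <p>Splitting by cell rather than by individual picture keeps neighbouring photographs, which tend to have similar geographic features, from ending up in both the training and the validation subsets.</p>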
    </sec>
    <sec id="sec-10">
      <title>Geographic features selection</title>
      <p>In order to measure the significance of these features, the
distributions of the unlabeled data and of the picture locations are
compared. For each of the geographic features chosen, the two
distributions diverge (tested by a Kolmogorov-Smirnov test
with <inline-formula><tex-math>\alpha = 1\%</tex-math></inline-formula>). Despite these statistical differences, some
features are more discriminative than others. The following
list presents the geographic features selected and explains
how their values differ at true image locations.</p>
      <list list-type="bullet">
        <list-item><p>Altitude (Z): Since we are focusing on landscape photography, few images are taken between 450 and 1500 m. In contrast, the range from 1800 to 2200 m, where the ski slopes are found, is very attractive. There is a small mode above 4000 m representing pictures taken by alpinists at high altitude, figure 2 (a).</p></list-item>
        <list-item><p>Curvature (Curv): Positive curvatures (convex areas) are more represented; indeed the ridges are more attractive than the valleys.</p></list-item>
        <list-item><p>Slope: The flatter areas (from 5% to 30%) are preferred.</p></list-item>
        <list-item><p>Visible Sky (Sky): This feature confirms the result observed with the curvature: cells with a high ratio of visible sky (&gt;90%) are more often chosen.</p></list-item>
        <list-item><p>Distance to nearest lake (DLake): People tend to take more pictures within a radius of 200 m around the lakes.</p></list-item>
        <list-item><p>Distance to nearest road (DRoad) and presence of roads (BRoad): The cells close to roads and paths are more active. Approximately 70% of the pictures are taken in a cell containing a road or a path, figure 2 (b).</p></list-item>
        <list-item><p>Distance to the nearest forest (DFor): The random and image distributions are very similar.</p></list-item>
        <list-item><p>Distance to the nearest cable car (DLift): Half of the pictures are taken within a range of 1500 m surrounding a lift.</p></list-item>
      </list>
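      <p>The comparison of a feature's distribution at picture locations and at random locations can be illustrated with a two-sample Kolmogorov-Smirnov test. The sketch below is an assumption about how such a test could be run, not the authors' code: the function names are hypothetical, and the decision uses the standard large-sample critical value rather than an exact p-value.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_rejects(a, b, alpha=0.01):
    """True when the two samples differ at level alpha (asymptotic test)."""
    n, m = len(a), len(b)
    # Large-sample critical value: c(alpha) * sqrt((n + m) / (n * m)).
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))


# Made-up altitude samples (metres) for picture and random locations.
picture_altitudes = [1800 + 4 * i for i in range(100)]
random_altitudes = [500 + 30 * i for i in range(100)]
differs = ks_rejects(picture_altitudes, random_altitudes)
```
]]></preformat>
      <p>A rejection only says that the two distributions differ; as noted above, it does not by itself say how discriminative the feature is.</p>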
      <p>In table 2, we report 10 experiments, obtained by
considering different combinations of geographic features for GOCD.
In this table only the best experiments are presented (C
&gt;= 20). Thus, the less significant geographic features are
less represented (DLift, DFor). The best combination of
geographic features is altitude, slope, visible sky, curvature,
distance to the nearest road and the binary roads (Exp. 9
in Table 2). The superiority of this experiment is observed
for the two proposed methods, but KDE(KPCA) has the
best result on the independent validation set, corresponding
to a recall of 0.85. The KDE in the original space obtains
a slightly lower performance, with a recall of 0.78. It thus
appears that KDE and KDE(KPCA) behave similarly
for combinations of a few significant geographic
features (Exp. 1-4, 6, 7). However, if more features (Exp. 8-10)
or binary features (Exp. 5) are added, KDE(KPCA)
is more suitable to describe the relations between
geographic features, resulting in better results. Both KDE and
KDE(KPCA) are better suited than KPCA and OCSVM for
this problem. By inserting the unlabeled data in the
classification process, we ensure that fewer unlabeled data are
classified as positive and thus fewer data in the independent
validation set are misclassified.</p>
    </sec>
    <sec id="sec-11">
      <title>Evaluation on one of the most attractive areas in the Alps: Zermatt</title>
      <p>On the map in figure 3 (a), probabilities for the Zermatt
area are presented. The filled dots are the training
locations, the triangles are the testing locations and the empty
circles are the validation locations. Image locations are
superimposed on the results of KDE(KPCA), which correspond to
the estimate of <inline-formula><tex-math>p(Y = 1 \mid z)</tex-math></inline-formula> at each location. A
misclassification corresponds to a circle located on an area
of low probability (blue). The validation set is very specific
in this area: the pictures in the south are above 4000 m
and accessible via a cable car (arrow A on the map in
figure 3 (a)). This area is also a skiing region during winter
and summer. First, some of the misclassified data are found
along the cable car and are easily explained: these
pictures are not taken from the ground (but from the lift), so
their locations are not related to the geographic features.
This also explains why the "DLift" geographic feature is not
in the set of the best geographic features. Another set of
locations, south of the "Breithorn", is badly classified
(arrow B on the map in figure 3 (a)). They are shot on the
way to this peak, which is one of the most easily accessible
4000 m summits in the Alps. However, the path to the summit
is not in the roads / paths database. From this map, we
can understand the geographic features related to the image
locations. First, the mountain paths are easily recognizable
in the figure: they are always more attractive than
the other locations. However, the paths on steep slopes are
less probable than the other ones. Then, the ridges
(Gornergrat), passes and summits (Matterhorn) are more attractive
than other areas.</p>
      <p>At the scale of the whole area, as seen in figure 4, it is
interesting to note that the method is able to extract different
behaviours. For instance, the paths are expected to have a
large probability. By combining the geographic features, it
appears that this is true at medium altitude. However, in the
valley, where the roads have more traffic, locations between
the roads are preferred. Moreover, at higher altitude, where
ski slopes are found, people are more inclined to move away
from the path. It is intuitive that the mountain peaks are
good places to shoot pictures. The strength of our method
is to rank the peaks according to their altitude and shape.
Finally, the least attractive locations are the steep slopes,
the bottoms of deep valleys and flat areas at high altitude (glaciers).
Indeed, these regions are hardly accessible.</p>
    </sec>
    <sec id="sec-12">
      <title>CONCLUSION</title>
      <p>The GPS measured positions of shared images are more
accurate than locations provided by the user. In this paper,
we propose to use the GPS coordinates of a set of
landscape pictures to train a classi¯er of likely and unelikely
image locations. Every map location is described with
geographic features extracted from a DEM and a TLM. The
method proposed projects the geographic features in a space
1
2
3
4
5
6
7
8
9
10</p>
      <sec id="sec-12-1">
        <title>Geographic</title>
        <p>considered</p>
      </sec>
      <sec id="sec-12-2">
        <title>Z, slope, Sky, DRoad</title>
      </sec>
      <sec id="sec-12-3">
        <title>Z, slope, Sky, Broad</title>
      </sec>
      <sec id="sec-12-4">
        <title>Z, slope, Sky,</title>
        <p>DRoad, Curv</p>
      </sec>
      <sec id="sec-12-5">
        <title>Z, slope, Sky,</title>
        <p>DRoad, DLake</p>
      </sec>
      <sec id="sec-12-6">
        <title>Z, slope, Sky,</title>
        <p>DRoad, BRoad</p>
      </sec>
      <sec id="sec-12-7">
        <title>Z, slope, Sky,</title>
        <p>DRoad, DFor</p>
      </sec>
      <sec id="sec-12-8">
        <title>Z, slope, Sky,</title>
        <p>DRoad, DLift</p>
      </sec>
      <sec id="sec-12-9">
        <title>Z, slope, Sky, DRoad, Curv, DLake</title>
      </sec>
      <sec id="sec-12-10">
        <title>Z, slope, Sky, DRoad, Curv, BRoad</title>
      </sec>
      <sec id="sec-12-11">
        <title>Z, slope, Sky, DRoad, Curv, Broad, DLake</title>
        <p>C
33.3
19.2
20.43
27.23
29.6
18.8
20.7
21.85
R(V)
0.83
0.73
0.83
0.78
0.78
0.69
0.75
0.79</p>
        <sec id="sec-12-11-1">
          <title>Proposed</title>
          <p>Npc
4
4
7
5
7
4
5
7
7
8
C
34.5
24.86
35.6
26.95
46.27
25.7
23.54
34.44
R(V)
0.83
0.72
0.86
0.78
0.84
0.55
0.7
0.86
Npc
3
5
6
6
4
4
2
6
5
2
Fpu
1.94
2.43
1.59
2.10
3.12
1.52
2.11
1.52</p>
        </sec>
        <sec id="sec-12-11-2">
          <title>Competing</title>
        </sec>
      </sec>
      <sec id="sec-12-12">
        <title>OCSVM R(V) 0.45 0.4</title>
        <p>0.57
0.01
0.45
0.55
0.42
0.53</p>
        <p>Fpu
1.66
1.19
1.23
1.3
1.89
1.18
1.21
1.16
º
0.4
0.45
0.35
0.35
0.45
0.25
0.2
0.25</p>
        <p>R(V)
0.41
0.48
0.52
0.42
0.37
0.64
0.59
0.63
of higher dimension using a KPCA. Then, the spatial
probability density function of the picture locations and random
locations are estimated with a density function (KDE). The
relation between them is used to compute the probability of
each map cell to be a likely location given the geographic
features. The recall on the independent validation set
surpasses the KPCA and OCVM classi¯cation in their regular
implementation.</p>
        <p>This preliminary study could be refined in several ways.
First, the geographic features were selected from a priori
expected correlations. However, other geographic features could
also be correlated with the image locations (orientation,
DEM-based features at finer or coarser scales, rivers, only the
departure and arrival stations of cable cars, etc.). Second,
our study focuses on a small area in the Alps; a similar
approach could be applied to the entire Alps. Indeed, a larger
number of picture locations would refine our results and
increase their robustness and generalization power at a larger
scale. Third, in this work we avoided tuning the KDE bandwidth
by using Scott's rule. A more appropriate bandwidth could
improve the results. Finally, the improvement of the proposed
methods (KDE and KDE(KPCA)) over the standard ones (OCSVM,
KPCA) is mainly due to the insertion of the unlabeled data in
the classification process. Our method is easy to implement and
shows good results without fine tuning.</p>
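        <p>One common alternative to Scott's rule is to choose the bandwidth by leave-one-out likelihood cross-validation. The sketch below is an assumption-laden illustration rather than a procedure taken from this study: it uses a synthetic one-dimensional bimodal feature, and the helper names (scott_factor, loo_log_likelihood) are hypothetical.</p>
        <preformat><![CDATA[
```python
import numpy as np

def scott_factor(n, d):
    """Scott's rule scaling factor n ** (-1 / (d + 4))."""
    return n ** (-1.0 / (d + 4))

def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a 1-D Gaussian KDE with bandwidth h."""
    n = x.size
    z = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * z ** 2) / (h * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)  # drop each point's own kernel
    dens = k.sum(axis=1) / (n - 1)
    return np.log(dens + 1e-300).sum()

rng = np.random.default_rng(1)
# Synthetic bimodal feature (hypothetical data, for illustration only)
x = np.concatenate([rng.normal(-2.0, 0.3, 150), rng.normal(2.0, 0.3, 150)])

# Scott's rule uses the global standard deviation, which is inflated by the two modes
h_scott = x.std(ddof=1) * scott_factor(x.size, 1)

# Search a log-spaced grid of candidate bandwidths around Scott's value
grid = h_scott * np.logspace(-1.5, 0.5, 30)
scores = np.array([loo_log_likelihood(x, h) for h in grid])
h_cv = grid[scores.argmax()]
print(h_scott, h_cv)
```
]]></preformat>
        <p>On multimodal data such as this, the cross-validated bandwidth is typically narrower than the one given by Scott's rule, which suggests that data-driven selection may be worth testing on the geographic features.</p>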
        <p>We showed that the locations of a map are not
equiprobable with respect to landscape image locations. By
describing each map location with geographic features, one can
extract the most probable regions. The generated map differs from
density maps based on the Northing and Easting of the
picture locations in two ways. First, using the geographic
features, probabilities are also computed for regions
without image locations. Second, by taking into account the
accessibility of the locations, the computed probabilities are
more realistic. The strength of our method is that it learns
attractive combinations of geographic features through density
estimation. Indeed, the relations between a single geographic
feature and the picture locations are intuitive (e.g. paths are
more probable, convex areas are preferred), but the combination
of features is more powerful for correctly classifying picture
locations.</p>
        <p>Currently, there is great interest in computer vision in
the pose estimation of shared images at a local or worldwide
scale. Our results show that geographic features can be used
as prior knowledge either to discount unlikely poses or to
promote the more probable ones.</p>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work has been partly supported by the Swiss
National Science Foundation (grant PZ00P2-136827).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Baatz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Saurer</surname>
          </string-name>
          , K. Köser, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pollefeys</surname>
          </string-name>
          .
          <article-title>Large scale visual geo-localization of images in mountainous terrain</article-title>
          .
          <source>In ECCV</source>
          , pages
          <fpage>517</fpage>
          –
          <lpage>530</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Baboud</surname>
          </string-name>
          , M. Čadík, E. Eisemann, and
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Seidel</surname>
          </string-name>
          .
          <article-title>Automatic photo-to-terrain alignment for the annotation of mountain pictures</article-title>
          .
          <source>In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          , pages
          <fpage>41</fpage>
          –
          <lpage>48</lpage>
          . IEEE,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bozzini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Conedera</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Krebs</surname>
          </string-name>
          .
          <article-title>A new tool for obtaining cartographic georeferenced data from single oblique photos</article-title>
          .
          <source>Proceedings of the 23rd International CIPA Symposium</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Chippendale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zanin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Andreatta</surname>
          </string-name>
          .
          <article-title>Spatial and temporal attractiveness analysis through geo-referenced photo alignment</article-title>
          .
          <source>In International Geoscience and Remote Sensing Symposium (IGARSS)</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>1116</fpage>
          –
          <lpage>1119</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>de Morsier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tuia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gass</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Thiran</surname>
          </string-name>
          .
          <article-title>Semi-supervised novelty detection using SVM entire regularization path</article-title>
          .
          <source>IEEE Trans. Geosci. Remote Sens.</source>
          ,
          <volume>51</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1939</fpage>
          –
          <lpage>1950</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Elkan</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Noto</surname>
          </string-name>
          .
          <article-title>Learning classifiers from only positive and unlabeled data</article-title>
          .
          <source>In International Conference on Knowledge Discovery and Data Mining</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Girardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Calabrese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fiore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ratti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Blat</surname>
          </string-name>
          .
          <article-title>Digital footprinting: Uncovering tourists with user-generated content</article-title>
          .
          <source>Pervasive Computing</source>
          , IEEE,
          <volume>7</volume>
          (
          <issue>4</issue>
          ):
          <fpage>36</fpage>
          –
          <lpage>43</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Grothe</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Schaab</surname>
          </string-name>
          .
          <article-title>Automated footprint generation from geotags with kernel density estimation and support vector machines</article-title>
          .
          <source>Spatial Cognition &amp; Computation</source>
          ,
          <volume>9</volume>
          (
          <issue>3</issue>
          ):
          <fpage>195</fpage>
          –
          <lpage>211</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Tong</surname>
          </string-name>
          .
          <article-title>Predicting potential distributions of geographic events using one-class data: concepts and methods</article-title>
          .
          <source>International Journal of Geographical Information Science</source>
          ,
          <volume>25</volume>
          (
          <issue>10</issue>
          ):
          <fpage>1697</fpage>
          –
          <lpage>1715</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          .
          <article-title>Kernel PCA for novelty detection</article-title>
          .
          <source>Pattern Recognition</source>
          ,
          <volume>40</volume>
          (
          <issue>3</issue>
          ):
          <fpage>863</fpage>
          –
          <lpage>874</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Jacobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Roman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Speyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Pless</surname>
          </string-name>
          .
          <article-title>Geolocating static cameras</article-title>
          .
          <source>In International Conference on Computer Vision</source>
          , pages
          <fpage>1</fpage>
          –
          <lpage>6</lpage>
          . IEEE,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E. T.</given-names>
            <surname>Jaynes</surname>
          </string-name>
          .
          <article-title>Information theory and statistical mechanics</article-title>
          .
          <source>Physical review</source>
          ,
          <volume>106</volume>
          (
          <issue>4</issue>
          ):
          <fpage>620</fpage>
          ,
          <year>1957</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Kane</surname>
          </string-name>
          .
          <article-title>Assessing landscape attractiveness: a comparative test of two new methods</article-title>
          .
          <source>Applied Geography</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>77</fpage>
          –
          <lpage>96</lpage>
          ,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Muñoz-Marí</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bovolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gómez-Chova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bruzzone</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Camps-Valls</surname>
          </string-name>
          .
          <article-title>Semisupervised one-class support vector machines for classification of remote sensing data</article-title>
          .
          <source>IEEE Trans. Geosci. Remote Sens.</source>
          ,
          <volume>48</volume>
          (
          <issue>8</issue>
          ):
          <fpage>3188</fpage>
          –
          <lpage>3197</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E.</given-names>
            <surname>Parzen</surname>
          </string-name>
          .
          <article-title>On estimation of a probability density function and mode</article-title>
          .
          <source>The Annals of Mathematical Statistics</source>
          ,
          <volume>33</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1065</fpage>
          –
          <lpage>1076</lpage>
          , 09
          <year>1962</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Popescu</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Grefenstette</surname>
          </string-name>
          .
          <article-title>Deducing trip related information from flickr</article-title>
          .
          <source>In International Conference on World Wide Web, WWW '09</source>
          , pages
          <fpage>1183</fpage>
          –
          <lpage>1184</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Produit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tuia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Golay</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Strecha</surname>
          </string-name>
          .
          <article-title>Pose estimation of landscape images using DEM and orthophotos</article-title>
          .
          <source>In International Conference on Computer Vision in Remote Sensing</source>
          , pages
          <fpage>209</fpage>
          –
          <lpage>214</lpage>
          . IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Produit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tuia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Strecha</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Golay</surname>
          </string-name>
          .
          <article-title>An open tool to register landscape oblique images and generate their synthetic model</article-title>
          .
          <source>In Open Source Geospatial Research and Education Symposium</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosenblatt</surname>
          </string-name>
          .
          <article-title>Remarks on some nonparametric estimates of a density function</article-title>
          .
          <source>The Annals of Mathematical Statistics</source>
          ,
          <volume>27</volume>
          (
          <issue>3</issue>
          ):
          <fpage>832</fpage>
          –
          <lpage>837</lpage>
          , 09
          <year>1956</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Smola</surname>
          </string-name>
          .
          <article-title>Learning with Kernels</article-title>
          . MIT press, Cambridge (MA),
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Williamson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Smola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Platt</surname>
          </string-name>
          .
          <article-title>Support vector method for novelty detection</article-title>
          .
          <source>In NIPS</source>
          , volume
          <volume>12</volume>
          , pages
          <fpage>582</fpage>
          –
          <lpage>588</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Scott</surname>
          </string-name>
          .
          <article-title>On optimal and data-based histograms</article-title>
          .
          <source>Biometrika</source>
          ,
          <volume>66</volume>
          (
          <issue>3</issue>
          ):
          <fpage>605</fpage>
          –
          <lpage>610</lpage>
          ,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bakillah</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zipf</surname>
          </string-name>
          .
          <article-title>Road-based travel recommendation using geo-tagged images</article-title>
          .
          <source>Computers, Environment and Urban Systems, (0)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Tax</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Duin</surname>
          </string-name>
          .
          <article-title>Support vector data description</article-title>
          .
          <source>Machine learning</source>
          ,
          <volume>54</volume>
          (
          <issue>1</issue>
          ):
          <fpage>45</fpage>
          –
          <lpage>66</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Palacio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Purves</surname>
          </string-name>
          .
          <article-title>Georeferencing images using tags: application with flickr</article-title>
          .
          <source>In AGILE International Conference on Geographic Information Science</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>D.-Y.</given-names>
            <surname>Yeung</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Chow</surname>
          </string-name>
          .
          <article-title>Parzen-window network intrusion detectors</article-title>
          .
          <source>In International Conference on Pattern Recognition</source>
          , volume
          <volume>4</volume>
          , pages
          <fpage>385</fpage>
          –
          <lpage>388</lpage>
          . IEEE,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>