JKU-Satellite at MediaEval 2015: An Intuitive Approach to
Locate Single Pictures Within a Session

Michael Mayr, Thomas Hintersteiner, Michael Weingartner,
Florian Schröckeneder, Peter Knees, Markus Schedl
Johannes Kepler Universität Linz

Copyright is held by the author/owner(s).
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

ABSTRACT
In this paper we describe our solution to the Placing Task of the MediaEval 2015 workshop. In particular, we provide a solution for Sub-Task 1.2, the Mobility-based Placing Task [1]. We offer an intuitive solution for run 1, a different approach for run 2, and combine these two solutions in run 3.

Run 1 uses only known coordinates and timestamps to predict a missing location. The second run uses only the given image features. In a third run we combine these two approaches.

1.    INTRODUCTION
The goal of the Mobility-based Placing Task is to predict the coordinates of pictures. What we have is a series of pictures taken by one photographer in one city. Some of these pictures lack geographical information. We then have to estimate these missing coordinates by using the locations of the other pictures and their image features [1].

2.    OUR APPROACH
The following part describes our solution in detail. For the first run only coordinates and timestamps are used, for the second run only the given image features are used, and for the third run both resources are used.

In order to get an idea whether our predictions are anywhere near the real coordinates, we randomly deleted 20% of the coordinates from the training set and checked the predictions against the actual coordinates.

2.1    Run 1
The most intuitive solution we could come up with was implemented in run 1. For this approach the dataset is split up into single sessions of pictures. A session is a set of pictures as described in the Introduction.

In these sessions we look for pictures with missing coordinates. The normal case is a picture with missing coordinates between two pictures whose coordinates are known. In this case a vector between the two known coordinates is computed. With the timestamps of the pictures we can estimate how far along the way from the last known location to the next known location the picture with the missing coordinates was taken.

In other words, we compute the direction and the time between two known locations and place the missing location along this direction, according to the time it took to get there. For example, picture A and picture C are 100 meters apart and picture C was taken 10 minutes after A. Between A and C there is another picture B. B has no coordinates, but it has a timestamp: B was taken 3 minutes after A. With a simple division and a multiplication we estimate the location of picture B to be 30 meters from A in the direction of C.

This approach also works if more than one picture is missing a location. In that case the direction between the last two known locations is computed.

If the location of the first picture of a session is missing, the first known point is set to the next picture with coordinates, and if the location of the last picture is missing, we set the last known point to the last picture with coordinates. This produces the best solution for these two cases, because there is no direction that can be read out of the data.
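To make this interpolation concrete, the following minimal sketch (not our actual implementation) places a picture with a missing location proportionally between its two known neighbours. The Photo structure and its field names are assumptions made for this illustration, and latitude/longitude are interpolated directly instead of working with metric distances.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Photo:
    timestamp: float                       # seconds since some reference (hypothetical field)
    coords: Optional[Tuple[float, float]]  # (latitude, longitude), None if missing

def interpolate(prev: Photo, nxt: Photo, missing: Photo) -> Tuple[float, float]:
    """Place `missing` on the line between two known pictures, proportionally
    to the time elapsed since the previous known picture."""
    total_time = nxt.timestamp - prev.timestamp
    if total_time == 0:                    # identical timestamps: reuse the previous location
        return prev.coords
    fraction = (missing.timestamp - prev.timestamp) / total_time
    lat = prev.coords[0] + fraction * (nxt.coords[0] - prev.coords[0])
    lon = prev.coords[1] + fraction * (nxt.coords[1] - prev.coords[1])
    return (lat, lon)

# The example from the text: C is 100 m from A and taken 10 minutes later,
# B is taken 3 minutes after A, so B is placed 3/10 of the way from A towards C.
a = Photo(0.0,   (48.3000, 14.2800))       # made-up coordinates near Linz
c = Photo(600.0, (48.3009, 14.2800))       # roughly 100 m north of A
b = Photo(180.0, None)
print(interpolate(a, c, b))                # -> about 30 m north of A
```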
2.2    Run 2
For this run, only the provided image features are used. After experimenting with single features and with combinations of them, we decided to use the CEDD feature, because it provided the best results on the training set.

The experiments were done as described at the beginning of this section. As stated, we deleted 20% of the coordinates; this 20% remained the same during the different runs of the experiments in order to see which feature works best. We do not know why the chosen feature worked best or whether a better combination of features exists. In the following we describe the algorithm used for the experiments. For run 2 we take this algorithm and apply it to the test data with the chosen feature.
The algorithm itself is relatively simple. First the feature vector of a picture with missing coordinates is read. This vector is compared to every other feature vector of the whole data set by computing the Euclidean distance between the vectors. The picture with the shortest distance, i.e. the most similar one, is then taken and its coordinates are used as the prediction.
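The sketch below shows this nearest-neighbour lookup in a minimal form, assuming the provided feature vectors (e.g. CEDD) have already been parsed into plain lists of floats; the data structures and names are our own illustration, not part of any released tooling.

```python
import math
from typing import Dict, Sequence, Tuple

def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (e.g. CEDD)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_from_features(query: Sequence[float],
                          known: Dict[str, Tuple[Sequence[float], Tuple[float, float]]]
                          ) -> Tuple[float, float]:
    """Return the coordinates of the most similar picture with a known location.
    `known` maps a photo id to (feature vector, (lat, lon))."""
    best_id = min(known, key=lambda pid: euclidean(query, known[pid][0]))
    return known[best_id][1]
```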

We know that this approach by itself does not lead to really useful predictions. However, it should improve the predictions that are already near the actual coordinates. This approach is designed only to further improve the results of run 1.

2.3    Run 3
In our final run, we combine run 1 and run 2, with a few alterations. We start by splitting the pictures into single sessions, as we do in run 1. The next step is to get the first picture without a location. We then get all pictures that belong to the same node; a node in this case means an area as described in the data set. After that, we keep the 25 pictures with the shortest Euclidean distances and rank them from shortest to longest distance. After we have retrieved these similar pictures, we look at all pictures of the current session and compute an average speed from the timestamps. We then take the next picture of the session that has a location. This can be a picture before or after the one with the missing location.

Figure 1: Process of Run 3

What we have now is a picture with a location, an average speed, and 25 pictures with features similar to the one with the missing coordinates. From the last known location and the average speed we can compute a radius and get a rough idea where the missing location should be. The next step is simply to check whether one of the 25 similar pictures was taken inside this radius. If this is the case, the coordinates of this picture become the predicted ones. We take the first (most similar) picture that was taken inside the radius.

The idea behind this is that many people take pictures of the same things, so if we have already narrowed the location down to a certain radius, chances are that another person took a similar-looking picture at the same place.

If we do not find a similar picture that was taken inside the reachable radius, we use the unaltered approach from run 1 to predict the coordinates of that picture.
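A compact sketch of this combination is given below. The names are our own assumptions: `candidates` holds the locations of the 25 most similar pictures of the node, already ranked from most to least similar, `geo_distance` is some metric distance function in meters (e.g. haversine), and `run1_prediction` is the fallback estimate from run 1.

```python
from typing import Callable, List, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)

def run3_predict(last_known: Coords,
                 time_gap_s: float,
                 avg_speed_mps: float,
                 candidates: List[Coords],
                 geo_distance: Callable[[Coords, Coords], float],
                 run1_prediction: Coords) -> Coords:
    """Pick the most similar candidate lying within the reachable radius around
    the last known location; otherwise fall back to the run 1 estimate."""
    radius_m = avg_speed_mps * time_gap_s   # distance reachable at the session's average speed
    for cand in candidates:                 # candidates are ranked most similar first
        if geo_distance(last_known, cand) <= radius_m:
            return cand
    return run1_prediction
```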
3.    RESULTS
Our best run was run 1. With this approach our estimates for about a quarter (23.18%) of all locations were within 0.1 km of the actual location, 76.94% were within a kilometer, and 96.92% were within a range of 10 km of the actual coordinates.

With this result we can say that our method for run 1 works quite well for placing the pictures within a certain range, but in many cases it is not enough to estimate exact coordinates. One of its weaknesses is a location missing at the beginning or the end of a session. But due to the fact that the pictures are taken within 10 minutes, this error distance stays within the range of the other estimation errors.

Another problem is that people who go from one point to another do not move at a constant speed; they might stop, slow down, or speed up along the way. Nevertheless, we can get a rough idea of the location where the picture might have been taken.

Not surprisingly, the predicted coordinates of run 2 were way off the actual coordinates. Because the coordinates of the other pictures of a session were not taken into account and only the image features were considered, under 1% of the locations were within 10 km of the actual location. This approach is intended only to improve the results of run 1 and on its own, of course, leads to a low success rate.

Run 3 had results similar to run 1: 23.06% of all locations were within 0.1 km, 76.28% were within a kilometer, and 96.82% were within a range of 10 km. The predictions were a bit further away from the actual coordinates than in run 1. This comes as a bit of a surprise, because in our experiments with the development set we got a slight improvement over run 1. We guess it simply depends on how many pictures in the whole data set were taken in the same area as the pictures with the missing coordinates.

Due to limited resources and time we could not experiment with combinations of all available features. One way to further improve our results from run 3 would certainly be more experimentation in that direction.
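For reference, distance-threshold accuracies such as the percentages above can be computed as in the sketch below; using the haversine distance here is our own assumption and not necessarily how the official task evaluation measures errors.

```python
import math
from typing import List, Tuple

def haversine_km(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def fraction_within(predicted: List[Tuple[float, float]],
                    actual: List[Tuple[float, float]],
                    threshold_km: float) -> float:
    """Fraction of predictions whose error is at most `threshold_km`."""
    hits = sum(haversine_km(p, a) <= threshold_km for p, a in zip(predicted, actual))
    return hits / len(actual)

# e.g. fraction_within(preds, truth, 0.1), fraction_within(preds, truth, 1.0),
#      fraction_within(preds, truth, 10.0)
```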
4.    REFERENCES
[1] J. Choi, C. Hauff, O. V. Laere, and B. Thomee. The Placing Task at MediaEval 2015. In Working Notes Proceedings of the MediaEval 2015 Workshop, Wurzen, Germany, September 2015.