|title=JKU-Satellite at MediaEval 2015: An Intuitive Approach to Locate Single Pictures Within a Session
==JKU-Satellite at MediaEval 2015: An Intuitive Approach to Locate Single Pictures Within a Session==
Michael Mayr, Thomas Hintersteiner, Michael Weingartner, Florian Schröckeneder, Peter Knees, Markus Schedl
Johannes Kepler Universität Linz

ABSTRACT
In this paper we describe our solution to the placing task of the MediaEval 2015 workshop. In particular, we provide a solution for Sub-Task 1.2, the Mobility-based placing task [1]. We offer an intuitive solution for run 1, a different approach for run 2, and a combination of the two in run 3. Run 1 uses only known coordinates and timestamps to predict a missing location. Run 2 uses only the given image features. Run 3 combines these two approaches.

1. INTRODUCTION
The goal of the Mobility-based placing task is to predict the coordinates of pictures. We are given a series of pictures taken by one photographer in one city. Some of these pictures lack geographical information. We have to estimate the missing coordinates by using the locations of the other pictures and their image features [1].

2. OUR APPROACH
The following part describes our solution in detail. For the first run only coordinates and timestamps are used, for the second run only the given image features are used, and for the third run both resources are used.

To get an idea of whether our predictions are anywhere near the real coordinates, we randomly deleted 20% of the coordinates from the training set and checked the predictions against the actual coordinates.

2.1 Run 1
The most intuitive solution we could come up with was implemented in run 1. For this approach the data set is split into single sessions of pictures. A session is a series of pictures taken by one photographer in one city, as described in the Introduction.

In these sessions we look for pictures with missing coordinates. The normal case is a picture with missing coordinates between two pictures whose coordinates are known. In this case a vector between the two known coordinates is computed. With the timestamps of the pictures we can estimate how far along the way from the last known location to the next known location the picture with the missing coordinates was taken.

In other words, we compute the direction and the time between two known locations. We then place the missing location along this direction, according to the time it took to get there. For example, picture A and picture C are 100 meters apart and picture C was taken 10 minutes after A. Between A and C there is another picture B. B has no coordinates, but it has a timestamp: B was taken 3 minutes after A. With a simple division and a multiplication (3/10 × 100 m) we estimate the location of picture B to be 30 meters from A in the direction of C.

This approach also works if more than one picture is missing a location. In that case the direction between the last two known locations is computed.

If the location of the first picture is missing, the first known point is set to the next picture with coordinates, and if the location of the last picture is missing, we set the last known point to the last picture with coordinates. This is the best we can do in these two cases, because no direction can be read out of the data.

2.2 Run 2
For this run, only the provided image features are used. After experimenting with single features and with combinations of them, we decided to use the CEDD feature, because it provided the best results on the training set.

The experiments were done as described at the beginning of Section 2. As stated, we deleted 20% of the coordinates; this 20% remained the same across the different runs of the experiments so that we could see which feature works best. We do not know why the chosen feature worked best, or whether a better combination of features exists.
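The hold-out check described above could look roughly like the following sketch. It is not the code used for the runs: the function names, the seeded random generator, and the use of the haversine formula for the error distance are all assumptions made for illustration. The time-proportional placement from Section 2.1 is included as a stand-in predictor, and the thresholds of 0.1 km, 1 km and 10 km are the ones reported in the results.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def choose_holdout(photo_ids, fraction=0.2, seed=0):
    """Pick a reproducible subset of ids whose coordinates get hidden."""
    rng = random.Random(seed)  # fixed seed: the same subset in every experiment
    ids = sorted(photo_ids)
    return set(rng.sample(ids, round(len(ids) * fraction)))


def interpolate(a, c, t_a, t_c, t_b):
    """Stand-in predictor: place B between A and C in proportion to elapsed time.

    Interpolating latitude and longitude linearly is only reasonable for the
    short distances that occur within one city.
    """
    if t_c == t_a:
        return a  # no time elapsed between the known points
    frac = (t_b - t_a) / (t_c - t_a)
    return (a[0] + frac * (c[0] - a[0]), a[1] + frac * (c[1] - a[1]))


def share_within(predicted, actual, thresholds_km=(0.1, 1.0, 10.0)):
    """Fraction of predictions whose error is at most each threshold."""
    errors = [haversine_km(predicted[i], actual[i]) for i in predicted]
    if not errors:
        raise ValueError("no predictions to evaluate")
    return {t: sum(e <= t for e in errors) / len(errors) for t in thresholds_km}
```

Calling `choose_holdout` with the same seed always hides the same pictures, which is what makes results for different features comparable. With timestamps in seconds, `interpolate(a, c, 0, 600, 180)` reproduces the A, B, C example: B lands three tenths of the way from A to C.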
In the following we describe the feature-matching algorithm used for the experiments. For run 2 we apply this algorithm to the test data with the chosen feature.

The algorithm itself is relatively simple. First, the feature vector of a picture with missing coordinates is read. This vector is compared to every other feature vector of the whole data set by computing the Euclidean distance between the vectors. The picture with the shortest distance, that is, the most similar one, is then selected and its coordinates are used as the prediction.

We know that this approach by itself does not lead to really useful predictions. However, it should improve predictions that are already near the actual coordinates. It is designed only to further improve the results of run 1.

2.3 Run 3
In our final run we combine run 1 and run 2, with a few alterations. We start by splitting the pictures into single sessions, as in run 1. The next step is to get the first picture without a location. We then get all pictures that belong to the same node. A node in this case means an area as described in the data set. Of these, we keep the 25 pictures with the shortest Euclidean distances and rank them from shortest to longest distance. After retrieving these similar pictures, we look at all pictures of the current session and compute an average speed from the timestamps. Now we take the next picture of the session that has a location. This can be a picture before or after the one with the missing location.

Figure 1: Process of Run 3

What we have now is a picture with a location, an average speed, and 25 pictures with features similar to those of the picture with the missing coordinates. We can now compute a radius from the last known location and the average speed, which gives a rough idea of where the missing location should be. The next step is simply to see whether one of the 25 similar pictures was taken inside this radius. If this is the case, the coordinates of that picture become the predicted ones. We take the first (most similar) picture that was taken inside the radius.

The idea behind this is that many people take pictures of the same things, so if we have already narrowed the location down to a certain radius, chances are that another person took a similar-looking picture at the same place.

If we do not find a similar picture that was taken inside the reachable radius, we use the unaltered approach from run 1 to predict the coordinates of that picture.

3. RESULTS
Our best run was run 1. With this approach, about a quarter (23.18%) of our estimated locations were within 0.1 km of the actual location, 76.94% were within a kilometre, and 96.92% were within 10 km.

From this result we can say that our method for run 1 works quite well for placing pictures within a certain range, but in many cases it is not enough to estimate exact coordinates. One of the weaknesses is a location missing at the beginning or the end of a session. However, because the pictures are taken within 10 minutes, the resulting error distance is within the range of the other estimation errors.

Another problem is that people who go from one point to another do not move at a constant speed; they might stop, slow down or speed up along the way. Nevertheless, we can get a rough idea of the location where the picture might have been taken.

Not surprisingly, the predicted coordinates of run 2 were way off the actual coordinates. Because the coordinates of the other pictures of a session were not taken into account and only the image features were considered, under 1% of the locations were within 10 km of the actual location. This approach is intended only to improve the results of run 1 and, of course, leads to a low success rate on its own.

Run 3 had results similar to run 1: 23.06% of all locations were within 0.1 km, 76.28% were within a kilometre and 96.82% were within 10 km. The predictions were a bit further away from the actual coordinates than in run 1. This comes as a bit of a surprise, because in our experiments with the development set we got a slight improvement over run 1. We guess it depends on how many pictures in the whole data set were taken in the same area as the pictures with the missing coordinates.

Due to limited resources and time we could not experiment with combinations of all available features. One way to further improve our results from run 3 would certainly be more experimentation in that direction.

4. REFERENCES
[1] J. Choi, C. Hauff, O. V. Laere, and B. Thomee. The Placing Task at MediaEval 2015. In Working Notes Proceedings of the MediaEval 2015 Workshop, Wurzen, Germany, September 2015.

Copyright is held by the author/owner(s). MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany