Finding Similar Movements in Positional Data Streams

Jens Haase† and Ulf Brefeld‡
Knowledge Mining & Assessment Group, Technische Universität Darmstadt, Germany
† je.haase@gmail.com   ‡ brefeld@kma.informatik.tu-darmstadt.de

Abstract. In this paper, we study the problem of efficiently finding similar movements in positional data streams, given a query trajectory. Our approach is based on a translation-, rotation-, and scale-invariant representation of movements. Near-neighbours of a given query trajectory are then efficiently computed using dynamic time warping and locality sensitive hashing. Empirically, we show the efficiency and accuracy of our approach on positional data streams recorded from a real soccer game.

1 Introduction

Team sports have become a major business in many parts of the world. Clubs spend a great deal of money on players, training facilities, and other ways to further improve their play. For instance, the German Bundesliga records all games with special cameras, capturing bird's eye views of the pitch, to better analyse player movements and tactics. The recordings capture the positions of the players and the ball for every fraction of a second. While simple analyses, such as the overall distance a player covered, heat maps of player positions, etc., can be computed (semi-)automatically, more complex analyses involving tactics and counter-strategies rely on human experts. However, the sheer existence of such data paves the way for automatic analyses using intelligent mining techniques.

In this paper, we study efficient techniques for detecting similar movements in positional data streams to provide a basis for the analysis of frequent movements and tactical patterns. For a soccer game taking about 90 minutes, the recorded data translate into a positional data stream. A standard recording rate of 25 frames per second leads to a representation of about 135,000 snapshots. Every snapshot consists of the 23 positions of the players of the two teams and the ball. In sum, the game is described by more than three million coordinates. As player movements are sequences of such coordinates, it is clear that a great many comparisons are necessary to account for the different lengths of such sequences across players. Thus, there is a real need for efficient techniques to further process and analyse the data.
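As a quick back-of-the-envelope check of these figures, assuming exactly 90 minutes recorded at 25 Hz:

$$90\,\text{min} \times 60\,\tfrac{\text{s}}{\text{min}} \times 25\,\tfrac{\text{frames}}{\text{s}} = 135{,}000 \ \text{snapshots}, \qquad 135{,}000 \times 23 \approx 3.1 \ \text{million positions},$$

where each of the 23 tracked positions per snapshot (22 players and the ball) is a two-dimensional pitch coordinate.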
In this paper, we study the problem of finding similar movements of players in such positional data streams. The problem is challenging for two reasons. Firstly, it is not clear how to define an appropriate similarity measure on player trajectories; secondly, the sheer number of coordinates renders exact approaches infeasible, as we will show in the experiments. We first propose a translation-, rotation-, and scale-invariant representation of movements using Angle/Arc-Lengths [11]. Second, we investigate efficient near-neighbour routines based on dynamic time warping [9] and locality sensitive hashing [2]. Our empirical results on positional data recorded from a real soccer game show the efficiency and accuracy of our approach compared to exact baseline methods.

The remainder is structured as follows. Section 2 briefly reviews related work. Our main contribution is presented in Section 3 and we report on empirical results in Section 4. Section 5 concludes.

2 Related Work

Prior work on mining positional data streams mostly focuses on the performance of individual players. Kang et al. [4] present an approach that uses positional data to assess player positions in particular areas of the pitch, such as catchable, safe or competing zones. Grunz et al. [3] analyse groups of players and their behaviour using self-organising maps on positional data. Every neuron of the network represents a certain area of the pitch. Thus, whenever a player moves into such an area, the respective neuron is activated. Perše et al. [8] use positional data of basketball games to compare movements of players with respect to tactical patterns, e.g., a player blocking space for his teammate. The presented approach, however, does not detect novel movements that deviate from the already known patterns. By contrast, we study a purely data-driven approach to find similar movements in positional data for a given query trajectory, without making any assumptions on zones, tasks, or movements.

3 Contribution

3.1 Preliminaries

For each player, we are given a positional data stream $P = \langle x_1, x_2, \ldots \rangle$, where $x_t = (x_1, x_2)^\top$ denotes the coordinates of the player's position on the pitch at time $t$. A trajectory or movement of the player is a subset $p \subset P$ of the stream, e.g., $p = \langle x_t, x_{t+1}, \ldots, x_{t+m} \rangle$, where $m$ is the length of the trajectory. A game $D$ is thus given by the union of all trajectories of length $m$ of the two teams. For simplicity, we omit the time index $t$ and simply index elements of a trajectory by their offset $1, \ldots, m$ in the remainder.

The goal of this paper is to accurately and efficiently compute similarities between trajectories in $D$. That is, given a query trajectory $q$, we aim at finding the $N$ most similar trajectories in $D$.

3.2 Representation

We aim to exploit the symmetries of the pitch and use Angle/Arc-Length (AAL) [11] transformations to guarantee translation-, rotation-, and scale-invariant representations of trajectories. The main idea of AAL is to represent a movement $p = \langle x_1, \ldots, x_m \rangle$ in terms of distances and angles,

$$p \mapsto \bar{p} = \langle (\alpha_1, \|v_1\|), \ldots, (\alpha_m, \|v_m\|) \rangle, \qquad (1)$$

where $v_i = x_i - x_{i-1}$. The difference $v_i$ is called the movement vector at time $i$, and the corresponding angle with respect to a reference vector $v_{\mathrm{ref}} = (1, 0)^\top$ is defined as

$$\alpha_i = \mathrm{sign}(v_i, v_{\mathrm{ref}}) \, \cos^{-1}\!\left( \frac{v_i^\top v_{\mathrm{ref}}}{\|v_i\| \, \|v_{\mathrm{ref}}\|} \right),$$

where the sign function computes the direction (clockwise or counterclockwise) of the movement with respect to the reference. In the remainder, we discard the norms in Equation (1) and represent trajectories by their sequences of angles, $p \mapsto \tilde{p} = \langle \alpha_1, \ldots, \alpha_m \rangle$.

3.3 Dynamic Time Warping

In this section, we propose a distance measure for trajectories. The proposed representation of the previous section fulfils the required invariance in terms of translation, rotation, and scaling [11]. However, some movements may start slow and end fast, while others start fast and then slow down at the end. Thus, we additionally need to compensate for phase shifts of trajectories. A remedy comes from the area of speech recognition and is called dynamic time warping (DTW) [9]. Given two sequences $s = \langle s_1, \ldots, s_m \rangle$ and $q = \langle q_1, \ldots, q_m \rangle$ and an element-wise distance function $\mathrm{dist}: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ (e.g., the Euclidean distance), we define the DTW function $g$ recursively as follows:

$$g(\emptyset, \emptyset) = 0, \qquad g(s, \emptyset) = g(\emptyset, q) = \infty,$$

$$g(s, q) = \mathrm{dist}(s_1, q_1) + \min \begin{cases} g(s, \langle q_2, \ldots, q_m \rangle) \\ g(\langle s_2, \ldots, s_m \rangle, q) \\ g(\langle s_2, \ldots, s_m \rangle, \langle q_2, \ldots, q_m \rangle). \end{cases}$$

The time complexity of DTW is $O(|s|\,|q|)$, which is clearly intractable for computing similarities of thousands of trajectories.
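The following minimal Python sketch illustrates the angle representation of Section 3.2 and the plain quadratic-time DTW recursion above (illustrative only; the function names to_angles and dtw are our own and this is not the authors' implementation):

import math

def to_angles(traj):
    """Angle-only AAL representation: trajectory -> signed angles of the
    movement vectors v_i = x_i - x_{i-1} with respect to (1, 0)."""
    angles = []
    for (x0, y0), (x1, y1) in zip(traj[:-1], traj[1:]):
        vx, vy = x1 - x0, y1 - y0
        # atan2 gives the signed angle in (-pi, pi], which matches
        # sign(v, v_ref) * arccos(<v, v_ref> / (|v| |v_ref|)) for v_ref = (1, 0)
        # up to the paper's sign convention.
        angles.append(math.atan2(vy, vx))
    return angles

def dtw(s, q, dist=lambda a, b: abs(a - b)):
    """Plain O(|s||q|) dynamic time warping distance g."""
    n, m = len(s), len(q)
    INF = float("inf")
    g = [[INF] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = dist(s[i - 1], q[j - 1]) + min(g[i - 1][j],      # advance s
                                                     g[i][j - 1],      # advance q
                                                     g[i - 1][j - 1])  # advance both
    return g[n][m]

if __name__ == "__main__":
    # Two toy trajectories: the same shape, translated and uniformly scaled.
    p1 = [(0, 0), (1, 0), (2, 1), (3, 3)]
    p2 = [(5, 5), (7, 5), (9, 7), (11, 11)]
    print(dtw(to_angles(p1), to_angles(p2)))  # 0.0

Since translating or uniformly scaling a trajectory leaves the directions of the movement vectors unchanged, the two toy trajectories map to identical angle sequences and their DTW distance is zero.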
However, recall that we aim at finding the $N$ best matches for a given query. This allows for pruning some DTW computations using lower bounds $f$, i.e., $f(s, q) \leq g(s, q)$, with an appropriate function $f$ that can be computed more efficiently than $g$ [10]. We use two different lower bound functions, $f_{\mathrm{kim}}$ [6] and $f_{\mathrm{keogh}}$ [5], which are defined as follows.

$f_{\mathrm{kim}}$ focuses on the first, last, greatest, and smallest values of the two sequences [6],

$$f_{\mathrm{kim}}(s, q) = \max\bigl\{\, |s_1 - q_1|,\ |s_m - q_m|,\ |\max(s) - \max(q)|,\ |\min(s) - \min(q)| \,\bigr\},$$

and can be computed in $O(m)$. However, the greatest (or smallest) entry in the transformed paths is always close or identical to $\pi$ (or $-\pi$) and can thus be ignored. Consequently, the time complexity reduces to $O(1)$ [10].

The second lower bound $f_{\mathrm{keogh}}$ [5] uses minimum values $\ell_i$ and maximum values $u_i$ of subsequences of the query $q$, given by $\ell_i = \min(q_{i-r}, \ldots, q_{i+r})$ and $u_i = \max(q_{i-r}, \ldots, q_{i+r})$, where $r$ is a user-defined threshold. Trivially, $u_i \geq q_i \geq \ell_i$ holds for all $i$, and the lower bound $f_{\mathrm{keogh}}$ is given by

$$f_{\mathrm{keogh}}(q, s) = \sqrt{\sum_{i=1}^{m} c_i} \quad \text{with} \quad c_i = \begin{cases} (s_i - u_i)^2 & \text{if } s_i > u_i \\ (s_i - \ell_i)^2 & \text{if } s_i < \ell_i \\ 0 & \text{otherwise,} \end{cases}$$

which can also be computed in $O(m)$ (see [7] for details).

Algorithm 1, shown below, extends [10] to compute the $N$ most similar trajectories for a given query $q$. Lines 2–9 compute the DTW distances of the first $N$ entries in the database and store the entry with the highest distance to $q$. Lines 10–21 loop over all subsequent trajectories in $D$ by first applying the lower bound functions $f_{\mathrm{kim}}$ and $f_{\mathrm{keogh}}$ to efficiently filter irrelevant movements before using the exact DTW distance for the remaining candidates. Every trajectory realising a smaller DTW distance than the current maximum replaces its peer, and the variables maxdist and maxind are updated accordingly.

Algorithm 1 TOP-N(N, q, D)
Input: number of near-neighbour movements N, query trajectory q, game D
Output: the N most similar trajectories to q in D
 1: output = ∅ ∧ maxdist = 0 ∧ maxind = −1
 2: for i = 1, . . . , N do
 3:     dist = g(q, D[i])
 4:     output[i] = D[i]
 5:     if dist > maxdist then
 6:         maxdist = dist
 7:         maxind = i
 8:     end if
 9: end for
10: for i = N + 1, . . . , |D| do
11:     if f_kim(q, D[i]) < maxdist then
12:         if f_keogh(q, D[i]) < maxdist then
13:             dist = g(q, D[i])
14:             if dist < maxdist then
15:                 output[maxind] = D[i]
16:                 maxdist = max{g(q, output[j]) : 1 ≤ j ≤ N}
17:                 maxind = arg max_{1 ≤ j ≤ N} g(q, output[j])
18:             end if
19:         end if
20:     end if
21: end for

Note that the complexity of Algorithm 1 is linear in the number of trajectories in $D$. In the worst case, the sequences are sorted in descending order by the DTW distance, which requires computing all DTW distances. In practice, however, we observe much lower run-times. An important factor is the tightness of the lower bound functions: the better the approximation of the DTW distance, the better the pruning. The parameter $N$ also plays a crucial part in the effectiveness of the algorithm. If we set $N = 1$, the maximum value drops quickly towards the lowest value in the dataset. By contrast, setting $N = |D|$ requires computing the DTW distances for all entries in the database. Hence, in most cases, $N \ll |D|$ is an appropriate choice to reduce the overall computation time.
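The sketch below illustrates the two lower bounds and the pruning loop of Algorithm 1, following the formulas as stated above (a sketch under our own naming and parameter choices, e.g. the band width r and the toy data; it also caches the DTW distances of the current top-N instead of recomputing them as in lines 16–17):

import math
import random

def dtw(s, q, dist=lambda a, b: abs(a - b)):
    # Exact DTW distance g (same as in the previous sketch).
    n, m = len(s), len(q)
    INF = float("inf")
    g = [[INF] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = dist(s[i - 1], q[j - 1]) + min(
                g[i - 1][j], g[i][j - 1], g[i - 1][j - 1])
    return g[n][m]

def f_kim(s, q):
    # Lower bound on the first, last, greatest, and smallest values.
    return max(abs(s[0] - q[0]), abs(s[-1] - q[-1]),
               abs(max(s) - max(q)), abs(min(s) - min(q)))

def f_keogh(q, s, r=1):
    # Lower bound based on the band envelope [l_i, u_i] of the query q.
    m, total = len(q), 0.0
    for i in range(m):
        lo, hi = max(0, i - r), min(m, i + r + 1)
        l_i, u_i = min(q[lo:hi]), max(q[lo:hi])
        if s[i] > u_i:
            total += (s[i] - u_i) ** 2
        elif s[i] < l_i:
            total += (s[i] - l_i) ** 2
    return math.sqrt(total)

def top_n(N, q, D, r=1):
    # Algorithm 1: the N trajectories of D most similar to q under DTW,
    # pruning candidates with f_kim and f_keogh before the exact computation.
    output = list(D[:N])
    dists = [dtw(q, s) for s in output]          # lines 2-9
    maxind = max(range(N), key=dists.__getitem__)
    maxdist = dists[maxind]
    for s in D[N:]:                              # lines 10-21
        if f_kim(s, q) < maxdist and f_keogh(q, s, r) < maxdist:
            d = dtw(q, s)
            if d < maxdist:
                output[maxind], dists[maxind] = s, d
                maxind = max(range(N), key=dists.__getitem__)
                maxdist = dists[maxind]
    return output

if __name__ == "__main__":
    random.seed(0)
    # Toy database of 200 random angle sequences of length 10.
    D = [[random.uniform(-math.pi, math.pi) for _ in range(10)] for _ in range(200)]
    q = D[0]
    print(sorted(round(dtw(q, s), 2) for s in top_n(5, q, D[1:])))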
3.4 Locality Sensitive Hashing

To further improve the efficiency of our algorithm, we use locality sensitive hashing (LSH) [2] to remove a great deal of trajectories before processing them with Algorithm 1. The idea of LSH is to hash similar objects to the same bucket, so that all objects of a bucket are considered candidates for being near-neighbours. An interesting class of LSH functions is given by distance-based hashes (DBH) [1], which can be applied together with arbitrary (e.g., non-metric) distance measures.

To define a hash family for our purposes, we first need a function $h: D \to \mathbb{R}$ that maps a trajectory $s \in D$ to the real numbers. Choosing two randomly drawn members $s_1, s_2 \in D$, we define the function $h$ as follows:

$$h_{s_1,s_2}(s) = \frac{\mathrm{dist}(s, s_1)^2 + \mathrm{dist}(s_1, s_2)^2 - \mathrm{dist}(s, s_2)^2}{2\,\mathrm{dist}(s_1, s_2)}.$$

In the remainder, we use $\mathrm{dist}(s_1, s_2) = f_{\mathrm{kim}}(s_1, s_2)$ for simplicity. To compute a discrete hash value for $s$, we check whether $h(s)$ lies in a certain interval $[t_1, t_2]$,

$$h^{[t_1,t_2]}_{s_1,s_2}(s) = \begin{cases} 1 & \text{if } h_{s_1,s_2}(s) \in [t_1, t_2] \\ 0 & \text{otherwise.} \end{cases}$$

Optimally, the interval boundaries $t_1$ and $t_2$ are chosen such that a randomly drawn $s \in D$ lies within the interval with 50% probability and outside of it with 50% probability. The set $T$ defines the set of admissible intervals,

$$T(s_1, s_2) = \left\{ [t_1, t_2] : \Pr_D\!\left(h^{[t_1,t_2]}_{s_1,s_2}(s) = 0\right) = \Pr_D\!\left(h^{[t_1,t_2]}_{s_1,s_2}(s) = 1\right) \right\}.$$

Given $h$ and $T$, we can now define the DBH hash family that can be directly integrated into standard LSH algorithms:

$$\mathcal{H}_{\mathrm{DBH}} = \left\{ h^{[t_1,t_2]}_{s_1,s_2} : s_1, s_2 \in D \,\wedge\, [t_1, t_2] \in T(s_1, s_2) \right\}.$$

Using random draws from $\mathcal{H}_{\mathrm{DBH}}$, we construct several hash functions by AND- and OR-concatenation [2]. Given a query trajectory $q \in D$, the retrieval process first identifies candidate objects that are hashed to the same bucket as $q$ for at least one of the hash functions and then computes the exact distances of the remaining candidates using Algorithm 1.
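The following sketch shows one way such a DBH-based candidate generation could look in Python (our own illustration; the quantile-based choice of $[t_1, t_2]$, the parameters k and L, and names such as build_lsh_index are assumptions rather than the paper's implementation). The returned candidates would then be re-ranked exactly with Algorithm 1:

import random

def f_kim(s, q):
    # Distance used inside the DBH projection, as in Section 3.4.
    return max(abs(s[0] - q[0]), abs(s[-1] - q[-1]),
               abs(max(s) - max(q)), abs(min(s) - min(q)))

def dbh_projection(s, s1, s2, dist=f_kim):
    # Real-valued line projection h_{s1,s2}(s).
    d12 = dist(s1, s2)
    if d12 == 0.0:
        return 0.0  # degenerate anchor pair; a full implementation would redraw
    return (dist(s, s1) ** 2 + d12 ** 2 - dist(s, s2) ** 2) / (2.0 * d12)

def make_binary_hash(D, rng):
    # Draw an anchor pair (s1, s2) and pick [t1, t2] so that roughly half of the
    # projected database falls inside the interval (approximating T(s1, s2)).
    s1, s2 = rng.sample(D, 2)
    proj = sorted(dbh_projection(s, s1, s2) for s in D)
    t1, t2 = proj[len(proj) // 4], proj[3 * len(proj) // 4]
    return lambda s: 1 if t1 <= dbh_projection(s, s1, s2) <= t2 else 0

def build_lsh_index(D, k=4, L=8, rng=random):
    # OR-concatenation of L tables, each keyed by an AND-concatenation of k bits.
    tables = []
    for _ in range(L):
        hashes = [make_binary_hash(D, rng) for _ in range(k)]
        buckets = {}
        for idx, s in enumerate(D):
            buckets.setdefault(tuple(h(s) for h in hashes), []).append(idx)
        tables.append((hashes, buckets))
    return tables

def lsh_candidates(q, tables):
    # Indices of all trajectories sharing a bucket with q in at least one table.
    cand = set()
    for hashes, buckets in tables:
        cand.update(buckets.get(tuple(h(q) for h in hashes), []))
    return cand

if __name__ == "__main__":
    random.seed(1)
    D = [[random.uniform(-3.14, 3.14) for _ in range(10)] for _ in range(500)]
    tables = build_lsh_index(D)
    print(len(lsh_candidates(D[0], tables)), "candidates out of", len(D))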
4 Evaluation

In this section, we evaluate our approach in terms of run-time, pruning efficiency, and precision. For our experiments, we use positional data published by the DEBS Grand Challenge 2013 (http://www.orgs.ttu.edu/debs2013/index.php?goto=cfchallengedetails). There are eight players in each team, and every player is equipped with two sensors, one for each shoe. We average these two values to obtain a single measurement for every player at a time. Discarding additional data that is not useful in our context leaves us with a stream of (sensor/player id, timestamp, player coordinates) triplets. Additionally, we discard all data points that have been recorded before or after the game as well as data points that occurred outside of the pitch. We also remove the effects of changing sides after half time by appropriate transformations. Finally, we average the positional information of each player over 100ms to reduce the overall amount of data and use trajectories of size 10.

In our first experiment, we focus on 15,000 consecutive positions of one player, so that we are still able to compare performance to the exact baseline using the DTW distance from Section 3.3. We compute the N most similar trajectories using Algorithm 1, where N = 1,000, and study the run-times of the different approaches. Figure 1 shows the results. The computation time of the baseline grows exponentially in the size of the data D. Algorithm 1 performs slightly super-linearly and clearly outperforms the baseline. Pre-filtering trajectories using DBH results in only a small additional speed-up. Adding more CPUs improves the run-time of the algorithms significantly and indicates that parallelisation of the approach allows for computing near-neighbours for large data sets in only a couple of minutes.

Fig. 1. Run-time (in seconds) of the baseline, Top 1000, and Top 1000 with DBH, each with one and four CPUs, as a function of the number of sliding windows |w|.

The observed improvements in run-time are the result of a highly efficient pruning strategy. Table 1 shows the proportion of trajectories that are pruned for different amounts of data. Note that the DBH pruning depends on the data and not on the ratio N/|D|. The effectiveness of pruning using f_kim and f_keogh increases with increasing amounts of data for constant N.

Table 1. Pruning efficiency

trajectories   f_kim    f_keogh   DBH      Total
1000           0%       0%        11.42%   11.42%
5000           0.28%    34.00%    16.33%   50.61%
10000          9.79%    41.51%    17.80%   60.10%
15000          17.50%   46.25%    11.82%   75.57%

We now investigate the accuracy of the proposed approach. We compute the 1,000 most similar trajectories for all 35,248 player movements and measure the effect of DBH in terms of the precision@N. For every query q, we computed the performance for N ∈ {100, 200, . . . , 1000} and averaged the results, which are shown in Figure 2. For completeness, we also include the worst cases.

Fig. 2. Accuracy of DBH: average and worst-case precision (P@n) for n = 100, . . . , 1000.

The quality of the candidates covers a broad range, and the worst cases are clearly inappropriate for accurate computations of similarity. Nevertheless, on average, DBH performs well and its precision decreases only slightly with the size of N. Figure 3 shows an exemplary query trajectory (top left) as well as five similar trajectories found by DBH, where the axes denote the coordinates on the pitch of the respective movement. The retrieved near-duplicates are very close to the query and well suited for further processing.

Fig. 3. Exemplary results: the query trajectory is shown in the top-left figure; the other figures show five similar trajectories found by our approach.

5 Conclusion

In this paper, we presented an approach to efficiently compute similar movements in positional data streams. Our solution is based on dynamic time warping and distance-based hashing. Empirically, we showed the efficiency and accuracy of our approach. Future work will deal with detecting frequent movements across players.

References

1. V. Athitsos, M. Potamias, P. Papapetrou, and G. Kollios. Nearest neighbor retrieval using distance-based hashing. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pages 327–336, 2008.
2. A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In Proceedings of the 25th International Conference on Very Large Data Bases, pages 518–529, 1999.
3. A. Grunz, D. Memmert, and J. Perl. Tactical pattern recognition in soccer games by means of special self-organizing maps. Human Movement Science, 31(2):334–343, 2012. Special issue on Network approaches in complex environments.
4. C.-H. Kang, J.-R. Hwang, and K.-J. Li. Trajectory analysis for soccer players. In Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops, pages 377–381, 2006.
5. E. Keogh. Exact indexing of dynamic time warping. In Proceedings of the 28th International Conference on Very Large Data Bases, pages 406–417, 2002.
6. S.-W. Kim, S. Park, and W. W. Chu. An index-based approach for similarity search supporting time warping in large sequence databases. In Proceedings of the 17th International Conference on Data Engineering, pages 607–614, 2001.
7. D. Lemire. Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recognition, 42(9):2169–2180, September 2009.
8. M. Perše, M. Kristan, S. Kovačič, G. Vučkovič, and J. Perš. A trajectory-based analysis of coordinated team activity in a basketball game. Computer Vision and Image Understanding, 113(5):612–621, 2009.
9. L. Rabiner and B.-H. Juang. Fundamentals of Speech Recognition. Prentice-Hall, Upper Saddle River, NJ, USA, 1993.
10. T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, and E. Keogh. Searching and mining trillions of time series subsequences under dynamic time warping. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, pages 262–270, 2012.
11. M. Vlachos, D. Gunopulos, and G. Das. Rotation invariant distance measures for trajectories. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pages 707–712, 2004.