Trajectory Prediction from External Sensor Data using Recurrent Neural Networks

Lívia Almada Cruz
Campus Quixadá, Federal University of Ceará
Quixadá, Brazil
livia.almada@ufc.br

Karine Zeitouni
DAVID Laboratory - CNRS, University of Versailles St Quentin
Versailles, France
karine.zeitouni@uvsq.fr

José A. F. de Macedo
Department of Computing Science, Federal University of Ceará
Fortaleza, Brazil
jose.macedo@dc.ufc.br

Abstract: This paper presents a study of location prediction applied to trajectories obtained from sensors placed on road networks. We applied a variation of Recurrent Neural Networks using different combinations of features, to measure the impact of each feature on the learning task.

Index Terms: mobility analysis, location prediction, recurrent neural networks, sensor trajectories

I. INTRODUCTION

The high availability of tracking data has brought opportunities to provide new methods to analyze and understand mobility patterns. Among these analyses, location prediction has gained attention given its applicability. The mobility prediction (or location prediction) problem focuses on inferring the next relevant location of a moving object based on historical trajectories and its most recent tracking. Many applications, like smart transportation, traffic control, urban planning and recommendation systems, can benefit from location prediction.

Three levels of mobility prediction were considered in [1]: 1) object position; 2) path prediction; and 3) next place prediction. In levels 1 and 2, the models are usually learned from raw trajectories obtained by Global Positioning System (GPS) devices, and their predictions consider the movement of the object. Level 3 predicts stops rather than movements, usually by learning from sequences of points of interest or events. Unlike previous works, in this work we analyze the movement of objects tracked by sensors placed on the roadsides. Each sensor captures and registers the passage of moving objects. Assuming that each record contains enough information to uniquely identify the associated moving object, it is possible to derive from the records the trajectories of the moving objects observed by the sensors. Our work lies between level 1 (object position) and level 2 (movement and path).

II. PROBLEM STATEMENT

1) Sensor Trajectory Prediction: Given a set of sensors $S = \{s_1, s_2, \ldots, s_n\}$, when a sensor $s_i$ captures the passage of a moving object $m$ at timestamp $t$, it registers an observation $o = (s_i, m, t)$. A trajectory $traj = \langle o_1, o_2, \ldots, o_l \rangle$ of a moving object $m$ is the sequence of observations associated with $m$, where $t_i \le t_{i+1}$. The trajectories are defined over a period of the day. The problem tackled here is to predict the next sensor given the last $k$ observations of a moving object (also called the recent trajectory) and the set of historical trajectories.

Unlike classification problems with many classes, we have to consider complex transition patterns and time dependence. As the sensors have spatial relationships among them, the proximity of the predicted location to the actual value is also important. Furthermore, the trajectories obtained from sensors have a set of particularities that give us new opportunities and challenges.

1) Huge data: Sensors continuously capture a huge number of observations per day.
2) Exhaustive types of trajectories: Moving objects are not restricted to a specific fleet of vehicles. Commuters, fleets of taxis or buses, deliveries, etc. can all be tracked by the sensors. Thus, trajectories can have very different patterns.
3) Sparsity: The sensors are located at fixed positions, usually only on the main roads of the city. The entire tracking of moving objects is not available and the trajectories are very sparse in space and time.
4) Incompleteness and uncertainty: Sensors may fail to capture the passage of a vehicle, producing incomplete trajectories. It is not obvious whether an observation is missing from the data set because the sensor failed or because the object did not pass by the sensor.

Recently, applying Recurrent Neural Networks (RNN) to location prediction has demonstrated the potential of these approaches to capture the complexity of mobility data. However, to the best of our knowledge, none of these works has studied location prediction for trajectories based on external sensor data. We call this problem "Sensor Trajectory Prediction". In this work, we evaluate RNN models based on different sets of features in order to understand the limitations of the predictability of such trajectories. We compare these approaches with the ones based on sequence patterns and Markov models. Finally, we discuss the results and future directions.

III. RELATED WORKS

TPRED [2] predicts the next stop from GPS trajectories. The work in [3] predicts the next stop based on groups of users who share the same profile (e.g. gender and age). GMove [4] uses spatial-temporal information and geo-tagged text from check-ins to predict the next stop. MyWay [5] predicts the position of moving objects from their GPS trajectories based on clustering and spatial matching. In [6], a Dynamic Bayesian Network predicts the next location in sparse trajectories from call detail records. In [7], a Spatial-Temporal RNN predicts the next location using continuous spatial and temporal values. SERM [8] is a spatial-temporal RNN to predict the next stop. TA-TEM [9] is a recommendation system based on RNN which predicts the next stop. Both [8] and [9] learn from check-ins. DeepMove [10] uses an attention RNN to predict the stop in the next time window.

IV. RNN FOR SENSOR TRAJECTORY PREDICTION

The general architecture is a simplification of the model proposed by [8]. The model (Figure 1) is composed of:

i. An embedding layer, responsible for reducing the dimensions of the input vectors;
ii. A layer to concatenate the outputs of the embedding layers in order to get a single input feature vector;
iii. A recurrent layer to learn the complex patterns from sequences;
iv. A fully connected layer with softmax as its activation function, which converts the result of the recurrent layer into the set of probabilities to be assigned to each class label.

Fig. 1: General architecture of the RNN for location prediction.

Our experiments were based on different features: the spatial feature corresponds to the sensor label; the temporal feature is the time slot of the day that contains the timestamp of an observation; and the user identification captures user preferences. We consider models with different combinations of these features: spatial model (SM), spatial-temporal model (STM), spatial-user model (SUM) and spatial-temporal-user model (STUM). We use the one-hot representation to transform each feature into a vector. A window of the last k observations is used to learn the next position.

V. EXPERIMENTAL EVALUATION

1) Dataset: Trajectories were collected from September 1 to September 30, 2017, from 272 sensors in the city of Fortaleza, Brazil. Originally, we received a total of 22,338,916 observations. We filtered the trajectories and kept those with a minimum of 6 observations. We obtained 1,025,040 trajectories from 266,522 distinct vehicles.

2) Results: Models were trained and tested 5 times using an 80-20 holdout split. We compare the accuracy of the RNN models with first-order and fifth-order Markov models and TDAG [11] (Table I). The STUM model reached the best accuracy, which confirms that users tend to have similar and time-dependent patterns. Even with only 4.5 trajectories per user (on average), including the user id improves the accuracy. SM suffers from overfitting in all executions, which indicates that the trajectory paths alone are not enough and that the RNN needs complementary information to learn mobility patterns. We measure the error of the RNN models according to the closeness given by the road distance from the actual location to the predicted one (Table II). The quality of the models in terms of distance to the ground truth is also improved by adding knowledge. Specifically, under this metric, the RNN models using time and user features showed better performance.

TABLE I: Accuracy of models.

Markov1   Markov5   TDAG    SM   STM     SUM     STUM
41.71     38.84     38.95   0    44.38   46.68   47.68

TABLE II: Quality in terms of closeness.

Model   Percentiles                                          Mean
        50th      60th       70th       80th       90th
SM      8541.59   10182.11   11625.92   11844.64   12724.72   8342.45
STM     757.85    1503.31    2616.99    4249.94    7211.40    2273.89
SUM     559.12    1553.77    2651.52    4397.27    7194.90    2279.79
STUM    460.96    1469.62    2640.11    4349.11    7233.33    2262.16

VI. DISCUSSION AND FUTURE WORKS

In this paper we have shown preliminary results of the application of RNN to sensor trajectory prediction. This type of trajectory may capture very different mobility patterns, since it is not restricted to a fleet or a community of users. Such trajectories are also sparse, incomplete and uncertain. We have also highlighted the use of the underlying road network to estimate a finer-granularity trajectory definition and obtain better models in terms of accuracy and distance error. As ongoing work, we are studying how to deal with missing values by means of imputation approaches for the sensor trajectories while taking the uncertainty into account. As future work, we want to use road network restrictions to discard undesirable predictions and enrich the models. Finally, we want to study how to improve the accuracy by means of other machine learning techniques, such as unsupervised learning.

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 and is part of the MASTER project that has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 777695.

REFERENCES

[1] D. Bucher, “Vision paper: Using volunteered geographic information to improve mobility prediction,” in Proc. of the 1st ACM SIGSPATIAL Workshop on Prediction of Human Mobility, 2017, p. 2.
[2] C. L. Rocha et al., “Tpred: a spatio-temporal location predictor framework,” in Proc. of the 20th IDEAS, 2016, pp. 34–42.
[3] E. Naserian, X. Wang, K. Dahal, Z. Wang, and Z. Wang, “Personalized location prediction for group travellers from spatial-temporal trajectories,” Future Generation Computer Systems, pp. 278–292, 2018.
[4] C. Zhang et al., “Gmove: group-level mobility modeling using geo-tagged social media,” in Proc. of the 22nd ACM SIGKDD, 2016, pp. 1305–1314.
[5] R. Trasarti, R. Guidotti, A. Monreale, and F. Giannotti, “Myway: Location prediction via mobility profiling,” Information Systems, 2017.
[6] F. Alhasoun, M. Alhazzani, F. Aleissa, R. Alnasser, and M. González, “City scale next place prediction from sparse data through similar strangers,” in Proc. of ACM KDD Workshop, 2017.
[7] Q. Liu, S. Wu, L. Wang, and T. Tan, “Predicting the next location: a recurrent model with spatial and temporal contexts,” AAAI, pp. 194–200, 2016.
[8] D. Yao, C. Zhang, J. Huang, and J. Bi, “Serm: A recurrent model for next location prediction in semantic trajectories,” 2017.
[9] W. X. Zhao et al., “A time-aware trajectory embedding model for next-location recommendation,” Knowledge and Information Systems, 2017.
[10] J. Feng, Y. Li, C. Zhang, F. Sun, F. Meng, A. Guo, and D. Jin, “Deepmove: predicting human mobility with attentional recurrent networks,” in Proc. of the 2018 WWW Conference, 2018, pp. 1459–1468.
[11] P. Laird and R. Saul, “Discrete sequence prediction and its applications,” Machine Learning, vol. 15, no. 1, pp. 43–68, 1994.
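
The data preparation described in Sections II and IV can be illustrated with a short sketch. It is not the authors' code. It assumes that an observation is a (sensor, moving object, timestamp) tuple and shows how a window of the last k observations may be turned into (features, next sensor) training pairs, together with a first-order Markov baseline of the kind reported in Table I. The names time_slot, make_examples, fit_markov1 and predict_markov1 are hypothetical, and the 30-minute slot width is an assumption, since the paper does not state the slot size.

```python
from collections import Counter, defaultdict
from datetime import datetime

SLOT_MINUTES = 30  # assumed slot width; the paper does not give one


def time_slot(ts, slot_minutes=SLOT_MINUTES):
    """Index of the time slot of the day that contains the timestamp."""
    return (ts.hour * 60 + ts.minute) // slot_minutes


def make_examples(trajectory, k):
    """Slide a window of k observations over one trajectory.

    trajectory: list of (sensor, moving_object, timestamp) tuples,
    ordered by timestamp. Each returned pair holds the k most recent
    (sensor, time_slot, moving_object) triples and the next sensor.
    """
    examples = []
    for i in range(len(trajectory) - k):
        window = trajectory[i:i + k]
        features = [(s, time_slot(t), m) for s, m, t in window]
        target = trajectory[i + k][0]
        examples.append((features, target))
    return examples


def fit_markov1(trajectories):
    """Count sensor-to-sensor transitions over all trajectories."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for (cur, _, _), (nxt, _, _) in zip(traj, traj[1:]):
            counts[cur][nxt] += 1
    return counts


def predict_markov1(counts, last_sensor):
    """Most frequent successor of last_sensor, or None if it was never seen."""
    followers = counts.get(last_sensor)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy trajectories with made-up sensor labels and object ids.
day = datetime(2017, 9, 1)
traj_a = [("s1", "m1", day.replace(hour=8, minute=0)),
          ("s2", "m1", day.replace(hour=8, minute=10)),
          ("s3", "m1", day.replace(hour=8, minute=40)),
          ("s4", "m1", day.replace(hour=9, minute=5))]
traj_b = [("s1", "m2", day.replace(hour=17, minute=0)),
          ("s2", "m2", day.replace(hour=17, minute=12)),
          ("s5", "m2", day.replace(hour=17, minute=30))]
traj_c = [("s2", "m3", day.replace(hour=7, minute=50)),
          ("s3", "m3", day.replace(hour=8, minute=5))]

examples = make_examples(traj_a, k=2)
baseline = fit_markov1([traj_a, traj_b, traj_c])
print(examples[0])
print(predict_markov1(baseline, "s2"))
```

Dropping the time slot or the moving object id from each triple gives the inputs of the SM, STM and SUM variants; the one-hot encoding and the embedding layer would be applied to these categorical values afterwards.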