A public transport departure time prediction algorithm based on operation strategies and real-time monitoring data A A Agafonov1 1 Samara National Research University, Moskovskoye shosse 34, Samara, Russia, 443086 Abstract. In this paper, we consider a public transport departure time prediction problem. The problem is considered in two notations: estimation of the mean expected departure time and estimation of the time interval required to ensure a predefined probability to depart on-time. We propose departure time estimation algorithms based on a real-time monitoring data, schedule and operation strategies for the public transport management. We consider a schedule-based holding strategy and a bus bunching prevention strategy. We also propose an algorithm for departure time interval estimation. Numerical tests based on real bus routes in Samara, Russia are used for a comparative study of different departure time prediction algorithms. 1. Introduction It is a challenge for public transportation agencies to provide reliable service because public transport often operates under conditions of high uncertainty. Variation in travel time caused by different factors of uncertainty, such as real-time traffic situation and traffic congestion, weather conditions, change of passenger flow, driver behavior [1]. The unreliability of the transport schedule reduces the efficiency of the transport infrastructure because road users have to take into account risks of arriving late while planning the route. This is especially important for the passenger transport, where the discrepancy with the schedule can accumulate and lead to the bus bunching problem. In this paper we consider a passenger transport departure time prediction problem. We compare algorithms based on the schedule, real-traffic data and operation strategies of the public transport management. There are many papers focused on the similar arrival time prediction problem. Proposed methodologies include • Regression models [2] that constructed as a regression function from the set of independent variables. Non-parametric regression (NPR) models is a relatively simple method for prediction without the need to estimate parameters. K nearest neighbor (k-NN) methods are one of the most popular NPR methods. Bus travel time prediction models using k-NN was developed in [3, 4]. • Kalman filtering models [5, 6] are an efficient recursive procedure that estimates the future states of dependent variables. In [7] authors developed a path-based model and a link-based model using Kalman filter to predict bus travel times. IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) Image Processing and Earth Remote Sensing A A Agafonov • Machine learning models, including artificial neural networks (ANN) and support vector machine (SVM) models. ANN has been reported to be especially useful for finding solutions for complex non-linear problems. In [8] authors proposed two ANN-based models to predict bus arrival time: the link-based ANN and the stop-based ANN. The paper [6] proposed a dynamic algorithm that integrated the ANN model and a Kalman filter-based algorithm. In [9] authors used Bayesian inference theory to combine neural networks. Their results showed that the ANN model outperformed the historical data based model and the regression model in terms of prediction accuracy. SVM is a very specific type of learning algorithm characterized by the capacity control of the decision function, the use of the kernel functions, and the sparse solution. The SVM-based models to predict bus arrival time was developed in [10–12]. • Hybrid methods [13, 14] that combine two or more models to predict arrival time precisely. However, the described models use the real-time and historical travel time information as the basis to predict arrival time of the public transport and cannot be used directly to predict departure time of the vehicle from the starting station. Another problem that needs to be considered in the context of this paper is the schedule design and operational management of the public transport [15]. The use of operational control strategies makes it possible to reduce the discrepancy between the scheduled time and the real time of arrival at the control points and prevent the bus bunching on the route. One of the most popular control strategies is the schedule-based holding control strategy: if a bus arrives at a stop earlier than the scheduled time, it is held until the scheduled departure time is reached [16]. Although this strategy can improve schedule adherence, it delays the operation of early buses and causes impatient among passengers on board. Another strategy is a driver schedule recovery strategy that proposes to adjust speed over segments between two consecutive control points to arrive on-time [17, 18]. However, this strategy requires that drivers highly flexible response to the traffic situation and can be limited by traffic conditions. In this paper, we consider the problem of the public transport departure time prediction from the starting station. The problem is considered in two formulations: estimation of the mean expected departure time and minimizing the travel time interval required to ensure a predefined probability of departure on-time. The structure of this paper is organized as follows: the second section introduces the basic notations and the problem formulation, describes the departure time prediction algorithms based on real-time traffic data and schedule. In this section, we propose an algorithm based on the operation control strategies. The third section presents the problem formulation of the departure time interval estimation and describes a base algorithm for solving this problem. In the fourth section, we present an experimental study of the algorithms. Lastly, we present our conclusions and recommendations for future work. 2. Mean expected departure time estimation 2.1. Basic notation Introduce the following notation. Let V be the set of public transport vehicles; R be the set of routes; Ir be the bus run numbers on the route r ∈ R; DTi be the departure time from the starting station of the bus with the run number i ∈ Ir on the route r ∈ R; DTisch be the scheduled departure time from the starting station of the bus i ∈ Ir ; ATi be the arrival time at the ending terminal of the bus i ∈ Ir ; ATisch be the scheduled arrival time at the ending terminal of the bus i ∈ Ir . IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 76 Image Processing and Earth Remote Sensing A A Agafonov Let ∆min be the minimum recovery time at the ending terminal (time between two consecutive runs). Denote the deviation of the real observed arrival time of the bus with the run number i ∈ Ir from the scheduled arrival time as ∆arr i : ∆arr i = ATi − ATisch . The waiting time between runs denote as ∆wait i : ∆wait i = DTjsch − ATisch , where (i, j), i ∈ Ir , j ∈ Ir is the numbers of consecutive runs of the bus v ∈ V on the route r ∈ R. The main problem is to predict the departure time from the starting station DTi , i.e., the following assessment construction: ˆ i , ∀i ∈ Ir , ∀r ∈ R. DT (1) ˆ i based In the next subsections, we describe algorithms for estimation the departure time DT on the real-time monitoring information, schedule, and operation control strategies: a holding control strategy and a bus bunching prevention strategy. 2.2. Monitoring data based algorithm The monitoring data based departure time prediction algorithm uses the real-time monitoring data taking into account the planned schedule. The main assumption of the algorithm is that the deviation of the arrival time at the ending terminal from the scheduled arrival time will be maintained when the vehicle departs from the starting station. The departure time estimation based on the monitoring data can be described as follows DTˆjM = DTjsch + ∆arr i . (2) Obviously, this estimation is quite simple and does not consider any control strategies to decrease the deviation from the scheduled departure time. In the next subsections, we present departure time prediction algorithms based on schedule and operation strategies. 2.3. Schedule based algorithm The schedule-based holding control strategy is one of the most popular strategies in the bus route schedule design problem. In this strategy, if a bus arrives early at the timing point, it will be held until the scheduled departure time is reached. If the bus is already late, it will depart from the timing point immediately after passenger pickup. We use the similar approach to adapt schedule-based holding strategy to estimate the departure time from the starting terminal. The schedule-based algorithm with the holding strategy considers the deviation of the arrival time ∆arr i and the waiting (recovery) time ∆wait i to control the departure time in the following way:  ATi + ∆min ,  ∆wait < 0, ˆ S DTj = DTjsch + α∆arr , ∆ arr < 0, (3) i i sch arr wait  DTj + max (0, β∆i − γ∆i ), otherwise,  where α ∈ [0, 1], β ∈ [0, 1], γ ∈ [0, 1] are the control strategy coefficients. The proposed control strategy based algorithm consists of the following steps: IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 77 Image Processing and Earth Remote Sensing A A Agafonov • if the vehicle arrived at the terminal stop later than the scheduled departure time of the next run, then shorten the waiting time to a minimum; • if the vehicle arrived earlier than the scheduled arrival time, then reduce the deviation from the scheduled departure time by the recovery time increasing; • otherwise, reduce the deviation from the scheduled departure time by reducing the recovery time. 2.4. Operation strategy based algorithm The operation strategy based algorithm considers a minimum delay between departure times of the different buses on the same route to prevent bus bunching. We assume that the minimum delay time between two consecutive runs can be different for different routes and is determined based on the route schedule. Denote the minimum delay between the departure times as ∆route r , r ∈ R. Then the departure time estimation for the run number j ∈ R of the vehicle v ∈ V on the route r ∈ R can be expressed as follows: DTˆjOS = max(DT ˆ S , DT + ∆route ), j k r (4) where k is the last run number, DT ˆ S is the departure time estimation obtained using the j schedule-based algorithm. Experimental study of the described departure time estimation algorithms is presented in section 4. 3. Departure time interval estimation In the previous section, we describe the mean expected departure time estimation problem. Sometimes this formulation cannot be suitable for passengers. Often it is desirable to know not a time instant of the departure, but a departure time interval required to ensure a predefined probability of departing on-time. Let τ be the required departure time instant, p be the required probability to depart on-time, r be the selected route. The departure time interval estimation problem can be expressed in the following form: P (τ − ∆dep < DTi < τ ) > p, (5) where DTi is the departure time, ∆dep is the estimated departure time interval. The base departure time ∆dep estimation algorithm uses traffic statistics. Let Πdr = {DTi , i ∈ Ir } be the dataset with the departure time of the vehicles v ∈ V on the routes r ∈ R in the selected day d, Πr = {Πdr }N d=1 be the statistics for N days. We introduce an indicator variable of the following form: ( 1, ∃DTi ∈ Πdr : τ − ∆ < DTi < τ ; p(d, τ, ∆) = (6) 0, othewise, Then the departure time interval estimation algorithm can be expressed in the following form: IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 78 Image Processing and Earth Remote Sensing A A Agafonov Algorithm 1: Departure time interval estimation p̂ = 0; ∆ = 0; while p̂ < p do ∆ = ∆ + step; // step - increment step p̂ = N P d=1 p(d, τ, ∆)/N ; end ∆dep = ∆; 4. Simulation setup and results An experimental study of the algorithms was carried out on the public transport data in Samara. To estimate the departure time, we chose the 1070 runs of different bus routes, for which the departure / arrival times to the control points according to the schedule are known. Firstly, we estimate the deviation of the real observed arrival time to the terminal stop from the scheduled arrival time. Figure 1 shows the histogram of the arrival time deviation. The positive deviation indicates the delay in the arrival, and the negative deviation means that the vehicle arrives ahead of schedule. As can be seen from the histogram, vehicles often arrive at the ending terminal later than the scheduled time. 350 300 Vehicles number 250 200 150 100 50 0 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 Devia!on, min Figure 1. Arrival time deviation Next, we compare the proposed algorithms based on monitoring data, schedule and operation strategies. Table 1 provides the mean absolute error and standard deviation of the proposed algorithms. Table 1. Algorithms comparison Mean absolute error Standard deviation Monitoring data based algorithm 601.0 1147.35 Schedule based algorithm 249.95 385.22 Operation strategies based algorithm 260.85 377.78 IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 79 Image Processing and Earth Remote Sensing A A Agafonov The schedule-based algorithm with the holding strategy provides the best results by the selected criteria. Using bus bunching prevention control strategy do not improve the departure time estimation precision. At the final stage of the experimental study, we estimate the deviation of the real observed departure time from the estimated departure time. Figure 2 depicts the histogram of the departure time deviation. A positive deviation means that the vehicle departed after the estimated departure time. 300 250 Vehicles number 200 150 100 50 0 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 Devia!on, min Figure 2. Estimated departure time deviation As can be seen from the histogram, the schedule-based algorithm estimates the departure time without deviation from the observed time. 5. Conclusion In this paper, we considered the problem of public transport departure time estimation. We compared the algorithm based on the real-time monitoring, the schedule-based algorithm with the holding operation strategy, and the operation strategy based algorithm with bus bunching prevention strategy. The schedule-based algorithm with the holding control strategy showed the best results by mean absolute error criteria. Using bus bunching prevention strategy did not decrease the error of the estimation. The main disadvantage of the proposed algorithm is the fact that the traffic statistics are not used to estimate the departure time of the public transport. In this paper, we considered a public transport departure time prediction problem. The problem is considered in two notations: estimation of the mean expected departure time and estimation of the time interval required to ensure a predefined probability to depart on-time. We compare algorithms based on a real-time traffic data and public transport schedule. The paper proposes an original prediction algorithm based on the operation strategies of the public transport management. Numerical tests based on real bus routes in Samara, Russia are used for a comparative study of different departure time prediction algorithms. In addition, we consider the problem of the time interval estimation required to ensure a predefined probability to depart on-time. The base algorithm for solving this problem is proposed. The development of more complex algorithms will be conducted as our future research. IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 80 Image Processing and Earth Remote Sensing A A Agafonov 6. References [1] Chen X, Yu L, Zhang Y and Guo J 2009 Analyzing urban bus service reliability at the stop, route, and network levels Transportation Research Part A: Policy and Practice 43 722-734 [2] Agafonov A, Sergeev A and Chernov A 2012 Forecasting of the motion parameters of city transport by satellite monitoring data Computer Optics 36 453-458 [3] Chang H, Park D, Lee S, Lee H and Baek S 2010 Dynamic multi-interval bus travel time prediction using bus transit data Transportmetrica 6 19-38 [4] Smith B, Williams B and Keith Oswald R 2002 Comparison of parametric and nonparametric models for traffic flow forecasting Transportation Research Part C: Emerging Technologies 10 303-321 [5] Vanajakshi L, Subramanian S and Sivanandan R 2009 Travel time prediction under heterogeneous traffic conditions using global positioning system data from buses IET Intelligent Transport Systems 3 1-9 [6] Chen M, Liu X, Xia J and Chien S 2004 A dynamic bus-arrival time prediction model based on apc data Computer-Aided Civil and Infrastructure Engineering 19 364-376 [7] Chien S J and Kuchipudi C 2003 Dynamic travel time prediction with real-time and historic data Journal of Transportation Engineering 129 608-616 [8] Chien S J, Ding Y and Wei C 2002 Dynamic bus arrival time prediction with artificial neural networks Journal of Transportation Engineering 128 429-438 [9] van Hinsbergen C, van Lint J and van Zuylen H 2009 Bayesian committee of neural networks to predict travel times with confidence intervals Transportation Research Part C: Emerging Technologies 17 498-509 [10] Yu B, Lam W and Tam M 2011 Bus arrival time prediction at bus stop with multiple routes Transportation Research Part C: Emerging Technologies 19 1157-1170 [11] Bin Y, Zhongzhen Y and Baozhen Y 2006 Bus arrival time prediction using support vector machines Journal of Intelligent Transportation Systems: Technology, Planning, and Operations 10 151-158 [12] Yu B, Yang Z and Wang J 2010 Bus travel-time prediction based on bus speed Proceedings of the Institution of Civil Engineers: Transport 163 3-7 [13] Park T and Lee S 2004 A bayesian approach for estimating link travel time on urban arterial road network Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3043 1017-1025 [14] Agafonov A and Myasnikov V 2014 An algorithm for city transport arrival time estimation using adaptive elementary predictions composition Computer Optics 38(2) 356-368 [15] Wu Y, Tang J and Luo X 2015 Comparative analysis of operation strategies in schedule design for a fixed bus route International Transactions in Operational Research 22 545-562 [16] Cats O, Larijani A, Ólafsdóttir A, Burghout W, Andréasson I and Koutsopoulos H 2012 Bus-holding control strategies Transportation Research Record 100-108 [17] Kalaputapu R and Demetsky M J 1995 Modeling schedule deviations of buses using automatic vehicle-location data and artificial neural networks Transportation Research Record 44-52 [18] Yan Y, Meng Q, Wang S and Guo X 2012 Robust optimization model of schedule design for a fixed bus route Transportation Research Part C: Emerging Technologies 25 113-121 Acknowledgments This work was supported by the Russian Foundation for Basic Research (RFBR) grant 18-07-00605 A, grant 18-29-03135. IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) 81