On Predicting Traveling Times in Scheduled Transportation (Abstract)


Avigdor Gal                                                                                       AVIGAL @ IE . TECHNION . AC . IL
Avishai Mandelbaum                                                                                  AVIM @ IE . TECHNION . AC . IL
Francois Schnitzler                                                                           FRANCOIS @ EE . TECHNION . AC . IL
Arik Senderovich                                                                                SARIKS @ TX . TECHNION . AC . IL
Technion - Israel Institute of Technology, Haifa, Israel
Matthias Weidlich                                                                   WEIDLIMA @ INFORMATIK . HU - BERLIN . DE
Humboldt-Universität zu Berlin, Berlin, Germany


Traveling time prediction                                            queues. We propose a model that uses natural segmentation
                                                                     of the data according to bus stops and a set of predictors,
Urban mobility impacts urban life to a great extent. People,         some use learning while others are learning-free, to estimate
living in cities, plan their daily schedule around anticipated       traveling time.
traffic patterns. Some wake-up early to “beat” rush hour.
Others stay at home and work during days when a conven-              The model of journey segments. As the foundation of our
tion comes to town.                                                  approach, we propose to model each bus trip by using a
                                                                     segmentation model as follows. A trip between two stops
To enhance urban mobility, much research was invested
                                                                     consists of segments, with each segment being represented
in traveling time prediction, see (Wu et al., 2004). That
                                                                     by a ‘start’ stop and an ‘end’ stop, see Figure 1. Given the
is, given an origin and destination, provide a passenger
                                                                     first stop of a trip ω1 and the last stop of a trip ωn , the inter-
with an accurate estimation of how long a journey lasts. In
                                                                     mediate stops are known in advance since each bus follows
particular, the ability to predict traveling time in scheduled
                                                                     a predefined journey pattern. Therefore, a trip can be de-
transportation, e.g., buses, was shown to be feasible (Chien
                                                                     scribed by segments that are characterized by a pair of stops
et al., 2002).
                                                                     of the form hωi , ωi+1 i (Figure 1). This segmented model,
In this work, we address the problem of online travel time           in turn, allows for fine-granular grounding of the prediction
prediction in the context of a bus journey. That is, a journey       of traveling time T (hω1 , . . . , ωn i, tω1 ) for a sequence of
may be ongoing in the sense that journey events already              stops hω1 , . . . , ωn i when departing at time tω1 : instead of
indicated the progress of the bus on its route. For such an          considering only journeys that follow the same sequence of
ongoing journey, we are interested in the current prediction         stops hω1 , . . . , ωn i, all journeys that share some segments
of the traveling time from the current bus stop to some des-         can be used for prediction.
tination via a particular sequence of stops, which is defined
by the respective journey pattern.
Prediction Approach                                                          ʘͺϭ                   ʘͺŝ         ʘͺ΂ŝнϭ΃   ʘͺŶ


To address the problem of online travel time prediction, we
                                                                             Figure 1. A segmented model of traveling times
investigate a novel use of methods from Queueing Theory
and Machine Learning in the prediction process. We pro-              Using information on bus stops, the prediction of the journey
pose a prediction engine that, given a scheduled bus journey         traveling time T (hω1 , . . . , ωn i, tω1 ) is traced back to the
(route) and a ‘source/destination’ pair, provides an estimate        sum of traveling times per segment. The traveling time per
for the traveling time, while considering both historical data       segment is assumed to be independent of a specific journey
and real-time streams of information that are transmitted by         pattern and, thus, also independent of a specific journey:
buses. To do so, we model buses as clients that go through                                                   X
                                                                                                             n−1
a journey of segments that are interpreted as a network of                    T (hω1 , . . . , ωn i, tω1 ) =     T (hωi , ωi+1 i, tωi ),
                                                                                                          i=1
                                                                     where tωn−1 = tω1 + T (hω1 , ωn−1 i, tω1 ).
Proceedings of the 2 nd International Workshop on Mining Urban
Data, Lille, France, 2015. Copyright c 2015 for this paper by its    Prediction based on the snapshot principle. A first set
authors. Copying permitted for private and academic purposes.
                                                                     of predictors is grounded in heavy-traffic approximations


                                                                    33
                                  On Predicting Traveling Times in Scheduled Transportation

in Queueing Theory. It is non-learning, in the sense that it        accuracy of the presented methods. The main results of our
does not generalize prediction from historical events, but          experiments have been:
rather uses recent events to predict future traveling times.
                                                                        • Prediction methods that combine the snapshot princi-
Applied to our context, the main idea is that a bus that passes
                                                                          ple and Machine Learning techniques are superior in
through a segment, will experience the same traveling time
                                                                          quality of prediction to both snapshot predictors and
as another bus that has just passed through that segment
                                                                          Machine Learning methods (that do not include the
(not necessarily of the same type, line, etc). Following this
                                                                          snapshot predictor).
line, we define a single-segment snapshot predictor, called
                                                                        • The prediction error increases with the number of bus
Last-Bus-to-Travel-Segment (LBTS).
                                                                          stops per journey. However, when considering the
To use this predictor to address the online travel time predic-           relative error, it is stable for all trip lengths, i.e. the pre-
tion problem, it needs to be lifted to a network setting. To              dictors do not deteriorate proportionally to the length
this end, we exploit the fact that the snapshot principle holds           of the journey (in stops).
for networks of queues, when the routing through this net-              • Surprisingly, the snapshot predictor does not deteri-
work is known in advance (Reiman & Simon, 1990). Clearly,                 orate for longer trips, therefore contradicting the hy-
in scheduled transportation, this is the case, so that we de-             pothesis that the snapshot predictor would be more
fine a multi-segment (network) snapshot predictor, called                 precise for journeys with higher temporal proximity to
Last-Bus-to-Travel-Network. It is derived by summing up                   the current journey.
the LBTS predictions for the segments of the respective
journey pattern of the bus for which the prediction is made.
                                                                    Conclusion
                                                                    In this work, we presented a novel approach towards predict-
Prediction using Machine Learning methods. A second
                                                                    ing travel time in urban public transportation. It is grounded
set of predictors comes from Machine Learning and is based
                                                                    in a partitioning of the travel time into stop-based segments,
on regression trees. They exploit past journey logs to learn
                                                                    and combines the use of Machine Learning and Queueing
a prediction model, and then use this model to make a
                                                                    Theory predictors to model traveling time in each segment.
prediction on new instances of the problem, in our case,
                                                                    Our empirical evaluations confirmed that the combination
traveling times as part of current journeys.
                                                                    of methods indeed improves performance. Moreover, we
As a first step, we formalize the traveling times prediction        observed that the snapshot predictor is, counter-intuitively,
problem as a regression problem. Features considered for            unaffected by the length of a journey. This leads to positive
the regression include the travel time of the last bus that         evidence in favor of applying mixed Queue and Machine
used that segment (LBTS, as introduced above); the interval         Learning predictors in similar settings.
between the time the last bus left the segment and the esti-
                                                                    This work was supported by the EU INSIGHT project (FP7-
mated time to enter the segment; the day of the week; and
                                                                    ICT 318225).
the time of the day. For the resulting regression model, vari-
ous generic algorithms are applied to derive an ensemble of
regression trees that is then used to solve the prediction prob-    References
lem. Specifically, random forests, extremely randomized             Chien, Steven I-Jy, Ding, Yuqing, and Wei, Chienhung.
forests, AdaBoost, and gradient tree boosting are leveraged.          Dynamic bus arrival time prediction with artificial neural
In a final step, the above methods originating from Queueing          networks. Journal of Transportation Engineering, 128
Theory and Machine Learning are combined. That is, we                 (5):429–438, 2002.
rely on the boosting algorithms and modify them such that
                                                                    Reiman, Martin I and Simon, Burton. A network of priority
the first model considered in the boosting is the snapshot
                                                                      queues in heavy traffic: One bottleneck station. Queueing
predictor model.
                                                                      Systems, 6(1):33–57, 1990.
Evaluation
                                                                    Wu, Chun-Hsin, Ho, Jan-Ming, and Lee, Der-Tsai. Travel-
To demonstrate the value of our approach, we tested the              time prediction with support vector regression. Intelligent
proposed predictors using bus data that comes from the bus           Transportation Systems, IEEE Transactions on, 5(4):276–
network in the city of Dublin.1 The data includes location           281, 2004.
of buses that is sampled in intervals of 5 to 300 seconds,
depending on the current location of the bus.
Using this data, we empirically evaluated the prediction
  1
    See also http://www.dublinked.ie/ and http://
www.insight-ict.eu/


                                                                   3N