V. Kůrková et al. (Eds.): ITAT 2014 with selected papers from Znalosti 2014, CEUR Workshop Proceedings Vol. 1214, pp. 54–60
http://ceur-ws.org/Vol-1214, Series ISSN 1613-0073, © 2014 Z. Jiráček, V. Martínek, M. Čermák


                    Deviations Prediction in Timetables Based on AVL Data

                                    Zbyněk Jiráček, Vladislav Martínek, and Miroslav Čermák

                                                     Dept. of Software Engineering,
                                                      Charles University in Prague,
                                                        Prague, Czech Republic
                                                           JBI@seznam.cz,
                                                     martinek@ksi.mff.cuni.cz,
                                                      cermak@ksi.mff.cuni.cz

Abstract: Relevant path planning using public transportation is limited by the reliability of the transportation network. In some cases it turns out that we can plan paths with respect to expected delays and thereby improve the reliability of the resulting path. In our work we focus on the prediction of delays in public transportation systems. For this purpose we use data from vehicle tracking systems used by transit operators, known as AVL data.
   We compare statistical methods to methods of artificial intelligence using data from the Prague tram tracking system. We discovered that in some cases neural networks show better results than the statistical methods. In contrast, sometimes even simple statistical methods give results as good as those provided by the neural networks.

1    Introduction
Many people need to travel almost every day. When they do so, they usually have two options: to use individual means of transportation (e.g. a car) or to use public transportation. When making this decision, many aspects are taken into account. One of the important aspects besides price and duration of the journey is reliability.
   Especially in larger cities, a lot of money is often spent to support public transportation systems, including reliability improvements, mostly by building underground lines and by segregating trams and buses from individual traffic.
   An alternative way of improving reliability is by providing useful information. Given that we will never be able to ensure that the system is 100% reliable, we can soften the consequences of traffic irregularities by warning the passengers. In a more advanced case we can improve path-planning systems in such a way that they prefer more reliable paths. The main advantage of this approach to dealing with irregularities is that it is relatively simple and cheap. Transit operators usually already collect tracking data from vehicles. These data are known as Automatic Vehicle Location (AVL) data. AVL data give us information about positions of vehicles, typically in real-time.
   In this article we find and compare approaches to interpreting the information from AVL data and to predicting the future development of delays of public transportation vehicles in real-time. We assume that the resulting information can be transmitted to passengers, for instance via a mobile application. Since the number of people who possess a smart mobile device is increasing every year, this also means an increase in potential users of this kind of application. Additionally, transit operators can equip stops with information systems presenting this information to passengers.
   We compare three methods: statistical, regression and neural networks. The statistical methods are expected to be computationally the least complex, while the neural networks should provide the best results.
   We also define and compare static and dynamic prediction. By static prediction we mean models that use only past data and do not have access to real-time information. Therefore, they can predict only expectable deviations that happen repeatedly.
   Unlike static prediction, dynamic prediction does not use only past data, but takes real-time information into account as well. This allows recognition of unexpected problems in the transit network. On the other hand, real-time data have only short-term validity, and passengers can use information from a dynamic prediction system only when they are about to travel somewhere or while they are already travelling.

   The structure of the article is as follows: The first section introduces the problem and the goal. The second section shows previous and related work. In the third section we analyse the problem in detail and in the fourth section we evaluate the selected methods. In the fifth section we compare and discuss the results. The sixth section offers some ideas about practical usage of the results. The seventh section concludes the article and outlines future focus.

2     Related Work

There is a lot of work related to the reliability of public transport. Common public transport unreliability issues are discussed by Rietveld et al. in [1]. A static prediction algorithm was already presented by Martínek and Žemlička in [2]. The algorithm corrects the timetable given by the transit operator and the result is then presented to passengers. A similar approach is mentioned by Dessouky and Randolph [3]. They treat travel time as a log-normally distributed random variable and calculate the expected travel time as the timetable time plus mean delay.
   An example of a dynamic approach is described by Tien et al. in [4]. The system, though, does not predict the situation; it only checks whether the user’s progress corresponds to the schedule and, if not, it computes an updated schedule or a new route.
   A dynamic prediction using statistical methods was used by Wall and Dailey in [5]. Jeong and Rillet compared regression methods and neural networks on a bus line in Houston, Texas [6]. According to their measurements, neural networks seem to be more accurate and more promising. A more general comparison of neural networks and statistical methods in transportation is provided by Karlaftis and Vlahogianni [7]. They point out that neural networks are better at recognizing more complex non-linear relations. But a significant drawback of neural networks is that they are much less transparent than statistical methods.
   Bohmova and Mihalak [8] suggest that in larger networks where each line is served with high frequency we can guide passengers only by a list of stops and lines. It means that instead of the information “take line X departing at HH:MM” we tell the user to take the first vehicle of line X or Y in a specific direction.


3    Our Focus

We focus mostly on transportation networks in larger cities. These networks usually have a more complex structure without a clear hierarchy, and there are several options of getting from one location to another.
   According to Rietveld et al. [1] there are two major causes of unreliability in public transport: recurrent and non-recurrent congestion. While recurrent congestion occurs every weekday at particular times and places, non-recurrent congestion is caused by unpredictable incidents. Non-recurrent congestion is relevant especially when talking about trams or trains. A smaller incident can affect more passengers since rail vehicles are typically not able to bypass the critical spot. Note that non-recurrent congestion cannot be predicted statically. However, dynamic prediction mechanisms can identify the problem since they have access to information about the current situation.
   Rietveld et al. also mention an obvious trade-off. Faster transport or shorter halting times will improve the scheduled travel times, but will have an adverse effect on the reliability of the service [1]. This motivates us to study the impact of the current delay on the additional future delay of the same vehicle. Sometimes when a vehicle is far behind its schedule, it creates a longer interval between this vehicle and the previous one on the same line. If the line is served with high frequency, passengers usually do not consult schedules and arrive randomly at their stops [3]. A larger interval in this situation means that the delayed vehicle must transport more passengers. More passengers cause longer dwell times at stops, which can lead to further delays if there is not enough spare time in the timetable. Furthermore, the next vehicle, if it is on time, will have fewer passengers to serve and therefore will stay on time more easily.
   We were able to locate the effect described above in the data from the Prague tram tracking system we have available. Let a be a stop somewhere near the middle of a specific line and let z be the final stop of the same line. We would like to express the relationship between D(a), the delay at the stop a, and D+(a, z) = D(z) − D(a), which is the additional delay at the stop z. We divided the past observations into clusters by D(a) value; each cluster contains connections with a delay from k to k + 1 minutes at the stop a. Then we calculated the average additional delay for each of these clusters. If the delays on the way were independent, these values should be similar. But figure 1 shows there most probably is a relationship. While on the line 9b the drivers are usually capable of reducing the delay, the line 22a shows that the more delayed a vehicle is, the more additional delay it usually gains on the rest of its way to the final stop.

Figure 1: The average additional delay on arrival to the final stop for lines 9b (left), and 22a (right).


4    Used Methods

We have data from the Prague tram tracking system from March and April 2008. When a tram serves a stop, it also sends a message to the tracking system where this information is stored. Therefore we don’t know the exact position of the tram at every moment; we have only information about the last stop the tram stopped at together with the associated time.
   For objective evaluation of the methods we divided the data into a learning set and a test set. The learning set is about two times larger and is used as input when creating a model. The test set is then used to evaluate the model. The learning set contains data from March and the beginning of April, while the test set contains the rest of the data until the end of April. This way we simulate the real situation, in which we create the model on past data and use the model for predictions.
   For evaluation of the methods we calculate the following metrics. In these metrics, an error is the difference between the predicted and real arrival.

     • The average absolute error
     • Median absolute error
     • Mean absolute percentage error (the absolute error divided by the actual travel time)
     • 95% confidence interval of the absolute error
     • Percent of connections with absolute error under 60 seconds

   For a particular tram currently located at a certain stop that we call the initial stop, our task is to predict the times of arrival to the following stops on its path. For simplification we have chosen three lines (parts of the lines, respectively) with different characteristics. We also chose for each line one initial stop a and one target stop b instead of predicting arrivals to all remaining stops on the line. The line parts we have chosen as the test subjects are described in the following paragraphs.
   Line 9b in the selected part is completely segregated from individual transport and there are no traffic lights on its track. Therefore it is rarely delayed and if so, the drivers are usually capable of decreasing the delay on the way (see figure 1).
   Line 22a in the selected part has some intersections equipped with traffic lights on its way. This makes the vehicle movement more unpredictable, but does not cause major delays.
   Line 22b in the selected part is not segregated from individual transport and it suffers from recurrent and non-recurrent congestion much more than the previous two lines.
   For all three lines, the travel time from the stop a to the stop b is approx. 12–15 minutes.

   We applied three prediction methods to these lines in Matlab: a simple statistical method, neural networks and regression.

   In the following sections we use this notation:
L: The learning set of past observations on a particular line. An observation is a set of times and delays for each stop on the line. An observation corresponds to a single connection performed by a tram vehicle in the past.
a: The initial stop.
b: The target stop.
D(c, s): The delay of a specific tram connection c at the stop s.
D+(c, a, b): The additional delay of a specific tram connection c between stops a and b. This value is equal to D(c, b) − D(c, a).

4.1     Statistical Methods
The most straightforward solution is to calculate the average additional delay D+(a, b) between stops a and b in the following way:

        D^{+}(a, b) := \frac{1}{|L|} \sum_{c \in L} D^{+}(c, a, b)        (1)

   Now in the present situation we have a vehicle v currently located at the stop a. Therefore we know D(v, a). We want to predict D(v, b). In the following formula, let \tilde{D}(v, b) be the prediction of D(v, b).

        \tilde{D}(v, b) := D(v, a) + D^{+}(a, b)        (2)

   Since we use the current delay to predict the future delay, this is a dynamic prediction algorithm. We can compare it with a static version, which corresponds to the approach provided by Martínek and Žemlička in [2]. In the static version we calculate the expected delay without using the value D(v, a), which we don’t know at the moment of the calculation:

        \tilde{D}(b) := \frac{1}{|L|} \sum_{c \in L} D(c, b)        (3)

   In the formula above we use the average delay at the target stop instead of the average additional delay. Note that since this is a static calculation, it does not depend on the concrete vehicle v. Finally, we compare the static and dynamic calculations using our dataset. Table 1 shows the average absolute errors for both static and dynamic prediction algorithms.

Clustering. Similarly to [2], we can divide the data into workdays and weekends and cluster the average values by hours to get finer resolution, since the delays in morning hours may differ from those in the evening. This means that instead of one D+(a, b) value we have 2×24 values, one for each hour, for workdays and weekends separately. In equation 2 we use one of the 48 values based on the current time and the day of week. Table 1 shows the improvement in average prediction error.

  Method                    Average absolute error
                            9b        22a       22b
  Static non-clustered      64.8 s    90.8 s    170.5 s
  Static clustered          62.6 s    90.5 s    150.1 s
  Dynamic non-clustered     37.1 s    51.2 s    113.7 s
  Dynamic clustered         32.5 s    47.0 s    97.2 s

Table 1: Average absolute error for static/dynamic clustered/non-clustered statistical prediction


4.2     Neural Networks

Neural networks are commonly used in transportation research (see [6] and [7]). Their main advantage is that they can handle multi-dimensional data and are capable of recognizing non-linear relationships. The main disadvantage lies in the lack of transparency. It is usually very hard to explain the results calculated by the neural networks.
   We trained the neural networks on the past data using the Levenberg-Marquardt method with 10 neurons in one
hidden layer, which showed the best performance and accuracy in our tests. More information about the structure and learning process of neural networks can be found in the literature [9]. We used the neural networks toolbox in Matlab. Some of its advantages are a built-in protection against overfitting and automatic normalization of the input.
   For each past connection observation from the learning set c ∈ L we created an input vector. The input vectors consisted of the time, day of week and delays at each stop from the starting point of the line to the initial stop a. The network had only one output value: the prediction of the delay at the stop b.¹ The first results are shown in table 2.
   ¹ Actually, the output could be easily widened to produce one prediction value for each stop; we use only one-dimensional output for clarity.

  Statistics on lines         9b        22a       22b
  Average absolute error      29.8 s    45.6 s    96.5 s
  Median absolute error       22.2 s    36.0 s    59.2 s
  Mean percentage error       5.36 %    8.81 %    14.51 %

Table 2: Prediction precision using neural networks

   As the results are very similar to those provided by the simple statistical methods (see table 1), we decided to make some improvements to the neural network.

Input. When changing the structure of the input vectors, we found that the network does not use delays from the stops before the stop a. Additionally, the weekday information could be simplified to a boolean value “is-workday”.
   As a result, only three-element vectors were used as the input: the time, the workday boolean, and the delay at the stop a. The results did not change.

Topology. We have tested many different topologies of the network. It turns out that a neural network with approx. 10 neurons in one hidden layer is sufficient. In rare cases the neural network failed to learn, which can be improved by adding one more layer. Adding more neurons and layers only slowed down the learning process, but did not improve the results. Changing the learning method did not bring any improvements either.

Clustering. Similarly to the statistical methods, we tried to divide the data, at first only into two groups: workdays and weekends. Then we trained two neural network models and for each input we used the appropriate model. We found that this approach only worsens the previous results. Later, we found that Jeong and Rillet in [6] have observed the same effect.

Current situation. Until now we used only data about the particular connection when constructing a single input vector. The neural network treats the input vectors independently and therefore when it is asked for an output, it can use only the data specified in the input vector. That means the network did not use any other information, for instance about the status of the previous vehicles on the same line.
   However, we would like the network to use more information about the current situation. In order to do that we need to extend the input vector and encode the information into it.
   We decided to try to improve the results by adding information about a few previous vehicles on the same line. The question is how to express this information in the form of a vector that a neural network would be able to understand. The fact that we do not know the exact positions of the trams, but only the last served stop, also needs to be taken into account.
   Given the limitations above we extended the input vector by two values: the number of trams of the given line currently located between stops a and b, and the travel time from a to b of the last tram on the given line that has reached the stop b. This improved the results for line 22b by approx. 25 %, but did not bring any significant changes to the results for lines 9b and 22a. Table 3 shows how the absolute prediction error has changed.

  Statistics on lines         9b        22a       22b
  Average absolute error      32.7 s    45.8 s    72.6 s
  Median absolute error       26.1 s    37.0 s    49.1 s
  Mean percentage error       5.91 %    8.76 %    11.41 %

Table 3: Prediction precision using neural networks with more inputs

   We explain these results by the differences between the lines. As the lines 9b and 22a are not directly influenced by other types of transport, their delays are more random. On the contrary, the line 22b is highly influenced by the current traffic situation in the area, which usually does not dramatically change within just a few minutes. Therefore if a tram on the line 22b is delayed, it is probably caused by traffic congestion and it is also probable that the next tram on this line will be delayed too.

Further improvements. The input vectors now contain more information about the current situation, yet the data presented are still very limited. The network uses the information about the last tram that has passed the stop b. This might still not be optimal.
   If the travel time between stops a and b is for example 15 minutes, we use information about a tram which is 15 minutes ahead. These 15 minutes are long enough for the situation to change, and therefore the prediction may be based on obsolete information.
   Moreover, as the trams send their location to the system only at stops, this can cause problems in case of an accident. If a tram is extremely delayed or stopped on its
way, we are not informed about that. The only way how           4.4   Improvements
to assume this is by the fact that the tram has not arrived
to the stop b for a long time. This, again, slows the reac-     Similarly as we did with the previous methods we tried to
tion of the network, since it takes some time before a tram     apply some improvements. First, adding higher degrees of
becomes late enough to be suspicious.                           the time and delay input variables did not change results
   What could improve the results is the knowledge of the       significantly. Neither did clustering of the results. It may
exact position of the tram in real-time (or in reasonable in-   be possible that there is some combination of the input
tervals). But given the nature of the system used in Prague     variables that could lead to better results, but we believe it
this is rather unrealistic.                                     is unlikely.
   The only option left is a better usage of the data pre-
sented. For instance the input vector could contain infor-
mation about delays of trams at stops between a and b,          5     Comparison
which it does not now. Or it could contain information
about trams from other lines that share a part of their         In this section we would like to compare the results from
path with the current line. Nevertheless, it is necessary       the previous sections.
to present the values in such way that the neural network
will be able to interpret the data. We believe there may be
                                                                5.1   Used Methods
a chance of further improvements regarding to this mat-
ter, though we were not able to devise an input form that       First we compare the used methods. The table 5 shows the
would prove that.                                               final results.

4.3 Regression                                                                                 Average absolute error
                                                                      Method
                                                                                                9b      22a      22b
The similarities between the results of statistical process-          Statistical processing   32.5 s 47.0 s 97.2 s
ing and neural networks encouraged us to try one more                 Neural networks          29.8 s 45.6 s 72.6 s
method – regression. Regression should provide better re-             Regression               32.0 s 48.0 s 73.9 s
sults than simple statistics, which in some cases gave as
good results as the neural networks.                            Table 5: Best average absolute error for particular predic-
  Inspired by the input of the neural networks, we used         tion methods.
the following linear equation for the regression:
                                                                   The results indicate that on lines 9b and 22a all the
        D(v, b) = k1 T (v) + k2W (v) + k3 D(v, a) + k4 NP
                                                                methods present similar predictions. We think that this
               + k5 D(w, b) + k6                                is caused by relatively good punctuality rate of these two
   where:

D(v, b) is the delay of the vehicle at the target stop,
T(v) is the time of departure of the vehicle,
W(v) is 1 for workdays, or 0 for weekends,
D(v, a) is the delay of the vehicle at the initial stop,
NP is the number of vehicles on the same line currently located between stops a and b,
D(w, b) is the delay of the last vehicle on the same line that has passed the stop b,
k_i are the coefficients to be determined by the regression.

   Note that the inputs to this equation are the same as those we used for the neural
networks in section 4.2.
   The results are summarized in the following table:

     Statistics on lines         9b        22a        22b
     Average absolute error    32.0 s    48.0 s     73.9 s
     Median absolute error     25.6 s    38.2 s     52.1 s
     Mean percentage error     5.76 %    9.11 %    11.81 %

     Table 4: Prediction precision using linear regression.

   The comparison to the other methods is offered in section 5.1.

lines. Average delay of line 9b at the target stop is 38 seconds; for line 22a it is
94 seconds. The average delay for line 22a is higher, but the delays on this line are
probably caused mostly by the three intersections equipped with traffic lights, which
generate unpredictable deviations. Together they can hold a tram for approx. 180 seconds
in the worst case.
   The results on lines 9b and 22a suggest that in a traffic network where there are only
small delays, or where the delays are caused mostly by unpredictable factors, simple
statistical methods are the most suitable. Implementation of linear regression or even
neural networks is far more complex and most probably does not bring any improvement in
these situations.
   Regarding the line 23b, neural networks together with linear regression outperformed
the statistical prediction. This is mostly because the statistical methods we used were
not able to process multi-dimensional input. The neural networks and regression became
more precise when information about previous vehicles was added to the input, which
cannot be added as easily to the statistical methods. Before we added this data to the
input of the neural networks and regression, the results were similar for the line 23b
too.
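As an illustration only, and not the implementation used in our experiments, the regression over the inputs listed above can be sketched in Python. The sketch assumes that the model is a plain linear combination of the inputs with an added intercept, that T(v) is expressed in hours since midnight, and that the coefficients are found by ordinary least squares via the normal equations. The function names and all observations are hypothetical; the data are synthetic and noise-free, generated from invented coefficients so that the fit can be checked.

```python
# Illustrative sketch only: names, units and data are assumptions,
# not the implementation used in the experiments.

def features(t_hours, workday, delay_a, n_between, delay_prev_b):
    """Input vector [1, T(v), W(v), D(v,a), NP, D(w,b)].

    The leading 1 is an assumed intercept term.
    """
    return [1.0, t_hours, float(workday), delay_a, float(n_between), delay_prev_b]


def fit_least_squares(rows, targets):
    """Solve the normal equations (X^T X) k = X^T y by Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(k, row):
    """Predicted delay D(v, b) in seconds for one input vector."""
    return sum(ki * xi for ki, xi in zip(k, row))


# Synthetic observations: (T(v) in hours, W(v), D(v,a) in s, NP, D(w,b) in s).
observations = [
    (6.0, 1, 30.0, 1, 20.0),
    (8.0, 1, 90.0, 3, 70.0),
    (12.0, 1, 10.0, 2, 15.0),
    (17.5, 1, 120.0, 4, 95.0),
    (9.0, 0, 0.0, 1, 5.0),
    (14.0, 0, 45.0, 2, 60.0),
    (20.0, 0, 15.0, 1, 0.0),
    (22.0, 1, 60.0, 2, 40.0),
]
# Invented coefficients used only to generate noise-free target delays.
true_k = [10.0, 1.5, 20.0, 0.8, 5.0, 0.3]

rows = [features(*obs) for obs in observations]
targets = [predict(true_k, row) for row in rows]

k = fit_least_squares(rows, targets)
errors = [abs(predict(k, row) - y) for row, y in zip(rows, targets)]

print("coefficients:", [round(c, 3) for c in k])
print("mean absolute error:", round(sum(errors) / len(errors), 3))
```

On real AVL data the targets would be measured delays rather than generated ones, and the errors would have to be evaluated on observations not used for fitting.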
   The results show no significant difference between the precision of neural networks
and linear regression. This is a little surprising, as we expected the neural networks to
be capable of discovering more complex non-linear relationships between the input and
output data.

5.2 Static vs. Dynamic Prediction

In this article we also wanted to compare the static and dynamic methods. It is clear that
the dynamic methods should provide more accurate predictions; the purpose of this
comparison is rather to quantify the improvement that the dynamic prediction methods can
offer.
   To simulate the static environment we used the same methods: statistics, neural
networks, and regression. The only difference is that static methods do not know the
actual timetable deviations and therefore do not have the D(v, a) value on the input. As
a result, the static methods must predict the delay D(v, b) using only the time and the
day of week (based on past observations).
   First we compared the static versions of the methods to each other. In the static
environment all three methods give almost the same results.
   Then we compared the static and dynamic methods. Table 6 compares the results of the
best static method with those of the best dynamic method.

  Statistics for lines            9b        22a        22b
  Mean abs. err.       Stat      63 s       90 s      150 s
                       Dyn       30 s       46 s       73 s
  Median abs. err.     Stat      47 s       67 s      104 s
                       Dyn       22 s       36 s       49 s
  Mean perc. err.      Stat    11.3 %     17.2 %     24.2 %
                       Dyn      5.4 %      8.8 %     11.4 %

Table 6: Comparison of static (Stat) and dynamic (Dyn) prediction methods.

6 Possible Usage

6.1 Information Systems

The results that the algorithms present could be used in public transport information
systems. The simplest usage of the data is a direct presentation of the results to
passengers via mobile phones or information systems at stops. Such systems already exist
in many cities; they normally list departures from a particular stop, ordered by the
departure time. These systems typically show only static timetable data, sometimes
together with current delays.
   But the passengers are not actually interested in the current delay; the information is
provided to them so that they can infer the real departure time, which is inherently
predicted as the timetable departure + the delay. However, we could provide the users
with the predicted departure, which should be more accurate.
   We decided to compare those values. In Table 7, “Inherent prediction” is a simple
algorithm that always predicts the future delay to be the same as the current delay,
formally D̃(v, b) := D(v, a). In fact, this is the simplest possible dynamic prediction
algorithm. For reference we also added a “No prediction” algorithm, which simply uses the
value from the timetable and assumes zero delay. This represents the simplest possible
static prediction algorithm.

  Method                    Average absolute error
                             9b      22a      22b
  No prediction             64 s    114 s    258 s
  Inherent prediction       46 s     58 s    127 s
  Statistical processing    33 s     47 s     97 s
  Neural networks           30 s     46 s     73 s

Table 7: Comparison of simple delay estimates and prediction algorithms.

   The numbers show that by using the prediction algorithms we can reduce the departure
prediction errors. With more advanced methods like neural networks or regression, the
improvement can be even greater.

6.2 Navigation

The prediction data can also be used in public transport connection search engines. These
applications typically search only in timetables and do not reflect the current
situation. If the systems used predicted departures and arrivals, they might find faster
and more reliable connections. Especially when the user is searching for the fastest
connection “right now”, we could use the benefits of the dynamic predictions.
   The most complex systems are public transport navigation systems. These applications
are often capable of dealing with delays at least in a simple way. Such a system was
already implemented in Boston [4]. Adding a prediction unit to such systems might improve
their reliability.

Example Situation. Mike is currently at a stop a and needs to get to stop b to catch a
train. There are two lines, 18 and 22, that connect the stops a and b, each taking a
different route. Mike knows that the line 22 has a higher probability of being delayed
between the stops a and b. Both lines can also be delayed on their way to the stop a. A
tram 22 is approaching the stop a, while the tram 18 is scheduled a minute later. What
should Mike do? Should he board the approaching tram and risk the possible delay on the
way? Or would it be better to wait for the tram 18 and risk that it will arrive late?

Solution. This situation can be solved by the prediction algorithms. If Mike had access to
information from such a system, he would know that because of a bad traffic situation the
trams on line 22 are predicted to be delayed by
5–10 minutes along the route to the stop b. He would also know that the tram 18 scheduled
a minute later is on time. This would help him to decide not to board the tram 22 and to
wait for the tram 18, which would most probably help him to get to the stop b on time.

7 Conclusion

We have shown that by using even simple prediction algorithms, it is possible to predict
the movement of public transportation vehicles much more precisely than just by using the
timetables given by the transit operator. It also turned out that for lines with only
small or unpredictable delays, more complex methods like regression or neural networks
are not more accurate than the basic statistical methods. Therefore, usage of regression
or neural networks is reasonable only in environments with significant delays. As the
neural networks are more complex, and most probably harder to implement, linear
regression seems to be a good solution.
   We also compared static and dynamic algorithms. The results indicate that when we have
information about the current situation, the predictions are up to twice as accurate. Of
course, the results of the dynamic algorithms are valid only for a short period of time,
as the situation changes.

7.1 Future Work

From the data we have available it turned out that trams in Prague are quite precise, with
only a few exceptions. We believe this is the major reason why the more complex
prediction methods did not greatly outperform the simple ones. We think it would be
interesting to test the algorithms on a network with more significant deviations too, for
example on the Prague bus operation data, as the buses tend to be less precise because of
a lower level of segregation from individual transport. However, we do not have access to
this data, so we could not test it.
   We would also like to focus on further improvements in accuracy. We believe that the
neural networks, and maybe the regression too, have the potential to give better results
if they had more information on the input. The problem is how to encode all the
information about the current situation into a vector of real values of a reasonable
length.
   In future work we would also like to focus on the usage of the data from the prediction
algorithms. We believe that presentation of this data to passengers in a user-friendly
form is a relatively simple yet modern way to make public transportation more attractive.


Acknowledgment

This work was supported by project GAUK 472313.


References

[1] Rietveld, P., Bruinsma, F. R., van Vuuren, D. J.: Coping with unreliability in public
    transport chains: A case study for Netherlands, Transportation Research Part A: Policy
    and Practice (2001) 539–559
[2] Martínek, V., Žemlička, M.: Passenger Path Plan Reliability Improvement Proposal, The
    Fourth International Conference on Information, Intelligence, Systems and
    Applications, University of Piraeus, Piraeus, Greece (2013) 242–247
[3] Dessouky, M., Hall, R., Zhang, L., Singh, A.: Real-time Control of Buses for Schedule
    Coordination at a Terminal, Transportation Research Part A: Policy and Practice (2003)
    145–164
[4] Tien, D. N., MacDonald, T., Xu, Z.: TDplanner: Public transport planning system with
    real-time route updates based on service delays and location tracking, IEEE Vehicular
    Technology Conference (2011)
[5] Wall, Z., Dailey, D. J.: An Algorithm for Predicting the Arrival Time of Mass Transit
    Vehicles Using Automatic Vehicle Location Data, Transportation Research Board, 78th
    Annual Meeting, Washington, D. C., USA (1999)
[6] Jeong, R., Rilett, L. R.: Bus Arrival Time Prediction Using Artificial Neural Network
    Model, IEEE Intelligent Transportation Systems Conference, Washington, D. C., USA
    (2004) 988–993
[7] Karlaftis, M. G., Vlahogianni, E. I.: Statistical methods versus neural networks in
    transportation research: Differences, similarities and some insights, Transportation
    Research Part C 19 (2011) 387–399
[8] Bohmova, K., Mihalak, M., Proger, T., Sramek, R., Widmayer, P.: Robust Routing in
    Urban Public Transportation: How to find reliable journeys based on past observations,
    13th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization,
    and Systems, ser. OpenAccess Series in Informatics (OASIcs), D. Frigioni and
    S. Stiller, Eds., vol. 33. Dagstuhl, Germany: Schloss Dagstuhl-Leibniz-Zentrum fuer
    Informatik (2013) 27–41. [Online]. Available:
    http://drops.dagstuhl.de/opus/volltexte/2013/4242
[9] Beale, R., Jackson, T.: Neural Computing: An Introduction, IOP Publishing, Bristol and
    Philadelphia (1990)