<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Structural Optimization of the Travel Time Prediction Model Based on Hierarchical Regression</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anton Agafonov</string-name>
          <email>ant.agafonov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladislav Myasnikov</string-name>
          <email>vmyas@geosamara.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Image Processing Systems Institute of the Russian Academy of Sciences (IPSI RAS)</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Samara State Aerospace University (SSAU)</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we consider a problem of public transport arrival time prediction for a large city in real time. We investigate the algorithm based on a model of an adaptive combination of elementary prediction algorithms. Adaptability means that parameters of the constructed combination depend on a number of control parameters of the model. We compare our model with the nonlinear arti cial neural network model. Comparison is performed on data of the public transport network in Samara, Russia.</p>
      </abstract>
      <kwd-group>
        <kwd>travel time prediction</kwd>
        <kwd>combination of algorithms</kwd>
        <kwd>neural network</kwd>
        <kwd>hierarchical regression</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Geoinformation systems ans intelligent transportation systems (ITS) based on
automated vehicle location data are widely used by public transportation
companies around the world. ITS can be used for the improvement of public
transport reliability using di erent operational planning and control strategies. Such
strategies can be used to improve the public transport network design, evaluate
the schedule reliability and improve schedule planning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Travel time prediction problem is one of the most common transportation
problems. Travel time prediction can be used for solving many other problems
in di erent contexts, including management, navigation, planning, monitoring
and control problem, mentioned early.</p>
      <p>
        There are much research focused on the travel time prediction problem. All
approaches can be divided into the four following categories [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
1. Historical data based models. These approaches predict travel time in a
certain period by the averaged over historical values during the same period
in the previous days. The simplicity is the main drawback of these models,
because models ignore both current road situation (such as congestions or
accidents) and complex relationship between travel times and other variables
in urban public transport networks.
2. Tra c theory based models [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These models perform travel time prediction
based on tra c ow / tra c density estimations. The requirement of an
actual tra c ow model is the main drawback of these models.
3. Time-series models [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These approaches assume that the future travel time
depends only on the most recent data samples. Accuracy of models decreases
for a longer forecasting horizon.
4. Machine learning and regression methods. These methods construct a
regression function depending on a set of independent variables, which may
include real-time and historical data of the road links travel times, road
situation, passenger, weather situation, delays at stops, etc [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Over the last
two decades, regression models have been the state of the art for this kind of
applications. Arti cial neural networks (ANN) are widely used for the travel
time prediction [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], can represent complex nonlinear relationships between
the target travel time value and the set of independent variables. Support
vector regression (SVR) is used for travel time prediction in [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. Nonlinear
regressions methods (such as ANN or SVR) has several drawbacks compared
with other regression methods, including high training computational
complexity and possibility for over tting.
      </p>
      <p>
        In this paper we investigate the adaptive model proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This model
is based on adaptive combination of the elementary prediction algorithms. The
model uses hierarchical regression for representing complex nonlinear
relationships between the target travel time value and the set of independent variables.
Main purposes of this paper are the structural optimisation of the
hierarchical regression based model and the comparison of the proposed model with an
arti cial neural network model.
      </p>
      <p>The structure of this paper is organized as follows: the second section
contains a problem statement. The third section contains a description of the base
model. The next section describes modi ed structure of the base model, used
elementary predictors and aggregated function. In the fth section we present
an experimental study of the algorithm. Lastly we present our conclusions and
recommendations for further work.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Problem statement</title>
      <p>We use the following notations:
{ w is the road link in a transport network;
{ j is the index of the speci ed vehicle;
{ s is the transport type (bus, tram, trolley), S is the set of transport types;
{ j 2 [0; 1] is the relative position of the vehicle j on the road link;
{ tc is the current time instant (at which predictions are calculated).</p>
      <p>The position of any vehicle in the transport network we de ne by the pair
(w; ).</p>
      <p>Let the vehicle j at the time tc be in the position (w0; 0). We should estimate
the arrival time, after which the vehicle be at the stop with position (wK ; K ).</p>
      <p>Let the route between two positions (w0; 0) and (wK ; K ) includes the
following edges (w0; 0) ! w1 ! w2 ! ! wK 1 ! (wK ; K ). Let Tjw (tc; t)
be the travel time along the edge w starting at the time t, assuming that the
prediction is calculated at the time tc.</p>
      <p>Assuming, that the vehicle j moves monotonically on each road link, the
predictive arrival time to the stop (wK ; K ) , calculated at the time tc, can be
written as follows:</p>
      <p>T (w0; 0)!(wK; K)(tc) = (1
j</p>
      <p>0)Tjw0 (tc; tc)
k=1</p>
      <p>K 1
+ X Tjwk tc; tc + Tj(w0; 0)!(wk;0)(tc)
+</p>
      <p>K TjwK tc; tc + T (w0; 0)!(wK;0)(tc) :
j
(1)
(2)
As it can be seen from the above expression, its main components are terms
Tjwk ( ) that characterize the travel time of the road links.</p>
      <p>For simpli cation, further we consider the travel time prediction problem at
the speci ed edge wk and the speci ed time t:</p>
      <p>T w (tc; t) :</p>
      <p>
        In next section we brie y describe our prediction model proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>The base model</title>
      <p>Calculation of prediction (2) can be performed using di erent information
usually established in urban public transport networks. Firstly we describe the
different travel time estimations and control parameters that a ect the predictive
travel time value (2):
9.
10.
1. nT w(s)(t)o is the averaged historical travel time of the edge w by the
vehis
cles of the transport type s;
2. Tlwast(tc) is the most recent travel time of the link w;
3. T0w(t) is the normative travel time by the planning schedule;
4. Tvw(j)(tc) is the estimated travel time by the mean speed of the speci ed
vehicle;
5. f(t tj ; Tj (tc))gjw is the set of precedents containing time instants and travel
times of the edge w before the time tc;
6. w is the time elapsed since the last vehicle has passed the link w;
7. t tc is the required prediction horizon;
8. (t) 2 [0; 1] is the parameter de ning the driving di culty at the time t. This
parameter takes into account the light and the weather conditions integrally;
w(t) 2 [0; 1] is the parameter de ning the tra c density;
w(t) is the parameter de ning dynamic of the tra c ow on the edge w.</p>
      <p>These values have di erent meaningful information that a ects the way it is
used. Parameters 1-5 present individual or aggregated statistical "precedents"
of the travel times. These values can be used (together or separately, directly
or indirectly) in the construction of the required estimate T w (tc; t) as some
realizations of this assessment. Parameters 6-10 also directly a ect this estimate.
However, these parameters cannot be considered as the precedents of the travel
times. We use these parameters as the control parameters of the model (further
we use notation pi for them).</p>
      <p>
        To predict travel time T w (tc; t) we proposed to use a model based on the
combination of the elementary prediction algorithms [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Fig. 1 provides a scheme
of the proposed model.
      </p>
      <p>Tlwast(tc); T0w(t); T w(s)(t); Tvw(j)(t)
s = 0
f(t tj; Tj(tc))gjw(s)</p>
      <p>s = jSwj 1
f(t tj; Tj(tc))gjw(s)</p>
      <sec id="sec-3-1">
        <title>Algorithm 0</title>
      </sec>
      <sec id="sec-3-2">
        <title>Algorithm K 1</title>
      </sec>
      <sec id="sec-3-3">
        <title>Algorithm 0</title>
      </sec>
      <sec id="sec-3-4">
        <title>Algorithm K 1</title>
        <p>T 0w;1
tw;1
0
T 0w;K
tw;K
0
T jwS;w1j 1
w;1
tjSwj 1
T jwS;wKj 1
w;K
tjSwj 1</p>
      </sec>
      <sec id="sec-3-5">
        <title>Aggregation function</title>
        <p>Tbsw(tc; t)</p>
        <p>We used four kernels: rectangular kernel (k = 0), triangular kernel (k = 1),
exponential kernel (k = 2) and rational kernel (k = 3).
As the aggregation function we used the following function:</p>
        <p>K 1
Tbsw(tc; t) = X</p>
        <p>X arksT rw;k
k=0 r2S
(tc; t)+
(4)
+ a1Tlwast(tc) + a2T0w(t) + a3T w(s)(t) + a4Tvw(j)(tc) :</p>
        <p>Since our algorithm should be adaptive to the control parameters, we
proposed to use hierarchical regression similar to regression tree. Parameters of the
aggregation function (4) are de ned for each subdomain of de nition of the
control parameters. Subdomains are calculated automatically using the hierarchical
regression with the following approach:
1) domain of de nition for each control parameter pi is determined: [pimin; pimax];
2) interval [pimin; pimax] is splitted into two intervals by an average value.
To increase accuracy and decrease computation complexity of the proposed
model, we modify used elementary prediction algorithms and the aggregation
function. Experiments also show that the information, a ected the travel time
and described in the previous section, has redundancy. We decide to use only
signi cant parameters in our modi ed model.</p>
        <p>First of all, we added a new transport type car to an input data of the
modi ed model. We decided to use the following most signi cant information
that a ect the predictive travel time value (2):
o is the averaged historical travel time of the edge w by the
vehis
cles of the transport type s;
2. f(t tj ; Tj (tc))gjw is the set of precedents containing time instants and travel
times of the edge w before the time tc;
3. t tc is the required prediction horizon;
4. (t) 2 [0; 1] is the parameter de nes the driving di culty at the time t. This
parameter takes into account the light and the weather conditions integrally;
5. w(t) is the parameter de ning dynamics of the tra c ow on the edge w;
6. w(t) 2 [0; 1] is the parameter de ning the tra c density .</p>
        <p>The elementary prediction algorithms (3) based on kernel functions have the
similar accuracy and, moreover, have high correlation. To avoid these
disadvantages we use the best one algorithm with the exponential kernel.</p>
        <p>As an aggregation function we use ridge regression (regression with L2
regularization) of the elementary prediction algorithms. Ridge regression penalizes
the euclidian norm of regression coe cients.</p>
        <p>We also modify the method for de ning the intervals [pimin; pimax] of the
control parameters for the each node in the hierarchical regression:
1) for each interval [pimin; pimax] we calculate the median value pimedian using
histogram of the control parameter pi distribution;
2) divide each interval [pimin; pimax] into two using the median value:
[pimin; pimax] ! [pimin; pimedian]; [pimedian; pimax].
5</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Test results and analysis</title>
      <p>
        Main purpose of experiments is to compare proposed hierarchical regression
based model with other complex nonlinear methods, since in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] we compare our
model only with elementary prediction algorithms. As such nonlinear method
we choose an arti cial neural network, because this method is widely used for
the travel time prediction and require not so much time for training procedure
as support vector regression. Input dataset of the ANN contains both travel
time estimations and control parameter values. As the ANN we use feed forward
network with one hidden layer.
      </p>
      <p>Models has been tested with the data of all the bus routes in Samara, Russia.
The road network consists of 3387 edges. The number of the vehicles equipped
with GPS sensors is more than 1500; coordinates of the vehicles positions are
obtained every 30 seconds.</p>
      <p>Tests were performed on data of all the public vehicles positions in Samara
for one week. A validation set size is about 140 million records; a control set size
is about 20 million records.</p>
      <p>We compare the modi ed adaptive model with an elementary prediction
algorithm based on the exponential kernel, the averaged historical travel time
(statistics on the gures) and the neural network model. Number of hidden
units were selected by the criterion of the mean absolute error minimization of
10 20
Prediction horizon, m
30
0</p>
      <p>10 20
Prediction horizon, m
30</p>
      <sec id="sec-4-1">
        <title>Exponential kernel</title>
      </sec>
      <sec id="sec-4-2">
        <title>Statistics Neural network Adaptive model</title>
        <p>the ANN on the training set. We used ANN with 3 hidden units in a hidden
layer. The exponential kernel width was also selected experimentally.</p>
        <p>The prediction accuracy is evaluated by computing the mean absolute error
(MAE) and the mean absolute percentage error (MAPE) depending on the
prediction horizon. Fig. 3 shows the experimental results on the training set, g. 4
shows the experimental results on the control set.
0.25
0.20
E0.15
P
A
M0.10
0.05</p>
        <p>In terms of considered accuracy measure the proposed adaptive model based
on the hierarchical regression and the neural network model show similar results.</p>
        <p>Since the prediction model should be run on-line, the computation time of
the model is a critical issue. Next experiment shows the computation time of the
adaptive and the neural network models for a di erent simultaneously processing
number of the vehicles, using laptop computer (Intel Core i5-3740 3.20 GHz, 8
GB RAM) ( g. 5).</p>
        <p>150
s
m
,e100
m
it
n
o
itt
a
pu50
m
o
C
0
0
20</p>
        <p>40
Number of vehicles</p>
        <p>60
Neural network</p>
        <p>The processing time for one vehicle takes about 7 ms on average, mainly
depends on the required number of travel time predictions and is determined
by the route and the position of the vehicle. Computation time of the neural
network model is a little bit bigger than computation time of the adaptive model.
Nonetheless both models can be used to predict the arrival time of all public
vehicles in a large city in real time.
6</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and future work</title>
      <p>In this paper we modify travel time prediction model based on adaptive
combination of the elementary prediction algorithms. Our modi cations concern the set
of elementary prediction algorithms and the aggregation function. The modi ed
model has the same advantages as the base model:
{ allows to include prediction algorithms of any type in the adaptive
combination;
{ has adaptability to the tra c situation changes;
{ has adaptability to the factors that has direct or indirect impact on the
transport movement and / or prediction result.</p>
      <p>In experiments performed on data of all bus routes in Samara (Russia) the
modi ed adaptive model provides almost the same results as the neural network.
The proposed model doesn't require so much time and memory resources for the
training as the neural network. This is the main advantage of our model.</p>
      <p>Comparison between the models of di erent classes, including time-series
models and nonlinear regression models, will be conducted as our future work.
Acknowledgments. The proposed base model and the modi ed model based
on hierarchical regression (sections 3 and 4) were developed with support from
the Russian Foundation for Basic Research grant 13-07-12103-o -m, grant
1507-01164-a, grant 16-37-00055-mol-a. The experimental results (section 5) were
obtained with support from the Russian Science Foundation grant 14-31-00014
Establishment of a Laboratory of Advanced Technology for Earth Remote
Sensing.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Moreira-Matias</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes-Moreira</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Sousa</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gama</surname>
          </string-name>
          , J.:
          <article-title>Improving Mass Transit Operations by Using AVL-Based Systems: A Survey</article-title>
          .
          <source>In: IEEE Transactions on Intelligent Transportation Systems</source>
          , vol.
          <volume>16</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>1636</fpage>
          -
          <lpage>1653</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hoogendoorn</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          :
          <article-title>State-of-the-art of vehicular tra c ow modeling</article-title>
          .
          <source>In: Proceedings of the Institution of Mechanical Engineers. Part I: Journal of Systems and Control Engineering</source>
          , vol.
          <volume>215</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>283</fpage>
          -
          <lpage>303</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cryer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Time Series Analysis With Applications in</article-title>
          R. Springer-Verlag, New York (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , Liu,
          <string-name>
            <given-names>H.X.</given-names>
            ,
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.R.</given-names>
            ,
            <surname>Ran</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          :
          <article-title>Use of Local Linear Regression Model for Short-term Tra c Forecasting</article-title>
          .
          <source>In: Transportation Research Record</source>
          , vol.
          <year>1836</year>
          , pp.
          <volume>143150</volume>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chien</surname>
            ,
            <given-names>S.I.:</given-names>
          </string-name>
          <article-title>A dynamic bus-arrival time prediction model based on APC data</article-title>
          .
          <source>In: Computer-Aided Civil and Infrastructure Engineering</source>
          , vol.
          <volume>19</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>364</fpage>
          -
          <lpage>376</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Jeong</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rilett</surname>
            ,
            <given-names>L.R.</given-names>
          </string-name>
          :
          <article-title>Bus arrival time prediction using arti cial neural network model</article-title>
          .
          <source>In: IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC</source>
          , pp.
          <fpage>988</fpage>
          -
          <lpage>983</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhongzhen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baozhen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Bus arrival time prediction using support vector machines</article-title>
          .
          <source>In: Journal of Intelligent Transportation Systems: Technology, Planning, and Operations</source>
          , vol.
          <volume>10</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>151</fpage>
          -
          <lpage>158</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wu</surname>
          </string-name>
          , C.-H.,
          <string-name>
            <surname>Ho</surname>
            ,
            <given-names>J.-M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          :
          <article-title>Travel-time prediction with support vector regression</article-title>
          .
          <source>In: IEEE Transactions on Intelligent Transportation Systems</source>
          , vol.
          <volume>5</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>276</fpage>
          -
          <lpage>281</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Agafonov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Myasnikov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>An Adaptive Algorithm for Public Transport Arrival Time Prediction Based on Hierarhical Regression</article-title>
          .
          <source>In: IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC</source>
          , pp.
          <fpage>2776</fpage>
          -
          <lpage>2781</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>