<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Short-Term Traffic Forecasting: A Dynamic ST-KNN Model Considering Spatial Heterogeneity and Temporal Non-Stationarity</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Shifen Cheng, Feng Lu State Key Lab of Resources and Environment Information System Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences 11A</institution>
          ,
          <addr-line>Datun Road, Chaoyang District, Beijing 100101</addr-line>
          ,
          <country country="CN">P. R. China</country>
        </aff>
      </contrib-group>
      <fpage>133</fpage>
      <lpage>140</lpage>
      <abstract>
        <p>Accurate and robust short-term traffic forecasting is a critical issue in intelligent transportation systems and real-time traffic-related applications. Existing short-term traffic forecasting approaches usually adopt global and static model structures and assume traffic correlations between adjacent road segments within assigned time periods. Because of the inherent spatial heterogeneity and temporal non-stationarity of city traffic, it is rather difficult for these approaches to obtain stable and satisfying results. To overcome the problems of static model structures and quantitatively unclear spatiotemporal dependency relationships, this paper proposes a dynamic spatiotemporal k-nearest neighbor model, named D-ST-KNN, for short-term traffic forecasting. It comprehensively considers the spatial heterogeneity and temporal non-stationarity of city traffic with dynamic spatial neighbors, time windows, spatiotemporal weights and other parameters. First, the sizes of spatial neighbors and the lengths of time windows for traffic influence are automatically determined by cross-correlation and autocorrelation functions, respectively. Second, dynamic spatiotemporal weights are introduced into the distance functions to optimize the search mechanism. Then, dynamic spatiotemporal parameters are established to adapt to the continuous change in traffic conditions, including the dynamic number of candidate neighbors and dynamic weight allocation parameters. Finally, the D-ST-KNN model is evaluated using two vehicular speed datasets collected on expressways in California, U.S. and city roads in Beijing, China. Four traditional prediction models are compared with the D-ST-KNN model in terms of forecasting accuracy and generalization ability. The results demonstrate that the D-ST-KNN model outperforms existing models in all time periods, especially in the morning period and evening peak period.
In addition, the generalization ability of the D-ST-KNN model is also demonstrated.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Short-term traffic forecasting, which plays an important role in
intelligent transportation systems, enables traffic managers to
formulate reasonable and efficient strategies for alleviating traffic
congestion and optimizing traffic assignments. Short-term traffic
forecasting also enables the public to achieve accurate vehicular
path planning [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        In the past few decades, researchers have proposed several
short-term traffic forecasting models that can be divided into two
categories: parametric models and nonparametric models. A
parametric model uses an explicit parametric function to quantify the
relationship between historical traffic data and predicted traffic
data. Considering the stochastic and nonlinear characteristics of
traffic, constructing a mathematical model that characterizes
traffic with high accuracy is difficult in practice
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Nonparametric models, such as data-driven methods, do not
require a priori knowledge or an explicit expression of the mechanism;
thus, they are more suitable for short-term traffic forecasting
problems [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
      </p>
      <p>
        As a typical nonparametric method, the k-nearest neighbors
(KNN) model has received considerable attention. Many scholars
have successfully applied the traditional KNN model to
short-term traffic prediction [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref17">17</xref>
        ][
        <xref ref-type="bibr" rid="ref19">19</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref16">16</xref>
        ][
        <xref ref-type="bibr" rid="ref30">30</xref>
        ][
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. Considering
that the evolution of traffic is a spatiotemporal interaction
process, the traffic conditions of road segments affect one
another spatially [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Therefore, spatiotemporal relationships between
multiple road segments in road networks are considered to
improve traffic prediction [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ][
        <xref ref-type="bibr" rid="ref21">21</xref>
        ][
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Based on the traditional
KNN model, [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] realized an enhanced model with the support
of spatiotemporal information and argued that it achieves better
performance than the model that employs only temporal
information. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] considered upstream and downstream traffic
information and proposed a distributed architecture of a
spatiotemporal-weighted KNN model for short-term traffic prediction. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
employed a spatiotemporal state matrix instead of the traditional
time series to describe the traffic state and used a Gaussian
weighted distance to select the nearest neighbors to improve the
KNN model. However, these ST-KNN models cannot accurately
quantify the spatiotemporal relation. In the modeling process, the
size of the spatial dimension m and the length of the time window n
of the state space cannot be determined automatically, so their
values are set manually. For example, for m=3, three adjacent road
segments are selected; for n=2, the historical data of the two
time steps preceding the current time step are used to construct samples.
When the time series problem is transformed into a supervised
machine learning problem, the values of m and n determine the
number of selected features. Therefore, manually engineered
features can easily cause the curse of dimensionality and prevent
the prediction accuracy of the model from being guaranteed [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>
        The prediction model is usually static; thus, it cannot describe
the characteristics of the dynamic change in traffic, which are
primarily reflected in the following three aspects. 1) Existing
studies usually assume that the spatial neighbors and time windows
are globally fixed: once the number of road
segments m associated with the predicted road segment and the
length of the time window n are determined, they do not change
over the spatiotemporal range. Considering the dynamic
characteristics of an urban road network, traffic flow in the road network
is not a static point but a moving process from one location
to another. The spatial neighbors of a road segment
primarily depend on the current traffic conditions. The number of
spatial neighbors is very small if traffic congestion exists but is
large during off-peak periods [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. From the perspective of
urban road network heterogeneity, the number of relevant road
segments also differs among road segments; thus, sharing the
parameter m over the entire spatial range is difficult [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. The
selection of a time window based on a time series determines
the length of the historical traffic data used to match similar traffic
patterns. The traffic data in the historical time steps and the
current time step must be relevant in the selection process [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Due
to the dynamic and heterogeneous nature of the road network,
even for the same road segment, a significant difference is observed
in the time series of traffic data in different time periods (such as
morning and evening peak periods). Therefore, the selection of
the time window must also be dynamic [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Spatial neighbors
and time windows that change dynamically over time and space
are not easily described with globally fixed spatiotemporal state
matrices; thus, a dynamic spatiotemporal KNN
model is needed to adapt to the characteristics of traffic changes. 2) Existing
research considers that historical data from different time
periods contribute differently to the prediction of future
traffic conditions. When calculating the distance between two
state spaces, a weighted distance criterion is usually adopted to
assign different weights to each component in the state space.
The closer the time window is to the predicted time, the larger the
allocated weight; the closer the spatial distance is to the predicted
road segment, the greater the assigned weight [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However,
dynamic changes in the spatial neighbors and the time window not
only affect the dimension of the space-time matrix but also cause
the intensity of the correlation among different positions to
change dynamically over time. Therefore, the influence of different
components of traffic data is difficult to characterize with a globally
fixed spatiotemporal weight matrix. 3) To determine the
number of similar state spaces K, researchers usually employ
a cross-validation method to select a suitable value and then share it over
the entire range of space and time [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. Due to the differences
in traffic patterns across time periods and spatial
locations, a globally fixed value of K cannot adapt to the dynamic
and heterogeneous nature of a road network.
      </p>
      <p>The key to short-term traffic forecasting models is the effective
use of the potential spatiotemporal dependencies in the traffic
data. The existing KNN models usually assume that the traffic
change is a static point process and often disregard its important
dynamic and heterogeneous characteristics. As a result, the
structure of the prediction model, as in the traditional KNN model
and the spatiotemporal KNN model, is usually globally fixed in time
and space, including the spatial neighbors, time
window, spatiotemporal weights, and spatiotemporal parameters.</p>
      <p>In this paper, we propose a dynamic spatiotemporal KNN
model (D-ST-KNN) for short-term traffic prediction that considers the
spatial heterogeneity and temporal non-stationarity of city
traffic. First, we investigate the autocorrelation of road traffic to
determine the time window required for the traffic data. Second,
we use the cross-correlation among different road segments to
analyze the spatiotemporal dependencies of traffic and build a
dynamic spatial neighborhood for each road segment. The dynamic
spatiotemporal state matrix, built from the dynamic spatial
neighbors and the dynamic time window, replaces the traditional
time series or the static spatiotemporal matrix in characterizing
the state space. Finally, we introduce dynamic
spatiotemporal weights, dynamic spatiotemporal parameters, and a Gaussian
weight function to improve the KNN model so that it adapts to the
dynamic and heterogeneous characteristics of traffic.</p>
      <p>The remainder of this paper is organized as follows: Section 2
proposes a D-ST-KNN model that considers the spatial
heterogeneity and temporal non-stationarity of city road traffic. The
construction of the dynamic spatiotemporal state matrix, weights,
and other parameters is also introduced in that section. In
Section 3, the dynamic characteristics, prediction performance, and
computational efficiency of the presented model are
comprehensively validated. The experimental results are also discussed.
Section 4 concludes the paper and provides an outlook on future
work.</p>
    </sec>
    <sec id="sec-2">
      <title>2 METHODOLOGY</title>
      <p>In this section, we propose a D-ST-KNN model. Our method is
divided into five phases: the data bucket partition, state space
definition, distance function definition, optimal neighbor
selection, and prediction function definition, which correspond to
Sections 2.1-2.5. First, considering the dynamic nature of traffic,
the original spatiotemporal data sets are partitioned according
to different time periods to form different data buckets. Second,
considering the spatial heterogeneity, each road segment in a data
bucket is processed separately, and the appropriate spatial
neighbors and time windows are selected. The spatiotemporal state
matrix is constructed to describe the traffic conditions. Then, we
introduce the spatiotemporal weight matrix to define the
distance function and measure the distance between the current
spatiotemporal state matrix and the historical spatiotemporal
state matrix to select the K nearest neighbors. Finally, we
integrate these neighbors to obtain the predicted value of the target
road segment.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Data bucket</title>
      <p>Considering the non-stationarity and periodicity of traffic data,
there are significant differences in the traffic characteristics among
different time periods, such as the morning peak period,
inter-peak period, and evening peak period. Within the same period
(for example, the morning peak period on different days), the
traffic data of the same road segment are statistically homogeneous
and the traffic pattern tends to be stable with periodic changes,
so the spatial neighbors, the time window, and the spatiotemporal
parameters can be shared. Therefore, we divide the
original traffic data <inline-formula><tex-math>\{ vol_t^{L_j},\ j \in [1, N],\ t \in [t_0, t_c] \}</tex-math></inline-formula> into different time
periods to describe the homogeneity within the same time period and the
dynamics across different time periods, where <inline-formula><tex-math>t_0</tex-math></inline-formula> and <inline-formula><tex-math>t_c</tex-math></inline-formula> represent the
start time step and the current time step of the time series, and
<inline-formula><tex-math>L_j</tex-math></inline-formula> denotes the jth road segment.</p>
      <p>
        In the study of urban traffic modeling and prediction, to
distinguish the differences among the traffic characteristics in different
time periods, [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] divided a day into six time periods (period 1:
midnight-6:30; period 2: 6:30-10:00; period 3: 10:00-13:30;
period 4: 13:30-17:00; period 5: 17:00-20:30; period 6: 20:30-midnight).
The test reveals that the partition is statistically acceptable. Based
on this analysis and according to the same strategy, the original
traffic data are divided into M different time periods (M = 6)
along the time dimension, each of which corresponds to a
data bucket. Denoting the entire traffic data set by BK, the
data bucket division must satisfy
        <disp-formula><label>(1)</label><tex-math>\begin{cases} bk_i \cap bk_o = \emptyset \\ BK = bk_1 \cup bk_2 \cup \cdots \cup bk_M \\ bk_i = \{ vol_t^{L_j} \mid 1 \le j \le N,\ \forall t \in [t_a^{bk_i}, t_b^{bk_i}) \} \end{cases}</tex-math></disp-formula>
where <inline-formula><tex-math>i \in [1, M]</tex-math></inline-formula>, <inline-formula><tex-math>o \in [1, M]</tex-math></inline-formula>, <inline-formula><tex-math>i \ne o</tex-math></inline-formula>, <inline-formula><tex-math>bk_i</tex-math></inline-formula> is the ith bucket (e.g.,
bucket 1), and <inline-formula><tex-math>vol_t^{L_j}</tex-math></inline-formula> is the traffic data of road segment <inline-formula><tex-math>L_j</tex-math></inline-formula> at
time step t. The condition <inline-formula><tex-math>t \in [t_a^{bk_i}, t_b^{bk_i})</tex-math></inline-formula> indicates that time step t is within
the corresponding time period of the ith bucket (e.g., [0:00-6:30),
[6:30-10:00)). <inline-formula><tex-math>L_j</tex-math></inline-formula> denotes the jth road segment (e.g., Link 1), and
N is the total number of road segments. Note that dividing the
original traffic data into different buckets at the pre-processing
stage does not have any impact on the analyses and conclusions
in this study because the same partitioning strategy was used
for all the algorithms that are evaluated.
      </p>
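      <p>The bucket partition of this section can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names <monospace>bucket_index</monospace> and <monospace>partition</monospace>, and the record layout (segment id, time of day, value), are assumptions introduced here; only the six period boundaries come from the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from datetime import time

# Start of each of the M = 6 periods quoted in the text; bucket i covers the
# half-open interval [start_i, start_{i+1}), and the last one runs to midnight.
BUCKET_STARTS = [time(0, 0), time(6, 30), time(10, 0),
                 time(13, 30), time(17, 0), time(20, 30)]


def bucket_index(t):
    """Return the 1-based index i of the bucket bk_i that contains time t."""
    index = 1
    for i, start in enumerate(BUCKET_STARTS, start=1):
        if t >= start:
            index = i  # keep the latest period whose start is not after t
    return index


def partition(records):
    """Group (segment_id, time_of_day, value) records into disjoint buckets.

    The record layout is a hypothetical choice for this sketch.
    """
    buckets = {i: [] for i in range(1, len(BUCKET_STARTS) + 1)}
    for segment_id, t, value in records:
        buckets[bucket_index(t)].append((segment_id, t, value))
    return buckets
```
]]></preformat>
      <p>Because every record is assigned to exactly one bucket, the buckets are pairwise disjoint and their union is the whole data set, which matches the conditions of formula (1).</p>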
      <p>
        2.2 Dynamic spatiotemporal state matrix.
2.2.1 Dynamic spatial neighborhoods. The dynamic spatial
neighborhood describes how, in each bucket, the traffic conditions of the
predicted road segment are affected by the surrounding road
segments, that is, which road segments are correlated with it.
The traditional method usually calculates the
correlation coefficients between the time series of the predicted road
segment and the time series of other road segments and sets a
threshold to select the relevant road segments [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Considering
that a road network is subject to multiple internal and external factors,
such as the influence of traffic lights, the impact of
surrounding road segments on the predicted road segment has a certain
degree of lag. Therefore, the delayed spatiotemporal
relationships cannot be exactly expressed by correlation coefficients. The
cross-correlation function is a delayed version of the correlation
coefficient function, which measures the correlation coefficients
of two time series at a specific lag [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]; therefore, it is more
suitable for describing the spatiotemporal dependence of traffic.
      </p>
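      <p>Assume that <inline-formula><tex-math>bk_i</tex-math></inline-formula> is the bucket of the predicted road segment
<inline-formula><tex-math>L_j</tex-math></inline-formula> at time step t, with <inline-formula><tex-math>t \in [t_a^{bk_i}, t_b^{bk_i})</tex-math></inline-formula>. Given a surrounding
road segment <inline-formula><tex-math>L_v</tex-math></inline-formula>, the time series of the traffic data for the two road
segments can be expressed as <inline-formula><tex-math>U = \{ vol_t^{L_j} \mid \forall t \in [t_a^{bk_i}, t_b^{bk_i}) \}</tex-math></inline-formula> and
<inline-formula><tex-math>Z = \{ vol_t^{L_v} \mid \forall t \in [t_a^{bk_i}, t_b^{bk_i}) \}</tex-math></inline-formula>, with <inline-formula><tex-math>j \in [1, N]</tex-math></inline-formula> and <inline-formula><tex-math>v \in [1, N]</tex-math></inline-formula>. Their
cross-correlation at lag <inline-formula><tex-math>\varphi</tex-math></inline-formula> is defined as follows:
        <disp-formula><label>(2)</label><tex-math><![CDATA[\begin{aligned} ccf_{u,z}^{bk_i}(\varphi) &= \frac{\gamma_{u,z}^{bk_i}(\varphi)}{\sigma_u \sigma_z}, \quad \varphi = 0, \pm 1, \pm 2, \cdots \\ \gamma_{u,z}^{bk_i}(\varphi) &= E\left[ (u_t - \mu_u)(z_{t+\varphi} - \mu_z) \right] \\ \sigma_u &= \sqrt{\sum (u_t - \mu_u)^2} \\ \sigma_z &= \sqrt{\sum (z_{t+\varphi} - \mu_z)^2} \end{aligned}]]></tex-math></disp-formula>
where <inline-formula><tex-math>\gamma_{u,z}^{bk_i}(\varphi)</tex-math></inline-formula> is the correlation coefficient between time series
U and time series Z at lag <inline-formula><tex-math>\varphi</tex-math></inline-formula> in bucket <inline-formula><tex-math>bk_i</tex-math></inline-formula>, <inline-formula><tex-math>\mu_u</tex-math></inline-formula> and <inline-formula><tex-math>\mu_z</tex-math></inline-formula> are the mean
values of U and Z, respectively, and <inline-formula><tex-math>\sigma_u</tex-math></inline-formula> and <inline-formula><tex-math>\sigma_z</tex-math></inline-formula> are the standard
deviations of U and Z, respectively.</p>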
      <p>
        In this definition, the cross-correlation function can be
regarded as a function of the lag, and the lag value at which the
cross-correlation function attains its maximum is the
average delay time from the surrounding segments to the predicted
road segment [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. The formal definition is expressed as
        <disp-formula><label>(3)</label><tex-math>\psi_{bk_i}^{L_v} = \underset{\varphi}{\arg\max}\; ccf_{u,z}^{bk_i}(\varphi), \quad v \in [1, N]</tex-math></disp-formula>
where <inline-formula><tex-math>\psi_{bk_i}^{L_v}</tex-math></inline-formula> is the lag value that maximizes the cross-correlation of
the surrounding road segment <inline-formula><tex-math>L_v</tex-math></inline-formula> with the predicted road segment
in <inline-formula><tex-math>bk_i</tex-math></inline-formula>. It describes the maximum impact time range of
the surrounding segments on the predicted
road segment in different buckets, which can be employed for efficient selection of
spatial neighbors. Consider the predicted road segment <inline-formula><tex-math>L_j</tex-math></inline-formula> in <inline-formula><tex-math>bk_i</tex-math></inline-formula>
and its predicted time interval <inline-formula><tex-math>\Delta t</tex-math></inline-formula>. Surrounding road
segments that deliver their traffic flow to the predicted road segment
within this time interval influence the predicted road
segment, and the road segments beyond this time interval are
excluded. The set of spatial neighbors is formally defined as follows:
      </p>
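      <p>
        <disp-formula><label>(4)</label><tex-math>R_{bk_i}^{L_j} \leftarrow \{ L_v \mid \forall\, 0 \le \psi_{bk_i}^{L_v} \le \Delta t,\ v \in [1, N] \}</tex-math></disp-formula>
where <inline-formula><tex-math>R_{bk_i}^{L_j}</tex-math></inline-formula> is the set of spatial neighbors of the jth road segment
in the ith bucket.</p>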
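      <p>The lag-based selection of spatial neighbors can be illustrated with the short sketch below. It is a simplified reading of the definitions in this subsection, not the authors' code: the cross-correlation is estimated as the sample correlation over the overlapping part of the two series, the lag and the interval are measured in time steps, and the names <monospace>ccf</monospace>, <monospace>best_lag</monospace>, <monospace>spatial_neighbors</monospace> and the search bound <monospace>max_lag</monospace> are assumptions introduced for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def ccf(u, z, lag):
    """Sample correlation between u_t and z_{t+lag} (equal-length series)."""
    u = np.asarray(u, dtype=float)
    z = np.asarray(z, dtype=float)
    if lag >= 0:
        a, b = u[:len(u) - lag], z[lag:]
    else:
        a, b = u[-lag:], z[:len(z) + lag]
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum()) * np.sqrt((b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_lag(u, z, max_lag):
    """Lag in [-max_lag, max_lag] that maximizes the cross-correlation."""
    return max(range(-max_lag, max_lag + 1), key=lambda lag: ccf(u, z, lag))


def spatial_neighbors(target, candidates, delta_t, max_lag):
    """Keep the candidate segments whose best lag psi lies in [0, delta_t]."""
    selected = []
    for name, series in candidates.items():
        psi = best_lag(target, series, max_lag)
        if 0 <= psi <= delta_t:
            selected.append(name)
    return selected
```
]]></preformat>
      <p>In practice this selection would be run once per bucket and per predicted road segment, so that each combination obtains its own neighbor set.</p>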
      <p>2.2.2 Dynamic time windows. Considering that the selection
of the time window is based on the time series of the predicted
road segment, we can select the n historical traffic observations that are
correlated with the predicted road segment. The autocorrelation
function is usually employed to measure the correlation between
a time series and its delayed version; thus, it can be used for the
selection of the time window, i.e., the lag at which the prediction
error is minimized can be set as the window size. Note that the
lag in the autocorrelation function describes the delay effect within
one time series, whereas the lag described in Section 2.2.1
characterizes the delay effect between different time series. Given
the time series of the jth road segment <inline-formula><tex-math>L_j</tex-math></inline-formula> in <inline-formula><tex-math>bk_i</tex-math></inline-formula>, U = {<inline-formula><tex-math>vol_t^{L_j}</tex-math></inline-formula> | ∀t ∈
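spatial distances. The construction method is described as follows:
assume that the predicted road segment <inline-formula><tex-math>L_j</tex-math></inline-formula> at the current time
step <inline-formula><tex-math>t_c</tex-math></inline-formula> is in data bucket <inline-formula><tex-math>bk_i</tex-math></inline-formula> and that the dimension of the
spatiotemporal state matrix is <inline-formula><tex-math>m_{bk_i}^{L_j} \times n_{bk_i}^{L_j}</tex-math></inline-formula>, which is determined by the method
provided in Section 2.2. Then, the spatiotemporal state matrix of
the current time step can be expressed as <inline-formula><tex-math>\chi_{t_c}^{L_j}(m_{bk_i}^{L_j}, n_{bk_i}^{L_j})</tex-math></inline-formula>. The
spatiotemporal matrix of the historical time step <inline-formula><tex-math>h_i</tex-math></inline-formula> can be
defined as <inline-formula><tex-math>\chi_{h_i}^{L_j}(m_{bk_i}^{L_j}, n_{bk_i}^{L_j})</tex-math></inline-formula>, where <inline-formula><tex-math>m_{bk_i}^{L_j}</tex-math></inline-formula> is the spatial dimension of
the spatiotemporal state matrix of the jth predicted road segment
in the ith bucket, which is related to the number of elements
in the set of spatial neighbors <inline-formula><tex-math>R_{bk_i}^{L_j}</tex-math></inline-formula>. Moreover, <inline-formula><tex-math>n_{bk_i}^{L_j}</tex-math></inline-formula> is the
temporal dimension of the spatiotemporal state matrix of the jth
predicted road segment in the ith bucket, which is the size of
the time window. The time-weighted matrix is defined as <inline-formula><tex-math>W_t^{bk_i}</tex-math></inline-formula>,
and the space-weighted matrix is defined as <inline-formula><tex-math>W_s^{bk_i}</tex-math></inline-formula>. The
corresponding elements are <inline-formula><tex-math>w_t^{bk_i}(t_i, t_j)</tex-math></inline-formula>, <inline-formula><tex-math>t_i \in [1, n_{bk_i}^{L_j}]</tex-math></inline-formula>, <inline-formula><tex-math>t_j \in [1, n_{bk_i}^{L_j}]</tex-math></inline-formula>,
and <inline-formula><tex-math>w_s^{bk_i}(s_i, s_j)</tex-math></inline-formula>, <inline-formula><tex-math>s_i \in [1, m_{bk_i}^{L_j}]</tex-math></inline-formula>, <inline-formula><tex-math>s_j \in [1, m_{bk_i}^{L_j}]</tex-math></inline-formula>, which represent
the time weight value and space weight value, respectively,
assigned to the jth predicted road segment in the ith bucket. The
weight distribution is as follows:</p>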
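      <p>
        <disp-formula><label>(6)</label><tex-math><![CDATA[w_t^{bk_i}(t_i, t_j) = \begin{cases} \dfrac{t_i}{\sum_{t_i = 1}^{n_{bk_i}^{L_j}} t_i}, & t_i = t_j \\ 0, & t_i \ne t_j \end{cases}]]></tex-math></disp-formula>
        <disp-formula><label>(7)</label><tex-math><![CDATA[w_s^{bk_i}(s_i, s_j) = \begin{cases} \dfrac{ccf_{L_v, L_j}^{s_i}}{\sum_{s_i = 1}^{m_{bk_i}^{L_j}} ccf_{L_v, L_j}^{s_i}}, & s_i = s_j \\ 0, & s_i \ne s_j \end{cases}]]></tex-math></disp-formula>
      </p>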
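      <p>In this definition, the temporal weights are linearly
distributed according to the proximity to the current time step,
and the spatial weights are distributed according to the correlation
with the predicted road segment. <inline-formula><tex-math>ccf_{L_v, L_j}^{s_i}</tex-math></inline-formula> is the cross-correlation
between the time series of the <inline-formula><tex-math>s_i</tex-math></inline-formula>th spatial neighbor (whose road
segment is <inline-formula><tex-math>L_v</tex-math></inline-formula>) and the predicted road segment <inline-formula><tex-math>L_j</tex-math></inline-formula>. The closer
a time step is to the predicted time, the greater its allocated
weight; the more strongly a neighbor is correlated with the
predicted road segment, the greater its weight. By introducing
spatiotemporal weights into the original spatiotemporal matrix, the
spatiotemporal-weighted state matrix of the current time step
<inline-formula><tex-math>\Gamma_{t_c}^{L_j}</tex-math></inline-formula> and the spatiotemporal-weighted state matrix of the
historical time step <inline-formula><tex-math>\Gamma_{h_i}^{L_j}</tex-math></inline-formula> are given by the following:
        <disp-formula><label>(8)</label><tex-math>\Gamma_{t_c}^{L_j} = W_s^{bk_i} \times \chi_{t_c}^{L_j}(m_{bk_i}^{L_j}, n_{bk_i}^{L_j}) \times W_t^{bk_i}</tex-math></disp-formula>
        <disp-formula><label>(9)</label><tex-math>\Gamma_{h_i}^{L_j} = W_s^{bk_i} \times \chi_{h_i}^{L_j}(m_{bk_i}^{L_j}, n_{bk_i}^{L_j}) \times W_t^{bk_i}</tex-math></disp-formula>
      </p>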
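      <p>By calculating the distance <inline-formula><tex-math>d_{bk_i}(\Gamma_{t_c}^{L_j}, \Gamma_{h_i}^{L_j})</tex-math></inline-formula> between the
historical spatiotemporal state matrix and the current spatiotemporal
state matrix, candidate neighbors can be selected. The formula is
expressed as
        <disp-formula><label>(10)</label><tex-math>d_{bk_i}(\Gamma_{t_c}^{L_j}, \Gamma_{h_i}^{L_j}) = \sqrt{ \mathrm{trac}\left( (\Gamma_{t_c}^{L_j} - \Gamma_{h_i}^{L_j}) \times (\Gamma_{t_c}^{L_j} - \Gamma_{h_i}^{L_j})' \right) }</tex-math></disp-formula>
where trac represents the trace of the matrix.</p>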
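      <p>The weighting and distance steps of this subsection can be sketched numerically as follows. The sketch assumes that the columns of the state matrix are ordered from the oldest to the most recent time step and that all cross-correlation values are positive; the function names are hypothetical and serve only as an illustration of formulas (6) to (10).</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def time_weights(n):
    """Diagonal n x n matrix with linear weights t_i / sum(t_i), as in (6)."""
    steps = np.arange(1, n + 1, dtype=float)  # 1 = oldest, n = most recent
    return np.diag(steps / steps.sum())


def space_weights(ccf_values):
    """Diagonal m x m matrix with weights proportional to each neighbor's
    cross-correlation with the predicted segment, as in (7)."""
    c = np.asarray(ccf_values, dtype=float)
    return np.diag(c / c.sum())


def weighted_state(chi, w_s, w_t):
    """Weighted state matrix W_s x chi x W_t for an m x n matrix chi, (8)-(9)."""
    return w_s @ chi @ w_t


def distance(gamma_current, gamma_hist):
    """sqrt(trace((A - B)(A - B)')), as in (10); this equals the Frobenius
    norm of the difference A - B."""
    diff = gamma_current - gamma_hist
    return float(np.sqrt(np.trace(diff @ diff.T)))
```
]]></preformat>
      <p>The K historical states with the smallest distance to the current state are then kept as candidate neighbors.</p>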
    </sec>
    <sec id="sec-4">
      <title>2.4 Dynamic spatiotemporal parameters</title>
      <p>
        In the KNN model, the spatiotemporal parameters include the
K values and the parameters introduced during the method
construction (such as those of the prediction generation functions). The
reasonableness of the parameters has a substantial influence on the
prediction accuracy of the model. The K value is primarily
employed to determine the number of candidate neighbors. If the K
value is too small, the model becomes more complex and
overfitting is possible. If the K value is too large, the model is simpler
and under-fitting is possible. Considering that the selection of the
K value is significantly influenced by the finite sample nature of
the problem, its value is usually assigned by
cross-validation, selecting the K value that minimizes the model
error [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>The existing methods usually assume that the K value is
globally fixed. Once the K value is determined, it is shared
throughout the entire space and time. In contrast to the existing methods,
the selection of the K value in the D-ST-KNN model considers the
dynamic changes of traffic. Instead of setting
a globally fixed K value, we select the optimal K value for
each bucket, i.e., <inline-formula><tex-math>K_{bk_i}</tex-math></inline-formula>, <inline-formula><tex-math>bk_i \in BK</tex-math></inline-formula>, <inline-formula><tex-math>i \in [1, M]</tex-math></inline-formula>.</p>
      <p>
        To verify these assumptions, we use cross-validation with the
range of K set to [1, 40] and test the effect of different K values on the
MAPE of the model in different buckets, as shown in Fig. 1.
      </p>
      <p>[Fig. 1: MAPE (%) versus K (0 to 40) in different buckets; recoverable panel label: Bucket 4.]</p>
      <p>
        As the K value increases, the prediction error is gradually
reduced. When the K value reaches a certain value, the error
of the model begins to stabilize. Thus, the optimal K value for
each bucket can be determined (e.g., <inline-formula><tex-math>K_{bk_1} = 27</tex-math></inline-formula>, <inline-formula><tex-math>K_{bk_2} = 23</tex-math></inline-formula>).
Across buckets, the optimal K values vary dynamically
with the time period. A globally fixed K value can hardly
describe the dynamic change in traffic. Therefore, the dynamic
K value proposed in this paper is reasonable. The parameters
of the D-ST-KNN model also contain the parameters introduced
by the prediction generation function (refer to Section 2.5). The
calibration method of these parameters is shown in Section 3.2.
Because the spatiotemporal state space, the spatiotemporal weights,
and the spatiotemporal parameters change dynamically across
buckets, the prediction generation
function should also change dynamically. This paper transforms
four types of traditional weight distribution methods to
enable them to adapt to the dynamics of traffic: the equal weight, the inverse
distance weight [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], the rank-based weight [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ][
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and the Gaussian
weight [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The best prediction function is selected by comparing
the performance of the different prediction functions (refer to Section
3.2). Note that the weight referred to in this section is
the weight assigned to each candidate neighbor, whereas the
weight in Section 2.3 refers to the weight matrices whose entries are
assigned to each element in the spatiotemporal state matrix.
      </p>
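      <p>Assuming that <inline-formula><tex-math>d_{bk_i}^{k}</tex-math></inline-formula> is the distance between the kth
candidate neighbor and the predicted road segment in the ith bucket
obtained by formula (10), the predicted value <inline-formula><tex-math>\widehat{vol}_{t_c+1}^{L_j}</tex-math></inline-formula> of the
predicted road segment <inline-formula><tex-math>L_j</tex-math></inline-formula> at time step <inline-formula><tex-math>t_c + 1</tex-math></inline-formula> is defined as
        <disp-formula><label>(11)</label><tex-math>\widehat{vol}_{t_c+1}^{L_j} = \frac{\sum_{k=1}^{K_{bk_i}} vol_{h_i+1}^{L_j}(k) \times \varphi_{bk_i}^{L_j}(k)}{\sum_{k=1}^{K_{bk_i}} \varphi_{bk_i}^{L_j}(k)}</tex-math></disp-formula>
where <inline-formula><tex-math>t_c \in [t_a^{bk_i}, t_b^{bk_i}]</tex-math></inline-formula> is used to map the current time step into
the corresponding bucket, <inline-formula><tex-math>K_{bk_i}</tex-math></inline-formula> is used to determine the number of
candidate neighbors for the corresponding bucket, <inline-formula><tex-math>vol_{h_i+1}^{L_j}(k)</tex-math></inline-formula>
represents the traffic data of the kth candidate neighbor, with
<inline-formula><tex-math>h_i \in [t_a^{bk_i}, t_b^{bk_i}]</tex-math></inline-formula>, and <inline-formula><tex-math>\varphi_{bk_i}^{L_j}(k)</tex-math></inline-formula> represents the weight of the kth
neighbor of the jth predicted road segment in the ith bucket. Its
form is defined as follows:
        <disp-formula><label>(12)</label><tex-math><![CDATA[\varphi_{bk_i}^{L_j}(k) = \begin{cases} \dfrac{1}{K_{bk_i}} \\ \dfrac{1}{d_{bk_i}^{k}} \\ (K_{bk_i} - r_q + 1)^2 \\ \dfrac{1}{4 \pi a_{bk_i}} \exp\left( - \dfrac{(d_{bk_i}^{k})^2}{4 a_{bk_i}^2} \right) \end{cases}]]></tex-math></disp-formula>
      </p>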
      <p>The four cases of formula (12) correspond to the equal weight, the inverse distance
weight, the rank-based weight and the Gaussian weight, respectively, where
<inline-formula><tex-math>r_q</tex-math></inline-formula> represents the rank of the qth candidate neighbor, and <inline-formula><tex-math>a_{bk_i}</tex-math></inline-formula>
is a spatiotemporal parameter that, like the
previously discussed spatiotemporal parameter K,
varies dynamically with different time periods. The
corresponding parameter calibration is shown in Section 3.2.</p>
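      <p>As an illustration, the four weighting schemes and the weighted average of formula (11) can be written as the following sketch. It is not the authors' implementation: the function names are hypothetical, the distances are assumed to be strictly positive (required by the inverse distance weight), and the rank is assumed to be 1 for the nearest candidate neighbor.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def neighbor_weights(distances, method="gaussian", a=1.0):
    """Weights of the K candidate neighbors for one bucket.

    distances: distances d_k of the K candidates (assumed strictly positive).
    a: Gaussian parameter of the bucket (only used by method="gaussian").
    """
    d = np.asarray(distances, dtype=float)
    k = len(d)
    if method == "equal":
        return np.full(k, 1.0 / k)
    if method == "inverse":
        return 1.0 / d
    if method == "rank":
        ranks = np.argsort(np.argsort(d)) + 1  # rank 1 = nearest neighbor
        return (k - ranks + 1.0) ** 2
    if method == "gaussian":
        return np.exp(-d ** 2 / (4.0 * a ** 2)) / (4.0 * np.pi * a)
    raise ValueError(f"unknown method: {method}")


def predict(next_values, distances, method="gaussian", a=1.0):
    """Weighted average of the values that followed each candidate neighbor."""
    w = neighbor_weights(distances, method, a)
    v = np.asarray(next_values, dtype=float)
    return float((v * w).sum() / w.sum())
```
]]></preformat>
      <p>For example, with next values 50, 60 and 70 at distances 1, 2 and 3, the equal weight returns 60, whereas the other three schemes return values below 60 because the nearest neighbor receives the largest weight.</p>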
    </sec>
    <sec id="sec-5">
      <title>2.6 Accuracy metrics</title>
      <p>
        Three criteria are selected to verify the prediction accuracy of
the D-ST-KNN model, namely, mean absolute error (MAE), mean
absolute percentage error (MAPE) and root-mean-square error
(RMSE). These indicators depict the essential characteristics of
errors from different perspectives. The RMSE indicates the
fluctuation in the error of the prediction model, and the MAPE indicates
the difference between the predicted and the actual traffic data. In
contrast, the MAE and RMSE provide a measure of the similarity
between the predicted and the actual traffic data [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The MAE,
MAPE, and RMSE are defined as follows:
      </p>
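      <p>
        <disp-formula><label>(13)</label><tex-math>MAE = \frac{1}{M \times N \times S} \sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{s=1}^{S} \left| \widehat{vol}_{t_c+1}^{L_j}(s) - vol_{t_c+1}^{L_j}(s) \right|</tex-math></disp-formula>
        <disp-formula><label>(14)</label><tex-math>MAPE = \frac{1}{M \times N \times S} \sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{s=1}^{S} \frac{\left| \widehat{vol}_{t_c+1}^{L_j}(s) - vol_{t_c+1}^{L_j}(s) \right|}{vol_{t_c+1}^{L_j}(s)}</tex-math></disp-formula>
        <disp-formula><label>(15)</label><tex-math>RMSE = \sqrt{ \frac{1}{M \times N \times S} \sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{s=1}^{S} \left( \widehat{vol}_{t_c+1}^{L_j}(s) - vol_{t_c+1}^{L_j}(s) \right)^2 }</tex-math></disp-formula>
where M is the number of buckets (M = 6), N is the number
of predicted road segments, S is the number of test samples,
<inline-formula><tex-math>vol_{t_c+1}^{L_j}(s)</tex-math></inline-formula> and <inline-formula><tex-math>\widehat{vol}_{t_c+1}^{L_j}(s)</tex-math></inline-formula> indicate the actual traffic data and the
predicted traffic data at the next time step of the jth predicted
road segment at the current time step, and s indicates the sth test
sample in the ith bucket.</p>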
    <sec id="sec-6">
      <title>3 EXPERIMENTS</title>
    </sec>
    <sec id="sec-7">
      <title>3.1 Data preparation</title>
      <p>In this study, two different data sets are used to evaluate the
performance of the prediction model. The first data set is PeMS,
which is a high-quality data set with open access. PeMS is
extensively applied in the field of traffic prediction. The traffic speed
data from 59 consecutive locations on the US 101 freeway were
downloaded from PeMS for a total of 60 days; the time period
is August 15, 2016, to October 14, 2016, and the time interval is 5
min (as shown in Table 1). Each detector represents a position;
the positional distribution is shown in Fig. 2. The second data
set is the floating car trajectory data obtained from the Beijing
road network, which is generated from more than 50,000 vehicles
equipped with GPS. The frequency of data acquisition is 5 min,
and the time period is March 1, 2012, to April 30, 2012 (as shown
in Table 1). In this study, a representative region that contains
30 road segments is used for the experiment, with the position
distribution shown in Fig. 2. In the two data sets, the last ten
days are used as the test data to evaluate the accuracy of the
model. The remaining days of data are employed as training data
to construct the historical database of the prediction model.</p>
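      <p>In addition, we normalize the original traffic data and use the
ratio of the average traffic speed to the maximum speed limit of
each road segment to express the traffic conditions of the road
segment. The formal expression is as follows:
        <disp-formula><label>(16)</label><tex-math>vd_{i,t} = \frac{v_{i,t}}{f_{i,\max}}, \quad i \in [1, N],\ t \in [t_0, t_c]</tex-math></disp-formula>
where <inline-formula><tex-math>vd_{i,t}</tex-math></inline-formula> is the normalized speed of the ith road segment at
time step t, <inline-formula><tex-math>v_{i,t}</tex-math></inline-formula> is the real average speed of the road segment,
and <inline-formula><tex-math>f_{i,\max}</tex-math></inline-formula> is the speed limit for the ith road segment.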
yields the lowest performance. The distance function constructed
by the Gaussian function assigns weights in the time dimension
and space dimension; thus, the performance of the prediction
model is significantly improved. However, this method requires
introducing the additional time-weighted parameter α1 and
space-weighted parameter α2 in the construction process,
which makes it difficult to calibrate the parameters and to find
their globally optimal combination. We adopt a similar strategy
that uses the linear time distribution weight in the time
dimension and the spatial correlation between the surrounding road
segments and the target road segment to assign weights in the spatial
dimension. Then, a dynamic spatiotemporal weight assignment
method is constructed that does not require any additional
parameters. The dynamic weight distribution has the lowest MAPE,
RMSE and MAE, which reflects the high efficiency of the method
compared to that of the other two weight distribution methods.</p>
      <p>3.2.2 Determining the optimal predictive function. Based on the
discussion in the previous sections, we consider four types of
weight distribution methods, namely equal weight, inverse
distance weight, rank-based weight, and Gaussian weight, which
are used to integrate the candidate neighbors to obtain the final
predicted value. In the process of cross-validation, we fix the other
parameters of the model, such as $K_{bk_i}$ and $a_{bk_i}$, and calculate
the influence of the different weight distribution methods on the
prediction accuracy of the D-ST-KNN model, which gives the average
error over the entire test data set for each weight distribution
method. The results are shown in Fig. 4. The MAPE, RMSE and
MAE of the Gaussian weight method are lower than those of
the other three weight distribution methods.
Therefore, in the D-ST-KNN model, we employ the Gaussian function as
the weight distribution method for candidate neighbors.</p>
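      <p>To make the Gaussian weighting of candidate neighbors concrete, the sketch below combines K candidates into one forecast. It assumes a kernel of the form exp(-d^2 / (2 a^2)); the exact kernel used by the D-ST-KNN model is defined in the method section and may differ by a constant factor. The function name and argument layout are hypothetical.</p>
      <preformat>
```python
import math


def gaussian_weighted_prediction(distances, next_values, a):
    """Combine K candidate neighbors into one forecast.

    distances:   distance of each candidate to the current traffic state
    next_values: value that followed each candidate in the historical database
    a:           Gaussian weight parameter (assumed kernel exp(-d^2 / (2 a^2)))
    """
    squared = [d * d for d in distances]
    # Shifting by the smallest squared distance leaves the normalized
    # weights unchanged and keeps at least one weight equal to 1, so the
    # weight sum cannot underflow to zero for very small values of a.
    smallest = min(squared)
    weights = [math.exp(-(s - smallest) / (2.0 * a * a)) for s in squared]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, next_values)) / total
```
      </preformat>
      <p>With equal distances the forecast reduces to the plain average of the candidates, which corresponds to the equal weight method; as a shrinks, the closest candidate dominates.</p>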
      <p>
        3.2.3 Calibrating hyper-parameters. In the D-ST-KNN model,
the hyper-parameters primarily include the number of candidate
neighbors $K_{bk_i}$ and the Gaussian weight parameter $a_{bk_i}$. In the
parameter calibration process, to find the combination of
$K_{bk_i}$ and $a_{bk_i}$ that gives the prediction model the
minimum MAPE, we set the range of $K_{bk_i}$ to [1, 40] and the range
of $a_{bk_i}$ to [0.001, 0.04]. We apply the cross-validation method to
obtain the optimal combination of the parameters for each bucket.
The effect of varying one parameter on the prediction accuracy of
the D-ST-KNN model can be tested by fixing the other parameters of
the model. For example, we can fix the value of $a_{bk_i}$ and test how the
performance of the prediction model changes with $K_{bk_i}$ (refer
to Section 2.4). Because the impact of parameter $K_{bk_i}$ on the
prediction performance was discussed in Section 2.4, this section
focuses on the calibration of parameter $a_{bk_i}$.
      </p>
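      <p>The per-bucket calibration described above can be sketched as an exhaustive grid search over the two ranges. This is a minimal sketch, not the authors' implementation: evaluate_mape is a hypothetical callback that stands in for the cross-validation run and returns the MAPE for a given pair of parameters, and the grid step of 0.001 for the Gaussian weight parameter is an assumption.</p>
      <preformat>
```python
def calibrate_bucket(evaluate_mape, k_values=range(1, 41), a_values=None):
    """Return (best_k, best_a, best_mape) for one data bucket.

    evaluate_mape(k, a) is a hypothetical callback that runs the
    cross-validation for the bucket and returns the resulting MAPE.
    """
    if a_values is None:
        # 0.001, 0.002, ..., 0.040 (the step size is an assumption)
        a_values = [round(0.001 * i, 3) for i in range(1, 41)]
    best = None
    for k in k_values:
        for a in a_values:
            score = evaluate_mape(k, a)
            # Keep the first combination that reaches the lowest MAPE.
            if best is None or best[0] > score:
                best = (score, k, a)
    return best[1], best[2], best[0]
```
      </preformat>
      <p>Running this search once per data bucket yields a separate pair of parameters for each time period.</p>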
      <p>
        Fig. 5 shows the impact of changes in $a_{bk_i}$ on the performance
of the D-ST-KNN model in different buckets. The trend in Fig. 5
reveals that the value of $a_{bk_i}$ has a significant influence on the
prediction performance. At the smallest value of $a_{bk_i}$, the prediction
error of the model attains its maximum. As $a_{bk_i}$ increases,
the prediction error gradually decreases and begins to stabilize.
We also compare the optimal parameter values among the different
buckets. For example, in bucket 1, the optimal value of $a_{bk_1}$ is
0.017, whereas the optimal value of $a_{bk_2}$ in bucket 2 is 0.015. The
value of $a_{bk_i}$ therefore changes dynamically over time. Considering
that $K_{bk_i}$ also changes dynamically with time, the parameters of
the D-ST-KNN model change with time. The calibration results of
the entire model are listed in Table 2, and the values of $K_{bk_i}$ are
shown in Fig. 5. This analysis shows that setting globally fixed
parameters is unreasonable when constructing the prediction model.
We propose the concept of the data bucket, and the prediction
model is constructed separately for different time periods, so that the
model parameters change with the time period to adapt to the
dynamic nature of traffic.
      </p>
      <p>
        3.3.1 Overall results. Based on the variable estimation, we compare
our model with several existing traffic prediction models,
including the historical average model (HA), Elman neural network
(Elman-NN) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], traditional KNN model (Original-KNN), and
spatiotemporal KNN model (ST-KNN). Fig. 6 shows the prediction
performance of the different models. The HA model, the Elman-NN
model, and the Original-KNN model regard traffic prediction
as a simple time series problem and disregard
the influence of spatial factors on the predicted road segment.
Therefore, in terms of MAPE, their prediction performance is lower
than that of the ST-KNN model and the D-ST-KNN model
proposed in this paper. The
ST-KNN model introduces the spatiotemporal state matrix, which
improves the prediction performance of the model. However,
this model ignores the spatial heterogeneity and the temporal
non-stationarity of the road network; a static
ST-KNN model (with a globally fixed spatiotemporal matrix
and globally fixed parameters) cannot describe the
essential characteristics of the traffic dynamics. The D-ST-KNN model constructs
models for different time periods by introducing the concept
of data buckets. Simultaneously, the dynamic spatial neighbors,
dynamic time windows, dynamic spatiotemporal weights, and
dynamic spatiotemporal parameters are introduced to construct the
D-ST-KNN model, which can adequately adapt to the dynamic
changes of traffic conditions. The experimental results indicate
that the D-ST-KNN model proposed in this paper is superior to the
other models.
      </p>
      <p>[Figure: MAPE (%) of the HA, Elman-NN, Original-KNN, ST-KNN, and D-ST-KNN models.]</p>
      <p>3.3.2 Local results. To further evaluate the performance of the
D-ST-KNN model, we compare the MAPEs of the different models in
different data buckets by averaging the prediction performance
over the road segments in a single bucket. The
experimental results are displayed in Fig. 7. In terms of overall trends, the
performance of the different models corresponds to the degree of
congestion of the traffic conditions. For example, in buckets 1,
3, and 6, all models have a lower MAPE than in the other
buckets because the time periods that correspond to these three data
buckets are midnight-6:30, 10:00-13:30, and 20:30-midnight.
The traffic in Beijing during these three time periods belongs
to the off-peak period, when road traffic has low congestion and
exhibits regular changes. In buckets 2, 4, and 5, all models achieve
a relatively poor performance. Bucket 2 corresponds to the time
period of 6:30-10:00, bucket 4 corresponds to the time period
of 13:30-17:00, and bucket 5 corresponds to the time period of
17:00-20:30. These time periods correspond to the peak periods in
Beijing. The changes of traffic conditions during these time
periods are more complicated than those in the other
buckets. In addition, within a single data bucket, the relative
performance of the different models is similar to that in the
overall results. For example,
in bucket 1, the ST-KNN and D-ST-KNN models perform better
than the HA, Elman-NN, and Original-KNN models, which is due
to the benefits of the introduction of spatial factors. Moreover,
the D-ST-KNN model considers the spatial heterogeneity and
temporal non-stationarity of road networks to adapt to the
dynamic characteristics of traffic, which makes its performance
better than that of the other models in all time periods, especially in the peak
periods. This also explains why the D-ST-KNN model is superior
to the other models in the overall result.</p>
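      <p>The six time periods listed above can be turned into a simple lookup from time of day to data bucket. The sketch below uses the bucket boundaries stated in this section for the Beijing data set; the function name bucket_of and the choice to assign a boundary time to the later bucket are assumptions.</p>
      <preformat>
```python
from bisect import bisect_right

# Bucket start times in minutes after midnight, taken from the periods
# listed in the text: 0:00, 6:30, 10:00, 13:30, 17:00, 20:30.
BUCKET_STARTS = [0, 6 * 60 + 30, 10 * 60, 13 * 60 + 30, 17 * 60, 20 * 60 + 30]


def bucket_of(hour, minute):
    """Return the 1-based data bucket for a time of day.

    A time that falls exactly on a boundary is assigned to the later
    bucket (an assumption; the text does not specify this).
    """
    minutes = hour * 60 + minute
    # Number of bucket starts at or before this time = bucket index.
    return bisect_right(BUCKET_STARTS, minutes)
```
      </preformat>
      <p>For instance, a sample recorded at 18:00 falls into bucket 5, one of the peak-period buckets in which all models perform relatively poorly.</p>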
      <p>[Figure: per-bucket comparison (Bucket1 to Bucket6) of the HA, Elman-NN, Original-KNN, ST-KNN, and D-ST-KNN models.]</p>
      <sec id="sec-7-2">
        <p>To evaluate the generalization ability of the D-ST-KNN model,
we fix all parameters of the model and compare the performance
of the different methods on the test data set from PeMS; the
experimental results are shown in Fig. 8. The results indicate
that the prediction accuracy of the D-ST-KNN model on the
PeMS data set is significantly higher than that on
the Beijing floating car data set. The PeMS
data set is relatively complete, and the data collection area is
an expressway. Compared with the traffic conditions of the
urban road network, the traffic pattern is relatively simple with
minimal changes, which enables the prediction model to easily
capture the regular traffic patterns. Nevertheless, the
D-ST-KNN model maintains the same trend: among all
models compared, its MAPE, RMSE, and MAE are the lowest,
which demonstrates its excellent predictive performance
and generalization ability.</p>
        <p>[Figure: MAPE (%), RMSE, and MAE of the HA, Elman-NN, Original-KNN, ST-KNN, and D-ST-KNN models.]</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>4 SUMMARY AND FUTURE WORK</title>
      <p>In this paper, we propose a D-ST-KNN model for short-term traffic
prediction. The proposed model considers the spatial
heterogeneity and temporal non-stationarity of road networks to adapt to
the dynamic characteristics of traffic through dynamic spatial
neighbors, time windows, spatiotemporal weights, and
spatiotemporal parameters. The spatial neighbors and the time
window are selected automatically by computing cross-correlation and
autocorrelation functions, which efficiently solves
the curse-of-dimensionality problem encountered in the existing
KNN models. The spatiotemporal weights are integrated into
a distance function to help identify candidate neighbors. Time-varying
parameters are also introduced, including the dynamic
number of candidate neighbors and dynamic weight allocation
parameters, to further adapt to the dynamic and heterogeneous
nature of road networks.</p>
      <p>Using real traffic data collected from city roads and inter-city
expressways, we calculate the number of spatial neighbors and
the time window size of each road segment, which reflect the
distinct heterogeneity and non-stationarity of urban road traffic.
Then, we validate the performance of the proposed D-ST-KNN
model with comparisons to HA, Elman-NN, traditional KNN and
spatiotemporal KNN models. The experimental results indicate
that the D-ST-KNN model has a higher accuracy in short-term
traffic prediction than the existing models. In addition, we
explore the local performance of the different models in different data
buckets and find that the performance of all models corresponds to the degree of
traffic congestion, and that the D-ST-KNN model performs better than the
other models in all time periods, especially in the morning
and evening peak periods. To summarize, compared with the
existing models, the proposed D-ST-KNN model significantly
improves the accuracy of short-term traffic prediction. Furthermore,
we compare the performance of the different models using the
actual traffic data collected from PeMS. The D-ST-KNN model also
achieves the best performance, which verifies the generalization
ability of the proposed model.</p>
      <p>In the follow-up study, the following problems need to be
investigated to further improve the D-ST-KNN model. The
D-ST-KNN model behaves slightly differently in peak and off-peak
time periods. Further improvement of the model performance
during peak hours remains a challenge. Moreover, a
multithreaded approach could be used to improve the efficiency of
D-ST-KNN. A parallel P-D-ST-KNN model on an existing parallel
computing framework is expected to alleviate the pressure of
real-time computation.
</p>
    </sec>
    <sec id="sec-9">
      <title>5 ACKNOWLEDGMENTS</title>
      <p>This research is supported by the Key Research Program of the
Chinese Academy of Sciences (Grant No. ZDRW-ZS-2016-6-3)
and the State Key Research Development Program of China
(Grant No. 2016YFB0502104). Their support is gratefully
acknowledged. We also thank the anonymous referees for their
helpful comments and suggestions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J Scott</given-names>
            <surname>Armstrong</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Findings from evidence-based forecasting: Methods for reducing forecast error</article-title>
          .
          <source>International Journal of Forecasting</source>
          <volume>22</volume>
          ,
          <issue>3</issue>
          (
          <year>2006</year>
          ),
          <fpage>583</fpage>
          -
          <lpage>598</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Brenda I.</given-names>
            <surname>Bustillos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yi Chang</given-names>
            <surname>Chiu</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Real-Time Freeway-Experienced Travel Time Prediction Using N -Curve and k Nearest Neighbor Methods</article-title>
          .
          <source>Transportation Research Record Journal of the Transportation Research Board</source>
          <volume>2243</volume>
          , -
          <fpage>1</fpage>
          (
          <year>2011</year>
          ),
          <fpage>127</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Pinlong</given-names>
            <surname>Cai</surname>
          </string-name>
          , Yunpeng Wang, Guangquan Lu, Peng Chen, Chuan Ding, and
          <string-name>
            <given-names>Jianping</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A spatiotemporal correlative k-nearest neighbor model for short-term traffic multistep forecasting</article-title>
          .
          <source>Transportation Research Part C Emerging Technologies</source>
          <volume>62</volume>
          (
          <year>2016</year>
          ),
          <fpage>21</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yoon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Baek</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Dynamic near-term traffic flow prediction: system-oriented approach based on past experiences</article-title>
          .
          <source>IET Intelligent Transport Systems</source>
          <volume>6</volume>
          ,
          <issue>3</issue>
          (
          <year>2012</year>
          ),
          <fpage>292</fpage>
          -
          <lpage>305</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Tao</given-names>
            <surname>Cheng</surname>
          </string-name>
          , James Haworth, and
          <string-name>
            <given-names>Jiaqiu</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Spatio-temporal autocorrelation of road network data</article-title>
          .
          <source>Journal of Geographical Systems</source>
          <volume>14</volume>
          ,
          <issue>4</issue>
          (
          <year>2012</year>
          ),
          <fpage>389</fpage>
          -
          <lpage>413</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Tao</given-names>
            <surname>Cheng</surname>
          </string-name>
          , Jiaqiu Wang, James Haworth, Benjamin Heydecker, and
          <string-name>
            <given-names>Andy</given-names>
            <surname>Chow</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A Dynamic Spatial Weight Matrix and Localized Space-Time Autoregressive Integrated Moving Average for Network Modeling</article-title>
          .
          <source>Geographical Analysis</source>
          <volume>46</volume>
          ,
          <issue>1</issue>
          (
          <year>2014</year>
          ),
          <fpage>75</fpage>
          -
          <lpage>97</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Stephen</given-names>
            <surname>Clark</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Traffic Prediction Using Multivariate Nonparametric Regression</article-title>
          .
          <source>Journal of Transportation Engineering</source>
          <volume>129</volume>
          ,
          <issue>2</issue>
          (
          <year>2003</year>
          ),
          <fpage>161</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Peibo</given-names>
            <surname>Duan</surname>
          </string-name>
          , Guoqiang Mao, Shangbo Wang, Changsheng Zhang, and Bin Zhang.
          <year>2016</year>
          .
          <article-title>STARIMA-based Traffic Prediction with Time-varying Lags</article-title>
          .
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Jeffrey L</given-names>
            <surname>Elman</surname>
          </string-name>
          .
          <year>1990</year>
          .
          <article-title>Finding structure in time</article-title>
          .
          <source>Cognitive Science</source>
          <volume>14</volume>
          ,
          <issue>2</issue>
          (
          <year>1990</year>
          ),
          <fpage>179</fpage>
          -
          <lpage>211</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Gaetano</given-names>
            <surname>Fusco</surname>
          </string-name>
          , Chiara Colombaroni, and
          <string-name>
            <given-names>Natalia</given-names>
            <surname>Isaenko</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Short-term speed predictions exploiting big data on large urban road networks</article-title>
          .
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>73</volume>
          (
          <year>2016</year>
          ),
          <fpage>183</fpage>
          -
          <lpage>201</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Kamran</given-names>
            <surname>Ghavamifar</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>A decision support system for project delivery method selection in the transit industry</article-title>
          . (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Filmon G.</given-names>
            <surname>Habtemichael</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mecit</given-names>
            <surname>Cetin</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Short-term traffic flow rate forecasting based on identifying similar traffic patterns</article-title>
          .
          <source>Transportation Research Part C</source>
          <volume>66</volume>
          (
          <year>2016</year>
          ),
          <fpage>61</fpage>
          -
          <lpage>78</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Filmon G.</given-names>
            <surname>Habtemichael</surname>
          </string-name>
          , Mecit Cetin, and
          <string-name>
            <given-names>Khairul A.</given-names>
            <surname>Anuar</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>METHODOLOGY FOR QUANTIFYING INCIDENT-INDUCED DELAYS ON FREEWAYS BY GROUPING SIMILAR TRAFFIC PATTERNS</article-title>
          .
          <source>Transportation Research Record Journal of the Transportation Research Board</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>James Douglas</given-names>
            <surname>Hamilton</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>Time series analysis</article-title>
          .
          <volume>401</volume>
          -409 pages.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Haikun</given-names>
            <surname>Hong</surname>
          </string-name>
          , Wenhao Huang, Xiabing Zhou, Sizhen Du, Kaigui Bian, and
          <string-name>
            <given-names>Kunqing</given-names>
            <surname>Xie</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Short-term traffic flow forecasting: Multi-metric KNN with related station discovery</article-title>
          .
          <source>In International Conference on Fuzzy Systems and Knowledge Discovery</source>
          .
          <fpage>1670</fpage>
          -
          <lpage>1675</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Xiaoyu</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yisheng</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Siyu</given-names>
            <surname>Hu</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Short-term Traffic Flow Forecasting based on Two-tier K-nearest Neighbor Algorithm</article-title>
          .
          <source>Procedia - Social and Behavioral Sciences</source>
          <volume>96</volume>
          (
          <year>2013</year>
          ),
          <fpage>2529</fpage>
          -
          <lpage>2536</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Myung</surname>
            <given-names>Jiwon</given-names>
          </string-name>
          , Dong Kyu Kim,
          Seung Young Kho, and Chang Ho Park
          .
          <year>2011</year>
          .
          <article-title>Travel Time Prediction Using k Nearest Neighbor Method with Combined Data from Vehicle Detector System and Automatic Toll Collection System</article-title>
          .
          <source>Transportation Research Record Journal of the Transportation Research Board</source>
          <volume>20</volume>
          ,
          <issue>2256</issue>
          (
          <year>2011</year>
          ),
          <fpage>51</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Arief</given-names>
            <surname>Koesdwiady</surname>
          </string-name>
          , Ridha Soua, and
          <string-name>
            <given-names>Fakhreddine</given-names>
            <surname>Karray</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Improving Traffic Flow Prediction With Weather Information in Connected Cars: A Deep Learning Approach</article-title>
          .
          <source>IEEE Transactions on Vehicular Technology</source>
          <volume>65</volume>
          ,
          <issue>12</issue>
          (
          <year>2016</year>
          ),
          <fpage>9508</fpage>
          -
          <lpage>9517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Shuangshuang</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhen</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>Gang</given-names>
            <surname>Xiong</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>A k-nearest neighbor locally weighted regression method for short-term traffic flow forecasting</article-title>
          .
          <source>In International IEEE Conference on Intelligent Transportation Systems</source>
          .
          <fpage>1596</fpage>
          -
          <lpage>1601</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>He</surname>
          </string-name>
          , J. Ma,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed Prediction</article-title>
          .
          <source>Sensors</source>
          <volume>17</volume>
          ,
          <issue>4</issue>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Wanli</given-names>
            <surname>Min</surname>
          </string-name>
          and
          <string-name>
            <given-names>Laura</given-names>
            <surname>Wynter</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Real-time road traffic prediction with spatio-temporal correlations</article-title>
          .
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>19</volume>
          ,
          <issue>4</issue>
          (
          <year>2011</year>
          ),
          <fpage>606</fpage>
          -
          <lpage>616</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Brian Lee</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Forecasting freeway traffic flow for intelligent transportation systems application</article-title>
          .
          <source>Transportation Research Part A</source>
          <volume>31</volume>
          ,
          <issue>1</issue>
          (
          <year>1997</year>
          ),
          <fpage>61</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Brian L</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Billy M</given-names>
            <surname>Williams</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R Keith</given-names>
            <surname>Oswald</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Comparison of parametric and nonparametric models for traffic flow forecasting</article-title>
          .
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>10</volume>
          ,
          <issue>4</issue>
          (
          <year>2002</year>
          ),
          <fpage>303</fpage>
          -
          <lpage>321</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Stathopoulos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matthew G.</given-names>
            <surname>Karlaftis</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>A multivariate state space approach for urban traffic flow modeling and prediction</article-title>
          .
          <source>Transportation Research Part C Emerging Technologies</source>
          <volume>11</volume>
          ,
          <issue>2</issue>
          (
          <year>2003</year>
          ),
          <fpage>121</fpage>
          -
          <lpage>135</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Eleni I.</given-names>
            <surname>Vlahogianni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Matthew G.</given-names>
            <surname>Karlaftis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John C.</given-names>
            <surname>Golias</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Spatio-Temporal Short-Term Urban Traffic Volume Forecasting Using Genetically Optimized Modular Networks</article-title>
          .
          <source>Computer-Aided Civil and Infrastructure Engineering</source>
          <volume>22</volume>
          ,
          <issue>5</issue>
          (
          <year>2007</year>
          ),
          <fpage>317</fpage>
          -
          <lpage>325</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Shanhua</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhongzhen</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaocong</given-names>
            <surname>Zhu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Bin</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Improved knn for Short-Term Traffic Forecasting Using Temporal and Spatial Information</article-title>
          .
          <source>Journal of Transportation Engineering</source>
          <volume>140</volume>
          ,
          <issue>7</issue>
          (
          <year>2014</year>
          ),
          <fpage>04014026</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Dawen</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Binfeng</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Huaqing</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yantao</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Zili</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A distributed spatial–temporal weighted model on MapReduce for short-term traffic flow forecasting</article-title>
          .
          <source>Neurocomputing</source>
          <volume>179</volume>
          ,
          <issue>C</issue>
          (
          <year>2016</year>
          ),
          <fpage>246</fpage>
          -
          <lpage>263</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
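          <string-name>
            <given-names>Bin</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaolin</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Feng</given-names>
            <surname>Guan</surname>
          </string-name>
          ,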
          <string-name>
            <given-names>Zhiming</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Baozhen</given-names>
            <surname>Yao</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>k-Nearest neighbor model for multiple-time-step prediction of short-term traffic condition</article-title>
          .
          <source>Journal of Transportation Engineering</source>
          <volume>142</volume>
          ,
          <issue>6</issue>
          (
          <year>2016</year>
          ),
          <fpage>04016018</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Yang</given-names>
            <surname>Yue</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anthony Gar-On</given-names>
            <surname>Yeh</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Spatiotemporal traffic-flow dependency and short-term traffic forecasting</article-title>
          .
          <source>Environment and Planning B: Planning and Design</source>
          <volume>35</volume>
          ,
          <issue>5</issue>
          (
          <year>2008</year>
          ),
          <fpage>762</fpage>
          -
          <lpage>771</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Lun</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Qian</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wenchen</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Meng</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>An Improved k-NN Nonparametric Regression-Based Short-Term Traffic Flow Forecasting Model for Urban Expressways</article-title>
          . In International Conference on Transportation Engineering.
          <fpage>1214</fpage>
          -
          <lpage>1223</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Zuduo</given-names>
            <surname>Zheng</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dongcai</given-names>
            <surname>Su</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Short-term traffic volume forecasting: A k-nearest neighbor approach enhanced by constrained linearly sewing principle component algorithm</article-title>
          .
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>43</volume>
          (
          <year>2014</year>
          ),
          <fpage>143</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>