<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Short-term traffic flow forecasting using a distributed spatial-temporal model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A A Agafonov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A S Yumaganov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Samara National Research University</institution>
          ,
          <addr-line>Moskovskoye shosse 34, Samara, Russia, 443086</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>402</fpage>
      <lpage>409</lpage>
      <abstract>
        <p>In this paper, we consider the problem of short-term traffic flow prediction. We propose a distributed model for short-term traffic flow forecasting based on the k nearest neighbors method that takes into account the spatial and temporal traffic flow distribution. To consider spatial-temporal correlations, we partition the transportation graph into clusters by area and describe traffic flow by a feature vector defined for each cluster. The proposed model is implemented as a MapReduce-based algorithm on the Apache Spark framework. The proposed traffic flow prediction model is tested using actual average traffic speed data over a road network in Samara, Russia.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Issues related to traffic flow management are common in every major city around the world.
Traffic congestion leads to economic, environmental and social problems, which emphasizes the
importance of transport planning and logistics. To solve these problems, it is important to
obtain accurate and timely traffic flow information. For this reason, road traffic forecasting has
been a subject of active research for more than 40 years.</p>
      <p>Efforts to mitigate the traffic congestion problem are usually classified into three
directions: modification of the transport infrastructure, improvement of the operational quality of
public transport, and management of traffic flows. The first and the second directions are often
limited by economic or social factors, while traffic flow management has been continuously
improving due to the development of traffic data collection and processing technologies.</p>
      <p>Recently, much attention has been paid to the data-driven programming paradigm. This
interest is explained by the development of new technologies, methods and techniques for
massive data processing within the Big Data concept, the availability of multiple data sources
for predicting traffic flows, and the "open data" idea that some data should be freely available
to everyone to use, without restrictions from copyright, patents or other mechanisms of control.</p>
      <p>
        Short-term traffic flow forecasting considers the traffic prediction problem on the basis
of current and archived information about the state of traffic flows. A review of the latest
achievements in the road traffic forecasting field, as well as the main unresolved technical
challenges, can be found in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Most research on this topic has focused on developing methods
for modeling the characteristics of traffic flows (for example, density or speed). An overview
of the methods of short-term traffic forecasting is presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These methods can be
classified into three categories:
      </p>
      <p>
        1) Parametric methods [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], including time series models [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], state space models, etc.
2) Non-parametric methods [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], including artificial neural network models [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], k-nearest
neighbor (kNN) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], support vector regression (SVR) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
3) Hybrid methods that combine parametric and non-parametric methods [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ].
      </p>
      <p>However, these methods have both advantages and disadvantages when working under
different conditions and with different datasets, so it is hard to conclude that one method
is significantly superior to the others.</p>
      <p>
        In this paper, we propose an approach based on the k nearest neighbor algorithm, one of
the main non-parametric techniques for short-term traffic flow prediction. Results presented
in [
        <xref ref-type="bibr" rid="ref11 ref12 ref5">5, 11, 12</xref>
        ] showed that kNN outperformed other modern comparable models, including ANN,
SARIMA, random forest, and Naive Bayes. However, if the sample data size is too large, kNN
may not be suitable for real-time prediction due to the computational costs. Despite this issue,
relatively few works are devoted to short-term traffic flow forecasting with a
focus on processing big traffic data using distributed computations, in particular, using the
MapReduce framework [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ].
      </p>
      <p>In this article, we consider the problem of short-term traffic flow forecasting for 10 minutes
ahead. We focus on developing a distributed forecasting model based on the weighted kNN
algorithm, taking into account the spatial and temporal characteristics of the transport flows
in a spatially compact area of a transport network. For distributed data processing, we use the
MapReduce processing model implemented in the open source cluster-computing framework
Apache Spark. Experimental analysis on real-world traffic data sets allows us to conclude that
the proposed model has a high prediction accuracy and a reasonable execution time, sufficient for
real-time prediction.</p>
      <p>The paper is organized as follows. Section 2 contains the formulation of the problem. The
proposed model and its distributed implementation are described in detail, respectively, in
Sections 3 and 4. In Section 5, we provide experimental results of the proposed model and
verify the accuracy of the proposed approach. Finally, we conclude the paper, and then present
possible directions for further research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem formulation</title>
      <p>A road network is considered as a directed graph $G = (V, E)$, with nodes $V$, $N_V = |V|$,
representing the road intersections and edges $E$, $N_E = |E|$, denoting road segments.</p>
      <p>Let $V_t^j$ denote an observed traffic flow characteristic on an edge $j \in E$ at time interval $t$.
Travel time, average speed, density or flow can be used as the traffic flow characteristic.</p>
      <p>In this work, we use the average traffic speed as the predicted traffic flow characteristic
in the experimental study.</p>
      <p>The short-term traffic flow forecasting problem can be formulated as follows: given a graph
$G = (V, E)$ and a sequence $\{V_t^j\}$, $j \in E$, $t = 1, 2, \ldots, T$ of observed traffic flow data, predict the traffic
flow characteristic $\hat{V}_{t+\tau}^j$, $j \in E$ at time interval $(t + \tau)$ for a predefined prediction horizon $\tau$.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed model</title>
      <p>In this paper, we propose a short-term traffic flow forecasting model based on the non-parametric
regression k-nearest neighbors algorithm. To apply the kNN method to the traffic flow prediction
problem, it is necessary to solve the following tasks:
1. Define a feature vector to describe traffic flow.
2. Define a suitable distance metric to determine the proximity between a feature vector
describing current traffic flow characteristics and feature vectors describing historical traffic
flow observations.
3. Define a prediction function to forecast a traffic flow characteristic from the selected nearest
neighbors.</p>
      <p>These tasks are addressed in the following subsections.</p>
      <sec id="sec-3-1">
        <title>3.1. Feature vector</title>
        <p>The choice of a feature vector in the kNN method depends on the particular application of the
method in practice. To solve the traffic flow prediction problem, it is reasonable to use a feature
vector that takes into account spatial and temporal correlations of the traffic flow characteristics.</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], the authors used as a feature vector the traffic flow of the targeted road segment $j$,
the downstream road segment $j-1$ and the upstream road segment $j+1$ for $T$ time intervals:
          <disp-formula><label>(1)</label><tex-math>(V_{t-T}^{j}, \ldots, V_{t-1}^{j}, V_{t}^{j}, V_{t-T}^{j-1}, \ldots, V_{t-1}^{j-1}, V_{t}^{j-1}, V_{t-T}^{j+1}, \ldots, V_{t-1}^{j+1}, V_{t}^{j+1})</tex-math></disp-formula>
        </p>
        <p>However, such a feature vector does not consider traffic flow on adjacent segments. In addition,
in some cases, the upstream / downstream road segment cannot be uniquely determined.
Therefore, to describe traffic flow, we propose to use a feature vector that takes into account
the traffic flow characteristics in a spatially compact cluster of the transport network graph.</p>
        <p>In this paper, we define the feature vector as follows:
1. The transportation network graph is partitioned into several spatially compact clusters
$\{G_i\}$. In each cluster $i$ the feature vector is defined as follows:
          <disp-formula><label>(2)</label><tex-math>\{V_t^j\}_i, \quad j \in G_i, \quad t = t_{cur} - T, \ldots, t_{cur}.</tex-math></disp-formula>
2. For the defined feature vector $\{V\}_i$ in the cluster $i$, dimension reduction is performed using
the principal component analysis procedure. The result of this procedure is a new feature vector
$\{X_n\}_i$, $n = 1, \ldots, N$.
3. The proposed feature vector for each road segment $j \in E$ is composed of the initial feature
vector of the targeted road segment $j$ and the feature vector of the cluster $i$ such that
$j \in G_i$:
          <disp-formula><label>(3)</label><tex-math>S_j = (\{V_t^j\}, \{X_n\}_i), \quad j \in G_i, \quad t = t_{cur} - T, \ldots, t_{cur}, \quad n = 1, \ldots, N.</tex-math></disp-formula>
        </p>
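        <p>The construction of the feature vector can be sketched in Python. This is a minimal illustration and not the authors' implementation: the function names fit_pca, project and build_feature_vector, the power-iteration PCA and the toy data are all assumptions made for the example, and a production pipeline would more likely rely on a linear algebra library.</p>
        <preformat><![CDATA[
```python
import math


def fit_pca(rows, n_components, iterations=200):
    """Estimate the mean and the leading principal axes of `rows`
    (a list of equal-length cluster vectors) by power iteration."""
    dim = len(rows[0])
    mean = [sum(r[c] for r in rows) / len(rows) for c in range(dim)]
    centered = [[r[c] - mean[c] for c in range(dim)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (len(rows) - 1)
            for b in range(dim)] for a in range(dim)]
    components = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(dim)] * dim  # deterministic start vector (assumption)
        for _ in range(iterations):
            w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * cov[a][b] * v[b]
                         for a in range(dim) for b in range(dim))
        components.append(v)
        # Deflation: remove the found axis before searching for the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(dim)]
               for a in range(dim)]
    return mean, components


def project(vector, mean, components):
    """Map a cluster vector {V_t^j}_i to the reduced vector {X_n}_i."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


def build_feature_vector(segment_history, cluster_vector, mean, components):
    """S_j: the segment's own values followed by the cluster components."""
    return list(segment_history) + project(cluster_vector, mean, components)


# Toy cluster history: three past observations of a two-value cluster vector.
history = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, components = fit_pca(history, n_components=1)
s_j = build_feature_vector([3.0], [3.0, 6.0], mean, components)
```
]]></preformat>
        <p>In this sketch the PCA is fitted once on historical cluster vectors and then reused to project the current cluster state, so that current and historical feature vectors live in the same reduced space.</p>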
        <p>The graph partitioning algorithm is described in the next subsection.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Graph partitioning</title>
        <p>Let each edge $i \in E$ correspond to the road segment $e_i$ with two terminal points $x_{start}^{i} =
(x_{start}^{0}, x_{start}^{1})_i$ and $x_{end}^{i} = (x_{end}^{0}, x_{end}^{1})_i$.</p>
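        <p>Then graph partitioning by area can be described as follows:
1. Choose the numbers of clusters $M_0$, $M_1$.
2. The cluster $G_m$ with index $m = m_0 M_1 + m_1$, $m_0 = 0, \ldots, M_0 - 1$, $m_1 = 0, \ldots, M_1 - 1$, contains
the edges $i \in E$ for which the coordinates of at least one of the corresponding terminal points
are inside the corresponding rectangular area $\Omega_{m_0, m_1}$:
          <disp-formula><label>(4)</label><tex-math>G_{m_0 M_1 + m_1} = \left\{ i \in E : x_{start}^{i} \in \Omega_{m_0, m_1} \lor x_{end}^{i} \in \Omega_{m_0, m_1} \right\},</tex-math></disp-formula>
where
          <disp-formula><tex-math>\Omega_{m_0, m_1} = \left[ x_{min}^{0} + \frac{m_0}{M_0} \left( x_{max}^{0} - x_{min}^{0} \right),\; x_{min}^{0} + \frac{m_0 + 1}{M_0} \left( x_{max}^{0} - x_{min}^{0} \right) \right] \times \left[ x_{min}^{1} + \frac{m_1}{M_1} \left( x_{max}^{1} - x_{min}^{1} \right),\; x_{min}^{1} + \frac{m_1 + 1}{M_1} \left( x_{max}^{1} - x_{min}^{1} \right) \right],</tex-math></disp-formula>
          <disp-formula><tex-math>x_{min}^{s} = \min_{v \in \{start, end\},\; i \in E} x_{v}^{s,i}, \quad x_{max}^{s} = \max_{v \in \{start, end\},\; i \in E} x_{v}^{s,i}, \quad s = 0, 1.</tex-math></disp-formula>
        </p>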
        <p>The numbers of clusters along the vertical and horizontal axes, $M_0$ and $M_1$, are chosen empirically.
We assume that each edge of the graph belongs to only one cluster.</p>
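        <p>The partitioning by area can be illustrated with a short Python sketch. The function name partition_edges and the input layout are hypothetical. To honor the assumption that each edge belongs to only one cluster, the sketch assigns an edge by its start point only; this tie-breaking rule is an assumption of the example and is not stated in the text.</p>
        <preformat><![CDATA[
```python
def partition_edges(edges, m0_count, m1_count):
    """Assign each edge to one rectangular cluster of an M0 x M1 grid.

    edges: dict edge_id -> ((x0_start, x1_start), (x0_end, x1_end)).
    Returns dict cluster_index -> list of edge ids, where
    cluster_index = m0 * M1 + m1.
    """
    points = [p for ends in edges.values() for p in ends]
    mins = [min(p[s] for p in points) for s in (0, 1)]
    maxs = [max(p[s] for p in points) for s in (0, 1)]
    counts = (m0_count, m1_count)

    def cell(point):
        idx = []
        for s in (0, 1):
            span = maxs[s] - mins[s]
            frac = (point[s] - mins[s]) / span if span > 0 else 0.0
            # Points on the upper boundary are clamped into the last cell.
            idx.append(min(int(frac * counts[s]), counts[s] - 1))
        return idx[0] * m1_count + idx[1]

    clusters = {}
    for edge_id, (start, _end) in edges.items():
        # Assumption: the start point decides the cluster of the edge.
        clusters.setdefault(cell(start), []).append(edge_id)
    return clusters


edges = {
    "a": ((0.0, 0.0), (1.0, 1.0)),
    "b": ((9.0, 9.0), (10.0, 10.0)),
    "c": ((0.0, 10.0), (1.0, 9.0)),
}
clusters = partition_edges(edges, 2, 2)
```
]]></preformat>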
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Proximity measure</title>
        <p>To define the proximity between the feature vectors, it is necessary to determine a suitable
distance metric. Different distance functions between feature vectors are available in the
literature, including the Euclidean, Mahalanobis and Hamming distances.</p>
        <p>In this paper, we use a weighted Euclidean distance, modified to use the feature vector
describing transportation network clusters. The distance is considered separately for the parts of
the feature vector describing traffic flows on the current segment $\{V\}$ and in the corresponding
cluster $\{X\}$.</p>
        <p>
          <disp-formula><label>(5)</label><tex-math>d(S, S^{i}) = \sqrt{\sum_{t=1}^{T} \lambda^{T-t+1} \left( V_t - V_t^{i} \right)^2} + \sqrt{\sum_{n=1}^{N} \left( X_n - X_n^{i} \right)^2},</tex-math></disp-formula>
where $0 &lt; \lambda \le 1$,
$T$ denotes the total number of time intervals in the feature vector,
$N$ denotes the total number of elements in the feature vector describing the graph cluster,
$S$ is the feature vector describing current traffic flow,
$S^{i}$ is the feature vector describing the $i$th historical traffic flow,
$V_t$, $V_t^{i}$ are the feature vector values representing respectively current and historical traffic flows
on the selected road segment at time interval $t$,
$X_n$, $X_n^{i}$ are the $n$th feature vector values representing respectively current and historical traffic
flows in the graph cluster.</p>
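        <p>A minimal Python sketch of the distance (5) follows. The function name weighted_distance, the argument layout and the parameter name lam (for the weighting coefficient) are assumptions of this example. The time index t runs from 1 to T, so the most recent interval receives the largest weight when the coefficient is below 1.</p>
        <preformat><![CDATA[
```python
import math


def weighted_distance(v_cur, v_hist, x_cur, x_hist, lam):
    """Distance between a current and a historical feature vector.

    v_cur, v_hist: segment values V_1..V_T (oldest first).
    x_cur, x_hist: cluster components X_1..X_N.
    lam: weighting coefficient, expected in (0, 1].
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must satisfy 0 < lam <= 1")
    total_t = len(v_cur)
    segment_part = sum(
        lam ** (total_t - t + 1) * (v_cur[t - 1] - v_hist[t - 1]) ** 2
        for t in range(1, total_t + 1)
    )
    cluster_part = sum((a - b) ** 2 for a, b in zip(x_cur, x_hist))
    return math.sqrt(segment_part) + math.sqrt(cluster_part)


# With lam = 1 both terms reduce to plain Euclidean distances.
d_plain = weighted_distance([3.0, 0.0], [0.0, 4.0], [1.0], [4.0], lam=1.0)
# A smaller lam discounts the segment term, older intervals most of all.
d_decay = weighted_distance([3.0, 0.0], [0.0, 4.0], [1.0], [4.0], lam=0.5)
```
]]></preformat>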
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Prediction function</title>
        <p>
          The traditional approach for estimating the value in k-NN regression is to choose the average
or the weighted average of the values of its k nearest neighbors [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>A prediction function by the average has the following form:
          <disp-formula><label>(6)</label><tex-math>\hat{X}_{T+1} = \frac{1}{K} \sum_{k=1}^{K} X_{T+1}^{k},</tex-math></disp-formula>
where $\hat{X}_{T+1}$ is the predicted traffic flow value at the next time interval $T + 1$, $X_{T+1}^{k}$ is the traffic
flow value of the $k$th nearest neighbor at the time interval $T + 1$, and $K$ is the total number of
neighbors.</p>
        <p>A prediction function by the weighted average has the following form:
          <disp-formula><label>(7)</label><tex-math>\hat{X}_{T+1} = \sum_{k=1}^{K} \frac{d_k^{-1}}{\sum_{l=1}^{K} d_l^{-1}} X_{T+1}^{k},</tex-math></disp-formula>
where $d_k$ denotes the distance between the feature vector describing the current traffic data and
the $k$th nearest neighbor.</p>
        <p>We use the prediction function by the weighted average.</p>
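        <p>Both prediction functions are easy to express in code. The Python sketch below is illustrative only: the function names are hypothetical, and the handling of a zero distance (returning the mean of the exactly matching neighbors to avoid division by zero) is an assumption that the text does not discuss.</p>
        <preformat><![CDATA[
```python
def predict_average(neighbor_values):
    """Plain average of the neighbors' next-interval values."""
    return sum(neighbor_values) / len(neighbor_values)


def predict_weighted(neighbors):
    """Inverse-distance weighted average.

    neighbors: list of (distance, next_interval_value) pairs.
    """
    # Assumption: an exact match (distance 0) dominates the prediction.
    exact = [value for dist, value in neighbors if dist == 0.0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / dist for dist, _ in neighbors]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, neighbors)) / total


neighbors = [(1.0, 40.0), (2.0, 50.0)]
simple = predict_average([value for _, value in neighbors])
weighted = predict_weighted(neighbors)
```
]]></preformat>
        <p>In this toy case the closer neighbor (distance 1) pulls the weighted prediction below the plain average of 45.</p>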
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. MapReduce implementation</title>
      <p>
        The proposed model of traffic flow prediction uses a large amount of current and historical traffic
flow data. To improve the efficiency of the proposed model, we implement it on the basis of
the MapReduce model [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for distributed computing using the Apache Spark engine [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>MapReduce provides parallel processing of large amounts of data in computing clusters.
The MapReduce model usually consists of three main steps: Map, Shuffle and Reduce. Figure
1 illustrates a computation flowchart of the proposed model based on the MapReduce engine.</p>
      <p>[Figure 1. Computation flowchart of the proposed model: input data (training and testing data), preprocessing phase (data splitting into train_split_0 ... train_split_n and test_split_0 ... test_split_k, Cartesian join), Map phase (map procedure and sort for each pair of splits), Shuffle phase (key:local_top_list pairs), Reduce phase (reduce procedure), output data.]</p>
      <p>As illustrated in Figure 1, the first step is the preparation of input data for the Map phase. First,
the historical and test data are divided into partitions. The optimal number of such partitions
depends on the amount of processed data and the number of computing nodes. According to the
official Apache Spark documentation, the recommended number of partitions is 2-3 per
CPU core in the cluster. Then, ordered pairs of historical and test data partitions are formed
using the Cartesian product. Next, in the Map phase, a map function is applied to each pair
of partitions. This function returns an intermediate set of key / value pairs: the test element
/ local list of k nearest neighbors. In the Shuffle phase, the key-value pairs are grouped and
transferred to the reduce functions. In the final Reduce step, for each test data element, the set
of local k nearest neighbors lists is merged into the resulting (global) list of k nearest neighbors.
The resulting lists of k nearest neighbors are subsequently used to find the predicted traffic
flow value.</p>
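      <p>The data flow described above can be imitated on a single machine. The following Python sketch is not Spark code and not the authors' implementation: it only mimics the Cartesian join, Map, Shuffle and Reduce steps with standard library tools, and all names (map_pair, reduce_lists, knn_mapreduce, the toy distance) are hypothetical.</p>
      <preformat><![CDATA[
```python
import heapq
from collections import defaultdict
from itertools import product


def map_pair(train_part, test_part, k, dist):
    """Map step: for one (train, test) partition pair, emit
    (test_id, local list of the k nearest (distance, target) pairs)."""
    for test_id, test_vec in test_part:
        local = heapq.nsmallest(
            k, ((dist(test_vec, vec), target) for vec, target in train_part)
        )
        yield test_id, local


def reduce_lists(local_lists, k):
    """Reduce step: merge local top-k lists into the global top-k list."""
    return heapq.nsmallest(k, (item for lst in local_lists for item in lst))


def knn_mapreduce(train_parts, test_parts, k, dist):
    grouped = defaultdict(list)  # stands in for the Shuffle phase
    # Cartesian product of training and test partitions.
    for train_part, test_part in product(train_parts, test_parts):
        for test_id, local in map_pair(train_part, test_part, k, dist):
            grouped[test_id].append(local)
    return {tid: reduce_lists(lists, k) for tid, lists in grouped.items()}


def toy_dist(a, b):
    # One-dimensional toy distance, used only for this example.
    return abs(a[0] - b[0])


train_parts = [
    [((0.0,), 10.0), ((5.0,), 20.0)],
    [((1.0,), 30.0), ((9.0,), 40.0)],
]
test_parts = [[("q", (0.5,))]]
result = knn_mapreduce(train_parts, test_parts, k=2, dist=toy_dist)
```
]]></preformat>
      <p>Because every local list already holds the k best candidates of its partition pair, merging the local lists is enough to recover the global k nearest neighbors, which is what makes the Reduce step cheap.</p>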
      <p>The results of evaluating the efficiency of the proposed model based on the MapReduce
concept are presented in Section 5.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments</title>
      <p>
        In the experimental study, we predict the average traffic speed in the city of Samara,
Russia for a short-term prediction horizon of 10 minutes. The dataset contains records for 34 days.
We compare the proposed model with the model described in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This model uses a feature
vector in form (1), where the feature vector considers the time domain and upstream / downstream
road segments (denoted below as "TDUD"). We denote our model as "Clusters" because the
feature vector considers spatial-temporal correlations in graph clusters.
      </p>
      <p>During testing, each day in turn is used as the test set, while the remaining days
serve as the historical dataset (training set). The average performance across the full
data set is then calculated.</p>
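      <p>This evaluation scheme is a leave-one-day-out procedure, which can be sketched as follows. The function names and the evaluate callback are hypothetical placeholders for fitting and scoring a model on one split.</p>
      <preformat><![CDATA[
```python
def leave_one_day_out(days):
    """Yield (test_day, training_days) pairs, one per day in `days`."""
    for i, test_day in enumerate(days):
        yield test_day, days[:i] + days[i + 1:]


def average_score(days, evaluate):
    """Average the per-split error returned by evaluate(test_day, train_days)."""
    scores = [evaluate(test, train) for test, train in leave_one_day_out(days)]
    return sum(scores) / len(scores)


splits = list(leave_one_day_out(["d1", "d2", "d3"]))
```
]]></preformat>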
      <p>We conduct the experiments on an Apache Spark cluster. The traffic flow was predicted for
a small area containing 698 road segments (Figure 2). Each road segment is considered as two
edges with different directions. The total size of the dataset was 3.5 GB.</p>
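      <p>To compare the performance of the proposed model, we use two standard metrics: mean
absolute error (MAE) and mean absolute percentage error (MAPE), which can be formulated as:
        <disp-formula><label>(8)</label><tex-math>\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|V_t - \hat{V}_t|}{V_t},</tex-math></disp-formula>
        <disp-formula><label>(9)</label><tex-math>\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |V_t - \hat{V}_t|,</tex-math></disp-formula>
where $V_t$ is the actual value of traffic flow at time interval $t$, $\hat{V}_t$ is the predicted value for the
same time interval $t$, and $n$ is the total number of traffic flow observations.</p>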
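      <p>Both metrics translate directly into code. The sketch below uses hypothetical function names and returns MAPE as a fraction rather than a percentage, following formula (8) as written; it assumes that all actual values are non-zero, which holds for positive speeds.</p>
      <preformat><![CDATA[
```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute deviation between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """MAPE as a fraction; assumes every actual value is non-zero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


actual_speed = [50.0, 40.0]
predicted_speed = [45.0, 44.0]
mae = mean_absolute_error(actual_speed, predicted_speed)
mape = mean_absolute_percentage_error(actual_speed, predicted_speed)
```
]]></preformat>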
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The paper presents a distributed spatial-temporal model of short-term traffic forecasting based on the
k nearest neighbors non-parametric regression method. In the model, spatial and temporal
characteristics of the transport flow in a compact cluster of the transport network are taken into account
for the feature space description.</p>
      <p>For distributed Big Data processing, we use the MapReduce processing model implemented in the open
source cluster-computing framework Apache Spark. Experimental analysis on real-world traffic data sets
allows us to conclude that the proposed model has a high prediction accuracy and a reasonable execution
time, sufficient for real-time prediction.</p>
      <p>[Figure residue: two panels comparing the "Clusters" and "TDUD" models by day; the recoverable axis labels are "Day" and "MAE".]</p>
      <p>Possible directions of further research include dataset filtering for weekday / weekend
traffic data and the development of graph partitioning algorithms based on the traffic flow
characteristics during a specific time period.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was supported by the Russian Foundation for Basic Research (RFBR), grants
18-07-00605 and 18-29-03135.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Lana</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Ser</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velez</surname>
            <given-names>M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vlahogianni</surname>
            <given-names>E</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Road traffic forecasting: Recent advances and new challenges</article-title>
          <source>IEEE Intelligent Transportation Systems Magazine</source>
          <volume>10</volume>
          <fpage>93</fpage>
          -
          <lpage>109</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Vlahogianni</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golias</surname>
            <given-names>J</given-names>
          </string-name>
          and
          <string-name>
            <surname>Karlaftis</surname>
            <given-names>M</given-names>
          </string-name>
          <year>2004</year>
          <article-title>Short-term traffic forecasting: Overview of objectives and methods</article-title>
          <source>Transport Reviews</source>
          <volume>24</volume>
          <fpage>533</fpage>
          -
          <lpage>557</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Karlaftis</surname>
            <given-names>M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vlahogianni</surname>
            <given-names>E</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Statistical methods versus neural networks in transportation research: Differences, similarities and some insights</article-title>
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>19</volume>
          <fpage>387</fpage>
          -
          <lpage>399</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Shekhar</surname>
            <given-names>S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Williams</surname>
            <given-names>B</given-names>
          </string-name>
          <year>2007</year>
          <article-title>Adaptive seasonal time series models for forecasting short-term traffic flow</article-title>
          <source>Transportation Research Record</source>
          <fpage>116</fpage>
          -
          <lpage>125</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Smith</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            <given-names>B</given-names>
          </string-name>
          and
          <string-name>
            <surname>Keith Oswald</surname>
            <given-names>R</given-names>
          </string-name>
          <year>2002</year>
          <article-title>Comparison of parametric and nonparametric models for traffic flow forecasting</article-title>
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>10</volume>
          <fpage>303</fpage>
          -
          <lpage>321</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Yin</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            <given-names>J</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wong</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2002</year>
          <article-title>Urban traffic flow prediction using a fuzzy-neural approach</article-title>
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>10</volume>
          <fpage>85</fpage>
          -
          <lpage>98</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Zheng</surname>
            <given-names>Z</given-names>
          </string-name>
          and
          <string-name>
            <surname>Su</surname>
            <given-names>D</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Short-term traffic volume forecasting: A k-nearest neighbor approach enhanced by constrained linearly sewing principle component algorithm</article-title>
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>43</volume>
          <fpage>143</fpage>
          -
          <lpage>157</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Wu</surname>
            <given-names>C H</given-names>
          </string-name>
          ,
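          <string-name>
            <surname>Ho</surname>
            <given-names>J M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lee</surname>
            <given-names>D</given-names>
          </string-name>
          <year>2004</year>
          <article-title>Travel-time prediction with support vector regression</article-title>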
          <source>IEEE Transactions on Intelligent Transportation Systems</source>
          <volume>5</volume>
          <fpage>276</fpage>
          -
          <lpage>281</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Sun</surname>
            <given-names>S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zhang</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2007</year>
          <article-title>The selective random subspace predictor for traffic flow forecasting</article-title>
          <source>IEEE Transactions on Intelligent Transportation Systems</source>
          <volume>8</volume>
          <fpage>367</fpage>
          -
          <lpage>373</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Agafonov</surname>
            <given-names>A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>V</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Traffic flow forecasting algorithm based on combination of adaptive elementary predictors</article-title>
          <source>Communications in Computer and Information Science</source>
          <volume>542</volume>
          <fpage>163</fpage>
          -
          <lpage>174</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Smith</surname>
            <given-names>B</given-names>
          </string-name>
          and
          <string-name>
            <surname>Demetsky</surname>
            <given-names>M</given-names>
          </string-name>
          <year>1997</year>
          <article-title>Traffic flow forecasting: Comparison of modeling approaches</article-title>
          <source>Journal of Transportation Engineering</source>
          <volume>123</volume>
          <fpage>261</fpage>
          -
          <lpage>266</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Xia</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Y</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zhang</surname>
            <given-names>Z</given-names>
          </string-name>
          <year>2016</year>
          <article-title>A distributed spatial-temporal weighted model on MapReduce for short-term traffic flow forecasting</article-title>
          <source>Neurocomputing</source>
          <volume>179</volume>
          <fpage>246</fpage>
          -
          <lpage>261</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Lv</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duan</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Z</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wang</surname>
            <given-names>F Y</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Traffic flow prediction with big data: A deep learning approach</article-title>
          <source>IEEE Transactions on Intelligent Transportation Systems</source>
          <volume>16</volume>
          <fpage>865</fpage>
          -
          <lpage>873</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Dean</surname>
            <given-names>J</given-names>
          </string-name>
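          and
          <string-name>
            <surname>Ghemawat</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2008</year>
          <article-title>MapReduce: Simplified data processing on large clusters</article-title>
          <source>Communications of the ACM</source>
          <volume>51</volume>
          <fpage>107</fpage>
          -
          <lpage>113</lpage>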
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <source>Apache Spark</source>
          <year>2018</year>
          (Access mode: https://spark.apache.org/)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>