Study of a Spatial Structure of Urban Traffic Flows Using a Regime-Switching Vector Autoregressive Model*

Dmitry Pavlyuk [0000-0003-3710-9678]

Transport and Telecommunication Institute, Riga, Latvia
Dmitry.Pavlyuk@tsi.lv

Abstract. This paper is devoted to the application of multivariate regime-switching vector autoregressive models that take into account the spatial structure of urban traffic flows. The spatial structure of traffic flows is estimated as statistical relationships between traffic characteristics (including lagged ones) at different road segments. A broad literature review is provided and complex spatial dependencies between road segments are illustrated. We carry out an empirical analysis of the out-of-sample forecasting accuracy of several widely used traffic flow models: the autoregressive integrated moving average model, the vector autoregressive model, and the Markov-switching vector autoregressive model. We provide empirical evidence that forecasting accuracy improves when regime-dependent spatial dependencies of traffic flows at road segments are included in vector autoregressive models. Special attention is paid to the selection of time resolution and its effects on the forecasting results.

Keywords: spatiotemporal modelling, traffic flow, vector autoregressive model, regime-switching model, forecasting

1 Introduction

During the last decades, models of urban traffic flows have become an integral part of intelligent transportation systems. A wide range of stakeholders use the results of traffic forecasting on an everyday basis: many drivers use navigation software for routing and for choosing an efficient departure time; public transport operators schedule bus routes on the basis of expected traffic jams; emergency services use current and expected traffic data for solving vehicle routing problems; road network planning and maintenance operators try to predict the effects of their road activities.
Nowadays, intensive use of urban traffic models is typical for large developed cities, but the rapid growth of modern traffic control technologies and the inevitable spread of autonomous vehicles will lead to a growing demand for new flexible approaches to traffic flow modelling.

* This work was financially supported by the post-doctoral research aid programme of the Republic of Latvia (project no. 1.1.1.2/VIAA/1/16/112, “Spatiotemporal urban traffic modelling using big data”), funded by the European Regional Development Fund.

At the moment the most popular approach to traffic flow forecasting is based on univariate models like the autoregressive integrated moving average (ARIMA) model. Many scientific articles and practical applications model traffic flows at a selected (usually the most critical) road segment without any links to other parts of the road network. This non-spatial approach has several advantages, such as satisfactory forecasting accuracy, simplicity of models, and limited data requirements, but it ignores spatial dependencies between space points (crossroads, road segments, etc.) of traffic flows. As a result, the non-spatial approach becomes inconsistent in the case of real-time changes in the road network such as traffic congestion or accidents. Simultaneous modelling of traffic flows at multiple space points and incorporation of spatial dependencies between space points should lead to higher forecasting accuracy. This expectation, supported by the high availability of real-time traffic data from multiple data sources (sensor loops, video recorders, probe cars’ routes), draws attention to the development of spatiotemporal models and their application to traffic flows. This paper is our small contribution to the methodology of spatiotemporal traffic modelling.
We applied a regime-switching vector autoregressive model in which spatial dependencies between traffic flows at space points are directly included in the specification and may vary over time. The suggested model is validated on real-world traffic data. Our main contributions are the presented evidence of the advantages of spatiotemporal regime-switching vector autoregressive models over widely used univariate time series models, and several practical recommendations for the application of such models.

2 Spatial dependencies in traffic flow modelling

Although the majority of previous studies utilized non-spatial forecasting models, several researchers have recently extended the models to spatial settings. Space-time ARIMA (STARIMA) models seem to be the most widely used models with an explicitly included road network spatial structure. STARIMA represents a vector autoregressive model in which dependencies between space points are provided in the form of a spatial contiguity matrix. The spatial contiguity matrix can be exogenously defined on the basis of road network characteristics or identified using correlations. Kamarianakis and Prastacos [6] applied STARIMA with a fixed spatial contiguity matrix, based on inverse distances between space points, for traffic flow forecasting and demonstrated its advantages over non-spatial models; Min et al. [12] suggested using dynamic spatial contiguity matrices to reflect changes in road networks; Salamanis et al. [14] introduced a graph-theory-based technique for managing spatial dependencies in STARIMA models. However, the assumption of a fixed spatial contiguity matrix is rarely satisfied in practice, so several recent studies applied correlation analysis to the dynamic identification of spatial relationships. Cheng et al. [3] carried out an exploratory space-time autocorrelation analysis for a road network and found the structure to be dynamic and heterogeneous in both space and time.
Min and Wynter [11] utilised separate correlation-based contiguity matrices for different time lags. Recently Schimbinschi et al. [15] suggested a topology-regularized vector autoregressive model, where potential relationships between space points are limited by distance but remain flexible within this distance limit.

The popular least absolute shrinkage and selection operator (LASSO) and the improved graphical LASSO (GLASSO) are frequently used for regularization and dependency selection to limit model complexity and prevent overfitting. Kamarianakis et al. [7] successfully applied the LASSO technique to real-time road traffic forecasting, and later Haworth and Cheng [5] applied graphical LASSO to local spatiotemporal neighbourhood selection. Li et al. [9] used Granger causality tests followed by LASSO regression for similar purposes.

Classical autoregressive models reconstruct a time pattern of the traffic flow and use this pattern for forecasting. Although the accuracy of such models is satisfactory for many practical purposes, it could potentially be improved by exploiting the fact that the traffic flow follows different patterns, which switch over time. For example, the traffic flow pattern could be different for free, stabilized or congested road conditions. Thus regime-switching models, which allow the existence of multiple forms of spatial dependency, have become a promising technique for traffic flow analysis (models are referred to as threshold models if the time points of regime switching are predefined, and as Markov-switching models if regime switching is a Markov process). Advanced regime-switching models have recently been utilized by several researchers. Yu and Zhang [17] showed that the Markov-switching ARIMA model outperforms regular ARIMA specifications in terms of in-sample prediction accuracy. Cetin and Comert [1] also found significant advantages of threshold models in the presence of time intervals with different traffic regimes. Kamarianakis et al. [7] found the threshold ARIMA model useful for discovering spatial dependencies between values of the traffic flow in different locations. The latter paper, to the best of our knowledge, is the only study where spatial dependencies and the threshold regime-switching technique were successfully combined for traffic flow forecasting.

3 Research Methodology

Let us define the space-time point traffic information at a particular space point $i = 1, \dots, N$ and time point $t = 1, \dots, T$ as a vector $X$ of $K$ traffic flow characteristics:

$$X_{i,t} = (X_{1,i,t}, X_{2,i,t}, \dots, X_{K,i,t}) \tag{1}$$

Typically, the set of traffic flow characteristics includes traffic volume, average flow speed, flow density, and road occupancy. A one-step-ahead forecasting model for the traffic flow can be considered as a relationship between the space-time point traffic information and the observed historical information:

$$X_{i,t+1} = f(X_{i,1}, \dots, X_{i,t}) + \varepsilon_{i,t+1} \tag{2}$$

where $f$ is an unknown function representing relationships between space-time points, and $\varepsilon_{i,t+1}$ is a random component. A popular vector autoregressive (VAR) model specification can be presented as [4]:

$$X_{i,t+1} = \sum_{h=0}^{p} A_h X_{i,t-h} + \varepsilon_{i,t+1} \tag{3}$$

where $p$ is the order of the autoregressive component, and $A_h$ are matrices of unknown coefficients. The model can be reduced to a set of independent autoregressive models by restricting $A_h$ to diagonal matrices: $A_h = \alpha_h I_N$.

Moving average (MA) components, included in classical ARIMA models, are omitted in the presented VAR model specification. VAR models provide a good approximation to VARMA models for a sufficiently large lag order $p$ [10] (similarly to Wold’s theorem), so in this research we omit MA components.

Thanks to their flexibility and ability to fit data, VAR models have become an important forecasting tool in many applied areas. But this flexibility comes from a large number of model parameters and brings a risk of overfitting the data and of poor out-of-sample forecasting accuracy.
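As a rough illustration of specification (3), the sketch below fits a first-order VAR for two space points by ordinary least squares and produces a one-step-ahead forecast. It is a simplified stand-in, not the estimation procedure used in this research: it assumes a single lag, zero-mean series and no Bayesian prior, and the names `fit_var1` and `forecast_var1` are hypothetical.

```python
def fit_var1(series):
    """Least-squares estimate of the 2x2 matrix A in x[t+1] = A x[t] + e[t+1].

    `series` is a list of (x1, x2) pairs, one pair per time point, assumed
    to be zero-mean (e.g. deviations from historical averages).
    """
    # Accumulate S_xx = sum x[t] x[t]^T and S_yx = sum x[t+1] x[t]^T.
    s_xx = [[0.0, 0.0], [0.0, 0.0]]
    s_yx = [[0.0, 0.0], [0.0, 0.0]]
    for prev, nxt in zip(series[:-1], series[1:]):
        for i in range(2):
            for j in range(2):
                s_xx[i][j] += prev[i] * prev[j]
                s_yx[i][j] += nxt[i] * prev[j]

    det = s_xx[0][0] * s_xx[1][1] - s_xx[0][1] * s_xx[1][0]
    if abs(det) < 1e-12:
        raise ValueError("lagged observations are collinear; A is not identified")
    inv = [[s_xx[1][1] / det, -s_xx[0][1] / det],
           [-s_xx[1][0] / det, s_xx[0][0] / det]]

    # Normal equations: A = S_yx * inverse(S_xx).
    return [[sum(s_yx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def forecast_var1(a, last):
    """One-step-ahead forecast A x[t] from the last observed pair."""
    return tuple(a[i][0] * last[0] + a[i][1] * last[1] for i in range(2))


# Synthetic noise-free example: the second point influences the first,
# but not the other way round (an asymmetric spatial dependency).
true_a = [[0.5, 0.2], [0.0, 0.8]]
series = [(1.0, 2.0)]
for _ in range(20):
    series.append(forecast_var1(true_a, series[-1]))

a_hat = fit_var1(series)
next_point = forecast_var1(a_hat, series[-1])
```

A diagonal `a_hat` would correspond to the independent autoregressive models mentioned above, while non-zero off-diagonal entries carry the dependency between the two space points.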
A Bayesian approach to the estimation of model parameters partly helps to overcome the problem of overfitting [8]. In this research we use this approach to estimation, and the corresponding models are referred to as BVAR (Bayesian VAR).

There are several widely used metrics of a model’s forecasting accuracy, and we use two of them, the root-mean-square error (RMSE) and the mean absolute error (MAE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2} \tag{4}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \hat{x}_i| \tag{5}$$

See Chai and Draxler [2] for an extended discussion of the advantages of RMSE and MAE as forecasting accuracy metrics.

To prevent the effects of in-sample overfitting of VAR models, we use the out-of-sample values of RMSE and MAE for model comparison. In particular, we implemented a one-step-ahead model validation procedure, in which parameters are re-estimated for every observation in the testing subsample and the resulting model is used to forecast the one following observation.

One of the main assumptions of this research is the presence of dynamic changes in spatial relationships between space points. The underlying mechanism of these dynamics is based on drivers’ awareness of road conditions and their selection of the optimal route. Nowadays many drivers use navigation software with real-time information about current traffic conditions and make their routing decisions accordingly (or just follow navigation software instructions, which are also based on real-time traffic information). As a result, traffic flows at two crossroads may be independent in the case of free movement, but become dependent in the case of congestion at one or both of them.

We applied a regime-switching version of BVAR to reveal the presence of such regimes in the data. We assume the Markovian nature of these regimes, so transitions between regimes (states) depend only on the current regime and not on the earlier process history. We consider Markovian hidden states $s$ that represent actual traffic flow regimes:

$$P(s_t \mid s_0, s_1, \dots, s_{t-1}, X_{i,1}, \dots, X_{i,t}) = P(s_t \mid s_{t-1}) \tag{6}$$

Each regime has its own estimates of the VAR model parameters, so the Markov-switching VAR model MS-VAR(p) is formulated as:

$$X_{i,t+1 \mid s_t} = \sum_{h=0}^{p} A_h(s_{t-h}) X_{i,t-h} + \varepsilon^{s}_{i,t+1} \tag{7}$$

Given the current regime, the model is reduced to the regular BVAR model.

A list of additional techniques utilized in this research includes:

– Simple historical averages for separating the long-term pattern from disturbances in the traffic flow;
– Augmented Dickey-Fuller test for stationarity;
– Granger causality tests for identification of lagged relationships;
– Naïve forecasts as a baseline for models’ forecasting accuracy.

4 Experimental results

4.1 Data description

Similarly to Pavlyuk [13], we utilize traffic flow data publicly available from the Minnesota Department of Transportation. The data is collected by two sensors located before bridges in the Minneapolis city center and serving the same direction (Fig. 1).

Fig. 1. Location of sensor loops

Traffic information includes average traffic flow density values, aggregated at three time resolutions: 10-minute, 15-minute, and 30-minute time frames. The research period is from the 1st of March till the 31st of May 2017 (92 days); the data sets are complete. Typical daily patterns of the analyzed traffic flow densities are presented in Fig. 2 and are fairly usual for urban roads: low intensities during night hours, higher intensities during day hours, and several congestion “spikes” in the morning and evening rush hours.

Fig. 2. Typical daily patterns of analyzed traffic flow densities

Traffic flow densities have clear historical profiles (for each time-of-day and day-of-the-week), which could be predicted by historical average values or by seasonal exponential smoothing (with weekly “seasons”).
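The historical profile idea can be sketched as follows. The example averages densities over each (day-of-week, time slot) pair and returns the deviations from that profile; the input layout and the function names `historical_profile` and `deviations_from_profile` are assumptions made for illustration and are not taken from the original data processing.

```python
from collections import defaultdict


def historical_profile(observations):
    """Average density for every (weekday, slot) pair.

    `observations` is an iterable of (weekday, slot, density) triples, where
    `weekday` is 0..6 and `slot` is the index of the time frame within the
    day (e.g. 0..47 for 30-minute frames). This layout is an assumption.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for weekday, slot, density in observations:
        totals[(weekday, slot)] += density
        counts[(weekday, slot)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def deviations_from_profile(observations, profile):
    """Density minus the historical average of the matching (weekday, slot)."""
    return [density - profile[(weekday, slot)]
            for weekday, slot, density in observations]


# Toy data: two Mondays and one Tuesday, first time frame of the day only.
observations = [(0, 0, 10.0), (0, 0, 14.0), (1, 0, 20.0)]
profile = historical_profile(observations)
deviations = deviations_from_profile(observations, profile)
```

The deviations, rather than the raw densities, are the series to which stationarity tests and VAR-type models would then be applied.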
The research period is relatively short and uniform, so exponential smoothing does not significantly outperform simple historical averages; thus we continue with the latter option:

$$\mathit{Density}_t = \mathit{Density}_{avg,t} + \mathit{DensityStat}_t \tag{8}$$

where $\mathit{Density}_{avg,t}$ are the historical average values for the corresponding day-of-the-week and time-of-day, and $\mathit{DensityStat}_t$ are the deviations from the historical averages, which are the main point of interest. We analyzed $\mathit{DensityStat}_t$ using augmented Dickey–Fuller tests and found them stationary for both sensors.

4.2 Model specification and estimation

Density levels at both sensors become stationary after subtracting historical averages, so a direct application of the VAR model is allowed. We apply Granger causality tests for the initial identification of spatial relationships between traffic flow densities at the two space points. The tests were implemented for density levels and for residuals of univariate ARIMA models (Table 1). The Granger tests support our hypothesis about the presence of spatial relationships, and also give us information about a potential asymmetry of this relationship: the discovered effects of density at the Sensor 2 point on density at the Sensor 1 point have higher significance than those in the opposite direction. Note that this conclusion is preliminary, and VAR models will provide more consistent estimates of these spatial dependencies.

Table 1. Results of Granger’s causality tests for independent ARIMA model residuals

| Cause → Effect | Time resolution | F-statistic | p-value |
|---|---|---|---|
| Sensor 1 → Sensor 2 | 10 min | 3.871 | 0.021 |
| Sensor 1 → Sensor 2 | 15 min | 10.096 | 0.000 |
| Sensor 1 → Sensor 2 | 30 min | 2.417 | 0.089 |
| Sensor 2 → Sensor 1 | 10 min | 5.973 | 0.003 |
| Sensor 2 → Sensor 1 | 15 min | 16.752 | 0.000 |
| Sensor 2 → Sensor 1 | 30 min | 38.246 | 0.000 |

The set of considered models includes:

– Naïve forecasts as a baseline for the comparison of models’ accuracy.
– ARIMA(p, 0, q) model, where the optimal orders of autoregressive lags p and of the moving average component q are selected for every new model on the basis of the Akaike Information Criterion (AIC).
– Spatial BVAR model with 1 and 2 autoregressive lags (models with higher lag orders were also estimated, but did not outperform the presented models and are excluded from this paper for brevity).
– Markov-switching spatial BVAR model (MS-BVAR) with 1 and 2 autoregressive lags.

The main goal of this research is to test the forecasting accuracy of different models, so we exclude the models’ estimation results from this paper and concentrate on their out-of-sample forecasts. To estimate out-of-sample forecasting accuracy we split the data set into training (91 days) and testing (1 day) subsamples (in the background we also executed similar procedures for a testing subsample of 7 days and obtained similar results). For every time point in the testing subsample (48, 96, and 144 points for the 30-minute, 15-minute, and 10-minute time frames respectively) we estimated all models on the previous observations and made one-step-ahead forecasts. These forecasts were used to estimate the models’ out-of-sample forecasting accuracy using the RMSE and MAE metrics. The results are summarized in Table 2. The obtained results are fairly consistent for the RMSE and MAE metrics, so we arbitrarily use MAE for further discussion.

Table 2. Out-of-sample accuracy of models

| Sensor | Metric | Time resolution | Naïve | ARIMA | BVAR (p = 1) | BVAR (p = 2) | MS-BVAR (p = 1) | MS-BVAR (p = 2) |
|---|---|---|---|---|---|---|---|---|
| Sensor 1 | RMSE | 10 min | 6.969 | 6.722 | 6.592 | 6.655 | 6.370 | 6.201 |
| Sensor 1 | RMSE | 15 min | 7.176 | 6.628 | 6.663 | 6.599 | 6.550 | 6.231 |
| Sensor 1 | RMSE | 30 min | 9.306 | 8.492 | 8.058 | 8.141 | 7.935 | 7.892 |
| Sensor 1 | MAE | 10 min | 2.907 | 2.778 | 2.691 | 2.776 | 2.573 | 2.447 |
| Sensor 1 | MAE | 15 min | 2.970 | 2.805 | 2.828 | 2.737 | 2.799 | 2.651 |
| Sensor 1 | MAE | 30 min | 4.187 | 3.909 | 3.620 | 3.700 | 3.587 | 3.589 |
| Sensor 2 | RMSE | 10 min | 1.699 | 1.515 | 1.556 | 1.513 | 1.757 | 1.805 |
| Sensor 2 | RMSE | 15 min | 1.220 | 1.165 | 1.137 | 1.121 | 1.311 | 1.524 |
| Sensor 2 | RMSE | 30 min | 1.374 | 1.301 | 1.297 | 1.307 | 1.293 | 1.302 |
| Sensor 2 | MAE | 10 min | 1.263 | 1.136 | 1.167 | 1.118 | 1.247 | 1.318 |
| Sensor 2 | MAE | 15 min | 0.946 | 0.912 | 0.846 | 0.828 | 0.987 | 1.059 |
| Sensor 2 | MAE | 30 min | 1.005 | 0.936 | 0.893 | 0.903 | 0.910 | 0.889 |

5 Results and discussion

Firstly, we note a significant difference in the forecasting accuracy for the Sensor 1 and Sensor 2 points (MAE values for Sensor 1 forecasts are 2–3 times higher than for Sensor 2 forecasts). This conclusion corresponds to the daily density patterns presented in Fig. 2: the Sensor 1 point is more congested in rush hours and has obvious spikes, which significantly affect forecasting accuracy.

VAR models significantly outperform ARIMA and naïve forecasts for both space points; Markov-switching BVAR models are preferred for Sensor 1, and regular BVAR models are preferred for Sensor 2. This observation is consistent with the studies presented in Section 2: embedding spatial relationships into traffic flow forecasting models leads to increased forecasting accuracy. An important observation is related to the preference for the regime-switching model at Sensor 1 and for the regular (one-regime) model at Sensor 2. Fig. 3 provides clues to this fact.

Fig. 3. Estimated regime probabilities in the MS-BVAR(1) model

Note that the MS-BVAR model identified two regimes: rush hours (Regime 1) and normal traffic flow (Regime 2). In Fig. 2 we observed that the Sensor 1 point is more affected by congested traffic during rush hours, so the regime-switching model provides more accurate forecasts. Congestion is less typical for the Sensor 2 point, so the one-regime BVAR model provides better out-of-sample forecasts, while the MS-BVAR model suffers from over-parametrization and in-sample overfitting. Thus we can conclude that even within a small road segment the best overall accuracy of forecasts is achieved not by one general model, but by a correct combination of models. The problem of combining forecasts is a hot topic in research publications on traffic forecasting; see Vlahogianni et al. [16] for an extended description of scientific challenges in this area.

Another important point of this research lies in the effects of time resolution on models’ forecasting accuracy.
Note that forecasting results for different time frames are not directly comparable due to different forecasting horizons (one-step-ahead forecasts correspond to 10, 15, and 30 minutes respectively). Nevertheless, we can note that a finer time resolution requires a higher number of autoregressive lags, as expected: for 30-minute time frames VAR models with one lag are preferred, while for 15- and 10-minute time frames VAR models with two lags are preferred (a further increase in the number of lags does not have significant effects on the models’ forecasting accuracy for the analyzed time resolutions). There is no solid approach to the selection of the optimal time resolution for traffic flow modelling, so the preferred time frames are a trade-off between detailed but noisy high-resolution data and low-resolution data where some dependencies are smoothed by aggregation. An optimal resolution for discovering spatial relationships should correspond to the time scale of spatial effects (i.e. the time required for drivers and their navigation software to identify congestion and reroute). Impulse response functions, usually estimated with VAR models, could provide evidence for this spatial time lag, so they are suggested as a tool for model identification.

6 Conclusions

This paper contributes to multivariate statistical modelling of traffic flows, where spatial dependencies and potential regimes are explicitly included. We applied modern Markov-switching vector autoregressive models to traffic flow forecasting and compared the out-of-sample forecasting accuracy of several popular models. We also briefly discussed the problems of optimal model selection and time resolution selection. Our main conclusions are:

– Vector autoregressive models with spatial dependencies between space points outperform classical univariate models like ARIMA.
– Spatial relationships between space points should be considered asymmetric, i.e. the effect of changes in traffic flow at one space point on another is not invertible.
– Multiparametric models (such as Markov-switching vector autoregressive models) may suffer from in-sample overfitting, which leads to poor out-of-sample forecasting accuracy. Selection of an appropriate model is space point-specific, so a complex forecast for a road network with a large number of space points should be constructed from a set of models and combined forecasts.
– Selection of time resolution (an aggregation level) is an important step of modelling. Time resolution is a trade-off between detailed but noisy high-resolution data and low-resolution data, where some dependencies are smoothed by aggregation. As the lags related to spatial dependencies depend on the road network and drivers’ behavior, the optimal time resolution should be estimated for every specific data set.

References

1. Cetin, M., Comert, G.: Short-term traffic flow prediction with regime switching models. Transportation Research Record: Journal of the Transportation Research Board 1965, 23–31 (2006)
2. Chai, T., Draxler, R.R.: Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geoscientific Model Development 7(3), 1247–1250 (2014)
3. Cheng, T., Haworth, J., Wang, J.: Spatio-temporal autocorrelation of road network data. Journal of Geographical Systems 14, 389–413 (2012)
4. Hamilton, J.: Time series analysis. Princeton University Press, Princeton, NJ (1994)
5. Haworth, J., Cheng, T.: Graphical LASSO for local spatio-temporal neighbourhood selection. In: Proceedings of the GIS Research UK 22nd Annual Conference (GISRUK). pp. 425–433 (2014)
6. Kamarianakis, Y., Prastacos, P.: Space-time modeling of traffic flow. Computers & Geosciences 31(2), 119–133 (2005)
7. Kamarianakis, Y., Shen, W., Wynter, L.: Real-time road traffic forecasting using regime-switching space-time models and adaptive LASSO. Applied Stochastic Models in Business and Industry 28, 297–315 (2012)
8. Karlsson, S.: Forecasting with Bayesian vector autoregression. In: Handbook of Economic Forecasting, vol. 2, chap. 15, pp. 791–897. Elsevier (2013)
9. Li, L., Su, X., Wang, Y., Lin, Y., Li, Z., Li, Y.: Robust causal dependence mining in big data network and its application to traffic flow predictions. Transportation Research Part C: Emerging Technologies 58(Part B), 292–307 (2015)
10. Lütkepohl, H.: Forecasting Cointegrated VARMA Processes, pp. 179–205. Blackwell Publishing Ltd (2007)
11. Min, W., Wynter, L.: Real-time road traffic prediction with spatio-temporal correlations. Transportation Research Part C: Emerging Technologies 19, 606–616 (2011)
12. Min, X., Hu, J., Chen, Q., Zhang, T., Zhang, Y.: Short-term traffic flow forecasting of urban network based on dynamic STARIMA model. In: Proc. 12th Int. Conf. Intelligent Transportation Systems (ITSC). pp. 1–6 (2009)
13. Pavlyuk, D.: On application of regime-switching models for short-term traffic flow forecasting. In: Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J. (eds.) Advances in Dependability Engineering of Complex Systems: Proceedings of the Twelfth International Conference on Dependability and Complex Systems DepCoS-RELCOMEX, July 2–6, 2017, Brunów, Poland. pp. 340–349. Springer International Publishing, Cham (2018)
14. Salamanis, A., Kehagias, D., Filelis-Papadopoulos, C., Tzovaras, D., Gravvanis, G.A.: Managing spatial graph dependencies in large volumes of traffic data for travel-time prediction. IEEE Transactions on Intelligent Transportation Systems 17, 1678–1687 (2016)
15. Schimbinschi, F., Moreira-Matias, L., Nguyen, V., Bailey, J.: Topology-regularized universal vector autoregression for traffic forecasting in large urban areas. Expert Systems with Applications 82, 301–316 (2017)
16. Vlahogianni, E., Karlaftis, M., Golias, J.: Short-term traffic forecasting: Where we are and where we’re going. Transportation Research Part C: Emerging Technologies 43, 3–19 (2014)
17. Yu, G., Zhang, C.: Switching ARIMA model based forecasting for traffic flow. In: Acoustics, Speech, and Signal Processing, 2004. Proceedings. pp. ii–429 (2004)