=Paper=
{{Paper
|id=None
|storemode=property
|title=From Trajectories of Moving Objects to
Route-based Traffic Prediction and Management
|pdfUrl=https://ceur-ws.org/Vol-652/MPA10-09.pdf
|volume=Vol-652
}}
==From Trajectories of Moving Objects to
Route-based Traffic Prediction and Management==
MPA'10 in Zurich 132 September 14th, 2010
From Trajectories of Moving Objects to
Route-based Traffic Prediction and Management
Gyızı Gidófalvi, Ehsan Saqib
The Royal Institute of Technology (KTH), Geoinformatics, Drottning Kristinas väg 30, 100 44 Stockholm, Sweden
Email: gyozo.gidofalvi@abe.kth.se, esaqib@kth.se
1. Introduction
The rapid growth of demand for transportation, high levels of car dependency caused
by the urban sprawl have exceeded the slow increments in transportation infrastructure
supply in many areas causing severe traffic congestion. Some well-known negative
effects of traffic congestion include: the fuel wasted in idling vehicles leading to
increasing air pollution and carbon dioxide emissions; the time and associated cost lost
by motorists sitting in traffic jams; the wear-and-tear on vehicles and infrastructure as
a result of stop-and-go traffic; and, last but not least, the inflicted stress and fatigue on
motorist causing unnecessary accidents.
In dense urban areas, adding capacity through construction of new facilities is
difficult due to lack of space and prohibitive costs. A more viable approach to cope
with the congestion problem is to monitor traffic congestion, understand the causes of
its formation and development, and use the aforementioned knowledge in traffic
management systems and transportation planning to mitigate traffic congestion.
Earlier efforts to derive information about traffic congestion have used fixed
location sensors. Portals have been built in highways for collecting information such as
flow and punctual speed at certain locations. Such data collection methods require
huge building costs and provide data that allows analysis methods to gain limited
knowledge about the causes of traffic congestion formation and development.
Currently, traffic monitoring centers consider deriving information about traffic
congestion, in particular traffic parameters such as flow and travel time, using both
fixed and mobile data collection methods. Studies using empirical data [2] have
analyzed data provided by two methods for estimating travel-time: Automatic Travel
Time System – ATTS (that identifies number plates at intersections and later matches
them) and, Floating Car Surveys – FCS (where specialized vehicles cover a route back
and forth during the survey period). It was observed that ATTS provide in large part
inaccurate or unfeasible data and FCS have a low sample rate and high cost (Morán C
and Bang KL, 2010).
More recently, as proposed by Gidófalvi and Morán (2010) and many others, the
increasing availability and accuracy of positioning technologies (primarily GPS)
embedded in on-board navigation systems of both private and commercial vehicles and
mobile devices that are carried by the drivers enable the low-cost FCS-like data
collection from large numbers of private vehicles. However, such data collection can
poses serious privacy threats as the trajectories collected refer to the highly sensitive
movements of private individuals. To this extent, this short paper outlines a privacy-
preserving approach that using trajectories of moving objects, in real-time, monitors
traffic congestion, extracts knowledge about the causes of congestion formation and
development, and uses the extracted knowledge in a traffic prediction and management
framework to mitigate congestion.
MPA'10 in Zurich 133 September 14th, 2010
2. Movement Pattern Based Traffic Prediction and Management
The proposed framework has a client-server architecture in which location-aware
mobile clients map match (Jensen and Tradisauskas 2009) their locations to road
segments, perform road-network based location anonymization, and report their
trajectories to the server in form of a continuous stream of timestamped traversed road
segments. The server stores the evolving route trajectories of clients in a compressed
format using route-tree, which due to length limitations is not described here.
Simultaneously, using a sliding window model in an incremental and continuous
fashion (Mozafari et al. 2008, Jiang and Gruenwald 2006) the server extracts and
stores movement and traffic patterns in form of frequent (sub-)routes and congested
road segments (Gidófalvi and Pedersen 2009). The server uses the extracted
knowledge for traffic and congestion prediction at a given time point by combining the
information about the partial-routes in the route-tree with relevant historical frequent
routes to estimate the expected number and speed of clients on each road segment in
the near future. Finally, based on the estimates the server performs traffic management
by providing various traffic advisory services to the clients, e.g., variable speed
advisory, alternative routes, etc. The following subsections further elaborate on
important aspects of the proposed framework.
2.1 Location Privacy and Anonymization
Trajectories of individuals contain highly sensitive personal information; therefore, to
protect the privacy of individuals, they need to be adequately anonymized. A major
privacy threat is the identification of individuals by self-correlating trajectories to
identify frequently visited private locations, e.g., home, work and subsequently cross-
referencing a subset of these private locations to publically available external data
sources, e.g., Yellow pages (Gidófalvi et al. 2010).
A number of privacy protection frameworks have been proposed to protect against
the above described threat. A common approach is to generalize the exact locations of
individuals to cloaking region. Following the traditional notion of k-anonymity, most
privacy protection frameworks construct cloaking regions such that at the time of the
location report there are at least k objects in the given region. As identified in
(Gidófalvi et al. 2010), there are a number of problems with this method. First,
determining the k-anonymity based cloaking region requires trusted components, e.g., a
server, often termed as the anonymizer, that is aware of the exact positions of the
objects. Second, as the cloaking regions depend on the positions of nearby objects, for
a given location the cloaking region varies over time depending on the density of the
objects, allowing an attacker infer the locations of the private location to be within the
intersection of the reported cloaking regions for the location. Finally, k-anonymity
based cloaking regions are likely to be over-protective in low density rural areas, and
under-protective in high density, sensitive hot spots, e.g., red light district.
To overcome the above shortcomings, based on prior work by Gidófalvi et al. 2010,
the proposed framework adopts an privacy protection framework in which clients
specify their requirements of location privacy, based on the notions of anonymization
road segment sets and location probabilities, intuitively saying how precisely they
want to be located in given areas. Such a privacy protection framework serves the
transportation application domain well for two reasons. First, it allows clients to set
their privacy requirements to arbitrarily low values in areas where the threat of being
identified is very low, which arguably constitutes most parts of the routes – only
excluding the parts very near the origin and destination of the route. Second, road-
network based generalization of noisy location readings allow aggregation based data
MPA'10 in Zurich 134 September 14th, 2010
mining methods the extraction of accurate and relevant movement patterns in the
transportation application domain (Gidófalvi and Pedersen 2009).
2.2 Movement and Traffic Patterns
In recent past, a large number of methods have been proposed to extract movement and
traffic patterns. Dodge et al. (2008) provide a good review and taxonomy of these
methods. For the target application at hand the most promising patterns include
trajectory clusters (Rinzivillo et al. 2008) and frequent spatio-temporal sequences, i.e.,
routes (Gidófalvi and Pedersen 2009). The proposed framework adopts the latter for
two reasons. First, the trajectory representation through road-network based
generalization allows well-researched data mining methods, such as (maximal/closed)
frequent itemset mining methods to efficiently extract frequent routes. Second, the so
extracted frequent routes can be efficiently stored- and their relationships to each other
can be easily and efficiently queried in a DBMS.
To preserver clarity, Figure 1 shows the two dimensional projection of such
frequent routes. Figure 1 clearly shows that each segment of every pattern has two
attributes: a vehicle count and a speed. What Figure 1 fails to illustrate is that such
patterns also have a direction and a spatial-temporal relationship between each other.
All four of these pattern attributes are vital for accurate traffic prediction and provide
insight into the creation and development of congestions.
² ² ²
Number of Vehicles Speed (km/h) Number of Vehicles Speed (km/h) Number of Vehicles Speed Deviation
1 - 139 No Data 1-5 No Data 1-5 No Data
139 - 412 1 - 30 5 - 13 1 - 30 5 - 13 <-1
412 - 819 30 - 45 13 - 24 30 - 45 13 - 24 -1 to 0
819 - 1971 24 - 46 24 - 46
45 - 65 45 - 65 0 to 1
65 - 150 65 - 150 >1
0 0.25 0.5 1 Kilometers 0 0.25 0.5 1 Kilometers 0 0.25 0.5 1 Kilometers
Figure 1: Frequent routes with speed profiles. Daily frequent routes and speed profiles
(left). 8am frequent routes and speed profiles (center). 8am speed deviations (right).
3. Empirical Study Based on Real World Trajectories
The performance and scalability of the proposed framework will be tested on the
following real world trajectory data set (stream) provided by Trafik Stockholm. The
trajectory data set contains the GPS readings of 1500 taxis and 400 trucks travelling on
the streets of Stockholm. Each taxi produces a reading once every 60 seconds
approximately. This reading includes only taxi identification and location information.
Taxis produce readings less frequently when they are not carrying any passengers.
Trucks use more recent and more accurate GPS devices that produce readings once
every 30 seconds and include identification, location, speed and heading information.
The peak data rate for the whole city is over 1000 readings per minute, and there are
approximately 170 million readings during the course of a year.
MPA'10 in Zurich 135 September 14th, 2010
The study will assess the performance of the system in terms of prediction accuracy
and throughput and will evaluate demonstrate the scalability of the approach by
replaying the data at several times the actual data rate. The system implementation of
the framework will utilize an IBM InfoSphere Streams parallel and distributed DSMS
running on a cluster of commodity hardware.
References
Dodge S, Weibel R, and Lautenschütz A-K, 2008, Towards a Taxonomy of Movement Patterns.
Information Visualization, 7(3):240–252
Jensen CS and Tradisauskas N, 2009, Map Matching. Encyclopedia of Database Systems 2009:1692–
1696
Jiang N and Gruenwald L, 2006, CFI-Stream: Mining Closed Frequent Itemsets in Data Streams. In:
Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, Philadelphia, USA, 592–597.
Gidófalvi G and Pedersen TB, 2009, Mining Long, Sharable Patterns in Trajectories of Moving Objects.
Geoinformatica, 13(1):27–55.
Gidófalvi G, Huang X, and Pedersen TB, 2010. Probabilistic Grid-Based Approaches for Privacy
Preserving Data Mining on Moving Object Trajectories. In: Bonchi F, Ferrari E (eds), Privacy-
Aware Knowledge Discovery: Novel Applications and New Techniques. CRC PRESS.
Gidófalvi G and Morán C, 2010, Estimating Traffic Performance in Road Networks from Anonymized
GPS Vehicle Probes. Workshop on Movement Research: Are you in the flow? at the 13th AGILE
International Conference on Geographic Information Science.
Morán C and Bang KL, 2010, Reliability of Congestion Performance Measures. In: Proceedings of the
Institution of Civil Engineers - Transport.
Mozafari B, Thakkar H, and Zaniolo C, 2008, Verifying and Mining Frequent Patterns from Large
Windows over Data Streams. In: Proceedings of the 2008 IEEE 24th International Conference
on Data Engineering, Cancun, Mexico, 179–188.
Rinzivillo S, Pedreschi D, Nanni M, Giannotti F, Andrienko N and Andrienko G, 2008, Visually Driven
Analysis of Movement Data by Progressive Clustering. Information Visualization, 7(3):225–239.