=Paper=
{{Paper
|id=Vol-2227/KDD-UMCit2018-Paper4
|storemode=property
|title=Potholes vs. Speed Bumps: A Multivariate Time Series Classification Approach
|pdfUrl=https://ceur-ws.org/Vol-2227/KDD-UMCit2018-Paper4.pdf
|volume=Vol-2227
|authors=Ariel Monteserin
|dblpUrl=https://dblp.org/rec/conf/kdd/Monteserin18
}}
==Potholes vs. Speed Bumps: A Multivariate Time Series Classification Approach==
<pdf width="1500px">https://ceur-ws.org/Vol-2227/KDD-UMCit2018-Paper4.pdf</pdf>
<pre>
Peer-reviewed Papers

Potholes vs. speed bumps: a multivariate time series classification
                           approach

                                               Ariel Monteserin

                   ISISTAN, CONICET-UNICEN, Campus Universitario, Tandil, Argentina.
                              Email: ariel.monteserin@isistan.unicen.edu.ar


      Abstract. In this work, we present a preliminary approach to distinguish potholes from speed
      bumps by analyzing the acceleration values sensed by a mobile device. The information of the
      accelerometers is gathered by an experimental mobile application developed to automatically detect
      potholes. A driver, who has previously installed the application in her smartphone, places the device
      in a fixed position inside the vehicle. Thus, this application records the accelerometer oscillations
      and the place where the vehicle transits. Then, if the road is damaged, the vibrations produced in the
      vehicle can be captured by the accelerometers indicating the pothole. However, in a road there are
      other structures that can produce similar effects: speed bumps. For both potholes and speed bumps,
      the accelerometers of the mobile device produce a sequence of oscillations along the three axes (X, Y
      and Z). We model these sequences as multivariate time series and then classify them by using a
      temporal classification approach. The preliminary experiments were carried out with real-world data and
      showed promising accuracy.
      Keywords: pothole detection, multivariate time series classification.


1   Introduction

Maintaining the road infrastructure free of potholes is a difficult task. Several works have been proposed
to solve the problem of detecting potholes by using mobile devices [2,5,1]. All these works share the idea
that potholes can be detected by analyzing the acceleration values sensed by a mobile device. However,
not all vibrations in a vehicle are produced by potholes. The road infrastructure usually has speed bumps
and other structures that produce vibrations in a vehicle, but they cannot be considered road defects.
This fact produces a large number of false positives, which degrades the accuracy of the pothole detection.
    In this work, we propose a supervised learning approach based on temporal classification to distinguish
potholes from speed bumps. An experimental mobile application records the accelerometer oscillations and
GPS information. With this information, we can identify whether a vibration event represents a pothole or a
speed bump. Each event is composed of a sequence of oscillations produced by the accelerometer of the
mobile device. These sequences include the oscillations from the moment the event occurs until the vehicle
stabilizes (i.e. the oscillations disappear). An entry of a sequence is composed of three pairs of values (the raw
value of the accelerometer and its difference with the previous value), one pair per axis X, Y and Z. We model these
sequences as a multivariate time series.
    A multivariate time series is a sequence of numerical vectors [8]. In this article, we use a supervised
learner for multivariate time series to classify the events recorded by the mobile device. This supervised
learner is based on a generic constructive induction technique to allow for domains where instances exhibit
recurring substructures [3]. These substructures are extracted, and a clustering algorithm is used to
construct synthetic attributes based on the presence or absence of certain substructures. Then, standard
learners can be applied. The experiments were carried out using real-world data and showed an accuracy
of 63.64% in the differentiation of potholes and two types of speed bumps (speed humps and street gutters).
    The rest of the article is organized as follows. Section 2 briefly describes the approach to detect potholes
and some related concepts. In Section 3, we present the preliminary results. Finally, Section 4 presents the
conclusions and future work.

KDD 2018 Workshop on Knowledge Discovery and User Modelling for Smart Cities                        Page 36 of 40
August 20, 2018 - London, United Kingdom

Copyright © 2018 for the individual papers by the papers’ authors. Copying permitted for private and
academic purposes. This volume is published and copyrighted by its editors.


                       Fig. 1: Example of a speed hump included in the experiments.


2    Approach description

When a vehicle travels along a road, potholes and other structures affect its stability. These variations in
the vehicle stability can be detected by the accelerometers of a mobile device (e.g. a smartphone). An
accelerometer provides data on the acceleration along the three coordinate axes with (almost) continuous
updates. This allows us to detect the slightest movements. Thus, when the device is placed in a fixed position
inside the vehicle, its movements reflect the movements of the vehicle. Then, we can detect
the presence of potholes or speed bumps in the road, and their severity in terms of destabilization of
the vehicle, by analyzing the accelerometer information. Moreover, if we combine this information with
geolocation information, we can determine where these events occur. We call these events stability
events. In particular, we consider two types of speed bumps: speed humps and street gutters (a depression
running parallel to a street designed to collect rainwater, but that usually crosses perpendicular streets).
Figures 1 and 2 show examples of a speed hump and a street gutter, respectively.
    The accelerometer information gathered by the application consists of tuples acc = {rawX, diffX, rawY, diffY,
rawZ, diffZ}. The variables rawX, rawY and rawZ correspond to the acceleration values sensed by the
sensors along the axes X, Y and Z, respectively. The variables diffX, diffY and diffZ represent
the differences between the current raw value and the previous one on each axis. After losing stability, the
vehicle takes several seconds to stabilize again. For this reason, each stability event is composed of several
tuples acc. We therefore define a stability event as a sequence se = {(t_1, acc_1), (t_2, acc_2), ..., (t_n, acc_n)},
where acc_i is the accelerometer information at time t_i within the stability event se.
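
The tuple and sequence definitions above can be illustrated with a minimal Python sketch. It is not the code of the
mobile application: the names Acc and build_stability_event, the sample readings, and the choice of setting the
differences of the first tuple to 0.0 are assumptions made only for this example.

```python
from typing import List, NamedTuple, Tuple


class Acc(NamedTuple):
    """One accelerometer tuple: raw value and difference for each axis."""
    rawX: float
    diffX: float
    rawY: float
    diffY: float
    rawZ: float
    diffZ: float


def build_stability_event(
    samples: List[Tuple[float, float, float, float]],
) -> List[Tuple[float, Acc]]:
    """Turn (t, x, y, z) readings into the sequence se = [(t_1, acc_1), ...].

    Assumption: the first reading has no predecessor, so its
    differences are set to 0.0.
    """
    event = []
    prev = None
    for t, x, y, z in samples:
        if prev is None:
            dx = dy = dz = 0.0
        else:
            dx, dy, dz = x - prev[0], y - prev[1], z - prev[2]
        event.append((t, Acc(x, dx, y, dy, z, dz)))
        prev = (x, y, z)
    return event


# Hypothetical readings: (timestamp in seconds, X, Y, Z acceleration).
readings = [(0.0, 0.1, 0.0, 9.8), (0.1, 0.3, -0.2, 12.1), (0.2, 0.2, 0.1, 7.4)]
se = build_stability_event(readings)
```

Each element of se pairs a timestamp with a six-valued tuple, so the event is a multivariate time series with six
variables per time step.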

                     Fig. 2: Example of a street gutter included in the experiments.

    In this context, a sequence se represents a multivariate time series. A multivariate time series is a
sequence of numerical vectors [8]. Several approaches have been proposed to classify multivariate time
series when a class can be associated with each series [4,7,6,8]. In particular, Kadous and Sammut [3] propose a
feature construction technique that parameterizes sub-events of the training set and clusters them to
construct features. Once the features are obtained, a standard classifier is built to classify new instances.
Some of the components that can be applied to construct features are global extractors (duration, mean,
minimum, maximum and mode of a variable of the sequence), and the following metafeatures:

 – Increasing: it detects when a sequence is increasing.
 – Decreasing: it detects when a sequence is decreasing.
 – Plateau: it detects when a sequence is not changing.
 – LocalMax and LocalMin: they detect when a sequence has a local maximum or minimum, respectively.
 – RLE: Run-Length Encoding is a process where a single value repeated several times is encoded as
   that value, its starting point and its duration.

In the experiments, we use the Kadous and Sammut approach. However, it is worth noting
that other multivariate time series classification approaches can be applied.
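
To make the metafeatures listed above more concrete, the following Python sketch detects them on a single variable
of a sequence. It is a simplified illustration and not the TClass implementation: the function names, the tolerance
eps used to decide that a value is "not changing", and the sample values are assumptions introduced here.

```python
from itertools import groupby
from typing import List, Tuple


def run_length_encode(values: List[str]) -> List[Tuple[str, int, int]]:
    """RLE: encode each run of equal values as (value, start index, duration)."""
    runs = []
    start = 0
    for value, group in groupby(values):
        length = len(list(group))
        runs.append((value, start, length))
        start += length
    return runs


def trend_labels(series: List[float], eps: float = 0.05) -> List[str]:
    """Label each step between consecutive samples.

    eps is a hypothetical tolerance: changes no larger than eps
    in absolute value count as a plateau.
    """
    labels = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        if delta > eps:
            labels.append("increasing")
        elif delta < -eps:
            labels.append("decreasing")
        else:
            labels.append("plateau")
    return labels


def local_extrema(series: List[float]) -> List[Tuple[str, int]]:
    """Return ("LocalMax" or "LocalMin", index) for strict interior extrema."""
    found = []
    for i in range(1, len(series) - 1):
        if series[i - 1] < series[i] > series[i + 1]:
            found.append(("LocalMax", i))
        elif series[i - 1] > series[i] < series[i + 1]:
            found.append(("LocalMin", i))
    return found


# Hypothetical rawZ values of one stability event.
raw_z = [9.8, 9.8, 11.0, 12.5, 10.0, 8.0, 9.7, 9.7]
segments = run_length_encode(trend_labels(raw_z))
extrema = local_extrema(raw_z)
```

Here the run-length encoding is applied to the step labels, so each increasing, decreasing or plateau segment is
reported once together with its starting point and duration.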


3   Preliminary results

The experiments were carried out with real-world information collected in Tandil, Buenos Aires, Argentina.
In total, 48 journeys were processed. Moreover, we manually identified 24 potholes, 54 speed
humps and 33 street gutters in the streets through which the vehicle transited. Then, taking into account
the potholes and the speed bumps identified, we extracted 371 stability events: 184 events
associated with potholes, 128 events associated with speed humps, and 59 events associated with street gutters.
On average, each event was composed of 19.04 tuples acc.

                                              Actual class
                 Predicted class    Pothole   Speed hump   Street gutter   Precision
                 Pothole               42         13             7           0.677
                 Speed hump             6         20             2           0.714
                 Street gutter          7          5             8           0.400
                 Recall              0.764      0.526         0.471

                              Table 1: Confusion matrix and metrics.

    To run the experiment we used TClass (footnote 1). TClass is the implementation of the approach to classify
multivariate time series proposed by Kadous and Sammut in [3]. Since TClass allows us to define different
feature extractors, we tested different configurations to find the one with the best classification
accuracy.
    The best results were obtained by applying the global extractors duration, mean, min and max to the
six attributes of the sequences (rawX, diffX, rawY, diffY, rawZ and diffZ) and using a J48 classification tree.
The accuracy of the approach was 63.64%. Table 1 shows the confusion matrix and the precision and recall
metrics for each class. The best precision and recall were obtained when predicting speed humps and potholes,
respectively. In contrast, the worst individual metrics were obtained when predicting street gutters. We think
that this is because of the low number of stability events produced by street gutters in the dataset.
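
As an illustration of the global extractors named above, the following Python sketch maps one stability event to a
fixed-length set of features that a standard learner can consume. It is not TClass code: the function name
global_features, the feature naming scheme, the sample event, and the definition of duration as the time elapsed
between the first and the last tuple are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

ATTRIBUTES = ("rawX", "diffX", "rawY", "diffY", "rawZ", "diffZ")

# One event: a list of (timestamp, six accelerometer values) pairs.
Event = List[Tuple[float, Sequence[float]]]


def global_features(event: Event) -> Dict[str, float]:
    """Compute duration plus mean, min and max of each of the six attributes.

    Assumption: duration is the time between the first and the last tuple.
    """
    features = {"duration": event[-1][0] - event[0][0]}
    for idx, name in enumerate(ATTRIBUTES):
        column = [acc[idx] for _, acc in event]
        features[f"mean_{name}"] = mean(column)
        features[f"min_{name}"] = min(column)
        features[f"max_{name}"] = max(column)
    return features


# Hypothetical event with three tuples.
event = [
    (0.0, (0.1, 0.0, 0.0, 0.0, 9.8, 0.0)),
    (0.1, (0.3, 0.2, -0.2, -0.2, 12.0, 2.2)),
    (0.2, (0.2, -0.1, 0.1, 0.3, 7.0, -5.0)),
]
features = global_features(event)
```

Because every event yields the same feature names regardless of its length, events with different numbers of tuples
become comparable rows of a training table.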
    Moreover, we grouped speed humps and street gutters into a common class in order to differentiate
potholes from speed bumps. With this grouping, the accuracy increases to 70%.
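
The reported metrics can be re-derived from the confusion matrix in Table 1. The following Python sketch does so,
including the grouping of speed humps and street gutters into one class; the helper names are ours and are only
illustrative.

```python
LABELS = ["Pothole", "Speed hump", "Street gutter"]

# Rows are predicted classes, columns are actual classes (values from Table 1).
CONFUSION = [
    [42, 13, 7],
    [6, 20, 2],
    [7, 5, 8],
]


def accuracy(matrix):
    """Fraction of events on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def precision(matrix, i):
    """Correct predictions of class i over all predictions of class i (row sum)."""
    return matrix[i][i] / sum(matrix[i])


def recall(matrix, i):
    """Correct predictions of class i over all actual events of class i (column sum)."""
    return matrix[i][i] / sum(row[i] for row in matrix)


def merge_classes(matrix, a, b):
    """Return a new matrix in which classes a and b form a single class."""
    groups = [[k] for k in range(len(matrix)) if k not in (a, b)]
    groups.insert(min(a, b), [a, b])
    return [
        [sum(matrix[r][c] for r in gi for c in gj) for gj in groups]
        for gi in groups
    ]


# Merge "Speed hump" (index 1) and "Street gutter" (index 2) into "speed bump".
grouped = merge_classes(CONFUSION, 1, 2)

print(f"3-class accuracy: {accuracy(CONFUSION):.4f}")
for i, label in enumerate(LABELS):
    print(f"{label}: precision={precision(CONFUSION, i):.3f}, recall={recall(CONFUSION, i):.3f}")
print(f"2-class accuracy: {accuracy(grouped):.4f}")
```

With the values of Table 1, 70 of the 110 events lie on the diagonal (63.64%), and after grouping 77 of the 110
events are correctly classified (70%).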


4     Conclusions and future work

In this work, we propose a preliminary approach that allows us to distinguish between potholes and speed
bumps. This approach allows us to reduce the number of false positives produced during the pothole
detection process. Moreover, this approach is key if we want to make the application available in multiple
cities with the least effort. The preliminary results obtained from real-world data were promising.
    Future work will focus on more extensive experimentation. Moreover, we will analyze the use of other
multivariate time series classification approaches.


References
1. Yu-chin Tai, Cheng-wei Chan, and Jane Yung-jen Hsu. Automatic road anomaly detection using smart mobile
   device. 2010.
2. A. Fox, B. V. K. V. Kumar, J. Chen, and F. Bai. Multi-lane pothole detection from crowdsourced undersampled
   vehicle sensor data. IEEE Transactions on Mobile Computing, 16(12):3417–3430, Dec 2017.
3. Mohammed Waleed Kadous and Claude Sammut. Classification of multivariate time series and structured data
   using constructive induction. Machine Learning, 58(2):179–216, Feb 2005.
4. Chuanjun Li, Latifur Khan, and Balakrishnan Prabhakaran. Real-time classification of variable length multi-
   attribute motions. Knowl. Inf. Syst., 10(2):163–183, August 2006.
5. H.M. Ngwangwa, P.S. Heyns, H.G.A. Breytenbach, and P.S. Els. Reconstruction of road defects and road
   roughness classification using artificial neural networks simulation and vehicle dynamic responses: Application
   to experimental data. Journal of Terramechanics, 53:1–18, 2014.
6. Patrick Schäfer and Ulf Leser. Multivariate time series classification with WEASEL+MUSE. CoRR,
   abs/1711.11343, 2017.
7. Lin Wang, Zhigang Wang, and Shan Liu. An effective multivariate time series classification approach using
   echo state network and adaptive differential evolution algorithm. Expert Systems with Applications,
   43:237–249, 2016.
8. Zhengzheng Xing, Jian Pei, and Eamonn Keogh. A brief survey on sequence classification. SIGKDD Explor.
   Newsl., 12(1):40–48, November 2010.

Footnote 1: https://sites.google.com/site/waleedkadous/software/tclass



</pre>