<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>In-Vehicle Big Data Exploration for Road Maintenance</article-title>
        <subtitle>(Discussion Paper)</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Devis Bianchini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valeria De Antonellis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Massimiliano Garda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Brescia, Dept. of Information Engineering Via Branze 38</institution>
          ,
          <addr-line>25123 - Brescia</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Big Data Exploration techniques may benefit from the availability of huge amounts of data (e.g., collected from IoT infrastructures) for improving the resilience of monitored systems. In this paper, we discuss the application of such techniques in a research project to pursue mobility resilience in Smart City applications. Among the aspects to be considered for enabling resilience in mobility, we specifically focus on road maintenance, gathering data streams from vehicles equipped with sensors and designing proper exploration scenarios. Scenarios rely on three components as main pillars of the proposed approach: (i) a multi-dimensional model apt to represent the road network and to enable data exploration; (ii) data summarisation techniques, in order to simplify exploration of high data volumes; (iii) a measure of relevance, aimed at drawing the attention of road maintainers to relevant data only.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-dimensional model</kwd>
        <kwd>data summarisation</kwd>
        <kwd>big data exploration</kwd>
        <kwd>smart and resilient mobility</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In recent years, the increasing availability of big data has become a key factor in shifting
towards a data-centric vision of modern Smart Cities [1]. In particular, the concept of smart
mobility, and its impact on the transportation of goods and people, is experiencing radical
changes, capitalising on big data generated from sensor networks and IoT devices [2]. Indeed,
through such data, issues that arise may be promptly noticed and tackled, increasing the
efficiency of delivered services [3]. For instance, sensor data in vehicles may provide, in near
real-time, valuable information about the quality of the area-wide road surface and may be
used by road maintainers to focus monitoring and maintenance activities on urban and public
infrastructure, enhancing mobility resilience. In this landscape, road maintainers should
be equipped with valuable tools to gain insights from the data and ensure a safer and more
efficient infrastructure. Nevertheless, the variety and volume of collected data call for models,
tools and methods for data representation and exploration [4]. To support road maintainers
in analysing and assessing surface conditions of roads, in this paper we propose an approach
based on big data exploration techniques, consisting of the following three components: (i) a
multi-dimensional model, apt to represent portions of the road network (based on distinguishing
features such as type of road, area/district, mileage extent) and to enable their data-driven
exploration; (ii) data summarisation techniques, in order to simplify exploration of high data
volumes by extracting snapshots of measures gathered by vehicles, which evolve over time;
(iii) a measure of relevance, aimed at attracting the road maintainers’ attention on portions of
the road network in the multi-dimensional model that present substantial changes over time
(e.g., to plan corrective actions in the case of road conditions decay). This paper illustrates the
application of the proposed approach in the scope of the MoSoRe project (Italian acronym for
“Mobilità Sostenibile e Resiliente”), whose aim is to investigate the resilience of mobility systems
and infrastructure in the city of Brescia (Italy). Specifically, three exploration scenarios have
been devised to assist road maintainers when inspecting road surface conditions. A preliminary
presentation of the approach has been provided in [5].</p>
      <p>[Figure 1: E-R diagram of the data collected in the MoSoRe project. The legend distinguishes trip-related entities from anomaly-related entities; the diagram includes Vehicle, Trip, Trip Details, GPS Trail, Anomaly and Accelerometric Trail, connected by the registration, detection and "associated with" relationships.]</p>
      <p>The paper is organised as follows: in Section 2 a conceptual model for data collected within
the MoSoRe project is described; Section 3 presents the ingredients of the big data exploration
approach; exploration scenarios for smart mobility are illustrated in Section 4, whereas Section 5
describes implementation and experimental evaluation; related work is discussed in Section 6;
finally, Section 7 closes the paper, sketching future research directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Conceptual data model</title>
      <p>In the MoSoRe project, a fleet of commercial vehicles has been equipped with black boxes,
gathering data when vehicles transit on specific road portions (i.e., delimited sections of roads
covered during daily trips). Collected data regards both contextual information (e.g., details of
the journey) and measures from gravity acceleration sensors, which can be used to infer road
surface conditions. Collected data is conceptually represented through the Entity-Relationship
(E-R) diagram in Figure 1 as follows.</p>
      <p>Trip-related data. This kind of data regards the journeys made by vehicles during the
monitored period (at the time of writing, data is transferred from vehicles on a daily basis).
The black box collects the position (in terms of GPS coordinates) of the hosting vehicle at the
beginning and end of the trip, twice per kilometre and when some driving events occur (e.g., burst of
speed, hard braking). For privacy reasons, only a few characteristics of the vehicle are
recorded (e.g., the type, to denote either a private or a commercial vehicle), and they cannot be
used to infer any information regarding the owner.</p>
      <p>Anomaly-related data. This kind of data concerns anomalous events recorded by the black
box of a vehicle, which can be: (i) induced by bad road surface conditions or (ii) caused by
car accidents. For both types of anomalous events, the black box collects data from the
accelerometers on the X-Y-Z axes, at a rate which varies from 200 up to 800 Hz for each trace,
depending on the model of the black box. Hence, depending on the frequency, accelerometric
traces from different vehicles may have a different number of samples. In the scope of the
research project, the big data exploration approach presented in this paper (whose overview
is reported in Figure 2) is rooted in the analysis of anomalous events related to road surface
conditions; indeed, each black box assigns a probabilistic estimate to the cause that induced the
event (either hole, bump, depression, rough ground or undetermined, i.e., not assignable to any
specific category). Notably, upon anomalous event occurrence, a GPS trail is recorded and
the position of the vehicle is sampled.</p>
      <p>[Figure 2: Overview of the approach. The data stream from vehicles feeds multi-dimensional and incremental clustering (Sect. 3), which covers generation/update of snapshots and multi-dimensional organisation of snapshots, followed by relevance-based data exploration (Sect. 4), which covers exploration scenario selection, identification of relevant data and scenario-based exploration. The legend distinguishes manual, automatic and semi-automatic tasks.]</p>
    </sec>
    <sec id="sec-3">
      <title>3. The ingredients of the Big Data Exploration approach</title>
      <p>In this section, we briefly report the three building blocks of the big data exploration approach
presented in [6], which is adopted in the MoSoRe project to cope with the variety, velocity and
volume of anomaly-related data, in order to let road maintainers monitor the status of the road
surfaces and plan corrective actions.</p>
      <p>Multi-Dimensional Model. The proposed Multi-Dimensional Model (MDM) is grounded on
the following two main pillars: (i) dimensions and (ii) exploration facets. A dimension d<sub>i</sub> is an
entity representing a single aspect of a road portion (e.g., the city area it belongs to, or whether it is part of
an urban or suburban road), defined on a domain Dom(d<sub>i</sub>). A combination of different dimension
instances is apt to identify a specific road portion and constitutes a facet φ = {v<sub>1</sub>, …, v<sub>n</sub>},
where v<sub>i</sub> ∈ Dom(d<sub>i</sub>). We denote with Φ the set of all facets, representing road portions
with homogeneous characteristics. An example of facet φ<sub>1</sub> ∈ Φ may contain instances of the
following four dimensions: {RoadType, SpeedLimit, District, MileageExtension}. For φ<sub>1</sub>, RoadType
= Urban and District = District1 are sample dimension instances. The MDM is leveraged to
organise the measures associated with physical quantities recorded by black boxes on vehicles,
the latter referred to as features. Measures are conceived as a stream or a time series and are
associated with specific road portions.</p>
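      <p>As a minimal sketch of how facets and dimension instances might be represented in code, the following Python fragment models a facet as a set of (dimension, instance) pairs and groups measures by facet. It is an illustration only, not the project's implementation: the helper names (make_facet, roll_up, group_by_facet), the instance values and the measure values are all hypothetical, and the roll-up shown here simply drops a dimension, which is a simplification of a hierarchy-based OLAP roll-up.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Dimension names taken from the example facet in the text.
DIMENSIONS = ("RoadType", "SpeedLimit", "District", "MileageExtension")


def make_facet(**instances):
    """Build a facet as a hashable set of (dimension, instance) pairs."""
    unknown = set(instances) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return frozenset(instances.items())


def roll_up(facet, dimension):
    """Coarsen a facet by dropping one dimension (simplified roll-up)."""
    return frozenset((d, v) for d, v in facet if d != dimension)


def group_by_facet(measures):
    """Associate each measure with the facet of its road portion."""
    groups = defaultdict(list)
    for facet, value in measures:
        groups[facet].append(value)
    return dict(groups)


# Hypothetical dimension instances, for illustration only.
f1 = make_facet(RoadType="Urban", SpeedLimit=50,
                District="District1", MileageExtension="0-1km")
f2 = make_facet(RoadType="Urban", SpeedLimit=50,
                District="District2", MileageExtension="0-1km")

# Illustrative measure values; they are not project data.
measures = [(f1, 0.8), (f2, 1.1), (f1, 0.9)]
by_facet = group_by_facet(measures)

# Dropping District merges the two urban facets into one coarser facet.
coarse = group_by_facet((roll_up(f, "District"), v) for f, v in measures)
```
]]></preformat>
      <p>Representing a facet as an immutable set keeps it usable as a dictionary key, so measures can be organised per road portion and re-aggregated when the exploration moves to a coarser level.</p>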
      <p>Clustering-based data summarisation. Once attention is focused on a specific facet, the
evolution over time of the stream of records from a single black box can be used to ascertain
whether road surface conditions are diverging from reference values. To obtain an effective
representation of the temporal evolution of a road surface condition, data summarisation based
on an incremental clustering algorithm is applied. In particular, the clustering algorithm, which
is based on the CluStream [7] algorithm, takes as input the stream of measures related to the
observed features. At a given time t, the algorithm produces as output a set of syntheses S(t),
where each synthesis corresponds to a cluster of records, starting from records collected from
timestamp t − Δt to timestamp t and built on top of the previous set of syntheses S(t − Δt),
for a given road portion. Roughly speaking, syntheses conceptually represent a specific status
of road surface conditions. A set of syntheses at a given timestamp t corresponds to a snapshot
SN(t), a data structure defined as the tuple ⟨S(t), F, φ, type, id⟩, where:
(i) S(t) is the set of syntheses generated at time t; (ii) F is the set of the monitored features;
(iii) φ is the facet that identifies the road portion; (iv) type is the type of anomalous event the
snapshot refers to (i.e., either hole, bump, rough ground, depression or undetermined event); (v)
id is the identifier of the anomalous event as assigned by the black box of a vehicle.</p>
      <p>Identification of relevant data. Given two snapshots SN(t<sub>1</sub>) and SN(t<sub>2</sub>) (with
t<sub>2</sub> &gt; t<sub>1</sub>), changes between syntheses in the two snapshots are apt to identify relevant data, which
can be proposed to road maintainers as a starting point for exploration. In particular, the measure of
relevance is based on the notion of distance between the syntheses sets S(t<sub>1</sub>) ∈ SN(t<sub>1</sub>) and
S(t<sub>2</sub>) ∈ SN(t<sub>2</sub>), obtained by combining different factors to detect movements of syntheses,
expansion/contraction and changes in density (i.e., the difference in the number of measures
aggregated by the syntheses with respect to their hyper-volume). Snapshots for which the
distance value from a reference snapshot SN(t<sub>0</sub>) falls within an interval [d<sub>min</sub>, d<sub>max</sub>] are
highlighted as relevant. At the moment, d<sub>min</sub> and d<sub>max</sub> are predefined thresholds set by
road maintainers. For instance, different domain experts may set different threshold intervals
to highlight relevant snapshots according to their own expertise and goals (e.g., since
road surface repair interventions imply a huge economic expense, a road maintainer may be
forced to limit the intervention to a subset of anomalous events, selecting them according to
the relevance value). Focusing on the set of syntheses of a relevant snapshot, it is possible to
check which syntheses changed over time (namely, appeared, merged or were removed),
thus contributing to make that snapshot relevant.</p>
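      <p>The snapshot structure and the interval-based relevance check can be sketched as follows. This is a hedged illustration under explicit assumptions: the class and field names are hypothetical, and snapshot_distance is only a stand-in that combines centroid movement with radius change. The text describes a distance that also accounts for density changes and does not give its formula, so the function below should not be read as the approach's actual measure.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Synthesis:
    centroid: tuple  # one coordinate per monitored feature
    radius: float    # spread of the cluster
    count: int       # number of measures aggregated


@dataclass(frozen=True)
class Snapshot:
    t: float            # timestamp of the snapshot
    syntheses: tuple    # set of syntheses generated at time t
    features: tuple     # monitored features
    facet: frozenset    # road portion the snapshot refers to
    event_type: str     # hole, bump, depression, rough ground, undetermined
    event_id: str       # identifier assigned by the black box


def snapshot_distance(reference, current):
    """Stand-in distance (assumption): movement plus radius change,
    averaged over the syntheses of the current snapshot."""
    total = 0.0
    for s in current.syntheses:
        nearest = min(reference.syntheses,
                      key=lambda r: dist(r.centroid, s.centroid))
        total += dist(nearest.centroid, s.centroid)
        total += abs(nearest.radius - s.radius)
    return total / len(current.syntheses)


def relevant_snapshots(reference, snapshots, d_min, d_max):
    """Keep snapshots whose distance from the reference is in [d_min, d_max]."""
    return [s for s in snapshots
            if d_min <= snapshot_distance(reference, s) <= d_max]


facet = frozenset({("RoadType", "Urban"), ("District", "District1")})
features = ("AccX", "AccZ")


def snap(t, centroid, radius):
    """Helper building a one-synthesis snapshot with illustrative values."""
    return Snapshot(t, (Synthesis(centroid, radius, 100),),
                    features, facet, "hole", "e1")


reference = snap(0.0, (0.0, 0.0), 1.0)
s1 = snap(1.0, (0.1, 0.0), 1.0)   # barely moved
s2 = snap(2.0, (3.0, 4.0), 1.5)   # moved and expanded
relevant = relevant_snapshots(reference, [s1, s2], d_min=1.0, d_max=10.0)
```
]]></preformat>
      <p>With the illustrative thresholds above, only the second snapshot is flagged, since its synthesis has both moved and expanded with respect to the reference.</p>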
    </sec>
    <sec id="sec-4">
      <title>4. Exploration scenarios for Smart Mobility</title>
      <p>The big data exploration techniques illustrated in the previous section are being applied, in the
MoSoRe project, to implement three exploration scenarios, designed to assist road maintainers
when inspecting snapshots of measures related to road surface conditions.</p>
      <p>1) Analysis of the evolution over time of an anomalous event. In this exploration scenario,
the road maintainer analyses a sequence of snapshots related to a single anomalous event, for the
monitored road section φ̃. For the same road section, a reference snapshot, taken under normal
conditions (i.e., not referred to an anomalous event), is considered. The goal of this scenario is to
compare the evolution of syntheses in the sequence of snapshots against the reference snapshot.
As a result, it is possible to understand the triggering causes of the anomalous event, focusing
only on relevant snapshots and, for such snapshots, inspecting the evolution of syntheses that
changed over time. Starting from the dimensions in φ̃, and considering the Multi-Dimensional
Model, a road maintainer may apply the well-known OLAP operators (e.g., roll-up, drill-down) to
discover the Area of the city with the highest percentage of relevant snapshots. Once it is found,
they can narrow down the exploration with a drill-down operator, enabling the inspection
of anomalous events at District level.</p>
      <p>2) Comparison of anomalous events of the same type. In this exploration scenario, the
snapshots considered for exploration are the ones belonging to the monitored road section
φ̃, sharing the same type (e.g., hole), but related to different events. Considering a single reference
snapshot corresponding to a critical event (e.g., one that occurred in the past on the monitored road
section), the goal of this scenario is to determine which snapshot reflects the most critical
situation. To this aim, the distance values between the analysed snapshots and the reference
critical one are exploited to establish an order from the most to the least relevant. Hence, road
maintainers may exploit the organisation of road portions from the MDM to identify where
the most severe anomalous events of a certain type occurred and then, taking one of these
anomalous events, they can resort to scenario (1) to analyse the temporal evolution of
snapshots associated with the event.</p>
      <p>3) Classification of an undetermined event. The goal of this scenario is to determine the
similarity of an undetermined event with respect to the known typologies (in the MoSoRe project:
hole, bump, depression and rough ground), thus performing a classification task. In this respect,
four reference snapshots are considered, one for each of the aforementioned types. Snapshots
to be considered for analysis are the ones of an undetermined event for a monitored road
section φ̃. Classification is accomplished by calculating the distances of the snapshots considered
for analysis from each of the four reference snapshots, focusing only on the relevant ones. The
lowest distance corresponds to the highest similarity, which is used to classify the
undetermined event. Similarly to the former scenarios, road maintainers may start the exploration from road
portions with a considerable rate of undetermined events.
Through this scenario, they can first ascertain the classification of such events, and
then assess their severity, resorting to scenario (2).</p>
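      <p>The classification step of the third scenario can be sketched as a nearest-reference decision. The fragment below assumes that distances between the relevant snapshots of an undetermined event and the four reference snapshots have already been computed; the function name, the use of the mean to aggregate several snapshots and the numeric values are assumptions made for illustration, since the text only states that the lowest distance identifies the most similar type.</p>
      <preformat><![CDATA[
```python
from statistics import mean

# Known typologies in the MoSoRe project, as listed in the text.
EVENT_TYPES = ("hole", "bump", "depression", "rough ground")


def classify_undetermined(distances_by_type):
    """Return the closest event type and the per-type scores.

    distances_by_type maps each known type to the distances between the
    relevant snapshots of the undetermined event and that type's reference
    snapshot. Averaging them is an assumption of this sketch.
    """
    scores = {t: mean(distances_by_type[t]) for t in EVENT_TYPES}
    return min(scores, key=scores.get), scores


# Illustrative distances only; they are not project data.
distances = {
    "hole": [2.4, 2.0],
    "bump": [0.7, 0.9],
    "depression": [1.9, 2.1],
    "rough ground": [3.1, 2.9],
}
label, scores = classify_undetermined(distances)
```
]]></preformat>
      <p>In this example the event would be labelled as a bump, the type whose reference snapshot is closest on average; its severity could then be assessed as in scenario (2).</p>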
    </sec>
    <sec id="sec-5">
      <title>5. Implementation and preliminary validation</title>
      <p>Architecture. Figure 3 reports the architecture implementing the approach.
Clustering-based data summarisation modules have been implemented in R relying on the streamMOA
library. Mobility data is made available by the provider of the vehicles' black boxes on an FTP repository
(in the form of CSV files). The Back-end modules: (i) listen for new files containing measures; (ii)
store the measures, automatically enriched with the dimensions of the MDM, into a MongoDB
database (Collected Data) as JSON documents organised into collections (a collection contains
data related to a day). Summarised Data, obtained through the incremental clustering algorithm,
relies on MongoDB technology as well. The numbers on the arrows in Figure 3 denote the
interaction flow between modules. Once CSV files containing anomalous events, vehicle trips
and accelerometric trails are available on the FTP repository, they are retrieved by the
Back-end (1), where measures are associated with the dimensions of the MDM, before being stored
within MongoDB (2). Then, the Incremental Clustering module is notified about the presence
of available data to process from the Collected Data store (3). The output of the Incremental
Clustering module is stored within the Summarised Data store and then sent to the Relevance
Data Identification module, which is in charge of: (i) identifying relevant snapshots according
to the exploration scenario selected; (ii) sending relevant snapshots to be displayed on a GUI.</p>
      <p>[Figure 3: Architecture implementing the approach. The Back-end includes an FTP Connector and an MDM enrichment module; data enriched with MDM dimensions is stored in the Collected Data store; numbered arrows denote the interaction flow between modules.]</p>
      <p>Preliminary validation. Experimental evaluation aims at: (i) testing the processing time,
to prove that data summarisation and the measure of relevance can be efficiently computed,
thus coping with high acquisition rates; (ii) assessing the quality of relevance evaluation, measured as the
correlation between high relevance values and the time instants at which variations
of the collected data occurred. Currently, the average number of daily anomalous events
is 400, producing ≈ 3 · 10<sup>6</sup> records of accelerometric measures. For the experiments, we
considered a stream of measures composed of 14160 samples. Regarding processing time, we
performed tests varying the width of the time window Δt, which retains the latest collected data to
be processed by the clustering algorithm. Figure 4(a) shows the average time required by the
incremental clustering algorithm and relevance evaluation to process a single record of measures
for different Δt values. Lower Δt values demand more time to process data. Indeed, every time
data summarisation and relevance evaluation are performed, some initialisation operations have
to be executed (e.g., access to the set of syntheses previously computed). Therefore, to ensure
lower processing time, the frequency of clustering execution and relevance evaluation must be
reduced, that is, the Δt value must be increased. Conversely, higher Δt values imply that clustering
execution and relevance evaluation could be performed far from the time instants where important
variations occurred, thus reducing the quality of data relevance evaluation. This is evident in
Figure 4(b), where two different Δt values have been used for demonstration. The rationale is
to adaptively increase/decrease the Δt value according to the distance of relevant syntheses from
warning and error thresholds for the observed features, depending on the road portion, since
they correspond to potentially critical situations that must be monitored at finer granularity.</p>
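      <p>The enrichment step performed by the Back-end, in which measures read from CSV files are associated with MDM dimensions and organised into one collection per day, can be sketched as follows. The fragment is an assumption-laden illustration: the CSV column names, the road-portion identifiers, the lookup table and the sample rows are all hypothetical, and plain Python dictionaries stand in for MongoDB collections so that no database call is implied.</p>
      <preformat><![CDATA[
```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical lookup from a road-portion id to its MDM dimension instances.
MDM_LOOKUP = {
    "rp-001": {"RoadType": "Urban", "District": "District1"},
    "rp-002": {"RoadType": "Suburban", "District": "District2"},
}

# Hypothetical CSV layout and rows; the real file format is not specified.
CSV_TEXT = """event_id,date,road_portion,type,acc_z
e1,2021-05-03,rp-001,hole,1.8
e2,2021-05-03,rp-002,bump,1.2
e3,2021-05-04,rp-001,undetermined,0.9
"""


def enrich(row):
    """Turn a CSV row into a JSON-like document with MDM dimensions."""
    doc = dict(row)
    doc["acc_z"] = float(doc["acc_z"])
    doc["dimensions"] = MDM_LOOKUP[doc.pop("road_portion")]
    return doc


def to_daily_collections(csv_text):
    """Group enriched documents by day (one collection per day)."""
    collections = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = enrich(row)
        collections[doc["date"]].append(doc)
    return dict(collections)


collections = to_daily_collections(CSV_TEXT)
# Serialised form of one day's documents, as they might be stored.
as_json = json.dumps(collections["2021-05-04"], indent=2)
```
]]></preformat>
      <p>Attaching the dimension instances at ingestion time means that later modules can select the measures of a road portion by facet without repeating the lookup.</p>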
    </sec>
    <sec id="sec-6">
      <title>6. Related Work</title>
      <p>In the literature, several research efforts proposed the adoption of comprehensive solutions for
big data exploration, to improve the resilience of Smart City mobility. Authors in [8] propose
a framework for analysing road accident data; therein, after data preprocessing, a clustering
algorithm is applied and association rules are mined to find possible underlying patterns in the
data set. With similar intents, the work in [9] combines IoT and big data to devise the Pavement
Management System (PMS), a road maintenance management structure composed of pavement
detection and 3D modelling, data analysis and decision support. It also illustrates use cases
for two main actors, the road maintenance company and a technical firm that offers smart
solutions for road maintenance. In [10], a city traffic state assessment system is implemented
using a big data cloud infrastructure, assuring high scalability, to host clustering methods and
find areas of jam. Leveraging the recent advances in the field of computer vision and big data
computing, authors in [11] developed a scalable framework for image-based monitoring of urban
infrastructure, using both Web images and Google Street View imagery to train a CNN model.
Pursuing the goal of analysing road traffic and pollution data for the city of Aarhus (Denmark),
in [12] big data technologies ease the calculation and visualisation of the least polluted route.
Although multi-dimensional data organisation is not envisaged in any of the aforementioned works, we
share with [8] the introduction of metrics to identify relevant data. Furthermore, only [8, 10]
adopt summarisation techniques, both of them relying on clustering. Nevertheless, in [8] several
clustering algorithms from the literature are cited, but none of them is conceived to be applied
incrementally on a stream of data, whilst in [10] details on how the algorithm is applied are
not provided. Regarding the formulation of exploration scenarios to support data exploration,
only [12] sketches scenarios targeted to Smart Mobility, but only coarse details are given.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Concluding remarks</title>
      <p>In this paper, we described our contribution in the scope of the MoSoRe research project,
presenting an approach based on big data exploration techniques to support road maintainers in
analysing and assessing road surface conditions. The approach includes three
components: (i) a multi-dimensional model apt to represent the road network and to enable data
exploration; (ii) data summarisation techniques, in order to simplify exploration of massive data
streams collected by vehicles; (iii) a measure of relevance aimed at drawing the attention of
road maintainers to relevant data only. Moreover, the paper illustrates the application of
the approach in three exploration scenarios. Future research efforts regard the formalisation of
an exploration methodology rooted in exploration scenarios, also taking into account
personalisation aspects for road maintainers (e.g., in order to explore relevant snapshots more closely related to
their analysis interests), and the setup of an extensive campaign of usability experiments, to be
performed in the last phases of the MoSoRe project on the prototype GUI used for exploration
purposes. Additionally, a generalisation of the proposed approach will also be investigated for
other domains permeated by big data (e.g., healthcare, robotics).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-J.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Maglio</surname>
          </string-name>
          ,
          <article-title>Smart cities with big data: Reference models, challenges, and considerations</article-title>
          ,
          <source>Cities</source>
          <volume>82</volume>
          (
          <year>2018</year>
          )
          <fpage>86</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Paiva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Ahad</surname>
          </string-name>
          , G. Tripathi,
          <string-name>
            <given-names>N.</given-names>
            <surname>Feroz</surname>
          </string-name>
          , G. Casalino,
          <article-title>Enabling technologies for urban smart mobility: Recent trends, opportunities and challenges</article-title>
          ,
          <source>Sensors</source>
          <volume>21</volume>
          (
          <year>2021</year>
          )
          <fpage>2143</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Bibri</surname>
          </string-name>
          ,
          <article-title>The anatomy of the data-driven smart sustainable city: instrumentation, datafication, computerization and related applications</article-title>
          ,
          <source>Journal of Big Data</source>
          <volume>6</volume>
          (
          <year>2019</year>
          )
          <fpage>59</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Campos-Cordobés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Del</given-names>
            <surname>Ser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Laña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. I.</given-names>
            <surname>Olabarrieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sánchez-Cubillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Sánchez-Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. I.</given-names>
            <surname>Torre-Bastida</surname>
          </string-name>
          ,
          <article-title>Big data in road transport and mobility research</article-title>
          , in: Intelligent Vehicles,
          <year>2018</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bianchini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Garda</surname>
          </string-name>
          ,
          <article-title>Big data exploration techniques for road surface conditions assessment</article-title>
          ,
          <source>in: 7th Italian Conf. on ICT for Smart Cities and Communities (I-CiTies)</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bagozi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bianchini</surname>
          </string-name>
          , V. De Antonellis,
          <string-name>
            <given-names>M.</given-names>
            <surname>Garda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marini</surname>
          </string-name>
          ,
          <article-title>A relevance-based approach for big data exploration</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>101</volume>
          (
          <year>2019</year>
          )
          <fpage>51</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          , J. Han,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>A framework for clustering evolving data streams</article-title>
          ,
          <source>in: Proc. of 29th Int. Conf. on Very Large Data Bases (VLDB)</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Toshniwal</surname>
          </string-name>
          ,
          <article-title>A data mining framework to analyze road accident data</article-title>
          ,
          <source>Journal of Big Data</source>
          <volume>2</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ti</surname>
          </string-name>
          ,
          <article-title>A framework of pavement management system based on IoT and big data</article-title>
          ,
          <source>Advanced Engineering Informatics</source>
          <volume>47</volume>
          (
          <year>2021</year>
          )
          <fpage>101226</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
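          <string-name>
            <given-names>C.-T.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-Z.</given-names>
            <surname>Yan</surname>
          </string-name>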
          ,
          <article-title>The implementation of a cloud city traffic state assessment system using a novel big data architecture</article-title>
          ,
          <source>Cluster Computing</source>
          <volume>20</volume>
          (
          <year>2017</year>
          )
          <fpage>1101</fpage>
          -
          <lpage>1121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Alipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Harris</surname>
          </string-name>
          ,
          <article-title>A big data analytics strategy for scalable urban infrastructure condition assessment using semi-supervised multi-transform self-training</article-title>
          ,
          <source>Journal of Civil Structural Health Monitoring</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>313</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zenkert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dornhöfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ngoukam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fathi</surname>
          </string-name>
          ,
          <article-title>Big data analytics in smart mobility: Modeling and analysis of the Aarhus smart city dataset</article-title>
          ,
          <source>in: 2018 IEEE Industrial Cyber-Physical Systems (ICPS)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>363</fpage>
          -
          <lpage>368</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>