<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Preliminary Assessment of the Traffic Measures in Madrid City</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Instituto de Estudios Fiscales</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Avda. Cardenal Herrera Oria</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Madrid</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spain mpilar.rey@ief.hacienda.gob.es</string-name>
        </contrib>
      </contrib-group>
      <fpage>52</fpage>
      <lpage>64</lpage>
      <abstract>
        <p>A potential source of reliable statistical information is the huge volume of data files generated by electronic sensing devices. In particular, datasets collected by traffic sensors can be downloaded from the open data portal offered by the local government of Madrid City. The traffic sensors are a rich source of information, providing data not only on the vehicle count but also on, e.g., vehicle speed. However, processing the data at the required level of granularity involves complex workloads that exceed the capabilities of traditional data analysis technologies and require tools specific to big data. The first part of the paper is devoted to the steps in producing short-term indicators of the evolution of the traffic flow variable in Madrid using the Spark big data platform. Taking advantage of the information on the sensors' geographical location, the indicators are then analyzed to assess the impact of some recent local government measures aimed at reducing pollution and traffic congestion.</p>
      </abstract>
      <kwd-group>
        <kwd>Big Data</kwd>
        <kwd>Short-term Indicators</kwd>
        <kwd>Spark Platform</kwd>
        <kwd>Traffic measures</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The local government of Madrid City offers an open data portal where
users can explore and download its publicly accessible data. The available
datasets include data from traffic sensors located at strategic points on the roads
and streets of Madrid City. These traffic sensors are a rich source of information,
providing data not only on the vehicle count but also, e.g., on vehicle speed and
on the sensor's geographical location. A number of studies on traffic sensors [
        <xref ref-type="bibr" rid="ref5 ref6">6,5</xref>
        ]
report that they provide, in general, accurate traffic measures.
      </p>
      <p>
        The volume of the downloaded information cannot be processed using
conventional statistical software and requires procedures specifically developed for
this purpose. Apache Spark [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], an open source analytics engine for Big Data
processing, has been used on a single node for the first steps of collecting and
pre-processing the data. The first aim of the paper is to study the traffic
in the city from 2016 onwards, constructing daily indicators of its evolution. Monitoring
the real evolution is more difficult than it appears at first glance. To obtain
good enough indicators, several steps are required before the final calculation
of the indexes: detecting and correcting logical inconsistencies
in the data, imputing missing information, and summarizing at different levels of
granularity.
      </p>
      <p>Once the indicators are available, the traffic evolution can be analyzed to
identify significant patterns of behavior. The information on the sensors'
geographical location may help at this stage to discover similarities and differences between
zones in Madrid City. In addition, combining all these data makes it possible to
evaluate the results of the recent traffic measures taken by the local government
to reduce air pollution within the city and surrounding
areas.</p>
      <p>The remainder of this paper is organized as follows: the next section presents
a summary of the steps taken to construct the indicators; section 3 analyzes the
high-frequency series obtained; section 4 assesses the traffic
measures; and, finally, section 5 presents some remarks and conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Construction of the daily indicators</title>
      <p>
        The raw data used as the source for computing the time series consist of the
datasets made available in the portal after the end of each month. They include the
figures of the previous month, for each of the more than 4000 sensors, of a
number of variables measured at 15-minute intervals. This amounts to around 150
million data points per year for each variable. Besides the previously
cited Apache Spark, the Python software [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] has been used for all calculations
and analyses once the indicators have been obtained.
      </p>
      <p>Although the datasets provide information on more variables, this paper
studies a single variable, the intensity, measured as the number of vehicles per time
unit, as an example of the analysis that could be performed. A daily intensity
indicator is computed for the whole city and also separately for the urban area
and the M30 ring road. The calculations are performed in several
stages. Because the names and/or categories of the intensity and sensor type
variables have changed in the datasets over time, a first
pre-processing step makes the data homogeneous. After this, since
the daily level of time granularity has been chosen, the total number of vehicles
per sensor and day is calculated.</p>
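      <p>The homogenization and daily aggregation steps can be sketched as follows. This is a plain-Python illustration of the logic only, not the Spark job used for the paper. The header variants in COLUMN_ALIASES, the record layout and the timestamp format are hypothetical, and each reading is assumed to be a vehicle count for its 15-minute interval.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# Hypothetical header variants; the real names in the portal files may differ.
COLUMN_ALIASES = {
    "idelem": "sensor_id",
    "id": "sensor_id",
    "fecha": "timestamp",
    "intensidad": "intensity",
}


def homogenize(record):
    """Rename the keys of one raw record to the canonical names."""
    return {COLUMN_ALIASES.get(key.lower(), key.lower()): value
            for key, value in record.items()}


def daily_totals(records):
    """Sum the readings of each sensor over each calendar day."""
    totals = defaultdict(float)
    for raw in records:
        row = homogenize(raw)
        # Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS"; keep the date part.
        day = row["timestamp"][:10]
        totals[(row["sensor_id"], day)] += float(row["intensity"])
    return dict(totals)


# Toy records mixing three header conventions.
records = [
    {"idelem": 1001, "fecha": "2016-01-01 00:00:00", "intensidad": "12"},
    {"idelem": 1001, "fecha": "2016-01-01 00:15:00", "intensidad": "8"},
    {"id": 1001, "fecha": "2016-01-02 00:00:00", "intensidad": "5"},
    {"ID": 2002, "FECHA": "2016-01-01 00:00:00", "INTENSIDAD": "30"},
]
totals = daily_totals(records)
```
]]></preformat>
      <p>Keying the totals by the pair (sensor, day) keeps the result at the daily granularity chosen for the indicators, whatever header convention each monthly file used.</p>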
      <p>In the next stage, data editing is performed to ensure completeness and
validity, because the transmission of information from some sensor nodes may
sometimes fail. To detect these failures, data with more than a certain proportion
of missing readings are not validated. These data, together
with the missing data, are imputed by a procedure described later.</p>
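      <p>A minimal sketch of this validation rule is given below. The text does not state the proportion used, so the threshold max_missing_share is a hypothetical value, as is the convention of marking a missing reading with None. A validated day is simply summed here, without rescaling for the missing readings, which is also an assumption.</p>
      <preformat preformat-type="code"><![CDATA[
```python
EXPECTED_READINGS = 96  # 24 hours x 4 readings of 15 minutes


def validate_day(readings, max_missing_share=0.25):
    """Return the daily total of one sensor, or None if the day is rejected.

    `readings` holds the 15-minute values received for one sensor and day,
    with None for a reading that was not transmitted. The threshold is a
    hypothetical choice, not the value used in the paper.
    """
    received = [value for value in readings if value is not None]
    missing = EXPECTED_READINGS - len(received)
    if missing / EXPECTED_READINGS > max_missing_share:
        return None  # not validated: left for the imputation step
    return sum(received)


complete_day = validate_day([10] * 96)
partial_day = validate_day([10] * 80 + [None] * 16)
sparse_day = validate_day([10] * 40)
```
]]></preformat>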
      <p>Since the intensity of the traffic on a road is defined by the number of vehicles
passing along the road in a period of time, the natural way to measure the intensity in
an area would be the average number of vehicles over all the roads and streets
located in the area. As not all roads and streets have sensors, it could
be approximated by the average number of vehicles over all the sensors located in
the area during the period. However, the transmission of information from some nodes
may sometimes fail due to environmental interference, physical damage or lack
of power. Therefore, changes in the averages could be caused by changes in
the location and/or activity of the sensors and not necessarily by changes in the traffic
intensity in the area.</p>
      <p>
        Since these are flow data, a simple aggregative index [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] could be used to compute the
evolution of the intensity. Instead, to solve the previous measurement problem,
the indicators are computed as change estimators or chain-linked
indexes
        <disp-formula id="eq-1">
          <label>(1)</label>
          <tex-math><![CDATA[I_t = I_{t-1} \cdot \frac{\sum_k x_{kt}}{\sum_k x_{k,t-1}}]]></tex-math>
        </disp-formula>
where <italic>x</italic><sub><italic>kt</italic></sub> is the total number of vehicles recorded by sensor <italic>k</italic> on day <italic>t</italic>
and the sum extends over the <italic>k</italic> sensors having validated data for both periods
<italic>t</italic> and <italic>t</italic> − 1. The indexes <italic>I</italic><sub>0</sub> for the first period, 1 January 2016, are
calculated as the average over the sensors in the area of the total number of vehicles
on that day. Once the indexes of a day are computed, and before calculating the
indexes of the following day, the sensors having missing data on this day are
imputed as
        <disp-formula id="eq-2">
          <label>(2)</label>
          <tex-math><![CDATA[\hat{x}_{it} = x_{i,t-1} \cdot \frac{\sum_k x_{kt}}{\sum_k x_{k,t-1}}]]></tex-math>
        </disp-formula>
where the sum also extends over all <italic>k</italic> sensors having validated data for both
periods <italic>t</italic> and <italic>t</italic> − 1. The imputed values are then validated and the indexes are
re-calculated, which yields the same values as before. In this way, the imputed data
are available for the calculation of the following day's indexes. It can be shown
that, with this simple imputation method, the indexes are always computed
using all the information available and are not degraded by a repeated
lack of information from some sensors.
      </p>
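      <p>The chain-linked index and the imputation rule can be illustrated with a short sketch. It assumes that each day is a dictionary mapping a sensor to its validated daily total, with None for a missing or rejected value, and that at least one sensor has data on two consecutive days. The function name and the toy figures are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def chain_linked_index(days, sensors):
    """Compute the chain-linked index and impute missing sensor values.

    `days` is a list of dicts {sensor: daily total or None}, one per day.
    Returns the list of indexes and the list of completed days.
    """
    first = [value for value in days[0].values() if value is not None]
    indexes = [sum(first) / len(first)]  # I_0: average over the sensors
    completed = [dict(days[0])]
    for today in days[1:]:
        yesterday = completed[-1]
        # Sensors with a value on both days (imputed values count for t-1).
        common = [s for s in sensors
                  if today.get(s) is not None and yesterday.get(s) is not None]
        ratio = (sum(today[s] for s in common)
                 / sum(yesterday[s] for s in common))
        indexes.append(indexes[-1] * ratio)  # chain-linked index update
        filled = dict(today)
        for s in sensors:
            if filled.get(s) is None and yesterday.get(s) is not None:
                filled[s] = yesterday[s] * ratio  # imputation of sensor s
        completed.append(filled)
    return indexes, completed


sensors = ["a", "b", "c"]
days = [
    {"a": 100.0, "b": 200.0, "c": 300.0},
    {"a": 110.0, "b": 220.0, "c": None},   # sensor c fails on day 1
    {"a": 121.0, "b": None, "c": 363.0},   # sensor b fails on day 2
]
indexes, completed = chain_linked_index(days, sensors)
```
]]></preformat>
      <p>Because every imputed value equals the previous value of the sensor multiplied by the common ratio, adding it to the numerator and its previous value to the denominator leaves the ratio unchanged, which is why re-calculating the index after imputation returns the same value.</p>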
      <p>
        After the imputations are computed in this way, one problem remains:
there are days with no data for any sensor, for which the indexes cannot be
calculated. The series of daily changes are then used to fill in the missing
days with time series predictions. The first forecasting attempt was made
using LSTM (Long Short-Term Memory) deep neural networks [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a class
of artificial neural networks able to capture temporal dynamic behavior.
These networks have proven able to outperform state-of-the-art univariate
time series forecasting methods. However, in our case, with less than 4 years of
data, forecasts from ARIMA models, following the Box-Jenkins methodology [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
have obtained better results in terms of mean square forecast error.
      </p>
      <p>As a final stage, once the microdata have been imputed and the missing daily changes
have been predicted, intensity indicators are computed for the whole city, the
M30 ring road and the urban area.</p>
    </sec>
    <sec id="sec-3">
      <title>High-frequency series analysis</title>
      <p>Even though the temporal granularity chosen is one day, another
aspect to consider is the distribution of the vehicle flows within the day. The
traffic intensity for each combination of day of the week and hour may show
interesting patterns. For this purpose, the average traffic intensity per day of
the week and hour has been calculated for each sensor, and these
averages have then been divided by the maximum hourly traffic intensity found at this sensor.
The method gives an approximate idea of the average level of
occupancy during the week of the road or street on which the sensor is located.</p>
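      <p>The construction of a weekly profile can be sketched as follows, assuming hourly intensities for one sensor given as pairs of a timestamp string and a value. The input layout and the function name are hypothetical, and the maximum is taken over all the hourly readings of the sensor.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from datetime import datetime


def weekly_profile(readings):
    """Return 168 values, from Monday 00h to Sunday 23h, for one sensor.

    Each value is the mean intensity for that day of the week and hour,
    divided by the maximum hourly intensity observed at the sensor.
    """
    sums = [0.0] * 168
    counts = [0] * 168
    peak = 0.0
    for stamp, value in readings:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        slot = moment.weekday() * 24 + moment.hour  # Monday is weekday 0
        sums[slot] += value
        counts[slot] += 1
        peak = max(peak, value)
    return [sums[i] / counts[i] / peak if counts[i] and peak else 0.0
            for i in range(168)]


# Two Mondays and one Saturday at 9 a.m. (toy values).
readings = [
    ("2016-01-04 09:00", 800.0),
    ("2016-01-11 09:00", 1000.0),
    ("2016-01-09 09:00", 300.0),
]
profile = weekly_profile(readings)
```
]]></preformat>
      <p>Slots without any reading are set to zero in this sketch; the handling of such gaps in the paper is not described.</p>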
      <p>
        Fig. 4 shows an example of the profile for a particular sensor (tick marks
indicate noon for each day), showing the decay at the weekend and a
peak around 9 a.m. each weekday. These profiles form 168-dimensional points
(7 days of 24 hours). Clusters of these points have been built using the K-means
algorithm with the Euclidean distance
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to explore and summarize the results. Fig. 5 shows the centers
of the clusters for k = 10 clusters.
      </p>
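      <p>For illustration, a bare-bones version of the K-means iteration with the Euclidean distance is sketched below. It is not the implementation used for the paper: the random initialization, the stopping rule and the toy three-dimensional points (standing in for the 168-dimensional profiles) are all assumptions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k),
                          key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
    return centers, labels


# Toy profiles: three lightly used and three heavily used sensors.
points = [
    [0.1, 0.1, 0.0], [0.0, 0.2, 0.1], [0.2, 0.0, 0.1],
    [0.9, 1.0, 0.8], [1.0, 0.9, 0.9], [0.8, 0.8, 1.0],
]
centers, labels = k_means(points, k=2)
```
]]></preformat>
      <p>The same function applies unchanged to 168-dimensional profiles with k = 10, since the distance and the mean are computed coordinate by coordinate.</p>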
      <p>
        Although the elbow method [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for determining the optimal number of clusters
is not totally conclusive, this number does not have a big impact on the results: similar
graphs and conclusions could be obtained with a different number of clusters. As
general patterns for all roads and streets, besides the decay at weekends, the
traffic intensity decreases during night hours (from 1 to 5 a.m.),
especially on weekdays, and there are generally dips around noon and 3
p.m. Besides these general features, there are big differences between the levels of
occupancy, ranging from light in clusters 2 and 5 to heavy in clusters 4 and
6. It can also be seen that sensors in clusters 3 and 10 have maximum traffic
on weekdays at morning commuting hours, while sensors in clusters 4, 8 and 9 peak
in the afternoon. Therefore, two aspects may characterize
the sensors' weekly behavior and may be of interest to explore and describe: the
global level of occupancy, and the time of the day at which the intensity on
weekdays is highest.
      </p>
      <p>Instead of visually studying the graphs to assign a level of occupancy to
each sensor, the sensors are automatically classified into three levels, depending on the
area under the weekly profile curve normalized by the maximum. Fig.
6 shows the average level of occupancy obtained from the sensors in the Madrid City
boroughs. It can be seen that most of the areas with Light traffic intensity are
outside the central part of the city.</p>
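      <p>A possible form of this classification is sketched below. The cut-off values and the name of the intermediate level are not given in the text, so light_below, heavy_above and the label "Medium" are hypothetical. The area is approximated by the sum of the 168 hourly values and expressed as a share of the maximum possible area.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def occupancy_level(profile, light_below=0.25, heavy_above=0.5):
    """Classify a normalized weekly profile into three occupancy levels.

    With hourly values between 0 and 1, the mean of the profile equals the
    area under the curve divided by the 168 hours of the week.
    The two thresholds are hypothetical, not those used in the paper.
    """
    area_share = sum(profile) / len(profile)
    if area_share < light_below:
        return "Light"
    if area_share > heavy_above:
        return "Heavy"
    return "Medium"


light = occupancy_level([0.1] * 168)
medium = occupancy_level([0.4] * 168)
heavy = occupancy_level([0.8] * 168)
```
]]></preformat>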
      <p>Similarly, the sensors can be automatically classified into three groups
depending on the time of the day at which the intensity on weekdays is highest.
A sensor belongs to the Morning commuting group when its weekday average for
the Morning commuting hours exceeds its Afternoon average by more than 20%, and
to the Afternoon group in the opposite case, where Morning commuting is
between 7 and 9 a.m. and Afternoon is
between 2 and 9 p.m.; otherwise it belongs to the All day group. Fig. 7 gives an idea
of the typical weekday profile of usage of the roads and streets.</p>
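      <p>The 20% rule can be written compactly as in the sketch below. It assumes a 168-value profile starting on Monday at 00h, and it reads the two time windows as the hours starting at 7 and 8 (morning) and at 14 to 20 (afternoon); the exact treatment of the window limits is an assumption.</p>
      <preformat preformat-type="code"><![CDATA[
```python
MORNING_HOURS = range(7, 9)      # hours starting at 7 and 8 (7 to 9 a.m.)
AFTERNOON_HOURS = range(14, 21)  # hours starting at 14 to 20 (2 to 9 p.m.)


def usage_group(profile, margin=0.20):
    """Assign a sensor to a weekday usage group from its weekly profile."""

    def weekday_mean(hours):
        # Days 0 to 4 are Monday to Friday.
        values = [profile[day * 24 + hour]
                  for day in range(5) for hour in hours]
        return sum(values) / len(values)

    morning = weekday_mean(MORNING_HOURS)
    afternoon = weekday_mean(AFTERNOON_HOURS)
    if morning > afternoon * (1 + margin):
        return "Morning commuting"
    if afternoon > morning * (1 + margin):
        return "Afternoon"
    return "All day"


# Toy profiles: flat, with a weekday morning peak, with an afternoon peak.
flat = [0.5] * 168
morning_peak = list(flat)
afternoon_peak = list(flat)
for day in range(5):
    for hour in MORNING_HOURS:
        morning_peak[day * 24 + hour] = 0.9
    for hour in AFTERNOON_HOURS:
        afternoon_peak[day * 24 + hour] = 0.9
```
]]></preformat>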
    </sec>
    <sec id="sec-4">
      <title>Assessment of the impact of the traffic measures</title>
      <p>
        In recent years, the local government of Madrid City has taken some measures
aimed at reducing pollution. Although the current understanding of the air
pollution impacts of traffic congestion on roads is limited [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], it seems that
vehicle emissions and traffic-related pollution are typically among the largest
contributors to air pollution in cities. This paper studies just one variable, the
traffic intensity, and, consequently, the evaluation refers exclusively to the effects
on traffic reduction, and not directly to the effects on air pollution. The most
important traffic measures taken are summarized in Table 1.
      </p>
      <p>As the measures have been taken gradually, a first assessment of the impact
on the whole city can be made from the annual average rates in Table 2. The
global indicator reflects the behavior of the whole Madrid City area, and the
other indicators (M30 and Urban) also extend over large areas. For this reason,
no effect of the traffic measures is likely to be found, because they apply to only
some zones and there may also be opposing effects in other parts.</p>
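      <p>One way to obtain annual average rates from a daily indicator is sketched below. It assumes that the rate for a year is the percentage change of the mean of the daily index with respect to the mean of the previous year, which is an interpretation of the term and not a definition taken from the text. The input layout and the toy values are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def annual_average_rates(daily_index):
    """Percentage change of the annual mean index over the previous year.

    `daily_index` maps a date string "YYYY-MM-DD" to the index of that day.
    """
    by_year = defaultdict(list)
    for day, value in daily_index.items():
        by_year[int(day[:4])].append(value)
    means = {year: sum(values) / len(values)
             for year, values in by_year.items()}
    return {year: 100.0 * (means[year] / means[year - 1] - 1.0)
            for year in sorted(means) if year - 1 in means}


# Toy daily index values for three years.
daily_index = {
    "2017-01-01": 100.0, "2017-01-02": 100.0,
    "2018-01-01": 95.0, "2018-01-02": 97.0,
    "2019-01-01": 91.2, "2019-01-02": 91.2,
}
rates = annual_average_rates(daily_index)
```
]]></preformat>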
      <p>
        To test the hypothesis of a possible effect on any of the indicators, ARIMA
models with intervention analysis [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ] have been used. Thus, a basic
multiplicative ARIMA model with weekly seasonality has been fitted to each series using
the Scikit-learn software library [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Some additive outliers and a specific variable measuring the effect of Easter,
a relevant moving holiday for daily data, have also been included as regressors.
Then, different intervention variables,
designed to capture the effects of the traffic measures (with different structures and
different dates), have been tested. However, the corresponding parameter
estimates have never been significantly different from zero.
      </p>
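      <p>The intervention variables mentioned above are regressors built from the date of a measure. The text does not list the structures that were tested, so the three shapes below (pulse, step and ramp) are common choices shown as an assumption, and the dates in the example are illustrative only. The model fitting itself is not reproduced here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from datetime import date, timedelta


def intervention_variables(start, end, event):
    """Build daily pulse, step and ramp regressors for one event date."""
    days = [start + timedelta(days=n) for n in range((end - start).days + 1)]
    # Pulse: effect on the event day only.
    pulse = [1 if day == event else 0 for day in days]
    # Step: permanent level shift from the event day onwards.
    step = [1 if day >= event else 0 for day in days]
    # Ramp: effect growing by one unit per day from the event day.
    ramp = [max(0, (day - event).days + 1) for day in days]
    return days, pulse, step, ramp


days, pulse, step, ramp = intervention_variables(
    start=date(2019, 3, 14), end=date(2019, 3, 18), event=date(2019, 3, 16))
```
]]></preformat>
      <p>Each list has one value per day of the series and can be supplied as an additional regressor, alongside the outlier and Easter variables, to whichever ARIMA fitting routine is used.</p>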
      <p>In any case, the assessment is better referred to zones that can be
affected by the measures. The information about the geographical location of
the sensors, also provided in the open data portal of Madrid City, can be used for this.
Two zones likely to be affected have been considered: Madrid Central, the area with
borders defined by the local government to which some of the traffic measures
refer, and another area defined as a crown of 300 meters surrounding Madrid
Central, which will be named Crown. The delimitation of the zones appears in
Fig. 8.</p>
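      <p>Assigning a sensor to one of the two zones can be sketched with elementary geometry, assuming that the sensor locations and the border of Madrid Central are available as planar coordinates in meters (for example, in a projected reference system). The square border used below is a hypothetical stand-in for the real boundary, and the function names are illustrative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def inside(point, polygon):
    """Ray-casting test: is the point inside the polygon?"""
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal ray
            crossing_x = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < crossing_x:
                result = not result
    return result


def distance_to_segment(point, a, b):
    """Shortest distance from a point to the segment joining a and b."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def zone(point, polygon, crown_width=300.0):
    """Return "Madrid Central", "Crown" or "Outside" for a sensor location."""
    if inside(point, polygon):
        return "Madrid Central"
    n = len(polygon)
    nearest = min(distance_to_segment(point, polygon[i], polygon[(i + 1) % n])
                  for i in range(n))
    return "Crown" if nearest <= crown_width else "Outside"


# Hypothetical square border of 1 km side, coordinates in meters.
border = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]
```
]]></preformat>
      <p>With this toy border, a sensor at (500, 500) falls in Madrid Central, one at (1200, 500) lies 200 meters from the border and falls in the Crown, and one at (1500, 500) is outside both zones.</p>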
      <p>New indicators can now be computed for the two zones, following the rules in
section 2 and including in each one the data of the sensors within
the corresponding area. The resulting intensity indicators for the Madrid Central and Crown
zones appear in Fig. 9 and Fig. 10.</p>
      <p>For a first assessment, Table 3 shows the annual average rates, where
possible effects now appear. There is a gradual reduction in Madrid Central, probably
reflecting the cumulative effect of the different measures. The Crown area, for its
part, shows a clear increase in 2018, plausibly the result of a substitution or border
effect. Nevertheless, this may be reversed as a result of the latest traffic measures in
2019.</p>
      <p>Fig. 11 and Fig. 12 present the corresponding monthly averages and monthly
average annual rates, respectively, of the traffic intensity in the Madrid Central and
Crown zones.</p>
      <p>
        To provide more detailed explanations, both series have been
treated in the same way as before to find possible effects of the traffic
measures. That is, basic multiplicative ARIMA models [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ] with weekly
seasonality, an Easter variable, and additive outliers have been fitted, and different
intervention variables have then been tested using the Scikit-learn [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] software library.
Although at first glance, from Fig. 12, one of the most important measures, the
start of Madrid Central in March 2019, seems to be having some effect (both
indicators show annual decreases from April 2019), no significant effects have
been found. Nor have any other significant interventions related to the traffic
measures been found, probably because their gradual implementation
may already be described by the ARIMA model.
      </p>
      <p>Another interesting analysis is whether there has been any
effect on the weekly patterns of behavior of the roads and streets located in
both areas. To simplify the study, the period since the complete implementation
of all measures (starting on March 16, 2019) is compared to an equivalent period
in 2017 (March 16, 2017, to August 30, 2017), when hardly any traffic measure
had come into effect.</p>
      <p>As a summary result, Fig. 13 classifies the sensors according to whether they have
experienced an improvement or a worsening in the level of occupancy, computed
as described in section 3, between these two years.</p>
      <p>In general terms, after March 15, 2019 the level of occupancy has improved
in the Madrid Central area, with some exceptions. The border effect is
concentrated in specific zones of the Crown area, while other parts of this area
have experienced improvements in the level of traffic intensity.</p>
      <p>Finally, Fig. 14 shows only the sensors that changed their profile
of usage, calculated and defined as in section 3, between the same periods in
2017 and 2019. It must be noted that no sensor within Madrid Central has
changed to the "All day" profile of usage, which supports the view that the zone is no longer
occupied at all hours. By contrast, some sensors in the Crown area
have worsened their level of occupancy and, at the same time, now have an "All
day" profile of usage.</p>
    </sec>
    <sec id="sec-5">
      <title>Final remarks</title>
      <p>This paper uses traffic sensor data from the Madrid City open data portal
to evaluate the impact of the traffic measures taken in recent years in Madrid.
As the first aim is to study the behavior of the traffic intensity over time, the
difficulties and complexities of measuring its evolution, which requires
specific procedures, must be stressed.</p>
      <p>The results obtained are very preliminary, first because only one of the
available variables has been considered, and second because more periods would be
needed to measure the possible impacts accurately.</p>
      <p>Although the main objective of the traffic measures taken is to reduce air
pollution, what has been assessed here is the impact on the traffic volume, because
traffic is considered one of the largest contributors to air pollution in cities. The
findings suggest that the actions implemented from 2017 have reduced
traffic congestion in Madrid Central and other areas, especially from 2019. At the
same time, a first collateral border effect of increasing traffic intensity
in the surrounding zones may have occurred in 2018, although this effect may be reversed in the coming
months as a consequence of the latest actions undertaken.</p>
      <p>Taking advantage of the spatial aspects of the information available, the
methods proposed can be used to assess the effects of other traffic actions at
the same or at a more detailed geographical level. The scope of the analysis
can be widened when data from more
periods are available and also by extending the procedures to other variables
available in the open data portal.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Box</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jenkins</surname>
            ,
            <given-names>G.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reinsel</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <source>Time Series Analysis: Forecasting and Control</source>
          . Holden-Day, San Francisco
          (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Box</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tiao</surname>
            ,
            <given-names>G.C.</given-names>
          </string-name>
          :
          <article-title>Intervention analysis with applications to economic and environmental problems</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>70</volume>
          (
          <issue>349</issue>
          ),
          <fpage>70</fpage>
          –
          <lpage>79</lpage>
          (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>L.M.</given-names>
          </string-name>
          :
          <article-title>Joint estimation of model parameters and outlier effects in time series</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>88</volume>
          (
          <issue>421</issue>
          ),
          <fpage>284</fpage>
          –
          <lpage>297</lpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>MacQueen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Some methods for classification and analysis of multivariate observations</article-title>
          .
          <source>In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability</source>
          . vol.
          <volume>1</volume>
          , pp.
          <fpage>281</fpage>
          –
          <lpage>297</lpage>
          . Oakland, CA, USA (
          <year>1967</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Medina</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benekohal</surname>
            ,
            <given-names>R.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramezani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Field evaluation of smart sensor vehicle detectors at intersections – Volume 1: Normal weather conditions</article-title>
          .
          <source>Tech. rep.</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Mimbela</surname>
            ,
            <given-names>L.E.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Summary of vehicle detection and surveillance technologies used in intelligent transportation systems</article-title>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , et al.:
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (Oct),
          <fpage>2825</fpage>
          –
          <lpage>2830</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hochreiter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Long short-term memory</article-title>
          .
          <source>Neural Computation</source>
          <volume>9</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1735</fpage>
          –
          <lpage>1780</lpage>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prais</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Systems of aggregative index numbers and their compatibility</article-title>
          .
          <source>The Economic Journal</source>
          <volume>62</volume>
          (
          <issue>247</issue>
          ),
          <fpage>565</fpage>
          –
          <lpage>583</lpage>
          (
          <year>1952</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <collab>Python Core Team</collab>
          :
          <article-title>Python: A dynamic, open source programming language</article-title>
          . Python Software Foundation (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Thorndike</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          :
          <article-title>Who belongs in the family?</article-title>
          <source>Psychometrika</source>
          <volume>18</volume>
          (
          <issue>4</issue>
          ),
          <fpage>267</fpage>
          –
          <lpage>276</lpage>
          (
          <year>1953</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Welch</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms</article-title>
          .
          <source>IEEE Transactions on Audio and Electroacoustics</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ),
          <fpage>70</fpage>
          –
          <lpage>73</lpage>
          (
          <year>1967</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Zaharia</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xin</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wendell</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Armbrust</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dave</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venkataraman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franklin</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          , et al.:
          <article-title>Apache Spark: a unified engine for big data processing</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>59</volume>
          (
          <issue>11</issue>
          ),
          <fpage>56</fpage>
          –
          <lpage>65</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Batterman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Air pollution and health risks due to vehicle traffic</article-title>
          .
          <source>Science of the Total Environment</source>
          <volume>450</volume>
          ,
          <fpage>307</fpage>
          –
          <lpage>316</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>