Smart Emission
Building a Spatial Data Infrastructure for an Environmental Citizen Sensor Network

Michel Grothe, Geonovum, Amersfoort, The Netherlands, m.grothe@geonovum.nl
Linda Carton, Radboud University, Institute for Management Research, Nijmegen, The Netherlands, l.carton@fm.ru.nl
Just van den Broecke, Just Objects, Amstelveen, The Netherlands, just@justobjects.nl
Hester Volten, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands, hester.volten@rivm.nl
Robert Kieboom, CityGIS, Den Haag, The Netherlands, Robert@citygis.nl

Abstract: Smart Emission is a citizen sensor network using low-cost sensors that enables citizens to gather data about environmental quality, like air quality, noise load, vibrations, light intensities and heat stress. This paper introduces the design and development of the data infrastructure for the Smart Emission initiative and discusses challenges for the future. The Spatial Data Infrastructure (SDI) is open and accessible on the Internet using open geospatial standards and (Web) client applications. Smart Emission as a citizen sensor network offers several possibilities for heterogeneous applications, from health determination to spatial planning purposes, environmental monitoring for sustainable traffic management, climate adaptation in cities and city planning.

Keywords: Smart Emission; Citizens; Low-cost sensors; Spatial Data Infrastructure; Sensor Data; Geospatial Standards

I. INTRODUCTION

Today’s technical advancements enable innovations of citizen sensor networks in cities, because more and more low-cost sensors are being invented, wireless communication infrastructures provide the means for information loops to be established over longer distances at relatively low cost, and big data tools are becoming available that make the handling of massive data flows affordable and doable [1].

At the basis of citizen sensor networks lies the observation that in multiple places, citizens organize themselves in networks (sometimes self-organized, sometimes participating in government-initiated participatory projects) with the aim to monitor and/or improve environmental qualities in their daily living environment [2]. This is strongly supported by increasingly available ICT technologies and low-cost sensing equipment [3], [4]. Technology is an enabling factor in this development. Moreover, the social trend of self-organization by citizens, taking responsibility for their own neighbourhood and region, is a driving force behind this emerging trend. In [5], the user perspective and the societal and policy dimensions of this emerging concept of citizen-sensor-networks have been analysed in more depth.

The Smart Emission initiative aims to establish an innovative citizen sensor network in a real-life ‘urban lab’ setting [2]. The initiative builds on the knowledge of technical and social innovation in implementing such a new citizen-sensor-network. It also reflects on the feedback provided by the information, and its potential consequences for citizens and government to explore new avenues, options and (low-cost) strategies to further improve local environmental quality in dedicated places.

This paper is about the Smart Emission data infrastructure offering an open SDI, including the use of international, open standards to achieve interoperability and provide open access to the data on the Internet. The paper is structured as follows. In the first section, the Smart Emission initiative is described in summary. Next, the data collection and data distribution infrastructures are outlined. Then, the cloud deployment architecture is introduced. In the next section, a short outline of the adopted open access principles and available user applications follows. The paper ends with a final section sketching challenges and outlook.

II. SMART EMISSION CITIZEN SENSOR NETWORK

The Smart Emission initiative applies an innovative citizen sensor network, testing a network of low-cost sensors that is being put in place in the city together with experts and participating citizens, as a proof of concept.

The initiative aims to monitor the environmental qualities in a low-cost and efficient manner while simultaneously acquiring a detailed image in space-time, as the sensors are spread over many locations in the city. Ultimately, the sensing initiative serves to further improve environmental qualities in the built-up environment by monitoring and developing several measures for environmental improvement in a bottom-up planning style, based on a collaborative and communicative planning philosophy [6], [7], [8]. To this end, it involves citizens living and working in the city who have been invited to place a sensor on their house, garden, window or building property. As volunteers, citizens are involved in receiving the data and discussing the ‘bigger picture’ of analyzed data and visualizations in sense-making sessions for citizen feedback and participatory evaluation (building on existing knowledge from fields like participatory planning and citizen science).

Citizens participate in this low-cost sensing initiative. Their use cases are the starting point for this environmental citizen-sensing SDI. The initiative organizes citizen sessions in which citizens perform collaborative sense-making: in dialogue with other citizens, the city government and scientists/researchers, the Smart Emission data is analyzed, visualized and interpreted in a collaborative way using associated tools, like Web apps and a Maptable.

The research conducted is shaped by the ideas of action research. This entails constructing a pilot version of a citizen sensor network in practice in a city, with the aim to become and remain operational during a certain time period. To this end, a low-cost sensor unit called “Jose” was developed [9] and implemented to measure the spatial pattern and spread of environmental information such as air quality, noise load, vibration, light colouring and intensities, and meteorological indicators like rainfall, temperature, air pressure and humidity in a fine-grained network constellation.

Other citizen sensing initiatives in the environmental domain exist. However, these initiatives often have one aspect of environmental monitoring in scope, like air quality [10], [11], [12], noise load [13], [14], [15], meteorology [16] or vibrations [17], [18]. The Smart Emission initiative has a broad(er) environmental perspective. The heterogeneous sensor unit used can be applied for multiple applications. Smart Emission is also about exploring how low-cost sensors can add value to high-end sensing methods by collecting fine-grained urban measurement data, and which methods and scenarios can be used for processing and visualizing this data for involving citizens and connecting to broader city purposes.

At this moment, the Smart Emission initiative has started in the city of Nijmegen. In this midsize city in the Netherlands with approx. 170,000 inhabitants, about 35 Jose environmental sensor units are located at citizens’ homes. Negotiations to expand the Smart Emission concept to other cities in the Netherlands and outside the Netherlands (Belgium, Germany) have started.

III. DATA COLLECTION

The Smart Emission data collection starts with the collection of data from environmental sensors. The Jose sensor unit (see figure 1) developed by the Dutch sensor company Intemo [9] collects different types of environmental indicators: air quality, noise load, vibration, light intensity and several meteorological indicators.

Fig. 1. Smart Emission Jose sensor (top) and Jose sensor installation at site of the national air quality network (bottom)

This layered and extendable sensor unit offers the following environmental indicators: 1. Light intensity, 2. Light reflection, 3. Light (air) colour, 4. Earth vibration, 5. Carbon monoxide, 6. Nitric oxide, 7. Ozone, 8. Hydrogen, 9. Carbon dioxide, 10. Pressure, 11. Temperature (unit and environment), 12. Humidity, 13. Noise load. The Jose sensor unit is connected to a power supply by a USB phone adapter and to the Internet. The Internet connection is made via Wi-Fi or a telecommunication network (using a GSM chip). Furthermore, Jose collects time, date and location by GPS (latitude, longitude). Jose has memory and a multi-colour display ring. The data are encrypted as data streams and sent every 15 seconds from each individual Jose unit to the data production platform hosted by CityGIS (see figure 2).

The encrypted data is decrypted by a dedicated ‘Jose Input Service’ that also inserts the data streams into a MongoDB (www.mongodb.com) database using JavaScript Object Notation (JSON). This MongoDB database is the source production database, in which all raw sensor data streams of the Jose sensor units are permanently stored.

IV. DATA DISTRIBUTION

A dedicated Application Programming Interface (API), the ‘Raw Sensor API’, is developed for further distribution of the Smart Emission data to other platforms, like the Smart Emission open data distribution platform hosted at the FIWARE Lab NL (footnote 1). Other applications of FIWARE in the environmental domain can be found in [19].

The data distribution infrastructure at the FIWARE Lab NL consists of several components (see figure 2):
1. Pre-processing and post-processing algorithms based on Extract-Transform-Load (ETL) principles;
2. Data storage in a Postgres/PostGIS database (DB);
3. Several Open Geospatial Consortium (OGC) based APIs;
4. Several apps / web viewers, like the “SmartApp” and “Heron”.

In order to store the Smart Emission data in the distribution database, harvesting and pre-processing of the raw sensor data (from the CityGIS production platform) is performed. First, every minute a harvesting mechanism collects data from the production platform using the Raw Sensor API. The data, encoded in JSON format, is then processed by a multi-step ETL-based pre-processing mechanism. In several steps, the data streams are transformed and loaded into the Postgres DB.

Pre-processing is done specifically for the raw data of the air quality sensors. Based on a calibration activity, the raw data from the air quality sensors is transformed to ‘better interpretable’ values. For some of the environmental indicators, calibration procedures have been started. Especially the four air quality indicators (carbon monoxide, nitric oxide, ozone, carbon dioxide) need further calibration. Smart Emission air quality sensor data are compared to the measurements of two high-cost air quality sensor installations that belong to the national air quality network, are operated by the National Institute for Public Health and the Environment (RIVM) and are located in the city of Nijmegen (see figure 1). For calibration of the air quality indicators, the approach according to [20] was adopted and implemented.

Post-processing is the activity of transforming the pre-processed values into new types of data using statistics (aggregations), spatial interpolations, etc.

The data distribution architecture of Smart Emission is further expanded below. Figure 3 sketches the architecture with an emphasis on the flow of data. This architecture sketches a multistep ETL approach which is also used within the ‘INSPIRE SOS Pilot’ approach for the implementation of European air quality e-reporting (see [21], [22] and sensors.geonovum.nl).
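The refinement of harvested data (pre-processing by calibration, then post-processing by aggregation) can be illustrated with a minimal sketch. It assumes a hypothetical JSON record layout (`device_id`, `time`, `co_raw`) and a simple linear gain/offset calibration with made-up coefficients. Neither the actual Raw Sensor API payload nor the details of the calibration adopted from [20] are given in this paper, so both are stand-ins for illustration only. Post-processing is shown as an hourly mean per device.

```python
import json
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical raw payload; the real Raw Sensor API format is not shown in the paper.
RAW_JSON = """
[
  {"device_id": "jose-1", "time": "2016-07-01T10:00:15", "co_raw": 410},
  {"device_id": "jose-1", "time": "2016-07-01T10:30:15", "co_raw": 430},
  {"device_id": "jose-1", "time": "2016-07-01T11:00:15", "co_raw": 500},
  {"device_id": "jose-2", "time": "2016-07-01T10:05:00", "co_raw": 300}
]
"""

# Illustrative linear calibration coefficients (assumed, not from the project).
CO_GAIN = 0.5
CO_OFFSET = -100.0


def calibrate_co(raw_value):
    """Map a raw CO reading to a calibrated value with a linear model."""
    return CO_GAIN * raw_value + CO_OFFSET


def refine(records):
    """Pre-processing: validate each record and attach a calibrated value."""
    refined = []
    for rec in records:
        raw = rec.get("co_raw")
        if raw is None or raw < 0:
            continue  # skip missing or negative readings
        refined.append({
            "device_id": rec["device_id"],
            "time": datetime.fromisoformat(rec["time"]),
            "co": calibrate_co(raw),
        })
    return refined


def hourly_mean(refined):
    """Post-processing: average calibrated values per device and hour."""
    buckets = defaultdict(list)
    for rec in refined:
        hour = rec["time"].replace(minute=0, second=0, microsecond=0)
        buckets[(rec["device_id"], hour)].append(rec["co"])
    return {key: mean(values) for key, values in buckets.items()}


records = json.loads(RAW_JSON)
result = hourly_mean(refine(records))
for (device, hour), value in sorted(result.items()):
    print(device, hour.isoformat(), value)
```

Keeping validation, calibration and aggregation as separate functions mirrors the multi-step ETL idea: each step can be rerun or replaced (for example by a different calibration model) without touching the others.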
The multistep-ETL approach consists of three steps: harvesting, refinement (pre-processing and post-processing) and publishing (see figure 3).

Fig. 2. Overall data architecture Smart Emission

Footnote 1: “The FIWARE platform provides a rather simple yet powerful set of APIs that ease the development of smart applications in multiple vertical sectors. The FIWARE Lab is a non-commercial sandbox environment where innovation and experimentation based on FIWARE technologies take place” (www.fiware.org).

[Figure 3: diagram. From bottom to top: the sensor network with the Smart Emission Raw Sensor API and raw data files; Step 1, harvesting (Harvester), into Smart Emission Raw O&M; Step 2, ETL for refinement (validation, calibration, aggregation, metadata), into Smart Emission Refined O&M; Step 3, ETL for publication to the Sensor Observation Service (SOS / SOS REST, via SOS-T), the Web Map and Web Feature services (WMS / WFS, with SLD), the SensorThings service (SensorThings API, via STA REST) and other services (FIWARE APIs).]

Fig. 3. Overall Architecture with ETL Steps (footnote 2)

Footnote 2: The arrows represent the flow of data; circles depict harvesting/ETL processes; server instances are in rectangles and data stores are represented by the DB icons.

The ETL design comprises these main processing steps (see figure 3):
• Step 1 (Harvesting): fetch raw observation data via the Raw Sensor API;
• Step 2 (ETL for refinement): data validation, calibration transformations and aggregations of the raw observation data, rendering ‘refined’ data with metadata;
• Step 3 (Publication): ETL for publishing to various services, some having internal data stores:
   - SOS ETL: transform and publish to the SOS DB via SOS-Transactional (SOS-T);
   - SensorThings ETL: transform and publish to the SensorThings API (STA) DB (via REST);
   - Direct publication in WMS (with SLDs) and WFS;
   - Other ETL for custom services or FIWARE APIs (to be implemented).

Some additional notes for the data flows above and software used:
• The central datastore DB is PostgreSQL (www.postgresql.org) with PostGIS enabled;
• All ETL transformations are executed with Stetl (streaming ETL), a lightweight ETL framework for geospatial data conversion (github.com/geopython/stetl);
• The three ETL steps run continuously via Linux cronjobs;
• Each ETL process applies ‘progress tracking’ by maintaining persistent checkpoint data. Consequently, a process always knows where to resume, even after its (cron)job has been stopped or cancelled. All processes can even be replayed from ‘time zero’;
• Refined O&M data can be directly used for WMS and WFS services via GeoServer using SLDs and the PostGIS datastore with selection VIEWs, e.g. last values of components by WMS time dimension (WMS-T);
• The SOS ETL process transforms refined data to SOS Observations and publishes these via the SOS-T InsertObservation operation. Stations are published once via the InsertSensor operation of the 52°North SOS server (www.52north.org);
• Publication to the SensorThings Server will go via a REST service. The SensorThings Server used is offered by SensorUp (www.sensorup.com).

The Smart Emission data infrastructure can be considered a spatial data infrastructure approach using geospatial standards to expose spatial sensor data to the Internet. Although the search for sensors and sensor data through metadata is not yet addressed, all data is exposed to the Internet for re-use through international OGC-based standards, like WMS, WFS, SOS and the latest SensorThings API (www.ogc.org). Other non-geospatial standards are considered as well. A multi-API approach offers rich possibilities for re-use of the environmental data, to be explored in different environmental application fields and by different developer communities, like GIS, Internet of Things and Web developers.

V. CLOUD DEPLOYMENT ARCHITECTURE

The data infrastructure described above will be deployed on the FIWARE Platform provided by the FIWARE Lab NL (fiware-lab.nl). The FIWARE Lab NL offers a PaaS-based computing and storage cloud where instances for common images like Ubuntu can be created, provisioned (e.g. storage, networking, central processing unit), and deployed. Components from the Smart Emission data infrastructure as described in the architecture above will be deployed on the FIWARE cloud using Docker (www.docker.com). Docker is a common computing container technology also used extensively within FIWARE. By using Docker, we can create reusable high-level components, ‘containers’, which can be built and run within multiple contexts. Figure 4 sketches the Docker deployment strategy.

[Figure 4: diagram. Docker containers: Web; 52°North SOS, GeoServer, SensorUp SensorThings and other services; Stetl ETL; PostGIS. Outside the containers: local data and logs; local config and code, synced with external GitHub.]

Fig. 4. Docker deployment strategy (footnote 3)

Footnote 3: The entities denote Docker containers, the arrows linking.

In a first instance, Docker containers will be created for:
1. Web: front-end web serving (viewers/apps) and proxy to backend web APIs;
2. 52°North SOS: container with Tomcat running 52°North SOS;
3. GeoServer: container with Tomcat running GeoServer for WMS and WFS;
4. SensorUp SensorThings: container running SensorUp SensorThings API server;
5. Stetl ETL: container for the Python-based ETL framework used;
6. PostGIS DB: container running PostgreSQL with PostGIS extension.

The networking and linking capabilities of Docker will be applied to link Docker containers, for example to link GeoServer and the other application servers to PostGIS. Docker networking may even be applied independent of (VM) location. When required, containers may be distributed over VM instances. Another aspect of our Docker approach is that all data, logging, configuration and custom code/(web)content is maintained ‘locally’, i.e. outside Docker containers. This will make the Docker containers more reusable and will provide better control, backup, and monitoring facilities. An administrative Docker component is also planned. Code, content and configuration are maintained and synced in and with GitHub. Custom Docker containers will be published to the Docker hub to facilitate immediate reuse.

As a result, FIWARE Lab NL will be used as a cloud-based computing platform. Standard FIWARE components for Internet of Things like Orion may be integrated at a later phase in the project. Also, several Smart Emission Docker containers will be generalized for potential addition to the FIWARE Platform as Generic Enablers (GEs) and included within the FIWARE Catalogue as components for FIWARE blueprints.

VI. OPEN ACCESS AND APPLICATIONS

The data infrastructure of Smart Emission is based on an open approach in many aspects, especially regarding open access to Smart Emission data for re-use, offered through open, standardized APIs and some out-of-the-box client applications. The data infrastructure is as open as possible, as long as the privacy of the Smart Emission citizens is not violated. Also, the software infrastructure is mostly open source software, as is the documentation (footnote 4).

Footnote 4: All content authored within the project, like (ETL) code, viewers, apps, Docker definitions and configurations, is maintained in a dedicated project at GitHub: github.com/Geonovum/smartemission. Documentation is also maintained in this GitHub repository and published automatically on GitHub commits to smartplatform.readthedocs.io.

Open access means that all environmental data is open and accessible through web APIs for developers and through some client applications for end-users, like students and professional researchers, but also citizens. In order to make access to the sensor data as easy as possible, several client applications are available for data re-use through these adopted APIs. Client applications that provide further processing and exploration are GIS applications (like ArcGIS and QGIS), statistical packages (like R) and out-of-the-box web applications (like the 52°North JavaScript SOS Client Helgoland). One specific client has already been developed during the Smart Emission project, the “SmartApp”. SmartApp is a simple end-user Web application that uses the 52°North Sensor Web REST API to explore the last measurement values in a simple, intuitive map application; all Jose sensor locations are shown on the map, and when the user selects a location the last values of air quality and noise load are shown in a popup window (see figure 5).

Fig. 5. SmartApp

Many choices are made and small improvements and innovations are developed while the system as a whole is being built and implemented. On the basis of requests from the citizens, a website dedicated to the users is being set up. This portal serves as a visible ‘front end’ to users and incorporates the data viewers, a forum, and documents exchanged at the citizen meetings: www.smartemission.ruhosting.nl.

Each month, the Smart Emission consortium holds meetings to figure out various aspects of the system’s architecture along the way, and to learn from each other’s progressing insights and lessons learned over the various parts of the project. At a lower frequency, meetings with citizens are organized. From the user feedback generated in these interactions, the technical architecture and data processing mechanisms are further optimized in order to get the best achievable SDI under the constraints of the current initiative.

VII. CHALLENGES AND OUTLOOK

As an innovation initiative, Smart Emission explores several research questions regarding environmental citizen-sensor-networks. However, several questions still remain to be answered:
1. Do low-cost sensors add to the fine-grained picture of air quality indicators? Can we trace an ‘air pollution cloud’ accumulating in certain places in the built environment?
2. Which methods (relatively simple spatial interpolation, spatial regression and visualization techniques) and scenarios can be used for processing and visualizing this data for involving citizens and connecting to broader municipal purposes? Can we combine these measurements with other (modelling) information for informed citizens and government?
3. What about data ownership in citizen-sensor-networks and privacy related issues of using low-cost environmental sensors in cities?
4. Does sense-making with citizens work? What is the citizen science contribution?
5. If the concept works, does this open up opportunities for bottom-up spatial/traffic/urban planning to further improve quality of living and health?
6. Reflective: (How) do roles of government and citizen change?

Central elements in the research questions are the notion of ‘fine-grained’ constellation, tracing unevenly spread accumulations or ‘pollution clouds’, and ‘collective sense-making’.

These questions need further exploration and attention. The Smart Emission initiative is still in its infancy, working on the citizens network, the collection (and calibration) of sensor data, the distribution of data by APIs and the search for smart applications.

Smart Emission aims to connect the retrieved data flow to other data sources and embed this information in the dynamic process of city governance. The most important challenge in the next period is to stimulate the application of the Smart Emission data infrastructure. Gaining better insight in sensing, individual environmental issues, and in general citizen involvement, commitment and cooperation is valuable in itself. Besides the experiences gained from the citizens’ network and their environmental situation, the Smart Emission data infrastructure also has the ambition to explore the role and potential of low-cost sensing for heterogeneous application fields. Several potential application areas exist, especially regarding investigating relationships between environmental factors and health problems, like air pollution mapping, noise mapping and heat stress mapping. In addition to health related applications, the use of Smart Emission low-cost sensing for spatial planning and climate adaptation purposes is also worth considering in more detail.

ACKNOWLEDGMENT

We would like to thank all members of the Smart Emission consortium: Municipality of Nijmegen, Radboud University, Geonovum, Intemo, CityGIS, National Institute for Public Health and the Environment. We would also like to thank SensorUp, in particular Steve Liang, for providing the SensorThings software.

REFERENCES

[1] M. Swan, “Sensor mania! The internet of things, wearable computing, objective metrics, and the quantified self 2.0,” Journal of Sensor and Actuator Networks, vol. 1, Issue 3, pp. 217-253, 2012.
[2] L.J. Carton, P. Ache and consortium partners, “Filling the feedback gap of place-related ‘externalities’ in smart cities: Empowering citizen-sensor-networks for participatory monitoring and planning for a responsible distribution of urban air quality,” paper presented at AESOP 2015, Association of European Schools of Planning Annual Congress, Prague, Czech Republic, 13-16 July 2015.
[3] C. Gouveia, A. Fonseca, A. Câmara and F. Ferreira, “Promoting the use of environmental data collected by concerned citizens through information and communication technologies,” Journal of Environmental Management, vol. 71, pp. 135-154, 2004.
[4] M. Haklay, “Neogeography and the delusion of democratization,” Environment and Planning A, vol. 45, pp. 55-69, 2013.
[5] L.J. Carton and P.M. Ache, “Citizen-sensor-networks serving as countervailing power through bottom-up planning: An analysis of how two grassroots alliances creatively use Geographic Information in a networked manner for monitoring environmental externalities,” 2016, unpublished.
[6] P. Healey, “Collaborative Planning: Shaping Places in Fragmented
[7] J.E. Innes, “Information in Communicative Planning,” Journal of the American Planning Association, vol. 64, Issue 1, pp. 52-63, 1998.
[8] P.M. Ache and L.J. Carton, “Smart citizens 4 smart ruimte - het verkennen van vergezichten voor co-creatie van de stad van de toekomst,” in Toevoegen van ruimtelijke kwaliteit. Ruimtelijke kennis voor het Jaar van de Ruimte, W. Salet, R. Vermeulen and R. van der Wouden (eds.), 2015, pp. 124-135.
[9] JOSENE - Joined Sensor Networks. www.intemo.nl (accessed on 1 July 2016).
[10] K. Austen, “Pollution Patrol,” Nature, vol. 517, pp. 136-138, 2015.
[11] A. Bröring, A. Remke and D. Lasnia, “SenseBox - A Generic Sensor Platform for the Web of Things,” in Mobile and Ubiquitous Systems: Computing, Networking, and Services, pp. 186-196, 2011, Springer Berlin Heidelberg.
[12] Q. Jiang, F. Kresin, A.K. Bregt, L. Kooistra, E. Pareschi, E. van Putten, H. Volten and J. Wesseling, “Citizen Sensing for Improved Urban Environmental Monitoring,” Journal of Sensors, vol. 2016, 2016, 9 pages.
[13] M.C. Bell and F. Galatioto, “Novel wireless pervasive sensor network to improve the understanding of noise in street canyons,” Journal of Applied Acoustics, vol. 74, Issue 1, pp. 169-180, 2013.
[14] Geluidsnet/Sensornet. http://www.sensornet.nl/english/ (accessed on 1 July 2016).
[15] N. Maisonneuve, M. Stevens and B. Ochab, “Participatory noise pollution monitoring using mobile phones,” Information Polity, vol. 15, pp. 51-71, 2010.
[16] S. Bell, D. Cornford and L. Bastin, “The state of automated amateur weather observations,” Weather, vol. 68, no. 2, pp. 36-41, 2013.
[17] Open Seismic Sensor Grid Groningen (OSSG). http://www.ossg.nl/ (accessed on 1 July 2016).
[18] E.S. Cochran, J.F. Lawrence, C. Christensen and R.S. Jakka, “The quake-catcher network: Citizen science expanding seismic horizons,” Seismological Research Letters, vol. 80, pp. 26-30, 2009.
[19] T. Usländer, A.J. Berre, C. Granell, D. Havlik, J. Lorenzo, Z. Sabeur and S. Modafferi, “The future internet enablement of the environment information space,” in Environmental Software Systems. Fostering Information Sharing, pp. 109-120, 2013, Springer Berlin Heidelberg.
[20] L. Spinelle, M. Gerboles, M.G. Villani, M. Aleixandre and F. Bonavitacola, “Field calibration of a cluster of low-cost available sensors for air quality monitoring. Part A: Ozone and nitrogen dioxide,” Sensors and Actuators B: Chemical, vol. 215, pp. 249-257, August 2015.
[21] A. Kotsev, O. Peeters, P. Smits and M. Grothe, “Building bridges: experiences and lessons learned from the implementation of INSPIRE and e-reporting of air quality data in Europe,” Earth Science Informatics, vol. 8, Issue 2, pp. 353-365, June 2015.
[22] K. Schleidt, J. Hřebíček, G. Schimak, M. Kubásek and A.E. Rizzoli, “INSPIREd Air Quality Reporting,” in Proceedings of the Environmental Software Systems. Fostering Information Sharing: 10th IFIP WG 5.11 International Symposium, ISESS 2013, pp. 439-450, 2013, Springer Berlin Heidelberg.