=Paper=
{{Paper
|id=Vol-2841/BigVis_3
|storemode=property
|title= SenseBoard: Sensor Monitoring for Air Quality Experts
|pdfUrl=https://ceur-ws.org/Vol-2841/BigVis_3.pdf
|volume=Vol-2841
|authors=Federica Rollo,Laura Po
|dblpUrl=https://dblp.org/rec/conf/edbt/RolloP21
}}
== SenseBoard: Sensor Monitoring for Air Quality Experts==
Federica Rollo, ‘Enzo Ferrari’ Engineering Department, Modena, Italy, federica.rollo@unimore.it

Laura Po, ‘Enzo Ferrari’ Engineering Department, Modena, Italy, laura.po@unimore.it

© 2021 Copyright for this paper by its author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2021 Joint Conference (March 23–26, 2021, Nicosia, Cyprus) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

=== ABSTRACT ===
Air quality monitoring is crucial within cities since air pollution is one of the main causes of premature death in Europe. However, performing trustworthy monitoring of urban air quality is not a simple process, especially when the goal is extensive and timely monitoring of the entire urban area using low-cost sensors.

In order to collect reliable measurements from low-cost sensors, a lot of work is required from environmental experts who deploy and maintain the air quality network, and daily calibrate, control, and clean up the data generated by these sensors. In this paper, we describe SenseBoard, an interactive dashboard created to support environmental experts in the sensor network control, management of sensor data calibration, and anomaly detection.

=== 1 INTRODUCTION ===
Air pollution is a global threat leading to large impacts on human health and ecosystems, particularly in urban areas. In Europe, air quality remains poor in many cities that experience exceedances of the regulated limits for air pollutants [1]. The urgency of limiting air pollution is also stated by the sustainable development goals (SDGs) defined in the 2030 Agenda for Sustainable Development [2].

Effective action to reduce air pollution and its impact on the quality of life requires good understanding and extensive monitoring of urban air quality. In recent years, the development of Internet of Things technologies has increased and cities around the world have exploited this enabling technology to be able to control multiple aspects of citizens’ lives. IoT allows monitoring traffic congestion [3, 10], detecting and classifying road accidents [7], managing car parking [9], supporting decisions in agriculture [4], evaluating energy consumption [12], and also monitoring air quality [13]. Data generated by IoT are used to improve city services and the living experience of citizens.

In this context, data coming from a group of low-cost sensors spread around a city might generate widespread hyperlocal insights into air pollution. However, a network of low-cost air quality sensors is not enough to monitor urban air quality. Since those sensors are complex and sensitive, they require specific environmental skills. Data generated by the air quality sensors need to be converted into relevant and crucial insights to allow the monitoring of air quality by politicians and to enable the achievement of the sustainability goals. In this context, environmental experts need a control platform to perform sensor data calibration and anomaly detection.

The maintenance and control of an urban air quality network is crucial to provide good information that enables the extensive monitoring of air quality. Moreover, the availability of effective visualizations to process and interpret collected data is essential.

In this paper, we present SenseBoard, an interactive tool addressed to environmental experts that brings together heterogeneous and dynamic data for real-time analysis and management of the air quality network. This tool has been conceived within the TRAFAIR project<sup>1</sup>, which allowed the creation of an urban network of low-cost air quality sensors. The low-cost sensors employed are cheaper and less reliable than the Air Quality Monitoring (AQM) legal stations managed by the Environmental Agencies. It is possible to improve the reliability of the measurements of these devices if they are previously calibrated by placing the device near air quality stations for some weeks. Low-cost sensors provide "raw" measures, i.e. a datum in millivolts; to convert this datum into a reliable concentration of pollutant it is necessary to carry out a calibration period during which some Machine Learning algorithms are trained in order to generate, from the raw measurements, pollutant concentrations in line with those estimated by the AQM stations. SenseBoard is devoted to supporting environmental experts in the monitoring and control of the air quality sensor network, in the supervision of the calibration process and in the detection of anomalous values. SenseBoard acts as an enabling tool to detect anomalies, update sensor status, monitor the proper functioning of the sensors, manage the change of location of the devices and, above all, provide feedback to perform the calibration process. The calibration results obtained using the Machine Learning algorithm are shown and compared to the raw data and the data of the AQM station; thus, it is possible to understand if the algorithm works appropriately or if it is necessary to extend the co-location period of the device.

<sup>1</sup> https://trafair.eu

SenseBoard is a general and flexible dashboard that can be adapted for the monitoring of any air quality sensor network. The scalability of the dashboard allows replicability in cities of different size with a variable number of sensors. The dashboard is not affected by the type of employed sensors and it can be easily modified to visualize other parameters measured by the sensors. In this paper, we take advantage of the use case in the city of Modena.

The rest of the paper is organised as follows. Section 2 is devoted to the presentation of the background. Then, Section 3 introduces the dashboard and describes some views (data visualization) in the city of Modena. In the end, Section 5 provides conclusions.

=== 2 BACKGROUND ===
TRAFAIR ("Understanding Traffic Flows to Improve Air Quality") [11] is a project co-financed by the European Commission that brings together 10 partners from two European countries (Italy and Spain) to develop innovative and sustainable services combining air quality, weather conditions, and traffic flows data. The scope is to increase the awareness on urban air quality for the benefit of citizens and government decision-makers. The project aims to supervise the level of pollution on urban scale in 6 European cities (Modena, Florence, Pisa and Livorno in Italy, and Santiago de Compostela and Zaragoza in Spain) by producing real-time estimates of air pollution through a network of air quality devices and by developing a service for forecasting urban air quality based on weather forecasts and traffic flows [5].

The sensors employed are low-cost, cheaper and less reliable than the AQM stations managed by the Environmental Agencies. However, these devices can provide reliable measurements if they are co-located for a certain period of time close to the stations where they "learn" how to measure air quality.

A device is a box where different sensors are placed. Each sensor, also called cell, is devoted to the measurement of a specific pollutant. Figure 1 shows an exemplar device.

Figure 1: An air quality device (on the left) and its content inside (on the right): 4 cells/sensors for measuring the level of 4 air pollutants (NO, NO₂, CO, and O₃ in this case).

The approach used in TRAFAIR is to install the devices for a certain period close to the AQM stations. During this period (calibration period) a Machine Learning algorithm is trained on the measurements provided in millivolts by our devices (raw observations) and the AQM station measurements. Then, when environmental experts evaluate that the device is "ready" (usually after 3 weeks of co-location), it can be moved to a different location and start providing air quality measurements. When the device is "ready", thanks to the Machine Learning algorithm previously trained, calibrated data (concentrations) are generated from the raw observations. Usually, every 6 months, the devices are once again co-located near the AQM stations for re-calibration, thus maintaining a good quality of measurements.

In this scenario, it is easy to understand the importance of having a tool for managing the change of location or status of the devices. Besides, it is important to compare at any time the calibrated data with the measurements of the AQM stations (no matter in which location the sensor is), to determine when the device needs to be re-calibrated (usually when the calibrated data and AQM measurements differ a lot).

Since devices are constantly moved, it is possible that, when they are switched off and then switched on in a new location, they experience a "warm-up" for some minutes or even hours. The warm-up is a specific period when the device tries to achieve a thermo-mechanical balance in the measuring system as well as an optimal operating temperature of the electronic components. Since the warm-up period is of variable duration, a timeline of the raw observations enables the user to detect when the warm-up period is over.

Figure 2 shows the raw measurements collected from one device. As can be seen, at approximately 9 a.m. the device was switched off. One hour later, the device was switched on in a new location. The zoomed area of the graph shows the first two measurements for each gas made in the new location. The values drop off sharply due to the warm-up. Consequently, these values have to be discarded from the reliable observations and flagged by the environmental experts as "not reliable". This operation can be made through SenseBoard (this will be described in Section 3). Accordingly, the environmental experts need to constantly visualize the data produced by the device and monitor the behavior of each device.

Figure 2: Raw observations of one air quality device made in two different locations.

The need for a monitoring tool comes from the environmental experts. For this reason, they have been directly involved in the definition of the requirements to be satisfied through the dashboard. After some discussions, 6 requirements have been outlined:

* R1 providing an overview of the current position and status of each sensor,
* R2 recording the location or status change for a certain sensor without hand-writing the SQL query to store the modification on the database,
* R3 visualizing sensor observations without data aggregation to detect anomalous values and compare them with each other, and with data aggregation to better understand the trend,
* R4 comparing observations of co-located sensors in the same place,
* R5 showing the concentrations produced by the calibration algorithm and comparing them with the certified values of the AQM legal stations,
* R6 displaying the anomalies identified by the anomaly detection algorithm to control the efficiency of the automated algorithm.

After the development of SenseBoard, during the usage, additional feedback from environmental experts has been continuously collected to improve the functionalities of the dashboard.

==== 2.1 Air quality sensor network ====
In Modena, an Italian city of 186,000 inhabitants and 183 km², 13 air quality devices (52 low-cost sensors) have been installed in different locations. Two locations have been identified for calibration (the red dots in Figure 3), close to the AQM stations, and 10 locations of interest (the blue dots in Figure 3), which are placed in areas of different kinds, such as residential, industrial, or green areas.

Figure 3: Points of interest for air quality monitoring (blue dots) and positions of the AQM stations (red dots). Map data: Google, 2020.

Two types of low-cost devices have been exploited: 12 Decentlab Air Cubes and 1 Libelium Smart Environment PRO. All the devices are equipped with 4 cells (sensors), one for each gas (NO, NO₂, CO and Oₓ). Each cell measures the gas concentration through 2 channels (the auxiliary and the working channels). In addition, the Libelium device measures the level of PM2.5 and PM10. For each channel, the raw observations are provided in mV; moreover, a basic concentration based on the original factory calibration<sup>2</sup> is provided in μg/m³. Besides, these devices are able to measure the air temperature and humidity, and provide the battery voltage. Therefore, the total number of measurements provided by one device is 19 for the Decentlab cubes and 21 for the Libelium device.

<sup>2</sup> This is obtained by applying to the raw observations a formula provided by the manufacturing company with the calibration parameters that are different for each device.

The sensor data acquisition is managed by the Long Range Wide Area Network (LoRaWAN) implemented in the city of Modena. LoRaWAN [6] is a media access control (MAC) protocol widely used in smart city applications thanks to its easy installation and cost-effectiveness. It employs some gateways, i.e. antennas that receive broadcast messages from the enabled devices (the air quality devices, in our use case) and forward them to the server. The message from one device can be received by several gateways at the same time; the server deals with duplicates. The LoRaWAN network exploits low radio frequencies and provides long-range communications (up to five kilometers in urban areas, and up to 15 kilometers or more in rural areas). The network coverage depends a lot on the geographic landscape. Our air quality devices have been registered to the LoRaWAN network of Modena through their identifier (DevEUI) and following the Over-the-Air Activation (OTAA) process. The data rate has been set up to 125 kHz, and the spreading factor to 7, to allow devices to transmit data every 2 minutes.

Figure 4 shows the data acquisition process. The devices encapsulate the acquired measurements into LoRa packets and send them to the LoRaWAN server through the gateways. Every LoRa packet contains 19 (or 21) measurements (both mVs and basic concentrations for each gas and channel, including air temperature, humidity and battery voltage) and in one day each device produces 13,680 (19 * 720) measurements. Four gateways have been installed in Modena, mainly on the roofs of the highest buildings, to cover the whole urban area of the city and ensure the coverage of the LoRaWAN network in our points of interest. Moreover, the gateways have been registered on the LoRa server. When the gateways receive a message, they send it to the LoRa server where the MQTT (Message Queue Telemetry Transport) Broker Mosquitto [8] is running. This is a publish/subscribe messaging transport protocol. Data are published as an MQTT topic. We have used the paho-mqtt Python library<sup>3</sup> to interact with the open source message broker in a Python script. The script is always running and exploits the client class to enable the connection to the MQTT broker, publish messages, subscribe to topics and receive messages. Then, messages are decoded and measurements stored into a PostgreSQL database, the TRAFAIR database (in the following sub-section, more details are provided). Also, through this script an anomaly detection algorithm is applied to the time series of the air quality measurements to detect if each measurement is anomalous or not. This algorithm employs a majority voting system of three different Machine Learning algorithms. The anomalous data are flagged into the TRAFAIR database. When a device is moved from one location to another, it automatically connects to the nearest gateway and restarts sending messages. Since the messages received in the LoRa Server are described by the identifier of the device, the change of gateway to which the device connects is completely transparent. The LoRa server keeps storing the measurements of each device no matter how they are moving in the urban context.

Figure 4: Sensor data acquisition from the low-cost air quality sensor network.

Figure 5: SenseBoard architecture.

==== 2.2 Data platform ====
Data from the air quality low-cost sensors are stored, in real time, into the TRAFAIR database. This database exploits the PostGIS extension to handle geospatial data and the Timescale extension to make SQL scalable for time-series data.

The database contains more than 60 tables and 190 GB of data collected from the beginning of the TRAFAIR project (November 2018) till now (February 2021). Air quality measurements and device-related information are stored in 11 tables and take 3 GB. These tables store the technical characteristics of each device, its position, its status (running, calibration, offline, broken, warm-up), the raw observations, the concentrations obtained by both the original factory calibration and our calibration algorithm, and the anomalies identified by some anomaly detection algorithms applied to both raw and calibrated observations.
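The daily figure quoted for Section 2.1 (19 * 720 = 13,680 measurements per device) follows from one packet every 2 minutes. The short sketch below reproduces that arithmetic and extends it to the 21-value packets and to the whole network of 12 + 1 devices. The function and constant names are hypothetical, introduced only for this illustration.

```python
# Illustrative arithmetic only; names are hypothetical, figures come from the text.
MINUTES_PER_DAY = 24 * 60


def measurements_per_day(values_per_packet: int, interval_minutes: int = 2) -> int:
    """Single measurements produced by one device in a day."""
    packets_per_day = MINUTES_PER_DAY // interval_minutes  # 720 packets at 2 minutes
    return values_per_packet * packets_per_day


decentlab_daily = measurements_per_day(19)  # packets carrying 19 values
libelium_daily = measurements_per_day(21)   # packets carrying 21 values (adds PM2.5, PM10)

# Network of 12 devices with 19-value packets and 1 device with 21-value packets.
network_daily = 12 * decentlab_daily + 1 * libelium_daily

print(decentlab_daily, libelium_daily, network_daily)
```

Under these assumptions a missing packet is easy to spot: any device reporting fewer than 720 packets in a day has lost data somewhere between the sensor and the database.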
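To make the device bookkeeping of the data platform more concrete, the following sketch keeps location and status changes as timestamped rows and looks up the state valid at a given moment. It is only an illustration: it uses an in-memory SQLite database, whereas the TRAFAIR database is PostgreSQL, and the table name, column names, device identifier and location labels are all hypothetical. Only the list of status values is taken from the text.

```python
import sqlite3

# Status values as listed in the text; everything else here is hypothetical.
STATUSES = ("running", "calibration", "offline", "broken", "warm-up")

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE device_history (
        device_id  TEXT NOT NULL,
        location   TEXT NOT NULL,
        status     TEXT NOT NULL,
        valid_from TEXT NOT NULL   -- ISO 8601 timestamp, sorts chronologically as text
    )
    """
)


def record_change(device_id, location, status, valid_from):
    """Append one location/status change instead of overwriting the old row."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    conn.execute(
        "INSERT INTO device_history VALUES (?, ?, ?, ?)",
        (device_id, location, status, valid_from),
    )


def state_at(device_id, timestamp):
    """Return the (location, status) pair in force at `timestamp`, or None."""
    return conn.execute(
        "SELECT location, status FROM device_history "
        "WHERE device_id = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (device_id, timestamp),
    ).fetchone()


record_change("dev-A", "calibration site", "calibration", "2021-01-01T08:00:00")
record_change("dev-A", "point of interest 3", "warm-up", "2021-01-22T09:00:00")
record_change("dev-A", "point of interest 3", "running", "2021-01-22T11:00:00")

print(state_at("dev-A", "2021-01-22T10:00:00"))
```

Keeping the history append-only means a status such as "broken" can be recorded retroactively from the first abnormal measurement without losing the earlier state.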
<sup>3</sup> https://pypi.org/project/paho-mqtt/

In each moment, every device is described by a status and is located in a point of interest (see Figure 3). Its measurements are stored continuously, as soon as they are parsed by the LoRa server. Each raw measurement can be calibrated by multiple calibration algorithms. Thus, calibrated data are identified not only by the date of the measurement and the sensor that has provided it, but also by the algorithm that was used. In the end, several anomaly detection algorithms are applied to both raw and calibrated data. The results are stored in the TRAFAIR database in appropriate tables using boolean values to indicate if they are anomalous or not.

Only considering the measurements coming from our devices, from the installation, we have collected 3.3 million records of measurements (1.8 GB). Each record includes 19/21 measurements: air temperature, humidity, battery voltage, 8 raw measurements (2 channels per 4 gases), 8/10 concentrations of the original factory calibration (2 channels per 4 gases and one measure each for PM2.5 and PM10).

=== 3 SENSEBOARD ===
SenseBoard<sup>4</sup> is a Python web application which exploits Tornado<sup>5</sup> as web framework. It runs on a Debian 9 machine with 32 Intel(R) Xeon(R) Silver 4108 CPU at 1.80GHz processors and 256 GB RAM. Figure 5 shows the architecture of the dashboard. Firstly, users need to log in to access the dashboard. The authentication phase is performed through the Lightweight Directory Access Protocol (LDAP). The list of people allowed to access is currently limited to the environmental experts working in TRAFAIR.

After the authentication, the user is able to visualize the current status of each device and send other requests through the navigation bar at the top: he/she can ask for observations (raw measurements), anomalies, calibration (calibrated measurements), and AQM station (measurements from the AQM stations). For each request, the dashboard queries the TRAFAIR database to obtain the appropriate data and creates plots of the time series data by using the matplotlib Python library<sup>6</sup>. More complex plots are periodically generated by ad-hoc Python scripts<sup>7</sup> which query the database and save plots in the file system as html files through the save_html function of the mpld3 library<sup>8</sup>. This library is also exploited for the InteractiveLegendPlugin<sup>9</sup>, which allows connecting the plot to an interactive legend. This legend is very useful in our plots since it allows customizing the visualization by adding or removing some lines in the plot. The user can click on the rectangle generated in the legend near the labels. If the rectangle is colored, the corresponding data is shown on the plot; if the rectangle is white, these data are removed from the plot. The html files are, then, included in the html page of the corresponding request. What we mean with “more complex plots” are the ones which require an elaboration of the data stored in the database and manage a big amount of data (i.e. the raw observations of each sensor related to one month). This choice was made to save time in the visualization of the plots. Indeed, this solution decreases the server response time by 35 seconds for the most time-consuming request.

<sup>4</sup> https://trafair-srv.ing.unimo.it/aqsensors

<sup>5</sup> https://www.tornadoweb.org/

<sup>6</sup> https://matplotlib.org/

<sup>7</sup> The scripts run every 2 minutes and generate the plots in 4-27 seconds.

<sup>8</sup> https://mpld3.github.io/

<sup>9</sup> https://mpld3.github.io/examples/interactive_legend.html

Some examples of visualizations (views) are described in Section 3.2. The views are static to allow users to navigate and explore all the plots in the view without any interference. However, the user can click on the “update” button to see the updated views.

==== 3.1 Users and scope ====
The scope of SenseBoard is the monitoring and control of the air quality sensor network and the supervision of the calibration and anomaly detection processes.

Regarding the monitoring of the network, SenseBoard allows users to identify and update the status of the sensors, change their location when they are moved to a different position, and perform any maintenance, if necessary.

Regarding the supervision of the calibration process, SenseBoard makes it possible to compare raw measurements of co-located sensors, raw and calibrated measurements of the sensors, and, in particular, the calibrated observations generated during the calibration period with the legal observations from the AQM stations. This last operation is the crucial one in the calibration process because it allows experts to understand if the training period of the Machine Learning algorithm is sufficient, i.e. if the concentrations elaborated by the Machine Learning algorithm are in line with those of the AQM station.

Other tasks are the detection of issues in the network communication, the discovery of disruptions or failures in the sensor’s behaviour, the identification of anomalous gas concentrations, the comparison of co-located sensors measurements, and the correlation study of the pollution level in the area of sensor installation.

The primary users of our visual analytic dashboard are the environmental experts in charge of installation, maintenance and calibration of air quality sensors.

==== 3.2 Views ====
In SenseBoard, we have developed 6 views to allow environmental experts to have complete control of the air quality sensor network status and the operations that are performed on the sensor data. Each view is described in detail in the following sub-sections.

===== 3.2.1 Sensor status and position =====
The first view, i.e. the homepage of the dashboard after the login, aims at satisfying requirements R1 and R2. Here, users are able to visualize a table with a summary of the main information related to the air quality devices. For each device, the table lists its identifier, the name of the location where the device is currently installed, the timestamp of the installation, the name of the person in charge of the installation, the sensor status and any possible notes.

Besides, as shown in Figure 6, for each device, two buttons are available: the “edit” button allows updating the location and/or the status of the corresponding device. After clicking on the button, the user has to specify the timestamp representing the instant of the update (of the location or status), the location (one of the points of interest in Figure 3), the status, and, optionally, its name and notes. The “save” button stores the information in the TRAFAIR database. The status update is exploited in different situations. For example, if the device is moved from a point of interest to the AQM station, its status changes from “running” to “calibration”. In addition, if an environmental expert notices an abnormal behavior of the device, he/she can modify the status to “broken”, indicating as timestamp the date of the first abnormal measurement. Then, he/she needs to add the “running” status from the first regular measurement.

Figure 6: “Sensor status and position” view.

In addition to the “edit” button, the “check data” button connects to the “sensor observations” view.

Besides, in this view, the user can interact with a map (Figure 7), where the current position of each device is visualized with an icon of different colors according to the status of the device. If more devices are in the same location, a bigger icon is displayed on the map with the number of devices in that position. By clicking on this icon, an icon for each device is visualized. If you click on the icon of a device, you can see its name, its status, the name of the location, and the link to the “sensor observations” view of that specific sensor. Folium<sup>10</sup> is the Python library used to create the map.

<sup>10</sup> https://pypi.org/project/folium/0.1.5/

Figure 7: Position of the devices on January 4th, 2021.

===== 3.2.2 Sensor observations =====
The “sensor observations” view satisfies requirement R3 and includes 6 plots with the observations of one device. At the top of the page, the name of the device, its status and location, and the timestamp of the last observation with the level of battery voltage are reported. This allows the managers of the sensor network to check immediately if the sensor is not sending data or if the batteries need to be changed.

The 6 plots show the measurements of:
# the relative humidity in percentage (%),
# the temperature in degrees Celsius,
# the battery voltage in volts (V),
# the raw observations of the 4 gases for auxiliary and working channels in mV,
# the observations of the 4 gases for auxiliary and working channels calibrated through the original factory calibration in μg/m³,
# the observations calibrated through the TRAFAIR calibration algorithms in μg/m³.

Only for the Libelium device another plot is provided, which shows the level of PM2.5 and PM10.

Each plot can visualize data for different time intervals (last 24 hours, week, or month) and data aggregations (2 minutes, which means no aggregation, 5 minutes, and 15 minutes), generating 9 different combinations for each plot. The visualization changes according to the option selected by the user.

There are altogether 711 plots (13 devices * 6 plots * 3 time interval * 3 data aggregation + 1 PM plot * 3 time interval * 3 data aggregation). Since the creation of a plot took on average 12 seconds, we decided to generate these plots asynchronously through one Python script. This means that the plots are generated independently of the user's choice, and when the user selects an option (for time interval and data aggregation), he/she enables the visualization of a ready-made plot. This time-saving design choice is also motivated by the user behavior. Three months after the first release of SenseBoard, we noticed that it was very likely that the user is interested in visualizing several plots, exploring different gases with different aggregations or for a different time interval. If the plots are created synchronously with the user’s choice, jumping from one plot to another requires waiting for the generation of the relative plot each time. In agreement with environmental experts, we have therefore decided to switch to an asynchronous generation of the plots that reloads the 711 plots every 2 minutes.

Figures 8 and 9 are two examples of visualization available in the “sensor observations” view. In Figure 8 the measurements of the 4 gases for the auxiliary and working channels related to one device are plotted in a line chart. An anomalous behavior of the device has been highlighted in red: the values of the measurements in that time interval are very different from the previous ones. SenseBoard allows the detection of the wrong data. After the maintenance work by the environmental experts, the device reaches stability and the measurements proceed with the expected values. Through the “edit” button of the “sensor status and position” view, the time period related to the red area is flagged with the “broken” status. Figure 9 highlights an abnormal behavior of another device. In this case, the anomalous measurements are due to a drastic reduction in the battery level. At approximately 2 a.m., the battery died and the device stopped sending data. After changing the battery, at 10 a.m., the device restarted providing reliable measurements.

Figure 8: An anomalous behavior of a device detected on January 13th, 2021.

Figure 9: An anomalous behavior of a device detected on January 15th, 2021, due to a drastic reduction in the battery level.

===== 3.2.3 Gas observations =====
In the “gas observations” view, a plot for each gas and channel is generated, as shown in Figure 10 for NO. This view meets requirements R3 and R4. The user can choose to visualize the data of the last 24 hours, week or month. The visualization could seem confusing; however, the user is able to hide one or more lines in the plot thanks to the interactive legend, and zoom in on a specific area of the plot. In the web page, the plots related to the two channels of the same gas are placed next to each other (as in Figure 10) to facilitate the comparison of these measurements and detect the correlation between the two channels. Thanks to this view, the behavior of the cells can be regularly checked and the maintenance planned.

Figure 10: A visualization of the “gas observations” view which shows the measurements of NO channels.

===== 3.2.4 Sensor anomalies =====
The accuracy of the raw measurements can be influenced by multiple factors, e.g. the low level of battery voltage, the weather conditions, the air humidity. Distinguishing incorrect data allows for providing more reliable data and could improve the results of the calibration task.

We have implemented a majority voting system which combines 3 classifiers: (1) the Sliding Window anomaly detection which considers the consecutive measurements and the IQR to find anomalies far from the normal behavior of the system, (2) the FFIDCAD (Forgetting Factor Iterative Data Capture Anomaly Detection) which is an iterative algorithm, and (3) an algorithm based on the correlation between the values of each gas (NO, NO₂, CO and O₃) and the measurements of air temperature and humidity. Every time a new measurement is done by a sensor, just after storing the measurement into the TRAFAIR database, the three classifiers are applied to the measurements.

The search for anomalous data is performed on both channels of each pollutant and device independently, since each device is individual and performs differently from the other devices even if they are in the same location.

The “sensor anomalies” view consists of one plot for each sensor with the raw observations and the anomalies identified by the majority voting system (requirement R6). Also in this case, the user can choose the observations of the last 24 hours, week, or month.

Figure 11 is an example of anomalies visualization for sensor 4006. Anomalies are identified by a point. As can be seen in the figure, in most cases anomalies are detected in the upper peaks of the time series.

Figure 11: Anomalies of sensor 4006 for the last 24 hours (A), the last week (B), and the last month (C) available on January 4th, 2021 at 11 a.m.

===== 3.2.5 Calibrated observations =====
The results of the calibration process consist of the concentrations of the 4 measured gases. Starting from 2 values for each gas (one value for each of the two channels) in millivolts, the calibration provides one value in μg/m³. Currently, we are using Random Forest to calibrate our data. However, this algorithm can be improved over time since more and more data are collected and they are used to re-train the calibration algorithm.

The “calibrated observations” view shows the result of the last calibration algorithm, that is the most recent and accurate algorithm available for the visualized data. This view meets requirement R5. The calibrated observations are organized in 4 plots, one for each gas, and the user can distinguish the measurements of each device through the integration of the interactive legend.

The calibrated values can be directly compared with the measurements of the AQM stations since they are in the same unit of measure. To validate our calibrated data we have defined one local warning threshold for each gas based on the measurements of the AQM stations. Each threshold has been calculated as 1.25 × M, where M is the maximum value measured by the AQM stations for the specific gas in the year preceding the date of the observation to be compared. If the concentration of the gas is higher than the corresponding threshold, it is automatically flagged as “anomalous” in the TRAFAIR database by a Python process running in real time. The warning threshold is valid only in the area of Modena since it is provided by the certified values of the AQM stations of Modena and it changes every year. This threshold allows excluding very high values that are most likely due to malfunction of the sensor. It is not to be confused with the alert thresholds of the European Commission<sup>11</sup> or the reference levels of the European Environment Agency<sup>12</sup>, which define the values to assess the level of pollution in the area.

<sup>11</sup> https://ec.europa.eu/environment/legal/law/5/e_learning/module_2_18.htm

<sup>12</sup> https://www.eea.europa.eu/themes/air/air-quality/resources/air-quality-map-thresholds

In the plots of the “calibrated observations” view, a line in correspondence of the threshold value is plotted only if at least one measurement exceeds the threshold. The plots in Figure 12 show the measurements of NO made by 5 different devices installed in the same location named “Parco Ferrari” (this is also the location of an AQM station). We have selected only the devices in the same location through the interactive legend. The concentrations measured by the devices are very similar, as we expected. In the “last month” plot the blue line indicates the above mentioned local warning threshold and only one value is higher than this threshold.

Figure 12: Calibrated NO observations by 5 devices located in the same place (“Parco Ferrari”) visualized on January 4th, 2021 at 4 p.m.

===== 3.2.6 Certified AQM station measurements =====
The sixth view of SenseBoard is devoted to the visualization of AQM station observations. They are hourly certified data related to the concentrations of NO, NO₂, NOₓ, and O₃ measured by the two AQM stations installed in Modena (red points in Figure 3).

=== 4 EXPERT EVALUATION ===
SenseBoard has been regularly used by 4 environmental experts from January 2020 till now and it is still active. It has allowed:
# the recording of 250 location/status updates,
# the identification of network malfunctions in real time (which occurred twice in the last year and caused the loss of 1-2 days of data),
# the detection of sensor faults in semi-real time and anomalous cell behaviour (which occurred 4 times and led to the cell replacement),
# the identification of low battery level which caused anomalous observations (33 times in around 14 months),
# the daily comparison of concentrations from low-cost sensors and certified measurements from AQM stations to evaluate the calibration algorithm,
# the detection of strange behaviour in the anomaly detection process, which made it possible to retrain the algorithm and restart it.

The effectiveness of SenseBoard was widely appreciated by environmental engineers who would not have had the opportunity to compare sensor measurements and calibrations and to carry out such sudden checks and maintenance.

[...] allow the creation of custom plots, starting from the selection of one or more sensors, pollutants, and AQM stations, and the time interval. This will allow for further data comparison.

=== ACKNOWLEDGMENTS ===
Research reported in this paper was partially supported by the TRAFAIR project 2017-EU-IA-0167, co-financed by the Connecting Europe Facility of the European Union. The views and conclusions contained in this document are those of the authors and
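The count of 711 pre-generated plots in the “sensor observations” view can be checked by enumerating the combinations that the asynchronous script has to refresh. The sketch below is only an illustration of that bookkeeping: the device identifiers, plot labels and the choice of which device carries the PM plot are placeholders, not names used by SenseBoard.

```python
from itertools import product

# Placeholder identifiers; only the counts (13, 6, 3, 3) come from the text.
DEVICES = [f"device-{i}" for i in range(1, 14)]
PLOTS = ["humidity", "temperature", "battery", "raw_mv", "factory", "calibrated"]
INTERVALS = ["24h", "week", "month"]
AGGREGATIONS_MIN = [2, 5, 15]  # 2 minutes means no aggregation


def plot_jobs(pm_devices=("device-13",)):
    """List every (device, plot, interval, aggregation) combination to pre-render."""
    jobs = list(product(DEVICES, PLOTS, INTERVALS, AGGREGATIONS_MIN))
    # One extra PM plot for the single device that measures PM2.5 and PM10.
    jobs += list(product(pm_devices, ["pm"], INTERVALS, AGGREGATIONS_MIN))
    return jobs


jobs = plot_jobs()
print(len(jobs))
```

Enumerating the jobs up front is what lets a background script render them on a fixed schedule, so that a user request only has to pick a ready-made file.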
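The majority voting scheme of the “sensor anomalies” section can be sketched as follows. This is a minimal illustration under stated assumptions, not the TRAFAIR code: the text does not give the window length or the outlier rule, so the 5-sample window and the conventional 1.5 × IQR fences are assumed here. The FFIDCAD and correlation-based classifiers are not implemented; their verdicts are replaced by made-up boolean lists.

```python
from statistics import quantiles


def iqr_window_flags(values, window=5, k=1.5):
    """Flag values outside the IQR fences of the preceding `window` samples."""
    flags = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 4:  # too little history to estimate quartiles
            flags.append(False)
            continue
        q1, _, q3 = quantiles(history, n=4)
        spread = q3 - q1
        flags.append(value < q1 - k * spread or value > q3 + k * spread)
    return flags


def majority_vote(*flag_lists):
    """A sample is anomalous when more than half of the classifiers flag it."""
    return [sum(votes) > len(votes) / 2 for votes in zip(*flag_lists)]


raw_mv = [10, 11, 10, 12, 11, 10, 50, 11]      # made-up readings with one spike
window_flags = iqr_window_flags(raw_mv)
other_a = [False] * 6 + [True, False]          # stand-in for the FFIDCAD verdicts
other_b = [False] * 5 + [True, False, False]   # stand-in for the correlation-based verdicts
anomalous = majority_vote(window_flags, other_a, other_b)
print(anomalous)
```

In this toy series only the spike is confirmed by two of the three voters, while the sample flagged by a single voter is not marked as anomalous, which is the point of combining the classifiers.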
should not be interpreted as representing the official policies, ei- ther expressed or implied, of EU Commission. The authors would like to thank the City of Modena that contributes to the deploy- 5 CONCLUSION ment of the LoRa network, and the LARMA research group for SenseBoard is a data visualization and management platform providing requirements and feedback on SenseBoard. Moreover, for air quality sensors. It is a flexible tool that can be integrated a special thank goes to ARPAE that shared real-time air quality into specific IoT environments. In this paper, architecture, users, observations used for the calibration of the low-cost air quality scope, and exemplar views have been presented. Moreover, details sensors. on the sensor data acquisition and storage processes have been given. REFERENCES SenseBoard is a multi-purpose tool: to manage and maintain [1] European Environment Agency. 2020. Air quality in Europe — 2020 the air quality sensor network control and to supervise the calibra- report. Issue 9. https://doi.org/10.2800/786656 Available at https://www.eea.europa.eu//publications/air-quality-in-europe-2020-report. tion process and the identification of anomalies. The management [2] United Nations General Assembly. 2015. Transforming our world: The 2030 of the network requires the deploy and frequent re-allocation of Agenda for Sustainable Development. Available at http://www.un.org/ga/ devices close to the AQM stations or in specific points of interests. search/view_doc.asp?symbol=A/RES/70/1&Lang=E. [3] Chiara Bachechi, Federica Rollo, Federico Desimoni, and Laura Po. 2020. Us- Data coming in real-time from the sensors need to be constantly ing Real Sensors Data to Calibrate a Traffic Model for the City of Modena. monitored by experts in order to control the normal functioning In Intelligent Human Systems Integration 2020, Tareq Ahram, Waldemar Kar- of sensors. wowski, Alberto Vergnano, Francesco Leali, and Redha Taiar (Eds.). 
Springer International Publishing, Cham, 468–473. The dashboard integrates a big amount of heterogeneous data, [4] Titus Balan, Catalin Dumitru, Gabriela Dudnik, Enrico Alessi, Suzanne Lesecq, both geo-spatial and time series data. The position of each sensor Marc Correvon, Fabio Passaniti, and Antonella Licciardello. 2020. Smart Multi- Sensor Platform for Analytics and Social Decision Support in Agriculture. is visualized in an interactive map. The measurements of the Sensors 20, 15 (2020), 4127. https://doi.org/10.3390/s20154127 sensors have been plotted in different line charts with mainly [5] A. Bigi, G. Veratti, S. Fabbi, L. Po, and G. Ghermandi. 2019. Forecast two types of visualization: the same air pollutant measured by all of the impact by local emissions at an urban micro scale by the com- bination of Lagrangian modelling and low cost sensing technology: The the sensors in the same plot, and all the air pollutants measured TRAFAIR project. 19th International Conference on Harmonisation within by the same sensor in the same plot. Besides, anomalous data are Atmospheric Dispersion Modelling for Regulatory Purposes, Harmo 2019 highlighted in other plots. The visualization of such an amount (2019). https://www.scopus.com/inward/record.uri?eid=2-s2.0-85084160462& partnerID=40&md5=3b37303a7af769206777d87ab30a2541 cited By 2. of plots is speed up by the use of Python scripts which generate [6] Mehmet Ali Ertürk, Muhammed Ali Aydin, M. Talha Buyukakkaslar, and the plots asynchronously and independently by SenseBoard. Hayrettin Evirgen. 2019. A Survey on LoRaWAN Architecture, Protocol and Technologies. Future Internet 11, 10 (2019), 216. https://doi.org/10.3390/ The dashboard is accessible anywhere and anytime to allow a fi11100216 constant monitoring of the network. Besides, it can be generalized [7] N. Kumar, D. Acharya, and D. Lohani. 2021. An IoT-Based Vehicle Accident to visualize other kinds of geo-spatial and time series data. 
Indeed, Detection and Classification System Using Sensor Fusion. IEEE Internet of Things Journal 8, 2 (2021), 869–880. https://doi.org/10.1109/JIOT.2020.3008896 the dashboard is not affected by the type of sensors employed in [8] Xiangtao Liu, Tianle Zhang, Ning Hu, Peng Zhang, and Yu Zhang. 2020. The the network (also in our case we integrate two different types of method of Internet of Things access and network communication based on sensors) and can be easily adapted to monitor other pollutants MQTT. Computer Communications 153 (2020), 169 – 176. https://doi.org/10. 1016/j.comcom.2020.01.044 beyond the ones described in our use case. The flexibility and [9] Luis F. Luque-Vega, David A. Michel-Torres, Emmanuel López-Neri, Miriam A. scalability of SenseBoard allow to monitor networks of a variable Carlos-Mancilla, and Luis Enrique González Jiménez. 2020. IoT Smart Parking System Based on the Visual-Aided Smart Vehicle Presence Sensor: SPIN-V. number of sensors in cities of different sizes. In addition, in our Sensors 20, 5 (2020), 1476. https://doi.org/10.3390/s20051476 use case we manage a dynamic sensor network since the sensors [10] L. Po, F. Rollo, C. Bachechi, and A. Corni. 2019. From Sensors Data to Urban are moved frequently. However, this is an additional issue, and the Traffic Flow Analysis. In 2019 IEEE International Smart Cities Conference (ISC2). IEEE, Casablanca, Morocco, 478–485. https://doi.org/10.1109/ISC246665.2019. dashboard works also with static sensor networks. SenseBoard 9071639 can be adapted to query a different data platform which can be a [11] L. Po, F. Rollo, J. R. R. Viqueira, R. T. Lado, A. Bigi, J. C. López, M. Paolucci, and PostgreSQL database or a data model of different type. Queries P. Nesi. 2019. TRAFAIR: Understanding Traffic Flow to Improve Air Quality. In 2019 IEEE International Smart Cities Conference (ISC2). IEEE, Casablanca, and plots can be easily modified to visualize data in another way Morocco, 36–43. 
https://doi.org/10.1109/ISC246665.2019.9071661 or to show additional data that are not included in our use case. [12] N. Shivaraman, S. Saki, Z. Liu, S. Ramanathan, A. Easwaran, and S. Steinhorst. 2020. Real-Time Energy Monitoring in IoT-enabled Mobile Devices. In 2020 De- SenseBoard has been developed according to the technical sign, Automation Test in Europe Conference Exhibition (DATE). IEEE, Grenoble, requirements provided by the environmental experts. Thus, it is France, 991–994. https://doi.org/10.23919/DATE48585.2020.9116577 not comparable with the dashboards developed for citizens and [13] D. Zhang and S. S. Woo. 2020. Real Time Localized Air Quality Monitoring and Prediction Through Mobile and Fixed IoT Sensing Network. IEEE Access public administrations. Indeed, the scope of these dashboards 8 (2020), 89584–89594. https://doi.org/10.1109/ACCESS.2020.2993547 is not the monitoring of the sensor network, but the provision of pollution levels to raise awareness among people about the situation in their city. As future work, we will compare Sense- Board with the technical tools provided by the air quality sensor suppliers. In addition, we will integrate an additional view to
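As an illustration of two rules described in Sections 3.2.4 and 3.2.5, the following minimal Python sketch computes the local warning threshold 1.25 × M and combines classifier votes by majority. It is not the TRAFAIR implementation: the function names, the 365-day reading of "the year preceding the date of the observation", and the sample readings are assumptions made only for this example.

```python
from datetime import datetime, timedelta
from typing import Iterable, Sequence, Tuple

THRESHOLD_FACTOR = 1.25  # multiplier stated in Section 3.2.5


def local_warning_threshold(
    station_readings: Iterable[Tuple[datetime, float]],
    observed_at: datetime,
) -> float:
    """Return 1.25 * M, with M the maximum AQM value in the preceding year.

    Assumption: "preceding year" is taken as the 365 days before observed_at.
    """
    window_start = observed_at - timedelta(days=365)
    in_window = [
        value
        for timestamp, value in station_readings
        if window_start <= timestamp < observed_at
    ]
    if not in_window:
        raise ValueError("no AQM station readings in the preceding year")
    return THRESHOLD_FACTOR * max(in_window)


def exceeds_threshold(concentration: float, threshold: float) -> bool:
    """A calibrated value is flagged only when strictly above the threshold."""
    return concentration > threshold


def majority_vote(votes: Sequence[bool]) -> bool:
    """True when more than half of the classifiers report an anomaly."""
    return 2 * sum(votes) > len(votes)


if __name__ == "__main__":
    # Hypothetical AQM readings (timestamp, concentration in µg/m³).
    readings = [
        (datetime(2020, 3, 1, 8), 80.0),
        (datetime(2020, 11, 20, 18), 120.0),
        (datetime(2019, 1, 5, 9), 300.0),  # older than one year: ignored
    ]
    threshold = local_warning_threshold(readings, datetime(2021, 1, 4, 16))
    print(threshold)                            # 150.0
    print(exceeds_threshold(160.0, threshold))  # True
    print(majority_vote([True, False, True]))   # True
```

With three classifiers, as in the paper, the majority rule flags a measurement when at least two of them agree. The three classifiers themselves (sliding window with IQR, FFIDCAD, and the correlation-based one) are not sketched here, since the text does not give their parameters.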