OGC SWE-based Data Acquisition System Development for EGIM on EMSODEV EU Project

Daniel M. Toma, Joaquin del Rio, Javier Cadena, Ikram Bghiel, Enoc Martínez, Marc Nogueras
Universitat Politecnica de Barcelona, UPC-SARTI
Barcelona, Spain
Rambla Exposicion 24, 08800

Óscar Garcia, Juanjo Dañobeitia, Jordi Sorribas, Raquel Casas
Unidad de Tecnología Marina-CSIC
Barcelona, Spain
P. Marítimo de la Barceloneta 37-49, 08003

Jaume Piera, Rafael Bartolome, Raúl Bardaji
Instituto de Ciencias del Mar-CSIC
Barcelona, Spain
P. Marítimo de la Barceloneta 37-49, 08003

Abstract: The EMSODEV [1] (European Multidisciplinary Seafloor and water column Observatory DEVelopment) is an EU project whose general objective is to set up the full implementation and operation of the EMSO distributed Research Infrastructure (RI), through the development, testing and deployment of an EMSO Generic Instrument Module (EGIM). This research infrastructure will provide accurate records on marine environmental changes from distributed local nodes around Europe. These observations are critical to respond accurately to social and scientific challenges such as climate change, changes in marine ecosystems, and marine hazards. In this paper we present the design and development of the EGIM data acquisition system. EGIM is able to operate on any EMSO node: mooring line, sea bed station (cabled or non-cabled) and surface buoy. In fact, a central function of EGIM within the EMSO infrastructure is to have a number of ocean locations where the same set of core variables is measured homogeneously: using the same hardware, same sensor references, same qualification methods, same calibration methods, same data format and access, and same maintenance procedures.

Keywords: EMSO; data acquisition; EMSODEV; EGIM; OGC; SOS; SE; SWE; Sensor; Zabbix.

I. INTRODUCTION

The general objective of the EMSODEV project is to implement an EMSO Generic Instrument Module (EGIM) within the EMSO (European Multidisciplinary Seafloor and water column Observatory). EMSO is a distributed infrastructure of strategically placed deep sea and water column observatory nodes with the essential scientific objective of real-time, long-term monitoring of environmental processes related to the interaction between the geosphere, biosphere, and hydrosphere. The scientific drivers for developing and deploying the EGIM across a set of observatories in European Seas are manifold, spanning requirements to collect observations for understanding climate change, marine ecosystems, and geo-hazard early warning research. As illustrated in figure 1, the EGIM will utilize a comprehensive set of sensors and devices that meet particular technology readiness thresholds to collect observations including temperature, pressure, salinity, dissolved oxygen, turbidity, chlorophyll fluorescence, currents, and passive acoustics.

Fig. 1. EGIM prototype components

Relatively novel sensors will also be considered, including those for pH, pCO2, and nutrients. Overall, this system will address the fullest possible set of Essential Climate Variables (e.g. from the WMO's GCOS-Global Climate Observing System program; www.wmo.int) at EMSO nodes.

Table 1. Core variables captured by the EGIM - EMSO Generic Instrument Module and their cross-disciplinary application

Variable            Geosciences   Physical Oceanography   Biogeochemistry   Marine Ecology
Temperature         X             X                       X                 X
Conductivity        X             X                       X                 X
Pressure            X             X                       X                 X
Dissolved O2        X             X                       X                 X
Turbidity           X             X                       X                 X
Ocean currents      X             X                       X                 X
Passive acoustics   X X (only two disciplines are marked; their columns cannot be recovered from the source layout)

EMSODEV, by means of EGIM, will provide unprecedented support for full standardization across EMSO. This is key to understanding regional scale phenomena. Data will be made coherent and attractive for the modeling community and for other potential stakeholders, as shown in table 1. An open data policy has already been adopted in compliance with the recommendations being developed within the GEOSS initiative (The Global Earth Observation System of Systems). This allows the shared use of the data infrastructure and the free exchange of scientific information and knowledge.

Our contribution to the implementation of the EGIM data acquisition system module (WP4 of the EMSODEV project) focuses on the development of generic software for sensor web enablement. Through this generic software, the EGIM status data is directly inserted into a centralized SOS (Sensor Observation Service) server [2] and into a laboratory monitor system (LabMonitor) for recording events and alarms. Moreover, the software will be able to detect, register and start the data acquisition from any new sensors connected to EGIM. Based on this development, the project will set up a data management system enabling sensor management and data analysis compliant with the requirements of EU and international initiatives (e.g. EMODNET, GEOSS), and a state-of-the-art ICT (Information and Communications Technology) user environment.

Fig. 2. EGIM diagram showing SOS Gateway relations

As shown in figure 2, the generic software for Sensor Web Enablement with the SOS server is located between the data source (EGIM) and the data management system in the EMSO Cyberinfrastructure (CI). The generic software for Sensor Web Enablement has two main functionalities. The first is to guarantee that the data is recorded properly from the EGIM hardware. The second is to register and insert the recorded data into a standardized Open Geospatial Consortium (OGC) SWE SOS [2] that works as a gateway for the EMSO data management system.

II. SYSTEM OVERVIEW

The hardware required by the components of the EGIM acquisition system, the SOS server and the laboratory monitor system has been implemented by virtualizing the hardware of the whole system, i.e. by generating three virtual machines ('Mussel', 'SeaShell' and 'Donax'), one for each separate role. Each virtual machine has been configured with the necessary resources (database container, web server, VPN client ...), providing the necessary interfaces to communicate with the other hosts as shown in figure 3.

Fig. 3. General Server and Services Layout

III. SOS GATEWAY INTERFACE

We analyzed the current state-of-the-art SOS implementations and decided to use the 52°North SOS 2.0 implementation to meet the SOS Gateway requirement. The 52°North SOS 2.0 is able to aggregate readings from live, in-situ and remote sensors. The service makes both sensors and sensor data archives accessible through an interoperable web-based interface, using SensorML and Observations and Measurements (O&M). The main SOS 2.0 operations offered with this implementation are:

A. Core Extension
• GetCapabilities, for requesting a self-description of the service
• GetObservation, for requesting the pure sensor data encoded in Observations & Measurements 2.0 (O&M)
• DescribeSensor, for requesting information about a certain sensor, encoded in a Sensor Model Language 1.0.1 (SensorML) instance document.

B. Enhanced Extension
• GetFeatureOfInterest, for requesting the GML 3.2.1 encoded representation of the feature that is the target of the observation.
• GetObservationById, for requesting the pure sensor data for a specific observation identifier

C. Transactional Extension
• InsertSensor, for publishing new sensors
• UpdateSensorDescription, for updating the description of a sensor
• DeleteSensor, for deleting a sensor
• InsertObservation, for publishing observations for registered sensors

D. Result Handling Extension
• InsertResultTemplate, for inserting a result template into a SOS server that describes the structure of the values of an InsertResult or GetResult request.
• InsertResult, for uploading raw values according to the structure and encoding defined in the InsertResultTemplate request
• GetResultTemplate, for getting the result structure and encoding for specific parameter constellations
• GetResult, for getting the raw data for specific parameter constellations.

For this deployment we use the 'Mussel' virtual machine. We deployed the 52°North Web applications on a Tomcat Web server container, version 7, configured with a Postgres database container. Some other small configurations for adapting the application to our domain have been implemented. In order to attend to client requests, we installed 52°North's Helgoland Web client application [3] [4] to visualize the real-time and historical data using the SOS Gateway, as illustrated in figure 4. This web application has been opened for access from outside the local network.

Fig. 4. Visualization of EGIM status data on 52°North Helgoland SOS Client

IV. ACQUISITION SERVICES

The acquisition environment has been deployed on the 'SeaShell' virtual machine. On this server all the elements are configured to meet the acquisition requirements. These requirements include the processes that acquire the data from the sensors connected to EGIM (the so-called acquisition agent) and that send the observations to the SOS Gateway and the Lab Monitor systems.

Fig. 5. EGIM functional description

To acquire data from the EGIM system, we need to distinguish between two kinds of reading procedures. First, there is a reading procedure for the external sensors connected to the EGIM system. The EGIM has the capability to host up to 12 sensors. Seven of these sensors are generic, as shown in figure 1. The five additional sensors provide additional 'essential ocean variables', including chl-a, pCO2, pH, and photographic/video images.
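The two reading procedures distinguished in this section, a TCP socket connection for the external sensors and UDP packets on a specific port for the EGIM internal sensors, can be sketched as follows. This is only a minimal illustration, not the project's acquisition agent: the function names, the newline-terminated ASCII record format and the buffer sizes are assumptions that do not come from the paper.

```python
import socket


def read_external_sensor(host: str, port: int, timeout: float = 5.0) -> str:
    """Pull one record from an external sensor over a TCP socket.

    Assumption: the sensor emits newline-terminated ASCII records.
    """
    buf = bytearray()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while b"\n" not in buf:
            chunk = conn.recv(1024)
            if not chunk:  # the sensor closed the connection
                break
            buf.extend(chunk)
    first_line = bytes(buf).split(b"\n", 1)[0]
    return first_line.decode("ascii", errors="replace").strip()


def open_status_listener(port: int, host: str = "0.0.0.0",
                         timeout: float = 5.0) -> socket.socket:
    """Bind a UDP socket on the port where internal status packets arrive."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.bind((host, port))
    return sock


def read_internal_status(sock: socket.socket, max_size: int = 2048) -> bytes:
    """Return the raw payload of the next UDP status packet."""
    payload, _sender = sock.recvfrom(max_size)
    return payload
```

The TCP reader opens and closes a connection per record, which is the simplest pull-style arrangement; an agent that keeps a long-lived connection per sensor would hold the socket open instead. The UDP payload is returned undecoded because the paper does not describe the layout of the EGIM status packet.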
As illustrated in figure 5, the reading procedure for the external sensors is based on TCP socket connections. The second reading procedure covers the EGIM internal sensors and is based on reading UDP packets on a specific port. Once the agent has read these two kinds of data, we use a 'proxy SOS' tool that automatically executes all the data insert operations between the acquisition agent and the SOS server. This tool registers any new sensors connected to EGIM and sends an InsertResult request for each new datum acquired from EGIM. Moreover, the acquisition agent generates JSON requests to the Zabbix server [5], in order to add these values to the Lab Monitor's database.

Fig. 6. Acquisition Components Overview

We have identified several categories of data shared between EGIM and CI. The following defines each one:

• Component descriptive data: Description of the platform/instrument configuration including instrument types, serial numbers, position of the deployment, calibration parameters.
• Command data: Commands and associated attributes such as when a command is scheduled to be executed.
• Instrument data: Data produced by the platform instruments, associated time tags, and attributes identifying the specific source instrument.
• Engineering data: Data describing the operational status of the system components.
• Metadata: Data describing the data. Metadata are data describing a resource, like an instrument or an information resource.

In order to provide the description of all these categories of data, we use the SensorML 2.0 standard. SensorML supports the ability to describe the components and encoding of real-time data streams, and to provide a link to the data stream itself [6]. This allows one to connect to a real-time data stream directly from a SensorML description and to use a generic data reader to parse the data stream. A data stream into or out of a process (or sensor/actuator) is described by declaring the input or output to be of type DataInterface. The DataInterface element allows one to describe the DataStream, and it also provides an optional interface description.

The acquisition agent (the SWE agent) reads and decodes the SensorML description file, which is encoded in EXI format. It uses the decoded information to autoconfigure itself and then opens a communication port via an Ethernet connection with the instrument deployed on EGIM. This communication port can use both the TCP and UDP protocols. The SWE agent then starts getting information from the instrument in push or pull mode. The data retrieved from the instrument is stored in XML files, following the InsertResult format. This format is compliant with the Observations & Measurements 2.0 standard and can be directly injected into the SOS database.

V. MONITOR LAB

The benchmark tests as well as the production processes require the visualization of some real-time data and historical trend data from sensors, with the objective of controlling some critical information by arranging triggers. In the same manner, we need to monitor the correct behavior of the whole system (EGIM status data). To meet this requirement, we built a LabMonitor system using the Zabbix open source application [5]. Zabbix is designed for monitoring the availability and performance of IT infrastructure components. It works as a centralized monitoring system that uses active or passive agents to request or receive data from hosts. The system can use many protocols. In our scenario we use Zabbix agents to retrieve information about each virtual machine, together with the observations from the EGIM system retrieved by the acquisition agents.

For this purpose, we have created the 'Donax' virtual machine. It uses a MySQL database as a data container and an Apache Web container for attending to Web client requests. On each server we have installed a binary Zabbix agent that reports all information about the host to the Zabbix server. This functionality has also been added inside the acquisition software in order to send all the data acquired from the EGIM system. The acquisition agent takes the data received from the EGIM sensors and sends it to the Zabbix server using a formatted JSON request. The server then informs the client whether the data has been created successfully, or it sends a report if any problem arises. Once the data has been received on LabMonitor, it is written to the database and can be visualized on the Web application. If a trigger has been configured for this data, the system will check the rule configuration and report any status change.

We can also check the state of the EGIM equipment by monitoring all the status values coming from the EGIM status packet. We have configured alarms for critical data such as the internal temperature, internal humidity, power consumption and water leak. In case an alarm is raised, we have set up an email account that receives a message every time the trigger in the system switches on or off, reporting the sensor data involved and some more detailed information.

Finally, we set up public access for remote monitoring purposes, which only requires a web client for real-time access to the system data. Moreover, it is also feasible to request historical data, which could be really useful for analyzing event processes and cross-checking data.

VI. CONCLUSIONS

At the time of writing this document, we are in a development phase and are assembling all the necessary components for the final production environment (estimated for October 2016). As a result, the data may not yet be meaningful and the historical trends may contain some gaps. New development in the following months may introduce changes, such as IPs, ports, etc.

We are trying to improve the system to complement the way that the Data Management Platform (DMP) should receive the data.
Initially, this configuration requires polling, in which the DMP connects to the SOS server to request data. This implies two operations for each data request. We have installed the Sensor Event Service (SES) in the SOS server. The objective is that users can access the sensor data and measurements located at the SOS server through a publish/subscribe-based interface. The SES produces notifications and provides methods to subscribe to notifications and to retrieve the latest notification. Users can also register new sensors dynamically and send notifications to the service.

ACKNOWLEDGMENT

This study benefited from the H2020 INFRADEV-3-2015 EMSODEV Project n°676555.

REFERENCES

[1] EMSO project site, http://www.emso-eu.org/site/projects.html
[2] Arne Bröring, Christoph Stasch, Johannes Echterhoff, "OGC® Sensor Observation Service Interface Standard," http://www.opengis.net/doc/IS/SOS/2.0, 2012-04-20.
[3] 52°North SOS 2.0 implementation, http://52north.org/communities/sensorweb/sos/
[4] https://github.com/52North/helgoland
[5] Zabbix Monitoring System, http://www.zabbix.com/product.php
[6] del Río, J.; Mihai Toma, D.; O'Reilly, T.C.; Bröring, A.; Dana, D.R.; Bache, F.; Headley, K.L.; Manuel-Lazaro, A.; Edgington, D.R., "Standards-Based Plug & Work for Instruments in Ocean Observing Systems," IEEE Journal of Oceanic Engineering, vol. 39, no. 3, pp. 430-443, July 2014, doi: 10.1109/JOE.2013.2273277.

Fig. 7. Zabbix Screen with Graphs and events
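The paper states that the acquisition agent sends values to the Zabbix server as a formatted JSON request but does not give the format. As a rough sketch only, the following assumes the standard Zabbix sender (trapper) protocol, in which a JSON body is framed by a "ZBXD" header and an 8-byte little-endian length and sent to the trapper port (10051 by default). The host name and item key in the usage note are hypothetical, and the function names are illustrative rather than the project's code.

```python
import json
import socket
import struct
import time


def build_sender_packet(host, key, value, clock=None):
    """Frame one item value as a Zabbix 'sender data' request."""
    payload = {
        "request": "sender data",
        "data": [{
            "host": host,    # host name as configured on the Zabbix server
            "key": key,      # key of a trapper item defined for that host
            "value": str(value),
            "clock": int(clock if clock is not None else time.time()),
        }],
    }
    body = json.dumps(payload).encode("utf-8")
    # Header: "ZBXD", protocol flag 0x01, then the body length in 8 bytes (little-endian).
    return b"ZBXD\x01" + struct.pack("<Q", len(body)) + body


def parse_packet(packet):
    """Strip the 13-byte header and decode the JSON body of a request or reply."""
    (length,) = struct.unpack("<Q", packet[5:13])
    return json.loads(packet[13:13 + length].decode("utf-8"))


def send_to_zabbix(packet, server, port=10051, timeout=5.0):
    """Send a framed request and return the server's decoded reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(packet)
        reply = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closes the connection after replying
                break
            reply += chunk
    return parse_packet(reply)
```

With a hypothetical host `EGIM` and item key `egim.internal.temperature`, a call such as `send_to_zabbix(build_sender_packet("EGIM", "egim.internal.temperature", 21.4), "donax.example")` would return the server's JSON reply, whose `response` field indicates whether the value was accepted. This corresponds to the success or problem report described in Section V.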