Cross-CPP – An Ecosystem for Provisioning, Consolidating, and Analysing Big Data from Cyber-Physical Products

Ana Correia, Christian Wolff
Institute for Applied Systems Technology Bremen GmbH
Bremen, Germany
[correia, wolff]@atb-bremen.de

Elisa A. Herrmann, Victor Corral
ARI, Atos IT Solutions and Services Iberia
Madrid, Spain
[elisa.herrmann, victor.corral]@atos.net

Miriam Kachelmann
Meteologix AG
Sattel (SZ), Switzerland
miriam@kachelmann.com

Massimiliano Zanin, Ernestina Menasalvas
Centro de Tecnología Biomédica and Escuela Técnica Superior de Ingenieros Informáticos
Universidad Politécnica de Madrid
Madrid, Spain
[massimiliano.zanin, ernestina.menasalvas]@upm.es

Rance DeLong
The Open Group
Reading, Berkshire, UK
r.delong@opengroup.org

Pavel Smrz
Brno University of Technology
Brno, Czech Republic
smrz@fit.vutbr.cz

1st Workshop on Cyber-Physical Social Systems (CPSS2019), October 22, 2019, Bilbao, Spain. Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT

It is expected that, with the increasing number of connected sensors and actuators within mass products, the large spectrum of sensor data coming from high volume products in various industrial sectors (vehicles, smart home devices, etc.) will grow in the short term. This enormous amount of data continuously generated by CPPs will represent (1) a new information resource to create new value, allowing the improvement of existing services or the establishment of diverse new cross-sectorial services by combining data streams from various sources, and (2) a major big data-driven business potential, not only for the manufacturers of Cyber Physical Products (CPP), but in particular also for cross-sectorial industries as well as various organisations with interdisciplinary applications.

In spite of major advances in the field, several challenges still hinder the use of these data, such as the lack of CPP ecosystems: the few that exist are, in the best case, manufacturer specific and not open to external companies interested in using such data.

We present here a solution that aims to establish a CPP Big Data Ecosystem to bring CPP data from various industrial sectors to the outside world in a brand independent way, allowing external service providers that use CPP data from this unique CPP data access point (as well as from other sources) to develop cross-sectorial services.

KEYWORDS

Cyber-Physical Products, Cross-sectorial services, Big Data Marketplace

1 Introduction

In the present world, where mass products have an increasing number of connected sensors and actuators, it is expected that the large spectrum of sensor data coming from high volume products in various industrial sectors (vehicles, smart home devices, etc.) will grow in the short term. For this enormous amount of CPP data, today's landscape offers only sporadic proprietary CPP ecosystems, which are restricted to manufacturer-specific services and not open to third parties interested in the CPP data.

The Cross-CPP project will tackle these issues by focusing on what CPP and their sensor data can bring to the outside world. Therefore, as key challenges, Cross-CPP aims to overcome several obstacles by establishing a CPP Big Data Ecosystem with the following main characteristics:

• Brand independent concept, open for integration of diverse CPP data providers coming from different industrial areas, also providing a standardized cross industrial CPP data model which needs to be flexible enough to incorporate data coming from various industrial sectors.
• A CPP Big Data marketplace providing a single CPP data access point with just one interface (one-stop-shop) to service providers, as well as support functionalities for easy data mining/analytics. By these means, data customers (Service Providers) just need to set up and maintain one interface to gather diverse CPP data from different CPP providers.
• Controlled access to diverse CPP data streams and optimal management of data ownership and data rights, applicable to various cross CPP data streams.

In general, as seen in Figure 1, the ecosystem can be separated into three pillars:

1. Left pillar: Data Providers (CPP Producers / Owners) -> Comprising data harvesting and making CPP data from various industrial sectors available, transferring brand specific data streams into the common CPP data model.
2. Middle pillar: Cross-CPP Cloud Storage & Big Data Marketplace (MP) -> Comprising a cloud based concept for CPP Cloud Storage. Enabling controlled access to CPP data from different sources, offering support to Service Providers in the form of easy access to and detection of needed data, as well as flexible cross data stream analysis tools.
3. Right pillar: Data Customer (Service Provider) -> Cross-sectorial industries or manufacturers of CPP using CPP data from various products to create new value out of that data ("CPP-data" has no value in itself), by improving services or establishing diverse new cross-sectorial services.

Figure 1: Cross-CPP Ecosystem

In Section 2 we describe the state of the art, continue with the Cross-CPP Ecosystem concept, and use the Cross-CPP architecture to explain the different modules of the Ecosystem as well as their purpose (Section 3). We subsequently explain how, in the scope of the project, it is being applied in industry (Section 4). Section 5 concludes with the main lessons learnt and the steps ahead.

2 State of the Art

2.1 Data Model

The proliferation of intelligent devices, together with modern computing paradigms such as cloud, fog and service-oriented computing, is exponentially increasing the amount of data recorded and stored [1]. The massive amount of data available within a company represents the key to competitiveness. However, more and more data are useless without methods, methodologies and tools to manage them. From a technical point of view, the growth in the amount of data is handled by a new breed of technologies and techniques such as NoSQL databases, the MapReduce computation framework, machine learning algorithms, etc. Nonetheless, the usage of these technologies is impractical if data are not structured, i.e. if data are not framed into a predictable and regularly occurring data format so that they can be managed by computational components/modules for data analytics tasks, as confirmed in [2]. Therefore, it is necessary to standardize and homogenize the way data are represented and structured (agreed data model) to cope with the problem of integrating data from multiple vendor-based systems for the sake of knowledge generation and information distribution to upper managerial decision-making tools.

In order to allow for integrated data access, the project has started the development of the Agreed Data Model from the so-called CVIM¹ developed in the scope of the H2020 AutoMat project (GA no. 644657) for vehicles, which represents an agreed format for storing data in the cloud [3], and extends it to other CPP. From an application point of view, a combination of cooperative storage clouds and traditional storage clouds was addressed. This data cloud also represents the regulating interface for the data exchange between the CPP and the various service providers. This approach provides a breakthrough regarding an open data exchange that overcomes the drawbacks of current restricted product ICT service concepts.

¹ The Common Vehicle Information Model (CVIM) represents a brand-independent, open and transparent data model for vehicle data. The CVIM is a living data structure in which, in reference to the needs of the service provider community, the number of signals to be recorded as well as the type of measurement channels can be modified or extended.

Currently there exists no standard information model for keeping, maintaining and aggregating data from and for CPPs. The development of agreed data models is not just a technical issue. The key is not only to define a data model, but to reach consensus among industrial players and make sure the models are used and shared, i.e. become "de-facto standards".

2.2 Data Marketplace

As is well known, there exists a wide range of marketplaces on the internet handling B2B, B2C and C2C relations. However, marketplaces for trading data streams are rare, especially in industrial domains. For example, in the smart manufacturing domain [4], shared, secure, open-access infrastructures rich in functionality for easier system integration and composability, and a marketplace that can drive technological capability beyond just products by integrating services on standards, uncertainty quantification, benchmarking, performance-use metrics, systems modelling, etc. are still missing, but many initiatives are currently active (e.g. in US Leadership Coalition [5]). In AutoMat this concept was applied for the first time to the exchange of customer-owned vehicle data with service providers [6]. Therefore, the AutoMat Vehicle Big Data Marketplace, being the comprehensive platform to manage the sales and provisioning of all types of vehicle related data from all OEMs (the project brought VW, Renault and Fiat together) and the various service providers [7], will serve as a basis for the Cross-CPP Marketplace by extending it to cover data streams from various CPP. This large amount of continuously gathered heterogeneous CPP data represents major economic big data business potential, not only for specific industry verticals (such as Automotive) but also for cross-sectorial industries with interdisciplinary applications.

Today's proprietary approaches focus on bringing company services into Vehicles (e.g. [8]) and Home Systems Entertainment without opening up to specialized cross-industry companies. Despite this, a major business potential remains locked, because the automotive and related industries have not yet been able to establish an open service ecosystem equivalent to existing market applications such as smartphone apps. Such approaches fit quite well, and could easily be adapted to the requirements of data handling in a cloud environment and the management of the information exchange between companies, vendors of CPP and service providers.

The above approach is not cost-effective for companies, because the costs and the time-consuming process associated with each sign-off agreement make it infeasible to scale the monetization of CPP data lakes up into production environments. Despite this, it is how companies have operated since the early stages of data-driven service conceptualization. They have been trying to create ad-hoc applications and services for their customers, which must be sustained and supported by periodical subscription fees for connected services. Data Providers and Data Consumers pass the entire cost of the product development lifecycle on to the final cost of the services offered to end users. Eventually, they realized that working on isolated, restricted (closed) interfaces with specific Service Providers does not lead to economically feasible scenarios. Cross-CPP seeks to open up and democratize access to CPP data among cross-sectorial Service Providers under a standardized CPP Data Model, trying to maximize the monetization of the data by cross-sectorial Data Consumers, who are experts and clearly know their customers' needs for access to digital services.

2.3 Big Data Analytics

The challenge of CPP data stream mining is to analyse how evolving data in the different sectors (e.g. home, automotive) behave. In this context, the algorithms have to be designed and adapted to deal with resource aware learning, change detection, novelty detection, multi-horizon analysis, and reasoning about the learning process in the different domains [9] [10] [11]. The CPP real-time and predictive analytics toolbox will be an extension of the software approach developed within the JUNIPER project [12]. This approach was based on Java 8, which introduced Streams and Lambda expressions to support the efficient processing of in-memory stream sources. One of the primary aims was to accelerate data processing (including analytics) for a parallel (e.g. cloud) platform, i.e. to process the data as fast as possible using all of the available processors. It is therefore an ideal basis for data analytics on parallel (i.e. cloud) platforms, including pipelined and data-parallel approaches.

2.4 Context sensitivity

The acceptance and usability of complex cross-sectorial services can be considerably improved by making them context sensitive. With the recent advance of context sensitivity, an increasing need arises for developing formal context modelling and reasoning techniques.

The basis for context-aware applications is a well-designed Context Model (CM). As context integrates different data and knowledge sources and binds knowledge to the user to guarantee that the understanding is consistent, context modelling is extensively investigated. A CM enables applications to understand the user's activities in relation to situational conditions.

Typical context modelling techniques include key-value models, object-oriented models, and ontological methods [13]. The problem to be solved is how to extract context from the CPP use. Since it is planned to model context with an ontology, context extraction is mainly an issue of context reasoning and context provisioning: how to infer high-level context information from low-level raw context data [14], [15].

The application of context awareness to cross-sectorial services has not yet been sufficiently researched. In the case of such services, the notion of context refers to process preferences of CPP and process skills of devices, physical capabilities of the CPP and environment conditions.

The modelling of context in this case presents an additional challenge, as the mentioned services are highly dynamic and reside in distributed environments. Up to now there have been no industry-driven attempts to provide harmonised modelling of the context under which CPP are used or under which data streams from CPP are generated. The key innovation issues to be solved are: how to allow building common, re-usable context models for cross-sectorial services; and how to provide a generic solution adaptable to different scenarios.

3 Cross-CPP concept

This section presents the results of the developed concept for the general Cross-CPP Ecosystem Architecture. Figure 2 shows the overview of all modules that were planned to reuse and significantly enhance the results of the past projects with respect to the Cross-CPP project needs and objectives. This figure was derived from the Cross-CPP Ecosystem (see Figure 1) and it summarises the Cross-CPP modules, their key software components and how they correlate in the system.

Figure 2: Conceptual Cross-CPP system architecture [16]

3.1 Data harvesting

The Data Harvesting module acts as an intermediate layer between the CPPs and the Company Backend module. The connection will be realised using 3/4/5G mobile web technologies in the case of vehicles and a wired connection in the case of smart infrastructure devices.

The main functionalities of the module are:

• Set-up of the CPP system and data acquisition configuration
• Continuous data acquisition and transmission

The first offers functions that allow the signals coming from the CPPs to be configured in terms of, for instance, retrieval and transmission rate, and the second is the actual data acquisition and transmission of said signal data from the CPPs to the Company Backend (see Section 3.2).

The main components in this module are:

• The data logger in the CPP continuously measures data during CPP usage, according to the deployed measurement configuration.
• The measured data are continuously aggregated and stored in CPP data packages. According to the deployed measurement configuration, the aggregation and storage of data is done for each defined measurement channel.
• The stored CPP data packages are sent to the CPP Company Backend at the agreed frequency.
• This component receives the request for management configuration initialisation / update from the CPP Company Backend and deploys the new configuration in case the data owner agrees.
• Performs the basic authorization and authentication security services.

Both the Data Harvesting module and the Company Backend are conceptually generic for any CPP producer/Data Provider, but have a company specific implementation due to the diversity of the companies and their internal systems.

3.2 Company Backend

The Company Backend module holds the Cross-CPP company data processing chain, which receives data from the Data Harvesting Module and, after processing, enrichment with company internal knowledge, formatting and transformation, stores the processed data in CIDM format in the CPP Cloud Storage (see Section 3.3). The Company Backend module holds bidirectional connections to the CPP Cloud Storage as well as to the CPP Big Data Marketplace for transferring the data and updates of the CIDM format.

The main functionalities of the module are to:

• interpret and transform proprietary, manufacturer specific CPP data into physical information in reference to agreed owner permissions,
• validate the information and, if need be, mask it to enforce privacy,
• convert the information into the required quasi standard information representation, the Common Industrial Data Model (CIDM) format, and publish it to the owner's CPP Cloud Storage, and
• manage the configuration procedures for the data mining at CPP level, by providing CPP specific data logger configurations (e.g. handling the case that CPPs, even if from the same sector, often come in different models or configurations, so that some have signals that others do not).

3.3 Common Industrial Data Model and CPP Cloud Storage

The Common Industrial Data Model (CIDM) is an open and highly scalable big data format, designed to harmonize proprietary IoT data into generic datasets.

The structure of the data model consists of three layers depicted in Figure 3.

Figure 3: CPP Data Model main structure

• Starting from the bottom part, Signals describe the type of physical phenomena and chemical quantities of vehicles and buildings, including the name of the signal, the format and the unit.
• As measurements of the phenomena may far exceed the available transmission bandwidth, or the full resolution may not be required in most applications, data from the CPP need to be pre-processed and aggregated according to a "measurement channel" that includes the signals to aggregate (one or more), the aggregation type (time series, histograms, etc.) and the configuration of the aggregation.
• Finally, at the highest level, data packages provide the actual data coming from the CPP, aggregated according to a selected measurement channel. The data packages are stored in and retrieved from the Cloud Storage. In addition to the data, data packages also contain metadata with support information such as ownership and quality assessment.

The CPP Cloud Storage is a cloud-based data storage infrastructure that offers secure and private "data vaults" to data providers, in which they store their devices' data packages in the CIDM format. The storage infrastructure provides an Application Programming Interface (API) to enable data collection from the Company Backend as well as data access by the CPP Big Data Marketplace. The module includes a web application that allows users to control their data by managing the read or write access permissions granted to the Marketplace and the Company Backend.

3.4 CPP Big Data Marketplace and Data Analytics toolbox

The CPP Big Data Marketplace connects Data Providers and Data Consumers for selling and acquiring Connected Vehicle and Smart Building data under the standardized data model (CIDM), assuring security and privacy of the data. The Marketplace's main purpose is to allow Service Providers to create new B2B and B2C data-based products and services.

The architecture of the Marketplace module consists of a set of components with different responsibilities:

• Indexing: This component indexes the metadata of the CPP data stored in the different Cloud Storage modules to provide data discovery services and to locate and retrieve CPP data from Cloud Storages when needed.
• Discovery: This interface allows checking the types, amount and quality of the CPP data stored in the Cloud Storage spaces, considering data consumer constraints and requirements.
• Cloud Storage Access: This component has two responsibilities: to handle data change notifications from the Cloud Storages and to request CPP data packages from the proper Cloud Storage when a Service Provider requests data.
• Data Broker: This component provides an interface to retrieve the CPP data in subscription (streaming) mode or in pull mode (REST request).
• Catalogue: This component manages the set of signals and measurement channels available for the current version of the CIDM.
• Data Providers and Service Provider Manager: This component manages the data provider users, the service provider users and the sharing process between them.

The module comprises a backend application with a RESTful API to provide the functionality, an index storage, and a frontend application (web application) that gives the different actors a visualization of the data and visual management of all the resources. The module includes a Software Development Kit for retrieving datasets in an easy way.

An added value of the marketplace is the Data Analytics Toolbox, which extends the marketplace functionality to provide the Service Providers with analytics capabilities. It includes, besides the CPP Data Analytics toolbox, the Software Development Kit. The former is composed of a set of modules facilitating the analysis of the collected data. It is based on a modular structure, in which new analytics services can be added to fulfil new user requirements; and it is aimed at supporting both fast prototyping of new ideas and efficient implementation of data synthesis and analysis techniques.

Each module, devoted to a specific analysis, communicates with the Cross-CPP Marketplace to get the data and return the results. The communication between each module and the central system can take two forms:

• Pull mode: analyses are performed over the data provided by the system (or the final user), without taking into account the evolution of the system up to that point. The user/service provider requests an analysis and, once the results are yielded, the whole computation is deleted.
• Push, or stream, mode: internal models are updated in an asynchronous way, using any new data available, such that the request for an analysis just implies retrieving the result. In this case, the availability of new data triggers the update of the analysis, and the user accesses results as if they were any other data stream in the system.

Within the plethora of data analytics techniques that have been developed in the last decades, we have here selected some that are relevant in a large range of applications. In short, these include:

• Basic statistics: module that aims at providing some very simple statistical functions, calculated over a subset of the stored data, with the objective of minimizing communication overheads.
• Time series: module providing the service provider with a set of tools for detecting when a time series that represents the evolution of a measurement undergoes a sudden change.
• Trajectories: module that aims at making a set of basic tools available to the service provider, in order to simplify the handling and manipulation of trajectories.
• Machine learning: module supporting incremental learning algorithms by means of existing libraries and frameworks that have proved to be applicable in high velocity settings.
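The difference between the pull and push modes, and the kind of sudden-change check the Time series module is meant to offer, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `pull_mean`, `PushMean` and `sudden_change_index`, the window size and the threshold are hypothetical and are not taken from the Cross-CPP Analytics Toolbox or its SDK, and the simple moving-baseline rule stands in for whatever change detection the toolbox actually uses.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


def pull_mean(values: List[float]) -> float:
    """Pull mode (illustrative): compute once over the supplied data.

    No state survives the call, mirroring "the whole computation is deleted".
    """
    return mean(values)


@dataclass
class PushMean:
    """Push/stream mode (illustrative): state is updated as data arrive,
    so asking for the result only reads the current state."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        # Called whenever a new measurement becomes available.
        self.count += 1
        self.total += value

    def result(self) -> Optional[float]:
        # Returns None until at least one value has been seen.
        return self.total / self.count if self.count else None


def sudden_change_index(series: List[float],
                        window: int = 3,
                        threshold: float = 5.0) -> Optional[int]:
    """Return the first index whose value differs from the mean of the
    preceding `window` values by more than `threshold`, else None.

    Hypothetical rule chosen for brevity, not the toolbox's algorithm.
    """
    for i in range(window, len(series)):
        baseline = mean(series[i - window:i])
        if abs(series[i] - baseline) > threshold:
            return i
    return None


if __name__ == "__main__":
    readings = [20.0, 20.5, 19.5, 20.0, 30.0, 30.5]  # made-up sensor values

    print("pull:", pull_mean(readings))

    model = PushMean()
    for r in readings:          # each arrival triggers an update
        model.update(r)
    print("push:", model.result())

    print("change at index:", sudden_change_index(readings))
```

In this sketch the pull function recomputes from scratch on every request, whereas the push object trades a small amount of retained state for a constant-time read, which is the trade-off the two modes describe.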
Both the raw data and the analytics results can be accessed in two ways. On the one hand, the system provides an SDK for programmatic access to the data. Alternatively, these can be explored through a GUI, mainly used by Service Providers to select and configure the access to the cross-sectorial CPP big data pool offered by the Cross-CPP data providers via the CPP Cloud Storage.

3.5 Cross-CPP Security

The Cross-CPP security approach applies and extends an implementation of the NGAC Standard [17] that provides fine-grained attribute-based access control for access to the CPP Cloud Storage.

The distinguishing characteristic of the Cross-CPP implementation derives from the objective to provide dynamically changing security policies that depend upon the context under which data streams are used or generated, and to adapt services to the current needs of the user based on the current context.

To achieve this objective, the implementation is extended with an enhanced declarative policy language that enables changing policy modes based on the current values of context variables, and a new Event Processing Point that enables the currently active policy to be dynamically changed on the basis of context change and the occurrence of events generated within the access control system or elsewhere in the Cross-CPP system.

The primary components of the access control implementation are a Policy Server that provides interfaces for policy decisions and for policy administration, an Event Processing Point that provides an interface to the Context System and a mechanism to execute changes to the access control policy as the result of specified events, and a Policy Tool to assist in the development and testing of Declarative Policies and Event-Response packages. These artefacts are expressed in two distinct languages that are used to configure the behaviour of the Policy Server and the Event Processing Point respectively.

Client applications are modified to operate on protected resources through a simple Policy Enforcement Point, which consults the Policy Server for grant/deny decisions based on the currently active policy.

Figure 4: Cross-CPP Functional Architecture of NGAC extended for context sensitivity

3.6 Context Monitoring and Extraction

The monitoring and extraction module extracts context from the CPP use to support security and improve services.

The context sensitivity of the Cross-CPP ecosystem can be supported by monitoring and extracting information about the use of CPP, and can support the adaptation of services (by allowing the services to retrieve only the information that matches the context extracted) as well as of the Cross-CPP Security module (adapting the access to resources by individuals based on the CPP use). Although the implementation of the Context Monitoring & Extraction is expected to differ depending on the source of the data, the approach followed will be the same and is explained below.

For each CPP, one has to define which concepts are relevant for the description of the situations (context) under which the CPP signals are generated and measured. Once the concepts relevant to the description of the context of CPP data stream generation are defined, the next step is to define the concepts which are relevant for the cross-sectorial service where it will be used. As a first approach, some general situations are considered (situations that could be interesting for a wide range of cross-sectorial services as well as for the Cross-CPP context sensitive security enforcement). This will translate into a context model that is generic for all CPPs in a first phase and specialised for each CPP type in a second step.

With the Context Model defined, the Context Monitoring and Extraction module will

• monitor the data that are needed to extract the context
• pre-process monitored raw data
• extract context by identifying the current context, based on monitored raw data, the current context model and historic context information stored in a context repository
• based on the identified context, compare situations to previous ones and store them or pass them on to other modules that may further consume this information (e.g. Cross-CPP Security)

The approach is best seen in Figure 5.

Figure 5: High level view of the functioning of the Context Monitoring and Extraction module

4 Industrial Application

The described ecosystem is applied by two data providers (CPP producers):

• Vehicles
• Smart infrastructure

and three data consumers (service providers).

As described in the previous sections, the easy and secure access to diverse data streams via a platform like the CPP Big Data Marketplace enables service providers to significantly enhance existing service solutions, and in some cases also to create innovative new services that have not been possible before. Within the Cross-CPP project, three main service areas are targeted to prototype this new approach: general weather forecasting services, weather-based navigation/warnings, and an e-charging service.

Weather forecasting service

Cross-sectorial data streams can considerably improve the forecasting quality of weather forecasting models, as they can provide an unprecedented density of data points necessary for weather model initialization. Even in the case of only moderate sensor quality compared to the common meteorological sensors, new plausibility checks and analytics developed within Cross-CPP can process these data streams appropriately so that they can be successfully ingested into a weather model to help its initialization and thus improve its resulting output. The conditions and means of successful data assimilation of a diverse range of sensors into an existing service pose a challenge that is addressed intensively by service providers within Cross-CPP and assisted by the Analytics Toolbox functionality. One approach to facilitate data ingestion is the development of a new high-resolution 100x100m weather model, which allows for a smoother incorporation of these new data points. Routine operation of such models is not known, nor is the inclusion of CPP data into such models. Despite that, a new Plausibility Check has been developed to test neighbouring data points from one source against each other (homogeneous check) as well as against data points from conventional sources (heterogeneous check).

Besides the enhancement of weather models, the easy and secure access to new and diverse sensor data such as windshield rain intensity sensors and wiper data will be used to develop a virtual rain radar, an innovative new prototype service that will mimic the main features of a radar map by using the live sensor data from vehicles. This service is especially useful in regions and countries that lack expensive radar stations (see e.g. European radar coverage in [18]), and can also help fill gaps in times of radar outages.

The access to live vehicle safety related sensors such as slippery data (road slickness data), intended to enhance slippery road detection, can also be used to enhance weather-related warnings to vehicle owners nearby, and also enables the cross-check of other available weather data, especially as conditions like freezing rain are still weather events that are hard to predict.

Furthermore, data from smart building weather stations are used within Cross-CPP to compute individually tailored weather forecasts for them in order to enhance automatic building operations such as facade systems and window blinds, or to support energy efficiency. For that purpose, each building receives its own weather model, which corrects the conventional forecast for usually unknown on-site specifics that the model learns through the received CPP signal. The improvement of such individual models compared to a forecast without this correction is measured by the Mean Absolute Error (MAE) between the DMO (direct model output) and the observation data. First tests have already revealed a significant error reduction. Hence, access to smart building data again helps to improve conventional forecasts.

Weather warnings

Another service enhancement made possible by a Big Data Marketplace is weather-based navigation and the provision of on-route live weather warnings. Within Cross-CPP, a prototype for a weather-based navigation service will be developed which uses vehicle data as well as data from a meteorological service provider to offer an enhanced navigation which takes current and future weather conditions on the route of the vehicle driver into account. It will also provide a live weather warning mode as well as initial re-routing to avoid bad weather on the trip. These navigational service enhancements are not only of use for the private consumer, but especially for logistic industries and automated driving.

E-mobility charging service

The main idea of the service is to exchange information related to "E-Charging" among data providers, meaning that vehicles will provide information about their battery status and other information relevant during the charging process, and buildings about their e-charger infrastructure: free charging locations and constraints.

The service sends information about the presence of a charging station inside the building or located outside (public parking lot, airport, hospital) to the vehicle. The service uses real-time data in the online communication with the car / building about the occupancy of e-chargers placed outside the building or inside (in the garages), as well as the vehicle's own information about the capacity of its battery. Together with the vehicle's current position and speed, this could be used to calculate the time of arrival and to reserve an e-charger for this specific car. A future idea to extend this would be to use weather stations, which may also provide relevant data that could be used for calculating the expected electricity generation as well.

Service applications like these demonstrate the necessity and value of a secure and easy interface between data providers (CPP producers) and service providers to foster Big Data related growth within the service sector and unlock currently barely touched business potential.

5 Conclusions

In this contribution we have described the conceptual view of the work done in the Cross-CPP project.

The developed CPP Ecosystem Architecture was outlined starting from a first draft of the CPP Ecosystem Workflow, which defines the information flow between key stakeholders and Cross-CPP system modules. The pictured architecture concept summarises the Cross-CPP modules, their key software components and how they correlate in Cross-CPP. Especially the defined Ecosystem Architecture, with its detailed representation of modules broken down into a first overview of needed software components,

[4] T. W.
Jim Davis, „Towards Composable Manufacturing, Smart Apps and Services Marketplace,“ in NIST/OAGi Workshop on Smart Manufacturing presenting a blueprint for the system development within the (SM) and Cyber-Physical Production Systems (CPPS), Washington, 2016. project implementation phase. [5] „Smart Manufacturing Leadership Coalition,“ SMLC, 2017. [Online]. The project analyses the reuse of work from previous projects to Available: www.smartmanufacturingcoalition.org. [Zugriff am 20 04 analyses the hypothesis that data from diverse CPPs in different 2017]. sectors may be made available and reused by different Service [6] Automat Consortium Partners, „Deliverable 2.2: Overall Innovation & Technology Transfer Concept,“ 2015. Providers to produce cross-sectorial services. Cross-CPP is [7] ATOS, Automat consortium partners, „D4.3 Full Prototype of Vehicle Big Data overcoming the identified challenges by establishing a CPP Big Marketplace,“ 2017. Data Ecosystem, which will develop the following main [8] BMW, „BMW and MINI CarData,Tailored third-party services for bmw and characteristics: mini drivers.,“ [Online]. Available: https://www.bmwgroup.com/en/innovation/technologies-and- • Brand independent concept, open for integration of mobility/cardata.html. [Zugriff am 10 10 2019]. diverse CPP data providers coming from different [9] E. M. a. P. A. C. S. João Bártolo Gomes, „Learning recurring concepts from industrial areas, also providing a standardized cross data streams with a context-aware ensemble.,“ in Proceedings of the 2011 ACM Symposium on Applied Computing (SAC '11)., New York, NY, USA, industrial CPP data model, setting the basic structure of 2011. Cloud Storage(s) for the CPP data streams, which needs [10] R. S. a. P. P. R. 
João Gama, „Issues in evaluation of stream learning to be flexible enough to incorporate data coming from algorithms.,“ in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '09), New various industrial sectors. York, NY, USA, 2009. • CPP Big Data Marketplace providing to service [11] A. Z. a. S. K. Mohamed Medhat Gaber, „Mining data streams: a review,“ providers a single CPP data access point with just one SIGMOD Rec., Bd. 34, Nr. 2, pp. 18-26, June 2005. interface (one-stop-shop), as well as support [12] Juniper Consortium Partners, „Juniper Project website,“ 2019. [Online]. Available: http://www.juniper-project.org/. [Zugriff am 28 August 2019]. functionalities for easy data mining/analytics. By these [13] T. Strang und C. Linnhoff-Popien, „A Context Modeling Survey. in Workshop means, data customers (Service Providers) just need to on Advanced Context Modelling, Reasoning and Management as part of set-up and maintain one interface to gather diverse CPP the Conference on Ubiquitous Computing,“ in The Sixth International data from different CPP providers. The Marketplace Conference on Ubiquitous Computing, Nottingham, 2004. makes the Cloud Storage for CPP data streams seamless [14] D. Stokic, S. Scholze und O. Kotte, „Generic Self-Learning Context Sensitive Solution for Adaptive Manufacturing and Decision Making Systems,“ in to any data consumer taking security, data ownership and ICONS 2014, Nice, 2014. data rights into account. [15] S. Scholze, J. Barata und D. Stokic, „Holistic Context-Sensitivity for Run-Time • Controlled access to diverse CPP data streams and Optimization of Flexible Manufacturing Systems,“ Journal Sensors, Nr. 17, p. 455, 2017. optimal management of data ownership and data rights [16] Cross-CPP Consortium , „D1.3 Public Innovation Concept,“ 2019. 
(covering data flow from CPP owners up to Service [17] International Committee for Information Technology Standards, Cyber Providers), applicable to various cross CPP data streams. Security technical committee 1, „Information technology – Next • Win-Win value chain for all ecosystem partners, due to Generation Access Control – Functional Architecture,“ INCITS 499, 2013. the fact that the costs for the ecosystem in place can be [18] Huuskonen und e. al., „The Operational Weather Radar Network in Europe.,“ shared by a great many data customers, which will make Bull.Am.Met.Soc., 2014. a single service much more economical. ACKNOWLEDGMENTS This paper presents work developed in the scope of the project Cross-CPP. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 780167. The content of this paper does not reflect the official opinion of the European Union. Responsibility for the information and views expressed in this paper lies entirely with the authors. REFERENCES [1] B. Schmarzo, Big Data: Understanding How Data Powers Big Business, John Wiley & Sons, 2013. [2] W. H. Inmon und D. Linstedt, Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault, Morgan Kaufmann, 2014. [3] J. Pillmann, C. Wietfeld, A. Zarcula, T. Raugust und D. C. Alonso, „Novel Common Vehicle Information Model (CVIM) for Future Automotive Vehicle Big Data Marketplaces,“ in IEEE Intelligent Vehicles Symposium (IV), Redondo Beach, CA, USA, 2017. 8