Cross-CPP – An Ecosystem for Provisioning, Consolidating,
          and Analysing Big Data from Cyber-Physical Products

Ana Correia, Christian Wolff
Institute for Applied Systems Technology Bremen GmbH
Bremen, Germany
[correia, wolff]@atb-bremen.de

Elisa A. Herrmann, Victor Corral
ARI, Atos IT Solutions and Services Iberia
Madrid, Spain
[elisa.herrmann, victor.corral]@atos.net

Miriam Kachelmann
Meteologix AG
Sattel (SZ), Switzerland
miriam@kachelmann.com

Massimiliano Zanin, Ernestina Menasalvas
Centro de Tecnología Biomédica and Escuela Técnica Superior de Ingenieros Informáticos
Universidad Politécnica de Madrid
Madrid, Spain
[massimiliano.zanin, ernestina.menasalvas]@upm.es

Rance DeLong
The Open Group
Reading, Berkshire, UK
r.delong@opengroup.org

Pavel Smrz
Brno University of Technology
Brno, Czech Republic
smrz@fit.vutbr.cz


ABSTRACT
With the increasing number of connected sensors and actuators in mass products, the
amount and range of sensor data coming from high-volume products in various industrial
sectors (vehicles, smart home devices, etc.) is expected to grow in the short term. This
enormous amount of data, continuously generated by Cyber-Physical Products (CPPs), will
represent (1) a new information resource for creating value, allowing existing services
to be improved or diverse new cross-sectorial services to be established by combining
data streams from various sources, and (2) a major big-data-driven business potential,
not only for the manufacturers of CPPs but in particular also for cross-sectorial
industries and for organisations with interdisciplinary applications.

Despite major advances in the field, several challenges still hinder the use of these
data. In particular, CPP ecosystems are lacking or few, and those that exist are at best
manufacturer-specific and not open to external companies interested in using such data.

We present a solution that aims to establish a brand-independent CPP Big Data Ecosystem,
which brings CPP data from various industrial sectors to the outside world and allows
external service providers to develop cross-sectorial services using CPP data from this
single CPP data access point (as well as from other sources).

KEYWORDS
Cyber-Physical Products, Cross-sectorial services, Big Data Marketplace

1st Workshop on Cyber-Physical Social Systems (CPSS2019),
October 22, 2019, Bilbao, Spain.

Copyright © 2019 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).

1   Introduction
In a world where mass products carry an increasing number of connected sensors and
actuators, the amount and range of sensor data coming from high-volume products in
various industrial sectors (vehicles, smart home devices, etc.) is expected to grow in
the short term. In today’s landscape, this enormous amount of CPP data is served only by
sporadic proprietary CPP ecosystems, which are restricted to manufacturer-specific
services and not open to third parties interested in the CPP data.
The Cross-CPP project tackles these issues by focusing on what CPPs and their sensor data
can bring to the outside world. To this end, Cross-CPP aims to overcome several obstacles
by establishing a CPP Big Data Ecosystem with the following main characteristics:

•   A brand-independent concept, open for the integration of diverse CPP data providers
    from different industrial areas, which also provides a standardized cross-industrial
    CPP data model that is flexible enough to incorporate data coming from various
    industrial sectors.




•   A CPP Big Data Marketplace providing service providers with a single CPP data access
    point with just one interface (one-stop shop), as well as support functionalities for
    easy data mining/analytics. In this way, data customers (Service Providers) only need
    to set up and maintain one interface to gather diverse CPP data from different CPP
    providers.
•   Controlled access to diverse CPP data streams and optimal management of data
    ownership and data rights, applicable to various cross-CPP data streams.

In general, as seen in Figure 1, the ecosystem can be separated into three pillars:

1.  Left pillar: Data Providers (CPP Producers / Owners). Comprises data harvesting and
    making CPP data from various industrial sectors available by transferring
    brand-specific data streams into the common CPP data model.
2.  Middle pillar: Cross-CPP Cloud Storage & Big Data Marketplace (MP). Comprises a
    cloud-based concept for CPP Cloud Storage. It enables controlled access to CPP data
    from different sources and supports Service Providers with easy discovery of and
    access to the data they need, as well as with flexible tools for analysing across
    data streams.
3.  Right pillar: Data Customers (Service Providers). Cross-sectorial industries or
    manufacturers of CPPs that use CPP data from various products to create new value out
    of that data (“CPP data” has no value in itself), by improving services or
    establishing diverse new cross-sectorial services.

Figure 1: Cross-CPP Ecosystem

Section 2 describes the state of the art. Section 3 presents the Cross-CPP Ecosystem
concept and uses the Cross-CPP architecture to explain the modules of the Ecosystem and
their purpose. Section 4 explains how the concept is being applied in industry within the
project. Section 5 concludes with the main lessons learnt and the steps ahead.

2    State of the Art

2.1 Data Model
The proliferation of intelligent devices, together with modern computing paradigms such
as cloud, fog and service-oriented computing, is causing exponential growth in the amount
of data recorded and stored [1]. The massive amount of data available within a company
represents the key to competitiveness. However, ever more data are useless without
methods, methodologies and tools to manage them. From a technical point of view, the
growth in the amount of data is handled by a new breed of technologies and techniques
such as NoSQL databases, the MapReduce computation framework, machine learning
algorithms, etc. Nonetheless, the use of these technologies is impractical if data are
not structured, i.e. not framed in a predictable and regularly occurring data format that
computational components/modules can manage for data analytics tasks, as confirmed in
[2]. Therefore, it is necessary to standardize and homogenize the way data are
represented and structured (agreed data model) in order to cope with the problem of
integrating data from multiple vendor-based systems for the sake of knowledge generation
and information distribution to upper managerial decision-making tools.
To allow for integrated data access, the project has started developing the Agreed Data
Model from the so-called CVIM¹, which was developed for vehicles in the scope of the
H2020 AutoMat project (GA no. 644657) and represents an agreed format for storing data in
the cloud [3], and is extending it to other CPPs. From an application point of view, a
combination of cooperative storage clouds and traditional storage clouds was addressed.
This data cloud also represents the regulating interface for the data exchange between
the CPPs and the various service providers. This approach provides a breakthrough towards
an open data exchange that overcomes the drawbacks of current restricted ICT service
concepts for products. Currently there exists no standard information model for keeping,
maintaining and aggregating data from and for CPPs. The development of agreed data models
is not just a technical issue. The key is not only to define a data model, but to reach
consensus among industrial players and make sure the models are used and shared, i.e.
that they become “de-facto standards”.

¹ The Common Vehicle Information Model (CVIM) represents a brand-independent, open and
transparent data model for vehicle data. The CVIM is a living data structure in which,
according to the needs of the service provider community, the number of signals to be
recorded as well as the type of measurement channels can be modified or extended.

2.2 Data Marketplace
It is well known that a wide range of marketplaces on the internet handle B2B, B2C and
C2C relations. However, marketplaces for trading data streams are rare, especially in
industrial domains. For example, in the smart manufacturing domain [4], shared, secure,
open-access infrastructures rich in functionality for easier system integration and
composability, as well as a marketplace that can drive technological capability beyond
just products by integrating services on standards, uncertainty quantification,
benchmarking, performance-use metrics, systems modelling, etc., are still missing,
although many initiatives are currently active (e.g. in the US Leadership Coalition [5]).
In AutoMat this concept was for the




first time applied to the exchange of customer-owned vehicle data with service providers
[6]. Therefore, the AutoMat Vehicle Big Data Marketplace, being the comprehensive
platform for managing the sale and provisioning of all types of vehicle-related data
between all OEMs (the project brought VW, Renault and Fiat together) and the various
service providers [7], will serve as the basis for the Cross-CPP Marketplace, which
extends it to cover data streams from various CPPs. This large amount of continuously
gathered, heterogeneous CPP data represents major economic big data business potential,
not only for specific industry verticals (such as automotive) but also for
cross-sectorial industries with interdisciplinary applications. Today’s proprietary
approaches focus on bringing company services into vehicles (e.g. [8]) and home
entertainment systems, without opening up to specialized cross-industry companies. As a
result, a major business potential remains locked, because the automotive and related
industries have not yet been able to establish an open service ecosystem equivalent to
existing market applications such as smartphone apps. Such approaches fit quite well and
could easily be adapted to the requirements of data handling in a cloud environment and
of managing the information exchange between companies, vendors of CPPs and service
providers.
The proprietary approach is not cost-effective for companies: the costs and the
time-consuming process associated with each sign-off agreement make it infeasible to
scale the monetization of CPP data lakes up to production environments. Nevertheless,
this is how companies have operated since the early stages of data-driven service
conceptualization. They have been trying to create ad-hoc applications and services for
their customers, which must be sustained and supported by periodic subscription fees for
connected services. Data Providers and Data Consumers pass the entire cost of the product
development lifecycle on to the final cost of the services offered to end users.
Eventually, they realized that working on isolated, restricted (closed) interfaces with
specific Service Providers does not lead to economically feasible scenarios. Cross-CPP
seeks to open up and democratize access to CPP data among cross-sectorial Service
Providers under a standardized CPP Data Model, with the aim of maximizing the
monetization of the data by cross-sectorial Data Consumers, who are experts and clearly
know what customers need in order to access digital services.

2.3 Big Data Analytics
The challenge of CPP data stream mining is to analyse how evolving data in the different
sectors (e.g. home, automotive) behave. In this context, the algorithms have to be
designed and adapted to deal with resource-aware learning, change detection, novelty
detection, multi-horizon analysis, and reasoning about the learning process in the
different domains [9] [10] [11]. The CPP real-time and predictive analytics toolbox will
be an extension of the software approach developed within the JUNIPER project [12]. This
approach was based on Java 8, which introduced Streams and Lambda expressions to support
the efficient processing of in-memory stream sources. One of the primary aims was to
accelerate data processing (including analytics) on a parallel (e.g. cloud) platform,
i.e. to process the data as fast as possible using all of the available processors. It is
therefore an ideal basis for data analytics on parallel (i.e. cloud) platforms, including
pipelined and data-parallel approaches.

2.4 Context sensitivity
The acceptance and usability of complex cross-sectorial services can be considerably
improved by making them context sensitive. With the recent advances in context
sensitivity, there is an increasing need for formal context modelling and reasoning
techniques.
The basis for context-aware applications is a well-designed Context Model (CM). As
context integrates different data and knowledge sources and binds knowledge to the user
to guarantee that the understanding is consistent, context modelling is extensively
investigated. A CM enables applications to understand the user’s activities in relation
to situational conditions.
Typical context modelling techniques include key-value models, object-oriented models,
and ontological methods [13]. The problem to be solved is how to extract context from the
use of CPPs. Since it is planned to model context with an ontology, context extraction is
mainly an issue of context reasoning and context provisioning: how to infer high-level
context information from low-level raw context data [14], [15].
The application of context awareness to cross-sectorial services has not yet been
sufficiently researched. In the case of such services, the notion of context refers to
process preferences of CPPs and process skills of devices, physical capabilities of the
CPPs and environmental conditions.
The modelling of context in this case presents an additional challenge, as the services
in question are highly dynamic and reside in distributed environments. Up to now there
have been no industry-driven attempts to provide harmonised modelling of the context
under which CPPs are used or under which data streams from CPPs are generated. The key
innovation issues to be solved are: how to allow building common, re-usable context
models for cross-sectorial services; and how to provide a generic solution adaptable to
different scenarios.

3   Cross-CPP concept
This section presents the concept developed for the general Cross-CPP Ecosystem
Architecture. Figure 2 gives an overview of all modules, which were planned to reuse and
significantly enhance the results of past projects with respect to the needs and
objectives of the Cross-CPP project. The figure was derived from the Cross-CPP Ecosystem
(see Figure 1) and summarises the Cross-CPP modules, their key software components and
how they relate to each other in the system.
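
As an illustration of the context-extraction problem raised in Section 2.4 (inferring high-level context information from low-level raw context data), the following minimal sketch may help. It deliberately uses a flat key-value context model and hand-written rules rather than the ontology-based reasoning the project plans to use. All signal names, labels and thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Low-level raw context as a flat key-value model (one of the modelling
# techniques named in Section 2.4). All keys are hypothetical examples.
RawContext = Dict[str, float]


@dataclass(frozen=True)
class ContextRule:
    """Assigns one high-level context label when its predicate holds."""
    label: str
    predicate: Callable[[RawContext], bool]


# Hypothetical rules; thresholds are illustrative, not taken from the project.
RULES: List[ContextRule] = [
    ContextRule(
        "driving_in_rain",
        lambda c: c.get("speed_kmh", 0.0) > 0.0 and c.get("wiper_level", 0.0) >= 1.0,
    ),
    ContextRule(
        "parked",
        lambda c: c.get("speed_kmh", 0.0) == 0.0 and c.get("ignition_on", 0.0) == 0.0,
    ),
    ContextRule(
        "cold_environment",
        lambda c: c.get("outside_temp_c", 20.0) < 3.0,
    ),
]


def infer_context(raw: RawContext, rules: List[ContextRule] = RULES) -> List[str]:
    """Return every high-level context label whose rule matches the raw data."""
    return [rule.label for rule in rules if rule.predicate(raw)]


if __name__ == "__main__":
    sample = {
        "speed_kmh": 62.0,
        "wiper_level": 2.0,
        "ignition_on": 1.0,
        "outside_temp_c": 1.5,
    }
    print(infer_context(sample))
```

A rule list of this kind is easy to extend per sector, but it does not capture relations between concepts; that limitation is one reason ontological methods are usually preferred for re-usable context models.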




Figure 2: Conceptual Cross-CPP system architecture [16]

3.1    Data harvesting
The Data Harvesting module acts as an intermediate layer between the CPPs and the Company
Backend module. The connection will be realised using 3G/4G/5G mobile web technologies in
the case of vehicles and a wired connection in the case of smart infrastructure devices.
The main functionalities of the module are:

    •     Set-up of the CPP system and data acquisition configuration
    •     Continuous data acquisition and transmission

The first offers functions for configuring the signals coming from the CPPs, for instance
in terms of retrieval and transmission rate. The second is the actual acquisition of the
signal data and its transmission from the CPPs to the Company Backend (see section 3.2).
The main components in this module are:
    •     The data logger in the CPP continuously measures data during CPP usage,
          according to the deployed measurement configuration.
    •     The measured data are continuously aggregated and stored in CPP data packages.
          According to the deployed measurement configuration, the aggregation and
          storage of data is done for each defined measurement channel.
    •     The stored CPP data packages are sent to the CPP Company Backend at the agreed
          frequency.
    •     This component receives the request for management configuration
          initialisation / update from the CPP Company Backend and deploys the new
          configuration if the data owner agrees.
    •     Performs the basic authorization and authentication security services.
Both the Data Harvesting module and the Company Backend are conceptually generic for any
CPP producer/Data Provider but have a company-specific implementation due to the
diversity of the companies and their internal systems.

3.2    Company Backend
The Company Backend module holds the Cross-CPP company data processing chain. It receives
data from the Data Harvesting module and, after processing, enrichment with
company-internal knowledge, formatting and transformation, stores the processed data in
the Common Industrial Data Model (CIDM) format in the CPP Cloud Storage (see section
3.3). The Company Backend module holds bidirectional connections to the CPP Cloud Storage
as well as to the CPP Big Data Marketplace for transferring the data and updates of the
CIDM format.
The main functionalities of the module are to:
    •    interpret and transform proprietary, manufacturer-specific CPP data into
         physical information, in accordance with the agreed owner permissions,
    •    validate the information and, if need be, mask it to enforce privacy,
    •    convert the information into the required quasi-standard information
         representation, the CIDM format, and publish it to the owner’s CPP Cloud
         Storage, and
    •    manage the configuration procedures for the data mining at CPP level by
         providing CPP-specific data logger configurations (for instance, handling the
         case in which CPPs from the same sector come in different models or
         configurations and therefore provide signals that others do not).

3.3    Common Industrial Data Model and CPP Cloud Storage
The Common Industrial Data Model (CIDM) is an open and highly scalable big data format,
designed to harmonize proprietary IoT data into generic datasets.
The structure of the data model consists of three layers depicted in Figure 3.

Figure 3: CPP Data Model main structure
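
The concepts introduced so far (signals, measurement channels that define how signals are aggregated, and data packages that carry the aggregated data together with metadata such as ownership) can be sketched as plain data structures. The sketch below is only an illustration under stated assumptions: the class and field names are hypothetical and are not the actual CIDM schema, and only one aggregation type (a histogram) is shown.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class Signal:
    """Bottom layer: what is measured (name, format, unit)."""
    name: str
    fmt: str
    unit: str


@dataclass(frozen=True)
class MeasurementChannel:
    """Middle layer: which signal to aggregate and how (histogram only here)."""
    channel_id: str
    signal: Signal
    bin_edges: Sequence[float]  # ascending; bin i covers [edges[i], edges[i+1])


@dataclass
class DataPackage:
    """Top layer: aggregated data plus supporting metadata."""
    channel_id: str
    owner: str
    counts: List[int]
    metadata: Dict[str, str] = field(default_factory=dict)


def aggregate_histogram(
    channel: MeasurementChannel, samples: Iterable[float], owner: str
) -> DataPackage:
    """Aggregate raw samples into one histogram data package for a channel."""
    edges = list(channel.bin_edges)
    counts = [0] * (len(edges) - 1)
    dropped = 0
    for value in samples:
        # Index of the half-open bin that contains the value, if any.
        idx = bisect_right(edges, value) - 1
        if 0 <= idx < len(counts):
            counts[idx] += 1
        else:
            dropped += 1  # value lies outside the configured range
    return DataPackage(
        channel_id=channel.channel_id,
        owner=owner,
        counts=counts,
        metadata={"unit": channel.signal.unit, "dropped_samples": str(dropped)},
    )


if __name__ == "__main__":
    speed = Signal(name="vehicle_speed", fmt="float", unit="km/h")
    channel = MeasurementChannel("speed_hist", speed, [0, 30, 60, 90, 130])
    package = aggregate_histogram(
        channel, [12.0, 45.5, 47.0, 88.0, 140.0, 30.0], owner="owner-001"
    )
    print(package.counts, package.metadata)
```

Aggregating on the device in this way keeps the transmitted package small regardless of how many raw samples were measured, which is the motivation the paper gives for measurement channels.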




•     Starting from the bottom part, Signals describe the type of                           •     Catalogue: This component manages the set of signals
      physical phenomena and chemical quantities of vehicles and                                  and measurement channels available for the current
      buildings, including the name of the signal, the format and unit.                           version of the CIDM.
•     As measurements of the phenomena may far exceed the                                    •    Data Providers and Service Provider Manager: Manages
      available transmission bandwidth or the full resolution may                                 data provider users and service provider users, as well
      not be required in most applications, data from the CPP need                                as the sharing process between them.
      to be pre-processed and aggregated according to a                                The module comprises a backend application with a RESTful API
      “measurement channel” that includes the signals to aggregate                     to provide the functionality, an index storage, and a frontend
      (1 or more), the aggregation type (time series, histograms,                      application (web application) to provide the different actors the
      etc.) and the configuration of the aggregation.                                  visualization of the data and the visual management of all the
•     Finally, at the highest level, data packages provide the actual                  resources. The module includes a Software Development Kit for
      data coming from the CPP, aggregated according to a                              retrieving datasets in an easy way.
      measurement channel selected. The data packages are stored                       An added value to the marketplace is the Data Analytics Toolbox
      in and retrieved from the Cloud Storage. In addition to the data,                that extends the marketplace functionality to provide the Service
      data packages also contain metadata with support information                     Providers with Analytics Capabilities. It includes, besides the CPP
      like ownership and quality assessment.                                           Data Analytics toolbox, the Software Development Kit. The former
The CPP Cloud Storage is a cloud-based data storage infrastructure                     is composed of a set of modules facilitating the analysis of the
that offers secure and private “data vaults” to data providers, to                     collected data. It is based on a modular structure, in which new
store their devices’ data packages in the CIDM format. The storage                     analytics services can be added to fulfil new user requirements; and
infrastructure provides an Application Programming Interface                           it is aimed at supporting both fast prototyping of new ideas, and
(API) to enable data collection from the Company Backend as well                       efficient implementation of data synthesis and analysis techniques.
as data access by the CPP Big Data Marketplace. The module                             Each module, devoted to a specific analysis, communicates with the
includes a web application that gives users control of                                 Cross-CPP Marketplace to get the data and return the results. The
their data by managing read or write access permissions granted to                     communication between each module and the central system can
the Marketplace and the Company Backend.                                               take two forms:
                                                                                             •    Pull mode: analyses are performed over the data provided
3.4    CPP Big Data Marketplace and Data                                                          by the system (or the final user), without taking into
      Analytics toolbox                                                                           account the evolution of the system up to that point. The
                                                                                                  user/service provider requests for an analysis and once
The CPP Big Data Marketplace connects Data Providers and Data
                                                                                                  the results are yielded, the whole computation is deleted.
Consumers for selling and acquiring Connected Vehicle and Smart
                                                                                             •    Push, or stream, mode: internal models are updated in an
Building data under the standardized data model (CIDM), assuring
                                                                                                  asynchronous way, using any new data available, such
security and privacy of the data. The Marketplace main purpose is
                                                                                                  that the request for an analysis just implies retrieving the
to allow Services Providers to create new B2B and B2C data-based
                                                                                                  result. In this case, the availability of new data triggers
products and services.
                                                                                                  the update in the analysis and the user accesses results as
The architecture of the Marketplace module consists of a set of
                                                                                                  if they were any other data stream in the system.
components with different responsibilities:
                                                                                       Within the plethora of data analytics techniques that have been
      •   Indexing: This component indexes the metadata of the
                                                                                       developed in the last decades, we have here selected some of them
          CPP data stored in the different Cloud Storages modules
                                                                                       for being relevant in a large range of applications. In short, these
          to provide data discovery services and to locate and
                                                                                       include:
          retrieve CPP data from Cloud Storages when needed.
                                                                                             •    Basic statistics: module that aims at providing with some
      •   Discovery: This interface allows to check the types,
                                                                                                  very simple statistical functions, calculated over a subset
          amount and quality of the CPP data stored in the Cloud
                                                                                                  of the stored data, and with the objective of minimizing
          Storage spaces considering data consumer constrains and
                                                                                                  communication overheads.
          requirements.
                                                                                             •    Time series: module providing the service provider with
      •   Cloud Storage Access: This component has two
                                                                                                  a set of tools for detecting when a time series, that
          responsibilities, to handler data change notifications from
                                                                                                  represents the evolution of a measurement, suffers from
          the Clouds Storages and to request CPP data packages to
                                                                                                  a sudden change.
          the proper Cloud Storage when a Service Provider
                                                                                             •    Trajectories: aims at making a set of basic tools available
          request data.
                                                                                                  to the service provider, in order to simplify the handling
      •   Data Broker: This component provides an interface to
                                                                                                  and manipulation of trajectories.
          retrieve the CPP data in subscription (streaming) mode or
                                                                                             •    Machine learning: module supporting incremental
          in a pull mode (REST request).
                                                                                                  learning algorithms by means of existing libraries and


CPSS2019, October 22, 2019, Bilbao, Spain                                                                                            A. Correia et al.


          frameworks that have proved to be applicable in high-velocity settings.
Both the raw data and the analytics results can be accessed in two ways. On the one hand, the
system provides an SDK for programmatic access to the data. Alternatively, they can be
explored through a GUI, mainly used by Service Providers to select and configure the access to
the cross-sectorial CPP big data pool offered by the Cross-CPP data providers via the CPP
Cloud Storage.

3.5     Cross-CPP Security
The Cross-CPP security approach applies and extends an implementation of the NGAC Standard
[17] that provides fine-grained attribute-based access control for access to the CPP Cloud
Storage.
The distinguishing characteristics of the Cross-CPP implementation derive from the objective
to provide dynamically changing security policies that depend upon the context under which
data streams are used or generated, and to adapt services to the current needs of the user
based on the current context.
To achieve this objective, the implementation is extended with an enhanced declarative policy
language that enables changing policy modes based on the current values of context variables,
and a new Event Processing Point that enables the currently active policy to be dynamically
changed on the basis of context change and the occurrence of events generated within the
access control system or elsewhere in the Cross-CPP system.
The primary components of the access control implementation are a Policy Server that provides
interfaces for policy decisions and for policy administration, an Event Processing Point that
provides an interface to the Context System and a mechanism to execute changes to the access
control policy as the result of specified events, and a Policy Tool to assist in the
development and testing of Declarative Policies and Event-Response packages. These artefacts
are expressed in two distinct languages that are used to configure the behaviour of the Policy
Server and the Event Processing Point respectively.
Client applications are modified to operate on protected resources through a simple Policy
Enforcement Point which consults the Policy Server for grant/deny decisions based on the
currently active policy.

3.6     Context Monitoring and Extraction
The monitoring and extraction module extracts context from the CPP use to support security and
improve services.
The context sensitivity of the Cross-CPP ecosystem can be supported by monitoring and
extracting information about the use of CPP, and can support the adaptation of services (by
allowing the services to retrieve only the information that matches the context extracted) as
well as of the Cross-CPP Security module (adapting the access to resources by individuals
based on the CPP use). Although the implementation of the Context Monitoring & Extraction is
expected to be different depending on the source of the data, the approach followed will be
the same and is explained below.
For each CPP, one has to define which concepts are relevant for the description of the
situations (context) under which the CPP signals are generated and measured. Once the concepts
relevant to the description of the context of CPP data stream generation are defined, the next
step is to define the concepts which are relevant for the cross-sectorial service where it
will be used. As a first approach, some general situations are considered (situations that
could be interesting for a wide range of cross-sectorial services as well as for the Cross-CPP
context-sensitive security enforcement). This will translate into a context model that is
generic for all CPPs in a first phase and specialised for each CPP type in a second step.
With the Context Model defined, the Context Monitoring and Extraction module will
     •     monitor the data that are needed to extract the context
     •     pre-process the monitored raw data
     •     extract context by identifying the current context, based on the monitored raw
           data, the current context model and historic context information stored in a
           context repository
     •     compare the identified situations with previous ones and store them or pass them
           on to other modules that may further consume this information (e.g. Cross-CPP
           Security)
The approach is illustrated in Figure 5.
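The monitor, pre-process, extract and compare steps of the Context Monitoring and Extraction module can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the Cross-CPP implementation: the names `ContextModel` and `ContextExtractor`, the rule-based context model, the window averaging used as pre-processing, and the fallback to the last stored context when no rule matches.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Optional

Sample = Dict[str, float]  # one raw reading per monitored signal


@dataclass
class ContextModel:
    """Hypothetical context model: context label -> rule over pre-processed signals."""
    rules: Dict[str, Callable[[Sample], bool]]
    default: str = "unknown"


@dataclass
class ContextExtractor:
    model: ContextModel
    repository: List[str] = field(default_factory=list)  # historic context information

    def preprocess(self, window: List[Sample]) -> Sample:
        # Assumed pre-processing: average every monitored signal over the window.
        return {key: mean(s[key] for s in window) for key in window[0]}

    def extract(self, window: List[Sample]) -> str:
        features = self.preprocess(window)
        for label, rule in self.model.rules.items():
            if rule(features):
                return label
        # Assumed use of history: keep the last known context if no rule matches.
        return self.repository[-1] if self.repository else self.model.default

    def update(
        self,
        window: List[Sample],
        on_change: Optional[Callable[[Optional[str], str], None]] = None,
    ) -> str:
        context = self.extract(window)
        previous = self.repository[-1] if self.repository else None
        self.repository.append(context)          # store the identified situation
        if on_change is not None and context != previous:
            on_change(previous, context)         # pass it on, e.g. to a security module
        return context


# Illustrative vehicle context model; signal names and thresholds are invented.
vehicle_model = ContextModel(rules={
    "driving_in_rain": lambda f: f["speed_kmh"] > 5 and f["wiper_level"] >= 1,
    "driving_dry": lambda f: f["speed_kmh"] > 5 and f["wiper_level"] < 1,
    "parked": lambda f: f["speed_kmh"] <= 5,
})
extractor = ContextExtractor(vehicle_model)
print(extractor.update([{"speed_kmh": 50, "wiper_level": 2},
                        {"speed_kmh": 60, "wiper_level": 2}]))
```

The `on_change` callback is where a consumer such as the security module could subscribe. A generic model could be specialised per CPP type by supplying a different `rules` dictionary.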


Figure 5: High level view of the functioning of the Context Monitoring and Extraction module
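The interplay between extracted context and the Cross-CPP Security module (Section 3.5) can be sketched in Python in the same hedged way. This is not the NGAC interface and not the project's declarative policy language. The classes `PolicyServer` and `EventProcessingPoint`, the function `enforce`, the flat rule triples and the event and mode names are all hypothetical simplifications. They are chosen only to show a context event switching the active policy mode and a Policy Enforcement Point asking for a grant/deny decision.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

# Simplified rule: (user attribute, operation, resource attribute).
Rule = Tuple[str, str, str]


@dataclass
class PolicyServer:
    """Holds one rule set per policy mode and answers decision requests."""
    modes: Dict[str, Set[Rule]]
    active_mode: str

    def set_mode(self, mode: str) -> None:
        if mode not in self.modes:
            raise KeyError(f"unknown policy mode: {mode}")
        self.active_mode = mode

    def decide(self, user_attrs: Set[str], operation: str, resource_attrs: Set[str]) -> bool:
        # Grant if any rule of the currently active mode matches the request.
        return any(
            ua in user_attrs and op == operation and ra in resource_attrs
            for ua, op, ra in self.modes[self.active_mode]
        )


@dataclass
class EventProcessingPoint:
    """Maps incoming events (e.g. context changes) to policy mode changes."""
    server: PolicyServer
    responses: Dict[str, str]  # event name -> policy mode to activate

    def on_event(self, event: str) -> None:
        mode = self.responses.get(event)
        if mode is not None:
            self.server.set_mode(mode)


def enforce(server: PolicyServer, user_attrs: Set[str], operation: str,
            resource_attrs: Set[str]) -> str:
    """Minimal Policy Enforcement Point: consult the server, return grant or deny."""
    return "grant" if server.decide(user_attrs, operation, resource_attrs) else "deny"


# Illustrative configuration; all attribute, mode and event names are invented.
server = PolicyServer(
    modes={
        "normal": {("service_provider", "read", "weather_data")},
        "emergency": {("service_provider", "read", "weather_data"),
                      ("service_provider", "read", "location_data")},
    },
    active_mode="normal",
)
epp = EventProcessingPoint(server, responses={"context:accident": "emergency",
                                              "context:normal": "normal"})
```

With this configuration a read of `location_data` is denied in the `normal` mode and granted once `epp.on_event("context:accident")` has activated the `emergency` mode. The sketch keeps decision logic (Policy Server) and event handling (Event Processing Point) in separate objects, mirroring the separation described in the text.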


Figure 4: Cross-CPP Functional Architecture of NGAC extended for context sensitivity

4     Industrial Application
The described ecosystem is applied by two data providers (CPP producers):
•   Vehicles
•   Smart infrastructure




and three data consumers (service providers).
As described in the previous sections, easy and secure access to diverse data streams via a
platform like the CPP Big Data Marketplace enables service providers to significantly enhance
existing service solutions, and in some cases also to create innovative new services that have
not been possible before. Within the Cross-CPP project, three main service areas are targeted
to prototype this new approach: general weather forecasting services, weather-based
navigation/warnings, and an e-charging service.

Weather forecasting service
Cross-sectorial data streams can considerably improve the forecasting quality of weather
forecasting models, as they can provide an unprecedented density of the data points necessary
for weather model initialization. Even in the case of only moderate sensor quality compared to
the common meteorological sensors, new plausibility checks and analytics developed within
Cross-CPP can process these data streams appropriately so they can be successfully ingested
into a weather model to help its initialization and thus improve its resulting output. The
conditions and means of successful data assimilation of a diverse range of sensors into an
existing service pose a challenge that is addressed intensively by service providers within
Cross-CPP and assisted by the Analytics Toolbox functionality. One approach to facilitate data
ingestion is the development of a new high-resolution 100x100 m weather model, which allows
for a smoother incorporation of these new data points. Neither the routine operation of such
models nor the inclusion of CPP data into them is known to date. In addition, a new
Plausibility Check has been developed to test neighbouring data points from one source against
each other (homogeneous check) as well as against data points from conventional sources
(heterogeneous check). Besides the enhancement of weather models, the easy and secure access
to new and diverse sensor data, such as windshield rain intensity sensor and wiper data, will
be used to develop a virtual rain radar, an innovative new prototype service that will mimic
the main features of a radar map by using the live sensor data from vehicles. This service is
especially useful in regions and countries that lack expensive radar stations (see e.g.
European radar coverage in [18]), and can also help fill gaps in times of radar outages.
Access to live data from vehicle safety-related sensors, such as road slickness data used for
slippery road detection, can also be used to enhance weather-related warnings to vehicle
owners nearby. It also enables the cross-check of other available weather data, as conditions
like freezing rain in particular are still hard-to-predict weather events.
Furthermore, data from smart building weather stations are used within Cross-CPP to compute
individually tailored weather forecasts for them in order to enhance automatic building
operations (e.g. facade systems and window blinds) or to support energy efficiency. For that
purpose, each building receives its own weather model, which corrects the conventional
forecast for usually unknown on-site specifics that the model learns through the received CPP
signal. The improvement of such individual models compared to a forecast without this
correction is measured by the Mean Absolute Error (MAE) between the DMO (direct model output)
and the observation data. First tests have already revealed a significant error reduction.
Hence, access to smart building data again helps to improve conventional forecasts.

Weather warnings
Another service enhancement made possible by a Big Data Marketplace is weather-based
navigation and the provision of on-route live weather warnings. Within Cross-CPP, a prototype
for a weather-based navigation service will be developed which uses vehicle data as well as
data from a meteorological service provider to offer enhanced navigation that takes current
and future weather conditions on the driver's route into account. It will also provide a live
weather warning mode as well as initial re-routing to avoid bad weather on the trip. These
navigational service enhancements are not only of use for the private consumer, but especially
for the logistics industry and automated driving.

E-mobility charging service
The main idea of the service is to exchange information among data providers related to
“E-Charging”: vehicles will provide information about their battery status and other
information relevant during the charging process, and buildings will provide information about
their e-charger infrastructure (free charging locations and constraints).
The service sends the vehicle information about the presence of charging stations inside the
building or located outside (public parking lot, airport, hospital). The service uses
real-time data in the online communication with the car and the building about the occupancy
of e-chargers placed outside the building or inside (in the garages), as well as the vehicle's
own information about its battery capacity. Together with the vehicle's current position and
speed, this could be used to calculate the time of arrival and to reserve an e-charger for
this specific car. A future idea to extend this would be to use weather stations, which may
also provide relevant data that could be used for calculating the expected electricity
generation as well.

Service applications like these demonstrate the necessity and value of a secure and easy
interface between data providers (CPP producers) and service providers to foster Big Data
related growth within the service sector and unlock business potential that is currently
barely tapped.

5   Conclusions
In this contribution we have described the conceptual view of the work done in the Cross-CPP
project.
The developed CPP Ecosystem Architecture was outlined starting from a first draft of the CPP
Ecosystem Workflow, which defines the information flow between key stakeholders and Cross-CPP
system modules. The pictured architecture concept summarises the Cross-CPP modules and their
key software components and how they correlate in Cross-CPP. Especially the defined Ecosystem
Architecture, with its detailed representation of modules broken




down into a first overview of needed software components, presents a blueprint for the system
development within the project implementation phase.
The project analyses the reuse of work from previous projects to test the hypothesis that data
from diverse CPPs in different sectors may be made available and reused by different Service
Providers to produce cross-sectorial services. Cross-CPP is overcoming the identified
challenges by establishing a CPP Big Data Ecosystem, which will develop the following main
characteristics:
     •    Brand-independent concept, open for integration of diverse CPP data providers
          coming from different industrial areas, also providing a standardized
          cross-industrial CPP data model, setting the basic structure of Cloud Storage(s)
          for the CPP data streams, which needs to be flexible enough to incorporate data
          coming from various industrial sectors.
     •    CPP Big Data Marketplace providing service providers with a single CPP data access
          point with just one interface (one-stop-shop), as well as support functionalities
          for easy data mining/analytics. By these means, data customers (Service Providers)
          just need to set up and maintain one interface to gather diverse CPP data from
          different CPP providers. The Marketplace makes the Cloud Storage for CPP data
          streams seamless to any data consumer, taking security, data ownership and data
          rights into account.
     •    Controlled access to diverse CPP data streams and optimal management of data
          ownership and data rights (covering data flow from CPP owners up to Service
          Providers), applicable to various cross CPP data streams.
     •    Win-Win value chain for all ecosystem partners, due to the fact that the costs for
          the ecosystem in place can be shared by a great many data customers, which will
          make a single service much more economical.

ACKNOWLEDGMENTS
This paper presents work developed in the scope of the project Cross-CPP. This project has
received funding from the European Union’s Horizon 2020 research and innovation programme
under grant agreement no. 780167. The content of this paper does not reflect the official
opinion of the European Union. Responsibility for the information and views expressed in this
paper lies entirely with the authors.

REFERENCES

[1] B. Schmarzo, Big Data: Understanding How Data Powers Big Business, John Wiley & Sons,
     2013.
[2] W. H. Inmon and D. Linstedt, Data Architecture: A Primer for the Data Scientist: Big Data,
     Data Warehouse and Data Vault, Morgan Kaufmann, 2014.
[3] J. Pillmann, C. Wietfeld, A. Zarcula, T. Raugust and D. C. Alonso, “Novel Common Vehicle
     Information Model (CVIM) for Future Automotive Vehicle Big Data Marketplaces,” in IEEE
     Intelligent Vehicles Symposium (IV), Redondo Beach, CA, USA, 2017.
[4] T. W. Jim Davis, “Towards Composable Manufacturing, Smart Apps and Services Marketplace,”
     in NIST/OAGi Workshop on Smart Manufacturing (SM) and Cyber-Physical Production Systems
     (CPPS), Washington, 2016.
[5] “Smart Manufacturing Leadership Coalition,” SMLC, 2017. [Online]. Available:
     www.smartmanufacturingcoalition.org. [Accessed 20 04 2017].
[6] Automat Consortium Partners, “Deliverable 2.2: Overall Innovation & Technology Transfer
     Concept,” 2015.
[7] ATOS, Automat consortium partners, “D4.3 Full Prototype of Vehicle Big Data Marketplace,”
     2017.
[8] BMW, “BMW and MINI CarData, Tailored third-party services for BMW and MINI drivers,”
     [Online]. Available:
     https://www.bmwgroup.com/en/innovation/technologies-and-mobility/cardata.html.
     [Accessed 10 10 2019].
[9] E. M. a. P. A. C. S. João Bártolo Gomes, “Learning recurring concepts from data streams
     with a context-aware ensemble,” in Proceedings of the 2011 ACM Symposium on Applied
     Computing (SAC '11), New York, NY, USA, 2011.
[10] R. S. a. P. P. R. João Gama, “Issues in evaluation of stream learning algorithms,” in
     Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and
     data mining (KDD '09), New York, NY, USA, 2009.
[11] A. Z. a. S. K. Mohamed Medhat Gaber, “Mining data streams: a review,” SIGMOD Rec.,
     vol. 34, no. 2, pp. 18-26, June 2005.
[12] Juniper Consortium Partners, “Juniper Project website,” 2019. [Online]. Available:
     http://www.juniper-project.org/. [Accessed 28 August 2019].
[13] T. Strang and C. Linnhoff-Popien, “A Context Modeling Survey,” in Workshop on Advanced
     Context Modelling, Reasoning and Management, The Sixth International Conference on
     Ubiquitous Computing, Nottingham, 2004.
[14] D. Stokic, S. Scholze and O. Kotte, “Generic Self-Learning Context Sensitive Solution for
     Adaptive Manufacturing and Decision Making Systems,” in ICONS 2014, Nice, 2014.
[15] S. Scholze, J. Barata and D. Stokic, “Holistic Context-Sensitivity for Run-Time
     Optimization of Flexible Manufacturing Systems,” Journal Sensors, no. 17, p. 455, 2017.
[16] Cross-CPP Consortium, “D1.3 Public Innovation Concept,” 2019.
[17] International Committee for Information Technology Standards, Cyber Security technical
     committee 1, “Information technology – Next Generation Access Control – Functional
     Architecture,” INCITS 499, 2013.
[18] Huuskonen et al., “The Operational Weather Radar Network in Europe,” Bull.Am.Met.Soc.,
     2014.

