A cloud-to-edge architecture for predictive analytics

David Bowden∗, Angelo Marguglio†, Lucrezia Morabito‡, Chiara Napione‡, Simone Panicucci‡, Nikolaos Nikolakis§, Sotiris Makris§, Guido Coppo∗∗, Salvatore Andolina∗∗, Alberto Macii††, Enrico Macii‡‡, Niamh O’Mahony∗, Paul Becker§§, Sven Jung§§

∗ DELL EMC, Cork, IRELAND, name.surname@dell.com
† Engineering Ingegneria Informatica S.p.A., Palermo, ITALY, name.surname@eng.it
‡ COMAU S.p.A., Turin, ITALY, name.surname@comau.com
§ Laboratory for Manufacturing Systems & Automation, Department of Mechanical Engineering and Aeronautics, University of Patras, Patras, GREECE, name.surname@lms.mech.upatras.gr
∗∗ SynArea Consultants S.r.l., Turin, ITALY, name.surname@synarea.com
†† Department of Control and Computer Engineering, Politecnico di Torino, Turin, ITALY, alberto.macii@polito.it
‡‡ Interuniversity Department of Regional and Urban Studies and Planning, Politecnico di Torino, Turin, ITALY, enrico.macii@polito.it
§§ Fraunhofer Gesellschaft zur Förderung der angewandten Forschung, Aachen, Germany, name.surname@ipt.fraunhofer.de

© 2019 Copyright held by the author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2019 Joint Conference (March 26, 2019, Lisbon, Portugal) on CEUR-WS.org.

ABSTRACT
Data management and processing to enable predictive analytics in cyber-physical systems holds the promise of creating insight into the underlying processes, discovering criticalities and predicting imminent problems. Hence, proactive strategies can be adopted, with respect to predictive analytics. This paper discusses the design and prototype implementation of a plug-n-play end-to-end cloud architecture, enabling predictive maintenance of industrial equipment. This is enabled by integrating edge gateways, data stores at both the edge and the cloud, and various applications, such as predictive analytics, visualization and scheduling, integrated as services in the cloud system. The proposed approach has been implemented in a prototype and tested in an industrial use case related to the maintenance of a robotic arm.

1 INTRODUCTION
The advent of the Industry 4.0 trend in automation and data exchange leads to a constant evolution towards smart environments, including an intensive utilization of Cyber-Physical Systems (CPS). This promotes a full integration of manufacturing IT and control systems with physical objects embedded with electronics, software and sensors. This new industrial model leads to a pervasive integration of information and communication technology into productive components, generating massive amounts of data. Powerful and reliable cyber-physical architectures are becoming prominent to effectively analyze such large amounts of data, creating insight into the production process and, thus, enabling its improvement, as well as competitive business advantages.

This paper presents a cloud architecture designed for the Industry 4.0 vision, bridging the gap between the physical world, which provides raw data, and the cyber space, which processes those data and creates insight, aiming to enable predictive analytics at the edge. With the goal of anticipating failures and estimating the remaining useful life (RUL) of physical equipment, a two-tier data analytics architecture has been developed. This architecture, the "SERENA" system, can identify the symptoms of imminent machine failure, through the characterization of the current dynamics of the process/machine (at any given time) using on-line data collected in the factory. A scalable and modular approach has been taken in the design of the architecture, decoupling the overall design from any specific set of technologies. For testing and validating the proposed approach, a prototype has been implemented, integrating services such as visualization, scheduling and predictive analytics. This prototype was validated in a real-world scenario involving anomaly detection on a robotic axis and concerning the maintenance requirements caused by the backlash effect. The visualization service enables a real-time data stream and machine visualization, while the predictive analytics services generate the estimated RUL value, which is consumed by the scheduling service to proactively schedule the maintenance activities.

This paper is organized as follows. Section 2 discusses the state-of-the-art analytics approaches targeting Industry 4.0. Section 3 describes the proposed cloud-to-edge architecture for predictive analytics, while Section 4 introduces the industrial use case. Section 5 presents the preliminary results achieved by exploiting the proposed architecture in a real use case. Finally, Section 6 draws conclusions and discusses future developments of the proposed cloud-to-edge architecture for predictive analytics in Industry 4.0.

2 RELATED WORK
A large variety of studies have been carried out to develop efficient data management systems, data analytics engines, business processes and risk assessment in Industry 4.0. The authors in [15] presented a case study exploiting Big Data analytics to improve production processes. It exploited a methodology, called Cross-Industry Standard Process for Data Mining, to present and organize results for better understanding businesses. The work in [7] presented an integrated self-tuning engine for predictive maintenance in Industry 4.0.
Specifically, a distributed architecture, based on Apache Kafka, Spark Streaming, MLlib, and Cassandra was proposed and discussed. The proposed approach integrated the monitoring and prediction tasks, along with a self-tuning approach for the dynamic selection of the best predictive algorithm, and specific attention to providing interpretable knowledge to end users. Manufacturing computerization is another crucial issue to be addressed in the Industry 4.0 ecosystem. The study presented in [11] proposed a semantic reduction of heterogeneous sources, based on Semantic Web approaches, to foster better analytics implementations.

Another interesting issue to address is the increasing amount of data to be managed by machine learning techniques. In this context, an interesting comparison between multi-class classifiers and deep learning techniques is discussed in [12]. Furthermore, a comparative experimental analysis of exploratory techniques for Big Data is provided in [3]. The authors in [2] present a scalable Big Data predictive approach for the energy domain. The study presented in [9] proposed a framework for on-demand remote sensing data analysis to speed up the execution of models by reducing data transfers through the network. This allows classical remote data service systems to evolve into remote sensing data processing infrastructures.

Advanced Internet of Things (IoT) and ICT technologies allow linking physical manufacturing facilities and machines in integrated applications. The authors in [5] provided a review of virtualized and cloud-based services in the context of manufacturing systems. A predictive maintenance approach involving cyber-physical systems with wide IoT capabilities along with complex event processing features was discussed.

Among the most widespread maintenance approaches, condition based maintenance (CBM) is usually considered the most effective. Efficiently determining the health status of a monitored device, in such a context, is of major importance. Prognostics and diagnostics applied to raw data collected from sensors aim to determine the health of the monitored system or equipment. To this end, detecting and analyzing underlying data trends allows anomalies to be discovered. An overview of data analytics techniques for anomaly detection is provided in [13]. The authors exploited artificial neural networks in large systems to effectively predict their health. Prognostics or predictive analytics are usually associated with the computation of a key performance indicator, such as the RUL. The authors in [17] presented a deep-belief network ensemble method with multiple objectives to estimate RUL. Similarly, the authors in [4] exploited a neural-network prognostics model to support industrial maintenance scheduling. The failure probabilities were computed from real-world equipment measurements through a logistic regression approach. Such measurements were then routed to a prognostics model to forecast failure conditions and, finally, to estimate RUL. In this scenario, predictive analytics are affected by the quality of data used for prediction. The authors in [6] proposed a method for improving data quality in diagnosing the health of devices and production equipment. First, a visualization-based grouping, based on the dissimilarity spectrum, was performed on critical measurements, which were then clustered and evaluated, in terms of their fitness and separation from each other. An outlier-detection visual assessment was also presented to identify outliers in the data.

3 THE SERENA APPROACH
The SERENA system comprises a number of services, which collectively provide predictive analytics functionality, enabling predictive maintenance policies to be applied. It is implemented using a light-weight micro-services architecture, which utilizes Docker containers to wrap the services into deployable units. The services can then be distributed across the SERENA hybrid cloud, extending their functionality out to edge gateways on the factory floor. The distribution of services and dynamic communications channels is implemented using a Docker orchestration manager. Wrapping services in containers abstracts them from the underlying host infrastructure. As Docker is a commonly supported open source technology, the SERENA system can be deployed on a wide variety of infrastructures, from hardware servers and gateways, through virtual machines, to hosted environments on public and hybrid clouds. Using the same Docker solution across the SERENA hybrid cloud and gateways gives the system a unified architecture, which can be operated and managed as a single unit. The services represent logical elements that provide defined functionality in the SERENA system. Whilst the SERENA reference implementation uses specific technology to realize each service, the common interface allows technology to be swapped, depending on the specific implementation requirements. This technology transparency is an important concept in SERENA’s plug-n-play architecture. Figure 1 illustrates the main components of the SERENA system and their interactions, which are further described in the following subsections.

Figure 1: SERENA Architecture

3.1 Services
The SERENA system is designed to integrate external applications using a service-oriented approach. In this context, the following services have been designed, implemented and integrated in the above mentioned system:

A predictive analytics service, based on machine learning techniques, to forecast future failures of machinery/equipment. The aim of this service, whose functional building blocks are shown in Figure 2, is two-fold: (i) building a prediction model, based on historical data, by means of machine learning algorithms; and (ii) applying such a model in real time to new incoming data streams, to identify possible failures. A two-tier architecture exploiting both edge and cloud computing has been proposed to address phase (i) in the cloud, exploiting (theoretically) unlimited resources, while phase (ii), which requires fewer computation resources, runs at the edge, given the limited resources available there and in order to increase responsiveness.

Figure 2: The predictive analytics service: main building blocks

The smart data block derives relevant static features from the raw data (in many cases raw data are time series), supporting the predictive maintenance goal. Smart data represents the key characteristics of the raw data, as well as context information about how the data was collected and the operating conditions of the equipment it was collected from. In the current implementation, the block computes a large variety of statistical indices, including maximum, minimum, mean, peak to peak distance, variance, inter-quartile ranges, standard deviation, root mean square, kurtosis and skewness.

The model building block is executed on a batch schedule on historical data. These data include the smart data computed over the original time series and their corresponding class labels (e.g., failure presence or absence, category of failures). All data are related to an industrial device/robot/piece of equipment of interest that can fail and for which a predictive maintenance strategy should be defined.

Many classifiers do not manage time series data by design but, since the original time series of measurements are not considered for training the model, a wide range of classifiers could be used. In the current implementation, to train a predictive model, the proposed methodology exploits one of the following machine learning algorithms: Neural Networks (NN) [16], Random Forest (RF) [10], Logistic Regression (LR) [10], Support Vector Machines (SVM) [10], and Gradient-Boosted Tree (GBT) [8].

In the validation block, the performance of the prediction block is evaluated by exploiting either a k-fold stratified cross-validation or a hold-out strategy, depending on the cardinality of the training set. In addition, the training dataset defaults to the available historical data, although shorter and more specific periods can be selected to address ad-hoc predictive maintenance issues. The prediction performance is evaluated through quality indices: accuracy evaluates the overall efficacy of the classifier, whilst f-measure, precision, and recall offer important insights on the performance of the classifier with respect to a given class.

A forecasted failure time horizon is generated as the output of the predictive analytics service and consumed by a scheduling service [14]. The aim of the service is to prevent the predicted failure, by assigning the required maintenance activities to operators within the given timeframe. This service can be extended to consider the current production plan, hence fitting the maintenance activities within a given time slot to optimise production outputs.

The visualization block provides a 3D view of the relevant machinery/equipment, using the data collected in the field, along with the results of the predictive maintenance algorithms. This service allows for the presentation of the data and maintenance operations to the local technicians or remote experts in an effective and intuitive way.

3.2 Edge gateway
As illustrated on the left of Figure 1, the edge gateways are located on the factory floor and collect sensor data from industrial equipment and channel it, through the data flow engine, to the communications broker running in the SERENA hybrid cloud. The gateways also host analytics models, which are used to process data at the edge, converting raw data into smart data. Both the data flow engine and the analytics model are deployed to the gateway as Docker containers, under the control of the Docker orchestration manager. The gateway can host multiple types of data flow engines and analytics models, depending on the types of equipment that are being monitored. The majority of the raw sensor data is transformed into smart data by the analytics model running on the gateway, but when specific criteria are met, a sample of the raw sensor data is sent to the SERENA cloud for more in-depth analysis and to train the analytical models. Typically, gateways have enough computing power to run analytical models, but not to train them. The training is handled by the predictive analytics service running in the SERENA cloud. The service uses the raw data to train the prediction algorithms and packages the resulting models in Docker containers, which are deployed to the edge gateways.

3.3 Broker
As shown in the middle of Figure 1, the communications broker acts as the central communication hub between the Data Stores, SERENA services and the edge gateways. The broker primarily handles HTTPS traffic by exposing a number of REST endpoints. The broker also supports a number of other protocols, such as MQTT and Web Services, for real-time data transfer. In addition to receiving sensor data directly from the factory floor, the broker acts as the access point for external facilities, such as enterprise resource planning (ERP) systems. Security is a critical part of the SERENA system, and the broker, as the communications hub, provides secure channels to and from the gateways and other SERENA services. It also validates the authenticity of incoming messages, and whether the requester is authorized to use the requested service.

3.4 Cloud data storage
SERENA supports a number of different data stores, depending on the type of data and its function within the system, including raw sensor data, smart data, metadata, equipment manuals, 3D objects for virtual reality applications, etc. The data stores and data repositories are also implemented as containerized SERENA services, which gives them the same flexibility as other services on the system.

3.5 Docker orchestration
The orchestration manager is responsible for deploying the service containers to the host infrastructure and managing their lifecycle. It also defines and manages the communications channels between services. The core SERENA cloud services are deployed as resilient clusters of Docker containers. If a container fails, the orchestrator automatically starts a new container to replace it. Additionally, the orchestrator can be used to increase or decrease the number of containers in a service, thus scaling the operation of the service. As the SERENA cloud servers and gateways are registered within the same Docker domain, the orchestrator can manage the deployment of new services (e.g. data flow engines and analytics models) from the SERENA cloud, all the way to the edge gateways. Docker uses labels to specify which containers are deployed to which hosts. If a new data flow engine is required to support a piece of equipment, the appropriate Docker label is defined on the host gateway; the Docker orchestrator will then ensure that the appropriate data flow container is automatically deployed to the gateway. In this way, thousands of containers can be deployed to hundreds of gateways, simply by defining the appropriate Docker labels.

3.6 Docker registry
The SERENA system also implements its own local Docker image registry. Docker containers are deployed from images in the local registry, rather than using a public registry, which ensures that the required images are always available locally, and from a trusted source.

4 USE CASE AND EXPERIMENTAL SETTING
COMAU (https://www.comau.com/) deploys industrial robots around the world and has an increasing requirement to collect data that monitors the health status of all its machines, in order to avoid sudden failure. To cope with this complexity, further studies on predictive maintenance approaches are needed. For this reason a test-bed has been built, which is called RobotBox and consists of a motor from a Comau medium size robot, with its associated controller. It also includes an adaptor, a belt and a 5 kg weight, which simulates an end effector. The choice of using a single axis rather than an entire robot is due to the fact that manipulating a robot is very expensive. In addition, in a complete robot there are many factors having an impact on the physical conditions of the robot behaviour (e.g. frictions, temperature, vibrations, humidity, etc.). It is difficult to isolate single effects and decouple environment phenomena, especially because the only two monitored signals are the axis position and the current required by the motor to perform the expected cycle. So even noise signals have an impact on these two time series. Nevertheless, it is possible to extend the knowledge acquired from the single axis to robots with more axes, in order to derive a comprehensive knowledge of the asset health status.

This initial experiment only takes into account position and current, but in the future more parameters will be collected and analyzed.

From a predictive maintenance perspective two possible motor failures have been defined, namely backlash and incorrect belt tensioning. In this study we focus on the belt tensioning issue.

Real data. COMAU collects real data by monitoring a motor from a Comau medium size robot, with its associated controller. The collection phase ran from September 2018 to December 2018, during which a cycle was collected every 120 seconds. The sampling period for raw data is 2 milliseconds, i.e. the sampling frequency is 500 Hz. With one cycle every 120 seconds over this period (122 days at 720 cycles per day), the total number of monitored cycles is 87,840. A cycle is the sequence of moves which the motor has been coded to perform in a loop; in this case the cycle lasts for 24 seconds.

In order to study the belt tensioning phenomenon with a machine learning approach, six levels of tensioning have been defined with the domain expert’s help. The dataset collected is not balanced, which means the number of samples for each tensioning level is different; this slightly complicates the machine learning approach but allowed us to collect more data about the levels that are most difficult to analyze. Each datapoint consists of all the information provided by the RobotBox controller and by the user setting connected to the choice of the belt tensioning degree:
• header information: machine id, program number, cycle start time, cycle time;
• time series data: position and current data, collected with a sampling time of 2 milliseconds for a duration of 24 seconds;
• label: level of belt tensioning.

Smart data. From the current raw data, 12 statistical features have been calculated and used to classify each cycle independently. Smart data include: maximum, minimum, mean, peak to peak distance, variance, standard deviation, root mean square (rms) of raw data, kurtosis, skewness and rms of three types of filter on current data (low pass, band-pass and high pass filter). In previous internal studies these features have proved to be effective to model a current cycle.

Experimental setting. In order to implement the first prototype of the proposed architecture, position and current data were acquired by the RobotBox controller and transmitted to the Gateway in a JSON format (the .log file), as presented in Figure 3. The Gateway then calculates some statistical features (i.e., smart data) of the current time series. Then, it communicates with two other services, listed below, to obtain classification information about the RobotBox cycle:
• a neural network classifier able to recognize the belt tensioning level;
• a classifier capable of giving a qualitative backlash status and an estimation of the remaining useful life expressed in days.
At the end, all the raw data, the current features and the classifiers’ outputs are sent to the broker ingestion service running on the cloud.

Figure 3: Experimental setting

The use of Node-RED (https://nodered.org/) makes it possible to program remotely each block which implements the required functionalities, since the operator only has to connect or launch the flow. The Node-RED service in the RobotBox controller, the two classifiers and the Node-RED flow in the Gateway are all located in Docker containers and the relative images have been created and added to the SERENA Docker Registry.

Big data framework. As a first attempt, the proposed architecture exploits a NoSQL database as a cloud storage layer. However, the current solution is planned to be replaced by a Big Data framework, exploiting the Cloudera stack. A MongoDB solution has been adopted in the experiment, in order to ease the data management services delivered upon flexible message formats, guaranteeing fast performance on both write/read directions. MongoDB collections and contents have been exposed through HTTP REST endpoints to the SERENA Ingestion Service running on the cloud, implemented using Apache NiFi.

HTTP and real time feed (MQTT). A service in a Docker container for almost real time data streaming has been deployed: this is situated in the RobotBox controller and it publishes data to an MQTT broker in the cloud, so as to make data available to the visualization application which subscribes to the same topic. The position data stream was used to update a virtual representation of the RobotBox, and a sample period of 50 milliseconds was chosen as a good trade-off between the visualization quality and the bandwidth requested.

5 PRELIMINARY RESULTS
In this section, some preliminary results obtained through the exploitation of the proposed architecture and its provided services are discussed.

5.1 The predictive analytics service
The predictive analytics has been tailored to forecast the belt tensioning level. To this end, a machine learning algorithm has been applied to smart data, in order to recognize a tensioning level, given a new cycle of data. In the current implementation, we exploited the TensorFlow library [1] to implement a Neural Network algorithm. After an in-depth sensitivity analysis, the specific algorithm parameters were set to the following:
• two hidden layers with, respectively, 50 and 25 neurons;
• Adam optimiser with default values;
• cross-entropy loss;
• 100,000 epochs.

Given the amount of data available, a hold-out approach was used to divide the dataset into train and test sets, with 75% of the data used for training and the remainder used for test. In both datasets, shuffling has been performed, and batch sizes of 300 samples for the training set and 100 for the test set have been chosen.

Since the belt tensioning is changed by moving the motor with respect to the adaptor and in order to make experiments reproducible, six washers have been used to discretize the six levels of belt tensioning taken into account (each washer is 0.2 mm thick). The lower the number of washers, the higher both the belt tensioning and the current consumption.

The accuracy of the final model was found to be approximately 94%. Table 1 shows the confusion matrix. The classes with 0 and 1 washers are recognized almost perfectly by the model, because those classes are easily separable: the belt is extremely tense and thus the current consumption is different from the other classes. Regarding the other classes, even though the model has good performance, there are more incorrect classifications due to an environmental factor: the temperature. In fact we have noticed that the higher the environment temperature, the lower the current consumption of the motor, due to lower friction between the motor components. The shift introduced by this phenomenon is quite similar to the one caused by adding or removing a washer, which is why our model has difficulty in identifying the correct class. Future work might consider the temperature as another feature to be considered by the classifier.

Table 1: Confusion matrix (rows: actual class; columns: predicted class)

Actual \ Predicted      0      1      2      3      4      5
0                    1373      0      0      0      0      0
1                       0   2145      0      5      0      0
2                       0      0   6673     97     67     32
3                       0      0     36   3718     40    219
4                       0      0     51    230   3302    293
5                       0      0      0    119     30   3530

5.2 Visualization application
In order to set up the first experiment in the Comau test-bed, SynArea (http://www.synarea.com/) has developed an HTML5 Unity 3D interactive prototype application, to show the 3D model of the RobotBox in a web browser. Furthermore, some interface methods have been implemented to manage the information coming from the SERENA platform. In particular:
• display, in near real-time, warnings, errors and RUL with different colors highlighting the involved part, to immediately capture the operator’s attention, and provide an intuitive indication of the main information to check;
• preventive and predictive textual information to be displayed by selecting the involved part of the RobotBox;
• 3D virtual procedure to guide the operator while performing the replacement of the involved part (i.e. an example of operator support);
• subscribing to the defined topic of the MQTT broker in the cloud, used for the data stream, visualize the real-time position on the RobotBox 3D model, to enable a remote monitoring of the physical behaviour observed.

Figure 4 shows a screenshot of the HTML5 Unity 3D interactive application showing the Comau RobotBox without the associated controller. The central (yellow) element is a 5 kg weight simulating an end effector, and the highlighted element is a medium size motor of a Comau robot, connected with an adaptor and a belt.

Figure 4: 3D visualization of the RobotBox

The application is connected to the SERENA cloud platform in order to provide intuitive and real-time information to the maintenance operator, as a result of the analytics and predictive algorithms, and to enable remote monitoring using the position.

The highlighted color on the motor shows its failure status (green = correct; yellow = warning; red = failure) and, by clicking on it, an information box (on the left side) is displayed with some important prognostic or predictive values, such as the label (level of belt tensioning) and RUL (Remaining Useful Life).

By clicking on the "Maintenance Procedure" button, a virtual procedure of the belt replacement and tensioning is also displayed.

5.3 Scheduling application
The scheduling service has been implemented in Java, following a client-server architecture. The service inputs include the monitored equipment, RUL value, maintenance tasks, including precedence relations and default duration per level of operator experience, and a number of potential operators with their characteristics, such as experience level. The server side includes a multi-criterion decision making framework, evaluating the alternative scheduling configurations, ranking them and selecting the highest ranked one. The client side communicates with the server side via RESTful APIs, supporting the following functionalities:
• editing of tasks, resources, equipment;
• time series visualisation;
• process plan Gantt visualisation.

The processing time required to create a new schedule depends on the complexity of the schedule, referring to the number of tasks, resources, and their dependencies, along with the evaluated criteria. In the current experiment, the schedule was generated in approximately 11 msec, and included the execution of two tasks, machine inspection and replacement of the gearbox, along with three potential resources: (1) a team of one newcomer and one of middle experience, (2) one newcomer and one expert and (3) a team of one expert. The difference in task completion time as well as cost is presented in Table 2, per task.

Table 2: Information used by the scheduling service for the experiment

Resource            Task name                    Duration (minutes)   Cost (Euros/minute)
Newcomer, Middle    machine inspection           20                   0.25
Newcomer, Expert    machine inspection           120                  0.25
Expert              machine inspection           15                   0.4
Newcomer, Middle    replacement of the gearbox   100                  0.4
Newcomer, Expert    replacement of the gearbox   10                   0.5
Expert              replacement of the gearbox   80                   0.5

6 CONCLUSIONS AND FUTURE APPLICATIONS
This work presents a flexible and scalable architecture merging cloud based and edge deployed components. Through the proposed unified integration and deployment concept, a variety of applications can be enabled, considering underlying CPS features and under the vision of Industry 4.0 and connected factories. In this regard, the proposed architecture has been designed with the goal of addressing some common needs of industrial enterprises such as:
• compatibility with both the on-premise and the in-the-cloud environments;
• exploitation of reliable and largely supported Big Data platforms;
• virtually unlimited horizontal scalability;
• easy deployment through containerized software modules.

To test the proposed approach, a prototype has been created and validated in an industrial use case on the predictive maintenance of a robotic manipulator, in particular the RobotBox device. To enable the evaluation on the basis of predictive analytics, visualization and consequent maintenance planning, three applications have been integrated as services. As a result of the validation of the early prototype, the integrated solution managed to bridge the gap between machine data acquisition and generation of predictive maintenance policies based on the analysis of the acquired data. Additionally, dynamic allocation of Docker containers at the edge was achieved, enabling a dynamic way of allocating functionalities to shop floor equipment, as long as they are connected to the cloud platform and properly labelled. Existing frameworks, such as Arrowhead (http://www.arrowhead.eu/), provide a high-level representation of the underlying architecture without any specification on an end-to-end implementation with a certain set of components addressing some application. This paper provides a reference implementation for a predictive maintenance system using a certain set of components and a specific interaction mechanism. Moreover, the presented implementation is not coupled to any specific technology or technique, thus making it suitable for overlaying other reference architectures, such as Fiware (https://www.fiware.org/). The architecture presented in this work has been focused on flexibility and ease of implementation, extension and deployment. A set of technologies has been used without restricting any user to adopt the same set of technologies, for example, for data storage, or final user services, such as scheduling. As a result, they can be easily substituted, following the proposed integration approach. This will allow the proposed architecture to fit a variety of applications and domains. Hence, the main contribution of this work is providing a set of required end-to-end functionalities for creating a cloud platform for Industry 4.0, not limited to the maintenance domain.

Future activities will focus on integrating additional functionalities into the overall architecture, such as data security features, increasing the robustness of the integrated solution, and evaluating it in versatile use cases with the aim of improving its efficiency and user-friendliness. Moreover, with regard to the data analytics, further investigation and research is required to identify the most appropriate algorithms for enabling data driven predictive analytics and validating their outcome.

ACKNOWLEDGMENT
The research leading to these results has received funding from the European Commission under the H2020-IND-CE-2016-17 program, FOF-09-2017, Grant agreement no. 767561 "SERENA" project, VerSatilE plug-and-play platform enabling REmote predictive mainteNAnce.

REFERENCES
[1] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). https://www.tensorflow.org/ Software available from tensorflow.org.
[2] A. Acquaviva, D. Apiletti, A. Attanasio, E. Baralis, L. Bottaccioli, F. B. Castagnetti, T. Cerquitelli, S. Chiusano, E. Macii, D. Martellacci, and E. Patti. 2015.
Energy Signature Analysis: Knowledge at Your Fingertips. In 2015 IEEE Interna- tional Congress on Big Data. 543–550. https://doi.org/10.1109/BigDataCongress. 2015.85 [3] Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, Fabio Pulvirenti, and Luca Venturini. 2017. Frequent itemsets mining for Big Data: a comparative analysis. Big Data Research 9 (2017), 67–83. [4] S. A. Asmai, A. S. H. Basari, A. S. Shibghatullah, N. K. Ibrahim, and B. Hussin. 2011. Neural network prognostics model for industrial equipment mainte- nance. In 2011 11th International Conference on Hybrid Intelligent Systems (HIS). 635–640. https://doi.org/10.1109/HIS.2011.6122180 [5] Radu F. Babiceanu and Remzi Seker. 2016. Big Data and virtualization for manufacturing cyber-physical systems: A survey of the current status and future outlook. Computers in Industry 81 (2016), 128 – 137. https://doi.org/ 10.1016/j.compind.2016.02.004 Emerging {ICT} concepts for smart, safe and sustainable industrial systems. [6] Yan Chen, Feibai Zhu, and Jay Lee. 2013. Data quality evaluation and im- provement for prognostic modeling using visual assessment based data par- titioning method. Computers in Industry 64, 3 (2013), 214 – 225. https: //doi.org/10.1016/j.compind.2012.10.005 [7] Tania Cerquitelli Alberto Macii Enrico Macii Massimo Poncino Daniele Apiletti, Claudia Barberis and Francesco Ventura. 2018. iS- TEP: an integrated Self-Tuning Engine for Predictive maintenance in Industry 4.0. In 16th IEEE International Symposium on Parallel and Distributed Processin with Applications, ISPA-18 Melbourne, Australia, December 11-13, 2018. 8. [8] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. [n. d.]. The elements of statistical learning: data mining, inference and prediction (2 ed.). Springer. [9] Z. Huang, A. Zhong, and G. Li. 2017. On-Demand Processing for Remote Sensing Big Data Analysis. 
In 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Con- ference on Ubiquitous Computing and Communications (ISPA/IUCC). 1241–1245. https://doi.org/10.1109/ISPA/IUCC.2017.00187 [10] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An introduction to statistical learning. Vol. 112. Springer. [11] V. Jirkovsky, M. Obitko, and V. Marik. 2016. Understanding Data Heterogeneity in the Context of Cyber-Physical Systems Integration. IEEE Transactions on Industrial Informatics PP, 99 (2016), 1–1. https://doi.org/10.1109/TII.2016. 2596101 [12] M. Mis̆ kuf and I. Zolotová. 2016. Comparison between multi-class classifiers and deep learning with focus on industry 4.0. In 2016 Cybernetics Informatics (K I). 1–5. https://doi.org/10.1109/CYBERI.2016.7438633 [13] J. Murphree. 2016. Machine learning anomaly detection in large systems. In 2016 IEEE AUTOTESTCON. 1–9. https://doi.org/10.1109/AUTEST.2016.7589589 [14] Nikolaos Nikolakis, Apostolos Papavasileiou, Konstantinos Dimoulas, Kiriakos Bourmpouchakis, and Sotirios Makris. 2018. On a versatile scheduling concept of maintenance activities for increased availability of production resources. Procedia CIRP 78 (2018), 172 – 177. https://doi.org/10.1016/j.procir.2018.09.065 6th CIRP Global Web Conference âĂŞ Envisaging the future manufacturing, design, technologies and systems in innovation era (CIRPe 2018). [15] M. Niñ o, J. M. Blanco, and A. Illarramendi. 2015. Business understanding, challenges and issues of Big Data Analytics for the servitization of a capital equipment manufacturer. In 2015 IEEE International Conference on Big Data (Big Data). 1368–1377. https://doi.org/10.1109/BigData.2015.7363897 [16] Jürgen Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural Networks 61 (2015), 85–117. [17] C. Zhang, P. Lim, A. K. Qin, and K. C. Tan. 2016. 
Multiobjective Deep Belief Networks Ensemble for Remaining Useful Life Estimation in Prognostics. IEEE Transactions on Neural Networks and Learning Systems PP, 99 (2016), 1–13. https://doi.org/10.1109/TNNLS.2016.2582798
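As an illustration of the ranking step described in Section 5.3, the following Python sketch scores the three candidate teams using the durations and cost rates of Table 2. It is only a sketch under stated assumptions, not the Java service itself: the paper does not report the criteria, weights or aggregation rule of the multi-criterion framework, so the two criteria (total duration and total cost), the min-max normalisation, the weighted sum, the example weights and the assumption that the same team performs both tasks one after the other are all hypothetical, as are the names TABLE_2, Alternative, build_alternatives, utility and rank.

```python
from dataclasses import dataclass

# (duration in minutes, cost in Euros/minute) per team and task, copied from Table 2.
TABLE_2 = {
    "newcomer + middle": {
        "machine inspection": (20, 0.25),
        "replacement of the gearbox": (100, 0.4),
    },
    "newcomer + expert": {
        "machine inspection": (120, 0.25),
        "replacement of the gearbox": (10, 0.5),
    },
    "expert": {
        "machine inspection": (15, 0.4),
        "replacement of the gearbox": (80, 0.5),
    },
}


@dataclass(frozen=True)
class Alternative:
    team: str
    duration: float  # total minutes; assumes the two tasks run sequentially
    cost: float      # total Euros


def build_alternatives(table):
    """One scheduling alternative per team: the team performs both tasks."""
    alternatives = []
    for team, tasks in table.items():
        duration = sum(minutes for minutes, _ in tasks.values())
        cost = sum(minutes * rate for minutes, rate in tasks.values())
        alternatives.append(Alternative(team, duration, cost))
    return alternatives


def utility(value, worst, best):
    """Min-max normalisation to [0, 1], where 1 is the best observed value."""
    if worst == best:
        return 1.0
    return (worst - value) / (worst - best)


def rank(alternatives, weights):
    """Return (score, alternative) pairs, highest weighted score first."""
    durations = [a.duration for a in alternatives]
    costs = [a.cost for a in alternatives]
    scored = []
    for a in alternatives:
        # Lower duration and lower cost are better, so max is "worst".
        score = (
            weights["duration"] * utility(a.duration, max(durations), min(durations))
            + weights["cost"] * utility(a.cost, max(costs), min(costs))
        )
        scored.append((score, a))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical weights; the paper does not state the ones it used.
    example_weights = {"duration": 0.7, "cost": 0.3}
    for score, alt in rank(build_alternatives(TABLE_2), example_weights):
        print(f"{alt.team}: {alt.duration:.0f} min, {alt.cost:.2f} EUR, score {score:.3f}")
```

Under these assumptions, the totals are 120 min and 45 Euros for the newcomer and middle team, 130 min and 35 Euros for the newcomer and expert team, and 95 min and 46 Euros for the single expert. Weights that favour duration therefore rank the single expert first, while weights that favour cost rank the newcomer and expert team first, which shows how the choice of criteria weights drives the selected schedule.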