Internet of Things, Networks and Security


      Fast Predictive Maintenance in Industrial Internet of
       Things (IIoT) with Deep Learning (DL): A Review

                Thomas Rieger¹, Stefanie Regier², Ingo Stengel², Nathan Clarke¹
          ¹ School of Computing and Mathematics, Plymouth University, United Kingdom
                     ² Karlsruhe University of Applied Sciences, Germany
                                thomas.rieger@plymouth.ac.uk


          Abstract: Applying Deep Learning (DL) in the Industrial Internet of Things
          (IIoT) is a very active research field. Predicting failures of machines and
          equipment in industrial environments before they occur is an equally popular
          topic, largely because of its cost-saving potential. Predictive Maintenance
          (PdM) applications can benefit from DL, especially because highly complex,
          non-linear and unlabeled (or partially labeled) data is the normal case. PdM
          applications in connected smart factories in particular depend on low-latency
          predictions, which makes real-time processing increasingly important.
          The aim of this paper is to provide a narrative review of the most current
          research covering trends and projects regarding the application of DL methods
          in IoT environments. Papers discussing predictions and real-time processing
          with DL models were selected in particular because of their potential use for
          PdM applications. The reviewed papers were selected by the authors on a
          qualitative rather than a quantitative basis.

          Keywords: Predictive Maintenance, Industrial Internet of Things, IIoT, Deep
          Learning, Real-time, Data Streams


 1        Introduction

 This paper provides an analysis of selected literature applying DL techniques and
 Artificial Neural Networks (ANN) in the field of industrial IoT (IIoT) to produce fast
 predictions as required, among others, in maintenance applications. PdM attempts to
 predict failures before they occur in order to avoid unscheduled outages of machines
 and plants. The aim is to avoid breakdowns through timely prediction while
 maximizing service life. The predictions are based on data comprising accumulated
 knowledge and current conditions.




   IIoT environments produce massive amounts of data. The need to analyse such data
brings the characterizing features of Big Data into play, namely the "5 Vs": volume,
variety, velocity, variability, and veracity [1]. The high volume and complexity of the
data place heavy demands on existing data processing techniques, and evolving data
streams and real-time requirements intensify these demands further [2]. Sensors
typically generate continuous streams of data; the term data stream refers to data that
is generated continuously, typically at a high rate [3]. In fully automated industrial
environments, obtaining information in real time and reacting immediately becomes
indispensable. Machine to Machine (M2M) communication is highly significant in IIoT
environments [4]. Intelligent sensors and devices that not only send data but also
communicate with their environment expect immediate responses. In such
environments, taking a snapshot of the entire data set and performing calculations with
unpredictable response times conflicts with the demand for real-time communication
and the presence of continuously flowing data streams [5]. Coping with these demands
requires self-adaptive algorithms that continuously learn and improve their models. In
addition, such algorithms should provide high performance and real-time behaviour,
not only when they run on powerful cloud systems but also on fog and edge systems
or IoT devices [6].

The methodological approach of this paper is a narrative review. The reviewed papers
were selected by the authors based on a qualitative rather than a quantitative level.
Papers covering the most current research on fast predictions in IIoT with DL were
given priority. Many papers cover the topic of DL in (I)IoT. To the best of our
knowledge, however, no paper in the literature covers the specific topic of PdM in
connection with DL and (I)IoT.

This review provides a classification of different DL approaches mentioned for use in
industry and IoT. It also covers the topics of real-time processing and data streams in
regard to the mentioned DL approaches. Techniques intended to improve the real-
time and stream processing ability of different approaches mentioned in the reviewed
papers are evaluated and classified. Special focus is set on the ability of the mentioned
approaches to provide predictions. The paper concludes with a summary and outlook
on future developments.


2      Deep Learning Approaches in Industrial Internet of
       Things

This section starts with a short introduction to DL and ANN. A classification of the
different DL methods mentioned for use in industry and IoT is then provided. The
methods are classified by their theoretical approach, their application areas, and their
strengths and weaknesses with regard to the demands of PdM in IIoT environments.
The reviewed papers cover DL methods in Cyber Physical Systems (CPS), IoT and
Industry 4.0 (I4.0), as well as real-time and data stream processing.

    DL can be defined as a subcategory of Machine Learning (ML), which in turn is a
 segment of the field of Artificial Intelligence (AI). DL itself is often defined as a class
 of optimized ANNs comprising numerous hidden layers. The high number of layers
 and neurons allows the abstraction of more complex problems and supports further
 characteristics such as unsupervised learning or automatic feature extraction [7].
 Examples are Deep Neural Networks (DNN), Deep Belief Networks (DBN) and
 Recurrent Neural Networks (RNN).

    The basic idea behind an ANN is to imitate the biological neural network in mam-
 malian brains. Components of an ANN are neurons (in ANNs often called nodes) and
 connections between those nodes. The nodes are organized in layers producing non-
 linear output data based on the input data. The connections between the nodes transfer
 the output of one node to the input of another node. Weights assigned to each connec-
 tion determine the relevance of the transferred signal. As in biological neural net-
 works the output signal of a neuron (node) is ruled by a threshold function. To set up
 an ANN all weights have to be set to an initial value (often just simple estimates). By
 training the network those weights are adjusted in a holistic way following a defined
 learning rate to achieve a valid and balanced network. This is also often referred to as
 “connections developing over time with training”. ANNs have been known for more than
 50 years, and many variants have been developed since [21], [22], [23].
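
    The building blocks described above (weighted connections, a threshold function,
 initial weights and training with a learning rate) can be illustrated with a minimal
 single-node sketch. It is not taken from any of the reviewed papers; the perceptron
 update rule, the zero initial weights, the learning rate of 0.5 and the logical-AND toy
 data are all assumptions chosen for illustration.

```python
import numpy as np

def step(z, threshold=0.0):
    """Threshold function: the node fires (1.0) only if its input exceeds the threshold."""
    return np.where(z > threshold, 1.0, 0.0)

def train_perceptron(inputs, targets, learning_rate=0.5, epochs=10):
    """Train a single node with the classic perceptron rule (an illustrative choice)."""
    weights = np.zeros(inputs.shape[1])  # simple initial estimates
    bias = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(inputs, targets):
            prediction = step(np.dot(weights, x_i) + bias)
            error = y_i - prediction
            # Each weight is adjusted in proportion to the error and the learning rate.
            weights += learning_rate * error * x_i
            bias += learning_rate * error
    return weights, bias

# Toy data (assumption): the logical AND of two binary inputs.
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 0, 0, 1], dtype=float)

weights, bias = train_perceptron(inputs, targets)
predictions = step(inputs @ weights + bias)
```

 A single node of this kind can only separate linearly separable classes, which is one
 reason why networks with several layers are used for more complex problems.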

    In [6] the following DL models are listed for IoT applications: Auto-encoder (AE),
 RNN, Restricted Boltzmann Machine (RBM), DBN, Long Short-term Memory
 (LSTM), Convolutional Neural Network (CNN), Variational Auto-encoder (VAE),
 Generative Adversarial Network (GAN) and Ladder Net. In [6] the DL models are
 categorized into three main groups: generative approaches (AE, RBM, DBN, VAE),
 discriminative approaches (RNN, LSTM, CNN) and hybrid approaches (GAN, Ladder
 Net), which combine the former two. This categorisation mainly refers to the
 underlying learning method: generative approaches basically follow the principle of
 unsupervised learning, whereas discriminative approaches follow the principle of
 supervised learning. Besides the required number of layers (complexity), the
 underlying learning method is a decisive factor in the selection of a DL approach. The
 categorization into generative and discriminative approaches chosen by [6] can be
 found in many other works. In [6] different DL models are also categorized by their
 suitability for IoT applications. The
 relevant characteristics mentioned in [6] are the ability to work with (partially) unla-
 belled data (feature extraction, feature discovery), the magnitude of needed training
 dataset, dimensionality reduction abilities, the ability to deal with noisy data and time
 series data and their general performance classification. For the reduction of high
 dimensional data and to cope with unlabelled data [6] recommends the combination
 of RNN with DBN and AE. If the system is meant to make predictions like in PdM
 systems, DBN and AEs are often used as an upfront layer providing classified data to
 a subsequent RNN [6].

   For spatio-temporal data such as mobility data, RNNs are recommended because
they show good results when data develops sequentially. If the data also comprises
long-term dependencies, however, plain RNNs are not a good choice because they do
not memorize previous states and results [8]. An approach to handling sequential data
streams from human mobility and transportation transition models containing
long-term dependencies (behaviours) is described in [8]. The described solution
combines RNN with LSTM in the form of a specialized RNN architecture. Besides the
ability to handle long-term dependencies, the LSTM also adds labelling and predictive
functionality to that combination. The combination of RNN with LSTM to cope with
data streams or time-series data comprising long-term dependencies (such as certain
behaviours or the wear and tear of machinery) can be found in many other works
[8], [9], [11], [18].

    The paper “IoT Data Analytics Using Deep Learning” [9] describes how to select
the right ANN to achieve predictions from data streams and time-series data. To
retrieve trends and predictions and, in parallel, validate them by anomaly detection, a
combination of LSTM with Naive Bayes models is proposed. The LSTM produces the
predictions on data streams, whereas the Naive Bayes model is responsible for anomaly
detection performed on the results of the LSTM.
    This paper also points out that simple Feedforward ANNs (FNN) such as the
Single-layer Perceptron (SLP) and Multi-layer Perceptron (MLP) using standard
backpropagation (BP) for training are often not a good choice because they do not
perform well in complex situations and on data streams with long-term dependencies.
This is especially true when data streams comprise time-series data and the aim of the
model is to predict future events or trends. Data streams and time-series data usually
have dependencies over time. Such dependencies are typical for IoT data and provide
relevant insights. In simple ANNs data moves straight through the layers under the
assumption that each input is independent of previous inputs and outputs. Because of
this, there is no way to remember previous input and output states (previous results),
which is a drawback if previous data is linked to current data. Using an RNN instead
can achieve better results on data streams and time-series data. Because the
connections between nodes in an RNN form sequences or loops, it is possible to
remember previous states. To avoid gradient explosions, normally only a few states are
remembered, so only short-term dependencies are recognized. For this reason [9]
recommends the application of LSTM in complex IoT environments to recognize
long-term dependencies in the data. LSTMs are a variant of RNN that introduces
memory units. These memory units are able to remember important previous states and
forget the unimportant ones [9].
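
    How a memory unit can keep or discard earlier states may be easier to follow with
a small sketch of one LSTM time step. The gate equations below are the commonly used
textbook formulation and are an assumption here, since [9] is not quoted in that detail;
the layer sizes, the random weights and the synthetic sensor stream are purely
illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, weights, bias):
    """One time step of a standard LSTM cell (textbook formulation, assumed here).

    weights has shape (4 * hidden, inputs + hidden); bias has shape (4 * hidden,).
    """
    hidden = h_prev.shape[0]
    z = weights @ np.concatenate([x, h_prev]) + bias
    forget_gate = sigmoid(z[0:hidden])               # how much old memory to keep
    input_gate = sigmoid(z[hidden:2 * hidden])       # how much new information to store
    output_gate = sigmoid(z[2 * hidden:3 * hidden])  # how much memory to expose
    candidate = np.tanh(z[3 * hidden:])              # proposed new memory content
    c = forget_gate * c_prev + input_gate * candidate
    h = output_gate * np.tanh(c)
    return h, c

# Illustrative sizes and random parameters (assumptions, not trained values).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
weights = rng.normal(scale=0.1, size=(4 * hidden_size, input_size + hidden_size))
bias = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
sensor_stream = rng.normal(size=(10, input_size))  # stand-in for a sensor data stream
for reading in sensor_stream:
    h, c = lstm_step(reading, h, c, weights, bias)
```

The cell state `c` is carried from reading to reading, so information can persist across
many time steps as long as the forget gate stays close to one.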




    To predict the behaviour of energy systems such as smart grids, [10] remarks that
 more intelligent systems are necessary to produce accurate predictions of future
 energy consumption. The paper “Deep learning for estimating building energy
 consumption” [10] states that ANN-based prediction methods are a promising approach
 because of their ability to handle massive and highly non-linear time-series data
 coming from heterogeneous data sources (e.g. smart meters) and containing a lot of
 uncertainty (unlabelled data). The authors of [10] benchmarked two variants of the
 RBM, namely the Conditional Restricted Boltzmann Machine (CRBM) and the Factored
 Conditional Restricted Boltzmann Machine (FCRBM), on a synthetic benchmark
 dataset. Based on this experiment the authors conclude that the FCRBM outperforms
 RNN, Support Vector Machine (SVM) and CRBM because of its added factored
 conditional history layer. An RBM is a stochastic ANN consisting of two layers, a
 visible layer and a hidden layer. In simple terms, the visible layer of an RBM contains
 a node for each possible value in the input data, whereas the hidden layer defines
 categories of values. Because each visible-layer node in an RBM is connected to every
 hidden-layer node, an RBM is good at feature classification, feature extraction and
 complexity reduction (by identifying the most important features). For DL, RBMs can
 be stacked. In [10] the RBM is extended by a conditional history layer (CRBM),
 enabling the RBM to detect long-term dependencies in time-series data. Additionally,
 the output of one stacked CRBM layer is factored (FCRBM) to reduce the number of
 possible compositions.
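
    The two-layer structure of a plain RBM can be sketched as follows. The sketch
 covers only the basic RBM with binary units, not the conditional or factored variants
 of [10]; the sigmoid activation probabilities are the standard formulation, while the
 layer sizes, random weights and input vector are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3

# Every visible node is connected to every hidden node: one weight per pair.
weights = rng.normal(scale=0.1, size=(n_visible, n_hidden))
visible_bias = np.zeros(n_visible)
hidden_bias = np.zeros(n_hidden)

def hidden_probabilities(visible):
    """Probability that each hidden unit is active, given the visible layer."""
    return sigmoid(visible @ weights + hidden_bias)

def visible_probabilities(hidden):
    """Probability that each visible unit is active, given the hidden layer."""
    return sigmoid(hidden @ weights.T + visible_bias)

visible = np.array([1, 0, 1, 1, 0, 0], dtype=float)  # illustrative binary input
p_hidden = hidden_probabilities(visible)
# Stochastic network: hidden states are sampled rather than computed deterministically.
hidden_sample = (rng.random(n_hidden) < p_hidden).astype(float)
reconstruction = visible_probabilities(hidden_sample)
```

 With fewer hidden than visible units, the hidden layer acts as a compressed feature
 representation of the input, which is the sense in which an RBM reduces complexity.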

    Another paper in the field of energy management also emphasizes the powerful
 forecasting abilities of DL. In [11] the application of AE and LSTM to predicting the
 power generation of solar systems is described. The accuracy reached by a
 combination of AE and LSTM (Auto-LSTM) is compared to other neural networks
 (namely MLP) as well as to a physical model. The benchmark data is taken from 21
 real solar power plants, and the benchmark follows the experimental setup described
 in [11]. The following measures are used: average root-mean-square deviation (RMSD),
 average mean absolute error (MAE), average absolute deviation (Abs. Dev.), average
 bias and average correlation. The measured results show that all ANN- and DL-based
 models perform far better than the physical model. Among all ANN- and DL-based
 models, Auto-LSTM is the best choice for this specific scenario and data set. The
 capability to extract features from unlabelled data is mentioned as a decisive factor in
 making predictions.
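
    Several of the error measures named above can be computed in a few lines. The
 sketch below uses the usual definitions of RMSD, MAE, bias and Pearson correlation;
 the sign convention for the bias (forecast minus measurement) and the sample values
 are assumptions, and the absolute deviation measure of [11] is omitted because its
 exact definition is not given here.

```python
import numpy as np

def forecast_errors(measured, forecast):
    """Common forecast error measures between measured and forecast values."""
    measured = np.asarray(measured, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    residual = forecast - measured  # sign convention is an assumption
    return {
        "rmsd": float(np.sqrt(np.mean(residual ** 2))),
        "mae": float(np.mean(np.abs(residual))),
        "bias": float(np.mean(residual)),
        "correlation": float(np.corrcoef(measured, forecast)[0, 1]),
    }

# Illustrative normalized power values (made up for this example).
measured = [0.0, 0.2, 0.5, 0.9, 0.6]
forecast = [0.1, 0.2, 0.4, 1.0, 0.5]
errors = forecast_errors(measured, forecast)
```

 A bias close to zero combined with a non-zero MAE, as in this toy example, indicates
 errors that cancel out on average rather than a systematic over- or underestimate.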

    The paper “An enhancement deep feature fusion method for rotating machinery
 fault diagnosis” [12] points out the strength of AEs in feature extraction and feature
 learning. The paper describes how to further improve the feature learning ability while
 reducing the influence of background noise by stacking a deep AE (noise reduction)
 and a contractive AE (enhanced feature recognition), an approach called the deep
 feature fusion method.




3      Fast Predictions using DL

In many IoT applications real-time processing is essential. In a PdM system, for
example, high latency could lead to unintentional reactive maintenance because the
lead time is insufficient to plan the maintenance tasks [5]. How fast real-time
processing needs to be depends strongly on the application. According to [13], in micro
manufacturing systems, where vast volumes of micro parts are manufactured at high
speed, real-time means microseconds. With systems for fault detection and PdM, the
rejection rate of the manufactured micro parts decreases as processing speed
increases [13]. In other scenarios, real-time can mean seconds, minutes or hours. In
PdM applications for offshore wind turbines, for example, data mostly becomes
available at intervals of minutes or hours [14].

   The paper “Metro Density Prediction with Recurrent Neural Network on Streaming
CDR Data” [15] describes the implementation of a real-time public transportation
crowd prediction system using a weight-sharing recurrent neural network in
combination with parallel streaming analytical programming. Fast responses to
emergent situations (based, e.g., on entrance records in metro stations combined with
telecommunication data) demand real-time analysis. The use of a powerful neural
network model with strong learning capability offers a wide range of new insights but
conflicts with the need for fast response times. In [15] this goal is met in three steps:
a) adapting an RNN model to improve its ability to work on data streams, b)
implementing strategies for the parallelization of RNNs and c) using parallel streaming
analytical algorithms on a cloud-based stream processing platform. In the project
described in [15] each metro station is modelled by an independent RNN. Shared
layers are introduced to dynamically share weights across the models of stations that
are in similar “situations” (e.g. a downtown station during rush hour). Weight-sharing
also enables co-training in parallel [15].

   The application of RNNs and their many variations for fast data analytics is also
recommended in [6]. Especially on typical sensor data like serial data, time-series
data and data streams, RNNs can provide better performance than other models. Such
sensor data dominates most PdM applications [1].

    Developing and permanently adapting models on massive data comprising the
behaviour of people and their spatial and temporal attributes together with
transportation capacities requires real-time processing and real-time learning
capabilities. The paper [8] describes a multi-task deep LSTM learning architecture. The
basic idea of this concept is not to use a joint feature vector but separate LSTM tasks
for each domain (e.g. one task for mobility prediction and one for transportation mode
prediction). This architecture learns in parallel, and the results are aggregated
depending on the intended insights [8].

   Assistance systems in cars, such as traffic sign recognition, must deliver accurate
results with low latency. The paper [16] describes how to apply DNN in this field. The
model of the system is continuously updated (online learning) and fed only with
completely unlabelled data (raw images). A CNN with 9 layers is used for image
recognition. To improve the performance of the system, max-pooling layers alternate
with convolutional layers. The convolutional layers perform convolution on 2D input
pixel maps. The max-pooling layer works like a pre-processor between two
convolutional layers, transforming the output of the preceding convolutional layer into
the input of the subsequent one by eliminating overlapping regions in the pixel maps.
This eliminates redundant processing in the complex and time-consuming convolutional
layers. The approach described in [16] is referred to as Multi-Column DNN (MCDNN).
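
    The data reduction performed by a max-pooling layer can be illustrated with a
 minimal sketch. It assumes non-overlapping 2 x 2 windows and a made-up 4 x 4 feature
 map; the window size used in [16] is not stated here, so both are illustrative choices.

```python
import numpy as np

def max_pool_2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    height, width = feature_map.shape
    out_h, out_w = height // size, width // size
    # Drop rows and columns that do not fill a complete window.
    trimmed = feature_map[:out_h * size, :out_w * size]
    windows = trimmed.reshape(out_h, size, out_w, size)
    return windows.max(axis=(1, 3))

# Illustrative output of a convolutional layer (values are made up).
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
], dtype=float)

pooled = max_pool_2d(feature_map)  # 2 x 2 map: [[4, 2], [2, 8]]
```

 The pooled map holds a quarter of the original values, so the subsequent
 convolutional layer has correspondingly less data to process.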

    The paper [17] describes a real-time oriented solution for traffic sign detection and
 recognition. The primary focus is on parallel processing, which is needed to detect
 several traffic signs at the same time. This approach also uses a CNN for image
 processing, in combination with AdaBoost and parallel GPU processing to improve
 performance.

    Because of their memory cells, LSTM models are a good choice if data comprises
 long-term dependencies. If the data structure allows single entities with their specific
 behaviour to be separated and groups of entities to be formed, each entity and each
 group can be processed by its own neural network. This opens up the possibility of
 processing the individual neural networks in parallel. Normally each of these networks
 passes its result to an aggregation layer that combines all outputs into an overall
 result. The paper “A Hierarchical Deep Temporal Model for Group Activity
 Recognition” [18] describes how to recognize situations in a volleyball match. One
 LSTM model per player predicts the behaviour of this player, remembering the player's
 previous behaviour in the match (long-term dependencies). Each situation in the match
 is then modelled as a group of players. The LSTMs are ordered hierarchically, with the
 LSTM models of all involved players subordinated to a scene. The scenes and the
 players' behaviour are extracted from images using a CNN [18].
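
    The pattern of one model per entity followed by an aggregation step can be
 sketched as follows. The per-entity "model" is a trivial stand-in for a trained network,
 the element-wise maximum is only one possible aggregation, and all names and values
 are hypothetical. A thread pool is used merely to show the structure; true parallel
 speed-up would depend on the runtime and hardware.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def entity_model(readings):
    """Stand-in for a per-entity network: returns a small feature vector."""
    readings = np.asarray(readings, dtype=float)
    return np.array([readings.mean(), readings[-1]])

def aggregate(outputs):
    """Stand-in aggregation layer: element-wise maximum over all entity outputs."""
    return np.max(np.stack(outputs), axis=0)

# Hypothetical per-entity input sequences (e.g. one sequence per player).
entities = {
    "player_1": [0.1, 0.4, 0.3],
    "player_2": [0.9, 0.8, 0.7],
    "player_3": [0.2, 0.2, 0.6],
}

# Each entity is processed independently, so the calls can run concurrently.
with ThreadPoolExecutor() as pool:
    outputs = list(pool.map(entity_model, entities.values()))

group_result = aggregate(outputs)
```

 Because the per-entity computations do not depend on each other, they can be
 distributed across workers, and only the aggregation step needs all results at once.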

    The paper [7] mentions that the organization of layers and connections has changed
 because of the demands of real-time processing. Fully connected networks, in which
 each node of a layer is connected to all nodes of the subsequent layer, can handle
 complex problems but also demand a lot of computing power. Dropping all connections
 that do not really influence the result (dropout) is a strategy to reduce the complexity
 of a DL network, and therefore its computing demand, without affecting accuracy in a
 relevant manner. Besides dropout, [7] also mentions max-pooling layers, batch
 normalization and transfer learning as additional strategies for performance
 optimization.

    Despite all the mentioned papers discussing performance enhancements and the
 real-time abilities of DL models, [19] observes that achieving the highest accuracy still
 takes precedence in almost all current DL projects. The paper “An Analysis of Deep
 Neural Network Models for practical Applications” [19] argues that numerous DL
 approaches described in the literature are simply not suitable for practical use, for
 example because of their long processing time or excessive power consumption. The
 paper calls for more attention to performance issues because they are key factors in
 practical DL applications. It compares 14 different DL models, such as AlexNet and
 GoogLeNet, in terms of accuracy, memory footprint, parameters, operations count,
 inference time, and power consumption. The paper shows that a small increase in
 accuracy leads to an enormous increase in computational power and computation time.
 It recommends defining a maximum energy consumption for each DL project and
 adjusting the accuracy accordingly [19].


4      Conclusions

In this paper we provided a narrative review of selected literature applying DL tech-
niques in the field of IIoT to produce fast predictions of maintenance issues. The pa-
pers have shown that the use of DL in IoT and PdM is a vital topic in industry. Many
different applications are in use in practice and are constantly being developed and
improved.

   Combinations of different DL models are frequently reported, uniting their
advantages and strengths in one application. The need for real-time processing of
complex data and data streams has also been demonstrated in certain application
scenarios, in particular in prediction applications such as PdM. To increase real-time
capability, concepts of parallel DL networks using a final aggregation layer, or
intermediate layers for the reduction of complexity, are frequently used. Although
much activity can be observed in the area of real-time processing of DL models, there
are also critical voices that question the exclusive focus on accuracy and call for a
greater focus on performance and lighter applications suitable for practical use.
Almost all reports agree that a lot of research is still needed in this area.

            Table 1 Summary of reviewed papers with the DL-Methods mentioned

Ref.            DL-Methods   Characteristics                              Typical applications

[6] Mohammadi   AE, CNN,     Feature extraction and dimensionality        Fault detection and
et al., 2018    DBN, GAN,      reduction of IoT data with AE, DBN           predictions in IoT
                LSTM, RBM,   CNN for image recognition, but needs a         environments
                RNN, VAE,      large training set                         Real-time and stream
                Ladder Net   GAN, VAE and Ladder Net suitable for noisy     processing with
                               data; used as classification layer for       different kinds of RNNs
                               RNN to enable unsupervised learning
                             LSTM provides good performance for data
                               with long-term dependencies
                             RBM for feature extraction, dimensionality
                               reduction and classification problems
                             RNN especially for time-series data

[8] Song        LSTM, RNN    LSTM for data containing long-term           IoT, transport, mobility
et al., 2016                   dependencies, time-series data and IoT
                               data streams; LSTM adds labelling and
                               predictive functionality in combination
                               with RNN
                             RNN good for sequential data and data
                               streams

[9] Xie         LSTM, RNN    LSTM and RNN suitable for time-series        Predictions because of
et al., 2017                   and IoT data streams                         long-term dependencies
                                                                            in data
                                                                          RNN for short-term IoT
                                                                            applications like
                                                                            condition monitoring

[10] Mocanu     RBM, CRBM,   RBM for feature extraction, dimensionality   Predictive IoT
et al., 2016    FCRBM          reduction, classification                    applications, e.g. for
                             CRBM extends RBM with long-term                smart cities or smart
                               predictions by adding a conditional          energy grids
                               history layer
                             FCRBM improves performance by reducing
                               the number of possible compositions of
                               each output layer in a stacked (C)RBM

[11] Gensler    DBN,         DBN performs well for predictions on         Predictive IoT
et al., 2016    Auto-LSTM      time-series data                             applications like power
                             Auto-LSTM (combination of AE and LSTM)         generation forecasts
                               for predictions on time-series data

[12] Shao       AE           Good for feature extraction, unsupervised    IoT applications like
et al., 2017                   learning, noise reduction and                fault diagnosis
                               compression (relevant feature
                               detection); often used as pre-processing
                               layer for complexity reduction;
                               short-term dependencies only, not good
                               for predictions

[15] Liang      RNN          Adapted RNNs used for data streams and       Applications running
et al., 2016                   weight-sharing, as well as co-training       parallel RNNs with
                               in parallel                                  shared layers
                                                                          Cloud-based stream
                                                                            processing

[16] Ciresan    CNN          Image recognition in real time in            Real-time and parallel
et al., 2012                   combination with max-pooling layers;         processing IoT
                               good for short-term dependencies, not        applications like
                               good for predictions                         traffic sign recognition

[17] Lim        CNN          Image recognition in real time in            Real-time and parallel
et al., 2017                   combination with max-pooling layers;         processing IoT
                               good for short-term dependencies, not        applications like
                               good for predictions                         traffic sign recognition

[18] Ibrahim    CNN, LSTM    CNN for image recognition                    Recognition of
et al., 2016                 LSTM for predictions considering               individuals and groups,
                               long-term dependencies; hierarchical         e.g. to determine
                               LSTM model for individual and group          current behaviour or
                               behaviours                                   dynamics




   Table 1 gives an overview of the reviewed papers and the DL methods they discuss.
For each paper, it summarises the characteristics (strengths and weaknesses) of those
methods and the application areas (such as predictions) recommended for them. Table 1
makes no quantitative statement about the validity of the results; the DL models are
categorised qualitatively only, because among all reviewed papers only [19] reports
concrete measured values. All other papers provide solely qualitative statements. How
to measure and evaluate the validity and quality of the results of different DL methods
is an open question [20]. So far, few approaches for measuring, evaluating and
benchmarking have been developed, and those that exist usually cannot be shown to be
generally valid. For classification, for instance, accuracy estimation techniques such
as the "holdout method" or "n-fold cross-validation" can be used to evaluate
performance, predictive ability and model accuracy [20]. These techniques divide a
data set, each in a different way, into portions for learning and for validation. For
most models, however, no concept for measuring, evaluating and benchmarking has yet
been defined, and evaluation generally relies on expert opinion [20]. The paper [20]
points out that there is a demand for improved measuring and benchmarking methods.
Proven measurement methods that generate representative benchmarks are needed in order
to assess DL models.
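
The two accuracy estimation techniques named above can be illustrated with a minimal Python sketch. It is not taken from [20] or from any reviewed paper: the function names, the 25 % validation share, the five folds and the placeholder "majority class" model are all assumptions chosen for illustration, and a real PdM study would substitute its own DL model and metric.

```python
import random
from collections import Counter


def holdout_split(n_samples, validation_fraction=0.25, seed=0):
    """Holdout method: one random split into learning and validation indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(round(n_samples * validation_fraction)))
    return indices[n_val:], indices[:n_val]


def n_fold_splits(n_samples, n_folds=5, seed=0):
    """n-fold cross-validation: every sample is used for validation exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, folds[k]


def majority_class_accuracy(labels, train_idx, val_idx):
    """Placeholder model (assumption): always predict the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    hits = sum(1 for i in val_idx if labels[i] == majority)
    return hits / len(val_idx)


def cross_validated_accuracy(labels, n_folds=5, seed=0):
    """Mean validation accuracy over all folds."""
    scores = [majority_class_accuracy(labels, train_idx, val_idx)
              for train_idx, val_idx in n_fold_splits(len(labels), n_folds, seed)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical machine-state labels: 80 normal readings, 20 faults.
    labels = ["ok"] * 80 + ["fault"] * 20
    train_idx, val_idx = holdout_split(len(labels))
    print("holdout accuracy:", majority_class_accuracy(labels, train_idx, val_idx))
    print("5-fold accuracy:", cross_validated_accuracy(labels))
```

The holdout estimate depends on a single random split, whereas the n-fold estimate averages over several splits and is therefore less sensitive to how the data happen to be divided, at the price of training the model once per fold.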

   The papers [1] to [5], [7], [13], [14] and [19] to [23] are not part of Table 1
because they serve only as references for basic statements and explanations made in
this paper. These papers were not on the topic of DL methods and techniques.


References
 1. Pusala, Murali, et al. 2016. Massive Data Analysis: Tasks, Tools, Applications and Chal-
    lenges. Big Data Analytics. s.l. : Springer Verlag, 2016.
 2. Zhang, Liangwei, et al. 2017. Sliding Window-Based Fault Detection From
    High-Dimensional Data Streams. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND
    CYBERNETICS. 2017, vol. 47, 2.
 3. Krawczyk, Bartosz and Wozniak, Michal. 2015. Data stream classification and big data
    analytics. Neurocomputing. 2015, 150.
 4. Ait-Alla, Abderrahim, et al. 2015. Real-time fault detection for advanced maintenance of
    sustainable technical systems. Procedia CIRP. 2015, 41.
 5. Bauer, Dennis, Stock, Daniel and Bauernhansl, Thomas. 2017. Movement towards
    service-orientation and app-orientation in manufacturing IT. 10th CIRP Conference on
    Intelligent Computation in Manufacturing Engineering - CIRP ICME '16. 2017.
 6. Mohammadi, Mehdi, et al. 2018. Deep Learning for IoT Big Data and Streaming Analyt-
    ics: A Survey, IEEE COMMUNICATIONS SURVEYS & TUTORIALS,
    arXiv:1712.04301v2
 7. Lee, L. N., et al. 2015. Risk Perceptions for Wearable Devices. Cornell University Library.
    [Online] 2015. http://arxiv.org/pdf/1504.05694.pdf.
 8. Song, Xuan, et al. 2016, DeepTransport: Prediction and Simulation of Human Mobility
    and Transportation Mode at a Citywide Level, Center for Spatial Information Science, The
    University of Tokyo, Japan




  9. Xie, Xiaofeng, et al. 2017, IoT Data Analytics Using Deep Learning, Key Laboratory for
     Embedded and Networking Computing of Hunan Province, Hunan University.
 10. Mocanu, Elena, et al. 2016, Deep learning for estimating building energy consumption,
     Department of Electrical Engineering, Eindhoven University of Technology, The Nether-
     lands
 11. Gensler, André, et al. 2016, Deep Learning for Solar Power Forecasting - An Approach
     Using Autoencoder and LSTM Neural Networks, 2016 IEEE International Conference on
     Systems, Man, and Cybernetics • SMC 2016 | October 9-12, 2016 • Budapest, Hungary
 12. Shao, Haidong, et al. 2017, An enhancement deep feature fusion method for rotating ma-
     chinery fault diagnosis, School of Aeronautics, Northwestern Polytechnical University,
     710072 Xi’an, China
 13. Rippel, Daniel, Lütjen, Michael and Freitag, Michael. 2015. SIMULATION OF
     MAINTENANCE ACTIVITIES FOR MICRO-MANUFACTURING SYSTEMS BY USE
     OF PREDICTIVE QUALITY CONTROL CHARTS. 2015.
 14. Freitag, Michael, et al. 2015. A Concept for the Dynamic Adjustment of Maintenance
     Intervals by Analysing Heterogeneous Data. Applied Mechanics and Materials. 794, 2015.
 15. Liang, Victor C., et al. 2016, Mercury: Metro Density Prediction with Recurrent Neural
     Network on Streaming CDR Data, ICDE 2016 Conference 978-1-5090-2020-1/16
 16. Ciresan, Dan, et al. 2012, Multi-Column Deep Neural Network for Traffic Sign Classifica-
     tion, IDSIA - USI - SUPSI | Galleria 2, Manno - Lugano 6928, Switzerland
 17. Lim, Kwangyong, et al. 2017, Real-time traffic sign recognition based on a general pur-
     pose GPU and deep-learning, Department of Computer Science, Yonsei University, 50
     Yonsei-ro Seodaemun-gu, Seoul, Republic of Korea
 18. Ibrahim, Mostafa S., et al. 2016, A Hierarchical Deep Temporal Model for Group Activity
     Recognition, School of Computing Science, Simon Fraser University, Burnaby, Canada
 19. Canziani, Alfredo, et al. 2016, AN ANALYSIS OF DEEP NEURAL NETWORK
     MODELS FOR PRACTICAL APPLICATIONS, Weldon School of Biomedical Engineer-
     ing Purdue University, Faculty of Mathematics, Informatics and Mechanics University of
     Warsaw, arXiv:1605.07678v4
 20. Krawczyk, Bartosz and Wozniak, Michal. 2015. Data stream classification and big data
     analytics. Neurocomputing. 2015, 150.
 21. Bhatia, Nidhi, et al. (2015), Deep Learning Techniques and its Various Algorithms and
     Techniques, International Journal of Engineering Innovation & Research, Volume 4, Issue
     5, ISSN: 2277 – 5668
 22. Chatfield, Ken, et al. (2014), Return of the Devil in the Details: Delving Deep into
     Convolutional Nets, Visual Geometry Group, Department of Engineering Science,
     University of Oxford, arXiv:1405.3531v4
 23. DEVCOONS Website, http://www.devcoons.com/literature-review-of-deep-machine-
     learning-for-feature-extraction/
   This paper was submitted to the Collaborative European Research Conference
 (CERC 2019) https://www.cerc-conference.eu/

