=Paper= {{Paper |id=Vol-2277/paper39 |storemode=property |title= Sensor Data Preprocessing, Feature Engineering and Equipment Remaining Lifetime Forecasting for Predictive Maintenance |pdfUrl=https://ceur-ws.org/Vol-2277/paper39.pdf |volume=Vol-2277 |authors=Evgeniy Latyshev |dblpUrl=https://dblp.org/rec/conf/rcdl/Latyshev18 }} == Sensor Data Preprocessing, Feature Engineering and Equipment Remaining Lifetime Forecasting for Predictive Maintenance == https://ceur-ws.org/Vol-2277/paper39.pdf
    Sensor Data Preprocessing, Feature Engineering and
  Equipment Remaining Lifetime Forecasting for Predictive
                      Maintenance
                                           © Evgeniy Latyshev
                                    Lomonosov Moscow State University,
                                             Moscow, Russia
                                          e.latishev@gmail.com
          Abstract. Analytics based on sensor data is gradually becoming an industry standard in equipment
    maintenance. However, it involves several challenges, such as sensor data preprocessing, feature engineering
    and forecasting model development. Due to work in progress, this paper is mainly focused on sensor data
    preprocessing, which plays a crucial role in predictive maintenance due to the fact, that real-world sensing
    equipment usually provides data with missing values and a considerable amount of noise. Obviously, poor
    data quality can render practically useless all the following steps of data analysis. Thus, many missing data
    imputation, outlier filtering, and noise reduction algorithms were introduced in the literature. Streaming
    sensor data can be represented in a form of univariate time series. This paper provides an overview of common
    univariate time series preprocessing steps and the most appropriate methods, with consideration of the field
    of application. Sensor data from different sources comes in different scales and should be normalized. Thus,
    the comparison of univariate time series normalization techniques is given. Conventional algorithm quality
    metrics for each of the preprocessing steps are described. Basic sensor data quality assessment approach is
    suggested. Moreover, the architecture of a sensor data preprocessing module is proposed. The overview of
    time series-specific feature engineering techniques is given. The brief enumeration of considered forecasting
    approaches is provided.
          Keywords: predictive maintenance, preprocessing, univariate time series, data cleaning, missing data
    imputation, noise reduction, outlier filtering, data quality assessment, feature engineering, time series
    forecasting


1 Introduction                                                     unacceptable amount of missing values, outliers, sudden
                                                                   spikes etc. Simply ignoring these issues can be critical
Maintenance costs are a major part of the total operating          due to several reasons. For example, some analysis tools,
costs of any business involving complex equipment.                 including popular machine learning algorithms, can’t
Conducted surveys of maintenance management                        handle missing values. The absence of outlier filtration
effectiveness indicate that one-third of all maintenance           can dramatically skew the results. Measuring equipment
costs is wasted as the result of unnecessary or improperly         standard error can be mistaken for an actual pattern in
carried out maintenance [16]. With the spread of Internet          data. As a result, time series preprocessing involves
of Things concept, sensor data can be collected from a             several independent steps: missing data imputation,
huge amount of devices and equipment. This data can be             noise reduction, and data normalization. After these
used for real-time health monitoring and effective                 steps, data can be evaluated in quality and passed further
maintenance. However, this approach to maintenance,                for analysis. It is clear, that preprocessing should be done
also known as predictive maintenance, involves several             in near real-time to minimize the delay between data
challenges.                                                        measurement and decision making. Thus, there is a need
    First of all, often collected data is of poor quality,         for a fast and scalable independent module, that can
which can lead to unreliable analysis and ineffective              preprocess constantly incoming sensor data. This paper
maintenance. Consequently, data from sensing                       proposes the design of such a module, keeping in mind
equipment needs to be preprocessed before it can be used           the following integration with the existing architecture of
for any analysis. Poor data quality means non-                     a predictive maintenance system, introduced in [14].
compliance with requirements on at least one of the data               Secondly, it can be difficult to distinguish the patterns
quality assessment metrics. The root of problems can               and relationships in the initial data. The process of
vary: connection issues, sensor malfunction, transmitting          extracting and generating new characteristics and
hardware failure, data processing server downtime,                 features out of the available data, commonly referred to
software crash, measuring equipment inaccuracy and                 as feature engineering, has two main objectives. The first
many more. Common cases of poor data quality involve               one is to represent the data in such a form, that will make
Proceedings of the XX International Conference                     it easier to establish simple yet strong connections
“Data Analytics and Management in Data                             between the input and the output variables for the
Intensive Domains” (DAMDID/RCDL’2018),                             forecasting model, increasing the quality of the forecasts.
Moscow, Russia, October 9-12, 2018


                                                             226
The second objective is to pick the most useful features            below in figure 1. The whole module consists of 4
out of all the available ones, reducing the amount of               transformation steps and the data quality assessment
computations of the forecasting model.                              step.
    Finally, a proper forecasting model is to be chosen
and implemented. There are various approaches to time
series forecasting, from straightforward methods like
naive method to way more sophisticated ones like long-
short term recurrent neural network. The main
complication here is the trade-off between the forecast
quality and the ease of model implementation and
deployment.
    The remaining part of the paper is organized as
follows. The preprocessing module architecture is
described in Section 2. Section 3 reviews missing data
imputation methods. Section 4 is devoted to time series
noise reduction. Section 5 briefly overviews data
normalization techniques. In section 6, some thoughts on
data quality assessment are combined. Section 7 is
devoted to time series feature engineering. The brief
overview of time series forecasting approaches is given
in section 8. Finally, the future directions of presented
work are given in section 9.

2 Preprocessing Module Architecture
The preprocessing module is a part of the system for
predictive maintenance, deployed to a Hadoop [29]
cluster in a cloud manner. The module is wrapped in
Docker [17] container and runs on a standalone node of
the cluster. One of the key requirements for the module
is the seamless integration into the architecture. The data
is retrieved from Apache Kafka message queue [2],
transformed by the preprocessing module and passed in
parallel to OpenTSDB [25] and Apache Hive [11] for
storage. To satisfy the requirements onto speed and
scalability the transformations are conducted onto
Apache Spark Streaming engine [27].
There are many stream data processing frameworks
including but not limited to Apache Storm [5], Apache
Flink [1], Apache Samza [4] and Kafka Streams [3].
Although Spark Streaming has latency issues and sliding             Figure 1 Components and data flow within the
window processing may be tricky due to Spark inherent               preprocessing module
batch-based streaming model, is has several advantages
which make Spark Streaming a safer choice.                          3 Missing Data Imputation
First of all, Spark Streaming is a mature framework with
thorough documentation and huge community. As a                     Sometimes due to a sensor malfunction, unstable internet
result of long-term popularity, there are plenty of open-           connection or other technical difficulties the data for
source tools for Spark Streaming, including solutions for           some points in time is missing. Simply ignoring those
relatively painless integration with Kafka and mentioned            gaps may be not the best strategy, because it can lead to
earlier database management systems [28, 11, 21].                   a loss of efficiency and unreliable results of the analysis.
Another advantage is the existence of pySpark [23] – an             Another approach is to try to impute the missing values
API for Python, one of the biggest programming                      based on the available information.
languages at this moment. All other enumerated
frameworks require Scala, Clojure or Java knowledge,                3.1 Methods
which makes them less accessible.                                   The detailed overview of basic imputation methods and
One of the biggest downsides of Spark Streaming is                  their implementations can be found in imputeTS R
performance degradation on sudden bursts of input data.             package documentation [19].
However, in case of sensor data processing the input data               Some simple methods that are applicable not only to
flow intensity remains nearly the same at all time                  time series: median imputation, mode imputation, mean
intervals, which mitigates the downside.                            imputation, random imputation. These methods are fast
    The data flow and module components are introduced              and very straight-forward, but lack accuracy.




                                                              227
    Simple time series specific methods include LOCF                                 Frequency domain approaches are based on signal
(last observation carried forward), NOCB (next                                   decomposition into frequency components. The most
observation carried backward), interpolation (linear,                            common approaches involve discrete/fast/short-time
polynomial, Stineman) and moving average (simple,                                Fourier transform either wavelet transform.
weighted, exponential). All of them are rather fast and                              Most of the time domain approaches are based on
can work in specific cases, but fall off when there is                           smoothing the signal of each given data point based on
seasonality in the data or large missing sub-sequences.                          the values of its neighbors.
    More sophisticated approaches like Structural Model                              The comparison of the basic noise reduction methods
& Kalman Smoothing, ARIMA State Space                                            can be found in the work of Köhler et al. [15]. The
Representation & Kalman Smoothing [10] can be used                               conducted experiment involves the comparison of
for seasonal data with complex patterns.                                         moving average filter, exponential smoothing filter,
    However, sensor data has one unfortunate                                     linear Fourier smoothing, nonlinear wavelet shrinkage
characteristic – the gaps of missing data can be too long                        and simple nonlinear noise reduction in different
for conventional methods to work properly. In this case,                         conditions.
the method proposed in [22] can be the appropriate                                   The downside of the approaches listed above is that
choice. The idea of the Dynamic Time Warping Based                               they modify almost all the data values, most of which are
Imputation is to find the most similar sub-sequence to the                       initially correct. Song et al. [26] proposed the first
sub-sequence before the missing values, then complete                            constraint-based approach for cleaning stream data. The
the gap by the next sub-sequence of the most similar one.                        idea is to sanity check the changes of values in time based
The result is a very plausible gap imputation with a                             on subject area constraints. This method allows to detect
drawback of a huge computational cost.                                           and repair large spike errors in data. The biggest
                                                                                 advantage of this method is the support of online
3.2 Metrics                                                                      cleaning over streaming data.
Missing data imputation involves 2 types of quality                                  However, this method can be used only for large
metrics based on the pattern of imputation.                                      outlier detection. In some cases, even small errors can be
   For single value imputations, the metrics coincide                            important and repairing only spike errors is insufficient.
with the ones commonly used in time series forecasting                           Zhang et al. [30] proposed a novel statistical-based
– RMSE (Root Mean Square Error)and MAPE (Mean                                    cleaning by introducing the repairment likelihoods with
Absolute Percentage Error).                                                      respect to speed changes. Several effective and
                                                                                 computationally efficient heuristics are also introduced
                                              ∑ (𝑦𝑦^𝚤𝚤 −𝑦𝑦𝑖𝑖 )2
                             𝑅𝑅𝑅𝑅𝑅𝑅𝑅𝑅 = � 𝑖𝑖                      ,              in this work.
                                                     𝑛𝑛
                                       100%               𝑦𝑦 −𝑦𝑦^𝚤𝚤
                          𝑀𝑀𝑀𝑀𝑀𝑀𝑀𝑀 =          × ∑𝑖𝑖 � 𝑖𝑖              �,         4.2 Metrics
                                        𝑛𝑛                   𝑦𝑦𝑖𝑖
    where yi is real value, ŷi is the forecasted value and n                     Most of the papers involve RMSE, defined earlier, as a
is the number of forecasts.                                                      denoising quality metric. However, there are several less
    However, different metrics are used for long gap                             popular ones, including the Symmetrical Visual Error
imputation. The most popular of them are similarity and                          Measure, proposed in [18].
Dynamic Time Warping distance.

                          1             1                                        5 Data Normalization
 𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆 =      × ��                              �
                          𝑛𝑛         |𝑦𝑦𝑖𝑖 − 𝑦𝑦^𝚤𝚤 |                                 Making sure that your data is of uniform scale is key
                      𝑖𝑖 �1 +                                 �
                              𝑚𝑚𝑚𝑚𝑚𝑚(𝑦𝑦^𝚤𝚤 ) − 𝑚𝑚𝑚𝑚𝑚𝑚(𝑦𝑦^𝚤𝚤 )                    for many methods, including k-NN, linear models,
   DTW calculation algorithm can be found at [19]. It is                         artificial neural networks and many more. Even
worth mentioning, that modern implementations often                              univariate time series data should be normalized because
have adjustments to speed up the calculations (for                               it might be further used in combination with data of
example, DDTW [13]).                                                             different scale from other sources.
                                                                                     The most well-known and widely used are min-max
4 Noise Reduction                                                                normalization and z-score normalization. Min-max
                                                                                 implies that you know the minimum and the maximum
    Similar to missing data points, sensor data is usually                       values in your dataset beforehand, which is often not the
contaminated with noise, which can be mistaken for                               case. Z-score is more robust but performs poorly om non-
actual data pattern, which yet again leads to a loss of                          stationary time series.
efficiency and unreliable results of the analysis. The task                                                        𝑦𝑦−𝑚𝑚𝑚𝑚𝑚𝑚(𝑌𝑌)
                                                                                                   ŷ𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚 =               ,
of noise reduction is to subtract the maximum amount of                                                           𝑚𝑚𝑚𝑚𝑚𝑚(𝑌𝑌)−𝑚𝑚𝑚𝑚𝑚𝑚(𝑌𝑌)
                                                                                                                    𝑦𝑦−𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚(𝑌𝑌)
noise from the initial data, leaving the maximum amount                                             ŷ𝑧𝑧−𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 =                 ,
                                                                                                                        𝑠𝑠𝑠𝑠𝑠𝑠(𝑌𝑌)
of useful signal.
                                                                                     where ŷ is a value after normalization, y is a value
4.1 Methods                                                                      prior normalization and Y is the set of values being
                                                                                 normalized.
   According to Chen et al. [6] noise reduction methods                              Some less popular methods are decimal scaling
can be divided into 2 categories: frequency domain                               normalization, which holds all the drawbacks of min-
approaches and time domain approaches.




                                                                           228
max normalization, sigmoid normalization, which is                          It is worth mentioning, that there are automatic time
actively used in neural networks and tanh estimators,                   series feature engineering tools such as tsfresh [7], which
which roughly can be described as a hyperbolic tangent                  achieve decent results with almost no effort required.
of the z-score normalization.
                                         𝑦𝑦
                       ŷ𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑𝑑 = 𝑑𝑑,                            7.1 Timestamp Features
                                        10
    where d is the order of values in the set.                          The idea of this approach is to extract the features from
                                         1                              the timestamp of each observation. The most commonly
                      ŷ𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠𝑠 =     𝑦𝑦 ,
                                          1+𝑒𝑒
                                   0.01∗�𝑦𝑦−𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚(𝑌𝑌)�
                                                                        used features are:
     ŷ𝑡𝑡𝑡𝑡𝑡𝑡ℎ = 0.5 ∗ �𝑡𝑡𝑡𝑡𝑡𝑡ℎ �                        � + 1�.          • minutes elapsed for the day;
                                          𝑠𝑠𝑠𝑠𝑠𝑠(𝑌𝑌)
    According to the experiment, conducted in [20], there                • hour of the day;
is no optimal time series normalization method and one                   • day of the month;
should choose the appropriate method based on the data                   • weekend or not;
patterns. Regarding sensor data, the mean and standard                   • season of the year;
deviation remain approximately the same throughout                       • public holiday or not.
time, which makes z-score normalization a reasonable                        Talking about sensor data, some examples of useful
choice.                                                                 timestamp features are:
                                                                         • time since the last maintenance;
6 Data Quality Assessment                                                • age of equipment;
                                                                         • time since the last failure;
Data Quality Assessment (DQA) is the scientific and
                                                                         • operating time of equipment.
statistical evaluation of data to determine if data obtained
                                                                            Using just these features alone for predictions will
from environmental data operations are of the right type,
                                                                        likely result in a poor model. However, in combination
quality, and quantity to support their intended use [8].
                                                                        with other features, they can boost the quality of
    There is a comprehensive work on time series data
                                                                        forecasts.
quality assessment done in [9], which shows that there
are dozens of different metrics that can be used to                     7.2 Statistical Features
measure the quality of data. Obviously, using all of them
is excessive and computationally inefficient, so only a                 This approach involves sliding through a time series with
few are to be chosen. However, there is no common view                  the window of a given width and calculating statistics for
on which metrics are better. The simple yet effective                   each iteration. The most common statistical features are
strategy might be to look onto the most popular ones:                   the mean of the previous few values, the median, the
 • event data loss (gaps in the data);                                  mode, the minimal value, the maximum value, the
 • values out of range (values out of sane interval for                 standard deviation and many more. In addition to
     the domain);                                                       calculated statistics, we can also use the lagged values of
 • value spikes (improbable sudden changes);                            a time series as features.
 • wrong timestamps;                                                        The biggest challenge of this approach is that the
                                                                        window can be of any width and there is no general
 • rounded measurement value (not desirable level of
                                                                        algorithm to choose it. Usually, the researchers just try
     detail);
                                                                        out several values of the width and choose the one that
 • signal noise (slightly inaccurate measurements).
                                                                        performs best. However, if there is a seasonal pattern in
    The assessment is to be done for both data prior and
                                                                        data, it is worth making the width of the window not less
after preprocessing to acquire an evaluation of
                                                                        than the period of the seasons.
preprocessing module effectiveness. It is also worth
keeping in mind, that initially clean data is different to              7.3 Spectral Features
the data, that was made “clean” during preprocessing due
to approximations and inevitable errors of the methods                  Different variations of Fourier transform and wavelet
involved on each step.                                                  transform are used to extract spectral features from a
                                                                        nonstationary signal. The basic idea behind those
                                                                        methods is to decompose a given time series into a sum
7 Feature Engineering
                                                                        of several basic functions, providing a different
Feature engineering is, probably, the most peculiar step                representation of the initial signal. The biggest drawback
of data processing, as it depends on the initial data type,             of those methods is that they are relatively
its origin, quantity, quality, the desired output of the                computationally expensive.
forecasting model and even the nature of the model itself.
As it was already mentioned, sensor data can be                         7.4 Dimensionality Reduction
represented in a form of univariate time series. The                    Feature extraction provides many features, some of
conventional approaches to time series feature                          which can be useless or strongly correlated with each
engineering can be divided into 3 categories: timestamp                 other. Excessive features not only add unnecessary
features, statistical features, and spectral features. The              computations but also can decrease the quality of the
feature extraction step is usually followed by a                        model. Thus, several dimensionality reduction methods
dimensionality reduction step.                                          were introduced to minimize the number of features, at




                                                                  229
the same time keeping the maximum amount of                       References
information. The most common ones are principal
component analysis, independent component analysis                 [1] Apache Flink. https://flink.apache.org/
and partial least squares regression.                              [2] Apache Kafka. https://kafka.apache.org/
                                                                   [3] Apache       Kafka       Streams      Documentation.
8 Remaining Lifetime Forecasting                                       https://kafka.apache.org/documentation/streams/
                                                                   [4] Apache Samza. http://samza.apache.org/
There is a variety of methods, that can be used for time
series forecasting. Each of them has advantages and                [5] Apache Storm. https://storm.apache.org/
drawbacks and can be viable in certain circumstances.              [6] Chen, Mithal, Vangala, Brugere, Boriah, Kumar: A
    First of all, there are some basic methods such as                 study of time series noise reduction techniques in the
average method, naive method, seasonal naive method,                   context of land cover change detection. NASA
and drift method, which are very simple yet can be                     Conference on Intelligent Data Understanding
effective when the data pattern is easy.                               (2011)
    Secondly, linear regression models can be used for             [7] Christ M., Kempa-Liehrb A., Feindt M.: Distributed
forecasting. In the simplest case, the regression model                and Parallel Time Series Feature Extraction for
allows for a linear relationship between the forecast                  Industrial Big Data Applications. ACML Workshop
variable and some predictor variables. The biggest                     on Learning on Big Data (2016)
downside is inherent linearity, while real-world data is           [8] Gitzel R.: Data Quality in Time Series Data An
mostly non-linear.                                                     Experience Report. CBI Industrial Track (2016)
    The most common approach is to use stochastic                  [9] Guidance for Data Quality Assessment: Practical
models – ARMA, ARIMA, SARIMAX, etc. One of the                         Methods for Data Analysis. EPA (2000)
biggest drawbacks of those models is that they require
fine-tuning of several hyperparameters, which is                   [10]Harvey A.: Forecasting, structural time series
computationally expensive and not intuitive.                           models and the Kalman filter. Cambridge university
    One of the recently popular approaches is to use                   press (1990)
decision trees. Random forests and gradient boosting               [11]Hive        on       Spark:      Getting     Started.
methods, which are so widely used in machine learning                  https://cwiki.apache.org/confluence/display/Hive/H
competitions, can also be used for time series                         ive+on+Spark%3A+Getting+Started
forecasting.                                                       [12]Huai Y., Chauhan A., Gates A., Hagleitner G.,
    Artificial neural networks approach for time series                Hanson E.N., O’Malley O., Pandey J., Yuan Y., Lee
forecasting gained immense popularity in last few years.               R., Zhang X.: Major technical advancements in
Their modifications – recurrent neural networks (RNNs)                 Apache Hive. ACM SIGMOD international
and long short-term memory networks (LSTMs) are                        conference on management of data (2014)
especially effective for this task due to their “memory”           [13]Keogh, E., Pazzani, M.: Derivative Dynamic Time
component. Although neural networks tend to be the most                Warping. First SIAM International Conference on
accurate forecasting method when tuned properly and given              Data Mining (2001)
enough data, they might be computationally too costly for          [14]Kovalev D., Shanin I., Stupnikov S., Zakharov V.:
the considered conditions.                                             Data Mining Methods and Techniques for Fault
                                                                       Detection and Predictive Maintenance in Housing
9 Conclusion                                                           and Utility Infrastructure. Engineering Technologies
                                                                       and Computer Science (2018)
In this study, the overview of sensor data preprocessing
steps, methods, and common metrics is held. Some                  [15]Köhler, Torsten, Lorenz: A comparison of denoising
thoughts on sensor data quality assessment are given.                  methods for one dimensional time series. Zentrum
The architecture of a fast, scalable preprocessing module              für Technomathematik (2005)
is proposed. A brief overview of time series feature              [16]Marron J. S., Tsybakov A. B.: Visual error criteria
engineering techniques and forecasting methods is given.               for qualitative smoothing. Journal of the American
The future goals of the ongoing work are to implement                  Statistical Association (1995)
the designed preprocessing module on Spark Streaming              [17]Merkel D.: Docker: Lightweight Linux Containers
engine, integrate it into the existing predictive                      for Consistent Development and Deployment. Linux
maintenance pipeline, implement the feature engineering                J., vol. 2014 (2014)
step and to develop a remaining lifetime forecasting              [18]Mobley K.: An Introduction to Predictive
model.                                                                 Maintenance — 2nd edition (2002)
Acknowledgments. This work is supervised by Dmitriy               [19]Moritz S., Sardá A., Bartz-Beielstein T., Zaefferer
Kovalev, Institute of Informatics Problems, Federal                    M., Stork J: Comparison of different Methods for
Research Center “Computer Science and Control” of the                  Univariate Time Series Imputation in R. CoRR
Russian Academy of Sciences.                                           abs/1510.03924 (2015)
The research is financially supported by Ministry of
Education and Science of the Russian Federation                   [20]Nayak S., Misra B., Behera H.: Impact of Data
(project’s unique identifier RFMEFI60717X0176).                        Normalization on Stock Index Forecasting.
                                                                       International Journal of Computer Information




                                                            230
    Systems and Industrial Management Applications              [26]Song S., Zhang A., Wang J., Yu P.: SCREEN:
    (2014)                                                          Stream Data Cleaning under Speed Constraints.
[21]OpenTSDB 2.3 documentation | HTTP API.                          ACM SIGMOD international conference on
    http://opentsdb.net/docs/build/html/api_http/put.ht             management of data (2015)
    ml                                                          [27]Spark       Streaming      Programming        Guide.
[22]Phan T., Poisson Caillault E., Lefebvre A., Bigand              https://spark.apache.org/docs/latest/streaming-
    A.: Dynamic time warping-based imputation for                   programming-guide.html
    univariate time series data. Pattern Recognition            [28]Spark Streaming + Kafka Integration Guide.
    Letters (2017)                                                  https://spark.apache.org/docs/2.2.0/streaming-
[23]pySpark           Package           Documentation.              kafka-integration.html
    http://spark.apache.org/docs/2.1.0/api/python/pyspa         [29]White T.: Hadoop: The Definitive Guide. O'Reilly
    rk.html                                                         Media; Forth Edition (2012)
[24]Sakoe, Chiba: Dynamic Programming Algorithm                 [30]Zhang A., Song S., Wang J.: Sequential Data
    Optimization for Spoken Word Recognition. IEEE                  Cleaning: A Statistical Approach. ACM SIGMOD
    Transactions on Acoustics, Speech and Signal                    international conference on management of data
    Processing (1978)                                               (2016)
[25]Sigoure B.: OpenTSDB: The distributed, scalable
    time series database. OSCON, vol. 11 (2010)




                                                          231