<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Information Technology and Interactions, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Models for Anomaly Detection in an Industrial Transporting System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kyrylo Kadomskyi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Taras Shevchenko National University of Kyiv</institution>
          ,
          <addr-line>64/13, Volodymyrska st., Kyiv, 01601</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>0</volume>
      <fpage>2</fpage>
      <lpage>03</lpage>
      <abstract>
        <p>Cyber-Physical Production Systems (CPPS) require robust techniques for detecting anomalies and their root causes. Model-based diagnosis is a commonly used approach in which a dynamic process model captures spatio-temporal features of the system's behavior. Because precise mathematical or expert modeling is infeasible, algorithms have been developed for learning such models from system observations. These algorithms are highly domain-specialized and yield relatively poor performance in other use cases. In this paper CPPS data is used on which existing models have proven ineffective, and the prospect of applying a deep learning approach to constructing a process model in such systems is investigated. The main idea is to move from models with a fixed structure to more universal techniques that learn the optimal structure from the data. The challenges of evaluating dynamic system models of this class are identified, and evaluation criteria are proposed for representative comparison and benchmarking of the models. It is shown that deep learning models provide an increase in anomaly detection score but require additional verification of model robustness.</p>
      </abstract>
      <kwd-group>
        <kwd>anomaly detection</kwd>
        <kwd>autoencoder</kwd>
        <kwd>model evaluation</kwd>
        <kwd>cyber-physical production systems</kwd>
        <kwd>industrial IoT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Motivation</title>
      <p>Industrial AI is an emergent research field that is actively revolutionizing production plants.
Increasing product variety, product complexity and pressure for efficiency lead to systems that
contain a growing set of sensors to facilitate automation [1]. In this context diagnosis of complex
production processes has gained new attention due to research agendas such as Cyber-Physical
Production Systems (CPPS) [2, 3] and the initiatives of the Industrial Internet of Things (IIoT) and Industrie
4.0. In these agendas the most important goals of self-diagnosis are the identification of anomalous
system behavior, suboptimal energy consumption, or wear in CPPS [4, 5].</p>
      <p>The most accepted method is model-based diagnosis [4], where the features of normal and
anomalous system behavior are captured by a process model. Modern CPPS are adaptable and
changeable, which makes both precise mathematical modelling and manual expert modelling costly
and ineffective [6]. Thus, to build the model, the process features must be extracted from sensory
measurements. As the process is often highly dynamic and variable, the most informative features are
spatio-temporal and include sequential events, timing and duration of specific process stages, or the
boundaries on observed values specific to each given stage.</p>
      <p>To achieve this, novel dynamic modelling techniques are being developed [3, 4, 7, 8] and are
currently replacing traditional methods, such as Statistical Process Control (SPC) and Bayesian
inference with time dependency. While showing good results in certain applications, these models yield
relatively poor performance in other similar use cases [9, 7, 8].</p>
      <p>2020 Copyright for this paper by its authors.</p>
      <p>The hypothesis is that this effect is due
to the limited nature and fixed structure of the spatio-temporal features learned by the model, which are
imposed by the structure of the model itself. The informativeness of the learned features then varies across
different physical systems, which can explain the observed effect.</p>
      <p>In this study Deep Learning (DL) models, such as autoencoders [10], are applied to remove the
mentioned limitation by automatically selecting the most relevant features and structure to represent
the data. These models are evaluated on a dataset that has proven challenging for novel
dynamic models, aiming for accurate benchmarking of the two approaches. This in turn
makes it possible to assess the limits of model-based anomaly detection in the given class of CPPS.</p>
      <p>As results of traditional evaluation techniques in CPPS applications may not be representative [9],
the challenges of evaluating dynamic system models in CPPS are identified by analyzing data
collected from DL models, and robustness criteria are proposed to increase evaluation
representativeness.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The system and the data</title>
      <p>Currently several projects are aimed at utilizing new technical possibilities to meet the challenges
of Industrial IoT and Industrie 4.0. Under the European Union’s Horizon 2020 research project
IMPROVE [11] a number of experiments in industrial systems were made, and environments were
designed specifically to test novel methods for self-diagnosis (including monitoring, anomaly
detection) and self-optimization [12]. The High Rack Storage System or HRSS is a demonstrator
system built in SmartFactoryOWL in Lemgo, Germany. The system transports pallets between its
different shelves, as shown in Figure 1.</p>
      <p>Measurements of position, power and voltage are made at each of the system’s drives during full
transporting cycles. Anomalies in this system include shortening of cycles, pauses, abnormal timing,
duration, or sequence of different process stages, as well as increase or decrease in one or multiple
signals at certain stages. The task is to detect HRSS anomalies and to localize them with time-step
precision by constructing the model of normal system behavior in an unsupervised manner.</p>
      <p>A time series dataset [13] was collected in this system under IMPROVE project and is being
actively used to test novel approaches to anomaly detection [9, 14]. The data contains 18 real-valued
signals sampled 15–20 times per second. It includes time series of 106 normal cycles (25,907
observations) and 111 cycles containing labelled anomalies (23,645 observations). The dataset is
unbalanced, with 76.0% negative examples. Statistical distributions of the classes (i.e. normal and
anomalous measurements) are not distinguishable in feature space, which excludes the direct application of
traditional Machine Learning (ML) methods for anomaly detection (e.g. linear models, decision trees,
SVM, etc.). At the same time, PCA analysis shows that the first 10 principal components cover 98.1% of
the data variation, so linear dimensionality reduction techniques can be useful. Data quality issues that
may affect model performance include high noisiness, strong outliers, and differences in feature ranges
of several orders of magnitude.</p>
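      <p>For illustration, an explained-variance figure of this kind can be reproduced from the singular values of the centered data matrix. The sketch below uses NumPy on synthetic low-rank data standing in for the 18 HRSS signals; the function name and the synthetic data are assumptions, not part of the dataset.</p>

```python
import numpy as np

def explained_variance_ratio(X: np.ndarray) -> np.ndarray:
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values
    var = s ** 2                               # unnormalized component variances
    return var / var.sum()

# synthetic stand-in: 18 signals driven by 3 latent factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 3))
X = latent @ rng.normal(size=(3, 18)) + 0.01 * rng.normal(size=(1000, 18))

ratio = explained_variance_ratio(X)
# for low-rank data the leading components carry nearly all the variance
```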
    </sec>
    <sec id="sec-3">
      <title>3. Background research</title>
      <p>As the statistical separation of classes is not possible in this task, constructing a model from
process measurements involves learning spatio-temporal patterns and events, which are typically
characterized by timing and duration of different process stages.</p>
      <p>To address this goal the use of dynamic process models such as Hybrid Timed Automata (HTA)
has been proposed [9]. To apply a discrete state HTA model to continuous process measurements the
unsupervised data preprocessing with self-organizing maps (SOM) and watershed transformations
were utilized. This method detects anomalies with timestep precision. Yet, having proven effective in
other CPPS applications [7, 8], it yields low performance on HRSS data with 30.76% F1 score and
26.7% recall (1516 true positives).</p>
      <p>In another study Deep Learning architectures were applied to the same data [14]: a Siamese
LSTM model was used for binary classification of full process cycles into ‘normal’ and ‘anomalous’
classes. Targeting a minimal false-positive score, this model yields 25.6% F1 measure, 88.2% precision,
and 15.0% recall, while being unable to localize anomalies within a cycle.</p>
      <p>In both studies anomaly detection rates are low compared to other CPPS applications; thus
learning a model from the process measurements in the HRSS plant remains a challenging task. To
address this task, the features of the HRSS system that explain the observed drop in efficiency must be identified.
As the results of the two studies are not directly comparable, the prospect of applying DL models
in this class of CPPS also remains an open question. Answering it requires strict evaluation of DL
models, as well as assessment of the effect of architectural variations. As the representativeness of
evaluation results remains unknown [9], additional measures must be developed to assess model
robustness.</p>
    </sec>
    <sec id="sec-4">
      <title>4. The method</title>
      <p>In this study a set of autoencoder architectures is applied to the task of anomaly detection [10] in
a setup shown in Figure 2. The DL model, i.e. the autoencoder, is trained in an unsupervised manner to
reconstruct normal time series, targeting minimal reconstruction loss. Then the trained model is used
to reconstruct unseen time series with anomalies, where the reconstruction error is expected to peak at
anomalous intervals. To evaluate the model, the distributions of reconstruction error in normal and
anomalous intervals are analyzed for being statistically distinguishable. Finally, from the error
distributions a decision-rule classifier for anomaly detection is built in a supervised mode.</p>
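      <p>The decision step of this setup can be sketched as follows: per-timestep reconstruction error is computed between the input and the model's reconstruction, and timesteps whose error exceeds a threshold are flagged. The cycle, the injected anomaly, and the threshold value below are synthetic illustrations, not taken from the paper.</p>

```python
import numpy as np

def anomaly_flags(x: np.ndarray, x_hat: np.ndarray, threshold: float) -> np.ndarray:
    """Per-timestep anomaly decisions from reconstruction error.

    x and x_hat have shape (timesteps, features); returns shape (timesteps,).
    """
    err = np.abs(x - x_hat).mean(axis=1)   # MAE per timestep
    return err > threshold

# a perfectly reconstructed normal cycle with an injected amplitude anomaly
t = np.linspace(0, 2 * np.pi, 300)
x_hat = np.stack([np.sin(t), np.cos(t)], axis=1)   # model reconstruction
x = x_hat.copy()
x[100:120] += 1.5                                  # type-1 (amplitude) anomaly
flags = anomaly_flags(x, x_hat, threshold=0.5)
print(flags[100:120].all(), flags[:100].any())     # → True False
```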
      <p>The processing pipeline shown in Figure 2 comprises the stages: measurements (time series); preprocessing and feature engineering; features; autoencoder; distance measure; decision tree; reconstructed time series; anomaly prediction.</p>
      <sec id="sec-4-5">
        <title>Anomaly prediction</title>
        <p>This method detects anomalies with time-step precision, and most of the evaluated models can be
applied in real time. The models were
trained and evaluated in a setup allowing for direct benchmarking against the background research. For
the results to be representative, the models' robustness must be assessed. From the analysis of evaluation
results, two challenges were identified that must be met to achieve model robustness and
representativeness of evaluation.</p>
        <p>1. One distinct feature of the HRSS plant is low process variation in normal conditions, with 12.6%
mean absolute deviation from the averaged process cycle. Under such conditions an
autoencoder model can reach a local minimum of reconstruction error without reconstructing
individual features of distinct cycles (i.e. different process runs). In this case the model's output is
close to the average training cycle, with reconstruction loss close to the normal level. Such a model
performs well on HRSS data, where process variation is low, but it will not be useful in most
CPPS applications, where process variation is higher.</p>
        <p>2. The presence of anomalies may affect the model's performance in reconstructing neighboring
normal intervals. This is expected behavior in models with internal time dependency, which
are used in this study. In this case the model's robustness is limited by the type and the length of
anomalies, which typically are not known at training time.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4.1. Robustness criteria</title>
      <p>To address the mentioned challenges, two robustness criteria are proposed for representative model
evaluation.</p>
      <p>RC1. Reconstructed variation rate is calculated in unsupervised mode using the training set of
normal process cycles, by comparing the step-wise standard deviation of the reconstructed signal
to the standard deviation of the model input:</p>
      <p>RC1 = σ(x̂) / σ(x).</p>
      <p>RC2. Reconstruction sensitivity to anomalies is assessed in supervised mode on the set of
anomalous cycles (i.e. the evaluation set) as the correlation between the error of reconstructing normal
intervals and the strength of anomalies in the same process cycle or time window:</p>
      <p>RC2 = corr(|x̂ − x| over normal intervals, s),</p>
      <p>where the anomaly strength measure s is domain-specific and accounts for the type, time length and
magnitude of the anomaly.</p>
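      <p>Under one possible reading of the criteria, RC1 and RC2 can be computed as below; the array shapes and the per-cycle anomaly-strength measure are assumptions made for illustration.</p>

```python
import numpy as np

def rc1(inputs: np.ndarray, recon: np.ndarray) -> float:
    """Reconstructed variation rate: step-wise standard deviation of the
    reconstructions relative to that of the inputs, averaged over
    timesteps and features. Arrays: (cycles, timesteps, features)."""
    return float(recon.std(axis=0).mean() / inputs.std(axis=0).mean())

def rc2(normal_errors: np.ndarray, anomaly_strengths: np.ndarray) -> float:
    """Correlation between reconstruction error on normal intervals and
    anomaly strength, one value per anomalous cycle."""
    return float(np.corrcoef(normal_errors, anomaly_strengths)[0, 1])

# a degenerate model that outputs the average training cycle gets RC1 = 0,
# since its reconstructions do not vary between cycles
rng = np.random.default_rng(1)
cycles = rng.normal(size=(10, 50, 4))
recon_avg = np.broadcast_to(cycles.mean(axis=0), cycles.shape)
```

A robust model should keep RC1 close to 1 while keeping RC2 low, i.e. its error on normal intervals should not track the anomalies elsewhere in the cycle.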
      <p>In the HRSS plant two distinct types of anomalies are present.</p>
      <p>Type 1: amplitude deviations from the normal signal.</p>
      <p>Type 2: deviations in timing, duration, or sequence of process stages.</p>
      <p>In practice, anomalous cycle duration and long-term type-2 anomalies have a noticeable effect on
RC2, as shown in Figure 3.</p>
    </sec>
    <sec id="sec-6">
      <title>4.2. Evaluation techniques</title>
      <p>To evaluate the DL models, the HRSS dataset is split into three parts.</p>
      <p>The training set contains a randomly selected 2/3 of the normal cycles and is used to train the
autoencoder.</p>
      <p>The test set contains the remaining normal cycles and is used to validate the autoencoder and test it for
overtraining.</p>
      <p>The evaluation set contains all cycles with anomalies and is used to assess anomaly detection
performance and to justify the selection of the decision threshold.</p>
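      <p>Assuming cycles are stored as a list, the split can be sketched as follows (the 106 normal and 111 anomalous cycle counts come from the dataset description above; the function name is illustrative):</p>

```python
import numpy as np

def split_cycles(normal_cycles, anomalous_cycles, seed=0):
    """2/3 of normal cycles for training, the rest for testing,
    and all anomalous cycles for evaluation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(normal_cycles))
    cut = (2 * len(normal_cycles)) // 3
    train = [normal_cycles[i] for i in idx[:cut]]
    test = [normal_cycles[i] for i in idx[cut:]]
    return train, test, list(anomalous_cycles)

train, test, evaluation = split_cycles(list(range(106)), list(range(111)))
print(len(train), len(test), len(evaluation))  # → 70 36 111
```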
    </sec>
    <sec id="sec-7">
      <title>4.2.1. Choice of performance measures</title>
      <p>The architecture consists of two parts: the autoencoder which is used to reconstruct input time
sequence, and the classifier used for anomaly detection. So, two performance indicators are required.</p>
      <p>The performance of signal reconstruction was measured with the MAE loss function, which is more
outlier-resistant and more suitable for high-dimensional data compared to MSE.</p>
      <p>Anomaly detection performance was measured with the F1 score and confusion matrix. The F1 score
has the advantage of accounting for both false positives and false negatives. Compared to accuracy
and correlation-based measures, which also account for true negatives, the F1 score better suits an
unbalanced dataset. Also, the F1 score with a confusion matrix enables direct comparison with the
background research.</p>
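      <p>For reference, the F1 score follows directly from the confusion-matrix counts and, unlike accuracy, never touches the true negatives; a minimal helper (names are illustrative):</p>

```python
def f1_from_confusion(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion-matrix counts; true negatives are not used,
    which suits the unbalanced HRSS data."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```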
    </sec>
    <sec id="sec-8">
      <title>4.2.2. Selecting decision threshold</title>
      <p>In the anomaly detector a threshold must be set for the signal reconstruction error. Let L0 be the
distribution of signal reconstruction loss obtained on the training set (i.e. in normal cycles); let Ln and
La be the distributions of loss obtained on validation data, in normal intervals and in anomalies
respectively. Then the optimal value of the classification threshold θ can be assessed from L0, Ln and La
in two ways:</p>
      <p>Unsupervised: θ = E(L0) + 2σ(L0).</p>
      <p>Supervised: θ = argmax P(Ln, La, θ), where P is a performance measure for anomaly
detection.</p>
      <p>Experiments on HRSS data show that the optimal threshold value for different architectural
modifications varies in a broad range. While the first assessment can be far from optimal, the second
assessment may not be possible in most applications, where labelled anomalous data is not available.</p>
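      <p>Both selection modes can be sketched in a few lines. The supervised variant below maximizes F1 over candidate thresholds drawn from the observed losses, which is one concrete instance of the argmax over a performance measure; variable names are illustrative.</p>

```python
import numpy as np

def unsupervised_threshold(train_losses: np.ndarray) -> float:
    """Mean plus two standard deviations of the training (normal) losses."""
    return float(train_losses.mean() + 2 * train_losses.std())

def supervised_threshold(normal_losses: np.ndarray, anomalous_losses: np.ndarray) -> float:
    """Candidate threshold maximizing F1 on labelled validation losses."""
    losses = np.concatenate([normal_losses, anomalous_losses])
    labels = np.concatenate([np.zeros(len(normal_losses)), np.ones(len(anomalous_losses))])
    best_theta, best_f1 = 0.0, -1.0
    for theta in np.unique(losses):          # each observed loss is a candidate
        pred = losses >= theta
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_theta, best_f1 = float(theta), f1
    return best_theta
```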
    </sec>
    <sec id="sec-9">
      <title>4.2.3. Evaluation steps</title>
      <p>Evaluation steps include:
1. calculating performance measures;
2. assessing the statistical separation between the autoencoder's responses to normal and anomalous
signals (i.e. the corresponding loss distributions);
3. assessing robustness criteria RC1 and RC2;
4. selecting the optimal model by maximal performance, among models that have passed
robustness tests.</p>
    </sec>
    <sec id="sec-10">
      <title>5. Models</title>
      <p>The DL models being tested are divided into two groups by DL architecture type: LSTM and
convolutional. In each group the first model is a traditional architecture used for anomaly detection.
The other models are built to assess the effect of architectural modifications on model performance.</p>
      <p>The choice of the model's hyper-parameters affects both experimental performance and
robustness. Hyper-parameters include the number, types and sizes of layers, the compression rate of the
autoencoder, the use of dropouts, as well as internal layer parameters (e.g. kernel size, activation
function). As no computationally effective techniques exist for finding the optimal architecture
through hyper-parameter selection, this task remains tedious and highly intuition-driven
[15, 16]. In this study a grid search approach was applied for each model type, obtaining the models
shown in Table 1.</p>
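      <p>Such a grid search simply enumerates the Cartesian product of hyper-parameter values and evaluates a model for each combination; the parameter names and values below are illustrative, not the grid actually used in the study.</p>

```python
from itertools import product

def grid_candidates(grid: dict):
    """Yield every hyper-parameter combination in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# hypothetical search space
grid = {
    "layers": [2, 3, 4],
    "units": [32, 64],
    "dropout": [0.0, 0.2],
}
candidates = list(grid_candidates(grid))
print(len(candidates))  # → 12
```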
    </sec>
    <sec id="sec-11">
      <title>6. Experimental setup</title>
      <p>The models were implemented using Keras with a TensorFlow backend. Training was performed
using the Adam optimizer and MAE loss function with learning rate α = 0.005, β1 = 0.9, β2 =
0.999, and fuzzy factor ε = 10⁻⁷ [19]. The time series of complete process cycles, padded to a
constant length of 300 timesteps, were used as both input and target. Training was run for 130
epochs for LSTM models and 300 epochs for ConvNet models, in mini-batch mode with batch size
32. To rule out the effect of batch averaging on robustness criterion RC1, training was repeated in
stochastic mode (batch size 1). In this setup the number of epochs was reduced by a factor of 5, as
epochs are more time-consuming in this mode, but epoch-to-epoch convergence is faster. As no
significant influence of the batch size on the evaluation criteria was observed in experiments, only results
obtained in mini-batch mode are presented. As the reconstruction loss fluctuates between training
epochs, averaging across the last 10 epochs was used for a reliable performance estimate.</p>
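      <p>For reference, a single Adam update with the stated settings (learning rate 0.005, β1 = 0.9, β2 = 0.999, ε = 10⁻⁷) works out as follows. This is a minimal NumPy sketch of the optimizer from [19], not the Keras implementation used in the experiments.</p>

```python
import numpy as np

# optimizer settings from the experimental setup
ALPHA, BETA1, BETA2, EPS = 0.005, 0.9, 0.999, 1e-7

def adam_step(theta, grad, m, v, t):
    """One Adam update with bias correction [19]."""
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)                # bias-corrected moments
    v_hat = v / (1 - BETA2 ** t)
    theta = theta - ALPHA * m_hat / (np.sqrt(v_hat) + EPS)
    return theta, m, v

theta = np.array([1.0])
m, v = np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, np.array([2.0]), m, v, t=1)
# the first step moves the parameter by roughly ALPHA, regardless of gradient scale
```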
      <p>Data pre-processing included the following steps:</p>
      <p>• Introducing velocity features, calculated with second-order accurate central differences.</p>
      <p>• Dimensionality reduction from 24 to 12 components with PCA, which preserves 98.2% of the data
variance.</p>
      <p>• Normalization and scaling to the range (0, 1), which unifies the value ranges of features.</p>
      <p>• Time smoothing with a Gaussian kernel of width 15 and standard deviation 3.</p>
      <p>• Unifying time series length by padding.</p>
    </sec>
    <sec id="sec-12">
      <title>7. Results</title>
      <p>The reconstruction rates of all models fall into a narrow range, as shown in Table 2, with the exception of
the classic LSTM autoencoder (LSTM 1), which proved unable to accurately reconstruct the process.
Thus, the reconstruction loss measure cannot be used to assess the efficiency of an autoencoder model in
the CPPS anomaly detection task. Instead, statistical analysis of the loss distributions must be applied.</p>
      <p>[Table 2: evaluation results for models LSTM 1–6 and ConvNet 1–3 against the target value.]</p>
      <p>Evaluation results indicate that increasing the complexity of DL models (top down in Table 2) leads to
a higher performance measure. However, this is not the case with robustness. Deep LSTM models with
heterogeneous layers (LSTM 5 and LSTM 6) tend to average out all variation in the signal (i.e., have
low RC1), while deeper convolutional networks lose the ability to reconstruct the normal signal in the presence of
type 2 anomalies (i.e., have high RC2). It may be concluded that traditional performance metrics for
model evaluation are misleading in the case of HRSS, favoring models with low robustness according to
criteria RC1 and RC2.</p>
      <p>Considering both the performance measure and the proposed robustness criteria, the LSTM 4 model is
selected as the best choice for HRSS data. The model's architecture is shown in Figure 6.</p>
      <p>Compared to traditional LSTM autoencoder architectures [17, 18], this model introduces two
distinct architectural features. First, input time series are not flattened into a vector, and thus the
model has a lower compression rate. Experimental evidence (Table 2) suggests that preserving the time
dimension in the encoder generally leads to better performance in the anomaly detection task. Second, an
additional convolution layer is added at the model's bottleneck to capture long-term features in the input
time series.</p>
      <p>The obtained LSTM 4 model provides a 62.3±2.1% overall anomaly detection rate (F1 score) and
59.1% recall with 3350 true positives, as shown in Table 3. Compared to the baseline efficiency [9],
an increase of 102% in the anomaly detection score and an increase of 121% in recall are achieved.</p>
    </sec>
    <sec id="sec-13">
      <title>8. Conclusions</title>
      <p>The problem of model-based anomaly detection in industrial CPPS was addressed in the Deep
Learning paradigm by applying autoencoder architectures. The specific case of the HRSS plant was
studied, in which the construction and evaluation of process models had proven to be a challenging task.
The major challenges of applying Deep Learning models were identified as low process variation in
the training set and the presence of two distinct types of anomalies, whose detection requires different
algorithms or settings.</p>
      <p>It was shown that increasing model complexity, both in LSTM and convolution-based models,
makes it possible to increase anomaly detection performance but carries a strong robustness tradeoff. This indicates
that model evaluation in systems of this class cannot rely completely on performance metrics. For
evaluation results to be representative, detection rates of different anomaly types must be assessed
separately, and additional robustness criteria must be considered. Such criteria were proposed based
on statistical analysis of both the data and the model output in a supervised training context.</p>
      <p>In the studied industrial transporting system (HRSS), applying deep learning models and
autoencoder techniques allowed for a 102% performance gain in F1 score while preserving model
robustness. A wider assessment of the prospects of CPPS applications requires further experimental
research on cases with higher variance in the normal process as well as different types of anomalies.</p>
    </sec>
    <sec id="sec-14">
      <title>9. Acknowledgements</title>
      <p>This research utilizes the data collected at SmartFactoryOWL Lemgo, Germany, under the
European Union’s Horizon 2020 research project IMPROVE [12]. The data was made publicly
available by inIT [13] under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International license (CC BY-NC-SA 4.0).</p>
    </sec>
    <sec id="sec-15">
      <title>10. References</title>
      <p>[1] Factories of the future: multi-annual roadmap for the contractual PPP under HORIZON 2020,
Publications Office of the European Union, Luxembourg, 2013.
[2] E. A. Lee, Cyber physical systems: design challenges. In: Proceedings of the 11th IEEE
international symposium on Object Oriented Real-Time Distributed Computing (ISORC),
Orlando, FL, 2008, pp. 363–369. doi: 10.1109/ISORC.2008.25.
[3] O. Niggemann, C. Frey, Data-driven anomaly detection in cyber-physical production systems,</p>
      <p>AT – Automatisierungstechnik, 2015, vol. 63, issue 10. doi: 10.1515/auto-2015-0060.
[4] L. Christiansen, A. Fay, B. Opgenoorth, J. Neidig, Improved diagnosis by combining structural
and process knowledge, in: Proceedings of the 16th IEEE conference on Emerging Technologies
Factory Automation, ETFA, Toulouse, France, 2011, pp. 1–8. doi:
10.1109/ETFA.2011.6059056.
[5] S. Windmann, S. Jiao, O. Niggemann, H. Borcherding, A stochastic method for the detection of
anomalous energy consumption in hybrid industrial systems, in: Proceedings of the 11th
international IEEE conference on Industrial Informatics, INDIN, Bochum, Germany, 2013. doi:
10.1109/INDIN.2013.6622881.
[6] B. Vogel-Heuser, C. Diedrich, A. Fay, S. Jeschke, M. Kowalewski, S. Wollschlaeger,
P. Goehner, Challenges for software engineering in automation, Journal of Software Engineering
and Applications 7 (2014) 440–451. doi: 10.4236/jsea.2014.75041.
[7] N. Hranisavljevic, O. Niggemann, A. Maier, A novel anomaly detection algorithm for hybrid
production systems based on deep learning and timed automata, in: Proceedings of the 27th
international workshop on Principles of Diagnosis, DX-2016, Denver, Colorado, 2016.
[8] A. von Birgelen, O. Niggemann, Enable learning of hybrid timed automata in absence of discrete
events through self-organizing maps, in: O. Niggemann, P. Schüller (eds.), IMPROVE –
Innovative modelling approaches for production systems to raise validatable efficiency.
Technologien für die intelligente automation (Technologies for intelligent automation), vol. 8,
Springer Vieweg, Berlin, Heidelberg, 2018. doi: 10.1007/978-3-662-57805-6_3.
[9] A. von Birgelen, O. Niggemann, Using self-organizing maps to learn hybrid timed automata in
absence of discrete events, in: Proceedings of the 22nd IEEE international conference on
Emerging Technologies and Factory Automation, ETFA, Limassol, Cyprus, 2017, pp. 1–8. doi:
10.1109/ETFA.2017.8247695.
[10] C. Zhou, R. C. Paffenroth, Anomaly detection with robust deep autoencoders, in: Proceedings of
the 23rd ACM SIGKDD international conference on Knowledge Discovery and Data Mining,
KDD '17, Halifax NS, Canada, 2017, pp. 665–674. doi: 10.1145/3097983.3098052.
[11] IMPROVE. Creating the factory of the future with 4.0 solutions, 2016. URL:
http://improvevfof.eu/.
[12] Physical factory / demonstrators IMPROVE, 2016. URL:
http://improvevfof.eu/background/physical-factory-demonstrators.
[13] inIT, High storage system data for energy optimization, 2018. URL:
https://www.kaggle.com/inIT-OWL/high-storage-system-data-for-energy-optimization.
[14] M. Cerliani, Predictive maintenance with LSTM siamese network, 2019. URL:
https://towardsdatascience.com/predictive-maintenance-with-lstm-siamese-network-51ee7df29767.
[15] S. R. Young, D. C. Rose, T. P. Karnowski, S.-H. Lim, R. M. Patton, Optimizing deep learning
hyper-parameters through an evolutionary algorithm, in: Proceedings of the workshop on
Machine Learning in High-Performance Computing Environments, MLHPC '15, Austin, Texas,
2015, article no. 4. doi: 10.1145/2834892.2834896.
[16] J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization, The journal of machine
learning research, 13 (2012), pp. 281–305.
[17] A. Sagheer, M. Kotb. Unsupervised pre-training of a deep LSTM-based stacked autoencoder for
multivariate time series forecasting problems, Scientific Reports 9, 19038 (2019). doi:
10.1038/s41598-019-55320-6.
[18] A. H. Mirza, S. Cosan, Computer network intrusion detection using sequential LSTM neural
networks autoencoders, in: Proceedings of the 26th Signal Processing and Communications
Applications Conference, SIU, Izmir, Turkey, 2018, pp. 1–4. doi: 10.1109/SIU.2018.8404689.
[19] D. P. Kingma, J. Ba. Adam: a method for stochastic optimization, in: Proceedings of the 3rd
international conference for Learning Representations, CoRR, San Diego, CA, 2014,
abs/1412.6980.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>