<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recurrent neural network model for time series analysis*</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>IVUS2024: Information Society and University Studies 2024</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Silesian University of Technology</institution>
          ,
          <addr-line>Akademicka 2A, 44-100, Gliwice</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of L'Aquila</institution>
          ,
          <addr-line>Via Vetoio, 40, 67100 Coppito AQ</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Time series analysis is a critical component in various fields such as finance, economics, climate science, and healthcare, where accurate forecasting and pattern recognition are paramount. This research explores the application of recurrent neural networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, for time series prediction, using Google stock prices as a case study. The study begins with a comprehensive literature review, highlighting the evolution and advancements in RNN architectures, their theoretical foundations, and diverse applications in time series forecasting. Methodologically, this study outlines the data preprocessing techniques employed, including scaling and partitioning the dataset into training and testing sets. The RNN model architecture is meticulously designed, featuring multiple LSTM layers and dropout regularization to prevent overfitting and enhance model robustness. The model is trained and evaluated using different metrics (MAE, MSE, RMSE). Empirical results demonstrate the efficacy of the RNN model in capturing the temporal dependencies and producing accurate forecasts of stock prices.</p>
      </abstract>
      <kwd-group>
        <kwd>Recurrent neural networks (RNNs)</kwd>
        <kwd>Time series analysis</kwd>
        <kwd>Long Short-Term Memory (LSTM)</kwd>
        <kwd>Deep learning</kwd>
        <kwd>Machine learning</kwd>
        <kwd>Artificial intelligence (AI)</kwd>
        <kwd>Financial forecasting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the data-driven decision-making landscape, time series analysis is essential for understanding
sequential data across diverse domains. Traditionally, statistical methods like autoregressive
models, moving averages, and exponential smoothing have been used to analyze time series
data. However, these methods often fall short due to assumptions of linearity and stationarity,
which are rarely met in real-world datasets characterized by nonlinearity and volatility.</p>
      <p>The advent of deep learning, especially recurrent neural networks (RNNs), has transformed
time series analysis. RNNs, with their recurrent connections and ability to process
variable-length sequences, excel at capturing temporal dependencies in data. This study explores the
application of RNNs for predicting Google’s stock prices. The goals of this study are:
1. To develop a bespoke recurrent neural network model tailored specifically for time series
analysis, with a focus on predicting Google stock prices; and</p>
      <p>2. To empirically evaluate the performance of the proposed model using real-world financial
data.</p>
      <p>The research addresses the demand for reliable predictive models in financial decision-making,
considering challenges such as nonlinearity and irregularities in financial data. The study also
examines methodological aspects like data preprocessing, model selection, hyperparameter
tuning, and evaluation metrics, aiming to elucidate RNNs’ strengths and limitations in financial
forecasting.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Recurrent Neural Networks</title>
      <p>
        Recurrent Neural Networks (RNNs) are a class of artificial neural networks designed for processing
sequences of data. Unlike traditional feedforward neural networks, RNNs have connections that
form directed cycles, allowing them to maintain a memory of previous inputs. This makes RNNs
particularly well-suited for tasks where the order of data points is important, such as time
series [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] forecasting, language modeling, and speech recognition.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Architecture of RNNs</title>
        <p>The basic architecture of an RNN consists of a set of hidden states that are updated at each
time step based on the current input and the previous hidden state. This can be mathematically
described as follows:</p>
        <p>Let x_t be the input at time step t, h_t the hidden state, h_{t-1} the previous hidden state, and y_t the output at time step t. The equations governing the RNN are:
h_t = f(W_xh · x_t + W_hh · h_{t-1} + b_h)
y_t = W_hy · h_t + b_y</p>
        <p>Here W_xh and W_hy are the weight matrices for the input and output respectively, W_hh is the
weight matrix for the hidden state, b_h is the bias vector for the hidden state, b_y is the bias vector for the
output, and f is the activation function, often chosen for its nonlinear properties.</p>
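        <p>For illustration, the following minimal NumPy sketch computes one forward step of the vanilla RNN defined above. It is a sketch only; the variable names, dimensions, and tanh activation are assumptions for the example rather than the configuration used in this study.</p>
        <preformat>
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One vanilla RNN step: h_t = tanh(W_xh·x_t + W_hh·h_{t-1} + b_h), y_t = W_hy·h_t + b_y."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Toy dimensions: 1 input feature, 4 hidden units, 1 output
rng = np.random.default_rng(0)
W_xh, W_hh = rng.normal(size=(4, 1)), rng.normal(size=(4, 4))
W_hy = rng.normal(size=(1, 4))
b_h, b_y = np.zeros(4), np.zeros(1)

h = np.zeros(4)
for x in [0.1, 0.2, 0.3]:          # a short input sequence
    h, y = rnn_step(np.array([x]), h, W_xh, W_hh, W_hy, b_h, b_y)
print(y)
        </preformat>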
        <p>Diagram 2.1 below illustrates the architecture of a Recurrent Neural Network (RNN). On the
left, the compact representation shows a single recurrent unit with input x_t, hidden state h_t, and
output y_t. The connections demonstrate how the hidden state h_t is influenced by the current input x_t and the
previous hidden state (loop), and how it contributes to the output y_t.</p>
        <p>On the right, the unfolded representation depicts how the RNN processes a sequence of
inputs over time steps t − 1, t, and t + 1. Each time step t has its own input x_t, hidden state h_t, and output
y_t. The hidden state h_t is updated from the previous hidden state h_{t-1} and the current input x_t. The weight
matrices W_xh, W_hh, and W_hy are shared across all time steps, illustrating the RNN’s ability to handle
sequential data by maintaining and updating a memory of previous inputs.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Training Recurrent Neural Networks</title>
        <p>Training an RNN involves adjusting the weights and biases to minimize the error between
the predicted output and the actual target. This is typically done using the backpropagation
through time (BPTT) algorithm, which is an extension of the backpropagation algorithm used
in feedforward neural networks.</p>
        <p>Forward Pass: During the forward pass, the network processes the input sequence from
the first time step to the last, updating the hidden states and producing the outputs. The loss is
calculated by comparing the predicted outputs to the actual targets.</p>
        <p>Let ℒ be the loss function. The total loss over a sequence of length T is given by:
ℒ = Σ_{t=1}^{T} ℓ(y_t, ŷ_t)
where ℓ is the per-step loss function (e.g. Mean Squared Error), y_t is the actual target, and ŷ_t
is the predicted output. With Mean Squared Error, the per-step loss is ℓ(y_t, ŷ_t) = (y_t − ŷ_t)².</p>
        <p>Backward Pass (Backpropagation Through Time): During the backward pass, the
gradients of the loss with respect to the weights are calculated by propagating the error backwards
through the network. The gradients are then used to update the weights.</p>
        <p>The gradients for the weights and biases can be computed as follows:</p>
        <sec id="sec-2-2-1">
          <title>For the hidden weights and biases, assuming f′ is the derivative of f:</title>
          <p>Algorithm 1 Training RNN using BPTT
1: Initialize weights and biases: W_xh, W_hh, b_h, W_hy, b_y
2: for each training sequence in dataset do
3:   Initialize loss ℒ to 0
4:   for t = 1 to T do
5:     h_t ← f(W_xh · x_t + W_hh · h_{t-1} + b_h)
6:     y_t ← W_hy · h_t + b_y
7:     ℒ ← ℒ + loss_function(y_t, ŷ_t)
8:   end for
9:   gradients ← backpropagate(ℒ)
10:  update_weights_and_biases(gradients)
11: end for</p>
          <p>Training Recurrent Neural Networks (RNNs) using Backpropagation Through Time (BPTT)
involves several steps. Initially, the weights and biases W_xh, W_hh, b_h, W_hy, and b_y are randomly initialized. For each
training sequence, a forward pass computes hidden states h_t using an activation function (e.g.,
tanh, ReLU) applied to the current inputs and previous hidden states, weighted by W_xh and
W_hh and shifted by the bias b_h. Outputs y_t are derived from h_t using W_hy and b_y. The loss is computed by
comparing predicted outputs with target values. A backward pass calculates the gradients of
the loss with respect to the weights and biases, propagating the error backward through
time. Weights and biases are updated using these gradients via an optimization algorithm like
gradient descent. This iterative process refines the network parameters, minimizing loss and
enhancing performance.</p>
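          <p>To make these steps concrete, the sketch below trains a tiny vanilla RNN with full BPTT on a toy sequence using NumPy. It is a self-contained illustration of the procedure described above, not the code used in this study; the dimensions, learning rate, and toy data are assumptions.</p>
          <preformat>
import numpy as np

rng = np.random.default_rng(42)
H = 8                                   # hidden units
W_xh = rng.normal(scale=0.1, size=(H, 1))
W_hh = rng.normal(scale=0.1, size=(H, H))
W_hy = rng.normal(scale=0.1, size=(1, H))
b_h, b_y = np.zeros((H, 1)), np.zeros((1, 1))

xs = np.sin(np.linspace(0.0, 3.0, 20)).reshape(-1, 1, 1)    # toy inputs
ys = np.sin(np.linspace(0.1, 3.1, 20)).reshape(-1, 1, 1)    # next-step targets

lr = 0.05
for epoch in range(200):
    # Forward pass: roll the RNN over the sequence and accumulate the MSE loss
    hs, y_hats, loss = [np.zeros((H, 1))], [], 0.0
    for x_t, y_t in zip(xs, ys):
        h_t = np.tanh(W_xh @ x_t + W_hh @ hs[-1] + b_h)
        y_hat = W_hy @ h_t + b_y
        hs.append(h_t)
        y_hats.append(y_hat)
        loss += ((y_hat - y_t) ** 2).item()

    # Backward pass (BPTT): propagate the error backwards through time
    dW_xh, dW_hh, dW_hy = np.zeros_like(W_xh), np.zeros_like(W_hh), np.zeros_like(W_hy)
    db_h, db_y = np.zeros_like(b_h), np.zeros_like(b_y)
    dh_next = np.zeros((H, 1))
    for t in reversed(range(len(xs))):
        dy = 2 * (y_hats[t] - ys[t])            # derivative of the squared error
        dW_hy += dy @ hs[t + 1].T
        db_y += dy
        dh = W_hy.T @ dy + dh_next              # gradient flowing into h_t
        dz = (1 - hs[t + 1] ** 2) * dh          # through tanh: f'(z) = 1 - tanh(z)**2
        dW_xh += dz @ xs[t].T
        dW_hh += dz @ hs[t].T
        db_h += dz
        dh_next = W_hh.T @ dz                   # pass the gradient to h_{t-1}

    # Gradient descent update of weights and biases
    for W, dW in [(W_xh, dW_xh), (W_hh, dW_hh), (W_hy, dW_hy), (b_h, db_h), (b_y, db_y)]:
        W -= lr * dW

print("final sequence loss:", loss)
          </preformat>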
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Long Short-Term Memory (LSTM)</title>
        <p>
          LSTMs are a type of RNN designed to overcome the vanishing gradient problem [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. They
include special units called memory cells to store information over long periods.
        </p>
        <p>An LSTM cell consists of three gates: an input gate (i_t), a forget gate (f_t), and an output gate (o_t).
The equations governing the LSTM cell are:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)</p>
        <p>where x_t is the input at time t, h_{t-1} is the previous hidden state, C_t is the cell state, σ is the sigmoid
function, ⊙ denotes element-wise multiplication, and the W and b terms are the weight matrices and bias vectors respectively.</p>
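        <p>The gate equations can be illustrated with a minimal NumPy sketch of a single LSTM cell update. The weight shapes and names below are assumptions for the example; the cell follows the equations given above.</p>
        <preformat>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps the concatenation [h_{t-1}, x_t] to each gate pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde      # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state
    return h_t, c_t

# Toy sizes: 1 input feature, 4 hidden units
rng = np.random.default_rng(0)
H, D = 4, 1
W = {k: rng.normal(scale=0.1, size=(H, H + D)) for k in "fico"}
b = {k: np.zeros(H) for k in "fico"}
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(np.array([0.5]), h, c, W, b)
print(h)
        </preformat>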
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Time Series Analysis</title>
      <sec id="sec-3-1">
        <title>3.1. Introduction to Time Series</title>
        <p>
          A time series is a sequence of data points typically measured at successive points in time, spaced at
uniform intervals. Time series analysis involves methods for analyzing time series data to
extract meaningful statistics and other characteristics. Time series forecasting is the use of a
model to predict future values based on previously observed values. Mathematically, a time
series can be represented as:
Y(t) = T(t) + S(t) + C(t) + I(t)
where Y(t) is the observed value at time t, T(t) is the trend component, S(t) is the seasonal component, C(t)
is the cyclic component, and I(t) is the irregular component [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>Components of Time Series: Time series data can be decomposed into several components, each
representing an underlying pattern or structure in the data:
Trend (T): The long-term progression of the series. It represents the general direction in which the data
is moving over a long period.</p>
        <p>Seasonality (S): The repeating short-term cycle in the data. This is often observed in data with periodic
fluctuations, such as monthly sales data.</p>
        <sec id="sec-3-1-1">
          <title>Cyclic (C): Long-term oscillations in the data, typically spanning several years. Irregular (I): The random noise component which cannot be attributed to the other components.</title>
          <p>
            RNNs have found widespread applications in time series analysis, where the objective is to
analyze historical data and make predictions about future trends. Some common applications of
RNNs in time series analysis include: stock price prediction, weather forecasting, economic
modeling, signal processing [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ], health monitoring and diagnosis.
3.2. Time Series Analysis Methods
Several methods are employed to analyze time series data:
          </p>
          <p>Moving Average: The moving average method smooths the data to identify the trend
component by averaging adjacent data points. The moving average at time t for a window size k
is given by:
MA_t = (1/k) · (Y(t) + Y(t−1) + … + Y(t−k+1))</p>
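          <p>A brief pandas illustration of a simple moving average follows; the series values and the window size k = 3 are arbitrary examples.</p>
          <preformat>
import pandas as pd

prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])
# Simple moving average with window k = 3: mean of the current and previous two values
ma = prices.rolling(window=3).mean()
print(ma)
          </preformat>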
          <p>
            Exponential Smoothing: Exponential smoothing assigns exponentially decreasing weights to
past observations. The simple exponential smoothing forecast for time t + 1 is:
Ŷ(t+1) = α · Y(t) + (1 − α) · Ŷ(t)
where α is the smoothing parameter (0 &lt; α &lt; 1) [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ].
          </p>
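          <p>Simple exponential smoothing can be sketched in pandas as follows; the smoothing parameter α = 0.3 is an arbitrary example, and the exponentially weighted mean implements the recursion above.</p>
          <preformat>
import pandas as pd

prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])
alpha = 0.3
# s_t = alpha * y_t + (1 - alpha) * s_{t-1}
smoothed = prices.ewm(alpha=alpha, adjust=False).mean()
print(smoothed)
          </preformat>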
          <p>
            Autoregressive Integrated Moving Average (ARIMA): The ARIMA model combines
autoregression (AR), differencing (I), and moving average (MA) to model time series data:
y′_t = c + φ_1 y′_{t−1} + … + φ_p y′_{t−p} + θ_1 ε_{t−1} + … + θ_q ε_{t−q} + ε_t
where y′_t is the differenced series, c is a constant, φ_i represents the autoregressive parameters, θ_i
represents the moving average parameters, and ε_t is the white noise error term [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ].
          </p>
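          <p>For illustration, an ARIMA model can be fitted with the statsmodels library; the order (1, 1, 1) and the toy series below are arbitrary examples rather than values used in this study.</p>
          <preformat>
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 109.0, 111.0])
# Fit ARIMA(p=1, d=1, q=1) and forecast the next three values
model = ARIMA(prices, order=(1, 1, 1)).fit()
print(model.forecast(steps=3))
          </preformat>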
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Related Work</title>
      <p>
        The theoretical underpinnings of recurrent neural networks trace back to the foundational
concepts of artificial neural networks and computational neuroscience. Early research in neural
network theory laid the groundwork for understanding the principles of learning, representation, and
computation in interconnected networks of artificial neurons [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The introduction of
recurrent connections endowed neural networks with the ability to process sequential data and
capture temporal dependencies, paving the way for the development of RNNs [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Over the years, several architectural variants of recurrent neural networks have been proposed to
address the challenges of training deep networks and mitigating the issues of vanishing and
exploding gradients. Among these variants, Long Short-Term Memory (LSTM) networks and
Gated Recurrent Unit (GRU) networks have emerged as prominent choices due to their
ability to capture long-range dependencies and facilitate more stable training dynamics [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The
architectural design principles and computational mechanisms underlying LSTM and GRU
networks have been extensively studied, highlighting their strengths and limitations in modeling
sequential data.
      </p>
      <p>
        Recurrent neural networks have found widespread applications in time series analysis,
spanning various domains such as finance, economics, climate science, healthcare, and engineering
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In the context of financial time series analysis, RNNs have been extensively employed for
stock price prediction, market trend forecasting, risk assessment, and algorithmic trading [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Similarly, in climate science, RNN-based models have been used for weather forecasting, climate
modeling, and environmental monitoring, leveraging the temporal dependencies inherent in
meteorological data [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Other applications include speech recognition, natural language
processing, physiological signal analysis, and anomaly detection [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], where RNNs excel in
capturing sequential patterns and extracting meaningful insights from temporal data.
      </p>
      <p>
        Despite their versatility and effectiveness, recurrent neural networks are not without
limitations. Challenges such as the vanishing gradient problem, the curse of dimensionality, overfitting, and
computational inefficiency pose significant hurdles in training deep RNN architectures on
large-scale datasets [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Moreover, the interpretability of RNN-based models remains a
concern, as the black-box nature of deep learning algorithms may hinder their adoption in
domains where transparency and explainability are paramount.
      </p>
      <p>
        Looking ahead, several avenues for future research and innovation in the field of RNNs for
time series analysis can be identified. These include the development of hybrid architectures
integrating RNNs with other deep learning techniques, advancements in optimization algorithms and
regularization techniques, and efforts to enhance the interpretability and transparency of
RNN-based models [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Additionally, exploring applications in emerging domains such as
healthcare, cybersecurity, and smart manufacturing holds promise for extending the scope and
impact of RNN-based models in real-world scenarios.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. Introduction of the Dataset</title>
      <p>The dataset for this study comprises daily Google stock prices from March 21, 2019, to March 20, 2024,
sourced from finance.yahoo.com. It includes 1259 observations of five key attributes: Open, High, Low,
Close, and Volume, representing different aspects of stock prices and trading volumes. Significant
variability and large standard deviations indicate considerable daily fluctuations and market
volatility. Quartile values highlight the distribution of stock prices and volumes, showing
moderate values with some extreme fluctuations. This dataset is essential for time series
analysis and forecasting, aiding in understanding stock behavior and improving prediction accuracy
using RNN models.</p>
      <sec id="sec-5-1">
        <title>5.1. Correlation Matrix</title>
        <p>The correlation matrix provides a numerical summary of the linear relationships between pairs of
variables in the dataset: Open, High, Low, Close, and Volume. Each value in the matrix ranges
from -1 to 1, where 1 indicates a perfect positive correlation, -1 indicates a perfect negative
correlation, and 0 indicates no correlation.</p>
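        <p>For reference, the matrix can be computed directly from the loaded dataset with pandas; the file name GOOG.csv below is a hypothetical placeholder for the finance.yahoo.com export.</p>
        <preformat>
import pandas as pd

df = pd.read_csv("GOOG.csv")   # hypothetical file name for the exported dataset
corr = df[["Open", "High", "Low", "Close", "Volume"]].corr()
print(corr.round(3))
        </preformat>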
        <p>Interpretation: The prices show extremely high positive correlations, close to 1, indicating
they move together. Trading volume has a weak negative correlation with prices, suggesting
that higher volumes may slightly correspond to lower prices. This correlation matrix is crucial
for understanding relationships between stock prices and trading volume, informing predictive
modeling and analysis in financial studies, as it highlights typical price movements and potential
influences of large trades. [Figure 5.1]</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Histogram Plot</title>
        <p>The histogram displays the distribution of five variables: Open, High, Low, Close, and Volume.
Each histogram provides insight into the frequency and distribution of these values over the
observed period. The histogram for the ‘Open’ prices ranges from about 50 to 150, with distinct clusters peaking
around 60, 80, 100, and 140, indicating common opening price ranges. ‘High’ prices follow a
similar pattern, suggesting the highest prices during trading often align with opening prices.
The ‘Low’ prices histogram also shows clustering at similar levels, indicating that the lowest
prices frequently fell within these groups. ‘Close’ prices follow the same distribution pattern,
with peaks at 60, 80, 100, and 140, suggesting consistency and stability across opening, high,
low, and closing prices during trading periods.</p>
        <p>The ‘Volume’ histogram, however, shows a different pattern, ranging from 0 to approximately 1.2e8,
with a significant peak around 0.2e8, indicating many trading periods had this volume level.
There is a noticeable decrease in frequency as the volume increases, suggesting fewer periods
with extremely high trading volumes. [Figure 5.2]</p>
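        <p>These distributions can be reproduced with pandas and matplotlib; the file name is again a hypothetical placeholder, and the bin count is an arbitrary choice.</p>
        <preformat>
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("GOOG.csv")   # hypothetical file name for the exported dataset
df[["Open", "High", "Low", "Close", "Volume"]].hist(bins=50, figsize=(12, 8))
plt.tight_layout()
plt.show()
        </preformat>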
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Methodology</title>
      <p>This chapter outlines the methodology employed to develop and evaluate a Recurrent Neural
Network (RNN) for time series analysis, specifically for predicting Google stock prices. The
approach involves several key steps: preprocessing, model architecture design, training, and
evaluation. Each step is described in detail to provide a comprehensive understanding of the
processes involved in this study.</p>
      <sec id="sec-6-1">
        <title>6.1. Data Preprocessing</title>
        <p>Data preprocessing is crucial for training the RNN model, involving several steps. The dataset is
first loaded into a pandas DataFrame for structured manipulation and analysis, followed by
inspecting its structure, checking for missing values, and reviewing basic statistics. The data is
then split into training (80%) and testing (20%) sets to evaluate the model’s performance on
unseen data. The ’Open’ price is chosen for predicting future stock prices, and Min-Max
normalization scales the values between 0 and 1, aiding the RNN model’s convergence. Training
sequences of 60 time steps are created, with each sequence comprising ’Open’ prices for 60
consecutive days and the target being the ’Open’ price of the next day, capturing temporal
dependencies. Finally, testing sequences are prepared from the combined training and testing
’Open’ prices, ensuring continuity and smooth transition in the time series.</p>
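        <p>A condensed sketch of this preprocessing pipeline is shown below. The file name and variable names are assumptions for the example; the 80/20 split, Min-Max scaling to [0, 1], and 60-step windows follow the description above.</p>
        <preformat>
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("GOOG.csv")                    # hypothetical export from finance.yahoo.com
open_prices = df[["Open"]].values               # the 'Open' price is used for prediction

# Split into training (80%) and testing (20%) sets
split = int(len(open_prices) * 0.8)
train, test = open_prices[:split], open_prices[split:]

# Min-Max normalization to the [0, 1] range, fitted on the training data
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)

# Training sequences: 60 consecutive 'Open' prices predict the next day's 'Open' price
X_train, y_train = [], []
for i in range(60, len(train_scaled)):
    X_train.append(train_scaled[i - 60:i, 0])
    y_train.append(train_scaled[i, 0])
X_train = np.array(X_train).reshape(-1, 60, 1)  # (samples, time steps, features)
y_train = np.array(y_train)

# Testing sequences are drawn from the combined series so that the first test windows
# can look back into the end of the training period
inputs = scaler.transform(open_prices[split - 60:])
X_test = np.array([inputs[i - 60:i, 0] for i in range(60, len(inputs))]).reshape(-1, 60, 1)
y_test = open_prices[split:, 0]
        </preformat>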
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Model Architecture and Process</title>
        <p>The RNN model is designed using a stacked Long Short-Term Memory (LSTM) network. LSTMs
are chosen for their ability to learn long-term dependencies, making them suitable for time
series prediction. The model consists of several LSTM layers, each followed by a Dropout layer to
prevent overfitting. The final layer is a Dense layer that outputs the predicted stock price.</p>
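        <p>One way to realize such a stacked architecture is with Keras; the layer sizes and dropout rate below are illustrative assumptions rather than the exact configuration evaluated in this study.</p>
        <preformat>
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential([
    # Stacked LSTM layers; return_sequences=True passes the full sequence onward
    LSTM(50, return_sequences=True, input_shape=(60, 1)),
    Dropout(0.2),
    LSTM(50, return_sequences=True),
    Dropout(0.2),
    LSTM(50),
    Dropout(0.2),
    Dense(1),   # single output: the predicted next-day 'Open' price
])
model.summary()
        </preformat>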
        <p>LSTM Layers and Dropout: The Dropout layers, which randomly set a fraction of the input
units to 0 during training, prevent overfitting, while each LSTM layer captures the temporal
dependencies in the stock price data.</p>
        <p>Compiling the Model: The Adam optimizer is used to compile the model, and the MSE
(Mean Squared Error) is used as the loss function. The Adam optimizer is chosen for its adaptive
learning rate capabilities, which help in faster and more efficient convergence.</p>
        <p>Model Training: The model is trained with a batch size of 32 across 100 epochs. This choice
balances training time with the model’s ability to learn from the data effectively.</p>
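        <p>Continuing the sketch above, the compile and training calls corresponding to this setup might look as follows.</p>
        <preformat>
# Adam optimizer with MSE loss, trained for 100 epochs with a batch size of 32
model.compile(optimizer="adam", loss="mean_squared_error")
history = model.fit(X_train, y_train, epochs=100, batch_size=32)
        </preformat>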
        <p>Model Evaluation: The model’s performance is assessed using the testing set. Predicted
stock prices are compared with actual prices, and the Mean Absolute Error (MAE), Mean Squared
Error (MSE), and Root Mean Squared Error (RMSE) are used to evaluate the accuracy of the
model.</p>
        <p>Making Predictions: Utilizing the testing data, the trained model is employed to make
predictions. The predicted prices are then transformed back to their original scale using the
inverse of the Min-Max scaler.</p>
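        <p>A sketch of the evaluation and prediction steps, continuing the examples above: predictions are mapped back to the original scale with the inverse Min-Max transform, and MAE, MSE, and RMSE are computed (scikit-learn metric functions are one possible choice).</p>
        <preformat>
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Predict scaled prices on the test windows and map them back to the original scale
pred_scaled = model.predict(X_test)
pred_prices = scaler.inverse_transform(pred_scaled).ravel()

mae = mean_absolute_error(y_test, pred_prices)
mse = mean_squared_error(y_test, pred_prices)
rmse = np.sqrt(mse)
print(f"MAE: {mae:.4f}  MSE: {mse:.4f}  RMSE: {rmse:.4f}")
        </preformat>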
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Empirical Analysis</title>
      <p>In this study, the performance of recurrent neural networks (RNNs) was analyzed with varying
numbers of layers and activation functions to predict time series data for a specific number of
input values. The performance metrics used were Mean Absolute Error (MAE), Mean Squared
Error (MSE), and Root Mean Squared Error (RMSE). The activation functions evaluated included no
activation key, tanh, ReLU, sigmoid, and combinations thereof. This comprehensive analysis aims
to identify the optimal configuration for time series prediction and understand the impact of
different network architectures on model performance.</p>
      <sec id="sec-7-0">
        <title>7.1. General Observations</title>
        <p>1. The best performing models balance simplicity and sufficient depth to capture temporal
patterns without overfitting.</p>
        <p>2. Deep networks with complex activations tend to perform poorly, indicating the need for
careful tuning and possibly alternative architectures or regularization techniques to manage
deeper models. Issues such as exploding gradients can affect deeper networks, particularly
those using ReLU and sigmoid activations, which can result in poor model performance.</p>
      </sec>
      <sec id="sec-7-1">
        <title>7.2. Impact of Network Depth</title>
        <p>Shallow Networks (4-6 layers): These generally performed well, especially with no activation
key or a combination of no activation key and tanh. This suggests that for this specific time
series data, a less complex model is sufficient to capture the necessary patterns.</p>
        <p>Medium Networks (8-15 layers): Performance begins to degrade as the number of layers
increases. For instance, at 10 layers, the errors increase notably, particularly with the ReLU and
sigmoid activations.</p>
        <p>Deep Networks (20-100 layers): The performance significantly worsens with deeper
networks. Activation functions like ReLU and sigmoid show particularly high errors, likely due to
gradient issues and overfitting.</p>
      </sec>
      <sec id="sec-7-3">
        <title>7.3. Impact of Activation Functions</title>
        <p>No Activation Key: Consistently shows good performance across different network depths,
indicating stability and robustness.</p>
        <p>tanh: Performs well in shallow and medium networks but shows increased errors in very
deep networks.</p>
        <p>ReLU and sigmoid: These activation functions result in poor performance, particularly as the
network depth increases. This is likely due to their susceptibility to gradient issues.</p>
        <p>Combinations of Activation Functions: The combination of no activation key and tanh
shows promise in shallow networks but degrades in performance as the network depth increases.</p>
      </sec>
      <sec id="sec-7-2">
        <title>7.4. Model Performance Visualization</title>
        <p>The figure 7.1 shows the Root Mean Squared Error (RMSE) of a Recurrent Neural Network
(RNN) model with different numbers of layers and various activation functions. Each subplot
represents the RMSE for a specific number of layers, ranging from 4 to 100 layers, as indicated by
the titles of the subplots. The x-axis of each subplot lists different activation functions used in
the RNN layers, and the y-axis indicates the RMSE value corresponding to each activation
function.</p>
        <p>Similarly, figure 7.2 provides an activation-key-wise visualization of the results. The
figure shows the Root Mean Squared Error (RMSE) of a Recurrent Neural Network (RNN) model with
different numbers of layers and various activation functions. Each subplot represents the
RMSE for specific activation function keys, ranging from no activation key to ReLU and tanh
activation keys, as indicated by the titles of the subplots. The x-axis of each subplot lists different
layers used in the RNN model, and the y-axis indicates the RMSE value corresponding to each
activation function.</p>
        <p>Best Performance: The best performance in terms of the lowest MAE, MSE, and RMSE
was observed with a 4-layer RNN using a combination of no activation key and tanh activation
key. This configuration provides MAE: 2.532684078, MSE: 10.31147237, RMSE: 3.21114814. This
indicates that a relatively shallow network with mixed activation functions can effectively
capture the temporal dependencies in the data without overfitting. [Figure 7.3]</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion</title>
      <p>This study explored the application of recurrent neural networks (RNNs), particularly Long
Short-Term Memory (LSTM) networks, for time series prediction using Google stock prices as a
case study. The study comprehensively evaluated different RNN configurations, varying the
number of network layers and activation functions, to determine the optimal setup for accurate
forecasting.</p>
      <p>The empirical analysis revealed that the best performance was achieved with a 4-layer RNN
using a combination of no activation key and tanh activation function. This configuration
produced the lowest Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean
Squared Error (RMSE), indicating its effectiveness in capturing temporal dependencies in the
data. Conversely, the worst performance was observed with a 100-layer RNN without using
any activation function, which highlighted issues such as gradient explosion and overfitting in
deeper networks.</p>
      <p>The findings underscore the importance of network depth and activation function selection in
RNN-based time series analysis. Shallow networks with appropriate activation functions can
effectively model temporal data, while deeper networks require careful tuning to avoid
performance degradation.</p>
    </sec>
    <sec id="sec-9">
      <title>9. Acknowledgments</title>
      <p>I would like to express my deepest gratitude to my supervisor, Professor Marcin Woźniak,
for his invaluable guidance, encouragement, and support throughout the course of this study.
His profound knowledge and expertise have been instrumental in shaping my research and
providing me with the direction needed to navigate through various challenges. Professor
Woźniak’s insightful feedback and constructive criticism have significantly enhanced the quality of
this work. His patience and willingness to share his time and knowledge have been greatly
appreciated.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Siłka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wieczorek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <article-title>Recurrent neural network model for high-speed train vibration prediction from time series</article-title>
          ,
          <source>Neural Computing and Applications</source>
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>13305</fpage>
          -
          <lpage>13318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <article-title>A brief overview of recurrent neural networks (RNN)</article-title>
          ,
          <year>2024</year>
          . URL: https://www.analyticsvidhya.com/blog/2022/03/a-brief-overview-of-recurrent-neural-networks-rnn/, accessed on June 14,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wieczorek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Siłka</surname>
          </string-name>
          ,
          <article-title>Bilstm deep neural network model for imbalanced medical data of iot systems</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>141</volume>
          (
          <year>2023</year>
          )
          <fpage>489</fpage>
          -
          <lpage>499</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Montgomery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Jennings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kulahci</surname>
          </string-name>
          ,
          <source>Introduction to Time Series Analysis and Forecasting</source>
          , John Wiley &amp; Sons,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Siłka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wieczorek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alrashoud</surname>
          </string-name>
          ,
          <article-title>Recurrent neural network model for iot and networking malware threat detection</article-title>
          ,
          <source>IEEE Transactions on Industrial Informatics</source>
          <volume>17</volume>
          (
          <year>2020</year>
          )
          <fpage>5583</fpage>
          -
          <lpage>5594</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <source>Smoothing, Forecasting and Prediction of Discrete Time Series</source>
          , Prentice-Hall,
          <year>1963</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Box</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Reinsel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Ljung</surname>
          </string-name>
          ,
          <article-title>Time series analysis: forecasting and control</article-title>
          , John Wiley &amp; Sons,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ranganathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <article-title>Time series forecasting using lstm neural networks</article-title>
          ,
          <source>in: Recent Trends in Image Processing and Pattern Recognition</source>
          , Springer, Singapore,
          <year>2021</year>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>135</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Z. C.</given-names>
            <surname>Lipton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Berkowitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Elkan</surname>
          </string-name>
          ,
          <article-title>A critical review of recurrent neural networks for sequence learning</article-title>
          ,
          <source>arXiv preprint arXiv:1506.00019</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural computation 9</source>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Apgvae: Adaptive disentangled representation learning with the graph-based structure information</article-title>
          ,
          <source>Information Sciences 657</source>
          (
          <year>2024</year>
          )
          <fpage>119903</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Time series prediction using a deep learning model based on lstm network</article-title>
          ,
          <source>in: Journal of Physics: Conference Series</source>
          , volume
          <volume>1529</volume>
          ,
          <year>2020</year>
          , p.
          <fpage>032035</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boulanger-Lewandowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vincent</surname>
          </string-name>
          ,
          <article-title>Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription</article-title>
          ,
          <source>arXiv preprint arXiv:1206.6392</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pascanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>On the difficulty of training recurrent neural networks</article-title>
          ,
          <source>in: International Conference on Machine Learning</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1310</fpage>
          -
          <lpage>1318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Recent advances in deep learning for time series forecasting</article-title>
          ,
          <source>IEEE Transactions on Neural Networks and Learning Systems</source>
          <volume>32</volume>
          (
          <year>2020</year>
          )
          <fpage>3730</fpage>
          -
          <lpage>3751</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>