<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of Machine Learning Methods for Predicting Stock Prices</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oluwadurotimi Onibonoje</string-name>
          <email>oluwadurotimi.onibonoje2@mail.dcu.ie</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kevin Djoussa</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mark Roantree</string-name>
          <email>mark.roantree@dcu.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Insight Centre for Data Analytics, Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computing, Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>VistaMilk SFI Research Centre, Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this research, we investigated the applicability of Long Short Term Memory (LSTM) and Convolutional Neural Networks (CNN) in forecasting the next day's closing price of four major stock indices and explored de-noising techniques to improve the performance of these models. Our experiments show that the use of Kalman Filters with the LSTM model provides the best forecast accuracy, reducing forecast error by at least 30% in three of the four financial time series used in this study.</p>
      </abstract>
      <kwd-group>
        <kwd>Time Series Analysis</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Signal Processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Predicting the future price of stock indices has been shown to be an extremely
challenging endeavor, largely due to the noisy and non-stationary characteristics
of their time series [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Several approaches have been investigated in forecasting
stock indices, e.g. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Statistical approaches such as the linear
Autoregressive Integrated Moving Average (ARIMA) model were used in forecasting the monthly
stock price of the S&amp;P 500 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Deep learning architectures such as the LSTM
implemented in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Convolutional Neural Networks (CNN) as used in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] are
some of the non-linear models that have shown promise in this domain. These
research initiatives provide evidence that using more sophisticated models can
deliver better results.
      </p>
      <p>
        Motivation. Stock market indices such as the Standard and Poor's 500
(S&amp;P500) have been shown, through the Granger-causality test, to have
predictive power as a leading indicator of the economy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Therefore, accurate forecasts
of stock market indices will help economic policy makers reach more informed
conclusions as regards the right economic policies for desired economic outcomes.
Institutional investors will also benefit from accurate forecasts of a stock market
index, as these predictions will help inform the portfolio optimization process
across different financial asset classes.
      </p>
      <p>Contribution. This project attempts to forecast the univariate time series
of four major stock indices, namely, the S&amp;P 500, Dow Jones Industrial
Average (DJIA), Euro Stoxx 50 (Stoxx50E) and the National Association of
Securities Dealers Automated Quotation (NASDAQ) exchange. We applied three deep
learning algorithms, namely, LSTM, CNN and CNN-LSTM, to the daily closing
prices for each index. In addition to investigating and understanding the efficacy
of neural network architectures in time series forecasting, our research attempts
to examine the merits of using de-noising techniques such as wavelet transforms
and Kalman filters on these financial time series. Novel ensemble approaches
composed of these de-noising techniques and the neural network models are
introduced in this paper; they were developed in a bid to improve on the accuracy of the
time series forecasts of our baseline model.</p>
      <p>Paper Structure. The remainder of this paper is structured as follows: in
section 2, we provide an overview of related research; in section 3, we examine the
theoretical concepts and models that underpin our implementation models; in
section 4, we present our methodology for detecting the best model configuration
for day-ahead forecasts of stock indices; in section 5, we present our evaluation
and discuss our findings; and finally, in section 6, we conclude the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Research</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the authors applied Deep Neural Networks to predict one-month ahead
stock returns of stocks in the MSCI Japan Index. Their approach used 25
fundamental analysis factors for each stock in the cross-section of the Japanese stock
market. The experimental results, which were evaluated using Rank Correlation,
Directional Accuracy and Mean Square Error (MSE), showed that Deep
Neural Networks outperformed other models including Support Vector Regression
(SVR), Random Forest (RF) and Shallow Neural Network models with a 30%
average uplift using the rank correlation metric and a 2.6% reduction in MSE.
      </p>
      <p>
        The efficacy of Deep Belief Networks (DBN) was investigated in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] with
Technical Analysis Indicators as features and a 2-Dimensional Principal
Component Analysis (PCA) model in predicting the S&amp;P 500. Three models were
formulated and evaluated using the RMSE metric in order to properly evaluate
the usefulness of the Technical Analysis Indicators. The first model is composed
of a Back Propagation Neural Network (BPNN) and the basic features in the
raw dataset, while the second model is composed of a DBN, basic features and
extracted Technical Indicator features. The final model adds the complexity of
a 2-Dimensional PCA to the previous model. The experimental results indicate
that Technical Indicators coupled with PCA can help improve the predictive
power of Deep Learning Algorithms. The final model had a 43.5% reduction in
RMSE in comparison to the first model and a 16.91% reduction in RMSE when
compared to the second model.
      </p>
      <p>
        Wavelet Transforms have been studied in different applications with
non-stationary time series and signals [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Wavelet transforms decompose a signal
into components of different time scales. Li and Tam used wavelets as a
real-time de-noising technique for financial time series of East Asian Stock Indices as
seen in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Their approach applies different mother wavelets with grid-searched
hyper-parameters like the decomposition level and sliding window size on these
stock indices to obtain a smooth time series, which will be processed by an LSTM
Neural Network. The results of their research show that there is merit in using
wavelets as a de-noising technique for neural network models. De-noised data
improved directional accuracy of LSTM forecasts on original data inputs by an
average of 7.6%.
      </p>
      <p>
        The Kalman Filter is a state space model [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], which uses an optimal recursive
algorithm typically found in signal processing research. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the authors
compared the use of wavelet transforms and Kalman filters in de-noising signals.
Their results show that the Coiflet 2 wavelet transform outperforms Kalman
Filters in signal de-noising. Ma and Teng predicted chaotic time series using a
variation of the Kalman Filter known as Unscented Kalman Filter (UKF) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
Lima and Neto used Kalman filters in conjunction with wavelets to pre-process
the time series of the Brazilian IBOVESPA index [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The pre-processed time
series is fed into a Recurrent Neural Network (RNN) to generate forecasts. Their
results indicate a Mean Absolute Percentage Error (MAPE) of 0.72%, beating
other models including ARIMA.
      </p>
      <p>
        Summary. From this review, we can interpret the most recent approaches
taken by researchers in seeking better stock price predictions. Some approaches
have included technical indicator features using Neural Networks [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], linear
statistical models [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and wavelet de-noising on the input time series as seen in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
While these approaches are richly varied in their methodologies, we have not
found a study that attempted to focus solely on the input financial series and
compare the effect of the de-noising techniques that we propose to improve the
forecasting ability of Neural Network models.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Background Models</title>
      <p>In this section, we provide a brief overview of the models used in our evaluation
and the two de-noising techniques used in an attempt to improve model
performance.</p>
      <sec id="sec-3-1">
        <title>Long Short Term Memory (LSTM)</title>
        <p>
          Recurrent Neural Networks (RNN) maintain an internal loop that allows for
information persistence. The output of an RNN is used in conjunction with the
current element in the input tensor to compute the next element in the output
sequence [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In simpler RNN models, the memory unit or state of the
RNN is often equivalent to the previous output, while more complex models have
different values for the state and the previous element in the output sequence.
        </p>
        <p>
          Equation 1 [
          <xref ref-type="bibr" rid="ref10">10</xref>
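          ] illustrates how the output of an RNN is computed for
a given time step <italic>t</italic>. In this equation, the activation function <inline-formula><tex-math>\varphi</tex-math></inline-formula> is a nonlinear
hyperbolic tangent function and <inline-formula><tex-math>W_{x}</tex-math></inline-formula>, <inline-formula><tex-math>W_{y}</tex-math></inline-formula> are matrices containing connection
weights for the input of the current time step and the outputs of the previous
time step respectively. The state matrix from the previous time step is <inline-formula><tex-math>h_{(t-1)}</tex-math></inline-formula>,
while the bias vector <italic>b</italic> contains the bias term for each neuron.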
        </p>
        <p>
          <disp-formula>
            <label>(1)</label>
            <tex-math>Y_{t} = \varphi\left(X_{(t)} W_{x} + h_{(t-1)} W_{y} + b\right)</tex-math>
          </disp-formula>
        </p>
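        <p>As a minimal illustration of equation 1 (a sketch, not the implementation used in this study), the recurrent step can be written in NumPy. The layer sizes, the random weights and the helper name <monospace>rnn_step</monospace> are assumptions chosen only for the example.</p>
        <preformat><![CDATA[
```python
import numpy as np


def rnn_step(x_t, h_prev, W_x, W_y, b):
    """One recurrent step: Y_t = tanh(X_t W_x + h_(t-1) W_y + b)."""
    return np.tanh(x_t @ W_x + h_prev @ W_y + b)


rng = np.random.default_rng(0)
n_inputs, n_neurons = 1, 4                      # illustrative sizes (assumption)
W_x = rng.normal(size=(n_inputs, n_neurons))    # input weights
W_y = rng.normal(size=(n_neurons, n_neurons))   # recurrent weights
b = np.zeros(n_neurons)                         # one bias term per neuron

sequence = np.array([[0.1], [0.2], [0.3]])      # three time steps, one feature
h = np.zeros((1, n_neurons))                    # initial state
for x_t in sequence:
    # In this simple model the state equals the previous output.
    h = rnn_step(x_t.reshape(1, -1), h, W_x, W_y, b)
```
]]></preformat>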
        <p>
          LSTMs [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] address the long-term dependency problem of RNNs by
introducing three gate structures, namely the forget gate <italic>F</italic>, input gate <italic>I</italic>, and output
gate <italic>O</italic> shown in Fig. 1. The forget gate function in equation 2 takes as input the
previous state <inline-formula><tex-math>h_{(t-1)}</tex-math></inline-formula> and the current input vector <inline-formula><tex-math>X_{(t)}</tex-math></inline-formula> and passes these inputs
into a sigmoid function, which returns a value between 0 and 1 that represents
the amount of information to flow through the gate.
        </p>
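        <p>
          <disp-formula>
            <label>(2)</label>
            <tex-math>F_{t} = \sigma\left(W_{F}\,[h_{(t-1)}, x_{t}] + b_{F}\right)</tex-math>
          </disp-formula>
        </p>
        <p>The input gate function <inline-formula><tex-math>I_{(t)}</tex-math></inline-formula> in equation 3, as with the forget gate, takes as
input the previous state <inline-formula><tex-math>h_{(t-1)}</tex-math></inline-formula> and the current input vector <inline-formula><tex-math>X_{(t)}</tex-math></inline-formula> and passes
these inputs into a sigmoid function. The input gate helps to determine the
value to be updated.</p>
        <p>
          <disp-formula>
            <label>(3)</label>
            <tex-math>I_{t} = \sigma\left(W_{I}\,[h_{(t-1)}, x_{t}] + b_{I}\right)</tex-math>
          </disp-formula>
        </p>
        <p>The output gate function <inline-formula><tex-math>O_{(t)}</tex-math></inline-formula> shown in equation 4 determines those parts
of the long-term state <inline-formula><tex-math>C_{(t)}</tex-math></inline-formula> to be passed as output to <inline-formula><tex-math>H_{(t)}</tex-math></inline-formula> for the current time
step.</p>
        <p>
          <disp-formula>
            <label>(4)</label>
            <tex-math>O_{t} = \sigma\left(W_{O}\,[h_{(t-1)}, x_{t}] + b_{O}\right)</tex-math>
          </disp-formula>
        </p>
        <p>The matrices <inline-formula><tex-math>W_{F}, W_{I}, W_{O}</tex-math></inline-formula> contain the connection weights for the gate layers,
while <inline-formula><tex-math>b_{F}, b_{I}, b_{O}</tex-math></inline-formula> are the bias terms for these layers.</p>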
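        <p>The three gate functions share the same form, which the following NumPy sketch makes explicit. It is an illustration under assumed sizes (three hidden units, one input feature) with random weights, and it computes only the gate activations, not a full LSTM cell update; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_gates(h_prev, x_t, W_F, W_I, W_O, b_F, b_I, b_O):
    """Forget, input and output gate activations for one time step."""
    concat = np.concatenate([h_prev, x_t])   # [h_(t-1), x_t]
    F_t = sigmoid(W_F @ concat + b_F)        # forget gate
    I_t = sigmoid(W_I @ concat + b_I)        # input gate
    O_t = sigmoid(W_O @ concat + b_O)        # output gate
    return F_t, I_t, O_t


rng = np.random.default_rng(0)
n_hidden, n_inputs = 3, 1                    # illustrative sizes (assumption)
shape = (n_hidden, n_hidden + n_inputs)
W_F, W_I, W_O = (rng.normal(size=shape) for _ in range(3))
b_F = b_I = b_O = np.zeros(n_hidden)

F_t, I_t, O_t = lstm_gates(np.zeros(n_hidden), np.array([0.5]),
                           W_F, W_I, W_O, b_F, b_I, b_O)
```
]]></preformat>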
      </sec>
      <sec id="sec-3-2">
        <title>Convolutional Neural Network</title>
        <p>
          Convolutional Neural Networks (CNNs) are a class of deep learning algorithms
shown to be highly effective in tasks relating to visual perception [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The
architecture of a CNN incorporates the convolution layer and the pooling layer.
These layers explain why CNNs outperform and are more efficient than
traditional neural network architectures in computer vision tasks. In this research,
we applied a 1D CNN to our financial time series.
        </p>
        <p>The convolutional layer allows the Neural Network to capture spatial and
temporal dependencies in the input feature map by applying a convolutional
kernel on this input tensor. The layer iteratively parses each element of the input
feature map by applying a sliding convolution kernel to produce a convolved
output feature map, which models the translation invariant nature of the input
feature map. The size of the output feature map is influenced by the size of the
convolution kernel and the padding added to the input feature map to avoid
losing the edge elements of the feature space.</p>
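        <p>To make the effect of kernel size and padding on the output size concrete, the sketch below slides a kernel over a short 1-D series. It is a plain NumPy illustration with made-up values and zero padding as an assumed padding scheme; it is not the convolution layer used in our models.</p>
        <preformat><![CDATA[
```python
import numpy as np


def conv1d(x, kernel, padding=0):
    """Slide a kernel over a 1-D input (stride 1) with optional zero padding."""
    x_padded = np.pad(x, padding)            # zeros added on both edges
    k = len(kernel)
    out_len = len(x_padded) - k + 1          # output size: n + 2p - k + 1
    return np.array([np.dot(x_padded[i:i + k], kernel) for i in range(out_len)])


prices = np.array([1.0, 2.0, 4.0, 7.0, 11.0])   # made-up input values
kernel = np.array([-1.0, 0.0, 1.0])             # made-up kernel weights

valid = conv1d(prices, kernel)               # no padding: 5 - 3 + 1 = 3 outputs
same = conv1d(prices, kernel, padding=1)     # padding 1: 5 + 2 - 3 + 1 = 5 outputs
```
]]></preformat>
        <p>Without padding the first and last elements are each covered by only one kernel position; padding by one element on each side keeps the output the same length as the input.</p>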
        <p>The pooling layer down-samples the convolved feature map by applying
tensor operations with an n×n sliding window. There are principally two types of
pooling operations, namely maximum pooling and average pooling. The
maximum pooling operation computes the highest value in the current n×n window
of the convolved feature map, while the average pooling operation computes the
mean. The pooling layer allows CNNs to better model spatial hierarchies present
in the input feature map. The layer reduces the number of parameters of the
input feature map, hence reducing the risk of overfitting and giving CNNs
generalization power.</p>
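        <p>For a 1-D feature map the pooling window reduces to a window of length n. The following sketch shows both pooling operations, assuming non-overlapping windows (stride equal to the window length), which is a common default rather than a setting stated above.</p>
        <preformat><![CDATA[
```python
import numpy as np


def pool1d(x, window, mode="max"):
    """Down-sample a 1-D feature map with non-overlapping windows."""
    n = len(x) // window * window            # drop any incomplete trailing window
    blocks = x[:n].reshape(-1, window)       # one row per window
    if mode == "max":
        return blocks.max(axis=1)            # maximum pooling
    return blocks.mean(axis=1)               # average pooling


feature_map = np.array([3.0, 5.0, 7.0, 1.0, 2.0, 8.0])   # made-up values
max_pooled = pool1d(feature_map, window=2, mode="max")
avg_pooled = pool1d(feature_map, window=2, mode="avg")
```
]]></preformat>
        <p>Both results have half as many elements as the input, which is the reduction in size that the pooling layer contributes.</p>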
      </sec>
      <sec id="sec-3-3">
        <title>Wavelet Transformation</title>
        <p>
          Wavelet Transformation is a signal processing technique that has been used
in many applications such as image compression, time-frequency analysis and
data de-noising. The Continuous Wavelet Transform (CWT) provides the
time-frequency representation and overcomes the resolution problems of the Short
Time Fourier Transform (STFT). The CWT differs from the STFT by offering a
variable size window function for the spectral components. The wavelet function
derived from the mother and father wavelet [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] can be expressed using equations
5 and 6, where a is the scale factor and b is the translation factor.
        </p>
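        <p>
          <disp-formula>
            <label>(5)</label>
            <tex-math>\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right)</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(6)</label>
            <tex-math>\phi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\phi\!\left(\frac{t-b}{a}\right)</tex-math>
          </disp-formula>
        </p>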
        <p>
          The formula for the CWT and the Inverse CWT [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] of a function is shown
in equations 7 and 8.
        </p>
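        <p>
          <disp-formula>
            <label>(7)</label>
            <tex-math>W_{f}(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(x)\,\psi^{*}\!\left(\frac{x-b}{a}\right) dx</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(8)</label>
            <tex-math>f(x) = \frac{1}{C_{\psi}} \int_{0}^{\infty}\!\int_{-\infty}^{\infty} \frac{1}{a^{2}}\, W_{f}(a,b)\,\psi_{a,b}(x)\, db\, da</tex-math>
          </disp-formula>
        </p>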
        <p>
          Here, we applied the Haar wavelet to de-noise our input financial time series.
The Haar wavelet is computationally more efficient than other mother wavelets
and has been shown to be capable of improving results in this domain [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          Kalman Filters estimate the state of a system given measurements with expected
errors. Their efficiency in making time series forecasts makes them widely applied
in time series analysis and real-time applications [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. A linear Gaussian model
for the state and observation of a measured process is shown in equations 12 and
13, where <inline-formula><tex-math>x_{t}</tex-math></inline-formula> is the real value at a given time <italic>t</italic> for the measured system and <inline-formula><tex-math>y_{t}</tex-math></inline-formula> is
the measured value at <italic>t</italic>.
        </p>
        <p>
          <disp-formula>
            <label>(12)</label>
            <tex-math>x_{t} = F \cdot x_{t-1} + B \cdot u_{t} + w_{t}</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(13)</label>
            <tex-math>y_{t} = A \cdot x_{t} + v_{t}</tex-math>
          </disp-formula>
        </p>
        <p>
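          In order to determine the real state of the system at a given time <italic>t</italic>, there are
three functional components. The first component, <inline-formula><tex-math>F \cdot x_{t-1}</tex-math></inline-formula>, shows the functional
relationship between the value of the previous state <inline-formula><tex-math>x_{t-1}</tex-math></inline-formula> and the current state
<inline-formula><tex-math>x_{t}</tex-math></inline-formula>. The second component, <inline-formula><tex-math>B \cdot u_{t}</tex-math></inline-formula>, is an external force term [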
          <xref ref-type="bibr" rid="ref21">21</xref>
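          ]. The third
component, <inline-formula><tex-math>w_{t}</tex-math></inline-formula>, is a stochastic term which captures dynamics not present in the
previous state. The measured value, <inline-formula><tex-math>y_{t}</tex-math></inline-formula>, is determined by applying a function to
the real value of the current state, <inline-formula><tex-math>A \cdot x_{t}</tex-math></inline-formula>, and adding a white Gaussian noise
<inline-formula><tex-math>v_{t}</tex-math></inline-formula>. The Kalman filter forecasts the future value using equation 14, where <inline-formula><tex-math>K_{t}</tex-math></inline-formula> is
the Kalman gain.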
        </p>
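        <p>
          <disp-formula>
            <label>(14)</label>
            <tex-math>\hat{x}_{t} = K_{t} \cdot y_{t} + (1 - K_{t}) \cdot \hat{x}_{t-1}</tex-math>
          </disp-formula>
        </p>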
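        <p>
          The Kalman Filter recursively iterates between the prediction and filtering
phases [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], with the prediction phase described by equations 15 and 16, and the
filtering phase by equations 17 and 18, where <inline-formula><tex-math>P_{t}</tex-math></inline-formula> is the estimate of the state covariance;
<italic>R</italic> is the measurement error variance; and <italic>Q</italic> is a tunable hyper-parameter for
improving the performance of the model.
        </p>
        <p>
          <disp-formula>
            <label>(15)</label>
            <tex-math>\hat{x}_{t}^{-} = F \cdot \hat{x}_{t-1} + B \cdot u_{t}</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(16)</label>
            <tex-math>P_{t}^{-} = F \cdot P_{t-1} \cdot F^{T} + Q</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(17)</label>
            <tex-math>\hat{x}_{t} = \hat{x}_{t}^{-} + K_{t} \cdot \left(y_{t} - A \cdot \hat{x}_{t}^{-}\right)</tex-math>
          </disp-formula>
        </p>
        <p>
          <disp-formula>
            <label>(18)</label>
            <tex-math>P_{t} = \left(I - K_{t} \cdot A\right) \cdot P_{t}^{-}</tex-math>
          </disp-formula>
        </p>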
        <p>
          The Kalman gain, <inline-formula><tex-math>K_{t}</tex-math></inline-formula>, determines the relative importance of the
error of the estimate when compared to the error of the measurement.
The computation of the Kalman gain is described in [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] and shown in
equation 19.
        </p>
        <p>
          <disp-formula>
            <label>(19)</label>
            <tex-math>K_{t} = P_{t}^{-} \cdot A^{T} \cdot \left(A \cdot P_{t}^{-} \cdot A^{T} + R\right)^{-1}</tex-math>
          </disp-formula>
        </p>
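        <p>The prediction and filtering recursion can be sketched for a scalar state, where F, B, A, Q and R are plain numbers and the matrix inverse in the Kalman gain becomes a division. The default parameter values, the synthetic input and the function name are assumptions for illustration only and are not the settings used in our experiments.</p>
        <preformat><![CDATA[
```python
import numpy as np


def kalman_filter_1d(y, F=1.0, B=0.0, u=0.0, A=1.0, Q=1e-5, R=0.01, P0=1.0):
    """Scalar Kalman filter returning the filtered estimate at every step."""
    x_hat = y[0]                  # initial state estimate (assumption)
    P = P0                        # initial state covariance (assumption)
    filtered = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Prediction phase: a priori state and covariance.
        x_prior = F * x_hat + B * u
        P_prior = F * P * F + Q
        # Kalman gain for the scalar case.
        K = P_prior * A / (A * P_prior * A + R)
        # Filtering phase: correct the prediction with the measurement.
        x_hat = x_prior + K * (y_t - A * x_prior)
        P = (1.0 - K * A) * P_prior
        filtered[t] = x_hat
    return filtered


rng = np.random.default_rng(0)
noisy = 5.0 + rng.normal(scale=0.5, size=200)   # synthetic noisy measurements
smooth = kalman_filter_1d(noisy)
```
]]></preformat>
        <p>A larger Q lets the estimate follow the measurements more closely, while a larger R makes the filter trust its own prediction more, which is why Q is treated as a tunable hyper-parameter.</p>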
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Methodology</title>
      <p>
        In this research, with the exception of our baseline ARIMA model, we relax the
stationarity condition for the financial time series in our neural network models
as in [
        <xref ref-type="bibr" rid="ref15 ref24">15,24</xref>
        ]. The implementation logic for our NN models was adapted from the
approach presented in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Data Preprocessing. The dataset used in this research was obtained from
Yahoo! Finance and covers 01-01-2004 to 31-12-2019 for the following stock indices:
NASDAQ 100 (NDX), S&amp;P 500, Euro Stoxx 50 and the Dow Jones Industrial
Average. The daily closing price series for each index was transformed into
1-D tensors composed of 100 successive daily values, mapping each independent
variable <inline-formula><tex-math>\hat{x} = (x_{t+1}, x_{t+2}, \ldots, x_{n})</tex-math></inline-formula> to the corresponding dependent variable <inline-formula><tex-math>\hat{y} = x_{n+1}</tex-math></inline-formula>.
The values of the independent variable were scaled using Min-Max
normalization to have values within the range [0, 1].</p>
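      <p>The windowing and scaling described above can be sketched as follows. The synthetic price series and the helper names are assumptions for illustration; real closing prices would replace the generated values.</p>
      <preformat><![CDATA[
```python
import numpy as np


def min_max_scale(series):
    """Rescale a series linearly so that its values lie in [0, 1]."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)


def make_windows(series, n_steps=100):
    """Map each run of n_steps successive values to the value that follows."""
    X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
    y = series[n_steps:]
    return X, y


closing = np.linspace(1000.0, 1500.0, 250)   # synthetic stand-in for prices
scaled = min_max_scale(closing)
X, y = make_windows(scaled, n_steps=100)     # X: (150, 100), y: (150,)
```
]]></preformat>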
      <sec id="sec-4-1">
        <title>Evaluation Models</title>
        <p>Baseline Model. ARIMA was selected as a baseline model whose result serves as
a benchmark to measure the improvement in forecast accuracy from other more
sophisticated models. The steps are as follows:
          <list list-type="simple">
            <list-item><p>Step 1: The dataset is split into a training set and a test set using a 70:30 ratio;</p></list-item>
            <list-item><p>Step 2: The hyper-parameters p, d and q are grid searched on the training set to produce the optimal model with the lowest Mean Squared Error (MSE);</p></list-item>
            <list-item><p>Step 3: Predictions from the ARIMA model are compared with the values in the test set and evaluated using the MSE.</p></list-item>
          </list>
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Wavelet Transform - Convolutional Neural Network (WT-CNN).</title>
        <p>
          The specification of this sequential CNN architecture is as follows: Layer 1 is
a 1-D convolutional layer with 3 filters and 3 kernels with the Rectified Linear
Unit (ReLU) as the activation function. Layer 2 is also a 1-D convolutional
layer that has identical hyper-parameters to the preceding layer. Layer 3 is
a 1-D maximum pooling layer followed by a 4th layer which flattens the
two-dimensional tensor into a vector fed into a fully connected layer, not unlike the
data model approach taken in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The loss function which the model optimizes
using the Adam Optimizer is the Mean Squared Error (MSE). The following
steps were followed for the implementation of the WT-CNN model:
          <list list-type="simple">
            <list-item><p>Step 1: The time series is de-noised by the Haar Wavelet with soft thresholding.</p></list-item>
            <list-item><p>Step 2: The de-noised time series is scaled to the range [0, 1]. This is to allow the CNN to converge faster.</p></list-item>
            <list-item><p>Step 3: The dataset is divided into a training and test set using a 70:30 split.</p></list-item>
            <list-item><p>Step 4: Both the training and test sets are converted to a supervised learning problem, with 100 past values representing the independent variable used to predict the next value in the sequence.</p></list-item>
            <list-item><p>Step 5: The training and test input tensors are reshaped to have the following dimensions: [samples, timesteps, features].</p></list-item>
            <list-item><p>Step 6: The input tensors are fed into the CNN and the network is trained over 100 epochs.</p></list-item>
            <list-item><p>Step 7: The predictions generated from the CNN are rescaled to their original range and then compared to the test set to generate values for the evaluation metrics.</p></list-item>
          </list>
        </p>
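        <p>Step 1 can be illustrated with a single-level Haar transform written directly in NumPy. This is a minimal sketch: the decomposition level, the threshold value and the requirement of an even-length input are assumptions of the example, not settings reported in this paper.</p>
        <preformat><![CDATA[
```python
import numpy as np


def haar_dwt(x):
    """Single-level Haar transform of an even-length series."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # smooth component
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-frequency component
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def soft_threshold(coeffs, thr):
    """Shrink coefficients towards zero by thr; smaller ones become zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)


def haar_denoise(x, thr):
    """De-noise by soft-thresholding the Haar detail coefficients."""
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, soft_threshold(detail, thr))


series = np.array([1.0, 3.0, 2.0, 6.0])          # made-up values
denoised = haar_denoise(series, thr=10.0)        # large threshold removes all detail
```
]]></preformat>
        <p>With a threshold of zero the series is reconstructed exactly, whereas a threshold larger than every detail coefficient replaces each pair of values by its mean.</p>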
      </sec>
      <sec id="sec-4-3">
        <title>WT-CNN-Long Short Term Memory (WT-CNN-LSTM).</title>
        <p>The specification of the sequential CNN-LSTM architecture is as follows: Layer 1 is a 1-D
convolutional layer with 64 filters and a kernel with the Rectified Linear Unit as
the activation function. Layer 2 is a 1-D maximum pooling layer with a pooling
size of 2, followed by a layer which flattens the two-dimensional tensor into a
vector fed into an LSTM layer with 50 neurons and a ReLU activation function. The
LSTM layer outputs a tensor to a fully connected layer. Once again, the MSE loss
function is optimized using the Adam Optimizer. The steps followed in the
implementation of the WT-CNN-LSTM model mirror those of the WT-CNN model,
with the major difference being the change in dimension of the input tensors
from [samples, timesteps] into [samples, subsequences, timesteps, features].</p>
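        <p>The tensor layouts mentioned for the different models can be illustrated with NumPy reshapes. The sample count and the split of the 100 time steps into 4 subsequences of 25 steps are assumptions for the example; the paper does not state the subsequence length.</p>
        <preformat><![CDATA[
```python
import numpy as np

n_samples, n_steps = 3, 100
# [samples, timesteps]: one row per window of 100 values.
X = np.arange(n_samples * n_steps, dtype=float).reshape(n_samples, n_steps)

# [samples, timesteps, features]: a single feature per time step.
X_cnn = X.reshape(n_samples, n_steps, 1)

# [samples, subsequences, timesteps, features]:
# each window is cut into 4 subsequences of 25 steps (assumption).
n_subseq, sub_steps = 4, 25
X_cnn_lstm = X.reshape(n_samples, n_subseq, sub_steps, 1)
```
]]></preformat>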
        <p>WT-LSTM. The specification of the sequential LSTM architecture is as
follows: the first three layers are LSTM layers with 50 neurons and the final
layer is a fully connected layer. The implementation logic of the WT-LSTM is
not too different from that of the previous models, the major difference being the
dimension of the input tensors: [samples, timesteps].</p>
        <p>Kalman Filter-LSTM (KF-LSTM). In this model, the input time series
is first passed into the Kalman filter as a 1-D tensor. The de-noising
process returns a 2-D tensor that is reshaped into a 1-D shape before it is fed
into an LSTM network.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <p>In this section, we present the results of our model implementations, evaluated
using the RMSE and MAE metrics obtained for each stock index. These
evaluation metrics measure the distance between predictions from the algorithm and
the actual values in the test set.</p>
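      <p>For reference, both metrics can be computed in a few lines of NumPy. The values below are made up for illustration and are not results from our experiments.</p>
      <preformat><![CDATA[
```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))


actual = [100.0, 102.0, 101.0, 105.0]        # made-up closing prices
predicted = [101.0, 101.0, 103.0, 105.0]     # made-up forecasts

error_rmse = rmse(actual, predicted)         # sqrt((1 + 1 + 4 + 0) / 4)
error_mae = mae(actual, predicted)           # (1 + 1 + 2 + 0) / 4
```
]]></preformat>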
      <p>Our experiments show LSTMs outperform other Neural Network
architectures in forecasting the univariate time series of stock indices. With the exception
of the NDX stock index, LSTMs improve the forecast performance of the One
Dimensional CNN by over 50% for each stock index. When compared with the
CNN-LSTM architecture, LSTMs marginally outperform in forecasting the time
series of the S&amp;P 500 and the Euro Stoxx50E index. There is a wider
gap between these two architectures, however, for the Dow Jones index and the
NDX index.</p>
      <p>From Table 2, we can see that the LSTM model configuration achieved
superior RMSE scores when compared to the CNN and CNN-LSTM for both the
S&amp;P500 and the STOXX50E stock index. The CNN network had the worst
performance in trying to predict all of the stock indices, with the exception of the
NDX. Interestingly, only the LSTM model for the STOXX50E outperformed
the baseline model, ARIMA.</p>
      <p>Using Table 3, we find similar patterns to those obtained in Table
2 with respect to LSTMs outperforming other models for stock indices such as
the S&amp;P 500 and DJIA. For the best-performing neural network models on stock
indices such as the STOXX50E and S&amp;P 500, wavelet transform delivered results
similar to those of the ARIMA model. The worst model performance came from the
CNN-LSTM model, with the exception of the STOXX50E dataset.</p>
      <p>In many tasks involving time series analysis, CNNs generally outperform LSTMs,
but this was not the case for the evaluated results of the time series forecasts
presented in Table 4. Our assumption is that this is due to the two-dimensional
architecture of the network that allows it to capture both spatial and temporal
information inherent in the time series.</p>
      <p>
        We found the results of using wavelets to remove noise in our univariate time series
to be inconclusive. On one hand, predictions using the Dow Jones index showed
a 20% average reduction in forecast error when de-noised using wavelet
transforms. However, predictions using the NDX and Stoxx50E indices showed no
substantial reduction in forecast error. We interpret this to mean that the application of Haar
wavelet decomposition may not be suitable for all types of financial time series, as
suggested by [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Indeed, each financial time series may require a unique mother
wavelet for its decomposition. With the exception of models that used Kalman
filtering, no model could consistently beat the (baseline) ARIMA model in
forecasting for each index. These results show that increasing model complexity
does not guarantee improved performance. Many studies show that statistical
forecasting techniques such as ARIMA can often outperform Neural Network
architectures [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Kalman filters reduced the forecast error of the LSTM on the
S&amp;P 500 and Stoxx50E by 68%. The forecast errors in the Dow Jones index
reduced by 53% while those in NDX reduced by 34%.
      </p>
      <p>In summary, we found the de-noising performance of Kalman Filters to
outperform that of wavelets for the financial time series in our evaluation. Our Neural Network
models used the previous N = 100 samples to predict the next price in the
sequence, with the choice of N being arbitrary.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>In this paper, we applied three deep learning algorithms, LSTM, CNN, and
CNN-LSTM, to forecast the univariate time series of four stock indices: S&amp;P 500, Dow
Jones Industrial Average, Euro Stoxx 50 and the Nasdaq Exchange. Initially,
we attempted to forecast the future prices of the stock indices using Neural
Networks; we then investigated the efficacy of Discrete Wavelet Transforms,
particularly the Haar Mother Wavelet, to de-noise the input financial time series;
and finally, we investigated the use of Kalman Filters and discovered better
performance when compared to the wavelet transform approach. Results were
evaluated using the RMSE and MAE metrics. While our evaluation does provide
support for ARIMA for time series forecasting, we believe that our results
using some de-noising techniques suggest that other approaches may outperform
ARIMA, given the appropriate experimental configurations.</p>
      <p>Our current work is focused on the exploration of other input features to
enhance the performance of the neural network models: configuring the type of
mother wavelet applied, decomposition level and window size may well deliver
improved performance. We are also seeking to investigate the optimal
configurations of DWT for more frequent observations of these time series.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Abe</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Nakayama</surname>
          </string-name>
          ,
          <article-title>"Deep Learning for Forecasting Stock Returns in the Cross-Section"</article-title>
          ,
          <source>ArXiv</source>
          ,
          <year>2018</year>
          .
          [Accessed 14 August 2020].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Ariyo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Adewumi</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Ayo</surname>
          </string-name>
          ,
          <article-title>"Stock Price Prediction Using the ARIMA Model"</article-title>
          ,
          <source>2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation</source>
          ,
          <year>2014</year>
          . Available:
          <pub-id pub-id-type="doi">10.1109/uksim.2014.67</pub-id>
          [Accessed 1 August 2020].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Ken</given-names>
            <surname>Bailey</surname>
          </string-name>
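          ,
          <string-name>
            <given-names>Mark</given-names>
            <surname>Roantree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Crane</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCarren</surname>
          </string-name>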
          .
          <article-title>Data Mining in Agri Warehouses Using MODWT Wavelet Analysis</article-title>
          .
          <source>23rd Intl. Conf. on Information and Software Technologies</source>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>253</lpage>
          , Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>W.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yue</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <article-title>"A deep learning framework for financial time series using stacked autoencoders and long-short term memory"</article-title>
          ,
          <source>PLOS ONE</source>
          , vol.
          <volume>12</volume>
          , no.
          <issue>7</issue>
          , p.
          <fpage>e0180944</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>J.</given-names>
            <surname>Brownlee</surname>
          </string-name>
          ,
          <source>Deep Learning for Time Series Forecasting</source>
          , 1st edn.
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>B.</given-names>
            <surname>Comincioli</surname>
          </string-name>
          ,
          <article-title>"The Stock Market As A Leading Indicator: An Application Of Granger Causality"</article-title>
          ,
          <source>University Avenue Undergraduate Journal of Economics</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>1</issue>
          ,
          <year>1996</year>
          . Available: https://digitalcommons.iwu.edu/uauje/vol1/iss1/1.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Dghais</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Ismail</surname>
          </string-name>
          ,
          <article-title>"A study of stationarity in time series by using wavelet transform"</article-title>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>K.</given-names>
            <surname>Erkan</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Bolat</surname>
          </string-name>
          ,
          <article-title>"Comparison of Kalman Filter and Wavelet Filter for Denoising"</article-title>
          ,
          <source>2005 International Conference on Neural Networks and Brain</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>T.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chai</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <article-title>"Deep learning with stock indicators and two-dimensional principal component analysis for closing price prediction system"</article-title>
          ,
          <source>2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Geron</surname>
          </string-name>
          ,
          <source>Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow</source>
          , 2nd edn.
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
          ,
          <publisher-loc>Sebastopol</publisher-loc>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>H.</given-names>
            <surname>Gunduz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yaslan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cataltepe</surname>
          </string-name>
          ,
          <article-title>"Intraday prediction of Borsa Istanbul using convolutional neural networks and feature correlations"</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          , vol.
          <volume>137</volume>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>148</lpage>
          ,
          <year>2017</year>
          . Available: 10.1016/j.knosys.2017.09.023.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Habela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mark</given-names>
            <surname>Roantree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kazimierz</given-names>
            <surname>Subieta</surname>
          </string-name>
          .
          <article-title>Flattening the metamodel for object databases</article-title>
          .
          <source>Proceedings of ADBIS, Lecture Notes in Computer Science</source>
          vol.
          <volume>2435</volume>
          , pages
          <fpage>263</fpage>
          -
          <lpage>276</lpage>
          , Springer,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>B.</given-names>
            <surname>Henrique</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sobreiro</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Kimura</surname>
          </string-name>
          ,
          <article-title>"Literature review: Machine learning techniques applied to financial market prediction"</article-title>
          ,
          <source>Expert Systems with Applications</source>
          , vol.
          <volume>124</volume>
          , pp.
          <fpage>226</fpage>
          -
          <lpage>251</lpage>
          ,
          <year>2019</year>
          . Available: 10.1016/j.eswa.2019.01.012.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>"Long Short-Term Memory"</article-title>
          ,
          <source>Neural Computation</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>8</issue>
          , pp.
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          ,
          <year>1997</year>
          . Available: 10.1162/neco.1997.9.8.1735.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>R.</given-names>
            <surname>Kalman</surname>
          </string-name>
          ,
          <article-title>"A New Approach to Linear Filtering and Prediction Problems"</article-title>
          ,
          <source>Journal of Basic Engineering</source>
          , vol.
          <volume>82</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>45</lpage>
          ,
          <year>1960</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>A.</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <article-title>"ImageNet classification with deep convolutional neural networks"</article-title>
          ,
          <source>Communications of the ACM</source>
          , vol.
          <volume>60</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tam</surname>
          </string-name>
          ,
          <article-title>"Combining the real-time wavelet denoising and long-short-term-memory neural network for predicting stock indexes"</article-title>
          ,
          <source>2017 IEEE Symposium Series on Computational Intelligence (SSCI)</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>F.</given-names>
            <surname>Lima</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Neto</surname>
          </string-name>
          ,
          <article-title>"Combining Wavelet and Kalman Filters for Financial Time Series Forecasting"</article-title>
          ,
          <source>Asian Economic and Financial Review</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>J.</given-names>
            <surname>Ma</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Teng</surname>
          </string-name>
          ,
          <article-title>"Predict chaotic time-series using unscented Kalman filter"</article-title>
          ,
          <source>Proceedings of 2004 International Conference on Machine Learning and Cybernetics (IEEE Cat. No.04EX826)</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>A.</given-names>
            <surname>Moghar</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hamiche</surname>
          </string-name>
          ,
          <article-title>"Stock Market Prediction Using LSTM Recurrent Neural Network"</article-title>
          ,
          <source>Procedia Computer Science</source>
          , vol.
          <volume>170</volume>
          , pp.
          <fpage>1168</fpage>
          -
          <lpage>1173</lpage>
          ,
          <year>2020</year>
          . Available: 10.1016/j.procs.2020.03.049.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>A.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <source>Practical Time Series Analysis</source>
          , 1st edn.
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
          ,
          <publisher-loc>Sebastopol</publisher-loc>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>R.</given-names>
            <surname>Polikar</surname>
          </string-name>
          ,
          <source>Wavelet Tutorial - Part 2</source>
          , http://users.rowan.edu/~polikar/WTpart2.html. Last accessed 01 Aug 2020.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>J.</given-names>
            <surname>Rankin</surname>
          </string-name>
          ,
          <article-title>"Kalman filtering approach to market price forecasting."</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>M.</given-names>
            <surname>Rhif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ben Abbes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Farah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Martínez</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sang</surname>
          </string-name>
          ,
          <article-title>"Wavelet Transform Application for/in Non-Stationary Time-Series Analysis: A Review"</article-title>
          ,
          <source>Applied Sciences</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>7</issue>
          , p.
          <fpage>1345</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>