=Paper= {{Paper |id=Vol-2698/paper10 |storemode=property |title=Deep Learning Methods Application in Finance: A Review of State of Art |pdfUrl=https://ceur-ws.org/Vol-2698/p10.pdf |volume=Vol-2698 |authors=Dovil̇e Kuizinieṅe,Tomas Krilavičius |dblpUrl=https://dblp.org/rec/conf/ivus/KuizinieneK20 }} ==Deep Learning Methods Application in Finance: A Review of State of Art== https://ceur-ws.org/Vol-2698/p10.pdf
Deep Learning Methods Application in Finance: A
Review of State of Art
Dovilė Kuizinienė a, Tomas Krilavičius a
a Department of Applied Informatics, Vytautas Magnus University, Kaunas, Lithuania



Abstract
The use of artificial intelligence in financial markets and business units gives rise to financial innovations. These innovations are a key indicator of economic growth and of the formation of an intelligent finance system. In recent years, scientists and the most innovation-driven companies, such as Google, IBM, Microsoft and others, have been focusing on deep learning methods. These methods have achieved significant performance in diverse areas: image recognition, natural language processing, speech recognition, video processing, etc. Therefore, it is necessary to understand the variety of deep learning methods first, and only then their applicability in the financial field. Accordingly, this paper first presents the differences between the deep learning architectures already established in the scientific community. Second, it gives a big picture of the scientific articles on the use of deep learning in finance and identifies the most used deep learning methods. Finally, the conclusions, limitations and future work are presented.

Keywords
Artificial intelligence, Machine Learning, Deep Learning, Convolution Neural Network, Deep Belief Network, Deep Boltzmann Machine, Deep neural network, Deep Q-Learning, Deep reinforcement learning, The extreme learning machine, Generative adversarial network, Recurrent Neural Learning, Long short-term memory, Gated Recurrent Unit, Finance, Financial innovations


1. Introduction

The global financial industry is quietly changing under the catalysis of artificial intelligence (AI) [1]. AI represents a clear opportunity to advance the transformation of the finance industry by providing users with greater value and increasing firms' revenues [2]. The goal of AI is to invent a machine which can sense, remember, learn, and recognize like a real human being [3]. The deep integration of AI technology and finance is the inevitable result of deepening development and exploring innovation in these fields [1]. These innovations have the potential to directly influence both the production and the characteristics of a wide range of products and services, with important implications for productivity, employment, and competition [4]. AI also improves work efficiency in business and creates a whole process of intelligent finance [1]. Applications of AI systems are generally viewed as positive for economic growth and productivity [2].

Deep learning is a recently developed field belonging to artificial intelligence [3]. It attempts to learn hierarchical representations from raw data and is capable of learning simple concepts first and then successfully building up more complex concepts by merging the simpler ones [5, 6, 7]. Companies such as Google, Facebook, IBM, Microsoft and others use these methods for developing next-generation intelligent applications [8].

In finance there are two major problems: 1) to predict future returns (e.g. stock prices, currencies, indices, product demand); or 2) to make a categorical classification (e.g. credit scoring ("good", "bad"), bankruptcy ("True", "False")). While the issues in finance have remained almost the same over the last several decades, novel methods and the growing amount of data are changing the field, especially machine learning and artificial intelligence techniques [9]. Furthermore, exploiting additional data sources makes it possible to achieve better results: satellite images can be used for predicting economic activity, voice recordings provide information about emotions, and textual information extracted from news and comments gives the sentiments of writers and audience [10]. However, extracting useful knowledge from such a data heap is not trivial and requires considerable effort [11, 12]. Portfolio management tasks pose more challenges, because there are two main issues with portfolio formation: (1) selection of the assets with the highest revenue, and (2) determination of the value composition of the assets in the portfolio to achieve the goal of maximal potential returns with minimal risk [13].

This paper is divided into two parts: 1) different deep learning architectures are discussed; 2) the application of the aforementioned methods in finance is discussed.

IVUS 2020: Information Society and University Studies, 23 April 2020, KTU Santaka Valley, Kaunas, Lithuania
dovile.kuiziniene@vdu.lt (D. Kuizinienė); tomas.krilavicius@vdu.lt (T. Krilavičius)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

2. Literature review

The term "artificial intelligence" is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving" [14]. In other words, it tries to mimic the human brain, which is capable of processing complex input data, learning different kinds of knowledge intelligently and fast, and solving different kinds of complicated tasks well [3].

AI has long been part of human thought and has been slowly evolving in academic research labs [14]. Machine learning is a subset of AI. Machine learning is the study of computer algorithms that can be improved automatically through experience [1]. Machine learning algorithms overcome the limits of following strictly static program instructions by making data-driven predictions or decisions, building a model from sample inputs [14]. In machine learning, artificial neural networks are a family of models that mimic the structural elegance of the neural system and learn patterns inherent in observations [15], see Fig. 1. The term "deep" refers to the number of layers in the network: the more layers, the deeper the network [16]. Traditional neural networks contain only 2 or 3 layers, while deep networks can have hundreds [16]. Deep learning is developing explosively today. Compared with shallow learning, deep learning reaches the state of the art in many research areas [17].

In contrast to shallow architectures like kernel machines, which contain only a fixed feature layer (or base function) and a weight-combination layer (usually linear), deep architectures refer to multi-layer networks where each two adjacent layers are connected to each other in some way [3]. This introduces unprecedented flexibility to model even highly complex, non-linear relationships between predictor and outcome variables, a quality that has allowed deep neural networks to outperform models from traditional machine learning in a variety of tasks [18]. Deep learning methods have only now become so powerful for technical reasons: computational power (hardware), availability of large datasets and optimization algorithms [18], [19].

2.1. Convolution Neural Network

The convolution neural network (CNN) algorithm is separated into two main parts: feature detection and classification (see Fig. 2).

The feature detection phase consists of convolution, pooling and rectified linear unit (ReLU) layers. Convolutional filters activate certain features of a data unit (image, video, time series). This layer produces a huge number of features, which causes overfitting problems and expensive computation [8]. Pooling layers reduce this problem by aggregating multiple feature values into a single value. Max-pooling is the most commonly used pooling operation; in Keras, average-pooling, global-max-pooling or global-average-pooling operations can be used instead [20]. The rectified linear unit (ReLU) is an activation function meant to zero out negative values, whereas a sigmoid "squashes" arbitrary values into the interval [0, 1], producing something that can be interpreted as a probability [19].

These three operations are repeated over tens or hundreds of layers, with each layer learning to detect different features [16]. The classification phase consists of two layers: dropout and fully connected. Dropout consists of randomly dropping out (setting to zero) a number of output features of the layer during training [19].

The fully connected layer produces a vector of K dimensions, where K is the number of classes that the network will be able to predict. This vector contains the probabilities for each class of any image being classified [16, 21]. The quality of the model is evaluated by a cost function applied to the output of the fully connected layer, whose activation is sigmoid, softmax or another function.

2.2. Deep Belief Network

The power of a Deep Belief Network (DBN) (Fig. 3 and Fig. 4) lies in its ability to reconstruct both the input vector and the learned feature vectors, which is implemented using a layer-by-layer learning strategy [22]. Each layer of a DBN consists of a Restricted Boltzmann Machine (RBM). An RBM follows the principle of a probability distribution to complete its learning cycle [23]. Each RBM consists of a visible layer (v) and a hidden layer (h). The number of neurons is set for each layer. The neurons of different layers are fully connected, and the neurons in the same layer are not connected [23]. When an RBM has learned, its feature activations are used as the "data" for training the next RBM in the DBN [24]. An RBM is an unsupervised network which treats the visible layer together with the hidden layer as a subnetwork. This hidden layer is then treated as the visible layer for the next layer, and so on [24]. The higher-level features are learned from the previous layers and are believed to be more complicated and to better reflect the information contained in the structure of the input data [3]. DBN training is divided into two steps: a forward pre-training process and a reverse fine-tuning process [25]. During the pre-training phase, the RBMs are



trained one-by-one until the hidden layer of the last RBM. During this phase, the parameters of each RBM can be obtained [23]. The back-propagation network (BP) is set in the last hidden layer of the DBN [25]. BP is applied to fine-tune the parameters using the output labels of the sample data [23].

Figure 1: The connection of AI, ML and DL

Figure 2: Convolution neural network architecture.

Figure 3: Deep Belief Network architecture.

Figure 4: Deep Belief Network architecture

2.3. Deep Boltzmann Machine

A Deep Boltzmann Machine (DBM) has only one undirected network [24]. A DBM, like a DBN, is composed of Restricted Boltzmann Machines (RBM). The main difference is related to the interaction among the layers of RBMs [25]. For the computation of the conditional probability of the hidden units h1, both the lower visible layer v and the upper hidden layer h2 are incorporated, which differentiates the DBM from the DBN and also makes it more robust to noisy observations [15]. There are no direct connections between the units in the same layer. The DBM parameters of all layers can be optimized jointly by following the approximate gradient of a variational lower bound on the likelihood objective [26].

Unlike the DBN, the DBM can incorporate top-down feedback, which can better propagate uncertainty and hence deal with ambiguous input more robustly [27].

2.4. Deep Neural Network

Because the concept is new, the term deep neural network (DNN) (Fig. 5) can be found in the scientific literature applied to all the algorithms analyzed in this paper. However, in recent years DNN has come to mean an artificial neural network (ANN) with hidden layers [9], [28]. A DNN is typically a feedforward network, so it can be understood as a Multilayer Perceptron (MLP or MP). An MLP consists of an input layer, several hidden layers and one output layer, and it is widely used for pattern classification, recognition and prediction [29].

2.5. Deep Q-Learning or Deep Reinforcement Learning

The terms deep Q-learning (DQL) and deep reinforcement learning (DRL) are used interchangeably in the scientific literature (Fig. 6). DQL always uses a reinforcement learning algorithm, and DRL often uses a Q-learning function, because it can deal with high-dimensional state space inputs [30], [31]. A reinforcement learning (RL) process involves an agent learning from interactions with its environment in discrete time steps in order to update its mapping between the perceived state and a probability of selecting possible actions (policy) [32]. In other words, RL is commonly used to solve a sequential decision making problem [30]. The RL problem is normally formalized using the Markov decision process (MDP) and includes a set of states S, a set of actions A, a transition function T as action distributions, a reward function R and a discount factor γ [33]. The solution to the MDP is a policy π : S → A, and the policy should maximize the expected discounted cumulative reward [30]. Q-learning, as a typical reinforcement learning approach, mimics human behavior by taking actions in the environment in order to obtain the maximum long-term reward [34]. The DQL process can be viewed as iteratively optimizing the network parameters according to the gradient direction of the loss function at each stage [35]. Therefore, an inexact approximate gradient estimate with a large variance can largely deteriorate the representation performance of the deep Q network by driving the network parameters away from the optimal setting, causing large variability of DQL performance [35]. The advantages of deep Q-learning are good results and ease of use (the code can be modified easily for different physical problems) [36].

2.6. The Extreme Learning Machine

The extreme learning machine (ELM) is a single-hidden-layer feedforward network, proposed by Huang in 2012. In a traditional feed-forward ANN, the training of the network is iterative, while in the ELM the process is transformed into an analytical equation [37]. In the ELM the weights between the input and hidden layers are assigned randomly following a normal distribution, and the weights between the hidden and output layers are learnt in a single step by a pseudo-inverse technique. During training, the hidden layer is not learned, but the weight matrix of the output layer is obtained by solving the optimization problem formulated by some learning criteria and regularizations [38]; as shown in the theory, the output weights are solved from a regularized least squares problem [39]. Therefore, the ELM offers benefits such as fast learning speed, ease of implementation, and less human intervention when compared to standard neural networks [40].
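
The single-step training described for the ELM can be illustrated with a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the implementation from the cited works: the function names (`solve_linear`, `train_elm`, `predict_elm`), the tanh activation, the scalar input, the ridge value and the toy sine data are all hypothetical choices. The hidden weights are drawn once from a normal distribution and never updated; only the output weights are computed, by solving the regularized least squares problem in closed form.

```python
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def hidden_features(x, W, b):
    # Hidden-layer activations for one scalar input (tanh is an assumption).
    return [math.tanh(w * x + bi) for w, bi in zip(W, b)]

def train_elm(xs, ys, n_hidden=20, ridge=1e-4, seed=0):
    rng = random.Random(seed)
    W = [rng.gauss(0, 1) for _ in range(n_hidden)]  # random, never trained
    b = [rng.gauss(0, 1) for _ in range(n_hidden)]
    H = [hidden_features(x, W, b) for x in xs]
    # Regularized least squares: (H^T H + ridge * I) beta = H^T y
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * y for h, y in zip(H, ys)) for i in range(n_hidden)]
    return W, b, solve_linear(A, rhs)

def predict_elm(x, W, b, beta):
    return sum(bj * hj for bj, hj in zip(beta, hidden_features(x, W, b)))

# Toy regression task (assumed data): fit y = sin(3x) on [-1, 1].
xs = [i / 50 - 1 for i in range(101)]
ys = [math.sin(3 * x) for x in xs]
W, b, beta = train_elm(xs, ys)
mse = sum((predict_elm(x, W, b, beta) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

There is no training loop over epochs: the only "learning" is one linear solve, which is where the speed advantage mentioned above comes from.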

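Returning to Section 2.5, the Q-learning update that underlies DQL can be sketched on a toy MDP. This is tabular Q-learning, a deliberate simplification: in deep Q-learning the table `Q` would be replaced by a neural network trained on the same target. The corridor environment, the `step` function and the values of `GAMMA`, `ALPHA` and `EPSILON` are assumptions made for illustration only.

```python
import random

# Hypothetical toy MDP: a corridor of 5 states; reaching state 4 pays reward 1.
N_STATES = 5
ACTIONS = (-1, +1)          # step left / step right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2

def step(state, action):
    """Deterministic transition function T and reward function R."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

rng = random.Random(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                      # episodes
    s, done = 0, False
    while not done:
        if rng.random() < EPSILON:        # explore
            a = rng.choice(ACTIONS)
        else:                             # exploit the current estimate
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Target: immediate reward plus discounted best value of the next state.
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# Greedy policy pi: S -> A read off the learned action values.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)}
```

The learned policy moves right in every state, and the action values decay with distance from the goal by the discount factor γ, which is the "expected discounted cumulative reward" the policy is meant to maximize.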
2.7. Generative Adversarial Network

The general idea of a Generative adversarial network (GAN) is to train a generator to reconstruct high-resolution images that fool a discriminator, which is trained to distinguish generated images from real ones [41] (Fig. 8). This idea involves two competing neural network models: one of them takes noise as input and produces samples (the generator), and the other model (the discriminator) accepts both the data output by the generator and the real data, and separates their sources [42]. The discriminator trains itself to better discriminate real data from generated data, while the generator trains itself to fit the real data distribution so as to fool the discriminator [43]. These two neural networks are trained at the same time, and finally the output is almost the same as the real data [44].

2.8. Recurrent Neural Learning

Recurrent Neural Learning (RNN) (Fig. 9) differs from traditional feedforward neural networks because it has feedback connections, which can be between hidden units or from the output to the hidden units [44, 45]. These connections address the temporal relationship of inputs by maintaining internal states that have memory. An RNN is able to process sequential inputs by having a recurrent hidden state whose activation at each step depends on that of the previous step [5, 46]. In other words, an RNN not only processes the current element in the sequence, but also



draws upon the hidden layer of the previous element in the sequence [18]. For example, the states produced by an RNN at time t-1 will have some impact on the states produced by the RNN at time t [17]. Hidden units can be regarded as the storage of the whole network, which remembers the end-to-end information [47].

However, it has been observed that it is difficult to train RNNs to deal with long-term sequential data, as the gradients tend to vanish [5].

Figure 5: Deep neural network architecture

Figure 6: Deep Q-Learning architecture

Figure 7: The extreme learning machine architecture

Figure 8: Generative adversarial network architecture

Figure 9: Recurrent Neural Learning architecture

Figure 10: Long short-term memory architecture.

Figure 11: The extreme learning machine architecture

2.9. Long short-term memory

Long short-term memory (LSTM) (Fig. 10) is described in the literature as one of the classes [13], an advanced form or an extension [48] of the RNN. The main advantage of LSTM compared with the RNN is its capability to learn longer dependencies in data [49]. Information is processed sequentially in an LSTM, but there is a memory cell which remembers and forgets information [48]. Each memory cell has three multiplication units: an input gate, an output gate and a forget gate, which control the flow of information [50]. The input gate determines how much current information should be treated as input in order to generate the current state [51], whilst the forget gate determines which information is to be forgotten from the memory state [52]. Finally, the output gate filters the information that can actually be treated as significant and produces the output [52]. The "gate" structure is implemented using the sigmoid function, which denotes how much information is allowed to pass. For one hidden layer in an LSTM, the activation function is used in forward propagation, and the gradient is used in backward propagation [38].
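
The interplay of the three gates can be made concrete with a minimal sketch of one LSTM time step for scalar inputs and states, using only the Python standard library. The function name `lstm_step`, the parameter layout in `params` and all numeric weights are hypothetical and chosen for illustration; the update equations follow the commonly used LSTM formulation, which is an assumption insofar as the text above does not spell them out.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre("input"))        # how much new information enters the cell
    f = sigmoid(pre("forget"))       # how much of the old cell state is kept
    o = sigmoid(pre("output"))       # how much of the cell state is exposed
    g = math.tanh(pre("candidate"))  # candidate content for the memory cell
    c = f * c_prev + i * g           # updated memory cell
    h = o * math.tanh(c)             # new hidden state (the cell's output)
    return h, c

# Assumed example weights, not taken from any cited model.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.8, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [0.5, -0.2, 0.1]:           # process a short sequence step by step
    h, c = lstm_step(x, h, c, params)
```

Because each gate is a sigmoid, its value lies between 0 and 1: a forget gate near 1 with an input gate near 0 carries the old cell state forward almost unchanged, which is how the cell preserves information over long sequences.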

2.10. Gated recurrent unit

The Gated Recurrent Unit (GRU) aims to solve the vanishing gradient problem which comes with a standard RNN [53]. A GRU consists of two gates: an update gate (zt) and a reset gate (rt). The update gate decides how much the unit updates its activation, or content, and the reset gate allows the unit to forget the previously computed state [54]. The GRU is less complex than the LSTM: it does not possess any internal memory or an output gate like the LSTM [49].

3. Application of Deep Learning Methods

Articles were included from electronic libraries: Science Direct, IEEE, Scopus, ACM, Emerald, SpringerLink, JSTOR, EBSCO and others.

The analyzed period runs from 2017 to 2020. The review was conducted in January 2020. The keywords "Deep learning" and "Finance" were used for the selection of articles. All methods presented in this review match the term "Deep learning", so a separate search for each method was not carried out. The same holds for the term "Finance", which includes accounting, financial markets, risks, etc. Therefore, this paper presents a big picture of the scientific articles being developed in the category of deep learning in finance. 33 papers were selected and analyzed. The analyzed articles can be categorized by the type of task: to predict future returns or to classify results. Sometimes natural language processing algorithms are used to obtain better results (Fig. 12 and Tab. 1).

Figure 12: Categorical classification of analyzed articles

Classification algorithms in finance have most often been applied to credit scoring, which divides loans into "good" and "bad". To solve this problem, authors have used DBN [29], modified LSTM [52] and CNN [55] networks. The results cannot be compared because different classifier evaluation methods and different data sources were used. A big problem in credit scoring is the unbalanced data set: e.g. the authors of [55] used a data set in which creditworthy instances made up 91.55% and the CNN accuracy rate was 91.64%. CNN networks were used for bankruptcy and investment market structure, and a DQL network for tax evasion. Articles in the financial field are interested in obtaining knowledge from words and using it as an indicator. Therefore, there is a trend towards using natural language processing techniques. The goal of natural language processing (NLP) is to process text using computational linguistics, text analysis, machine learning, and statistical and linguistic knowledge in order to analyze it and extract significant information [56]. Researchers in the financial field use sentiment analysis for better stock price prediction or bankruptcy classification. Sentiment analysis is an essential task for NLP, and it can be divided into three categories: lexicon-based sentiment analysis, machine learning-based sentiment analysis and the hybrid approach [56].

Lexicon-based sentiment analysis was used in only one article [11], due to the need for an opinion lexicon in this field. Machine learning-based sentiment analysis uses the bag-of-words method [48], [57] and word embeddings [48, 58, 57, 10] with CNN [58, 57] and LSTM [48, 10] methods.

In [4], bag-of-words and word embedding methods were used for LSTM; the results showed that LSTM models can outperform all traditional machine learning models based on the bag-of-words approach, especially when the word embeddings are further pre-trained with transfer learning. The main focus of financial articles is the prediction of future returns, especially stock prices or stock indexes. The main reason is the availability of data sources for scientific research. In this field, researchers very often combine different methods



together [49, 59, 60] or make some model modifications [13, 50, 61, 62] for better prediction results. Some authors [48, 63] analyze the results of several different deep learning models with a view to developing future models further, see Fig. 13.

The most popular methods are CNN and LSTM. However, no application of the DBM and GAN methods in the finance field was found.

In some papers the data is not normalized, e.g. cryptocurrency prices [51] or demand [18]. Therefore, predictive accuracy measurements, such as RMSE, MPE and others, cannot be compared with the work of other authors, or sometimes even within the same paper: e.g. the RMSE for Bitcoin is 2.75×10^3 and for Ripple 0.0499 [51].

Table 1: Detailed topics from Finance perspective

Figure 13: Use of deep learning methods in financial context.

4. Conclusions

[...] learning machine, generative adversarial network, recurrent neural learning, long short-term memory, gated recurrent unit; and their applicability in the finance field. This review reveals that financial articles:
   1. mainly focus on the forecasting task rather than on classification;
   2. have started using natural language processing techniques, mostly sentiment analysis, for better prediction results;
   3. do not use the 'basic' deep learning methods, i.e. the methods are often combined with several different models or merged into a voting classifier.

Furthermore, this analysis has shown the importance of a balanced data set and of normalizing the data that is submitted to deep learning networks.

The main limitation of this work is that it presents only a big picture of the scientific articles being developed in the category of deep learning in finance. Therefore, future research needs to extend the search keywords used in electronic libraries, i.e. to search by each method.

References

 [1] C. Huang, X. Wang, Financial innovation based on artificial intelligence technologies, in: Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science, 2019, pp. 750–754.



[2] P. Yeoh, Artificial intelligence: accelerator or panacea for financial crime?, Journal of Financial Crime (2019).
[3] D. Mo, A survey on deep learning: one small step toward AI, Dept. Computer Science, Univ. of New Mexico, USA (2012).
[4] I. M. Cockburn, R. Henderson, S. Stern, The impact of artificial intelligence on innovation, Technical Report, National Bureau of Economic Research, 2018.
[5] L. Mou, P. Ghamisi, X. X. Zhu, Deep recurrent neural networks for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing 55 (2017) 3639–3655.
[6] F. Beritelli, G. Capizzi, G. Lo Sciuto, C. Napoli, M. Woźniak, A novel training method to preserve generalization of RBPNN classifiers applied to ECG signals diagnosis, Neural Networks 108 (2018) 331–338.
[7] F. Beritelli, G. Capizzi, G. Lo Sciuto, C. Napoli, F. Scaglione, Rainfall estimation based on the intensity of the received signal in a LTE/4G mobile terminal by using a probabilistic neural network, IEEE Access 6 (2018) 30865–30873.
[8] J. Hearty, Advanced Machine Learning with Python, Packt Publishing Ltd, 2016.
[9] O. Lachiheb, M. S. Gouider, A hierarchical deep neural network design for stock returns prediction, Procedia Computer Science 126 (2018) 264–272.
[10] D. Katayama, Y. Kino, K. Tsuda, A method of sentiment polarity identification in financial news using deep learning, Procedia Computer Science 159 (2019) 1287–1294.
[11] M.-Y. Day, C.-C. Lee, Deep learning for financial sentiment analysis on finance news providers, in: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), IEEE, 2016, pp. 1127–1134.
[12] G. Capizzi, G. Lo Sciuto, C. Napoli, D. Polap, M. Woźniak, Small lung nodules detection based on fuzzy-logic and probabilistic neural network with bioinspired reinforcement learning, IEEE Transactions on Fuzzy Systems 28 (2020).
[13] W. Wang, W. Li, N. Zhang, K. Liu, Portfolio formation with preselection using deep learning from long-term financial data, Expert Systems with Applications 143 (2020) 11–42.
[14] P. Ongsulee, Artificial intelligence, machine learning and deep learning, in: 2017 15th International Conference on ICT and Knowledge Engineering (ICT&KE), IEEE, 2017, pp. 1–6.
[15] H.-I. Suk, An introduction to neural networks and deep learning, in: Deep Learning for Medical Image Analysis, Elsevier, 2017, pp. 3–24.
[16] MathWorks, Introducing Deep Learning with MATLAB, MathWorks, 2020.
[17] C. Zheng, S. Wang, Y. Liu, C. Liu, A novel RNN based load modelling method with measurement data in active distribution system, Electric Power Systems Research 166 (2019) 112–124.
[18] M. Kraus, S. Feuerriegel, A. Oztekin, Deep learning in business analytics and operations research: Models, applications and managerial implications, European Journal of Operational Research 281 (2020) 628–641.
[19] F. Chollet, Deep Learning mit Python und Keras: Das Praxis-Handbuch vom Entwickler der Keras-Bibliothek, MITP-Verlags GmbH & Co. KG, 2018.
[20] Keras, Guide to the Sequential model - Keras Documentation, Keras, 2020.
[21] G. Capizzi, G. Lo Sciuto, P. Monforte, C. Napoli, Cascade feed forward neural network-based model for air pollutants evaluation of single monitoring stations in urban areas, International Journal of Electronics and Telecommunications 61 (2015) 327–332.
[22] D. Saif, S. El-Gokhy, E. Sallam, Deep belief networks-based framework for malware detection in Android systems, Alexandria Engineering Journal 57 (2018) 4049–4057.
[23] N. Balakrishnan, A. Rajendran, D. Pelusi, V. Ponnusamy, Deep belief network enhanced intrusion detection system to prevent security breach in the Internet of Things, Internet of Things (2019) 100112.
[24] J. Karhunen, T. Raiko, K. Cho, Unsupervised deep learning: A short review, in: Advances in Independent Component Analysis and Learning Machines, Elsevier, 2015, pp. 125–142.
[25] C. Fan, C. Ding, J. Zheng, L. Xiao, Z. Ai, Empirical mode decomposition based multi-objective deep belief network for short-term power load forecasting, Neurocomputing 388 (2020) 110–123.
[26] N. Srivastava, R. R. Salakhutdinov, Multimodal learning with deep Boltzmann machines, in: Advances in Neural Information Processing Systems, 2012, pp. 2222–2230.
[27] S. Wang, M. He, Z. Gao, S. He, Q. Ji, Emotion recognition from thermal infrared images using deep Boltzmann machine, Frontiers of Computer Science 8 (2014) 609–618.



[28] K. Akyol, Comparing of deep neural networks and extreme learning machines based on growing and pruning approach, Expert Systems with Applications 140 (2020) 112875.
[29] C. Luo, D. Wu, D. Wu, A deep learning approach for credit scoring using credit default swaps, Engineering Applications of Artificial Intelligence 65 (2017) 465–470.
[30] W. Huang, Y. Wang, X. Yi, Deep Q-learning to preserve connectivity in multi-robot systems, in: Proceedings of the 9th International Conference on Signal Processing Systems, 2017, pp. 45–50.
[31] Matta, Cardarilli, Di Nunzio, Fazzolari, Giardino, Nannarelli, Re, Spano, A reinforcement learning-based QAM/PSK symbol synchronizer, IEEE Access 7 (2019) 124147–124157.
[32] M. Ramicic, A. Bonarini, Attention-based experience replay in deep Q-learning, in: Proceedings of the 9th International Conference on Machine Learning and Computing, 2017, pp. 476–481.
[33] H. Shen, H. Hashimoto, A. Matsuda, Y. Taniguchi, D. Terada, C. Guo, Automatic collision avoidance of multiple ships based on deep Q-learning, Applied Ocean Research 86 (2019) 268–288.
[34] C. Qiu, F. R. Yu, F. Xu, H. Yao, C. Zhao, Blockchain-based distributed software-defined vehicular networks via deep Q-learning, in: Proceedings of the 8th ACM Symposium on Design and Analysis of Intelligent Vehicular Networks and Applications, 2018, pp. 8–14.
[35] W.-Y. Zhao, X.-Y. Guan, Y. Liu, X. Zhao, J. Peng, Stochastic variance reduction for deep Q-learning, arXiv preprint arXiv:1905.08152 (2019).
[36] I. Sajedian, H. Lee, J. Rho, Design of high transmission color filters for solar cells directed by deep Q-learning, Solar Energy 195 (2020) 670–676.
[37] B. Çil, H. Ayyıldız, T. Tuncer, Discrimination of β-thalassemia and iron deficiency anemia through extreme learning machine and regularized extreme learning machine based decision support system, Medical Hypotheses 138 (2020) 109611.
[38] J. Chen, Y. Zeng, Y. Li, G.-B. Huang, Unsupervised feature selection based extreme learning machine for clustering, Neurocomputing 386 (2020) 198–207.
[39] H. Li, X. Yang, Y. Li, L.-Y. Hao, T.-L. Zhang, Evolutionary extreme learning machine with sparse cost matrix for imbalanced learning, ISA Transactions 100 (2020) 198–209.
[40] S. Shukla, B. S. Raghuwanshi, Online sequential class-specific extreme learning machine for binary imbalanced learning, Neural Networks 119 (2019) 235–248.
[41] R. Jiang, X. Li, A. Gao, L. Li, H. Meng, S. Yue, L. Zhang, Learning spectral and spatial features based on generative adversarial network for hyperspectral image super-resolution, in: IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, IEEE, 2019, pp. 3161–3164.
[42] Y. Cui, W. Wang, Colorless video rendering system via generative adversarial networks, in: 2019 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), IEEE, 2019, pp. 464–467.
[43] Z. Zhai, J. Zhai, Identity-preserving conditional generative adversarial network, in: 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, 2018, pp. 1–5.
[44] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, volume 1, MIT Press Cambridge, 2016.
[45] F. Bonanno, G. Capizzi, C. Napoli, Some remarks on the application of RNN and PRNN for the charge-discharge simulation of advanced lithium-ions battery energy storage, in: International Symposium on Power Electronics, Electrical Drives, Automation and Motion, IEEE, 2012, pp. 941–945.
[46] M. Miljanovic, Comparative analysis of recurrent and finite impulse response neural networks in time series prediction, Indian Journal of Computer Science and Engineering 3 (2012) 180–191.
[47] C. Yin, Y. Zhu, J. Fei, X. He, A deep learning approach for intrusion detection using recurrent neural networks, IEEE Access 5 (2017) 21954–21961.
[48] M. Kraus, S. Feuerriegel, Decision support from financial disclosures with deep neural networks and transfer learning, Decision Support Systems 104 (2017) 38–48.
[49] A. J. Balaji, D. H. Ram, B. B. Nair, Applicability of deep learning models for stock price forecasting an empirical study on BANKEX data, Procedia Computer Science 143 (2018) 947–953.
[50] Y. Chen, K. He, G. K. Tso, Forecasting crude oil prices: a deep learning based model, Procedia Computer Science 122 (2017) 300–307.
[51] S. Lahmiri, S. Bekiros, Cryptocurrency forecasting with deep learning chaotic neural networks, Chaos, Solitons & Fractals 118 (2019) 35–40.
[52] C. Wang, D. Han, Q. Liu, S. Luo, A deep learning approach for credit scoring of peer-to-peer lending using attention mechanism LSTM, IEEE Access 7 (2018) 2161–2168.
[53] Y. Santur, Sentiment analysis based on gated recurrent unit, in: 2019 International Artificial Intelligence and Data Processing Symposium (IDAP), IEEE, 2019, pp. 1–5.
[54] J. Kim, H. Kim, et al., Classification performance using gated recurrent unit recurrent neural network on energy disaggregation, in: 2016 International Conference on Machine Learning and Cybernetics (ICMLC), volume 1, IEEE, 2016, pp. 105–110.
[55] B. Zhu, W. Yang, H. Wang, Y. Yuan, A hybrid deep learning model for consumer credit scoring, in: 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD), IEEE, 2018, pp. 205–208.
[56] A. Abdi, S. M. Shamsuddin, S. Hasan, J. Piran, Deep learning-based sentiment classification of evaluative text based on multi-feature fusion, Information Processing & Management 56 (2019) 1245–1259.
[57] H. Maqsood, I. Mehmood, M. Maqsood, M. Yasir, S. Afzal, F. Aadil, M. M. Selim, K. Muhammad, A local and global event sentiment based efficient stock exchange forecasting using deep learning, International Journal of Information Management 50 (2020) 432–451.
[58] F. Mai, S. Tian, C. Lee, L. Ma, Deep learning models for bankruptcy prediction using textual disclosures, European Journal of Operational Research 274 (2019) 743–758.
[59] L. Ni, Y. Li, X. Wang, J. Zhang, J. Yu, C. Qi, Forecasting of forex time series data based on deep learning, Procedia Computer Science 147 (2019) 647–652.
[60] H. Yun, M. Lee, Y. S. Kang, J. Seok, Portfolio management via two-stage deep learning with a joint cost, Expert Systems with Applications 143 (2020) 113041.
[61] S. Huang, D. Wang, X. Wu, A. Tang, DSANet: Dual self-attention network for multivariate time series forecasting, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 2129–2132.
[62] A. M. Aboussalah, C.-G. Lee, Continuous control with stacked deep dynamic recurrent reinforcement learning for portfolio optimization, Expert Systems with Applications 140 (2020) 112891.
[63] C. Chen, P. Zhang, Y. Liu, J. Liu, Financial quantitative investment using convolutional neural network and deep learning technology, Neurocomputing 390 (2020) 384–390.


