Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021


          IDENTIFICATION OF NEWS TEXT CORPORA
        INFLUENCING THE VOLATILITY OF FINANCIAL
                      INSTRUMENTS
                                           A.S. Stankus a
Saint Petersburg State University, 7-9 Universitetskaya emb., Saint Petersburg, 199034, Russia

                                      E-mail: a alexey@stankus.ru


Using neural networks to predict changes in financial markets is a promising task. For more accurate
forecasting, it is necessary to determine the tone of news articles, that is, whether the news carries
positive or negative information for the market. Standard approaches based on pretrained neural
networks aimed at analyzing user reviews are not successful because professional reporters
try to present their articles in a neutral way, which leads to incorrect conclusions. In this article, we
discuss the possibilities of training neural networks to analyze the sentiment of articles based
on data on the volatility of financial markets.

Keywords: Neural networks, attention-based, transformer, BI-LSTM, sentiments of articles,
financial market


                                                                                           Alexey Stankus


                                                             Copyright © 2021 for this paper by its authors.
                    Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).




1. Statement of the problem
        Let us take the news collected from Reuters for a certain period and the oil price for the same
period. Applying BERT [1-2], one of the most advanced transformer architectures, we obtain the
following sentiment distribution [tab. 1]:
                                                Table 1. The result of determining the sentiment of news
                       Data class         Number of articles         Percentages
                         Neutral                 97152                  61.55%
                        Negative                 60206                  38.14%
                         Positive                 496                   0.31%

        From the obtained distribution, it is clear that the model assigns most articles to
the "neutral" class. This is to be expected: news articles rarely contain an emotional component. In
most cases, the authors adhere to a strict style designed to present dry facts. Standard
sentiment recognition models are usually trained on short, emotional messages such as
social network posts or customer reviews. Nevertheless, let us consider how the labels relate to the
price movement [tab. 2]:
                                       Table 2. The relationship between sentiment and price movement
                       News label            Price increase         Price decrease
                        Negative                 30604                  29602
                         Positive                 272                     224

          The calculated p-value of 0.07 shows that the obtained news breakdown is not statistically
significant. Thus, the use of the pretrained transformer model is not justified:
it is trained to find relationships other than those needed to solve the problem posed in
this work. Fine-tuning under the given conditions may also fail to produce a positive result
because of strong differences in both the training set and the predicted feature. Using the
above-mentioned models would require a complete training process from scratch, which is feasible only
with an extremely large and correctly labeled training data set.
       It follows that we need to create our own data labeling and then train the neural network
on it.
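
        The significance check on Table 2 can be reproduced with a short sketch. The paper does not
state which test was used; the version below assumes a Pearson chi-square test of independence on the
2x2 table (without continuity correction), and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts from Table 2: rows are news labels, columns are price increase / decrease.
stat, p_value = chi_square_2x2(30604, 29602, 272, 224)
print(f"chi2 = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value stays above the conventional 0.05 threshold, which agrees with the
conclusion that the labels are not significantly related to the price movement.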


2. Selecting news and texts pre-processing
        From the reaction of the price, it is possible to detect the presence of relevant information
that carries a positive or negative value for asset owners. To do so, we proceed through the following
steps:
    ●   Rely on volatility as an indicator of market expectations [3];
    ●   Select the news preceding the event;
    ●   Pre-process the texts;
    ●   Train the neural network;
    ●   Check the result on predictions.
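
    The second step, selecting the news that precede an event, might look as follows. This is only a
sketch: the one-hour window, the function name `news_before_event` and the sample data are assumptions
made for illustration, since the paper does not specify how far back the selection reaches.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def news_before_event(news, event_time, window=timedelta(hours=1)):
    """Return the news items published in [event_time - window, event_time).

    `news` is a list of (published_at, text) tuples sorted by publication time.
    """
    times = [published_at for published_at, _ in news]
    start = bisect_left(times, event_time - window)
    end = bisect_left(times, event_time)
    return news[start:end]


# Made-up sample data for illustration only.
news = [
    (datetime(2021, 1, 4, 9, 10), "article A"),
    (datetime(2021, 1, 4, 9, 50), "article B"),
    (datetime(2021, 1, 4, 10, 30), "article C"),
]
selected = news_before_event(news, datetime(2021, 1, 4, 10, 0))
print([text for _, text in selected])
```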


    The first thing to consider is the average volatility that is characteristic of the market and
associated with the opening and closing of exchanges around the world [fig. 1].




                               Figure 1. Average volatility values by hour
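
        Hourly averages of the kind shown in Figure 1 could be computed along the following lines.
The sketch assumes that a volatility value is already available for each timestamp (the paper does not
say how volatility is measured); the function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def average_volatility_by_hour(samples):
    """Average the volatility values over each hour of the day (0-23).

    `samples` is an iterable of (timestamp, volatility) pairs.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, volatility in samples:
        sums[timestamp.hour] += volatility
        counts[timestamp.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Made-up sample data for illustration only.
samples = [
    (datetime(2021, 1, 4, 9, 0), 0.25),
    (datetime(2021, 1, 5, 9, 0), 0.75),
    (datetime(2021, 1, 4, 14, 0), 1.5),
]
hourly = average_volatility_by_hour(samples)
print(hourly)
```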

        To identify the moments when the market reacts to incoming information, the deviation of
the volatility in a time interval from the average value obtained at the previous step can be used.
After obtaining the volatility deviations, we can build approximations of the first and second
derivatives (v′, v′′). The moments of anomalous market reaction are then taken to be sharp jumps in
v′′ accompanied by v′ remaining positive for a long time [fig. 2].


                      Figure 2. Moving volatility, its second and first derivatives
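
        The detection rule described above can be sketched with finite differences. The threshold
`jump_threshold`, the persistence length `hold`, the function names and the sample series are all
assumptions for illustration; the paper does not give concrete values.

```python
def first_difference(series):
    """Finite-difference approximation of the derivative (unit time step)."""
    return [b - a for a, b in zip(series, series[1:])]


def anomalous_moments(deviation, jump_threshold, hold):
    """Indices where v'' exceeds `jump_threshold` and v' stays positive
    for the next `hold` steps."""
    v1 = first_difference(deviation)  # v1[i] approximates v' at index i + 1
    v2 = first_difference(v1)         # v2[i] approximates v'' at index i + 2
    moments = []
    for i, jump in enumerate(v2):
        following = v1[i + 1 : i + 1 + hold]
        sustained = len(following) == hold and all(d > 0 for d in following)
        if jump > jump_threshold and sustained:
            moments.append(i + 2)     # index in the original deviation series
    return moments


# Made-up deviation series for illustration only.
deviation = [0.0, 0.0, 0.0, 0.5, 1.2, 2.0, 2.1, 1.0]
moments = anomalous_moments(deviation, jump_threshold=0.4, hold=3)
print(moments)
```

Requiring v′ to stay positive for several steps filters out isolated spikes, so only sustained rises in
volatility are treated as market reactions.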

        Next comes text pre-processing. In this work, a large number of
regular expressions and NLP packages were applied to improve the quality of the input data. In
particular, the following operations were performed on the article texts:
    ●   replacement of HTML entities and special characters;
    ●   censoring of swear words;
    ●   correction of obvious typos;
    ●   removal of redundant punctuation marks;
    ●   detection of the publication language and separation of different languages from each other;
    ●   supplementing description texts with the text of the article, if necessary.
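
    A few of the operations listed above can be illustrated with the Python standard library. This is a
minimal sketch, not the pipeline used in the paper: the function name `clean_text` and the specific
regular expressions are assumptions, and steps such as typo correction and language detection are left
out.

```python
import html
import re


def clean_text(text):
    """Apply a few of the listed clean-up steps to a raw article text."""
    text = html.unescape(text)                    # &amp; -> &, &quot; -> ", ...
    text = re.sub(r"([!?.,;:])\1+", r"\1", text)  # collapse repeated punctuation marks
    text = re.sub(r"\s+", " ", text)              # normalise whitespace
    return text.strip()


cleaned = clean_text("Oil &amp; gas prices rose!!!  Brent up 2%...")
print(cleaned)
```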




       To speed up training and make it more efficient, the corpus of texts is filtered by keywords
presumably relevant to the asset under study. This yielded 53,000 news texts for each class.
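
       Keyword filtering of this kind can be sketched as below. The keyword list and the sample
headlines are hypothetical; the paper does not list the keywords that were actually used.

```python
import re


def filter_by_keywords(texts, keywords):
    """Keep only the texts that mention at least one keyword (case-insensitive, whole words)."""
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [text for text in texts if pattern.search(text)]


keywords = ["oil", "crude", "OPEC", "barrel"]  # hypothetical keyword list
texts = [
    "OPEC agrees to cut crude output",
    "Tennis: semi-final results from Shenzhen",
    "Turmoil in equity markets",  # contains "oil" only inside another word
]
relevant = filter_by_keywords(texts, keywords)
print(relevant)
```

Matching on whole words avoids false hits such as "oil" inside "turmoil".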


3. Sentiment analysis model
        To prevent overfitting [4], dropout layers were added to the neural network
architecture; they randomly zero out values of the previous layer (dropout) or disable some
components of the embedding layer (spatial dropout). The neural network has the following structure [tab. 3]:
                                                                                      Table 3. Model layers

                         Layer                    Size                 Parameters
                       Embedding                  256                    2560000
                        Dropout                   256                          0
                        B-LSTM                    128                    164352
                        B-LSTM                    128                       98816
                        Dropout                   128                          0
                         Dense                    128                       16512
                        Dropout                   128                          0
                         Dense                     32                       4128
                         Dense                     1                           65
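
        Most parameter counts in Table 3 can be checked with the standard formulas for embedding,
LSTM and dense layers. The sketch below rests on assumptions that are not stated in the paper: a
vocabulary of 10,000 tokens and 64 LSTM units per direction (so that each bidirectional layer outputs
128 values). The output layer is omitted, since its count depends on details not given in the table.

```python
def embedding_params(vocab_size, dim):
    """One weight vector of length `dim` per vocabulary entry."""
    return vocab_size * dim


def bilstm_params(input_dim, units_per_direction):
    """Two LSTMs, each with 4 gates holding input weights, recurrent weights and a bias."""
    one_direction = 4 * units_per_direction * (input_dim + units_per_direction + 1)
    return 2 * one_direction


def dense_params(input_dim, units):
    """Weight matrix plus one bias per unit."""
    return input_dim * units + units


counts = {
    "Embedding": embedding_params(10_000, 256),  # vocabulary size is an assumption
    "B-LSTM 1": bilstm_params(256, 64),          # 64 units per direction is an assumption
    "B-LSTM 2": bilstm_params(128, 64),
    "Dense 128": dense_params(128, 128),
    "Dense 32": dense_params(128, 32),
}
for name, value in counts.items():
    print(f"{name}: {value}")
```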

        After 45 epochs of GPU training on the “Govorun” supercomputer, the results are as follows [tab. 4]:
                                                                                    Table 4. Training results

                                    Sample              categorical accuracy
                                    Training                  0.8657
                                      Test                    0.5374

         Because a more complex three-class labeling function is used, the problem solved here is a
multiclass rather than a binary classification. For multiclass classification with three classes, the
baseline accuracy of random guessing, which marks the boundary of a meaningful result, is 1/3 rather
than 1/2 as in binary classification.
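
         The comparison with the random baseline is simple arithmetic, sketched here for
completeness. The function name is hypothetical; the test accuracy is the value from Table 4.

```python
from fractions import Fraction


def random_guess_accuracy(num_classes):
    """Expected accuracy of guessing uniformly at random among `num_classes` classes."""
    return Fraction(1, num_classes)


baseline = random_guess_accuracy(3)
test_accuracy = 0.5374  # test-sample accuracy from Table 4
margin = test_accuracy - float(baseline)
print(f"baseline = {float(baseline):.4f}, margin over baseline = {margin:.4f}")
```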


4. Results
        After training, the model gives the following assessments for sample texts:
    ●   «Russia committed holding round talks week the Belarussian capital Minsk ending violence
        eastern Ukraine, senior Kremlin aide said Monday»

                                      Class              Confidence level
                                     Neutral                   0.005
                                    Negative                   0.003
                                    Positive                   0.992




    ●   « Two people died at least 13 injured an explosion a factory belonging to Gulf Oil Corporation
        Ltd the southern Indian city Hyderabad, police said Monday »

                                      Class              Confidence level
                                     Neutral                  0.003
                                    Negative                  0.904
                                    Positive                  0.093

    ● «Former world number Maria Sharapova cruised past Kazakhstan’s Zarina Diyas into semi-
      finals the Shenzhen Open China Thursday»

                                      Class              Confidence level
                                     Neutral                  0.96
                                    Negative                  0.03
                                    Positive                  0.01

        In all these examples, the classification results correspond to the expected market
reaction: the increase in oil exports by Saudi Arabia is assessed as negative news for the oil price,
progress in resolving tension in Ukraine is assessed as positive, and irrelevant news is assessed as
neutral. However, such a review of individual results cannot serve as a basis for conclusions about
the quality of the model. Its effectiveness must be determined by additional verification.


References
[1] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding. 2018. Available at: https://arxiv.org/abs/1810.04805
[2] A. Vaswani, N. Shazeer, N. Parmar et al. Attention Is All You Need. 2017.
[3] A. Atkins, M. Niranjan, E. Gerding. Financial news predicts stock market volatility better than
close price // The Journal of Finance and Data Science. 2018. Vol. 4, pp. 120-137.
[4] R. Pascanu, T. Mikolov, Y. Bengio. On the difficulty of training Recurrent Neural Networks.
2013.

