Neural network approximation precision change analysis on cryptocurrency price prediction

A Misnik¹, S Krutalevich¹, S Prakapenka¹, P Borovykh² and M Vasiliev²

¹ State Institution of Higher Professional Education "Belarusian-Russian University", Mogilev, Belarus
² RoninAI Lab, New York, USA

Abstract. Neural networks are a universal tool for approximating series of data, but their precision depends heavily on an adequate set of inputs. Cryptocurrencies experience high volatility because there is no agreed-upon pricing methodology behind their valuation. In this article we analyze approaches to obtaining additional inputs for neural networks and explore their influence on network precision.

1. Introduction
In this study we estimate the price of Bitcoin (BTC) from market data and analyze how the precision of the neural network improves when further factors, namely a time factor and a social factor, are added as inputs. The study covers two neural network architectures: the multilayer perceptron (MLP) and the long short-term memory (LSTM) network. MLP is the most common architecture and provides acceptable accuracy in a large number of tasks, while LSTM is considered to be more accurate in time series prediction tasks. The results are based on, and tested with, neural networks created by RoninAI Lab. RoninAI uses various neural networks for cryptocurrency rate prediction and contributed hands-on data and analysis to this study.

2. Gathering market data
A number of sources mention the possibility of predicting the price of an asset from market data. To test how accurately market data alone predicts price, we started by collecting historical minute-based market data from Kraken, one of the oldest and largest cryptocurrency exchanges. The market data inputs are listed in Table 1.

Table 1. Market data parameters.
Input name      Description
DATE AND TIME   Timestamp of the given minute
OPEN            Rate at second 0 of the given minute
CLOSE           Rate at second 59 of the given minute
HIGH            Highest rate during the given minute
LOW             Lowest rate during the given minute
VOLUME FROM     Lowest volume of sales per second
VOLUME TO       Highest volume of sales per second

3. Training on market data
We trained the MLP and LSTM neural networks with the market data as inputs and the next-minute price of BTC as output. The results are presented in Figure 1.

Figure 1. Training results on market data.

The precision of the predictions of both neural networks turned out to be rather low. LSTM showed slightly better accuracy.

4. Extending inputs with time factor
The expert time factor is a factor representing the activity level of exchanges. Retraining the neural networks with the time factor as an additional input produced the results shown in Figure 2.

Figure 2. Training results on market data and time factor.

Although the level of precision remained rather low, the predictions produced by both MLP and LSTM were much better than the predictions based on market data alone.

5. Extending inputs with social factor
It is estimated that the majority of market players in the cryptocurrency space are retail investors who make their investment and trading decisions based on the appearance of price charts. An example of such a chart is shown in Figure 3. We wanted to test the importance of visual charts for the predictive power of neural networks.

Figure 3. BTC price chart.

Let us consider an algorithm that identifies some complex indicators of a numerical series, from which we can predict a trader's subjective assessment of the appearance of the chart. In general, any series of data can be considered as a sum of linear and harmonic components. The purpose of the further research is to investigate an algorithm for isolating these components and normalizing them.

The proposed algorithm consists of the following stages.
1. Determine the minimal element Emin of the series E.
2. Carry out the subtraction E' = Ei - Emin for every element Ei of the series.
3. Approximate the series E' by the polynomial
   E1 = a(0) + a(1) * n,   (1)
   where a(0), a(1) are the approximation coefficients and n are the discrete values of the time axis.
4. Calculate the coefficient of relative change in the linear component over the period T by the formula
   EL = a(1) * T * 100 / Emin.   (2)
   The value of EL is relative and does not depend on the absolute value of the series. If EL > 0, the linear component increases.
5. To estimate the harmonic component of the series, perform a Fourier transform (FT) of the series E'.
6. Determine the moduli of the oscillation amplitudes in the frequency domain, A(w), and filter the frequencies according to the amplitude values.

To analyze the efficiency of the proposed algorithm, we used the MATLAB environment.

Step 1. The linear component is constant. The harmonic component is absent. The results are shown in Figure 4. The value of the coefficient EL is displayed on the second chart. In this case, EL = 0.

Step 2. The linear component grows. The harmonic component is absent. The results are shown in Figure 5. The coefficient EL = 20%. The spectrum moduli have values in the low-frequency range (1-2 Hz). It should be remembered that the main frequency band for the modulus of the spectrum of a real sequence lies in the interval 0