Statistical Analysis of the Popularity of Programming Language Libraries Based on StackOverflow Queries

Ihor Rishnyak1, Yurii Matseliukh1, Taras Batiuk1, Lyubomyr Chyrun2, Oleksandra Strembitska1, Oksana Mlynko1, Viktoriia Liashenko1 and Andrii Lema1

1 Lviv Polytechnic National University, S. Bandera Street, 12, Lviv, 79013, Ukraine
2 Ivan Franko National University of Lviv, University Street, 1, Lviv, 79000, Ukraine

Abstract
This paper presents a statistical analysis of trends in the spread of programming language libraries based on a study of Stack Overflow query data. The problems that arise when using specific libraries of different programming languages over certain periods, most commonly a month, are studied and analyzed. The trends in the spread of programming language libraries found in the studied dataset are presented graphically, and key descriptive characteristics are established, taking the correlation of the data into account. Trends in the behavior of the studied indicators are determined using time series smoothing methods. A cluster analysis of programming language libraries was performed, making it possible to group the data into clusters and form data groups for ranking programming language libraries.

Keywords
Statistical analysis, information technologies, business analysis, programming language libraries, StackOverflow queries, data processing

1. Introduction

The rapid growth in popularity of programming language libraries, as reflected in Stack Overflow queries, has not eliminated the problem of complex technical questions that cannot be answered through Internet searches. A typical problem is that developers, looking for answers by submitting queries to search engines, get all kinds of results, often spam or incorrect, outdated, and sometimes off-topic.
You often have to find a blog post and then sit with the source for a long time (more than ten minutes) to extract a way to solve a technical problem from a particular post. Stack Overflow is a place where developers ask questions and get reliable answers. Stack Overflow allows developers to improve their level as programmers by using the experience of others; it increases coding experience even for those who are already experienced and help others who have not been able to figure a problem out themselves. In this sense, Stack Overflow takes part in shaping the technologies of the world's future. The above proves the relevance of studying the popularity of programming language libraries, where it is crucial to analyze the composition, structure, and issues of the queries about specific libraries each month. This study is especially relevant for beginners who are now trying to choose a language. The problem is no less urgent for experienced developers who want to expand their knowledge by studying each subsequent programming language. From the point of view of business analysts, this analysis can be considered the creation of a library rating system, i.e., identifying the libraries with the greatest number of queries and determining the most popular languages. Based on the analyzed data, a business analyst will be able to assess the decline, growth,
COLINS-2022: 6th International Conference on Computational Linguistics and Intelligent Systems, May 12–13, 2022, Gliwice, Poland
EMAIL: ihor.v.rishnyak@lpnu.ua (I. Rishnyak); indeed.post@gmail.com (Y. Matseliukh); taras.batiuk.mnsa.2020@lpnu.ua (T. Batiuk); Lyubomyr.Chyrun@lnu.edu.ua (L. Chyrun); oleksandra.strembitska.sa.2019@lpnu.ua (O. Strembitska); oxanamlunko@gmail.com (O. Mlynko); viktoriia.liashenko.sa.2019@lpnu.ua (V. Liashenko); andrii.lema.sa.2019@lpnu.ua (A. Lema)
ORCID: 0000-0001-5727-3438 (I. Rishnyak); 0000-0002-1721-7703 (Y. Matseliukh); 0000-0001-5797-594X (T. Batiuk); 0000-0002-9448-1751 (L. Chyrun); 0000-0003-2754-7076 (O.
Strembitska); 0000-0001-9878-6846 (O. Mlynko); 0000-0003-0966-7912 (V. Liashenko); 0000-0001-6490-6221 (A. Lema)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
and invariability of the popularity of languages in 2009-2019 and offer a vision of the possible development of specific languages. The work aims to use the main methods of visualization, graphical display, and primary statistical processing of numerical data presented as a sample or time series to identify trends in the studied indicators of programming language libraries, present the nature of these trends, and apply time series smoothing methods and MS Excel spreadsheet tools, as well as to analyze the experimental time sequences by methods of correlation analysis.

2. Related Works

Research on the popularity of programming languages, according to scientists [1-3], is one of the components of the problem of human capital development. Employing people with hard skills who also acquire soft skills is an important task, as it allows solving important social [4-5], economic [6-8] and technical [9-23] issues. Since its inception 12 years ago, the system we consider in this paper has provided an opportunity to ask questions about programming and get answers to them [24-36]. Confectioners discuss recipes in culinary forums; students discuss their questions in Telegram help groups; their parents have joint chats on Viber, where they solve problems; older people gather outside to discuss neighbors or world news. In other words, every group of people or professionals needs a place where one can ask a question, hear an expert's opinion, discuss a topic, or give advice. Therefore, the importance of the StackOverflow system is beyond doubt. Its relevance has been described in many articles [24-36] and in videos on YouTube and other social networks.
Also, anyone who practices programming and enters a specific question into a Google search will find the Stack Overflow site among the first results. One example of its relevance comes from Wikipedia, which reports that, according to a 2016 study, Android developers using Stack Overflow produced ten times more functional code (but less secure code, which is a disadvantage) than developers using official documentation [30, 37]. In researching the chosen topic, we also considered the HABR website [31], which was created to publish news and opinions related to IT and business. The following libraries of programming languages will be our attributes:
• month - the day-month date of the observation in the StackOverflow data;
• NLTK - the number of queries about the NLTK library (a set of libraries and programs for symbolic and statistical natural language processing for English, written in the Python programming language);
• spaCy - the number of queries about the spaCy library (an open-source library for advanced natural language processing (NLP) in Python);
• Stanford-NLP - the number of queries about the Stanford-NLP library;
• Python - the number of queries about Python;
• R - the number of queries about R;
• NumPy - the number of queries about the NumPy library (a Python language extension);
• SciPy - the number of queries about the SciPy library;
• MATLAB - the number of queries about MATLAB;
• Machine-Learning - the number of queries about machine learning.
In these works [31], a user described his story: for seven years he used the system under discussion, and during this time he "answered 3516 questions, asked 58, entered the hall of fame in several languages, met many wise people, and actively used all the site's features". The most popular programming languages and their libraries are already discussed on the StackOverflow website [30, 37].
Is it possible to trust the answers to questions on the site and actively use them? Yes: users are interested in each question and will quickly correct an answer in case of error. The HABR website [31] also notes that the average programmer cannot write code for several hours without a break. Therefore, to avoid unnecessary distractions and overload, one can spend that time with like-minded people on StackOverflow [30, 37]. A user earns a rating by answering questions, and their "reputation" can rise exponentially, depending on their activity on the site. After a reputation mark of 25,000, the user gets access to all SO statistics and permission to store queries in the user database. Thus, the SO system is one of the most popular among professional software developers, system administrators, and programmers. All questions are marked with a specific topic tag (or multiple tags, depending on the topics involved) to which the question relates. By clicking on a label, you can view the list of questions and select the topic that interests you. In our case, these are the topics of libraries of different programming languages.

3. Methods

To solve the problems in this work, we use standard methods [38-45]. The correlation field is a graph that establishes a relationship between variables, where the X coordinate of each point corresponds to the value of the factor feature (abscissa) and the Y coordinate to the value of the resultant feature (ordinate) of a particular unit of observation. The number of points on the graph corresponds to the number of observation units. The location of the points indicates the presence and direction of the relationship [38-45]. Building a correlation field is carried out mainly in the following steps: choose two variables that change over time; measure the values of the dependent variable and enter the results in a table; then construct a coordinate plane, using the X-axis for the value of the independent variable and the Y-axis for the dependent one.
Then mark the points of the correlation field on the graph: for each value of the independent variable on the X-axis, mark a point at the height corresponding to the value of the dependent variable on the Y-axis. The resulting set of points is called the correlation field [38-45]. We analyze the resulting graph and conclude whether a relationship is present or absent.
The correlation coefficient is an indicator used to measure the tightness of the relationship between traits in a correlation-regression model of linear dependence [46-52]. Its value ranges from -1 to +1. The correlation ratio measures correlation in any of its forms, straight-line or curvilinear. The correlation ratio can be used to estimate a curvilinear relationship between the values of X and Y. It always has a positive value in the range from 0 to +1; it takes the value zero when the relationship between the features is absent [38-52].
Autocorrelation is the correlation of a function with itself shifted by a certain amount of the independent variable. The autocorrelation function graph can be obtained by plotting the correlation coefficient along the ordinate axis and the lag value along the abscissa axis [38-45]. The autocorrelation function measures the linearity of the relationship between elements of the time series spaced x points apart in time. The graph of an autocorrelation function is called a correlogram.
The correlation matrix is a table that represents the values of the correlation coefficients for different variables: it shows the numerical value of the correlation coefficient for all combinations of variables. It is generally used when we need to determine the relationship between more than two variables.
It consists of rows and columns labeled with the variables, and each cell contains a coefficient value that indicates the degree of association and linear relationship between two variables [38-45]. In addition, it can be used in specific statistical analyses: in multiple linear regression, where we have several independent variables, the correlation matrix helps determine the degree of association. The multiple correlation coefficient describes the intensity of the correlation, or the degree of closeness of the relationship, between a dependent variable and several independent variables [38-45]. Its value cannot be less than the absolute value of any partial or simple correlation coefficient. The primary indicator of the closeness of the connection in multiple correlation is the coefficient of multiple correlation, which has a value from 0 to +1.

4. Experiments

The structure of the dataset [24] is presented in Table 1; it has ten fields (month, NLTK (Natural Language Toolkit), spaCy, Stanford-NLP, Python, R, NumPy, SciPy, MATLAB, Machine-Learning) and 132 rows covering 12 years.

Table 1
The dataset structure of the programming language libraries based on StackOverflow queries

month   NLTK  spaCy  Stanford-NLP  Python  R     NumPy  SciPy  MATLAB  Machine-Learning
09-Jan  0     0      0             631     8     6      2      19      8
09-Feb  1     0      0             633     9     7      3      27      4
…       …     …      …             …       …     …      …      …       …
19-Nov  72    79     14            23602   4883  1297   199    479     918
19-Dec  82    72     13            20058   4150  1118   159    349     983

Charts are used to represent sheet data graphically. There are several standard chart types in Excel. Charts can be placed directly on the sheet next to the data used to build them; such charts are called embedded. Alternatively, a chart can occupy a separate sheet in the workbook, which is called a chart sheet. No matter how a chart was created, it is always linked to the sheet data: if the data changes, the chart is updated automatically [33]. The graphical form of data representation is called a chart.
In the form of a chart, you can present sets of numbers, sums of money, percentages, dates, and time values. A chart is created using the Chart Wizard, launched by the Chart Wizard button on the Standard toolbar (Fig. 1). The output range is a range of spreadsheet cells that contains the data to be displayed graphically or used as textual explanatory elements. A graphic representation of a single value is called a data element of the chart. A data row is a sequence of data arranged in a single row or column of a spreadsheet and displayed graphically on a chart (Fig. 2). Typically, the values shown in a diagram depend on another value or on a set of text values; such independent values and text values are called data categories [33].

Figure 1: Graphical data representation of queries by date in the Cartesian coordinate system (series: nltk, spacy, stanford-nlp, python, r, numpy, scipy, matlab, machine-learning)

Figure 2: Graphical data representation of queries by date in the polar coordinate system (same series)

Descriptive statistics [25-29, 32-36] provide the basis for the formation of competencies for choosing a measurement scale, automation of data processing using different
formats at the stage of their collection, presentation of results in various forms, graphical presentation of results, calculation of statistical distribution parameters, and evaluation of general population parameters using information technology. Descriptive statistics select the quantitative information necessary (or interesting) for different people. Large data sets must be generalized or collapsed before humans can study them; this is what descriptive statistics do: they describe, summarize, or reduce the properties of data sets to the desired form. Descriptive statistics are used to analyze and interpret statistical data, construct statistical distributions, and calculate the numerical parameters that characterize the studied population. They are used to organize data collection, check the quality of the data, interpret them, and present statistical material [25-29, 32-37]. The results of descriptive statistics are shown in Table 2.
The construction of histograms makes the interpretation of the distribution of the data more apparent [32]. It involves dividing the entire range of possible values of X into a finite number of intervals (rectangular ones in the multidimensional case) and counting the number of realizations that fall into each of them (Fig. 3). The cumulate is the curve of the accumulated frequencies of the interval variation series [34]. The graph of the integral distribution function F(x) is compared with the cumulate and is also considered in probability theory [34]. The concepts of histogram and cumulate are associated with continuous data and their interval variation series [34]. Their graphs are empirical estimates of the probability density and distribution function (Fig. 3).
The methods of smoothing time series are the moving average method, exponential smoothing, adaptive smoothing, and their modifications [25-29, 32-36]. They are used to reduce the influence of a random component (random fluctuations) in time series.
They make it possible to obtain more "pure" values, which consist only of deterministic components. Some of the methods aim to highlight particular components, such as trends [25-29, 32-36]. Smoothing methods can be divided into two classes, based on analytical and algorithmic approaches.

Table 2
Descriptive statistics of the programming language libraries based on Stack Overflow queries

Index                     NLTK     spaCy    Stanford-NLP  Python       R           NumPy      SciPy     MATLAB     Machine-Learning
Mean                      42.70    11.85    25.54         9856.70      2411.86     514.20     112.45    651.68     264.40
Standard Error            2.53     1.83     1.99          541.47       149.25      34.20      6.06      34.46      21.73
Median                    44.50    0.00     17.50         9651.50      2613.50     486.00     130.50    581.00     154.50
Mode                      0.00     0.00     0.00          -            139.00      6.00       2.00      99.00      8.00
Standard Deviation        29.02    21.07    22.82         6221.07      1714.76     392.88     69.68     395.95     249.66
Sample Variance           842.42   443.81   520.80        38701728.16  2940399.25  154357.03  4855.41   156776.11  62327.85
Kurtosis                  -1.23    2.15     -0.79         -1.15        -1.51       -1.32      -1.33     -1.04      -0.57
Skewness                  0.05     1.80     0.66          0.17         -0.06       0.22       -0.28     0.13       0.80
Range                     106.00   79.00    79.00         22971.00     5136.00     1306.00    227.00    1516.00    981.00
Minimum                   0.00     0.00     0.00          631.00       2.00        4.00       2.00      19.00      2.00
Maximum                   106.00   79.00    79.00         23602.00     5138.00     1310.00    229.00    1535.00    983.00
Sum                       5637.00  1564.00  3371.00       1301085.00   318365.00   67875.00   14844.00  86022.00   34901.00
Count                     132.00   132.00   132.00        132.00       132.00      132.00     132.00    132.00     132.00
Largest (2)               94.00    79.00    79.00         23414.00     5117.00     1297.00    223.00    1433.00    918.00
Smallest (2)              0.00     0.00     0.00          633.00       4.00        6.00       2.00      24.00      3.00
Confidence Level (95.0%)  5.00     3.63     3.93          1071.17      295.25      67.65      12.00     68.18      42.99

The simplest way of forecasting is considered to be the approach that determines the forecast estimate from the achieved level using the average level, average growth, and average growth rate, i.e., extrapolation based on the average level of the series [25-29, 32-36].
When extrapolating socio-economic processes based on the average level of the series, the predicted value is taken as the arithmetic mean of the previous levels of the series. The confidence interval accounts for the uncertainty hidden in the estimate of the mean; however, the projected indicator is assumed to be equal to the average sample value. This approach does not consider that individual indicator values fluctuated around the average in the past [25-29, 32-36] and will also do so in the future. Methods of analytical smoothing include regression analysis and the method of least squares and its modifications [25-29, 32-36]. To identify the primary trend by the analytical method means to assign the studied process the same form of development throughout the observation period. Therefore, for these methods, choosing the optimal function of the deterministic trend (growth curve) that smooths the observations is essential.

Figure 3: The diagrams of the distribution data of queries - frequency and cumulate

Forecasting methods based on regression are used for short-term and medium-term forecasting. They do not allow adaptation: the forecasting procedure must be repeated from the beginning when new data are received. The optimal length of the lead period is determined separately for each economic process, taking into account its statistical instability.

5. Results

The most commonly used method is smoothing time series using moving averages [25-29, 32-36]. The algorithm for calculating the simple moving average is as follows [25-29, 32-36]:

y_t = (1/w) · Σ_{i=t-p}^{t+p} x_i,  where p = (w - 1)/2.  (1)

The algorithm for calculating the weighted moving average is as follows [25-29, 32-36]:

y_t = Σ_{i=-p}^{p} c_i · x_{t+i} / Σ_{i=-p}^{p} c_i.  (2)

5.1.
Smoothing according to Kendall formulas - simple moving average

The data are smoothed using the smoothing intervals w = 3, 5, 7, 9, 11, 13, 15; the results are presented in Fig. 4-Fig. 6. The smoothed data for queries about MATLAB are calculated using the Kendall formulas for the smoothing interval w = 3 (Fig. 4, a), w = 5 (Fig. 4, b), w = 7 (Fig. 4, c), w = 9 (Fig. 5, a), w = 11 (Fig. 5, b), w = 13 (Fig. 5, c), and w = 15 (Fig. 6).

Figure 4: The smoothed data for queries about MATLAB using the smoothing interval w = 3 (a), w = 5 (b), w = 7 (c)

We smoothed the data using the smoothing interval w = 3, then smoothed the obtained data again using the smoothing interval w = 5, continued smoothing the result with the interval w = 7, and so on up to w = 15.

Figure 5: The smoothed data for queries about MATLAB using the smoothing interval w = 9 (a), w = 11 (b), w = 13 (c)

The smoothed data for queries about MATLAB were thus obtained by repeated smoothing. In Fig. 7 the smoothed data for queries about MATLAB are presented for the smoothing intervals w = 5 (w = 3) (a) and w = 7 (w = 5) (b). Fig.
8 shows the smoothed data for queries about MATLAB for w = 9 (w = 7) (a), w = 11 (w = 9) (b), w = 13 (w = 11) (c), and w = 15 (w = 13) (d) according to the Kendall formulas.

Figure 6: The smoothed data for queries about MATLAB using the smoothing interval w = 15 according to the Kendall formulas

Figure 7: The smoothed data for queries about MATLAB using the smoothing interval w = 5 (w = 3) (a), w = 7 (w = 5) (b)

In both cases, we find for each smoothing the number of turning points and the correlation coefficients between the original values and the smoothed ones.

Figure 8: The smoothed data for queries about MATLAB using the smoothing interval w = 9 (w = 7) (a), w = 11 (w = 9) (b), w = 13 (w = 11) (c), w = 15 (w = 13) (d) according to the Kendall formulas

The correlation coefficients between the original values and the smoothed ones are presented in Table 3.

Table 3
The correlation coefficients between the original values and the smoothed ones

Interval w: 3, 5, 7, 9, 11, 13, 15, 5 (3), 7 (5), 9 (7), 11 (9), 13 (11), 15 (13)
Correlation coefficient:
0.980, 0.962, 0.953, 0.939, 0.925, 0.916, -, 0.977, 0.971, 0.965, 0.958, 0.953, 0.950
Number of correct turning points: 36, 30, 24, 23, 16, 14, 14, 20, 8, 4, 4, 2, 2

5.2. Smoothing according to Pollard formulas

John Pollard's algorithm, proposed in 1975, is used to factorize integers [28]. It is based on Floyd's algorithm for finding the length of a cycle in a sequence and on some consequences of the birthday paradox. The algorithm most effectively factors composite numbers with relatively small factors in their decomposition. All of Pollard's ρ-methods construct a numerical sequence whose elements form a loop starting with some number n, which can be illustrated by arranging the numbers in the shape of the Greek letter ρ; this gave the name to the family of methods [28]. We smooth the data for queries about R using the same smoothing intervals (w = 3, 5, 7, 9, 11, 13, 15); the results are presented in Fig. 9-Fig. 11. The smoothed data for queries about R are calculated using the Pollard formulas for the smoothing interval w = 3 (Fig. 9, a), w = 5 (Fig. 9, b), w = 7 (Fig. 9, c), w = 9 (Fig. 10, a), w = 11 (Fig. 10, b), w = 13 (Fig. 10, c), and w = 15 (Fig. 11).
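The simple and weighted moving averages used throughout this section can be sketched in a few lines of Python. This is a minimal illustration of the moving-average algorithms (1) and (2) with made-up query counts, not the paper's actual computation.

```python
# Minimal sketch of a centered simple moving average (formula (1)) and a
# centered weighted moving average (formula (2)) over an odd window w.
# The query counts below are illustrative, not taken from the dataset.

def moving_average(series, w):
    """Centered simple moving average with an odd window w."""
    half = w // 2
    return [sum(series[t - half:t + half + 1]) / w
            for t in range(half, len(series) - half)]

def weighted_moving_average(series, weights):
    """Centered weighted moving average; len(weights) must be odd."""
    half = len(weights) // 2
    total = sum(weights)
    return [sum(x * c for x, c in zip(series[t - half:t + half + 1], weights)) / total
            for t in range(half, len(series) - half)]

queries = [19, 27, 24, 30, 80, 160, 240, 300, 349, 479]
print(moving_average(queries, 3))
print(weighted_moving_average(queries, [1, 2, 1]))
```

Repeated smoothing, as in the w = 5 (w = 3) variants above, is simply a composition such as `moving_average(moving_average(queries, 3), 5)`.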
Figure 9: The smoothed data for queries about R using the smoothing interval w = 3 (a), w = 5 (b), w = 7 (c)

Figure 10: The smoothed data for queries about R using the smoothing interval w = 9 (a), w = 11 (b), w = 13 (c) according to the Pollard formulas

Figure 11: The smoothed data for queries about R using the smoothing interval w = 15 according to the Pollard formulas

We smooth the data using the smoothing interval w = 3, then smooth the obtained data again using the smoothing interval w = 5, and so on. The smoothed data for queries about R are thus obtained by repeated smoothing. In Fig. 12 the smoothed data for queries about R are presented for the smoothing intervals w = 5 (w = 3) (a) and w = 7 (w = 5) (b). Fig. 13 shows the smoothed data for queries about R for w = 9 (w = 7) (a), w = 11 (w = 9) (b), w = 13 (w = 11) (c), and w = 15 (w = 13) (d) according to the Pollard formulas.
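The number of turning points reported alongside the correlation coefficients in Table 3 (and later in Table 4) can be counted with a simple scan of the series. A minimal sketch, with an illustrative series rather than the paper's data:

```python
# Sketch: counting turning points (strict local maxima and minima) of a
# time series; plateau points are not counted. The series is illustrative.

def turning_points(series):
    count = 0
    for t in range(1, len(series) - 1):
        prev, cur, nxt = series[t - 1], series[t], series[t + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count

print(turning_points([2, 5, 3, 8, 4, 4, 9]))  # 3 turning points
```

Fewer turning points after smoothing indicate that more of the random fluctuation has been removed, which is why the counts fall as w grows.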
Figure 12: The smoothed data for queries about R using the smoothing interval w = 5 (w = 3) (a), w = 7 (w = 5) (b) according to the Pollard formulas

Figure 13: The smoothed data for queries about R using the smoothing interval w = 9 (w = 7) (a), w = 11 (w = 9) (b), w = 13 (w = 11) (c), w = 15 (w = 13) (d) according to the Pollard formulas

5.3. Exponential smoothing

To construct the exponential smoothing, the sample elements are combined with weights proportional to powers of (1 - α): the factor α takes values from zero to one, the most recent element is weighted by α, and the sum of the weights equals 1. The graphs of exponential smoothing for all required values of α follow. Exponential smoothing of queries about Machine Learning for α = 0.1 (a), α = 0.15 (b), α = 0.2 (c), α = 0.25 (d), and α = 0.3 (e) is presented in Fig. 14. For each smoothing, we find the number of turning points and the correlation coefficients between the original and smoothed values. The correlation coefficients between the original values and the smoothed ones are presented in Table 4.
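The weighting scheme above is equivalent to the recursive form S_t = α·x_t + (1 - α)·S_{t-1}. A minimal sketch of this recursion, with illustrative query counts rather than the actual Machine-Learning column:

```python
# Sketch: exponential smoothing in its recursive form
# S_t = alpha * x_t + (1 - alpha) * S_{t-1}.
# The query counts are illustrative, not the Machine-Learning column.

def exponential_smoothing(series, alpha):
    smoothed = [series[0]]          # initialize with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

queries = [8, 4, 12, 30, 55, 90, 160, 264, 512, 918]
for alpha in (0.1, 0.3):
    print(alpha, [round(v, 1) for v in exponential_smoothing(queries, alpha)])
```

Smaller α smooths more aggressively, which is consistent with the lower correlation coefficient at α = 0.1 than at α = 0.3 in Table 4.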
Figure 14: Exponential smoothing of queries about Machine Learning for α = 0.1 (a), α = 0.15 (b), α = 0.2 (c), α = 0.25 (d), α = 0.3 (e)

Table 4
The correlation coefficients between the original values and the smoothed ones

Factor α                          0.1       0.15      0.2      0.25      0.3
Correlation coefficient           0.958867  0.964152  0.96739  0.969568  0.971129
Number of correct turning points  26        32        38       38        42

5.4. Median smoothing

Median smoothing of queries about Python for w = 3, 5, 7, 9, 11, 13, 15 is presented in Fig. 15-Fig. 17.
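Median smoothing replaces each point by the median of its window, which makes it more robust to isolated spikes than the moving mean. A minimal sketch with an illustrative series (not the actual Python column):

```python
# Sketch: median smoothing with an odd window w. An isolated spike
# (5000 below) is removed entirely, unlike with a moving average.
# The series is illustrative, not the Python column from the dataset.

import statistics

def median_smoothing(series, w):
    half = w // 2
    return [statistics.median(series[t - half:t + half + 1])
            for t in range(half, len(series) - half)]

queries = [631, 633, 700, 5000, 760, 800, 820, 900]
print(median_smoothing(queries, 3))  # the 5000 spike disappears
```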
Figure 15: Median smoothing of queries about Python for w = 3 (a), w = 5 (b)

Figure 16: Median smoothing of queries about Python for w = 7 (a), w = 9 (b), w = 11 (c)

Figure 17: Median smoothing of queries about Python for w = 13 (a), w = 15 (b)

6. Discussions

6.1. Data correlation

Correlation analysis is a group of methods that can detect the presence and degree of a relationship between several parameters that change randomly [24]. In the simplest case, two samples (data sets) are studied; in the general case, multidimensional complexes (groups) of them are studied. The purpose of correlation analysis is to determine whether one variable has a significant dependence on another [25]. The main tasks of correlation analysis are the definition and expression of the form of the analytical dependence of the resultant trait y on the factor traits xi. The stages of correlation analysis are the following [24, 25].
• Identifying the relationship between the features;
• Determining the form of the relationship;
• Determining the strength (closeness) and direction of the relationship.

The advantages of correlation analysis are as follows:
• Ability to establish a new rule of interaction of functions with each other;
• Ability to estimate the interaction of functions obtained independently of each other.

The disadvantage is that the results obtained with this technique can be used only in the field of the given study or one close to it. A correlation occurs when a series of values of a function (dependent variable) corresponds to the same value of an argument (independent variable) [24]. Before constructing a correlation field, we recall its definition. The correlation field (scatter plot) is a graphical representation of the relationship between the two studied sequences [24, 25]. It is a set of points in a rectangular coordinate system, where the abscissa of each point corresponds to the value of the factor feature (x) and the ordinate to the value of the resultant feature (y) of a particular unit of observation. The number of points on the graph corresponds to the number of observation units. The location of the points in the correlation field allows one to judge the nature of the dependence: for example, linear, parabolic, hyperbolic, logistic, logarithmic, exponential, power, or no dependence [24]. Fig. 18 shows the correlation field for queries about the Python programming language for each day of a single month. From Fig. 18 it is seen that the nature of the dependence is linear. The dependence is described by the equation y = 1932x − 17317 with a high coefficient of determination R² = 0.972.
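The construction of the linear trend through the correlation field in Fig. 18 can be sketched as follows; the daily counts here are synthetic values generated around the reported trend line, not the actual dataset.

```python
import numpy as np

# Synthetic daily Python query counts generated around the reported trend
# y = 1932x - 17317 (the real daily data are not reproduced here).
rng = np.random.default_rng(0)
days = np.arange(10.0, 25.0)  # day numbers within the month
queries = 1932.0 * days - 17317.0 + rng.normal(0.0, 300.0, size=days.size)

# Least-squares line through the correlation field.
slope, intercept = np.polyfit(days, queries, 1)

# Coefficient of determination R^2 of the linear fit.
predicted = slope * days + intercept
ss_res = float(np.sum((queries - predicted) ** 2))
ss_tot = float(np.sum((queries - queries.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot
print(f"y = {slope:.0f}x {intercept:+.0f}, R^2 = {r_squared:.3f}")
```

With noise of this magnitude the recovered slope stays close to 1932 and R² stays close to 1, mirroring the behaviour reported for the real data.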
Figure 18: The correlation field for queries about Python by days during one month (linear trend y = 1932x − 17317, R² = 0.972)

The correlation field is built from the input data (x and y) in the form of a scatter plot. Analyzing the location of the points in the correlation field, we conclude that the dependence is linear. The query dates start in 2009 and run to 2019 inclusive, broken down by month. The lowest number of queries about Python was in one month of 2009, and the number increased with each passing month, indicating the language's growing popularity and an increasing number of users. Data from 2019 to 2021 are not included in the dataset. However, analyzing the statistics, we can predict even more significant growth in the popularity of the language, judging by the queries about its libraries. That is, the data show a growing trend. Next, we determine the value of the correlation coefficient. A sample correlation coefficient is used to quantify the closeness of the relationship. The correlation coefficient characterizes the degree of closeness of the linear dependence. In general, when some stochastic dependence relates the X and Y values, the correlation coefficient may take a value in the range −1 ≤ r ≤ +1 [24]. The correlation coefficient is calculated as

r = Σ(xi − x̄)(yi − ȳ) / √(Σ(xi − x̄)² · Σ(yi − ȳ)²). (3)

The statistical literature [24-29] recommends the following equivalent expression for computation:

r = (n·Σxiyi − Σxi·Σyi) / √((n·Σxi² − (Σxi)²) · (n·Σyi² − (Σyi)²)). (4)

The calculated correlation coefficient for queries about Python is r = 0.98588536. It is a good correlation coefficient; it shows that there is a dependence, and that it is linear and quite close. The correlation ratio is used in the following cases [24-29].
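A minimal illustration of formulas (3) and (4): both forms of the Pearson correlation coefficient computed on a small synthetic sample (month index vs. query count), confirming that they agree.

```python
import math

# Two illustrative samples (x = month index, y = query counts); not the paper's data.
x = [1, 2, 3, 4, 5, 6]
y = [4300, 4800, 5100, 5600, 6200, 6500]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

# Definitional form, cf. (3): sum of products of deviations from the means.
num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y))
r_def = num / den

# Computational form, cf. (4): works directly with raw sums.
r_comp = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) / math.sqrt(
    (n * sum(xi * xi for xi in x) - sum(x) ** 2)
    * (n * sum(yi * yi for yi in y) - sum(y) ** 2)
)

print(round(r_def, 6))
```

Both forms give the same value; the second avoids a separate pass for the means, which is why spreadsheet functions typically use it.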
• Between a pair of studied features there is a nonlinear relationship;
• The nature of the sample data (their number and the density of their location on the correlation field) allows grouping them along the y-axis and calculating "individual" mathematical expectations within each grouping interval.

According to the previously constructed correlation field, the dependence is linear, so it is impractical to calculate the correlation ratio. To divide one of the sequences into three equal parts, we split the sequence corresponding to the number of queries about Python in the programming language libraries (Table 5).

Table 5
The sequence of queries about Python in the programming language libraries divided into three equal intervals

                          1st part   2nd part   3rd part
Interval                  (1; 45)    (45; 89)   [89; 132]
Number of sample items    44         44         44

As we can see, the partition is performed so that the number of sample elements in each interval is the same and equal to 44. When many observations are involved and correlation coefficients have to be calculated sequentially for several samples, the obtained coefficients are, for convenience, summarized in tables called correlation matrices. A correlation matrix is a square table in which the correlation coefficient between the corresponding parameters is located at the intersection of the corresponding row and column [24-29]. Dividing the sample into three equal parts, we build a correlation matrix (Table 6).

Table 6
The correlation matrix of the queries about Python

            1st part     2nd part     3rd part
1st part    1
2nd part    0.92230619   1
3rd part    0.8602376    0.86988678   1

The autocorrelation coefficient of order τ is the correlation between the series and its copy shifted by τ levels [24-29]:

r(τ) = Σ(y_t − ȳ1)(y_{t−τ} − ȳ2) / √(Σ(y_t − ȳ1)² · Σ(y_{t−τ} − ȳ2)²), (5)

where the sums run over t = τ+1, …, n, ȳ1 is the mean of the last n − τ levels, and ȳ2 is the mean of the first n − τ levels. To calculate the autocorrelation coefficients according to formula (5), we used the CORREL function. The autocorrelation coefficients for queries about Python are presented in Table 7. The sequence of autocorrelation coefficients of the levels of the first, second, third, etc.
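The lagged CORREL computation behind Table 7 can be sketched in Python; the trending series below is illustrative, not the real query counts.

```python
import pandas as pd

# Illustrative monthly query series with a growing trend (synthetic data).
series = pd.Series([100.0 + 5 * t + (t % 3) for t in range(40)])

def autocorr(s, tau):
    """Autocorrelation of order tau: Pearson correlation between the series
    and its copy shifted by tau levels (what CORREL over lagged ranges does)."""
    return s.iloc[tau:].reset_index(drop=True).corr(
        s.iloc[:-tau].reset_index(drop=True)
    )

# Autocorrelation function for lags 1..7, as in Table 7.
for tau in range(1, 8):
    print(tau, round(autocorr(series, tau), 6))
```

For a series with a pronounced trend the coefficients stay high and decline slowly, which is exactly the non-stationary pattern discussed for the correlogram in Fig. 19.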
orders is called the autocorrelation function. The graph of the autocorrelation function is called the correlogram [25-29].

Table 7
The autocorrelation coefficients for the queries about Python

Lag   Autocorrelation coefficient
1     0.98701089
2     0.98096642
3     0.98308662
4     0.9783225
5     0.9783225
6     0.97742187
7     0.97709153

The correlogram for the queries about Python is presented in Fig. 19. The pattern of the correlogram shows that the studied series is not stationary, because for a stationary time series the correlogram must decline rapidly.

Figure 19: The correlogram for queries about Python at each lag

6.2. The cluster data analysis

To form an "object-property" table from our data, we split the data so that the 2nd, 3rd, 4th, and 5th columns are considered objects, and the first column is considered a property. To calculate each of the properties, we use the standard formulas [53-62]. To calculate the properties in the 2016 column, we used only the query data collected in 2016 (Table 8). The term "average" in Table 8 means the average number of queries in the NumPy library over all 12 months. Accordingly, "minimum" shows the lowest number of queries during the year (for a month), and "maximum" the highest. "Count" is the number of rows for a given year; there are 12 of them every year, because there are 12 months in a year. "Mode" is the value that occurs most often among all observations. Since the query statistics changed every month and no value was repeated for even two months, the mode cannot be determined. "Median" is a number that divides the list of attribute values into two equal parts, so that there is the same number of units on both sides. "Standard error" is the approximate standard deviation of the sample mean.
The more data points are involved in calculating the mean, the smaller the standard error [63-79]. "Standard deviation" is the deviation of the characteristic values from their average value.

Table 8
The normalized "object-property" table

Index                      2016       2017       2018       2019
Average                    13259.25   16678.92   17191.67   19861.33
Standard error             0.037773   0.026568   0.037599   0.027778
Median                     13108      16620      17334.5    20047.5
Mode                       #N/A       #N/A       #N/A       #N/A
Standard deviation         580.6382   1304.687   894.8652   1939.741
Sampling variance          367789.8   1856955    873582.2   4104650
Kurtosis                   -0.63691   -0.25216   -1.25289   0.25504
Asymmetry                  0.396782   0.09637    -0.39425   0.7177
Interval                   328.5209   738.1827   506.3083   1097.492
Minimum                    12424      14388      15537      17167
Maximum                    14296      18935      18329      23602
Sum                        159111     200147     206300     238336
Count                      12         12         12         12
Reliability level (95%)    0.083137   0.058476   0.082755   0.061138

The standard deviation is one of the essential measures for determining how much a particular value changes [74-79]: the larger the standard deviation, the wider the range of changes of this value. "Sum" is the total number of queries to the library over the twelve months of each described year. The "reliability level" is the probability of rejecting the null hypothesis when it is correct, i.e., the probability of an error of the first kind for this task. "Sampling variance" measures how far the random values are spread from their average value; larger variance values indicate more significant deviations of the values of the random variable from the center of the distribution. "Kurtosis" is a numerical characteristic of the probability distribution of a random variable; the kurtosis coefficient characterizes the "steepness," i.e., the rate of increase of the distribution curve compared to the normal curve. "Asymmetry" measures how asymmetric (skewed) the distribution is; in a symmetric distribution, the parts to the right and left of the center are ideal mirror images of each other.
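The descriptive characteristics of Table 8 can be reproduced programmatically. The sketch below uses hypothetical monthly NumPy query counts (not the paper's normalized figures) and mirrors the rows of Excel's Descriptive Statistics output, including the undefined mode when no value repeats.

```python
import pandas as pd

# Hypothetical monthly query counts for the NumPy library over one year.
monthly = pd.Series([12424, 12600, 12850, 13000, 13108, 13110,
                     13250, 13400, 13600, 13800, 14100, 14296], dtype=float)

counts = monthly.value_counts()
summary = {
    "Average": monthly.mean(),
    "Standard error": monthly.sem(),   # standard deviation of the sample mean
    "Median": monthly.median(),
    # The mode is undefined ("#N/A") when no value occurs more than once.
    "Mode": monthly.mode().iloc[0] if counts.iloc[0] > 1 else None,
    "Standard deviation": monthly.std(),
    "Sampling variance": monthly.var(),
    "Kurtosis": monthly.kurt(),        # excess kurtosis, as in Excel
    "Asymmetry": monthly.skew(),
    "Interval": monthly.max() - monthly.min(),
    "Minimum": monthly.min(),
    "Maximum": monthly.max(),
    "Sum": monthly.sum(),
    "Count": int(monthly.count()),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```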
"Interval" - the interval between the extreme values of the feature in the group of units. To construct a matrix of similarities (Table 9) we used formula (6) by analogy with the previous Table 8 [53-73]. (6) Table 9 The proximity matrix for four clusters Cluster 1 2 3 4 1 0 1489747 508047.4 3737727 2 1489747 0 983393.4 2248031 3 508047.4 983393.4 0 3231234 4 3737727 2248031 3231234 0 The resulting proximity matrix (Table 9) is a symmetric diagonal matrix that indicates the amount of proximity between objects. Agglomerative hierarchical cluster analysis is performed based on such a matrix. The choice of integration strategy is determined by the approach. We chose the strategy of the nearest neighbor. In it, the distance between two groups is defined as the distance between the two closest elements of these groups. After performing the cluster analysis procedure sequentially, we obtained proximity matrices for 3 (Table 10) and 2 clusters (Table 11). Table 10 The proximity matrix for 3 clusters Cluster 1.3 2 4 1.3 0 983393,4 3231234 2 983393,4 0 2248031 4 3231234 2248031 0 Table 11 The proximity matrix for 2 clusters Cluster 1.3.2 4 1.3.2 0 2248031 4 2248031 0 The cluster analysis procedure starts with the proximity matrix. In it, we determine the smallest number. It is 508047.4, located at the 1st and 3rd objects intersection. Therefore, we group the 1st and 3rd objects and create a new table. Now determine the minimum number again. This time it is at the intersection of objects (1.3) and (2). We are grouping them again. We built a table "union-node-metric" (Table 12). Table 12 The union-node-metric table for programming language libraries Step Association Node Metrics 1 1+3 d5 508047.4 2 1+3+2 d6 983393.4 3 1+3+2+4 d7 2248031 Our union-node-metric table is formed in 3 steps. In the first, there is a union of objects 1 and 3. In the second step of objects (1,3) and 2. In the third (1,3,2) and 4. 
According to the steps, nodes named d5, d6, and d7 are formed: since there are four objects, node numbering begins after the 4th. The metric column contains the minimum distance at each step of the construction of the table. The constructed dendrogram for the programming language libraries, which helps to visualize the results of the cluster analysis, is shown in Fig. 20. We constructed the dendrogram of the clustered objects manually in a draft version and then implemented it in a graphical environment. The scale on the left of the dendrogram represents the metric, the labels at the bottom the objects, and the marks at the top the individual nodes. Drawing a horizontal line across the dendrogram at a given height allows one to select individual clusters. Interpreting the results of the cluster analysis: below the level 983393.4 we observe 3 clusters, of which the first includes objects 1 and 3, the second only object 2, and the third only object 4. Between the levels 983393.4 and 2248031 we observe 2 clusters, of which the first includes the three objects 1, 3, 2, and the second only object 4. Above the level 2248031 all elements form one cluster.

7. Conclusions

In this work, we applied the basic methods of visualization, graphical display, and primary statistical processing of numerical data represented by a sample of time series. We used the main methods of extracting the trend in the behavior of the studied indicator by smoothing the time series, and presented the results using an MS Excel spreadsheet.

Figure 20: The constructed dendrogram of programming language libraries

We also applied the methods of correlation analysis to experimental data presented as time sequences.
We learned to build a correlation field, determine the value of the correlation coefficient, calculate the correlation ratio, plot autocorrelation functions, divide one of the sequences into three equal parts, build a correlation matrix for them, and find multiple correlation coefficients. We also divided a given set of objects, each characterized by the same set of specific features, into separate groups using hierarchical agglomerative cluster analysis. A library rating system has been created: the most significant number of queries has been identified, and the most popular languages have been determined. In the ranking of queries about language libraries, the first place belongs to Python, and the least popular is spacy. The growing popularity of all language libraries reflects the active development of programming and, most importantly, people's interest in this field. The obtained data will allow experts to assess the decline, growth, and invariability of the popularity of languages in the studied period (2009-2019) and offer their vision of the possible development of specific programming languages.

8. References

[1] O. Kuzmin, M. Bublyk, A. Shakhno, O. Korolenko, H. Lashkun, Innovative development of human capital in the conditions of globalization, E3S Web of Conferences 166 (2020) 13011. [2] I. Bodnar, M. Bublyk, O. Veres, O. Lozynska, I. Karpov, Y. Burov, P. Kravets, I. Peleshchak, O. Vovk, O. Maslak, Forecasting the risk of cervical cancer in women in the human capital development context using machine learning, CEUR workshop proceedings Vol-2631 (2020) 491-501. [3] M. Bublyk, V. Vysotska, Y. Matseliukh, V. Mayik, M. Nashkerska, Assessing losses of human capital due to man-made pollution caused by emergencies, CEUR Workshop Proceedings Vol-2805 (2020) 74-86. [4] D. Koshtura, M. Bublyk, Y. Matseliukh, D. Dosyn, L. Chyrun, O. Lozynska, I. Karpov, I. Peleshchak, M. Maslak, O.
Sachenko, Analysis of the demand for bicycle use in a smart city based on machine learning, CEUR workshop proceedings Vol-2631 (2020) 172-183. [5] M. Bublyk, Y. Matseliukh, U. Motorniuk, M. Terebukh, Intelligent system of passenger transportation by autopiloted electric buses in Smart City, CEUR workshop proceedings Vol-2604 (2020) 1280-1294. [6] I. Rishnyak, O. Veres, V. Lytvyn, M. Bublyk, I. Karpov, V. Vysotska, V. Panasyuk, Implementation models application for IT project risk management, CEUR Workshop Proceedings Vol-2805 (2020) 102-117. [7] V. Vysotska, A. Berko, M. Bublyk, L. Chyrun, A. Vysotsky, K. Doroshkevych, Methods and tools for web resources processing in e-commercial content systems, in: Proceedings of 15th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 1, 2020, pp. 114-118. doi: 10.1109/CSIT49958.2020.9321950. [8] M. Bublyk, A. Kowalska-Styczen, V. Lytvyn, V. Vysotska, The Ukrainian Economy Transformation into the Circular Based on Fuzzy-Logic Cluster Analysis, Energies 2021 (14) 5951. doi: 10.3390/en14185951. [9] A. Berko, I. Pelekh, L. Chyrun, M. Bublyk, I. Bobyk, Y. Matseliukh, L. Chyrun, Application of ontologies and meta-models for dynamic integration of weakly structured data, in: Proceedings of the IEEE 3rd International Conference on Data Stream Mining and Processing, DSMP, 2020, pp. 432-437. doi: 10.1109/DSMP47368.2020.9204321. [10] V.-A. Oliinyk, V. Vysotska, Y. Burov, K. Mykich, V. Basto-Fernandes, Propaganda Detection in Text Data Based on NLP and Machine Learning, CEUR workshop proceedings Vol-2631 (2020) 132-144. [11] R. Lynnyk, V. Vysotska, Y. Matseliukh, Y. Burov, L. Demkiv, A. Zaverbnyj, A. Sachenko, I. Shylinska, I. Yevseyeva, O. Bihun, DDOS Attacks Analysis Based on Machine Learning in Challenges of Global Changes, CEUR workshop proceedings Vol-2631 (2020) 159-171. [12] V. 
Vysotska, Linguistic Analysis of Textual Commercial Content for Information Resources Processing, in: Proceedings of the International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science, TCSET, 2016, pp. 709-713. doi: 10.1109/TCSET.2016.7452160. [13] V. Lytvyn, V. Vysotska, A. Rzheuskyi, Technology for the Psychological Portraits Formation of Social Networks Users for the IT Specialists Recruitment Based on Big Five, NLP and Big Data Analysis, CEUR Workshop Proceedings Vol-2392 (2019) 147-171. [14] V. Lytvyn, V. Vysotska, D. Dosyn, R. Holoschuk, Z. Rybchak, Application of Sentence Parsing for Determining Keywords in Ukrainian Texts, in: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 2017, pp. 326-331. doi: 10.1109/STC-CSIT.2017.8098797. [15] Y. Burov, V. Vysotska, P. Kravets, Ontological approach to plot analysis and modeling, CEUR Workshop Proceedings Vol-2362 (2019) 22-31. [16] V. Vysotska, O. Kanishcheva, Y. Hlavcheva, Authorship Identification of the Scientific Text in Ukrainian with Using the Lingvometry Methods, in: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 2018, pp. 34-38. doi: 10.1109/STC-CSIT.2018.8526735. [17] A. Gozhyj, I. Kalinina, V. Gozhyj, V. Vysotska, Web service interaction modeling with colored petri nets, in: Proceedings of the International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS, 1, 2019, pp. 319-323. doi: 10.1109/IDAACS.2019.8924400. [18] A. Gozhyj, I. Kalinina, V. Vysotska, S. Sachenko, R. Kovalchuk, Qualitative and Quantitative Characteristics Analysis for Information Security Risk Assessment in E-Commerce Systems, CEUR Workshop Proceedings Vol-2762 (2020) 177-190. [19] L. Podlesna, M. Bublyk, I. Grybyk, Y. Matseliukh, Y. Burov, P. Kravets, O. Lozynska, I. Karpov, I. Peleshchak, R.
Peleshchak, Optimization model of the buses number on the route based on queueing theory in a smart city, CEUR workshop proceedings Vol-2631 (2020) 502-515. [20] O. Bisikalo, O. Kovtun, V. Kovtun, V. Vysotska, Research of Pareto-Optimal Schemes of Control of Availability of the Information System for Critical Use, CEUR Workshop Proceedings Vol-2623 (2020) 174-193. [21] V. Vysotska, Ukrainian Participles Formation by the Generative Grammars Use, CEUR workshop proceedings Vol-2604 (2020) 407-427. [22] V. Vysotska, S. Holoshchuk, R. Holoshchuk, A comparative analysis for English and Ukrainian texts processing based on semantics and syntax approach, CEUR Workshop Proceedings Vol-2870 (2021) 311-356. [23] K. Tymoshenko, V. Vysotska, O. Kovtun, R. Holoshchuk, S. Holoshchuk, Real-time Ukrainian text recognition and voicing, CEUR Workshop Proceedings Vol-2870 (2021) 357-387. [24] Data Set, 2022. URL: https://www.kaggle.com/aishu200023/stackindex. [25] M. Bublyk, Y. Matseliukh, Small-batteries utilization analysis based on mathematical statistics methods in challenges of circular economy, CEUR workshop proceedings Vol-2870 (2021) 1594-1603. [26] Standard error, 2022. URL: https://ua.nesrakonk.ru/standard-error/. [27] Standard deviation, 2022. URL: https://studopedia.su/10_11382_standartne-vidhilennya.html. [28] Statistical models of marketing decisions taking into account the uncertainty factor, 2022. URL: https://excel2.ru/articles/uroven-znachimosti-i-uroven-nadezhnosti-v-ms-excel. [29] Grouping of statistical data - BukLib.net Library, 2022. URL: https://buklib.net/books/35946/. [30] Stack Overflow, 2022. URL: https://en.wikipedia.org/wiki/Stack_Overflow. [31] StackOverflow is more than just a repository of answers to stupid questions, 2022. URL: https://habr.com/ru/post/482232/. [32] TechTrend, 2022. URL: http://techtrend.com.ua/index.php?newsid=20844. [33] Graphic presentation of information, 2022.
URL: https://studopedia.com.ua/1_132145_grafichne-podannya-informatsii.html. [34] Construction of an interval variable sequence of continuous quantitative data, 2022. URL: https://stud.com.ua/93314/statistika/pobudova_intervalnogo_variatsiynogo_ryadu_bezperernih_kilkisnih_danih. [35] Forecasting the trend of the time series by algorithmic methods, 2022. URL: http://ubooks.com.ua/books/000269/inx42.php. [36] Wikideck, 2022. URL: https://wp-uk.wikideck.com/. [37] StackOverflow, 2022. URL: https://ru.stackoverflow.com. [38] P. Bidyuk, A. Gozhyj, I. Kalinina, V. Vysotska, Methods for Forecasting Nonlinear Non-Stationary Processes in Machine Learning, Communications in Computer and Information Science 1158 (2020) 470-485. doi: 10.1007/978-3-030-61656-4_32. [39] P. Bidyuk, A. Gozhyj, I. Kalinina, V. Vysotska, M. Vasilev, R. Malets, Forecasting Nonlinear Nonstationary Processes in Machine Learning Task, in: Proceedings of the IEEE 3rd International Conference on Data Stream Mining and Processing, DSMP, 2020, pp. 28-32. doi: 10.1109/DSMP47368.2020.9204077. [40] A. B. Lozynskyy, I. M. Romanyshyn, B. P. Rusyn, Intensity Estimation of Noise-Like Signal in Presence of Uncorrelated Pulse Interferences, Radioelectronics and Communications Systems 62(5) (2019) 214-222. doi: 10.3103/S0735272719050030. [41] N. Romanyshyn, Algorithm for Disclosing Artistic Concepts in the Correlation of Explicitness and Implicitness of Their Textual Manifestation, CEUR Workshop Proceedings Vol-2870 (2021) 719-730. [42] O. Rudenko, O. Bezsonov, Robust Training of ADALINA Based on the Criterion of the Maximum Correntropy in the Presence of Outliers and Correlated Noise, CEUR Workshop Proceedings Vol-2870 (2021) 1694-1705. [43] Y. Yusyn, T. Zabolotnia, Methods of Acceleration of Term Correlation Matrix Calculation in the Island Text Clustering Method, CEUR workshop proceedings Vol-2604 (2020) 140-150. [44] B. Rusyn, V. Ostap, O.
Ostap, A correlation method for fingerprint image recognition using spectral features, in: Proceedings of the International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science, TCSET 2002, 2002, pp. 219–220. doi: 10.1109/TCSET.2002.1015935. [45] A. Lozynskyy, I. Romanyshyn, B. Rusyn, V. Minialo, Robust Approach to Estimation of the Intensity of Noisy Signal with Additive Uncorrelated Impulse Interference. In: Proceedings of the 2018 IEEE 2nd International Conference on Data Stream Mining and Processing, DSMP 2018, 2018, pp. 251–254. doi: 10.1109/DSMP.2018.8478625. [46] N. Boyko, O. Moroz, Comparative Analysis of Regression Regularization Methods for Life Expectancy Prediction, CEUR Workshop Proceedings Vol-2917 (2021) 310-326. [47] L. Mochurad, Optimization of Regression Analysis by Conducting Parallel Calculations, CEUR Workshop Proceedings Vol-2870 (2021) 982-996. [48] R. Yurynets, Z. Yurynets, D. Dosyn, Y. Kis, Risk Assessment Technology of Crediting with the Use of Logistic Regression Model, CEUR Workshop Proceedings Vol-2362 (2019) 153-162. [49] A. Kucher, O. Boyko, K. Ilkanych, A. Fechan, N. Shakhovska, Retrospective analysis by multifactor regression in the evaluation of the results of fine-needle aspiration biopsy of thyroid nodules, CEUR Workshop Proceedings Vol-2753 (2020) 443–447. [50] O. Murzenko, S. Olszewski, O. Boskin, I. Lurie, N. Savina, M. Voronenko, V. Lytvynenko, Application of a combined approach for predicting a peptide-protein binding affinity using regulatory regression methods with advance reduction of features, in: Proceedings of the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS , 2019, 1, pp. 431–435, 8924244. doi: 10.1109/IDAACS.2019.8924244. [51] B. van Stein, H. Wang, W. Kowalczyk, T. Bäck, M. 
Emmerich, Optimally weighted cluster kriging for big data regression, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9385 (2015) 310–321. doi: 10.1007/978-3-319-24465-5_27. [52] C. L. M. Belusso, S. Sawicki, V. Basto-Fernandes, R. Z. Frantz, F. Roos-Frantz, Price modeling of IaaS providers using multiple regression [Modelagem de Preços de Provedores de IaaS Utilizando Regressão Múltipla], in: Iberian Conference on Information Systems and Technologies, CISTI, 2017. doi: 10.23919/CISTI.2017.7975845. [53] P. Kravets, Y. Burov, V. Lytvyn, V. Vysotska, Gaming method of ontology clusterization, Webology 16(1) (2019) 55-76. [54] P. Kravets, Y. Burov, O. Oborska, V. Vysotska, L. Dzyubyk, V. Lytvyn, Stochastic Game Model of Data Clustering, CEUR Workshop Proceedings Vol-2853 (2021) 214-227. [55] I. Lurie, V. Lytvynenko, S. Olszewski, M. Voronenko, A. Kornelyuk, U. Zhunissova, O. Boskin, The Use of Inductive Methods to Identify Subtypes of Glioblastomas in Gene Clustering, CEUR Workshop Proceedings Vol-2631 (2020) 406-418. [56] Y. Bodyanskiy, A. Shafronenko, I. Klymova, Adaptive Recovery of Distorted Data Based on Credibilistic Fuzzy Clustering Approach, CEUR Workshop Proceedings Vol-2870 (2021) 6-15. [57] Y. Meleshko, M. Yakymenko, S. Semenov, A Method of Detecting Bot Networks Based on Graph Clustering in the Recommendation System of Social Network, CEUR Workshop Proceedings Vol-2870 (2021) 1249-1261. [58] N. Boyko, S. Hetman, I. Kots, Comparison of Clustering Algorithms for Revenue and Cost Analysis, CEUR Workshop Proceedings Vol-2870 (2021) 1866-1877. [59] R. J. Kosarevych, B. P. Rusyn, V. V. Korniy, T. I. Kerod, Image Segmentation Based on the Evaluation of the Tendency of Image Elements to form Clusters with the Help of Point Field Characteristics, Cybernetics and Systems Analysis 51(5) (2015) 704-713. doi: 10.1007/s10559-015-9762-5. [60] S. Babichev, B. Durnyak, I. Pikh, V.
Senkivskyy, An Evaluation of the Objective Clustering Inductive Technology Effectiveness Implemented Using Density-Based and Agglomerative Hierarchical Clustering Algorithms, Advances in Intelligent Systems and Computing 1020 (2020) 532-553. doi:10.1007/978-3-030-26474-1_37. [61] S. Babichev, M. A. Taif, V. Lytvynenko, V. Osypenko, Criterial analysis of gene expression sequences to create the objective clustering inductive technology, in: Proceedings of the International Conference on Electronics and Nanotechnology, ELNANO, 2017, pp. 244–248. doi: 10.1109/ELNANO.2017.7939756. [62] S. Babichev, V. Lytvynenko, V. Osypenko, Implementation of the objective clustering inductive technology based on DBSCAN clustering algorithm, in: Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 2017, 1, pp. 479-484. doi: 10.1109/STC-CSIT.2017.8098832. [63] S. A. Babichev, A. Gozhyj, A. I. Kornelyuk, V. I. Lytvynenko, Objective clustering inductive technology of gene expression profiles based on SOTA clustering algorithm, Biopolymers and Cell 33(5) (2017) 379–392. doi: 10.7124/bc.000961. [64] V. Lytvynenko, I. Lurie, J. Krejci, M. Voronenko, N. Savina, M. A. Taif., Two Step Density-Based Object-Inductive Clustering Algorithm, CEUR Workshop Proceedings Vol-2386 (2019) 117-135. [65] S. Mashtalir, O. Mikhnova, M. Stolbovyi, Multidimensional Sequence Clustering with Adaptive Iterative Dynamic Time Warping, International Journal of Computing 18(1) (2019) 53-59. [66] R. Melnyk, R. Tushnytskyy, 4-D pattern structure features by three stages clustering algorithm for image analysis and classification, Pattern Analysis and Applications 16(2) (2013) 201-211. doi: 10.1007/s10044-013-0326-x. [67] R. Melnyk, R. Tushnytskyy, Circuit board image analysis by clustering, in: Proceeding of the 4th International Conference of Young Scientists on Perspective Technologies and Methods in MEMS Design, MEMSTECH, 2008, pp. 44-45. 
doi: 10.1109/MEMSTECH.2008.4558732. [68] N. Shakhovska, V. Yakovyna, N. Kryvinska, An improved software defect prediction algorithm using self-organizing maps combined with hierarchical clustering and data preprocessing, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12391 (2020) 414–424. doi: 10.1007/978-3-030-59003-1_27. [69] S. Babichev, V. Osypenko, V. Lytvynenko, M. Voronenko, M. Korobchynskyi, Comparison Analysis of Biclustering Algorithms with the use of Artificial Data and Gene Expression Profiles, in: Proceeding of the IEEE 38th International Conference on Electronics and Nanotechnology, ELNANO, 2018, pp. 298–304. doi: 10.1109/ELNANO.2018.8477439. [70] S. Babichev, J. Krejci, J. Bicanek, V. Lytvynenko, Gene expression sequences clustering based on the internal and external clustering quality criteria, in: Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 2017, 1, pp. 91–94. doi: 10.1109/STC-CSIT.2017.8098744. [71] S. Babichev, V. Lytvynenko, J. Skvor, J. Fiser, Model of the objective clustering inductive technology of gene expression profiles based on SOTA and DBSCAN clustering algorithms, Advances in Intelligent Systems and Computing 689 (2018) 21–39. doi: 10.1007/978-3-319-70581-1_2. [72] N. Shakhovska, V. Vysotska, L. Chyrun, Features of E-Learning Realization Using Virtual Research Laboratory, in: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 2016, pp. 143–148. doi: 10.1109/STC-CSIT.2016.7589891. [73] N. Shakhovska, V. Vysotska, L. Chyrun, Intelligent Systems Design of Distance Learning Realization for Modern Youth Promotion and Involvement in Independent Scientific Researches, Advances in Intelligent Systems and Computing 512 (2017) 175-198. doi: 10.1007/978-3-319-45991-2_12. [74] M. Emmerich, V. Lytvyn, I. Yevseyeva, V. B.
Fernandes, D. Dosyn, V. Vysotska, Preface: Modern Machine Learning Technologies and Data Science, CEUR Workshop Proceedings Vol-2386 (2019). [75] M. Emmerich, V. Lytvyn, V. Vysotska, V. Basto-Fernandes, V. Lytvynenko, Preface: Modern Machine Learning Technologies and Data Science, CEUR Workshop Proceedings Vol-2631 (2020). [76] M. Emmerich, V. Lytvyn, V. Vysotska, V. B. Fernandes, V. Lytvynenko, Preface: 3rd International Workshop on Modern Machine Learning Technologies and Data Science, CEUR Workshop Proceedings Vol-2917 (2021). [77] P. S. Malachivskyy, Y. V. Pizyur, V. A. Andrunyk, Chebyshev Approximation by the Sum of the Polynomial and Logarithmic Expression with Hermite Interpolation, Cybernetics and Systems Analysis 54(5) (2018) 765-770. doi: 10.1007/s10559-018-0078-0. [78] B. van Stein, H. Wang, W. Kowalczyk, M. Emmerich, T. Bäck, Cluster-based Kriging approximation algorithms for complexity reduction, Applied Intelligence 50(3) (2020) 778–791. doi: 10.1007/s10489-019-01549-7. [79] H. Wang, M. Emmerich, B. van Stein, T. Bäck, Time complexity reduction in efficient global optimization using cluster kriging, in: Proceedings of the 2017 Genetic and Evolutionary Computation Conference on GECCO, 2017, pp. 889–896. doi: 10.1145/3071178.3071321.