<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>COLINS-</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Analysis of Geo-Economic Distribution of Scientific Publications Citation and Self-Citation Standardized Indices Based on Machine Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michał Duda</string-name>
          <email>michal.duda@uwm.edu.pl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bohdan Korostynskyi</string-name>
          <email>bohdan.korostynskyi.sa.2019@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr Mediakov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victoria Vysotska</string-name>
          <email>victoria.a.vysotska@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oksana Markiv</string-name>
          <email>oksana.o.markiv@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>S. Bandera Street, 12, Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Osnabrück University</institution>
          ,
          <addr-line>Friedrich-Janssen-Str. 1, Osnabrück, 49076</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Warmia and Mazury in Olsztyn</institution>
          ,
          <addr-line>Michała Oczapowskiego Street, 2, Olsztyn, 10719</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>6</volume>
      <fpage>12</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>The article describes the logical sequence of processing, transforming and synthesizing data frames, and of visualizing and analyzing them, in order to study the geo-economic distribution of the numerical characteristics of article authorship and citation and to assess the presence or lack of linear and nonlinear relationships between individual author parameters and the percentage of self-citation. The work demonstrates how new and classical data visualization methods can be used to study patterns and relationships between numerical and nominal data, and how conventional multilayer perceptrons can be used to search for nonlinear relationships between multiple parameters. Open-source software designed to build the necessary data representations and models is an important part of the investigation.</p>
      </abstract>
      <kwd-group>
        <kwd>Scientific publications citation</kwd>
        <kwd>self-citation standardized indices</kwd>
        <kwd>machine learning</kwd>
        <kwd>dataset</kwd>
        <kwd>neural network</kwd>
        <kwd>correlation matrix</kwd>
        <kwd>correlation analysis</kwd>
        <kwd>regression analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The general development of mankind and globalization mean that most modern advances
in various fields of human activity now spread with incredible speed. Articles, books and other works
are authored, sold and distributed through a large number of different platforms that
are available to anyone. Such systems are the basis of virtuous exchange of knowledge.</p>
      <p>Modern scientific works are always based on certain previous results and, in accordance with the
generally accepted principles of the scientific world, the source of each result is always indicated.
Accordingly, each quality article has its own list of references, and each entry in that list counts,
from the point of view of the author of the referenced work, as a citation of that author.</p>
      <p>Naturally, many works cite many other works, forming complex graph-like connections. Numerical
indicators or indices derived from these connections are used to form ratings of the most influential,
popular or productive authors in the world.</p>
      <p>Since an author's position can be raised by self-citation, there is a cluster of individuals
who abuse it. That is why one of the tasks of this work is to check the relationship between different
numerical and nominal characteristics of an author and the level of that author's self-citation.</p>
      <p>However, the development of the scientific world rests on different foundations, including financial
ones, which extends the question to the geographical and economic distribution of scientific capacity in
terms of writing articles, citations, indices of authors and more. Thus, another task set in the paper is
to study the geographical and economic distribution by using certain economic and social indicators.</p>
      <p>EMAIL: Oleksandr.Mediakov.sa.2019@lpnu.ua (O. Mediakov)</p>
      <p>2022 Copyright for this paper by its authors.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        The first studies related to the evaluation of author citation concern the
derivation of a correct index or rating formula. The dataset used in this work contains two such indices:
the h-index and the hm-index. First of all, the quality of representing author citation with the h-index is
debatable, because many scholars and publicists point to the low efficiency and inadequacy of this
approach [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1-4</xref>
        ]. Since this work studies self-citation by comparing the h-indices of authors
with and without self-citation, the question of the correctness of such a study arises. Many papers deal
with direct citation analysis limited to a specific cluster of authors tied to publishing platforms. As a
result of such analysis, data have been obtained on the distribution of citations by the personal
parameters of authors as physical beings, as well as by their intellectual characteristics or areas such
as education [
        <xref ref-type="bibr" rid="ref1 ref3">1,3</xref>
        ].
      </p>
      <p>
        The problem of self-citation research is quite popular [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8 ref9">5-9</xref>
        ], which has led to the availability of many
sources of related data. The study of these data has so far been limited to cluster and regression analysis.
This paper therefore describes experiments that use neural networks to identify or refute the existence of
nonlinear, complex relationships between many possible parameters of an author and the author's level of
self-citation, and to identify patterns or limited human resources in certain areas. However, when the issue
of citation is extended to the nationalities of authors, there is a gap in the availability of exploratory
analysis of the number of self-citations by part of the world or by country, as most studies are unidirectional.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and Materials</title>
      <p>
        Studying the relationships between different parameters, proving their presence or absence, and
establishing their analytical form require a certain mathematical and algorithmic apparatus: classical
methods such as correlation analysis, regression analysis and the least squares method for
approximation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], as well as the backpropagation algorithm for training neural networks (including
multilayer perceptrons) and deep learning methods [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13">10-13</xref>
        ]. Some algorithms used in the work have a
high level of complexity and are used without additional changes or descriptions [
        <xref ref-type="bibr" rid="ref14 ref15 ref16">14-21</xref>
        ], because modifying or describing them is
outside the objectives of this research. They are relevant to content analysis and web resource
monitoring based on machine learning [22-30], for example, citation and self-citation analysis in
scientific and technical articles for dataset formation [31-36]. Such algorithms include the Adam
method, a stochastic algorithm for finding the numerical value of the FBZ gradient. However, some
algorithms for generating the required type of data, visualizing it or tokenizing it have been
supplemented or changed specifically for this research. One example is the supplemented algorithm
for filling the correlation matrix and building the corresponding heat map:
      </p>
      <p>Step 1. Initialize an identity matrix whose order equals the number of attributes. Every value that is
not on the main diagonal is set to the special NaN value.</p>
      <p>Step 2. Choose the type of triangular matrix.</p>
      <p>Step 3. Iterate over the cells of the matrix with the incremental indices i, j.</p>
      <p>Step 3.1. If i ≠ j:</p>
      <p>
        Step 3.1.1. If the upper triangular matrix is chosen and i&gt;j, then assign to cell (i, j) the value
of the Pearson correlation coefficient [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] between attributes number i and j.
      </p>
      <p>Step 3.1.2. If the lower triangular matrix is chosen and i&lt;j, then assign to cell (i, j) the value
of the correlation coefficient between attributes number i and j.</p>
      <p>Step 3.2. If i = j, then skip the step.</p>
      <p>Step 4. Choose two color nodes.</p>
      <p>Step 5. Interpolate the color node values on the interval [-1, 1]; for NaN values, return a
transparent color.</p>
      <p>Step 6. Fill the matrix with the appropriate color values.</p>
      <p>Step 7. Display the color table according to the values of the matrix cells.</p>
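      <p>Steps 1 to 3 above can be illustrated with a minimal sketch. It is written in plain Python rather than the R used in the study, the function names pearson and triangular_correlation are hypothetical, the toy attribute values are invented, and the convention that the upper triangle holds the cells whose row index is smaller than the column index is an assumption of this sketch.</p>
      <preformat><![CDATA[
```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def triangular_correlation(columns, upper=True):
    """Fill one triangle of the correlation matrix; the other stays NaN."""
    k = len(columns)
    # Step 1: identity matrix with NaN everywhere off the main diagonal.
    matrix = [[1.0 if i == j else math.nan for j in range(k)] for i in range(k)]
    for i in range(k):
        # Assumption of this sketch: "upper" means row index < column index.
        cols = range(i + 1, k) if upper else range(i)
        for j in cols:
            matrix[i][j] = pearson(columns[i], columns[j])
    return matrix


# Toy attributes (hypothetical values, not taken from the dataset).
citations = [10.0, 20.0, 30.0, 40.0]
h_index = [1.0, 2.0, 3.0, 5.0]
self_cit = [4.0, 3.0, 2.0, 1.0]
m = triangular_correlation([citations, h_index, self_cit], upper=True)
```
]]></preformat>
      <p>Cells left as NaN are the ones that the heat map later renders as transparent, so only one triangle of the symmetric matrix is drawn.</p>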
      <p>Another example is tokenization, which is used to convert some string fields of the data frames into
corresponding numerical vectors. Fig. 1 shows the activity diagram describing this algorithm.</p>
      <p>Figure 1: Activity diagram of the tokenization algorithm. Get the vector of strings; cyclically form
an array of unique values; remember the number of unique values; create an array of values from 0 to
length - 1; combine the unique values and the array of numbers into a hash table; cycle through the
initial array, replacing the string values with hash values.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>
        The study of the distribution of the selected parameters is based on several datasets: data about 160
thousand unique authors [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], world GDP information for 2020, and a dataset of the global
distribution of the Corruption Perceptions Index (CPI) for 2019.
      </p>
      <p>The first dataset, about authors, contains 46 fields and 161,441 records. Each field has an
encoded name, the description of which is given in Fig. 2. However, only some of the fields were actually
used in the study, namely: author country, number of author articles for the period 1960 - 2019, number
of author citations for 2019 (with and without self-citation, Fig. 3), author h-index for
2019 (with and without self-citation, Fig. 4), and the top category of the author
(from the categories of ScienceMetrix). For example, the general statistics on self-citation among the
160 thousand authors are as follows: below 1%, about 9,700 authors; above 12%, approximately 65,400;
and above 30%, approximately 7,500 (Fig. 6-7).</p>
      <p>
        The next dataset is the data on GDP of countries according to the World Bank (Fig. 8) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. From this
dataset, two fields were used in the work - the ISO3 representation of the country name and the actual
value of GDP.
      </p>
      <p>
        The third dataset used in the work (Fig. 9) contains information obtained from
Transparency International, the international organization against corruption [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This dataset is
suitable for separate study, but in this paper it is used as a source of information to test the hypothesis.
      </p>
      <p>To use, convert and process the data from these datasets in R, it is best to translate the obtained
data into csv format. Since one of the tasks of the investigation is to study the percentage of self-citation
depending on other parameters, and in the initial data this percentage is given as a string value, it is
necessary to develop a program for converting a column of data, for example:</p>
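      <p>A minimal sketch of such a column conversion is given here in Python (the study itself worked in R). The input format, with a trailing percent sign and an optional decimal comma, is an assumption, as are the helper name parse_percent and the sample cells.</p>
      <preformat><![CDATA[
```python
def parse_percent(text):
    """Convert a string such as '12.5%' or '0,8 %' to a float percentage."""
    # Assumed input format: optional spaces, a number, an optional '%' sign.
    cleaned = text.strip().rstrip("%").strip().replace(",", ".")
    # Empty cells become None so that they can be filtered out later.
    return float(cleaned) if cleaned else None


# Hypothetical raw column of self-citation percentages stored as strings.
raw_column = ["12.5%", " 0,8 %", "", "34%"]
self_citation = [parse_percent(cell) for cell in raw_column]
```
]]></preformat>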
      <p>The tidyverse family of packages was used for working with the data. Since an
important part of the work is visualizing many parameters simultaneously on graphs,
the basic principles of developing such graphs using ggplot2, ggridges and patchwork have been
considered. The main type of graph is the lollipop chart. The developed template looks like this:</p>
      <p>Replacing the variable F with any field available in the data frame makes it possible to build different
graphs, for example those in Fig. 10. Classic scatter plots with a regression line are also used. Example
code for creating one:</p>
      <p>Another important way of displaying the relationship between data is the heat map, the algorithm
for which was described in the previous section. In particular, the supplemented algorithm can be
performed as follows:</p>
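      <p>The color-mapping part of the heat map algorithm (choosing two color nodes, interpolating on [-1, 1] and returning a transparent color for NaN) can be sketched as below. This is an illustrative Python version and not the ggplot2 code of the study; the function name correlation_color and the blue and red color nodes are assumptions.</p>
      <preformat><![CDATA[
```python
import math


def correlation_color(value, low=(0, 0, 255), high=(255, 0, 0)):
    """Map a coefficient from [-1, 1] to an RGBA tuple between two color nodes."""
    if math.isnan(value):
        return (0, 0, 0, 0)  # fully transparent cell for the unused triangle
    t = (value + 1.0) / 2.0  # rescale [-1, 1] to [0, 1]
    # Linear interpolation between the two (assumed) color nodes.
    rgb = tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))
    return rgb + (255,)


# One row of a triangular correlation matrix (hypothetical values).
row = [1.0, -1.0, math.nan]
colors = [correlation_color(v) for v in row]
```
]]></preformat>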
      <p>A relatively new way of representing the properties of many samples, their distribution or their
relationship is the so-called joyplot, or ridgeline plot. In this paper it is built with a point gradient
by the following code:</p>
      <p>The results of this graph type are shown in Fig. 13.</p>
      <p>With the graphical representations described, the construction of some data processing algorithms
is demonstrated next. First of all, the method of converting the authors
dataset is considered. It is necessary to combine all records by country and then convert the
grouped fields into new ones: for example, the number of published articles by authors of one country
should be summed, the level of self-citation should be averaged, and the h-index of
country authors should be generalized.</p>
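      <p>The grouping described above can be sketched as follows, in Python rather than the R of the study. The record layout, the function names h_index and aggregate_by_country and the sample values are assumptions; the generalized h-index is taken here to be the h-index of the authors' h-indices, as described later in the Results section.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from statistics import mean


def h_index(values):
    """Largest h such that at least h of the values are not smaller than h."""
    ranked = sorted(values, reverse=True)
    h = 0
    for rank, value in enumerate(ranked, start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def aggregate_by_country(authors):
    """Sum articles, average self-citation and generalize the h-index."""
    groups = defaultdict(list)
    for author in authors:
        groups[author["country"]].append(author)
    summary = {}
    for country, rows in groups.items():
        summary[country] = {
            "articles": sum(r["articles"] for r in rows),
            "self_citation": mean(r["self_citation"] for r in rows),
            # h-index computed over the h-indices of the country's authors.
            "h_index": h_index([r["h"] for r in rows]),
        }
    return summary


# Hypothetical author records (not taken from the dataset).
authors = [
    {"country": "UKR", "articles": 40, "self_citation": 30.0, "h": 3},
    {"country": "UKR", "articles": 60, "self_citation": 40.0, "h": 2},
    {"country": "POL", "articles": 80, "self_citation": 10.0, "h": 5},
]
by_country = aggregate_by_country(authors)
```
]]></preformat>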
      <p>After creating the new data frame, it is possible to merge it with the two others, which share a common
field: country names. For a correct merge that does not produce NA values, it is first necessary to find
the intersection of the three sets of countries present in each dataset and then select only those countries
that are part of the new set:</p>
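      <p>A minimal Python sketch of this intersection-based merge follows (the study performed it in R). The dictionary layout, the function name merge_on_common_countries and all numeric values are hypothetical.</p>
      <preformat><![CDATA[
```python
def merge_on_common_countries(authors, gdp, cpi):
    """Inner-join three country-keyed tables so that no field is missing."""
    # Intersection of the three sets of country codes.
    common = set(authors) & set(gdp) & set(cpi)
    return {
        country: {**authors[country], "gdp": gdp[country], "cpi": cpi[country]}
        for country in sorted(common)
    }


# Hypothetical inputs keyed by ISO3 code; the numbers are placeholders.
authors = {"UKR": {"articles": 100}, "POL": {"articles": 80}, "XXX": {"articles": 1}}
gdp = {"UKR": 155.0, "POL": 596.0, "DEU": 3806.0}
cpi = {"UKR": 30, "POL": 58, "DEU": 80}
merged = merge_on_common_countries(authors, gdp, cpi)
```
]]></preformat>
      <p>Because only the countries present in all three inputs are kept, every merged record has all of its fields, which is the property the text requires.</p>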
      <p>In the part of the work devoted to the geo-economic distribution of the considered properties,
an exponential regression was fitted. If the built-in function lm is used with the formula
y ~ exp (x), the problem is that the best parameters are selected for a regression function of the form
f (x) = c1 + c2 exp (x), in which the base of the exponent is fixed.</p>
      <p>However, in the general case, the exponential function has the form a * b ^ x, which is why the
formula ln (y) ~ x is chosen to perform the correct regression analysis. It corresponds to the
function ln (f (x)) = c1 + c2 x, which is equivalent to f (x) = exp (c1) exp (c2 x), where a = exp (c1) and b =
exp (c2):</p>
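      <p>The log-linear fit described above can be checked with a short sketch. It is written in plain Python with an explicit least squares formula instead of the R function lm; the function name fit_exponential and the synthetic data are assumptions, and all y values must be positive for the logarithm to exist.</p>
      <preformat><![CDATA[
```python
import math


def fit_exponential(xs, ys):
    """Least squares fit of ln(y) = c1 + c2 * x; returns (a, b) for y = a * b ** x."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    # Slope and intercept of the straight line fitted to (x, ln y).
    c2 = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    c1 = mean_l - c2 * mean_x
    # Back-transform: a = exp(c1), b = exp(c2).
    return math.exp(c1), math.exp(c2)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * 2.0 ** x for x in xs]  # synthetic data generated from a = 3, b = 2
a, b = fit_exponential(xs, ys)
```
]]></preformat>
      <p>On this synthetic, noise-free input the fit recovers the generating parameters up to floating-point error; on real data the fit minimizes the squared error of ln (y), not of y itself.</p>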
      <p>The last important part of the description of the experiments is the development of the neural
networks. The paper presents four single-layer and multilayer perceptrons, i.e. fully connected networks,
that differ in architecture and in their set of activation functions (Fig. 11).</p>
      <p>TensorFlow was used as the Python library for building the perceptrons, and the model itself was built
using the Functional API:</p>
      <p>The following code was developed to build a dataset for training and testing the models from the
initial table about the authors:</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Examining the general relationship between the parameters is the first step in considering the data.
It was carried out by means of classical correlation analysis, by constructing a heat map of the correlation
matrix (Fig. 12).</p>
      <p>The first important aspect to highlight is the presence of a second sub-diagonal with
correlation coefficients close to 1. This linear data cloud is the correlation between citation parameters
(number, h-index, hm-index, etc.) with and without self-citation. The presence of such a correlation
means that, in the general case, self-citation does not have a strong impact on the performance of authors
and therefore is not used as self-plagiarism or as a means of increasing the author rating. The next
important aspect is the correlation of the self-citation percentage with the other parameters. Almost the
whole last column of the map from Fig. 7 is colored as zero correlation. This means that there is at least no
linear relationship between self-citation and the parameters presented in the dataset. The
hypothesis that there is no connection between the parameters available in the dataset and self-citation
is tested by building neural networks and analyzing their results.</p>
      <p>The next stage of the data review is the distribution of citation parameters and article publication by
country. First, it is necessary to create a data frame according to the algorithm described
in the previous section of this paper. The statistical distributions of the fields selected from
the dataset are considered first; in particular, they are plotted as a joyplot. Because the ranges and
intervals of the article counts, h-index, self-citation percentage and citations are very different, all
column values have been normalized. The diagram in Fig. 13 shows a certain similarity in the distribution of
these parameters: all have positive skewness and an excess kurtosis higher than that of the normal distribution.</p>
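      <p>The normalization and the two shape measures mentioned above can be sketched in Python as follows. The paper does not state which normalization was applied, so min-max scaling is an assumption here, as are the function names and the sample column; skewness and excess kurtosis are computed from central moments.</p>
      <preformat><![CDATA[
```python
def min_max(values):
    """Rescale values linearly to the interval [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def skewness_and_excess(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal law)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical right-skewed column, e.g. article counts per country.
articles = [1, 1, 1, 1, 10]
scaled = min_max(articles)
skew, excess = skewness_and_excess(scaled)
```
]]></preformat>
      <p>Both measures are unchanged by linear rescaling, so normalizing the columns makes the ridgelines comparable without altering the skewness or kurtosis that the figure is read for.</p>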
      <p>It is important to note that the correlation coefficient between the average percentage of
self-citations taken from the dataset and the percentage calculated from the grouped data is 0.95,
so the calculation of the average percentage can be considered correct. The next stage is
to consider the individual distributions of parameters by country. Fig. 14 shows the top 60 (unsorted
values) and the top 23 countries in terms of the total number of articles. The part of the chart with the
top 23 countries shows that the top is occupied by economically developed countries or large
economies; this observation suggests a correlation between economic
development and the number of articles. GDP was chosen as the indicator of economic development. As a
counterpart to GDP, an indicator of corruption in the states was added to obtain information on the
relationship between the level of corruption, in particular in the educational and scientific spheres, and the
abuse of self-citation among authors. The next parameter considered is the total number of citations of a
country's authors. The graph in Fig. 15 shows the top 60, the top 23 and the position of Ukraine. Each lollipop
has two labels: the larger one corresponds to citations including self-citation and the smaller one to
citations without it. This graph already shows a high correlation between these parameters.</p>
      <p>Ukraine ranks 49th in the number of articles, with 10029. Such a small number is due
to the incompleteness of the dataset, which includes only the most popular authors in the world.
Again, first place is held by the United States, second by China, third by Great Britain, and so on. Ukraine
ranks 65th, along with Hungary and Peru. The distribution of the percentage of self-citations by country
is given below; the graphical representation of the data is shown in Fig. 15. Unfortunately, by
this parameter Ukraine is in the top five, ranking fourth with a self-citation level of approximately
34%. The smallest numbers of self-citations are found in small Caribbean countries, such as Barbados
and Saint Lucia. In particular, the lack of a visual link between self-citation and the economic
development of the state suggests that there is no correlation between GDP and the percentage of
self-citation. The most interesting approach to considering the selected parameters by country was developed
for the h-index. Since h-index data lie on the debatable boundary between nominal and relative data
types, it is incorrect to characterize a country by the average value, so it was decided to calculate the
h-index of the h-indices of authors in a particular country. The algorithm for constructing such a parameter is
described in the previous section.</p>
      <p>The first graph from Fig. 16 is sorted by the number of citations for 2019, and it can be seen that for
the first 60 countries there is a correlation between these fields. Again, the United States ranks first,
with an h-index of 55. Ukraine ranks 68th, with an index of 16, along with a number of other countries.</p>
      <p>To display the global picture of the geographical distribution of these parameters, the graph in
Fig. 17 was constructed. Reading the graph is quite intuitive: countries are sorted by number of
articles, the number is indicated by the color of the lollipop, and the value of the lollipop consists of two
parts, the larger of which corresponds to the country h-index computed over author h-indices including
self-citation; the size of the lollipop indicates the percentage of self-citation. Some interesting points can
be seen from the graph: Russia is in the top 23 countries, but its percentage of self-citation clearly
stands out among the others, and its h-index also falls out of the general picture, because it is quite low
compared to neighboring countries.</p>
      <p>After the graphical exploratory analysis, the GDP and CPI values of the countries were added
to the data. The first step is correlation analysis. Fig. 18 shows the heat map of the correlation matrix.
This map shows a linear relationship between the h-indices of countries, which confirms
one of the previous assumptions. The correlation matrix also confirms the fairly obvious relationship
between the number of articles and citations (with and without self-citation). All correlation coefficients
with self-citation are negative. In particular, the correlation largest in absolute value is between
self-citation and the corruption index, which suggests a weak linear relationship between them: the higher the
CPI (i.e. the more transparent the various spheres of government are), the lower the level of self-citation
abuse. However, a direct relationship cannot be asserted.</p>
      <p>Another assumption concerns country GDP and the number of articles. The map shows that there
is a linear relationship between GDP and the number of citations. However, a somewhat deeper analysis was
done to confirm the connection. Scatter plots, i.e. correlation fields, were constructed between the
value of GDP and the number of citations of a country's authors for different numbers of countries, in
particular for 40, 90 and all available ones. Figure 19 shows that with a smaller sample the correlation
coefficient decreases significantly, and only with the inclusion of countries such as the United States,
China and the United Kingdom does the coefficient increase to the value obtained from the matrix. The fourth
graph from Fig. 18 concerns the relationship of self-citation to other parameters and is a
correlation field between the self-citation percentage and GDP. It shows chaos and a lack of
specific dependence. Additionally, the unexpected result of a weak linear relationship
between the h-index and both GDP and CPI was studied. A comparison of the correlation fields is shown in
Fig. 20. The first three graphs in Fig. 19 show a regression line constructed using the lm function. Similar
additions are made to the scatter charts in Fig. 20.</p>
      <p>The last mini-study based on the results of the correlation matrix is a regression analysis of the
relationship between the number of citations and the country h-index. Fig. 21-22 present a bar chart.
It shows that the approximating function must have an exponential form. The construction of this
model was described in the previous section.</p>
      <p>Before the results of the neural networks are described, one more observation about the distribution of
self-citation is given, this time through the prism of the field to which the author belongs (the top
field from the dataset). Similarly to the creation of the data frame about countries, it is obtained
by grouping the data records by category and country. After that, a three-dimensional histogram was
constructed (Fig. 23), where the third dimension describes the percentage of self-citation and is
encoded by color. Since the global view of the histogram is not the most comfortable for visual
analysis, a separate chart was built for a subset of countries. Fig. 24 shows that the highest level of
self-citation is in the field of mathematics and statistics in Mexico; however, Fig. 23 makes it
clear that the largest number of self-citations is found in the fields of physics and information
technology.</p>
      <p>The last point of the research is the construction of neural networks to study the existence of
nonlinear connections between the parameters available in the first dataset about the world's authors and
their self-citation. As described in the previous section, four different simple single-layer and multilayer
perceptrons were created with different activation functions and different random distributions for
initializing the weights and biases. After tokenization of the fields denoting the country,
institute and three different categories of the author, a new .csv file with a training and test sample was
created (Fig. 25). The resulting file contains 37 columns, i.e. 36 features, and 161,441
records. When forming the training and test samples, the ratio 80/20 was chosen.</p>
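      <p>The 80/20 division can be illustrated with a short Python sketch. Whether the records were shuffled before splitting is not stated in the paper, so the seeded shuffle is an assumption, as is the function name train_test_split; integer row indices stand in for the actual records.</p>
      <preformat><![CDATA[
```python
import random


def train_test_split(rows, train_share=0.8, seed=0):
    """Shuffle the rows reproducibly and cut them into two parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment of rows
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


# Row indices standing in for the 161,441 author records.
train, test = train_test_split(range(161441))
```
]]></preformat>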
      <p>Figure 23: Global histogram</p>
      <p>During training, the number of epochs (passes over the data) was 3, and accuracy was the only
metric selected. After training, the following results were obtained:</p>
      <p>The following results were obtained when testing the networks:</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussions</title>
      <p>A comprehensive study was carried out to test a number of assumptions about the
relationship between the numerical characteristics of the activity of the world's authors and the level of
their self-citation. The study did not reveal any patterns or relationships between the investigated
properties. Such results suggest that the level of self-citation depends on the individual characteristics,
motives, and level of intellectual and moral development of authors, as well as on general trends in science,
existing problems of society, and so on. A psychological analysis might show a link
between the abuse of self-citation and its use out of necessity.</p>
      <p>Further research requires expanding the subject area by finding other possible sources of data.</p>
      <p>On the other hand, the other part of the study, the geo-economic distribution of certain
parameters, has more positive results. The analysis confirmed a number of hypotheses, including
the connection between the economic development of a country and its ability to produce scientific
papers and articles of sufficient quality, since there is a link between author citation and country GDP.</p>
      <p>Subsequent analysis can be carried out with an in-depth breakdown of the components of GDP, possibly
with an expansion of the geographical areas by which authors are grouped.</p>
      <p>There may be a number of other indicators that influence the characteristics of a state in
terms of the authors of scientific articles. The paper considers the level of corruption in countries as an
example of such indices or parameters. In particular, data on the level of education, research spending
and the availability of private research and art centers are suitable for analysis.</p>
      <p>Accordingly, an analysis of the geo-economic distribution of parameters was made for a specific sample
of countries: Ukraine, Poland, Georgia, Slovakia and Romania.</p>
      <p>The graph in Fig. 28 shows that the largest number of articles was published in Poland, with Ukraine
in third place. The following graphs from Fig. 29 show that Poland ranks first in all rankings except
for self-citation, where first place, unfortunately, belongs to Ukraine.</p>
      <p>From the last graph in Fig. 30 it can be concluded that the existing dataset has ten Ukrainian authors
with a minimum h-index equal to 10. The generalized picture is shown in Fig. 31. Additionally, a graph
of the distribution of the self-citation percentage of these countries by author category has been
created. The corresponding histogram is shown in Fig. 32.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>As a result of the research, a study was conducted on the presence, analytical form or lack of links
in the geo-economic distribution of the human resources that produce scientific papers and articles.
The paper presents and uses possible approaches to characterizing a country, for example,
a double application of the h-index (the h-index of the h-indices of a country's authors). In a comprehensive
study of the parameters and generalized characteristics of the world's authors, the paper gives examples of
using classical and modern mathematical and information apparatus to find and prove the presence of patterns
in data, including the combined use of graphical data representation and small neural networks.</p>
      <p>8. References
[17] S. Babichev, B. Durnyak, O. Sharko, A. Sharko, Technique of metals strength properties
diagnostics based on the complex use of fuzzy inference system and hybrid neural network,
Communications in Computer and Information Science 1158 (2020) 114–126.
[18] P. Mukalov, O. Zelinskyi, R. Levkovych, P. Tarnavskyi, A. Pylyp, N. Shakhovska, Development
of System for Auto-Tagging Articles, Based on Neural Network, CEUR Workshop Proceedings
Vol-2362 (2019) 106–115.
[19] S. Leoshchenko, A. Oliinyk, S. Skrupsky, S. Subbotin, T. Zaiko, Parallel Method of Neural
Network Synthesis Based on a Modified Genetic Algorithm Application, CEUR Workshop
Proceedings Vol-2386 (2019) 11–23.
[20] I. Tsmots, M. Medykovskyy, O. Skorokhoda, Synthesis of hardware components for vertical-group
parallel neural networks, in: Proceedings of the International Conference on Computer Sciences
and Information Technologies, CSIT, 2015, 1–4.
[21] M. O. Medykovskyi, I. G. Tsmots, O. V. Skorokhoda, Spectrum neural network filtration
technology for improving the forecast accuracy of dynamic processes in economics, Actual
Problems of Economics 162(12) (2014) 410–416.
[22] N. B. Shakhovska, R. Yu. Noha, Methods and tools for text analysis of publications to study the
functioning of scientific schools, Journal of Automation and Information Sciences 47 (2015) 29–43.
[23] A. Berko, V. Andrunyk, L. Chyrun, M. Sorokovskyy, O. Oborska, O. Oryshchyn, M. Luchkevych,
O. Brodovska, The Content Analysis Method for the Information Resources Formation in
Electronic Content Commerce Systems, CEUR Workshop Proceedings 2870 (2021) 1632–1651.
[24] V. Kuchkovskiy, V. Andrunyk, M. Krylyshyn, L. Chyrun, A. Vysotskyi, S. Chyrun, N. Sokulska,
I. Brodovska, Application of Online Marketing Methods and SEO Technologies for Web
Resources Analysis within the Region, CEUR Workshop Proceedings 2870 (2021) 1652–1693.
[25] V. Lytvyn, V. Danylyk, M. Bublyk, L. Chyrun, V. Panasyuk, O. Korolenko, The lexical
innovations identification in English-language eurointegration discourse for the goods analysis
by comments in e-commerce resources, in: Proceedings of IEEE 16th International conference on
Computer science and information technologies, Lviv, Ukraine, 2021, 85–97.
[26] A. Gozhyj, L. Chyrun, A. Kowalska-Styczen, O. Lozynska, Uniform method of operative content
management in web systems, CEUR Workshop Proceedings 2136 (2018) 62–77.
[27] L. Chyrun, Y. Burov, B. Rusyn, L. Pohreliuk, O. Oleshek, A. Gozhyj, I. Bobyk, Web resource
changes monitoring system development, CEUR Workshop Proceedings 2386 (2019) 255–273.
[28] L. Chyrun, A. Kowalska-Styczen, Y. Burov, A. Berko, A. Vasevych, I. Pelekh, Y. Ryshkovets,
Heterogeneous data with agreed content aggregation system development, CEUR Workshop
Proceedings 2386 (2019) 35–54.
[29] B. Rusyn, L. Pohreliuk, A. Rzheuskyi, R. Kubik, Y. Ryshkovets, L. Chyrun, S. Chyrun, A.
Vysotskyi, V. B. Fernandes, The mobile application development based on online music library for
socializing in the world of bard songs and scouts’ bonfires, Advances in Intelligent Systems and
Computing 1080 (2020) 734–756. doi: 10.1007/978-3-030-33695-0_49.
[30] N. Antonyuk, L. Chyrun, V. Andrunyk, A. Vasevych, S. Chyrun, A. Gozhyj, I. Kalinina, Y.
Borzov, Medical news aggregation and ranking of taking into account the user needs, CEUR
Workshop Proceedings 2488 (2019) 369–382.
[31] T. Yu, G. Yu, M. Y. Wang, Classification method for detecting coercive self-citation in journals,
Journal of Informetrics 8(1) (2014) 123–135.
[32] T. Yu, G. Yu, Y. Song, M. Y. Wang, Toward the more effective identification of journals with
anomalous self-citation, Malaysian Journal of Library &amp; Information Science 23(2) (2018) 25–46.
[33] M. Szomszor, D. A. Pendlebury, J. Adams, How much is too much? The difference between
research influence and self-citation excess, Scientometrics 123(2) (2020) 1119–1147.
[34] R. H. Gálvez, Assessing author self-citation as a mechanism of relevant knowledge diffusion,
Scientometrics 111(3) (2017) 1801–1812.
[35] G. Abramo, C. A. D'Angelo, L. Grilli, The effects of citation-based research evaluation schemes
on self-citation behavior, Journal of Informetrics 15(4) (2021) 101204.
[36] Y. Liu, M. Chen, Applying text similarity algorithm to analyze the triangular citation behavior of
scientists, Applied Soft Computing 107 (2021) 107362.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kacem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Flatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mayr</surname>
          </string-name>
          ,
          <article-title>Tracking self-citations in academic publishing</article-title>
          ,
          <source>Scientometrics</source>
          <volume>123</volume>
          (
          <year>2020</year>
          )
          <fpage>1157</fpage>
          -
          <lpage>1165</lpage>
          . doi: 10.1007/s11192-020-03413-9.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <source>Corruption Perceptions Index</source>
          .
          <year>2019</year>
          . URL: https://www.transparency.org/en/cpi/2019/index/nzl
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Aksnes</surname>
          </string-name>
          ,
          <article-title>A macro study of self-citation</article-title>
          ,
          <source>Scientometrics</source>
          <volume>56</volume>
          (
          <year>2003</year>
          )
          <fpage>235</fpage>
          -
          <lpage>246</lpage>
          . doi: 10.1023/A:1021919228368.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Kaptay</surname>
          </string-name>
          ,
          <article-title>The k-index is introduced to replace the h-index to evaluate better the scientific excellence of individuals</article-title>
          ,
          <source>Heliyon</source>
          <volume>6</volume>
          (
          <issue>7</issue>
          ) (
          <year>2020</year>
          )
          e04415. doi: 10.1016/j.heliyon.2020.e04415.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V. A.</given-names>
            <surname>Bloomfield</surname>
          </string-name>
          ,
          <article-title>Using R for Numerical Analysis in Science and Engineering</article-title>
          . Minneapolis, USA: University of Minnesota,
          <year>2014</year>
          . URL: http://hsrmmathematik.de/SS2020/semester4/Datenanalyse-und-ScientificComputing-mit-R/book.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <source>World Bank GDP ranking</source>
          ,
          <year>2019</year>
          . URL: https://www.kaggle.com/theworldbank/world-bank-gdpranking.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Baas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Klavans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Boyack</surname>
          </string-name>
          ,
          <article-title>Supplementary data tables for "A standardized citation metrics author database annotated for scientific field" (PLoS Biology 2019)</article-title>
          ,
          <year>2019</year>
          . doi: 10.17632/btchxktzyw.1. URL: https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/1.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Baas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Boyack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P. A.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <article-title>Data for "Updated science-wide author databases of standardized citation indicators"</article-title>
          ,
          <year>2020</year>
          . doi: 10.17632/btchxktzyw.2. URL: https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/2.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Baas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Boyack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P. A.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <article-title>August 2021 data-update for "Updated science-wide author databases of standardized citation indicators"</article-title>
          ,
          <year>2021</year>
          . doi: 10.17632/btchxktzyw.3. URL: https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/3.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rusyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lutsyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kosarevych</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Obukh</surname>
          </string-name>
          ,
          <article-title>Application Peculiarities of Deep Learning Methods in the Problem of Big Datasets Classification</article-title>
          ,
          <source>Lecture Notes in Electrical Engineering</source>
          <volume>831</volume>
          (
          <year>2022</year>
          )
          <fpage>493</fpage>
          -
          <lpage>506</lpage>
          . doi: 10.1007/978-3-030-92435-5_28.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Emmerich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vysotska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. B.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvynenko</surname>
          </string-name>
          ,
          <article-title>Preface: 3rd International Workshop on Modern Machine Learning Technologies and Data Science (MoMLeT&amp;DS 2021)</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2917</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Polishchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Berko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chyrun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bublyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Schuchmann</surname>
          </string-name>
          ,
          <article-title>The rain prediction in Australia based Big Data analysis and machine learning technology</article-title>
          ,
          in:
          <source>Proceedings of IEEE 16th International conference on computer science and information technologies</source>
          ,
          <year>2021</year>
          ,
          <fpage>97</fpage>
          -
          <lpage>100</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Demchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Rusyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pohreliuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gozhyj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kalinina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chyrun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Antonyuk</surname>
          </string-name>
          ,
          <article-title>Commercial content distribution system based on neural network and machine learning</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2516</volume>
          (
          <year>2019</year>
          )
          <fpage>40</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvynenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wojcik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fefelov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lurie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Savina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Voronenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Boskin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Smailova</surname>
          </string-name>
          ,
          <article-title>Hybrid Methods of GMDH-Neural Networks Synthesis and Training for Solving Problems of Time Series Forecasting</article-title>
          ,
          <source>Lecture Notes in Computational Intelligence and Decision Making</source>
          <volume>1020</volume>
          (
          <year>2020</year>
          )
          <fpage>513</fpage>
          -
          <lpage>531</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Safonyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mishchanchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvynenko</surname>
          </string-name>
          ,
          <article-title>Intelligent information system for the determination of iron in coagulants based on a neural network</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2853</volume>
          (
          <year>2021</year>
          )
          <fpage>142</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>O.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Koretska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvynenko</surname>
          </string-name>
          ,
          <article-title>Intelligent modeling of unified communications systems using artificial neural networks</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2623</volume>
          (
          <year>2020</year>
          )
          <fpage>77</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>