<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Intelligent Information System for Generating a Scientist's Scientometrics Using Content Analysis Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mykola Dyvak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andriy Yushko</string-name>
          <email>a.yushko@wunu.edu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andriy Melnyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>West Ukrainian National University</institution>
          ,
          <addr-line>11 Lvivska Street, Ternopil, 46001</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper proposes methods and software tools for developing a scientometric profile of a researcher using content analysis techniques. A scientometric profile is a system of indicators that assesses a researcher's scientific productivity and influence. The growing volume of scientific information in various databases, such as Scopus and Web of Science, has made it challenging to manually track and analyze individual publishing activities. For scientific and higher education institutions, monitoring both the quantity and quality of publications is crucial. Additionally, understanding researchers' main areas of interest helps support their professional development and foster interdisciplinary collaboration. Existing tools for monitoring scientific metrics typically offer limited functionality, lack the ability to process large volumes of data efficiently, and struggle to filter irrelevant information automatically. This paper presents an approach to building a researcher's scientometric profile using content analysis, supported by large language models, specifically Ollama. A mathematical model was developed to filter out irrelevant publications based on the researcher's scientometric profile. The system for collecting and analyzing scientometric indicators was implemented, and experimental studies were conducted using profiles of researchers from West Ukrainian National University.</p>
      </abstract>
      <kwd-group>
        <kwd>intelligent information system</kwd>
        <kwd>scientometrics</kwd>
        <kwd>researcher</kwd>
        <kwd>content analysis methods</kwd>
        <kwd>large language model</kwd>
        <kwd>irrelevant publications</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>only allows you to collect information from scientific databases about publications, projects, grants
and participation in scientific events, but also forms a profile of a scientist, determining his
scientific interests. Using this profile, the system is able to filter irrelevant publications,
automatically assessing their relevance to the scientist's interests. This decision contributes to
increasing the efficiency of scientific activity, allowing to focus attention on really important and
relevant scientific achievements.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Algorithms and approaches for selecting keywords and determining the researcher's scientific interests</title>
      <p>Modern research actively uses algorithms for automatic analysis of text data to select keywords
that reflect the main scientific interests of the researcher. The development of such approaches is
aimed at simplifying the process of collecting, analyzing and systematizing scientific materials,
which allows not only to identify the main areas of work, but also to identify interdisciplinary
connections.</p>
      <p>The main methods used to analyze texts for the purpose of extracting keywords can be divided
into several categories:</p>
    </sec>
    <sec id="sec-3">
      <title>3. Statistical methods</title>
      <p>
        One of the basic approaches is to calculate the frequency with which terms are used in texts. The TF-IDF (Term Frequency-Inverse Document Frequency) metric is the most popular of the statistical methods: it takes into account both the frequency of a term in a document and its significance in the context of the entire corpus of texts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This increases the accuracy of extracting significant terms, since frequent but uninformative words receive less weight.
      </p>
      <p>Figure 1 shows an example of the implementation of the TF-IDF metric in the Python
programming language using the scikit-learn library.</p>
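      <p>A minimal sketch of such a TF-IDF computation with scikit-learn, over an illustrative three-abstract corpus rather than the exact data of Figure 1, might look like this:</p>
      <preformat>
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative corpus of three article abstracts (not the data from Figure 1)
corpus = [
    "Interval analysis methods for parameter identification of dynamic models.",
    "Deep learning models for named entity recognition in scientific texts.",
    "Web scraping techniques for collecting researcher metadata from websites.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(corpus)

# Print the top 10 keywords of the first abstract by TF-IDF weight
terms = np.array(vectorizer.get_feature_names_out())
weights = tfidf.toarray()[0]
for idx in weights.argsort()[::-1][:10]:
    if weights[idx] > 0:
        print(f"{terms[idx]}: {weights[idx]:.3f}")
      </preformat>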
      <p>As a result of executing the code, we get a table with the top 10 keywords and their TF-IDF values.</p>
      <p>The TF-IDF value of each keyword reflects its weight in the context of the article's abstract: the higher the TF-IDF value, the more important the term is for this text.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Rule-based methods</title>
      <p>
        Rule-based approaches, such as Named Entity Recognition (NER), allow the extraction of certain
categories of words, such as names of organizations, names of people, geographic locations, and
other important entities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In the Python programming language, you can use the transformers
library from Hugging Face, which allows you to load a pre-trained model for recognizing named
entities (Fig. 2).
      </p>
      <p>As you can see from the code above, we use the pipeline method with the pre-trained dbmdz/bert-large-cased-finetuned-conll03-english model, which is specially tuned for Named Entity Recognition (NER). The aggregation_strategy="simple" parameter aggregates the results for greater convenience.</p>
      <p>The next step is to run NER on the text. This produces a list of entities with the specified types (e.g., organizations, technology names, scientific concepts).</p>
      <p>After that, keyword filtering is performed by selecting entities that may be relevant, for example ORG (organizations) and MISC (miscellaneous terms such as technologies or scientific concepts).</p>
      <p>After passing through all the stages, we get a list of keywords selected from the text; in our case it is: Google Cloud and MapReduce.</p>
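      <p>Putting these stages together, a compact sketch of the pipeline, assuming the model named above and an illustrative input sentence, might look like this:</p>
      <preformat>
from transformers import pipeline

# Load the pre-trained NER model mentioned in the text
ner = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    aggregation_strategy="simple",
)

# Illustrative input text
text = "We ran MapReduce jobs on Google Cloud to index the publications."
entities = ner(text)

# Keep only entity types that may serve as keywords:
# ORG (organizations) and MISC (technologies, scientific concepts)
keywords = [e["word"] for e in entities if e["entity_group"] in ("ORG", "MISC")]
print(keywords)
      </preformat>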
    </sec>
    <sec id="sec-5">
      <title>5. Natural language processing (NLP) models</title>
      <p>Thanks to the development of natural language processing methods and the emergence of deep models such as BERT, GPT, and others, it became possible to significantly improve the accuracy of text analysis [8]. These models take into account the context of words, which makes it possible not only to highlight keywords but also to understand their relationships and semantic meaning.</p>
      <p>Figure 3 shows a code fragment for assigning categories to articles based on their abstracts using the ready-made facebook/bart-large-mnli model from the Transformers library.</p>
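      <p>A minimal sketch of such zero-shot categorization, with illustrative candidate categories rather than those of Figure 3, might be:</p>
      <preformat>
from transformers import pipeline

# Zero-shot classifier based on the facebook/bart-large-mnli model
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

abstract = (
    "The paper proposes methods and software tools for developing a "
    "scientometric profile of a researcher using content analysis."
)
# Illustrative candidate categories (assumptions, not those of Figure 3)
categories = ["scientometrics", "machine learning", "medicine", "economics"]

result = classifier(abstract, candidate_labels=categories)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")  # confidence level per category
      </preformat>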
      <p>In Figure 4, the output shows which categories most closely match each text, as well as the
confidence level of the model corresponding to each category.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Text vectorization</title>
      <p>To vectorize text and create its numerical representation, you can use the Word2Vec or Doc2Vec methods from the gensim library in Python. Word2Vec creates vectors for individual words, while Doc2Vec produces a vector representation for an entire document [9] (Fig. 5).</p>
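      <p>A small sketch of Doc2Vec vectorization with gensim, using three illustrative abstract texts and assumed training parameters, could look like this:</p>
      <preformat>
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Three illustrative abstract texts
texts = [
    "Interval methods for identification of model parameters.",
    "Methods of interval analysis for parameter identification of models.",
    "Web scraping for collecting researcher metadata.",
]
# Each document gets a tag so its vector can be looked up later
docs = [TaggedDocument(words=t.lower().split(), tags=[i]) for i, t in enumerate(texts)]

# Training parameters are illustrative assumptions
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

vectors = [model.dv[i] for i in range(len(texts))]
print(vectors[0][:5])  # first five components of the first document vector
      </preformat>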
      <p>As a result of executing such code, we get a vector representation for the three abstract texts, as shown in Figure 6.</p>
      <p>In the future, the obtained vectors can be used to compare the similarities between documents
or to cluster documents based on topics. For example, we can calculate cosine similarity between
vectors to find out how similar documents are to each other (Fig. 7).</p>
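      <p>For illustration, cosine similarity can be computed directly from its definition; the two document vectors below are made up:</p>
      <preformat>
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up document vectors for illustration
v1 = np.array([0.12, 0.80, 0.33, 0.05])
v2 = np.array([0.10, 0.78, 0.35, 0.07])
print(f"cosine similarity: {cosine_sim(v1, v2):.2f}")  # close to 1: very similar
      </preformat>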
      <p>The cosine similarity value of two documents of 0.98 indicates that these documents have a very
high level of similarity in terms of their vector representations. Cosine similarity measures the
angle between the vectors of two texts: a value close to 1 means that the vectors are nearly parallel,
indicating a high degree of similarity between the texts.</p>
      <p>These methods can be used separately or in combination to achieve more accurate results in determining a researcher's key scientific interests. Their use makes it possible to automate the analysis of scientific activity, which in turn contributes to forming a comprehensive researcher profile that reflects the dynamics of their scientific work and interdisciplinary connections.</p>
      <p>Each of the described methods has its own area of application and can complement the others in complex text analysis tasks. In the next section, we will look at how the Ollama platform and its powerful language models can be used to identify keywords in text. This approach makes it possible to apply the latest deep learning capabilities to improve the accuracy of extracting relevant terms and analyzing complex textual data.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Methodology for creating a scientometric portrait of a scientist using Ollama large language models</title>
      <p>A researcher's profile is a comprehensive description of the researcher's professional activities, scientific achievements, and interests. It includes such key elements as name, surname, position, academic title, scientific interests, number of published works, and participation in scientific grants and projects. Forming a scientist's profile is an important task, since the profile can be used to solve problems such as automated filtering of publications that match the researcher's scientific interests, or optimized selection of a scientific supervisor for young scientists and graduate students whose research topics coincide with the supervisor's scientific activity.</p>
      <p>To form a scientist's profile, it is first necessary to collect basic metadata, which becomes the foundation for further processing. Web scraping can help here, allowing basic information to be collected from the official website of the organization where the scientist works. This method provides automated extraction of such data as name, surname, position, academic title, range of scientific interests, and links to the author's scientometric profiles (Scopus, Web of Science, ORCID, Google Scholar, DSpace).</p>
      <p>The use of web scraping at the initial stage provides automatic filling of the profile with
publicly available information, which significantly reduces the time spent on manual data
collection and creates an accurate starting point for further analysis.</p>
      <p>To implement the web scraping process, you can use specialized libraries that automatically read and extract information from web pages [10,11]. For example, using the Cheerio library in JavaScript, it is possible to retrieve and process the HTML content of a page, extracting the required metadata such as name, title, academic interests, etc. The following example demonstrates the basic code for obtaining information about a scientist from the official website of the West Ukrainian National University, focusing on the necessary profile elements.</p>
      <p>Figure 8 shows a fragment of the code for parsing the metadata of scientists from the official
website of the organization.</p>
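      <p>The code in the figure uses Cheerio in JavaScript; an equivalent sketch in Python with requests and BeautifulSoup, using hypothetical CSS selectors rather than the site's actual markup, might be:</p>
      <preformat>
import requests
from bs4 import BeautifulSoup

# Hypothetical profile URL and CSS selectors, for illustration only;
# the real markup of the university site will differ.
URL = "https://www.wunu.edu.ua/staff/example-profile"

html = requests.get(URL, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

profile = {
    "name": soup.select_one(".profile-name").get_text(strip=True),
    "position": soup.select_one(".profile-position").get_text(strip=True),
    "interests": [li.get_text(strip=True) for li in soup.select(".interests li")],
    "orcid": soup.select_one("a[href*='orcid.org']")["href"],
}
print(profile)
      </preformat>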
      <p>Now that we have a basic set of metadata from web scraping, we can move on to the next step: fleshing out the scientist's profile using Ollama's large language models.</p>
      <p>Large Language Models (LLMs) are a powerful tool for analyzing and processing textual data thanks to their ability to understand context and extract meaningful units.</p>
      <p>The main advantage of Ollama is the ability to run and manage large language models (LLMs) locally on a computer, without the need for cloud services. This increases data confidentiality, reduces costs, and gives users full control over information processing [10].</p>
      <p>The models presented in the Ollama platform are specialized in the processing of scientific texts
and have a wide range of applications, such as automatic text classification, extraction of keywords
and phrases, identification of scientific interests and creation of a generalized profile.</p>
      <p>Table 1 provides a comparative analysis of the major language models supported by the Ollama
platform [12,13].</p>
      <p>The table shows the main features of the models and their advantages and disadvantages, allowing an appropriate model to be chosen for a particular task.</p>
      <p>[Table 1, recoverable fragment] Phi-3 (1.4B parameters, ~2.8 GB): specialized in research and scientific tasks, high accuracy; may be less effective in general tasks and needs tuning.</p>
      <p>Figure 9 demonstrates the process of forming a profile of a scientist, which includes the main
stages: data collection, analysis of text documents using Ollama models, parsing of scientific
interests, classification of information and final creation of a profile.</p>
    </sec>
    <sec id="sec-8">
      <title>8. A mathematical model of filtering irrelevant publications based on the profile of a scientist</title>
      <p>In scientometric databases, a problem often arises when, because the author's last name, first name, and patronymic coincide with someone else's, publications that do not belong to the scientist are added to the scientist's profile. This distorts the indicators of scientific activity and complicates an objective assessment of the researcher's contribution. Developing a mathematical model for filtering irrelevant publications based on the scientist's profile makes it possible to solve this problem effectively. To build mathematical models under conditions of limited data samples, it is advisable to use methods based on interval analysis [14-18]. Using detailed scholarly profile data, such as the author's research interests, affiliations, and other unique characteristics, it is possible to accurately identify the publications that actually belong to a particular scholar. This increases the accuracy of scientometric indicators and contributes to a more objective analysis of scientific activity.</p>
      <p>The process of building a mathematical model for filtering irrelevant publications based on the
profile of a scientist can be divided into several steps:</p>
      <p>Step 1. Formulation of the author's scientific interests.</p>
      <p>The author's scientific interests can be represented as a vector of keywords that describes the main areas of research. Let $I = (k_1, k_2, \ldots, k_n)$, where $k_i$ is a keyword or phrase describing the author's interests.</p>
      <p>Step 2. Vector representation of the publication.</p>
      <p>Each publication can also be represented as a vector of keywords. Let $P_j = (p_1, p_2, \ldots, p_m)$, where $p_i$ is a keyword or phrase associated with publication $P_j$.</p>
      <p>Step 3. Calculating relevance using cosine similarity.</p>
      <p>To measure the similarity between the author's scientific interests $I$ and the publication vector $P_j$, one can use the cosine similarity:</p>
      <p>$$\mathrm{relevance}(I, P_j) = \frac{\sum_{i=1}^{n} k_i \cdot p_i}{\sqrt{\sum_{i=1}^{n} k_i^2} \cdot \sqrt{\sum_{i=1}^{m} p_i^2}}, \qquad (1)$$</p>
      <p>where $\cdot$ is the scalar product operation. The value of $\mathrm{relevance}(I, P_j)$ ranges from 0 to 1, where a value close to 1 means high relevance.</p>
      <p>Step 4. Filtering of irrelevant publications.</p>
      <p>A publication $P_j$ is relevant if $\mathrm{relevance}(I, P_j) \geq T$. If the value of $\mathrm{relevance}(I, P_j)$ is less than the threshold $T$, the publication is considered irrelevant and is filtered out. So, as we can see, the resulting model allows irrelevant publications to be filtered out automatically based on the scientific profile of the author.</p>
    </sec>
    <sec id="sec-9">
      <title>9. Software implementation of the system for collecting and analyzing scientific and scientific-pedagogical activities of the academic team</title>
      <p>In the modern information society, it is important to have effective tools for collecting and analyzing scientific activity [19-21]. The developed system is aimed at automating the collection of data on the scientific and scientific-pedagogical achievements of the academic staff, filtering these data based on relevance to researchers' interests, and generating reports by university, faculty, and department, which improves the quality of management and planning of scientific work.</p>
      <p>Conventionally, our system can be divided into several interacting modules, namely:
1. The authorization and authentication module, which ensures secure user access to the
system, access management and protection of personal data;
2. Data collection module: responsible for obtaining information from scientometric databases
(for example, Scopus, Crossref, NRAT) and the profile of a scientist;
3. Data processing and analysis module: cleans, normalizes and pre-processes collected data
for preparation for further analysis;
4. Filtering module: implements a mathematical filtering model using machine learning
algorithms and criteria defined on the basis of the scientist's profile;
5. Reporting module: provides an opportunity to generate a general report on the scientific
activity of the university, faculty or department;
6. User interface: provides user interaction with the system, providing the ability to perform
CRUD operations with the main entities (for example, publications, dissertations, grants,
projects, scientific activities).</p>
      <p>The system architecture was implemented using modern technologies that ensure reliability, performance, and flexibility. The core technology stack is based on JavaScript as both the client-side and server-side programming language, which helps ensure codebase consistency and eases application development. The server part was developed using Node.js, which allows high-performance, scalable server applications to be created that can process requests in real time.</p>
      <p>To optimize the interaction between the client and the server, GraphQL is used: it gives the client the ability to request only the data it needs, which reduces the load on the network and server resources and, in turn, improves system performance when building complex queries.</p>
      <p>The MongoDB database acts as the data store, providing speed and flexibility when working with large volumes of unstructured data. It also supports efficient work with the various data types used to describe the structure of data from different scientometric information systems, and it scales easily in accordance with load and needs.</p>
      <p>In addition, the Ollama platform is integrated into the system, providing a mechanism for working with various machine learning and artificial intelligence models. Thanks to these capabilities, the system can determine the relevance of publications to a scientist's profile more accurately, capturing complex relationships in the data.</p>
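      <p>For illustration, the relevance-extraction step might query a locally running Ollama instance through its REST endpoint roughly as follows; the model name and prompt are assumptions:</p>
      <preformat>
import json
import urllib.request

# Standard endpoint of a locally running Ollama instance
OLLAMA_URL = "http://localhost:11434/api/generate"

def extract_interests(abstract: str, model: str = "llama3") -> str:
    # The model name and prompt are illustrative assumptions
    prompt = (
        "List the key scientific interests in this abstract "
        "as comma-separated keywords:\n" + abstract
    )
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(extract_interests("Methods for building a scientometric profile of a researcher."))
      </preformat>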
      <p>The system interface is built on the principles of intuitiveness and ease of use, giving users convenient access to the main functionality without the need for additional staff training.</p>
      <p>Figure 10 shows the initial screen of the page with the authorization and authentication forms.</p>
      <p>As the screenshot shows, the authorization form is quite simple: the user only needs to enter the e-mail address and the password created during registration in the system (Fig. 11).</p>
      <p>The registration form requires the user to fill in basic information about themselves, such as surname, first name, patronymic, position, faculty, department, etc. When registering, the employee must also specify their identifiers in other scientometric databases to enable automated information collection. Another key field of this form is the scientist's last name and first name in Latin script, as these data are needed to search for publications in the Crossref database.</p>
      <p>After successful authorization in the system, the user lands on the "Overview" page (Fig. 12), where they can see quantitative indicators of publication activity.</p>
      <p>As can be seen from the screenshot above, the user has the opportunity to filter all indicators by
faculty, department and publication period.</p>
      <p>It is also possible to generate a report for a specific division by clicking the "Download report" button. This option is available only to employees with appropriate access rights (for example, the head of a department, the dean of a faculty, or the vice-rector for scientific work).</p>
      <p>If the user has entered the system and no data has been added yet, they will see a welcome window and a button that launches synchronization of all publication activity from other scientometric databases (Fig. 13).</p>
      <p>After pressing the "Synchronization" button, a window will open with a description of the
databases in which information will be searched (Fig. 14).</p>
      <p>By going to the "Publications" section, the user will be able to view all the publications that the
system managed to find (Fig. 15).</p>
      <p>If the system could not find one of the author's publications, it can be added manually by pressing the corresponding button. A form will then open in which the user needs to fill in all the necessary fields (Fig. 16).</p>
      <p>The page for viewing the dissertations defended by the user, which also contains a form for adding entries, has a similar appearance (Fig. 17).</p>
      <p>The R&amp;D funding page displays a list of research and development (R&amp;D) funded projects,
including the name, manager, terms, amount of funding, type of funding, and category of each
project (Fig. 22).</p>
      <p>It is also possible to quickly search by faculty, department, and time period. As mentioned earlier, the system supports automatic creation of a scientist's profile, which can later be used for publication filtering tasks.</p>
      <p>As can be seen from the figure above, the user has the opportunity not only to view their profile, but also to edit the necessary information.</p>
    </sec>
    <sec id="sec-10">
      <title>10. Conclusion</title>
      <p>This work emphasizes the need for automation of the collection, processing, and analysis of publication activity in the modern scientific environment. The increase in the volume of scientific information complicates the manual control and analysis of data, especially in large academic groups. The developed system described in this paper not only provides automated collection of information from scientometric databases such as Scopus and Web of Science, but also forms a profile of a scientist, which includes information about their scientific interests, publications, grants, and participation in scientific events. This makes it possible to optimize the processes of managing scientific activities, making them more efficient and objective.</p>
      <p>An important part of the work is the use of modern algorithms for automatic text analysis, such
as TF-IDF, Named Entity Recognition (NER) and text vectorization, which contribute to the
selection of keywords and the identification of scientific interests of researchers. The application of
deep language models, such as BERT, GPT, as well as the capabilities of the Ollama platform for
localized processing of big data, allows you to achieve high accuracy in text analysis, taking into
account the semantic context and the relationship between terms.</p>
      <p>In addition, a mathematical model for filtering irrelevant publications based on the scientist's profile is built in this work; it solves the problem of separating the author's own works from misattributed ones, thereby significantly increasing the accuracy of scientometric indicators.</p>
      <p>Also, the use of a vector representation of scientific interests and publications with the calculation of cosine similarity is proposed for the first time. This approach contributes to an objective assessment of scientific contributions, reducing the risk of inaccuracies caused by coincidental matches of surnames or errors in databases.</p>
      <p>Another important component of this work is the integration of the Ollama platform into the developed system, which allows language models to be used for accurate identification of scientific interests, as well as for automatic categorization and clustering of scientific materials. This greatly simplifies the preparation of reports for scientific institutions, making it possible to quickly obtain aggregated data on the activities of the university, its faculties, and departments.</p>
    </sec>
    <sec id="sec-11">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Grammarly to check grammar and spelling, paraphrase, and reword the text. These tools helped identify and correct grammatical errors, typos, and other writing mistakes, improving the clarity and professionalism of the text. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the publication's content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.-M.</given-names>
            <surname>Petroșanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pîrjan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tăbușcă</surname>
          </string-name>
          ,
          <article-title>Tracing the influence of large language models across the most impactful scientific works</article-title>
          ,
          <source>Electronics</source>
          <volume>12</volume>
          .24 (
          <year>2023</year>
          )
          <fpage>4957</fpage>
          . https://doi.org/10.3390/electronics12244957.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Lutsiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Maksymyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Beshley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lavriv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Andrushchak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sachenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Vokorokos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gazda</surname>
          </string-name>
          ,
          <article-title>Deep semisupervised learning-based network anomaly detection in heterogeneous information systems</article-title>
          , Comput.,
          <source>Mater. &amp; Contin. 70.1</source>
          (
          <year>2022</year>
          )
          <fpage>413</fpage>
          -
          <lpage>431</lpage>
          . https://doi.org/10.32604/cmc.2022.018773.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sachenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kochan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Turchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tymchyshyn</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Vasylkiv</surname>
          </string-name>
          ,
          <article-title>"Intelligent nodes for distributed sensor network,"</article-title>
          <source>IMTC/99. Proceedings of the 16th IEEE Instrumentation and Measurement Technology Conference (Cat. No.99CH36309)</source>
          , Venice, Italy,
          <year>1999</year>
          , pp.
          <fpage>1479</fpage>
          -
          <lpage>1484</lpage>
          vol.
          <volume>3</volume>
          . https://doi.org/10.1109/IMTC.1999.776072
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Lytvyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vysotska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pukach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Nytrebych</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Demkiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Malanchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sachenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kovalchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Huzyk</surname>
          </string-name>
          ,
          <article-title>Analysis of the developed quantitative method for automatic attribution of scientific and technical text content written in Ukrainian</article-title>
          ,
          <source>EasternEuropean J. Enterp. Technol. 6</source>
          .
          <issue>2</issue>
          (
          <issue>96</issue>
          ) (
          <year>2018</year>
          )
          <fpage>19</fpage>
          -
          <lpage>31</lpage>
          . https://doi.org/10.15587/1729-4061.2018.149596.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Zaki Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Rodríguez</given-names>
            <surname>Díaz</surname>
          </string-name>
          ,
          <article-title>A methodology for machine-learning content analysis to define the key labels in the titles of online customer reviews with the rating evaluation</article-title>
          ,
          <source>Sustainability</source>
          <volume>14</volume>
          .15 (
          <year>2022</year>
          )
          <fpage>9183</fpage>
          . https://doi.org/10.3390/su14159183.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Research on the TF-IDF algorithm combined with semantics for automatic extraction of keywords from network news texts</article-title>
          ,
          <source>J. Intell. Syst. 33.1</source>
          (
          <year>2024</year>
          ). https://doi.org/10.1515/jisys-2023-0300.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garg</surname>
          </string-name>
          ,
          <article-title>Named entity recognition (NER) and relation extraction in scientific publications</article-title>
          ,
          <source>Int. J. Recent Technol. Eng. (IJRTE) 12.2</source>
          (
          <year>2023</year>
          )
          <fpage>110</fpage>
          -
          <lpage>113</lpage>
          . https://doi.org/10.35940/ijrte.b7846.0712223.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] L. M. Pham, H. C. The, LNLF-BERT: transformer for long document classification with multiple attention levels, IEEE Access (2024) 1. https://doi.org/10.1109/access.2024.3492102.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] H. D. Abubakar, M. Umar, Sentiment classification: review of text vectorization methods: bag of words, tf-idf, word2vec and doc2vec, SLU J. Sci. Technol. 4.1&amp;2 (2022) 27–33. https://doi.org/10.56471/slujst.v4i.266.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] H.-S. Lee, H.-S. Shim, Implementation of generative AI using metaverse-based LLM, Korea Ind. Technol. Converg. Soc. 29.2 (2024) 123–132. https://doi.org/10.29279/jitr.2024.29.2.123.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] M. Brown, A. Gruen, G. Maldoff, S. Messing, Z. Sanderson, M. Zimmer, Web scraping for research: legal, ethical, institutional, and scientific considerations, 2024. https://doi.org/10.48550/arXiv.2410.23432.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] D. P. Pau, F. M. Aymone, Forward learning of large language models by consumer devices, Electronics 13.2 (2024) 402. https://doi.org/10.3390/electronics13020402.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] C.-N. Hang, P.-D. Yu, R. Morabito, C.-W. Tan, Large language models meet next-generation networking technologies: A review, Future Internet 16.10 (2024) 365. https://doi.org/10.3390/fi16100365.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] M. Dyvak, P. Stakhiv, A. Pukas, Algorithms of parallel calculations in task of tolerance ellipsoidal estimation of interval model parameters, Bull. Pol. Acad. Sci. 60.1 (2012). https://doi.org/10.2478/v10175-012-0022-9.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] M. Dyvak, I. Voytyuk, N. Porplytsya, A. Pukas, Modeling the process of air pollution by harmful emissions from vehicles, in: 2018 14th international conference on advanced trends in radioelectronics, telecommunications and computer engineering (TCSET), 2018, pp. 1272–1276. https://doi.org/10.1109/TCSET.2018.8336426.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] N. Ocheretnyuk, I. Voytyuk, M. Dyvak, Y. Martsenyuk, Features of structure identification the macromodels for nonstationary fields of air pollutions from vehicles, in: Proceedings of international conference on modern problem of radio engineering, telecommunications and computer science, 2012, pp. 444–444.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] M. Dyvak, Parameters identification method of interval discrete dynamic models of air pollution based on artificial bee colony algorithm, in: 2020 10th international conference on advanced computer information technologies (ACIT), 2020, pp. 130–135. https://doi.org/10.1109/ACIT49673.2020.9208972.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] M. Dyvak, A. Pukas, I. Oliynyk, A. Melnyk, Selection the “saturated” block from interval system of linear algebraic equations for recurrent laryngeal nerve identification, in: 2018 IEEE second international conference on data stream mining &amp; processing (DSMP), 2018, pp. 444–448. https://doi.org/10.1109/DSMP.2018.8478528.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] M. Pirnau, M. A. Botezatu, I. Priescu, A. Hosszu, A. Tabusca, C. Coculescu, I. Oncioiu, Content analysis using specific natural language processing methods for big data, Electronics 13.3 (2024) 584. https://doi.org/10.3390/electronics13030584.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] M. Gkevrou, D. Stamovlasis, Illustration of a software-aided content analysis methodology applied to educational research, Educ. Sci. 12.5 (2022) 328. https://doi.org/10.3390/educsci12050328.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] N. Le, D. Tran, R. Sturgill, Content analysis of three-dimensional model technologies and applications for construction: current trends and future directions, Sensors 24.12 (2024) 3838. https://doi.org/10.3390/s24123838.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>