<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mathematical Models of the Modern Educational Space: Virtual Communication, Processing of Natural Language Information, Normalization of Speech Signal</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ravshanbek Zulunov</string-name>
          <email>zulunovrm@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hryhorii Hnatiienko</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladyslav Hnatiienko</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Larysa Myrutenko</string-name>
          <email>myrutenko.lara@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tampere University of Applied Sciences</institution>
          ,
          <addr-line>4 Kalevantie, 33100 Tampere</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Taras Shevchenko National University of Kyiv</institution>
          ,
          <addr-line>64/13 Volodymyrska str., 01601 Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>323</fpage>
      <lpage>338</lpage>
      <abstract>
        <p>This paper reviews and improves the tools for researching the modern educational space. Particular attention is paid to the means of communication, which largely shape the educational space. In particular, the methodology of applying natural language information processing tools in the study of educational space is considered. Today, speech recognition is often used in video surveillance and access control systems, as well as in various mobile and cloud platforms. A speech recognition system is a technology that can convert human speech into text. It can work autonomously, or it can learn the pronunciation of a particular user. Voice recognition is a part of speech recognition technology. Voice identification is used in biometric verification to restrict access to personal files.</p>
      </abstract>
      <kwd-group>
        <kwd>virtual communication</kwd>
        <kwd>text data</kwd>
        <kwd>application methodology</kwd>
        <kwd>natural language information</kwd>
        <kwd>speech</kwd>
        <kwd>speech signal</kwd>
        <kwd>discrete signal</kwd>
        <kwd>signal normalization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Text data is an attribute of our civilization: we see it when we read books, newspapers, and other
printed materials, search for information on the Internet, use Facebook and Twitter, communicate
with each other on various forums, and so on. The amount of this data is growing exponentially.
About 80% of text data is unstructured text. These are Wikipedia articles, web pages, blogs, emails,
social media posts, e-books, etc. It is impossible to read and process all of this textual data, and to
extract the most useful information from it, it needs to be structured, organized, systematized, etc.
Thus, there is a need for tools that help people process unstructured texts more efficiently [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
Therefore, the involvement of computers in solving such tasks is quite natural [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        In addition, the article deals with speech, speech detection, and speech signal normalization.
Speech recognition is evolving nowadays. Today, speech recognition is often used in video
surveillance and access control systems, as well as in various mobile and cloud platforms. A speech
recognition system is a technology that can convert human speech into text [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. It can work
autonomously, or it can learn the pronunciation features of a particular user. Voice recognition is a
part of speech recognition technology [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. Voice identification is used in biometric verification to
restrict access to personal files. The system memorizes a person’s voice and distinguishes it from
other voices.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. The current state of research on the problem</title>
      <p>
        Speech is a historically formed form of communication between people through language
structures created based on certain rules. If air is used as a conductive medium for the transmission
of information (communication), then speech is obtained—a sound vibration characterized by
frequency and amplitude. Speech is an information carrier signal used by a person to transmit
messages. By its physical nature, it is an acoustic signal that changes continuously over time. To
emphasize the essence of this signal and distinguish it from other types of signals, speech is called
a speech signal in the technical literature. In addition, the terms “speech”, “speech signal” and
“spoken speech” are used interchangeably, except when it is necessary to emphasize the meaning
of a separate term [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Speech, as a mode of communication rooted in history, facilitates interaction among individuals
through language structures shaped by specific rules. When air serves as the conduit for
information transmission, speech emerges as a manifestation—a dynamic sound vibration defined
by its frequency and amplitude. This auditory phenomenon, inherently tied to the physical realm,
operates as an information carrier signal, enabling individuals to convey messages effectively.</p>
      <p>
        The very essence of speech lies in its acoustic nature: a continuous modulation of sound over
time. This continuous evolution of sound defines the dynamic quality inherent in speech, making it
a nuanced and adaptive means of expression. In the realm of technical literature, the term "speech
signal" is employed to accentuate the unique characteristics of this form of communication. This
categorization helps to distinguish speech signals from other types of signals that may exist in
various communication modalities [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Within the technical discourse, the interchangeable use of terms such as “speech,” “speech
signal,” and “spoken speech” prevails, demonstrating the fluidity in referencing this complex form
of communication. However, a nuanced approach is adopted when precision is paramount,
warranting the emphasis of a specific term to convey a distinct facet of the communication process.</p>
      <p>In delving into the intricacies of speech, it is crucial to recognize its dual identity as both a
historical construct and a contemporary tool for conveying information. The structured rules that
underpin language systems have evolved, shaping speech into a multifaceted vehicle for human
expression. By employing air as the medium for communication, speech transcends its historical
roots, embracing technological and scientific dimensions that characterize it as a signal—imbued
with meaning and purpose in the intricate tapestry of human interaction.</p>
      <p>Speech recognition technology, or Speech-to-Text (voice-to-text), appeared at the end of the last
century, but programs learned to convert human speech into text efficiently only in the 2000s, as
information technologies and machine learning developed. Today, speech recognition systems are widely used
in everyday life and business because they save significant resources.</p>
      <p>Speech recognition is a complex multi-stage algorithm, so we describe only its general principle of
operation. If you say “Taras Shevchenko” to voice search, the phone does not hear the name of the
famous writer, but a sound signal without clear boundaries. Based on this continuous signal, the
system reconstructs the phrase spoken by a person as follows:</p>
      <list list-type="order">
        <list-item>
          <p>First, the device records a voice request, and the neural network analyzes the speech
stream. The sound wave is divided into fragments called phonemes.</p>
        </list-item>
        <list-item>
          <p>The neural network then accesses its templates and matches the phonemes to a letter,
syllable, or word. Next, a word sequence is formed from words known to the program, and
unknown words are inserted according to the context.</p>
        </list-item>
      </list>
      <p>The result of combining information from these two stages is the translation of speech into text.</p>
      <p>At the dawn of development, the Speech-to-Text process consisted of an elementary acoustic
model—human speech was compared with patterns. However, the number of dictionaries in the
system was not enough for accurate recognition; the program often made mistakes.
Thanks to the learning ability of neural networks, the quality of speech recognition has increased
significantly. The algorithm knows the typical sequence of words in live speech and can perceive
the structure of the language; this is how the language model works. Each newly processed voice
input improves the processing of the next one, reducing the number of errors.</p>
      <p>Speech recognition technology allows us to search for the necessary information and to create a
route using the navigator. Here are a few other areas where Speech-to-Text has made life
easier:</p>
      <list list-type="bullet">
        <list-item>
          <p>Telephony. The technology saves not only the caller’s time but also the company’s
resources. Using voice dialing and a robot, customers can order goods, answer surveys, and
receive advice without the participation of managers.</p>
        </list-item>
        <list-item>
          <p>Household appliances and personal computers. Today you can control various devices with
your voice: switches, lighting systems, and gadgets. You can train your computer to
recognize your voice (with Windows and Mac systems).</p>
        </list-item>
      </list>
      <p>Speech recognition allows you to automate many business processes, from sales and customer
service control to protection from fraudsters.</p>
      <p>Using this technology, analytics of telephone conversations with customers has become easier
and cheaper: the system automatically records calls and collects data to increase conversion. For
example, the MANGO OFFICE speech analytics system helps you find out which competitors your
customers most often compare your product with. You create tags for competitor mentions,
analyze conversation reports, and understand how to improve your marketing strategy. You can
also analyze the work of employees—mark stop words, and monitor compliance with sales scripts.
If you need to transcribe speech from a video, you can download an audio file from it and upload it
to a speech analytics service. Speech on video must be clear, so use a microphone when speaking
on video.</p>
      <p>
        Another area where speech analytics helps business development is interactive voice systems
(IVR). It is an indispensable tool in call center management. Speech-to-Text recognizes the client’s
speech, and the voice robot automatically selects the necessary information to answer or transfers
the call to an operator. The technology reduces the number of abandoned calls, as many people are
late or unable to press buttons in the voice menu. Service control departments do not need to conduct
additional surveys: the surveys can be run automatically and the reports analyzed afterwards. Bank security
teams use speech analytics to protect customers’ data [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Mathematical model of virtual communication and some aspects of leadership psychology in virtual space</title>
      <p>
        The famous philosopher Johan Huizinga [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] convincingly proved that play, to a greater extent
than labor, was a formative element in human culture, that the most fundamental human activity
takes place in the field of play, and that all human culture was a form of play. The role of play has
become especially characteristic in the computer age, and for modern civilization, the Internet has
become a space that generates new forms of human interaction, new principles for designing their
interaction, problems of “virtual” freedom, and many other problems. Various forms of human
activity are carried out on the Internet: communicative, cognitive, commercial, and gaming. The
world in which we live is a fragment of the real world, which is perceived by humans through the
senses and can be described by the following model:
      </p>
      <p>
        <disp-formula>
          <label>(1)</label>
          <tex-math>F_1 = F_1\left(f_i^1,\, g_i^1,\; i \in I_1\right),</tex-math>
        </disp-formula>
where <inline-formula><tex-math>F_1</tex-math></inline-formula> is the function of perception by the senses, <inline-formula><tex-math>f_i^1</tex-math></inline-formula> are the human senses, <inline-formula><tex-math>g_i^1</tex-math></inline-formula> are the thresholds
of sensory sensitivity that act as filters that allow us to perceive only information that is essential
to the situation, and <inline-formula><tex-math>I_1</tex-math></inline-formula> is the set of indices of the senses: vision, hearing, smell, taste, touch, balance, and
the sense of body position in space. In virtual reality, the sensory thresholds change
significantly: some senses become more acute and hypertrophied, and the thresholds of others
become zero, meaning that these senses are not used in virtuality.</p>
      <p>Virtual reality, created by the Internet environment, is a space for human interaction. Electronic
means of communication have become an extension of the human nervous system and contribute
to the globalization of society. Virtual reality is characterized by the following features: it imposes
a new rhythm of life, contributes to changing people’s perception of the world, shifts and even
destroys the points of beginning and end of events, violates the principle of irreversibility as a
fundamental property of our real space-time, gives the right to make mistakes in the artificial
world, promotes anonymity and pseudonymity of communications in the virtual world, allows
staging human personality, etc. Virtual communication has some features that distinguish it from
real communication. Communication is a prerequisite for managerial actions—managers spend
about 80% of their working time on communication. Virtual communication can be described in
terms of the following functional dependency:</p>
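      <p>
        <disp-formula>
          <label>(2)</label>
          <tex-math>F_2 = F_2\left(f_i^2,\; i \in I_2\right),</tex-math>
        </disp-formula>
where <inline-formula><tex-math>f_1^2</tex-math></inline-formula> is anonymity, which is an effective way of managing the impression of oneself and
contributes to psychological emancipation, non-normativity, and freedom from the norms of
social roles; <inline-formula><tex-math>f_2^2</tex-math></inline-formula> is the resonance of communication, which, under the limited sensory capabilities of
virtual reality, becomes the primary root cause of emotional intimacy; <inline-formula><tex-math>f_3^2</tex-math></inline-formula> is the possibility of
expressing staged feelings to the interlocutor rather than real ones; <inline-formula><tex-math>f_4^2</tex-math></inline-formula> is the limited possibility of
expressing feelings, in the form of emoticons and textual interpretations; <inline-formula><tex-math>f_5^2</tex-math></inline-formula> is the voluntary nature
of virtual contacts and the possibility of their interruption at any time; <inline-formula><tex-math>f_6^2</tex-math></inline-formula> is the destruction of the
stable self-identification and individuality of the interlocutor.</p>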
      <p>One of the features of the 21st century is the massive creation of virtual organizations. The
activity of a virtual organization can be represented as the following function:</p>
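      <p>
        <disp-formula>
          <label>(3)</label>
          <tex-math>F_3 = F_3\left(f_i^3,\; i \in I_3\right),</tex-math>
        </disp-formula>
where <inline-formula><tex-math>f_1^3</tex-math></inline-formula> are the mission and vision of the organization; <inline-formula><tex-math>f_2^3</tex-math></inline-formula> is organizational culture; <inline-formula><tex-math>f_3^3</tex-math></inline-formula> is
organizational structure; <inline-formula><tex-math>f_4^3</tex-math></inline-formula> is the management structure of the organization; <inline-formula><tex-math>f_5^3</tex-math></inline-formula> are information
flows.</p>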
      <p>
        The features of a virtual organization are: functioning in the information space, voluntary
participation, independence of the organization’s members, free configuration of relationships
between members, common values of members in the information environment, limited
interaction, territorial distribution of members, etc. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In his theory of organization, Weber
pointed out the need for a clear formal fixation of organizational rules and norms [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>A game is a type of activity characterized by the interaction of players whose actions are limited
by rules and aimed at achieving a goal. The game has a strictly regulated hierarchy of players; it is
not declarative and indicative, but rigidly implemented and realized. At the same time, the
hierarchy in the game does not correlate with the social hierarchy. The impression of a player is
formed under the influence of the following factors: the player’s self-identification, desired and
undesired image, role restrictions, and cultural and ethical values.</p>
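      <p>
        To adequately describe the game, we should introduce the concept of a virtual charismatic
leader. Leadership in this case has two aspects: virtual power and virtual charisma. It is known that
power is one of the main aspects of an organization’s existence and provides the leader with 2/3 of
the influence necessary for leadership. Knowledge and experience in a real organization, according
to expert estimates, affect the result [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The activities of a leader in a virtual organization can be
represented in the form of two blocks of key competencies, three managerial and five leadership
competencies:
        <disp-formula>
          <label>(4)</label>
          <tex-math>F_4 = F_4\left(f_i^4,\; i \in I_4\right),</tex-math>
        </disp-formula>
where <inline-formula><tex-math>f_1^4</tex-math></inline-formula> is planning (goals, objectives, actions, resources, etc.), <inline-formula><tex-math>f_2^4</tex-math></inline-formula> is managing subordinates and
creating a control system, <inline-formula><tex-math>f_3^4</tex-math></inline-formula> is exercising control (monitoring activities, identifying problems, and
eliminating them), <inline-formula><tex-math>f_4^4</tex-math></inline-formula> is forming organizational strategy, goals, and organizational culture, <inline-formula><tex-math>f_5^4</tex-math></inline-formula> is
forming communications, coordinating coalitions, and managing relationships, <inline-formula><tex-math>f_6^4</tex-math></inline-formula> is motivation and
encouragement, <inline-formula><tex-math>f_7^4</tex-math></inline-formula> is forming and maintaining values, and <inline-formula><tex-math>f_8^4</tex-math></inline-formula> is training and development.
      </p>
      <p>The leader’s virtual behavior in the game can be formalized by the following
formula:</p>
      <p>
        <disp-formula>
          <label>(5)</label>
          <tex-math>F_5 = F_5\left(f_i^5,\; i \in I_5\right),</tex-math>
        </disp-formula>
where <inline-formula><tex-math>f_1^5</tex-math></inline-formula> are the natural properties of a person according to formula (1), <inline-formula><tex-math>f_2^5</tex-math></inline-formula> is behavior, a
function of type (2) of the natural properties of a person as a result of socialization, <inline-formula><tex-math>f_3^5</tex-math></inline-formula> is the
virtual environment according to formula (3) that has developed around the game, <inline-formula><tex-math>f_4^5</tex-math></inline-formula> is the real
external environment around the virtually charismatic leader as an individual, <inline-formula><tex-math>f_5^5</tex-math></inline-formula> are the material
and virtual resources available to players, <inline-formula><tex-math>f_6^5</tex-math></inline-formula> are the leadership qualities described by formula (4),
<inline-formula><tex-math>f_7^5</tex-math></inline-formula> are psychological characteristics, and <inline-formula><tex-math>f_8^5</tex-math></inline-formula> are the rules of the game.</p>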
      <p>
        Humans operate in natural, social, cultural, and other environments. With the development of
civilization, human presence in the information environment is increasing. Today, cyberspace has
become an integral part of the noosphere. The structure of cyberspace is determined by the
processes of creating, storing, transmitting, distributing, processing, consuming, and perceiving
information, procedures for interacting with social institutions, norms of cyberspace ethics, etc.
Mathematical modeling and decision theory methods can be successfully used to study virtual
space [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Psychological aspects of decision-making in virtual reality</title>
      <p>
        The transition from agrarian to industrial and then to post-industrial society was accompanied by
an increase in entropy, a loss of structure in society, and a tendency toward systematic fluidity and
disordered interrelationships. Entropy is a measure of unstructuredness, unpredictability, and
uncertainty of the values that describe certain objects. In terms of degrees of freedom, entropy can
be defined as a measure of the connectedness of an object’s degrees of freedom [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The more
degrees of freedom, the more unpredictable an object is, and the higher its entropy. The concept of
entropy is associated with disorder, equilibrium, homogeneity, equality, freedom, stability, and
ignorance, while the concept of negentropy is associated with order, disequilibrium, heterogeneity,
inequality, constraint, instability, and knowledge. Increasing connectivity between individuals in a
social system leads to a decrease in total entropy, and the greater the connectivity, the lower the
total entropy—it decreases by the amount of predictability that comes from connectivity [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Thus,
the way to reduce the entropy of a system is to increase the interconnectedness of the elements
and the coherence of their actions. Entropy is minimal in a rigidly hierarchical, predictable,
well-defined system. When people have some kind of relationship or ties (playful, business, family,
official, friendship, etc.), it becomes much easier to predict the behavior of such a social group.
Often, such ties lead to the coherence of actions, which translates into overall systemic behavior.
      </p>
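      <p>The claim that connectivity lowers total entropy can be illustrated with a small numerical sketch. It assumes Shannon entropy as the measure, which the text above does not specify, and models two individuals who each choose one of two actions. The probability values and the names <monospace>shannon_entropy</monospace>, <monospace>independent_joint</monospace>, and <monospace>coupled_joint</monospace> are hypothetical and chosen only for illustration.</p>
      <preformat>
```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Each individual chooses action 0 or 1 with equal probability.
marginal = [0.5, 0.5]

# No ties: the joint distribution is the product of the two marginals.
independent_joint = [0.25, 0.25, 0.25, 0.25]

# Strong ties (assumed values): both individuals act alike 90% of the time.
# The marginals are unchanged: 0.45 + 0.05 = 0.5 for each action.
coupled_joint = [0.45, 0.05, 0.05, 0.45]

h_single = shannon_entropy(marginal)
h_independent = shannon_entropy(independent_joint)
h_coupled = shannon_entropy(coupled_joint)

# The entropy lost through coupling equals the mutual information
# between the two individuals: I = H(A) + H(B) - H(A, B).
mutual_information = 2 * h_single - h_coupled

print(f"independent pair: {h_independent:.3f} bits")
print(f"coupled pair:     {h_coupled:.3f} bits")
print(f"reduction:        {mutual_information:.3f} bits")
```
      </preformat>
      <p>Under these assumptions the coupled pair has lower joint entropy than the independent pair, and the difference is exactly the mutual information, that is, the predictability that comes from the tie between the two individuals.</p>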
      <p>
        Game structures in social systems are low-entropy, ordered social integrities. The game space
has its own unconditional order. A positive feature of the game is that it creates order and organizes
actions. In an imperfect world, in real life, it creates a temporary, limited perfection [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The game
gives rise to a special form of human connection with the world in general and the social world in
particular. The intensive use of games in various forms by modern people can be considered a
subconscious response to the increasing entropy of post-industrial society.
      </p>
      <p>
        In recent decades, computer games have become extremely popular. In this case, a game is a
type of activity characterized by the interaction of players whose actions are limited by rules and
aimed at achieving a goal. A computer game is a way of transforming a human personality,
self-identification, and creative individuality. Players begin to subconsciously perceive the computer as
an extension of their personality in space. The main psychological features of a computer game can
be identified [
        <xref ref-type="bibr" rid="ref18 ref20">18, 20</xref>
        ], which influence decision-making in the game environment. The features of
this phenomenon and their detailed description are given in Table 1.
      </p>
      <sec id="sec-4-1">
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr>
                <th>Feature</th>
                <th>Detailed description</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>The game violates the basic property of time: its irreversibility</td>
                <td>The source of the game’s appeal lies in the change in a person’s relationship with time. In virtual reality, there is no such fundamental property of our world as irreversibility. A person can take any action and can go back a few steps in the game to prevent their own mistake</td>
              </tr>
              <tr>
                <td>Players can quickly create their new virtual heroic image</td>
                <td>At the same time, the player gets used to a different, more dynamic pace of life</td>
              </tr>
              <tr>
                <td>Virtual reality gives the right to make mistakes in the computer world</td>
                <td>Under the terms of the game, you can repeat the attempt to achieve the goal in a short time, while in real life you can wait for years for the opportunity to repeat the attempt</td>
              </tr>
              <tr>
                <td>The game has simple, tangible, artificial, but understandable rules</td>
                <td>The game is much simpler than life. The player is attracted to strict regulation, while in society there is always uncertainty. The friend-or-foe distinction in a game is much more transparent, clear, and less blurred than in the real world. The meaning of the game is always simpler than the meaning of life</td>
              </tr>
              <tr>
                <td>The game space has communication features</td>
                <td>The game space is characterized by a certain linguistic design: slang, jargon, the use of specific terms, and a special style of text that excludes the uninformed</td>
              </tr>
              <tr>
                <td>Anonymity of a computer game participant</td>
                <td>This leads to the fact that all hidden psychological complexes of the player are revealed, which strengthens the power of the irrational over the psyche</td>
              </tr>
              <tr>
                <td>Anonymity of communication in the game</td>
                <td>This enriches the possibilities of self-presentation, allowing the player to create an impression of himself in the virtual world of his choice and be whoever he wants to be</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">
        <p>All of the above factors affect decision-making in virtual reality: the irreversibility of time is violated, the pace
of events accelerates, the cost of a decision-making error is reduced, and all actions and reactions
to them are regulated. At the same time, the decision-maker is anonymous and heroized at will,
and the virtual scope of his or her activities is orders of magnitude larger than that available to him
or her in the real world.</p>
        <p>Computer games often serve as a psychological relief, a kind of psychological training. A person
is attracted to the game by the desire to relieve irritation, aggression, and the possibility of
transferring tension to a new object. In general, computer games are a way of social experience
that is important for personal development. Today, many users of modern computer technologies
believe that existence is possible only online. If a person is not known in cyberspace, it is as if he or
she does not exist in the real world at all.</p>
        <p>
          Play, to a greater extent than labor, was the basis for the formation of human culture [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], the
fundamental human activity takes place in the field of play. All human culture has been a form of
play. At the same time, in the virtual world, people are mostly interested in themselves: at all times,
the most important task is to find themselves and their like-minded people, as well as to study the
history of the world [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. In earlier times, this information was obtained through fairy tales,
legends, and memories, later through fiction, and in the modern world through computer games.
Today the game, like fiction in earlier periods of human development, serves as a guide: a means of
determining direction, a special prism through which the world looks different, and a criterion
for the compliance of this world with ideals.
        </p>
        <p>
          The special normative order, internal ethics, and morality of a computer game are focused on
maintaining and strengthening the internal game order [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. This order is intended to set the
system of internal relations and regulate the organization of internal relations to at least prevent
the growth of entropy, not to allow entropy indicators to go beyond the permissible limits. The
latter means that the internal regulatory system should, on the one hand, provide a certain freedom
of adaptive action, and at the same time, limit freedom to ensure internal coherence, orderliness,
and integrity. At the same time, such low-entropy orderliness should be supported by the inner
world of a person, his or her feeling of being needed, expedient, and demanded by the structure,
and this feeling dominates the desire for freedom and equality.
        </p>
        <p>Thus, the total appeal of the modern person to a computer game is a response to the increase in
entropy characteristic of modern civilization. Human nature is based on the desire for certainty,
and people feel confident when they have some control over their environment. The game is a
self-organizing counterpoint to the complex world. A game is a way to obtain order, stability, and
certainty, at least in the virtual world, for a while.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Methodology for the application of natural language information processing in the study of educational space</title>
      <sec id="sec-5-1">
        <title>5.1. SLAM algorithms</title>
        <p>Natural Language Processing (NLP) [21–26] addresses a broad set of problems: enabling computers to
perform useful tasks related to human language, to carry out high-quality text or speech processing,
to help people who speak different languages communicate, and, more generally, to support
communication between people and machines. NLP is a general
area of computer science, artificial intelligence, and computational linguistics whose main concern
is the interaction between computers and human (natural) languages. That is,
it is the processing of language, words, and speech by a computer: how to program computers to
process and analyze large amounts of data in a natural language.</p>
        <p>A “natural language” refers to a language used for everyday communication between people:
English, Ukrainian, Italian, etc. Unlike artificial languages, such as programming languages and
mathematical expressions, natural languages live, change, develop, and are passed down from
generation to generation.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Areas of NLP research</title>
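        <p>Today, the main areas of application are as follows:</p>
        <list list-type="bullet">
          <list-item><p>Information Retrieval</p></list-item>
          <list-item><p>Information Extraction</p></list-item>
          <list-item><p>Machine Translation</p></list-item>
          <list-item><p>Question-Answering Systems</p></list-item>
          <list-item><p>Dialogue Systems</p></list-item>
          <list-item><p>Speech Recognition</p></list-item>
          <list-item><p>Natural Language Generation</p></list-item>
          <list-item><p>Sentiment Analysis</p></list-item>
        </list>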
        <p>There are low-level and high-level NLP subtasks. High-level subtasks build on
low-level ones.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Low-level NLP subtasks</title>
        <p>Among the low-level subtasks, the following are usually distinguished:</p>
        <list list-type="bullet">
          <list-item>
            <p>Sentence boundary detection, or sentence boundary disambiguation. Abbreviations
complicate this task.</p>
          </list-item>
          <list-item>
            <p>Tokenization is the detection of individual tokens (words, punctuation marks) in a
sentence. A lexical analyzer, lexer, or tokenizer is a program or part of a program that
performs tokenization.</p>
          </list-item>
          <list-item>
            <p>Part-of-speech tagging is the automatic assignment of parts of speech or other grammatical tags to
elements in a text.</p>
          </list-item>
          <list-item>
            <p>Lemmatization is a method of morphological analysis that reduces a word form to its
original dictionary form (lemma): as a result of lemmatization, inflectional endings are
dropped from the word form, and the basic or dictionary form of the word is returned.</p>
          </list-item>
          <list-item>
            <p>Stemming is the process of reducing a word to its base by dropping auxiliary parts, such as
an ending or suffix.</p>
          </list-item>
          <list-item>
            <p>Shallow parsing, or chunking, is the grouping of a sequence of words into a phrase.</p>
          </list-item>
        </list>
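        <p>Three of the low-level subtasks listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production method: the abbreviation list, the suffix list, and the function names <monospace>split_sentences</monospace>, <monospace>tokenize</monospace>, and <monospace>naive_stem</monospace> are hypothetical, and the suffix stripper is a crude stand-in for an established stemming algorithm.</p>
        <preformat>
```python
import re

# Assumed, deliberately tiny abbreviation list; real systems use far larger ones.
ABBREVIATIONS = {"dr.", "prof.", "etc.", "e.g.", "i.e."}


def split_sentences(text):
    """End a sentence at a whitespace-delimited token that ends in '.', '!' or '?',
    unless that token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = token[-1] in ".!?" and token.lower() not in ABBREVIATIONS
        if ends_sentence:
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


def tokenize(sentence):
    """Return word tokens and single punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Assumed suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix if at least three characters remain."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


text = "Dr. Smith tested the parsers. They worked!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]
stems = [naive_stem(t) for t in tokens[0] if t.isalpha()]

print(sentences)
print(tokens)
print(stems)
```
        </preformat>
        <p>The abbreviation check is what keeps “Dr.” from ending the first sentence, which illustrates why abbreviations complicate boundary detection. A lemmatizer, unlike this stemmer, would need a dictionary or morphological rules to map irregular forms to their lemmas.</p>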
      </sec>
      <sec id="sec-5-4">
        <title>5.4. High-level subtasks</title>
        <p>Today, scientists most often refer to the following areas of applied research as high-level tasks:</p>
        <sec id="sec-5-4-1">
          <title>Spelling/grammatical error identification.</title>
          <p>Named-entity recognition.</p>
          <p>Word sense disambiguation is an unsolved problem of natural language processing:
the task of choosing the meaning (or sense) of a polysemous word or phrase
depending on the context in which it is found.</p>
          <p>Relationship extraction between named objects. Given a piece of text, you need to
determine the relationships between named objects.</p>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>5.5. Areas of research on educational space</title>
        <p>The main components of the methodology for applying natural language information processing in educational space research can be developed and implemented in several ways. These components and their detailed characteristics are summarized in Table 2.</p>
        <table-wrap>
          <label>Table 2</label>
          <caption>
            <p>Components of the methodology and their detailed characteristics</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Component</th>
                <th>Detailed description of the methodology components</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Component 1</td>
                <td>Monitoring of existing measures of similarity between texts and development of new measures of similarity if necessary</td>
              </tr>
              <tr>
                <td>Component 2</td>
                <td>Generating text annotations using 7 approaches and determining the similarity measures between the generated annotations to identify the tools that best generate text annotations in the selected field of knowledge</td>
              </tr>
              <tr>
                <td>Component 3</td>
                <td>Generation of abstracts of theses defended in Ukraine since 1991</td>
              </tr>
              <tr>
                <td>Component 4</td>
                <td>Generating annotations of dissertations that are in the public domain</td>
              </tr>
              <tr>
                <td>Component 5</td>
                <td>Identification of research areas in the field of education based on the analysis of abstracts of dissertations</td>
              </tr>
              <tr>
                <td>Component 6</td>
                <td>Clustering of research areas in the educational space based on automatic detection of these areas from abstracts of dissertations</td>
              </tr>
              <tr>
                <td>Component 7</td>
                <td>Determination of the similarity measures between dissertation annotations written by the dissertation author and those automatically generated by different approaches</td>
              </tr>
              <tr>
                <td>Component 8</td>
                <td>Construction of membership functions for annotations written by the dissertation author, based on calculated similarity measures with automatically generated annotations</td>
              </tr>
              <tr>
                <td>Component 9</td>
                <td>Automation of research on the level of internationality of educational and scientific events</td>
              </tr>
              <tr>
                <td>Component 10</td>
                <td>Classification of publications submitted to the scientific event according to the declared tracks (sections) of the educational and scientific event</td>
              </tr>
              <tr>
                <td>Component 11</td>
                <td>Automated determination of the level of novelty of scientific results and the quality of educational scientific activity of scientists by the formula for determining the best teacher</td>
              </tr>
              <tr>
                <td>Component 12</td>
                <td>Dynamic automatic supplementation of the scientific performance of teachers of higher education institutions</td>
              </tr>
              <tr>
                <td>Component 13</td>
                <td>Automatic determination of the number of self-citations of higher education teachers in scientific texts that are in the public domain</td>
              </tr>
              <tr>
                <td>Component 14</td>
                <td>Building graphs related to mutual citations of different authors in scientific papers</td>
              </tr>
              <tr>
                <td>Component 15</td>
                <td>Investigation of cycles in reference graphs in cases of indirect (cyclic) references</td>
              </tr>
              <tr>
                <td>Component 16</td>
                <td>Study of the level of cooperation of teachers of higher education institutions, i.e. the ratio of the total number of scientific papers written in co-authorship to the total number of co-authors and the number of published scientific papers</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
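        <p>Several of the components above rely on a similarity measure between two texts, for example between an annotation written by the dissertation author and an automatically generated one. The paper does not state which measures are used, so the following Python sketch is only one possible choice: the cosine similarity of word-count vectors. The function names and the sample strings are invented for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    a, b = bag_of_words(text_a), bag_of_words(text_b)
    shared = set(a).intersection(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical annotations: one written by an author, one generated automatically.
author_text = "speech recognition system"
generated_text = "speech synthesis system"
score = cosine_similarity(author_text, generated_text)
print(round(score, 3))  # two of three words are shared
```
]]></preformat>
        <p>The measure ranges from 0 (no shared words) to 1 (identical word counts). A real study would likely add stop-word removal, lemmatization, or weighting such as TF-IDF before comparing annotations.</p>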
      </sec>
      <sec id="sec-5-6">
        <title>5.6. Prospects for further research</title>
        <p>Modern research is increasingly focusing on:</p>
        <p>Unsupervised and semi-supervised learning algorithms, which can learn from data that has not been manually annotated or from a combination of annotated and unannotated data.</p>
        <p>Deep learning techniques, which give good results in language modeling and parsing.</p>
        <p>An intelligent system for sentiment (tone) analysis of Ukrainian texts.</p>
        <p>A spam filtering system for Ukrainian-language emails.</p>
        <p>A system for grammatical and morphological analysis of Ukrainian texts.</p>
        <p>A system for rhyming words to create poems or songs in Ukrainian.</p>
        <p>A system for determining verse meter (rhythm) in Ukrainian poems using neural networks.</p>
        <p>A system for automatically generating questions about Ukrainian-language texts.</p>
        <p>An intelligent system for stemming Ukrainian words.</p>
        <p>A lemmatizer for the Ukrainian language.</p>
        <p>A tokenizer for the Ukrainian language that takes the ambiguity of punctuation into account.</p>
        <p>Assessment of a person's psychological traits from their texts, based on the Big Five model.</p>
        <p>Analysis of the tone of Ukrainian-language texts based on the Big Five model.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Speech signal and its normalization</title>
      <sec id="sec-6-1">
        <title>6.1. Methods</title>
        <p>Most signals, including speech, are analog, so they are converted to discrete signals by analog-to-digital transformation (ADT) before processing in digital computers. This procedure yields a set of samples s[n]: instantaneous values of the continuous signal taken at intervals ∆ and treated as dimensionless numbers. The maximum and minimum sample values are determined by the ADT bit depth. For example, with a bit depth of 2 bytes (16 bits), the samples take values in the range [−2<sup>16−1</sup>, 2<sup>16−1</sup>−1], which also determines the storage required per sample. The sampling frequency f<sub>s</sub> is the inverse of the sampling period ∆. According to Kotelnikov’s theorem, an analog signal can be recovered from a discrete signal without loss only if the highest frequency f<sub>n</sub> of its spectrum is less than half the sampling rate [26]:</p>
        <p><disp-formula><tex-math>f_s &gt; 2 f_n</tex-math></disp-formula></p>
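        <p>The two numerical facts in this paragraph, the sample range for a given bit depth and the sampling condition, can be checked with a short Python sketch. The function names are illustrative, and signed integer samples are assumed.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def sample_range(bit_depth):
    """Smallest and largest value of a signed integer sample with this bit depth."""
    half = 2 ** (bit_depth - 1)
    return -half, half - 1


def satisfies_sampling_condition(sampling_rate_hz, highest_frequency_hz):
    """True when f_s > 2 * f_n, the condition for lossless recovery."""
    return sampling_rate_hz > 2 * highest_frequency_hz


print(sample_range(16))                           # (-32768, 32767)
print(satisfies_sampling_condition(16000, 7000))  # True
print(satisfies_sampling_condition(16000, 9000))  # False
```
]]></preformat>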
        <p>Digital signal processing (DSP) tools are used to describe and transform discrete signals. The most important DSP procedure is the Discrete Fourier Transform (DFT):</p>
        <p><disp-formula><tex-math>S[m] = \sum_{n=1}^{N} s[n] \, e^{-j 2 \pi n m / N}, \qquad m = 1, \ldots, N,</tex-math></disp-formula></p>
        <p>where N is the number of samples over which the N-point DFT is computed and j is the imaginary unit.</p>
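        <p>As a worked illustration of the transform, the following Python sketch evaluates the DFT sum directly. It assumes the conventional negative exponent and 0-based indices n, m = 0, ..., N−1, which shifts the indexing used in the text but does not change the magnitudes |S[m]| for this example. The test signal and all names are illustrative.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import cmath
import math


def dft(samples):
    """Direct O(N^2) DFT: S[m] = sum_n s[n] * exp(-j*2*pi*n*m/N), n, m = 0..N-1."""
    n_total = len(samples)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * n * m / n_total)
            for n, s in enumerate(samples))
        for m in range(n_total)
    ]


# Test signal: a cosine that completes exactly 3 cycles in N = 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]
magnitudes = [abs(value) for value in dft(signal)]

# Search the lower half of the spectrum for the strongest harmonic.
peak_bin = max(range(N // 2), key=lambda m: magnitudes[m])
print(peak_bin)                      # 3
print(round(magnitudes[peak_bin]))   # 8, i.e. N / 2 for a unit-amplitude cosine
```
]]></preformat>
        <p>For a real signal the magnitudes are mirrored around N/2, which is why only the lower half of the bins is searched for the peak. In practice a fast Fourier transform from a numerical library would replace the direct sum.</p>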
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Results and Discussions</title>
        <p>The DFT takes the signal into the frequency domain, i.e., it decomposes the signal into a set of harmonics and gives the amplitude (energy) of each harmonic as a function of its frequency. Fig. 1 shows a portion of the speech signal containing the vowel “a” in the time domain. To abstract from the ADT bit depth, the samples of the digitized signal are usually expressed in relative values [27]: either as fractions of the maximum value (which is fixed by the bit depth, here 2 bytes) or in decibels. The first representation is used in this work. N = 1024 samples were taken to compute the DFT; the result is shown in Fig. 2, where the horizontal axis is the frequency of the harmonic and the vertical axis is |S[m]|, the amplitude of the harmonic.</p>
        <p>Speech is a non-stationary signal, that is, its characteristics change over time. Fig. 3 shows the waveform of the word “Forward”. The changes can be visualized by plotting the DFT magnitudes for successive parts (frames) of the speech signal; the resulting image is called a spectrogram (Fig. 4). Figs. 2, 3, and 4 show that most of the energy is carried by frequencies up to 8 kHz [28]. Therefore, a typical choice of sampling rate when digitizing a speech signal is 16 kHz.</p>
        <p>Just as written words are formed from a limited set of characters, the alphabet of the language, spoken speech is built from a limited set of sound “letters” in all their variability. The smallest unit of speech that distinguishes meaning is the phoneme. The Ukrainian language has 38 phonemes: 6 vowels and 32 consonants. There is no single finer classification of phonemes, so Fig. 4 shows one of the combined options, which includes intersecting classes, for example, voiced and fricative, voiceless and plosive, etc. [29].</p>
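        <p>The frame-by-frame DFT analysis that underlies a spectrogram can be sketched in Python as follows. The frame length, the hop size, and the synthetic two-tone signal are assumptions chosen to keep the example small; they are not the settings used for the figures, and no windowing is applied.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import cmath
import math


def magnitude_spectrum(frame):
    """DFT magnitudes of one frame for the non-negative frequency bins 0..N/2."""
    n_total = len(frame)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * n * m / n_total)
                for n, s in enumerate(frame)))
        for m in range(n_total // 2 + 1)
    ]


def spectrogram(samples, frame_length, hop):
    """Magnitude spectra of successive frames taken every `hop` samples."""
    starts = range(0, len(samples) - frame_length + 1, hop)
    return [magnitude_spectrum(samples[start:start + frame_length])
            for start in starts]


def tone(frequency_hz, count, sampling_rate_hz):
    """Unit-amplitude sine wave with `count` samples."""
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(count)]


# Synthetic non-stationary signal at 16 kHz: a 1 kHz tone followed by a 3 kHz tone.
FS = 16000
FRAME = 64
signal = tone(1000, 256, FS) + tone(3000, 256, FS)

frames = spectrogram(signal, frame_length=FRAME, hop=FRAME)
bin_width_hz = FS / FRAME  # 250 Hz per DFT bin
peak_hz = [bin_width_hz * max(range(len(f)), key=lambda m: f[m]) for f in frames]
print(peak_hz)  # four frames peaking at 1000.0 Hz, then four at 3000.0 Hz
```
]]></preformat>
        <p>Each row of the result corresponds to one column of a spectrogram image. The dominant frequency moves from 1 kHz to 3 kHz halfway through the signal, which is the kind of change over time that a single DFT of the whole recording would hide.</p>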
        <p>When speech is recorded, several factors affect the amplitude of the audio signal: how loudly the speaker talks, the distance to the microphone, and so on. These factors lead to large variability in the loudness of the speech signal, which is especially noticeable when heterogeneous recording equipment is used. Amplitude normalization is applied to eliminate this spread in volume. After normalization, the signal amplitude lies within the limits [−∆/2, ∆/2] (Figs. 5 and 6). The samples of the normalized signal [29] are computed by the following formula:</p>
        <p><disp-formula><tex-math>\tilde{s}[n] = \frac{\Delta}{2 \max_{m} |s[m]|} \, s[n],</tex-math></disp-formula></p>
        <p>where ∆ is the width of the normalization zone, which is symmetric about the abscissa axis (for example, ∆ = 1 in Figs. 5 and 6).</p>
        <p>To evaluate loudness variability for pronunciations of one or several words, consider a set of examples, each a sequence of samples. Let M(q) be the average loudness of example q and M<sub>q</sub> the average loudness over all examples; then</p>
        <p><disp-formula><tex-math>D(q) = \left| 1 - \frac{M(q)}{M_q} \right|.</tex-math></disp-formula></p>
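        <p>Amplitude normalization and the variability estimate D(q) can be sketched in Python as follows. Two points are assumptions rather than statements of the paper: the scaling places the peak at ∆/2 so that the result stays within the stated limits, and M(q) is taken to be the mean absolute sample value of an example, since its exact definition is not given. The sample data are invented.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def normalize_amplitude(samples, delta=1.0):
    """Scale samples so that the largest absolute value becomes delta / 2."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent signal: nothing to scale
    return [delta / (2 * peak) * s for s in samples]


def mean_level(samples):
    """Assumed definition of M(q): mean absolute sample value of one example."""
    return sum(abs(s) for s in samples) / len(samples)


def variability(examples):
    """D(q) = |1 - M(q) / M| for each example; M is the mean of M(q) over examples."""
    levels = [mean_level(example) for example in examples]
    overall = sum(levels) / len(levels)
    return [abs(1 - level / overall) for level in levels]


# Two invented recordings of the same waveform, one four times louder.
quiet = [0.1, -0.2, 0.1, -0.2]
loud = [0.4, -0.8, 0.4, -0.8]

before = variability([quiet, loud])
after = variability([normalize_amplitude(quiet), normalize_amplitude(loud)])
print([round(d, 3) for d in before])  # [0.6, 0.6]
print([round(d, 3) for d in after])   # [0.0, 0.0]
```
]]></preformat>
        <p>Here the two examples differ only in gain, so normalization removes the variability completely. Real recordings also differ in waveform and length, which is why normalization only reduces D(q) in practice.</p>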
        <p>The original signals are illustrated in Figs. 7 and 9, and the normalized signals in Figs. 8 and 10.</p>
        <p>As the formulas above show, the result depends on the absolute values of the samples as well as on the number of samples in an example, so variability has to be evaluated over a collection of examples. Figs. 7 and 8 show D(q) before and after normalization, respectively, for 100 recorded examples of the word “three”, which belong to one class and are of approximately equal length. Loudness variability was 28.5% for the original examples and 14.3% for the normalized ones. Figs. 9 and 10 show D(q) for a speech base of 2000 examples with different pronunciations. Loudness variability was 25.8% for the original base and 23.11% for the normalized one [31, 32].</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>The study shows that normalization consistently reduces the variability of loudness across different words. These results were obtained on a speech base collected with a single set of equipment under uniform conditions, so normalization played, on the whole, a minor role. However, when speech signals are acquired under varying conditions, as in the operation of a real system, normalization is necessary.</p>
      <p>Voice-to-text technology simplifies everyday tasks and supports many professional fields. In business, speech-to-text is used to interact with clients and to process large volumes of data quickly. Analytics and voice robots can reduce costs, increase the average purchase value, and reveal the real needs of customers. Speech analytics automates call monitoring and saves time, which can raise sales conversion, improve the quality of service, and provide market feedback in plain language.</p>
      <p>Many challenges remain, but natural language processing has made significant progress in recent years. Its growing maturity is encouraging more and more companies to use it in their products or in their internal organization.</p>
      <sec id="sec-7-1">
        <title>Declaration on Generative AI</title>
        <p>While preparing this work, the authors used Grammarly Pro to correct text grammar and Strike Plagiarism to search for possible plagiarism. After using these tools, the authors reviewed and edited the content as needed and took full responsibility for the publication’s content.</p>
      </sec>
      <ref-list>
        <ref id="ref20">
          <mixed-citation>[20] L. Bevzenko, Agents of social change in a crisis society: Options for problematization and outlines of the conceptual framework of the study, Sociol. Theor. Meth. Mark. 4 (2020) 111–132.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] O. Romanovskyi, et al., Accuracy improvement of spoken language identification system for close-related languages, Advances in Computer Science for Engineering and Education VII, vol. 242 (2025) 35–52. doi:10.1007/978-3-031-84228-3_4</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] I. Iosifov, et al., Transferability evaluation of speech emotion recognition between different languages, Advances in Computer Science for Engineering and Education 134 (2022) 413–426. doi:10.1007/978-3-031-04812-8_35</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] I. Iosifov, O. Iosifova, V. Sokolov, Sentence segmentation from unformatted text using language modeling and sequence labeling approaches, in: IEEE 7th International Scientific and Practical Conference Problems of Infocommunications. Science and Technology (2020) 335–337. doi:10.1109/PICST51311.2020.9468084</mixed-citation>
        </ref>
        <ref id="ref24">
          <mixed-citation>[24] O. Iosifova, et al., Analysis of automatic speech recognition methods, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 2923 (2021) 252–257.</mixed-citation>
        </ref>
        <ref id="ref25">
          <mixed-citation>[25] O. Romanovskyi, et al., Automated pipeline for training dataset creation from unlabeled audios for automatic speech recognition, Advances in Computer Science for Engineering and Education IV, vol. 83 (2021) 25–36. doi:10.1007/978-3-030-80472-5_3</mixed-citation>
        </ref>
        <ref id="ref26">
          <mixed-citation>[26] M. Gales, S. Young, The application of hidden Markov models in speech recognition, Foundations and Trends in Signal Processing 1(3) (2008) 195–304. doi:10.1561/2000000004</mixed-citation>
        </ref>
        <ref id="ref27">
          <mixed-citation>[27] A. F. Voloshin, G. N. Gnatienko, E. V. Drobot, A method of indirect determination of intervals of weight coefficients of parameters for metricized relations between objects, J. Autom. Inf. Sci. 35(1–4) (2003). doi:10.1615/JAutomatInfScien.v35.i3.30</mixed-citation>
        </ref>
        <ref id="ref28">
          <mixed-citation>[28] H. Hnatiienko, et al., Method for determining the level of criticality elements when ensuring the functional stability of the system based on role analysis of elements, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 3654, 2024, 301–311.</mixed-citation>
        </ref>
        <ref id="ref29">
          <mixed-citation>[29] R. Zulunov, et al., Detecting mobile objects with AI using edge detection and background subtraction techniques, in: E3S Web of Conferences, vol. 508, 2024, 03004.</mixed-citation>
        </ref>
        <ref id="ref30">
          <mixed-citation>[30] R. Zulunov, et al., Building and predicting a neural network in PYTHON, in: E3S Web of Conferences, vol. 508, 2024, 04005.</mixed-citation>
        </ref>
        <ref id="ref31">
          <mixed-citation>[31] V. V. Byts, R. M. Zulunov, Specification of matrix algebra problems by reduction, J. Math. Sci. 71 (1994) 2719–2726.</mixed-citation>
        </ref>
        <ref id="ref32">
          <mixed-citation>[32] U. Akhundjanov, et al., Handwritten signature preprocessing for off-line recognition systems, in: E3S Web of Conferences, vol. 587, 2024, 03019.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>A. S.</surname>
          </string-name>
           Pillai,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tedesco</surname>
          </string-name>
          ,
          <article-title>Introduction to machine learning, deep learning, and natural language processing</article-title>
          ,
          <source>1st Edition</source>
          , CRC Press,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
             
            <surname>Tmienova</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
           
          <article-title>Sus, System of Intellectual Ukrainian language processing</article-title>
          ,
          <source>in: Selected Papers of the 18th International Scientific and Practical Conference “Information Technologies and Security” (ITS</source>
          <year>2019</year>
          ), vol.
          <volume>2577</volume>
          ,
          <year>2019</year>
          ,
          <fpage>199</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
             
            <surname>Bird</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
             
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <surname>E.</surname>
          </string-name>
           
          <article-title>Loper, Natural language processing with Python, Published by</article-title>
          <string-name>
            <surname>O'Reilly Media</surname>
          </string-name>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
             
            <surname>Ilarionov</surname>
          </string-name>
          , et al.,
          <article-title>Intelligent module for recognizing emotions by voice</article-title>
          ,
          <source>Adv. Inf. Technol</source>
          .
          <volume>1</volume>
          (
          <year>2021</year>
          )
          <fpage>46</fpage>
          -
          <lpage>52</lpage>
          . doi:
          <volume>10</volume>
          .17721/AIT.
          <year>2021</year>
          .
          <volume>1</volume>
          .
          <fpage>06</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>B. W.</surname>
          </string-name>
           Schuller,
          <article-title>Speech emotion recognition: Two decades in a nutshell, benchmarks, and ongoing trends</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>61</volume>
          (
          <issue>5</issue>
          ) (
          <year>2018</year>
          )
          <fpage>90</fpage>
          -
          <lpage>99</lpage>
          . doi:
          <volume>10</volume>
          .1145/3129340
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
             
            <surname>Huahu</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
           Jue,
          <string-name>
            <surname>Y.</surname>
          </string-name>
           
          <article-title>Jian, Application of speech emotion recognition in intelligent household robot</article-title>
          ,
          <source>in: International Conference on Artificial Intelligence and Computational Intelligence</source>
          , vol.
          <volume>1</volume>
          ,
          <year>2010</year>
          ,
          <fpage>537</fpage>
          -
          <lpage>541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
             
            <surname>Sailunaz</surname>
          </string-name>
          , et al.,
          <article-title>Emotion detection from text and speech: a survey</article-title>
          ,
          <source>Soc. Netw. Anal. Min</source>
          .
          <volume>8</volume>
          (
          <issue>1</issue>
          ) (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H. A.</given-names>
             
            <surname>Bourlard</surname>
          </string-name>
          , N. Morgan,
          <article-title>Connectionist speech recognition: A hybrid approach</article-title>
          , Kluwer Academic Publishers, Norwell, MA, USA,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] G. Cheng, et al.,
          <article-title>An exploration of dropout with LSTMs</article-title>
          , in: Interspeech,
          <year>2017</year>
          ,
          <fpage>1586</fpage>
          -
          <lpage>1590</lpage>
          . doi:
          <volume>10</volume>
          .21437/Interspeech.2017-
          <fpage>129</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
             
            <surname>Babenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
             
            <surname>Hnatiienko</surname>
          </string-name>
          ,
          <string-name>
            <surname>V.</surname>
          </string-name>
           
          <article-title>Vialkova, Modeling of the integrated quality assessment system of the information security management system</article-title>
          ,
          <source>in: 7th International Conference “Information Technology and Interactions”</source>
          ,
          <year>2020</year>
          ,
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname> Hnatiienko</surname>
          </string-name>
          , et al.,
          <article-title>Application of cluster analysis for condition assessment of banks in Ukraine</article-title>
          , in: 8th
          <source>International Scientific Conference “Information Technology and Implementation”</source>
          , vol.
          <volume>3179</volume>
          ,
          <year>2022</year>
          ,
          <fpage>112</fpage>
          -
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>J. Huizinga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Homo</given-names>
            <surname>Ludens</surname>
          </string-name>
          .
          <article-title>Experience in defining the game element of culture</article-title>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>M. </surname>
          </string-name>
          <article-title>Petrushkevych, Carnival features of communication in new media: Challenges of mass culture, sociocultural challenges of modernity: The need for theoretical understanding</article-title>
          ,
          <source>Ostroh</source>
          ,
          <year>2022</year>
          ,
          <fpage>103</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
             
            <surname>Palko</surname>
          </string-name>
          , et al.,
          <article-title>Cyber security risk modeling in distributed information systems</article-title>
          .
          <source>Appl. Sci</source>
          .
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <article-title>2393</article-title>
          . doi:
          <volume>10</volume>
          .3390/app13042393
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>S. P.</surname>
          </string-name>
           Robbins,
          <string-name>
            <surname>T.</surname>
          </string-name>
           A. Judge, Organizational behavior, 18th Edn. New York, NY: Pearson,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname> Hnatiienko</surname>
          </string-name>
          , et al.,
          <article-title>Application of expert decision-making technologies for fair evaluation in testing problems</article-title>
          , in: 20th
          <source>International Scientific and Practical Conference “Information Technologies and Security” (ITS</source>
          <year>2020</year>
          ), vol.
          <volume>2859</volume>
          ,
          <year>2021</year>
          ,
          <fpage>46</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>H.</given-names>
            <surname> Hnatiienko</surname>
          </string-name>
          , et al.,
          <article-title>Methods of identifying the correlation of Ukrainian scientific paradigms based on the study of defended dissertations</article-title>
          ,
          <source>in: 10th International Scientific Conference “Information Technology and Implementation” (IT&amp;I</source>
          <year>2023</year>
          ), vol.
          <volume>3646</volume>
          ,
          <year>2023</year>
          ,
          <fpage>64</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
             
            <surname>Kushnir</surname>
          </string-name>
          , Economy and society, Max Weber,
          <source>trans. from German</source>
          , Vsevit,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19] L. D. Bevzenko,
          <string-name>
            <surname>Social</surname>
          </string-name>
          self-organization.
          <source>Synergetic paradigm: possibilities of social interpretations</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>