<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bibliometric analysis and experimental assessment of chatbot training approaches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roman O. Liashenko</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serhiy O. Semerikov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Academy of Cognitive and Natural Sciences</institution>
          ,
          <addr-line>54 Universytetskyi Ave., Kryvyi Rih, 50086</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Digitalisation of Education of the NAES of Ukraine</institution>
          ,
          <addr-line>9 M. Berlynskoho Str., Kyiv, 04060</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Kryvyi Rih National University</institution>
          ,
          <addr-line>11 Vitalii Matusevych Str., Kryvyi Rih, 50027</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Kryvyi Rih State Pedagogical University</institution>
          ,
          <addr-line>54 Universytetskyi Ave., Kryvyi Rih, 50086</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Zhytomyr Polytechnic State University</institution>
          ,
          <addr-line>103 Chudnivsyka Str., Zhytomyr, 10005</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>199</fpage>
      <lpage>225</lpage>
      <abstract>
        <p>The rapid development of natural language processing and artificial intelligence has led to growing interest in creating chatbots capable of engaging in human-like dialogue. This study conducts a bibliometric analysis of research on chatbot training to identify key concepts, trends, and promising directions. The analysis of 549 publications from the Scopus database using VOSviewer reveals four main clusters of research: natural language processing techniques, application of NLP technologies in society, use of machine learning for NLP, and chatbots in education and service sectors. To experimentally evaluate chatbot training approaches, two datasets are created from scientific publications and used to fine-tune GPT-2 language models. The results demonstrate the feasibility and effectiveness of transfer learning for adapting pre-trained models to domain-specific data. This research provides insights into the state-of-the-art in chatbot development and highlights opportunities for future work on creating specialized conversational agents.</p>
      </abstract>
      <kwd-group>
        <kwd>chatbot</kwd>
        <kwd>natural language processing</kwd>
        <kwd>bibliometric analysis</kwd>
        <kwd>transfer learning</kwd>
        <kwd>GPT-2</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rapid development of artificial intelligence (AI) and natural language processing (NLP) technologies
has sparked a surge of interest in creating software agents capable of engaging in human-like dialogue,
commonly known as chatbots. Leading technology companies such as Google [1], Microsoft [2], Meta
(Facebook) [3], and OpenAI [4] are heavily investing in chatbot development. The success of projects
like OpenAI’s ChatGPT [5] highlights the immense potential for deploying such systems across various
domains of human activity. Recent research on chatbots predominantly focuses on two main areas:
(
        <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
        ) utilizing chatbots for student education and learning [6]; and (
        <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
        ) developing question-answering
systems that train their own models using user-provided data [7]. However, the effective training of
chatbots also raises critical scientific and practical challenges related to system reliability, safety, and
ethics.
      </p>
      <p>As AI systems capable of understanding and generating natural language, chatbots have the potential
to revolutionize various aspects of human activities [8]. They can automate routine tasks, provide
intelligent user support, enable personalized learning experiences, and much more. Breakthroughs in
chatbot development could fundamentally transform how humans interact with computer systems,
ultimately boosting productivity and enhancing learning outcomes. Nevertheless, addressing the ethical
and security implications of chatbot technologies is crucial to ensure their responsible deployment for
the benefit of society.</p>
      <p>This study aims to advance the field of chatbot development by conducting a comprehensive
bibliometric analysis of research on chatbot training and experimentally evaluating key training approaches.
The bibliometric analysis will identify trends, key concepts, and knowledge gaps in the existing
literature, providing valuable insights to guide future research efforts. The experimental assessment will
involve fine-tuning state-of-the-art language models on domain-specific datasets to create specialized
chatbots. The findings will demonstrate best practices and practical strategies for adapting chatbots to
target domains.</p>
      <p>The main objectives of this research are threefold:
1. Conduct a bibliometric analysis of publications on chatbot training to identify key research
concepts, trends, and promising directions for future work.
2. Compare the performance of popular chatbot training models and establish evaluation criteria
for assessing their effectiveness.
3. Develop two domain-specific datasets and experimentally evaluate the performance of fine-tuned
large language models on these datasets.</p>
      <p>To achieve these objectives, we employ a multi-method approach. First, we perform bibliometric
analysis on search results from the Scopus database to: (a) identify the chronological boundary marking
a steady increase in chatbot training publications; (b) construct a map of keyword co-occurrences;
(c) cluster keywords into thematic groups; and (d) determine the central research concepts (as described
in our previous works [9, 10]). Second, we use state-of-the-art language models – namely GPT-3.5,
GPT-4.0, Google Bard, and Claude 2 – to generate cluster names and descriptions based on the keyword
analysis. Finally, we apply software engineering methods, including system design, implementation,
testing, and experimental trials, to fine-tune the language models and evaluate their performance.</p>
      <p>The results of this study have both theoretical and practical implications. The bibliometric analysis
provides a systematic mapping of the chatbot training research landscape, identifying key themes,
trends, and research gaps. This can guide researchers in identifying promising avenues for future work
and inform funding agencies about areas needing further investigation. The experimental findings
offer organizations actionable insights into effective methods for creating domain-specific chatbots
using transfer learning techniques. This can help businesses harness the power of conversational AI to
enhance their operations and customer experiences.</p>
      <p>The remainder of this paper is structured as follows. Section 2 presents the bibliometric analysis,
describing the methodology and discussing the results. Section 3 reviews the main approaches to chatbot
training, including supervised learning, reinforcement learning, and transfer learning, and outlines
evaluation metrics. Section 4 details the experimental procedure, dataset creation, model selection,
and fine-tuning process. Finally, section 5 concludes the paper by summarizing the key findings and
outlining directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Bibliometric analysis</title>
      <sec id="sec-2-1">
        <title>2.1. Rationale</title>
        <p>Both scientists and IT company developers are actively working in the field of creating and training
chatbots. Certain aspects of this issue are covered in a number of publications. In particular, the search
results in the DeepLearning.AI blog [11] for 2019-2023 provide an opportunity to highlight the following
practically solved tasks:
• since 2020, chatbots (Generative BST from Facebook and Google Meena) can be used for short
dialogues on general topics [12, 13];
• since 2023, Microsoft has provided the ability to use chatbots in Office 365 and Windows to boost
productivity [14];
• since 2020, chatbots have been actively used in business for customer service, sales, etc. [15];
• since 2023, Google and Microsoft have been introducing chatbots into search engines [16].
DeepLearning.AI blog articles also point to knowledge gaps and open problems that exist in this field:
• medical chatbots can make false diagnoses [17];
• chatbots such as BlenderBot 3 and Galactica can exhibit bias, toxicity, and fact distortion [18];
• high cost of deploying large language models [19];
• lack of transparency in chatbot models can raise ethical questions [20].</p>
        <p>An analysis of recent research and publications reveals the need to consolidate eforts to determine
relevant areas and priorities for research in order to efectively train chatbots.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Results</title>
        <p>For bibliometric analysis, the Scopus bibliographic database published by Elsevier was chosen as one of
the largest abstract databases, indexing more than 42,000 periodicals and containing mostly high-quality
scientific sources in technical, natural, medical and social sciences, which makes it representative for
analysing publications on the problem of chatbot training at the intersection of several scientific fields.
The balanced coverage of various scientific disciplines in Scopus provides the necessary completeness
to identify key trends and research directions in the chosen field.</p>
        <p>To identify the works in which the key research concepts of “training” and “chatbots” occur, a search
was performed on November 18, 2023 in the Scopus database [21]. Documents were selected whose
titles, abstracts, and keywords simultaneously contained the words “chatbot” and “training”
(figure 1).</p>
        <p>The total number of documents – 549; the distribution of documents by year is shown in figure 2.</p>
        <p>A review of the keywords shows that the most frequently occurring ones are: Chatbot (244 times),
Chatbots (221), Artificial Intelligence (126), Natural Language Processing Systems (90), Natural Language
Processing (90). Sorting the keywords alphabetically provides an opportunity to determine which of
them differ only in number (singular and plural) and to convert all to singular:
• Chatbots is the plural of Chatbot;
• Conversational Agents is the plural of Conversational Agent;
• Convolutional Neural Networks is the plural of Convolutional Neural Network;
• Customer Services is the plural of Customer Service (Customer-service);</p>
        <p>• Dialogue Generations is the plural of Dialogue Generation;
• Humans is the plural of Human;
• Large Language Models is the plural of Large Language Model;
• LSTM is the abbreviation for Long Short-term Memory;
• Machine-learning is the synonym for Machine Learning;
• Mobile Applications is the plural of Mobile Application;
• NLP is the abbreviation for Natural Language Processing (NAtural Language Processing);
• NLU is the abbreviation for Natural Language Understanding;
• Reinforcement Learnings is the plural of Reinforcement Learning;
• Virtual Assistants is the plural of Virtual Assistant.</p>
        <p>The found documents were exported in CSV [22, p. 30] and BibTeX formats. In the CSV file,
replacement of plural keywords and abbreviations with singular keywords was performed.</p>
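        <p>The merging of plural and abbreviated keyword variants described above can be sketched as follows (a minimal illustration, not the authors' actual script; the mapping shown is a subset of the replacements listed):</p>

```python
# A minimal sketch of the keyword normalization step: plural and
# abbreviated Scopus keywords are mapped to one canonical form before
# the occurrence counts are recomputed.
from collections import Counter

# Subset of the replacements listed in the text above.
CANONICAL = {
    "Chatbots": "Chatbot",
    "Conversational Agents": "Conversational Agent",
    "LSTM": "Long Short-term Memory",
    "NLP": "Natural Language Processing",
    "Machine-learning": "Machine Learning",
}

def normalize_keywords(keywords):
    """Replace plural/abbreviated keywords with their canonical form."""
    return [CANONICAL.get(k, k) for k in keywords]

counts = Counter(normalize_keywords(
    ["Chatbot", "Chatbots", "NLP", "Natural Language Processing"]))
# "Chatbot"/"Chatbots" merge, and "NLP" merges into the full term.
```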
        <p>To perform bibliometric analysis, VOSviewer 1.6.20 [23] was used: Create... → Create a map based on
bibliographic data → Read data from bibliographic database file → Scopus (figure 3).</p>
        <p>The map is created with the following parameters: analysis type – by co-occurrence of the term; unit
of analysis – all keywords; counting method – full counting.</p>
        <p>The total number of keywords – 3705 – can be reduced to 58 by discarding those that occur fewer than
12 times. From the selected words, the word “article” was additionally excluded as not being specific to
the search query.</p>
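        <p>The thresholding step can be sketched in a few lines (an illustration only; VOSviewer performs this filtering internally):</p>

```python
# A small sketch of the occurrence threshold applied in VOSviewer:
# keep only keywords that occur at least 12 times, then drop terms
# that merely echo the search query (here, "article").
from collections import Counter

MIN_OCCURRENCES = 12
EXCLUDED = {"article"}

def select_keywords(occurrences, min_occurrences=MIN_OCCURRENCES):
    """Return keywords meeting the threshold, minus excluded terms."""
    return {kw for kw, n in occurrences.items()
            if n >= min_occurrences and kw.lower() not in EXCLUDED}

occ = Counter({"chatbot": 244, "article": 57, "ontology": 5})
selected = select_keywords(occ)   # only "chatbot" survives both filters
```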
        <p>The constructed map (table 1) is presented in figure 4.</p>
        <p>In table 1 the following notations are used [22, pp. 6, 38]:
• keyword – a term characteristic of a publication, defined by the author or indexing system;
• cluster – the number of the keyword group;
• links – the number of links of a keyword with other keywords;
• total link strength – the total strength of the links of a keyword with other keywords (here the
number of publications in which two terms co-occur);
• occurrences – the number of documents in which a keyword occurs;
• avg. pub. year – the average publication year of the documents in which a keyword occurs;
• avg. citations – the average number of citations received by the documents in which a keyword
occurs;
• avg. norm. citations – the average normalized number of citations received by the documents in
which a keyword occurs;
• significance – calculated as the sum, where for each keyword, 2 is added if it has the highest
value in a column, and 1 if it has the second-highest value.</p>
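        <p>The significance indicator defined above can be computed as follows (a minimal sketch; the column names and demo values are illustrative, and ties are not treated specially):</p>

```python
# A sketch of the "significance" indicator: for every numeric column,
# the keyword with the highest value gets 2 points and the keyword with
# the second-highest value gets 1; a keyword's significance is the sum
# of its points over all columns.
def significance(table):
    """table: {keyword: {column: value}}; returns {keyword: score}."""
    scores = {kw: 0 for kw in table}
    columns = next(iter(table.values())).keys()
    for col in columns:
        ranked = sorted(table, key=lambda kw: table[kw][col], reverse=True)
        if ranked:
            scores[ranked[0]] += 2        # highest value in this column
        if len(ranked) > 1:
            scores[ranked[1]] += 1        # second-highest value
    return scores

demo = {
    "chatbot":  {"links": 57, "occurrences": 244},
    "training": {"links": 50, "occurrences": 100},
    "ontology": {"links": 10, "occurrences": 12},
}
# "chatbot" leads both columns (2 + 2); "training" is second in both (1 + 1).
```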
        <p>In appendix A, the results of applying generative chatbots ChatGPT (GPT 3.5), Bing (GPT 4.0, 3
modes), Claude 2 and Google Bard to determine the names of clusters are presented (table 2).</p>
        <p>The engaged large language models generated several variants of names and descriptions of clusters
based on the analysis of keywords that are part of their composition. For each cluster, all variants
proposed by the models are provided with notations A.1, A.2, and so on.</p>
        <p>The choice of the final names of clusters, presented in the “Chosen name” column of table 2, was
carried out through careful analysis and critical evaluation by the authors of all suggestions generated
by the different models. The key selection criteria were meaningfulness, clarity, and brevity of the
names, combined with maximum coverage of the key terms within each cluster.</p>
        <p>Suggestions from chatbots and the chosen names (table 2):
1. Advanced Language Processing and AI Applications (A.1); Language Processing and Information
Retrieval (A.2); Natural Language Processing (NLP) (A.3); Natural Language Understanding and
Generation (A.4); Natural Language Processing Applications (A.5); Technical NLP (A.6) – chosen name:
Natural Language Processing (A.3).
2. AI in Healthcare and Social Context (A.1); AI in Healthcare and Education (A.2); Artificial
Intelligence (A.3); Artificial Intelligence and Health Care (A.4); Conversational AI and Impacts (A.5);
Applications of NLP (A.6) – chosen name: Application of Natural Language Processing Technologies in
Society (A.6, A.5, A.1).
3. Core Technologies in Natural Language Processing and Machine Learning (A.1); Machine Learning
and User Interaction (A.2); Machine Learning (A.3); Natural Language Processing and Machine Learning
(A.4); Core AI and ML Techniques (A.5); Machine Learning and Deep Learning for NLP (A.6) – chosen
name: Application of Machine Learning for Natural Language Processing (A.6, A.4).
4. Educational and Service-oriented AI Applications (A.1); AI in Education and Customer Service
(A.2); Learning Systems (LS) (A.3); Chatbot and Education (A.4); AI for Training and Customer Service
(A.5); Chatbots and Learning Systems (A.6) – chosen name: Chatbots in Education and Service Sector
(A.1, A.2, A.4, A.5).</p>
        <sec id="sec-2-2-2">
          <p>The chosen cluster names not only generalize their content but also logically reflect the main research
directions in the field of chatbot training identified by the results of bibliometric analysis: 1) basic
methods of natural language processing; 2) application of relevant technologies in various spheres
of social activity; 3) the use of machine learning as a leading toolkit for developing natural language
processing systems; 4) the range of key applications of chatbots, in particular in the educational sector
and service sector.</p>
          <p>Thus, the final choice of cluster names was the result of combining the capabilities of large language
models with the expert opinion of the authors and is based on clear justified positions aimed at avoiding
ambiguities in the interpretation of the obtained results.</p>
          <p>Figure 5 shows a map of connections of keywords with significance not less than 1, i.e. the most
significant keywords belonging to the four identified clusters. This map allows analysing the relationships
between the central concepts of the study and visualizing the leading directions within each cluster.</p>
          <p>The construction of this focused map was carried out with the aim of concentrating attention on the
most important keywords, selected on the principle of maximum and pre-maximum values of a number
of indicators (number of links, total weight of links, number of mentions in publications, average values
of publication year, citation, etc.). The visualization of the most significant terms and their relationships
allows identifying priority areas in each of the studied fields to outline promising directions for future
scientific research.</p>
          <p>Such an approach allows detailing and deepening the conclusions made based on the general keyword
map, focusing directly on the central concepts of chatbot training as the leading artificial intelligence
agents.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Chatbot training models</title>
      <sec id="sec-3-1">
        <title>3.1. Supervised learning</title>
        <p>Supervised learning is one of the main approaches to machine learning, which is widely used to build
chatbots and other dialogue systems [24, p. 3940] (figure 6). This approach involves training the model
on labeled data, where each example contains input information (user query) and the corresponding
target response (chatbot reply).</p>
        <sec id="sec-3-1-1">
          <title>Chatbot components</title>
          <p>Patil et al. [25] distinguish the following chatbot components [25, p. 2]:
• text processing – word embeddings are vector representations of words within a specific vocabulary,
allowing for better implementation and utilization of statistical machine learning models;
• machine learning model – the concept of an artificial neural network, which is widely used for
input processing, classification and generation of the most appropriate response to the input
query;
• knowledge base – the dataset used for training the model, which can be open or closed in subject
area (domain): open domain chatbots are inferior in the relevance and accuracy of responses,
while closed domain chatbots perform well due to the limited but clearly defined scope of the
dataset;</p>
        </sec>
        <sec id="sec-3-1-6">
          <p>• response generation: the response returned to the input query is either selected from a base or
generated depending on the features of input vectors, dictionary and trained classifier.
The supervised learning process for chatbots includes the following steps:
1. Selection of the model architecture.
2. Collection and annotation of training data.
3. Data preprocessing.
4. Model training.
5. Evaluation of the quality of the trained model.</p>
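          <p>The overall data flow of the supervised setup can be illustrated with a deliberately tiny retrieval-style “chatbot” (an illustrative toy: real systems learn a model rather than matching bags of words, but the labeled query-response pairs play the same role):</p>

```python
# A toy supervised setup: training pairs of (user query, target reply);
# at inference time, the reply of the most similar training query
# (bag-of-words cosine similarity) is returned.
import math
from collections import Counter

TRAIN = [
    ("how do i reset my password", "Use the 'Forgot password' link."),
    ("what are your opening hours", "We are open 9:00-18:00 on weekdays."),
]

def vec(text):
    """Bag-of-words vector as a word-count dictionary."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def reply(query):
    """Pick the answer whose training query is closest to the input."""
    best = max(TRAIN, key=lambda qa: cosine(vec(query), vec(qa[0])))
    return best[1]
```

<p>This also shows why the closed-domain case discussed above behaves well: with a clearly delimited set of training queries, even a trivial matcher stays on topic.</p>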
          <p>Among the challenges of supervised learning is the need for a large volume of high-quality labeled
data, the collection and annotation of which can be a laborious and costly process [24, p. 3940]. One
way to address this is to use semi-supervised learning, which allows training the model on both labeled
and unlabeled data [24, p. 3943].
3.1.1. Seq2Seq models based on recurrent neural networks
Seq2Seq (Sequence-to-Sequence) models based on recurrent neural networks with the application of
LSTM (Long Short-Term Memory) modules are one of the approaches to building dialogue systems
using machine learning methods [26, p. 230].</p>
          <p>The Seq2Seq architecture consists of two main components: an encoder and a decoder. The encoder
processes the input sequence (user query) and generates its vector representation in the hidden space.
The decoder, in turn, uses this representation to generate the output sequence (chatbot response) [25,
p. 9].</p>
          <p>A key feature of Seq2Seq models is the use of recurrent layers, particularly LSTM, for processing
sequential data [25, p. 4]. LSTM modules allow effectively modeling long-term dependencies in sequences,
which is critically important for generating coherent and contextually relevant responses in dialogue
[25, p. 5].</p>
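          <p>The gating that lets an LSTM carry information across many steps can be made explicit with a scalar, pure-Python LSTM cell (illustrative weights, not trained values):</p>

```python
# A pure-Python sketch of one LSTM cell step: the forget, input, and
# output gates decide what the cell state keeps, adds, and exposes.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step for scalar input/state; w holds per-gate weights."""
    f = sigmoid(w["f_x"] * x + w["f_h"] * h_prev)    # forget gate
    i = sigmoid(w["i_x"] * x + w["i_h"] * h_prev)    # input gate
    g = math.tanh(w["g_x"] * x + w["g_h"] * h_prev)  # candidate state
    o = sigmoid(w["o_x"] * x + w["o_h"] * h_prev)    # output gate
    c = f * c_prev + i * g                           # new cell state
    h = o * math.tanh(c)                             # new hidden state
    return h, c

# Illustrative uniform weights; a trained network would learn these.
w = {k: 0.5 for k in ("f_x", "f_h", "i_x", "i_h", "g_x", "g_h", "o_x", "o_h")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 1.0]:           # process a short input sequence
    h, c = lstm_step(x, h, c, w)
```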
          <p>Seq2Seq models based on LSTM have certain limitations. In particular, they can suffer from the
problem of “vanishing gradients” when processing long sequences [25, p. 4].</p>
          <p>Nevertheless, LSTM-based Seq2Seq architectures still remain an important building block of modern
dialogue systems. They are often used as base models, which can then be improved and extended
through other methods such as attention modules or hierarchical architectures [24, p. 3941-3942].</p>
          <p>In the context of chatbot development, Seq2Seq models based on LSTM have proven to be an effective
tool for generating responses to user queries in various subject areas [27].
3.1.2. Transformer architectures
Transformer architectures, particularly models of the GPT (Generative Pre-trained Transformer)
family, represent a state-of-the-art approach to building language models and dialogue systems. These
architectures have gained extraordinary popularity because they consistently outperform
other language models such as recurrent neural networks. These models are based on the self-attention
mechanism and are capable of efficiently processing and generating sequences of arbitrary length,
storing longer conversation histories, which leads to consistent, contextual, and improved dialogue [28,
p. 2].</p>
          <p>The transformer architecture consists of several encoder and decoder blocks, each containing
self-attention and feed-forward layers (figure 7). This allows transformers to process all elements of the
input sequence in parallel [29, p. 2], which significantly speeds up the learning process compared to
recurrent models [29, p. 10].</p>
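          <p>The self-attention mechanism at the heart of these blocks can be sketched with plain Python lists (scaled dot-product attention over a toy three-token sequence; a real transformer adds learned projections, multiple heads, and positional encodings):</p>

```python
# A stdlib-only sketch of scaled dot-product self-attention: every
# position attends to every other position at once, which is what makes
# parallel processing of the whole sequence possible.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(q, k, v):
    """q, k, v: lists of d-dimensional vectors (one per token)."""
    d = len(q[0])
    out = []
    for qi in q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                  for kj in k]
        weights = softmax(scores)
        # Weighted mixture of the value vectors.
        out.append([sum(wt * vj[t] for wt, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy 2-D token vectors
attended = self_attention(tokens, tokens, tokens)
```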
          <p>Researchers have also demonstrated how the performance of large language models trained on a
large corpus of data can be improved by fine-tuning on specific tasks. This can be clearly observed
when we look at the GPT series from OpenAI (figure 8): GPT, GPT-2, GPT-3, and GPT-4, which are
capable of performing tasks such as question answering, reading comprehension, text summarization,
text generation, or conversation modeling [28, p. 2].</p>
          <p>Despite the impressive performance of GPT models, it is known that they exhibit a phenomenon
called hallucination, where they generate results that are contextually implausible or incompatible
with the real world [30, p. 1]. Despite this, transformer architectures, and particularly GPT models,
have become the de facto standard for building modern dialogue systems and chatbots. Their ability
to process context, generate human-like responses, and adapt to different subject areas makes them a
powerful tool for creating intelligent assistants and virtual conversationalists.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Reinforcement learning</title>
        <p>Reinforcement learning is one of the approaches to machine learning that is gaining popularity in the
field of building chatbots and other dialogue systems. Unlike supervised learning, where the model is
trained on labeled pairs of “query-response”, reinforcement learning allows the model to learn through
interaction with the environment and receiving feedback in the form of rewards for its actions.</p>
        <p>For the case where the environment is the user, the reinforcement learning process for chatbots can
be represented as follows:
1. The agent (chatbot) interacts with the user at discrete points in time.</p>
        <p>2. At each step, the agent receives information about the current state of the dialogue and generates
a response.
3. The user provides feedback in the form of a reward, reflecting the quality of the generated
response in the context of the current dialogue. The reward function determines how the reward
for each action of the agent is calculated, and can take into account various aspects such as the
relevance of the response, its grammatical correctness, user satisfaction, etc.
4. The agent updates its strategy in such a way as to maximize the value function – the total expected
reward throughout the dialogue.</p>
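        <p>The four-step loop above can be illustrated with a deliberately simplified value-update sketch (a toy: the “user” is a fixed reward function, and the policy is a greedy choice over two candidate responses):</p>

```python
# A toy illustration of the reinforcement learning loop: the chatbot
# keeps a value estimate per candidate response, the "user" rewards
# relevant replies, and an incremental update shifts the policy toward
# the responses with the highest expected reward.
def user_reward(response):
    # Stand-in for human feedback: relevant replies score 1, others 0.
    return 1.0 if response == "relevant reply" else 0.0

values = {"relevant reply": 0.0, "off-topic reply": 0.0}
ALPHA = 0.5                              # learning rate

for _ in range(5):                       # dialogue turns
    for action in values:                # try each candidate response
        r = user_reward(action)
        values[action] += ALPHA * (r - values[action])

best = max(values, key=values.get)       # greedy policy after learning
```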
        <p>The advantage of reinforcement learning-based chatbots is their ability to adapt to different user
interaction scenarios and optimize their behavior to achieve desired results. In addition, such agents
can learn based on implicit feedback from the user, which avoids the need for explicit data labeling.</p>
        <p>Reinforcement learning based on human feedback (RLHF; figure 9) is usually performed iteratively
[31, p. 92]:
1. The reward model is initially trained on a dataset of human judgments.
2. The policy model is trained to maximize reward using the current reward model.
3. Humans then evaluate the outputs of the policy model to create a new dataset.
4. This new dataset is used to update the reward model, making it more accurate.
5. The policy model is again fine-tuned using the updated reward model.</p>
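        <p>The iterative RLHF cycle can be compressed into a toy sketch (purely illustrative: responses are reduced to a single hypothetical “helpfulness” feature, the reward model is one weight, and policy optimization is a greedy selection):</p>

```python
# A compressed, purely illustrative sketch of the RLHF loop: a scalar
# reward model is fitted to pairwise human judgments, then the policy
# is nudged toward high-reward outputs; in full RLHF, new judgments on
# the policy's outputs would refine the reward model each iteration.
def fit_reward_model(judgments):
    """judgments: list of (preferred_feature, rejected_feature) pairs.
    Returns a weight so that reward(x) = weight * x ranks pairs correctly."""
    weight = 0.0
    for preferred, rejected in judgments:
        if weight * preferred <= weight * rejected:
            weight += preferred - rejected   # move toward correct ranking
    return weight

def improve_policy(candidates, reward_weight):
    """Greedy stand-in for policy optimization: pick max-reward output."""
    return max(candidates, key=lambda x: reward_weight * x)

# Hypothetical "helpfulness" features; humans prefer the higher one.
judgments = [(0.9, 0.2), (0.8, 0.4)]
w = fit_reward_model(judgments)            # step 1: fit reward model
best = improve_policy([0.1, 0.5, 0.7], w)  # step 2: policy step
```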
        <p>Generative Adversarial Networks (GANs) are one of the most innovative approaches to dialogue
management in chatbots. GANs consist of two neural networks – a generator and a discriminator,
which compete with each other in the learning process. The goal of the generator is to create responses
that are so realistic that the discriminator cannot distinguish them from real human responses. The
goal of the discriminator is to learn to distinguish between generated responses and real responses.</p>
        <p>Chou and Hsueh [33] proposed a method for creating a chatbot using a model that generates sequential
sentences based on a generative adversarial network. The model architecture contains a generator
that generates diverse sentences and a discriminator that evaluates sentences. The generator combines
an attention model that responds to tracking conversation states with a Seq2Seq model using LSTM
to obtain sentence information. For the discriminator, two types of rewards are calculated to assign
low rewards for repetitive sentences and high rewards for diverse sentences. Under this approach, the
environment is the model itself.</p>
        <p>The model proposed by Tran et al. [34] combines reinforcement learning and generative adversarial
networks to generate both accurate and human-like responses (figure 10). In the proposed algorithm
(“RLGAN-Chat”), the response generator is first pre-trained with reinforcement learning, after which
adversarial training pushes it to produce responses indistinguishable from human-generated ones: the
discriminator is trained on batches of real and generated responses, and its output serves as feedback
to the generator.</p>
        <p>Transfer learning is an approach to machine learning that allows using the knowledge gained by the
model when solving one task to improve its efficiency when solving another, similar task [31, p. 81].
The idea of transfer learning is to first train the model on a large amount of data to perform a general
task (for example, predicting the next word in a sentence), and then fine-tune it on a smaller amount of
data to perform the specific task (for example, generating chatbot responses in a certain subject
area) [35].</p>
        <p>Adapting models to a specific subject area (domain) is an important area of transfer learning that
allows improving the quality of chatbots in specific areas of application. Ilievski et al. [36, p. 4116]
distinguish between two cases, when two domains overlap and when one domain is an extension of the
other, and point out the advantages of using transfer learning for training specialized chatbots:
• training chatbots on a smaller amount of data: in data-constrained environments, models trained
using transfer learning achieve better training and testing results than models trained
independently;
• better chatbot performance: the use of transfer learning has a significant positive impact on
performance, even when all data from the target domain is available.</p>
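        <p>Both advantages can be demonstrated on a deliberately small example (a 1-D linear regression stand-in for a chatbot: “pre-training” on plentiful general data followed by brief fine-tuning on scarce domain data is compared with training on the scarce data alone):</p>

```python
# A deterministic toy comparison of the two claims above: a model
# pre-trained on plentiful "general" data and then fine-tuned on a few
# domain examples vs. the same model trained on the few domain examples
# alone. Real chatbots differ, but the data-efficiency effect is the
# same in kind.
def sgd(w, data, steps, lr=0.1):
    """Minimize mean squared error of y = w * x by gradient descent."""
    for _ in range(steps):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

general = [(x / 10, 2.0 * x / 10) for x in range(1, 11)]  # y = 2x, plentiful
domain = [(1.0, 2.2), (2.0, 4.4)]                         # y = 2.2x, scarce

pretrained = sgd(0.0, general, steps=3)        # pre-training phase
fine_tuned = sgd(pretrained, domain, steps=2)  # adapt to the domain
from_scratch = sgd(0.0, domain, steps=2)       # no transfer
```

<p>With the same two fine-tuning epochs, the pre-trained model ends closer to the domain relationship than the model trained independently, mirroring the data-efficiency advantage described above.</p>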
        <p>Fine-tuning existing large language models instead of creating them from scratch is often a more
practical and effective approach for the following reasons [31, pp. 80-82]:
• resource efficiency – fine-tuning uses pre-trained models, allowing for high performance without
significant investments in computational resources;
• data efficiency – fine-tuning allows leveraging the large amount of data on which the models
were pre-trained, requiring only a smaller, task-specific dataset for adaptation;
• transfer learning – pre-trained models have a general understanding of language, context and
certain domain knowledge that is transferred to a specific task during fine-tuning;
• high performance – fine-tuning allows using state-of-the-art architectures that have been carefully
optimized and tested by industry experts;
• lower entry barrier – fine-tuning is more accessible to organizations and individuals who do not
have sufficient infrastructure to train models from scratch;
• continuous learning – pre-trained models can be continuously updated and tuned for different
tasks, making them versatile and adaptable;
• wide applicability – one pre-trained model can be fine-tuned for many domains and tasks.
The fine-tuning process includes the following steps (figure 11) [31, pp. 82-84]:
1. Loading the parameters of the pre-trained model.
2. Preparing a task-specific dataset.
3. Extracting features using the pre-trained layers of the model.
4. Adjusting the model parameters through backpropagation and gradient descent with a lower
learning rate.
5. Updating gradients, applying regularization to prevent overfitting.
6. Choosing a fine-tuning strategy: full or partial tuning of the model.
7. Evaluating performance and optimizing model hyperparameters.</p>
        <p>During fine-tuning of a neural network, changes occur at the level of architecture and individual
neurons [31, pp. 85-88]:
• adjusting weights and changing activation function outputs;
• tuning the upper layers of the model and freezing the lower layers;
• adjusting the feature space to adapt to the new task;
• replacing the last layer to match the new task;
• updating batch normalization parameters.</p>
        <p>Fine-tuning allows efficiently adapting pre-trained models to specific tasks, preserving their “intuition”
and optimizing for the specifics of the new task.</p>
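        <p>The architecture-level changes listed above (freezing the lower layers, replacing the last layer, using a lower learning rate) can be sketched in PyTorch on a toy network; the layer sizes and the split into "lower layers" and "head" are illustrative, not taken from a real pre-trained model:
```python
import torch
import torch.nn as nn

# A small stand-in for a pre-trained network: lower feature layers
# plus a task head.
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 4),
)

# Freeze the lower layers: their weights keep the pre-trained "intuition".
for module in list(model.children())[:4]:
    for p in module.parameters():
        p.requires_grad = False

# Replace the last layer to match a new task (here: 2 output classes).
model[4] = nn.Linear(16, 2)

# Fine-tune only the trainable parameters with a lower learning rate.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)

x = torch.randn(32, 8)
y = torch.randint(0, 2, (32,))
loss = nn.CrossEntropyLoss()(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Frozen layers received no gradient; only the new head was updated.
print(model[0].weight.grad is None, model[4].weight.grad is not None)
# → True True
```
</p>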
      </sec>
      <sec id="sec-3-3">
        <title>3.4. Evaluation of chatbot training effectiveness</title>
        <p>Taulli [37] considers a number of performance metrics for large language models (table 3).</p>
        <p>Human evaluation is a group of methods that involve direct assessment of the quality of chatbots’
work by real users or experts. These methods allow obtaining a more holistic and comprehensive
assessment of the model’s effectiveness, taking into account aspects such as relevance, coherence,
naturalness, and usefulness of the generated responses.</p>
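        <p>The aggregation of human ratings can be sketched as follows (a hypothetical example: rater names, aspects and scores are illustrative; each rater scores each aspect on a 1-5 scale, and scores are averaged across raters and aspects):
```python
from statistics import mean

# Hypothetical ratings: each rater scores several aspects from 1 to 5.
ratings = {
    "rater_1": {"grammar": 5, "coherence": 4, "relevance": 4, "style": 3},
    "rater_2": {"grammar": 4, "coherence": 4, "relevance": 5, "style": 4},
    "rater_3": {"grammar": 5, "coherence": 3, "relevance": 4, "style": 4},
}

# Average per aspect across raters...
aspects = ratings["rater_1"].keys()
per_aspect = {a: mean(r[a] for r in ratings.values()) for a in aspects}

# ...and an overall score across all aspects.
overall = mean(per_aspect.values())
print(per_aspect)
print(round(overall, 2))  # → 4.08
```
</p>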
        <p>For human evaluation of the quality of a fine-tuned model, the steps listed below should be performed.</p>
        <p>[Figure 11. The fine-tuning process: loading the parameters of the pre-trained model; data preparation (task-specific dataset); feature extraction using the pre-trained layers; fine-tuning (adjusting parameters, fine-tuning strategies); evaluation and optimization (performance evaluation, hyperparameter optimization).]</p>
        <p>Table 3. Performance metrics for large language models [37]:
• BERTScore – evaluates the generated text by comparing it with the reference text using BERT embeddings;
• Perplexity – evaluates how well the probability distribution predicted by the model matches the actual data distribution;
• BLEU – computes n-gram precision scores for the similarity between the generated and reference text;
• ROUGE – computes the n-gram overlap between the generated and reference text.</p>
        <sec id="sec-3-3-3">
          <title>Human evaluation procedure</title>
          <p>1. Use the fine-tuned model to generate new texts.
2. Engage humans (experts in the relevant field or regular users) to evaluate the quality of the
generated texts.
3. Develop an evaluation system that takes into account aspects such as grammar, coherence,
relevance, style, and overall quality.
4. Collect feedback and ratings from multiple people and average the results to obtain an overall
assessment of the model’s performance.</p>
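          <p>Two of the automatic metrics from table 3 can be computed as follows (a simplified sketch: real BLEU combines several n-gram orders with a brevity penalty, and perplexity is normally computed from the log-probabilities a model assigns to the reference tokens; the probability values here are illustrative):
```python
import math

# Perplexity from the probabilities assigned to N reference tokens:
# PPL = exp(-(1/N) * sum_i log p_i); lower is better.
def perplexity(token_probs):
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

# A BLEU-like unigram precision between generated and reference text.
def unigram_precision(generated, reference):
    gen_tokens = generated.split()
    ref_tokens = reference.split()
    matches = sum(1 for t in gen_tokens if t in ref_tokens)
    return matches / len(gen_tokens)

print(perplexity([0.25, 0.5, 0.125, 0.5]))   # 2**1.75 ≈ 3.364
print(unigram_precision("the cat sat", "the cat sat down"))  # → 1.0
```
</p>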
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Creating datasets for training chatbots</title>
        <p>For conducting experiments on training chatbots, two datasets were created containing texts of scientific
publications in the field of information technology.</p>
        <p>CEUR Workshop Proceedings is a Diamond OA publication that publishes materials from scientific
conferences and seminars covering a wide range of research in computer science and engineering. The
following steps were performed to form the dataset from its publications:
• downloading the website https://ceur-ws.org/ using the wget utility;
• extracting texts from 68791 PDF files corresponding to volumes 1-3583 for 1995-2024;
• creating a single text file ceur-ws.txt with a size of 1917 MB (2009797694 bytes).</p>
        <p>The relative distribution of publications by languages of articles according to Scopus: English –
94.807%, Russian – 1.368%, German – 1.101%, Spanish – 0.772%, Portuguese – 0.691%, Turkish – 0.596%,
French – 0.340%, Ukrainian – 0.160%, Italian – 0.131%, Czech – 0.015%, other languages together –
0.019%.</p>
        <p>The next dataset was created from publications of the journal Information Technologies and Learning
Tools that publishes articles on theoretical and applied aspects of the use of information and
communication technologies in education. The procedure for creating this dataset included:
• downloading the journal’s website https://journal.iitta.gov.ua/index.php/itlt using the wget utility;
• extracting texts from 1732 PDF files corresponding to volumes 1-100 of the journal for 2006-2024;
• forming a text file journal.txt with a volume of 107 MB (112245281 bytes).</p>
        <p>The specificity of this dataset is the narrower subject specialization of texts, which focus on
issues of digital transformation of education, development and use of electronic educational resources,
technologies of distance and blended learning, etc.</p>
        <p>The approximate relative distribution of publications by languages of articles according to Web of
Science: Ukrainian – 52.54%, Russian – 26.73%, English – 20.73%.</p>
        <p>The formed datasets differ in volume, time coverage, thematic focus and the languages in which
the texts are written, which allows using them for a comparative analysis of the effectiveness of
chatbot models trained on heterogeneous text corpora.</p>
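        <p>The corpus assembly step (concatenating per-article texts into a single training file such as ceur-ws.txt or journal.txt) can be sketched with standard-library tools; the extraction from PDF itself is assumed to be done by an external tool (e.g. pdftotext), and all paths and file names below are illustrative:
```python
import os
import tempfile
from pathlib import Path

# Concatenate pre-extracted per-article .txt files into one corpus file,
# returning the total size of the written text in bytes.
def build_corpus(src_dir, out_file):
    total_bytes = 0
    with open(out_file, "w", encoding="utf-8") as out:
        for path in sorted(Path(src_dir).rglob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="ignore")
            out.write(text + "\n")
            total_bytes += len(text.encode("utf-8")) + 1
    return total_bytes

# Demo on a temporary directory with two tiny "articles".
src = tempfile.mkdtemp()
Path(src, "vol1.txt").write_text("article one", encoding="utf-8")
Path(src, "vol2.txt").write_text("article two", encoding="utf-8")
out_path = os.path.join(tempfile.mkdtemp(), "corpus.txt")
size = build_corpus(src, out_path)
print(size)  # → 24 (two 11-byte articles plus two newlines)
```
</p>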
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Selection of models for fine-tuning</title>
        <p>Considering the prevalence of models of the GPT (Generative Pre-trained Transformer) family, the
possibilities of access to both their modern versions (GPT 3.5, 4.0) and alternative models (Gemini 1.0,
Claude 3) were studied:
• OpenAI provides a programmatic and visual [38] interface for fine-tuning the models
gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, babbage-002, davinci-002, and gpt-4-0613. None of
these models are free for fine-tuning – as of the 1st quarter of 2024, the cheapest, babbage-002,
requires a payment of $0.40 per 1 million tokens, while more advanced models are 20 or more
times more expensive. In addition to the payment for fine-tuning, there is a payment for usage
– for example, the cost of input and output messages for the davinci-002 model is $12.00 per 1
million tokens.
• Google also provides a programmatic and visual [39] interface for fine-tuning the models
gemini-1.0-pro, gemini-1.5-pro, and gemini-1.5-flash, but gemini-1.0-pro is not available in Ukraine
[40].
• Anthropic, as of the 1st quarter of 2024, does not provide the possibility of fine-tuning models of
the Claude family (except for selected groups of experts) [41].</p>
        <p>Considering the unpredictability of the budget for fine-tuning and using fine-tuned models, instead
of modern versions of GPT, the historical GPT-2 model [42] was chosen, trained on an internal dataset
of OpenAI – WebText (40 GB). The feasibility of using this model as a basis for a chatbot is shown in
the work [43].</p>
        <p>Hugging Face provides access to all basic versions of the GPT-2 model at https://huggingface.co/openai-community: GPT2, GPT2-Medium, GPT2-Large, and GPT2-XL, trained on English-language
texts. Considering the comparability of the dataset from CEUR Workshop Proceedings publications (2
GB) and the WebText dataset (40 GB), as well as the predominant language (English), the choice of the
GPT2-XL model was reasonable for fine-tuning.</p>
        <p>Hugging Face hosts a large number (more than 11 thousand) of fine-tuned GPT-2 models
(https://huggingface.co/models?sort=trending&amp;search=gpt2), of which fewer than 10 were fine-tuned on
Ukrainian-language texts (table 4).</p>
        <p>Considering the small volume of the dataset from publications of the journal Information Technologies
and Learning Tools and the predominance of the Ukrainian language in it, it is reasonable to choose for
fine-tuning a model that has already been fine-tuned on a Ukrainian-language set of texts. Among the
models presented in table 4, malteos/gpt2-uk was chosen for fine-tuning due to its small size (124M
parameters) and the non-specificity of the OSCAR and Wikimedia datasets compared to the datasets used
for fine-tuning the other models – fiction texts, news, essays for ZNO.</p>
        <p>4.3. The process of fine-tuning the chatbot model based on GPT2-uk</p>
        <p>For fine-tuning the chatbot model based on GPT2-uk on a Ukrainian-language text corpus, the following
Python code using the transformers library was implemented in the Google Colab environment:
from transformers import TextDataset
from transformers import DataCollatorForLanguageModeling
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from transformers import Trainer, TrainingArguments</p>
        <p>First, the necessary classes from the transformers library are imported for working with text data,
tokenizer and GPT2 model, as well as for configuring the training process.</p>
        <p>Next, auxiliary functions were defined:
def load_dataset(file_path, tokenizer, block_size=128):
    dataset = TextDataset(
        tokenizer=tokenizer,
        file_path=file_path,
        block_size=block_size,
    )
    return dataset

def load_data_collator(tokenizer, mlm=False):
    data_collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer,
        mlm=mlm,
    )
    return data_collator</p>
        <p>The load_dataset function loads a text dataset, performing its tokenization with a given block size
block_size. The load_data_collator function creates an object that will prepare data batches
for training the model (with the possibility of masked language modeling).</p>
        <p>The main function train performs the direct fine-tuning of the model:
def train(train_file_path, model_name,
          output_dir,
          overwrite_output_dir,
          per_device_train_batch_size,
          num_train_epochs,
          save_steps, resume_from_checkpoint):
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    train_dataset = load_dataset(train_file_path, tokenizer)
    data_collator = load_data_collator(tokenizer)
    tokenizer.save_pretrained(output_dir)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.save_pretrained(output_dir)
    training_args = TrainingArguments(
        output_dir=output_dir,
        overwrite_output_dir=overwrite_output_dir,
        per_device_train_batch_size=per_device_train_batch_size,
        num_train_epochs=num_train_epochs,
        save_steps=save_steps,
    )
    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
    )
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
    trainer.save_model()
This function performs the following steps:
1. Loading and saving the tokenizer and model initialized from the specified model_name checkpoint.
2. Preparing the training dataset and the data_collator object.
3. Defining the training parameters training_args: the directory for saving the model, the batch size,
the number of epochs, and the checkpoint saving frequency.
4. Creating a trainer object with the model, training parameters, and training data.
5. Launching model training using the train() method with the ability to continue from a
checkpoint.
6. Saving the fine-tuned model.</p>
        <p>The fine-tuning parameters are set by the following variables:
#original model and checkpoints
train_directory = "/content/drive/MyDrive/SemerikovProject/q_and_a"
train_file_path = "/content/drive/MyDrive/SemerikovProject/journal.txt"
model_name = train_directory
#fine-tuned model and checkpoints
output_dir = "/content/drive/MyDrive/SemerikovProject/custom_full_text"
overwrite_output_dir = False
per_device_train_batch_size = 8
num_train_epochs = 50
save_steps = 50000</p>
        <p>The train_file_path variable specifies the path to the text file with the fine-tuning corpus,
model_name – the directory with the files of the initial model (or the checkpoint for continuing
fine-tuning), output_dir – the output directory for saving the fine-tuned model files and checkpoints.</p>
        <p>The initial model malteos/gpt2-uk is loaded using the methods of the transformers library
from the Hugging Face hub:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("malteos/gpt2-uk")
model = AutoModelForCausalLM.from_pretrained("malteos/gpt2-uk")
tokenizer.save_pretrained(train_directory)
model.save_pretrained(train_directory)</p>
        <p>The loaded model files are saved to the specified train_directory.</p>
        <p>The fine-tuning process is launched by calling the train function with the specified parameters:
train(train_file_path=train_file_path,
model_name=model_name,
output_dir=output_dir,
overwrite_output_dir=overwrite_output_dir,
per_device_train_batch_size=per_device_train_batch_size,
num_train_epochs=num_train_epochs,
save_steps=save_steps,
# False for the first time,
# True - to continue after resume
resume_from_checkpoint=False)</p>
        <p>The training time can be reduced by using a GPU on a separate server or local computer. The latter
required installing the CUDA 12.5 driver set to support the GeForce RTX 3080 (10 GB) video card and a
specialized version of PyTorch, as well as updating the accelerate and transformers libraries.</p>
        <p>For the final experiment, the total training time was 37 hours (the average duration of one training
iteration was 43 minutes).</p>
        <p>To test the model fine-tuned on the dataset, a user interface was created using the gradio library
(figure 12, https://huggingface.co/spaces/POMAHSLS/ITLT_Journal):
from transformers import pipeline
import gradio as gr
model = pipeline("text-generation",
    model="/content/drive/MyDrive/SemerikovProject/lastversion")</p>
        <p>def predict(prompt):
    completion = model(prompt, max_length=50)[0]["generated_text"]
    return completion
gr.Interface(fn=predict, inputs="text", outputs="text").launch()</p>
        <p>5. Conclusions</p>
        <p>1. Bibliometric analysis of 549 sources from the Scopus database on the problem of chatbot training
provided an opportunity to:
a) identify the lower chronological boundary (2018), starting from which there is a steady increase
in the number of publications on chatbot training;
b) suggest that the reason for a significant increase in the number of works in 2023 (by 55
compared to 2022) is the public availability of ChatGPT and the associated surge of interest
from the scientific community in the topic of large language models and chatbots: to confirm
or refute this assumption, one can analyze the context of keywords in 2023 publications for
direct mentions of ChatGPT or similar systems, and also consider the dynamics of citations of
works dedicated to ChatGPT in other works during this period;
c) group into 4 clusters the author and index keywords of sources on chatbot training: 1) natural
language processing; 2) application of natural language processing technologies in society;
3) application of machine learning for natural language processing; 4) chatbots in education
and service sector.
2. To determine the most significant key concepts of the research in each cluster, the keywords that
have the largest and the next largest value were identified by the following indicators: number of
links, total link strength, number of documents with given keyword and averages: publication
year, number of citations, normalized number of citations. For the selected keywords, their
significance was calculated from 1 to 6:
a) in the first cluster (natural language processing), the most significant are 7 keywords (41%):
computational linguistics (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ), natural language understanding (
          <xref ref-type="bibr" rid="ref3 ref46 ref50 ref54">3</xref>
          ), performance (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), language
model (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), speech processing (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          );
b) in the second cluster (application of natural language processing technologies in society), the
most significant are 10 keywords (63%): artificial intelligence (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ), large language model (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ),
chatgpt (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), human (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), training (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), review (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), conversational agent (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          ), controlled study (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          ),
education (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          ), male (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          );
c) in the third cluster (application of machine learning for natural language processing), the most
significant are 6 keywords (50%): natural language processing (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ), natural language processing
systems (
          <xref ref-type="bibr" rid="ref3 ref46 ref50 ref54">3</xref>
          ), virtual assistant (
          <xref ref-type="bibr" rid="ref3 ref46 ref50 ref54">3</xref>
          ), language processing (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), user interfaces (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), diagnosis (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          );
d) in the fourth cluster (chatbots in education and service sector), the most significant are 7
keywords (58%): chatbot (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ), learning systems (
          <xref ref-type="bibr" rid="ref3 ref46 ref50 ref54">3</xref>
          ), curricula (
          <xref ref-type="bibr" rid="ref3 ref46 ref50 ref54">3</xref>
          ), learn+ (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), knowledge based
systems (
          <xref ref-type="bibr" rid="ref2 ref45 ref49 ref53">2</xref>
          ), personnel training (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          ), customer service (
          <xref ref-type="bibr" rid="ref1 ref44 ref48 ref52">1</xref>
          ).
3. During the analysis of the keyword map, it was found that the third cluster (application of
machine learning for natural language processing) has certain intersections with the content of
the first (natural language processing) and second (application of natural language processing
technologies in society) clusters. This is explained by the fact that machine learning methods,
particularly deep learning, are fundamental tools for developing natural language processing
systems and their applications. Therefore, some key terms of the third cluster, such as natural
language processing, virtual assistants, neural networks, etc., are closely related to the concepts
of other clusters. This indicates the close integration of different research areas in the single
scientific field of creating intelligent conversational agents. Further analysis of the intersections
between clusters may become a promising direction for future scientific research to more clearly
delineate subject areas.
4. Analysis of the map of connections of the most significant keywords provided an opportunity to
identify the leading research directions:
a) in natural language processing using computational linguistics – natural language
understanding, construction of language models and speech recognition;
b) in the application of artificial intelligence technologies for natural language processing –
controlled use of large language models and chatbots (in particular, ChatGPT) in education;
c) in the application of machine learning for natural language processing – the use of virtual
assistants, natural language user interfaces and other natural language processing systems, in
particular, for diagnosis;
d) in the application of chatbots in education and service sector – the use of chatbots, learning
systems and knowledge management systems for enhanced and adaptive learning.
The identified directions can be useful for organizations in developing strategies for using artificial
intelligence and integrating conversational agents into production activities.
5. The results of the conducted bibliometric analysis can be applied:
• in education:
a) the identified priority areas of research in the field of chatbot training can be used
to form the topics of courses, training programs for specialists in natural language
processing and development of artificial intelligence systems;
b) the identified key concepts can serve as a basis for the development of educational
materials designed to highlight the most important concepts and technologies for
creating conversational agents;
c) an overview of the applications of chatbots in the educational process outlines promising
ways of their integration into adaptive and personalized learning systems;
• in scientific research:
a) the map of connections of key concepts can serve as a basis for forming scientific
hypotheses and constructing conceptual models during research design;
b) the results of the analysis open up prospects for further bibliometric studies of dynamics
and relationships in the subject field;
c) the identified priority areas determine the relevant vectors of future scientific research
in the field of artificial intelligence and its use in scientific research in the field of social
sciences.
6. Supervised learning is one of the main approaches that involves training the model on labeled
“query-response” pairs. For this, architectures based on recurrent neural networks (for example,
Seq2Seq with LSTM) and transformers (for example, GPT) are used. These models are capable of
generating contextually relevant responses, but require large volumes of high-quality labeled
data.
7. Reinforcement learning allows the model to learn through interaction with the environment (user)
and receive feedback in the form of rewards. This approach is implemented using generative
adversarial networks (GAN) and an iterative process of reinforcement learning based on human
feedback (RLHF).
8. Transfer learning consists of using the knowledge gained by the model when solving one task
to improve its efficiency when solving another, similar task. The most common approach is
fine-tuning a pre-trained model on a specific dataset to adapt to a specific subject area.
9. To evaluate the effectiveness of chatbot training, both automatic metrics (BERTScore, perplexity,
BLEU, ROUGE) and human evaluation methods are used, which allow taking into account the
relevance, coherence and naturalness of the generated responses.
10. Two datasets for training chatbots were formed: a dataset from “CEUR Workshop Proceedings”
publications (a wide range of research in computer science in English) and a dataset from
publications of the journal “Information Technologies and Learning Tools” (a subject-oriented corpus
mainly in Ukrainian). The created datasets differ in volume, time coverage, thematic focus and
distribution of languages of writing texts, which provides conditions for a comparative analysis
of the effectiveness of chatbot models trained on heterogeneous text corpora.
11. The choice of models for fine-tuning was substantiated: the base English-language GPT2-XL model
for the first dataset and the gpt2-uk model previously fine-tuned on Ukrainian texts for the second.
The selection of models was based on comparing the characteristics of text datasets (volume,
language composition) and available models of the GPT family, taking into account their size and
previous training experience.
12. The process of fine-tuning the selected models on the formed text corpora using the capabilities
of the transformers library from Hugging Face was implemented. The developed program
code allows loading pre-trained models, fine-tuning them on text data provided by the user, and
saving fine-tuned models for further use.
13. The operation of the fine-tuned models for generating chatbot responses to thematically related
user queries was tested. For the convenience of user interaction with the chatbot, a graphical
interface was developed using the gradio library.
        </p>
        <p>The obtained results can be used for further research in the direction of creating effective and
specialized chatbots using modern approaches to training large language models.</p>
        <p>Acknowledgments: The authors would like to thank the developers of the Hugging Face transformers library and the
providers of the GPT-2 and gpt2-uk models for making their work publicly available. We also express our gratitude to the
publishers of the “CEUR Workshop Proceedings” and “Information Technologies and Learning Tools” for maintaining open
access to their valuable scientific content.</p>
        <p>Declaration on Generative AI: During the preparation of this work, the authors used Claude 3 Opus for text
translation, abstract drafting, and formatting assistance. After using this tool, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
        <p>A. Results of applying generative chatbots to determine cluster names</p>
        <p>The query was created in English on November 30, 2023. The query contained information about
the number of clusters and the keywords related to each cluster. The chatbot was expected to make
suggestions regarding the names and contents of the clusters. The query had the following form:
“There are 4 clusters of keywords related to chatbot training research:</p>
        <p>Cluster 1: classification (of information), training data, speech processing, performance, semantics,
dialogue systems, natural language understanding, computational linguistics, text processing, long
short-term memory, state of the art, language model, question answering, information retrieval, embeddings,
reinforcement learning, response generation</p>
        <p>Cluster 2: artificial intelligence, conversational agent, human, adult, training, male, medical education,
female, controlled study, health care, mental health, chatgpt, education, covid-19, review, large language
model</p>
        <p>Cluster 3: natural language processing, natural language processing systems, machine learning, deep
learning, natural languages, language processing, learning algorithms, virtual assistant, user interfaces,
convolutional neural network, query processing, diagnosis</p>
        <p>Cluster 4: chatbot, learning systems, students, e-learning, learn+, sales, human computer interaction,
knowledge based systems, personnel training, curricula, customer service, engineering education</p>
        <p>Please suggest cluster names and brief descriptions for each of the 4 clusters based on the keywords.”</p>
        <p>ChatGPT (GPT 3.5)
1. Advanced Language Processing and AI Applications: This cluster focuses on advanced language
processing techniques and various AI applications related to text and speech. It encompasses topics
like natural language understanding, computational linguistics, information retrieval, dialogue
systems, and the utilization of cutting-edge technologies such as long short-term memory (LSTM),
embeddings, and reinforcement learning in language models and question answering systems.
2. AI in Healthcare and Social Context: This cluster centers around the applications of artificial
intelligence in healthcare, mental health, and education with a focus on human interaction. It
explores the use of conversational agents, particularly in medical education, mental health care,
and addressing issues like COVID-19. Additionally, it touches on the gender aspect in AI studies
by mentioning controlled studies with adults and gender variations.
3. Core Technologies in Natural Language Processing and Machine Learning: This cluster revolves
around foundational technologies crucial in natural language processing and machine learning. It
includes topics such as deep learning, machine learning algorithms, language processing, neural
networks (like convolutional neural networks), query processing, and the development of virtual
assistants and user interfaces.
4. Educational and Service-oriented AI Applications: This cluster focuses on AI applications in
education, learning systems, and service-oriented domains. It highlights topics like e-learning,
knowledge-based systems, human-computer interaction, customer service, and sales. It also
touches upon the use of AI in student learning, engineering education, and personnel training,
emphasizing knowledge acquisition and application in various educational and service-based
contexts.</p>
        <p>Bing, precise mode (GPT 4.0)
1. Language Processing and Information Retrieval: This cluster focuses on the technical aspects of
natural language understanding and computational linguistics. It includes topics such as speech
processing, semantics, dialogue systems, text processing, language modeling, question answering,
and reinforcement learning for response generation.
2. AI in Healthcare and Education: This cluster revolves around the application of artificial
intelligence, particularly conversational agents like ChatGPT, in fields such as medical education,
healthcare, and mental health. It also touches on demographic factors (male, female, adult) and
current issues like COVID-19.
3. Machine Learning and User Interaction: This cluster is centered on machine learning, deep learning,
and natural language processing systems. It also includes user interfaces and virtual assistants,
indicating a focus on how users interact with these AI systems.
4. AI in Education and Customer Service: This cluster pertains to the use of AI systems like chatbots
in educational settings (e-learning, engineering education, curricula) and customer service. It
also includes topics related to human-computer interaction and knowledge-based systems.
Bing, balanced mode (GPT 4.0)
1. Natural Language Processing (NLP): This cluster includes topics related to the processing of human
language by computers. It covers a wide range of subfields such as computational linguistics, text
processing, information retrieval, and question answering. The cluster also includes techniques
such as embeddings, long short-term memory, and reinforcement learning.
2. Artificial Intelligence (AI) : This cluster includes topics related to the development of intelligent
agents that can perform tasks that typically require human intelligence. It covers a wide range of
subfields such as conversational agents, large language models, and chatbots. The cluster also
includes topics related to health care, education, and COVID-19.
3. Machine Learning (ML): This cluster includes topics related to the development of algorithms that
can learn from data and make predictions or decisions. It covers a wide range of subfields such as
deep learning, learning algorithms, and convolutional neural networks. The cluster also includes
topics related to virtual assistants, user interfaces, and diagnosis.
4. Learning Systems (LS): This cluster includes topics related to the development of systems that can
learn from data and improve over time. It covers a wide range of subfields such as e-learning,
personnel training, and customer service. The cluster also includes topics related to sales,
human-computer interaction, and engineering education.
Bing, creative mode (GPT 4.0)</p>
        <p>Claude 2</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <article-title>Google's new AI Chatbot produces fake health-related evidence-then self-corrects</article-title>
          ,
          <source>PLOS Digital Health</source>
          <volume>3</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          . doi:10.1371/journal.pdig.0000619.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Usigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Salomeo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. J. L. J.</given-names>
            <surname>Zafe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Centeno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A. R. C.</given-names>
            <surname>Sison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Bitancor</surname>
          </string-name>
          ,
          <article-title>Implementation of an Undergraduate Admission Chatbot Using Microsoft Azure's Question Answering and Bot Framework</article-title>
          ,
          <source>in: Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference, AICCC '22</source>
          ,
          Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , p.
          <fpage>240</fpage>
          -
          <lpage>245</lpage>
          . doi:10.1145/3582099.3582135.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Thamilselvan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Natesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Manimaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Naveenkumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Shanthosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vigneshwaran</surname>
          </string-name>
          ,
          <article-title>Designing A Llama 2-Powered Chatbot for Enhanced College Website Support</article-title>
          ,
          <source>in: 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:10.1109/ICCCNT61001.2024.10725472.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Workman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. K.</given-names>
            <surname>Rathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Lerner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. N.</given-names>
            <surname>Palmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Adappa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <article-title>Utility of a LangChain and OpenAI GPT-powered chatbot based on the international consensus statement on allergy and rhinology: Rhinosinusitis</article-title>
          ,
          <source>International Forum of Allergy &amp; Rhinology</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>1101</fpage>
          -
          <lpage>1109</lpage>
          . doi:10.1002/alr.23310.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] OpenAI,
          <article-title>Introducing ChatGPT</article-title>
          ,
          <year>2022</year>
          . URL: https://openai.com/blog/chatgpt.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Symonenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Zaitseva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. V.</given-names>
            <surname>Osadchyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Osadcha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Kruglyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Sysoieva</surname>
          </string-name>
          ,
          <article-title>Application of chatbots for enhancing communication skills of IT specialists</article-title>
          ,
          <source>Journal of Physics: Conference Series</source>
          <volume>2871</volume>
          (
          <year>2024</year>
          )
          <fpage>012026</fpage>
          . doi:10.1088/1742-6596/2871/1/012026.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Riabko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Vakaliuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. V.</given-names>
            <surname>Zaika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Kukharchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. V.</given-names>
            <surname>Kontsedailo</surname>
          </string-name>
          ,
          <article-title>Chatbot algorithm for solving physics problems</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>3553</volume>
          (
          <year>2023</year>
          )
          <fpage>75</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Mintii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <article-title>Optimizing Teacher Training and Retraining for the Age of AI-Powered Personalized Learning: A Bibliometric Analysis</article-title>
          , in:
          <string-name>
            <given-names>E.</given-names>
            <surname>Faure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tryus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vartiainen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Danchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bondarenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bazilo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zaspa</surname>
          </string-name>
          (Eds.),
          <source>Information Technology for Education, Science, and Technics</source>
          , volume
          <volume>222</volume>
          of
          <source>Lecture Notes on Data Engineering and Communications Technologies</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>339</fpage>
          -
          <lpage>357</lpage>
          . doi:10.1007/978-3-031-71804-5_23.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Liashenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <article-title>The Determination and Visualisation of Key Concepts Related to the Training of Chatbots</article-title>
          , in:
          <string-name>
            <given-names>E.</given-names>
            <surname>Faure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tryus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vartiainen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Danchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bondarenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bazilo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zaspa</surname>
          </string-name>
          (Eds.),
          <source>Information Technology for Education, Science, and Technics</source>
          , volume
          <volume>222</volume>
          of
          <source>Lecture Notes on Data Engineering and Communications Technologies</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>111</fpage>
          -
          <lpage>126</lpage>
          . doi:10.1007/978-3-031-71804-5_8.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Liashenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <article-title>Bibliometric analysis of chatbot training research: Key concepts and trends</article-title>
          ,
          <source>Information Technologies and Learning Tools</source>
          <volume>101</volume>
          (
          <year>2024</year>
          )
          <fpage>181</fpage>
          -
          <lpage>199</lpage>
          . doi:10.33407/itlt.v101i3.5622.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          DeepLearning.AI,
          <article-title>Search | The Batch | AI News &amp; Insights</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/search/.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Big Bot Makes Small Talk: A research summary of Facebook's Generative BST chatbot</article-title>
          ,
          <year>2020</year>
          . URL: https://www.deeplearning.ai/the-batch/big-bot-makes-small-talk/.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>Bot Comic: How Google's Meena chatbot developed a sense of humor</article-title>
          ,
          <year>2020</year>
          . URL: https://www.deeplearning.ai/the-batch/bot-comic/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <article-title>Chatbots for Productivity: Microsoft extends Copilot to 365 and Windows</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/the-batch/microsoft-extends-copilot-365-windows/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>China Chases Chatbots: Chinese tech companies race to cash in on ChatGPT fever</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/the-batch/chinese-tech-companies-race-to-cash-in-on-chatgpt-fever/.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <article-title>Search War! Google and Microsoft both announce AI-Powered search</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/the-batch/google-and-microsoft-both-announce-ai-powered-search/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <article-title>Chatbots Disagree on Covid-19: Medical chatbots offered conflicting Covid advice</article-title>
          ,
          <year>2020</year>
          . URL: https://www.deeplearning.ai/the-batch/chatbots-disagree-on-covid-19/.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <article-title>Language Models, Extended: Large language models grew more reliable and less biased in 2022</article-title>
          ,
          <year>2022</year>
          . URL: https://www.deeplearning.ai/the-batch/language-models-grew-more-reliable-and-less-biased-in-2022/.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <article-title>Cost Containment for Generative AI: Microsoft's quest to reduce the size and cost of language models</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/the-batch/microsofts-quest-to-reduce-the-size-and-cost-of-language-models/.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <article-title>What We Know - and Don't Know - About Foundation Models: A new Stanford index to assess the transparency of leading AI models</article-title>
          ,
          <year>2023</year>
          . URL: https://www.deeplearning.ai/the-batch/a-new-stanford-index-to-assess-the-transparency-of-leading-ai-models/.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          Elsevier B.V., Scopus - Document search,
          <year>2023</year>
          . URL: https://www.scopus.com/search/form.uri?display=basic#basic.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Van Eck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Waltman</surname>
          </string-name>
          , VOSviewer Manual, Universiteit Leiden,
          <year>2023</year>
          . URL: https://www.vosviewer.com/documentation/Manual_VOSviewer_1.6.20.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          Centre for Science and Technology Studies, Leiden University, The Netherlands,
          <article-title>VOSviewer - Visualizing scientific landscapes</article-title>
          ,
          <year>2023</year>
          . URL: https://www.vosviewer.com/.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Uprety</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Jeong</surname>
          </string-name>
          ,
          <article-title>The Impact of Semi-Supervised Learning on the Performance of Intelligent Chatbot System</article-title>
          ,
          <source>Computers, Materials &amp; Continua</source>
          <volume>71</volume>
          (
          <year>2022</year>
          )
          <fpage>3937</fpage>
          -
          <lpage>3952</lpage>
          . doi:10.32604/cmc.2022.023127.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mudaliar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kamat</surname>
          </string-name>
          ,
          <article-title>LSTM based Ensemble Network to enhance the learning of Long-term Dependencies in Chatbot</article-title>
          ,
          <source>International Journal of Automation and Smart Technology</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <fpage>2286</fpage>
          -
          <lpage>2286</lpage>
          . doi:10.5875/ausmt.v12i1.2286.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <article-title>A Multi-modal Seq2seq Chatbot Framework</article-title>
          , in:
          <string-name>
            <given-names>Z.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jabbar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          (Eds.),
          <source>Proceeding of 2021 International Conference on Wireless Communications, Networking and Applications</source>
          , Springer Nature, Singapore,
          <year>2022</year>
          , pp.
          <fpage>225</fpage>
          -
          <lpage>233</lpage>
          . doi:10.1007/978-981-19-2456-9_24.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bustamam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Al-Ash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sarwinda</surname>
          </string-name>
          ,
          <article-title>Intelligent Chatbot Adapted from Question and Answer System Using RNN-LSTM Model</article-title>
          ,
          <source>Journal of Physics: Conference Series</source>
          <volume>1844</volume>
          (
          <year>2021</year>
          )
          <fpage>012001</fpage>
          . doi:10.1088/1742-6596/1844/1/012001.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dharrao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gite</surname>
          </string-name>
          ,
          <article-title>TherapyBot: a chatbot for mental well-being using transformers</article-title>
          ,
          <source>International Journal of Advances in Applied Sciences</source>
          <volume>13</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          . doi:10.11591/ijaas.v13.i1.pp1-12.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention Is All You Need</article-title>
          ,
          <year>2023</year>
          . arXiv:1706.03762.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A Mathematical Investigation of Hallucination and Creativity in GPT Models</article-title>
          ,
          <source>Mathematics</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>2320</fpage>
          . doi:10.3390/math11102320.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kansal</surname>
          </string-name>
          ,
          <article-title>Finetuning: The Theory</article-title>
          , in:
          <source>Building Generative AI-Powered Apps: A Hands-on Guide for Developers</source>
          , Apress, Berkeley, CA,
          <year>2024</year>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>100</lpage>
          . doi:10.1007/979-8-8688-0205-8_5.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Christiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leike</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Legg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Deep reinforcement learning from human preferences</article-title>
          ,
          <year>2023</year>
          . arXiv:1706.03741.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>T.-L.</given-names>
            <surname>Chou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Hsueh</surname>
          </string-name>
          ,
          <article-title>A Task-oriented Chatbot Based on LSTM and Reinforcement Learning</article-title>
          ,
          <source>in: Proceedings of the 2019 3rd International Conference on Natural Language Processing and Information Retrieval</source>
          , NLPIR '19,
          Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , p.
          <fpage>87</fpage>
          -
          <lpage>91</lpage>
          . doi:10.1145/3342827.3342844.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Q.-D. L.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.-N.</given-names>
            <surname>Huynh</surname>
          </string-name>
          ,
          <article-title>Enhancing Conversational Model With Deep Reinforcement Learning and Adversarial Learning</article-title>
          ,
          <source>IEEE Access</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>75955</fpage>
          -
          <lpage>75970</lpage>
          . doi:10.1109/ACCESS.2023.3297652.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>R.</given-names>
            <surname>Liashenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <article-title>Training Specialised Chatbots on Ukrainian Scientific Text Corpora Using Transfer Learning</article-title>
          ,
          <source>in: 2024 IEEE 18th International Conference on Computer Science and Information Technologies (CSIT)</source>
          , IEEE,
          <year>2025</year>
          (in press).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ilievski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Musat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hossman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baeriswyl</surname>
          </string-name>
          ,
          <article-title>Goal-oriented chatbot dialog management bootstrapping with transfer learning</article-title>
          ,
          <source>in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, International Joint Conferences on Artificial Intelligence Organization</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>4115</fpage>
          -
          <lpage>4121</lpage>
          . doi:10.24963/ijcai.2018/572.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>T.</given-names>
            <surname>Taulli</surname>
          </string-name>
          ,
          <source>AI-Assisted Programming: Better Planning, Coding, Testing, and Deployment</source>
          , O'Reilly Media, Inc., Sebastopol, CA,
          <year>2024</year>
          . URL: https://www.oreilly.com/library/view/ai-assisted-programming/9781098164553/.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          ,
          <article-title>Fine-tuning - OpenAI API</article-title>
          ,
          <year>2024</year>
          . URL: https://platform.openai.com/finetune.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Google</surname>
          </string-name>
          ,
          <article-title>Google AI Studio</article-title>
          ,
          <year>2024</year>
          . URL: https://aistudio.google.com/.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Google</surname>
          </string-name>
          ,
          <article-title>Available regions for Google AI Studio and Gemini API</article-title>
          ,
          <year>2024</year>
          . URL: https://ai.google.dev/gemini-api/docs/available-regions.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>S.</given-names>
            <surname>Elaprolu</surname>
          </string-name>
          ,
          <article-title>Introducing the AWS Generative AI Innovation Center's Custom Model Program for Anthropic Claude</article-title>
          ,
          <year>2023</year>
          . URL: https://aws.amazon.com/blogs/machine-learning/introducing-the-aws-generative-ai-innovation-centers-custom-model-program-for-anthropic-claude/.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <source>Language Models are Unsupervised Multitask Learners</source>
          ,
          <year>2019</year>
          . URL: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Ohlheiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hao</surname>
          </string-name>
          ,
          <article-title>An AI is training counselors to deal with teens in crisis</article-title>
          ,
          <year>2021</year>
          . URL: https://www.technologyreview.com/2021/02/26/1020010/trevor-project-ai-suicide-hotline-training/.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          1. Natural Language Understanding and Generation: This cluster contains keywords related to the tasks and methods of natural language understanding and generation, such as classification, dialogue systems, question answering, and response generation. It also includes keywords that indicate the performance and state of the art of these tasks, such as semantics, long short-term memory, embeddings, and reinforcement learning.
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          2. Artificial Intelligence and Health Care: This cluster contains keywords related to the applications and implications of artificial intelligence in health care, such as conversational agent, medical education, mental health, and COVID-19. It also includes keywords that indicate the characteristics and challenges of these applications, such as human, adult, male, female, controlled study, review, and large language model.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          3. Natural Language Processing and Machine Learning: This cluster contains keywords related to the general concepts and techniques of natural language processing and machine learning, such as natural language processing systems, machine learning, deep learning, learning algorithms, and convolutional neural network. It also includes keywords that indicate the interfaces and domains of these techniques, such as virtual assistant, user interfaces, query processing, and diagnosis.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          4. Chatbot and Education: This cluster contains keywords related to the design and evaluation of chatbot systems for education, such as chatbot, learning systems, students, e-learning, learn+, and curricula. It also includes keywords that indicate the contexts and outcomes of these systems, such as sales, human computer interaction, knowledge based systems, personnel training, customer service, and engineering education.
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          1. Natural Language Processing Applications: This cluster focuses on various NLP applications like classification, speech processing, dialogue systems, question answering, etc. Central themes include semantics, computational linguistics, LSTMs, performance, and building practical natural language systems.
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          2. Conversational AI and Impacts: This cluster deals with conversational AI, its use in healthcare/education, and analyzing its impacts. Key themes look at chatbots/assistants, how they interact with and impact humans, with analysis around ethics, proper training, and responsible AI.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          3. Core AI and ML Techniques: This cluster covers fundamental AI and ML techniques like natural language processing, machine learning, deep learning, algorithms, etc. It focuses on the key technical methods and capabilities that power many AI systems.
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          4. AI for Training and Customer Service: The fourth cluster deals with leveraging AI for use cases like education/training and customer service. Central ideas look at e-learning, knowledge systems, training curriculums, human-computer interaction, and providing automation for repetitive customer service queries.
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          1. Technical NLP: This cluster focuses on the technical aspects of natural language processing (NLP), including classification, training data, speech processing, semantics, dialogue systems, and language models.
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          2. Applications of NLP: This cluster focuses on the applications of NLP in various domains, including conversational agents, healthcare, education, and customer service.
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          3. Machine Learning and Deep Learning for NLP: This cluster focuses on the machine learning and deep learning techniques used in NLP.
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          4. Chatbots and Learning Systems: This cluster focuses on the development and evaluation of chatbots and other learning systems.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>