=Paper= {{Paper |id=Vol-3762/571 |storemode=property |title=Advancements and Challenges in Generative AI: Architectures, Applications, and Ethical Implications |pdfUrl=https://ceur-ws.org/Vol-3762/571.pdf |volume=Vol-3762 |authors=Flora Amato,Egidia Cirillo,Mattia Fonisto,Alberto Moccardi,Vincenzo Moscato,Carlo Sansone,Stefano Marrone,Antonio Maria Rinaldi,Antonio Galli,Domenico Benfenati,Giovanni Maria De Filippis,Lidia Marassi,Narendra Patwardhan,Antonio Elia Pascarella,Cristiano Russo,Cristian Tommasino |dblpUrl=https://dblp.org/rec/conf/ital-ia/AmatoCFMMS0RGBF24 }} ==Advancements and Challenges in Generative AI: Architectures, Applications, and Ethical Implications== https://ceur-ws.org/Vol-3762/571.pdf
                                Advancements and Challenges in Generative AI:
                                Architectures, Applications, and Ethical Implications
                                Flora Amato1,* , Domenico Benfenati1 , Egidia Cirillo1 , Giovanni Maria De Filippis1 ,
                                Mattia Fonisto1 , Antonio Galli1 , Stefano Marrone1 , Lidia Marassi1 , Vincenzo Moscato1 ,
                                Narendra Patwardhan1 , Alberto Moccardi1 , Antonio Elia Pascarella1 , Antonio M. Rinaldi1 ,
                                Cristiano Russo1 , Carlo Sansone1 and Cristian Tommasino1,2
                                1
                                    Department of Electrical Engineering and Information Technology (DIETI), University of Naples Federico II, Via Claudio 21, 80125 Naples, Italy
                                2
                                    Interdepartmental Center for Research on Management and Innovation in Healthcare (CIRMIS), University of Naples Federico II, Naples, Italy


                                                   Abstract
                                                   This paper presents the architecture, classification, and major applications of Generative AI interfaces, specifically
                                                   chatbots. It details how these interfaces work under different Generative AI approaches and describes their architecture.
                                                   On the one hand, generative models use advanced machine learning techniques to produce dynamic, contextually relevant
                                                   responses automatically. On the other hand, retrieval-based models depend on a predefined response library. The paper also
                                                   discusses the use of Generative AI to populate Multimedia Knowledge Graphs (KGs), presenting technologies based on semantic
                                                   analysis, deep learning, and NoSQL that integrate and retrieve data more effectively. The social and ethical challenges that
                                                   come with the deployment of generative models are critically reviewed. This discussion highlights the balance that must be
                                                   maintained between technological progress and ethical responsibility in developing AI. The paper presents a comprehensive
                                                   review of state-of-the-art Generative AI, with special focus on its promises and pitfalls in research related to both natural
                                                   language processing and knowledge management.

                                                   Keywords
                                                   artificial intelligence, Generative AI



Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
Email: flora.amato@unina.it (F. Amato); egidia.cirillo@unina.it (E. Cirillo); mattia.fonisto@unina.it (M. Fonisto); antonio.galli@unina.it (A. Galli); stefano.marrone@unina.it (S. Marrone); lidia.marassi@unina.it (L. Marassi); vincenzo.moscato@unina.it (V. Moscato); narendra.patwardhan@unina.it (N. Patwardhan); alberto.moccardi@unina.it (A. Moccardi); antonioelia.pascarella@unina.it (A. E. Pascarella); carlo.sansone@unina.it (C. Sansone)
ORCID: 0000-0002-5128-5558 (F. Amato); 0009-0008-5825-8043 (D. Benfenati); 0009-0002-8395-0724 (G. M. D. Filippis); 0000-0001-6852-0377 (S. Marrone); 0009-0006-8134-5466 (L. Marassi); 0000-0002-4807-5664 (V. Moscato); 0000-0002-4807-5664 (N. Patwardhan); 0000-0002-1079-7741 (A. E. Pascarella); 0000-0001-7003-4781 (A. M. Rinaldi); 0000-0002-8732-1733 (C. Russo); 0000-0002-8176-6950 (C. Sansone); 0000-0001-9763-8745 (C. Tommasino)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

A chatbot, also known as a conversational agent, is an artificial intelligence (AI) software that can simulate a conversation (or a chat) with a user through text or voice interfaces [1]. Chatbots can use natural language processing (NLP) and machine learning algorithms to understand user inputs and generate appropriate responses, allowing them to provide assistance, automate tasks, and perform other functions without the need for human intervention.

The term "chatbot", short for "chatterbot", was originally coined by Michael Mauldin in 1994 to describe these conversational programs in his attempt to develop a Turing System [2].

This work aims to explore the various techniques, approaches and technologies that have been used to develop chatbots since the late 1990s; furthermore, we provide insights into the most common applications and use cases.

2. Architecture and Classification of Generative AI Interfaces

To describe the architecture of Generative AI interfaces, we follow [3, 4, 5] and divide the structure of intelligent interfaces proposed in the state of the art into four parts: the interface, the multimedia processor, the multimodal input analysis, and the response generator. In detail,

   1. The interface is responsible for managing the interaction between the chatbot and users, which involves receiving inputs in various forms such as text or audio and returning appropriate responses.
   2. The multimedia processor (optional) may be required to preprocess voice or video signals and
      convert them into text or recognize the user's tone to facilitate response generation.
   3. The multimodal input analysis unit handles classification and data pre-treatment, often using natural language understanding (NLU) techniques such as semantic parsing, slot filling, and intent identification.
   4. The response generator either associates a proper response for the given pre-processed input from a stored dataset or, using modern machine learning techniques, maps the normalized input to the output using a pre-trained model.

The response generator is the core component of a chatbot where the actual question-and-answer process takes place, and it can be considered as the "brain" of the system. Based on the architecture of the response generator, chatbot systems can be classified into two main categories: retrieval-based chatbots, which select their responses from a pre-defined set of possible outcomes, and generative-based chatbots, which use ML techniques to dynamically generate answers [6].

2.1. Retrieval-based chatbots

The goal of retrieval-based chatbots is to "understand" the user input and choose the most suitable responses from a knowledge dataset. There are four sub-categories of retrieval-based chatbots, which can be distinguished based on the architecture of their knowledge dataset and retrieval techniques. These categories are template-based, corpus-based, intent-based, and RL-based [5].

Template-based chatbots

Template-based chatbots select responses from a set of possible candidates by comparing the user input to certain query patterns.

Corpus-based chatbots

Although template-based chatbots have shown effectiveness in certain cases, their fundamental architecture necessitates scanning through all potential outputs for each input until the appropriate response is located. As a result, this approach can be slow and unsuitable for applications with a large knowledge dataset.

Intent-based chatbots

Intent-based chatbots utilize machine learning techniques to establish a connection between user inputs and pre-defined outputs. Typically, relevant data is collected and stored to establish associations between user intents (i.e., the conceptual meaning behind a user's request) and appropriate responses. Next, a pre-trained model leverages this information to link normalized user inputs with the most probable user intent [7].

RL-based chatbots

RL-based chatbots adopt reinforcement learning for response generation. Reinforcement learning itself is mainly based on the Markov decision process, i.e. a 4-tuple $(S, A, P_a, R_a)$ where:

   • $S = (s_1, s_2, ..., s_n)$ is a set of states, called the state space;
   • $A = (a_1, a_2, ..., a_m)$ is a set of actions, called the action space;
   • $P_a(s, s') = \Pr(s_{t+1} = s' \mid s_t = s, a_t = a)$ is the probability that action $a$, in the state $s$ at step $t$, will lead to state $s'$ at step $t+1$;
   • $R_a(s, s')$ is the reward received after transitioning from state $s$ to state $s'$ when action $a$ is performed.

The goal of a Markov decision process is to find a function $\pi(s)$ (generally called policy) that associates, for every state $s_t$, the action $\pi(s_t) = a_t$ which maximizes the overall reward, i.e. the following expectation value:

   $$ Q_\pi = E\left[ \sum_{t=0}^{\infty} \gamma^t R_{\pi(s_t)}(s_t, s_{t+1}) \right] \qquad (1) $$

where $\gamma$ is a coefficient (the discount factor) between 0 and 1 [8]. In RL-based chatbots, each state $s_i$ corresponds to a specific turn in the conversation and is usually represented by an embedded vector. After the chatbot is trained, it is able to select the most appropriate response (action) $a_i$ to ensure that the conversation remains relevant and coherent [9].

2.2. Generative-based chatbots

Generative-based chatbots have the advantage of being able to generate responses dynamically, which can lead to more natural and flexible conversations with users. Generative chatbots can generate novel responses, which means that they are not limited to pre-defined responses like retrieval-based chatbots. This flexibility allows them to provide more personalized and relevant responses. Depending on the machine learning architecture used, we discuss RNN-based chatbots and Transformer-based chatbots.

RNN-based chatbots

One commonly used method for developing generative-based chatbots involves the use of two interconnected neural networks known as recurrent neural networks (RNNs). The first network, called the encoder, is trained
to associate an input sentence with an intermediate vector called the context vector. The second network, called the decoder, takes the context vector as input and is trained to generate an output sentence, either by generating actual words or by using tokens. This approach is commonly referred to as "sequence-to-sequence" or Seq2Seq [6, 10].

As RNN-based chatbot responses are dynamically generated through machine learning models, they may be less precise and more uncertain than retrieval-based chatbots. For this reason, RNN-based chatbots are less commonly used in task- or knowledge-oriented scenarios and are instead more frequently used in entertainment and mental-health-related activities [5].

Transformer-based chatbots

A Transformer is a recent type of neural network architecture used for NLU and chatbots. First introduced in [11], it is also used in other tasks such as language translation and text summarization. Transformers are based on the self-attention mechanism, which allows the model to learn which parts of the input sequence to attend to at each step of processing, based on the relevance of the other parts of the sequence to the current position. This is done through a process called scaled dot-product attention, where the model learns a set of weights to compute a weighted sum of the input sequence representations.

An important language model based on the Transformer architecture is the Generative Pre-trained Transformer (GPT), which was developed by OpenAI in 2020 [12]. GPT serves as the underlying architecture for the ChatGPT chatbot, which has gained widespread recognition for its ability to provide detailed and articulate responses across a variety of domains [13].

3. Multiquery Retrieval Augmented Generation

At the current forefront of Generative Artificial Intelligence (Gen-AI), it is vitally important to streamline complex decision-making processes by making accessible and comprehensible tools available to all users. This section proposes an alternative to the classical RAG, introduced by Lewis et al. in 2021 [14], enhancing its capabilities with a multiquery approach and presenting a concise and solid architectural flow along with the main evaluation metrics.

3.1. Methodology

This methodological section examines the implications of leveraging Generative Artificial Intelligence (AI) to streamline complex decision-making processes, augmenting cutting-edge technologies and enhancing classical Retrieval-Augmented Generation (RAG) models. By exploring a multi-query, human-centred RAG application design, the approach supports access to and understanding of sophisticated AI capabilities, bridging the gap between technical expertise and practical application. The outcome is a concise and robust architectural flow proposal, which lays the groundwork for the integration of multiquery-RAG solutions into decision-making processes and offers further insights that extend beyond the confines of this study and pave the way for future advancements in the field.

Question Generation Chain. The multiquery-RAG system distinguishes itself through its ability to generate multiple variations of the original user query, in a human-like fashion, through a specialized question generation chain that produces a predefined number of alternative queries capturing distinct viewpoints and nuances associated with the original question. This diversification of the query set, if correctly fine-tuned, plays a pivotal role in overcoming the limitations of distance-based similarity searches in vector databases, ensuring a more comprehensive and efficient document retrieval process than the classical retrieval process.

Answer Generation Chain. Following the retrieval of information (documents), the system proceeds to generate answers by synthesizing and formulating responses using the data extracted from the documents and leveraging LLM systems. By contextualizing and elaborating on this information, it ensures that the responses are both accurate and easily understandable for non-experts, facilitating broader accessibility and utilization of the information among a wider audience.

3.2. Evaluation Criteria

This section outlines the principal metrics [15] for evaluating a Retrieval-Augmented Generation (RAG) system, each measuring a different aspect of the system's performance, as presented in Figure 1.

Context Precision. This metric evaluates the signal-to-noise ratio within the retrieved contexts, measuring how many of the retrieved documents are actually relevant with respect to the user's query.

Context Recall. This metric assesses whether all necessary information required to answer the query has been retrieved.
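As an illustration of the context precision and context recall criteria in this section, the following minimal Python sketch computes both as simple set ratios. It is not the implementation used by the authors or by the framework cited in [15], which may define these metrics differently (for example, with rank-aware weighting). The function names, the document identifiers, and the availability of ground-truth relevance labels are all assumptions made for the example.

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant to the query.

    `retrieved` is the list of document ids returned by the retriever and
    `relevant` is the set of ids labelled as relevant (assumed to be known).
    """
    if not retrieved:
        return 0.0
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(retrieved)


def context_recall(retrieved, ground_truth):
    """Fraction of ground-truth items that appear among the retrieved documents."""
    if not ground_truth:
        return 1.0  # nothing was required, so nothing is missing
    found = sum(1 for item in ground_truth if item in retrieved)
    return found / len(ground_truth)


# Hypothetical retriever output and ground-truth labels for one query.
retrieved_ids = ["d1", "d4", "d7", "d9"]
relevant_ids = {"d1", "d2", "d7"}

print("context precision:", context_precision(retrieved_ids, relevant_ids))
print("context recall:", context_recall(retrieved_ids, relevant_ids))
```

In this toy case two of the four retrieved documents are relevant, and two of the three required documents were retrieved, so a low precision with a higher recall would indicate a retriever that is noisy rather than incomplete.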
The context recall metric thus ensures that the system's knowledge base covers all aspects needed to formulate a comprehensive and accurate response, and it relies on a comparison between the retrieved contexts and the ground truths.

Faithfulness. This metric quantifies the factual accuracy of the answers generated by the RAG system. It involves counting the number of correct factual statements made in the generated answers based on the retrieved contexts and comparing this count to the total number of statements in the answers.

Answer Relevancy. This metric measures how well the generated answers address the user's queries. For example, if a query asks for multiple pieces of information, the relevancy score reflects how completely the response addresses all elements of the query.

Figure 1: RAG evaluation criteria

4. Multimedia Knowledge Graph population using Generative AI

Knowledge Graphs (KGs) serve as potent repositories, adeptly organizing, connecting, and extracting insights from many data sources, embodying contemporary knowledge management principles in semantic web applications [16]. Despite their invaluable utility, realizing the full potential of KGs necessitates a systematic population with relevant information, a task fraught with challenges, mainly when data is scarce [17].

Recent advancements, however, offer promising solutions. The works in [18] and [19] present novel frameworks integrating semantic analysis, deep learning, and NoSQL technologies to extract entities from knowledge corpora, bridging the gap between textual and multimedia sources. Their approaches mark significant strides in enriching KGs with diverse data types, fostering more comprehensive knowledge representation and analysis.

Meanwhile, Chen et al. [20] propose a generative approach to KG population, leveraging machine learning to establish relationships and reduce human intervention in the curation process. Training models to learn underlying data distributions and generate triplets regardless of entity pair co-occurrence in textual corpora paves the way for more efficient and scalable KG construction. This innovative approach streamlines the population process and broadens the scope of knowledge capture, enabling KGs to encapsulate a wider array of interconnected concepts and relationships.

Manual curation, though traditional, is labor-intensive and impractical in the face of expanding data landscapes [21]. To address this, a data-centric architecture harnessing generative deep-learning models emerges, automating KG creation, particularly for multimedia instances. By synthesizing multimedia data, irrespective of absolute data scarcity, a dynamic, infinitely expandable pool of instances is ensured, underpinning model training and inference with a multimedia knowledge graph that evolves alongside data trends.

Different knowledge graph population approaches with generative AI are based on standard steps. The first is gathering information from curated textual sources. This information can be enriched by using Linked Open Data (LOD), so that image generation is based on an enhanced textual description that is as complete as possible. The next step takes the previously obtained textual statement and produces a representative multimedia instance of the input text via a generative text-image synthesis model. The last step consists of using a focused crawler, which allows a check on the quality of the generated image, exploiting different metrics that measure the degree of similarity of the generated image with respect to its textual description and to real images crawled from the web. If the image from the previous step exhibits metric values that surpass a threshold determined through experimental evaluation, it can be stored in the node of the multimedia knowledge base.

In image generation for knowledge graph population, text-image synthesis models are developed to bridge the semantic gap between textual descriptions and corresponding visual representations. These models leverage cutting-edge generative strategies to produce high-quality images aligned with the provided textual prompts. Text-to-image models have improved considerably in recent years, migrating from Generative Adversarial Networks (GANs) to Latent Diffusion Models, such as Stable Diffusion [22]. A latent diffusion model refines a latent representation by applying diffusion steps in the latent space, gradually reducing noise and revealing the desired image. This iterative process involves adding noise and updating the latent code. The model implements a decoder network to reconstruct the image from the refined latent code.

Evaluating the quality of the multimedia instances for the KG node is an important phase. The evaluation process of text-to-image synthesis models involves assessing their accuracy in converting text inputs into synthetic images.

Some quantitative metrics are used to assess not only the quality of the image with respect to the text but also the degree of realism of a generated image by comparing it to real images. These include Cosine Similarity, which compares feature vectors by calculating the cosine between them; FID (Fréchet Inception Distance) [23], a numerical value that quantifies the similarity between the statistical distributions of real and generated images by computing the Fréchet distance between the two distributions; and CLIP score [24], a metric that captures the relationship between images and text and is used to evaluate the model's ability to rank images based on their relevance to a given textual description and vice versa.

5. Ethical and social challenges

The recent advances in generative AI are revolutionizing many sectors thanks to the ability to create original content based on patterns learned from training data. Models such as those based on transformer architectures have already demonstrated significant success in various fields, including natural language processing, computer vision, and reinforcement learning. However, despite the advantages offered by generative models, their development and deployment raise concerns regarding ethical and environmental implications. Firstly, these models require massive computational resources and consume a large amount of energy during both training and execution processes. This raises concerns about the environmental impact of AI, especially considering the urgent need to reduce carbon emissions to address climate change. Additionally, there are ethical concerns regarding the use and management of training data. Since these models can generate original content, there is a risk that they may perpetuate biases or discriminations present in the training data, raising questions about fairness, privacy, and data security in the era of AI [25].

The Hominis project, conducted at the University of Naples Federico II in collaboration with industrial part-
on creating a concrete sustainable generative model, addressing crucial issues related to data collection, key model components, and essential additions. One of the main goals of the project is to improve model efficiency without compromising performance, using techniques such as attention and linear layer optimization within the Transformer architecture. Hominis also aims to ensure the sanitization of public data and develop data collection strategies to capture a wide range of multifaceted data. Additionally, the project involves developing tools for the community to analyze, curate, and critique datasets while ensuring fairness, privacy, and legality. The proposed methodologies, such as Universal Tokenization, Retrieval-Augmented Generation (RAG), the use of diffusion to improve model controllability, and the use of the muTransfer technique to optimize hyperparameters and reduce the carbon footprint associated with training, all aim to improve the efficiency, sustainability, and fairness of AI models. In particular, the approach of unifying data through Universal Tokenization can help better manage data diversity, while RAG can improve model relevance and accuracy, ensuring greater fairness in outcomes. Furthermore, the use of diffusion to improve model controllability helps ensure that AI outputs are transparent and understandable. Today, attention to sustainable, adaptable, and responsible AI is crucial to ensure that the benefits of artificial intelligence are evenly distributed and that negative impacts, such as the carbon footprint associated with model training, are minimized. In an era where sustainable and responsible AI is essential for our future, projects like Hominis represent a step in the right direction, helping ensure that the benefits of AI are accessible to all while minimizing negative impacts on the environment and society.

Acknowledgments

This work was partially supported by PNRR MUR Project PE0000013-FAIR. The FAIR project is committed to promoting an advanced vision of Artificial Intelligence, driving research and development in this crucial field and constantly keeping ethical, legal and sustainability considerations in mind.

References

[1] G. Caldarini, S. Jaf, K. McGarry, A literature survey of recent advances in chatbots (2022). doi:https://doi.org/10.48550/arXiv.2201.06657.
[2] M. Mauldin, Chatterbots, tinymuds, and the turing test: Entering the loebner prize competition, 1994.
                                                             [3] S. A. Abdul-Kaer, J. Woods, Survey on chatbot de-
ners (DeepKapha), aims to advance toward sustainable
                                                                   sign techniques in speech conversation systems,
and programmable AI solutions [26]. The project focuses
     International Journal of Advanced Computer Science and Applications, Vol. 6, No. 7 (2015).
 [4] P. Jonell, Using social and physiological signals for user adaptation in conversational
     agents, Proceedings of the international joint conference on autonomous agents and
     multiagent systems, AAMAS (Vol. 4(c), pp. 2420-2422) (2019).
 [5] B. Luo, R. Y. K. Lau, C. Li, Y. Si, A critical review of state-of-the-art chatbot designs
     and applications, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
     Volume 12, Issue 1 (2022). doi:https://doi.org/10.48550/arXiv.2201.06657.
 [6] H. Chen, X. Liu, D. Yin, J. Tang, A survey on dialogue systems: Recent advances and new
     frontiers, ACM SIGKDD Explorations Newsletter (2018).
     doi:https://doi.org/10.48550/arXiv.1711.01731.
 [7] M. Franco, B. Rodrigues, E. J. Scheid, A. Jacobs, C. Killer, Z. G. L., S. B., Secbot: a
     business-driven conversational agent for cybersecurity planning and management, 16th
     International Conference on Network and Service Management (CNSM) (2020).
     doi:https://doi.org/10.23919/CNSM50824.2020.9269037.
 [8] H. Cuayáhuitl, D. Lee, S. Ryu, Y. Cho, S. Choi, S. Indurthi, S. Yu, H. Choi, I. Hwang,
     J. Kim, Ensemble-based deep reinforcement learning for chatbots, Neurocomputing (2019).
     doi:https://doi.org/10.48550/arXiv.1908.10422.
 [9] I. V. Serban, C. Sankar, M. Germain, S. Zhang, Z. Lin, S. Subramanian, T. Kim, M. Pieper,
     S. Chandar, N. R. Ke, S. Rajeshwar, A. de Brebisson, J. M. R. Sotelo, D. Suhubdy,
     V. Michalski, A. Nguyen, J. Pineau, Y. Bengio, A deep reinforcement learning chatbot,
     2017. doi:https://doi.org/10.48550/arXiv.1709.02349.
[10] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio,
     Learning phrase representations using rnn encoder-decoder for statistical machine
     translation, 2014. doi:https://doi.org/10.48550/arXiv.1406.1078.
[11] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser,
     I. Polosukhin, Attention is all you need, 2017.
     doi:https://doi.org/10.48550/arXiv.1706.03762.
[12] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving language understanding by
     generative pre-training, 2020.
[13] S. Lock, What is ai chatbot phenomenon chatgpt and could it replace humans?, 2022. URL:
     https://www.theguardian.com/technology/2022/dec/05/what-is-ai-chatbot-phenomenon-chatgpt-and-could-it-replace-humans.
[14] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis,
     W.-t. Yih, T. Rocktäschel, S. Riedel, D. Kiela, Retrieval-augmented generation for
     knowledge-intensive nlp tasks, 2021. arXiv:2005.11401.
[15] S. Es, J. James, L. Espinosa-Anke, S. Schockaert, Ragas: Automated evaluation of retrieval
     augmented generation, 2023. arXiv:2309.15217.
[16] J. Zhang, M. Pourreza, R. Ramachandran, T. J. Lee, P. Gatlin, M. Maskey, A. M. Weigel,
     Facilitating data-centric recommendation in knowledge graph, in: 2018 IEEE 4th
     International Conference on Collaboration and Internet Computing (CIC), IEEE, 2018,
     pp. 207–216.
[17] H. Li, G. Appleby, C. D. Brumar, R. Chang, A. Suh, Knowledge graphs in practice:
     Characterizing their users, challenges, and visualization opportunities, 2023.
     arXiv:2304.01311.
[18] M. Muscetti, A. M. Rinaldi, C. Russo, C. Tommasino, Multimedia ontology population through
     semantic analysis and hierarchical deep features extraction techniques, Knowledge and
     Information Systems 64 (2022) 1283–1303.
[19] A. M. Rinaldi, C. Russo, C. Tommasino, A novel approach to populate multimedia knowledge
     graph via deep learning and semantic analysis, in: Proceedings of the 14th International
     Conference on Management of Digital EcoSystems, 2022, pp. 40–47.
[20] H. Chen, C. Zhang, J. Li, P. S. Yu, N. Jing, Kggen: A generative approach for incipient
     knowledge graph population, IEEE Transactions on Knowledge and Data Engineering 34 (2022)
     2254–2267. doi:10.1109/TKDE.2020.3014166.
[21] S. Issa, O. Adekunle, F. Hamdi, S. S.-S. Cherfi, M. Dumontier, A. Zaveri, Knowledge graph
     completeness: A systematic literature review, IEEE Access 9 (2021) 31322–31339.
[22] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer, High-resolution image synthesis
     with latent diffusion models, 2022. arXiv:2112.10752.
[23] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, S. Hochreiter, Gans trained by a two
     time-scale update rule converge to a local nash equilibrium, Advances in neural
     information processing systems 30 (2017).
[24] J. Hessel, A. Holtzman, M. Forbes, R. L. Bras, Y. Choi, Clipscore: A reference-free
     evaluation metric for image captioning, arXiv preprint arXiv:2104.08718 (2021).
[25] G. Tamburrini, et al., Digital humanism and global issues in artificial intelligence
     ethics, 2022.
[26] N. Patwardhan, S. Shetye, L. Marassi, M. Zuccarini, T. Maiti, T. Singh, Designing
     human-centric foundation models, reconstruction 9 (2023) 10.