=Paper=
{{Paper
|id=Vol-3762/514
|storemode=property
|title=Road map per la creazione di un agente conversazionale per la scoperta di servizi pubblici coerente con le direttive di Design System Italia (DSI
|pdfUrl=https://ceur-ws.org/Vol-3762/514.pdf
|volume=Vol-3762
|authors=Davide Bruno
|dblpUrl=https://dblp.org/rec/conf/ital-ia/Bruno24
}}
==Road map per la creazione di un agente conversazionale per la scoperta di servizi pubblici coerente con le direttive di Design System Italia (DSI==
Road map for the implementation of a conversational
agent chatbot consistent with the guidelines of the Design
System Italy (DSI)
Davide Bruno
Regione Toscana, via di Novoli, 27, Firenze, 50127, Italy
Abstract
This project proposal sets out some cornerstones for the design and subsequent implementation
of a conversational agent based on intents and generative artificial intelligence, with the aim of
exploiting public service semantics (CPSV-AP_IT, the Core Public Service Vocabulary) to improve the
interaction between citizens and the public sector (PS).
Keywords
generative artificial intelligence, public sector (PS), citizen, CPSV-AP_IT, CEUR-WS
1. Introduction

Generative artificial intelligence will certainly have a significant impact on the public sector (PS). This technology offers opportunities to improve the efficiency, transparency and quality of public services. By implementing systems based on generative AI, the PS can automate repetitive processes, optimise data management and improve interaction with citizens, with the aim of improving productivity, accessibility and efficiency in service delivery. It leverages technologies such as machine learning and natural language processing (NLP).

These tools, known as 'conversational applications' or more commonly 'chatbots', exploit the ability of machines to understand natural language to handle requests and interactions with citizens in a timely and contextual manner. These applications will make it possible to improve the relationship with citizens, providing personalised and rapid responses, thus helping to optimise the delivery of public services.

These highly innovative solutions will have to take into account the regulatory framework of reference for the public administration (PA) and, in any case, avoid excessive dependence on suppliers, which could quickly turn into technological lock-in.

Another risk that should not be underestimated at this time of feverish excitement on the subject is 'overpromising' and the illusion of perfection that comes
1 Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
davide.bruno@regione.toscana.it
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
from controlled demos where everything seems perfect and every problem is solved simply by installing generative artificial intelligence. At the moment, few solutions have been field-tested in real situations, and those that have are not widely known.

It is therefore crucial to choose the paradigm for the development of the target architecture and to identify the generative AI models.

It will be possible to exploit the great potential of generative AI if we take a cautious, forward-looking and responsible approach, favouring open-source solutions or those based on shared open standards, so as to guarantee the PA flexibility and increase the possibility of changing suppliers in the future.

PAs will have to develop in-house skills very quickly by investing in staff training, so that staff are at least aware of the technology and capable of expressing functional requirements. Vigilance is needed not only in the acquisition phase but, as mentioned, also in the pre-operation and operation phases, remembering that these are new technological solutions that will certainly need to mature.

This will not be easy in the very dynamic and rapidly expanding field of large language model (LLM) solutions, techniques and models. However, lasting collaboration with other public bodies, including research bodies, must be encouraged and institutionalised in order to share experiences and best practices.

2. Reference Models and Goals

The 'Core Public Service Vocabulary Application Profile' (CPSV-AP) is a data model designed to harmonise the description of public services on eGovernment portals. It provides a common vocabulary to describe public services, ensuring interoperability and standardisation between different implementations.2

The CPSV-AP aims at structuring public service information, making it user-centred and machine-readable, and facilitating the creation of a public service catalogue that is interoperable and efficient. This vocabulary is used by several EU countries, including Italy, to describe public services and associated life events in a standardised way.

In 2017, the Agency for Digital Italy (AgID) published, in the OntoPiA (Ontologies for Public Administration) controlled set of ontologies and vocabularies3, a dedicated vocabulary for the definition of public services: CPSV-AP_IT (Core Public Service Vocabulary).4

The CPSV-AP_IT is the Italian version of this vocabulary, offering a framework to describe public services in Italy in line with the European Core Public Service Vocabulary.

Designing and implementing a conversational agent natively integrated with CPSV-AP_IT will help improve the accessibility and effectiveness of public services, enabling users to interact in a more intuitive and personalised way with the information and services offered.5 Moreover, the artificial intelligence of the conversational agent could be enhanced by the structure and semantics provided by the CPSV-AP_IT, facilitating a better understanding of user requests and offering more precise and contextualised answers.

The resulting artefact would be reusable at different administrative levels of the central and local public administration, thanks to the work carried out over the years by AgID and to the fact that the Core Public Service Vocabulary Italian Profile (CPSV-AP_IT) has been defined. Moreover, accessibility and usability would be guaranteed by including, as a requirement by design, compliance with the principles of accessibility and usability
2 https://ec.europa.eu/isa2/solutions/core-public-service-
vocabolari-controllati and GitHub wiki: https://github.com/italia/daf-ontologie-vocabolari-controllati/wiki (in Italian)
4 Available at link:
5 https://www.readygoone.it/approfondimenti/10-funzioni-dellassistente-conversazionale/
already present in the Design System Italia (DSI).6
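
As a rough illustration of what 'machine-readable' means for a public service description, the sketch below models a single service as a small Python data structure and serialises it to JSON. The field names (`identifier`, `name`, `competent_authority`, `life_events`, `requirements`, `channels`) and all sample values are simplified assumptions chosen for readability; they are not the official CPSV-AP_IT property names, which should be taken from the OntoPiA documentation.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PublicService:
    """Simplified, hypothetical view of one public service description."""
    identifier: str
    name: str
    description: str
    competent_authority: str
    life_events: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)


def to_json(service: PublicService) -> str:
    """Serialise a service description to machine-readable JSON."""
    return json.dumps(asdict(service), ensure_ascii=False, indent=2)


# Illustrative record; all values are invented for the example.
passport_renewal = PublicService(
    identifier="svc-001",
    name="Passport renewal",
    description="Renewal of an expired or expiring passport.",
    competent_authority="Example authority",
    life_events=["Travelling abroad"],
    requirements=["Valid identity document"],
    channels=["online", "front office"],
)

print(to_json(passport_renewal))
```

A structure of this kind is what a conversational agent could query when a citizen asks which requirements or channels apply to a given service.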
3. Main steps for the design and implementation

The project idea therefore intends to set some cornerstones for the design and subsequent implementation of a conversational agent based on intents and generative artificial intelligence, with the aim of exploiting the semantics of public services to improve the interaction between citizens and the public administration.

The selection of a chatbot development paradigm and platform is crucial. As already mentioned, from a technological point of view the solution will be a mix of intent-based and generative AI.

For the reasons already given, the PA should favour open-source solutions for generative AI, combined with retrieval-augmented generation (RAG) techniques.7 RAG has emerged as a promising solution that incorporates knowledge from external databases.

For knowledge systems, RAG has several advantages over the use of an LLM alone:

• Accuracy: RAG mitigates the risk of 'hallucinations', where LLMs provide plausible but incorrect information. It does this by grounding LLM answers in accurate data retrieved from the organisation's own data sources.
• Transparency: good RAG systems can provide references that allow users to verify where information comes from, adding a level of trust and accountability to the answers.
• Customisation: RAG systems can use data specific to the organisation or sector (e.g. naming conventions), making them adaptable and ensuring that answers are relevant to the specific context.

Other crucial elements in the PA's selection of the open-source artificial intelligence architecture are:

• Agnostic with respect to language models (it must therefore work with customised OpenAI, Cohere and HuggingFace models).
• The solution must possess long-term memory, be able to use external tools (APIs, other models), be able to ingest documents in different formats (at least PDF, TXT and JSON) and be developed with technologies that natively support horizontal and vertical scaling.

Non-functional requirements:

• Accessibility by design: the chatbot should be implemented according to the accessibility-by-design paradigm, also taking into account the needs of user groups with disabilities, as suggested by accessibility best practices1, such as the provision of text alternatives for any images and audio transcripts.
• Multilingualism: implementation of multilingual functionality.

3.1. Acquisition and preparation of data

1. Retrieval of CPSV-AP_IT data: collect the data in CPSV-AP_IT format (e.g. eligibility requirements, tariffs) and identify relations with other services.
2. Pre-processing and organisation: for data that does not conform to CPSV-AP_IT, a mapping phase is still necessary to clean and harmonise the data into a format suitable for the chosen development architecture. It may be necessary to convert the data into a machine-readable format (e.g. JSON, CSV) and to structure and optimise it for efficient queries.
3. Identification of intents and entities: in this phase, the potential questions and intents of the user (e.g. "how do I apply for a passport renewal?") and the entities they might mention (e.g. "passport", "renewal") should be defined. This helps the chatbot understand the user's needs.
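
Steps 2 and 3 of the data preparation phase can be sketched as follows. The example first maps a record from a hypothetical legacy source into a common structure, then matches a user question against a small, hand-written table of intents and entities. The source field names (`titolo`, `requisiti`, `costo`), the target keys and the keyword lists are assumptions made for illustration only; a real system would derive them from the CPSV-AP_IT data and would normally rely on a trained NLP component rather than keyword matching.

```python
import re

# Hypothetical mapping from legacy field names to a common target schema.
FIELD_MAP = {"titolo": "name", "requisiti": "requirements", "costo": "cost"}


def harmonise(record: dict) -> dict:
    """Rename known fields, trim strings and drop unknown or empty values."""
    clean = {}
    for source_key, value in record.items():
        target_key = FIELD_MAP.get(source_key.strip().lower())
        if target_key is None:
            continue  # field not covered by the mapping
        if isinstance(value, str):
            value = value.strip()
        if value not in ("", None):
            clean[target_key] = value
    return clean


# Hand-written intents and entities (illustrative only).
INTENTS = {
    "apply_for_service": ["how do i apply", "apply for", "request"],
    "service_cost": ["how much", "cost", "fee"],
}
ENTITIES = {"document": ["passport", "identity card"], "action": ["renewal", "renew"]}


def _normalise(text: str) -> str:
    """Lower-case the text and replace punctuation with single spaces."""
    return " " + " ".join(re.findall(r"[a-z]+", text.lower())) + " "


def detect(question: str) -> dict:
    """Return the first matching intent (or None) and all mentioned entities."""
    text = _normalise(question)
    intent = next(
        (name for name, cues in INTENTS.items() if any(f" {c} " in text for c in cues)),
        None,
    )
    entities = {
        kind: [v for v in values if f" {v} " in text]
        for kind, values in ENTITIES.items()
    }
    return {"intent": intent, "entities": {k: v for k, v in entities.items() if v}}


print(harmonise({"Titolo": " Passport renewal ", "costo": "", "note": "n/a"}))
print(detect("How do I apply for a passport renewal?"))
```

When `detect` returns `None` as the intent, the question matched none of the defined cues, which is the point at which a chatbot would ask the user to rephrase or hand over to another channel.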
6 https://designers.italia.it/design-system/
7 https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
3.2. Chatbot technological development

1. Design the flow of the conversation: create natural language dialogues that guide the user in the discovery of services. Consider common user questions and create branching paths based on their answers. Prioritise clarity, conciseness and non-bureaucratic language.
2. Integration of data with the CPSV-AP_IT model: link the chatbot, via ingestion pipelines, to the previously processed CPSV-AP_IT data.
3. Implementation of natural language processing (NLP): use NLP techniques (intent recognition, entity extraction) to understand user queries and map them to relevant data points in the CPSV-AP_IT knowledge base, so that the chatbot can retrieve accurate information about services.
4. Error handling and fallback: implement mechanisms to handle user input that does not match any defined intent or entity. Provide helpful hints or suggest rephrasing the question. Consider offering a fallback option, such as connecting the user with a human agent for complex questions and passing on what was typed.

3.3. Test and distribution

1. Extensive testing: plan rigorous testing of the chatbot's functionality with various scenarios and user queries. Ensure that it accurately retrieves and presents information on public services, understands intent and provides clear guidance. Usability tests should also be planned.
2. Monitoring and improvement: monitor the chatbot's performance constantly. Plan to collect feedback from users, especially on interactions that did not go well, and use it to refine the conversation flow, NLP accuracy and the overall user experience.

4. Conclusions and on-going activities

This presentation briefly describes a possible road map for the implementation of a conversational agent based on intents and generative artificial intelligence, with the aim of exploiting public service semantics (CPSV-AP_IT, the Core Public Service Vocabulary) to improve the interaction between citizens and the public administration (PA).

Retrieval-augmented generation (RAG) offers several advantages, in particular for improving the capabilities of artificial intelligence systems. In short, it is an approach that combines large language models (LLMs) with information retrieval (IR) to improve the accuracy and relevance of LLM-generated text.

In a nutshell, the aims of this proposal are:

1. Improving the discovery of online and on-site services of the public administration
2. Providing personalised and relevant answers to users
3. Reducing first-level help desk calls to the various services

An indicative development road map:

• A prototype will be realised and validated by 2024.
• Go-live in production by the first half of 2025.

Possible future developments include automating the delivery of some simple services and integration, at least at the UX level, in the Design System Italia.

We may conclude by saying that the PA should not make the mistake of building an architecture that is bound to a single LLM or to specific solutions, because the configuration of the generative AI solution will differ depending on the specific use case, expected performance and costs.

Just to give an example of the variety and speed with which this sector is evolving, Anthropic alone released three LLMs between 2023 and 2024: Claude 1, Claude 2 and Claude 3. OpenAI appears to be close to launching new versions of GPT, which it claims will represent a further leap forward. The galaxy of generative artificial intelligence is still evolving strongly.
This evolution also has an impact on the open-source world. The strictly binary distinction between open-source and closed LLMs is showing its limits, and a new paradigm1 is establishing itself, under which we can distinguish the following types of models:

• Openly Trained Models (OLMo, Pythia, etc.): models whose training data, training code and weights are available without restrictions on use.
• Permissible Usage Models (Llama**, Mistral, Gemma, etc.): models whose base model weights and inference code are available for easy set-up and distribution.
• Closed LLMs: everything from GPT-4 to a random set of tuned weights without much information.

From the point of view of the public administration, a cautious, far-sighted and responsible approach is desirable. This confirms the main requirement already expressed: the selected framework must be agnostic with respect to the LLM, and in any case the PA should prefer Openly Trained Models (OLMo, Pythia, etc.) or Permissible Usage Models (Llama**, Mistral, Gemma, etc.).
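
The requirement of independence from any single LLM, combined with the RAG approach summarised in this section, can be sketched as follows. The chatbot logic depends only on a small interface, so the model behind it can be swapped without touching the rest of the code, and each answer is returned together with the identifiers of the documents it was grounded in. Everything named here (`LanguageModel`, `EchoModel`, `retrieve`, `answer`, the sample documents and the word-overlap scoring) is a hypothetical illustration; a real deployment would plug in an actual model client and a proper retriever.

```python
import re
from typing import Protocol


class LanguageModel(Protocol):
    """Minimal interface the chatbot relies on; any provider can implement it."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in model for the example: returns the last line of the prompt."""

    def generate(self, prompt: str) -> str:
        return prompt.splitlines()[-1]


# Tiny in-memory knowledge base; contents are invented for the example.
DOCUMENTS = {
    "svc-001": "Passport renewal can be requested online or at the front office.",
    "svc-002": "School canteen fees depend on household income.",
}


def _words(text: str) -> set[str]:
    """Split text into a set of lower-case words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, top_k: int = 1) -> list[tuple[str, str]]:
    """Rank documents by word overlap with the question (naive retrieval)."""
    query = _words(question)
    ranked = sorted(
        DOCUMENTS.items(),
        key=lambda item: len(query & _words(item[1])),
        reverse=True,
    )
    return ranked[:top_k]


def answer(question: str, model: LanguageModel) -> dict:
    """Build a prompt grounded in retrieved text; return the answer and its sources."""
    sources = retrieve(question)
    context = "\n".join(text for _, text in sources)
    prompt = f"Answer using only this context.\nQuestion: {question}\n{context}"
    return {
        "answer": model.generate(prompt),
        "sources": [doc_id for doc_id, _ in sources],
    }


result = answer("How can I request a passport renewal?", EchoModel())
print(result)
```

Replacing `EchoModel` with a class that wraps an open or commercial model only requires implementing `generate`, which is one way of keeping the architecture independent of the supplier.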
References
[1] "Retrieval-Augmented Generation for Natural Language Understanding" by Patrick Lewis et al. (2020): https://arxiv.org/abs/2005.11401
[2] "RAG: A Simple but Effective Approach to Neural Conversational Modeling" by Alexander Rush et al. (2020): https://medium.com/dropout-analytics/what-is-rag-in-generative-ai-f5b8c13575f8
[3] "RAG-BERT: Retrieval-Augmented Generation with BERT" by Honglei Zhuang et al. (2020): https://www.analyticsvidhya.com/blog/2023/10/rags-innovative-approach-to-unifying-retrieval-and-generation-in-nlp/
[4] "Towards Controllable and Consistent Generation with Retrieval-Augmented Generation" by Yilun Wang et al. (2021): https://aclanthology.org/2020.coling-main.207
[5] https://www.marktechpost.com/2024/04/01/evolution-of-rags-naive-rag-advanced-rag-and-modular-rag-architectures/