Road map for the implementation of a conversational agent chatbot consistent with the guidelines of the Design System Italy (DSI)

Davide Bruno

Regione Toscana, via di Novoli, 27, Firenze, 50127, Italy


Abstract
This project idea sets out some cornerstones for the design and subsequent implementation of a conversational agent based on intents and generative artificial intelligence, with the aim of exploiting public service semantics (CPSV-AP_IT, Core Public Service Vocabulary) to improve the interaction between citizens and the public sector (PS).

Keywords
generative artificial intelligence, public sector (PS), citizen, CPSV-AP_IT, CEUR-WS 1


1 Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
davide.bruno@regione.toscana.it
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

Generative artificial intelligence will certainly have a significant impact on the public sector (PS). This technology offers opportunities to improve the efficiency, transparency and quality of public services. By implementing systems based on generative AI, the PS can automate repetitive processes, optimise data management and improve interaction with citizens, with the aim of improving productivity, accessibility and efficiency in service delivery. Such systems leverage technologies such as machine learning and natural language processing (NLP).

These tools, known as 'conversational applications' or more commonly 'chatbots', exploit the ability of machines to understand natural language to handle requests and interactions with citizens in a timely and contextual manner. These applications will make it possible to improve the relationship with citizens by providing personalised and rapid responses, thus helping to optimise the delivery of public services.

These highly innovative solutions will have to take into account the regulatory framework of reference for the public administration (PA) and, in any case, avoid excessive dependence on suppliers, which could quickly turn into technological lock-in.

Another risk that should not be underestimated at this time of feverish excitement on the subject is 'overpromising' and the illusion of perfection that comes
from controlled demos where everything seems perfect and everything is solved simply by installing generative artificial intelligence. At the moment, few solutions have been field-tested in real situations, and those that have are not widely known.

It is therefore crucial to choose the paradigm for the development of the target architecture and to identify the generative AI models.

It will be possible to exploit the great potential of generative AI if we take a cautious, forward-looking and responsible approach, favouring open-source solutions or those based on shared open standards, so as to guarantee the PA flexibility and increase the possibility of changing suppliers in the future.

PAs will have to develop in-house skills very quickly by investing in staff training, so that staff are at least aware of the technology and capable of expressing functional requirements. Vigilance is needed not only in the acquisition phase but, as mentioned, also in the pre-operation and operation phases, remembering that these are new technological solutions that will certainly need to mature.

This will not be easy in the very dynamic and exponentially growing field of LLM solutions, techniques and models. Lasting collaboration with other public bodies, including research bodies, must therefore be encouraged and institutionalised in order to share experiences and best practices.

2. Reference Models and Goals

The 'Core Public Service Vocabulary Application Profile' (CPSV-AP) is a data model designed to harmonise the description of public services on eGovernment portals. It provides a common vocabulary to describe public services, ensuring interoperability and standardisation between different implementations.2

The CPSV-AP aims at structuring public service information, making it user-centred and machine-readable, and facilitating the creation of a public service catalogue that is interoperable and efficient. This vocabulary is used by several EU countries, including Italy, to describe public services and associated life events in a standardised way.

In 2017, the Agency for Digital Italy (AgID) published, in the OntoPiA controlled set of ontologies and vocabularies3 (Ontologies for Public Administration), a dedicated vocabulary for the definition of public services: CPSV-AP_IT, Core Public Service Vocabulary.4 The CPSV-AP_IT is the Italian version of this vocabulary, offering a framework to describe public services in Italy in line with the European vocabulary of core public services.

Designing and implementing a conversational agent natively integrated with CPSV-AP_IT will help improve the accessibility and effectiveness of public services, enabling users to interact in a more intuitive and personalised way with the information and services offered5. Moreover, the artificial intelligence of the conversational agent could be enhanced by the structure and semantics provided by the CPSV-AP_IT, facilitating a better understanding of user requests and offering more precise and contextualised answers.

The resulting artefact would be reusable at different administrative levels of the central and local Public Administration, thanks to the work carried out over the years by AgID and by virtue of the fact that the Core Vocabulary of Public Services-Italian Profile (CPSV-AP_IT) has been defined. Moreover, accessibility and usability would be guaranteed by including, as a requirement by design, compliance with the principles of accessibility and usability already present in the Design System Italia6 (DSI).

2 https://ec.europa.eu/isa2/solutions/core-public-service-vocabulary-application-profile-cpsv-ap_en/
3 OntoPiA on GitHub: https://github.com/italia/daf-ontologie-vocabolari-controllati and GitHub wiki: https://github.com/italia/daf-ontologie-vocabolari-controllati/wiki (in Italian)
4 Available at: https://github.com/italia/daf-ontologie-vocabolari-controllati/blob/master/Ontologie/CPSV/v1.1/CPSV-AP_IT.rdf
5 https://www.readygoone.it/approfondimenti/10-funzioni-dellassistente-conversazionale/
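
As an illustration of what "machine-readable" public service information could look like in practice, the following minimal sketch holds one service description in memory and serialises it to JSON. The class name, the field names and the example data are simplified, hypothetical stand-ins chosen for readability; they are not the actual CPSV-AP_IT classes or properties, which are defined in the RDF vocabulary.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PublicService:
    """Simplified, hypothetical record; NOT the real CPSV-AP_IT schema."""
    identifier: str
    title: str
    description: str
    competent_authority: str
    life_events: list[str] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)
    cost: str = ""


def to_json(service: PublicService) -> str:
    """Serialise the record so that other components can consume it."""
    return json.dumps(asdict(service), ensure_ascii=False, indent=2)


# Invented example data, for illustration only.
example = PublicService(
    identifier="svc-001",
    title="Passport renewal",
    description="Renewal of an expired or expiring passport.",
    competent_authority="Example authority",
    life_events=["Travelling abroad"],
    channels=["online", "front office"],
    requirements=["Valid identity document"],
    cost="See the official tariff table",
)

print(to_json(example))
```

A conversational agent could consume records of this kind regardless of which administration published them, which is the reuse argument made above.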
3. Main steps for the design and implementation

The project idea therefore sets out some cornerstones for the design and subsequent implementation of a conversational agent based on intents and generative artificial intelligence, with the aim of exploiting the semantics of public services to improve the interaction between citizens and the public administration.

The selection of a chatbot development paradigm or platform is crucial. As already mentioned, from a technological point of view the solution will be a mix of intent-based and generative AI. For the reasons already given, the PA should favour open-source solutions for generative AI, combined with retrieval-augmented generation (RAG) techniques7. RAG has emerged as a promising solution that incorporates knowledge from external databases.

For knowledge systems, RAG has several advantages over the use of an LLM alone:

    •    Accuracy: RAG mitigates the risk of 'hallucinations', where LLMs provide plausible but incorrect information. It does this by grounding LLM answers in accurate data retrieved from the organisation's own data sources.
    •    Transparency: good RAG systems can provide references that allow users to verify where information comes from, adding a level of trust and accountability to the answers.
    •    Customisation: RAG systems can use data specific to the organisation or sector (e.g. naming conventions), making them adaptable and ensuring that answers are relevant to the specific context.

Other crucial elements in the PA's selection of the open-source artificial intelligence architecture are:

    •    Model agnosticism: the architecture must be agnostic with respect to language models (it must therefore work with OpenAI, Cohere and HuggingFace models, including customised ones).
    •    Capabilities: the solution must possess long-term memory, be able to use external tools (APIs, other models), be able to ingest documents in different formats (at least PDF, TXT and JSON) and be developed with technologies that natively support horizontal and vertical scaling.

Non-functional requirements:

    •    Accessibility by design: the chatbot should be implemented according to the accessibility-by-design paradigm, taking into account the needs of user groups with disabilities as suggested by accessibility best practices1, such as the provision of text alternatives for any images and transcripts for audio.
    •    Multilingualism: implementation of multilingual functionality.

3.1. Acquisition and preparation of data

  1.   Retrieval of CPSV-AP_IT data: collect the data in CPSV-AP_IT format (e.g. eligibility requirements, tariffs) and identify relations with other services.
  2.   Pre-processing and organisation: for data that do not conform to CPSV-AP_IT, a mapping phase is still necessary to clean and harmonise the data into a format suitable for the chosen development architecture. It may be necessary to convert them into a computer-readable format (e.g. JSON, CSV) and to structure and optimise them for efficient queries.
  3.   Identification of intentions and entities: in this phase, the potential questions and intentions of the user (e.g. "how do I apply for a passport renewal?") and the entities they might mention (e.g. "passport", "renewal") should be defined. This helps the chatbot understand the user's needs.
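
The definition of intentions and entities in step 3 can be sketched, under strong simplifying assumptions, as a keyword lookup. The intent names, entity names and keyword sets below are hypothetical examples rather than part of CPSV-AP_IT, and a real system would more plausibly rely on a trained NLP component than on keyword overlap.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical intents and entities; a real project would derive them
# from the services described in the CPSV-AP_IT data.
INTENT_KEYWORDS = {
    "apply_for_service": {"apply", "request", "renew", "renewal"},
    "ask_cost": {"cost", "fee", "price", "tariff"},
    "ask_requirements": {"requirements", "eligible", "eligibility", "documents"},
}
ENTITY_KEYWORDS = {
    "passport": {"passport"},
    "identity_card": {"identity", "id"},
}


@dataclass
class Parsed:
    intent: Optional[str]   # None when no intent keyword matches
    entities: list[str]


def tokenize(text: str) -> set[str]:
    """Lower-case the text and keep alphabetic words only."""
    return set(re.findall(r"[a-z]+", text.lower()))


def parse(text: str) -> Parsed:
    """Pick the intent with the most keyword matches and list known entities."""
    tokens = tokenize(text)
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    entities = [name for name, kw in ENTITY_KEYWORDS.items() if tokens & kw]
    return Parsed(best_intent, entities)


result = parse("How do I apply for a passport renewal?")
print(result.intent, result.entities)
```

An input that matches no keyword yields `intent=None`, which gives the dialogue layer an explicit signal that the request was not understood.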


6 https://designers.italia.it/design-system/
7 https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
3.2. Chatbot technological development

  1.   Design the flow of the conversation: create natural-language dialogues that guide the user in the discovery of services. Consider common user questions and create branching paths based on the answers. Prioritise clarity, conciseness and non-bureaucratic language.
  2.   Integration of data with the CPSV-AP_IT model: link the chatbot, via ingestion pipelines, to the previously processed CPSV-AP_IT data.
  3.   Implementation of natural language processing (NLP): use NLP techniques (intent recognition, entity extraction) to understand user queries and map them to relevant data points in the CPSV-AP_IT knowledge base, so that the chatbot can retrieve accurate information about services.
  4.   Error handling and fallback: implement mechanisms to handle user input that does not match any defined intent or entity. Provide helpful hints or suggest rephrasing the question. Consider offering a fallback option, such as connecting the user with a human agent for complex questions and passing on what was typed.

3.3. Test and distribution

  1.   Extensive testing: rigorous testing of the chatbot's functionality with various scenarios and user queries should be planned. Ensure that it accurately retrieves and presents information on public services, understands intent and provides clear guidance. Usability tests should also be planned.
  2.   Monitoring and improvement: constant monitoring of the chatbot's performance. Feedback should be collected from users, especially on interactions that did not go well, and used to refine the conversation flow, NLP accuracy and the overall user experience.

4. Conclusions and on-going activities

This presentation briefly describes a possible road map for the implementation of a conversational agent based on intents and generative artificial intelligence (AIG), with the aim of exploiting public service semantics (CPSV-AP_IT, Core Public Service Vocabulary) to improve the interaction between citizens and the public administration (PA).

Retrieval-augmented generation (RAG) offers several advantages, in particular for improving the capabilities of artificial intelligence systems. In short, it is an approach that combines large language models (LLMs) with information retrieval (IR) to improve the accuracy and relevance of LLM-generated text.

In a nutshell, the aims of this proposal are:

    1.   Improving the discovery of online and on-site services of the public administration
    2.   Providing personalised and relevant answers to users
    3.   Reducing first-level help desk calls to the various services

An indicative development road map:

    •    A prototype will be realised and validated by 2024.
    •    Go-live in production by the first half of 2025.

Possible future developments include automating the delivery of some simple services and integration, at least at the UX level, with the Design System Italy.

We may conclude by saying that the PA should not make the mistake of building an architecture that is bound to a single LLM or to specific solutions, because the configuration of the generative AI solution will differ depending on the specific use case, the expected performance and the costs.

Just to give an example of the variety and speed with which this sector is evolving, Anthropic alone released three LLMs between 2023 and 2024: #Claude1, #Claude2 and #Claude3.

OpenAI appears to be close to launching new versions of #GPT, which it claims will represent a further leap forward. The galaxy of generative artificial intelligence is still evolving strongly.
This also has an impact on the open-source world: the binary distinction between open-source and closed LLMs is showing its limits, and a new paradigm1 is establishing itself, under which we can distinguish the following types of models:

    •    Openly Trained Models (OLMo, Pythia, etc.): models whose training data, training code and weights are available without restrictions on use.
    •    Permissible Usage Models (Llama**, Mistral, Gemma, etc.): models whose base model weights and inference code are available for easy set-up and distribution.
    •    Closed LLMs: everything from GPT4 to a random set of tuned weights released without much information.

From the point of view of the Public Administration, a cautious, far-sighted and responsible approach is desirable. This confirms the main requirement already expressed: the selected framework must be agnostic with respect to the LLM and, in any case, the PA should prefer Openly Trained Models (OLMo, Pythia, etc.) or Permissible Usage Models (Llama**, Mistral, Gemma, etc.).
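
The requirement of independence from any single LLM, combined with the RAG approach, can be illustrated with a minimal sketch. All names here (`LanguageModel`, `Passage`, `EchoModel`, `retrieve`, `answer`) are hypothetical and belong to no specific framework; the retrieval step is a naive keyword overlap standing in for a real search component, and `EchoModel` is a stub used only so the example runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Protocol


class LanguageModel(Protocol):
    """Any backend that turns a prompt into text can be plugged in."""
    def generate(self, prompt: str) -> str: ...


@dataclass
class Passage:
    source: str  # identifier shown to the user as a reference
    text: str


class EchoModel:
    """Stand-in backend for testing; a real one would wrap an LLM client."""
    def generate(self, prompt: str) -> str:
        return "[stub answer based on] " + prompt.splitlines()[-1]


def words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, passages: list[Passage], top_k: int = 2) -> list[Passage]:
    """Naive keyword-overlap retrieval (assumption: real systems use a search index)."""
    q = words(question)
    scored = [(len(q & words(p.text)), p) for p in passages]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [p for score, p in ranked[:top_k] if score > 0]


def answer(question: str, passages: list[Passage],
           model: LanguageModel) -> tuple[str, list[str]]:
    """Build a grounded prompt and return the answer with its sources."""
    hits = retrieve(question, passages)
    context = "\n".join(f"- {p.text}" for p in hits)
    prompt = f"Answer using only this context:\n{context}\nQuestion: {question}"
    return model.generate(prompt), [p.source for p in hits]


# Invented passages, for illustration only.
docs = [
    Passage("svc-001", "Passport renewal requires a valid identity document"),
    Passage("svc-002", "Waste collection calendar for residents"),
]
text, sources = answer("How do I get a passport renewal?", docs, EchoModel())
print(text)
print(sources)
```

Because `answer` depends only on the `generate` method, replacing `EchoModel` with a wrapper around an openly trained or permissible-usage model would not require changes to the retrieval or prompt-building code.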


References
[1]   "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" by Patrick Lewis et al. (2020): https://arxiv.org/abs/2005.11401
[2]   "RAG: A Simple but Effective Approach to Neural Conversational Modeling" by Alexander Rush et al. (2020): https://medium.com/dropout-analytics/what-is-rag-in-generative-ai-f5b8c13575f8
[3]   "RAG-BERT: Retrieval-Augmented Generation with BERT" by Honglei Zhuang et al. (2020): https://www.analyticsvidhya.com/blog/2023/10/rags-innovative-approach-to-unifying-retrieval-and-generation-in-nlp/
[4]   "Towards Controllable and Consistent Generation with Retrieval-Augmented Generation" by Yilun Wang et al. (2021): https://aclanthology.org/2020.coling-main.207
[5]   https://www.marktechpost.com/2024/04/01/evolution-of-rags-naive-rag-advanced-rag-and-modular-rag-architectures/