SAVIA: Artificial Intelligence in support of the lawmaking process

Michele Visciarelli1,*,†, Giovanni Guidi1,†, Laura Morselli1,†, Domitilla Brandoni1, Giuseppe Fiameni2, Luisa Monti3, Stefano Bianchini3 and Cosimo Tommasi3

1 CINECA, via Magnanelli 6/3, Casalecchio di Reno (BO), 40033, Italy
2 NVIDIA AI Technology Center, Milan, Italy
3 Assemblea Legislativa Emilia-Romagna, viale Aldo Moro 50, Bologna, 40127, Italy

Abstract
We explore the use of open-source Large Language Models (LLMs) to support legal professionals, lawmakers, and citizens in accessing information on the current and past legislation of the Emilia-Romagna region. We develop a generative AI tool based on the Retrieval-Augmented Generation (RAG) technique to answer questions related to regional laws and their implementing acts, retrieving relevant information from the Emilia-Romagna law corpus. To adapt pre-trained LLMs to this downstream task, we follow a multi-step approach. First, we use the QLoRA technique to quantize and adapt the pre-trained LLMs to the regional legal text dataset. Next, we fine-tune the domain-adapted models on an ad-hoc instruction-based dataset. We then implement a module to retrieve relevant contextual information from the legal documents dataset. Finally, we align the models with domain-specific instructions using RAG-based prompting. We evaluate the performance of the domain-adapted models using the perplexity metric, while the outputs of the final fine-tuned models are assessed by domain experts, focusing on the quality of the generated text and the relevance of the answers. Our results show that domain adaptation on domain-specific text is a crucial step for enhancing the quality of the generated text in expert domains, such as legal texts, which contain a vast amount of specialized vocabulary and expressions. This approach leads to higher performance compared to models fine-tuned only on small Question-Answer datasets.
Additionally, our findings highlight the importance of the retrieval module, which must be able to reliably find the most relevant documents to provide useful and up-to-date insights to lawmakers and citizens.

Keywords: Generative AI, LLM, Legal AI, NLP

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
† These authors contributed equally.
m.visciarelli@cineca.it (M. Visciarelli); g.guidi@cineca.it (G. Guidi); l.morselli@cineca.it (L. Morselli); d.brandoni@cineca.it (D. Brandoni); gfiameni@nvidia.com (G. Fiameni); luisa.monti@regione.emilia-romagna.it (L. Monti); stefano.bianchini@regione.emilia-romagna.it (S. Bianchini); cosimo.tommasi@regione.emilia-romagna.it (C. Tommasi)
ORCID: 0000-0003-0753-2571 (M. Visciarelli); 0000-XXX (G. Guidi); 0000-0003-0753-2571 (L. Morselli); 0000-0002-8157-1459 (D. Brandoni); 0000-0001-8687-6609 (G. Fiameni)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

In the last few years, interest in Generative Artificial Intelligence (Generative AI) applications has grown considerably in the research and industry communities, thanks to the introduction of Foundation Models in different AI domains, such as text generation (GPT series [1, 2, 3], LLaMA series [4, 5, 6], MEGATRON [7]), image generation (DALL·E [8, 9, 10], Stable Diffusion [11]), and video generation (Sora [12]). Progress in Deep Learning modelling has been fostered by important advancements in neural network (NN) research, such as the introduction of the Transformer architecture in Natural Language Processing (NLP) [13], by improvements in hardware acceleration for linear algebra, which allowed model sizes to grow to several billions of parameters, by the introduction of quantization techniques that make it possible to train large NNs even on consumer GPUs [14, 15], and by the release of large, high-quality, open-source datasets [16].

Large Language Models (LLMs) for text generation have achieved remarkable performance and attracted great interest even outside the research and industry communities, in particular after the release of ChatGPT to the public [17]. Despite their success, the use of LLMs on domain-specific Question-Answering (QA) tasks still faces several challenges that hinder their adoption beyond the research community, especially in tasks for which explainability and high-quality responses are of paramount importance [18]. Some of the challenges that LLMs still face are the following:

• difficulty of maintaining up-to-date knowledge;
• costs of training and inference of large models, and the cost and difficulty of collecting large amounts of high-quality domain-specific data;
• hallucinated answers, i.e. answers that provide false information without warning;
• out-of-date or generic answers, even when the user expects a specific, current response.

Retrieval-Augmented Generation (RAG) has recently emerged as a paradigm to address such challenges [19]. In particular, RAG combines a language model with an information retrieval system to dynamically fetch relevant external information and enhance the model's responses: the user's question is encoded into a dense representation, passages relevant to the question are retrieved from an indexed data source, and this information is added to the LLM prompt. Different studies have shown that RAG enhances the quality of the generation process, leading to higher accuracy, better robustness, reduced hallucinations, higher interpretability, and even the possibility of performing open-domain QA simply by updating the knowledge base [20, 21]. RAG also offers a balanced approach in terms of customization and resource requirements, being more flexible and cost-effective than full fine-tuning, although it still requires labeled data and a supervised training phase.

In this work we present SAVIA, a project developed by CINECA and the Assemblea Legislativa of Emilia-Romagna. The project, started in Autumn 2023 and expected to end in March 2025, has the goal of creating a model capable of answering questions on the Region's laws and their respective implementing acts, as well as on the related "ex-ante" and "ex-post" reports on the laws' impact. In Section 2 we present the data used for this project and the workflow that has been adopted. In Section 3 we describe the procedure and the details of the experiments and tests conducted, and in Section 4 we show the obtained results. Our conclusions are then presented in Section 5.

2. Methodology

To obtain a model capable of understanding the Italian language in the law domain and responding to questions related to laws enacted in the Emilia-Romagna region, we followed a multi-step approach. We started from an open-source LLM and adapted it to the legal language through unsupervised domain adaptation (Section 2.1). The resulting domain-adapted model was then fine-tuned for question-answering (Q&A) on an instruction-based dataset prepared by domain experts for this purpose (Section 2.2). Finally, we implemented a domain-adapted retrieval model (Section 2.3) to enrich the answers with relevant information from the law corpus.

The full workflow was reproduced starting from different open-source LLMs:

• LLaMAntino-2-7b-hf-ITA: a 7B model, based on LLaMA-2, specifically fine-tuned for the Italian language [22].
• Mistral-7B-v0.1: a 7B model that implements grouped-query and sliding-window attention and Rotary Position Embeddings, and can handle contexts of arbitrary size [23].
• Mixtral-8x7B-Instruct-v0.1: a 46.7B mixture-of-experts model, trained on instructions in English, French, Italian, German and Spanish, with a maximum context length of 32k tokens [24].

Domain experts qualitatively evaluated the performance of the final models obtained from the different pre-trained LLMs.

2.1. Unsupervised Domain Adaptation

The first step in the procedure was the domain adaptation of the model on legal text. We collected the PDFs of the regional laws of Emilia-Romagna, as well as the relative implementing acts at the regional level (e.g. "atto del dirigente" and "atto di giunta") and the available reports on the expected and measured impact of a given law (e.g. "clausola valutativa", "ex-ante" and "ex-post" reports). We split the legal documents into chunks, and we implemented a cleaning pipeline to remove typos, bad characters, and irrelevant parts of the documents such as headers and footers. We also added mii-llm/gazzetta-ufficiale [25] to the training dataset, given its affinity to our application in language, semantics and type of documents. We did not perform domain-adaptive tokenization [26], using instead the pre-trained models' native tokenizers to tokenize the legal corpus.

Not all three models under investigation underwent domain adaptation. LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1 were adapted, while Mixtral-8x7B-Instruct-v0.1, after tests regarding its native capability of producing adequate Italian legal text, was not domain adapted.

2.2. Model Alignment on an Instruction-Based Dataset

With the support of domain experts, we generated a Q&A dataset mimicking different levels of domain-language proficiency, ranging from questions that could be written by non-expert users to those that may be asked by experts in the legal domain. We developed a semi-automatic procedure to further enrich this Q&A dataset using legal document metadata. The following is an example included in the instruction-based dataset:

• Q: "Da quando è stata istituita la regione, quali normative sono state adottate per incentivare la partecipazione?"
• A: "La prima legge regionale riguardante la partecipazione ad essere stata approvata è la legge numero 3 del 2010. In seguito, la legge numero 3 del 2010 è stata abolita e sostituita con la legge regionale numero 15 del 2018."

To fine-tune the domain-adapted LLMs, we used the instruction-based dataset prepared by the domain experts. For the loss function computation, we removed the portion of the text containing the prompt, as in many cases the prompt added by the RAG module can account for up to 50% of the total text length. This approach helped to optimize the training process more effectively.

2.3. Domain-Adapted Retrieval Model

To enrich the user's question with relevant information from the legal documents database, we developed a retrieval module based on a semantic-similarity search technique. We used a Sentence-BERT model [27] to populate a vector store with embeddings generated from the legal documents' text chunks. The content most similar to a user's question is retrieved using the semantic search library FAISS [28, 29].

3. Experiment

The project has been carried out exploiting the computational resources of the supercomputer LEONARDO, hosted by CINECA. Each node in the booster partition is equipped with four NVIDIA A100 SXM 64GB GPUs and a single 32-core Intel Ice Lake CPU.

For all models, only data parallelism has been employed, since all these models fit adequately in the VRAM of the GPUs at our disposal. For the same reason, LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1 were not quantized during domain adaptation and instruction fine-tuning, preserving the weights' precision. Mixtral-8x7B-Instruct-v0.1 instead underwent 4-bit quantization [30], due to its size. For domain adaptation and instruction fine-tuning, we applied LoRA adapters on the Q, K, V layers of the models [15]. The training procedure for the models under study was the following:

• causal language modelling of the pre-trained LLMs on the legal text chunks. This was performed on LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1; each model needed, on average, 400 GPU hours (approximately 4 days on a single LEONARDO booster node) of training for four epochs;
• model alignment of the domain-adapted LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1, and of the base pre-trained Mixtral-8x7B-Instruct-v0.1, on the QA dataset. This step required approximately 96 GPU hours, or 24 node hours (4 GPUs per node), to complete a 12-epoch training run on a single LEONARDO node.

4. Results

To evaluate their quality and select the candidates for instruction-based fine-tuning, all domain-adapted models were evaluated using the perplexity metric (PPL) on a held-out evaluation dataset based on laws. The metric is reported in Table 1.

Table 1
Perplexity for base and domain-adapted models under study.

Model                                      Perplexity
LLaMAntino-2-7b-hf-ITA                     5.2447
LLaMAntino-2-7b-hf-ITA domain-adapted      1.1789
Mistral-7B-v0.1                            5.2254
Mistral-7B-v0.1 domain-adapted             2.3618

Three domain experts (lawmakers of the Assemblea Legislativa) were asked to evaluate the answers generated by the final instruction fine-tuned models on a set of 25 questions. The qualitative analysis of the experts reported that, in general, the answers provided by the LLaMAntino-based model were considered too short and dry, while the answers provided by the Mixtral-based model were considered the most complete, clear and satisfactory in terms of the quality of the specific vocabulary used. Below we report an example of the answers provided by the different final models to a given question. For context, we also include the answer of ChatGPT (3.5) to the same question.

• Question: Sul tema della partecipazione, quali leggi sono state fatte in Emilia-Romagna?
• Answer of Mixtral-8x7B-Instruct-v0.1 fine-tuned: La prima legge regionale approvata in tema di partecipazione è la legge regionale 9 febbraio 2010, n. 3, intitolata "NORME PER LA DEFINIZIONE, RIORDINO E PROMOZIONE DELLE PROCEDURE DI CONSULTAZIONE E PARTECIPAZIONE ALLA ELABORAZIONE DELLE POLITICHE REGIONALI E LOCALI". Questa legge è stata successivamente abrogata e sostituita con la legge regionale 22 ottobre 2018, n. 15, denominata "LEGGE SULLA PARTECIPAZIONE ALL'ELABORAZIONE DELLE POLITICHE PUBBLICHE. ABROGAZIONE DELLA LEGGE REGIONALE 9 FEBBRAIO 2010, N. 3". La legge regionale attualmente in vigore in materia di partecipazione è la legge regionale 22 ottobre 2018, n. 15.
• Answer of LLaMAntino-2-7b-hf-ITA domain-adapted + instruction fine-tuned: - Legge Regionale 9 febbraio 2010, n. 3 - Legge Regionale 22 ottobre 2018, n. 15
• Answer of Mistral-7B-v0.1 domain-adapted + instruction fine-tuned: In relazione alla partecipazione, le leggi regionali sono state le seguenti: - Legge Regionale 3 del 2010 - Legge Regionale 15 del 2018
• Answer of ChatGPT-3.5: Non ho accesso a informazioni in tempo reale o a dati specifici riguardanti leggi promulgate dopo gennaio 2022. Tuttavia, posso fornirti un'informazione generale sul tema della partecipazione politica in Italia. In Italia, molte regioni hanno promulgato leggi e normative per favorire la partecipazione dei cittadini alla vita politica e amministrativa. Queste leggi possono riguardare vari aspetti, come la consultazione pubblica su decisioni amministrative, la promozione della trasparenza nelle istituzioni, o l'istituzione di forme di democrazia partecipativa a livello locale. Per quanto riguarda l'Emilia-Romagna, una delle regioni più attive nel promuovere la partecipazione cittadina, potrebbero essere state adottate diverse leggi e iniziative in questo ambito. Alcune di queste potrebbero riguardare la partecipazione ai processi decisionali locali, la promozione della trasparenza e dell'accesso alle informazioni pubbliche, o la creazione di strumenti e piattaforme per coinvolgere attivamente i cittadini nelle decisioni che li riguardano. Per ottenere informazioni specifiche sulle leggi relative alla partecipazione in Emilia-Romagna dopo il 2022, ti consiglio di consultare le fonti ufficiali della Regione Emilia-Romagna, come il sito web istituzionale o i comunicati stampa delle autorità regionali. In alternativa, potresti contattare direttamente gli uffici regionali competenti per ottenere informazioni aggiornate sulle leggi e le iniziative in materia di partecipazione politica e amministrativa.

5. Conclusions

We explored different approaches to adapt open-source LLMs for question-answering on the Emilia-Romagna law corpus. We adapted the different LLMs on a corpus composed of the Emilia-Romagna regional laws and the relative implementing acts, and we further refined the domain-adapted models on a custom QA dataset provided by domain experts. Finally, we exploited RAG to enrich the user's question with relevant contextual information extracted from the law database.

We experimented with different open-source LLMs, namely Mistral-7B-v0.1, LLaMAntino-2-7b-hf-ITA and Mixtral-8x7B-Instruct-v0.1. Our results show that domain-adapted LLMs able to answer specific domain questions can be a helpful tool to support decision-making in specialized fields such as the legal domain, which often need to retrieve exact, concise and easy-to-understand information from large and unstructured data sources.

Given the scope and length of the project, several improvements to the workflow are foreseen in the near future, as well as the possibility of testing more pre-trained open-source models, for example the new Italian-native models that will be developed in the near future, and the domain adaptation of Mixture-of-Experts models (such as Mixtral-8x7B-v0.1). Our future work will also focus on further improving the retrieval module with better embedding models, and on applying more powerful techniques to train the LLMs, such as Direct Preference Optimization (DPO, [31]) and Reinforcement Learning from Human Feedback (RLHF, [32]).

Acknowledgments

We are extremely grateful to the President of the Assemblea Legislativa Emilia-Romagna, Emma Petitti, for the far-sighted vision that created the conditions to launch the project, and to the Director General of the Assemblea Legislativa Emilia-Romagna, Leonardo Draghetti, for strategically setting up the project and ensuring the necessary human and material resources.

Besides, this endeavour would not have been possible without the commitment of the President of CINECA, Francesco Ubertini, and of the Director of the Supercomputing Applications and Innovation department of CINECA, Sanzio Bassini.

Special thanks go to Giovanna Favero of the Assemblea Legislativa Emilia-Romagna for her efforts in making available laws, implementing acts, as well as the related "ex-ante" and "ex-post" reports.

References

[1] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, et al., Language Models are Few-Shot Learners, arXiv:2005.14165 (2020).
[2] M. Chen, J. Tworek, H. Jun, Q. Yuan, et al., Evaluating Large Language Models Trained on Code, arXiv:2107.03374 (2021).
[3] OpenAI, J. Achiam, S. Adler, S. Agarwal, et al., GPT-4 Technical Report, arXiv:2303.08774 (2023).
[4] H. Touvron, T. Lavril, G. Izacard, X. Martinet, et al., LLaMA: Open and Efficient Foundation Language Models, arXiv:2302.13971 (2023).
[5] H. Touvron, L. Martin, K. Stone, P. Albert, et al., Llama 2: Open Foundation and Fine-Tuned Chat Models, arXiv:2307.09288 (2023).
[6] B. Rozière, J. Gehring, F. Gloeckle, S. Sootla, et al., Code Llama: Open Foundation Models for Code, arXiv:2308.12950 (2023).
[7] D. Narayanan, M. Shoeybi, J. Casper, P. LeGresley, et al., Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM, arXiv:2104.04473 (2021).
[8] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, I. Sutskever, Zero-Shot Text-to-Image Generation, arXiv:2102.12092 (2021).
[9] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, M. Chen, Hierarchical Text-Conditional Image Generation with CLIP Latents, arXiv:2204.06125 (2022).
[10] Z. Shi, X. Zhou, X. Qiu, X. Zhu, Improving Image Captioning with Better Use of Captions, arXiv:2006.11807 (2020).
[11] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer, High-Resolution Image Synthesis with Latent Diffusion Models, arXiv:2112.10752 (2021).
[12] OpenAI, Video generation models as world simulators, Tech. rep., OpenAI (2024).
[13] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention Is All You Need, arXiv:1706.03762 (2017).
[14] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-Rank Adaptation of Large Language Models, arXiv:2106.09685 (2021).
[15] T. Dettmers, A. Pagnoni, A. Holtzman, L. Zettlemoyer, QLoRA: Efficient Finetuning of Quantized LLMs, arXiv:2305.14314 (2023).
[16] Hugging Face, Hugging Face Datasets (2016). URL: https://huggingface.co/datasets
[17] OpenAI, Introducing ChatGPT, Tech. rep., OpenAI (2022).
[18] L. Huang, W. Yu, W. Ma, W. Zhong, et al., A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions, arXiv:2311.05232 (2023).
[19] P. Lewis, E. Perez, A. Piktus, F. Petroni, et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, arXiv:2005.11401 (2020).
[20] S. Siriwardhana, R. Weerasekera, E. Wen, T. Kaluarachchi, et al., Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering, arXiv:2210.02627 (2022).
[21] P. Zhao, H. Zhang, Q. Yu, Z. Wang, et al., Retrieval-Augmented Generation for AI-Generated Content: A Survey, arXiv:2402.19473 (2024).
[22] P. Basile, E. Musacchio, M. Polignano, L. Siciliani, G. Fiameni, G. Semeraro, LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language, arXiv:2312.09993 (2023).
[23] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, et al., Mistral 7B, arXiv:2310.06825 (2023).
[24] Mistral AI team, Mixtral of experts, Tech. rep., Mistral AI (2023).
[25] E. Federici, M. Ferraretto, N. Landro, Gazzetta Ufficiale: A dataset of legislative texts, public and private acts (2024). URL: https://huggingface.co/datasets/mii-llm/gazzetta-ufficiale
[26] M. Liu, T.-D. Ene, R. Kirby, C. Cheng, et al., ChipNeMo: Domain-Adapted LLMs for Chip Design, arXiv:2311.00176 (2023).
[27] N. Reimers, I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, arXiv:1908.10084 (2019).
[28] J. Johnson, M. Douze, H. Jégou, Billion-scale similarity search with GPUs, IEEE Transactions on Big Data 7 (3) (2019) 535–547.
[29] M. Douze, A. Guzhva, C. Deng, J. Johnson, G. Szilvasy, P.-E. Mazaré, M. Lomeli, L. Hosseini, H. Jégou, The Faiss library (2024). arXiv:2401.08281.
[30] Hugging Face, bitsandbytes, Tech. rep., Hugging Face (2023).
[31] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, C. Finn, Direct Preference Optimization: Your Language Model is Secretly a Reward Model, arXiv:2305.18290 (2023).
[32] L. Ouyang, J. Wu, X. Jiang, D. Almeida, et al., Training language models to follow instructions with human feedback, arXiv:2203.02155 (2022).