<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SAVIA: Artificial Intelligence in support of the lawmaking process</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michele Visciarelli</string-name>
          <email>m.visciarelli@cineca.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Guidi</string-name>
          <email>g.guidi@cineca.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Morselli</string-name>
          <email>l.morselli@cineca.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domitilla Brandoni</string-name>
          <email>d.brandoni@cineca.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Fiameni</string-name>
          <email>gfiameni@nvidia.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luisa Monti</string-name>
          <email>luisa.monti@regione.emilia-romagna.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Bianchini</string-name>
          <email>stefano.bianchini@regione.emilia-romagna.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cosimo Tommasi</string-name>
          <email>cosimo.tommasi@regione.emilia-romagna.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Assemblea Legislativa Emilia Romagna</institution>
          ,
<addr-line>viale Aldo Moro 50, Bologna, 40127</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CINECA</institution>
          ,
<addr-line>via Magnanelli 6/3, Casalecchio di Reno (BO), 40033</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>NVIDIA AI Technology Center</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>We explore the use of open-source Large Language Models (LLMs) to support legal professionals, lawmakers, and citizens in accessing information on the current and past legislation of the Emilia-Romagna region. We develop a generative AI tool based on the Retrieval-Augmented Generation (RAG) technique to answer questions related to regional laws and their implementing acts, retrieving relevant information from the Emilia-Romagna law corpus. To adapt pre-trained LLMs to this downstream task, we follow a multi-step approach. First, we use the QLoRA technique to quantize and adapt the pre-trained LLMs to the regional legal text dataset. Next, we fine-tune the domain-adapted models using an "ad-hoc" instruction-based dataset. We then implement a module to retrieve relevant contextual information from the legal documents dataset. Finally, we align the models with domain-specific instructions using RAG-based prompting. We evaluate the performance of the domain-adapted models using the perplexity metric, and the results of the final fine-tuned models are assessed by domain experts, focusing on the quality of the generated text and the relevance of the answers. Our results show that domain adaptation on domain-specific text is a crucial step for enhancing the quality of the generated text in expert domains, such as legal texts, which contain a vast amount of specialized vocabulary and expressions. This approach leads to higher performance compared to models fine-tuned only on small Question-Answer datasets. Additionally, our findings highlight the importance of the retrieval module, which must be able to reliably find the most relevant documents to provide useful and up-to-date insights to lawmakers and citizens.</p>
      </abstract>
      <kwd-group>
<kwd>Generative AI</kwd>
        <kwd>LLM</kwd>
        <kwd>Legal AI</kwd>
        <kwd>NLP</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>• dificulty of maintaining up-to-date knowledge
• costs of training and inference of large models,
costs and dificulty to collect large amount of
highquality domain-specific data
• hallucinated answers, i.e. answers that provided
false information without warning
• out-of-date or generic answers, even when the
user expects a specific, current response
• Mistral-7B-v0.1: a 7B model, that implements
grouped-query and sliding window attention,
Rotary Position Embedding, that can handle context
of arbitrary size [23].
• Mixtral-8x7B-Instruct-v0.1: a 46.7B mixture
of experts model, trained on instructions in
English, French, Italian, German and Spanish, with
maximum context length of 32k [24].</p>
      <sec id="sec-1-1">
        <title>Retrieval-Augmented Generation or RAG has recently</title>
        <p>emerged as a paradigm to address such challenges [19].</p>
        <p>In particular, RAG combines a language model with an
information retrieval system to dynamically fetch
relevant external information to enhance the model’s
responses, by encoding the user’s question into a dense
representation, and retrieving passages relevant to the Domain experts qualitatively evaluated the
perforquestion from an indexed data source, adding this infor- mances of the final models obtained from the diferent
mation to the LLM prompt. Diferent studies have shown pre-trained LLMs.
that RAG enhances the quality of the generation process,
leading to higher accuracy, better robustness, reduced 2.1. Unsupervised Domain-Adaptation
hallucinations, higher interpretability, and even the
possibility to perform open-domain QA just by updating The first step in the procedure was the domain-adaptation
the knowledge-base [20, 21]. RAG also ofers a balanced of the model on legal text. We collected the PDFs of the
approach in terms of customization and resource require- regional laws of Emilia-Romagna, as well as the relative
ments, being more flexible and cost-efective than full implementing acts at the regional level (e.g. "atto del
ifne-tuning, although still requiring labeled data and a dirigente" and "atto di giunta") and the available reports
supervised training phase. on the expected and measured impact of a given law (e.g.</p>
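      <p>As an illustration of this mechanism, the following minimal sketch shows how retrieved passages can be prepended to the user's question before generation. The retrieve and generate helpers are hypothetical placeholders, not the SAVIA implementation.</p>
      <preformat>
# Minimal RAG sketch: retrieve passages, then add them to the LLM prompt.
# The retrieve() and generate() helpers are hypothetical placeholders.

def build_rag_prompt(question, passages):
    """Assemble an augmented prompt from retrieved context passages."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Rispondi alla domanda usando solo il contesto seguente.\n\n"
        f"Contesto:\n{context}\n\nDomanda: {question}\nRisposta:"
    )

# passages = retrieve(question, k=4)  # dense retrieval (Section 2.3)
# answer = generate(build_rag_prompt(question, passages))  # LLM call
      </preformat>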
      <p>In this work we present SAVIA, a project developed by CINECA and the Assemblea Legislativa of Emilia-Romagna. The project, started in Autumn 2023 and expected to end in March 2025, has the goal of creating a model capable of answering questions on the Region's laws and their respective implementing acts, as well as on the related "ex-ante" and "ex-post" reports on the laws' impact. In Section 2 we present the data used for this project and the workflow that has been adopted. In Section 3 we describe the procedure and the details of the experiments and tests conducted, and in Section 4 we show the obtained results. Our conclusions are then presented in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>To obtain a model capable of understanding Italian language in the law domain and responding to questions related to laws enacted in the Emilia-Romagna region, we followed a multi-step approach. We started from an open-source LLM and adapted it to the legal language through unsupervised domain adaptation (Section 2.1). The resulting domain-adapted model was then fine-tuned for question-answering (Q&amp;A) on an instruction-based dataset prepared by domain experts for this purpose (Section 2.2). Finally, we implemented a domain-adapted retrieval model (Section 2.3) to enrich the answers with relevant information from the law corpus.</p>
      <p>The full workflow was reproduced starting from different open-source LLMs:
• LLaMAntino-2-7b-hf-ITA: a 7B model, based on LLaMA-2, specifically fine-tuned for the Italian language [22];
• Mistral-7B-v0.1: a 7B model that implements grouped-query attention and sliding-window attention with Rotary Position Embedding, and can handle contexts of arbitrary size [23];
• Mixtral-8x7B-Instruct-v0.1: a 46.7B mixture-of-experts model, trained on instructions in English, French, Italian, German and Spanish, with a maximum context length of 32k tokens [24].</p>
      <p>Domain experts qualitatively evaluated the performances of the final models obtained from the different pre-trained LLMs.</p>
      <sec id="sec-2-1">
        <title>2.1. Unsupervised Domain-Adaptation</title>
        <p>The first step in the procedure was the domain-adaptation of the model on legal text. We collected the PDFs of the regional laws of Emilia-Romagna, as well as the relative implementing acts at the regional level (e.g. "atto del dirigente" and "atto di giunta") and the available reports on the expected and measured impact of a given law (e.g. "clausola valutativa", "ex-ante" and "ex-post" reports). We split the legal documents into chunks, and we implemented a cleaning pipeline to remove typos, bad characters, and irrelevant parts of the documents such as headers and footers. We also added mii-llm/gazzetta-ufficiale [25] to the training dataset, given the affinity of this dataset to our application in language, semantics and type of documents. We did not perform domain-adaptive tokenization [26], using instead the pre-trained models' native tokenizers to tokenize the legal corpus.</p>
        <p>Not all three models under investigation underwent domain adaptation. LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1 were adapted, while Mixtral-8x7B-Instruct-v0.1, after tests regarding its native capability of producing adequate Italian legal text, was not domain adapted.</p>
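        <p>The following sketch illustrates this step with the Hugging Face transformers and peft libraries: causal language modelling on the cleaned legal-text chunks with LoRA adapters on the attention projections (cf. Section 3). The model name, hyperparameters and toy dataset are illustrative assumptions, not the exact SAVIA configuration.</p>
        <preformat>
# Sketch of the unsupervised domain-adaptation step: causal language
# modelling on the cleaned legal-text chunks, with LoRA adapters.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA adapters on the Q, K, V attention projections (see Section 3).
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj"]))

chunks = Dataset.from_dict({"text": ["testo di una legge regionale ..."]})
tokenized = chunks.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=["text"])

# mlm=False yields next-token (causal) language-modelling labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
args = TrainingArguments(output_dir="savia-domain-adapt",
                         num_train_epochs=4,
                         per_device_train_batch_size=1)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
        </preformat>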
        <sec id="sec-1-1-1">
          <title>2.2. Model Alignment on</title>
        </sec>
        <sec id="sec-1-1-2">
          <title>Instruction-Based dataset</title>
        </sec>
      </sec>
      <sec id="sec-1-2">
        <title>With the support of domain experts, we generated an</title>
        <p>Q&amp;A dataset mimicking diferent levels of domain
language proficiency, ranging from questions that could by
written by non-expert users, to the ones that may be
asked by experts in the legal domain. We developed a
semi-automatic procedure to further enrich this Q&amp;A
dataset, using legal documents metadata. The following
is an example included in the instruction-based dataset:
• Q: "Da quando è stata istituita la regione, quali
normative sono state adottate per incentivare la
partecipazione?"
• A: "La prima legge regionale riguardante la
partecipazione ad essere stata approvata è la legge
numero 3 del 2010. In seguito, la legge numero 3</p>
      </sec>
      <sec id="sec-1-3">
        <title>To obtain a model capable of understanding Italian lan</title>
        <p>guage in the law domain and responding to questions
related to laws enacted in the Emilia-Romagna region,
we followed a multi-step approach. We started from an
open-source LLM and adapted it to the legal language
through unsupervised domain adaptation (Section 2.1).
The resulting domain-adapted model was then fine-tuned
for question-answering (Q&amp;A) on an instruction-based
dataset prepared by domain experts for this purpose
(Section 2.2). Finally, we implemented a domain-adapted
retrieval model (Section 2.3) to enrich the answers with
relevant information from the law corpus.</p>
        <p>The full workflow was reproduced starting from
different open-source LLMs:
• LLaMAntino-2-7b-hf-ITA: a 7B model, based
on LLama-2, specifically fine-tuned for the Italian
language [22].
del 2010 è stata abolita e sostituita con la legge
regionale numero 15 del 2018.",</p>
      </sec>
      <sec id="sec-1-4">
        <title>To fine-tune the domain-adapted LLMs, we used the</title>
        <p>instruction-based dataset prepared by the domain experts.
For the loss function computation, we removed the
portion of the text containing the prompt, as in many cases
the prompt added by the RAG module can account for
up to 50% of the total text length. This approach helped
to optimize the training process more efectively.</p>
      </sec>
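        <p>A minimal sketch of this masking scheme is shown below: prompt tokens receive the label -100, which the cross-entropy loss of Hugging Face causal-LM models ignores, so only answer tokens contribute to the loss. The helper is illustrative, not the project code.</p>
        <preformat>
# Sketch of the loss masking used during instruction fine-tuning:
# tokens belonging to the (possibly RAG-augmented) prompt are excluded
# from the loss by setting their labels to -100.
# `tokenizer` is assumed to be the model tokenizer (see Section 2.1 sketch).

IGNORE_INDEX = -100

def build_example(prompt, answer, tokenizer, max_length=2048):
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    answer_ids = tokenizer(answer, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + answer_ids)[:max_length]
    # Loss is computed on the answer tokens only; the prompt (which the
    # RAG context can inflate to roughly 50% of the text) is masked out.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return {"input_ids": input_ids, "labels": labels[:max_length]}
        </preformat>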
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Domain-Adapted Retrieval Model</title>
        <p>To enrich the user's question with relevant information from the legal documents database, we developed a retrieval module based on a semantic-similarity search technique. We used a Sentence-BERT model [27] to populate a vector store with embeddings generated from the legal documents' text chunks. The content most similar to a user's question is retrieved using the semantic search library FAISS [28, 29].</p>
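        <p>A minimal sketch of the indexing and retrieval step, using the sentence-transformers and faiss libraries, is shown below; the embedding model name and the toy chunks are placeholder assumptions, not necessarily those used in SAVIA.</p>
        <preformat>
# Sketch of the retrieval module: embed legal text chunks with a
# Sentence-BERT model and index them with FAISS for semantic search.
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

chunks = ["Art. 1 della legge regionale ...", "Art. 2 ..."]
embeddings = encoder.encode(chunks, normalize_embeddings=True)

# Inner product on normalized vectors is equivalent to cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

question = "Quali leggi regolano la partecipazione?"
query = encoder.encode([question], normalize_embeddings=True)
scores, ids = index.search(query, 2)  # top-2 most similar chunks
retrieved = [chunks[i] for i in ids[0]]
        </preformat>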
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiment</title>
      <p>The project has been carried out exploiting the computational resources of the supercomputer LEONARDO, hosted by CINECA. Each node in the booster partition is equipped with four NVIDIA A100 SXM6 64GB GPUs and a single 32-core Intel Ice Lake CPU.</p>
      <p>For all models, only data parallelism has been employed, given that all these models could adequately fit in the VRAM of the GPUs at our disposal. For the same reason, LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1 have not been quantized during domain adaptation and instruction fine-tuning, preserving the weights' precision. Mixtral-8x7B-Instruct-v0.1 underwent 4-bit quantization instead [30], due to its size.</p>
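      <p>As a sketch, such a 4-bit quantized load can be configured with bitsandbytes through the transformers library as follows; the exact quantization settings used in the project are assumptions here.</p>
      <preformat>
# Sketch of loading Mixtral-8x7B-Instruct-v0.1 with 4-bit quantization
# via bitsandbytes, before attaching the LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=quant_config,
    device_map="auto",
)
      </preformat>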
      <p>For domain adaptation and instruction fine-tuning, we applied LoRA adapters on the Q, K and V layers of the models [15]. The training procedures for the models under study were the following:
• causal language modelling of the pre-trained LLMs on the legal text chunks: this was performed on LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1 and needed, for each model, on average 400 GPU hours (approximately 4 days on a single LEONARDO booster node) to train for four epochs;
• model alignment of the domain-adapted LLaMAntino-2-7b-hf-ITA and Mistral-7B-v0.1, and of the base pre-trained Mixtral-8x7B-Instruct-v0.1, on the QA dataset: this step required approximately 96 GPU hours, or 24 node hours (4 GPUs per node), to complete a 12-epoch training run on a single LEONARDO node.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>To evaluate the quality of the domain-adapted models and select the candidates for instruction-based fine-tuning, all domain-adapted models were evaluated using the perplexity metric (PPL) on a held-out evaluation dataset based on laws. The metric is reported in Table 1.</p>
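      <p>For reference, perplexity can be computed as the exponential of the mean next-token negative log-likelihood over the held-out texts. The sketch below assumes the model and tokenizer from the earlier sketches and is illustrative, not the project's evaluation code.</p>
      <preformat>
# Sketch of the perplexity (PPL) evaluation on held-out legal texts:
# PPL = exp(mean negative log-likelihood of the evaluation tokens).
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts, max_length=2048):
    losses, counts = [], []
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True,
                        max_length=max_length).to(model.device)
        # With labels == input_ids, the model returns the mean
        # next-token cross-entropy over the sequence.
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].numel()
        losses.append(out.loss.item() * n)
        counts.append(n)
    return math.exp(sum(losses) / sum(counts))
      </preformat>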
      <p>Three different domain experts (lawmakers of Assemblea Legislativa) were asked to evaluate the answers generated by the final instruction fine-tuned models on a set of 25 questions. The qualitative analysis of the experts reported that, in general, the answers provided by the LLaMAntino-based model were considered too short and dry, while the answers provided by the Mixtral-based model were considered the most complete, clear and satisfactory in terms of the quality of the specific vocabulary used. Below we report an example of the answers provided by the different final models to a given question. For context, we also include the answer of chatGPT (3.5) to the same question.</p>
      <p>• Question: Sul tema della partecipazione, quali leggi sono state fatte in Emilia-Romagna? ("On the topic of participation, which laws have been enacted in Emilia-Romagna?")
• Answer of Mixtral-8x7B-Instruct-v0.1 fine-tuned: La prima legge regionale approvata in tema di partecipazione è la legge regionale 9 febbraio 2010, n. 3, intitolata "NORME PER LA DEFINIZIONE, RIORDINO E PROMOZIONE DELLE PROCEDURE DI CONSULTAZIONE E PARTECIPAZIONE ALLA ELABORAZIONE DELLE POLITICHE REGIONALI E LOCALI". Questa legge è stata successivamente abrogata e sostituita con la legge regionale 22 ottobre 2018, n. 15, denominata "LEGGE SULLA PARTECIPAZIONE ALL'ELABORAZIONE DELLE POLITICHE PUBBLICHE. ABROGAZIONE DELLA LEGGE REGIONALE 9 FEBBRAIO 2010, N. 3". La legge regionale attualmente in vigore in materia di partecipazione è la legge regionale 22 ottobre 2018, n. 15.
• Answer of LLaMAntino-2-7b-hf-ITA domain-adapted + instruction fine-tuned: - Legge Regionale 9 febbraio 2010, n. 3 - Legge Regionale 22 ottobre 2018, n. 15
• Answer of Mistral-7B-v0.1 domain-adapted + instruction fine-tuned: In relazione alla partecipazione, le leggi regionali sono state le seguenti: - Legge Regionale 3 del 2010 - Legge Regionale 15 del 2018</p>
      <p>• Answer of chatGPT-3.5: Non ho accesso a informazioni in tempo reale o a dati specifici riguardanti leggi promulgate dopo gennaio 2022. Tuttavia, posso fornirti un'informazione generale sul tema della partecipazione politica in Italia. In Italia, molte regioni hanno promulgato leggi e normative per favorire la partecipazione dei cittadini alla vita politica e amministrativa. Queste leggi possono riguardare vari aspetti, come la consultazione pubblica su decisioni amministrative, la promozione della trasparenza nelle istituzioni, o l'istituzione di forme di democrazia partecipativa a livello locale. Per quanto riguarda l'Emilia-Romagna, una delle regioni più attive nel promuovere la partecipazione cittadina, potrebbero essere state adottate diverse leggi e iniziative in questo ambito. Alcune di queste potrebbero riguardare la partecipazione ai processi decisionali locali, la promozione della trasparenza e dell'accesso alle informazioni pubbliche, o la creazione di strumenti e piattaforme per coinvolgere attivamente i cittadini nelle decisioni che li riguardano. Per ottenere informazioni specifiche sulle leggi relative alla partecipazione in Emilia-Romagna dopo il 2022, ti consiglio di consultare le fonti ufficiali della Regione Emilia-Romagna, come il sito web istituzionale o i comunicati stampa delle autorità regionali. In alternativa, potresti contattare direttamente gli uffici regionali competenti per ottenere informazioni aggiornate sulle leggi e le iniziative in materia di partecipazione politica e amministrativa.</p>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusions</title>
      <sec id="sec-3-1">
        <title>We explored diferent approaches to adapt open-source</title>
        <p>LLMs for question-answering on the Emilia-Romagna
law corpus. We adapted the diferent LLMs on a corpus
composed of the Emilia-Romagna regional laws and the
relative implementing acts, and we further refined the
domain-adapted models on a custom QA dataset provided
by domain experts. Finally, we exploited RAG to enrich
the user’s question with relevant contextual information
extracted from the law database.</p>
        <p>We experimented with diferent open-source LLMs,
such as Mistral-7B-v0.1, LLaMAntino-2-7b-hf-ITA,
Mixtral-8x7B-Instruct-v0.1. Our results show that
domain-adapted LLMs that are able to answer specific
domain questions can be a helpful tool to support
decisionmaking in specialized fields such as the legal domain,
that often need to retrieve exact, concise and
easy-tounderstand information from large and unstructured data
  </body>
  <back>
    <ref-list>
      <ref id="ref1"><mixed-citation>[1] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, et al., Language Models are Few-Shot Learners, arXiv e-prints (2020). arXiv:2005.14165, doi:10.48550/arXiv.2005.14165.</mixed-citation></ref>
      <ref id="ref2"><mixed-citation>[2] M. Chen, J. Tworek, H. Jun, Q. Yuan, et al., Evaluating Large Language Models Trained on Code, arXiv e-prints (2021). arXiv:2107.03374, doi:10.48550/arXiv.2107.03374.</mixed-citation></ref>
      <ref id="ref3"><mixed-citation>[3] OpenAI, J. Achiam, S. Adler, S. Agarwal, et al., GPT-4 Technical Report, arXiv e-prints (2023). arXiv:2303.08774, doi:10.48550/arXiv.2303.08774.</mixed-citation></ref>
      <ref id="ref4"><mixed-citation>[4] H. Touvron, T. Lavril, G. Izacard, X. Martinet, et al., LLaMA: Open and Efficient Foundation Language Models, arXiv e-prints (2023). arXiv:2302.13971, doi:10.48550/arXiv.2302.13971.</mixed-citation></ref>
      <ref id="ref5"><mixed-citation>[5] H. Touvron, L. Martin, K. Stone, P. Albert, et al., Llama 2: Open Foundation and Fine-Tuned Chat Models, arXiv e-prints (2023). arXiv:2307.09288, doi:10.48550/arXiv.2307.09288.</mixed-citation></ref>
      <ref id="ref6"><mixed-citation>[6] B. Rozière, J. Gehring, F. Gloeckle, S. Sootla, et al., Code Llama: Open Foundation Models for Code, arXiv e-prints (2023). arXiv:2308.12950, doi:10.48550/arXiv.2308.12950.</mixed-citation></ref>
      <ref id="ref7"><mixed-citation>[7] D. Narayanan, M. Shoeybi, J. Casper, P. LeGresley, et al., Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM, arXiv e-prints (2021). arXiv:2104.04473, doi:10.48550/arXiv.2104.04473.</mixed-citation></ref>
      <ref id="ref8"><mixed-citation>[8] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, I. Sutskever, Zero-Shot Text-to-Image Generation, arXiv e-prints (2021). arXiv:2102.12092, doi:10.48550/arXiv.2102.12092.</mixed-citation></ref>
      <ref id="ref9"><mixed-citation>[9] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, M. Chen, Hierarchical Text-Conditional Image Generation with CLIP Latents, arXiv e-prints (2022). arXiv:2204.06125, doi:10.48550/arXiv.2204.06125.</mixed-citation></ref>
      <ref id="ref10"><mixed-citation>[10] Z. Shi, X. Zhou, X. Qiu, X. Zhu, Improving Image Captioning with Better Use of Captions, arXiv e-prints (2020). arXiv:2006.11807, doi:10.48550/arXiv.2006.11807.</mixed-citation></ref>
      <ref id="ref11"><mixed-citation>[11] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer, High-Resolution Image Synthesis with Latent Diffusion Models, arXiv e-prints (2021). arXiv:2112.10752, doi:10.48550/arXiv.2112.10752.</mixed-citation></ref>
      <ref id="ref12"><mixed-citation>[12] OpenAI, Video generation models as world simulators, Tech. rep., OpenAI (2024).</mixed-citation></ref>
      <ref id="ref13"><mixed-citation>[13] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention Is All You Need, arXiv e-prints (2017). arXiv:1706.03762, doi:10.48550/arXiv.1706.03762.</mixed-citation></ref>
      <ref id="ref14"><mixed-citation>[14] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-Rank Adaptation of Large Language Models, arXiv e-prints (2021). arXiv:2106.09685, doi:10.48550/arXiv.2106.09685.</mixed-citation></ref>
      <ref id="ref15"><mixed-citation>[15] T. Dettmers, A. Pagnoni, A. Holtzman, L. Zettlemoyer, QLoRA: Efficient Finetuning of Quantized LLMs, arXiv e-prints (2023). arXiv:2305.14314, doi:10.48550/arXiv.2305.14314.</mixed-citation></ref>
      <ref id="ref16"><mixed-citation>[16] Hugging Face, Hugging Face Datasets (2016). URL https://huggingface.co/datasets</mixed-citation></ref>
      <ref id="ref17"><mixed-citation>[17] OpenAI, Introducing ChatGPT, Tech. rep., OpenAI (2022).</mixed-citation></ref>
      <ref id="ref18"><mixed-citation>[18] L. Huang, W. Yu, W. Ma, W. Zhong, et al., A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions, arXiv e-prints (2023). arXiv:2311.05232, doi:10.48550/arXiv.2311.05232.</mixed-citation></ref>
      <ref id="ref19"><mixed-citation>[19] P. Lewis, E. Perez, A. Piktus, F. Petroni, et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, arXiv e-prints (2020). arXiv:2005.11401, doi:10.48550/arXiv.2005.11401.</mixed-citation></ref>
      <ref id="ref20"><mixed-citation>[20] S. Siriwardhana, R. Weerasekera, E. Wen, T. Kaluarachchi, et al., Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering, arXiv e-prints (2022). arXiv:2210.02627, doi:10.48550/arXiv.2210.02627.</mixed-citation></ref>
      <ref id="ref21"><mixed-citation>[21] P. Zhao, H. Zhang, Q. Yu, Z. Wang, et al., Retrieval-Augmented Generation for AI-Generated Content: A Survey, arXiv e-prints (2024). arXiv:2402.19473, doi:10.48550/arXiv.2402.19473.</mixed-citation></ref>
      <ref id="ref22"><mixed-citation>[22] P. Basile, E. Musacchio, M. Polignano, L. Siciliani, G. Fiameni, G. Semeraro, LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language, arXiv e-prints (2023). arXiv:2312.09993, doi:10.48550/arXiv.2312.09993.</mixed-citation></ref>
      <ref id="ref23"><mixed-citation>[23] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, et al., Mistral 7B, arXiv e-prints (2023). arXiv:2310.06825, doi:10.48550/arXiv.2310.06825.</mixed-citation></ref>
      <ref id="ref24"><mixed-citation>[24] Mistral AI team, Mixtral of Experts, Tech. rep., Mistral AI (2023).</mixed-citation></ref>
      <ref id="ref25"><mixed-citation>[25] E. Federici, M. Ferraretto, N. Landro, Gazzetta Ufficiale: A dataset of legislative texts, public and private acts (2024). URL https://huggingface.co/datasets/mii-llm/gazzetta-ufficiale</mixed-citation></ref>
      <ref id="ref26"><mixed-citation>[26] M. Liu, T.-D. Ene, R. Kirby, C. Cheng, et al., ChipNeMo: Domain-Adapted LLMs for Chip Design, arXiv e-prints (2023). arXiv:2311.00176, doi:10.48550/arXiv.2311.00176.</mixed-citation></ref>
      <ref id="ref27"><mixed-citation>[27] N. Reimers, I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, arXiv e-prints (2019). arXiv:1908.10084, doi:10.48550/arXiv.1908.10084.</mixed-citation></ref>
      <ref id="ref28"><mixed-citation>[28] J. Johnson, M. Douze, H. Jégou, Billion-scale similarity search with GPUs, IEEE Transactions on Big Data 7 (3) (2019) 535–547.</mixed-citation></ref>
      <ref id="ref29"><mixed-citation>[29] M. Douze, A. Guzhva, C. Deng, J. Johnson, G. Szilvasy, P.-E. Mazaré, M. Lomeli, L. Hosseini, H. Jégou, The Faiss library (2024). arXiv:2401.08281.</mixed-citation></ref>
      <ref id="ref30"><mixed-citation>[30] Hugging Face, bitsandbytes, Tech. rep., Hugging Face (2023).</mixed-citation></ref>
      <ref id="ref31"><mixed-citation>[31] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, C. Finn, Direct Preference Optimization: Your Language Model is Secretly a Reward Model, arXiv e-prints (2023). arXiv:2305.18290, doi:10.48550/arXiv.2305.18290.</mixed-citation></ref>
      <ref id="ref32"><mixed-citation>[32] L. Ouyang, J. Wu, X. Jiang, D. Almeida, et al., Training language models to follow instructions with human feedback, arXiv e-prints (2022). arXiv:2203.02155, doi:10.48550/arXiv.2203.02155.</mixed-citation></ref>
    </ref-list>
  </back>
</article>