<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">InLegalLLaMA: Indian Legal Knowledge Enhanced Large Language Model</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sudipto</forename><surname>Ghosh</surname></persName>
							<email>sudipto.ghosh@scai.iitd.ac.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Delhi</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Devanshu</forename><surname>Verma</surname></persName>
							<email>dverma@cs.du.ac.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Delhi</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Balaji</forename><surname>Ganesan</surname></persName>
							<email>bganesa1@in.ibm.com</email>
							<affiliation key="aff1">
								<orgName type="institution">IBM Research India</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Purnima</forename><surname>Bindal</surname></persName>
							<email>pbindal@cs.du.ac.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Delhi</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vikas</forename><surname>Kumar</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Delhi</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vasudha</forename><surname>Bhatnagar</surname></persName>
							<email>vbhatnagar@cs.du.ac.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Delhi</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">InLegalLLaMA: Indian Legal Knowledge Enhanced Large Language Model</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">EE86C219EC88058C5622CD685645896A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Large Language Models</term>
					<term>Knowledge Enhanced Models</term>
					<term>Legal Text Analytics</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Large Language Models (LLMs) are increasingly being used in many domains, including law and justice. General-purpose models trained on web data are not performant enough on legal text analytics (LTA) tasks, while fine-tuning task-specific models is expensive because of annotation and compute costs. Pre-training domain- or application-specific models is increasingly popular. However, pre-training LLMs on small domain corpora such as Indian legal documents and judgements is challenging. We introduce our InLegalLLaMA model, along with the related training corpus, adapted for the Indian legal domain, which shows promise of improved performance on LTA tasks. We also propose a RAG-based framework for petition drafting that benefits from the legal language generation and reasoning abilities of large language models.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>As in many other fields, Large Language Models (LLMs) are increasingly being used in the domain of law and justice. Works such as Chalkidis et al. <ref type="bibr" target="#b0">[1]</ref> and Paul et al. <ref type="bibr" target="#b1">[2]</ref> have incorporated LLM embeddings for legal tasks. While general-purpose LLMs are being tried in domain- and country-specific applications, such web-scale models, whose training data provenance cannot be easily established, may not be best suited for the purpose and face regulatory challenges. LLMs trained specifically on a domain corpus could be compute-efficient and trustworthy while also performing well. Joshi et al. <ref type="bibr" target="#b2">[3]</ref> introduced a benchmark for several tasks in the Indian legal domain.</p><p>Our motivation for training an India-specific legal LLM is that tasks like petition drafting, case similarity prediction, and judgement summarisation require knowledge infusion at different stages. As shown in Figure <ref type="figure" target="#fig_0">1</ref>, these are currently specialized tasks performed by legal professionals and researchers, and each of them includes several sub-tasks such as legal NER, question answering, and text-to-SQL. Infusing knowledge into LLMs has shown promise in general-purpose tasks and could be useful for legal text analytics (LTA) as well. In particular, Agarwal et al. <ref type="bibr" target="#b3">[4]</ref>, Moiseev et al. <ref type="bibr" target="#b4">[5]</ref>, and Agarwal et al. <ref type="bibr" target="#b5">[6]</ref> have shown the effectiveness of additionally pre-training LLMs with external knowledge sources like knowledge graphs.</p><p>In this paper, we introduce InLegalLLaMA, a large language model enhanced with knowledge of the Indian legal domain and specifically designed for Indian legal text analytics tasks. 
The construction and use of a legal knowledge graph is central to the effectiveness of legal LLMs. Such a knowledge graph should include entities and relationships extracted from legal documents like judgements and legislation, represented as (subject, predicate, object) triples or in other appropriate formats. Integrating this structured knowledge into LLMs enables more accurate and context-aware analysis of legal text.</p><p>To create an Indian legal knowledge graph, we build upon the work of Dhani et al. <ref type="bibr" target="#b6">[7]</ref>, who construct a knowledge graph over Indian court judgements and legislation. By integrating domain-specific knowledge comprising Indian law and case documents, we expect InLegalLLaMA to address several challenges inherent in legal documents, such as complex language, lengthy texts, prevalence of non-English terms, and unstructured information.</p><p>In addition to supporting legal professionals, InLegalLLaMA can also benefit law students, researchers, legislators, and citizens. Students and researchers can use the model to better understand legal terms and concepts. Legislators can use it in their deliberations and to improve laws. Finally, even citizens unfamiliar with legal processes can use applications built on this model for tasks such as drafting petitions and understanding legal notices. InLegalLLaMA aims to bridge the gap between advanced NLP techniques and the practical needs of the legal domain. By leveraging a robust legal knowledge graph, the model enhances the efficiency and accessibility of legal text analytics, contributing to more effective and timely delivery of justice.</p><p>This paper is organized as follows. In Section 2 we discuss related work, in Section 3 we introduce the InLegalLLaMA model, and in Section 4 we propose a RAG-based framework for petition drafting using InLegalLLaMA.</p></div>
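As an illustrative sketch of the triple representation discussed above, a legal knowledge graph fact can be held as a typed (subject, predicate, object) record and linearized into flat text for use in pre-training. The entity and relation names below are hypothetical examples, not drawn from our actual graph.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    object: str

# Hypothetical examples of facts a legal knowledge graph might hold.
triples = [
    Triple("State v. X", "cites", "Indian Penal Code, Section 302"),
    Triple("State v. X", "decided_by", "Supreme Court of India"),
]

def linearize(t: Triple) -> str:
    """Render a triple as a flat string suitable for LLM training text."""
    return f"{t.subject} | {t.predicate} | {t.object}"
```

In practice the linearized strings would be concatenated into the pre-training corpus or wrapped in natural-language templates.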
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>A significant amount of work has been done on methods to train LLMs with domain-specific data. There are also a few works on models trained on legal data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Legal Knowledge Graph and datasets</head><p>Automated Knowledge Base Construction (AKBC) has been popular since the Knowledge Base Population (KBP) track Ji et al. <ref type="bibr" target="#b7">[8]</ref> organized by the Text Analysis Conference (TAC) in 2010. Domain-specific knowledge graphs Abu-Salih <ref type="bibr" target="#b8">[9]</ref> remain an ongoing research area. Dhani et al. <ref type="bibr" target="#b6">[7]</ref> and Jain et al. <ref type="bibr" target="#b9">[10]</ref> discuss creating legal knowledge graphs using judgements and related documents from Indian courts. The role of human annotations in knowledge graph construction is also a well-researched area. Chiticariu et al. <ref type="bibr" target="#b10">[11]</ref> proposed a system to extract domain-specific entities and relationships from documents. Vannur et al. <ref type="bibr" target="#b11">[12]</ref> discussed fairness in personal knowledge base construction. We can characterize such methods as rule-based or rule-assisted knowledge base construction.</p><p>Guha et al. <ref type="bibr" target="#b12">[13]</ref> introduced LegalBench, a benchmark for measuring legal reasoning in large language models. The Indian Legal Document Corpus published by Malik et al. <ref type="bibr" target="#b13">[14]</ref> contains 35,000 Indian court judgments and gold-standard explanations for the Court Judgment Prediction and Explanation task. In this work, we introduce a new dataset comprising 10,000 Indian court judgments and legal statutes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Knowledge Infusion</head><p>Chalkidis et al. <ref type="bibr" target="#b0">[1]</ref> introduced the LegalBERT model, which continues to be used for tasks on legal data, including our experiments in this work. Paul et al. <ref type="bibr" target="#b1">[2]</ref> introduced InLegalBERT, which is trained on Indian legal documents. Infusing knowledge into large language models has been discussed in several works. Two surveys, Wei et al. <ref type="bibr" target="#b14">[15]</ref> and Yang et al. <ref type="bibr" target="#b15">[16]</ref>, present different methods to infuse knowledge into large language models. Islam et al. <ref type="bibr" target="#b16">[17]</ref> consume a knowledge graph for the entity generation task.</p><p>Agarwal et al. <ref type="bibr" target="#b3">[4]</ref> created a method to translate knowledge graph triples into sentences for enhancing LLM pre-training. Moiseev et al. <ref type="bibr" target="#b4">[5]</ref> and Agarwal et al. <ref type="bibr" target="#b5">[6]</ref> then directly integrated these triples into T5 models, showing two effective paths for knowledge integration: via natural language or directly from triples. Vasisht et al. <ref type="bibr" target="#b17">[18]</ref> took a different approach by using contextual text for embedding knowledge into models. dos Santos et al. <ref type="bibr" target="#b18">[19]</ref> developed Knowledge Prompts for frequent Wikidata entities, refined to aid in triple prediction. Diao et al. <ref type="bibr" target="#b19">[20]</ref> use adapters for efficient knowledge infusion into LLMs.</p><p>LegalBERT Chalkidis et al. <ref type="bibr" target="#b0">[1]</ref>, CaseLawBERT Zheng et al. <ref type="bibr" target="#b20">[21]</ref> and JuriBERT Douka et al. <ref type="bibr" target="#b21">[22]</ref> show the sustained interest of researchers in using language models for downstream legal tasks. 
However, these models are typically trained on European legal documents and do not perform well out of the box in the Indian context, where there is more variability in document structure and multi-linguality in legal documents; they therefore need additional training on local corpora.</p><p>Gururangan et al. <ref type="bibr" target="#b22">[23]</ref> proposed extending the training phase of language models with domain-specific datasets to achieve domain adaptation. Ibrahim et al. <ref type="bibr" target="#b23">[24]</ref> empirically observe that it is possible to match the performance of language models trained from scratch on a mix of the original training corpora and the incoming domain corpora, by combining novel learning rate scheduling strategies with replay of a portion of the original corpora alongside text from the target domain. Using a similar strategy, Yang et al. <ref type="bibr" target="#b24">[25]</ref> continually pretrain a LLaMA-2 model with 10% replay of RedPajama, instruction-tune it on a subset of LIMA as well as customized instructions, and evaluate the model on plant science quizzes.</p></div>
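The replay strategy described above can be sketched as a simple data-mixing step: keep the domain corpus as the bulk of the training stream and blend in a small fraction of documents sampled from the original pre-training corpus. This is an illustrative sketch only; the function name and sampling scheme are our own, not from the cited works.

```python
import random

def mix_with_replay(domain_docs, replay_docs, replay_fraction=0.05, seed=0):
    """Build a training stream that is mostly domain text plus a small
    replay sample of the original pre-training corpus, so that the
    final stream contains roughly `replay_fraction` replay documents."""
    rng = random.Random(seed)
    # number of replay docs so that they form `replay_fraction` of the mix
    n_replay = int(len(domain_docs) * replay_fraction / (1 - replay_fraction))
    sample = rng.sample(replay_docs, min(n_replay, len(replay_docs)))
    mixed = list(domain_docs) + sample
    rng.shuffle(mixed)
    return mixed
```

A 5% or 10% `replay_fraction` corresponds to the ratios used in the continual-pretraining literature cited here.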
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Legal Large Language Models</head><p>Works such as InLegalBERT and InCaseLawBERT Paul et al. <ref type="bibr" target="#b25">[26]</ref> involve training the base models on Indian legal documents and achieve reasonable performance on tasks like legal statute identification, semantic segmentation, and court judgment prediction. Much work still needs to be done to make LLMs useful in LTA tasks that need human expertise. We extend the pretraining phase of the LLaMA-2 foundation model Touvron et al. <ref type="bibr" target="#b26">[27]</ref> on small-scale Indian legal domain corpora and instruction-tune it for a selected set of tasks in the Indian legal domain. We use parameter-efficient fine-tuning methods for pretraining and instruction fine-tuning, and compare the model's performance on multiple datasets against other state-of-the-art models, with and without fine-tuning on domain corpora.</p><p>Joshi et al. <ref type="bibr" target="#b2">[3]</ref> introduced the IL-TUR benchmark for Indian Legal Text Understanding and Reasoning. IL-TUR comprises monolingual (English, Hindi) and multilingual (9 Indian languages) legal domain-specific tasks centred on understanding and reasoning over Indian legal documents. They present baseline models for each task and propose a community leaderboard. Joshi et al. <ref type="bibr" target="#b27">[28]</ref> introduce the Prior Case Retrieval task for citing prior cases and propose a solution using event extraction. Bhattacharya et al. <ref type="bibr" target="#b28">[29]</ref> and Belfathi et al. <ref type="bibr" target="#b29">[30]</ref> explore rhetorical role prediction in legal documents using transformer-based architectures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">InLegalLLaMA</head><p>In this section, we describe the stages of training the InLegalLLaMA model and the experiments we conducted to evaluate its performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Knowledge Infusion</head><p>A legal knowledge graph can help students familiarize themselves with legal terms and concepts. Such knowledge graphs can also be used to infuse knowledge into, or fine-tune, large language models (LLMs) to fill gaps where the models may not have sufficient domain-specific knowledge. We build on Dhani et al. <ref type="bibr" target="#b6">[7]</ref> and Jain et al. <ref type="bibr" target="#b9">[10]</ref> to create a legal knowledge graph by scraping the web for court cases, judgements, laws, and other cases cited from the judgements. In particular, they use court repositories and other public sources in the Indian court system, whose provenance can be easily established, and any document can be removed from the training corpus upon request.</p><p>They retrieve legal documents from Indian court systems and use citations and similarity from IndianKanoon Sinha <ref type="bibr" target="#b30">[31]</ref> and Casemine Yadav <ref type="bibr" target="#b31">[32]</ref>. Next, they process the original documents using Stanza (Qi et al. <ref type="bibr" target="#b32">[33]</ref>) and extract entities and relations using SystemT (Chiticariu et al. <ref type="bibr" target="#b10">[11]</ref>). They further annotate these documents using manually curated dictionaries as described in Vannur et al. <ref type="bibr" target="#b11">[12]</ref>.</p><p>Following this prior work, we too represent our Indian legal knowledge graph as (subject, predicate, object) triples. Representing knowledge graph triples and infusing them via a triple prediction task is well known. Agarwal et al. <ref type="bibr" target="#b3">[4]</ref> generated natural language sentences from triples and additionally pre-trained large language models with the generated sentences. Moiseev et al. 
<ref type="bibr" target="#b4">[5]</ref> showed that triples can be infused directly into large language models without generating sentences from them. Agarwal et al. <ref type="bibr" target="#b5">[6]</ref> infused triples from domain-specific knowledge graphs into Flan-T5 models.</p><p>Vasisht et al. <ref type="bibr" target="#b17">[18]</ref> fine-tune LLMs on triple prediction and design prompts to probe the extent of knowledge infusion in LLMs. One limitation of knowledge infusion using triples, as described in Moiseev et al. <ref type="bibr" target="#b4">[5]</ref>, is the inability of the models to capture graph structure. Vasisht et al. <ref type="bibr" target="#b17">[18]</ref> address this by relying on contextual information to help the model recollect other information associated with the triples. During the instruction-tuning phase, we mask one of the elements of a triple and pose cloze questions, providing the triples as context, so that the model learns to answer questions posed in a similar fashion.</p><p>Following the observation that domain-adapted models tend to perform better on domain tasks than general language models that have never seen domain corpora (Gururangan et al. <ref type="bibr" target="#b22">[23]</ref>, Ibrahim et al. <ref type="bibr" target="#b23">[24]</ref>), we extend the training phase of the LLaMA-2 foundation model on Indian legal text and instruction-tune it for two domain tasks. We use the training scripts from the LLaMA-Factory project Zheng et al. <ref type="bibr" target="#b33">[34]</ref>. The base and the instruction-tuned versions of the model are publicly available on HuggingFace. 1 2   </p></div>
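The masked-triple cloze construction described above can be sketched as follows. The exact prompt wording, the `[MASK]` token, and the example triple are illustrative assumptions, not the literal templates used in our instruction-tuning data.

```python
def make_cloze_example(triple, mask="object"):
    """Turn a (subject, predicate, object) triple into a cloze-style
    instruction, supplying the full triple as context so the model can
    learn to recover the masked element."""
    s, p, o = triple
    masked = {
        "subject": ("[MASK]", p, o),
        "predicate": (s, "[MASK]", o),
        "object": (s, p, "[MASK]"),
    }[mask]
    answer = {"subject": s, "predicate": p, "object": o}[mask]
    prompt = (
        f"Context: ({s}, {p}, {o})\n"
        f"Fill in the masked element: ({masked[0]}, {masked[1]}, {masked[2]})"
    )
    return {"instruction": prompt, "output": answer}
```

Each such record becomes one supervised instruction-tuning example, with the masked element as the target output.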
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset</head><p>We use 10,000 legal documents from the Indian common law system, comprising an equal number of (i) reportable court judgments (which are important and become binding on lower courts) and non-reportable court judgments (which are limited in their application to the specific case at hand) published by the Supreme Court of India, and (ii) legal statutes published in the Gazette of India by the Indian parliament and various state legislative institutions. We use this corpus to continue the pretraining phase of LLaMA-2. We preprocess the text using an in-house package to remove non-printable characters, stray sequences, and most of the noise. For the instruction tuning of the foundation model, we use the datasets from Vasisht et al. <ref type="bibr" target="#b17">[18]</ref>, Bhattacharya et al. <ref type="bibr" target="#b28">[29]</ref>, and Zhou et al. <ref type="bibr" target="#b35">[36]</ref>. </p></div>
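A minimal sketch of the kind of cleaning pass described above is shown below. Our in-house package is more elaborate; this only illustrates removing non-printable characters and collapsing stray whitespace, and the function name is our own.

```python
import re
import unicodedata

def clean_legal_text(text: str) -> str:
    """Illustrative cleaning pass: normalise Unicode, drop non-printable
    characters, and collapse stray whitespace runs."""
    text = unicodedata.normalize("NFKC", text)
    # keep printable characters plus common whitespace
    text = "".join(ch for ch in text if ch.isprintable() or ch in "\n\t ")
    # collapse runs of spaces/tabs and excessive blank lines
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```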
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Continual Pretraining</head><p>The foundation model is adapted to the Indian legal domain in the hope of realising the benefits that domain adaptation brings. However, naively continuing to train on the new dataset can result in poor adaptation to the incoming data and catastrophic forgetting of the capabilities and knowledge the model already holds. Instead of randomly initializing weights and training from scratch, we continue model training on the new pretraining dataset using the learning rate re-warmup and re-decay approach suggested by Ibrahim et al. <ref type="bibr" target="#b23">[24]</ref>. We continue training LLaMA-2 from the published model weights on the auto-regressive pretraining task with a new dataset of documents containing 88,768,648 unique 𝑛-grams, plus 5% replay data from RedPajama TogetherAI <ref type="bibr" target="#b36">[37]</ref> to avoid catastrophic forgetting, for around 24,900 steps. To do this efficiently under resource constraints, we set a chunk size of 2,048 tokens and use LoRA (𝑟 = 16, 𝛼 = 32) for weight updates instead of full pretraining. We also use a learning rate schedule to speed up training, with an initial linear re-warmup of 2,000 steps up to an 𝜂 𝑝𝑒𝑎𝑘 of 3.0 × 10 −4 followed by a cosine re-decay phase, as recommended by Hoffmann et al. <ref type="bibr" target="#b34">[35]</ref>. Running the training process for 3 epochs took approximately 301 GPU hours on a single NVIDIA A6000. The training loss and learning rate over training steps are plotted in Figure <ref type="figure">3</ref>.</p></div>
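The re-warmup/re-decay schedule above can be written as a small function. The warmup length, total step count, and peak learning rate follow the figures in this section; the floor learning rate `min_lr` is an assumption (set to zero here), not a reported hyperparameter.

```python
import math

def lr_at_step(step, warmup_steps=2000, total_steps=24900,
               peak_lr=3.0e-4, min_lr=0.0):
    """Linear re-warmup to peak_lr over warmup_steps, then cosine
    re-decay down to min_lr by total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

Plotting `lr_at_step` over the step range reproduces the warmup-then-cosine shape shown in the training curves.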
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Instruction Tuning</head><p>To enable the model to respond to specific instructions and queries, we use examples from the in-context masked triple prediction (Vasisht et al. <ref type="bibr" target="#b17">[18]</ref>) and legal sentence rhetorical role classification (Bhattacharya et al. <ref type="bibr" target="#b28">[29]</ref>) tasks, posed as instructions to the model. We also use LIMA instructions Zhou et al. <ref type="bibr" target="#b35">[36]</ref> to prevent catastrophic forgetting. We perform supervised LoRA instruction tuning with the default LLaMA-2 prompt template, with an 𝜂 𝑝𝑒𝑎𝑘 of 3.0 × 10 −5 and a 1,000-step warmup. Instruction tuning took about six hours on the same node. The instruction-tuning loss and learning rate schedule are depicted in Figure <ref type="figure">2</ref>.</p></div>
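For reference, the default LLaMA-2 chat template mentioned above wraps each instruction in `[INST] ... [/INST]` markers, with an optional `<<SYS>>` system block. The helper below is a simplified single-turn sketch of that format; the system prompt text in the usage example is hypothetical.

```python
from typing import Optional

def llama2_prompt(instruction: str, system: Optional[str] = None) -> str:
    """Format a single-turn supervised example with the LLaMA-2 chat
    template: [INST] ... [/INST], optionally with a <<SYS>> block."""
    if system is not None:
        return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"
    return f"<s>[INST] {instruction} [/INST]"
```

During supervised tuning, the target completion is appended after `[/INST]` and the loss is computed only on the completion tokens.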
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Evaluation</head><p>We evaluate the LLaMA-2 foundation model and InLegalLLaMA on the in-context masked triple prediction Vasisht et al. <ref type="bibr" target="#b17">[18]</ref> and legal sentence rhetorical role classification Bhattacharya et al. <ref type="bibr" target="#b28">[29]</ref> tasks. We take held-out sets of task instances and report task metrics in Tables <ref type="table" target="#tab_0">1 and 2</ref>. We set out to observe whether we can match the performance of the off-the-shelf LLaMA-2 model on these Indian legal domain tasks; we do not seek to establish the state of the art on these tasks. We note that InLegalLLaMA performs better than LLaMA-2 on the domain tasks we consider. These promising results suggest that InLegalLLaMA may also be suited to other tasks in the Indian legal context and perform on par with the baselines. Table <ref type="table" target="#tab_0">1</ref> reports the comparative performance of InLegalLLaMA on the in-context triple prediction task in terms of Hits@1, BLEU, and ROUGE-L metrics.</p></div>
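Of the metrics reported, Hits@1 is the simplest: the fraction of held-out instances whose top prediction matches the reference. A minimal sketch, assuming exact match after light normalisation (our evaluation may apply different normalisation):

```python
def hits_at_1(predictions, references):
    """Fraction of instances whose top prediction exactly matches the
    reference, after lowercasing and whitespace normalisation."""
    def norm(s):
        return " ".join(s.lower().split())
    correct = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return correct / len(references)
```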
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model P R F1</head><p>Hier-BiLSTM-CRF 0.652 0.552 0.578<lb/>BERT-BiLSTM-CRF 0.688 0.615 0.635<lb/>LLaMA-2-7B 0.620 0.553 0.571<lb/>InLegalLLaMA 0.669 0.573 0.585</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Comparative performance of InLegalLLaMA for the legal sentence rhetorical role prediction task in terms of Precision (P), Recall (R) and F1-Score (F1) metrics</p><p>In Table <ref type="table" target="#tab_0">1</ref>, we can see that the performance of our InLegalLLaMA model compares well against the baseline LLaMA-2-7B model as well as the Flan-T5 model. However, the high performance of all the models indicates that the triple prediction task on this dataset may not be particularly hard. We plan to increase the size of the knowledge graph from which these triples are drawn.</p><p>On the other hand, rhetorical role prediction is a harder problem with important applications in legal text analytics. Here, our evaluation of the InLegalLLaMA model shows more promising results, as shown in Table <ref type="table">2</ref>: our model performs as well as supervised models for this task.</p><p>To extend this model to the more complex legal text analytics tasks described in the next section, we believe more extensive instruction tuning is required. We leave this as future work. Further, a code fine-tuned version of InLegalLLaMA will likely be needed for tasks like Text-to-SQL. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Petition Drafting</head><p>Court petitions are formal written requests submitted to a court, seeking a specific judicial action or ruling. Ghosh et al. <ref type="bibr" target="#b37">[38]</ref> includes samples of petitions filed in Indian courts. Petitions serve as a primary method for individuals or entities to initiate various legal proceedings, seek specific court orders, or appeal against decisions of lower courts or authorities. While individuals can file petitions themselves, it is common to engage lawyers due to the complexity of legal procedures.</p><p>Petition drafting is an inherently human-centred task, especially in the context of the Indian court system. Indian courts have the concept of Public Interest Litigations (PILs), through which any citizen can approach a court of law to seek relief on issues concerning the people. There is, of course, a much larger number of people approaching the courts seeking redressal of their grievances.</p><p>Enabling people or their lawyers to write well-drafted petitions can go a long way towards giving them access to justice. Given the backlog and the volume of petitions disposed of by courts in India and many other countries, poorly written petitions can add significant cost to both individuals and society as a whole. Poorly written petitions include, among other things, those that leave out important pieces of information, are addressed to the wrong courts or authorities, or risk being dismissed as frivolous when in fact they are not.</p><p>We propose using LLMs to identify missing information in a petition. This is a qualitatively harder task than writing a document with a focus on style and presentation. Our task involves making LLMs identify missing information that should typically be present in the petition. This can be designed as a conversational question answering task. 
This is closely related to factuality work on LLMs, since we do not want the model to ask trivial questions. The model needs to identify salient information in a petition and prompt the user to furnish any missing information.</p><p>For example, in a petition about a missing person, which is a very sensitive but important judicial function, the petition is expected to state when the person was last seen by a member of the public or a CCTV camera. While we expect this to be a multi-turn conversation, we currently focus on creating a question answering dataset and evaluating our LLaMA-2 model on the question answering task.</p><p>A typical Indian court petition includes (i) the petitioner's and respondent's names and addresses, (ii) a detailed statement of facts and events leading to the petition, (iii) legal grounds and relevant laws supporting the petition, and (iv) the specific relief or action sought from the court.</p><p>We propose a RAG solution for the task in Figure <ref type="figure" target="#fig_2">4</ref>, which relies on the inputs of a trained advocate for generating the draft and employs a human-in-the-loop approach. It consists of four key stages:</p><p>1. Template Selection: Based on the case details and the court to which the appeal is to be made, a set of candidate templates is retrieved from a template store. It is essential to file petitions in the appropriate court with jurisdiction over the matter. The advocate selects the most appropriate template from the recommendations. This template acts as a structured outline for the petition and is used in the next stages.</p></div>
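The missing-information check can be illustrated with a toy completeness checker over the four typical petition components listed above. The section names and keyword cues below are hypothetical stand-ins; in our proposal an LLM, not keyword matching, performs this detection.

```python
import re

# Illustrative cue words per required petition section (hypothetical).
REQUIRED_SECTIONS = {
    "parties": ["petitioner", "respondent"],
    "facts": ["facts", "events"],
    "grounds": ["grounds", "section", "act"],
    "relief": ["relief", "prayer"],
}

def missing_sections(petition_text: str) -> list:
    """Return the names of required sections for which no cue word
    appears in the draft petition."""
    text = petition_text.lower()
    return [name for name, cues in REQUIRED_SECTIONS.items()
            if not any(re.search(rf"\b{cue}\b", text) for cue in cues)]
```

A conversational system would turn each flagged section into a follow-up question to the user, e.g. asking for the legal grounds when `"grounds"` is reported missing.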
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Content Generation:</head><p>The content generation phase is modelled as a parallel multi-hop question answering task Mavi et al. <ref type="bibr" target="#b38">[39]</ref> over the case details. Each section of the petition has to be written from a particular perspective. An LLM agent can use RAG, external tools, and human input to acquire the required context for each section and generate the content by following certain rules of thumb (RoTs). The role of these RoTs is to ensure that the model uses only the relevant details to generate the content, and to steer the tone and depth of detail of the generations.</p><p>3. Refinement and Integration: Each section is refined to enhance readability, eliminate redundancies, and ensure coherence and a proper narrative. A human expert intervenes by accepting or rejecting these refinements. The sections are then merged into a cohesive document that follows the outline.</p></div>
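The per-section generation step above can be sketched as one retrieval-then-generation hop. Here `retriever` and `llm` are hypothetical callables standing in for real RAG components, and the prompt wording is illustrative.

```python
def draft_section(section_name, case_details, retriever, llm, rules_of_thumb):
    """One hop of the content-generation stage: retrieve context
    relevant to a petition section, then prompt the model with the
    section's rules of thumb (RoTs) and the retrieved context."""
    context = retriever(query=f"{section_name}: {case_details}")
    prompt = (
        f"Draft the '{section_name}' section of a court petition.\n"
        f"Rules of thumb: {rules_of_thumb}\n"
        f"Case details: {case_details}\n"
        f"Retrieved context: {context}"
    )
    return llm(prompt)
```

In the full framework, the sections would be drafted in parallel and then passed to the refinement and integration stage.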
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Draft Evaluation:</head><p>The final phase involves an evaluation and iteration process in which the petition draft is assessed through an elaborate evaluation strategy. If required, sections can be regenerated on demand. LLMs can be used to judge the quality in tandem with human expert(s), and the feedback can be used to align model generations using reinforcement learning Zheng et al. <ref type="bibr" target="#b39">[40]</ref>.</p><p>From template selection to exhaustive evaluation, the framework addresses the intricacies of the petition drafting task. Human experts must monitor such a system to ensure that the generated petition draft is admissible in court, complete, legally sound, and does justice to the case at hand.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusion</head><p>Our observations of existing work and applications in Indian legal text analytics make it amply clear that we need to develop country- and domain-specific Large Language Models (LLMs) enhanced with knowledge graphs. In this work, we have introduced InLegalLLaMA, a set of LLMs enhanced with knowledge from an Indian legal knowledge graph. We show the performance of our model on tasks like tail prediction and rhetorical role prediction. We then discuss how our model is useful in more complex legal text analytics tasks like petition drafting, case similarity, judgement summarisation, and legal question answering. We plan to work on code variants of InLegalLLaMA in the future, which will help in Retrieval Augmented Generation (RAG) applications.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: End-users interact with the LLM agent directly and through specialized data collection and interaction screens to exploit legal language generation and reasoning capabilities.</figDesc><graphic coords="2,72.00,65.61,451.27,209.55" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: Training loss and learning rate schedule during instruction tuning phase of InLegalLLaMA</figDesc><graphic coords="5,72.00,294.21,203.07,152.31" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Retrieval Augmented Generation-based Framework for Petition Drafting</figDesc><graphic coords="7,94.57,65.60,406.15,336.54" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc></figDesc><table><row><cell></cell><cell cols="3">Hits@1 BLEU ROUGE</cell></row><row><cell>Flan-T5</cell><cell>0.914</cell><cell>-</cell><cell>-</cell></row><row><cell>LLaMA-2-7B</cell><cell>0.925</cell><cell cols="2">94.951 94.927</cell></row><row><cell>InLegalLLaMA</cell><cell>0.984</cell><cell cols="2">98.224 99.191</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://huggingface.co/sudipto-ducs/InLegalLLaMA</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://huggingface.co/sudipto-ducs/InLegalLLaMA-Instruct</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">LEGAL-BERT: The muppets straight out of law school</title>
		<author>
			<persName><forename type="first">I</forename><surname>Chalkidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fergadiotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Malakasiotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Aletras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Androutsopoulos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2898" to="2904" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mandal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2209.06049</idno>
		<title level="m">Pre-training transformers on Indian legal text</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">IL-TUR: Benchmark for Indian legal text understanding and reasoning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Modi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 62nd Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="11460" to="11499" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training</title>
		<author>
			<persName><forename type="first">O</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shakeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Al-Rfou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="3554" to="3565" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">SKILL: Structured knowledge infusion for large language models</title>
		<author>
			<persName><forename type="first">F</forename><surname>Moiseev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Alfonseca</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jaggi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1581" to="1588" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">There is no big brother or small brother: Knowledge infusion in language models for link prediction and question answering</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gawade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Channabasavarajendra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 19th International Conference on Natural Language Processing (ICON), Association for Computational Linguistics</title>
				<meeting>the 19th International Conference on Natural Language Processing (ICON), Association for Computational Linguistics<address><addrLine>New Delhi, India</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="204" to="211" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Similar cases recommendation using legal knowledge graphs</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Dhani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bhatt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sirohi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bhatnagar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Symposium on Artificial Intelligence and Law (SAIL)</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Overview of the TAC 2010 knowledge base population track</title>
		<author>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Grishman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">T</forename><surname>Dang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Griffitt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ellis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Third Text Analysis Conference (TAC 2010)</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="3" to="3" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Domain-specific knowledge graphs: A survey</title>
		<author>
			<persName><forename type="first">B</forename><surname>Abu-Salih</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Network and Computer Applications</title>
		<imprint>
			<biblScope unit="volume">185</biblScope>
			<biblScope unit="page">103076</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Constructing a knowledge graph from Indian legal domain corpus</title>
		<author>
			<persName><forename type="first">S</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Harde</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Mihindukulasooriya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bisht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dubey</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">TEXT2KG/MK@ ESWC</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="80" to="93" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">SystemT: An algebraic approach to declarative information extraction</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chiticariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Krishnamurthy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Raghavan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Reiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vaithyanathan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 48th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="128" to="137" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Data augmentation for fairness in personal knowledge base population</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">S</forename><surname>Vannur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Nagalapatti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tippeswamy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Trends and Applications in Knowledge Discovery and Data Mining: PAKDD 2021 Workshops</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="143" to="152" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models</title>
		<author>
			<persName><forename type="first">N</forename><surname>Guha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Nyarko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ré</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chohlas-Wood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Waldon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rockmore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zambrano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">ILDC for CJPE: Indian legal documents corpus for court judgment prediction and explanation</title>
		<author>
			<persName><forename type="first">V</forename><surname>Malik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sanjay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Nigam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Guha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Modi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</title>
		<title level="s">Long Papers</title>
		<meeting>the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4046" to="4062" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhatia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Arnold</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2110.08455</idno>
		<title level="m">Knowledge enhanced pretrained language models: A comprehensive survey</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Peng</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2110.00269</idno>
		<title level="m">A survey of knowledge enhanced pre-trained models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Fair data generation using language models with hard constraints</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nagpal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">K</forename><surname>Lohia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CtrlGen Workshop</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Infusing knowledge into large language models with contextual prompts</title>
		<author>
			<persName><forename type="first">K</forename><surname>Vasisht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bhatnagar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">20th International Conference on Natural Language Processing</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">N</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Nham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shakeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sung</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2210.04726</idno>
		<title level="m">Knowledge prompts: Injecting world knowledge into language models through soft prompts</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Diao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2306.05406</idno>
		<title level="m">Mixture-of-domain-adapters: Decoupling and injecting domain knowledge to pre-trained language models memories</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">When does pretraining help? assessing self-supervised learning for law and the casehold dataset of 53,000+ legal holdings</title>
		<author>
			<persName><forename type="first">L</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">R</forename><surname>Anderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Henderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Ho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the eighteenth international conference on artificial intelligence and law</title>
				<meeting>the eighteenth international conference on artificial intelligence and law</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="159" to="168" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">JuriBERT: A masked-language model adaptation for French legal text</title>
		<author>
			<persName><forename type="first">S</forename><surname>Douka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Abdine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Vazirgiannis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">El</forename><surname>Hamdani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Restrepo</forename><surname>Amariles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Natural Legal Language Processing Workshop 2021</title>
				<meeting>the Natural Legal Language Processing Workshop 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="95" to="101" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Don&apos;t stop pretraining: Adapt language models to domains and tasks</title>
		<author>
			<persName><forename type="first">S</forename><surname>Gururangan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Marasović</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Swayamdipta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Downey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 58th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="8342" to="8360" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Ibrahim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thérien</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Richter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Anthony</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lesort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Belilovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Rish</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.08763</idno>
		<title level="m">Simple and scalable strategies to continually pre-train large language models</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Alexandersson</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2401.01600</idno>
		<title level="m">PLLaMa: An open-source large language model for plant science</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Pre-trained language models for the legal domain: A case study on Indian law</title>
		<author>
			<persName><forename type="first">S</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mandal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, ICAIL &apos;23</title>
				<meeting>the Nineteenth International Conference on Artificial Intelligence and Law, ICAIL &apos;23</meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="187" to="196" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Touvron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Stone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Albert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Almahairi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.09288</idno>
		<title level="m">Llama 2: Open Foundation and Fine-Tuned Chat Models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">U-CREAT: Unsupervised case retrieval using events extraction</title>
		<author>
			<persName><forename type="first">A</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Tanikella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Modi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 61st Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="13899" to="13915" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">DeepRhole: Deep Learning for Rhetorical Role Labeling of Sentences in Legal Case Documents</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wyner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence and Law</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Harnessing GPT-3.5-turbo for rhetorical role prediction in legal cases</title>
		<author>
			<persName><forename type="first">A</forename><surname>Belfathi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Monceaux</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Frontiers in Artificial Intelligence and Applications</title>
				<imprint>
			<publisher>IOS Press</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">379</biblScope>
			<biblScope unit="page" from="187" to="196" />
		</imprint>
	</monogr>
	<note>Legal Knowledge and Information Systems - JURIX 2023</note>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Sinha</surname></persName>
		</author>
		<ptr target="https://indiankanoon.org/" />
		<title level="m">IndianKanoon: Search Engine for Indian Law</title>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Yadav</surname></persName>
		</author>
		<ptr target="https://www.casemine.com/" />
		<title level="m">CaseMine: A Granular Mapping of Indian Case Law</title>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Stanza: A python natural language processing toolkit for many human languages</title>
		<author>
			<persName><forename type="first">P</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bolton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Association for Computational Linguistics (ACL) System Demonstrations</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ma</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.13372</idno>
		<title level="m">LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models</title>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Training Compute-Optimal Large Language Models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Borgeaud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mensch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Buchatskaya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="30016" to="30030" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Iyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.11206</idno>
		<title level="m">LIMA: Less Is More for Alignment</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b36">
	<monogr>
		<author>
			<persName><surname>Together AI</surname></persName>
		</author>
		<title level="m">RedPajama: an Open Dataset for Training Large Language Models</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bindal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bhatnagar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.10944</idno>
		<title level="m">Human Centered AI for Indian Legal Text Analytics</title>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b38">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Mavi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jangra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jatowt</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2204.09140</idno>
		<title level="m">Multi-hop Question Answering</title>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-L</forename><surname>Chiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Xing</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2306.05685</idno>
		<title level="m">Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
