<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>FOIS</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Using Parameter-Efficient Fine-Tuning on Legal Artificial Intelligence</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kuo-Chun Chien</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chia-Hui Chang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ren-Der Sun</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Central University</institution>
          ,
          <addr-line>No. 300, Zhongda Rd., Zhongli District, Taoyuan City 320317, Taiwan</addr-line>
          ,
          <country country="TW">R.O.C.</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>19</volume>
      <fpage>19</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>Legal AI has a wide range of applications, such as predicting whether a prosecution will result in punishment, or whether the punishment will be a prison sentence or a fine. However, current advances in natural language processing have resulted in an ever-increasing number of language models, and the cost of fine-tuning pre-trained language models and storing the fine-tuned models is becoming more and more expensive. To address this issue, we adopt the concept of Parameter-Efficient Fine-Tuning (PEFT) and apply it to the field of Legal AI. By leveraging PEFT techniques, particularly through the implementation of the Low-Rank Adaptation (LoRA) architecture, we have achieved promising results in fine-tuning pre-trained language models. This approach enables us to achieve comparable, if not superior, performance while significantly reducing the time required for model adjustments. It demonstrates the potential of PEFT techniques in adapting language models to different legal frameworks, enhancing the accuracy and relevance of legal knowledge services, and making Legal AI more accessible to individuals without legal backgrounds.</p>
      </abstract>
      <kwd-group>
        <kwd>Legal AI</kwd>
        <kwd>Legal Judgment Prediction</kwd>
        <kwd>Parameter-Efficient Fine-Tuning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Legal AI refers to the utilization of artificial intelligence (AI) technology in the legal sector. It
is an expanding field that harnesses sophisticated algorithms and machine learning techniques
to assist in the organization, analysis, and interpretation of extensive legal documentation.
Applications of legal AI encompass various areas, including case management [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], legal judgment
prediction (LJP) [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], court views generation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], among others. From overseeing compliance
to managing legal risks, from streamlining contract management to conducting due diligence,
AI technology can automate and enhance the legal workflow, leading to improved efficiency,
accuracy, and convenience for legal professionals. Ultimately, the implementation of legal AI
has the potential to revolutionize the legal industry, making legal services more accessible and
cost-effective for individuals and businesses alike.
      </p>
      <p>
        Legal cases typically fall into two main categories: civil law and criminal law. Since
gathering facts and evidence for civil cases can be challenging [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], most research efforts in LJP have
primarily concentrated on criminal cases [
        <xref ref-type="bibr" rid="ref6">6, 7, 8</xref>
        ], utilizing verdicts as the primary dataset
for predicting potential legal articles, charges, and terms based on given factual information.
However, in the field of Legal Judgment Prediction (LJP) for criminal cases, there are not only
verdicts but also indictments, and a variety of prediction tasks. For example, prosecutors may
want to know whether a case will ultimately go to trial given the legal provisions and charges
in the indictment; if it goes to trial, whether the punishment will be jail time or a fine; and
if it is dismissed, whether the dismissal is due to immunity or a finding of not guilty.
      </p>
      <p>
        In recent years, significant progress has also been made on many legal tasks based on
pretrained models, including accusation prediction [9], prison term classification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], criminal
element extraction [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and court view generation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], etc. However, current advances in natural
language processing have resulted in an ever-increasing number of language models. The cost
of fine-tuning a pre-trained language model for different LJP tasks and storing each of these
fine-tuned language models becomes more and more expensive. If we were to train a separate large
language model for each sub-task, it would consume excessive time and resources. This
highlights the need for adaptive methods, such as Parameter-Efficient Fine-Tuning (PEFT), which
allows for selective updates or additions of parameters to train the model for new tasks.
      </p>
      <p>In this study, we propose the use of PEFT to fine-tune pre-trained language models.
Specifically, we adopt Low-Rank Adaptation of Large Language Models (LoRA) [10] as an
implementation of PEFT, which offers advantages in reducing computational resources and
fine-tuning time while maintaining or surpassing model performance. This makes it particularly
valuable for refining large models with billions of parameters.</p>
      <p>The rest of the paper is organized as follows: Section 2 introduces related work on legal
AI, LJP, and Parameter-Efficient Fine-Tuning (PEFT). The problem definition and dataset
construction are detailed in Section 3. Section 4 describes the proposed models. We report the
experimental results in Section 5. Finally, Section 6 concludes the paper and suggests directions
for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Legal AI</title>
        <p>Legal artificial intelligence (LegalAI) has drawn increasing attention from NLP researchers
because of the vast amount of legal documents. Zhong et al. [11] surveyed research on legal
artificial intelligence and categorized its applications into three types: legal judgment
prediction (LJP), similar case matching, and legal question answering.</p>
        <p>
          Among them, legal judgment prediction has been widely studied for decades, and there are
also several related LJP datasets, such as CAIL [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], CAIL-Long [12], ECHR [13, 14], etc. CAIL
is the first Chinese legal judgment prediction dataset, which collects criminal cases from
the Supreme People’s Court of China. CAIL-Long further obtains more information from the
Supreme People’s Court of China, including both civil and criminal cases. ECHR [14] is an
English legal judgment prediction dataset collected from the European Court of Human Rights,
which contains cases in which a state is alleged to have breached human rights provisions of
the European Convention on Human Rights.
        </p>
        <p>
          LegalAI’s research methods can be divided into symbol-based methods and embedding-based
methods [11]. In the past, researchers have used traditional machine learning methods for
feature extraction, attempting to extract or create specific features from the description of criminal
facts using additional labeling to help describe the crime. For example, Hu et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] combined
ten discriminative legal features to help predict low-frequency charges. Shaikh et al. [15]
identified and extracted 19 features of murder-related criminal cases to train a binary classifier
to judge guilt. However, these features are difficult to apply to large-scale datasets [16]
because fact descriptions are expressed in different ways and some of these features require
additional labels.
        </p>
        <p>
          To address the above scaling issues, researchers have attempted to incorporate legal
knowledge into neural networks via automatic learning. For example, Luo et al. [17] adopted a
two-step approach that filters out irrelevant law articles and retains the top k articles, in
order to scale up to a large number of law articles. They built a binary classifier for each
article focusing on its relevance to the input case. The advantage of such an approach is that
new articles can be added with the existing classifiers untouched. Similarly, Bao et al. [9]
proposed an attention neural network, LegalAtt, which uses relevant articles to improve the
performance and interpretability of the charge prediction task. Gan et al. [18] injected legal
knowledge in the form of a set of first-order logic rules and integrated these rules into a
co-attention network-based model, which makes the prediction more interpretable for civil loan
cases. Kang et al. [16]
constructed auxiliary fact representations from the definitions of behavioral reasons to enhance
fact descriptions. Lyu et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] introduced four types of criminal elements as bridges between
the fact description and article, and used the concept of reinforcement learning to jointly
identify similar articles and confusing fact descriptions in the legal judgment prediction task.
        </p>
        <p>The multi-task learning framework is a machine learning method that trains multiple related
tasks simultaneously, thus improving the performance of each task. It can use a shared layer
to extract common features for all tasks and then use different specialized layers to handle
the details of each task, or use different layers to extract features for each task and then use
some method to limit the differences between the parameters of these layers. Zhong et al.
[7] proposed the TopJudge model, which uses a topological graph to enhance performance by
exploiting the relationships between legal judgments, predicting articles, charges, and terms.
Yang et al. [19] proposed a multi-layer forward-prediction and backward-validation framework
to effectively utilize the dependency relationships between multiple sub-tasks.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Parameter-Efficient Fine-Tuning</title>
        <p>It has been shown that it is feasible to update or add only a very small number of parameters,
as opposed to updating all of the parameters of the pre-trained model as in ordinary
fine-tuning. The addition of adapters, tiny trainable feed-forward networks inserted
between the layers of the fixed pre-trained model, was suggested by Houlsby et al. [20] (see
Figure 1). Since then, a wide range of advanced PEFT techniques have been put forth, e.g.,
low-rank adaptation by Hu et al. [10] and prefix-tuning by Li and Liang [21]. Houlsby et al. [20]
place two adapters sequentially within one layer of the transformer; that is to say, typical
adapters form a sequential computation. On the other hand, prefix-tuning and LoRA can be
thought of as a “parallel” computation alongside the PLM layer. A unified view of
parameter-efficient transfer learning was proposed by He et al. [22].</p>
      </sec>
      <sec id="sec-2-5">
        <title>3. Problem Formulation and Dataset Construction</title>
        <p>Four steps make up a criminal proceeding: investigation, prosecution, trial, and execution.
Among these steps, the public is most interested in the investigation and trial steps. The
investigation procedure refers to the process in which law enforcement agents look into potential
criminal events and gather evidence under the direction of the prosecutor. The prosecution
will file charges and begin the trial process if they believe that the defendant is strongly
suspected of committing a crime. An impartial, unbiased judge oversees the trial process and
determines whether the defendant actually committed a crime based on the evidence presented
by the prosecutor. Today, judgment documents are used as the data source in the majority
of publicly accessible datasets for LJP research. However, the language employed in judgment
documents is frequently more eloquent, and the substance primarily concentrates on the facts
and procedures, leading to greater document lengths and more difficult comprehension for
non-legal specialists. On the other hand, prosecutors employ language that is shorter and closer
to that of the general public when describing the criminal facts in the indictment, based on
their involvement in the investigation. Hence, rather than using judgment documents as the
scope of the data collection, we employ indictments.</p>
      </sec>
      <sec id="sec-2-3">
        <title>3.1. Dataset Construction</title>
        <p>We collected indictments from the public document inquiry system of the Ministry of Justice of
Taiwan from June 15, 2018 to June 30, 2021. The defendant, charges, criminal facts, and legal
provisions were extracted from the indictments using regular expressions, and the material
was then organized into a JSON format. There were 533 articles under 41 laws and 183 charges
from 355,295 cases in the original dataset.</p>
        <p>How many articles and charges to include in the prediction model is a recurring issue when
creating an LJP dataset. To keep the experiment fair and to prevent classes with insufficient
training or testing data, which may affect the experimental outcomes, we screened out charges
and articles with too few instances (e.g., fewer than 30 cases). Furthermore, the first 100
articles of Taiwan’s criminal code contain definitions of terms like attempted offenses and
criminal responsibility; we did not include these articles in our dataset because they do not
specify actual penalties. Excluding the above cases, the total number of articles decreased
significantly to 165, and the number of charges decreased from 183 to 94. A total of 12,541
cases were removed, accounting for 3.5% of the total dataset.
It is worth noting that a case may violate more than one charge, but often only the primary
charge is listed in the indictment. Thus, it is more difficult to estimate charges than articles
(even though the number of articles in our dataset is greater than the number of charges). As
one might anticipate, the distribution of instances in this dataset is unequal: according to the
charges in the indictments, the top 10 charges account for about 85% of all cases, while the 10
least frequent charges cover just 0.14% of the instances. To split the data fairly, we divided
the cases into categories based on the charges in the indictment, using 80% of the instances in
each category as training data, 10% as validation data, and the final 10% as testing data.
Lastly, we combined the data from all categories into training, validation, and testing sets to
create a dataset called TWLJP (TaiWan Legal Judgment Prediction Dataset).</p>
        <p>Figure 1 shows an example indictment, with the laws and articles highlighted in blue and
green, respectively. In this instance, a suspect gave a fraud group access to his bank account,
and the group tricked the victim into wiring money to the account before withdrawing it. The
facts state that the defendant, knowing that handing over account passbooks, financial cards,
and passwords to others may serve as tools for criminals to commit fraud by transferring funds,
and not deviating from his intent of money laundering and aiding fraud, provided his passbook,
withdrawal card, and PIN code by parcel delivery to members of the fraud group, allowing them
to use the account for criminal purposes. The cited laws are the Money Laundering Control Act
and the Criminal Code; the cited articles are the crime of money laundering under Article 2,
paragraph 2 and Article 14, paragraph 1 of the Money Laundering Control Act, and the crime of
aiding fraudulent acquisition of property under the first part of Article 30, paragraph 1 and
Article 339, paragraph 1 of the Criminal Code. Although both the Money Laundering Control Act
and the Criminal Code were allegedly violated, the indictment listed only fraud (詐欺) as the
charge.</p>
        <p>TWLJP dataset statistics: 342,754 cases, 33 laws, 165 articles, and 94 charges.</p>
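        <p>The stratified 80/10/10 split described above can be sketched as follows (a minimal pure-Python sketch with hypothetical toy cases, not the authors’ preprocessing code):
```python
import random
from collections import defaultdict

def stratified_split(cases, train=0.8, valid=0.1, seed=42):
    """Split cases into train/valid/test while preserving the charge distribution."""
    rng = random.Random(seed)
    by_charge = defaultdict(list)
    for case in cases:
        by_charge[case["charge"]].append(case)
    splits = {"train": [], "valid": [], "test": []}
    for group in by_charge.values():
        rng.shuffle(group)
        n_train = int(len(group) * train)
        n_valid = int(len(group) * valid)
        splits["train"].extend(group[:n_train])
        splits["valid"].extend(group[n_train:n_train + n_valid])
        splits["test"].extend(group[n_train + n_valid:])
    return splits

# Hypothetical toy data: 10 fraud and 10 money-laundering indictments.
cases = ([{"id": i, "charge": "fraud"} for i in range(10)]
         + [{"id": i + 10, "charge": "money_laundering"} for i in range(10)])
splits = stratified_split(cases)
```
Splitting each charge category separately keeps rare charges represented in all three subsets.</p>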
      </sec>
      <sec id="sec-2-4">
        <title>3.2. Problem Formulation</title>
        <p>Let D = (x_1, x_2, ⋯, x_n) denote a dataset with n cases, where each case x_i is described
by a sequence of m words x_i = (w_1, w_2, ⋯, w_m). Each case x_i is associated with three
label vectors: a vector of laws l_i = (l_1, l_2, ⋯, l_p), a vector of articles
a_i = (a_1, a_2, ⋯, a_q), and a vector of charges c_i = (c_1, c_2, ⋯, c_r), where p, q, and r
represent the sizes of the three vectors and each component is in {0, 1}. The law and charge
labels are one-hot vectors, while the article label is a multi-hot vector of dimension q, since
a case may involve several articles.</p>
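        <p>As a concrete illustration of this formulation (a sketch with made-up label indices; the vector sizes match TWLJP’s 33 laws, 165 articles, and 94 charges):
```python
def one_hot(index, size):
    """One-hot vector: exactly one component set to 1."""
    vec = [0] * size
    vec[index] = 1
    return vec

def multi_hot(indices, size):
    """Multi-hot vector: one component set to 1 per cited item."""
    vec = [0] * size
    for i in indices:
        vec[i] = 1
    return vec

law_vec = one_hot(2, 33)                # law label: one-hot
article_vec = multi_hot([4, 17], 165)   # article label: multi-hot (several articles)
charge_vec = one_hot(10, 94)            # charge label: one-hot (primary charge only)
```
</p>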
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Proposed Models</title>
      <p>Current models like Lawformer and TopJudge, as well as other state-of-the-art Legal Judgment
Prediction (LJP) models, showcase the potential of neural network models in terms of accuracy
and efficiency in predicting legal judgments. However, it is important to acknowledge that
these models have certain limitations when applied to the legal systems of different countries.
For example, Lawformer is a pre-trained language model that utilizes legal documents from
Mainland China as training data. It has shown impressive performance on the CAIL dataset.
However, when applied to the TWLJP dataset, its performance is not as good as Chinese BERT.
The reason could be attributed to variations in legal terminology, penalties, and writing styles
of legal documents across different countries.</p>
      <p>As mentioned before, Parameter-Efficient Fine-Tuning (PEFT) is an alternative approach that
allows a model to learn a new task with minimal updates. In PEFT, a pre-trained model is
fine-tuned by selectively updating or adding a small number of parameters. Recent advancements
in PEFT techniques have demonstrated the ability to achieve performance comparable to
fine-tuning the entire model while only modifying a fraction (e.g., 0.01%) of its parameters [23].</p>
      <p>In this paper, we adopt LoRA [10] to reduce the number of trainable parameters by learning
pairs of rank-decomposition matrices while keeping the original weights frozen. The idea
behind LoRA is that when adapting a pre-trained language model to a specific task or dataset,
only a few features need to be emphasized or re-learnt, which means that the update matrix
ΔW can be a low-rank matrix. As shown in Figure 2, the update of a pre-trained weight matrix
W_0 ∈ R^(d×k) is constrained by using the low-rank decomposition W_0 + ΔW = W_0 + BA, where
B ∈ R^(d×r), A ∈ R^(r×k), and the rank r is a hyperparameter less than or equal to the minimum
of d and k. During training, W_0 remains frozen and does not receive any gradient updates,
while A and B contain the trainable parameters.</p>
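      <p>The low-rank update can be sketched numerically as follows (a NumPy sketch using the paper’s hidden size 768 and rank r = 8; illustrative only, not the Hugging Face PEFT implementation):
```python
import numpy as np

d, k, r = 768, 768, 8                   # weight shape and LoRA rank
rng = np.random.default_rng(0)

W0 = rng.standard_normal((d, k))        # pre-trained weight, frozen during training
A = rng.standard_normal((r, k)) * 0.01  # trainable low-rank factor, shape r x k
B = np.zeros((d, r))                    # trainable factor, zero-initialized so BA = 0 at start

x = rng.standard_normal(k)
h = W0 @ x + B @ (A @ x)                # forward pass: W0 x + (Delta W) x with Delta W = B A

trainable = r * (d + k)                 # parameters trained per adapted matrix
full = d * k                            # parameters a full fine-tune would update
```
With r = 8, each adapted 768×768 matrix trains 12,288 parameters instead of 589,824.</p>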
      <p>Since our LJP tasks include the prediction of laws, articles, and charges, we add three
fully connected layers on top of the CLS output, as depicted in Figure 2. By adopting this
approach with the PLM frozen, we can significantly reduce the computational resources and time
required for fine-tuning while ensuring the model’s performance is preserved.</p>
      <p>The key advantage of LoRA lies in its remarkable ability to substantially reduce the
computational resources and time necessary for fine-tuning, while maintaining the model’s
performance. This method proves particularly valuable when tackling extensive fine-tuning tasks,
such as the refinement of highly capable large models that consist of billions of parameters.</p>
    </sec>
    <sec id="sec-4">
      <title>5. Experiment</title>
      <p>In order to evaluate the performance of the TWLJP dataset that we have collected across
different pre-trained language models, we conducted training and evaluation using the following
settings:
Multi-task BERT As depicted in Figure 2, we use multi-task learning to model the prediction
of laws, charges, and articles given the criminal fact description in the indictment as input.
We utilize the Huggingface [24] Chinese pre-trained language model bert-base-chinese.
The optimizer we use for Multi-task BERT is BERT Adam with a learning rate of 1e-5, a
maximum length of 512, and a hidden size of 768 for the parameters of the pre-trained
language model.</p>
      <p>Multi-task Lawformer Lawformer [12] is a pre-trained language model based on the
CAIL-Long dataset and capable of processing documents up to 4096 characters in length. However,
since Lawformer uses the CAIL-Long dataset in simplified Chinese and our data is in
traditional Chinese, we first used the OpenCC package to convert the crime facts to
simplified Chinese before training. The optimizer we use for Multi-task Lawformer is
AdamW with a learning rate of 1e-5, a maximum length of 512, and a hidden size of 768 for
the parameters of the pre-trained language model.</p>
      <p>LoRA We utilized the LoRA implementation from Hugging Face’s PEFT package [25] and the
bert-base-chinese model to generate embeddings. In the LoRA setting, the value of r is set
to 8. The optimizer we use for LoRA is AdamW with a learning rate of 3e-4, a maximum
length of 512, and a hidden size of 768 for the parameters of the pre-trained language model.</p>
      <p>Evaluation Metric We adopt micro precision (MiP), recall (MiR), and F1 score (MiF), as well
as macro precision (MaP), recall (MaR), and F1 score (MaF), as the evaluation metrics.
Macro precision/recall/F1 are computed by averaging the per-class scores, a commonly used
metric in multi-label classification tasks.</p>
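      <p>The distinction between the micro and macro averages can be sketched as follows (pure Python with toy counts, not our evaluation code):
```python
def micro_macro_f1(per_class):
    """per_class: list of (tp, fp, fn) count triples, one per label class."""
    def f1(tp, fp, fn):
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        return 2 * p * r / (p + r) if p + r else 0.0

    # Micro: pool all counts across classes, then compute a single F1.
    tp = sum(c[0] for c in per_class)
    fp = sum(c[1] for c in per_class)
    fn = sum(c[2] for c in per_class)
    micro = f1(tp, fp, fn)

    # Macro: compute F1 per class, then average, so rare classes weigh equally.
    macro = sum(f1(*c) for c in per_class) / len(per_class)
    return micro, macro

# Toy counts: a frequent class predicted well, a rare class predicted poorly.
micro, macro = micro_macro_f1([(90, 5, 5), (1, 4, 4)])
```
The macro score is pulled down by the rare class, which is why MaF falls below MiF on our imbalanced label distribution.</p>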
      <sec id="sec-4-1">
        <title>5.1. Performance on TWLJP</title>
        <p>To evaluate the performance of the TWLJP dataset across different models, we conducted
experiments using the models introduced in the previous section. The performance of TWLJP on
each model is shown in Tables 3, 4, and 5. In each experiment, we selected the epoch with the
best performance on the validation dataset and tested on the testing dataset. The performance
shown in the tables is the average over five runs, reported together with twice the standard
deviation.</p>
        <p>We conducted the experiments using the GeForce RTX 4070 Ti graphics card, and the
training time for each model for one epoch, as well as the parameter information of the models, are
presented in Table 6.</p>
        <p>Based on the experimental results, it is evident that the performance of models implemented
using the Lawformer pre-trained language model did not meet our expectations. Upon
analysis, we determined that the reason behind this discrepancy lies in the fact that Lawformer
was trained on legal documents from mainland China. Despite our efforts to convert the input
criminal facts from Traditional Chinese to Simplified Chinese, there are significant differences
between the legal systems and terminologies used in mainland China and Taiwan. This
mismatch in legal terminology and usage negatively impacted the performance of Lawformer on
the TWLJP dataset.</p>
        <p>Under the LoRA training architecture, performance comparable to Multi-task BERT is
achieved on charges, articles, and laws, and in some cases even superior performance. The
training time for one epoch is 1 hour and 58 minutes, approximately half the time required by
Multi-task BERT (3 hours and 49 minutes). Regarding the parameter count, Multi-task BERT has a
total of 102,716,744 parameters, all of which need adjustment. In the LoRA architecture, the
total number of parameters is 103,011,656, but only 744,008 of them need to be trained, which
is approximately 0.72% of the trainable parameters in Multi-task BERT.</p>
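        <p>The trainable-parameter ratio reported above follows directly from the counts; as a quick check:
```python
bert_total = 102_716_744   # Multi-task BERT: every parameter is updated
lora_trainable = 744_008   # LoRA: only the A/B matrices and task heads are updated

ratio_percent = lora_trainable / bert_total * 100  # approximately 0.72
```
</p>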
        <table-wrap id="tab-article">
          <caption>
            <p>Performance on the Article sub-task of TWLJP.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Model</th><th>MiP</th><th>MiR</th><th>MiF</th><th>MaP</th><th>MaR</th><th>MaF</th></tr>
            </thead>
            <tbody>
              <tr><td>Multi-task BERT</td><td>84.1</td><td>85.7</td><td>84.9</td><td>79.0</td><td>71.6</td><td>73.4</td></tr>
              <tr><td>Multi-task Lawformer</td><td>79.3</td><td>79.8</td><td>79.6</td><td>70.9</td><td>59.6</td><td>62.7</td></tr>
              <tr><td>LoRA (r=8)</td><td>84.6</td><td>86.9</td><td>85.8</td><td>79.9</td><td>71.6</td><td>73.9</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">
        <title>5.2. Performance on CAIL</title>
        <p>To ensure fairness in our experiments, we also utilized the publicly available CAIL dataset.
We conducted multi-task training on the dataset, focusing on the charges and articles. The
performance of each model is shown in Tables 7 and 8. Since the main objective of our experiment
was to compare the performance, time, and parameters of large language models, we did not
compare them to other related models. For each experiment, we selected the epoch with the
best performance on the validation dataset and tested it on the test dataset. We conducted
the experiments using the GeForce RTX 4070 Ti graphics card, and the training time for each
model per epoch and the parameter information are provided in Table 9.</p>
        <p>From the experimental results, it can be observed that the performance of the Lawformer
pre-trained language model did not meet expectations. Upon analyzing the reasons, we note that
although Lawformer was trained on legal documents from mainland China, it is based on the
Longformer architecture, which allows for input lengths of up to 4096 tokens. However, we
used a maximum length of 512 tokens, as increasing this maximum length would lead to
insufficient memory on the graphics card. As a result, some of the model’s weights were not
updated, leading to poor training performance.</p>
        <p>On the other hand, under the LoRA training framework, performance comparable to
Multi-task BERT was achieved for charges and articles. The training time for one epoch was 1 hour
and 14 minutes, compared to 2 hours and 19 minutes for Multi-task BERT, requiring
approximately half the time. In terms of parameter count, LoRA required only approximately 0.86%
of the trainable parameters of Multi-task BERT.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <p>Legal AI plays a crucial role in providing legal knowledge services to individuals with legal
backgrounds, as well as assisting non-legal professionals. However, due to the diverse range
of sub-tasks in Legal AI and the increasing size of pre-trained language models, training and
storing a separate language model for each sub-task can be costly and resource-intensive. To
address this challenge, we have embraced the concept of Parameter-Efficient Fine-Tuning (PEFT)
and applied it to the field of Legal AI.</p>
      <p>By leveraging the PEFT approach, specifically through the implementation of the LoRA
architecture, we have observed promising results in fine-tuning pre-trained language models.
This approach allows us to achieve comparable, if not superior, performance while significantly
reducing the time required for model adjustments. In our experiments, we found that using the
LoRA framework required only about half the time compared to fine-tuning the entire model,
without sacrificing performance. This methodology opens up new possibilities for
adapting language models to different legal contexts efficiently.</p>
      <p>The success of our approach highlights the potential of PEFT techniques in the Legal AI
domain. By efficiently adjusting and fine-tuning language models, we can tailor them to specific
legal frameworks, taking into account the variations in legal definitions, documents, and
terminologies across different countries. This advancement not only enhances the accuracy and
relevance of legal knowledge services but also extends the accessibility of Legal AI to
individuals without a legal background.</p>
      <p>[7] H. Zhong, Z. Guo, C. Tu, C. Xiao, Z. Liu, M. Sun, Legal judgment prediction via
topological learning, in: Proceedings of the 2018 Conference on Empirical Methods in Natural
Language Processing, Association for Computational Linguistics, Brussels, Belgium, 2018,
pp. 3540–3549. URL: https://aclanthology.org/D18-1390. doi:10.18653/v1/D18-1390.
[8] N. Xu, P. Wang, L. Chen, L. Pan, X. Wang, J. Zhao, Distinguish confusing law
articles for legal judgment prediction, in: Proceedings of the 58th Annual Meeting
of the Association for Computational Linguistics, Association for Computational
Linguistics, Online, 2020, pp. 3086–3095. URL: https://aclanthology.org/2020.acl-main.280.
doi:10.18653/v1/2020.acl-main.280.
[9] Q. Bao, H. Zan, P. Gong, J. Chen, Y. Xiao, Charge prediction with legal attention, in:
Natural Language Processing and Chinese Computing: 8th CCF International Conference,
NLPCC 2019, Dunhuang, China, October 9–14, 2019, Proceedings, Part I, Springer-Verlag,
Berlin, Heidelberg, 2019, p. 447–458. URL: https://doi.org/10.1007/978-3-030-32233-5_35.
doi:10.1007/978-3-030-32233-5_35.
[10] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, Lora: Low-rank
adaptation of large language models, arXiv preprint arXiv:2106.09685 (2021).
[11] H. Zhong, C. Xiao, C. Tu, T. Zhang, Z. Liu, M. Sun, How does NLP benefit legal
system: A summary of legal artificial intelligence, in: Proceedings of the 58th Annual
Meeting of the Association for Computational Linguistics, Association for Computational
Linguistics, Online, 2020, pp. 5218–5230. URL: https://aclanthology.org/2020.acl-main.466.
doi:10.18653/v1/2020.acl-main.466.
[12] C. Xiao, X. Hu, Z. Liu, C. Tu, M. Sun, Lawformer: A pre-trained language model for
Chinese legal long documents, AI Open 2 (2021) 79–84. URL:
https://www.sciencedirect.com/science/article/pii/S2666651021000176. doi:10.1016/j.aiopen.2021.06.003.
[13] N. Aletras, D. Tsarapatsanis, D. Preoţiuc-Pietro, V. Lampos, Predicting judicial decisions
of the european court of human rights: A natural language processing perspective, PeerJ
Computer Science 2 (2016) e93.
[14] I. Chalkidis, I. Androutsopoulos, N. Aletras, Neural legal judgment prediction in English,
in: Proceedings of the 57th Annual Meeting of the Association for Computational
Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 4317–4323.</p>
      <p>URL: https://aclanthology.org/P19-1424. doi:10.18653/v1/P19- 1424.
[15] R. A. Shaikh, T. P. Sahu, V. Anand, Predicting outcomes of legal cases based on legal
factors using classifiers, Procedia Computer Science 167 (2020) 2393–2402. URL: https://
www.sciencedirect.com/science/article/pii/S1877050920307584. doi:10.1016/j.procs.2020.03.292.
International Conference on Computational Intelligence and Data Science.
[16] L. Kang, J. Liu, L. Liu, D. Ye, Label definitions augmented interaction model for
legal charge prediction, in: Advances in Information Retrieval: 43rd European
Conference on IR Research, ECIR 2021, Virtual Event, March 28 – April 1, 2021, Proceedings,
Part I, Springer-Verlag, Berlin, Heidelberg, 2021, pp. 270–283. URL: https://doi.org/10.1007/
978-3-030-72113-8_18. doi:10.1007/978-3-030-72113-8_18.
[17] B. Luo, Y. Feng, J. Xu, X. Zhang, D. Zhao, Learning to predict charges for criminal cases
with legal basis, in: Proceedings of the 2017 Conference on Empirical Methods in
Natural Language Processing, Association for Computational Linguistics, Copenhagen,
Denmark, 2017, pp. 2727–2736. URL: https://aclanthology.org/D17-1289. doi:10.18653/v1/D17-1289.
[18] L. Gan, K. Kuang, Y. Yang, F. Wu, Judgment prediction via injecting legal knowledge into
neural networks, Proceedings of the AAAI Conference on Artificial Intelligence 35 (2021)
12866–12874. URL: https://ojs.aaai.org/index.php/AAAI/article/view/17522.
[19] W. Yang, W. Jia, X. Zhou, Y. Luo, Legal judgment prediction via multi-perspective
bifeedback network, in: Proceedings of the Twenty-Eighth International Joint Conference
on Artificial Intelligence, IJCAI-19, International Joint Conferences on Artificial
Intelligence Organization, 2019, pp. 4085–4091. URL: https://doi.org/10.24963/ijcai.2019/567.
doi:10.24963/ijcai.2019/567.
[20] N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. De Laroussilhe, A. Gesmundo, M.
Attariyan, S. Gelly, Parameter-efficient transfer learning for NLP, in: International
Conference on Machine Learning, PMLR, 2019, pp. 2790–2799.
[21] X. L. Li, P. Liang, Prefix-tuning: Optimizing continuous prompts for generation, in:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and
the 11th International Joint Conference on Natural Language Processing (Volume 1: Long
Papers), Association for Computational Linguistics, Online, 2021, pp. 4582–4597. URL:
https://aclanthology.org/2021.acl-long.353. doi:10.18653/v1/2021.acl-long.353.
[22] J. He, C. Zhou, X. Ma, T. Berg-Kirkpatrick, G. Neubig, Towards a unified view of
parameter-efficient transfer learning, in: ICLR, 2022.
[23] H. Liu, D. Tam, M. Muqeeth, J. Mohta, T. Huang, M. Bansal, C. A. Raffel, Few-shot
parameter-efficient fine-tuning is better and cheaper than in-context learning, Advances
in Neural Information Processing Systems 35 (2022) 1950–1965.
[24] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf,
M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. L.
Scao, S. Gugger, M. Drame, Q. Lhoest, A. M. Rush, Transformers: State-of-the-art
natural language processing, in: Proceedings of the 2020 Conference on Empirical
Methods in Natural Language Processing: System Demonstrations, Association for
Computational Linguistics, Online, 2020, pp. 38–45. URL: https://www.aclweb.org/anthology/
2020.emnlp-demos.6.
[25] S. Mangrulkar, S. Gugger, L. Debut, Y. Belkada, S. Paul, PEFT: State-of-the-art
parameter-efficient fine-tuning methods, https://github.com/huggingface/peft, 2022.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <article-title>A comparative study of automated legal text classification using random forests and deep learning</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>59</volume>
          (
          <year>2022</year>
          )
          <fpage>102798</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0306457321002764. doi:10.1016/j.ipm.2021.102798.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>CAIL2018: A large-scale legal dataset for judgment prediction</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1807.02478. doi:10.48550/arXiv.1807.02478.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>Improving legal judgment prediction through reinforced criminal element extraction</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>59</volume>
          (
          <year>2022</year>
          )
          <fpage>102780</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0306457321002600. doi:10.1016/j.ipm.2021.102780.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <article-title>Interpretable charge predictions for criminal cases: Learning to generate court views from fact descriptions</article-title>
          ,
          <source>in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (
          Long Papers)
          ,
          <source>Association for Computational Linguistics</source>
          , New Orleans, Louisiana,
          <year>2018</year>
          , pp.
          <fpage>1854</fpage>
          -
          <lpage>1864</lpage>
          . URL: https://aclanthology.org/N18-1168. doi:10.18653/v1/N18-1168.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Legal judgment prediction with multi-stage case representation learning in the real court setting</article-title>
          ,
          <source>in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR '21, Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>993</fpage>
          -
          <lpage>1002</lpage>
          . URL: https://doi.org/10.1145/3404835.3462945. doi:10.1145/3404835.3462945.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Few-shot charge prediction with discriminative legal attributes</article-title>
          ,
          <source>in: Proceedings of the 27th International Conference on Computational Linguistics</source>
          , Association for Computational Linguistics, Santa Fe, New Mexico, USA,
          <year>2018</year>
          , pp.
          <fpage>487</fpage>
          -
          <lpage>498</lpage>
          . URL: https://aclanthology.org/C18-1041.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>