<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>X i v .</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Management to Intelligence Engineering - A practical approach to building AI inside the law-firm using open-source Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Uwais Iqbal</string-name>
          <email>uwais.iqbal@simplexico.ai</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Large Language Models, Legal AI, Natural Language Processing, Generative AI, Foundational Language Model, Knowledge</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Simplexico Limited</institution>
          ,
          <addr-line>33 Carnarvon Road, London, E10 6DW</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>2</volume>
      <issue>2</issue>
      <fpage>360</fpage>
      <lpage>371</lpage>
      <abstract>
        <p>Open-source foundational language models unlock a new opportunity for building AI inside the law firm. In this paper, we explore the diferent options in the AI build vs buy equation facing law firms and outline four postures across the spectrum of building AI. We motivate a particular posture that leverages open-source foundational models in a way that both mitigates data privacy and security concerns, while enabling customisation of these models with internal data. We explore the diferent ways in which these models can be fine-tuned and present a novel addition of intelligence engineering to the traditional knowledge management process that involves instruction fine-tuning language models to infinitely scale access to explicit knowledge. We provide a practical demonstration of this technical approach with a proof of concept using an open-source foundational model based on the GPT-3 architecture and an open-source dataset of contracts. We also provide a qualitative analysis of results.</p>
      </abstract>
      <kwd-group>
        <kwd>Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>The recent wave of hype surrounding Generative AI and</title>
        <p>
          Large Language Models (LLMs) has captured the
collective imagination of the legal industry. Following the
launch of ChatGPT [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], GPT-3, and GPT-4, [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] there
has been considerable interest in exploring how this new
wave of AI could bring transformation to the legal
industry. Reports and predictions estimate that up to 40% of
legal work could be displaced by AI systems [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          Despite the newfound enthusiasm of the legal sector
around AI, the practical constraints around actually
doing technology in the legal sector still remain as relevant
as ever. In a conservative and risk-averse industry like
legal, concerns around data privacy, security and
confidentiality dictate the pace of adoption, transformation
and innovation [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. There is a balance to be struck in
driving innovation with technology in the sector while
staying faithful to these justified concerns.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>The scope of this paper is to explore whether Large</title>
      </sec>
      <sec id="sec-1-3">
        <title>Language Models could be used in law firms, and if so, how this may be realised technically. This paper attempts to provide a practical middle path to the future of AI</title>
        <p>ligence and Intelligent Assistance for Legal Professionals in the Digital
Workplace (LegalAIIA 2023), held in conjunction with ICAIL 2023,
0009-0008-3384-6383 (U. Iqbal)</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. AI Build vs Buy - The Options</title>
      <p>The build vs buy distinction is actually more of a
spectrum when it comes to AI. With traditional software
engineering there were two extremes - either build the
product yourself or go buy it from the market.
Developing software is a relatively simple process that doesn’t
involve lots of moving parts.</p>
      <p>AI is fundamentally diferent. With AI there are a
number of core components involved across three
different contexts. There are three main components: a)
the underlying code for the algorithm, b) the data and</p>
      <sec id="sec-2-1">
        <title>c) the compute resources where all of these elements</title>
        <p>together to give rise to the derived product; the model.</p>
        <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License are combined to create the model. The three parts come
There are also three diferent stages to creating AI which
are relevant to the diferent build vs buy options:
1. Pre-training - In the pre-trTaHiniEng AstaIgBe, tUheIcLoDde
for the algorithm is used with data in a compute
resource to train a model. SPECTRUM
2. Fine-tuning - In the fine-tuning stage, the
pretrained model is further refined with additional
data using the code for the algorithm in a compute
resource to create a more refined model focused
for a particular domain or for a particular task.
3. Serving - In the serving stage, the model is served
in a compute environment and packaged in an
API so it can be called o nDIAGbRAyM KoEYtherOpseno-so uftwrce areVensdorer- Internal
vices and the AI capability can be consumed.</p>
        <sec id="sec-2-1-1">
          <title>2.1. Buying AI</title>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Buying AI can be understood as purchasing a specific</title>
        <p>AI-enabled application from a vendor. Just as one might
purchase an electrical appliance like a toaster or a
kettle from a supermarket. These AI-enabled applications
perform particular tasks and fulfill a certain defined set
of needs. Examples of such AI-enabled applications in
the legal sector include oferings from vendors such as
Harvey, Spellbook, Kira etc.</p>
        <sec id="sec-2-2-1">
          <title>2.2. The AI Build Spectrum</title>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Building AI can be understood as going one level higher</title>
        <p>to interact with the underlying technology through APIs
or actual code. AI can be built through four diferent
postures across a spectrum that spans from one extreme
of a consumer posture to another extreme of a creator
posture. The four postures are as follows:
PRE-TRAINING FINE-TUNING
+ Data + BaseModel
+ Code + Code
+ Compute + Compute
= BaseModel + Data
= FMinoed-etluned</p>
        <p>SERVING
Model</p>
        <p>Compute
CONSUMER
POSTURE</p>
        <p>PRE-TRAINING FINE-TUNING
+ Data + BaseModel
+ Code + Code
+ Compute + Compute
= BaseModel + Data
= FMinoed-etluned
PRE-TRAINING FINE-TUNING
+ Data + BaseModel
+ Code + Code
+ Compute + PCoremmp/uCtloeu(Od)n
= BaseModel + Data
= FMinoed-etluned</p>
        <p>CREATOR
CUSTOMISER
POSTURE</p>
        <p>SERVING PRE-TRAINING FINE-TUNING
Model + Data + BaseModel
PCoremmp/uCtloeu(Od)n + Code + Code
+ PCroemmp/uCtloeu(Od)n + PCroemmp/uCtloeu(Od)n
= BaseModel + Data
= FMinoed-etluned</p>
        <p>SERVING
Model</p>
        <p>Compute
CONSUMER
CUSTOMISER
POSTURE</p>
        <p>SERVING
Model
Compute(On
Prem/Cloud)
CREATOR
POSTURE
the consumer posture would involve plugging into APIs
ofering AI services such as Azure Cognitive Services or
Open AI to use a model for a particular task.</p>
        <p>The Consumer Posture is easy to get started with and
no AI skills are needed. The services are all managed
by the vendor so there are no technical or infrastructure
concerns to worry about. However, under the Consumer
Posture there is no control over the AI process so there
is no efective way to mitigate bias and risk. The models
available through such services tend to be too generic and
general purpose to be efective and useful for specialised
tasks in the legal domain.
2.2.2. The Consumer Customiser Posture</p>
      </sec>
      <sec id="sec-2-4">
        <title>1. The Consumer Posture 2. The Consumer Customiser Posture 3. The Creator Customiser Posture 4. The Creator Posture</title>
        <p>The Consumer Customiser Posture is similar to the
Consumer Posture. The vendor provides fine-tuning as a
service ofering and allows an organisation to customise and
ifne-tune the underlying model with their own data. The
vendor still takes care of everything from pre-training,</p>
        <p>Figure 1 outlines a visual representation of the postures fine-tuning and serving. A typical use with the
Conacross the AI build spectrum and how the diferent com- sumer Customiser Posture would involve fine-tuning a
ponents of data, code, compute and model are distributed model with an organisation’s internal data through an
across open-source, vendor and internal management. API service like Open AI so the model is more familiar
with specific domain language.
2.2.1. The Consumer Posture Much like the Consumer Posture, the Consumer
CusIn the Consumer Posture, an organisation acts as a con- tomiser Posture is easy to get started with and is ofered
sumer of AI and the pre-training, fine-tuning and serv- as a managed service meaning all the technical
infrasing stages are all managed by a vendor. The Consumer tructure work is taken care of by the vendor. However,
Posture enables an organisation to just get on with con- this posture involves sharing internal data with a vendor
suming and integrating AI into their applications. The so raises risks and concerns around data privacy and
sevendor’s job is to worry about training and serving the curity. It also raises concerns and questions around the
AI system. The organisation just acts as a consumer and IP ownership of the fine-tuned model managed by the
can build applications around it. A typical use case with vendor.
building to explore general purpose AI applications as
well as building their own AI systems with internal data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. The State of Afairs Open-source Large Language Models</title>
      <p>Over recent months, there has been something of a
revolutionary movement in the open-source community.
Since the release of ChatGPT, the open-source
community have been active in replicating the capabilities of
closed-source models. The data and hardware
requirements for creating foundational language models were
previously significant barriers to entry. Only
organisations in a privileged position could create such
foundational language models and then proceed onto the later
phase of development. The ability to create such models
were in the hands of the few.</p>
      <p>
        The recent wave of activity in the open-source
community has significantly changed this dynamic. With
the release of open-source foundational model suites like
LLaMa [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], pythia [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Cerebras-GPT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], StabilityLM [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
and MPT-7B [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] these base models are now publicly
accessible. The ability to customise models and continue
through phases of development are in the hands of
everyone. While some of the open-source base models have
been released under licenses only permitting academic
use, some models are available for commercial use.
Table 1 outlines the licenses for some of these open-source
foundational models and whether they are available for
commercial use.
2.2.3. The Creator Customiser Posture
In the Creator Customiser Posture, the posture changes
from one of a consumer to a creator. In this posture,
open-source pre-trained models are utilised for
commercial purposes. The open-source base models are brought
inside an organisation and further fine-tuned on a
combination of open-source and internal data. Under this
posture the pre-training is performed by another party
who then makes their pre-trained model publicly
available under an open-source license. The fine-tuning and
serving phases are managed internally so infrastructure
is needed to facilitate the compute resources in the form
of either on-premise or cloud resources.
      </p>
      <p>A typical use case for the Creator Customiser
Posture would involve bringing an open-source foundational
model in house and fine-tuning it on internal data and
serving it internally for internal applications.</p>
      <p>The Creator Customiser Posture is attractive in that
it doesn’t require the sharing of data with a third party.
There is more control over the AI creation process so it
becomes easier to mitigate for bias and risk as well as put
in place controls for safety and governance. The IP of
finetuned models is owned by the organisation. However,
this posture requires infrastructure to be managed and
maintained internally for the compute resources in the
ifne-tuning and serving phases. It also requires some
level of specialised skill set around AI.
2.2.4. The Creator Posture</p>
      <sec id="sec-3-1">
        <title>The Creator Posture goes a step further than the Creator</title>
        <p>Customiser Posture. Instead of relying on a third party to
perform the pre-training phase to create the pre-trained
model, in the Creator Posture the pre-training phase is Name Provider Pa- License
Combrought inside the organisation. There is no dependence ram-
meron a third party and all three phases of pre-training, fine- eters cial
tuning and serving are managed internally. Use</p>
        <p>The Creator Posture is the polar opposite of the Con- LLaMa bFaocoek- 65B dAecma-ic No
sumer Posture and sits on the other end of the AI build Research Use Only
spectrum. There is complete control over the process of pythia EleutherAI 70M- Apache Yes
creating AI so there the ability to have full provenance 12B 2.0
of data. There is a complete transparency and the digital Cerebras- Cerebras 111M- Apache Yes
footprint of the models can be traced back to their origins. GPT 13B 2.0
Mitigating for bias, risk and putting in place controls for Stabil- Stabil- 3B/7B CC Yes
safety and governance is made much more accessible. ityLM ityAI BY-SA</p>
        <p>Just like the Creator Customiser Posture, a specialised 4.0
skillset around AI is needed and the infrastructure for pre- MPT-7B Mo- 7B Apache Yes
training, fine-tuning and serving all have to be managed saicML 2.0
internally. Table 1</p>
        <p>These four postures present the diferent ways an or- Open-source foundational language models with their
paramganisation can go about building AI. The options of buy- eter sizes and licenses
ing AI and the four postures of building AI are not
mutually exclusive. An organisation can develop a
complimentary strategy across buying and the four postures of</p>
      </sec>
      <sec id="sec-3-2">
        <title>The allure of open-source models for law firms is that they can be used with the Creator Customiser Posture.</title>
      </sec>
      <sec id="sec-3-3">
        <title>These models can be brought inside the law firm and</title>
        <p>developed further on internal data mitigating the risks
around data privacy and security while still enabling
access to cutting-edge technology. These developments
around open-source foundational language models now
mean that AI can be brought inside the law firm to build
AI systems that power use cases inside the law firm.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Layers of Fine-tuning Customising Open-source Language Models</title>
      <sec id="sec-4-1">
        <title>The first two levels of fine-tuning create domain</title>
        <p>specific functionality while the last is more aesthetic.
These layers of fine-tuning can be performed with a
combination of internal and open-source datasets for
maximum learning. There are already a growing number
of such open-source datasets for instruction response
ifne-tuning and RHLF available for commercial use [ 14].
Open-source datasets can be combined with internal
datasets and stacked in a modular fashion to create the
desired intelligence capabilities within the language model.</p>
        <sec id="sec-4-1-1">
          <title>4.1. From Knowledge Management to Intelligence Engineering</title>
          <p>An open-source foundational language model can be
further fine-tuned to customise the model with
domainspecific and internal data. There are three distinct layers
of fine-tuning that can be performed with language
models.</p>
          <p>
            The Instruction Response fine-tuning approach is of
particular importance for law firms. Knowledge
management (KM) can be defined as the ”tools, techniques, and
strategies to retain, analyse, organise, improve, and share
business experience” [15]. Within the context of a law
1. Unsupervised Fine-Tuning - With a domain- firm, knowledge management involves ”a firm’s ability
specific corpus of raw unstructured text, the lan- to identify, capture, and leverage the internal knowledge
guage model can be fine-tuned to learn the partic- of individuals” to ”enhance the ability of all law firm staf
ular nuances and quirks of legal language. This to create and share knowledge across the firm and to
can be made even more specific by focusing on provide excellent client services and to compete in an
a particular practice area or a particular area of increasingly aggressive professional legal services
envilaw. The data requirements for unsupervised fine- ronment” [16].
tuning are not restrictive. All that is needed is a Knowledge management is based on three
fundamencorpus of raw text documents which should be tal concepts: a) data, b) information, and c) knowledge
relatively easy to find within a law firm. The data [17]. Data is understood as the raw resource without
does not even need to be structured as only the context. Information is understood as data with context
raw text is needed. that is able to provide value. Knowledge is understood as
2. Instruction Response Fine-tuning - Recasting information combined with understanding and capability.
structured data into tasks consisting of instruc- The distinction between knowledge and information can
tion response pairs results in language models be clarified as ”knowledge being a personal subjective
being able to generalise across unseen tasks re- process emerging from previous experiences, while
inally well [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ]. Table 4 in the appendix shows formation is objective data about the environment” [18].
some examples of instruction response pairs. The Knowledge lives in the minds of people and is
anthropobase language model can be fine-tuned on pairs of morphic while information is not.
instructions and responses to create an ability to Knowledge can further be broken down into two main
follow particular instructions and commands [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ]. types: a) Tacit Knowledge, and b) Explicit Knowledge
This creates an ability for the model to generalise [15]. Tacit knowledge refers to personal knowledge
emacross diferent tasks by learning how to com- bedded in individual experience while explicit knowledge
plete the instruction on unseen data in a zero-shot refers to tacit knowledge that has been documented. One
fashion. There are some requirements around the of the challenges in knowledge management is the
difidata for instruction response fine-tuning since culty in capturing tacit knowledge. One of the key
functhe data has to be structured from raw text into tions of a knowledge management strategy is to make
instruction response pairs. the tacit explicit so that it can be easily transferred and
3. Reinforcement Human Learning Feedback communicated from one individual to another.
(RHLF) - The third phase involves creating a We can introduce a fourth related concept of
intellimore human-like interface to the language model gence to the fundamental concepts of knowledge
manby using Reinforcement Learning to teach the agement. Intelligence can be defined as the ability to
model how to converse as a human [13]. This acquire knowledge and skills. As such, intelligence can
creates a conversational layer with the language be possessed by humans as natural intelligence and by
model that allows it to be interacted with as a machines in the form of artificial intelligence (AI). By
inchat bot. cluding intelligence within the fundamental concepts of
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Results and Findings</title>
      <p>knowledge management, AI can be adopted to achieving
and enhance the objectives of knowledge management.</p>
      <p>Knowledge management within law firms has
traditionally focused on capturing the tacit knowledge from 5.1. Experimental Setup
the minds of highly skilled and experienced individuals
into explicit knowledge in the form of written content. To demonstrate how an open-source foundational
lanThe explicit knowledge captured in content allows this guage model can be leveraged inside a law firm we
pracexpertise to be shared with and accessed by other individ- tically demonstrate how the Creator Customiser Posture
uals in the firm. However, one of the practical challenges can be used with layered unsupervised and instruction
with knowledge management is how to make this explicit response fine-tuning.
knowledge easily accessible, retrievable and consumable We first select a base foundational model. The
to other individuals within the organisation. Cerebras-GPT model suite contains 7 GPT-3 models
rang</p>
      <p>
        Traditionally knowledge management seeks to take ing from 111M up to 13B in parameter size [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The
tacit knowledge from the mind of a human and capture models were created by training on The Pile dataset [19].
it as explicit knowledge in the form of written content so These models are licensed under the Apache 2.0 license
that it can be consumed by another human. The process and are available for commercial use.
of knowledge transfer goes from human (tacit knowl- Taking a pragmatic approach we focus on the smaller
edge) to content (explicit knowledge) to human (tacit parameter size variants of the Cerebras-GPT model suite.
knowledge). With the advent of AI, large language mod- In particular we work with the 590M parameter variant.
els and instruction response fine-tuning, we propose in- Our goal is to provide a pragmatic demonstration of the
telligence engineering as an additional step in the process. approach rather than optimising for performance.
Rather than going from human to content to human, In order to proxy law firm internal data, we use two
we propose introducing a machine into the process; go- open-source datasets related to contracts. Both are in
ing from human (tacit knowledge) to content (explicit the public domain and are available under the CC-BY 4.0
knowledge) to machine (intelligence) to human (tacit license. These datasets are as follows:
knowledge).
      </p>
      <p>Practically, this would involve much of the same pro- 1. Contract Understanding Atticus Dataset
cesses around knowledge management as before but with (CUAD) [20] - The CUAD dataset consists of 510
a few additional steps. Tacit knowledge from the mind commercial legal contracts with over 13,000+
laof a human can be captured as explicit knowledge in the bels that have been manually labelled under the
form of an instruction response dataset. Existing explicit supervision of experienced lawyers. The
annoknowledge content can easily be restructured into an tations identify legal clauses that are considered
instruction response format. Such a dataset can be used important in contract review in connection with
to fine-tune a large language model with the Creator a corporate transaction, including mergers
acquiCustomiser posture to create protected and privileged sitions, etc.
intelligence within a law firm. Then, individuals in the 2. Merger Agreement Understanding Dataset
ifrm can interface with the large language model to re- (MAUD) [21] - The MAUD dataset consists of
trieve, access and query the intelligence captured. This 152 merger agreements with over 47,000+ labels
additional step of intelligence engineering removes exist- that have been manually labeled under the
suing bottlenecks around accessing and retrieving explicit pervision of experienced lawyers to identify 92
knowledge. In the context of knowledge management questions in each agreement used by the 2021
within a law firm, a large language model efectively cre- American Bar Association (ABA) Public Target
ates infinite scale in providing access to explicit knowl- Deal Points Study.
edge. While explicit knowledge has always been static We combine the raw text from the contracts and
agreein the form of content, large language models transform ments in the CUAD and MAUD datasets to create a
this content to intelligence that is dynamic, scalable and dataset for unsupervised fine-tuning. The resulting
comeasily accessible. bined dataset consists of 662 documents with 1.8M tokens.</p>
      <p>
        We explore and evaluate this approach by layering un- This dataset acts as a proxy for documents and
unstrucsupervised fine-tuning and instruction fine-tuning with tured text which may sit inside a document management
the Creator Customiser Posture using an open-source system at a law firm.
foundational language model and a publicly available Taking inspiration from FLAN [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], we mine the labels
dataset of contracts. in the CUAD dataset to produce a collection of 8,000+
instruction response pairs. The pairs span a number of
legal specific tasks including drafting, classification and
extraction. This dataset acts as a proxy for internal data
contained in a precedent bank or within explicit knowl- We take the fine-tuned model and use it to further
edge content and practice notes. With a little mining, any perform instruction fine-tuning using the 8,000+
instrucinternal structured textual data can be reconstructed in tion response pairs mined from the CUAD dataset. We
the form of instruction response pairs for use with large report the following results after 3 epochs of instruction
language models. Table 4 in the appendix shows exam- response fine-tuning:
ples of these instruction response pairs created from the
CUAD dataset. We perform the experiments on commod- 14.09
ity hardware in the form of a single Nvidia A100 GPU
with 80GB of RAM.
      </p>
      <sec id="sec-5-1">
        <title>5.2. Results</title>
        <p>Using the base model, we perform two stages of
finetuning:
1. Unsupervised Fine-Tuning - Using the dataset
of 1.8M tokens created by combining the raw text
from the CUAD and MAUD datasets, we fine-tune
the model in an unsupervised manner. Through
this stage the model learns the nuances of legal
language.
2. Instruction Fine-Tuning - Using the 8,000+
instruction response pairs created from mining the
CUAD data, we further fine-tune the model with
these pairs. The model learns how to perform
these legal specific tasks.</p>
        <sec id="sec-5-1-1">
          <title>For both datasets we use an 80:20 split to create the</title>
          <p>training and testing sets. Using the 590M parameter
variant from the Cerebras-GPT model suite as the base
model, we run the unsupervised fine-tuning with the
combined text from the CUAD and MAUD datasets.</p>
          <p>We compare the quality of the fine-tuned language
models by using perplexity as an intrinsic evaluation
metric. The perplexity score captures the average
number of words that can be encoded. In more concrete
terms, a perplexity score of 4 means that when trying to
guess the next word, the model is as confused as if it had
to pick between 4 diferent words. A lower perplexity
score means that the language model is more precise at
predicting words and is better.</p>
          <p>We report the following results after 3 epochs of
unsupervised fine-tuning:</p>
          <p>Perplexity on Test Set
(Before fine-tuning)
Train Loss
Test Loss
Perplexity on Test Set (After
fine-tuning)
Duration
Cost</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>5.3. Findings</title>
        <p>The significant diference in perplexity scores before
and after fine-tuning indicate that massive amounts of
domain-specific data isn’t needed to efectively fine-tune
open-source models. This also demonstrates that
finetuning can be performed with reasonable volumes of
data at a reasonable cost in a reasonable time frame. We
can estimate order of magnitude data requirements for
the diferent fine-tuning layers. For unsupervised
finetuning on the order of hundreds of documents are needed
while for instruction response fine-tuning on the order
of thousands of instruction response pairs are needed.</p>
        <p>We posit that since the language in the domain is
more specific and standardised, it is easier for the model
to learn the nuances of legal language. As opposed to
generic pre-training datasets like Pile [19], there is less
variability in the language so it is easier for the model
to learn. Further work is needed to evaluate and
quantify the efectiveness of such fine-tuned large language
models by way of extrinsic evaluation to quantify
performance on downstream tasks. Further investigation is
also needed to understand the efect of parameter size on
the performance of fine-tuning on legal domain specific
language and tasks.</p>
        <p>The results also indicate that fine-tuning open-source
foundational models is feasible and practical from a
variety of operational perspectives: data, cost and time.
The data requirements are not unreasonable - most law
ifrms have access to thousands of documents that can
be used for unsupervised fine-tuning. Following our
approach, internal knowledge content, structured databases
and practice notes can be mined to create instruction
response pairs for instruction response fine-tuning. The
cost of running such fine-tuning experiments is very
cheap which means that the operational expense of
experimentation is not a barrier to innovation. The time
requirements for performing fine-tuning are not prohibitive
to limit rapid prototyping and iterative development.</p>
        <p>The approach outlined by taking the Creator
Customiser Posture with open-source foundational models
and performing unsupervised and instruction-response
ifne-tuning is a methodology that can be adopted for
intelligence engineering to create an intelligence layer
to make explicit knowledge more accessible so it can be
readily be consumed by other humans. In efect, this
methodology can immediately unlock value by
advancing knowledge management objectives within law firms
through intelligence engineering.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>We have introduced the various options available when
it comes to the buy vs build question for AI. We have
outlined the four postures across the AI build spectrum and
demonstrated that the Creator Customiser posture is the
most appealing for law firms looking to leverage internal
data while mitigating risks around data privacy and
security. We have assessed the opportunities open-source
foundational language models present and outlined the
various ways in which these models can be further
reifned on internal data. We have also presented a practical
way and method in which large language models can be
introduced into knowledge management workflows in
law firms through intelligence engineering.</p>
      <p>We have carried out an early evaluation of this
approach using an open-source foundational language
model and an open-source contract dataset. While the
experimental analysis in this paper is limited to a
qualitative analysis of results, early findings indicate the value
and technical feasibility of this approach in unlocking
opportunities for building AI within the law firm. Further
experimentation is required to quantitatively evaluate
this approach on downstream tasks to provide an
objective assessment of performance. The impact of larger
parameter sizes on downstream performance is an area
for further investigation.
The following table provides some examples from the instruction response dataset that was created by mining the
CUAD dataset.</p>
      <p>Instruction
Draft a renewal term clause for a marketing afiliate agreement
between Birch First Global Investments Inc. (”Company”) and
Mount Knowledge Holdings Inc. (”Marketing Afiliate”, ”MA”)
with a renewal term of successive 1 year
Classify the following clause as either a renewal term clause or
a notice period to terminate renewal clause or a governing law
clause .. or a covenant not to sue clause or a third party
beneficiary clause: ”This agreement may be terminated by either
party at the expiration of its term or any renewal term upon
thirty (30) days written notice to the other party.”
Extract the jurisdiction of the governing law of the agreement
from the following text: ”This Agreement shall be governed
by, and any dispute arising hereunder shall be determined in
accordance with, the laws of State of New York”
Response
This agreement shall begin upon the date of its
execution by MA and acceptance in writing by
Company and shall remain in efect until the end of the
current calendar year and shall be automatically
renewed for successive one (1) year periods unless
otherwise terminated according to the cancellation
or termination provisions contained in paragraph
18 of this agreement.</p>
      <p>This is a notice period to terminate renewal clause
Classification
Extraction</p>
    </sec>
    <sec id="sec-7">
      <title>B. Example Outputs of Model</title>
      <p>The following table outlines examples of outputs from the models with the associated prompts:
Model Input
Extract the jurisdiction of the governing law of the agreement
from the following text: This Agreement shall be governed by,
and construed in accordance with, the Laws of the State of New
York, applicable to contracts executed in and to be performed
entirely within that state.</p>
      <p>Draft a notice period to terminate renewal clause for a
videoon-demand content license agreement between Rogers Cable
Communications Inc. (”rogers”) and Euromedia Holdings Corp.</p>
      <p>(”licensor”) with a notice period of 60 days
Subject to the terms and conditions of this
agreement, during the term hereof, and for a period
of one (1) year thereafter, Rogers shall have the
right to terminate this agreement at any time upon
thirty (30) days written notice to the other party.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , T. Han,
          <string-name>
            <surname>S</surname>
          </string-name>
          . Ma, J. Zhang,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Qiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <article-title>Summary of chatgpt/gpt-4 research and perspective towards the future of large language models</article-title>
          ,
          <year>2023</year>
          .
          <article-title>a r X i v : 2 3 0 4 . 0 1 8 5 2</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] OpenAI, Gpt-4
          <source>technical report</source>
          ,
          <year>2023</year>
          .
          <article-title>a r X i v : 2 3 0 3 . 0 8 7 7 4</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>T. B. Brown</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Subbiah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Dhariwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neelakantan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shyam</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Askell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Herbert-Voss</surname>
            , G. Krueger,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Henighan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          <string-name>
            <surname>Ziegler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Hesse</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Chen</surname>
            , E. Sigler,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Litwin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Chess</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Berner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>McCandlish</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <year>2020</year>
          .
          <article-title>a r X i v : 2 0 0 5 . 1 4 1 6 5</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hatzius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Briggs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kodani</surname>
          </string-name>
          ,
          <source>The potentially large efects of artificial intelligence on economic growth</source>
          ,
          <year>2023</year>
          . URL: https://www.key4biz.it/wp-content/uploads/2023/ 03/
          <string-name>
            <surname>Global-Economics-Analyst</surname>
          </string-name>
          _
          <article-title>-ThePotentiallyLarge-Effects-of-Artificial-Intelligence-onEconomic-</article-title>
          <string-name>
            <surname>Growth-Briggs</surname>
          </string-name>
          _Kodnani.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <article-title>Innovation difusion in the legal industry</article-title>
          , Dickinson L.
          <year>Rev</year>
          .
          <volume>122</volume>
          (
          <year>2017</year>
          )
          <fpage>395</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Izacard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Martinet</surname>
          </string-name>
          , M.
          <article-title>-</article-title>
          <string-name>
            <surname>A. Lachaux</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Lacroix</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Rozière</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Hambro</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Azhar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Rodriguez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Joulin</surname>
          </string-name>
          , E. Grave, G. Lample,
          <article-title>Llama: Open and eficient foundation language models</article-title>
          ,
          <year>2023</year>
          .
          <article-title>a r X i v : 2 3 0 2 . 1 3 9 7 1</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Biderman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schoelkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Anthony</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. O'Brien</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Hallahan</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Purohit</surname>
            ,
            <given-names>U. S.</given-names>
          </string-name>
          <string-name>
            <surname>Prashanth</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Raf</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Skowron</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Sutawika</surname>
            ,
            <given-names>O. van der Wal</given-names>
          </string-name>
          ,
          <article-title>Pythia: A suite for analyzing large language models across training</article-title>
          and scaling,
          <year>2023</year>
          .
          <article-title>a r X i v : 2 3 0 4 . 0 1 3 7 3</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Dey</surname>
          </string-name>
          , G. Gosal, Zhiming, Chen,
          <string-name>
            <given-names>H.</given-names>
            <surname>Khachane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Marshall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pathria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hestness</surname>
          </string-name>
          , Cerebras-GPT:
          <article-title>Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster</article-title>
          ,
          <year>2023</year>
          . URL: http://arxiv.org/abs/2304.03208.
          <source>doi:1 0 . 4 8 5 5 0 / a r X i v . 2 3</source>
          <volume>0 4 . 0 3 2 0 8</volume>
          , arXiv:
          <fpage>2304</fpage>
          .03208 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <article-title>[9] StabilityAI, Stability ai launches the first of its stablelm suite of language models</article-title>
          ,
          <year>2023</year>
          . URL: https://stability.ai/blog/stability-ai
          <article-title>-launches-thefirst-of-its-stablelm-suite-of-language-models</article-title>
          , accessed:
          <fpage>2023</fpage>
          -05-17.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Team</surname>
          </string-name>
          ,
          <article-title>Introducing mpt-7b: A new standard for open-source, ly usable llms</article-title>
          ,
          <year>2023</year>
          . URL: www.mosaicml.com/blog/mpt-7b, accessed:
          <fpage>2023</fpage>
          - 05-17.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Guu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <surname>Finetuned Language Models Are Zero-Shot Learners</surname>
          </string-name>
          ,
          <year>2022</year>
          . URL: http://arxiv.org/abs/2109.01652.
          <source>doi:1 0 . 4 8 5 5 0 / a r X i v . 2 1</source>
          <volume>0 9 . 0 1 6 5 2</volume>
          , arXiv:
          <fpage>2109</fpage>
          .01652 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Almeida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Wainwright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Agarwal,
          <string-name>
            <given-names>K.</given-names>
            <surname>Slama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schulman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kelton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Welinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Christiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leike</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lowe</surname>
          </string-name>
          , Training language models to
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>