=Paper= {{Paper |id=Vol-3435/paper2 |storemode=property |title=The Potential for Jurisdictional Challenges to AI or LLM Training Datasets |pdfUrl=https://ceur-ws.org/Vol-3435/paper2.pdf |volume=Vol-3435 |authors=Chris Draper,Nicky Gillibrand |dblpUrl=https://dblp.org/rec/conf/icail/DraperG23 }} ==The Potential for Jurisdictional Challenges to AI or LLM Training Datasets== https://ceur-ws.org/Vol-3435/paper2.pdf
The Potential for Jurisdictional Challenges to AI or LLM Training
Datasets

Chris Draper 1 and Nicky Gillibrand 2
1 Indiana University, 107 S Indiana Ave, Bloomington, IN 47405, USA
2 University College Dublin, Belfield, Dublin 4, D04 V1W8, Ireland


                 Abstract
                 Large language model (LLM) tools used in AI powered access to justice (A2J) systems
                 experience systemic bias when their training datasets do not reflect their communities. Such
                 bias arguably indicates that the LLM should see the validity of its legal underpinnings
                 challenged on jurisdictional grounds. Since ChatGPT has the capacity to pass an American Bar
                 Exam, this provides hope that LLM tools can be trained to perform the work of a legal
                 professional at the direction of a lay person, to the perceived benefit of the underserved litigant.
                 However, significant challenges arise when reviewing the source of the datasets in terms of
                 adherence to legal sovereignty, rule of law and quality of outcome. While privacy and data
                 security will often focus data sovereignty on the geographic location where the data is held, the
                 A2J community should also be mindful of extra-jurisdictional contributions to LLM training
                 datasets that dispute the generally accepted norm of legal sovereignty, and as a result skew its
                 application of law to be outside the acceptable boundaries of the impacted community. To better
                 represent the challenges posed by LLM tools a novel quadripartite theory of informational
                 sovereignty is offered, encompassing concerns regarding population, territory, recognition and
                 regulation of borders.

                 This paper will therefore examine and call into question claims that LLM is a perceived enabler
                 of A2J. Discussion will involve how avoidance of jurisdictional challenges, such as traditional
                 legal sovereignty, through a myopic focus on data sovereignty circumvents the risks of training
                 data skewedness often displayed in bias, before considering how jurisdictionally defined
                 training data limitations could impact outcome quality and the reformulation of the traditional
                 role of the lawyer in the legal process. Finally, we will explore the dangers of failing to
                 sufficiently address these far-reaching challenges, which impact all levels from the community
                 to the constitutional, in light of contemporary concerns and litigation.

                 Keywords 1
                 Validation of the Legal Underpinnings of Systems; LLM; Large Language Models;
                 Sovereignty; Rule of Law; Jurisdiction; Bias; AI Risk; Pragmatics of Adoption; Self-
                 Represented Litigants; Panel Discussions; Guided Discussions; Works in Progress


Workshop on Artificial Intelligence for Access to Justice (AI4AJ 2023), June 19, 2023, Braga, Portugal
chris.draper@meidh.com; nicky.gillibrand@ucdconnect.ie
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Large language model (LLM) tools used in AI-powered access to justice (A2J) systems experience systemic bias when their training datasets do not reflect their communities. Such bias arguably indicates that the LLM should see the validity of its legal underpinnings challenged on jurisdictional grounds. Since ChatGPT has the capacity to pass an American Bar Exam, this provides hope that LLM tools can be trained to perform the work of a legal professional at the direction of a lay person, to the perceived benefit of the underserved litigant. However, significant challenges arise when reviewing the source of the datasets in terms of adherence to legal sovereignty, rule of law and quality of outcome. While privacy and data security will often focus data sovereignty on the geographic location where the data is held, the A2J community should also be mindful of extra-jurisdictional contributions to LLM training datasets that dispute the generally accepted norm of legal sovereignty, and as a result skew its application of law to be outside the acceptable boundaries of the impacted community. To better represent the challenges posed by LLM tools, a novel quadripartite theory of informational sovereignty is offered, encompassing concerns regarding population, territory, recognition and regulation of borders.

This paper will therefore examine and call into question claims that LLM is a perceived enabler of A2J. Discussion will involve how avoidance of jurisdictional challenges, such as traditional legal sovereignty, through a myopic focus on data sovereignty circumvents the risks of training data skewedness often displayed in bias, before considering how jurisdictionally defined training data limitations could impact outcome quality and the reformulation of the traditional role of the lawyer in the legal process. Finally, we will explore the dangers of failing to sufficiently address these far-reaching challenges, which impact all levels from the community to the constitutional, in light of contemporary concerns and litigation.

2. The Current State of Legal AI

Due in no small part to the rising accessibility and the proliferation of use of AI, considerable literature on the topic continues to emerge at a rapid pace. AI itself is becoming increasingly newsworthy, particularly in the wake of ChatGPT’s rise to prominence and its related controversies such as its ban in Italy,2 amongst other notable headlines such as its ability to pass the Uniform Bar Examination in the US.3 Much of the existing literature on the role of AI in the law to this point stems from a place of hope that it may eventually have a positive impact on A2J, enabling those who cannot afford a legal professional to use accessible technology that can technically attain the level of a trained professional,4 with some going as far as to state that AI is a prerequisite for social justice.5 A significant volume of work, however, also puts forward that we should remain cautious of the sudden rise of AI usage, with it holding the potential to exacerbate structural inequities inherent in society.6

Failure to regulate the use of AI in the legal profession remains a significant problem, with jurisdictions focusing primarily on the regulation of AI in the case of autonomous vehicles and for use in national defence.7 The value of government regulation cannot be overstated, as the rolling out of an AI tool as a means to facilitate A2J can contribute to sociopolitical disparities where those who can only afford AI may be receiving low-quality legal services compared to those who have the funds to engage legal professionals. Furthermore, AI broadly defined cannot constitute an appropriate answer to enhance A2J as the newest LegalTech will remain cost-prohibitive to underserved members of the public, whilst high street lawyers representing less wealthy members of society will also be squeezed by LegalTech;8 a significant gulf will therefore remain between profit and not-for-profit AI systems.9

AI datasets, if poorly constructed, are capable of providing incorrect information and being subject to considerable bias,10 infringing the rights of individuals and groups with certain characteristics.11 If used in sentencing, such bias can ultimately result in a deprivation of one’s liberty based on these characteristics.12 As such, warnings have arisen that AI datasets must not only be bigger, but also of better quality, which is generally described as the dataset being unbiased and less expensive whilst most importantly remaining legally compliant,13 in turn assisting the cultivation of more predictable outcomes.14 The quality of datasets is therefore paramount to AI fulfilling any sort of function and cultivating public trust as an alternative to traditional services.15 Perhaps most prohibitively of all, those
who are unable to use computers or are without the necessary technology cannot make use of AI tools regardless, furthering social inequalities.16

Whilst the bulk of the literature focuses on how a failure to properly regulate AI can impact the public at an individual level, there is considerably less on the wider impact to the state’s jurisdiction and constitutional architecture. Within this smaller body of work, it is said to be pivotally important for societies to have control over the source code of AI datasets before it is ceded to private tech corporations who may ultimately regulate AI and subsequently impact the rule of law.17 The rule of law is said to be challenged in three ways by AI: the aforementioned blurring of the private-public regulatory sphere on fundamental rights; the subsequent failure to demarcate legal certainty within this framework; and the lack of transparency and accountability of the mechanisms of decision-making.18 By challenging the rule of law, one challenges potentially centuries of constitutional tradition that forms the basis of civilised society. As such, the implications may be widespread, with theorists stating that a substantive reconfiguration of the relationship between law, technology and legal culture is required in order to incorporate algorithmic rationality.19 If, therefore, LLMs gain a significant role in the legal profession and fail to be representative of legal culture, synonymous to some with the rule of law,20 this can result in declining public sentiment towards the legal system more generally, which is gravely detrimental to the wider functioning of the state.

These discourses are also significantly related to our concerns regarding the impact of LLMs and their datasets on jurisdictional sovereignty, which remain largely unaddressed. It is, therefore, of utmost importance to exercise caution when considering the role of LLM tools in the law and to consider any substantive advancements in their capacity through the lens of sovereignty discourses, both of the traditional and digital variety, in order to fortify the probability of representative outcomes for communities.

3. Framing Access To Justice

How to deliver access to justice (A2J) within society is broadly debated. Among laypersons this debate typically revolves around philosophical definitions of justice. Yet among the legal community the debate typically revolves around the definition of access. The National Center for Access to Justice defines A2J as “when people encounter life challenges they are able to understand their rights under the law, protect those rights, obtain a fair result, and enforce that result to fully realize its value.”21 This definition frames justice as accessible through sufficient understanding and fair application of the law, with organizations like the United States Department of Justice seeing its role as helping “the justice system efficiently deliver outcomes that are fair and accessible to all, irrespective of wealth and status”22 or the American Bar Association seeing A2J as “access to pro bono and low-cost legal services for vulnerable persons.”23 These views of justice as being attainable through greater access to the legal system have resulted in many A2J efforts focusing on the following solutions, inter alia:
    • Open data initiatives - Governments and legal organizations are increasingly embracing open data initiatives, making legal information more freely available to the public. By providing access to legislation, case law, and other legal resources, these initiatives enable individuals to better understand their legal rights and obligations.
    • Legal Aid Apps, Chatbots, and Self-Help Portals - Various mobile applications and chatbots have been developed to provide legal assistance and guidance to individuals who cannot afford or access traditional legal services. These tools offer information about legal rights, procedures, and resources, helping people navigate legal issues more effectively, including interactive guides, video tutorials, and legal document templates. These resources empower individuals to handle legal matters on their own, reducing the need for costly legal representation.
    • Non-lawyer representation - Some legal sandbox initiatives in the United States are allowing non-lawyers to provide legal guidance on various topics.
    • Pro Bono Resource Matching - Online platforms have emerged that connect individuals in need of legal assistance with volunteer lawyers willing to provide pro bono services. These platforms use technology to match individuals with appropriate legal professionals, expanding access to free legal help.
    • Remote Court Access - The adoption of remote court proceedings has accelerated in recent years, especially during the COVID-19 pandemic. Virtual courtrooms and video conferencing technologies have allowed individuals to participate in legal proceedings without the need for physical presence, saving time and reducing logistical barriers.
    • Alternative and Online Dispute Resolution - Face-to-face mediation and arbitration have long been viewed as options for reducing court backlogs, with online dispute resolution (ODR) rising to prominence in the justice system following its rapid growth as a solution for resolving eCommerce disputes outside of the traditional justice system.

Each of these solutions has the potential to expand access by making legal information and resources more accessible, walking laypeople through the steps that must be taken, reducing the time it takes to find meaningful support or representation, or decreasing the time and cost for a case to be heard. In theory, the more these tools can operate without the oversight or intervention of human experts, the further barriers to access will drop.

This is where much promise is seen in AI. As examples, open data initiatives mean AI datasets could become more complete. AI chatbots could understand a layperson’s issues, select the most appropriate process, identify any relevant forms needed, and even fill out or file those forms on their behalf. Non-lawyer and pro bono experts could raise their productivity by leveraging AI-supported intake interviews, document drafting, or meeting scheduling. Remote hearings, ADR, or ODR could be facilitated by digital clerks or neutrals. Yet all this promise is contingent on the ability to appropriately understand and act upon often murky human intention.

4. AI As A Tool for Procedural Justice

AI systems powered by LLM tools are seen as potentially transformative when framed through a procedural view of justice. Procedural justice refers to the fairness and impartiality of the processes and procedures used to resolve disputes, allocate resources, or make decisions. It emphasizes the importance of ensuring that the procedures used to make decisions are perceived as fair and just by those affected by them, regardless of the outcome.

The concept of procedural justice is rooted in the belief that people have a fundamental need to be treated fairly and with respect, and that the procedures used to make decisions can have a significant impact on how they perceive the fairness of those decisions. If disputes are resolved through a process that the community agrees is “fair,” then the outcome of that process should be “just.”

The concept of what constitutes “fairness” with respect to the processes that make up the justice system grew out of communities' norms and values. Historically, communities established their own rules and systems for resolving disputes and administering justice arising from their distinct legal culture. These systems were based on the norms, values, and customs of the community and were designed to reflect the unique needs and characteristics of that community.

For example, in many Indigenous communities, the concept of restorative justice was and still is an important part of their justice system.24 In this system, the focus is on healing relationships and restoring balance, rather than on punishment or retribution. This approach is grounded in the values of community, respect, and harmony.

Similarly, in many small communities, disputes were often resolved through mediation or negotiation rather than through formal legal proceedings. These informal methods of dispute resolution were based on a sense of community and mutual respect, and often involved the participation of respected community members or elders.25

As communities grew and became more complex, the need for more formal systems of governance and justice arose. However, the underlying values and principles of fairness and equity remained an important part of these systems. The legal system that evolved from these community-based systems is built upon the principles of due process, impartiality, and the rule of law, as circumscribed by jurisdictional boundaries.

5. Justice Through The Rule of Law

The role of the rule of law within legal systems cannot be overstated. The rule of law cemented its place as a foundational principle of constitutional law centuries prior, continuing to predominate until the present day. The rule of law acts as a safeguard against arbitrary power and a maintainer of public order.26 Within this, it also acts as a bedrock for the formation of laws, as the principal consideration in assessing the lawfulness of public legal action. In order to protect the rule of law, a practical restriction exists in that each state has responsibility for maintaining the quality of the rule of law. Responsibility for this falls substantially on the legal system and, to a degree, the system of government. Both of these are impacted by public values to some extent: the law must adhere to the concerns of public policy and legal culture, whilst the careers of many of those in the governmental sphere rest firmly upon public opinion.

The rule of law is said to be challenged in three ways by AI: the blurring of the private-public regulatory sphere on fundamental rights; the subsequent failure to demarcate legal certainty within this framework; and the lack of transparency and accountability of the mechanisms of decision-making.27 All of the above add a layer of obfuscation to a system that is already subject to unintelligibility at the level of a layperson. The result of this would be a more significant gap between the public and those in the legal profession, thus causing a disengagement and a subsequent decline in legal culture.

Within the discussion of jurisdictions, a heavier usage of AI LLMs in their current form would result in an incremental decrease in representative legal outcomes. The absence of clear direction would subsequently culminate in legal culture declining as the primary source of law, as it has previously been in common law systems. To uproot a primary source of law, particularly through the backdoor, and perhaps the one source that the public are undeniably aware of, is incredibly problematic from a democratic perspective. The legal system does not exist in a vacuum; it is thus incontrovertible that an attempt to remedy the A2J crisis should not contravene democracy and the foundations of a community.

6. The Role of Jurisdiction

Jurisdictional boundaries are geographic or legal limits that define the authority of courts and other legal institutions to hear and decide cases. They represent an important component of the justice system, as they help to ensure fairness and impartiality by preventing conflicts of interest and promoting consistency and predictability in legal outcomes.

One way that jurisdictional boundaries support fairness in the justice system is by ensuring that cases are heard in a neutral and impartial venue. By establishing clear rules for which court or jurisdiction has authority over a particular case, jurisdictional boundaries help to prevent conflicts of interest and ensure that cases are heard in a forum that is independent and unbiased. Despite this, jurisdictional contestation is commonplace within private international legal cases, where foreign laws may contravene the public policy interests of the lex fori, thus transgressing the interests of the community in question.28 The additional layer of complication formed by AI that exists outside of jurisdictional boundaries can be reasonably expected to add further complexity to the legal system by blurring the jurisdictional lines between legal precedents.

Appropriate jurisdictional boundaries that protect fairness as interpreted by the communities within those boundaries promote consistency and predictability in legal outcomes. By establishing clear rules, originating from community norms, for which jurisdiction or court has authority over a particular type of case, legal institutions can ensure that cases are decided in a manner that is consistent with established legal principles and precedents, in line with the principle of parity.

The nature of precedents themselves can create significant challenges within a jurisdiction and for AI machine learning, particularly when jurisdictional contestation is already a considerable problem. Whilst it is important to ensure that an AI only applies the dataset applicable to the community in question in the application of law, it is often the case that a state may make reference to another jurisdiction’s legal precedent. For instance, the common law legal system of Ireland often makes reference to the precedents of other common law jurisdictions, such as the legal system of England and Wales, to assist in determining appropriate outcomes.
Rather than binding precedent, this is merely          for a more expansive or limited interpretation of a
persuasive precedent. As such, teaching LLMs to        statute or legal principle. For example, a lawyer
differentiate between the use of other                 may argue that the First Amendment's protection
jurisdiction’s law as persuasive precedent rather      of freedom of speech includes certain forms of
than the basis of another community’s law which        expression that the government is trying to
would largely be unrepresentative of that              restrict. Second, lawyers can argue that existing
community’s sentiment will pose a significant          legal principles or precedents should be changed
challenge to the effective use of AI in law,           or modified in light of changing societal values or
requiring considerably more nuance than LLM’s          as a matter of public policy. This argument is
provide in their current form.                         often based on a claim that a particular legal
    Yet these precedents, and even sometimes the       principle or precedent is outdated or does not
principles underpinning those precedents, are not      adequately address current issues. Both of these
permanent. These changes in precedent or               strategies are heavily dependent upon community
principles are driven by the fact that community       acceptance that the lawyer is correctly
input and court decisions are intertwined. As court    understanding both the law and the community it
decisions can be influenced by community input,        is serving.
most often provided by lawyers or other legal              To ensure lawyers are taking actions that have
practitioners, community input is also shaped by       the potential to change the law from a position of
court decisions. When a court makes a decision in      understanding regarding the current law,
a particular case, based on how the community it       jurisdictions typically have a set of rules and
serves argues the law before it, the decision sets a   regulations in place to ensure that lawyers
precedent for future cases that involve similar        representing clients in front of the court are
legal issues. Precedent is important because it        competent. These rules and regulations are
ensures that the law is applied consistently over      designed to ensure that lawyers have the
time, and it allows individuals and organizations      necessary education, training, and ethical
to rely on the law and predict legal outcomes.         standards to represent clients effectively.
    As society and values change over time, legal          For instance, nearly every jurisdiction requires
principles and precedents must also change. New        bar admission as the primary way of ensuring
societal norms must be reflected in new court          competency. Lawyers must meet certain
decisions that establish new legal interpretations     educational and character requirements to be
in order for the community to continue                 admitted to the bar and practice law in a particular
interpreting the justice system as just.               jurisdiction. For example, in the United States,
                                                       lawyers must graduate from an accredited law
                                                       school, pass a bar exam, and meet certain
7. Lawyers as the Voice of the                         character and fitness standards to be admitted to
                                                       practice law. Once admitted, most jurisdictions
   Community                                           require lawyers to engage in ongoing education
                                                       and training to maintain their competence.
    Lawyers play a critical role in shaping legal      Lawyers may be required to complete a certain
principles and interpretations through their           number of continuing legal education (CLE)
advocacy on behalf of clients. This role is so vital   credits each year to stay up-to-date on changes in
that nearly every jurisdiction enforces significant    the law and legal practice. In addition to education
penalties when individuals, or computers in some       and training requirements, jurisdictions may also
jurisdictions, are seen to be engaged in the           have rules and regulations in place to ensure
unauthorized practice of law. 29 In court cases,       ethical conduct and professional responsibility.
lawyers argue for a particular interpretation of the   For example, lawyers must adhere to rules of
law that they believe best serves their client's       professional conduct that govern their behavior
interests. This interpretation can influence the       and ensure that they act in the best interests of
court's decision and can also shape future legal       their clients. Failure to comply with these rules
precedent. It is these arguments made by lawyers       can result in disciplinary action, including
that, in aggregate, represent the norms of the         suspension or revocation of the lawyer's license to
community.30                                           practice law. 31
    These arguments have the potential to change           All of these rules are in place to protect the
accepted legal principles or precedent through         authenticity with which the community, through
two primary strategies. First is through arguing       the voice of those lawyers who represent members
of the community and the judges who preside over         predictions or generate new text. These
court actions, is accurately represented through a       predictions and generated text represent the
continually modifying justice system.                    arguments and decisions that would be made or
                                                         arrived at by the community, so long as the dataset
8. The Role of Lawyers in AI Legal                       was generated by the community.
                                                             As with any other computer system, an LLM
   Systems                                               operates solely based on the data to which it has
                                                         been exposed. These datasets are used to "teach"
    The promise of access to justice tools that          the model how to recognize patterns and make
employ AI is rooted in the idea that such tools          predictions. But the very nature of modern AI/ML
could eliminate the need for lawyers. If                 systems means they typically reflect the average
appropriately implemented, advocates believe             of the opinions expressed in their
general citizens could interact with an AI powered       training data and struggle to identify special
dispute resolution tool through the development          circumstances or edge cases. As such, it is of
of LLM-driven systems that direct participants           utmost importance that there are large datasets of
through the procedures of justice towards an             multiple cases in order to accurately automate
accepted resolution filed with the courts. 32 In this    legal predictions and have general applicability.37
system, it is not correct to think that lawyers          Yet even with large datasets this gravitation to the
would just disappear. Lawyers, in terms of all           norm is a feature of the neural networks these
parties with an influencing role in the outcome of       tools are built upon, making them incapable of
a case, therefore, will be subject to a vastly different accurately applying specific logical processes or
role in the legal system. This is despite their role     accounting for edge cases without them being
as trained professionals who have undertaken             directly coded into the system.38
many years of training to attain their level of              If the outputs of the LLM are to be appropriate
competence. Although not free from criticism, the        for a jurisdiction, they must be so on three
public are considerably more forgiving and               grounds. First, the LLM training data must reflect the
empathetic toward human error than toward                community bounded by that jurisdiction, meaning
computational error which is expected to be              the model inputs should only be generated by
faultless.33 While AI is technically able to attain      individuals who have met the standards required
the level of a legal professional given its proven       of representing the community within that
ability to pass the Uniform Bar Exam with a score        jurisdiction. Second, the datasets must be
within the 90th percentile,34 raw legal prowess is       substantial enough to result in generalisable and
an insufficient indicator of appropriate                 predictable outcomes based upon that
observance of legal norms. Where lawyers are             community’s law without reference to law from
subject to mechanisms of accountability which            other jurisdictions that would not ordinarily be
forms a core administrative legal principle, AI          cited in traditional legal precedents. And lastly,
systems are unable to bear significant                   operational logic reflecting procedure specific to
repercussions for their shortcomings and                 a jurisdiction must be directly encoded for
violations of ethics or proper legal procedure, but      instances when the law clearly requires a known
rather run the risk of being placed as a liability       cause to procure a specific effect.
shield.35 As such the retention of lawyers as a
human in the loop remains a necessity in order to
protect core legal principles at risk of AI
                                                         9. Reformulating Digital Sovereignty
overreach. 36 Therefore, lawyers would manifest
themselves in a different manner: through the
arguments they have made, the decisions they                Protecting communities from the potential
were party to, or the precedents they caused to be       harm of AI systems often takes the framing of an
set, as contained in the AI training data. LLM-          outside force acting upon the affected population.
based access to justice tools will require training      In the legal technology vertical, this force can
on vast amounts of textual data representing             often be seen as anything from profit driven
community interests through the arguments made           corporations to malevolent State actors. 39 This
by the lawyers representing the community. These         focus on protection from outside forces drives
models use machine learning techniques to                protection efforts towards the concepts of digital
identify patterns in the data and develop a set of       sovereignty, at whose heart is the concept of data
rules or patterns that can be used to make               sovereignty. While reasonable, AI-driven justice
technology tools push us to realize that these            digital sovereignty considers data itself to be the
strategies are fundamentally ineffectual.                 population that must be protected through
    Digital sovereignty refers to the idea that           rigorous control.42 When defining this data
nations and individuals should have control over          population, the concept of data sovereignty
their own digital technologies, data, and                 typically features two unique aspects whose
infrastructure. The concept of digital sovereignty        reasonableness AI-driven tools directly challenge:
is based on the idea that the digital world has                • Data protection laws. Many countries
become a vital part of modern life, and that control               have implemented data protection laws
over digital technologies and data is essential for                that regulate the collection, storage, and
maintaining       national    security,      economic              use of personal data. These laws give
competitiveness, and personal privacy. In                          individuals control over their personal
attempts to exert this control, the focus of digital               data and require organizations to obtain
sovereignty can be framed within the remit of                      consent before collecting and processing
traditional geopolitical sovereignty which has                     personal data, and
been subject to centuries of prior discourse.40                • Data localization. Data localization is the
Here, Krasner’s quadripartite conception of                        practice of requiring that data be stored in
sovereignty can be reworked as a basis to                          a specific geographic location. This
incorporate the challenges presented by an                         allows countries to maintain control over
increasing use of AI in the legal profession 41:                   their citizens' data and protect it from
     • Population is conceptualized as control                     foreign governments and companies.
         over data. Digital sovereignty emphasizes            The focus on these two aspects of data
         the importance of individual and national        sovereignty is typically implemented by
         control over personal data and                   governments through restricting what data
         information. This includes data privacy,         generated by one person’s existence can be
         data protection, and the ability to decide       copyrighted by another without the generator’s
         how and when data is collected, used, and        consent, and restricting the jurisdiction wherein
         shared.                                          the silicon upon which the generated dataset resides must
     • Territory is conceptualized as control             be physically located.
         over digital infrastructure. Digital                 AI tools challenge the reasonableness of
         sovereignty also involves control over the       modern data sovereignty constructs because,
         infrastructure and systems that support          although they must access the data contained on
         digital technologies. This includes control      the silicon that is intended to be protected by the
         over networks, servers, and other digital        concepts of digital and data sovereignty, the
         hardware and software.                           information perceived from an AI tool is a
     • Recognition is conceptualized as control           byproduct of the appropriate relationships
         over      digital   governance.        Digital   interpreted between the training data. For United
         sovereignty emphasizes the importance            States citizens, this can be illustrated by the
         of national sovereignty in digital               difference between an integer 123456789, a
         governance and regulation. This includes         person defined by social security number 123-45-
         the ability of nations to set their own rules    6789, and a company defined by employer
         and regulations for digital technologies         identification number 12-3456789.
         and data, and the ability to enforce those           The data generated by an individual is an
         rules and regulations.                           artifact of their existence and cannot recreate a
     • Regulation of borders is conceptualized            projection of their existence without the context
         as protection against cyber threats.             of the individual. The information associated with
         Digital sovereignty also involves                this contextually derived assembly of the data is
         protecting against cyber threats such as         what makes any AI or LLM usable. This is why
         cyber-attacks, cyber espionage, and cyber        concepts of data sovereignty when considering
         terrorism. This includes developing              the regulation of AI for LegalTech uses require a
         robust cybersecurity measures and                reconfigured, more appropriate “information
         protocols, and collaborating with other          sovereignty” concept.
         nations to combat cyber threats.                     In the same way that the laws of a jurisdiction
    While traditional sovereignty concepts                are only accepted if they reflect the community
consider the population to be human individuals,          contained within the jurisdiction, and the laws of
                                                          a jurisdiction are made by the legal professionals
operating within that jurisdiction, an LLM is only      LLMs are not at the stage where they can
appropriate for use within a jurisdiction if the data   appropriately respond to concerns expressed by
is assembled in a manner that incorporates the          the legal community, sufficiently considering
context of the legal professionals from within that     these four tenets would go a significant way to
jurisdiction. The location of the silicon hosting       addressing these concerns and fortifying trust in
the process that assembles that data into               AI. Until this is the case, it would be improper to
information, or the location of the stochastic          consider LLMs as a sufficient device to contribute
datasets that dynamically deploy that data within       meaningfully towards access to justice on more
an AI tool, do pose a risk in the form of model         than just a superficial level. Those who cannot
access or reliability. But the appropriateness of an    afford traditional legal services still deserve
AI tool is based solely on its ability to represent     representative legal outcomes and rights to due
the information gathered through observation of         process. Where a case may hinge on a fine
the population it will serve. This requires that tool   technicality, AI is unlikely to yet have the
suitability is defined by the source of information     appropriate level of nuance to effectively respond.
that was observed through the training of the           Whilst this remains the case, this variety of
model.                                                  technology has not yet sufficiently evolved into a
    The fact that any LLM is little more than a         trusted legal tool.
technological mimic of the observations it is fed
has become more rapidly understood than
possibly any comparable revelation for any other        10. The Risks of Doing Nothing
transformative technology.43 This means that, in
the same way precedent in a jurisdiction would
not be accepted if it was attempted to be made by           Shifting industry focus from one of digital
                                                        sovereignty to information sovereignty will likely
a legal practitioner who is not authorized to
                                                        be a significant effort. In the meantime, the A2J
practice in that jurisdiction, an AI LLM that is
used by a jurisdiction must be restricted to            community will have to grapple with the risks
assemblies of data that are deemed appropriate          posed by current tools and weigh potential
                                                        impacts. Doing this requires examining some of
because they are trained upon observations of
                                                        the prevalent comforts, fears, or mitigating
practitioners from that jurisdiction. This
rethinking of how AI tools should be                    strategies when considering appropriate strategies
jurisdictionally restricted leads to a proposal of      with respect to AI integration without information
                                                        sovereignty protections into A2J systems. For
“information sovereignty” that could be
                                                        instance, consider the following scenarios:
represented as:
     • Population. Model training must be                    • “Drafting demand letters or
         limited to observations or interactions                 communications can be done safely
         with individuals from that jurisdiction.                because they will always be reviewed before
     • Territory. The jurisdiction is not                        they go anywhere.” As the world recently
         geographically constrained but instead                  observed in Mata v. Avianca, Inc.,44 even
         inclusive of practitioners and systems                  lawyers who are paid their full rate may
         operating      within    its    represented             have a tendency to rely too heavily on a
         community.                                              technology that convincingly mimics
     • Recognition. System outputs must be                     intelligence. In Mata v. Avianca, Inc., a
         sufficiently auditable to verify that the system is     brief filed with the court contained
         consistently reflecting an appropriate                  multiple citations that were invented by
         representation of community accepted                    ChatGPT by combining fragments of real
         practitioners.                                          training data. The likelihood of AI
     • Regulation of borders. System outputs                     generated drafts being given a less than
         must be sufficiently immutable to prevent
                                                                 appropriate review significantly increases
         modification when transferred across
                                                                 when a case is being handled pro bono.
         systems.
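
The Population and Regulation of borders tenets listed above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not a method proposed by the authors: the `Record` type, its `admitted_in` field, the jurisdiction codes, and the helper names are all hypothetical, and a SHA-256 digest is used only as one simple way to detect modification of an output after it leaves a system.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical training record: a text plus the jurisdictions
    in which its author is admitted to practice."""
    text: str
    admitted_in: frozenset


def select_training_records(records, jurisdiction):
    # Population tenet: keep only material produced by practitioners
    # admitted in the jurisdiction the tool will serve. Admission, not
    # physical location, is the test (compare the Territory tenet).
    return [r for r in records if jurisdiction in r.admitted_in]


def fingerprint(output_text):
    # Regulation of borders tenet: record a digest of each output so that
    # later modification can be detected.
    return hashlib.sha256(output_text.encode("utf-8")).hexdigest()


def is_unmodified(output_text, recorded_fingerprint):
    # True only if the text still matches the digest recorded at creation.
    return fingerprint(output_text) == recorded_fingerprint


# Illustrative data only; "IN" and "IE" are made-up jurisdiction codes.
records = [
    Record("Brief A", frozenset({"IN"})),
    Record("Brief B", frozenset({"IE"})),
    Record("Brief C", frozenset({"IN", "IE"})),
]
in_scope = select_training_records(records, "IN")
draft = "Demand letter text produced by the tool."
digest = fingerprint(draft)
```

A digest only reveals that a change occurred; it does not prevent the change, so a real system would still need the auditability described under the Recognition tenet.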
    In following this structure, AI could be used in             When an AI system is so convincing and
such a way that it does not harm the democratic                  the outputs are not jurisdictionally
foundations of a community nor lead to                           constrained, the pro bono attorney on
unfounded or unrepresentative outcomes. Since                    whom A2J depends becomes effectively an on-
    the-loop, active safeing system that must           idea that there are enough people in the
    perform the labor intensive job of                  system to catch any errors before they
    verifying facts in a document that appears          produce an impact. Yet in the same way it
    correct.                                            has been demonstrated that judges will
•   “Selecting      appropriate      forms     or       too often ignore analytics in favor of their
    appropriate citations can be done now.”             own biases a majority of the time when
    Correct, form selection or citation                 looking at pretrial diversion programs
    reference when using appropriate search             supported by AI-enabled risk
    criteria can be successfully completed              evaluations,47 a judge is not infallible
    today. In theory, AI should be able to              when spotting unsupported arguments
    speed up these processes by requiring               that could become precedent setting. In
    fewer, less informed inputs from a user to          cases where invalid arguments are
    find the most correct result faster.                accepted within the system, the threat to
    However, unless the system is using                 the judicial system’s public acceptance
    details other than those communicated by            rapidly grows. However, from an A2J
    the user to the system through a language-          perspective, the court’s rejection of an
    based search, the model interpreting those          invalid argument developed by a
    search inputs must be built to accurately           layperson is likely more immediately
    reflect the context of that jurisdiction. For       damaging because their access to fairness
    example, damage value and circumstance              has been denied due to the AI system
    play a significant role in understanding            misdirecting them in the development of
    where a case can be filed, with that                the brief presented to the court. In both
    decision often varying by jurisdiction. If          scenarios, acceptance or rejection, public
    the AI model is not trained in a manner             confidence is eroded either slowly or
    that accounts for such nuance, the expert           rapidly.
    system finding the right form with the          •   “The model can just be finetuned to be
    wrong context could result in justice               safer.” ChatGPT has proven that any
    being denied.                                       system which is probabilistically
•   “Providing legal information through                assembling responses to prompts can
    tools like Chatbots is a straightforward            easily produce erroneous answers. While
    exercise that poses little risk.” Apart from        many of these answers may seem to
    the fears of bias and inaccuracy that have          provide information that goes beyond
    been well documented in legal chatbot               what is contained in the training data, this
    use cases,45 the experience of New Jersey           interpretation is the technological
    Courts in building its Judiciary                    equivalent of observing dinosaurs in the
    Information          Attendant          (JIA)       clouds. Since these erroneous answers are
    demonstrated         that      unanticipated        partly due to the inappropriateness of the
    questions could require up to 70% of                dataset, finetuning the dataset through
    inquiries be responded to by human                  weighting or censoring is not a sufficient
    attendants.46 Where the JIA design sent             solution. Controlling a probabilistic
    inquiries to a call center when answers             system by reducing a probability does not
    fell outside of rigid parameters, the               eliminate its potential to emerge, which is
    nimbleness of an AI-powered chatbot                 why tools like ChatGPT can still believe
    could allow the system to more often                that 2+3 could equal 87.48 AI tools for A2J
    believe it is fully understanding the               applications will not only need to have
    inquiry in a manner that leads to a false           clear acceptability boundaries more akin
    response.                                           to expert systems than ChatGPT-style AI,
•   “A poorly written brief poses little risk           these protections need to be
    and will not be precedent setting.” The             jurisdictionally bounded with logical
    risk posed by poorly cited or constructed           relationships appropriate to a jurisdiction
    arguments is often dismissed based on the           included in their evaluative structure in
        order to be sure that any result accurately      sovereignty to act as a bulwark for the protection
        reflects a valid outcome.                        of democracy and the individual. This is based
                                                         upon the importance of limiting the model’s
                                                         training to observing individuals from the
11. Concluding Remarks                                   population in question, including the practitioners
                                                         and systems operating within that territory,
    Whilst ostensibly the use of AI tools presents       providing accountability through the recognition
significant opportunities, at present it is plagued      of reflecting the outputs of practitioners within
with risks and inconsistencies that would further        that community whilst in doing so providing
jeopardize A2J in the long term if left                  sufficiently immutable outputs to prevent
unaddressed. By permitting an undeveloped                modification outside regulated borders.
system to act in lieu of the services of a legal             Although shifting the focus from digital to
professional, those who cannot afford a lawyer are       informational sovereignty will be subject to
directly disadvantaged with the less than                incremental change, these adapted criteria for the
normative creation of further barriers to A2J. As        training of LLMs would be appropriate
such, the improper use of AI tools as a                  reassurance for communities to consider the use
replacement for conventional legal services has          of AI tools in such a manner that would accelerate
far-reaching     implications,    impacting      the     rather than inhibit A2J. In the meantime,
individual, their community and the traditional          mitigation of the risks is paramount given the
conception of the state. It is posited this will         invention of false evidence by LLM tools like
transpire primarily through jurisdictional               ChatGPT, the lack of predictability and accuracy
overreach of AI tools that pose the substantial risk     in outcomes and bias that threaten due process and
of blurring the delimitations of community law           A2J in legal systems.
through datasets that fail to differentiate along
jurisdictional boundaries.
    The proposed starting point for a solution is set
forth as a new conception of informational
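
The observation in Section 10 that reducing a probability in a probabilistic system does not eliminate the outcome can be made concrete with a toy calculation. The Python sketch below is an illustration under stated assumptions: the candidate answers and their scores are invented, a softmax over three options stands in for a far larger model, and the "finetuning" is modelled simply as subtracting a penalty from one score. It does not describe how any particular product works.

```python
import math


def softmax(scores):
    # Convert raw scores into probabilities that sum to 1.
    top = max(scores.values())
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def chance_of_at_least_one(p, n):
    # Probability that an outcome with per-response probability p
    # appears at least once in n independent responses.
    return 1.0 - (1.0 - p) ** n


# Hypothetical scores for candidate completions of "2+3=".
scores = {"5": 6.0, "6": 1.0, "87": 0.0}
base = softmax(scores)

# Model "making the system safer" as a heavy penalty on the wrong answer.
penalised = dict(scores)
penalised["87"] -= 5.0
tuned = softmax(penalised)

# The wrong answer is now rarer per response, but its probability is
# still above zero, so it resurfaces once enough responses are generated.
per_response = tuned["87"]
over_a_million = chance_of_at_least_one(per_response, 1_000_000)
```

Under these made-up numbers the penalised answer becomes much less likely in any single response, yet across a large volume of responses its appearance is close to certain, which is why the text argues for explicit, jurisdictionally bounded acceptability limits rather than probability adjustment alone.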

                                                         services/service/ll/llglrd/2019668143/201966814
   2
       H. Ruschemeier, ‘Squaring the Circle’             3.pdf (last accessed 8th May 2023) p. 1-2
https://verfassungsblog.de/squaring-the-circle/              8
                                                                A. Telang, ‘The Promise and Peril of AI
(last accessed 8th May 2023)                             Legal       Services     to   Equalize     Justice’
    3
      ABA Journal – D. Cassens Weiss, ‘Latest            https://jolt.law.harvard.edu/digest/the-promise-
Version of ChatGPT Aces Bar Exam With Score              and-peril-of-ai-legal-services-to-equalize-justice
Nearing                 90th               Percentile’   (last accessed 8th May 2023)
https://www.abajournal.com/web/article/latest-               9
                                                               A. Reichman and G. Sartor, ‘Algorithms and
version-of-chatgpt-aces-the-bar-exam-with-               Regulation‘ within ‘Constitutional Challenges in
score-in-90th-percentile (last accessed 9th May          the Algorithmic Society’ eds H-W. Micklitz, O.
2023)                                                    Pollicino, A. Reichman, A. Simoncini, G. Sartor
    4
      J. Villasenor, ‘How AI Will Revolutionize          and G. De Gregorio (Cambridge University Press,
the            Practice             of           Law’    2022) p. 157
https://www.brookings.edu/blog/techtank/2023/0               10
                                                                C. Gans-Combe, ‘Automated Justice: Issues,
3/20/how-ai-will-revolutionize-the-practice-of-          Benefits and Risks in the Use of Artificial
law/ (last accessed 8th May 2023)                        Intelligence and Its Algorithms in Access to
    5
      A. Buccella, ‘’AI For All’ Is A Matter of          Justice and Law Enforcement’ within ‘Ethics,
Social Justice’ (2022) AI and Ethics                     Integrity and Policymaking: The Value of the
    6
      H. Kanu, ‘Artificial Intelligence Poised to        Case Study’ eds D. O’Mathuna & R. Iphofen
Hinder, Not Help Access to Justice’                      (Springer, 2022) p. 175
https://www.reuters.com/legal/transactional/artifi           11
                                                                R. Rodrigues, ‘Legal and Human Rights
cial-intelligence-poised-hinder-not-help-access-         Issues of AI: Gaps, Challenges and
justice-2023-04-25/ (last accessed 8th May 2023)         Vulnerabilities’ (2020) Journal of Responsible
    7
        Law Library: Library of Congress,                Technology 4 100005
‘Regulation of Artificial Intelligence in Selected           12
                                                                United Nations Office on Drugs and Crime,
Jurisdictions’           https://tile.loc.gov/storage-   ‘Artificial Intelligence: A New Trojan Horse for
                                                         Undue          Influence      on        Judiciaries’
                                                          26
                                                              J. Raz, ‘The Rule of Law and its Virtue’
https://www.unodc.org/dohadeclaration/en/news/         within ‘The Authority of Law: Essays on Law and
2019/06/artificial-intelligence_-a-new-trojan-         Morality’ (Oxford University Press, 1979) p. 210
                                                           27
horse-for-undue-influence-on-judiciaries.html                    O. Pollicino & G. De Gregorio,
(last accessed 9th May 2023)                           ‘Constitutional Law in the Algorithmic Society’
    13
       J. Soh Tsin Howe, ‘Building Legal Datasets’     within ‘Constitutional Challenges in the
https://datacentricai.org/neurips21/papers/74_Ca       Algorithmic Society’ eds H-W. Micklitz, O.
meraReady_building-legal-datasets-                     Pollicino, A. Reichman, A. Simoncini, G. Sartor
CamReady.pdf p. 1-2                                    and G. De Gregorio (Cambridge University Press,
    14
        S. Wolfram, ‘What Is ChatGPT Doing…            2022) p. 7
and Why Does it Work?’ (Wolfram Media, 2023)
15 M. Kusak, ‘Quality of Data Sets That Feed AI and Big Data Applications Enforcement’ (2022) ERA Forum 23 p. 209
16 Law Society Gazette, ‘Will LawTech Extend Justice or Deepen the Digital Divide?’ https://www.lawsociety.ie/gazette/top-stories2/will-lawtech-increase-access-to-justice-or-deepen-the-digital-divide (last accessed 8th May 2023)
17 S. Rosengrun, ‘Why AI is a Threat to the Rule of Law’ (2022) Digital Society 1(10) p. 10
18 O. Pollicino & G. De Gregorio, ‘Constitutional Law in the Algorithmic Society’ within ‘Constitutional Challenges in the Algorithmic Society’ eds H-W. Micklitz, O. Pollicino, A. Reichman, A. Simoncini, G. Sartor and G. De Gregorio (Cambridge University Press, 2022) p. 7
19 M. Catanzariti, ‘Algorithmic Law: Law Production by Data or Data Production by Law?’ within ‘Constitutional Challenges in the Algorithmic Society’ eds H-W. Micklitz, O. Pollicino, A. Reichman, A. Simoncini, G. Sartor and G. De Gregorio (Cambridge University Press, 2022) p. 89
20 R. Michaels, ‘Legal Culture’ available at: https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=3012&context=faculty_scholarship p. 1
21 National Centre for Access to Justice, ‘What is Access to Justice?’ https://ncaj.org/what-access-justice (last accessed May 27, 2023)
22 Office for Access to Justice, ‘About ATJ’ https://www.justice.gov/atj/about-atj (last accessed 1st June 2023)
23 ABA, ‘Access to Justice’ www.americanbar.org/topics/access/ (last accessed 1st June 2023)
24 B. Jarrett & P. Hyslop, ‘Justice for All: An Indigenous Community-Based Approach to Restorative Justice in Alaska’ (2014) Northern Review 38 p. 239
25 J. Folberg, ‘A Mediation Overview: History and Dimension of Practice’ (1983) Mediation Quarterly 1 p. 5
28 F. Ghodoosi, ‘The Concept of Public Policy in Law: Revisiting the Role of the Public Policy Doctrine in the Enforcement of Private Legal Arrangements’ (2016) Nebraska Law Review 94(68) p. 690
29 M. Rotenberg, ‘Stifled Justice: The Unauthorized Practice of Law and Internet Legal Resources’ (2012) Minnesota Law Review 347 p. 731
30 Judicature – D. F. Levi, D. Remus & A. Frisch, ‘Reclaiming the Role of Lawyers as Community Connectors’ https://judicature.duke.edu/articles/reclaiming-the-role-of-lawyers-as-community-connectors/ (last accessed 15th May 2023)
31 American Bar Association, ‘Model Rules of Professional Conduct – Table of Contents’ https://www.americanbar.org/groups/professional_responsibility/publications/model_rules_of_professional_conduct/model_rules_of_professional_conduct_table_of_contents/ (last accessed 15th May 2023)
32 A. Buccella, ‘“AI For All” Is A Matter of Social Justice’ (2022) AI and Ethics
33 A. Reichman & G. Sartor, ‘Algorithms and Regulations’ within ‘Constitutional Challenges in the Algorithmic Society’ eds H-W. Micklitz, O. Pollicino, A. Reichman, A. Simoncini, G. Sartor and G. De Gregorio (Cambridge University Press, 2022) p. 161
34 ABA Journal – D. Cassens Weiss, ‘Latest Version of ChatGPT Aces Bar Exam With Score Nearing 90th Percentile’ https://www.abajournal.com/web/article/latest-version-of-chatgpt-aces-the-bar-exam-with-score-in-90th-percentile (last accessed 9th May 2023)
35 J. J. Bryson, M. E. Diamantis & T. D. Grant, ‘Of, for, and by the people: the legal lacuna of synthetic persons’ (2017) Artificial Intelligence Law 25 p. 287
36 A. Reichman & G. Sartor, ‘Algorithms and Regulations’ within ‘Constitutional Challenges in the Algorithmic Society’ eds H-W. Micklitz, O. Pollicino, A. Reichman, A. Simoncini, G. Sartor and G. De Gregorio (Cambridge University Press, 2022) p. 174
37 A. Reichman & G. Sartor, ‘Algorithms and Regulations’ within ‘Constitutional Challenges in the Algorithmic Society’ eds H-W. Micklitz, O. Pollicino, A. Reichman, A. Simoncini, G. Sartor and G. De Gregorio (Cambridge University Press, 2022) p. 157
38 S. Wolfram, ‘What Is ChatGPT Doing… and Why Does it Work?’ (Wolfram Media, 2023) p. 99
39 S. Rosengrun, ‘Why AI is a Threat to the Rule of Law’ (2022) Digital Society 1(10) p. 9
40 T. Hobbes, ‘Leviathan’ (Harvard Classics, 1651) Chapter 13 Para 10; W. A. Dunning, ‘Jean Bodin on Sovereignty’ (1896) Political Science Quarterly 11(1) p. 92
41 S. Krasner, ‘Sovereignty: Organised Hypocrisy’ (Princeton University Press, 1999). Within this work Krasner sets out four variants of sovereignty: domestic (exercise of authority within a territory), interdependence (control over cross-border flow), international legal (recognition of territory by other territories) and Westphalian (non-intervention by others in the affairs of a territory)
42 L. Amoore, ‘Cloud geography: Computing, data, sovereignty’ (2018) Progress in Human Geography 42(1) p. 16
43 Boost.AI, ‘What are Large Language Models and How Do They Work?’ https://www.boost.ai/blog/llms-large-language-models (last accessed 16th May 2023)
44 E. Volokh, ‘A lawyer’s filing “is replete with citations to non-existent cases” Thanks, ChatGPT?’ https://reason.com/volokh/2023/05/27/a-lawyers-filing-is-replete-with-citations-to-non-existent-cases-thanks-chatgpt/ (last accessed 28th May 2023)
45 A. Asher-Schapiro & D. Sherfinski, ‘Analysis: Chatbots in U.S. Justice System Raise Bias, Privacy Concerns’ https://www.reuters.com/legal/litigation/chatbots-us-justice-system-raise-bias-privacy-concerns-2022-05-10/ (last accessed 28th May 2023)
46 Joint Technology Committee, ‘Introduction to AI for Courts’ https://www.ncsc.org/__data/assets/pdf_file/0013/20830/2020-04-02-intro-to-ai-for-courts_final.pdf (last accessed 28th May 2023)
47 American Constitution Society, ‘Roadblock to Reform’ https://www.acslaw.org/wp-content/uploads/2018/11/RoadblockToReformReport.pdf p. 3 (last accessed 28th May 2023)
48 S. Wolfram, ‘What Is ChatGPT Doing… and Why Does it Work?’ https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ (last accessed 1st June 2023)