=Paper=
{{Paper
|id=Vol-3435/paper2
|storemode=property
|title=The Potential for Jurisdictional Challenges to AI or LLM Training
Datasets
|pdfUrl=https://ceur-ws.org/Vol-3435/paper2.pdf
|volume=Vol-3435
|authors=Chris Draper,Nicky Gillibrand
|dblpUrl=https://dblp.org/rec/conf/icail/DraperG23
}}
==The Potential for Jurisdictional Challenges to AI or LLM Training Datasets==
Chris Draper 1 and Nicky Gillibrand 2

1 Indiana University, 107 S Indiana Ave, Bloomington, IN 47405, USA

2 University College Dublin, Belfield, Dublin 4, D04 V1W8, Ireland

===Abstract===
Large language model (LLM) tools used in AI powered access to justice (A2J) systems experience systemic bias when their training datasets do not reflect their communities. Such bias arguably indicates that the LLM should see the validity of its legal underpinnings challenged on jurisdictional grounds. Since ChatGPT has the capacity to pass an American Bar Exam, this provides hope that LLM tools can be trained to perform the work of a legal professional at the direction of a lay person, to the perceived benefit of the underserved litigant. However, significant challenges arise when reviewing the source of the datasets in terms of adherence to legal sovereignty, rule of law and quality of outcome. While privacy and data security will often focus data sovereignty on the geographic location where the data is held, the A2J community should also be mindful of extra-jurisdictional contributions to LLM training datasets that dispute the generally accepted norm of legal sovereignty, and as a result skew its application of law to be outside the acceptable boundaries of the impacted community. To better represent the challenges posed by LLM tools a novel quadripartite theory of informational sovereignty is offered, encompassing concerns regarding population, territory, recognition and regulation of borders. This paper will therefore examine and call into question claims that LLM is a perceived enabler of A2J.
Discussion will involve how avoidance of jurisdictional challenges, such as traditional legal sovereignty, through a myopic focus on data sovereignty circumvents the risks of training data skewedness often displayed in bias, before considering how jurisdictionally defined training data limitations could impact outcome quality and the reformulation of the traditional role of the lawyer in the legal process. Finally, we will explore the dangers of failing to sufficiently address these far-reaching challenges – impacting all levels from the community to constitutional – in light of contemporary concerns and litigation.

Keywords: Validation of the Legal Underpinnings of Systems; LLM; Large Language Models; Sovereignty; Rule of Law; Jurisdiction; Bias; AI Risk; Pragmatics of Adoption; Self-Represented Litigants; Panel Discussions; Guided Discussions; Works in Progress

Workshop on Artificial Intelligence for Access to Justice (AI4AJ 2023), June 19, 2023, Braga, Portugal

chris.draper@meidh.com; nicky.gillibrand@ucdconnect.ie

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

===1. Introduction===
Large language model (LLM) tools used in AI powered access to justice (A2J) systems experience systemic bias when their training datasets do not reflect their communities. Such bias arguably indicates that the LLM should see the validity of its legal underpinnings challenged on jurisdictional grounds. Since ChatGPT has the capacity to pass an American Bar Exam, this provides hope that LLM tools can be trained to perform the work of a legal professional at the direction of a lay person, to the perceived benefit of the underserved litigant. However, significant challenges arise when reviewing the source of the datasets in terms of adherence to legal sovereignty, rule of law and quality of outcome. While privacy and data security will often focus data sovereignty on the geographic location where the data is held, the A2J community should also be mindful of extra-jurisdictional contributions to LLM training datasets that dispute the generally accepted norm of legal sovereignty, and as a result skew its application of law to be outside the acceptable boundaries of the impacted community. To better represent the challenges posed by LLM tools a novel quadripartite theory of informational sovereignty is offered, encompassing concerns regarding population, territory, recognition and regulation of borders.

This paper will therefore examine and call into question claims that LLM is a perceived enabler of A2J. Discussion will involve how avoidance of jurisdictional challenges, such as traditional legal sovereignty, through a myopic focus on data sovereignty circumvents the risks of training data skewedness often displayed in bias, before considering how jurisdictionally defined training data limitations could impact outcome quality and the reformulation of the traditional role of the lawyer in the legal process. Finally, we will explore the dangers of failing to sufficiently address these far-reaching challenges – impacting all levels from the community to constitutional – in light of contemporary concerns and litigation.

===2. The Current State of Legal AI===
Due in no small part to the rising accessibility and the proliferation of use of AI, considerable literature on the topic continues to emerge at a rapid pace. AI itself is becoming increasingly newsworthy, particularly in the wake of ChatGPT’s rise to prominence and its related controversies such as its ban in Italy,2 amongst other notable headlines such as its ability to pass the Uniform Bar Examination in the US.3 Whilst much of the existing literature on the role of AI in the law to this point stems from a place of hope that it may eventually have a positive impact on A2J, enabling those who cannot afford a legal professional to use accessible technology that can technically attain the level of a trained professional,4 with some going as far as to state that AI is a prerequisite for social justice,5 a significant volume of work also puts forward that we should remain cautious of the sudden rise of AI usage, with it holding the potential to exacerbate structural inequities inherent in society.6

Failure to regulate the use of AI in the legal profession remains a significant problem, with jurisdictions focusing primarily on the regulation of AI in the case of autonomous vehicles and for use in national defence.7 The value of government regulation cannot be overstated, as the rolling out of an AI tool as a means to facilitate A2J can contribute to sociopolitical disparities where those who can only afford AI may be receiving low quality legal services compared to those who have the funds to engage legal professionals. Furthermore, AI broadly defined cannot constitute an appropriate answer to enhance A2J as the newest LegalTech will remain cost prohibitive to underserved members of the public, whilst high street lawyers representing less wealthy members of society will also be squeezed by LegalTech;8 therefore, a significant gulf will remain between profit and not-for-profit AI systems.9

AI datasets, if poorly constructed, are capable of providing incorrect information and being subject to considerable bias,10 infringing the rights of individuals and groups with certain characteristics.11 If used in sentencing, such bias can ultimately result in a deprivation of one’s liberty based on these characteristics.12 As such, warnings have arisen that AI datasets must not only be bigger, but also of better quality, which is generally described as the dataset being unbiased and less expensive whilst most importantly remaining legally compliant,13 in turn assisting the cultivation of more predictable outcomes.14 Therefore, the quality of datasets is paramount to AI fulfilling any sort of function and cultivating public trust as an alternative to traditional services.15 Perhaps most prohibitively of all, those who are unable to use computers or are without the necessary technology cannot make use of AI tools regardless, furthering social inequalities.16

Whilst the bulk of the literature focuses on how a failure to properly regulate AI can impact the public at an individual level, there is considerably less on the wider impact to the state’s jurisdiction and constitutional architecture. Of these, it is said to be pivotally important for societies to have control over the source code of the AI datasets before it is ceded to private tech corporations who may ultimately regulate AI and subsequently impact the rule of law.17 The rule of law is said to be challenged in three ways by AI: the aforementioned blurring of the private-public regulatory sphere on fundamental rights; the subsequent failure to demarcate legal certainty within this framework; the lack of transparency and accountability of the mechanisms of decision-making.18 By challenging the rule of law, one challenges potentially centuries of constitutional tradition that forms the basis of civilised society. As such, the implications may be widespread, with theorists stating that a substantive reconfiguration of the relationship between law, technology and legal culture is required in order to incorporate algorithmic rationality.19 If, therefore, LLMs gain a significant role in the legal profession and fail to be representative of legal culture, synonymous to some with the rule of law,20 this can result in declining public sentiment towards the legal system more generally which is insurmountably detrimental to the wider functioning of the state.

These discourses are also significantly related to our concerns regarding the impact of LLMs and their datasets on jurisdictional sovereignty which remain largely unaddressed. It is, therefore, of utmost importance to exercise caution when considering the role of LLM tools in the law and consider any substantive advancements for its capacity through the lens of sovereignty discourses, both of the traditional and digital variety, in order to fortify the probability of representative outcomes for communities.

===3. Framing Access To Justice===
How to deliver access to justice (A2J) within society is broadly debated. Among laypersons this debate typically revolves around philosophical definitions of justice. Yet among the legal community the debate typically revolves around the definition of access. The National Center for Access to Justice defines A2J as “when people encounter life challenges they are able to understand their rights under the law, protect those rights, obtain a fair result, and enforce that result to fully realize its value.”21 This definition frames justice as accessible through sufficient understanding and fair application of the law, with organizations like the United States Department of Justice seeing its role as helping “the justice system efficiently deliver outcomes that are fair and accessible to all, irrespective of wealth and status”22 or the American Bar Association seeing A2J as “access to pro bono and low-cost legal services for vulnerable persons.”23 These views of justice as being attainable through greater access to the legal system have resulted in many A2J efforts focusing on the following solutions, inter alia:
* Open data initiatives - Governments and legal organizations are increasingly embracing open data initiatives, making legal information more freely available to the public. By providing access to legislation, case law, and other legal resources, these initiatives enable individuals to better understand their legal rights and obligations.
* Legal Aid Apps, Chatbots, and Self-Help Portals - Various mobile applications and chatbots have been developed to provide legal assistance and guidance to individuals who cannot afford or access traditional legal services. These tools offer information about legal rights, procedures, and resources, helping people navigate legal issues more effectively, including interactive guides, video tutorials, and legal document templates. These resources empower individuals to handle legal matters on their own, reducing the need for costly legal representation.
* Non-lawyer representation - Some legal sandbox initiatives in the United States are allowing non-lawyers to provide legal guidance on various topics.
* Pro Bono Resource Matching - Online platforms have emerged that connect individuals in need of legal assistance with volunteer lawyers willing to provide pro bono services. These platforms use technology to match individuals with appropriate legal professionals, expanding access to free legal help.
* Remote Court Access - The adoption of remote court proceedings has accelerated in recent years, especially during the COVID-19 pandemic. Virtual courtrooms and video conferencing technologies have allowed individuals to participate in legal proceedings without the need for physical presence, saving time and reducing logistical barriers.
* Alternative and Online Dispute Resolution - Face-to-face mediation and arbitration have long been viewed as options for reducing court backlogs, with online dispute resolution (ODR) rising to prominence in the justice system following its rapid growth as a solution for resolving eCommerce disputes outside of the traditional justice system.

Each of these solutions has the potential to expand access by making legal information and resources more accessible, walking laypeople through the steps that must be taken, reducing the time it takes to find meaningful support or representation, or decreasing the time and cost for a case to be heard. In theory, the more these tools can operate without the oversight or intervention of human experts, the further barriers to access will drop.

This is where much promise is seen in AI. As examples, open data initiatives mean AI datasets could become more complete. AI chatbots could understand a layperson’s issues, select the most appropriate process and any relevant forms needed, and even fill out or file those forms on their behalf. The productivity of non-lawyer and pro bono experts could leverage AI-supported intake interviews, document drafting, or meeting scheduling. Remote hearings, ADR, or ODR could be facilitated by digital clerks or neutrals. Yet all this promise is contingent on the ability to appropriately understand and act upon often murky human intention.

===4. AI As A Tool for Procedural Justice===
AI systems powered by LLM tools are seen as potentially transformative when framed through a procedural view of justice. Procedural justice refers to the fairness and impartiality of the processes and procedures used to resolve disputes, allocate resources, or make decisions. It emphasizes the importance of ensuring that the procedures used to make decisions are perceived as fair and just by those affected by them, regardless of the outcome.

The concept of procedural justice is rooted in the belief that people have a fundamental need to be treated fairly and with respect, and that the procedures used to make decisions can have a significant impact on how they perceive the fairness of those decisions. If disputes are resolved through a process that the community agrees is “fair,” then the outcome of that process should be “just.”

The concept of what constitutes “fairness” with respect to the processes that make up the justice system grew out of communities' norms and values. Historically, communities established their own rules and systems for resolving disputes and administering justice arising from their distinct legal culture. These systems were based on the norms, values, and customs of the community and were designed to reflect the unique needs and characteristics of that community.

For example, in many Indigenous communities, the concept of restorative justice was and still is an important part of their justice system.24 In this system, the focus is on healing relationships and restoring balance, rather than on punishment or retribution. This approach is grounded in the values of community, respect, and harmony.

Similarly, in many small communities, disputes were often resolved through mediation or negotiation rather than through formal legal proceedings. These informal methods of dispute resolution were based on a sense of community and mutual respect, and often involved the participation of respected community members or elders.25

As communities grew and became more complex, the need for more formal systems of governance and justice arose. However, the underlying values and principles of fairness and equity remained an important part of these systems. The legal system that evolved from these community-based systems is built upon the principles of due process, impartiality, and the rule of law, as circumscribed by jurisdictional boundaries.

===5. Justice Through The Rule of Law===
The role of the rule of law within legal systems cannot be overstated. The rule of law cemented its place as a foundational principle of constitutional law centuries prior, continuing to predominate until the present day. The rule of law acts as a safeguard against arbitrary power and a maintainer of public order.26 Also within this, it acts as a bedrock for the formation of laws as the principal consideration on lawfulness on public legal action. In order to protect the rule of law, a practical restriction exists in terms of each state having responsibility to maintain the quality of the rule of law. Responsibility for this substantially befalls the legal system and, to a degree, the system of government. Both of these are impacted by public values to some extent: the law must adhere to the concerns of public policy and legal culture, whilst the careers of many of those in the governmental sphere rest firmly upon public opinion.

The rule of law is said to be challenged in three ways by AI: the blurring of the private-public regulatory sphere on fundamental rights; the subsequent failure to demarcate legal certainty within this framework; the lack of transparency and accountability of the mechanisms of decision-making.27 All of the above add a layer of obfuscation to a system that is already subject to unintelligibility at the level of a layperson. The result of this would be a more significant gap between the public and those in the legal profession, thus causing a disengagement and a subsequent decline in legal culture.

Within the discussion of jurisdictions, a heavier usage of AI LLMs in their current form would result in an incremental decrease in representative legal outcomes. The absence of clear direction would subsequently culminate in a decline in legal culture being the primary source of law as it has previously been in common law systems. To uproot a primary source of law particularly through the backdoor, perhaps the one source that the public are undeniably aware of, is incredibly problematic from a democratic perspective. The legal system does not exist in a vacuum; thus it is incontrovertible that an attempt to remedy the A2J crisis should not contravene democracy and the foundations of a community.

===6. The Role of Jurisdiction===
Jurisdictional boundaries are geographic or legal limits that define the authority of courts and other legal institutions to hear and decide cases. They represent an important component of the justice system, as they help to ensure fairness and impartiality by preventing conflicts of interest and promoting consistency and predictability in legal outcomes.

One way that jurisdictional boundaries support fairness in the justice system is by ensuring that cases are heard in a neutral and impartial venue. By establishing clear rules for which court or jurisdiction has authority over a particular case, jurisdictional boundaries help to prevent conflicts of interest and ensure that cases are heard in a forum that is independent and unbiased. Despite this, jurisdictional contestation is commonplace within private international legal cases where foreign laws may contravene the public policy interests of the lex fori, thus transgressing the interests of the community in question.28 The additional layer of complication formed by AI that exists outside of jurisdictional boundaries can be reasonably expected to add further complexity to the legal system by blurring the jurisdictional lines between legal precedents.

Appropriate jurisdictional boundaries that protect fairness as interpreted by the communities within those boundaries promote consistency and predictability in legal outcomes. By establishing clear rules originating from community norms for which jurisdiction or court has authority over a particular type of case, legal institutions can ensure that cases are decided in a manner that is consistent with established legal principles and precedents in line with the principle of parity.

The nature of precedents themselves can create significant challenges within a jurisdiction and for AI machine learning, particularly when jurisdictional contestation is already a considerable problem. Whilst it is important to ensure that an AI only applies the dataset applicable to the community in question in the application of law, it is often the case that a state may make reference to another jurisdiction’s legal precedent. For instance, the common law legal system of Ireland often makes reference to the precedents of other common law jurisdictions such as the legal system of England and Wales to assist in determining appropriate outcomes. Rather than binding precedent, this is merely persuasive precedent. As such, teaching LLMs to differentiate between the use of other jurisdictions’ law as persuasive precedent rather than the basis of another community’s law which would largely be unrepresentative of that community’s sentiment will pose a significant challenge to the effective use of AI in law, requiring considerably more nuance than LLMs provide in their current form.

Yet these precedents, and even sometimes the principles underpinning those precedents, are not permanent. These changes in precedent or principles are driven by the fact that community input and court decisions are intertwined. As court decisions can be influenced by community input, most often provided by lawyers or other legal practitioners, community input is also shaped by court decisions. When a court makes a decision in a particular case, based on how the community it serves argues the law before it, the decision sets a precedent for future cases that involve similar legal issues. Precedent is important because it ensures that the law is applied consistently over time, and it allows individuals and organizations to rely on the law and predict legal outcomes.

As society and values change over time, legal principles and precedents must also change. New societal norms must be reflected in new court decisions that establish new legal interpretations in order for the community to continue interpreting the justice system as just.

===7. Lawyers as the Voice of the Community===
Lawyers play a critical role in shaping legal principles and interpretations through their advocacy on behalf of clients. This role is so vital that nearly every jurisdiction enforces significant penalties when individuals, or computers in some jurisdictions, are seen to be engaged in the unauthorized practice of law.29 In court cases, lawyers argue for a particular interpretation of the law that they believe best serves their client's interests. This interpretation can influence the court's decision and can also shape future legal precedent. It is these arguments made by lawyers that, in aggregate, represent the norms of the community.30

These arguments have the potential to change accepted legal principles or precedent through two primary strategies. First is through arguing for a more expansive or limited interpretation of a statute or legal principle. For example, a lawyer may argue that the First Amendment's protection of freedom of speech includes certain forms of expression that the government is trying to restrict. Second, lawyers can argue that existing legal principles or precedents should be changed or modified in light of changing societal values or as a matter of public policy. This argument is often based on a claim that a particular legal principle or precedent is outdated or does not adequately address current issues. Both of these strategies are heavily dependent upon community acceptance that the lawyer is correctly understanding both the law and the community it is serving.

To ensure lawyers are taking actions that have the potential to change the law from a position of understanding regarding the current law, jurisdictions typically have a set of rules and regulations in place to ensure that lawyers representing clients in front of the court are competent. These rules and regulations are designed to ensure that lawyers have the necessary education, training, and ethical standards to represent clients effectively.

For instance, nearly every jurisdiction requires bar admission as the primary way of ensuring competency. Lawyers must meet certain educational and character requirements to be admitted to the bar and practice law in a particular jurisdiction. For example, in the United States, lawyers must graduate from an accredited law school, pass a bar exam, and meet certain character and fitness standards to be admitted to practice law. Once admitted, most jurisdictions require lawyers to engage in ongoing education and training to maintain their competence. Lawyers may be required to complete a certain number of continuing legal education (CLE) credits each year to stay up-to-date on changes in the law and legal practice. In addition to education and training requirements, jurisdictions may also have rules and regulations in place to ensure ethical conduct and professional responsibility. For example, lawyers must adhere to rules of professional conduct that govern their behavior and ensure that they act in the best interests of their clients. Failure to comply with these rules can result in disciplinary action, including suspension or revocation of the lawyer's license to practice law.31

All of these rules are in place for protecting the authenticity with which the community, through the voice of those lawyers who represent members of the community and the judges who preside over court actions, is accurately represented through a continually modifying justice system.

===8. The Role of Lawyers in AI Legal Systems===
The promise of access to justice tools that employ AI is rooted in the idea that such tools could eliminate the need for lawyers. If appropriately implemented, advocates believe general citizens could interact with an AI powered dispute resolution tool through the development of LLM-driven systems that direct participants through the procedures of justice towards an accepted resolution filed with the courts.32 In this system, it is not correct to think that lawyers would just disappear. Lawyers, in terms of all parties with an influencing role in the outcome of a case, therefore, will be subject to a vastly different role in the legal system. This is despite their role as trained professionals who have undertaken many years of training to attain their level of competence. Although not free from criticism, the public are considerably more forgiving and empathetic to human error rather than computational error which is expected to be faultless.33 While AI is technically able to attain the level of a legal professional given its proven ability to pass the Uniform Bar Exam with a score within the 90th percentile,34 raw legal prowess is an insufficient indicator of appropriate observance of legal norms. Where lawyers are subject to mechanisms of accountability, which form a core administrative legal principle, AI systems are unable to bear significant repercussions for their shortcomings and violations of ethics or proper legal procedure, but rather run the risk of being placed as a liability shield.35 As such the retention of lawyers as a human in the loop remains a necessity in order to protect core legal principles at risk of AI overreach.36 Therefore, lawyers would manifest themselves in a different manner: the arguments they have made, the decisions they were party to, or the precedents they caused to be set are contained in the AI training data. LLM-based access to justice tools will require training on vast amounts of textual data representing community interests through the arguments made by the lawyers representing the community. These models use machine learning techniques to identify patterns in the data and develop a set of rules or patterns that can be used to make predictions or generate new text. These predictions and generated text represent the arguments and decisions that would be made or arrived at by the community, so long as the dataset was generated by the community.

As with any other computer system, an LLM operates solely based on the data to which it has been exposed. These datasets are used to "teach" the model how to recognize patterns and make predictions. But the very nature of modern AI/ML systems means they typically reflect the average of the dataset’s opinions expressed in their training data and struggle to identify special circumstances or edge cases. As such, it is of utmost importance that there are large datasets of multiple cases in order to accurately automate legal predictions and have general applicability.37 Yet even with large datasets this gravitation to the norm is a feature of the neural networks these tools are built upon, making them incapable of accurately applying specific logical processes or accounting for edge cases without them being directly coded into the system.38

If the outputs of the LLM are to be appropriate for a jurisdiction, they must be so on three grounds. First, the LLM training data must reflect the community bounded by that jurisdiction, meaning the model inputs should only be generated by individuals who have met the standards required of representing the community within that jurisdiction. Second, the datasets must be substantial enough to result in generalisable and predictable outcomes based upon that community’s law without reference to law from other jurisdictions that would not ordinarily be cited in traditional legal precedents. And lastly, operational logic reflecting procedure specific to a jurisdiction must be directly encoded for instances when the law clearly requires a known cause to procure a specific effect.

===9. Reformulating Digital Sovereignty===
Protecting communities from the potential harm of AI systems often takes the framing of an outside force acting upon the affected population. In the legal technology vertical, this force can often be seen as anything from profit driven corporations to malevolent State actors.39 This focus on protection from outside forces drives protection efforts towards the concepts of digital sovereignty, at whose heart is the concept of data sovereignty. While reasonable, AI-driven justice technology tools push us to realize that these strategies are fundamentally ineffectual.

Digital sovereignty refers to the idea that nations and individuals should have control over their own digital technologies, data, and infrastructure. The concept of digital sovereignty is based on the idea that the digital world has become a vital part of modern life, and that control over digital technologies and data is essential for maintaining national security, economic competitiveness, and personal privacy. In attempts to exert this control, the focus of digital sovereignty can be framed within the remit of traditional geopolitical sovereignty which has been subject to centuries of prior discourse.40 Here, Krasner’s quadripartite conception of sovereignty can be reworked as a basis to incorporate the challenges presented by an increasing use of AI in the legal profession:41
* Population is conceptualized as control over data. Digital sovereignty emphasizes the importance of individual and national control over personal data and information. This includes data privacy, data protection, and the ability to decide how and when data is collected, used, and shared.
* Territory is conceptualized as control over digital infrastructure. Digital sovereignty also involves control over the infrastructure and systems that support digital technologies. This includes control over networks, servers, and other digital hardware and software.
* Recognition is conceptualized as control over digital governance. Digital sovereignty emphasizes the importance of national sovereignty in digital governance and regulation. This includes the ability of nations to set their own rules and regulations for digital technologies and data, and the ability to enforce those rules and regulations.
* Regulation of borders is conceptualized as protection against cyber threats. Digital sovereignty also involves protecting against cyber threats such as cyber-attacks, cyber espionage, and cyber terrorism. This includes developing robust cybersecurity measures and protocols, and collaborating with other nations to combat cyber threats.

While traditional sovereignty concepts consider the population to be human individuals, digital sovereignty considers data itself to be the population that must be protected through rigorous control.42 When defining this data population, the concept of data sovereignty typically features two unique aspects whose reasonableness AI-driven tools directly challenge:
* Data protection laws. Many countries have implemented data protection laws that regulate the collection, storage, and use of personal data. These laws give individuals control over their personal data and require organizations to obtain consent before collecting and processing personal data, and
* Data localization. Data localization is the practice of requiring that data be stored in a specific geographic location. This allows countries to maintain control over their citizens' data and protect it from foreign governments and companies.

The focus on these two aspects of data sovereignty is typically implemented by governments through restricting what data generated by one person’s existence can be copyrighted by another without the generator’s consent, and restricting the jurisdiction wherein the silicon holding the generated dataset must be physically located.

AI tools challenge the reasonableness of modern data sovereignty constructs because, although they must access the data contained on the silicon that is intended to be protected by the concepts of digital and data sovereignty, the information perceived from an AI tool is a byproduct of the appropriate relationships interpreted between the training data. For United States citizens, this can be illustrated by the difference between an integer 123456789, a person defined by social security number 123-45-6789, and a company defined by employer identification number 12-3456789.

The data generated by an individual is an artifact of their existence and cannot recreate a projection of their existence without the context of the individual. The information associated with this contextually derived assembly of the data is what makes any AI or LLM usable. This is why concepts of data sovereignty when considering the regulation of AI for LegalTech uses require a reconfigured, more appropriate “information sovereignty” concept.

In the same way that the laws of a jurisdiction are only accepted if they reflect the community contained within the jurisdiction, and the laws of a jurisdiction are made by the legal professionals operating within that jurisdiction, an LLM is only appropriate for use within a jurisdiction if the data is assembled in a manner that incorporates the context of the legal professionals from within that jurisdiction. The location of the silicon upon which the data that assembles that data into information resides, or the location of the stochastic datasets that dynamically deploy that data within an AI tool, do pose a risk in the form of model access or reliability. But the appropriateness of an AI tool is based solely on its ability to represent the information gathered through observation of the population it will serve.

LLMs are not at the stage where they can appropriately respond to concerns expressed by the legal community, sufficiently considering these four tenets would go a significant way to addressing these concerns and fortifying trust in AI. Until this is the case, it would be improper to consider LLMs as a sufficient device to contribute meaningfully towards access to justice on more than just a superficial level. Those who cannot afford traditional legal services still deserve representative legal outcomes and rights to due process. Where a case may hinge on a fine
This requires that tool technicality, AI is unlikely to yet have the suitability is defined by the source of information appropriate level of nuance to effectively respond. that was observed through the training of the Whilst this remains the case, this variety of model. technology has not yet sufficiently evolved into a The fact that any LLM is little more than a trusted legal tool. technological mimic of the observations it is fed has become more rapidly understood than possibly any comparable revelation for any other 10. The Risks of Doing Nothing transformative technology.43 This means that, in the same way precedent in a jurisdiction would not be accepted if it was attempted to be made by Shifting industry focus from one of digital sovereignty to information sovereignty will likely a legal practitioner who is not authorized to be a significant effort. In the meantime, the A2J practice in that jurisdiction, an AI LLM that is used by a jurisdiction must be restricted to community will have to grapple with the risks assemblies of data that are deemed appropriate posed by current tools and weigh potential impacts. Doing this requires examining some of because they are trained upon observations of the prevalent comforts, fears, or mitigating practitioners from that jurisdiction. This rethinking of how AI tools should be strategies when considering appropriate strategies jurisdictionally restricted leads to a proposal of with respect to AI integration without information sovereignty protections into A2J systems. For “information sovereignty” that could be instance, consider the following scenarios: represented as: • Population. Model training must be • “Drafting demand letter or limited to observations or interactions communications can be done safely with individuals from that jurisdiction. because it will always be reviewed before • Territory. 
The jurisdiction is not they go anywhere.” As the world recently geographically constrained but instead observed in Mata v. Avianca, Inc.,44 even inclusive of practitioners and systems lawyers who are paid their full rate may operating within its represented have a tendency to rely too heavily on a community. technology that convincingly mimics • Recognition. System outputs must be intelligence. In Mata v Avianca, Inc., a sufficiently auditable to verify that it is brief filed with the court contained consistently reflecting an appropriate multiple citations that were invented by representation of community accepted ChatGPT by combining fragments of real practitioners. training data. The likelihood of AI • Regulation of borders. System outputs generated drafts being given a less than must be sufficiently immutable to prevent appropriate review significantly increases modification when transferred across when a case is being handled pro bono. systems. In following this structure, AI could be used in When an AI system is so convincing and such a way that it does not harm the democratic the outputs are not jurisdictionally foundations of a community nor lead to constrained, A2J is depending on a pro unfounded or unrepresentative outcomes. Since bono attorney becomes effectively an on- the-loop, active safeing system that must idea that there are enough people in the perform the labor intensive job of system to catch any errors before they verifying facts in a document that appears produce an impact. Yet in the same way it correct. has been demonstrated that judges will • “Selecting appropriate forms or too often ignore analytics in favor of their appropriate citations can be done now.” own biases a majority of the time when Correct, form selection or citation looking at pretrial diversion programs reference when using appropriate search supposed by AI-enabled risk criteria can be successfully completed evaluations,47 a judge is not infallible today. 
In theory, AI should be able to when spotting unsupported arguments speed up these processes by requiring that could become precedent setting. In fewer less informed inputs from a user to cases where invalid arguments are find the most correct result faster. accepted within the system, the threat of However, unless the system is using the judicial system’s public acceptance details other than those communicated by rapidly grows. However, from an A2J the user to the system through a language- perspective, the court’s rejection of an based search, the model interpreting those invalid argument developed by a search inputs must be built to accurately layperson is likely more immediately reflect the context of that jurisdiction. For damaging because their access to fairness example, damage value and circumstance has been denied due to the AI system play a significant role in understanding misdirecting them in the development of where a case can be filed, with that the brief presented to the court. In both decision often varying by jurisdiction. If scenarios, acceptance or rejection, public the AI model is not trained in a manner confidence is eroded either slowly or that accounts for such nuance, the expert rapidly. system finding the right form with the • "The model can just be finetuned to be wrong context could result in justice safer.” ChatGPT has proven that any being denied. system which is probabilistically • “Providing legal information through assembling responses to prompts can tools like Chatbots is a straightforward easily produce erroneous answers. 
While exercise that poses little risk.” Apart from many of these answers may seem to the fears of bias and inaccuracy that have provide information that goes beyond been well documented in legal chatbot what is contained in the training data, this use cases,45 the experience of New Jersey interpretation is the technological Courts in building its Judiciary equivalent of observing dinosaurs in the Information Attendant (JIA) clouds. Since these erroneous answers are demonstrated that unanticipated partly due to the inappropriateness of the questions could require up to 70% of dataset, finetuning the dataset through inquiries be responded to by human weighting or censoring is not a sufficient attendants.46 Where the JIA design sent solution. Controlling a probabilistic inquiries to a call center when answers system by reducing a probability does not fell outside of rigid parameters, the eliminate its potential to emerge, which is nimbleness of an AI-powered chatbot why tools like ChatGPT can still believe could allow the system to more often the 2+3 could equal 87.48 AI tools for A2J believe it is fully understanding the applications will not only need to have inquiry in a manner that leads to a false clear acceptability boundaries more akin response. to expert systems than ChatGPT-style AI, • "A poorly written brief poses little risk these protections need to be and will not be precedent setting.” The jurisdictionally bounded with logical risk posed by poorly cited or constructed relationship appropriate to a jurisdiction arguments is often dismissed based on the included in their evaluative structure in order to be sure that any result accurately sovereignty to act as a bulwark for the protection reflects a valid outcome. of democracy and the individual. This is based upon the importance of limiting the model’s training to observing individuals from the 11. 
Concluding Remarks population in question, including the practitioners and systems operating with that territory, Whilst ostensibly the use of AI tools presents providing accountability through the recognition significant opportunities, at present it is plagued of reflecting the outputs of practitioners within with risks and inconsistencies that would further that community whilst in doing so providing jeopardize A2J in the long term if left sufficiently immutable outputs to prevent unaddressed. By permitting an undeveloped modification outside regulated borders. system to act in lieu of the services of a legal Although shifting the focus from digital to professional, those who cannot afford a lawyer are informational sovereignty will be subject to directly disadvantaged with the less than incremental change, this adapted criteria for the normative creation of further barriers to A2J. As training of LLM’s would be appropriate such, the improper use of AI tools as a reassurance for communities to consider the use replacement for conventional legal services has of AI tools in such a manner that would accelerate far-reaching implications, impacting the rather than inhibit A2J. In the meantime, individual, their community and the traditional mitigation of the risks is paramount given the conception of the state. It is posited this will invention of false evidence by LLM tools like transpire primarily through jurisdictional ChatGPT, the lack of predictability and accuracy overreach of AI tools that pose the substantial risk in outcomes and bias that threaten due process and of blurring the delimitations of community law A2J in legal systems. through datasets that fail to differentiate along jurisdictional boundaries. The proposed starting point for a solution is set forth as a new conception of informational services/service/ll/llglrd/2019668143/201966814 2 H. Ruschemeier, ‘Squaring the Circle’ 3.pdf (last accessed 8th May 2023) p. 
1-2 https://verfassungsblog.de/squaring-the-circle/ 8 A. Telang, ‘The Promise and Peril of AI (last accessed 8th May 2023) Legal Services to Equalize Justice’ 3 ABA Journal – D. Cassens Weiss, ‘Latest https://jolt.law.harvard.edu/digest/the-promise- Version of ChatGPT Aces Bar Exam With Score and-peril-of-ai-legal-services-to-equalize-justice Nearing 90th Percentile’ (last accessed 8th May 2023) https://www.abajournal.com/web/article/latest- 9 A. Reichman and G. Sartor, ‘Algorithms and version-of-chatgpt-aces-the-bar-exam-with- Regulation‘ within ‘Constitutional Challenges in score-in-90th-percentile (last accessed 9th May the Algorithmic Society’ eds H-W. Micklitz, O. 2023) Pollicino, A. Reichman, A. Simoncini, G. Sartor 4 J. Villasenor, ‘How AI Will Revolutionize and G. De Gregorio (Cambridge University Press, the Practice of Law’ 2022) p. 157 https://www.brookings.edu/blog/techtank/2023/0 10 C. Gans-Combe, ‘Automated Justice: Issues, 3/20/how-ai-will-revolutionize-the-practice-of- Benefits and Risks in the Use of Artificial law/ (last accessed 8th May 2023) Intelligence and Its Algorithms in Access to 5 A. Buccella, ‘’AI For All’ Is A Matter of Justice and Law Enforcement’ within ‘Ethics, Social Justice’ (2022) AI and Ethics Integrity and Policymaking: The Value of the 6 H. Kanu, ‘Artificial Intelligence Poised to Case Study’ eds D. O’Mathuna & R. Iphofen Hinder, Not Help Access to Justice’ (Springer, 2022) p. 175 https://www.reuters.com/legal/transactional/artifi 11 R. Rodrigues, ‘Legal and Human Rights cial-intelligence-poised-hinder-not-help-access- Issues of AI: Gaps, Challenges and justice-2023-04-25/ (last accessed 8th May 2023) Vulnerabilities’ (2020) Journal of Responsible 7 Law Library: Library of Congress, Technology 4 100005 ‘Regulation of Artificial Intelligence in Selected 12 United Nations Office on Drugs and Crime, Jurisdictions’ https://tile.loc.gov/storage- ‘Artificial Intelligence: A New Trojan Horse for Undue Influence on Judiciaries’ 26 J. 
Raz, ‘The Rule of Law and its Virtue’ https://www.unodc.org/dohadeclaration/en/news/ within ‘The Authority of Law: Essays on Law and 2019/06/artificial-intelligence_-a-new-trojan- Morality’ (Oxford University Press, 1979) p. 210 27 horse-for-undue-influence-on-judiciaries.html O. Pollicino & G. De Gregorio, (last accessed 9th May 2023) ‘Constitutional Law in the Algorithmic Society’ 13 J. Soh Tsin Howe, ‘Building Legal Datasets’ within ‘Constitutional Challenges in the https://datacentricai.org/neurips21/papers/74_Ca Algorithmic Society’ eds H-W. Micklitz, O. meraReady_building-legal-datasets- Pollicino, A. Reichman, A. Simoncini, G. Sartor CamReady.pdf p. 1-2 and G. De Gregorio (Cambridge University Press, 14 S. Wolfram, ‘What Is ChatGPT Doing… 2022) p. 7 and Why Does it Work?’ (Wolfram Media, 2023) 28 F. Ghodoosi, ‘The Concept of Public Policy 15 M. Kusak, ‘Quality of Data Sets That Feed in Law: Revisiting the Role of the Public Policy AI and Big Data Applications Enforcement’ Doctrine in the Enforcement of Private Legal (2022) ERA Forum 23 p. 209 Arrangements’ (2016) Nebraska Law Review 16 Law Society Gazette, ‘Will LawTech 94(68) p. 690 Extend Justice or Deepen the Digital Divide?’ 29 M. Rotenberg, "Stifled Justice: The https://www.lawsociety.ie/gazette/top- Unauthorized Practice of Law and Internet Legal stories2/will-lawtech-increase-access-to-justice- Resources" (2012). Minnesota Law Review. 347 or-deepen-the-digital-divide (last accessed 8th p. 731 May 2023) 30 Judicature – D. F. Levi, D. Remus & A. 17 S. Rosengrun, ‘Why AI is a Threat to the Frisch, ‘Reclaiming the Role of Lawyers as Rule of Law’ (2022) Digital Society 1(10) p. 10 Community Connectors’ 18 O. Pollicino & G. De Gregorio, https://judicature.duke.edu/articles/reclaiming- ‘Constitutional Law in the Algorithmic Society’ the-role-of-lawyers-as-community-connectors/ within ‘Constitutional Challenges in the (last accessed 15th May 2023) Algorithmic Society’ eds H-W. Micklitz, O. 
31 American Bar Association, ‘Model Rules of Pollicino, A. Reichman, A. Simoncini, G. Sartor Professional Conduct – Table of Contents’ and G. De Gregorio (Cambridge University Press, https://www.americanbar.org/groups/professiona 2022) p. 7 l_responsibility/publications/model_rules_of_pro 19 M. Catanzariti, ‘Algorithmic Law: Law fessional_conduct/model_rules_of_professional_ Production by Data or Data Production by Law?’ conduct_table_of_contents/ (last accessed 15th within ‘Constitutional Challenges in the May 2023) Algorithmic Society’ eds H-W. Micklitz, O. 32 A. Buccella, ‘’AI For All’ Is A Matter of Pollicino, A. Reichman, A. Simoncini, G. Sartor Social Justice’ (2022) AI and Ethics and G. De Gregorio (Cambridge University Press, 33 A. Reichman & G. Sartor, ‘Algorithms and 2022) p. 89 Regulations’ within ‘Constitutional Challenges in 20 R. Michaels, ‘Legal Culture’ available at: the Algorithmic Society’ eds H-W. Micklitz, O. https://scholarship.law.duke.edu/cgi/viewcontent. Pollicino, A. Reichman, A. Simoncini, G. Sartor cgi?article=3012&context=faculty_scholarship p. and G. De Gregorio (Cambridge University Press, 1 2022) p. 161 21 National Centre for Access to Justice, ‘What 34 ABA Journal – D. Cassens Weiss, ‘Latest is Access to Justice?’ https://ncaj.org/what- Version of ChatGPT Aces Bar Exam With Score access-justice, (last accessed May 27, 2023) Nearing 90th Percentile’ 22 Office for Access to Justice, ‘About ATJ’ https://www.abajournal.com/web/article/latest- https://www.justice.gov/atj/about-atj (last version-of-chatgpt-aces-the-bar-exam-with- accessed 1st June 2023) score-in-90th-percentile (last accessed 9th May 23 ABA, ‘Access to Justice’ 2023) 35 www.americanbar.org/topics/access/ (last J. J. Bryson, M. E. Diamantis & T. D. Grant, accessed 1st June 2023) ‘Of, for, and by the people: the legal lacuna of 24 B. Jarrett & P. Hyslop, ‘Justice for All: An synthetic persons’ (2017) Artificial Intelligence Indigenous Community-Based Approach to Law 25 p. 
287 Restorative Justice in Alaska’ (2014) Northern 36 A. Reichman & G. Sartor, ‘Algorithms and Review 38 p. 239 Regulations’ within ‘Constitutional Challenges in 25 J. Folberg, ‘A Mediation Overview: History the Algorithmic Society’ eds H-W. Micklitz, O. and Dimension of Practice’ (1983) Mediation Pollicino, A. Reichman, A. Simoncini, G. Sartor Quarterly 1 p. 5 43 Boost.AI ‘What are Large Language Models and G. De Gregorio (Cambridge University Press, and How Do They Work?’ 2022) p. 174 https://www.boost.ai/blog/llms-large-language- 37 A. Reichman & G. Sartor, ‘Algorithms and models (last accessed 16th May 2023) Regulations’ within ‘Constitutional Challenges in 44 E. Volokh, ‘A lawyer’s filing ‘is replete with the Algorithmic Society’ eds H-W. Micklitz, O. citations to non-existent cases’ Thanks, Chat Pollicino, A. Reichman, A. Simoncini, G. Sartor GPT?’ https://reason.com/volokh/2023/05/27/a- and G. De Gregorio (Cambridge University Press, lawyers-filing-is-replete-with-citations-to-non- 2022) p. 157 existent-cases-thanks-chatgpt/ (accessed 28th May 38 S. Wolfram, ‘What Is ChatGPT Doing… 2023) and Why Does it Work?’ (Wolfram Media, 2023) 45 A. Asher-Schapiro & D. Sherfinski, p. 99 ‘Analysis: Chatbots in U.S. Justice System Raise 39 S. Rosengrun, ‘Why AI is a Threat to the Bias, Privacy Concerns’ Rule of Law’ (2022) Digital Society 1(10) p. 9 https://www.reuters.com/legal/litigation/chatbots 40 T. Hobbes, ‘Leviathan’ (Harvard Classics, -us-justice-system-raise-bias-privacy-concerns- 1651) Chapter 13 Para 10; W. A. Dunning, ‘Jean 2022-05-10/ (last accessed 28th May 2023) Bodin on Sovereignty’ (1896) Political Science 46 Joint Technology Committee, ‘Introduction Quarterly 11(1) p. 92 to AI for Courts’ 41 S. 
Krasner, ‘Sovereignty: Organised https://www.ncsc.org/__data/assets/pdf_file/0013 Hypocrisy’ (Princeton University Press, 1999) /20830/2020-04-02-intro-to-ai-for- within this work Krasner sets out four variants of courts_final.pdf (28th May 2023) sovereignty: domestic (exercise of authority 47 American Constitution Society, ‘Roadblock within a territory), interdependence (control over to Reform’ https://www.acslaw.org/wp- cross-border flow), international legal content/uploads/2018/11/RoadblockToReformRe (recognition of territory by other territories) and port.pdf p. 3 (last accessed 28th May 2023) Westphalian (non-intervention by others in the 48 S. Wolfram, ‘What Is ChatGPT Doing… affairs of a territory) and Why Does it Work?’ 42 L. Amoore, ‘Cloud geography: Computing, https://writings.stephenwolfram.com/2023/02/wh data, sovereignty’ (2018) Progress in Human at-is-chatgpt-doing-and-why-does-it-work/ (last Geography 42(1) p. 16 accessed 1st June 2023)