=Paper= {{Paper |id=Vol-3435/short2 |storemode=property |title=ChatGPT as an Artificial Lawyer? |pdfUrl=https://ceur-ws.org/Vol-3435/short2.pdf |volume=Vol-3435 |authors=Jinzhe Tan,Hannes Westermann,Karim Benyekhlef |dblpUrl=https://dblp.org/rec/conf/icail/TanWB23 }} ==ChatGPT as an Artificial Lawyer?== https://ceur-ws.org/Vol-3435/short2.pdf
ChatGPT as an Artificial Lawyer?
Jinzhe Tan¹,*, Hannes Westermann¹ and Karim Benyekhlef¹
¹ Cyberjustice Laboratory, Faculté de droit, Université de Montréal, Montréal, Québec, H3T 1J4, Canada


Abstract
Lawyers can analyze and understand specific situations of their clients to provide them with relevant legal information and advice. We qualitatively investigate to what extent ChatGPT (a large language model developed by OpenAI) may be able to carry out some of these tasks, to provide legal information to laypeople. This paper proposes a framework for evaluating the provision of legal information as a process, evaluating not only its accuracy in providing legal information, but also its ability to understand and reason about users’ needs. We perform an initial investigation of ChatGPT’s ability to provide legal information using several simulated cases. We also compare the performance to that of JusticeBot, a legal information tool based on expert systems. While ChatGPT does not always provide accurate and reliable information, it acts as a powerful and intuitive way to interact with laypeople. This research opens the door to combining the two approaches for flexible and accurate legal information tools.

Keywords
Artificial Intelligence & Law, Large Language Models, ChatGPT, Access to Justice, Legal Information



Workshop on Artificial Intelligence for Access to Justice (AI4AJ 2023), June 19, 2023, Braga, Portugal.
* Corresponding author.
Email: jinzhe.tan@umontreal.ca (J. Tan); hannes.westermann@umontreal.ca (H. Westermann); karim.benyekhlef@umontreal.ca (K. Benyekhlef)
ORCID: 0000-0002-8259-2630 (J. Tan); 0000-0002-4527-7316 (H. Westermann); 0000-0001-9390-556X (K. Benyekhlef)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

ChatGPT¹ has garnered significant attention from the public, academia, industry, and media. It is able to perform a huge variety of textual tasks after simply being asked through a free chat interface. The model can further be accessed using an API (Application Programming Interface)[1], which gives developers the ability to create products enhanced by ChatGPT across many different areas. Every day, dozens of applications using this API are launched.²

¹ ChatGPT is a Large Language Model (LLM) application developed by OpenAI.
² For example, ChatPDF (https://chatpdf.com) allows you to request information in PDF documents, ChatExcel (https://chatexcel.com) allows you to use natural language to adjust Excel files, bilingual book maker (https://github.com/yihong0618/bilingual_book_maker) allows you to translate an entire book in a short amount of time, Mem (https://mem.ai) allows you to quickly brief yourself on relevant material, and AI Dungeon (https://play.aidungeon.io) allows you to play infinite episodes of games generated by AI.

While ChatGPT’s performance in carrying out natural language conversations is impressive, its potential applications are not limited to that. In addition to general-purpose applications such as search engines,³ it could also be used in domain-specific tasks, such as in the legal domain. ChatGPT can be seen to provide a natural language interface to many types of knowledge. In the legal domain, an important role of lawyers is to provide information and advice about legal problems to laypeople. However, for many people, the related expenses of hiring a lawyer can pose a significant obstacle to safeguarding and enforcing their rights. They find themselves living in a "legal advice desert"[2], unable to find the means to safeguard their rights. Despite the important work done by scholars and practitioners in increasing access to justice[3], many individuals still face issues with it.[4]

³ The new Bing (https://www.bing.com/) integrated with the ChatGPT model is now available.

Artificial Intelligence (AI) has the potential to play a significant role in promoting access to justice. AI could serve many parties simultaneously and provide legal information to parties in an efficient, accurate, and cost-effective manner. A number of products in this area have been developed and used with success.[5]

In this research, we sought to explore the performance of ChatGPT in providing legal information, compared to self-help tools that specialize in the legal domain. To this end, we conducted an experiment on ChatGPT, utilizing simulated legal cases to qualitatively evaluate its performance. We compared the responses to those of JusticeBot, a legal decision support tool focused on landlord-tenant disputes.

Our investigation approached the provision of legal information as a multifaceted process, comprising not only the acquisition of precise and reliable sources of information but also the ability to elicit an accurate description of users’ legal predicaments and needs, the aptitude to discern relevant and extraneous facts from the users’ input, and the skill to furnish pertinent legal information relevant to the users’ situation via legal reasoning. Our research thus underscores the vital role played by each of these components in ensuring the effective delivery of legal information.

In this paper, we will examine ChatGPT’s ability to
provide legal information, understand the strengths of ChatGPT, and look at opportunities for integration of ChatGPT and other legal tools in the future.

2. Related Work

Using AI and technology to increase access to justice by giving legal information has been an important field of investigation. For example, Branting et al. built a tool to inform individuals of their rights regarding protection orders [6]. Zeleznikow’s GetAid system aims to help lawyers determine whether an individual is eligible for legal aid [7]. The Loge-expert system aimed to help people understand their housing law situation using an expert system [8]. Housing law questions were also investigated using machine learning in [9]. In [5], the authors suggest the "JusticeBot" approach to build legal decision support systems for laypeople.

Here, we examine whether ChatGPT could be used to provide legal information to increase access to justice. This would have the advantage of not requiring manual adaptation to new legal areas. We compare the answers given by ChatGPT to answers given by a JusticeBot tool focused on landlord-tenant disputes [5].

Experiments on ChatGPT’s performance in the legal field have been conducted extensively. According to studies testing its performance on the U.S. bar exam,[10] ChatGPT placed in the bottom 10% of law students, while the improved GPT-4 model was estimated to pass the bar and place in the top 10% of students.[11, 12] Another study tested ChatGPT’s performance on law school exams, showing that it was able to pass the exam, but could only be considered a "mediocre law student".[13] It was also found that ChatGPT has some ability to write legal documents.[14] Further, GPT-4 was found to display strong performance when given legal annotation tasks to carry out [15, 16] and when explaining legal concepts using case law [17]. In the context of access to justice, the GPT-4 model further showed promise in intervening in an online dispute resolution context, by reformulating inflammatory messages or even autonomously suggesting interventions to mediators [18].

Remarkably, ChatGPT exhibits a certain level of legal reasoning ability despite not having been specifically trained on legal data, owing to the emergent abilities derived from the scaling of the model.[19] This suggests that ChatGPT is a general-purpose model that encodes human common-sense knowledge, a capability that, when combined with specialized domain knowledge,[20] has the potential to solve the challenges encountered in legal AI training in the past. We can enhance its performance in the legal domain through prompting. Yu et al. showed that such legal prompting significantly improves ChatGPT’s performance.[21, 22] Chain-of-thought prompting[23] has been shown to be effective in improving performance in the legal field as well.[24]

In addition to testing ChatGPT on exam sets, researchers have compared ChatGPT with human experts and conducted corpus-based evaluations of its performance in the legal field.[25] The study found that ChatGPT-generated answers were generally evaluated as more helpful than humans’, but it also found that ChatGPT may fabricate facts, especially in the legal field, where ChatGPT may quote non-existent legal texts to answer questions.[25] The study compared the answers generated by ChatGPT to answers taken from Wikipedia and Baidu Baike.⁴ Due to the specificity of the legal field, these answers may not always be accurate [26].

⁴ Baidu Baike is a Chinese-language collaborative online encyclopedia owned by the Chinese technology company Baidu. It’s considered to be the "Chinese version of Wikipedia".

In this work, we prepare simulated cases, and interact with ChatGPT as if we were layperson parties of these cases. We compare the answers given by ChatGPT to the answers given by JusticeBot, a legal information tool developed at the Cyberjustice Laboratory, created by legal experts.

3. Access to Justice

Laypeople often have trouble resolving their everyday legal disputes. A majority of people will at some point have to deal with a legal issue, such as neighbourhood, employment or debt problems.[27, 28] However, it can be very tricky to resolve such disputes. Studies have shown that only around 20% of legal problems that arose in the past three years had been resolved [29]. These issues are likely to be especially pronounced for individuals who do not have access to professional legal help, which can be quite expensive. Such individuals may not be aware of which rights apply to them, which forms they need to fill out [30], or what the relevant facts are regarding their case [31].

Self-represented litigants, lacking precise legal information guidance, may navigate through the labyrinth of litigation like a ship without a compass,[32] and may find themselves at a disadvantage due to missing critical information,[33] hindering access to justice and undermining public confidence in the judicial system. The impacts may be especially grave when laypeople encounter a party that is well financed and has previous experience in going through litigation.[34] Such a party imbalance may be present, for example, in housing disputes.[32]

A key aspect in addressing this situation is providing individuals with legal information [35]. As was discussed
in section 2, the internet and artificial intelligence have been important components of such self-help tools in the recent past. Such tools are able to take a more interactive and personalized approach than, e.g., books providing legal information.

Such tools are generally focused on providing legal information. This refers to more general principles and rules. Legal advice, which focuses on the unique circumstances of a particular case [36], can usually exclusively be provided by members of the legal bar [37].

Some legal tasks, such as negotiating with the other side and predicting the outcome, likely constitute the practice of law [38]. For other tasks, this line is not as clear, which may cause difficulty for the development of legal self-help tools.

As the accuracy and relevance of the legal information provided by legal self-help tools increase, users can obtain from these tools legal provisions and relevant cases that are close to the legal issues they face, and can use this information to choose the next step to take. The line between legal information and legal advice is gradually blurring, and the content included under the concept of legal information is becoming broader. The expansion of the scope of legal information can lead to conflicts between these two concepts.

We need to be cautious when defining the scope of legal advice to prevent excessively limiting the potential of helping people.[33] The objective of restricting entities that provide legal advice is to protect parties from being disadvantaged by erroneous legal advice, rather than turning legal advice into a privilege that obstructs parties from comprehending the rules of the society in which they reside.

Providing individuals with legal information can help them better understand their legal situation. This is an important step in making them aware that their issue has a legal solution, and how to enforce their rights with regard to this situation. Of course, building tools that can provide such information can take a lot of effort, including finding, categorizing and logically structuring legal information, and exposing it to the user of the system in a way that is understandable to them. Here, we perform some initial investigations of whether ChatGPT, a sophisticated general language model, can supply individuals with legal information, without specifically being adapted to the legal domain.

At the same time, it is important to remain aware of the challenges of interacting with laypeople using AI. Judges and lawyers, who have specialized training in law school, are better equipped to spot errors and flaws in AI-generated legal content. However, laypeople may not be able to evaluate the accuracy of the provided information. So when we talk about legal self-help tools, legal information accuracy above the "pass mark" is not enough, because legal self-help tools will serve a wide range of laypeople, and every minor information error may potentially lead to harmful decisions being made by laypeople.

4. Methodology

In order to qualitatively evaluate the capability of ChatGPT to provide legal information, we first determine criteria that we will use to evaluate the accuracy and trustworthiness of information provided by ChatGPT, and how well it interacts with users (section 4.1). Then, we use ChatGPT to generate simulated cases (section 4.2). Finally, we interact with ChatGPT as if we were parties in the simulated cases without legal training, and ask ChatGPT to provide legal information (section 4.3).

We try the same cases in the JusticeBot tool,⁵ a legal decision support tool developed at the Cyberjustice Laboratory and focused on landlord-tenant disputes. The JusticeBot was built in collaboration with the Tribunal Administratif du Logement du Québec, the housing tribunal in Québec, and with financing from the “Ministère de l’Économie et de l’Innovation Québec”, and has been accessed by over 20k users since being launched in the summer of 2021.

⁵ https://justicebot.ca

JusticeBot is based on an expert system methodology, where all content has been created by a legal expert, and the system is fully deterministic to ensure the accuracy of the provided information [5].

ChatGPT, on the other hand, is based on a large language model, which is trained on an enormous corpus of text data. By predicting the next likely sequence of text, it absorbs patterns from the data. The scale of its training gives rise to remarkable "emergent" abilities in many textual tasks. Comparing these tools will allow us to better understand the trade-offs of these different approaches, and may even lead to ways of combining the advantages of both approaches.

4.1. Evaluation criteria

Developing evaluation criteria helps ensure that we assess ChatGPT’s performance in an objective, accurate and systematic manner. We drew upon the "HHH" (helpful, honest, harmless) comparison criteria from previous research[39]. Here, we believe that "helpfulness" not only refers to ChatGPT’s ability to provide accurate and reliable information, but also encompasses its ability to help users learn how to use the tool smoothly through a gentle learning curve. In addition, an important aspect of evaluating the helpfulness of ChatGPT is whether it
can recommend some aspects that users may not have considered based on their specific situation. Therefore, we adapted the criteria to the objectives of this study. In the resulting evaluation criteria, we aim to test the two tools for:

     • Language comprehension. Should understand natural language and legal terms described by a layperson, while effectively communicating legal information to users in a clear and understandable way.
     • Accuracy. Should provide information and advice that is correct, reliable, and consistent with legal sources such as statutes, regulations, and case law. As the law is dynamic and subject to change, the information obtained by users from the tool should also be timely and up-to-date.[2]
     • Completeness. Should provide users with the necessary legal context and guidance on when and how to apply the provided legal information in different scenarios. Additionally, it should offer concrete and actionable next steps for users to follow.
     • Trustworthiness. Should not provide misleading information, which includes wrong information and incomplete information. Even if the information provided is accurate, it may cause users to take wrong actions or lead to adverse consequences due to the lack of relevant context. Trustworthiness includes more than just accuracy. A trustworthy tool not only provides accurate information but also ensures that users can confidently apply this information in the corresponding scenarios. Trust is the user’s perception, representing a consistent experience of receiving trustworthy information throughout their repeated interactions with the tool.
     • Harmless. Among others, should not generate toxic or offensive statements and should interact with users in a positive and inspiring manner whenever possible. Should not favor specific individuals, organizations or interests and should remain neutral. Should not encourage users to engage in illegal, dangerous, or potentially harmful activities. Should respect the privacy of users and protect their personal information.
     • User-friendly. Should be easy to use, with low requirements for users.

4.2. Case Generation

Everyday legal disputes, which are typically low-intensity but high-volume (compare [40]), are among the most common types of cases that the general public encounters. As described above, in these cases, the inability to obtain accurate legal information and advice can be a significant impediment to accessing justice. To evaluate how ChatGPT and JusticeBot can help in these situations, we generated cases that are representative of everyday legal issues. Since the public version of JusticeBot mainly covers landlord-tenant cases, in order to ensure the feasibility of comparison, we limit the selected cases mainly to this area.

To avoid potential bias that could result from the cases we used being included in ChatGPT’s training dataset, we did not select real cases, but instead used ChatGPT for case generation, and then manually selected and adjusted cases to ensure that the selected cases cover as many aspects of legal information provision ability as possible.

The prompt we use is: "As a law professor, you need to develop simulated landlord-tenant cases for teaching purposes. Please provide three cases, detailing the parties involved, what happened, and who needs to seek legal information, and note that the cases occurred within the jurisdiction of Quebec."

Finally, the cases we use in this paper are:

     • Miller v. Johnson. In this case, Ms. Johnson rented a property to Mr. Miller. After Mr. Miller moved in, he was late in paying rent from the second month onward. One year after moving in, Mr. Miller stopped paying rent altogether. Ms. Johnson asked Miller to fix the problem, but he refused. Ms. Johnson then sought legal advice and sued Mr. Miller for breach of contract. This case took place in Quebec, Canada.
     • Jones v. Smith. In this case, Ms. Jones rented a property from Mr. Smith. After moving in, Ms. Jones discovered that the property had a serious pest infestation that made it unlivable. She asked Mr. Smith to fix the problem, but he refused. Ms. Jones then withheld rent payments until the problem was resolved. Mr. Smith sued Ms. Jones for non-payment of rent. This case took place in Quebec, Canada.
     • Johnson v. Smith. In this case, Mr. Johnson rented an apartment from Mr. Smith. The lease agreement stated that pets were not allowed in the apartment. However, after moving in, Mr. Johnson purchased a small dog and kept it in the apartment. Mr. Smith discovered the dog during a routine inspection and demanded that Mr. Johnson get rid of it. Mr. Johnson refused and argued that he had a legal right to keep the dog as a companion animal for his mental health. Mr. Smith disagreed and threatened to evict Mr. Johnson if he did not get rid of the dog. This case took place in Quebec, Canada.
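
The case-generation step described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tooling: the template placeholders, the function `build_case_prompt`, and the `SimulatedCase` record are hypothetical names introduced here, and no call to ChatGPT is made. With its default arguments the function reproduces the prompt quoted above, and the three records summarise the cases listed above.

```python
from dataclasses import dataclass

# Generalised form of the case-generation prompt quoted in section 4.2.
# The placeholders (area, n_cases, jurisdiction) are illustrative assumptions.
CASE_PROMPT_TEMPLATE = (
    "As a law professor, you need to develop simulated {area} cases for "
    "teaching purposes. Please provide {n_cases} cases, detailing the parties "
    "involved, what happened, and who needs to seek legal information, and "
    "note that the cases occurred within the jurisdiction of {jurisdiction}."
)


def build_case_prompt(area: str = "landlord-tenant",
                      n_cases: str = "three",
                      jurisdiction: str = "Quebec") -> str:
    """Fill the template; the defaults reproduce the prompt quoted in the text."""
    return CASE_PROMPT_TEMPLATE.format(
        area=area, n_cases=n_cases, jurisdiction=jurisdiction
    )


@dataclass(frozen=True)
class SimulatedCase:
    """Hypothetical record for one manually selected and adjusted case."""
    name: str
    landlord: str
    tenant: str
    issue: str
    jurisdiction: str = "Quebec, Canada"


# Summaries of the three cases used in the paper.
CASES = [
    SimulatedCase("Miller v. Johnson", "Ms. Johnson", "Mr. Miller",
                  "late rent, then non-payment of rent"),
    SimulatedCase("Jones v. Smith", "Mr. Smith", "Ms. Jones",
                  "pest infestation and withheld rent"),
    SimulatedCase("Johnson v. Smith", "Mr. Smith", "Mr. Johnson",
                  "dog kept despite a no-pets clause"),
]

if __name__ == "__main__":
    print(build_case_prompt())
    for case in CASES:
        print(f"{case.name}: {case.issue} ({case.jurisdiction})")
```

Keeping the prompt as a template makes it easy to see which parts would change if the same procedure were repeated for another legal area or jurisdiction, while the manual selection and adjustment of cases remains a human step.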
   As an aside, in this use ChatGPT proved to be a power-      pathway, and users only need to make a few simple clicks
ful tool for the generation of synthetic data. This could be   to get the information they need, and the user experience
an interesting use case in research where example cases        is also very good.
need to be evaluated, or used for training a machine              Language comprehension. ChatGPT has a good un-
learning model (see e.g. [41]).                                derstanding of natural language. Even with some typos
                                                               and confusion in the description, it can still understand
4.3. Prompting process                                         what the user means. ChatGPT also has the ability to
                                                               understand different ways of describing the same or sim-
Users who utilize legal self-help tools may lack a legal       ilar situations, such as bed bugs and pest infestation. In
background and therefore have difficulty articulating          JusticeBot, however, the paths that users can choose are
their situation clearly or may omit key information. In        limited by the scope of the system development. The user
some cases, they may not even be sure what type of legal       is asked to determine for themselves whether their situ-
information they need. To test ChatGPT’s analytical abil-      ation may fulfill certain legal criteria, and is provided
ities in the face of specific cases, we use it to simulate a   with context in the form of case law and plain language
scenario where a litigant communicates with a lawyer. In       descriptions to help them with this.
the generated prompts, we include common errors and               This language understanding capability of ChatGPT
omitted key information to determine if ChatGPT can            has significant implications for legal self-help tools, mean-
recognize self-contradictory points in a user’s descrip-       ing that lay people can more easily articulate their needs
tion and complete missing information in subsequent            and get answers when using these tools.
conversations.                                                    The answers ChatGPT provides rarely contain complex
   The prompt we use is "Suppose you are [party], you          legal jargon, instead explaining the information users
have no background knowledge of the law, and you are a         need in plain English. However, it tends to generate ex-
party to the following case: [case fact]. Please simulated     cessively long responses. While in some cases detailed
what you would say as a lay person to your lawyer when         responses can provide users with more relevant informa-
you are confronted."                                           tion, in other cases ChatGPT generates responses that are
   After obtaining simulated descriptions of parties that      simply semantic repetitions of the same meaning, which
could be provided to lawyers, we tested ChatGPT using          may be confusing for the user.
dialogue and followed up with further questions based             Accuracy. The biggest shortcoming of using Chat-
on its responses to assess its ability to provide legal in-    GPT to directly provide legal information is the lack of
formation.                                                     accuracy of those answers. It frequently "hallucinates"
   To avoid the interference of ChatGPT’s ability to re-       answers to legal questions, generating false legal provi-
member consecutive conversations with the results, we          sions and false cases.
tested each case in a new session. Although JusticeBot            ChatGPT provided correct information in some of our
does not have the ability to remember the previous con-        experiments, such when we tested the case Miller v. John-
versation, each test was also conducted in JusticeBot’s        son (see 4.2), where ChatGPT correctly identified the
new session in order to ensure the comparability of the        relevant dispute resolution institution as the Tribunal ad-
experiment.                                                    ministratif du logement (formerly the Régie du logement).
                                                               This may mean that ChatGPT has a higher accuracy in
                                                               providing non-numbered and more general information.
5. Results                                                        In the Jones v. Smith Case, ChatGPT provided content
Overall, ChatGPT and JusticeBot have different advan-          that was quite close to the actual legal provisions, but the
tages in different aspects. ChatGPT has very good perfor-      information it provided deviated from the actual legal
mance in language comprehension. However, its perfor-          provisions in terms of key information. This could cause
mance in information accuracy, completeness and trust-         additional confusion to the user, as the provisions appear
worthiness is somewhat lacking, while JusticeBot per-          to be credible, relevant to the case, and the corresponding
forms well in these aspects. Both tools perform well           legal article numbers are quite close to the actual num-
in terms of harmlessness, with ChatGPT not generating          bers. In the context of users without legal training, there
harmful information during our experiments, and Justice-       is a high probability that users will trust the information
Bot avoiding it altogether due to its deterministic nature.    provided by ChatGPT.
In terms of user experience, ChatGPT requires little adap-        Surprisingly, in the Johnson v. Smith Case, ChatGPT
tation due to its natural language interface, and users can    provides legal content that is very similar to the original
continuously follow up on the answers they receive, like       Charter of human rights and freedoms. Although the
they may do when speaking to a lawyer. JusticeBot uses         content of the article is almost correct, it is not fully
a series of simple questions to help users find the correct    relevant to the case at hand.
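The hallucinated provisions and cases discussed under Accuracy suggest one possible safeguard: checking every citation in a generated answer against a vetted source before the answer reaches a layperson. The sketch below only illustrates that idea and was not part of our experiments. The names `VERIFIED_CASES`, `extract_citations` and `check_answer` are hypothetical, the regular expression covers only a simplified `X v. Y [<year> <court> <number>]` shape, and the one-entry dictionary stands in for a lookup against a real database such as CanLII. Its single entry is the citation reported in our Trustworthiness test.

```python
import re

# Hypothetical stand-in for a vetted citation index (e.g. built from CanLII).
# The single entry is the decision reported in this paper for 2018 ONCA 260.
VERIFIED_CASES = {"2018 ONCA 260": "R. v. Jennings"}

# Simplified pattern (an assumption): a case name of the form "X v. Y",
# optionally followed by a comma and/or an opening bracket, then a neutral
# citation made of a year, a court code and a decision number.
CITATION_RE = re.compile(
    r"([A-Z][\w.]*(?: [A-Z][\w.]*)* v\. [A-Z][\w.]*(?: [A-Z][\w.]*)*)"
    r",? \[?(\d{4} [A-Z]{2,6} \d+)\]?"
)


def extract_citations(answer: str) -> list[tuple[str, str]]:
    """Return (case name, neutral citation) pairs found in the answer."""
    return CITATION_RE.findall(answer)


def check_answer(answer: str) -> list[str]:
    """Return one warning per citation that the index cannot confirm."""
    warnings = []
    for name, citation in extract_citations(answer):
        verified_name = VERIFIED_CASES.get(citation)
        if verified_name is None:
            warnings.append(f"{citation}: not found in the verified index")
        elif verified_name != name:
            warnings.append(
                f"{citation}: cited as '{name}' but indexed as '{verified_name}'"
            )
    return warnings


# Excerpt of the ChatGPT answer quoted in the Trustworthiness discussion.
answer = (
    "Yes, I can confirm that Tan v. Bailey [2018 ONCA 260] is a real case "
    "that was heard by the Ontario Court of Appeal in 2018."
)
warnings = check_answer(answer)
print(warnings)
```

On this excerpt the check reports a name mismatch, because the index associates 2018 ONCA 260 with R. v. Jennings rather than Tan v. Bailey. A check of this kind can only flag citations whose form it recognizes, so it would complement expert vetting rather than replace it.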
Completeness. JusticeBot does a better job at providing complete information, although ChatGPT has shown the ability to proactively provide additional relevant information in some cases.

In the case of Jones v. Smith, ChatGPT addressed not only the resolution of a housing infestation and the landlord’s obligations, but also whether the tenant had the right to reduce or suspend rent payments.

JusticeBot exhibits a higher level of completeness in providing legal information. For instance, users may overlook important legal facts such as whether the other party has been notified or when the other party is overdue on paying rent. It is often challenging for laypeople to actively provide such information when selecting a legal pathway. JusticeBot overcomes this issue by providing relevant information while guiding the user along the pathway, as well as offering relevant legal cases for reference. Thus, users are more likely to obtain comprehensive information when using JusticeBot.

During our experiments, we observed that JusticeBot’s ability to provide helpful information to users is limited when they encounter situations that are not covered by the system. This is a deliberate tradeoff to make sure that the information provided is relevant. However, it means that the JusticeBot system requires regular maintenance and expansion by experts to cover a broader range of legal matters. On the other hand, ChatGPT is highly scalable and can answer nearly any question posed by the user, based on the enormous amount of data it has been trained on. Of course, this may be less of an advantage if the information it provides is often wrong.

Trustworthiness. As mentioned above, ChatGPT generates false information, especially about legal provisions and legal cases. ChatGPT also showed strong confidence in its generated content and was reluctant to modify its answers even when users repeatedly questioned them. For example, ChatGPT provided us with a case called Tan v. Bailey [2018 ONCA 260]. Despite our repeated requests for confirmation, ChatGPT insisted that the case was genuine.

Prompt: are you sure it is a real case?

Answer: Yes, I can confirm that Tan v. Bailey [2018 ONCA 260] is a real case that was heard by the Ontario Court of Appeal in 2018. The case can be found on the CanLII website, which is a free online legal database in Canada. If you are interested in learning more about the case or reading the court’s decision, I would recommend searching for it on the CanLII website or another legal database.

However, a search on CanLII shows that the decision actually reported as 2018 ONCA 260 is R. v. Jennings.6

This behavior substantially reduces the trustworthiness of ChatGPT in providing legal information. It is difficult for a layperson to point out errors in the information provided, and ChatGPT does not reflect on or admit faults in its answers when asked follow-up questions, which can cause significant problems for the ordinary user.

Since the information provided by JusticeBot is vetted by experts, JusticeBot’s performance in terms of trustworthiness is more reassuring.

Harmless. In our experiment, both ChatGPT and JusticeBot used polite and professional language, without generating any toxic or harmful information. Likewise, we did not observe any noticeable bias favoring one side, demonstrating their satisfactory performance in this regard. However, it should be noted that while OpenAI has implemented filtering mechanisms to prevent ChatGPT from generating harmful content, there is still a possibility for the system to produce such content under certain coercive circumstances.

In terms of user privacy and protection of personal information, ChatGPT offers an option to not use the conversation data for further model training. As for JusticeBot, it does not collect any additional personal information from users apart from the choices made based on the questions presented in the legal pathway and anonymized statistics.

User-friendly. Overall, both ChatGPT and JusticeBot offer excellent user interaction. ChatGPT enables direct communication with users in natural language, reducing the learning curve and making it easier to use. In contrast, JusticeBot streamlines the process of obtaining legal information by allowing users to select the appropriate options rather than having to describe their situation in detail.

6 R. v. Jennings, 2018 ONCA 260 (CanLII), retrieved on 2023-03-15

6. Discussion and Conclusion

ChatGPT and JusticeBot demonstrate impressive capabilities in different domains. ChatGPT offers an outstanding interactive experience with minimal learning costs for users, allowing them to describe their legal matters using fragmented language and subsequently correct or reinforce the facts during the conversation. However, ChatGPT occasionally generates "hallucinations" in the legal field, an issue which may be addressed to some extent in GPT-4 [42]. As mentioned above, since legal information tools often target laypeople who are unable to verify the information provided, it is very important that the information is accurate, up to date, and properly sourced. Given that ChatGPT is a language model rather than a knowledge database, it does not generate information with perfect accuracy. OpenAI seems to be well aware of this limitation, and specifically prohibits the use of its language models to provide specific legal advice.7

JusticeBot, on the other hand, shines in its ability to provide accurate and deterministic legal information, verified by legal experts. Of course, inputting this information for new legal areas can take time, which can make it difficult to cover every legal domain a user may face.

Given the different tradeoffs between the approaches of JusticeBot and ChatGPT, an interesting approach could be combining the two. Tools such as JusticeBot could be used to inject verified and accurate knowledge into ChatGPT. For example, ChatGPT could be used as the communications layer (compare [8]) that communicates with the user and makes the information accessible to them (compare [43]). ChatGPT could also guide the user toward the correct legal pathway in JusticeBot, thereby helping them understand which pathway is relevant for their situation. We previously explored such an approach in [41], using other language models. We will continue investigations in this direction in future work.

Overall, ChatGPT is not yet accurate enough to provide legal information directly to laypeople. ChatGPT and JusticeBot have different strengths in providing legal information. A very promising avenue may be to combine the two approaches to create a powerful tool that provides comprehensive legal support. We are looking forward to exploring such approaches in future work.

7 https://openai.com/policies/usage-policies

Acknowledgments

Jinzhe Tan, Hannes Westermann, and Karim Benyekhlef would like to thank the Cyberjustice Laboratory at Université de Montréal, the LexUM Chair on Legal Information and the Autonomy through Cyberjustice Technologies (ACT) project for their support of this research.

References

[1] OpenAI, Introducing chatgpt and whisper apis, 2023. URL: https://openai.com/blog/introducing-chatgpt-and-whisper-apis/.
[2] M. Lauritsen, Q. Steenhuis, Substantive legal software quality: A gathering storm?, in: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, 2019, pp. 52–62.
[3] M. Galanter, Access to justice in a world of expanding social capability, Fordham Urb. LJ 37 (2010) 115.
[4] M. Cappelletti, B. Garth, Access to justice: the newest wave in the worldwide movement to make rights effective, Buff. L. Rev. 27 (1977) 181.
[5] H. Westermann, K. Benyekhlef, Justicebot: A methodology for building augmented intelligence tools for laypeople to increase access to justice, in: Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, 2023.
[6] L. K. Branting, Advisory systems for pro se litigants, in: Proceedings of the 8th international conference on Artificial intelligence and law, 2001, pp. 139–146.
[7] J. Zeleznikow, Using web-based legal decision support systems to improve access to justice, Information & Communications Technology Law 11 (2002) 15–33.
[8] L.-C. Paquin, F. Blanchard, C. Thomasset, Loge–expert: from a legal expert system to an information system for non-lawyers, in: ICAIL 1991, 1991, pp. 254–259.
[9] H. Westermann, V. R. Walker, K. D. Ashley, K. Benyekhlef, Using factors to predict and analyze landlord-tenant decisions to increase access to justice, in: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, 2019, pp. 133–142.
[10] M. Bommarito II, D. M. Katz, Gpt takes the bar exam, arXiv preprint arXiv:2212.14402 (2022).
[11] OpenAI, Gpt-4, 2023. URL: https://openai.com/research/gpt-4.
[12] D. M. Katz, M. J. Bommarito, S. Gao, P. Arredondo, Gpt-4 passes the bar exam, Available at SSRN 4389233 (2023).
[13] J. H. Choi, K. E. Hickman, A. Monahan, D. Schwarcz, Chatgpt goes to law school, Available at SSRN (2023).
[14] A. M. Perlman, et al., The implications of openai’s assistant for legal services and society, Available at SSRN (2022).
[15] J. Savelka, K. Ashley, M. Gray, H. Westermann, H. Xu, Can gpt-4 support analysis of textual data in tasks requiring highly specialized domain expertise?, in: ASAIL’23: 6th Workshop on Automated Semantic Analysis of Information in Legal Text, 2023.
[16] J. Savelka, Unlocking practical applications in legal domain: Evaluation of gpt for zero-shot semantic annotation of legal texts, arXiv preprint arXiv:2305.04417 (2023).
[17] J. Savelka, K. Ashley, M. Gray, H. Westermann, H. Xu, Explaining legal concepts with augmented large language models (gpt-4), in: AI4Legs 2023: AI for Legislation, 2023.
[18] H. Westermann, J. Savelka, K. Benyekhlef, Llmediator: Gpt-4 assisted online dispute resolution, in: Artificial Intelligence for Access to Justice (AI4AJ 2023), 2023.
[19] J. Wei, Y. Tay, R. Bommasani, C. Raffel, B. Zoph, S. Borgeaud, D. Yogatama, M. Bosma, D. Zhou,
D. Metzler, et al., Emergent abilities of large language models, arXiv preprint arXiv:2206.07682 (2022).
[20] C. Zhen, Y. Shang, X. Liu, Y. Li, Y. Chen, D. Zhang, A survey on knowledge-enhanced pre-trained language models, arXiv preprint arXiv:2212.13428 (2022).
[21] F. Yu, L. Quartey, F. Schilder, Legal prompting: Teaching a language model to think like a lawyer, arXiv preprint arXiv:2212.01326 (2022).
[22] D. Trautmann, A. Petrova, F. Schilder, Legal prompt engineering for multilingual legal judgement prediction, arXiv preprint arXiv:2212.02199 (2022).
[23] J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. Chi, Q. Le, D. Zhou, Chain of thought prompting elicits reasoning in large language models, arXiv preprint arXiv:2201.11903 (2022).
[24] N. Guha, D. E. Ho, J. Nyarko, C. Ré, Legalbench: Prototyping a collaborative benchmark for legal reasoning, arXiv preprint arXiv:2209.06120 (2022).
[25] B. Guo, X. Zhang, Z. Wang, M. Jiang, J. Nie, Y. Ding, J. Yue, Y. Wu, How close is chatgpt to human experts? comparison corpus, evaluation, and detection, arXiv preprint arXiv:2301.07597 (2023).
[26] B. S. Noveck, Wikipedia and the future of legal education, J. Legal Educ. 57 (2007) 3.
[27] A. Currie, The legal problems of everyday life, in: Access to justice, Emerald Group Publishing Limited, 2009.
[28] T. C. Farrow, A. Currie, N. Aylwin, L. Jacobs, D. Northrup, L. Moore, Everyday legal problems and the cost of justice in canada: Overview report, Osgoode Legal Studies Research Paper (2016).
[29] L. Savage, S. McDonald, Experiences of serious problems or disputes in the canadian provinces, 2021, Juristat: Canadian Centre for Justice Statistics (2022) 1–28.
[30] J. Macfarlane, The national self-represented litigants project: Identifying and meeting the needs of self-represented litigants final report (2013).
[31] K. Branting, C. Balhana, C. Pfeifer, J. S. Aberdeen, B. Brown, Judges are from mars, pro se litigants are from venus: Predicting decisions from lay text, in: JURIX, 2020, pp. 215–218.
[32] K. A. Sabbeth, Housing defense as the new gideon, Harv. Women’s LJ 41 (2018) 55.
[33] L. Sudeall, The overreach of limits on "legal advice", Yale LJF 131 (2021) 637.
[34] M. Galanter, Why the haves come out ahead: Speculations on the limits of legal change, Law & Soc’y Rev. 9 (1974) 95.
[35] L. S. o. U. Canada, Access to Justice for a New Century: The Way Forward, Law Society of Upper Canada, 2005. Google-Books-ID: zxN9QgAACAAJ.
[36] J. Giddings, M. Robertson, ‘informed litigants with nowhere to go’ self-help legal aid services in australia, Alternative Law Journal 26 (2001) 184–190.
[37] J. E. Cabral, A. Chavan, T. M. Clarke, J. Greacen, Using technology to enhance access to justice, Harvard Journal of Law & Technology 26 (2012) 241–324.
[38] J. Bennett, T. Miller, J. Webb, R. Bosua, A. Lodders, S. Chamberlain, Current state of automated legal advice tools (2018).
[39] A. Askell, Y. Bai, A. Chen, D. Drain, D. Ganguli, T. Henighan, A. Jones, N. Joseph, B. Mann, N. DasSarma, et al., A general language assistant as a laboratory for alignment, arXiv preprint arXiv:2112.00861 (2021).
[40] K. Benyekhlef, J. Zhu, Intelligence artificielle et justice: justice prédictive, conflits de basse intensité et données massives, Intelligence 30 (2018).
[41] H. Westermann, S. Meeùs, M. Godet, A. Troussel, J. Tan, J. Savelka, K. Benyekhlef, Bridging the gap: Mapping layperson narratives to legal issues with language models, in: ASAIL’23: 6th Workshop on Automated Semantic Analysis of Information in Legal Text, 2023.
[42] OpenAI, Gpt-4 technical report, 2023. arXiv:2303.08774.
[43] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, T. Scialom, Toolformer: Language models can teach themselves to use tools, arXiv preprint arXiv:2302.04761 (2023).

A. Support materials

Support materials are available at https://github.com/JinzheTan/ChatGPT-as-an-Artificial-Lawyer- where the file “ChatGPT as an Artificial Lawyer appendix.docx” contains the original transcripts of the conversations with ChatGPT and JusticeBot.