<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Return of the AI: An Analysis of Legal Research on Artificial Intelligence Using Topic Modeling</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Constanta Rosca</string-name>
          <email>constanta.rosca@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bogdan Covrig</string-name>
          <email>b.covrig@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catalina Goanta</string-name>
          <email>catalina.goanta@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gijs van Dijck</string-name>
          <email>gijs.vandijck@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerasimos Spanakis</string-name>
          <email>jerry.spanakis@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Law &amp; Tech Lab, Maastricht University Maastricht</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>AI research finds itself in the third boom of its history, and in recent years, AI-related themes have gained considerable popularity in new disciplines, such as law. This paper explores what legal research on AI consists of and how it has evolved, while addressing the issues of information retrieval and research duplication. Using Latent Dirichlet Allocation (LDA) topic modeling on a dataset of 3931 journal articles, we explore three questions: (a) Which topics within legal research on AI can be distinguished? (b) When were these topics addressed? and (c) Can similar papers be detected? The topic modeling results in a total of 32 meaningful topics. Additionally, it is found that legal research on AI drastically increased as of 2016, with topics becoming more granular and diverse over time. Finally, a comparison of the similarity assessments produced by the algorithm and a human expert suggests that the assessments often coincide. The results provide insights into how legal research on AI has evolved over time, and support for the development of machine learning and information retrieval tools like LDA that assist in structuring large document collections and identifying relevant articles.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Computing methodologies → Topic modeling; • Applied
computing → Law.</p>
      <p>Keywords: information retrieval, topic modeling, legal research</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>Artificial intelligence (AI) research finds itself in the third boom of
its history, fuelled by increased funding, scientific breakthroughs
such as deep learning, and widespread public speculation relating
to the scope and impact of these breakthroughs. While decades
ago the interest in AI research was generally limited to specific
disciplines (e.g. computer science, philosophy), during the past
years AI-related themes have gained considerable popularity in
new disciplines, such as law.</p>
      <p>
        With over 2500 publications already by the year 2015 referring
to ‘artificial intelligence’ [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] (see also Figure 3 in the Appendix),
it may no longer be realistic to assume that researchers can keep
up with legal research on AI, or the number of publications in
general. Moreover, the recent spike suggests more and more authors
have started writing about AI in law topics, including authors who
have previously not published on such topics. This, in combination
with the inability to keep up with legal research on AI due to its
exponential growth, creates the risk that authors replicate previous
work without being aware of similar previous publications.
      </p>
      <p>This paper aims to explore what legal research on AI consists
of and how it has evolved while addressing the issue of
information retrieval and the risk of research duplication. We develop a
methodology that distinguishes topics in a collection of documents
(in this case journal publications), allows exploring the evolution
of the topics over time, and detects similarity between documents,
with the purpose of providing solutions for reading and
analyzing a number of publications in bulk in ways that humans cannot.
Consequently, we aim to answer the following research questions
(RQ):</p>
      <p>RQ1: Which topics within the field of legal research on AI can
be distinguished (‘What’)? A methodology would provide insight
into how legal research on AI is structured, and it would allow
classifying publications in sub-topics, which would enhance information
retrieval.</p>
      <p>RQ2: What (i.e. about which topics) has been written when
(‘What - When’)? This question contributes to the understanding of
which topics have emerged, remained, or lost the interest of legal
scholars. The analysis of the What - When question will provide
information about the evolution of legal research on AI.</p>
      <p>RQ3: Can similar papers be detected? Considering the sharp
increase in publications, it may become increasingly difficult
to find publications on similar research questions, which may even
result in reproduction of scholarship because prior publications
are overlooked. This question explores a methodology that allows
detecting thematically similar documents in a given corpus.
</p>
    </sec>
    <sec id="sec-3">
      <title>BACKGROUND</title>
      <p>
        One of the innovations in this paper is to use unsupervised
machine learning to categorize legal research on AI and to map how
legal scholarship on this topic has developed. The idea of smart
literature reviews has received prior attention, on the grounds
that manual searches for existing literature, especially in mature
domains where scholarship is abundant, are not efficient and might
have a negative impact on the quality of new research [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Similar
approaches have been taken to map computer science literature
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], as well as communications research [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>
        To the best of our knowledge, legal scholarship as such has
not yet been analyzed using LDA. Within the narrow confines of
the field of research labeled as AI and Law, a lot of legal research
on information retrieval has been published in the past decades.
Notwithstanding that the volume of scholarship on information
retrieval pertaining to the discipline of computer science alone is
vast to say the least [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], particularly when applied to the legal
field, such research focused on the availability of legal information
[
        <xref ref-type="bibr" rid="ref20 ref21 ref28 ref32 ref34 ref46 ref5 ref55">5, 20, 21, 28, 32, 34, 46, 55</xref>
        ], search systems and search strategies [
        <xref ref-type="bibr" rid="ref16 ref22 ref30 ref43 ref47 ref57">16,
22, 30, 43, 47, 57</xref>
        ], information processing [
        <xref ref-type="bibr" rid="ref17 ref18 ref25 ref33 ref36 ref37 ref50 ref6">6, 17, 18, 25, 33, 36, 37, 50</xref>
        ],
and the role of legal publishers [
        <xref ref-type="bibr" rid="ref1 ref26 ref3">1, 3, 26</xref>
        ]. These publications address
a variety of issues and questions, including the sustainability of
publicly available legal repositories (e.g. AustLII), the importance of
natural language processing when searching for legal information,
the performance of online searches compared to searching through
paper, how citation analysis may be used to improve search results,
the role of legal publishers in this and, more generally, the impact
of automation on how the law is analyzed and applied.
      </p>
      <p>
        Still, the question of how to capture and visualize the
development of an entire legal sub-field like legal research on AI has
barely been explored. One of the avenues of exploration has been
shaped by Bench-Capon et al., who have previously focused on
the proceedings of the International Conference on AI and Law,
first held in 1987, to make a 25-year retrospective of the research
generated therein by describing the scholarship progress through
illustrative papers selected from various editions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Other studies
focused on traditional (systematic) literature reviews on the impact
of AI on specific legal domains, such as administrative law [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ] or
intellectual property [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
      <p>However, given the limitations of legal databases or the way
they are used by researchers, making comprehensive overviews of
existing literature remains a considerable hurdle. This is all the more
so in the past four years, when the production of legal scholarship
on artificial intelligence seems to have grown considerably (see
Figure 3 in the Appendix). This paper aims to fill this research
gap by proposing and testing an unsupervised machine learning
approach to the clustering of literature in this field, namely topic
modeling.
</p>
    </sec>
    <sec id="sec-4">
      <title>METHODOLOGY</title>
    </sec>
    <sec id="sec-5">
      <title>Corpus</title>
      <p>The corpus includes a total of 3931 journal articles obtained from the
HeinOnline database. Absent a centralised, comprehensive, open
access repository for international legal scholarship, we focused
on one of the commercially available databases. HeinOnline is one
of the leading international databases on legal materials, which
contains over 170 million pages of literature and indexes over
2700 law journals. In the section ‘Law Journal Library’, section
type ‘Articles’, the corpus covers literature available in the database
between 1960 and 2018. Unlike arXiv, HeinOnline does not have
a section on ‘artificial intelligence’. The total number of retrieved
articles reflects the results of a boolean search using the keywords
‘artificial intelligence’, namely all articles which include both terms.</p>
      <p>Resulting articles therefore discuss a wide array of aspects
relating to artificial intelligence, and do not, as such, focus on specific
technical or legal issues. For the purpose of this study, we assume
that even one reference to the keywords is sufficient to include an
article in the corpus.</p>
      <p>The articles in the corpus follow a power law distribution, where
a relatively large number of publications is written by a low number
of authors, and few publications by many authors (see Figure 4 in
the Appendix). The same distribution applies to the number of
publications per author(s) with roughly 28 authors having more
than 5 publications (see Figure 5 in the Appendix).</p>
      <p>
        3.2 Topic Modeling.
3.2.1 Latent Dirichlet Allocation. Latent Dirichlet Allocation (LDA)
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] was used to identify topics in the corpus. LDA has been applied to
many different use cases. Examples concern organizing large document
collections in order to improve search and retrieval of information,
summarization of large textual data, and even image clustering. In
the legal domain, LDA has been used to study the agenda of the US
Supreme Court [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], the High Court of Australia [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and the Court
of Justice of the European Union (CJEU) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Winkels [
        <xref ref-type="bibr" rid="ref58">58</xref>
        ] used
LDA to build a recommender system for Dutch case law. Panagis
and Sadl [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] combined network analysis and LDA to study the
case-law generative process of the CJEU.
      </p>
      <p>
        LDA is a generative, probabilistic model for a collection of
documents, which are represented as mixtures of latent topics, where
each topic is characterized by a distribution over all words of the
collection of documents. The basic representation unit of the
documents is the word, i.e. all distinct terms are extracted from the
document collection along with their frequencies (per document),
which is the so called ‘Bag-of-Words’ model [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ].
      </p>
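      <p>The generative view of LDA can be illustrated with a minimal sketch. It is not the implementation used in this study: the vocabulary, the two topic-word distributions, the document length and the Dirichlet parameter are all hypothetical values chosen for illustration, and only the Python standard library is assumed. The sketch draws a topic mixture for one document, then draws each word from a topic sampled according to that mixture, and records the result as a bag of words.</p>
      <preformat><![CDATA[
```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [x / total for x in draws]


def generate_document(topic_word, vocab, doc_length, alpha, rng):
    """Generate one bag-of-words document under the LDA generative story."""
    num_topics = len(topic_word)
    # Step 1: draw the document's mixture over topics.
    doc_topic = sample_dirichlet([alpha] * num_topics, rng)
    counts = {}
    for _ in range(doc_length):
        # Step 2: pick a topic for this word position, then a word from that topic.
        topic = rng.choices(range(num_topics), weights=doc_topic)[0]
        word = rng.choices(vocab, weights=topic_word[topic])[0]
        counts[word] = counts.get(word, 0) + 1
    return doc_topic, counts


rng = random.Random(0)
# Hypothetical vocabulary and topic-word distributions (each row sums to 1).
vocab = ["weapon", "military", "war", "copyright", "author", "software"]
topic_word = [
    [0.40, 0.30, 0.30, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.00, 0.40, 0.30, 0.30],
]
doc_topic, bag_of_words = generate_document(topic_word, vocab, 50, 0.5, rng)
print(doc_topic, bag_of_words)
```
]]></preformat>
      <p>Fitting LDA amounts to reversing this procedure: given only the observed word counts, the algorithm estimates topic-word and document-topic distributions that are likely to have produced them.</p>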
      <p>
        On a conceptual level, the algorithm tries to discover topics that
can represent the collection of documents. Each document is
generated from a mixture of these topics and each topic is generated
from a probability vector (distribution) over all words. Assuming
such a generative model for any collection of documents, LDA’s
goal is to try and ‘backtrack’ this process, i.e. find a set of topics
that are likely to have generated the whole document collection.
      </p>
      <p>
3.2.2 LDA Variants. In this paper, we used the model of Blei et al.
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], although we explored other variations as well. Stevens et al.
[
        <xref ref-type="bibr" rid="ref48">48</xref>
        ] present a qualitative comparison of different topic model algorithms,
also using different evaluation metrics [
        <xref ref-type="bibr" rid="ref31 ref35">31, 35</xref>
        ] and conclude
that, given the same data and the same number of topics, LDA is
able to learn more coherent topics than the competing approaches.
      </p>
      <p>
        Nevertheless, there are extensions of the LDA model towards
topic tracking over time [
        <xref ref-type="bibr" rid="ref53 ref56">53, 56</xref>
        ]. However, according to Wang et al.
[
        <xref ref-type="bibr" rid="ref52">52</xref>
        ], these methods deal with constant topics and the timestamps
are used for better discovery. In contrast, Blei et al.
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] present a model for detecting evolving topics in a discrete time space
(Dynamic Topic Modeling, DTM). Here, LDA is used
on topics aggregated in time epochs and a state space model handles
transitions of the topics from one epoch to another.
      </p>
      <p>
        Bruggermann et al. applied DTM on the RCV1 Reuters corpus
(810,000 documents) with weekly time epochs [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Results showed
that the variances within the topics among the time epochs are
marginal. DTM still treats the corpus as a whole and the number of
topics is fixed over all time epochs. Chaney et al. introduce another
extension to LDA that detects events in a large text collection [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Their model adds separate probability distributions for defined
entities and time intervals to the generative process. The model
consists accordingly of general topics, entity-related topics and
topics specific to time intervals. Events are detected as anomalies,
which are identified as temporary deviations from usual behavior.
Usual behavior refers to the topics discussed by the entities. An
anomaly can be detected, whenever these entities change their
topics of discussion significantly and at the same time in a similar
way.
      </p>
      <p>
3.2.3 Output of LDA. LDA can be applied on a corpus and the
output model provides the following distributions:
• t_i = {t_ij}: topic-word vector-distribution, where i denotes
the topic (in total there are K topics) and j denotes the word
(in total there are W words in the collection). The component
t_ij shows the relative weight of word j in topic i.
• d_k = {d_ki}: document-topic vector-distribution, where i
denotes the topic (out of the total K topics) and k denotes a
document. The component d_ki shows the relative weight of
topic i in document k.
      </p>
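      <p>As a rough illustration of how these two outputs can be read, the sketch below stores a topic-word matrix and a document-topic matrix as nested lists and extracts the highest-weighted terms per topic, which is the kind of term list later shown to human coders. All weights, the vocabulary and the helper name top_terms are hypothetical; they are not taken from the fitted model.</p>
      <preformat><![CDATA[
```python
def top_terms(topic_weights, vocab, n):
    """Return the n vocabulary terms with the highest weight in one topic."""
    ranked = sorted(zip(vocab, topic_weights), key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in ranked[:n]]


vocab = ["weapon", "military", "war", "copyright", "author", "software"]

# Hypothetical topic-word weights: one row per topic, one column per word.
topic_word = [
    [0.35, 0.30, 0.25, 0.02, 0.03, 0.05],
    [0.02, 0.03, 0.05, 0.40, 0.30, 0.20],
]

# Hypothetical document-topic weights: one row per document, one column per topic.
doc_topic = [
    [0.90, 0.10],
    [0.25, 0.75],
]

terms_per_topic = [top_terms(row, vocab, 3) for row in topic_word]
print(terms_per_topic)
```
]]></preformat>
      <p>Each row of either matrix sums to one, so the entries can be read as relative weights: the first hypothetical document above is dominated by the first topic, the second by the second topic.</p>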
    </sec>
    <sec id="sec-6">
      <title>RESULTS AND ANALYSIS</title>
    </sec>
    <sec id="sec-7">
      <title>Topics in Legal Research on AI (What?)</title>
      <p>Pre-processing of the corpus involved filtering for articles written
only in the English language and afterwards removing common
English stop-words (the, of, etc.) as well as removing some very
common corpus-specific words (subject, supra, part, etc.).
Moreover, we considered the presence of either unigrams (single words)
or bigrams (two-word sequences) so as to be able to capture concepts
like ‘dispute resolution’.</p>
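      <p>The pre-processing steps can be sketched as follows. This is a simplified illustration rather than the pipeline used for the corpus: the stop-word list is a tiny stand-in for a full English list, the regular-expression tokenizer is an assumption, and forming bigrams after stop-word removal is one possible ordering that the text does not specify.</p>
      <preformat><![CDATA[
```python
import re

# Tiny illustrative stop-word list; a real run would use a full English list.
STOP_WORDS = {"the", "of", "a", "and", "in", "to", "is"}
# Corpus-specific terms named in the text.
CORPUS_WORDS = {"subject", "supra", "part"}


def tokenize(text):
    """Lower-case the text, keep alphabetic tokens, drop stop-words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and t not in CORPUS_WORDS]


def unigrams_and_bigrams(tokens):
    """Return all single tokens followed by all adjacent token pairs."""
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


features = unigrams_and_bigrams(
    tokenize("The role of dispute resolution in the law, supra part 2")
)
print(features)
```
]]></preformat>
      <p>With bigrams included, a concept such as ‘dispute resolution’ becomes a single feature instead of two unrelated words.</p>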
      <p>
        Subsequently, LDA was used to identify topics in the corpus.
For identifying the number of topics (K), we used perplexity [
        <xref ref-type="bibr" rid="ref51">51</xref>
        ]
and coherence [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] measures on a held-out set. We started from
K = 5 and increased it in steps of 5 up to K = 100. A plateau in both
perplexity (178.4) and coherence (0.51) could be observed around
35 or 40 topics, which suggests a computationally optimal number
of topics. Based on this, topic models with 30, 35, 40, and 45 topics
were explored. Two legal researchers inspected the results of the
various topic models in order to determine which number of topics
was substantively the most meaningful. For this, the 20 terms with
the highest weights for each topic were provided to the researchers
(e.g. ‘weapon’, ‘system’, ‘military’, ‘international’, ‘war’ etc.). Based
on this evaluation, a topic model was selected that consisted of 35
topics.
      </p>
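      <p>To make the perplexity criterion concrete, the following sketch computes per-word perplexity from given document-topic and topic-word weights and lists the candidate numbers of topics described above. It is a simplification under stated assumptions: the document-topic mixtures of the held-out documents are taken as given, whereas in practice they have to be inferred, and the function name and toy inputs are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def perplexity(doc_word_counts, doc_topic, topic_word):
    """Per-word perplexity of word counts under an LDA-style mixture model."""
    log_likelihood = 0.0
    total_words = 0
    for counts, mixture in zip(doc_word_counts, doc_topic):
        for word_id, count in counts.items():
            # p(word | document) = sum over topics of
            # (topic weight in document) * (word weight in topic)
            p_word = sum(w * topic_word[i][word_id] for i, w in enumerate(mixture))
            log_likelihood += count * math.log(p_word)
            total_words += count
    return math.exp(-log_likelihood / total_words)


# Candidate numbers of topics: 5, 10, ..., 100, as described in the text.
candidate_k = list(range(5, 101, 5))

# Sanity check: one topic that is uniform over 4 words gives perplexity 4.
uniform_topics = [[0.25] * 4]
held_out = [{0: 2, 3: 1}]  # word id -> count for a single document
value = perplexity(held_out, [[1.0]], uniform_topics)
print(candidate_k, value)
```
]]></preformat>
      <p>Lower perplexity indicates that the model assigns higher probability to unseen text. A plateau across successive values of the number of topics suggests that adding topics no longer improves the fit.</p>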
      <p>The topic validation consisted of two steps. First, three researchers
inspected the 20 terms for each topic in order to label the topic
(e.g. ‘military technology’). Second, paper titles were inspected to
determine whether the paper titles supported the assigned label
- if not, the label would, if possible, be adjusted. In this respect,
the researchers were presented with a list of paper titles for each
topic. The validation process indicated that the vast majority of the
LDA-produced topics were substantively meaningful.</p>
      <p>Table 1 reveals the topics distinguished by the LDA model. It
includes the topic IDs (not meaningful), the ten terms that have the
highest weights in relation to the topic, and the labels the human
coders assigned to the topics.</p>
      <p>Three miscellaneous topics were identified: id21, id25, and id33.
Neither the inspection of the 20 words nor that of the titles resulted in a
substantively meaningful label or description for these topics. It was
decided to not remove the words in these topics and to not re-run
the topic modeling algorithm, as it was expected that the removal
of words could introduce selection bias in the corpus. Moreover,
the identification of the three miscellaneous topics does not affect
the relevance or interpretation of the other topics.</p>
      <p>
        The results show a wide range of topics, varying from tax to
military technology to copyright. Within each topic, diversity could
still be observed. For example, for the topic of algorithmic
decision-making and quantitative methods, paper titles such as ‘An FDA for
Algorithms’ [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ], ‘A Simple Guide to Machine Learning’ [
        <xref ref-type="bibr" rid="ref54">54</xref>
        ], and
‘Lawyer as a Soothsayer: Exploring the Important Role of Outcome
Prediction in the Practice of Law’ [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] can be observed.
      </p>
      <p>An important issue concerned the initial selection of journal
articles. The articles were selected based on the search string ‘artificial
intelligence’. The number of occurrences of this string presumably
is, however, not an entirely accurate proxy for measuring whether
the paper actually is about artificial intelligence. It might be that
papers on public policy or competition &amp; markets mention the term
‘artificial intelligence’, but do not primarily focus on artificial
intelligence or even on technology or digital matters. An additional
selection was therefore required. Consequently, three of the
researchers (authors) independently went through the topic list and
the related words as displayed in Table 1 in order to determine
whether the labels and words are likely to be related to artificial
intelligence, digital, and/or technology. A perfect agreement could
be observed for the vast majority of topics. In the few instances
of disagreement between the coders, the disagreements were
resolved through a brief discussion. Ultimately, a total number of 18
topics was selected (Table 1, in bold). The 18 topics were used in
subsequent analyses.
</p>
    </sec>
    <sec id="sec-8">
      <title>How Did Legal Research on AI Evolve (What - When)?</title>
      <p>To answer this question we needed to determine which of the 18
topics that were selected in the previous section, have gained or lost
attention over time. The first step is to extract the dominant topics
of each paper by using the document-topic vectors d_k (as described
in section 3.2.3). More specifically, we denoted for each document k
the first three dominant topics (1−3), based on their relative weights
(contributions) in descending order in the document-topic vector d_k.
To assess the relevance of each document’s topics, two researchers
manually reviewed the first five topics, sorted by
contribution, from a sample of documents. They agreed that for most
documents, the relevance dropped significantly after the third topic,
since the later topics provide no substantive contribution to the
paper. Based on the results of the inspection, the first three topics
were denoted as dominant. We define the frequency of a topic as
the number of times that a topic is dominant in all articles.</p>
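      <p>The extraction of dominant topics and the resulting topic frequencies can be sketched as follows. The document-topic vectors, the set of selected topic ids and the function names are hypothetical; the sketch only mirrors the procedure described above (take the three highest-weighted topics per document, then count how often each selected topic is dominant).</p>
      <preformat><![CDATA[
```python
from collections import Counter


def dominant_topics(doc_topic_vector, n=3):
    """Return the ids of the n topics with the highest weight, in descending order."""
    ranked = sorted(
        range(len(doc_topic_vector)),
        key=lambda i: doc_topic_vector[i],
        reverse=True,
    )
    return ranked[:n]


def topic_frequencies(doc_topic_vectors, selected_topics, n=3):
    """Count how often each selected topic is among a document's n dominant topics."""
    freq = Counter()
    for vector in doc_topic_vectors:
        for topic in dominant_topics(vector, n):
            if topic in selected_topics:
                freq[topic] += 1
    return freq


# Hypothetical document-topic vectors for two documents and five topics.
docs = [
    [0.50, 0.05, 0.30, 0.10, 0.05],
    [0.05, 0.40, 0.35, 0.15, 0.05],
]
freq = topic_frequencies(docs, selected_topics={0, 2, 3})
print(freq)
```
]]></preformat>
      <p>Grouping these counts by publication year then yields the number of papers addressing each topic per year.</p>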
      <p>Linking the topic frequencies to the year of publication, the count
of papers addressing each topic every year was computed. Figure 1
depicts how the 18 selected topics evolved over time (1960 - 2018).
</p>
      <p>
        Figure 2 shows how the topics have evolved in relation to other
topics across six major periods in AI history (taken from [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]).
      </p>
      <p>As a measure of relative topic popularity during a certain period,
the ratio of papers concerned with each topic to the total number of
papers written in a certain period is computed. Caution needs to be
exercised when interpreting the relative topic popularity.
Considering the relatively low number of publications before approximately
1986 (see Figure 1), just before the second AI winter, not much
weight should be given to the topic popularity in those years, as
small changes in the number of publications can have substantial
impact on the popularity of one topic relative to other topics.</p>
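      <p>The relative popularity measure is a simple ratio, sketched below for a hypothetical period with four papers. The topic names and paper sets are illustrative only. The example also shows why periods with few publications call for caution: with four papers, a single paper shifts a topic's share by 25 percentage points.</p>
      <preformat><![CDATA[
```python
def relative_popularity(period_papers):
    """Share of papers in a period whose dominant topics include each topic.

    period_papers is a list with one set of dominant topics per paper.
    """
    total = len(period_papers)
    topics = set().union(*period_papers)
    return {
        topic: sum(topic in paper for paper in period_papers) / total
        for topic in sorted(topics)
    }


# Hypothetical early period with only four papers.
early_period = [
    {"information retrieval"},
    {"information retrieval", "privacy"},
    {"knowledge-based systems"},
    {"information retrieval"},
]
shares = relative_popularity(early_period)
print(shares)
```
]]></preformat>
      <p>Because a paper can have several dominant topics, the shares within a period need not sum to one.</p>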
      <p>For example, as visualized in Figure 2, scholarly output produced
during the first AI boom concerned, among others,
knowledge-based systems (13.16%) and algorithmic decision-making and
quantitative methods (7.90%). However, trends in legal technology and
information retrieval were more prominent topics in this period,
forming the subject matter of 45% and 21% of the papers, respectively.
As time advances, topics like financial regulation &amp; technology,
autonomous vehicles, surveillance, software licensing and Internet
governance appear.</p>
      <p>Another illustration is reflected by the developments visible
around the deep learning era. The topics that gained popularity
(compared to the previous period, namely the third AI boom) with
the rise of deep learning are regulation of innovation, privacy,
algorithmic decision-making and quantitative methods, financial regulation &amp;
technology, military technology and autonomous vehicles. The largest
decreases in popularity in this period are those of knowledge-based
systems and trends in legal technology. The popularity of the other
topics changed by less than the average for this
period. The interest in most topics grew during the deep learning
era.</p>
      <p>In addition, we observed several interesting trends regarding the
evolution of pairs of topics (e.g. algorithmic decision-making and
quantitative methods with knowledge-based systems and information
retrieval with trends in legal technology), as well as individual topics,
e.g. privacy, which emerged as a topic during the first AI winter,
but interest in which saw several relatively insignificant changes
before the deep learning era, when its popularity underwent a
sharp increase.</p>
      <p>Note: The label ‘ADM and quantitative methods’ refers to the topic algorithmic
decision-making and quantitative methods (id14).</p>
    </sec>
    <sec id="sec-9">
      <title>Document Similarity</title>
      <p>
        As mentioned in Section 3.2.3, each document k is represented by a
vector d_k, where each component of the vector shows the weight
of each topic in that specific document. For any pair of documents
we can use cosine similarity [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ] as a measure of how close two
documents are, which in our case is translated to similarity in the
‘topic space’, i.e. two very similar documents (cosine similarity of
value 1) are expected to have similar topic distributions.
      </p>
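      <p>Cosine similarity between two document-topic vectors is the dot product of the vectors divided by the product of their lengths. The sketch below uses hypothetical three-topic vectors; because topic weights are non-negative, the score lies between 0 and 1.</p>
      <preformat><![CDATA[
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical document-topic vectors over three topics.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.7, 0.2, 0.1]  # same topic mixture as doc_a
doc_c = [0.0, 0.1, 0.9]  # dominated by a different topic

same_mixture = cosine_similarity(doc_a, doc_b)
different_mixture = cosine_similarity(doc_a, doc_c)
print(same_mixture, different_mixture)
```
]]></preformat>
      <p>Note that the score compares topic mixtures only. Two documents with identical mixtures receive the maximum score even if their wording differs, which is why the scores are later checked against an expert assessment.</p>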
      <p>The detection of publications that have similar topics enhances
information retrieval and reduces the risk of reproduction of
scholarship because prior publications are overlooked. Having computed
the similarity between all document pairs in our corpus, we
obtained some 7,436,296 similarity scores after adjusting the algorithm
to avoid computing the similarity between the same document pair
twice.</p>
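      <p>The number of similarity scores follows from counting unordered document pairs, n(n − 1)/2 for n documents. As a sanity check, the sketch below inverts this formula for the reported count. The resulting number of documents is an inference from the reported figure and is not stated in the text; that it is lower than the 3931 retrieved articles would be consistent with, for example, the English-language filter applied during pre-processing, although this is an assumption.</p>
      <preformat><![CDATA[
```python
import math
from itertools import combinations


def unordered_pairs(n):
    """Number of distinct document pairs among n documents."""
    return n * (n - 1) // 2


def documents_for_pairs(pairs):
    """Invert n * (n - 1) / 2 = pairs; return None if no integer n fits."""
    n = (1 + math.isqrt(1 + 8 * pairs)) // 2
    return n if unordered_pairs(n) == pairs else None


# Small check against explicit enumeration of pairs.
small_check = len(list(combinations(range(4), 2)))

# Number of documents implied by the reported number of similarity scores.
implied_documents = documents_for_pairs(7_436_296)
print(small_check, implied_documents)
```
]]></preformat>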
      <p>To explore the substantive meaning of the ensuing similarity
scores, the results produced by the cosine similarity algorithm were
compared to substantive similarity as defined by a legal expert,
one of the authors. The goal of the inspection was to explore (1) the
extent to which papers were similar and (2) whether differences
in similarity scores produced by the cosine similarity measure are
substantively meaningful differences. For this, five papers on
different topics were selected as seed papers. Each of these five seed
papers was compared with five comparison papers with different similarity
scores: one similarity score in the .55-.65
range, one in the .65-.75 range, one in the .75-.85 range, one in the
.85-.95 range, and one in the .95-1.00 range. As a result, a total of
five seed papers and 25 comparison papers were inspected.</p>
      <p>Without knowledge as to which similarity range the comparison
paper would fall under, the legal expert ranked the similarity for
each pair of papers (seed paper - comparison paper) for each topic
separately. A Spearman’s rank correlation test showed a high
correlation (r = .62, p &lt; .001) between the ranks based on the
machine-generated similarity scores and the ranks provided by the expert.
The comparison took place on four levels: the paper title (i.e. to
what extent do the paper titles suggest similarity?), the research
question level (i.e. are the research questions similar?), the focus
or sub-topic level (i.e. which sub-topics does the paper focus on,
which angle or perspective does it take?), and the citation level (i.e.
is there a reference from one paper to the other?).</p>
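      <p>Spearman's rank correlation is the Pearson correlation computed on ranks. The sketch below implements it with average ranks for ties and applies it to hypothetical data for one seed paper: five machine similarity scores and an invented expert ranking in which two neighbouring papers are swapped. The numbers are illustrative and do not reproduce the reported result.</p>
      <preformat><![CDATA[
```python
def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            result[i] = average
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical machine scores and expert ranks (1 = least similar).
machine_scores = [0.60, 0.70, 0.80, 0.90, 0.97]
expert_ranks = [1, 3, 2, 4, 5]
rho = spearman(machine_scores, expert_ranks)
print(rho)
```
]]></preformat>
      <p>A value of 1 would mean that the expert and the machine order the comparison papers identically; swapping one adjacent pair out of five lowers the coefficient to 0.9 in this toy example.</p>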
      <p>
        The inspection of the papers suggests that papers with similarity
scores of .85 or lower may share substantive similarity, but in a
limited way at best (e.g. pairs of publications where one article [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
focuses on whether (source) code is or should be copyright protected
and the other paper [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] on whether results produced by machines
are or should be subject to copyright protection). In contrast, articles
with high similarity scores, particularly those with a score of .90 or
higher, are also substantively similar, at least in case of the inspected
papers. For instance, two selected papers on humanitarian law with
the highest similarity score ([
        <xref ref-type="bibr" rid="ref44">44</xref>
        ] and [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ]) discuss legal principles
such as proportionality and necessity, and they discuss the role of
subjectivity and the capability of autonomous weapons systems
(similarity in the &gt;.95 range).
      </p>
      <p>When inspecting citations between papers with very high
similarity scores, references from one paper to another were sometimes
found, and sometimes not. In some instances when no citation was
found this could be due to the fact that papers were published in
subsequent years and authors did not have the opportunity to cite
the published version of the similar paper. Nevertheless, there are
instances where a reference was expected in the papers, but not
found.
</p>
    </sec>
    <sec id="sec-10">
      <title>DISCUSSION</title>
      <p>The landscape of legal research on AI has undergone considerable
changes with respect to the volume of scholarship tackling topics
dealing with AI. In this context, we were interested in what exact
topics authors focused on, and how these topics shifted over time.
For this, we applied LDA topic modeling to identify topics and
analyze how journal articles were distributed across topics as well
as across time (1960-2018).</p>
      <p>The main finding for the first research question, namely what
topics can be distinguished in the corpus of legal papers on AI,
is that 35 topics can be identified, 32 of them being meaningful
(and three miscellaneous topics). Overall, the model performed
well in identifying latent topics in our corpus (see
Table 1 above).</p>
      <p>The second research question dealt with the evolution of topics
throughout the different periods which can be identified in the
history of AI. The topic river displayed in Figure 1 above shows
fundamental changes in legal research on AI from two perspectives:
on the one hand, the total number of papers referring to artificial
intelligence sees a sharp increase since 2016, and on the other hand,
the diversity of topics also increases throughout time. This can be
mostly contextualized by the occurrence of new technologies (e.g.
the Internet, leading to a new topic on Internet governance), but also
by the granular development of existing technologies.</p>
      <p>As for the third research question, namely how similar papers can
be detected, we further calculated similarity scores between pairs
of articles and compared the scores for a selection of pairs to the
assessments of a human expert. Consequently, it was explored whether
the similarity scores produced by the machine coincide with the
expert assessment. A correlation test revealed a high correlation
between the rankings produced by the machine and by the human expert.
The highest agreement levels were found for the papers with the
highest similarity scores. The similarity predicted by the machine
did, however, also sometimes deviate from the expert assessment,
although there will undoubtedly also be disagreement between
human experts when assessing the similarity of documents.</p>
      <p>The results not only provide insight into how legal research
within a broader theme has evolved over time, but also provide
support for the development of LDA tools that assist in structuring
large document collections and in finding relevant papers. To aid
researchers in exploring or keeping up with such vast and
complex research themes, which are explored in parts of legal scholarship
that do not necessarily overlap or interact, it is necessary to
consider how a method such as the one presented in this paper (topic
modeling) can be used to further visualize this body of legal research.
To this end, we are currently working on a dashboard which
we aim to make available to scholars (from any discipline) with
an interest in exploring the evolution of legal literature on AI, or
particular topics within it. Such a dashboard would make it easier,
on the one hand, for legal researchers to get the bigger picture of
all the fields of law, journals, and scholars tackling topics of interest
relating to AI, and on the other hand, for researchers from other
disciplines (e.g. computer science) to have a bird’s eye view of
the vast legal literature which might have a direct impact on their
research. Such publicly available resources could even be a new
way to stimulate more awareness of the legal and ethical implications
of technology for society, and be of interest to civil society as well.</p>
      <p>Of course, this paper is not without limitations. First, the
corpus is not necessarily representative of all legal articles or legal
publications in general. Although HeinOnline papers
presumably constitute a significant part of journal articles in law, there
are other publishers and repositories that contain a number of
publications that may be composed substantively differently from the
publications in HeinOnline. Second, another limitation regarding
the corpus concerns the initial selection. We selected journal
articles that included the keywords ‘artificial intelligence’. Had we
selected different keywords, the corpus might have looked
different. Additionally, different topics can be expected when running
additional topic models for a subset of the corpus. Furthermore, the
analyses that explored the increase or decrease of publications over
time used the three most dominant topics to determine whether
topics have become less or more relevant. The results might be
different if less relevant topics are also taken into consideration,
although such analyses would be empirically difficult to conduct, as
they would require a measure to determine when certain topic weights
are deemed insufficiently relevant.</p>
      <p>We are currently exploring the possibility of applying and
comparing different topic modeling algorithms and incorporating the
latest NLP representation models (namely word embeddings,
using either word2vec or BERT). Moreover, we want to apply our
methodology to a different legal collection and identify whether
our findings can be confirmed in a different corpus. Finally, after
fully validating our approach, we are planning to release it as an
open source website where people can explore and visualize our
findings in an intuitive way.</p>
    </sec>
    <sec id="sec-11">
      <title>APPENDIX</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Olufunmilayo</given-names>
            <surname>Arewa</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Open Access in a Closed Universe: Lexis, Westlaw, Law Schools, and the Legal Information Market</article-title>
          . (03
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Claus Boye</given-names>
            <surname>Asmussen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Charles</given-names>
            <surname>Møller</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Smart literature review: a practical topic modelling approach to exploratory literature review</article-title>
          .
          <source>Journal of Big Data</source>
          <volume>6</volume>
          ,
          <issue>1</issue>
          (
          <year>2019</year>
          ),
          <volume>93</volume>
          . https://doi.org/10.1186/s40537-019-0255-7
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Steven</given-names>
            <surname>Barkan</surname>
          </string-name>
          .
          <year>1992</year>
          .
          <article-title>Can Law Publishers Change the Law?</article-title>
          <source>Legal Reference Services Quarterly</source>
          <volume>11</volume>
          (03
          <year>1992</year>
          ),
          <fpage>29</fpage>
          -
          <lpage>35</lpage>
          . https://doi.org/10.1300/J113v11n03_05
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Trevor</given-names>
            <surname>Bench-Capon</surname>
          </string-name>
          , Michal Araszkiewicz, Kevin Ashley, Katie Atkinson, Floris Bex, Filipe Borges, Daniele Bourcier, Paul Bourgine, Jack G. Conrad, Enrico Francesconi,
          <string-name>
            <given-names>Thomas F.</given-names>
            <surname>Gordon</surname>
          </string-name>
          , Guido Governatori,
          <string-name>
            <given-names>Jochen L.</given-names>
            <surname>Leidner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>David D.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ronald P.</given-names>
            <surname>Loui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Thorne</given-names>
            <surname>McCarty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Henry</given-names>
            <surname>Prakken</surname>
          </string-name>
          , Frank Schilder, Erich Schweighofer, Paul Thompson, Alex Tyrrell, Bart Verheij,
          <string-name>
            <given-names>Douglas N.</given-names>
            <surname>Walton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Adam Z.</given-names>
            <surname>Wyner</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>A history of AI and Law in 50 papers: 25 years of the international conference on AI and Law</article-title>
          .
          <source>Artificial Intelligence and Law</source>
          <volume>20</volume>
          ,
          <issue>3</issue>
          (
          <year>2012</year>
          ),
          <fpage>215</fpage>
          -
          <lpage>319</lpage>
          . https://doi.org/10.1007/s10506-012-9131-x
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Robert C.</given-names>
            <surname>Berring</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>Full-Text Databases and Legal Research: Backing into the Future</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jon</given-names>
            <surname>Bing</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>The text retrieval system as a conversion partner</article-title>
          .
          <source>International Review of Law, Computers &amp; Technology</source>
          <volume>2</volume>
          ,
          <issue>1</issue>
          (
          <year>1986</year>
          ),
          <fpage>25</fpage>
          -
          <lpage>39</lpage>
          . https://doi.org/10.1080/13600869.1986.9966228
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          and
          <string-name>
            <given-names>John D.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Dynamic Topic Models</article-title>
          .
          In
          <source>Proceedings of the 23rd International Conference on Machine Learning</source>
          (Pittsburgh, Pennsylvania, USA) (ICML '06). ACM, New York, NY, USA,
          <fpage>113</fpage>
          -
          <lpage>120</lpage>
          . https://doi.org/10.1145/1143844.1143859
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Latent Dirichlet Allocation</article-title>
          .
          <source>J. Mach. Learn. Res.</source>
          <volume>3</volume>
          (March
          <year>2003</year>
          ),
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          . http://dl.acm.org/citation.cfm?id=944919.944937
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bridy</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Coding Creativity: Copyright and the Artificially Intelligent Author</article-title>
          .
          <source>Stanford Technology Law Review</source>
          <volume>5</volume>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Brüggermann</surname>
          </string-name>
          , Yannik Hermey, Carsten Orth, Darius Schneider,
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Selzer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Gerasimos</given-names>
            <surname>Spanakis</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Storyline detection and tracking using Dynamic Latent Dirichlet Allocation</article-title>
          .
          <source>Computing News Storylines</source>
          (
          <year>2016</year>
          ),
          <fpage>9</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>David J</given-names>
            <surname>Carter</surname>
          </string-name>
          , James Brown, and
          <string-name>
            <given-names>Adel</given-names>
            <surname>Rahmani</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Reading the High Court at a distance: Topic modelling the legal subject matter and judicial activity of the High Court of Australia, 1903-2015</article-title>
          .
          <source>UNSWLJ</source>
          <volume>39</volume>
          (
          <year>2016</year>
          ),
          <fpage>1300</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Allison June-Barlow</given-names>
            <surname>Chaney</surname>
          </string-name>
          , Hanna M Wallach,
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Connelly</surname>
          </string-name>
          , and David M Blei.
          <year>2016</year>
          .
          <article-title>Detecting and Characterizing Events</article-title>
          .
          In
          <source>EMNLP</source>
          .
          <fpage>1142</fpage>
          -
          <lpage>1152</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Dietz</surname>
          </string-name>
          , Bhaskar Mitra, Jeremy Pickens, Hana Anber, Sandeep Avula,
          <string-name>
            <given-names>Asia J.</given-names>
            <surname>Biega</surname>
          </string-name>
          , Adrian Boteanu, Shubham Chatterjee, Jeff Dalton,
          <string-name>
            <given-names>Shiri</given-names>
            <surname>Dori-Hacohen</surname>
          </string-name>
          , John Foley, Henry Feild, Ben Gamari, Rosie Jones, Pallika Kanani, Sumanta Kashyapi, Widad Machmouchi, Matthew Mitsui, Steve Nole, Alexandre Tachard Passos, Jordan Ramsdell, Adam Roegiest,
          <string-name>
            <given-names>David</given-names>
            <surname>Smith</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Sordoni</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Report on the First HIPstIR Workshop on the Future of Information Retrieval</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>53</volume>
          ,
          <issue>2</issue>
          (December
          <year>2019</year>
          ),
          <fpage>62</fpage>
          -
          <lpage>75</lpage>
          . https://www.microsoft.com/en-us/research/publication/report-on-the-firsthipstir-workshop-on-the-future-of-information-retrieval/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Dixon</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Breaking into locked rooms to access computer source code: Does the dmca violate constitutional mandate when technological barriers of access are applied to software</article-title>
          .
          <source>Virginia Journal of Law &amp; Technology</source>
          <volume>8</volume>
          ,
          <issue>1</issue>
          (
          <year>2003</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Arthur</given-names>
            <surname>Dyevre</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Lampach</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Issue Attention on the European Court of Justice: A Text-Mining Approach</article-title>
          . SSRN (
          <year>2018</year>
          ). http://dx.doi.org/10.2139/ssrn.3251186
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Edwards</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Search like a robot: Developing targeted search algorithms</article-title>
          .
          <source>Australian Law Librarian</source>
          <volume>26</volume>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>104</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Daphne</given-names>
            <surname>Gelbart</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Automating the Process of Abstracting Legal Cases</article-title>
          .
          <source>International Journal of Law and Information Technology</source>
          <volume>1</volume>
          (03
          <year>1993</year>
          ),
          <fpage>324</fpage>
          -
          <lpage>334</lpage>
          . https://doi.org/10.1093/ijlit/1.3.324
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Daphne</given-names>
            <surname>Gelbart</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>The application of automated text processing techniques to legal text management</article-title>
          .
          <source>International Review of Law, Computers &amp; Technology</source>
          <volume>8</volume>
          ,
          <issue>1</issue>
          (
          <year>1994</year>
          ),
          <fpage>203</fpage>
          -
          <lpage>210</lpage>
          . https://doi.org/10.1080/13600869.1994.9966390
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Catalina</given-names>
            <surname>Goanta</surname>
          </string-name>
          , Gijs van Dijck, and
          <string-name>
            <given-names>Gerasimos</given-names>
            <surname>Spanakis</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Back to the Future: Waves of Legal Scholarship on Artificial Intelligence</article-title>
          . Forthcoming in Sofia Ranchordás and Yaniv Roznai, Time, Law and Change (Oxford, Hart Publishing,
          <year>2019</year>
          ) (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Graham</given-names>
            <surname>Greenleaf</surname>
          </string-name>
          , Daniel Austin, Philip Chung, Andrew Mowbray, Madeleine Davis, and
          <string-name>
            <given-names>Jill</given-names>
            <surname>Matthews</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Solving the Problems of Finding Law on the Web: World Law and DIAL</article-title>
          .
          <source>Journal of Information, Law and Technology</source>
          <year>2000</year>
          (01
          <year>2000</year>
          ). https://doi.org/10.1017/S0731126500009483
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Graham</given-names>
            <surname>Greenleaf</surname>
          </string-name>
          , Andrew Mowbray,
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>King</surname>
          </string-name>
          , and Geoffrey van Dijk.
          <year>1995</year>
          .
          <article-title>Public Access to Law via Internet: The Australian Legal Information Institute</article-title>
          .
          <source>Journal of Law and Information Science</source>
          <volume>6</volume>
          ,
          <issue>1</issue>
          ,
          <fpage>50</fpage>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hanson</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>From Key Numbers to Keywords: How Automation Has Transformed the Law</article-title>
          .
          <source>Law Library Journal</source>
          <volume>94</volume>
          (09
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Karen</given-names>
            <surname>Hao</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>We analyzed 16,625 papers to figure out where AI is headed next</article-title>
          . https://www.technologyreview.com/s/612768/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Maria</given-names>
            <surname>Iglesias</surname>
          </string-name>
          , Sharon Shamuilia, and
          <string-name>
            <given-names>Amanda</given-names>
            <surname>Anderberg</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Intellectual Property and Artificial Intelligence - A literature review</article-title>
          .
          <source>EU Science Hub - European Commission</source>
          (Dec.
          <year>2019</year>
          ). https://ec.europa.eu/jrc/en/publication/intellectualproperty-and-artificial-intelligence-literature-review
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Kirschenfeld</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Yellow Flag Fever: Describing Negative Legal Precedent in Citators</article-title>
          . https://doi.org/10.31228/osf.io/dfjah
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Knapp</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rob</given-names>
            <surname>Willey</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Comparison of Research Speed and Accuracy Using WestlawNext and Lexis Advance</article-title>
          .
          <source>Legal Reference Services Quarterly</source>
          <volume>35</volume>
          (05
          <year>2016</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . https://doi.org/10.1080/0270319X.2016.1177428
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Michael A</given-names>
            <surname>Livermore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Allen</given-names>
            <surname>Riddell</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Rockmore</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Agenda formation and the US supreme court: A topic model approach</article-title>
          .
          <source>Arizona Law Review</source>
          <volume>1</volume>
          ,
          <issue>2</issue>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Maggs</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>Legal Data Banks in the United States and Their Use in Comparative Law</article-title>
          .
          <source>International Journal of Legal Information</source>
          <volume>22</volume>
          (01
          <year>1994</year>
          ),
          <fpage>214</fpage>
          -
          <lpage>227</lpage>
          . https://doi.org/10.1017/S0731126500024926
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Maier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waldherr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Miltner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wiedemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Niekler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Keinert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfetsch</surname>
          </string-name>
          , G. Heyer, U. Reber,
          <string-name>
            <given-names>T.</given-names>
            <surname>Häussler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schmid-Petri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Adam</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology</article-title>
          .
          <source>Communication Methods and Measures</source>
          <volume>12</volume>
          ,
          <issue>2-3</issue>
          (
          <year>2018</year>
          ),
          <fpage>93</fpage>
          -
          <lpage>118</lpage>
          . https://doi.org/10.1080/19312458.2018.1430754
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Elizabeth M.</given-names>
            <surname>McKenzie</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Natural Language Searching</article-title>
          .
          <source>Legal Reference Services Quarterly</source>
          <volume>18</volume>
          ,
          <issue>4</issue>
          (
          <year>2001</year>
          ),
          <fpage>39</fpage>
          -
          <lpage>47</lpage>
          . https://doi.org/10.1300/J113v18n04_04
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>David</given-names>
            <surname>Mimno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hanna M</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Edmund</given-names>
            <surname>Talley</surname>
          </string-name>
          , Miriam Leenders, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Optimizing semantic coherence in topic models</article-title>
          .
          In
          <source>Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          . Association for Computational Linguistics
          ,
          <fpage>262</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>A</given-names>
            <surname>Moens</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Is AustLII Sustainable?</article-title>
          <source>Australian Law Librarian</source>
          <volume>17</volume>
          ,
          <issue>3</issue>
          ,
          <fpage>154</fpage>
          -
          <lpage>157</lpage>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Marie-Francine</given-names>
            <surname>Moens</surname>
          </string-name>
          , Maarten Logghe, and
          <string-name>
            <given-names>Jos</given-names>
            <surname>Dumortier</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Legislative Databases: Current Problems and Possible Solutions</article-title>
          .
          <source>International Journal of Law and Information Technology</source>
          <volume>10</volume>
          (03
          <year>2002</year>
          ). https://doi.org/10.1093/ijlit/10.1.1
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Elizabeth</given-names>
            <surname>Moll-Willard</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The use and perceptions of open access resources by legal academics at the University of Cape Town (UCT) in South Africa</article-title>
          .
          <volume>6</volume>
          (
          <issue>09</issue>
          <year>2018</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>David</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Youn</given-names>
            <surname>Noh</surname>
          </string-name>
          , Edmund Talley, Sarvnaz Karimi, and
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Baldwin</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Evaluating topic models for digital libraries</article-title>
          .
          <source>In Proceedings of the 10th annual joint conference on Digital libraries. ACM</source>
          ,
          <fpage>215</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ogden</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>"Mastering the lawless science of our law": A story of legal citation indexes</article-title>
          .
          <source>Law Library Journal</source>
          <volume>85</volume>
          (01
          <year>1993</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Marc</given-names>
            <surname>van Opijnen</surname>
          </string-name>
          , Hayo Schreijer, Ilja Andreas, and
          <string-name>
            <given-names>Maarten</given-names>
            <surname>Kroon</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Specialised Government Publishing: The Law Pocket and Linked Legal Data in the Netherlands</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Mark K.</given-names>
            <surname>Osbeck</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Lawyer as Soothsayer: Exploring the Important Role of Outcome Prediction in the Practice of Law</article-title>
          .
          <source>Penn State Law Review</source>
          <volume>123</volume>
          ,
          <issue>1</issue>
          (
          <year>2018</year>
          ),
          <fpage>41</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Panagis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Urska</given-names>
            <surname>Sadl</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>The Force of EU Case Law: A Multidimensional Study of Case Citations</article-title>
          .
          <source>In JURIX</source>
          .
          <fpage>71</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>João</given-names>
            <surname>Reis</surname>
          </string-name>
          , Paula Espírito Santo, and
          <string-name>
            <given-names>Nuno</given-names>
            <surname>Melão</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Impacts of Artificial Intelligence on Public Administration: A Systematic Literature Review</article-title>
          .
          <source>In 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)</source>
          .
          <publisher-name>IEEE</publisher-name>
          ,
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Gerard</given-names>
            <surname>Salton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Buckley</surname>
          </string-name>
          .
          <year>1988</year>
          .
          <article-title>Term-weighting approaches in automatic text retrieval</article-title>
          .
          <source>Information Processing &amp; Management</source>
          <volume>24</volume>
          ,
          <issue>5</issue>
          (
          <year>1988</year>
          ),
          <fpage>513</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>G.</given-names>
            <surname>Salton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <year>1975</year>
          .
          <article-title>A Vector Space Model for Automatic Indexing</article-title>
          .
          <source>Commun. ACM</source>
          <volume>18</volume>
          ,
          <issue>11</issue>
          (Nov.
          <year>1975</year>
          ),
          <fpage>613</fpage>
          -
          <lpage>620</lpage>
          . https://doi.org/10.1145/361219.361220
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>Pamela</given-names>
            <surname>Samuelson</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Google Book Search and the Future of Books in Cyberspace</article-title>
          .
          <source>Minnesota Law Review</source>
          <volume>94</volume>
          (01
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sassoli</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Autonomous Weapons and International Humanitarian Law: Advantages, Open Technical Questions and Legal Issues to Be Clarified</article-title>
          .
          <source>International Law Studies. US Naval War College</source>
          <volume>90</volume>
          (
          <year>2014</year>
          ),
          <fpage>308</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Schmitt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Thurnher</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>'Out of the Loop': Autonomous Weapon Systems and the Law of Armed Conflict</article-title>
          .
          <source>Harvard National Security Journal</source>
          <volume>4</volume>
          ,
          <issue>02</issue>
          (
          <year>2013</year>
          ),
          <fpage>231</fpage>
          -
          <lpage>281</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>Cecilia</given-names>
            <surname>Magnusson Sjöberg</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Corpus Legis: A Legal Document Management Project</article-title>
          .
          <source>International Journal of Law and Information Technology</source>
          <volume>5</volume>
          ,
          <issue>1</issue>
          (
          03
          <year>1997</year>
          ),
          <fpage>83</fpage>
          -
          <lpage>99</lpage>
          . https://doi.org/10.1093/ijlit/5.1.83
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>James A.</given-names>
            <surname>Sprowl</surname>
          </string-name>
          .
          <year>1976</year>
          .
          <article-title>Computer-Assisted Legal Research-An Analysis of Full-Text Document Retrieval Systems, Particularly the LEXIS System</article-title>
          .
          <source>Law &amp; Social Inquiry</source>
          <volume>1</volume>
          ,
          <issue>1</issue>
          (
          <year>1976</year>
          ),
          <fpage>175</fpage>
          -
          <lpage>226</lpage>
          . https://doi.org/10.1111/j.1747-4469.1976.tb00955.x
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>Keith</given-names>
            <surname>Stevens</surname>
          </string-name>
          , Philip Kegelmeyer, David Andrzejewski, and
          <string-name>
            <given-names>David</given-names>
            <surname>Buttler</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Exploring topic coherence over many models and many topics</article-title>
          .
          <source>In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics</source>
          ,
          <fpage>952</fpage>
          -
          <lpage>961</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Tutt</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>An FDA for Algorithms</article-title>
          .
          <source>Administrative Law Review</source>
          <volume>69</volume>
          ,
          <issue>1</issue>
          (
          <year>2017</year>
          ),
          <fpage>83</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>Marc</given-names>
            <surname>van Opijnen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Gaining Momentum. How ECLI Improves Access to Case Law in Europe</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>Hanna M.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Iain</given-names>
            <surname>Murray</surname>
          </string-name>
          , Ruslan Salakhutdinov, and
          <string-name>
            <given-names>David</given-names>
            <surname>Mimno</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Evaluation methods for topic models</article-title>
          .
          <source>In Proceedings of the 26th annual international conference on machine learning</source>
          .
          <fpage>1105</fpage>
          -
          <lpage>1112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>Chong</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>David</given-names>
            <surname>Heckerman</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Continuous Time Dynamic Topic Models</article-title>
          .
          <source>In UAI</source>
          ,
          <string-name>
            <given-names>David A.</given-names>
            <surname>McAllester</surname>
          </string-name>
          and
          <string-name>
            <given-names>Petri</given-names>
            <surname>Myllymäki</surname>
          </string-name>
          (Eds.). AUAI Press,
          <fpage>579</fpage>
          -
          <lpage>586</lpage>
          . http://dblp.uni-trier.de/db/conf/uai/uai2008.html#WangBH08
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>Xuerui</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Topics over Time: A non-Markov Continuous-time Model of Topical Trends</article-title>
          .
          <source>In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          (Philadelphia, PA, USA) (KDD '06)
          . ACM, New York, NY, USA,
          <fpage>424</fpage>
          -
          <lpage>433</lpage>
          . https://doi.org/10.1145/1150402.1150450
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>E.</given-names>
            <surname>Warren</surname>
          </string-name>
          .
          <year>2017-2018</year>
          .
          <article-title>A Simple Guide to Machine Learning</article-title>
          .
          <source>SciTech Lawyer</source>
          <volume>14</volume>
          ,
          <issue>1</issue>
          (
          2017-2018),
          <fpage>5</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>Clemens</given-names>
            <surname>Wass</surname>
          </string-name>
          .
          <year>2017</year>
          . openlaws.eu - Building Your Personal Legal Network.
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>Xing</given-names>
            <surname>Wei</surname>
          </string-name>
          , Jimeng Sun, and
          <string-name>
            <given-names>Xuerui</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Dynamic Mixture Models for Multiple Time-Series</article-title>
          .
          <source>In IJCAI (2007-03-05)</source>
          ,
          <string-name>
            <given-names>Manuela M.</given-names>
            <surname>Veloso</surname>
          </string-name>
          (Ed.).
          <fpage>2909</fpage>
          -
          <lpage>2914</lpage>
          . http://dblp.uni-trier.de/db/conf/ijcai/ijcai2007.html#WeiSW07
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>Robin</given-names>
            <surname>Widdison</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>New Perspectives in Legal Information Retrieval</article-title>
          .
          <source>International Journal of Law and Information Technology</source>
          <volume>10</volume>
          ,
          <issue>1</issue>
          (
          01
          <year>2002</year>
          ),
          <fpage>41</fpage>
          -
          <lpage>70</lpage>
          . https://doi.org/10.1093/ijlit/10.1.41
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>R</given-names>
            <surname>Winkels</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Experiments in finding relevant case law</article-title>
          .
          <source>In NAIL 2015: 3rd International Workshop on Network Analysis in Law.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>