Return of the AI: An Analysis of Legal Research on Artificial Intelligence Using Topic Modeling

Constanta Rosca, Bogdan Covrig, Catalina Goanta, Gijs van Dijck, Gerasimos Spanakis
{constanta.rosca,b.covrig,catalina.goanta,gijs.vandijck,jerry.spanakis}@maastrichtuniversity.nl
Law & Tech Lab, Maastricht University, Maastricht, Netherlands

ABSTRACT
AI research finds itself in the third boom of its history, and in recent years AI-related themes have gained considerable popularity in new disciplines, such as law. This paper explores what legal research on AI consists of and how it has evolved, while addressing the issues of information retrieval and research duplication. Using Latent Dirichlet Allocation (LDA) topic modeling on a dataset of 3931 journal articles, we explore three questions: (a) Which topics within legal research on AI can be distinguished? (b) When were these topics addressed? and (c) Can similar papers be detected? The topic modeling results in a total of 32 meaningful topics. Additionally, it is found that legal research on AI drastically increased as of 2016, with topics becoming more granular and diverse over time. Finally, a comparison of the similarity assessments produced by the algorithm and a human expert suggests that the assessments often coincide. The results provide insight into how legal research on AI has evolved over time, and support the development of machine learning and information retrieval tools like LDA that assist in structuring large document collections and identifying relevant articles.

CCS CONCEPTS
• Computing methodologies → Topic modeling; • Applied computing → Law.

KEYWORDS
information retrieval, topic modeling, legal research

ACM Reference Format:
Constanta Rosca, Bogdan Covrig, Catalina Goanta, Gijs van Dijck, Gerasimos Spanakis. 2020. Return of the AI: An Analysis of Legal Research on Artificial Intelligence Using Topic Modeling. In Proceedings of the 2020 Natural Legal Language Processing (NLLP) Workshop, 24 August 2020, San Diego, US. ACM, New York, NY, USA, 8 pages.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Artificial intelligence (AI) research finds itself in the third boom of its history, fuelled by increased funding, scientific breakthroughs such as deep learning, and widespread public speculation about the scope and impact of these breakthroughs. While decades ago the interest in AI research was generally limited to specific disciplines (e.g. computer science, philosophy), during the past years AI-related themes have gained considerable popularity in new disciplines, such as law.

With over 2500 publications referring to 'artificial intelligence' already by the year 2015 [19] (see also Figure 3 in the Appendix), it may no longer be realistic to assume that researchers can keep up with legal research on AI, or with the number of publications in general. Moreover, the recent spike suggests that more and more authors have started writing about AI and law topics, including authors who have not previously published on such topics. This, in combination with the inability to keep up with legal research on AI due to its exponential growth, creates the risk that authors replicate previous work without being aware of similar earlier publications.

This paper aims to explore what legal research on AI consists of and how it has evolved, while addressing the issue of information retrieval and the risk of research duplication. We develop a methodology that distinguishes topics in a collection of documents (in this case journal publications), allows exploring the evolution of those topics over time, and detects similarity between documents, with the purpose of providing solutions for reading and analyzing publications in bulk in ways that humans cannot. Consequently, we aim to answer the following research questions (RQ):

RQ1: Which topics within the field of legal research on AI can be distinguished ('What')? A suitable methodology would provide insight into how legal research on AI is structured, and it would allow classifying publications into sub-topics, which would enhance information retrieval.

RQ2: What (i.e. about which topics) has been written when ('What - When')?
This question contributes to the understanding of which topics have emerged, remained, or lost the interest of legal scholars. The analysis of the What - When question will provide information about the evolution of legal research on AI.

RQ3: Can similar papers be detected? Considering the sharp increase of publications, it may be becoming increasingly difficult to find publications on similar research questions, which may even result in reproduction of scholarship because prior publications are overlooked. This question explores a methodology that allows detecting thematically similar documents in a given corpus.

2 BACKGROUND
One of the innovations in this paper is to use unsupervised machine learning to categorize legal research on AI and to map how legal scholarship on this topic has developed. The idea of smart literature reviews has received prior attention, on the grounds that manual searches for existing literature - especially in matured domains where scholarship is abundant - are not efficient and might have a negative impact on the quality of new research [2]. Similar approaches have been taken to map computer science literature [23], as well as communications research [29].

To the best of our knowledge, legal scholarship as such has not yet been analyzed using LDA. Within the narrow confines of the field of research labeled as AI and Law, a considerable amount of legal research on information retrieval has been published in the past decades. Notwithstanding that the volume of scholarship on information retrieval pertaining to the discipline of computer science alone is vast to say the least [13], particularly when applied to the legal field, such research has focused on the availability of legal information [5, 20, 21, 28, 32, 34, 46, 55], search systems and search strategies [16, 22, 30, 43, 47, 57], information processing [6, 17, 18, 25, 33, 36, 37, 50], and the role of legal publishers [1, 3, 26]. These publications address a variety of issues and questions, including the sustainability of publicly available legal repositories (e.g. AustLII), the importance of natural language processing when searching for legal information, the performance of online searches compared to searching through paper, how citation analysis may be used to improve search results, the role of legal publishers in this and, more generally, the impact of automation on how the law is analyzed and applied.

Still, the question of how to capture and visualize the development of an entire legal sub-field like legal research on AI has barely been explored. One avenue of exploration has been shaped by Bench-Capon et al., who have previously focused on the proceedings of the International Conference on AI and Law, first held in 1987, to make a 25-year retrospective of the research generated therein by describing the progress of scholarship through illustrative papers selected from various editions [4]. Other studies focused on traditional (systematic) literature reviews on the impact of AI on specific legal domains, such as administrative law [40] or intellectual property [24].

However, given the limitations of legal databases and the way they are used by researchers, making comprehensive overviews of existing literature remains a considerable hurdle. This is all the more so in the past four years, when the production of legal scholarship on artificial intelligence seems to have grown considerably (see Figure 3 in the Appendix). This paper aims to fill this research gap by proposing and testing an unsupervised machine learning approach to the clustering of literature in this field, namely topic modeling.
3 METHODOLOGY

3.1 Corpus
The corpus includes a total of 3931 journal articles obtained from the HeinOnline database. Absent a centralised, comprehensive, open access repository for international legal scholarship, we focused on one of the commercially available databases. HeinOnline is one of the leading international databases on legal materials; it contains over 170 million pages of literature and indexes over 2700 law journals. In the section 'Law Journal Library', section type 'Articles', the corpus covers literature available in the database between 1960 and 2018. Unlike arXiv, HeinOnline does not have a section on 'artificial intelligence'. The total number of retrieved articles reflects the results of a boolean search using the keywords 'artificial intelligence', namely all articles which include both terms. The resulting articles therefore discuss a wide array of aspects relating to artificial intelligence, and do not, as such, focus on specific technical or legal issues. For the purpose of this study, we assume that even one reference to the keywords is sufficient to include an article in the corpus.

The articles in the corpus follow a power law distribution, where a relatively large number of publications is written by a small number of authors, and few publications by many authors (see Figure 4 in the Appendix). The same distribution applies to the number of publications per author, with roughly 28 authors having more than 5 publications (see Figure 5 in the Appendix).
3.2 Topic Modeling

3.2.1 Latent Dirichlet Allocation. Latent Dirichlet Allocation (LDA) [8] was used to identify topics in the corpus. LDA has many different use cases: examples include organizing large document collections in order to improve search and retrieval of information, summarization of large textual data, and even image clustering. In the legal domain, LDA has been used to study the agenda of the US Supreme Court [27], the High Court of Australia [11] and the Court of Justice of the European Union (CJEU) [15]. Winkels [58] used LDA to build a recommender system for Dutch case law. Panagis and Sadl [39] combined network analysis and LDA to study the case-law generative process of the CJEU.

LDA is a generative, probabilistic model for a collection of documents, which are represented as mixtures of latent topics, where each topic is characterized by a distribution over all words of the collection of documents. The basic representation unit of the documents is the word, i.e. all distinct terms are extracted from the document collection along with their frequencies (per document), which is the so-called 'Bag-of-Words' model [41].

On a conceptual level, the algorithm tries to discover topics that can represent the collection of documents. Each document is generated from a mixture of these topics and each topic is generated from a probability vector (distribution) over all words. Assuming such a generative model for any collection of documents, LDA's goal is to 'backtrack' this process, i.e. to find a set of topics that are likely to have generated the whole document collection.

3.2.2 LDA Variants. In this paper, we used the model of Blei et al. [8]; however, we did explore other variations as well. Stevens et al. [48] present a qualitative comparison of different topic model algorithms, also using different evaluation metrics [31, 35], and conclude that, given the same data and the same number of topics, LDA is able to learn more coherent topics than the competing approaches.

Nevertheless, there are extensions of the LDA model towards topic tracking over time [53, 56]. However, according to Wang et al. [52], these methods deal with constant topics, and the timestamps are used for better discovery. Opposed to that, Blei et al. [7] present a model for the detection of evolving topics in a discrete time space (Dynamic Topic Modeling - DTM). Here, LDA is used on topics aggregated in time epochs and a state space model handles transitions of the topics from one epoch to another. Bruggermann et al. applied DTM on the RCV1 Reuters corpus (810,000 documents) with weekly time epochs [10]. Results showed that the variances within the topics among the time epochs are marginal. DTM still treats the corpus as a whole, and the number of topics is fixed over all time epochs. Chaney et al. introduce another extension to LDA that detects events in a large text collection [12]. Their model adds separate probability distributions for defined entities and time intervals to the generative process. The model accordingly consists of general topics, entity-related topics and topics specific to time intervals. Events are detected as anomalies, which are identified as temporary deviations from usual behavior, where usual behavior refers to the topics discussed by the entities. An anomaly can be detected whenever these entities change their topics of discussion significantly and at the same time in a similar way.
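To make the pipeline concrete, the following is a minimal sketch of how a tokenized corpus can be turned into a Bag-of-Words representation and passed to an LDA implementation. It is illustrative only and assumes the gensim library; the placeholder variable tokenized_docs and all parameter values are our assumptions, not the exact code used for this study.

```python
# Minimal LDA sketch (illustrative; assumes gensim is installed and that
# `tokenized_docs` is a list of token lists, one per journal article).
from gensim.corpora import Dictionary
from gensim.models import LdaModel

tokenized_docs = [
    ["artificial_intelligence", "liability", "vehicle", "driver"],
    ["copyright", "author", "machine", "work"],
]  # placeholder documents standing in for the pre-processed corpus

dictionary = Dictionary(tokenized_docs)                            # term <-> id mapping
bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]   # Bag-of-Words vectors

lda = LdaModel(
    corpus=bow_corpus,
    id2word=dictionary,
    num_topics=35,        # the value selected in Section 4.1
    passes=10,
    random_state=42,
)

# Each topic is a distribution over words; each document a distribution over topics.
for topic_id in range(lda.num_topics):
    print(topic_id, lda.print_topic(topic_id, topn=10))
```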
3.2.3 Output of LDA. LDA can be applied on a corpus, and the output model provides the following distributions:

• t_i = {t_ij}: topic-word vector-distribution, where i denotes the topic (in total there are N topics) and j denotes the word (in total there are P words in the collection). The component t_ij shows the relative weight of word j in topic i.
• d_k = {d_ki}: document-topic vector-distribution, where i denotes the topic (out of the total N topics) and k denotes a document. The component d_ki shows the relative weight of topic i in document k.
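Continuing the hypothetical gensim objects from the previous sketch, the two distributions above can be read off a fitted model as shown below; again, this is an illustration of the concept rather than the study's actual implementation.

```python
# Topic-word distribution t_i: relative weight of each word j in topic i.
topic_id = 2
for word_id, weight in lda.get_topic_terms(topic_id, topn=10):
    print(dictionary[word_id], round(weight, 4))

# Document-topic distribution d_k: relative weight of each topic i in document k.
doc_id = 0
d_k = lda.get_document_topics(bow_corpus[doc_id], minimum_probability=0.0)
print(d_k)  # e.g. [(0, 0.01), (1, 0.02), ..., (34, 0.003)], weights summing to ~1
```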
4 RESULTS AND ANALYSIS

4.1 Topics in Legal Research on AI (What?)
Pre-processing of the corpus involved filtering for articles written in the English language only and afterwards removing common English stop-words (the, of, etc.) as well as some very common corpus-specific words (subject, supra, part, etc.). Moreover, we considered the presence of either unigrams (single words) or bigrams (two-word sequences) so as to be able to capture concepts like 'dispute resolution'.

Subsequently, LDA was used to identify topics in the corpus. For identifying the number of topics (N), we used perplexity [51] and coherence [31] measures on a held-out set. We started from N = 5 and increased it in steps of 5 until N = 100. A plateau in both perplexity (178.4) and coherence (0.51) could be observed around 35 or 40 topics, which suggests a computationally optimal number of topics. Based on this, topic models with 30, 35, 40, and 45 topics were explored. Two legal researchers inspected the results of the various topic models in order to determine which number of topics was substantively the most meaningful. For this, the 20 terms with the highest weights for each topic were provided to the researchers (e.g. 'weapon', 'system', 'military', 'international', 'war', etc.). Based on this evaluation, a topic model was selected that consisted of 35 topics. A sketch of this model-selection loop is given after this section.

The topic validation consisted of two steps. First, three researchers inspected the 20 terms for each topic in order to label the topic (e.g. 'military technology'). Second, paper titles were inspected to determine whether they supported the assigned label; if not, the label would, if possible, be adjusted. In this respect, the researchers were presented with a list of paper titles for each topic. The validation process indicated that the vast majority of the LDA-produced topics were substantively meaningful.

Table 1 reveals the topics distinguished by the LDA model. It includes the topic IDs (not meaningful), the ten terms that have the highest weights in relation to the topic, and the labels the human coders assigned to the topics.

Three miscellaneous topics were identified: id21, id25, and id33. Neither the inspection of the 20 words nor of the titles resulted in a substantively meaningful label or description for these topics. It was decided not to remove the words in these topics and not to re-run the topic modeling algorithm, as it was expected that the removal of words could introduce selection bias in the corpus. Moreover, the identification of the three miscellaneous topics does not affect the relevance or interpretation of the other topics.

The results show a wide range of topics, varying from tax to military technology to copyright. Within each topic, diversity could still be observed. For example, for the topic of algorithmic decision-making and quantitative methods, paper titles such as 'An FDA for Algorithms' [49], 'A Simple Guide to Machine Learning' [54], and 'Lawyer as Soothsayer: Exploring the Important Role of Outcome Prediction in the Practice of Law' [38] can be observed.

An important issue concerned the initial selection of journal articles. The articles were selected based on the search string 'artificial intelligence'. The number of occurrences of this string is, however, presumably not an entirely accurate proxy for whether a paper actually is about artificial intelligence. It might be that papers on public policy or competition & markets mention the term 'artificial intelligence' but do not primarily focus on artificial intelligence, or even on technology or digital matters. An additional selection was therefore required. Consequently, three of the researchers (authors) independently went through the topic list and the related words as displayed in Table 1 in order to determine whether the labels and words are likely to be related to artificial intelligence, digital, and/or technology. Perfect agreement could be observed for the vast majority of topics. In the few instances of disagreement between the coders, the disagreements were resolved through a brief discussion. Ultimately, a total of 18 topics was selected (Table 1, in bold). These 18 topics were used in subsequent analyses.
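The model-selection loop described above could look roughly as follows, assuming gensim's LdaModel and CoherenceModel are used; the held-out split, the coherence variant, and the variable names (train_bow, train_texts, heldout_bow) are illustrative assumptions rather than the authors' exact code.

```python
from gensim.models import LdaModel, CoherenceModel

# Scan the number of topics N from 5 to 100 in steps of 5, recording coherence
# on the training texts and perplexity on a held-out Bag-of-Words set
# (train_bow, train_texts, heldout_bow, dictionary are assumed to exist).
scores = {}
for n in range(5, 105, 5):
    model = LdaModel(corpus=train_bow, id2word=dictionary,
                     num_topics=n, passes=10, random_state=42)
    coherence = CoherenceModel(model=model, texts=train_texts,
                               dictionary=dictionary,
                               coherence="c_v").get_coherence()
    # log_perplexity returns a per-word likelihood bound; gensim defines
    # perplexity as 2 ** (-bound), so it is converted here for readability.
    perplexity = 2 ** (-model.log_perplexity(heldout_bow))
    scores[n] = (coherence, perplexity)

# A plateau in both curves (here around N = 35-40) suggests a reasonable N.
```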
Table 1: All 35 topics identified by LDA

id | words | label
0 | speech, public, amendment, court, medium, political, free, government, content, freedom | freedom of expression
1 | internet, network, computer, communication, cyberspace, technology, user, access, virtual, service | Internet governance
2 | weapon, system, military, international, war, human, state, attack, target, autonomous_weapon | military technology
3 | datum, information, privacy, personal, protection, data, individual, consumer, big, user | privacy
4 | work, company, employee, corporate, worker, labor, business, employer, corporation, economic | labour/corporate
5 | lawyer, legal, client, firm, service, practice, attorney, profession, professional, work | legal practice
6 | student, school, legal, education, learn, university, skill, practice, teach, research | legal education
7 | state, act, agency, public, government, federal, policy, congress, rule, national | public policy
8 | environmental, space, energy, nanotechnology, water, land, risk, plan, air, include | environment/space
9 | patent, claim, invention, method, application, process, art, inventor, court, technology | patents
10 | copyright, work, protection, author, program, copy, court, intellectual_property, fair, computer | copyright
11 | cognitive, behavior, process, theory, make, people, action, social, mental, mind | psychology & neuroscience
12 | human, robot, machine, technology, artificial_intelligence, robotic, agent, science, future, research | from humans to machines
13 | market, cost, consumer, service, economic, product, competition, price, trademark, firm | competition & markets
14 | datum, algorithm, model, decision, analysis, result, method, prediction, study, risk | algorithmic decision-making & quantitative methods
15 | surveillance, privacy, government, search, police, enforcement, fourth_amendment, court, information, intelligence | surveillance
16 | contract, party, electronic, agreement, agent, online, term, dispute, transaction, dispute_resolution | contract & dispute resolution
17 | legal, theory, social, science, society, system, political, power, economic, form | legal theory/philosophy
18 | evidence, probability, argument, theory, inference, case, reason, fact, expert, scientific | evidence
19 | international, state, country, european, national, article, trade, member, china, global | international law & relations
20 | medical, health, patient, physician, care, medicine, health_care, hospital, fda, device | health
21 | https, http, technology, www, online, user, pdf, digital, last_visit, platform | miscellaneous
22 | person, human, child, life, moral, legal, animal, state, property, interest | personhood
23 | tax, income, trust, taxpayer, property, asset, return, business, pay, interest | tax
24 | legal, rule, case, system, reason, knowledge, base, model, argument, fact | knowledge-based systems
25 | time, world, people, game, make, year, life, work, american, story | miscellaneous
26 | financial, market, bank, security, investor, regulation, risk, transaction, investment, trading | financial regulation & technology
27 | criminal, crime, police, sentence, justice, offender, sentencing, victim, commit, drug | crime
28 | technology, system, process, change, development, public, information, research, social, design | regulation of innovation
29 | information, search, document, library, legal, research, database, case, access, electronic | information retrieval
30 | liability, vehicle, product, car, tort, autonomous, risk, safety, driver, manufacturer | autonomous vehicles
31 | court, case, judge, judicial, rule, justice, decision, opinion, trial, litigation | courts
32 | software, code, license, open_source, source, program, standard, free, developer, computer | software licensing
33 | make, rule, problem, case, fact, question, reason, give, decision, view | miscellaneous
34 | computer, system, program, information, software, user, technology, datum, expert, process | trends in legal technology
4.2 How Did Legal Research on AI Evolve (What - When)?
To answer this question we needed to determine which of the 18 topics selected in the previous section have gained or lost attention over time. The first step is to extract the dominant topics of each paper by using the document-topic vectors d (as described in Section 3.2.3). More specifically, we denoted for each document k the first three dominant topics i_1-3, based on their relative weights (contributions) in descending order in the document-topic vector d_k. To assess the relevance of each document's topics, two researchers manually reviewed the first five topics - sorted by contribution - for a sample of documents. They agreed that for most documents the relevance dropped significantly after the third topic, since the later topics provide no substantive contribution to the paper. Based on the results of this inspection, the first three topics were denoted as dominant. We define the frequency of a topic as the number of times that a topic is dominant across all articles.

Linking the topic frequencies to the year of publication, the count of papers addressing each topic every year was computed. Figure 1 depicts how the 18 selected topics evolved over time (1960-2018).

Figure 1: Topic river (the label 'ADM and quantitative methods' in the figure refers to the topic algorithmic decision-making and quantitative methods, id14)

Figure 2 shows how the topics have evolved in relation to other topics across six major periods in AI history (taken from [19]). As a measure of relative topic popularity during a certain period, the ratio of papers concerned with each topic to the total number of papers written in that period is computed. Caution needs to be exercised when interpreting the relative topic popularity. Considering the relatively low number of publications before approximately 1986 (see Figure 1), just before the second AI winter, not much weight should be given to the topic popularity in those years, as small changes in the number of publications can have a substantial impact on the popularity of one topic relative to other topics.

Figure 2: Relative topic evolution across AI periods

For example, as visualized in Figure 2, scholarly output produced during the first AI boom concerned, among others, knowledge-based systems (13.16%) and algorithmic decision-making and quantitative methods (7.90%); however, trends in legal technology and information retrieval were the prominent topics in this period, forming the subject-matter of 45% and 21% of the papers, respectively. As time advances, topics like financial regulation & technology, autonomous vehicles, surveillance, software licensing and Internet governance appear.

Another illustration is reflected by the developments visible around the deep learning era. The topics that gained popularity (compared to the previous period, namely the third AI boom) with the rise of deep learning are regulation of innovation, privacy, algorithmic decision-making and quantitative methods, financial regulation & technology, military technology and autonomous vehicles. The largest decreases in popularity in this period are those of knowledge-based systems and trends in legal technology. The popularity of the other topics underwent changes below the average for this period. The interest in most topics grew during the deep learning era.

In addition, we observed several interesting trends regarding the evolution of pairs of topics (e.g. algorithmic decision-making and quantitative methods with knowledge-based systems, and information retrieval with trends in legal technology), as well as individual topics, e.g. privacy, which emerged as a topic during the first AI winter, but interest in which saw several relatively insignificant changes before the deep learning era, when its popularity underwent a strong increase.
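The dominant-topic counting described above could be sketched as follows, assuming the document-topic weights are available as a NumPy array; the names doc_topic, years, and selected are hypothetical stand-ins for the study's data structures.

```python
import numpy as np
from collections import Counter, defaultdict

def dominant_topics(d_k, n=3):
    """Return the ids of the n topics with the largest weight in d_k."""
    return list(np.argsort(d_k)[::-1][:n])

# Hypothetical inputs: doc_topic is an (n_docs x 35) array of document-topic
# weights, `years` holds each article's publication year, and `selected`
# is the set of the 18 substantively selected topic ids.
topic_year_counts = defaultdict(Counter)
for d_k, year in zip(doc_topic, years):
    for topic_id in dominant_topics(d_k):
        if topic_id in selected:
            topic_year_counts[topic_id][year] += 1

# topic_year_counts[14][2017], for example, would be the number of 2017 articles
# for which topic id14 is among the three dominant topics.
```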
4.3 Document Similarity
As mentioned in Section 3.2.3, each document k is represented by a vector d_k, where each component of the vector shows the weight of each topic in that specific document. For any pair of documents we can use cosine similarity [42] as a measure of how close two documents are, which in our case translates to similarity in the 'topic space', i.e. two very similar documents (cosine similarity of 1) are expected to have similar topic distributions.

The detection of publications that have similar topics enhances information retrieval and reduces the risk of reproduction of scholarship because prior publications are overlooked. Having computed the similarity between all document pairs in our corpus, we obtained some 7,436,296 similarity scores, after adjusting the algorithm to avoid computing the similarity between the same document pair twice.

To explore the substantive meaning of the ensuing similarity scores, the results produced by the cosine similarity algorithm were compared to substantive similarity as assessed by a legal expert - one of the authors. The goal of the inspection was to explore (1) the extent to which papers were similar and (2) whether differences in similarity scores produced by the cosine similarity measure reflect substantively meaningful differences. For this, five papers on different topics were selected - the seed papers. Each of these five seed papers was compared with five papers with different similarity scores - the comparison papers: one similarity score in the .55-.65 range, one in the .65-.75 range, one in the .75-.85 range, one in the .85-.95 range, and one in the .95-1.00 range. As a result, a total of five seed papers and 25 comparison papers were inspected.

Without knowledge of which similarity range the comparison paper would fall under, the legal expert ranked the similarity for each pair of papers (seed paper - comparison paper) for each topic separately. A Spearman's rank correlation test showed a high correlation (r = .62, p < .001) between the ranks based on the machine-generated similarity scores and the ranks provided by the expert.

The comparison took place on four levels: the paper title (i.e. to what extent do the paper titles suggest similarity?), the research question level (i.e. are the research questions similar?), the focus or sub-topic level (i.e. which sub-topics does the paper focus on, and which angle or perspective does it take?), and the citation level (i.e. is there a reference from one paper to the other?).

The inspection of the papers suggests that papers with similarity scores of .85 or lower may share substantive similarity, but in a limited way at best (e.g. pairs of publications where one article [14] focuses on whether (source) code is or should be copyright protected and the other paper [9] on whether results produced by machines are or should be subject to copyright protection). In contrast, articles with high similarity scores, particularly those with a score of .90 or higher, are also substantively similar, at least in the case of the inspected papers. For instance, two selected papers on humanitarian law with the highest similarity score ([44] and [45]) discuss legal principles such as proportionality and necessity, and they discuss the role of subjectivity and the capability of autonomous weapons systems (similarity in the >.95 range).

When inspecting citations between papers with very high similarity scores, references from one paper to another were sometimes found, and sometimes not. In some instances where no citation was found, this could be due to the fact that the papers were published in subsequent years and the authors did not have the opportunity to cite the published version of the similar paper. Nevertheless, there are instances where a reference was expected in the papers, but not found.
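A compact sketch of the pairwise cosine-similarity computation and the rank comparison is given below, using SciPy; the inputs machine_scores and expert_ranks are hypothetical stand-ins for the expert-validation data, not the study's actual values.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

# Hypothetical input: doc_topic is an (n_docs x 35) array of document-topic vectors d_k.
# pdist evaluates each unordered pair exactly once, i.e. n*(n-1)/2 scores,
# mirroring the 7,436,296 unique similarity scores reported above.
cosine_dist = pdist(doc_topic, metric="cosine")
cosine_sim = 1.0 - squareform(cosine_dist)   # full symmetric similarity matrix

# Rank agreement between machine similarity scores and the expert's rankings for
# one seed paper's comparison set (machine_scores and expert_ranks are illustrative).
rho, p_value = spearmanr(machine_scores, expert_ranks)
```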
5 DISCUSSION
The landscape of legal research on AI has undergone considerable changes with respect to the volume of scholarship tackling topics dealing with AI. In this context, we were interested in which exact topics authors focused on, and how these topics shifted over time. For this, we applied LDA topic modeling to identify topics and analyze how journal articles were distributed across topics as well as across time (1960-2018).

The main finding for the first research question, namely which topics can be distinguished in the corpus of legal papers on AI, is that 35 topics can be identified, 32 of them being meaningful (and three miscellaneous). Overall, the model performed considerably well in identifying latent topics in our corpus (see Table 1 above).

The second research question dealt with the evolution of topics throughout the different periods that can be identified in the history of AI. The topic river displayed in Figure 1 above shows fundamental changes in legal research on AI from two perspectives: on the one hand, the total number of papers referring to artificial intelligence sees a sharp increase as of 2016, and on the other hand, the diversity of topics also increases over time. This can mostly be contextualized by the emergence of new technologies (e.g. the Internet, leading to a new topic on Internet governance), but also by the granular development of existing technologies.

As for the third research question, namely how similar papers can be detected, we calculated similarity scores between pairs of articles and compared the scores for a selection of pairs to the assessments produced by a human. It was thus explored whether the similarity scores produced by the machine coincide with the expert assessment. A correlation test revealed a high correlation between the orders produced by the machine and by the human. The highest agreement levels were found for the papers with the highest similarity scores. The similarity predicted by the machine did, however, also sometimes deviate from the expert assessment, although there will undoubtedly also be disagreement between human experts when assessing the similarity of documents.

The results do not only provide insight into how legal research within a broader theme has evolved over time; they also provide support for the development of LDA tools that assist in structuring large document collections and in finding relevant papers. To aid researchers in either exploring or keeping up with such vast and complex research themes explored in parts of legal scholarship which do not necessarily overlap or interact, it is necessary to consider how a method such as the one presented in this paper (topic modeling) can be used to further visualize this body of legal research. To this end, we are currently working on a dashboard which we aim to make available to scholars (from any discipline) with an interest in exploring the evolution of legal literature on AI, or particular topics within it. Such a dashboard would make it easier, on the one hand, for legal researchers to get the bigger picture of all the fields of law, journals, and scholars tackling topics of interest relating to AI, and on the other hand for researchers from other disciplines (e.g. computer science) to have a bird's eye view of the vast legal literature which might have a direct impact on their research. Such publicly available resources could even be a new way to stimulate more awareness of the legal and ethical implications of technology on society, and be of interest to civil society as well.

Of course, this paper does not go without limitations. First, the corpus is not necessarily representative of all legal articles or legal publications in general. Although HeinOnline papers presumably do constitute a significant part of journal articles in law, there are other publishers and repositories containing publications that may be composed substantively differently than the publications in HeinOnline. Second, another limitation regarding the corpus concerns the initial selection. We selected journal articles that included the keywords 'artificial intelligence'. Had we selected different keywords, the corpus might have looked different. Additionally, different topics can be expected when running additional topic models for a subset of the corpus. Furthermore, the analyses that explored the increase or decrease of publications over time used the three most dominant topics to determine whether topics have become less or more relevant. The results might be different if less relevant topics are also taken into consideration, although such analyses would be empirically difficult to conduct, as they would require a measure to determine when certain topic weights are deemed insufficiently relevant.

We are currently exploring the possibility of applying and comparing different topic modeling algorithms and incorporating the latest NLP representation models (namely word embeddings, using either word2vec or BERT). Moreover, we want to apply our methodology to a different legal collection and identify whether our findings can be confirmed in a different corpus. Finally, after fully validating our approach, we are planning to release it as an open source website where people can explore and visualize our findings in an intuitive way.
REFERENCES
[1] Olufunmilayo Arewa. 2006. Open Access in a Closed Universe: Lexis, Westlaw, Law Schools, and the Legal Information Market. (03 2006).
[2] Claus Boye Asmussen and Charles Møller. 2019. Smart literature review: a practical topic modelling approach to exploratory literature review. Journal of Big Data 6, 1 (2019), 93. https://doi.org/10.1186/s40537-019-0255-7
[3] Steven Barkan. 1992. Can Law Publishers Change the Law? Legal Reference Services Quarterly 11 (03 1992), 29-35. https://doi.org/10.1300/J113v11n03_05
[4] Trevor Bench-Capon, Michal Araszkiewicz, Kevin Ashley, Katie Atkinson, Floris Bex, Filipe Borges, Daniele Bourcier, Paul Bourgine, Jack G. Conrad, Enrico Francesconi, Thomas F. Gordon, Guido Governatori, Jochen L. Leidner, David D. Lewis, Ronald P. Loui, L. Thorne McCarty, Henry Prakken, Frank Schilder, Erich Schweighofer, Paul Thompson, Alex Tyrrell, Bart Verheij, Douglas N. Walton, and Adam Z. Wyner. 2012. A history of AI and Law in 50 papers: 25 years of the international conference on AI and Law. Artificial Intelligence and Law 20, 3 (2012), 215-319. https://doi.org/10.1007/s10506-012-9131-x
[5] Robert C. Berring. 1986. Full-Text Databases and Legal Research: Backing into the Future.
[6] Jon Bing. 1986. The text retrieval system as a conversion partner. International Review of Law, Computers & Technology 2, 1 (1986), 25-39. https://doi.org/10.1080/13600869.1986.9966228
[7] David M. Blei and John D. Lafferty. 2006. Dynamic Topic Models. In Proceedings of the 23rd International Conference on Machine Learning (ICML '06). ACM, New York, NY, USA, 113-120. https://doi.org/10.1145/1143844.1143859
[8] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet Allocation. J. Mach. Learn. Res. 3 (March 2003), 993-1022. http://dl.acm.org/citation.cfm?id=944919.944937
[9] A. Bridy. 2012. Coding Creativity: Copyright and the Artificially Intelligent Author. Stanford Technology Law Review 5 (2012), 1-28.
[10] Daniel Brüggermann, Yannik Hermey, Carsten Orth, Darius Schneider, Stefan Selzer, and Gerasimos Spanakis. 2016. Storyline detection and tracking using Dynamic Latent Dirichlet Allocation. Computing News Storylines (2016), 9.
[11] David J. Carter, James Brown, and Adel Rahmani. 2016. Reading the high court at a distance: Topic modelling the legal subject matter and judicial activity of the high court of australia, 1903-2015. UNSWLJ 39 (2016), 1300.
[12] Allison June-Barlow Chaney, Hanna M. Wallach, Matthew Connelly, and David M. Blei. 2016. Detecting and Characterizing Events. In EMNLP. 1142-1152.
[13] Laura Dietz, Bhaskar Mitra, Jeremy Pickens, Hana Anber, Sandeep Avula, Asia J. Biega, Adrian Boteanu, Shubham Chatterjee, Jeff Dalton, Shiri Dori-Hacohen, John Foley, Henry Feild, Ben Gamari, Rosie Jones, Pallika Kanani, Sumanta Kashyapi, Widad Machmouchi, Matthew Mitsui, Steve Nole, Alexandre Tachard Passos, Jordan Ramsdell, Adam Roegiest, David Smith, and Alessandro Sordoni. 2019. Report on the First HIPstIR Workshop on the Future of Information Retrieval. SIGIR Forum 53, 2 (December 2019), 62-75. https://www.microsoft.com/en-us/research/publication/report-on-the-first-hipstir-workshop-on-the-future-of-information-retrieval/
[14] R. Dixon. 2003. Breaking into locked rooms to access computer source code: Does the dmca violate constitutional mandate when technological barriers of access are applied to software. Virginia Journal of Law & Technology 8, 1 (2003), 1-60.
[15] Arthur Dyevre and Nicolas Lampach. 2018. Issue Attention on the European Court of Justice: A Text-Mining Approach. SSRN (2018). http://dx.doi.org/10.2139/ssrn.3251186
[16] Ian Edwards. 2018. Search like a robot: Developing targeted search algorithms. Australian Law Librarian 26, 2 (2018), 104.
[17] Daphne Gelbart and J. Smith. 1993. Automating the Process of Abstracting Legal Cases. International Journal of Law and Information Technology 1 (03 1993), 324-334. https://doi.org/10.1093/ijlit/1.3.324
[18] Daphne Gelbart and J. C. Smith. 1994. The application of automated text processing techniques to legal text management. International Review of Law, Computers & Technology 8, 1 (1994), 203-210. https://doi.org/10.1080/13600869.1994.9966390
[19] Catalina Goanta, Gijs van Dijck, and Gerasimos Spanakis. 2019. Back to the Future: Waves of Legal Scholarship on Artificial Intelligence. Forthcoming in Sofia Ranchordás and Yaniv Roznai, Time, Law and Change (Oxford, Hart Publishing, 2019).
[20] Graham Greenleaf, Daniel Austin, Philip Chung, Andrew Mowbray, Madeleine Davis, and Jill Matthews. 2000. Solving the Problems of Finding Law on the Web: World Law and DIAL. Journal of Information, Law and Technology 2000 (01 2000). https://doi.org/10.1017/S0731126500009483
[21] Graham Greenleaf, Andrew Mowbray, Geoffrey King, and Geoffrey van Dijk. 1995. Public Access to Law via Internet: The Australian Legal Information Institute. Journal of Law and Information Science 6, 1 (1995), 50.
[22] F. Hanson. 2002. From Key Numbers to Keywords: How Automation Has Transformed the Law. Law Library Journal 94 (09 2002).
[23] Karen Hao. 2019. We analyzed 16,625 papers to figure out where AI is headed next. https://www.technologyreview.com/s/612768/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/
[24] Maria Iglesias, Sharon Shamuilia, and Amanda Anderberg. 2019. Intellectual Property and Artificial Intelligence - A literature review. EU Science Hub - European Commission (Dec. 2019). https://ec.europa.eu/jrc/en/publication/intellectual-property-and-artificial-intelligence-literature-review
[25] Aaron Kirschenfeld. 2017. Yellow Flag Fever: Describing Negative Legal Precedent in Citators. https://doi.org/10.31228/osf.io/dfjah
[26] Melanie Knapp and Rob Willey. 2016. Comparison of Research Speed and Accuracy Using WestlawNext and Lexis Advance. Legal Reference Services Quarterly 35 (05 2016), 1-11. https://doi.org/10.1080/0270319X.2016.1177428
[27] Michael A. Livermore, Allen Riddell, and Daniel Rockmore. 2016. Agenda formation and the US supreme court: A topic model approach. Arizona Law Review 1, 2 (2016).
[28] Peter Maggs. 1994. Legal Data Banks in the United States and Their Use in Comparative Law. International Journal of Legal Information 22 (01 1994), 214-227. https://doi.org/10.1017/S0731126500024926
[29] Daniel Maier, A. Waldherr, P. Miltner, G. Wiedemann, A. Niekler, A. Keinert, B. Pfetsch, G. Heyer, U. Reber, T. Häussler, H. Schmid-Petri, and S. Adam. 2018. Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology. Communication Methods and Measures 12, 2-3 (2018), 93-118. https://doi.org/10.1080/19312458.2018.1430754
[30] Elizabeth M. McKenzie. 2001. Natural Language Searching. Legal Reference Services Quarterly 18, 4 (2001), 39-47. https://doi.org/10.1300/J113v18n04_04
[31] David Mimno, Hanna M. Wallach, Edmund Talley, Miriam Leenders, and Andrew McCallum. 2011. Optimizing semantic coherence in topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 262-272.
[32] A. Moens. 2009. Is AustLII Sustainable? Australian Law Librarian 17, 3 (2009), 154-157.
[33] Marie-Francine Moens, Maarten Logghe, and Jos Dumortier. 2002. Legislative Databases: Current Problems and Possible Solutions. International Journal of Law and Information Technology 10 (03 2002). https://doi.org/10.1093/ijlit/10.1.1
[34] Elizabeth Moll-Willard. 2018. The use and perceptions of open Access resources by legal academics at the University of Cape Town (UCT) in South Africa. 6 (09 2018), 1-13.
[35] David Newman, Youn Noh, Edmund Talley, Sarvnaz Karimi, and Timothy Baldwin. 2010. Evaluating topic models for digital libraries. In Proceedings of the 10th Annual Joint Conference on Digital Libraries. ACM, 215-224.
[36] P. Ogden. 1993. "Mastering the lawless science of our law": A story of legal citation indexes. Law Library Journal 85 (01 1993), 1-48.
[37] Marc Opijnen, Hayo Schreijer, Ilja Andreas, and Maarten Kroon. 2015. Specialised Government Publishing: The Law Pocket and Linked Legal Data in the Netherlands.
[38] Mark K. Osbeck. 2018. Lawyer as Soothsayer: Exploring the Important Role of Outcome Prediction in the Practice of Law. Penn State Law Review 123, 1 (2018), 41-102.
[39] Yannis Panagis and Urska Sadl. 2015. The Force of EU Case Law: A Multi-dimensional Study of Case Citations. In JURIX. 71-80.
[40] João Reis, Paula Espírito Santo, and Nuno Melão. 2019. Impacts of Artificial Intelligence on Public Administration: A Systematic Literature Review. In 2019 14th Iberian Conference on Information Systems and Technologies (CISTI). IEEE, 1-7.
[41] Gerard Salton and Christopher Buckley. 1988. Term-weighting approaches in automatic text retrieval. Information Processing & Management 24, 5 (1988), 513-523.
[42] G. Salton, A. Wong, and C. S. Yang. 1975. A Vector Space Model for Automatic Indexing. Commun. ACM 18, 11 (Nov. 1975), 613-620. https://doi.org/10.1145/361219.361220
[43] Pamela Samuelson. 2010. Google Book Search and the Future of Books in Cyberspace. Minnesota Law Review 94 (01 2010).
[44] M. Sassoli. 2014. Autonomous Weapons and International Humanitarian Law: Advantages, Open Technical Questions and Legal Issues to Be Clarified. International Law Studies, US Naval War College 90 (2014), 308-340.
[45] Michael Schmitt and Jeffrey Thurnher. 2013. 'Out of the Loop': Autonomous Weapon Systems and the Law of Armed Conflict. Harvard National Security Journal 4, 2 (2013), 231-281.
[46] Cecilia Magnusson Sjöberg. 1997. Corpus Legis: A Legal Document Management Project. International Journal of Law and Information Technology 5, 1 (03 1997), 83-99. https://doi.org/10.1093/ijlit/5.1.83
[47] James A. Sprowl. 1976. Computer-Assisted Legal Research—An Analysis of Full-Text Document Retrieval Systems, Particularly the LEXIS System. Law & Social Inquiry 1, 1 (1976), 175-226. https://doi.org/10.1111/j.1747-4469.1976.tb00955.x
[48] Keith Stevens, Philip Kegelmeyer, David Andrzejewski, and David Buttler. 2012. Exploring topic coherence over many models and many topics. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 952-961.
[49] Andrew Tutt. 2017. An FDA for Algorithms. Administrative Law Review 69, 1 (2017), 83-123.
[50] Marc van Opijnen. 2017. Gaining Momentum. How ECLI Improves Access to Case Law in Europe.
[51] Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, and David Mimno. 2009. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine Learning. 1105-1112.
[52] Chong Wang, David M. Blei, and David Heckerman. 2008. Continuous Time Dynamic Topic Models. In UAI, David A. McAllester and Petri Myllymäki (Eds.). AUAI Press, 579-586. http://dblp.uni-trier.de/db/conf/uai/uai2008.html#WangBH08
[53] Xuerui Wang and Andrew McCallum. 2006. Topics over Time: A non-Markov Continuous-time Model of Topical Trends. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06). ACM, New York, NY, USA, 424-433. https://doi.org/10.1145/1150402.1150450
[54] E. Warren. 2017-2018. A Simple Guide to Machine Learning. SciTech Lawyer 14, 1 (2017-2018), 5-9.
[55] Clemens Wass. 2017. openlaws.eu – Building Your Personal Legal Network.
[56] Xing Wei, Jimeng Sun, and Xuerui Wang. 2007. Dynamic Mixture Models for Multiple Time-Series. In IJCAI, Manuela M. Veloso (Ed.). 2909-2914. http://dblp.uni-trier.de/db/conf/ijcai/ijcai2007.html#WeiSW07
[57] Robin Widdison. 2002. New Perspectives in Legal Information Retrieval. International Journal of Law and Information Technology 10, 1 (01 2002), 41-70. https://doi.org/10.1093/ijlit/10.1.41
[58] R. Winkels. 2015. Experiments in finding relevant case law. In NAIL 2015: 3rd International Workshop on Network Analysis in Law.

A APPENDIX

Figure 3: Number of journal articles with keywords per year

Figure 4: Distribution of the number of authors over the number of documents

Figure 5: Number of publications per author for authors with at least five publications