Legal Information Retrieval and Rhetorical Role Labelling for Legal Judgements

Nitin Nikamanth Appiah Balaji, B. Bharathi and J. Bhuvana
Department of CSE, Sri Siva Subramaniya Nadar College of Engineering, Tamil Nadu, India

Abstract
Retrieving the most relevant information from a huge collection of documents is a tedious process. Nowadays, artificial intelligence plays a major role in information retrieval tasks, which are used in different applications such as search engines, relevance feedback, and summarization. This paper explains the methods used to solve the problems given in the shared task on Artificial Intelligence for Legal Assistance proposed by the Forum for Information Retrieval Evaluation in 2020 (AILA@FIRE2020). The challenge consists of two tasks. The first task is, given the description of a situation, to identify relevant statutes and prior cases. The second task is, given a legal case document, to classify each sentence of the document into one of 7 semantic segments, or rhetorical roles. We compare two systems for the first task: TFIDF features with the cosine similarity metric, and the BM25 ranking algorithm. For the semantic labeling task, pre-trained FastText embeddings with an MLP and TFIDF with a random forest classifier are used. The BM25 ranking algorithm shows significantly better results for the first task, and the pre-trained FastText method performs better than the vectorization method for classifying the rhetorical roles.

Keywords
Legal information retrieval, Rhetorical role labelling, BM25, TFIDF, FastText

1. Introduction
Countries such as India, the UK, Canada, and Australia use the common law system. Two important primary sources of law exist in this system. The first one is called statutes, which are the written laws.
The second source is precedents, the judgments of prior cases delivered by a court, which involve legal facts and issues similar to the current case but are not directly addressed in the written law. A legal counselor working on a new case often relies on these statutes and precedents to understand how the Court has discussed, argued, and ruled in similar scenarios. The first task is to retrieve the relevant statutes and prior cases given the description of a situation.

Most legal case documents follow a common structure with different sections like "Details of the Case", "Issues being discussed", "Arguments given by the parties", etc. These sections are popularly termed "rhetorical roles". Knowledge of such semantic roles not only improves the readability of the documents but is also needed to compute document similarity, produce summaries, etc. However, this information is generally not specified explicitly in case documents, which are usually just free-flowing text. The second task is to semantically label each sentence with one of the seven roles.

FIRE 2020: Forum for Information Retrieval Evaluation, December 16-20, 2020, Hyderabad, India
nitinnikamanth17099@cse.ssn.edu.in (N. N. A. Balaji); bharathib@ssn.edu.in (B. Bharathi); bhuvanaj@ssn.edu.in (J. Bhuvana)
ORCID: 0000-0002-6105-0998 (N. N. A. Balaji); 0000-0001-7279-5357 (B. Bharathi)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

2. Related work
In AILA 2019, the tasks of identifying the most relevant prior cases and retrieving the most relevant statutes for a given situation were addressed by several authors. In [1], the BM25 ranking algorithm together with the unsupervised Doc2vec algorithm was used. Named entity recognition preprocessing with TFIDF and BM25 was used in [2].
In [3], pre-trained word embeddings are used for the query and the relevant documents, and cosine similarity is used for retrieving the most relevant prior cases or statutes. The authors of [4] used different vectorization methods with similarity metrics such as Jaccard similarity and cosine similarity to retrieve the target documents. In [5], a Language Model for Information Retrieval, a Vector Space Model, and BM25 were used for information retrieval. In [6], the important topic words were extracted from the given situation and used as a query to identify the relevant prior cases.

3. Proposed system

3.1. Precedent & Statute Retrieval
The overview of the task is described in [7]. The first task involves the identification and ranking of the most relevant statutes or prior cases for a given description of a legal situation. This is done by measuring the correlation between the query and the available prior documents; the documents are then ranked by this correlation score, as computed by the chosen algorithm. For this task, the BM25 ranking algorithm and TFIDF vectorization with cosine similarity are considered. The performance of both systems is compared using MAP, BPREF, Recip_rank, and P@10 scores.

3.1.1. TFIDF - Cosine Similarity
TFIDF (Term Frequency Inverse Document Frequency) is used to convert the given queries and documents into numerical vectors. TFIDF generates a vector that downweights stop words and words with little semantic meaning. Reducing this noise from the documents helps the cosine-similarity function concentrate on the important words. The cosine similarity gives the correlation between the query and the document by equation 1:

cos θ = (Q · D) / (|Q| · |D|)    (1)

where Q is the TFIDF vector of the query and D is the TFIDF vector of the document (statute or case).

3.1.2. BM25 Ranking
BM25 (Best Matching) is a bag-of-words algorithm which ranks documents based on the appearance of the query terms in each document.
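The TFIDF plus cosine-similarity retrieval described in Section 3.1.1 can be sketched as follows. This is a minimal illustration using scikit-learn; the documents and query are toy placeholders, not the AILA data.

```python
# Illustrative sketch of TFIDF + cosine-similarity retrieval (equation 1).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "the court held that the statute applies to the appellant",
    "the tribunal dismissed the appeal on procedural grounds",
    "the accused was convicted under the penal code",
]
query = "which statute applies to the appellant"

# Fit TFIDF on the document collection; English stop words are removed so
# that the similarity focuses on content-bearing terms.
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query])

# Cosine similarity between the query and every document, then rank the
# documents from most to least similar.
scores = cosine_similarity(query_vector, doc_vectors)[0]
ranking = scores.argsort()[::-1]
print(ranking[0])  # index of the highest-ranked document
```

In the actual task, the ranked indices would map back to statute or prior-case identifiers for submission.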
It does not consider the proximity of the query terms within the documents. The Okapi BM25 variant is used for this task. The BM25 score between a query and a document is calculated using equation 2:

score(Q, D) = Σ_{i=1}^{n} IDF(q_i) · f(q_i, D) · (k1 + 1) / ( f(q_i, D) + k1 · (1 − b + b · |D| / avgdl) )    (2)

where q_1, ..., q_n are the keywords in Q and D is the document, f(q_i, D) is the term frequency of q_i in D, |D| is the number of words in D, avgdl is the average length of the documents, IDF(q_i) is the inverse document frequency, and k1 and b are free parameters.

3.2. Rhetorical Role Labeling for Legal Judgements
The second task involves the classification of sentences from legal case documents into 7 semantic segments, or rhetorical roles. For converting the sentences into a numerical feature matrix, two different feature extraction techniques, FastText and TFIDF, are implemented and compared. The extracted features are classified into role labels using Multilayer Perceptron (MLP) and Random Forest (RF) classifiers. The scikit-learn implementations of the machine learning models and the TFIDF feature extractor are used. Accuracy, macro F1, precision, and recall are used for evaluation.

3.2.1. FastText Embedding
The FastText pre-trained models are CBOW models trained on Common Crawl and Wikipedia data. As the number of data samples is limited, pre-trained embeddings could provide scope for improvement. The FastText model pre-trained on an English corpus is considered. A fixed-length vector of 300 dimensions is generated for each sentence in the data-set. This is then fed to a Multilayer Perceptron with hidden layer sizes of 512 and 128, trained for 200 iterations.

3.2.2. TFIDF Vectorization
TFIDF vectorization is based on the count-based vectorization technique. In TFIDF, the inverse document frequency term reduces the impact of common words on the classification task.
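The Okapi BM25 score of equation 2 can be sketched directly in code. This is a minimal illustration, not the implementation used in the experiments; the corpus, tokenization, and parameter values (k1 = 1.5, b = 0.75) are illustrative assumptions.

```python
# Minimal sketch of the Okapi BM25 score (equation 2).
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a tokenized query."""
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs  # average document length
    score = 0.0
    for q in query_terms:
        # Okapi IDF, with the usual +1 inside the log to keep it non-negative.
        n_q = sum(1 for d in corpus if q in d)
        idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5) + 1)
        f = doc_terms.count(q)  # term frequency f(q_i, D)
        score += idf * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc_terms) / avgdl)
        )
    return score

corpus = [
    "the statute governs contract disputes".split(),
    "the appeal was dismissed by the court".split(),
]
query = "statute contract".split()
scores = [bm25_score(query, doc, corpus) for doc in corpus]
print(scores.index(max(scores)))  # index of the best-matching document
```

Ranking the full collection by this score yields the retrieval results reported for the BM25 system.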
TFIDF vectorization with an n-gram range of 1-5 is applied and a sparse matrix is generated. This sparse matrix is then classified using a Random Forest classifier.

4. Results and Discussions
Comparing P@10 for the precedent retrieval sub-task, it is clear that the BM25 algorithm outperforms the TFIDF model. For the statute retrieval sub-task, however, the TFIDF model performs slightly better, while the P@10 scores of the two models are equal. Overall, we can say that the BM25 model performed better than the TFIDF model. The performance of the models on the test data is shown in Tables 1 and 2 for task1 and task2 respectively.

Table 1
Results of test-set for Task1 - Precedent & Statute Retrieval.

Task       Model                      MAP     BPREF   Recip_Rank  P@10
Precedent  BM25                       0.1264  0.0918  0.2043      0.08
Precedent  TFIDF + cosine-similarity  0.0652  0.0406  0.1004      0.05
Statute    BM25                       0.1181  0.069   0.2739      0.07
Statute    TFIDF + cosine-similarity  0.3423  0.136   0.3423      0.07

Table 2
Results of test-set for Task2 - Rhetorical Role Labeling for Legal Judgements.

Model             Classifier  Precision  Recall  Macro F1 score  Accuracy
FastText          MLP         0.384      0.4     0.354           0.46
TFIDF n-gram=1-5  RF          0.473      0.354   0.333           0.467

For task2, the FastText and TFIDF models show similar performance. The accuracy of both models is around 0.46, but comparing the macro F1 score, FastText shows slightly better performance, around a 6% improvement over the TFIDF model. This is due to the advantage the FastText model gains from its pre-trained weights.

5. Conclusion
In this paper, we study methods for legal document retrieval and rhetorical role labeling. Information retrieval techniques play a major role in these tasks. We propose BM25 and TFIDF - cosine similarity algorithms for the precedent and statute retrieval task. Our models achieved P@10 scores of 0.08 and 0.07 for the precedent and statute retrieval sub-tasks. For the rhetorical role labeling task, we compare the FastText and TFIDF models.
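The TFIDF n-gram plus Random Forest pipeline of Section 3.2.2 can be sketched as follows with scikit-learn. The sentences, role labels, and forest size are toy placeholders, not the AILA training data or tuned settings.

```python
# Illustrative sketch of TFIDF (n-grams 1-5) + Random Forest classification.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy sentences with two illustrative rhetorical-role labels.
sentences = [
    "the facts of the case are as follows",
    "the background of the dispute is summarised below",
    "we therefore allow the appeal",
    "accordingly the petition is dismissed",
]
labels = ["Facts", "Facts", "Ruling", "Ruling"]

# TFIDF with word n-grams of length 1 to 5 produces the sparse matrix,
# which the Random Forest then classifies into role labels.
pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 5)),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(sentences, labels)
prediction = pipeline.predict(["the facts of the dispute are as follows"])[0]
print(prediction)
```

In the actual task, each sentence of a case document would be labelled with one of the seven rhetorical roles in this way.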
Our FastText model produces a macro F1 score of 0.354.

References
[1] B. Gain, D. Bandyopadhyay, A. De, T. Saikh, A. Ekbal, IITP at AILA 2019: System report for artificial intelligence for legal assistance shared task, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 19-24. URL: http://ceur-ws.org/Vol-2517/T1-3.pdf.
[2] R. More, J. Patil, A. Palaskar, A. Pawde, Removing named entities to find precedent legal cases, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 13-18. URL: http://ceur-ws.org/Vol-2517/T1-2.pdf.
[3] S. Mandal, S. D. Das, Unsupervised identification of relevant cases & statutes using word embeddings, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 31-35. URL: http://ceur-ws.org/Vol-2517/T1-5.pdf.
[4] S. Kayalvizhi, D. Thenmozhi, C. Aravindan, Legal assistance using word embeddings, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 36-39. URL: http://ceur-ws.org/Vol-2517/T1-6.pdf.
[5] Y. Shao, Z. Ye, THUIR@AILA 2019: Information retrieval approaches for identifying relevant precedents and statutes, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 46-51. URL: http://ceur-ws.org/Vol-2517/T1-8.pdf.
[6] Z. Zhao, H. Ning, L. Liu, C. Huang, L. Kong, Y. Han, Z. Han, FIRE2019@AILA: Legal information retrieval using improved BM25, in: P. Mehta, P. Rosso, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019, volume 2517 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 40-45. URL: http://ceur-ws.org/Vol-2517/T1-7.pdf.
[7] P. Bhattacharya, P. Mehta, K. Ghosh, S. Ghosh, A. Pal, A. Bhattacharya, P. Majumder, Overview of the FIRE 2020 AILA track: Artificial Intelligence for Legal Assistance, in: Proceedings of FIRE 2020 - Forum for Information Retrieval Evaluation, 2020.