
DistilRoBERTa Based Sentence Embedding for Rhetorical Role Labelling of Legal Case Documents

Deepthi Sudharsan

Asmitha U

Premjith B

b_premjith@cb.amrita.edu

Soman K.P

kp_soman@amrita.edu

Center for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India

2021


In a country like India, with a very dense and growing population, the number of legal judgements filed increases every year. As the number of legal case documents grows, a systematic and structured organization of the files is essential for the smooth running of the legal system. As part of AILA 2021, assigning rhetorical roles to the sentences of legal documents was posed as a shared task in order to automate this process. Deep learning and machine learning models can help accomplish this task. For efficient information retrieval and classification, preprocessing and sentence embedding with a sentence transformer are discussed in this paper. Compared with the other machine learning and deep learning models trained for the task, a basic Artificial Neural Network with one hidden layer and 1024 × 2 neurons performed best, giving the maximum validation accuracy of 85.18% and a testing precision of 30.9%. It was therefore used to further evaluate and improve the prediction of the rhetorical roles.

Keywords: Documents, Rhetorical role labelling, distilroberta-base, Artificial Neural Networks

1. Introduction

For the efficient working and smooth administration of a court of law, an organized and efficient structure for storing legal case documents is essential. Manually examining the legal judgments issued by higher courts or legal officials to extract key information can be a cumbersome and error-prone process. As a result, automatically retrieving information from legal court case transcripts and employing deep learning techniques to classify those judgments would provide several advantages to individuals working in the legal services industry [1]. Deep learning networks are well suited to improving the readability of legal judgments and to classifying their content by common thematic rhetorical roles such as “Facts of the Case”, “Issues being discussed”, “Arguments given by the parties”, etc. [2]. The Artificial Intelligence for Legal Assistance track (AILA 2021) [3, 4] posed a task that aims to classify each sentence of a legal case document into one of seven predefined rhetorical roles.

To arrive at an efficient and less error-prone solution for the task, various machine learning and deep learning models were trained with and without hyperparameter tuning using GridSearchCV, although the tuning later proved to be inefficient for this task. Among the machine learning models trained, namely K-Nearest Neighbor (KNN), Decision Tree, Random Forest, Naive Bayes, Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM), SVM was the most accurate in predicting the roles, with an accuracy of 53%. All the deep learning models trained, namely Long Short-Term Memory (LSTM) network, Artificial Neural Network (ANN) and Convolutional Neural Network (CNN), performed significantly better than the machine learning models. The ANN outperformed all the other models trained for the task, with a validation accuracy of about 85.18% on the training dataset. The single-hidden-layer ANN model was further evaluated in two different runs on the testing dataset, and its performance was analyzed.

The paper is organized as follows: Section 2 introduces related research in the field of legal document retrieval; Section 3 describes the dataset; Section 4 explains the methodology proposed for the task; Section 5 discusses the evaluation outcomes. Finally, Section 6 concludes the work with some suggestions for further improvement.

2. Related Works

In [5], GloVe, Doc2Vec and Term Frequency-Inverse Document Frequency (TF-IDF) based methods were used to label the rhetorical roles of legal judgements in AILA 2020 [6]. Manual annotation plays a significant part in the automatic labelling of the rhetorical roles of sentences. Some works deal with the annotation process itself (producing a set of rules for annotation, inter-annotator studies, and so on), whereas papers that aim to automate the task of semantic labelling also perform an annotation analysis [7]. Classifier models such as fastText have also been proposed for searching through legal facts in case documents, as discussed in [8]. In [9], the BM25 ranking algorithm was used to identify relevant prior cases for a given situation based on best matches. Similar work was done in [10] as part of AILA 2019 [11], additionally using cosine similarity and Jaccard similarity. With the rise of deep learning applications for legal information retrieval, a high demand for neural network based classification is reflected in many works [12].

3. Data Description

For the given task, the provided training data set consists of over 60 case documents together with the rhetorical role of each sentence in every document. The predefined rhetorical roles to be predicted are listed in Table 1.

4. Methodology

A model that can predict the rhetorical roles of sentences in legal documents with minimal error needed to be designed. After the sentences were retrieved from all the documents and preprocessed, they were embedded using a pre-trained model available in the Hugging Face Transformers library¹. A variety of machine learning and deep learning models were trained and tested to find the best model for the task. The proposed methodology is shown in Figure 1.

[Table 1 (columns: Class, Occurrence)]

4.1. Preprocessing

In the data cleaning process, numerical characters, punctuation and extra white space were removed using a regular expression package. The sentences were converted to lowercase to maintain uniformity, and stop words were then removed using the NLTK library². During exploratory analysis, 69 sentences were found to have no assigned role and were therefore dropped. The non-numerical labels were encoded as integers using a label encoder.
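The cleaning steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the short `STOP_WORDS` set is a hypothetical stand-in for NLTK's English stop-word list, `encode_labels` imitates a label encoder by assigning indices in sorted order, and the example sentences are invented.

```python
import re

# Hypothetical stand-in for NLTK's English stop-word list (assumption:
# the real pipeline would load the full list from NLTK).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "is", "was", "by", "to", "and"}


def clean_sentence(sentence: str) -> str:
    """Lowercase, strip digits and punctuation, collapse spaces, drop stop words."""
    text = sentence.lower()
    text = re.sub(r"\d+", " ", text)       # remove numerical characters
    text = re.sub(r"[^\w\s]", " ", text)   # remove punctuation
    text = text.replace("_", " ")          # "_" counts as a word character in \w
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)                # split/join also collapses extra spaces


def encode_labels(labels):
    """Map each distinct label to an integer index (sorted order)."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


# Invented example rows: (sentence, rhetorical role or None if unassigned).
rows = [
    ("The appellant was convicted under Section 302 of the IPC.", "Facts"),
    ("An unlabelled sentence.", None),
    ("The appeal is dismissed.", "Ruling by Present Court"),
]
# Drop sentences that have no assigned role, then clean the rest.
labelled = [(clean_sentence(s), role) for s, role in rows if role is not None]
codes, classes = encode_labels([role for _, role in labelled])
```

Keeping the `classes` list alongside the integer codes matters later, because it is what allows a predicted index to be mapped back to a role name.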

4.2. Embedding

Sentence embedding was accomplished using the pretrained sentence transformer distilroberta-base [13, 14]. distilroberta-base is a pretrained model available through the Hugging Face library that uses the contextual relations between words to yield contextualized vector embeddings.

¹ https://huggingface.co/transformers/
² https://www.nltk.org/
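To illustrate how per-token contextual vectors can be reduced to a single sentence vector, the sketch below implements mean pooling over the non-padding tokens. Mean pooling is a common choice in sentence transformers, but whether the setup used in this paper pooled in this way is an assumption; the token vectors and mask are toy values, and `mean_pool` is a hypothetical helper name.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:  # skip padding positions
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


# Toy example: three 2-dimensional token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
```

In a real model each token vector would have several hundred dimensions; the pooled vector is what a downstream classifier receives as input.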

4.3. Model Training

Initially, machine learning models such as KNN, Decision Tree, Random Forest, Naive Bayes, MLP and SVM were trained, and the SVM classifier was found to have the maximum accuracy of 53%. To improve the classification accuracy, deep learning models such as LSTM (Long Short-Term Memory) [15], ANN (Artificial Neural Network) and CNN (Convolutional Neural Network) [16] were trained. Of the three deep learning models, the ANN had the highest accuracy, about 99%. During training, class weights [17] were generated and passed as an input parameter to the models to address the class imbalance in the data set (Table 1). To improve the overall performance of the models, GridSearchCV³ was used to search for the best parameters, but the parameters it yielded did not improve the accuracy. Hence, the ANN model was chosen to perform the classification task. The structure of the selected ANN model is depicted in Figure 2.
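As an illustration of how class weights can counter label imbalance, the sketch below uses the widely used "balanced" heuristic, weight = n_samples / (n_classes × class count). That the authors used this exact formula is an assumption; the label counts are invented and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: n_samples / (n_classes * count_of_label)}."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Invented, deliberately imbalanced label distribution.
labels = ["Facts"] * 6 + ["Argument"] * 3 + ["Statute"] * 1
weights = balanced_class_weights(labels)

# Rare classes receive larger weights, so their errors count more in the loss.
total_weight = sum(weights[label] for label in labels)
```

With this heuristic the per-sample weights sum to the number of samples, so the overall scale of the loss is unchanged while rare roles are emphasized.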

The artificial neural network used has one hidden layer, which is connected to the input and output layers. The embedded vectors pass through the input layer and the neurons of the hidden layer; the class index predicted at the output layer is then decoded back to a role label using the inverse transform of the label encoder.
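The forward pass and decoding step described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' network: the ReLU and softmax activations, the toy layer sizes, the random weights and the three-role `classes` list are all chosen for the example (the task itself has seven roles and the paper does not state the activations).

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one output per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict_role(embedding, params, classes):
    """Input -> hidden layer -> output layer -> decoded role label."""
    hidden = relu(dense(embedding, params["w1"], params["b1"]))
    probs = softmax(dense(hidden, params["w2"], params["b2"]))
    best = max(range(len(probs)), key=probs.__getitem__)
    # Inverse transform of the label encoder: class index back to role name.
    return classes[best], probs


rng = random.Random(0)
EMBED_DIM, HIDDEN, N_CLASSES = 8, 4, 3  # toy sizes (assumption)
classes = ["Argument", "Facts", "Precedent"]
params = {
    "w1": [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "w2": [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(N_CLASSES)],
    "b2": [0.0] * N_CLASSES,
}
embedding = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
role, probs = predict_role(embedding, params, classes)
```

Because the weights here are random and untrained, the predicted role is meaningless; the sketch only shows the data flow from embedding to decoded label.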

5. Results and Discussions

The accuracies recorded for different numbers of layers and neurons after training for 32 epochs are compared in Table 2.

³ https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

As observed, for one layer without dropout, the training and validation accuracies are substantially high. However, with 1024 neurons the model was overfitting, and hence 1024 × 2 neurons were used. The ANN model was run with these parameters for two cases.

In the first run, both the train and test data were embedded together whereas in the second run they were embedded separately.

From Table 3, it is observed that Run 2 performed better, with a precision of 30.9%, than Run 1, which gave a precision of just 17.9%. Hence, the training and testing data were embedded separately to achieve better results.

Table 4 shows the category-wise comparison of the precision, recall and F-score metrics for both runs. The single-layer ANN architecture predicted the “Ruling by lower court” and “Statute” labels incorrectly in both runs. Run 2 predicted “Argument”, “Facts” and “Ratio of the decision” better than Run 1. It is also observed that Run 2 was able to correctly predict “Ruling by Present Court”. Unlike the trend shown by the other labels, the rhetorical role “Precedent” was predicted better by Run 1 than by Run 2.
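For reference, the category-wise metrics discussed above can be computed as in the following sketch. The labels and predictions are invented, `per_class_scores` is a hypothetical helper, scoring a class with no predictions as 0.0 is a convention chosen here, and whether the shared task reports a macro average is an assumption.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for every label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Convention (assumption): undefined ratios are reported as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        scores[label] = (precision, recall, f_score)
    return scores


# Invented gold labels and predictions; "Statute" is never predicted correctly.
y_true = ["Facts", "Facts", "Argument", "Statute", "Facts"]
y_pred = ["Facts", "Argument", "Argument", "Facts", "Facts"]
scores = per_class_scores(y_true, y_pred)

# Macro average: every class counts equally, regardless of its frequency.
macro_precision = sum(p for p, _, _ in scores.values()) / len(scores)
```

Under a macro average, a class that is never predicted correctly contributes zero and pulls the overall score down sharply, which is one way a model with high validation accuracy can still show low test precision.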

6. Conclusion

This paper describes the systematic approach undertaken to predict the rhetorical roles of sentences in legal documents using multiple machine learning and deep learning techniques. A basic single-layer ANN trained on embeddings from the pre-trained sentence transformer distilroberta-base achieved a testing precision of 30.9%. The work can be extended by using alternative methods such as the BM25 ranking algorithm and other embedding methods such as TF-IDF or fastText to improve the overall prediction accuracy.

References

[1] Rathnayake, Rupasinghe, N. de Silva, Warushavithana, Gamage, Perera, Perera, Classifying sentences in court case transcripts using discourse and argumentative properties, International Journal on Advances in ICT for Emerging Regions (ICTer) 12 (2019) 1. doi:10.4038/icter.v12i1.7200.

[2] Ghosh, Wyner, Identification of rhetorical roles of sentences in Indian legal judgments, in: Legal Knowledge and Information Systems: JURIX 2019: The Thirty-second Annual Conference, volume 322, IOS Press, 2019, p. 3.

[3] Parikh, U. Bhattacharya, Mehta, Ayan, Bhattacharya, Ghosh, Pal, Bhattacharya, Majumder, Overview of the third shared task on artificial intelligence for legal assistance at FIRE 2021, in: FIRE (Working Notes), 2021.

[4] Parikh, U. Bhattacharya, Mehta, Ayan, Bhattacharya, Ghosh, Pal, Bhattacharya, Majumder, FIRE 2021 AILA track: Artificial intelligence for legal assistance, in: Proceedings of the 13th Forum for Information Retrieval Evaluation, 2021.

[5] Almuslim, Inkpen, Document level embeddings for identifying similar legal cases and laws, in: FIRE, 2020.

[6] Bhattacharya, Mehta, Ghosh, Pal, Bhattacharya, Majumder, Overview of the FIRE 2020 AILA track: Artificial intelligence for legal assistance, in: FIRE (Working Notes), 2020.

[7] Šavelka, K. D. Ashley, Segmenting U.S. court decisions into functional and issue specific parts, in: JURIX, 2018.

[8] I. Nejadgholi, R. Bougueng Tchemeube, Witherspoon, A semi-supervised training method for semantic search of legal facts in Canadian immigration cases, 2017. doi:10.3233/978-1-61499-838-9-125.

[9] Gain, Bandyopadhyay, A. De, Saikh, Ekbal, IITP at AILA 2019: System report for artificial intelligence for legal assistance shared task, 2021.

[10] Kayalvizhi, Thenmozhi, Aravindan, Legal assistance using word embeddings, in: FIRE, 2019.

[11] Bhattacharya, Paul, Ghosh, Wyner, Identification of rhetorical roles of sentences in Indian legal judgments, in: Proc. International Conference on Legal Knowledge and Information Systems (JURIX), 2019.

[12] S. Mandal, S. D. Das, Unsupervised identification of relevant cases & statutes using word embeddings, in: FIRE, 2019.

[13] J. Du, E. Grave, B. Gunel, V. Chaudhary, O. Çelebi, M. Auli, V. Stoyanov, A. Conneau, Self-training improves pre-training for natural language understanding, in: NAACL, 2021.

[14] A. Barua, S. Thara, B. Premjith, K. Soman, Analysis of contextual and non-contextual word embedding models for Hindi NER with web application for data collection, in: International Advanced Computing Conference, Springer, 2020, pp. 183–202.

[15] B. Premjith, K. Soman, Deep learning approach for the morphological synthesis in Malayalam and Tamil at the character level, Transactions on Asian and Low-Resource Language Information Processing 20 (2021) 1–17.

[16] T. T. Sasidhar, B. Premjith, K. Soman, Emotion detection in Hinglish (Hindi+English) code-mixed social media text, Procedia Computer Science 171 (2020) 1346–1352.

[17] B. Premjith, K. P. Soman, P. Poornachandran, Amrita_CEN@FACT: Factuality identification in Spanish text, in: IberLEF@SEPLN, 2019, pp. 111–118.