Unsupervised Identification of Relevant Cases & Statutes Using Word Embeddings

Soumil Mandal¹ and Sourya Dipta Das² ∗

¹ SRM University, Chennai, India
² Jadavpur University, Kolkata, India
{soumil.mandal, dipta.juetce}@gmail.com

Abstract. In this paper, we describe the systems that we submitted as team JU SRM for the FIRE 2019 track on Artificial Intelligence for Legal Assistance (AILA 2019). The two tasks in this track were 1) identifying relevant prior cases and 2) identifying relevant statutes. For both tasks, we took an unsupervised approach, using pre-trained word embeddings to encode texts and computing relevance as the cosine similarity between the query and target documents.

Keywords: Artificial Intelligence · Legal Assistance · Sent2Vec · FastText · BERT

1 Introduction

Like many other practical domains, the legal domain is gradually incorporating automated methods, especially after the rapid growth and development of machine learning and information retrieval models. To encourage researchers to explore such automated methods, FIRE 2019 included a track named Artificial Intelligence for Legal Assistance (AILA) [11]. In many countries, when a lawyer is presented with a case, the final verdict is generally based on two things: 1) statutes (established laws) and 2) precedents (prior cases). Statutes inform the lawyer about how legal principles apply to a given situation, while precedents inform the lawyer about how similar cases were dealt with in the past. If this pipeline of collecting relevant statutes and precedents can be automated as an information retrieval model, it will help not only the lawyer but also several other people, including clients and subjects. Motivated by this, the organizers of AILA added two tasks: 1) identifying relevant prior cases for a given situation and 2) identifying the most relevant statutes for a given situation.
Given a case description as the query, the goal was to rank the target documents according to relevance. The datasets provided by the organizers consisted of 2914 prior cases, 197 statutes and 50 queries, which were summarized case descriptions. So that we could test our models prior to the final run submissions, the organizers provided us with the task 1 and task 2 outputs of the first 10 queries. To build our systems, we took an unsupervised approach using text embeddings and cosine similarity. The primary motivation behind this was semantic-level modelling and better scaling. Before building our systems, we performed some basic named-entity removal using spaCy³ for the following tags: PERSON, ORDINAL, CARDINAL, WORK_OF_ART, TIME, PERCENT, QUANTITY.

∗ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). FIRE 2019, 12-15 December 2019, Kolkata, India.

2 Related Work

In the legal domain, researchers have contributed to several problems such as text classification, text summarization and information mining. Gonçalves et al. [2] showed how linguistic techniques such as lemmatization and POS identification can be used to increase the accuracy of a model that classifies legal texts with a low-dimensional feature vector. Because legal documents follow a certain structure, written in formal language with defined terminology, researchers have tried to use this prior structural information to summarize the data, which can be useful for other tasks such as case recommendation, document classification and labeling. Saravanan et al. [3][6] contributed to legal document summarization in successive works using graphical models and CRFs. Conrad et al. [5] introduced query-based sentiment summarization for legal texts, which can be very useful for mining opinions.
In the legal document labeling problem, Schweighofer et al. [1] proposed a solution using a hierarchical self-organizing map, and Mencia et al. [4] introduced multilabel classification using one-vs-all classifiers with efficient perceptron algorithms.

3 Task 1 - Identifying Relevant Prior Cases

For this task, the goal was to identify the relevant prior cases from a collection of past cases. Here, we used two types of word embeddings: pretrained Sent2Vec [9], and FastText [10] trained on the 2914 prior cases. In all, we created three models. For the first two models, we used a simple algorithm in which the queries and precedents were encoded using pretrained word embeddings. Instead of encoding a whole case or query as a single long vector, we extracted sentences of at most 20 tokens, encoded each of them, and finally averaged all these vectors to get the vector of the respective case or query of size 20. Then, for each query-case pair, we computed the cosine similarity score and ranked the cases accordingly. The only difference between the two models was that the first used Sent2Vec, while the second used FastText. We tested both models on the training data. The Sent2Vec model achieved a BPREF of 0.0215, while the FastText model achieved a BPREF of 0.0124. Using these values, we calculated the weights of the models: for Sent2Vec, 0.0215/(0.0215 + 0.0124) = 0.63, and for FastText, 0.0124/(0.0215 + 0.0124) = 0.36. With these values, we created our third model, a weighted voting ensemble. The performance metrics⁴ of all of these models on the testing data are shown in Table 1, where 1/Ro1R denotes 1/(rank of first relevant document).

Model     P@10    MAP     BPREF   1/Ro1R
Sent2Vec  0.0250  0.0478  0.0284  0.131
FastText  0.0175  0.0228  0.0163  0.065
Ensemble  0.0200  0.0181  0.0060  0.044

Table 1. Evaluation results for task 1 systems.

³ https://spacy.io/api/annotation#named-entities
⁴ https://trec.nist.gov/pubs/trec15/appendices/CE.MEASURES06.pdf

4 Task 2 - Identifying Relevant Statutes

In this task, the goal was to identify relevant statutes given a summarized version of a case as input. To do this, we first extracted key-phrases from the queries and the statutes using the rake-nltk⁵ library. For statutes, we further performed some manual augmentation as well as removal of key-phrases based on relevance. For example, for statute S10, "equality of opportunity in matters of public employment", the library did not select "discriminated" as a keyword, so it was added manually, while "fifty per cent", which was picked up, was removed. Finally, we encoded each of these key-phrases using a pretrained BERT [8] model. Using these encoding vectors, we created three models. For the first model, we computed the cosine similarity scores between all key-phrase pairs of every query-statute pair. Then, for each query-statute pair, we took the max and second-max of its cosine similarity scores and multiplied them to get the final rank-determining score. For the second model, we took a similar approach, but this time used the average of the key-phrase cosine similarity scores. In the third model, we used the product of the scores calculated by the first and second models as the final relevance score, i.e. the product of the max, the second-max and the average score. The performance metrics of all of the models on the testing data are shown in Table 2.

Model     P@10    MAP     BPREF   1/Ro1R
M*SM      0.0600  0.0767  0.0309  0.1460
Average   0.0600  0.0918  0.0402  0.2010
Ensemble  0.0600  0.0831  0.0285  0.1620

Table 2. Evaluation results for task 2 systems.

⁵ https://pypi.org/project/rake-nltk/
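The scoring rules described in Sections 3 and 4 can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, the vectors are toy stand-ins for Sent2Vec, FastText or BERT embeddings, and only the standard library is used. The paper does not say how a query-statute pair with a single key-phrase pair is scored, so the sketch assumes the single score is reused as the second-max. The voting scheme of the task 1 ensemble is not specified either, so only the weight computation is shown.

```python
import math
from itertools import product
from statistics import mean


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def document_vector(chunk_vectors):
    """Task 1: average the vectors of a document's sentence chunks."""
    return [mean(column) for column in zip(*chunk_vectors)]


def ensemble_weights(bpref_a, bpref_b):
    """Task 1: weight each model by its share of the summed BPREF."""
    total = bpref_a + bpref_b
    return bpref_a / total, bpref_b / total


def statute_scores(query_phrases, statute_phrases):
    """Task 2: M*SM, Average and Ensemble scores for one query-statute pair.

    Both arguments are non-empty lists of key-phrase vectors.
    """
    sims = sorted(
        (cosine(q, s) for q, s in product(query_phrases, statute_phrases)),
        reverse=True,
    )
    # Assumption: with a single phrase pair, reuse its score as the second-max.
    top, second = sims[0], sims[1] if len(sims) > 1 else sims[0]
    m_sm = top * second
    average = mean(sims)
    return {"M*SM": m_sm, "Average": average, "Ensemble": m_sm * average}


def rank(scores_by_id):
    """Sort target ids from most to least relevant."""
    return sorted(scores_by_id, key=scores_by_id.get, reverse=True)


if __name__ == "__main__":
    # BPREF values reported for Sent2Vec and FastText on the training data.
    w_s2v, w_ft = ensemble_weights(0.0215, 0.0124)
    print(round(w_s2v, 3), round(w_ft, 3))  # 0.634 0.366

    # Toy 2-d key-phrase vectors for one query and one statute.
    print(statute_scores([[1.0, 0.0], [1.0, 1.0]], [[1.0, 0.0]]))
```

For task 1, `cosine(document_vector(query_chunks), document_vector(case_chunks))` would give the score of one query-case pair, and `rank` orders the candidates once every pair has been scored.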
5 Conclusion & Future Work

We have demonstrated that satisfactory results in both tasks can be achieved by taking a simple and fast unsupervised approach using pre-trained embeddings and cosine similarity scores. Our Sent2Vec system and our average-based system ranked 7th and 4th in task 1 and task 2, respectively, based on the Ro1R metric. In the future, we would like to collect more legal data and annotate it to build supervised classification models.

Acknowledgement

The authors would like to thank Silversparro Pvt. Ltd. for providing the necessary support and computational resources to complete this work. We particularly extend our gratitude to Mr. Ankit Agarwal, CTO, and Mr. Ravikant Bhargav, R&D head, for their constant support and encouragement.

References

1. Schweighofer, Erich, Andreas Rauber, and Michael Dittenbach. "Automatic text representation, classification and labeling in European law." In Proceedings of the 8th International Conference on Artificial Intelligence and Law, pp. 78-87. ACM, 2001.
2. Gonçalves, Teresa, and Paulo Quaresma. "Is linguistic information relevant for the classification of legal texts?" In Proceedings of the 10th International Conference on Artificial Intelligence and Law, pp. 168-176. ACM, 2005.
3. Saravanan, M., Balaraman Ravindran, and S. Raman. "Improving legal document summarization using graphical models." Frontiers in Artificial Intelligence and Applications 152 (2006): 51.
4. Mencia, Eneldo Loza, and Johannes Fürnkranz. "Efficient pairwise multilabel classification for large-scale problems in the legal domain." In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 50-65. Springer, Berlin, Heidelberg, 2008.
5. Conrad, Jack G., Jochen L. Leidner, Frank Schilder, and Ravi Kondadadi. "Query-based opinion summarization for legal blog entries." In Proceedings of the 12th International Conference on Artificial Intelligence and Law, pp. 167-176. ACM, 2009.
6. Saravanan, M., Balaraman Ravindran, and S. Raman. "Automatic identification of rhetorical roles using conditional random fields for legal document summarization." In Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I. 2008.
7. Bansal, Trapit, David Belanger, and Andrew McCallum. "Ask the GRU: Multi-task learning for deep text recommendations." In Proceedings of the 10th ACM Conference on Recommender Systems, pp. 107-114. ACM, 2016.
8. Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
9. Pagliardini, Matteo, Prakhar Gupta, and Martin Jaggi. "Unsupervised learning of sentence embeddings using compositional n-gram features." arXiv preprint arXiv:1703.02507 (2017).
10. Joulin, Armand, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, and Tomas Mikolov. "FastText.zip: Compressing text classification models." arXiv preprint arXiv:1612.03651 (2016).
11. Bhattacharya, P., K. Ghosh, S. Ghosh, A. Pal, P. Mehta, A. Bhattacharya, and P. Majumder. "Overview of the FIRE 2019 AILA track: Artificial Intelligence for Legal Assistance." In Proceedings of FIRE 2019 - Forum for Information Retrieval Evaluation, Kolkata, India, December 12-15, 2019.