<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Legal Statutes Retrieval: A Comparative Approach on Performance of Title and Statutes Descriptive Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moemedi Lefoane</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tshepho Koboyatshwene</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Goaletsa Rammidi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V. Lakshmi Narasimham</string-name>
          <email>lakshmi.narasimhang@mopipi.ub.bw</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Botswana</institution>
          ,
          <addr-line>Gaborone</addr-line>
          ,
          <country country="BW">Botswana</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Legal statutes play a crucial role in the justice system. In countries that adopt the common law system, they are often cited in court decisions to argue cases of interest. The AILA 2019 track presented two tasks: a precedent retrieval task and a statute retrieval task. Our team participated in the latter. The statutes provided consisted of two components, namely a title and a statute description. In this study we first conducted an experiment to determine the best term weighting model for this task. After determining the best term weighting model, a second set of experiments aimed to determine the extent to which these components (title and description of statutes) contribute to retrieval effectiveness. To find out how retrieval effectiveness is affected by the different components, three experiments were conducted: the first indexed the title and description of each statute as a document and performed retrieval using IFB2, generating the first run (baseline); the second indexed only the title, disregarding the description of the statutes, generating the second run. For the final experiment, only the descriptions of the statutes were indexed, disregarding the titles, and indexing and retrieval were again performed to generate the third run. The three runs were then sent to the organisers for evaluation. The evaluation results show that our team came second; furthermore, the results suggest that indexing with the title only, disregarding the description of the statutes, is sufficient for the retrieval of statutes.</p>
      </abstract>
      <kwd-group>
        <kwd>Legal Statutes Retrieval</kwd>
        <kwd>Legal Text Mining</kwd>
        <kwd>Information Retrieval</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Information retrieval (IR) is concerned with finding documents of unstructured text that are relevant to an information need, from a collection of documents or from other material provided. Material or a document is relevant if it contains information of value towards satisfying the information need [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        The Artificial Intelligence for Legal Assistance (AILA 2019) track was divided into two tasks [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Our team participated in the Legal Statutes Retrieval task, whose goal was to generate a ranked list of relevant statutes for each object query provided in the dataset [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Experiments conducted by Tamrakar et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] on FIRE 2011 datasets, using different probabilistic models in Terrier 3.5 such as BM25, BB2, IFB2, InexpB2, InexpC2, InL2, DFR_BM25, DFI0 and PL2, yielded promising results. The datasets used consisted of various documents from newspapers and websites. Mean Average Precision (MAP) and R-precision were used to measure the performance of the different models. The results indicated the highest MAP value of 0.7846 for the IFB2 model when using a sample of the news corpus dataset. IFB2 is one of the DFR models implemented in Terrier [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Another study, conducted by Tanase [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], used two variants of the DFR models, namely PL2 and DLH13, for the CHiC 2013 Lab, using a collection of textual cultural heritage objects in English and/or Italian. The best performance was obtained using DLH13 in the monolingual experiments with two of the collections that were made available.
      </p>
      <p>
        Divergence from Randomness (DFR) is a probabilistic keyword indexing model proposed by Amati et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and subsequently incorporated into Terrier as one of its IR models. In DFR, a term weight is computed by measuring the divergence between the term distribution produced by a random process within the collection and the actual term distribution within a document. The assumption is that words are not equally important when describing the content of documents. Considering the entire document collection C, there is a random distribution of words (such as stop words) that carry little information, or are deemed less important, across all documents. Another assumption is that there is an elite set of documents containing speciality words, i.e. terms that are more informative, which follow a Poisson distribution [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
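To make the DFR scoring concrete, the following is a minimal Python sketch of the IFB2 term weight (inverse term-frequency model, Bernoulli after-effect, Normalisation 2) as described in the Terrier documentation; the function name and the parameter default c=1.0 are our own assumptions, not taken from the paper.

```python
import math

def ifb2_weight(tf, df, F, N, doc_len, avg_len, c=1.0):
    """Score one (term, document) pair with the IFB2 DFR model.

    tf      : term frequency in the document
    df      : number of documents containing the term (n_t)
    F       : term frequency in the whole collection
    N       : number of documents in the collection
    doc_len : length of the document in tokens
    avg_len : average document length in the collection
    c       : term-frequency normalisation parameter (Normalisation 2)
    """
    # Normalisation 2: adjust tf for document length
    tfn = tf * math.log2(1.0 + c * avg_len / doc_len)
    # Bernoulli after-effect: first normalisation of the informative content
    norm = (F + 1.0) / (df * (tfn + 1.0))
    # Inverse term-frequency informative content
    idf = math.log2((N + 1.0) / (F + 0.5))
    return norm * tfn * idf
```

Because tfn / (tfn + 1) grows with tfn, the score rises with term frequency but with diminishing returns, while the log2((N + 1) / (F + 0.5)) factor rewards terms that are rare in the collection.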
      <p>The rest of the paper is organised as follows: Section 2 outlines our proposed approach, detailing the dataset description and experimental setup; Sections 3 and 4 discuss the results and the conclusion, respectively.</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>We submitted 3 runs for this task. The first run formed the baseline. To generate the second and final runs we relied on a field-based indexing model to index, first, the title only without the description of the statutes, and finally the description only without the title of the statutes. The rest of this section provides more details on how the runs were generated. For all three runs we used the IFB2 retrieval model.</p>
      <sec id="sec-2-1">
        <title>Dataset Description</title>
        <p>The dataset for this study consists of 50 object queries, of which the first 10 formed part of the training data. The remaining queries (11-50) formed the test data, for which 3 runs were generated and submitted to the Forum for Information Retrieval Evaluation (FIRE) for evaluation. For the training data, relevance assessments were provided, and the document collection consisted of 197 statutes. The same 197 statutes also formed the document collection for the test dataset.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Experimental Setup</title>
        <p>The first part of the experiment addressed the question of which term weighting model performs best for the retrieval of statutes, so the experiment was set up on the training data. To perform the experiments, the dataset provided, both the object queries and the statute documents, was transformed into TREC-style format; shell scripting was used for the parsing. Section 2.3 and Section 2.4 illustrate an object query and a document/statute in TREC format.</p>
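As an illustration of this transformation (the paper used shell scripts; this Python equivalent and its function names are our own), a query or statute can be wrapped in TREC tags like so:

```python
def query_to_trec_topic(query_id, description):
    """Wrap one AILA object query in TREC TOPIC tags."""
    return (f"<TOP>\n"
            f"<NUM> {query_id} </NUM>\n"
            f"<DESC> Description:\n{description}\n</DESC>\n"
            f"</TOP>\n")

def statute_to_trec_doc(doc_no, title, description):
    """Wrap one statute in TREC DOCUMENT tags for indexing."""
    return (f"<DOC>\n"
            f"<DOCNO> {doc_no} </DOCNO>\n"
            f"<TITLE>\n{title}\n</TITLE>\n"
            f"<TEXT>\n{description}\n</TEXT>\n"
            f"</DOC>\n")
```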
        <p>
          We used Terrier 4.2 [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to perform all our experiments for indexing and retrieval; for evaluation we used trec_eval 9.0. The platform has been used successfully for ad-hoc retrieval tasks. The preprocessing performed for all experiments was stemming using Porter's stemmer and stopword removal using the Terrier stopword list. We then performed retrieval using different term weighting models as implemented in Terrier; the results are shown in Table 1. Mean Average Precision results revealed that the overall performance of the Divergence from Randomness IFB2 model was better than that of the other models. We therefore chose IFB2 for the next set of experiments, which investigate the retrieval effectiveness of each statute component.
        </p>
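Choosing among the weighting models reduces to comparing MAP over the training queries; the following is a small stand-in for the trec_eval computation (our own code, not the tool itself):

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of the precision values at each relevant hit."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs, qrels):
    """MAP over all queries; runs and qrels are dicts keyed by query id."""
    aps = [average_precision(runs[q], qrels[q]) for q in qrels]
    return sum(aps) / len(aps)
```

Running this per weighting model over the ten training queries and picking the highest MAP reproduces the selection step described above.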
        <p>To generate the first run, we first separated the given queries (queries 1-50) into training and test queries. Queries 1-10 form the training queries for our training data, and queries 11-50 form the test queries. The first experiment investigated the different weighting models implemented in Terrier in order to find which one performs best on the training data. We observed that IFB2 gives the best performance, followed by LemurTF_IDF and finally InexpB2. We therefore generated the first run (UBLTM1) using IFB2.</p>
        <p>For the second and third runs we transformed the statutes into TREC-style format, but this time with two fields, namely Title and Description. We then indexed the statutes using the title only and retrieved with the test queries using IFB2 to generate our second run (UBLTM2). For the final run we indexed using the description only and retrieved using IFB2 to generate the final run (UBLTM3). The idea is to investigate the effect of the title only, and of the description only, on retrieval effectiveness.
2 https://sites.google.com/view/fire-2019-aila/dataset-evaluation-plan
3 http://terrier.org/docs/v4.2/
4 https://trec.nist.gov/trec_eval/</p>
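The three collections behind the runs differ only in which field of each statute is indexed; that selection can be sketched as follows (the run labels are from the paper, the helper name is ours):

```python
def fields_for_run(title, description, run):
    """Return the indexable text for one statute under each run's policy."""
    if run == "UBLTM1":   # baseline: title + description
        return f"{title}\n{description}"
    if run == "UBLTM2":   # title only
        return title
    if run == "UBLTM3":   # description only
        return description
    raise ValueError(f"unknown run: {run}")
```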
      </sec>
      <sec id="sec-2-3">
        <title>Sample AILA Query Transformed into TREC Topic Format</title>
        <p>Because the aim of the study was to compare the extent to which the components of the statutes contribute to retrieval effectiveness, the statutes were transformed into two types of TREC-style document collection: one where the entire content of the statute, i.e. its title and description, was transformed as shown below.</p>
        <p>Below is a sample of part of AILA Q1 parsed into TREC TOPIC format:
&lt;TOP&gt;
&lt;NUM&gt; AILA Q1 &lt;/NUM&gt;
&lt;DESC&gt; Description:
The appellant on February 9, 1961 was appointed as an Officer in Grade III in
the respondent Bank (for short 'the Bank'). He was promoted on April 1, 1968
to Grade officer in the Foreign Exchange Department in the Head Office of
the Bank. Sometime in 1964,...[TEXT OMITTED]
...
&lt;/DESC&gt;
&lt;NARR&gt; Narrative:
&lt;/NARR&gt;
&lt;/TOP&gt;</p>
      </sec>
      <sec id="sec-2-4">
        <title>Sample Transformed Statute</title>
        <p>Below is a sample of part of a statute parsed into TREC DOCUMENT format:
&lt;DOC&gt;
&lt;DOCNO&gt; S103 &lt;/DOCNO&gt;
&lt;TITLE&gt;
Freedom to manage religious affairs
...
&lt;/TITLE&gt;
&lt;TEXT&gt;
Subject to public order, morality and health, every religious denomination or any
section thereof shall have the right- (a) to establish and maintain institutions
for religious and charitable purposes; (b) to manage its own affairs in matters
of religion; (c) to own and acquire movable and immovable property; and (d) to
administer such property in accordance with law.
...
&lt;/TEXT&gt;
&lt;/DOC&gt;</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>Table 1 shows that IFB2 was the best model in terms of MAP. Table 2 shows the top 9 results for the runs submitted to the AILA 2019 organisers for evaluation. Our team name in the table is UBLTM. In the table, P@10 refers to Precision@10, MAP refers to Mean Average Precision, BPREF refers to the binary preference-based measure, and RecipRank refers to the Reciprocal Rank.</p>
      <p>Our experiments set out to investigate the extent to which different parts of statutes contribute to retrieval effectiveness; the results reveal that the titles of the statutes contain sufficient information to aid retrieval. For future work, the nature of statutes could be investigated further to better understand their characteristics and inform future directions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <article-title>Overview of the FIRE 2019 AILA Track: Artificial Intelligence for Legal Assistance</article-title>
          .
          <source>In Proc. of FIRE 2019 - Forum for Information Retrieval Evaluation</source>
          , Kolkata, India,
          <source>December 12-15</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Gianni</given-names>
            <surname>Amati</surname>
          </string-name>
          and Cornelis Joost Van Rijsbergen.
          <article-title>Probabilistic Models of Information Retrieval Based on Measuring the Divergence from Randomness</article-title>
          .
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>20</volume>
          ,
          <issue>4</issue>
          (Oct.
          <year>2002</year>
          ),
          <fpage>357</fpage>
          -
          <lpage>389</lpage>
          . DOI: http://dx.doi.org/10.1145/582415.582416
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Amati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Plachouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>Terrier: A High Performance and Scalable Information Retrieval Platform</article-title>
          .
          <source>In Proceedings of the ACM SIGIR 2006 Workshop on Open Source Information Retrieval (OSIR</source>
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Amati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Plachouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          , and
          <string-name>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>Terrier Information Retrieval Platform</article-title>
          .
          <source>In Proceedings of the 27th European Conference on IR Research (ECIR 2005) (Lecture Notes in Computer Science)</source>
          , Vol.
          <volume>3408</volume>
          . Springer,
          <fpage>517</fpage>
          -
          <lpage>519</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Tamrakar</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Vishwakarma</surname>
          </string-name>
          .
          <article-title>Analysis of Probabilistic Model for Document Retrieval in Information Retrieval</article-title>
          .
          <source>In 2015 International Conference on Computational Intelligence and Communication Networks (CICN)</source>
          .
          <fpage>760</fpage>
          -
          <lpage>765</lpage>
          (
          <year>2015</year>
          ), DOI: http://dx.doi.org/10.1109/CICN.2015.155
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>D.</given-names>
            <surname>Tanase</surname>
          </string-name>
          .
          <article-title>Using the divergence framework for randomness: CHiC 2013 lab report</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          <volume>1179</volume>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Prabhakar</given-names>
            <surname>Raghavan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Hinrich</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <year>2008</year>
          . Introduction to Information Retrieval. Cambridge University Press, Cambridge, UK. http://nlp.stanford.edu/IR-book/information-retrieval-book.html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>