On the importance of legal catchphrases in precedence retrieval

Edwin Thuma
Lecturer, Department of Computer Science, University of Botswana, Gaborone, Botswana
thumae@mopipi.ub.bw

Nkwebi P. Motlogelwa
Lecturer, Department of Computer Science, University of Botswana, Gaborone, Botswana
motlogel@mopipi.ub.bw

ABSTRACT
This paper presents our working notes for FIRE 2017, Information Retrieval from Legal Documents - Task 2 (Precedence Retrieval). Common Law Systems around the world recognize the importance of precedence in Law. In making decisions, Judges are obliged to consult prior cases that have already been decided to ensure that there is no divergence in the treatment of similar situations across different cases. Our approach was to investigate the effectiveness of using legal catchphrases in precedence retrieval. To improve retrieval performance, we incorporated term dependency in our retrieval. In addition, we investigated the effect of deploying query expansion on retrieval performance. Our results show an improvement in retrieval performance when we incorporate term dependence in scoring and ranking prior cases. However, we see a degradation in retrieval performance when we deploy query expansion.

KEYWORDS
Precedent retrieval, term dependency, query expansion, legal catchphrases

1 INTRODUCTION
Common Law Systems around the world recognize the importance of precedence in Law. In making decisions, Judges are obliged to align their decisions with relevant prior cases. Thus, when lawyers prepare for cases, they research prior cases extensively. In addition, Judges also consult prior cases that have already been decided to ensure that a similar situation is treated similarly in every case [3]. This can be overwhelming due to the enormous number of prior cases and the length of each. Task 2 of the Information Retrieval from Legal Documents track (precedence retrieval) explores techniques and tools that could ease this task [3]. In general, precedence retrieval returns a ranked list of prior cases that are related to a given current case.

In this work we investigate the importance of legal catchphrases as queries in precedent retrieval. These legal catchphrases are extracted from current cases. To achieve this, we used the training set of documents provided for Task 1 (catchphrase extraction), in which case documents have corresponding gold-standard catchphrases. We used the Term Frequency-Inverse Document Frequency (TF-IDF) term weighting model to identify similarity between documents in the training set and current cases. Queries were formulated using legal catchphrases from the most relevant documents in the training set.

For retrieval, we deployed the parameter-free DPH term weighting model to score and rank prior cases. Moreover, we investigate whether taking the dependence of query terms into consideration when ranking and scoring prior cases could improve retrieval performance. Previous work has shown that incorporating term dependency in scoring and ranking documents can significantly improve retrieval performance [4]. In addition, we deployed query expansion, where the original queries are reformulated by adding new terms, to investigate its impact on retrieval performance. Previous research has shown that query expansion can improve retrieval effectiveness [1].

This paper is structured as follows. Section 2 provides background on the algorithms used. Section 3 describes the experimental setup. In Section 4, we describe the methodologies used for the three runs submitted by team UB_Botswana_Legal for Task 2. Section 5 presents results and discussion.

2 BACKGROUND
In this section, we present a brief but essential background on the different algorithms used in our experimental investigation and evaluation. We start by describing the TF-IDF term weighting model in Section 2.1. We then describe the DPH term weighting model in Section 2.2. Lastly, we describe the Bose-Einstein 1 (Bo1) model for query expansion in Section 2.3.

2.1 TF-IDF term weighting model
In our experimental setup, we used TF-IDF [5] to score and rank documents. Generally, TF-IDF calculates the weight of each term t as the product of its term frequency (tf) weight in document d and its inverse document frequency (idf_t):

\[
\mathrm{score}_{\mathrm{TF\text{-}IDF}}(d, Q) = \sum_{t \in Q} \bigl(1 + \log(tf)\bigr) \cdot \log\frac{N}{df_t} \qquad (1)
\]

Where:
• tf is the term frequency of term t in document d.
• df_t is the document frequency of term t, i.e. the number of documents in the collection in which term t occurs.
• idf_t = log(N / df_t) is the inverse document frequency of term t in a collection of N documents.
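For illustration, the following is a minimal Python sketch of the scoring rule in equation (1). It is not the implementation used in our experiments (scoring was done by the TF-IDF model in Terrier); the function name tfidf_score, the toy documents and the catchphrase query are purely illustrative assumptions.

```python
import math
from collections import Counter

def tfidf_score(query_terms, doc_terms, doc_freq, num_docs):
    """Score a document for a query with equation (1):
    sum over query terms of (1 + log(tf)) * log(N / df_t)."""
    tf = Counter(doc_terms)                      # term frequencies in the document
    score = 0.0
    for t in set(query_terms):
        if tf[t] == 0 or doc_freq.get(t, 0) == 0:
            continue                             # term absent from document or collection
        score += (1 + math.log(tf[t])) * math.log(num_docs / doc_freq[t])
    return score

# Toy usage: rank two "prior case" documents against one catchphrase query.
docs = [["anticipatory", "bail", "bail", "arrest"], ["contract", "breach", "damages"]]
df = Counter(t for d in docs for t in set(d))    # document frequency of each term
query = ["anticipatory", "bail"]
ranking = sorted(range(len(docs)), key=lambda i: -tfidf_score(query, docs[i], df, len(docs)))
print(ranking)                                   # the first document should rank first
```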
2.2 DPH Term Weighting Model
Our baseline system used the parameter-free DPH term weighting model from the Divergence from Randomness (DFR) framework [2]. The DPH term weighting model calculates the score of a document d for a given query Q as follows:

\[
\mathrm{score}_{\mathrm{DPH}}(d, Q) = \sum_{t \in Q} qtf \cdot norm \cdot \left( tf \cdot \log\Bigl(\bigl(tf \cdot \tfrac{avg\_l}{l}\bigr) \cdot \tfrac{N}{tfc}\Bigr) + 0.5 \cdot \log\bigl(2 \cdot \pi \cdot tf \cdot (1 - t_{MLE})\bigr) \right) \qquad (2)
\]

where qtf, tf and tfc are the frequencies of the term t in the query Q, in the document d and in the collection C, respectively. N is the number of documents in the collection C, avg_l is the average length of documents in the collection C, and l is the length of the document d. Furthermore, t_{MLE} = tf / l and norm = (1 - t_{MLE})^2 / (tf + 1). An illustrative sketch of this formula is given at the end of this section.

2.3 Bose-Einstein 1 (Bo1) model for Query Expansion
In our experimental investigation and evaluation, we used the Terrier-4.0 Divergence from Randomness (DFR) Bose-Einstein 1 (Bo1) model to select the most informative terms from the topmost documents after a first-pass document ranking. The DFR Bo1 model calculates the information content of a term t in the top-ranked documents as follows [1]:

\[
w(t) = tf_x \cdot \log_2\frac{1 + P_n(t)}{P_n(t)} + \log_2\bigl(1 + P_n(t)\bigr) \qquad (3)
\]

\[
P_n(t) = \frac{tfc}{N} \qquad (4)
\]

where P_n(t) is the probability of t in the whole collection, tf_x is the frequency of the query term in the top x ranked documents, tfc is the frequency of the term t in the collection, and N is the number of documents in the collection.
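For concreteness, a minimal Python sketch of the per-term DPH score in equation (2) from Section 2.2 follows. It only illustrates the formula; in our experiments the DPH implementation shipped with Terrier was used, and the base of the logarithm as well as the function and argument names (dph_score, coll_tf, and the toy numbers) are assumptions made for the sketch.

```python
import math

def dph_score(query_tf, doc_tf, doc_len, avg_len, coll_tf, num_docs):
    """Illustrative per-term DPH score following equation (2).
    query_tf: frequency of the term in the query (qtf)
    doc_tf:   frequency of the term in the document (tf)
    coll_tf:  frequency of the term in the whole collection (tfc)."""
    if doc_tf == 0:
        return 0.0
    t_mle = doc_tf / doc_len                      # maximum-likelihood estimate tf / l
    norm = (1.0 - t_mle) ** 2 / (doc_tf + 1.0)    # DPH normalisation factor
    info = doc_tf * math.log((doc_tf * avg_len / doc_len) * (num_docs / coll_tf)) \
           + 0.5 * math.log(2.0 * math.pi * doc_tf * (1.0 - t_mle))
    return query_tf * norm * info

# Toy usage: one query term occurring 3 times in a 100-token prior case.
print(dph_score(query_tf=1, doc_tf=3, doc_len=100, avg_len=120, coll_tf=40, num_docs=2000))
```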
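A corresponding minimal Python sketch of the Bo1 weighting in equations (3) and (4) is given below. It is illustrative only, since our runs used the Bo1 query expansion built into Terrier; the helper name bo1_expansion_terms, the use of a Counter for the collection statistics and the toy collection are assumptions.

```python
import math
from collections import Counter

def bo1_expansion_terms(top_docs, coll_tf, num_docs, n_terms=10):
    """Illustrative Bo1 weighting (equations (3) and (4)): rank candidate
    expansion terms by their information content in the top-ranked documents."""
    tf_x = Counter(t for d in top_docs for t in d)    # term frequency in the top x docs
    weights = {}
    for t, f in tf_x.items():
        if coll_tf[t] == 0:
            continue                                   # term unseen in the collection
        p_n = coll_tf[t] / num_docs                    # P_n(t) = tfc / N
        weights[t] = f * math.log2((1 + p_n) / p_n) + math.log2(1 + p_n)
    return sorted(weights, key=weights.get, reverse=True)[:n_terms]

# Toy usage: pick the 2 most informative terms from 2 pseudo-relevant cases.
top_docs = [["bail", "anticipatory", "arrest"], ["bail", "custody"]]
coll_tf = Counter({"bail": 50, "anticipatory": 5, "arrest": 80, "custody": 30})
print(bo1_expansion_terms(top_docs, coll_tf, num_docs=2000, n_terms=2))
```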
3 EXPERIMENTAL SETUP

3.1 Document Collection
In this work we use the document collection provided by the Information Retrieval from Legal Documents track organizers. It comprised 200 documents representing current cases and 2000 documents representing prior cases [3]. For each current case, the objective is to retrieve a ranked list of relevant prior cases, together with a score for each prior case, such that the most relevant cases appear at the top of the list and the least relevant at the bottom.

3.2 Precedence Retrieval Experimental Platform
For all our experimental evaluation, we used Terrier-4.2, an open-source Information Retrieval (IR) platform. Documents were pre-processed before indexing: the text was tokenised, each token was stemmed using the full Porter stemming algorithm, and stopwords were removed using the Terrier stopword list.

4 METHODOLOGY

4.1 Query Formulation for the Different Runs
For all the runs in this task, we indexed the 100 case documents provided in Task 1, which had corresponding catchphrases, using the Terrier-4.2 IR platform. During indexing, each case document was first tokenised and stopwords were removed using the Terrier stopword list. Each token was then stemmed using the full Porter stemming algorithm.

For each current case provided in Task 2, we used the TF-IDF term weighting model in Terrier-4.2 to score and rank the indexed case documents. Each current case document was first pre-processed using the same pre-processing steps undertaken during indexing. After retrieving the top 40 case documents, we formulated a query for the current case using the gold-standard catchphrases that appear in these ranked case documents and also in the current case document used for retrieval.
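The query-formulation step just described can be sketched as follows. This is a simplified illustration rather than our actual pipeline: the TF-IDF ranking of the indexed case documents is assumed to have been produced already (by Terrier in our experiments), catchphrase matching against the current case is reduced to a plain lower-cased substring test, and the function name formulate_query, the dictionary layout of gold_catchphrases and the toy example are assumptions.

```python
def formulate_query(current_case_text, ranked_case_ids, gold_catchphrases, top_k=40):
    """Illustrative version of Section 4.1: given the indexed case documents
    ranked by TF-IDF against the current case, build a query from the
    gold-standard catchphrases of the top-ranked cases that also occur in the
    current case. Matching is a plain substring test here (an assumption)."""
    text = current_case_text.lower()
    query_terms = []
    for doc_id in ranked_case_ids[:top_k]:
        for phrase in gold_catchphrases.get(doc_id, []):
            if phrase.lower() in text:
                query_terms.extend(phrase.lower().split())
    # De-duplicate while preserving the order in which terms were collected.
    seen = set()
    return " ".join(t for t in query_terms if not (t in seen or seen.add(t)))

# Toy usage with two Task 1 training cases and their gold catchphrases.
gold = {"case_07": ["anticipatory bail", "breach of contract"], "case_12": ["habeas corpus"]}
print(formulate_query("The petitioner sought anticipatory bail after arrest",
                      ["case_07", "case_12"], gold, top_k=40))
# -> "anticipatory bail"
```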
4.2 UB_Botswana_Legal_Task2_R1
Using the formulated queries, we deployed the parameter-free DPH Divergence from Randomness term weighting model in the Terrier-4.2 IR platform as our baseline system to score and rank the prior cases.

4.3 UB_Botswana_Legal_Task2_R2
We used UB_Botswana_Legal_Task2_R1 as the baseline system. In addition, we deployed the Sequential Dependence (SD) variant of the Markov Random Fields model for term dependence. Sequential Dependence only assumes a dependence between neighbouring query terms [4, 6]. In this work, we used the default window size of 2 as provided in Terrier-4.2.

4.4 UB_Botswana_Legal_Task2_R3
We used UB_Botswana_Legal_Task2_R1 as the baseline system. In addition, we deployed simple pseudo-relevance feedback on the local collection. We used the Bo1 model for query expansion to select the 10 most informative terms from the top 3 ranked documents after the first-pass retrieval (on the local collection) [6]. We then performed a second-pass retrieval on this local collection with the new expanded query.

5 RESULTS AND DISCUSSION
This work set out to investigate the importance of legal catchphrases in precedence retrieval. The results of our submitted runs, shown in Table 1, were evaluated by the organizing committee of this task. Since most of the catchphrases were bi-grams and tri-grams, our exploitation of the Sequential Dependence variant of the Markov Random Fields model for term dependence led to improvements in retrieval performance in terms of Mean Average Precision and Precision@10. Our attempt to improve retrieval performance using query expansion resulted in a degradation in retrieval performance. We suspect this might have been due to query drift.

Table 1: FIRE 2017 UB-Botswana Legal run evaluation results for Task 2

Run ID                     | Mean Average Precision | Mean Reciprocal Rank | Precision@10 | Recall@100
UB_Botswana_Legal_Task2_R3 | 0.1671                 | 0.3478               | 0.1225       | 0.559
UB_Botswana_Legal_Task2_R1 | 0.1487                 | 0.3506               | 0.112        | 0.546
UB_Botswana_Legal_Task2_R2 | 0.1078                 | 0.3017               | 0.0785       | 0.43

REFERENCES
[1] G. Amati. 2003. Probabilistic Models for Information Retrieval based on Divergence from Randomness. PhD Thesis, University of Glasgow, UK (June 2003), 1-198.
[2] G. Amati, E. Ambrosi, M. Bianchi, C. Gaibisso, and G. Gambosi. 2007. FUB, IASI-CNR and University of Tor Vergata at TREC 2007 Blog Track. In Proceedings of the 16th Text REtrieval Conference (TREC 2007). Gaithersburg, MD, USA, 1-10.
[3] Arpan Mandal, Kripabandhu Ghosh, Arnab Bhattacharya, Arindam Pal, and Saptarshi Ghosh. 2017. Overview of the FIRE 2017 track: Information Retrieval from Legal Documents (IRLeD). In Working Notes of FIRE 2017 - Forum for Information Retrieval Evaluation (CEUR Workshop Proceedings). CEUR-WS.org.
[4] Donald Metzler and W. Bruce Croft. 2005. A Markov Random Field Model for Term Dependencies. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '05). ACM, New York, NY, USA, 472-479. https://doi.org/10.1145/1076034.1076115
[5] Juan Ramos. 1999. Using TF-IDF to Determine Word Relevance in Document Queries. (1999).
[6] Edwin Thuma, Nkwebi Peace Motlogelwa, and Tebo Leburu-Dingalo. 2017. UB-Botswana Participation to CLEF eHealth IR Challenge 2017: Task 3 (IRTask1: Ad-hoc Search). In Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum, Dublin, Ireland, September 11-14, 2017. http://ceur-ws.org/Vol-1866/paper_73.pdf